Hi,
I seem to be having a similar issue to the one in the following post: https://33hops.com/forum/viewtopic.php?id=584
I have started a new thread as my configuration is somewhat simpler.
[Don't take this as a 'complaint'... I realise these are early days for XSIBackup-DC... I hope our bug reports can help its development]
Basically, I get the above error partway through a backup run. At that point it has not only partly completed the backup of one VM, but has also fully completed other VMs.
The setup is XSIBackup-DC running on ESXi 6.5 (free version). My backup destination is a Linux machine which is providing an NFS server. ESXi has this NFS volume mounted as a datastore named 'backup'.
I have used this exact same setup with XSIBackup-Pro, which works without error.
I have had one backup complete when I started testing XSIBackup-DC, but that was with a version zero release which I have since upgraded from.
Version: XSIBackup-Datacenter 1.0.0.5
My command line is:
/xsibackup \
> --backup \
> "VMs(RUNNING)" \
> "/vmfs/volumes/backup" \
> --backup-how=Hot \
> --use-smtp=1 \
> --mail-to=noc@xxxxxxxxxxxx
Part of output showing failure:
-----------------------------------------------------------------------------------------------------------------
Virtual Machine Name: FileServer.Black
-----------------------------------------------------------------------------------------------------------------
Creating snapshot VM : FileServer.Black (powered on)
-----------------------------------------------------------------------------------------------------------------
*** Snapshot was successfully created ***
-----------------------------------------------------------------------------------------------------------------
New Backup: FileServer.Black
-----------------------------------------------------------------------------------------------------------------
Backup start date: 2019-10-25 13:34:49
-----------------------------------------------------------------------------------------------------------------
2019-10-25 13:34:49 | Backing up 23 files, total size is 434056647599
-----------------------------------------------------------------------------------------------------------------
NUMBER FILE SIZE PROGRESS
-----------------------------------------------------------------------------------------------------------------
1/23 FileServer.vmsd 443.00 B | Done 0.00%
-----------------------------------------------------------------------------------------------------------------
2/23 temp-flat.vmdk 400.00 GB | Done 0.00%
-----------------------------------------------------------------------------------------------------------------
::: detail ::: 45.59% done | block 18672 out of 40960 | Done 45.11%
2019-10-25T16:28:05 | Error code 434 at file dedup-in.c, line 434
Error description: can't rename temp block: /vmfs/volumes/backup/data/e/4/2/b/5/e42b5b14bba8a501bfc14e945103c3a129868c98
-----------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------
SEGFAULT condition was trapped
-----------------------------------------------------------------------------------------------------------------
Cleaning up...
-----------------------------------------------------------------------------------------------------------------
*** Snapshot was removed ***
-----------------------------------------------------------------------------------------------------------------
Removed /tmp/xsi dir OK
-----------------------------------------------------------------------------------------------------------------
Unlocked backup OK
-----------------------------------------------------------------------------------------------------------------
Removed PID OK
-----------------------------------------------------------------------------------------------------------------
Note that it is nearly halfway through this VM before it errors... it has done nearly 200GB.
It has also completed another VM, total size 64GB, beforehand.
Offline
Well, the curious thing here is that you are backing up locally over NFS. What NFS version are you using?
This seems to be an NFS configuration problem, as the xsibackup binary runs in the ESXi security context, and from that point of view it is no different from using mv (move) to rename a file in your NFS share.
In fact that rename operation is performed by a single, simple system call.
V. 1.0.0.6 will include some extended debug information around this point. As this is a simple rename() call in a local FS, we did not expect to have to dig this deep into an issue here.
We have assumed this was due to a permission issue, but it could very well be due to an I/O error, which would in turn point to a more serious problem.
Contact us so that we can provide the 1.0.0.6 binary in advance; it will include the error string returned by the system.
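The comparison with mv can be tried by hand. Below is a minimal sketch, not part of XSIBackup-DC: the function name try_rename and the test file names are made up for illustration. It writes a small file into a directory and renames it in place, which is roughly what the failing step does.

```shell
#!/bin/sh
# Hypothetical helper (not an XSIBackup-DC command).
# try_rename DIR: write a small temp file into DIR, then rename it with mv.
try_rename() {
    dir="$1"
    tmp="$dir/.rename-test.$$"
    dst="$dir/rename-test.$$"

    # Write the temporary file first.
    if ! printf 'test' > "$tmp"; then
        echo "write FAILED in $dir" >&2
        return 1
    fi

    # Same-directory mv is a plain rename; mv prints the system error on failure.
    if mv "$tmp" "$dst"; then
        echo "rename OK"
        rm -f "$dst"
    else
        echo "rename FAILED in $dir" >&2
        rm -f "$tmp"
        return 1
    fi
}
```

From the ESXi shell this could be called as try_rename /vmfs/volumes/backup. A single successful run does not rule out an intermittent failure, so it may be worth repeating it in a loop while a backup is running.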
Offline
Hints/questions:
1 - Are you using async NFS?
2 - Are you using NFS4 or NFS3?
3 - Are you using the root_squash option?
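On a Linux NFS server, questions 1 and 3 can be answered from the entry in /etc/exports. The sketch below is a hypothetical helper (describe_export is not a standard tool) that reads one exports line and reports both settings. It assumes the current nfs-utils defaults, where sync and root_squash apply when neither option is given.

```shell
#!/bin/sh
# Hypothetical helper: report write mode and root squashing for one exports line.
# Assumes nfs-utils defaults: sync and root_squash unless stated otherwise.
describe_export() {
    # Extract the option list between the parentheses, e.g. "rw,async".
    opts=$(printf '%s\n' "$1" | sed -n 's/.*(\(.*\)).*/\1/p')

    case ",$opts," in
        *,async,*) echo "write mode: async" ;;
        *)         echo "write mode: sync" ;;
    esac

    case ",$opts," in
        *,no_root_squash,*) echo "root squash: off" ;;
        *)                  echo "root squash: on" ;;
    esac
}
```

For example, describe_export '/srv/share 192.0.2.0/24(rw,async)' (a made-up entry) reports async writes with root squashing on. Running exportfs -v on the server shows the options actually in effect. For question 2, the ESXi shell lists NFS3 mounts with esxcli storage nfs list and NFS 4.1 mounts with esxcli storage nfs41 list.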
Offline
Thanks for your quick reply,
Yes, as you say, as far as XSIBackup-dc is concerned, this should be the simplest form of backup, source on a local datastore, backing up to a second datastore on that machine (mounted via NFS rather than direct attached storage).
The target is running Ubuntu 16.04 (the reason for an older distribution is that it seems to have better NFS performance than the latest).
My exports file contains:
/var/data/nfs <internal network here>/24(rw,async,no_root_squash)
I am presently mounting this from the ESXi box as NFS4.
The odd thing here is that the same host and same target, with the same NFS-mounted 'backup' datastore, works fine with XSIBackup-Pro.
Thanks for your help so far.
If you send me a link for the latest .6 release, I can test further (email request sent).
Offline
It's not exactly the same: Pro is slower. If you are using async NFS, the remote FS is not acknowledging that the write of the temp block has succeeded.
You may end up in a situation in which the client tries to rename a block that is not yet available on the remote FS. So use sync NFS: remove the async option from your exports file and try again. In other words, XSIBackup-DC requires synchronous I/O to work, as blocks are stored temporarily and then moved to their final position in the remote deduplicated repository.
You will most probably not notice any decrease in performance, as the async option seems to work quite "synchronously" with the client most of the time. The problem is that if at any moment the remote FS slows down a bit, you're done.
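The store-then-rename pattern described here can be sketched as follows. This is an illustration only, not XSIBackup-DC's actual code. It assumes the 40-character block name seen in the error message is a SHA-1 of the block, and that the first five hex digits form the nested directories (data/e/4/2/b/5/...). The function name store_block and the temp file naming are made up.

```shell
#!/bin/sh
# Illustration only: not XSIBackup-DC's implementation.
# store_block REPO BLOCKFILE
# Assumption: blocks are named by their SHA-1 and filed under
# REPO/data/<d1>/<d2>/<d3>/<d4>/<d5>/<hash>.
store_block() {
    repo="$1"
    block="$2"

    hash=$(sha1sum "$block" | cut -d' ' -f1)
    [ -n "$hash" ] || return 1

    # First five hex digits become five nested directory levels: e42b5 -> e/4/2/b/5
    sub=$(printf '%s' "$hash" | cut -c1-5 | sed 's/./&\//g')
    dir="$repo/data/${sub%/}"
    final="$dir/$hash"

    # Deduplication: an identical block is already stored, nothing to write.
    if [ -f "$final" ]; then
        echo "$final"
        return 0
    fi

    mkdir -p "$dir" || return 1

    # Write to a temporary name first, then rename into place.
    tmp="$dir/.tmp.$$"
    cp "$block" "$tmp" || return 1
    if ! mv "$tmp" "$final"; then
        echo "can't rename temp block: $final" >&2
        rm -f "$tmp"
        return 1
    fi
    echo "$final"
}
```

The rename is the step that depends on the temp file already being visible on the server, which is why the write behaviour of the NFS export matters at this point.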
Offline
Thanks again for the quick reply.
I am presently running a test backup with the new 1.0.0.6 release.
After that completes (or fails), I will set my exports to 'sync' rather than 'async' and retry the test.
async might not be working as I had expected. I thought that it would give me the best throughput, and that if at some point it was not 'ready' on a client read or write, it would block until completion.
The underlying target hardware is an HP server with a hardware RAID controller with FBWC, connected to the source via twin 1GB connections. I suppose it depends on what the slowest part of the chain is... I previously thought that async would 'keep up' with the data rate supplied by the client. Maybe with the new, faster XSIBackup-DC I have exceeded that rate at the source.
I will update later, after the trial backup completes.
Last edited by marco (2019-10-28 11:44:48)
Offline
Well, we are just guessing, but async NFS could very well be the cause of the issue. The other candidate is NFS4, which has been a major source of bugs and errors in recent ESXi versions and is not as mature as NFS3. In fact, in your case NFS4 brings no benefit, as you are using just one Ethernet trunk.
As said, we have run lots of tests with sync and async NFS, and in a LAN with Intel NICs and decent switches the difference is negligible.
Offline
Just to update the thread...
I have had some good support direct from Roberto.
Backups are now running well, and the rename issue has disappeared when I use the pre-release version 1.0.0.6.
I understand this may not be related to my issue, as the changes in that version should not affect my problem, only the error messages generated.
I am continuing to test and will report back to the forum when the picture is clearer.
Offline
Thank you.
This thread is now closed. You may post here again if you find a similar issue.
To sum up what has already been discussed: if you are not getting good results with async NFS, switch to sync NFS.
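On a Linux NFS server the switch amounts to changing one option in /etc/exports. A minimal sketch, assuming an exports entry of the form shown earlier in the thread; the helper name to_sync is made up. It reads exports lines on standard input and writes them back with async replaced by sync.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the async export option as sync.
# Reads exports-style lines on stdin, writes the result to stdout.
to_sync() {
    # Only match async as a whole option, bounded by '(' or ',' and ',' or ')'.
    sed 's/\([(,]\)async\([,)]\)/\1sync\2/g'
}
```

One way to use it: to_sync < /etc/exports > /tmp/exports.new, review the result, copy it over /etc/exports, then run exportfs -ra on the server so the change takes effect.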
Offline