#1 2018-04-30 08:12:20

underdpt
Member
Registered: 2018-04-30
Posts: 2

Backups stalls

Hi,

I'm evaluating XSIBackup Pro and running some tests, backing up VMs from an ESXi 5.5 host to an ESXi 6.5 host. I'm using the 6.5 machine as the backup manager, so I launch the backup with this command:

./xsibackup --host=XXX.XXX.XXX.XXX --backup-point="YYY.YYY.YYY.YYY:22:/vmfs/volumes/datastore1/backups/remoto1"  --use-smtp=1 --remote-xsipath=. --exec=yes --mail-to=REMOVED --backup-type=running --backup-prog=onediff:z

Every time I try, it stalls at some point. Sometimes it backs up 3 machines, sometimes 1 (after doing an incremental backup of the machines already done). This is the output I get:

2018-04-30T05:28:49|  ###############################################################################
2018-04-30T05:28:49|     XSIBACKUP-PRO 10.3.3: new execution request
2018-04-30T05:28:49|  ###############################################################################
2018-04-30T05:28:49|
2018-04-30T05:28:50|  NOTICE: (c) XSIBackup kills any user launched jobs, make sure you don't overlap manual jobs
---------------------------------------------------------------------------------------------------------------------------------
XSIBackup PID:         39392680                                        YYY.YYY.YYY.YYY
Mon, 30 Apr 2018 05:28:49 +0000                                 IPv4: YYY.YYY.YYY.YYY/255.255.255.0
VMware ESXi 5.5.0 build-2456374                              (c) Rsync 3.1.0 as opt. dependency
---------------------------------------------------------------------------------------------------------------------------------
Backup Id:              unknown                       Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:51|  ADVICE: no SSD disks, please consider adding an SSD cache disk to improve performance
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:51|  Backup user is: root
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:51|  Backup program is: onediff
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:51|  Service OpenSSH ready at server XXX.XXX.XXX.XXX:22
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:51|  Info: remote XSIBACKUP-PRO install dir has been set by means of the --remote-xsipath argument
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:52|  Remote xsi path set to: . (filesystem: VMFS-5)
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:52|  Compression option is deprecated since XSIBackup 10.0.4, SSH 2.0 is forced since this version and it handles compression dynamically
2018-04-30T05:28:54|  Remote ESXi version is 6.5.0
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:54|  Mirroring to server 94.23.195.183 port 22
2018-04-30T05:28:54|  Checking Rsync exists on the other side...
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:55|  (c) Rsync (samba.org) found at [ XXX.XXX.XXX.XXX:22:./bin/xsibackup-rsync... ]
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:57|  (c) XSIDiff found at [ XXX.XXX.XXX.XXX:22:./bin/xsidiff... ]
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:28:57|  Getting list of all VMs...
---------------------------------------------------------------------------------------------------------------------------------
DATA REMOVED
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:29:08|  VMs to backup:
---------------------------------------------------------------------------------------------------------------------------------
DATA REMOVED
---------------------------------------------------------------------------------------------------------------------------------
SOME INCREMENTAL BACKUPS DONE OK
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:19|  [reverse_proxy] Starting backup (size is 102400M on 102400M file)
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:19|  XSIBackup will backup your VMs while they are running and will quiesce guest services too, so that users
2018-04-30T05:56:19|  can continue to use the VM while the backup is taking place. You can also run cold and warm --backup-how
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:19|  Hot backup selected for VM: [reverse_proxy], will not be switched off
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:20|  [reverse_proxy] info: boot partition is MBR
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:20|  Remember: --date-dir argument will be ignored in OneDiff backups
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:22|  [reverse_proxy] notice DIFRMMIS: no [reverse_proxy-flat.vmdk] in remote OneDiff mirror, first run?.
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:23|  [reverse_proxy] (c) OneDiff algorithm
---------------------------------------------------------------------------------------------------------------------------------
[reverse_proxy] info: OneDiff backup first run, removing snapshots
[reverse_proxy] info: all snapshots removed
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:23|  Snapshot & Quiescing
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:23|  [reverse_proxy] info: the VM will now be quiesced to ensure proper data handling
2018-04-30T05:56:23|  [reverse_proxy] info: round 1
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:34|  [reverse_proxy] info: snapshot taken, quiescing status: YES
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:36|  Removing snapshots, please wait...
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:37|  Syncronizing config files
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:40|  [reverse_proxy] info: created dir to host VM backup
2018-04-30T05:56:42|  [reverse_proxy] info: VMX file succesfully queued
2018-04-30T05:56:42|  [reverse_proxy] info: VMSD file succesfully queued
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:44|  Backing up virtual disks...
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:44|  DISK=/vmfs/volumes/datastore2/reverse_proxy/reverse_proxy-000001.vmdk (excluded)
2018-04-30T05:56:44|  DISK=/vmfs/volumes/datastore2/reverse_proxy/reverse_proxy-Snapshot1.vmsn
2018-04-30T05:56:44|  DISK=/vmfs/volumes/datastore2/reverse_proxy/reverse_proxy.vmdk
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:45|  Rsync: transfering file | /vmfs/volumes/datastore2/reverse_proxy/reverse_proxy-Snapshot1.vmsn
---------------------------------------------------------------------------------------------------------------------------------
sending incremental file list
reverse_proxy-Snapshot1.vmsn
         28,447 100%    0.00kB/s    0:00:00 (xfr#1, to-chk=0/1)

sent 28,553 bytes  received 35 bytes  19,058.67 bytes/sec
total size is 28,447  speedup is 1.00
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:47|  Rsync: transfering file | /vmfs/volumes/datastore2/reverse_proxy/reverse_proxy.vmdk
---------------------------------------------------------------------------------------------------------------------------------
sending incremental file list
reverse_proxy.vmdk
            508 100%    0.00kB/s    0:00:00 (xfr#1, to-chk=0/1)

sent 604 bytes  received 35 bytes  426.00 bytes/sec
total size is 508  speedup is 0.79
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:48|  Info: transfering file | /vmfs/volumes/datastore2/reverse_proxy/reverse_proxy-flat.vmdk
---------------------------------------------------------------------------------------------------------------------------------
2018-04-30T05:56:48|  Info: activate your (c)XSIDiff license to boost transfer speed
---------------------------------------------------------------------------------------------------------------------------------
sending incremental file list
reverse_proxy-flat.vmdk
    869,072,896   0%   11.16MB/s    2:35:16

And it stays there for hours, until I kill the process or launch another backup.

So, after several attempts, I got some VMs backed up, but I think something is wrong or I'm missing something.
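One way to tell a transfer that has really stopped from one that is only slow is to watch whether the file being written on the destination keeps growing. A minimal sketch in POSIX sh follows. The function names are made up for illustration, and which file to watch is an assumption (rsync writes to a temporary dot-file next to the target unless it runs with --inplace).

```shell
# Print the size in bytes of a file, read from its directory entry
# so that large disk images are not read from start to end.
file_size() {
    ls -ln "$1" | awk '{print $5}'
}

# Return 0 (stalled) if the file did not grow during the interval,
# 1 if it grew. Arguments: file path, interval in seconds.
is_stalled() {
    before=$(file_size "$1")
    sleep "$2"
    after=$(file_size "$1")
    [ "$after" -le "$before" ]
}
```

A possible call, with TARGET as a hypothetical variable holding the path of the file being written: is_stalled "$TARGET" 300 && echo "no growth in 5 minutes"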

Thanks!


#2 2018-04-30 09:56:23

admin
Administrator
Registered: 2017-04-21
Posts: 1,367

Re: Backups stalls

1 - You are using the --host argument and at the same time backing up to a remote IP. So you are executing the backup command on a remote machine, and from there backing up to another IP. Whether it is a different machine or the same one, we can't tell.

2 - To be able to use OneDiff differential backup, you need to have coherent Virtual Hardware versions. If you choose to use OneDiff from ESXi 6.5 to a lower version, the VMs hosted at ESXi 6.5 must necessarily be of a lower HW version which is compatible with the lower ESXi version you are moving the VMs to.

3 - You have set this argument --remote-xsipath=., which makes no sense. Always use an absolute path here, if you need to, but most probably you won't need to change the default value, which is set in the conf/xsiopts file.

4 - Never use compression for OneDiff, unless you are backing up through a really narrow WAN connection. GZip compression is counterproductive in a LAN. Not just a bit, but absolutely self-defeating.

5 - Do not overlap backup jobs; use the event handlers --on-success and --on-error instead.

6 - The argument --exec=yes is reserved for when you use external crontabs and for internal use of the --host argument. If you combine --exec=yes with --host, you are executing it twice.

Start by using the other backup programs, which don't have the above implications, and use the same ESXi versions when running OneDiff or study your setup carefully.
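Applied to the command from the first post, points 3, 4 and 6 would give something like the sketch below. It only prints the revised command for review and does not run it. The absolute path /vmfs/volumes/datastore1/xsi-dir is an assumption and should point to a persistent location on your own host; the IP placeholders and REMOVED are kept as they appeared in the original post.

```shell
# Placeholders kept from the original post; replace them before use.
SOURCE_HOST="XXX.XXX.XXX.XXX"
BACKUP_HOST="YYY.YYY.YYY.YYY"
# Assumption: a persistent xsi-dir exists at this absolute path on the remote host.
REMOTE_XSIPATH="/vmfs/volumes/datastore1/xsi-dir"

# Print the revised command, one argument per line, without executing it.
# Compared with the original: no --exec=yes (point 6), no :z suffix (point 4),
# and an absolute --remote-xsipath (point 3).
print_backup_cmd() {
    printf '%s\n' \
        "./xsibackup" \
        "--host=${SOURCE_HOST}" \
        "--backup-point=${BACKUP_HOST}:22:/vmfs/volumes/datastore1/backups/remoto1" \
        "--use-smtp=1" \
        "--remote-xsipath=${REMOTE_XSIPATH}" \
        "--mail-to=REMOVED" \
        "--backup-type=running" \
        "--backup-prog=onediff"
}

print_backup_cmd
```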


#3 2018-05-08 17:13:19

underdpt
Member
Registered: 2018-04-30
Posts: 2

Re: Backups stalls

Hi,

First, thank you for your quick and comprehensive answer. I've been doing tests these days and will try to do my best explaining every point:

1. My desired setup is: a backup server, from which I launch, log and manage the backups. The backups are launched on remote servers, and the storage server is the backup server. I thought that setup was doable with XSIBACKUP.

2. I know that, and because of that the backup server (the one storing the backups) is an ESXi 6.5 host, so I'm backing up from an older to a newer version of ESXi. Anyway, I'm not interested in running the backups on the backup server (though I would like that to be possible).

3. That's because on the first run, without using --remote-xsipath, XSIBACKUP installed itself in the root folder (without asking or choosing a more logical path). I noticed a message about speeding up the process and tried that, using the path where XSIBACKUP had previously been installed. I think that might be an issue (I never told it where to install the binaries).

4. I'm doing a WAN backup over 250 Mbps connections, but I've disabled compression to check it (I think I lost some throughput here, going from around 10MB/s to 7MB/s in very quick-and-dirty tests). The backup stalls anyway.

5. I'm not trying to overlap backup jobs. I only launch a second backup when I see that the first one has stalled (there is a message saying that when you launch an overlapping backup, the first one is killed, and that's what I intended to do).

6. Understood. Without that argument, the backup did stall again.
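As a rough cross-check on point 4, the figures quoted in this thread can be put side by side: the ceiling of a 250 Mbps link against the rates actually seen on the 102400M reverse_proxy disk. This is only a back-of-the-envelope sketch; the function names are made up for illustration and protocol overhead is ignored.

```shell
# Convert a link speed in megabits per second to megabytes per second.
mbps_to_mbytes() {
    awk -v mbps="$1" 'BEGIN { printf "%.2f\n", mbps / 8 }'
}

# Estimate the whole minutes needed to move size megabytes at rate MB/s.
transfer_minutes() {
    awk -v size="$1" -v rate="$2" 'BEGIN { printf "%d\n", size / rate / 60 }'
}

mbps_to_mbytes 250              # link ceiling in MB/s
transfer_minutes 102400 11.16   # full disk at the rate shown in the log
transfer_minutes 102400 7       # same disk at the rate seen without compression
```

At the logged rate a single 100 GB disk needs roughly two and a half hours, in line with rsync's own estimate in the log, so a slow run and a stalled run can look alike for quite a while.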

admin wrote: "Start by using the other backup programs, which don't have the above implications, and use the same ESXi versions when running OneDiff or study your setup carefully."

I've been trying with --backup-prog=rsync:z and --backup-prog=rsync and the issue remains: the backup stalls at some point (sometimes during the 'sending incremental file list' phase, sometimes in the middle of a transfer). Could you point me to anything I can try?

Thanks,
David


#4 2018-05-09 15:06:34

admin
Administrator
Registered: 2017-04-21
Posts: 1,367

Re: Backups stalls

UPDATE: run that very same backup, but launch it from the other side instead of launching it from the master and then making it write back to the master. Just for testing purposes.

1 - Yes, that backup topology is perfectly doable. Your logs show the signs of a clogged environment. You will need to determine why Rsync just hangs. To be more precise, I have never seen Rsync hang, but it behaves like that when it runs out of resources. XSIBackup launches Rsync and coordinates different binaries, but the cause of an Rsync TCP transfer freezing has to do with what's going on at the TCP and OS level. Clogged switch or NICs? ESXi server running out of memory or CPU?

In any case, before trying to set up some complex backup topology, I would just try some simpler ones, to at least have an overview of how your hardware is behaving under a simpler scenario.

2 - Just as long as you stick to compatible hardware versions, no problem.

3 - XSIBackup tries to determine the remote installation point. If it doesn't find one, or if it finds more than one, it then tries to use the value passed in --remote-xsipath; if that value is not passed, it tries the value hardcoded in conf/xsiopts (/vmfs/volumes/datastore1/xsi-dir), and if after all this it can't find a place to copy the files to, it copies them to the root, which BTW is not persistent. So check that you have a datastore1, and if you don't, look for an alternative place in your system that is persistent across reboots. Once you know where to place the files, use that path, e.g. /vmfs/volumes/my_persistent_ds/xsi-dir

4 - Now that we know you are going over a WAN, we can add it as the most likely cause of the problem. We usually run backups and move VMs over FO without many problems, although effective transfer speeds drop to the real available bandwidth between two given points. Try using a network monitoring tool to check your WAN stability, and also try to find out whether your ISP might be reducing your available bandwidth under some conditions.

Just run your backup over a reliable LAN and then again through your WAN to compare. I don't think it's your backup job that stalls, but your WAN.
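The lookup order described in point 3 can be sketched as a small shell function. This is only an illustration of the order as described in this post, not XSIBackup's actual code; the function name and its three arguments are made up for the example.

```shell
# Illustrative only: mirrors the lookup order described in point 3.
#   $1  newline-separated list of install dirs detected on the remote host
#   $2  value given to --remote-xsipath (may be empty)
#   $3  default dir from conf/xsiopts
resolve_xsipath() {
    detected=$1
    cli_path=$2
    default_path=$3
    # Count the non-empty lines, i.e. how many installs were detected.
    count=$(printf '%s\n' "$detected" | awk 'NF { n++ } END { print n + 0 }')
    if [ "$count" -eq 1 ]; then
        printf '%s\n' "$detected"        # exactly one install found: use it
    elif [ -n "$cli_path" ]; then
        printf '%s\n' "$cli_path"        # none or several: use --remote-xsipath
    elif [ -d "$default_path" ]; then
        printf '%s\n' "$default_path"    # fall back to the conf/xsiopts default
    else
        printf '/\n'                     # last resort: root, lost on reboot
    fi
}
```

For instance, resolve_xsipath "" "" /vmfs/volumes/datastore1/xsi-dir prints the default only if that directory exists, and / otherwise. On the host itself, ls /vmfs/volumes shows which datastores are available.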
