#1 2019-12-12 04:27:03

kcrouch@miningsystems.com
Member
From: Thornton
Registered: 2019-11-19
Posts: 10

XSIBackup is deleting the backups part way through

I am running xsibackup with --backup-prog=Vmkfstools and --backup-type=All. It may successfully back up many of my virtual machines until it thinks it has to back up a VM with no name. At this point it produces some errors saying that it is skipping that VM, but then it DELETES ALL of the VM backups that it has already done up to that point (that is, everything on my --backup-point). It then continues to back up the remaining VMs. It might even happen again in the same job, and I am left with only a few backed-up machines or none at all.

This happens on all 4 hosts in the cluster but works OK on other clusters.

I initially thought that something weird was happening with the --backup-room and --del-dirs options but I removed these with no improvement. I have watched the backup run and monitored the backup point and have witnessed it delete all of the files at the time when the errors occurred.

Please note that the VM ID it references does NOT show up when I run "vim-cmd vmsvc/getallvms includeConfigNotAvailable" and does not exist in /etc/vmware/hostd/vmInventory.xml. I do not know where xsibackup is getting it from.
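For reference, the inventory check described above could be scripted roughly as follows. This is a minimal sketch: vm_id_registered is a hypothetical helper name, and it assumes the listing has a header line followed by one VM per line with the numeric Vmid in the first column, as "vim-cmd vmsvc/getallvms" prints it.

```shell
#!/bin/sh
# vm_id_registered ID: reads an inventory listing on stdin and succeeds
# only if ID appears in the first column of a non-header line.
# (Hypothetical helper; assumes the Vmid is the first field of each line.)
vm_id_registered() {
    awk -v id="$1" 'NR > 1 && $1 == id { found = 1 } END { exit !found }'
}

# On an ESXi host it could be used like this (not executed here):
#   vim-cmd vmsvc/getallvms | vm_id_registered 705 \
#       || echo "VM 705 is not registered on this host"
```

A check like this only reports what the host's inventory says at that moment; it does not explain where a stale ID came from.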

Other relevant history:
- I was recently using --backup-prog=OneDiff and then --backup-prog=XSIdiff but changed to Vmkfstools as I was getting too many errors. I was also getting similar errors with those backup progs but it did not DELETE the other backups when the error occurred.
- XSIbackup version 11.2.2
- vSphere version 6.0

Backup Job:
"/vmfs/volumes/esx6-local-1/xsi-dir/xsibackup" \
--backup-prog=Vmkfstools \
--date-dir=yes \
--backup-point=/vmfs/volumes/backup \
--backup-type=All \
--backup-how=Hot \
--backup-room=25000 \
--use-smtp=2 \
--mail-to=xxx@xxx.com.au \
--del-dirs=+3d \
--backup-id=esx6f \
--description="esx6 All Full" \
--exec=yes >> "/vmfs/volumes/esx6-local-1/xsi-dir/var/logs/xsibackup.log"

Errors in log file at time of file deletion:
-------------------------------------------------
2019-12-11T22:27:23|  [] Starting backup (size is 40485M on 40962M file)
---------------------------------------------------------------------------------------------------------------------------------
2019-12-11T22:27:23|  XSIBackup will backup your VMs while they are running, so that users can continue to use the VM
2019-12-11T22:27:23|  while the backup is taking place. You can also run cold and warm --backup-how
---------------------------------------------------------------------------------------------------------------------------------
2019-12-11T22:27:23|  Hot backup selected for VM: [], will not be switched off
---------------------------------------------------------------------------------------------------------------------------------
2019-12-11T22:27:25|  [] error: the .vmx file was not found, this VM cannot be backed up
---------------------------------------------------------------------------------------------------------------------------------
2019-12-11T22:27:25|  [] info: boot partition is MBR
---------------------------------------------------------------------------------------------------------------------------------
2019-12-11T22:27:26|  Removing snapshots, please wait...
---------------------------------------------------------------------------------------------------------------------------------
2019-12-11T22:27:29|  Error CLDELSN1: cannot delete snapshot VM Id: 705, details: (vim.fault.NotFound) {
2019-12-11T22:27:29|     faultCause = (vmodl.MethodFault) null,
2019-12-11T22:27:29|     msg = "Unable to find a VM corresponding to "705""
2019-12-11T22:27:29|  }
2019-12-11T22:27:29|  (vim.fault.NotFound) {
2019-12-11T22:27:29|     faultCause = (vmodl.MethodFault) null,
2019-12-11T22:27:29|     msg = "Unable to find a VM corresponding to "705""
2019-12-11T22:27:29|  }
2019-12-11T22:27:29|  cat: can't open '/vmfs/*.vmsd': No such file or directory
2019-12-11T22:27:30|  Syncronizing config files
---------------------------------------------------------------------------------------------------------------------------------
2019-12-11T22:27:31|  [] error: the .vmx file was not found, this VM cannot be backed up
---------------------------------------------------------------------------------------------------------------------------------
2019-12-11T22:27:31|  [](705) warning: no .vmsd file found
2019-12-11T22:27:35|  [] error DIFFEXUP: no VMDK disks present, skipping VM, nothing to backup
2019-12-11T22:27:36|  [] info: VMWare tools were not detected, the system will not be quiesced
---------------------------------------------------------------------------------------------------------------------------------
2019-12-11T22:27:39|  Backing up virtual disks...
---------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------

Errors to standard output for the entire job (note that it deleted all existing backups twice, once at 699 and once at 705):
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "699""
}
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "699""
}
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "699""
}
cat: can't open '': No such file or directory
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "699""
}
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "699""
}
/vmfs/volumes/esx6-local-1/xsi-dir/xsibackup: eval: line 9: to: not found
cat: can't open '': No such file or directory
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "699""
}
cat: can't open '': No such file or directory
grep: : No such file or directory
cat: can't open '': No such file or directory
/vmfs/volumes/esx6-local-1/xsi-dir/xsibackup: eval: line 4: to: not found
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "699""
}
cat: can't open '/vmfs/*.vmsd': No such file or directory
/vmfs/volumes/esx6-local-1/xsi-dir/xsibackup: eval: line 9: to: not found
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "699""
}
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "705""
}
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "705""
}
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "705""
}
cat: can't open '': No such file or directory
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "705""
}
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "705""
}
/vmfs/volumes/esx6-local-1/xsi-dir/xsibackup: eval: line 9: to: not found
cat: can't open '': No such file or directory
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "705""
}
cat: can't open '': No such file or directory
grep: : No such file or directory
cat: can't open '': No such file or directory
/vmfs/volumes/esx6-local-1/xsi-dir/xsibackup: eval: line 4: to: not found
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "705""
}
cat: can't open '/vmfs/*.vmsd': No such file or directory
/vmfs/volumes/esx6-local-1/xsi-dir/xsibackup: eval: line 9: to: not found
(vim.fault.NotFound) {
   faultCause = (vmodl.MethodFault) null,
   msg = "Unable to find a VM corresponding to "705""


#2 2019-12-12 04:59:57

kcrouch@miningsystems.com
Member
From: Thornton
Registered: 2019-11-19
Posts: 10

Re: XSIBackup is deleting the backups part way through

I have just determined that the machines in question were vMotioned from one host to another between when xsibackup started and when it tried to back them up. It should not result in all backup files being deleted, however!


#3 2019-12-12 09:33:03

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: XSIBackup is deleting the backups part way through

You should disable the vMotion interface while the backup is taking place to prevent problems caused by VMs being moved around during the backup window.

Alternatively, you can schedule DRS to be disabled during the backup window.

--backup-prog=Vmkfstools deletes the target directory in preparation for the backup operation, as you don't want your backup files to get mixed with any preexisting content. If for some reason you have some VMs installed in the root directory of some other VM, you might be deleting some previous backups in preparation for the top-level VM.

Needless to say, this would be an extremely awkward situation that would require immediate action and repair on your part. We can't be sure that this is the source of your issue, though.

In any case, more than one circumstance must have coincided for a previous backup to be deleted: if all VMs are installed at paths of the same level, deleting previous backups is virtually impossible.

Alternatively, you can select the VMs you want to back up using the --backup-type=custom argument until you fix the nested installations.


#4 2019-12-12 10:43:57

kcrouch@miningsystems.com
Member
From: Thornton
Registered: 2019-11-19
Posts: 10

Re: XSIBackup is deleting the backups part way through

You have misunderstood me. A single job backs up, say, 20 VMs. It may have backed up 10 out of 20. It then hits the migrated VM that is no longer there. It then removes all files for the 10 VMs it has only just backed up and continues to back up the remaining 9. At the end of the job I may have only 9 out of 20 backups remaining. More likely, though, it comes across another migrated machine, the whole thing happens again, and I am left with only 3 backed-up machines or even none.

This has occurred not just once but every day for a week on each of the 4 hosts.

This is happening on all 4 hosts in a cluster. The exact same job is running on a different cluster with no problems but I don’t see the migration problems on that cluster.

This seems like a big bug to me. I take your point about disabling DRS which I already did today after I finally worked out that this was the cause. It has been a very frustrating process.


#5 2019-12-12 11:47:35

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: XSIBackup is deleting the backups part way through

Taking DRS into account is a must for backups, not only with (c)XSIBackup but with any other software working at host level. Working at host level has some advantages over working at vCenter level, and some drawbacks, such as having to deal with DRS should you happen to have it enabled.

We believe we understood you correctly the first time. Nonetheless, what we are trying to explain is that the only line of code that deletes anything in your backup prog (--backup-prog=Vmkfstools) runs in preparation for backing up a given VM to some --backup-point, e.g. /vmfs/volumes/backup/MY-VM01. Thus, if some previous backup in the same job is being deleted, it can only be because some previous VM in the queue lies under some other VM's backup path.

Let's say you are backing up these VMs: VM01, VM02, VM03, VM04, VM05 to /vmfs/volumes/backup/. The first VM will be backed up to /vmfs/volumes/backup/VM01, VM02 to /vmfs/volumes/backup/VM02, and so on, and each of those directories will be wiped beforehand to make sure you don't mix files.

That delete operation, the wiping of the target folder, is the only one performed at the backup point root. Thus, if your backups at /vmfs/volumes/backup/VM01 and /vmfs/volumes/backup/VM02 are being wiped, it can only be because some VM after them in the queue is being backed up to /vmfs/volumes/backup, so that the backup directory is deleted to host a VM named "backup", which sits one directory level closer to the root than the others.

You can of course stretch this explanation to any directory depth, the matter is relative to the previous backup folders.

Of course, another way of deleting previous backups is if you backup VMs with the same name to the same backup point, but that needs no explanation. If you are backing up VMs hosted in a cluster, you should make sure that you keep names unique.
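The path relationships described above, including the duplicate-name case, can be checked mechanically. The following is a minimal sketch and not (c)XSIBackup code: is_inside and find_clobbered are hypothetical helper names, and the sketch assumes target paths contain no whitespace.

```shell
#!/bin/sh
# is_inside PARENT CHILD: succeeds when CHILD is PARENT itself or lies below it.
is_inside() {
    parent=${1%/}
    child=${2%/}
    case "$child/" in
        "$parent"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# find_clobbered: reads target directories from stdin in backup order and
# prints "LATER wipes EARLIER" whenever wiping a later target would also
# remove an earlier one (same path, or the earlier one nested inside it).
# Assumes paths without whitespace.
find_clobbered() {
    seen=""
    while IFS= read -r target; do
        for earlier in $seen; do
            if is_inside "$target" "$earlier"; then
                echo "$target wipes $earlier"
            fi
        done
        seen="$seen $target"
    done
}

# Example (not executed here):
#   printf '%s\n' /vmfs/volumes/backup/VM01 /vmfs/volumes/backup | find_clobbered
```

An empty result means no target in the list is the same as, or a parent of, an earlier one.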

Apart from that, we will of course review any possibility that may arise during code execution to prevent any unwanted deletion, but (c)XSIBackup works under a general principle, which is to never stop even if it's receiving errors.

Look at it this way:

If, on hitting a nonexistent VM or some other issue, due to DRS or any other circumstance such as a backup target directory containing data, (c)XSIBackup just complained and stopped execution, you wouldn't have any previous backup in the job deleted, because there would be nothing to delete; but on top of that, you would not have the remaining ones either.


#6 2019-12-12 20:19:55

kcrouch@miningsystems.com
Member
From: Thornton
Registered: 2019-11-19
Posts: 10

Re: XSIBackup is deleting the backups part way through

Please see the details of my job above and you will see that --date-dir=yes is set. Every new backup, when it starts, goes to a different date-dir directory. I have no problem with vmkfstools deleting files at the START of a backup because it is a new directory every time. What I do have a problem with is it deleting backups that the same job has just created, half way through the backup, when it encounters an error. I can tell you with 100% certainty that this is what happens, as I have watched it do it. The error I made in my first description, which may have led you to misunderstand me, is that I said it deleted everything from the --backup-point when I should have said --backup-point/date-dir.

So, if it backs up /vmfs/volumes/backup/2019121200000059/vm01, then /vmfs/volumes/backup/2019121200000059/vm02, and then hits the described error because vm03 has been migrated to another host by DRS, it DELETES vm01 and vm02 from the date-dir directory and then continues to back up vm04.

I also have no problem with it continuing to back up vm04, but I have a massive problem with it deleting vm01 and vm02, which were successful backups. It deletes them NOT at the start of a new backup but part way through the one backup.
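A check run after the job could confirm which VM directories actually survived in the date-dir. This is a minimal sketch: missing_backups is a hypothetical helper name, and it assumes each VM is backed up to a directory named after the VM directly under the date-dir, as in the example paths above.

```shell
#!/bin/sh
# missing_backups DATE_DIR NAME...: prints every NAME that has no
# directory DATE_DIR/NAME. Prints nothing when all are present.
# (Hypothetical helper; assumes one directory per VM under DATE_DIR.)
missing_backups() {
    dir=$1
    shift
    for name in "$@"; do
        [ -d "$dir/$name" ] || echo "$name"
    done
}

# Example (not executed here):
#   missing_backups /vmfs/volumes/backup/2019121200000059 vm01 vm02 vm03 vm04
```

In the scenario described, such a check would list vm01 and vm02 (deleted) as well as vm03 (never backed up), which makes the loss visible without watching the job run.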

My backup window is so large that I would need to disable DRS permanently to fix this issue. I would not mind so much if it simply missed the migrated ones, but it is unacceptable that the others are deleted.

I cannot confirm it for sure but I think it only happens with vmkfstools and not onediff or xsidiff. I certainly didn’t notice it with the others.

I would appreciate it if you tested it and checked your code. Maybe when it errors in this instance the code starts again from the beginning, where it does actually delete everything in the directory. Or maybe some variables are incorrectly initialised or set and it loops back around and executes the deletion code. Or maybe it is something in vmkfstools itself.
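One way an unset variable could lead to this, shown purely as a hypothetical illustration: if a per-VM target path is built as date-dir/VM-name and the name is empty (the log shows the name as "[]"), the path collapses to the date-dir itself. Nothing in this thread confirms that this is what (c)XSIBackup does; target_dir and safe_target_dir are made-up names used only for the sketch.

```shell
#!/bin/sh
# Hypothetical illustration only; this is not (c)XSIBackup code.

# target_dir BASE NAME: prints BASE/NAME, the directory a per-VM wipe
# would act on. With an empty NAME the result is BASE/ itself.
target_dir() {
    printf '%s/%s\n' "$1" "$2"
}

# safe_target_dir BASE NAME: same, but refuses an empty NAME so the
# result can never be BASE itself.
safe_target_dir() {
    if [ -z "$2" ]; then
        echo "refusing: empty VM name would target $1 itself" >&2
        return 1
    fi
    printf '%s/%s\n' "$1" "$2"
}

# Example (not executed here):
#   target_dir /vmfs/volumes/backup/2019121200000059 ""
```

A guard of the second kind, placed before any delete, would turn an empty name into a skipped VM rather than a wipe of the whole date-dir.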


#7 2019-12-13 09:11:05

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: XSIBackup is deleting the backups part way through

We will of course test it, fix any issue we find, and make any reasonable improvement. Please note, however, that the normal behavior would be to halt the backup job and report the error. Seen that way, the deletion of the previous VMs in the job is as incidental as the backup of the subsequent VMs that does get performed, because the backup job is invalid in your context.

You can't consider anything happening around your backup job as regular; it's exceptional. The deletion of the first VMs' backup folders, like the backup of the subsequent ones, is spurious.


#8 2019-12-13 09:53:13

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: XSIBackup is deleting the backups part way through

We will reinforce the code to detect whether there has been a previous error and cancel the pre-deletion in those cases, and we will improve the error handling when the VM is no longer there. We'll have a new version ready shortly.

Please note that these are measures to save some furniture once the house is on fire.

We also detected an error in your job that is unrelated to this matter: the --backup-id argument must contain a numeric string in the range 000-999.
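A job script could check the Id before calling xsibackup. This is a minimal sketch: valid_backup_id is a hypothetical helper name, and it assumes "000-999" means exactly three digits.

```shell
#!/bin/sh
# valid_backup_id ID: succeeds only for exactly three decimal digits
# (assumption: "000-999" means a three-digit numeric string).
valid_backup_id() {
    case "$1" in
        [0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (not executed here):
#   valid_backup_id esx6f || echo "backup id must be 000-999"
```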


#9 2019-12-13 11:29:23

kcrouch@miningsystems.com
Member
From: Thornton
Registered: 2019-11-19
Posts: 10

Re: XSIBackup is deleting the backups part way through

Thank you. That is much appreciated.

I did read somewhere that the backup ID should be numeric, but it was in this format when I started and seemed to work, so I didn't change it. As long as I don't go over 5 characters it seems to be OK. Is there something that won't work if it is not numeric? I will change it; I'm just curious. Does the name of the file that the job is stored in have to match the backup ID?


#10 2019-12-13 15:28:38

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: XSIBackup is deleting the backups part way through

The GUI won't work if you try to load a job with an Id outside 000-999.

We found that constraining some values was necessary, as people are really imaginative when it comes to assigning Ids. If everybody stuck to something like your Id, it would be all right, though.

Please come back in a couple of days to our website, the new version will be ready.


#11 2019-12-14 19:19:40

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: XSIBackup is deleting the backups part way through

We just released (c)XSIBackup-Pro 11.2.14 which incorporates some improvements that will help you detect when DRS is moving your VMs around so that you can take action.


#12 2019-12-16 04:00:51

kcrouch@miningsystems.com
Member
From: Thornton
Registered: 2019-11-19
Posts: 10

Re: XSIBackup is deleting the backups part way through

Thanks very much. I have tested this and it works correctly. It does not delete the backups that it successfully completed prior to the error and the error message is more meaningful when a VM has been migrated / vmotioned before its turn at being backed up.


#13 2019-12-16 10:34:53

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: XSIBackup is deleting the backups part way through

In any case, remember to disable vMotion during the backup window; otherwise (c)XSIBackup will try to grab the VMs like fish in a barrel. You might very well be able to back them all up on one host or the other, but some may manage to escape by jumping around.

