So the problem has moved to the NFS datastore created by linking the NAS RAID volume. I don't know whether it could be the NAS hardware, firmware, or something else, but I think it is nothing related to XSIBackup itself.
My OneDiff backups are not related to any NFS datastore, so in my case the problem is not linked to any of that.
Just tested with 10.0.3 and the latest ESXi updates, and the problem is still there.
<code>[root@XServe3-ESXi:/] lsof |grep ZeusSrv
35346 vmx FILE 3 ZeusSrv.nvram
35346 vmx FILE 50 /vmfs/volumes/55785226-1e682e1a-0f8c-001e52f3e01c/ZeusSrv/ZeusSrv.vmx.lck
35346 vmx FILE 51 /vmfs/volumes/55785226-1e682e1a-0f8c-001e52f3e01c/ZeusSrv/ZeusSrv.vmx
35346 vmx FILE 52 /vmfs/volumes/55785226-1e682e1a-0f8c-001e52f3e01c/ZeusSrv/ZeusSrv.vmx~
35346 vmx FILE 53 /vmfs/volumes/55785226-1e682e1a-0f8c-001e52f3e01c/ZeusSrv/vmware.log
35346 vmx FILE 77 /vmfs/volumes/55785226-1e682e1a-0f8c-001e52f3e01c/ZeusSrv/ZeusSrv-000002-delta.vmdk
35346 vmx FILE 82 /vmfs/volumes/55785226-1e682e1a-0f8c-001e52f3e01c/ZeusSrv/ZeusSrv-flat.vmdk
35346 vmx FILE 83 /vmfs/devices/deltadisks/5f9bead-ZeusSrv-000002-delta.vmdk
[root@XServe3-ESXi:/] esxcli vm process list
World ID: 35351
Process ID: 0
VMX Cartel ID: 35346
UUID: 56 4d 66 f3 34 97 8a 64-d5 22 9f c1 89 bc 57 bb
Display Name: ZeusSrv
Config File: /vmfs/volumes/55785226-1e682e1a-0f8c-001e52f3e01c/ZeusSrv/ZeusSrv.vmx</code>
ESXi duplicates snapshot files when performing a quiesced backup on GPT Windows guests, which causes the OneDiff algorithm to fail. We are working on a solution.
We want to clarify that the OneDiff algorithm works perfectly well on all other GPT OSs that support quiescing. Only Windows fails or, rather, behaves in a totally different way from the rest of the OSs. This error is very similar to a previous one affecting MBR Windows systems: https://kb.vmware.com/selfservice/micro … Id=2006849
It looks like Windows developers are not able to make their VSS behave as consistently as LVM in Linux. In fact, I would say that, with regard to VMware: 1/ they aren't able to make Windows work consistently on VMware, 2/ they don't want to make Windows behave consistently on VMware.
As said, we will release a fix shortly. For now, if you want to perform a quiesced backup on a Windows GPT or MBR server, you will need to assume a short downtime (usually less than a minute) and use the [url=https://33hops.com/xsibackup-help-man-page.html#backuphow]--backup-how=warm[/url] switch, which will turn the server off, take a snapshot and then switch the OS back on before making the hot backup.
We have created a version of the OneDiff algorithm that can back up Windows VMs and still quiesce them properly. Normal cloning of this type of VM will also be allowed. Should any Pro user want to try a preview of this upcoming version, please send us a form stating your registered address at: https://33hops.com/contact-form.html
I keep getting the same OneDiff errors. This happens for all running VMs on the server, one of which is Linux-based (Xpenology) and the other Windows 10.
Last error raised for the above VM:
ERROR DIFQMSH4, details: [ds3615] error: first 500M mistmatch [ds3615_2-flat.vmdk]
Last error raised for the above VM:
ERROR DIFQMSH4, details: [FD1] error: first 500M mistmatch [Win10_AMD-flat.vmdk]
• ----Snapshot Created On : 8/3/2018 12:2:35
• ----Snapshot Created On : 8/3/2018 12:7:54
• ----Snapshot Desciption : xsibackupdiff 2
• ----Snapshot Id : 14
• ----Snapshot Id : 18
• ----Snapshot Name : xsibackupdiff
• ----Snapshot State : powered off
• --Snapshot Created On : 8/3/2018 8:44:47
• --Snapshot Created On : 8/3/2018 8:50:50
• --Snapshot Desciption : xsibackupdiff 1
• --Snapshot Id : 13
• --Snapshot Id : 17
• --Snapshot Name : xsibackuphot
• --Snapshot State : powered off
• [ Fri Aug 3 12:03:28 UTC 2018 ] ERROR (DIFMERG1), details [ds3615] error: cannot merge diff data back to VM, details: Remove Snapshot:
• [ Fri Aug 3 12:05:32 UTC 2018 ] ERROR (DIFQMSH4), details [ds3615] error: first 500M mistmatch [ds3615_2-flat.vmdk]
• [ Fri Aug 3 12:09:35 UTC 2018 ] ERROR (DIFMERG1), details [FD1] error: cannot merge diff data back to VM, details: Remove Snapshot:
• [ Fri Aug 3 12:14:57 UTC 2018 ] ERROR (DIFQMSH4), details [FD1] error: first 500M mistmatch [Win10_AMD-flat.vmdk]
• Remove snapshot failed
• [ Fri Aug 3 12:05:37 UTC 2018 ] WARNING (DIFRMVMX), details [ds3615] measure DIFRMVMX: took measure, renamed remote .vmx file to reinitialize OneDiff
• [ Fri Aug 3 12:14:57 UTC 2018 ] WARNING (DIFRMVMX), details [FD1] measure DIFRMVMX: took measure, renamed remote .vmx file to reinitialize OneDiff
Also, the corresponding VMs create " accessible" namesakes with the "xxx_XSIBACK" suffix.
I have tried removing all snapshots, consolidating the VMs, deleting the backup folders, and restarting the job. Same errors still.
This is with XSIBackup Pro 11.02.
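For anyone wanting to repeat the snapshot cleanup mentioned above from the ESXi shell, here is a minimal sketch. It assumes the VM display name contains no spaces, and the helper names `vm_id_by_name` and `remove_all_snapshots` are made up for this example. `vim-cmd vmsvc/getallvms` and `vim-cmd vmsvc/snapshot.removeall` are the stock ESXi commands; everything else is illustration only.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they only wrap stock vim-cmd calls.

# Print the Vmid of the VM whose display name is $1
# (second column of `vim-cmd vmsvc/getallvms`; names with spaces not handled).
vm_id_by_name() {
    vim-cmd vmsvc/getallvms | awk -v name="$1" '$2 == name { print $1 }'
}

# Remove (and thereby consolidate) every snapshot of the VM named $1.
remove_all_snapshots() {
    id=$(vm_id_by_name "$1")
    if [ -z "$id" ]; then
        echo "VM not found: $1" >&2
        return 1
    fi
    vim-cmd vmsvc/snapshot.removeall "$id"
}
```

Calling `remove_all_snapshots ds3615` would then clear the snapshot tree of that VM; `vim-cmd vmsvc/snapshot.get <Vmid>` can be used afterwards to confirm that nothing is left.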
Any updates on this matter? I cannot figure out why this mismatch happens with such regularity.
The size of the Trivial Check can be set in the conf/xsiopts general configuration variables file.
The size and first 500M check, a.k.a. the Trivial Check, is extremely useful 99% of the time.
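To give an idea of what a check of this kind involves, here is a rough sketch. It is not XSIBackup's actual code; it assumes the check boils down to comparing the total size of two disk files and then a checksum of their first N MiB. The function name `trivial_check` and its messages are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper (not XSIBackup's own code): compare two disk images by
# total size and by a checksum of their first N MiB (500 by default).
trivial_check() {
    src=$1
    dst=$2
    mib=${3:-500}

    # Step 1: both files must have the same size in bytes.
    size_src=$(($(wc -c < "$src")))
    size_dst=$(($(wc -c < "$dst")))
    if [ "$size_src" -ne "$size_dst" ]; then
        echo "size mismatch"
        return 1
    fi

    # Step 2: the checksum of the first $mib MiB of each file must match.
    sum_src=$(dd if="$src" bs=1048576 count="$mib" 2>/dev/null | cksum)
    sum_dst=$(dd if="$dst" bs=1048576 count="$mib" 2>/dev/null | cksum)
    if [ "$sum_src" != "$sum_dst" ]; then
        echo "first ${mib}M mismatch"
        return 1
    fi

    echo "ok"
}
```

It would be called as `trivial_check <source>-flat.vmdk <backup>-flat.vmdk`, with an optional third argument for the number of MiB to compare. Reading only the head of the disk keeps the check cheap compared with hashing the whole file, at the cost of missing differences further in.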
Using a Linux OS within ESXi is seamless and works like a charm when quiescing the OS.
Windows OSs are another matter. It's obvious that these two monster companies (Microsoft and VMware) are in an open commercial war, as Microsoft tries to force its users onto Hyper-V. There are bugs, especially in everything related to quiescing, that have plainly not been resolved. That is most probably due to MS putting effort into making its OSs not fully compatible with VMware.
It's no coincidence that quiescing MS operating systems is the battleground, as this operation is critical to properly backing up mission-critical servers like Exchange or SQL Server, which are some of the MS flagships.
First of all, if you want to run mission-critical services, use Unix or Linux. If you already have MS in place and still need to back it up, follow our guide to quiescing Microsoft OSs: [url=https://33hops.com/troubleshooting-windows-snapshots-in-esxi.html]Troubleshooting Windows Snapshots In Esxi[/url]
Be prepared to still receive some quiescing errors from time to time. You can use the --options=no-500M-check argument at your own risk, but take the following into account:
If you take the time to quiesce a Microsoft operating system in a mission-critical 24x7 service, you might still find yourself fighting in the mud against invisible bugs that cause quiescing operations to time out or raise errors in the MS Event Viewer, which are in turn catalogued as "ignorable" by the Microsoft Knowledge Base articles.
Thus, it's probably wiser to use one of two approaches:
1 - Plainly run a --backup-how=warm backup with no quiescing. This implies less than 30 seconds of downtime, versus a nightmare that in the end will cause more downtime.
2 - Take a VSS snapshot prior to backing the VM up and delete it afterwards, also without quiescing in the virtual machine backup operation. This will ensure that you can discard the VSS snapshot and return to a stable configuration, should some I/O operation get trapped during the VM snapshot process.