I strongly suspect that the issue is with one of the "disks" of that VM.
Are you using consistent ESXi versions, VMFS versions and HW versions?
You can't use Onediff between any two given ESXi hosts.
Needless to say, you need to start with a working, consolidated VM with no snapshots, and Onediff to the same VMFS version and an ESXi version equal to or slightly above that of the ESXi server hosting the original VM to clone.
I think I have found the core issue now. Been working on it all day. I will post back when I have tested a little bit more. If it is what I think it is, then it would explain why it worked when copying on the same server to another disk, but not between servers. And fail on just one VM at that.
Yes, the two ESXi hosts have identical versions. Can you elaborate on:
"You can't use Onediff between any two given ESXi hosts."
Do you mean: "The source and target ESXi hosts need to be the same version"?
OK. Here is the tl;dr summary:
One of the disk files in that VM had a space in the filename.
The long story:
The VM is an appliance called "Filr". It is a file sync/publishing appliance. A little bit like Sharepoint/OneDrive, but with the added benefit that files are in-house so you can be GDPR-compliant.
We have been using this for a long time and the VM probably started out running on VMWare 4. It has been upgraded several times, but from version 2 onward you upgraded by installing a new appliance and adding the data disk from the old one. Thus, the data disk kept the name it was automatically given when the Filr2 installation was created: "Novell Filr-2_1.vmdk".
This worked fine for XSIBackup when running from disk A to B on the same host, but not from one host to another. The transfer first copied "Novell Filr-2_1.vmdk" as "Novell.vmdk"; then "Novell Filr-2_1-flat.vmdk" came along and overwrote it, also as "Novell.vmdk".
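A small illustration of how a name can get cut at the space. This is not XSIBackup code and I do not know where in the transfer chain the truncation actually happened; it only shows the general shell behaviour where an unquoted variable is split into separate words. The helper functions count_args and first_arg are made up for the demo.

```shell
#!/bin/sh
# Illustration only: generic shell word splitting, not XSIBackup internals.
# count_args and first_arg are hypothetical helpers for this demo.
count_args() { echo "$#"; }
first_arg()  { echo "$1"; }

disk="Novell Filr-2_1.vmdk"

# Unquoted: the shell splits the value at the space into two words.
unquoted_count=$(count_args $disk)
unquoted_first=$(first_arg $disk)

# Quoted: the value is passed through as a single word.
quoted_count=$(count_args "$disk")

echo "unquoted: $unquoted_count argument(s), first is '$unquoted_first'"
echo "quoted:   $quoted_count argument(s)"
```

Any tool that receives only the first word would see "Novell" and nothing after it, which is consistent with both files landing under the same truncated name.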
My initial mistake when troubleshooting was that I looked at the hostd.log on the source, when I should have looked at the target. I misunderstood the wording of the error message: "ERROR DIFDELAL, details: [Filr4] error: error deleting all snapshots VM [Filr4_XSIBAK] Id 29, error: Remove All Snapshots". When I read it now I definitely see that it points to the target, but then again, hindsight is always 20-20.
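For anyone chasing the same thing, a rough sketch of how the target's log could be searched for descriptor open failures like the one quoted further down in this thread. The function name failed_opens is made up, and the log path in the comment is the usual ESXi location but should be treated as an assumption for your build.

```shell
#!/bin/sh
# failed_opens LOGFILE (hypothetical helper): print the log lines where
# a disk descriptor could not be opened.
failed_opens() {
    grep 'DescriptorOpenInt: failed to open' "$1"
}

# Example, run on the TARGET host (path is an assumption, adjust as needed):
#   failed_opens /var/log/hostd.log
```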
Hope this helps someone else!
Oh, yes. To rename a VMWare disk file:
1. Shut down VM
2. Remove all snapshots
3. Note which disk file goes where in the VM
4. Remove the offending disk and all disks after. DO NOT DELETE FILES!
5. cd into the directory where the file is
6. vmkfstools -E "My Bad File.vmdk" MyBadFile.vmdk
7. Add files back in correct order
VMWare disks are always two files: file.vmdk and file-flat.vmdk. Renaming file.vmdk with vmkfstools -E will rename the flat file automatically.
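Step 6 above can be sketched as a dry run that only prints the command, so nothing is renamed until you have checked it. The helper space_free_name is made up for this example and simply deletes the spaces; pick whatever new name suits you.

```shell
#!/bin/sh
# Hypothetical helper: derive a space-free name by deleting the spaces.
space_free_name() {
    printf '%s\n' "$1" | tr -d ' '
}

old="Novell Filr-2_1.vmdk"       # descriptor name from this thread
new=$(space_free_name "$old")    # candidate new name without spaces

# Dry run: print the rename command instead of executing it.
# Run the printed command from inside the VM directory, with the VM shut down.
echo "vmkfstools -E \"$old\" \"$new\""
```

Note that both names are quoted in the printed command, so the old name with the space reaches vmkfstools as one argument.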
(c)XSIBackup-Pro supports spaces in file names.
"(c)XSIBackup-Pro supports spaces in file names."
And yet it would, on every second go, complain that the file was not there:
2020-12-25T18:14:03.596Z info hostd [Originator@6876 sub=DiskLib opID=esxui-8168-138b user=root] DISKLIB-DSCPTR: DescriptorOpenInt: failed to open '/vmfs/volumes/5f97f8f4-7790c2f6-ea6b-74867aed9da0/Filr4/Novell Filr-2_1.vmdk': Could not find the file (60002)
Instead of the two files, there would be just one, named "Novell.vmdk".
I could clearly see that it cut off the name after the space. 100% repeatable. Could it have been a limitation of RSync?
Anyway. It works after the rename.
Are you quoting paths with spaces?
It could be something specific to your ESXi build. We have detected different bugs in the command line binaries, in fact some of them related to spaces. An upgrade to the latest build in the branch should be enough.
Now you know why we insist so much that you sysadmins take things easy and give up spaces in paths, long unjustified names, special characters and so on. If you commit to that way of doing things, you will save enough time to master playing electric guitar, or anything else you like.
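A quick way to check whether a datastore already contains names with spaces, as a minimal sketch. The function name list_space_names is made up, and the datastore path in the comment is a placeholder.

```shell
#!/bin/sh
# list_space_names DIR (hypothetical helper): print every path under DIR
# whose file or directory name contains a space.
list_space_names() {
    find "$1" -name '* *'
}

# Example (placeholder path, substitute your own datastore):
#   list_space_names /vmfs/volumes/datastore1
```

The -name test only looks at the last path component, so each offending file or directory is reported once, at its own level.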
"Are you quoting paths with spaces?"
No, it is just that the VM in question had one disk where the name had a space, and that was VMWare's doing, not mine. The name was automatically generated somewhere back in the VMWare 4.x days when the appliance was installed. Thus all I did was create a job to back up that VM.
I have been doing this for a very long time, since the CP/M days, and I would never myself put a space in a filename, or use accented characters in a filename, password or whatever, as it is a sure way of breaking things along the way.
It could very well be, as you say, that this is a bug in that particular build of VMWare, and it annoys the heck out of me that I did not spot that earlier. It would have saved me a lot of work.