#1 2020-10-12 11:37:50

AndersG
Member
Registered: 2020-01-16
Posts: 47

Backup of one VM fails, every other day.

This popped up as I am adding VMs to the backup job. It is always the same VM and the error ocurrs every other day:

Last error raised for the above VM:ERROR DIFQMSH4, details: [Filr-3.0] error: first 500M mistmatch [Filr-3.0_1-flat.vmdk]

• [ Mon Oct 12 05:11:14 UTC 2020 ] ERROR (DIFDELAL), details [Filr-3.0] error: error deleting all snapshots VM [Filr-3.0_XSIBAK] Id 15, error: Remove All Snapshots:• [ Mon Oct 12 05:11:22 UTC 2020 ] ERROR (DIFQMSH4), details [Filr-3.0] error: first 500M mistmatch [Filr-3.0-flat.vmdk]
• [ Mon Oct 12 05:11:30 UTC 2020 ] ERROR (DIFQMSH4), details [Filr-3.0] error: first 500M mistmatch [Novell Filr-2_1-flat.vmdk]
• [ Mon Oct 12 05:11:38 UTC 2020 ] ERROR (DIFQMSH4), details [Filr-3.0] error: first 500M mistmatch [Filr-3.0_1-flat.vmdk]
• Remove all snapshots failed Command: vim-cmd vmsvc/snapshot.removeall 15 2>&1 • [ Mon Oct 12 05:11:38 UTC 2020 ] WARNING (DIFRMVMX), details [Filr-3.0] measure DIFRMVMX: took measure, renamed remote .vmx file to reinitialize OneDiff

Any idea what may cause this?

XSIBACKUP-PRO 11.2.19
ESXi 6.7.0 build-14320388

I saw a post with a similar error and I checked that both machines run the same version of VMWare but the source has VMFS6 and the target VMFS5, could that be the issue?

Offline

#2 2020-10-13 13:47:24

admin
Administrator
Registered: 2017-04-21
Posts: 1,474

Re: Backup of one VM fails, every other day.

It could be a worn out disk raising some read error in some sector. The best part of post backup checksums is that you really know what's going on, the worst part is that when you know you start to wonder.

If you recover that VM you will, most probably, find out that it works. This is due to read errors being more frequent than write errors, thus the probability that you are getting a read error on some byte is higher than that of the byte having been written wrongly.

This might not justify changing the disk, so you have these options:

1 - Ignoring the error. You should first check the restored VM.
2 - Clone the disk that's causing the problem to a new virtual disk file.
3 - Change the disk.

Read this post to have an overview on why these errors are detected and how to deal with them:
https://33hops.com/xsibackup-disk-check … ption.html

Offline

#3 2020-10-14 06:00:30

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

OK. It is a raidset, but I will read that document you linked to and see what can be done. I will change the datastore to VMFS6 anyway an incoming weekend. The target server is the old server and it will just be used for backup and archival purposes.

Offline

#4 2020-12-01 07:02:00

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

OK. I had this error in the past every other day with the Filr3 appliance. I upgraded this appliance to Filr4 a week ago and the upgraded appliance VM now emits this error on backup, every other day. The server is a brand new server with a RAID array.

• [ Tue Dec 1 05:11:08 UTC 2020 ] ERROR (DIFDELAL), details [Filr4] error: error deleting all snapshots VM [Filr4_XSIBAK] Id 12, error: Remove All Snapshots:• [ Tue Dec 1 05:11:16 UTC 2020 ] ERROR (DIFQMSH4), details [Filr4] error: first 500M mistmatch [Filr4-flat.vmdk]
• [ Tue Dec 1 05:11:24 UTC 2020 ] ERROR (DIFQMSH4), details [Filr4] error: first 500M mistmatch [Novell Filr-2_1-flat.vmdk]
• [ Tue Dec 1 05:11:32 UTC 2020 ] ERROR (DIFQMSH4), details [Filr4] error: first 500M mistmatch [Filr4_1-flat.vmdk]
• Remove all snapshots failed Command: vim-cmd vmsvc/snapshot.removeall 12 2>&1 • [ Tue Dec 1 05:11:32 UTC 2020 ] WARNING (DIFRMVMX), details [Filr4] measure DIFRMVMX: took measure, renamed remote .vmx file to reinitialize OneDiff

The log does not say why it cannot delete the snapshot:
---------------------------------------------------------------------------------------------------------------------------------
2020-12-01T05:11:07|  [Filr4] info: deleting snapshot at [Filr4_XSIBAK] Id 12...
2020-12-01T05:11:08|  [Filr4] error DIFDELAL: error deleting all snapshots VM [Filr4_XSIBAK] Id 12, details: Remove All Snapshots:
2020-12-01T05:11:08|  Remove all snapshots failed
----------------------------------------------------------------------------------------------------------------------------
XSIBACKUP-PRO 11.2.19
VMware ESXi 6.7.0 build-14320388

Last edited by AndersG (2020-12-01 08:23:55)

Offline

#5 2020-12-01 11:01:39

admin
Administrator
Registered: 2017-04-21
Posts: 1,474

Re: Backup of one VM fails, every other day.

You only have one issue: not being able to delete the snapshot. The rest of errors are just derived from that fact.
Not being able to delete a snapshot is one of the most common errors in ESXi, it can be due to a number of reasons.

(c)XSIBackup-Pro is just the postman, it forwards the message thrown by the ESXi shell, which we agree is not very verbose.

To delve into the details on why that error was raised you will have to search in the VM and host logs. You will find the VM logs in the same dir where the .vmx file is kept, the host logs are at /scratch/logs, we would take a look at /scratch/log/hostd.log

Most common reasons are:

- You are trying to quiesce a service, but (c)VMWare tools are not installed or they are not working properly with the service to be quiesced.
- There isn't room enough to perform the operation.
- Some disk is damaged.
- Some ESXi service is not working properly.

Before going any further, just reboot the host. If the problem persists, you can read our posts on how to deal with snapshot errors:

https://33hops.com/esxi-snapshot-errors … tions.html
https://33hops.com/esxi-snapshot-errors … tions.html
https://33hops.com/troubleshooting-wind … -esxi.html

Offline

#6 2020-12-01 11:13:10

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

I will schedule a reboot next weekend. One difference I can see is that this VM does not have VMWare native tools but: open-vm-tools 10.3.10 build 12406962

Log:
2020-11-26T12:43:29.683Z info hostd[2099430] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-c329-c5d3 user=root] Task Created : haTask--vim.FileManager.makeDirectory-112209
2020-11-26T12:43:29.684Z info hostd[2099430] [Originator@6876 sub=Nfcsvc opID=esxui-c329-c5d3 user=root] Create requested for /vmfs/volumes/5e945a5b-865e687e-cfc0-b02628fba814/Filr4
2020-11-26T12:43:29.707Z info hostd[2099430] [Originator@6876 sub=Nfcsvc opID=esxui-c329-c5d3 user=root] file mkdir  (/vmfs/volumes/5e945a5b-865e687e-cfc0-b02628fba814/Filr4) by 'root'
2020-11-26T12:43:29.707Z info hostd[2099430] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-c329-c5d3 user=root] Task Completed : haTask--vim.FileManager.makeDirectory-112209 Status success
2020-11-26T12:43:29.888Z info hostd[2099446] [Originator@6876 sub=DatastoreBrowser opID=esxui-9824-c5e6 user=root] DatastoreBrowserImpl::SearchInt MoId:5e945a5b-865e687e-cfc0-b02628fba814-datastorebrowser dsPath:[datastore1] recursive:false
2020-11-26T12:43:33.300Z info hostd[2098919] [Originator@6876 sub=DatastoreBrowser opID=esxui-5060-c5f3 user=root] DatastoreBrowserImpl::SearchInt MoId:5e945a5b-865e687e-cfc0-b02628fba814-datastorebrowser dsPath:[datastore1] Filr-3.0 recursive:false
2020-11-26T12:43:35.093Z info hostd[2099455] [Originator@6876 sub=DatastoreBrowser opID=esxui-9add-c5fa user=root] DatastoreBrowserImpl::SearchInt MoId:5e945a5b-865e687e-cfc0-b02628fba814-datastorebrowser dsPath:[datastore1] Filr-3.0 recursive:false
2020-11-26T12:43:38.179Z info hostd[2098919] [Originator@6876 sub=DatastoreBrowser opID=esxui-747-c600 user=root] DatastoreBrowserImpl::SearchInt MoId:5e945a5b-865e687e-cfc0-b02628fba814-datastorebrowser dsPath:[datastore1] recursive:false
2020-11-26T12:43:39.767Z info hostd[2099451] [Originator@6876 sub=DatastoreBrowser opID=esxui-fe88-c605 user=root] DatastoreBrowserImpl::SearchInt MoId:5e945a5b-865e687e-cfc0-b02628fba814-datastorebrowser dsPath:[datastore1] Filr4 recursive:false
2020-11-26T12:43:41.719Z info hostd[2099447] [Originator@6876 sub=DatastoreBrowser opID=esxui-e581-c612 user=root] DatastoreBrowserImpl::SearchInt MoId:5e945a5b-865e687e-cfc0-b02628fba814-datastorebrowser dsPath:[datastore1] Filr-3.0 recursive:false
2020-11-26T12:43:41.774Z info hostd[2099444] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-590c-c616 user=root] Task Created : haTask--vim.VirtualDiskManager.copyVirtualDisk-112224
2020-11-26T12:43:41.776Z info hostd[2099436] [Originator@6876 sub=DiskLib opID=esxui-590c-c616 user=root] DISKLIB-LIB_CLONE   : Failed to clone disk using Object Cloning
2020-11-26T12:43:41.898Z info hostd[2099436] [Originator@6876 sub=Libs opID=esxui-590c-c616 user=root] DDB: "adapterType" = "lsilogic" (was "buslogic")
2020-11-26T12:43:41.898Z info hostd[2099436] [Originator@6876 sub=Libs opID=esxui-590c-c616 user=root] DDB: "longContentID" = "cb04ad82ddf414bb7a409a538e3da8a0" (was "cd376258d9a7c6da085cfdbefffffffe")
2020-11-26T12:43:41.898Z info hostd[2099436] [Originator@6876 sub=Libs opID=esxui-590c-c616 user=root] DDB: "virtualHWVersion" = "7" (was "14")
2020-11-26T12:43:41.899Z info hostd[2099436] [Originator@6876 sub=DiskLib opID=esxui-590c-c616 user=root] DISKLIB-LINK  : DiskLinkNativeVmfsCloneExisting: Failed to create native clone. (24).
2020-11-26T12:43:41.899Z info hostd[2099436] [Originator@6876 sub=DiskLib opID=esxui-590c-c616 user=root] DISKLIB-LIB_CLONE   : Failed to create native clone on destination handle : The specified feature is not supported by this version (24).
2020-11-26T12:43:43.063Z info hostd[2098835] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/5e945a5b-865e687e-cfc0-b02628fba814/Filr-3.0/Filr-3.0.vmx] State Transition (VM_STATE_OFF -> VM_STATE_CREATE_SCREENSHOT)
2020-11-26T12:43:43.064Z info hostd[2098835] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/5e945a5b-865e687e-cfc0-b02628fba814/Filr-3.0/Filr-3.0.vmx] State Transition (VM_STATE_CREATE_SCREENSHOT -> VM_STATE_OFF)
2020-11-26T12:43:43.064Z info hostd[2099434] [Originator@6876 sub=Solo.ScreenHandler] Requested screenshot was zero bytes. Failing operation.
2020-11-26T12:43:46.963Z info hostd[2098919] [Originator@6876 sub=Default] IPMI SEL sync took 0 seconds 0 sel records, last 4

edit: Found this: https://communities.vmware.com/t5/vSphe … -p/2672360

Last edited by AndersG (2020-12-01 11:16:40)

Offline

#7 2020-12-01 11:37:37

admin
Administrator
Registered: 2017-04-21
Posts: 1,474

Re: Backup of one VM fails, every other day.

That is the recommended (by ©VMWare itself) package for Linux guests.
If you upgraded the OS and you have not rebooted the host services, most likely cause is some process loaded into memory is expecting to communicate with some previous version of something. ESXi is great, but it's still not perfect.

Offline

#8 2020-12-01 11:40:05

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

This VM is an appliance so upgrading is basically spinning up a new VM and migrating the data. It is a Linux VM. I could also take this up with the vendor. I have not had any problem with any other VM.

Offline

#9 2020-12-01 12:16:38

admin
Administrator
Registered: 2017-04-21
Posts: 1,474

Re: Backup of one VM fails, every other day.

If you took the time to trace this error down to its primary cause, you would probably find it has to do with the MySQL service, could also be any other service running in the VM which refuses to be quiesced or is not playing well with VMWare Tools.

In case this is not caused by some incompatibility or lack of some piece of software, a reboot will help it. If, on the contrary, this is caused by some sort of incompatibiity, you will need to contact the vendor

Offline

#10 2020-12-01 12:21:54

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

OK. FWIW it is not a question of having the time, but rather the necessary knowledge smile But Filr4 uses PostgreSQL. What should I look for in the logs?

Offline

#11 2020-12-01 12:34:42

admin
Administrator
Registered: 2017-04-21
Posts: 1,474

Re: Backup of one VM fails, every other day.

We have narrowed the problem down more than a generic error allows to.
What about?

tail -n1000 /scratch/log/hostd.log | grep -i "error"

Offline

#12 2020-12-01 13:14:55

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

Well, I posted the relevant snippet from the log earlier smile Anyway. The error might be related to something stopping ESXi from quiescing the machine. I will investigate further.

And Yes now that we have narrowed it down, I can troubleshoot further.

Offline

#13 2020-12-12 12:22:03

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

OK. When looking at hostd.log and comparing to good days, I see:

2020-12-12T05:37:04.627Z error hostd[2099433] [Originator@6876 sub=SoapAdapter.HTTPService] Failed to read request; stream: <io_obj p:0x000000868a48d8d0, h:-1, <TCP '127.0.0.1 : 8307'>, <TCP '127.0.0.1 : 36575'> FD Closed>, error: N7Vmacore16TimeoutExceptionE(Operation timed out: Stream: <io_obj p:0x000000868a48d8d0, h:-1, <TCP '127.0.0.1 : 8307'>, <TCP '127.0.0.1 : 36575'> FD Closed>, duration: 00:00:47.668509 (hh:mm:ss.us))
--> [context]zKq7AVICAgAAAASD2gALaG9zdGQAACxONmxpYnZtYWNvcmUuc28AALC+GwCQnBcAQcwaAPjRLAAVaSkAteUoAJPpKACrxDYBO30AbGlicHRocmVhZC5zby4wAAJ9nw5saWJjLnNvLjYA[/context]
2020-12-12T05:37:09.698Z error hostd[5224387] [Originator@6876 sub=Default] [LikewiseGetDomainJoinInfo:354] QueryInformation(): ERROR_FILE_NOT_FOUND (2/0):

Offline

#14 2020-12-22 17:21:52

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

I could really use some help here. It is absolutely repeatable. It fails every other day. Is there ano other logs I could look at? In hostd I see an entry "Task Created : haTask-24-vim.VirtualMachine.removeAllSnapshots" on the days when it works, but not on the days it does not. On days when it fails the first access for that VM seems to be "Task Created : haTask-24-vim.ManagedEntity.reload"

Offline

#15 2020-12-23 11:46:58

admin
Administrator
Registered: 2017-04-21
Posts: 1,474

Re: Backup of one VM fails, every other day.

Well, we don't even know what application that chunk of log above corresponds to:
Does SoapAdapter.HTTPService mean something to you?.
It could be some other error, are you sure it's raised right at the time to take the snapshot?.

All we can see it's an error related to something communicating on a file descriptor that seems to be related to a process communicating on TCP in localhost (127.0.0.1) on ports 8307, 36575

If that is the culprint, then you might be running some service at the same time the snapshot is being taken, which is not compatible with VMWare Tools quiescing.

The solution is simple, keep it iddle during that backup window or use some type of backup that does not require quiescing, like warm or cold backup types.

Please note, that using warm or cold, does indeed quiesce the FS the tough way, that is: stopping the VM. In case of warm backups, a snapshot is taken from off state and then the VM is rapidly switched on again, thus the downtime is minimal.

If your VM is gently switched off, you will have the equivalent of a total service quiescing state (as all services will be orderly turned off), while, if your VM needs to be powered off (due to the same service not responding to a shutdown request), you will have the same situation as if you would plainly issue a non-quiesced snapshot.

Determining whether the abrupt power off or non-quiesced snapshot can compromise the integrity of your data will depend on knowing the "rebel" service in more detail, namely: how intelligent it is at the time to manage that kind of situation.

Offline

#16 2020-12-23 12:14:14

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

Well I see the SoapAdapter.HTTPService error both for failed backups and backups that run OK. Also is the fact that it happens exactly every other day interesting. There is no activity that I know of at this time, but I could change the time and see if it makes any difference.

Would it help if I mailed you the log?

Offline

#17 2020-12-23 12:36:06

admin
Administrator
Registered: 2017-04-21
Posts: 1,474

Re: Backup of one VM fails, every other day.

There's always plenty of activity at any given server, but that service does not seem to be something generic, it's probably some other application that makes use of it. You may not even need it to be running. You will have to find that out.

We have already gone too far with this issue. You are being affected by an everyday's IT Admin life general issue which is not directly related to ©XSIBackup, but to quiescing the FS.

If you can't solve this on your own we can help you, we would charge for our time though.

Offline

#18 2020-12-23 12:50:05

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

The server in Question is for file sync. It is only accessed by clients. I will start by breaking out the backup to a separate job at another time which will make comparing logs easier. I will get back to you if I cannot solve this on our own.

Offline

#19 2020-12-23 18:20:01

admin
Administrator
Registered: 2017-04-21
Posts: 1,474

Re: Backup of one VM fails, every other day.

Well, that's revealing. You will probably need to make sure the file sync service is iddle, at least, while the snapshot process is taking place. Nonetheless, if all it does is file sync'in you won't need quiescing. If you still get that error when attempting a non quiesced backup, then:

1 - Create some simple script that stops the service before taking a snapshot and restart it when the snapshot is done.
https://kb.vmware.com/s/article/1006671
https://communities.vmware.com/t5/VI-VM … -p/1663316

2 - Perform a warm backup if you can afford to stop the VM for some seconds.

Offline

#20 2020-12-24 09:13:58

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

Thanks and yes. I have had too little time to troubleshot this properly, but now I will have ample time over christmas smile Anyway. Step one is to break out this backup as a separate job and run it at another time. That way it will be so much easier to spot differences. Will post my findings here.

So, have a very nice christmas and a happy new year!

Offline

#21 2020-12-24 12:20:16

admin
Administrator
Registered: 2017-04-21
Posts: 1,474

Re: Backup of one VM fails, every other day.

Don't look over our previous post, we are pointing at the right way to handle your situation.
When you have some intensive I/O service, it needs to be stopped before the snapshot and started right after. (c)VMWare Tools provide very specific ways to do it manually when you can't rely on quiescing.

Merry Christmas to you too!!

Offline

#22 2020-12-24 13:41:36

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

Tried first with the service shut down (in between preparing Christmas dinner). Fails on every second backup. Then turned off the VM completely. Still fails, so the error must be somewhere else?

Offline

#23 2020-12-24 13:44:44

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

But hey! Have I been looking in the wrong place?

[Filr4] info: deleting snapshot at [Filr4_XSIBAK] Id 29...
[Filr4] error DIFDELAL: error deleting all snapshots VM [Filr4_XSIBAK] Id 29, details: Remove All Snapshots:
Remove all snapshots failed


This is removing the snapshot on the remote copy, right?

Offline

#24 2020-12-25 09:28:15

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

So, if I look at hostd.log on the target, I definitely see things:

2020-12-25T09:15:33.297Z info hostd[2099162] [Originator@6876 sub=Libs opID=vim-cmd-2d-b471 user=root] OBJLIB-FILEBE : FileBEOpen: can't open '/vmfs/volumes/5f97f8f4-7790c2f6-ea6b-74867aed9da0/Filr4/Novell Filr-2_1.vmdk' : Could not find the file (393218).
2020-12-25T09:15:33.297Z info hostd[2099162] [Originator@6876 sub=DiskLib opID=vim-cmd-2d-b471 user=root] DISKLIB-DSCPTR: DescriptorOpenInt: failed to open '/vmfs/volumes/5f97f8f4-7790c2f6-ea6b-74867aed9da0/Filr4/Novell Filr-2_1.vmdk': Could not find the file (60002)
2020-12-25T09:15:33.298Z info hostd[2099162] [Originator@6876 sub=DiskLib opID=vim-cmd-2d-b471 user=root] DISKLIB-LINK  : "/vmfs/volumes/5f97f8f4-7790c2f6-ea6b-74867aed9da0/Filr4/Novell Filr-2_1.vmdk" : failed to open (The system cannot find the file specified).
2020-12-25T09:15:33.298Z info hostd[2099162] [Originator@6876 sub=DiskLib opID=vim-cmd-2d-b471 user=root] DISKLIB-CHAIN :"/vmfs/volumes/5f97f8f4-7790c2f6-ea6b-74867aed9da0/Filr4/Novell Filr-2_1-000002.vmdk": Failed to open parent "/vmfs/volumes/5f97f8f4-7790c2f6-ea6b-74867aed9da0/Filr4/Novell Filr-2_1.vmdk": The system cannot find the file specified.
2020-12-25T09:15:33.298Z info hostd[2099162] [Originator@6876 sub=DiskLib opID=vim-cmd-2d-b471 user=root] DISKLIB-CHAIN : "/vmfs/volumes/5f97f8f4-7790c2f6-ea6b-74867aed9da0/Filr4/Novell Filr-2_1.vmdk" : failed to open (The parent of this virtual disk could not be opened).
2020-12-25T09:15:33.298Z info hostd[2099162] [Originator@6876 sub=DiskLib opID=vim-cmd-2d-b471 user=root] DISKLIB-LIB   : Failed to open '/vmfs/volumes/5f97f8f4-7790c2f6-ea6b-74867aed9da0/Filr4/Novell Filr-2_1-000002.vmdk' with flags 0x3 The parent of this virtual disk could not be opened (23).


So, why this Filr4/Novell Filr-2_1-000002.vmdk? That file exists on the target but not on source. Also present on the target is several Filr4-Snapshotnnnn files.

I also see the remove snapshot erroro on the target server in the UI. I will move the datastore folder on the target to another disk and retry the backup operation

Last edited by AndersG (2020-12-25 09:31:15)

Offline

#25 2020-12-25 18:21:40

AndersG
Member
Registered: 2020-01-16
Posts: 47

Re: Backup of one VM fails, every other day.

OK. Same thing. When I look at the other VMs on the target server, they are all without snapshots, but the problem one (Filr4) has a snapshot on it, and it cannot be removed through the UI either. the error is that it complains about file Novell Filr-2_1-000001.vmdk, it is looking for a parent, Novell Filr-2_1.vmdk, but there is just a Novell Filr-2_1-flat.vmdk

Offline

Board footer