#1 2019-10-14 22:33:40

wowbagger
Member
Registered: 2017-05-11
Posts: 23

Error when running backup

When grepping through the log file I see several errors popping up for each VM that is being backed up.
DC throws the error and moves on to the next VM.

::: detail :::  32.17% done | block    3624 out of   11264                              | Done  29.35%2019-10-14T18:36:38 | Error code 434 at file dedup-in.c, line 434
Error description: can't rename temp block: /vmfs/volumes/XSI_BACKUP/POLLUX-xsitools/2019_DC//data/9/9/3/f/6/993f66a34b72396e2df357216056e72d3cc28da2
-----------------------------------------------------------------------------------------------------------------
--

Target repository is a Synology NFS mount on the ESX that has been running XSIBackup Pro for the last 2 years.

Gr

Offline

#2 2019-10-15 14:02:20

admin
Administrator
Registered: 2017-04-21
Posts: 1,386

Re: Error when running backup

Well, the error is obvious, you are lacking rename permissions. (c)XSIBackup-DC runs under a user that is not allowed to rename the blocks.

UPDATE:

It could also have something to do with possible physical damage in the disk or some SSH timeout. We'll investigate and let you know.

Offline

#3 2019-10-16 15:37:13

admin
Administrator
Registered: 2017-04-21
Posts: 1,386

Re: Error when running backup

There's a double slash in your path (...POLLUX-xsitools/2019_DC//data/9/9/3/f/...), although that does not seem to be the cause of your issue. We have performed lots of tests since months ago, specially in Synology devices as NFS, and we never came accross some error like yours. It is double puzzling the fact that the job seems to have run to 32%, thus the information that you present, as is, makes us believe that you filled your disk.

Offline

#4 2019-10-16 20:51:10

wowbagger
Member
Registered: 2017-05-11
Posts: 23

Re: Error when running backup

It happens for every backup run. Basically the XSI invocation is a Jenkins job that does a remote SSH's to the ESX server as a user with admin rights and executes XSIBackup cmdline with a list of VM's. The repo is a NFS mounted synology 6TB disk. It holds a 4TB XSIBackup-Pro repository and 2TB of free disk space. It throws the error and moves on to the next VM in the backup list, processes it for a while (actual blocks are added to the data folder) and throws the same error, it does that for every VM in the backup list. I've checked the Synology to see if the disks are failing/going to fail (SMAT) but it all looks good. I've switched back to XSIBackup Pro and initialised a new repository (renamed the old 4TB etc) and it runs without a problem so I'm not sure if it's infrastructure related at this point but then again, they're two different products.

What does the rename temp block actually do?
I tested it with a 4GB VM and didn't have this problem, I'll try to manually do one of the big VM's to see if it's occurring also.

Offline

#5 2019-10-17 10:47:36

admin
Administrator
Registered: 2017-04-21
Posts: 1,386

Re: Error when running backup

You are stretching the chain of permissions too much. That is not (c)XSIBackup's fault, if you design a complex script where permissions are being delegated down the chain you must be prepared to debug your scenario. What the rename block routine does is just that, rename a block, but that has nothing to do with your problem.

You are authenticating via SSH and you are being granted some rights for some limited time and that time is expiring. Study the SSH protocol to narrow your issue down or simplify your set up, I see no other choice.

The postman is not responsible for the bad news.

UPDATE:

Imagine for a moment that you run a remote command over SSH, so the remotely launched process starts to run in the background. Then due to a timeout the original SSH session expires. What would happen with the auth token issued to the SSH job and the process depending on it?, would my process still have rights to rename, create files, etc...?

That is the kind of question I would make myself if I were in your shoes.

Offline

#6 2019-10-18 13:11:30

wowbagger
Member
Registered: 2017-05-11
Posts: 23

Re: Error when running backup

Thanks for your reply but I don't see how you consider the SSH session to expire when VM_SERVER_LIST contains multiple servers and it errors on each of the servers after a while but continues with the next server when encountered an error, DC continues to log output, it does not stop.

Consider this command, it's just a standard remote SSH command execution and nothing complex, this is how we execute XSIBackup Pro for the last >year also:

$ssh root@$ESX_SERVER '/scratch/xsi/xsibackup --backup "VMs(Oracle-1)" /vmfs/volumes/XSI_BACKUP/POLLUX-xsitools/2019_DC/ --verbosity=5 --use-smtp=1 --subject="Pollux Backup"'

XSIBackup DC generates this as output:

[90m|---------------------------------------------------------------------------------|[0m
[90m||-------------------------------------------------------------------------------||[0m
[90m|||[0m[1m   (c)XSIBackup-Datacenter 1.0.0.1: Backup & Replication Software           [0m [90m|||[0m
[90m|||[0m[1m   (c)33HOPS, Sistemas de Informacion y Redes, S.L. | All Rights Reserved[0m    [90m|||[0m
[90m||-------------------------------------------------------------------------------||[0m
[90m|---------------------------------------------------------------------------------|[0m
                   (c)Daniel J. Garcia Fidalgo | info@33hops.com
[90m|---------------------------------------------------------------------------------|[0m
System Information: ESXi, Kernel 6 Major 0 Minor 0 Patch 0
[90m-----------------------------------------------------------------------------------------------------------------[0m
License: [1m000306F200000000000000002047479466ac[0m
[90m-----------------------------------------------------------------------------------------------------------------[0m
PID: 245181660, Running job as: [1mroot[0m
[90m-----------------------------------------------------------------------------------------------------------------[0m
(c)XSIBackup-Datacenter setting repository at /vmfs/volumes/XSI_BACKUP/POLLUX-xsitools/2019_DC/
[90m-----------------------------------------------------------------------------------------------------------------[0m
Virtual Machine Name: Oracle-1
[90m-----------------------------------------------------------------------------------------------------------------[0m
Creating snapshot VM : Oracle-1 (powered on)
[90m-----------------------------------------------------------------------------------------------------------------[0m
*** Snapshot was successfully created ***
[90m-----------------------------------------------------------------------------------------------------------------[0m
New Backup: [1mOracle-1[0m
[90m-----------------------------------------------------------------------------------------------------------------[0m
Backup start date: 2019-10-18 10:17:18
[90m-----------------------------------------------------------------------------------------------------------------[0m
2019-10-18 10:17:18 | Backing up 41 files, total size is 224624829954
[90m-----------------------------------------------------------------------------------------------------------------[0m
    NUMBER                                                       FILE            SIZE       PROGRESS
[90m-----------------------------------------------------------------------------------------------------------------[0m

     1/41                                        Oracle-1-3a76f09d.hlog        42.00 B  [90m|[0m Done   0.00%
[90m-----------------------------------------------------------------------------------------------------------------[0m

<******* LOGLINES ARE REMOVED HERE FOR LENGTH *******>

     2/41                                            Oracle-1-flat.vmdk        16.00 GB [90m|[0m Done   0.00%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Disk metadata map available, zeros will be skipped
[90m-----------------------------------------------------------------------------------------------------------------[0m

[90m::: detail :::[0m   0.06% done [90m|[0m block       1 out of    1639                              [90m|[0m Done   0.00%

[90m::: detail :::[0m 100.00% done [90m|[0m block    1024 out of    1024                              [90m|[0m Done  12.43%
[90m-----------------------------------------------------------------------------------------------------------------[0m

     9/41                                                 Oracle-1.vmxf       150.00 B  [90m|[0m Done  12.43%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    10/41                                                  Oracle-1.vmx         3.64 KB [90m|[0m Done  12.43%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    11/41                                                 Oracle-1.vmx~         3.61 KB [90m|[0m Done  12.43%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    12/41                                                Oracle-1.nvram         8.48 KB [90m|[0m Done  12.43%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    13/41                                          Oracle-1_6-flat.vmdk       110.00 GB [90m|[0m Done  12.43%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Disk metadata map available, zeros will be skipped
[90m-----------------------------------------------------------------------------------------------------------------[0m

[90m::: detail :::[0m   0.01% done [90m|[0m block       1 out of   11264                              [90m|[0m Done  12.43%
[90m::: detail :::[0m   0.02% done [90m|[0m block       2 out of   11264                              [90m|[0m Done  12.44%

<******* LOGLINES ARE REMOVED HERE FOR LENGTH *******>

[90m::: detail :::[0m 100.00% done [90m|[0m block   11264 out of   11264                              [90m|[0m Done  65.01%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    14/41                                       Oracle-1_1.vmdk.extents        37.05 KB [90m|[0m Done  65.01%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Could not retrieve disk metadata all blocks will be scanned
[90m-----------------------------------------------------------------------------------------------------------------[0m

    15/41                                                 vmware-61.log         3.15 MB [90m|[0m Done  65.01%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    16/41                                          Oracle-1_5-flat.vmdk        20.00 GB [90m|[0m Done  65.01%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Disk metadata map available, zeros will be skipped
[90m-----------------------------------------------------------------------------------------------------------------[0m

[90m::: detail :::[0m   0.05% done [90m|[0m block       1 out of    2048                              [90m|[0m Done  65.02%
[90m::: detail :::[0m   0.10% done [90m|[0m block       2 out of    2048                              [90m|[0m Done  65.02%

<******* LOGLINES ARE REMOVED HERE FOR LENGTH *******>

[90m::: detail :::[0m 100.00% done [90m|[0m block    2048 out of    2048                              [90m|[0m Done  74.57%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    17/41                                               Oracle-1_5.vmdk       498.00 B  [90m|[0m Done  74.57%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Could not retrieve disk metadata all blocks will be scanned
[90m-----------------------------------------------------------------------------------------------------------------[0m

    18/41                                                 vmware-58.log         1.14 MB [90m|[0m Done  74.57%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    19/41                                                 vmware-56.log       204.36 KB [90m|[0m Done  74.57%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    20/41                                               Oracle-1_6.vmdk       500.00 B  [90m|[0m Done  74.57%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Could not retrieve disk metadata all blocks will be scanned
[90m-----------------------------------------------------------------------------------------------------------------[0m

    21/41                                          Oracle-1_2-flat.vmdk        40.00 GB [90m|[0m Done  74.57%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Disk metadata map available, zeros will be skipped
[90m-----------------------------------------------------------------------------------------------------------------[0m

[90m::: detail :::[0m   0.02% done [90m|[0m block       1 out of    4096                              [90m|[0m Done  74.58%
[90m::: detail :::[0m   0.05% done [90m|[0m block       2 out of    4096                              [90m|[0m Done  74.58%

<******* LOGLINES ARE REMOVED HERE FOR LENGTH *******>

[90m::: detail :::[0m  15.89% done [90m|[0m block     651 out of    4096                              [90m|[0m Done  77.61%
[90m::: detail :::[0m  15.92% done [90m|[0m block     652 out of    4096                              [90m|[0m Done  77.62%2019-10-18T11:45:18 | Error code 434 at file dedup-in.c, line 434
Error description: can't rename temp block: /vmfs/volumes/XSI_BACKUP/POLLUX-xsitools/2019_DC//data/f/a/6/f/6/fa6f6e33afff2395af66b83ada7674a785d4f717
[90m-----------------------------------------------------------------------------------------------------------------[0m

[90m-----------------------------------------------------------------------------------------------------------------[0m

    22/41                                               Oracle-1_2.vmdk       498.00 B  [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Could not retrieve disk metadata all blocks will be scanned
[90m-----------------------------------------------------------------------------------------------------------------[0m

    23/41                                       Oracle-1_5.vmdk.extents        36.77 KB [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Could not retrieve disk metadata all blocks will be scanned
[90m-----------------------------------------------------------------------------------------------------------------[0m

    24/41                                       Oracle-1_2.vmdk.extents       139.43 KB [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Could not retrieve disk metadata all blocks will be scanned
[90m-----------------------------------------------------------------------------------------------------------------[0m

    25/41                                                 vmware-59.log       795.52 KB [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    26/41                                         Oracle-1.vmdk.extents       935.00 B  [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Could not retrieve disk metadata all blocks will be scanned
[90m-----------------------------------------------------------------------------------------------------------------[0m

    27/41                                       Oracle-1_6.vmdk.extents       201.27 KB [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m
Could not retrieve disk metadata all blocks will be scanned
[90m-----------------------------------------------------------------------------------------------------------------[0m

    28/41                          vmx-Oracle-1-2449876116-1.vswp    [open excluded] [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    29/41                                        Oracle-1.vmx.lck    [open excluded] [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    30/41                                                    vmware.log         8.57 MB [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    31/41                                  Oracle-1-92062494.vswp    [open excluded] [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    32/41                                              Oracle-1.vmx.tmp         3.61 KB [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    33/41                                     Oracle-1-Snapshot640.vmsn        28.42 KB [90m|[0m Done  77.62%
[90m-----------------------------------------------------------------------------------------------------------------[0m

    34/41                                 Oracle-1-000001-delta.vmdk    [skipped excluded]
[90m-----------------------------------------------------------------------------------------------------------------[0m

    35/41                                       Oracle-1-000001.vmdk    [skipped excluded]
[90m-----------------------------------------------------------------------------------------------------------------[0m

    36/41                               Oracle-1_6-000001-delta.vmdk    [skipped excluded]
[90m-----------------------------------------------------------------------------------------------------------------[0m

    37/41                                     Oracle-1_6-000001.vmdk    [skipped excluded]
[90m-----------------------------------------------------------------------------------------------------------------[0m

    38/41                               Oracle-1_2-000001-delta.vmdk    [skipped excluded]
[90m-----------------------------------------------------------------------------------------------------------------[0m

    39/41                                     Oracle-1_2-000001.vmdk    [skipped excluded]
[90m-----------------------------------------------------------------------------------------------------------------[0m

    40/41                               Oracle-1_5-000001-delta.vmdk    [skipped excluded]
[90m-----------------------------------------------------------------------------------------------------------------[0m

    41/41                                     Oracle-1_5-000001.vmdk    [skipped excluded]
[90m-----------------------------------------------------------------------------------------------------------------[0m
Total size:            200719 MB                                                        [90m|[0m Done 100.00%
[90m-----------------------------------------------------------------------------------------------------------------[0m
*** Snapshot was removed ***
[90m-----------------------------------------------------------------------------------------------------------------[0m
Backup end date: 2019-10-18 11:45:19
[90m-----------------------------------------------------------------------------------------------------------------[0m
Time taken:     5281 sec.
[90m-----------------------------------------------------------------------------------------------------------------[0m
Total time:     5281 sec.
[90m-----------------------------------------------------------------------------------------------------------------[0m
Full file speed:                                                                        38.01 mb/s
[90m-----------------------------------------------------------------------------------------------------------------[0m
Real data speed:                                                                        56.55 mb/s
[90m-----------------------------------------------------------------------------------------------------------------[0m
Differential blocks were added to the .blocklog database
[90m-----------------------------------------------------------------------------------------------------------------[0m
Data processing completed successfully
[90m-----------------------------------------------------------------------------------------------------------------[0m
Sending e-mail to email@email.com...
[90m-----------------------------------------------------------------------------------------------------------------[0m
E-mail was sent          [32mOK[0m
[90m-----------------------------------------------------------------------------------------------------------------[0m
Removed /tmp/xsi dir     [32mOK[0m
[90m-----------------------------------------------------------------------------------------------------------------[0m
Unlocked backup          [32mOK[0m
[90m-----------------------------------------------------------------------------------------------------------------[0m
Removed PID              [32mOK[0m
[90m-----------------------------------------------------------------------------------------------------------------[0m

If the SSH session would expire or time out at the moment the error is thrown everything would pretty much stop and the connection would be closed with an error state, it would not continue. But the SSH session does not stop or expires as you can see because DC continues.

XSIBackup Pro runs every weekend long-running jobs on a big list of huge VM's and runs for >12h without interruption and executed in the exact same remote SSH executed way, never had an issue but I cannot say the same for DC, I cannot execute it remotely.

Offline

#7 2019-10-18 13:54:31

admin
Administrator
Registered: 2017-04-21
Posts: 1,386

Re: Error when running backup

We can offer yo hints, but we don't have all the details, setting permissions is your "homework".
We might have not grasped your particular scenario in full detail. If the xsibackup binary continues to run, that means that you have permissions to run it in the security context of the ESXi server, but that does not mean that you have permissions under the security context of the underlying FS in your NFS NAS. What's puzzling is that you seem to have those permissions for some time in each backup cycle.

The only insight we can offer you is:

The line that raises the error is a plain call to the C rename() function run under the user you used to invoke the xsibackup binary. So the issue is as simple as:

1 - You don't have permissions to rename the block.
2 - Your target file system is broken.

Don't look any further, whatever your problem is, it is related to your ESXi user not being able to access the block to rename.

It could also be an NFS issue. Do not use NFS 4.1 to connect your backup volume. There are plenty of ESXi issues related to that, use NFS 3 instead

You may alternatively use your (c)Synology device as an IP target, that way you would bypass any NFS related issue and use SSH to authenticate on the underlying FS.

https://33hops.com/xsibackup-datacenter … y-nas.html

If you backup over IP use a block size of 1M, add --block-size=1M, it will be faster.

Also, stick to well known FSs, don't be creative with backup infrastructure. BTRFS is quite stable now, but EXT4 or XFS are much more mature and faster to host deduplicated chunks of data.

And last, but not least important:

Does it make sense to create a separate user for backups instead of using the root user in your case?
Whatever alternative user you use will neccessarily need to have full rights on the VM folder, thus, it will have full access to your guest's data. You may employ different users to isolate different VMs security contexts, but in case you are backing up VMs for a single owner:

What's the point in using a user different than root?,
that you will isolate that user from admin rights in the ESXi server itself?,
what's the point in doing that when they already have your data?

The queen to protect in a simple single user scenario is the VM, if you loose the queen, who cares about the ESXi server, it's just a container.

Offline

#8 2019-10-28 09:59:55

admin
Administrator
Registered: 2017-04-21
Posts: 1,386

Re: Error when running backup

We have assumed it was a permission issue. Version 1.0.0.6 will add some extended debug information around this issue. It could very well turn out to be an I/O problem.

Let's coalesce this issue in this new thread, which is more descriptive:
http://33hops.com/forum/viewtopic.php?id=593

Offline

Board footer