#1 2020-07-21 13:07:34

sphen
Member
Registered: 2020-06-13
Posts: 45

prune stuck

I am running prune on my remote system and have run into an issue where it gets stuck.

it has been running for ~ 8 hours or so. when checking processes i can see sort is still running:

22037 admin         4 S   ./xsibackup --prune 20200721000005
23778 admin      2256 S   sh -c sort -T'/share/CACHEDEV1_DATA/tmp/xsi/tmp' '/share/CACHEDEV1_DATA/tmp/xsi/tmp/0U.xsi' | uniq > '/share/CACHEDEV1_DATA/tmp/xsi/tmp/0U.xsi.tmp' && mv '/share/CACHEDEV1_DATA/tmp/xsi/tmp/0U.xsi.tmp' '/share/CACHEDEV1_DATA/tmp/xsi/tmp/0U.xsi' && rm -rf '/share/CACHEDEV1_DATA/tmp/xsi/tmp/0U.xsi.tmp' && echo 1 || echo 0
23779 admin    346336 D   sort -T/share/CACHEDEV1_DATA/tmp/xsi/tmp /share/CACHEDEV1_DATA/tmp/xsi/tmp/0U.xsi

output of prune is as follows:

[/share/CACHEDEV1_DATA/BKUP] # ./xsibackup --prune 20200721000005
|---------------------------------------------------------------------------------|
||-------------------------------------------------------------------------------||
|||   (c)XSIBackup-DC 1.3.0.1: Backup & Replication Software                    |||
|||   (c)33HOPS, Sistemas de Informacion y Redes, S.L. | All Rights Reserved    |||
||-------------------------------------------------------------------------------||
|---------------------------------------------------------------------------------|
                   (c)Daniel J. Garcia Fidalgo | info@33hops.com
|---------------------------------------------------------------------------------|
System Information: Linux, Kernel 4 Major 14 Minor 24 Patch 0
-----------------------------------------------------------------------------------------------------------
License: unlicensed trial version
-----------------------------------------------------------------------------------------------------------
PID: 22037, Running job as: root
-----------------------------------------------------------------------------------------------------------
Finding blocks to prune, please wait...
-----------------------------------------------------------------------------------------------------------
Getting map files from repo...
-----------------------------------------------------------------------------------------------------------
(!) Not enough space to prune in the tmp folder: /tmp. required: 857074968, available: 36831232
-----------------------------------------------------------------------------------------------------------
Looking for alternative locations...
-----------------------------------------------------------------------------------------------------------
The TMP dir was moved to </share/CACHEDEV1_DATA/tmp/xsi/tmp> (515.03 GB free)
-----------------------------------------------------------------------------------------------------------
Retrieving general block data 100.00%
-----------------------------------------------------------------------------------------------------------
Please wait while we order the data at: /share/CACHEDEV1_DATA/tmp/xsi/tmp/0U.xsi
-----------------------------------------------------------------------------------------------------------

top shows this for the sort process:

Mem: 840032K used, 54988K free, 24620K shrd, 39684K buff, 58112K cached
CPU:  3.5% usr  9.4% sys  0.0% nic  7.8% idle 78.3% io  0.0% irq  0.8% sirq
Load average: 5.94 6.04 6.35 2/729 4277
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
23779 23778 admin    D     808m 92.2   0  0.2 sort -T/share/CACHEDEV1_DATA/tmp/xsi/tmp /share/CACHEDEV1_DATA/tmp/xsi/tmp/0U.xsi

maybe i should try pruning from a different machine via NFS connection? pruning via ssh/remote does not work from my understanding. issue is that i do not believe i have a server other than this one at the remote site.

any thoughts? thanks.


#2 2020-07-24 11:31:29

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: prune stuck

If you try to prune over NFS you will add network latency to the issue, which will only make the problem worse.
The thing is:

1 - How big is your repo?
2 - What resources (CPU, memory, disk) can you count on?

Pruning is a resource-intensive operation, by far the most resource-intensive one when compared to --backup or --replica. If your NAS device is not powerful enough, it might be wiser to take a different approach.
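
To give an idea of what the stalled step is doing: the process list in the first post shows a `sort -T<tmp dir> ... | uniq` over the block list file. The following is only a rough Python sketch of that general technique (an external merge sort with duplicate removal), not XSIBackup's actual code. The function name `sort_unique` and the `chunk_lines` parameter are made up for illustration.

```python
import heapq
import os
import tempfile
from itertools import groupby, islice


def sort_unique(src_path, tmp_dir, chunk_lines=100_000):
    """Sort the lines of src_path and drop duplicates, similar to `sort | uniq`.

    Hypothetical helper: lines are read in chunks, each chunk is sorted in
    memory and spilled to a file in tmp_dir, then all chunks are merged.
    Ordering is plain string comparison, not locale-aware.
    """
    chunk_paths = []
    with open(src_path, "r") as src:
        while True:
            # Keep the trailing newline on every line so that the per-chunk
            # sort and the later merge compare exactly the same strings.
            chunk = [
                line if line.endswith("\n") else line + "\n"
                for line in islice(src, chunk_lines)
            ]
            if not chunk:
                break
            chunk.sort()
            fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".chunk")
            with os.fdopen(fd, "w") as out:
                out.writelines(chunk)
            chunk_paths.append(path)

    out_path = src_path + ".tmp"
    files = [open(p, "r") for p in chunk_paths]
    try:
        with open(out_path, "w") as out:
            # heapq.merge streams the sorted chunks; groupby collapses
            # runs of identical lines, which is what uniq does.
            for line, _ in groupby(heapq.merge(*files)):
                out.write(line)
    finally:
        for f in files:
            f.close()
        for p in chunk_paths:
            os.remove(p)
    # Replace the original file with the sorted, de-duplicated version.
    os.replace(out_path, src_path)
```

The smaller the in-memory chunk, the more data goes through the temporary directory, so a machine with little RAM ends up doing most of the work as disk I/O.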


#3 2020-07-24 13:49:18

sphen
Member
Registered: 2020-06-13
Posts: 45

Re: prune stuck

it is honestly rather underpowered. 1 GB ram and 2 cpu. plenty of disk. the prune over NFS/remote is working however.

the repo is in the neighborhood of 12 TB with 10 MB block sizes. nightly backups seem to have quite a large delta even though not much data is being changed. not sure what the best alternative would be. i know we could do separate repos and rotate those but i would need much more space than i have right now.
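
For a sense of scale, the number of blocks implied by those two figures can be estimated with a few lines of Python. This is only an order-of-magnitude sketch: it ignores compression, and whether "TB" and "MB" are binary or decimal units here is an assumption, so both are computed. The helper name `estimate_block_count` is made up for illustration.

```python
def estimate_block_count(repo_bytes, block_bytes):
    """Return how many blocks of block_bytes are needed to hold repo_bytes (rounded up)."""
    return -(-repo_bytes // block_bytes)


# Binary units (TiB / MiB), assumed.
blocks_binary = estimate_block_count(12 * 2**40, 10 * 2**20)

# Decimal units (TB / MB), assumed.
blocks_decimal = estimate_block_count(12 * 10**12, 10 * 10**6)

print(blocks_binary, blocks_decimal)
```

Either way the repo holds on the order of a million blocks.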


#4 2020-07-25 11:42:52

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: prune stuck

You are trying to push a camel through the eye of a needle. You are running out of RAM and CPU time and the pruning process is just sitting there idle waiting for resources on a clogged server.
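
The figures in the first post make this visible. A small Python sketch of the arithmetic, using only numbers copied from the `top` output and the prune log; treating the "required" tmp size as a rough proxy for the amount of data being sorted is an assumption.

```python
# Figures copied from the `top` output and the prune log in the first post.
mem_used_kib = 840032
mem_free_kib = 54988
sort_vsz_mib = 808             # VSZ of the sort process, as rounded by top
required_tmp_bytes = 857074968  # "required" size from the prune log
io_wait_pct = 78.3              # "io" column of the CPU line

# Total RAM as seen by top (used + free).
total_kib = mem_used_kib + mem_free_kib
total_mib = total_kib / 1024

# Share of total RAM that the sort process has mapped.
vsz_share_pct = sort_vsz_mib * 1024 / total_kib * 100

# Assumption: the required tmp space approximates the data to be sorted.
required_mib = required_tmp_bytes / 2**20

print(f"total RAM:      {total_mib:.0f} MiB")
print(f"sort VSZ share: {vsz_share_pct:.1f} %")
print(f"data to sort:   {required_mib:.0f} MiB")
print(f"CPU in io wait: {io_wait_pct} %")
```

The sort process alone maps over 90 % of the RAM, the data to be ordered is nearly as large as the total memory, and the CPU spends most of its time waiting on I/O.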

This post explains in detail how pruning works and how to calculate resources:
https://33hops.com/xsibackup-datacenter … ckups.html

You clearly need a more powerful server to be able to prune big repos, especially in terms of RAM. Or just do as you suggest and keep two repos that you rotate, that will allow you to use that backup server.

Make the camel smaller or the needle eye bigger, that's the only way to go.


#5 2020-08-12 11:33:57

admin
Administrator
Registered: 2017-04-21
Posts: 1,370

Re: prune stuck

In your case you should update to 1.4.0.0 ASAP. There's a bug in block sizes over 1M. Your version 1.3.0.1 is not affected, but it is not compatible with ESXi 7.
Contact support to get your 1.4.0.0 copy.

