If you only have room for one backup, then what you need is more room.
You can try to overcome it in any possible way, but facing the real issue is something that you can't escape.
We will study this as a feature for the next main branch.
I made a couple of changes to the script above.
The main changes are:
- Use -name '*-flat.vmdk' when looking for vmdk files containing hashes.
On the XSIBackup website it is stated that only these vmdk files are deduplicated.
- Use sort -u, instead of sort + awk. I don't think awk is needed because we sort the input anyway.
- Use: find with -regex looking for blocks, instead of ls+grep,
and be stricter on the filename (don't use word boundaries).
- Use `basename $0` instead of $0.
Note: make sure you create the directory "var/logs/$name" or adapt it to your taste...
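Before the first run, the log directory the script writes to has to exist. A minimal sketch (assuming you run it from the directory containing var/logs, and that $0 resolves to the script's filename as in the script below):

```shell
# Derive the per-script log directory name from the script's filename,
# exactly as the script below does, then create it so the final tee works.
name=$(basename "$0")
mkdir -p "var/logs/$name"
```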
#!/bin/sh
# Check for inconsistencies in xsitools-repository
# Find and delete unused files
# Prune old backups
name=`basename $0`
{
echo "Begin: `date`"
# usage
if [ -e "$1/.xsitools" ]
then
echo "$1 seems to be an xsitools-Repository, using it."
else
echo "$1 doesn't seem to be an xsitools-Repository."
echo "Use \"$name [xsitools-repo-directory]\""
exit 1
fi
if [ "$2" != "--delete" ]
then
echo "Use \"$name [xsitools-repo-directory] [--delete]\" to remove unused files (be careful)."
else
echo "\"--delete\" is set, will remove unused files."
fi
if echo "$3" | egrep -q '^[0-9]+$';
then
echo "Searching for backup-folders older than $3 days."
bkpfolders=`find "$1" -maxdepth 1 -type d -regex ".*/[0-9\-]\{14\}" -mtime +$3`
if [ ! -z "$bkpfolders" ]
then
echo "Found the following backup-folders, deleting:"
echo "$bkpfolders"
rm -rf $bkpfolders
else
echo "No backup-folders found."
fi
else
echo "3rd option can be a number: Delete backup-folders older than ... days."
echo "You can use this to prune older backups (be careful)."
fi
# Temporary files and variables
temp_dir=`mktemp -d -t`
hashes="$temp_dir/hashes"
hashes_sorted="$temp_dir/hashes_sorted"
files="$temp_dir/files"
files_sorted="$temp_dir/files_sorted"
delete_candidates="$temp_dir/delete_candidates"
missing_files="$temp_dir/missing_files"
diff_output="$temp_dir/diff_output"
hashes_count=0
files_count=0
echo "Collecting hashes of all .vmdk files."
# my old version to exclude delta files:
# find $1/ -path data -prune -o -name *.vmdk -maxdepth 3 | grep -v '\delta.vmdk$' | grep -v '\sesparse.vmdk$' | while read line; do cat "$line" ; done | grep -o '\b[0-9a-f]\{40\}\+\b' > $hashes
# while-loop inserted for handling filenames with spaces, exclude delta files (snapshots), faster search (thanks to wowbagger)
# find $1/ -path $1/data -prune -o -name *.vmdk | grep -v '\delta.vmdk$' | grep -v '\sesparse.vmdk$' | grep -v $1/data | while read LINE; do cat "$LINE" ; done | grep -o '^\b[0-9a-f]\{40\}\+\b' > $hashes
find "$1/" -path "$1/data" -prune -o -name '*-flat.vmdk' -exec cat {} \; > $hashes
echo "Sorting hashes and removing duplicates."
sort -u $hashes > $hashes_sorted
hashes_count=`cat $hashes_sorted | wc -l`
echo "Hashes in vmdks: $hashes_count"
echo "Generating list of files in ./data."
# find $1/data -type f -exec basename {} \; > $files
# ls -1R $1/data | grep -o '\b[0-9a-f]\{40\}\+\b' > $files
find "$1/data" -type f -regex '.*/[0-9a-f]\{40\}\+$' -exec basename {} \; > $files
echo "Sorting list of files."
sort $files > $files_sorted
files_count=`cat $files_sorted | wc -l`
echo "Files: $files_count"
# some checks if everything is valid
echo "Using diff for comparing .vmdk-hashes with filenames in ./data."
diff $hashes_sorted $files_sorted -U 0 > $diff_output
if [ $? -eq 0 ];
then
echo "No unused files found. Every hash in the .vmdk files"
echo "has a proper file in data-directory. Good."
echo "Removing temporary files."
rm -rf "$temp_dir"
echo "End: `date`"
exit 0
else
echo "Checking if hashes in .vmdk files have a file in the data-directory."
grep "^-[a-f0-9]" $diff_output | sed 's/^.//' > $missing_files
if [ `cat $missing_files | wc -l` -eq 0 ];
then
echo "Every hash contained in the .vmdk files has a proper file in data-directory. Good."
grep "^+[a-f0-9]" $diff_output | sed 's/^.//' > $delete_candidates
unused_count=`cat $delete_candidates | wc -l`
echo "There are $unused_count unused files in ./data:"
if [ "$2" != "--delete" ];
then
cat $delete_candidates
fi
else
echo "The following `cat $missing_files | wc -l` data files are missing:"
cat $missing_files
echo "Repository is damaged. Leaving everything untouched. Exiting."
echo "Removing temporary files."
rm -rf "$temp_dir"
echo "End: `date`"
exit 1
fi
fi
if [ "$2" = "--delete" ]
then
echo "Counting space used of $1/data."
echo "Repo-size before pruning: `du $1/data/ -h -s | awk '{print $1;}'`"
cat $delete_candidates | while read file
do
rmpath="$1/data/`echo $file | cut -c1`/`echo $file | cut -c2`/`echo $file | cut -c3`/$file"
echo "Deleting $rmpath"
rm -rf "$rmpath"
done;
echo "Counting space used of $1/data."
echo "Repo-size after pruning: `du $1/data/ -h -s | awk '{print $1;}'`"
echo "Removing empty directories."
# Busybox find doesn't know -empty.
find "$1/data" -type d -depth -exec rmdir -p --ignore-fail-on-non-empty {} \;
echo "Counting files in data-directory again."
# no sort needed here
# find $1/data -type f -exec basename {} \; > $files
# ls -1R $1/data | grep -o '\b[0-9a-f]\{40\}\+\b' > $files
find "$1/data" -type f -regex '.*/[0-9a-f]\{40\}\+$' -exec basename {} \; > $files
files_count=`cat $files | wc -l`
if [ "$files_count" -eq "$hashes_count" ]
then
echo "Number of files and hashes ($files_count) are same, everything went right."
else
echo "Number of files ($files_count) and hashes ($hashes_count) are different."
echo "Perhaps not every file could be deleted. Check it using the logfile."
echo "End: `date`"
echo "Removing temporary files."
rm -rf "$temp_dir"
exit 1
fi
echo "Updating Bcnt in .xsitools-file:"
bcnt=`grep Bcnt "$1/.xsitools" | awk -F ': ' '{print $2}'`
echo "Old value of Bcnt: $bcnt."
echo "Setting actual number of files ($files_count) as new value of Bcnt."
sed -i -e "s/Bcnt: $bcnt/Bcnt: $files_count/" "$1/.xsitools"
fi
echo "Removing temporary files."
rm -rf "$temp_dir"
echo "End: `date`"
} 2>&1 | tee -a var/logs/$name/$name-`date +"%d"`.log
exit 0
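For reference, the rmpath line in the script implies the on-disk layout of the deduplicated blocks: each block file appears to live under data/<1st char>/<2nd char>/<3rd char>/<hash>. A minimal sketch of that mapping (the hash value is just an illustration, not a real block):

```shell
# Build the block path for a given 40-char hash, mirroring the
# data/<c1>/<c2>/<c3>/<hash> layout inferred from the script's cut calls.
hash="0123456789abcdef0123456789abcdef01234567"
c1=$(echo "$hash" | cut -c1)
c2=$(echo "$hash" | cut -c2)
c3=$(echo "$hash" | cut -c3)
rmpath="data/$c1/$c2/$c3/$hash"
echo "$rmpath"
```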
[quote=admin]If you only have room for one backup, then what you need is more room.
You can try to overcome it in any possible way, but facing the real issue is something that you can't escape.
[/quote]
Do you mean that there is a high enough risk that this backup isn't sane?
Does this have to do with possible hash collisions?
No, that means that you need a bigger storage device.
To translate from probabilities into a "real life" joke: the probability that you hit a hash collision is about the same as that of a meteor ridden by a chubby Santa landing in your toilet in the next 30 minutes. It's not zero, but pretty close to it.
We will be sending [b]XSIBACKUP-PRO 11.2.0[/b] to registered users from [b]Dec 27th[/b]. It includes [b](c)XSITools[/b] repository pruning, both on demand via the [b]--prune-xsitoolsrepo[/b] argument and automatically by means of the [b]--backup-room[/b] argument, which limits the size to which an [b](c)XSITools[/b] repository can grow.
The next [b]XSIBACKUP-PRO[/b] version will incorporate a pruning mechanism that takes the [b][url=https://33hops.com/xsibackup-help-man-page.html#backuproom]--backup-room[/url][/b] argument into account, so you will be able to rotate backups while staying inside the boundaries of the amount passed with this argument.
It is worth noting that pruning is performed after the current VM backup has completed, so you need a maneuver margin of at least the size of the VM being backed up.
Future versions will include an [b][url=https://33hops.com/xsitools-vmfs-deduplication.html](c)XSITools[/url][/b] backup rotation mechanism using the [b][url=https://33hops.com/xsibackup-help-man-page.html#deldirs]--del-dirs[/url][/b] argument.
Due to the nature of SSDs, it is mandatory to keep at least 10% of the disk free at all times, otherwise they will reach their wear-out limit much sooner. This is due to the physical limits of an SSD cell, which can only be overwritten a limited number of times.
[url]https://blog.westerndigital.com/ssd-endurance-speeds-feeds-needs/[/url]
[url]https://www.cnet.com/how-to/how-ssds-solid-state-drives-work-increase-lifespan/[/url]
Could you explain whether [b]--backup-room[/b] will be able to make room within one repository, so that the repository doesn't grow bigger than that? If I understood right, [b]--backup-room[/b] used to delete older repositories when you had more than one XSITools repository. In other words, do I still need more than one repository in order to make use of this feature?
You are too limited.
Download the latest version, 11.2.2, which allows you to prune (c)XSITools repositories by using the --backup-room argument. It allows you to pass up to 2048 GB as the XSITools repo limit; the next version removes that restriction.
Just to be sure I understand how the new version works...
So if I run this xsitools job repeatedly:
"/vmfs/volumes/datastore1/xsi-dir/xsibackup" --backup-prog=xsitools:z --backup-point=/vmfs/volumes/Backup/xsit-repo --backup-type=Custom --backup-vms="..." --backup-room=2000 --mail-to=... --use-smtp=1 --backup-how=Hot --backup-id=01 --description="..." --exec=yes >> "/vmfs/volumes/datastore1/xsi-dir/var/logs/xsibackup.log"
it will make room as needed within the --backup-point folder, by deleting the oldest folders (named like 20190102002335, as determined by the mask) and by pruning the repository afterwards.
Do I understand this right?
It will prune the [b][url=https://33hops.com/xsitools-vmfs-deduplication.html](c)XSITools[/url][/b] repository to make sure it stays within the 2 TB boundary. You don't have to take this literally down to the MB, as there are maneuver margins and the repo is pruned only once the new backup set has been fit into the repo; just set [b][url=https://33hops.com/xsibackup-help-man-page.html#backuproom]--backup-room[/url][/b] a bit below your desired size to fit your available space.
Please be aware that [b][url=https://33hops.com/xsibackup-help-man-page.html#backuproom]--backup-room[/url][/b] affects not only the size of the (c)XSITools repo, but also the accumulated size of the folders with an XSITools mask (e.g. 20190107102345). It may therefore also prune other types of backups if the accumulated size of those masked folders reaches the configured value, so it's better to keep your XSITools repositories in a different root folder.
You may as well run out of space on your disk while the size of the repo is still below the limit; in that case you would get out-of-space errors from the file system. It's easy to prevent all of these situations by keeping backup contents well organized.