Backing up the snapshots

wimg · 2021-10-03 16:02:22

We're looking at different ways of making off-site backups of the snapshots that XSIBackup-DC produces.
In the past we tried using rdiff-backup to make sure we didn't have to copy all the data every time, but this started becoming too slow.
Recently we switched to Bacula with daily incremental backups. Sadly it seems the large number of files that XSIBackup-DC produces is what causes the most I/O wait, meaning the backup speed (from a Synology with a 10Gbps network adapter on a 10Gbps network with MTU set to 9000) is about 85Mbit/sec.
Does anyone have suggestions on the best way to make off-site backups of the snapshots?

wimg · 2021-10-03 21:25:37

To clarify a bit more: the initial full backup runs at 85Mbit/sec, the incrementals at 10-20Mbit/sec. As far as I can tell, most of the time is spent scanning the millions of files on the file system :-/

admin · 2021-10-04 18:57:32

Why don't you use Rsync with a size + timestamp check, instead of a full checksum comparison?

(*) To other readers: please note that when [b]wimg[/b] says snapshot, he is really referring to the repositories created by (c)XSIBackup. We thought it was worth clarifying.

wimg · 2021-10-11 21:25:30

It seems even doing that won't cut it. The brand new Synology just isn't fast enough to scan through all the directories that result from the XSIBackup-DC run.
Is there any way to reduce the number of files, even if that increases the amount of data per incremental backup? Since most of our environments have little daily change, we'd be OK with that.

admin · 2021-10-12 10:18:45

You could use a bigger block size (10, 20 or 50MB). Still, CBT is not compatible with block sizes other than 1MB, and increasing the block size to work around a limitation of the backup device is not the best solution.

The most suitable cheap backup device for (c)XSIBackup is an old PC with a big HD and an NVMe/SSD as write cache, running CentOS 7 or any other Linux distro with FUSE3 and a fast file system: ext4/XFS.

File systems also play a crucial role. VMFS6 should never be used to store deduplicated backups; your best bet is XFS or ext4. BTRFS is not the best match either. On the other hand, using a user space FUSE-derived FS such as ZFS, s3fs, etc. to store deduplicated backups is just not feasible.

In any case, we use Synology and it works well for us. If you just add data to a repository without ever rotating it, you are going to hit the limits of its CPU and RAM. Another option is to find out how much data your Synology device can handle and create a new repository when you reach that limit.

wimg · 2021-10-12 15:10:50

We have a DS1621+ with 6 Western Digital WD Red Plus NAS disks in RAID 10 with an ext4 volume. Our impression is that it has trouble searching through the directory structure quickly: a "find" in a single XSIBackup-DC VM directory (we have 1 per VM) takes 80 minutes.

admin · 2021-10-13 15:05:59

You need to put things in context.
A repository hosting 1TB is not the same as one hosting 50TB.
Looking for a file in a FS tree hosting 50 million files is not the same as splitting the workload.

We still don't know the size of your repository.

When (c)XSIBackup backs data up or restores it, it already knows where each block in the tree goes, so seek times are negligible. When it has to look a block up, it uses a very fast indexing system, so the seek time is still very low. Nonetheless, if you generate very big repos and then want to sync them elsewhere at the file system level, using Rsync for instance, Rsync is going to traverse the whole FS to account for the files to transfer.

Even if you just compare file size + timestamp, you are going to use a large amount of RAM and CPU time just to hold that data for, let's say, 50M files.

[url]https://serverfault.com/questions/365103/how-to-speed-up-rsync-for-small-files/365124[/url]
[url]https://serverfault.com/questions/914214/rsync-huge-dataset-of-small-files-5tb-m-small-files[/url]
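
To put rough numbers on that, here is a back-of-the-envelope sketch. The figure of about 100 bytes of memory per file-list entry is an assumption (a commonly quoted ballpark for Rsync, not something measured in this thread); the real cost depends on the Rsync version, path lengths and options, so treat the output as an order of magnitude only.

```python
# Rough estimate of the memory Rsync needs just to hold its file list.
# ASSUMPTION: ~100 bytes per file entry (ballpark only, not a measured value).
BYTES_PER_ENTRY = 100


def file_list_bytes(n_files, bytes_per_entry=BYTES_PER_ENTRY):
    """Approximate bytes needed to keep n_files entries in memory."""
    return n_files * bytes_per_entry


if __name__ == "__main__":
    n_files = 50_000_000                      # the 50M files used as an example above
    whole = file_list_bytes(n_files)          # everything in a single pass
    part = file_list_bytes(n_files // 16)     # same set handled in 16 equal passes
    print(f"single pass  : {whole / 1e9:.2f} GB")
    print(f"1/16 of tree : {part / 1e6:.1f} MB")
```

Under that assumption the file list alone costs a few GB for the whole tree, which is why smaller sets help on a NAS with limited resources.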

There are different ways to work around that:

- Split the load into smaller sets. (c)XSIBackup generates a tree from the first 5 hexadecimal characters of each block's file name, which gives more than a million possible subdirs, e.g. /data/0/d/b/a/f/0dbaf3eecbe81de93ddd55bc46f8263bde615613.
You can easily script something that makes Rsync iterate over the first 16 subdirs, or over the first 16x16 if you take the second level into consideration. That way you synchronize the whole set in multiple passes and minimize the RAM and CPU needed for each one.

- Sync the data via a block device, so that the sync is agnostic of the number of files involved. To do that you would need to sync the volume where you store your data. There are different methods to achieve that: a distributed FS such as Ceph or GlusterFS, or maybe even a much simpler approach, such as LVM RAID or mdadm.

You can't use these methods to create a distributed cluster, as nothing is managing concurrency and you would end up with a corrupt FS. Still, you should be able to use this approach if the remote half of the RAID1 mirror is mounted read-only. You would also need to find a way to make the RAID1 syncs asynchronous to avoid lag when using it over a network.

[url]https://blog.programster.org/create-raid-with-lvm[/url]
[url]https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-16-04[/url]

Native file systems such as ext4 and XFS are among the fastest database-like systems that you can find. Your issue is not with the FS speed but with the need to store its info in RAM and operate on the file tree. Any of the above solutions will pay off with a minimum investment in terms of technology and time.
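
The first workaround above (splitting the load by prefix) could be scripted along these lines. This is only a sketch under stated assumptions: the function names, the repository path and the destination are hypothetical, the Rsync flags are illustrative, and it assumes that everything worth splitting lives under the repository's data/ directory (any other files in the repository would need their own pass). The script only builds the command lines; it does not run them.

```python
from itertools import product

HEX = "0123456789abcdef"


def block_path(repo, name, depth=5):
    """Path of a block file: one directory level per leading hex character."""
    return "/".join([repo.rstrip("/"), "data", *name[:depth], name])


def prefix_dirs(levels=1):
    """Yield 'data/0' ... 'data/f' (levels=1) or 'data/0/0' ... 'data/f/f' (levels=2)."""
    for combo in product(HEX, repeat=levels):
        yield "data/" + "/".join(combo)


def rsync_commands(src_repo, dest, levels=1):
    """Build one rsync argv per subtree so each pass only scans part of the tree."""
    src_repo = src_repo.rstrip("/")
    commands = []
    for sub in prefix_dirs(levels):
        # With --relative, the '/./' marker tells rsync to recreate only the
        # part of the path after it (data/<prefix>/...) below the destination.
        commands.append(
            ["rsync", "-rlpD", "--size-only", "--relative",
             f"{src_repo}/./{sub}/", dest]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical source repository and destination, for illustration only.
    for argv in rsync_commands("/volume1/repo01", "root@a.b.c.d:/home/backup/repo01/"):
        print(" ".join(argv))
    print(16 ** 5, "leaf directories at 5 levels")
```

With levels=1 this yields 16 passes and with levels=2 it yields 256, so each Rsync run only has to hold a fraction of the file list. Each command can then be executed one after another, for example with subprocess.run(argv, check=True).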

wimg · 2021-10-16 13:57:33

Each repository is for a single VM. The largest VM is currently 150GByte, most are around 60GByte.

We're not rsyncing at this point, we're using Bacula. When taking the backup, the IO wait on the Synology goes to 20-30%.

Probably I just don't understand how file systems work, but I was wondering if there's a straightforward way to store all file attributes in the directory cache permanently, so that it doesn't have to retrieve that information from disk. The machine we're using has 64GByte of memory of which over 40GByte is available.

admin · 2021-10-17 14:37:25

There isn't any magical solution, or is there one?...

If you use Rsync (or any other software that works at FS level), it's going to take some time to perform the first sync and also a lot of time to perform the subsequent resyncs.

Using Rsync over SSH adds overhead, thus using the plain Rsync protocol along with the [b]--size-only[/b] argument would be the fastest option at a file system level.
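
As an illustration of the two transports, the sketch below builds the command line for each case. The helper name, the host a.b.c.d and the module name "backup" are placeholders, and the flags are illustrative; the daemon form assumes an Rsync daemon is already running on the remote end with a module configured for the target directory.

```python
def rsync_argv(src, host, target, transport="daemon"):
    """Return an rsync command line for the given transport (nothing is executed)."""
    base = ["rsync", "-rlpD", "--size-only"]
    if transport == "daemon":
        # rsync://HOST/MODULE/PATH talks to an rsync daemon directly, without SSH.
        return base + [src, f"rsync://{host}/{target}"]
    if transport == "ssh":
        # HOST:PATH with --rsh=ssh tunnels the same transfer through SSH.
        return base + ["--rsh=ssh", src, f"{host}:{target}"]
    raise ValueError(f"unknown transport: {transport}")


if __name__ == "__main__":
    # "backup" is a hypothetical daemon module; a.b.c.d is a placeholder host.
    print(" ".join(rsync_argv("/volume1/repo01/", "a.b.c.d", "backup/repo01/")))
    print(" ".join(rsync_argv("/volume1/repo01/", "root@a.b.c.d",
                              "/home/backup/repo01/", transport="ssh")))
```

Keep in mind that the plain Rsync protocol is not encrypted, so it only makes sense over a trusted network or a VPN.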

Still, as already commented, this would be the wrong approach if you want to minimize sync time. Just place the repository in a volume and then sync the volume at block level, that's the simplest and fastest way to do it.

Of course you can apply the concept using any available technology that manages volumes at block level. Still, (c)XSIBackup will come in handy here, as a -flat.vmdk file is a volume that you can indeed sync at block level, so the solution to your problem is very straightforward:

Store your (c)XSIBackup repositories in a -flat.vmdk file and then use --replica to replicate that disk anywhere you want. If you use CBT you will achieve in seconds the task that would otherwise take days.

wimg · 2021-10-17 15:30:49

Thanks for your help. Could you explain that last part? How do you store XSIBackup repositories in a -flat.vmdk file?

admin · 2021-10-17 16:39:20

1/ Create a Linux VM with a size big enough to fit your repo data.
2/ Backup to that VM over IP (the simplest and fastest method) or mount it via NFS3
3/ Use (c)XSIBackup to backup your data to a repo in the previous VM.
4/ Use (c)XSIBackup to replicate the VM containing the repos. Optionally use CBT to minimize backup times.

VMDK virtual disks are very convenient containers: they can be moved around, replicated with (c)XSIBackup, and you can [b][url=https://33hops.com/mounting-vmdk-disks-in-linux-to-access-individual-files.html#mountvmdks]mount vmdk files in any Linux distro[/url][/b].

admin · 2021-10-19 12:21:52

Although you should use the method explained above to replicate big repositories and avoid the intrinsic file system overhead, we have noticed some things that don't add up in your case, which may in turn point to some resource in your system not being well configured or optimized.

We have carried out some tests in our lab with extremely outdated hardware. This sync was done from a Synology NAS box (DS712+), which is around 10 years old and uses Seagate SATA disks, to an XSIBackup-NAS appliance with a single core of an i5-4460 CPU @ 3.20GHz on a test server, using Rsync over SSH (huge SSH overhead when compared to raw Rsync).

We like to use old hardware, not only because in a lab with some tens of servers we can reduce costs, but also because it forces us to optimize things to the extreme.

time rsync -rlpDv --size-only --progress --partial --whole-file --rsh="ssh" /volume2/backup4/DATACENTER-BACKUP-1M root@a.b.c.d:/home/backup/volume1/repo01

...
...

      812375 100%    1.76MB/s    0:00:00 (total: 1%) (xfer#9697, to-check=194/1079665)
DATACENTER-BACKUP-1M/data/f/f/f/4/3/
DATACENTER-BACKUP-1M/data/f/f/f/4/3/fff435aa59b1e8eb04ecc093b0b0ebec64e11f8a
      718443 100%    1.40MB/s    0:00:00 (total: 1%) (xfer#9698, to-check=179/1079665)
DATACENTER-BACKUP-1M/data/f/f/f/6/9/
DATACENTER-BACKUP-1M/data/f/f/f/6/9/fff69d4bcf6e4684e3b5e31baeaf194df19972c4
      863527 100%    1.47MB/s    0:00:00 (total: 1%) (xfer#9699, to-check=144/1079665)
DATACENTER-BACKUP-1M/data/f/f/f/a/7/
DATACENTER-BACKUP-1M/data/f/f/f/a/7/fffa78ab8160524c411da2464d5ac8feb5438ddf
      416077 100%  696.96kB/s    0:00:00 (total: 1%) (xfer#9700, to-check=90/1079665)
DATACENTER-BACKUP-1M/data/f/f/f/c/e/
DATACENTER-BACKUP-1M/data/f/f/f/c/e/fffce5f0c1297487771c129d4847626d7c02cafc
      803256 100%    1.23MB/s    0:00:00 (total: 1%) (xfer#9701, to-check=51/1079665)

sent 6432649065 bytes  received 712064 bytes  149132.70 bytes/sec
total size is 384855008254  speedup is 59.82

real    718m59.062s
user    4m57.396s
sys     6m45.952s

The total size of the replicated repository is 367.00GB. The first sync took around 36 hours, which is quite a bit for the size of the repo. As each block has to be sent over SSH, there is some overhead due to the SSH encapsulation.

As you can see, the second sync, using the --size-only argument, took just 718 minutes. In our case we generated a diff data volume of around 6.00GB.

These are figures that make it worth considering the alternative solution. Still, with newer hardware, more resources and the raw Rsync protocol, these figures should drop drastically.
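
For anyone who wants to check the arithmetic, the following sketch recomputes the average throughput and the speedup from the summary lines of the Rsync output above. The numbers are copied from that output; the function names are ours. The rate comes out marginally different from Rsync's own figure because the "real" time reported by the shell includes a little extra startup and teardown time.

```python
# Values copied from the rsync summary and the `time` output above.
SENT_BYTES = 6_432_649_065
RECEIVED_BYTES = 712_064
TOTAL_SIZE_BYTES = 384_855_008_254
ELAPSED_SECONDS = 718 * 60 + 59.062   # real 718m59.062s


def average_rate(sent, received, seconds):
    """Average bytes per second moved over the wire in both directions."""
    return (sent + received) / seconds


def speedup(total_size, sent, received):
    """Rsync's 'speedup': total data size divided by the bytes actually moved."""
    return total_size / (sent + received)


if __name__ == "__main__":
    rate = average_rate(SENT_BYTES, RECEIVED_BYTES, ELAPSED_SECONDS)
    print(f"average rate : {rate / 1000:.1f} kB/s ({rate * 8 / 1e6:.2f} Mbit/s)")
    print(f"speedup      : {speedup(TOTAL_SIZE_BYTES, SENT_BYTES, RECEIVED_BYTES):.2f}")
    print(f"data sent    : {SENT_BYTES / 2**30:.2f} GiB")
```

The resync averaged roughly 149 kB/s, a little over 1 Mbit/s, which shows that the run was dominated by walking the tree rather than by moving data.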

We will repeat the test with the same hardware, the raw Rsync protocol and publish the results.

wimg · 2021-10-19 12:41:04

Thanks for your very thorough responses. I believe we've found a workable solution based on your previous suggestion:
- We do a replication using --replica
- We use rdiff-backup over SSH to make an rsync-like copy of this replicated VM (rdiff-backup uses the same delta algorithm as rsync, through librsync). The advantage is that we can then restore previous versions of the replicated VM
- This seems to work efficiently, with speeds hitting up to 250Mbit/sec over a Wireguard VPN connection from Belgium to Germany.

Does the above make sense to you as a backup strategy?

Last edited by wimg (2021-10-19 12:41:32)

admin · 2021-10-19 14:52:21

Sure. Still, we believe (c)XSIBackup is a better option for the following reasons:

1/ It should be faster than rdiff-backup, as it can make use of CBT, which is an instant differential feature: the changed blocks are known in advance.

2/ It handles the VM management layer. With rdiff-backup you need to make sure the VM is off, while (c)XSIBackup will handle that for you.

wimg · 2021-10-19 16:20:02

When you do a --replica of the VM, do you need to turn it off first? I thought it took a snapshot, then made a backup of that snapshot? Since we're doing an rdiff-backup of the replicated VM, would that not be safe?

admin · 2021-10-19 17:54:33

Of course you don't need to turn it off first. What would be the point of (c)XSIBackup then?

We thought that you were using (c)XSIBackup to back up your VMs. We are a bit puzzled by your question. (c)XSIBackup does not make a backup of the snapshot, it makes a backup of the whole VM disks; snapshots are not backups.

Both of the main actions (c)XSIBackup performs (--replica and --backup) generate a full copy of the VM unless otherwise specified by excluding some disk. Any algorithm aimed at minimizing the transfer of data does just that: it minimizes the transferred data or the time it takes to transfer it. All restore points in a repository contain all the data in your disks at a given point in time.

Thus, using (c)XSIBackup to replicate a -flat.vmdk disk that in turn contains a repository will replicate the disk as it is, fully, bit by bit.

If you still have doubts about how things work, please do not hesitate to ask. Still, don't set a procedure up assuming that something works in a given way without being 100% sure.

wimg · 2021-10-19 18:02:49

Sorry, I'm mixing up a few things. I'll try to be clearer:
- If we use --replica to make a complete replica of an active VM at location A and then copy that replica to location B, we can later restore it by placing it back on a VMware host and registering the vmx file, correct?
- If we then take a new --replica to location A and do a differential copy of that replica to location B, we have both the original version and the second one available for restoring.

At least that's what we're trying to accomplish...

admin · 2021-10-20 07:52:47

Sure. Any software that can do the replication is OK. The only thing we pointed out is that (c)XSIBackup is more convenient than rdiff-backup for the reasons already explained.

Forum ©XSIBackup: ©VMWare ©ESXi Backup Software
