
© XSIBackup-Pro: two step deduplication

How to deploy a solution that will host (c)XSITools repositories per month and archive older data to a Borg server over IP

For our case study we are going to use CentOS 7.5, an open source rebuild of the well-known RHEL (Red Hat Enterprise Linux), so RHEL can be used instead of CentOS. You can deploy this solution on virtual machines or directly on hardware. The latter approach will offer the best results, as all resources will be dedicated to a single purpose.

CentOS 7.5 installation

This OS can be downloaded for free from the CentOS website. We recommend that you download and use the minimal install ISO file. We will use just NFS; you can optionally install Samba to access your files from Windows computers. In our case we will use IPv4 and deactivate IPv6, which is rarely needed inside a LAN. We won't be using any GUI, so you need some basic Linux command line skills before you do this on your own.

We will need two of these CentOS 7.5 servers. One of them will act as the NFS server connected to our ESXi boxes as a datastore. This datastore will host our (c)XSITools repositories grouped into monthly repository folders. This server will also have Borg installed, as it will act as a Borg client, sending data to the Borg server, which will be our second box. You only need to install the OS once; then you can save it as an .ova template, copy it to another ESXi host using XSIBackup, duplicate the VM folder via the NFS server command line, or use whatever method you like best.

We won't need NFS on the Borg server, but for the sake of simplicity we'll just clone the primary server and then deactivate NFS on the second box.

Installing CentOS 7 is fairly easy: just follow the on-screen instructions given by the installer. The trickiest part is probably the network setup. Remember to activate the NIC (circle #1 in figure 1) and to configure static IPs (circle #2). Pay attention to the General tab inside the configuration section, as it contains some options about NIC activation too.

[Figure 1: Two step deduplication]

Once you have completed the installation, you can connect via SSH to the configured IP address. In our example, the first CentOS 7.5 server has IP 10.0.0.20.

Before doing anything else, we are going to disable the firewall. This will prevent issues with services being blocked. Firewall management is a topic of its own: you can re-enable and configure the firewall once you have completed the setup, or just leave it disabled. If you are in a LAN subnet behind perimeter firewalls and NAT, you might not need it, but that depends on many variables, so I'll leave that decision up to you.

To disable the firewall, just issue these two commands, which we have chained on a single line for your convenience.
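A minimal sketch of that step, assuming the stock CentOS 7 firewall service (firewalld) managed through systemctl. The function name disable_firewall is ours and only exists so that nothing runs until you call it as root.

```shell
# Assumption: CentOS 7 uses firewalld as its default firewall service.
# disable_firewall is an illustrative helper name, not part of any tool.
disable_firewall() {
    # Stop the running service, then keep it from starting at boot.
    systemctl stop firewalld && systemctl disable firewalld
}

# Run as root on the CentOS server:
# disable_firewall
```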



Now you have to install the NFS service. Just issue the following yum command, which will find and install the NFS server from the default CentOS repositories.
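As a sketch, and assuming that on CentOS 7 the NFS server is shipped in the nfs-utils package, the installation could look like this. install_nfs is a hypothetical helper name.

```shell
# Assumption: the nfs-utils package provides the NFS server on CentOS 7.
install_nfs() {
    # -y answers "yes" to the confirmation prompt.
    yum install -y nfs-utils
}

# Run as root:
# install_nfs
```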



Now enable the recently installed NFS server
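A possible way to do it, assuming the systemd unit installed by the NFS package is named nfs-server, as it is on a default CentOS 7 system. enable_nfs is a hypothetical helper name.

```shell
# Assumption: the NFS systemd unit is called nfs-server.
enable_nfs() {
    systemctl enable nfs-server      # start the service at every boot
    systemctl start nfs-server       # start it right now
    systemctl is-active nfs-server   # prints "active" when it is running
}

# Run as root:
# enable_nfs
```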



We have a running NFS server, but we have not yet decided which folders to share through it. Linux servers generally leave the /home directory mounted on the largest partition, so we will use /home/nfs-share. You can also create a user for the NFS service and place the share under that user's home, like /home/nfs-user/nfs-share. It's up to you.

Now we have to configure the NFS server by editing the /etc/exports file. You can use vi, nano or whatever text editor you like best to add the export line to it. We will use the command line redirection operators instead.
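The following is a sketch of that configuration, not necessarily the exact line used in our tests. The path /home/nfs-share comes from the text, while the 10.0.0.0/24 range is an assumption derived from the example server IP (10.0.0.20). publish_share is a hypothetical helper name.

```shell
# Assumptions: share path from the text, client range guessed from the
# example IP 10.0.0.20. Adjust both to your own network.
SHARE_DIR="/home/nfs-share"
ALLOWED_NET="10.0.0.0/24"
EXPORT_LINE="${SHARE_DIR} ${ALLOWED_NET}(rw,sync,no_root_squash)"

publish_share() {
    mkdir -p "$SHARE_DIR"                 # create the shared folder
    echo "$EXPORT_LINE" >> /etc/exports   # append the export definition
    exportfs -a                           # publish everything in /etc/exports
}

# Show the line that would be appended.
echo "$EXPORT_LINE"

# Run as root:
# publish_share
```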



The above string contains the shared resource path, the IP or range of IPs to which we grant access and, between parentheses, the NFS options: rw (read write), sync (synchronous data transfers), no_root_squash (allow root access). Running exportfs -a after adding the configuration string to the NFS configuration file /etc/exports ensures that the resource is made visible to NFS clients like ESXi.

Now we just need to connect our ESXi server/s to the just created NFS server. This is very simple, see Figure 3, just go to "Add storage" in your ESXi server GUI and choose Network File System.

[Figure 3: Mounting an NFS share]

Should you encounter any problem setting up your NFS server, start by reading this post on the topic: Setting up an NFS Server on CentOS 7

Provided that you use the same CentOS version, there should not be any problem. Don't try to do everything at once, or you'll get stuck for sure. Leave firewall configuration, creation of specific users and permission assignments for when you have everything working.

We already have our NFS server running and it is connected to our ESXi server as a datastore named NFS-Borg-Client, which should be accessible through the ESXi command line as /vmfs/volumes/NFS-Borg-Client. Just issue this command to verify that the newly created datastore is up and running.

Run: df -h



As you can see the new datastore is there waiting for us to write some data on it.

If you remember our previous posts about (c)XSITools, you probably recall that we encourage users not to just create an (c)XSITools repository and keep adding data without limit. VMware virtual machines can become big, and just one of our jobs may copy many hundreds of gigabytes to the repository, if not terabytes. As days pass, we could be adding many terabytes of data to our (c)XSITools repository. There's nothing wrong with that, but you should think about having some sort of redundancy, like combining it with other backup types, and you should also avoid letting the number of blocks get so high that searching for them starts to slow down the backup process.

We recommend renewing the repository location once a month for an average SME: 5 to 10 VMs adding up to 1 TB of data. Of course, this is just a rough reference for you to work out your own figures.

A typical XSIBackup job to back up all your running VMs to your newly created NFS share would be something like this:



Please note how the backup point (--backup-point=/vmfs/volumes/NFS-Borg-Client/$(date +%Y%m'00000000')) consists of a string containing a shell expression, $(date +%Y%m'00000000'), that will generate a new path each month, e.g. 20180700000000. Note also how it is padded with zeros to match the date dir mask configured in the conf/xsiopts file. This ensures that XSITools repositories are considered for automatic deletion when the backup media fills up.

Thus, the above command will add the running VM backups to a new repository each month: 20180100000000 (Jan), 20180200000000 (Feb), 20180300000000 (Mar), etc...
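To see what the expression evaluates to, you can run it on its own. This small sketch only builds and prints the --backup-point argument; the variable names DATASTORE and BACKUP_POINT are ours.

```shell
# The datastore name and the date expression come from the text.
# %Y%m yields the year and month; the quoted zeros pad the remaining digits.
DATASTORE="/vmfs/volumes/NFS-Borg-Client"
BACKUP_POINT="${DATASTORE}/$(date +%Y%m'00000000')"

# Print the argument as it would be passed to the backup job.
echo "--backup-point=${BACKUP_POINT}"
```

The folder name only changes when the month changes, so every job run within the same month writes to the same repository.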

Now we need Borg installed on our NFS-Borg-Client server to be able to back up to an archive host. This second host, as stated, will just contain a Borg installation. We will install Borg on our first server and then clone it; on the clone we will of course disable NFS and change the server's IP and DNS name.

To install Borg just issue the command below, that's all.

We grabbed the download URL from the official Borg Backup pre-compiled binaries URL at GitHub: https://github.com/borgbackup/borg/releases. Let us know if this URL is not working for you so that we update the links in case something is moved around.
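A sketch of the installation from the pre-compiled binaries follows. The version number is only an example and borg-linux64 is the asset name used for 64-bit Linux builds of the 1.1 series; both are assumptions, so check the releases page for the file that matches your system. install_borg is a hypothetical helper name.

```shell
# Assumptions: BORG_VERSION is an example value, and the release asset for
# 64-bit Linux is named borg-linux64. Verify both on the releases page.
BORG_VERSION="1.1.6"
BORG_URL="https://github.com/borgbackup/borg/releases/download/${BORG_VERSION}/borg-linux64"

install_borg() {
    # -f fails on HTTP errors, -L follows the GitHub redirect.
    curl -fL -o /usr/local/bin/borg "$BORG_URL"
    chmod 755 /usr/local/bin/borg   # make the binary executable
    borg --version                  # confirm that it runs
}

# Show the URL that would be downloaded.
echo "$BORG_URL"

# Run as root:
# install_borg
```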



So far so good: we already have a working NFS + Borg server. We just need to clone it and change its IP and DNS name to have a Borg backend where we can do our final data archival with a small block size.

Let's recap what we have at this point of the case study: an NFS server attached to our ESXi box where we are already storing our monthly (c)XSITools backup repositories. Thanks to our dynamic folder naming, the date expression above changes with the system clock and generates a new (c)XSITools repository each month.

Now we need to create the part of the system that archives past months to the Borg server to maximize space utilization. We need a script that is executed regularly, copying data to the Borg archive and deleting the repositories of past months. To keep this script from interfering with an ongoing backup job, and to avoid ending up with no backups ready to restore on our NFS server, we could execute this task on the 15th of every month, for instance, or you could keep two months of backups and delete the month before last.

Or, even better, synchronize your NFS share every day and just leave the deletion of past months' repositories to the automatic space provisioning features of XSIBackup.

We are giving you many options on purpose, you should be able to take all of them into consideration and choose the one that best fits your needs, or even better, come up with your own method. If you feel puzzled by the many options, relax and make some tests, it's the best way to learn.

Whatever you decide will depend on your circumstances, but if you got to this point, you already have the basic system up and running.

Now we need to generate an SSH key pair on the NFS server, which is also our Borg client. By copying the public key to the authorized_keys file on the Borg server, we will allow passwordless communication with the Borg backend.
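Done by hand, the key generation and exchange could be sketched as follows. The key type, the key location and the helper names make_key and push_key are assumptions, and the remote address is the one used in the usage example of this article.

```shell
# Assumptions: a passphrase-less RSA key in ~/.ssh/id_rsa and the Borg
# server address taken from the usage example (root@192.168.3.232).
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_rsa}"
BORG_HOST="${BORG_HOST:-root@192.168.3.232}"

make_key() {
    mkdir -p "$(dirname "$KEY_FILE")"
    # Only generate a key pair if none exists yet.
    [ -f "$KEY_FILE" ] || ssh-keygen -q -t rsa -b 4096 -N "" -f "$KEY_FILE"
}

push_key() {
    # Append the public key to the remote authorized_keys file.
    # This asks for the remote password one last time.
    ssh "$BORG_HOST" 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' \
        < "${KEY_FILE}.pub"
}

# Run on the NFS server / Borg client:
# make_key && push_key
```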

To automate all that functionality, we have created a script that wraps a Borg backup and the key pair generation and exchange.

Of course you can adapt it to your particular needs or requirements. Paste the code below into a file, name it backup_to_borg and assign execute permissions to it.

USAGE:
./backup_to_borg --to-backup="/home/nfs-share" --auth-keys=/home/myuser/.ssh/authorized_keys --borg-repo=ssh://root@192.168.3.232:22/xsibak/b7
WHERE:
--to-backup: the path of the NFS share as seen from the NAS OS.

--auth-keys: optional remote path of the authorized_keys file, default is (/root/.ssh/authorized_keys).

--borg-repo: remote repository using the Borg Backup syntax.



The above script takes care of generating a key pair and exchanging it with the remote server, as well as setting the Borg environment variables. Of course you can write your own script too.
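If you decide to write your own, the skeleton below may serve as a starting point. It is not the script described above, only a minimal sketch that accepts the same three options and runs a Borg backup. Key generation and exchange are left out, and the unencrypted repository and the timestamp archive name are assumptions of this example.

```shell
#!/bin/sh
# Illustrative wrapper only. Option names follow the USAGE text; defaults,
# archive naming and the unencrypted repository are assumptions.
TO_BACKUP=""
AUTH_KEYS="/root/.ssh/authorized_keys"   # parsed, but unused in this sketch
BORG_REPO=""

parse_args() {
    for arg in "$@"; do
        case "$arg" in
            --to-backup=*) TO_BACKUP="${arg#*=}" ;;
            --auth-keys=*) AUTH_KEYS="${arg#*=}" ;;
            --borg-repo=*) BORG_REPO="${arg#*=}" ;;
            *) echo "Unknown option: $arg" >&2; return 1 ;;
        esac
    done
    # Both the source path and the repository are mandatory.
    [ -n "$TO_BACKUP" ] && [ -n "$BORG_REPO" ]
}

run_backup() {
    # Borg reads the repository location from this environment variable.
    export BORG_REPO
    # Create the repository on first use; the error is ignored if it exists.
    borg init --encryption=none 2>/dev/null
    # One archive per run, named after the current timestamp.
    borg create --stats "::$(date +%Y%m%d%H%M%S)" "$TO_BACKUP"
}

# Uncomment to use the file as a script:
# parse_args "$@" && run_backup
```

Each run adds a new archive to the same Borg repository, so only the chunks that changed since the previous run take up additional space.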

Once you have set up this system, you will have a two step deduplicated system. In the first step, backups are performed very fast, with a minimal footprint on your production ESXi host, and stored with a very good data density ratio as an (c)XSITools repository, which can be restored directly. The optional second step is a deduplicated archival repository, where data density is taken to the extreme without affecting performance on your production hosts, allowing you to store months or years of backups of all your VMs on cheap commodity HDs.

Below you have an example showing the Borg Backup statistics on the archive repository corresponding to our development tests. The real density ratio you can reach is far beyond what's shown, as we have performed a very limited set of backups in comparison to a production system running for months or years.

You should also take into account that the (c)XSITools repository is already deduplicated, so in our short example you are multiplying space utilization on an already deduplicated set. This yields compression ratios that will only grow as you add data; in other words, you can store hundreds of backups in little more than the space one backup set would use. That will greatly depend on the amount of data you generate every day, the type of data, the uniformity of your OSs, etc.



Daniel J. García Fidalgo


