Last updated on Monday 28th of February 2022 08:52:48 PM

Deploying a basic backup job with ©XSIBackup Classic

 Please note that this post refers to the old, deprecated software ©XSIBackup-Classic. Some of the facts herein may still apply to more recent versions, though.

For new installations please use the new ©XSIBackup, which is far more advanced than ©XSIBackup-Classic.

We will start this series of case studies by analyzing the most frequent type of VM backup, which we will call a "Basic Backup Job". The case consists of a number of VMs, which I will reduce to three, that need to be backed up to a local datastore so that they can be recovered in a disaster recovery situation. This is a basic setup; it would not cover us in case of a fire or a theft, but you could combine it with a simple procedure in which you use two different portable disks and always keep one out of the office, for instance.

We will be backing up these three VMs (Linux1, Windows1, Unix1) every night. Our office is a regular SME with 33 employees, or the regional branch of a bigger enterprise, that runs these three virtualized servers on a dual-CPU Xeon server equipped with three 1 TB SATA drives plus a 500 GB SSD used as a cache disk. On top of that we have a Synology or QNAP NAS server with two 4 TB disks in a mirrored RAID. This is a very common hardware configuration in SMEs and regional branches around the globe.

Let's make things clear

I'm not the type of guy who writes promotional papers, I don't think you have seen any banners on this site, and I'm not willing to act on any firm's behalf. In fact, what I am about to say does not harm anybody: most hardware manufacturers out there create usable and useful pieces of hardware, as long as you use them for the purpose they were meant for.

To be concrete: don't expect to reach high transfer rates if you use consumer-grade network hardware; it's not ©XSIBackup's fault if you are backing up at 10 MB/s over your gigabit NICs. You can certainly run ESXi servers on commodity hardware and still reach good average backup speeds, but you should not skimp on network equipment. That does not mean you have to buy multi-thousand-dollar switches or spend hundreds of dollars on your NICs; it just means that the switch and the server's NICs are crucial, and that you need to choose them well.

I highly recommend Intel NICs: almost any server NIC chipset is well made and will be recognized by ESXi (check the ESXi hardware compatibility list before buying). As for switches, you need to design a good topology that avoids bottlenecks. A common topology is to use a powerful switch at the first level, so the servers can communicate among themselves, plus one or more switches to handle the network clients' traffic. You can also install an additional dual-port NIC in the server and connect it directly to the backup NAS device; this last option is dead simple and equally effective.

In regards to NIC teaming, don't get too excited just because you have four NICs in your server; teaming might not work the way you expect. It's true that building a NIC team is quite straightforward most of the time, but there are a number of ways to set one up, and the bandwidth does not simply add up. First of all, if you want NIC teaming, your switch needs to support it, so you'll have to read the datasheets and choose your hardware according to what you intend to achieve. In short, NIC bonding will offer you benefits like failover and load balancing, but that does not mean that a huge .vmdk file will be transmitted through all your NIC ports at the same time, multiplying your speed by the number of bonded ports.

As for the NAS server, don't complicate your life: Synology and QNAP devices are well made and optimized for data transfer. It's not worth building your own NAS on top of a server unless you already have a spare one, or you need a very powerful NAS in terms of CPU and/or memory, as would be the case if you wanted to run a filesystem like ZFS.

The backup design

Now that we have gone through the most important facts about the hardware, we can focus on the backup logic. First, we should realize that we are devoting valuable resources to backing up our data. Part of our data will never change; for instance, a bunch of hi-resolution photos from our annual meeting in 2007 is probably something we don't want to back up every day. Thus, all obsolete data should be extracted and archived before even laying out a backup strategy.

Secondly, a backup is not a proper backup if it does not allow you to go back in time, at least a number of days; you should also keep at least some weekly backups and some monthly ones. Apart from that, you can keep backups from previous years offsite just for archival purposes. Copying your server to the same place every night is not an option, as you would replicate any problem to your copy.
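A daily/weekly/monthly rotation like the one described above needs some pruning of old copies. A minimal sketch of that idea, assuming the timestamped folders produced by a --date-dir=yes layout sort chronologically by name; prune_backups is a hypothetical helper, not part of ©XSIBackup:

```shell
# prune_backups: keep only the newest $2 timestamped backup folders under
# the directory $1, deleting the oldest ones. Hypothetical helper; it
# assumes folder names (e.g. 2022-01-05) sort chronologically, which any
# timestamp-based naming gives you.
prune_backups() {
    root="$1"; keep="${2:-7}"
    total=$(ls -1 "$root" | wc -l)
    del=$((total - keep))
    [ "$del" -gt 0 ] || return 0          # nothing to prune yet
    ls -1 "$root" | sort | head -n "$del" | while read -r d; do
        rm -rf "$root/$d"                 # oldest folders go first
    done
}

# Example: prune_backups /vmfs/volumes/NAS-backup 7
```

A real rotation would keep separate daily, weekly and monthly sets, but the principle is the same: delete from the oldest end, never from the newest.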

What are the most dangerous threats in today's world?

Obviously the classic threats are still there: fire, theft, broken water pipes and so on. And we also have ever-changing threats like malware, trojans and especially ransomware. I know a couple of clients that were hit by ransomware incidents in the last year, and the only thing that saved them was having a good set of backups.

- So, the first basic characteristic a backup design must have is: multiple backups available, the more the better.

Apart from that, there are other important facts to take into account. Will we need to perform hot backups (while the VMs are running), or can we switch the VMs off because nobody is there during the night? If the office is not busy during the night, making backups will be child's play, so we will turn our case study company into a 33-seat 24x7 call center, to make the case a little more interesting. So, yes, we will need to perform hot backups. As stated above, we have three servers:

- A Windows server: it's a PDC that also holds the home folders of the CEO, the CIO and two more executives, so it acts both as PDC and file server. It is installed in a single 300 GB .vmdk file and holds 200 GB of data plus 100 GB of empty space on a 1 TB HD (/vmfs/volumes/datastore1).

- A Linux server: it holds the rest of the staff's data, mainly stored in a CRM database of about 90 GB, but there are also file folders belonging to 30 call center operators, adding up to 400 GB, an HR database of about 10 GB, and some database logs and technical data. This server is spread over two 1 TB HDs (/vmfs/volumes/datastore2 & /vmfs/volumes/datastore3) and holds 500 GB of data.

- A Unix server: this is the PBX that serves the call center. It runs FreeBSD and holds the PBX software plus 100 GB of recordings of the daily conversations, adding up to 110 GB of data plus 290 GB of empty space in a 400 GB .vmdk HD hosted in /vmfs/volumes/datastore2.

O.K., so we have 1.2 TB of data to back up every day, and we must do it in such a way that nobody notices. We are ready to design our backup strategy with ©XSIBackup Classic.


Programming the backup jobs

Well, first of all, do not panic: we use the word programming more in the sense of scheduling; you don't need to be a programmer to use ©XSIBackup Classic.

Let's start with the Windows server. We know it is the PDC, so we cannot switch it off. Apart from that, we know the executives that use it as a file server are not at the office from 17:30 to 07:30. Copying those 200 GB while the rest of the staff is using the Linux server and the FreeBSD PBX might cause the PBX to drop calls and the file server to respond sluggishly.

If we use ©XSIBackup Classic, we will be working at the ESXi shell level, using only one of the available cores in the server, so the impact in terms of CPU will be very low. Memory management has been optimized to use very limited amounts of RAM, so the only thing we should be afraid of is clogging the RAID controller or the network cards with the stream of data, which is not negligible (1,200 GB).

Do you remember, some paragraphs above, when I proposed using a separate NIC for the backup traffic? That is a cheap way to divert backup traffic away from the production NICs, the ones that keep the office working. If you use that technique, the impact on CPU and memory will be low, and so will the impact on the NICs. But wait, what about the HDs attached to the server? We could be saturating a disk controller channel with our backup traffic. The switch could also become congested, so we could use a separate one for this network segment. The obvious solution is to isolate, as much as possible, the elements involved in the backup: different NICs, different switches, and HDs attached to different disk controller channels.

In our case, the correct arrangement of the disks would be: the disk holding the PDC and executive file server's .vmdk attached to the disk controller's channel number 1. The Linux server should be spread over, at least, two .vmdk disks: one holding the database files, connected to the disk controller's channel number 2, and the other holding the user files, connected to the disk controller's channel number 3. The PBX's .vmdk, hosted in /vmfs/volumes/datastore2, shares channel number 2 with the database disk. Assuming we have 4 channels in our SATA controller, we can use the fourth to connect the 500 GB SSD cache disk, which will help the system run smoother.

This arrangement will use our hardware resources rationally and will allow us to perform the backup without disrupting the other departments' productivity. There are other ways to back up our servers that are much less demanding in terms of data transfer, but we will leave those for future posts; for now we will copy all the data every time, which will force us to squeeze our brains to optimize every layer of the system.

O.K., now that you have reconnected everything to take advantage of your hardware to the maximum extent, let's see how your ESXi server looks from the inside. You should have five datastores: three corresponding to the SATA disks, one for the SSD cache, and one corresponding to your NAS device. I recommend that you mount the NAS through NFS; it's simpler and more efficient than iSCSI, at least with the various NAS devices that we have tested. Make sure that you have SSH access to your ESXi server's shell. Log in to your server through SSH and run: df -h, you should see something like the following:

Filesystem Size Used Available Use% Mounted on
VMFS-5 1.0T 300.0G 700.0G 30% /vmfs/volumes/datastore1
VMFS-5 1.0T 490.0G 510.0G 49% /vmfs/volumes/datastore2
VMFS-5 1.0T 800.0G 200.0G 80% /vmfs/volumes/datastore3
NFS 3.8T 1.2T 2.6T 32% /vmfs/volumes/NAS-backup
VMFS-5 500.0G 400.0M 499.6G 1% /vmfs/volumes/cache
vfat 285.8M 205.8M 80.0M 72% /vmfs/volumes/58df34w2-e17c0367-85c9-90e2byadf7c4
vfat 249.7M 143.6M 106.1M 58% /vmfs/volumes/84e2267f-adgfdddc-40c8-fd5026c06f1e
vfat 4.0G 111.9M 3.9G 3% /vmfs/volumes/58ab0wd9-b91f87e0-7fbe-90e2ba2df7c4
vfat 249.7M 143.6M 106.1M 58% /vmfs/volumes/c6dfsg70-ecafghk7-a872-252fghdfgc67
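Since df is already available in the ESXi shell, you can use its output as a pre-flight check before a job runs. A minimal sketch, where check_usage is a hypothetical helper (not part of ©XSIBackup) and the 80% threshold is just an example:

```shell
# check_usage: read "df"-style output on stdin and fail when the mount
# passed as $1 is fuller than $2 percent. A hypothetical pre-flight test
# you could run before firing a job, so a nearly full NAS aborts the
# backup instead of filling the backup datastore.
check_usage() {
    mount_point="$1"; limit="$2"
    pct=$(awk -v m="$mount_point" '$NF == m { gsub("%", "", $(NF-1)); print $(NF-1) }')
    [ -n "$pct" ] && [ "$pct" -le "$limit" ]
}

# Example: df -h | check_usage /vmfs/volumes/NAS-backup 80 || exit 1
```

Wiring such a check into the job wrapper is cheap insurance: a backup that fails loudly is far better than one that silently fills the target volume.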

Below is our ©XSIBackup Classic job file. ©XSIBackup-DC job files are about the same, although the arguments change a bit. Nonetheless, conceptually the backup job file is the very same thing, namely: a script that wraps a call to the xsibackup binary along with the required arguments and options.

"/vmfs/volumes/datastore1/xsi-dir/xsibackup" \
--backup-prog=Vmkfstools \
--backup-point="/vmfs/volumes/backupds" \
--backup-type=Custom \
--backup-vms="WINPDC" \
--backup-how=Hot \
--date-dir=yes \
--use-smtp=1 \
--backup-id=001 \
--certify-backup=yes \
--description="Basic backup job 001" \
--on-success="backupId->002" \
--on-error="backupId->002" \
--snapshot=doquiesce \
--smart-info=yes \
--exec=yes >> "/vmfs/volumes/datastore1/xsi-dir/var/logs/xsibackup.log"

"/vmfs/volumes/datastore1/xsi-dir/xsibackup" \
--backup-prog=Vmkfstools \
--backup-point="/vmfs/volumes/backupds" \
--backup-type=Custom \
--backup-vms="LIN01,BSDPBX" \
--backup-how=Hot \
--date-dir=yes \
--use-smtp=1 \
--backup-id=002 \
--certify-backup=yes \
--description="Basic backup job 002" \
--snapshot=doquiesce \
--smart-info=yes \
--exec=yes >> "/vmfs/volumes/datastore1/xsi-dir/var/logs/xsibackup.log"
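Since these job files are plain shell wrappers, it is worth parsing them for syntax errors before handing them to cron; a missing trailing backslash or an unbalanced quote would otherwise only surface at 19:00 h. A minimal sketch, where check_job is a hypothetical helper:

```shell
# check_job: syntax-check a job script without executing it. "sh -n"
# only parses the file, so a typo in the quoting or in the trailing
# line-continuation backslashes is caught before cron ever fires the job.
check_job() {
    sh -n "$1" && echo "$1: syntax OK"
}

# Example: check_job /vmfs/volumes/datastore1/xsi-dir/etc/jobs/001
```

Running the job once by hand from the SSH shell, outside the backup window, is of course the definitive test.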

Backup job explained:

• As the executives leave the office at 17:30 and don't come back until 07:30 am, we have a wide-open backup window here. We will start our backup job at 19:00 h, just to make sure the chances of somebody still working are smaller (see the cron schedule below).

• As the executives don't normally work during the weekends, we won't be backing up this server on Saturdays and Sundays, as data won't normally grow those days (see cron schedule below).

• We don't need to pass the --backup-how=Hot option, as it is the default behaviour; in any case it's there for you to read.

• I will be explicitly using the --backup-vms option to set the VMs to backup.

• We will be sending a backup report by means of SMTP server #1 (--use-smtp=1), set in the conf/smtpsrvs file.

• The --date-dir=yes argument/option has been set, thus backups will be stored in timestamped folders under the --backup-point directory.

• And finally, we are setting the --on-success and --on-error arguments in backup job 001, so that when that job ends (Mon-Fri), a second backup job (backupId->002) is started either way.
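The chaining described in the last bullet can be pictured in plain shell. This is only an illustration of the logic (both --on-success and --on-error point at job 002, so it runs whatever the outcome of job 001), not how xsibackup implements it internally:

```shell
# run_chain: run the first job, then run the second job regardless of the
# first one's outcome, while still propagating the first job's exit code.
# Conceptual equivalent of --on-success/--on-error both pointing at 002.
run_chain() {
    "$1"; status=$?     # run the first job and remember its exit code
    "$2"                # second job runs on success *and* on error
    return "$status"    # propagate the first job's status
}

# Example: run_chain "/vmfs/volumes/datastore1/xsi-dir/etc/jobs/001" \
#                    "/vmfs/volumes/datastore1/xsi-dir/etc/jobs/002"
```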

Now we need to set up the second backup job (backup-id=002), which is fired directly on Saturdays and Sundays (and chained after job 001 on weekdays). Taking into account the way work is organized in our call center, and that the first backup (200 GB) will take about half an hour to 45 minutes, we should probably launch the Linux server backup first and leave the PBX backup for afterwards, as the volume of incoming calls in the call center normally decreases as the night advances.

# This is the crontab for the user specified in the first part of the file name
# You can add your cron schedules here, do not forget to enable them in the
# ESXi crontab by running the ./xsibackup --update-cron command in your host
# Example:
# * 12 * * * "/vmfs/volumes/datastore1/xsi-dir/etc/jobs/001" > /dev/null 2>&1
#min hour day mon dow command
0 19 * * 1,2,3,4,5 /vmfs/volumes/datastore1/xsi-dir/etc/jobs/001 >> /vmfs/volumes/datastore1/xsi-dir/xsibackup.log 2>&1
0 19 * * 6,7 /vmfs/volumes/datastore1/xsi-dir/etc/jobs/002 >> /vmfs/volumes/datastore1/xsi-dir/xsibackup.log 2>&1

The second backup job (backup-id=002) will take about 2-3 hours; it doubles the size of the first backup volume and will be made through a pretty busy controller channel, so we should expect it to be a bit slower. You should also take into account that the stream of backup data will slow down the server's response; still, it should work fast enough to keep production going.
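The 2-3 hour figure comes from simple throughput arithmetic. A back-of-envelope sketch, assuming roughly 900 GB for job 002 (the remainder of the 1.2 TB after job 001's ~300 GB) and an assumed sustained rate of ~100 MB/s over gigabit Ethernet plus NFS overhead:

```shell
# Back-of-envelope duration estimate: size_gb * 1024 / rate_mb_s gives
# the transfer time in seconds. Both figures are assumptions, not
# measurements.
size_gb=900
rate_mb_s=100
seconds=$(( size_gb * 1024 / rate_mb_s ))
printf '~%dh %dm\n' $(( seconds / 3600 )) $(( seconds % 3600 / 60 ))
```

That yields roughly two and a half hours, squarely inside the 2-3 hour window; any contention on the controller channel pushes the real figure toward the upper end.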

Daniel J. García Fidalgo