Deploying ©XSIBackup Classic in an enterprise network
Please note that this post relates to the old, deprecated software ©XSIBackup-Classic. Some of the facts contained herein may still apply to more recent versions, though.
For new installations, please use the new ©XSIBackup, which is far more advanced than ©XSIBackup-Classic.
We are starting this series of articles to illustrate different backup projects that cover the most frequent topologies and can easily be adapted to your particular needs. In recent years, and particularly in recent months, we have added a lot of features to ©XSIBackup-Pro. It has become a sort of Swiss Army knife, which makes ©XSIBackup-Pro incredibly flexible when designing your backup projects, but some clients have let us know that they find it a bit overwhelming.
©XSIBackup Classic is a command line tool. It is user friendly and quick to learn; in fact, you can put your first backup to work in just a couple of minutes and, from then on, learn about the different options it offers very rapidly. In any case, if this is your first time at a *NIX-like command line, maybe backing up a mission-critical server is not the best thing to play around with. If this is your case, irony aside, take your time to download a Linux distro and learn the basics; then you can come back, start your first backups and even learn as you go.
The following posts will introduce you to some of the most requested backup topologies. We'll start with some of the most common ones and will add more as new features are introduced. You will most likely find something similar to what you need, if not the exact case.
But before starting, let's review some of ©XSIBackup-Pro's main features, just to make sure that we have all the tools we need on the table and that we know what each of them is useful for.
• Synchronous/asynchronous: this pair of opposite concepts is widely used in engineering and telecommunications and can have subtly different connotations depending on what we are talking about. In our particular universe (copying huge files), the pair signifies whether we are told that a particular chunk of data was correctly written at the other end before the next one is sent, or whether we just send the chunks one after another and trust that they are written successfully, optionally using a delayed mechanism (generally outside the transmission protocol) to check the integrity of the copied data... Now that I read it, it looks very condensed, but it seems pretty accurate; I'm open to corrections anyway. As a rule of thumb: synchronicity is good, but it's slow. When transferring huge files, especially if we will be doing it repeatedly, being a bit more daring can be an option.
Understanding the condensed pair of concepts above will be one of the keys to mastering backups. It's not something specific to ©XSIBackup Classic; it's a general principle that you'll read about everywhere when copying, synchronizing, streaming... It is also woven into every one of the multiple layers of any protocol you use to achieve such tasks.
• Linking servers (--link-srv): this is not a very technical term. We use the term "link" just to make it friendlier; in fact, everything about it is very friendly from a user's perspective: you can link a server just by setting --link-srv equal to an IP address belonging to another server. It will ask you for the password a number of times and, when done, ©XSIBackup Classic will be able to copy data to the other host and trigger backup commands on it. That's all from the user's point of view. What's happening behind the scenes is that ©XSIBackup Classic generates an RSA key pair and adds the public part of it to the authorized_keys file of the other host's SSH service. The result: the first server can "talk" to the other without needing to enter a password every time.
• Snapshots: this is another key concept to understand in virtualization. Similar constructs are also used in other areas of computer science, such as volume managers and filesystems. We will focus on what ESXi snapshots are, how they work and what they are useful for.
I have spoken with many people who believed a snapshot was some sort of backup, and it is not. A snapshot is a file that mimics a hard drive (like any virtual disk) where the VM stores data temporarily. Snapshots can exist in parallel and can also form chains. The main reason why a snapshot is useful for a backup tool is that it frees the base virtual disks from being accessed by the VM process running in the hypervisor, thus allowing a backup tool to access them and back them up. Another way snapshots can be used is to retain previous states of the VM that we can revert to if needed. When using snapshots this way, we treat them as a chain of data holders, in which the eldest link, the base .vmdk disk, holds the data as it was when the first snapshot was taken. The rest of the links in the chain contain the data written between the time the previous snapshot was taken and the time the following one was.
One good metaphor is to imagine a clerk piling up papers on one side of his desk. He knows his supervisor will ask him to report the work done during the week, so he just inserts a Post-it note before leaving the office each day and continues to pile up papers when he arrives the following day. If at any time he is asked to report how much work he had finished by, let's say, Wednesday, he just has to remove the papers on top of the Wednesday Post-it and the pile will be returned to exactly the state it was in when he left the office on Wednesday evening. Once the supervision is over, he can remove the Post-it separators and there will be no way to know the state of the pile on a given day, but all the information will still be there.
The information contained between the Post-it notes corresponds to the snapshots: we can remove the Post-it notes and consolidate the papers with the data underneath, or remove the topmost block of papers and revert the state of things to how they were the day before.
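The pile-of-papers metaphor can be sketched as a toy model in Python. This is only an illustration of the idea of a base disk plus a chain of deltas; the class and method names (DiskChain, take_snapshot, revert, consolidate) are hypothetical and do not correspond to any ESXi or ©XSIBackup interface.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One link in the chain: the base disk or a snapshot delta."""
    name: str
    blocks: dict = field(default_factory=dict)  # block number -> data


class DiskChain:
    """Toy model of a base disk followed by a chain of snapshot deltas."""

    def __init__(self, base_blocks):
        self.layers = [Layer("base", dict(base_blocks))]

    def take_snapshot(self, name):
        # Insert a "Post-it": everything below is frozen, new writes go above.
        self.layers.append(Layer(name))

    def write(self, block, data):
        # Writes always land in the topmost layer.
        self.layers[-1].blocks[block] = data

    def read(self, block):
        # The newest layer that holds the block wins.
        for layer in reversed(self.layers):
            if block in layer.blocks:
                return layer.blocks[block]
        return None

    def revert(self):
        # Discard the topmost delta (the papers above the last note).
        if len(self.layers) > 1:
            self.layers.pop()

    def consolidate(self):
        # Merge every delta into the base (remove the notes, keep the papers).
        base = self.layers[0]
        for layer in self.layers[1:]:
            base.blocks.update(layer.blocks)
        self.layers = [base]
```

While a delta exists, the base layer is never written to, which is the property a backup tool relies on when it copies the base disk of a running VM.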
BACKUP PROGRAMS (--backup-prog):
Backup programs are the binaries or modules that allow you to copy the VM's constituent files from one place to another. The "other" place is the backup location, or --backup-point in ©XSIBackup's terminology. Backup locations can basically be of two kinds, local or remote, but in this case the terms are not literal. I still remember when I watched Sesame Street and Grover explained the difference; things have changed a lot since then and have become somewhat more complicated. The term "local" refers to something that is logically attached to, and thus visible to, our ESXi server. It does not matter where it is physically; it could in fact be at the other end of the world. The term "remote" refers to another ESXi server that is visible through an IP network; it does not matter if the "remote" server or NAS is connected through a 12-inch Ethernet cable. If you still have doubts about what this means, then you should take some time to familiarize yourself with networking and storage protocols.
• Vmkfstools: this is simply the command line utility offered by VMware and present in every ESXi system. It's fast and efficient, but it's limited in functionality, so it is only useful for local copies, datastore to datastore. We will invoke this program only when backing up locally to a datastore visible to our ESXi server. ESXi uses NFS and iSCSI to mount datastores, as well as local HDs connected to the hardware storage controller. Any HD or volume that is accessible via any of these methods through a datastore visible to our ESXi server is considered local backup media. In short: Vmkfstools is the fastest tool for copying data locally, but it can't copy data over a network.
• Rsync: what can I say about Rsync that has not already been said? You should know this tool before starting to use it here. If not, take some time to read about it, use it in a Linux command line environment to transfer files to other servers, and get to know it. Rsync can copy files both locally and over an IP network, and can do it differentially. This means that it's able to detect exactly which bytes have changed and send only those bytes over the wire to minimize data transfer. But don't smile too much: before it knows which bytes have to be sent, it needs to find out where those bytes are in the file, and that can take a loooooong time for our huge .vmdk files. Please, read this article to have a more thorough understanding of what I'm describing here. In short: Rsync is good at transferring data through a network, but it's much slower than Vmkfstools when copying files locally.
• Borg Backup: this is a fork of the older "Attic Backup" Python deduplication engine. Its development is active, in contrast to Attic, which hasn't published any update for years. Borg is used as a mere backup backend by ©XSIBackup Classic. Although ESXi has a built-in Python interpreter, Borg Backup cannot currently be run on ESXi, so ©XSIBackup Classic sends all bytes to Borg Backup, losing any advantage in reducing or limiting the transmitted data. Still, Borg is very efficient at saving space, as it deduplicates and compresses data at the block level. By using Borg Backup as your backup backend, you'll multiply your storage capacity and will be able to store a historic set of backups to move back to any desired point in time. In short: Borg Backup stores data efficiently and is faster than Rsync as a network backup program in XSIBackup-Pro.
• XSITools: this is the newest backup program offered by XSIBackup-Pro. XSITools is our proprietary block deduplication mechanism, and it can be used on locally mounted HDs or any datastore equipped with a decent, compatible filesystem. By decent I mean something that is reliable and can host some hundreds of thousands of files. It has many advantages over the other backup programs: it is the fastest at copying files locally, with the sole exception of Vmkfstools, which wins by a small margin; unlike Vmkfstools, it can deduplicate data, which is a big advantage; and it will soon offer compression and over-the-network data replication, which will make it the most powerful backup program available in XSIBackup-Pro. Even though it is a proprietary copy mechanism implementing its own copy algorithm, data is stored in a transparent way, and you can easily rebuild your data from the chunks created by XSITools if you need to.
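Block-level deduplication, the idea behind both Borg Backup and XSITools as described above, can be sketched in a few lines of Python. This is a generic illustration with fixed-size blocks and SHA-256 hashes; the block size, the hash function and the in-memory dictionary used as a store are assumptions made for the example, not the actual formats used by either tool.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose so the example is easy to follow


def store_file(data: bytes, block_store: dict) -> list:
    """Split data into fixed-size blocks and keep each unique block once.

    Returns the ordered list of block hashes needed to rebuild the file.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        block_store.setdefault(digest, block)  # known blocks take no extra space
        recipe.append(digest)
    return recipe


def rebuild_file(recipe: list, block_store: dict) -> bytes:
    """Concatenate the stored blocks in recipe order."""
    return b"".join(block_store[digest] for digest in recipe)
```

Storing a second backup of a mostly unchanged disk only adds the blocks that differ, which is why a deduplicated repository can hold a long history of backups in little space.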
THE BACKUP POINT (--backup-point):
This is quite a simple concept: it's the place where you'll be copying/backing up your VMs to. Nothing obscure, right? ©XSIBackup Classic manages two types of backup points:
• Local: defined as a local absolute path, e.g. /vmfs/volumes/backup
• Remote: defined as a remote absolute path. This type is slightly trickier, as it also includes the remote server's IP or FQDN and the SSH port, with the three fields separated by colons (server:port:path), e.g. 192.168.33.100:22:/vmfs/volumes/backup
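The two forms of backup point described above can be told apart with a small parser. The following is a minimal sketch, assuming an IPv4 address or FQDN as the server; the BackupPoint type and the parse_backup_point function are hypothetical names used for illustration and are not part of ©XSIBackup.

```python
from typing import NamedTuple, Optional


class BackupPoint(NamedTuple):
    host: Optional[str]  # None for a local backup point
    port: Optional[int]  # None for a local backup point
    path: str


def parse_backup_point(value: str) -> BackupPoint:
    """Split a backup point string into host, port and path."""
    if value.startswith("/"):
        # Local: a plain absolute path.
        return BackupPoint(None, None, value)
    parts = value.split(":", 2)  # at most three fields: server, port, path
    if len(parts) != 3 or not parts[1].isdigit() or not parts[2].startswith("/"):
        raise ValueError(f"expected server:port:/absolute/path, got {value!r}")
    host, port, path = parts
    return BackupPoint(host, int(port), path)
```

For instance, parse_backup_point("192.168.33.100:22:/vmfs/volumes/backup") yields the host 192.168.33.100, the port 22 and the path /vmfs/volumes/backup, while a value starting with a slash is treated as local.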
THE "BACKUP HOW" (--backup-how):
This argument refers to the method that will be used during the backup process. The default --backup-how method is hot, which means that the VM will be backed up even if it's turned on. There are three backup methods:
• Hot: VMs will be backed up without turning them off. This is the default and preferred method for obvious reasons.
• Cold: by using this method, ©XSIBackup Classic will ensure that the VM is off during the whole backup process. It can be useful for office servers that don't need to remain switched on during the night hours. This way you prolong their life, save electricity and at the same time ensure 100% that the backup process is consistent, even with the most bizarre OS that can run on ESXi.
• Warm: this method offers something in between a hot backup and a cold backup. It turns the VM off, takes a snapshot, switches the VM back on and then makes the backup. This ensures that the snapshot is consistent for those OSs for which VMware Tools is not available.
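The difference between the three methods is mostly a matter of ordering, which can be sketched as follows. The step names are hypothetical labels, and only the warm sequence up to the copy is taken directly from the description above; the snapshot handling in the hot case and powering the VM back on after a cold backup are assumptions made for the illustration, not a description of what ©XSIBackup Classic actually executes.

```python
def backup_steps(how: str = "hot") -> list:
    """Return the ordered, purely illustrative steps for a backup method."""
    steps = {
        # Assumed: snapshot first so the base disks can be copied while the VM runs.
        "hot": ["take_snapshot", "copy_disks", "remove_snapshot"],
        # The VM stays off for the whole copy; powering it on afterwards is assumed.
        "cold": ["power_off", "copy_disks", "power_on"],
        # Off, snapshot, on, then copy, as described for the warm method.
        "warm": ["power_off", "take_snapshot", "power_on",
                 "copy_disks", "remove_snapshot"],
    }
    try:
        return steps[how.lower()]
    except KeyError:
        raise ValueError(f"unknown backup method: {how!r}") from None
```

Seen this way, the warm method keeps downtime limited to the moment the snapshot is taken, while the cold method keeps the VM off for as long as the copy lasts.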
Daniel J. García Fidalgo