
©VMWare ©ESXi memory constraints and workarounds

How to deal with the ©VMWare Hypervisor when managing very big repos and sets of data

(*) Branch 1.5.2 of ©XSIBackup includes automatic management of memory for edge cases (VMs over 6-8 TB). This post is an overview of memory as a resource when using ©XSIBackup in such edge cases and can also serve as a document on how to manage memory pools manually with older versions of ©XSIBackup.

Let's start by saying that you can't assume ©ESXi will behave like any other Linux OS just because it resembles one. Many things are quite different; one of them is how memory is managed.


©ESXi organizes memory in isolated pools. Our guess is that this is a good way to increase the stability of the hypervisor: a process that gets out of control is less likely to affect others, or at least it won't eat up all the available memory. The default configured memory size for binaries running in the shell is 800MB, which is a rather generous amount of memory.
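For instance, you can list those scheduler groups (the pools we are referring to) with the vsish binary covered later in this post. A minimal sketch; the exact tree layout may vary between ©ESXi versions:

# List all scheduler group Ids known to the VMkernel
vsish -e ls /sched/groups/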

When you use a tool such as ©XSIBackup (a command line utility) with very big VMs, you also need to consider the whole scenario in which you are working; call it having a bigger picture than just the arguments you are passing.

That includes many aspects of what you are doing: considering async vs sync NFS, choosing an adequate FS (look no further, XFS is your best option for backups), taking into account the network latency and how it will affect throughput and memory usage, etc.

If you have a relatively small set of VMs (<2TB) and you plan to keep some replicas or backups with a few tens of restore points, you won't even need to read this post or delve into how you are using your resources. Still, if your data set is big (>4TB) and you are planning to keep many restore points and prune your repos, then you should keep on reading.

One of the key subjects among all other variables in ©ESXi is memory usage. The default configured value for shell binaries is 800MB, which will suffice for VMs up to multiple terabytes in size. If your VMs are just some hundreds of gigabytes (as is most often the case in SMEs), the default set of resources will be enough, unless you are going to grow your repositories very much (some hundreds of restore points). In that case you will just need to worry about memory and how to fulfil the task when the time comes to prune or repair your repo.

Even in that case you would not need to worry if you prune from the backup server, as Linux will let the prune process use all available memory, and the prune itself will run much faster there than from the ©ESXi host when the data is hosted on an NFS share.

Backups & Replicas

When backing up or replicating a huge VM (we will set the threshold for the subjective adjective "huge" at 4TB, although it could probably be stretched further), you will need to start worrying about memory, as the default 800MB limit will probably start to fall short.

When the xsibackup process, or any other process, runs out of available memory, you will receive a SEGFAULT and program execution will halt.

Many of you will probably wonder why ©ESXi doesn't just swap memory instead of returning a nasty SEGFAULT. ©ESXi's design may look rather primitive when compared with a modern Linux OS, but the thing is that it is not a regular OS but a hypervisor; it hasn't been designed to interact with humans. In addition, swapping memory for shell binaries doesn't seem a very advisable thing to do on a virtualization host, as it could easily clog the disk controllers and leave the VMs literally hanging by a thread.

If you are planning to backup/replicate a VM which is, let's say, 20TB in size, the default configured values in ©ESXi will not be enough to complete the task. Is this VM size out of the reach of ©XSIBackup? The answer is that ©XSIBackup can handle it: as it has been designed using 64 bit integers, it can handle virtually any VM size.

Nonetheless, in the above case you will be short of RAM under the default configured limits. You will need to allocate more RAM from what is physically available.

(*) Needless to say, if you are plainly short of RAM, for instance an ©ESXi host with 4GB of RAM on which you plan to backup some 20TB VM, you are simply going to bang your head against a wall. The solution is simple: add more RAM.

There is a nice, mostly undocumented binary called vsish in the ©ESXi hypervisor which allows you to control many aspects of how it works. Among other things, it allows you to query the available memory and to tweak the amount assigned to different pools.

Checking the amount of available RAM

vsish -e get /sched/groups/4/stats/capacity | awk -F '[ :]' '$4 ~ /mem-unreserved/ {print $5/1024" Megabytes"}'

The above will tell you the amount of RAM that you can rely on at any given moment. You should always try to leave a good margin when setting RAM reservations manually.
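If you want to automate that margin, you can wrap the same check in a small guard before launching a big job. A minimal sketch, assuming a 2048MB safety margin (adjust it to your environment):

# Abort if less than 2048MB of unreserved RAM remains
free_mb=$(vsish -e get /sched/groups/4/stats/capacity | awk -F '[ :]' '$4 ~ /mem-unreserved/ {print int($5/1024)}')
if [ "$free_mb" -lt 2048 ]; then
    echo "Only ${free_mb}MB unreserved, aborting"
    exit 1
fi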

Monitoring ©XSIBackup's RAM usage

memstats -r group-stats -s gid:name:name:parGid:nChild:min:max:minLimit:conResv:availResv:memSize -u mb | grep "xsibackup\|gid"

The above command returns a set of real-time figures on RAM usage for the xsibackup process group. You can run it as many times as you want to track the amount of RAM being used by ©XSIBackup.

The most significant columns are:

- GID: the group Id assigned to the xsibackup process.
- PARGID: the parent group's Id.
- MIN: minimum assigned memory, usually 0 or -1 (no limit).
- MAX: maximum assigned memory, usually -1 (no limit).
- CONRESV: memory consumed from the reservation.
- AVAILRESV: memory still available from the reservation.

As a heavy duty job starts to malloc memory, you will see the CONRESV figure grow and the AVAILRESV value shrink. When the AVAILRESV column reaches 0 you will get a SEGFAULT.
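If you want to watch those two columns evolve while a job runs, you can simply poll the same command in a loop. A basic sketch (press Ctrl+C to stop it):

# Print xsibackup's memory figures every 5 seconds
while true; do
    memstats -r group-stats -s gid:name:parGid:conResv:availResv -u mb | grep "xsibackup\|gid"
    sleep 5
done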

As said, this will only happen when you run a job that is very memory-intensive, namely: a backup or replica of a very big VM well above 4TB in size, or pruning or repairing a repository that contains many restore points from the ©ESXi OS itself, which you should not do, although we haven't explicitly removed the possibility to do so.

Thus, if you insist on pruning or repairing a very big repository in an NFS datastore from the ©ESXi box itself, you will likely need to use the --memory-size argument to extend the amount of memory that ©XSIBackup can use.
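The exact value format for --memory-size may vary between versions, so check your version's built-in help; the line below is just a hypothetical example (the repo path, restore point name and figure are made up):

# Hypothetical example: prune a restore point while raising the memory limit
./xsibackup --prune /vmfs/volumes/NFS01/repo/20220701-020000 --memory-size=2048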

Set the amount of RAM available to ©XSIBackup

To work around that problem you can tweak the amount of memory assigned to the parent process, so that the ©XSIBackup binary can just malloc more memory as it needs it. This can only be done while an ©XSIBackup job is running, as we can only find the PID of the parent process once the job has been launched.

The command below greps the line for a running xsibackup process, extracts the parent process' Id from that line and sets the maximum amount of memory assigned to the parent process to 2 gigabytes (2048MB).

gid=$(memstats -r group-stats -g0 -s gid:name:parGid | grep xsibackup | awk '{print $3}') && vsish -e set /sched/groups/$gid/memAllocationInMB max=2048;
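You can then read the same vsish node back to verify that the new limit is in place (this assumes $gid is still set from the command above):

vsish -e get /sched/groups/$gid/memAllocationInMB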

Pruning

--prune is by far the most memory-intensive argument that you can run in ©XSIBackup. This is due to the fact that, to prune a backup repository, you need to account for every block in it.

Pruning consists of traversing all the blocks in a repository and eliminating dupes, to then find out which blocks are exclusive to the restore point being pruned. This requires approximately 50 bytes of data per block, including duplicates.

Let's say that you have a 1TB VM set and that you have accumulated 70 restore points so far. Assuming 1MB blocks, that scenario yields around 70M blocks. If we consider that we need 50 bytes per block, hosting that info in a buffer takes about 3.3 GB; add a twin buffer to perform dupe removal and other operations, and you need around 6.7 GB of RAM to be able to prune your repository.

That is just a memory peak that will be reached for some seconds; still, you need that memory to be available.

That is why you need to make your own basic calculations and reserve the resources that you need to operate. Setting up a backup job and letting the repository grow to 300 restore points, just to find out that you can't prune it with the hardware you have, is not somebody else's fault, nor a bug in the software.
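Those calculations are easy to script. A minimal sketch of the estimate above, assuming 1MB blocks and the ~50 bytes of metadata per block (twin buffer included); adjust the two variables to your own scenario:

# Rough RAM estimate to prune a repo: blocks x 50 bytes x 2 buffers
vm_size_tb=1
restore_points=70
blocks=$((vm_size_tb * 1024 * 1024 * restore_points))
bytes=$((blocks * 50 * 2))
echo "Approx. $((bytes / 1024 / 1024)) MB of RAM needed"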


Do not prune/repair your repositories from within ©ESXi

There is no justification for pruning your big repositories from within the ©ESXi shell itself. ©XSIBackup will allow you to do so, and also offers the --rotate argument, which uses the prune feature internally. Although that is very convenient for the vast majority of users, who have manageable VM sets and keep some 20-30 restore points, if you are backing up very big VMs you don't want to do that.

The reasons not to do so should be obvious: I/O times are multiplied on NFS shares when compared to performing the same operation locally. An NFS volume will take hundreds of times longer to perform an I/O system call, such as removing a file, than the server hosting the disks takes through its own SCSI bus. When you are working with millions of blocks the drawbacks become very apparent.

Thus, when you are planning to backup a vast amount of data, you need a proper backup server: a Linux server (we recommend Rocky Linux as of 2022). You need a server on which you can run your --prune commands with virtually none of the memory limits that ©ESXi imposes; Linux will give you that.
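Once the repository sits on that Linux server, the prune runs at local disk speed with all the host's RAM at its disposal. A hypothetical invocation; the repo path and restore point name are just examples:

# Run the prune on the backup server itself, not on ©ESXi
./xsibackup --prune /backup/repo/20220701-020000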

NAS appliances such as Synology or QNAP are not a good choice for big loads of data. Why? Because they run on top of custom, cut-down Linux environments which limit what you can do, and they have very limited CPUs and RAM when compared to a regular computer.


Next steps

There are already some optimizations, and there will be more as the software evolves, that reduce the amount of RAM required for pruning, but those are the basic figures. Nobody said that managing 70TB of data was something simple.

Branch 1.5.2 adds automatic memory management for extreme cases, in such a way that it will increase the reserved memory as needed. Nonetheless, we will probably limit pruning directly from ©ESXi on NFS-mounted volumes.