Last updated on Monday 28th of February 2022 08:52:48 PM



ESXi Disk Layout Related Issues

Please note that this post relates to the old, deprecated ©XSIBackup-Classic software. Some of the facts contained herein may still apply to more recent versions, though.

For new installations, please use the new ©XSIBackup, which is far more advanced than ©XSIBackup-Classic.

Let's start by saying that an ESXi VM is conceptually the same as a regular PC. The only difference is that the parts that constitute it are virtual (software); they reside on your hard disk as files: binary .vmdk files for storage, or some kind of executable file for the drivers that emulate the real hardware. Thus, no matter what happens, whether you use ESXi or any other virtualization system, as long as you have your disks (.vmdk files in the case of ESXi) and they are consistent, your data and your whole VM are safe.

But sometimes VMs don't boot, even when the data in the .vmdk files is consistent. As with any regular PC, if cables are misconnected or faulty, or you simply left a connector unplugged, the machine may not boot. Let's start by analyzing the main constituent parts of a VMware ESXi virtual machine:

As we stated before, the main components are the .vmdk disks themselves; as long as they are O.K., your VM can be easily recovered. Apart from the .vmdk files, there are two main configuration files: .vmx and .vmsd.

The .VMX File:

It is unique (one per VM) and is normally stored in the main VM folder. When you list the VMs in an ESXi system, the .vmx file will normally appear associated with the VM. This is to be expected, as it is the main file describing the VM itself. The .vmx file is a plain text file; it can be edited manually to add, remove or change any of its contents.

Getting list of all VMs...
24 XSINAS [datastore1] XSINAS/XSINAS.vmx centos64Guest vmx-08
26 WXPMK [datastore1] WXPMK/WXPMK.vmx winXPProGuest vmx-08


You can start by editing a .vmx file. You can easily do so with the vi editor in the ESXi SSH shell; change directory to your main VM folder and issue:

/vmfs/volumes/.../xsi-dir #vi yourvm.vmx


You will see a lot of text describing the hardware associated with the VM: virtual hardware version, memory, CD-ROM, diskettes, disk controllers, the disks themselves, etc. Every time you edit the VM configuration by adding a disk or removing any other device, the vSphere client (Windows or web) will edit this file accordingly.

.encoding = "UTF-8"
config.version = "8"
virtualHW.version = "8"
pciBridge0.present = "TRUE"
pciBridge4.present = "TRUE"
pciBridge4.virtualDev = "pcieRootPort"
pciBridge4.functions = "8"
pciBridge5.present = "TRUE"
pciBridge5.virtualDev = "pcieRootPort"
pciBridge5.functions = "8"
pciBridge6.present = "TRUE"
pciBridge6.virtualDev = "pcieRootPort"
pciBridge6.functions = "8"
pciBridge7.present = "TRUE"
pciBridge7.virtualDev = "pcieRootPort"
pciBridge7.functions = "8"
vmci0.present = "TRUE"
hpet0.present = "TRUE"
nvram = "WXPMK.nvram"
virtualHW.productCompatibility = "hosted"
powerType.powerOff = "soft"
powerType.powerOn = "hard"
powerType.suspend = "hard"
powerType.reset = "soft"
displayName = "WXPMK"
extendedConfigFile = "WXPMK.vmxf"
numvcpus = "2"
cpuid.coresPerSocket = "2"
memsize = "256"
ide0:0.present = "TRUE"
ide0:0.fileName = "WXPMK.vmdk"
ethernet0.present = "TRUE"
ethernet0.networkName = "VM Network"
ethernet0.addressType = "generated"
guestOS = "winxppro"
uuid.location = "56 4d 59 1e 62 f8 fd e3-fd 97 b7 c8 6b 43 c1 ab"
uuid.bios = "56 4d 07 ef 3c 2b 98 eb-a8 b6 9a a0 8f e5 1d e7"
vc.uuid = "52 52 a6 bb 09 0c 86 22-21 1b 7f 70 62 1d d5 e4"
ethernet0.generatedAddress = "00:0c:29:e5:1d:e7"
ethernet0.pciSlotNumber = "32"
vmci0.id = "-1880810009"
vmci0.pciSlotNumber = "33"
tools.syncTime = "TRUE"
cleanShutdown = "FALSE"
...
...


One of the commands we use widely across many posts is:

/vmfs/volumes/datastore1/WXPMK #cat WXPMK.vmx | grep .vmdk
ide0:0.fileName = "WXPMK.vmdk"
scsi0:0.fileName = "WXPMK_1.vmdk"


This command reads the whole .vmx file and then filters its output, keeping only the lines that contain the string "vmdk". The result is the set of lines describing each .vmdk path and the hardware channel the disk is attached to. In this example we have used an old XP machine that we use for marketing, thus the name WXPMK. This is an old Frankenstein machine that has an IDE disk attached to the first channel of the virtual IDE controller (ide0:0) plus an additional SCSI disk attached to the first available SCSI channel (scsi0:0).
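When the output needs to be consumed by a script rather than read by eye, the same filter can be made a bit stricter. The following is a minimal sketch, assuming a POSIX shell with sed (as found in the ESXi SSH shell); the function name list_vmx_disks is hypothetical and is not part of ESXi or XSIBackup.

```shell
#!/bin/sh
# list_vmx_disks: hypothetical helper, not part of ESXi or XSIBackup.
# Prints one "<channel> <path>" pair per disk entry found in a .vmx file.
# Only lines of the form  <bus><n>:<n>.fileName = "<something>.vmdk"  are
# matched, so CD-ROM images (.iso) and unrelated keys are ignored.
list_vmx_disks() {
    sed -n 's/^\([a-z]*[0-9]*:[0-9]*\)\.fileName *= *"\(.*\.vmdk\)" *$/\1 \2/p' "$1"
}
```

Called as list_vmx_disks WXPMK.vmx from inside the VM folder, it should print one line per disk, with the channel first and the path second.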

One of our duties as ESXi admins is making sure the contents of the .vmx files accurately describe the hardware attached to the VM. You do not need to know what each line means; over the years you might end up learning it, but that's not important. One of the most important tasks in .vmx file maintenance is making sure it points to the right paths for the .vmdk files. There are only two kinds of possible values for the paths in this file: 1) relative, where the disk path is just the file name because the disk is stored in the same directory as the .vmx file, and 2) absolute, where the path points at some location in any of the available datastores, e.g.: /vmfs/volumes/datastore2/WXPMK/WXPMK_2.vmdk
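The path check described above can be sketched as a small script. This is only an illustration under stated assumptions: a POSIX shell with sed and dirname is available, relative paths are resolved against the folder holding the .vmx file, and check_vmx_disks is a hypothetical name, not an ESXi or XSIBackup command.

```shell
#!/bin/sh
# check_vmx_disks: hypothetical helper, not part of ESXi or XSIBackup.
# For every .vmdk referenced in a .vmx file, reports whether the path is
# relative or absolute and whether the descriptor file actually exists.
# Returns 1 if at least one referenced disk is missing, 0 otherwise.
check_vmx_disks() {
    vmx=$1
    vmx_dir=$(dirname "$vmx")
    status=0
    # Extract the quoted value of every "*.fileName" line ending in .vmdk
    paths=$(sed -n 's/^[^=]*\.fileName *= *"\(.*\.vmdk\)" *$/\1/p' "$vmx")
    while IFS= read -r p; do
        [ -n "$p" ] || continue
        case $p in
            /*) kind=absolute; full=$p ;;            # e.g. /vmfs/volumes/...
            *)  kind=relative; full=$vmx_dir/$p ;;   # same folder as the .vmx
        esac
        if [ -f "$full" ]; then
            echo "OK $kind $full"
        else
            echo "MISSING $kind $full"
            status=1
        fi
    done <<EOF
$paths
EOF
    return $status
}
```

The here-document is used instead of a pipe so that the status variable set inside the loop is still visible when the function returns.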

The .vmx file is used by the VM process, and its contents change while the VM is switched on, depending on the operations we perform on the VM. The most important thing to understand is that the .vmx file always references the .vmdk files the VM is currently running on top of. If we have a consolidated VM without any snapshot, like the one in our example, the VM will be running on top of the base .vmdk disks shown above. But if we take a snapshot and run the same command again, we'll see the output has changed.

/vmfs/volumes/datastore1/WXPMK #cat WXPMK.vmx | grep .vmdk
ide0:0.fileName = "WXPMK-000001.vmdk"
scsi0:0.fileName = "WXPMK_1-000001.vmdk"


Now the VM is running on top of disks that have the string "-000001" appended to their names. These are the snapshot .vmdk files, where every write is temporarily stored while the snapshot exists. If we now delete the snapshot and run the same command again, we'll see that everything has returned to normal, as in the output prior to taking the snapshot. What has occurred is that, by deleting the snapshot, we have merged the data temporarily stored there into the base disks and returned the VM to its original state.
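A quick way to tell whether a VM is currently running on snapshot disks is to look for that six-digit suffix in the .vmx file. The sketch below assumes a POSIX shell with grep -E; the function name vmx_snapshot_disks is hypothetical. It is a heuristic based purely on file names, so a base disk whose own name happens to end in a dash and six digits would be reported too.

```shell
#!/bin/sh
# vmx_snapshot_disks: hypothetical helper, not part of ESXi or XSIBackup.
# Prints the disk lines of a .vmx file that point at snapshot disks, i.e.
# names ending in "-NNNNNN.vmdk" (six digits), such as WXPMK-000001.vmdk.
# Exit status is 0 when at least one such line is found, non-zero otherwise.
vmx_snapshot_disks() {
    grep -E '\.fileName *= *".*-[0-9]{6}\.vmdk"' "$1"
}
```

For example, vmx_snapshot_disks WXPMK.vmx >/dev/null && echo "running on snapshots" prints the message only when the name pattern is found.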

This is the very basic functioning of the .vmx file with regard to .vmdk file management. If you understand this, you can rebuild any VM from its base .vmdk disks. But wait, there is another important file that we mentioned before...

The .VMSD File:

This file holds the information related to the snapshots present in the VM. When you take a snapshot, the file is modified and the newly created snapshot files are referenced here. If you take more than one snapshot, part of the logic of the snapshot chain is held here too. We won't say much about this file for now, except that the VM does not need it unless it has active snapshots. If you delete all snapshots, only a seed file with some redundant information remains. In fact, if you delete it after deleting all snapshots, the VM will work anyway and the .vmsd file will be regenerated when the VM needs it.

The .VMDK Files:

As we stated before, the .vmdk files are those that store the blocks of data that constitute our VM hard disks, but there are two kinds of .vmdk files. If you list a VM directory you will see there are two .vmdk files per hard disk: 1) the descriptor file, which is a very small text file (typically a few hundred bytes) holding only a few lines, and 2) the -flat.vmdk file, which holds the real data blocks. Even if you lose the descriptor file, as long as you keep the -flat.vmdk file, you can recover your data by regenerating a descriptor file for it.
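When a descriptor has to be regenerated, the one value that must match the data file exactly is the extent size, which the descriptor expresses in 512-byte sectors. The following is a minimal sketch of that arithmetic only, assuming a POSIX shell with ls and awk; flat_extent_line is a hypothetical name, and the RW ... VMFS line layout reflects the usual form of a VMFS descriptor extent entry and should be compared against a known-good descriptor on your own host before use. It does not write a descriptor file.

```shell
#!/bin/sh
# flat_extent_line: hypothetical helper, not part of ESXi or XSIBackup.
# Computes the extent line a descriptor would need for a given -flat.vmdk:
#   RW <size in 512-byte sectors> VMFS "<flat file name>"
# The file size is read from the directory entry (ls), so the data file
# itself is never read, which matters for very large disks.
flat_extent_line() {
    flat=$1
    size=$(ls -ln "$flat" | awk '{print $5}')
    # A flat disk should be a whole number of sectors; refuse anything else.
    if [ $(( size % 512 )) -ne 0 ]; then
        echo "size of $flat is not a multiple of 512 bytes" >&2
        return 1
    fi
    echo "RW $(( size / 512 )) VMFS \"$(basename "$flat")\""
}
```

As a worked example of the arithmetic, a 4 GiB flat file is 4294967296 bytes, which divided by 512 gives 8388608 sectors.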

TROUBLESHOOTING:

Now that we have these basic hints on how the .vmdk files relate to the VM layout, let's see how we can take advantage of them to fix some common problems with VMware ESXi VMs.

As the VM process does in fact modify the contents of the .vmx and .vmsd files while the VM is running, there are times when the information these files contain becomes obsolete, redundant, or simply points to the wrong path. There is a vast number of situations in which this might occur: power outages, disk space constraints, moving disks, running out of memory, etc. We won't delve into the details much now; in fact, we don't care why ESXi sometimes fails to do certain things. It is a very reliable hypervisor and fails, most of the time, only when we push it to its limits.

So whenever XSIBackup is not able to back up a VM, the first thing we should do is make a manual backup by copying the VM folder somewhere else, then remove all snapshots and consolidate, then try again. If the problem persists, it is most probably due to some incoherent information held in the config files. Turn the VM off and run the cat ... | grep .vmdk command shown above to find out which disks appear in the .vmx file and where the VM thinks they are located. If the paths are O.K. and point to the base disks, you can safely delete every file except the .vmdk and .vmx files. This will leave the VM folder with the minimum set of files needed to run the VM. Start the VM and open its settings view to check that all disks are present and that they match the .vmdk entries listed from the .vmx file in the SSH console. Now run XSIBackup again and everything should work as expected.
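Before deleting anything, it helps to see exactly which files would go. The sketch below is a dry run only, assuming a POSIX shell; list_removable_files is a hypothetical name. It prints every regular file in a VM folder that is neither a .vmdk nor a .vmx file and deletes nothing, so the list can be reviewed once the manual backup copy has been made.

```shell
#!/bin/sh
# list_removable_files: hypothetical helper, not part of ESXi or XSIBackup.
# Dry run: prints the regular files in a VM folder that are neither .vmdk
# nor .vmx files. Nothing is deleted; review the list before acting on it.
list_removable_files() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue          # skip subfolders and empty globs
        case $f in
            *.vmdk|*.vmx) ;;             # keep disks (descriptor, -flat, -delta) and the .vmx
            *) echo "$f" ;;
        esac
    done
}
```

Note that the *.vmx pattern matches only the main configuration file, so companions such as the .vmxf, .vmsd, .nvram and log files all appear in the list.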

Before doing this, you must evaluate the situation and find out whether you can discard the information in the snapshot files. This is a general guide to solving VM configuration problems using a simple approach.

Depending on the size of the disks, the presence of snapshots, the data stored in the snapshots, the existence of previous backups, etc., you might not be able to discard files freely. If this is your case, you will need to take a closer look at the matter and delve into specific questions like: is the .vmsd file information accurate? Is the base hardware healthy? Is the chain of snapshots consistent? Answering these questions and fixing the problem might require somebody who fully understands the contents and significance of every piece of information in these files.

XSIBackup only copies data from its original location to a new path in a datastore, or remotely over IP. It does not modify in any way the contents of the files being copied, and this includes every file in the VM folders.

Daniel J. García Fidalgo
33HOPS