
RAM Disk Full Due To Inodes In ESXi Host




Last week I ran into an issue where one of my ESXi hosts was throwing an error while creating a virtual machine.

I verified the tasks and events of the ESXi host and found the below error being generated while creating the virtual machine: "A general system error occurred: Failed to open "/var/log/vmware/journal/" for write: There is no space". While digging further, I identified from the Tasks and Events section in vCenter which ESXi host was generating this error.

I tried to take SSH of the ESXi host in question, but it was inaccessible via PuTTY, even though the host itself was up and running fine. The last option left was to access the ESXi host from its management console, which is iLO, as this ESXi is installed on an HP server.

You can also try accessing SSH via another ESXi host or a Linux machine using the command ssh -T servername. This will not give you a prompt, but you can still type commands and get their output. That did not give us any luck at the time, so we used iLO for further troubleshooting.
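For reference, below is a minimal sketch of that non-interactive SSH approach (the host name is hypothetical). With -T no pseudo-terminal is allocated, so you get no prompt back, but commands you type are still executed and their output is returned.

========================================================================
# -T disables pseudo-terminal allocation, so no prompt appears
ssh -T root@esxi01.example.local
# Type commands "blindly"; their output is still printed, e.g.:
df -h
========================================================================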

I took shell access of the ESXi host from iLO by pressing ALT+F1 to move into the command prompt.
Since the error initially pointed to a space issue, I verified the disk space using the df command.

========================================================================
root# df /var/log/vmware/journal
Filesystem           1K-blocks   Used Available Use% Mounted on
/dev/mapper/db_vg-db  25663804 471036  23882460   2% /storage/db
========================================================================

From the above output it seems there is sufficient space on the disk partition where the journal writes its transactions. So where is the issue...

I decided to verify the logs, as they are a gold mine for finding such issues. While going through vmkwarning.log, I found multiple errors like the one below:
"Cannot create file /var/run/vmware/tickets... ...for process hostd-worker because the inode table of its ramdisk (root) is full"

So, from the logs it's clear that the issue does not come from disk space; rather, it is an issue with inodes.
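To confirm how frequently the warning is being logged, a quick grep against vmkwarning.log can help. A minimal sketch, assuming the default ESXi log location:

========================================================================
# Count inode-table warnings in vmkwarning.log
grep -c "inode table" /var/log/vmkwarning.log
# Show the most recent matching lines for context
grep "inode table" /var/log/vmkwarning.log | tail -5
========================================================================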

Now, what is an inode?

An inode is an entry in the inode table, containing information (the metadata) about a regular file or directory. It is a data structure used on traditional Unix-style file systems such as ext3 or ext4.

The inode number, also called the index number, is associated with the following attributes:
File type ( regular, directory, block special, etc. )
Permissions ( read, write, etc. )
UID ( owner )
GID ( group )
File size
Timestamps, including last access, last modification, and last inode change
File deletion time
Number of links ( soft/hard )
Location of the file on the hard disk
Some other metadata about the file
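To see most of these attributes for a single file, you can run stat against it. A minimal sketch (the file path is just an example; the exact output layout depends on the stat build):

========================================================================
# Prints inode number, size, links, permissions, UID/GID and timestamps
stat /var/log/vmkernel.log
========================================================================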

Now, we need to identify how many inodes are used by the file system and how many are available at present.

To identify this, we can run the command stat -f /

Also, you can run the command localcli system visorfs ramdisk list to identify the available inodes in each ramdisk, including the root.

Outcome of the stat -f / command:

========================================================================
ID: 5        Namelen: 973     Type: visorfs
Block size: 4096
Blocks: Total: 449852     Free: 445356     Available: 445356
Inodes: Total: 8900       Free: 0
========================================================================

After running the command, I found that all 8900 inodes are in use and 0 are available.

From this it is crystal clear that the issue is inode exhaustion.
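If you prefer the per-ramdisk view, the localcli command mentioned above lists every ramdisk (root, etc, tmp, and so on) together with its inode limits, which makes it easy to spot the exhausted one. A minimal sketch (exact column names can vary between ESXi builds):

========================================================================
# One line per ramdisk, including maximum and allocated inodes
localcli system visorfs ramdisk list
========================================================================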

Now, it’s time to identify which files are consuming the most inodes and how to clear them to make inodes available again.

Starting with find /var | wc -l, I found it was using more than half of the inodes. To dig deeper, I needed to find which directory under /var was using the most inodes. Under /var/run, I identified that sfcb ("/var/run/sfcb") was the directory consuming the most inodes.
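One way to narrow this down is to count the entries under each directory, since every file and directory consumes an inode. A minimal sketch using the busybox tools available in the ESXi shell:

========================================================================
# Count entries (inodes) under each top-level directory of /var
for d in /var/*; do echo "$d : $(find "$d" 2>/dev/null | wc -l)"; done
# Repeat one level deeper for the biggest consumer, e.g. /var/run
for d in /var/run/*; do echo "$d : $(find "$d" 2>/dev/null | wc -l)"; done
========================================================================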

sfcb is basically the hardware monitoring service; it is not critical to host operation, but it is important for knowing the hardware status.

To free the inodes held by the sfcb files, I followed the VMware KB:

Run this command to stop the sfcbd service:
/etc/init.d/sfcbd-watchdog stop

Manually delete the files in the /var/run/sfcb directory to free inodes.

Run these commands to remove the files:

cd /var/run/sfcb
rm [0-2]*
rm [3-6]*
rm [7-9]*
rm [a-c]*
rm [d-f]*

Run this command to restart the sfcbd service:
/etc/init.d/sfcbd-watchdog start

Now restart the ESXi host management agents.
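For reference, a minimal sketch of restarting the management agents from the ESXi shell. The services.sh variant restarts all agents at once and will briefly disconnect the host from vCenter:

========================================================================
/etc/init.d/hostd restart
/etc/init.d/vpxa restart
# or restart all management agents in one go
services.sh restart
========================================================================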

After deleting the files from the sfcb folder, I verified again with find /var/run/sfcb | wc -l, and now only a handful of inodes are associated with it.
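As a quick re-check, the same two commands used earlier confirm that the inodes were released (illustrative only; your numbers will differ):

========================================================================
find /var/run/sfcb | wc -l
stat -f /
========================================================================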

Now, we retried the VM creation and it completed without any error.

Thanks for reading the post .. Happy sharing... :)
