Last week I ran into an issue where one of my ESXi hosts returned an error while creating a virtual machine. I checked the Tasks and Events of the ESXi host and found the following error logged during virtual machine creation: "A general system error occurred: Failed to open "/var/log/vmware/journal/ for write: There is no space". Digging further, I identified the ESXi host that was generating this error from the Tasks and Events section in vCenter.
I tried to SSH to the ESXi host in question, but it was inaccessible via PuTTY, even though the host itself was up and running fine. The last option left was to access the host from its management console, which is iLO, since ESXi is installed on an HP server.
You can also try SSH from another ESXi host or a Linux machine using the command ssh -T servername. This does not give you a prompt, but you can still type commands and get their output. That did not work for us at the time, so we used iLO for further troubleshooting.
I opened a shell on the ESXi host from iLO by pressing ALT+F1 to switch to the command prompt.
Since the error initially pointed to a space issue, I checked the disk space with the df command.
========================================================================
root# df /var/log/vmware/journal
Filesystem            1K-blocks   Used  Available Use% Mounted on
/dev/mapper/db_vg-db   25663804 471036   23882460   2% /storage/db
========================================================================
From the above output, it seems there is sufficient space on the partition where the journal writes its transactions. So where is the issue?
I decided to check the logs, as they are a gold mine for finding the issue. While going through vmkwarning.log I found multiple errors like the one below:
"Cannot create file /var/run/vmware/tickets... ...for process hostd-worker because the inode table of its ramdisk (root) is full"
So, from the logs it is clear that the issue is not caused by a lack of disk space; rather, it is an issue with inodes.
Now, what is an inode?
An inode is an entry in the inode table that contains information (the metadata) about a regular file or directory. It is a data structure on a traditional Unix-style file system such as ext3 or ext4.
Each inode is identified by an inode number, also called an index number, and holds the following attributes:
File type (executable, block special, etc.)
Permissions (read, write, etc.)
UID (owner)
GID (group)
File size
Time stamps, including last access, last modification and last inode change
File deletion time
Number of hard links
Location of the file on the hard disk
Some other metadata about the file
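To make the idea of an inode concrete, here is a minimal sketch that prints inode numbers with ls -i. The helper name inode_of and the temporary file names are my own illustrations, not part of any VMware tooling. It shows that a hard link is just a second name for the same inode, while a copy gets a new one.

```shell
#!/bin/sh
# Hypothetical helper: print the inode number of a path using `ls -i`.
inode_of() {
    ls -di "$1" | awk '{print $1}'
}

# Demo in a throwaway directory (all paths here are illustrative only).
demo_dir=$(mktemp -d)
echo "hello" > "$demo_dir/original"
ln "$demo_dir/original" "$demo_dir/hardlink"   # second name, same inode
echo "hello" > "$demo_dir/copy"                # separate file, new inode

echo "original: $(inode_of "$demo_dir/original")"
echo "hardlink: $(inode_of "$demo_dir/hardlink")"
echo "copy:     $(inode_of "$demo_dir/copy")"
```

Every file and directory consumes one inode regardless of its size, which is why a file system can run out of inodes while still showing plenty of free blocks.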
Now we need to identify how many inodes the file system is using and how many are currently available.
To check this, run the command stat -f /
You can also run the command localcli system visorfs ramdisk list to see the inodes available in the root ramdisk.
Output of stat -f /:
ID: 5 Namelen: 973 Type: visorfs
Block size: 4096
Blocks: Total: 449852 Free: 445356 Available: 445356
Inodes: Total: 8900 Free: 0
After running the command I found that all 8900 inodes were in use and 0 were available.
From this it is crystal clear that the issue is inode exhaustion.
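The arithmetic on the Inodes line can be sketched as a small filter. This is only an illustration: the function name inode_usage is hypothetical, and it assumes the line is shaped like "Inodes: Total: N Free: M", as in the stat -f output in this post.

```shell
#!/bin/sh
# Hypothetical helper: read `stat -f` style output on stdin and report inode usage.
# Assumes a line shaped like "Inodes: Total: 8900 Free: 0".
inode_usage() {
    awk '/^Inodes:/ {
        total = $3; free = $5
        used = total - free
        pct = (total > 0) ? int(used * 100 / total) : 0
        printf "total=%d used=%d free=%d used_pct=%d\n", total, used, free, pct
    }'
}

# Example with the line captured in this post; on a live host it would be
#   stat -f / | inode_usage
echo "Inodes: Total: 8900       Free: 0" | inode_usage
# prints: total=8900 used=8900 free=0 used_pct=100
```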
Now it is time to identify which files are using the most inodes and how to clear them to free inodes up.
Starting from /var, a listing piped to wc -l showed that it was using more than half of the inodes. To dig deeper, I needed to find which directory under /var was using the most inodes. Under /var/run I identified that the sfcb directory, "/var/run/sfcb", was the one using a high number of inodes.
sfcb (Small Footprint CIM Broker) is the hardware monitoring service. It is not critical, but it is important for knowing the hardware status.
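The search for the directory holding the most inodes can be sketched as a per-directory count. This is a rough approximation under my own assumptions: the function name count_inodes_by_dir is hypothetical, it counts directory entries with find (so a file with several hard links is counted once per name), and it assumes find, wc, tr and sort are available in the shell you are using.

```shell
#!/bin/sh
# Hypothetical helper: for each immediate subdirectory of $1, print
# "<entry count> <path>", largest first. Each entry roughly equals one inode.
count_inodes_by_dir() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        # The count includes the subdirectory itself plus everything below it.
        n=$(find "$d" | wc -l | tr -d ' ')
        echo "$n ${d%/}"
    done | sort -rn
}

# On the affected host this would be run as, for example:
#   count_inodes_by_dir /var
#   count_inodes_by_dir /var/run
```

Running it first on /var and then on the largest result narrows the search one level at a time, which mirrors the manual steps described above.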
To free the inodes used by the sfcb directory I followed the VMware KB:
Run this command to stop the sfcbd service:
/etc/init.d/sfcbd-watchdog stop
Manually delete the files in the /var/run/sfcb directory to free inodes.
Run these commands to remove the files:
cd /var/run/sfcb
rm [0-2]*
rm [3-6]*
rm [7-9]*
rm [a-c]*
rm [d-f]*
Run this command to restart the sfcbd service:
/etc/init.d/sfcbd-watchdog start
Now restart the ESXi host management agents.
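The rm commands above are split by filename prefix, presumably so that no single rm receives an overly long argument list. As an alternative sketch (my own, not from the VMware KB), a shell loop can delete the files one at a time. The function name purge_files is hypothetical, and it should only be run after the sfcbd service has been stopped.

```shell
#!/bin/sh
# Hypothetical helper: delete regular files directly inside directory $1,
# one at a time, so no single rm call gets a huge argument list.
purge_files() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    removed=0
    for f in "$dir"/*; do
        # Skip subdirectories, and the unexpanded pattern if the directory is empty.
        [ -f "$f" ] || continue
        rm -f "$f" && removed=$((removed + 1))
    done
    echo "removed $removed files from $dir"
}

# Assumed usage on the affected host, after stopping sfcbd-watchdog:
#   purge_files /var/run/sfcb
```

Like the rm commands from the KB, this leaves subdirectories and hidden files untouched.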
After deleting the files from the sfcb directory I checked "/var/run/sfcb | wc -l" again, and now it has only a handful of inodes associated with it.
We then retried the VM creation, and the VM was created without any error.
Thanks for reading the post .. Happy sharing... :)