We sometimes hit an issue where we were unable to perform a vMotion, or where logging could not write under /var/log. While doing something routine, such as a vMotion, I noticed an error that stated only “A general system error occurred.” On further investigation, I found that the underlying message, seen when attempting a Storage vMotion, was an out-of-disk-space error.
Observations during the issue
During vMotion: “A general system error occurred:”
During Storage vMotion: “/var/log/vmware/journal/xxxx error writing file. There is no space left on the device.”
Steps during troubleshooting
- Go to the Configuration tab for the host in the vCenter client, go to Security Profile, and click the Properties link in the Services section.
- Scroll down to SSH and highlight it, click Options, then click Start to start the SSH service.
- Use PuTTY or Reflection to SSH to the host.
- If the connection is rejected, the root filesystem ramdisk is probably full.
- Go to the console (either through KVM or, for blades, the OA).
- Press F2 and log in, arrow down to Troubleshooting Options, and select Enable ESXi Shell.
- Press ALT-F1 to switch to the management shell and log in (same root credentials).
- Run ‘vdf -h’ and look for the root filesystem. It should look like:
Ramdisk   Size  Used  Available  Use%  Mounted on
root       32M    3M        28M   10%  --
- If it shows 0M available and 100% used, that is the problem. Try to clear up space:
cd /var/log/
ls -la
- Check the size of the hpHelper.log file; it is likely quite large. If so, reset the file:
> hpHelper.log
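The ramdisk check in the steps above can be sketched as a small shell helper. This is only an illustration: full_ramdisks is a hypothetical name, the 90% threshold is an assumption, and the sample rows simply follow the column layout of the ‘vdf -h’ ramdisk listing quoted in this post.

```shell
#!/bin/sh
# Sketch: flag ramdisks at or above a usage threshold.
# full_ramdisks is a hypothetical helper; it reads ramdisk rows on stdin
# in the layout "name size used available use% mounted-on".
full_ramdisks() {
    threshold="${1:-90}"    # assumed default threshold, in percent
    awk -v limit="$threshold" '
        # Only rows whose 5th column is a percentage are data rows;
        # the header row ("Use%") does not match and is skipped.
        $5 ~ /^[0-9]+%$/ {
            pct = $5
            sub(/%/, "", pct)
            if (pct + 0 >= limit) print $1, $5
        }
    '
}

# On a host this could be fed from the real command: vdf -h | full_ramdisks 90
# Demo with sample rows (made-up values for a full root ramdisk):
printf '%s\n' \
    'Ramdisk Size Used Available Use% Mounted on' \
    'root 32M 32M 0M 100% --' \
    'tmp 256M 4M 252M 1% --' | full_ramdisks 90
# prints: root 100%
```

Anything the helper prints is a ramdisk worth inspecting before attempting another vMotion.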
After clearing hpHelper.log, we were able to bring the host back to a normal state, and all tasks completed normally. The issue was traced to the HP agents included in the custom ESXi image provided by HP. It seems that in some circumstances the hpHelper.log file can become very large, filling the ramdisk and causing these issues. It's a first for me, and I have not observed the issue on any of my other ESXi hosts running on ProLiant rack-mount or blade servers.
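Since the log can grow large again, the manual reset could be wrapped in a small helper. The following is a minimal sketch, not something HP or VMware provides: truncate_large_logs is a hypothetical name, the directory and byte limit are assumptions you would pick yourself, and the demo runs in a scratch directory so nothing real is touched.

```shell
#!/bin/sh
# Sketch: empty any *.log file in a directory that exceeds a byte limit.
# truncate_large_logs is a hypothetical helper; directory and limit are assumptions.
truncate_large_logs() {
    dir="$1"
    limit_bytes="$2"
    for f in "$dir"/*.log; do
        [ -f "$f" ] || continue                # skip when the glob matches nothing
        size=$(($(wc -c < "$f")))              # arithmetic expansion strips padding
        if [ "$size" -gt "$limit_bytes" ]; then
            echo "truncating $f ($size bytes)"
            : > "$f"                           # same effect as '> hpHelper.log'
        fi
    done
}

# Demo in a scratch directory:
demo=$(mktemp -d)
printf '%2048s' '' > "$demo/hpHelper.log"      # 2048-byte stand-in for a large log
echo "small" > "$demo/other.log"
truncate_large_logs "$demo" 1024
ls -l "$demo"
rm -rf "$demo"
```

On a host, the call would look something like truncate_large_logs /var/log 1048576, with the limit chosen to suit the size of the root ramdisk.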
Happy Sharing... :)