
Managing a VMware vCenter Server running as a VM
Just wanted to share a couple of pointers which came up during a vSphere design review process for a customer.


During these discussions, there were arguments that tracking the vCenter virtual machine in a big environment, and getting to it for troubleshooting when the vCenter Server service or the VM is down, can be time consuming.

Therefore, some organizations prefer a physical vCenter Server, to have more control and a single, well-known point to look at and troubleshoot in case of issues. I would say this has more to do with the comfort and mindset of the admin: the application managing the virtual environment is itself not virtual, and stays isolated from the virtual infrastructure.

I would not say that these points are invalid, since no one wants to hunt for their vCenter VM during a vCenter outage. If you have not planned the initial placement of the vCenter VM, you might end up logging on to each ESXi server directly via the vSphere Client and searching for your vCenter VM. This can be a cumbersome and time-consuming process, and it can affect services such as VMware View or vCloud Director for a longer duration during vCenter downtime, given that you do not use vCenter Server Heartbeat in your infrastructure.
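
If you do end up in that situation, the search can at least be scripted instead of clicked through host by host. Here is a minimal sketch using pyVmomi (the vSphere API Python bindings) that connects to each ESXi host directly and looks for the vCenter VM by name; the host list, credentials and VM name below are placeholders for your own environment:

    # Sketch: find the vCenter VM by querying each ESXi host directly,
    # useful when vCenter itself is down. Hostnames, credentials and the
    # VM name are assumptions - replace them with your own values.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ESXI_HOSTS = ["esx01.lab.local", "esx02.lab.local"]  # hypothetical hosts
    VCENTER_VM = "vcenter01"                             # hypothetical VM name

    ctx = ssl._create_unverified_context()  # lab only; verify certs in production

    for hostname in ESXI_HOSTS:
        si = SmartConnect(host=hostname, user="root", pwd="password", sslContext=ctx)
        try:
            content = si.RetrieveContent()
            view = content.viewManager.CreateContainerView(
                content.rootFolder, [vim.VirtualMachine], True)
            for vm in view.view:
                if vm.name == VCENTER_VM:
                    print("%s found on %s, power state: %s"
                          % (VCENTER_VM, hostname, vm.runtime.powerState))
            view.DestroyView()
        finally:
            Disconnect(si)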

There are a couple of things which every vSphere Design with a Virtual vCenter should consider:-

a) Separate Management Cluster - In slightly bigger setups, where you might end up with multiple clusters of ESXi servers and a number of management virtual machines (storage management appliances, vCloud Director cells, SRM servers and so on), you should have a separate management cluster of 2 to 3 ESXi servers (sized as per your requirements). This is where you place the vCenter Server as well: isolated from your production environment, and easy to track and troubleshoot in case the vCenter Server fails due to any issue.

b) DRS rules for vCenter VM - You may or may not have the liberty of creating a separate management cluster. However, it is absolutely recommended to use DRS rules to control the placement of your vCenter Virtual Machine.

You should use a DRS rule of the type "Virtual Machines to Hosts" to place the vCenter VM on the FIRST host of the FIRST cluster in your vCenter. This ensures that your vCenter Server is always running on the same ESXi server, and only if that ESXi server fails will vSphere HA restart the VM on another host in the cluster. This method leaves you with only one ESXi server to look at when your vCenter Server is acting up, so you can trace the VM easily.

This is how you can achieve this:-

1- Right-click the first cluster in your vCenter Server inventory (assuming the vCenter VM is part of this cluster).
2- Click Edit Settings.
3- Under DRS, click Rules > Add.
4- Click the DRS Groups Manager tab.
5- Click Add under Host DRS Groups to create a new Host DRS Group containing the first host of the cluster.
6- Click Add under Virtual Machine DRS Groups to create a Virtual Machine DRS Group containing the vCenter VM.
7- Click the Rule tab and, from the Type drop-down menu, select Virtual Machines to Hosts.
8- Select the Virtual Machine DRS Group and the Host DRS Group which you created in the previous steps, and you are done.
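
If you prefer to script this instead of clicking through the vSphere Client, the same rule can be created through the API. The following pyVmomi sketch builds the two DRS groups and the "should run on hosts in group" rule in a single reconfigure call; it assumes an existing vCenter connection `si` (as in the earlier snippet), and the cluster, host and VM names are placeholders:

    # Sketch: create the VM group, host group and "Virtual Machines to Hosts"
    # rule via pyVmomi. Object names below are placeholders.
    from pyVmomi import vim

    def find_obj(content, vimtype, name):
        """Return the first managed object of the given type with this name."""
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vimtype], True)
        try:
            return next((o for o in view.view if o.name == name), None)
        finally:
            view.DestroyView()

    content = si.RetrieveContent()
    cluster = find_obj(content, vim.ClusterComputeResource, "Cluster01")
    first_host = find_obj(content, vim.HostSystem, "esx01.lab.local")
    vc_vm = find_obj(content, vim.VirtualMachine, "vcenter01")

    # One-member VM group and host group.
    vm_group = vim.cluster.GroupSpec(
        info=vim.cluster.VmGroup(name="vCenter-VM-Group", vm=[vc_vm]),
        operation="add")
    host_group = vim.cluster.GroupSpec(
        info=vim.cluster.HostGroup(name="First-Host-Group", host=[first_host]),
        operation="add")

    # mandatory=False gives the "should run on hosts in group" behaviour,
    # so vSphere HA can still restart the VM elsewhere if the host fails.
    rule = vim.cluster.RuleSpec(
        info=vim.cluster.VmHostRuleInfo(
            name="vcenter01 on esx01",
            enabled=True,
            mandatory=False,
            vmGroupName="vCenter-VM-Group",
            affineHostGroupName="First-Host-Group"),
        operation="add")

    spec = vim.cluster.ConfigSpecEx(groupSpec=[vm_group, host_group],
                                    rulesSpec=[rule])
    cluster.ReconfigureComputeResource_Task(spec, modify=True)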

After saving this setting, the vCenter VM will automatically migrate to the host which you selected using vMotion and will stay there, making it easy and simple for you to locate in case of vCenter downtime.

Just to recap, here are the settings:-

DRS Groups Manager                Specification
Virtual Machine Group Name        <vCenter VM Name>
Virtual Machine Group Member      <vCenter VM>
Host DRS Group Name               <First ESXi Hostname in Cluster>
Host DRS Group Member             <First ESXi Host in Cluster>

Rules                             Specification
Name                              <vCenter VM Name> on <ESXi Hostname>
Type                              Virtual Machines to Hosts
Cluster VM Group                  <Virtual Machine Group Name>
Rule                              Should run on hosts in group
Cluster Host Group                <Host DRS Group Name>



Last but not least, you need to ensure that the VM restart priority for the vCenter Server VM is set to the highest priority, so that after an HA event the vCenter VM is back up as soon as possible.
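
For reference, this setting can also be applied through the API. Here is a minimal pyVmomi sketch, reusing `si`, `cluster` and `vc_vm` from the snippets above (all placeholders for your environment):

    # Sketch: give the vCenter VM a high HA restart priority.
    from pyVmomi import vim

    das_spec = vim.cluster.DasVmConfigSpec(
        operation="add",  # use "edit" if a per-VM override already exists
        info=vim.cluster.DasVmConfigInfo(
            key=vc_vm,
            dasSettings=vim.cluster.DasVmSettings(restartPriority="high")))

    spec = vim.cluster.ConfigSpecEx(dasVmConfigSpec=[das_spec])
    cluster.ReconfigureComputeResource_Task(spec, modify=True)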

Duncan and Frank mentioned a valid point in their book:-
   "Although HA is configured by vCenter and exchanges virtual machine state information with HA, vCenter is not involved when HA responds to failure. It is comforting to know that in case of a host failure containing the virtualized vCenter Server, HA takes care of the failure and restarts the vCenter Server on another host, including all other configured virtual machines from that failed host.
There is a corner case scenario with regards to vCenter failure: if the ESXi hosts are so called “stateless hosts” and Distributed vSwitches are used for the management network, virtual machine restarts will not be attempted until vCenter is restarted. For stateless environments, vCenter and Auto Deploy availability is key as the ESXi hosts literally depend on them."

Hence, it is important to ensure that vCenter comes back up with high priority after an HA event; this gets the management network going where a Distributed vSwitch is used, and allows Auto Deploy to function. That said, with vSphere 5.1 you do have the option (stateless caching) to boot the ESXi server from a backup copy of the ESXi image saved on a local drive, if one is available on the server.
