Posts Tagged ‘VMware’

Unable to Remove Host From vCenter

Posted on the April 3rd, 2010 under VMware by

I opened a VMware Service Request yesterday because I ran into a problem that I couldn’t figure out.  In one of our environments, we have a 15-host DRS/HA cluster that we are migrating from ESX 4 to ESXi 4.  The process involves removing the existing ESX host from vCenter before rebuilding the host with ESXi.  On two of the 15 hosts, the “Remove” option was greyed out.  I first suspected that a permissions issue was preventing the hosts from being removed.  That didn’t make sense, though, because other hosts in the same cluster could be removed and the permissions were identical on the two hosts in question.

I tried disconnecting the hosts from vCenter and then removing them–but no success there.  I tried re-connecting the hosts to vCenter–still no success.  I tried moving the hosts out of the cluster and placing them under the root vCenter object (i.e. the vCenter name)–again no success.  I didn’t want to edit the backend database directly without contacting VMware first, so I opened a Service Request.  After two days, VMware dug up an existing case that had recently been opened by another customer experiencing the same issue.  The only way to cleanly remove the hosts is to create a new temporary cluster, place the hosts in question in this new cluster, and then delete the cluster.  Deleting the cluster also removes all the hosts underneath the cluster object.
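For anyone who would rather script the workaround than click through it, here is a rough Python sketch of the three steps (create a temporary cluster, move the stuck hosts into it, delete the cluster).  It is not what VMware support ran.  It assumes pyVmomi-style managed objects whose methods follow the vSphere API names CreateClusterEx, MoveInto_Task and Destroy_Task.  The function name, the temporary cluster name, and the cluster_spec and wait_for_task arguments are placeholders of mine that you would supply from your own session code.  It also assumes the hosts are already in a state that allows the move; the vSphere API generally expects a host to be in maintenance mode before it changes clusters.

```python
def remove_hosts_via_temp_cluster(host_folder, hosts, cluster_spec,
                                  wait_for_task, name="temp-remove-hosts"):
    """Remove hosts by parking them in a throwaway cluster and deleting it.

    Assumptions (not from the original post):
      host_folder   -- the datacenter's host folder (pyVmomi-style Folder)
      hosts         -- the HostSystem objects whose "Remove" is greyed out
      cluster_spec  -- an empty cluster config spec built by the caller
      wait_for_task -- caller-supplied function that blocks until a task ends
    """
    # Step 1: create a new, empty temporary cluster.
    temp_cluster = host_folder.CreateClusterEx(name=name, spec=cluster_spec)

    # Step 2: move the problem hosts into the temporary cluster.
    wait_for_task(temp_cluster.MoveInto_Task(host=list(hosts)))

    # Step 3: delete the cluster, which also removes the hosts beneath it.
    wait_for_task(temp_cluster.Destroy_Task())
    return temp_cluster
```

Passing wait_for_task in as an argument keeps the sketch free of any connection handling, so the ordering of the three steps can be checked against stand-in objects before it is pointed at a real vCenter.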

VMware does not yet know what is causing this to occur or how widespread the issue is.  They suspect it has something to do with Update Manager and some recent updates.  The condition appears to be rare, but if you run into it, hopefully this article will help you resolve the problem.

ESX / ESXi Slow Boot – UWConflictRetries

Posted on the April 2nd, 2010 under VMware by

Are some of your ESX / ESXi hosts taking a long time to boot?  On some of our ESXi 4.0 Update 1 hosts we are seeing boot times of over 10 minutes.  Watching the boot process, we can see that the issue is related to storage; the ESXi DCUI appears to hang at “Loading module multiextent”.

Here is why:

During the boot process, an ESX(i) host rescans its accessible LUNs.  If you are using MSCS clusters with Raw Device Mappings (RDMs), you will likely experience a lengthy delay during the scanning process at boot time.  The issue is caused by a timeout condition during the rescan operation on RDM LUNs.  In my experience, you will see the slow boot issue on all hosts that are zoned to see MSCS RDM LUN(s).  For example, if you have a DRS/HA cluster with 15 hosts and are using MSCS with RDMs, you will see slow boot times on every host in that cluster.  This occurs because all hosts in a DRS/HA cluster should have access to the same shared storage, including the RDM LUNs.

You can mitigate this issue (not resolve it) by implementing a parameter change recommended in VMware internal KB1016106.  The Scsi.UWConflictRetries parameter for ESX(i) 4 Update 1 hosts has a default value of 1000.  When MSCS RDM LUNs are present, this high retry count increases the time spent enumerating LUNs and VMFS volumes.

To mitigate the issue and speed up the boot process, lower the value to 80 by following the steps below:

In the vSphere Client, select the host and click Configuration -> Advanced Settings.

In Advanced Settings, select SCSI.

Change the Scsi.UWConflictRetries value to 80 (the default is 1000).
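If you have many hosts to touch, the same change can be made from the command line with esxcfg-advcfg.  The small Python helper below is only a sketch that builds the command lines; it does not connect to anything.  It assumes esxcfg-advcfg is available where you run the result (the ESX Service Console, or ESXi in Tech Support Mode) and that the dotted name shown in the vSphere Client maps to a slash-separated path.  The helper function names are my own.

```python
def advcfg_path(option_name):
    """Turn a dotted advanced-setting name into an esxcfg-advcfg path.

    Example: 'Scsi.UWConflictRetries' becomes '/Scsi/UWConflictRetries'.
    """
    return "/" + option_name.replace(".", "/")


def build_set_command(option_name, value):
    """Return the argv list that sets an advanced option (-s = set)."""
    return ["esxcfg-advcfg", "-s", str(value), advcfg_path(option_name)]


def build_get_command(option_name):
    """Return the argv list that reads the current value (-g = get)."""
    return ["esxcfg-advcfg", "-g", advcfg_path(option_name)]


if __name__ == "__main__":
    # Show the commands for the change described in this article.
    print(" ".join(build_get_command("Scsi.UWConflictRetries")))
    print(" ".join(build_set_command("Scsi.UWConflictRetries", 80)))
```

Reading the value with the get command before and after the change is a cheap way to confirm that the new setting actually took effect on each host.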

As an alternative, you might consider creating a DRS/HA cluster dedicated to MSCS Virtual Machines and masking the RDM LUNs from all your other ESX(i) hosts that do not participate in the dedicated MSCS DRS/HA cluster.

ESX vs. ESXi Debate

Posted on the March 31st, 2010 under VMware by

It’s been a longstanding debate…which version of VMware’s bare-metal hypervisor should I choose–ESX or ESXi?  Which one is better?  Why would VMware offer two seemingly identical versions of their core hypervisor product?  There are answers to all these questions and I will address them in this article.  By the time you are finished with this article you should have a good understanding of each product and the knowledge you need to select the best option for your particular requirements.

Let’s begin by understanding VMware’s reasons for offering both ESX and ESXi.  VMware ESX and ESXi are similar in every way but one.  ESX contains a Linux operating system VMware calls the “Service Console” that is used as a management application to interact directly with the hypervisor.  ESXi does not contain a Service Console; therefore, management tasks are performed through other means–mainly using remote tools via an exposed ESXi API.  According to VMware, the ESX version is legacy and will be phased out in the future.  ESXi is the so-called “next generation” hypervisor version.  VMware is keeping ESX around because many companies have created third-party applications/agents that must be run from inside the Service Console, and also because customers are not totally sold on running enterprise workloads without access to the Service Console in an emergency.  These two issues alone have kept VMware from decommissioning the ESX version.

So what’s the difference between ESX and ESXi?  As previously mentioned, the main difference between ESX and ESXi is the lack of the Service Console in the ESXi version.  Because ESXi natively lacks many of the features provided by the Service Console, many see ESXi as a lesser version of ESX used in smaller environments.  However, in reality, ESX and ESXi are identical from a core functionality standpoint.  Below are some pros and cons of the ESXi version:

Pros

  • Smaller installation footprint with reduced installation complexity.  The manual installation of the “installable” version of ESXi takes about five minutes from bare metal to hypervisor.  The “embedded” version of ESXi is ready out of the box with no installation media required.  ESX, on the other hand, takes considerable time creating the Service Console partitions during installation.
  • ESXi is more secure than ESX.  Why?  By eliminating the Service Console, ESXi requires fewer patches and also reduces the number of open ports as compared to ESX.
  • The embedded version of ESXi requires less energy because the hypervisor runs from an internal USB memory module, which eliminates the need for expensive, energy-consuming physical hard disks.  It’s also worth mentioning that in larger environments ESXi embedded offers substantial cost savings, as expensive hard disks and RAID controllers can be eliminated from server purchases.
  • Some have mentioned boot-time is reduced when using the ESXi version because the services that are started with the Service Console are not present in ESXi.  This is hit-or-miss in my experience.  In some environments where MSCS clusters are being used with RDMs the boot time is substantial regardless of VMware hypervisor version used.
  • Many hardware vendors have created VIB files containing OEM CIM providers that make it possible to interact with vendor-specific hardware sensors remotely.  For example, one of the major concerns for Dell customers using ESXi used to be that the Dell OpenManage Server Administrator tool could not be installed inside ESXi (note: Dell did release a Dell-specific version of ESXi 3.5 with the Server Administrator embedded).  With ESXi 4, Dell has released a vSphere VIB file that allows administrators to remotely monitor Dell hardware instrumentation using the familiar Dell Server Administrator web interface.  The benefit of this model is that the Dell OpenManage Tomcat web server no longer needs to run on the ESXi server, which further reduces the attack surface.  Since the OEM CIM providers are packaged as a VIB file, future updates should be as easy as installing a typical ESXi security patch.

Cons

  • Administrators don’t want to give up the Service Console.  The Service Console is like a giant safety net an administrator can always count on.  It’s there when something goes wrong with the hypervisor unexpectedly.  In a worst-case scenario, an administrator can log on to the Service Console at the server console, interact with the hypervisor, and troubleshoot and correct most issues.  ESXi, by contrast, exposes an API so that remote tools such as the vMA, vSphere PowerCLI, and the vCLI can perform many of the management functions that are provided by the ESX Service Console.  Unfortunately, if the API services are unavailable due to an issue with the hypervisor, how do you interact with ESXi?  That’s where the ESXi Direct Console User Interface (DCUI) is necessary.  By logging onto the DCUI in “unsupported” mode, administrators have access to some of the core esxcfg-* troubleshooting commands.
  • Because of the lack of a Service Console, troubleshooting ESXi is sometimes more difficult.  For example, the vmkfstools command is not natively available in ESXi–it must be run from a remote console.  Further, ESXi has limited native logging capabilities as compared to ESX.  For long-term log archiving with ESXi, it is necessary to set up a syslog server or use the vilogger daemon in the vMA.
  • Some consider the embedded version of ESXi to contain a single point of failure (SPOF).  Because ESXi embedded is installed on some form of memory such as USB, it stands to reason that if the USB memory were to fail the ESXi hypervisor instance would also fail.  At this time there is no ESXi embedded redundancy of the kind that might normally be provided by hardware RAID.  The embedded version can however be configured to boot from SAN, which mitigates the SPOF argument.  In the unlikely event of a USB failure, VMware HA would potentially be able to start the failed Virtual Machines on alternate hardware (assuming you are licensed and using an HA cluster).
  • Some third-party software cannot be used with ESXi because of dependencies on the Service Console.  Most examples relate to backup software, where a single backup agent is installed inside the Service Console and is responsible for coordinating the backups of Virtual Machines residing on the host.  Another example is where storage administrators require the use of agents on hosts connected to enterprise storage (e.g. Navisphere Naviagent).

Based upon the information in this article, how do I choose between ESX and ESXi?  If you do not have any requirements to install third-party software that uses the ESX Service Console (e.g. backup agents, storage agents), I would strongly recommend you install the ESXi version.  The reduced attack surface, reduced security patch requirements, and reduced complexity of ESXi far outweigh issues related to the inability to directly administer an ESXi hypervisor.  In the unlikely event an ESXi server experiences an issue that cannot be resolved using the vSphere Client (the GUI management tool), there are many tools that provide Service Console-like features (e.g. vMA, PowerCLI, vCLI, DCUI).  Further, remember that VMware will be eliminating the Service Console in the future, and it would make good business sense to start building processes and procedures to support ESXi now, before you are forced to.