ESXi root password lockout / Determining the source of the last failed SSH login on ESXi

Generally, should the root account be locked out, SSH and UI/client access to the host fails. To work around this:

  • Bring up a console session to the host and enable ESXi Shell (under Troubleshooting Options)
  • On the console session, press ALT+F1
  • Log in as root with the root password
  • To unlock the root account and determine the last logon failure, type the below:
    • /sbin/pam_tally2 -r -u root

The root account should now be unlocked. Review the source IP listed in the output to identify and stop whatever is driving the failed logons (a script or third-party monitoring, for example).
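
To review the failure count and source before clearing it, pam_tally2 can also be run without the -r switch; the output below is only illustrative (your count, timestamp and IP will differ):

/sbin/pam_tally2 -u root
Login           Failures Latest failure     From
root                10    04/03/18 15:32:11  10.1.0.50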

Migrate option grayed out for VMs in the vCenter view

The migrate option is normally grayed out when there is an ongoing task (clone, backup, snapshot create/consolidate, reconfigure, etc.) running against the VM.

In certain rare cases, an orphaned DB record can also cause this. From the vCenter Server database, look at the table VPX_DISABLED_METHODS:

Select * from VPX_DISABLED_METHODS;

Result:
Select * from VPX_DISABLED_METHODS;
entity_mo_id_val | method_name | source_id_val | reason_id_val
------------------+-------------+---------------+---------------
(0 rows)

If there are no such tasks running and you find this to be an orphaned entry, the contents of the table may be cleared:

Delete from VPX_DISABLED_METHODS where entity_mo_id_val = x;
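
To narrow this down to a single VM rather than clearing the whole table, the VM's MoRef (for example vm-123, visible in the vSphere managed object browser) can be used in the where clause; the value below is purely illustrative:

Select * from VPX_DISABLED_METHODS where entity_mo_id_val = 'vm-123';
Delete from VPX_DISABLED_METHODS where entity_mo_id_val = 'vm-123';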

Replacing vmdir certificates on vCenter 6.0

vmdir is a vCenter component that listens on ports 389 (LDAP) and 636 (LDAPS).

Start by creating a new configuration file called vmdir.cfg with the below content (replace the values under v3_req and req_distinguished_name with the fields appropriate to your environment):

	[ req ]
	distinguished_name = req_distinguished_name
	encrypt_key = no
	prompt = no
	string_mask = nombstr
	req_extensions = v3_req
	[ v3_req ]
	basicConstraints = CA:false
	keyUsage = nonRepudiation, digitalSignature, keyEncipherment
	subjectAltName = DNS:psc1.domain.com, DNS:psc1, IP: x.x.x.x
	[ req_distinguished_name ]
	countryName = US
	stateOrProvinceName = State
	localityName = City
	0.organizationName = Company
	organizationalUnitName = Department
	commonName = psc1.domain.com

Using OpenSSL, create a new CSR and private key with the above configuration:

"%VMWARE_OPENSSL_BIN%" req -new -out c:\cert\vmdir.csr -newkey rsa:2048 -keyout c:\cert\vmdir.key -config c:\cert\vmdir.cfg

If the solution user certificates are signed with a CA certificate, sign the CSR with the same issuing CA; else, sign it using the VMCA with the instructions below.
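
If you are not sure which CA issued the current certificate, you can print the issuer and subject of the existing vmdir certificate with OpenSSL (the path shown is the Windows default used in the replacement steps further below; on the appliance the file lives under /usr/lib/vmware-vmdir/share/config/):

"%VMWARE_OPENSSL_BIN%" x509 -in C:\ProgramData\VMware\vCenterServer\cfg\vmdird\vmdircert.pem -noout -issuer -subject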

Signing the CSR with the VMCA certificate.

  • Copy root.cer and privatekey.pem from C:\ProgramData\VMware\vCenterServer\data\vmca
    (appliance: /var/lib/vmware/vmca/) to c:\cert\

Run the below command to sign the certificate:

"%VMWARE_OPENSSL_BIN%" x509 -req -days 3650 -in c:\cert\vmdir.csr -out c:\cert\vmdir_signed.crt -CA c:\cert\root.cer -CAkey c:\cert\privatekey.pem -extensions v3_req -CAcreateserial -extfile c:\cert\vmdir.cfg

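Optionally, verify that the signed certificate carries the subject alternative names from vmdir.cfg before replacing anything (a quick sanity check; look for the X509v3 Subject Alternative Name section in the output):

"%VMWARE_OPENSSL_BIN%" x509 -in c:\cert\vmdir_signed.crt -noout -text
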
Now we have a certificate that can be used to replace the existing vmdir certificates. To proceed with the certificate replacement, stop all vCenter services:

service-control --stop --all

Note: On Windows, you must run this from the path "C:\Program Files\VMware\vCenter Server\bin".

  • Go to the path: C:\ProgramData\VMware\vCenterServer\cfg\vmdird (appliance: /usr/lib/vmware-vmdir/share/config/)
  • Back up the original certificates vmdircert.pem and vmdirkey.pem to a temp directory
  • Rename vmdir_signed.crt to vmdircert.pem and vmdir.key to vmdirkey.pem in the above directory (example appliance commands are shown after this list)
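
On the appliance, the backup and rename steps can look like the below (a sketch; /root/cert/ is only an assumed location for the signed certificate and key generated earlier):

mkdir -p /tmp/vmdir-cert-backup
cd /usr/lib/vmware-vmdir/share/config/
cp vmdircert.pem vmdirkey.pem /tmp/vmdir-cert-backup/
cp /root/cert/vmdir_signed.crt vmdircert.pem
cp /root/cert/vmdir.key vmdirkey.pem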

Start all services

service-control --start --all

Note: If the services fail to start (most likely the inventory service), it means the wrong root certificate was used when signing the certificate. Restore the original files in the directory and restart the services to roll back to the previous configuration.

Web client service crashes with java.lang.OutOfMemoryError: PermGen space and java.lang.OutOfMemoryError

The vSphere Web Client refuses to start, with memory errors.
Log location:

Windows: C:\ProgramData\VMware\vCenterServer\logs\vsphere-client
Appliance: /var/log/vmware/vsphere-client

wrapper.log:
	INFO | jvm 1 | 2018/04/03 15:34:25 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "org.springframework.scheduling.timer.TimerFactoryBean#0"
	INFO | jvm 1 | 2018/04/03 15:34:33 |
	INFO | jvm 1 | 2018/04/03 15:34:33 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "http-bio-9443-exec-10"
	INFO | jvm 1 | 2018/04/03 15:35:12 |
	INFO | jvm 1 | 2018/04/03 15:35:12 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "http-bio-9443-exec-6"
	vsphere_client_virgo.log
	[2018-04-03T15:28:45.855-04:00] [ERROR] http-bio-9090-exec-2 com.vmware.vise.util.concurrent.WorkerThread http-bio-9090-exec-2 terminated with exception: java.lang.OutOfMemoryError: PermGen space
	[2018-04-03T15:28:46.773-04:00] [ERROR] http-bio-9090-exec-5 com.vmware.vise.util.concurrent.WorkerThread http-bio-9090-exec-5 terminated with exception: java.lang.OutOfMemoryError: PermGen space

Cause: insufficient web client heap size, insufficient PermGen space, or a configuration change.

Scenario 1:  Heap-size

Resolution:

  • Ensure there is sufficient free memory on the vCenter
free -m
  • Review and increase (double) the heap size of the web client.

Appliance:

cloudvm-ram-size -l vsphere-client

Windows:

"C:\Program Files\VMware\vCenter Server\visl-integration\usr\sbin\cloudvm-ram-size.bat" -l

To increase the allocation, use the below:

cloudvm-ram-size.bat -C XXX
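
On the appliance, the same utility accepts -C to set a new allocation in MB; a minimal sketch (1024 is only an example value, size it to your environment):

cloudvm-ram-size -C 1024 vsphere-client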


  • Start the vsphere-client service and observe whether it still crashes.

Scenario 2: PermGen

  • Take a copy of the file service-layout.mfx as service-layout.mfx.bak
    Appliance path: /etc/vmware/
    Windows path: C:\ProgramData\VMware\vCenterServer\cfg\
  • Edit service-layout.mfx with a text editor
  • Change the MaxPermMB value from 256 to 512 for the vspherewebclientsvc row (increase accordingly depending on the number of plugins registered with the web client); the row can be located as shown below
  • Start the vsphere-client service
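
To quickly locate the row that needs to be edited on the appliance (a simple grep against the path listed above):

grep -i vspherewebclientsvc /etc/vmware/service-layout.mfx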

Scenario 3: The problem persists even after increasing/maxing out the values from scenarios 1 and 2.

  • Back up the configuration file before you proceed:
  cp /usr/lib/vmware-vsphere-client/server/wrapper/bin/vsphere-client /usr/lib/vmware-vsphere-client/server/wrapper/bin/vsphereclient.bak
  • Edit the file using a text editor:
  vi /usr/lib/vmware-vsphere-client/server/wrapper/bin/vsphere-client
  • Look for the line "RUN_AS_USER=vsphere-client" and comment it out by prefixing it with # (an equivalent one-liner is shown below)
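
An equivalent non-interactive edit (a sketch using sed; it only prefixes the line with a #):

sed -i 's/^RUN_AS_USER=vsphere-client/#RUN_AS_USER=vsphere-client/' /usr/lib/vmware-vsphere-client/server/wrapper/bin/vsphere-client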

Start the vsphere-client service.

vCenter Pre-upgrade fails

Error: Internal error occurred during execution of upgrade process.

Resolution: Send upgrade log files to the VMware technical support team for further assistance.

Upgrade logs say:

	less /var/log/vmware/upgrade/bootstrap.log
	2018-03-23T20:14:34.11Z ERROR transport.guestops Invalid command: "/bin/bash" --login -c '/opt/vmware/share/vami/vami_get_network eth0 1>/tmp/vmware-root/exec-vmware47-
	stdout 2>/tmp/vmware-root/exec-vmware235-stderr'
	None
	2018-03-23T20:14:34.12Z ERROR upgrade_commands Unable to execute pre-upgrade checks on host 10.1.0.209
	Traceback (most recent call last):
	File "/usr/lib/vmware/cis_upgrade_runner/bootstrap_scripts/upgrade_commands.py", line 2199, in execute
	preupgradeResult = self._executePreupgradeChecks()
	File "/usr/lib/vmware/cis_upgrade_runner/bootstrap_scripts/upgrade_commands.py", line 2655, in _executePreupgradeChecks
	srcIpv4Address, srcIpv4SubnetMask, srcIpv6Address, srcIpv6Prefix = retrieveNetworkingConfiguration(self.opsManager)
	File "/usr/lib/vmware/cis_upgrade_runner/bootstrap_scripts/transfer_network.py", line 1309, in retrieveNetworkingConfiguration
	interface)
	File "/usr/lib/vmware/cis_upgrade_runner/bootstrap_scripts/apply_networking.py", line 188, in _retrieveNetworkIdentity
	networkConfig = vamiGetNetwork(processManager, interface)
	File "/usr/lib/vmware/cis_upgrade_runner/bootstrap_scripts/apply_networking.py", line 144, in vamiGetNetwork
	output = _execNetworkConfigCommand(processManager, [VAMI_GET_NETWORK_CMD, interface])
	File "/usr/lib/vmware/cis_upgrade_runner/bootstrap_scripts/apply_networking.py", line 66, in _execNetworkConfigCommand
	cr = transport.executeCommand(processManager, cmd)
	File "/usr/lib/vmware/cis_upgrade_runner/libs/sdk/transport/__init__.py", line 122, in executeCommand
	return processManager.pollProcess(processUid, True)
	File "/usr/lib/vmware/cis_upgrade_runner/libs/sdk/proxy.py", line 81, in __call__
	ret = self.func(*args, **kwargs)
	File "/usr/lib/vmware/cis_upgrade_runner/libs/sdk/transport/guestops.py", line 1184, in pollProcess
	self._checkInvalidCommandError(processInfo, stderr)
	File "/usr/lib/vmware/cis_upgrade_runner/libs/sdk/transport/guestops.py", line 1123, in _checkInvalidCommandError
	raise ExecutionException(error, ErrorCode.INVALID_REQUEST)
	ExecutionException: ('Invalid command: "/bin/bash" --login -c \'/opt/vmware/share/vami/vami_get_network eth0 1>/tmp/vmware-root/exec-vmware47-stdout 2>/tmp/vmware-root/
	exec-vmware235-stderr\'', 1)
	2018-03-23T20:14:39.442Z ERROR __main__ ERROR: Fatal error during upgrade REQUIREMENTS. For more details take a look at: /var/log/vmware/upgrade/requirements-upgrade-runner.log
	 

Now look at the source appliance.

	VMware VirtualCenter 6.0.0 build-3339084
	vCenter:~ # ifconfig
	eth0 Link encap:Ethernet HWaddr 00:50:56:AC:53:FD
	inet addr:x.x.x.x Bcast:x.x.x.x Mask:255.255.252.0
	UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
	RX packets:45028984 errors:0 dropped:28266 overruns:0 frame:0
	TX packets:16476384 errors:0 dropped:0 overruns:0 carrier:0
	collisions:0 txqueuelen:1000
	RX bytes:74680502042 (71220.8 Mb) TX bytes:7187692049 (6854.7 Mb)
	lo Link encap:Local Loopback
	inet addr:127.0.0.1 Mask:255.0.0.0
	inet6 addr: ::1/128 Scope:Host
	UP LOOPBACK RUNNING MTU:16436 Metric:1
	RX packets:147809637 errors:0 dropped:0 overruns:0 frame:0
	TX packets:147809637 errors:0 dropped:0 overruns:0 carrier:0
	collisions:0 txqueuelen:0
	RX bytes:93984509789 (89630.6 Mb) TX bytes:93984509789 (89630.6 Mb)

Running /opt/vmware/share/vami/vami_get_network returns a dependency error:

vCenter:~ # /opt/vmware/share/vami/vami_get_network eth0 1 | less
	/opt/vmware/share/vami/vami_get_network: error while loading shared libraries: libvami-common.so: cannot open shared object file: No such file or directory

To resolve this, make the library directory resolvable by adding it to LD_LIBRARY_PATH in /etc/profile with the below commands.

echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/opt/vmware/lib/vami/" >> /etc/profile
echo 'export LD_LIBRARY_PATH' >> /etc/profile
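
The profile change only applies to new login shells (which is how the pre-upgrade check invokes the command); to pick it up in the current session as well, source the profile (assuming a bash shell):

source /etc/profile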

Re-run the command to confirm that it returns the IP details:

/opt/vmware/share/vami/vami_get_network
	vCenter55:~ # /opt/vmware/share/vami/vami_get_network
	interface: eth0
	config_present: true
	config_flags: STATICV4
	config_ipv4addr: 10.1.0.209
	config_netmask: 255.255.252.0
	config_broadcast: 10.1.3.255
	config_gatewayv4:
	config_ipv6addr:
	config_prefix:
	config_gatewayv6: 10
	autoipv6:
	active_ipv4addr: 10.1.0.209
	active_netmask: 255.255.252.0
	active_broadcast: 10.1.3.255
	active_ipv6addr:
	active_prefix:
	active_gatewayv4: 10.1.0.61
	active_gatewayv6:
	hasdhcpv6: 1
	Traceback (most recent call last):
	File "/opt/vmware/share/vami/vami_ovf_process", line 25, in <module>
	import libxml2
	File "/usr/lib64/python2.6/site-packages/libxml2.py", line 1, in <module>
	ImportError: No module named libxml2mod
	managed:

The vami_ovf_process/libxml2 traceback at the end can be ignored.
Re-run the upgrade/migration.

Deploy a VM using OVFtool

Syntax:

ovftool -ds="datastore" -n="VM_Name" --net:"VM_old_Network"="VM Network" c:\path_to_ovf\file.ovf vi://username:password@vCenter_server_name/?ip=ESXi_Host_IP

Example:

ovftool -ds="datastore" -n="VMName" --net:"VM Network"="VM Network" C:\Users\ntitta\Downloads\myth.ovf vi://username:password@vcenter.domain.com/?ip=10.109.10.120

Note:

"VM Network" (the left-hand side of --net) = the VM network name recorded in the OVF at the time of export.

If you are not sure of the original network, run the below to query the OVF:

C:\Program Files\VMware\VMware OVF Tool>ovftool C:\Users\ntitta\Downloads\myth.ovf
Output:
OVF version:   1.0
VirtualApp:    false
Name:          myth

Download Size:  Unknown

Deployment Sizes:
Flat disks:   16.00 GB
Sparse disks: Unknown

Networks:
Name:        VM Network
Description: The VM Network network

Virtual Machines:
Name:               myth
Operating System:   ubuntu64guest
Virtual Hardware:
Families:         vmx-13
Number of CPUs:   2
Cores per socket: 1
Memory:           1024.00 MB

Disks:
Index:          0
Instance ID:    5
Capacity:       16.00 GB
Disk Types:     SCSI-lsilogic

NICs:
Adapter Type:   VmxNet3
Connection:     VM Network

Link to download OVFtool  https://my.vmware.com/web/vmware/details?productId=614&downloadGroup=OVFTOOL420

Registering replication appliance to vCenter fails with “no element found: line 1, column 0”


Cause: Corrupt/missing ovfenv.xml

Log on to the vR appliance console session and run the below command:

ls -lth /opt/vmware/etc/vami/

If the size of ovfenv.xml is 0, power down the replication appliance and power it back on using the web client (do not perform a guest OS restart; the appliance must be powered on from the web client so that the OVF environment is pushed to it again).
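
A quick way to confirm whether the file is empty before and after the power cycle (same path as above):

ls -l /opt/vmware/etc/vami/ovfenv.xml
cat /opt/vmware/etc/vami/ovfenv.xml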

If the file still does not regenerate, you will need to re-deploy the replication appliance.

VMware Pubs:

https://docs.vmware.com/en/vSphere-Replication/6.1/com.vmware.vsphere.replication-admin.doc/GUID-0D980B0A-44B4-4644-BB26-4E100727D6BD.html

  1. If powering the vSphere Replication appliance does not resolve the issue, most certainly the appliance has been temporarily removed and re-added in the vCenter Server. There is no solution for restoring the OVF environment in that case. You must re-deploy the vSphere Replication appliance by using an empty database, and configure all replications from scratch.

SRM service crash during upgrade

Log location: c:\ProgramData\VMware\VMware vCenter Site Recovery Manager\Logs\vmware-dr.log
 
crash backtrace:
2018-01-08T23:02:56.539-07:00 [01952 verbose 'Recovery' ctxID=5c5b5384 opID=76706664] No state info for vm 'protected-vm-96860'
2018-01-08T23:02:56.539-07:00 [01952 verbose 'Recovery' ctxID=5c5b5384 opID=76706664] No state info for vm 'protected-vm-96942'
2018-01-08T23:02:56.539-07:00 [01952 verbose 'Recovery' ctxID=5c5b5384 opID=76706664] No state info for vm 'protected-vm-96962'
2018-01-08T23:02:56.539-07:00 [01952 verbose 'Recovery' ctxID=5c5b5384 opID=76706664] No state info for vm 'protected-vm-96963'
2018-01-08T23:02:56.539-07:00 [01952 verbose 'Recovery' ctxID=5c5b5384 opID=76706664] PVM State Tracker: Initializing.
2018-01-08T23:02:56.539-07:00 [01952 verbose 'Recovery' ctxID=5c5b5384 opID=76706664] PVM State Tracker: Will NOT initialize VM Info.
2018-01-08T23:02:58.770-07:00 [03132 verbose 'DatastoreGroupManager' opID=640e1e2d] Found 9 devices on host 'host-757'
2018-01-08T23:02:58.770-07:00 [03132 verbose 'DatastoreGroupManager' opID=640e1e2d] Found 9 VMFS volumes on host 'host-757'
2018-01-08T23:02:58.770-07:00 [03132 verbose 'DatastoreGroupManager' opID=640e1e2d] Dr::Storage::Match::`anonymous-namespace'::DeviceFetcher::FetchDevices: Fetched devices for host host-757
2018-01-08T23:02:58.786-07:00 [01952 panic 'Default' ctxID=5c5b5384 opID=76706664]
-->
--> Panic: FAILURE: "Deserialize failed for data item (persistence id: ##global##_pvmi.protected-vm-106049): std::exception 'class boost::archive::archive_exception' "input stream error"" @ d:/build/ob/bora-4535903/srm/src/jobs/jobs.cpp:304
--> Backtrace:
-->
--> [backtrace begin] product: VMware vCenter Site Recovery Manager, version: 6.1.1, build: build-4535903, tag: -
--> backtrace[00] vmacore.dll[0x001BF51A]
--> backtrace[01] vmacore.dll[0x0005D88F]
--> backtrace[02] vmacore.dll[0x0005E9DE]
--> backtrace[03] vmacore.dll[0x001D7A55]
--> backtrace[04] vmacore.dll[0x001D7B4D]
--> backtrace[05] vmacore.dll[0x0004DEFC]
--> backtrace[06] dr-jobs.dll[0x00035DB7]
--> backtrace[07] MSVCR90.dll[0x00074830]
--> backtrace[08] MSVCR90.dll[0x00043B3C]
--> backtrace[09] ntdll.dll[0x0004B681]
--> backtrace[10] dr-jobs.dll[0x0000390C]
--> backtrace[11] dr-jobs.dll[0x00005408]
--> backtrace[12] dr-recovery.dll[0x0016F153]
--> backtrace[13] dr-recovery.dll[0x0016AC81]
--> backtrace[14] dr-recovery.dll[0x002B0488]
--> backtrace[15] dr-recovery.dll[0x002AEF33]
--> backtrace[16] dr-recovery.dll[0x002B4011]
--> backtrace[17] dr-recovery.dll[0x00031A19]
--> backtrace[18] functional.dll[0x00028089]
--> backtrace[19] vmacore.dll[0x00153B8E]
--> backtrace[20] vmacore.dll[0x0015749F]
--> backtrace[21] vmacore.dll[0x001589F1]
--> backtrace[22] vmacore.dll[0x0015A725]
--> backtrace[23] vmacore.dll[0x00066C4B]
--> backtrace[24] vmacore.dll[0x00155DC0]
--> backtrace[25] vmacore.dll[0x001D302B]
--> backtrace[26] MSVCR90.dll[0x00002FDF]
--> backtrace[27] MSVCR90.dll[0x00003080]
--> backtrace[28] kernel32.dll[0x000159CD]
--> backtrace[29] ntdll.dll[0x0002A561]
--> [backtrace end]
-->

Resolution: Take a backup of the SRM database, delete the contents of the pdj_dataitem table, and resume the installer.

Delete from pdj_dataitem;
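
Optionally, check how many rows will be removed first (this assumes direct SQL access to the SRM database):

Select count(*) from pdj_dataitem;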

Setting up replication using existing seeds/reverse replication

Before proceeding, ensure there are no stale replications. The VM must not be listed under vCenter Server > Monitor > vSphere Replication > Outgoing (on the source) or Incoming (on the destination).

Configuring replication using existing seeds is rather simple. Configure replication like you normally would, except that you select the datastore > folder where the seed copies of the source VM's virtual disks are located.

Configure replication:

  • Select the target site.
  • Select the vR server.
  • Click Edit for the VM.
  • Select the datastore > VM folder where the seed disks are located.

For vR to detect a disk as a seed, the disk name must be identical to that of the source side. A message is shown when it picks up the seed; click "Use existing" to use the seed, or "Use all seeds" to accept all of them.

Proceed with the wizard.

The status of the replication is expected to be "Initial full sync"; however, if you click on the (i) icon next to the status, you will see that it is performing a checksum comparison.

The replication should resume once the checksum comparison has completed, and the state should then go back to OK.

Recovering a VM using vSphere replication

Log in to the recovery-site vCenter web client > vCenter Server > Monitor > vSphere Replication > Incoming Replications.

Click on the VM that you wish to recover and click the Recover button.

Should you be performing a planned migration of the VM (with the most recent changes), select "Synchronize recent changes" (you will need to authenticate with the source-side vCenter). Otherwise, select "Use latest available data", or use a point in time to recover to a point-in-time snapshot.

In my scenario, the source site was down and the VM needed to be recovered at the DR site.

The VM has now been recovered and is running at the DR site.