VCSA update: “Stage only” and “Stage and install” grayed out

Solution: rename (or remove) the software update state file:

mv /etc/applmgmt/appliance/software_update_state.conf /etc/applmgmt/appliance/software_update_state.conf.bak
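
A slightly more defensive variant of the rename, as a minimal sketch: it only moves the file if it actually exists. The function name backup_state_file is hypothetical and not part of any VMware tooling.

```shell
#!/bin/sh
# backup_state_file FILE: rename FILE to FILE.bak, but only if FILE exists.
# Hypothetical helper for illustration; not a VMware-provided command.
backup_state_file() {
    state_file="$1"
    if [ -f "$state_file" ]; then
        mv "$state_file" "$state_file.bak" && echo "renamed: $state_file.bak"
    else
        echo "nothing to do: $state_file not found"
    fi
}

# On the appliance this would be called as:
# backup_state_file /etc/applmgmt/appliance/software_update_state.conf
```

Keeping the .bak copy means the original state can be restored if needed.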

On a side note: I ran into this issue after attempting to “stage and install” an update on my homelab VCSA.
The logs reveal:

/var/log/vmware/applmgmt/applmgmt.log
2021-05-30T15:20:26.108 [3758]DEBUG:vmware.appliance.update.update_functions:Removing the mount point /mnt/iso-contents
2021-05-30T15:20:26.109 [3758]INFO:vmware.appliance.update.update_functions:ISO unmounted successfully
2021-05-30T15:20:26.109 [3758]DEBUG:vmware.appliance.update.update_b2b:discoverLocalUpdate failed.
Traceback (most recent call last):
  File "/usr/lib/applmgmt/update/py/vmware/appliance/update/update_b2b.py", line 1434, in _discoverUpdateAt
    tempFolder)
  File "/usr/lib/python3.7/shutil.py", line 248, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/usr/lib/python3.7/shutil.py", line 120, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/iso-contents/manifest-latest.xml'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/applmgmt/update/py/vmware/appliance/update/update_b2b.py", line 1640, in discoverLocalUpdatesNoException
    _discoverLocalUpdates()
  File "/usr/lib/applmgmt/update/py/vmware/appliance/update/update_b2b.py", line 1631, in _discoverLocalUpdates
    _discoverUpdateAtIso()
  File "/usr/lib/applmgmt/update/py/vmware/appliance/update/update_b2b.py", line 1607, in _discoverUpdateAtIso
    raise e
  File "/usr/lib/applmgmt/update/py/vmware/appliance/update/update_b2b.py", line 1602, in _discoverUpdateAtIso
    _discoverUpdateAt(manifestDir, packagesDir, copyFileFunc, 'iso')
  File "/usr/lib/applmgmt/update/py/vmware/appliance/update/update_b2b.py", line 1446, in _discoverUpdateAt
    raise RpmManifestNotFoundException
vmware.appliance.update.update_b2b.RpmManifestNotFoundException

So I renamed the state file as described above and, this time, staged the update first and then ran the install as a separate step.

vRBC 7.6 security patch (7.6.0.46000) installation fails

Installing the vRBC security patch fails.

Logs:

/opt/vmware/var/log/vami/vami.log
/opt/vmware/var/log/vami/updatecli.log

pg_dump: dumping contents of table "itfm_cloud_admin.vra_vm_details"
pg_dump: dumping contents of table "itfm_cloud_admin.vra_vm_details_tags"
pg_dump: saving large objects
Dump successful
Stopping VMware vPostgres: ok
MongoDB instance is already upgraded. Aborting upgrade
error: package libyui-ncurses-pkg6-2.46.1-3.4.x86_64 is not installed
error: package perl-Bootloader-YAML is not installed
30/05/2021 7:30:50 [INFO] Update status: Done pre-install scripts
30/05/2021 7:30:50 [INFO] Update status: Running installation tests
30/05/2021 7:30:50 [INFO] Running /opt/vmware/var/lib/vami/update/data/job/15/test_command 
Preparing packages...
	installing package kernel-default-4.12.14-122.26.1.x86_64 needs 4MB on the /boot filesystem
30/05/2021 7:30:51 [ERROR] Failed with exit code 65024
30/05/2021 7:30:51 [INFO] Update status: Running post-install scripts

Cause: the /boot partition ran out of space; the new kernel package needs 4MB on the /boot filesystem.

Resolution:

Take a snapshot of the vRB appliance, then perform the following:

  • SSH to the vRB virtual appliance.
  • Create a temporary directory and move the old kernel files into it:
mkdir /tmp/boot
cd /boot/
mv vmlinu* initrd* /tmp/boot
  • Re-run the upgrade to the security build.
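
Before re-running the upgrade, it may be worth confirming that /boot now has room for the new kernel. The following is a minimal sketch using df. The helper names free_mb and has_free_mb are hypothetical, and the 4 MB threshold is simply the figure from the log excerpt above.

```shell
#!/bin/sh
# free_mb PATH: print the free space, in whole MB, on the filesystem holding PATH.
# Uses POSIX output (-P) in 1024-byte blocks (-k); column 4 is "Available".
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# has_free_mb PATH MIN_MB: succeed if at least MIN_MB megabytes are free.
has_free_mb() {
    [ "$(free_mb "$1")" -ge "$2" ]
}

# On the appliance this might be used as:
# has_free_mb /boot 4 && echo "enough space on /boot" || echo "/boot is still too full"
```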

Usage Meter 4.3/4.4: vCenter Server “Partial collection failure: Events”

Usage Meter reports a partial collection failure for events.

Log files: vccol_main.log | vccol_error.log
[2021-05-18 09:02:25]  | ERROR | ter collector thread |    com.vmware.um.vccollector.VCCollector | vCenter collector90 | Events stage raised exception javax.xml.ws.WebServiceException: java.net.SocketTimeoutException: Read timed out java.net.SocketTimeoutException: Read timed out=>Read timed out
[2021-05-18 09:02:26]  | ERROR | ter collector thread | com.vmware.um.collector.CollectionHelper | vCenter collector98 | Status (COLLECT_API_ERR) for vCenter server 7: Partial collection failure: Events
[2021-05-18 10:02:18]  | ERROR | ter collector thread |    com.vmware.um.vccollector.VCCollector | vCenter collector179 | Events stage raised exception javax.xml.ws.WebServiceException: java.net.SocketTimeoutException: Read timed out java.net.SocketTimeoutException: Read timed out=>Read timed out
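
To get a feel for how often the Events stage times out, the matching lines in the error log can be counted with grep. This is a minimal sketch. The helper name count_event_timeouts is hypothetical, and the log path is passed in as an argument because the log directory is not shown here.

```shell
#!/bin/sh
# count_event_timeouts LOGFILE: print the number of lines where the Events
# stage raised a read timeout. grep -c exits non-zero when the count is 0,
# so "|| true" keeps the helper from reporting that as a failure.
count_event_timeouts() {
    grep -c 'Events stage raised exception.*Read timed out' "$1" || true
}

# Example, run from the directory that holds the log:
# count_event_timeouts vccol_error.log
```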

Cause: the connection was closed before the data could be retrieved. Usage Meter requests events from vCenter, and this API call can take a long time to respond, either because of the huge number of events or because vCenter is under heavy processing load.

Resolution: increase the timeouts
* Take a snapshot of the UM appliance.
* SSH into the appliance as the user usagemeter.
* Take a backup copy of common_utils.sh:

cp /opt/vmware/cloudusagemetering/scripts/common_utils.sh /opt/vmware/cloudusagemetering/scripts/common_utils.sh.bak

* Edit the script using vi:

 vi /opt/vmware/cloudusagemetering/scripts/common_utils.sh

* Replace the values of the fields below:

=> CONNECT_TIMEOUT_MS="300000"
=> READ_TIMEOUT_MS="600000"

* Save the file and restart the appliance.
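
The same edit can be scripted with sed instead of vi. This is only a sketch: it assumes each timeout sits on its own line, at the start of the line, in the form NAME="value", which should be checked against the actual common_utils.sh first. The helper name set_timeout is hypothetical.

```shell
#!/bin/sh
# set_timeout FILE NAME VALUE: rewrite a line of the form NAME="..." in FILE.
# Assumption: the assignment starts at the beginning of the line.
set_timeout() {
    file="$1"; name="$2"; value="$3"
    # Edit in place; sed leaves a .tmp copy, which is removed afterwards.
    sed -i.tmp "s/^${name}=.*/${name}=\"${value}\"/" "$file" || return 1
    rm -f "$file.tmp"
    # Return non-zero if the expected line is not present after the edit.
    grep -q "^${name}=\"${value}\"\$" "$file"
}

# On the appliance, after taking the .bak copy:
# f=/opt/vmware/cloudusagemetering/scripts/common_utils.sh
# set_timeout "$f" CONNECT_TIMEOUT_MS 300000
# set_timeout "$f" READ_TIMEOUT_MS 600000
```

The final grep makes the helper fail when no matching line was found, so a different file layout does not go unnoticed.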


Note: bursts of events on the vCenter side are also a likely cause. KB https://kb.vmware.com/s/article/74607 is one of several where such a burst is documented (event bursts need to be triaged from the vCenter perspective).