Usage meter reports partial collection failure: events

Log files: vccol_main.log | vccol_error.log
[2021-05-18 09:02:25]  | ERROR | ter collector thread |    com.vmware.um.vccollector.VCCollector | vCenter collector90 | Events stage raised exception javax.xml.ws.WebServiceException: java.net.SocketTimeoutException: Read timed out java.net.SocketTimeoutException: Read timed out=>Read timed out
[2021-05-18 09:02:26]  | ERROR | ter collector thread | com.vmware.um.collector.CollectionHelper | vCenter collector98 | Status (COLLECT_API_ERR) for vCenter server 7: Partial collection failure: Events
[2021-05-18 10:02:18]  | ERROR | ter collector thread |    com.vmware.um.vccollector.VCCollector | vCenter collector179 | Events stage raised exception javax.xml.ws.WebServiceException: java.net.SocketTimeoutException: Read timed out java.net.SocketTimeoutException: Read timed out=>Read timed out
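
To see how often the Events stage times out, the error log can be searched for the exception shown above. This is a minimal sketch: the function name count_event_timeouts is hypothetical, and the log file path is passed in as an argument because the log directory is not stated here.

```shell
#!/bin/sh
# count_event_timeouts LOGFILE
# Hypothetical helper: prints the number of lines in LOGFILE that record an
# Events stage failure caused by a SocketTimeoutException.
count_event_timeouts() {
    grep -c 'Events stage raised exception.*SocketTimeoutException' "$1"
}

# Example (path is an assumption, adjust to where the collector logs live):
#   count_event_timeouts /path/to/vccol_error.log
```

A count that grows by roughly one per collection cycle suggests a persistent timeout rather than a one-off.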

Cause: The connection was closed before the data could be retrieved successfully. Usage Meter requests events from vCenter, and this API call can take a long time to respond, either because there is a very large number of events or because vCenter is under heavy processing load.

Resolution: Increase the timeouts
* Take a snapshot of the Usage Meter appliance.
* SSH into the appliance as the user usagemeter.
* Take a backup copy of common_utils.sh:

cp /opt/vmware/cloudusagemetering/scripts/common_utils.sh /opt/vmware/cloudusagemetering/scripts/common_utils.sh.bak

* Edit the file using vi:

vi /opt/vmware/cloudusagemetering/scripts/common_utils.sh

* Set the following fields to these values:

=> CONNECT_TIMEOUT_MS="300000"
=> READ_TIMEOUT_MS="600000"

* Save the file and restart the appliance.
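
The manual edit above can also be scripted. The following is a minimal sketch, not a tested procedure for the appliance: the function name set_timeout is hypothetical, and it assumes each field appears in common_utils.sh as a NAME="value" assignment at the start of a line, which has not been verified against the actual file. Take the backup copy first.

```shell
#!/bin/sh
# set_timeout FILE NAME VALUE
# Hypothetical helper: rewrites the line NAME=... in FILE to NAME="VALUE".
# Assumes the assignment starts at the beginning of a line (optionally indented).
set_timeout() {
    file=$1
    name=$2
    value=$3

    # Fail loudly if the field is missing instead of silently changing nothing.
    if ! grep -q "^[[:space:]]*${name}=" "$file"; then
        echo "${name} not found in ${file}" >&2
        return 1
    fi

    tmp=$(mktemp) || return 1
    # Write the edited copy to a temp file, then copy the contents back so the
    # original file keeps its owner and permissions.
    sed "s|^\([[:space:]]*${name}=\).*|\1\"${value}\"|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example, using the path and values from the steps above:
#   f=/opt/vmware/cloudusagemetering/scripts/common_utils.sh
#   set_timeout "$f" CONNECT_TIMEOUT_MS 300000
#   set_timeout "$f" READ_TIMEOUT_MS 600000
#   grep -E 'CONNECT_TIMEOUT_MS|READ_TIMEOUT_MS' "$f"   # confirm the new values
```

The final grep in the example is only a check that both fields now carry the new values before the appliance is restarted.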


Note: Bursts of events on the vCenter side are another likely cause of this timeout. KB https://kb.vmware.com/s/article/74607 is one of several articles that document such a burst. Event bursts need to be triaged from the vCenter perspective.
