Database (PostgreSQL)
Slow to respond
The service retries each transaction three times before failing. When a synchronous call fails, it may not trigger the asynchronous call to the account-processor that would normally follow. Database timeouts may surface as gateway timeout server errors (HTTP 504) at the Core API level.
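The retry behaviour described above can be pictured with a minimal sketch. It is not the service's actual implementation: TransientDatabaseError, run_with_retries and the backoff parameter are hypothetical names, and it assumes "three times" means one initial attempt followed by up to three retries.

```python
import time


class TransientDatabaseError(Exception):
    """Hypothetical stand-in for a database timeout or connection error."""


def run_with_retries(transaction, max_retries=3, backoff_seconds=0.0):
    """Run `transaction`, retrying up to `max_retries` times on transient errors.

    Assumption: one initial attempt plus `max_retries` retries. The last
    error is re-raised once every attempt has failed.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be zero or greater")

    last_error = None
    for attempt in range(1 + max_retries):
        try:
            return transaction()
        except TransientDatabaseError as err:
            last_error = err
            if attempt < max_retries:
                # Simple linear backoff before the next attempt.
                time.sleep(backoff_seconds * (attempt + 1))
    raise last_error
```

Once the final attempt fails, the error propagates to the caller, which is the point at which a synchronous request would fail and any follow-up asynchronous call would not be triggered.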
Unavailable
Every Account Service endpoint makes at least one call to the database, so the Account Service cannot function while the database is unavailable.
Kafka
Account Service relies on Kafka for communication between vault-account and vault-account-processor. Monitor vault.core.account_update.events to check its health. Note that account migrations create account update batches whose size is limited by ACCOUNT_UPDATES_NEXT_BATCH_SIZE in the config maps. ACCOUNT_UPDATES_NEXT_BATCH_SIZE is set to 5 by default.
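To illustrate how a batch size limit of this kind bounds the work created by a migration, here is a small sketch. It assumes the config map value reaches the process as an environment variable of the same name, which this document does not confirm; resolve_batch_size and split_into_batches are hypothetical helpers, not part of the service.

```python
import os

DEFAULT_BATCH_SIZE = 5  # default value stated in this document


def resolve_batch_size(env=None):
    """Read the batch size, falling back to the documented default.

    Assumption: the config map entry is exposed as an environment variable.
    """
    env = os.environ if env is None else env
    size = int(env.get("ACCOUNT_UPDATES_NEXT_BATCH_SIZE", DEFAULT_BATCH_SIZE))
    if size < 1:
        raise ValueError("ACCOUNT_UPDATES_NEXT_BATCH_SIZE must be at least 1")
    return size


def split_into_batches(account_ids, batch_size):
    """Split a list of account IDs into consecutive batches of at most `batch_size`."""
    return [
        account_ids[start:start + batch_size]
        for start in range(0, len(account_ids), batch_size)
    ]
```

Under these assumptions, migrating 12 accounts with the default setting would yield three batches of 5, 5 and 2 account updates.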
Unavailable
vault-account remains operational and can still handle requests and return responses. However, communication with the account processor ceases, so account updates are not processed. An account’s status may therefore transition from PENDING to OPEN to PENDING_CLOSURE without any of the corresponding update hooks being processed.
Monitor vault.core.account_update.events to see whether events are being published, and check the statuses of account updates to determine whether they have been processed.
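The status check can be sketched as a small filter over account update records that have already been fetched. Everything here is illustrative: the AccountUpdate fields, the set of terminal status names and the five-minute threshold are assumptions, not values taken from the service.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical terminal statuses; substitute the real account update statuses.
TERMINAL_STATUSES = frozenset({"COMPLETED", "REJECTED", "ERRORED"})


@dataclass
class AccountUpdate:
    update_id: str
    status: str
    created_at: datetime


def find_stalled_updates(updates, now, max_age=timedelta(minutes=5)):
    """Return updates that are not in a terminal status and are older than `max_age`."""
    return [
        update
        for update in updates
        if update.status not in TERMINAL_STATUSES
        and now - update.created_at > max_age
    ]
```

A growing list of stalled updates while vault-account keeps serving requests would be consistent with the processor not receiving events.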
Once Kafka is back online, the pending account updates will eventually be processed. The processor pods can be scaled horizontally to work through the backlog once Kafka is available again.