-
Bug
-
Resolution: Done
-
Major
-
None
-
None
-
5
-
False
-
-
True
-
BIZ-679 - Ansible on AWS, SaaS
-
-
Exceptions are logged when the value of the instanceKey label in Prometheus results is longer than 60 characters. The error causes event ingestion to fail only for the affected events. It does not cause whole batches of events to fail, and no important usage is missed.
This appears to be limited to metrics with billing_marketplace gcp, which use mock data presumably set up by the RHEL team for GCP support.
Caused by: org.springframework.kafka.listener.BatchListenerFailedException: could not execute batch [Batch entry 0 insert into events (data,event_source,event_type,instance_id,metering_batch_id,org_id,record_date,timestamp,event_id) values (('{"event_source":"rhelemeter","event_type":"snapshot_rhel-for-x86-els-payg_vcpus","org_id":"16787820","instance_id":"projects/mock-project-id/zones/mock-zone/instances/mock-instance-name","metering_batch_id":"5bfadfdb-f5f3-4a9f-a893-af9de82c11af","event_id":"d679c66e-fed6-42d6-be34-53c43ada3742","service_type":"RHEL System","timestamp":"2024-06-20T11:00:00Z","record_date":"2024-06-20T12:30:03.841872666Z","expiration":"2024-06-20T12:00:00Z","display_name":"gcp1.host-metering.test","measurements":[{"value":1.0,"metric_id":"vCPUs"}],"product_ids":["204","69"],"sla":"Premium","usage":"Production","billing_provider":"gcp","billing_account_id":"gcp16787820","product_tag":["rhel-for-x86-els-payg"],"conversion":true}'),('rhelemeter'),('snapshot_rhel-for-x86-els-payg_vcpus'),('projects/mock-project-id/zones/mock-zone/instances/mock-instance-name'),('5bfadfdb-f5f3-4a9f-a893-af9de82c11af'::uuid),('16787820'),('2024-06-20 12:30:03.841873+00'),('2024-06-20 11:00:00+00'),('d679c66e-fed6-42d6-be34-53c43ada3742'::uuid)) was aborted: ERROR: value too long for type character varying(60) Call getNextException to see other errors in the batch.] [insert into events (data,event_source,event_type,instance_id,metering_batch_id,org_id,record_date,timestamp,event_id) values (?,?,?,?,?,?,?,?,?)]; SQL [insert into events (data,event_source,event_type,instance_id,metering_batch_id,org_id,record_date,timestamp,event_id) values (?,?,?,?,?,?,?,?,?)] @-0
    at org.candlepin.subscriptions.event.EventController.lambda$persistServiceInstances$4(EventController.java:201)
    at java.base/java.util.HashMap.forEach(HashMap.java:1421)
https://redhat-internal.slack.com/archives/C01F7QFNATC/p1718891232122339
Done:
- No stack trace spam in the logs when this happens
- Try to catch this exception and log it at WARN with a new swatch error code
- Liquibase script to change the instance_id columns in the swatch database tables to an unlimited-length varchar
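
The catch-and-log item above could be approached roughly as follows. This is only a sketch, not the actual swatch change: the class name `InstanceIdTooLongHandler`, the error code value, and the method names are hypothetical, and it uses `java.util.logging` purely to stay self-contained. It assumes the database reports the failure with SQLSTATE 22001 (string data right truncation), which is what PostgreSQL uses for "value too long for type character varying(n)", and it follows `getNextException` as the batch error message suggests.

```java
import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: recognizes "value too long" failures and logs them at WARN
// (WARNING in java.util.logging) instead of letting a full stack trace reach the log.
final class InstanceIdTooLongHandler {
  private static final Logger LOG =
      Logger.getLogger(InstanceIdTooLongHandler.class.getName());

  // Placeholder only; the real swatch error code would be assigned by the team.
  static final String ERROR_CODE = "SWATCH-HYPOTHETICAL-CODE";

  // SQLSTATE 22001 = string data, right truncation.
  static final String STRING_TRUNCATION_SQLSTATE = "22001";

  // Guard against pathological cause chains.
  private static final int MAX_DEPTH = 32;

  /** Returns true if any exception in the cause chain (or its batch chain) is a truncation error. */
  static boolean isValueTooLong(Throwable error) {
    int depth = 0;
    for (Throwable t = error; t != null && depth < MAX_DEPTH; t = t.getCause(), depth++) {
      if (t instanceof SQLException) {
        // Batch failures chain the per-row errors via getNextException().
        int chained = 0;
        for (SQLException next = (SQLException) t;
            next != null && chained < MAX_DEPTH;
            next = next.getNextException(), chained++) {
          if (STRING_TRUNCATION_SQLSTATE.equals(next.getSQLState())) {
            return true;
          }
        }
      }
    }
    return false;
  }

  /**
   * Logs a single warning line for truncation failures and reports whether the
   * error was handled. Other errors are left for the caller to rethrow.
   */
  static boolean handle(Throwable error, String instanceId) {
    if (!isValueTooLong(error)) {
      return false;
    }
    LOG.log(
        Level.WARNING,
        "{0}: skipping event, instance_id has {1} characters and does not fit the column",
        new Object[] {ERROR_CODE, String.valueOf(instanceId.length())});
    return true;
  }
}
```

A caller would wrap the persistence call in a try/catch, pass the caught exception to `handle`, and rethrow only when it returns false. Checking the SQLSTATE rather than matching the message text keeps the check independent of the database's error wording. Once the instance_id columns are widened, this path should rarely trigger for instance_id itself.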