Currently, a failure to register or unregister a JMX metric is logged but doesn't prevent the connector from running. While metrics may seem secondary to the actual data-capturing logic, I'd argue that they are just as important:
- Metrics are used to synchronize the integration tests with the connector life cycle. If the connector fails to register its metrics, the integration tests will fail. Troubleshooting those failures is non-trivial.
- Metrics are used for monitoring the connector in a production environment. As an operator of a connector that fails to register its metrics, I'd rather have the connector fail than let it run unobserved.
The current implementation suppresses both runtime and logical errors. For instance, during the work on DBZ-4459, I had multiple metrics attempting to register themselves under the same name. It was a logical error that should have surfaced as an exception, but the exception was only logged and was hard to spot.
- By default, if a metric fails to register itself, it should cause a connector failure.
At this point, the register() and unregister() methods of the Metrics class will no longer need to accept a logger. The logger is still needed for DBZ-2089 (no JMX in GraalVM).
- If there are valid scenarios when a connector is expected not to be able to register its metrics (e.g. due to the runtime configuration), metrics should be disabled explicitly via configuration.
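
To illustrate the two points above, here is a minimal sketch of the proposed behavior. It is not the actual Metrics class: FailFastMetrics, SnapshotStats, and the metricsEnabled flag are hypothetical names made up for this example. Only the standard javax.management API is assumed.

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceAlreadyExistsException;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical stand-in for the Metrics class; all names here are illustrative.
class FailFastMetrics {

    // Standard MBean convention: the interface is named <ClassName>MBean and must be public.
    public interface SnapshotStatsMBean {
        long getRowsScanned();
    }

    public static class SnapshotStats implements SnapshotStatsMBean {
        @Override
        public long getRowsScanned() {
            return 0L;
        }
    }

    private final MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    private final boolean metricsEnabled; // hypothetical explicit opt-out via configuration

    FailFastMetrics(boolean metricsEnabled) {
        this.metricsEnabled = metricsEnabled;
    }

    // No logger parameter: a failure is propagated instead of being logged.
    void register(Object mbean, String name) {
        if (!metricsEnabled) {
            return; // metrics were disabled explicitly, so there is nothing to register
        }
        try {
            server.registerMBean(mbean, new ObjectName(name));
        } catch (JMException e) {
            throw new IllegalStateException("Unable to register metrics as MBean '" + name + "'", e);
        }
    }

    void unregister(String name) {
        if (!metricsEnabled) {
            return;
        }
        try {
            server.unregisterMBean(new ObjectName(name));
        } catch (JMException e) {
            throw new IllegalStateException("Unable to unregister MBean '" + name + "'", e);
        }
    }

    public static void main(String[] args) {
        FailFastMetrics metrics = new FailFastMetrics(true);
        String name = "example.metrics:type=connector,name=demo";
        metrics.register(new SnapshotStats(), name);
        try {
            // Second registration under the same name: the logical error from DBZ-4459.
            metrics.register(new SnapshotStats(), name);
        } catch (IllegalStateException e) {
            boolean duplicate = e.getCause() instanceof InstanceAlreadyExistsException;
            System.out.println("Registration failed, duplicate name: " + duplicate);
        } finally {
            metrics.unregister(name);
        }
    }
}
```

With metricsEnabled set to false, both methods become no-ops, which covers environments where registration is known to be impossible without hiding unexpected failures everywhere else.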