Uploaded image for project: 'OpenShift API Server'
  1. OpenShift API Server
  2. API-1322

Create alert for API Server audit log errors #1166

XMLWordPrintable

    • Icon: Story Story
    • Resolution: Obsolete
    • Icon: Undefined Undefined
    • openshift-4.11
    • None
    • None
    • None
    • False
    • False
    • undefined

      The purpose of the alarm is to inform administrators that there are
      errors writing audit logs which might be an issue for incident response
      teams as the audit logs become unreliable at this point. The alarm has a
      query that provides appropriate labels to identify the API server
      type and instance that's failing.

      The alert is left vague enough that this will catch any API Server
      causing these types of issues; but is still usable via the
      aforementioned labels.

      The alert will trigger if there are any errors detected (threshold above 0).
      However, this happening is not a common occurrence, and would most
      likely trigger for the following reasons:

      • Node's disk space is full
      • Someone actually got into the node as is tampering with the audit logs
        to the point that they're un-usable (changing modes or attributes).

      PR: https://github.com/openshift/cluster-kube-apiserver-operator/pull/1166

              josorior@redhat.com Juan Antonio Osorio (Inactive)
              anachand Anandnatraj Chandramohan (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Created:
                Updated:
                Resolved: