Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-56594

Kube API Server Crashes on Audit Log Event

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Undefined Undefined
    • 4.21.0
    • 4.17.z, 4.16.z, 4.18.z, 4.19.z, 4.20.z
    • kube-apiserver
    • Quality / Stability / Reliability
    • False
    • Hide

      None

      Show
      None
    • None
    • None
    • None
    • In Progress
    • Bug Fix
    • Fixed a Kubernetes API server crash regarding concurrent map iteration (audit context handling).
    • None
    • None
    • None
    • None

      Description of problem:
      kube-apiserver is crashing regularly on audit log events with:

      fatal error: concurrent map iteration and map write
      
      goroutine 258531290 [running]:
      reflect.mapiterinit(0x456109?, 0x180?, 0x4d9cf80?)
      	runtime/map.go:1392 +0x13
      reflect.(*MapIter).Next(0x49fbe80?)
      	reflect/value.go:1935 +0x57
      encoding/json.mapEncoder.encode({0xc2674f8ad0?}, 0xc3f9138f40, {0x49fbe80?, 0xc4509c1a30?, 0xc4509c1970?}, {0xe?, 0x0?})
      	encoding/json/encode.go:744 +0x33e
      encoding/json.structEncoder.encode({{{0xc09e531900, 0x12, 0x23}, 0xc0329222a0, 0xc0329222d0}}, 0xc3f9138f40, {0x53dc700?, 0xc4509c1900?, 0x2?}, {0x0, ...})
      	encoding/json/encode.go:706 +0x21e
      encoding/json.ptrEncoder.encode({0x1?}, 0xc3f9138f40, {0x5385800?, 0xc4509c1900?, 0x5385800?}, {0x0?, 0x3e?})
      	encoding/json/encode.go:878 +0x20f
      encoding/json.(*encodeState).reflectValue(0xc2674f8d90?, {0x5385800?, 0xc4509c1900?, 0x4092a6?}, {0x60?, 0x46?})
      	encoding/json/encode.go:323 +0x73
      encoding/json.(*encodeState).marshal(0x408e5d?, {0x5385800?, 0xc4509c1900?}, {0xc?, 0x0?})
      	encoding/json/encode.go:295 +0xb9
      encoding/json.(*Encoder).Encode(0xc2674f8ea8, {0x5385800, 0xc4509c1900})
      	encoding/json/stream.go:209 +0xdf
      k8s.io/apimachinery/pkg/runtime/serializer/json.(*Serializer).doEncode(0x2?, {0x5e4e2a8?, 0xc4509c1900?}, {0x5e320e0, 0xc3ffc5e000})
      	k8s.io/apimachinery@v0.29.0/pkg/runtime/serializer/json/json.go:246 +0x175
      k8s.io/apimachinery/pkg/runtime/serializer/json.(*Serializer).Encode(0xc000815bd0, {0x5e4e2a8, 0xc4509c1900}, {0x5e320e0, 0xc3ffc5e000})
      	k8s.io/apimachinery@v0.29.0/pkg/runtime/serializer/json/json.go:220 +0xf2
      k8s.io/apimachinery/pkg/runtime/serializer/versioning.(*codec).doEncode(0xc000521720, {0x5e46bb0, 0xc3ae9b3530}, {0x5e320e0, 0xc3ffc5e000}, {0x0?, 0x0?})
      	k8s.io/apimachinery@v0.29.0/pkg/runtime/serializer/versioning/versioning.go:284 +0x9ea
      k8s.io/apimachinery/pkg/runtime/serializer/versioning.(*codec).encode(0xc000521720, {0x5e46bb0, 0xc3ae9b3530}, {0x5e320e0, 0xc3ffc5e000}, {0x0?, 0x0?})
      	k8s.io/apimachinery@v0.29.0/pkg/runtime/serializer/versioning/versioning.go:214 +0x15c
      k8s.io/apimachinery/pkg/runtime/serializer/versioning.(*codec).Encode(0x17d0c97?, {0x5e46bb0?, 0xc3ae9b3530?}, {0x5e320e0?, 0xc3ffc5e000?})
      	k8s.io/apimachinery@v0.29.0/pkg/runtime/serializer/versioning/versioning.go:207 +0x2d
      k8s.io/apimachinery/pkg/runtime.Encode({0x7ff02416ffb0, 0xc000521720}, {0x5e46bb0, 0xc3ae9b3530})
      	k8s.io/apimachinery@v0.29.0/pkg/runtime/codec.go:49 +0x5e
      k8s.io/apiserver/plugin/pkg/audit/log.(*backend).logEvent(0xc00150ce10, 0xc3ae9b3530)
      	k8s.io/apiserver@v0.29.0/plugin/pkg/audit/log/backend.go:76 +0x65
      k8s.io/apiserver/plugin/pkg/audit/log.(*backend).ProcessEvents(0x4130e5?, {0xc439f37820, 0x1, 0xc3e79b8201?})
      	k8s.io/apiserver@v0.29.0/plugin/pkg/audit/log/backend.go:65 +0x3c
      k8s.io/apiserver/pkg/server/options.(*ignoreErrorsBackend).ProcessEvents(0x5e71e58?, {0xc439f37820?, 0x4130e5?, 0x98?})
      	k8s.io/apiserver@v0.29.0/pkg/server/options/audit.go:385 +0x22
      k8s.io/apiserver/pkg/endpoints/filters.processAuditEvent({0x5e71e58, 0xc194e1a1b0}, {0x7ff0342a8310, 0xc00148ee60}, 0xc3ae9b3530, {0xc00148ee00?, 0x1, 0x19b0f70?})
      	k8s.io/apiserver@v0.29.0/pkg/endpoints/filters/audit.go:198 +0x230
      k8s.io/apiserver/pkg/server.DefaultBuildHandlerChain.WithAudit.func7.1()
      	k8s.io/apiserver@v0.29.0/pkg/endpoints/filters/audit.go:113 +0x3bb
      k8s.io/apiserver/pkg/server.DefaultBuildHandlerChain.WithAudit.func7({0x5e5dde0, 0xc092b5dec0}, 0xb7e75d?)
      	k8s.io/apiserver@v0.29.0/pkg/endpoints/filters/audit.go:116 +0x3a8
      net/http.HandlerFunc.ServeHTTP(0x0?, {0x5e5dde0?, 0xc092b5dec0?}, 0x0?)
      	net/http/server.go:2141 +0x29
      k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1({0x5e5dde0, 0xc092b5dec0}, 0xc2674ba100)
      	k8s.io/apiserver@v0.29.0/pkg/endpoints/filterlatency/filterlatency.go:84 +0x192
      net/http.HandlerFunc.ServeHTTP(0x10?, {0x5e5dde0?, 0xc092b5dec0?}, 0xc48f64f960?)
      	net/http/server.go:2141 +0x29
      k8s.io/apiserver/pkg/server.DefaultBuildHandlerChain.WithStartupEarlyAnnotation.withStartupEarlyAnnotation.func30({0x5e5dde0, 0xc092b5dec0}, 0xc2674ba100)
      	k8s.io/apiserver@v0.29.0/pkg/server/filters/with_early_late_annotations.go:127 +0x2a7
      net/http.HandlerFunc.ServeHTTP(0x5138300?, {0x5e5dde0?, 0xc092b5dec0?}, 0x5e20e88?)
      	net/http/server.go:2141 +0x29
      k8s.io/apiserver/pkg/server.DefaultBuildHandlerChain.TrackCompleted.trackCompleted.func32({0x5e5dde0?, 0xc092b5dec0}, 0xc2674ba100)
      	k8s.io/apiserver@v0.29.0/pkg/endpoints/filterlatency/filterlatency.go:110 +0x177
      net/http.HandlerFunc.ServeHTTP(0x5e71e58?, {0x5e5dde0?, 0xc092b5dec0?}, 0x5e17fa0?)
      	net/http/server.go:2141 +0x29
      k8s.io/apiserver/pkg/endpoints/filters.withAuthentication.func1({0x5e5dde0, 0xc092b5dec0}, 0xc2674ba100)
      	k8s.io/apiserver@v0.29.0/pkg/endpoints/filters/authentication.go:120 +0x7e5
      net/http.HandlerFunc.ServeHTTP(0x5e71e58?, {0x5e5dde0?, 0xc092b5dec0?}, 0x5e20e88?)
      	net/http/server.go:2141 +0x29
      k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1({0x5e5dde0, 0xc092b5dec0}, 0xc266edfe00)
      	k8s.io/apiserver@v0.29.0/pkg/endpoints/filterlatency/filterlatency.go:94 +0x37a
      net/http.HandlerFunc.ServeHTTP(0xc092b5def0?, {0x5e5dde0?, 0xc092b5dec0?}, 0xc339871ee8?)
      	net/http/server.go:2141 +0x29
      k8s.io/apiserver/pkg/server/filters.WithCORS.func1({0x5e5dde0, 0xc092b5dec0}, 0xc266edfe00)
      	k8s.io/apiserver@v0.29.0/pkg/server/filters/cors.go:61 +0x630
      net/http.HandlerFunc.ServeHTTP(0xc339871fd0?, {0x5e5dde0?, 0xc092b5dec0?}, 0x5e47930?)
      	net/http/server.go:2141 +0x29
      k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP.func1()
      	k8s.io/apiserver@v0.29.0/pkg/server/filters/timeout.go:115 +0x62
      created by k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP in goroutine 250263206
      	k8s.io/apiserver@v0.29.0/pkg/server/filters/timeout.go:101 +0x1b2
      

      In the largest production cluster, this is causing API disruptions and further failues due to LIST/WATCH storms when the API abruptly fails.

      Version-Release number of selected component (if applicable):

      4.16.34
      

      How reproducible:

      Random, but happens every days days.
      

      Steps to Reproduce:

          1.
          2.
          3.
      

      Actual results:

      API server crashes causing the other 2 to be stormed with kubelet LIST/WATCH requests which sometimes degrades the API server or causes them crash temporarily before recovering. 
      

      Expected results:

      No panics
      

      Additional info:
      This is related to: https://github.com/kubernetes/kubernetes/issues/120507
      And should be fixed by: https://github.com/kubernetes/kubernetes/pull/129472

              rmarasch@redhat.com Ricardo Maraschini
              rhn-support-mrobson Matt Robson
              None
              None
              Ke Wang Ke Wang
              None
              Votes:
              1 Vote for this issue
              Watchers:
              9 Start watching this issue

                Created:
                Updated: