Story
Resolution: Unresolved
Minor
Currently we obfuscate line by line, which works well for log files. Another large set of files is schema-based: k8s resources, where a single value can span multiple lines, as in ConfigMaps or Secrets:
  kind: ConfigMap
  metadata:
    creationTimestamp: "2021-08-03T09:26:51Z"
    labels:
      app: oauth-openshift
    name: v4-0-config-system-metadata
    namespace: openshift-authentication
    resourceVersion: "19520"
    uid: bb2b8990-d9ab-4146-ae8f-0a523921ef91
- apiVersion: v1
  data:
    service-ca.crt: |
      -----BEGIN CERTIFICATE-----
      MIIDUTCCAjmgAwIBAgIITiuDGTuteWgwDQYJKoZIhvcNAQELBQAwNjE0MDIGA1UE
      Awwrb3BlbnNoaWZ0LXNlcnZpY2Utc2VydmluZy1zaWduZXJAMTYyNzk4MjMzOTAe
      Fw0yMTA4MDMwOTE4NTlaFw0yMzEwMDIwOTE5MDBaMDYxNDAyBgNVBAMMK29wZW5z
      aGlmdC1zZXJ2aWNlLXNlcnZpbmctc2lnbmVyQDE2Mjc5ODIzMzkwggEiMA0GCSqG
      SIb3DQEBAQUAA4IBDwAwggEKAoIBAQCgTMe63NxzFwMGd5Mv+eiD81R288pd9vOu
      V8hMSvfLaGOQgg74rFdjlxniSD3cYCnYvfD4ZUz8PL+q1tuHTgD9Mvx0I0p7AZDf
      v1E3Rds7yuK7t82mmUAISSKSoajm5ZrL0fOK9HQNK8/aoeG5M1h9kDKiNQJtybHQ
      V7aZN5OSvrchfxKqRUVKPqXyf8AA7t4fl3SF52PpC5VxxCr1P4gl+wucmtTp0FRv
      jP9TJmJQ3ZdRgeT7fw5OAaBjqgu1ErX70aMePV7KfgBvW8Bim7XzW7Uuyh0PzEtq
      Yi6cKC5h+MvUC8X9Zg75CM8SQg/2eFb6rsd6wl7YUXOAb//GsFtPAgMBAAGjYzBh
      MA4GA1UdDwEB/wQEAwICpDAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSwrqU6
      hd6RLQ647r3Q8ZMIfQboFDAfBgNVHSMEGDAWgBSwrqU6hd6RLQ647r3Q8ZMIfQbo
      FDANBgkqhkiG9w0BAQsFAAOCAQEABL02z6ftvCpACNgTa+jfs/WPO9b0gd7XSyF+
      pn2j1AbOBGMex272aLjs3t+fqOm+Y8nfNpX5KSdPRiQiCZFykZslXbUBy4vL/BcI
      F+s1OxZJdcPO9vxD0dpXYr0Hi8HBClNTRs+UlYAXy94Shyyv7qnDG2gvSyyxWGPd
      VLdGJvWpdi7O0e15XvqzOGB3jKElY1mXVBlqQZngOYVzkDI+L8L5ThPAxqbOd+fs
      glwZIjWNLkDkPu7UxUcnia7dDZXfSRSbknbM9BNUrSuLc1QcsBpIBxI8iR/msYYi
      R2KGO0hYX0GOgCN3R6hhaN4BIhKOJ1Rwx1O8UOgNIJQwKRiKMw==
      -----END CERTIFICATE-----
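To illustrate why multi-line values are a problem for a line-based pass, here is a minimal Python sketch. It assumes the YAML has already been parsed, so the certificate arrives as one string value; the names `PEM_BLOCK` and `redact_pem`, the replacement token, and the shortened certificate body are hypothetical and not part of the existing tooling.

```python
import re

# Matches one whole PEM block, including the newlines inside it.
# re.DOTALL lets "." cross line breaks, which a line-by-line pass cannot do.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z ]+)-----.*?-----END \1-----",
    re.DOTALL,
)


def redact_pem(value: str) -> str:
    """Replace every PEM block in a (possibly multi-line) string value."""
    return PEM_BLOCK.sub("<redacted PEM block>", value)


# Hypothetical parsed ConfigMap: the YAML block scalar is ONE string.
config_map = {
    "apiVersion": "v1",
    "data": {
        "service-ca.crt": (
            "-----BEGIN CERTIFICATE-----\n"
            "MIIDUTCCAjmgAwIBAgIITiuDGTuteWgw\n"
            "-----END CERTIFICATE-----\n"
        )
    },
}

# Only the values under "data" are rewritten; the keys stay readable.
redacted = {key: redact_pem(val) for key, val in config_map["data"].items()}
```

Because the match runs over the whole value, the base64 lines between the BEGIN and END markers are removed together, which a per-line matcher would have no reliable way to recognise.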
Additionally, we can leverage semantic information. In the Route object, for example, we know which fields contain domain and host names:
---
apiVersion: route.openshift.io/v1
items:
- apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    creationTimestamp: "2021-08-03T09:25:53Z"
    labels:
      app: oauth-openshift
    name: oauth-openshift
    namespace: openshift-authentication
  spec:
    host: oauth-openshift.apps.ci-ln-5ylibmb-d5d6b.origin-ci-int-aws.dev.rhcloud.com
    port:
      targetPort: 6443
    tls:
      insecureEdgeTerminationPolicy: Redirect
      termination: passthrough
    to:
      kind: Service
      name: oauth-openshift
      weight: 100
    wildcardPolicy: None
  status:
    ingress:
    - conditions:
      - lastTransitionTime: "2021-08-03T09:26:51Z"
        status: "True"
        type: Admitted
      host: oauth-openshift.apps.ci-ln-5ylibmb-d5d6b.origin-ci-int-aws.dev.rhcloud.com
      routerCanonicalHostname: router-default.apps.ci-ln-5ylibmb-d5d6b.origin-ci-int-aws.dev.rhcloud.com
      routerName: default
      wildcardPolicy: None
kind: RouteList
metadata:
  resourceVersion: "41846"
There is no need to obfuscate keys such as "kind" or "resourceVersion" (i.e. the YAML object names), because the data that needs obfuscating is most likely in the values. Labels and annotations might be an exception, however.
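A rough Python sketch of this key-aware idea follows. It assumes the resource is already parsed into dicts and lists; the set `HOST_KEYS`, the functions `obfuscate_host` and `obfuscate`, the hash-based replacement scheme, and the shortened host names are all hypothetical choices made for illustration.

```python
import hashlib

# Assumed set of keys whose values are host or domain names.
HOST_KEYS = {"host", "routerCanonicalHostname"}


def obfuscate_host(host: str) -> str:
    """Map a host name to a stable placeholder (same input, same output)."""
    digest = hashlib.sha256(host.encode()).hexdigest()[:12]
    return f"host-{digest}.example.invalid"


def obfuscate(node, key=None):
    """Rewrite values only; keys such as "kind" are never touched."""
    if isinstance(node, dict):
        return {k: obfuscate(v, k) for k, v in node.items()}
    if isinstance(node, list):
        # List items inherit the key of the field that holds the list.
        return [obfuscate(item, key) for item in node]
    if isinstance(node, str) and key in HOST_KEYS:
        return obfuscate_host(node)
    return node


# Hypothetical, shortened version of a Route resource.
route = {
    "kind": "Route",
    "spec": {"host": "oauth.apps.example.test", "port": {"targetPort": 6443}},
    "status": {
        "ingress": [
            {
                "host": "oauth.apps.example.test",
                "routerCanonicalHostname": "router-default.apps.example.test",
            }
        ]
    },
}

out = obfuscate(route)
```

Hashing keeps the mapping consistent, so `spec.host` and `status.ingress[0].host` still read as the same host after obfuscation. Whether labels and annotations need their own rule is left open here.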
—
There's a great unmanaged package out there, used by insights-operator (https://github.com/openshift/insights-operator/blob/22ea9c972cda5e39219db339677b4fb9dde0ddff/pkg/gatherers/clusterconfig/machine_configs.go#L37-L68), that allows us to recursively go through a schema without actually knowing about it.
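The general technique of walking a document without knowing its schema can be sketched in a few lines of Python. This is a stand-in for the idea only, not the API of the package linked above; `iter_leaves` and the sample document are hypothetical.

```python
from typing import Any, Iterator, Tuple

Path = Tuple[Any, ...]


def iter_leaves(node: Any, path: Path = ()) -> Iterator[Tuple[Path, Any]]:
    """Yield (path, value) for every scalar, whatever the document's shape.

    Empty dicts and lists have no scalar leaves, so they yield nothing.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_leaves(value, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from iter_leaves(item, path + (index,))
    else:
        yield path, node


# Hypothetical document; no schema is consulted while walking it.
doc = {"spec": {"host": "a.example"}, "items": [{"kind": "Route"}, 7]}
leaves = list(iter_leaves(doc))
```

Each leaf comes with its full path, so an obfuscator could decide per path (for example `("spec", "host")`) whether and how to rewrite the value.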
AC:
- tooling should support reading and obfuscating schema values instead of relying on a line-based approach
- unit testing