  1. OCP Technical Release Team
  2. TRT-2179

FallbackRelease caching is not working due to cache key issues


    • Type: Story
    • Resolution: Done
    • Priority: Major
    • Quality / Stability / Reliability

      We've discovered that the fallback release persistent cache in BigQuery is not working as intended. It's supposed to cache basis results for a longer period (days), but lookups keep missing because the cache keys contain changing values. As a result we're rewriting the cache every 4h, which is costing a lot of money.

      Sample key from this query:

      SELECT key,modified_time FROM `openshift-gce-devel.ci_analysis_us.cached_data` WHERE TIMESTAMP_TRUNC(modified_time, HOUR) >= TIMESTAMP("2025-06-20T10:00:00") order by modified_time desc
      
      cc:FallbackBaseTestStatus~{\"BaseRelease\":\"4.17\",\"BaseStart\":\"2024-09-01T00:00:00Z\",\"BaseEnd\":\"2024-10-01T00:00:00Z\",\"ReqOptions\":{\"BaseRelease\":{\"release\":\"4.18\",\"start\":\"2025-01-26T00:00:00Z\",\"end\":\"2025-02-25T23:59:59Z\"},\"SampleRelease\":{\"release\":\"4.19\",\"start\":\"2025-06-20T00:00:00Z\",\"end\":\"2025-06-27T12:00:00Z\"},\"VariantOption\":{\"column_group_by\":{\"Architecture\":{},\"Network\":{},\"Platform\":{},\"Topology\":{}},\"db_group_by\":{\"Architecture\":{},\"FeatureSet\":{},\"Installer\":{},\"Network\":{},\"Platform\":{},\"Suite\":{},\"Topology\":{},\"Upgrade\":{}},\"include_variants\":{\"Architecture\":[\"amd64\"],\"CGroupMode\":[\"v2\"],\"ContainerRuntime\":[\"crun\",\"runc\"],\"FeatureSet\":[\"default\",\"techpreview\"],\"Installer\":[\"ipi\",\"upi\"],\"JobTier\":[\"blocking\",\"informing\",\"standard\"],\"LayeredProduct\":[\"none\"],\"Network\":[\"ovn\"],\"Owner\":[\"eng\",\"service-delivery\"],\"Platform\":[\"aws\",\"azure\",\"gcp\",\"metal\",\"rosa\",\"vsphere\"],\"Topology\":[\"ha\",\"microshift\"]}},\"AdvancedOption\":{\"minimum_failure\":3,\"confidence\":95,\"pity_factor\":5,\"pass_rate_required_new_tests\":95,\"pass_rate_required_all_tests\":0,\"ignore_missing\":false,\"ignore_disruption\":true,\"flake_as_failure\":false,\"include_multi_release_analysis\":true},\"CacheOption\":{\"ForceRefresh\":true,\"CRTimeRoundingFactor\":14400000000000,\"SkipCacheWrites\":false},\"TestIDOptions\":[{}]}}
      

      Looking through the timestamps, it's clear we're updating every 4h.

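      Including SampleRelease in the key causes this immediately: its end time (2025-06-27T12:00:00Z in the sample above) presumably moves forward as time passes, and the CRTimeRoundingFactor of 14400000000000 ns is exactly 4h, which matches the update interval. The key also appears to carry base release info twice: once at the top level (BaseRelease/BaseStart/BaseEnd) and again inside ReqOptions.BaseRelease.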

      This code path needs more explicit key generation so we can control exactly which data goes into the key. Deriving keys from serialized objects and their public/private fields has gotten us into trouble many times.

      It would be nice if the keys matched for all releases that use the same fallback release (e.g. both 4.19 and 4.20 use 4.17), but includeVariants and other options tend to change between them. Not all options actually affect the query, so this would need to be handled very carefully. IncludeVariants probably is in the query, so this is probably a dead end.

              sgoeddel@redhat.com Stephen Goeddel
              rhn-engineering-dgoodwin Devan Goodwin