Cost Management / COST-4296

Trino's map_concat overwrites values of the same key


    • Type: Task
    • Resolution: Not a Bug
    • Priority: Major

      This affects both Pod and Volume labels!

       

      Things to note when fixing this, if we need to reconcile labels with the same key:
      If the object itself has a label, that label should be respected (e.g. if the pod is labelled env = demo, the cluster label env = demo_cluster has no effect).

      Labels from the lowest (most specific) level should override those from higher levels.
       

      During testing, rh-ee-plopezpe discovered that the persistent volume labels were not showing up in the tag options while reviewing the all_labels work and asked me to investigate.

      After downloading the CSV from Trino, we do see both label columns populated:

      persistentvolume_labels    persistentvolumeclaim_labels
      label_volume:bravo         label_volume:stor_bravo

      PV labels are added in nise as:

       

      volumes:
        - volume:
          volume_name: pvc-volume_1
          storage_class: gp2
          volume_request_gig: 2
          labels: label_volume:bravo
          volume_claims:
          - volume_claim:
            volume_claim_name: data_bravo
            pod_name: bravo
            labels: label_volume:stor_bravo
            capacity_gig: 2
            volume_claim_usage_gig:
              full_period: 2

      However, if you go to:

      You won't see them in the API:

       

      { "key": "volume", "values": [ "stor_alpha", "stor_bravo", "stor_charlie", "stor_nod" ], "enabled": true }

       

      We attempted to modify the yaml to give it a different value:

       

      volumes:
        - volume:
          volume_name: pvc-volume_1
          storage_class: gp2
          volume_request_gig: 8
          labels: label_notvol:notavclabel
          volume_claims:
          - volume_claim:
            volume_claim_name: data_nod
            pod_name: nod
            labels: label_volume:stor_nod|label_test:nice|label_git:commit|label_stack:overflow|label_preference:tag_tests|label_storageclass:dua
            capacity_gig: 10
            volume_claim_usage_gig:
              full_period: 5

      After enabling the key, this resulted in the values from that column showing up in the results:

      postgres=# select distinct(volume_labels) from reporting_ocpusagelineitem_daily_summary where persistentvolumeclaim_capacity_gigabyte_months is not null;
                                                              volume_labels
      ------------------------------------------------------------------------------------------------------------------------------
       {"volume": "charlie"}
       {"git": "commit", "test": "nice", "stack": "overflow", "notvol": "notavclabel", "volume": "stor_nod", "storageclass": "dua"}
       {"volume": "bravo"}
       {"volume": "alpha"}
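For reference, the nise label strings appear to map onto the JSON keys in the output above by splitting on `|` and dropping the `label_` prefix. The following is a minimal sketch of that mapping, not the actual pipeline code; the helper name `parse_labels` is hypothetical, and it does not model any filtering by enabled keys.

```python
def parse_labels(raw: str, prefix: str = "label_") -> dict[str, str]:
    """Turn 'label_a:1|label_b:2' into {'a': '1', 'b': '2'} (illustrative only)."""
    labels: dict[str, str] = {}
    for pair in raw.split("|"):
        if not pair:
            continue  # tolerate empty segments such as a trailing '|'
        key, _, value = pair.partition(":")
        # Assumption: the 'label_' prefix is stripped before the key is stored.
        labels[key.removeprefix(prefix)] = value
    return labels


pv = parse_labels("label_notvol:notavclabel")
pvc = parse_labels("label_volume:stor_nod|label_test:nice|label_git:commit")
print(pv)
print(pvc)
```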
      

      This made me take a closer look at how Trino's map_concat works. After reading their docs, I discovered that map_concat returns the union of all the given maps. If a key is found in multiple given maps, that key’s value in the resulting map comes from the last one of those maps. That's when I discovered we are overwriting the same key with the value from the last map in the list.
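The last-map-wins behaviour can be modelled outside Trino with plain dictionaries. This is only a sketch of the documented semantics, using the PV and PVC label values from this ticket; the Python function below is a stand-in written for illustration, not Trino's implementation.

```python
def map_concat(*maps: dict[str, str]) -> dict[str, str]:
    """Union of all maps; on duplicate keys the value from the last map wins."""
    merged: dict[str, str] = {}
    for m in maps:
        merged.update(m)  # later maps overwrite earlier values for the same key
    return merged


pv_labels = {"volume": "bravo"}
pvc_labels = {"volume": "stor_bravo"}

# Same inputs, different argument order, different winner.
pvc_last = map_concat(pv_labels, pvc_labels)
pv_last = map_concat(pvc_labels, pv_labels)
print(pvc_last)  # {'volume': 'stor_bravo'}
print(pv_last)   # {'volume': 'bravo'}

# Keys that do not collide survive from every map.
no_collision = map_concat({"notvol": "notavclabel"}, {"volume": "stor_nod"})
print(no_collision)  # {'notvol': 'notavclabel', 'volume': 'stor_nod'}
```

Under this model, listing the maps from least specific to most specific makes the most specific label win, which matches the reconciliation rule noted at the top of the description.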

      How to reproduce
      1. Run the smoke test with a breakpoint.
      2. Use the make command to populate the Jinja vars for the Trino SQL file, and then run it in Trino.
      3. Now in Postgres run this query and look at the results:

      select distinct(volume_labels) from reporting_ocpusagelineitem_daily_summary where persistentvolumeclaim_capacity_gigabyte_months is not null;
      

      Results:

      ------------------------------------------------------------------------------------------------------------------------------
       {"volume": "stor_bravo"}
       {"git": "commit", "test": "nice", "stack": "overflow", "notvol": "notavclabel", "volume": "stor_nod", "storageclass": "dua"}
       {"volume": "stor_charlie"}
       {"volume": "stor_alpha"}
      

      4. Now update the trino sql code to be:

              map_concat(
                  cast(json_parse(coalesce(nli.node_labels, '{}')) as map(varchar, varchar)),
                  cast(json_parse(coalesce(nsli.namespace_labels, '{}')) as map(varchar, varchar)),
                  cast(json_parse(sli.persistentvolumeclaim_labels) as map(varchar, varchar)),
                  cast(json_parse(sli.persistentvolume_labels) as map(varchar, varchar))
              ) as volume_labels,
      

      5. Then rerun the test and run the same Postgres query. The "volume" key's values now all come from the sli.persistentvolume_labels column.
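When checking a row by hand, it can help to list which keys would be overwritten before the maps are merged. Below is a small sketch, assuming the four label columns have already been parsed into dictionaries; the helper `colliding_keys` and the sample row are hypothetical, with the empty node and namespace labels chosen only for illustration.

```python
def colliding_keys(named_maps: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return {key: {source: value}} for keys set to different values in more than one map."""
    seen: dict[str, dict[str, str]] = {}
    for source, labels in named_maps.items():
        for key, value in labels.items():
            seen.setdefault(key, {})[source] = value
    # Keep only keys whose sources disagree on the value.
    return {k: v for k, v in seen.items() if len(set(v.values())) > 1}


# Hypothetical row, in the same order as the map_concat arguments in step 4.
row = {
    "node_labels": {},
    "namespace_labels": {},
    "persistentvolumeclaim_labels": {"volume": "stor_bravo"},
    "persistentvolume_labels": {"volume": "bravo"},
}
conflicts = colliding_keys(row)
print(conflicts)
```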

            myersco Cody Myers