Clair / CLAIRDEV-80

indexer: "browse" API


    • Type: Feature
    • Resolution: Unresolved
    • Priority: Minor
    • indexer, indexer-api

      Capturing a conversation Crozzy and I had about a possible API addition to Clair(core): the ability to query the contents of the indexer database and what components were used on given manifests.

      Chat log

      hank:
      Should we consider adding API for this?
      well, not "scan status", but "manifest status"

      Crozzy:
      We'd be pretty binary in the results, right? If we're not persisting anything on failure

      hank:
      yeah, I guess we'd need the indexer events log to make it worthwhile
      or some integration with a log backend

      Crozzy:
      Maybe it'd be useful to be able to extract all indexed manifests?
      But I think we'd need the scanner version information in there or it's just like "yeah we saw it, might have been 3 years ago"
      I do feel like the metaview type API discussion does keep coming up though

      hank:
      "metaview" ?

      Crozzy:
      yeah, like when you come to Clair for information you need some context, basically the manifest hash. There is no way to say "what do you have?" and then extract specific results

      hank:
      yeah, there are no exploratory APIs

      Crozzy:
      that's the word 😄
      IDK, the "list of manifests thing" might be useful in some specific situations but the really powerful exploratory APIs probably answer questions like "show me all the manifests with this package" (maybe affected manifests does this but isn't really implemented by anyone externally)

      hank:
      yeah, and the database isn't structured to answer those queries efficiently

      Crozzy:
      right

      hank:
      we could prototype a little REST API for it, perhaps with a minimal UI as user 0
      to also finally kill those "I opened it in the browser and it did nothing" issues that pop up occasionally

      Crozzy:
      a prototype for the manifest list (for want of a better phrase)? It would be nice to have a simple UI for that
      I'm straight away seeing a need for a search 😄

      hank:
      It would really just need "lookup" and "list", right?

      Crozzy:
      yeah I guess, probably overkill to allow search based on partial digests
      Do you think a response like this makes sense:

      /manifests
      
      [
          {
              "id": 123,
              "hash": "sha...",
              "scanners": [
                  {
                      "id": 123,
                      "name": "dpkg",
                      "version": "6",
                      "kind": "package"
                  },
                  {
                      "id": 234,
                      "name": "debian",
                      "version": "3",
                      "kind": "distribution"
                  },
                  {
                      "id": 345,
                      "name": "whiteout",
                      "version": "1",
                      "kind": "file"
                  }
              ]
          },
          {...}
      ]
      

      Or just a list of digests and a separate endpoint to see a manifest's scanners?
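
The second option (a plain list of digests plus a per-manifest lookup) can be sketched as two small functions. This is only an illustration in Python, not Clair code: `INDEX`, `list_manifests`, `lookup_manifest`, and the sample digests are hypothetical stand-ins for the indexer database and whatever handlers would sit on top of it.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Detector:
    name: str
    version: str
    kind: str  # "package", "distribution", or "file", as in the example above


# Hypothetical in-memory stand-in for the indexer database.
# Digests are placeholders; real ones would be full content hashes.
INDEX: Dict[str, List[Detector]] = {
    "sha256:aaa": [
        Detector("dpkg", "6", "package"),
        Detector("debian", "3", "distribution"),
    ],
    "sha256:bbb": [Detector("whiteout", "1", "file")],
}


def list_manifests() -> List[str]:
    """The "list" operation: digests only, in a stable order."""
    return sorted(INDEX)


def lookup_manifest(digest: str) -> Optional[List[Dict[str, str]]]:
    """The "lookup" operation: detectors recorded for one manifest, None if unknown."""
    detectors = INDEX.get(digest)
    if detectors is None:
        return None
    return [asdict(d) for d in detectors]
```

Keeping "list" down to digests makes each entry small and uniform, at the cost of one extra request per manifest when the scanner details are needed.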

      hank:
      1. no database IDs
      ideally the responses would be fully normalized, not nested
      although that makes pagination (somewhat) harder
      only a little, though. It'd be interesting to work out how to thread the state to make everything work correctly

      GET /manifests
      
      Link: </manifests?next=...>; rel="next"
      Content-Type: application/vnd.clair.manifestbrowse+json
      
      [
        [
          {
            "digest": "sha256:...",
            "detectors": [
              0,
              2
            ]
          }
        ],
        [
          {
            "name": "a",
            "version": "1",
            "kind": "package"
          },
          {
            "name": "a",
            "version": "1",
            "kind": "distribution"
          },
          {
            "name": "a",
            "version": "1",
            "kind": "file"
          }
        ]
      ]
      

      a single result is the same, but has a single object in the first array
      I guess we could make the outermost item an object
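
One way to get from the nested shape to the normalized one is sketched below in Python. It assumes the integers under "detectors" are positions in the second array, drops the database IDs, and deduplicates detectors by (name, version, kind). The `normalize` function and the sample records are hypothetical, not part of Clair.

```python
def normalize(nested):
    """Turn nested manifest records into [manifests, detectors] with index references."""
    detectors = []  # deduplicated detector table (the second array)
    positions = {}  # (name, version, kind) -> index into `detectors`
    manifests = []  # the first array
    for record in nested:
        refs = []
        for scanner in record["scanners"]:
            key = (scanner["name"], scanner["version"], scanner["kind"])
            if key not in positions:
                positions[key] = len(detectors)
                # Copy only the descriptive fields; the database "id" is left out.
                detectors.append({"name": key[0], "version": key[1], "kind": key[2]})
            refs.append(positions[key])
        manifests.append({"digest": record["hash"], "detectors": refs})
    return [manifests, detectors]


# Made-up input in the nested shape proposed earlier in the chat.
nested = [
    {"id": 1, "hash": "sha256:aaa", "scanners": [
        {"id": 123, "name": "dpkg", "version": "6", "kind": "package"},
        {"id": 345, "name": "whiteout", "version": "1", "kind": "file"},
    ]},
    {"id": 2, "hash": "sha256:bbb", "scanners": [
        {"id": 123, "name": "dpkg", "version": "6", "kind": "package"},
        {"id": 234, "name": "debian", "version": "3", "kind": "distribution"},
    ]},
]

manifests, detectors = normalize(nested)
```

Here the shared "dpkg" detector is stored once and referenced by index 0 from both manifests, which is the space saving the normalized form is after.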

      Crozzy:
      So in your example the detectors are referring to the index of the scanner in the scanners array?

      hank:
      yes
      (IMO our next HTTP+JSON API should work this way)
      alternatively, json-ld might be good enough?
      eh, I'm remembering how much of a pain in the ass json-ld is
      (also why SPDX3 takes some time to figure out what's going on)
      I'm basically saying we should represent things like the "flattened" json-ld encoding
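
On the client side, the index-based references confirmed above could be resolved as follows. This is a minimal Python sketch: `resolve` is a hypothetical helper, and the sample body uses made-up digest and detector values in the two-array shape from the chat. A single-manifest response would go through the same code, since it only differs by having one object in the first array.

```python
import json

# Sample body in the two-array shape sketched above; values are made up.
BODY = """
[
  [
    {"digest": "sha256:aaa", "detectors": [0, 2]}
  ],
  [
    {"name": "dpkg", "version": "6", "kind": "package"},
    {"name": "debian", "version": "3", "kind": "distribution"},
    {"name": "whiteout", "version": "1", "kind": "file"}
  ]
]
"""


def resolve(body):
    """Map each digest to the detector objects its indices point at."""
    manifests, detectors = json.loads(body)
    return {m["digest"]: [detectors[i] for i in m["detectors"]] for m in manifests}


resolved = resolve(BODY)
print([d["name"] for d in resolved["sha256:aaa"]])  # prints ['dpkg', 'whiteout']
```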

              Assignee: Unassigned
              hdonnay Henry Donnay
              Joseph Crosland