Feature
Resolution: Unresolved
Minor
Capturing a conversation Crozzy and I had about a possible API addition to Clair(core): the ability to query the contents of the indexer database and what components were used on given manifests.
hank:
Should we consider adding API for this?
well, not "scan status", but "manifest status"
Crozzy:
We'd be pretty binary in the results, right? If we're not persisting anything on failure
hank:
yeah, I guess we'd need the indexer events log to make it worthwhile
or some integration with a log backend
Crozzy:
Maybe it'd be useful to be able to extract all indexed manifests?
But I think we'd need the scanner version information in there or it's just like "yeah we saw it, might have been 3 years ago"
I do feel like the metaview type API discussion does keep coming up though
hank:
"metaview" ?
Crozzy:
yeah, like when you come to Clair for information you need some context, basically the manifest hash. There is no way to say "what do you have" and then extract specific results
hank:
yeah, there are no exploratory APIs
Crozzy:
that's the word 😄
IDK, the "list of manifests thing" might be useful in some specific situations but the really powerful exploratory APIs probably answer questions like "show me all the manifests with this package" (maybe affected manifests does this but isn't really implemented by anyone externally)
hank:
yeah, and the database isn't structured to answer those queries efficiently
Crozzy:
right
hank:
we could prototype a little REST API for it, perhaps with a minimal UI as user 0
to also finally kill those "I opened it in the browser and it did nothing" issues that pop up occasionally
Crozzy:
a prototype for the manifest list (for want of a better phrase)? It would be nice to have a simple UI for that
I'm straight away seeing a need for a search 😄
hank:
It would really just need "lookup" and "list", right?
Crozzy:
yeah I guess, probably overkill to allow search based on partial digests
Do you think a response like this makes sense:
/manifests
[
  {
    "id": 123,
    "hash": "sha...",
    "scanners": [
      { "id": 123, "name": "dpkg", "version": "6", "kind": "package" },
      { "id": 234, "name": "debian", "version": "3", "kind": "distribution" },
      { "id": 345, "name": "whiteout", "version": "1", "kind": "file" }
    ]
  },
  {...}
]
Or just a list of digests and a separate endpoint to see a manifest's scanners?
hank:
1. no database IDs
ideally the responses would be fully normalized, not nested
although that makes pagination (somewhat) harder
only a little, though. It'd be interesting to work out how to thread the state to make everything work correctly
GET /manifests

Link: </manifests?next=...>; rel="next"
Content-Type: application/vnd.clair.manifestbrowse+json

[
  [
    { "digest": "sha256:...", "detectors": [ 0, 2 ] }
  ],
  [
    { "name": "a", "version": "1", "kind": "package" },
    { "name": "a", "version": "1", "kind": "distribution" },
    { "name": "a", "version": "1", "kind": "file" }
  ]
]
a single result is the same, but has a single object in the first array
I guess we could make the outermost item an object
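For illustration, here is a rough sketch of how the nested shape proposed earlier could be turned into the normalized one, assuming each "detectors" entry is a position in the second array. The function name `normalize`, the sample digests, and the choice to deduplicate on (name, version, kind) are assumptions made for this sketch, not anything Clair implements.

```python
def normalize(manifests):
    """Flatten nested manifest records into [manifests, detectors].

    Database IDs are dropped; detectors are deduplicated on
    (name, version, kind) and referenced by array position.
    """
    detectors = []   # shared table of unique detectors
    index_of = {}    # (name, version, kind) -> position in `detectors`
    flat = []
    for m in manifests:
        refs = []
        for s in m["scanners"]:
            key = (s["name"], s["version"], s["kind"])
            if key not in index_of:
                index_of[key] = len(detectors)
                detectors.append(
                    {"name": s["name"], "version": s["version"], "kind": s["kind"]}
                )
            refs.append(index_of[key])
        flat.append({"digest": m["hash"], "detectors": refs})
    return [flat, detectors]


# Hypothetical input in the nested shape (digests are placeholders).
nested = [
    {"id": 1, "hash": "sha256:aaa", "scanners": [
        {"id": 123, "name": "dpkg", "version": "6", "kind": "package"},
        {"id": 234, "name": "debian", "version": "3", "kind": "distribution"},
        {"id": 345, "name": "whiteout", "version": "1", "kind": "file"},
    ]},
    {"id": 2, "hash": "sha256:bbb", "scanners": [
        {"id": 123, "name": "dpkg", "version": "6", "kind": "package"},
        {"id": 345, "name": "whiteout", "version": "1", "kind": "file"},
    ]},
]

flat_manifests, detector_table = normalize(nested)
```

If each page carried its own detector table like this, the indices would only be meaningful within that page, which is one possible way to keep pagination simple; whether that is the right trade-off is an open question here.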
Crozzy:
So in your example the detectors are referring to the index of the scanner in the scanners array?
hank:
yes
(IMO our next HTTP+JSON API should work this way)
alternatively, json-ld might be good enough?
eh, I'm remembering how much of a pain in the ass json-ld is
(also why SPDX3 takes some time to figure out what's going on)
I'm basically saying we should represent things like the "flattened" json-ld encoding
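As a sketch of what consuming such a flattened response might look like on the client side, the snippet below resolves the detector indices back into objects and pulls the rel="next" target out of a Link header. The helper names `resolve` and `next_link` are hypothetical, the cursor value is a placeholder, and the regex is a simplification that handles only the simple header form shown in the example above.

```python
import re


def resolve(body):
    """Expand index references in a [manifests, detectors] body."""
    manifests, detectors = body
    return [
        {
            "digest": m["digest"],
            # Each entry is a position in the shared detectors array.
            "detectors": [detectors[i] for i in m["detectors"]],
        }
        for m in manifests
    ]


def next_link(link_header):
    """Return the rel="next" target of a Link header, or None."""
    # Simplified parsing: matches <target>; rel="value" pairs only.
    for target, rel in re.findall(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?', link_header):
        if rel == "next":
            return target
    return None


# Body in the shape of the example response; the digest is a placeholder.
body = [
    [{"digest": "sha256:example", "detectors": [0, 2]}],
    [
        {"name": "a", "version": "1", "kind": "package"},
        {"name": "a", "version": "1", "kind": "distribution"},
        {"name": "a", "version": "1", "kind": "file"},
    ],
]

resolved = resolve(body)
following = next_link('</manifests?next=cursor>; rel="next"')
```

A client would keep requesting the `next_link` target until no rel="next" entry is present, resolving each page against that page's own detector array.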