Details

Type: Epic
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: indexer
Labels:
None

Epic Name:
Precise Build Metadata
Blocked:
True
Blocked Reason:

Waiting on build system changes so that the required information is present.

Ready:
False
Epic Status:
To Do
Hierarchy Progress:
50
SFDC Cases Links:
SFDC Cases Counter:

Description

TL;DR

Red Hat's vulnerability data is organized according to Common Platform Enumeration (CPE) names. With the CPE and the Name-Epoch-Version-Release-Architecture (NEVRA) string for a given rpm package, one can evaluate whether any given advisory in the vulnerability data applies to it.
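As a rough illustration of the NEVRA half of that lookup, the sketch below splits a NEVRA string into its fields. It assumes the common `name-[epoch:]version-release.arch` spelling; `parse_nevra` and the `NEVRA` tuple are hypothetical names, not Claircore APIs, and the sample strings are made up. Python is used only for brevity (Claircore itself is written in Go). Deciding whether an advisory applies also requires rpm's version-ordering rules, which this sketch does not attempt.

```python
from typing import NamedTuple


class NEVRA(NamedTuple):
    name: str
    epoch: int
    version: str
    release: str
    arch: str


def parse_nevra(s: str) -> NEVRA:
    """Split 'name-[epoch:]version-release.arch' into its parts.

    Assumption: the epoch, if present, is written before the version
    and separated by ':'; a missing epoch is treated as 0.
    """
    rest, _, arch = s.rpartition(".")        # arch follows the last '.'
    rest, _, release = rest.rpartition("-")  # release follows the last '-'
    name, _, ev = rest.rpartition("-")       # [epoch:]version follows the next '-'
    epoch, sep, version = ev.partition(":")
    if not sep:                              # no ':' means no explicit epoch
        epoch, version = "0", ev
    if not (name and version and release and arch):
        raise ValueError(f"not a NEVRA string: {s!r}")
    return NEVRA(name, int(epoch), version, release, arch)


if __name__ == "__main__":
    print(parse_nevra("zlib-1.2.12-5.el9.x86_64"))
```

Splitting from the right keeps package names that themselves contain dashes intact.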

Claircore's model of software has a "Package" type which contains the name, version, etc. of a discovered package for any given package manager under consideration. A Package can be associated with "Distribution" and "Repository" types that are discovered in parallel processes*.

Claircore's handling of Red Hat produced layers (a.k.a. rhel) overloads the Repository type to pass through the CPEs. Roughly, the rhel Repository Indexer† looks for content_manifest files, takes the discovered content_sets values (called "repositories"), looks them up in a mapping provided by PST, and constructs Repository objects for them. The "Coalescing" step (when all the per-layer domain types are brought together to produce a coherent view of the container image) then takes the Repository and Package objects and creates the set of Package ⨯ Repository (read: CPE). That is to say, given 3 Package and 3 Repository, the result is 9 "logical" packages. These are all present in the system for the matching step‡.
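The coalescing behaviour described above amounts to a Cartesian product. The following is a minimal sketch, not Claircore's actual code: the `Package` and `Repository` tuples and the `coalesce` function are simplified, hypothetical stand-ins, and the CPE strings and the packages other than zlib are placeholders.

```python
from itertools import product
from typing import List, NamedTuple, Tuple


class Package(NamedTuple):
    name: str
    version: str


class Repository(NamedTuple):
    name: str
    cpe: str  # placeholder string, not a real CPE name


def coalesce(
    packages: List[Package], repositories: List[Repository]
) -> List[Tuple[Package, Repository]]:
    """Pair every Package with every Repository (Package x Repository)."""
    return list(product(packages, repositories))


if __name__ == "__main__":
    packages = [
        Package("zlib", "1.2.12-5"),
        Package("pkg-b", "1.0-1"),
        Package("pkg-c", "2.0-1"),
    ]
    repositories = [
        Repository("baseos", "cpe:example:baseos"),
        Repository("appstream", "cpe:example:appstream"),
        Repository("extras", "cpe:example:extras"),
    ]
    logical = coalesce(packages, repositories)
    print(len(logical))  # 3 packages * 3 repositories = 9 logical packages
```

The count grows multiplicatively, so every extra content set on an image adds one logical package per installed package.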

The obvious problem with this is that it will generate both "phantom packages" and apparent duplicates. For example: given a container with two CPEs associated with it – baseos and appstream – and a package – zlib-1.2.12-5 – two logical packages will be created: zlib-1.2.12-5/baseos and zlib-1.2.12-5/appstream. The package manager only got the package from one place, so one of these packages is a phantom. When compared against an advisory for zlib that contains both baseos and appstream, each CPE will match one of the logical packages. At a glance this looks like a duplicate, but it is not: it results from either a lack of specificity in the indexing step (producing the phantom package) or excessive generality in the CPE list (including both baseos and appstream when the package is only present in one).
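The zlib example can be reproduced with a toy matcher. This is a sketch under simplifying assumptions: `LogicalPackage`, `Advisory`, and `matching_packages` are hypothetical names, the CPE strings are placeholders, and version comparison is skipped entirely so that only the CPE association is exercised.

```python
from typing import FrozenSet, List, NamedTuple


class LogicalPackage(NamedTuple):
    name: str
    version: str
    cpe: str  # the CPE this logical package was paired with


class Advisory(NamedTuple):
    package: str
    cpes: FrozenSet[str]  # CPEs the advisory is published under


def matching_packages(
    advisory: Advisory, logical_packages: List[LogicalPackage]
) -> List[LogicalPackage]:
    """Return logical packages whose name and CPE both appear in the advisory.

    Version ranges are deliberately ignored in this sketch.
    """
    return [
        lp
        for lp in logical_packages
        if lp.name == advisory.package and lp.cpe in advisory.cpes
    ]


BASEOS = "cpe:example:baseos"        # placeholder, not a real CPE name
APPSTREAM = "cpe:example:appstream"  # placeholder, not a real CPE name

if __name__ == "__main__":
    # One installed zlib, paired with both CPEs by the coalescing step.
    logical = [
        LogicalPackage("zlib", "1.2.12-5", BASEOS),
        LogicalPackage("zlib", "1.2.12-5", APPSTREAM),
    ]
    advisory = Advisory("zlib", frozenset({BASEOS, APPSTREAM}))
    hits = matching_packages(advisory, logical)
    print(len(hits))  # 2 matches for a single installed package
```

One of the two hits corresponds to the phantom package; nothing in the matcher's input says which.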

The solution to this problem would seem to be:

- Examine the package manager's metadata and determine the actual repository that the package came from.
- Record the repositories the package manager knows about in addition to the content_sets and delay the CPE resolution.
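The two steps above can be sketched as an index phase that records where each package came from and a later phase that resolves CPEs. Everything here is an assumption made for illustration: the `IndexedPackage` type, the `origins` mapping (standing in for whatever the package manager's metadata yields), and the `repo_to_cpes` mapping are hypothetical, and the CPE strings are placeholders.

```python
from typing import Dict, List, NamedTuple, Optional, Set, Tuple


class IndexedPackage(NamedTuple):
    name: str
    version: str
    origin_repo: Optional[str]  # repository id, or None if not recorded


def index_packages(
    rpm_packages: List[Tuple[str, str]], origins: Dict[str, str]
) -> List[IndexedPackage]:
    """Index phase: attach the recorded origin repository to each package.

    No CPE is looked up here; resolution is delayed until matching.
    """
    return [IndexedPackage(n, v, origins.get(n)) for n, v in rpm_packages]


def resolve_cpes(
    pkg: IndexedPackage, repo_to_cpes: Dict[str, Set[str]]
) -> Set[str]:
    """Match phase: map the package's own repository to CPEs.

    A package with no recorded origin gets no CPE instead of every CPE.
    """
    if pkg.origin_repo is None:
        return set()
    return set(repo_to_cpes.get(pkg.origin_repo, set()))


if __name__ == "__main__":
    repo_to_cpes = {
        "baseos": {"cpe:example:baseos"},        # placeholder values
        "appstream": {"cpe:example:appstream"},
    }
    indexed = index_packages(
        [("zlib", "1.2.12-5"), ("mystery", "0.1-1")],
        origins={"zlib": "baseos"},
    )
    for pkg in indexed:
        print(pkg.name, sorted(resolve_cpes(pkg, repo_to_cpes)))
```

Keeping the repository-to-CPE mapping out of the index means a corrected mapping takes effect without re-indexing the layer.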

The reason this wasn't done previously is that rpm does not record repository information. That information is recorded by dnf (or yum), but support for reading it was never prioritized for implementation.

PROJQUAY-5182 was reported and seemed to stem from something like the issue above – the Package/Repository association heuristic being too broad. Through the investigation into PROJQUAY-5182 (and subsequently PROJQUAY-5185), we determined that there's insufficient information in a layer produced by OSBS to be able to determine which CPE(s) a Package should be associated with. Moreover, a Package may not be associated with any content_set (and correspondingly, CPE) that the content_manifest contains.

This epic is the overarching issue for tracking the work while we design and implement a solution with the relevant teams.

*: That is, every domain type (Package, Distribution, Repository) is discovered by an independent piece of code that is unconcerned with any other type.

†: An "Indexer" is Claircore's terminology for the code that examines a layer for a domain type. It's (in theory) a pure function of type LayerContent -> DomainType. Modeling them like that means it can be memoized as (IndexerImplementation, LayerContent) -> DomainType and is the key to reducing work when presented with multiple copies of the same layer.
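A minimal sketch of that memoization idea follows. It is not Claircore's implementation: `run_indexer`, the `Cache` alias, and the toy indexer are hypothetical, and using a SHA-256 digest of the layer bytes as the stand-in for LayerContent in the cache key is an assumption of this example.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Cache key: (indexer implementation name, digest of the layer content).
Cache = Dict[Tuple[str, str], List[str]]


def run_indexer(
    cache: Cache,
    indexer_name: str,
    indexer: Callable[[bytes], List[str]],
    layer_content: bytes,
) -> List[str]:
    """Run a pure indexer at most once per distinct (indexer, layer) pair."""
    digest = hashlib.sha256(layer_content).hexdigest()
    key = (indexer_name, digest)
    if key not in cache:
        # Safe only because the indexer is assumed to be a pure function.
        cache[key] = indexer(layer_content)
    return cache[key]


def toy_package_indexer(layer_content: bytes) -> List[str]:
    """Toy indexer: treats the layer as whitespace-separated package names."""
    return sorted(layer_content.decode().split())


if __name__ == "__main__":
    cache: Cache = {}
    layer = b"zlib bash"
    first = run_indexer(cache, "toy-packages", toy_package_indexer, layer)
    second = run_indexer(cache, "toy-packages", toy_package_indexer, layer)
    print(first, first is second)  # second call is served from the cache
```

Including the indexer's name in the key keeps results from different indexers over the same layer separate.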

‡: "Matching" is the Claircore term for matching an index of container contents with a set of vulnerabilities. This is sometimes what people mean when they say "scan."

Attachments

Issue Links

is related to

PROJQUAY-5543 Build rhcc survey tool

Closed

relates to

PROJQUAY-5182 OSP 17.1 containers are being graded against 9.1 instead of 9.0 EUS

Closed

CLAIRDEV-41 Examine dnf database in addition to rpm database

To Do

Activity

People

Assignee: Henry Donnay

Reporter: Henry Donnay

Contributors: Joseph Crosland

Votes: 0

Watchers: 9

Dates

Created: 2023/03/17 1:40 AM

Updated: 2024/04/11 6:44 PM

OSBS layers contain insufficient information to make CPE assertions