Provenance and Standards

This document explains how attestable builds produce verifiable claims about software and how those claims fit into the broader supply chain security ecosystem. After reading it, you'll understand what SLSA and in-toto are, why standards matter for interoperability, what artifacts Kettle produces, and how to interpret provenance documents.

We assume familiarity with the attestable builds concept from "What Are Attestable Builds?" and the build flow from "How It Works". No prior knowledge of supply chain security standards is required.

Why Standards Matter

Attestable builds produce claims about how software was built. Those claims are only useful if others can understand and verify them. A proprietary format creates a silo: your tools can read it, but nothing else in the ecosystem can. Standards create ecosystems where provenance is usable by other tools, verifiable by third parties without special knowledge, and auditable against known criteria.

The supply chain security space has converged on two complementary standards: SLSA (Supply-chain Levels for Software Artifacts) for defining security requirements and levels, and in-toto for the attestation format itself. Using these standards means that security scanners, package registries, deployment systems, and audit tools can all consume attestable build output without custom integration work. It also means that when a customer asks "do you meet SLSA Build L3?", you can point to concrete evidence rather than explaining a proprietary system.

SLSA: Supply-chain Levels for Software Artifacts

SLSA (pronounced "salsa") is a set of incrementally adoptable guidelines for supply chain security, established by industry consensus. It provides two things: a common vocabulary for talking about supply chain security, and a framework for evaluating the trustworthiness of software artifacts.

The framework emerged from Google's internal Binary Authorization for Borg system, which they've used for years to ensure that production binaries are built from reviewed source through authorized build systems. SLSA generalizes these practices into levels that any organization can adopt incrementally.

The Build Track

SLSA organizes requirements into "tracks" that focus on different aspects of the supply chain. The Build track, which is most relevant to attestable builds, focuses on build integrity: ensuring that packages are built from the correct, unmodified sources and dependencies according to the build recipe defined by the software producer.

The Build track defines three levels:

Level	Summary	Key Requirements
L1	Provenance exists	Package has provenance showing how it was built. Can be unsigned. Prevents mistakes but trivial to forge.
L2	Hosted build platform	Provenance is signed by a hosted build platform. Forging requires an explicit attack, not just a mistake.
L3	Hardened builds	Build platform has strong tamper protection. Builds are isolated from each other. Signing keys are inaccessible to user-defined build steps.

Each level provides stronger guarantees but requires more investment. L1 catches mistakes and aids debugging. L2 deters unsophisticated adversaries and those who face legal or financial risk from evading controls. L3 prevents tampering during the build itself, even from insider threats or compromised credentials.

The progression is designed to be incremental. An organization can start at L1 (just generate provenance), move to L2 (switch to a hosted build platform that signs provenance), and eventually reach L3 (harden that platform with isolation and key protection). Each step provides value on its own.

Attestable Builds Achieve L3

Attestable builds using trusted execution environments (TEEs) achieve SLSA Build L3 through hardware enforcement:

L3 Requirement	How Attestable Builds Meet It
Provenance generated by build platform's trusted control plane	TEE generates provenance in its trusted control plane, not in user-defined steps
Provenance signed by build platform	Hardware-rooted key signs provenance. Key is derived from TEE attestation.
Builds isolated from one another	TEE hardware isolation prevents cross-build interference
Signing keys inaccessible to user-defined build steps	Keys derived from TEE hardware, never exposed to build scripts

The TEE provides stronger guarantees than typical L3 implementations because the isolation and key protection are hardware-enforced, not software-enforced. A compromised build script cannot escape the TEE's memory encryption. A malicious insider cannot extract signing keys because they're derived from hardware that only the CPU can access.

The in-toto Attestation Framework

While SLSA defines what security properties you need, in-toto defines how to express claims about software. It's an attestation framework that provides a standard structure for making signed statements about artifacts.

The core concept is simple: an attestation is a signed statement that says something about one or more software artifacts. The statement has a subject (what artifacts it's about), a predicate type (what kind of claim this is), and a predicate (the actual claim data).

{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [{ "name": "my-app", "digest": { "sha256": "abc123..." } }],
  "predicateType": "https://slsa.dev/provenance/v1",
  "predicate": {
    // claim-specific data
  }
}

The subject identifies artifacts by cryptographic digest, not by name alone. This means the statement is bound to specific bytes, not to a filename that could be reused for different content.
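To make the structure concrete, here is a minimal Python sketch that checks whether a parsed statement has the shape described above. The function name check_statement_shape is ours for illustration and is not part of any in-toto library. Real verifiers also check the signature envelope around the statement, which this sketch ignores.

```python
STATEMENT_V1 = "https://in-toto.io/Statement/v1"

def check_statement_shape(statement: dict) -> list:
    """Return a list of structural problems; an empty list means the shape looks right."""
    problems = []
    if statement.get("_type") != STATEMENT_V1:
        problems.append("unexpected _type")
    subjects = statement.get("subject") or []
    if not subjects:
        problems.append("no subjects")
    for i, subject in enumerate(subjects):
        # A subject must be bound to bytes by a digest, not identified by name alone.
        if not subject.get("digest"):
            problems.append(f"subject {i} has no digest")
    if not statement.get("predicateType"):
        problems.append("missing predicateType")
    return problems

# Hypothetical statement mirroring the example above.
example = {
    "_type": STATEMENT_V1,
    "subject": [{"name": "my-app", "digest": {"sha256": "abc123..."}}],
    "predicateType": "https://slsa.dev/provenance/v1",
    "predicate": {},
}
name_only = dict(example, subject=[{"name": "my-app"}])
```

Calling check_statement_shape(example) returns an empty list, while name_only is rejected because its subject is not tied to any digest.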

The predicate type is a URI that identifies the schema for interpreting the predicate. Different predicate types exist for different purposes: SLSA Provenance for build information, SCAI for security attributes, and others for test results, code review attestations, and SBOMs. The framework is extensible. New predicate types can be defined without changing the core format.

This modularity matters for real-world use. A complete picture of an artifact's security posture might include multiple attestations: provenance showing how it was built, a code review attestation showing the source was reviewed, test results showing it passed CI, and a vulnerability scan showing no known CVEs. All of these use the same outer structure, can be bundled together, and can be verified with the same tooling.

SLSA Provenance as a Predicate Type

SLSA Provenance is one predicate type within the in-toto framework. It's specifically designed to record build information: what inputs went into a build, what process was used, and what outputs were produced.

The relationship between SLSA and in-toto is complementary. SLSA defines the requirements ("provenance must be signed by the build platform"). in-toto defines the format ("here's how to structure a signed provenance statement"). You can think of in-toto as the unopinionated layer for expressing supply chain information, and SLSA as the opinionated layer specifying exactly what information must be captured to achieve specific security guarantees.

What Kettle Produces

A complete Kettle build generates three files. Each serves a different purpose in the verification chain.

manifest.json

The manifest is a human-readable summary of the build. It contains:

{
  "git_commit": "a1b2c3d4e5f67890abcdef1234567890abcdef12",
  "git_tree": "7890abcdef1234567890abcdef1234567890abcde",
  "lockfile_hash": "23b2e23aa04c93c350cac09ac73636e4ecedf564...",
  "input_merkle_root": "72a97c73d0c59905c89dc7da145a5ecc3d809be5...",
  "toolchain": {
    "rustc_hash": "e6abf55ab1859e7c990be77fd593f5166...",
    "cargo_hash": "51de284e8bb0d03dcee595a0fb1cb3a952..."
  },
  "artifacts": [
    { "name": "my-app", "hash": "1d1ea25c371d4f6de8d6e3c26fdad2238..." }
  ]
}

This is for humans to inspect and debug. When something goes wrong, you look here to understand what inputs were used and what outputs were produced. The git commit and tree hash let you identify the exact source. The lockfile hash lets you verify dependency pinning. The input Merkle root is the cryptographic commitment to all inputs combined.

provenance.json

The provenance is a SLSA v1.2 statement in in-toto format. This is the machine-readable, interoperable record. It follows the SLSA specification exactly, which means other tools in the ecosystem can consume it without custom parsing.

The structure has two main sections:

buildDefinition describes the inputs to the build:

{
  "buildDefinition": {
    "buildType": "https://example.com/attestable-build/v1",
    "externalParameters": {
      "repository": "https://github.com/org/repo",
      "ref": "refs/heads/main"
    },
    "internalParameters": {},
    "resolvedDependencies": [
      {
        "uri": "git+https://github.com/org/repo@refs/heads/main",
        "digest": { "gitCommit": "a1b2c3d4..." }
      },
      {
        "uri": "pkg:cargo/serde@1.0.228",
        "digest": { "sha256": "9a8e94ea..." }
      }
    ]
  }
}

The buildType identifies how to interpret the parameters. It's a URI that should resolve to documentation explaining the build process.

The externalParameters are the top-level inputs controlled by the user: which repository to build, which ref to check out, which entry point to use. These are untrusted from SLSA's perspective and must be verified downstream.

The internalParameters are set by the build platform itself. In attestable builds, this might include TEE configuration or platform version information.

The resolvedDependencies capture what was actually fetched during the build. Notice the distinction: externalParameters might say "build from refs/heads/main", while resolvedDependencies records that this resolved to commit a1b2c3d4.... The dependencies use Package URLs (PURLs) for standardized identification. A Cargo dependency looks like pkg:cargo/serde@1.0.228. A PURL can also carry the checksum as a qualifier, as in pkg:cargo/serde@1.0.228?checksum=sha256:9a8e94ea..., but the examples in this document record the checksum in the separate digest field.
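The link between the requested ref and the resolved commit can be looked up mechanically. The following Python sketch assumes the source entry's uri has the form git+<repository>@<ref>, as in the example above; that convention, and the helper name resolved_commit, are assumptions for illustration rather than something the SLSA specification mandates.

```python
from typing import Optional

def resolved_commit(build_definition: dict) -> Optional[str]:
    """Find the commit that the requested repository and ref resolved to."""
    params = build_definition["externalParameters"]
    requested = f"git+{params['repository']}@{params['ref']}"
    for dep in build_definition.get("resolvedDependencies", []):
        if dep.get("uri") == requested:
            return dep.get("digest", {}).get("gitCommit")
    return None

# Values copied from the buildDefinition example above (digests are elided there too).
build_definition = {
    "externalParameters": {
        "repository": "https://github.com/org/repo",
        "ref": "refs/heads/main",
    },
    "resolvedDependencies": [
        {
            "uri": "git+https://github.com/org/repo@refs/heads/main",
            "digest": {"gitCommit": "a1b2c3d4..."},
        },
        {"uri": "pkg:cargo/serde@1.0.228", "digest": {"sha256": "9a8e94ea..."}},
    ],
}
commit = resolved_commit(build_definition)
```

A verifier would typically compare the returned commit against the commit it expects for that ref, and treat a missing entry (None) as a failure.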

runDetails describes the build execution:

{
  "runDetails": {
    "builder": {
      "id": "https://example.com/tee-builder/v1"
    },
    "metadata": {
      "invocationId": "build-12345",
      "startedOn": "2024-01-15T10:30:00Z",
      "finishedOn": "2024-01-15T10:35:00Z"
    }
  }
}

The builder.id is the critical field. It identifies the build platform and represents the transitive closure of everything you're trusting to faithfully run the build and record provenance. For attestable builds, this ID represents the TEE-based build system with its specific security properties.

The metadata provides operational information: when the build ran, how long it took, and an identifier for this specific invocation.
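The build duration is not stored directly. It is derived from the two timestamps. A small Python sketch, assuming both fields are present and use the UTC "Z" form shown above:

```python
from datetime import datetime

def parse_timestamp(ts: str) -> datetime:
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def build_duration_seconds(metadata: dict) -> float:
    started = parse_timestamp(metadata["startedOn"])
    finished = parse_timestamp(metadata["finishedOn"])
    return (finished - started).total_seconds()

metadata = {
    "invocationId": "build-12345",
    "startedOn": "2024-01-15T10:30:00Z",
    "finishedOn": "2024-01-15T10:35:00Z",
}
print(build_duration_seconds(metadata))  # 300.0
```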

Evidence

The evidence file contains the TEE attestation report, base64-encoded. This is what roots everything in hardware.

The attestation report is signed by the CPU using keys that chain back to the hardware vendor's root of trust. A verifier can check this signature against the vendor's certificate chain to confirm the report came from genuine hardware.

Critically, the first 32 bytes of the report's custom data field contain the SHA-256 hash of the provenance document. This cryptographically binds the attestation to the provenance. You can't take an attestation from one build and attach it to provenance from a different build. The hash must match.

The verification chain works like this:

1. Verify attestation signature against hardware vendor's certificate chain
2. Extract provenance hash from attestation report
3. Verify hash matches actual provenance document
4. Verify provenance contents (inputs, outputs, builder ID)
5. Verify artifact hashes match provenance subjects

If any step fails, the verification fails. This chain ensures that the provenance is exactly what was generated inside the attested TEE, not something fabricated afterward.
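The hash-binding steps and the final artifact check can be sketched in a few lines of Python. This is a sketch under stated assumptions, not Kettle's verifier. It assumes the attestation signature has already been verified with the hardware vendor's tooling and that report_data is the custom data field extracted from the report, both of which are vendor-specific and omitted here. It also assumes the hash covers the exact bytes of the provenance file with no canonicalization. Checking provenance contents against policy is left out. All function names are hypothetical.

```python
import hashlib
import hmac
import json

def provenance_is_bound(report_data: bytes, provenance_bytes: bytes) -> bool:
    """Compare the first 32 bytes of the custom data with SHA-256(provenance)."""
    expected = hashlib.sha256(provenance_bytes).digest()
    return hmac.compare_digest(report_data[:32], expected)

def artifacts_match_subjects(provenance_bytes: bytes, artifacts: dict) -> bool:
    """Every artifact's SHA-256 must appear among the provenance subjects."""
    statement = json.loads(provenance_bytes)
    recorded = {s["digest"].get("sha256") for s in statement["subject"]}
    return all(hashlib.sha256(data).hexdigest() in recorded for data in artifacts.values())

def verify(report_data: bytes, provenance_bytes: bytes, artifacts: dict) -> bool:
    # Check the binding first: unbound provenance should not be parsed or trusted.
    return (provenance_is_bound(report_data, provenance_bytes)
            and artifacts_match_subjects(provenance_bytes, artifacts))

# Hypothetical data standing in for a real build's outputs.
artifact = b"example artifact bytes"
provenance = json.dumps({
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{"name": "my-app", "digest": {"sha256": hashlib.sha256(artifact).hexdigest()}}],
    "predicateType": "https://slsa.dev/provenance/v1",
    "predicate": {},
}).encode()
report_data = hashlib.sha256(provenance).digest() + bytes(32)
```

With this data, verify(report_data, provenance, {"my-app": artifact}) succeeds, while changing a single byte of either the provenance or the artifact makes it fail.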

The Provenance Structure in Detail

Understanding the provenance structure helps when debugging builds or writing verification policies. Let's look at each field more carefully.

buildType

The build type is a URI that identifies how to interpret the build definition. It encapsulates the build process independent of what platform ran it.

"buildType": "https://kettle.dev/cargo-build/v1"

The URI should resolve to documentation explaining: what the build process does, what externalParameters and internalParameters mean for this build type, and how to initiate a build given this definition. Different build types exist for different toolchains (Cargo vs Nix) or different build configurations.

externalParameters vs internalParameters

The distinction matters for verification. External parameters are untrusted. They come from outside the build platform: a user requesting a build, a CI trigger, a webhook. Verifiers must check these against expectations.

Internal parameters are set by the platform itself. They're trusted because the platform is trusted. A verifier doesn't need to check them individually, though they might be useful for debugging or reproducibility.

In practice, external parameters should be minimal. The more you put in external parameters, the more a verifier needs to check. Good build type design pushes configuration into the source repository (where it's covered by the source commit hash) rather than into external parameters.
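A verifier's check on external parameters can be as simple as an exact comparison against expected values, with unknown parameters rejected. The Python sketch below shows one way to do that. The function name and the strict "no extra keys" rule are our assumptions about a reasonable policy, not a requirement quoted from SLSA.

```python
def check_external_parameters(actual: dict, expected: dict) -> list:
    """Return a list of policy violations; an empty list means the parameters are acceptable."""
    problems = []
    for key in sorted(set(actual) | set(expected)):
        if key not in expected:
            # Unknown parameters are rejected rather than ignored.
            problems.append(f"unexpected parameter: {key}")
        elif key not in actual:
            problems.append(f"missing parameter: {key}")
        elif actual[key] != expected[key]:
            problems.append(f"mismatch for {key}")
    return problems

expected = {"repository": "https://github.com/org/repo", "ref": "refs/heads/main"}
good = dict(expected)
bad = {"repository": "https://github.com/org/repo", "ref": "refs/heads/dev", "entryPoint": "x"}
```

Here check_external_parameters(good, expected) returns an empty list, while bad yields one violation for the extra entryPoint key and one for the ref mismatch. The fewer external parameters a build type defines, the shorter the expected dictionary a consumer has to maintain.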

resolvedDependencies

This field captures what was actually used during the build, not just what was requested. The distinction matters because many inputs resolve dynamically:

{
  "resolvedDependencies": [
    {
      "uri": "git+https://github.com/org/repo@refs/heads/main",
      "digest": { "gitCommit": "a1b2c3d4e5f67890abcdef1234567890abcdef12" }
    },
    {
      "uri": "pkg:cargo/serde@1.0.228",
      "digest": { "sha256": "9a8e94ea..." },
      "name": "serde"
    },
    {
      "uri": "pkg:cargo/tokio@1.35.0",
      "digest": { "sha256": "7b4c89..." },
      "name": "tokio"
    }
  ]
}

A request to build "refs/heads/main" resolves to a specific commit. A dependency on "serde ^1.0" resolves to a specific version. The resolved dependencies record these resolutions.

Each dependency is a ResourceDescriptor with several optional fields: uri identifies the dependency, digest provides cryptographic verification, name is a human-readable label, and downloadLocation records where the dependency was fetched from if that differs from the URI.

Completeness is best-effort at L3. Ideally, every artifact fetched during the build would be recorded. In practice, some builds fetch things that are hard to track. The goal is to capture enough that a security team can investigate if something goes wrong.
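Because completeness is best-effort, a consumer may want to know which recorded dependencies lack a digest and therefore cannot be checked cryptographically. A minimal Python sketch of such a report follows. The helper name is hypothetical, and whether a missing digest is an error or only a warning is a policy choice this document does not prescribe.

```python
def dependencies_without_digest(resolved_dependencies: list) -> list:
    """List identifiers of dependencies that carry no digest."""
    missing = []
    for dep in resolved_dependencies:
        if not dep.get("digest"):
            # Fall back through the optional identifying fields.
            missing.append(dep.get("uri") or dep.get("name") or "<unidentified>")
    return missing

# Hypothetical dependency list; the second entry has no digest.
deps = [
    {"uri": "pkg:cargo/serde@1.0.228", "digest": {"sha256": "9a8e94ea..."}, "name": "serde"},
    {"uri": "pkg:cargo/tokio@1.35.0", "name": "tokio"},
]
```

For the list above, dependencies_without_digest(deps) returns ["pkg:cargo/tokio@1.35.0"].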

builder.id

The builder ID is the trust anchor. It identifies the transitive closure of everything trusted to faithfully run the build. For attestable builds:

{
  "builder": {
    "id": "https://kettle.dev/tee-builder/v1"
  }
}

This ID should resolve to documentation explaining: the scope of what the ID represents, the claimed SLSA Build level, the accuracy and completeness guarantees of the provenance fields, and any fields that are generated by tenant-controlled processes rather than the trusted control plane.

A consumer's verification policy specifies which builder IDs they trust. The policy might say "I trust builds from https://kettle.dev/tee-builder/v1 at SLSA L3" while rejecting builds from other platforms.
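Such a policy can be expressed as a mapping from builder ID to the SLSA Build level the consumer is willing to credit that builder with. The Python sketch below is illustrative only: the mapping, the function name, and the integer encoding of levels are assumptions. Note that the level comes from the consumer's policy, not from anything the provenance claims about itself.

```python
# Consumer-maintained policy: builder ID -> SLSA Build level the consumer credits it with.
TRUSTED_BUILDERS = {
    "https://kettle.dev/tee-builder/v1": 3,
}

def builder_is_acceptable(run_details: dict, required_level: int,
                          trusted: dict = TRUSTED_BUILDERS) -> bool:
    """Accept only known builders whose credited level meets the requirement."""
    builder_id = run_details.get("builder", {}).get("id")
    credited = trusted.get(builder_id)
    return credited is not None and credited >= required_level

run_details = {"builder": {"id": "https://kettle.dev/tee-builder/v1"}}
unknown = {"builder": {"id": "https://example.org/some-other-builder"}}
```

With this policy, builder_is_acceptable(run_details, 3) is True and builder_is_acceptable(unknown, 1) is False, because unlisted builders are rejected regardless of the level requested.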

How Provenance Relates to SBOMs

Software Bills of Materials (SBOMs) and SLSA Provenance are related but distinct.

An SBOM describes what components are present in software. It's focused on understanding software for vulnerability assessment and license compliance. SBOMs need fine-grained, timely data: exact versions, licenses, known vulnerabilities.

SLSA Provenance describes how software was built. It's focused on trustworthiness of the build process. Provenance is generated in the build platform's trusted control plane, which in practice makes it coarser-grained than an SBOM.

The two are complementary. An SBOM tells you what's in the artifact. Provenance tells you that the SBOM (and the artifact) came from a verified build process. You might use provenance to trust the SBOM's accuracy: "I believe this SBOM is correct because it was generated by a build system I trust."

Kettle's provenance includes dependency information in resolvedDependencies, but this isn't a full SBOM. It captures build-time dependencies at a coarse level (what was fetched during the build), not the fine-grained component analysis that SBOM tools provide.

Summary

Attestable builds produce SLSA Build L3 provenance in standard in-toto format. The three output files (manifest.json, provenance.json, Evidence) together provide a complete, verifiable record of the build.

The provenance structure follows SLSA v1.2: buildDefinition captures inputs, runDetails captures execution context, and the builder.id identifies the trust anchor. The TEE attestation binds the provenance to hardware, preventing forgery.

Using industry standards means the output integrates with existing supply chain security tools. Verifiers don't need custom code to consume attestable build provenance. They check the attestation against the hardware vendor's certificates, verify the provenance hash binding, and evaluate the provenance contents against their policy.

The next section, Threat Model, covers what exactly attestable builds protect against, what they don't, and how to reason about the security boundaries.