Benchmark Methodology

Benchmarks are intended for first-pass design-review triage, not as a replacement for final engineering validation.

Page summary

How to interpret the representative 1,492-part automotive body assembly benchmark.

Benchmark purpose

The benchmark demonstrates the workflow shape: encode one representative automotive body assembly, reuse the encoded assembly across multiple first-pass agents, and return geometry-linked findings for review.
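The encode-once, reuse-across-agents shape can be sketched as follows. This is a minimal illustration, not the product's implementation: `EncodedAssembly`, `Finding`, `encode_assembly`, `run_first_pass`, both toy agents, and the part IDs are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class EncodedAssembly:
    """Hypothetical stand-in for the encoded assembly that agents query."""
    part_ids: Tuple[str, ...]


@dataclass(frozen=True)
class Finding:
    """A review finding tied back to a specific part (geometry-linked)."""
    agent: str
    part_id: str
    note: str


Agent = Callable[[EncodedAssembly], List[Finding]]


def encode_assembly(part_ids: Sequence[str]) -> EncodedAssembly:
    # Placeholder: a real encoder would process CAD geometry here.
    return EncodedAssembly(part_ids=tuple(part_ids))


def bracket_agent(encoded: EncodedAssembly) -> List[Finding]:
    # Toy rule for illustration only: flag parts whose ID starts with "BRKT".
    return [
        Finding("bracket_agent", pid, "bracket flagged for review")
        for pid in encoded.part_ids
        if pid.startswith("BRKT")
    ]


def panel_agent(encoded: EncodedAssembly) -> List[Finding]:
    # Toy rule for illustration only: flag parts whose ID starts with "PANEL".
    return [
        Finding("panel_agent", pid, "panel flagged for review")
        for pid in encoded.part_ids
        if pid.startswith("PANEL")
    ]


def run_first_pass(part_ids: Sequence[str], agents: Sequence[Agent]) -> List[Finding]:
    encoded = encode_assembly(part_ids)  # encode the assembly once
    findings: List[Finding] = []
    for agent in agents:
        findings.extend(agent(encoded))  # every agent reuses the same encoding
    return findings


findings = run_first_pass(
    ["BRKT-001", "PANEL-002", "BRKT-003"],  # hypothetical part IDs
    [bracket_agent, panel_agent],
)
for f in findings:
    print(f.agent, f.part_id, f.note)
```

Each finding carries the part ID it refers to, so a reviewer can trace it back to the geometry rather than reading it as a free-floating comment.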

Assembly size and cache

Assembly size

Representative automotive body assembly with 1,492 parts.

Cache size

The public benchmark summary reports a query cache under 5 MB for the representative assembly.

Compression explanation

The compression figure compares the size of the source CAD artifact with the size of the generated query cache that agents use. It is not a claim that the cache is a full-fidelity CAD replacement.
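As a rough illustration of that comparison, the ratio is simply source size divided by cache size. The byte counts below are placeholders chosen for the arithmetic, not benchmark results; the only published figure is that the cache is under 5 MB. The function name `compression_ratio` is likewise an assumption.

```python
def compression_ratio(source_bytes: int, cache_bytes: int) -> float:
    """Return source size divided by cache size (higher means a smaller cache)."""
    if source_bytes <= 0 or cache_bytes <= 0:
        raise ValueError("sizes must be positive")
    return source_bytes / cache_bytes


MB = 1_000_000

# Hypothetical sizes for illustration only; these are NOT measured values.
example_source_bytes = 1_000 * MB  # assumed source CAD artifact size
example_cache_bytes = 4 * MB       # assumed cache size, consistent with "under 5 MB"

ratio = compression_ratio(example_source_bytes, example_cache_bytes)
print(f"{ratio:.0f}:1")  # prints "250:1" for these placeholder sizes
```

The ratio describes storage footprint only. It says nothing about what geometric detail the cache retains.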

GPU and CPU comparison

GPU/CPU comparisons isolate equivalent encoding and feature-extraction workloads for the benchmark assembly. They do not measure final simulation validation, CAD authoring, PLM workflows, or compliance testing.
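A like-for-like comparison of this kind can be sketched as timing the same workload on the same input for each backend and dividing the two runtimes. The helper names `median_runtime` and `speedup` are hypothetical, and the workload here is a stand-in callable, not the actual encoding or feature-extraction code.

```python
import statistics
import time
from typing import Any, Callable


def median_runtime(workload: Callable[[Any], Any], payload: Any, repeats: int = 5) -> float:
    """Run the workload `repeats` times on the same payload; return the median seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload(payload)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_seconds / candidate_seconds


# Stand-in workload: summing a list takes the place of an encoding step.
payload = list(range(100_000))
elapsed = median_runtime(sum, payload, repeats=3)
print(f"median runtime: {elapsed:.6f} s")

# With runtimes from two backends on the same workload, the ratio is the speedup.
print(speedup(2.0, 0.5))  # 4.0 for these illustrative runtimes
```

Using the median rather than a single run reduces the effect of one-off stalls, and holding the payload fixed keeps the two backends on an equivalent workload.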

First-pass triage measurement

Triage timing compares the agent review workflow against specialist-tool handoff patterns for the same review category. It should be read as a workflow-screening indicator, not a replacement for specialist review.

Tolerance measurement scope

Dimensional tolerance results are reported for deterministic measurements on encoded assemblies within the public benchmark scope. The exact measurement procedure, fixtures, and validation details are shared with qualified teams under NDA.

Scope and limitations

Representative benchmark results are not universal performance guarantees.
The benchmark does not certify engineering quality or regulatory compliance.
The encoded cache is a review substrate, not a replacement for source CAD.
Full methodology, datasets, and architecture-review materials can be shared under NDA where appropriate.