Verified Performance Benchmarks
Performance & Accuracy
We aim to be transparent about what Grep can and cannot do. These benchmarks come from systematic testing across thousands of research tasks. They are verified against ground-truth datasets and updated quarterly.
How We Measure Performance
Our benchmarking process is designed to be rigorous, reproducible, and transparent.
Ground-Truth Dataset
We maintain a curated dataset of entities whose risk profiles are known and have been verified by human experts. This dataset is the baseline against which accuracy is measured.
Blind Testing
Grep researches each entity in the ground-truth dataset without access to the expected results. Its outputs are then compared against the known findings.
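As a rough illustration of what a blind run can look like, the Python sketch below passes only the entity name to the research step and pairs the output with the expected findings afterwards. The names `run_blind`, `dataset`, and `research` are hypothetical and are not part of Grep's actual tooling or API.

```python
from typing import Callable


def run_blind(
    dataset: dict[str, set[str]],
    research: Callable[[str], set[str]],
) -> dict[str, tuple[set[str], set[str]]]:
    """Run `research` on every entity without exposing the expected findings.

    `dataset` maps an entity name to its expert-verified findings
    (hypothetical shape). `research` is any callable that takes an
    entity name and returns the findings it produced.
    """
    results = {}
    for entity, expected in dataset.items():
        # Only the entity name crosses this boundary; the expected
        # findings are never passed to the research step.
        found = research(entity)
        # Pair expected and found findings so they can be scored later.
        results[entity] = (set(expected), set(found))
    return results
```

Keeping the expected findings on the harness side of the call is what makes the run blind: the research step cannot tailor its output to the answer key.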
Multi-Dimensional Scoring
We measure accuracy across multiple dimensions: entity identification, risk factor detection, source citation accuracy, and completeness of coverage.
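One common way to score a single dimension is with precision, recall, and F1 over sets of findings. The sketch below assumes that approach. It is not Grep's published scoring formula, and the names `DimensionScore`, `score_dimension`, and `score_report` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionScore:
    precision: float  # share of reported items that are correct
    recall: float     # share of expected items that were reported
    f1: float         # harmonic mean of precision and recall


def score_dimension(expected: set[str], found: set[str]) -> DimensionScore:
    """Score one dimension by comparing reported items with ground truth."""
    true_positives = len(expected & found)
    # Empty sets score 0.0 rather than raising ZeroDivisionError.
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return DimensionScore(precision, recall, f1)


def score_report(
    ground_truth: dict[str, set[str]],
    report: dict[str, set[str]],
) -> dict[str, DimensionScore]:
    """Score every dimension present in the ground truth.

    Keys are dimension names such as "risk_factors" or "citations"
    (hypothetical labels). A dimension missing from the report is
    scored as if nothing had been reported for it.
    """
    return {
        dimension: score_dimension(expected, report.get(dimension, set()))
        for dimension, expected in ground_truth.items()
    }
```

Scoring each dimension separately keeps a strong result in one area, such as entity identification, from masking a weak result in another, such as source citation accuracy.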
Quarterly Review
Benchmarks are re-run every quarter with updated datasets and methodology. Results are published in full, including any declines.
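A quarter-over-quarter comparison that surfaces declines could be as simple as the sketch below. It assumes each quarter's results are summarized as one score per dimension. The function name `find_declines` and the `tolerance` parameter are hypothetical and do not describe Grep's actual review process.

```python
def find_declines(
    previous: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.0,
) -> dict[str, float]:
    """Return each dimension whose score dropped by more than `tolerance`.

    Both arguments map a dimension name to its score for one quarter.
    The result maps each declining dimension to the size of its drop.
    Dimensions missing from `current` are skipped, since a change in
    methodology may have retired them.
    """
    declines = {}
    for dimension, old_score in previous.items():
        if dimension not in current:
            continue
        drop = old_score - current[dimension]
        if drop > tolerance:
            declines[dimension] = drop
    return declines
```

A nonzero `tolerance` can filter out small fluctuations, though note that when the dataset or methodology changes between quarters, scores are not strictly comparable and a reported drop needs that context.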
Continuous Improvement
Benchmark results directly inform engineering priorities. When we identify gaps, we address them in subsequent releases.
Data Sources & Coverage
Explore the comprehensive data sources that power Grep's research capabilities.
Verify Our Claims Yourself
Don't take our benchmarks at face value. Run a research report on an entity you already know well, then compare Grep's findings against what you already know.