Verified Performance Benchmarks

Performance & Accuracy

We believe in transparency about our capabilities. These benchmarks are derived from systematic testing across thousands of research tasks, verified against ground-truth datasets, and updated quarterly.

94%

Research Accuracy

Verified against ground-truth datasets

15 min

Average Completion

Deep research report turnaround

50+

Sources per Report

Average databases checked per research report

10,000+

Reports Generated

Across all expert modes

How We Measure Performance

Our benchmarking process is designed to be rigorous, reproducible, and transparent.

Ground-Truth Dataset

We maintain a curated dataset of entities with known risk profiles, verified by human experts. This serves as the baseline for measuring accuracy.

Blind Testing

Grep research is run against the ground-truth dataset without any prior knowledge of expected results. Outputs are compared against known findings.
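As a rough illustration of what a blind run can look like, the sketch below passes only the entity name to the system under test and holds the expected findings back until the comparison step. Everything in it is hypothetical: the entity names, the risk-factor labels, and the `stub_research` function are stand-ins for illustration, not Grep's actual data or API.

```python
from typing import Callable

# Hypothetical ground-truth records: entity name -> risk factors confirmed by human experts.
GROUND_TRUTH = {
    "Acme Holdings": {"sanctions", "litigation"},
    "Borealis Trading": {"adverse_media"},
}


def run_blind(research: Callable[[str], set], truth: dict) -> dict:
    """Run research on each entity name only, then compare with the expected findings."""
    results = {}
    for entity, expected in truth.items():
        found = research(entity)  # the system under test never sees `expected`
        results[entity] = {
            "matched": found & expected,     # known findings that were detected
            "missed": expected - found,      # known findings that were not detected
            "unexpected": found - expected,  # findings absent from the ground truth
        }
    return results


def stub_research(entity: str) -> set:
    """Stand-in for a real research run; returns canned findings."""
    canned = {
        "Acme Holdings": {"sanctions"},
        "Borealis Trading": {"adverse_media", "litigation"},
    }
    return canned.get(entity, set())


report = run_blind(stub_research, GROUND_TRUTH)
```

Separating "missed" from "unexpected" keeps the two kinds of error visible: an unexpected finding may be a false positive, or it may be a real finding that the ground-truth dataset has not yet captured.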

Multi-Dimensional Scoring

We measure accuracy across multiple dimensions: entity identification, risk factor detection, source citation accuracy, and completeness of coverage.
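One way to picture scoring across these four dimensions is a per-dimension average over all test cases. The sketch below assumes each test case has already been graded from 0.0 to 1.0 on every dimension, and it combines the dimensions with an unweighted mean. Both the grading scale and the equal weighting are assumptions made for illustration; this page does not state how the dimensions are weighted or combined, and the sample scores are invented.

```python
from statistics import mean

# Dimension names taken from the text; the identifiers themselves are illustrative.
DIMENSIONS = (
    "entity_identification",
    "risk_factor_detection",
    "source_citation_accuracy",
    "coverage_completeness",
)


def score_by_dimension(cases):
    """Average each dimension's score (0.0 to 1.0) across all graded test cases."""
    return {dim: mean(case[dim] for case in cases) for dim in DIMENSIONS}


# Invented sample grades for two test cases.
cases = [
    {"entity_identification": 1.0, "risk_factor_detection": 0.5,
     "source_citation_accuracy": 1.0, "coverage_completeness": 0.75},
    {"entity_identification": 1.0, "risk_factor_detection": 1.0,
     "source_citation_accuracy": 0.5, "coverage_completeness": 0.75},
]

per_dimension = score_by_dimension(cases)
# Assumption: equal weighting across dimensions.
overall = mean(list(per_dimension.values()))
```

Reporting `per_dimension` alongside `overall` shows where a single headline figure hides a weak dimension.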

Quarterly Review

Benchmarks are re-run quarterly with updated datasets and methodology. Results are published transparently, including any declines.
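A quarter-over-quarter comparison that surfaces declines as well as gains could be as simple as the sketch below. The metric names and figures are invented sample values, not published results, and the function is a hypothetical helper rather than part of any Grep tooling.

```python
def quarter_over_quarter(previous, current):
    """Return the change in percentage points for every metric present in both quarters."""
    return {
        name: round(current[name] - previous[name], 2)
        for name in current
        if name in previous
    }


# Invented sample values, in percent.
previous = {"risk_factor_detection": 92.0, "source_citation_accuracy": 95.5}
current = {"risk_factor_detection": 93.5, "source_citation_accuracy": 94.0}

changes = quarter_over_quarter(previous, current)
# Declines are kept and reported rather than filtered out.
declines = {name: delta for name, delta in changes.items() if delta < 0}
```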

Continuous Improvement

Benchmark results directly inform engineering priorities. When we identify gaps, we address them in subsequent releases.

Data Sources & Coverage

Explore the comprehensive data sources that power Grep's research capabilities.

Verify Our Claims Yourself

Don't take our benchmarks at face value. Run a research report on an entity you already know well and compare Grep's findings against your existing intelligence.
