Public benchmark card

krishnaadavi/a2zai

A2ZAI Builder Radar Guard

Overall score improved from 78 to 88. Biggest movement came from cost.

Before: 78

After: 88

Delta: +10

Run status: completed

Why this artifact is shareable

Best improvement: cost (+11)

Dimensions improved: 4 out of 4 measured dimensions

Main risk: latency (8)

Public URL ready to share

1200 x 630 social card export ready

GitHub-native eval artifact

Historical comparison included

No regressed dimensions

Suggested launch post

Copy this when sharing the benchmark on X or GitHub, in launch posts, or in team chats.

DriftCheck: krishnaadavi/a2zai
A2ZAI Builder Radar Guard finished at 88 (+10 vs baseline).
Best gain: cost +11.
No failing examples were detected in this run.
https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-2

Benchmark URL: https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-2

Social card: https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-2/opengraph-image

Add to README

Link to this benchmark from your repo README so visitors see your eval results.

Badge (markdown)

[![DriftCheck](https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-2/opengraph-image)](https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-2)

Link (markdown)

[Benchmark: A2ZAI Builder Radar Guard](https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-2)
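
Both snippets above follow one pattern: the badge image is the benchmark URL with `/opengraph-image` appended, and both the badge and the link point back to the benchmark page. A minimal sketch of generating them for any run, assuming that URL pattern holds for other benchmarks (the helper name `readme_snippets` is hypothetical and not part of DriftCheck):

```python
def readme_snippets(benchmark_url: str, pack_name: str) -> dict[str, str]:
    """Build README badge and link markdown for a public benchmark page.

    Assumption: the social card lives at <benchmark_url>/opengraph-image,
    as it does for the run shown on this page.
    """
    base = benchmark_url.rstrip("/")
    image_url = f"{base}/opengraph-image"
    return {
        # Clickable image: [![alt](image)](target)
        "badge": f"[![DriftCheck]({image_url})]({base})",
        # Plain text link
        "link": f"[Benchmark: {pack_name}]({base})",
    }


snippets = readme_snippets(
    "https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-2",
    "A2ZAI Builder Radar Guard",
)
print(snippets["badge"])
print(snippets["link"])
```

Paste the printed lines into the README as-is; the badge renders the 1200 x 630 social card, so the plain link may suit a compact README better.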

Compare with run from Mar 12, 2026

Current run vs previous `A2ZAI Builder Radar Guard` result.

After score vs previous: 88 -> 88 (change +0)

Run delta vs previous: +10 -> +10 (change +0)

quality: after score 88 -> 88 (+0)

safety: after score 93 -> 93 (+0)

latency: after score 82 -> 82 (+0)

cost: after score 79 -> 79 (+0)

New failing cases

No new failing cases.

Resolved failing cases

No resolved failing cases.

Persistent failing cases

No persistent failing cases.
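
The three buckets above (new, resolved, persistent) can be read as set operations over the failing case IDs of the two runs. A small sketch of that classification, assuming each run exposes its failing cases as a set of string IDs; the function name and the example IDs are hypothetical, and this is not necessarily how DriftCheck computes the buckets:

```python
def classify_failures(previous: set[str], current: set[str]) -> dict[str, list[str]]:
    """Split failing case IDs from two runs into comparison buckets."""
    return {
        "new": sorted(current - previous),         # fail now, passed before
        "resolved": sorted(previous - current),    # failed before, pass now
        "persistent": sorted(current & previous),  # fail in both runs
    }


# This page's comparison: neither run had failing cases.
empty = classify_failures(set(), set())

# Hypothetical IDs, only to show each bucket being filled.
example = classify_failures({"case-a", "case-b"}, {"case-b", "case-c"})
print(empty)
print(example)
```

With empty inputs every bucket is empty, which matches the "No new / resolved / persistent failing cases" lines reported for this comparison.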

Dimension scorecard

quality: 78 -> 88 (+10)

safety: 84 -> 93 (+9)

latency: 74 -> 82 (+8)

cost: 68 -> 79 (+11)
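
The summary figures at the top of the card (best improvement, dimensions improved, no regressed dimensions) can be rechecked from the four before/after pairs in this scorecard. A minimal sketch using only the numbers shown on this page; the variable names are illustrative, and the rule that picks the card's "main risk" is not documented here, so the smallest gain is computed only as a plausible reading:

```python
# Before/after scores copied from the dimension scorecard.
scores = {
    "quality": (78, 88),
    "safety": (84, 93),
    "latency": (74, 82),
    "cost": (68, 79),
}

# Delta per dimension: after minus before.
deltas = {name: after - before for name, (before, after) in scores.items()}

best = max(deltas, key=deltas.get)          # largest gain
smallest = min(deltas, key=deltas.get)      # smallest gain (assumed "main risk")
improved = sum(1 for d in deltas.values() if d > 0)
regressed = [name for name, d in deltas.items() if d < 0]

print(deltas)
print(best, smallest, improved, regressed)
```

The results agree with the card: cost has the largest gain (+11), latency the smallest (+8), all 4 dimensions improved, and none regressed.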

PR scorecard output

## A2ZAI Checks Scorecard

Repo: `krishnaadavi/a2zai` • PR #4
Pack: `A2ZAI Builder Radar Guard`

Overall: **78 -> 88** (+10)

### Dimension deltas
- quality: 78 -> 88 (+10)
- safety: 84 -> 93 (+9)
- latency: 74 -> 82 (+8)
- cost: 68 -> 79 (+11)

Public benchmark card: https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-2
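
The scorecard comment above has a regular shape, so it can be rebuilt from the run data. The sketch below reproduces that layout under the assumption that the fields shown are all it needs; `render_scorecard` is a hypothetical helper, not the function DriftCheck uses to write back to the PR:

```python
def render_scorecard(repo, pr_number, pack, overall, dimensions, card_url):
    """Render a PR scorecard in the markdown layout shown on this page.

    overall is a (before, after) pair; dimensions maps a dimension
    name to its (before, after) pair.
    """
    def signed(before, after):
        # Format the delta with an explicit sign, e.g. +10 or -3.
        return f"{after - before:+d}"

    lines = [
        "## A2ZAI Checks Scorecard",
        "",
        f"Repo: `{repo}` • PR #{pr_number}",
        f"Pack: `{pack}`",
        "",
        f"Overall: **{overall[0]} -> {overall[1]}** ({signed(*overall)})",
        "",
        "### Dimension deltas",
    ]
    lines += [
        f"- {name}: {before} -> {after} ({signed(before, after)})"
        for name, (before, after) in dimensions.items()
    ]
    lines += ["", f"Public benchmark card: {card_url}"]
    return "\n".join(lines)


scorecard = render_scorecard(
    repo="krishnaadavi/a2zai",
    pr_number=4,
    pack="A2ZAI Builder Radar Guard",
    overall=(78, 88),
    dimensions={
        "quality": (78, 88),
        "safety": (84, 93),
        "latency": (74, 82),
        "cost": (68, 79),
    },
    card_url="https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-2",
)
print(scorecard)
```

Formatting deltas with `:+d` keeps the sign explicit, so a regression would show as a negative number rather than an unsigned one.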

Run context

Repo: krishnaadavi/a2zai

Branch: main -> checks-writeback-test-3

PR: #4

Created: 3/12/2026, 6:57:03 PM

GitHub commit status: success on 12391db

Cases to review

No failing examples were detected in this run.
