Public benchmark card
krishnaadavi/a2zai
Coding Agent PR Pack
Overall score improved from 74 to 86. Biggest movement came from quality.
Before: 74
After: 86
Delta: +12
Run status: completed
Why this artifact is shareable
Best improvement: quality (+17)
Dimensions improved: 4 out of 4 measured dimensions
Main risk: latency (+7, the smallest gain of the four dimensions)
Suggested launch post
Copy this when sharing the benchmark on X, GitHub, launch posts, or team chats.
A2ZAI Checks: krishnaadavi/a2zai Coding Agent PR Pack finished at 86 (+12 vs baseline). Best gain: quality +17. No failing examples were detected in this run. https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-coding-agent-pr-pack
Benchmark URL: https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-coding-agent-pr-pack
Social card: https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-coding-agent-pr-pack/opengraph-image
Add to README
Link to this benchmark from your repo README so visitors see your eval results.
Badge (markdown)
[](https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-coding-agent-pr-pack)
Link (markdown)
[Benchmark: Coding Agent PR Pack](https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-coding-agent-pr-pack)
Dimension scorecard
quality: 71 -> 88 (+17)
safety: 80 -> 89 (+9)
latency: 76 -> 83 (+7)
cost: 66 -> 78 (+12)
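The summary figures on this card (best improvement, number of dimensions improved, weakest dimension) can be re-derived from the per-dimension scores. The following is a minimal sketch, not A2ZAI's own code: the variable names are hypothetical, and treating the smallest gain as the "main risk" is an assumption, since the card does not state how that label is chosen. The overall score (74 -> 86) is not recomputed here because the card does not say how the dimensions are aggregated.

```python
# Before/after scores copied from the dimension scorecard on this card.
scores = {
    "quality": (71, 88),
    "safety": (80, 89),
    "latency": (76, 83),
    "cost": (66, 78),
}

# Delta per dimension: after minus before.
deltas = {name: after - before for name, (before, after) in scores.items()}

best = max(deltas, key=deltas.get)      # largest gain
smallest = min(deltas, key=deltas.get)  # smallest gain (assumed "main risk")
improved = sum(1 for d in deltas.values() if d > 0)

print(f"Best improvement: {best} {deltas[best]:+d}")
print(f"Smallest gain: {smallest} {deltas[smallest]:+d}")
print(f"Dimensions improved: {improved} out of {len(deltas)}")
```

With the scores above, this reports quality +17 as the best improvement, latency +7 as the smallest gain, and 4 out of 4 dimensions improved, matching the card.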
PR scorecard output
## A2ZAI Checks Scorecard
Repo: `krishnaadavi/a2zai` • PR #2
Pack: `Coding Agent PR Pack`
Overall: **74 -> 86** (+12)

### Dimension deltas
- quality: 71 -> 88 (+17)
- safety: 80 -> 89 (+9)
- latency: 76 -> 83 (+7)
- cost: 66 -> 78 (+12)

Public benchmark card: https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-coding-agent-pr-pack
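A scorecard comment in this shape could be assembled from the run's scores with a small helper. This is only an illustrative sketch: `render_scorecard` and its parameters are hypothetical names, not part of A2ZAI Checks, and the exact line breaks the real tool emits are an assumption.

```python
def render_scorecard(repo, pr_number, pack, overall, dimensions, card_url):
    """Build a Markdown scorecard comment (hypothetical helper, not A2ZAI's code)."""
    before, after = overall
    lines = [
        "## A2ZAI Checks Scorecard",
        f"Repo: `{repo}` • PR #{pr_number}",
        f"Pack: `{pack}`",
        f"Overall: **{before} -> {after}** ({after - before:+d})",
        "",
        "### Dimension deltas",
    ]
    # One bullet per dimension, with a signed delta.
    for name, (b, a) in dimensions.items():
        lines.append(f"- {name}: {b} -> {a} ({a - b:+d})")
    lines += ["", f"Public benchmark card: {card_url}"]
    return "\n".join(lines)


# Values copied from this benchmark card.
comment = render_scorecard(
    repo="krishnaadavi/a2zai",
    pr_number=2,
    pack="Coding Agent PR Pack",
    overall=(74, 86),
    dimensions={
        "quality": (71, 88),
        "safety": (80, 89),
        "latency": (76, 83),
        "cost": (66, 78),
    },
    card_url="https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-coding-agent-pr-pack",
)
print(comment)
```

Formatting deltas with `:+d` keeps the explicit plus sign, so a regression would show up as a negative number without any extra branching.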
Run context
Repo: krishnaadavi/a2zai
Branch: main -> checks-writeback-test-1
PR: #2
Created: 3/12/2026, 5:29:18 PM
GitHub comment: posted successfully
Cases to review
No failing examples were detected in this run.