affaan-m/benchmark

Use this skill to measure performance baselines, detect regressions before/after PRs, and compare stack alternatives.

v1.1 · Saved Apr 20, 2026

Benchmark — Performance Baseline & Regression Detection

When to Use

  • Before and after a PR to measure performance impact
  • Setting up performance baselines for a project
  • When users report "it feels slow"
  • Before a launch — ensure you meet performance targets
  • Comparing your stack against alternatives

How It Works

Mode 1: Page Performance

Measures real browser metrics via browser MCP:

1. Navigate to each target URL
2. Measure Core Web Vitals:
   - LCP (Largest Contentful Paint) — target < 2.5s
   - CLS (Cumulative Layout Shift) — target < 0.1
   - INP (Interaction to Next Paint) — target < 200ms
   - FCP (First Contentful Paint) — target < 1.8s
   - TTFB (Time to First Byte) — target < 800ms
3. Measure resource sizes:
   - Total page weight (target < 1MB)
   - JS bundle size (target < 200KB gzipped)
   - CSS size
   - Image weight
   - Third-party script weight
4. Count network requests
5. Check for render-blocking resources

Mode 2: API Performance

Benchmarks API endpoints:

1. Hit each endpoint 100 times
2. Measure: p50, p95, p99 latency
3. Track: response size, status codes
4. Test under load: 10 concurrent requests
5. Compare against SLA targets
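The percentile summary in step 2 can be sketched with a nearest-rank calculation. The simulated latency list stands in for the 100 real requests the skill makes; the function names are illustrative, not the skill's API.

```python
# Nearest-rank percentile summary over a list of latency samples (ms).
def percentile(samples: list[float], p: float) -> float:
    """Return the p-th percentile using the nearest-rank method (1-based)."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

def summarize(latencies_ms: list[float]) -> dict[str, float]:
    """Compute the p50/p95/p99 summary used by this mode."""
    return {name: percentile(latencies_ms, p)
            for name, p in (("p50", 50), ("p95", 95), ("p99", 99))}

# Simulated latencies in place of 100 real requests:
latencies = [float(i) for i in range(1, 101)]  # 1ms .. 100ms
print(summarize(latencies))  # {'p50': 50.0, 'p95': 95.0, 'p99': 99.0}
```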

Mode 3: Build Performance

Measures development feedback loop:

1. Cold build time
2. Hot reload time (HMR)
3. Test suite duration
4. TypeScript check time
5. Lint time
6. Docker build time
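Each of the timings above reduces to wall-clocking a shell command. A minimal sketch, assuming the commands are substituted per project (the trivial `python -c "pass"` below is just a stand-in):

```python
# Wall-clock a build/check command, as each step in this mode does.
import subprocess
import sys
import time

def time_command(cmd: list[str]) -> float:
    """Run cmd to completion and return its duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

# Stand-in for e.g. ["npm", "run", "build"] or ["tsc", "--noEmit"]:
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.3f}s")
```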

Mode 4: Before/After Comparison

Run before and after a change to measure impact:

/benchmark baseline    # saves current metrics
# ... make changes ...
/benchmark compare     # compares against baseline

Output:

| Metric | Before | After | Delta  | Verdict  |
|--------|--------|-------|--------|----------|
| LCP    | 1.2s   | 1.4s  | +200ms | ⚠ WARN   |
| Bundle | 180KB  | 175KB | -5KB   | ✓ BETTER |
| Build  | 12s    | 14s   | +2s    | ⚠ WARN   |
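The compare step can be sketched as a diff of two metric dicts with a verdict per delta. The 5% warning threshold and the metric keys below are assumptions for illustration, not the skill's documented defaults:

```python
# Diff current metrics against a baseline; higher values are assumed worse.
def compare(baseline: dict[str, float], current: dict[str, float],
            warn_pct: float = 5.0) -> dict[str, tuple[float, str]]:
    """Return (delta, verdict) per baseline metric."""
    report = {}
    for name, before in baseline.items():
        after = current.get(name, before)
        delta = after - before
        if delta < 0:
            verdict = "BETTER"
        elif delta == 0:
            verdict = "SAME"
        elif before and (delta / before) * 100 <= warn_pct:
            verdict = "OK"      # regression within tolerance
        else:
            verdict = "WARN"    # regression beyond tolerance
        report[name] = (delta, verdict)
    return report

print(compare({"lcp_ms": 1200, "bundle_kb": 180},
              {"lcp_ms": 1400, "bundle_kb": 175}))
# {'lcp_ms': (200, 'WARN'), 'bundle_kb': (-5, 'BETTER')}
```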

Output

Stores baselines in `.ecc/benchmarks/` as JSON. The directory is git-tracked so the whole team shares the same baselines.
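A hypothetical sketch of that storage step follows; the file naming scheme is an assumption. Sorted keys and stable indentation keep the JSON diff-friendly for git:

```python
# Write a metrics dict to .ecc/benchmarks/<name>.json under the project root.
import json
import pathlib
import tempfile

def save_baseline(metrics: dict, root: str, name: str = "baseline") -> pathlib.Path:
    """Persist metrics as deterministic JSON and return the file path."""
    out_dir = pathlib.Path(root) / ".ecc" / "benchmarks"
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.json"
    path.write_text(json.dumps(metrics, indent=2, sort_keys=True) + "\n")
    return path

with tempfile.TemporaryDirectory() as tmp:
    p = save_baseline({"lcp_ms": 1200}, tmp)
    print(json.loads(p.read_text()))  # {'lcp_ms': 1200}
```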

Integration

  • CI: run /benchmark compare on every PR
  • Pair with /canary-watch for post-deploy monitoring
  • Pair with /browser-qa for full pre-ship checklist
Files

1 file · 1.0 KB

Overall Score

76/100 · Grade B (Good)

| Dimension    | Score |
|--------------|-------|
| Safety       | 88    |
| Quality      | 72    |
| Clarity      | 82    |
| Completeness | 65    |

Summary

A performance benchmarking skill that measures page load metrics (Core Web Vitals), API latency, and build times. It supports baseline creation and before/after comparison to detect regressions, with output stored in `.ecc/benchmarks/` as JSON.

Detected Capabilities

  • Browser-based Core Web Vitals measurement (LCP, CLS, INP, FCP, TTFB)
  • Resource size analysis (JS bundles, CSS, images, third-party scripts)
  • API endpoint load testing and latency percentile calculation
  • Build performance profiling (cold build, HMR, test suite, TypeScript, lint, Docker)
  • Baseline creation and comparison with before/after delta reporting
  • JSON-based metrics storage in project directory

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

  • performance regression detection
  • measure page load metrics
  • api latency benchmarking
  • build time profiling
  • core web vitals check
  • before/after performance comparison

Risk Signals

  • INFO (general): No shell execution, privilege escalation, or credential access detected
  • INFO (Output section): File writes to `.ecc/benchmarks/` directory — project-scoped, no system-wide access
  • INFO (Mode 1 and Mode 2 sections): Network requests to target URLs via browser MCP — expected for performance measurement

Use Cases

  • Measure performance impact of a PR before and after merge
  • Establish performance baselines for a project
  • Detect performance regressions in CI/CD pipelines
  • Compare API endpoint latency under load
  • Track build time and development feedback loop metrics

Quality Notes

  • Clear mode-based structure (4 modes) with specific, measurable metrics for each
  • Well-documented targets for each metric (e.g., LCP < 2.5s, bundle < 200KB gzipped)
  • Good example output showing delta calculation and verdict determination
  • Integration guidance provided (CI, pairing with other skills) — helps users understand ecosystem placement
  • No error handling or edge case coverage documented (e.g., network timeouts, unreachable endpoints, empty baselines)
  • Limitations not explicitly stated — unclear if skill handles dynamic content, authentication-required endpoints, or multi-region testing
  • Browser MCP dependency implied but not formally documented; unclear what data sources/APIs the agent accesses
  • Git-tracking mentioned but no guidance on baseline initialization or conflict resolution strategies
Model: claude-haiku-4-5-20251001 · Analyzed: Apr 20, 2026

Reviews

No reviews yet

Version History

  • v1.1 (2026-04-20, Latest): Content updated
  • v1.0 (2026-04-12): No changelog
