affaan-m/benchmark

Use this skill to measure performance baselines, detect regressions before/after PRs, and compare stack alternatives.

v1.1 · Saved Apr 20, 2026

Benchmark — Performance Baseline & Regression Detection

When to Use

  • Before and after a PR to measure performance impact
  • Setting up performance baselines for a project
  • When users report "it feels slow"
  • Before a launch — ensure you meet performance targets
  • Comparing your stack against alternatives

How It Works

Mode 1: Page Performance

Measures real browser metrics via browser MCP:

1. Navigate to each target URL
2. Measure Core Web Vitals:
   - LCP (Largest Contentful Paint) — target < 2.5s
   - CLS (Cumulative Layout Shift) — target < 0.1
   - INP (Interaction to Next Paint) — target < 200ms
   - FCP (First Contentful Paint) — target < 1.8s
   - TTFB (Time to First Byte) — target < 800ms
3. Measure resource sizes:
   - Total page weight (target < 1MB)
   - JS bundle size (target < 200KB gzipped)
   - CSS size
   - Image weight
   - Third-party script weight
4. Count network requests
5. Check for render-blocking resources

Mode 2: API Performance

Benchmarks API endpoints:

1. Hit each endpoint 100 times
2. Measure: p50, p95, p99 latency
3. Track: response size, status codes
4. Test under load: 10 concurrent requests
5. Compare against SLA targets
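The percentile summary in step 2 can be sketched with a nearest-rank calculation. The simulated latency list stands in for the 100 real requests the skill makes; the function names are illustrative, not the skill's API.

```python
# Nearest-rank percentile summary over a list of latency samples (ms).
def percentile(samples: list[float], p: float) -> float:
    """Return the p-th percentile using the nearest-rank method (1-based)."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

def summarize(latencies_ms: list[float]) -> dict[str, float]:
    """Compute the p50/p95/p99 summary used by this mode."""
    return {name: percentile(latencies_ms, p)
            for name, p in (("p50", 50), ("p95", 95), ("p99", 99))}

# Simulated latencies in place of 100 real requests:
latencies = [float(i) for i in range(1, 101)]  # 1ms .. 100ms
print(summarize(latencies))  # {'p50': 50.0, 'p95': 95.0, 'p99': 99.0}
```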

Mode 3: Build Performance

Measures development feedback loop:

1. Cold build time
2. Hot reload time (HMR)
3. Test suite duration
4. TypeScript check time
5. Lint time
6. Docker build time
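Each of the timings above reduces to wall-clocking a shell command. A minimal sketch, assuming the commands are substituted per project (the trivial `python -c "pass"` below is just a stand-in):

```python
# Wall-clock a build/check command, as each step in this mode does.
import subprocess
import sys
import time

def time_command(cmd: list[str]) -> float:
    """Run cmd to completion and return its duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

# Stand-in for e.g. ["npm", "run", "build"] or ["tsc", "--noEmit"]:
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.3f}s")
```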

Mode 4: Before/After Comparison

Run before and after a change to measure impact:

/benchmark baseline    # saves current metrics
# ... make changes ...
/benchmark compare     # compares against baseline

Output:

| Metric | Before | After | Delta  | Verdict  |
|--------|--------|-------|--------|----------|
| LCP    | 1.2s   | 1.4s  | +200ms | ⚠ WARN   |
| Bundle | 180KB  | 175KB | -5KB   | ✓ BETTER |
| Build  | 12s    | 14s   | +2s    | ⚠ WARN   |
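The compare step can be sketched as a diff of two metric dicts with a verdict per delta. The 5% warning threshold and the metric keys below are assumptions for illustration, not the skill's documented defaults:

```python
# Diff current metrics against a baseline; higher values are assumed worse.
def compare(baseline: dict[str, float], current: dict[str, float],
            warn_pct: float = 5.0) -> dict[str, tuple[float, str]]:
    """Return (delta, verdict) per baseline metric."""
    report = {}
    for name, before in baseline.items():
        after = current.get(name, before)
        delta = after - before
        if delta < 0:
            verdict = "BETTER"
        elif delta == 0:
            verdict = "SAME"
        elif before and (delta / before) * 100 <= warn_pct:
            verdict = "OK"      # regression within tolerance
        else:
            verdict = "WARN"    # regression beyond tolerance
        report[name] = (delta, verdict)
    return report

print(compare({"lcp_ms": 1200, "bundle_kb": 180},
              {"lcp_ms": 1400, "bundle_kb": 175}))
# {'lcp_ms': (200, 'WARN'), 'bundle_kb': (-5, 'BETTER')}
```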

Output

Stores baselines in `.ecc/benchmarks/` as JSON. The directory is git-tracked so the whole team shares the same baselines.
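A hypothetical sketch of that storage step follows; the file naming scheme is an assumption. Sorted keys and stable indentation keep the JSON diff-friendly for git:

```python
# Write a metrics dict to .ecc/benchmarks/<name>.json under the project root.
import json
import pathlib
import tempfile

def save_baseline(metrics: dict, root: str, name: str = "baseline") -> pathlib.Path:
    """Persist metrics as deterministic JSON and return the file path."""
    out_dir = pathlib.Path(root) / ".ecc" / "benchmarks"
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.json"
    path.write_text(json.dumps(metrics, indent=2, sort_keys=True) + "\n")
    return path

with tempfile.TemporaryDirectory() as tmp:
    p = save_baseline({"lcp_ms": 1200}, tmp)
    print(json.loads(p.read_text()))  # {'lcp_ms': 1200}
```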

Integration

  • CI: run /benchmark compare on every PR
  • Pair with /canary-watch for post-deploy monitoring
  • Pair with /browser-qa for full pre-ship checklist
Files

1 file · 1.0 KB

Overall Score

76/100 · Grade B (Good)

| Dimension    | Score |
|--------------|-------|
| Safety       | 88    |
| Quality      | 72    |
| Clarity      | 82    |
| Completeness | 65    |

Summary

A performance benchmarking skill that measures page load metrics (Core Web Vitals), API latency, and build times. It supports baseline creation and before/after comparison to detect regressions, with output stored in `.ecc/benchmarks/` as JSON.

Detected Capabilities

  • Browser-based Core Web Vitals measurement (LCP, CLS, INP, FCP, TTFB)
  • Resource size analysis (JS bundles, CSS, images, third-party scripts)
  • API endpoint load testing and latency percentile calculation
  • Build performance profiling (cold build, HMR, test suite, TypeScript, lint, Docker)
  • Baseline creation and comparison with before/after delta reporting
  • JSON-based metrics storage in project directory

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

  • performance regression detection
  • measure page load metrics
  • api latency benchmarking
  • build time profiling
  • core web vitals check
  • before/after performance comparison

Risk Signals

  • INFO (general): No shell execution, privilege escalation, or credential access detected
  • INFO (Output section): File writes to `.ecc/benchmarks/` directory — project-scoped, no system-wide access
  • INFO (Mode 1 and Mode 2 sections): Network requests to target URLs via browser MCP — expected for performance measurement

Use Cases

  • Measure performance impact of a PR before and after merge
  • Establish performance baselines for a project
  • Detect performance regressions in CI/CD pipelines
  • Compare API endpoint latency under load
  • Track build time and development feedback loop metrics

Quality Notes

  • Clear mode-based structure (4 modes) with specific, measurable metrics for each
  • Well-documented targets for each metric (e.g., LCP < 2.5s, bundle < 200KB gzipped)
  • Good example output showing delta calculation and verdict determination
  • Integration guidance provided (CI, pairing with other skills) — helps users understand ecosystem placement
  • No error handling or edge case coverage documented (e.g., network timeouts, unreachable endpoints, empty baselines)
  • Limitations not explicitly stated — unclear if skill handles dynamic content, authentication-required endpoints, or multi-region testing
  • Browser MCP dependency implied but not formally documented; unclear what data sources/APIs the agent accesses
  • Git-tracking mentioned but no guidance on baseline initialization or conflict resolution strategies
Model: claude-haiku-4-5-20251001 · Analyzed: Apr 20, 2026

Reviews

No reviews yet

Version History

  • v1.1 (2026-04-20, Latest): Content updated
  • v1.0 (2026-04-12): No changelog
