
affaan-m / agentic-engineering

Operate as an agentic engineer using eval-first execution, decomposition, and cost-aware model routing.

Scope: global · 0 installs · 0 uses · ~418
v1.1 · Saved Apr 20, 2026

Agentic Engineering

Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.

Operating Principles

  1. Define completion criteria before execution.
  2. Decompose work into agent-sized units.
  3. Route model tiers by task complexity.
  4. Measure with evals and regression checks.

Eval-First Loop

  1. Define capability eval and regression eval.
  2. Run baseline and capture failure signatures.
  3. Execute implementation.
  4. Re-run evals and compare deltas.
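The four steps above can be sketched as a minimal harness. This is an illustrative sketch, not a prescribed framework: `EvalCase`, `EvalReport`, and `run_eval` are hypothetical names, and the string-transform task is a stand-in for real implementation work.

```python
from dataclasses import dataclass, field

@dataclass
class EvalCase:
    name: str
    inputs: str
    expected: str

@dataclass
class EvalReport:
    passed: int = 0
    failures: list = field(default_factory=list)  # captured failure signatures

def run_eval(cases, solve):
    report = EvalReport()
    for case in cases:
        actual = solve(case.inputs)
        if actual == case.expected:
            report.passed += 1
        else:
            # A failure signature: enough detail to recognize a regression later.
            report.failures.append((case.name, case.expected, actual))
    return report

# Steps 1-2: define cases and capture the baseline before implementation.
cases = [EvalCase("upper", "abc", "ABC"), EvalCase("empty", "", "")]
baseline = run_eval(cases, lambda s: s)        # stub pre-implementation behavior
# Step 3: implementation happens here.
after = run_eval(cases, lambda s: s.upper())   # post-implementation behavior
# Step 4: compare deltas; capability must rise with no new failure signatures.
new_failures = {f[0] for f in after.failures} - {f[0] for f in baseline.failures}
assert after.passed >= baseline.passed and not new_failures
```

The regression check is the set difference on failure signatures: a fix that introduces a new signature fails even if the total pass count improved.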

Task Decomposition

Apply the 15-minute unit rule: scope each unit so an agent can complete it, and a human can verify it, in roughly 15 minutes. Specifically:

  • each unit should be independently verifiable
  • each unit should have a single dominant risk
  • each unit should expose a clear done condition
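One way to make the three properties above concrete is a small record per unit, assuming a structure like the hypothetical `TaskUnit` below (not a format defined by the skill):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class TaskUnit:
    title: str
    dominant_risk: str              # exactly one dominant risk per unit
    done_check: Callable[[], bool]  # independently verifiable done condition

    def is_done(self) -> bool:
        return self.done_check()

# Illustrative units; the done checks would normally run tests or evals.
units = [
    TaskUnit("add retry wrapper", "retries mask real errors", lambda: True),
    TaskUnit("wire config flag", "default value flips behavior", lambda: True),
]

# Each unit verifies on its own, with no cross-unit ordering required.
assert all(u.is_done() for u in units)
```

Forcing a single `dominant_risk` string per unit is a useful smell test: if you cannot name one dominant risk, the unit is probably too large.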

Model Routing

  • Haiku: classification, boilerplate transforms, narrow edits
  • Sonnet: implementation and refactors
  • Opus: architecture, root-cause analysis, multi-file invariants
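The tier table above can be expressed as a routing function. The task-kind labels are assumptions drawn from the bullets, not an official taxonomy:

```python
# Map task kinds to model tiers, mirroring the routing table above.
TIER_BY_TASK = {
    "classification": "haiku",
    "boilerplate_transform": "haiku",
    "narrow_edit": "haiku",
    "implementation": "sonnet",
    "refactor": "sonnet",
    "architecture": "opus",
    "root_cause_analysis": "opus",
    "multi_file_invariants": "opus",
}

def route(task_kind: str) -> str:
    # Unknown work defaults to the mid tier, not the most expensive one.
    return TIER_BY_TASK.get(task_kind, "sonnet")

assert route("narrow_edit") == "haiku"
assert route("refactor") == "sonnet"
assert route("architecture") == "opus"
```

Defaulting unknown work to the mid tier keeps cost bounded while the escalation rule in Cost Discipline handles the cases where the mid tier proves insufficient.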

Session Strategy

  • Continue the session for closely coupled units.
  • Start fresh session after major phase transitions.
  • Compact after milestone completion, not during active debugging.

Review Focus for AI-Generated Code

Prioritize:

  • invariants and edge cases
  • error boundaries
  • security and auth assumptions
  • hidden coupling and rollout risk

Do not spend review cycles on style-only disagreements when automated formatting and linting already enforce style.

Cost Discipline

Track per task:

  • model
  • token estimate
  • retries
  • wall-clock time
  • success/failure

Escalate model tier only when lower tier fails with a clear reasoning gap.
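A per-task record covering the metrics above, plus the escalation rule, might look like the following sketch. `TaskCost`, `NEXT_TIER`, and the `reasoning_gap` flag are hypothetical names introduced for illustration:

```python
from dataclasses import dataclass

@dataclass
class TaskCost:
    model: str
    token_estimate: int
    retries: int
    wall_clock_s: float
    success: bool
    reasoning_gap: bool = False  # did the failure show a clear reasoning gap?

NEXT_TIER = {"haiku": "sonnet", "sonnet": "opus"}

def next_model(record: TaskCost) -> str:
    # Escalate only on failure with a clear reasoning gap; otherwise stay
    # at the same tier (and stop escalating at the top tier).
    if not record.success and record.reasoning_gap:
        return NEXT_TIER.get(record.model, record.model)
    return record.model

assert next_model(TaskCost("haiku", 1200, 2, 45.0, False, reasoning_gap=True)) == "sonnet"
assert next_model(TaskCost("haiku", 1200, 2, 45.0, False, reasoning_gap=False)) == "haiku"
assert next_model(TaskCost("opus", 9000, 1, 300.0, False, reasoning_gap=True)) == "opus"
```

Gating escalation on an explicit `reasoning_gap` judgment, rather than on failure alone, prevents flaky tasks from ratcheting every retry up to the most expensive tier.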

Files

1 file · 1.0 KB

Overall Score

62/100

Grade

C

Adequate

Safety

95

Quality

58

Clarity

68

Completeness

45

Summary

This skill provides methodology guidance for operating AI agents in engineering workflows using eval-first execution, task decomposition, and cost-aware model routing. It instructs humans and AI systems on how to structure agentic work, define success criteria, decompose tasks, and manage model costs across different tiers (Haiku, Sonnet, Opus).

Detected Capabilities

  • Eval framework design and baseline capture
  • Task decomposition and scope definition
  • Model tier selection and routing logic
  • Cost tracking and optimization
  • Code review strategy for AI-generated code
  • Session management and state compaction

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

agentic engineering, eval-first execution, task decomposition, model routing, ai code review, cost optimization

Use Cases

  • Structuring large engineering projects for AI agent implementation
  • Defining and running evaluation baselines before agent execution
  • Decomposing complex tasks into independently verifiable units
  • Managing cost and quality trade-offs across Claude model tiers
  • Conducting effective code reviews of AI-generated implementations

Quality Notes

  • Skill provides clear operating principles and methodology but lacks concrete examples or templates
  • The '15-minute unit rule' is mentioned but not explained with sufficient detail for consistent application
  • Model routing guidance is brief — no guidance on failure modes, retry budgets, or when to escalate tiers
  • No concrete eval example: what does a 'baseline' look like? How are 'failure signatures' captured?
  • Session strategy section uses vague terms ('closely-coupled units', 'major phase transitions') without defining thresholds
  • Cost tracking guidance lists metrics to track but provides no format, reporting mechanism, or threshold definitions
  • Review focus section is useful but would benefit from concrete examples of 'hidden coupling' or 'rollout risk' patterns
  • Skill is instructional rather than executable — it does not provide templates, checklist formats, or structured outputs that an agent could directly use
  • No guidance on error recovery: what happens when a regression eval fails post-implementation?
Model: claude-haiku-4-5-20251001 · Analyzed: Apr 20, 2026

Reviews


No reviews yet


Version History

v1.1 (latest) · Content updated · 2026-04-20

v1.0 · Seeded from github.com/affaan-m/everything-claude-code · 2026-03-16
