
affaan-m / agentic-engineering

Operate as an agentic engineer using eval-first execution, decomposition, and cost-aware model routing.

Scope: global · 0 installs · 0 uses · ~418
v1.1 · Saved Apr 20, 2026

Agentic Engineering

Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.

Operating Principles

  1. Define completion criteria before execution.
  2. Decompose work into agent-sized units.
  3. Route model tiers by task complexity.
  4. Measure with evals and regression checks.

Eval-First Loop

  1. Define capability eval and regression eval.
  2. Run baseline and capture failure signatures.
  3. Execute implementation.
  4. Re-run evals and compare deltas.
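The four steps above can be sketched as a minimal harness. This is an illustrative sketch, not a prescribed framework: `EvalCase`, `EvalReport`, and `run_eval` are hypothetical names, and the string-transform task is a stand-in for real implementation work.

```python
from dataclasses import dataclass, field

@dataclass
class EvalCase:
    name: str
    inputs: str
    expected: str

@dataclass
class EvalReport:
    passed: int = 0
    failures: list = field(default_factory=list)  # captured failure signatures

def run_eval(cases, solve):
    report = EvalReport()
    for case in cases:
        actual = solve(case.inputs)
        if actual == case.expected:
            report.passed += 1
        else:
            # A failure signature: enough detail to recognize a regression later.
            report.failures.append((case.name, case.expected, actual))
    return report

# Steps 1-2: define cases and capture the baseline before implementation.
cases = [EvalCase("upper", "abc", "ABC"), EvalCase("empty", "", "")]
baseline = run_eval(cases, lambda s: s)        # stub pre-implementation behavior
# Step 3: implementation happens here.
after = run_eval(cases, lambda s: s.upper())   # post-implementation behavior
# Step 4: compare deltas; capability must rise with no new failure signatures.
new_failures = {f[0] for f in after.failures} - {f[0] for f in baseline.failures}
assert after.passed >= baseline.passed and not new_failures
```

The regression check is the set difference on failure signatures: a fix that introduces a new signature fails even if the total pass count improved.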

Task Decomposition

Apply the 15-minute unit rule: scope each unit so an agent can complete it, and a human can verify it, in roughly 15 minutes. Specifically:

  • each unit should be independently verifiable
  • each unit should have a single dominant risk
  • each unit should expose a clear done condition
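One way to make the three properties above concrete is a small record per unit, assuming a structure like the hypothetical `TaskUnit` below (not a format defined by the skill):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class TaskUnit:
    title: str
    dominant_risk: str              # exactly one dominant risk per unit
    done_check: Callable[[], bool]  # independently verifiable done condition

    def is_done(self) -> bool:
        return self.done_check()

# Illustrative units; the done checks would normally run tests or evals.
units = [
    TaskUnit("add retry wrapper", "retries mask real errors", lambda: True),
    TaskUnit("wire config flag", "default value flips behavior", lambda: True),
]

# Each unit verifies on its own, with no cross-unit ordering required.
assert all(u.is_done() for u in units)
```

Forcing a single `dominant_risk` string per unit is a useful smell test: if you cannot name one dominant risk, the unit is probably too large.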

Model Routing

  • Haiku: classification, boilerplate transforms, narrow edits
  • Sonnet: implementation and refactors
  • Opus: architecture, root-cause analysis, multi-file invariants
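The tier table above can be expressed as a routing function. The task-kind labels are assumptions drawn from the bullets, not an official taxonomy:

```python
# Map task kinds to model tiers, mirroring the routing table above.
TIER_BY_TASK = {
    "classification": "haiku",
    "boilerplate_transform": "haiku",
    "narrow_edit": "haiku",
    "implementation": "sonnet",
    "refactor": "sonnet",
    "architecture": "opus",
    "root_cause_analysis": "opus",
    "multi_file_invariants": "opus",
}

def route(task_kind: str) -> str:
    # Unknown work defaults to the mid tier, not the most expensive one.
    return TIER_BY_TASK.get(task_kind, "sonnet")

assert route("narrow_edit") == "haiku"
assert route("refactor") == "sonnet"
assert route("architecture") == "opus"
```

Defaulting unknown work to the mid tier keeps cost bounded while the escalation rule in Cost Discipline handles the cases where the mid tier proves insufficient.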

Session Strategy

  • Continue the session for closely coupled units.
  • Start fresh session after major phase transitions.
  • Compact after milestone completion, not during active debugging.

Review Focus for AI-Generated Code

Prioritize:

  • invariants and edge cases
  • error boundaries
  • security and auth assumptions
  • hidden coupling and rollout risk

Do not spend review cycles on style-only disagreements when automated formatting and linting already enforce style.

Cost Discipline

Track per task:

  • model
  • token estimate
  • retries
  • wall-clock time
  • success/failure

Escalate model tier only when lower tier fails with a clear reasoning gap.
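A per-task record covering the metrics above, plus the escalation rule, might look like the following sketch. `TaskCost`, `NEXT_TIER`, and the `reasoning_gap` flag are hypothetical names introduced for illustration:

```python
from dataclasses import dataclass

@dataclass
class TaskCost:
    model: str
    token_estimate: int
    retries: int
    wall_clock_s: float
    success: bool
    reasoning_gap: bool = False  # did the failure show a clear reasoning gap?

NEXT_TIER = {"haiku": "sonnet", "sonnet": "opus"}

def next_model(record: TaskCost) -> str:
    # Escalate only on failure with a clear reasoning gap; otherwise stay
    # at the same tier (and stop escalating at the top tier).
    if not record.success and record.reasoning_gap:
        return NEXT_TIER.get(record.model, record.model)
    return record.model

assert next_model(TaskCost("haiku", 1200, 2, 45.0, False, reasoning_gap=True)) == "sonnet"
assert next_model(TaskCost("haiku", 1200, 2, 45.0, False, reasoning_gap=False)) == "haiku"
assert next_model(TaskCost("opus", 9000, 1, 300.0, False, reasoning_gap=True)) == "opus"
```

Gating escalation on an explicit `reasoning_gap` judgment, rather than on failure alone, prevents flaky tasks from ratcheting every retry up to the most expensive tier.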

Files

1 file · 1.0 KB

Overall Score

62/100

Grade

C

Adequate

Safety

95

Quality

58

Clarity

68

Completeness

45

Summary

This skill provides methodology guidance for operating AI agents in engineering workflows using eval-first execution, task decomposition, and cost-aware model routing. It instructs humans and AI systems on how to structure agentic work, define success criteria, decompose tasks, and manage model costs across different tiers (Haiku, Sonnet, Opus).

Detected Capabilities

  • Eval framework design and baseline capture
  • Task decomposition and scope definition
  • Model tier selection and routing logic
  • Cost tracking and optimization
  • Code review strategy for AI-generated code
  • Session management and state compaction

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

agentic engineering, eval-first execution, task decomposition, model routing, ai code review, cost optimization

Use Cases

  • Structuring large engineering projects for AI agent implementation
  • Defining and running evaluation baselines before agent execution
  • Decomposing complex tasks into independently verifiable units
  • Managing cost and quality trade-offs across Claude model tiers
  • Conducting effective code reviews of AI-generated implementations

Quality Notes

  • Skill provides clear operating principles and methodology but lacks concrete examples or templates
  • The '15-minute unit rule' is mentioned but not explained with sufficient detail for consistent application
  • Model routing guidance is brief — no guidance on failure modes, retry budgets, or when to escalate tiers
  • No concrete eval example: what does a 'baseline' look like? How are 'failure signatures' captured?
  • Session strategy section uses vague terms ('closely-coupled units', 'major phase transitions') without defining thresholds
  • Cost tracking guidance lists metrics to track but provides no format, reporting mechanism, or threshold definitions
  • Review focus section is useful but would benefit from concrete examples of 'hidden coupling' or 'rollout risk' patterns
  • Skill is instructional rather than executable — it does not provide templates, checklist formats, or structured outputs that an agent could directly use
  • No guidance on error recovery: what happens when a regression eval fails post-implementation?
Model: claude-haiku-4-5-20251001 · Analyzed: Apr 20, 2026

Reviews


No reviews yet


Version History

v1.1 (latest) · Content updated · 2026-04-20

v1.0 · Seeded from github.com/affaan-m/everything-claude-code · 2026-03-16
