Catalog
Yeachan-Heo/autoresearch

Yeachan-Heo

autoresearch

Stateful single-mission improvement loop with strict evaluator contract, markdown decision logs, and max-runtime stop behavior

global
0 installs · 0 uses · ~856
v1.0 · Saved Apr 20, 2026

<Use_When>

  • You already have a mission and evaluator from /deep-interview --autoresearch
  • You want persistent single-mission improvement with strict evaluation
  • You need durable experiment logs under .omc/autoresearch/
  • You want a supported path for periodic reruns via Claude Code native cron
</Use_When>

<Do_Not_Use_When>

  • You need evaluator generation at runtime — use /deep-interview --autoresearch first
  • You need multiple missions orchestrated together — v1 forbids that
  • You want the deprecated omc autoresearch CLI flow — it is no longer authoritative
</Do_Not_Use_When>

<Required_Artifacts>

Canonical persistent storage lives under .omc/autoresearch/<mission-slug>/ and/or .omc/logs/autoresearch/<run-id>/.

Minimum required artifacts:

  • mission spec
  • evaluator script or command reference
  • per-iteration evaluation JSON
  • markdown decision logs

Recommended canonical shape:

.omc/autoresearch/<mission-slug>/
  mission.md
  evaluator.json
  runs/<run-id>/
    evaluations/
      iteration-0001.json
      iteration-0002.json
    decision-log.md

Reuse existing runtime artifacts when available rather than duplicating them unnecessarily.
</Required_Artifacts>
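As a minimal sketch of the canonical layout above — the skill does not prescribe the exact JSON fields, so the record shape here is an assumption based on the evaluator contract (boolean pass, optional numeric score):

```python
import json
from pathlib import Path

def write_iteration_record(mission_dir: Path, run_id: str,
                           iteration: int, passed: bool, score=None) -> Path:
    """Write one per-iteration evaluation JSON under runs/<run-id>/evaluations/.

    The field names ("iteration", "pass", "score") are hypothetical; only the
    directory layout follows the recommended canonical shape.
    """
    eval_dir = mission_dir / "runs" / run_id / "evaluations"
    eval_dir.mkdir(parents=True, exist_ok=True)
    record = {"iteration": iteration, "pass": passed, "score": score}
    path = eval_dir / f"iteration-{iteration:04d}.json"
    path.write_text(json.dumps(record, indent=2))
    return path

p = write_iteration_record(Path(".omc/autoresearch/demo-mission"), "run-0001",
                           iteration=1, passed=True, score=0.82)
```

Zero-padded iteration filenames (`iteration-0001.json`) keep lexicographic and chronological order aligned, which is why the canonical shape uses them.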

<Cron_Integration>

Claude Code native cron is a supported integration point for periodic mission enhancement. In v1, prefer documenting/configuring cron inputs over building a large scheduler UI.

If cron is used:

  • keep one mission per scheduled job
  • preserve the same mission/evaluator contract
  • append new run artifacts rather than overwriting prior experiments
</Cron_Integration>
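The append-not-overwrite rule can be enforced mechanically by minting a fresh run directory per scheduled job. This is a sketch under assumptions — the skill does not specify a run-id format, so the timestamp-based id here is illustrative:

```python
import time
from pathlib import Path

def new_run_dir(mission_dir: Path) -> Path:
    """Create a fresh runs/<run-id>/ directory for a scheduled rerun.

    The run-id format (timestamp-based) is an assumption. exist_ok=False makes
    a collision fail loudly instead of silently clobbering a prior experiment.
    """
    run_id = time.strftime("run-%Y%m%dT%H%M%S")
    run_dir = mission_dir / "runs" / run_id
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir

d = new_run_dir(Path(".omc/autoresearch/demo-mission"))
```

A cron job that calls something like this on each tick preserves every prior run's artifacts by construction.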

<Execution_Policy>

  • Do not hand execution back to omc autoresearch
  • Do not create multi-mission orchestration
  • Prefer reusing src/autoresearch/* runtime/schema helpers where they already match the stricter contract
  • Keep logs useful to humans, not only machines
</Execution_Policy>
Files

1 file · 1.0 KB


Overall Score

72/100

Grade

B

Good

Safety

78

Quality

68

Clarity

75

Completeness

62

Summary

Autoresearch is a stateful skill that orchestrates bounded, single-mission iterative improvement loops with strict evaluator contracts. It maintains durable experiment logs in markdown and JSON formats, enforces max-runtime stop conditions, and supports periodic reruns via native cron scheduling.
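The bounded loop the summary describes — iterate, evaluate, and stop when the runtime budget is exhausted — can be sketched as follows. Names and signatures are illustrative, not the skill's actual API; only the control flow (max runtime as the strict stop condition, continuing past failed evaluations) reflects the documented behavior:

```python
import time

def improvement_loop(evaluate, improve, max_runtime_s: float, max_iterations: int = 100):
    """Single-mission improvement loop with a strict max-runtime stop.

    `evaluate` is assumed to return (passed: bool, score). The loop continues
    even when an iteration does not pass, as long as budget remains.
    """
    deadline = time.monotonic() + max_runtime_s
    history = []
    for i in range(1, max_iterations + 1):
        if time.monotonic() >= deadline:  # primary strict stop condition
            break
        improve()
        passed, score = evaluate()
        history.append({"iteration": i, "pass": passed, "score": score})
    return history

# Tiny stub mission: never passes, but the loop stays bounded regardless.
h = improvement_loop(lambda: (False, 0.5), lambda: None,
                     max_runtime_s=0.05, max_iterations=3)
```

Checking the deadline with `time.monotonic()` rather than wall-clock time keeps the stop condition immune to system clock changes mid-run.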

Detected Capabilities

  • State management for iterative missions (mission slug, evaluator reference, iteration tracking)
  • File I/O for mission specs, evaluator definitions, and evaluation results
  • Markdown and JSON artifact generation (decision logs, evaluation records)
  • Runtime lifecycle management (start, resume, stop conditions)
  • Timestamp tracking and run-id bookkeeping
  • Cron integration for periodic scheduling

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

  • iterative improvement loop
  • single-mission experiment
  • evaluator-driven optimization
  • bounded experiment run
  • resume experiment checkpoint

Risk Signals

  • WARNING: Skill references external evaluator scripts/commands without specifying how they are sourced or validated (Contract section: 'evaluator script or command reference')
  • INFO: File write operations to `.omc/autoresearch/` and `.omc/logs/` directories without explicit permission guards documented (Required_Artifacts and Workflow sections)
  • INFO: Max-runtime parameter described as 'primary strict stop hook' but no specification of duration format or validation logic provided (Argument-hint and Execution_Policy sections)
  • WARNING: Resume capability via `--resume <run-id>` referenced but no guardrails documented against resuming stale or compromised run states (Argument-hint and Workflow sections)

Use Cases

  • Run iterative improvement experiments on a single well-defined mission with structured evaluation
  • Maintain durable experiment logs and decision records for reproducibility and audit trails
  • Integrate periodic mission enhancement with native cron scheduling
  • Resume interrupted improvement runs from checkpoint state

Quality Notes

  • Positive: Clear contract boundaries — skill explicitly forbids multi-mission orchestration and multi-evaluator scenarios, reducing complexity.
  • Positive: Well-structured artifact model with canonical directory layout and versioned run artifacts supports reproducibility and audit trails.
  • Positive: Markdown decision logs paired with JSON evaluation records balance human readability and machine parsability.
  • Positive: Explicit `Do_Not_Use_When` section sets clear expectations about dependencies (requires prior `deep-interview --autoresearch` setup).
  • Negative: No specification of evaluator JSON schema validation — the skill requires 'boolean pass and optional numeric score' but does not document error handling for malformed evaluator output.
  • Negative: Max-runtime duration format is not specified (e.g., ISO 8601, seconds, human-readable like '2h30m'). The skill should document expected input format and validation.
  • Negative: No documented rollback or conflict resolution strategy when resuming runs — what happens if the mission spec or evaluator has changed since the last run?
  • Negative: Cron integration section is aspirational ('prefer documenting') rather than prescriptive — unclear how cron inputs map to CLI arguments or how scheduling state is persisted.
  • Negative: No guidance on handling evaluator timeouts, crashes, or partial failures — the skill says 'continue even when evaluation does not pass' but does not address when evaluation itself fails.
  • Negative: Missing examples of decision-log markdown format and iteration JSON schema — agents would benefit from concrete templates.
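To make the first negative note above concrete: a malformed-output guard for the 'boolean pass and optional numeric score' contract could look like the sketch below. This is a hypothetical validator, not part of the skill; it shows one way to reject bad evaluator output instead of silently coercing it:

```python
def validate_evaluator_output(raw: dict):
    """Validate the hypothesized evaluator contract: boolean 'pass',
    optional numeric 'score'. Raises ValueError on malformed output."""
    if not isinstance(raw.get("pass"), bool):
        raise ValueError(f"evaluator output missing boolean 'pass': {raw!r}")
    score = raw.get("score")
    if score is not None and not isinstance(score, (int, float)):
        raise ValueError(f"'score' must be numeric if present: {score!r}")
    return raw["pass"], score

ok = validate_evaluator_output({"pass": True, "score": 0.9})
```

Failing fast here means a crashed or misbehaving evaluator produces a logged error in the decision log rather than a bogus iteration record.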
Model: claude-haiku-4-5-20251001 · Analyzed: Apr 20, 2026

Reviews

No reviews yet