Copilot PR Autopilot

Drive any GitHub pull request through repeated rounds of Copilot code review until the agent has done its job — every Copilot finding has a reply from the agent (fix-acknowledgement, decline-with-rationale, or explicit escalate-to-user hand-off). Remaining open threads, if any, are deliberate hand-offs to the human merge owner — they're not loop failures. Repository-agnostic — works on any repo that has Copilot Code Review enabled, run from a machine with gh CLI installed and authenticated (see Prerequisites).

When to Use This Skill

The user asks to "request Copilot review" or "run a Copilot review loop" on a PR.
A PR is functionally complete and the user wants a final correctness pass via repeated automated review rounds.
A previous Copilot review on the PR has left open threads that need triage, fixing, replying, and resolving.

When NOT to Use This Skill

The PR is still under active design — wait until the structure is stable; otherwise findings churn round-over-round.
The user wants human reviewer feedback, not Copilot's.

Prerequisites

gh CLI installed and authenticated against the target repository.
PowerShell on PATH — Windows PowerShell 5.1+ (powershell.exe) or PowerShell 7+ (pwsh). Both are tested.
Copilot Code Review is the primary use case (01-request-review.ps1 uses GraphQL requestReviewsByLogin to trigger Copilot). It is NOT a hard requirement — if 01-request-review.ps1 fails because Copilot isn't enabled on the repo / account, the agent can still drive existing review threads (human, advanced-security, etc.) to completion by running steps 3–8 once as a single iteration; just skip the trigger + wait. There is no auto-detect for "Copilot unavailable" — the agent makes that decision after the trigger fails (the script can't reliably tell "Copilot disabled" from "Copilot enabled but not yet triggered" from API state alone).
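A minimal Python sketch of that fallback decision, assuming the agent records whether the step 1 trigger succeeded. The names `plan_round`, `FULL_ROUND`, and `EXISTING_THREADS_ONLY` are hypothetical and are not part of the bundled scripts, which leave this decision to the agent.

```python
from typing import List

# Steps 1-9 of a normal round: trigger, wait, list, triage, fix,
# build/test, commit/push, reply/resolve, convergence check.
FULL_ROUND: List[int] = list(range(1, 10))

# Fallback when the trigger fails: skip the trigger (1) and the wait (2),
# and drive the existing review threads through steps 3-8 once.
EXISTING_THREADS_ONLY: List[int] = list(range(3, 9))


def plan_round(trigger_succeeded: bool) -> List[int]:
    """Return the step numbers to run, given the outcome of step 1."""
    return FULL_ROUND if trigger_succeeded else EXISTING_THREADS_ONLY
```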

Permissions: who can run the full loop

The full multi-round autopilot (steps 1 → 9 → 1) needs Triage or Write permission on the target repo, because GitHub's only public API for adding the Copilot bot as a reviewer (requestReviewsByLogin) is gated on that permission. Verified against the public REST + GraphQL surface in this PR's commit history: there is no public-API path for adding bot reviewers without that permission.

You are…	What works
Repo collaborator with Triage / Write	Full loop: `01` triggers Copilot, `02` waits, `03`–`08` list / triage / fix / reply, loop back to `01`. Hands-off.
External PR author (no write permission)	`01` will throw a clear actionable error. Use `-SingleIteration` mode: address all current findings in one pass, then either click the UI 🔄 next to Copilot, or push a substantive commit (the `synchronize` event auto-triggers Copilot on most repos). Then re-run `02` to verify.

In single-iteration mode the loop's convergence boolean is Converged: true iff OpenThreadsAwaitingReply == 0 (the agent's side is done). The maintainer-side re-trigger then drives any additional rounds.
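The single-iteration rule can be illustrated with a small Python sketch. It assumes a thread counts as "awaiting reply" when it is unresolved and has no reply from the agent; that reading, and the `Thread` type, are assumptions for illustration rather than the script's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Thread:
    resolved: bool
    has_agent_reply: bool


def open_threads_awaiting_reply(threads: Iterable[Thread]) -> int:
    """Count unresolved threads the agent has not replied to yet (assumed definition)."""
    return sum(1 for t in threads if not t.resolved and not t.has_agent_reply)


def converged_single_iteration(threads: Iterable[Thread]) -> bool:
    """Single-iteration mode: converged iff nothing is still awaiting a reply."""
    return open_threads_awaiting_reply(threads) == 0
```

Under this reading, a thread that was escalated to the user (replied to but left open) does not block convergence, which matches the hand-off semantics described at the top of this document.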

Every script dot-sources scripts/_lib.ps1, which runs Assert-GhReady on load: if gh is missing or gh auth status fails, the script halts before doing any work, with a single actionable error message naming the install command and gh auth login. The agent should surface that message to the user verbatim and stop the loop; do not retry or work around it.
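For readers who do not use PowerShell, the shape of that guard can be sketched in Python. This is not the code in scripts/_lib.ps1; `assert_gh_ready`, `PrerequisiteMissing`, and the message wording are hypothetical. It relies only on `gh auth status` exiting non-zero when no account is logged in.

```python
import shutil
import subprocess
from typing import Callable, Optional, Sequence


class PrerequisiteMissing(RuntimeError):
    """Raised before any work starts; the caller should stop, not retry."""


def _exit_code(argv: Sequence[str]) -> int:
    return subprocess.run(argv, capture_output=True).returncode


def assert_gh_ready(
    which: Callable[[str], Optional[str]] = shutil.which,
    run: Callable[[Sequence[str]], int] = _exit_code,
) -> None:
    # 1. gh must be on PATH.
    if which("gh") is None:
        raise PrerequisiteMissing(
            "gh CLI is not on PATH: install it (https://cli.github.com), "
            "then run `gh auth login`"
        )
    # 2. gh must be authenticated.
    if run(["gh", "auth", "status"]) != 0:
        raise PrerequisiteMissing("gh CLI is not authenticated: run `gh auth login`")
```

The `which` and `run` parameters exist only so the check can be exercised without a real gh installation.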

Step-by-Step Workflow

The loop: steps 1 → 2 → 3 → 4 → 5 → 6 → 7 → 8 → 9, then back to step 1 if Converged: false. Repeat the 1→9 round until step 9 returns Converged: true; only then run step 10 once and call task_complete. At every 10th round, the parent runs the round-cap recap gate before looping back — recap all prior rounds and stop if the loop has drifted out of the PR's original scope.

Each round runs steps 1–9; step 10 is a one-time cleanup after convergence. The parent agent coordinates; every sub-agent step runs in a fresh context with a bounded budget. Cross-cutting protocol (time-boxing, extension, single-iteration fallback): orchestration.md.
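A bounded budget for a waiting step can be sketched as a poll loop with a deadline. This Python sketch is illustrative only: `wait_until`, the 30-second interval, and the injectable `clock` and `sleep` are assumptions, and the actual time-boxing and extension protocol lives in orchestration.md.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    poll: Callable[[], Optional[T]],
    cap_seconds: float,
    interval_seconds: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Poll until a result appears or the budget is spent; None means timed out."""
    deadline = clock() + cap_seconds
    while True:
        result = poll()
        if result is not None:
            return result
        # Stop if another sleep would overrun the budget.
        if clock() + interval_seconds > deadline:
            return None
        sleep(interval_seconds)
```

A caller would pass something like `cap_seconds=20 * 60` for the 20-minute wait step and treat a `None` result as "budget exhausted, report back to the parent".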

1. Request review (parent); see 01-request-review.md
2. Wait for review (sub-agent, 20-min cap); see 02-wait.md
3. List + categorize open threads (sub-agent, 5 min); see 03-list-threads.md
4. Triage (sub-agent, 5 min per ≤5 threads); see 04-triage.md
5. Fix (sub-agents, parallel max 5, 5 min each); see 05-fix.md
6. Build + test per repo conventions (sub-agent, 10 min); see 06-build-test.md
7. Commit + push (parent); see 07-commit-push.md
8. Reply (always) + resolve (conditional) (sub-agent drafts, parent posts); see 08-reply-resolve.md
9. Convergence verify (sub-agent, 3 min); see 09-convergence.md
   - Converged: false → loop back to step 1 for another round (re-trigger, wait, list, triage, fix, push, reply, re-check). Each round addresses Copilot's findings on the previous round's HEAD; the loop terminates as soon as Copilot has nothing new to say AND every open thread has a reply from the agent.
   - Converged: true → exit the loop, run step 10 once, call task_complete with the proof.
   - Every 10th round (10, 20, 30…) → run the round-cap recap gate before looping back. Recap ALL prior rounds against the PR's original scope and pick a verdict: CONTINUE, REVERT-AND-SHIP (drop drifted commits, ship the in-scope ones), or HAND-OFF (escalate to the user). This is the circuit breaker that stops a runaway bot-review loop.
10. Cleanup outdated (parent, post-convergence, once); see 10-cleanup.md
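The control flow of the steps above can be summarized in a short Python sketch. The callables `run_round`, `recap`, and `cleanup` stand in for steps 1 to 9, the recap gate, and step 10 respectively; they and the `Verdict` enum are hypothetical names, not part of the bundled scripts.

```python
from enum import Enum
from itertools import count
from typing import Callable


class Verdict(Enum):
    CONTINUE = "CONTINUE"
    REVERT_AND_SHIP = "REVERT-AND-SHIP"
    HAND_OFF = "HAND-OFF"


def run_loop(
    run_round: Callable[[int], bool],   # steps 1-9; returns the Converged flag
    recap: Callable[[int], Verdict],    # round-cap recap gate
    cleanup: Callable[[], None],        # step 10, post-convergence
    recap_interval: int = 10,
) -> str:
    for round_no in count(1):
        if run_round(round_no):
            cleanup()                   # runs exactly once, only after convergence
            return "converged"
        # Every Nth non-converged round: recap before looping back.
        if round_no % recap_interval == 0:
            verdict = recap(round_no)
            if verdict is not Verdict.CONTINUE:
                return verdict.value
    raise AssertionError("unreachable: count() never ends")
```

The recap gate is the only exit other than convergence, which is why it acts as the circuit breaker.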

Convergence is computed by scripts/02-check-review-status.ps1 and reported as a single Converged boolean. Do not call task_complete until it is true; include the proof (HeadOid, LatestCopilotReview.commitOid, submittedAt) in the completion message.
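As a rough Python illustration of the full-loop check and its proof line: the sketch below assumes "HEAD-match" means the local checkout matches the PR head, and "at-HEAD review" means the latest Copilot review was made against that same commit. Both readings are assumptions, as is the `local_head_oid` field; the script itself is the authority.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReviewStatus:
    head_oid: str                            # HeadOid
    local_head_oid: str                      # hypothetical: HEAD of the local checkout
    latest_review_commit_oid: Optional[str]  # LatestCopilotReview.commitOid
    submitted_at: Optional[str]              # LatestCopilotReview submittedAt
    open_threads_awaiting_reply: int         # OpenThreadsAwaitingReply


def is_converged(s: ReviewStatus) -> bool:
    """All three conditions must hold (assumed interpretation, see lead-in)."""
    return (
        s.local_head_oid == s.head_oid                 # HEAD-match
        and s.latest_review_commit_oid == s.head_oid   # at-HEAD review
        and s.open_threads_awaiting_reply == 0         # zero-awaiting
    )


def proof(s: ReviewStatus) -> str:
    """One-line proof for the completion message."""
    return (
        f"HeadOid={s.head_oid} "
        f"LatestCopilotReview.commitOid={s.latest_review_commit_oid} "
        f"submittedAt={s.submitted_at}"
    )
```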

Gotchas

The bundled scripts enforce the hard correctness invariants (trigger landing via copilot_work_started event id, Converged requiring HEAD-match + zero-awaiting + at-HEAD review, single-iteration fallback semantics, PR-state guard). Trust them — don't re-derive. The notes below cover decisions the scripts can't make for you:

Reply to every open thread; resolve only when the loop owns the disposition. For fix and decline threads, reply + resolve. For escalate-to-user threads, reply with the analysis but leave the thread OPEN (08-reply-and-resolve.ps1 -NoResolve) so the human merge owner can act on it. See 08-reply-resolve.md.
Copilot threads are loop-owned; human / advanced-security / other-bot threads default to escalate-to-user. Auto-resolving a human review thread can hide unaddressed concerns. See 04-triage.md for the rubric.
One focused commit per round, not one per PR. Bundling rounds destroys the audit trail of which finding drove which change and breaks git bisect. See 07-commit-push.md.
Build/test/lint with the repo's own commands (per its CONTRIBUTING / AGENTS / README / package.json / Makefile) before pushing a fix. Discovery procedure: 06-build-test.md.
Push back with written rationale when a Copilot finding would over-engineer the design for a hypothetical edge case. Auto-accepting every suggestion erodes the design — see the decline path in 04-triage.md.
Scripting traps (gh api graphql -F type-coercion, git stash push -m positional parsing, the three GraphQL traps for the reviewer mutation) are documented in references/api-quirks.md. Read before modifying any script.
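To make the first of those traps concrete: `gh api` treats `-F` (`--field`) values as typed, so literal `true`, `false`, `null`, and integers become JSON values and a leading `@` reads from a file, whereas `-f` (`--raw-field`) always sends a string. The Python sketch below builds an argument list that picks the flag from the Python type; `graphql_argv` is a hypothetical helper, not something the bundled scripts provide.

```python
from typing import List, Mapping, Union

Variable = Union[str, int, bool]


def graphql_argv(query: str, variables: Mapping[str, Variable]) -> List[str]:
    """Build a `gh api graphql` argv, choosing -f or -F per variable type."""
    argv = ["gh", "api", "graphql", "-f", f"query={query}"]
    for name, value in variables.items():
        if isinstance(value, bool):          # check bool first: bool is a subclass of int
            argv += ["-F", f"{name}={'true' if value else 'false'}"]
        elif isinstance(value, int):
            argv += ["-F", f"{name}={value}"]
        else:
            # Strings always go through -f so values such as "123" or "true"
            # are not coerced into numbers or booleans.
            argv += ["-f", f"{name}={value}"]
    return argv
```

Passing the list to `subprocess.run` without a shell also sidesteps quoting differences between shells.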

Troubleshooting

Issue	Solution
Script throws `prerequisite missing — gh CLI is not on PATH`	Install `gh` (`winget install GitHub.cli` on Windows; `brew install gh` on macOS; package manager on Linux; or download from https://cli.github.com). Then `gh auth login`. Surface the message to the user and STOP the loop — do not retry.
Script throws `prerequisite missing — gh CLI is not authenticated`	Run `gh auth login`. STOP the loop until the user completes auth.
Trigger fails or no `copilot_work_started` event lands	Push a substantive (non-whitespace) commit — auto-assign on `synchronize` is the most reliable trigger. Persistent failure indicates Copilot Code Review may not be enabled on the repo / account (check repo Settings → Code & automation → Copilot, or account-level Copilot Pro/Pro+).
No new review after waiting ~10 min	Quiet-period after recent dismissal or trivial-diff suppression. Push a substantive commit and retry. Do not blindly re-run `01-request-review.ps1` — it reports `InFlight` while Copilot is still a requested reviewer.
Outdated-but-unresolved threads in the open list	Expected: unresolved state is the source of truth. Reply + resolve them like any other open thread. `10-cleanup-outdated.ps1` is only a final safety net.
Unsure whether to fix or decline a finding	See references/04-triage.md.
Need a reply phrasing for "fixed", "declined", or "drift"	See the templates under templates/ — reply-fix.md, reply-decline.md, reply-drift.md, reply-partial.md.
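The reply and resolve policy described under Gotchas can be sketched in Python as follows. This simplifies the rubric in 04-triage.md to its defaults: Copilot threads keep their triaged disposition, every other author defaults to escalate-to-user, and only escalated threads stay open. All names here are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    FIX = "fix"
    DECLINE = "decline"
    ESCALATE = "escalate-to-user"


@dataclass(frozen=True)
class Action:
    reply: bool
    resolve: bool


def default_disposition(author_is_copilot: bool, triaged: Disposition) -> Disposition:
    """Copilot threads are loop-owned; all other authors default to escalation."""
    return triaged if author_is_copilot else Disposition.ESCALATE


def action_for(disposition: Disposition) -> Action:
    """Always reply; resolve only when the loop owns the disposition."""
    return Action(reply=True, resolve=disposition is not Disposition.ESCALATE)
```

An escalated thread would correspond to calling 08-reply-and-resolve.ps1 with -NoResolve.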

References

references/orchestration.md — cross-cutting loop control: time-boxing & extension protocol, sub-agent delegation map, single-iteration fallback, and loop-wide notes.
Per-step contracts (one NN-*.md per step): references/01-request-review.md (parent), references/02-wait.md, references/03-list-threads.md, references/04-triage.md (includes the fix-vs-decline rubric), references/05-fix.md, references/06-build-test.md, references/07-commit-push.md (parent), references/08-reply-resolve.md, references/09-convergence.md (includes the round-cap recap gate), references/10-cleanup.md (parent).
references/api-quirks.md — verified GitHub API behavior, dead-ends, and the GraphQL traps for the reviewer mutation.
Templates (one per reply type): templates/reply-fix.md — accepted-fix pattern; templates/reply-decline.md — declined-with-rationale pattern; templates/reply-drift.md — PR-description / comment / test-plan drift acknowledgement; templates/reply-partial.md — partial fix with deferred follow-up. Cross-cutting reply guidance and anti-patterns live in references/08-reply-resolve.md.
scripts/_lib.ps1 — shared helpers (Invoke-Gh, Invoke-GhGraphQL, Resolve-RepoCoords); dot-sourced by every script.
scripts/01-request-review.ps1 — trigger Copilot review and verify pickup via the copilot_work_started event.
scripts/02-check-review-status.ps1 — single-shot snapshot of the PR's Copilot review state; emits Converged: true only when all three conditions hold.
scripts/03-list-open-threads.ps1 — every unresolved PR review thread from all reviewers (Copilot, humans, github-advanced-security, etc.).
scripts/08-reply-and-resolve.ps1 — post a reply and resolve in one call.
scripts/10-cleanup-outdated.ps1 — safety net for outdated Copilot threads.

Files

24 files · 127.4 KB

Overall Score

Metric	Score
Overall	88/100 (Grade A, Excellent)
Safety	86
Quality	92
Clarity	84
Completeness	88

Summary

Copilot PR Autopilot is an advanced orchestration skill that automates multi-round GitHub PR reviews using Copilot Code Review and GitHub's GraphQL API. It drives a complete loop: triggering Copilot, waiting for review, triaging findings across all reviewers (Copilot, humans, security scanners), dispatching parallel fix sub-agents, validating builds/tests, committing per round, replying with SHA citations, and re-triggering until convergence. The skill includes explicit safeguards: single-iteration fallback for external contributors (no write permission), a recap gate at every 10th round to prevent drift, and escalation rules that keep human threads open as explicit hand-offs rather than auto-resolving them.

Detected Capabilities

GitHub GraphQL API calls (requestReviewsByLogin, reviews, pull requests, comments)
PowerShell script execution (Windows 5.1+, PowerShell 7+)
gh CLI invocation (authenticated API gateway)
Git operations (commit, push, stash)
File system read (repo discovery: CONTRIBUTING.md, Makefile, package.json, CI configs)
File system write (apply fixes, commit changes)
Build/test/lint execution per repo conventions
Parallel sub-agent coordination (max 5 concurrent fix agents)
Loop orchestration with budgeting and extension protocol

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

address copilot comments
copilot review loop
fix pr findings
iterate on copilot feedback
auto-resolve review threads
copilot multi-round
triage mixed reviewers
prevent bot drift

Risk Signals

Level	Signal	Source
INFO	GraphQL mutation to add bot reviewers requires Triage/Write permission on target repo	SKILL.md, Prerequisites section
INFO	PowerShell script execution with native `gh` CLI integration; exit codes must be checked explicitly (gh is native, not a cmdlet)	references/api-quirks.md
INFO	File writes to PR (commits, replies), scoped to the PR branch only; no arbitrary filesystem mutation	Step 7 (commit-push), Step 8 (reply-resolve)
WARNING	Runs build/test/lint commands discovered from repo config: potential for arbitrary command execution if repo config is untrusted	references/06-build-test.md
WARNING	GraphQL variables passed via gh CLI with -f/-F flags; improper flag choice can cause type coercion or command injection on Windows PowerShell 5.1	references/api-quirks.md, 'gh api graphql -F coerces strings' section
INFO	Dot-sources shared _lib.ps1 helper in every script; Assert-GhReady halts on missing gh or auth failure before any work	SKILL.md, 'Every script dot-sources' paragraph

Referenced Domains

External domains referenced in skill content, detected by static analysis.

cli.github.com

Use Cases

Auto-iterate on Copilot code review comments until all findings have agent replies (fix-acknowledgement, decline-with-rationale, or explicit escalation)
Triage mixed review threads from Copilot, humans, and security scanners, applying different policies to each source
Parallelize fix implementation across multiple Copilot findings while respecting repo build/test/lint conventions
Prevent bot-review runaway by recapping every 10 rounds against the PR's original scope and deciding whether to continue, revert-and-ship, or hand off to a human
Provide single-iteration mode for external PR authors without write permission (run once, let human re-trigger via UI or commit)
Document the decision chain across rounds (why each finding was fixed or declined, citing commit SHAs) for audit and future reviewer context

Quality Notes

✅ Extremely comprehensive: 10 steps with detailed contracts, 12 reference docs, 4 reply templates, and a 15KB shared PowerShell library of verified helpers. Each step has inputs, return contract, procedure, and gotchas explicitly documented.
✅ Clear architectural boundaries: parent-owned steps (1, 7, 10, convergence decisions), sub-agent waves with budgets (2, 3, 4, 5, 6, 8, 9), and a parallel execution cap (max 5 concurrent fix agents). Loop control is explicit and scripted (RecapInterval, RecapDue, Converged flag).
✅ Safety & consistency guardrails: single-iteration fallback for auth/permission failures, round-cap recap gate (every 10 rounds) to stop bot-drift, explicit escalation rules for human threads (leave open, don't auto-resolve), oscillation detection (hard stop if reverting prior round's edit).
✅ API-specific expertise: detailed documented GraphQL traps (requestReviewsByLogin vs requestReviews, botLogins vs userLogins, latestReviews cache staleness, exit-code handling), PowerShell 5.1 native-arg quoting workarounds, git stash argument order.
✅ Excellent reply templates: concrete examples (lock promotion, UUID removal) showing how to cite rationale, file paths, and SHAs — no generic 'Thanks for the feedback' replies.
✅ Repository-agnostic discovery: mandatory convention lookup (CONTRIBUTING.md, Makefile, CI configs, recent commits) before applying fixes; no invented build commands.
⚠️ Complexity: 10-step orchestration with parallel waves, sub-agent budgeting, extension protocol, and a recap gate. This is intentional for the domain (runaway bot-review prevention) but requires careful parent-agent implementation. The instructions assume the parent can reason about drift, call recap verdicts, and manage loop state across rounds.
⚠️ Convergence definition is nuanced: three modes (normal Copilot-driven, single-iteration, no-review-ever-observed), and `Converged: true` with open threads IS valid (escalated hand-offs stay open). The script enforces this correctly, but a misimplementing parent could falsely declare convergence.
⚠️ Gotchas are critical path items (reply-to-every-thread, resolve-only-if-owned, one-commit-per-round, no-invent-build-commands) — violations silently break the audit trail or skip review threads. All are documented, but their centrality suggests this skill rewards careful reading.
⚠️ Permission model is asymmetric: full loop (write permission), single-iteration (no write), with no auto-detect for Copilot availability. External contributors get a clear error at step 1, but error handling and fallback responsibility sits with the parent.

Model: claude-haiku-4-5-20251001 · Analyzed: Jul 1, 2026
