affaan-m/enterprise-agent-ops v1.2

enterprise-agent-ops

Operate long-lived agent workloads with observability, security boundaries, and lifecycle management.

global

origin:ECC

New~286

v1.2 · Saved Jul 14, 2026

Enterprise Agent Ops

Use this skill for cloud-hosted or continuously running agent systems that need operational controls beyond single CLI sessions.

Operational Domains

runtime lifecycle (start, pause, stop, restart)
observability (logs, metrics, traces)
safety controls (scopes, permissions, kill switches)
change management (rollout, rollback, audit)
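
The runtime lifecycle domain can be read as a small state machine. The sketch below is one possible shape, not the skill's prescribed implementation: the `State` enum, the `TRANSITIONS` table, the `AgentLifecycle` class, and the `resume` action (added here as the inverse of pause) are all hypothetical names chosen for illustration. It also shows a kill switch that forces the agent to stop and blocks every action except `stop` afterwards.

```python
from enum import Enum


class State(Enum):
    STOPPED = "stopped"
    RUNNING = "running"
    PAUSED = "paused"


# Allowed transitions, keyed by action: {current state: next state}.
# "resume" is an assumption; the skill text lists start, pause, stop, restart.
TRANSITIONS = {
    "start": {State.STOPPED: State.RUNNING},
    "pause": {State.RUNNING: State.PAUSED},
    "resume": {State.PAUSED: State.RUNNING},
    "stop": {State.RUNNING: State.STOPPED, State.PAUSED: State.STOPPED},
    "restart": {State.RUNNING: State.RUNNING, State.PAUSED: State.RUNNING},
}


class AgentLifecycle:
    """Hypothetical lifecycle tracker for a single agent process."""

    def __init__(self):
        self.state = State.STOPPED
        self.kill_switch = False

    def apply(self, action):
        """Apply an action, rejecting unknown or invalid transitions."""
        if action not in TRANSITIONS:
            raise ValueError(f"unknown action: {action}")
        if self.kill_switch and action != "stop":
            raise RuntimeError("kill switch engaged; only 'stop' is allowed")
        allowed = TRANSITIONS[action]
        if self.state not in allowed:
            raise RuntimeError(f"cannot {action} from {self.state.value}")
        self.state = allowed[self.state]
        return self.state

    def kill(self):
        """Engage the kill switch and force the agent into STOPPED."""
        self.kill_switch = True
        self.state = State.STOPPED
```

Rejecting invalid transitions explicitly (for example, pausing an agent that is already paused or stopped) keeps edge cases visible to the operator instead of silently ignoring them.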

Baseline Controls

immutable deployment artifacts
least-privilege credentials
environment-level secret injection
hard timeout and retry budgets
audit log for high-risk actions
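
As an illustration of the "hard timeout and retry budgets" control, here is a minimal sketch assuming a synchronous task callable. The function name `run_with_budget`, the `BudgetExceeded` exception, and all default values are hypothetical. Note the limitation: the deadline is only checked between attempts, so this does not preempt a task that hangs; a truly hard timeout would need process-level enforcement, which is outside this sketch.

```python
import time


class BudgetExceeded(Exception):
    """Raised when the retry count or the overall deadline is exhausted."""


def run_with_budget(task, max_retries=3, deadline_s=30.0, backoff_s=0.5,
                    clock=time.monotonic, sleep=time.sleep):
    """Run task() with at most max_retries retries inside deadline_s seconds.

    Returns (result, attempts). The clock and sleep parameters are
    injectable so the behavior can be tested without real waiting.
    """
    start = clock()
    attempts = 0
    last_error = None
    while attempts <= max_retries:
        # Stop starting new attempts once the overall deadline has passed.
        if clock() - start >= deadline_s:
            break
        attempts += 1
        try:
            return task(), attempts
        except Exception as exc:
            last_error = exc
            # Exponential backoff, skipped after the final allowed attempt.
            if attempts <= max_retries:
                sleep(backoff_s * (2 ** (attempts - 1)))
    raise BudgetExceeded(f"gave up after {attempts} attempt(s)") from last_error
```

Returning the attempt count alongside the result makes it straightforward to feed a retries-per-task metric.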

Metrics to Track

success rate
mean retries per task
time to recovery
cost per successful task
failure class distribution
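
The skill does not define these metrics precisely, so the following is a sketch under stated assumptions: each finished task produces a hypothetical `TaskRecord`, and "cost per successful task" is taken to mean total spend (including failed tasks) divided by the number of successes. Time to recovery is omitted because it needs incident start and end timestamps rather than per-task records.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRecord:
    """Hypothetical per-task record; field names are illustrative."""
    succeeded: bool
    retries: int
    cost_usd: float
    failure_class: Optional[str] = None


def summarize(records):
    """Compute four of the tracked metrics from a list of TaskRecord."""
    total = len(records)
    if total == 0:
        return None
    successes = sum(1 for r in records if r.succeeded)
    total_cost = sum(r.cost_usd for r in records)
    failures = Counter(r.failure_class for r in records if not r.succeeded)
    return {
        "success_rate": successes / total,
        "mean_retries_per_task": sum(r.retries for r in records) / total,
        # Assumption: failed-task spend is charged against successful tasks.
        "cost_per_successful_task": total_cost / successes if successes else None,
        "failure_class_distribution": dict(failures),
    }
```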

Incident Pattern

When failure spikes:

1. freeze new rollout
2. capture representative traces
3. isolate failing route
4. patch with smallest safe change
5. run regression + security checks
6. resume gradually
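
The first and last steps of this pattern (freeze on a failure spike, then resume gradually) can be sketched as a small gate. Everything here is an assumption for illustration: the `RolloutGate` class, the sliding-window size, the failure-rate threshold, and the traffic stages are hypothetical and would need tuning per system. Trace capture, route isolation, and patching are left to the operator.

```python
from collections import deque


class RolloutGate:
    """Hypothetical gate: freezes rollout on a failure spike, resumes in stages."""

    def __init__(self, window=50, max_failure_rate=0.2, stages=(5, 25, 50, 100)):
        self.outcomes = deque(maxlen=window)  # most recent task outcomes
        self.max_failure_rate = max_failure_rate
        self.stages = stages                  # percent of traffic per stage
        self.stage_index = 0
        self.frozen = False

    def failure_rate(self):
        if not self.outcomes:
            return 0.0
        return self.outcomes.count(False) / len(self.outcomes)

    def record(self, succeeded):
        """Record one outcome; freeze once a full window exceeds the threshold."""
        self.outcomes.append(bool(succeeded))
        window_full = len(self.outcomes) == self.outcomes.maxlen
        if window_full and self.failure_rate() > self.max_failure_rate:
            self.frozen = True

    def traffic_percent(self):
        return self.stages[self.stage_index]

    def advance(self):
        """Move to the next stage unless frozen; returns current traffic percent."""
        if not self.frozen and self.stage_index < len(self.stages) - 1:
            self.stage_index += 1
        return self.traffic_percent()

    def resume(self):
        """Call after the patch passes checks: unfreeze and restart at stage 0."""
        self.frozen = False
        self.stage_index = 0
        self.outcomes.clear()
```

Waiting for a full window before freezing avoids tripping the gate on the first one or two failures after a deploy; a stricter policy could freeze earlier at the cost of more false alarms.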

Deployment Integrations

This skill pairs with:

PM2 workflows
systemd services
container orchestrators
CI/CD gates

Overall Score

48/100

Grade

Adequate

Summary

This skill provides operational guidance for managing long-lived agent workloads in enterprise environments, covering runtime lifecycle (start/pause/stop), observability (logs/metrics/traces), safety controls (scopes/permissions/kill switches), and change management (rollout/rollback/audit). It documents baseline controls like immutable artifacts, least-privilege credentials, and audit logging, along with incident response patterns and integration points with deployment tools.

Detected Capabilities

observability and monitoring
lifecycle management
audit logging
change management
incident response
integration with deployment tools

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

manage agent workloads
enterprise agent operations
long-lived agent systems
agent lifecycle management
production agent monitoring
agent rollout procedures
incident response automation

Use Cases

Monitor and manage continuously running agent systems in production
Implement safety controls and audit trails for multi-agent deployments
Execute controlled rollout and rollback procedures for agent updates
Respond to agent failures with structured incident patterns and isolation procedures
Integrate agent operations with PM2, systemd, container orchestrators, and CI/CD pipelines

Quality Notes

Skill lacks concrete implementation examples or code patterns: it describes concepts but provides no actionable instructions for an agent
No specific metrics definitions, thresholds, or monitoring setup guidance provided
Incident response pattern is high-level and generic; lacks specific commands or procedures an agent could follow
No details on credential management beyond 'least-privilege' and 'environment-level injection'
Missing guardrails or scope boundaries around which systems/agents can be operated on
No error handling guidance for failed rollouts, lifecycle operations, or recovery
Deployment integration section lists tools but provides no instructions for using them
No edge cases addressed (e.g., what happens if an agent is already stopped before pause, how to handle orphaned processes)
Very limited scope: the skill is mostly a conceptual framework rather than actionable operational guidance

Model: claude-haiku-4-5-20251001 · Analyzed: Jul 14, 2026

Version History

v1.2

Content updated

2026-07-14

Latest

v1.1

Content updated

2026-04-20

v1.0

Seeded from github.com/affaan-m/everything-claude-code

2026-03-16
