softaworks/web-to-markdown

Use ONLY when the user explicitly says: 'use the skill web-to-markdown ...' (or 'use a skill web-to-markdown ...'). Converts webpage URLs to clean Markdown by calling the local web2md CLI (Puppeteer + Readability), suitable for JS-rendered pages.

global
version: 0.1.0
0 installs · 0 uses · ~859
v1.1 · Saved Apr 20, 2026

web-to-markdown

Convert web pages to clean Markdown by driving a locally installed browser (via web2md).

Hard trigger gate (must enforce)

This skill MUST NOT be used unless the user explicitly wrote one of these phrases:

  • use the skill web-to-markdown ...
  • use a skill web-to-markdown ...

If the user did not explicitly request this skill by name, stop and ask them to re-issue the request including: use the skill web-to-markdown.
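
The gate above can be sketched as a small shell check. This is only an illustration: `is_skill_invoked` is a hypothetical helper name, and the case-sensitive substring match is an assumption about how strictly the phrase should be compared.

```shell
#!/bin/sh
# Hypothetical helper (not part of web2md): succeeds only when the user's
# message contains one of the two trigger phrases. Matching is case-sensitive,
# which is an assumption about how strict the gate should be.
is_skill_invoked() {
  case "$1" in
    *"use the skill web-to-markdown"*|*"use a skill web-to-markdown"*) return 0 ;;
    *) return 1 ;;
  esac
}

message='use the skill web-to-markdown https://example.com'
if is_skill_invoked "$message"; then
  echo "gate passed"
else
  echo "Please re-issue the request including: use the skill web-to-markdown"
fi
```

A request such as "convert this page to markdown" fails the check, so the skill stops and asks the user to re-issue the request.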

What this skill does

  • Handles JS-rendered pages (Puppeteer → user Chrome).
  • Works best with Chromium-family browsers (Chrome/Chromium/Brave/Edge) via puppeteer-core.
  • Extracts main content (Readability).
  • Converts to Markdown (Turndown) with cleaned links and optional YAML frontmatter.

Non-goals

  • Do not use Playwright or other browser automation stacks; the mechanism is web2md.

Inputs you should collect (ask only if missing)

  • url (or a list of URLs)
  • Output preference:
    • Print to stdout (--print), OR
    • Save to a file (--out ./file.md), OR
    • Save to a directory (--out ./some-dir/ to auto-name by page title)
  • Optional rendering controls for tricky pages:
    • --chrome-path <path> (if Chrome auto-detection fails)
    • --interactive (show Chrome and pause so the user can complete human checks/login, then press Enter)
    • --wait-until load|domcontentloaded|networkidle0|networkidle2
    • --wait-for '<css selector>'
    • --wait-ms <milliseconds>
    • --headful (debug)
    • --no-sandbox (sometimes required in containers/CI)
    • --user-data-dir <dir> (login/session; use a dedicated profile directory)

Workflow

  1. Confirm the user explicitly invoked the skill (use the skill web-to-markdown).
  2. Validate URL(s) start with http:// or https://.
  3. Ensure web2md is installed:
    • Run: command -v web2md
    • If missing, instruct the user to install it (assume the project exists at ~/workspace/softaworks/projects/web2md):
      • cd ~/workspace/softaworks/projects/web2md && npm install && npm run build && npm link
      • Or: cd ~/workspace/softaworks/projects/web2md && npm install && npm run build && npm install -g .
  4. Convert:
    • Single URL → file:
      • web2md '<url>' --out ./page.md
    • Single URL → auto-named file in directory:
      • mkdir -p ./out && web2md '<url>' --out ./out/
    • Human verification / login walls (interactive):
      • mkdir -p ./out && web2md '<url>' --interactive --user-data-dir ./tmp/web2md-profile --out ./out/
      • Then: complete the check in the browser window and press Enter in the terminal to continue.
    • Print to stdout:
      • web2md '<url>' --print
    • Multiple URLs (batch):
      • Create output dir (e.g. ./out/) then run one web2md command per URL using --out ./out/
  5. Validate output:
    • If writing files, verify they exist and are non-empty (e.g. ls -la <path> and wc -c <path>).
  6. Return:
    • The saved file path(s), or the Markdown (stdout mode).
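
As a rough sketch, steps 2 to 5 could be combined for a batch run as follows. The function names (`is_valid_url`, `is_nonempty_file`, `convert_all`) are hypothetical, and the assumption that auto-named output files end in `.md` is not confirmed by the text above. Only web2md flags already listed in this document are used.

```shell
#!/bin/sh
# Step 2: accept only URLs that start with http:// or https://.
is_valid_url() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Step 5: succeed only if the file exists and is non-empty.
is_nonempty_file() {
  [ -s "$1" ]
}

# Steps 3-5 for a batch (hypothetical wrapper): one web2md call per URL.
# Usage: convert_all <out-dir> <url>...
convert_all() {
  out_dir=$1
  shift
  # Step 3: make sure the CLI is installed before doing anything else.
  if ! command -v web2md >/dev/null 2>&1; then
    echo "web2md not found; install it first" >&2
    return 127
  fi
  mkdir -p "$out_dir" || return 1
  for url in "$@"; do
    if ! is_valid_url "$url"; then
      echo "skipping invalid URL: $url" >&2
      continue
    fi
    # Step 4: the trailing slash asks web2md to auto-name the file in the directory.
    web2md "$url" --out "$out_dir/" || echo "conversion failed: $url" >&2
  done
  # Step 5: flag empty outputs (assumes auto-named files end in .md).
  for f in "$out_dir"/*.md; do
    [ -e "$f" ] || continue
    is_nonempty_file "$f" || echo "empty output: $f" >&2
  done
}
```

With web2md installed, a call might look like `convert_all ./out 'https://example.com' 'https://app.example.com'`.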

Recommended wait settings:

  • For most pages: --wait-until networkidle2
  • For heavy apps: start with --wait-until domcontentloaded --wait-ms 2000, then add --wait-for 'main' (or another stable selector) if needed.
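
One way to picture the escalation in these wait settings is a helper that maps a page profile to flags. The profile names (`default`, `heavy`, `heavy-selector`) and the function `wait_flags` are hypothetical labels for illustration. The sketch prints the commands instead of running them, so it works without web2md installed.

```shell
#!/bin/sh
# Hypothetical helper: print the wait flags suggested for a page profile.
# The second argument is an optional CSS selector (defaults to main).
wait_flags() {
  case "$1" in
    heavy)          echo "--wait-until domcontentloaded --wait-ms 2000" ;;
    heavy-selector) echo "--wait-until domcontentloaded --wait-ms 2000 --wait-for ${2:-main}" ;;
    *)              echo "--wait-until networkidle2" ;;
  esac
}

url='https://app.example.com'
for profile in default heavy heavy-selector; do
  # $(wait_flags ...) is left unquoted on purpose so the flags split into words.
  echo web2md "$url" $(wait_flags "$profile") --out ./out/
done
```

Because the flags are split on whitespace, this sketch only suits simple selectors without spaces; a selector such as `div.content main` would need to be passed as its own quoted argument.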

Files
2 files · 7.6 KB

Overall Score: 84/100
Grade: B (Good)

  • Safety: 87
  • Quality: 87
  • Clarity: 85
  • Completeness: 76

Summary

This skill converts web pages to clean Markdown by invoking a local `web2md` CLI tool that uses Puppeteer for browser automation and Readability for content extraction. It handles JavaScript-rendered pages, supports interactive mode for login walls, and offers flexible output options (stdout, single file, or batch directory).

Detected Capabilities

  • URL validation and web content fetching
  • Browser automation via Puppeteer (local execution)
  • JavaScript rendering and DOM content extraction
  • Markdown file generation and writing
  • Directory creation and file management
  • Command-line tool invocation
  • Interactive user mode support
  • Batch processing of multiple URLs
  • Session/profile persistence for authentication

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

  • convert webpage to markdown
  • extract web content
  • web scraping markdown
  • handle javascript-rendered pages
  • batch convert urls
  • archive web articles

Risk Signals

  • INFO: Invokes external CLI tool (web2md) that launches real browser processes (SKILL.md: Workflow section, step 4)
  • INFO: Creates temporary directories for browser profiles (--user-data-dir ./tmp/web2md-profile) (SKILL.md: Advanced usage examples, Interactive Mode section)
  • INFO: Writes Markdown files to project directories without explicit scope restriction (SKILL.md: Workflow section, step 5; README examples)
  • INFO: Hard trigger gate enforcement required: user must explicitly invoke with 'use the skill web-to-markdown' (SKILL.md: Hard trigger gate section and README prerequisites)
  • INFO: Assumes web2md installation at ~/workspace/softaworks/projects/web2md with fallback to global install (SKILL.md: Workflow step 3)
  • INFO: Referenced domains (app.example.com, example.com) are placeholder examples, not actual external services (Static analysis domain extraction)

Referenced Domains

External domains referenced in skill content, detected by static analysis.

  • app.example.com
  • example.com

Use Cases

  • Extract article content from news sites and blogs
  • Convert JavaScript-heavy pages to portable Markdown
  • Archive web documentation in readable format
  • Handle pages with login or verification requirements
  • Batch convert multiple URLs to Markdown files
  • Process dynamic single-page applications

Quality Notes

  • Excellent documentation with clear use cases, prerequisites, and troubleshooting section
  • Strong scope definition via hard trigger gate mechanism to prevent accidental invocation
  • Comprehensive workflow with numbered steps and validation checkpoints
  • Well-structured examples covering basic, advanced, batch, and interactive scenarios
  • README provides context on when to use vs. simpler alternatives (WebFetch)
  • Advanced options documented with clear descriptions of wait strategies and browser control
  • Good error handling guidance for common failure modes (Chrome not found, incomplete content, login walls)
  • Output validation step included (file existence and non-empty checks)
  • Minor: README references 'Claude Code' which may not be accurate terminology in all contexts
  • Installation instructions are specific and include multiple approaches (npm link vs. global install)

Model: claude-haiku-4-5-20251001
Analyzed: Apr 20, 2026

Version History

  • v1.1 (Latest), 2026-04-20: Content updated
  • v1.0, 2026-04-12: No changelog
