HackerNews Extract
Extract a HackerNews post (article + comments) into single clean Markdown for quick reading or LLM input.
see Examples
What it does
- Accepts a HackerNews id or URL.
- Downloads the linked article HTML, then cleans and formats it.
- Fetches the HackerNews post metadata and comments.
- Outputs a readable combined Markdown file with the original article, threaded comments, and key metadata.
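As an illustration of the first step, accepting either an id or a URL can be sketched as below. This is a minimal sketch, not the script's actual code: the function name `parse_hn_id` is hypothetical, and it assumes item URLs carry the id in the `id` query parameter, as in the example URL shown under Usage.

```python
from urllib.parse import urlparse, parse_qs


def parse_hn_id(value: str) -> int:
    """Return the numeric item id from a bare id or an item URL.

    Hypothetical helper for illustration; the real script may differ.
    """
    value = value.strip()
    # Case 1: a bare numeric id such as "46861313".
    if value.isascii() and value.isdigit():
        return int(value)
    # Case 2: a URL such as https://news.ycombinator.com/item?id=46861313.
    parsed = urlparse(value)
    if parsed.hostname == "news.ycombinator.com":
        ids = parse_qs(parsed.query).get("id", [])
        if ids and ids[0].isascii() and ids[0].isdigit():
            return int(ids[0])
    raise ValueError(f"not a HackerNews id or item URL: {value!r}")


if __name__ == "__main__":
    print(parse_hn_id("https://news.ycombinator.com/item?id=46861313"))
```

Anything that is neither a numeric id nor a recognizable item URL raises `ValueError`, so a caller can fall back to treating the input as a local JSON path.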
Requirements
uv installed and in PATH.
Install
No install beyond having uv.
Dependencies are installed automatically by uv into a dedicated venv when you run this script.
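uv usually learns a script's dependencies from an inline metadata block (PEP 723) at the top of the file. The sketch below shows the shape of such a header; it assumes hn-extract.py follows this convention, and the empty dependency list and the `describe_runtime` function are placeholders rather than the script's real contents.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
# The block above is inline script metadata. A real script would list its
# third-party packages inside "dependencies"; it is left empty here so the
# sketch runs with the standard library only.
import sys


def describe_runtime() -> str:
    """Report the interpreter version the script was started with."""
    return f"running under Python {sys.version_info.major}.{sys.version_info.minor}"


if __name__ == "__main__":
    print(describe_runtime())
```

Running a file like this with `uv run --script` lets uv resolve the declared requirements into an isolated environment before executing it, which is why no separate install step is needed.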
Usage Workflow (Mandatory for Agents)
When an agent is asked to extract a HackerNews post:
- Run the script with an output path:
  uv run --script ${baseDir}/hn-extract.py <input> -o /tmp/hn-<id>.md
- Send ONE combined message: Upload the file and ask the question in the same tool call. Use the message tool (action=send, filePath="/tmp/hn-<id>.md", message="Extraction complete. Do you want me to summarize it?").
- Do not output the full text or a summary directly in the chat unless specifically requested.
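For agents that drive the script from Python, the run step of this workflow might look like the sketch below. The helper names `build_command` and `run_extraction` are hypothetical, and `base_dir` stands in for whatever `${baseDir}` resolves to; sending the file through the message tool is left to the agent framework and is not shown.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(base_dir: str, item_id: int) -> List[str]:
    """Build the uv command line with the /tmp/hn-<id>.md output path."""
    script = str(Path(base_dir) / "hn-extract.py")
    output = f"/tmp/hn-{item_id}.md"
    return ["uv", "run", "--script", script, str(item_id), "-o", output]


def run_extraction(base_dir: str, item_id: int) -> str:
    """Run the extraction and return the path of the generated file."""
    cmd = build_command(base_dir, item_id)
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run(cmd, check=True)
    return cmd[-1]
```

The returned path is the value to pass as `filePath` in the single combined message.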
Usage
# run as uv script
uv run --script ${baseDir}/hn-extract.py <hn-id|hn-url|path/to/item.json> [-o path/to/output.md]
# Examples
uv run --script ${baseDir}/hn-extract.py 46861313 -o /tmp/output.md
uv run --script ${baseDir}/hn-extract.py "https://news.ycombinator.com/item?id=46861313"
- Omit -o to print to stdout.
- Directories for -o are created automatically.
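The -o behaviour described above can be sketched as follows. This is an illustrative helper, not the script's own code: the name `write_output` is an assumption, and it simply writes to stdout when no path is given and creates missing parent directories otherwise.

```python
import sys
from pathlib import Path
from typing import Optional


def write_output(markdown: str, output: Optional[str] = None) -> None:
    """Write Markdown to stdout, or to a file whose directories are created."""
    if output is None:
        # No -o given: print the document to stdout.
        sys.stdout.write(markdown)
        return
    path = Path(output)
    # Create any missing parent directories; no error if they already exist.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown, encoding="utf-8")
```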
Notes
- Retries are enabled for HTTP fetches.
- Comments are indented by thread depth.
- Sites that require authentication or block scraping may still fail.
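To make the depth-based indentation concrete, here is a minimal sketch of rendering a comment tree as an indented Markdown list. The input shape (dicts with `author`, `text`, and nested `children`) and the function name `render_comments` are assumptions for illustration; the script's actual data model and output format may differ.

```python
from typing import Any, Dict, List


def render_comments(
    comments: List[Dict[str, Any]], depth: int = 0, indent: str = "  "
) -> List[str]:
    """Return one Markdown list line per comment, indented by thread depth."""
    lines: List[str] = []
    for comment in comments:
        prefix = indent * depth
        lines.append(f"{prefix}- {comment['author']}: {comment['text']}")
        # Replies are rendered one level deeper than their parent.
        lines.extend(render_comments(comment.get("children", []), depth + 1, indent))
    return lines


if __name__ == "__main__":
    thread = [
        {
            "author": "alice",
            "text": "Top-level comment",
            "children": [{"author": "bob", "text": "A reply", "children": []}],
        }
    ]
    print("\n".join(render_comments(thread)))
```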