mediagrab

Generic web media downloader: finds video and audio on any standards-compliant page and saves it to an open-source container

v1.0.0
Linux

Quick Start

Install via jcli (recommended)

jcli install mediagrab

Or via the installer

curl -fsSL https://cli.johlem.net/tools/mediagrab/install.sh | bash

One-shot smoke test

mediagrab probe https://archive.org/download/BigBuckBunny_124/Content/big_buck_bunny_720p_surround.mp4
# → detects direct .mp4, shows file size, no download

mediagrab video -o bbb.mkv https://archive.org/.../big_buck_bunny_720p_surround.mp4
# → downloads + transcodes mp4 → mkv via ffmpeg + progress bar on stderr

mediagrab audio -f mp3 -b 192k -o bbb.mp3 https://archive.org/.../big_buck_bunny_720p_surround.mp4
# → downloads + extracts audio + transcodes to MP3 at 192kbps

What it does

mediagrab detects video and audio media on any standards-compliant web page or major video host, picks the highest-quality stream by default, and writes the result to an open-source container of your choice. Single Rust binary. No async runtime. No Python. No yt-dlp dependency.

The architecture is an extractor trait with a registry of site-specific and generic-fallback implementations. Adding a new site is one new file; the dispatch loop walks the registry, first match wins.
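The registry-and-dispatch idea can be sketched in a few lines of std-only Rust. This is a minimal illustration, not mediagrab's actual code: the `Extractor` trait shape, the `Candidate` struct, and the `DirectFile` and `GenericPage` types are all hypothetical names chosen for the example.

```rust
/// One downloadable stream found by an extractor (illustrative fields only).
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    url: String,
}

/// Hypothetical extractor interface; the real trait may look different.
trait Extractor {
    fn name(&self) -> &'static str;
    /// Cheap test: does this extractor claim the URL?
    fn matches(&self, url: &str) -> bool;
    /// Produce candidates, or an error message.
    fn extract(&self, url: &str) -> Result<Vec<Candidate>, String>;
}

/// Claims URLs whose path ends in a known media extension.
struct DirectFile;

impl Extractor for DirectFile {
    fn name(&self) -> &'static str {
        "direct"
    }
    fn matches(&self, url: &str) -> bool {
        let path = url.split(|c: char| c == '?' || c == '#').next().unwrap_or(url);
        [".mp4", ".webm", ".mkv"].iter().any(|ext| path.ends_with(*ext))
    }
    fn extract(&self, url: &str) -> Result<Vec<Candidate>, String> {
        Ok(vec![Candidate { url: url.to_string() }])
    }
}

/// Generic fallback: claims everything, so it must be registered last.
struct GenericPage;

impl Extractor for GenericPage {
    fn name(&self) -> &'static str {
        "generic"
    }
    fn matches(&self, _url: &str) -> bool {
        true
    }
    fn extract(&self, _url: &str) -> Result<Vec<Candidate>, String> {
        Err("page scanning is not implemented in this sketch".to_string())
    }
}

/// Registration order is priority order.
fn registry() -> Vec<Box<dyn Extractor>> {
    vec![Box::new(DirectFile), Box::new(GenericPage)]
}

/// Walk the registry; the first extractor that claims the URL handles it.
fn dispatch(
    registry: &[Box<dyn Extractor>],
    url: &str,
) -> Result<(&'static str, Vec<Candidate>), String> {
    for ex in registry {
        if ex.matches(url) {
            return ex.extract(url).map(|found| (ex.name(), found));
        }
    }
    Err(format!("no extractor matched {url}"))
}

fn main() {
    let reg = registry();
    println!("{:?}", dispatch(&reg, "https://example.com/clip.mp4"));
    println!("{:?}", dispatch(&reg, "https://example.com/article"));
}
```

With this shape, adding a site means one new type implementing the trait plus one extra line in `registry()`.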

Coverage

Source: Direct file URLs (.mp4, .webm, .mkv, .m3u8, .mpd, .mp3, .opus, …)
How it's detected: URL extension + HEAD Content-Type gate
Notes: Pages that look like media URLs but serve HTML are correctly rejected and fall through to the next extractor.

Source: YouTube (youtube.com, youtu.be, /shorts/, /embed/)
How it's detected: ytInitialPlayerResponse.streamingData.formats + adaptiveFormats
Notes: Streams that require signatureCipher / nsig decoding are skipped with a clear error. Many older or shorts videos work; some newer ones don't.

Source: Vimeo
How it's detected: player.vimeo.com/video/<id>/config JSON
Notes: Progressive + HLS variants.

Source: Any standards-compliant page
How it's detected: <video src>, <source src>, og:video, twitter:player:stream, <link rel="alternate" type="application/x-mpegURL|dash+xml">, JSON-LD VideoObject.contentUrl, inline .m3u8/.mpd URLs
Notes: The long tail. Wikipedia, news sites, blogs, CMS embeds all generally work.

Out of scope: DRM (HLS AES-128, Widevine), YouTube signatureCipher streams. These produce a clear error message — no silent failure.
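The direct-file rule (URL extension plus a Content-Type gate) can be illustrated with a small pure function. This is a sketch under assumptions: the accepted MIME types, the treatment of a missing Content-Type as a rejection, and the function names are all invented for the example, and the Content-Type value is passed in rather than fetched with a real HEAD request.

```rust
/// Extensions treated as media in this sketch (a subset of the documented list).
const MEDIA_EXTS: &[&str] = &["mp4", "webm", "mkv", "m3u8", "mpd", "mp3", "opus"];

/// Lowercased extension of the last path segment, ignoring query and fragment.
fn url_extension(url: &str) -> Option<String> {
    let path = url.split(|c: char| c == '?' || c == '#').next()?;
    let last_segment = path.rsplit('/').next()?;
    let (_, ext) = last_segment.rsplit_once('.')?;
    Some(ext.to_ascii_lowercase())
}

/// Assumed gate: accept video/*, audio/*, and the usual HLS/DASH manifest types.
fn content_type_is_media(content_type: &str) -> bool {
    let mime = content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    mime.starts_with("video/")
        || mime.starts_with("audio/")
        || matches!(
            mime.as_str(),
            "application/x-mpegurl" | "application/vnd.apple.mpegurl" | "application/dash+xml"
        )
}

/// Both checks must pass; a missing Content-Type is treated as a rejection here.
fn is_direct_media(url: &str, head_content_type: Option<&str>) -> bool {
    let ext_ok = url_extension(url).map_or(false, |e| MEDIA_EXTS.contains(&e.as_str()));
    ext_ok && head_content_type.map_or(false, content_type_is_media)
}

fn main() {
    // A page that looks like a media URL but serves HTML is rejected.
    println!("{}", is_direct_media("https://example.com/clip.mp4", Some("text/html")));
    println!("{}", is_direct_media("https://example.com/clip.mp4", Some("video/mp4")));
}
```

A rejected URL would then fall through to the next extractor in the registry rather than abort the run.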

Subcommands

Command: mediagrab probe <URL>
What it does: Detect media, print a table of candidates ranked best-first, exit. No download.

Command: mediagrab video <URL> [-f mkv|mp4|webm|ogg] [-q best|720|1080|...]
What it does: Download highest-quality video by default. Container default: mkv.

Command: mediagrab audio <URL> [-f mp3|opus|ogg|flac|wav|m4a] [-b 192k]
What it does: Extract audio. Container default: mp3 at 192kbps.
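To make "ranked best-first" and the -q option concrete, here is one plausible selection rule. It is a guess at the semantics, not documented behaviour: it assumes candidates are ordered by height and then bitrate, and that a numeric -q value means "the best stream no taller than this". The `Stream` and `Quality` types are hypothetical.

```rust
/// A candidate stream (illustrative fields only).
#[derive(Debug, Clone, PartialEq)]
struct Stream {
    url: String,
    height: u32,
    bitrate_kbps: u32,
}

/// Parsed form of a -q value: "best" or a maximum height such as 720.
#[derive(Debug, PartialEq)]
enum Quality {
    Best,
    MaxHeight(u32),
}

fn parse_quality(s: &str) -> Option<Quality> {
    if s == "best" {
        Some(Quality::Best)
    } else {
        s.parse().ok().map(Quality::MaxHeight)
    }
}

/// Sort best-first: taller wins, bitrate breaks ties.
fn rank(mut streams: Vec<Stream>) -> Vec<Stream> {
    streams.sort_by(|a, b| (b.height, b.bitrate_kbps).cmp(&(a.height, a.bitrate_kbps)));
    streams
}

/// Pick the top-ranked stream that satisfies the requested quality.
fn pick(streams: &[Stream], quality: &Quality) -> Option<Stream> {
    let mut ranked = rank(streams.to_vec()).into_iter();
    match quality {
        Quality::Best => ranked.next(),
        Quality::MaxHeight(limit) => ranked.find(|s| s.height <= *limit),
    }
}

fn main() {
    let streams = vec![
        Stream { url: "a".into(), height: 360, bitrate_kbps: 700 },
        Stream { url: "b".into(), height: 1080, bitrate_kbps: 4500 },
        Stream { url: "c".into(), height: 720, bitrate_kbps: 2500 },
    ];
    for s in rank(streams.clone()) {
        println!("{:>5}p {:>5} kbps  {}", s.height, s.bitrate_kbps, s.url);
    }
    println!("{:?}", pick(&streams, &Quality::MaxHeight(720)));
}
```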

Output templates

-o takes a literal path or a template. Tokens:

Token    Expands to
{title}  Slug of the page or video title
{id}     Source ID (video id, hash)
{site}   Host (e.g. youtube.com)
{ext}    Chosen container extension

Default template: {title}.{ext}.
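Token expansion can be pictured as plain string substitution. The sketch below is only an illustration: the `Meta` struct, the `expand` function, and especially the slug rules (lowercase ASCII alphanumerics joined by single hyphens) are assumptions, since the exact slug format is not specified here.

```rust
/// Values available to a template (hypothetical struct for this sketch).
struct Meta<'a> {
    title: &'a str,
    id: &'a str,
    site: &'a str,
    ext: &'a str,
}

/// Assumed slug rule: lowercase ASCII alphanumerics, runs of anything else become one '-'.
fn slugify(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        if c.is_ascii_alphanumeric() {
            out.push(c.to_ascii_lowercase());
        } else if !out.is_empty() && !out.ends_with('-') {
            out.push('-');
        }
    }
    out.trim_end_matches('-').to_string()
}

/// Replace each known token; text outside tokens is kept literally.
fn expand(template: &str, meta: &Meta) -> String {
    template
        .replace("{title}", &slugify(meta.title))
        .replace("{id}", meta.id)
        .replace("{site}", meta.site)
        .replace("{ext}", meta.ext)
}

fn main() {
    let meta = Meta {
        title: "Big Buck Bunny (720p)!",
        id: "BigBuckBunny_124",
        site: "archive.org",
        ext: "mkv",
    };
    println!("{}", expand("{title}.{ext}", &meta));
    println!("{}", expand("{site}/{id}-{title}.{ext}", &meta));
}
```

A literal path such as bbb.mkv contains no tokens, so it passes through `expand` unchanged.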

Progress bars

Every download writes a colored indicatif progress bar to stderr, so stdout stays clean for the [OK] line and for piping JSON. The bar comes in three styles.

The bar is hidden automatically when stderr isn't a TTY (CI logs, pipes).
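The stderr/stdout split and the TTY check can be shown without indicatif, using only the standard library's `IsTerminal` trait (Rust 1.70+). This is a simplified stand-in, not how mediagrab draws its bars: `render_bar` and `draw_progress` are hypothetical helpers and the bar format is invented.

```rust
use std::io::{self, IsTerminal, Write};

/// Render a fixed-width text bar such as "[#####-----] 50/100".
fn render_bar(done: u64, total: u64, width: usize) -> String {
    let filled = if total == 0 {
        0
    } else {
        (done.min(total) as u128 * width as u128 / total as u128) as usize
    };
    format!(
        "[{}{}] {}/{}",
        "#".repeat(filled),
        "-".repeat(width - filled),
        done,
        total
    )
}

/// Write the bar to `err` only when it is attached to a terminal.
fn draw_progress(err: &mut impl Write, is_tty: bool, done: u64, total: u64) -> io::Result<()> {
    if is_tty {
        write!(err, "\r{}", render_bar(done, total, 20))?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Decide once, based on whether stderr is a terminal.
    let is_tty = io::stderr().is_terminal();
    let mut err = io::stderr();
    for done in [0u64, 50, 100] {
        draw_progress(&mut err, is_tty, done, 100)?;
    }
    if is_tty {
        eprintln!();
    }
    // The result line goes to stdout, which stays free of progress output.
    println!("[OK] example.mkv");
    Ok(())
}
```

Running the sketch with `2>/dev/null` or inside a pipe leaves only the [OK] line visible.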

Touches / Produces / Gates

Exit codes

Code  Meaning
0     Ok
1     Runtime failure: network, parse, no candidates, ffmpeg failure, DRM rejection
2     Usage error (clap)
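For scripts that wrap mediagrab, the exit-code table reduces to a three-way mapping. The sketch below shows one way such a mapping could be written with `std::process::ExitCode`; the `Failure` enum and `exit_code_for` function are hypothetical, and in a clap-based binary the usage case is normally produced by clap itself.

```rust
use std::process::ExitCode;

/// Two failure classes matching the documented codes (hypothetical type).
#[derive(Debug)]
enum Failure {
    /// Network, parse, no candidates, ffmpeg failure, DRM rejection.
    Runtime(String),
    /// Bad flags or missing arguments.
    Usage(String),
}

/// Map an outcome to the documented exit code.
fn exit_code_for(result: &Result<(), Failure>) -> u8 {
    match result {
        Ok(()) => 0,
        Err(Failure::Runtime(_)) => 1,
        Err(Failure::Usage(_)) => 2,
    }
}

fn main() -> ExitCode {
    let outcomes: [Result<(), Failure>; 3] = [
        Ok(()),
        Err(Failure::Runtime("no candidates found".to_string())),
        Err(Failure::Usage("missing <URL>".to_string())),
    ];
    for outcome in &outcomes {
        println!("{:?} -> exit {}", outcome, exit_code_for(outcome));
    }
    // The demo itself finishes with the code of the first (successful) outcome.
    ExitCode::from(exit_code_for(&outcomes[0]))
}
```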

Why not yt-dlp?

yt-dlp is excellent and covers far more sites, with 1700+ dedicated extractors. mediagrab targets a different point in the trade-off space: a single ~4 MB Rust binary that does ~80% of what you actually need from a generic media downloader, without a Python runtime or the ongoing extractor-rotation maintenance treadmill. When you hit a site mediagrab can't handle, fall back to yt-dlp; the two coexist fine.

Build from source

cd tools/mediagrab/rust
cargo build --release
cargo test                                  # 62 unit tests

Toolchain pin: Rust 1.85.0. Dependencies are all in the suite-permitted allowlist (clap, reqwest [blocking + rustls-tls], serde, serde_json, anyhow, regex, url, indicatif).

Runtime prereq for transcoding and HLS muxing: ffmpeg on PATH. Without ffmpeg you can still download direct files in their source container.
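A wrapper or installer can check the ffmpeg prerequisite up front by scanning PATH. The sketch below is one way to do that with the standard library only; `find_in_path` is a hypothetical helper, it only checks that a regular file with that name exists (not that it is executable), and it is not necessarily how mediagrab itself locates ffmpeg.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Return the first `dir/program` on the given PATH-style value that is a regular file.
fn find_in_path(program: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(program))
        .find(|candidate| candidate.is_file())
}

fn main() {
    let path_var = env::var_os("PATH").unwrap_or_default();
    match find_in_path("ffmpeg", &path_var) {
        Some(path) => println!("ffmpeg found at {}", path.display()),
        None => println!("ffmpeg not found: transcoding and HLS muxing are unavailable"),
    }
}
```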