- Auto-download dumps by language code (350+ languages)
- Extract specific articles by title without downloading the full dump
- Extract articles from a Wikipedia category, with subcategory recursion
- Extract specific sections by name, with alias matching (e.g., "Plot" also matches "Synopsis"; see the first sketch after this list)
- Template expansion (dates, coordinates, unit conversions → readable text)
- Content type markers ([MATH], [TABLE], etc.) instead of silent removal (see the second sketch after this list)
- Category metadata preserved in output
- JSON/JSONL output
- Parallel processing (the 24 GB English Wikipedia dump takes ~2 hours on an Apple M4)
- Written in Ruby
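
A minimal sketch of how section alias matching might work, assuming a simple lookup table; the `SECTION_ALIASES` hash and `section_match?` method are illustrative names, not the tool's actual API:

```ruby
# Hypothetical sketch: map a canonical section name to the heading
# variants it should also match. Table contents are assumptions.
SECTION_ALIASES = {
  "plot" => ["plot", "synopsis", "plot summary", "story"],
  "cast" => ["cast", "casting", "actors"],
}.freeze

# True if a heading found in the article matches the requested
# section name or any of its known aliases (case-insensitive).
def section_match?(requested, heading)
  wanted = SECTION_ALIASES.fetch(requested.downcase, [requested.downcase])
  wanted.include?(heading.downcase.strip)
end

section_match?("Plot", "Synopsis")  # => true
section_match?("Plot", "Reception") # => false
```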
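
And a sketch of the content type marker idea: non-prose wikitext blocks are replaced with typed placeholders rather than silently deleted. The patterns and marker names below are simplified assumptions, not the tool's actual rules:

```ruby
# Hypothetical sketch: substitute typed markers for wikitext blocks
# that would otherwise be stripped from the extracted text.
MARKERS = {
  /<math>.*?<\/math>/m  => "[MATH]",   # <math>...</math> blocks
  /\{\|.*?\|\}/m        => "[TABLE]",  # {| ... |} wikitables
  /\[\[File:[^\]]*\]\]/ => "[IMAGE]",  # [[File:...]] links
}.freeze

def mark_content_types(wikitext)
  MARKERS.reduce(wikitext) { |text, (pattern, marker)| text.gsub(pattern, marker) }
end

puts mark_content_types("Euler showed <math>e^{i\\pi}+1=0</math> here.")
# => "Euler showed [MATH] here."
```

The advantage over silent removal is that downstream consumers can see that something was there and what kind of content it was.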