Show HN: I indexed 8,643 BSides talks across 227 chapters and 6 continents

https://allbsides.com/
2•Parkado•2h ago
Hi HN,

I'm Roland, and for the past few weeks I've been building AllBSides, a directory of every BSides conference talk uploaded to YouTube. As of today, it covers 8,643 talks from 5,927 speakers across 227 chapters in 68 countries. Combined runtime is 280 days. The transcripts come to about 60 million words.
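
As a rough sanity check, the per-talk averages can be derived from the totals quoted above. This sketch assumes the 280 days means continuous playback time (24-hour days) and treats "about 60 million words" as a round number. The variable names are illustrative.

```python
# Totals quoted in the post; everything derived from them is approximate.
TALKS = 8_643
RUNTIME_DAYS = 280        # assumption: 24-hour days of playback
WORDS = 60_000_000        # "about 60 million words"

total_minutes = RUNTIME_DAYS * 24 * 60
avg_minutes_per_talk = total_minutes / TALKS
avg_words_per_talk = WORDS / TALKS
words_per_minute = WORDS / total_minutes

print(f"average talk length: {avg_minutes_per_talk:.1f} min")
print(f"average transcript:  {avg_words_per_talk:,.0f} words")
print(f"speaking rate:       {words_per_minute:.0f} words/min")
```

Under those assumptions the averages come out to roughly 47 minutes and 6,900 words per talk, or about 149 words per minute, so the three totals are at least consistent with each other.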

The archive came together in stages:

1. Manually map every BSides chapter's YouTube channel
2. Pull every video and transcript from Supabase
3. Run each transcript through Haiku for tag extraction (tools, topics, difficulty, team, talk style, research method, and much more)
4. Run results through Sonnet for categorization and dedup
5. Final pass goes through Opus for verification
6. Do a manual verification. At one time, the pipeline showed over 16k AI suggestions for manual verification; today, most are resolved.
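
The staged flow above (cheap extraction, then categorization and dedup, then verification with a manual-review flag) could be sketched roughly as follows. This is not the author's code: each tier here is a plain lookup standing in for an LLM call, and the `ALIASES` table, the `Talk` class, and all function names are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical alias table. The real pipeline uses LLM calls, not lookups.
ALIASES = {
    "wireshark": "Wireshark",
    "tshark": "Wireshark",
    "nmap": "Nmap",
    "metasploit": "Metasploit",
    "msfconsole": "Metasploit",
}


@dataclass
class Talk:
    video_id: str
    transcript: str
    tags: list = field(default_factory=list)
    needs_review: bool = False


def extract(transcript):
    """Tier 1 stand-in (cheap, noisy): return every raw mention, duplicates included."""
    words = (w.strip(".,;:!?()").lower() for w in transcript.split())
    return [w for w in words if w in ALIASES]


def categorize(raw_mentions):
    """Tier 2 stand-in: map mentions to canonical names and drop duplicates."""
    canonical = []
    for mention in raw_mentions:
        name = ALIASES[mention]
        if name not in canonical:
            canonical.append(name)
    return canonical


def verify(tags):
    """Tier 3 stand-in: return True when a human should review the result."""
    return len(tags) == 0


def run_pipeline(talk):
    talk.tags = categorize(extract(talk.transcript))
    talk.needs_review = verify(talk.tags)
    return talk


talk = run_pipeline(
    Talk("abc123", "We captured traffic with tshark, then opened it in Wireshark and ran Nmap.")
)
print(talk.tags, talk.needs_review)
```

The point of the shape is that the noisy first tier can over-report freely, because later tiers collapse duplicates and only route the uncertain cases to a person.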

Total LLM cost so far: about €200. The whole pipeline is rebuildable from scratch.

Each talk gets its own page with embedded video, full transcript, speakers, tags, and "related talks." Each tool/framework/protocol/standard mentioned across the corpus gets its own page (3,968 distinct technologies tracked).

Some interesting facts I gathered while building it:

-(A) The site is currently 94% bot traffic. Of that, about 80,000 hits/month are AI training crawlers (ClaudeBot, GPTBot, meta-externalagent). Within 7 days of the talks archive going live, all major AI labs had ingested the entire corpus. The discovery cascade was startling to watch in real time.

-(B) The taxonomy work was the hardest part. Distinguishing "tools" from "frameworks" from "protocols" from "concepts" sounds easy until you have 5,000 ambiguous extracted entities. The 3-tier LLM pipeline helped a lot — Haiku alone was too noisy, Opus alone was too expensive.

-(C) Top tools mentioned: Wireshark (343), PowerShell (342), Metasploit (332), Burp Suite (322), GitHub (296), VirusTotal (273), Docker (253), Splunk (251), Nmap (247), MITRE ATT&CK (237). The list reflects what BSides talks actually discuss, not what vendors curate.

-(D) May is the peak BSides month — 29 events, 17% of all events with dates.

-(E) The top 1% of talks (86 videos by view count) account for 51% of all viewership. The other 99% are deeply niche, often the only video record of a specific technique.
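
For point (A), one simple way to estimate bot and AI-crawler shares is to bucket requests by user-agent substring. The sketch below assumes access to a list of user-agent strings from server logs. The AI crawler markers are the three names mentioned above, the generic bot markers are a heuristic assumption, and the sample strings are made up for illustration. User agents can be spoofed, so the result is only an estimate.

```python
from collections import Counter

# Names taken from the post; not an exhaustive list of AI crawlers.
AI_CRAWLER_MARKERS = ("claudebot", "gptbot", "meta-externalagent")
# Heuristic assumption: many other bots include one of these words.
GENERIC_BOT_MARKERS = ("bot", "crawler", "spider")


def classify_user_agent(user_agent):
    """Bucket a user-agent string by case-insensitive substring match."""
    ua = user_agent.lower()
    if any(marker in ua for marker in AI_CRAWLER_MARKERS):
        return "ai_crawler"
    if any(marker in ua for marker in GENERIC_BOT_MARKERS):
        return "other_bot"
    return "human_or_unknown"


def traffic_shares(user_agents):
    """Return the fraction of requests falling into each bucket."""
    counts = Counter(classify_user_agent(ua) for ua in user_agents)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()} if total else {}


# Illustrative strings only, not real log lines.
sample = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "meta-externalagent/1.1",
    "Mozilla/5.0 (compatible; SomeSearchBot/2.0)",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0",
]
print(traffic_shares(sample))
```

The AI markers are checked first because names like GPTBot would otherwise also match the generic "bot" marker.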

The stack is intentionally lean: Go, SQLite, vanilla JavaScript, BunnyCDN. Static rendering at build time. No frameworks, no client-side state. The site costs about €50/month to run.
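
To illustrate what "static rendering at build time" from SQLite can look like, here is a minimal sketch. The site itself is written in Go, so this Python version only shows the idea. The `talks` table, its columns, the page template, and the `build_site` function are all hypothetical.

```python
import sqlite3
import tempfile
from html import escape
from pathlib import Path

# Hypothetical page template; a real talk page would carry far more content.
PAGE = """<!doctype html>
<html><head><title>{title}</title></head>
<body><h1>{title}</h1><p>Speaker: {speaker}</p></body></html>
"""


def build_site(conn, out_dir):
    """Write one static HTML file per row; no server-side work at request time."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    rows = conn.execute("SELECT slug, title, speaker FROM talks ORDER BY slug")
    for slug, title, speaker in rows:
        path = out_dir / f"{slug}.html"
        # Escape database values so they cannot inject markup into the page.
        path.write_text(
            PAGE.format(title=escape(title), speaker=escape(speaker)),
            encoding="utf-8",
        )
        written.append(path)
    return written


# Assumed schema, populated with one made-up row for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE talks (slug TEXT PRIMARY KEY, title TEXT, speaker TEXT)")
conn.executemany(
    "INSERT INTO talks VALUES (?, ?, ?)",
    [("intro-to-nmap", "Intro to Nmap & Friends", "A. Speaker")],
)

with tempfile.TemporaryDirectory() as tmp:
    pages = build_site(conn, Path(tmp) / "talks")
    html = pages[0].read_text(encoding="utf-8")

print(html)
```

Because every page is a plain file produced once per build, the output can be served straight from a CDN with no application server behind it.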

The data behind this post and much more can be found in the site footer, under the link "stats".

Happy to answer questions about the data pipeline, the taxonomy decisions, or what the AI crawler patterns looked like as the archive went live. Feedback on what to build next is genuinely welcome — I'm a solo dev figuring this out as I go.

— Roland (parkado)

Programmers Sell Ox, Not UX

https://www.makonea.com/en-US/casual/programmers-sell-ox-not-ux
1•jdw64•29s ago•0 comments

Probability Distributions: An Intuitive Guide

https://tawsifk.substack.com/p/probability-distributions-an-intuitive
1•t35khan•3m ago•0 comments

Stripped an AI agent down to a bash loop – No Framework

https://github.com/seedpi867-cmd/seed
1•seed867•4m ago•0 comments

Marc Andreessen, A16Z and Netscape

https://www.davidsenra.com/episode/marc-andreessen
1•gnabgib•5m ago•0 comments

The Dragon Won Because Nobody Fought It (2014) [video]

https://www.youtube.com/watch?v=cZYNADOHhVY
1•warbaker•11m ago•1 comments

Why AI rarely says "I don't know"

https://medium.com/@blueshirts23/i-got-chatgpt-to-confess-its-own-design-logic-heres-exactly-what...
1•BoundaryTester•13m ago•0 comments

8Veda – AI-powered news intelligence. Bias-indexed. Neutral

https://8veda.com/
1•anthonymooz•14m ago•0 comments

Peter Thiel backs $1B ocean data centre startup powered by waves

https://www.ft.com/content/711ce313-16fb-4a12-b6be-fbed547c8a39
1•voxadam•15m ago•1 comments

Y Combinator's Stake in OpenAI (0.6%)

https://daringfireball.net/2026/05/y_combinators_stake_in_openai
3•gyomu•15m ago•0 comments

Show HN: I built a native macOS audio player and it changed my life

https://github.com/chrisallick/light-crime-audio-player
1•chrisallick•18m ago•1 comments

Ribbon – A Linkding Client

https://www.coryd.dev/posts/2026/ribbon-a-linkding-client
1•cdrnsf•19m ago•0 comments

Show HN: Agent Historic Philosophical Persona Routing and Prompts

https://github.com/barretts/AgentHistoric
2•sosuke•22m ago•1 comments

I Bought a TV with No 'Smart' Features [video]

https://www.youtube.com/watch?v=LJh72_O4pXE
1•absqueued•24m ago•0 comments

Using agroforestry to buffer noise [pdf]

https://www.fs.usda.gov/nac/assets/documents/agroforestrynotes/an42w05.pdf
1•koolba•24m ago•0 comments

An Introduction to LangChain's Deep Agents

https://medium.com/@ngpeijiun/an-introduction-to-langchains-deep-agents-ad14b511f3dc
2•eugenis•26m ago•0 comments

Kredd – open-source SaaS application for ranking cold inbound emails

https://github.com/DomHudson/kredd
1•domhudson•29m ago•0 comments

Open-source diagnostic for AI misalignment. Model agnostic, industry agnostic

https://github.com/ifixai-ai/diagnostic
1•dimneo24•30m ago•1 comments

Highlander returns to theaters in glorious 4K, for 40th anniversary.

https://www.polygon.com/highlander-returns-to-theaters-in-glorious-4k/
2•nephihaha•30m ago•2 comments

The actual strategy plan Walt Disney gave investors

https://hbr.org/resources/images/article_assets/2013/05/disney-2.jpeg
1•megamike•35m ago•0 comments

Austria expels three Russian embassy staff after 'forest of antennae' discovered

https://www.theguardian.com/world/2026/may/04/austria-expels-three-russian-embassy-staff-vienna-s...
3•CqtGLRGcukpy•35m ago•0 comments

Show HN: Yames – A distraction-free desktop metronome built with Rust and Tauri

https://turutupa.github.io/yames/
2•turutupa•35m ago•0 comments

May the 4th be with the ballpark: Inside MLB's Star Wars obsession

https://www.espn.com/mlb/story/_/id/48652519/mlb-star-wars-promotions-traditions-4th
2•1659447091•35m ago•0 comments

Running a Company with Agents

https://cofounder.co
1•yuedongze•35m ago•0 comments

AOL killed the early internet on a single day in September 1993

https://twitter.com/GeniusGTX/status/2051316737749217627
3•bilsbie•36m ago•1 comments

Suspected YouTube bug spikes RAM over 7 GB; users report lag and frozen tabs

https://www.tomshardware.com/software/a-suspected-youtube-interface-bug-spikes-ram-usage-above-7-...
6•Zeidd•37m ago•0 comments

I left academia to sell Elephant Garlic online

https://demeterfamilyfarm.com/
1•WWIII_Historian•40m ago•1 comments

2026 Cocodona Livestream Day 1 [video]

https://www.youtube.com/watch?v=dWhF6tTn8zI
1•BiraIgnacio•44m ago•0 comments

VSCode Dark Islands – Safe Version

https://github.com/raaid3/vscode-dark-islands
1•raaid3•46m ago•1 comments

Metalenz Has Figured Out a Way to Make Face ID Invisible

https://www.wired.com/story/metalenz-has-figured-out-a-way-to-make-face-id-invisible/
1•0in•48m ago•0 comments

An unbiased benchmark for how well agents can read your docs

https://docsalot.dev/benchmarks/docs
2•fazkan•50m ago•1 comments