Sessions are typically 3–7 hours long, mixing English and Swahili. This tool transcribes, chunks, and summarizes them to make political content more accessible and searchable for the public.
I've been working on something in the same space for the Belgian federal parliament. The Belgian parliament livestreams sessions and publishes a single (long, bloated, dual-language) PDF report[0] for each session and that's it.
This means no search across sessions, no details of which parties voted how, no API etc. The only view you get is from the perspective of a single session which is not very useful when you're trying to figure out who to vote for.
I made 'zij werken voor u' (TheyWorkForYou[1] in Dutch) by automatically scraping the PDF files and parsing them with a Rust script.
The scraped data (votes, questions, topics, dossiers) is written to .parquet files. I also compute some additional things like voting patterns, attendance, and which topics interest specific MPs the most.
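A minimal Python sketch of that kind of derived metric (this is not the actual Rust code; the record fields, the `absent` label, and the sample rows are assumptions made up for illustration):

```python
from collections import Counter, defaultdict

# Hypothetical flat vote records, one row per (vote, member).
# Field names and values are illustrative assumptions, not the site's real schema.
VOTES = [
    {"vote_id": 1, "member": "Alice", "party": "P1", "choice": "yes"},
    {"vote_id": 1, "member": "Bob", "party": "P1", "choice": "no"},
    {"vote_id": 1, "member": "Carol", "party": "P2", "choice": "absent"},
    {"vote_id": 2, "member": "Alice", "party": "P1", "choice": "yes"},
    {"vote_id": 2, "member": "Bob", "party": "P1", "choice": "absent"},
    {"vote_id": 2, "member": "Carol", "party": "P2", "choice": "yes"},
]


def attendance(records):
    """Fraction of votes in which each member was not marked absent."""
    present = Counter()
    total = Counter()
    for r in records:
        total[r["member"]] += 1
        if r["choice"] != "absent":
            present[r["member"]] += 1
    return {member: present[member] / total[member] for member in total}


def party_breakdown(records):
    """Count choices per (vote_id, party) to see how each party voted."""
    breakdown = defaultdict(Counter)
    for r in records:
        breakdown[(r["vote_id"], r["party"])][r["choice"]] += 1
    return breakdown


if __name__ == "__main__":
    print(attendance(VOTES))
    print(dict(party_breakdown(VOTES)))
```

Keeping one row per (vote, member) makes both metrics a single pass over the same table, which is also a shape that maps naturally onto a columnar file.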
These parquet files are then fed into a static site generator, and a search index is built. I also sprinkle in some summarization using Mistral[2].
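For a static site, one common approach is to build an inverted index at build time and ship it alongside the pages. A rough Python sketch of that idea follows; it is an assumption about the general technique, not a description of how zijwerkenvooru.be actually builds its index, and the document IDs and texts are invented:

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Map each token to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


def search(index, query):
    """Return IDs of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], []))
    for token in tokens[1:]:
        result &= set(index.get(token, []))
    return sorted(result)


# Hypothetical documents; real entries would come from the parsed reports.
DOCS = {
    "vote-1": "Budget amendment on railway funding",
    "question-7": "Question about railway delays",
    "vote-2": "Pension reform budget",
}

if __name__ == "__main__":
    index = build_index(DOCS)
    print(search(index, "railway budget"))
```

The resulting dictionary can be serialized to JSON and queried client-side, so no server is needed.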
The result is https://zijwerkenvooru.be/nl/votes/ (in Dutch), which lets you look at the data from multiple viewpoints, such as:
- what questions did member X ask?
- how did party Y vote?
- what is happening around topic Z?
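These viewpoints are essentially the same records grouped by a different key. A small Python sketch of that pivot, with made-up field names and sample questions (purely illustrative, not the site's real data model):

```python
from collections import defaultdict

# Hypothetical question records; fields and values are assumptions.
QUESTIONS = [
    {"member": "Alice", "party": "P1", "topic": "energy", "title": "Grid capacity"},
    {"member": "Bob", "party": "P2", "topic": "energy", "title": "Gas prices"},
    {"member": "Alice", "party": "P1", "topic": "mobility", "title": "Rail delays"},
]


def group_by(records, key):
    """Group record titles by the given field (member, party, topic, ...)."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r["title"])
    return dict(groups)


if __name__ == "__main__":
    print(group_by(QUESTIONS, "member"))  # one page per member
    print(group_by(QUESTIONS, "topic"))   # one page per topic
```

Each grouping can then feed its own set of generated pages.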
I also post new votes/questions on Bluesky[3]. The whole process (downloading, scraping, publishing, posting) is automated to run through GitHub Actions. I literally have to do nothing now.
I'm hoping the Belgian government will step up and improve their archaic and almost unusable site[4].
Thanks for sharing this project, I'm already getting inspired by it to improve zijwerkenvooru.be!
Edit: I’m thinking it might be good to have an overview of initiatives like these somewhere? Public initiatives to help with political transparency for each country?
[0]: https://www.dekamer.be/doc/PCRI/html/56/ip052x.html
[1]: https://www.theyworkforyou.com/
[2]: https://mistral.ai/
On my end, it’s a bit frustrating that our Parliament still only shares PDF reports weeks after sessions happen, likely compiled manually. No API, no transcript archive, and no structured metadata around bills, speakers, or topics.
That’s partly why I started building Bunge Bits: to sidestep the bottlenecks and make the information usable.
Appreciate you sharing zijwerkenvooru.be, bookmarking it for inspiration as I figure out what’s next.
arecsu•3h ago