How OpenElections uses LLMs

https://thescoop.org/archives/2025/06/09/how-openelections-uses-llms/index.html
73•m-hodges•6h ago

Comments

simonw•4h ago
This is such an excellent example of a responsible and thorough application of vision LLMs to a gnarly data entry problem.
polskibus•3h ago
It’s also an excellent example of how the lack of a mandated machine-readable format for government publishing is a PITA.
sitkack•2h ago
JSON to QR code would be a good start. PRIOR ART inb4 a troll.
Mtinie•1h ago
If I was in power and wanted to continue said rule, I’d definitely discourage the adoption of any standardized formatting for election results.

Not, you know, for any nefarious purpose…but because what we’ve used forever was good enough for grandpappy, so it’s obviously good enough for us.

/cough

nxrabl•4h ago
Very interesting! Is this the state of the art for accurate OCR of tabular PDFs, or is there other work in the space to compare against?
SnooSux•3h ago
There are lots of posts on HN about developments and companies doing OCR and document extraction. It's a classic CV problem, but it has still come a long way in the past couple of years.
dwillis•2h ago
Yeah, this is a very well-traveled road, but LLMs have made some big improvements. If you asked me (the guy who wrote the original piece linked above) what I'd use if accuracy alone were the goal, it would probably be AWS Textract. But accuracy and structure? Gemini.
benob•2h ago
I wonder how difficult it would be to bias a model so that it subtly corrupts election results when performing OCR.
croemer•2h ago
Surely not hard, but why?
bilbo0s•2h ago
Easier to steal elections?

Don't have to bother with gerrymandering, or slick legal ways to arrest people for voting with the wrong documents. Or just good old fashioned intimidation, like making the polling place the police station or the ICE detention facility.

It's just a lot smoother process when you can simply write some software to manipulate the count.

Who's gonna check?

(No, seriously, who's gonna check? Because you also need to lay off everyone in that department once you're in power.)

simonw•2h ago
Corrupted OCR won't help you steal elections. The result counting is a different process, with well designed checks and safeguards.

The problem is that once the counts are done and have been reported, a lot of places print those results out on paper and then scan those papers into a PDF for anyone who asks for a copy!

dwillis•1h ago
Many jurisdictions do risk-limiting audits using the original ballots, so futzing with the results wouldn't necessarily make that easier. Also, cast vote records are public in many states - those are records of each ballot cast. So people can check.
philips•1h ago
I think you mean risk limiting, right?
bilbo0s•50m ago
Freudian Slip?
dwillis•7m ago
Yes, thanks! Fixed.
philips•1h ago
You may consider reading about risk limiting audits. https://www.voting.works/audits
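
To make the "people can check" point from the cast vote record comment concrete, here is a minimal sketch of re-tallying cast vote records and comparing them with reported totals. The record layout, the contest name, and the helpers `tally` and `discrepancies` are all hypothetical, and this is a plain full recount of sample data, not a risk-limiting audit (which samples ballots statistically).

```python
from collections import Counter

def tally(cast_vote_records, contest):
    """Count one vote per record for the given contest, skipping blanks."""
    counts = Counter()
    for record in cast_vote_records:
        choice = record.get(contest)
        if choice:
            counts[choice] += 1
    return counts

def discrepancies(tallied, reported):
    """Return {candidate: (tallied, reported)} wherever the two counts differ."""
    names = set(tallied) | set(reported)
    return {
        name: (tallied.get(name, 0), reported.get(name, 0))
        for name in names
        if tallied.get(name, 0) != reported.get(name, 0)
    }

# Hypothetical sample data: one dict per ballot, keyed by contest name.
cvrs = [
    {"mayor": "Smith"},
    {"mayor": "Jones"},
    {"mayor": "Smith"},
    {"mayor": ""},  # undervote: no choice recorded
]
reported = {"Smith": 2, "Jones": 2}

recount = tally(cvrs, "mayor")
diff = discrepancies(recount, reported)
print(diff)
```

An empty result would mean the published totals match the ballot-level records; any entry points at a candidate whose reported count deserves a closer look.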
GardenLetter27•2h ago
Why is the original source data not available anywhere digitally?

Since it's printed, it is clearly already in a database somewhere. Why can't that just be made public too?

Seems bizarre to OCR printed documents (although I am aware of many companies doing this to parse invoices, etc.)

simonw•2h ago
Welcome to government data.

One key problem is that the US has tens of thousands of local governments, and each of them gets to solve problems in its own way.

Digital literacy of the kind that understands why releasing a CSV file is more valuable than a PDF is rare enough that most of them won't have someone with that level of thinking in a decision making role.

fasthands9•1h ago
In college (about 15 years ago) I worked for a professor who was compiling precinct-level results for old elections. My job was just to request the info and then do manual data entry. It was abysmally slow.

This application seems very good, but it's still a bit amazing that lawmakers haven't just required that all data be uploaded as CSV! Even if every CSV had a slightly different format, it would be way easier for everyone (LLM or not).
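
As a rough illustration of why slightly different CSVs would still be manageable, here is a minimal sketch that maps differing column headers onto one schema. The alias table, the sample files, and the helper `normalize_rows` are assumptions made up for this example, not anything OpenElections is known to use.

```python
import csv
import io

# Hypothetical header aliases; real files would need their own mapping.
HEADER_ALIASES = {
    "precinct": "precinct",
    "precinct name": "precinct",
    "pct": "precinct",
    "candidate": "candidate",
    "name": "candidate",
    "votes": "votes",
    "total": "votes",
    "vote total": "votes",
}

def normalize_rows(csv_text):
    """Yield rows with canonical keys, skipping columns we don't recognize."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        out = {}
        for header, value in row.items():
            key = HEADER_ALIASES.get((header or "").strip().lower())
            if key is not None:
                out[key] = (value or "").strip()
        # Vote counts sometimes carry thousands separators, e.g. "1,204".
        out["votes"] = int(out["votes"].replace(",", ""))
        yield out

# Two made-up county files with different headers and number formatting.
county_a = "Precinct,Candidate,Votes\nWard 1,Smith,120\nWard 1,Jones,95\n"
county_b = 'PCT,Name,Total\n2A,Smith,"1,204"\n2A,Jones,987\n'

rows = list(normalize_rows(county_a)) + list(normalize_rows(county_b))
for row in rows:
    print(row)
```

The per-jurisdiction work shrinks to maintaining the alias table, which is far less effort than OCR plus manual verification.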

xp84•32m ago
I could be wildly off-base, but I wonder if some of these systems are airgapped, and the only way the data comes off of the closed system is via printing, to avoid someone inserting a flash drive full of malware in the guise of "copying the CSV file." Obviously there are or should be technical ways to safely extract data in a digital format, but I can see a little value in the provable safety that airgapping gives you.
dwillis•8m ago
In some cases that's true, but for many jurisdictions the results systems are third-party vendor platforms, too.

Compiling LLMs into a MegaKernel: A path to low-latency inference

https://zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17
79•matt_d•2h ago•21 comments

Curved-Crease Sculpture

https://erikdemaine.org/curved/
139•wonger_•8h ago•19 comments

Homegrown Closures for Uxn

https://krzysckh.org/b/Homegrown-closures-for-uxn.html
46•todsacerdoti•4h ago•2 comments

How OpenElections uses LLMs

https://thescoop.org/archives/2025/06/09/how-openelections-uses-llms/index.html
73•m-hodges•6h ago•21 comments

Andrej Karpathy: Software in the era of AI [video]

https://www.youtube.com/watch?v=LCEmiRjPEtQ
1041•sandslash•21h ago•576 comments

Show HN: EnrichMCP – A Python ORM for Agents

https://github.com/featureform/enrichmcp
62•bloppe•4h ago•16 comments

Show HN: A DOS-like hobby OS written in Rust and x86 assembly

https://github.com/krustowski/rou2exOS
124•krustowski•8h ago•24 comments

Estrogen: A Trip Report

https://smoothbrains.net/posts/2025-06-15-estrogen.html
91•sebg•2h ago•28 comments

Extracting memorized pieces of books from open-weight language models

https://arxiv.org/abs/2505.12546
33•fzliu•3d ago•7 comments

Star Quakes and Monster Shock Waves

https://www.caltech.edu/about/news/star-quakes-and-monster-shock-waves
26•gmays•2d ago•4 comments

Testing a Robust Netcode with Godot

https://studios.ptilouk.net/little-brats/blog/2024-10-23_netcode.html
21•smig0•2d ago•5 comments

Guess I'm a Rationalist Now

https://scottaaronson.blog/?p=8908
189•nsoonhui•11h ago•544 comments

Show HN: RM2000 Tape Recorder, an audio sampler for macOS

https://rm2000.app
6•marcelox86•2d ago•1 comment

Show HN: Claude Code Usage Monitor – real-time tracker to dodge usage cut-offs

https://github.com/Maciek-roboblog/Claude-Code-Usage-Monitor
176•Maciej-roboblog•12h ago•99 comments

What would a Kubernetes 2.0 look like

https://matduggan.com/what-would-a-kubernetes-2-0-look-like/
112•Bogdanp•10h ago•179 comments

Flowspace (YC S17) Is Hiring Software Engineers

https://flowspace.applytojob.com/apply/6oDtY2q6E9/Software-Engineer-II
1•mrjasonh•5h ago

Show HN: Unregistry – “docker push” directly to servers without a registry

https://github.com/psviderski/unregistry
604•psviderski•23h ago•134 comments

Posit floating point numbers: thin triangles and other tricks (2019)

http://marc-b-reynolds.github.io/math/2019/02/06/Posit1.html
42•fanf2•7h ago•27 comments

DNA floating in the air tracks wildlife, viruses, even drugs

https://www.sciencedaily.com/releases/2025/06/250603114822.htm
52•karlperera•3d ago•49 comments

Juneteenth in Photos

https://texashighways.com/travel-news/the-history-of-juneteenth-in-photos/
164•ohjeez•4h ago•95 comments

Why do we need DNSSEC?

https://howdnssec.works/why-do-we-need-dnssec/
64•gpi•5h ago•100 comments

From LLM to AI Agent: What's the Real Journey Behind AI System Development?

https://www.codelink.io/blog/post/ai-system-development-llm-rag-ai-workflow-agent
110•codelink•12h ago•34 comments

Visual History of the Latin Alphabet

https://uclab.fh-potsdam.de/arete/en
91•speckx•2d ago•62 comments

Munich from a Hamburger's perspective

https://mertbulan.com/2025/06/14/munich-from-a-hamburgers-perspective/
95•toomuchtodo•4d ago•79 comments

Geochronology supports LGM age for human tracks at White Sands, New Mexico

https://www.science.org/doi/10.1126/sciadv.adv4951
33•gametorch•6h ago•12 comments

Getting Started Strudel

https://strudel.cc/workshop/getting-started/
122•rcarmo•3d ago•48 comments

Elliptic Curves as Art

https://elliptic-curves.art/
191•nill0•18h ago•24 comments

My iPhone 8 Refuses to Die: Now It's a Solar-Powered Vision OCR Server

https://terminalbytes.com/iphone-8-solar-powered-vision-ocr-server/
420•hemant6488•1d ago•177 comments

June 2025 C2PA News

https://www.tbray.org/ongoing/When/202x/2025/06/17/More-C2PA
14•timbray•5h ago•0 comments

In praise of “normal” engineers

https://charity.wtf/2025/06/19/in-praise-of-normal-engineers/
117•zdw•4h ago•85 comments