HathiTrust Digital Library

https://www.hathitrust.org/
67•djoldman•6mo ago

Comments

pyuser583•6mo ago
This is an excellent resource! It should be more popular!
JdeBP•6mo ago
It is. It's used on a fairly regular basis nowadays in Wikipedia, for example. A decade ago one would have seen just the Internet Archive or the dreaded Google Books hyperlinks.
dilawar•6mo ago
Haathi means elephant in Hindi. At first I thought it was an Indian site, but it is based in the US.

Curious about the connection.

pyuser583•6mo ago
There's an English saying, "an elephant never forgets." I'm guessing it's about that.
shervinafshar•6mo ago
Tangential:

- https://en.wikipedia.org/wiki/Elephant_Memory_Systems

- https://i.imgur.com/vNQURE3.jpeg

JdeBP•6mo ago
You can still find the original answer, from 2008, at https://old.www.hathitrust.org/help_general.html .
apaprocki•6mo ago
I would use this site all the time for genealogy purposes. It’s hard to unravel how the datasets are shared, because many things here are from Google’s scanning, but IMO there are lots of things that do not appear anywhere else.
robin_reala•6mo ago
We use Hathi a lot at Standard Ebooks as a source of scans to proof productions against. Archive.org has a somewhat better interface, but Hathi has a wider selection.
cxr•6mo ago
Try John Mark Ockerbloom's Online Books Page:

<https://onlinebooks.library.upenn.edu/>

For the books that have been manually curated, multiple collections are indexed, including HathiTrust and the Internet Archive. Search will also fall back to showing hits from the "extended shelves" if a title is not in the catalog.

shervinafshar•6mo ago
Thanks for your volunteer work for Standard Ebooks!
leetrout•6mo ago
My family is from Eastern KY and I had access to the HTDL and NYPL through my stint working for a public university a few years ago. It's fascinating what you can find in there! When I looked a couple of years ago, there seemed to be less publicly available than what I am seeing now.
philipkglass•6mo ago
HathiTrust is much better than Google Books about allowing access to works that are no longer under copyright in the United States. Under US law, everything published 1929 and before is currently in the public domain. But there are a lot of special cases where 20th century works published after 1929 are also in the public domain:

https://guides.library.cornell.edu/copyright/publicdomain

Google Books appears to follow the blanket 1929 rule, or did the last time I looked. HathiTrust has cleared the copyright status for many additional works following the more complex rules, e.g.

"Drawing Birds" by Joy Postle, 1953:

https://babel.hathitrust.org/cgi/pt?id=nyp.33433115876140&se...

Unfortunately, the Google-originated scans that HathiTrust has come with special restrictions. Google itself required that only people associated with the academic libraries could download whole books as a unit, even for works that are in the public domain:

https://hathitrust.atlassian.net/servicedesk/customer/portal...

Fortunately, members of the public can download individual page scans without any special affiliation. People have naturally written tools to automate this process so that full books can be reassembled and then uploaded to the Internet Archive or other book sites.

Google Books has a much faster and sometimes better search interface, so a common flow I use is to search Google Books for terms and then go to HathiTrust to read inside books that Google Books surfaced but won't show.

EDIT: corrected 1926 to 1929 per cxr's comment below.

billbrown•6mo ago
This is very helpful context. I have disparaged HathiTrust in my mind for several of these public domain problems and it makes sense that it's actually a Google Books problem.
acidburnNSA•6mo ago
As a nuclear power historian, this resource is unbelievably valuable. I've been using it for years and it constantly delivers the goods. It contains incredible multitudes.
roadside_picnic•6mo ago
Somewhat tangential, but HathiTrust was born from what I would consider the "golden age" of technical work coming out of libraries (2002-2010). One of the unintended consequences of the dotcom crash was that compensation falling meant that there were a lot of talented software people working on what interested them rather than what simply paid the most (since the gap was much smaller).

As a result research libraries were well staffed with very technical people all genuinely interested in making software that made the world a better place. MIT's DSpace, LibraryThing, Open ILSs like Evergreen/Koha, and a huge range of quirky/innovative smaller projects that no longer exist all came out of this period.

It ended around 2010 since the GFC fallout started to hit library budgets while tech suddenly started getting really hot. Even if you loved libraries, most library devs were facing pay cuts to stay in libraries versus massive raises and other quality of life improvements for going into tech. Plus startups and tech companies in general at the time felt more inspired.

geephroh•6mo ago
And now that government funding sources like IMLS, CLIR, NEH, NARA and LoC have been nuked and/or crippled, things are unlikely to get better any time soon, especially for collaborative research projects that have no immediate commercial benefit.
sadcodemonkey•6mo ago
I worked at a university library for a few short years in the 2010s. Reading your comment helped me make sense of some of the experiences I had there. I still try to keep on top of some of the trends, with the vague hope of working in that field again one day.

I'm curious what some of the "quirky/innovative smaller projects that no longer exist" are, if you're inclined to go into some details. Or if you could point to a good resource on this somewhere. A lot of technology projects in the library space seem to reinvent the wheel over and over, so I think such a list is very valuable.

TZubiri•6mo ago
One day I needed some legal info. I called the Library of Congress, and they sent me a link to HathiTrust with a hearing from 1980. It was sent to my email, and boom, I took that link and added it to Wikipedia.

All free (tax dollars, OK) and swift. It felt surreal.
