HathiTrust Digital Library

https://www.hathitrust.org/

67•djoldman•6mo ago

Comments

pyuser583•6mo ago

This is an excellent resource! It should be more popular!

JdeBP•6mo ago

It is. It's used on a fairly regular basis nowadays in Wikipedia, for example. A decade ago one would have seen just the Internet Archive or the dreaded Google Books hyperlinks.

dilawar•6mo ago

Haathi means elephant in Hindi. At first I thought it was an Indian site, but it is based in the US.

Curious about the connection.

pyuser583•6mo ago

There's an English saying, "an elephant never forgets." I'm guessing it's about that.

shervinafshar•6mo ago

Tangential:

- https://en.wikipedia.org/wiki/Elephant_Memory_Systems

- https://i.imgur.com/vNQURE3.jpeg

JdeBP•6mo ago

You can still find the original answer, from 2008, at https://old.www.hathitrust.org/help_general.html .

apaprocki•6mo ago

I would use this site all the time for genealogy purposes. It’s hard to unravel how the datasets are shared, because many things here are from Google’s scanning, but IMO there are lots of things that do not appear anywhere else.

robin_reala•6mo ago

We use Hathi a lot at Standard Ebooks as a source of scans to proof productions against. Archive.org has a somewhat better interface, but Hathi has a wider selection.

cxr•6mo ago

Try John Mark Ockerbloom's Online Books Page:

<https://onlinebooks.library.upenn.edu/>

For the books that have been manually curated, multiple collections are indexed, including HathiTrust and the Internet Archive. Search will also fall back to showing hits from the "extended shelves" if a title is not in the catalog.

shervinafshar•6mo ago

Thanks for your volunteer work for Standard Ebooks!

leetrout•6mo ago

My family is from Eastern KY and I had access to the HTDL and NYPL through my stint working for a public university a few years ago. It's fascinating what you can find in there! When I looked a couple of years ago, it seemed like there wasn't as much publicly available as what I am seeing now.

philipkglass•6mo ago

HathiTrust is much better than Google Books about allowing access to works that are no longer under copyright in the United States. Under US law, everything published 1929 and before is currently in the public domain. But there are a lot of special cases where 20th century works published after 1929 are also in the public domain:

https://guides.library.cornell.edu/copyright/publicdomain

Google Books appears to follow the blanket 1929 rule, or did the last time I looked. HathiTrust has cleared the copyright status for many additional works following the more complex rules, e.g.

"Drawing Birds" by Joy Postle, 1953:

https://babel.hathitrust.org/cgi/pt?id=nyp.33433115876140&se...

Unfortunately, HathiTrust's Google-originated scans come with special restrictions. Google itself required that only people associated with the academic libraries could download whole books as a unit, even for works that are in the public domain:

https://hathitrust.atlassian.net/servicedesk/customer/portal...

Fortunately, members of the public can download individual page scans without any special affiliation. People have naturally written tools to automate this process so that full books can be reassembled and then uploaded to the Internet Archive or other book sites.

Google Books has a much faster and sometimes better search interface, so a common flow I use is to search Google Books for terms and then go to HathiTrust to read inside books that Google Books surfaced but won't show.

EDIT: corrected 1926 to 1929 per cxr's comment below.

billbrown•6mo ago

This is very helpful context. I have disparaged HathiTrust in my mind for several of these public domain problems and it makes sense that it's actually a Google Books problem.

acidburnNSA•6mo ago

As a nuclear power historian, this resource is unbelievably valuable. I've been using it for years and it constantly delivers the goods. It contains incredible multitudes.

roadside_picnic•6mo ago

Somewhat tangential, but HathiTrust was born from what I would consider the "golden age" of technical work coming out of libraries (2002-2010). One of the unintended consequences of the dotcom crash was that compensation falling meant that there were a lot of talented software people working on what interested them rather than what simply paid the most (since the gap was much smaller).

As a result research libraries were well staffed with very technical people all genuinely interested in making software that made the world a better place. MIT's DSpace, LibraryThing, Open ILSs like Evergreen/Koha, and a huge range of quirky/innovative smaller projects that no longer exist all came out of this period.

It ended around 2010 since the GFC fallout started to hit library budgets while tech suddenly started getting really hot. Even if you loved libraries, most library devs were facing pay cuts to stay in libraries versus massive raises and other quality of life improvements for going into tech. Plus startups and tech companies in general at the time felt more inspired.

geephroh•6mo ago

And now that government funding sources like IMLS, CLIR, NEH, NARA and LoC have been nuked and/or crippled, things are unlikely to get better any time soon, especially for collaborative research projects that have no immediate commercial benefit.

sadcodemonkey•6mo ago

I worked at a university library for a few short years in the 2010s. Reading your comment helped me make sense of some of the experiences I had there. I still try to keep on top of some of the trends, with the vague hope of working in that field again one day.

I'm curious what some of the "quirky/innovative smaller projects that no longer exist" are, if you're inclined to go into some details. Or if you could point to a good resource on this somewhere. A lot of technology projects in the library space seem to reinvent the wheel over and over, so I think such a list is very valuable.

TZubiri•6mo ago

One day I needed some legal info, so I called the Library of Congress, and they sent me a link to HathiTrust with a hearing from 1980. Sent to my email, boom, I took that link and added it to Wikipedia.

All free (tax dollars ok) and swift, felt surreal.
