All Text in NYC - https://news.ycombinator.com/item?id=42367029 - Dec 2024 (4 comments)
All text in Brooklyn - https://news.ycombinator.com/item?id=41344245 - Aug 2024 (50 comments)
(The commenters below are right. It is the Maps API, not compute, that I should worry about. Using the free tier, it would have taken the author years to download all tiles. I wish I had their budget!)
It's the Google Maps API costs that will sink your project if you can't get them waived as art:
https://mapsplatform.google.com/pricing/
Not sure how many panoramas there are in New York or your metro, but if it's over the free tier you're talking thousands of dollars.
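To put rough numbers on that, here is a minimal sketch of the arithmetic. Every figure in it is a placeholder assumption (the request count, the per-1,000 price, and the free monthly allowance are hypothetical, not Google's actual rates), so check the pricing page above before trusting any result.

```python
import math

def api_cost_usd(num_requests, usd_per_1000, free_per_month=0, months=1):
    """Estimated bill when requests are spread evenly over `months`,
    with `free_per_month` requests waived each month."""
    if months < 1:
        raise ValueError("months must be at least 1")
    per_month = num_requests / months
    billable = max(0.0, per_month - free_per_month) * months
    return billable * usd_per_1000 / 1000

def months_on_free_tier(num_requests, free_per_month):
    """Months needed to fetch everything without exceeding the free allowance."""
    if free_per_month <= 0:
        raise ValueError("free_per_month must be positive")
    return math.ceil(num_requests / free_per_month)

# Hypothetical inputs, for illustration only (not real pricing):
PANOS = 1_000_000        # assumed number of panorama requests
PRICE_PER_1000 = 5.0     # assumed USD per 1,000 requests
FREE_PER_MONTH = 10_000  # assumed free requests per month

print(api_cost_usd(PANOS, PRICE_PER_1000, FREE_PER_MONTH))  # all in one month
print(months_on_free_tier(PANOS, FREE_PER_MONTH))           # staying free
```

With these placeholder inputs, paying for everything in one month lands in the thousands of dollars, while staying inside the free allowance stretches the download over years. That is the trade-off the comments describe.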
I'm wondering more about the data: did they use Google's API, or work with Google to use the data?
OCR I'd expect to be comparatively cheap, if you weren't in a hurry - a consumer GPU running PaddlePaddle server can do about 4 MP per second. If you spent a few grand on hardware that might work out to 3-6 months of processing, depending on the resolution per pano and size of your model.
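A back-of-envelope version of that estimate, as a sketch only. The 4 MP/s throughput comes from the comment above. The panorama count and the megapixels per panorama are placeholder assumptions, not figures from the project.

```python
SECONDS_PER_DAY = 86_400

def ocr_days(num_panos, megapixels_per_pano, mp_per_second=4.0, num_gpus=1):
    """Estimate wall-clock days needed to OCR every panorama.

    Assumes throughput scales linearly with the number of GPUs and
    ignores download, decoding, and I/O time.
    """
    if mp_per_second <= 0 or num_gpus <= 0:
        raise ValueError("throughput and GPU count must be positive")
    total_megapixels = num_panos * megapixels_per_pano
    seconds = total_megapixels / (mp_per_second * num_gpus)
    return seconds / SECONDS_PER_DAY

# Hypothetical workload: 8 million panoramas at 8 MP each.
for gpus in (1, 2):
    print(f"{gpus} GPU(s): {ocr_days(8_000_000, 8.0, num_gpus=gpus):.0f} days")
```

Under those assumed inputs, one GPU takes about six months and two take about three, which is consistent with the 3-6 month range above. Doubling the resolution per panorama doubles the time.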
Again, a complex problem and I love it...
A game: find an English word with the fewest hits. (It must have at least one hit that is not an OCR error, but such errors do still count towards your score. Only spend a couple of minutes.) My best is "scintillating": 3.
WorldPeas•2h ago
JackFr•2h ago
Instead shows me thousands of “Rev”
adrianparsons•1h ago