Viral ChatGPT trend is doing 'reverse location search' from photos

https://techcrunch.com/2025/04/17/the-latest-viral-chatgpt-trend-is-doing-reverse-location-search-from-photos/

108•jnord•9mo ago

Comments

simianwords•9mo ago

Working backwards -- it seems like a good idea to use GeoGuessr in the training set for SFT of sorts. I would imagine it would generalise well to other aspects.

anshumankmr•9mo ago

It's pretty good. I used 4o many months back with a picture of me from decades ago in Nalanda, and mind you, not the iconic huge wall that most people associate with it but another corner of the place, and it knew where I was back then.

defrost•9mo ago

Earlier on HN:

ChatGPT now performs well at GeoGuesser (flausch.social)

131 points | 8 hours ago | 113 comments https://news.ycombinator.com/item?id=43723408

paulgb•9mo ago

I’ve found it surprisingly good, but has anyone verified that it’s not just using EXIF geolocation data embedded in the photo? I haven’t bothered to strip it.

Edit: just saw defrost’s link to the earlier threads, and one comment did just that https://news.ycombinator.com/item?id=43724063
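For anyone who wants to rule out EXIF themselves before uploading, here is a minimal sketch that drops Exif APP1 segments from a JPEG using only the standard library. The function name `strip_exif` is made up for illustration, and it assumes a simple, well-formed file (no padding bytes between segments). Other metadata containers such as XMP are left untouched, so treat it as a starting point rather than a complete scrubber.

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of the JPEG bytes with Exif APP1 segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            out += jpeg[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length
    # No scan segment found: keep whatever trailing bytes remain.
    out += jpeg[pos:]
    return bytes(out)
```

Usage would be along the lines of reading the file in binary mode, passing the bytes through `strip_exif`, and writing the result to a new file before uploading it.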

manmal•9mo ago

Just create a screenshot of the pic and upload that.

jillesvangurp•9mo ago

I've given it a screenshot and it got close enough. So, it's not looking at exif data.

I've been playing a bit with just giving it random things from Google's streetview.

It figured out a landmark in Dresden pretty easily. But then I gave it a generic view of some street in Nuremberg and it took a long time. It extracted lots of information from the screenshot, including the logo of a travel agency and some very pixelated house numbers. Amusingly, I actually triggered some A/B testing thing where it was using two models in parallel, side by side.

I just let it do its thing and it looked like it got stuck in some loop. Eventually both models produced an answer. They both correctly identified the city but then both models came up with two wrong streets. The travel agency has offices across Germany and probably multiple in Nuremberg as well. There was no street name readable and no house numbers. It based this off just the name of the travel agency and the fact it was next to some hairdresser.

It seems to be using python scripts to zoom in on different bits of the image and search engines to query clues it finds. It also looks at architecture and other aspects to deduce things about country, regions, etc. Overall, I'm pretty impressed with how far it got.
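The actual scripts aren't shown anywhere in the thread, so the following is only a toy sketch of what "zooming in" amounts to: crop a region, then enlarge it by repeating pixels (nearest-neighbour scaling). It works on a plain list-of-rows image so it needs no imaging library; the function names are invented for illustration.

```python
def crop(pixels, left, top, width, height):
    """Cut a width x height window out of a list-of-rows image."""
    return [row[left:left + width] for row in pixels[top:top + height]]


def upscale_nearest(pixels, factor):
    """Enlarge an image by repeating each pixel `factor` times in both directions."""
    out = []
    for row in pixels:
        wide = [value for value in row for _ in range(factor)]
        # Repeat the widened row, copying it so the output rows are independent.
        out.extend(list(wide) for _ in range(factor))
    return out


def zoom(pixels, left, top, width, height, factor):
    """Crop a region of interest and blow it up for closer inspection."""
    return upscale_nearest(crop(pixels, left, top, width, height), factor)
```

A real pipeline would presumably use an imaging library with proper resampling; the point here is only the crop-then-enlarge step.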

tomjen3•9mo ago

I used it to place some old vacation photos.

These were taken without EXIF for GPS (because the cameras did not have GPS), and it was still very good. With anything touristy it got them immediately, and with my photo of a house on the Meuse river it got it nearly perfect. When I added a few more photos I was able to nail it down to within a couple hundred meters.

I did it all with 4o (the old model). The only issue is that if it starts to search the internet it will then lie to you that it's not able to read pictures.

TrackerFF•9mo ago

Use google maps, and take a screenshot.

curiousObject•9mo ago

Edit the location metadata to be obviously false (Alaska coordinates for a tropical beach photo) then see what it says?

lucraft•9mo ago

As always when there's a new trend it refuses me.

I showed it a picture of a street in Rome from our last holiday and the thinking traces show it was bang on but halfway through the output it just deletes it all and says it's against policy.

Imustaskforhelp•9mo ago

Well, this is why I personally like open source since if something cool is found, it can't be taken away.

I think the OpenAI team is putting some restrictions on the model because such reverse location search could reflect badly on their reputation. But if OpenAI were actually open source, they couldn't have done anything about it. But that's not the world we live in.

Telemakhos•9mo ago

If a trendy thing to do with AI goes viral, people feel peer pressure to do the trendy thing. If you block the trendy thing from your free version, but keep it available in your paid version, maybe you can convert some free users to paid.

AI models cost money to develop and run: that much electricity and that many processors do not come cheaply. Open Source has yet to find ways to pay for that.

squigz•9mo ago

> Open Source has yet to find ways to pay for that.

I mean, self-hosted models would call that an electricity bill.

croes•9mo ago

The same is true if something dangerous is found.

lcnPylGDnU4H9OF•9mo ago

> it's against policy

Presumably, they don’t want people using the tool to dox others. It’s a bit moot since it can still be done by humans but that requires time, effort, skill, etc., which is otherwise outsourced to the AI.

quitit•9mo ago

I'm pretty sure this extends beyond ChatGPT.

The other day I meme-ified a photo with ChatGPT. Pleased with the style I fed that into Midjourney's "Describe" feature which aims to write an image generation prompt based on the image supplied. Midjourney did include a location as part of its output description and this was indeed accurate to the original photographic source material - this is all in spite of the image fed into the system being a ChatGPT-generated caricature, with what I thought was a generic looking background.

The lesson here is that these are still algorithmically generated images - and although it may not be obvious to us, even heavily stylised images may still give away a location through the inclusion of unremarkable landmarks. In my case it appears that the particular arrangement of mountains in the background was specific to a single geographic region.

KeplerBoy•9mo ago

While I think your story is entirely plausible, I wonder if there could be something else going on. Maybe ChatGPT puts the prompt (or an assumed location) in the image's metadata?

quitit•9mo ago

Not ruling it out, but this would require both ChatGPT putting the metadata in the file and Midjourney then reading that metadata and putting it into the img2txt output. (Midjourney produces 4 sets of text outputs from the single input image; two contained location information, naming the specific mountain chains it "saw" in the caricature image.)

Assuming it's not the metadata, it's a powerful use of AI, but also one that I would not be too surprised about. It can be a useful investigative tool, or simply a fun way to hide clues for a puzzle.

numpad0•9mo ago

Generative AIs just patch together memorized data. So parts of the original data can sometimes get back out like victim's hairs out of a monster's mouth.

quitit•9mo ago

>Generative AIs just patch together memorized data.

Gen AI isn't collage. But overtraining can produce results which look like that, which has led to the confusion.

An image generation model doesn't even contain enough data to do that, it would be a remarkable form of image compression if it could.

Instead gen ai is far more flexible, and this is where its usefulness comes from - the way it can quickly combine concepts, even if the training data didn't have those particular combinations.

Oarch•9mo ago

I tried just now. It got one image exactly and proposed reasonably good but wrong guesses for the other two.

Makes me appreciate the insane level of skill that humans on GeoGuessr style subreddits have.

Gigachad•9mo ago

There are GeoGuessr-specific AIs which are able to outperform the top players consistently. It’s just surprising a generalised AI can do it.

aprilthird2021•9mo ago

Not to rain on anyone's parade. This does seem fun, but I have been doing this with Google Lens or Gemini on my phone for a while now and it's usually pretty good already?

By "a while" I mean Google Lens has been able to do this for a long time...

Rastonbury•9mo ago

Yeah I wanted to post this, reverse image search can already do this easily. Sometimes when one of my friends posts a pic in a foreign land, I try to geoguess, it once told me the name of the hotel the view was taken from

imurray•9mo ago

A photo taken on my street (no EXIF) "only" gives the correct town in ChatGPT and Gemini, and then incorrectly guesses the precise neighbourhood/street when pushed. Gemini claimed to have done a reverse image search, but I'm not convinced it did. An actual Google reverse image search found similar photos, taken a bit further along the same street or in a different direction, labelled with the correct street (no LLM required).

xeyownt•9mo ago

"New privacy risk" what the hell.

The whole internet is a privacy risk from the start. Don't want any risk? Don't publish anything. Go live on an island. Be a random.

I'm fond of boosting privacy issue awareness, but jumping directly to "booh new privacy risk" every time is insane.

junon•9mo ago

15 years ago I'd have agreed with you. However, we've managed to make almost everything about life happen on the Internet. Not just the "fun" things, but identification (and verification thereof), communication, payment, bureaucracy, hell even medicine in many ways.

We cannot lull ourselves into this idea of "technopia" where everyone - 100% of everyone - is acutely aware of privacy risks and proper security and safety posture on the internet. That will never happen, and only puts disadvantaged individuals at risk - especially since we've effectively forced them to use a system they're not capable of operating safely.

croes•9mo ago

> Don't want any risk? Don't publish anything.

I don’t, and still my data was shared by other people because they posted something or gave my private number to FB and WhatsApp.

It’s a new privacy risk and it’s legit to name it every time it is one.

Do you say the same every time they find a new cause for cancer?

Cheer2171•9mo ago

> I'm fond of boosting privacy issue awareness

No, you very clearly are not if you think it is "insane" to even just talk about this as a privacy issue. The location of an EXIF stripped image can easily be inferred, automatically, at scale, with high accuracy. If that isn't a privacy issue, I don't know what is. You may not feel it is a big concern, or you may have given up on privacy, but don't gaslight me into thinking that this is not a privacy concern. That would be literally insane. Why are you so upset with this being talked about as a privacy issue?

And I don't know why you feel compelled to give this disclaimer. Sounds like the bad faith "as a diehard conservative/liberal, this conservative/liberal policy goes too far" when in fact they are not from that side at all. "Privacy is harder now on the internet, so it is impossible, just give up" is what everyone who profits from personal data collection wants you to think.

Sadly that is part of the internet commenter's rhetorical toolbox now. Every argument sounds a little stronger if you say that you are usually part of one team on this side of the argument, but you think your team has gone too far on this one. Classic enlightened centrism. What an independent thinker! If only it were true.

labrador•9mo ago

Never in my life have I shared a picture and thought "I hope nobody knows where this was taken." Nor have I ever thought about people sharing pictures of me "I hope they keep the location private." It's on me to wear a mask if I don't want to be identified. My assumption is that any time I'm in public my life is public. To assume otherwise is folly.

LightBug1•9mo ago

Naive. This is risk at a whole new level - and should be raised in an article such as this, as it was.

notsylver•9mo ago

I've been digitising family photos using this. I scanned the photo itself and the text on it, then passed that to an LLM for OCR and used tools to get the caption verbatim, the location mentioned and the date in a standard format. That was going to be the end of it, but the OpenAI docs https://platform.openai.com/docs/guides/function-calling?lan... suggest letting the model guess coordinates instead of just grabbing names, so I did both and it was impressive. My favourite was taking a picture looking out to sea from a pier and pinpointing the exact pier.

imposterr•9mo ago

Hmm, not sure I understand how you made use of OpenAI to guess the location of a photo. Could you expand on that a bit? Thanks!

notsylver•9mo ago

I showed the model a picture and any text written on that picture and asked it to guess a latitude/longitude using the tool use API for structured outputs. That was in addition to having it transcribe the handwritten text and extract location names, which was my original goal until I saw how good it was at guessing exact coordinates. It would guess within ~200km on average, even on pictures with no information written on them.
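A rough sketch of how a figure like "within ~200km on average" could be measured, assuming the structured output arrives as JSON with `latitude` and `longitude` fields (that field naming, and the function names, are assumptions for illustration, not the schema actually used). The guess is range-checked, then compared with the known location using the haversine great-circle distance.

```python
import json
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def parse_guess(raw: str) -> tuple[float, float]:
    """Parse a JSON guess with hypothetical "latitude"/"longitude" fields."""
    data = json.loads(raw)
    lat, lon = float(data["latitude"]), float(data["longitude"])
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: {lat}, {lon}")
    return lat, lon


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_error_km(pairs):
    """Average distance over (guess, truth) pairs of (lat, lon) tuples."""
    return sum(haversine_km(*guess, *truth) for guess, truth in pairs) / len(pairs)
```

Feeding each photo's parsed guess and its known coordinates into `mean_error_km` gives a single average error for a batch.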

piinbinary•9mo ago

With a sample size of 1, Gemini 2.5 Pro (Experimental) did a great job of this (and was considerably faster than O3)

cormorant•9mo ago

The example includes the following "reasoning":

"Left-hand-drive cars, but traffic keeps to the left" -- yet the picture doesn't hint at which side traffic drives on.

"Language on the shop fascia looks like a Latin alphabet business name rather than Spanish or Portuguese" -- I'm sorry, what alphabet are Spanish and Portuguese written in?

zamalek•9mo ago

LLMs lie about their reasoning: https://www.anthropic.com/research/tracing-thoughts-language...

Lerc•9mo ago

It's worth mentioning that this is a different scenario to the reasoning models though. Reasoning models use the generated text to arrive at an answer, in a sense, it cannot lie until it gives the answer. That answer may express a reasoning that was not the reasoning used. That bit is the lie.

You can actually take this further when you consider deepseek style reinforcement. While the reasoning text may appear to show the thought process used in readable language, the model is trained to say whatever it needs to generate the right answer, that may or may not be what that text means to an outside observer. In theory it could encode extra information in word lengths or even evolve its own Turing complete gobbledegook. There are many degrees of likelihood in the options available. Perhaps one more likely is some rarely used word has some poorly trained side-effect that gives the context a kick in the right direction right before it was going to take a fork going the wrong way. Kind of a SolidGoldMagikarp spanking.

unoti•9mo ago

> LLMs lie about their reasoning

People do this all the time too! Cat scans show that people make up their minds quickly, showing activations in one part of the brain that makes snap judgements, and then a fraction of a second later the part that shows rational reasoning begins to activate. People in sales have long known this, wanting to give people emotional reasons to make the right decision, while also giving them the rational data needed to support it. [1]

I remember seeing this illustrated firsthand when our team of 8 or so people was making a big ERP purchasing decision between Oracle ERP and PeopleSoft long ago. We had divided what our application needed to do into over 400 feature areas, and in each feature area had developed a very structured set of evaluation criteria. Then we put weights on each of those to express how important it was to us. We had a big spreadsheet to rank the things.

But along the way of the 9 month sales process, we really enjoyed working with the Oracle sales team a lot better. We felt like we'd be able to work with them better. In the end, we ran all the numbers, and PeopleSoft came out on top. And we sat there and soberly looked each other in the eyes, and said "We're going with Oracle." (Actually I remember one lady on the team when asked for her vote said, "It's gotta be the big O.")

Salespeople know that ultimately it's a gut decision, even if the people buying things don't realize that themselves.

[1] https://pmc.ncbi.nlm.nih.gov/articles/PMC6310859/

bluefirebrand•9mo ago

> People do this all the time too

I wish people would stop comparing AI to Humans, honestly

I know humans are flawed. We all know

The appeal of computer systems is that they are consistent. The ideal software is bug free, zero flaws

Creating human-like computer systems is so worthless. Why would we want to make them less predictable and less consistent

reneretord•9mo ago

I actually prefer a system that's correct half of the time at a thousandth of the cost and thousands of times the speed.

superluserdo•9mo ago

The real answer is it's completely domain-specific. If you're trying to search for something that you'll instantly know when you see it, then something that can instantly give you 5 wrong answers and 1 right answer is a godsend and barely worse than something that is right 100% of the time. If the task is to be an authoritative designer of a new aeroplane, it's a different story.

tediousgraffit1•9mo ago

Because we can still do things computers can't and that's interesting

seunosewa•9mo ago

Language models happen to share human flaws, but like humans they can amplify their abilities and reliability by building and using reliable tools.

graypegg•9mo ago

Hey, maybe they're in a really old part of town! ...like really REALLY old. https://en.wikipedia.org/wiki/Iberian_scripts

gknapp•9mo ago

I just played a full round of GeoGuessr world with Gemini 2.5 and got a score of 22k / 25k (so a silver medal). This puts it in the realm of a "pretty good" player.

It was shockingly accurate with its guesses of Essen, Germany and Sheffield, UK, but faltered a bit in Italy (it thought Genoa was Siena) and Russia (it guessed Samara but it was actually a small town about 400 miles to the west). It also guessed Orlando when it was Tampa.

Still this was only giving it a single image to work off of, where any player would be able to move around for a few minutes.

TrackerFF•9mo ago

Worked so-so for me. Took a picture from my street, and cropped it a bit to leave out some significant landmark in the distance. It missed by around 500 km, but deduced a lot of things correctly.

Then I used the uncropped picture, and it spent 3 minutes trying to look at the features of said landmark. It got hung up on some similar (and much more famous) island which is even further away from here.

Lastly I used a Google Images photo of said landmark (which is an island with a lighthouse) - which was quite clear. But it insisted it was the same island as in the previous try.

jncfhnb•9mo ago

Hmmm this could be really problematic tbh.

The version of using reasoning to do GeoGuessr to find approximate locations is fine. But we should fully expect this tech to reasonably soon be able to rapidly vector search satellite imagery or even non-satellite imagery to pinpoint locations based on landmarks that should feel unusable to us humans.

We’re going to create a fuzzy visual index for every location in the world.
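To make "vector search" concrete, here is a toy sketch: each known place is stored as an embedding vector, and a query photo's embedding is matched against the index by cosine similarity. The three-number vectors and place labels used with it would be invented purely for illustration; a real system would take embeddings from an image model and use an approximate nearest-neighbour index rather than a full sort.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_locations(query, index, k=1):
    """Return the k (label, similarity) pairs from `index` most similar to `query`.

    `index` is a list of (label, embedding) tuples.
    """
    scored = [(label, cosine_similarity(query, vec)) for label, vec in index]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

The brute-force scan is linear in the number of indexed places, which is exactly the part a production index would replace.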

morkalork•9mo ago

Google probably has this from their street view repository.

huydotnet•9mo ago

I gave it this picture https://i.imgur.com/HyfVxiD.jpeg

At first, it was unsure, but it also mentioned that there are a lot of riverside cafes in Southeast Asia that have this view. Then I said it was in Vietnam, and it immediately concluded that this was taken at the Han River in Da Nang city, which was correct.

I can see that there is some actual analysis skill here. I'm not 100% convinced, but I'm still impressed.

sumedh•9mo ago

I took a screenshot of your image and used this prompt: "play the game geoguesser and guess where this image was taken"

Putting those pieces together, the most likely spot is one of the cafés on the east bank just north of Dragon Bridge. A popular candidate with a very similar railing/table setup is Bridgespan Café (also called Bridge Cafe) at ≈ 16.0645 N, 108.2292 E.

Location guess: A second‑floor riverside café on Trần Hưng Đạo street, east bank of the Hàn River, Đà Nẵng, Vietnam (looking southwest toward Dragon Bridge).

Approx. coordinates: 16.064 °N, 108.229 °E

Confidence level: 70 %

The bridge‑light pattern and cruise‑boat LEDs strongly suggest Đà Nẵng, but several cafés share almost identical views, so the exact establishment is harder to pin down.

krunck•9mo ago

People taking pictures with you in them without your permission, whether intentional or not, are invading your privacy.

knowitnone•9mo ago

you're saying you've NEVER taken a picture of anyone in the background without their permission?

throwaway85747•9mo ago

It's not inconceivable, given that smartphones became widespread about a decade ago. Somehow, we endured for hundreds of thousands of years without recording every single moment of our lives to feed into the social media data mining machine.

retrochameleon•9mo ago

It was a crapshoot, but I asked ChatGPT how I'd set the clock on my car radio by giving it a picture. Not only did it tell me the correct method, but it identified my radio as a "typical factory radio installed in early 2000s [insert make here] vehicles."
