ChatGPT now performs well at GeoGuesser (flausch.social)
131 points | 8 hours ago | 113 comments https://news.ycombinator.com/item?id=43723408
Edit: just saw defrost’s link to the earlier threads, and one comment did just that https://news.ycombinator.com/item?id=43724063
I've been playing a bit with just giving it random things from Google's streetview.
It figured out a landmark in Dresden pretty easily. But then I gave it a generic view of some street in Nuremberg and it took a long time. It extracted lots of information from the screenshot, including the logo of a travel agency and some very pixelated house numbers. Amusingly, I actually triggered some A/B testing thing where it was using two models in parallel, side by side.
I just let it do its thing and it looked like it got stuck in some loop. Eventually both models produced an answer. They both correctly identified the city but then both models came up with two wrong streets. The travel agency has offices across Germany and probably multiple in Nuremberg as well. There was no street name readable and no house numbers. It based this off just the name of the travel agency and the fact it was next to some hairdresser.
It seems to be using python scripts to zoom in on different bits of the image and search engines to query clues it finds. It also looks at architecture and other aspects to deduce things about country, regions, etc. Overall, I'm pretty impressed with how far it got.
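For anyone curious what "zooming in on different bits of the image" might look like, here is a minimal sketch. It is not the script ChatGPT actually runs, which isn't visible in detail; the function name `zoom_boxes` and the grid and overlap parameters are my own assumptions. It only computes overlapping crop boxes, which could then be passed to something like Pillow's `Image.crop`.

```python
def zoom_boxes(width, height, rows=2, cols=2, overlap=0.1):
    """Return (left, upper, right, lower) crop boxes tiling an image.

    Hypothetical helper: splits a width x height image into a rows x cols
    grid and pads each tile by `overlap` (a fraction of the tile size) so
    that details sitting on a tile border are not cut in half.
    """
    tile_w = width / cols
    tile_h = height / rows
    pad_x = tile_w * overlap
    pad_y = tile_h * overlap
    boxes = []
    for r in range(rows):
        for c in range(cols):
            # Clamp every edge to the image bounds.
            left = max(0, int(c * tile_w - pad_x))
            upper = max(0, int(r * tile_h - pad_y))
            right = min(width, int((c + 1) * tile_w + pad_x))
            lower = min(height, int((r + 1) * tile_h + pad_y))
            boxes.append((left, upper, right, lower))
    return boxes


if __name__ == "__main__":
    for box in zoom_boxes(1000, 800):
        print(box)
```

Each box is in the (left, upper, right, lower) order that Pillow's `crop` expects, so a pixelated house number or shop logo can be cropped and upscaled on its own before the model looks at it again.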
These were taken without EXIF for GPS (because the cameras did not have GPS), and it was still very good. With anything touristy it got them immediately, and with my photo of a house on the Meuse river it got it nearly perfect. When I added a few more photos I was able to nail it down to a couple hundred meters.
I did it all with 4o (the old model). The only issue is that if it starts to search the internet, it will then lie to you that it's not able to read pictures.
I showed it a picture of a street in Rome from our last holiday and the thinking traces show it was bang on but halfway through the output it just deletes it all and says it's against policy.
I think the OpenAI team is putting some restrictions on the model because this kind of reverse location lookup could reflect badly on their reputation. But if OpenAI were actually open source, they couldn't have done anything about it. But that's not the world we live in.
AI models cost money to develop and run: that much electricity and that many processors do not come cheaply. Open Source has yet to find ways to pay for that.
I mean, self-hosted models would call that an electricity bill.
Presumably, they don’t want people using the tool to dox others. It’s a bit moot since it can still be done by humans but that requires time, effort, skill, etc., which is otherwise outsourced to the AI.
The other day I meme-ified a photo with ChatGPT. Pleased with the style I fed that into Midjourney's "Describe" feature which aims to write an image generation prompt based on the image supplied. Midjourney did include a location as part of its output description and this was indeed accurate to the original photographic source material - this is all in spite of the image fed into the system being a ChatGPT-generated caricature, with what I thought was a generic looking background.
The lesson here is that these are still algorithmically generated images - and although it may not be obvious to us, even heavily stylised images may still give away a location through the inclusion of unremarkable landmarks. In my case it appears that the particular arrangement of mountains in the background was specific to a single geographic region.
Makes me appreciate the insane level of skill that humans on GeoGuessr style subreddits have.
I mean, Google Lens has been able to do this for a long time...
The whole internet is a privacy risk from the start. Don't want any risk? Don't publish anything. Go live on an island. Be a random.
I'm fond of boosting privacy issue awareness, but jumping directly to "booh new privacy risk" every time is insane.
We cannot lull ourselves into this idea of "technopia" where everyone - 100% of everyone - is acutely aware of privacy risks and proper security and safety posture on the internet. That will never happen, and only puts disadvantaged individuals at risk - especially since we've effectively forced them to use a system they're not capable of operating safely.
I don't, and still my data was shared by other people because they posted something or gave my private number to FB and WhatsApp.
It’s a new privacy risk and it’s legit to name it every time it is one.
Do you say the same every time they find a new cause of cancer?
No, you very clearly are not if you think it is "insane" to even just talk about this as a privacy issue. The location of an EXIF stripped image can easily be inferred, automatically, at scale, with high accuracy. If that isn't a privacy issue, I don't know what is. You may not feel it is a big concern, or you may have given up on privacy, but don't gaslight me into thinking that this is not a privacy concern. That would be literally insane. Why are you so upset with this being talked about as a privacy issue?
And I don't know why you feel compelled to give this disclaimer. Sounds like the bad faith "as a diehard conservative/liberal, this conservative/liberal policy goes too far" when in fact they are not from that side at all. "Privacy is harder now on the internet, so it is impossible, just give up" is what everyone who profits from personal data collection wants you to think.
Sadly that is part of the internet commenter's rhetorical toolbox now. Every argument sounds a little stronger if you say that you are usually part of one team on this side of the argument, but you think your team has gone too far on this one. Classic enlightened centrism. What an independent thinker! If only it were true.
"Left-hand-drive cars, but traffic keeps to the left" -- yet the picture doesn't hint at which side traffic drives on.
"Language on the shop fascia looks like a Latin alphabet business name rather than Spanish or Portuguese" -- I'm sorry, what alphabet are Spanish and Portuguese written in?
You can actually take this further when you consider DeepSeek-style reinforcement. While the reasoning text may appear to show the thought process used in readable language, the model is trained to say whatever it needs to generate the right answer, which may or may not be what that text means to an outside observer. In theory it could encode extra information in word lengths or even evolve its own Turing-complete gobbledegook. There are many degrees of likelihood in the options available. Perhaps a more likely one is that some rarely used word has a poorly trained side effect that gives the context a kick in the right direction right before it was going to take a fork going the wrong way. Kind of a SolidGoldMagikarp spanking.
People do this all the time too! Cat scans show that people make up their minds quickly, showing activations in one part of the brain that makes snap judgements, and then a fraction of a second later the part that shows rational reasoning begins to activate. People in sales have long known this, wanting to give people emotional reasons to make the right decision, while also giving them the rational data needed to support it. [1]
I remember seeing this illustrated ourselves when our team of 8 or so people was making a big ERP purchasing decision between Oracle ERP and Peoplesoft long ago. We had divided what our application needed to do into over 400 feature areas, and in each feature area had developed a very structured set of evaluation criteria for each area. Then we put weights on each of those to express how important it was to us. We had a big spreadsheet to rank the things.
But along the way of the 9 month sales process, we really enjoyed working with the Oracle sales team a lot better. We felt like we'd be able to work with them better. In the end, we ran all the numbers, and Peoplesoft came out on top. And we sat there and soberly looked each other in the eyes, and said "We're going with Oracle." (Actually I remember one lady on the team when asked for her vote said, "It's gotta be the big O.")
Salespeople know that ultimately it's a gut decision, even if the people buying things don't realize that themselves.
I wish people would stop comparing AI to Humans, honestly
I know humans are flawed. We all know
The appeal of computer systems is that they are consistent. The ideal software is bug free, zero flaws
Creating human-like computer systems is so worthless. Why would we want to make them less predictable and less consistent?
It was shockingly accurate with its guesses of Essen, Germany and Sheffield, UK, but faltered a bit in Italy (it thought Genoa was Siena) and Russia (it guessed Samara but it was actually a small town about 400 miles to the west). It also guessed Orlando when it was Tampa.
Still this was only giving it a single image to work off of, where any player would be able to move around for a few minutes.
Then I used the uncropped picture, and it spent 3 minutes trying to look at the features of said landmark. It got hung up on some similar (and much more famous) island which is even further away from here.
Lastly I used a Google Images photo of said landmark (which is an island with a lighthouse), which was quite clear. But it insisted it was the same island as in the previous try.
Using reasoning to play GeoGuessr and find approximate locations is fine. But we should fully expect this tech to be able, reasonably soon, to rapidly vector-search satellite imagery or even non-satellite imagery to pinpoint locations based on landmarks that would seem unusable to us humans.
We’re going to create a fuzzy visual index for every location in the world.
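To make the "fuzzy visual index" idea concrete, here is a toy sketch of a vector search. Everything in it is an assumption for illustration: the three-dimensional vectors are made up, the place labels are just borrowed from this thread, and a real system would use learned image embeddings with hundreds of dimensions plus an approximate nearest-neighbour index rather than a full sort.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, index, k=1):
    """Return the k index keys whose vectors are most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up "embeddings" of known locations (purely illustrative values).
INDEX = {
    "da_nang_riverfront": [0.9, 0.1, 0.3],
    "nuremberg_street": [0.1, 0.8, 0.2],
    "sheffield_centre": [0.2, 0.3, 0.9],
}

if __name__ == "__main__":
    # A made-up embedding of the photo we want to locate.
    photo = [0.8, 0.2, 0.25]
    print(nearest(photo, INDEX, k=2))
```

The point is only that once every place has a vector, "where was this taken?" becomes a similarity lookup, and the photo does not need any landmark a human would recognise.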
At first, it was unsure, but it also mentioned that there are a lot of riverside cafes in Southeast Asia that have this view. Then I said it was in Vietnam, and it immediately concluded that this was taken at the Han River in Da Nang city, which was correct.
I can see that there is some actual analysis skill here. I'm not 100% convinced, but I'm still impressed.
Putting those pieces together, the most likely spot is one of the cafés on the east bank just north of Dragon Bridge. A popular candidate with a very similar railing/table setup is Bridgespan Café (also called Bridge Cafe) at ≈ 16.0645 N, 108.2292 E.
Location guess: A second‑floor riverside café on Trần Hưng Đạo street, east bank of the Hàn River, Đà Nẵng, Vietnam (looking southwest toward Dragon Bridge).
Approx. coordinates: 16.064 °N, 108.229 °E
Confidence level: 70 %
The bridge‑light pattern and cruise‑boat LEDs strongly suggest Đà Nẵng, but several cafés share almost identical views, so the exact establishment is harder to pin down.
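As a sanity check on how close the two coordinate pairs in that output are to each other, here is a small haversine sketch. The function name and the mean Earth radius of 6371 km are my own choices, and the formula treats the Earth as a sphere, so take the result as approximate.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Candidate café vs. the rounded "approx. coordinates" from the output.
    cafe = (16.0645, 108.2292)
    approx = (16.064, 108.229)
    print(f"{haversine_m(*cafe, *approx):.0f} m apart")
```

The two points come out well under a hundred meters apart, so the rounded coordinates and the named café are effectively the same guess.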