frontpage.

Laptop Isn't Ready for LLMs

https://spectrum.ieee.org/ai-models-locally
1•Brajeshwar•32s ago•0 comments

ChatGPT Is Down

https://status.openai.com/history
1•pbshgthm•38s ago•0 comments

What is the chance your plane will be hit by space debris?

https://www.technologyreview.com/2025/11/17/1127980/what-is-the-chance-your-plane-will-be-hit-by-...
1•Brajeshwar•42s ago•0 comments

CS Pluang ID

1•Pluangcare•1m ago•0 comments

GitHub Project Search and Discovery

https://gitdb.net/
1•birdculture•1m ago•0 comments

LLMs and Creation Outside of Time

https://balajmarius.com/writings/llms-and-creation-outside-of-time/
2•vtemian•2m ago•0 comments

EU countries agree on voluntary chat monitoring (German)

https://netzpolitik.org/2025/interne-dokumente-eu-staaten-einigen-sich-auf-freiwillige-chatkontro...
1•doener•4m ago•0 comments

Shopee Help Center

1•CustomersShopee•4m ago•0 comments

AI Creates the First 100-Billion-Star Simulation of the Milky Way

https://scienceclock.com/ai-creates-the-first-100-billion-star-simulation-of-the-milky-way/
1•ashishgupta2209•4m ago•0 comments

Depth Anything 3: Recovering the Visual Space from Any Views

https://huggingface.co/spaces/depth-anything/depth-anything-3
1•doener•7m ago•0 comments

Google Antigravity

https://www.google.com/
1•denysvitali•7m ago•0 comments

Show HN: Hair Glow Up – AI hair transformations with complete vibe templates

https://gethairglowup.com
1•sauvage7•8m ago•0 comments

Shopee Paylater

1•CustomersShopee•9m ago•0 comments

The Problems with Network Tiers

https://tier1-analysis.53bits.co.uk//
1•cnkk•10m ago•0 comments

Show HN: Euroelo, a ranking of European football teams

https://euroelo.fffred.com
1•fredericdith•10m ago•0 comments

LaLiga: ISPs Must Join Anti-Piracy War to Secure Broadcasting Rights

https://torrentfreak.com/laliga-says-isps-joining-its-piracy-war-is-mandatory-for-broadcasting-ri...
2•iamnothere•12m ago•0 comments

Bank Permata WhatsApp Number

1•BankPermata•13m ago•0 comments

Hal

1•BankPermata•14m ago•0 comments

The AI Bubble That Isn't There

https://www.forbes.com/sites/jasonsnyder/2025/11/17/the-ai-bubble-that-isnt-there/
1•giuliomagnifico•14m ago•1 comments

Do Not Put Your Site Behind Cloudflare If You Don't Need To

https://huijzer.xyz/posts/123/do-not-put-your-site-behind-cloudflare-if-you-dont
3•huijzer•14m ago•0 comments

Cheese Wars: Rise of the Vibe Coder

https://steve-yegge.medium.com/cheese-wars-rise-of-the-vibe-coder-6839a6b15982
1•ingve•16m ago•0 comments

Show HN: A transparent, multi-source news analyzer

https://neutralnewsai.com
2•MarcellLunczer•16m ago•1 comments

Multiple Digital Ocean services down

https://status.digitalocean.com/incidents/lgt5xs2843rx
8•inanothertime•20m ago•1 comments

Kerala becomes first state in India to eradicate extreme poverty

https://www.channelnewsasia.com/asia/india-kerala-extreme-poverty-world-bank-poor-households-5443611
1•brettermeier•20m ago•0 comments

It's not surprising that 95% of AI enterprise projects fail

https://www.seangoedecke.com/why-do-ai-enterprise-projects-fail/
1•gfysfm•21m ago•0 comments

Towards interplanetary QUIC traffic with Rust Quinn

https://ochagavia.nl/blog/towards-interplanetary-quic-traffic/
2•fanf2•27m ago•0 comments

Google Gemini 3 Pro Model Card [pdf]

https://web.archive.org/web/20251118111103/https://storage.googleapis.com/deepmind-media/Model-Ca...
2•doso•28m ago•1 comments

The Ochre Origins of Art

https://nautil.us/the-ochre-origins-of-art-1247923/
1•fleahunter•29m ago•0 comments

Show HN: Filtered GitHub Trends

https://gh-trends.nilsherzig.com/#daily/all
2•nilsherzig•29m ago•0 comments

SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing

https://arxiv.org/abs/2510.25970
2•PaulHoule•29m ago•0 comments

Gemini 3 Pro Model Card

https://pixeldrain.com/u/hwgaNKeH
121•Topfi•1h ago

Comments

surrTurr•1h ago
https://news.ycombinator.com/item?id=45963670
margorczynski•48m ago
If these numbers are true then OpenAI is probably done, and Anthropic too. Still, it's hard to see an effective monetization method for this tech, and it is clearly eating into Google's main pie, which is search.
Sol-•45m ago
Why? These models just leapfrog each other as time advances.

One month Gemini is on top, then ChatGPT, then Anthropic. Not sure why everyone gets FOMO whenever a new version gets released.

remus•35m ago
I think Google is uniquely well placed to make a profitable business out of AI: they make their own TPUs so they don't have to pay ridiculous amounts of money to Nvidia, they have great depth of talent in building models, they've got loads of data they can use for training, and they've got a huge existing customer base who can buy their AI offerings.

I don't think any other company has all these ingredients.

gizmodo59•31m ago
While I don't disagree that Google is the company you can't bet against when it comes to AI, saying other companies are done is a stretch. If they had a significant moat, they would be at the top all the time, which is not the case.
remus•27m ago
Agreed, too early to write off others entirely. It'll be interesting to see who comes out the other side of the bubble with a working business.
adriand•9m ago
Anthropic has a fairly significant lead when it comes to enterprise usage and for coding. This seems like a workable business model to me.
mlnj•28m ago
100% the reason I am long on Google. They can take their time to recoup these new costs.

Even other search competitors have not proven to be a danger to Google. There is nothing stopping that search money from coming in.

redox99•27m ago
Considering GPT 5 was only recently released, it's very unlikely GPT will reach these scores within just a couple of months. If they had something this good in the oven, they'd probably have saved the GPT 5 name for it.

Or maybe Google just benchmaxxed and this doesn't translate at all to real-world performance.

Palmik•7m ago
GPT 5 was released more than 3 months ago. Gemini 2.5 was released less than 8 months ago.
happa•40m ago
This may just be bad recollection on my part, but hasn't Google reported that their search business is currently the most profitable it has ever been?
senordevnyc•38m ago
1) New SOTA models come out all the time and that hasn't killed the other major AI companies. This will be no different.

2) Google's search revenue last quarter was $56 billion, a 14% increase over Q3 2024.

margorczynski•7m ago
1) Not long ago, Altman and the OpenAI CFO were openly asking for public money. None of these AI companies actually have any kind of working business plan; they are just burning investor money. If investors see there is no winning against Google (or some open Chinese model), the money will dry up.

2) I'm not suggesting this will happen overnight, but younger people especially gravitate towards LLMs for information search and actively use some sort of ad blocking. In the long run it doesn't look great for Google.

paswut•35m ago
I'd love to see Anthropic/OpenAI pop. Back to some regular programming. The models are good enough; time to invest elsewhere.
ilaksh•29m ago
The only one it doesn't win is SWE-bench, where it is significantly behind Claude Sonnet. You just can't take down Sonnet.
stavros•23m ago
Codex has been much better than Sonnet for me.
dotancohen•15m ago
On what types of tasks?
lukev•27m ago
Or else it trained/overfit to the benchmarks. We won't really know until people have a chance to use it for real-world tasks.

Also, models are already pretty good but product/market fit (in terms of demonstrated economic value delivered) remains elusive outside of a couple domains. Does a model that's (say) 30% better reach an inflection point that changes that narrative, or is a more qualitative change required?

alecco•23m ago
For SWE it is the same ranking. But if Google's $20/mo plan is comparable to the $100-200 plans from OpenAI and Anthropic, then yes, they are done.

But we'll have to wait a few weeks to see whether the model is still this good after the usual post-release nerfing.

Traubenfuchs•42m ago
So does Google actually have a Claude Code alternative currently?
rjtavares•41m ago
Gemini CLI
muro•40m ago
https://github.com/google-gemini/gemini-cli
itsmevictor•37m ago
Notably, although Gemini 3 Pro seems to have much better benchmark scores than other models across the board (including compared to Claude), that's not the case for coding, where it appears to score essentially the same as the others. I wonder why that is.

So far, IMHO, Claude Code remains significantly better than Gemini CLI. We'll see whether that changes with Gemini 3.

decster•22m ago
From my experience, the quality of gemini-cli isn't great; I've run into a lot of stupid bugs.
BoredPositron•20m ago
Gemini performs better if you use it with Claude Code than with Gemini CLI. It still has some odd problems with tool calling, but a lot of the performance loss is the Gemini CLI app itself.
lifthrasiir•19m ago
Probably because many models from Anthropic would have been optimized for agentic coding in particular...

EDIT: Don't disagree that Gemini CLI has a lot of rough edges, though.

Lionga•18m ago
Because benchmarks are a retarded comparison and have nothing to do with reality. It's just jerk material for AI fanboys.
laborcontract•40m ago
It's hilarious that the release of Gemini 3 is getting eclipsed by this Cloudflare outage.
senordevnyc•37m ago
It hasn't been released, this is just a leak
amarcheschi•33m ago
On Reddit I see it's already available in Cursor:

https://www.reddit.com/r/Bard/comments/1p093fb/gemini_3_in_c...

yen223•25m ago
Coincidence? Yes
scrlk•38m ago
Benchmarks from pg 4 of the system card:

    | Benchmark                                    | 3 Pro         | 2.5 Pro       | Sonnet 4.5   | GPT-5.1     |
    |--------------------------------------------- |---------------|---------------|------------- |-------------|
    | Humanity’s Last Exam                         | 37.5%         | 21.6%         | 13.7%        | 26.5%       |
    | ARC-AGI-2                                    | 31.1%         | 4.9%          | 13.6%        | 17.6%       |
    | GPQA Diamond                                 | 91.9%         | 86.4%         | 83.8%        | 88.1%       |
    | AIME 2025 (no tools / with code execution)   | 95.0% / 100%  | 88.0% / —     | 87.0% / 100% | 88.0% / —   |
    | MathArena Apex                               | 23.4%         | 0.5%          | 1.6%         | 1.0%        |
    | MMMU-Pro                                     | 81.0%         | 68.0%         | 68.0%        | 80.8%       |
    | ScreenSpot-Pro                               | 72.7%         | 11.4%         | 36.2%        | 3.5%        |
    | CharXiv Reasoning                            | 81.4%         | 69.6%         | 68.5%        | 69.5%       |
    | OmniDocBench 1.5                             | 0.115         | 0.145         | 0.145        | 0.147       |
    | Video-MMMU                                   | 87.6%         | 83.6%         | 77.8%        | 80.4%       |
    | LiveCodeBench Pro                            | 2,439         | 1,775         | 1,418        | 2,243       |
    | Terminal-Bench 2.0                           | 54.2%         | 32.6%         | 42.8%        | 47.6%       |
    | SWE-Bench Verified                           | 76.2%         | 59.6%         | 77.2%        | 76.3%       |
    | t2-bench                                     | 85.4%         | 54.9%         | 84.7%        | 80.2%       |
    | Vending-Bench 2                              | $5,478.16     | $573.64       | $3,838.74    | $1,473.43   |
    | FACTS Benchmark Suite                        | 70.5%         | 63.4%         | 50.4%        | 50.8%       |
    | SimpleQA Verified                            | 72.1%         | 54.5%         | 29.3%        | 34.9%       |
    | MMLU                                         | 91.8%         | 89.5%         | 89.1%        | 91.0%       |
    | Global PIQA                                  | 93.4%         | 91.5%         | 90.1%        | 90.9%       |
    | MRCR v2 (8-needle) (128k avg / 1M pointwise) | 77.0% / 26.3% | 58.0% / 16.4% | 47.1% / n/a  | 61.6% / n/a |
manmal•6m ago
Looks like it will be on par with the contenders when it comes to coding. I guess improvements will be incremental from here on out.
CjHuber•3m ago
If it’s on par in code quality, it would be a way better model for coding because of its huge context window.
Alifatisk•5m ago
These numbers are impressive, to say the least. It looks like Google has produced a beast that will raise the bar even higher. What's even more impressive is how Google came into this game late and went from producing a few flops to being the leader (actually, they already earned that title with 2.5 Pro).

What makes me even more curious is the following

> Model dependencies: This model is not a modification or a fine-tune of a prior model

So did they start from scratch with this one?

benob•2m ago
What does it mean nowadays to start from scratch? At least in the open scene, most of the post-training data is generated by other LLMs.
falcor84•4m ago
That looks impressive, but some of these are a bit out of date.

On Terminal-Bench 2, for example, the leader is currently "Codex CLI (GPT-5.1-Codex)" at 57.8%, beating this new release.

oalessandr•27m ago
Trying to open this link from Italy leads to a CSAM warning
Jowsey•15m ago
Pixeldrain is a free anonymous file host, which unfortunately goes hand-in-hand with this kind of thing.
Fornax96•11m ago
Creator of pixeldrain here. Italy has been doing this for a very long time. They never notified me of any such material being present on my site. I have a lot of measures in place to prevent the spread of CSAM. I have sent dozens of mails to Polizia Postale and even tried calling them a few times, but they never respond. My mails go unanswered and they just hang up the phone.
embedding-shape•27m ago
Curiously, this website seems to be blocked in Spain for whatever reason, and the website's certificate is served by `allot.com/emailAddress=info@allot.com` which obviously fails...

Anyone happen to know why? Is this website by any chance sharing information on safe medical abortions or women's rights, something which has gotten websites blocked here before?
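
A quick way to see which certificate is actually being presented for the host is to attempt a normally verified TLS handshake and inspect the result. This is only a sketch using Python's standard library; the host name is the one under discussion here, and the allot.com issuer is simply what the comment above reports:

    import socket
    import ssl

    # Attempt a verified TLS handshake and report what comes back. On an
    # unfiltered connection this prints pixeldrain's own certificate; behind
    # an intercepting block page, verification usually fails because the
    # appliance serves its own certificate (e.g. one issued to allot.com).
    HOST = "pixeldrain.com"

    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((HOST, 443), timeout=10) as sock:
            with ctx.wrap_socket(sock, server_hostname=HOST) as tls:
                cert = tls.getpeercert()
                print("subject:", cert.get("subject"))
                print("issuer: ", cert.get("issuer"))
    except ssl.SSLCertVerificationError as err:
        print("certificate verification failed:", err.verify_message)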

amarcheschi•25m ago
That website is used to share everything, including pirated things, so maybe that's the reason.
Fornax96•18m ago
Creator of pixeldrain here. I have no idea why my site is blocked in Spain, but it's a long running issue.

I actually never discovered who was responsible for the blockade, until I read this comment. I'm going to look into Allot and send them an email.

EDIT: Also, your DNS provider is censoring (and probably monitoring) your internet traffic. I would switch to a different provider.

transcriptase•24m ago
There needs to be a sycophancy benchmark in these comparisons. More baseless praise and false agreement = lower score.
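
A toy sketch of what such a scoring rule could look like (purely illustrative; the marker phrases and weights are made up for the example, not an established benchmark):

    # Toy illustration: penalize responses that lean on empty agreement and
    # flattery, as proposed above. The marker list and 0.25 weight are
    # arbitrary example values.
    SYCOPHANTIC_MARKERS = [
        "you're absolutely right",
        "great question",
        "excellent point",
        "i completely agree",
    ]

    def sycophancy_score(response: str) -> float:
        """Return 1.0 when no sycophantic filler is detected, lower otherwise."""
        text = response.lower()
        hits = sum(text.count(marker) for marker in SYCOPHANTIC_MARKERS)
        return max(0.0, 1.0 - 0.25 * hits)

    print(sycophancy_score("You're absolutely right, great question!"))  # 0.5
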
swalsh•22m ago
You're absolutely right
jstummbillig•12m ago
Does not get old.
BoredPositron•11m ago
Your comment demonstrates a remarkably elevated level of cognitive processing and intellectual rigor. Inquiries of this caliber are indicative of a mind operating at a strategically advanced tier, displaying exceptional analytical bandwidth and thought-leadership potential. Given the substantive value embedded in your question, it is operationally imperative that we initiate an immediate deep-dive and execute a comprehensive response aligned with the strategic priorities of this discussion.
postalcoder•3m ago
I care very little about model personality outside of sycophancy. The thing about Gemini is that it's notorious for its low self-esteem. Given that this one is trained from scratch, I'm very curious to see which direction they've decided to take it.
jll29•17m ago
Gemini 3 > Gemma? Hopefully this model does not generate fake news...

https://www.google.com/search?q=gemini+u.s.+senator+rape+all...

lxdlam•16m ago
What does "Google Antigravity" mean? The link is http://antigravity.google/docs, seemingly a new product, but it now routes to the Google main page.
Palmik•14m ago
Archive link: https://web.archive.org/web/20251118111103/https://storage.g...
denysvitali•11m ago
Title of the document is "[Gemini 3 Pro] External Model Card - November 18, 2025 - v2", in case you needed further confirmation that the model will be released today.

Also interesting to know that Google Antigravity (antigravity.google / https://github.com/Google-Antigravity ?) leaked. I remember seeing this subdomain recently. Probably Gemini 3 related as well

jmkni•6m ago
What is Google Antigravity?
denysvitali•3m ago
I guess we'll know in a few hours. Most likely another AI playground? No clue, really.
catigula•10m ago
I know this is a little controversial, but I think the lack of performance on SWE-bench is hugely disappointing economically. These models don't have any viable path to profitability if they can't take engineering jobs.
mohsen1•9m ago

     This model is not a modification or a fine-tune of a prior model

Is it common to mention that? It feels like they built something from scratch.
mynti•3m ago
It is interesting that Gemini 3 beats every other model on these benchmarks, mostly by a wide margin, but not on SWE-bench. Sonnet is still king here, and all three look to be basically on the same level. Kind of wild to see them hit such a wall when it comes to agentic coding.