I solved the IIT-JEE Mains paper with LLM. Here are the results

https://www.iexplain.app/jee/jee-mains-exam-jan-22-2025-morning
1•roninthesky•7mo ago

Comments

roninthesky•7mo ago
I built an LLM-powered tool for competitive exam explanations and decided to low-key test the "solutions" part on one of the JEE Mains 2025 papers (India's most competitive engineering entrance exam, with ~1.2M students).

Raw results:
- 75 total questions
- 67 correct answers
- 6 questions couldn't be processed (required diagram input; not supported yet)
- 2 incorrect
- 97% accuracy on processable questions, 89% overall
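As a quick check on the two percentages, here is a minimal sketch that recomputes them from the counts listed above. The variable names are just for illustration.

```python
# Counts taken from the raw results above.
total = 75
correct = 67
unprocessed = 6   # needed diagram input
incorrect = 2

# The three buckets should account for every question.
assert correct + unprocessed + incorrect == total

processable = total - unprocessed          # 69 questions the tool could attempt
acc_processable = correct / processable    # accuracy on attempted questions
acc_overall = correct / total              # accuracy counting unprocessed as misses

print(f"processable: {acc_processable:.1%}")  # 97.1%
print(f"overall:     {acc_overall:.1%}")      # 89.3%
```

So the 97% figure is 67 out of 69, and the 89% figure treats the six unprocessed questions as wrong.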

The JEE covers advanced physics, chemistry, and mathematics at a level that traditionally requires years of intensive preparation.

The two failures were revealing:

Physics optics problem: The LLM made a sign error when differentiating the mirror equation to get the image acceleration. My extensive formatting rules may also have contributed to this, which I want to look into further.
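To show where a sign can slip in this kind of calculation, here is a generic sketch of differentiating the mirror equation twice. It assumes the form 1/v + 1/u = 1/f with constant focal length f and a time-dependent object distance u(t); the actual exam problem and its sign convention are not reproduced in this thread, so this is not the official solution.

```latex
% Assumption: mirror equation 1/v + 1/u = 1/f, f constant, u = u(t), v = v(t).
\begin{align*}
v &= \frac{u f}{u - f} \\
\frac{dv}{dt} &= -\frac{f^{2}}{(u - f)^{2}}\,\frac{du}{dt} \\
\frac{d^{2}v}{dt^{2}} &= \frac{2 f^{2}}{(u - f)^{3}}\left(\frac{du}{dt}\right)^{2}
  - \frac{f^{2}}{(u - f)^{2}}\,\frac{d^{2}u}{dt^{2}}
\end{align*}
```

The image velocity carries a minus sign relative to the object velocity, and the acceleration has two terms with opposite signs, so dropping or flipping either one changes the final numerical answer.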

Chemical kinetics problem: Failed on a numerical simplification step. The official solution uses a neat trick of replacing e^-23.031 with e^(-10 × ln 10), i.e. roughly 10^-10, to make the arithmetic manageable. The LLM computed the raw exponential instead and accumulated rounding errors.
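A small sketch of why that substitution works: 10 × ln 10 is very close to 23.031, so the exponential collapses to about 10^-10. This only illustrates the arithmetic. It does not reproduce the exam question or the LLM's actual working.

```python
import math

exponent = -23.031                 # value quoted above
ten_ln10 = 10 * math.log(10)       # about 23.0259, close to 23.031

raw = math.exp(exponent)           # direct evaluation of the exponential
shortcut = 10.0 ** -10             # e^(-10 ln 10) is exactly 10^-10

# Relative difference between the direct value and the shortcut.
rel_gap = abs(raw - shortcut) / shortcut

print(f"10 * ln(10) = {ten_ln10:.4f}")
print(f"exp(-23.031) = {raw:.4e}")
print(f"relative gap vs 1e-10 = {rel_gap:.2%}")
```

The two values agree to within about half a percent, which is why working with 10^-10 is much easier by hand than evaluating the exponential and carrying its digits through the rest of the calculation.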

Both were numerical answer questions (no multiple choice options to guide toward the right approach).

I think it's too early to say anything about reliability, but I find the results very interesting.

I will be working on more JEE papers soon and will report back with cumulative stats over more questions.

chiph2o•7mo ago
interesting interface

is it open-source?

which LLM are you using?

roninthesky•7mo ago
Thanks.

> is it open-source?
No, it isn't open source right now. It's mostly vibe coded, so it's not in the best shape to be open sourced.

> which LLM are you using?
It's configurable. I keep switching between Gemini 2.5 Pro, o3, and a fine-tuned 4.1. I also switch models between different actions, such as the initial explanation versus getting more details or chatting. In general I have found o3 to be the better one at generating explanations end to end.
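For readers wondering what per-action model switching could look like, here is a minimal sketch. Everything in it is hypothetical: the action names, the model label strings, the `pick_model` helper, and the choice of model for follow-up chat are illustrative assumptions, not the tool's actual configuration. The only detail taken from the thread is that o3 is preferred for generating explanations.

```python
# Hypothetical mapping from action to model label; not the author's real config.
MODEL_BY_ACTION = {
    "initial_explanation": "o3",        # thread says o3 works best for explanations
    "follow_up_chat": "gemini-2.5-pro", # assumption made for this example only
}

# Fallback when an action has no explicit entry (also an assumption).
DEFAULT_MODEL = "o3"


def pick_model(action: str) -> str:
    """Return the model label configured for an action, or the default."""
    return MODEL_BY_ACTION.get(action, DEFAULT_MODEL)


print(pick_model("initial_explanation"))
print(pick_model("follow_up_chat"))
print(pick_model("something_else"))
```

Keeping the mapping in one table makes it easy to swap a model for a single action without touching the rest of the pipeline.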