I solved an IIT-JEE Mains paper with an LLM. Here are the results

https://www.iexplain.app/jee/jee-mains-exam-jan-22-2025-morning

1•roninthesky•7mo ago

Comments

roninthesky•7mo ago

I built an LLM-powered tool for competitive exam explanations and decided to quietly test the "solutions" part on one of the JEE Mains 2025 papers (India's most competitive engineering entrance exam, with ~1.2M students).

Raw results:
- 75 total questions
- 67 correct answers
- 6 questions couldn't be processed (required diagram input, not supported yet)
- 2 incorrect
- 97% accuracy on processable questions, 89% overall
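For anyone who wants to check the two percentages, here is a minimal sketch of the arithmetic. The function name `accuracy_summary` is just an illustrative helper, not part of the tool.

```python
def accuracy_summary(total, correct, unprocessed):
    """Return accuracy over processable questions and over the whole paper."""
    processable = total - unprocessed  # questions the tool could attempt
    return {
        "processable": processable,
        "incorrect": processable - correct,
        "accuracy_processable": correct / processable,
        "accuracy_overall": correct / total,
    }


# Numbers taken from the raw results above.
summary = accuracy_summary(total=75, correct=67, unprocessed=6)
print(f"processable: {summary['processable']}, incorrect: {summary['incorrect']}")
print(f"accuracy on processable: {summary['accuracy_processable']:.1%}")
print(f"accuracy overall: {summary['accuracy_overall']:.1%}")
```

Note that the "overall" figure counts the 6 unprocessed questions as misses.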

The JEE covers advanced physics, chemistry, and mathematics at a level that traditionally requires years of intensive preparation.

The two failures were revealing:

Physics optics problem: The LLM made a sign error when differentiating the mirror equation to get the image's acceleration. My extensive formatting rules may also have contributed to this, which I want to look into further.
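To show where the signs come in, here is a rough sketch of differentiating the mirror equation 1/v + 1/u = 1/f twice along the axis. It assumes the Cartesian sign convention and made-up numbers (a concave mirror with f = -10, object at u = -30). The exam question's actual values and setup are not reproduced here, and `image_kinematics` is a hypothetical helper.

```python
def image_kinematics(u, u_dot, u_ddot, f):
    """Image position, velocity and acceleration along the axis.

    Derived from 1/v + 1/u = 1/f, i.e. v = u*f / (u - f).
    """
    v = u * f / (u - f)
    # First derivative: -v'/v^2 - u'/u^2 = 0  =>  dv/du = -(v/u)^2
    dv_du = -((v / u) ** 2)
    # Second derivative of v = u*f/(u - f) with respect to u
    d2v_du2 = 2 * f**2 / (u - f) ** 3
    v_dot = dv_du * u_dot
    # Chain rule: v'' = (d2v/du2) * u'^2 + (dv/du) * u''
    v_ddot = d2v_du2 * u_dot**2 + dv_du * u_ddot
    return v, v_dot, v_ddot


# Illustrative values only (assumed, not from the exam question).
v, v_dot, v_ddot = image_kinematics(u=-30.0, u_dot=5.0, u_ddot=2.0, f=-10.0)
print(v, v_dot, v_ddot)
```

The minus sign in dv/du is the easy one to drop, and it then propagates into the acceleration through the chain rule.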

Chemical kinetics problem: The LLM failed on a numerical simplification step. The official solution uses a neat trick of replacing e^-23.031 with e^(-10 × ln 10) = 10^-10 to make the arithmetic manageable. The LLM computed the raw exponential instead and accumulated rounding errors.
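A quick sketch of why the trick works: since ln 10 ≈ 2.303, an exponent of about -23.03 is roughly -10 × ln 10, so the exponential collapses to 10^-10 (the exponent has to be negative to match e^-23.031). The snippet below only compares the two numbers and is not the full exam calculation.

```python
import math

raw_exponent = -23.031               # value quoted from the problem
trick_exponent = -10 * math.log(10)  # -10 * ln 10, about -23.0259

raw_value = math.exp(raw_exponent)      # what you get by brute force
trick_value = math.exp(trick_exponent)  # equals 10**-10 up to float error

print(f"-10 * ln 10       = {trick_exponent:.4f}")
print(f"exp(-23.031)      = {raw_value:.6e}")
print(f"exp(-10 * ln 10)  = {trick_value:.6e}")
print(f"relative gap      = {abs(raw_value / trick_value - 1):.3%}")
```

The two values differ by well under 1%, so treating the exponential as exactly 10^-10 keeps the later arithmetic clean, while carrying the raw decimal through several steps leaves more room for rounding drift.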

Both were numerical answer questions (no multiple choice options to guide toward the right approach).

I think it's too early to comment about any kind of reliability but I find the results very interesting.

I will be working on more JEE papers soon and will report back with cumulative stats over more questions.

chiph2o•7mo ago

interesting interface

is it open-source?

which LLM are you using?

roninthesky•7mo ago

Thanks.

> is it open-source?

No, it isn't open source right now. It's mostly vibe coded, so it's not in the best shape to be open-sourced.

> which LLM are you using?

It's configurable. I keep switching between Gemini 2.5 Pro, o3 and a fine-tuned 4.1. I also switch models between different actions, e.g. the initial explanation vs. getting more details/chatting. Generally I have found o3 to be the better one at generating explanations end to end.
