There are about 80k active contracts across Kalshi and Polymarket right now: real-time probability estimates for everything from tonight's NBA games to whether Bitcoin will be up in 5 minutes. But the data is fragmented, with no unified way to search it.
We trade prediction markets every day and needed a smarter way to search across all of them. We started by ingesting everything into Postgres and writing SQL. Then we built a Claude wrapper that queries the DB in plain English: type a question, get structured results. That actually worked and felt like what search should be. But the answers were only as good as the data, and the data was a mess.
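Conceptually the wrapper is just "question in, SQL out, rows back." Here's a minimal sketch of that loop; the `markets` table, its columns, the DSN, and the model ID are placeholders, not our actual schema:

```python
import anthropic
import psycopg2

# Placeholder schema description handed to the model (not our real table).
SCHEMA = "markets(id, platform, title, category, resolves_at, volume, yes_price)"

def ask(question: str) -> list[tuple]:
    # Ask Claude to translate the plain-English question into one SQL query.
    client = anthropic.Anthropic()
    msg = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model ID
        max_tokens=512,
        system=(
            "Translate the user's question into a single read-only Postgres "
            f"query against this schema. Reply with bare SQL only:\n{SCHEMA}"
        ),
        messages=[{"role": "user", "content": question}],
    )
    sql = msg.content[0].text.strip()  # assumes the model replies with bare SQL
    # Run the generated query against the markets DB and return raw rows.
    with psycopg2.connect("dbname=markets") as conn, conn.cursor() as cur:
        cur.execute(sql)
        return cur.fetchall()

print(ask("NBA games resolving tonight, sorted by volume"))
```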
The core problem: Both platforms structure their data completely differently. One NBA game on Kalshi is dozens of separate contracts, each with its own ticker. Polymarket has the same game as a handful of contracts with different naming. One says "Cleveland," the other says "Cavaliers." You can't search across any of it without cleaning it up first.
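To make the mismatch concrete, here's an invented example of how one game can come in from each side and the canonical record we normalize both into (tickers, slugs, and the alias table below are illustrative, not real market IDs):

```python
# Raw Kalshi side: many contracts per game, each with its own ticker (invented).
kalshi_raw = [
    {"ticker": "KXNBAGAME-CLE-BOS-25NOV30", "title": "Cleveland to win vs Boston?"},
    {"ticker": "KXNBASPREAD-CLE-BOS-25NOV30-5", "title": "Cleveland wins by 5+?"},
    # ...dozens more contracts for the same game
]

# Raw Polymarket side: a handful of contracts, different naming (invented slug).
polymarket_raw = [
    {"slug": "cavaliers-vs-celtics", "outcome": "Cavaliers"},
]

# Alias table so "Cleveland" and "Cavaliers" resolve to the same entity.
TEAM_ALIASES = {"cleveland": "CLE", "cavaliers": "CLE", "boston": "BOS", "celtics": "BOS"}

# Canonical event both sides map into, so one search hits contracts from both platforms.
canonical_event = {
    "sport": "NBA",
    "away": "CLE",
    "home": "BOS",
    "date": "2025-11-30",
    "contracts": {"kalshi": kalshi_raw, "polymarket": polymarket_raw},
}
```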
So we built a pipeline to clean the data and classify it properly. Every market goes through it: structured parsing for the predictable stuff, an LLM pass for the free-form titles and descriptions the rules can't handle. New markets get picked up and classified within minutes. Not glamorous work, but it's what makes the search return the right stuff instead of garbage. A rough sketch of the two-stage flow is below.
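The regex patterns, category set, and prompt here are illustrative placeholders, but the shape is the point: cheap rules first, model call only when they miss.

```python
import json
import re
import anthropic

CATEGORIES = ["sports", "politics", "crypto", "weather", "entertainment"]  # illustrative set

# Ticker patterns we can classify without a model call (patterns are illustrative).
RULES = [
    (re.compile(r"^KXNBAGAME-"), {"category": "sports", "league": "NBA"}),
    (re.compile(r"^KXHIGHCHI-"), {"category": "weather", "city": "Chicago"}),
]

def classify_with_llm(title: str, description: str) -> dict:
    # Fallback: ask Claude for a JSON label when no rule matches.
    client = anthropic.Anthropic()
    msg = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model ID
        max_tokens=200,
        system=(
            "Classify this prediction market. Reply with bare JSON: "
            f'{{"category": one of {CATEGORIES}, "entities": [...]}}'
        ),
        messages=[{"role": "user", "content": f"{title}\n{description}"}],
    )
    return json.loads(msg.content[0].text)  # assumes the model replies with bare JSON

def classify(market: dict) -> dict:
    # Stage 1: structured parsing for the predictable tickers.
    for pattern, labels in RULES:
        if pattern.match(market.get("ticker", "")):
            return labels
    # Stage 2: LLM for the free-form titles/descriptions the rules can't handle.
    return classify_with_llm(market["title"], market.get("description", ""))
```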
Some things you can try:
- "NBA tonight" - games from both platforms resolving today - "Zelensky markets on Polymarket" - filtered to one platform - "Weather in Chicago today" - Kalshi has an entire weather derivatives market - "Kalshi trending" - sorted by volume - "pokemon" - you'd be surprised what people bet on
Groupings aren't perfect and some labels are messy. But we use it every day for our own trading and fix things as we run into them. Would love your feedback.