I built KaggleIngest to solve a problem I kept hitting: using AI coding assistants effectively during Kaggle competitions.
The problem: You want Claude/Copilot to help you iterate on a Kaggle competition, but feeding it useful context is painful. There are hundreds of notebooks, context windows are limited, and the valuable insights are buried in noise.
The solution: KaggleIngest takes any Kaggle competition or dataset URL and outputs a token-optimized file containing:
- Top-ranked notebooks (scored by upvotes × recency)
- Key code patterns (imports and visualizations stripped)
- Dataset schemas parsed from CSVs
- Competition metadata

Demo: http://kaggleingest.com/
GitHub: https://github.com/Anand-0037/KaggleIngest
Stack: FastAPI, React 19, Redis, Python 3.13
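To make the "upvotes × recency" ranking concrete, here is a minimal sketch of one way such a score could work. It is not taken from the KaggleIngest codebase: the `Notebook` class, the exponential half-life decay, and the 90-day default are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Notebook:
    # Hypothetical record; field names are illustrative only.
    title: str
    upvotes: int
    last_updated: datetime


def recency_weight(last_updated: datetime, now: datetime,
                   half_life_days: float = 90.0) -> float:
    """Weight in (0, 1] that halves every `half_life_days` (assumed decay)."""
    age_days = max((now - last_updated).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def score(nb: Notebook, now: datetime, half_life_days: float = 90.0) -> float:
    """Upvotes multiplied by the recency weight."""
    return nb.upvotes * recency_weight(nb.last_updated, now, half_life_days)


def top_notebooks(notebooks: list[Notebook], now: datetime,
                  k: int = 10) -> list[Notebook]:
    """Return the k highest-scoring notebooks, best first."""
    return sorted(notebooks, key=lambda nb: score(nb, now), reverse=True)[:k]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
old_popular = Notebook("EDA classic", 100, datetime(2024, 7, 5, tzinfo=timezone.utc))
fresh = Notebook("New baseline", 40, now)
ranked = top_notebooks([old_popular, fresh], now)
```

With a multiplicative score like this, a 180-day-old notebook with 100 upvotes ends up below a brand-new one with 40, which is the kind of trade-off a recency factor is meant to capture.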
The output uses TOON (Token-Optimized Object Notation), which reduces token usage by ~40% compared to standard JSON.
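For intuition on where the savings come from, here is a rough sketch of a tabular encoding in the spirit of TOON: field names are written once in a header instead of being repeated for every row. This is a simplified illustration, not the actual TOON encoder or the code KaggleIngest uses; `encode_table` is a hypothetical helper, and character count is only a crude proxy for token count.

```python
import json


def encode_table(name: str, rows: list[dict]) -> str:
    """Encode uniform dict rows as one header line plus one line per row.

    Simplified sketch: values are written with str() and are not quoted,
    so values containing commas or newlines are not handled.
    """
    if not rows:
        return f"{name}[0]:"
    fields = list(rows[0].keys())
    if any(list(r.keys()) != fields for r in rows):
        raise ValueError("rows must share the same keys in the same order")
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    lines = ["  " + ",".join(str(r[f]) for f in fields) for r in rows]
    return "\n".join([header, *lines])


# Example: a dataset schema like the ones parsed from CSVs.
schema = [
    {"column": "id", "dtype": "int64"},
    {"column": "price", "dtype": "float64"},
]
compact = encode_table("schema", schema)
as_json = json.dumps(schema)
print(compact)
print(len(compact), "chars vs", len(as_json), "chars as JSON")
```

The gap grows with the number of rows, since JSON repeats every key per row while the tabular form pays for the keys once.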
I'd love feedback on the approach or feature requests. Thanks for looking!