I built the first version of a project I personally needed, and I’m testing whether it could be useful to others. The repo is public, and I added a simple waitlist if you’d like to follow along.
Repo: http://github.com/Ga0512/video-analysis
Waitlist: https://iaap4qo6zs2.typeform.com/to/J43jclr2
What it does now:
- Process a video (file or URL)
- Split it into blocks for analysis
- Transcribe audio + caption frames
- Generate multimodal summaries (text + context)
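For readers wondering what the "split it into blocks" step might look like, here is a minimal sketch. It is not taken from the repo: the fixed 30-second block length, the `Block` type, and the `split_into_blocks` name are all assumptions made for illustration, and the real project may split on scenes or silence instead.

```python
from dataclasses import dataclass


@dataclass
class Block:
    index: int    # position of the block in the video
    start: float  # start time in seconds
    end: float    # end time in seconds


def split_into_blocks(duration: float, block_seconds: float = 30.0) -> list[Block]:
    """Cut [0, duration] into consecutive blocks; the last one may be shorter."""
    if duration <= 0 or block_seconds <= 0:
        raise ValueError("duration and block_seconds must be positive")
    blocks: list[Block] = []
    start = 0.0
    while start < duration:
        end = min(start + block_seconds, duration)
        blocks.append(Block(len(blocks), start, end))
        start = end
    return blocks


if __name__ == "__main__":
    # A 75-second video with the assumed 30-second block length.
    for block in split_into_blocks(75.0):
        print(block)
```

Each block would then be transcribed and captioned on its own before the per-block results are merged into a summary.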
Flexible setup:
- Run locally with open models (privacy, no API costs)
- Or connect your own API key (faster / larger models)
- Fully customizable: language, summary size (short/medium/long), persona, extra prompts
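To make the customization options concrete, here is a rough sketch of how language, summary size, persona, and extra prompts could be combined into one prompt. Everything named here (`SummaryOptions`, `SIZE_HINTS`, `build_prompt`, and the wording of the prompt) is hypothetical and not the repo's actual API.

```python
from dataclasses import dataclass


@dataclass
class SummaryOptions:
    language: str = "English"
    size: str = "medium"    # "short", "medium" or "long"
    persona: str = ""       # optional, e.g. "a high-school teacher"
    extra_prompt: str = ""  # optional free-form instructions


# Assumed length targets for each size; the real tool may define these differently.
SIZE_HINTS = {
    "short": "2-3 sentences",
    "medium": "one paragraph",
    "long": "several paragraphs",
}


def build_prompt(transcript: str, captions: list[str], opts: SummaryOptions) -> str:
    """Combine the transcript and frame captions of one block into a summary prompt."""
    if opts.size not in SIZE_HINTS:
        raise ValueError(f"unknown summary size: {opts.size!r}")
    parts = []
    if opts.persona:
        parts.append(f"You are {opts.persona}.")
    parts.append(
        f"Summarize this video block in {opts.language}, in {SIZE_HINTS[opts.size]}."
    )
    if opts.extra_prompt:
        parts.append(opts.extra_prompt)
    parts.append("Transcript:\n" + transcript)
    parts.append("Frame captions:\n" + "\n".join(f"- {c}" for c in captions))
    return "\n\n".join(parts)
```

The same prompt string could be sent to a local open model or to a hosted API, which is what would keep the two modes interchangeable.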
Ideas for future:
- Chat-with-video → ask questions directly about a video (using both frames + transcription)
- Export for AI parsing → structured export so you can feed the content into other AI workflows or databases
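As a possible shape for the structured export idea, the sketch below writes one JSON object per block (JSON Lines), which is easy to load into a database or another AI workflow. The field names and the `export_blocks` function are assumptions for illustration only; nothing like this exists in the repo yet.

```python
import json


def export_blocks(video_id: str, blocks: list[dict]) -> str:
    """Serialize per-block results as JSON Lines: one JSON object per line."""
    lines = []
    for index, block in enumerate(blocks):
        record = {
            "video_id": video_id,
            "block": index,
            "start": block["start"],          # seconds
            "end": block["end"],              # seconds
            "transcript": block["transcript"],
            "captions": block["captions"],    # list of frame captions
            "summary": block["summary"],
        }
        # ensure_ascii=False keeps non-English transcripts readable.
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)
```

Keeping timestamps on every record would also help chat-with-video, since an answer could point back to the exact block it came from.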
Possible pricing ideas:
- Pay-as-you-go credits for hosted usage
- Or a fixed subscription ($X/month) where you bring your own API key and just use the UI/UX layer
Why I’m here: Before polishing it into an MVP, I’d love some honest feedback:
Would you actually use a tool like this?
What do you value more: local mode (privacy, no cost) or API mode (speed, larger models)?
Does the chat-with-video/export direction make sense?
How would you prefer pricing?
If there’s enough interest, I’ll start building this in public (on X) and share progress. Thanks in advance!
Ga_0512•1h ago