Pipeline:
- Bilibili auth + favorites sync
- Audio extraction + ASR fallback (handles inaccessible audio URLs)
- Chunking + embeddings + ChromaDB
- RAG chat UI with source links
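To make the chunking and source-link steps concrete, here is a minimal sketch of timestamp-aware chunking. It assumes the ASR step yields segments with start/end times in seconds. The names `Segment`, `Chunk`, `chunk_segments`, and `source_link` are hypothetical and not taken from the project, and the `?t=<seconds>` deep-link format for Bilibili URLs is an assumption as well.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One ASR segment (hypothetical shape: times in seconds)."""
    start: float
    end: float
    text: str


@dataclass
class Chunk:
    """A group of consecutive segments, ready to embed."""
    start: float
    end: float
    text: str


def _make_chunk(segments: List[Segment]) -> Chunk:
    # Keep the time span of the group so the UI can link back to it.
    return Chunk(
        start=segments[0].start,
        end=segments[-1].end,
        text=" ".join(s.text for s in segments),
    )


def chunk_segments(
    segments: List[Segment], max_chars: int = 800, overlap: int = 1
) -> List[Chunk]:
    """Group consecutive segments into chunks of roughly max_chars.

    The last `overlap` segments of each chunk are repeated at the start
    of the next one, so a sentence split at a boundary stays retrievable.
    """
    chunks: List[Chunk] = []
    current: List[Segment] = []
    for seg in segments:
        size = sum(len(s.text) for s in current) + len(seg.text)
        if current and size > max_chars:
            chunks.append(_make_chunk(current))
            current = current[-overlap:] if overlap > 0 else []
        current.append(seg)
    if current:
        chunks.append(_make_chunk(current))
    return chunks


def source_link(bvid: str, chunk: Chunk) -> str:
    # Assumption: Bilibili accepts ?t=<seconds> to jump to a timestamp.
    return f"https://www.bilibili.com/video/{bvid}?t={int(chunk.start)}"
```

Storing `start` and `end` as metadata next to each embedding would let the chat UI link to the exact moment in the video rather than to the video as a whole.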
Stack: FastAPI, LangChain, ChromaDB, Next.js, SQLite.
I’d love feedback on:
1) retrieval quality tradeoffs
2) better indexing strategy for long videos
3) cost control for ASR + embeddings
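To make question 3 concrete, one low-effort option is caching embeddings by content hash, so re-synced favorites or overlapping chunks are not embedded twice. This is a minimal sketch, not what the project currently does. `EmbeddingCache` and `embed_fn` are hypothetical names, and `embed_fn` stands in for whatever embedding call the pipeline uses.

```python
import hashlib
import json
import sqlite3
from typing import Callable, List


class EmbeddingCache:
    """SQLite-backed cache keyed by a hash of (model name, chunk text)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS embeddings ("
            "key TEXT PRIMARY KEY, vector TEXT NOT NULL)"
        )

    def get_or_compute(
        self,
        text: str,
        model: str,
        embed_fn: Callable[[str], List[float]],
    ) -> List[float]:
        # Include the model name so switching models invalidates old entries.
        key = hashlib.sha256(f"{model}\n{text}".encode("utf-8")).hexdigest()
        row = self.db.execute(
            "SELECT vector FROM embeddings WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])
        # Cache miss: pay for the embedding once, then store it.
        vector = embed_fn(text)
        self.db.execute(
            "INSERT INTO embeddings (key, vector) VALUES (?, ?)",
            (key, json.dumps(vector)),
        )
        self.db.commit()
        return vector
```

The same keying idea could apply to ASR output, for example hashing the audio bytes so that a video is only transcribed once.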