Developed a 7-pass scripting enrichment system (including beat analysis, adaptation filtering, and character deep dives) that runs before any images are generated.
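
For readers who want a picture of what a multi-pass enrichment step can look like, here is a minimal sketch, not the project's actual code. It assumes each pass is a function that takes the accumulated notes and returns them with one more layer added. Only the three passes named above are shown; the function names, the `notes` dict, and the placeholder bodies are all hypothetical, and a real pass would call the scripting model instead of building a string.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a pass reads the notes so far and returns an extended copy.
Pass = Callable[[Dict[str, str]], Dict[str, str]]


def beat_analysis(notes: Dict[str, str]) -> Dict[str, str]:
    # Placeholder body: a real pass would ask the scripting model for story beats.
    return {**notes, "beats": "beats for: " + notes["source"]}


def adaptation_filtering(notes: Dict[str, str]) -> Dict[str, str]:
    # Placeholder body: later passes can read what earlier passes produced.
    return {**notes, "adaptation": "filtered using: " + notes["beats"]}


def character_deep_dives(notes: Dict[str, str]) -> Dict[str, str]:
    # Placeholder body: stands in for per-character analysis.
    return {**notes, "characters": "profiles for: " + notes["source"]}


def enrich(source: str, passes: Sequence[Pass]) -> Dict[str, str]:
    """Run every pass in order, threading the growing notes through each one."""
    notes: Dict[str, str] = {"source": source}
    for run_pass in passes:
        notes = run_pass(notes)
    return notes


# Only the passes named in the post; the remaining ones are not specified there.
PASSES = [beat_analysis, adaptation_filtering, character_deep_dives]
notes = enrich("chapter one text", PASSES)
```

Keeping each pass as a plain function makes it easy to reorder, skip, or test passes one at a time before any image generation runs.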
Dual backend: Google Gemini for scripting (2M context window), and either Gemini or OpenAI for image generation with a 3-tier model fallback. I'm comparing the performance of both.
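
A tiered fallback like this can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: the tier names (`tier-1` to `tier-3`), the `generate_with_fallback` helper, and the stub generators are hypothetical, and no real Gemini or OpenAI calls are made. The idea is to try each model in order and return the first successful result, while keeping the errors from the tiers that failed.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a generator takes a prompt and returns image bytes.
Generator = Callable[[str], bytes]


class AllTiersFailed(Exception):
    """Raised when every tier in the fallback chain has failed."""


def generate_with_fallback(
    prompt: str, tiers: Sequence[Tuple[str, Generator]]
) -> Tuple[str, bytes]:
    """Try each (name, generator) tier in order; return the first success."""
    errors: List[str] = []
    for name, generate in tiers:
        try:
            return name, generate(prompt)
        except Exception as exc:  # broad on purpose: any failure moves to the next tier
            errors.append(f"{name}: {exc}")
    raise AllTiersFailed("; ".join(errors))


# Stub generators standing in for real image-model calls (assumptions only).
def failing_model(prompt: str) -> bytes:
    raise RuntimeError("model unavailable")


def working_model(prompt: str) -> bytes:
    return b"image for " + prompt.encode("utf-8")


tiers = [
    ("tier-1", failing_model),
    ("tier-2", failing_model),
    ("tier-3", working_model),
]
used_tier, image = generate_with_fallback("a rainy street at night", tiers)
```

Returning the tier name alongside the image makes it possible to log which model actually produced each panel, which helps when comparing backends.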
It's not great. Would love feedback on the pipeline.