Key features:
- Command-line interface for generating audio
- Support for multi-speaker dialogue with [S1]/[S2] tags
- Non-verbal sounds like (laughs), (coughs), etc.
- Plans for voice cloning capability
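To illustrate the tag conventions listed above, here is a minimal sketch of composing a two-speaker script string. The helper name `format_dialogue` is hypothetical and not part of diajax, and the restriction to speakers 1 and 2 is an assumption based only on the [S1]/[S2] tags mentioned. How the resulting string is passed to the command-line interface is not shown here.

```python
def format_dialogue(turns):
    """Join (speaker, text) pairs into one script string with [S1]/[S2] tags.

    Hypothetical helper for illustration; not part of diajax.
    """
    parts = []
    for speaker, text in turns:
        # Assumption: only the two speaker tags named in the feature list exist.
        if speaker not in (1, 2):
            raise ValueError(f"only speakers 1 and 2 are supported, got {speaker}")
        parts.append(f"[S{speaker}] {text.strip()}")
    return " ".join(parts)


script = format_dialogue([
    (1, "Have you tried the JAX port yet?"),
    (2, "Not yet. (laughs) Is it fast?"),  # non-verbal sound written inline
])
print(script)
```

Non-verbal sounds such as (laughs) are simply written inline in a speaker's text, so the helper does not treat them specially.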
Current status: Functional, but memory usage still needs optimization. The PyTorch version can generate minutes of audio in under 10 GB of VRAM, while this port currently uses more. Contributions from JAX optimization experts are welcome!
GitHub: https://github.com/jaco-bro/diajax