I’ve been experimenting with MLflow’s Prompt Engineering UI, which lets you do no-code prompt tuning across multiple LLMs. While it officially supports hosted providers like OpenAI out of the box, I wanted to try it with Japanese open-source models from the LLM-jp project.
This repo shows how to serve these models locally using MLflow’s pyfunc model interface, expose them via the MLflow AI Gateway, and compare prompt performance through the UI (rough sketches of the pyfunc and gateway pieces follow the component list below).
It includes a working setup with:
- Hugging Face LLM-jp models (e.g. llm-jp-3-3.7b-instruct)
- MLflow Model Serving
- MLflow Gateway
- Prompt Engineering UI
- Streamlit UI for experiment tracking
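To make a local model callable from the gateway, it first gets wrapped as an MLflow pyfunc model and served. Here is a minimal sketch of that wrapper, not the repo’s exact code: the class name, the "prompt" input column, and the generation parameters are my assumptions.

```python
# Minimal pyfunc wrapper sketch (illustrative, not the repo's exact code).
import mlflow
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "llm-jp/llm-jp-3-3.7b-instruct"  # assumed Hugging Face model ID

class LLMJPCompletions(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Load the tokenizer and model once, when the serving process starts.
        self.tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
        self.model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    def predict(self, context, model_input: pd.DataFrame) -> list[str]:
        # A "prompt" column is assumed as the input schema here; the
        # gateway's completions route sends one prompt per request.
        completions = []
        for prompt in model_input["prompt"]:
            inputs = self.tokenizer(prompt, return_tensors="pt")
            output_ids = self.model.generate(**inputs, max_new_tokens=256)
            completions.append(
                self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
            )
        return completions

with mlflow.start_run():
    mlflow.pyfunc.log_model(artifact_path="model", python_model=LLMJPCompletions())
```

The logged model can then be served locally with something like `mlflow models serve -m runs:/<run_id>/model --port 5001 --env-manager local`.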
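The gateway side is then a YAML config pointing a completions route at that local serving endpoint. Again a hedged sketch rather than the repo’s actual config: the route name, port, and the `mlflow-model-serving` provider string reflect my reading of the MLflow AI Gateway docs, so check them against the repo.

```yaml
# gateway_config.yaml -- sketch of a route backed by a locally served
# MLflow model; names and ports are illustrative.
routes:
  - name: llm-jp-3-3.7b-instruct
    route_type: llm/v1/completions
    model:
      provider: mlflow-model-serving  # provider for MLflow Model Serving endpoints
      name: llm-jp-3-3.7b-instruct
      config:
        model_server_url: http://127.0.0.1:5001
```

With the gateway running (e.g. `mlflow gateway start --config-path gateway_config.yaml`) and the tracking server pointed at it (via the MLFLOW_DEPLOYMENTS_TARGET or MLFLOW_GATEWAY_URI environment variable, depending on MLflow version), the route shows up as a selectable model in the Prompt Engineering UI, where prompts can be compared across routes.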
GitHub: https://github.com/suzuki-2001/mlflow-llm-jp-integration
Japanese article explaining the project: https://zenn.dev/shosuke_13/articles/21d304b5f80e00