I built an AI chat companion for one of my products, and this is a compilation of my decisions and reflections. There are plenty of technical parts you might like to look into.
*Things I think are worth highlighting*
1. Cloudflare Workers
2. Custom static site for interface
3. Started with the full system prompt: 17,000 tokens, ultimately cut down to 2,500 tokens
4. Tried two LLMs: one as a router LLM to inject selective context + another as the main model
5. Router was a bad idea. Main issue: 6000 ms latency
6. The product matched an embedding approach perfectly (I didn't know that at the time; I learnt it along the way)
7. Context distillation was a huge lesson: remove all semantic words from the prompt
8. Used Promptfoo for benchmarking
9. The cosine similarity threshold has to be understood for your own data; don't take a random score from the internet (the internet and AI suggested a 0.7 similarity score, mine turned out to be around 0.2-0.25)
10. Found OpenAI's GPT-4o mini to be the best conversation model for my case
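
For points 6 and 9, here is a minimal sketch of what "understand the score for your own data" could look like. It is not the code from the project. It assumes you already have embedding vectors for a few queries and context chunks, plus a handful of hand-labelled (similarity, is_relevant) pairs. The function names `pick_threshold` and `select_context` are hypothetical and exist only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as "no similarity"
    return dot / (norm_a * norm_b)


def pick_threshold(scored_pairs):
    """Pick a cutoff from your own labelled data (hypothetical helper).

    scored_pairs: list of (similarity, is_relevant) tuples.
    Returns the observed score that classifies the most pairs correctly
    when used as "relevant if similarity >= threshold".
    """
    best_threshold, best_correct = None, -1
    for candidate in sorted({score for score, _ in scored_pairs}):
        correct = sum((score >= candidate) == relevant
                      for score, relevant in scored_pairs)
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


def select_context(query_vec, chunks, threshold):
    """Return the texts of chunks scoring at or above threshold, best first.

    chunks: list of (text, vector) tuples.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for score, text in scored if score >= threshold]


# Toy labelled pairs; real ones would come from your own queries and chunks.
labelled = [(0.10, False), (0.15, False), (0.22, True), (0.30, True)]
threshold = pick_threshold(labelled)
```

With these toy pairs the chosen cutoff is 0.22, far below the commonly quoted 0.7. The right value depends on the embedding model and on how your chunks are written, so it is worth measuring on your own data before hard-coding anything.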
---
Questions? Any fellow travellers who have gone through the same pain?
bhagyeshsp•1h ago