For a chatbot like Claude Code, the instruction part of the prompt (including the tool descriptions) stays relatively constant across users and over long periods, so there is a lot of room for optimization. Even basic prompt caching gives a big speedup and cost reduction.
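
To make the caching idea concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: `PrefixCache`, `expensive_prefill`, `handle_request`, and the example system prompt with its tool names are hypothetical, and the "state" is just a token list. A real inference server would cache the model's internal state for the shared prefix, not split strings. The point is only that the constant prefix is processed once and reused across requests.

```python
import hashlib


class PrefixCache:
    """Toy cache: maps a hash of the shared prompt prefix to its precomputed state."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, prefix, compute):
        # Key on a hash of the prefix so identical prefixes share one entry.
        key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = compute(prefix)
        return self._store[key]


def expensive_prefill(text):
    # Stand-in (assumption) for the costly pass over the prompt tokens.
    return text.split()


def handle_request(cache, system_prompt, user_message):
    # The constant prefix is looked up in the cache; only the
    # per-user suffix is processed from scratch on every request.
    prefix_state = cache.get_or_compute(system_prompt, expensive_prefill)
    suffix_state = expensive_prefill(user_message)
    return prefix_state + suffix_state


# Hypothetical system prompt shared by every user.
SYSTEM_PROMPT = "You are a coding assistant. Tools: read_file, run_shell."

cache = PrefixCache()
handle_request(cache, SYSTEM_PROMPT, "fix the failing test")
handle_request(cache, SYSTEM_PROMPT, "rename this function")
print(cache.hits, cache.misses)  # 1 1
```

The first request pays for the prefix, the second one reuses it. The saving grows with the length of the shared prefix and the number of requests that hit it, which is why a long, stable instruction prompt benefits so much.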
graphitout•1h ago