My current checklist for "production-ready":
1. Persistent memory across sessions (not just in-context window stuffing) 2. Real tool use with error recovery (file I/O, shell, browser, APIs) 3. Multi-model support (swap between Claude, GPT, local models without rewriting) 4. Extensibility via a skill/plugin system rather than hardcoded chains 5. Runs as a daemon/service, not just a CLI you invoke manually 6. Security boundaries — sandboxing, permission models, audit logs
What I've noticed is most frameworks nail 1-2 of these but fall apart on the rest. The ones built for demos tend to have flashy UIs but break when you try to run them unattended for a week.
What's your checklist? What patterns have you seen that separate real agent infrastructure from weekend projects?