(The above is my own sarcastic, human attempt at hitting the sycophantic tone common to chatbots today.)
Thanks for the demo. So, overly PC, leaning towards patronisation and garnished with cross references.
Think of it as the text version of JPEG artifacts. Or, to compare it to image models, it's like "AI hands" (though note that recent image models are much better at drawing hands).
There's research aimed at stopping this sycophantic behavior (https://openai.com/index/sycophancy-in-gpt-4o/), so it's likely that future systems won't have this specific flaw, or at least not as glaringly. However, they may have artifacts of their own.
Can anyone write a good prompt that will do this?
> Your English is fine as it is.
You do not know this. This level of technical explanation is a lot harder to write than a few simple sentences.
Structured State Space Models and Mamba. Models like Mamba [Gu and Dao, 2023] can be interpreted within GWO as employing a sophisticated Path, Shape, and Weight. The Path is defined by a structured state-space recurrence, enabling it to model long-range dependencies efficiently. The Shape is causal (1D), processing information sequentially. Critically, the Weight function is highly dynamic and input-dependent, realized through selective state parameters that allow the model to focus on or forget information based on the context, creating an effective content-aware bottleneck for sequences.
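To make the Path/Shape/Weight reading concrete, here is a minimal toy sketch of a selective (input-dependent) state-space recurrence in the spirit of Mamba's selection mechanism. The shapes, parameter names (W_B, W_C, W_dt), and the plain-NumPy scan are my own illustrative assumptions, not the paper's or Mamba's actual implementation; real Mamba uses learned projections and a hardware-aware parallel scan.

```python
# Toy sketch: selective state-space recurrence.
# Path  = the causal recurrence over t; Shape = 1D causal scan;
# Weight = B, C, dt generated from the current input x_t ("selection").
import numpy as np

rng = np.random.default_rng(0)

d_model, d_state, seq_len = 4, 8, 16                    # toy sizes
A = -np.exp(rng.standard_normal((d_model, d_state)))    # fixed, stable state matrix
W_B = rng.standard_normal((d_model, d_state)) * 0.1     # projections making B, C, dt
W_C = rng.standard_normal((d_model, d_state)) * 0.1     # depend on the current input
W_dt = rng.standard_normal((d_model,)) * 0.1

def selective_ssm(x):
    """x: (seq_len, d_model) -> y: (seq_len, d_model), causal scan over time."""
    h = np.zeros((d_model, d_state))                     # hidden state per channel
    ys = []
    for t in range(x.shape[0]):
        xt = x[t]                                        # (d_model,)
        # Input-dependent ("selective") parameters: the Weight function in GWO terms.
        dt = np.log1p(np.exp(xt * W_dt))[:, None]        # softplus step size, (d_model, 1)
        B = xt[:, None] * W_B                            # (d_model, d_state)
        C = xt[:, None] * W_C                            # (d_model, d_state)
        # Discretized recurrence: the state is updated causally, step by step.
        A_bar = np.exp(dt * A)
        h = A_bar * h + dt * B * xt[:, None]
        ys.append(np.sum(C * h, axis=-1))                # per-channel readout
    return np.stack(ys)

y = selective_ssm(rng.standard_normal((seq_len, d_model)))
print(y.shape)  # (16, 4)
```

Because dt, B, and C are recomputed from each x_t, the state can retain or forget information depending on content, which is the "content-aware bottleneck" the quoted passage describes.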
1. Context-dependent convolution (a minimal sketch follows this list)
2. Global & Local branches
3. Replace large-filter Conv with matrix multiplication
4. Information bottleneck -> Information loss
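For point 1, here is a small sketch of what "context-dependent convolution" means in general: the filter itself is generated from the input, so different sequences are convolved with different kernels. This is my own illustrative toy (the context summary, W_gen, and shapes are assumptions), not HyperZZW's actual mechanism.

```python
# Toy sketch: a 1D convolution whose kernel is generated from the input's context.
import numpy as np

rng = np.random.default_rng(1)
seq_len, k = 32, 5
W_gen = rng.standard_normal((1, k)) * 0.1    # maps a context scalar to a length-k kernel

def context_dependent_conv(x):
    """x: (seq_len,) -> y: (seq_len,). The kernel depends on a global summary of x."""
    context = x.mean(keepdims=True)          # (1,) global summary of the sequence
    kernel = (context @ W_gen).ravel()       # (k,) kernel generated from the context
    return np.convolve(x, kernel, mode="same")

y = context_dependent_conv(rng.standard_normal(seq_len))
print(y.shape)  # (32,)
```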
I also want to share that Mamba builds on the concept behind Hyena. And the simplest design is the best (HyperZZW), while Hyena is a failure.
CuriouslyC•4mo ago
If it's useful to you, I'm happy to be a sounding board/vibes partner for your research. My contact info is in my profile.