Your concern about catastrophic forgetting is mostly unfounded in the regime of fine-tuning large diffusion models. The weights may take some hit to accuracy on certain downstream tasks, but in general it isn't "catastrophic". I believe this is due to the attention mechanism, but I'm happy to be corrected.
frotaur•33m ago
I see, it was probably my high learning rate that caused problems. To be honest, I was too lazy to retry full fine-tuning since LoRA worked so well, but maybe I'll revisit it in the future, perhaps with Qwen Image.
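Since LoRA came up: here is a minimal numerical sketch of what a LoRA adapter does next to a frozen layer, assuming a plain linear projection as a stand-in for e.g. an attention weight. All names (`W`, `A`, `B`, `alpha`) are illustrative, not tied to any specific library, and the gradient step is a toy example rather than a real training loop:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pretrained weight (stands in for e.g. an attention projection).
d_in, d_out, rank = 16, 16, 4
W = rng.normal(size=(d_out, d_in))

# LoRA factors: A is small random, B starts at zero,
# so the adapter is a no-op at initialization.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))
alpha = 8.0  # scaling; effective update is (alpha / rank) * B @ A

def forward(x):
    # Base output plus the low-rank correction; W itself is never updated.
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.normal(size=d_in)
# At initialization the adapter changes nothing:
assert np.allclose(forward(x), W @ x)

# One toy gradient step on B and A only, minimizing 0.5 * ||err||^2
# against a random target; note W is left untouched.
y_target = rng.normal(size=d_out)
lr = 1e-2
err = forward(x) - y_target
grad_B = (alpha / rank) * np.outer(err, A @ x)
grad_A = (alpha / rank) * np.outer(B.T @ err, x)
B -= lr * grad_B
A -= lr * grad_A
```

The point of the zero-initialized `B` is that training starts exactly at the pretrained model's behavior, which is part of why LoRA tends to be gentler on the base weights than full fine-tuning with an aggressive learning rate.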
adzm•39m ago
Minor observation: the formula text appears to go above the sticky header on the website.
frotaur•37m ago
True, I hadn't noticed, thanks! I'll try to fix that in the near future.