I spent weeks testing whether continual learning requires explicit task context. I tried 23+ standard methods (EWC, k-WTA, gradient surgery); all of them failed on conflicting tasks.
Found that transformation learning scales: 100% on XOR/XNOR → 98.3% on MNIST (5 tasks). Key insight: transform the features (128-D), not the logits (5-D). The feature-level transform gains +16% accuracy.
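To make the feature-vs-logit distinction concrete, here is a minimal NumPy sketch. The shapes (128-D features, 5 classes) come from the post; everything else (the frozen backbone stand-in, the identity initialization, the variable names) is my own illustrative assumption, not the actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes from the post: 128-D features, 5 output classes.
FEAT_DIM, N_CLASSES = 128, 5

# Stand-in for frozen backbone features on a batch of 32 inputs.
features = rng.standard_normal((32, FEAT_DIM))

# Shared classifier head (frozen after the first task in this sketch).
W_head = rng.standard_normal((FEAT_DIM, N_CLASSES))

# Option A (what the post argues for): a learned 128x128 transform
# applied in feature space, before the shared head.
T_feat = np.eye(FEAT_DIM)          # identity init; trained per task
logits_a = (features @ T_feat) @ W_head

# Option B (the weaker baseline): a 5x5 transform on the logits.
T_logit = np.eye(N_CLASSES)        # identity init; trained per task
logits_b = (features @ W_head) @ T_logit

print(logits_a.shape, logits_b.shape)  # -> (32, 5) (32, 5)
```

The capacity gap is the point: the feature-level transform has 128 × 128 = 16,384 degrees of freedom versus 5 × 5 = 25 at the logit level, which is one plausible reading of where the reported +16% comes from.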
Documented everything - successes and failures. All experiments verified and reproducible.
Curious about feedback, especially on:
1. Scaling to CIFAR-100 or beyond MNIST
2. More sophisticated routing without task labels
3. Theoretical connections to meta-learning
Happy to answer questions!