But I'm curious about the actual mechanics: How exactly does this feedback loop work? When I accept, reject, or modify the code that these models spit out, is that signal fed directly back into training?
Not necessarily against this, just genuinely curious about how the sausage is made.
incomingpain•3h ago
My understanding is that essentially nobody uses our inputs as training data anymore. It made the models worse when they were doing this.
Garbage data in, garbage out.
They want to control and use only quality data as their training data.