Great explanation, but the last question is quite simple: you determine the weights via brute force. You simply train on a large amount of data for which you have both the input and the correct output (handwriting and the corresponding text, in this case).
ggambetta•37m ago
"Brute force" would be trying random weights and keeping the best performing model. Backpropagation is compute-intensive but I wouldn't call it "brute force".
Ygg2•14m ago
"Brute force" here is about the amount of data you're ingesting. It's no Alpha Zero, that will learn from scratch.
brudgers•1d ago