x 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 ... 254 255
f(x) 0 0 1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12 ... 190 191
So something like sta table,y
iny
sta table,y
adc $0
iny
sta table,y
adc $0
iny
sta table,y
adc $0
iny
used as the loop body that should be repeated 64 times, should work. Will it take less than 6000 cycles total?
rbanffy•2h ago
They could go one step further and calculate the table as needed and use it as a cache.
For an single image scaling it might get a little bit better.
spc476•21m ago
anyfoo•6m ago
Making it an on-demand cache instead is a neat next step. Whether it helps or hurts depends on the actual input: If the input image uses every pixel value anyway, the additional overhead of checking whether the table entry is computed is just unnecessary extra with no value.
But if a typical image only uses a few pixel values, then the amortized cost of just calculating the few needed table entries may very well be significantly below the cost of the current algorithm.
If images are somewhere in between, or their characteristics not well known, then simply trying out both approaches with typical input data is a good approach!
Unless you’re perfectly happy with 0.2 seconds, for example because the runtime of some other parts take so long that dwarfs those 0.2s, then why bother.