The only reasonable way to compare rationals is to compare their decimal expansions as strings.
https://en.wikipedia.org/wiki/Floating-point_arithmetic#Accu...
Careful, someone is liable to throw this in an LLM prompt and get back code expanding the ASCII characters for string values like "1/346".
Why decimal? I don’t see why any other integer base wouldn’t work, and, on just about any system, using base 2^n for any n > 0* will be both easier to implement and faster to run.
And that, more or less, is what the suggested solution does. It first compares the first 53 bits and, if that’s not conclusive, it compares 64 bits.
Also, of course, if your number has more than n bits, you’d only generate digits until you know the answer.
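For concreteness, here is a minimal sketch of that lazy-digit idea (my own code, not the article's): it compares a/b with c/d by emitting digits via long division and stopping at the first digit that differs. It assumes non-negative numerators and positive denominators, and uses base 10 for readability, though any base >= 2, including a power of two as suggested above, works the same way. The stopping bound relies on the fact that two distinct fractions a/b and c/d differ by at least 1/(b*d).

    import java.math.BigInteger;

    // Sketch only: compare a/b with c/d (a, c >= 0; b, d > 0) by generating
    // digits lazily, stopping at the first differing digit or as soon as
    // enough digits agree that the fractions must be equal.
    final class RationalCompare {
        static int compareByDigits(BigInteger a, BigInteger b,
                                   BigInteger c, BigInteger d) {
            final BigInteger base = BigInteger.TEN;  // any base >= 2 works, e.g. 2^n

            // Integer parts first.
            BigInteger[] qa = a.divideAndRemainder(b);
            BigInteger[] qc = c.divideAndRemainder(d);
            int cmp = qa[0].compareTo(qc[0]);
            if (cmp != 0) return cmp;

            BigInteger ra = qa[1], rc = qc[1];
            // Distinct fractions differ by at least 1/(b*d), so once base^k >= b*d
            // and the first k fractional digits agree, the fractions are equal.
            BigInteger limit = b.multiply(d);
            for (BigInteger scale = BigInteger.ONE;
                 scale.compareTo(limit) < 0;
                 scale = scale.multiply(base)) {
                ra = ra.multiply(base);
                rc = rc.multiply(base);
                BigInteger[] da = ra.divideAndRemainder(b);
                BigInteger[] dc = rc.divideAndRemainder(d);
                cmp = da[0].compareTo(dc[0]);
                if (cmp != 0) return cmp;            // first differing digit decides
                ra = da[1];
                rc = dc[1];
            }
            return 0;                                // equal
        }
    }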
This was one of the bigger hidden performance issues when I was working on Hive: the default coercion goes to Double, which has a bad hashCode implementation [1] and causes joins to cluster and chain, so every miss on the hashtable had to probe that many slots away from the original index.
The hashCode itself was smeared so that values within machine epsilon of each other hash to the same bucket and .equals could do its join, but all of this really messed things up for the folks who needed 22-digit numeric keys (eventually the Decimal implementation handled it by adding a big fixed integer).
Double join keys in a database were one of the red flags in a SQL query: mostly, if you see them, someone messed something up.
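To make the clustering concrete, here is a small standalone illustration. It uses the JDK's Double.hashCode, defined as (int)(bits ^ (bits >>> 32)) over the IEEE-754 bit pattern, rather than Hive's actual smeared version, and the table capacity is just an assumption for the demo: doubles a few ULPs apart get hash codes a few units apart, so under a power-of-two bucket mask they land in adjacent slots and every miss has to walk the whole run.

    public class DoubleHashDemo {
        public static void main(String[] args) {
            int capacity = 1 << 16;                  // hypothetical power-of-two table
            double x = 1.0;
            for (int i = 0; i < 8; i++) {
                int h = Double.hashCode(x);          // JDK 8+: (int)(bits ^ (bits >>> 32))
                System.out.printf("%.17g  hash=%d  bucket=%d%n",
                                  x, h, h & (capacity - 1));
                x = Math.nextUp(x);                  // next representable double
            }
        }
    }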
thaumasiotes•2mo ago
I had the impression that the usual way to compare floats is to define a precision and check for -p < (a - b) < p. In this case 0.99997 - 1.0002 = -0.00023, which correctly tells us that the two numbers are equal at 0.001 precision and unequal at 0.0001.
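A minimal sketch of that check (the helper name is mine, and the absolute tolerance p is the simplest choice; a relative tolerance is often preferable when magnitudes vary):

    // Tolerance comparison as described above: -p < (a - b) < p.
    static boolean approxEquals(double a, double b, double p) {
        return Math.abs(a - b) < p;
    }

    // approxEquals(0.99997, 1.0002, 0.001)  -> true   (a - b = -0.00023)
    // approxEquals(0.99997, 1.0002, 0.0001) -> false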
wiml•2mo ago
You can do it if you produce two hash values for each key (and clean up your duplicates later), but not if you produce only one.
Of course, most of the time, if you are doing equality comparisons on floats, you have a fundamental conceptual problem with your code.
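A sketch of the two-hash-values idea, under assumptions of my own (grid width 2*EPS and a toy index rather than a real join): each build-side key is inserted under the two grid cells its tolerance interval can touch, each probe looks in only one cell, and a final |x - y| <= EPS check plus de-duplication of the doubly-stored keys is the cleanup step mentioned above.

    import java.util.*;

    // Sketch: keys within EPS of each other are guaranteed to share at least one
    // of the (at most two) grid cells a build key is stored under, so a single
    // probe lookup suffices. Keys may be stored twice, so any pass that scans the
    // whole table must de-duplicate. Boundary rounding is ignored for brevity.
    final class TolerantFloatIndex {
        private static final double EPS = 1e-9;      // illustrative tolerance
        private static final double CELL = 2 * EPS;
        private final Map<Long, List<Double>> buckets = new HashMap<>();

        void insert(double x) {
            long lo = (long) Math.floor((x - EPS) / CELL);
            long hi = (long) Math.floor((x + EPS) / CELL);
            buckets.computeIfAbsent(lo, k -> new ArrayList<>()).add(x);
            if (hi != lo) buckets.computeIfAbsent(hi, k -> new ArrayList<>()).add(x);
        }

        // Returns build-side values within EPS of y.
        List<Double> probe(double y) {
            long cell = (long) Math.floor(y / CELL);
            List<Double> out = new ArrayList<>();
            for (double x : buckets.getOrDefault(cell, Collections.emptyList())) {
                if (Math.abs(x - y) <= EPS) out.add(x);
            }
            return out;
        }
    }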