https://github.com/vaganov/y-fast
Y-fast trie operations take amortized O(log log M) time (note: M denotes the maximum capacity, not the size). Although constant factor in O() is rather large, it only takes reasonably large size to outperform standard binary search trees (performance tests with plots attached). Certain conditions apply, like CPU architecture and index bit representation efficiency.
Feedback, issues and any collaboration are welcome.