This idea could fly if downstream readers are able to read it. JSON is great because anything can read, process, transform, and serialize it without having to know the intrinsics of the protocol.
What's the point of using a binary, columnar format for data in transit?
You don't do high performance without knowing the data schema.
Performance optimization and being able to "plug in" to the data ecosystem that Apache Arrow exists in.
OpenTelemetry is pretty great for a lot of uses, but the protocol over the wire is too chunky for some applications. From last year's post on the topic[0]:
> In a side-by-side comparison between OpenTelemetry Protocol (“OTLP”) and OpenTelemetry Protocol with Apache Arrow for similarly configured traces pipelines, we observe 30% improvement in compression. Although this study specifically focused on traces data, we have observed results for logs and metrics signals in production settings too, where OTel-Arrow users can expect 50% to 70% improvement relative to OTLP for similar pipeline configurations.
For your average set of apps and services running in a k8s cluster somewhere in the cloud, this is just a nice-to-have, but size on wire is a problem for a lot of systems out there today, and they are precluded from adopting OpenTelemetry until that's solved.
[0]: https://opentelemetry.io/blog/2024/otel-arrow-production/
https://opentelemetry.io/blog/2023/otel-arrow/row-vs-columna...
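For anyone who wants to see the row-vs-columnar difference concretely, here is a minimal sketch using only the Python standard library. It is not the Arrow or OTLP encoding: the span fields, the helper names (`make_spans`, `to_columns`, `to_rows`, `encoded_sizes`), and the use of JSON plus zlib are all assumptions made for illustration. It only shows the general idea that pivoting records into one list per field stops repeating field names and puts similar values next to each other, which is the property a compressor can exploit.

```python
import json
import zlib


def make_spans(n):
    """Build n hypothetical span records (field names are illustrative only)."""
    return [
        {
            "service": "checkout" if i % 2 == 0 else "cart",
            "status": "OK",
            "duration_ms": 10 + (i % 7),
            "span_id": i,
        }
        for i in range(n)
    ]


def to_columns(spans):
    """Pivot a non-empty list of row dicts into one list per field."""
    return {key: [span[key] for span in spans] for key in spans[0]}


def to_rows(columns):
    """Pivot columns back into row dicts (the inverse of to_columns)."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


def encoded_sizes(obj):
    """Return (raw, zlib-compressed) byte counts of obj serialized as JSON."""
    raw = json.dumps(obj).encode("utf-8")
    return len(raw), len(zlib.compress(raw))


if __name__ == "__main__":
    spans = make_spans(1000)
    print("row-oriented (raw, compressed):", encoded_sizes(spans))
    print("columnar     (raw, compressed):", encoded_sizes(to_columns(spans)))
```

The raw columnar payload is smaller simply because each field name appears once instead of once per record. How much the compressed sizes differ depends on the data and the compressor, so treat the printed numbers as a toy comparison rather than a benchmark. The `to_rows` helper also illustrates the concern raised upthread: a reader has to know the layout before it can get records back out.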
I'm curious about the thread-per-core runtimes. Are there even any mature thread-per-core runtimes in Rust?
ByteDance also has their very fast monoio. https://github.com/bytedance/monoio
Both integrate io_uring support for very fast I/O.
Adopting OTLP without third-party support is pretty time-consuming, especially if your tech stack is large and/or varied.
Re runtimes: curious about this too. Feels like the right direction if you’re optimizing a telemetry pipeline.
Kind of a bummer - one thing I was hoping to come out of this was better Arrow ecosystem support for golang.
We’ve been thinking along similar lines with the use of Rust, particularly for OpenTelemetry collection in environments where high performance and low resource overhead are critical, such as edge and serverless. With that in mind, we’ve open-sourced a lightweight OpenTelemetry collector written in Rust to address these use cases. We’ve also developed a native Lambda extension around it, and have seen encouraging interest from folks aiming to improve cold start times.
The project is still fairly early, but we’re optimistic that Rust can open up new opportunities for efficient observability pipelines. Vendors like Datadog are also moving in this direction with their Lambda extension and appear to be adopting Rust more broadly for data-plane components.
If this resonates, feel free to take a look here: https://github.com/streamfold/rotel. We’d love to hear your thoughts on how this could be useful.