Memory copy (memcpy) is pervasive and a major performance bottleneck in modern operating systems and applications, consuming up to 66% of cycles in some benchmarks. We introduce Copier, a first-class OS service that re-architects how memory is copied.
Instead of the traditional synchronous (blocking) copy, Copier offers coordinated asynchronous copy: new programming primitives let applications overlap computation with data movement and so hide copy latency.
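To make the overlap idea concrete, here is a minimal sketch of the programming model only. It is not Copier's API: `async_copy`, `process`, and `_copy_pool` are hypothetical names, and a background thread stands in for the OS service. In CPython the actual parallelism depends on when the interpreter releases its global lock, so treat this as an illustration of the start/compute/wait pattern rather than a performance demo.

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib

# Hypothetical stand-in for the copy service: one background worker thread.
_copy_pool = ThreadPoolExecutor(max_workers=1)


def _copy(dst, src):
    # The blocking copy itself, run off the caller's thread.
    dst[:len(src)] = src
    return len(src)


def async_copy(dst, src):
    """Start copying src into dst and return a handle immediately."""
    if len(dst) < len(src):
        raise ValueError("destination buffer is too small")
    return _copy_pool.submit(_copy, dst, src)


def process(payload):
    """Copy payload into a new buffer while hashing it at the same time."""
    dst = bytearray(len(payload))
    handle = async_copy(dst, payload)              # copy is now in flight
    digest = hashlib.sha256(payload).hexdigest()   # work that does not need dst
    handle.result()                                # wait before touching dst
    return dst, digest
```

The key point is the explicit wait: the caller may do anything that does not read `dst` between `async_copy` and `handle.result()`, and must synchronize before using the destination.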
As an OS service, Copier can also:
- Fully utilize hardware (like AVX2 and DMA) via a specialized dispatcher, delivering higher throughput than existing kernel or library functions.
- Absorb redundant copies across privilege levels (user-space to kernel and back) by maintaining a global view, drastically accelerating I/O-intensive workloads.
In practice, Copier achieves up to 1.8x latency speedup for Redis.
This shows that memory copy doesn't have to be a performance tax; it can be a managed, efficient service.