The results on large datasets were pretty startling.
The tl;dr from the benchmarks:
The Search Gap: On a 65GB MongoDB JSON log dataset, searching for a rare string (a specific "ERROR" pattern with zero occurrences in the data) took a standard zstdcat | grep pipeline over 8 minutes. Crystal answered the same query in 0.8 seconds.
How? Crystal keeps internal Bloom filters alongside the compressed data, so it can skip huge sections outright and decompress only the blocks whose filters indicate a possible match. Even on queries with millions of matches, it was still ~4x faster than raw grep.
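For intuition, here is a minimal, self-contained Python sketch of that general technique. This is not Crystal's actual implementation: the block size, filter parameters, and the build_block/search helpers are all illustrative. Each block of log lines gets a tiny Bloom filter over its tokens; a query consults the filters first and only inflates candidate blocks.

    import zlib
    import hashlib

    M = 1 << 16  # bits per filter
    K = 4        # hash probes per token

    def _hashes(token: str):
        # Double hashing: derive K bit positions from one SHA-256 digest.
        d = hashlib.sha256(token.encode()).digest()
        h1 = int.from_bytes(d[:8], "big")
        h2 = int.from_bytes(d[8:16], "big") | 1
        return [(h1 + i * h2) % M for i in range(K)]

    def build_block(lines):
        # Compress a batch of log lines and index its tokens in a Bloom filter.
        bits = 0
        for line in lines:
            for token in line.split():
                for pos in _hashes(token):
                    bits |= 1 << pos
        return bits, zlib.compress("\n".join(lines).encode())

    def search(blocks, token):
        # Decompress only blocks whose filter says the token *might* be present.
        probe = _hashes(token)
        for bits, blob in blocks:
            if not all((bits >> pos) & 1 for pos in probe):
                continue  # filter proves the token is absent: skip without inflating
            for line in zlib.decompress(blob).decode().splitlines():
                if token in line:
                    yield line

    # Demo: a rare token touches almost no compressed bytes.
    logs = [f"INFO request id={i} ok" for i in range(100_000)]
    logs[54_321] = "ERROR disk full id=54321"
    blocks = [build_block(logs[i:i + 1_000]) for i in range(0, len(logs), 1_000)]
    print(list(search(blocks, "ERROR")))  # inflates ~1 of 100 blocks

The reason skipping is safe: a Bloom filter can return false positives but never false negatives, so any block it rejects is guaranteed not to contain the token. A rare or absent token therefore touches almost none of the compressed data.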
Compression Performance: At Level 3, it sustained 800-1300 MB/s across several datasets. At Level 19, it matches zstd-19's compression ratio while compressing roughly 10x faster.
We want to know how this fits your infrastructure.
Every logging pipeline is different. We are currently prioritizing packaging for various environments (CLI, K8s sidecar, Docker, etc.).
If you are interested in testing Crystal against your own log deluge, please let us know your preferred integration method in this 3-question form: https://docs.google.com/forms/d/e/1FAIpQLSehstef-rbLfM72scgx...
It helps us prioritize what to build next for real-world deployments.