tl;dr

* If you need an LLM to parse OR emit a timestamp, use:

  RFC 3339 ( e.g. 2024-03-26 10:30:00-05:00 )

* Python's date format also works well

* Do NOT use Unix epoch or JavaScript Date formats.

* Smaller models and non-reasoning models still make a LOT of mistakes in time parsing / formatting.

---

There are lots of temporal reasoning benchmarks (like TimeBench, TRAM, etc.), but they test whether models understand time concepts. We couldn't find anything on which datetime output format models get right most often, so we built the benchmark ourselves.

We tested 22 models across Google, Anthropic, OpenAI, Qwen, and GLM on 235 scenarios and 7 different formats.

The two that surprised us the most were JavaScript Date and unix epoch. JavaScript Date is probably the most commonly used format and it's wrong ~1 in 4 times on parsing. Unix epoch drops to 40% on arithmetic tasks. If you need epoch, just have the model output a string and convert it yourself in code.
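
For the "convert it yourself" step, here is a minimal sketch in Python. It assumes the model was asked for an RFC 3339 string with an explicit UTC offset. The function name `rfc3339_to_epoch` is made up for illustration and is not part of the benchmark.

```python
from datetime import datetime


def rfc3339_to_epoch(text: str) -> int:
    """Convert an RFC 3339 timestamp string (e.g. model output) to Unix epoch seconds."""
    cleaned = text.strip()
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit +00:00 offset first.
    if cleaned.endswith(("Z", "z")):
        cleaned = cleaned[:-1] + "+00:00"
    parsed = datetime.fromisoformat(cleaned)
    # Refuse timestamps without an offset instead of guessing the local zone.
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {text!r}")
    # Fractional seconds, if any, are truncated.
    return int(parsed.timestamp())


print(rfc3339_to_epoch("2024-03-26 10:30:00-05:00"))  # 1711467000
```

Rejecting offset-less strings is a design choice in this sketch: silently assuming a time zone would hide exactly the kind of model mistake you want to catch.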

Show HN: Datetime-bench: which datetime formats LLMs get right (and wrong)
