Show HN: Datetime-bench: which datetime formats LLMs get right (and wrong)

https://github.com/MemoryStore/datetime-bench/tree/main
2•diwank•2h ago
tl;dr

* If you need an LLM to parse or emit a timestamp, use RFC 3339 (e.g. 2024-03-26 10:30:00-05:00).

* Python date format also works well.

* Do NOT use Unix epoch or JavaScript Date formats.

* Smaller models and non-reasoning models still make a lot of mistakes when parsing and formatting times.

---

There are lots of temporal reasoning benchmarks (like TimeBench, TRAM, etc.), but they test whether models understand time concepts. We couldn't find anything on which datetime output format models get right most often, so we built the benchmark ourselves.

We tested 22 models across Google, Anthropic, OpenAI, Qwen, and GLM on 235 scenarios and 7 different formats.

The two that surprised us the most were JavaScript Date and Unix epoch. JavaScript Date is probably the most commonly used format, and models get it wrong ~1 in 4 times on parsing. Unix epoch drops to 40% on arithmetic tasks. If you need epoch, just have the model output a string and convert it yourself in code.
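To make the last suggestion concrete, here is a minimal sketch of that conversion step in Python. It assumes the model was asked for an RFC 3339 string like the one in the tl;dr; the helper name `rfc3339_to_epoch` is made up for illustration and is not part of datetime-bench.

```python
from datetime import datetime


def rfc3339_to_epoch(text: str) -> int:
    """Convert an RFC 3339 timestamp string to Unix epoch seconds.

    Hypothetical helper for illustration; not taken from the benchmark repo.
    """
    normalized = text.strip()
    # datetime.fromisoformat accepts "T" or a space between date and time,
    # but before Python 3.11 it rejects a trailing "Z", so rewrite it.
    if normalized.endswith(("Z", "z")):
        normalized = normalized[:-1] + "+00:00"
    parsed = datetime.fromisoformat(normalized)
    # Refuse strings without a UTC offset instead of guessing a timezone.
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {text!r}")
    # int() drops any fractional seconds.
    return int(parsed.timestamp())


if __name__ == "__main__":
    print(rfc3339_to_epoch("2024-03-26 10:30:00-05:00"))  # 1711467000
```

Rejecting offset-less strings is a design choice in this sketch: a missing offset is one of the ways a model's output can be silently wrong, so it is safer to fail loudly than to assume local time.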

Comments

ishita159•43m ago
surprised to see gemini > sonnet 4.6 > opus 4.6

why do you think sonnet is better than opus on this?

The Hard Problem of Consciousness: Triveritas and the Third Impossibility

https://zenodo.org/records/18930279
1•amelius•1m ago•0 comments

Apple Discontinues Mac Pro

1•alifeinbinary•1m ago•0 comments

Order Granting Preliminary Injunction – Anthropic vs. U.S. Department of War [pdf]

https://storage.courtlistener.com/recap/gov.uscourts.cand.465515/gov.uscourts.cand.465515.134.0.pdf
1•theindieman•1m ago•0 comments

Drift – a terminal screensaver that activates when you're idle

https://github.com/phlx0/drift
1•phlx0•4m ago•0 comments

A 100% serverless RAG that extracts complex tables better than NotebookLM

1•saurav-dev•5m ago•0 comments

Why a company is investigating rapes at an ICE detention center, not the sheriff

https://apnews.com/article/otay-mesa-immigration-center-rape-investigations-f14e4687075f84ddb52d4...
1•petethomas•6m ago•0 comments

Islamic Astronomy and Copernicus [pdf]

https://www.tuba.gov.tr/files/yayinlar/bilim-ve-dusun/TUBA-978-625-8352-02-3.pdf
1•teleforce•7m ago•0 comments

Uber and Lyft users overpay when they don’t price check: study

https://hub.jhu.edu/2026/01/02/uber-lyft-study-carey-business-school/
1•hhs•8m ago•0 comments

RunKoda – Real-time collaborative IDE where AI agents don't conflict

https://runkoda.com
1•SNAFI•8m ago•0 comments

New York's Cannabis Business

https://www.bbc.com/worklife/article/20260325-is-new-yorks-weed-business-really-flying-high
1•1659447091•10m ago•0 comments

What is economics these days?

https://marginalrevolution.com/marginalrevolution/2026/03/what-is-economics-these-days.html
1•hhs•12m ago•0 comments

Simulated microgravity alters fertilization and embryo development in mammals

https://www.nature.com/articles/s42003-026-09734-4
1•geox•12m ago•0 comments

Fedora Moving from Pagure to Forgejo

https://communityblog.fedoraproject.org/the-forge-is-our-new-home/
1•birdculture•12m ago•0 comments

Trump Administration Plans to Require Higher Wages for H-1B Visa Holders

https://www.wsj.com/politics/policy/trump-administration-plans-to-require-higher-wages-for-h-1b-v...
1•petethomas•14m ago•0 comments

2023

https://www.hyperdimensional.co/p/2023
1•jger15•15m ago•0 comments

Arctic Winter Sea Ice Ties Record Low, NASA, NSIDC Scientists Find

https://science.nasa.gov/earth/arctic-winter-sea-ice-2026/
1•martinpw•17m ago•0 comments

Show HN: 96.2% on LongMemEval – world record, built solo in 16 days for $1k

https://github.com/JordanMcCann/agentmemory
1•JordanMcCann•20m ago•0 comments

Husband "cheating" on wife with AI chatbot

https://old.reddit.com/r/BestofRedditorUpdates/comments/1s16oqw/cheating_with_ai/
2•cercatrova•21m ago•0 comments

Uptime of GitHub Pages Alternatives

https://alexsci.com/blog/static-hosting-uptime/
1•QuadmasterXLII•21m ago•0 comments

The Apple Charging Situation

https://randsinrepose.com/guides/apple-charging-guide.html
2•colinprince•24m ago•0 comments

Upgrading K8s to 1.35? cgroup v1 is now rejected by default

https://randomwrites.com/operations/23-Cluster-Upgrade-1-34-to-1-35
1•mutahirs•25m ago•0 comments

Anthropic discourages Claude demand during peak productivity hours

https://www.theregister.com/2026/03/26/anthropic_tweaks_usage_limits/
3•dude250711•26m ago•0 comments

Cryptography Migration Timeline

https://blog.google/innovation-and-ai/technology/safety-security/cryptography-migration-timeline/
1•sans_souse•26m ago•0 comments

Ditching GitHub

https://lonami.dev/blog/ditching-github/
2•stek29•27m ago•0 comments

Final training runs account for a minority of R&D compute spending

https://epochai.substack.com/p/final-training-runs-account-for-a
1•gmays•30m ago•0 comments

Should AI Be Listed as a Co-Author in Your Git Commits?

https://www.dariuszparys.com/should-ai-be-listed-as-a-co-author-in-your-git-commits/
2•thcipriani•30m ago•0 comments

Stripe: Provision a production-ready dev stack from your terminal

https://stripe.dev/blog/production-ready-dev-stack-from-terminal
1•nadis•31m ago•0 comments

Show HN: Interactive streamer map with purchasable cells

https://streamergrid.net
2•D-Nis•31m ago•0 comments

Show HN: I put an AI agent on a $7/month VPS with IRC as its transport layer

https://georgelarson.me/writing/2026-03-23-nullclaw-doorman/
8•j0rg3•32m ago•2 comments

Traffic Violation! License Plate Reader Mission Creep Is Already Here

https://www.eff.org/deeplinks/2026/03/traffic-violation-license-plate-reader-mission-creep-alread...
3•hn_acker•33m ago•1 comments