
Processing Strings 109x Faster Than Nvidia on H100

https://ashvardanian.com/posts/stringwars-on-gpus/
216•ashvardanian•4mo ago

Comments

ashvardanian•4mo ago
After publishing this a few days ago, two things have happened.

First, it turned out that StringZilla scales even further, to over 900 GigaCUPS on roughly 1000-byte inputs on an Nvidia H100. Moreover, the same performance is accessible on lower-end hardware, since the algorithm is not memory-bound and needs no HBM.

Second, I’ve finally transitioned to Xeon 6 Granite Rapids nodes with 192 physical cores and 384 threads. On those, the Ice Lake+ kernels currently yield over 3 TeraCUPS, 3x the current Hopper kernels.

The most recent numbers are already in the repo: https://github.com/ashvardanian/StringWa.rs
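
For reference, CUPS means cell updates per second: how many cells of the dynamic-programming alignment matrix a kernel evaluates each second, so 900 GigaCUPS is 9×10^11 cells per second. A minimal scalar Smith-Waterman sketch (with illustrative scoring constants, not StringZilla's actual kernel) shows what one "cell update" is:

    // Scalar Smith-Waterman local alignment over byte strings.
    // Evaluating one `curr[j + 1]` below is one "cell update";
    // CUPS = (a.len() * b.len()) / elapsed_seconds.
    fn smith_waterman(a: &[u8], b: &[u8]) -> i32 {
        let (matched, mismatch, gap) = (2i32, -1i32, -2i32);
        let mut prev = vec![0i32; b.len() + 1];
        let mut best = 0;
        for &ca in a {
            let mut curr = vec![0i32; b.len() + 1];
            for (j, &cb) in b.iter().enumerate() {
                let s = if ca == cb { matched } else { mismatch };
                let diag = prev[j] + s;      // match/mismatch
                let up = prev[j + 1] + gap;  // gap in b
                let left = curr[j] + gap;    // gap in a
                curr[j + 1] = diag.max(up).max(left).max(0);
                best = best.max(curr[j + 1]);
            }
            prev = curr;
        }
        best
    }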

giancarlostoro•4mo ago
I'm curious whether RipGrep is already faster, or whether it would be even faster if it used StringZilla. RipGrep is insanely fast as it is.
ashvardanian•4mo ago
I’m not an active RipGrep user, so can’t speak for all usage patterns. My guess: for plain substring searches, you probably won’t see much difference. Where StringZilla may potentially help is in character-set searches.
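
To make the distinction concrete: substring search looks for one fixed needle, while character-set search looks for the first occurrence of any byte from a small set. A sketch using the `memchr` crate's API (StringZilla exposes analogous routines; the calls below are `memchr`'s, shown only to illustrate the two workloads):

    use memchr::{memchr3, memmem};

    fn main() {
        let haystack = b"key: \"value\"\r\n";

        // Substring search: one fixed needle.
        assert_eq!(memmem::find(haystack, b"value"), Some(6));

        // Character-set search: first of any of up to three bytes.
        assert_eq!(memchr3(b'"', b'\r', b'\n', haystack), Some(5));
    }
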
burntsushi•4mo ago
Nope. ripgrep uses the `memchr` crate for substring search, and in my benchmarks it's generally faster than stringzilla:

    $ rebar cmp results.csv --intersection -f huge
    benchmark                                        rust/memchr/memmem/prebuilt  stringzilla/memmem/oneshot
    ---------                                        ---------------------------  --------------------------
    memmem/pathological/md5-huge-no-hash             47.4 GB/s (1.00x)            38.1 GB/s (1.25x)
    memmem/pathological/md5-huge-last-hash           40.3 GB/s (1.00x)            23.4 GB/s (1.72x)
    memmem/pathological/rare-repeated-huge-tricky    40.4 GB/s (1.04x)            42.0 GB/s (1.00x)
    memmem/pathological/rare-repeated-huge-match     1977.7 MB/s (1.00x)          563.3 MB/s (3.51x)
    memmem/subtitles/common/huge-en-that             35.9 GB/s (1.00x)            25.3 GB/s (1.42x)
    memmem/subtitles/common/huge-en-you              15.9 GB/s (1.00x)            9.5 GB/s (1.67x)
    memmem/subtitles/common/huge-en-one-space        1376.4 MB/s (1.00x)          1364.0 MB/s (1.01x)
    memmem/subtitles/common/huge-ru-that             29.0 GB/s (1.00x)            15.5 GB/s (1.87x)
    memmem/subtitles/common/huge-ru-not              16.0 GB/s (1.00x)            3.5 GB/s (4.53x)
    memmem/subtitles/common/huge-ru-one-space        2.6 GB/s (1.00x)             2.4 GB/s (1.08x)
    memmem/subtitles/common/huge-zh-that             31.2 GB/s (1.00x)            23.8 GB/s (1.31x)
    memmem/subtitles/common/huge-zh-do-not           19.4 GB/s (1.00x)            12.1 GB/s (1.59x)
    memmem/subtitles/common/huge-zh-one-space        5.3 GB/s (1.05x)             5.6 GB/s (1.00x)
    memmem/subtitles/never/huge-en-john-watson       41.2 GB/s (1.00x)            31.2 GB/s (1.32x)
    memmem/subtitles/never/huge-en-all-common-bytes  47.9 GB/s (1.00x)            37.5 GB/s (1.28x)
    memmem/subtitles/never/huge-en-some-rare-bytes   43.4 GB/s (1.00x)            42.7 GB/s (1.02x)
    memmem/subtitles/never/huge-en-two-space         42.2 GB/s (1.00x)            30.7 GB/s (1.37x)
    memmem/subtitles/never/huge-ru-john-watson       42.2 GB/s (1.00x)            42.1 GB/s (1.00x)
    memmem/subtitles/never/huge-zh-john-watson       47.6 GB/s (1.00x)            34.0 GB/s (1.40x)
    memmem/subtitles/rare/huge-en-sherlock-holmes    40.8 GB/s (1.05x)            42.9 GB/s (1.00x)
    memmem/subtitles/rare/huge-en-sherlock           36.7 GB/s (1.16x)            42.5 GB/s (1.00x)
    memmem/subtitles/rare/huge-en-medium-needle      47.7 GB/s (1.00x)            31.3 GB/s (1.52x)
    memmem/subtitles/rare/huge-en-long-needle        44.5 GB/s (1.00x)            32.0 GB/s (1.39x)
    memmem/subtitles/rare/huge-en-huge-needle        45.7 GB/s (1.00x)            33.4 GB/s (1.37x)
    memmem/subtitles/rare/huge-ru-sherlock-holmes    42.1 GB/s (1.00x)            42.2 GB/s (1.00x)
    memmem/subtitles/rare/huge-ru-sherlock           42.3 GB/s (1.01x)            42.9 GB/s (1.00x)
    memmem/subtitles/rare/huge-zh-sherlock-holmes    46.7 GB/s (1.00x)            33.1 GB/s (1.41x)
    memmem/subtitles/rare/huge-zh-sherlock           47.4 GB/s (1.00x)            42.8 GB/s (1.11x)
But I would say they are overall pretty competitive.

If you want to run the benchmarks yourself, you can. First, get rebar[1]. Then, from the root of the `memchr` repository[2]:

    $ rebar build -e 'rust/memchr/memmem/prebuilt' -e 'stringzilla/memmem/oneshot'
    stringzilla/memmem/oneshot: running: cd "benchmarks/./engines/stringzilla" && "cargo" "build" "--release"
    stringzilla/memmem/oneshot: build complete for version 3.12.3
    rust/memchr/memmem/prebuilt: running: cd "benchmarks/./engines/rust-memchr" && "cargo" "build" "--release"
    rust/memchr/memmem/prebuilt: build complete for version 2.7.4
    $ rebar measure -e 'rust/memchr/memmem/prebuilt' -e 'stringzilla/memmem/oneshot' | tee results.csv
    $ rebar rank results.csv
    Engine                       Version  Geometric mean of speed ratios  Benchmark count
    ------                       -------  ------------------------------  ---------------
    rust/memchr/memmem/prebuilt  2.7.4    1.14                            57
    stringzilla/memmem/oneshot   3.12.3   1.43                            54
    $ rebar cmp results.csv --intersection -f never/huge
    benchmark                                        rust/memchr/memmem/prebuilt  stringzilla/memmem/oneshot
    ---------                                        ---------------------------  --------------------------
    memmem/subtitles/never/huge-en-john-watson       41.2 GB/s (1.00x)            31.2 GB/s (1.32x)
    memmem/subtitles/never/huge-en-all-common-bytes  47.9 GB/s (1.00x)            37.5 GB/s (1.28x)
    memmem/subtitles/never/huge-en-some-rare-bytes   43.4 GB/s (1.00x)            42.7 GB/s (1.02x)
    memmem/subtitles/never/huge-en-two-space         42.2 GB/s (1.00x)            30.7 GB/s (1.37x)
    memmem/subtitles/never/huge-ru-john-watson       42.2 GB/s (1.00x)            42.1 GB/s (1.00x)
    memmem/subtitles/never/huge-zh-john-watson       47.6 GB/s (1.00x)            34.0 GB/s (1.40x)
See also: https://github.com/BurntSushi/memchr/discussions/159

[1]: https://github.com/BurntSushi/rebar

[2]: https://github.com/BurntSushi/memchr
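
For context on the engine names: "prebuilt" appears to mean the searcher is constructed once and reused across calls, amortizing needle preprocessing, while "oneshot" rebuilds it per call. In the `memchr` crate the prebuilt form is `memmem::Finder`:

    use memchr::memmem;

    fn main() {
        // Build once; reuse across many haystacks.
        let finder = memmem::Finder::new("Sherlock Holmes");
        for doc in ["no match here", "Sherlock Holmes lived at 221B"] {
            println!("{:?}", finder.find(doc.as_bytes()));
        }
    }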

clausecker•4mo ago
When I implemented SIMD-accelerated string functions for FreeBSD's libc, I briefly looked at Stringzilla, but the code didn't look particularly interesting or fast. So no surprise here.
ashvardanian•4mo ago
It’s a very nice and detailed benchmark suite! Great effort! Can you please share the CPU model you are running on? I suspect it’s an x86 CPU without AVX-512 support.
burntsushi•4mo ago
i9-12900K, x86-64.

There is definitely no AVX-512 support on my CPU. Which is also true for most of my users. I don't bother with AVX-512 for that reason.

Another substantial population of my users are on aarch64, which memchr has optimizations for. I don't think StringZilla does.

ashvardanian•4mo ago
Makes sense! I mostly focus on newer AVX-512 variants as opposed to older AVX2-only CPUs. As for aarch64, it is supported with NEON, SVE, and SVE2 kernels for some tasks. The last two are rarely useful unless you run on AWS Graviton 3 (the previous generation) or one of the supercomputers with custom chips like the Fujitsu A64FX.
burntsushi•4mo ago
> newer AVX-512 variants as opposed to older AVX2-only CPUs

This is exactly my issue with targeting AVX-512. It isn't just absent on "older AVX2-only CPUs." It's also absent on many "newer AVX2-only CPUs." For example, the i9-14900K. I don't think any of the other newer Intel CPUs have AVX-512 either. And historically, whether an x86-64 CPU supported AVX-512 at all was hit or miss.

AVX-512 has been around for a very long time now, and it has just never been consistently available.
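
That inconsistency is why portable libraries select kernels at runtime instead of assuming AVX-512 at compile time. A minimal dispatch sketch (the feature names are real std ones; the kernel labels are placeholders):

    fn best_kernel() -> &'static str {
        // Runtime checks: one binary serves CPUs with and without AVX-512.
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("avx512bw") {
                return "AVX-512 kernel"; // e.g. Zen 4, Ice Lake server
            }
            if is_x86_feature_detected!("avx2") {
                return "AVX2 kernel"; // e.g. i9-12900K, i9-14900K
            }
        }
        "scalar fallback"
    }

    fn main() {
        println!("dispatching to: {}", best_kernel());
    }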

vlovich123•4mo ago
It’s mainly available in data centers, but yes, it's missing in consumer parts. And for a while, even in data centers you wanted to be careful about using it due to Intel’s clock-throttling issues, but that hasn’t been true for a few years.
ashvardanian•4mo ago
The consumer situation is changing. A few years ago, when I was working with a team on some closed-source HPC stuff, we got everyone Tiger Lake-based laptops to simplify AVX-512 R&D. Now Zen4-based desktop CPUs also support it.

But it's fair to say that I’m mostly focusing on datacenter/supercomputing hardware, on both the x86 and Arm side.

vlovich123•4mo ago
If you’re targeting Intel consumer parts, AVX-512 is pointless. But yes, AMD does continue to ship AVX-512 chips, so completely ignoring AVX-512 on consumer hardware isn’t ideal.
William_BB•4mo ago
Could you elaborate on SVE and SVE2? Is that because it's only 128 bits? I think my MacBook (Apple silicon) is one of the two
ashvardanian•4mo ago
Yes, at 128-bit register width NEON is mostly enough, except for a few categories of instructions missing from that ISA subset, like scatter/gather ops, which can yield a ~30% boost over serial memory accesses: https://github.com/ashvardanian/less_slow.cpp/releases/tag/v...
giancarlostoro•4mo ago
Thank you! I love RipGrep; it's the one thing I install and use for everything, even non-dev stuff.
jasonjmcghee•4mo ago
Thank you for memchr- really!
llm_nerd•4mo ago
This is neat, and I clicked the little upvote because hyper-optimizations are a delight.

But realistically, is there any real-world situation where one would use this? What niche, industry, or need would benefit from this, where the dependency and setup costs are worth it? Strings just seem to be a long-solved non-issue.

ashvardanian•4mo ago
This last wave of work was actually triggered by industry over the last two years, as the volume of biological sequence data is growing rapidly and more BioTech and Pharma companies are rushing to scale computational pipelines.

Namely, if you look at DeepMind’s AlphaFold 1 and 2, the bulk of compute time is spent outside of PyTorch - running sequence alignment. Historically, with BLAST. More recently, in other labs, with some of my code :)

ozgrakkurt•4mo ago
Really dig these optimization blogs. Educational and well written
abdellah123•4mo ago
super nice, is there already an extension to use this in Postgres?
ashvardanian•4mo ago
Not that I’m aware of. Some commercial DBMS vendors are experimenting with integrations, but I haven’t really seen much in the Postgres ecosystem.

What excites me in this release is the quality of the new hash functions. I’ve built many over the years but never felt they were worth sharing until now. Having two included here was a personal milestone for me, since I’ve always admired how good xxHash and aHash are and wanted to build something of similar caliber.

The new hashes should be directly useful in databases, for example improving JOIN performance. And the fingerprinting interfaces based on 52-bit modulo math with double-precision FMA units open up another path. They aren’t easy to use and won’t apply everywhere, but on petabyte-scale retrieval tasks they can make a real impact.
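
A toy version of the JOIN case: a hash join hashes every key on both the build and probe side, so hash throughput and distribution quality show up directly in query time. Sketched with std's `HashMap`; a custom hash function would plug in through the `BuildHasher` parameter:

    use std::collections::HashMap;

    // Toy hash join on string keys: every insert and probe hashes a key.
    fn hash_join<'a>(
        left: &'a [(String, u32)],
        right: &'a [(String, f64)],
    ) -> Vec<(&'a str, u32, f64)> {
        let mut build: HashMap<&str, u32> = HashMap::new();
        for (k, v) in left {
            build.insert(k, *v); // build phase: hash each left key
        }
        right
            .iter() // probe phase: hash each right key, emit matches
            .filter_map(|(k, w)| build.get(k.as_str()).map(|v| (k.as_str(), *v, *w)))
            .collect()
    }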

ComputerGuru•4mo ago
Great work and nice write up, Ash!

A suggestion: in the comparison table under the “AES and Port-Parallelism Recipe,” it would be great to include “streaming support” and “stable output” (across OS/arch) as columns.

Also, something to beware of: some hash libraries claim to support streaming via the Hasher interface but actually return different results in streaming and one-shot mode (and have different performance profiles). I’m on mobile so I can’t check at the moment, but I’m about 80% sure gxhash has at least one of these problems, which prevented me from using it before.

ashvardanian•4mo ago
Thanks! You are likely right! It took a lot of time to make sure that all 6 ISA-specific versions of StringZilla (https://github.com/ashvardanian/StringZilla/blob/main/includ...) return the same output for both one-shot and incremental construction, and I’m not sure it was a priority for other projects :)
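
The invariant in question amounts to a one-line property test: feeding the same bytes in chunks must produce the same digest as feeding them all at once. Sketched with std's `DefaultHasher` as a stand-in for any streaming hasher:

    use std::collections::hash_map::DefaultHasher;
    use std::hash::Hasher;

    fn main() {
        let data = b"the quick brown fox jumps over the lazy dog";

        // One-shot: the whole buffer in a single write.
        let mut one_shot = DefaultHasher::new();
        one_shot.write(data);

        // Streaming: the same bytes in arbitrary chunks.
        let mut streaming = DefaultHasher::new();
        for chunk in data.chunks(7) {
            streaming.write(chunk);
        }

        // A well-behaved streaming hasher makes these equal.
        assert_eq!(one_shot.finish(), streaming.finish());
    }
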
unwind•4mo ago
I'm not (at the moment) a potential user of this, but I just wanted to say that it was a fantastic page with a really good presentation of the project and its capabilities.

One micro-question on the editing: why are numbers written with an apostrophe (') as the thousands separator [1]? I know it's used for this purpose in Switzerland [2] and that many programming languages support it. It just seemed very strange for English text, where a comma (,) would typically be used.

[1]: https://en.wikipedia.org/wiki/Decimal_separator#Digit_groupi...

[2]: https://en.wikipedia.org/wiki/Apostrophe#Miscellaneous_uses_...

adrian_b•4mo ago
This is the stupid choice made by the C++14 standard.

A digit separator for increased readability of long numbers was first introduced by Ada (1979-06), which used the underscore. That usage matched the original reason for introducing the underscore into the character set, which had been done by PL/I (1964-12) to increase the readability of long identifiers while avoiding the ambiguity caused by using the hyphen for that purpose, as COBOL had done before (many LISPs have retained the COBOL usage of the hyphen because, like COBOL, they do not normally write arithmetic expressions with operators).

Most programming languages that have added a digit separator have followed Ada in using the underscore.

35 years later, C++ should have done the same; I hate whoever among the people updating the standard thought otherwise, thus causing completely unnecessary compatibility problems, e.g. when copying a big initialized array between program sources written in different languages.

There was a flawed argument against the underscore: that it could have caused parsing problems in some weird legacy programs. But those were no harder to solve than avoiding the parsing errors caused by the legacy use of the apostrophe in character constants (i.e. forbidding the digit separator as the first character of a number is enough to ensure unambiguous parsing).
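
For comparison, most languages that followed Ada accept the underscore form, while C++ settled on the apostrophe; e.g. in Rust:

    fn main() {
        let population = 1_000_000; // Ada-style underscore separator
        // C++14 writes the same constant as 1'000'000.
        println!("{population}");
    }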

ashvardanian•4mo ago
Thanks for the kind words! In this case, it isn’t tied to any programming language or locale-specific formatting. I just find commas less readable in long numbers, especially in running text across Western languages. Apostrophes feel clearer to me, so I usually stick with them.