frontpage.

Show HN: Solving NP-Complete Structures via Information Noise Subtraction (P=NP)

https://zenodo.org/records/18395618
1•alemonti06•4m ago•1 comments

Cook New Emojis

https://emoji.supply/kitchen/
1•vasanthv•7m ago•0 comments

Show HN: LoKey Typer – A calm typing practice app with ambient soundscapes

https://mcp-tool-shop-org.github.io/LoKey-Typer/
1•mikeyfrilot•10m ago•0 comments

Long-Sought Proof Tames Some of Math's Unruliest Equations

https://www.quantamagazine.org/long-sought-proof-tames-some-of-maths-unruliest-equations-20260206/
1•asplake•11m ago•0 comments

Hacking the last Z80 computer – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/FEHLHY-hacking_the_last_z80_computer_ever_made/
1•michalpleban•11m ago•0 comments

Browser-use for Node.js v0.2.0: TS AI browser automation parity with PY v0.5.11

https://github.com/webllm/browser-use
1•unadlib•12m ago•0 comments

Michael Pollan Says Humanity Is About to Undergo a Revolutionary Change

https://www.nytimes.com/2026/02/07/magazine/michael-pollan-interview.html
1•mitchbob•12m ago•1 comments

Software Engineering Is Back

https://blog.alaindichiappari.dev/p/software-engineering-is-back
1•alainrk•13m ago•0 comments

Storyship: Turn Screen Recordings into Professional Demos

https://storyship.app/
1•JohnsonZou6523•14m ago•0 comments

Reputation Scores for GitHub Accounts

https://shkspr.mobi/blog/2026/02/reputation-scores-for-github-accounts/
1•edent•17m ago•0 comments

A BSOD for All Seasons – Send Bad News via a Kernel Panic

https://bsod-fas.pages.dev/
1•keepamovin•20m ago•0 comments

Show HN: I got tired of copy-pasting between Claude windows, so I built Orcha

https://orcha.nl
1•buildingwdavid•20m ago•0 comments

Omarchy First Impressions

https://brianlovin.com/writing/omarchy-first-impressions-CEEstJk
2•tosh•26m ago•1 comments

Reinforcement Learning from Human Feedback

https://arxiv.org/abs/2504.12501
2•onurkanbkrc•27m ago•0 comments

Show HN: Versor – The "Unbending" Paradigm for Geometric Deep Learning

https://github.com/Concode0/Versor
1•concode0•27m ago•1 comments

Show HN: HypothesisHub – An open API where AI agents collaborate on medical res

https://medresearch-ai.org/hypotheses-hub/
1•panossk•30m ago•0 comments

Big Tech vs. OpenClaw

https://www.jakequist.com/thoughts/big-tech-vs-openclaw/
1•headalgorithm•33m ago•0 comments

Anofox Forecast

https://anofox.com/docs/forecast/
1•marklit•33m ago•0 comments

Ask HN: How do you figure out where data lives across 100 microservices?

1•doodledood•33m ago•0 comments

Motus: A Unified Latent Action World Model

https://arxiv.org/abs/2512.13030
1•mnming•33m ago•0 comments

Rotten Tomatoes Desperately Claims 'Impossible' Rating for 'Melania' Is Real

https://www.thedailybeast.com/obsessed/rotten-tomatoes-desperately-claims-impossible-rating-for-m...
3•juujian•35m ago•2 comments

The protein denitrosylase SCoR2 regulates lipogenesis and fat storage [pdf]

https://www.science.org/doi/10.1126/scisignal.adv0660
1•thunderbong•37m ago•0 comments

Los Alamos Primer

https://blog.szczepan.org/blog/los-alamos-primer/
1•alkyon•39m ago•0 comments

NewASM Virtual Machine

https://github.com/bracesoftware/newasm
2•DEntisT_•42m ago•0 comments

Terminal-Bench 2.0 Leaderboard

https://www.tbench.ai/leaderboard/terminal-bench/2.0
2•tosh•42m ago•0 comments

I vibe coded a BBS bank with a real working ledger

https://mini-ledger.exe.xyz/
1•simonvc•42m ago•1 comments

The Path to Mojo 1.0

https://www.modular.com/blog/the-path-to-mojo-1-0
1•tosh•45m ago•0 comments

Show HN: I'm 75, building an OSS Virtual Protest Protocol for digital activism

https://github.com/voice-of-japan/Virtual-Protest-Protocol/blob/main/README.md
5•sakanakana00•48m ago•1 comments

Show HN: I built Divvy to split restaurant bills from a photo

https://divvyai.app/
3•pieterdy•51m ago•0 comments

Hot Reloading in Rust? Subsecond and Dioxus to the Rescue

https://codethoughts.io/posts/2026-02-07-rust-hot-reloading/
4•Tehnix•51m ago•1 comments

I want a good parallel language [video]

https://www.youtube.com/watch?v=0-eViUyPwso
117•raphlinus•3mo ago

Comments

ChadNauseam•3mo ago
Raph and I also talked about this subject here: https://www.popovit.ch/interviews/raph-levien-simd The discussion covers things at a relatively basic level as we wanted it to be accessible to a wide audience. So we explain SIMD vs SIMT, predication, multiversioning, and more.
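
As a rough illustration of the predication idea (just a sketch; NumPy stands in for the vector hardware, and the numbers are made up): instead of branching per element, which causes divergence on SIMD/SIMT hardware, both arms are evaluated and a mask selects the result per lane.

  import numpy as np

  x = np.array([-2.0, 3.0, -1.0, 4.0])

  # Scalar, branchy version: one if/else per element.
  branchy = np.array([v * 2.0 if v > 0 else -v for v in x])

  # Predicated version: compute both arms, then select per lane with a mask.
  predicated = np.where(x > 0, x * 2.0, -x)

  assert np.array_equal(branchy, predicated)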

Raph is a super nice guy and a pleasure to talk to. I'm glad we have people like him around!

SamInTheShell•3mo ago
Went in thinking "Have you heard of Go?"... but this turned out to be about GPU computing.
nasretdinov•3mo ago
Well, they said "good" :). Go we already have, that is correct.

P.S. I'm joking, I do love Go, even though it's by no means a perfect language to write parallel applications with

SamInTheShell•3mo ago
Haha, yeah. Idk any other language where I practically get a free parallelization+concurrency sandwich. It's kept me coming back to Go for a decade now, despite them using a signal that prevents using it for system level libraries. They literally broke my libnss-go package years ago when they selected the signal to use to control the concurrency portion of the runtime.
pdimitar•3mo ago
Elixir.

Parallelism is trivial and front-and-center.

And no it's not a niche language. Don't listen to the army of Python technicians.

SamInTheShell•2mo ago
There seems to be about the same level of effort with Elixir as there is for a language like Kotlin.

The free sandwich I'm referring to with Go is the ability to just do `go funcnamehere()` and that's running concurrently and in parallel. If I need coordination of those goroutines, I can still do that with any number of locking patterns. It's extremely convenient, making the trade off of having a runtime baked in worth it imo.

pdimitar•2mo ago
Well, almost exactly the same goes for Elixir. You have to declare a supervisor first but from then on it's trivial, almost on the level of `go doThisThing()`.

That's why I made Elixir my main language. A lot of the tech sphere stubbornly pretends we don't live in a world with multicore CPUs, even to this day.

fifilura•3mo ago
SQL.

It is a joke, but an SQL engine can be massively parallel. You just don't know it; it just gives you what you want. And in many ways the operations resemble what you do, for example, in CUDA.

A CUDA backend for DuckDB or Trino would be one of my go-to projects if I were laid off.

drivebyhooting•3mo ago
My issue with SQL is lack of composability and difficulty of debugging intermediate results.
asadm•3mo ago
Is it a language problem though? It's just a lack of tooling.
theLiminator•3mo ago
The dataframe paradigm (a good example being polars) is another good alternative that's more composable (imo).
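
As a rough sketch of what that composability looks like (recent polars; the column names are invented for the example):

  import polars as pl

  df = pl.DataFrame({"group": ["a", "a", "b"], "value": [1, 2, 3]})

  result = (
      df.filter(pl.col("value") > 1)                            # each step returns a new frame,
        .with_columns((pl.col("value") * 10).alias("scaled"))   # so the steps compose
        .group_by("group")
        .agg(pl.col("scaled").sum())
  )
  print(result)
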
fifilura•3mo ago
It is true. I still hate it. I think because it always offers 10 different ways to do the same thing. So it is just too much to remember.
mamcx•3mo ago
Yes, SQL is poor.

What could be good is a relational + array model. I have some ideas at https://tablam.org, and I think building not just the language but also the optimizer in tandem will turn out very nicely.

oembar4•3mo ago
The programming style reminds me of the old days of Clipper and the xBase family, even ABAP. I like the syntax.
kragen•3mo ago
The standard things that someone should always say when this problem comes up are:

• Datalog is much, much better on these axes.

• Tutorial D is also better than SQL.

zozbot234•3mo ago
You can use SQL CTEs and/or VIEWs as a composable abstraction over queries and inspect intermediate results. The language features are there.
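
A small self-contained illustration of that (Python's standard-library sqlite3 here, just to have something runnable; the table, view, and column names are invented):

  import sqlite3

  conn = sqlite3.connect(":memory:")
  conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, region TEXT)")
  conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                   [(1, 10.0, "eu"), (2, 25.0, "us"), (3, 7.5, "eu")])

  # A view names one reusable step; the CTE composes another step on top of it,
  # and each piece can be SELECTed on its own to inspect intermediate results.
  conn.execute("CREATE VIEW eu_orders AS SELECT * FROM orders WHERE region = 'eu'")
  rows = conn.execute("""
      WITH totals AS (SELECT region, SUM(amount) AS total FROM eu_orders GROUP BY region)
      SELECT * FROM totals
  """).fetchall()
  print(rows)   # [('eu', 17.5)]
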
Too•3mo ago
Check out https://prql-lang.org/

It solves all the warts of SQL while still being true to its declarative execution: trailing commas, the from statement first so queries read as a composable pipeline, temporary variables for expressions, intuitive grouping.

dvrp•3mo ago
If you want to work in data engineering for massive datasets (many petabytes) pls hit me up!
fifilura•3mo ago
Sorry, wrong continent :)
taeric•3mo ago
More generally, the key here is that the more magic you want in the execution of your code, the more declarative you want the code to be. And SQL is pretty much the poster child declarative language out there.

Term-rewriting languages probably work better at this than I would expect? It is kind of sad how little experience I have built up with that sort of thing, and I think I still have more than a large percentage of developers out there.

gnulinux•3mo ago
Even in this thread people underestimate how good e.g. DuckDB can be if you swallow its quirks. Yeah, SQL has many problems, but with a slightly extended language, some QoL features, and seamless parallelism, DuckDB is extremely productive if you want to crunch a bunch of numbers on the order of minutes or hours (not real time).

Sometimes when I have a problem, I just generate a bunch of "possible solutions" with a constraint solver (e.g. MiniZinc), which produces GBs of CSVs describing the candidate solutions, then let DuckDB analyze which ones are suitable. DuckDB is amazing.
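
As a rough sketch of that workflow (DuckDB's Python API; the file glob and the columns solution_id, cost, violations are invented for the example):

  import duckdb

  best = duckdb.sql("""
      SELECT solution_id, cost
      FROM read_csv_auto('solutions_*.csv')    -- scans all the CSV dumps
      WHERE violations = 0                     -- keep only feasible candidates
      ORDER BY cost
      LIMIT 10
  """).fetchall()
  print(best)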

v9v•3mo ago
There were a few languages designed specifically for parallel computing spurred by DARPA's High Productivity Computing Systems project. While Fortress is dead, Chapel is still being developed.
zokier•3mo ago
IIRC those were oriented more towards large HPC clusters than computation on a single node?
Jtsummers•3mo ago
Chapel, at least, aims for both. You can write loops that it will try to compile to use SIMD instructions, or even for the GPU: https://chapel-lang.org/docs/technotes/gpu.html
convolvatron•3mo ago
The distinction matters less and less. Inside the GPU there is already plenty of locality to exploit (caches, schedulers, warps). NVLink is a switched memory-access network, so that already gets you some fairly large machines with multiple kinds of locality.

Throwing InfiniBand or IP on top is really structurally more of the same.

Chapel definitely can target a single GPU.

jandrewrogers•3mo ago
Those languages were not effective in practice. The kind of loop parallelism that most people focus on is the least interesting and effective kind outside of niche domains. The value was low.

Hardware architectures like Tera MTA were much more capable but almost no one could write effective code for them even though the language was vanilla C++ with a couple extra features. Then we learned how to write similar software architecture on standard CPUs. The same problem of people being bad at it remained.

The common thread in all of this is people. Humans as a group are terrible at reasoning about non-trivial parallelism. The tools almost don't matter. Reasoning effectively about parallelism involves manipulating a space that is quite evidently beyond most human cognitive abilities to reason about.

Parallelism was never about the language. Most people can't build the necessary mental model in any language.

kragen•3mo ago
Maybe we can find better abstractions. Software transactional memory seems like a promising candidate, for example. Sawzall/Dremel and SQL seem to also be capable of expressing some interesting things. And, as RobotToaster mentions, in VHDL and Verilog, people have successfully described parallel computations containing billions of concurrent processes, and even gotten them to work properly.
jimbokun•3mo ago
That's why programming languages are important for solving this problem.

The syntax and semantics should constrain the kinds of programs that are easy to write in the language to ones that the compiler can figure out how to run in parallel correctly and efficiently.

That's how you end up with something like Erlang or Elixir.

kevindamm•3mo ago
This was, I think, the greatest strength of MapReduce. If you could write a basic program, you could understand the map, combine, shuffle, and reduce operations. MR, Hadoop, etc. would take care of recovering from operational failures like disk or network outages through idempotent retries behind the scenes, and programmers could focus on how data was being transformed, joined, serialized, etc.

To your point, we also didn't need a new language to adopt this paradigm. A library and a running system were enough (though, semantically, it did offer unique language-like characteristics).

Sure, it's a bit antiquated now that we have more sophisticated iterations for the subdomains it was most commonly used for, but it hit a kind of sweet spot between parallelism utility and complexity of knowledge or reasoning required of its users.
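
A toy, single-process sketch of that map/shuffle/reduce shape (word count, plain Python): a real MapReduce or Hadoop run would distribute each phase and retry failed tasks, but the part the programmer writes is essentially these two functions.

  from collections import defaultdict

  def map_phase(doc):                  # map: emit (key, value) pairs
      for word in doc.split():
          yield word, 1

  def reduce_phase(word, counts):      # reduce: combine all values for one key
      return word, sum(counts)

  docs = ["a rose is a rose", "a daisy is a daisy"]

  shuffled = defaultdict(list)         # shuffle: group emitted pairs by key
  for doc in docs:
      for word, n in map_phase(doc):
          shuffled[word].append(n)

  print(dict(reduce_phase(w, ns) for w, ns in shuffled.items()))
  # {'a': 4, 'rose': 2, 'is': 2, 'daisy': 2}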

cubefox•3mo ago
Unfortunately his microphone did not cooperate.
abejfehr•3mo ago
Bend comes to mind as an attempt at this: https://github.com/HigherOrderCO/Bend

Disclaimer: I did not watch the video yet

dandanua•3mo ago
I think a good parallel language will be one that takes your code written with tasks and channels, understands its logic, and rewrites and compiles it in the most efficient way. I don't feel that I should have to write anything harder than that as a mere human.
convolvatron•3mo ago
Mapping from channels to SIMD seems kind of intractable; it's a kind of lifting that involves looking across the producers and the consumers.

Going the other direction, making channel runtimes run SIMD, is trivial.

Munksgaard•3mo ago
Interesting talk. He mentions Futhark a few times, but fails to point out that his ideal way of programming is almost 1:1 how it would be done in Futhark.

His example is:

  sequence
    .map(|x: T0| ...: T1)
    .scan(|a: T1, b: T1| ...: T1)
    .filter(|x: T1| ...: bool)
    .flat_map(|x: T1| ...: sequence<T2>)
    .collect()
It would be written in Futhark something like this:

  sequence
    |> map (\x -> ...)
    |> scan (\x y -> ...)
    |> filter (\x -> ...)
    |> map (\x -> ...) 
    |> flatten
Munksgaard•3mo ago
Also, while not exactly the algorithm Raph is looking for, here is a bracket matching function (from Pareas, which he also mentions in the talk) in Futhark: https://github.com/Snektron/pareas/blob/master/src/compiler/...

I haven't studied it in depth, but it's pretty readable.

pythomancer•3mo ago
(author here) check_brackets_bt is actually exactly the algorithm that Raph mentions
Munksgaard•3mo ago
Thanks for clarifying! It would indeed be interesting to see a comparison between similar implementations in other languages, both in terms of readability and performance. I feel like the readability can hardly get much better than what you wrote, but I don't know!
raphlinus•3mo ago
Right. This is the binary tree version of the algorithm, and is nice and concise, very readable. What would take it to the next level for me is the version in the stack monoid paper, which chunks things up into workgroups. I haven't done benchmarks against the Pareas version (unfortunately it's not that easy), but I would expect the workgroup optimized version to be quite a bit faster.
convolvatron•3mo ago
I've been playing with one using scans. Too bad that's not really on the map for architectural reasons; it opens up a lot of uses.
kragen•3mo ago
Yeah, monoid prefix sum is a surprisingly powerful tool for parallel and incremental algorithm design!
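
A minimal sequential sketch of why it works for bracket matching (plain Python, assuming the input is just '(' and ')'): each character maps to a pair (unmatched closers, unmatched openers), and because combine is associative the string can be chunked and reduced or scanned in parallel.

  from functools import reduce

  def leaf(ch):
      return (1, 0) if ch == ")" else (0, 1)

  def combine(left, right):
      matched = min(left[1], right[0])          # openers from the left meet
      return (left[0] + right[0] - matched,     # closers from the right
              left[1] + right[1] - matched)

  def balanced(s):
      return reduce(combine, map(leaf, s), (0, 0)) == (0, 0)

  print(balanced("(()())"), balanced(")("))     # True False
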
Munksgaard•3mo ago
To be clear, you can express workgroup parallelism in Futhark, or rather, if the compiler sees that you've programmed your problem in such a way that it can take advantage of workgroup parallelism, it will.

But you're right, it would be interesting to see how the different approaches stack up to each other. The Pareas project linked above also includes an implementation using radix sort.

snthpy•3mo ago
Very cool. I have seen Futhark mentioned before but I don't know anything about it.

The example you showed is very much how I think about PRQL pipelines. Syntax is slightly different but semantics are very similar.

At first I thought that PRQL doesn't have scan but actually loop fulfills the same function. I'm going to look more into comparing those.

MangoToupe•3mo ago
Prolog?
jolt42•3mo ago
Now you are backtracking (pun intended).
yuretz•3mo ago
yes.
icandoit•3mo ago
There is the logical system that turns Prolog into SQL. Something like this might be useful for neuro-symbolic computing. Graph languages like Cypher seem a lot more limited.
swatson741•3mo ago
So he wants a good parallel language? What's the issue? I haven't had problems with concurrency, multiplexing, and promises. They've solved all the parallelism tasks I've needed to do.
pbronez•3mo ago
The audio is weirdly messed up
raphlinus•3mo ago
Yes, sorry about that. We had tech issues, and did the best we could with the audio that was captured.
awaymazdacx5•3mo ago
Lower-level programming language, which is either object-oriented like python or after compilation a real-time system transposition would assemble the microarchitecture to an x86 chip.
pancakeguy•3mo ago
What about burla.dev?

Or basically a generic, nestable `remote_parallel_map` for Python functions over lists of objects.

I haven't had a chance to fully watch the video yet / I understand it focuses on lower levels of abstraction / GPU programming. But I'd love to know how this fits into what the speaker is looking for / what it's missing (other than obviously it not being a way to program GPUs) (also, full disclosure, I am a co-founder).

RobotToaster•3mo ago
VHDL?
raphlinus•3mo ago
I almost mentioned it in the talk, as an example of a language that's deployed very successfully and expresses parallelism at scale. Ultimately I didn't, as the core of what I'm talking about is control over dynamic allocation and scheduling, and that's not the strength of VHDL.
TJSomething•3mo ago
It seems like there are two sides to this problem, both of which are hard and go hand in hand. There is the HCI problem of designing abstractions rich enough to handle problems like parsing and scheduling on the GPU. Then there is the sufficiently-smart-compiler problem of lowering those abstractions to the GPU. But of course, there's a limit to how smart a compiler can be, which loops back to the abstraction design.

Overall, it seems to be a really interesting problem!

AllegedAlec•3mo ago
ctrl-f Erlang

Nothing yet? Damn...

rramadass•3mo ago
Yeah, I too was looking for Erlang.

The thing I would really like to see is some research on how to run the Erlang concurrency model on a GPU.

jerf•3mo ago
There's no need for research. The answer is simple: You can't run Erlang concurrency on a GPU. GPUs fundamentally get their advantage by running the same operations on a huge set of cores across different data. They aren't just Platonically faster than CPUs, they're faster than CPUs on very, very specific tasks. Out of the context of those tasks, they are in fact massively, massively slower.

Some of the operations Erlang does, GPUs don't even want to do at all, including basic things like pattern matching. GPUs do not want that sort of code at all.

"Erlang" is being over specific here. No conventional CPU language makes sense on a GPU at all.

rramadass•3mo ago
Not what I meant (these are superficialities).

Erlang is a concurrency-oriented language, though its concurrency architecture (multicore/node/cluster/etc.) is different from that modeled by GPUs (vectorized/SIMD/SIMT/etc.). Since share-nothing processes (the so-called Actor model) are at the heart of the Erlang Run-Time System (ERTS)/BEAM, it is easy to imagine a "group of Erlang processes" being mapped directly to a "group of threads in a warp on a GPU". Of course, the Erlang scheduler being different (it is reduction-based, not time-sliced), one would need to rethink some fundamental design decisions, but that should not be too out of the way since the system as a whole is built for concurrency support. The other problem would be memory transfers between CPU and GPU (while still preserving immutability), but this is a more general one.

You can call out to CUDA/OpenCL/etc. from Erlang through its C interface (Kevin Smith did a presentation years ago), but I have seen no new research since then. However, there have been some new things in Elixir land, notably "Nx" (Numerical Elixir) and "GPotion" (a DSL for GPU programming in Elixir).

But note that none of the above is aimed at modifying the Erlang language/runtime concurrency model itself to map to GPU models, which is what I would very much like to see.

seanmcdirmid•3mo ago
The biggest issue, I think, is utilizing the massive GPU memory bandwidth: you really need SIMD, or your GPU is just going to generate a lot of heat to do only a bit of work.
rramadass•3mo ago
But SIMD has got nothing to do with the language per se. In Erlang everything is in a module, and hence I can imagine annotating a module with a SIMD/SIMT attribute, which would then be the hint for ERTS to map all the processes in that module to a warp on a GPU, using SIMD vectorization as needed. Of course, my Erlang processes would have to be written to take advantage of this, and thus could not be general-purpose (i.e. MIMD) processes.
icandoit•3mo ago
Something like Vine lang, or something else built on interaction nets, might be close to what you are looking for. It can run on a GPU.
rramadass•3mo ago
Right, but since Erlang is already concurrency-oriented, I was thinking it might be easier to add the GPU model to its runtime. See my other comment here: https://news.ycombinator.com/item?id=45843523
jerf•3mo ago
Erlang operates on a higher level, with much, much larger chunks. Any sensible modern system will, for instance, have some sort of execution context (thread/green thread/continuation/task/whatever) associated with a single incoming HTTP request. That's very nice, and not going anywhere.

However Erlang has very little to say about parallelization of loops, or in the levels between a single loop and a HTTP request.

Nor would it be a good base for such things; if you're worried about getting maximum parallel performance out of your CPUs, you pretty much by necessity need to start from a base where single-threaded performance is already roughly optimal, such as C, C++, or Rust. Go at the very outside, and that's already a bit of a stretch in my opinion. BEAM does not have that level of single-threaded performance. There's no point in straining to make BEAM fully utilize 8 CPUs for this sort of parallel performance when all that does is get you back to where a single thread of Rust already runs.

(I think this is an underappreciated aspect of trying to speed things up with multiple CPUs. There's no point straining to get 8 CPUs running in some sort of complicated perfect synchronization in your slow-ish language when you could just write the same thing in a compiled language and get it on one CPU. I particularly look at the people who think that GIL removal in Python is a big deal for performance and wonder what they're thinking... a 32-core machine parallelizing Python code perfectly, with no overhead, might still be outperformed by a single-core Go process and would almost certainly be beaten by a single-core Rust process. And perfect parallelization across 32 cores is a pipe dream. Unless you've already maxed out single-core performance, you don't need complicated parallelization, you need to write in a faster language to start with.)
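
A back-of-envelope version of that argument, with assumed numbers rather than measurements (suppose a pure-Python numeric loop is ~50x slower than the same loop in Rust, and 32-core parallelization is perfect, which it never is):

  assumed_python_vs_rust_slowdown = 50   # illustrative assumption, not a benchmark
  cores = 32

  effective_rust_cores = cores / assumed_python_vs_rust_slowdown
  print(effective_rust_cores)            # 0.64 -- still less than one Rust core's worth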

nailer•3mo ago
Was trying to remember where I recognised this name: Raph Levien is the Ghostscript and Advogato creator and helped legalize crypto: https://en.wikipedia.org/wiki/Raph_Levien
zhethoughts•3mo ago
https://docs.ray.io/en/latest/
coffeeaddict1•3mo ago
I was expecting the author to at least mention Halide https://halide-lang.org/.