

JIT: So you want to be faster than an interpreter on modern CPUs

https://www.pinaraf.info/2025/10/jit-so-you-want-to-be-faster-than-an-interpreter-on-modern-cpus/
170•pinaraf•3mo ago

Comments

gr4vityWall•3mo ago
That was a pretty interesting read.

My take is that you can get pretty far these days with a simple bytecode interpreter. Food for thought if your side project could benefit from a DSL!

stmw•3mo ago
As I noted in another comment, I would like to persuade you that this is not the right take-away.
stmw•3mo ago
Good read. But a word of caution - the "JIT vs interpreter" comparisons often favor the interpreter when the JIT is implemented as more-or-less simple inlining of the interpreter code. (Here called "copy-and-patch", but a decades-old approach.) I've had fairly senior engineers try to convince me that this is true even for Java VMs. It's not in general, at least not with the right kind of JIT compiler design.
hoten•3mo ago
I just recently upgraded[1] a JIT that essentially compiled each bytecode separately to one that shares registers within the same basic block. Easy 40 percent improvement to runtime, as expected.

But something I hadn't expected was it also improved compilation time by 40 percent too (fewer virtual registers made for much faster register allocation).

[1] https://github.com/ZQuestClassic/ZQuestClassic/commit/68087d...

chromatic•3mo ago
This is an embarrassing context to admit, but here goes.

Back when Parrot was a thing and the Perl 6 people were targeting it, I profiled the prelude of Perl 6 to optimize startup time and discovered two things:

- the first basic block of the prelude was thousands of instructions long (not surprising)
- the compiler had to allocate thousands of registers because the prelude instructions used virtual registers

The prelude emitted two instructions, one right after another: load a named symbol from a library, then make it available. I forget all of the details, but each of those instructions used one string register and one PMC register. Because register allocation used the dominance-frontier method, the size of the basic block and the total number of symbolic registers dominated the algorithm.

I suggested a change to the prelude emitter to reuse actual registers and avoid virtual registers and compilation sped up quite a bit.

_cogg•3mo ago
Yeah, I expect the real advantage of a JIT is that you can perform proper register allocation and avoid a lot of stack and/or virtual register manipulation.

I wrote a toy copy-patch JIT before and I don't remember being impressed with the performance, even compared to a naive dispatch loop, even on my ~11 year old processor.

stmw•3mo ago
Exactly, and it's not just register allocation: for many languages it's also adding proper typing, some basic data-flow optimization, some constant folding, and a few other things that can be done fairly quickly, without the full set of trees and progressive lowering of the operators down to instructions.

What's odd about the "JIT vs interpreter" debate is that it keeps coming up, given that the difference is fairly easy to see even in toy examples.

ack_complete•3mo ago
The difference between interpreters and simple JITs has narrowed partly due to two factors: better indirect branch predictors with global history, and wider execution bandwidth to absorb the additional dispatch instructions. Intel CPUs starting with Haswell, for instance, show less branch misprediction impact due to better ability to predict jump path patterns through the interpreter. A basic jump table no longer suffers as much compared to tail-calling/dispatch or a simple splicing JIT.
stmw•3mo ago
Turns out one of the classic papers on this is available for those interested in this discussion - https://news.ycombinator.com/item?id=45582127
klipklop•3mo ago
A shame operating systems like iOS/iPadOS do not allow JIT. iPad Pros have such fast CPUs that you can't even use them fully because of decisions like this.
Pulcinella•3mo ago
Those operating systems allow it, but Apple does not. Agree that it is a total waste.
duped•3mo ago
What advantage does JIT compilation have over Swift or Obj-C?
saagarjha•3mo ago
It speeds up interpreted languages.
Pulcinella•3mo ago
And emulation.
saagarjha•3mo ago
What is an architecture but a scripting language to interpret? ;)
duped•3mo ago
I get that, but what interpreted language do you want to write iOS apps in when there's Swift and Obj-C right there, with bespoke support and tooling from Apple?

And if you care about performance, why aren't you writing that code in native to begin with?

saagarjha•3mo ago
Why would you care about faster cars when planes exist?
ranger_danger•3mo ago
https://0x0.st/XJZT.jpg
ranger_danger•3mo ago
...Javascript.
bencyoung•3mo ago
JIT compilation can be faster for compiled languages too, as it allows data-driven inlining and devirtualization, as well as "effective constant" propagation and runtime architecture feature detection
varjag•3mo ago
It can be but it never is.
ignoramous•3mo ago
To re-optimize compiled code blocks isn't without effort. Google has publicly spoken about AutoFDO and Propeller [0], after Meta had open sourced BOLT [1] in 2021.

AutoFDO has since been ported to Android and adopted by Yandex [2].

[0] https://lwn.net/Articles/995397/

[1] https://news.ycombinator.com/item?id=40868224

[2] https://news.ycombinator.com/item?id=42896716

__s•3mo ago
Especially after PGO (profiling guided optimization) gets most of the way there
Rohansi•3mo ago
But is that because of JIT compilation or other decisions for how the language should work (dynamic typing, GC, etc.)?
ranger_danger•3mo ago
Hard disagree. Many newer game system emulators (32-bit and up) rely on JIT or "dynarecs" to get playable speeds, and they pretty much all use high-performance compiled languages already. They often double their interpreter's performance, or more.
duped•3mo ago
Is there a production JIT for a compiled language that is actually faster? I understand the theory, I don't think the practice backs it up.
fragmede•3mo ago
Depends, what do you consider Java?
varjag•3mo ago
Java is certainly not the fastest language out there.
pcwalton•3mo ago
Sure, but the relevant comparison isn't between languages: it's between a state-of-the-art JIT implementation of one language and a likewise-state-of-the-art AOT implementation of the same language. Unfortunately there aren't many examples of this; most languages have a preferred implementation strategy that receives much more effort than the other one.
pcwalton•3mo ago
I believe HotSpot is usually faster than GCJ.
almostgotcaught•3mo ago
> that you cant even use fully because of decisions like this.

Have no clue what this means - you can pre-compile for target platforms and therefore "fully" use whichever Apple device CPU.

pjc50•3mo ago
You can't precompile your web page's Javascript for iOS, even if you're willing to have it signed and notarized and submitted for policy review.
tgv•3mo ago
Apart from the fact that its JS engine is really fast, Safari accepts WebAssembly. What else would you precompile it to?
true_religion•3mo ago
Originally, iOS Safari handled WASM but only with JIT disabled.

However, the EU decreed that it must allow fair competition, leading Apple to claim that it will enable JIT for authorized developers: https://developer.apple.com/support/alternative-browser-engi...

But I'm not sure that they have done so...

Mozilla: https://github.com/mozilla/platform-tilt/issues/3

Chrome: https://issues.chromium.org/issues/42203058

pjscott•3mo ago
They do, technically, allow JIT. You need a very hard-to-obtain entitlement that lets you turn writable pages into executable read-only pages, and good luck getting that entitlement if (for some reason) your name isn’t “mobilesafari”, but the capability exists.
Wowfunhappy•3mo ago
When you say it's "hard" to obtain--is it possible to obtain if you aren't Apple? Does Apple ever provide it to third party developers, or is there even a path to requesting it?
j16sdiz•3mo ago
Yes.
cmeacham98•3mo ago
Source? Is there any non-Apple app that has this entitlement?
Rohansi•3mo ago
If your app happens to be a browser that's only usable in the EU then:

https://developer.apple.com/documentation/browserenginekit/p...

gf000•3mo ago
I believe the Delta emulator has JIT support, but possibly only when installed as a developer.
Wowfunhappy•3mo ago
As far as I can tell, you need to connect your phone to a PC running software which enables JIT by exploiting a feature intended for remote debugging. https://faq.altstore.io/altstore-classic/enabling-jit
ivankra•3mo ago
They allow, but Apple's policy is to lock down that ability pretty much just to Safari/WKWebView. If you could transpile/compile your program to JS or WASM and run it through one of these blessed options, it should get JIT'ted.
xxs•3mo ago
Of course they do - this is how javascript in any site works
imtringued•3mo ago
I'm not really interested in building an interpreter, but the part about scalar out of order execution got me thinking. The opcode sequencing logic of an interpreter is inherently serial and an obvious bottleneck (step++; goto step->label; requires an add, then a fetch and then a jump, pretty ugly).

Why not do the same thing the CPU does and fetch N jump addresses at once?

Now the overhead is gone and you just need to figure out how to let the CPU fetch the chain of instructions that implement the opcodes.

You simply copy the interpreter N times, store N opcode jump addresses in N registers and each interpreter copy is hardcoded to access its own register during the computed goto.

saagarjha•3mo ago
You run into the same problem a CPU does: if you have dependencies between the instructions, you can't execute ahead of time. Your processor has a bunch of hardware to efficiently resolve conflicts but your interpreter does not.
titzer•3mo ago
Depending on the bytecode, instructions might be variable-length, which means that you need to execute a nontrivial amount of logic to fetch more than just the next bytecode or handler. That said, I tinkered with adding a prefetch to Wizard's interpreter which basically moves the load of the next handler from the dispatch at the end to the first thing in the handler, and saw something like a 5% improvement.
stmw•3mo ago
The thing you're suggesting makes sense, but it's far more efficient to do in hardware. You might say that you could do it on one of the many cores available on your modern processor, but it turns out that synchronizing them to your main thread is really inefficient -- and anyway, they're busy running your HN browser threads and your YouTube music video.
gary_0•3mo ago
> This is called branch prediction, it has been the source of many fun security issues...

No, that's speculative execution you just described. Branch prediction was implemented long before out-of-order CPUs were a thing, as you need branch prediction to make the most of pipelining (eg. fetching and decoding a new instruction while you're still executing the previous one--if you predict branches, you're more likely to keep the pipeline full).

Arnavion•3mo ago
Speculative execution does not require out-of-order execution. When you predict a branch, you're speculatively executing the predicted branch. Whether you're doing it in the same order as instruction order or out of order is independent of that.
Wowfunhappy•3mo ago
If you're executing instructions in order, wouldn't you already know the result of the branch by the time you reach its code?
Sesse__•3mo ago
You're starting them in order and you're ending (retiring) them in order, but you're not necessarily ending one instruction before you're starting the next one. For instance, in a very simple pipeline, you can start decoding the next instruction before you've completed the previous one, so you can do some work in parallel.
Arnavion•3mo ago
The fetch stage of the pipeline will have needed to predict the branch N cycles before the execute stage of the pipeline actually gets around to evaluating it, in order to continue fetching the post-branch instructions. Without branch prediction the fetch stage would need to stall until that happens, which decreases throughput. The point of branch prediction and the subsequent speculative execution is to optimistically avoid that stall.
gary_0•3mo ago
A good explanation of branch prediction: https://danluu.com/branch-prediction/
gary_0•3mo ago
The article is talking about OoO which is why I mentioned it. My point is that branch prediction and speculative execution are different things. You can do speculative execution without a branch predictor (run both branches and throw out the one that's wrong).
Arnavion•3mo ago
You're right, I missed the article specifically mentions Meltdown in that sentence, not Spectre.
monocasa•3mo ago
Essentially all microarchtectural state is fodder for side channel exploits.

Static branch prediction like "predict taken if negative branch offset" doesn't leak anything, but just about any dynamically updated tables will (almost tautologically) contain statistical information about what was executed recently.

neerajsi•3mo ago
From the previous article in the series, it looks like the biggest impediment to just using full LLVM to compile the query is that they didn't find a good way to cache the results across invocations.

SQL Server Hekaton punted this problem in a seemingly effective way by requiring the client to use stored procedures to get full native compilation. I'm not sure, though, whether they recompile when the table statistics indicate a different query plan is needed.

scrash•3mo ago
The issues with branch prediction aren't really as much of a thing in modern interpreters, I can really recommend reading https://inria.hal.science/hal-01100647/document
titzer•3mo ago
The paper is 10 years old. While the gap between a threaded interpreter (a dispatch at the end of every handler) and a non-threaded one (a loop over a switch) isn't as big as it used to be, it's still 15-30% on modern very fast interpreters. For example, I measured between 14 and 29% performance improvement for threading Wizard's interpreter[1].

[1] https://dl.acm.org/doi/10.1145/3563311

scrash•3mo ago
Interesting paper :) I've kept choosing threaded myself, but would have put the gap in a 5-10% range. I guess the branch predictor hasn't kept up. (Also trying to resist getting nerdsniped into measuring it myself 0_0)
titzer•3mo ago
Testing this in Wizard is fairly easy.

Compare the running speed of the two binaries built with different options:

    % V3C_OPTS=-redef-field=FastIntTuning.threadedDispatch=true ./build.sh wizeng x86-64-linux

    % bin/wizeng.x86-64-linux --mode=int test/microbench/100ms/fib.wasm


    % V3C_OPTS=-redef-field=FastIntTuning.threadedDispatch=false ./build.sh wizeng x86-64-linux

    % bin/wizeng.x86-64-linux --mode=int test/microbench/100ms/fib.wasm
adzm•3mo ago
I was surprised to learn that postgresql does not have a compiled plan / query cache.

Also I recommend reading the previous blog post first, then this one, for additional context: https://www.pinaraf.info/2024/03/look-ma-i-wrote-a-new-jit-c...

trashface•3mo ago
Doesn't work for every case, but I think for a lot of cases nowadays, if you are using an interpreter and it's slow, you should just generate WebAssembly. Libraries like walrus for Rust make this pretty easy to do, and wasmtime provides a serviceable standalone runtime. For my little language, recursive fib(40) executes in Firefox with wasm in about 600ms. My interpreter basically can't finish it.