Understand CPU Branch Instructions Better

https://chrisfeilbach.com/2025/07/05/understand-cpu-branch-instructions-better/
77•mfiguiere•7mo ago

Comments

noone_youknow•7mo ago
Nice article! Always good to see easy-to-follow explainers on these kinds of concepts!

One minor nit, for the “odd corner case that likely never exists in real code” of taken branches to the next instruction, I can think of at least one example where this is often used: far jumps to the next instruction with a different segment on x86[_64] that are used to reload CS (e.g. on a mode switch).

Aware that’s a very specific case, but it’s one that very much does exist in real code.

chrisfeilbach•7mo ago
Author here. I'll work this in. Thank you.
a_void_sky•7mo ago
It's such a fascinating thing that most people just ignore. I too wrote (using AI) an article on branch prediction after I found out that most of my team members had only read about it in college but never understood it.
djmips•7mo ago
Weird cookie policy on that blog?
chrisfeilbach•7mo ago
What's weird about it? It's the standard WordPress cookie policy.
djmips•7mo ago
I couldn't choose, like I can on most sites.
chrisfeilbach•7mo ago
Weird, I'll investigate tomorrow, thank you.
msk-lywenn•7mo ago
I clicked "learn more" and then I got a "disagree" button. Not really the most intuitive flow but it's there...
vincent-manis•7mo ago
This is why the old-fashioned university course on assembly language is still useful. Writing assembly language (preferably for a less-complex architecture, so the student doesn't get bogged down on minutiae) gives one a gut feeling for how machines work. Running the program on a simulator that optionally pays attention to pipeline and cache misses can help a person understand these issues.

It doesn't matter what architecture one studies, or even a hypothetical one. The last significant application I wrote in assembler was for System/370, some 40 years ago. Yet CPU ISAs of today are not really that different, conceptually.

saagarjha•7mo ago
ISAs have not changed, sure. Microarchitectures are completely different and basically no school is going to teach you anything useful for that.
dragontamer•7mo ago
> Yet CPU ISAs of today are not really that different, conceptually.

CPU true.

GPU no. It's not even the instructions that are different, but I would suggest studying up on GPU loads/stores.

GPUs have fundamentally altered how loads/stores work. Yes, it's a SIMD load (aka gather operation), which has been around since the 80s. But the routing of that data includes highly optimized broadcast patterns and/or butterfly routing or crossbars (which allows for an arbitrary shuffle within log2(n)).

Load(same memory location) across GPU threads (or SIMD lanes) compiles as a single broadcast.

Load(consecutive memory location) across consecutive SIMD lanes is also efficient.

Load(arbitrary) is doable but slower. The crossbar will be taxed.
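
One rough way to picture the three cases above is a toy model in plain C. This is not GPU code and not how any particular hardware decides anything. The names `load_pattern` and `classify_load` are invented for illustration. The sketch only looks at the byte address each lane asks for and labels the request as a broadcast, a run of consecutive elements, or a general gather.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the three access patterns described above. */
typedef enum {
    LOAD_BROADCAST,   /* every lane reads the same address */
    LOAD_CONSECUTIVE, /* lane i reads base + i * elem_size */
    LOAD_ARBITRARY    /* anything else: a general gather */
} load_pattern;

/* Toy model only: classify the per-lane byte addresses of one SIMD load.
 * Real hardware handles this in the memory pipeline, not in software. */
load_pattern classify_load(const uintptr_t *addr, size_t lanes, size_t elem_size)
{
    assert(addr != NULL && lanes > 0 && elem_size > 0);

    int same = 1, consecutive = 1;
    for (size_t i = 1; i < lanes; i++) {
        if (addr[i] != addr[0])
            same = 0;
        if (addr[i] != addr[0] + i * elem_size)
            consecutive = 0;
    }
    if (same)
        return LOAD_BROADCAST;
    if (consecutive)
        return LOAD_CONSECUTIVE;
    return LOAD_ARBITRARY;
}
```

For four lanes of 4-byte elements, the addresses {64, 64, 64, 64} classify as a broadcast, {64, 68, 72, 76} as consecutive, and {64, 200, 8, 76} as arbitrary, which is the case the comment above describes as taxing the crossbar.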

PerryStyle•7mo ago
Do you have any good resources that go into detail on GPU ISAs or GPU architecture? There's certainly a lot available for CPUs, but the resources I’ve found for GPUs mostly focus on how they differ from CPUs and how their ISAs are tailored to the GPU's specific goals.
grg0•7mo ago
Unfortunately this is a topic that isn't open enough, and architectures change rather quickly so you're always chasing the rabbit. That being said:

RDNA architecture (a few gens old) slides have some breadcrumbs: https://gpuopen.com/download/RDNA_Architecture_public.pdf

AMD also publishes its ISAs, but I don't think you'll be able to extract much from a reference-style document: https://gpuopen.com/amd-gpu-architecture-programming-documen...

Books on CUDA/HIP also go into some detail of the underlying architecture. Some slides from NV:

https://gfxcourses.stanford.edu/cs149/fall21content/media/gp...

Edit: I should say that Apple also publishes decent stuff. See the link here and the stuff linked at the bottom of the page. But note that now you're in UMA/TBDR territory; discrete GPUs work considerably differently: https://developer.apple.com/videos/play/wwdc2020/10602/

If anyone has more suggestions, please share.

xelxebar•7mo ago
Branch Education apparently decapped and scanned a GA102 (Nvidia 30 series) for the following video: https://www.youtube.com/watch?v=h9Z4oGN89MU. The beginning is very basic, but the content ramps up quickly.
dragontamer•7mo ago
I assume most people learn microarchitecture for performance reasons.

At which point, the question you are really asking is what aspects of assembly are important for performance.

Answer: there are multiple GPU Matrix Multiplication examples covering channels (especially channel conflicts), load/store alignment, memory movement and more. That should cover the issue I talked about earlier.

Optimization guides help. I know it's 10+ years old, but I think AMD's OpenCL optimization guide was easy to read and follow, and still modern enough to cover most of today's architectures.

Beyond that, you'll have to watch conference talks about the new DirectX 12 instructions (wave instructions, ballot/voting, etc.) and their performance implications.

It's a mixed bag, everyone knows one or two ways of optimization but learning all of them requires lots of study.

vincent-manis•6mo ago
One of the Hennessy/Patterson books has coverage of GPUs. But the definitive description of GPUs at a conceptual undergrad level has yet to be written, I think.

As for microarchitectures, for the most part those have evolved over the last 50 years. If you are looking to extract the last bit of speed out of a CPU, the only complete resource is the documentation. But pipelines and caches have been around for a long time; look at the System/360 Model 85 and 91. A C programmer can use a fairly primitive conceptual model of a microarchitecture to get a pretty good first-order approximation of performance issues.

I am not saying that we should teach undergrads with 1980s materials. I am saying that a good understanding of pretty much any assembly language can get a programmer pretty far.

hinkley•7mo ago
Intro CompE class does a good bit for mechanical sympathy as well.
IshKebab•7mo ago
I don't think we had out of order designs with speculative execution 40 years ago? That seems like a pretty huge change.
flohofwoe•7mo ago
These are mostly internal implementation details, instructions still appear to resolve in order from the outside (with some subtle exceptions for memory read/writes depending on the CPU architecture). It may become important to know such details for performance profiling though.

What has drastically changed is that you cannot do trivial 'cycle counting' anymore.

tucnak•7mo ago
Not to step on your toes, but it shall be said that instructions in a CPU "retire" in order.
peterfirefly•7mo ago
They don't even always do that anymore.
IshKebab•7mo ago
Depends what you mean by "retire" but by the normal definition they always retire in order, even in OoO CPUs. You might be thinking of writeback.
chrisfeilbach•6mo ago
Hey, author here. Tomasulo's algorithm is a means of out-of-order execution, and it was invented and implemented by IBM in the 1960s. It was designed for their floating-point operations, specifically. I don't remember if they also implemented speculative execution.
jdougan•7mo ago
Did you teach the UBC CS systems programming course in 1985?
vincent-manis•6mo ago
Yes, I did.
jdougan•6mo ago
I was likely one of your students. I took the course around then, but don't have my records accessible. It was definitely worthwhile going to the low level.
vincent-manis•6mo ago
You undoubtedly were in my class if you did it in 1985, and I am glad you got value from it.
o11c•7mo ago
Decent intro, though nothing new.

A couple useful points it lacks:

* `switch` statements can be lowered in two different ways: using a jump table (an indirect branch, only possible when values are adjacent; requires a highly-predictable branch to check the range first), or using a binary search (multiple direct branches). Compilers have heuristics to determine which should be used, but I haven't played with them.

* You may be able to turn an indirect branch into a direct branch using code like the following:

  if (function_pointer == expected_function)
    expected_function();
  else
    (*function_pointer)();

* It's generally easy to turn tail recursion into a loop, but it takes effort to design your code to make that possible in the first place. The usual Fibonacci example is a good basic intro; tree-walking is a good piece of homework.

* `cmov` can be harmful (since it has to compute both sides) if the branch is even moderately predictable and/or if the less-likely side has too many instructions. That said, from my tests, compilers are still too hesitant to use `cmov` even for cases where yes I really know dammit. OoO CPUs are weird to reason about but I've found that due to dependencies between other instructions, there are often some execution ports to spare for the other side of the branch.
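
The two `switch` lowerings from the first point can be imitated by hand in C, which may make the trade-off easier to see. This is only a sketch: the handler names, case values, and return values are invented, and a compiler would typically emit the binary search as an unrolled tree of compares rather than a loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handlers standing in for the bodies of the switch cases. */
static int on_zero(void)  { return 10; }
static int on_one(void)   { return 11; }
static int on_two(void)   { return 12; }
static int on_other(void) { return -1; }

/* Dense case values 0..2: one range check, then one indirect call
 * through a table. This mirrors the jump-table lowering. */
int dispatch_table(unsigned v)
{
    static int (*const table[])(void) = { on_zero, on_one, on_two };
    if (v >= sizeof table / sizeof table[0])  /* predictable range check */
        return on_other();
    return table[v]();                        /* indirect branch */
}

/* Sparse case values: a binary search built from direct conditional
 * branches. This mirrors the compare-tree lowering. */
int dispatch_search(int v)
{
    static const int keys[]   = { 3, 40, 500, 6000, 70000 };
    static const int values[] = { 1, 2, 3, 4, 5 };
    size_t lo = 0, hi = sizeof keys / sizeof keys[0];

    assert(sizeof keys / sizeof keys[0] == sizeof values / sizeof values[0]);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (keys[mid] == v)
            return values[mid];
        if (keys[mid] < v)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;  /* default case */
}
```

`dispatch_table` costs one compare plus one indirect branch regardless of the value, while `dispatch_search` takes a handful of direct branches whose predictability depends on how the inputs are distributed.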

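For the tail-recursion point, the usual Fibonacci example might look like the sketch below. The function names are invented for illustration. The recursive version carries all of its state in accumulator arguments, which is the design effort mentioned above, and the loop version performs the same argument update once per iteration.

```c
#include <assert.h>
#include <stdint.h>

/* Tail-recursive form: the recursive call is the last thing the function
 * does, and all state travels in the accumulator arguments a and b. */
static uint64_t fib_tail(unsigned n, uint64_t a, uint64_t b)
{
    if (n == 0)
        return a;
    return fib_tail(n - 1, b, a + b);
}

uint64_t fib_recursive(unsigned n)
{
    assert(n <= 93);  /* fib(94) no longer fits in uint64_t */
    return fib_tail(n, 0, 1);
}

/* The same computation as a loop: each iteration performs the argument
 * update that the tail call would have performed. */
uint64_t fib_loop(unsigned n)
{
    assert(n <= 93);
    uint64_t a = 0, b = 1;
    while (n-- > 0) {
        uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

Both `fib_recursive(10)` and `fib_loop(10)` return 55. The loop form replaces a call and return per step with one backward conditional branch that is taken on every iteration except the last.
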
chrisfeilbach•7mo ago
Author here. You can only write so much before you start to lose the audience -- do you believe that anything you mentioned in your list is inherently lacking from my post?

Cool trick with the function pointer comparison!
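
A slightly fuller version of the function pointer comparison, as a minimal sketch: `op_fn`, `add_one`, `negate`, and `call_guarded` are hypothetical names, and whether the guard pays off depends on how often the expected target is actually the one passed in.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*op_fn)(int);

/* Hypothetical callees: the common case and a rare one. */
static int add_one(int x) { return x + 1; }
static int negate(int x)  { return -x; }

/* Guarded call: if the pointer matches the target we expect, the call is
 * written as a direct call, which the compiler is also free to inline.
 * Otherwise fall back to the ordinary indirect call. */
int call_guarded(op_fn fn, int arg)
{
    assert(fn != NULL);
    if (fn == add_one)
        return add_one(arg);   /* direct branch */
    return fn(arg);            /* indirect branch */
}
```

`call_guarded(add_one, 41)` returns 42 through the direct path, and `call_guarded(negate, 5)` returns -5 through the indirect one. The comparison is itself a conditional branch that has to be predicted, so the trick trades an indirect-target prediction for a taken/not-taken prediction.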

xelxebar•7mo ago
Good material, targeted at undergraduate or advanced high school level.

I've been slowly reading Agner Fog's resources. The microarchitecture manual is incredible, and pertinently, the section on branch prediction algorithms I find fascinating:

https://web.archive.org/web/20250611003116/https://www.agner...

HelloNurse•7mo ago
> A function always has a single entry point in a program (at least, I don’t know of any exceptions to this rule)

We can consider distinct entry points as distinct functions, but it doesn't mean that different functions cannot overlap, sharing code in general and return statements. Feasibility depends on calling conventions, which are outside the topic of the article.

chrisfeilbach•7mo ago
I'll add the word 'common' before exceptions. Thanks for the feedback.