
Part 1 the Persistent Vault Issue: Your Encryption Strategy Has a Shelf Life

1•PhantomKey•1m ago•0 comments

Teleop_xr – Modular WebXR solution for bimanual robot teleoperation

https://github.com/qrafty-ai/teleop_xr
1•playercc7•4m ago•1 comments

The Highest Exam: How the Gaokao Shapes China

https://www.lrb.co.uk/the-paper/v48/n02/iza-ding/studying-is-harmful
1•mitchbob•8m ago•1 comments

Open-source framework for tracking prediction accuracy

https://github.com/Creneinc/signal-tracker
1•creneinc•10m ago•0 comments

India's Sarvam AI launches Indic-language-focused models

https://x.com/SarvamAI
1•Osiris30•11m ago•0 comments

Show HN: CryptoClaw – open-source AI agent with built-in wallet and DeFi skills

https://github.com/TermiX-official/cryptoclaw
1•cryptoclaw•14m ago•0 comments

Show HN: Make OpenClaw respond in Scarlett Johansson's AI voice from the film Her

https://twitter.com/sathish316/status/2020116849065971815
1•sathish316•16m ago•1 comments

CReact Version 0.3.0 Released

https://github.com/creact-labs/creact
1•_dcoutinho96•18m ago•0 comments

Show HN: CReact – AI Powered AWS Website Generator

https://github.com/creact-labs/ai-powered-aws-website-generator
1•_dcoutinho96•19m ago•0 comments

The rocky 1960s origins of online dating (2025)

https://www.bbc.com/culture/article/20250206-the-rocky-1960s-origins-of-online-dating
1•1659447091•24m ago•0 comments

Show HN: Agent-fetch – Sandboxed HTTP client with SSRF protection for AI agents

https://github.com/Parassharmaa/agent-fetch
1•paraaz•25m ago•0 comments

Why there is no official statement from Substack about the data leak

https://techcrunch.com/2026/02/05/substack-confirms-data-breach-affecting-email-addresses-and-pho...
5•witnessme•29m ago•1 comments

Effects of Zepbound on Stool Quality

https://twitter.com/ScottHickle/status/2020150085296775300
2•aloukissas•33m ago•1 comments

Show HN: Seedance 2.0 – The Most Powerful AI Video Generator

https://seedance.ai/
2•bigbromaker•36m ago•0 comments

Ask HN: Do we need "metadata in source code" syntax that LLMs will never delete?

1•andrewstuart•42m ago•1 comments

Pentagon cutting ties w/ "woke" Harvard, ending military training & fellowships

https://www.cbsnews.com/news/pentagon-says-its-cutting-ties-with-woke-harvard-discontinuing-milit...
6•alephnerd•44m ago•2 comments

Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? [pdf]

https://cds.cern.ch/record/405662/files/PhysRev.47.777.pdf
1•northlondoner•45m ago•1 comments

Kessler Syndrome Has Started [video]

https://www.tiktok.com/@cjtrowbridge/video/7602634355160206623
2•pbradv•47m ago•0 comments

Complex Heterodynes Explained

https://tomverbeure.github.io/2026/02/07/Complex-Heterodyne.html
4•hasheddan•48m ago•0 comments

MemAlign: Building Better LLM Judges from Human Feedback with Scalable Memory

https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
1•superchink•1h ago•0 comments

CCC (Claude's C Compiler) on Compiler Explorer

https://godbolt.org/z/asjc13sa6
2•LiamPowell•1h ago•0 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
28•duxup•1h ago•6 comments

Actors with Tokio (2021)

https://ryhl.io/blog/actors-with-tokio/
1•vinhnx•1h ago•0 comments

Can graph neural networks for biology realistically run on edge devices?

https://doi.org/10.21203/rs.3.rs-8645211/v1
1•swapinvidya•1h ago•1 comments

Deeper into the sharing of one air conditioner for 2 rooms

1•ozzysnaps•1h ago•0 comments

Weatherman introduces fruit-based authentication system to combat deep fakes

https://www.youtube.com/watch?v=5HVbZwJ9gPE
3•savrajsingh•1h ago•0 comments

Why Embedded Models Must Hallucinate: A Boundary Theory (RCC)

http://www.effacermonexistence.com/rcc-hn-1-1
1•formerOpenAI•1h ago•2 comments

A Curated List of ML System Design Case Studies

https://github.com/Engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies
3•tejonutella•1h ago•0 comments

Pony Alpha: New free 200K context model for coding, reasoning and roleplay

https://ponyalpha.pro
1•qzcanoe•1h ago•1 comments

Show HN: Tunbot – Discord bot for temporary Cloudflare tunnels behind CGNAT

https://github.com/Goofygiraffe06/tunbot
2•g1raffe•1h ago•0 comments

GPU-Driven Clustered Forward Renderer

https://logdahl.net/p/gpu-driven
116•logdahl•8mo ago

Comments

zeristor•8mo ago
Apostrophe as a number separator?

Where’s that from?

dahart•8mo ago
Switzerland and Italy for two. https://en.wikipedia.org/wiki/Decimal_separator#

Also note C++14 introduced the apostrophe in numeric literals! https://en.cppreference.com/w/cpp/language/integer_literal
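A quick, compilable sketch of what that looks like (the grouping is purely cosmetic; the compiler ignores the apostrophes):

    #include <cstdint>
    #include <iostream>

    int main() {
        // C++14: ' is a digit separator in numeric literals.
        std::int64_t a = 1'000'000;
        std::int64_t b = 1000000;
        std::cout << (a == b) << "\n";  // prints 1

        // Works in hex and binary literals too:
        std::uint32_t mask = 0xFF'FF'00'00;
        std::uint8_t  bits = 0b1010'0101;
        std::cout << mask << " " << +bits << "\n";
    }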

lacoolj•8mo ago
Learn something new every day.

And I would never have known this existed without hackernews

logdahl•8mo ago
Interesting that Sweden explicitly does NOT use it... Not sure where I picked it up! :-)
qingcharles•8mo ago
I've started using the underscore in my code, since it is becoming the (non-localized) standard and it's trendy:

https://en.wikipedia.org/wiki/Integer_literal#Digit_separato...

m-schuetz•8mo ago
Apostrophes are nice because they are not ambiguous. Started using them myself after getting used to them in C++ and learning that they are used in Switzerland.
unclad5968•8mo ago
This is awesome! At the end you mention that the 27k dragons and 10k lights just barely fit in 16ms. Do you see any paths to improve performance? I've seen some demos with tens/hundreds of thousands of moving lights, but it's hard to tell if they're legit or highly constrained. I'm not a graphics programmer by trade.

I need a renderer for a personal project and after some research decided I'll implement a forward clustered renderer as well.
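For anyone taking the same route: the heart of clustered forward is mapping each fragment to a cluster (a screen tile plus a logarithmic depth slice) and shading with that cluster's light list. A toy CPU-side version of the mapping; the grid dimensions and near/far planes here are invented, not the OP's:

    #include <cmath>
    #include <cstdio>

    // Hypothetical cluster grid: 16x9 screen tiles, 24 depth slices.
    const int   GRID_X = 16, GRID_Y = 9, GRID_Z = 24;
    const float kNear = 0.1f, kFar = 1000.0f;

    // Map a fragment (pixel coords + view-space depth) to its cluster.
    int clusterIndex(float px, float py, float viewZ, int w, int h) {
        int x = (int)(px / w * GRID_X);
        int y = (int)(py / h * GRID_Y);
        // Logarithmic slicing keeps clusters roughly cube-shaped:
        int z = (int)(std::log(viewZ / kNear) / std::log(kFar / kNear) * GRID_Z);
        return x + GRID_X * (y + GRID_Y * z);
    }

    int main() {
        // Fragment at the center of a 1920x1080 frame, 50 units deep.
        std::printf("%d\n", clusterIndex(960.f, 540.f, 50.f, 1920, 1080));
    }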

logdahl•8mo ago
Well, the core issue is still drawing. I took another look at some profiles and it seems like it's not the renderer limiting this to 27k! I still had some stupid scene-graph traversal... Clustering and culling take 53us and 33us respectively, but the draw is 7ms. So a frame (on the GPU side) is like 7ms, and some 100-200us on the CPU side.

Should really dive deeper and update the measurements for final results...

godelski•8mo ago
I haven't looked at the post in the detail it deserves, but given your graphs, the workload looks pretty bursty. I'd suspect there are some good I/O optimizations or some predication. Definitely that last void main block looks ripe for that. But I'd listen to Knuth, premature optimization and all, so grab a profiler. I wouldn't be surprised if you're nearing peak performance. Also, NVIDIA GPUs have a lot of special tricks that can be exploited but are buried in documentation... if you haven't already seen it (I suspect you have), you'd be interested in "GPU Gems". Gems 2 has some good stuff on predication.

But also, really good work! You should be proud of this! Squeezing that much out of that hardware is no easy feat.
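To make "predication" concrete: compute both sides and select, so threads in a warp never diverge on a branch. A tiny compilable illustration; the light/shade names are invented, and this is not the OP's shader:

    #include <cstdio>

    struct Light { float radius; };
    static float shade(const Light& l) { return l.radius * 0.025f; }

    int main() {
        Light light{10.0f};
        float dist = 7.0f, color = 0.0f;

        // Branchy form -- a divergent warp serializes both paths:
        //   if (dist < light.radius) color += shade(light);

        // Predicated form: every thread does the same work and a
        // select keeps or discards it. Only a win when shade() is
        // cheap relative to the cost of divergence.
        float w = (dist < light.radius) ? 1.0f : 0.0f;
        color += w * shade(light);

        std::printf("%f\n", color);  // 0.250000
    }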

gmueckl•8mo ago
This seems fairly well optimized. There's probably room to squeeze out some more perf, but not dramatic improvements. Maybe preventing overdraw of shaded pixels by doing a depth prepass would help.

Without digging into the detailed breakdown, I would assume that the sheer number of teeny tiny triangles is the main bottleneck in this benchmark scene. When triangles become smaller than about 4x4 pixels, GPU utilization for rasterization starts to diminish. And with the scaled-down dragons, there are a lot of them in the frame.
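For reference, a depth prepass in GL-style pseudocode. The GL calls are standard; drawScene() and the pass structure are just illustrative, and GL_EQUAL in the second pass assumes depth-invariant vertex shading between the two passes:

    // Pass 1: depth only. Color writes off, trivial fragment shader.
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glDepthMask(GL_TRUE);
    glDepthFunc(GL_LESS);
    drawScene();  // hypothetical: same draws, depth-only shader

    // Pass 2: full shading. The depth buffer already holds the
    // nearest surface, so every occluded fragment is rejected
    // before the expensive lighting ever runs.
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthMask(GL_FALSE);
    glDepthFunc(GL_EQUAL);
    drawScene();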

spookie•8mo ago
The tiny triangles are by far the biggest culprit, OP; look into this.

You can try to come up with imposters representing these far-away dragons, or simple LoD levels. Some games use particles to represent far-away, repeated "meshes" (Ghost of Tsushima does this for distant soldiers).

Lots of techniques in this area, ranging from simple to bananas. LoD levels alone can get you pretty far (a toy distance-based pick is sketched after the links below)! Of course, this comes at the cost of more distinct draw calls, so it is a balancing game.

Think about the topology too; hope these old gems help you get a grasp on the cost of this:

https://www.humus.name/index.php?page=Comments&ID=228

https://www.g-truc.net/post-0662.html
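A toy distance-based LoD pick, as promised above. The thresholds are invented, and an index past the last level means "draw the imposter instead of geometry":

    #include <cstdio>

    // Hypothetical LoD table: index 0 is the full mesh, higher
    // indices are coarser. Distances are invented for illustration.
    static const float lodDistance[] = { 20.0f, 60.0f, 150.0f };
    static const int   lodCount      = 3;

    int pickLod(float distToCamera) {
        for (int i = 0; i < lodCount; ++i)
            if (distToCamera < lodDistance[i]) return i;
        return lodCount;  // beyond the last level: imposter/billboard
    }

    int main() {
        std::printf("%d %d %d\n", pickLod(10.f), pickLod(100.f), pickLod(500.f));
        // -> 0 2 3  (3 = "draw the imposter instead of geometry")
    }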

logdahl•8mo ago
Yeah, I use LODs already, but as you say, even my lowest LOD is too many vertices far away. Imposter rendering seems very interesting but also completely bonkers (viewing angle, lighting)!
corysama•8mo ago
I've not sat down and watched this yet, but you might appreciate it. https://www.youtube.com/watch?v=DZfhbMc9w0Q Apparently Doom: The Dark Ages switched to Visibility Buffer rendering. Likely because it reduces issues with quad utilization. http://filmicworlds.com/blog/visibility-buffer-rendering-wit...
undefuser•8mo ago
Have you considered using a meshlet technique like Unreal Engine's Nanite or Assassin's Creed's? It could potentially open the door to better culling and a more effective depth prepass.
logdahl•8mo ago
Absolutely! I think this would likely be the next step.
zokier•8mo ago
Worth noting that the GTX 1070 is a nearly 10-year-old "mainstream" GPU. I'd imagine a 5090 or something could push the numbers a fair bit higher.
cullingculling•8mo ago
(GPU-driven) occlusion culling with meshlet rendering would help a lot while being relatively straightforward to implement if you already have a GPU-driven engine like OP does. Occlusion culling techniques cull objects that are completely hidden behind other objects. Meshlets break up objects (at asset build time) into tiny meshlets of around 64 to 128 triangles, such that these meshlets can be individually occlusion culled. This would help a lot by allowing the renderer to skip not just individual parts of the dragons that are hidden behind other dragons, but even parts of each dragon that are occluded by the rest of the dragon itself! There's a talk on YouTube about the Alan Wake 2 team implementing these techniques and being able to cull complex outdoor scenes of (iirc) hundreds of millions of triangles down to around 10-20 million.
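Concretely, a meshlet is just a small record the culling pass can test on its own; something like this hypothetical layout (not Alan Wake 2's):

    #include <cstdint>

    // Hypothetical meshlet record, built offline at asset time.
    // Each owns ~64-128 triangles and carries its own bounds so the
    // GPU can frustum- and occlusion-cull it independently.
    struct Meshlet {
        float    center[3];       // bounding sphere, object space
        float    radius;
        uint32_t vertexOffset;    // into the shared vertex buffer
        uint32_t triangleOffset;  // into the shared index buffer
        uint8_t  vertexCount;     // <= 64
        uint8_t  triangleCount;   // <= 124
    };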

The basic idea is to first render as normal some meshes that you either know are visible, or are likely to occlude objects in the scene (say the N closest objects, or some large terrain feature in a real game). Then you can take the resulting depth buffer and downsample it into something resembling a mipmap chain, but with each level holding the max depth of the contributing pixels, rather than the average. This is called a hierarchical Z (depth) buffer, or HZB for short. This can be used to very quickly, with just a few samples of the HZB, test if an object's bounding box is behind all the pixels in a given area and thus definitely not visible. The hierarchical nature of the HZB allows both small and large meshes to be tested at the same performance cost.
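In code, the test boils down to: pick the mip where the object's screen rect spans at most a couple of texels, take the max of four corner samples, and compare. A CPU-side sketch with all names invented, depth in [0,1] with smaller = closer, and a flat hzbSample stub standing in for the real max-depth mip chain:

    #include <cmath>
    #include <cstdio>
    #include <algorithm>

    // Stand-in HZB: pretend every occluder so far is at depth 0.5.
    float hzbSample(int, int, int) { return 0.5f; }

    // Conservative test: can a box covering screen rect [x0,x1]x[y0,y1]
    // with nearest depth minZ possibly be visible?
    bool maybeVisible(float x0, float y0, float x1, float y1, float minZ) {
        // Choose the mip where the rect spans <= ~2 texels, so four
        // corner samples cover it completely.
        float size = std::max(x1 - x0, y1 - y0);
        int mip = (int)std::ceil(std::log2(std::max(size, 1.0f)));
        int ax = (int)x0 >> mip, ay = (int)y0 >> mip;
        int bx = (int)x1 >> mip, by = (int)y1 >> mip;
        float occluderZ = std::max(
            std::max(hzbSample(mip, ax, ay), hzbSample(mip, bx, ay)),
            std::max(hzbSample(mip, ax, by), hzbSample(mip, bx, by)));
        // Hidden only if the box's nearest point is behind the
        // farthest occluder depth in that region.
        return minZ <= occluderZ;
    }

    int main() {
        // 100px box at depth 0.3, nearer than the 0.5 occluders: visible.
        std::printf("%d\n", maybeVisible(100, 100, 200, 200, 0.3f));
    }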

Typically, a game would track which meshlets were known to be visible last frame, and start by rendering all of those (with updated positions and camera orientation, of course). This will make up most of what is drawn to the scene, because typically objects and the camera change very little from frame to frame. Then all the meshlets that weren't known to be visible get tested against the HZB, and just the few that were revealed by changes in the scene will need to be rendered. Lastly, at some point the known visible meshlet set should be re-tested, so that it does not grow indefinitely with meshlets that are no longer visible.

The result is that the first frame rendered after a major camera change (like the player respawning) will be slow, as all the meshlets in the frustum need to be rendered. But after that, the scene can be narrowed down to just the meshlets that actually contributed to the frame, and performance improves significantly. I think this would be more than enough for a demo, but for a real game you would probably want to explore methods to speed up that first frame's rendering, like sorting objects and picking the N closest/largest ones so you can at least get some occlusion culling working.

fabiensanglard•8mo ago
This website has a beautiful layout ;) !
logdahl•8mo ago
Fun to see you ;) Love your site!
rezmason•8mo ago
Ten thousand lights! Your utility bill must be enormous
Flex247A•8mo ago
Lights in games use real electricity :)
amelius•8mo ago
Even the stars use real electricity.
cluckindan•8mo ago
Not really, nuclear fusion doesn’t run on electrons.
DiabloD3•8mo ago
So where does the magnetic field come from? ;) ;) ;)
cluckindan•8mo ago
Nuclear fusion produces a million times more energy from proton and neutron collisions than is produced by electron shells during the same event.
amelius•8mo ago
The energy leaves the star in the form of EM energy. This is also the energy that is responsible for electricity.
monster_truck•8mo ago
Am I missing a link somewhere, or is there no way to build/run this myself? Interested to see what a modern flagship GPU is good for.
wizzwizz4•8mo ago
> As some other renderers do, we share a single GPU buffer for all vertex data. Instead, we use a simple allocator which manages this contigous buffer automatically.

I'm not sure what this part is supposed to say, but it doesn't look right. "Instead" usually follows differences, not similarities.