
Project Patchouli: Open-source electromagnetic drawing tablet hardware

https://patchouli.readthedocs.io/en/latest/
251•ffin•6h ago•24 comments

A closer look at a BGP anomaly in Venezuela

https://blog.cloudflare.com/bgp-route-leak-venezuela/
163•ChrisArchitect•5h ago•61 comments

The Napoleon Technique: Postponing things to increase productivity

https://effectiviology.com/napoleon/
106•Khaine•3d ago•47 comments

Kernel bugs hide for 2 years on average. Some hide for 20

https://pebblebed.com/blog/kernel-bugs
188•kmavm•9h ago•73 comments

Open Infrastructure Map

https://openinframap.org
201•efskap•8h ago•45 comments

Mothers (YC X26) Is Hiring

https://jobs.ashbyhq.com/9-mothers
1•ukd1•7m ago

Anyone have experiences with Audio Induction Loops?

https://en.wikipedia.org/wiki/Audio_induction_loop
26•evolve2k•3d ago•6 comments

Eat Real Food

https://realfood.gov
880•atestu•18h ago•1193 comments

Lessons from Hash Table Merging

https://gist.github.com/attractivechaos/d2efc77cc1db56bbd5fc597987e73338
34•attractivechaos•5d ago•7 comments

Shipmap.org

https://www.shipmap.org/
640•surprisetalk•21h ago•102 comments

Go.sum is not a lockfile

https://words.filippo.io/gosum/
91•pabs3•7h ago•33 comments

Tailscale state file encryption no longer enabled by default

https://tailscale.com/changelog
300•traceroute66•15h ago•117 comments

ChatGPT Health

https://openai.com/index/introducing-chatgpt-health/
303•saikatsg•16h ago•370 comments

The Q, K, V Matrices

https://arpitbhayani.me/blogs/qkv-matrices/
135•yashsngh•1d ago•56 comments

The virtual AmigaOS runtime (a.k.a. Wine for Amiga:)

https://github.com/cnvogelg/amitools/blob/main/docs/vamos.md
81•doener•11h ago•19 comments

LaTeX Coffee Stains (2021) [pdf]

https://ctan.math.illinois.edu/graphics/pgf/contrib/coffeestains/coffeestains-en.pdf
344•zahrevsky•21h ago•82 comments

GLSL Web CRT Shader

https://blog.gingerbeardman.com/2026/01/04/glsl-web-crt-shader/
68•msephton•3d ago•19 comments

Play Aardwolf MUD

https://www.aardwolf.com/
136•caminanteblanco•12h ago•68 comments

NPM to implement staged publishing after turbulent shift off classic tokens

https://socket.dev/blog/npm-to-implement-staged-publishing
179•feross•17h ago•60 comments

AI misses nearly one-third of breast cancers, study finds

https://www.emjreviews.com/radiology/news/ai-misses-nearly-one-third-of-breast-cancers-study-finds/
114•Liquidity•5h ago•58 comments

How Google got its groove back and edged ahead of OpenAI

https://www.wsj.com/tech/ai/google-ai-openai-gemini-chatgpt-b766e160
146•jbredeche•19h ago•164 comments

Musashi: Motorola 680x0 emulator written in C

https://github.com/kstenerud/Musashi
79•doener•11h ago•7 comments

US will ban Wall Street investors from buying single-family homes

https://www.reuters.com/world/us/us-will-ban-large-institutional-investors-buying-single-family-h...
886•kpw94•16h ago•898 comments

Reading Without Limits or Expectations

https://www.carolinecrampton.com/reading-without-limits-or-expectations/
44•herbertl•2d ago•12 comments

Notion AI: Unpatched data exfiltration

https://www.promptarmor.com/resources/notion-ai-unpatched-data-exfiltration
176•takira•16h ago•27 comments

Claude Code CLI was broken

https://github.com/anthropics/claude-code/issues/16673
135•sneilan1•15h ago•128 comments

Health care data breach affects over 600k patients, Illinois agency says

https://www.nprillinois.org/illinois/2026-01-06/health-care-data-breach-affects-600-000-patients-...
193•toomuchtodo•19h ago•67 comments

Creators of Tailwind laid off 75% of their engineering team

https://github.com/tailwindlabs/tailwindcss.com/pull/2388
1274•kevlened•20h ago•723 comments

“Stop Designing Languages. Write Libraries Instead” (2016)

https://lbstanza.org/purpose_of_programming_languages.html
251•teleforce•23h ago•251 comments

A4 Paper Stories

https://susam.net/a4-paper-stories.html
358•blenderob•23h ago•168 comments

Vector graphics on GPU

https://gasiulis.name/vector-graphics-on-gpu/
162•gsf_emergency_6•5d ago

Comments

larodi•1d ago
Really, isn't there anything that offers Slug-level capabilities and isn't super expensive?
coffeeaddict1•23h ago
Vello [0] might suit you although it's not production grade yet.

[0] https://github.com/linebender/vello

miguel_martin•17h ago
Just use blend2d - it is CPU only but it is plenty fast enough. Cache the rasterization to a texture if needed. Alternatively, see blaze by the same author as this article: https://gasiulis.name/parallel-rasterization-on-cpu/
reallynattu•9h ago
ThorVG might be worth a look - open source (MIT), ~150KB core, GPU backends (WebGPU, OpenGL).

We are using it in the official dotLottie runtimes; it's now a Linux Foundation project. It handles SVG, Lottie, fonts, and effects.

https://github.com/thorvg/thorvg/

badlibrarian•1d ago
The author uses a lot of odd, confusing terminology and brings CPU baggage to the GPU, creating the worst of both worlds. Shader hacks, CPU-bound partitioning, and choosing the Greek letter alpha as your accumulator in a graphics article? Oh my.

NV_path_rendering solved this in 2011. https://developer.nvidia.com/nv-path-rendering

It never became a standard but was a compile-time option in Skia for a long time. Skia of course solved this the right way.

https://skia.org/

bsder•1d ago
While the author doesn't seem to be aware of the state of the art in the field, vector rendering is absolutely NOT a solved problem, whether on CPU or GPU.

Vello by Raph Levien seems to be a nice combination of what is required to pull this off on GPUs. https://www.youtube.com/watch?v=_sv8K190Zps

lukan•1d ago
Yeah, I have high hopes for Vello taking off. I could throw away lots of hacks and caching and whatnot if I could do fast vector rendering reliably on the GPU.

I think Rive also does vector rendering on the GPU

https://rive.app/renderer

But it is not really meant (yet?) as a general graphics library, just as a renderer for the Rive design tools.

pier25•19h ago
AFAIK you can use the Rive renderer in your C++ app.

http://github.com/rive-app/rive-runtime

bean469•16h ago
> While the author doesn't seem to be aware of state of the art in the field

The blog post is from 2022, though

sirwhinesalot•1d ago
So what is the right way that Skia uses? Why is there still discussion on how to do vector graphics on the GPU right if Skia's approach is good enough?

Not being sarcastic, genuinely curious.

cyberax•19h ago
The major unsolved problem is real-time, high-quality text rendering on the GPU. Skia just renders fonts on the CPU with all kinds of hacks (https://skia.org/docs/dev/design/raster_tragedy/), then draws them as textures.

Ideally, we want as much as possible rendered on the GPU, including glyph layout. This is not at all trivial, especially for complex scripts like Devanagari.

In a perfect world, we could create a 3D cube and just have the renderer put text on one of its faces, and have it rendered perfectly as you rotate the cube.
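
A minimal sketch of the cache-to-texture scheme the parent describes (the types and the rasterize/upload helpers are hypothetical stand-ins for FreeType- and glTexSubImage2D-style calls, not Skia's API):

    #include <cstdint>
    #include <map>
    #include <utility>
    #include <vector>

    // Sketch: glyphs are rasterized once on the CPU (where the hinting and
    // snapping hacks live), cached in a GPU texture atlas, and drawn as
    // textured quads. The GPU never sees the outline data.
    struct Bitmap { int w = 0, h = 0; std::vector<uint8_t> coverage; };
    struct AtlasSlot { int x = 0, y = 0, w = 0, h = 0; };  // region in the atlas

    Bitmap rasterize_glyph(uint32_t glyph_id, uint16_t size_px);  // CPU rasterizer
    AtlasSlot upload_to_atlas(const Bitmap& bm);                  // GPU upload

    class GlyphAtlas {
    public:
        AtlasSlot get(uint32_t glyph_id, uint16_t size_px) {
            auto key = std::make_pair(glyph_id, size_px);  // size-specific: hinting
            auto it = cache_.find(key);
            if (it != cache_.end()) return it->second;     // already GPU-resident
            AtlasSlot slot = upload_to_atlas(rasterize_glyph(glyph_id, size_px));
            cache_.emplace(key, slot);
            return slot;
        }
    private:
        std::map<std::pair<uint32_t, uint16_t>, AtlasSlot> cache_;
    };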

exDM69•23h ago
> NV_path_rendering solved this in 2011.

By no means is this a solved problem.

NV_path_rendering is an implementation of "stencil then cover" method with a lot of CPU preprocessing.

It's also only available on OpenGL, not on any other graphics API.

The STC method scales very badly with increasing resolution, as it uses a lot of fill rate and memory bandwidth.

It mostly uses GPU fixed-function units (the rasterizer and stencil test), leaving the "shader cores" practically idle.

There's a lot of room for improvement to get more performance and better GPU utilization.
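
For the curious, the two STC passes look roughly like this in plain OpenGL (a sketch of the general idea, not NV_path_rendering's actual API; the two draw helpers are hypothetical):

    // Pass 1 ("stencil"): accumulate signed winding numbers per pixel.
    glEnable(GL_STENCIL_TEST);
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);  // no color writes yet
    glDisable(GL_CULL_FACE);
    glStencilFunc(GL_ALWAYS, 0, 0xFF);
    // Front faces +1, back faces -1, wrapping on overflow: the stencil
    // buffer ends up holding the winding number of every pixel.
    glStencilOpSeparate(GL_FRONT, GL_KEEP, GL_KEEP, GL_INCR_WRAP);
    glStencilOpSeparate(GL_BACK,  GL_KEEP, GL_KEEP, GL_DECR_WRAP);
    draw_path_fan_triangles();  // one triangle per segment, fanned from a pivot

    // Pass 2 ("cover"): fill wherever the winding number is non-zero.
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glStencilFunc(GL_NOTEQUAL, 0, 0xFF);     // non-zero fill rule
    glStencilOp(GL_ZERO, GL_ZERO, GL_ZERO);  // reset stencil as we cover
    draw_path_bounding_quad();
    // Every covered pixel is touched in both passes, often many times in
    // pass 1 -- which is exactly the fill-rate cost described above.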

Asm2D•21h ago
You know nothing.

Skia is definitely not a good example at all. Skia started as a CPU renderer, and added GPU rendering later, which heavily relies on caching. Vello, for example, takes a completely different approach compared to Skia.

NV path rendering is a joke. Nvidia thought that ALL graphics would be rendered on the GPU within 2 years of making that presentation; it took 2 decades, and 2D CPU renderers still shine.

nicoburns•21h ago
I believe Skia's new Graphite architecture is much more similar to Vello.
badlibrarian•20h ago
Right. The question is: does Skia grow its broad and useful toolkit with an eye toward further GPU optimization? Or does Vello (broadened, and perhaps burdened, by Rust and the shader-obsessive crowd) grow a broad and useful API?

There's also the issue of just how many billions of line segments you really need to draw every 1/120th of a second at 8K resolution, but I'll leave those discussions to dark-gray Discord forums rendered by Skia in a browser.

coffeeaddict1•19h ago
> There's also the issue of just how many billions of line segments you really need to draw every 1/120th of a second at 8K resolution

IMO, one of the biggest benefits of a high-performance renderer would be power savings (very important for laptops and phones). If I can run the same workload at half the power, then by all means I'd be happy to deal with the complications the GPU brings. AFAIK though, no one really cares about that, and even efforts like Vello just target fps gains, which correlate with reduced power consumption but only indirectly.

badlibrarian•18h ago
It's an argument you can make in any performance effort. But I think the "let's save power using GPUs" ship sailed even before Microsoft started buying nuclear reactors to power them.
Asm2D•14h ago
Adding power draw into the mix is pretty interesting. Just because a GPU can render something 2x faster in a particular test doesn't mean you have consumed 50% less power, especially when we're talking about dedicated GPUs that can draw hundreds of watts.

Historically, 2D rendering on the CPU was pretty much single-threaded. Skia is single-threaded, Cairo too, Qt mostly (they offload gradient rendering to threads, but it's painfully slow for small gradients - worse than single-threaded), AGG is single-threaded, etc.

In the end, only Blend2D, Blaze, and now Vello can use multiple threads on the CPU, so CPU vs GPU comparisons can finally be made more fairly - and power draw is definitely a nice property for a benchmark to measure. BTW, Blend2D was probably the first library to offer multi-threaded rendering on the CPU (just an option passed to the rendering context, same API).

As far as I know, nobody has done good benchmarking between CPU and GPU 2D renderers - it's very hard to do a completely unbiased comparison, and you would be surprised how good the CPU is in this mix. A modern CPU core consumes maybe a few watts, and you can render to a 4K framebuffer with that single core. Put text rendering into the mix and the numbers start to get very interesting. GPU memory allocation should also be included, because rendering fonts on the GPU means pre-processing them as well, etc.

2D is just very hard. On both CPU and GPU you end up solving slightly different problems, but doing it right is an insane amount of work, research, and experimentation.

nicoburns•13h ago
It's not a formal benchmark, but my browser engine / webview (https://github.com/DioxusLabs/blitz/) has pluggable rendering backends (via https://github.com/DioxusLabs/anyrender), with Vello (GPU), Vello CPU, and Skia (various backends incl. Vulkan, Metal, OpenGL, and CPU) currently implemented.

On my Apple M1 Pro, the Vello CPU renderer is competitive with the GPU renderers on simple scenes, but falls behind on more complex ones, and it especially seems to struggle with large raster images. This is also without a glyph cache (so it re-rasterizes every glyph every time, although there is a hinting cache), which isn't implemented yet. It depends on multi-threading being enabled and can consume a largish portion of all-core CPU time while it runs. Skia raster (CPU) gets similarish numbers, which is quite impressive if it is single-threaded.

Asm2D•3h ago
I think Vello CPU would always struggle with raster images, because it does a bounds check for every pixel fetched from a source image. They have at least described this behavior somewhere in Vello PRs.

The obsession with memory safety just doesn't pay off in some cases - if you can batch 64 pixels at once with SIMD, it just cannot be compared to a per-pixel processor with a branch in the hot path.
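
To make that concrete, a hypothetical sketch (not Vello's actual code) of a per-pixel bounds check versus hoisting the check out of the loop so the copy can be batched:

    #include <algorithm>
    #include <cstdint>
    #include <cstring>

    // Per-pixel safety: a branch on every fetch blocks SIMD batching.
    void copy_row_checked(const uint8_t* src, int src_w,
                          uint8_t* dst, int dst_w, int src_x) {
        for (int x = 0; x < dst_w; ++x) {
            int sx = src_x + x;
            dst[x] = (sx >= 0 && sx < src_w) ? src[sx] : 0;  // check per pixel
        }
    }

    // Hoisted safety: compute the in-bounds span once, then copy it in one
    // batch the compiler can vectorize (or lower to memcpy).
    void copy_row_hoisted(const uint8_t* src, int src_w,
                          uint8_t* dst, int dst_w, int src_x) {
        std::memset(dst, 0, (size_t)dst_w);  // out-of-bounds pixels -> 0
        int begin = std::max(src_x, 0);
        int end = std::min(src_x + dst_w, src_w);
        if (end > begin)
            std::memcpy(dst + (begin - src_x), src + begin, (size_t)(end - begin));
    }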

virtualritz•1d ago
Unless I'm missing something, I think this describes box filtering.

It should probably mention that this is only sufficient for some use cases, not for high-quality ones.

E.g. if you were to use this for rendering font glyphs into something like a static image (or slow rolling titles/credits), you would probably want a higher-quality filter.

jstimpfle•1d ago
What type of filter do you mean? Unless I'm misunderstanding or missing something, the approach described doesn't go into the details of how coverage is computed. If the input image is only simple lines whose coverage can be computed correctly (I don't know how to do this for curves?), then what's missing?

I'd be interested in how feasible complete 2D UIs using dynamically GPU-rendered vector graphics are. I've played with vector rendering in the past, using a pixel shader that more or less implemented the method described in the OP. It could render the Ghostscript tiger at good speeds (single-digit milliseconds at 4K, IIRC), but there is always overhead in generating vector paths, sampling them into line segments, dispatching them, etc. Building a 2D UI from optimized primitives instead, like axis-aligned rects and rounded rects, will obviously almost always be faster.

Text rendering typically adds pixel snapping, possibly a bytecode interpreter, and often sub-pixel rendering.

jlokier•21h ago
> If the input image is only simple lines whose coverage can be correctly computed (don't know how to do this for curves?) then what's missing?

Computing pixel coverage accurately isn't enough for the best results. Using it as the alpha channel for blending foreground over background colour is the same thing as sampling a box filter applied to the underlying continuous vector image.

But often a box filter isn't ideal.

Pixels on the physical screen have a shape and non-uniform intensity across their surface.

RGB sub-pixels (or other colour basis) are often at different positions, and the perceptual luminance differs between sub-pixels in addition to the non-uniform intensity.

If you don't want to tune rendering for a particular display, there are sometimes still improvements from using a non-box filter.

An alternative is to compute the 2D integral of a filter kernel over the coverage area for each pixel. If the kernel has separate R, G, B components to account for sub-pixel geometry, then you may need a further function to optimise perceptual luminance while minimising colour fringing on detailed geometry.

Gamma correction helps, and fortunately it's easily combined with coverage. For example, slow rolling titles/credits will shimmer less at the edges if gamma is applied correctly.

However, these days with Retina/HiDPI-style displays, these issues are reduced.

For example, macOS removed sub-pixel anti-aliasing from text rendering in recent years: they expect you to use a Retina display, and they've decided regular whole-pixel coverage anti-aliasing is good enough there.
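
The gamma point is easy to make concrete: coverage should be applied in linear light, not to sRGB-encoded values. A minimal sketch, grayscale for brevity:

    #include <cmath>

    // Standard sRGB transfer functions.
    float srgb_to_linear(float c) {
        return (c <= 0.04045f) ? c / 12.92f
                               : std::pow((c + 0.055f) / 1.055f, 2.4f);
    }
    float linear_to_srgb(float c) {
        return (c <= 0.0031308f) ? c * 12.92f
                                 : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
    }

    // Blend foreground over background by coverage, in linear light.
    // Blending the encoded values directly makes edge ramps (and slow
    // rolling titles) look uneven and shimmery.
    float blend_coverage(float fg_srgb, float bg_srgb, float coverage) {
        float fg = srgb_to_linear(fg_srgb);
        float bg = srgb_to_linear(bg_srgb);
        return linear_to_srgb(fg * coverage + bg * (1.0f - coverage));
    }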

dahart•19h ago
> What type of filter do you mean? […] the approach described doesn’t go into the details of how coverage is computed

This article does clip against a square pixel’s edges, and sums the area of what’s inside without weighting, which is equivalent to a box filter. (A box filter is also what you get if you super-sample the pixel with an infinite number of samples and then use the average value of all the samples.) The problem is that there are cases where this approach can result in visible aliasing, even though it’s an analytic method.

When you want high quality anti-aliasing, you need to model pixels as soft leaky overlapping blobs, not little squares. Instead of clipping at the pixel edges, you need to clip further away, and weight the middle of the region more than the outer edges. There’s no analytic method and no perfect filter, there are just tradeoffs that you have to balance. Often people use filters like Triangle, Lanczos, Mitchell, Gaussian, etc.. These all provide better anti-aliasing properties than clipping against a square.

masswerk•1d ago
May require "(2022)" in the title.
xattt•23h ago
Tangential, but was this not the goal of Quartz 2D? The idea of everyday things running on the GPU seemed very attractive.

There is some context in this 13-year-old discussion: https://news.ycombinator.com/item?id=5345905#5346541

I am curious if the equation of CPU-determined graphics being faster than being done on the GPU has changed in the last decade.

Did GPU-accelerated Quartz 2D ever become enabled by default on macOS?

kllrnohj•22h ago
When things like this (or Vello or piet-gpu or etc.) talk about "vector graphics on GPU", they are almost exclusively talking about a full-solve solution: something generic that handles fonts and SVGs and arbitrarily complex paths, with strokes and fills and the whole shebang.

These are great goals, but also largely inconsequential for nearly all UI designs. The majority of systems today (like Skia) are hybrids. Simple shapes (e.g. round rects) get analytical shaders on the GPU, and complex paths (like fonts) are rasterized on the CPU once and cached on the GPU in a texture. It's a very robust, fast approach to the holistic problem, at the cost of not being as "clean" as a pure GPU renderer would be.
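
The "analytical shader" half of such hybrids is usually just a signed-distance evaluation per pixel. A sketch in C++ rather than shader code, using the well-known rounded-box distance function (not Skia's actual implementation):

    #include <algorithm>
    #include <cmath>

    struct Vec2 { float x, y; };

    // Signed distance from p to a rounded rect centered at the origin with
    // half-extents b and corner radius r (negative inside).
    float sd_round_rect(Vec2 p, Vec2 b, float r) {
        float qx = std::fabs(p.x) - b.x + r;
        float qy = std::fabs(p.y) - b.y + r;
        float outside = std::hypot(std::max(qx, 0.0f), std::max(qy, 0.0f));
        float inside = std::min(std::max(qx, qy), 0.0f);
        return outside + inside - r;
    }

    // Per-pixel coverage: a ~1px ramp across the zero crossing of the
    // distance field anti-aliases the edge with no geometry beyond a quad.
    float round_rect_coverage(Vec2 pixel_center, Vec2 b, float r) {
        float d = sd_round_rect(pixel_center, b, r);
        return std::clamp(0.5f - d, 0.0f, 1.0f);
    }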

jacobp100•22h ago
> I am curious if the equation of CPU-determined graphics being faster than being done on the GPU has changed in the last decade

If you look at Blend2D (a CPU rasterizer), it seems to outperform every other rasterizer, including GPU-based ones - according to their own benchmarks, at least.

Asm2D•21h ago
Blend2D doesn't benchmark against GPU renderers - its benchmarking page compares CPU renderers. I have seen comparisons in the past, but it's pretty difficult to do good CPU vs GPU benchmarking.
miguel_martin•17h ago
Blaze outperforms Blend2D - by the same author as the article: https://gasiulis.name/parallel-rasterization-on-cpu/ - but to be fair, Blend2D is really fast.
Asm2D•14h ago
You need to rerun the benchmarks if you want fresh numbers. The post was written when Blend2D didn't have a JIT for AArch64, which penalized it a bit. On X86_64 the numbers are really good for Blend2D, which beats Blaze in some tests. So it's not black and white.

And please keep in mind that Blend2D is not really in development anymore - it has no funding, so the project is basically done.

coffeeaddict1•13h ago
> And please keep in mind that Blend2D is not really in development anymore - it has no funding so the project is basically done.

That's such a shame. Thanks a lot for Blend2D! I wish companies were less greedy and would fund amazing projects like yours. Unfortunately, I do think that everyone is a bit obsessed with GPUs nowadays. For 2D rendering the CPU is great, especially if you want predictable results and avoid having to deal with the countless driver bugs that plague every GPU vendor.

samiv•21h ago
The issue is not performance; the issue is that pixel-precise operations are difficult on the GPU using graphics features such as shaders.

You don't normally work with pixels; you work with polygonal geometry (triangles), and the GPU does the pixel (fragment) rasterization.

pjmlp•20h ago
Not sure what you mean; it can make use of accelerated graphics:

https://developer.apple.com/library/archive/documentation/Gr...

xattt•1h ago
I’ve explored it for a few years, but all I could tell was that it was never actually fully enabled. You can enable it through debugging tools, but it was never on by default for all software.
willtemperley•20h ago
Quartz 2D is now CoreGraphics. It's hard to find information about the backend, presumably for commercial reasons. I do know it uses the GPU for some operations like magnifyEffect.

Today I was smoothly panning and zooming 30K-vertex polygons with SwiftUI Canvas and it was barely touching the CPU, so I suspect it uses the GPU heavily. Either way, it's getting very good. There's barely any need to use render caches.

nubskr•21h ago
Turns out the best GPU optimization is just being too scared of graphics drivers to do the fancy stuff, 10-15x faster and you can actually debug it.
jayd16•19h ago
So without blowing up the traditional shader pipeline, why is it not trivial to add a path stage as an alternative to the vertex stage? It seems like GPUs and shader language could implement a standard way to turn vector paths into fragments and keep the rest of the pipeline.

In fact, you could likely use the geometry stage to create arbitrarily dense vertices based on path data passed to the shader without needing any new GPU features.

Why is this not done? Is the CPU render still faster than these options?

exDM69•18h ago
> why is it not trivial to add a path stage as an alternative to the vertex stage?

Because paths, unlike triangles, are not fixed-size and don't have screen-space locality. Paths consist of multiple contours of segments, typically cubic Bezier curves, plus a winding rule.

You can't draw one segment of a contour on the screen and move on to the next, let alone do them in parallel. A vertical line segment on the left-hand side of your screen going bottom to top makes every pixel to the right of it "inside" the path; but if there's another line segment going top to bottom somewhere, the pixel is outside again.

You need to evaluate the winding rule for every curve segment on every pixel and sum it up.

By contrast, all the pixels inside the triangle are also inside the bounding box of the triangle and the inside/outside test for a pixel is trivially simple.

There are at least four popular approaches to GPU vector graphics:

1) Loop-Blinn: Use CPU to tessellate the path to triangles on the inside and on the edges of the paths. Use a special shader with some tricks to evaluate a bezier curve for the triangles on the edges.

2) Stencil then cover: For each line segment in a tessellated curve, draw a rectangle that extends to the left edge of the contour and use two sided stencil function to add +1 or -1 to the stencil buffer. Draw another rectangle on top of the whole path and set the stencil test to draw only where the stencil buffer is non-zero (or even/odd) according to the winding rule.

3) Draw a rectangle with a special shader that evaluates all the curves in a path, and use a spatial data structure to skip some. Useful for fonts and quadratic bezier curves, not full vector graphics. Much faster than the other methods for simple and small (pixel size) filled paths. Example: Lengyel's method / Slug library.

4) Compute based methods such as the one in this article or Raph Levien's work: use a grid based system with tessellated line segments to limit the number of curves that have to be evaluated per pixel.

Now this is only filling paths, which is the easy part. Stroking paths is much more difficult. Full SVG support has both and much more.

> In fact, you could likely use the geometry stage to create arbitrarily dense vertices based on path data passed to the shader without needing any new GPU features.

Geometry shaders are commonly used with stencil-then-cover to avoid a CPU preprocessing step.

But none of the GPU geometry stages (geometry, tessellation, or mesh shaders) are powerful enough to deal with all the corner cases of tessellating vector graphics paths: self-intersections, cusps, holes, degenerate curves, etc. It's not a very parallel-friendly problem.

> Why is this not done?

As I've described here: all of these ideas have been done with varying degrees of success.

> Is the CPU render still faster than these options?

No, the fastest methods are a combination of CPU preprocessing for the difficult geometry problems and GPU for blasting out the pixels.

Dwedit•17h ago
According to the page here: https://www.humus.name/index.php?page=News&ID=228

The best way to draw a circle on a GPU is to start with a large triangle and keep adding triangles on the edges until the additions are smaller than a pixel and no more are needed.
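
The stopping criterion amounts to bounding the sagitta (the gap between chord and arc) by a sub-pixel epsilon. A sketch of the arithmetic (not the linked article's code):

    #include <algorithm>
    #include <cmath>

    // For a circle of radius r approximated by a regular n-gon, the max
    // deviation from the true arc is the sagitta e = r * (1 - cos(pi/n)).
    // Requiring e <= eps gives n = ceil(pi / acos(1 - eps/r)).
    int circle_segment_count(float radius_px, float eps_px = 0.5f) {
        const float kPi = 3.14159265f;
        if (radius_px <= eps_px) return 3;  // tiny circle: any triangle will do
        float n = kPi / std::acos(1.0f - eps_px / radius_px);
        return std::max(3, (int)std::ceil(n));
    }
    // A 100px-radius circle needs ~32 edges at eps = 0.5px; a 10000px one
    // needs ~315 -- vertex count grows only with sqrt(radius).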

jesse__•15h ago
I'd put money on the best way actually being to draw a quad (or a single triangle) and render the circle as an SDF in the fragment shader.
Lichtso•17h ago
> but [analytic anti-aliasing (aaa)] also has much better quality than what can be practically achieved with supersampling

What this statement is missing is that aaa coverage is resolved immediately, while msaa coverage is resolved later, in a separate step, with extra data buffered in between. This matters because msaa is unbiased, while aaa is biased towards too much coverage once two paths partially cover the same pixel. In other words, aaa becomes incorrect once you draw overlapping or self-intersecting paths.

Think about drawing the same path over and over at the same place: aaa will become darker with every iteration, msaa is idempotent and will not change further after the first iteration.

Unfortunately, this is a little-known fact even in the exquisite circles of 2D vector graphics people, who often present aaa as a silver bullet, which it is not.
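
The bias is easy to see with numbers: compositing coverage-as-alpha with "over" compounds across passes, while a per-sample mask saturates. A small self-contained sketch:

    #include <cstdio>

    // Analytic AA stores coverage as alpha and composites with "over",
    // so drawing the same edge pixel again darkens it further.
    float over(float dst, float alpha) { return alpha + dst * (1.0f - alpha); }

    int main() {
        float analytic = 0.0f;
        for (int pass = 1; pass <= 3; ++pass) {
            analytic = over(analytic, 0.5f);  // 0.500 -> 0.750 -> 0.875
            std::printf("analytic, pass %d: %.3f\n", pass, analytic);
        }
        // MSAA keeps a bit per sample; re-drawing sets the same bits, so
        // the resolved coverage is idempotent.
        unsigned mask = 0;
        for (int pass = 1; pass <= 3; ++pass) {
            mask |= 0b0011u;  // the same 2 of 4 samples covered each pass
            int bits = 0;
            for (unsigned m = mask; m; m >>= 1) bits += m & 1;
            std::printf("msaa, pass %d: %.3f\n", pass, bits / 4.0f);  // stays 0.500
        }
        return 0;
    }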

jesse__•15h ago
Interestingly, they do not cite computing a signed distance to the surface of the shape as an approach to AA, as described in the Valve paper [1]. I suppose that is more targeted at offline baking, but given they're suggesting iterating every curve at every pixel, I'm not sure why you wouldn't.

[1] https://steamcdn-a.akamaihd.net/apps/valve/2007/SIGGRAPH2007...

reallynattu•9h ago
For anyone looking at this space: ThorVG is worth checking out.

An open-source vector engine with GPU backends (WebGPU, OpenGL) that runs on everything from microcontrollers to browsers. Now a Linux Foundation project.

https://github.com/thorvg/thorvg

(Disclosure: I'm CTO at LottieFiles; we build and maintain ThorVG in-house, with community contributions from individuals and companies like Canva.)