Instead, Raph has spent the past 9 years, I believe, trying to create a sound foundation for the problem of performant UI rendering.
I don't know how it will go, or whether he'll end up shipping his grand vision at all eventually, but I really appreciate the effort of “doing something well” in a world that pretty much only rewards “doing something quickly”.
“The first volume of Knuth's series (dedicated to the IBM 650 computer, "in remembrance of many pleasant evenings") was printed in the late 1960s using old-fashioned but beautiful hot-type printing technology, complete with Linotype machines and the sharp smell of molten lead. Volume 2, which appeared a few years later, used photo-offset printing to save money for the publisher (the publisher of this book, in fact). Knuth didn't like the change from hot type to cold, from Lino to photo, and so he took a few months off from his other work, rolled up his sleeves, and set to work computerizing the business of setting type and designing type fonts. Nine years later, he was done.”
What projects like Slug and Vello show, rather, is that GPU coding remains so obtuse that you cannot tackle an isolated subproblem like 2D vector rendering; instead you have to make apple pie from scratch by first creating the universe. And the resulting solution is itself a whole beast that cannot just be hooked up to APIs and languages other than the ones it was created for, unless that is specifically something you also architect for. As the first slide shows, v1 required modern GPUs, and the CPU side uses hand-optimized SIMD routines.
2D vector graphics is also just an awkward niche to optimize for today. GPUs are optimized for 3D, where z-buffers are used to draw things in an order-independent way. 2D graphics instead must be layered and clipped in the right order, which is much more difficult to 'embarrassingly' parallelize. Formats like SVG can have an endless number of points per path; e.g. a detailed polygon of the United States has to be processed as one shape, and you can't blindly subdivide it. You also can't rely on vanilla anti-aliasing, because complementary edges wouldn't be fully opaque.
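A quick arithmetic sketch of that last point (my own toy numbers, not anything from the talk): if two abutting shapes each get antialiased to 50% coverage in a boundary pixel independently, compositing them does not add back up to full opacity.

```rust
// Toy illustration of the conflation problem with independent anti-aliasing.
// Two shapes meet along an edge that splits a pixel 50/50; antialiasing each
// shape separately and compositing with "over" yields 75% coverage, not 100%,
// so the background shows through along the shared edge.
fn over_alpha(src: f32, dst: f32) -> f32 {
    src + dst * (1.0 - src)
}

fn main() {
    let left = 0.5;  // coverage of the left shape in the boundary pixel
    let right = 0.5; // coverage of the right shape in the same pixel
    let combined = over_alpha(right, left);
    println!("combined coverage: {combined}"); // 0.75 -> visible seam
}
```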
Even if you do go all the way, you'll still have just a 2D rasterizer. Perhaps it can work under projective transform, which is usually pretty easy, but will it be significantly more powerful or extensible than something like Cairo is today? Or will it just do that exact same feature set in a technologically sexier way? e.g. Can it be adapted to rendering 3D globes and maps, or would that break everything? And note that rasterizing fonts as just unhinted glyphs (i.e. paths) is rarely what people want.
I disagree: you either have the old object-oriented toolkits, which are fast enough but very unpleasant to work with, or the new reactive frameworks, which offer much better developer ergonomics (which is why pretty much everybody uses them right now) but have pathological performance characteristics and require lots of additional work to stay fast once the number of items on screen gets high enough.
By the way, you are missing the forest (Xilem) for the tree (Vello) here: the foundational work Raph has been doing isn't just a 2D renderer (Vello); that's just a small piece of a bigger UI toolkit (Xilem) aimed at addressing the problem I mention above.
Xilem originally started out using the native libraries for 2D rendering (through a Piet wrapper he discusses briefly in the video), but he ended up disappointed and switched to making his own. That's just one piece of the puzzle, though; the end goal is a fast reactive UI framework.
1. Do you have a favorite source for GPU terminology like draw calls? I optimized for them on an Unreal Engine project but never "grokked" what all the various GPU constructs are, or how to understand their purpose, behavior, and constraints. (For this reason I was behind the curve for most of your talk :D) Maybe this is just my lack of understanding of what a common/modern pipeline consists of?
2. I replayed the video segment twice but it is still lost on me how you know which side of the path in a tile is the filled side. Is that easy to understand from the code if I go spelunking for it? I am particularly interested in the details on how that is known and how the merge itself is performed.
I expect the line segments are represented by their two end points.
This makes it easy to encode which side is fill vs. alpha by ordering the two points, so that as you move from the first point to the second, fill is always on the right. (Or vice versa.)
Another benefit of ordering at both the point and segment levels is that, from one segment to the next, a turn toward the fill side vs. the alpha side can be used to inform clipping, either convex or concave, reflecting both segments.
No idea if any of this is what is actually happening here, but it's one way to do it. The animation of segmentation did show ordered segments: clockwise for the outside border and counterclockwise for the cavity in the R, with fill to the right.
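To make that convention concrete, here is a minimal sketch of "fill is to the right of the direction of travel"; the names and the coordinate convention are mine, not Vello's.

```rust
// Toy sketch: the fill side is encoded purely by the ordering of a segment's
// endpoints. Walking from `a` to `b`, fill is on the right. The sign test
// assumes y pointing up; flip it for y-down screen coordinates.
#[derive(Clone, Copy)]
struct Point { x: f32, y: f32 }

/// True if `p` lies to the right of the directed segment a -> b,
/// i.e. on the fill side under the convention above.
fn on_fill_side(a: Point, b: Point, p: Point) -> bool {
    // 2D cross product of (b - a) with (p - a); negative means p is
    // clockwise from the direction of travel, i.e. to the right.
    (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x) < 0.0
}
```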
2. I skipped over this in the interest of time. Nevermark has the central insight, but the full story is more interesting. For each tile, detect whether the line segment crosses the top edge of the tile, and if so, the direction. This gives you a delta of -1, 0, or +1. Then do a prefix sum of these deltas over the sorted tiles. That gives you the winding number at the top-left corner of each tile, which in turn lets you compute the sparse fills and also which side to fill within the tile.
[1]: https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-...
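To make the delta/prefix-sum bookkeeping above concrete, here is a rough sketch of how it could look; the names and data layout are mine, not Vello's, and the real implementation of course handles sorting and GPU dispatch on top of this.

```rust
// Sketch of per-tile winding-number bookkeeping along one row of tiles,
// following the description above (illustrative only, not Vello's code).
#[derive(Clone, Copy)]
struct Segment {
    x0: f32, y0: f32, // start point
    x1: f32, y1: f32, // end point
}

/// Delta a segment contributes to one tile: +1 or -1 if it crosses the tile's
/// top edge (the span [tile_x0, tile_x1) at y = tile_top), 0 otherwise.
/// The sign comes from the crossing direction.
fn top_edge_delta(seg: Segment, tile_x0: f32, tile_x1: f32, tile_top: f32) -> i32 {
    let (ymin, ymax) = (seg.y0.min(seg.y1), seg.y0.max(seg.y1));
    // Half-open test so each crossing is counted by exactly one scanline.
    if tile_top < ymin || tile_top >= ymax {
        return 0;
    }
    // x where the segment crosses the line y = tile_top.
    let t = (tile_top - seg.y0) / (seg.y1 - seg.y0);
    let x = seg.x0 + t * (seg.x1 - seg.x0);
    if x < tile_x0 || x >= tile_x1 {
        return 0;
    }
    if seg.y1 > seg.y0 { 1 } else { -1 }
}

/// An exclusive prefix sum of the per-tile deltas (tiles sorted left to right)
/// gives the winding number at each tile's top-left corner, which determines
/// the sparse fills and which side of a segment gets filled inside the tile.
fn winding_at_tile_corners(deltas: &[i32]) -> Vec<i32> {
    let mut acc = 0;
    deltas
        .iter()
        .map(|d| {
            let w = acc;
            acc += d;
            w
        })
        .collect()
}
```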
There are different applications for 2D rendering.
In our case we need the rendering to take place with f32/float precision, i.e. 32 bits per channel for RGBA colors.
We also do not care whether the renderer is realtime. Our application is vector rendering for movie production.
That's where the multiple-backend approach of Vello, and especially the vello-cpu crate, becomes really interesting. We will either add the f32 support ourselves or hope it becomes part of the Vello roadmap at some stage.
Also, Blend2D is C++ (as is Skia, the best alternative, IMHO). Adding a C++ toolchain requirement to any Rust project is always a potential PITA.
For example, on the (Rust) software we work on, C++ toolchain breakage around a C++ image-processing lib that we had wrapped for Rust cost us two person-weeks over the last 11 months. That's a lot for a startup where two devs work on the affected part.
Suffice to say, there was zero Rust toolchain-related work done or breakage happening in the same timeframe.
There is a different problem though. While many people working on Vello are paid full time, Blend2D lacks funding and what you see today was developed independently. So, the development is super slow and that's the reason that Blend2D will most likely never have the features other libraries have.
What's the internal color space? I assume it is linear sRGB. It looks like you are going straight to RGBA FP32, which is good. Think about how you will deal with denormals, as the CPU handles them differently from the GPU. Otherwise it's rendering artifacts galore once you do real-world testing.
And of course Inf and NaN need to be handled everywhere. Just checking against F::ZERO is not enough in many cases; you will need epsilon values. In C++, writing if (value == 0.0f) {} or if (value == 1.0f) {} is considered a code smell.
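A tiny illustration of the kind of guard that means in practice (the epsilon value here is arbitrary, not something from the library):

```rust
// Guarded float checks instead of exact == 0.0 / == 1.0 comparisons.
// The threshold is arbitrary and purely illustrative.
const EPS: f32 = 1e-6;

fn nearly_zero(v: f32) -> bool {
    v.is_finite() && v.abs() < EPS
}

fn nearly_one(v: f32) -> bool {
    v.is_finite() && (v - 1.0).abs() < EPS
}
```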
Just browsing the source, I see Porter-Duff blend modes. Really, in 2025? Have fun dealing with alpha compositing issues on this one. Also, most of the 'regular' blend modes are not alpha-compositing safe; you need special handling of alpha values in many cases if you do not want artifacts. The W3C spec is completely underspecified in this regard. I spent many months dealing with this myself.
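For reference, the classic Porter-Duff "source over" operator on premultiplied alpha looks roughly like this; this is the textbook formulation, not this library's code, and it's exactly the point where the 'regular' blend modes start needing extra alpha handling to stay artifact-free.

```rust
// Textbook Porter-Duff "source over" on premultiplied RGBA (illustrative).
#[derive(Clone, Copy)]
struct Rgba {
    r: f32,
    g: f32,
    b: f32,
    a: f32, // color channels are premultiplied by this alpha
}

fn source_over(src: Rgba, dst: Rgba) -> Rgba {
    let k = 1.0 - src.a;
    Rgba {
        r: src.r + dst.r * k,
        g: src.g + dst.g * k,
        b: src.b + dst.b * k,
        a: src.a + dst.a * k,
    }
}
```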
If I were to redo a rasterizer from scratch I would push boundaries a little more. For instance I would target full FP32 dynamic range support and a better internal color space, maybe something like OKLab to improve color blending and compositing quality. And coming up with innovative ways to use this gained dynamic range.
You are correct that conflation artifacts are a problem and that doing antialiasing in the right color space can improve quality. Long story short, that's future research. There are tradeoffs, one of which is that use of the system compositor is curtailed. Another is that font rendering tends to be weak and spindly compared with doing compositing in a device space.
I made myself a CPU SDF library last weekend, primarily for fast shadow textures. It was fun, and I was surprised how well most basic SDFs run with SIMD. Except, yeah, Béziers didn't fare well. Fonts seem much harder.
SIMD was easy: I just asked Claude to convert my scalar Nim code to a Neon SIMD version and then to an SSE2 version. Most SDFs and Gaussian shadowing got a 4x speedup on my MacBook M3. It's a bit surprising the author had so much trouble in Rust. Perhaps fp16 issues?
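For anyone who hasn't played with SDFs: the scalar kernels involved are tiny, which is why they vectorize so nicely. Here is a generic rounded-box distance function in Rust as an example of the kind of function being described (the parent's code is Nim/Neon/SSE2; this is not their code).

```rust
// Signed distance to a rounded rectangle centered at the origin, with half
// extents (hx, hy) and corner radius r. Negative inside, positive outside.
// Evaluating this per pixel (or several pixels at a time with SIMD) and
// mapping distance to opacity is the core of an SDF-based shadow texture.
fn sd_rounded_box(px: f32, py: f32, hx: f32, hy: f32, r: f32) -> f32 {
    let qx = px.abs() - hx + r;
    let qy = py.abs() - hy + r;
    let outside = (qx.max(0.0).powi(2) + qy.max(0.0).powi(2)).sqrt();
    let inside = qx.max(qy).min(0.0);
    outside + inside - r
}
```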
What should we be using in 2025? I thought pre-multiplied alpha is essentially what you go for if you want a chance of alpha compositing ending up correct, but my knowledge is probably outdated.
Is the problem here that computing the vector texture in real time is too expensive, and perhaps that font contours are too much of a special case for a general-purpose vector rasterizer to be useful? The Slug algorithm also implements 'banding', which seems similar to the tiling described in the presentation.
The Ministry of Education was using MS Gothic for printed student transcripts. To help students send transcripts directly to post-secondary schools, the Ministry wanted to shift from paper to digital copies. This meant producing a PDF file that had like-for-like characteristics with the printed copy.
Legally, Microsoft requires licensing MS Gothic if the font is used in server-side generated documents. I raised this issue with the Ministry as part of my work recreating the transcripts. MS Gothic proved to be cost-prohibitive, so I suggested they use Raph Levien's unencumbered Inconsolata Zero instead, which is a near-perfect drop-in replacement for MS Gothic and drew inspiration from Letter Gothic.
Now, the stakeholders at the Ministry of Education are extremely protective of the transcript format, and there was a subtle but important difference: the Ministry wanted an undecorated zero, whereas Inconsolata Zero's is slashed. That would not fly with the Ministry.
I, a complete stranger, emailed Raph. The next day, he asked Alexei Vanyashin to set up a custom version of Inconsolata Zero. Alexei went above and beyond to fix all the issues I encountered, and about eight days later we had a free version of Inconsolata Zero with a plain zero that passed Ministry scrutiny.
Hard to believe that that was nine years ago.
As an aside, my coworkers got a kick out of watching me walk down the hall from the printer back to my desk holding two overlapping pieces of paper up to the lights: an official student transcript and my version. That was the technique I used to make sure the PDF file produced a pixel-perfect replica on paper.
Their vector rendering implementation is much faster than Skia[2].
How does Vello compare to Rive in terms of performance?
Fraterkes•1d ago
Oh, and a question for Raph: did the new spline you invented end up being integrated into any vector/font-creation tools? I remember being really impressed when I first tried your demo.
raphlinus•1d ago
The newest spline work (hyperbezier) is still on the back burner, as I'm refining it. This turns out to be quite difficult, but I'm hopeful it will turn out better than the previous prototype you saw.