But the world has increasingly moved to Retina-type displays, and there's very little reason for subpixel rendering there.
Plus it just has so many headaches: screenshots get tied to one subpixel layout, you can't scale bitmaps, etc.
It was a temporary innovation for the LCD era between CRT and Retina, but at this point it's backwards-looking. There's a good reason Apple removed it from macOS years ago.
macOS looks *awful* on anything that isn't precisely 218ppi. Other than Apple's overpriced profit-machine displays, there are two displays that reach this: LG's Ultrafine 5K, and Dell's 6K (with its ugly, extraneous webcam attached to the top). Other 6K monitors were shown at CES this year but so far, I haven't actually found any for sale. EDIT: Correction, LG apparently doesn't sell the 5K Ultrafine anymore, at least on their website.
That means the odds are incredibly high that unless you buy the LG, or drop a wad on an overpriced Studio Display or the even worse-value Pro Display, your experience with macOS on an external monitor will be awful.
That's even before we get into the terrible control we have in the OS over connection settings. I shouldn't have to buy BetterDisplay to pick a refresh rate I know my display is capable of on the port it's plugged into.
Computers, on the other hand, have stuck with 1080p, unless you're spending a fortune.
I can only attribute it to penny-pinching by the large computer manufacturers: with high-res tablets coming to market at Chromebook prices, I doubt a similarly high-res display in a similarly sized laptop genuinely has to cost 500 euros more, which is the premium I've seen them charge.
Have you tried adjusting your display gamma for each RGB subchannel? Subpixel antialiasing relies on accurate color space information, even more than other types of anti-aliased rendering.
Not my world. Even the display hooked up to the crispy work MacBook is still 1080p (which looks really funky on macOS for some reason).
Even in tech circles, almost everyone I know still has a 1080p laptop. Maybe some funky 1200p resolution to make the screen a bit bigger, but the world is not as retina as you may think it is.
For some reason, there's actually quite a price jump from 1080p to 4k unless you're buying a television. I know the panels are more expensive, but I doubt the manufacturer is indeed paying twice the price for them.
Still bemoaning the loss of the basically impossible (50"? I can't remember precisely) 4k TV we bought that same year for $800 USD, when every other 4k model that existed at the time was $3.3k and up.
Its black point was "when rendering a black frame, the set appears to be 100% unpowered" and the white point was "congratulations, this is what it looks like to stare into baseball stadium floodlights". We kept it at 10% brightness as a matter of course, and even then, playing arbitrary content obviated the need for any other form of lighting in our living room and dining room combined at night.
It was too pure for this world and got destroyed by one of the kids throwing something about in the living room. :(
It’s an utterly glorious display for programming. I can have 3 full width columns of code side by side. Or 2 columns and a terminal window.
But the pixels are still the “normal” size. Text looks noticeably sharper with sub-pixel rendering. I get that subpixel rendering is complex and difficult to implement correctly, but it’s good tech. It’s still much cheaper to have a low resolution display with subpixel font rendering than render 4x as many pixels. To get the same clean text rendering at this size, I’d need an 8k display. Not only would that cost way more money, but rendering an 8k image would bring just about any computer to its knees.
It’s too early to kill sub pixel font rendering. It’s good. We still need it.
We could do with a better image format for screenshots, something that preserves vectors and text instead of rasterizing. HDR screenshots on Windows are busted for similar reasons.
Unfortunately, EDID isn't always reliable, either: you need to know the screen's orientation as well or rotated screens are going to look awful. You're probably going to need administrator access on computers to even access the hardware to get the necessary data, which can also be a problem for security and ease-of-use reasons.
Plus, some vendors just seem to lie in the EDID. Like with other information tables (ACPI comes to mind), it looks almost like they just copy the config from another product and adjust whatever metadata they remember to update before shipping.
I also created glyphon (https://github.com/grovesNL/glyphon) which renders 2D text using wgpu and cosmic-text. It uses a dynamic glyph texture atlas, which works fine in practice for most 2D use cases (I use it in production).
I suppose vello is heading there but whenever I tried it the examples always broke in some way.
Also it's not just about speed, but power consumption. Once you are fast enough to hit the monitor frame rate then further performance improvements won't improve responsiveness, but you may notice your battery lasting longer. So there's no such thing as "fast enough" when it comes to rendering. You can always benefit from going faster.
This is not true: if your rendering is faster, you can delay the start of rendering and input processing so it happens closer to the frame display time, reducing input latency.
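A rough sketch of that kind of frame pacing (the poll_input/render/present/next_vsync_time callbacks are hypothetical stand-ins for whatever windowing/graphics API you're on, and the 60 Hz / 4 ms numbers are just assumptions):

    import time

    VSYNC_INTERVAL = 1.0 / 60.0   # assumed 60 Hz display
    RENDER_BUDGET = 0.004         # assumed worst-case render + present time (4 ms)

    def frame_loop(poll_input, render, present, next_vsync_time):
        # poll_input/render/present/next_vsync_time are hypothetical callbacks
        # standing in for whatever windowing / graphics API is in use.
        while True:
            # Instead of polling input immediately after the previous present,
            # sleep until just before rendering has to start. Input that arrives
            # during the sleep is picked up by *this* frame rather than the next
            # one, so faster rendering directly buys lower input latency.
            start_time = next_vsync_time() - RENDER_BUDGET
            delay = start_time - time.monotonic()
            if delay > 0:
                time.sleep(delay)
            frame = render(poll_input())
            present(frame)        # assumed to block until the targeted vsync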
Of course, for e.g. games, that breaks if the font size changes, letters rotate and/or become skewed, etc.
Most GPUs dispatch pixel shaders in groups of 4. If all your triangles are only 1 pixel big then you end up with 3 of those shader threads not contributing to the output visually. It's called 'quad overdraw'. You also spend a lot of time processing vertices for no real reason too.
https://sluglibrary.com/ implements Dynamic GPU Font Rendering and Advanced Text Layout
On the other hand, if you are rendering to an atlas anyway, you don't really need to bother with a GPU implementation for that and can just use an existing software font rasterizer like FreeType to generate the atlas for you.
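For instance, a minimal CPU-side atlas bake with the freetype-py bindings and numpy (the font path, glyph set and naive shelf packing here are just placeholder assumptions):

    import numpy as np
    import freetype   # pip install freetype-py

    def build_atlas(font_path, chars, px_size=32, atlas_w=512, atlas_h=512):
        face = freetype.Face(font_path)
        face.set_pixel_sizes(0, px_size)

        atlas = np.zeros((atlas_h, atlas_w), dtype=np.uint8)
        entries = {}                    # char -> (x, y, w, h) rectangle in the atlas
        pen_x = pen_y = row_h = 0       # naive "shelf" packing, good enough for a demo

        for ch in chars:
            face.load_char(ch, freetype.FT_LOAD_RENDER)   # CPU rasterization
            bmp = face.glyph.bitmap
            w, h = bmp.width, bmp.rows
            if pen_x + w > atlas_w:                       # start a new shelf row
                pen_x, pen_y, row_h = 0, pen_y + row_h, 0
            data = np.array(bmp.buffer, dtype=np.uint8)
            stride = bmp.pitch if len(data) == h * bmp.pitch else w
            if h and w:
                atlas[pen_y:pen_y + h, pen_x:pen_x + w] = data.reshape(h, stride)[:, :w]
            entries[ch] = (pen_x, pen_y, w, h)
            pen_x += w + 1                                # 1px padding between glyphs
            row_h = max(row_h, h)

        return atlas, entries   # upload `atlas` once as a texture, index it per glyph

    atlas, entries = build_atlas("DejaVuSans.ttf", "ABCDEFGHIJabcdefghij0123456789")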
From what I understand, it's even worse: not just non-standard, but OLEDs have multiple incompatible subpixel layouts. That's the reason FreeType didn't implement subpixel rendering for OLEDs, and it's a reason to avoid OLEDs when you need to work with text. But it's also not limited to FreeType; a lot of things like GUI toolkits (Qt, GTK, etc.) need to play along too.
Not really sure if there is any progress on solving this.
> I really wish that having access to arbitrary subpixel structures of monitors was possible, perhaps given via the common display protocols.
Yeah, this is a good point. Maybe this should be communicated in EDIDs.
I think the weird layouts are mostly due to needing different sizes for the different colors in HDR displays in order to not burn out one color (blue) too fast.
* https://bugs.kde.org/show_bug.cgi?id=472340
* https://gitlab.freedesktop.org/freetype/freetype/-/issues/11...
In 2012, Behdad Esfahbod wrote Glyphy, an implementation of SDF that runs on the GPU using OpenGL ES. It has been widely admired for its performance and enabling new capabilities like rapidly transforming text. However it has not been widely used.
Modern operating systems and web browsers do not use either of these techniques, preferring to rely on 1990s-style Truetype rasterization. This is a lightweight and effective approach but it lacks many capabilities. It can't do subpixel alignment or arbitrary subpixel layout, as demonstrated in the article. Zooming carries a heavy performance penalty and more complex transforms like skew, rotation, or 3d transforms can't be done in the text rendering engine. If you must have rotated or transformed text you are stuck resampling bitmaps, which looks terrible as it destroys all the small features that make text legible.
Why the lack of advancement? Maybe it's just too much work and too much risk for too little gain. Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering? It would be a daunting task. Rendering glyphs is one thing but how about handling line breaking? Seems like it would require a lot of communication between CPU and GPU, which is slow, and deep integration between the software and the GPU, which is difficult.
SDF works by encoding a localized _D_istance from a given pixel to the edge of a character as a _F_ield, i.e. a 2d array of data, using a _S_ign bit to indicate whether that distance is inside or outside of the character. Each character has its own little map of data that gets packed together into an image file of some GPU-friendly type (generically called a "map" when it does not represent an image meant for human consumption), along with a descriptor file of where to find the sub-image of each character in that image, to work with the SDF rendering shader.
This definition of a character turns out to be very robust against linear interpolation between field values, enabling near-perfect zoom capability for relatively low resolution maps. And GPUs are pretty good at interpolating pixel values in a map.
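A toy sketch of both halves (building the field and sampling it), in Python with numpy/scipy and assuming a binary glyph mask as input; real pipelines bake the map offline and sample it in a fragment shader, but the math is the same:

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def make_sdf(mask):
        # mask: boolean array, True inside the glyph. Sign convention here:
        # positive outside, negative inside (either convention works as long
        # as the shader agrees with the generator).
        outside = distance_transform_edt(~mask)   # distance to the glyph, measured outside
        inside = distance_transform_edt(mask)     # distance to the background, measured inside
        return outside - inside

    def sample_bilinear(sdf, x, y):
        # What the GPU texture unit does for free when you sample the map.
        x0, y0 = int(np.floor(x)), int(np.floor(y))
        fx, fy = x - x0, y - y0
        return (sdf[y0, x0]         * (1 - fx) * (1 - fy) +
                sdf[y0, x0 + 1]     * fx       * (1 - fy) +
                sdf[y0 + 1, x0]     * (1 - fx) * fy +
                sdf[y0 + 1, x0 + 1] * fx       * fy)

    def coverage(distance, smoothing=0.75):
        # Shader-style edge test: smoothstep around distance == 0 gives an
        # antialiased boundary at any zoom level, because the interpolated
        # distance stays meaningful between texels.
        t = np.clip((smoothing - distance) / (2.0 * smoothing), 0.0, 1.0)
        return t * t * (3.0 - 2.0 * t)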
But most significantly, those maps have to be pre-processed during development from existing font systems for every character you care to render. Every. Character. Your. Font. Supports. It's significantly less data than rendering every character at high resolution to a bitmap font. But, it's also significantly more data than the font contour definition itself.
Anything that wants to support all the potential text of the world--like an OS or a browser--cannot use SDF as the text rendering system because it would require the SDF maps for the entire Unicode character set. That would be far too large for consumption. It really only works for games because games can (generally) get away with not being localized very well, not displaying completely arbitrary text, etc.
The original SDF also cannot support Emoji, because it only encodes distance to the edges of a glyph and not anything about color inside the glyph. Though there are enhancements to the algorithm to support multiple colors (Multichannel SDF), the total number of colors is limited.
Indeed, if you look closely at games that A) utilize SDF for in-game text and B) have chat systems in which global communities interact, you'll very likely see differences in the text rendering for the in-game text and the chat system.
Would caching (domain restricted ofc) not trivially fix that? I don't expect a given website to use very many fonts or that they would change frequently.
Good. My text document viewer only needs to render text in straight lines left to right. I assume right to left is almost as easy. Do the Chinese still want top to bottom?
Yes, inconceivable that somebody might ever want to render text in anything but a "text document viewer"!
I’m not sure why you’re saying this: text shaping and layout (including line breaking) are almost completely unrelated to rendering.
https://github.com/servo/pathfinder uses GPU compute shaders to do this, which has way better performance than trying to fit this task into the hardware 3D rendering pipeline (the SDF approach).
It is tricky, but I thought they already (partly) do that. https://keithclark.co.uk/articles/gpu-text-rendering-in-webk... (2014):
“If an element is promoted to the GPU in current versions of Chrome, Safari or Opera then you lose subpixel antialiasing and text is rendered using the greyscale method”
So, what’s missing? Given that comment, at least part of the step from UTF-8 string to bitmap can be done on the GPU, can’t it?
The idea that the state of the art or what's being shipped to customers haven't advanced is false.
Can you tell me more about it? I love making tutorials about GPU stuff and I would love to structure them like yours.
Is it an existing template? Is it part of some sort of course?
The point about intersections (or hard corners in general) is the issue with distance fields though. You can counteract it a bit by having multiple distance fields and rendering the intersection of them. See e.g. https://github.com/Chlumsky/msdfgen
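For reference, the decode side of that approach is tiny: store three slightly offset distance fields in R, G and B and take their median per sample, which keeps the corners that a single field would round off. A sketch of just the decode (the clever part is the generation in msdfgen, not this):

    def msdf_sample(r, g, b):
        # Median of the three channel distances. At a sharp corner, each channel
        # rounds the corner off in a different direction, but at least two of
        # the three still agree on which side of the true edge a sample is, so
        # the median reconstructs the corner instead of rounding it.
        return max(min(r, g), min(max(r, g), b))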
What happened to the dot of the italic "j" in the first video?
As a side note, from the first "menu UI" until the end, I had the Persona music in my head ^^ (It was a surprise reading the final words)
But subpixel AA is futile in my opinion. It was a nice hack in the aughts when we had 72dpi monitors, but on modern "retina" screens it's imperceptible. And for a teeny tiny improvement, you get many drawbacks:
- it only works over opaque backgrounds
- can't apply any effect on the rasterized results (e.g. resizing, mirroring, blurring, etc.)
- screenshots look bad when viewed on a different display
- A protocol to ask the hardware
- A database of quirks about hardware that is known to provide wrong information
- A user override for when neither of the previous options do the job
On my rarely used Windows partition, I have used ClearType Tuner (name?) to set up ClearType to my preferences. The results are still somewhat grainy and thin, but that's a general property of Windows font rendering.
This isn't just legacy hardware; 96dpi monitors and notebooks are still being produced today.
I also got nerd sniped by Sebastian Lague's recent video on text rendering [0] (also linked to in the article) and started writing my own GPU glyph rasterizer.
In the video, Lague makes a key observation: most curves in fonts (at least for the Latin alphabet) are monotonic. Monotonic Bezier curves are contained within the bounding box of their end points (this applies to any monotonic curve, not just Beziers). The curves that are not monotonic are very easy to split by solving the zeros of the derivative (a linear equation) and then splitting the curve at those points. This is also where Lague went astray and attempted a complex procedure using geometric invariants, when it's trivially easy to split Beziers using de Casteljau's algorithm as described in [1]. It made for entertaining video content, but I was yelling at the screen for him to open Pomax's Bezier curve primer [1] and just get on with it.
For monotonic curves, it is computationally easy to solve the winding number for any pixel outside the bounding box of the curve: it's +1 if the pixel is to the right of or below the bounding box, -1 if to the left or above, and 0 if it lies outside that "plus sign" shaped region, off in the diagonal corners.
Furthermore, this can be expanded to solving the winding number for an entire axis-aligned box. This can be done for an entire GPU warp (32 to 64 threads): each thread in a warp looks at one curve and checks whether the winding number is the same for the whole warp; if so, it accumulates it, and if not, it sets a bit marking that this curve needs to be evaluated per thread.
In this way, very few pixels actually need to solve the quadratic equation for a curve in the contour.
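A simplified CPU-side sketch of that early-out, for one monotonic quadratic segment and a rightward horizontal ray (the sign convention and exact region tests here are illustrative, not lifted from the actual shader):

    def winding_contribution(p0, p1, p2, px, py):
        # Contribution of one monotonic quadratic segment (control points p0, p1, p2)
        # to the winding number at pixel (px, py), using a horizontal ray towards +x.
        # Because the segment is monotonic, the box spanned by the two *end points*
        # is its full AABB, and there is exactly one crossing to account for.
        (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
        y_lo, y_hi = min(y0, y2), max(y0, y2)
        if not (y_lo <= py < y_hi):          # half-open range avoids double-counting joints
            return 0
        direction = 1 if y2 > y0 else -1     # crossing sign from the curve's direction
        if px >= max(x0, x2):                # right of the box: the ray misses it
            return 0
        if px < min(x0, x2):                 # left of the box: guaranteed crossing
            return direction
        # Only in the remaining sliver do we pay for the quadratic: find the t
        # where y(t) == py, then compare the curve's x(t) against the pixel's x.
        a = y0 - 2.0 * y1 + y2
        b = 2.0 * (y1 - y0)
        c = y0 - py
        if abs(a) < 1e-12:                   # degenerate: linear in y
            t = -c / b
        else:
            root = max(b * b - 4.0 * a * c, 0.0) ** 0.5
            t = (-b - root) / (2.0 * a)
            if not 0.0 <= t <= 1.0:          # monotonic: exactly one root is valid
                t = (-b + root) / (2.0 * a)
        xt = (1.0 - t) ** 2 * x0 + 2.0 * (1.0 - t) * t * x1 + t * t * x2
        return direction if px < xt else 0

The point is that the expensive quadratic branch is only reached for the thin sliver of pixels whose x lies between the two end points.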
There's still one optimization I haven't done: solving the quadratic equation for 2x2 pixel quads. I solve both the vertical and the horizontal winding number for good robustness on horizontal and vertical lines. But the solution of the horizontal quadratic for a pixel and the pixel below it is the same +/- 1, and ditto for the vertical. So you can get two of these solutions (each a square root and a division, expensive arithmetic ops) for the price of one if you work on 2x2 quads and use a warp-level swap to exchange the results and add or subtract 1. This can only be done in orthographic projection without rotation, but the rest of the method also works with perspective, rotation and skew.
For a bit of added robustness, Jim Blinn's "How to solve a quadratic equation?" [2] can be used to get rid of some pesky numerical instability.
I'm not quite done yet, and I've only got a rasterizer, not the other parts you need for a text rendering system (font file i/o, text shaping etc).
But the results are promising: I started at 250 ms per frame for a 4k rendering of an '@' character with 80 quadratic Bezier curves, evaluating every curve at every pixel, but got down to 15 ms per frame by applying the warp and monotonic-bounding-box optimizations.
These numbers are not very impressive because they were measured on a 10-year-old integrated laptop GPU. It's so much faster on a discrete gaming GPU that I could stop optimizing here if that were my target HW. But it's already fast enough for real-time use in practice on the laptop, because I was drawing entire screen-sized glyphs for the benchmark.
[0] https://www.youtube.com/watch?v=SO83KQuuZvg [1] https://pomax.github.io/bezierinfo/#splitting [2] https://ieeexplore.ieee.org/document/1528437
What's a "monotonic Bezier curve"?
Btw, every Bezier curve is contained within its control points' convex hull. It follows from the fact that all points on a Bezier curve are some convex combination of the control points. In other words, the Bezier basis functions sum to 1, and are nonnegative everywhere.
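In symbols (just the standard Bernstein-basis facts):

    B(t) = \sum_{i=0}^{n} P_i \, B_i^n(t),
    \qquad
    B_i^n(t) = \binom{n}{i} t^i (1-t)^{n-i} \ge 0 \quad \text{for } t \in [0,1],

    \sum_{i=0}^{n} B_i^n(t) = \bigl(t + (1-t)\bigr)^n = 1 \quad \text{(binomial theorem)},

so every B(t) is a convex combination of the control points P_i and therefore lies inside their convex hull.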
Good question!
It's a Bezier curve whose derivative is non-zero in both x and y for t in (0, 1), i.e. the curve is monotonic in each coordinate.
Just your high school calculus definition of monotonic.
To get from a general quadratic Bezier to monotonic sections, you solve the derivative for zeros in the x and y directions (a linear equation each). If a zero lies between 0 and 1 (exclusive), split the Bezier curve using de Casteljau's algorithm at t=t_0x and t=t_0y. For each quadratic Bezier you get one to three monotonic sections.
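A sketch of exactly that in plain Python, for the quadratic case (the helper names are made up for illustration):

    def split_quadratic(p0, p1, p2, t):
        # de Casteljau split of a quadratic Bezier at parameter t:
        # returns the control points of the two sub-curves.
        lerp = lambda a, b, s: (a[0] + (b[0] - a[0]) * s, a[1] + (b[1] - a[1]) * s)
        q0, q1 = lerp(p0, p1, t), lerp(p1, p2, t)
        m = lerp(q0, q1, t)                      # the point on the curve at t
        return (p0, q0, m), (m, q1, p2)

    def monotonic_sections(p0, p1, p2, eps=1e-9):
        # Split a quadratic Bezier at the zeros of x'(t) and y'(t) so that every
        # piece is monotonic in both axes. Yields 1 to 3 sub-curves.
        ts = []
        for axis in (0, 1):
            # The derivative is linear: B'(t) = 2*((1-t)*(p1-p0) + t*(p2-p1))
            d0 = p1[axis] - p0[axis]
            d1 = p2[axis] - p1[axis]
            if abs(d0 - d1) > eps:
                t = d0 / (d0 - d1)
                if eps < t < 1.0 - eps:
                    ts.append(t)

        sections, rest, prev = [], (p0, p1, p2), 0.0
        for t in sorted(ts):
            if t - prev < eps:                   # x and y zeros coincide
                continue
            local = (t - prev) / (1.0 - prev)    # re-map global t into the remaining piece
            left, rest = split_quadratic(*rest, local)
            sections.append(left)
            prev = t
        sections.append(rest)
        return sections

    # e.g. monotonic_sections((0, 0), (100, 0), (100, 100)) -> one section (already monotonic)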
> every Bezier curve is contained within its control points' convex hull.
This is true, but only monotonic Bezier curves are contained within the AABB formed by the two end points (so the middle control points don't need to be considered when building the AABB).
For a quadratic Bezier this means that it is monotonic iff the middle control point is inside the AABB of the two end points.
The monotonicity is a requirement for all the GPU warp-level AABB magic to happen (which is a nice 10x to 20x perf increase in my benchmarks). At worst you'd have to deal with 3x the number of curves after splitting (still a win), but because most curves in fonts are monotonic, the splitting doesn't increase the number of curves much in practice.
Monotonicity also implies that the quadratic equations have only one unique solution for any horizontal or vertical line. No need to classify the roots as in Lengyel's method.
To actually render text I would still need to add text shaping to get from strings of text to pixels on the screen.
But a bigger problem is how to integrate it into someone else's software project. I have the rasterizer GPU shader, some CPU preprocessing code, and the required 3D API code (I'm using Vulkan). I'm not sure how people usually integrate this kind of component into their software: do they just want the shader and the preprocessor, or do they expect the 3D API code too? Packaging it into a library that includes the Vulkan code has its own interoperability problems.
You'll need to hope that it rains during my summer vacation; maybe then this project will see some progress :)
The Unity solution is quite limiting. Someone made a great asset about 10 years ago and was actively developing it, but then Unity bought it and integrated it into their subsystem, and after that all development basically stopped. Now a lot of games are using Slug, and it renders some really beautiful text, but unfortunately they only do large-scale licensing to big companies.
Some thoughts I've come up with:

1. Creating text is basically a case of texture synthesis. We used to just throw it into textures and let the GPU handle it, but obviously it's a general-purpose device and not meant to know which information (such as edges) is more important than the rest. Even a 2k texture of a letter doesn't look 100% when you view it at full screen size.
2. Edge Lines: I have a little business card here from my Doctor's office. At e.g. 20-30cm normal human reading distance the text on it looks great, but zooming in close you can see a lot of differences. The card material isn't anywhere close to flat and the edge lines are really all over the place, in all kinds of interesting ways.
3. Filling: The same happens for the center-filled part of a letter or vector: when you zoom in you can see a lot of flaws creep in, e.g. 5% of the visible layer has some color flaw or other. There are black flecks on the white card, white/black flecks in a little apple logo they have, etc.
4. So basically I want to add a distance parameter as well as using size. Both of these are relatively irrelevant for just rendering normal 2d text, but in 3d people will often go stand right up against something, so the extra detailing will add a lot. For the synthesis part, there's no reason any of the lines should be a solid fill instead of, for example, some artistic brush stroke, or some derivative noise curves used to create extra/stable information/detailing.
5. Another thing to look at might be work subdivision. Instead of rendering a whole set of letters in N time, if the camera is relatively stable we can refine those over successive frames to improve detailing, for example go from 2 to M subdivisions per curve.
6. There are numerous available examples such as the following concurrent B-Tree: https://www.youtube.com/watch?v=-Ce-XC4MtWk In their demo they fly into e.g. a planetary body and keep subdividing the visible terrain on-screen to some minimum size; then they can synthesize extra coherent detailing that matches that zoom level from some map, in their case e.g. noise for the moon, or code for water etc.
I find that a lot of the people working on text are sort of doing their own thing data structure wise, instead of looking at these already implemented and proven concurrent solutions. Here is another interesting paper, DLHT: A Non-blocking Resizable Hashtable with Fast Deletes and Memory-awareness / https://arxiv.org/abs/2406.09986
Not to say that those are the way to parallelize the work, or even if that is necessary, but it might be an interesting area where one can increase detailing.
No. I don't. This is a horrifying concept. It implies that the same character may look different every time it is printed! This is extremely noticeable and really awful. For example, when you align equal signs on consecutive lines of code, you notice straight away whether the characters are different.
Nowadays pixels are so small that I don't understand why we don't all just use good-quality bitmap fonts. I do, and I couldn't be happier with them. They are crisp to a fault, and their correct rendering does not depend on the gamma of the display (which is a serious problem that TFA does not even get into).
base_slot_coordinates := decode_morton2_16(xx index);
The “xx” in Jai means “autocast”.