Lots of past discussions:
https://news.ycombinator.com/item?id=35076487 74 points, 2 years ago, 69 comments
https://news.ycombinator.com/item?id=26950455 81 points, 4 years ago, 70 comments
https://news.ycombinator.com/item?id=20535984 143 points, 6 years ago, 79 comments
https://news.ycombinator.com/item?id=8614159 118 points, 10 years ago, 64 comments
https://news.ycombinator.com/item?id=1472175 46 points, 15 years ago, 20 comments
Audio samples are point samples (usually). This is nice, because there's a whole theory on how to upsample point samples without loss of information. But more importantly, this theory works because it matches how your playback hardware functions (for both analog and digital reasons that I won't go into).
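A rough sketch of what that reconstruction theory looks like in code (Whittaker-Shannon / sinc interpolation, written in Julia; the names `normsinc`, `reconstruct`, and `upsample2x` are just mine for illustration, and the truncated sum is only approximate near the ends of the array):
# Band-limited (sinc) reconstruction: given point samples of a suitably
# band-limited signal taken at unit spacing, the continuous signal can be
# rebuilt and re-sampled anywhere.
normsinc(t) = t == 0 ? 1.0 : sin(pi * t) / (pi * t)
# Value of the reconstructed signal at (possibly non-integer) position t,
# from samples s[1], s[2], ... taken at positions 1, 2, ...
reconstruct(s, t) = sum(s[n] * normsinc(t - n) for n in eachindex(s))
# 2x upsampling: evaluate the reconstruction halfway between the originals.
upsample2x(s) = [reconstruct(s, t) for t in 1:0.5:length(s)]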
Pixels, however, are actually displayed by the hardware as little physical rectangles. Take a magnifying glass and check. Treating them as points is a bad approximation that can only result in unnecessarily blurry images.
I have no idea why this article is quoted so often. Maybe "everybody is doing it wrong" is just a popular article genre. Maybe not everyone is familiar enough with sampling theory to know exactly why it works for audio (and hence why those reasons don't apply to graphics).
There is also the complication of composite video signals, where you can't treat pixels as linearly independent components.
This signal processing applies to images as well. Resampling is used very often for upscaling; Lanczos resampling is one example: https://en.wikipedia.org/wiki/Lanczos_resampling
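For anyone curious, here is a rough 1-D sketch of the Lanczos idea in Julia (the function names are mine, and a real implementation would handle edge normalization and do 2-D separably):
# Lanczos-a kernel: a sinc windowed by a wider sinc, zero outside |x| < a.
normsinc(x) = x == 0 ? 1.0 : sin(pi * x) / (pi * x)
lanczos(x, a = 3) = abs(x) < a ? normsinc(x) * normsinc(x / a) : 0.0
# Resample a 1-D array of point samples at an arbitrary position t:
# a weighted sum of the 2a nearest samples around t.
function resample(s, t, a = 3)
    lo = max(1, floor(Int, t) - a + 1)
    hi = min(length(s), floor(Int, t) + a)
    sum(s[n] * lanczos(t - n, a) for n in lo:hi)
end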
> It was already wrong in 1995 when monitors were CRTs, and it's way wrong in 2025 in the LCD/OLED era where pixels are truly discrete.
I don't think it has anything to do with display technologies though. Imagine this: there is a computer that is dedicated to image processing. It has no display, no CRT, no LCD, nothing. The computer is running a service that is resizing images from 100x100 pixels to 200x200 pixels. Would the programmer of this server be better off thinking in terms of samples or rectangular subdivisions of a display?
Alvy Ray Smith, the author of this paper, was coming from the background of developing Renderman for Pixar. In that case, there were render farms doing all sorts of graphics processing before the final image was displayed anywhere.
How about a counter example: As part of a vectorization engine you need to trace the outline of all pixels of the same color in a bitmap. What other choice do you have than to think of pixels as squares with four sides?
I think that’s a bad example. For vector tracing, you want the ability to trace using lines and curves at any angle, not just along the pixel boundaries, so you want to see the image as a function from ℝ² to RGB space for which you have samples at grid positions. The target then is to find a shape that covers the part of ℝ² that satisfies a discriminator function (e.g. “red component at least 0.8, green and blue components at most 0.1”) decently well, balancing the simplicity of the shape (in terms of number of control points or something like that) against the quality of the cover.
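(For concreteness, the discriminator in that example is just a predicate on an RGB sample; the name and the exact thresholds here are only illustrative:)
# "Red component at least 0.8, green and blue components at most 0.1",
# as a predicate on an (r, g, b) tuple with channels in [0, 1].
is_red(c) = c.r >= 0.8 && c.g <= 0.1 && c.b <= 0.1
# e.g. is_red((r = 0.95, g = 0.02, b = 0.05))  # returns true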
I think your two examples nicely illustrate that it's all about the display technology.
> The computer is running a service that is resizing images from 100x100 pixels to 200x200 pixels. Would the programmer of this server be better off thinking in terms of samples or rectangular subdivisions of a display?
That entirely depends on how the resizing is done. Usually people choose nearest neighbor in scenarios like that to be faithful to the original 100x100 display, and to keep the images sharp. This treats the pixels as squares, which means the programmer should do so as well.
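A minimal sketch of what that nearest-neighbor path does, just to make the point concrete (the function name is mine; `img` is any matrix):
# Nearest-neighbor 2x upscale: every source pixel is replicated into a
# 2x2 block, i.e. each pixel is treated literally as a little square.
function upscale2x_nearest(img)
    h, w = size(img)
    out = similar(img, 2h, 2w)
    for j in 1:w, i in 1:h
        out[2i-1:2i, 2j-1:2j] .= img[i, j]
    end
    return out
end
# e.g. upscale2x_nearest(rand(100, 100)) gives a 200x200 image.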
> Alvy Ray Smith, the author of this paper, was coming from the background of developing Renderman for Pixar.
That's meaningful context. I'm sure that in 1995, Pixar movies were exposed onto analog film before being shown in theatres. I'm almost certain this process didn't preserve sharp pixels, so "pixels aren't squares" was perhaps literally true for this technology.
Perhaps I should have chosen a higher resolution. AIUI, in many modern systems, such as your OS, it’s usually bilinear or Lanczos resampling.
You say that the resize should be faithful to the “100x100 display”, but we don’t know whether the image came from such a display, from a camera, or was generated by software.
> I'm almost certain this process didn't preserve sharp pixels
Sure, but modern image processing pipelines work the same way. They try to capture the original signal, treating the data as a representation of a continuous signal rather than just a grid of squares.
I suppose this is different for a “pixel art” situation, where resampling has to be explicitly set to nearest neighbor. Even so, images like that have problems in modern video codecs, which model samples of a continuous signal.
And yes, I am aware that the “pixel” in “pixel art” means a little square :). The terminology being overloaded is what makes these discussions so confusing.
> a bad approximation that can only result in unnecessarily blurry images
If you light up pixels in a row, you get a line - a long thin rectangle - and not a chain of blobs. If you light them up diagonally, you get a jagged line. For me that is proof that they are squares - at least close enough to squares. Heck, even on old displays that don't have a square pixel ratio they are squished squares ;-). And you have to treat them like little squares if you want to understand antialiasing, or why you sometimes have to add (0.5, 0.5) to get sharp lines.
(And a counterpoint: The signal-theoretical view that they are point samples is useful if you want to understand the role of gamma in anti-aliasing, or if you want to do things like superresolution with RGB-sub-pixels.)
See also https://www.reddit.com/r/apple/comments/9fp1ty/did_you_ever_....
But these samples are usually called fragments, not pixels. They turn into little square pixels later in the pipeline, so yeah, I guess that pixels really are little squares, or maybe little rectangles.
I think it actually depends on what you define as a "pixel". Sure, a pixel on your screen emits light from a tiny square into space. And sure, a sensor pixel measures the intensity over a tiny square.
But let's say I calculate something like:
# samples from 0, 0.1, ..., 1
x = range(0, 1, 11)
# evaluate the sin function at each point
y = sin.(x)
Then each pixel (or entry in the array) is not a tiny square. It represents the value of sin at this specific location. A real pixelated detector would have integrated sin over each bin, `y[u] = int_{u}^{u + 0.1} sin(x) dx`, which is entirely different from the pointwise evaluation before. So for me that's the main difference to understand.
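(In the same spirit, a sketch of the "real pixelated detector" version of that snippet; the integral is done analytically since -cos is an antiderivative of sin, and `y_binned` is just a name I picked:)
# Each entry integrates sin over its 0.1-wide bin instead of evaluating it
# at a single point.
edges = range(0, 1, 11)                                    # bin edges 0, 0.1, ..., 1
y_binned = [cos(edges[u]) - cos(edges[u + 1]) for u in 1:10]  # integral of sin over each bin
# Dividing by the bin width 0.1 gives bin averages, which are close to,
# but not equal to, the point samples sin.(edges).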
Screen pixels are (nowadays) usually three vertical rectangles that occupy a square spot on the grid that forms the screen. This is sometimes exploited for sub-pixel font smoothing purposes.
Digital photography pixels are reconstructed from sensors that each perceive a cone of incoming light in a certain frequency band, arranged in a Bayer grid.
Rendered 3D scene pixels are point samples, unless they approximate cones by sampling a neighborhood of the pixel center.
In any case, Nyquist will tear your head off and spit into your neck hole as soon as you come close to any kind of pixel. Square or point.
Consider the Direct3D rasterization rules[1], which offset each sample point by 0.5 on each axis to sample "at the pixel center". Why are the "pixel centers" even at half-integer coordinates in the first place? Because if you think of pixels as little squares, it's tempting to align the "corners" with integer coordinates like graph paper. If instead the specifiers had thought of pixels as a lattice of sample points, it would have been natural to align the sample points with integer coordinates. "Little square" pixels resulted in an unneeded complication to sampling, an extra translation by a fractional distance, so now every use of the API for pixel-perfect rendering must apply the inverse transform.
[1]: https://learn.microsoft.com/en-us/windows/win32/direct3d11/d...
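To make the two conventions concrete (plain arithmetic, not the actual D3D API; the function names are mine):
# Pixel (i, j), with i in 0:W-1 and j in 0:H-1, under the two conventions.
# "Little square" convention (what D3D-style rasterizers use): the sample
# point is the square's center, half a pixel in from the integer corner.
sample_square(i, j) = (i + 0.5, j + 0.5)
# "Lattice of sample points" convention: the sample sits on the integer grid,
# and no half-pixel correction is ever needed.
sample_lattice(i, j) = (float(i), float(j))
# Map a sample x coordinate in [0, W] to normalized device coordinates [-1, 1];
# under the first convention, pixel-perfect drawing has to account for the
# extra 0.5 before this mapping.
ndc_x(x, W) = 2x / W - 1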