
Me/CFS: The blind spot in proactive medicine (Open Letter)

https://github.com/debugmeplease/debug-ME
1•debugmeplease•29s ago•1 comments

Ask HN: What word games do you play every day?

1•gogo61•3m ago•0 comments

Show HN: Paper Arena – A social trading feed where only AI agents can post

https://paperinvest.io/arena
1•andrenorman•4m ago•0 comments

TOSTracker – The AI Training Asymmetry

https://tostracker.app/analysis/ai-training
1•tldrthelaw•8m ago•0 comments

The Devil Inside GitHub

https://blog.melashri.net/micro/github-devil/
2•elashri•9m ago•0 comments

Show HN: Distill – Migrate LLM agents from expensive to cheap models

https://github.com/ricardomoratomateos/distill
1•ricardomorato•9m ago•0 comments

Show HN: Sigma Runtime – Maintaining 100% Fact Integrity over 120 LLM Cycles

https://github.com/sigmastratum/documentation/tree/main/sigma-runtime/SR-053
1•teugent•9m ago•0 comments

Make a local open-source AI chatbot with access to Fedora documentation

https://fedoramagazine.org/how-to-make-a-local-open-source-ai-chatbot-who-has-access-to-fedora-do...
1•jadedtuna•10m ago•0 comments

Introduce the Vouch/Denouncement Contribution Model by Mitchellh

https://github.com/ghostty-org/ghostty/pull/10559
1•samtrack2019•11m ago•0 comments

Software Factories and the Agentic Moment

https://factory.strongdm.ai/
1•mellosouls•11m ago•1 comments

The Neuroscience Behind Nutrition for Developers and Founders

https://comuniq.xyz/post?t=797
1•01-_-•11m ago•0 comments

Bang bang he murdered math {the musical } (2024)

https://taylor.town/bang-bang
1•surprisetalk•11m ago•0 comments

A Night Without the Nerds – Claude Opus 4.6, Field-Tested

https://konfuzio.com/en/a-night-without-the-nerds-claude-opus-4-6-in-the-field-test/
1•konfuzio•14m ago•0 comments

Could ionospheric disturbances influence earthquakes?

https://www.kyoto-u.ac.jp/en/research-news/2026-02-06-0
2•geox•15m ago•1 comments

SpaceX's next astronaut launch for NASA is officially on for Feb. 11 as FAA clea

https://www.space.com/space-exploration/launches-spacecraft/spacexs-next-astronaut-launch-for-nas...
1•bookmtn•17m ago•0 comments

Show HN: One-click AI employee with its own cloud desktop

https://cloudbot-ai.com
2•fainir•19m ago•0 comments

Show HN: Poddley – Search podcasts by who's speaking

https://poddley.com
1•onesandofgrain•20m ago•0 comments

Same Surface, Different Weight

https://www.robpanico.com/articles/display/?entry_short=same-surface-different-weight
1•retrocog•22m ago•0 comments

The Rise of Spec Driven Development

https://www.dbreunig.com/2026/02/06/the-rise-of-spec-driven-development.html
2•Brajeshwar•26m ago•0 comments

The first good Raspberry Pi Laptop

https://www.jeffgeerling.com/blog/2026/the-first-good-raspberry-pi-laptop/
3•Brajeshwar•26m ago•0 comments

Seas to Rise Around the World – But Not in Greenland

https://e360.yale.edu/digest/greenland-sea-levels-fall
2•Brajeshwar•26m ago•0 comments

Will Future Generations Think We're Gross?

https://chillphysicsenjoyer.substack.com/p/will-future-generations-think-were
1•crescit_eundo•30m ago•1 comments

State Department will delete Xitter posts from before Trump returned to office

https://www.npr.org/2026/02/07/nx-s1-5704785/state-department-trump-posts-x
2•righthand•33m ago•1 comments

Show HN: Verifiable server roundtrip demo for a decision interruption system

https://github.com/veeduzyl-hue/decision-assistant-roundtrip-demo
1•veeduzyl•34m ago•0 comments

Impl Rust – Avro IDL Tool in Rust via Antlr

https://www.youtube.com/watch?v=vmKvw73V394
1•todsacerdoti•34m ago•0 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
3•vinhnx•35m ago•0 comments

minikeyvalue

https://github.com/commaai/minikeyvalue/tree/prod
3•tosh•39m ago•0 comments

Neomacs: GPU-accelerated Emacs with inline video, WebKit, and terminal via wgpu

https://github.com/eval-exec/neomacs
1•evalexec•44m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
2•ShinyaKoyano•48m ago•1 comments

How I grow my X presence?

https://www.reddit.com/r/GrowthHacking/s/UEc8pAl61b
2•m00dy•50m ago•0 comments

Convolutions, Polynomials and Flipped Kernels

https://eli.thegreenplace.net/2025/convolutions-polynomials-and-flipped-kernels/
106•mfrw•8mo ago

Comments

Sourabhsss1•8mo ago
The visualizations make the concept easy to grasp.
bjt12345•8mo ago
This and complex analysis are fascinating topics in Undergraduate studies.
esafak•8mo ago
Contour integrals still feel cool.
incognito124•8mo ago
My favourite use case for this: by the same derivation as in this blog, one can prove that for any two random variables X and Y (their distributions can be different), the probability distribution of X+Y is the convolution of the PMFs/PDFs of X and Y.
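
A minimal NumPy sketch of that correspondence (the two-dice example is an illustration, not from the thread): the PMF of the sum of two independent dice is the convolution of their PMFs.

    import numpy as np

    # PMF of a fair six-sided die over the values 1..6
    die = np.full(6, 1 / 6)

    # PMF of the sum of two independent dice = convolution of the PMFs.
    # Entry i of `total` is P(sum = i + 2).
    total = np.convolve(die, die)

    for value, p in enumerate(total, start=2):
        print(f"P(sum = {value:2d}) = {p:.4f}")  # peaks at P(sum = 7) = 6/36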
srean•8mo ago
Along similar lines, the MAX operator on random variables becomes the PRODUCT operator on their distribution functions.

It's fun to play with the (Max, +) algebra of random variables and infer its distribution.

This turns out to be quite useful in estimating the completion time of dependent parallel jobs.

Spawning multiple parallel jobs becomes a Max operation and chaining sequential jobs becomes a '+' operation on the completion times. This expression tree of Max'es and Plus'es can be algebraically processed to obtain bounds on the completion time distribution.

One example is the straggler problem in mapreduce/Hadoop. In the naive case, the completion time is the max of each parallel subtask. (*)

If the tasks have a heavy tail, which they sometimes do, the straggler's completion time can be really bad. This can be mitigated with a k-out-of-n setup, where you encode the problem in such a way that only k out of n jobs need to finish to obtain the final result. One can play with this trade-off between potentially wasted computation and expected completion time.

For heavy-tailed distributions another simplification is possible: the tails of Max and + become of the same order, so one can switch between convolutions and products.

(*) This shows up in microservices architectures as well. The owner/maintainer of a microservice endpoint might be very happy with its tail latency. However, an end user who consumes and composes the results of the endpoints can experience really bad tail latencies for the final output.
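
A minimal sketch of this algebra on a discrete time grid (the PMF numbers below are hypothetical, chosen only to illustrate): chaining convolves PMFs, and fan-in (wait for all) multiplies CDFs, assuming independent completion times.

    import numpy as np

    # PMF of one job's completion time on the grid t = 0..3
    # (hypothetical numbers, purely for illustration).
    job = np.array([0.0, 0.5, 0.3, 0.2])

    def seq(pmf_a, pmf_b):
        """Chaining: T_a + T_b -> convolve the PMFs."""
        return np.convolve(pmf_a, pmf_b)

    def par(pmf_a, pmf_b):
        """Fan-in (wait for all): max(T_a, T_b) -> multiply the CDFs."""
        n = max(len(pmf_a), len(pmf_b))
        cdf = np.cumsum(np.pad(pmf_a, (0, n - len(pmf_a)))) * \
              np.cumsum(np.pad(pmf_b, (0, n - len(pmf_b))))
        return np.diff(cdf, prepend=0.0)

    # Two jobs in parallel, then one sequential aggregation step:
    pipeline = seq(par(job, job), job)
    print(pipeline.sum())  # ~1.0: still a valid PMF
    print(pipeline)        # distribution of the total completion time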

incognito124•8mo ago
Thanks for sharing the name of that problem! I've encountered it before while optimizing batched LLM inference. The whole batch would last until all queries in the batch were done, and by changing the batch size you'd trade off per-query speed (better in a larger batch) against overall performance (worse with a larger batch).

Nowadays I think this is solved in an entirely different way, though.

srean•8mo ago
It gets more entertaining.

It's common to wrap API calls with

retry on failure, or

spawn an identical request if taking longer than x, or

recursively spawn an identical request if taking longer than x, or

retry on failure but no more than k times.

All of these and similar patterns/decorators can be analysed using the same idea.
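
For instance, "spawn an identical request if taking longer than x" turns the completion time into min(T1, x + T2), which is easy to explore by simulation (a sketch; the lognormal latency model and the cutoff x = 2 are assumptions for illustration only):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # Stand-in for a heavy-tailed service latency (assumed, not measured).
    t1 = rng.lognormal(mean=0.0, sigma=1.5, size=n)
    t2 = rng.lognormal(mean=0.0, sigma=1.5, size=n)

    # Fire a duplicate request at time x; take whichever finishes first.
    x = 2.0
    hedged = np.minimum(t1, x + t2)

    for q in (0.50, 0.99, 0.999):
        print(f"p{100 * q:g}: plain = {np.quantile(t1, q):8.2f}, "
              f"hedged = {np.quantile(hedged, q):8.2f}")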

incognito124•8mo ago
Oh wow, pretty cool stuff! If you have more to share, you can always dump it in my mail inbox
silloncito•8mo ago
You should be careful with your estimates. The events must be independent for those properties to apply, but it is very common for one cause to influence many factors; then the events are not independent, and all the beautiful math no longer works as it does under independence. On the worst day, everything fails at once, because one resource can block others and the whole system gets strangled. The Black Swan book is about one rare event making the financial market realize what risk is.
srean•8mo ago
A load-balancer transparently sitting in front of the api end-point (not an uncommon scenario) usually decouples things well enough to be practically independent.

That said, silloncito's warning does need to be heeded.

While independence is essential for the proof to go through, the relationships need not break catastrophically when independence breaks; usually there is graceful degradation with the degree of dependence. There are, however, specific, often degenerate, theoretical edge cases where the degradation is rapid.

ttoinou•8mo ago
What? I've never heard of this math. Do you mean those are literally the resulting operations in the general case, or is that an approximate explanation, so that we need to find more specific cases to make it true?
srean•8mo ago
The math is general and exact.

Max and plus in the space of random variables become product and convolution, respectively, in the space of their distribution functions.

    Distr(X + Y) = DistrX ° DistrY

    Distr(X ^ Y) = DistrX * DistrY

    where '^' denotes Max and '°' denotes convolution.
Note that *, +, ° and ^ are commutative and associative, so they can be chained. One can also use their distributive properties. This is really the math of groups and rings.

However, one can and one does resort to approximations to compute the desired end results.

More specifically, people are often interested not in the distribution, but some statistics. For example, mean, standard deviation, some tail percentile etc. To compute those stats from the exact distributions, approximations can be employed.

gjm11•8mo ago
Surely this isn't quite right.

Max of variables = product of cumulative distribution functions.

Sum of variables = convolution of probability density functions.

So both of the equations you write down are correct, but only if you interpret "Distr" as meaning different things in the two cases.

[EDITED to add:] Provided the random variables in question are independent, as mentioned elsewhere in the discussion; if they aren't then none of this works.

srean•8mo ago
The original post, to which I replied, is about the correspondence between summation of random variables and convolution of their distributions. Independence is sufficient for that.

I just carried that assumption of independence through in my own comment, thinking it obvious to do so (carry over the assumptions).

ttoinou•8mo ago
But is he right about the different meanings of Distr in your equations?
srean•8mo ago
No he is not.

Both ideas work for the cumulative distribution function, which in math is called simply the distribution function. I think he was confused by the fact that the convolution relation also works with densities (so he may have assumed that it works with densities only and not with distribution functions).

gjm11•8mo ago
I'm sorry, but I think you are just wrong about convolutions and cumulative distribution functions.

Let's take the simplest possible case: a "random" variable that's always equal to 0. Its cdf is a step function: 0 for negative values, 1 for positive values. (Use whatever convention you prefer for the value at 0.)

The sum of two such random variables is another with the same distribution, of course. So, what's the convolution of the cdfs?

Answer: it's not even well defined.

The convolution of functions f and g is a function such that h(x) = integral over t of f(t) g(x-t) dt. The integral is over the whole of (in this case) the real numbers.

In this case f and g are both step functions as described above, so (using the convenient Iverson bracket notation for indicator functions) this is the integral over t of [t>0] [x-t>0] dt, i.e., of [0<t<x] dt, whose value is 0 for negative x and x for positive x.

This is not the cdf of any probability distribution since it doesn't → 1 as x → ∞. In particular, it isn't the cdf of the right probability distribution which, as mentioned above, would be the same step function as f and g.

If X,Y are independent with pdfs f,g and cdfs F,G then the cdf of X+Y is (not F conv G but) f conv G = F conv g.
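
A numerical check of that last point (a sketch, taking X, Y ~ Uniform(0, 1) independent, on a discrete grid):

    import numpy as np

    dt = 0.001
    t = np.arange(0, 3, dt)

    f = ((t >= 0) & (t < 1)).astype(float)  # pdf of X (and of Y)
    F = np.clip(t, 0, 1)                    # cdf of X (and of Y)

    fG = np.convolve(f, F)[:len(t)] * dt    # f conv G: the actual cdf of X + Y
    FG = np.convolve(F, F)[:len(t)] * dt    # F conv G

    i = int(2.5 / dt)
    print(fG[i])  # ~1.0: a genuine cdf value, since X + Y <= 2 always
    print(FG[i])  # ~1.5: already exceeds 1, so F conv G is not a cdf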

gjm11•8mo ago
Oops, one thing in the above is completely wrong (I wrote it before thinking things through carefully, and then forgot to delete it).

It is not at all true that "it's not even well defined"; indeed, the following couple of paragraphs determine exactly what the thing in question is. It's not an actual cdf, but the problem isn't that it's ill-defined: it's well-defined but has the wrong shape to be a cdf.

silloncito•8mo ago
Let $M = \max(X, Y)$. If $X$ and $Y$ are independent, then

  $F_M(k) = P(M \le k) = P(X \le k \text{ and } Y \le k) = P(X \le k) \cdot P(Y \le k) = F_X(k) \cdot F_Y(k)$,

so $F_M = F_X \cdot F_Y$.
srean•8mo ago
Thanks for making the assumption of independence explicit and welcome to HN.
silloncito•8mo ago
Thank you for your welcome; I must have been lurking here for around 30 years or more (always changing accounts). Anyway, in this specific case, since M = Max(X,X) = X, you can't have F(M) = F(X)*F(X) = F(X) except when F(X) is in {0,1}, so the independence property is essential. Welcome, fellow Lisper (from the txr and related submissions), fellow math-inspired (this one and another related to statistical estimation), with OS-related interests (your HN account); OSes are not my cup of tea, but awk is not bad.

In another post there are some comments on connections between topology and deep learning. I wonder if there is a notion similar to dimension in topology that would allow one to estimate the minimal size (number of parameters) of a neural network such that it is able to achieve a certain state (for example, the capacity for one-shot learning with high probability).
srean•8mo ago
Yes independence is absolutely an assumption that I (implicitly) made. It's essential for the convolution identity to hold as well, I just carried through that assumption.

We share an interest in AWK (*) then :) I don't know OSes at all. Did you mean to imply that I know Lisp? I enjoy Scheme, but I have never used it in anger. Big fan of the Little Schemer series of books.

(*) Have to find that Weinberger face Google-NY t-shirt. Little treasures.

Regarding your dimensions comment: this is well understood for a single layer, that is, for logistic regression. Lehmann's book has the necessary material. With multiple layers it gets complicated really fast.

The best performance estimates, as in within the realm of being practically useful, largely come from two approaches: one from PAC-Bayesian bounds, the other from statistical physics (but these bounds are data-distribution dependent). The intrinsic dimension of the data plays a fundamental role there.

The recommended place to dig around is JMLR (Journal of Machine Learning Research).

silloncito•8mo ago
Perhaps your txr submission suggests a Lisp flavor. The intrinsic dimension concept looks interesting, as does the VC dimension, but both concepts are very general. Perhaps Lehmann's book is Elements of Large Sample Theory.
srean•8mo ago
Txr is super interesting.

I meant Lehmann's Theory of Point Estimation, but Large Sample Theory is a good book too. The newer editions of TPE are a tad hefty in page count; the earlier editions will serve you fine.

The generic idea is that the smaller these dimensions, the easier the prediction problem. Intrinsic dimension is the one that comes closest to topology. VC is very combinatorial and gives the worst of worst-case bounds. For a typical-sized dataset one ends up with an error probability estimate of "less than 420". With PAC-Bayes the bounds are at least less than 1.0.

pizza•8mo ago
Check out tropical algebras
ttoinou•8mo ago
Fascinating thank you for reminding me about that
whatshisface•8mo ago
That doesn't sound right. If P(X) is the vector {0.5,0,0.5} and P(Y) is {0.5,0.5,0}, P(X)P(Y) is {0.25,0,0} and that's both not normalized and clearly not the distribution for max(X,Y). Did you get that from an LLM?
srean•8mo ago
You are using PMFs. I meant and wrote distribution function aka cumulative distribution function. They are closed under products.

> Did you get it from LLM

LOL. There must be a fun and guilty story lurking inside the accusation.

On a more serious note, I would love it if LLMs could do such simplifications and estimations on their own.
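
Redoing the example upthread with CDFs instead of PMFs (a quick check, not from the thread) bears this out:

    import numpy as np

    # The vectors from above, read as PMFs over the values 0, 1, 2.
    px = np.array([0.5, 0.0, 0.5])
    py = np.array([0.5, 0.5, 0.0])

    # CDF product, then difference back to a PMF:
    pmf_max = np.diff(np.cumsum(px) * np.cumsum(py), prepend=0.0)
    print(pmf_max)  # [0.25 0.25 0.5 ]

    # Brute-force check over all (x, y) pairs:
    check = np.zeros(3)
    for x, p in enumerate(px):
        for y, q in enumerate(py):
            check[max(x, y)] += p * q
    print(check)    # [0.25 0.25 0.5 ] -- matches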

whatshisface•8mo ago
Distributions can be either PDFs or CDFs. To be honest I'd never heard of assuming that a distribution was a CDF unless otherwise specified.
srean•8mo ago
May I raise you a

https://en.m.wikipedia.org/wiki/Distribution_function

It's right in the title.

In probability theory and integration theory, as well as in electrical engineering, "distribution function", unless further qualified, means the cumulative one.

In math, nomenclature overloading can be a problem, so context matters. In the context of the Dirac delta, "distribution" means something else entirely -- generalized functions.

Oh! So sad you deleted your point about densities. One can only laugh at and enjoy these idiosyncrasies of nomenclature.

In electrical engineering one uses j for the imaginary unit because i is taken (by current).

math_dandy•8mo ago
This is a natural point of confusion. The true (IMO) primitive concept here is the probability measure. Probability measures on the real line are in canonical bijection with CDFs, the latter being axiomatizable as càdlàg functions (see https://en.wikipedia.org/wiki/Càdlàg) asymptotic to 0 (resp. 1) at minus infinity (resp. infinity). On the other hand, not every probability measure has a density function. (If you want the formalism of densities to capture all probability measures, you need to admit more exotic generalized functions à la Dirac.)
deepsun•8mo ago
Almost every probability theorem starts with "let's take independent random variables". But in reality almost nothing is independent. Superdeterminism even claims that exactly nothing is independent.
srean•8mo ago
You are right.

The degree of dependence matters though. Mutual information is one way to measure that.

Thankfully, some theorems remain valid even when independence is violated. The next stop after independence is martingale criteria. Martingale difference sequences can be quite strongly dependent yet allow some of the usual theorems to go through but with worse convergence rates.

mananaysiempre•8mo ago
Almost every elementary probability theorem. There are plenty of theorems conditional on bounds on how weak the dependency should be (or how fast correlations decrease with order, etc.), including CLT-like theorems, it’s just that they are difficult to state, very difficult to prove, and almost impossible to verify the applicability of. In practice you are anyway going to instead use the simpler version and check if the results make sense after the fact.
jerf•8mo ago
3blue1brown has a walkthrough of this: https://www.youtube.com/watch?v=IaSGqQa5O-M
stared•8mo ago
Beware: one step more and you get into the realm of generating functions. I recommend a book by Herbert Wilf with the wonderful name Generatingfunctionology (https://www2.math.upenn.edu/~wilf/gfology2.pdf).
eliben•8mo ago
Indeed, generating functions are mentioned in a footnote :) Very interesting topic
stared•8mo ago
Saw that!

Sometimes it makes things simpler (quite a lot of things in combinatorics), other times it is a tool for nice tricks (I have no idea how I would have solved these equations if it were not for generating functions; see the appendix of a Mafia game paper, https://arxiv.org/abs/1009.1031).

srean•8mo ago
Ooh! Lovely. Thank you.

Generating functions and Z-transforms are indispensable in probability theory, physics, and signal processing, and now, it seems, for a good round of Mafia while camping with friends.

esafak•8mo ago
I tip my hat to the person who invented that.
nayuki•8mo ago
You can also multiply polynomials by way of analogy with integer multiplication:

         3  1  2  1
       ×    2  0  6
       ------------
        18  6 12  6
      0  0  0  0
   6  2  4  2
  -----------------
   6  2 22  8 12  6
= 6x^5 + 2x^4 + 22x^3 + 8x^2 + 12x^1 + 6x^0.
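
In other words, polynomial multiplication is discrete convolution of the coefficient lists, which is the article's point; for example, with NumPy:

    import numpy as np

    # Coefficients from highest to lowest degree, as in the tableau above.
    p = [3, 1, 2, 1]  # 3x^3 + x^2 + 2x + 1
    q = [2, 0, 6]     # 2x^2 + 6

    print(np.convolve(p, q))  # [ 6  2 22  8 12  6]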
kazinator•8mo ago
Not to mention divide:

  2x^2 + 5x - 3
  -------------
     x + 2

                  2x + 1
          ______________
   x + 2 | 2x^2 + 5x - 3
           2x^2 + 4x
           -------------
                   x - 3
                   x + 2
                   -----
                     - 5
The remainder is -5, which gives us a -5/(x + 2) term. Thus:

                    5
   =    2x + 1  - -----
                  x + 2
How about something we know divides:

                     x + 1
             _______________
      x + 1 | x^2 + 2x + 1
              x^2 + x
              --------
                     x + 1
                     x + 1
                     -----
                         0
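
And long division of polynomials is deconvolution; NumPy exposes it directly, so both examples above can be checked in a couple of lines:

    import numpy as np

    # (2x^2 + 5x - 3) / (x + 2)
    quot, rem = np.polydiv([2, 5, -3], [1, 2])
    print(quot, rem)  # [2. 1.] [-5.]  ->  quotient 2x + 1, remainder -5

    # (x^2 + 2x + 1) / (x + 1)
    quot, rem = np.polydiv([1, 2, 1], [1, 1])
    print(quot, rem)  # [1. 1.] [0.]   ->  quotient x + 1, remainder 0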