Ask HN: How is it possible to get -0.0 in a sum?

12•gus_massa•6mo ago
I'm looking for corner cases where the result is -0.0. As far as I know, the only way to get -0.0 in a sum is

  (-0.0) + (-0.0)
Does someone know any other case in IEEE 754?

Bonus question: What happens in subtractions? I only know

  (-0.0) - (+0.0)
Is there any other case?
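
The sign combinations in the question can be checked with a short C sketch. It assumes IEEE 754 doubles and the default round-to-nearest mode; `is_negative_zero`, `add`, `sub` and `check_default_rounding` are names made up for this illustration.

```c
#include <assert.h>
#include <math.h>     /* signbit */
#include <stdbool.h>

/* True when x is a zero with the sign bit set, i.e. -0.0. */
bool is_negative_zero(double x) {
    return x == 0.0 && signbit(x);
}

/* volatile keeps the compiler from folding the arithmetic at compile time. */
double add(double a, double b) {
    volatile double va = a, vb = b;
    return va + vb;
}

double sub(double a, double b) {
    volatile double va = a, vb = b;
    return va - vb;
}

/* Zero-valued sums and differences under the default rounding mode. */
void check_default_rounding(void) {
    assert( is_negative_zero(add(-0.0, -0.0)));  /* the case from the question */
    assert(!is_negative_zero(add(-0.0,  0.0)));
    assert(!is_negative_zero(add( 0.0, -0.0)));
    assert(!is_negative_zero(add( 1.0, -1.0)));  /* exact cancellation gives +0.0 */
    assert( is_negative_zero(sub(-0.0,  0.0)));  /* the subtraction case */
    assert(!is_negative_zero(sub(-0.0, -0.0)));
    assert(!is_negative_zero(sub( 1.0,  1.0)));
}
```

Calling `check_default_rounding()` passes silently when every combination behaves as listed in the comments.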

Comments

sparkie•6mo ago
It depends on the FP rounding mode. If rounding mode is FE_TOWARDZERO/FE_UPWARD/FE_TONEAREST then the case you gave is the only one I'm aware of. If rounding mode is FE_DOWNWARD (towards negative infinity) then other calculations that result in a zero will give a -0.0.

Here's an example of -1.0f + 1.0f resulting in -0.0: https://godbolt.org/z/5qvqsdh9P

gus_massa•6mo ago
Thanks! [Sorry for the delay.]

---

FYI: For more context, I'm trying to send a PR to Chez Scheme (and indirectly to Racket) https://github.com/cisco/ChezScheme/pull/959 to reduce expressions like

  (+ 1.0 (length L))  ;  ==>  (+ 1.0 (fixnum->flonum (length L)))
where the "fixnums" are small integers and "flonums" are double.

It's fine, unless you have the case

  (+ -0.0 (length L))  ;  =wrong=>  (+ -0.0 (fixnum->flonum (length L)))
because if the length is 0, it gets transformed into 0.0 instead of -0.0
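
A rough C model of the problem, assuming IEEE 754 doubles and default rounding. This is not Chez Scheme code: `fixnum_to_flonum_naive` and `fixnum_to_flonum_neg_zero` are hypothetical stand-ins for the conversion step. The second one maps the integer 0 to -0.0, which works as an additive identity for either sign of zero.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Plain conversion: the integer 0 becomes +0.0. */
double fixnum_to_flonum_naive(int64_t n) {
    return (double)n;
}

/* Hypothetical variant: the integer 0 becomes -0.0 instead. */
double fixnum_to_flonum_neg_zero(int64_t n) {
    return n == 0 ? -0.0 : (double)n;
}

void check_conversions(void) {
    volatile double neg_zero = -0.0;
    volatile double pos_zero = 0.0;

    /* -0.0 + (+0.0) = +0.0: the sign of the accumulator is lost. */
    assert(!signbit(neg_zero + fixnum_to_flonum_naive(0)));
    /* -0.0 + (-0.0) = -0.0: the sign survives. */
    assert( signbit(neg_zero + fixnum_to_flonum_neg_zero(0)));
    /* +0.0 + (-0.0) = +0.0: a positive zero is left alone too. */
    assert(!signbit(pos_zero + fixnum_to_flonum_neg_zero(0)));
    /* Non-zero integers are converted as usual. */
    assert(neg_zero + fixnum_to_flonum_neg_zero(3) == 3.0);
}
```

The ternary still implies a comparison on every conversion, which is the runtime check the post wants to avoid.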

There are a few corner cases, in particular because it's possible to have

   (+ 1.0 x (length L))
and I really want to avoid the runtime check of (length L) == 0 if possible.

So I took a look, asked there, and now your opinion confirms what I got so far. My C is not very good, so it's nice to have an example of how the rounding directions are used. Luckily Chez Scheme only uses the default rounding and it's probably correct to cut a few corners. I'll keep looking for a few days in case there is some surprise.

sparkie•6mo ago
I'm not sure you can avoid the check, but you can avoid a branch.

An AVX-512 extension has a `vfixupimm` instruction[1] which can adjust special floating point values. You could use this to adjust all zeroes to -0 but leave any non-zeroes untouched. It isn't very obvious how to use though.

    vfixupimmsd dst, src, fixup, flag

 * The `flag` is for error reporting - we can set it to zero to ignore errors.

 * `dst` and `src` are floating point values - they can be the same register.

 * The instruction first checks `src` and turns any denormals into zero if the MXCSR.DAZ flag is set.

 * It then categorizes `src` as one of {QNAN, SNAN, ZERO, ONE, NEG_INF, POS_INF, NEG_VALUE, POS_VALUE}

 * `fixup` is an array of 8 nybbles (a 32-bit int) and is looked up based on the categorization of `src` {QNAN = 0 ... POS_VALUE = 7}

 * The values of each nybble denote which value to place into `dst`:

    0x0 : dst (unchanged)
    0x1 : src (with denormals as zero if MXCSR.DAZ is set)
    0x2 : QNaN(src)
    0x3 : QNAN_Indefinite
    0x4 : -INF
    0x5 : +INF
    0x6 : src < 0 ? -INF : +INF
    0x7 : -0
    0x8 : +0
    0x9 : -1
    0xA : +1
    0xB : 1/2
    0xC : 90.0
    0xD : PI/2
    0xE : MAX_FLOAT
    0xF : -MAX_FLOAT
You want to set the nybble for categorization ZERO (bits 11..8) to 0x7 (-0) in `fixup`. This would mean you want `fixup` to be equal to `0x00000700`. So usage would be:

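    #include <immintrin.h>  // AVX-512F intrinsics; build with -mavx512f
    #include <stdint.h>

    // Fixup table: nybble 2 (category ZERO) = 0x7 (-0); all other nybbles 0 (keep dst).
    static __m128i zerofixup = { 0x700 };

    double fixnum_to_flonum(int64_t fixnum) {
        __m128d flonum = { (double)fixnum };
        // dst and src are the same value, so non-zero inputs pass through unchanged.
        return _mm_cvtsd_f64(_mm_fixupimm_sd(flonum, flonum, zerofixup, 0));
    }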
Which compiles to just 4 instructions, with no branches:

    .FIXUP:
        .long   1792                            # 0x700
        .long   0                               # 0x0
        .long   0                               # 0x0
        .long   0                               # 0x0
    fixnum_to_flonum:
        vcvtsi2sd       xmm0, xmm0, rdi
        vmovq           xmm0, xmm0
        vfixupimmsd     xmm0, xmm0, qword ptr [rip + .FIXUP], 0
        ret
It can be extended to operate on 8 int64->double at a time (__m512d) with little extra cost.

You could maybe use this optimization where the instruction is available and just stick with a branch version otherwise, or figure out some other way to make it branchless - though I can't think of any other way which would be any faster than a branch.
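
One portable possibility, sketched here with no claim that it beats a branch: set the sign bit directly when the integer is 0. It assumes IEEE 754 binary64 doubles, and `fixnum_to_flonum_branchless` is a made-up name. Whether the compiler really emits branch-free code for `n == 0` depends on the target and optimization level.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Converts n to double, mapping the integer 0 to -0.0. */
double fixnum_to_flonum_branchless(int64_t n) {
    double d = (double)n;
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);        /* well-defined type punning */
    bits |= (uint64_t)(n == 0) << 63;      /* set the sign bit only when n is 0 */
    memcpy(&d, &bits, sizeof d);
    return d;
}

void check_branchless(void) {
    assert(fixnum_to_flonum_branchless(0) == 0.0);
    assert(signbit(fixnum_to_flonum_branchless(0)));    /* 0 becomes -0.0 */
    assert(fixnum_to_flonum_branchless(5) == 5.0);
    assert(!signbit(fixnum_to_flonum_branchless(5)));   /* sign untouched */
    assert(fixnum_to_flonum_branchless(-7) == -7.0);
}
```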

[1]:https://www.intel.com/content/www/us/en/docs/intrinsics-guid...

gus_massa•6mo ago
So, if I use 0x00450000 I can swap -inf.0 and +inf.0 without modifying any other value? (I don't expect this swap operation to be useful, but I'm trying to understand the details.)

---

Thanks again, it's very interesting. I used assembler a long time ago, for the Z80 and 80?86, when the coprocessor was like 2 inches away :) . The problem is that Chez Scheme emits its own assembler, and supports many platforms. So after going down the rabbit hole, you get to asm-fpt https://github.com/search?q=repo%3Acisco%2FChezScheme+asm-fp... (expand and look for "define asm-fpt" near line 1300-2000)

This is like 2 or 3 layers below the level I usually modify, so I'm not sure about the details and quirks in that layer. I'll link to this discussion on GitHub in case one of the maintainers wants to add something like this. My particular case is a very small corner case and I'm not sure they'd like to add more complexity, but it's nice to have this info in case there are similar cases, because once you notice them, they start to appear everywhere.

You can tag yourself in case someone wants to ask more questions or just get updates, but I expect that I'll go in the opposite direction.

sparkie•6mo ago
> So, if I use 0x00450000 I can swap -inf.0 and +inf.0 without modifying any other value?

Yeah, you got it. See test: https://ce.billsun.dev/#g:!((g:!((g:!((h:codeEditor,i:(filen...
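
The nybble packing can also be checked without AVX-512 hardware. The sketch below is a simplified software model, not the instruction itself: it handles only four response codes, does not distinguish QNAN from SNAN, and ignores MXCSR.DAZ. All names (`fixup_entry`, `classify`, `fixup_model`, the `CLS_`/`RESP_` constants) are invented for the example.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Category indices, in the order listed earlier in the thread. */
enum fixup_class {
    CLS_QNAN, CLS_SNAN, CLS_ZERO, CLS_ONE,
    CLS_NEG_INF, CLS_POS_INF, CLS_NEG_VALUE, CLS_POS_VALUE
};

/* Response codes modelled here (a subset of the full table). */
enum { RESP_KEEP_DST = 0x0, RESP_NEG_INF = 0x4, RESP_POS_INF = 0x5, RESP_NEG_ZERO = 0x7 };

/* Places a 4-bit response code into the nybble for one category. */
uint32_t fixup_entry(enum fixup_class cls, uint32_t response) {
    return (response & 0xFu) << (4u * (unsigned)cls);
}

/* Simplified classifier: every NaN is reported as QNAN. */
enum fixup_class classify(double x) {
    if (isnan(x)) return CLS_QNAN;
    if (x == 0.0) return CLS_ZERO;
    if (x == 1.0) return CLS_ONE;
    if (isinf(x)) return x < 0 ? CLS_NEG_INF : CLS_POS_INF;
    return x < 0 ? CLS_NEG_VALUE : CLS_POS_VALUE;
}

/* Looks up the response for src and applies it. */
double fixup_model(double dst, double src, uint32_t table) {
    switch ((table >> (4u * (unsigned)classify(src))) & 0xFu) {
    case RESP_NEG_INF:  return -INFINITY;
    case RESP_POS_INF:  return INFINITY;
    case RESP_NEG_ZERO: return -0.0;
    default:            return dst;  /* 0x0; other codes are not modelled */
    }
}

void check_fixup_tables(void) {
    uint32_t zero_fix = fixup_entry(CLS_ZERO, RESP_NEG_ZERO);
    uint32_t swap_inf = fixup_entry(CLS_NEG_INF, RESP_POS_INF)
                      | fixup_entry(CLS_POS_INF, RESP_NEG_INF);
    assert(zero_fix == 0x00000700u);
    assert(swap_inf == 0x00450000u);
    assert(fixup_model(INFINITY, INFINITY, swap_inf) == -INFINITY);
    assert(fixup_model(-INFINITY, -INFINITY, swap_inf) == INFINITY);
    assert(fixup_model(2.5, 2.5, swap_inf) == 2.5);       /* untouched */
    assert(signbit(fixup_model(0.0, 0.0, zero_fix)));     /* 0 becomes -0 */
}
```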

If you are going to implement something like this you basically need a fallback for where it is not supported. In C you write:

    #ifdef __AVX512F__
        // optimized code
    #else
        // fallback code
    #endif
The optimized code will only be emitted if -mavx512f is passed to the compiler. This flag is implied by `-march=native` when the host compiling the code supports AVX-512F, or by `-march=specificarch` when specificarch supports it. Otherwise the fallback code will be used.

If using the custom assembler you would need to test whether AVX512F is available by using the CPUID instruction.
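
From C, GCC and Clang wrap the CPUID query in `__builtin_cpu_supports`, which gives a rough picture of what a runtime check looks like. This is only a sketch: `have_avx512f`, `convert_fallback` and `select_converter` are hypothetical names, and a custom assembler would have to issue CPUID itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Runtime check; reports false on compilers or targets without the builtin. */
bool have_avx512f(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx512f") != 0;
#else
    return false;
#endif
}

typedef double (*convert_fn)(int64_t);

/* Branching fallback: maps the integer 0 to -0.0. */
double convert_fallback(int64_t n) {
    return n == 0 ? -0.0 : (double)n;
}

/* Picks the optimized routine only when the CPU reports AVX-512F. */
convert_fn select_converter(convert_fn optimized) {
    return have_avx512f() ? optimized : convert_fallback;
}
```

The selection would normally run once at startup, with the chosen pointer cached.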

kazinator•6mo ago
What happens if we take the smallest (as in closest to zero) negative subnormal and add it to itself?
gus_massa•6mo ago
Copying the example by sparkie, something like this? https://godbolt.org/z/xhdnb9ax3 I get +0.0 if I comment out the round-toward-negative option.
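
As a separate check of the literal question, independent of the linked example: with gradual underflow enabled (no flush-to-zero or denormals-are-zero mode), twice the smallest subnormal is itself representable, so the sum is exact and non-zero. A minimal sketch, assuming IEEE 754 doubles; `double_smallest_negative_subnormal` is a made-up name, and `DBL_TRUE_MIN` is the C11 `<float.h>` macro with a fallback definition for older headers.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

#ifndef DBL_TRUE_MIN
#define DBL_TRUE_MIN 4.9406564584124654e-324  /* smallest positive subnormal */
#endif

/* Adds the negative subnormal closest to zero to itself at runtime. */
double double_smallest_negative_subnormal(void) {
    volatile double tiny = -DBL_TRUE_MIN;  /* volatile: no constant folding */
    return tiny + tiny;
}

void check_subnormal_sum(void) {
    double r = double_smallest_negative_subnormal();
    assert(r != 0.0);                    /* not a zero of either sign */
    assert(signbit(r));
    assert(r == -2.0 * DBL_TRUE_MIN);    /* the sum is exact */
}
```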
gethly•6mo ago
I would guess that because of how *** * floats are in binary computers, you have something like -0.0000000000000000000000000000000000001 and when you round it you end up with -0.0. Same goes for positive values, you're just not used to writing the + sign before every number, so seeing the minus feels strange.
dcminter•6mo ago
You're answering a question that OP did not ask.
perilunar•6mo ago
I get -0.0 all the time in LibreOffice spreadsheets. Assume it's just due to rounding. Annoying, but not enough to investigate further.