Parsing Advances

https://matklad.github.io/2025/12/28/parsing-advances.html
108•birdculture•1mo ago

Comments

kccqzy•1mo ago
How about another way, which is memoization: at each position in the source code we never attempt to parse the same production more than once. This solves infinite looping as discussed by the author because the “loop” will be downgraded by the memoization to execute once. Of course I wouldn't literally use a while loop in code to represent the production. I would use a higher-level abstraction to indicate one-or-more or zero-or-more in the production; indeed I would represent productions as data not code.

This also has another benefit of work sharing. A production like `A B | C B` will ensure that in case parsing A or C consumes the same number of characters, the work to parse B will be shared, despite not literally factoring the production into `(A | C) B`.
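
A minimal sketch of the memoization idea, assuming productions are stored as data and results are cached per (production name, position). The names `Rule`, `Grammar`, and `MemoParser` are hypothetical, and ordered choice is an assumption of this sketch. The grammar is a variation on `A B | C B` in which the first alternative fails late, so `B` is requested twice at the same position but evaluated only once:

```typescript
// Productions as data: literal, sequence, ordered choice, or a named reference.
type Rule =
  | { kind: "lit"; text: string }
  | { kind: "seq"; items: Rule[] }
  | { kind: "alt"; options: Rule[] }
  | { kind: "ref"; name: string };

type Grammar = Record<string, Rule>;

class MemoParser {
  // Cache: "name@pos" -> end position, or null when the production failed there.
  private readonly memo = new Map<string, number | null>();
  // Counts real evaluations per production (only here to show the sharing).
  readonly evaluations = new Map<string, number>();
  private readonly grammar: Grammar;
  private readonly input: string;

  constructor(grammar: Grammar, input: string) {
    this.grammar = grammar;
    this.input = input;
  }

  // Parse the named production at `pos`, at most once per (name, pos) pair.
  apply(name: string, pos: number): number | null {
    const key = `${name}@${pos}`;
    const cached = this.memo.get(key);
    if (cached !== undefined) return cached;
    this.evaluations.set(name, (this.evaluations.get(name) ?? 0) + 1);
    const rule = this.grammar[name];
    if (rule === undefined) throw new Error(`unknown production: ${name}`);
    const end = this.match(rule, pos);
    this.memo.set(key, end);
    return end;
  }

  // Returns the end position of a match, or null on failure.
  private match(rule: Rule, pos: number): number | null {
    switch (rule.kind) {
      case "lit":
        return this.input.startsWith(rule.text, pos) ? pos + rule.text.length : null;
      case "seq": {
        let cur = pos;
        for (const item of rule.items) {
          const next = this.match(item, cur);
          if (next === null) return null;
          cur = next;
        }
        return cur;
      }
      case "alt": {
        for (const option of rule.options) {
          const end = this.match(option, pos);
          if (end !== null) return end;
        }
        return null;
      }
      case "ref":
        return this.apply(rule.name, pos);
    }
  }
}

const lit = (text: string): Rule => ({ kind: "lit", text });
const ref = (name: string): Rule => ({ kind: "ref", name });
const seq = (...items: Rule[]): Rule => ({ kind: "seq", items });
const alt = (...options: Rule[]): Rule => ({ kind: "alt", options });

// S <- A B "!" | C B "?"   where A and C both consume two characters.
const grammar: Grammar = {
  S: alt(seq(ref("A"), ref("B"), lit("!")), seq(ref("C"), ref("B"), lit("?"))),
  A: lit("if"),
  C: seq(lit("i"), lit("f")),
  B: lit("(x)"),
};

const parser = new MemoParser(grammar, "if(x)?");
const end = parser.apply("S", 0);
console.log(end, parser.evaluations.get("B"));
```

This sketch does not handle left recursion: a production that refers to itself at the same position would recurse before any result is cached.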

smj-edison•1mo ago
That's a slick way, would you essentially have a second counter that you'd set to the current cursor whenever you use `.currentToken()` or something like that?
luizfelberti•1mo ago
I also find this an elegant approach, and it is also how Thompson VM-style regex engines work [0].

It's a bit harder to adapt the technique to parsers because the Thompson NFA always increments the sequence pointer by the same amount, while a parser's production usually has a variable size, making it harder to run several parsing heads in lockstep.

[0] https://swtch.com/~rsc/regexp/regexp2.html

Porygon•1mo ago
Memoization to limit left-recursive recursion is nicely described in Guido van Rossum's article here: https://medium.com/@gvanrossum_83706/left-recursive-peg-gram...

I recently tried that approach while simultaneously building an abstract syntax tree, but I dropped it in favor of a right-recursive grammar for now, since restoring the AST when backtracking got a bit complex.

kccqzy•1mo ago
You can look at the Earley parser. It handles left recursion well using a method that’s basically memoization.
smj-edison•1mo ago
Huh, that's a really interesting approach. I just wrote my first Pratt parser a month ago, and one of the most annoying things was debugging infinite loops in various places (I had both tokenizer bugs where no characters were consumed and parser bugs where a token was emitted but not advanced). It's doubly annoying in Zig, because the default test runner won't print out stdout at all, and won't print stderr unless the program terminates by itself (Ctrl + C doesn't print). I resorted to building the test and running it manually, or jumping into a debugger to figure out recursion issues. It's working now, but if (really when) I run into issues in the future I'll definitely add some helper functions to check emitting invariants.
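
One possible shape for such a helper, as a sketch only: wrap each parsing loop so that an iteration which consumes no tokens throws immediately instead of spinning forever. `Cursor`, `loopWithProgress`, and `collectNumbers` are hypothetical names invented for this illustration, not an API from the article or from any library.

```typescript
// Hypothetical minimal token cursor.
class Cursor {
  private pos = 0;
  private readonly tokens: readonly string[];

  constructor(tokens: readonly string[]) {
    this.tokens = tokens;
  }

  eof(): boolean { return this.pos >= this.tokens.length; }
  peek(): string | undefined { return this.tokens[this.pos]; }
  position(): number { return this.pos; }
  bump(): void { if (!this.eof()) this.pos += 1; }
}

// Runs `step` until `done()` holds; throws as soon as an iteration consumes nothing.
function loopWithProgress(c: Cursor, done: () => boolean, step: () => void): void {
  while (!done()) {
    const before = c.position();
    step();
    if (c.position() === before) {
      throw new Error(`parser made no progress at token ${before} (${String(c.peek())})`);
    }
  }
}

// Collects numeric tokens. The `buggy` flag simulates forgetting to advance
// past an unknown token, which would otherwise loop forever.
function collectNumbers(tokens: string[], buggy: boolean): number[] {
  const c = new Cursor(tokens);
  const out: number[] = [];
  loopWithProgress(c, () => c.eof(), () => {
    const tok = c.peek();
    if (tok !== undefined && /^\d+$/.test(tok)) {
      out.push(Number(tok));
      c.bump();
    } else if (!buggy) {
      c.bump(); // skip the unknown token
    }
  });
  return out;
}

console.log(collectNumbers(["1", "x", "2"], false));
```

With `buggy` set to `true`, the same input throws at the `x` token rather than hanging, which is easier to spot in a test runner that only prints output on termination.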
someone_jain_•1mo ago
It's also very annoying that you can't have two test names where one is a substring of the other.
eru•1mo ago
Writing parsers by hand this way can be fun (and might be required for the highest performance ones, maybe?), but for robustness and ease of development you are generally better off using a parser combinator library.
tubs•1mo ago
Are you?

The majority of production compilers use hand rolled parsers, ostensibly for better error reporting and panic synch.

cipherself•1mo ago
One anecdote in the same vein: a couple of months ago, I wanted to parse systemd-networkd INI files in Python. Python's built-in ConfigParser [0] and pytest's iniconfig parser [1] couldn't handle multiple sections with the same name, so I wrote two parsers, one using a parser combinator library and one by hand. I ended up using the latter, since it was much simpler to understand and didn't require an extra dependency.

Admittedly, INI is quite a simple format, hence I mention this as an anecdote.

[0] https://docs.python.org/3/library/configparser.html

[1] https://github.com/pytest-dev/iniconfig
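
To illustrate how small such a hand-written parser can be, here is a sketch in TypeScript (the original was in Python and is not shown in the thread). It keeps sections in an ordered list, so repeated section names survive. The `Section` shape, the `parseIni` name, and the exact comment and error rules are assumptions for this example.

```typescript
// One section per header, in file order; duplicate names are kept as separate entries.
interface Section {
  name: string;
  entries: [string, string][];
}

function parseIni(text: string): Section[] {
  const sections: Section[] = [];
  let lineNo = 0;
  for (const raw of text.split(/\r?\n/)) {
    lineNo += 1;
    const line = raw.trim();
    // Skip blank lines and comments (assumed here to start with '#' or ';').
    if (line === "" || line.startsWith("#") || line.startsWith(";")) continue;
    if (line.startsWith("[") && line.endsWith("]")) {
      // Always open a new section, even when the name was seen before.
      sections.push({ name: line.slice(1, -1).trim(), entries: [] });
      continue;
    }
    const eq = line.indexOf("=");
    const current = sections[sections.length - 1];
    if (eq < 0 || current === undefined) {
      throw new Error(`line ${lineNo}: expected "[Section]" or "key=value"`);
    }
    current.entries.push([line.slice(0, eq).trim(), line.slice(eq + 1).trim()]);
  }
  return sections;
}

const sample = [
  "[Match]",
  "Name=eth0",
  "",
  "[Address]",
  "Address=10.0.0.1/24",
  "",
  "[Address]",
  "Address=10.0.0.2/24",
].join("\n");

console.log(parseIni(sample).map((s) => s.name));
```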

thechao•1mo ago
As a project gets larger the cost of owning a dependency directly begins to outweigh the impedance mismatch between 3rd party software & software customized to your project.

I've got 10 full-time senior engineers on a project heading into its 15th year. We rewrite even extremely low-level code like std::vector or malloc to make sure it matches our requirements.

UNIX was written by a couple of dudes.

kccqzy•1mo ago
That’s because Python is a bad language for writing parser combinators and parsers based on them. Try Haskell.
cipherself•1mo ago
I have written parsers using parser combinators in Haskell and Clojure. I find that ML-like languages (Haskell, OCaml, Standard ML) are generally great for writing parsers; even for hand-written ones, they are a superior experience.

In this case, this was a project at $EMPLOYER in an existing codebase with colleagues who have never seen Haskell code, using Haskell would've been a major error in judgement.

eru•1mo ago
I agree!

Haskell is a great language. It can even be a great language for beginners, especially if there's some senior help on hand.

But it's a terrible language to foist upon an unsuspecting and even unwilling victim.

tgv•1mo ago
So ... someone calls their parsing strategy "resilient LL parsing" without actually implementing LL parsing, a technique known since the 1970s, and then has an infinite recursion bug? Probably skipped Parsing 101.
sureglymop•1mo ago
In rust I really like the grmtools set of tools: https://github.com/softdevteam/grmtools.

It does lex/yacc-style lexer and parser generation and generates an LR(1) parser, but uses the CPCT+ algorithm for error recovery. IIRC the way it works is that when an error occurs, the nearest likely valid token is inserted, the error is recorded, and parsing continues.

I would use this for anything that is simple enough and recursive descent for anything more complicated and where even more context is needed for errors.

ratmice•1mo ago
I always feel that when saying lex/yacc style tools, it comes with a lot of preconceived notions that using the tools involves a slow development cycle with code gen + compilation steps.

What drew me to grmtools (and eventually to contributing to it) was that you can evaluate grammars basically like an interpreter, without going through that compilation process. That leads to fairly quick turnaround times during language development.

I hope this year I can work on porting my grmtools based LSP to browser/wasm.

sureglymop•1mo ago
I've seen your commits, thank you sincerely for your work!
dcrazy•1mo ago
I’m curious why the author chose to model this as an assertion stack. The developer must still remember to consume the assertion within the loop. Could the original example not be rewritten more simply as:

    const result: ast.Expression[] = [];
    p.expect("(");
    while (!p.eof() && !p.at(")")) {
     subexpr = expression(p);
     assert(p !== undefined); // << here
     result.push(subexpr);
     if (!p.at(")")) p.expect(",");
    }
    p.expect(")");
    return result;
matklad•1mo ago
I assume you meant to write `assert(subexpr !== undefined)`?

This is resilient parsing: we are parsing source code with syntax errors, but still want to produce a best-effort syntax tree. Although an expression is required by the grammar, the `expression` function might still return nothing if the user typed some garbage there instead of a valid expression.

However, even if we return nothing due to garbage, there are two possible behaviors:

* We can consume no tokens, guessing that what looks like "garbage" from the perspective of the expression parser is actually the start of the next larger syntax construct:

```
function f() {
  let x = foo(1,
  let not_garbage = 92;
}
```

In this example, it would be smart to _not_ consume `let` when parsing `foo(`'s arglist.

* Alternatively, we can consume some tokens, guessing that the user _meant_ to write an expression there

```
function f() {
  let x = foo(1, /);
}
```

In the above example, it would be smart to skip over `/`.
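
The two behaviors can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not the article's parser: the `Parser` class, the string-based tokens, and the `RECOVERY_TOKENS` set are all hypothetical. `expression` leaves a token in place when it looks like the start of an enclosing construct and skips it otherwise, and the argument loop stops when nothing was consumed.

```typescript
type Expr = { kind: "num"; value: number } | { kind: "name"; name: string };

// Hypothetical token-level parser state; tokens are plain strings here.
class Parser {
  readonly errors: string[] = [];
  private pos = 0;
  private readonly tokens: readonly string[];

  constructor(tokens: readonly string[]) {
    this.tokens = tokens;
  }

  eof(): boolean { return this.pos >= this.tokens.length; }
  peek(): string | undefined { return this.tokens[this.pos]; }
  at(token: string): boolean { return this.peek() === token; }
  position(): number { return this.pos; }
  bump(): void { if (!this.eof()) this.pos += 1; }

  // Consumes `token` if present; otherwise records an error and consumes nothing.
  expect(token: string): void {
    if (this.at(token)) this.bump();
    else this.errors.push(`expected ${token}`);
  }
}

// Assumed recovery set: tokens that probably belong to an enclosing construct.
const RECOVERY_TOKENS = new Set(["let", ";", "}", ")"]);

function expression(p: Parser): Expr | undefined {
  const token = p.peek();
  if (token !== undefined && /^\d+$/.test(token)) {
    p.bump();
    return { kind: "num", value: Number(token) };
  }
  if (token !== undefined && /^[A-Za-z_]\w*$/.test(token) && !RECOVERY_TOKENS.has(token)) {
    p.bump();
    return { kind: "name", name: token };
  }
  p.errors.push(`expected expression, got ${token ?? "end of input"}`);
  // Second behavior: skip garbage such as `/`.
  // First behavior: leave recovery tokens such as `let` in place.
  if (token !== undefined && !RECOVERY_TOKENS.has(token)) p.bump();
  return undefined;
}

function argList(p: Parser): Expr[] {
  const result: Expr[] = [];
  p.expect("(");
  while (!p.eof() && !p.at(")")) {
    const before = p.position();
    const arg = expression(p);
    if (arg !== undefined) result.push(arg);
    // Nothing consumed: stop and let the enclosing construct take over.
    if (p.position() === before) break;
    if (!p.at(")")) p.expect(",");
  }
  p.expect(")");
  return result;
}

// foo(1, let not_garbage = 92; }  (argument list cut short before `let`)
const recovered = new Parser(["(", "1", ",", "let", "not_garbage", "=", "92", ";", "}"]);
const recoveredArgs = argList(recovered);

// foo(1, /);  (`/` is skipped as garbage)
const skipped = new Parser(["(", "1", ",", "/", ")", ";"]);
const skippedArgs = argList(skipped);

console.log(recoveredArgs, recovered.peek(), skippedArgs, skipped.peek());
```

In both runs the list contains only the argument `1`. In the first, the parser stops in front of `let` so the statement parser can pick it up; in the second, it ends just before the `;`.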