Three Algorithms for YSH Syntax Highlighting

https://github.com/oils-for-unix/oils.vim/blob/main/doc/algorithms.md
49•todsacerdoti•7mo ago

Comments

chubot•7mo ago
(author here) I just noticed this link doesn’t work on my iPad because of the captcha – this is the same content:

https://github.com/oils-for-unix/oils.vim/blob/main/doc/algo...

tomhow•7mo ago
Great, thanks, we re-pointed it from https://codeberg.org/oils/oils.vim/src/branch/main/doc/algor...
chrismorgan•7mo ago
Coarse parsing is really good for the basics in almost all programming languages. But it’s not good at semantic detail, even though editors like Vim try to put some in there. One of the most notable ones is splitting Identifier up by adding Function. These have routinely then been misused and inconsistently applied, with the result that historically a language like JavaScript would look completely different from C; I think there was some tidying up of things a few years ago, but can’t remember—I wrote a deliberately simple colorscheme that discards most of those differences anyway. Sometimes you’ll find Function being used for a function’s name at definition time; sometimes at call time too/instead; sometimes a `function` keyword instead.

In many languages, it’s simply not possible to match function names in definitions or calls using coarse parsing. C is definitely such a language. A large part of the problem is when you don’t have explicit delimiting syntax. That’s what you need. Oils, by contrast, looks to say `proc demo { … }`, so you can look for the `proc` keyword.
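To make the "explicit delimiting syntax" point concrete, here is a minimal sketch in Python (not Vim script) of a coarse matcher that finds definitions purely by looking for a leading keyword. The identifier pattern, including the hyphen, and the sample input are assumptions for illustration, not the actual rules used by oils.vim.

```python
import re

# Assumption: a definition is the keyword `proc` at the start of a line,
# followed by a name. No knowledge of the body is needed.
PROC_DEF = re.compile(r"^[ \t]*proc[ \t]+([A-Za-z_][A-Za-z0-9_-]*)", re.MULTILINE)


def find_proc_names(source: str) -> list[str]:
    """Return every name that directly follows a line-leading `proc` keyword."""
    return PROC_DEF.findall(source)


sample = """\
proc demo {
  echo hi
}
echo proc not-a-definition
proc other-one (x) {
  echo $x
}
"""

print(find_proc_names(sample))
```

A C definition such as `int demo(void) { ... }` has no equivalent anchor, which is why the same trick does not carry over.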

Vim’s syntax highlighting is unfortunately rather limited, and if you try to stretch what it’s capable of, it can get arbitrarily slow. It’s my own fault, but the Rust syntax files try to be too clever, and on certain patterns of curly braces after a few hundred lines, any editing can have multiple seconds of lag. I wish there were better tools for identifying what’s making it slow. I tried to figure it out once, but gave up.

I’ve declared coarse parsing really good for the basics in almost all programming languages, and that explicit delimiting syntax is necessary. This leads to probably my least favourite limitation in Vim syntax highlighting: you can’t model indent-based mode switching. In Markdown, for example (keep the leading two spaces, they’re fine):

   Text

         Code

  1.  Text

         Vim says code, actually text

                 Code
reStructuredText highlighting suffers greatly too, though it honestly can’t be highlighted correctly without a full parser (the appropriate mode inside the indented block can’t be known statically).

This is a real problem for my own lightweight markup language too, which uses meaningful indentation.

chubot•7mo ago
Oh cool, I'd be interested to read about the issues you had expressing Rust syntax in Vim!

And yes, there are a whole bunch of limitations:

- C function definitions - harder than JavaScript because there's no "function". It's still syntactic, but probably requires parsing, not just lexing.

- C variable definitions - I'd call this "coarse semantic analysis", not coarse parsing! Because of the "lexer hack"

- Indentation as you mention - there was a thread where someone was complaining that treesitter had 2 composed parsers for Markdown -- block and inline -- although I'm not sure if this causes a problem in practice? (feedback appreciated)

---

But I did intend for YSH to be easier to parse than shell, and that actually worked, because it fits quite well in Vim!

I noted here that OSH/bash has 10 lexer modes -- I just looked and it's up to ~16 now

https://www.oilshell.org/blog/2019/02/07.html#2019-updates

Whereas YSH has 3 mutually recursive modes, and maybe 6 modes total.

---

On the "coarse semantic analysis", another motivation is that I found Github's semantic source browser a bit underwhelming. Not sure if others had that same experience. I think it can't really be accurate because it doesn't have a lot of build time info. So I think they could have embraced "coarseness" more, to make it faster

Although maybe I am confusing the UI speed with the analysis speed. (One reason that this was originally a Codeberg link is that Codeberg/Forgejo's UI is faster, without all the nav stuff)

There are some related links here, like How To Build Static Analyzers in Orders of Magnitude Less Code:

https://github.com/oils-for-unix/oils/wiki/Polyglot-Language...

taeric•7mo ago
I question this? Sure, it is difficult, if not impossible, to match function names/calls using a naive single pass. But, I don't see any reason you couldn't do a full parse and work from there?

This is really no different than how we process language, though? Even using proper names everywhere, turns out proper names get reused. A lot. Such that you pretty much have to have an active simulation of what you are reading in order for most things to attach to identities. No?

chrismorgan•7mo ago
I’m explicitly (and very clearly!) talking about coarse parsing.
taeric•7mo ago
Right, I meant my lead to be an agreement on the coarsest possible parsing having trouble doing these things. With the understanding that it isn't just coarse straight to non-coarse. You will build up more capabilities as you progress.

And, directly to my second sentence, we have the resources that you don't need to cater to the coarsest possible capabilities. You can augment the things you are looking for just fine. You can see in one pass that a new function named "foo" was added, then in another pass, you can start highlighting the function foo easily enough. Staying in a coarse world, you could probably even add a regex that roughly checks the correct number of parameters. Yes, it takes more passes, but isn't impossible?

Further still, you could basically scan all identifiers and any that are a short hamming distance from each other can be queried to see if they are mistakes. Any that match after case folding? Still coarsely found, but still very helpful.
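A rough sketch of that multi-pass idea, in Python and staying entirely in regex territory. It assumes a Python-like `def name(` definition syntax and a minimum identifier length of 3 to keep the noise down; both are choices made for this illustration, not something from the article.

```python
import re
from itertools import combinations

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
DEF = re.compile(r"\bdef\s+([A-Za-z_][A-Za-z0-9_]*)")
CALL = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def defined_functions(source: str) -> set[str]:
    """Pass 1: collect names that follow the (assumed) `def` keyword."""
    return set(DEF.findall(source))


def call_lines(source: str, names: set[str]) -> list[tuple[str, int]]:
    """Pass 2: report (name, line number) wherever a known name precedes '('.

    The definition line itself is reported too, since it also has `name(`.
    """
    hits = []
    for m in CALL.finditer(source):
        if m.group(1) in names:
            line = source.count("\n", 0, m.start()) + 1
            hits.append((m.group(1), line))
    return hits


def hamming(a: str, b: str):
    """Hamming distance, or None when the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def suspicious_pairs(source: str, max_distance: int = 1, min_len: int = 3):
    """Pass 3: identifier pairs that are near-misses or differ only by case."""
    idents = sorted({w for w in IDENT.findall(source) if len(w) >= min_len})
    pairs = []
    for a, b in combinations(idents, 2):
        d = hamming(a, b)
        near = d is not None and 0 < d <= max_distance
        if near or a.casefold() == b.casefold():
            pairs.append((a, b))
    return pairs


sample = (
    "def foo(a, b):\n"
    "    return a + b\n"
    "\n"
    "total = foo(1, 2)\n"
    "totel = fooo(3, 4)\n"
    "Total = 0\n"
)

names = defined_functions(sample)
print(call_lines(sample, names))
print(suspicious_pairs(sample))
```

Note that `fooo` slips past the Hamming check because the lengths differ; an edit-distance measure would catch it, at the cost of more work per pair.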

My point on how we treat language is that we do that more than not. That is my biggest gripe with the jump from a coarse parse straight to a context free grammar approach. We work in contexts. I don't know why we go through so much effort to make the context unnecessary.

As a fun example in natural language where I've seen this. We were going on a trip to Savannah with our kids when they were younger. The oldest was so excited that she was going to get to see cheetahs. An easy mistake to spot when being asked to look for it.

b0a04gl•7mo ago
vim’s syntax engine doesn’t track context. it matches tokens, not structure. in langs like ysh where command and expression modes mix mid-line, this breaks. no memory of nesting, no awareness of why you’re in a mode. one bad match and sync collapses. it’s not about regex power or file size. the engine just isn’t built to follow structure. stop layering hacks. generate semantic tokens outside, let vim just render them.
chubot•7mo ago
> no memory of nesting

It absolutely nests! Vim's model has recursion, and it works perfectly. Some details here:

https://github.com/oils-for-unix/oils.vim/blob/main/doc/stag...

This highlighter is extremely accurate, and I would call it correct. I list about 3 known issues here, and they are all fixable/expressible in Vim's model:

https://github.com/oils-for-unix/oils.vim/blob/main/doc/algo...

Please install it, and file bugs with any inaccuracies. If YSH code is valid, it should not be mis-highlighted. There is test data in false-postive.ysh and false-negative.ysh.

Try to break it!

---

There are lots of Vim/Textmate plugins that are buggy, but it doesn't mean that all such plugins are.

> generate semantic tokens outside

I'd also say that this doesn't really help, since I believe Tree-sitter is the most common way of doing that. I show at the top of the doc that Tree-sitter has issues in practice expressing shell (although admittedly it's not a fair comparison to YSH in Vim. Shell in Vim will have more problems, although in practice I find it pretty good)

chubot•7mo ago
Nesting is also demonstrated by stage 2 fixing the "nested double quotes bug". Screenshots:

https://github.com/oils-for-unix/oils.vim/blob/main/doc/stag...

Stage 1 is non-recursive, but stage 2 is recursive.

b0a04gl•7mo ago
fair, recursive groups exist, and yeah stage 2’s structure is solid. but the point was less about recursion as a feature and more about context awareness. vim’s engine lets you nest, sure, but it doesn't preserve intent across transitions. you can recurse into quoted strings, command subs, etc, but you can’t reflect on why you entered a state. there's no semantic trace. take ysh: command vs expression isn’t just syntactic, it shifts meaning of the same tokens. `[` in one context is an index, in another it’s test. vim can match both, but it can’t decide which meaning is active unless the outer mode is remembered. and that’s the gap

tbh the plugin is impressive, no question. but that memoryless model will always need compromises, rule layering, and finetuned sync tricks. treesitter has its issues too, agreed. but having typed nodes and scope trees gives a baseline advantage when meaning depends on ancestry.

chubot•7mo ago
> `[` in one context is an index, in another it’s test. vim can match both, but it can’t decide which meaning is active unless the outer mode is remembered. and that’s the gap

ysh.vim solves exactly that problem.

The [ in command mode (test) is not highlighted.

In contrast, the [ within expressions like a[i] is highlighted (currently Normal, but you can make it any color - https://github.com/oils-for-unix/oils.vim/blob/main/syntax/l... )

Again I recommend trying it. Verify what you think the bugs are. I think you have some preconceptions based on using other Vim plugins.

Vim's model is powerful enough to write good plugins, or bad plugins. That is one of the main points of this article.
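The `[` disagreement comes down to whether the enclosing mode is remembered. Below is a toy Python sketch of the idea, using a stack of modes so the same character gets a different classification depending on what it is nested in. It is not Vim's engine and not ysh.vim's rules: the assumption that `(` enters expression mode and `{` enters command mode is a drastic simplification made up for this example.

```python
def classify_brackets(line: str) -> list[tuple[int, str]]:
    """Label each '[' as 'index' or 'test' based on the innermost open mode."""
    modes = ["command"]  # assumed default: a line starts in command mode
    found = []
    for i, ch in enumerate(line):
        if ch == "(":
            modes.append("expression")  # assumption: parens open an expression
        elif ch == "{":
            modes.append("command")  # assumption: braces open a command block
        elif ch in ")}" and len(modes) > 1:
            modes.pop()  # leaving a nested region restores the outer mode
        elif ch == "[":
            kind = "index" if modes[-1] == "expression" else "test"
            found.append((i, kind))
    return found


line = "if (a[i] > 0) { [ -f x ] }"
print([kind for _, kind in classify_brackets(line)])
```

Nested regions that restrict which items may appear inside them play roughly the role of this stack, which is why the outer mode does not need to be tracked separately.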

frou_dh•7mo ago
It's so nice when an editor can do completely accurate syntax-highlighting for a language. I think there is a subconscious disturbing effect when being presented with false-positive and false-negative colouring here and there, as traditional "good-enough" hacky syntax highlighting tends to result in.
taeric•7mo ago
It's a big shame, as my preference for how to see code would 100% fall in a "literate style" if I could get it. I'd love an even more dynamic view than that style, if I could. But, I'm fairly sure that 100% correct syntax highlighting would not be possible in that world? Especially in some of the more complicated syntax options out there.

I'm also curious on how many times you have used something that didn't get syntax highlighting correct? Even using some of the more advanced cweb features of org-mode, it typically gets things more correct than not. And I don't think it is using anything more than regexps? (I have not checked to see how the tree sitter stuff interacts with cweb in many blocks. Will try and look into that.)

frou_dh•7mo ago
Something that seems quite common on the false-negative side is type names not being highlighted at all when they are the names of user-defined types, even though they're being used in type positions in the code. Dumb highlighting will just have a fixed list of type names it knows about, because it is not as aware of the positional aspect of usages.
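A small Python sketch of the difference, assuming a Python-like annotation syntax where a type follows `:` or `->`. The fixed list and the regex are invented for this example and do not come from any real highlighter.

```python
import re

# "Dumb" approach: a fixed list of names the highlighter knows about.
KNOWN_TYPES = {"int", "str", "float"}

# Positional approach: whatever identifier sits in a type position.
TYPE_POSITION = re.compile(r"(?::|->)\s*([A-Za-z_][A-Za-z0-9_]*)")


def types_by_list(line: str) -> list[str]:
    """Highlight only identifiers that appear in the fixed list."""
    return [w for w in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", line) if w in KNOWN_TYPES]


def types_by_position(line: str) -> list[str]:
    """Highlight any identifier that follows ':' or '->'."""
    return TYPE_POSITION.findall(line)


line = "def area(shape: Polygon, scale: float) -> float:"
print(types_by_list(line))
print(types_by_position(line))
```

The list-based version misses the user-defined `Polygon`, which is the false negative described above.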
taeric•7mo ago
And this is an example where I feel this is strikingly like proper name usage in language. Everyone has a different set of proper names that they have ingrained in their mind for so long that, hearing them, they will jump out differently than other proper names. We literally ingrain a fixed list of names in our brains starting at a very young age.
chubot•7mo ago
Yeah after writing this highlighter, I started noticing what I consider bugs in other highlighters

e.g. although Vim's syntax highlighting helped me learn shell, it highlights numbers like 'echo 42' in a special way, which is misleading, because shell doesn't have numbers. (On the other hand, YSH does, but not in 'echo 42' either!)

On the other hand, there are also language design issues. Shell also allows MULTIPLE here docs, and I claim that ZERO syntax highlighters handle it correctly - https://github.com/oils-for-unix/oils.vim/blob/main/demo/bad...

(YSH removes here docs in favor of Python-like multi-line strings)

---

But the "surprise" in this article is that Vim is powerful, and you can write a good syntax highlighter or a bad one. There are many possible "programs" to write in this paradigm

I'd also say "completely accurate" highlighting doesn't really exist in practice, and is even problematic in theory.

Tree-sitter grammars are not completely faithful to the original language, because the metalanguage is limited. And highlighters have to deal with incomplete code, so it's not clear what "two parsers being the same" means.

kazinator•7mo ago
Vim has the best syntax highlighting engine out there.

The approach of regions and match items which can contain each other in a hierarchy can handle anything.

By the way, I use Vim for web requests to highlight code served by CGIT.

norir•7mo ago
I personally find syntax highlighting an annoying distraction, but I know this is a minority (and unpopular) viewpoint. For me, it actually has negative value, especially if I find myself spending time troubleshooting it (which I have extensively over the years) rather than actually working on the true problem at hand. I can't think of a single case where automatic syntax highlighting helped me solve a hard problem, but I have certainly wasted a lot of time futzing around with it.

The vast majority of code you are reading is almost definitionally syntactically correct, unless you are in the process of editing it. In that case, syntax highlighting can provide a lightweight proxy for correctness, which I suspect is where much of the enthusiasm comes from. What I personally want is immediate feedback on the actual correctness of the code, and syntax is just a subset.

That is not to say that highlighting is never useful. I just want it to be manual, like when you search for something with / in vim. Then the highlighted items actually pop and my eyes can go directly to the area I want to focus on. I immediately clear the highlighting as soon as I'm done because otherwise it creates a visual distraction.

In my estimation, what we actually need more of are smarter, faster compilers that can immediately respond to edit changes and highlight only the problem areas in the code. Typically, this should be exactly where my cursor is. I should ideally be programming in a state where everything above the cursor can be assumed correct, but there might be a problem with the current word, which is helpfully reported to me exactly where my focus already lies.