I use tree-sitter for developing a custom programming language, you still need an extra step to get from CST to AST, but the overall DevEx is much quicker that hand-rolling the parser.
Could you elaborate on what this involves? I'm also looking at using tree-sitter as a parser for a new language, possibly to support multiple syntaxes. I'm thinking of converting its parse trees to a common schema, that's the target language.
I guess I don't quite get the difference between a concrete and abstract syntax tree. Is it just that the former includes information that's irrelevant to the semantics of the language, like whitespace?
The strong reason is latency and layout stability. Tree-sitter parses on the main thread (or a close worker) typically in sub-ms timeframes, ensuring that syntax coloring is synchronous with keystrokes. LSP semantic tokens are asynchronous by design. If you rely solely on LSP for highlighting, you introduce a flash of unstyled content or color-shifting artifacts every time you type, because the round-trip to the server (even a local one) and the subsequent re-tokenization takes longer than the frame budget (16ms).
The ideal hygiene could be something like -> tree-sitter provides the high-speed lexical coloring (keywords, punctuation, basic structure) instantly and LSP paints the semantic modifiers (interfaces vs classes, mutable vs const) asynchronously like 200ms later. Relying on LSP for the base layer makes the editor feel sluggish.
tetris11•1h ago
taeric•1h ago
codethief•1h ago
woodruffw•1h ago
[1]: https://github.com/tree-sitter-grammars/tree-sitter-yaml
_ache_•47m ago
Since it comes from `tree-sitter-grammars/tree-sitter-yaml`, it may be quick to integrate the official repo.
matthew-craig•1h ago
awk bash bibtex blueprint c c-sharp clojure cmake commonlisp cpp css dart dockerfile elixir glsl gleam go gomod heex html janet java javascript json julia kotlin latex lua magik make markdown nix nu org perl proto python r ruby rust scala sql surface toml tsx typescript typst verilog vhdl vue wast wat wgsl yaml
johanvts•59m ago
jasonjmcghee•47m ago
For others, this is a sub optimal answer, but I’ve played with generating grammars with latest llms and they are surprisingly good at doing this (in a few shots).
That being said, if you’re doing something more serious than syntax highlighting or shipping it in a product, you’ll want to spend more time on it.
zokier•43m ago
https://github.com/tree-sitter/tree-sitter/wiki/List-of-pars...