Parsing JSON in Forty Lines of Awk

https://akr.am/blog/posts/parsing-json-in-forty-lines-of-awk
52•thefilmore•4h ago

Comments

teddyh•4h ago
“except Unicode escape sequences”
chaps•4h ago
Awk is great and this is a great post. But dang, awk really shoots itself so much with its lack of features that it so desperately needs!

Like: printing all but one column somewhere in the middle. It turns into long, long commands that really pull away from the spirit of fast fabrication unix experimentation.

jq and sql both have the same problem :)

thrwwy9234•3h ago

  $ echo "one two three four five" | awk '{$3="";print}'
  one two  four five
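
The one-liner above blanks the third field but leaves its separator behind, hence the double space in the output. A minimal sketch of a variant that rebuilds the line without the unwanted field follows; `drop_field` is a hypothetical helper name, not something from the thread, and it assumes whitespace-separated input and the default `OFS`.

```shell
# drop_field N: print each input line without its Nth whitespace-separated field.
# (Hypothetical helper for illustration; assumes default FS and OFS.)
drop_field() {
    awk -v n="$1" '{
        out = ""
        sep = ""
        for (i = 1; i <= NF; i++) {
            if (i == n) continue      # skip the unwanted field entirely
            out = out sep $i          # join the kept fields with OFS
            sep = OFS
        }
        print out
    }'
}
```

For example, `echo "one two three four five" | drop_field 3` prints `one two four five` with single spaces. It is longer than the one-liner, which is arguably the point chaps was making.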
mauvehaus•2h ago
...And once you get away from the most basic, standard set of features, the several awks in existence have diverging sets of additional features.
chaps•1h ago
Things are already like that, friend! We have mawk, gawk and nawk. But it's fun to think about how we could improve our ideal tooling if we had a time machine.
SoftTalker•1h ago
> awk really shoots itself so much with its lack of features that it so desperately needs

Whence perl.

jcynix•1h ago
>awk really shoots itself so much with its lack of features that it so desperately needs!

That's why I use Perl instead (apart from some short one-liners in awk, which in some cases are even shorter than the Perl version) and do my JSON parsing in Perl.

This

diff -rs a/ b/ | awk '/identical/ {print $4}' | xargs rm

is one of my most-used awk one-liners. If some filenames contain whitespace, for example, it's back to Perl.
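
The whitespace problem comes from splitting the `diff -rs` report line on blanks, so `$4` is no longer the whole path. One possible way around it without leaving the shell is to skip parsing the report and compare files directly. The following is a sketch only: `remove_identical` is a hypothetical name, and it assumes the goal is the same as the one-liner's, namely deleting files under the second directory that are byte-identical to the file at the same relative path under the first.

```shell
# remove_identical SRC DST: delete files under DST that are byte-identical
# to the file at the same relative path under SRC.
# (Hypothetical helper; a sketch, not the commenter's method.)
remove_identical() {
    src=${1%/} dst=${2%/}             # tolerate a trailing slash, as in "a/"
    find "$src" -type f -exec sh -c '
        src=$1 dst=$2; shift 2
        for f do
            rel=${f#"$src"/}          # path relative to SRC
            if [ -f "$dst/$rel" ] && cmp -s "$f" "$dst/$rel"; then
                rm -- "$dst/$rel"
            fi
        done' sh "$src" "$dst" {} +
}
```

Calling `remove_identical a/ b/` should then handle names with spaces or newlines, because each path is passed as a single argument and never re-split.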

wutwutwat•2h ago
Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should.
chubot•1h ago
> JSON is not a friendly format to the Unix shell: it’s hierarchical, and cannot be reasonably split on any character

Yes, shell is definitely too weak to parse JSON!

(One reason I started https://oils.pub is because I saw that bash completion scripts try to parse bash in bash, which is an even worse idea than trying to parse JSON in bash)

I'd argue that Awk is ALSO too weak to parse JSON

> The following code assumes that it will be fed valid JSON. It has some basic validation as a function of the parsing and will most likely throw an error if it encounters something strange, but there are no guarantees beyond that.

Yeah, I don't like that! If you don't reject invalid input, you're not really parsing.

---

OSH and YSH both have JSON built-in, and they have the hierarchical/recursive data structures you need for the common Python/JS-like API:

    osh-0.33$ var d = { date: $(date --iso-8601) }

    osh-0.33$ json write (d) | tee tmp.txt
    {
      "date": "2025-06-28"
    }
Parse, then pretty print the data structure you got:

    $ cat tmp.txt | json read (&x)

    osh-0.33$ = x
    (Dict)  {date: '2025-06-28'}
Create a JSON syntax error on purpose:

    osh-0.33$ sed 's/"/bad/"' tmp.txt | json read (&x)
    sed: -e expression #1, char 9: unknown option to `s'
      sed 's/"/bad/"' tmp.txt | json read (&x)
                                     ^~~~
    [ interactive ]:20: json read: Unexpected EOF while parsing JSON (line 1, offset 0-0: '')
(now I see the error message could be better)

Another example from wezm yesterday: https://mastodon.decentralised.social/@wezm/1147586026608361...

YSH has JSON natively, but for anyone interested, it would be fun to test out the language by writing a JSON parser in YSH

It's fundamentally more powerful than shell and awk because it has garbage-collected data structures - https://www.oilshell.org/blog/2024/09/gc.html

Also, OSH is now FASTER than bash, in both computation and I/O. This is despite garbage collection, and despite being written in typed Python! I hope to publish a post about these recent improvements

packetlost•26m ago
I don't really buy that shell / awk is "too weak" to deal with JSON. The ecosystem of tools is just fairly immature, as most of the shell's common tools predate JSON by at least a decade. `jq` would be a pretty reasonable addition to the standard set of tools included in environments by default.

IMO the real problem is that JSON doesn't work very well because its core abstraction is objects. It's a pain to deal with in pretty much every statically typed, non-object-oriented language unless you parse it into native, predefined data structures (think annotated Go structs, Rust, etc.).

alganet•20m ago
> Yes, shell is definitely too weak to parse JSON!

Parsing is trivial, and rejecting invalid input is trivial; the problem is representing the parsed content in a meaningful way.

> bash completion scripts try to parse bash in bash

You're talking about ble.sh, right? I investigated it as well.

I think they made some choices that eventually led to the parser being too complex, largely due to the problem of representing what was parsed.

> Also, OSH is now FASTER than bash, in both computation and I/O.

According to my tests, this is true. Congratulations!
