Parsing JSON in Forty Lines of Awk

https://akr.am/blog/posts/parsing-json-in-forty-lines-of-awk
52•thefilmore•4h ago

Comments

teddyh•4h ago
“except Unicode escape sequences”
chaps•4h ago
Awk is great and this is a great post. But dang, awk really shoots itself so much with its lack of features that it so desperately needs!

Like: printing all but one column somewhere in the middle. It turns into long, long commands that really pull away from the spirit of fast fabrication unix experimentation.

jq and sql both have the same problem :)

thrwwy9234•3h ago

  $ echo "one two three four five" | awk '{$3="";print}'
  one two  four five
chaps•10m ago
Oh dang, that's good.
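
The one-liner above blanks the field but keeps both of its separators, which is where the double space in the output comes from. A minimal sketch of one way to drop the column entirely, assuming whitespace-separated input and a hypothetical helper named `drop_field` (not something from the thread): rebuild the record by hand and skip the unwanted index.

```shell
# Hypothetical helper: print every whitespace-separated field except field N.
# Usage: drop_field N < input
drop_field() {
  awk -v n="$1" '{
    out = ""; sep = ""
    for (i = 1; i <= NF; i++) {
      if (i == n) continue      # skip the unwanted column
      out = out sep $i          # rebuild the record field by field
      sep = OFS                 # separator is only added between kept fields
    }
    print out
  }'
}

echo "one two three four five" | drop_field 3
# prints: one two four five
```

Note that this joins the kept fields with a single OFS, so any original runs of spaces or tabs are collapsed.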
mauvehaus•3h ago
...And once you get away from the most basic, standard set of features, the several awks in existence have diverging sets of additional features.
chaps•1h ago
Things are already like that, friend! We have mawk, gawk and nawk. But it's fun to think about how we could improve our ideal tooling if we had a time machine.
SoftTalker•2h ago
> awk really shoots itself so much with its lack of features that it so desperately needs

Whence perl.

jcynix•1h ago
>awk really shoots itself so much with its lack of features that it so desperately needs!

That's why I use Perl instead (besides some short one liners in awk, which in some cases are even shorter than the Perl version) and do my JSON parsing in Perl.

This

diff -rs a/ b/ | awk '/identical/ {print $4}' | xargs rm

is one of my often used awk one liners. Unless some filenames contain e.g. whitespace, then it's Perl again.
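
The pipeline breaks on whitespace twice: awk's `$4` stops at the first space in the path, and `xargs` splits its input on whitespace again. A hedged sketch of a whitespace-safe variant in plain POSIX shell follows. It swaps the technique: instead of parsing `diff -rs` output, it walks the second directory with `find` and compares each file with `cmp -s`. The name `remove_identical` is hypothetical.

```shell
# Hypothetical helper: delete files under DIR_B that are byte-identical to the
# file at the same relative path under DIR_A.
# Usage: remove_identical DIR_A DIR_B
remove_identical() {
  # ${1%/} and ${2%/} strip one trailing slash so "b/" and "b" behave alike.
  find "${2%/}" -type f -exec sh -c '
    a=$1 b=$2
    shift 2
    for f in "$@"; do
      rel=${f#"$b"/}                      # path relative to DIR_B
      if [ -f "$a/$rel" ] && cmp -s "$a/$rel" "$f"; then
        rm -- "$f"
      fi
    done
  ' sh "${1%/}" "${2%/}" {} +
}
```

Because `find` hands the paths to the inner shell as separate arguments, spaces and even newlines in filenames never pass through a text pipeline. Called as `remove_identical a b`, it removes from `b`, as the original one-liner does.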

wutwutwat•2h ago
Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should.
chubot•2h ago
> JSON is not a friendly format to the Unix shell — it’s hierarchical, and cannot be reasonably split on any character

Yes, shell is definitely too weak to parse JSON!

(One reason I started https://oils.pub is because I saw that bash completion scripts try to parse bash in bash, which is an even worse idea than trying to parse JSON in bash)

I'd argue that Awk is ALSO too weak to parse JSON

> The following code assumes that it will be fed valid JSON. It has some basic validation as a function of the parsing and will most likely throw an error if it encounters something strange, but there are no guarantees beyond that.

Yeah I don't like that! If you don't reject invalid input, you're not really parsing

---

OSH and YSH both have JSON built-in, and they have the hierarchical/recursive data structures you need for the common Python/JS-like API:

    osh-0.33$ var d = { date: $(date --iso-8601) }

    osh-0.33$ json write (d) | tee tmp.txt
    {
      "date": "2025-06-28"
    }
Parse, then pretty print the data structure you got:

    $ cat tmp.txt | json read (&x)

    osh-0.33$ = x
    (Dict)  {date: '2025-06-28'}
Create a JSON syntax error on purpose:

    osh-0.33$ sed 's/"/bad/"' tmp.txt | json read (&x)
    sed: -e expression #1, char 9: unknown option to `s'
      sed 's/"/bad/"' tmp.txt | json read (&x)
                                     ^~~~
    [ interactive ]:20: json read: Unexpected EOF while parsing JSON (line 1, offset 0-0: '')
(now I see the error message could be better)

Another example from wezm yesterday: https://mastodon.decentralised.social/@wezm/1147586026608361...

YSH has JSON natively, but for anyone interested, it would be fun to test out the language by writing a JSON parser in YSH

It's fundamentally more powerful than shell and awk because it has garbage-collected data structures - https://www.oilshell.org/blog/2024/09/gc.html

Also, OSH is now FASTER than bash, in both computation and I/O. This is despite garbage collection, and despite being written in typed Python! I hope to publish a post about these recent improvements

packetlost•37m ago
I don't really buy that shell / awk is "too weak" to deal with JSON; the ecosystem of tools is just fairly immature, as most of the shell's common tools predate JSON by at least a decade. `jq` is a pretty reasonable addition to the standard set of tools included in environments by default.

IMO the real problem is that JSON doesn't work very well because its core abstraction is objects. It's a pain to deal with in pretty much every statically typed, non-object-oriented language unless you parse it into native, predefined data structures (think annotated Go structs, Rust, etc.).

alganet•31m ago
> Yes, shell is definitely too weak to parse JSON!

Parsing is trivial, rejecting invalid input is trivial; the problem is representing the parsed content in a meaningful way.
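
To make the representation problem concrete: awk's only compound data structure is the flat associative array, so one common workaround is to flatten the hierarchy into path keys. The sketch below assumes the JSON has already been flattened by some other tool into tab-separated `path<TAB>value` lines; that line format, the dotted path syntax, and the helper name `lookup` are all hypothetical choices for illustration.

```shell
# Hypothetical helper: look up one dotted path in flattened "path<TAB>value" input.
# Usage: lookup PATH < flattened.tsv   (exit status 1 if the path is absent)
lookup() {
  awk -F '\t' -v want="$1" '
    { value[$1] = $2 }            # one flat array keyed by the full path
    END {
      if (want in value) {
        print value[want]
      } else {
        exit 1
      }
    }
  '
}

printf 'user.name\tAda\nuser.langs.0\tawk\n' | lookup user.name
# prints: Ada
```

The flat keys make single lookups easy, but anything structural (iterating a sub-object, telling an array from an object, keeping types) has to be re-derived from the key strings.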

> bash completion scripts try to parse bash in bash

You're talking about ble.sh, right? I investigated it as well.

I think they made some choices that eventually led to the parser being too complex, largely due to the problem of representing what was parsed.

> Also, OSH is now FASTER than bash, in both computation and I/O.

According to my tests, this is true. Congratulations!
