I've been working on a Go port of jsmn, the minimal C JSON tokenizer. The goal was to create a version that leverages goroutines to parse large JSON files in parallel. It's part of a larger project I'm calling SafeHeaders-Go, where I'm attempting to create safe, concurrent Go ports of popular single-file C header libraries.
Currently, parallel parsing is performed by naively splitting the JSON input into chunks and processing them concurrently. It's showing a decent performance improvement (around 2x on larger files in my benchmarks), but I'm sure the chunking logic could be much smarter.
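To make the boundary concern concrete, here is a minimal sketch of fixed-size chunking. `splitFixed` is a hypothetical helper written for illustration, not code from jsmn-go, and the real chunker may differ. It shows how a byte-count split can land in the middle of a string token.

```go
package main

import "fmt"

// splitFixed cuts data into at most n nearly equal byte ranges with no
// regard for JSON structure. Hypothetical helper, not taken from jsmn-go.
func splitFixed(data []byte, n int) [][]byte {
	if n < 1 {
		n = 1
	}
	size := (len(data) + n - 1) / n // ceiling division
	if size == 0 {
		return nil // empty input
	}
	var chunks [][]byte
	for start := 0; start < len(data); start += size {
		end := start + size
		if end > len(data) {
			end = len(data)
		}
		chunks = append(chunks, data[start:end])
	}
	return chunks
}

func main() {
	input := []byte(`{"name":"jsmn","tags":["tiny","fast"]}`)
	for i, c := range splitFixed(input, 3) {
		// The first chunk ends inside the string "jsmn", so a tokenizer
		// that sees only that chunk cannot close the token.
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

With this input the first chunk ends before the closing quote of "jsmn", which is exactly the split-token case that has to be stitched back together or avoided.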
I have two main questions for the community:
1. How would you approach the parallel chunking more robustly? I'm concerned about correctly handling tokens that get split across chunk boundaries.
2. Are there other popular C header libraries you'd like to see ported to Go in a safe, concurrent form? I've been considering something like stb_image.
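For question 1, one possible direction is to cut only at structural boundaries instead of byte offsets. The sketch below rests on stated assumptions and is not what jsmn-go currently does: it assumes the input is a single top-level JSON array, uses a hypothetical helper `splitTopLevelArray` to find element boundaries in one sequential pass (tracking string and escape state), and uses `encoding/json` as a stand-in for the tokenizer when handling the elements in parallel.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sync"
)

// splitTopLevelArray returns the byte ranges of the elements of a top-level
// JSON array. Commas and brackets inside strings are ignored. Hypothetical
// helper: it assumes data is one JSON array and does not validate it.
func splitTopLevelArray(data []byte) [][]byte {
	var elems [][]byte
	depth, start := 0, -1
	inString, escaped := false, false
	for i, b := range data {
		if inString {
			switch {
			case escaped:
				escaped = false
			case b == '\\':
				escaped = true
			case b == '"':
				inString = false
			}
			continue
		}
		switch b {
		case '"':
			inString = true
		case '[', '{':
			depth++
			if depth == 1 {
				start = i + 1 // first element begins after the opening '['
			}
		case ']', '}':
			depth--
			if depth == 0 && start >= 0 {
				if e := bytes.TrimSpace(data[start:i]); len(e) > 0 {
					elems = append(elems, e)
				}
				start = -1
			}
		case ',':
			if depth == 1 {
				elems = append(elems, bytes.TrimSpace(data[start:i]))
				start = i + 1
			}
		}
	}
	return elems
}

func main() {
	input := []byte(`[{"a":"x,]y"}, [1,2], "s\"t", 42]`)
	elems := splitTopLevelArray(input)

	results := make([]interface{}, len(elems))
	var wg sync.WaitGroup
	for i, e := range elems {
		wg.Add(1)
		go func(i int, e []byte) {
			defer wg.Done()
			// encoding/json stands in for the real tokenizer here.
			if err := json.Unmarshal(e, &results[i]); err != nil {
				results[i] = err
			}
		}(i, e)
	}
	wg.Wait()

	for i, e := range elems {
		fmt.Printf("%d: %s -> %v\n", i, e, results[i])
	}
}
```

The boundary scan is sequential, so it only pays off when the per-element work dominates; it also does nothing for inputs that are one huge object or a deeply nested value, where a different strategy would be needed.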
I'm open to any and all feedback, and pull requests are very welcome.
alikatyc•3h ago
You can check out the jsmn-go implementation here: https://github.com/alikatgh/safeheaders-go/tree/main/jsmn-go