Show HN: Streaming gigabyte medical images from S3 without downloading them

https://github.com/PABannier/WSIStreamer
161•el_pa_b•3w ago

Comments

matthberg•3w ago
Seems very similar to how maps work on the web these days, in particular protomap files [0]. I wonder if you could view the medical images in leaflet or another frontend map library with the addition of a shim layer? Cool work!

0: https://protomaps.com/

el_pa_b•3w ago
Thanks! Indeed, digital pathology, satellite imaging and geospatial data share a lot of computational problems: efficient storage, fast spatial retrieval/indexing. I think this could be doable.

As for digital pathology, the field is very much tied to scanner-vendor proprietary formats (SVS, NDPI, MRXS, etc).

tonyhart7•3w ago
hey, I need this
lametti•3w ago
Interesting - I'm not so familiar with S3 but I wonder if this would work for WSI stored on-premises. Lower network requirements and a lightweight web viewer are very advantageous in this use case. I'll have to try it out!
el_pa_b•3w ago
When WSI are stored on-premises, they are typically stored on hard drives with a filesystem. If you have a filesystem, you can use OpenSlide, and use a viewer like OpenSeadragon to visualize the slide.

WSIStreamer is relevant for storage systems without a filesystem. In this case, OpenSlide cannot work (it needs to seek and open the file).

rurban•2w ago
Then mount the S3 filesystem. It's slow, though. But good if you have tools to filter them properly.
tokyovigilante•3w ago
This is really a job for JPEG-XL, which supports decode of portions of larger images and has recently been added to the DICOM standard.
dmd•2w ago
Or IIIF.
iberator•2w ago
No. JPEG compression sucks. Medical data should not be compressed lossily. PNG and TIFF for the win.
vrighter•2w ago
unlike jpeg, jpeg-xl supports lossless compression too.
nszceta•2w ago
The original JPEG supports a lossless mode.

JPEG-LL refers to the lossless mode of the original JPEG standard (ISO/IEC 10918-1 or ITU-T T.81), also known as JPEG Lossless, and not to be confused with JPEG-LS (ISO/IEC 14495-1, Transfer Syntax 1.2.840.10008.1.2.4.80), which offers better ratios and speed via LOCO-I algorithm. JPEG-LL is older and less efficient yet more widely implemented in legacy systems.

The lossless mode in JPEG-XL is superior to all of those.

tokyovigilante•2w ago
Yes, I am referring to lossless compression, and JPEG-XL also supports progressive lossless decoding. It also supports the 12- and 16-bit colour depths required for CT and DR.
Nora23•3w ago
How does this handle images with different compression formats?
el_pa_b•2w ago
Currently we only support TIFF and SVS with JPEG and JPEG2000 compression formats. I plan on supporting more file extensions (e.g. NDPI, MRXS) in the future, each with their own compression formats.
rwmj•3w ago
https://dicom.nema.org/dicom/dicomwsi/

Interesting guide to the Whole Slide Images (WSI) format. The surprising thing for me is that compression is used, and they note that it does not affect use in diagnostics.

Back in the day we used TIFF for a similar application (X-ray detector images).

yread•2w ago
Digital pathology images are just a lot bigger than radiology ones; we regularly see slides of 500k x 500k pixels.
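
For a sense of scale, here is a back-of-the-envelope calculation for the 500k x 500k figure. The 3 bytes per pixel (8-bit RGB) and the 256-pixel tile edge are assumptions chosen for illustration; neither value is stated in the thread.

```python
import math


def raw_size_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size of an image; 3 bytes/pixel assumes 8-bit RGB."""
    return width * height * bytes_per_pixel


def tile_count(width, height, tile=256):
    """Number of tile-sized squares needed to cover the image (edge tiles are partial)."""
    return math.ceil(width / tile) * math.ceil(height / tile)


side = 500_000
size = raw_size_bytes(side, side)
print(f"{size / 1e9:.0f} GB uncompressed")   # 750 GB
print(f"{tile_count(side, side):,} tiles")    # 3,818,116 tiles
```

At that size, fetching only the tiles in view is the only practical way to display the full-resolution level.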
el_pa_b•2w ago
Yes, they can be huge, and for modalities like multiplex immunofluorescence with up to 20 channels, you're often dealing with very faint proteomic signals. Preserving that signal is critical, and compression can destroy it quickly.
yread•2w ago
CODEX can do up to 120 channels, I think. They are also 16/32-bit. They are usually just deflated.
invaderJ1m•2w ago
How does this compare to things like COGs (Cloud Optimised GeoTIFFs) or other binary blob + index raster pyramid formats?

Was there a requirement to work with these formats directly without converting?

el_pa_b•2w ago
Yes there is a requirement to work with the vendor format. For instance, TCGA (The Cancer Genome Atlas - a large dataset of 12k+ human tumor cases) has mostly .svs files (scanned with an Aperio scanner). We tend to work with these formats as they contain all the metadata we need.

Sometimes, it happens that we re-write the image in a pyramidal TIFF format (happened to me a few times, where NDPI images had only the highest resolution level, no pyramid), in which case COGs could work.
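
As a rough illustration of what building a pyramid involves, the sketch below lists the dimensions of each level when every level halves the previous one. The factor of 2, the 256-pixel tile edge, and the stopping rule (stop once a level fits in a single tile) are assumptions, not details from the comment above.

```python
def pyramid_levels(width, height, tile=256):
    """Return (width, height) per level, halving until one tile covers the level."""
    levels = [(width, height)]
    while width > tile or height > tile:
        # Round up so an odd edge keeps its last pixel; never drop below 1.
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        levels.append((width, height))
    return levels


for level, (w, h) in enumerate(pyramid_levels(100_000, 80_000)):
    print(level, w, h)
```

Level 0 is the full-resolution image; a viewer picks the coarsest level that still satisfies the current zoom.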

andrewstuart•2w ago
Please don’t use AWS S3; there are vast numbers of much cheaper compatible choices.
thenaturalist•2w ago
Pretty bold half claim while not backing it up with a single data point. :D
PunchyHamster•2w ago
It's trivial to find and there are many alternatives.

The main problem is that most support only a subset of the more advanced S3 features, and often not a very big one. But if you just want to dump some backups in the cloud, Backblaze and other alternatives are cheaper.

imhoguy•2w ago
Especially when you have to account for HIPAA/GDPR/legalese, and some serious SecOps behind that.
el_pa_b•2w ago
As data scientists, we usually don't get to choose. It's usually up to the hospital or digital lab's CISO to decide where the digitized slides are stored, and S3 is a fairly common option.

That being said, I plan to support more cloud platforms in the future, starting with GCP.

lijok•2w ago
I guess by "compatible" you mean the data plane.

There are choices that speak the S3 data plane API (GetObject, ListBucket, etc).

There are no alternatives that support most of the AWS S3 functionality, such as replication and event notifications.

leptons•2w ago
None? I've seen a few projects that purport to be a drop-in replacement for S3.
kube-system•2w ago
“Cheap” is not always the #1 requirement for a project.
yread•2w ago
You could probably do it completely client-side. I have a parser for 12 scanner formats in JS. It doesn't read the pixels, just parses metadata, but JPEG is easy and most common anyway.
Sleaker•2w ago
Maybe a bit pedantic, but if you're streaming it, then you're still downloading portions of it, yah? Just not persisting the whole thing locally before viewing it.

Edit: Looks like this is a slight discrepancy between the HN title and the GitHub description.

el_pa_b•2w ago
Yes, I agree. I'm not persisting the WSI locally, which creates a smoother user experience. But I do need to transfer tiles from server to client. They are stored in an LRU cache and evicted if not used.
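
A minimal sketch of such an LRU tile cache, for readers unfamiliar with the pattern. The class name, the `(level, x, y)` key, and counting capacity in tiles rather than bytes are all assumptions; this is not WSIStreamer's actual code.

```python
from collections import OrderedDict


class TileCache:
    """Least-recently-used cache keyed by (level, x, y); capacity counts tiles."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._tiles = OrderedDict()

    def get(self, key):
        """Return the cached tile and mark it most recently used, or None on a miss."""
        if key not in self._tiles:
            return None
        self._tiles.move_to_end(key)
        return self._tiles[key]

    def put(self, key, tile):
        """Insert or refresh a tile, evicting the least recently used one if full."""
        self._tiles[key] = tile
        self._tiles.move_to_end(key)
        if len(self._tiles) > self._capacity:
            self._tiles.popitem(last=False)

    def __contains__(self, key):
        return key in self._tiles


cache = TileCache(capacity=2)
cache.put((0, 0, 0), b"tile-a")
cache.put((0, 1, 0), b"tile-b")
cache.get((0, 0, 0))             # touch tile-a so tile-b becomes the oldest
cache.put((0, 2, 0), b"tile-c")  # over capacity: tile-b is evicted
```

A production cache would more likely bound total bytes than tile count, since compressed tiles vary in size.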
tomnicholas1•2w ago
The generalized form of this range-request-based streaming approach looks something like my project VirtualiZarr [0].

Many of these scientific file formats (HDF5, netCDF, TIFF/COG, FITS, GRIB, JPEG and more) are essentially just contiguous multidimensional array(/"tensor") chunks embedded alongside metadata about what's in the chunks. Efficiently fetching these from object storage is just about efficiently fetching the metadata up front so you know where the chunks you want are [1].

The data model of Zarr [2] generalizes this pattern pretty well, so that when backed by Icechunk [3], you can store a "datacube" of "virtual chunk references" that point at chunks anywhere inside the original files on S3.

This allows you to stream data out as fast as the S3 network connection allows [4], and then you're free to pull that directly, or build tile servers on top of it [5].

In the Pangeo project and at Earthmover we do all this for Weather and Climate science data. But the underlying OSS stack is domain-agnostic, so works for all sorts of multidimensional array data, and VirtualiZarr has a plugin system for parsing different scientific file formats.

I would love to see if someone could create a virtual Zarr store pointing at this WSI data!

[0]: https://virtualizarr.readthedocs.io/en/stable/

[1]: https://earthmover.io/blog/fundamentals-what-is-cloud-optimi...

[2]: https://earthmover.io/blog/what-is-zarr

[3]: https://earthmover.io/blog/icechunk-1-0-production-grade-clo...

[4]: https://earthmover.io/blog/i-o-maxing-tensors-in-the-cloud

[5]: https://earthmover.io/blog/announcing-flux
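
To make the "virtual chunk reference" idea concrete, here is a toy sketch: a manifest maps each chunk index to a byte range inside an existing file, and a read fetches only that range. This is not the Zarr, Icechunk, or VirtualiZarr API; `ChunkRef`, `read_chunk`, the `fetch` callable, and the bucket name are all hypothetical, and an in-memory dict stands in for object storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRef:
    """Where one array chunk lives inside an existing file (all fields illustrative)."""
    uri: str
    offset: int
    length: int


def read_chunk(manifest, chunk_index, fetch):
    """Look up a chunk's byte range and fetch only those bytes.

    `fetch(uri, start, end)` is a hypothetical stand-in for a ranged GET
    returning the bytes in the half-open interval [start, end).
    """
    ref = manifest[chunk_index]
    return fetch(ref.uri, ref.offset, ref.offset + ref.length)


# In-memory stand-in for object storage, so the sketch runs without a network.
objects = {"s3://example-bucket/a.bin": b"HEADERchunk-00chunk-01TRAILER"}


def fetch(uri, start, end):
    return objects[uri][start:end]


manifest = {
    (0, 0): ChunkRef("s3://example-bucket/a.bin", offset=6, length=8),
    (0, 1): ChunkRef("s3://example-bucket/a.bin", offset=14, length=8),
}

print(read_chunk(manifest, (0, 1), fetch))  # b'chunk-01'
```

The original file is never rewritten; only the small manifest has to be built and stored.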

el_pa_b•2w ago
Thanks for sharing! I agree that newer scientific formats will need to deeply think about how they are deciphered directly from cloud storage.
tomnicholas1•2w ago
IMO Zarr is that newer format. It abstracts over the features of all these other formats so neatly that it can literally subsume them.

I feel that we no longer really need TIFF etc. - for scientific use cases in the cloud Zarr is all that's needed going forwards. The other file formats become just archival blobs that either are converted to Zarr or pointed at by virtual Zarr stores.

bwfan123•2w ago
thanks for sharing !
derefr•2w ago
Sounds like an approach that would also work for ML model weights files — just another kind of multidimensional array with metadata.

I wonder what exactly the big multi-model AI companies are doing to optimize model cold-start latency, and how much it just looks like Zarr on top of on-prem object storage.

tomnicholas1•2w ago
People have literally used Zarr for this - at one point Gemini used Zarr for checkpointing model weights. Not sure what the current fashion in that space is though.

It's definitely one of many fields that see convergent evolution towards something that just looks like Zarr. In fact you can use VirtualiZarr to parse HuggingFace's "SafeTensors" format [0].

[0]: https://github.com/zarr-developers/VirtualiZarr/pull/555

adolph•2w ago
> Many of these scientific file formats (HDF5, netCDF, TIFF/COG, FITS, GRIB, JPEG and more) are essentially just contiguous multidimensional array(/"tensor") chunks

Yeah, a recurring thought is that these should condense into Apache Arrow queried by DuckDB but there must be some reason for this not to have already happened.

isuckatcoding•2w ago
Is there a visual demo of this?
mlhpdx•2w ago
A while back I worked on a project where s3 held giant zip files containing zip files (turtles all the way down) and also made good use of range requests. I came up with seekable-s3-stream[1] to generalize working with them via an idiomatic C# stream.

[1] https://github.com/mlhpdx/seekable-s3-stream
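
The same idea can be sketched in a few lines of Python: a read-only stream whose `seek` only moves a cursor and whose `read` turns into one ranged fetch. This is an illustration of the pattern, not a port of seekable-s3-stream; `RangeStream` and the `fetch` callable are hypothetical, and an in-memory blob stands in for the S3 object.

```python
import io


class RangeStream:
    """Read-only seekable stream over a `fetch(start, end)` callable.

    `fetch` is a hypothetical stand-in for a ranged GET and must return the
    bytes in the half-open interval [start, end).
    """

    def __init__(self, fetch, size):
        self._fetch = fetch
        self._size = size
        self._pos = 0

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos, io.SEEK_END: self._size}
        if whence not in base:
            raise ValueError(f"unsupported whence: {whence}")
        new_pos = base[whence] + offset
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos  # no I/O happens here; only the cursor moves
        return self._pos

    def read(self, n=-1):
        end = self._size if n is None or n < 0 else min(self._pos + n, self._size)
        if end <= self._pos:
            return b""  # at or past the end: nothing to fetch
        data = self._fetch(self._pos, end)
        self._pos += len(data)
        return data


# In-memory stand-in for a remote object; records which ranges were requested.
blob = bytes(range(256)) * 4
requests = []


def fetch(start, end):
    requests.append((start, end))
    return blob[start:end]


stream = RangeStream(fetch, len(blob))
stream.seek(-16, io.SEEK_END)  # e.g. jump to a trailer at the end of the object
tail = stream.read(16)
```

Reading the last 16 bytes costs one small request instead of a full download, which is what makes formats with an index at the end (such as zip) workable over object storage. A real implementation would usually add read-ahead buffering to avoid many tiny requests.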

el_pa_b•2w ago
Nice!
tonymet•2w ago
If only we had NFS to begin with
carderne•2w ago
I did something similar once for a mining technique called “core logging”. It’s a single photo about 1000 pixels wide and several million “deep”: what the earth looks like for a few km down.

Existing solutions are all complicated and clunky, so I put something together with S3 and bastardised CoGeoTIFF: instant view of any part of the image.

Wish I knew how to commercialise it…

el_pa_b•2w ago
I'm curious about the "core logging" photo. Where can I find one? Do you have an implementation of your solution? I would be curious to have a look at it.
czbond•2w ago
@carderne I think el_pa_b has an idea on how to commercialize it.

In all seriousness, how is it not useful for gold mining or fracking?

carderne•2w ago
Might not be possible to find any, they’re expensive and niche. If you reach out (email in profile) I can show/share how it works (nothing currently public).
carderne•2w ago
I wasn't able to find any imagery online, and I don't have anything I can share publicly.

These are some of the existing commercial solutions (just found these on Google, can't remember which I was comparing my own work against):

- https://koregeosystems.com/digital-core-logging/

- https://mountsopris.com/wellcad/core-logging-software/

- https://www.geologicai.com/logging/

I don't know enough about the science side to take it any further on my own.

The "tech" part of what I started building is really quite simple: convert the images to Cloud-optimised GeoTIFF, then do range requests to S3 from the browser.
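
A hedged sketch of the first step such a client might take: read the TIFF header to learn where the first image file directory (IFD) sits, then build the `Range` header for the next request. The function names are hypothetical and this is not the commenter's implementation; it only relies on the standard TIFF and BigTIFF header layouts and on HTTP byte ranges being inclusive at both ends.

```python
import struct


def first_ifd_offset(header):
    """Return the byte offset of the first IFD, given the first 16 bytes of a TIFF."""
    order = {b"II": "<", b"MM": ">"}.get(header[:2])
    if order is None:
        raise ValueError("not a TIFF: bad byte-order mark")
    (magic,) = struct.unpack(order + "H", header[2:4])
    if magic == 42:  # classic TIFF: 4-byte offset stored at bytes 4..8
        return struct.unpack(order + "I", header[4:8])[0]
    if magic == 43:  # BigTIFF: 8-byte offset stored at bytes 8..16
        return struct.unpack(order + "Q", header[8:16])[0]
    raise ValueError(f"not a TIFF: magic {magic}")


def range_header(offset, length):
    """HTTP Range value for `length` bytes starting at `offset` (end is inclusive)."""
    if length < 1:
        raise ValueError("length must be positive")
    return f"bytes={offset}-{offset + length - 1}"


# Synthetic headers built in memory, so the sketch runs without a real file.
classic = struct.pack("<2sHI", b"II", 42, 8)
bigtiff = struct.pack(">2sHHHQ", b"MM", 43, 8, 0, 16)

print(range_header(0, 16))                         # bytes=0-15
print(first_ifd_offset(classic), first_ifd_offset(bigtiff))  # 8 16
```

From the IFD the client can read the tile offset and byte-count tables, after which each visible tile is one more ranged request.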

kirubakaran•2w ago
Of course you could commercialize it!

You've already done the "building v1" part, and have started to do the "talking about it" part.

Next step is to write up how one could use it, how it is better than the alternatives, and put it up on a website.

I'm happy to chat about it if you like. My email is in my profile.

Once you have real users, they will pull the v2 out of you, and that will be what you'll sell.

What I've written above sounds like a business proposition, but I want to clarify that I'm just offering to share what I know for free :-)