How we index images for RAG

https://www.kapa.ai/blog/how-we-index-images-for-rag

29•mooreds•4h ago

Comments

hparadiz•42m ago

That cookie popup just makes me wanna leave and never come back

dang•26m ago

I think they've fixed it now.

emil_sorensen•15m ago

Thanks! Yep fixed

bad_username•37m ago

> we don't send images to the model at query time. We describe each image once, at indexing time, with a cheap vision model, store the descriptions as text, and retrieve them alongside ordinary text chunks

This is what I've been doing in my Obsidian infodump for a while. If I know that an image is important, I generate a text description (Mermaid if possible, English if not) and paste it after the image in a block. This lets agents "see" the image even though they don't really see it. Though my process is manual, the improvement in outcomes for agents that rely on text search/retrieval is very real and worth it.
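To make the quoted approach concrete, here is a minimal sketch of describing images once at indexing time and retrieving the descriptions next to ordinary text chunks. It is not the article's implementation: `Chunk`, `index_documents`, `retrieve`, and `fake_describe` are hypothetical names, the vision-model call is replaced by a stub, and the keyword-overlap ranking is only a stand-in for whatever retrieval a real system would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Chunk:
    source: str  # file or image path the text came from
    kind: str    # "text" or "image_description"
    text: str


def index_documents(
    text_chunks: dict[str, str],
    image_paths: list[str],
    describe_image: Callable[[str], str],
) -> list[Chunk]:
    """Build one flat index; each image is described exactly once, here."""
    index = [Chunk(src, "text", body) for src, body in text_chunks.items()]
    for path in image_paths:
        index.append(Chunk(path, "image_description", describe_image(path)))
    return index


def retrieve(index: list[Chunk], query: str, k: int = 3) -> list[Chunk]:
    """Toy keyword-overlap ranking over text and image descriptions alike."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(c.text.lower().split())), c) for c in index]
    scored = [(score, c) for score, c in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


def fake_describe(path: str) -> str:
    # Hypothetical stand-in for a call to a cheap vision model.
    return f"diagram showing login flow between browser and auth server ({path})"


index = index_documents(
    {"docs/auth.md": "Users sign in with a password or an API token."},
    ["img/login-flow.png"],
    fake_describe,
)
hits = retrieve(index, "login flow diagram")
```

No image is touched at query time here: `retrieve` only ever sees stored text, which is the property the quoted passage describes.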

hparadiz•7m ago

With media ingestion this is called "eager" processing. Historically it's been used for things like pulling thumbnails for images / video and pre-generating common sizes. This follows the same pattern and makes all the sense in the world. My only concern is that, due to the non-deterministic nature of LLMs, new models will reveal new information about your data.

For example, you might identify a car in an image, but the context is the car running a red light. A new model might pick that up while an old one doesn't. These context adjustments might sometimes require you to rerun your LLM processing, or potentially to keep a one-to-many relationship for multiple runs so you can take the best one or combine results.
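A rough sketch of that one-to-many idea, assuming an in-memory store keyed by image: every run is appended instead of overwriting the previous one, so a later model's description can be preferred or merged with earlier ones. `DescriptionStore`, the model names, and the descriptions are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DescriptionRun:
    model: str        # hypothetical model identifier
    description: str  # text the model produced for the image


class DescriptionStore:
    """Keeps every description run per image (one image, many runs)."""

    def __init__(self) -> None:
        self._runs: dict[str, list[DescriptionRun]] = defaultdict(list)

    def add_run(self, image_id: str, model: str, description: str) -> None:
        self._runs[image_id].append(DescriptionRun(model, description))

    def latest(self, image_id: str) -> str:
        # Simplest "best of" policy: trust the most recent run.
        return self._runs[image_id][-1].description

    def combined(self, image_id: str) -> str:
        # Merge all runs, dropping exact duplicate descriptions.
        seen: list[str] = []
        for run in self._runs[image_id]:
            if run.description not in seen:
                seen.append(run.description)
        return " ".join(seen)


store = DescriptionStore()
store.add_run("cam-001.jpg", "model-v1", "A car at an intersection.")
store.add_run("cam-001.jpg", "model-v2", "A car running a red light.")
```

Treating the newest run as the best one is only one possible policy; a scoring step or a human review could replace `latest` without changing the storage shape.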

Actual usage will also reveal the most commonly used assets, so you can target the most trafficked ones and save a ton on processing that way.
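As a small illustration of targeting the most trafficked assets, the sketch below counts hits in an access log and picks only the top few for reprocessing. The log format (a flat list of asset names), the function name, and the `budget` parameter are assumptions for the example.

```python
from collections import Counter


def pick_reprocessing_batch(access_log: list[str], budget: int) -> list[str]:
    """Return up to `budget` assets, most frequently accessed first."""
    counts = Counter(access_log)
    return [asset for asset, _ in counts.most_common(budget)]


# Hypothetical access log: one entry per retrieval of an asset.
access_log = ["a.png", "b.png", "a.png", "c.png", "a.png", "b.png"]
batch = pick_reprocessing_batch(access_log, budget=2)
```

With this log, only `a.png` and `b.png` would be sent back through the model, and the rarely used `c.png` keeps its existing description.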

MAI-Code-1-Flash

https://microsoft.ai/news/introducingmai-code-1-flash/

214•EvanZhouDev•2h ago•99 comments

Gmail thinks I'm stupid, so I left

https://moddedbear.com/gmail-thinks-im-stupid-so-i-left

251•speckx•1h ago•126 comments

MAI-Thinking-1

https://microsoft.ai/news/introducing-mai-thinking-1/

81•LER0ever•2h ago•25 comments

Open Repair Data Standard – Open Repair Alliance

https://openrepair.org/open-data/open-standard/

30•cassepipe•1h ago•1 comments

A walking tour of surveillance infrastructure in Seattle (2020)

https://coveillance.org/a-walking-tour-of-surveillance-infrastructure-in-seattle/

334•eustoria•7h ago•190 comments

GitHub Copilot App

https://github.com/features/preview/github-app

68•theanonymousone•2h ago•41 comments

Launch HN: Rudus (YC P26) – AI for concrete contractors

26•rishipankhaniya•2h ago•1 comments

The advertising cartel coming to your web browser

https://blog.zgp.org/the-advertising-cartel-coming-to-your-web-browser/

50•speckx•1h ago•11 comments

Trump signs downsized AI order after weeks of reversals

https://www.politico.com/news/2026/06/02/trump-signs-downsized-ai-order-00946389

110•_alternator_•4h ago•78 comments

HP re-releases classic computer science calculator: The HP-16C

https://hpcalcs.com/product/hp-16c-collectors-edition/

36•dm319•1h ago•15 comments

Adafruit receives demand letter from Fenwick legal counsel on behalf of Flux.ai

https://blog.adafruit.com/

539•semanser•10h ago•226 comments

Uber caps employee AI spending after blowing through budget in four months

https://techcrunch.com/2026/06/02/uber-caps-employee-ai-spending-after-blowing-through-budget-in-...

19•notfried•42m ago•2 comments

QBE – Compiler Backend – 1.3

https://c9x.me/compile/release/qbe-1.3.html

48•birdculture•3h ago•7 comments

Bringing Up DeepSeek-V4-Flash on AMD MI300X

https://fergusfinn.com/blog/deepseek-v4-flash-mi300x/

49•kkm•2h ago•4 comments

How we index images for RAG

https://www.kapa.ai/blog/how-we-index-images-for-rag

30•mooreds•4h ago•5 comments

CT scans of BYD car parts

https://www.lumafield.com/scan-of-the-month/byd

8•viasfo•21m ago•1 comments

Why Janet? (2023)

https://ianthehenry.com/posts/why-janet/

402•yacin•11h ago•212 comments

Expanding Project Glasswing

https://www.anthropic.com/news/expanding-project-glasswing

134•surprisetalk•7h ago•178 comments

Fidonet: Technology, Use, Tools, and History (1993)

https://www.fidonet.org/inet92_Randy_Bush.txt

126•BruceEel•6h ago•43 comments

Preparing for KDE Plasma's Last X11-Supported Release

https://blog.davidedmundson.co.uk/blog/596/

107•jandeboevrie•6h ago•129 comments

BQN: What Is a Primitive?

https://mlochbaum.github.io/BQN/commentary/primitive.html

21•tosh•3d ago•1 comments

Made a Tool to Stream Changes from Microsoft SQL Server to Apache Kafka

https://github.com/Niyko/Athena

6•hyvr_official•2d ago•1 comments

Love systemd timers

https://blog.tjll.net/you-dont-love-systemd-timers-enough/

295•yacin•11h ago•189 comments

Great Question (YC W21) Is Hiring Applied AI Interns

https://www.ycombinator.com/companies/great-question/jobs/J5TNvQH-ai-engineer-intern

1•nedwin•8h ago

Microsoft announces Scout, an autonomous AI agent built on OpenClaw

https://www.computerworld.com/article/4180103/microsoft-unveils-scout-an-autonomous-ai-agent-buil...

45•EvanZhouDev•2h ago•40 comments

Rethinking search as code generation

https://research.perplexity.ai/articles/rethinking-search-as-code-generation

56•1zael•4h ago•16 comments

Three Ways to Get Paid (2018)

https://jasonzweig.com/three-ways-to-get-paid/

182•nate•3h ago•113 comments

Age verification for social media, the beginning of the end for a free internet?

https://mullvad.net/en/blog/age-verification-for-social-media-the-beginning-of-the-end-for-a-free...

376•StrLght•21h ago•271 comments

Multicore support for DOS is real – partly

https://www.vogons.org/viewtopic.php?t=111336

15•beebix•2d ago•3 comments

Show HN: RePlaya – self-hosted browser session replay with live tailing

https://github.com/s2-streamstore/replaya

22•shikhar•3h ago•3 comments