
VectorSmuggle: Covertly Exfiltrate Data in Embeddings

https://github.com/jaschadub/VectorSmuggle
34•smugglereal•1d ago

Comments

smugglereal•1d ago
A comprehensive proof-of-concept demonstrating sophisticated vector-based data exfiltration techniques in AI/ML environments. This educational security research project illustrates potential risks in RAG systems and provides tools for defensive analysis.
acmiyaguchi•1d ago
The idea of using steganographic techniques to exfiltrate data is interesting, but I don't quite follow the general method outlined in the repository -- either through the generated documentation or code. The threat model and case studies seem contrived. I find it hard to believe that folks would expose data via RAG that they wouldn't want users of the underlying system to be privy to.

There's too much fluff here to be useful. I imagine having something that is concise and concrete would make it more appealing to others. But as-is, it's missing a good technical summary and demonstration.

smugglereal•1d ago
Thanks for the feedback!

It's less about the RAG exposing new data to a regular user, and more about using the vector pipeline as a covert channel. The idea is to sneak out data the attacker already can access, but in a way that might bypass traditional DLP looking at emails, USBs, etc.

The "fluff" is largely educational material, as the project is for research and learning. For a concrete technical demonstration, the scripts/embed.py and scripts/query.py scripts are the core, and the docs/guides/quick_start.md tries to offer a direct path to seeing it in action.

Hope that helps! Will add a video demo soon.

anonymousiam•1d ago
Well over a decade ago, I recall learning about a covert data exfiltration method that could bypass firewalls by using DNS lookups. The payload would be a base64-encoded hostname prefix attached to an evil domain. Adding a timestamp to the prefix data would guarantee uniqueness and get around local caching DNS servers.
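
A minimal sketch of the hostname construction described above, for anyone who wants to see what such lookups would look like in a DNS log. It only builds strings and sends nothing. The function names, the `evil.example` domain, the 30-byte chunk size, and the `seq-timestamp` label layout are all assumptions made for illustration. URL-safe base64 is used here to match the comment, but since DNS names are case-insensitive and many resolvers do not preserve case, base32 or hex is probably a safer encoding in practice.

```python
import base64
import time

MAX_LABEL = 63  # DNS limits each label to 63 bytes


def exfil_hostnames(payload, domain, chunk_size=30, now=None):
    """Return the lookup names that would carry `payload` (no network I/O)."""
    # A timestamp makes every name unique, so caching resolvers cannot answer locally.
    stamp = int(time.time() if now is None else now)
    names = []
    for seq, start in enumerate(range(0, len(payload), chunk_size)):
        chunk = payload[start:start + chunk_size]
        # Padding is stripped because "=" is not usable in a hostname label.
        label = base64.urlsafe_b64encode(chunk).decode("ascii").rstrip("=")
        if len(label) > MAX_LABEL:
            raise ValueError("chunk_size too large for a single DNS label")
        names.append(f"{label}.{seq}-{stamp}.{domain}")
    return names


def decode_label(label):
    """Recover the bytes from one payload label (what the receiving side would do)."""
    padded = label + "=" * (-len(label) % 4)
    return base64.urlsafe_b64decode(padded)
```

The receiving side would be whoever runs the authoritative name server for the domain: it sees every unique query and can reassemble the chunks by sequence number.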
DrScientist•1d ago
Yep - bottom line you just use a protocol you know the firewall won't/can't block.

In theory you don't even need anything in the payload - you could put information in the timing of the DNS requests à la Morse code...
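
A rough sketch of the timing idea, assuming a simpler fixed-width binary scheme rather than actual Morse code: a short gap between requests means 0 and a long gap means 1. The function names and the gap lengths (0.5 s, 1.5 s, 1.0 s threshold) are invented for illustration, and nothing here touches the network.

```python
def bits_of(data):
    """Yield the bits of `data`, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def to_delays(data, short=0.5, long=1.5):
    """Map each bit to the pause (in seconds) before the next request."""
    return [long if bit else short for bit in bits_of(data)]


def from_delays(delays, threshold=1.0):
    """Recover bytes from observed gaps; trailing partial bytes are dropped."""
    bits = [1 if delay > threshold else 0 for delay in delays]
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

The gap between the two delay values is what absorbs network jitter; the wider it is, the slower but more reliable the channel becomes.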

HTTP is the obvious other one - with many more options for somebody to exfiltrate data - you can think of ways where you don't even need an evil domain.

For example - you could exfiltrate data via hackernews comments!

As far as I can see, the only thing you can do in the end is make exfiltration harder, then monitor for unusual activity and hope that is enough to stop large-scale exfiltration, as small-scale exfiltration is impossible to stop.
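
One way the "monitor unusual activity" part might look for DNS specifically is a heuristic that flags lookups whose leftmost label is both long and high in character entropy, as encoded payloads tend to be. This is a sketch only: the function names and both thresholds are assumptions, not tuned values, and a check like this raises the cost of bulk exfiltration without catching slow or small transfers.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_suspicious(hostname, min_len=20, min_entropy=3.5):
    """Flag a hostname whose leftmost label is long and looks random."""
    label = hostname.split(".")[0]
    return len(label) >= min_len and shannon_entropy(label) >= min_entropy
```

In practice this would likely be combined with per-client query counts and the number of distinct subdomains seen under a single parent domain.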

stephantul•1d ago
Literal attack vectors