Show HN: STDM – Make Your Documents and Data Think by Embedding LLM Instructions

https://github.com/csiro/stdm
1•benl_c•6mo ago
Hi HN, I’m Ben from CSIRO, Australia’s national science agency. We’ve been exploring how to make data and documents "think" when you use them with LLMs. We call it Self-Thinking Data Manifests (STDM). The idea is to embed plain-text instructions directly within files that tell an LLM how it should think about that data and interact with the user. We demonstrate it with PDF and HTML documents, and we hope it will be possible for many more formats in the future.

Why Thinking Data?

* *Enhance PDF drag-and-drop* People already drag scientific papers and reports into LLMs to chat with them, but the interaction is often generic. STDM gives authors more control and customisation in these scenarios. It inverts custom chat-to-pdf systems: instead of building custom RAG interfaces on top of documents, we’re programming the LLM from within the document itself.

* *Author-directed interpretation* STDM helps ensure LLMs approach content with the author’s intended context and purpose, especially for complex scientific or technical data.

* *Smarter documents* Files with embedded STDM carry their own interactive logic, analysis routines, or guided explorations, making them more like mini-applications.

* *Towards in-document LLM programming* We see STDM as a step toward a future where data and instructions combine to form a kind of memory and quasi-procedural instruction set for LLMs; perhaps entire programs could live inside agentic LLM contexts using this approach.

To build an STDM, you define a GOAL for the LLM, set CONSTRAINTS for interpretation, suggest REQUESTED_TOOLS (such as code_interpreter for analysis or web_retrieval for context), and optionally sketch out a CUSTOM_UI_DEFINITION (e.g. a text-based UI, UX, or specific output format). When a user loads an STDM-enabled file into a capable LLM and explicitly tells the LLM to follow these instructions, the LLM uses the embedded manifest to guide its behaviour.

A mandatory Safety Preamble within the STDM instructs the LLM to await explicit user command and consent before executing any significant actions (especially tool use), ensuring the user is in control.
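
To make the pieces above concrete, here is a rough sketch of how a manifest could be assembled and embedded in an HTML file. The section names (GOAL, CONSTRAINTS, REQUESTED_TOOLS, CUSTOM_UI_DEFINITION, Safety Preamble) come from this post; everything else is an assumption for illustration only, including the `build_manifest` and `embed_in_html` helpers, the `KEY: value` layout, the `stdm-manifest` element id, and the preamble wording. The actual v0.1 syntax is defined in the spec on GitHub and may differ.

```python
from html import escape

# Hypothetical preamble wording; the real text is defined by the STDM spec.
SAFETY_PREAMBLE = (
    "Await an explicit user command and consent before taking any "
    "significant action, especially tool use."
)


def build_manifest(goal, constraints, requested_tools, custom_ui=None):
    """Render the manifest sections as plain text (assumed layout)."""
    lines = [
        "SAFETY_PREAMBLE: " + SAFETY_PREAMBLE,
        "GOAL: " + goal,
        "CONSTRAINTS:",
    ]
    lines += [f"- {c}" for c in constraints]
    lines.append("REQUESTED_TOOLS: " + ", ".join(requested_tools))
    if custom_ui:  # the UI definition is optional
        lines.append("CUSTOM_UI_DEFINITION: " + custom_ui)
    return "\n".join(lines)


def embed_in_html(body_html, manifest):
    """Place the manifest in a hidden block ahead of the body (assumed embedding)."""
    return (
        "<!DOCTYPE html>\n<html><body>\n"
        f'<pre id="stdm-manifest" hidden>\n{escape(manifest)}\n</pre>\n'
        f"{body_html}\n</body></html>"
    )


manifest = build_manifest(
    goal="Help the reader explore the floodplain study.",
    constraints=["Answer only from the document unless the user allows retrieval."],
    requested_tools=["code_interpreter", "web_retrieval"],
    custom_ui="Offer a numbered text menu of topics.",
)
page = embed_in_html("<h1>Floodplain Study</h1>", manifest)
print(page)
```

Putting the preamble first in the rendered text is a design choice in this sketch, so that an LLM reading top to bottom meets the consent rule before any tool request.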

STDM is designed to be model-agnostic. It has been tested with GPT, Claude, and Gemini; if an LLM can read text and follow structured instructions, it should work with STDM. See it in action (save the file, upload or paste it into your LLM, then tell the LLM: "Follow the STDM instructions in this document"):

* Interactive Floodplain Study (HTML) This one can think about fetching live news if you allow it: https://csiro.github.io/stdm/examples/floodplain.html

* Same study (PDF) See how it thinks to answer questions based on its embedded guide: https://csiro.github.io/stdm/examples/floodplain.pdf

* The Brain (GitHub Spec v0.1, more examples, 2-min explainer video in README): https://github.com/csiro/stdm

This is an early-stage v0.1 specification and very much an experiment. We’re excited by the potential of data that can explain itself or guide its own analysis via an LLM, data that can think! We’d love to hear your thoughts. Is this a useful direction for programming LLMs or creating more dynamic documents? What are the pitfalls (we’ve focused on explicit invocation and consent as key safeguards)? How might you use data that thinks or programs its own interaction?
