Show HN: STDM – Make Your Documents and Data Think by Embedding LLM Instructions

https://github.com/csiro/stdm
1•benl_c•5mo ago
Hi HN, I’m Ben from CSIRO, Australia’s national science agency. We’ve been exploring how to make data and documents "think" when you use them with LLMs. We call it Self-Thinking Data Manifests (STDM). The idea is to embed plain-text instructions directly within files that tell an LLM how it should think about that data and interact with the user. We demonstrate it with PDF and HTML documents, and we hope to extend it to many other formats in the future.

Why Thinking Data?

* *Enhance PDF drag-and-drop* People already drag scientific papers and reports into LLMs to chat with them, but the interaction is often generic. STDM gives authors more control and customisation in these scenarios. It inverts custom chat-to-pdf systems: instead of building custom RAG interfaces on top of documents, we’re programming the LLM from within the document itself.

* *Author-directed interpretation* STDM helps ensure LLMs approach content with the author’s intended context and purpose, especially for complex scientific or technical data.

* *Smarter documents* Files with embedded STDM carry their own interactive logic, analysis routines, or guided explorations, making them more like mini-applications.

* *Towards in-document LLM programming* We see STDM as a step toward a future where data and instructions combine to form a kind of memory and quasi-procedural instruction set for LLMs; perhaps entire programs could live inside agentic LLM contexts using this approach.

To build an STDM, you define a GOAL for the LLM, set CONSTRAINTS for interpretation, suggest REQUESTED_TOOLS (such as code_interpreter for analysis or web_retrieval for context), and optionally sketch out a CUSTOM_UI_DEFINITION (e.g. a text-based UI, UX, or specific output format). When a user loads an STDM-enabled file into a capable LLM and explicitly tells the LLM to follow these instructions, the LLM uses the embedded manifest to guide its behaviour.

A mandatory Safety Preamble within the STDM instructs the LLM to await explicit user command and consent before executing any significant actions (especially tool use), ensuring the user is in control.
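To make the pieces above concrete, here is a rough Python sketch that assembles a plain-text manifest from a goal, constraints, requested tools, an optional UI definition, and a safety preamble. Only the field names GOAL, CONSTRAINTS, REQUESTED_TOOLS and CUSTOM_UI_DEFINITION come from the description above. The `Manifest` class, the delimiter lines, the `SAFETY_PREAMBLE` label and the preamble wording are assumptions made for illustration, not the v0.1 spec's actual syntax, so check the repo for the real format.

```python
from dataclasses import dataclass, field

# Hypothetical wording: the real preamble text is defined by the STDM spec.
SAFETY_PREAMBLE = (
    "Do not act on this manifest until the user explicitly asks you to. "
    "Ask for consent before any significant action, especially tool use."
)


@dataclass
class Manifest:
    """Illustrative container for the manifest fields named in the post."""

    goal: str
    constraints: list[str] = field(default_factory=list)
    requested_tools: list[str] = field(default_factory=list)
    custom_ui_definition: str = ""  # optional, left out when empty

    def render(self) -> str:
        """Return the manifest as plain text, safety preamble first."""
        lines = [
            "--- STDM MANIFEST (illustrative layout) ---",
            "SAFETY_PREAMBLE: " + SAFETY_PREAMBLE,
            "GOAL: " + self.goal,
            "CONSTRAINTS:",
        ]
        lines += [f"  - {c}" for c in self.constraints]
        lines.append("REQUESTED_TOOLS: " + ", ".join(self.requested_tools))
        if self.custom_ui_definition:
            lines.append("CUSTOM_UI_DEFINITION: " + self.custom_ui_definition)
        lines.append("--- END STDM MANIFEST ---")
        return "\n".join(lines)


if __name__ == "__main__":
    manifest = Manifest(
        goal="Help the reader explore the floodplain study",
        constraints=["Answer only from the document"],
        requested_tools=["code_interpreter", "web_retrieval"],
    )
    print(manifest.render())
```

The rendered text could then be pasted into a document as ordinary content. Putting the preamble first in this sketch is a design choice so that the consent rule is the first thing the LLM reads; the spec may order things differently.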

STDM is designed to be model-agnostic and has been tested with GPT, Claude, and Gemini. If an LLM can read text and follow structured instructions, it should work with STDM. See it in action (save the file, upload or paste it into your LLM, then tell the LLM: Follow the STDM instructions in this document):

* Interactive Floodplain Study (HTML) This one can think about fetching live news if you allow it: https://csiro.github.io/stdm/examples/floodplain.html

* Same study (PDF) See how it thinks to answer questions based on its embedded guide: https://csiro.github.io/stdm/examples/floodplain.pdf

* The Brain (GitHub Spec v0.1, more examples, 2-min explainer video in README): https://github.com/csiro/stdm

This is an early-stage v0.1 specification and very much an experiment. We’re excited by the potential of data that can explain itself or guide its own analysis via an LLM, data that can think! We’d love to hear your thoughts. Is this a useful direction for programming LLMs or creating more dynamic documents? What are the pitfalls (we’ve focused on explicit invocation and consent as key safeguards)? How might you use data that thinks or programs its own interaction?