OpenAI are quietly adopting skills, now available in ChatGPT and Codex CLI

https://simonwillison.net/2025/Dec/12/openai-skills/
77•simonw•1h ago

Comments

simonw•1h ago
I had a bunch of fun writing about this one, mainly because it was a great excuse to highlight the excellent news about Kākāpō breeding season this year.

(I'm not just about pelicans.)

KK7NIL•1h ago
TIL about a large moss green flightless parrot :)
jb_rad•1h ago
Will Kākāpō be riding bicycles soon?
OrsonSmelles•35m ago
They already ride British nature photographers—what do they need bikes for?
throwup238•26m ago
https://youtube.com/watch?v=Jlk9u8MIv7o

The foreplay starts around the 1 minute mark.

bilekas•15m ago
> Skills are a keeper #

Good thinking, I agree actually, however..

> Skills are based on a very light specification, if you could even call it that, but I still think it would be good for these to be formally documented somewhere.

As with a lot of posts around AI, and I hope OP can speak to this, surely you can agree that while this can be used for a good, cool idea, it can also be used for the inverse, and probably to more detrimental effect. Why would they document an unmanageable feature that may be consumed?

Shareholder value might not go up if they learnt that the major product is learning bad things.

Have you tried, or would you try, this on a local LLM instead?

simonw•9m ago
These work well with local LLMs that are powerful enough to run a coding agent environment with a decent amount of context over longer loops.

The OpenAI GPT OSS models can drive Codex CLI, so they should be able to do this.

I have high hopes for Mistral's Devstral 2 but I've not run that locally yet.

koakuma-chan•1h ago
Does Cursor support skills?
smcleod•47m ago
No, I don't believe so. Cursor is usually pretty far behind other agentic coding tools in my experience.
hurturue•1h ago
Github Copilot too
simonw•59m ago
VS Code Copilot just announced experimental skill support in their November release: https://code.visualstudio.com/updates/v1_107#_reuse-your-cla...
jumploops•1h ago
I think the future is likely one that mixes the kitchen-sink style MCP resources with custom skills.

Services can provide an MCP-like layer that provides semantic definitions of everything you can do with said service (API + docs).

Skills can then be built that combine some subset of the 3rd party interfaces, some bespoke code, etc. and then surface these more context-focused skills to the LLM/agent.

Couldn’t we just use APIs?

Yes, but not every API is documented in the same way. An “MCP-like” registry might be the right abstraction for 3rd parties to expose their services in a semantic-first way.

dkdcio•1h ago
CLIs are really good when you can use them. self-documenting, agents already have shell tools, they tend to solve fine-grained auth, etc.

feels like the right layer of abstraction for remote APIs

esafak•26m ago
If only there was a way to progressively disclose the API in MCP instead of presenting the full laundry list up front.
simonw•25m ago
That is effectively what this proposal is about: https://www.anthropic.com/engineering/code-execution-with-mc...
bzmrgonz•1h ago
It is interesting that they are relying on visual reading for document ingestion instead of OCR. Recently I read an article which says handwriting recognition has matured, and I'm beginning to think this is the approach they are taking with handwriting recognition.
esperent•54m ago
I've been testing out Gemini Enterprise for use by staff in various positions at my business.

It's got the best implementation of a "skills-like" agent tool I've seen. Basically a visual tree builder, currently only one level deep. So I've set up the "<my company name> agent" and then it has subagents/skills for things like marketing/supply chain research/sysadmin/translation etc., each with a separate description, prompt, and knowledge base.

Unfortunately, everything else about Gemini Enterprise screams "early alpha, why the hell are you selling this as an actual finished product?".

For example, after I put half a day into setting up an agent and subagents, then went to share this with the other people helping me to test it, I found that... I can't. Literally no way to share agents in a tool that is supposedly for teams to use. I found one of the devs saying that sharing agents would be released in "about two weeks". That was two months ago.

Mini rant over... But my point is that skills are just "agents + auto-selecting sub-agents via a short description" and we'll see this pattern everywhere soon. Claude Skills have some additional sandboxing but that's mostly only interesting for coders.

ohghiZai•10m ago
Looking for a way to do this with ADK as well. Looks like skills can be a sweet spot between a giant instruction and sprawling tools/subagents.
mbesto•53m ago
From a purely technical view, skills are just an automated way to introduce user and system prompt stuffing into the context, right? Not to belittle this; rather, it seems like a way of reducing the need for AI wrapper apps, since most AI wrappers just do systematic user and system prompt stuffing + potentially RAG + potentially MCP.
simonw•38m ago
Yeah, there are a whole lot of AI wrapper applications that could be a folder with a markdown file in at this point!
petetnt•53m ago
It’s impressive how every iteration tries to get further from pretending actual AGI would be anywhere close when we are basically writing library functions with the worst DSL known to man, markdown-with-english.
cyanydeez•25m ago
Yes. Prompt engineering is like a shittier version of writing a VBA app inside Excel or Access.

Bloat has a new name and it's AI integration. You thought Chrome using GB per tab was bad, wait until you need a whole datacenter to use your coding environment.

Alex3917•10m ago
> Prompt engineering is like a shittier version of writing a VBA app inside Excel or Access.

Sure, if you could use VBA to read a patient's current complaint, vitals, and medical history, look up all the relevant research on Google Scholar, and then output a recommended course of treatment.

simonw•8m ago
The difference between prompting a coding agent and VBA is that with VBA you have to write and test and iterate on the code yourself.
ogogmad•24m ago
Gemini seems to be firmly in the lead now. OpenAI doesn't seem to have the SoTA. This should have bearing on whether or not LLMs have peaked yet.
skybrian•8m ago
This might actually be better in a certain way: if you change a real customer-facing API then customers will complain when you break their code. An LLM will likely adapt. So the interface is more flexible.

But perhaps an LLM could write an adapter that gets cached until something changes?
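One way to picture the cached adapter idea: fingerprint the upstream API schema and regenerate the adapter only when the fingerprint changes. This is a rough sketch, not anything described in the linked post. `AdapterCache`, `stub_generate` and the `id_field` schema key are all hypothetical names, and the generator is a plain function standing in for an LLM call.

```python
import hashlib
import json
from typing import Callable, Optional

Adapter = Callable[[dict], dict]


class AdapterCache:
    """Reuse a generated adapter until the upstream schema changes."""

    def __init__(self, generate: Callable[[dict], Adapter]) -> None:
        # Hypothetical: in practice `generate` would ask an LLM to write the adapter.
        self._generate = generate
        self._fingerprint: Optional[str] = None
        self._adapter: Optional[Adapter] = None
        self.generations = 0  # how many times the adapter was (re)built

    @staticmethod
    def fingerprint(schema: dict) -> str:
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(schema, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, schema: dict) -> Adapter:
        current = self.fingerprint(schema)
        if self._adapter is None or current != self._fingerprint:
            self._adapter = self._generate(schema)
            self._fingerprint = current
            self.generations += 1
        return self._adapter


def stub_generate(schema: dict) -> Adapter:
    """Stand-in for the LLM: map the schema's id field onto a stable 'id' key."""
    id_field = schema["id_field"]
    return lambda record: {"id": record[id_field]}
```

Calling `cache.get(schema)` twice with the same schema returns the same adapter object; a changed schema triggers exactly one regeneration.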

kenjackson•5m ago
I think really more than anything it’s become clear that AGI is an illusion. There’s nothing there. It’s the mirage in the desert: you keep walking towards it but it’s always out of reach, and it’s unclear if it even exists.

So companies are really trying to deliver value. This is the right pivot. If you gave me an AGI with a 100 IQ, that seems pretty much worthless in today’s world. But domain expertise - that I’ll take.

johnfn•3m ago
Literally yesterday we had a post about GPT-5.2, which jumped 30% on ARC-AGI 2, 100% on AIME without tools, and a bunch of other impressive stats. A layman's (mine) reading of those numbers feels like the models continue to improve as fast as they always have. Then today we have people saying every iteration is further from AGI. It really perplexes me how split-brain HN is on this topic.
8cvor6j844qw_d6•52m ago
Does this mean I can point to a code snippet and a link to the related documentation and have the coding agent refer to it instead of writing "outdated" code?

Some frameworks/languages move really fast unfortunately.

simonw•36m ago
Yes, definitely. I've had a lot of success already showing LLMs short examples of coding libraries they don't know about from their core training data.
lacker•45m ago
I'm not sure if I have the right mental model for a "skill". It's basically a context-management tool? Like a skill is a brief description of something, and if the model decides it wants the skill based on that description, then it pulls in the rest of whatever amorphous stuff the skill has, scripts, documents, what have you. Is this the right way to think about it?
canadiantim•42m ago
I think it’s also important to think of skills in the context of tasks, so when you want an agent to perform a specialized task, then this is the context, the resources and scripts it needs to perform the task.
simonw•41m ago
It's a folder with a markdown file in it plus optional additional reference files and executable scripts.

The clever part is that the markdown file has a section in it like this: https://github.com/datasette/skill/blob/a63d8a2ddac9db8225ee...

  ---
  name: datasette-plugins
  description: "Writing Datasette plugins using Python and the pluggy plugin system. Use when Claude needs to: (1) Create a new Datasette plugin, (2) Implement plugin hooks like prepare_connection, register_routes, render_cell, etc., (3) Add custom SQL functions, (4) Create custom output renderers, (5) Add authentication or permissions logic, (6) Extend Datasette's UI with menus, actions, or templates, (7) Package a plugin for distribution on PyPI"
  ---

On startup Claude Code / Codex CLI etc scan all available skills folders and extract just those descriptions into the context. Then, if you ask them to do something that's covered by a skill, they read the rest of that markdown file on demand before going ahead with the task.
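The startup scan described above can be sketched roughly as follows. This is a minimal illustration, not how Claude Code or Codex CLI actually implement it. It assumes each skill folder holds a file named `SKILL.md` (the thread only says "a markdown file") and uses a naive line-based reader rather than a real YAML parser. `read_frontmatter`, `scan_skills` and `load_skill_body` are hypothetical names.

```python
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict:
    """Return the key: value pairs between the leading '---' lines."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the metadata
            break
        # Split on the first colon only, so colons inside the value survive.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    return meta


def scan_skills(root: Path) -> list:
    """Collect only name, description and path for each skill folder."""
    catalog = []
    for skill_md in sorted(root.glob("*/SKILL.md")):  # assumed file name
        meta = read_frontmatter(skill_md)
        if "name" in meta and "description" in meta:
            catalog.append(
                {
                    "name": meta["name"],
                    "description": meta["description"],
                    "path": str(skill_md),
                }
            )
    return catalog


def load_skill_body(path: str) -> str:
    """Read the full markdown file; done only once a skill is chosen."""
    return Path(path).read_text(encoding="utf-8")
```

Only the output of `scan_skills` would go into the context up front; `load_skill_body` is the on-demand step, which keeps the cost of each installed skill down to its short description.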
behnamoh•34m ago
why did this simple idea take so long to become available? I remember even in llama 2 days I was doing this stuff, and that model didn't even function call.
simonw•28m ago
Skills only work if you have a full blown code execution environment with a model that can run ls and cat and execute scripts and suchlike.

The models are really good at driving those environments now which makes skills the right idea at the right time.

leetrout•33m ago
Have you used AWS bedrock? I assume these get pretty affordable with prompt caching...
throwaway314155•13m ago
Do skills get access to the current context or are they a blank slate?
simonw•11m ago
They execute within the current context - it's more that the content of the skill gets added to that context when it is needed.
jmalicki•36m ago
Yes. I find these very useful for enforcing things like debugging, committing code, making PRs, responding to PR feedback from AI review agents, etc., without constantly polluting the context window.

So when it's time to commit, make sure you run these checks, write a good commit message, etc.

Debugging is especially useful since AI agents can often go off the rails and go into loops rewriting code, so in a skill I can push for "Read the log messages. Insert some more useful debug assertions to isolate the failure. Write some more unit tests that are more specific." Etc.

ohghiZai•45m ago
Is there a way to implement skills with Gemini?
simonw•35m ago
Looks like they added it to the Gemini CLI public roadmap last week: https://github.com/google-gemini/gemini-cli/issues/11506#eve...
canadiantim•43m ago
Can or should skills be used for managing the documentation of dependencies in a project and the expertise in them?

I’ve been playing with doing this, but it kind of doesn’t feel like the most natural fit.

heliumtera•29m ago
So chatgpt can read markdown files? I am very confused
simonw•15m ago
ChatGPT has had a full Linux container system available to it for nearly three years now.

OpenAI keep changing their mind on what to call it. I like the original name, "ChatGPT Code Interpreter", but they've also called it "advanced data analysis" at various points.

Claude added the same feature in September this year: https://simonwillison.net/2025/Sep/9/claude-code-interpreter...

In both ChatGPT and Claude you can say things like "use your Python tool to calculate total mortgage payments over a 30 year period for X and Y" and it will write and execute code to do so - but you can also upload files (including CSVs or even SQLite database files) into that container file system and have them write and execute python code to process those in different ways.

Skills are just folders full of markdown files that are saved in that container when it first boots up.
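For reference, the mortgage prompt quoted above comes down to the standard fixed-rate amortization formula. Here is a minimal sketch of the kind of code such a Python tool might write, assuming a fixed annual rate compounded monthly. The function names are illustrative and not taken from ChatGPT or Claude output.

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Fixed monthly payment for a fully amortizing loan."""
    n = years * 12          # number of monthly payments
    r = annual_rate / 12    # monthly interest rate as a fraction
    if r == 0:
        return principal / n  # no interest: repay the principal evenly
    growth = (1 + r) ** n
    return principal * r * growth / (growth - 1)


def total_paid(principal: float, annual_rate: float, years: int) -> float:
    """Total of all payments over the life of the loan."""
    return monthly_payment(principal, annual_rate, years) * years * 12


def remaining_balance(principal: float, annual_rate: float, years: int) -> float:
    """Simulate every payment; the result should be close to zero."""
    payment = monthly_payment(principal, annual_rate, years)
    balance = principal
    for _ in range(years * 12):
        balance = balance * (1 + annual_rate / 12) - payment
    return balance
```

Comparing `total_paid(300_000, 0.06, 30)` with `total_paid(300_000, 0.07, 30)` covers the "X and Y" part of the prompt, and `remaining_balance` is a quick sanity check that the payment really does clear the loan.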
