
MCP Run Python

https://github.com/pydantic/pydantic-ai/tree/main/mcp-run-python
173•xrd•10mo ago

Comments

turnsout•10mo ago
Woof, use with care
mountainriver•10mo ago
Cool!
evacchi•10mo ago
cool!! you might also want to check out https://www.mcp.run/dylibso/eval-py

It's open source too :) https://github.com/dylibso/mcp.run-servlets/tree/main/servle...

We also use Wasm to sandbox all our servlets https://docs.mcp.run/blog/2025/04/07/mcp-run-security

(I work at Dylibso)

behnamoh•10mo ago
So their method of sandboxing Python code is to spin up a JS runtime (deno), run Pyodide on it, and then run the Python code in Pyodide.

Seems a lot of work to me. Is this really the best way to create and run Python sandboxes?

kissgyorgy•10mo ago
Not at all.
jononor•10mo ago
What is the best way? Or at least, a better way?
babush•10mo ago
I recall Shopify having a seccomp-based jail to run untrusted ruby code. But their use-case was very limited so they can get away with blocking almost every syscall.

Other than that... VMs? The fact that people consider JS/WASM engines good security sandboxes is a bit scary tbf.

simonw•10mo ago
I trust a WASM sandbox a whole lot more than I trust a Docker container sandbox.

WASM engines run in almost every browser on earth, billions of times a day. Security problems in those get spotted very quickly.

babush•10mo ago
It's a bit hard to do comparisons without going into threat models and all that _fun_ stuff :shrug:

For example, JS runs in almost every browser on earth too, yet it took V8 devs 2 years to find out that `Math.expm1()` could return -0.0 (https://chromium.googlesource.com/v8/v8.git/+/56f7dda67fdc97...). This is a cherry-picked example, and JS is clearly more complex than WASM, but still.

Just because stuff runs on a lot of devices doesn't mean it's more or less secure.

Linux runs on quite a few devices too, yet we still find bugs, people still don't ship updates to said bugs, yadda yadda yadda.

My point is just that lots of devs often skip the threat modeling and just think "I'll slap it in a WASM thingie and it'll be fine". Well good luck.

kissgyorgy•10mo ago
Landlock, cgroups on Linux
ehsanu1•10mo ago
gVisor
pansa2•10mo ago
It might be. CPython doesn't support sandboxing Python code, so the only option is to run the whole interpreter within a sandbox.
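
To illustrate why in-process restriction alone falls short, here is a minimal sketch (not taken from the project, and the names `untrusted` and `env` are made up for the example). Even with `__builtins__` emptied, code passed to `exec` can walk from a tuple literal up to `object` and enumerate every loaded class:

```python
# Hypothetical demo: exec() with empty builtins is not a sandbox.
untrusted = "found = ().__class__.__base__.__subclasses__()"

env = {"__builtins__": {}}   # no open(), no __import__, no print()
exec(untrusted, env)

# The snippet still reached `object` and listed its direct subclasses.
classes = env["found"]
print(type(classes).__name__, len(classes) > 0)
```

From that list an attacker can usually find a class that leads back to file or process access, which is why the whole interpreter ends up inside a sandbox instead.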
anentropic•10mo ago
It's what ChatGPT does apparently...

https://simonwillison.net/2024/Dec/10/chatgpt-canvas/

simonw•10mo ago
Not exactly - ChatGPT has two ways it can run Python code. It can use Pyodide and run it directly in the user's browser (for Canvas), and it can also run Python code on one of their servers in a Jupyter environment in a locked-down Kubernetes container (their "Code Interpreter" tool).

To my knowledge they don't yet have a run-Python-in-WASM-on-the-server implementation.

jamestimmins•10mo ago
What’s the purpose of Jupyter here? Isn’t that optimized for notebooks, which presumably wouldn’t be relevant on the server?
simonw•10mo ago
I think it's more about tapping into the Jupyter ecosystem of visualization libraries etc, plus the fact that there's lots of data analyst examples in the training data that come from notebooks.
jamestimmins•10mo ago
That's an interesting dynamic of the training data impacting the architecture. I wonder if this is a one-off or we see that in other areas as well.
fzzzy•10mo ago
I think this is inevitable. Whatever is most highly represented (correctly) will become even more dominant.
__mharrison__•10mo ago
So that's why it writes such bad pandas code...
pseudosavant•10mo ago
If there is a WASM build of the project, that is going to be the easiest and safest way to run that with untrusted user content. And Deno happens to be really good at hosting WASM itself. So, these are the two easiest tools to do this with.

I was looking into using WASM in Python yesterday for some image processing. It requires pulling in a full WASM runtime like wasmtime. Still better than calling out to native binaries like ImageMagick, but definitely more complicated than doing it in Deno. If I was writing it myself I'd do Deno, but LLMs are so good at writing Python.

kmangutov•10mo ago
Interesting to understand what is possible in this Deno/Pyodide environment. For example sklearn works despite being quite an involved dependency [1]. Another side to this is data input/output, which seems possible with a low level interface [2]. Very exciting that (a simple) end-to-end ML experience is now possible in the modern browser.

[1] https://www.erp5.com/NXD-Blog.Scipy.and.Scikit.Learn.Compile...

[2] https://donatstudios.com/Read-User-Files-With-Go-WASM

achierius•10mo ago
Definitely not the safest: the safest way would be to spin up another VM. The hardware-level virtualization guarantees are much stronger than what any JS runtime could provide
ridruejo•10mo ago
It’s one of the best ways, at least on the sandboxing front. Hard to beat Wasm at that
simonw•10mo ago
I've been trying to find a good option for this for ages. The Deno/Pyodide one is genuinely one of the top contenders: https://til.simonwillison.net/deno/pyodide-sandbox

I'm hoping some day to find a recipe I really like for running Python code in a WASM container directly inside Python. Here's the closest I've got, using wasmtime: https://til.simonwillison.net/webassembly/python-in-a-wasm-s...

singularity2001•10mo ago
one wasmtime dependency and a self contained python file with 100 loc seems reasonable!

much better than calling deno, at least if you have no pip dependencies...

just had to update to new api:

    # store.add_fuel(fuel)
    store.set_fuel(fuel)
    fuel_consumed = fuel - store.get_fuel()

and it works!!

time to hello world: hello_wasm_python311.py 0.20s user 0.03s system 97% cpu 0.234 total

lopuhin•10mo ago
it's pretty difficult to package native python dependencies for wasmtime or other wasi runtimes, e.g. lxml
Already__Taken•10mo ago
Yeah, if you can't shove numpy in there it's not really useful.
antonvs•10mo ago
I was interested in how this compares in a kind of absolute sense. For comparison, an optimized C hello world program gave these results using `perf` on my Dell XPS 13 laptop:

       0.000636230 seconds time elapsed
       0.000759000 seconds user
       0.000000000 seconds sys
That's 36,800% faster. Hand-written assembly was very slightly slower. Using the standard library for output instead of a syscall brought it down to 20,900% faster.

(Yes I used percentages to underscore how big the difference is. It's 368x and 209x respectively. That's huge.)

Begrudgingly, here are the standard Python numbers:

    real    0m0.019s
    user    0m0.015s
    sys     0m0.004s
About 1230% faster than the sandbox, i.e. 12.3x. About an order of magnitude, which is typical for these kinds of exercises.
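
The ratios above can be reproduced from the quoted timings. This is a small sketch that assumes the baseline is the 0.234 s wall-clock total reported for the wasmtime hello world earlier in the thread; the helper name `speedup` is made up for the example:

```python
# Timings quoted in the thread, in seconds.
sandbox_s = 0.234        # wasmtime Python hello world (wall-clock "total")
c_s = 0.000636230        # optimized C hello world (perf "time elapsed")
cpython_s = 0.019        # standard CPython ("real")

def speedup(slow: float, fast: float) -> float:
    """How many times faster `fast` is than `slow`."""
    return slow / fast

c_ratio = speedup(sandbox_s, c_s)
cpython_ratio = speedup(sandbox_s, cpython_s)
print(f"C vs sandbox: {c_ratio:.0f}x")
print(f"CPython vs sandbox: {cpython_ratio:.1f}x")
```

These are single hello-world runs, so the numbers mostly measure startup cost and not steady-state execution speed.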
singularity2001•10mo ago
haha, 99% is startup time for the sandbox, but yeah, python via wasm is probably still 10-400 times slower than c.
fzzzy•10mo ago
Great, thanks for your post! I got it working too. This is going to be incredibly handy.
abshkbh•10mo ago
https://github.com/abshkbh/arrakis

Will come with MacOS support very soon :) Does work on Linux

Tsarp•10mo ago
I tried this path and found that MacOS has horrible support on firecracker and similar.
abshkbh•10mo ago
Crosvm (our original Google project) and its children projects Firecracker, Cloud-Hypervisor are all based on top of "/dev/kvm" i.e. the Linux Virtualization stack.

Apple's equivalent is the Apple Virtualization Framework which exposes kvm like functionality at a higher level.

Tsarp•10mo ago
At least on macOS, can't sandbox-exec be used, similar to what Codex is doing?
simonw•10mo ago
Yeah, I got excited about that option a while back but was put off by the fact that Apple's (minimal) documentation say sandbox-exec is deprecated.
fzzzy•10mo ago
OpenAI's Codex CLI uses it on macOS. It's in typescript but maybe I'll take a look at what they do and port it to python.

[edit] looks really simple, except I'll have to look into how their raw-exec takes care of writeableRoots: https://github.com/openai/codex/blob/0d6a98f9afa8697e57b9bae...

[edit2] lol raw-exec doesn't do anything at all with writeableRoots, it's handled in the fullPolicy (from scopedWritePolicy)

fzzzy•10mo ago
I cleaned up the output of asking Gemini 2.5 Pro to rewrite it in python, and it seems to work well:

https://gist.github.com/fzzzy/319d6cbbdfff9c340d0e9c362247ae...

3abiton•10mo ago
> I'm hoping some day to find a recipe I really like for running Python code in a WASM container directly inside Python.

But what would be the usecase for this?

simonw•10mo ago
Running Python code from untrusted sources, including code written by LLMs.
3abiton•10mo ago
I see. The way I would approach it is by running a client in a specific Python env on an Incus instance, with the LLM hosted either on the host or on another, separate Incus instance. Lately I've been addicted to sandboxing apps in Incus, specifically for isolated VPN tunnels and automating certain web access.
5rest•10mo ago
The demo looks really appealing. I have a real-world use case in mind: analyzing an Excel file and asking questions about its contents. The current approach (https://github.com/pydantic/pydantic-ai/blob/main/mcp-run-py...) seems limited to running standalone scripts—it doesn't support reading and processing files. Is there an extension or workaround to enable file input and processing?
jjuliano•10mo ago
I am nowhere near as big or as popular as Pydantic, but this is my solution - https://kdeps.com/getting-started/resources/python.html
redleader55•10mo ago
The author states:

> The code is executed using Pyodide in Deno and is therefore isolated from the rest of the operating system.

To me personally, the premise is a bit naive - it assumes that deno's WASM VM doesn't have exploits, that pyodide doesn't have bugs, etc. It might as well ask the LLM to produce javascript code and run it under deno and then it would be simpler.

In the end, the problem is one of risk budget. If you're running this in a VM you control and it's only you running your own prompts on it, maybe it's "good enough". If, on the other hand, you want to sell this service to others who will attack your infrastructure, then no - it's not even close to being enough.

Your question is a bit vague because it doesn't explain what "best way" means for you. Cheap, secure, implementable by a person over a weekend?

fragmede•10mo ago
The answer, I think, is to push running the VM back onto the user, and build on top of Fabrice's JS Linux and run the sandbox on the user's machine. That way at the very worst they can escape and steal their own cookies from the browser process the VM is running on/in.
achierius•10mo ago
> premise is a bit naive - it assumes that deno's WASM VM doesn't have exploits, that pyodide doesn't have bugs,

Eh, I wouldn't call this naive. Two points:

1. Pyodide bugs should not be a huge concern here. As long as your python code is executing on top of a JS runtime, the runtime is what matters first and foremost from a security pov.

2. Yes, it's possible for Deno to have bugs. But frankly: it's much less likely to than most any other method for doing this sort of sandboxing. Deno sits on v8, which is the engine used by Chrome, and there are very few applications in the world which have a closer eye and larger dedicated security budget than Chrome. V8 can have bugs, sure, but I would expect they (along with JSC and maybe SpiderMonkey) will have far fewer than any other runtime for a serious dynamic language on the market today.

Yes, a VM would be better (and frankly, when you're talking about running Python on top of a JS runtime, might not even be less performant), but the reason why is not that they "have fewer bugs".

kodablah•10mo ago
There just aren't good Python sandboxing approaches. There are subinterpreters but they can be slow to start from scratch. There are higher-level sandboxing approaches like microvms, but they have setup overhead and are not easy to use from inside Python.

At Temporal, we required a sandbox but didn't have any security requirement, so we wrote it from scratch with eval/exec and a custom importer [0]. It is not a foolproof sandbox, but it does a good job at isolating state, intercepting and preventing illegal calls we don't like, and allowing some imports to "pass through" the outside instead of being reloaded for performance reasons.

0 - https://github.com/temporalio/sdk-python?tab=readme-ov-file#...
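
A rough sketch of the general idea (exec plus a guarded import hook) might look like the following. This is not Temporal's implementation: `BLOCKED`, `guarded_import` and `run_isolated` are hypothetical names, the deny-list is arbitrary, and nothing here is a security boundary.

```python
import builtins

BLOCKED = {"random", "socket"}   # hypothetical deny-list for illustration

def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Refuse blocked modules; let everything else pass through."""
    root = name.split(".")[0]
    if root in BLOCKED:
        raise ImportError(f"import of {name!r} is not allowed here")
    return builtins.__import__(name, globals, locals, fromlist, level)

def run_isolated(source: str) -> dict:
    """Exec `source` in a fresh namespace with a guarded __import__."""
    safe_builtins = dict(vars(builtins))
    safe_builtins["__import__"] = guarded_import
    namespace = {"__builtins__": safe_builtins, "__name__": "sandboxed"}
    exec(source, namespace)
    return namespace

ok = run_isolated("import math\nvalue = math.sqrt(16)")
print(ok["value"])

try:
    run_isolated("import random")
except ImportError as exc:
    print("blocked:", exc)
```

Each call gets its own namespace, so module-level state does not leak between runs, and the import hook is one place to intercept calls you don't like. Determined code can still escape it.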

achierius•10mo ago
Out of curiosity, why did you need a sandbox if you didn't have any security concerns?
necovek•10mo ago

  > but it does a good job at isolating state, intercepting and preventing illegal calls we don't like
Sounds like they put the reason just there.
kodablah•10mo ago
Sibling quoted the proper part. It's to help people keep code deterministic by helping prevent shared state and prevent non-deterministic standard library calls.
fzzzy•10mo ago
At least we have subinterpreters now. Even if they are slow that is a really good thing.
jacob019•10mo ago
Indeed. What ever happened to user mode linux?
samuel•10mo ago
I spin up a docker container using the docker API. I haven't used gvisor because I don't expect the model to try kernel level exploits. If it were the case, we're already doomed.
m3047•10mo ago
Having watched the repeated immolation of blissful innocence since smart email clients would run whatever smart (OLE? Smart? I'm kidding.) document was delivered, this is going to be so much fun in a trainwreck kind of way.
bigbuppo•10mo ago
I keep seeing this MCP thing and I'm really happy that people are getting into Burroughs mainframes rather than that stupid AI crap.
snoman•10mo ago
That’s a pretty obscure/dated reference to the Master Control Program that ran on Burroughs mainframes.

I suspect the downvotes are for “… stupid AI crap.”

bigbuppo•10mo ago
...and that still runs on Unisys Libra systems, doing the sort of boring but important work that keeps the world running, such as processing unemployment benefits for the people that are going to be laid off of the AI companies once everyone realizes AGI isn't going to happen and GenAI is the new leaded gasoline.
someguy101010•10mo ago
Nice! I'm working on a way to do this for javascript using v8 https://github.com/r33drichards/mcp-js. Right now this works but there is some significant jank.
_pdp_•10mo ago
Bookmarked it. We took another approach which provides more flexibility but at the cost of slower spin up. Basically we use firecracker vm. We mount the attachments and everything else into the vm so that the agent can run tools on them (anything on the os) and we destroy the machine at the very end. It works! It is also as secure as firecracker goes.

But I like using WASM, especially in a hosted environment like Deno. It feels like a more scalable solution and probably less maintenance too, with the downside that we won't be able to run just any cmd.

I am happy to provide more details and point to the tool if anyone is interested. It is not open-source but you can play with it for free.

retinaros•10mo ago
its like u using lambda
singularity2001•10mo ago
Why not Pyodide directly in python?
simonw•10mo ago
I haven't found a supported, documented way to do that yet. I'd love to find one.
Cluelessidoit•10mo ago
Hi, I don’t really know anything honestly, but I do remember an AI running on my laptop using xpip or xpython as a contained environment. I think it’s a single instance. Would that work, or is that close???
yahoozoo•10mo ago
All of these Agent frameworks are already overwhelming. Insert joke about parallels to the JavaScript ecosystem.

What agent framework is truly the top dog? Is it just working with the big model providers native frameworks, such as OpenAI’s Agents SDK?

jamesralph8555•10mo ago
How secure is this? I tried building something similar, but it was taking too long to set up a fully virtualized solution like Kata Containers or Firecracker.
simonw•10mo ago
I hacked around with this a bit and figured out a way to get it to spit out logging of the prompts and responses to the server: https://gist.github.com/simonw/54fc42ef9a7fb8f777162bbbfbba4...

Short-ish version:

    ANTHROPIC_API_KEY="$(llm keys get anthropic)" \
    uv run --with devtools --with pydantic-ai python -c '
    import asyncio
    from devtools import pprint
    from pydantic_ai import Agent, capture_run_messages
    from pydantic_ai.mcp import MCPServerStdio

    server = MCPServerStdio(
        "deno",
        args=[
            "run",
            "-N",
            "-R=node_modules",
            "-W=node_modules",
            "--node-modules-dir=auto",
            "jsr:@pydantic/mcp-run-python",
            "stdio",
        ],
    )

    agent = Agent("claude-3-5-haiku-latest", mcp_servers=[server])

    async def main():
        with capture_run_messages() as messages:
            async with agent.run_mcp_servers():
                result = await agent.run("How many days between 2000-01-01 and 2025-03-18?")
        pprint(messages)
        print(result.output)

    asyncio.run(main())'
Output here: https://gist.github.com/simonw/54fc42ef9a7fb8f777162bbbfbba4...

I got it running against Mistral Small 3.1 running locally too - notes on that here: https://simonwillison.net/2025/Apr/18/mcp-run-python/
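
For reference, the question in that prompt reduces to a few lines of standard-library Python, roughly the kind of snippet a model might hand to the sandbox. This is an illustrative guess, not the code captured in the gist:

```python
from datetime import date

# Dates taken from the prompt in the example above.
start = date(2000, 1, 1)
end = date(2025, 3, 18)

# Subtracting two dates yields a timedelta; .days is the whole-day count.
days_between = (end - start).days
print(days_between)  # 9208
```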

neuroelectron•10mo ago
Crap but it's mcp so being good isn't the point anyway