Ask HN: Why can't Codex/Claude compile app and test that changes worked?

1•aurareturn•1w ago
Right now, both Codex and Claude make changes based on your request, but you still have to test those changes yourself. Why can't Codex and Claude test them too?

For example, when making a change to a website, why can't Claude or Codex compile the changes, open a browser, and test the changes to confirm that they worked?

It seems like the ability to verify is hugely important for autonomy. Yet neither is doing it.

Comments

aurareturn•1w ago
It seems like Cursor can do this?
delaminator•1w ago
What do you mean?

Even Claude for Web can do the full compile edit cycle.

It runs Debian and apt installs stuff on demand and can run anything its Debian has.

I built a Chrome extension so Claude can get the full rendered DOM via a socket for web stuff.

Why do you think it can’t?

Are you perhaps using Claude Chatbot?

Claude Code will run anything you instruct it to.

I have to tell it to stop more than to start!

aurareturn•1w ago
So you had to build a Chrome extension for Claude so it can see its changes?

Can Claude Code open Chrome to test its changes by default?

No, not running chatbot. Running Claude Code CLI/Codex CLI.

delaminator•1w ago
I didn't have to, no. It can run curl and get whatever HTML was served.

But how else would Claude see the live DOM? The only way is through something running in the browser at runtime.

And yes, Google Chrome added an MCP into Chrome for doing it but I use Vivaldi.

When coding manually how would you check your code was working when it came to serving HTML?

Are you telling me your Claude / Codex cannot run curl to look at what was served?

If you need help with something you can ask, no need to be brusque.

aurareturn•1w ago
Why can't it just open the changes in Chrome and check it?
delaminator•1w ago
You mean like this?

    chrome --headless --dump-dom "https://example.com" > dom.html
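
As a rough sketch of how that one-liner could become an automated check (this is not taken from anyone's actual setup in this thread): the Python below shells out with the same `--headless --dump-dom` flags and then looks for an element id in the dumped DOM. The binary name `chrome`, the helper names `dump_dom` and `dom_has_id`, and any id you pass in are assumptions for illustration; on many installs the binary is `google-chrome` or `chromium` instead.

```python
import subprocess
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects every id attribute seen while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def dom_has_id(html, element_id):
    """Return True if the dumped DOM contains an element with this id."""
    collector = IdCollector()
    collector.feed(html)
    return element_id in collector.ids


def dump_dom(url, chrome_binary="chrome", timeout=30):
    """Run headless Chrome and return the rendered DOM as text.

    chrome_binary is an assumption; adjust it for the local install.
    """
    result = subprocess.run(
        [chrome_binary, "--headless", "--dump-dom", url],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if Chrome exits with a non-zero status
    )
    return result.stdout
```

Something like `dom_has_id(dump_dom(url), "some-id")` would then give the agent a yes/no signal. It only inspects the DOM after load; it does not click or otherwise interact with the page.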
aurareturn•5d ago
No. I don't need it to generate the HTML. I need it to render the service, check it, and click on the feature it built to test it.
delaminator•1w ago
I'm back at my desk now, here's something from my current Claude session of it writing and running code.

Are you saying yours cannot do this?

    Nawin.Auth is running on 567. Let me write a quick test to capture exactly what bytes Nawin.Auth sends vs Crabbit (on 5670):

    ● Write(/tmp/compare_auth.py)
       ⎿  Wrote 83 lines to ../../../../tmp/compare_auth.py
            1 #!/usr/bin/env python3
            2 """Compare auth server responses byte-by-byte"""
            3 import socket
            4
            5 def test_auth(host, port, name):
            6     print(f"\n=== Testing {name} on {host}:{port} ===")
            7
            8     # Build AuthTreq request (141 bytes)
            9     # type[1] + authid[28] + authdom[48] + chal[8] + hostid[28] + uid[28]
           10     request = bytearray(141)
          … +73 lines (ctrl+o to expand)

     ● Bash(sudo lsof -i :5670 2>/dev/null | head -3)
       ⎿  COMMAND     PID USER FD   TYPE DEVICE SIZE/OFF NODE NAME
          run-auth- 95646 root 9u  IPv4 139735      0t0  TCP *:5670 (LISTEN)

     ● Bash(python3 /tmp/compare_auth.py 2>&1)
       ⎿ === Testing Nawin.Auth on 127.0.0.1:567 ===
         Request (141 bytes):
           Hex:              01676c656e6461000000000000000000000000000000000000000000006e6177696e00000000000000
         … +53 lines (ctrl+o to expand)

     ● Both send 145 bytes, same format! Differences are only in the random session keys (bytes 64-72 and 136-144). The ticket encryption is identical.
aurareturn•1w ago
It looks like it can execute scripts.

How about browser testing to verify?

theblazehen•1w ago
It can. For Android, I have it dump screenshots and uiautomator XML dumps, and for web I use the Playwright MCP.

I find that a critical step in agentic development is to close the loop for the LLM so it can get direct feedback without needing you to manually handle it.
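
A minimal sketch of what "closing the loop" can look like, assuming the agent is allowed to run shell commands: run each verification step, capture its output, and hand the first failure back as feedback. The function names `run_check` and `verify` and the result shape are hypothetical, not part of Claude Code, Codex, or any MCP server.

```python
import subprocess
import sys


def run_check(command, timeout=60):
    """Run one verification command and summarise the result for the agent."""
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return {"passed": False, "feedback": f"timed out after {timeout}s"}
    except FileNotFoundError as exc:
        # The command itself is missing, which is also useful feedback.
        return {"passed": False, "feedback": str(exc)}
    output = (proc.stdout + proc.stderr).strip()
    return {"passed": proc.returncode == 0, "feedback": output}


def verify(commands):
    """Run checks in order and stop at the first failure."""
    for command in commands:
        result = run_check(command)
        if not result["passed"]:
            return result
    return {"passed": True, "feedback": "all checks passed"}
```

A call might look like `verify([[sys.executable, "-m", "pytest"], ["npm", "run", "build"]])`, where both commands are placeholders for whatever the project actually uses. The point is that the failure text goes straight back to the model instead of through you.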