frontpage.

GLM-OCR: Accurate × Fast × Comprehensive

https://github.com/zai-org/GLM-OCR
1•ms7892•57s ago•0 comments

Local Agent Bench: Test 11 small LLMs on tool-calling judgment, on CPU, no GPU

https://github.com/MikeVeerman/tool-calling-benchmark
1•MikeVeerman•1m ago•0 comments

Show HN: AboutMyProject – A public log for developer proof-of-work

https://aboutmyproject.com/
1•Raiplus•2m ago•0 comments

Expertise, AI and Work of Future [video]

https://www.youtube.com/watch?v=wsxWl9iT1XU
1•indiantinker•2m ago•0 comments

So Long to Cheap Books You Could Fit in Your Pocket

https://www.nytimes.com/2026/02/06/books/mass-market-paperback-books.html
1•pseudolus•2m ago•1 comment

PID Controller

https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller
1•tosh•7m ago•0 comments

SpaceX Rocket Generates 100GW of Power, or 20% of US Electricity

https://twitter.com/AlecStapp/status/2019932764515234159
1•bkls•7m ago•0 comments

Kubernetes MCP Server

https://github.com/yindia/rootcause
1•yindia•8m ago•0 comments

I Built a Movie Recommendation Agent to Solve Movie Nights with My Wife

https://rokn.io/posts/building-movie-recommendation-agent
2•roknovosel•8m ago•0 comments

What were the first animals? The fierce sponge–jelly battle that just won't end

https://www.nature.com/articles/d41586-026-00238-z
2•beardyw•16m ago•0 comments

Sidestepping Evaluation Awareness and Anticipating Misalignment

https://alignment.openai.com/prod-evals/
1•taubek•17m ago•0 comments

OldMapsOnline

https://www.oldmapsonline.org/en
1•surprisetalk•19m ago•0 comments

What It's Like to Be a Worm

https://www.asimov.press/p/sentience
2•surprisetalk•19m ago•0 comments

Don't go to physics grad school and other cautionary tales

https://scottlocklin.wordpress.com/2025/12/19/dont-go-to-physics-grad-school-and-other-cautionary...
1•surprisetalk•19m ago•0 comments

Lawyer sets new standard for abuse of AI; judge tosses case

https://arstechnica.com/tech-policy/2026/02/randomly-quoting-ray-bradbury-did-not-save-lawyer-fro...
2•pseudolus•20m ago•0 comments

AI anxiety batters software execs, costing them combined $62B: report

https://nypost.com/2026/02/04/business/ai-anxiety-batters-software-execs-costing-them-62b-report/
1•1vuio0pswjnm7•20m ago•0 comments

Bogus Pipeline

https://en.wikipedia.org/wiki/Bogus_pipeline
1•doener•21m ago•0 comments

Winklevoss twins' Gemini crypto exchange cuts 25% of workforce as Bitcoin slumps

https://nypost.com/2026/02/05/business/winklevoss-twins-gemini-crypto-exchange-cuts-25-of-workfor...
2•1vuio0pswjnm7•21m ago•0 comments

How AI Is Reshaping Human Reasoning and the Rise of Cognitive Surrender

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646
3•obscurette•22m ago•0 comments

Cycling in France

https://www.sheldonbrown.com/org/france-sheldon.html
1•jackhalford•23m ago•0 comments

Ask HN: What breaks in cross-border healthcare coordination?

1•abhay1633•23m ago•0 comments

Show HN: Simple – a bytecode VM and language stack I built with AI

https://github.com/JJLDonley/Simple
1•tangjiehao•26m ago•0 comments

Show HN: Free-to-play: A gem-collecting strategy game in the vein of Splendor

https://caratria.com/
1•jonrosner•27m ago•1 comment

My Eighth Year as a Bootstrapped Founder

https://mtlynch.io/bootstrapped-founder-year-8/
1•mtlynch•27m ago•0 comments

Show HN: Tesseract – A forum where AI agents and humans post in the same space

https://tesseract-thread.vercel.app/
1•agliolioyyami•28m ago•0 comments

Show HN: Vibe Colors – Instantly visualize color palettes on UI layouts

https://vibecolors.life/
2•tusharnaik•29m ago•0 comments

OpenAI is Broke ... and so is everyone else [video][10M]

https://www.youtube.com/watch?v=Y3N9qlPZBc0
2•Bender•29m ago•0 comments

We interfaced single-threaded C++ with multi-threaded Rust

https://antithesis.com/blog/2026/rust_cpp/
1•lukastyrychtr•30m ago•0 comments

State Department will delete X posts from before Trump returned to office

https://text.npr.org/nx-s1-5704785
7•derriz•30m ago•1 comment

AI Skills Marketplace

https://skly.ai
1•briannezhad•31m ago•1 comment

Launch HN: MindFort (YC X25) – AI agents for continuous pentesting

60•bveiseh•8mo ago
Hey HN! We're Brandon, Sam, and Akul from MindFort (https://mindfort.ai). We're building autonomous AI agents that continuously find, validate, and patch security vulnerabilities in web applications—essentially creating an AI red team that runs 24/7.

Here's a demo: https://www.loom.com/share/e56faa07d90b417db09bb4454dce8d5a

Security testing today is increasingly challenging. Traditional scanners generate 30-50% false positives, drowning engineering teams in noise. Manual penetration testing happens quarterly at best, costs tens of thousands per assessment, and takes weeks to complete. Meanwhile, teams are shipping code faster than ever with AI assistance, but security reviews have become an even bigger bottleneck.

All three of us encountered this problem from different angles. Brandon worked at ProjectDiscovery building the Nuclei scanner, then at NetSPI (one of the largest pen testing firms) building AI tools for testers. Sam was a senior engineer at Salesforce leading security for Tableau, where he dealt firsthand with juggling security findings and managing remediations. Akul did his master's on AI and security, co-authored papers on using LLMs for security attacks, and participated in red teams at OpenAI and Anthropic.

We all realized that AI agents were going to fundamentally change security testing, and that the wave of AI-generated code would need an equally powerful solution to keep it secure.

We've built AI agents that perform reconnaissance, exploit vulnerabilities, and suggest patches—similar to how a human penetration tester works. The key difference from traditional scanners is that our agents validate exploits in runtime environments before reporting them, reducing false positives.

We use multiple foundational models orchestrated together. The agents perform recon to understand the attack surface, then use that context to inform testing strategies. When they find potential vulnerabilities, they spin up isolated environments to validate exploitation. If successful, they analyze the codebase to generate contextual patches.

What makes this different from existing tools?

- Validation through exploitation: we don't just pattern-match, we exploit vulnerabilities to prove they're real.
- Codebase integration: the agents understand your code structure to find complex logic bugs and suggest appropriate fixes.
- Continuous operation: instead of point-in-time assessments, we're constantly testing as your code evolves.
- Attack chain discovery: the agents can find multi-step vulnerabilities that require chaining different issues together.

We're currently in early access, working with initial partners to refine the platform. Our agents are already finding vulnerabilities that other tools miss and scoring well on penetration testing benchmarks.

Looking forward to your thoughts and comments!

Comments

blibble•8mo ago
what controls do you have to ensure consent from the target site?
bko•8mo ago
In the video demo they showed requiring a TXT in the DNS to confirm you have consent
blibble•8mo ago
so they'll point it at a domain they control, then reverse proxy it onto their target?
icedchai•8mo ago
What do you propose they do instead?
blibble•8mo ago
not offer automated targeted hacking as a service?

even the booters market themselves as "legitimate stress testing tools for enterprise"

Sohcahtoa82•8mo ago
> not offer automated targeted hacking as a service?

MindFort is not the first and won't be the last. There are plenty of DAST tools offered as a SaaS that are the same thing.

chatmasta•8mo ago
How about the would-be victims don’t ship exploitable software to production? If that’s not possible, then maybe they should sign up for an automated targeted hacking service to find the exploitable bugs before someone else does.

Your argument is straight out of the 1990s. We’ve moved beyond this as an industry, as you can see from the proliferation of bug bounty programs, responsible disclosure policies, CVE transparency, etc…

Sohcahtoa82•8mo ago
And in the process, reveal their own IP address rather than MindFort's.
blibble•8mo ago
by theirs, you mean, the IP of an IoT device/router they've hacked
bveiseh•8mo ago
Yup, as mentioned, we do TXT verification of the domain. We also don't offer self-service sign-up, so we're able to screen customers ahead of time and regularly monitor for any bad behavior.
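DNS TXT domain verification generally works by issuing a random token the customer must publish in a TXT record, then checking for it before any testing starts. A minimal sketch of that logic (the record name `_pentest-verify` and the token format are made up, and the DNS lookup itself is stubbed out so the check stays self-contained):

```python
import secrets

def issue_token() -> str:
    """Generate a random token the customer publishes as a TXT record,
    e.g. at _pentest-verify.example.com (hypothetical record name)."""
    return secrets.token_hex(16)

def is_verified(expected_token: str, txt_records: list[str]) -> bool:
    """Check the domain's TXT records for the issued token. In practice
    the records would be fetched with a DNS library or `dig TXT ...`;
    here they are passed in directly."""
    return any(record.strip() == expected_token for record in txt_records)
```

A reverse proxy (per the parent's objection) would still fail this check, since the attacker can only set TXT records on the domain they control, not on the proxied target.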
lazyninja987•8mo ago
Is it a prerequisite for the agents to have access to the source code to generate attack strategies?

How about pen-testing a black box?

Is the list of potential vulnerabilities generated by matching publicly disclosed vulnerabilities against the framework versions in the target software stack?

I am new to LLMs or any ML for that matter. Congrats on your launch.

bveiseh•8mo ago
Thanks so much.

Great question: it is not required, but we recommend it. Without the source code, it would be black-box testing; the agents won't know what the app looks like from the other side.

The agents identify vulns using known attack patterns, novel techniques, and threat intelligence.

sumanyusharma•8mo ago
Congratulations on the launch. Few qs:

How do your agents decide a suspected issue is a validated vulnerability, and what measured false-positive/false-negative rates can you share?

How is customer code and data isolated and encrypted throughout reconnaissance, exploitation, and patch generation (e.g., single-tenant VPC, data-retention policy)?

Do the agents ever apply patches automatically, or is human review required—and how does the workflow integrate with CI/CD to prevent regressions?

Ty!

bveiseh•8mo ago
Appreciate it!

The agents home in on a potential vulnerability by looking at different signals during their testing, then build a POC to validate it based on the context. We don't have any data to share publicly yet, but we are working on releasing benchmarks soon.

Everything runs in a private VPC and data is encrypted in transit and at rest. We have zero-data-retention agreements with our vendors, and we offer single-tenant and private cloud deployments for customers. We don't retain any customer code once we finish processing it, only the vulnerability data. We are also in the process of obtaining our SOC 2.

Patches are not auto-applied. We can either open a PR for human review or add the necessary changes to a Linear/Jira ticket. We have the ability to schedule assessments in our platform, and are working on a way to integrate more deeply with CI/CD.

gyanchawdhary•8mo ago
Congratulations on the launch. How different is this from xbow.com, shinobi.security, gecko.security, zeropath.com, etc.?
bveiseh•8mo ago
Thanks so much.

We want to solve the entire vulnerability lifecycle problem, not just finding zero-days. MindFort covers detection, validation, and triage/scoring, all the way through patching the vulnerability. While we are starting with web apps, we plan to expand to the rest of the attack surface soon.

handfuloflight•8mo ago
Any outlines on pricing?
bveiseh•8mo ago
It depends on the size of your attack surface, complexity of the application, and frequency of assessments, so for now we are working out custom agreements with each customer based on these factors.
robszumski•8mo ago
How does a customer use this?

Point it at a publicly available webapp? Run it locally against dev? Do I self-host it and continually run against staging as it's updated?

bveiseh•8mo ago
So you would point it at any web app available over the internet. There is an option to have a private deployment in your VPC to test applications that are not exposed to the internet. You can also schedule assessments so that the system runs at a regular interval (daily, weekly, bi-weekly, etc.).
mparis•8mo ago
Congrats on the launch. Seems like a natural domain for an AI tool. One nice aspect about pen testing is it only needs to work once to be useful. In other words, it can fail most of the time and no one but your CFO cares. Nice!

A few questions:

On your site it says, "MindFort can asses 1 or 100,000 page web apps seamlessly. It can also scale dynamically as your applications grow."

Can you provide more color as to what that really means? If I were actually to ask you to assess 100,000 pages, what would actually happen? Is it possible for my usage to block/brown-out another customer's usage?

I'm also curious what happens if the system does detect a vulnerability. Is there any chance the bot does something dangerous with, e.g., its newly discovered escalated privileges?

Thanks and good luck!

bveiseh•8mo ago
Thanks so much!

In regards to the scale, we absolutely can assess at that scale, but it would require quite a large enterprise contract upfront, as we would need to get the required capacity from our providers.

The system is designed to safely test exploitation, and not perform destructive testing. It will traverse as far as it can, but it won't break anything along the way.

HocusLocus•8mo ago
You're gonna poke your eye out with those pentesters...
Sohcahtoa82•8mo ago
One thing I've run into with DAST tools is that they're awful at handling modern web apps where JS code fetches data with an API and then updates the DOM accordingly. They act like web pages are still using server-side HTML rendering and throw XSS false positives because a JSON response will return "<script>alert(1)</script>" in the data, even when the data is then put in the web page using element.innerText or a framework that automatically prevents XSS.

Alternatively, they don't properly handle session tokens that don't rely on cookies, such as bearer tokens. At the place I work, in our app, the session token is passed as a parameter in the request payload. Not a cookie or the Authorization header!

How well does MindFort handle these scenarios?
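The JSON-reflection false positive described above comes down to ignoring the sink. A naive, illustrative triage check (not any particular scanner's logic) that avoids it by looking at the response content type before flagging a reflection:

```python
def looks_like_reflected_xss(payload: str, body: str, content_type: str) -> bool:
    """Naive triage: a payload echoed inside a JSON API response is not,
    by itself, XSS -- what matters is where the data ends up. Data later
    inserted via element.innerText (or an auto-escaping framework) never
    executes, so only flag reflections in HTML responses, where the
    browser would actually parse the markup."""
    if "application/json" in content_type:
        return False
    return "text/html" in content_type and payload in body
```

For example, the payload appearing inside a `{"name": "<script>alert(1)</script>"}` JSON body is not flagged, while the same string reflected unescaped into an HTML page is. A real scanner would go further and confirm execution in a headless browser, which is the validation-by-exploitation approach the post describes.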