frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

https://github.com/Momciloo/fun-with-clip-path
2•momciloo•18m ago•0 comments

Show HN: Stacky – certain block game clone

https://www.susmel.com/stacky/
2•Keyframe•22m ago•0 comments

Show HN: A toy compiler I built in high school (runs in browser)

https://vire-lang.web.app
2•xeouz•44m ago•1 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
266•isitcontent•20h ago•33 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
365•vecti•22h ago•166 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
338•eljojo•23h ago•209 comments

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev

https://github.com/sandys/kappal
17•sandGorgon•2d ago•5 comments

Show HN: Nginx-defender – realtime abuse blocking for Nginx

https://github.com/Anipaleja/nginx-defender
3•anipaleja•2h ago•0 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
80•phreda4•19h ago•15 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
94•antves•2d ago•70 comments

Show HN: MCP App to play backgammon with your LLM

https://github.com/sam-mfb/backgammon-mcp
3•sam256•4h ago•1 comments

Show HN: Slack CLI for Agents

https://github.com/stablyai/agent-slack
52•nwparker•1d ago•11 comments

Show HN: BioTradingArena – Benchmark for LLMs to predict biotech stock movements

https://www.biotradingarena.com/hn
27•dchu17•1d ago•12 comments

Show HN: Artifact Keeper – Open-Source Artifactory/Nexus Alternative in Rust

https://github.com/artifact-keeper
153•bsgeraci•1d ago•64 comments

Show HN: ARM64 Android Dev Kit

https://github.com/denuoweb/ARM64-ADK
18•denuoweb•2d ago•2 comments

Show HN: I'm 75, building an OSS Virtual Protest Protocol for digital activism

https://github.com/voice-of-japan/Virtual-Protest-Protocol/blob/main/README.md
7•sakanakana00•5h ago•1 comments

Show HN: I built Divvy to split restaurant bills from a photo

https://divvyai.app/
3•pieterdy•5h ago•1 comments

Show HN: Gigacode – Use OpenCode's UI with Claude Code/Codex/Amp

https://github.com/rivet-dev/sandbox-agent/tree/main/gigacode
19•NathanFlurry•1d ago•9 comments

Show HN: XAPIs.dev – Twitter API Alternative at 90% Lower Cost

https://xapis.dev
3•nmfccodes•2h ago•1 comments

Show HN: I Hacked My Family's Meal Planning with an App

https://mealjar.app
2•melvinzammit•7h ago•0 comments

Show HN: I built a free UCP checker – see if AI agents can find your store

https://ucphub.ai/ucp-store-check/
2•vladeta•8h ago•2 comments

Show HN: Micropolis/SimCity Clone in Emacs Lisp

https://github.com/vkazanov/elcity
173•vkazanov•2d ago•49 comments

Show HN: Daily-updated database of malicious browser extensions

https://github.com/toborrm9/malicious_extension_sentry
14•toborrm9•1d ago•8 comments

Show HN: Compile-Time Vibe Coding

https://github.com/Michael-JB/vibecode
10•michaelchicory•9h ago•3 comments

Show HN: Falcon's Eye (isometric NetHack) running in the browser via WebAssembly

https://rahuljaguste.github.io/Nethack_Falcons_Eye/
6•rahuljaguste•19h ago•1 comments

Show HN: Slop News – HN front page now, but it's all slop

https://dosaygo-studio.github.io/hn-front-page-2035/slop-news
17•keepamovin•10h ago•6 comments

Show HN: Horizons – OSS agent execution engine

https://github.com/synth-laboratories/Horizons
24•JoshPurtell•1d ago•5 comments

Show HN: Local task classifier and dispatcher on RTX 3080

https://github.com/resilientworkflowsentinel/resilient-workflow-sentinel
25•Shubham_Amb•1d ago•2 comments

Show HN: Fitspire – a simple 5-minute workout app for busy people (iOS)

https://apps.apple.com/us/app/fitspire-5-minute-workout/id6758784938
2•devavinoth12•13h ago•0 comments

Show HN: I built a RAG engine to search Singaporean laws

https://github.com/adityaprasad-sudo/Explore-Singapore
4•ambitious_potat•14h ago•4 comments
Open in hackernews

Show HN: An open source access logs analytics script to block bot attacks

https://github.com/tempesta-tech/webshield
37•krizhanovsky•3mo ago
This is a small PoC Python project for web server access logs analyzing to classify and dynamically block bad bots, such as L7 (application-level) DDoS bots, web scrappers and so on.

We'll be happy to gather initial feedback on usability and features, especialy from people having good or bad experience wit bots.

*Requirements*

The analyzer relies on 3 Tempesta FW specific features which you still can get with other HTTP servers or accelerators:

1. JA5 client fingerprinting (https://tempesta-tech.com/knowledge-base/Traffic-Filtering-b...). This is a HTTP and TLS layers fingerprinting, similar to JA4 (https://blog.foxio.io/ja4%2B-network-fingerprinting) and JA3 fingerprints. The last is also available in Envoy (https://www.envoyproxy.io/docs/envoy/latest/api-v3/extension...) or Nginx module (https://github.com/fooinha/nginx-ssl-ja3), so check the documentation for your web server

2. Access logs are directly written to Clickhouse analytics database, which can cunsume large data batches and quickly run analytic queries. For other web proxies beside Tempesta FW, you typically need to build a custom pipeline to load access logs into Clickhouse. Such pipeliens aren't so rare though.

3. Abbility to block web clients by IP or JA5 hashes. IP blocking is probably available in any HTTP proxy.

*How does it work*

This is a daemon, which

1. Learns normal traffic profiles: means and standard deviations for client requests per second, error responses, bytes per second and so on. Also it remembers client IPs and fingerprints.

2. If it sees a spike in z-score (https://en.wikipedia.org/wiki/Standard_score) for traffic characteristics or can be triggered manually. Next, it goes in data model search mode

3. For example, the first model could be top 100 JA5 HTTP hashes, which produce the most error responses per second (typical for password crackers). Or it could be top 1000 IP addresses generating the most requests per second (L7 DDoS). Next, this model is going to be verified

4. The daemon repeats the query, but for some time, long enough history, in the past to see if in the past we saw a hige fraction of clients in both the query results. If yes, then the model is bad and we got to previous step to try another one. If not, then we (likely) has found the representative query.

5. Transfer the IP addresses or JA5 hashes from the query results into the web proxy blocking configuration and reload the proxy configuration (on-the-fly).

Comments

imiric•3mo ago
Thanks for sharing!

The heuristics you use are interesting, but this will likely only be a hindrance to lazy bot creators. TLS fingerprints can be spoofed relatively easily, and most bots rotate their IPs and signals to avoid detection. With ML tools becoming more accessible, it's only a matter of time until bots are able to mimic human traffic well enough, both on the protocol and application level. They probably exist already, even if the cost is prohibitively high for most attackers, but that will go down.

Theoretically, deploying ML-based defenses is the only viable path forward, but even that will become infeasible. As the amount of internet traffic generated by bots surpasses the current ~50%, you can't realistically block half the internet.

So, ultimately, I think allow lists are the only option if we want to have a usable internet for humans. We need a secure and user-friendly way to identify trusted clients, which, unfortunately, is ripe to be exploited by companies and governments. All proposed device attestation and identity services I've seen make me uneasy. This needs to be a standard built into the internet, based on modern open cryptography, and not controlled by a single company or government.

I suppose it already exists with TLS client authentication, but that is highly impractical to deploy. Is there an ACME protocol for clients? ... Huh, Let's Encrypt did support issuing client certs, but they dropped it[1].

[1]: https://news.ycombinator.com/item?id=44018400

krizhanovsky•3mo ago
This is quite insightful, thank you.

This particular project, WebShield, is simple and it didn't take too long to develop. Basically, with this project we're trying to figure out what can be built having fingerprints and traffic characteristics in an analytic database. It's seems easy to make PoCs with these features.

For now, if this tool can stop some dummy bots, we'll be happy. We definitely need more development and more sophisticated algorithms to fight against some paid scrapping proxies.

It's more or less simple to classify DDoS bots because they have clear impact - the system performance degrades. For some bots we also can introduce the target, for the bots and the protection system, e.g. the booked slots for a visa appointments. For some scrappers this is harder.

Another opportunity is to dynamically generate classification features and verify resulting models, build web page transition graphs and so on.

This is a good point about possible blocking of ~50% of the Internet. For DDoS we _mitigate_ an attack, not _block_ it, so probably for bots we should do the same - just rate limit them instead of full blocking.

Technically, we can implement verification of client side certificates, but, yes, the main problem of adoption on the client side.

monster_truck•3mo ago
This is not realistically useful
gpi•3mo ago
Where can I learn more about JA5? John Althouse (What JA stands for) had not published anything about JA5 yet.
krizhanovsky•3mo ago
Hi,

thank you for the reply!

You can read about JA5 at https://tempesta-tech.com/knowledge-base/Traffic-Filtering-b... .

But the thing is that the hashes were just inspired by the work of John Althouse and there is no any relation.

Unfortunately, we didn't realize what "JA" stands for at the time we were designing the feature. We will rename it https://github.com/tempesta-tech/tempesta/issues/2533 .

Sorry for the confusion.

goodthink•3mo ago
My advice to little sites: Basic Auth stops bots dead in their tracks. The credentials have to be discoverable of course, which makes it useless for most sites, but it's not that inconvenient. Once entered, browsers offer to use the same creds on subsequent visits. I have several apps that use 3D assets that are large enough for it to become worrisome when hundreds of bots are requesting them day and night. Not any more, lol.
krizhanovsky•3mo ago
That's a good advice, thank you.

In our approach we do our best to not to affect user experience. E.g. consider an example of a company website with a blog. The company does it's best to engage more audience to their blog, products whatever. I guess quite a part of the audience will be lost due to requirement of authentication on website, which they see first time.

However, for returning, and especially regular, clients I think that is a really simple and good solution.