So I Decided to Build My Own Analytics, This Is How It Went

https://flowsery.com/en
1•tarasshyn•1h ago

Comments

tarasshyn•1h ago
I needed analytics for my side projects. PostHog was overkill for what I wanted (country, origin, UTMs, per-user attribution, entry page, revenue), and its events are immutable, so removing test data means adding manual SQL filters everywhere.

Plausible had no per-user attribution. DataFast looked perfect, so I installed it behind a proxy. Months later the bill hit $40/m. My whole infra is $150/m. I wasn't going to pay close to $500/yr for analytics, but switching meant losing historical data and attribution. So I built it myself.

*Getting the data out.* DataFast has no export option (red flag #1). I wrote a script that paginates through every exposed endpoint and transforms the responses into SQL for my DB.
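A minimal sketch of that kind of export script, not the actual one. It assumes a cursor-paginated endpoint that returns `{"items": [...], "next_cursor": ...}`. The response shape, `fetch_page`, and the table name are all hypothetical, since the post does not describe DataFast's API.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Walk a cursor-paginated endpoint and yield every item once.

    fetch_page is a hypothetical callable wrapping the HTTP request.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break


def to_insert(table: str, row: dict) -> tuple:
    """Turn one API item into a parameterized INSERT.

    Table and column names are interpolated, so they must come from trusted
    code. Values stay as parameters (%s placeholders, as used by psycopg).
    """
    cols = sorted(row)
    placeholders = ", ".join(["%s"] * len(cols))
    sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
    return sql, tuple(row[c] for c in cols)
```

Keeping the values as parameters rather than formatting them into the SQL string avoids quoting bugs when imported fields contain apostrophes or nulls.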

For context: I have a microservices setup (Kafka, Redis, gateway, auth) and a monorepo front-end with shared components. So I just needed the "core" analytics feature.

A weekend in, I had an ugly dashboard, services, a DB, and no tracking. The DataFast data turned out to be broken, with missing values. I connected my read-only DB via MCP, along with the read-only key from my payment processor, and re-attributed everything. Got to ~95% and moved on.

*Backend refactor.* Claude's boilerplate did attribution with direct Postgres calls, one roundtrip per visitor. I built a caching layer: events go to Redis and are flushed to Postgres every ~30s. A distributed Redis lock ensures only one instance flushes at a time (no duplicates, no races). Each flush writes in chunks of 5,000 records per SQL statement to stay under Postgres's limit of 65,535 bind parameters per statement; failed chunks get re-buffered to Redis with up to 5 retries. ClickHouse would solve this too, but Redis scales fine.
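The chunk-and-retry part of that flush can be sketched as follows. This is an illustration under assumptions, not the service's code: an in-memory `deque` stands in for the Redis buffer, `write_chunk` is a hypothetical function that would issue one multi-row INSERT, and the distributed lock (for example a Redis `SET` with `NX` and an expiry) is left out.

```python
from collections import deque
from typing import Callable

CHUNK_SIZE = 5_000   # records per SQL statement, as in the post
MAX_RETRIES = 5      # retries allowed per record, as in the post


def flush(buffer: deque, write_chunk: Callable[[list], None]) -> int:
    """Drain the buffer in chunks and return how many records were written.

    Buffer entries are (record, failed_attempts) pairs. A chunk that fails is
    pushed back with its counter incremented; records that have used up all
    retries are dropped.
    """
    pending = list(buffer)
    buffer.clear()
    written = 0
    for start in range(0, len(pending), CHUNK_SIZE):
        chunk = pending[start:start + CHUNK_SIZE]
        try:
            write_chunk([record for record, _ in chunk])
            written += len(chunk)
        except Exception:
            # Re-buffer for the next flush cycle instead of failing the run.
            buffer.extend(
                (record, attempts + 1)
                for record, attempts in chunk
                if attempts + 1 <= MAX_RETRIES
            )
    return written
```

A failing chunk does not block the chunks after it, so one bad batch costs a retry on the next cycle rather than stalling the whole flush.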

Then extraction. LLMs have no concept of heap limits: the generated code loaded everything into memory and iterated over it. With 100k+ events that kills the server. I rewrote it with pagination and batched queries, plus a pre-aggregated daily rollup table for unfiltered historical queries. The dashboard now feels instant for past date ranges.
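One way to picture the batched read plus daily rollup, as a sketch only. SQLite from the Python standard library stands in for Postgres so the example is self-contained, and the `events` and `daily_rollup` tables and their columns are assumptions, not the real schema.

```python
import sqlite3
from collections import Counter


def iter_events(conn: sqlite3.Connection, batch_size: int = 1000):
    """Yield events in batches using keyset pagination (WHERE id > last_id)."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, day, country FROM events "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


def build_daily_rollup(conn: sqlite3.Connection, batch_size: int = 1000) -> None:
    """Aggregate visits per (day, country) without loading every row at once."""
    counts = Counter()
    for batch in iter_events(conn, batch_size):
        for _id, day, country in batch:
            counts[(day, country)] += 1
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_rollup ("
        "day TEXT, country TEXT, visits INTEGER, PRIMARY KEY (day, country))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO daily_rollup VALUES (?, ?, ?)",
        [(day, country, n) for (day, country), n in counts.items()],
    )
```

Only one batch of rows is held in memory at a time, and unfiltered historical queries can then read the small rollup table instead of scanning raw events.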

*Front-end.* DataFast's filter system is unusable, so I ported PostHog's pattern. Their rate limit is 20 concurrent requests, a single day view fires about 20, and moving back a day doesn't abort the prior requests, so 3 days back = 60 in flight = rate limited. No abort signal handling in a prod app in 2025 (red flag #2). I batched my FE down to 5 requests with proper aborts on filter changes.
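The "abort the previous request when the filter changes" idea, sketched in Python for consistency with the other examples. In a browser this would normally be done with `AbortController`; here asyncio task cancellation plays that role, and `LatestOnly` is a hypothetical helper name.

```python
import asyncio


class LatestOnly:
    """Run one request at a time; starting a new one cancels the old one."""

    def __init__(self):
        self._task = None

    async def run(self, coro):
        # Cancel whatever is still in flight from the previous filter state.
        if self._task is not None and not self._task.done():
            self._task.cancel()
        task = asyncio.ensure_future(coro)
        self._task = task
        try:
            return await task
        except asyncio.CancelledError:
            # A newer request superseded this one; report "no result".
            return None
```

With this pattern, stepping back three days leaves one request in flight instead of three sets of them, because each step cancels the one before it.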

*Bot protection - this is where it got bad.* Running my tracker side-by-side with DataFast, I had 30-50% fewer attributions. Added Arcjet, hit 100k bot requests in days, disabled it before it bankrupted me.

DataFast has zero bot protection (red flag #3). Datacenter IPs - passed. Null user-agent - passed. 10x10000 resolution - welcome aboard. I read Arcjet's posts and got to 96% bot blockage. What worked: filter obvious bot user-agents and impossible display sizes, use a MaxMind DB to block datacenter IPs (I blocked my own infra and got 0 attributions, oops), and pass the real client IP through Cloudflare to my Fly backend.
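A rough sketch of such hard filters. Everything specific here is an assumption for illustration: the user-agent pattern, the display-size thresholds, and the `DATACENTER_NETS` list, which uses a documentation-only address range as a placeholder for what a MaxMind lookup would provide.

```python
import ipaddress
import re
from typing import Optional

# Illustrative pattern for self-identifying bots and HTTP clients.
BOT_UA = re.compile(r"bot|crawl|spider|headless|curl|python-requests", re.I)

# Placeholder range (TEST-NET-3); a real setup would consult a GeoIP database.
DATACENTER_NETS = [ipaddress.ip_network("203.0.113.0/24")]


def is_hard_blocked(user_agent: Optional[str], width: int, height: int, ip: str) -> bool:
    """Return True for requests that should never reach the database."""
    if not user_agent or BOT_UA.search(user_agent):
        return True
    # Impossible displays, e.g. 10x10000: tiny on one side or absurdly large.
    if min(width, height) < 100 or max(width, height) > 8192:
        return True
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DATACENTER_NETS)
```

The datacenter check only means something if `ip` is the visitor's real address, which is why the client IP has to be passed through the proxy rather than taken from the connection.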

While doing this I checked how DataFast handles IPs and... they don't (red flag #4). Maybe my misconfig, but their docs don't say. Either way, all my tracked users were attributed to the nearest Cloudflare CDN. I apparently take regular trips to Germany from Poland. Most of my DataFast tracking was garbage.

Added behavioral signals - bounces, no engagement + weird screens, weird browser versions - dozens of params combined into a per-session bot score with a "probably bots" toggle. Hard-filtered cases never hit the DB.

The bot scorer is import-aware: DataFast never tracked scroll depth, engagement, or interactions, so imported sessions have zero behavioral data. The scorer detects this and uses a fingerprint-only algorithm instead of penalizing them for data they never had.

Backend stress-tested (died, bumped RAM). Front-end looking good.

*The savings:* new microservice $25/m. So $39 - $25 = $14/m saved. Took about a month, on and off. Truly genius idea, replace every SaaS and never look back.

Link if curious: https://flowsery.com/

Open-Sourcing SEC Edgar on Hugging Face

https://twitter.com/TeraflopAI/status/2044430993549832615
1•EnricoShippole•2m ago•1 comments

40% Increased Throughput 16.8% Less Energy for AI (Verified via ZKP)

https://github.com/BerzeShift/Berze-Shift
1•BerzeShift•3m ago•1 comments

Democracy Policy Under Obama [pdf]

https://obamaforillinois.s3.amazonaws.com/static/files/Democracy_Under_Obama_Executive_Summary.pdf
1•prepostseo•4m ago•1 comments

Show HN: Lazy-HN, a faster Hacker News front end you probably don't need

https://hn.tin-sever.de/
1•tin7•5m ago•0 comments

Rest of the World Annual Report 2025

https://restofworld.org/annual-report/2025/
1•hunglee2•5m ago•0 comments

Snap's Crucible Moment

https://sources.news/p/snap-crucible-moment
1•gmays•5m ago•0 comments

Show HN: Evo – parallel autoresearch experiments for Claude Code and Codex

https://github.com/evo-hq/evo
2•abtom•6m ago•0 comments

Cal.com is going closed source

https://cal.com/blog/cal-com-goes-closed-source-why
5•Benjamin_Dobell•6m ago•3 comments

Richard Dawkins, let's not bring back Neanderthals

https://unherd.com/newsroom/no-richard-dawkins-lets-not-bring-back-neanderthals/
1•voxleone•6m ago•0 comments

Ask HN: Which LLM model and agentic CLI are you using for local development?

1•alfiedotwtf•7m ago•0 comments

The Malleable Computer

https://world.hey.com/dhh/the-malleable-computer-7c187a9b
1•Tomte•7m ago•0 comments

I built a calculator site that doesn't look like garbage

https://www.calculatoris.dev
1•danzxc•8m ago•1 comments

We're only seeing the tip of the chip-smuggling iceberg

https://cyberscoop.com/ai-chip-smuggling-china-export-controls-enforcement-op-ed/
2•lschueller•11m ago•0 comments

Meta creating AI version of Mark Zuckerberg so staff can talk to the boss

https://www.theguardian.com/technology/2026/apr/13/meta-ai-mark-zuckerberg-staff-talk-to-the-boss
3•gmays•12m ago•0 comments

The best way to advertise a programming language

https://www.stylewarning.com/posts/write-programs/
1•cottonseed•13m ago•0 comments

Cybersecurity Looks Like Proof of Work Now

https://www.dbreunig.com/2026/04/14/cybersecurity-is-proof-of-work-now.html
1•brie22•13m ago•0 comments

Show HN: A semantic flow tool for embeddings

https://github.com/Pixedar/TraceScope
1•pixedar•13m ago•0 comments

Allbirds shares surge over 430% as footwear firm trades shoes for AI business

https://www.euronews.com/business/2026/04/15/allbirds-shares-surge-over-430-as-footwear-firm-trad...
2•gouthamve•13m ago•1 comments

I built my first AI agent (and what I got wrong)

https://thoughts.jock.pl/p/how-to-build-your-first-ai-agent-beginners-guide-2026
3•joozio•16m ago•0 comments

I'm curating a digital library of lindy books

https://www.thelindylibrary.com/
1•juansuero•16m ago•1 comments

Show HN: Cachefetch – Fast CLI tool that shows cache file sizes

https://github.com/ErenayDev/cachefetch
1•Erenay09•16m ago•0 comments

Unreal Engine C++ compilation for Windows under Linux with Wine

https://tensorworks.com.au/blog/unreal-engine-cpp-compilation-for-windows-under-wine/
2•mariuz•17m ago•0 comments

WhatDoTheyMake, Anonymous Salary Sharing

https://whatdotheymake.com/
1•jabsters•18m ago•2 comments

Show HN: Aegis – 85ns Sovereign Infrastructure Running on $100 Android Hardware

1•Aegis_Labs•18m ago•1 comments

No one's sure if synthetic mirror life will kill us all

https://www.technologyreview.com/2026/04/15/1135197/synthetic-mirror-life-microbes-kill-us-all/
1•Brajeshwar•18m ago•0 comments

Mathematics Isn't Unreasonably Effective

https://itsiweinstock.substack.com/p/mathematics-isnt-unreasonably-effective
2•ItsiW•21m ago•0 comments

Show HN: I built on-device TTS app because I run out of audiobooks on a flight

https://loudreader.io
2•mowmiatlas•22m ago•1 comments

Technical debt is dead, the metaphor is broken

https://p-322.com/notes/technical-debt-metaphor-is-broken/en/
3•jauco•22m ago•0 comments

Show HN: DeepFake Detector Flags Swalwell Video as Fake

https://graomelo.github.io/
1•IzhaqBlues•22m ago•0 comments

Show HN: Avec – iOS email app that lets you handle your Gmail inbox in seconds

https://apps.apple.com/us/app/avec-email-app-for-gmail/id6742199038
3•jnnnthnn•24m ago•0 comments