Show HN: I built a social media management tool in 3 weeks with Claude and Codex

https://github.com/brightbeanxyz/brightbean-studio

63•JanSchu•2h ago

Comments

JanSchu•2h ago

I wanted to test how far AI coding tools could take a production project. Not a prototype. A social media management platform with 12 first-party API integrations, multi-tenant auth, encrypted credential storage, background job processing, approval workflows, and a unified inbox. The scope would normally keep a solo developer busy for the better part of a year. I shipped it in 3 weeks.

Before writing any code, I spent time on detailed specs, an architecture doc, and a style guide. All public: https://github.com/brightbeanxyz/brightbean-studio/tree/main...

I split the specs into two groups: tasks that could run in parallel across multiple agents, and tasks with dependencies that had to merge first. This planning step was the whole game. Without it, the agents produce a mess.

I used Opus 4.6 (Claude Code) for planning and building the first pass of backend and UI. Opus holds large context better and makes architectural decisions across files more reliably. Then I used Codex 5.3 to challenge every implementation, surface security issues, and catch bugs. Token spend was roughly even between the two.

Where AI coding worked well: Django models, views, serializers, standard CRUD. Provider modules for well-documented APIs like Facebook and LinkedIn. Tailwind layouts and HTMX interactions. Test generation. Cross-file refactoring, where Opus was particularly good at cascading changes across models, views, and templates when I restructured the permission system.

Where it fell apart: TikTok's Content Posting API has poor docs and an unusual two-step upload flow. Both tools generated wrong code confidently, over and over. Multi-tenant permission logic produced code that worked for a single workspace but leaked data across tenants in multi-workspace setups. These bugs passed tests, which is what made them dangerous. OAuth edge cases like token refresh, revoked permissions, and platform-specific error codes all needed manual work. Happy path was fine, defensive code was not. Background task orchestration (retry logic, rate-limit backoff, error handling) also required writing by hand.
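A minimal sketch of how a cross-tenant leak like the one described above can pass tests. It is plain Python rather than Django, and every name in it (`Post`, `list_posts_unscoped`, `list_posts_scoped`) is hypothetical and not taken from the repository. The point is only that a test with a single workspace cannot tell a scoped query from an unscoped one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    workspace_id: int
    text: str


# Hypothetical in-memory stand-in for a posts table shared by two tenants.
POSTS = [
    Post(workspace_id=1, text="launch announcement"),
    Post(workspace_id=2, text="confidential client draft"),
]


def list_posts_unscoped(posts, workspace_id):
    """Buggy: ignores workspace_id and returns every row it is given."""
    return list(posts)


def list_posts_scoped(posts, workspace_id):
    """Filters on the tenant key before returning anything."""
    return [p for p in posts if p.workspace_id == workspace_id]


# Single-workspace fixture: both functions agree, so this check passes
# even though one of them is wrong.
single = [p for p in POSTS if p.workspace_id == 1]
assert list_posts_unscoped(single, 1) == list_posts_scoped(single, 1)

# Two-workspace fixture: only the scoped version keeps workspace 2 private.
leaked = [p.text for p in list_posts_unscoped(POSTS, 1)]
safe = [p.text for p in list_posts_scoped(POSTS, 1)]
```

With fixtures that contain at least two workspaces, `leaked` includes the other tenant's draft while `safe` does not, which is the kind of check a single-workspace test suite never exercises.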

One thing I underestimated: Without dedicated UI designs, getting a consistent UX was brutal. All the functionality was there, but screens were unintuitive and some flows weren't reachable through the UI at all. 80% of the features worked after 20% of the time. The remaining 80% of the time went to polish and making the experience actually usable.

The project is open source under AGPL-3.0. 12 platform integrations, all first-party APIs. Django 5.x + HTMX + Alpine.js + Tailwind CSS 4 + PostgreSQL. No Redis. Docker Compose deploy, 4 containers.

Ask me anything about the spec-driven approach, platform API quirks, or how I split work between the two models.

hyperionultra•1h ago

Why Postgres instead of classic MySQL?

hk__2•1h ago

"Why MySQL instead of Postgres?" should be the question nowadays.

dewey•1h ago

Postgres isn't a newcomer any more. For most projects that I see it's the default and the "classic" already.

purerandomness•1h ago

MySQL does not let you have transactional DDL statements (alter, create, index etc).

If you're building anything serious and your data integrity is important, use Postgres.

Postgres is much stricter, and always was. MySQL tried to introduce several strict modes to mitigate the problems that they had, but I would always recommend using Postgres.
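For readers unfamiliar with the term, here is a rough sketch of what "transactional DDL" means: schema changes made inside a transaction disappear on rollback. It uses SQLite purely as a stand-in because it ships with Python and also supports transactional DDL. It does not demonstrate Postgres or MySQL behaviour, and the table and index names are made up.

```python
import sqlite3

# isolation_level=None disables the module's implicit transaction handling,
# so the BEGIN / ROLLBACK / COMMIT statements below are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)


def table_names(connection):
    """Return the names of all tables currently in the schema."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# A migration that is abandoned halfway: both DDL statements are undone.
conn.execute("BEGIN")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("CREATE INDEX posts_body_idx ON posts (body)")
conn.execute("ROLLBACK")
tables_after_rollback = table_names(conn)

# The same migration, committed this time: the table persists.
conn.execute("BEGIN")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("COMMIT")
tables_after_commit = table_names(conn)
```

In a database without transactional DDL, a failed migration can leave the schema half-changed, which is the integrity concern raised above.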

JanSchu•1h ago

Postgres is simply a battle proven technology.

faangguyindia•1h ago

such apps should use sqlite. it's enough for this type of app.

incidentnormal•1h ago

What did your harness look like for this?

stavros•1h ago

This is interesting, how do you publish to LinkedIn? I thought they didn't allow automated posts.

dewey•1h ago

Seems to just use the website api: https://github.com/brightbeanxyz/brightbean-studio/blob/main...

stavros•1h ago

Very helpful, thanks!

dewey•1h ago

Thank you for this write-up, this is much more interesting than all the "Show HN" posts that don't mention anything about AI even though you can see it in every corner.

What you describe matches my experience so far building projects mostly with AI from detailed specs, though with Rails instead of Django.

hk__2•1h ago

Nothing wrong here, but Django/HTMX seem quite 'old' technologies to me for a new project made in 2026. Nowadays I use FastAPI/SQLAlchemy for the backend and SvelteKit on the frontend.

JanSchu•1h ago

I originally have a data science background, so Python is usually my go-to language, and I already have a lot of experience with Django. This helps a lot when reviewing AI code and when you have to judge architecture, etc.

As for HTMX, I simply wanted something lightweight and not very invasive, to keep things simple and dependencies low.

In my head this was a good consideration to keep complexity low for my AI agents :-)

_heimdall•1h ago

HTMX is 5 years old, version 2 is just under 2 years old, and the last release (2.0.7) came out 7 months ago.

JodieBenitez•59m ago

> Django/HTMX seem quite 'old' technologies to me for a new project made in 2026.

It's simple, it works, it's efficient, safe, and there are tons of online resources for it. Excellent choice, even more so when using a coding agent.

rrr_oh_man•58m ago

You don’t need a Drillator-X 3000 AI Ready™ if a simple screwdriver gets the job done. IMHO the main thing technical people get wrong about B2B problems.

Also calling HTMX old makes me feel old.

JanSchu•28m ago

yeah htmx is from 2020, it feels like yesterday

benterix•15m ago

SvelteKit is also from 2020.

purerandomness•19m ago

FastAPI is quite old (2018)

Svelte even older (2016, SvelteKit was just a new version in 2022)

SQLAlchemy is ancient (2006)

Use newer tech, like HTMX (2020)

(/s obviously)

jbk•1h ago

This is amazing. I started doing the same, but I did not have the time to polish it.

Questions: why no X? Do you have a feature to resize (summarize?) the text to fit into short boxes?

mrsekut•58m ago

That was an interesting article. I have a few questions about the workflow.

1. You mentioned developing tasks in parallel—how many agents were you actually running at the same time? Did you ever reach a point where, even if you increased the degree of parallelism, merging and reviews became the bottleneck, and increasing the number further didn’t speed things up?

2. I really relate to the idea of “80% of features in 20% of the time, then 80% on polish.” Did you use AI for this final polishing phase as well? In other words, did you show the AI screenshots of the screens and explain them? Also, when looking back, do you feel that if you had written the initial specifications more carefully, you could have completed the work faster?

JanSchu•9m ago

What I did was break the development into different layers that had to be completed one after another, since the functionalities build on each other. Each layer had independent work streams that ran in parallel. Each work stream was one independent worktree/session in Claude Code.

First I triggered all work streams in a layer and brought them to a level of completion I was happy with. Then I merged them one after another, challenging each implementation on GitHub with @codex and rebasing when moving on to the next work stream.

This is roughly what it looked like:

Layer 0 - Project Scaffolding

Layer 1 - Core Features
Stream A - Content Pipeline
Stream B - Social Platform Providers
Stream C - Media Library
Stream D - Notification System
Stream E - Settings UI

                        T-0.1 (Scaffolding)
                              │
                        T-0.2 (Core Models + Auth)
                              │
          ┌───────────────────┼───────────────────┬──────────────┐
          │                   │                   │              │
     Stream A            Stream B            Stream C       Stream D
     (Content)           (Providers)         (Media)        (Notifs)
          │                   │                   │              │
     T-1A.1 Composer    T-1B.1 FB/IG/LI    T-1C.1 Library  T-1D.1 Engine
          │              T-1B.2 Others           │              │
     T-1A.2 Calendar         │                   │         Stream E
          │                  │                   │         T-1E.1 Settings UI
     T-1A.3 Publisher ◄──────┘                   │
          │                                      │
          └──────────◄───────────────────────────┘
          (Publisher needs providers + media processing)

Layer 2 - Collaboration & Engagement
Stream F - Approval & Client Portal
Stream G - Inbox
Stream H - Calendar & Composer Enhancements
Stream I - Client Onboarding

          Layer 1 complete
                │
    ┌───────────┼───────────┬──────────────┐
    │           │           │              │
 Stream F   Stream G    Stream H       Stream I
 (Approval  (Inbox)     (Calendar+     (Onboarding)
  + Portal)              Composer
    │                    enhance)
 T-2F.1 Approval
    │
 T-2F.2 Portal

So I ran up to 4 agents in parallel, but to be honest this was the maximum level of parallelism my brain could handle. I really felt like I was the bottleneck here.

Additionally, token usage is very high when you have so many agents working at the same time, so I very often hit my Claude session token limits and had to wait for the next session to begin (I do have the 5x Max plan).
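The layered plan above can be sketched as a small dependency graph. This is only an illustration using Python's standard `graphlib`, not the tooling actually used. The dependency edges are one simplified reading of the Layer 1 diagram (Stream E is left out, and T-1B.2 following T-1B.1 is an assumption), and `plan_batches` and `MAX_PARALLEL` are hypothetical names.

```python
from graphlib import TopologicalSorter

# Task -> set of tasks it depends on (simplified reading of the diagram).
DEPENDENCIES = {
    "T-0.1 Scaffolding": set(),
    "T-0.2 Core Models + Auth": {"T-0.1 Scaffolding"},
    "T-1A.1 Composer": {"T-0.2 Core Models + Auth"},
    "T-1B.1 FB/IG/LI": {"T-0.2 Core Models + Auth"},
    "T-1C.1 Library": {"T-0.2 Core Models + Auth"},
    "T-1D.1 Engine": {"T-0.2 Core Models + Auth"},
    "T-1A.2 Calendar": {"T-1A.1 Composer"},
    "T-1B.2 Others": {"T-1B.1 FB/IG/LI"},
    # Publisher needs providers + media processing.
    "T-1A.3 Publisher": {"T-1A.2 Calendar", "T-1B.2 Others", "T-1C.1 Library"},
}

# Assumed cap, taken from the "up to 4 agents" remark above.
MAX_PARALLEL = 4


def plan_batches(dependencies, max_parallel):
    """Group tasks into batches whose members have no unmet dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything returned here can run concurrently.
        ready = sorted(sorter.get_ready())
        # Split a wave that is wider than the number of agents we can supervise.
        for start in range(0, len(ready), max_parallel):
            batches.append(ready[start:start + max_parallel])
        sorter.done(*ready)
    return batches


batches = plan_batches(DEPENDENCIES, MAX_PARALLEL)
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {', '.join(batch)}")
```

Under these assumed edges, the four stream-opening tasks land in one batch and the Publisher task comes last, after everything it depends on has merged.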

dontwannahearit•44m ago

How much of the specs themselves came from the LLM? The development schedule https://github.com/brightbeanxyz/brightbean-studio/blob/main... has very AI-looking estimates, for example, and I can see a commit in the architecture.md file which exclusively changes em-dashes to normal dashes (https://github.com/brightbeanxyz/brightbean-studio/commit/74...), which suggests you wanted to make it seem less LLM-generated?

I ask, not to condemn, but to find out what your process was for developing the requirements. Clearly it was done with LLM help but what was the refinement process?

JanSchu•30m ago

The spec document was also written by Claude (over many iterations), with lots of manual additions. It still took me 4 full days to get the specs to the level I was happy with.

One main thing I did was to use the deep research feature of Claude to get a good understanding of what other tools are offering (features, integrations etc.)

Then each feature in the specs document got refined with manual suggestions and screenshots of other tools that I took.

benmarten•1h ago

No x?

cyanydeez•1h ago

Do people still think Twitter is a valuable place (besides bot owners)?

brobdingnagians•1h ago

It seems like geopolitical statements and international announcements happen a lot on Twitter/X these days.

bengale•1h ago

I know some people have ideological things going on that make them choose different networks, but they have more than half a billion active users so it's not exactly a ghost town.

spiderfarmer•1h ago

In The Netherlands it’s a full-on crazy town. I’m not kidding. It’s bottom of the barrel vitriolic garbage. Not one positive, insightful or interesting tweet among them.

rocketpastsix•57m ago

how many of those "active" users are just bots?

grvdrm•1h ago

I’m not a power user/poster but I see it as no less valuable than many other similar places. All of them have similar problems. For me it’s probably bifurcated by time spent tuning the feeds.

forsalebypwner•1h ago

their API is insanely expensive

JanSchu•1h ago

I did not include it yet, because you have to pay for the API. They recently changed their pricing model to pay per request only. I'll be looking into it in the coming weeks.

donohoe•57m ago

I’d argue it’s not worth it. Engagement and referral traffic from it continue to tank.

FireInsight•1h ago

I am genuinely in the "target market" for a tool such as this, but having evaluated one previously I found the quality and self-hosting experience to be pretty bad, and that a proprietary freemium product was still a better experience.

I'm hesitant to even take a look at this project due to the whole "vibe coded in 3 weeks" thing, though. Hearing that says to me that this is not serious or battle-tested and might go unmaintained or such. Do you think these are valid concerns to have?

spicyusername•56m ago

We're entering an era where the delivering of software is cheap. Basically any idea can have an MVP implemented by one or two people in just a month or two now. Very quickly the industry is learning what the next set of bottlenecks are, now that the bottleneck is no longer writing code.

Planning, design, management alignment, finding customers, integrating with other products, waiting for review, etc. Basically all the human stuff that can't be automated away.

Your comment reminds me to add building a support team to the list.

localhoster•50m ago

Was it ever? Even before LLMs, writing software, or at least web clients, was as easy as it can get.

written-beyond•46m ago

I agree, software (software startups) has always been the golden child of investors because of how cheap it is compared to hardware or any other physical good.

Good software is expensive regardless of the involvement of LLMs because you need someone to take responsibility. Large companies will save a buck because there may be fewer people needed to take said responsibility, but it's probably a marginal saving compared to the overall scheme of things.

baq•54m ago

You can vibe code minor fixes to some annoyances including the clanker managing the whole fork/pull request flow if you want to contribute back for $20/mo on codex or claude (though $20 is the free trial tier there, codex is nearly so since last week but should be good enough... for now).

63stack•42m ago

The era of sharing some small programs that you made with others to benefit from is over imo.

You can just vibe code it yourself. If your requirements are narrower (eg. you only need support for 3 networks and not 12), you will end up with something that takes less time to develop (possibly less than a day), it will have a smaller surface for problems, and it will be much better tailored to your specific needs. If you pay attention to what the LLM is doing it will also be easier to maintain or extend further.

The surface for security vulnerabilities also gets narrower, since you "only" have to trust the LLM (which is still a huge ask, but still better than LLM + 1 random person).

m000•38m ago

I agree. It's not like this project is disrupting an overpriced product/SaaS.

E.g. Buffer charges around $50 per year per social media account, which gives you an unlimited number of collaborating user accounts. And their single user plans are even cheaper.

I don't see how self-hosting would be a worthy investment of your time/effort in this case, unless you are in some grossly mismanaged organization where you have several devops engineers paid for doing literally nothing.

TrackerFF•25m ago

Lots and lots of commercial software is being vibe coded. The big difference here is that at least the OP is honest about it.

sixtyj•12m ago

I see your point.

Last time I “vibe coded” something (internal), I liked it because I couldn’t find an external solution.

I admire coders who can turn their code into a deliverable and usable piece.

The issue here is software abundance: people will start to hesitate because of the absurd pile they would have to evaluate.

It reminds me of the statistics on global ice cream sales. People want certainty, so they choose chocolate or vanilla :)

Therefore many good software projects will have trouble finding users.

ms7892•59m ago

Woah! I have been looking for something like this for a long time

themonsu•53m ago

Does it work with multiple social accounts? E.g. if I have 100 customers whose social medias I manage for content posting.

JanSchu•30m ago

yes

nottorp•48m ago

Is it in Rust too?

throwatdem12311•26m ago

Why does it matter how long it took you to make it?
