I ranked every building and landlord in NYC using 17M+ public records

3•rorcodes•1mo ago

Comments

rorcodes•1mo ago

Hi HN,

I built StreetSmart (https://streetsmart.inc) — a free tool for charity that scores every residential building in New York City by aggregating data from 12 different sources.

Why I built this: This is a charity project I built as part of a hackathon. It's completely free and for public service. I was looking for an apartment in NYC and realized the information asymmetry is insane. Landlords know everything about you (credit score, income, references), but you know almost nothing about them. NYC actually publishes a ton of housing data — 10.5M violations, 4M permits, 35M 311 complaints — but it's scattered across a dozen different city databases, formatted inconsistently, and practically unusable for the average renter.

So I filed some FOIL requests, downloaded everything, and spent a few days building a unified search.

What it does:

* Scores 600K+ buildings on 24 weighted dimensions (safety, pests, heat reliability, landlord responsiveness, etc.)
* Ranks landlords across their entire portfolio, not just one building
* Detects "shadow portfolios": landlords who hide behind a unique LLC for each building but reuse the same phone number or superintendent
* Shows floor-by-floor violation heatmaps ("Don't rent on the 3rd floor: 85% of pest issues are there")
* Identifies "construction harassment" patterns, where violation spikes coincide with renovation permit filings (a known tactic to push out rent-stabilized tenants)
* Tracks the "Groundhog Day effect": recurring violations in the same apartment that keep getting "fixed" but come back
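To make the "shadow portfolio" idea concrete, here is a minimal sketch of one way to group buildings that share a phone number or superintendent across different LLCs. It is an illustration, not StreetSmart's actual code: the function name, the record keys (`building_id`, `llc`, `phone`, `superintendent`), and the use of union-find are all assumptions.

```python
from collections import defaultdict


def find_shadow_portfolios(registrations):
    """Cluster buildings linked by a shared phone number or superintendent.

    `registrations` is a list of dicts with hypothetical keys:
    building_id, llc, phone, superintendent.
    Returns only clusters that span more than one LLC.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every building to the contact details it was registered with.
    for reg in registrations:
        node = ("building", reg["building_id"])
        find(node)
        for key in ("phone", "superintendent"):
            value = reg.get(key)
            if value:
                union(node, (key, value.strip().lower()))

    # Collect buildings and LLC names per connected component.
    clusters = defaultdict(lambda: {"buildings": set(), "llcs": set()})
    for reg in registrations:
        root = find(("building", reg["building_id"]))
        clusters[root]["buildings"].add(reg["building_id"])
        clusters[root]["llcs"].add(reg["llc"])

    return [c for c in clusters.values() if len(c["llcs"]) > 1]
```

Links are transitive here: if building A shares a superintendent with B, and B shares a phone number with C, all three end up in one cluster. Real data would need stricter normalization of phone numbers and names than the simple lowercasing shown.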

Technical stuff:

* Next.js 15 + Supabase (Postgres)
* ~28M records total across tables
* Python pipeline for monthly data sync (DuckDB for local processing)
* Pre-computed rankings refreshed weekly to avoid expensive runtime queries
* Scoring algorithm uses time-decay (old fixed violations fade out), per-unit normalization (large buildings aren't unfairly penalized), and distinguishes paperwork violations from actual hazards (a "file bedbug report" violation is different from "abate bedbug infestation")
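The three scoring ideas above (time-decay, per-unit normalization, paperwork vs. hazard) can be sketched roughly as follows. This is a guess at the general shape, not the real algorithm: the one-year half-life, the 0.25 paperwork factor, and the record keys (`paperwork`, `fixed_on`) are assumptions picked for illustration.

```python
from datetime import date

HALF_LIFE_DAYS = 365.0   # assumed: a fixed violation counts half as much after a year
PAPERWORK_FACTOR = 0.25  # assumed: paperwork violations count a quarter of a hazard


def violation_weight(violation, today):
    """Weight of one violation; open violations never decay."""
    weight = PAPERWORK_FACTOR if violation["paperwork"] else 1.0
    if violation["fixed_on"] is not None:
        # Exponential decay based on days since the violation was fixed.
        age_days = max((today - violation["fixed_on"]).days, 0)
        weight *= 0.5 ** (age_days / HALF_LIFE_DAYS)
    return weight


def violations_per_unit(violations, unit_count, today):
    """Decayed violation weight divided by the number of units in the building."""
    total = sum(violation_weight(v, today) for v in violations)
    return total / max(unit_count, 1)
```

Dividing by unit count is what keeps a 200-unit building with 20 violations from looking worse than a 4-unit building with 5.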

What's free: Everything. No ads, no premium tier, no data selling. I'm not trying to monetize this — I just think renters deserve better tools. NYC's housing data is public; I'm just making it searchable.

Interesting findings:

* Some landlords have 50+ buildings across different LLCs but always use the same superintendent
* Buildings with Class C (immediately hazardous) violations during active construction permits are often harassment, not accidents
* The median time for landlords to fix violations varies wildly by borough (some are 3x slower)
* Year built matters a lot: pre-1974 buildings have different regulatory protections
* Manhattan has both the best and the worst buildings
* Data can really help: there are great buildings in bad neighborhoods. Check out '37 Hillside Avenue'. It's in our lowest-ranked neighborhood but is an amazing building.

Would love feedback, especially on the scoring methodology. The weighting system is somewhat opinionated (safety is 3.5x, pests are 2x, etc.) and I'm curious if others would weight things differently.

Repo structure is a monorepo: street-web (Next.js app), street-data (raw CSVs + processing), street-db (migrations + rankings rebuild), street-parse (price scraper).
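For anyone who wants to experiment with the weighting mentioned above, a weighted average over per-dimension scores is one simple way to combine them. This is a sketch under assumptions, not the site's actual formula: only the safety (3.5x) and pests (2x) weights come from this post, while the `heat` weight of 1.0, the 0-100 score scale, and the function name are made up for the example.

```python
# Safety and pests weights are from the post; the heat weight is an assumed placeholder.
WEIGHTS = {"safety": 3.5, "pests": 2.0, "heat": 1.0}


def overall_score(dimension_scores, weights=WEIGHTS):
    """Weighted mean of per-dimension scores (assumed to be on a 0-100 scale).

    Dimensions missing from `dimension_scores` are skipped, so a building
    with no heat data is not scored as if its heat score were zero.
    """
    used = {dim: w for dim, w in weights.items() if dim in dimension_scores}
    if not used:
        raise ValueError("no scored dimensions to combine")
    total_weight = sum(used.values())
    return sum(dimension_scores[dim] * w for dim, w in used.items()) / total_weight
```

Trying a different weighting is then just a matter of passing another dictionary, for example `overall_score(scores, {"safety": 2.0, "pests": 2.0, "heat": 2.0})` for equal weights.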

biglyburrito•1mo ago

This is awesome & something I wish had existed when I bought our coop years ago. Well done!

rorcodes•1mo ago

Thanks so much! If there are any features you want, please let me know.
