The Bluesky Dictionary

https://www.avibagla.com/blueskydictionary/
56•gaws•2h ago

Comments

neaden•2h ago
Is this not working, or am I missing something? It just shows 0 words seen for me. Firefox on a PC.
SirFatty•2h ago
Same... maybe you need a Bluesky account, which I don't have.
gpm•2h ago
It doesn't... I can open it in a private browsing window.
GalaxyNova•2h ago
It's working fine for me on Firefox
accrual•1h ago
You may need to allow scripts from the domain avibagla.com, it shows 0 when the scripts are blocked.
zem•46m ago
ugh, it ought to be building the results on the server and serving up static pages.
rafram•21m ago
But it updates live...
AgentME•47m ago
For me it took a minute to start loading data and switch from just showing 0.
GalaxyNova•2h ago
Fascinating! I think it's really cool that this is possible, and at the same time kind of sad that the norm is slowly moving towards more locked-down APIs.
timeon•55m ago
> slowly moving towards

Depends what we accept as norm.

75345d4c•1h ago
I just saw it indexed "eluvium," but the post was referring to a band with that same name
Kye•1h ago
GeologySky will get to it soon enough.
atlgator•57m ago
I checked out the author's other projects and this is a common issue. For example, he has a "lean checker" for Bluesky that claims it is right-leaning simply because of all the people saying "That's right," "He was right," etc. None of the supposed right-leaning posts were actually conservative in nature. They just used the word "right" to mean "correct".
avibagla1•47m ago
one, thank you for checking my website. two, that is the joke, 100% - at the time people kept talking about how "left leaning" bsky was and that idea came to mind
wantlotsofcurry•1h ago
I'm very curious as to how this works in the backend. I realize it uses Bluesky's firehose to get the posts, but I'm more curious about how it checks whether a post contains any of the available words. Any guesses?
bangaladore•1h ago
Maybe I'm being naive, but with only ~275k words to check against, this doesn't seem like a particularly hard problem. Ingest post, split by words, check each word via some db, hashmap, etc... and update metadata.
gpm•1h ago
Probably just a big hashtable mapping word -> the number of times it's been seen, and another hashset of all the words it hasn't seen. When a post comes in you hash all the words in it and look them up in the hashtable, increment it, and if the old value was 0 remove it from the hash set.

250k words at a generous 100 bytes per word is only 25MB of memory...
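A minimal sketch of the hashtable-plus-hashset idea described above, assuming a very simple tokenizer (lowercased runs of ASCII letters). The class name `WordTracker`, the regex, and the three-word dictionary are hypothetical and only illustrate the bookkeeping; the site's actual backend may differ.

```python
import re

# Assumption: a token is a run of ASCII letters after lowercasing.
WORD_RE = re.compile(r"[a-z]+")


class WordTracker:
    """Tracks how often each dictionary word has been seen (hypothetical helper)."""

    def __init__(self, dictionary):
        words = list(dictionary)
        # word -> number of times it has been seen
        self.counts = {word: 0 for word in words}
        # words that have never appeared in any post
        self.unseen = set(words)

    def ingest(self, post_text):
        """Count dictionary words in one post; return words seen for the first time."""
        newly_seen = []
        for token in WORD_RE.findall(post_text.lower()):
            if token not in self.counts:
                continue  # not a dictionary word
            if self.counts[token] == 0:
                # old value was 0, so drop it from the unseen set
                self.unseen.discard(token)
                newly_seen.append(token)
            self.counts[token] += 1
        return newly_seen


tracker = WordTracker(["eluvium", "right", "skeet"])
first_time = tracker.ingest("He was right. That's right!")
```

Each token costs one dictionary lookup, so the work per post grows with the post's length rather than with the size of the dictionary.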

f311a•1h ago
You can probably fit all words under 10-15MB of memory, but memory optimisations are not even needed for 250k words...

Trie data structures are memory-efficient for storing such dictionaries (2-4x better than hashmaps), although they are not as fast as hashmaps for lookups. You can hash the top 1k most common words and check the rest using a trie.

The most CPU-intensive task here is text tokenizing, but there are a ton of optimized options developed by orgs that work on LLMs.
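The hybrid lookup suggested above (a hash set for the most common words, a trie for the rest) could look roughly like this. It is a sketch only: `Trie` and `make_lookup` are hypothetical names, and a dict-of-dicts trie in Python shows the lookup logic but will not reproduce the memory savings of a compact trie implementation.

```python
class Trie:
    """Character trie built from nested dicts (illustrative, not memory-tuned)."""

    _END = "$"  # end-of-word marker; assumes "$" never occurs inside a word

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[self._END] = True

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.get(ch)
            if node is None:
                return False  # no word continues with this character
        return self._END in node  # a prefix alone does not count as a word


def make_lookup(common_words, all_words):
    """Return a membership test: hash set for common words, trie for the rest."""
    hot = frozenset(common_words)
    trie = Trie(word for word in all_words if word not in hot)

    def is_known(word):
        return word in hot or word in trie

    return is_known


is_known = make_lookup(
    common_words=["the", "right"],
    all_words=["the", "right", "eluvium", "skeet"],
)
```

Here `is_known("the")` is answered by the hash set alone, while `is_known("eluvium")` walks the trie one character at a time.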

stwrzn•57m ago
I very much hope that the backend uses one of the bluesky jetstream endpoints. When you only subscribe to new posts, it provides a stream of around 20mbit/s last time I checked, while the firehose was ~200mbit/s.
avibagla1•49m ago
yes it does!
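For readers unfamiliar with Jetstream, here is a rough sketch of how a consumer might request only new posts and pull the text out of each event. The hostname is one of the public Jetstream instances as I understand them, and the event layout (`kind`, `commit.operation`, `commit.collection`, `commit.record.text`) follows my reading of the Jetstream JSON format; treat both as assumptions to check against the Jetstream documentation. The sample event is hand-written, and no network connection is opened.

```python
import json
from urllib.parse import urlencode

# Assumption: a public Jetstream instance; verify the hostname before relying on it.
JETSTREAM_ENDPOINT = "wss://jetstream2.us-east.bsky.network/subscribe"
POST_COLLECTION = "app.bsky.feed.post"


def subscribe_url(collections=(POST_COLLECTION,)):
    """Build a WebSocket URL that asks only for the given record collections."""
    query = urlencode([("wantedCollections", name) for name in collections])
    return f"{JETSTREAM_ENDPOINT}?{query}"


def post_text(raw_event):
    """Return the text of a newly created post, or None for any other event."""
    event = json.loads(raw_event)
    if event.get("kind") != "commit":
        return None  # e.g. identity or account events
    commit = event.get("commit", {})
    if commit.get("operation") != "create":
        return None  # skip updates and deletes
    if commit.get("collection") != POST_COLLECTION:
        return None
    return commit.get("record", {}).get("text")


# Hand-written sample shaped like a Jetstream commit event (not real data).
sample = json.dumps({
    "did": "did:plc:example",
    "kind": "commit",
    "commit": {
        "operation": "create",
        "collection": POST_COLLECTION,
        "record": {"$type": POST_COLLECTION, "text": "listening to eluvium"},
    },
})
text = post_text(sample)
```

Filtering by collection on the server side is what keeps the stream small compared with the full firehose.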
avibagla1•50m ago
Hey! this is my site - it's not all that complex, i'm just using a sqlite db with two tables - one for stats, the other for all the words that's just word | count | first use | last use | post.

I... did not expect this to be so popular
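A two-table SQLite layout like the one described above might be sketched as follows. The column names, the key/value shape of the `stats` table, and the choice to keep the first post that used a word are all assumptions; the author only gave the rough column list. The upsert syntax needs SQLite 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for illustration
conn.executescript(
    """
    CREATE TABLE words (
        word       TEXT PRIMARY KEY,
        seen_count INTEGER NOT NULL DEFAULT 0,
        first_use  TEXT,  -- timestamp of the first sighting
        last_use   TEXT,  -- timestamp of the latest sighting
        post       TEXT   -- assumed: URI of the first post using the word
    );
    CREATE TABLE stats (
        key   TEXT PRIMARY KEY,  -- assumed key/value layout
        value INTEGER NOT NULL
    );
    """
)


def record_word(conn, word, seen_at, post_uri):
    """Insert a word on first sighting, otherwise bump its count and last_use."""
    conn.execute(
        """
        INSERT INTO words (word, seen_count, first_use, last_use, post)
        VALUES (?, 1, ?, ?, ?)
        ON CONFLICT(word) DO UPDATE SET
            seen_count = seen_count + 1,
            last_use = excluded.last_use
        """,
        (word, seen_at, seen_at, post_uri),
    )


def bump_stat(conn, key):
    """Increment a named counter in the stats table, creating it at 1."""
    conn.execute(
        """
        INSERT INTO stats (key, value) VALUES (?, 1)
        ON CONFLICT(key) DO UPDATE SET value = value + 1
        """,
        (key,),
    )


record_word(conn, "eluvium", "2025-08-06T12:00:00Z", "at://example/post/1")
record_word(conn, "eluvium", "2025-08-06T12:05:00Z", "at://example/post/2")
bump_stat(conn, "posts_seen")
bump_stat(conn, "posts_seen")

row = conn.execute(
    "SELECT seen_count, first_use, last_use, post FROM words WHERE word = ?",
    ("eluvium",),
).fetchone()
posts_seen = conn.execute(
    "SELECT value FROM stats WHERE key = ?", ("posts_seen",)
).fetchone()[0]
```

Making `word` the primary key gives an indexed lookup per word, which is plenty for a dictionary of a few hundred thousand entries.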

spullara•1h ago
I did this against a pretty large tweet archive and got hits on about 125k of the words in the unix dictionary.
pona-a•1h ago
For a moment I thought it would be an AT-Proto based Urban Dictionary clone.
tough•44m ago
Words We Haven't Seen

- Search unseen words

made me chuckle

crm9125•14m ago
I've found content for all of my future skeets.
