Ask HN: Aggregating authentic user reviews across platforms?

2•howardV•7mo ago
I'm exploring the technical feasibility of building a tool that aggregates genuine user reviews about websites from various sources (social media, forums, review platforms, etc.).

The core challenge: How do you programmatically collect and verify authentic user sentiment about a website while respecting rate limits, ToS, and privacy concerns?

Technical questions I'm grappling with:

Data sources: Which platforms actually allow review scraping legally?
Authentication: How to handle platforms that require login for review access?
Rate limiting: Best practices for respectful data collection across multiple APIs?
Spam detection: How to filter out fake reviews and bot-generated content?
Real-time updates: Efficient ways to keep review data current without overwhelming source platforms?

Broader questions:

Has anyone built something similar? What were the biggest technical hurdles?
Are there existing APIs or datasets that make this more feasible?
What legal/ethical considerations am I missing?

Currently researching this space and would love to hear from anyone who's tackled similar challenges in review aggregation, web scraping at scale, or sentiment analysis. Any insights on the technical architecture or cautionary tales would be incredibly valuable!

Comments

8organicbits•7mo ago
Let's break it down:

Authentic user sentiment - this is an impossible problem. At scale, the best you could do is to ask an LLM to rate sentiment and authenticity. If you can tolerate inaccuracies, that may be viable.
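To make the LLM idea concrete, here is a minimal Python sketch, not a definitive implementation. Everything in it is an assumption for illustration: `ask_llm` is a hypothetical callable standing in for whatever model client you use, and the prompt wording, the JSON keys, and the score ranges are made up. The one real point it shows is that the model's reply should be treated as untrusted and validated before it is stored.

```python
import json
from typing import Callable, Optional

# Hypothetical prompt; the keys and ranges are illustrative assumptions.
PROMPT = (
    "Rate the review below. Reply with JSON only, using the keys "
    '"sentiment" (-1.0 to 1.0) and "authenticity" (0.0 to 1.0).\n\n'
    "Review:\n{review}"
)


def rate_review(review: str, ask_llm: Callable[[str], str]) -> Optional[dict]:
    """Ask a model to score one review; return None if the reply is unusable.

    ask_llm is a hypothetical stand-in: it takes a prompt string and
    returns the model's raw text reply.
    """
    reply = ask_llm(PROMPT.format(review=review))
    try:
        data = json.loads(reply)
        sentiment = float(data["sentiment"])
        authenticity = float(data["authenticity"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing keys, or non-numeric values.
        return None
    # Reject out-of-range scores instead of silently clamping them.
    if not (-1.0 <= sentiment <= 1.0 and 0.0 <= authenticity <= 1.0):
        return None
    return {"sentiment": sentiment, "authenticity": authenticity}
```

Passing the model in as a plain function keeps the sketch independent of any vendor SDK and lets you swap in a stub when testing. The scores are still only the model's guess, so the inaccuracy caveat above applies in full.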

Rate limits: web crawling frameworks do this out of the box via robots.txt, various headers, etc. Non-trivial to set up, but not novel.
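For a sense of what a crawling framework is doing under the hood, here is a rough Python sketch using the standard library's `urllib.robotparser`. It assumes the robots.txt text has already been downloaded, handles a single host, and falls back to a one-second delay when no `Crawl-delay` is declared. The user agent name, the fallback value, and the `PoliteScheduler` class are all hypothetical; a real crawler would also honour headers such as `Retry-After`.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "review-aggregator-bot"  # hypothetical user agent name
DEFAULT_DELAY = 1.0  # assumed fallback in seconds when no Crawl-delay is set


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PoliteScheduler:
    """Tracks the last request time for one host and enforces its crawl delay."""

    def __init__(self, rules, clock=time.monotonic, sleep=time.sleep):
        self.rules = rules
        self.clock = clock  # injectable so the timing can be tested
        self.sleep = sleep
        self.last_request = None

    def delay(self) -> float:
        """Seconds to leave between requests to this host."""
        declared = self.rules.crawl_delay(USER_AGENT)
        return float(declared) if declared is not None else DEFAULT_DELAY

    def acquire(self, url: str) -> bool:
        """Return False if robots.txt disallows the URL; else wait, then return True."""
        if not self.rules.can_fetch(USER_AGENT, url):
            return False
        if self.last_request is not None:
            remaining = self.delay() - (self.clock() - self.last_request)
            if remaining > 0:
                self.sleep(remaining)
        self.last_request = self.clock()
        return True
```

You would call `acquire(url)` before each fetch and skip the URL when it returns `False`. One scheduler per host keeps a slow site from throttling requests to the others.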

ToS: You'll need a lawyer to advise here. Possibly by reading each ToS document individually, including every ToS update.

Legal/ethical: we'd need to know more about what you're doing to comment.

Generally: Retail websites want users to buy products on their own sites. If your scraping doesn't drive traffic to them, they won't want it. If users who previously went directly to the retail website to read reviews now use your site instead, the retailer will fear a loss of revenue. One way or another, they'll try to prevent this from happening.