Ask HN: Aggregating authentic user reviews across platforms?

2•howardV•4h ago
I'm exploring the technical feasibility of building a tool that aggregates genuine user reviews about websites from various sources (social media, forums, review platforms, etc.).

The core challenge: How do you programmatically collect and verify authentic user sentiment about a website while respecting rate limits, ToS, and privacy concerns?

Technical questions I'm grappling with:

Data sources: Which platforms actually allow review scraping legally?

Authentication: How to handle platforms that require login for review access?

Rate limiting: Best practices for respectful data collection across multiple APIs?

Spam detection: How to filter out fake reviews and bot-generated content?

Real-time updates: Efficient ways to keep review data current without overwhelming source platforms?

Broader questions:

Has anyone built something similar? What were the biggest technical hurdles?

Are there existing APIs or datasets that make this more feasible?

What legal/ethical considerations am I missing?

Currently researching this space and would love to hear from anyone who's tackled similar challenges in review aggregation, web scraping at scale, or sentiment analysis. Any insights on the technical architecture or cautionary tales would be incredibly valuable!

Comments

8organicbits•2h ago
Let's break it down:

Authentic user sentiment - this is an impossible problem. At scale, the best you could do is to ask an LLM to rate sentiment and authenticity. If you can tolerate inaccuracies, that may be viable.

Rate limits: web crawling frameworks do this out of the box via robots.txt, various headers, etc. Non-trivial to set up, but not novel.
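To make the robots.txt point concrete, here is a minimal sketch of what a crawler framework does for you, using only Python's standard library. The robots.txt content, the bot name `review-aggregator-bot`, and the `PoliteFetchPlanner` class are all made up for illustration; a real crawler would fetch robots.txt per host and would also need to react to response headers, which this sketch does not cover.

```python
import time
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it per host.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

USER_AGENT = "review-aggregator-bot"  # hypothetical bot name


class PoliteFetchPlanner:
    """Decides whether and when a URL may be fetched, per host."""

    def __init__(self, robots_txt, user_agent, default_delay=1.0):
        self.parser = RobotFileParser()
        self.parser.parse(robots_txt.splitlines())
        self.user_agent = user_agent
        # Use the site's Crawl-delay if it declares one, else a default.
        delay = self.parser.crawl_delay(user_agent)
        self.delay = float(delay) if delay is not None else default_delay
        # host -> earliest time (monotonic seconds) for the next request
        self.next_allowed = {}

    def allowed(self, url):
        """True if robots.txt permits this user agent to fetch the URL."""
        return self.parser.can_fetch(self.user_agent, url)

    def wait_time(self, url, now=None):
        """Seconds to wait before fetching url; also reserves that slot."""
        now = time.monotonic() if now is None else now
        host = urlparse(url).netloc
        ready_at = max(now, self.next_allowed.get(host, now))
        self.next_allowed[host] = ready_at + self.delay
        return ready_at - now


planner = PoliteFetchPlanner(ROBOTS_TXT, USER_AGENT)
for url in ("https://example.com/reviews", "https://example.com/private/x"):
    if planner.allowed(url):
        print(url, "-> fetch after", planner.wait_time(url), "s")
    else:
        print(url, "-> skipped (disallowed by robots.txt)")
```

Tracking the next allowed time per host, rather than sleeping globally, lets one crawler work through many sites without ever hitting any single site faster than its declared delay.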

ToS: You'll need a lawyer to advise here. Possibly by reading each ToS document individually, including every ToS update.

Legal/ethical: we'd need to know more about what you're doing to comment.

Generally: Retail websites want users to come to their own site to purchase products. If your scraping doesn't drive traffic to their sites, they won't want it. If users who previously went directly to the retail website to read reviews now use your site instead, the retailer will fear losing revenue. One way or another, they'll try to prevent this from happening.

Show HN: Ossia Score 3.5.3

https://github.com/ossia/score/releases/tag/v3.5.3
1•jcelerier•1m ago•0 comments

Loss of Identity: Surviving Post-PhD Depression

https://voicesofacademia.com/2022/08/12/loss-of-identity-surviving-post-phd-depression-by-amy-gaeta/
1•colinprince•2m ago•0 comments

Trump says will impose 25% tariffs on Japan, South Korea

https://www.reuters.com/world/asia-pacific/trump-says-will-impose-25-tariffs-japan-south-korea-2025-07-07/
1•lysace•2m ago•0 comments

Tell HN: My 3 competitors are all super polished companies run by solo devs

1•gametorch•4m ago•1 comments

Missed OpenAI's GPT-4o Tutor? I Built the Demo They Didn't Release

https://www.brimink.com/
1•Daviduche03•5m ago•1 comments

Timemate achieves a carbon rating of A plus

https://www.websitecarbon.com/website/timemate-app/
1•chrisding•5m ago•0 comments

PubGrub: Next-generation version solving (2018)

https://nex3.medium.com/pubgrub-2fb6470504f
1•cosmic_quanta•5m ago•0 comments

Google Shut Down My Android Play Store Account and Killed My Business

https://flyingbytes.github.io
2•TheFastOne2•6m ago•0 comments

Automatically Packaging a Haskell Library as a Swift Binary XCFramework

https://alt-romes.github.io/posts/2025-07-05-packaging-a-haskell-library-as-a-swift-binary-xcframework.html
1•Bogdanp•6m ago•0 comments

Xbox producer suggests laid-off staff use AI to deal with emotions

https://www.bbc.co.uk/news/articles/ckglzxy389zo
1•ode•7m ago•0 comments

Tesla Stock Slides After Musk Says He's Creating a New Political Party

https://www.wsj.com/livecoverage/stock-market-today-dow-sp-500-nasdaq-07-07-2025/card/tesla-stock-slides-after-trump-slates-musk-s-new-political-party-CVgF3BCPrJsvivbECRFy
2•OptionOfT•7m ago•1 comments

AI is learning to lie, scheme, and threaten its creators during stress tests

https://fortune.com/2025/06/29/ai-lies-schemes-threats-stress-testing-claude-openai-chatgpt/
2•swyx•11m ago•1 comments

Generic Interfaces

https://go.dev/blog/generic-interfaces
2•Merovius•11m ago•0 comments

Excessive copying in C++ and your program's speed

https://johnnysswlab.com/excessive-copying-in-c-and-your-programs-speed/
1•ryandotsmith•13m ago•0 comments

Inside America’s Death Chambers

https://www.theatlantic.com/magazine/archive/2025/07/death-row-executions-witness/682891/
1•speckx•15m ago•0 comments

AlexScan – The Domain Security Analyzer

https://github.com/alexevan13/domain-analyzer
1•alexevan13•16m ago•1 comments

7 Nobel Economists Urge France to Lead with 2% Ultra Rich Wealth Tax

https://www.lemonde.fr/en/opinion/article/2025/07/07/tax-on-ultra-rich-france-has-the-opportunity-to-lead-the-way-say-nobel-prize-winning-economists_6743117_23.html
2•francou•20m ago•1 comments

Meetings Are the Mind Killer

1•jmugan•21m ago•0 comments

Introduction to Indian English

https://www.oed.com/discover/introduction-to-indian-english/
1•sandwichsphinx•21m ago•0 comments

Jane Street's Indian Options Trade Was Too Good

https://www.bloomberg.com/opinion/newsletters/2025-07-07/jane-street-s-indian-options-trade-was-too-good
3•frontfor•21m ago•1 comments

Show HN: Dwani.ai – multimodal inference API for Indian languages

https://dwani.ai
1•gaganyatri•22m ago•0 comments

Tempest-LoRa: Cross-Technology Covert Communication

https://arxiv.org/abs/2506.21069
1•sorenjan•22m ago•0 comments

Bedroom Design Orientation and Sleep Electroencephalography Signals (2019)

https://lww.com/_layouts/1033/OAKS.Journals/Error/JavaScript.html
1•walterbell•24m ago•0 comments

I used to prefer permissive licenses and now favor copyleft

https://vitalik.eth.limo/general/2025/07/07/copyleft.html
2•ryandotsmith•24m ago•0 comments

In Praise of the Contrarian Stack

https://hackers.pub/@hongminhee/2025/contrarian-stack/en
1•thm•25m ago•0 comments

Show HN: Doc81 – tech documentation tool designed in AI-native mind

https://github.com/ahnopologetic/doc81
1•stahn1995•25m ago•0 comments

27 Notes on Growing Old(er)

https://www.ian-leslie.com/p/27-notes-on-growing-older
2•underthenettle•28m ago•1 comments

The Economics of Programming Languages [video] (2023)

https://www.youtube.com/watch?v=XZ3w_jec1v8
1•nateb2022•30m ago•1 comments

ChatGPT could pilot a spacecraft unexpectedly well, early tests find

https://www.space.com/space-exploration/launches-spacecraft/chatgpt-could-pilot-a-spacecraft-unexpectedly-well-early-tests-find
2•DamnInteresting•30m ago•0 comments

Overlord Engine: The Game Engine for Web Development

https://overlordsystems.com/
2•omarmahdi•30m ago•0 comments