
Ask HN: Web scraping in production?

3•arkmm•3h ago
Are any of you maintaining any web scrapers in production?

I've done some for side projects, automated testing, and personal scripts (checking personal bank balances, getting a Global Entry interview slot, etc.), but it always feels very brittle.

Curious what applications people have in industry and what sorts of techniques people use for reliability.

Comments

sargstuff•3h ago
Excel web scraping [0] (vs. using Python [1] and/or ODBC/delimited files)

A few 2025 use cases [2],[3]:

   Use publicly available database information (construction, taxes, sales, traffic reports, proposed building/zoning changes, etc.) to find out what's going on within an area, e.g. a zip code, housing area, 'vacation spot', etc.
----

   creative take on topic:

      modern looming / static 'threaded' approach : https://news.ycombinator.com/item?id=43977384

      Structurally reprogrammable magnetic metamaterials hold promise for biomedicine, soft robotics. ("web" support formed via scraping material in relevant patterns) : https://techxplore.com/news/2025-05-reprogrammable-magnetic-metamaterials-biomedicine-soft.html

      3d printed smart-fabrics : https://techxplore.com/news/2025-05-d-smart-fabrics-flexibility-ability.html

----

[0] : excel scraping : https://www.youtube.com/watch?app=desktop&v=6coVzIt93vk

[1] : python scraping : https://www.youtube.com/watch?v=Oo8-nEuDBkk

[2] : https://dataforest.ai/blog/top-web-scraping-use-cases

[3] : https://www.parsehub.com/blog/web-scraping-examples/

arkmm•3h ago
Neat - didn't realize there were affordances for scraping in Excel (but in hindsight I shouldn't be surprised).

I didn't follow the connection between modern looming and scraping though?

sargstuff•3h ago
hint: silk spider webs & fabric threads.

Guess 3d printing should have been clarified as linear, fused deposition. Melted plastic line gets scraped along plate/material.

The 3d printed web reference, in this instance, being the in-fill pattern. [0]

Robotic metal pinching / incremental sheet forming might be a clearer example. [1]

-----

[0] : in-fill pattern : https://jlc3dp.com/blog/choosing-the-right-infill-structure-...

[1] : https://www.youtube.com/watch?v=Jc16Ob-yoDs

9d•3h ago
Scraping is inherently brittle, but it can be very useful for short-term jobs in very specific circumstances. I haven't maintained any in production in maybe 10 years.
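
A minimal sketch of one way to make that brittleness visible, offered as an illustration rather than anyone's production approach: try several candidate element ids in order and raise a loud error when none match, so a layout change surfaces as a failure instead of silently bad data. The ids, the sample HTML, and the names `extract_text` and `ScrapeError` are all hypothetical; only Python's standard-library `html.parser` is used.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ScrapeError(Exception):
    """Raised when the page no longer matches the expected structure."""


class _TextByIdParser(HTMLParser):
    """Collects the text inside the first element carrying a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0        # > 0 while inside the target element
        self.found = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif not self.found and dict(attrs).get("id") == self.target_id:
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_text(html, candidate_ids):
    """Return the text of the first candidate id that matches, else raise."""
    for element_id in candidate_ids:
        parser = _TextByIdParser(element_id)
        parser.feed(html)
        parser.close()
        text = "".join(parser.chunks).strip()
        if parser.found and text:
            return text
    raise ScrapeError(
        f"none of {candidate_ids!r} matched; the page layout may have changed"
    )


# Hypothetical page fragment; a real scraper would fetch this over HTTP.
SAMPLE_HTML = '<div><span id="balance">$ 12.50</span></div>'
balance = extract_text(SAMPLE_HTML, ["account-balance", "balance"])
```

Listing an older id after a newer one lets the scraper survive a rename, while the exception gives monitoring something concrete to alert on.
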
sargstuff•3h ago
IMHO, for an "untyped" format/delimited file, yes. Directly placing/'compiling' the data into an appropriate structured environment works wonders, e.g. a database, a spreadsheet, or "reports" with information beyond the raw data.
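
Putting scraped, untyped rows straight into a structured environment could look roughly like this, assuming SQLite as the database. The `permits` table, its columns, the sample data, and the `load_rows` helper are hypothetical and chosen only for illustration; rows that fail parsing or a constraint are collected instead of stored.

```python
import csv
import io
import sqlite3
from datetime import date


def load_rows(conn, delimited_text):
    """Insert valid rows into a typed table; return the rejected ones."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS permits ("
        " zip_code TEXT NOT NULL CHECK (length(zip_code) = 5),"
        " issued_on TEXT NOT NULL,"
        " value_usd REAL NOT NULL CHECK (value_usd >= 0))"
    )
    rejected = []
    for row in csv.DictReader(io.StringIO(delimited_text)):
        try:
            conn.execute(
                "INSERT INTO permits (zip_code, issued_on, value_usd)"
                " VALUES (?, ?, ?)",
                (
                    row["zip_code"],
                    # Parse and re-serialize so only ISO dates are stored.
                    date.fromisoformat(row["issued_on"]).isoformat(),
                    float(row["value_usd"]),
                ),
            )
        except (KeyError, TypeError, ValueError, sqlite3.IntegrityError) as exc:
            # Keep the bad row and the reason so it can be inspected later.
            rejected.append((row, str(exc)))
    conn.commit()
    return rejected


# Hypothetical scraped output: one good row, one bad zip code, one bad number.
SAMPLE = (
    "zip_code,issued_on,value_usd\n"
    "97201,2025-05-01,15000\n"
    "9720,2025-05-02,100\n"
    "97202,2025-05-03,n/a\n"
)

conn = sqlite3.connect(":memory:")
rejected = load_rows(conn, SAMPLE)
```

A growing `rejected` list is often the first sign that the source page or file format has changed.
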