Is there a balance to be struck between simple hierarchical models and

https://statmodeling.stat.columbia.edu/2024/05/26/is-there-a-balance-to-be-struck-between-simple-hierarchical-models-and-more-complex-hierarchical-models-that-augment-the-simple-frameworks-with-more-modeled-interactions-when-analyzing-real-data/
39•luu•4d ago

Comments

Onawa•21h ago
Full Title: Is there a balance to be struck between simple hierarchical models and more complex hierarchical models that augment the simple frameworks with more modeled interactions when analyzing real data?
a-dub•20h ago
"When working on your particular problem, start with simple comparisons and then fit more and more complicated models until you have what you want."

sounds algorithmic...

mnky9800n•19h ago
Yes, and you can even build symbolic engines that do this for you. I think the real question we must ask ourselves as data scientists or statisticians or whatever is whether we believe these data models represent the space of data fully or by happenstance. And if by happenstance, is it because the data doesn’t capture the underlying processes that produced it, or are those processes uncapturable in this way, so that function approximators like neural networks or gradient boosting machines are better? And is that because those function approximators capture interactions between the driving processes that otherwise go unseen, or is it because those processes have fractional dimensions that control their impact and are not captured by data models?

This is all summed up well by Leo Breiman’s “two cultures” paper, in my opinion. I have gone back and forth on which “culture” is the correct representation of how processes produce data. If you buy that only function approximators truly capture the complexity of whatever processes you are observing, then you have to wonder why physics works so well. At least in my opinion, that’s because, from the statistical point of view, physics has spent centuries developing equations that are linear combinations of variables, which are essentially data models in Breiman’s sense. I hope this opinion generates discussion, because I don’t know what the answer is or whether it matters that there is one.
a-dub•16h ago
seems to me that one approach is fueled by data and the other is fueled by understanding. in the former, the observations form a view of behavior which is then modeled with high fidelity. in the latter, active inquiry, adversarial data collection and careful reasoning produce simpler models of hypothesized underlying processes that often prove to have nearly perfect generalization.

the interesting future is probably the one where the former produces new building blocks for the latter. (ie, the computer generates new simple and easy to understand constructs from which it explains previously not understood or well modeled phenomena.)

joe_the_user•19h ago
Well, my impression is that the statistical paradigm itself limits the complexity of a model through its basic aims and measures. In particular, a statistical model aims to be an unbiased predictor of a variable, whereas machine learning/"AI" just aims for prediction and doesn't care about bias in the statistical sense.
klysm•17h ago
I think they typically have totally different goals. For example, let’s say we are doing a sampling procedure. How do you estimate the sampling error? I’m not aware of a machine learning technique that will help, but you can use Bayesian and MCMC techniques.
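For readers who want something concrete, here is a minimal sketch of what "Bayesian and MCMC techniques" for sampling error can look like. It is an illustration, not klysm's code: the setup (42 successes in a sample of 120), the flat prior, the step size, and all function names are assumptions chosen for the example. A random-walk Metropolis sampler draws from the posterior of a population proportion, and the spread of those draws serves as the estimate of sampling uncertainty.

```python
import math
import random


def log_posterior(p, successes, trials):
    """Unnormalized log posterior of a proportion under a flat Beta(1, 1) prior."""
    if not 0.0 < p < 1.0:
        return -math.inf  # zero posterior mass outside (0, 1)
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def metropolis(successes, trials, n_samples=20000, burn_in=2000, step=0.05, seed=0):
    """Random-walk Metropolis sampler for the proportion p."""
    rng = random.Random(seed)
    p = 0.5
    current = log_posterior(p, successes, trials)
    draws = []
    for i in range(n_samples + burn_in):
        proposal = p + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Symmetric proposal, so accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            p, current = proposal, candidate
        if i >= burn_in:
            draws.append(p)
    return draws


def summarize(draws):
    """Posterior mean and a central 95% credible interval."""
    ordered = sorted(draws)
    n = len(ordered)
    mean = sum(ordered) / n
    return mean, ordered[int(0.025 * n)], ordered[int(0.975 * n)]


# Hypothetical data: 42 of 120 sampled units have the attribute of interest.
draws = metropolis(successes=42, trials=120)
mean, lo, hi = summarize(draws)
print(f"posterior mean={mean:.3f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

For this toy case the posterior is available in closed form (a Beta(43, 79) distribution), so MCMC is not strictly needed; the sampler is used only because the same loop carries over to models where no closed form exists.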
usgroup•17h ago
I think this is accurate but mostly because statistical modelling aims for interpretable parameters. That very strongly regularises complexity.

A community-led fork of Organic Maps

https://www.comaps.app/news/2025-05-12/3/
91•maelito•2h ago•44 comments

University of Texas-Led Team Solves a Big Problem for Fusion Energy

https://news.utexas.edu/2025/05/05/university-of-texas-led-team-solves-a-big-problem-for-fusion-energy/
37•signa11•1h ago•8 comments

Spade Hardware Description Language

https://spade-lang.org/
29•spmcl•1h ago•8 comments

A crypto founder faked his death. We found him alive at his dad's house

https://sfstandard.com/2025/05/08/jeffy-yu-zerebro-fake-death/
50•bathtub365•39m ago•10 comments

I ruined my vacation by reverse engineering WSC

https://blog.es3n1n.eu/posts/how-i-ruined-my-vacation/
258•todsacerdoti•10h ago•127 comments

Plain Vanilla Web

https://plainvanillaweb.com/index.html
1212•andrewrn•21h ago•565 comments

A Typical Workday at a Japanese Hardware Tool Store [video]

https://www.youtube.com/watch?v=A98jyfB5mws
20•Erikun•2d ago•1 comments

Implicit UVs: Real-time semi-global parameterization of implicit surfaces [pdf]

https://baptiste-genest.github.io/papers/implicit_uvs.pdf
16•ibobev•3h ago•1 comments

Paul McCartney, Elton John and other creatives demand AI comes clean on scraping

https://www.theregister.com/2025/05/12/uk_creatives_ai_letter/
14•rntn•19m ago•6 comments

Spark AI (YC W24) Is Hiring a Full Stack Engineer in San Francisco

https://www.ycombinator.com/companies/spark/jobs/kDeJlPK-software-engineer-full-stack
1•tk90•1h ago

CrowdStrike CEO Cuts His Voting Power by 92% with Unexplained Gifts

https://www.bloomberg.com/news/articles/2025-05-12/billionaire-crowdstrike-ceo-cuts-voting-power-by-92-with-unexplained-gifts
37•wslh•1h ago•11 comments

US Copyright Office found AI companies breach copyright. Its boss was fired

https://www.theregister.com/2025/05/12/us_copyright_office_ai_copyright/
143•croes•3h ago•41 comments

Continuous Thought Machines

https://pub.sakana.ai/ctm/
228•hardmaru•11h ago•23 comments

Intellect-2 Release: The First 32B Model Trained Through Globally Distributed RL

https://www.primeintellect.ai/blog/intellect-2-release
164•Philpax•12h ago•45 comments

Armbian Updates: OMV support, boot improvents, Rockchip optimizations

https://www.armbian.com/newsflash/armbian-updates-nas-support-lands-boot-systems-improve-and-rockchip-optimizations-arrive/
39•transpute•5h ago•1 comments

Making PyPI's test suite 81% faster – The Trail of Bits Blog

https://blog.trailofbits.com/2025/05/01/making-pypis-test-suite-81-faster/
88•rbanffy•3d ago•24 comments

Why Bell Labs Worked

https://1517.substack.com/p/why-bell-labs-worked
257•areoform•17h ago•177 comments

Car companies are in a billion-dollar software war

https://insideevs.com/features/759153/car-companies-software-companies/
388•rntn•19h ago•668 comments

Absolute Zero Reasoner

https://andrewzh112.github.io/absolute-zero-reasoner/
100•jonbaer•4d ago•18 comments

The Academic Pipeline Stall: Why Industry Must Stand for Academia

https://www.sigarch.org/the-academic-pipeline-stall-why-industry-must-stand-for-academia/
129•MaysonL•11h ago•94 comments

High-school shop students attract skilled-trades job offers

https://www.wsj.com/lifestyle/careers/skilled-trades-high-school-recruitment-fd9f8257
217•lxm•22h ago•366 comments

Scraperr – A Self Hosted Webscraper

https://github.com/jaypyles/Scraperr
213•jpyles•19h ago•72 comments

Writing an LLM from scratch, part 13 – attention heads are dumb

https://www.gilesthomas.com/2025/05/llm-from-scratch-13-taking-stock-part-1-attention-heads-are-dumb
308•gpjt•3d ago•59 comments

Ask HN: Cursor or Windsurf?

193•skarat•9h ago•276 comments

For better or for worse, the overload (2024)

https://consteval.ca/2024/07/25/overload/
13•HeliumHydride•3d ago•0 comments

Title of work deciphered in sealed Herculaneum scroll via digital unwrapping

https://www.finebooksmagazine.com/fine-books-news/title-work-deciphered-sealed-herculaneum-scroll-digital-unwrapping
221•namanyayg•23h ago•108 comments

How friction is being redistributed in today's economy

https://kyla.substack.com/p/the-most-valuable-commodity-in-the
229•walterbell•3d ago•107 comments

Show HN: Codigo – The Programming Language Repository

https://codigolangs.com
57•adamjhf•2d ago•17 comments

ToyDB rewritten: a distributed SQL database in Rust, for education

https://github.com/erikgrinaker/toydb
108•erikgrinaker•17h ago•13 comments

LSP client in Clojure in 200 lines of code

https://vlaaad.github.io/lsp-client-in-200-lines-of-code
153•vlaaad•20h ago•23 comments