frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs

https://arxiv.org/abs/2601.14340
1•PaulHoule•12s ago•0 comments

Show HN: AI Agent Tool That Keeps You in the Loop

https://github.com/dshearer/misatay
1•dshearer•1m ago•0 comments

Why Every R Package Wrapping External Tools Needs a Sitrep() Function

https://drmowinckels.io/blog/2026/sitrep-functions/
1•todsacerdoti•1m ago•0 comments

Achieving Ultra-Fast AI Chat Widgets

https://www.cjroth.com/blog/2026-02-06-chat-widgets
1•thoughtfulchris•3m ago•0 comments

Show HN: Runtime Fence – Kill switch for AI agents

https://github.com/RunTimeAdmin/ai-agent-killswitch
1•ccie14019•6m ago•1 comments

Researchers surprised by the brain benefits of cannabis usage in adults over 40

https://nypost.com/2026/02/07/health/cannabis-may-benefit-aging-brains-study-finds/
1•SirLJ•7m ago•0 comments

Peter Thiel warns the Antichrist, apocalypse linked to the 'end of modernity'

https://fortune.com/2026/02/04/peter-thiel-antichrist-greta-thunberg-end-of-modernity-billionaires/
1•randycupertino•8m ago•2 comments

USS Preble Used Helios Laser to Zap Four Drones in Expanding Testing

https://www.twz.com/sea/uss-preble-used-helios-laser-to-zap-four-drones-in-expanding-testing
2•breve•13m ago•0 comments

Show HN: Animated beach scene, made with CSS

https://ahmed-machine.github.io/beach-scene/
1•ahmedoo•14m ago•0 comments

An update on unredacting select Epstein files – DBC12.pdf liberated

https://neosmart.net/blog/efta00400459-has-been-cracked-dbc12-pdf-liberated/
1•ks2048•14m ago•0 comments

Was going to share my work

1•hiddenarchitect•18m ago•0 comments

Pitchfork: A devilishly good process manager for developers

https://pitchfork.jdx.dev/
1•ahamez•18m ago•0 comments

You Are Here

https://brooker.co.za/blog/2026/02/07/you-are-here.html
3•mltvc•22m ago•0 comments

Why social apps need to become proactive, not reactive

https://www.heyflare.app/blog/from-reactive-to-proactive-how-ai-agents-will-reshape-social-apps
1•JoanMDuarte•23m ago•1 comments

How patient are AI scrapers, anyway? – Random Thoughts

https://lars.ingebrigtsen.no/2026/02/07/how-patient-are-ai-scrapers-anyway/
1•samtrack2019•23m ago•0 comments

Vouch: A contributor trust management system

https://github.com/mitchellh/vouch
2•SchwKatze•23m ago•0 comments

I built a terminal monitoring app and custom firmware for a clock with Claude

https://duggan.ie/posts/i-built-a-terminal-monitoring-app-and-custom-firmware-for-a-desktop-clock...
1•duggan•24m ago•0 comments

Tiny C Compiler

https://bellard.org/tcc/
1•guerrilla•26m ago•0 comments

Y Combinator Founder Organizes 'March for Billionaires'

https://mlq.ai/news/ai-startup-founder-organizes-march-for-billionaires-protest-against-californi...
1•hidden80•26m ago•2 comments

Ask HN: Need feedback on the idea I'm working on

1•Yogender78•27m ago•0 comments

OpenClaw Addresses Security Risks

https://thebiggish.com/news/openclaw-s-security-flaws-expose-enterprise-risk-22-of-deployments-un...
2•vedantnair•27m ago•0 comments

Apple finalizes Gemini / Siri deal

https://www.engadget.com/ai/apple-reportedly-plans-to-reveal-its-gemini-powered-siri-in-february-...
1•vedantnair•28m ago•0 comments

Italy Railways Sabotaged

https://www.bbc.co.uk/news/articles/czr4rx04xjpo
6•vedantnair•28m ago•2 comments

Emacs-tramp-RPC: high-performance TRAMP back end using MsgPack-RPC

https://github.com/ArthurHeymans/emacs-tramp-rpc
1•fanf2•29m ago•0 comments

Nintendo Wii Themed Portfolio

https://akiraux.vercel.app/
2•s4074433•34m ago•2 comments

"There must be something like the opposite of suicide "

https://post.substack.com/p/there-must-be-something-like-the
1•rbanffy•36m ago•0 comments

Ask HN: Why doesn't Netflix add a “Theater Mode” that recreates the worst parts?

2•amichail•37m ago•0 comments

Show HN: Engineering Perception with Combinatorial Memetics

1•alan_sass•43m ago•2 comments

Show HN: Steam Daily – A Wordle-like daily puzzle game for Steam fans

https://steamdaily.xyz
1•itshellboy•45m ago•0 comments

The Anthropic Hive Mind

https://steve-yegge.medium.com/the-anthropic-hive-mind-d01f768f3d7b
2•spenvo•45m ago•0 comments
Open in hackernews

Ask HN: How do you handle logging and evaluation when training ML models?

3•calepayson•2mo ago
Hi all, I'm currently in a few ML classes and, while they do a great job covering theory, they don't cover application. At least not past some basic implementations in a Jupyter Notebook.

One friction point I keep running into is how to handle logging and evaluation of the models. Right now I'm using Jupyter Notebook, I'll train the model, then produce a few graphs for different metrics with the test set.

This whole workflow seems to be the standard among the folks in my program but I can't shake the feeling that it seems vibes-based and sub optimal.

I've got a few projects coming up and I want to use them as a chance to improve my approach to training models. What method works for you? Are there any articles or libraries that you would recommend? What do you wish Jr. Engineers new about this?

Thanks!

Comments

calepayson•2mo ago
For now, the plan is to move from Jupyter back to a text editor. Jupyter is very forgiving of mistakes. The model didn't work? Change some parameters and rerun the training cell. This is amazing for new folks, who are being bombarded by new information, and (it sounds like) for experienced folks who have already developed great habits around ML projects. But I think intermediate folks need a little friction to help hammer home why best practice is best practice.

I'm hoping the text editor + project directory approach helps force ML projects away from a single file and towards some sort of codified project structure. Sometimes it just feels like there's too much information in a file and it becomes hard to assign it to a location mentally (a bit like reading a physical copy of a tough book vs a kindle copy). Any advice or thoughts on this would be appreciated!

-1•2mo ago
I’m no ML expert so take what I say with a grain of salt.

Two resources that might be useful are AWS’ SageMaker documentation and the Machine Learning Engineering book by Andriy Burkov. This book doesn’t really go into detail on logging though. One way to evaluate a model is to run a SageMaker processing job that saves the performance metrics in a json file in S3 somewhere. More info on processing jobs: https://docs.aws.amazon.com/sagemaker/latest/dg/processing-j... . AWS has various services for logging which you can look into. This will mostly apply to orgs using AWS, but it might give a sense of how things can be done more generally.