Show HN: We made GPT-4.1-mini beat 4.1 at Tic-Tac-Toe using dynamic context

https://github.com/opper-ai/opper-cookbook/tree/main/examples/tictactoe-tournament
5•farouqaldori•6mo ago
We wanted to test if a smaller model like GPT-4.1-mini could beat its bigger brother 4.1 at the game Tic-Tac-Toe using only context engineering.

We put them in a 100-game tournament. For the smaller model, we gave it a few examples of winning moves from past games right before it made its own move.
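
As a rough illustration of the idea (not the code from the repo), here is a minimal sketch of how winning moves from past games could be injected into the prompt right before a move. The `WinningMove` record, the board encoding, the similarity heuristic, and every function name are assumptions made for this example; the post does not say how examples were selected or formatted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WinningMove:
    """Hypothetical record of a move that led to a win in a past game."""
    board: str  # 9 characters, row-major, "X", "O", or "." for empty
    move: int   # index 0-8 of the square that was played


def similarity(a: str, b: str) -> int:
    """Count the squares where two boards hold the same mark."""
    return sum(1 for x, y in zip(a, b) if x == y)


def select_examples(board: str, history: list[WinningMove], k: int = 3) -> list[WinningMove]:
    """Pick up to k past winning moves that are legal now and most resemble the board.

    Assumption: "most similar board" is only one plausible selection rule.
    """
    legal = [ex for ex in history if board[ex.move] == "."]
    ranked = sorted(legal, key=lambda ex: similarity(board, ex.board), reverse=True)
    return ranked[:k]


def render(board: str) -> str:
    """Format the 9-character board as three rows of three."""
    return "\n".join(board[i:i + 3] for i in range(0, 9, 3))


def build_prompt(board: str, player: str, history: list[WinningMove], k: int = 3) -> str:
    """Assemble the prompt: instructions, selected examples, then the current board."""
    parts = [f"You are playing Tic-Tac-Toe as {player}."]
    for ex in select_examples(board, history, k):
        parts.append(f"Example of a winning move:\n{render(ex.board)}\nMove: {ex.move}")
    parts.append(f"Current board:\n{render(board)}\nReply with the index (0-8) of your move.")
    return "\n\n".join(parts)


if __name__ == "__main__":
    past = [WinningMove("XX.OO....", 2), WinningMove("XXOOO.X..", 5)]
    print(build_prompt("XX..O....", "X", past, k=1))
```

The string returned by `build_prompt` would be sent to the smaller model on each turn. Setting `k=0` gives the no-examples baseline, and varying `k` is one way to probe how many examples actually help.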

The results were clear. Without the examples, the smaller model struggled against GPT-4.1. With the examples, its effectiveness increased by nearly 200%, and it consistently won.

It's a simple demonstration, but it shows that a smaller, faster model with good, timely examples can outperform a more capable base model.

The full write up and code are in the repo.

Comments

totisjosema•6mo ago
Other author here. This started as an experiment to see how much model performance improves when you give the models examples: basically, how big a difference do examples actually make? We also wanted to explore whether there's an ideal number of examples that gives the best results. It was quite fun, and the setup scales to pitting any LLMs you want against each other…

We have a short video walkthrough of the setup here: https://www.youtube.com/watch?v=z1MhXgmHbwk