Ask HN: Share real complaints about outsourcing data annotation

4•yogoism•8mo ago
Hi HN,

I’m mapping the data-annotation vendor landscape for an upcoming study.

For many AI teams, outsourcing labeling is a strategic way to accelerate projects—but it isn’t friction-free.

If you’ve worked with an annotation provider, what specific problems surfaced? Hidden costs, accuracy drift, privacy hurdles, tooling gaps, slow iterations—anything that actually happened. Please add rough project scale or data type if you can.

Your firsthand stories will give a clearer picture of where the industry still needs work. Thanks!

Comments

fzwang•8mo ago
We've explored using external vendors for data labeling and annotation work for a few projects (image and text data). I think overall the problem is more along the lines of misaligned or drifting incentives. It's like Goodhart's law, where whatever metric you use for outcomes tends to be manipulated or to have unintended consequences. And putting in the trusted systems to identify bad or shifting metrics is costly in a way that makes outsourcing not worth it.

In most cases, we've opted to build the data labeling operation in-house, so we have more control over the quality and can adjust on the fly. It's slower and more costly upfront, but we see better outcomes in the long run because we get higher-quality data.

yogoism•8mo ago
Greetings from Japan.

Thank you for sharing such an insightful point. This really resonates, speaking from my experience as an annotator on crowdsourcing platforms. I also found that a genuine commitment to quality from fellow annotators can be quite rare.

This makes me curious about a few things:

1. What are some concrete examples of the "unintended consequences" you ran into?

2. When you initially considered outsourcing, what was the main benefit you were hoping for (e.g., speed, cost)?

3. On the flip side, what have been the biggest frustrations or challenges with the in-house approach?

Would love to hear your thoughts on any of these. Thanks!

fzwang•8mo ago
1) RE: Unintended consequences - It was usually some mix of willful or accidental misinterpretation of what we wanted. I can't go into details, but in many cases the annotators are really aiming to maximize billable activity. In situations where there was some ambiguity, they would pick one interpretation and just go with it without really making the effort to verify. In some ways, I understand their perspective, in the sense that they know their work is a commodity and would just do the minimally viable job to get paid.

2) RE: Benefits of outsourcing - The primary benefit was usually speed to get to a certain dataset scale. These vendors had existing pools of workers, which we could access immediately. There were potential cost savings, but they were never as good as we had projected. The quality of labeling would be less than ideal, which would trigger interventions to verify or improve annotations, which then added to cost and complexity.

3) RE: In-house ops - Essentially, moving things in-house doesn't magically solve the issues we had. It's a lot of work to recruit and organize data labeling teams. They are still subject to the same incentive-misalignment problems as outsourced teams, but we obviously have a closer relationship with them and that seems to help. We try to communicate to them the importance of their work, especially early on, when their feedback and "feel" for the data is very valuable. And it's much, much more expensive, but all things considered still the "right" approach in many cases. In some scenarios, we can amplify some of their work by using synthetic data generators etc.