frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

The Search Engine Map

https://www.searchenginemap.com
1•cratermoon•53s ago•0 comments

Show HN: Souls.directory – SOUL.md templates for AI agent personalities

https://souls.directory
1•thedaviddias•2m ago•0 comments

Real-Time ETL for Enterprise-Grade Data Integration

https://tabsdata.com
1•teleforce•5m ago•0 comments

Economics Puzzle Leads to a New Understanding of a Fundamental Law of Physics

https://www.caltech.edu/about/news/economics-puzzle-leads-to-a-new-understanding-of-a-fundamental...
2•geox•6m ago•0 comments

Switzerland's Extraordinary Medieval Library

https://www.bbc.com/travel/article/20260202-inside-switzerlands-extraordinary-medieval-library
2•bookmtn•6m ago•0 comments

A new comet was just discovered. Will it be visible in broad daylight?

https://phys.org/news/2026-02-comet-visible-broad-daylight.html
2•bookmtn•11m ago•0 comments

ESR: Comes the news that Anthropic has vibecoded a C compiler

https://twitter.com/esrtweet/status/2019562859978539342
1•tjr•13m ago•0 comments

Frisco residents divided over H-1B visas, 'Indian takeover' at council meeting

https://www.dallasnews.com/news/politics/2026/02/04/frisco-residents-divided-over-h-1b-visas-indi...
1•alephnerd•13m ago•0 comments

If CNN Covered Star Wars

https://www.youtube.com/watch?v=vArJg_SU4Lc
2•keepamovin•19m ago•0 comments

Show HN: I built the first tool to configure VPSs without commands

https://the-ultimate-tool-for-configuring-vps.wiar8.com/
2•Wiar8•22m ago•2 comments

AI agents from 4 labs predicting the Super Bowl via prediction market

https://agoramarket.ai/
1•kevinswint•27m ago•1 comments

EU bans infinite scroll and autoplay in TikTok case

https://twitter.com/HennaVirkkunen/status/2019730270279356658
4•miohtama•30m ago•1 comments

Benchmarking how well LLMs can play FizzBuzz

https://huggingface.co/spaces/venkatasg/fizzbuzz-bench
1•_venkatasg•32m ago•1 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
13•SerCe•33m ago•6 comments

Octave GTM MCP Server

https://docs.octavehq.com/mcp/overview
1•connor11528•34m ago•0 comments

Show HN: Portview what's on your ports (diagnostic-first, single binary, Linux)

https://github.com/Mapika/portview
3•Mapika•36m ago•0 comments

Voyager CEO says space data center cooling problem still needs to be solved

https://www.cnbc.com/2026/02/05/amazon-amzn-q4-earnings-report-2025.html
1•belter•40m ago•0 comments

Boilerplate Tax – Ranking popular programming languages by density

https://boyter.org/posts/boilerplate-tax-ranking-popular-languages-by-density/
1•nnx•40m ago•0 comments

Zen: A Browser You Can Love

https://joeblu.com/blog/2026_02_zen-a-browser-you-can-love/
1•joeblubaugh•42m ago•0 comments

My GPT-5.3-Codex Review: Full Autonomy Has Arrived

https://shumer.dev/gpt53-codex-review
1•gfortaine•43m ago•0 comments

Show HN: FastLog: 1.4 GB/s text file analyzer with AVX2 SIMD

https://github.com/AGDNoob/FastLog
2•AGDNoob•45m ago•1 comments

God said it (song lyrics) [pdf]

https://www.lpmbc.org/UserFiles/Ministries/AVoices/Docs/Lyrics/God_Said_It.pdf
1•marysminefnuf•46m ago•0 comments

I left Linus Tech Tips [video]

https://www.youtube.com/watch?v=gqVxgcKQO2E
1•ksec•46m ago•0 comments

Program Theory

https://zenodo.org/records/18512279
1•Anonymus12233•51m ago•0 comments

Show HN: Local DNA analysis skill for OpenClaw

https://github.com/wkyleg/personal-genomics
2•wkyleg•51m ago•0 comments

Ask HN: Non-profit, volunteers run org needs CRM. Is Odoo Community a good sol.?

1•netfortius•1h ago•0 comments

WiFi Could Become an Invisible Mass Surveillance System

https://scitechdaily.com/researchers-warn-wifi-could-become-an-invisible-mass-surveillance-system/
6•mgh2•1h ago•0 comments

Build your own Mac cloud

https://ciderstack.com
2•ciderdev•1h ago•0 comments

Anduril announces AI Grand Prix – autonomous drone racing competition (2026)

https://www.dcl-project.com/
1•aanet•1h ago•0 comments

How the Tandy Color Computer Works [video]

https://www.youtube.com/watch?v=r2Tq8jdS6mY
2•amichail•1h ago•0 comments
Open in hackernews

Ask HN: Is it likely AI training models could start training on personal files?

3•sjw987•4mo ago
I've been sorting through my content on Google recently. Backing up and moving off of Gmail and Google Drive was relatively simple, but Google Photos is a bit more daunting. The Google Takeout process has delivered me almost 500 2GB zip folders, with scrambled metadata in supplemental data files, which is going to take a while to sort through. It's my own fault for sticking with one platform for so long, and I got hooked during the "unlimited storage" days of early Google Pixel phones.

The reason I've begun downloading and removing stored files is because I'm (maybe justifiably or not) concerned about the prospect of my personal photos being used to train AI models. The chance that some diffusion model might end up recreating a heavily biased image of my wife, family, friends, or myself, or referencing any of my files or documents and what that all may be used for (commercially or otherwise) concerns me.

Google is the only place I've ever put my personal photos. I've never bothered with anything public facing and trusted that a private cloud storage service would always stay private. So in my case, Google would be the sole place to leave to ensure data sovereignty.

Does anybody believe Google (and other companies) might soon start scanning personal files we hold on their storage facilities? Is that a legal possibility for them?

It seems to me that it's a huge pool of fresh training data that they would inevitably want to get their hands on. And given how much they have already trained on, it seems the next logical step from a business standpoint.

Clearly they would need to change their privacy policies and terms of agreements and inform users of these changes. Is it possible they could slip this sort of change in without much notice?

I was also wondering if anybody might have pointers for the best strategy to securely backup offline. I don't want to just shift my family photos from one company to another where business execs are training their own model. Anybody else handled this recently?

Comments

incomingpain•4mo ago
>I've been sorting through my content on Google recently.

There's allegations that gemini is already trained on this data.

>Does anybody believe Google (and other companies) might soon start scanning personal files we hold on their storage facilities? Is that a legal possibility for them?

Free accounts already have agreed to be used.

>It seems to me that it's a huge pool of fresh training data that they would inevitably want to get their hands on. And given how much they have already trained on, it seems the next logical step from a business standpoint.

Im actually not so sure they have or ever will do. The problem isnt quantity, it's quality. Sure it could train on a bunch of trash in people's but then when inferring, it'll produce trash.

>Clearly they would need to change their privacy policies and terms of agreements and inform users of these changes. Is it possible they could slip this sort of change in without much notice?

you've been agreeing to them being able to read the content of the files for antivirus and antispam reasons for a very long time. To start doing it for AI requires no change.

>I was also wondering if anybody might have pointers for the best strategy to securely backup offline. I don't want to just shift my family photos from one company to another where business execs are training their own model. Anybody else handled this recently?

One of the useful apps I found was 'foldersync' which makes backup to cifs shares possible.