Faking a JPEG

https://www.ty-penguin.org.uk/~auj/blog/2025/03/25/fake-jpeg/
93•todsacerdoti•3h ago

Comments

lblume•2h ago
Given that current LLMs do not consistently output total garbage, and can be used as judges in a fairly efficient way, I highly doubt this could even in theory have any impact on the capabilities of future models. Once (a) models are capable enough to distinguish between semi-plausible garbage and possibly relevant text and (b) companies are aware of the problem, I do not think data poisoning will be an issue at all.
jesprenj•2h ago
Yes, but you still waste their processing power.
immibis•1h ago
There's no evidence that the current global DDoS is related to AI.
bschwindHN•2h ago
You should generate fake but believable EXIF data to go along with your JPEGs too.
derektank•2h ago
From the headline that's actually what I was expecting the link to discuss
mrbluecoat•2h ago
> I felt sorry for its thankless quest and started thinking about how I could please it.

A refreshing (and amusing) attitude versus getting angry and venting on forums about aggressive crawlers.

ASalazarMX•2h ago
Helped without doubt by the capacity to inflict pain and garbage unto those nasty crawlers.
dheera•2h ago
> So the compressed data in a JPEG will look random, right?

I don't think JPEG data is compressed enough to be indistinguishable from random.

SD VAE with some bits lopped off gets you better compression than JPEG and yet the latents don't "look" random at all.

So you might think Huffman encoded JPEG coefficients "look" random when visualized as an image but that's only because they're not intended to be visualized that way.

maxbond•39m ago
Encoded JPEG data is random in the same way cows are spherical.
BlaDeKke•10m ago
Cows can be spherical.
EspadaV9•2h ago
I like this one

https://www.ty-penguin.org.uk/~auj/spigot/pics/2025/03/25/fa...

Some kind of statement piece

myelinsheep•5m ago
Anything with Shakespeare in it?
hashishen•1h ago
the hero we needed and deserved
derefr•1h ago
> It seems quite likely that this is being done via a botnet - illegally abusing thousands of people's devices. Sigh.

Just because traffic is coming from thousands of devices on residential IPs doesn't mean it's a botnet in the classical sense. It could just as well be people signing up for a "free VPN service" (or a tool that "generates passive income" for them), where the actual cost of running the software is that you become an exit node for both other "free VPN service" users' traffic and the traffic of users of the VPN's sibling commercial brand. (E.g. scrapers like this one.)

This scheme is known as "proxyware" — see https://www.trendmicro.com/en_ca/research/23/b/hijacking-you...

cAtte_•1h ago
sounds like a botnet to me
ronsor•1h ago
because it is, but it's a legal botnet
derefr•1h ago
Eh. To me, a bot is something users don't know they're running, and would shut off if they knew it was there.

Proxyware is more like a crypto miner — the original kind, from back when crypto-mining was something a regular computer could feasibly do with pure CPU power. It's something users intentionally install and run and even maintain, because they see it as providing them some potential amount of value. Not a bot; just a P2P network client.

Compare/contrast: https://en.wikipedia.org/wiki/Winny / https://en.wikipedia.org/wiki/Share_(P2P) / https://en.wikipedia.org/wiki/Perfect_Dark_(P2P) — pieces of software which offer users a similar devil's bargain, but instead of "you get a VPN; we get to use your computer as a VPN", it's "you get to pirate things; we get to use your hard drive as a cache node in our distributed, encrypted-and-striped pirated media cache."

(And both of these are different still to something like BitTorrent, where the user only ever seeds what they themselves have previously leeched — which is much less questionable in terms of what sort of activity you're agreeing to play host to.)

tgsovlerkhgsel•56m ago
AFAIK much of the proxyware runs without the informed consent of the user. Sure, there may be some note on page 252 of the EULA of whatever adware the user downloaded, but most users wouldn't be aware of it.
marcod•46m ago
Reading about Spigot made me remember https://www.projecthoneypot.org/

I was very excited 20 years ago, every time I got emails from them saying that the scripts and donated MX records on my website had helped catch a harvester.

> Regardless of how the rest of your day goes, here's something to be happy about -- today one of your donated MXs helped to identify a previously unknown email harvester (IP: 172.180.164.102). The harvester was caught a spam trap email address created with your donated MX:

puttycat•42m ago
> compression tends to increase the entropy of a bit stream.

Does it? Encryption increases entropy, but not sure about compression.

JCBird1012•30m ago
I can see what was meant with that statement. I do think compression increases Shannon entropy by virtue of it removing repeating patterns of data - Shannon entropy per byte of compressed data increases since it’s now more “random” - all the non-random patterns have been compressed out.

Total information entropy - no. The amount of information conveyed remains the same.

gregdeon•27m ago
Yes: the reason why some data can be compressed is because many of its bits are predictable, meaning that it has low entropy per bit.
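
A rough way to check the per-byte claim is to measure it. The sketch below is only an illustration under stated assumptions: it uses `zlib` (DEFLATE) as a stand-in for JPEG's entropy coding, the word list and sample size are arbitrary, and `entropy_bits_per_byte` is a hypothetical helper that only looks at the byte histogram (order-0 entropy), so it ignores longer-range structure.

```python
import math
import random
import zlib
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Empirical order-0 Shannon entropy of the byte histogram, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Arbitrary sample input: predictable text built from a small vocabulary.
WORDS = ["cow", "sphere", "jpeg", "huffman", "entropy", "crawler", "green", "spigot"]
rng = random.Random(0)  # fixed seed so the sample is reproducible
raw = " ".join(rng.choice(WORDS) for _ in range(20_000)).encode("ascii")

# DEFLATE removes the redundancy, so each output byte carries more information.
packed = zlib.compress(raw, 9)

raw_entropy = entropy_bits_per_byte(raw)
packed_entropy = entropy_bits_per_byte(packed)

if __name__ == "__main__":
    print(f"raw:    {len(raw)} bytes, {raw_entropy:.2f} bits/byte")
    print(f"packed: {len(packed)} bytes, {packed_entropy:.2f} bits/byte")
```

With this input the compressed stream is shorter and has higher entropy per byte than the original, which matches the per-byte reading above. It does not show that the output is indistinguishable from random, only that its byte histogram is flatter.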
Modified3019•21m ago
Love the effort.

That said, these seem to be heavily biased towards displaying green, so one “sanity” check would be if your bot is suddenly scraping thousands of green images, something might be up.
