https://phys.org/news/2018-02-power-grid-fluctuations-hidden...
> Electric network frequency is a signal unique over time and thus can be used in time estimation for videos.
There's a whole other problem with this, though: it's not going to survive consumer compression codecs. Because the changes are too small to be easily perceptible, codecs will simply strip them out. The whole point of video compression is to remove perceptually insignificant differences.
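For a sense of what ENF extraction involves, here's a toy sketch (my own illustration, not from the article): track the dominant frequency near the expected light-flicker harmonic in a per-frame brightness trace. Real pipelines use rolling-shutter row signals and much more careful filtering, and whether enough of the signal survives consumer compression is exactly the open question.

```python
import numpy as np

def enf_trace(mean_luma, fps, target_hz, window_s=10.0, band_hz=1.0):
    """Track the dominant frequency near target_hz over time.
    target_hz is the aliased light-flicker harmonic (e.g. ~10 Hz for 100 Hz
    flicker captured at 30 fps); mean_luma is a 1-D array of per-frame
    average brightness values."""
    win = int(window_s * fps)
    freqs = np.fft.rfftfreq(win, d=1.0 / fps)
    band = (freqs > target_hz - band_hz) & (freqs < target_hz + band_hz)
    trace = []
    for start in range(0, len(mean_luma) - win + 1, win):
        seg = mean_luma[start:start + win]
        seg = (seg - seg.mean()) * np.hanning(win)
        spectrum = np.abs(np.fft.rfft(seg))
        trace.append(freqs[band][np.argmax(spectrum[band])])
    # Small drifts in this trace are what would get matched against grid records.
    return np.array(trace)
```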
https://en.wikipedia.org/wiki/Electrical_network_frequency_a...
(Practical systems often include a generational index or a timestamp, which further helps to detect replay attacks.)
I think for the approach discussed in the paper, bandwidth is the key limiting factor, especially as video compression mangles the result, and ordinary news reporters edit the footage for pacing reasons. You want short clips to still be verifiable, so you can ask questions like "where is the rest of this footage" or "why is this played out of order" rather than just going, "there isn't enough signature left, I must assume this is entirely fake."
If a celebrity says something and person A films a true video, and person B films a video and then manipulates it, you'd be able to see that B's light code is different. But if B simply takes A's lighting data and applies it to their own video, now you can't tell which is real.
Let's assume the pixels have an 8-bit luminance depth, and let's say the 7 most significant bits are kept as-is while the signature is coded into the last bit of each pixel in a frame. A hash of the 7 most-significant bit planes could be cryptographically signed. You could copy the 8th bit plane onto a fake video, but the signature would not check out in a verifying media player, since the fake video's leading 7 bit planes won't hash to the value that was signed.
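A minimal sketch of that idea, assuming an Ed25519 signature over a SHA-256 of the masked frame (the key scheme and function names are mine, purely for illustration):

```python
import hashlib
import numpy as np
from cryptography.hazmat.primitives.asymmetric import ed25519

def msb_digest(frame: np.ndarray) -> bytes:
    """Hash only the 7 most significant bits of an 8-bit luminance frame,
    leaving the LSB plane free to carry the watermark."""
    return hashlib.sha256((frame & 0xFE).tobytes()).digest()

def sign_frame(frame, private_key: ed25519.Ed25519PrivateKey) -> bytes:
    return private_key.sign(msb_digest(frame))

def verify_frame(frame, signature, public_key) -> bool:
    try:
        public_key.verify(signature, msb_digest(frame))
        return True
    except Exception:
        return False  # MSB planes were altered, or the signature was transplanted

# Copying the LSB plane onto a fake frame doesn't help: the fake frame's
# MSB planes hash to a different digest, so verification fails.
key = ed25519.Ed25519PrivateKey.generate()
frame = np.random.randint(0, 256, (720, 1280), dtype=np.uint8)
assert verify_frame(frame, sign_frame(frame, key), key.public_key())
```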
What does this change compared to the status quo? Nothing: you can already hash and sign a full 8-bit video, and Serious-Oath that it depicts Real imagery. Your signature would also not be transplantable to someone else's video, so others can't put fake video in your mouth.
The only difference: if the signature is generated by the image sensor, and end users are unable to extract the private key, then it decreases the number of people/entities able to credibly fake a video. But it gives manufacturers great power to sign fake videos while the masses are unable to (unless they play a fake video on a high-quality screen and film it with an image sensor containing the manufacturer's private key).
This is more akin to spread spectrum approaches--you can perfectly well know the signal is there and yet finding it without knowing the key is difficult. That's why old GPS receivers took a long time to lock on--all the satellites are transmitting on top of each other, just with different keys and the signal is way below the noise floor. You apply the key for each satellite and see if you can decode something. These days it's much faster because it's done in parallel.
Plus, the code gives information about the frame it's embedded into, so you still have more work to do.
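A toy illustration of the despreading step (parameters invented, nothing GPS-specific): per sample the signal sits well below the noise floor, but correlating against the right pseudorandom code pulls it out, while a wrong code averages to roughly zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n_chips = 10_000

def prn(seed):
    """Pseudorandom +/-1 chip sequence derived from a key (the 'code')."""
    return np.random.default_rng(seed).choice([-1.0, 1.0], size=n_chips)

code = prn(seed=42)                      # the key the receiver must know
signal = 0.05 * code                     # one payload bit, far below the noise
received = signal + rng.normal(0.0, 1.0, n_chips)

# Despreading: correlate the received samples with candidate codes.
for seed in (42, 43):
    corr = np.dot(received, prn(seed)) / n_chips
    print(f"seed {seed}: correlation {corr:+.3f}")
# The correct seed lands near +0.05; the wrong seed lands near 0.
```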
Point the monitor at the wall, or desk, or whatever. Notice the radiosity and diffuse light scattering on the wall (and on the desk, and on the reflection on the pen cap, and on their pupils).
Now you can take a video that was purported to be taken at 1:23pm at $LOCATION and validate/reconstruct the expected "excess" RGB data and then compare to the observed excess RGB data.
What they say they've done as well is to not just embed a "trace" of the expected RGB values at each point in time but also a data stream (e.g. a 1 FPS PNG) which kind of self-authenticates the previous second of video.
Obviously it's not RGB, but "noise" in the white channels, and not a PNG, but whatever other image compression they've figured works well for the purpose.
In the R, G, B case you can imagine that it's resistant to (or durable through) most edits (e.g. cuts, reordering), and it's interesting that they're talking about detecting whether someone has photoshopped a vase full of flowers into the video (because they're also encoding a reference video/image in the "noise stream").
Definitely interesting for critical events and locations, but quite niche.
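If the recovered "noise stream" really does carry a low-res reference image of the scene, the consistency check could be as simple as a per-block comparison against the visible frame. A toy sketch (block size and threshold are made up, and it assumes the reference has already been upscaled to the frame's resolution):

```python
import numpy as np

def flag_tampered_blocks(frame_gray, recovered_ref, block=16, threshold=12.0):
    """Compare the visible frame against the reference image recovered from
    the watermark stream and return blocks that disagree (e.g. a vase pasted
    in after the fact). Both inputs are 2-D uint8 arrays of the same shape."""
    diff = np.abs(frame_gray.astype(float) - recovered_ref.astype(float))
    h, w = diff.shape
    flagged = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            if diff[y:y + block, x:x + block].mean() > threshold:
                # This region no longer matches what the coded light "saw".
                flagged.append((y, x))
    return flagged
```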
It's similar to any other private/public key scheme: it just serves to prove the signature was generated by the owner (here whoever owns the location at which the video is taken).
But I guess you could imagine multiple flickering patterns per location, with each pattern being owned by a different entity (an NGO + a government + a private company, for example), in essence doing a multi-sig of the video.
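On the verification side that reduces to "every party's signature over the same video digest has to check out". A sketch, assuming each entity's flicker pattern ultimately yields something like an Ed25519 signature (my assumption, nothing in the paper says this):

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric import ed25519

def verify_multisig(video_bytes: bytes, signers) -> bool:
    """signers is a list of (public_key, signature) pairs, one per entity
    (NGO, government, private company...). All of them must verify."""
    digest = hashlib.sha256(video_bytes).digest()
    for public_key, sig in signers:
        try:
            public_key.verify(sig, digest)
        except Exception:
            return False  # any single failed signature invalidates the claim
    return True
```

A threshold scheme (any k of n parties) would work the same way, just counting successes instead of requiring all of them.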
If this is the only info that's encoded, then that might not be an entirely bad idea.
(Usually, the stego-ing of info can help identify, say, a dissident who made a video that was critical of a regime. There are already other ways, but defeating them is whack-a-mole, especially if universities keep inventing more.)
> Each watermarked light source has a secret code that can be used to check for the corresponding watermark in the video and reveal any malicious editing.
If I have the dissident video, and a really big computer, can I identify the particular watermarked light sources that were present (and from there, know the location or owner)?
(Once you have an identifying code, you can go through supply chain and sales information, and through analysis of other videos, to likely determine location and/or owner/user/affiliate.)
[1] https://en.wikipedia.org/wiki/CIA_fake_vaccination_campaign_...
[2] https://www.npr.org/2021/09/06/1034631928/the-cias-hunt-for-...
If you're even considering going to all the trouble of setting up these weird lights and specialized algorithms for some event you're hosting, just shoot your own video of the event and post it. Done.
"Viewers" aren't forensic experts. They aren't going to engage with this algorithm or do some complex exercise to verify the private key of the algorithm prior to running some app on the video, they are just going to watch it.
Opponents aren't going to have difficulty relighting. Relighting is a thing Hollywood does routinely, and it's only getting easier.
Posting your own key and own video does nothing to prove the veracity of your own video. You could still have shot anything you want, with whatever edits you want, and applied the lighting in software after the fact.
I'm sure it was fun to play with the lights in the lab, but this isn't solving a problem of significance well.
I’m under the impression this isn’t for end users; it’s for enforcement within the context of intellectual property.
I’m curious to see what the value proposition is as it’s unclear who would be buying this and why. I suppose platforms might want it to prove they can help or offer services to enforce brand integrity, maybe?
One significant problem currently is long form discussions which are taken wildly out of context for the sake of propaganda, cancelling or otherwise damaging the reputation of those involved. The point isn't that a given video isn't edited originally, but that the original source video can be compared to another (whether the original was edited or not is neither here nor there).
I'm not saying this solution is the answer, but attempting to prove that videos were unedited from their original release is a pretty reasonable goal.
I also don't follow where the idea that viewers need to be forensic experts arises from? My understanding is that a video can be verified as authentic, at least in the sense of the way the original author intended. I didn't read that users would be responsible for this, but rather that it can be done when required.
This is particularly useful in cases like the one I highlighted above; where a video may be re-cut to make an argument the person (or people) in question never made (and which might be used to smear said persons–a common occurrence in the world of long form podcasting as an example).
I don’t think that’s where we are, right? People are happy to stop looking after they see the video that confirms their negative suspicions about the public figure on the other team, and just assume any negative clips from their own team are taken out of context.
Total Relighting SIGGRAPH Talk: https://www.youtube.com/watch?v=qHUi_q0wkq4
Physically Controllable Relighting of Photographs: https://www.youtube.com/watch?v=XFJCT3D8t0M
Changing the view point post process: https://www.youtube.com/watch?v=7WrG5-xH1_k
Maybe eventually we get a model that can take a video and "rotate" it, or generate a 3D scene that can be recorded at multiple angles. But maybe eventually we may get a model that can generate anything. For now, 4o can't maintain obvious consistency with so many details, and I imagine it's orders of magnitude harder to replicate spatial/lighting differences accurately enough to pass expert inspection.
If you want solid evidence that a video is real, ask for another angle. Meanwhile, anything that needs to be covered with a camera (security or witness) should have at least two.
Or maybe "we installed the right bulbs but then we set the cameras to record in 240p MPEG with 1/5 keyframe per second because nobody in the office understands how digital video works".
Anyways I'm of the opinion that the ultimate end-state of deep fakes will be some sort of hybrid system where the AI creates 3D models and animates a scene for a traditional raytracing engine. It lets the AI do what it's best at (faces, voices, movement) and eliminates most of the random inconsistencies. If that happens then faking these light patterns won't be difficult at all.
I don’t see the point of this technology. It might be useful for entities like Meta and Google, which could use it to warn of fake content. However, in practice that amounts to giving those entities more power over our perceptions and the realities we build upon them.
I'm reminded of a novel whose title I utterly cannot recall, but there's a scene where someone realizes they're in a simulation because the sidewalk cracks are wrong. They knew the real scene happened to have a particular shape (of what, I can't recall) in the cracks, while the simulation was showing random cracks.
I don't think it even requires a monotile. Just use an event-specific non-repeating pattern on a backdrop.
Unfortunately, in most of the cases where that would be useful, that's also a party pretty high on the list of “who are we concerned about manipulating video”, so...
https://www.youtube.com/watch?v=e0elNU0iOMY
https://en.wikipedia.org/wiki/Electrical_network_frequency_a...
It looks like fun research, but I think we'd get a lot better bang for our buck pursuing ways to attach annotations to a video like:
> I was there and I saw this happen
...such that you can find transitive trust paths from yourself to a verifier who annotated the video. That'll require a bit of trust hygiene that the common folk aren't prepared for, but I don't think there's any getting around preparing them for it.
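A sketch of what "find a transitive trust path to an annotator" could look like, assuming you already have a who-trusts-whom graph (the graph, names, and hop limit here are all invented for illustration):

```python
from collections import deque

def trust_path(trust_graph, me, annotator, max_hops=4):
    """Breadth-first search for a chain of trust from `me` to whoever
    annotated the video with "I was there and I saw this happen".
    trust_graph maps a person to the set of people they directly trust."""
    queue = deque([(me, [me])])
    seen = {me}
    while queue:
        person, path = queue.popleft()
        if person == annotator:
            return path                      # e.g. ['me', 'alice', 'reporter']
        if len(path) > max_hops:
            continue
        for friend in trust_graph.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, path + [friend]))
    return None  # no short-enough trust path: treat the annotation as unverified

graph = {"me": {"alice"}, "alice": {"reporter"}, "reporter": set()}
print(trust_path(graph, "me", "reporter"))   # ['me', 'alice', 'reporter']
```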
1.- Use filming devices that sign the footage with a key, with some anti-tamper protection in the device to prevent somebody from stealing the key (a rough sketch of the signing/verification flow follows at the end of this comment).
2.- The thing above is useless for most consumers of the footage, who will only see it after three or four transcodings change the bytes beyond recognition. But in a few years everybody will assume most footage on the Internet is fake, and in those cases when people (say, law enforcement) want to know for sure if something really happened, they’ll have to go through the trouble of procuring and watching the original, authenticated file.
The alternatives to 1 and 2 are:
a) To engage in an arms race, like the one which is happening with captchas right now.
b) To roll back this type of AI.
b is not going to happen, even with a societal collapse and sweeping laws against GenAI, because the way GenAI works is widely known. Unless we want to roll back technology itself, stop producing the hardware, and roll back the culture so that people no longer know how to do any of this.
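For point 1, the camera-side signing and the later "procure the original file and check it" step might look roughly like this (Ed25519 and a whole-file hash are my assumptions about how such a device could do it):

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric import ed25519

def file_digest(path: str) -> bytes:
    """Hash the original recording in chunks; any transcoding changes this."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.digest()

# In the camera, where the key lives behind the anti-tamper hardware:
def sign_recording(path: str, device_key: ed25519.Ed25519PrivateKey) -> bytes:
    return device_key.sign(file_digest(path))

# Later, by whoever procured the original file plus its detached signature:
def verify_recording(path: str, signature: bytes, device_pubkey) -> bool:
    try:
        device_pubkey.verify(signature, file_digest(path))
        return True
    except Exception:
        return False  # re-encoded, edited, or signed by a different device
```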
https://en.wikipedia.org/wiki/Cinavia
This stopped me from playing any movie that had this implemented via PS3 Media Server on my PS3. Most movies that are done by Sony Pictures will have this implemented. I ended up using my Xbox 360 instead because Microsoft would not pay to license the tech IIRC.
ranger_danger•6mo ago
I don't think there's any possible solution that cannot also be faked in itself.
xandrius•6mo ago
Encrypt some data in the video itself (ideally changing every frame), unique and creatable only by the holder of the private key. Anyone can verify it. Flag reused codes. That's it?
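The "flag reused codes" part is mechanically the easy bit: keep a registry of which video each code first appeared in and flag later occurrences. A toy sketch (who runs that registry, and the fact that "first seen" only means "first uploaded", not "genuine", are the real problems, as asked further down the thread):

```python
class CodeRegistry:
    """Track which video each embedded code was first seen in, and flag reuse."""
    def __init__(self):
        self.first_seen = {}               # code -> video id it first appeared in

    def check(self, code: bytes, video_id: str) -> str:
        if code not in self.first_seen:
            self.first_seen[code] = video_id
            return "new code"
        if self.first_seen[code] == video_id:
            return "consistent"
        return f"REUSED: first seen in {self.first_seen[code]}"

registry = CodeRegistry()
print(registry.check(b"\x01\x02", "clip-A"))   # new code
print(registry.check(b"\x01\x02", "clip-B"))   # REUSED: first seen in clip-A
```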
wongarsu•6mo ago
Or anyone else who cares enough about deepfakes and can afford the effort
wongarsu•6mo ago
I'd agree that it's a lot of effort for very marginal gain
ARob109•6mo ago
edit: forgot the link: https://people.csail.mit.edu/mrub/vidmag/
do_not_redeem•6mo ago
If you flag a reused code in 2 different videos, how do you tell which video is real?
zhivota•6mo ago
It's a lot of complexity, so probably only worthwhile for high value targets like government press conference rooms, etc.
hamburglar•6mo ago
> rather than encoding a specific message, this watermark encodes an image of the unmanipulated scene as it would appear lit only by the coded illumination
They are including scene data, presumably cryptographically signed, in the watermark, which allows for a consistency check that is not easily faked.
do_not_redeem•6mo ago
It turns out if you give an adversary physical access to hardware containing a private key, and they are motivated enough to extract it, it's pretty hard to stop them.
xandrius•6mo ago
For example, you encrypt the hash of the frame itself (+ metadata: frame number, timestamp, etc.) with a private key. My client decrypts the hash, recomputes it, and compares the two.
The problem might present itself when compressing the video, but the tagging step can be done after compression. That would also prevent resharing.
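Something along these lines, I'd guess: sign a digest that binds the (already compressed) frame bytes to their frame number and timestamp, so dropping, reordering, or re-timing frames breaks verification. Field names and the Ed25519 choice are mine:

```python
import hashlib
import struct
from cryptography.hazmat.primitives.asymmetric import ed25519

def frame_digest(frame_bytes: bytes, frame_number: int, timestamp_ms: int) -> bytes:
    """Bind the compressed frame data to its position in time."""
    meta = struct.pack(">QQ", frame_number, timestamp_ms)
    return hashlib.sha256(meta + frame_bytes).digest()

def tag_frame(frame_bytes, frame_number, timestamp_ms,
              key: ed25519.Ed25519PrivateKey) -> bytes:
    return key.sign(frame_digest(frame_bytes, frame_number, timestamp_ms))

def check_frame(frame_bytes, frame_number, timestamp_ms, sig, pubkey) -> bool:
    try:
        pubkey.verify(sig, frame_digest(frame_bytes, frame_number, timestamp_ms))
        return True
    except Exception:
        return False  # altered pixels, wrong order, or a transplanted signature
```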