I understand the desire to make it harder for bots, but 1) it doesn't seem to be effective and bots seem to be going a very different route 2) there's got to be better ways that are more effective. It's not like you're going to stop clones through this because clones can replicate by just seeing how things work and reverse engineer blackbox style.
On top of that 1.5 seconds, there is also a much larger CPU and memory cost from having to run that browser, compared to a simple direct HTTP request, which is near negligible.
So while you'll never truly defeat a sufficiently motivated actor, you may be able to drive their costs up high enough that it makes it difficult to enter the space or difficult to turn a profit if they're so inclined.
As for why the obfuscation is needed: bot management products suffer from a fundamental weakness in that, ultimately, all of them simply collect static data from the environment. It therefore makes much more sense to make the steps involved as difficult to reverse engineer as possible. Once that is done, all you need to do is slightly change the schematics of your script every few weeks and publish a new bundle, and you've got yourself a pretty unsubvertible* protection scheme.
Regarding the "trojan horse", I think someone is yet to show proof that it's a Javascript exploit.
(*Unsubvertible is obviously relative, but raising the cost of the attack from, say, $0.01/1000 requests to $10/1000 requests would massively cut down on abuse.)
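To put rough numbers on that footnote, here is a minimal sketch of the arithmetic. Only the two per-1000 prices come from the comment; the daily volume is a made-up assumption for illustration.

```python
# Prices per 1000 requests are from the comment; the daily volume is hypothetical.
CHEAP_PER_1000 = 0.01          # dollars, simple direct HTTP requests
COSTLY_PER_1000 = 10.0         # dollars, once the protection raises the bar
REQUESTS_PER_DAY = 1_000_000   # assumed volume, not a measured figure

def daily_cost(price_per_1000: float, requests: int) -> float:
    """Cost in dollars of sending `requests` requests at the given price."""
    return price_per_1000 * requests / 1000

cheap = daily_cost(CHEAP_PER_1000, REQUESTS_PER_DAY)
costly = daily_cost(COSTLY_PER_1000, REQUESTS_PER_DAY)
multiplier = COSTLY_PER_1000 / CHEAP_PER_1000

print(f"before: ${cheap:.2f}/day, after: ${costly:.2f}/day, {multiplier:.0f}x")
```

The same volume that was pocket change before becomes a real line item, which is the whole point of the argument.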
[3] https://github.com/neuroradiology/InsideReCaptcha
[4] https://www.zenrows.com/blog/bypass-cloudflare#_qEu5MvVdnILJ...
> bots seem to be going a very different route
If the "very different route" means running a headless browser, then it's a success for this tech. Because the bot must run a blackbox JS now, and this gives people a whole new avenue for running bot detection, using the bot's CPU.
None of it is perfect, and all of it can be worked around, but by providing a barrier you've restricted some of the bad actors (spambots, scrapers) from acting at all.
It's easier to deal with 100 spambots than 1000!
If you ever end up on a video that's related to drugs, there will be entire chains of bots just advertising to each other and TikTok won't find any violations when reported. But sure, I'm sure they care a whole lot about not ending up like Twitter.
TikTok is a huge company, evidence of what the support department does or doesn't do has only minor bearing on the whole company, and basically none on the engineering department.
The thing that seems most likely to me is that they care about spam, the engineering department did this one thing, and the support department is either overworked or cares less. Or really efficient which is why you only see "a lot of spam", not "literally nothing but spam".
They won't be betting that this stops that entirely, but it adds a layer of friction that is easy for them to change on a continuous basis. These things are also very good for leaving honeypots in where if someone is found to still be using something after a change you can tag them as a bot or otherwise hacking. Both of those approaches are also widely used in game anti-cheat mechanisms, and as shown there the lengths people will go to anyway are completely insane.
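A rough sketch of the honeypot idea, assuming a hypothetical server that knows which bundle version is current and when each older version was retired. Every name here (`CURRENT_VERSION`, `RETIRED_AT`, `GRACE_PERIOD`, `classify_request`) is invented for illustration and is not anything TikTok is known to do.

```python
from datetime import datetime, timedelta

# All values below are hypothetical, for illustration only.
CURRENT_VERSION = "v42"
RETIRED_AT = {"v41": datetime(2024, 1, 10), "v40": datetime(2023, 12, 20)}
GRACE_PERIOD = timedelta(days=2)  # real browsers may hold a cached bundle briefly

def classify_request(bundle_version: str, now: datetime) -> str:
    """Return 'ok', 'stale' or 'suspect' for a request signed by the given bundle version."""
    if bundle_version == CURRENT_VERSION:
        return "ok"
    retired = RETIRED_AT.get(bundle_version)
    if retired is None:
        return "suspect"  # a version the server never shipped
    if now - retired <= GRACE_PERIOD:
        return "stale"    # plausibly a cached page, let it through
    return "suspect"      # still using a scheme long after it was rotated out
```

A client that keeps signing with `v41` weeks after the rotation is probably replaying a reverse-engineered scheme, so it can be tagged quietly rather than blocked outright.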
Many popular/large websites and bot protection services usually have environment checking as a baseline and mouse-movement tracking in some of the more aggressive anti-bot checks.
It's always interesting to see how long it takes from when the measures have been defeated/publicised until the service ends up making changes to their mechanism to make you start over (hopefully not from scratch).
I was sharing this here since I thought it was a great write up, but did not intend to pass it off as my own!
There is certainly always a good amount of push and pull, though my personal concern as a contributor to yt-dlp under another alias is more about archival of the underlying media rather than automating things like comments.
YouTube also uses an interesting scheme for authenticating requests for media, which required implementing a very basic JavaScript interpreter within Python for yt-dlp. I expect this kind of thing to become even more common and complicated.
2. Despite tiktok having a giant target painted on its back for its perceived connections to the CCP, I haven't really seen any evidence that it does any more tracking/fingerprinting than most other websites (eg. facebook) or security services (eg. cloudflare or recaptcha) already do.
Take a look at the request parameters in TikTok vs. Instagram, for example.
Every request for TikTok forces you to pass most of the information that browser can collect from the end-user before server responds:
Half of the parameters are stuff relating to the app itself, or could be inferred from other sources like user-agent. The other fingerprinting stuff (eg. canvas or webgl fingerprinting) is basically industry standard and by no means unique to tiktok. Even the claim that "browser can collect from the end-user before server responds" doesn't hold up to scrutiny, because there's no meaningful difference between that, and browser check interstitials (eg. the cloudflare checkbox), which fingerprint you before letting you access the content. It's also unclear how that's more sinister than the alternative approach of sending telemetry/fingerprinting data to a separate endpoint.
Also discussed on HN
And I've gotta say, employing an AI assistant has proven to be an invaluable help in trying to understand obfuscated code. It's actually really cool to take a function of gobbledegook JavaScript and ask the AI to rewrite it in a more canonical and easily understandable way, with inline comments. Of course, there are flaws every now and then, but the ability to do this has been such a game changer for reverse engineering, IMO.
I can even ask it to take a guess at better variable/function names, and the AI can infer from the code (maybe it has seen the unobfuscated libraries during training?) what this code is actually doing on a high level and turn something like e.g(e.g) into player.initialize(player.state), which is nothing short of amazing.
So for anyone doing similar work, I cannot recommend highly enough to have an AI agent as another tool in your tool belt.
Ask more questions. Get some right answers. Repeat.
Make question asking muscle get swole.
https://github.com/jehna/humanify
What they do is ground the LLM to the AST with Babel to ensure you still get the same shape of AST out of your deobfuscation pass. Probably this tool could be cleaned up, made to work with multiple llm and parser backends, have its prompts improved, &c.
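The grounding idea can be sketched without Babel. This is a minimal Python analogue using the standard `ast` module, not humanify's actual code: it accepts a proposed rename only if the tree has the same shape once identifier names and constant values are ignored. The helper names `shape` and `same_shape` are made up for this sketch.

```python
import ast

def shape(node: ast.AST):
    """Nested tuple of node type names, ignoring identifiers and other leaf values."""
    return (type(node).__name__, tuple(shape(child) for child in ast.iter_child_nodes(node)))

def same_shape(original: str, rewritten: str) -> bool:
    """True if both sources parse to trees that differ only in names and constant values."""
    return shape(ast.parse(original)) == shape(ast.parse(rewritten))

obfuscated = "e.g(e.g)"
renamed = "player.initialize(player.state)"      # a rename an LLM might propose
tampered = "player.initialize(player.state, 1)"  # changes the call, so it must be rejected

print(same_shape(obfuscated, renamed))
print(same_shape(obfuscated, tampered))
```

A check like this lets the model be as creative as it wants with names while the tool refuses any output that adds, drops, or reorders actual code.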
But if AI can help to fight those people's work, good for humanity I guess.
That said... Is AI going to de-obfuscate/reverse engineer their obfuscated AI prompts or web apps?
> This can be achieved by using two browser extensions known as Tampermonkey for executing custom code and CSP to disable CSP so I can fetch files from blocked origins. This is so I can put latestDeobf.js in my own file server and have it be fetched each time, this is so I can easily edit the file and let the changes take effect each time I refresh. This makes it much easier to bebug when reversing functions.
I believe you can achieve the same effect without any 3rd party extensions. You can use Local Overrides in Chrome DevTools.
Great work!
Likely overkill for this use case, but no matter the client, you can in theory do whatever you want to any traffic up until the point it leaves your network.
ad-hoc code, or something with a more structured workflow, maybe?
this sounds like a fun thing to try, thanks for your time
I've used VM's for years to run Windows on top of macOS or Linux on top of Windows or macOS on top of macOS when I need an isolated testing environment. I also know that Java works via the "Javascript Virtual Machine" which I've always thought of as "Java code actually runs in its own lightweight operating system on top of the host OS, which makes it OS-agnostic". The JVM can't run on bare metal because it doesn't have hardware drivers, but presumably it could if you wrote those drivers.
But presumably the VM being discussed in TFA isn't that kind of VM, right? Bytedance didn't write an operating system in Javascript?
I've been seeing "VM" used in lots of contexts like this recently and it makes me think I must be missing something, but it's the sort of question I don't know how to Google. AIs have not been helpful either, plus I don't trust them.
And also VM223, with statements that do stuff to an array "stack": https://github.com/LukasOgunfeitimi/TikTok-ReverseEngineerin...
One obvious giveaway for a VM is laying out memory, or processing some intermediate language. In this case, it could be the latter.
In-browser, you have Chrome V8 running Javascript; that Javascript could be running an interpreted environment where abstractions are not purely business logic, but an execution model separate from domain stuff: auth, video, user, etc.
By that observation, this C snippet is a VM:
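#include <stdio.h>
int main(void) {
    char instruction = 'p'; /* or an array of instructions */
    if (instruction == 'p') {
        puts("document.appendChild(...)"); /* emit the JavaScript for this instruction */
    }
    return 0;
}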
If the program outputs to a vm.js file, it's kinda-sorta a "VM." I would call it something else, maybe a generator of sorts (for now). In my opinion, if I were working on a VM, the threshold for calling it that would be much higher than the above. On the other hand, if I had to add debugging hints to the generated Javascript that refer to an execution stack or stack pointers, that is kind of a VM idea.
> I also know that Java works via the "Javascript Virtual Machine"
Java Virtual machine. That Java and JavaScript are named the way they are is... basically a historical accident of a cross-promotion gone too far, IMO. They aren't really related (at least, in the way that the name might imply).
Now to your real question. Virtual machines are anything that is one computer pretending to be another computer. Sometimes, that's an x86_64 PC pretending to be another x86_64 PC to run a different OS. Sometimes that's an x86_64 PC pretending to be a 50-year-old mainframe ( https://opensimh.org/ really shines there). Sometimes it's an ARM laptop running macOS pretending to be an x86_64 PC so it can run Windows. And, relevant here, sometimes it's a phone pretending to be a machine that has never actually existed in hardware. You can just make up an imaginary machine that has any old characteristics you want. Maybe it has a built-in high-level network card that magically turns HTTP requests into responses without programs having to implement HTTP themselves. Maybe it has an imaginary graphics card that directly renders buttons. Maybe you imagine a CPU that runs Java opcodes directly. Whatever it is, if you can imagine a system and then write a program that emulates it, you can make a virtual machine and run stuff in it.
Oops, that was a typo! Thank you.
The VM you are familiar with indeed can run an OS, and is indeed not what TikTok does.
#1 VMM - hypervisor runs VMs
#2 JVM/.NET - efficient bytecode
#3 Obfuscation - obscure bytecode
The main thing is that for #2 and #3 the machine language changes.
With "virtualization" as used in most contexts, involving a virtual machine monitor, or hypervisor, one creates zero or more new (virtual) machines to execute multiple software recipes. All the recipes are written in the same (machine) language, for all the machines. This can help security by introducing isolation, for example, where one VM cannot read memory belonging to another VM unless the hypervisor allows it.
With the "virtual machine" used for obfuscation, the machine language changes. The system performs the same actions as it would without obfuscation, but now it is performing those actions using a different machine language. Behaviorally, the result is the same. But, the new language makes it harder to reverse engineer the behavior.
Stupid example:
Original instruction: MOV A,B
Under hypervisor virtualization, VM0 and VM1 will perform this same instruction.
Under obfuscation virtualization, software will perform instructions that amount to the same result, but are harder to figure out. So, the MOV instruction is redefined and mapped onto a new (virtual) machine. The new machine does not simply leverage the existing instruction, but rather an obfuscated sequence. For example:
A <- B + C + D * E
A <- A - C
A <- A - D * E
Obviously, the above transformation is easy to understand and undo. Others are harder to understand and undo. Look up MOVfuscator to see how crazy things may get.
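A quick check of the toy transformation above, written as a Python sketch. The function names are invented; the point is only that the three obfuscated steps leave A equal to B for any integer inputs.

```python
import random

def mov_plain(b: int, c: int, d: int, e: int) -> int:
    """Original instruction: MOV A,B."""
    a = b
    return a

def mov_obfuscated(b: int, c: int, d: int, e: int) -> int:
    """The same move spread over three steps that cancel out."""
    a = b + c + d * e
    a = a - c
    a = a - d * e
    return a

# Python integers do not overflow, so the identity holds exactly for any inputs.
for _ in range(1000):
    args = [random.randint(-10**6, 10**6) for _ in range(4)]
    assert mov_plain(*args) == mov_obfuscated(*args)
```

On fixed-width registers the intermediate sums could wrap around, but wrapping addition and subtraction still cancel, so the trick carries over.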
It's a function wrapping the functionality of its host environment. It then provides the caller with its own bytecode language to execute instructions. The virtual machine translates those instructions to the corresponding real functionality of the host environment (Javascript) upon execution.
This particular case is sophisticated but the idea is simple.
Correct me if I'm wrong. I'm not knowledgeable in this. This is my current understanding of it.
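That description can be sketched as a toy stack machine. Python stands in for the JavaScript host here, and the opcodes, the `HOST` table and the `run` function are all invented for illustration; TikTok's real instruction set is not shown in this thread.

```python
# Toy bytecode VM: made-up opcodes, dispatched to real functions of the host language.
PUSH, CALL, HALT = 0, 1, 2

# Host functionality the VM exposes, indexed by number so the bytecode never names it.
HOST = [
    lambda a, b: a + b,      # 0: add
    lambda s: s.upper(),     # 1: uppercase a string
    lambda a, b: f"{a}{b}",  # 2: concatenate
]

def run(bytecode, constants):
    """Execute bytecode and return the value left on top of the stack."""
    stack, pc = [], 0
    while True:
        op = bytecode[pc]
        if op == PUSH:                      # PUSH <constant index>
            stack.append(constants[bytecode[pc + 1]])
            pc += 2
        elif op == CALL:                    # CALL <host index> <argument count>
            fn, argc = HOST[bytecode[pc + 1]], bytecode[pc + 2]
            args = [stack.pop() for _ in range(argc)][::-1]
            stack.append(fn(*args))
            pc += 3
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op} at {pc}")

# Equivalent of: concat(upper("sig="), add(40, 2))
program = [PUSH, 0, CALL, 1, 1, PUSH, 1, PUSH, 2, CALL, 0, 2, CALL, 2, 2, HALT]
print(run(program, ["sig=", 40, 2]))
```

Someone reading only `program` sees a list of small integers; the actual behaviour lives in the interpreter loop and the host table, which is what makes this style annoying to reverse.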
Sun popularized the term "virtual machine" when marketing Java instead of using "interpreter" or "P-code", both for marketing reasons (VMware had just come on the scene and was making tech headlines), but also to get away from the perception of classic interpreters being slower than native code since Java had a JIT compiler. Just-in-time compilers that compiled to the host's machine code at runtime were well-known in research domains at the time, but were much less popular than the more dominant execution models of "AST interpreter" and "bytecode interpreter".
There might be some gatekeepers that suggest that "interpreter" means AST interpreter (not true for the Python interpreter, for instance), or VM always means JIT compiled (not true for Ruby, which calls its bytecode-based MRI "RubyVM" in a few places), but you can ignore them.
Did you know that the chip on many Chip & PIN bank cards runs a Java Virtual Machine (Java Card), which is activated when you tap the card on, or insert it into, a card reader?
xfeeefeee•18h ago
TikTok uses a custom virtual machine (VM) as part of its obfuscation and security layers. This project includes tools to:
Deobfuscate webmssdk.js that has the virtual machine.
Decompile TikTok’s virtual machine instructions into readable form.
Script Inject: Replace webmssdk.js with the deobfuscated VM injector.
Sign URLs: Generate signed URLs which can be used to perform auth-based requests, e.g. posting comments.
noduerme•15h ago
Still, I had no idea. This is really taking JS obfuscation to the next level.
One kind of wonders, what is the purpose of that level of obfuscation? The naive take is that obfuscation is usually to protect intellectual property... but this is client-side code that wouldn't give away anything about their secret sauce algorithm.
throwaway48476•15h ago
The VM term is applied because the obfuscator creates a custom instruction set and executes custom byte code. This is generated per build.
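One way to picture "generated per build" is a build step that shuffles opcode numbers and re-encodes the program to match. This is a guess at the general technique, not the actual obfuscator; `make_build`, `encode`, `run` and the three-instruction set are hypothetical.

```python
import random

MNEMONICS = ["PUSH", "ADD", "HALT"]

def make_build(seed: int) -> dict:
    """Assign each mnemonic a random opcode number; a new seed gives a new instruction set."""
    rng = random.Random(seed)
    numbers = rng.sample(range(256), len(MNEMONICS))
    return dict(zip(MNEMONICS, numbers))

def encode(program, opcodes: dict) -> list:
    """Turn [(mnemonic, operand or None), ...] into a flat list of numbers for one build."""
    out = []
    for name, operand in program:
        out.append(opcodes[name])
        if operand is not None:
            out.append(operand)
    return out

def run(bytecode, opcodes: dict) -> int:
    """Interpreter shipped with the same build, so it knows that build's numbering."""
    decode = {number: name for name, number in opcodes.items()}
    stack, pc = [], 0
    while True:
        name = decode[bytecode[pc]]
        if name == "PUSH":
            stack.append(bytecode[pc + 1])
            pc += 2
        elif name == "ADD":
            stack.append(stack.pop() + stack.pop())
            pc += 1
        else:  # HALT
            return stack.pop()

source = [("PUSH", 2), ("PUSH", 40), ("ADD", None), ("HALT", None)]
build_a, build_b = make_build(1), make_build(2)
print(encode(source, build_a), run(encode(source, build_a), build_a))
print(encode(source, build_b), run(encode(source, build_b), build_b))
```

Both builds compute the same result, but the bytes on the wire differ, so an opcode table recovered from one release would likely have to be rebuilt for the next.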
noduerme•9h ago
MonkeyClub•14h ago
From the Repo's README:
"TikTok is using a full-fledged bytecode VM, if you browse through it, it supports scopes, nested functions and exception handling. This isn't a typical VM and shows that it is definitely sophiscated."
noduerme•10h ago
Also, one major purpose of a VM is to improve performance over what's available in the browser. If you use that as a measurement, this clearly doesn't fit that goal.
gruez•7h ago
Emulators and VMs aren't mutually exclusive.
>Also, one major purpose of a VM is to improve performance over what's available in the browser. If you use that as a measurement, this clearly doesn't fit that goal.
And from your other comment:
>I would define it as a custom instruction set plus some sort of plug-in that allows those opcodes to be run closer to the metal than the language they're written in.
A virtual machine just means a machine that's virtual. All the other expectations you apply on top of it (eg. "improve performance over what's available in the browser") are totally irrelevant. The JVM clearly doesn't improve the performance of Java code compared to running it natively, but nobody denies it's a virtual machine. The same goes for VMware products ("VM" is literally in the name!), which execute x86 code but are further away from "the metal" they're running on.
userbinator•14h ago
codetrotter•11h ago
HN takes that text and turns it into a comment. I’ve seen it happen before.
The unfortunate outcome of that IMO is that sometimes text that makes sense as a description of a submission feels a bit out of place as a comment due to how they are worded. And these comments sometimes then end up getting downvoted.
I wouldn’t be completely sure it was not human written. Even though it feels a bit weird to read it as a comment.
xfeeefeee•6h ago
Yeah, this is exactly what happened, but I decided to keep it rather than delete and filled it out more with the synopsis from the repo.
Looking back at it, it really does look like an AI bulleted summary. I probably should have noted that the last part was indeed a quotation.
dmitrygr•3h ago
xfeeefeee•2h ago