Hey, have you ever thought about how cool and unique your algorithms are? A lot of programmers and companies do, which is why they can be hesitant to share their work with everyone. The problem eases a little if part of the code is moved to the server (for client-server applications), but this approach isn't always possible. Sometimes we have to leave sensitive code sections right out in the open.
In this article, we're going to take a look at obfuscation in JavaScript: ways to hide algorithms and make code harder to study. We'll also explore what an AST is and look at tools for working with it to implement obfuscation.
Here's a silly example. Let's imagine this situation:
let w = screen.width, h = screen.height;
// Let's say there's a logic with some check.
console.info(w, h);
Unfortunately, Bob can't access the giveaway page, and he's pretty upset about it. He doesn't understand why. Then he learns in the rules of the giveaway that users with big, good monitors are not allowed.
Luckily, Bob had taken some computer science classes in high school. He opens the developer console decisively by pressing F12, studies the script, and realizes that the organizers check the screen resolution. He then decides to participate from his phone and successfully passes the test.
A fictional story with a happy ending - but it would not have gone so well if the main character had seen this instead of the previous code:
l=~[];l={___:++l,$$$$:(![]+"")[l],__$:++l,$_$_:(![]+"")[l],_$_:++l,$_$$:({}+"")[l],$$_$:(l[l]+"")[l],_$$:++l,$$$_:(!""+"")[l],$__:++l,$_$:++l,$$__:({}+"")[l],$$_:++l,$$$:++l,$___:++l,$__$:++l};l.$_=(l.$_=l+"")[l.$_$]+(l._$=l.$_[l.__$])+(l.$$=(l.$+"")[l.__$])+((!l)+"")[l._$$]+(l.__=l.$_[l.$$_])+(l.$=(!""+"")[l.__$])+(l._=(!""+"")[l._$_])+l.$_[l.$_$]+l.__+l._$+l.$;l.$$=l.$+(!""+"")[l._$$]+l.__+l._+l.$+l.$$;l.$=(l.___)[l.$_][l.$_];l.$(l.$(l.$$+"\""+(![]+"")[l._$_]+l.$$$_+l.__+"\"+l.$__+l.___+"\"+l.__$+l.$$_+l.$$$+"\"+l.$__+l.___+"=\"+l.$__+l.___+"\"+l.__$+l.$$_+l._$$+l.$$__+"\"+l.__$+l.$$_+l._$_+l.$$$_+l.$$$_+"\"+l.__$+l.$_$+l.$$_+".\"+l.__$+l.$$_+l.$$$+"\"+l.__$+l.$_$+l.__$+l.$$_$+l.__+"\"+l.__$+l.$_$+l.___+",\"+l.$__+l.___+"\"+l.__$+l.$_$+l.___+"\"+l.$__+l.___+"=\"+l.$__+l.___+"\"+l.__$+l.$$_+l._$$+l.$$__+"\"+l.__$+l.$$_+l._$_+l.$$$_+l.$$$_+"\"+l.__$+l.$_$+l.$$_+".\"+l.__$+l.$_$+l.___+l.$$$_+"\"+l.__$+l.$_$+l.__$+"\"+l.__$+l.$__+l.$$$+"\"+l.__$+l.$_$+l.___+l.__+";\"+l.__$+l._$_+l.$$__+l._$+"\"+l.__$+l.$_$+l.$$_+"\"+l.__$+l.$$_+l._$$+l._$+(![]+"")[l._$_]+l.$$$_+".\"+l.__$+l.$_$+l.__$+"\"+l.__$+l.$_$+l.$$_+l.$$$$+l._$+"(\"+l.__$+l.$$_+l.$$$+",\"+l.$__+l.___+"\"+l.__$+l.$_$+l.___+");"+"\"")())();
I assure you, it's not gibberish, it's JavaScript! And it performs the same actions. You can try running it in the browser console.
I guess in this case, our hero would've just accepted his fate by not taking part in the giveaway, and the organizers would've kept their plan.
So what's the point here? Congrats - you've learned about the jjencode tool and what obfuscation is and what role it can play.
In summary, obfuscation is the process of converting program code or data into a form that's hard for humans to understand but still works for a machine or program.
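To see why symbol soup like the listing above can still be valid JavaScript, here is a much smaller sketch in the same spirit. It is not jjencode's actual output, only an illustration of the underlying trick: type coercion produces strings such as "false" or "[object Object]", their characters are picked by index, and the resulting text is handed to the Function constructor. All variable names here are made up for the example.

```javascript
// Coercing values to strings yields text whose characters can be picked by index.
const falseStr = ![] + "";   // "false"
const trueStr = !"" + "";    // "true"
const objStr = {} + "";      // "[object Object]"
const undefStr = [][0] + ""; // "undefined"

// Spell "constructor" without typing the word itself.
const ctor =
  objStr[5] + objStr[1] + undefStr[1] + falseStr[3] + trueStr[0] +
  trueStr[1] + trueStr[2] + objStr[5] + trueStr[0] + objStr[1] + trueStr[1];

// (0).constructor is Number, and Number.constructor is Function.
const Fn = (0)[ctor][ctor];

// Function builds a function from source text, so any code can be run this way.
const answer = Fn("return 6 * 7")();
console.log(ctor, answer);
```

An encoder in this style applies the same idea to every character of the payload, which is why the result contains almost nothing but punctuation.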
Enough theory, let's move on to more practical examples. Now let's try to convert the code with the kind of obfuscation you are more likely to find on the Internet. Let's take a more interesting piece of code that contains our “know-how” operations. It is highly undesirable that anyone who can be bothered to press F12 can find out about them:
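function getGpuData() {
  let cnv = document.createElement("canvas");
  let ctx = cnv.getContext("webgl");
  const rendererInfo = ctx.getParameter(ctx.RENDERER);
  const vendorInfo = ctx.getParameter(ctx.VENDOR);
  return [rendererInfo, vendorInfo];
}

function getLanguages() {
  return window.navigator.languages;
}

let data = {};
data.gpu = getGpuData();
data.langs = getLanguages();
console.log(JSON.stringify(data));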
This code collects device and browser data and outputs the result to the console, for example (we'll use the output to check that the code still works after each transformation):
{"gpu":["ANGLE (NVIDIA, NVIDIA GeForce GTX 980 Direct3D11 vs_5_0 ps_5_0), or similar","Mozilla"],"langs":["en-US","en"]}
Now let's take the above code and run it through a popular JS obfuscator, obfuscator.io. As a result, we get code like this:
function getGpuData(){ let cnv = document.createElement("canvas"); let ctx = cnv.getContext("webgl"); const rendererInfo = ctx.getParameter(ctx.RENDERER); const vendorInfo = ctx.getParameter(ctx.VENDOR); return [rendererInfo, vendorInfo] } function getLanguages(){ return window.navigator.languages; } let data = {}; data.gpu = getGpuData(); data.langs = getLanguages(); console.log(JSON.stringify(data))
Voila! Now only a machine will be happy to parse this code (you and I are probably not among them). Nevertheless, it still works and produces the same result. Note the changes:
The last technique is perhaps the nastiest here in terms of burdening static code analysis.
Alright, looks like all the secrets are hidden. Shall we deploy the code to production?
Wait... If there are services for code obfuscation, perhaps there are some that can undo it? Absolutely, and more than one! Let's try one of them, webcrack, and see if we can get the original, readable code back. Below is the result of using this deobfuscator:
{"gpu":["ANGLE (NVIDIA, NVIDIA GeForce GTX 980 Direct3D11 vs_5_0 ps_5_0), or similar","Mozilla"],"langs":["en-US","en"]}
Oops. Of course, it did not restore the variable names, but thanks for the rest.
So it turns out that the only obstacle to calmly studying our code in this case is the researcher's willingness to run a deobfuscator. Other solutions and customizations are certainly possible, but for any popular obfuscation we should most likely expect a popular deobfuscation.
Should we despair and give up our secrets without a fight? Of course not! Let's see what more we can do...
An obfuscator - sounds like some kind of mage from a fantasy universe, doesn't it? 🧙‍♂️
Certainly, some people obfuscate code as they write it and are born magicians. You may even have cast such spells unintentionally yourself for a while. But what should you do if those skills have disappeared thanks to the criticism of “senior programmers”, and you have an idea that could make a program hard to investigate? In this case, it makes sense to turn to tools that work with the code structure itself and let you modify it. Let's take a look at them.
You could also try to modify the code by treating it simply as text, replacing certain constructions with regular expressions and so on. But I would say that this way you are more likely to ruin your code and waste your time than to obfuscate anything.
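A tiny illustration of that risk, using a made-up snippet: suppose we only want to rename the local variable test. A regular expression has no notion of scopes, strings, or property names, so it rewrites every match.

```javascript
// Hypothetical source text; only the local variable `test` should be renamed.
const source =
  'function test() { let test = "test value"; return obj.test + test; }';

// A word-boundary regex cannot tell the variable apart from anything else.
const renamed = source.replace(/\btest\b/g, "changed");

console.log(renamed);
// The function name, the string contents and the obj.test property
// were all rewritten along with the variable.
```

The string literal now reads "changed value" and the code accesses obj.changed, so the program's behavior has silently changed.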
For more reliable and controlled modification, it makes sense to convert the code into an abstract structure, a tree (AST - abstract syntax tree). By walking this tree, we can change the elements and constructs we are interested in.
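As a rough picture of what such a tree looks like, here is a hand-written, heavily simplified AST for the statement let w = screen.width; together with a small walker. The node type names follow the Babel-style convention, but position data and other fields a real parser emits are omitted, and the walk helper is a hypothetical stand-in rather than a Babel API.

```javascript
// Simplified, hand-written AST for: let w = screen.width;
const ast = {
  type: "VariableDeclaration",
  kind: "let",
  declarations: [
    {
      type: "VariableDeclarator",
      id: { type: "Identifier", name: "w" },
      init: {
        type: "MemberExpression",
        object: { type: "Identifier", name: "screen" },
        property: { type: "Identifier", name: "width" },
        computed: false,
      },
    },
  ],
};

// Hypothetical helper: visit every node of the tree, depth first.
function walk(node, visit) {
  if (!node || typeof node.type !== "string") return;
  visit(node);
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child && typeof child === "object") walk(child, visit);
    }
  }
}

// Collect the names of all identifiers in source order.
const names = [];
walk(ast, (node) => {
  if (node.type === "Identifier") names.push(node.name);
});
console.log(names);
```

Because every identifier is a separate node with a known parent, a transformation can decide precisely which ones to touch, which plain text replacement cannot do.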
There are different solutions for working with JS code, and they differ in the AST they produce. In this article, we will use babel for this purpose. You don't need to install anything: you can experiment with everything on a resource such as astexplorer.
(If you don't want to mess with babel, check out shift-refactor. It lets you interact with the AST using CSS selectors, a fairly minimalistic and convenient approach for studying and modifying code. But it uses its own AST format, different from babel's. You can test your CSS queries for this tool in the shift-query interactive demo.)
Now let's see how these tools can be easily used without leaving the browser, based on a simple example. Suppose we need to change the name of the test variable in the same-named function to changed:
Paste this code into astexplorer (select JavaScript and @babel/parser at the top), and it should appear there as an AST. You can click on the test variable to see the syntax for this code section in the right window:
To solve our problem, we can write the following babel plugin, which parses our code, looks for all identifiers in it, and renames them if certain conditions are met. Let's paste it into the bottom left window in astexplorer (turn on the transform slider and select babelv7 to make it appear):
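As a rough, Babel-free sketch of this kind of renaming, the example below uses a visitor object in the shape Babel plugins use (a handler per node type that receives a path with node and parent) and runs it over a hand-written AST. The sample function, the traverse stand-in, and the exact renaming condition are assumptions made for illustration; they are not the article's plugin.

```javascript
// Hand-written, simplified AST for the hypothetical input:
//   function test() { let test = 1; return test; }
const ast = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "test" },
  params: [],
  body: {
    type: "BlockStatement",
    body: [
      {
        type: "VariableDeclaration",
        kind: "let",
        declarations: [
          {
            type: "VariableDeclarator",
            id: { type: "Identifier", name: "test" },
            init: { type: "NumericLiteral", value: 1 },
          },
        ],
      },
      { type: "ReturnStatement", argument: { type: "Identifier", name: "test" } },
    ],
  },
};

// Visitor: rename identifiers called "test" unless their parent is the
// function declaration itself (assumed condition for this sketch).
const renameVisitor = {
  Identifier(path) {
    const underFunction =
      path.parent !== null && path.parent.type === "FunctionDeclaration";
    console.log("Identifier:", path.node.name); // debugging aid
    if (path.node.name === "test" && !underFunction) {
      path.node.name = "changed";
    }
  },
};

// Stand-in for Babel's traversal; it only supplies node and parent.
function traverse(node, visitor, parent = null) {
  if (!node || typeof node.type !== "string") return;
  if (visitor[node.type]) visitor[node.type]({ node, parent });
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child && typeof child === "object") traverse(child, visitor, node);
    }
  }
}

traverse(ast, renameVisitor);
```

In a real babel plugin the same visitor would be returned under a visitor key, and path.scope.rename(oldName, newName) is usually the safer way to rename a binding, since it follows scopes instead of comparing names.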
Console output is included in this plugin for a reason: it lets us debug the plugin by examining the output in the browser console. In this case, we output information about all nodes of the Identifier type. This information contains data about the node itself (node), the parent node (parent), and the environment (scope, which contains the variables created in the current context and references to them):
Thus, in the bottom right window, we can see that the variable in our source code has been successfully renamed without affecting other identifiers:
I hope this example made it a little clearer how we can parse and modify code. Anyway, let me summarize the work done:
It is clear now how code modification can be done. Let's try something more useful that we can actually call obfuscation :) We'll take the more complex code we tried to obfuscate in the previous section and change all the variable and function names in it to random ones. That way, a potential reverse engineer has less information about the purpose of individual code elements.
Also, feel free to use any JS code to debug problems. As they say, there's no better teacher than pain.
The following plugin will help us to get the job done:
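One possible shape for the core of such a renaming step, independent of Babel, is a helper that hands out random names and remembers which original name got which replacement, so every reference to the same binding is renamed consistently. The createRenamer helper and the name format below are assumptions for illustration.

```javascript
// Hypothetical helper: returns a function that maps each original name
// to a random replacement, always the same one for the same input.
function createRenamer(length = 8) {
  const alphabet = "abcdefghijklmnopqrstuvwxyz";
  const mapping = new Map(); // original name -> random name
  const used = new Set();    // guards against accidental collisions

  function randomName() {
    let name;
    do {
      name = "_";
      for (let i = 0; i < length; i++) {
        name += alphabet[Math.floor(Math.random() * alphabet.length)];
      }
    } while (used.has(name));
    used.add(name);
    return name;
  }

  return function rename(original) {
    if (!mapping.has(original)) mapping.set(original, randomName());
    return mapping.get(original);
  };
}

// Names declared by the sample code; globals such as document, window and
// console, as well as property names, must be left alone.
const declared = ["getGpuData", "cnv", "ctx", "rendererInfo", "vendorInfo", "getLanguages", "data"];
const rename = createRenamer();
const renamedPairs = declared.map((name) => [name, rename(name)]);
console.log(renamedPairs);
```

Inside a babel plugin, the new name would then typically be applied with path.scope.rename so that only the binding and its references change.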
What does this code do? Pretty much the same as in the previous example:
As a result of running our plugin, we get the following code with random variable and function names:
You can check it by executing the code in the console - after our manipulations, it still works! And this is the main quality of a good obfuscator ✨.
But what about the quality of our obfuscation? In my opinion, the evil is not too strong yet: even with the names replaced, an experienced programmer will easily understand the purpose of this code. And what's the point if any JS minifier can handle this task? Can we do something more practical and troublesome for a reverser? There is one more spell...
I may have been a bit overconfident when I wrote “everything”, but what we are going to do now will hide what our code does as much as possible. In this section, we will conceal strings and object properties to complicate static analysis and potentially discourage the “client” from digging into our code!
Let's take the code with hidden names obtained at the previous stage and apply the following plugin to it:
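To make the idea concrete without Babel, here is a minimal sketch of what the transformed code relies on at run time: a table of base64-encoded strings and a getData helper that returns an entry by index. The table contents, the fakeDocument stand-in, and building the table with btoa at run time are assumptions for illustration; in shipped code the table would hold ready-made encoded literals. The sketch also assumes an environment where atob and btoa are globals (browsers, recent Node.js).

```javascript
// Build step (assumption: done once before shipping): encode every string.
const hidden = ["createElement", "canvas"].map((text) => btoa(text));

// Runtime helper that ships with the protected code.
function getData(index) {
  return atob(hidden[index]);
}

// Stand-in for the browser's document object so the sketch runs anywhere.
const fakeDocument = {
  createElement(tag) {
    return { tag };
  },
};

// Before: fakeDocument.createElement("canvas")
// After:  both the property name and the string are fetched by index.
const element = fakeDocument[getData(0)](getData(1));
console.log(hidden, element);
```

Dot access becomes computed access, so neither the method name nor the argument appears in readable form in the source.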
I have already described a little of how this plugin works in the code comments, but let's briefly go through what it does step by step:
It is worth mentioning that these operations are not performed sequentially, but whenever a matching node is found during AST traversal.
As a result of executing this plugin, we will get the following code:
As you can see from the resulting code, all properties have been replaced by getData function calls with a given index. We did the same with strings, which are now obtained through function calls too. The property names and strings themselves were encoded with base64 to make them harder to spot...
I guess you have already noticed that this plugin, and the code in general, has flaws at this stage. For example, the following things could be corrected:
Despite all this simplicity and these downsides, I think it can already be called obfuscation. But then again, how do we differ from the open-source obfuscators, since they do similar things?
We have to remember the original problem: those obfuscations were a piece of cake for public deobfuscators. Now let's take the code we got and try to deobfuscate it in webcrack (hopefully it still can't tackle our spell). I guess you could say the practical goal has been achieved: our “protected” code can no longer be restored in one click with a public deobfuscator.
Now let's learn a brand-new spell. Although public deobfuscators cannot handle our plugins, once we have studied the concept behind our obfuscation we can notice patterns that can be used to restore the source code.
Let's get into it, and specifically take advantage of:
Given these weaknesses, we can implement the following plugin:
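A Babel-free sketch of the reverse transformation might look like the following. It assumes the pattern described earlier (every hidden value is fetched as getData(index) from a base64 table), evaluates those calls ahead of time, and turns computed property access back into dot access. The AST is hand-written and simplified, and the restore and call helpers are hypothetical names, not part of any library.

```javascript
// Assumed obfuscation runtime: a base64 table plus the getData helper.
const hidden = [btoa("canvas"), btoa("createElement")];
const getData = (index) => atob(hidden[index]);

// Builds the simplified AST node for the call getData(index).
function call(index) {
  return {
    type: "CallExpression",
    callee: { type: "Identifier", name: "getData" },
    arguments: [{ type: "NumericLiteral", value: index }],
  };
}

// Simplified AST for: document[getData(1)](getData(0))
const obfuscated = {
  type: "CallExpression",
  callee: {
    type: "MemberExpression",
    computed: true,
    object: { type: "Identifier", name: "document" },
    property: call(1),
  },
  arguments: [call(0)],
};

// Rebuild the tree bottom-up, undoing both patterns where they match.
function restore(node) {
  if (Array.isArray(node)) return node.map(restore);
  if (!node || typeof node !== "object") return node;
  const out = {};
  for (const [key, value] of Object.entries(node)) out[key] = restore(value);

  // getData(<number>) -> the decoded string literal
  if (
    out.type === "CallExpression" &&
    out.callee.type === "Identifier" &&
    out.callee.name === "getData" &&
    out.arguments.length === 1 &&
    out.arguments[0].type === "NumericLiteral"
  ) {
    return { type: "StringLiteral", value: getData(out.arguments[0].value) };
  }

  // obj["name"] -> obj.name when the string is a valid identifier
  if (
    out.type === "MemberExpression" &&
    out.computed &&
    out.property.type === "StringLiteral" &&
    /^[A-Za-z_$][\w$]*$/.test(out.property.value)
  ) {
    return { ...out, computed: false, property: { type: "Identifier", name: out.property.value } };
  }
  return out;
}

const restored = restore(obfuscated); // now represents document.createElement("canvas")
```

The key observation is that getData is a pure function of a constant index, so a deobfuscator can simply execute it during the transformation.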
Let's describe the functionality of this deobfuscation plugin:
As a result, we get the following code:
Thus, by exploiting the weaknesses shown above, we were able to remove the obfuscation that hides properties and strings with a simple babel plugin.
I hope this small example showed how you can fight such nuisances with the help of babel. With these approaches you can also tackle more complex obfuscations: the main thing is to find patterns in the code and work skillfully with the AST.
We've learned about obfuscation, a technique that complicates code reverse engineering, and the tools to implement it. Although there are public solutions that obfuscate JavaScript code, there are just as many public solutions that can remove this protection in an instant.
Therefore, to protect your code you need to write your own solutions that public deobfuscators cannot remove. One reliable way to implement obfuscation in JS is to write custom babel plugins that work with the AST of the target code and turn it into a less readable form.
Of course, this area has well-known obfuscation techniques and approaches, but it remains open to creativity and new “tricks” that can make studying the code more difficult. Despite the large number of such techniques, they do not guarantee the secrecy of algorithms at all, because the code is always “in the hands” of the client. Besides, the client can debug the code, which makes it easier to study. Rather, obfuscation turns away poorly motivated researchers and raises the cost of reverse engineering.
There are also more advanced approaches. One of them is code virtualization, or, simply put, creating a virtual machine in JS that executes custom bytecode. This approach almost completely removes the chance of static analysis and makes debugging as difficult as possible. However, this is a separate subject for discussion...
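Just to give a feel for the idea, here is a toy stack machine. The four-instruction set and the run function are invented for this sketch; real virtualizing obfuscators are far more elaborate and usually generate both the bytecode and the interpreter automatically.

```javascript
// Hypothetical instruction set for illustration only.
const OP = { PUSH: 0, ADD: 1, MUL: 2, HALT: 3 };

// Interpreter: executes a flat array of opcodes and operands.
function run(bytecode) {
  const stack = [];
  let pc = 0; // program counter
  while (pc < bytecode.length) {
    const op = bytecode[pc++];
    switch (op) {
      case OP.PUSH:
        stack.push(bytecode[pc++]); // the operand follows the opcode
        break;
      case OP.ADD: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case OP.MUL: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      case OP.HALT:
        return stack.pop();
      default:
        throw new Error(`Unknown opcode ${op} at position ${pc - 1}`);
    }
  }
  return stack.pop();
}

// Bytecode for (2 + 3) * 4; the original expression no longer appears as JS.
const program = [OP.PUSH, 2, OP.PUSH, 3, OP.ADD, OP.PUSH, 4, OP.MUL, OP.HALT];
const result = run(program);
console.log(result);
```

A reader of the shipped file sees only the interpreter and an array of numbers, so the logic has to be recovered by understanding the machine first.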
I hope this information was useful to you, and you won't blame yourself or your programmers for initially obfuscated code anymore. Appreciate these wizards 🧙‍♀️! I will be glad to discuss the latest trends in magic with you.