Friday, January 18, 2008

Compressing and Obfuscating JavaScript (with Prototype & Script.aculo.us)

Abstract
My methods, trials and eventual solution for optimizing the delivery of JavaScript files via minification, obfuscation, and compression, including examples with notoriously problematic to minify libraries such as Prototype and Script.aculo.us. Some of my explanation will work on any platform, although my eventual sweet solution is Java specific and relies on a JSP taglib.

Introduction
Last fall, my company developed an election game for the primaries and caucuses called Kingmaker, which we launched in the middle of December. We built portions of the site in Adobe Flex. Although this provided some cool functionality, it increased our page sizes significantly. So, I began my search for ways to shrink down everything else on the site to compensate for the tremendous increase caused by Flex apps.

First, we compressed the images as far as possible before losing too much quality. Next, I saw to the shrinking of our JavaScript files. This presented a series of problems:
  1. I like Prototype. I wanted to keep it. But it's huge (124K uncompressed).
  2. I like Script.aculo.us. I wanted to keep that too. Again, it's huge (I only use effects, controls, slider, and dragdrop... 111K uncompressed)
  3. The widely used JavaScript compressors I knew of were all known to be unable to compress Prototype or Script.aculo.us correctly.

Why Minify/Obfuscate/Compress Anyway?
At the most basic level, regardless of connection speeds, smaller files translate into shorter download times. Sure, web browsers cache static content such as stylesheets and JavaScript files, but there is no justification for unnecessarily slowing the initial download, or relying on the client when you can do something about it quickly and easily.

As for obfuscation, you can't ever really protect JavaScript like you can server-side code/bytecode, and you shouldn't be writing JavaScript that makes it easy for a malicious user to manipulate your system. Bottom line: a security risk is still a security risk when obfuscated, and you should avoid it altogether. That said, with a system as easy to use as the one I'll outline shortly, there's no reason not to make a malicious user's goals just a little bit more difficult. On another level, perhaps there are some people you want to share your code with, and some people you don't. Obfuscation gives you more control, since obfuscated code is a major pain to read, understand, and modify safely. Plus, the substitution of tiny meaningless variable names also means a smaller file (minimally, but smaller nonetheless).

What Didn't Work
Dean Edwards' Packer is a pretty common tool for packing JavaScript files. There's an online (JavaScript) version, as well as offline versions in .NET, Perl, and PHP. Packer, however, relies on regular expressions and therefore has strict syntax requirements that Prototype is known to break. Also, even after alleviating Prototype's syntax issues, the online version of Packer still chokes on a few properties, though the offline versions do not. It's reported that there is an online version of the PHP version that does work, located here, but it didn't work when I tried it. Besides, you still have to fix up the syntax first, and even then there are multiple better solutions.

JSMin is also regex based, so it has the same issues.
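
To illustrate the kind of failure involved, here is a minimal sketch. It assumes a deliberately naive minifier (naiveMinify is a hypothetical helper, not the actual algorithm used by Packer or JSMin) that simply trims lines and joins them together. The input omits semicolons after statements, which an interpreter tolerates thanks to automatic semicolon insertion at line breaks.

```javascript
// Hypothetical naive minifier: trims each line and joins them with no separator.
function naiveMinify(source) {
  return source
    .split("\n")
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; })
    .join("");
}

// Returns true if the JavaScript engine can parse the code (the code is not run).
function parses(code) {
  try {
    new Function(code);
    return true;
  } catch (e) {
    return false;
  }
}

// Valid JavaScript, but it relies on line breaks instead of semicolons.
var original = [
  "var add = function (a, b) {",
  "  return a + b",
  "}",
  "var result = add(1, 2)"
].join("\n");

var minified = naiveMinify(original);

console.log(parses(original)); // the interpreter accepts the original
console.log(parses(minified)); // "}var" on one line is a syntax error
console.log(minified);
```

Once the line breaks are gone, the interpreter can no longer insert the missing semicolon between the closing brace and the next var statement, which is why the source has to be "fixed" before a tool like this can be trusted with it.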

Fixing Prototype's Syntax
I really shouldn't say fixing... because technically (as in, according to JavaScript interpreters) what Prototype does is 100% OK. Nevertheless, regex based parsers need stricter syntax than a JavaScript interpreter. Fortunately, the guys over at Dojo had the wonderful insight to ditch the regex altogether and use (you guessed it) a JavaScript interpreter! Specifically, they use a custom version of Rhino, which is an open source JavaScript interpreter from the Mozilla project.

Rhino is written in Java, and as a JavaScript interpreter has more context about what's going on in a JavaScript file than a rigid regex does. Hence, Dojo ShrinkSafe came into the world with the ability to safely compress JavaScript files without additional strict syntax requirements. Files run through Dojo ShrinkSafe first could then be packed/obfuscated further by regex packers, because the output of the Rhino treatment was syntactically 'fixed' for a regex packer.

But there was a remaining problem: If you're using the Dojo Toolkit, then your files are automatically compressed using ShrinkSafe, but I am not and I didn't want to have to go through those steps to manually re-minify a JavaScript file every time I edited it... so my search continued.

A Changing Solution
Shortly, I discovered the pack:tag JSP Tag Library, which seemed to be what I was looking for. The pack:tag had many benefits:
  1. Simultaneous minification and obfuscation
  2. gzip compression
  3. Combination of static resources to minimize round-trips from client to server
  4. Caching of compressed static resources via a memory (Servlet) or file cache, and thus minification, obfuscation, and compression at resource request time (as opposed to compile time)
  5. Pluggable compression algorithms (and an implementable interface for creating your own packing strategy)
  6. Configuration via a .properties file
Four packing strategies are included: 2 for CSS (Isaac Schlueter's CSS Compressor, and the iBloom CSS Compressor), and 2 for JavaScript (JSMin, and the YUI Compressor).
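
The request-time caching and pluggable strategy ideas can be sketched in a few lines. This is only an illustration of the concept in JavaScript, not the pack:tag's actual Java implementation; createPackCache, whitespaceStrategy, and readSource are hypothetical names, and the toy strategy just collapses whitespace (which is not safe for real code containing string literals).

```javascript
// Hypothetical stand-in for a packing strategy: collapses runs of whitespace.
function whitespaceStrategy(source) {
  return source.replace(/\s+/g, " ").trim();
}

// Packs a resource the first time it is requested, then serves it from memory.
function createPackCache(strategy) {
  var cache = new Map();
  var packCount = 0;
  return {
    get: function (path, readSource) {
      if (!cache.has(path)) {
        cache.set(path, strategy(readSource(path)));
        packCount += 1;
      }
      return cache.get(path);
    },
    packCount: function () { return packCount; }
  };
}

// Hypothetical in-memory "file system" so the sketch is self-contained.
var files = { "/js/a.js": "var  a = 1;\n\nvar  b = 2;\n" };
function readSource(path) { return files[path]; }

var packCache = createPackCache(whitespaceStrategy);
var first = packCache.get("/js/a.js", readSource);  // packed now
var second = packCache.get("/js/a.js", readSource); // served from the cache

console.log(first);
console.log(packCache.packCount());
```

Because the strategy is just a function passed in, swapping one compressor for another does not touch the caching logic.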

My favorite additional benefits are as follows:

First, the pack:tag allows me to edit my JavaScript files in their un-minified and un-obfuscated form, thus relieving me of having to either edit a minified file or run Dojo ShrinkSafe and the Perl version of Dean Edwards' Packer every time I change a file.

Second, the pack:tag checks to ensure that a resource is not included more than once in the same page, automatically ignoring subsequent requests, which can accidentally happen quite easily when using multiple JSP includes (used in multiple places) to dynamically build a page.
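
The duplicate check amounts to remembering what has already been written for the current page. A minimal sketch of that idea, assuming a hypothetical createPageIncludes helper (this is not the pack:tag's actual code):

```javascript
// Creates an include function scoped to one page render.
// Each src is emitted once; repeated requests produce an empty string.
function createPageIncludes() {
  var seen = new Set();
  return function include(src) {
    if (seen.has(src)) {
      return "";
    }
    seen.add(src);
    return '<script type="text/javascript" src="' + src + '"></script>';
  };
}

var include = createPageIncludes();
var header = include("/js/prototype.js");  // first request: tag is written
var sidebar = include("/js/prototype.js"); // same file from another include: skipped
var footer = include("/js/effects.js");    // different file: tag is written

console.log([header, sidebar, footer]);
```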

Third, the pack:tag allows you to keep your uncompressed (and thus easy to read and edit) JavaScript files within the WEB-INF directory of a web application, where they are protected from prying eyes.

Using the pack:tag is as simple as including the JSP Taglib declaration in your JSP, and then using the following instead of normal <script> tags to include JavaScript files:

<pack:script src="/my/file.js" />

Or, to combine multiple resources:

<pack:script>
<src>/my/file1.js</src>
<src>/my/file2.js</src>
</pack:script>

pack:tag Refined
The only remaining problem with the pack:tag was the same one I discussed earlier: JSMin and other regex based packers do not safely minify Prototype. Fortunately, pluggable compressor strategies come to the rescue! As of version 2.2, the YUICompressor can safely compress Prototype (thanks to the fact that it, like Dojo ShrinkSafe, is implemented using Rhino). This was an amazing development because I now had a solution whereby I could hide my JavaScript files, edit them with complete clarity, and have them automatically combined, minified, obfuscated, and gzip compressed at request time, then cached for the next request!

An added benefit of using the YUICompressor is a significant amount of logging output that complains when you're doing something stupid. Unfortunately, I haven't found a way to turn off that output, so it does fill up the log a bit when compressing files wherein I reference functions that are declared in other files (which it has no context of, unless the files are combined before processing).

Statistical Results
For testing, all downloaded file size measurements were taken using the Firefox plugin Firebug, version 1.05.

I wanted large, commonly used libraries for testing. Since I happen to be a big fan of Prototype and Script.aculo.us, and since Prototype has been my resident problem in this description, I decided those two libraries would be sufficient. I use the same technique outlined above on my own JavaScript files with wonderful results. Since I mostly use Script.aculo.us for the effects, dragdrop, and slider, and thus never include all the components at once, I decided to measure each file individually rather than Script.aculo.us as a whole package.

Terminology
Minification - refers to both minifying and obfuscating the JavaScript
Individual - Each JavaScript file was included separately in the page; for the base case via:

<script type="text/javascript" src="/js/prototype.js"></script>

Or for the other scenarios using pack:tag to include each with:

<pack:script src="/js/prototype.js" />

Grouped - The JavaScript files were included together in one file. For the case with minification and gzip via:

<pack:script>
<src>/js/prototype.js</src>
<src>/js/effects.js</src>
<src>/js/controls.js</src>
<src>/js/slider.js</src>
<src>/js/builder.js</src>
<src>/js/dragdrop.js</src>
<src>/js/sound.js</src>
</pack:script>

For the minified test without gzip, I created a new file (total.js) with the output of the above grouping and included it via:

<script type="text/javascript" src="/js/total.js"></script>

Versions used in testing
Prototype 1.6
Script.aculo.us 1.7.1 Beta 3

Individual Results


Grouped Results


Conclusion
Compression in some form can significantly decrease file size and, implicitly, download times. Different methods of compression yield varying degrees of benefit. Namely, gzip (which, by the way, is Prototype's 'officially supported' compression method) yields the greatest initial benefit, with minification yielding additional benefit. On the whole, I now have a setup that requires no extra work from me. I create my JavaScript files as if I were not compressing, include them with slightly different syntax, and the pack:tag with the YUICompressor strategy takes care of everything else.
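
If you want to compare raw and gzipped sizes for your own files without a browser, a rough check is possible with Node.js and its built-in zlib module. This is a sketch, not how the measurements above were taken (those came from Firebug); the sizes helper and the repetitive sample input are made up for illustration, and real files will compress differently.

```javascript
var zlib = require("zlib");

// Returns the byte size of a source string before and after gzip compression.
function sizes(source) {
  var raw = Buffer.byteLength(source, "utf8");
  var gzipped = zlib.gzipSync(Buffer.from(source, "utf8")).length;
  return { raw: raw, gzipped: gzipped };
}

// Hypothetical sample: a highly repetitive script, so gzip has plenty to work with.
var sample = "function example(argument) { return argument + 1; }\n".repeat(200);
var result = sizes(sample);

console.log("raw bytes:     " + result.raw);
console.log("gzipped bytes: " + result.gzipped);
```

Running the same function on a file before and after minification gives a quick feel for how much each step contributes.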

Addendum
As a testament to this conclusion, more compression options have become available lately. John-David Dalton has released a collection of compressed Prototype and Script.aculo.us versions on Google Code. He explained his process in an Ajaxian post last December:

I format the code manually, fixing semi-colons and fixing references to $super. I run them through a compressor with quotes around the $super vars so they aren’t changed then fix their method arguments. I use Dean Edward’s Packer because it creates the smallest files. From there you can use a server side solution to gzip/version/and deploy the file. I use Prado (www.pradosoft.com) and their asset publishing capabilities.

I have a Blog but it’s currently in the early stages, I never have time to work on it.

Basically it's the process I described above, except manually fixing the syntax 'problems' rather than using a JavaScript interpreter to do it for you. It's a great solution for non-Java server environments where the pack:tag is not an option.
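
The $super step in the quote matters because Prototype 1.6 decides whether a method wants its parent implementation by looking at the name of the method's first argument. The sketch below is a simplified stand-in for that kind of inspection (argumentNames and wantsSuper here are illustrative helpers, not Prototype's actual code), showing why an obfuscator that renames arguments changes behavior:

```javascript
// Simplified parameter-name inspection based on the function's source text.
function argumentNames(fn) {
  var params = fn.toString().match(/\(([^)]*)\)/)[1];
  return params
    .split(",")
    .map(function (name) { return name.trim(); })
    .filter(function (name) { return name.length > 0; });
}

// True when the first declared argument is literally named "$super".
function wantsSuper(fn) {
  return argumentNames(fn)[0] === "$super";
}

// As written by the developer.
var readable = function ($super, name) { return $super(name); };

// The same function after an obfuscator has shortened the argument names.
var renamed = function (a, b) { return a(b); };

console.log(wantsSuper(readable)); // the name is still there
console.log(wantsSuper(renamed));  // the name was lost in obfuscation
```

Both functions do the same thing when called directly, but any code that relies on the argument name sees them differently, so the name has to be protected from renaming.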

12 comments:

Anonymous said...

You may try an online javascript compressor at http://www.compressjavascript.com

Ryan said...

http://www.compressjavascript.com looks like a nice utility, though I haven't tried it. However, unlike my solution above, it does not allow editing of JavaScript files in uncompressed form with dynamic and automatic compression/obfuscation at deploy-time. Also, it appears to have syntax requirements beyond that of a JavaScript interpreter, again, unlike the above solution. For non-Java environments, though, where the dynamic solution is not available, it could be useful.

mike said...

Hey,

I was really happy to find your article. I immediately implemented pack using the yui compressor, but when trying to hit my page I got this error:

java.lang.NoClassDefFoundError: org/mozilla/javascript/ErrorReporter

I can see the file JavaScriptErrorReporter.class in the jar, so I'm not sure why it's failing. I'm using version 3.2 of pack, with Tomcat 6 and Java 5.

Ideas?

Ryan said...

Did you download the YUICompressor from http://www.julienlecomte.net/yuicompressor/ ? The compressor strategy is included with the pack:tag, but the actual compressor is not. If you did, perhaps it's not on the classpath when deployed? I always deploy my apps as .war files, and I use Netbeans to build the .war for me, but if you're using some other system, make sure the YUICompressor .jar is on the classpath.

mike said...

Yep, that was it--I had seen the YUI compressor in Pack and assumed it was the actual compressor. Once I put that in it worked--now I just have to track down some other bugs to get this working.

Thanks!

jordi said...

I maintain a very similar library called JAWR.
It also allows editing in uncompressed form, and it has the advantage that you don't need to repeat the whole list of scripts you need compressed at every page. Let me know what you think! :-)

Ryan said...

@jordi JAWR looks pretty great, I'll have to try it out sometime. I like that JAWR handles DWR scripts... I've had to hack around the packtag solution I'm using by referencing the DWR scripts via the absolute URL as external files, which works well. Looking forward to trying out JAWR...

squith said...

I'm surprised that nobody mentioned Google Closure Compiler. It is not limited to reducing and compressing: it also analyzes the code to find and remove unused code, and rewrites it for maximum minification.

Harley Davidson Parts said...

Packer offers the highest compression ratio I have ever seen for large files, and there is even a JavaScript version of the packer so you can pack your scripts in the browser.

Username said...

DECOMPRESS JAVASCRIPT

It is IMPOSSIBLE to hide the compressed code.

A typical JavaScript compressed with /packer/ starts with the following code:

`eval(function(p,a,c,k,e,r)`…
`eval` can simply be replaced by `alert`.



The `eval` function evaluates a string argument that contains JavaScript. In most packers, `eval` is used, followed by `document.write`.

To decompress JavaScript, replace these methods by one of the following:

1. Replace `eval` by `alert` (*Like this:* `alert(function(p,a,c,k,e,r)`… *and open or refresh the html page, the alert will simply print the code in a popup-window*)

2. If the JavaScript appears after the `body` element, you can add a `textarea` like so:

`<textarea id="code">...</textarea>`

Then, replace `eval(…);` with `document.getElementById("code").value=…;`


Of the two options, the first is simpler... Good luck :)
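
The same trick can be tried outside the browser. Below is a minimal sketch using a made-up, toy "packed" string (real Packer output is much longer and takes the `p,a,c,k,e,r` arguments); it swaps the leading `eval` for `String` so the generated source is returned as text instead of being executed. Note that the unpacking function itself still runs, so only do this with code you would have run anyway.

```javascript
// Toy stand-in for packed output: an eval wrapped around a function
// that returns the original source as a string.
var packed = "eval(function(){return 'var greeting = 1;'}())";

// Replace the leading eval with String, so the result is kept as text.
var revealed = packed.replace(/^eval/, "String");

// Evaluate the modified expression to obtain the unpacked source.
var source = new Function("return " + revealed + ";")();

console.log(source);
```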

Anonymous said...

When I use this, my log gets entries like:
ERROR [JavaScriptErrorReporter] ...
How can I disable these? Thank you! RVic