Anyone playing in the "Web 2.0" domain, even if non-technical, has likely heard of AJAX and JSON, essential components of the Web application style. The 'J' in these acronyms stands for JavaScript, on which these standards are built. So the strengths and weaknesses of JavaScript are directly related to what can be built in the common Web 2.0 style, and therefore what business ideas can be implemented. Now there's another JavaScript-related technology coming along quietly but quickly, so far little noticed outside developers' circles. It is called Caja, and it may have a major impact on what can and can't be accomplished by JavaScript-based Web 2.0 sites.
JavaScript Security 101
To explain Caja's significance, a little history is necessary. JavaScript began as an 'action language' embedded inside the Netscape Web browser, enabling pages to have more dynamic behavior than could be specified with HTML text markup. (Note that JavaScript is not the same as the programming language Java, nor are they compatible. The name overlap is the result of a long-ago marketing deal between Netscape and Sun that mostly served to sow confusion.) Netscape and later browsers would execute JavaScript programs that came along with conventional HTML (later XML) page content. As HTML content proliferated, JavaScript was dragged along and became a de facto standard beyond its original intent. For instance, the rise of Web-based e-mail meant that scripts within mail payloads became common.
Of course, having a full-powered programming language that can be remotely commanded while running on one's machine is a recipe for security mayhem. The JavaScript inventors did think about this, though as we'll see, not deeply enough. JavaScript came with three levels of security:
The simplest model was 'JavaScript off' - the user can simply disable JavaScript. Often the simplest security solution is "Don't do that!", but this option - probably only known and used by the paranoid to begin with - was increasingly unattractive as more Web sites abandoned their text-only versions and depended on JavaScript for portions of their user interface.
A second model is 'anything goes'. You can think of this as download or toolbar mode. A particular chunk of JavaScript is given the same authority to control the user's machine as the browser, and ultimately as the user. If the user allows this mode for a piece of malicious script, mayhem will ensue. Because of the burden on the typical user of deciding which code to trust, and of enabling the facility in the first place, this mode is also little used.
The third model of JavaScript on the web client is sandbox mode. A particular script can see and manipulate data within the scope of the page it arrives on, and other pages loaded from the same source. However, it is walled away from both the raw capabilities of the underlying machine, and from data and code that arrive from other sources. Like letters written in the sand on a beach, the effects of the script will last only as long as its page stays loaded. This is the most commonly used security mode, and while it's largely succeeded in averting security mayhem, it does have its holes. Without getting into details, exploits such as cross-site scripting seek to find and poke holes in the sandbox and gain malicious access to local resources, or to spy on data in other sites' open pages. (Here's a more complete list of evil scripting tricks. Update: Google itself has thoughtfully provided an example of an XSS vulnerability.) Strict sandboxing also prevents potentially useful interaction among scripts originating on different sites, such as multiple widgets cooperating in a social networking context.
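The "same source" rule above is the heart of the sandbox. A simplified sketch of the comparison browsers apply (the helper names here are my own, not browser internals): two URLs count as one origin only if protocol, host, and port all match.

```javascript
// Simplified sketch of the same-origin test underlying the sandbox.
// Real browsers implement this internally; `sameOrigin` is illustrative.
function defaultPort(protocol) {
  return protocol === "https:" ? "443" : "80";
}

function origin(urlString) {
  const u = new URL(urlString);
  return `${u.protocol}//${u.hostname}:${u.port || defaultPort(u.protocol)}`;
}

function sameOrigin(a, b) {
  return origin(a) === origin(b);
}

// A script loaded from mail.example.com may touch pages from the same site...
console.log(sameOrigin("https://mail.example.com/inbox",
                       "https://mail.example.com/settings")); // true
// ...but not pages from another site, or even another subdomain.
console.log(sameOrigin("https://mail.example.com/inbox",
                       "https://evil.example.net/"));         // false
```

Cross-site scripting attacks work by smuggling code *inside* an origin, so it passes this test while serving an attacker.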
Web Apps Make Things Worse
A JavaScript program running in the browser sandbox cannot store a user's information onto the client machine - that's a prohibited operation. What can it do if the user's data must be preserved for a later session before the page is closed and the script disappears? Phone home! The script can formulate and send a request back to its home site to store the information under the user's account in a database. The next time the page loads, its scripts can ask for that data back and life will go on, the user none the wiser.
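The "phone home" pattern can be sketched as follows. In a real page the two calls would be XMLHttpRequest traffic to the script's home site; here the server side is simulated with an in-memory store so the round trip is visible, and the names (`saveToHome`, `loadFromHome`) are illustrative, not any real API.

```javascript
// Toy sketch of the "phone home" pattern; the Map stands in for the
// site's user database, which a real script would reach over HTTP.
const homeServer = new Map();

function saveToHome(user, key, value) {
  // Browser version: POST to the home site with the user's session credentials.
  homeServer.set(`${user}:${key}`, value);
}

function loadFromHome(user, key) {
  // Browser version: GET from the home site on the next page load.
  return homeServer.get(`${user}:${key}`);
}

// Session 1: the page is about to close, so the script ships state home.
saveToHome("alice", "draft", "Dear Bob, ...");

// Session 2: a fresh page load asks for the data back.
console.log(loadFromHome("alice", "draft")); // "Dear Bob, ..."
```

Note that whatever can call `saveToHome` as the user can also overwrite or read that data as the user, which is exactly the risk the next section describes.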
Problem solved? Not quite. Remember the original intent of the sandbox was to keep a malicious script from wreaking havoc on the local machine, which includes stealing, destroying or modifying the user's valuable data. We've just saved a piece of user data from the rising tide, but by putting it outside that original security perimeter. Now it's over on someone's server cluster. And what one script has done, another can undo. Specifically, if the browser (or its user) can be fooled into executing a malicious script within a live window, it can read, destroy or alter that data as if it were the user. Now you've got some idea why (for instance) e-mails read within a Web mail interface typically have their embedded scripts disabled. All it would take is the wrong spam or phishing mail opened in the browser window, and your GMail or Yahoo account could be toast.
And that's really the point, because the little trick of sending the data back home has burgeoned into the entire Web applications domain, one of the hottest grounds for investing right now. Every day there are more services saving more valuable user data in a way the original JavaScript architecture did not anticipate. The value and sensitivity of that data has also soared as we went from noting the last message read on a bulletin board to keeping sales contacts, collaborative spreadsheet models and the family photo archives on the server. Every Web application site now has to walk a line between desirable user interface and other features, and the risk to the user's data and the entire business model if a leak is left that allows a malicious script to start trashing the place. (Note: a more complete and precise description of these issues can be found in section 2 of the draft Caja spec (PDF). It assumes a fair amount of vocabulary, and is a bit metaphor happy, but really quite readable as security literature goes.)
So What's Caja?
Caja stands for Capabilities JavaScript, and is an open source project based at Google (more about that later). It applies an advanced security concept, capabilities, to define a version of JavaScript that can be safer than the sandbox, while allowing highly selective use of powers that would have been forbidden by the sandbox. Capabilities is a notion that has wandered in the technology wilderness for some time. Its few pure implementations have either never made it to market, or ended up in the computer trivia list of dead systems. Ironically, it may find its moment of glory in retrofitting a workable security model to JavaScript.
For the truly non-technical, you might think of capabilities as a 'magic word' system. The permission to do a particular operation on a particular piece of data is represented by a magic word. Moreover, the magic word itself contains information about when it can be said, and by whom. And it can be different every time, so you can't just write it down and reuse it later. So rather than putting everyone in a sandbox, we get smart about what magic words are created, what they can do, and who gets them. They can be passed along, but if someone tries to say them in the wrong context, nothing happens. For instance, the JavaScript embedded within a Webmail message could be given the magic word that allows it to display something, but never given the magic words that allow read or write access to the mail directory.
For the slightly more technical, Caja takes advantage of a general characteristic of object oriented (OO) languages, JavaScript being one. A true OO language never allows direct programmatic access to a piece of data. Reading or modifying the data requires invoking a 'method' (a small program) that hides the details of the operation. Data is never naked in an OO system; instead it is 'encapsulated' by an enclosing object and the associated methods. This is obviously fairly close to the 'magic word' notion - if you don't know the name of the particular object, and the right method to call, nothing happens. But few OO languages have any notion of program or user identity at a fundamental level. Most also have built-in methods (often related to reflection) that could be used to circumvent a capabilities model if left exposed.
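The Webmail example above can be sketched with an ordinary JavaScript closure (all names here are mine, purely illustrative): the mailbox's contents are private state, and each "magic word" is just a function reference granting exactly one operation.

```javascript
// Minimal illustration of the capability idea via closures.
function makeMailbox() {
  const messages = ["msg1: hello", "msg2: lunch?"]; // hidden from outsiders

  return {
    // Capability safe to hand to a display widget: it can only count messages...
    displayCap: () => `You have ${messages.length} messages`,
    // ...while the read/write capabilities stay with the mail app itself.
    readCap: (i) => messages[i],
    writeCap: (text) => { messages.push(text); },
  };
}

const box = makeMailbox();
// The widget is handed displayCap and nothing else.
const widget = { show: box.displayCap };

console.log(widget.show()); // "You have 2 messages"
// The widget holds no reference to `messages` and no way to forge one,
// so it cannot read or alter mail - that is the capability discipline.
```

The catch, as the paragraph above notes, is that full JavaScript offers ways to climb out of such closures, which is exactly what Caja's subset must rule out.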
What the clever Caja folks noticed was that one can define a proper subset of JavaScript that does have capability properties. Confine usage to that subset, and very useful security properties arise, both in preventing exploits and in allowing selective access to more dangerous operations. An existing JavaScript program that stays within that subset already has capability properties and will not need modification. All scripts go through a filter that checks for safety and 'cajoles' (heh!) the script into a modified version with runtime capability checks, or rejects it if it does not comply.
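As a toy illustration only - this is not Caja's actual rewrite rules, just the flavor of what inserted runtime checks look like - a cajoler might replace a direct property read with a checked accessor that consults a policy and rejects anything outside it:

```javascript
// Toy sketch of a capability check a rewriter might insert (illustrative
// names; Caja's real transformation is far more sophisticated).
const allowed = new Set(["subject", "from"]); // policy granted to this widget

function checkedRead(obj, prop) {
  if (!allowed.has(prop)) {
    throw new Error(`capability violation: read of '${prop}' denied`);
  }
  return obj[prop];
}

const mail = { subject: "Hi", from: "bob@example.com", body: "secret" };

// The rewriter would turn `mail.subject` into a checked call:
console.log(checkedRead(mail, "subject")); // "Hi"

// ...and `mail.body` into a call that fails at runtime:
try {
  checkedRead(mail, "body");
} catch (e) {
  console.log(e.message); // "capability violation: read of 'body' denied"
}
```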
What's It All Mean?
Caja could become one of those open source projects that never gets traction, but it's not likely. The project is sponsored by Google, and its home page is there. It's already being adopted by MySpace, and is being discussed as part of the OpenSocial platform. Yahoo is part of the OpenSocial collaboration. Though it's not clear if this amounts to an official endorsement for Caja, Yahoo personnel such as Doug Crockford are participating in related discussions, and I'm told it's well-regarded by the Yahoo security community. That could be a critical mass for adoption right there. Getting native support on the browser side would be the next step, and could easily get entangled with a project to update the formal 'ECMAScript' definition of JavaScript, as well as the Yahoo/Microsoft wars.
At any rate, this adds up to a very good chance that something that's right now fairly obscure could turn into a major force in Web 2.0 within months, not years. Because Caja modifies the de facto definition of JavaScript, it would have an immediate impact on any scripts and sites doing things regarded as unsafe in the new model. If you've got a Web 2.0 based site, get ready for a project to review for 'Caja-safety'. If the Caja model spreads, then the edges of the sandbox are going to get blurry. Various users and sites will be able to make choices to allow more powerful operations; figuring out which ones are significant and add real value could be a fairly tortuous product management challenge, and perhaps open market-entry chances for more powerful forms of widgets and Facebook-style 'apps'.
That's important in itself, but a second and perhaps more interesting problem comes from the use of Caja's capabilities system. Capabilities is a more powerful concept than the sandbox, and like most powerful tools it can also be dangerous. Deciding what 'magic words' to allow in a particular setting, and who gets to say them, could become arcane and confusing, particularly if you are trying to push the edge in allowing collaborating applications.
The usual recourse in the face of such complexity is to come up with well-known and safe patterns of application. That seems the likely result here as well: the capabilities frameworks implemented by (for instance) OpenSocial-compliant sites aren't going to be open-ended; they will be particular to that site or class of applications. So while the Caja mechanism will present one set of issues, the definition of actual policies for its use will create another. It may constrain or expand the horizons for those who program or build business plans within the scope of the Google, MySpace, Yahoo and probably other platforms. So if you're playing in this area, make sure the technical side of the shop is watching Caja. The Caja spec is here, and there's a good starter pitch here. Then invest some time in discussing what issues or opportunities may arise if it takes off.
(Thanks to Kevin Marks and Ben Laurie for reviewing this post. Any remaining errors and omissions are my own.)