
Client Side Map-Reduce with Javascript

January 29, 2010

I’ve been doing a lot of scalability research lately, and the High Scalability website has been a valuable resource for it. I’ve been thinking about alternate approaches to my application designs, mostly based on services. There was an interesting article about Amazon.com’s architecture that describes a little of how they put services together.

I started thinking about an application that I work on and how it would behave if every section of it talked to the others through web services or sockets, passing JSON or Protocol Buffers, rather than through the object method calls of the current monolithic design. Then I wondered: why limit your services to a fixed set of machines? There’s only so much room to expand there. What if we harnessed the unused power of the client machines that visit the site?

Anyone who’s done any serious work with ECMAScript (aka JavaScript) knows that you can do some pretty powerful things in the language. One of its more interesting features is the ability to turn plain text in JSON format into JavaScript objects using eval(). Now, eval() is dangerous because it will run any text given to it as if it were JavaScript. However, that is also what makes it powerful.
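Here is a rough sketch of that trade-off. The payload and its field names are made up for the example. JSON.parse() is the standard data-only parser, shown next to eval() for contrast.

```javascript
// Hypothetical payload; the field names are made up for this example.
var text = '{"unitId": 7, "values": [1, 2, 3]}';

// eval() treats the text as JavaScript source. The parentheses make the
// braces parse as an object literal rather than as a block statement.
var viaEval = eval('(' + text + ')');

// JSON.parse() accepts only JSON data and never runs code.
var viaParse = JSON.parse(text);

// The same eval() call will execute anything else it is handed.
var sideEffect = 0;
eval('sideEffect = 42');

// JSON.parse() rejects that input instead of running it.
var rejected = false;
try {
   JSON.parse('sideEffect = 99');
} catch (e) {
   rejected = true;
}
```

Both calls produce the same object for well-formed JSON. The difference only matters when the text contains something other than data, which is exactly the case the rest of this post relies on.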

Imagine needing to do a fairly intensive series of computations on a large data set without a cluster to help you. You could have clients connect to a web site that you’ve set up and have their browsers download data in JSON format along with executable JavaScript code. The base JavaScript library can keep a work unit very generic: any set of data, the functions to run on that data, and a finishing AJAX call to upload the results. On the server side, when you have a Map-Reduce operation to perform, you can split it into work units that each contain a section of the data plus the code needed to process it, and hand them to any connected clients that have this library and are sending AJAX polling requests asking for work.
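A minimal sketch of what such a work unit could look like follows. The field names (id, data, mapSource) and the runWorkUnit helper are assumptions made for this example, not part of any existing library, and the AJAX upload is replaced by a plain return value so the snippet runs on its own.

```javascript
// Hypothetical work unit as the server might serialize it.
var reply = JSON.stringify({
   id: 'unit-17',
   data: ['the quick brown fox', 'jumps over', 'the lazy dog'],
   // The map function travels as source text, because JSON cannot carry functions.
   mapSource: 'function (line) { return line.split(" ").length; }'
});

// Client side: turn the reply back into data plus an executable function.
function runWorkUnit(replyText) {
   var unit = JSON.parse(replyText);
   // eval() runs whatever the server sent, so this is only reasonable
   // when the server is trusted.
   var mapFn = eval('(' + unit.mapSource + ')');
   var results = [];
   for (var i = 0; i < unit.data.length; i++) {
      results.push(mapFn(unit.data[i]));
   }
   // In a browser this object would be posted back with an AJAX call.
   return { id: unit.id, results: results };
}

var outcome = runWorkUnit(reply);

// Reduce step, done here for illustration; it would normally run on the server.
var totalWords = 0;
for (var j = 0; j < outcome.results.length; j++) {
   totalWords += outcome.results[j];
}
```

The map job here just counts words per line, and the reduce step sums the counts. Anything that can be expressed as a function over one slice of the data would fit the same shape.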

A work unit gets placed into a JSON string and sent back as the AJAX reply. The client then evaluates the reply, which calls a JavaScript function that processes the data (probably a map job). Once the data is processed, the JavaScript automatically makes another AJAX call to send the result data back, most likely with some sort of work unit ID to keep track of everything, and requests the next work unit. The server then just coordinates between the various clients, setting a timeout for each work unit so that if someone closes their browser, the job can be handed out to an active client or a back-end server.
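The server-side bookkeeping described above could be sketched roughly like this. The Coordinator object and its method names are hypothetical, everything is kept in memory, and the current time is passed in explicitly so the behavior is easy to follow.

```javascript
// Hypothetical in-memory coordinator; all names here are assumptions.
function Coordinator(units, timeoutMs) {
   this.pending = units.slice();   // units not yet handed out
   this.inFlight = {};             // unit id -> { unit, deadline }
   this.results = {};              // unit id -> uploaded result
   this.timeoutMs = timeoutMs;
}

// Called when a client polls for work. `now` is a timestamp in milliseconds.
Coordinator.prototype.nextUnit = function (now) {
   // Put expired units back in the queue first.
   for (var id in this.inFlight) {
      if (this.inFlight.hasOwnProperty(id) && this.inFlight[id].deadline <= now) {
         this.pending.push(this.inFlight[id].unit);
         delete this.inFlight[id];
      }
   }
   var unit = this.pending.shift();
   if (!unit) {
      return null; // nothing to hand out right now
   }
   this.inFlight[unit.id] = { unit: unit, deadline: now + this.timeoutMs };
   return unit;
};

// Called when a client uploads a result.
Coordinator.prototype.submit = function (id, result) {
   if (!this.inFlight.hasOwnProperty(id)) {
      return false; // unknown, already completed, or timed out and requeued
   }
   delete this.inFlight[id];
   this.results[id] = result;
   return true;
};

// Usage: two units, a one-second timeout, and a client that never reports back.
var coordinator = new Coordinator([{ id: 'a', data: [1, 2] }, { id: 'b', data: [3, 4] }], 1000);
var first = coordinator.nextUnit(0);      // 'a' goes to client 1
var second = coordinator.nextUnit(10);    // 'b' goes to client 2
coordinator.submit('b', 7);               // client 2 finishes
var retry = coordinator.nextUnit(2000);   // 'a' has timed out, so it is handed out again
```

A real server would also need to decide what to do with late results and duplicate submissions. This sketch simply rejects anything it is no longer waiting for.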

This will work a lot better for CPU-intensive processes than for memory-intensive ones. For example, sorting a large data set requires some CPU but a lot more memory, because you need to keep every element you’re sorting in memory. Sending entire large lists to clients to sort and receiving the results would be slow due to bandwidth and latency constraints. However, there could be a benefit for large computations on a smaller set of data, such as what’s done with SETI@home or brute-force cryptography circumvention, where clients can send heartbeats of partial results back.

The limits, of course, will be on how much memory your browser lets JavaScript allocate. Also, since this technique focuses on heavy computation, the user will probably notice a fairly large hit on browsing speed in browsers that are not multithreaded or multiprocess. Naturally, from a non-scientific point of view, most people would be outraged if their computer were being used for its computing resources without their knowledge. However, for applications working on an internal network or applications that state their intentions up front, users might be interested in leaving a browser open to a grid service to add their power to the system. This would probably be easier if you make each work unit worth a set of points and give users a running score, like a game.

People like games.

Reusable Javascript: Write your own libraries

December 11, 2009

JavaScript is an interesting language in that it is both extremely popular and hated at the same time. It is arguably the most widely deployed programming language in the world, thanks to its inherent involvement in web programming. It is the sauce behind AJAX, and it creates the dynamic aspect of dynamic web pages. It’s easy to pick up and write, requiring no compiler and running on any system that has a web browser. Unfortunately, this has generated a stigma of it not being a ‘real’ programming language. Even professional programmers will often treat JavaScript with less care than they would code written in Java, C, Perl, Python, Ruby, or one of the many other compiled and scripting languages.

JavaScript, or ECMAScript as it is technically known, is often written in a very procedural fashion. If you’re lucky you may see it being used as a functional language, when a developer decides to write functions at all. Rarely will you see it written in a true object-oriented fashion. This is unfortunate, because JavaScript has great support for object-oriented programming. Objects can be created through prototypes of other objects, allowing you to create “classes” merely by building a base object from scratch and then cloning it via prototypes.
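As a small illustration of that prototype approach, here is a sketch with made-up names (baseCounter, NamedCounter). It uses Object.create, which is part of ECMAScript 5, so very old browsers would need the constructor-function form only.

```javascript
// A base object built from scratch; the names are made up for illustration.
var baseCounter = {
   count: 0,
   increment: function () {
      // The first write creates an own `count` on the object being used,
      // so the shared base object is never modified.
      this.count += 1;
      return this.count;
   }
};

// Object.create makes a new object whose prototype is baseCounter.
var clicks = Object.create(baseCounter);
var errors = Object.create(baseCounter);

clicks.increment();
clicks.increment();
errors.increment();

// The constructor-function form of the same idea.
function NamedCounter(name) {
   this.name = name;
}
NamedCounter.prototype = Object.create(baseCounter);
NamedCounter.prototype.constructor = NamedCounter;
NamedCounter.prototype.describe = function () {
   return this.name + ': ' + this.count;
};

var logins = new NamedCounter('logins');
logins.increment();
```

Each counter keeps its own count while sharing a single increment function through the prototype chain, which is the “class” behavior described above.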

A lot of JavaScript is created as one-off functions or scripts for a single purpose. Many developers rebuild the same wheel multiple times in the same application or even the same web page. While there are great libraries that showcase how usable pluggable JavaScript can be (take a look at the fully open sourced YUI library some time), very few developers abstract their business logic in a way that is reusable.

I think the problem mostly stems from the ingrained coupling between JavaScript and the web browser. People tend to write JavaScript for a specific web page rather than to perform specific functions. Reapplying the MVC pattern can be quite helpful here to separate concerns and promote reusability. I think this shows most clearly when dealing with AJAX and web services.

Many applications are using AJAX to create more interactive, faster, and user friendly applications (we’ll ignore the back button concern for now). Web designers love AJAX because it creates a great feel for the users due to instant feedback and the lack of page loading. Web programmers dislike AJAX because it makes writing programs more difficult and can cause a lot more work. However, by separating some of the JavaScript code into distinct units you can reuse common functions across many pages.

Here’s an example:

var WIDGET = {}; // we use this as a namespace

WIDGET.removeRowFromTable = function() {
   // getSelectedItem is assumed to be defined elsewhere on WIDGET
   var selectedItem = WIDGET.getSelectedItem();

   if (selectedItem != null) {
      var callback = WIDGET.getCallback(WIDGET.callbackDelegate);
      // encode the value so it is safe to send as form data
      var postData = "itemId=" + encodeURIComponent(selectedItem);

      YAHOO.util.Connect.asyncRequest('POST', 'removeRowFromTable.do', callback, postData);
   }
};

/**
 * Override this method to change what to do on response (view stuff for response)
 * o - response object passed in by YUI (o.responseText holds the server's reply, e.g. JSON text)
 */
WIDGET.callbackDelegate = function(o) {
   // Update view by removing row, maybe supplying text box to the user of what happened
   window.alert("The WIDGET.callbackDelegate method should be overridden per view.");
};

WIDGET.getCallback = function(successFunction) {
   return {
      // YUI expects a number of milliseconds here, not a function. The value
      // 5000 is an arbitrary example. A timed-out request is aborted and
      // reported through the failure handler.
      timeout: 5000,
      success: successFunction,
      failure: function(o) {
         // default failure function. You could also change getCallback to pass this in if you want
         // to customize your failure.
      }
   };
};

This file can be part of your base “AJAX action” JavaScript, which provides the actual actions to perform your AJAX commands. You can then use another JavaScript file to declare custom versions of WIDGET.callbackDelegate, which will override the already declared version. Just make sure that the new file is included in your HTML file after the first one. By making the default visual responses alerts, you will know which methods you still need to override on each new page.
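Here is a rough sketch of that override in action. The file names, the removedRows array, and the shape of the server reply are all assumptions for illustration, and the call to callback.success stands in for what the connection library would do when the request completes.

```javascript
// --- widget-actions.js (hypothetical file name): shared defaults ---
var WIDGET = WIDGET || {};

WIDGET.callbackDelegate = function (o) {
   throw new Error('WIDGET.callbackDelegate should be overridden per view.');
};

WIDGET.getCallback = function (successFunction) {
   return {
      success: successFunction,
      failure: function (o) { /* default failure handling */ }
   };
};

// --- inventory-page.js (hypothetical): included after the file above ---
WIDGET.removedRows = [];

WIDGET.callbackDelegate = function (o) {
   // Assumes the server replies with JSON such as {"itemId": 42}.
   var response = JSON.parse(o.responseText);
   // A real page would remove the matching table row here.
   WIDGET.removedRows.push(response.itemId);
};

// --- Simulated request completion ---
// getCallback reads WIDGET.callbackDelegate when it is called, so it picks up
// the page-specific version as long as the override has already been loaded.
var callback = WIDGET.getCallback(WIDGET.callbackDelegate);
callback.success({ responseText: '{"itemId": 42}' });
```

The shared file never needs to know which page it is running on. Only the second file changes from view to view.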

Douglas Crockford, a fellow Yahoo and a member of the ECMAScript standards group, has done a series of webcasts describing additional JavaScript best practices that many developers may not follow. He also wrote a book that’s worth checking out.