[philiptellis] /bb|[^b]{2}/
Never stop Grokking

Thursday, October 19, 2006

Selectively enable network interfaces at bootup in linux

Do you have multiple network interfaces on your linux box and find yourself needing to have not all of them active at bootup? Perhaps not all networks are connected and you don't want to waste time with attemtps at network negotiation for a connection you know isn't available.

I've faced this situation a couple of times, and came up with a way to tell my startup scripts to skip certain interfaces via kernel commandline parameters (which can be specified via your boot loader).

It's so simple that I often wonder why I (or anyone else) hadn't done it before. It's likely that everyone else on earth answered no to my question above.

Anyway, this is what I did:

In /etc/init.d/network, in the loop that iterates over $interfaces:

# bring up all other interfaces configured to come up at boot time
for i in $interfaces; do
after we've eliminated non-boot interfaces:

if LANG=C egrep -L "^ONBOOT=['\"]?[Nn][Oo]['\"]?" ifcfg-$i > /dev/null ; then
# this loads the module, to preserve ordering
is_available $i
I add this:

# If interface was disabled from kernel cmdline, ignore
if cat /proc/cmdline | grep -q $i=off &>/dev/null; then

Add the same for the next loop that iterates over $vlaninterfaces $bridgeinterfaces $xdslinterfaces $cipeinterfaces and you're done. As simple as that.

Now, when your boot loader pops the question, you choose to edit the kernel command line, and add something like eth0=off to prevent eth0 from coming up at boot time. You could turn this into an alternate boot config in grub by adding an entry in /boot/grub/grub.conf like this:

title Linux, skip eth1
root (hd0,1)
kernel /vmlinuz-2.6.10 ro root=LABEL=/ rhgb quiet eth1=off
initrd /initrd-2.6.10.img

Which will give you a nice menu option below your default Linux option saying Linux, skip eth1.

You can always enable your interface later by doing /sbin/ifup eth1.

Note: You may need to add is_available $i inside the if block. I don't know, and it works for me without it.

Sunday, October 08, 2006

Unintrusive dynamic script nodes

There are several methods of doing remoting via javascript. Dynamic script nodes are one of the easiest to use. Unlike XHR, which requires a response handler for most cases, dynamic script nodes require no javascript action after the initial call. The response from the server contains all logic that needs to be executed, and the browser takes care of executing it without interrupting the rest of your control flow.

They do come with two caveats however.

Dynamic script nodes also allow one to do cross-domain remoting without setting off sirens and flashing lights in the browser. This opens up an inherrent security problem. If you - as the developer of this application - do not have control over the source of your script, then you cannot trust that it will do nothing malicious. I'll skirt the issue in this article which concentrates on issue number two.

Using dynamic script nodes involves adding a <script> node to the DOM. For a simple application that just makes one or two calls, this isn't much of an issue, but for a complex application, the DOM can easily grow large, which starts to push memory limits of browsers.

An application that makes use of the flickr APIs or Yahoo! APIs that return JSON data suitably wrapped in a callback of your choice could hit these limits if built entirely in Javascript.

The call would be something like this:
var scr = document.createElement("script");
scr.src = "http://api.flickr.com/services/rest/?method=flickr.interestingness.getList"
+ "&api_key=xxxxxxxxx&format=json&jsoncallback=myfunc";
And a large number of such calls will add a lot of <script> nodes to the document, all of which have already served their purpose and are no longer needed.

These script nodes can be safely removed as soon as they've called their callback function. One needn't even wait for the callback to return, which means that the removal could be done within the callback itself.

While simple on the face of it, there's a bunch of housekeeping that goes with making this possible, and this isn't logic that all developers should be required to code into all their callbacks. It really should be generic enough to be separated out.

The code I came up with looks like this:
// The global callbacks array that stores a reference to all
// active callbacks.  This array is sparse.
var callbacks = [];

function call_api_method(url, callback)
  // We create the script element first so that it will
  // be available to the closure
  var scr = document.createElement("script");

  // Now add our custom callback to the callback array
  // so that the added script node can access it
  var i = callbacks.length;
  callbacks[i] = function(json)
    // first remove the script node since we no longer
    // need it

    // On failure display a generic error message
    if(json.stat == 'fail')
      alert("Error calling method: " + json.errorString);
    // On success, call the callback

    // Clear out our entry in the callback array since we
    // don't need it anymore
    callbacks[i] = null;
    delete callbacks[i];

    // and clear out all outer variables referenced in the
    // closure to prevent memory leaks in some browsers
    i = scr = callback = null;

  // add our own callback function to the script url
  // the resource sitting at the other end of this url
  // needs to know what to do with this argument
  scr.src = url
    + '&callback=' + encodeURIComponent('callbacks["' + i + '"]');

  // finally, add the script node to initiate data transfer
A few things stand out:
  1. The reference to the script node is held by the callback since it's a closure
  2. A global reference to the callback is necessary so that the script can access it
  3. We need to clean up in a particular order to avoid memory leaks caused by circular references
The comments in the code should explain the logic.

This code is moderately simplified from what it would be if it were truly generic. Additions that would need to be made include:
  • Encapsulate the callbacks array and the method into an object so that we don't pollute global space.
  • Instead of accepting a callback function, accept a callback object with separate methods for success and failure as well as callback arguments and execution scope.
  • For most applications, the url would be similar for various calls differring only in small parts, eg: the method name in the Flickr API. It would be good if the above class were an abstract base class with specialisations for different services providing the base API url.
Add your own ideas in the comments, as well as specific uses.

This is what I did for the flickr API:
var apikey = "xxxxxxxxx";
var apiurl = "http://api.flickr.com/services/rest?api_key=" + apikey + "&format=json&method=flickr.";
var usernamemap = {};

function call_api_method(method, uname, callback, params)
  var scr = document.createElement("script");

  var cb = function(json)
    usernamemap[uname].callback = null;

    if(json.stat == 'fail')
      alert("Error " + json.code + " calling flickr." + method + ": " + json.message);
      callback(json, uname);

    scr = uname = method = callback = null;

  usernamemap[uname] = usernamemap[uname] || {};
  usernamemap[uname].callback = cb;


  var url = apiurl + method
      + "&jsoncallback=usernamemap" + encodeURIComponent("['" + uname + "']") + ".callback";
  for(var k in params)
      url += '&' + encodeURIComponent(k) + '=' + encodeURIComponent(params[k]);

  scr.src = url;


And this is how you call methods on flickr:
call_api_method("people.findByUsername", uname, map_user_to_nsid, {"username":uname});

call_api_method("photosets.getList", uname, show_set_list, {"user_id":nsid});

PS: I didn't mention it earlier, but caveat number three is the maximum size restriction on GET requests. There's nothing that can be done to get around this since the browser is going to make a GET request. Just use this for API calls that don't send too much data to the server. You can receive as much as you want.

Monday, June 26, 2006

YUI addListenerByClassName

I use Yahoo!'s yui libraries a lot in my DHTML applications. I've found the combination of YAHOO.util.Dom.getElementsByClassName with YAHOO.util.Event.addListener to be particularly convenient for event attachment. It almost allows you to do true class based object oriented programming, though there are probably better methods to do that.

In any case, I like yui, and I use yui, and while working on one of my apps, I saw performance problems trying to add a large number of event listeners to input elements by class name. My initial implementation was something like this:
for each className:
        YAHOO.util.Dom.getElementsByClassName( className, 'input', container ),
The obvious bottleneck here was that getElementsByClassName was calling container.getElementsByTagName('input') and iterating through the elements once for each className. An O(n2) algorithm.

Even if getElementsByClassName cached the result of getElementsByTagName, that would not have made much difference because
that call is lightening fast compared to the iteration (over about 1200 elements).

My solution was to create a map of classes to event handlers, and write my own function that ran through the element loop only once to attach all events.

It works something like this:
var classHandlers = {
    "class1":{"click":[c1ClickHandler1, c1ClickHandler2], "change":c1ChangeHandler1}

addListenersByClassName( classHandlers, "input", "my_container" );

function addListenersByClassName( classHandlerMap, tagName, container )
    container = YAHOO.util.Dom.get(container);
    return false;

    var els = container.getElementsByTagName(tagName);

    for(var i=els.length-1; i>=0; i--)
            var classes = els[i].className.split(/ +/);
            for(var j=classes.length-1; j>=0; j--)
                    var handlerMap = classHAndlerMap[classes[j]];
                    for(var ev in handlerMap)
                        var handlers = handlerMap[ev];
                        if(typeof(handlers) == "function")
                            handlers = [handlers];
                        for(var k=handlers.length-1; k>=0; k--)
                            YAHOO.util.Event.addListener(els[i], ev, handlers[k]);
On the face of it, this looks like an O(n3) algorithm, but if you consider that the average values of j and k are 1.2 and 1.1 respectively (for my application), this really just reduces to an O(n) algorithm.

I suspect that for most applications, j (number of events to handle per element) would never cross 2, and the value of k (number of handlers per event per element) would not cross 1.

I did see significant performance improvements with this, however, IE is still dead slow in iterating through a list of elements.

Sunday, April 30, 2006

(Grouped or regular) expressions

If you're still here, chances are you're serious about regular expressions. In that case, I'd suggest reading Mastering Regular Expressions by the great Jeffrey Friedl. You'll learn much more there than I'll cover here, and well, that's where I learnt most of what I'm covering here.

Lesson three looks at entities larger than single character regexes, and how to alternate among them. As usual, let's start with an example. Let's try matching the word 'ohlala'. Fairly simple right? /ohlala/ should do it, and it does, but it misses out on all the variations and melodrama. What if someone were a little more excited and said 'ohlalala' instead? What of the really overwhelmed who go 'ohlalalalalala'? These matches follow a pattern of course. One that we discovered yesterday. They seem to be 'oh' followed by two or more 'la's. But we don't know how to match two or more of something. We know how to match one or more, and zero or more, but not two or more. If you think about it though, two or more is just one followed by one or more, and we know how to match both of those.
   /oh/  matches the 'oh' part
/la/ matches the first 'la' part
/la+/ matches ?
Remember, you concatenate multiple regexes to let them match a single string in sequence. Concatenating these, we get:
Does this match what we want?

It matches 'ohlala', but does it match 'ohlalala'? Not really. The last /la+/ matches 'la', 'laa', 'laaa', and so on. We therefore have matches like 'ohlalaa' and 'ohlalaaaaaaa'. Not what we want. Only the 'a' repeats and not the entire 'la'. To get the entire 'la' to match, we've got to make sure that 'la' is the entity preceding the + and not just 'a'. This is where grouping parentheses come in. These are the ( and ) characters placed around the portion that we want to group.

Grouping parentheses ( and ) create a group of the partial regex that they enclose. Everything within ( and ) is treated as a single entity for all modifier metacharacters.

In this case, it's the 'la', so let's do this:
By our definition of +, it should match one or more of the preceding entity, and by our definition of grouping parentheses, everything within the ( and ) is a single entity. This regex therefore matches 'la', 'lala', 'lalala' and so on. Taking it back to our example, we now have the regex:
which matches 'ohlala', 'ohlalala', and 'ohlalalalala'.

Looking at this though, it's always tempting to wonder whether we could have combined the /la/ and the /(la)+/ into a single entity. Looking at our arsenal of metacharacters, it doesn't seem possible. I believe most popular current implementations have a range count match for entities, ie they allow you to specify both the minimum and the maximum number of matches a particular entity can match. The notation used is curly braces with the min and max within, eg: {2, 15} will match a minimum of 2 and a maximum of 15 times.

A modifier of {n,m} tells the pattern matcher to match the preceding entity at least n times and at most m times.

If any one of n or m are omitted, a suitable default is used. The default value of n is 0, and all implementations agree on this. The default value of m is infinite, but not all implementations agree on the value of infinite. You'll figure it out as you work with it. If the braces contain only a single number, with no comma, then it's a match of an exact count, so {5} means match exactly five times.

We can now modify our example to use this:
Which says match two or more times. This is exactly what we want.

It's tempting to think if we could do something at this point to match the patterns from our last lesson. To recollect, we needed a single pattern to match both /a+r+g+h*/ and /a+r+g*h+/. If we think about it, these can be factored.

/g+h*/ can also be written as /g*gh*/ while /g*h+/ can also be written as /g*hh*/.

Looking at these to regexes, it's basically /g*/ and /h*/ with either a 'g' or an 'h' in between. Either 'g' or 'h' is also written as /[gh]/, so our regular expression now becomes /a+r+g*[gh]h*/

Does this match everything? If the character class chooses 'g', then it is identical to /a+r+g+h*/ and if the character class chooses 'h', then it is identical to /a+r+g*h+/. The character class can never choose anything else, and must match exactly one character. It will therefore not match 'ar'. So, yes, it does match everything we want and nothing we do not want.

There are other ways to match it though. When we came up with the original problem, our solution was to have two regular expressions and pick one of them. In other words, we could alternate between two regexes. The regex language provides a meta character for alternation as well.

The | metacharacter alternates a match between the regex on its left and the regex on its right

Note the use of regex rather than entity. Unlike other meta characters, the | metacharacter does not apply to single character or group entities. It applies to everything on its left within the current regex. The current regex though, may be limited by grouping parentheses. A few examples should clear it up:
   /abc|def/         -  match /abc/ or /def/
/a(bc)|(de)f/ - match /a(bc)/ or /(de)f/
/a(bc|de)f/ - match /a/ followed by /bc/ or /de/ followed by /f/
Take note of all three. The | metacharacter is not affected by complete groups on its left or right. It will take into affect for everything extending up to the next enclosing group, ie the group in which the | is. If the | is not in any group, then it matches up to the start and end of the regex.

What if there are two | characters in there?
The rule still holds. Everything to the left and right. It doesn't really matter whether you think of each | individually matching everything to its left or everything to its right or all the | characters within a group as a single alternation switch. The outcome will always be the same. Think of it in the way that you find simplest. Note that in case 2, we have two separate | characters, each within a different group, and that case 3 is very similar to case 1 except that a group in case 3 limits the extent of the |.

Getting back to our example now, we could have written the initial regular expression as:
Which would have worked correctly, and possibly have been easier to read.

It's however not as efficient as the character class method we developed earlier, so when given a choice between a character class and an alternation, pick the character class. That said, it's not always possible to convert an alternation to a character class, and it can in many cases be easier to read, so if performance isn't hitting you, stick with what's easier to read and write.

I'll end lesson three here, and wonder about what's left to be covered.

More (or less) Regular Expressions?

In our last lesson, we learnt about simple regular expressions that matched exact character sequences, and then how to enhance them to match more than just one specific character at a position. Today we'll look at extending this to match more (or less) than one character. I'm also going to drop the =~ notation for now, as we will seldom care about the string that needs to be matched, and always look at the pattern only.

As an example, let's take the simple string 'argh'. I'm sure everyone relates to that, and it's language independent, so it's a good choice. The regular expression to match this, is simple: /argh/. Cool? Everyone happy? Let's all scream now.

Unfortunately, not everyone screams in the same way. While some of us go 'argh', others go, 'aaaaaargh', and still others go 'aaaaaarrrrrgggghhhh'. Man, it's gonna take some doing trying to match all those arghs. That's where counts come into play.

We have three new meta characters to look at today. The first is +, which says:

Match one or more of the preceding entity

Of course, this raises the question of what exactly is an entity. An entity is an atomic regular expression, or, in terms of what we've learnt so far, a single character, dot or character class. With that in mind, we can take some examples:
   /a/      -  matches 'a'
/ab/ - matches 'ab'
/a+/ - matches 'a', 'aa', 'aaa', 'aaaaa', and so on ad infinitum
/a+b/ - matches 'ab', 'aab', 'aaab', 'aaaaab', and so on but not 'abb'
/ab+/ - matches 'ab', 'abb', 'abbb', 'abbbbb', but not 'aab'
Note the last one in particular. The + applies only to the 'b' and not to 'ab' as a whole.

So, let's get back to our example. We need to match 'argh', 'aaaaaaargh', 'aaaaaarrrrggggghhhh' and everything in between. The only thing we know is that each of 'a', 'r', 'g' and 'h' occur at least once. The moment we hear 'at least once', it should trigger the image of a '+' in the regex constructing part of our brains, so let's go ahead and use the +:
Simple enough right? We now have a regex that matches all of our requirements and then some. We should be able to catch everyone screaming now, so go ahead and scream.

Hmm, there's still a bunch of folks who scream differently. Do you hear a bunch of 'aaaaarrrrhhhh's in there? The 'g' completely missing. And there's those who go 'aaaaaarrrrrgggg', but no 'h' at the end.

This brings us to our second metacharacter of the day. The * character, which says:

Match zero or more of the preceding entity

Just as with + our examples with * are:
   /a/      -  matches 'a'
/ab/ - matches 'ab'
/a*/ - matches '', 'a', 'aa', 'aaa', 'aaaaa', ...
/a*b/ - matches 'b', 'ab', 'aab', 'aaab', 'aaaaab', ...
/ab*/ - matches 'a', 'ab', 'abb', 'abbb', 'abbbbb', ...
Note the case of /a*/, it also matches the empty string, and the case of /a*b/ which also matches 'b', ie, both these matches do not contain 'a', or, in other words, they contain zero occurances of 'a'.

While + says that there MUST be at least ONE of the preceding entity, * says that the preceding entity may be completely absent and is in fact optional. They both agree that there's no maximum. Specific implementations may apply arbitrary maxima, but for most practical cases, you can assume it's infinite.

Back to our example now, we know that in some cases the 'g' is optional, while in the other case, the 'h' is optional.

So, to change the 'g' from matching 'one or more' times to match 'optionally one or more' times or 'zero or more' times, we change the + to a *:
Similarly for the optional 'h', we have:
We don't however have a single regular expression to match both cases, ie, either an optional 'g' or an optional 'h'. If we combine the two changes, we end up with this:
which unfortunately also matches 'ar', and I couldn't hear anyone scream that way. We've got to find a better way of matching this, but I'll do that later.

In the course of writing regular expressions, you'll come across several cases where a particular entity must occur either zero or one times, but not more than one time. An example would be pluralisation of words by adding 's'. For example, if you needed to match 'apple' as well as 'apples', what would you do? An easy solution would be to use /apples*/ which we know will match 'apple' and 'apples' because the 's' at the end can occur zero or more times. Unfortunately, that 'more times' part comes back to kick us in the behind by also matching 'appless' and 'applesssssss', which isn't what we want to match.

Enter our third metacharacter for the day, the ? character, which says something to the effect of:

The presence of the preceding entity is questionable

Or stated less formally:

This here entity's optional and may occur either once or not at all

Well, what a coincidence, that's exactly what we need. We write up our new regex as /apples?/ and are done with it. Ok, all done, let's go home now.

Umm, just one minute. A few more examples:
   /a?/     -  matches '' and 'a' and nothing else
/a?b/ - matches 'b' and 'ab' and nothing else
/ab?/ - matches 'a' and 'ab' and nothing else
Note that that last one does not match the empty string because the ? applies only to the 'b' and not to the entire 'ab'.

We've learnt a whole bunch of metacharacters now, including . + * ? and []. One question that I hope has arisen in your minds is, what if I want to match one of these characters? What if I actually do want to match a dot or a plus or star? Well, simply escape it using a backslash character: '\'. Thus, the regex to match '.' is /\./ and the regex to match '*' is '\*' and so on. So how do you match '\.' then? Well, escape the '\' as well as the '.' to get:
Yup, I know, pretty hairy, and depending on the implementation of regular expressions you use, this could set you up for what is known in professional circles as backslashitis. You have been warned.

I'll stop here today. We'll look into larger entities in the next lesson.

Friday, April 28, 2006

Beginning Regular Expressions

A poet, too, was there, whose verse
Was tender, musical, and terse.
I often get asked to take a class on Regular Expressions. I've been doing them for about 7 years now, and I like the terseness with which they get the job done. In the years that I've taught regular expressions, my beginner lectures have gotten far more concise and less detailed. I thought of putting some notes up here.

Start Simple

To start with, are thoughts on how one should go about building a regular expression. Any regular expression. Always start simple. Start with the smallest pattern that you can comfortably identify with. Note the emphasis on you. Don't go by someone else's idea of simplicity, always use your own. To a first time regexer, that may mean thinking in terms of english words directly. Let's start...

Note: for this tutorial, I'll use single quotes '' to mark literal strings and slashes // to mark regular expressions. Also, the =~ operator will be used to match the string on the left with the pattern on the right.

The simplest regular expression is one that matches a single specific character. Let's take the character 'a'. The regular expression to match this is /a/
'a' =~ /a/
Let's try that again. This time, a regular expression to match 't'.
't' =~ /t/
Simple so far?

Combining expressions

Let's move on to a little more complex stuff. Combining two simple regular expressions to make one complex regular expression. We'll combine the two regexes we learnt above to build a slightly more complex regex. A regex to match 'at'. This is basically the regex to match 'a' followed by the regex to match 't':
'at' =~ /at/
If you've got that, then you're all set, and can build some really complex stuff. Just remember to always break it down to the smallest manageable piece.

This also means that you don't break it down unnecessarily. Most people I know don't need to break everything down to a single character regex. They're quite comfortable dealing with /at/ or even something as big as /apple/ or /orange/. In fact, any combination of letters that make up a word in a familiar language also makes a familiar regex. From now on, we'll deal with these large thoughts and make even bigger things jumping back to the smaller chunks only when we introduce a new metacharacter.

The . metacharacter

Regular expressions are full of meta characters. Wildcards and modifiers that allow you to match more than just one specific literal character. One of the most used metachars is '.' (that's a dot or period). A dot matches any single character, so a regex like /./ matches 'a', 'b', 'c' and everything else, but not '' ie, the empty string, because that has no characters. The dot doesn't say what the character at its position should be, but it does say that there MUST be ONE and only ONE character there. It also does not say what, if anything, should preceed or succeed that one character. On its own therefore, a dot is not very useful. A string empty check is probably far more efficient. Used in conjunction with other regexes though, it becomes quite powerful.

For example, take a regular expression to match the word 'ate' preceded by any one character. Ie, I want to match words like 'aate', 'bate', 'cate', 'date', eate', and so on, for any possible first letter.
$string =~ /.ate/
Ok, I lied a bit when I said that /./ matches any character. There are two characters that it does not match. These are the NUL character (ie, character with ASCII code 0) and the newline character (except in special circumstances). For most cases, this isn't really a problem.

I'll move on.

Character Classes

The above example would also match character sequences like ' ate' (that's a space) or '9ate', or '+ate', etc. Perhaps that wasn't quite my intention. My intention was to have a letter of the alphabet followed by ate. The solution is to use what is known as a character class.

Now, before we go forward, in one corner of your mind store the fact that what most languages call a character class, and what POSIX calls a character class are minutely different, but both relate to the same thing. We'll address POSIX later, but for now we'll just think about what most implementations call a character class.

A character class is a series of characters, any one of which is a valid match for that position. An example may be easier to understand. Taking my earlier case, if I had to match the following words: 'date', 'fate', 'gate', 'hate', then my regular expression would be /[dfgh]ate/
$string =~ /[dfgh]ate/
Notice the square brackets containing dfgh (which just coincidentally happen to lie right next to each other on a QWERTY keyboard): '[dfgh]'. What that says is that the character at that position must match any one of 'd', 'f', 'g' or 'h'.

Note that it states two things. First, there MUST be ONE and only ONE character at that position, and second, that one character may be either a 'd', an 'f', a 'g' or an 'h' and nothing else. Commit to memory the ONE character thing, because that's where most people go wrong when they do inverted matches. Remember this as a rule:
  • A character class [] matches ONE and only ONE character
  • The . metacharacter is like a character class with all characters except \000 and \n
Note also that you can include \n and \000 in your character class.

Now character classes are useful, but if you have to match too many characters, this could get unwieldy. Not as unwieldy as the alternatives, but unwieldy nonetheless, and terseness is what we like anyway. Enter ranges.

Ranges in character classes

You can specify a character range within the square brackets to indicate that you want to match any character that lies within that range within the current character set. So, if we wanted to match all the lower case letters of the english alphabet, we'd write /[a-z]/. Note the order. In the previous case, it made no difference if we wrote /[dfgh]/ or /[dhgf]/, but in the current example, /[a-z]/ is very different from /[z-a]/. The latter is an invalid range.

A single character class needn't be restricted to a range though, and you can combine a range alongwith characters that are not part of the range. Our earlier example could be replaced with this instead:
$string =~ /[df-h]ate/
Not that it buys us much in terms of reducing keystrokes in this particular example, but there are cases where it will.

Inverted character classes

Character classes can also be negated to match any character NOT included in the class. To do this, use a '^' as the first character of the character class: /[^a-z]/. This regex will match any character that does not lie in the range 'a-z', both inclusive. Note, that it still requires ONE character to be at that position. The only difference is that we now state what that character cannot be rather than what that character can be. This regex will still not match the empty string ''.

I'm going to stop here for today. There's much more that needs to be done before we can make really useful regexes, but we should be able to play around a bit with this much. I'll post a follow up tomorrow.

Tuesday, February 21, 2006

Rich, accessible pagination with unobtrusive javascript

I've seen a few questions about doing pagination with AJAX, and I don't like the thought of it. It smells of accessibility problems. I've addressed this issue before in my own toy pages (since I don't actually write production web code), so thought I'd share it.

This is a sort of continuation of my post on progressive enhancement.

Problem definition

Given a long list of data, display it to the user in pages to avoid scrolling. Typically you'd have a bunch of navigation links at the end with First, Last, Next, Previous links, or links to specific pages.

Six steps from vanilla HTML pages to AJAX pages

It's important to note that the problem definition does not mention AJAX, but people always like to make their apps buzzword compliant. So, forget about AJAX for the moment and concentrate on solving the problem — retrieve a database resultset in limited sized pages. Once we've done that, it's five more steps to accessible AJAXified pages:
  1. Build the page as you would for html only pagination
  2. When your pagination links load, attach onclick handlers to them.
  3. The onclick handler makes an asyncRequest to this.href + '&js=1' (or something similar)
  4. Modify your backend code to check for js=1 in the query string.

    If not found, then send the entire page with header and footer as before
    If found, then send one of the following:
    • The html for the paged data
    • XML for the paged data
    • A JSON object representing the paged data
  5. In your callback for the asyncRequest, do one of the following:
    • Put the html into the innerHTML of your page's container object
    • Parse the XML and translate it to the DOM objects for your paged data
    • eval() the JSON and redraw the DOM for the paged data
  6. Rewrite the hrefs in your paging links to point to new pages.
You now have pagination that works with and without javascript enabled.

The Code

Let's look at some of the code. I'll use the yui utilities for connection and event management since I've been playing with that.

For simplicity, I'll assume that we're representing our data as a <LI>st. A table is similar, except that you need to redraw the entire table since it's read-only in IE.

Step 1: Build HTML (PHP with MySQL)
<div id="page">

// Fetch all results and print them
while($o = mysql_fetch_array($result, MYSQL_ASSOC))
<li><?php print $o['name'] ?></li>

// Verify next/last page links
$prev_page = ($pg<=0?0:$pg-1);
$next_page = ($pg>=$num_pages?$num_pages:$pg+1);

// Display navigation links, disable (via css) links that cannot be selected
<p class="navbar">
<a id="first-link" href="foo.php?pg=0"
class="<?php if($pg == 0) echo 'disabled' ?>">First</a>
<a id="prev-link" href="foo.php?pg=<?php print $prev_page ?>"
class="<?php if($pg == 0) echo 'disabled' ?>">Prev</a>
<a id="last-link" href="foo.php?pg=<?php print $num_pages ?>"
class="<?php if($pg == $num_pages) echo 'disabled' ?>">Last</a>
<a id="next-link" href="foo.php?pg=<?php print $next_page ?>"
class="<?php if($pg == $num_pages) echo 'disabled' ?>">Next</a>

Step 2: Attach onclick handlers
var nav_links = ['first-link', 'prev-link', 'next-link', 'last-link'];

YAHOO.util.Event.addListener(nav_links, 'click', navigationHandler);

Step 3: Make async call:
var callback =
success: gotResponse,
failure: failedResponse

var navigationHandler = function(e)
var url = this.href + '&js=1';

YAHOO.util.Connect.asyncRequest('GET', url, callback, null);

return false;

Step 4: Modify back end to check for js=1:
$js = $_GET['js']; if($js) { header('Content-type: text/json'); } else {
?> <div id="page"> <ul> <?php
} $json = array('n'=>$num_pages, 'p'=>$pg, 'l' => array());
// Fetch all results and print them while($o = mysql_fetch_array($result, MYSQL_ASSOC)) {
if($js) { $json['l'][] = $o['name']; } else {
?> <li><?php print $o['name'] ?></li> <?php
if($js) { print json_encode($json); // nothing more to output, so quit exit(); }
?> </ul>

I've hilighted the code that changed, it's just a bunch of if conditions. Yeah, it's ugly, but cleaning it up is not the purpose of this article.

Step 5: Make your asyncRequest handler:
var gotResponse = function(o)
var json = eval("(" + o.responseText + ")") ;

var list = json['l'];
var num_pages = json['n'];
var page = json['p'];

var prev_page = (page<=0?0:page-1);
var next_page = (page>=num_pages?num_pages:page+1);

var lis="";
for(var i=0; i<list.length; i++)
lis += "<li>" + list[i] + "</li>\n";

var ul = document.getElementById('page').getElementsByTagName('ul')[0];
ul.innerHTML = lis;

Step 6: Rewrite paging urls:
var fl = document.getElementById('first-link');
var pl = document.getElementById('prev-link');
var nl = document.getElementById('next-link');
var ll = document.getElementById('last-link');

var url = fl.href.substr(0, fl.href.indexOf('pg=')+3);

pl.href = url + prev_page;
nl.href = url + next_page;
ll.href = url + num_pages;

fl.className = pl.className = (page<=0?'disabled':'');
nl.className = ll.className = (page>=num_pages?'disabled':'');


Steps 5 and 6 are the same function of course, so don't split them up.

A brief explanation

Well, there you have it. If javascript is disabled, the default <A> behaviour is to make a GET request to foo.php with default values for pg. On every page call, the back end changes the value of pg in the Next and Previous links, and possibly in the Last link if records in the database have changed.

If javascript is enabled, we prevent the default href from being called with our return false;, and instead make the same call using asyncRequest, but with an additional query parameter saying that we want a javascript (json) object back.

The back end php script still hits the database as usual, and gets back a result set, which it now builds into a PHP hash, and then converts to a JSON object. The JSON object is sent back to the client where it is converted into HTML to push into the <UL>.

The page and num_pages variables allow us to rewrite the hrefs so that they point to up to date paging links, that you can, in fact, bookmark.


To make the code cleaner, you may want to build your PHP hash at the start, and then based on the value of $js, either convert it to HTML or to JSON. This of course has the disadvantage of having to iterate through the array twice. If you're just looking at 20 records, I'd say it was worth it, and a better approach if you start off that way.

This is quite a simple implementation. You could get really creative with javascript, showing funky page transitions that keep the user busy while your asyncRequest returns.

Update: json_encode is available from the PHP-JSON extension available under the LGPL.

Update 2: The cleaner way to write the PHP code that I mentioned in Improvements above is something like this:
$js = $_GET['js'];

header('Content-type: text/json');

$list = array();
while($o = mysql_fetch_array($result, MYSQL_ASSOC))
$list[] = $o['name'];

$json = array('n'=>$num_pages, 'p'=>$pg, 'l' => $list);
print json_encode($json);

// nothing more to output, so quit
<div id="page">
foreach($list as $name)
<li><?php print $name ?></li>

// Verify next/last page links
$prev_page = ($pg<=0?0:$pg-1);
$next_page = ($pg>=$num_pages?$num_pages:$pg+1);

// Display navigation links, disable (via css) links that cannot be selected
<p class="navbar">
<a id="first-link" href="foo.php?pg=0"
   class="<?php if($pg == 0) echo 'disabled' ?>">First</a>
<a id="prev-link" href="foo.php?pg=<?php print $prev_page ?>"
   class="<?php if($pg == 0) echo 'disabled' ?>">Prev</a>
<a id="last-link" href="foo.php?pg=<?php print $num_pages ?>"
   class="<?php if($pg == $num_pages) echo 'disabled' ?>">Last</a>
<a id="next-link" href="foo.php?pg=<?php print $next_page ?>"
   class="<?php if($pg == $num_pages) echo 'disabled' ?>">Next</a>

Update: I've put up a working example on sourceforge.

Thursday, February 16, 2006

Add drag and drop to any website

Have you ever visited a website and wondered, "Man, I wish I could drag that stuff out of the way"? Well, it really shouldn't be that hard. I've been doing this for a while on my own, and thought I'd share it.

To make it easy to use, I'd suggest you make a bookmarklet out of it that can be clicked on when you get to a page.

To start, download Yahoo's yui utilities:
Zip file. Unzip it into a convenient directory on your computer.
For this particular hack, I'd suggest putting the following files into the same directory:
  • YAHOO.js - from any of the subdirectories
  • event/build/event.js
  • dom/build/dom.js
  • dragdrop/build/dragdrop.js
Next, create your own script file, that looks like this:
function add_drag_drop(element)
   var divs = document.getElementsByTagName(element);
   for(var i=0; i<divs.length; i++)
      var id = "div-" + i;
         divs[i].id = id;

      var dd = new YAHOO.util.DD(id);

   return "Added drag-drop to " + i + " " + element + "s";
I'll call it adddragdrop.js. Use a javascript prompt at the top instead if you'd like to be prompted for the element's tag name.

The only thing this function does is iterate through all elements of the specified type, adding drag drop to them. If elements don't have an id, an id is added. Notice that it takes more lines of code to add ids than it does to add drag and drop. That's why I love this library.

You now need to add these five files to the page you're viewing. You can do it via the browser URL bar like this:
javascript:void(_s=document.createElement("script"), _s.src="YAHOO.js", document.body.appendChild(_s);
javascript:void(_s=document.createElement("script"), _s.src="event.js", document.body.appendChild(_s);
javascript:void(_s=document.createElement("script"), _s.src="dom.js", document.body.appendChild(_s);
javascript:void(_s=document.createElement("script"), _s.src="dragdrop.js", document.body.appendChild(_s);
javascript:void(_s=document.createElement("script"), _s.src="adddragdrop.js", document.body.appendChild(_s);
Of course, use the correct path for the files in there.

Finally, call the function from your url bar like this:
All <div>s on the page should now be draggable.

Ok, so it's a pain to do this over and over, so turn it into a bookmarklet or greasemonkey script. This is what
my bookmarklet looks like:
<a href='javascript:void(
_s=document.createElement("script"), _s.src="http://localhost/philip/yui/YAHOO.js", document.body.appendChild(_s),
_s=document.createElement("script"), _s.src="http://localhost/philip/yui/event.js", document.body.appendChild(_s),
_s=document.createElement("script"), _s.src="http://localhost/philip/yui/dom.js", document.body.appendChild(_s),
_s=document.createElement("script"), _s.src="http://localhost/philip/yui/dragdrop.js", document.body.appendChild(_s),
_s=document.createElement("script"), _s.src="http://localhost/philip/yui/adddragdrop.js",
_s.onload=_s.onreadystatechange = function(){alert(add_drag_drop("div"));},
document.body.appendChild(_s))'>Add drag drop to page</a>
(Whitespace added for readability)

Drag that link to your bookmarks toolbar to make it easily accessible, and of course, change the links to point to your own versions of the files.

Ok, we're ready to go now. Visit any page, click on the bookmarklet, and all divs become draggable. Is that cool or what?

If it doesn't work, let me know and I'll try and figure it out.

Anyway, now that you've got the code, play with it and show me what you can do.

Wednesday, February 01, 2006

Geo microformat to Yahoo! Map

After my post about converting the geo microformat to a google map, this one adds support for Yahoo! Maps. What I found cool was how alike the two APIs were.

I changed my map creation code to this:

if(mapAPI == 'G')
point = new GPoint(lon,lat);
map = new GMap(adr[i]);
map.addControl(new GSmallMapControl());
map.addControl(new GMapTypeControl());
map.centerAndZoom(point, 0);
point = new YGeoPoint(lat,lon);
map = new YMap(adr[i]);
map.drawZoomAndCenter(point, 1);
addMarkerToMap(map, point, html);

And the marker addition code to this:

if(mapAPI == 'G')
var marker = new GMarker(point);
GEvent.addListener(marker, 'click', function() {marker.openInfoWindowHtml(html);});
var marker = new YMarker(point);
html = "<div class='map-marker'>" + html + "</div>";
YEvent.Capture(marker, EventsList.MouseClick, function() {marker.openSmartWindow(html);});

Which creates as close an experience as possible in both maps.

The map provider should be selected at random, but if you'd like to force a particular map, add #G or #Y to the url. They can be found on the california reviews page.

Let me know what you think.

Tuesday, January 31, 2006

Of microformats and geocoding

I'd been toying with the idea of adding geo data to my restaurant reviews. I thought it would be nice to have a map pointer right below the instructions for getting there. I started looking around for well estabilished methods to markup geo data in a blog post.

I came upon an article in linux journal that spoke about geotagging and geocoding for websites. It talks about ICBM and meta tags, which are great except for one thing. It goes into the page header, so can't be different for different sections of the page. It also talks about embedding the information in comments - which means that users and javascript can't easily read it, and about RDF/RSS feeds - which would be useful for my blog feed, but not for my blog itself.

I decided to try my own minimalistic markup, and came up with this:

<address class="gmap" lat="yy" lon="xx" zoom="z">Some text</address>

Of course, I went through a couple of iterations to settle on this, and it was based largely on what the google maps api accepts.

A side note before I go on. The reason I chose google maps was because they provided aerial photos of a few major Indian cities, which is where most of my reviews are based. This turned out to be of no use though, because the aerial photos are not provided via the API. One has to visit Google Local to see them.

Ok, so this gave me the ability to easily add geo information to a post - just a single line. If I wanted to be really cool, I'd need to translate that to a map, so I started studying the google maps API, and after several iterations, came up with this:

//! Add a marker with a callout containing the specified html
function addMarkerToMap(map, point, html)
var marker = new GMarker(point);
GEvent.addListener(marker, 'click', function() {marker.openInfoWindowHtml(html);});

var adr = document.getElementsByTagName("address");
for(var i=0; i<adr.length; i++)
if(adr[i].className == 'gmap')
// Grab HTML to put into callout
var html=adr[i].innerHTML;
var lat, lon, zoom;
adr[i].innerHTML = "";

// Build map and center on lat/lon
var map = new GMap(adr[i]);
map.addControl(new GSmallMapControl());
map.addControl(new GMapTypeControl());
var point = new GPoint(lon,lat);
map.centerAndZoom(point, zoom);
addMarkerToMap(map, point, html);

What it does is quite simple. After page load, it iterates through all the divs in the page, looking for divs with class 'gmap'. When it finds such a div, it looks at the lat, lon and zoom attributes (which are not standard HTML btw) of the div and uses that to draw the map.

I soon realised that I was only using zoom level 0, so dropped that attribute and hard coded it in.

I showed it to Nate a little later, and he mentioned that there was a microformat for geocoding. Had a look at it, and while it was slightly more verbose than my format, it achieved a little more, and wouldn't be much harder to parse.

Changed the div to this:

<address class="geo">
Some text<br>
<abbr class="latitude" title="37.801324">N 37° 48.079</abbr> <br>
<abbr class="longitude" title="-122.424903">W 122° 25.494</abbr>

This shows up neatly, and I could change my javascript to accept both classes, 'gmap' as well as 'geo', and change the parsing to this:

if(adr[i].className == 'geo')
var ab = adr[i].getElementsByTagName('abbr');
for(var j=0; j<ab.length; j++)
if(ab[j].className == 'latitude')
lat = 1*ab[j].title;
else if(ab[j].className == 'longitude')
lon = 1*ab[j].title;

The rest of the code remains the same.

It isn't too hard to replace Google maps with Yahoo! maps in this implementation. The parsing of the microformat is the heart of it all, after that it's just a matter of using your API of choice.

You can see both formats in action on my reviews of US restaurants. Note that one of the restaurants uses the old format that I've described above, and the other two use the geo microformat. I'll add more in time, and a few from the UK as well.

Update: Calvin Yu has written a Yahoo! Maps implementation of geo.

Update: Switched to use the <address> tag instead of a <div>. Just seems more semantic.

Update: I have a Yahoo! Maps version as well.

Sunday, January 29, 2006

Progressive Enhancement via μMVC - I

The web today is like a huge buzzword bingo game. There's so much flying around that it's hard to stay in touch unless you're in it day in and day out. That's not something that old school engineers like me find easy. I'm far more comfortable staring at my editor, hacking code to interact with a database or some hardware. Interacting with users is tough. Doing it with sound engineering principles is even tougher.

I'm going to take a deep breath now and mention all the web buzzwords that I can think of and somehow fit them into this article.

AJAX, RIA, JSON, XML, XSLT, Progressive Enhancement, Unobtrusiveness, Graceful Degradation, LSM, Accessibility.


Let's get a few definitions in there:
A generic term for the practice of asynchronously exchanging data between the browser and server without affecting browsing history. AJAX often results in inline editing of page components on the client side.
Rich Internet Applications - Web apps built to feel like desktop applications. Most often built using AJAX methods and other funky user interactions
A popular new data interchange language used to exchange data between languages. Extremely useful for compactly sending data from a server side script to a Javascript function on the client
A common, but verbose and slow to parse data interchange/encapsulation format, used to exchange data between client and server.
XSL Transformations - transform XML to something else (most likely HTML) using rules. Can be executed on either client or server depending on capabilities
Progressive Enhancement
The practice of first building core functionality and then progressively adding enhancements to improve usability, performance and functionality
The practice of adding a progressive enhancement without touching existing code
Graceful Degradation
The ability of an application to gracefully retain usability when used on devices that do not support all required features, if necessary by degrading look and feel. Graceful Degradation follows from Progressive Enhancement
Layered Semantic Markup - The practice of building an application in layers. At the lowest layer is data encapsulated in semantic markup, ie, data marked up with meaning. Higher layers add style and usability enhancements. LSM enables Progressive Enhancement and Graceful Degradation.
The ability of an application to be accessed by all users and devices regardless of abilities or capabilities.
See Also: Progressive Enhancement at Wikipedia, Progressive Enhancement from the guy who coined the term, Progressive Enhancement from Jeremy Keith, Ajax, Graceful Degradation, Layered Semantic Markup, JSON

We'll get down to what this article is about, but first let me add my take on LSM.

LSM's layers

While LSM suggests development in layers, it doesn't specify what those layers should be. Traditionally, developers have looked at three layers: Semantic Markup, Semantic CSS and Javascript. I'd like to take this one level further.

The way I see it, we have 4 (or 5) layers.

Layers 1 and 2 are semantic markup (HTML) and semantic classes (CSS). Layer 3 in my opinion should be restricted to unobtrusive javascript added for UI enhancements. This would include drag and drop, hidden controls, and client side form validation, but no server communication.

Layer 4 adds the AJAX capability, however, just like Layer 3 does not absolve the back end from validating data, layer 4 does not absolve the back end from producing structured data.

Right down at the bottom is syncrhonous, stateless HTTP (Layer 0)

And now, back to our show.

Web application frameworks and MVC

There's been a lot of work in recent times to build web application development frameworks that make it easy for a developer to add AJAX methods to his app. Tools like Ruby on Rails, Django, Dojo and others do this for the user, and build on time tested design patterns.

For a long while web application frameworks have implemented the MVC pattern. Current frameworks merely extend it to move some parts of the view and controller to the client side instead of doing it all server side.

See also: MVCs in PHP, Intro to MVCs in PHP5, The controller, The view.

The problem with this is that your code is now fragmented between client and server, and implemented in different languages, possibly maintained by different programmers. Questions arise as to whether the bulk of your code should go into the server or the client, and of course, which model degrades best to account for accessibility?

Brad Neuberg has an excellent article on the pros and cons of each approach, and when you should choose which.

He still leaves my second question unanswered, but Jeremy Keith answers it with Hijax, his buzzword for hijacking a traditionally designed page with AJAX methods... in other words, progressive enhancement.

I've had thoughts that ran parallel to Jeremy's and it was quite odd that we ended up speaking about almost the same ideas at the same place and time. Well, he published and I didn't, so my loss.

Jeremy's ideas are spot on, but he doesn't mention implementation specifics, or whether the same code base can be used for more than just adding Ajax to an existing application.

More about MVC

The MVC pattern is great in that it doesn't state what your view should be, but merely that it should not be tightly coupled with your application model. Most implementers look at it as a way of designing an entire application around a single controller. Every action and subaction correspond to a controller branch, which in turn decides how data should be manipulated, and which view to call.

While this is good (if implemented correctly) at the high level, it is complex, and prone to bad design. It's not surprising that the big boys get wary when MVCs for web apps and PHP in particular are mentioned.


If instead, we look at views as just views of data, and different views of the same data, then we end up with a different structure. Instead of selecting a view based on the action to be performed, we select a view based on the output format that the user wants. This may be HTML in the default case, and the top level controller would merely stitch various HTML subviews together to form the entire page. Each subview sent across to the browser as soon as it's ready to improve performance.

If the user has other capabilities though, we send data in a different format, and chances are, we don't need to send across all subviews. A single subview that's very specific to the data requested is sufficient. We do less work on the server, fewer database queries, send less data across the network and improve performance overall on client and server. The data format selected depends on the client application, and may be an html snippet that goes in to innerHTML, a JSON datastructure that gets parsed on the client side, javascript code that gets evaled on the client side, or XML data returned as a web service or for client side XSL transforms.

We use exactly the same data processing code for all requests, and only switch on the final function that transforms your internal data structures to the required output format.

I call this a micro MVC (μMVC) because the model, view and controller all act on a very fine granularity without considering overall application behaviour. Note also that the view and controller are now split across the server and client.

The client side controller kicks in first telling the server side controller what it's interested in. The server side controller performs data manipulation, and invokes the server side view. The client side controller passes the server side view to the client side view for final display.

This development model fits in well with the LSM framework which in turn leads to Progressive Enhancement and Graceful Degradation, and most of all, it opens up new avenues of accessibility without excessive degradation.

In part II of this article, I'll go into implementation details with examples in PHP and some amount of pseudocode.