Tuesday, September 20, 2011

Resource-oriented URLs with dojo.hash and HTML5 history

Web development is so exciting! Every new browser release ships with a whole bunch of shiny new features to play with. And now that Firefox and Chrome are releasing new versions every couple of months, the dream of HTML5 is getting closer and closer to becoming "real". One of these new features is the History API, which allows a web page to programmatically update the browser's URL without causing a page transition. Holy ajax goodness, Batman! We've been stuck manipulating hashes for years because the hash was the only part of the URL that could be changed without causing the page to reload... until now.

In HTML5, the history object has been augmented with two new functions: pushState and replaceState. pushState adds a new entry to the browser's session history (the list the back button walks through), and replaceState replaces the current entry; neither reloads the page. This is very powerful, and makes it extremely easy for ajax applications to use resource-oriented URL schemes. What's a resource-oriented URL and why is it so desirable to use them? I'm glad you asked!
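As a rough illustration of the difference between the two calls, the sketch below models the session history as a plain array. FakeHistory is a hypothetical stand-in written for this example, not a browser API; in a real page you would call window.history.pushState(state, title, url) and window.history.replaceState(state, title, url) directly.

```javascript
// Hypothetical in-memory model of the browser's session history,
// used only to illustrate pushState vs. replaceState semantics.
class FakeHistory {
  constructor(initialUrl) {
    this.entries = [{ state: null, url: initialUrl }];
    this.index = 0;
  }

  // Adds a new entry after the current one and drops any forward entries.
  pushState(state, title, url) {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push({ state: state, url: url });
    this.index += 1;
  }

  // Overwrites the current entry in place; the number of entries is unchanged.
  replaceState(state, title, url) {
    this.entries[this.index] = { state: state, url: url };
  }

  get length() {
    return this.entries.length;
  }

  get currentUrl() {
    return this.entries[this.index].url;
  }
}

const fakeHistory = new FakeHistory("/myapp/resource");
fakeHistory.pushState({ id: 1 }, "", "/myapp/anotherresource"); // now 2 entries
fakeHistory.replaceState({ id: 2 }, "", "/myapp/thirdresource"); // still 2 entries
```

After these two calls the back button would return to /myapp/resource, because the replaceState call overwrote the entry that pushState had added rather than adding a third one.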

In resource-oriented architecture (ROA), an application must expose one URI per resource, and operations on each resource are performed using standard HTTP methods (GET, POST, PUT, DELETE, etc.). For the purpose of this post, I'm going to focus on GET, which is the request a web browser issues when you type a URL into the address bar. When a browser performs a GET on a resource, content negotiation is performed (by looking at the request's Accept header), and the server responds with the resource in the format that was requested. The browser expects an HTML response, but other consumers may want a different representation (RDF+XML, JSON, etc.). This concept is extremely powerful, because it serves different types of consumers in the way they wish to be served via the exact same URL - the URL of the resource.
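To make content negotiation a little more concrete, here is a simplified sketch of how a server might choose a representation from the Accept header. The function names negotiate and matches are made up for this example, and the logic is deliberately minimal: it honours q-values and wildcards but ignores the finer precedence rules of the HTTP specification.

```javascript
// Pick the best representation for an Accept header value.
// `available` lists the media types the (hypothetical) server can produce.
function negotiate(acceptHeader, available) {
  const preferences = acceptHeader
    .split(",")
    .map(function (part, position) {
      const pieces = part.trim().split(";");
      const type = pieces[0].trim().toLowerCase();
      const qParam = pieces
        .slice(1)
        .map(function (p) { return p.trim(); })
        .find(function (p) { return p.indexOf("q=") === 0; });
      const parsed = qParam ? parseFloat(qParam.slice(2)) : 1;
      return { type: type, q: isNaN(parsed) ? 1 : parsed, position: position };
    })
    .filter(function (pref) { return pref.type && pref.q > 0; })
    // Higher q first; ties keep the order in which the client listed them.
    .sort(function (a, b) { return b.q - a.q || a.position - b.position; });

  for (const pref of preferences) {
    const match = available.find(function (candidate) {
      return matches(pref.type, candidate);
    });
    if (match) return match;
  }
  return null; // a real server would typically answer 406 Not Acceptable
}

// True when a media range such as "text/*" covers a concrete media type.
function matches(range, candidate) {
  if (range === "*/*") return true;
  const rangeParts = range.split("/");
  const candidateParts = candidate.split("/");
  return rangeParts[0] === candidateParts[0] &&
    (rangeParts[1] === "*" || rangeParts[1] === candidateParts[1]);
}
```

With available set to ["application/rdf+xml", "text/html"], a typical browser Accept header that leads with text/html selects the HTML representation, while a client sending only application/rdf+xml gets RDF from the same URL.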

The trouble starts when you want to combine these resource-oriented URLs with ajax applications (i.e. web apps that enable transitions between different resources without reloading the page). This is a fairly common practice these days, with big players like Google, Facebook and Twitter all using an ajax approach to UIs. Ajax applications are desirable because the common parts of the UI only have to be loaded once. How annoying would it be if your Facebook chat windows along the bottom had to disappear and reload every time you navigated to some other page in Facebook? (Answer: very). It would be a jarring user experience and wasteful of bits over the wire, even with heavy caching and optimized performance. How can we have our cake and eat it too? Enter history.pushState.

Before, because we were unable to manipulate the URL's path without a page reload, we used the following hash-based URL design:
https://example.com/myapp#/resource
Under the covers, we would register a servlet at /myapp/* that sends down an empty page with some JavaScript. The JavaScript would inspect the hash, figure out what needs to be displayed, fetch it, and then render it.  There are a few problems with this approach:
  1. The servlet is unable to see the value of the hash because the browser does not send it with the request. This makes it impossible to do any pre-processing of the hash on the server side, so a second round trip is always needed to fetch the resource payload.
  2. The page is blank until the JavaScript parses the URL and loads the appropriate resource. Making users wait is bad.
  3. The URL is ugly.
  4. It's impossible to do content negotiation to send other resource representations, since the server can't see what resource you're trying to fetch.
Now, with history.pushState, we can use the following URL scheme:
https://example.com/myapp/resource
We still register a servlet at /myapp/*, but this servlet can read the full path, so it can a) do content negotiation, and b) send more meat down with the initial HTML payload. Within the UI, links to other resources follow the same format (i.e. https://example.com/myapp/anotherresource), but we use a body click listener to intercept clicks to links that are within the servlet's namespace, and use history.pushState to update the URL instead of allowing the click to reload the page. Some JavaScript detects that the URL has changed, parses it, and then fetches the resource.
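The body click listener described above might look roughly like the following. This is a sketch under assumptions, not the Jazz implementation: shouldIntercept and the "/myapp" context root are names chosen for the example, and the wiring uses modern DOM calls (addEventListener, Element.closest) instead of Dojo's event helpers.

```javascript
// Decide whether a click on a link should be handled with pushState.
// `contextRoot` (e.g. "/myapp") is an assumed name for the servlet's namespace.
function shouldIntercept(event, linkPath, contextRoot) {
  // Leave middle/right clicks and modifier-key clicks to the browser so that
  // "open in new tab/window" keeps working.
  if (event.button !== 0) return false;
  if (event.altKey || event.ctrlKey || event.shiftKey || event.metaKey) return false;
  // Only intercept links inside the servlet's namespace.
  return linkPath === contextRoot || linkPath.indexOf(contextRoot + "/") === 0;
}

// Browser wiring; skipped when there is no DOM (for example under Node.js).
if (typeof document !== "undefined" && window.history && window.history.pushState) {
  document.body.addEventListener("click", function (event) {
    const link = event.target.closest ? event.target.closest("a[href]") : null;
    if (!link || link.origin !== window.location.origin) return;
    if (!shouldIntercept(event, link.pathname, "/myapp")) return;
    event.preventDefault(); // stop the full page load
    window.history.pushState(null, "", link.href);
    // ...then fetch and render the resource for link.pathname
  });
}
```

Keeping the decision in a small pure function makes it easy to check the awkward cases, such as a link to /myapplication/x that merely shares a prefix with the context root.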

This adds some new considerations: a) how do we detect that the URL has changed? b) how do we gracefully degrade for browsers that don't support history.pushState (IE9 and earlier, Firefox 3.6 and earlier)?  This is where dojo.hash comes in.

A while back, I contributed a little utility to the Dojo Toolkit for manipulating hashes, called dojo.hash (with help from Bill Higgins and John Ryding).  dojo.hash shims the HTML5 onhashchange event for browsers that don't support it, and provides convenience methods for getting and setting the hash value.  Before history.pushState existed, we lived in this strange dystopia where all useful information that we needed for building up the UI was part of the hash, so we took dojo.hash() for granted and used it everywhere: for triggering transitions to other resources, for subscribing to hash changes, and for reading the current hash to determine context. Rather than reinvent the wheel and come up with a new API for these things, I decided to add history.pushState support to dojo.hash.

First off, I had to come up with a convention for distinguishing between dojo.hash() setter calls that should invoke history.pushState versus those that should set the hash. This was easy. Anything that starts with a forward slash is intended to be part of the path, so use history.pushState (if the browser supports it).  Otherwise, set the hash. Next, I had to augment the dojo.hash() getter to return the resource specific part of the path. Lastly, I had to define the concept of a context root, so that the getter reads and the setter sets only the part of the path that would have previously been part of the hash.
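The convention can be sketched as follows. To be clear, this is not the dojo.hash source or the Jazz override: createHash, its win parameter (a window-like object) and the contextRoot argument are all hypothetical names used to show the getter/setter rules described above.

```javascript
// Hypothetical sketch of the getter/setter convention; not the dojo.hash source.
// `win` is a window-like object and `contextRoot` is e.g. "/myapp".
function createHash(win, contextRoot) {
  const canPushState = !!(win.history && win.history.pushState);

  return function hash(value) {
    if (arguments.length === 0) {
      // Getter: prefer the part of the path after the context root,
      // falling back to the hash (without the leading "#").
      const path = win.location.pathname;
      if (path.indexOf(contextRoot + "/") === 0) {
        return path.slice(contextRoot.length);
      }
      return win.location.hash.replace(/^#/, "");
    }
    // Setter: a leading slash means "this is a path", so use pushState
    // when it is available; everything else goes into the hash.
    if (value.charAt(0) === "/" && canPushState) {
      win.history.pushState(null, "", contextRoot + value);
    } else {
      win.location.hash = value;
    }
    return value;
  };
}
```

In a browser this would be used as var hash = createHash(window, "/myapp"); hash("/resource"); which moves the URL to /myapp/resource where pushState exists and to /myapp#/resource where it does not. In both cases hash() then returns "/resource", so calling code does not need to know which mechanism was used.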

I had to implement a few other little nitpicky things: 1) the body click listener that listens for clicks on links that begin with the context root and calls dojo.hash() instead, 2) a first-load redirect rule that strips off redundant hashes.

In the end:
  1. All of our URLs are fully qualified and resource-oriented.  
  2. The servlet is able to see the full URL, so it can do content negotiation.
  3. The first load experience doesn't have to be a blank page anymore!
  4. Consumers continue to use the same API they always used to get and set hashes and it transparently handles pushState if it can.  
  5. URLs are RESTful and pretty.
  6. Graceful degradation for older browsers.
Visit Jazz personal dashboards on Jazz.net (registration required) for a look at the (almost) finished product. We're doing content negotiation and everything! (If you request a dashboard resource with an application/rdf+xml Accept header, you'll get what you asked for.) We haven't implemented (3) yet, but that'll be along shortly.

Note: these dojo.hash changes have not been contributed to dojo. They're implemented as overrides in Jazz. Stay tuned.

11 comments:

Boris Bokowski said...

Does the body click handler do the right thing when you middle-click on a link, or Ctrl/Cmd-click it?

Rob Retchless said...

Definitely. If a modifier key (alt, ctrl, shift) is pressed, the body click handler does nothing.

That said, I think I just found a bug... I'm not handling metaKey properly in Jazz! Doh!

dantechpro said...

I need your help, Rob. I am trying to use the functionality you demonstrate in

http://retchless.com/hash/hash.html

I got everything working great in Firefox, but I can't get my implementation working in IE8. Your page works fine in IE8.

In my attempt to track down the issue, I have copied your source code and your dojo.js file. When I open hash.html in IE8, the page loads fine. When I click a color link, the background color does not change. The hash updates and IE reports Errors on page. If I refresh the page with a color set in the hash, then the background color does change accordingly.

My copy of your page does work fine in Firefox. Any ideas on what I might be missing? I have tried this by opening hash.html from my local machine and I have tried hosting it on my tomcat web server and got the same results.

Rob Retchless said...

That hash.html demo is very out of date. The most recent code is shipping as part of the dojo toolkit as dojo.hash: http://docs.dojocampus.org/dojo/hash

There are two issues with the old code.

1) Make sure your page uses the HTML5 doctype:
2) Connect to the onhashchange event on the window instead of document.body.

Does it work now?

Rob Retchless said...

Oops. Lost the doctype from previous comment: <!DOCTYPE html>

dantechpro said...

DOCTYPE was set properly. Changed document.body to window. Still not working. I went ahead and used the new code described at http://docs.dojocampus.org/dojo/hash

I had trouble using it before, but it is working great now. Thanks for the help and your prompt response.

If I understand right, you wrote dojo.hash, correct?

Rob Retchless said...

Yup!

jcantwell said...

Any update on when the code will be contributed to dojo?

Rob Retchless said...

No update yet, but there is a bug open and targeted for Dojo 1.8: http://bugs.dojotoolkit.org/ticket/13958

fbeuvens said...

Very interesting piece of work!

Do you know when the code will be contributed to Dojo? Is the update still expected for Dojo1.8?

Karl Tiedt said...

Looks like this got pushed back even farther? A patch would help keep interest on the ticket you linked...

Did this require any changes to dojo/router to work with this or did that just work?

Would love to see the patch... Kind of surprised in 3 years it was never shared. :(