Front End Performance Lessons from Rotten Tomatoes: Caching

I’ve been spending a lot of time obsessing over making things faster at rottentomatoes.com. In this multi-part performance series I’ll go over the various things I’ve tried and which of them were successful. Some of them are evangelized by Steve Souders, and some are not mentioned at all. I want to first focus on caching. When it comes to the browser there are various possible caching levels:

  1. Data caching: this is caching at the data layer (DB). It is very commonly accomplished with distributed caches like memcache and/or in-memory caches like ehcache. If you have a site with at least a million users, you are already doing this. This is a well-covered topic, so I will not be discussing it in this post.
  2. Dynamic content caching: regenerating dynamic content is expensive. One JSP page can hit the database/memcache many times in one request. Consolidating all the generated content into one memcache call can possibly yield faster response times.
  3. HTTP caching: an often overlooked part of speeding up performance. HTTP comes with an elegant caching solution built in. By setting ETag and Cache-Control headers for content that doesn’t change often, you can reduce server load and improve page performance. The HTML content is not loaded again, saving on bandwidth and response latency; plus most browsers are optimized to display the page faster if nothing has changed.

JSP Caching

We are obviously already doing #1 at Rotten Tomatoes. So the first caching mechanism I went after was JSP caching, the main focus of this post. The soundest way of implementing this is by creating a custom tag. This way the author of a particular page can choose to wrap the content in a cache tag, and future modifiers are aware of which parts of the page are being cached. Some content can be very dynamic, so you want to give granular control to developers so they can make their own decisions. To sum up our requirements:

  • Granular. Allows anything from a section to the entire page to be cached. This allows common shareable sections to be cached on their own.
  • Visible. Developers have to be aware of where it’s being employed.
  • Simple. Should be easy to drop in without having to consider what’s happening behind the scenes. Should assume safe defaults (caching for a couple of hours is a bad idea).
  • Transparent. You should know when/what is cached and be able to see the page without the cache.
  • Fast. Obvious goal at the end of the day is that we shouldn’t be hitting the cache multiple times at the view layer per request.

The defaults I chose were to put all the JSP cached content into its own regions, keyed by TTL. The default TTL is 15 minutes, and the default key is the request URL. Both values can safely be overridden, and usually are, depending on the context in which they are used. Setting up the tag is easy:

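import java.io.IOException;
import java.io.StringWriter;
import javax.servlet.jsp.JspException;
import javax.servlet.jsp.tagext.SimpleTagSupport;

public class JSPCacheTag extends SimpleTagSupport {
    @Override
    public void doTag() throws JspException, IOException {
        // Look up the rendered fragment in the region for this TTL.
        String cached = memcache.get(key, ttl);
        if (cached == null) {
            // Cache miss: render the tag body into a buffer and store it.
            StringWriter buffer = new StringWriter();
            getJspBody().invoke(buffer);
            cached = buffer.toString();
            memcache.put(key, cached, ttl);
        }
        getJspContext().getOut().write(cached);
    }
...
}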
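
The tag leans on a cache client and on defaults that the listing leaves out. As a rough sketch of how the TTL regions and the default key and TTL could fit together, the class below uses an in-memory map as a stand-in for memcache. `FragmentCache`, `getOrRender`, and the `Supplier`-based renderer are hypothetical names for illustration, not the actual Rotten Tomatoes code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** In-memory stand-in for the memcache client; all names here are hypothetical. */
final class FragmentCache {
    /** Default TTL from the post: 15 minutes. */
    static final int DEFAULT_TTL_SECONDS = 15 * 60;

    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // One region per TTL, so entries with the same lifetime live together.
    private final Map<Integer, Map<String, Entry>> regions = new ConcurrentHashMap<>();

    /** Returns the cached fragment, rendering and storing it on a miss. */
    String getOrRender(String key, int ttlSeconds, Supplier<String> renderer) {
        Map<String, Entry> region =
                regions.computeIfAbsent(ttlSeconds, t -> new ConcurrentHashMap<>());
        long now = System.currentTimeMillis();
        Entry entry = region.get(key);
        if (entry == null || entry.expiresAtMillis <= now) {
            // Miss or expired: render once and keep the result for ttlSeconds.
            entry = new Entry(renderer.get(), now + ttlSeconds * 1000L);
            region.put(key, entry);
        }
        return entry.value;
    }

    /** Convenience overload: default TTL, keyed by the request URL. */
    String getOrRender(String requestUrl, Supplier<String> renderer) {
        return getOrRender(requestUrl, DEFAULT_TTL_SECONDS, renderer);
    }
}
```

With this shape, a page that passes only its request URL gets the 15 minute default, while a section that changes more often can pass its own key and a shorter TTL and land in a separate region.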

Due to the nature of custom JSP tags, we have already met all of the requirements that we set out to accomplish. The one remaining problem is nested caching.

Nested caching

One problem we encounter with this approach is that if we cache the entire page, and children of that page are also marked to be cached, then we violate both our simplicity and speed requirements. Every time the entire page is cached, we would also be checking and possibly caching the child elements, which is unnecessary since the entire page is already cached. There is a way to avoid this: set a boolean request variable when the parent is being cached; the children check for the variable and, if it exists, do not cache their content. This is of course a micro-optimization that reduces latency and the use of your cache space.
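
The request-variable guard can be sketched roughly as follows. This is only an illustration under assumptions: a plain `Map` stands in for the servlet request attributes, a `HashMap` stands in for memcache, and `NestedCacheGuard`, `PARENT_CACHING`, and `render` are made-up names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Sketch of the nested-cache guard; all names are hypothetical. */
final class NestedCacheGuard {
    /** Request attribute set while an enclosing cache tag renders its body. */
    static final String PARENT_CACHING = "jspcache.parentCaching";

    private final Map<String, String> cache = new HashMap<>();
    int cacheLookups = 0; // counts cache round trips, for illustration only

    /**
     * Renders a cacheable section. requestAttrs stands in for the servlet
     * request's attributes (request.getAttribute / request.setAttribute).
     */
    String render(Map<String, Object> requestAttrs, String key, Supplier<String> body) {
        if (Boolean.TRUE.equals(requestAttrs.get(PARENT_CACHING))) {
            // A parent is already caching this output: skip the cache entirely.
            return body.get();
        }
        cacheLookups++;
        String cached = cache.get(key);
        if (cached == null) {
            requestAttrs.put(PARENT_CACHING, Boolean.TRUE);
            try {
                cached = body.get(); // nested sections see the flag and only render
            } finally {
                requestAttrs.remove(PARENT_CACHING);
            }
            cache.put(key, cached);
        }
        return cached;
    }
}
```

Clearing the flag in a `finally` block keeps a failed render from leaving the rest of the request in "parent is caching" mode.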

Result

For rottentomatoes.com the effect of the JSP cache was significant. On average the JSP generation step went from ~400ms to ~20ms, yielding a noticeable reduction in request latency.

HTTP Caching

HTTP caching is often highly leveraged for CDN-bound static content, but unused for dynamic content. Setting the cache headers is best done in a request Filter. The implementation is definitely simpler than the one for the JSP cache. Sample pseudo-code:

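// MD5Hash is a placeholder for whatever content-hashing helper you use.
String etag = "\"" + MD5Hash(content) + "\"";
response.setHeader("Cache-Control", "max-age=86000, must-revalidate");
response.setHeader("ETag", etag);
// Browsers send the previously received ETag back in If-None-Match.
if (etag.equals(request.getHeader("If-None-Match"))) {
    response.setStatus(HttpServletResponse.SC_NOT_MODIFIED); // 304, no body
}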

This is one of the next steps in our caching optimizations. Early testing gives very positive results.
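
For the hashing and comparison step, a minimal sketch using only the JDK might look like the following. `EtagSupport`, `etagFor`, and `statusFor` are hypothetical names, and the comparison is deliberately simplified: it assumes the browser sends back a single ETag value, while a real `If-None-Match` header can also carry a comma-separated list or `*`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch of the ETag check; class and method names are hypothetical. */
final class EtagSupport {
    /** Returns a quoted, hex-encoded MD5 of the content, usable as an ETag value. */
    static String etagFor(String content) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder("\"");
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Picks 304 when the single If-None-Match value matches, otherwise 200. */
    static int statusFor(String ifNoneMatchHeader, String content) {
        return etagFor(content).equals(ifNoneMatchHeader) ? 304 : 200;
    }
}
```

MD5 is used here only as a cheap content fingerprint, not for security. Note that the page still has to be generated to compute the hash, so on its own this saves bandwidth rather than server work.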

Conclusion

Hopefully this has given you some ideas on how to leverage caching in different ways for increased front end performance in your project. These techniques are mostly helpful in semi-static environments. In cases where the content is highly dynamic and user specific, you can leverage HTML5 local storage, but that is a topic for another day.

3 Responses to Front End Performance Lessons from Rotten Tomatoes: Caching

  1. andrei says:

    Hello Artur
    Thanks for interesting article.
    Question – was performance the main reason for going to jsp? AFAIK – didn’t you guys have been using jsp before?

  2. andrei says:

    i meant:

    AFAIK – didn’t you guys have been using PHP before with zend framework or something?

  3. artur says:

    Hi Andrei,

    That is correct, Rotten Tomatoes used to be on PHP. However, after Flixster acquired them we had to convert them over to our infrastructure, which was built on top of Java EE. The transition was required for simplicity of deployment and development, but there were some performance wins as well.

    Technically you can use PHP caching as well in a similar way. Both essentially serve the same function in the Java and LAMP world.
