Tech stack of my blog

February 2012

This is a post about the technology stack of the tomicloud.com blog.

Being a software engineer familiar with a number of frameworks and tools, I happily developed this blog myself, what else! Having total control of the stack allows me to play and experiment with the system in unseen ways. This site will most likely function more as a technology playground than a static blog.

Overview

Here are the most important building blocks of my blog:

Python language. I am a long-time Pythonista since 1999. Python just fits my brain and makes me ultra productive.
Django is a high-level full-stack framework that has become the web framework of choice in Python land. I've used other frameworks too (CherryPy, Bottle, and a custom WSGI glue for Plango), but the benefits of the Django ecosystem cannot be ignored.
nginx is a high-performance asynchronous HTTP server that can tackle the C10K problem. nginx now powers 12% of the world's sites and has become the 2nd most popular HTTP server after Apache.
uWSGI is a fast Python application container with robust deployment features. Still a relatively unknown gem.
SQLite is a tiny embedded database. It's very easy to use and sufficient for now, since my site won't get millions of page hits anytime soon, and essentially the blog is currently read-only without comments. Once the need arises, I'll switch to PostgreSQL or to my favorite NoSQL solution, MongoDB.
memcached is the world-famous distributed in-memory key-value caching system. Used for session storage.
Upstart is the next generation init daemon for managing services and tasks on Unix. Default on Ubuntu since version 9.10.
Ubuntu is the Linux distribution at the bottom of my stack, hosted by Linode. Linode has provided the best bang for the buck for my small needs. The Linode data center I use is located in London.

No need to mention: all of these components are open-source.

I won't go into details about compiling and installing these. Instead, I'll explain some of the high-level configurations and highlights relevant to my blog.

Upstart

Upstart controls services via configuration script files in the /etc/init/ directory. Here's the simple script for my Django blog:

# /etc/init/tomi-django.conf:
start on startup
stop on shutdown

chdir /home/tomi/django-workdir
exec /bin/sh productionrun.sh

Two points here: the working directory of the daemon is set to my Django site root directory /home/tomi/django-workdir/, and a script called productionrun.sh is executed from the directory.

I start and stop the service from the command line with sudo start tomi-django and sudo stop tomi-django.

uWSGI

uWSGI is an application container that provides great facilities for building fast, scalable, robust and self-healing Python servers. Basically uWSGI forks a number of slave processes and monitors their operation.

Here's the script that is executed by Upstart:

# /home/tomi/django-workdir/productionrun.sh:
#!/bin/sh
exec /usr/local/bin/uwsgi \
    -M \
    -p 3 \
    -t 5 \
    -Q /tmp/spooldjango \
    --socket 127.0.0.1:8030 \
    --max-requests 1000 \
    --module wsgi \
    --logto /home/tomi/django.log

Remarks here: -M activates the uWSGI master-slave operation. -p 3 specifies 3 simultaneous slave processes. -t 5 specifies harakiri timeout: a request must complete in 5 seconds or the slave will get killed and a fresh slave is forked. -Q specifies a queue for background tasks, see below.

--max-requests 1000 specifies that each slave process serves a maximum of 1000 requests and is then killed and reforked. This is a simple mechanism to contain memory or db connection leaks; no slave process can run forever. I don't see any leaks here, but this is a good defensive step anyway.

--module wsgi specifies that a main script called wsgi.py is run by uWSGI.
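
The wsgi.py module itself is not shown in this post. As a minimal sketch of what uWSGI expects from --module wsgi: the module has to expose a WSGI callable, which uWSGI looks up under the name application by default. The framework-free handler below is only an illustration of that contract; in a Django project the callable would come from Django's own WSGI handler instead.

```python
# wsgi.py (illustrative stand-in, not the blog's real module)

def application(environ, start_response):
    """Smallest useful WSGI app: answer every request with plain text."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # The server (uWSGI here) supplies start_response and sends the headers.
    start_response("200 OK", headers)
    # A WSGI app returns an iterable of byte strings.
    return [body]
```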

nginx

nginx is the gateway HTTP server that talks to the world. It forwards browser requests to the uWSGI daemon via the efficient uwsgi binary protocol, support for which is built into nginx.

Here's my nginx configuration, appended to the end of standard nginx.conf:

# /usr/local/nginx/conf/nginx.conf:
...
upstream django {
       ip_hash;
       server 127.0.0.1:8030;
}
server {
       listen       80;
       server_name  tomicloud.com www.tomicloud.com;

       location /static/ {
           root   /home/tomi/django-workdir/;
           expires 1d;
       }
       location /favicon.ico {
           root   /home/tomi/django-workdir/static/;
       }
       location /robots.txt {
           root   /home/tomi/django-workdir/static/;
       }
       location / {
            uwsgi_pass  django;
            include     uwsgi_params;
       }
}

nginx serves static files directly from the file system and forwards all other requests to a single uWSGI daemon listening on port 8030. (And the uWSGI daemon runs 3 workers.)

nginx can also load balance uWSGI daemons across machines, providing scalability.

nginx is also started via an Upstart script.

Build script

The site is built and deployed the old-school way, with a single Makefile and one command, make django:

PRODHOST=tomicloud.com

TMPDIR := $(shell mktemp -d /tmp/django.XXX)
SRCDIR = $(TMPDIR)
JSDIR = $(TMPDIR)/static/
DESTDIRSTOCK = django-workdir
COMPRESSOR = java -jar ../../minimize/yuicompressor-2.4.2.jar

all: xxx

prepare:
    @echo "Copying sources locally to $(TMPDIR)"
    @cp -R * $(TMPDIR)/
    @mkdir $(TMPDIR)/static/admin
    @cp -R ~/Downloads/Django-1.3.1/django/contrib/admin/media/ $(TMPDIR)/static/admin/

django: minify
    @echo "Uploading sources..."
    @-ssh $(PRODHOST) sudo stop tomi-django
    @rsync -a -v $(SRCDIR)/ $(PRODHOST):$(DESTDIRSTOCK)
    @-ssh $(PRODHOST) sudo start tomi-django
    @echo "Done! Deleting $(TMPDIR)"
    @rm -fr $(TMPDIR)/

minify: prepare
    @echo "Minifying .js and .css files locally"

    # js full
    @cat $(JSDIR)/main.js >$(JSDIR)/temp.js
    @$(COMPRESSOR) $(JSDIR)/temp.js -o $(JSDIR)/temp2.js  --line-break 80 --preserve-semi --charset utf-8
    @echo "/* Hello there, you hacker! */" >$(JSDIR)/full.js
    @cat $(JSDIR)/temp2.js >>$(JSDIR)/full.js

    # js mobile
    @cat $(JSDIR)/zepto.js >$(JSDIR)/temp.js
    @cat $(JSDIR)/fx_methods.js >>$(JSDIR)/temp.js
    @cat $(JSDIR)/main.js >>$(JSDIR)/temp.js
    @$(COMPRESSOR) $(JSDIR)/temp.js -o $(JSDIR)/mob.js  --line-break 80 --preserve-semi --charset utf-8

    # css full
    @cat $(JSDIR)/main.css >$(JSDIR)/temp.css
    @cat $(JSDIR)/full.css >>$(JSDIR)/temp.css
    @cat $(JSDIR)/pastie.css >>$(JSDIR)/temp.css
    @$(COMPRESSOR) $(JSDIR)/temp.css -o $(JSDIR)/full.css  --line-break 80 --charset utf-8

    # css mobile
    @cat $(JSDIR)/main.css >$(JSDIR)/temp.css
    @cat $(JSDIR)/mobile.css >>$(JSDIR)/temp.css
    @cat $(JSDIR)/pastie.css >>$(JSDIR)/temp.css
    @$(COMPRESSOR) $(JSDIR)/temp.css -o $(JSDIR)/mob.css  --line-break 80 --charset utf-8

    # rm temp+orig files
    @rm  $(JSDIR)/main.css $(JSDIR)/mobile.css $(JSDIR)/temp.css $(JSDIR)/temp*.js

The makefile first copies the sources into a local temporary directory, then merges and minifies CSS and JS files (using the YUI compressor), and finally rsyncs the files over SSH into the remote server, stopping and restarting the uWSGI daemon in between. By using SSH public key authentication and SSH agent, the script never prompts for passwords.
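
The merge step is just ordered concatenation. For readers who would rather see it outside make, here is a rough Python equivalent of the cat lines; the function name bundle is made up for this sketch, and minification (the YUI compressor call) is deliberately left out. Unlike cat, it adds a newline after each file so that a file without a trailing newline cannot run into the next one.

```python
import os

def bundle(sources, dest):
    """Concatenate the source files, in order, into dest.

    Returns the size of the resulting file in bytes.
    """
    with open(dest, "wb") as out:
        for path in sources:
            with open(path, "rb") as src:
                out.write(src.read())
            out.write(b"\n")  # separator; plain cat would not add this
    return os.path.getsize(dest)

# Order matters: the library first, then plugins, then site code, e.g.
# bundle(["static/zepto.js", "static/fx_methods.js", "static/main.js"],
#        "static/temp.js")
```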

Note that Django admin static files need to be copied to a directory where nginx serves them since Django in production mode doesn't serve any static files itself.

I've also used tools like django-pipeline for build tasks, but I prefer the Makefile approach for its simplicity and extensibility, and because I'm not always working with Django.

Background services

It is good practice to offload as much server work as possible to background workers. Time-consuming tasks should not be executed within the request if it can be avoided, since requests should complete quickly and a server only has a limited number of request workers available.

uWSGI provides good facilities for running background tasks. Below is a script from my site that demonstrates how one can schedule tasks to be run in the background.

# task.py:
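from uwsgidecorators import spool, timer, cron
from django.core.mail import send_mail
import tweet  # the blog's own tweet-fetching module (not shown here)

# run every hour
@timer(3600)
def timerfunc(num):
    print("hour passed")

# read tweets every night at 4:10
# (like timer handlers, cron handlers receive the signal number)
@cron(10, 4, -1, -1, -1)
def get_tweets(num):
    print("it's 4:10 in the morning: read tweets")
    tweet.fetch_tweets()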

# send an email
@spool
def sendemail(args):
    subject = args["subject"]
    body    = args["body"]
    to      = args["to"]
    fromm   = "admin@tomicloud.com"
    send_mail(subject, body, fromm, [to], fail_silently=False)

# how to call sendemail above:
# task.sendemail.spool({'subject':'Hello', 'body':'Message to you.',
#   'to':'tomi@example.org'})
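
One practical wrinkle: uwsgidecorators can only be imported inside a running uWSGI process, so a module like task.py fails to import under the Django development server. A possible workaround, sketched here as an assumption rather than something this site does, is a fallback decorator that runs the task synchronously. The outbox list is a hypothetical stand-in for the real send_mail call so that the sketch stays self-contained.

```python
try:
    from uwsgidecorators import spool  # importable only under uWSGI
except ImportError:
    def spool(func):
        """Fallback: make func.spool(...) run the task immediately."""
        def run_now(*args, **kwargs):
            # Mirror the calling style used above: one dict and/or keywords.
            arguments = dict(args[0]) if args else {}
            arguments.update(kwargs)
            return func(arguments)
        func.spool = run_now
        return func

outbox = []  # hypothetical stand-in for django.core.mail.send_mail

@spool
def sendemail(args):
    outbox.append((args["to"], args["subject"], args["body"]))

# Same call as in task.py; without uWSGI it simply runs inline.
sendemail.spool({"subject": "Hello", "body": "Message to you.",
                 "to": "tomi@example.org"})
```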

The uWSGI solution is simple and works well. For more advanced setups, uWSGI mules can be considered. Another popular solution in Python land is Celery.

Mobile

In today's IT environment, the mobile UI should be considered a first-class citizen. A site should have at least basic support for a mobile version.

My blog has two versions of the UI, the desktop/full site and the mobile site. The site automatically tries to select the appropriate version of the UI based on the browser's user agent string. Here's my simple home-cooked mobile detection script:

# is mobile (forced or real one)?
def is_mobile(req):
    ismob = False
    forced = req.COOKIES.get('m')
    if forced == "m":
        ismob = True
    elif forced != "f" and is_real_mobile_browser(req):
        ismob = True
    return ismob

# is mobile (agent string)
def is_real_mobile_browser(req):
    a = get_agent(req).lower()
    moblist = ["android", "iphone", "ipod", "symbian", "nokia",
            "blackberry", "palmsource", "sonyericsson", "midp-",
            "opera mini"]
    for m in moblist:
        if m in a:
            return True
    return False

# return agent string
def get_agent(req):
    return req.META.get("HTTP_USER_AGENT","")

This simple script only detects the most common mobile browsers. It's impossible to make it 100% waterproof since the number of different devices grows by the day, so the detection can fail. Hence, it is important that the user can always switch the UI at will. I have a link in the footer to switch to the other UI version. The selected UI version is stored in a cookie, which is remembered even if the browser is closed.
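
The switch link only has to set that cookie. The view behind the footer link is not shown in this post; as a rough sketch, the header it needs to send can be built with the standard library. The cookie name m and the values "m" and "f" match is_mobile() above, while the one-year lifetime and the function name are assumptions. In Django the same thing would normally be done with response.set_cookie().

```python
from http.cookies import SimpleCookie

# Assumed lifetime. Giving the cookie a max-age is what lets it
# survive a browser restart; without it, it is a session cookie.
ONE_YEAR = 365 * 24 * 3600

def ui_cookie_header(version):
    """Return a Set-Cookie value forcing the mobile ("m") or full ("f") UI."""
    if version not in ("m", "f"):
        raise ValueError("version must be 'm' or 'f'")
    cookie = SimpleCookie()
    cookie["m"] = version
    cookie["m"]["path"] = "/"
    cookie["m"]["max-age"] = ONE_YEAR
    return cookie["m"].OutputString()
```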

The mobile site differs from the desktop site at all three levels: HTML, CSS and JavaScript are all optimized for mobile use. The HTML has small changes to the topmost layout (header, sidebar, footer), the CSS has a few changes, and the JavaScript framework is a lighter alternative. Not all of this could be achieved with CSS media queries alone. However, CSS media queries could be used in addition to the basic mobile/desktop separation. I'll get back to this in the future when the iPad3 arrives...

In my blog I have three CSS files internally, but only a single CSS file is sent to a client in production mode to minimize HTTP requests. A base CSS is merged with either a desktop CSS or a mobile CSS by the deployment script.

Most of the back-end and front-end code of the UI versions is shared. The mobile UI is not written from scratch, which saves development time. Here is a piece of code from my blog that shows how I select the JavaScript files in my base.html. ISMOB is a variable that is always available in templates and is True for the mobile version.

{% if ISPRODUCTION %}
    {% if ISMOB %}
    <script src="/static/mob.js"></script>
    {% else %}
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
    <script src="/static/full.js"></script>
    {% endif %}
{% else %}
    {% if ISMOB %}
    <script src="/static/zepto.min.js"></script>
    <script src="/static/fx_methods.js"></script>
    {% else %}
    <script src="/static/jquery-1.7.1.min.js"></script>
    {% endif %}
    <script src="/static/main.js"></script>
{% endif %}
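
A variable such as ISMOB is typically made available to every template through a Django context processor, which is simply a function that takes the request and returns a dict. The post does not show how this site does it, so the following is a sketch under that assumption; the function name mobile_flags, the PRODUCTION constant, the shortened agent list and the FakeRequest helper are all illustrative.

```python
PRODUCTION = True  # hypothetical; a real site might derive this from settings

MOBILE_HINTS = ("android", "iphone", "ipod", "opera mini")  # shortened list

def is_mobile(request):
    """Simplified version of the detection shown earlier."""
    forced = request.COOKIES.get("m")
    if forced in ("m", "f"):
        return forced == "m"  # an explicit user choice always wins
    agent = request.META.get("HTTP_USER_AGENT", "").lower()
    return any(hint in agent for hint in MOBILE_HINTS)

def mobile_flags(request):
    """Context processor: the returned keys become template variables."""
    return {"ISMOB": is_mobile(request), "ISPRODUCTION": PRODUCTION}


class FakeRequest(object):
    """Tiny stand-in for django.http.HttpRequest, for trying this out."""
    def __init__(self, agent="", cookies=None):
        self.META = {"HTTP_USER_AGENT": agent}
        self.COOKIES = cookies or {}
```

To activate such a function, its dotted path would be listed in the TEMPLATE_CONTEXT_PROCESSORS setting, and templates would need to be rendered with a RequestContext.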

jQuery is a great and wildly popular JavaScript framework. It is the base framework for all my projects. However, it is also quite large and carries extra baggage for mobile use. Zepto.js is a great lighter alternative for mobile. It provides the same API for a subset of jQuery functionality at a much smaller size: 23K instead of 94K (-76%) in my case. Zepto.js has a good plugin system for extra functionality.

Other

All text in the blog is written with Markdown. All code examples are colorized by the amazing Pygments Python library. The HTML generated by Markdown is cached in the database to save CPU; no reason to generate static HTML on every request.
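
The caching idea is to render once, when a post is saved, and store the result next to the source. A minimal sketch follows, using the standard sqlite3 module; the table layout and function names are invented for illustration, and render() is a trivial stand-in for the real Markdown and Pygments pipeline.

```python
import html
import sqlite3

def render(source):
    """Stand-in renderer; the real site would call Markdown and Pygments."""
    return "<p>%s</p>" % html.escape(source)

def save_post(db, slug, source):
    """Store the source together with its rendered HTML."""
    db.execute(
        "INSERT OR REPLACE INTO post (slug, source, html) VALUES (?, ?, ?)",
        (slug, source, render(source)),
    )
    db.commit()

def get_html(db, slug):
    """Serve the cached HTML; no rendering happens at request time."""
    row = db.execute(
        "SELECT html FROM post WHERE slug = ?", (slug,)
    ).fetchone()
    return row[0] if row else None

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE post (slug TEXT PRIMARY KEY, source TEXT, html TEXT)")
save_post(db, "tech-stack", "nginx & uWSGI")
```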

My tweets are fetched every night using the Tweepy Python library.

I experimented with the custom font Ubuntu from the wonderful Google Fonts repository, but in the end I didn't adopt it since it didn't look quite good on all platforms. It was disabled in the mobile version anyway to save bandwidth.

There is some CSS3 in the front-end. For example, the header uses CSS3 transforms to enlarge the logo on hover, and CSS3 transitions are used to scroll the icons in the navigation tabs. The image of me is flipped with a 3D transform and a transition (WebKit browsers only!). CSS3 transitions are great for small and subtle UI candy.

Box-shadows surround the top-level divs. Box-shadows are increasingly popular in contemporary web design and they do look nice. No reason to do Photoshopping for shadows anymore.

That's the overview of the back and front of my blog, thanks.
