Apr 112015

Some motivation

Django has a field called ChoiceField which lets you very easily create a <select> dropdown in your forms. A great feature is the ability to transform nested choices into appropriate <optgroup> groups. For instance, the following code:

my_choices = [("Group 1", [
                 (1, "Choice 1"), (2, "Choice 2")]), 
              ("Group 2", 
                 [(3, "Choice 3"), (4, "Choice 4")]), 
              (5, "Choice 5")]
field = ChoiceField(label="Field", choices=my_choices)

…creates a field that when rendered looks something like this:


Looks nice


The HTML that Django produced looks like this (excluding the label):

<select id="id_test" name="test">
    <optgroup label="Group 1">
        <option value="1">Choice 1</option>
        <option value="2">Choice 2</option>
    <optgroup label="Group 2">
        <option value="3">Choice 3</option>
        <option value="4">Choice 4</option>
    <option value="5">Choice 5</option>

Nice. So, what happens when we try to add more levels of nesting in the choices?

my_choices = [
    ("Group 1", [
        (0, "Choice 0"),
        ("Subgroup 1", [
            (1, "Choice 1")]),
        ("Subgroup 2", [
            (2, "Choice 2")])]),
    ("Group 2",[
        ("Subgroup 3", [
            (3, "Choice 3")]),
        ("Subgroup 4", [(4, "Choice 4")])]),
     (5, "Choice 5")]

Bad, bad, bad


That didn’t work. Here’s the HTML Django gave us:

<select id="id_test" name="test">
    <optgroup label="Group 1">
        <option value="0">Choice 0</option>
        <option value="Subgroup 1">[(1, &#39;Choice 1&#39;)]</option>
        <option value="Subgroup 2">[(2, &#39;Choice 2&#39;)]</option>
    <optgroup label="Group 2">
        <option value="Subgroup 3">[(3, &#39;Choice 3&#39;)]</option>
        <option value="Subgroup 4">[(4, &#39;Choice 4&#39;)]</option>
    <option value="5">Choice 5</option>


The problem

As it turns out, ChoiceField only supports one level of nesting. Five years ago someone submitted a patch to correct the problem, but it was rejected for good reason: HTML itself only officially supports one level of <optgroup> nesting.

To see if it would work anyway, I manually applied the patch to my Django installation. Although the patch itself worked (Django produced the nested option groups), Chrome wasn’t having any of it. My dropdown displayed incorrectly, and developer tools showed that the HTML got mangled during parsing.

Emulating nested optgroups

My solution to the problem is to fake it by using disabled <option> elements as group headers. Then, we apply indentation to the actual options to make them appear to belong to the groups. This is accomplished by a custom widget called NestedSelect (below), which is a cross between the original Select widget and the patch I linked to above.

from itertools import chain
from django.forms.widgets import Widget
from django.utils.encoding import force_text
from django.utils.html import format_html
from django.utils.safestring import mark_safe
from django.forms.utils import flatatt

class NestedSelect(Widget):
    allow_multiple_selected = False

    def __init__(self, attrs=None, choices=()):
        super(NestedSelect, self).__init__(attrs)
        # choices can be any iterable, but we may need to render this widget
        # multiple times. Thus, collapse it into a list so it can be consumed
        # more than once.
        self.choices = list(choices)

    def render(self, name, value, attrs=None, choices=()):
        final_attrs = self.build_attrs(attrs, name=name)
        output = [format_html('<select{}>', flatatt(final_attrs))]
        selected_choices = set(force_text(v) for v in [value])
        options = 'n'.join(self.process_list(selected_choices, chain(self.choices, choices)))
        if options:
        return mark_safe('n'.join(output))

    def render_group(self, selected_choices, option_value, option_label, level=0):
        padding = "&nbsp;" * (level * 4)
        output = format_html("<option disabled>%s{}</option>" % padding, force_text(option_value))
        output += "".join(self.process_list(selected_choices, option_label, level + 1))
        return output

    def process_list(self, selected_choices, l, level=0):
        output = []
        for option_value, option_label in l:
            if isinstance(option_label, (list, tuple)):
                output.append(self.render_group(selected_choices, option_value, option_label, level))
                output.append(self.render_option(selected_choices, option_value, option_label, level))
        return output

    def render_option(self, selected_choices, option_value, option_label, level):
        padding = "&nbsp;" * (level * 4)
        if option_value is None:
            option_value = ''
        option_value = force_text(option_value)
        if option_value in selected_choices:
            selected_html = mark_safe(' selected="selected"')
            if not self.allow_multiple_selected:
                # Only allow for a single selection.
            selected_html = ''
        return format_html('<option value="{}"{}>%s{}</option>' % padding, option_value, selected_html,

Then it’s just a matter of swapping out the widget that ChoiceField uses:

self.fields["test"] = ChoiceField(label="Field", choices=my_choices, 

Now we get something closer to what we wanted:


Improving the aesthetics

The solution above works, but it can be improved in two ways:

  1. The manual addition of padding results in an ugly side-effect – that padding is visible in the selected option. Look at the difference when Choice 1 and Choice 5 are selected (below). Choice 5 is aligned correctly, but Choice 1 appears to be floating out in no-man’s land.

1 2

  1. We no longer have the nice bolding effect on the option group headers. It’s not possible for us to individually style the elements that are supposed to be headers.

To overcome these shortcomings of HTML, we must turn a JavaScript based replacement: Select2. With that plugin installed, the fix is just a little bit of JavaScript:

$(function () {
        // Ensure the final select box is wide enough to fit the bolded headings
        $(this).width($(this).width() * 1.1);
        templateSelection: function (selection) {
            return $.trim(selection.text);
        templateResult: function (option) {
            if ($(option.element).attr("disabled")) {
                return $("<b>" + option.text + "</b>").css("color", "black");
            } else {
                return option.text;


Finished product

The final results look nice, and you even get a free search box thanks to Select2:


Nov 292014

About a year ago I published an article entitled Parsing HTML with C++. It is by far my most popular article (second most popular being this one), and is a top result on Google for queries such as “html c++ parsing”. Nevertheless there is always room for improvement. Today, I present a revisit of the topic including a simpler way to parse, as well as a self-contained ready to go example (which many people have been asking me for.)

Old solution

Before today, my prescription for HTML parsing in C++ was a combination of the following libraries and associated wrappers:

cURL, of course, is needed to perform HTTP requests so that we have something to parse. Tidy was used to transform the HTML into XML that was then consumed by libxml2. libxml2 provided a nice DOM tree that is traversable with XPath expressions.


This kludge presents a number of problems, with the primary one being no HTML5 support. Tidy doesn’t support HTML5 tags, so when it encounters one, it chokes. There is a version of Tidy in development that is supposed to support HTML5, but it is still experimental.

But the real sore point is the requirement to convert the HTML into XML before feeding it to libxml2. If only there was a way for libxml2 to consume HTML directly… Oh, wait.

At the time, I hadn’t realized that libxml2 actually had a built in HTML parser. I even found a message on the mailing list from 2004 giving a sample class that encapsulates the HTML parser. Seeing as though the last message posted was also in 2004, I suppose that there isn’t much interest.

New solution

With knowledge of the native HTML parser in hand, we can modify the old solution to completely remove libtidy from the mix. libxml2 by default isn’t happy with HTML5 tags either, but we can fix this by silencing errors (HTML_PARSE_NOERROR) and relaxing the parser (HTML_PARSE_RECOVER).

The new solution, then, requires solely cURL, libxml2, and their associated wrappers.

Below is a self-contained example that visits iplocation.net to acquire the external IP address of the current computer:

#include <libxml/tree.h>
#include <libxml/HTMLparser.h>
#include <libxml++/libxml++.h>

#include <curlpp/cURLpp.hpp>
#include <curlpp/Easy.hpp>
#include <curlpp/Options.hpp>

#include <iostream>
#include <string>

#define HEADER_ACCEPT "Accept:text/html,application/xhtml+xml,application/xml"
#define HEADER_USER_AGENT "User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.70 Safari/537.17"

int main() {
    std::string url = "http://www.iplocation.net/";
	curlpp::Easy request;

	// Specify the URL

	// Specify some headers
	std::list<std::string> headers;
	request.setOpt(new curlpp::options::HttpHeader(headers));
    request.setOpt(new curlpp::options::FollowLocation(true));

	// Configure curlpp to use stream
	std::ostringstream responseStream;
	curlpp::options::WriteStream streamWriter(&responseStream);

	// Collect response
    std::string re = responseStream.str();

    // Parse HTML and create a DOM tree
    xmlDoc* doc = htmlReadDoc((xmlChar*)re.c_str(), NULL, NULL, HTML_PARSE_RECOVER | HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING);

    // Encapsulate raw libxml document in a libxml++ wrapper
    xmlNode* r = xmlDocGetRootElement(doc);
    xmlpp::Element* root = new xmlpp::Element(r);

    // Grab the IP address
    std::string xpath = "//*[@id="locator"]/p[1]/b/font/text()";
    auto elements = root->find(xpath);
    std::cout << "Your IP address is:" << std::endl;
    std::cout << dynamic_cast<xmlpp::ContentNode*>(elements[0])->get_content() << std::endl;

    delete root;

    return 0;

Install prerequisites and compile like this (Linux):

sudo apt-get install libcurlpp-dev libxml++2.6-dev
g++ main.cpp -lcurlpp -lcurl -g -pg `xml2-config --cflags --libs` `pkg-config libxml++-2.6 --cflags --libs` --std=c++0x


Future work

In the near future, I will be releasing my own little wrapper class for cURL which simplifies a couple of workflows involving cookies and headers. It will make it easy to perform some types of requests with very few lines of code.

Something I need to investigate a little further is a small memory leak that occurs when I grab the content: dynamic_cast(elements[0])->get_content(). On my computer, it seems to range between 16-64 bytes lost. It may be a problem with libxml++ or just a false alarm by Valgrind.

Finally, I may consider following up on that mailing list post to see if we can get the HTML parser included in libxml++.

Nov 092014

With the holidays fast approaching, I thought it might be fun to share my list of on-the-go tech essentials for any geek technology enthusiast.

Leatherman Sidekick Multi Tool

What’s a “tech essentials” list without at least one multitool? I like this Leatherman Sidekick because of its great tool selection and reasonable price. The pliers are solidly built, and the locking blades are a welcome safety feature.

Fenix PD35 850 Lumen Flashlight

I discovered Fenix flashlights a few years ago, and have been hooked. The Fenix PD32 (link) was my first foray into the PD line, and is still the light I carry most often. Its successor, the Fenix PD35, is pictured below. With a removable clip, six output modes (including strobe), and full one-handed operation, this is a great choice for EDC (everyday carry). If it’s built half as good as my PD32, it will last you for many, many years.

Fenix PD12 360 Lumen Flashlight

Why another flashlight? This Fenix PD12 is small enough to fit on a keychain so there is little chance of you forgetting to take it with you. It runs on a single CR123A battery, which is the same type that the PD35 uses.

Swiss+Tech Utili-Key 6-in-1

This is one of those subtle little tools that you don’t even notice until you need it. Great for when you accidentally leave your full-sized multitool at home.

Pluggable USB 3.0 Memory Card Reader

Unlike most other memory card readers, this one has a built-in cable, which makes it great for travel. And, in addition to the usual SD, microSD, and MMC families, it supports the Sony Memory Stick (MS, MS Pro Duo, etc.) card types.


Micro USB to USB On-the-go adapter cable

This little USB OTG adapter cable is great for letting you access USB storage from your phone. When paired with the memory card reader above, you’ll be able to upload photos from your camera without a computer!

Anker 3.6A Dual USB Wall Charger

This one kind of speaks for itself – when you’re on the go, a quality USB charger is a must. The Anker 3.6A Dual USB Wall Charger has two ports – one designed for Android and the other for Apple devices. I like this one because of its slim design and lack of annoying LEDs.

Panasonic In-Ear Headphones

These Panasonic In-Ear Headphones have no business being as good as they are, especially considering that they’re under $10. They sit comfortably in the ear and produce surprisingly good sound.

Anker 10000mAh Portable USB Charger

For long car rides, a quality portable USB charger is a necessity. This Anker 10000mAh Portable USB Charger has two ports, and holds enough juice to charge an iPhone 4+ times or a Galaxy S4 2+ times.

Targus XL Backpack

With all of these essentials in hand, you’ll need a way to store and transport them. For the last 2.5 years, I have been carrying the Targus XL backpack.

Let’s get something out of the way: this backpack is huge. It’s designed to hold a 17″ laptop and it does so with ease. Even with a laptop, you’ll have enough room to hold multiple textbooks and most of the items mentioned on this page. It has an incredible number of pockets and zippered compartments for storing anything you could imagine. It is easily the most quality constructed backpack I have used, as well.

Product images owned by Amazon

Jul 132014

Since Linux 3.13, Radeon power management is enabled by default. This is great if you have a supported card, but if you don’t, you may encounter issues such as overheating and overeager cooling fans. If you fall into the latter category, you can use these instructions to disable the new power management features.

Disable Radeon power management

  1. Add the parameter radeon.runpm=0 to the Linux kernel boot parameters by editing /etc/default/grub. This is accomplished by appending the parameter to the value of GRUB_CMDLINE_LINUX_DEFAULT. After doing so, that line in my grub file looks like this:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash radeon.runpm=0"
  2. Run sudo update-grub
  3. Reboot your computer.


Manually control the GPU

You can now manually manage the power of your graphics card, by using vgaswitcheroo or similar.

Personally, I completely disable my discrete graphics card under Linux by adding the following line to /etc/rc.local:

echo OFF > /sys/kernel/debug/vgaswitcheroo/switch

You can verify that the card is off by running the sensors command. When my discrete GPU is switched off, sensors reports the temperature as 511°C:

Adapter: PCI adapter
temp1:       +511.0°C  (crit = +120.0°C, hyst = +90.0°C)


More resources

These pages were useful to me while researching how to disable Radeon power management:

Feb 212014

I was planning on writing a beginner’s tutorial for using PWM on raw AVR chips, but I found that Arduino already has a nice guide here: http://arduino.cc/en/Tutorial/SecretsOfArduinoPWM

The only change you need to make to their code to use it without the Arduino software is to remove calls to “pinMode”. Do so by using the appropriate DDR register twiddling or macros such as these: http://www.avrfreaks.net/index.php?name=PNphpBB2&file=printview&t=66939&start=0

Jan 202014

I have just finished migrating all of my BitBucket repositories from Mercurial to Git. All of the projects are still accessible at the same URLs they were before, but they now use Git. The reason for the change is that I have been using Git for a number of projects since the summer, and have grown to like it more than Mercurial. Sorry for the inconvenience.

Nov 032013

Almost three years ago I released css2xpath#, a port of Andrea Giammarchi’s project by the same name. Today, I’m excited to announce a new project, css2xpath Reloaded, which will supplant my previous project. css2xpath Reloaded does the same thing that css2xpath# did (convert CSS selectors to XPath selectors), but is instead based off of Ian Bicking’s excellent Python app (also somewhat confusingly named css2xpath).

Whereas css2xpath# (and Andrea’s original project) relied on regular expressions to perform the conversion, css2xpath Reloaded recreates Ian’s CSS selector parser.

Although more complicated, a benefit of using a parser is that your CSS selectors can be validated during conversion. It’s also a little bit easier to extend the code. And most importantly (for me at least) it was a lot of fun to write.

I’d consider the code to be beta quality at the moment; it’s lacking in documentation and rigorous testing. However, if you’re interested in testing it out, proceed to BitBucket.

css2xpath Reloaded on BitBucket

Oct 192013

I just released version 1.2.1 of VirtualScroller. This minor update adds two enhancements:

  • Any number of items (at least 1) is supported. If less than 6 items are specified, the VirtualScroller falls back on a standard ScrollableView with the fancy scroll logic disabled. This is transparent to developers and users. Resolves issue 3.
  • VirtualScroller automatically detects whether you want finite or infinite scrolling. If the itemCount property is omitted, infinite scrolling is assumed. Otherwise finite scrolling is used. As such, the infinite property is no longer used. Resolves issue 4.

Download the latest version
Visit the Wiki


Aug 242013

I came across a nice example of a Twisted “man-in-the-middle” style proxy on Stack Overflow. This style of proxy is great for logging traffic between two endpoints, as well as modifying the requests and responses that travel between them.

The original was posted here, and I reproduce the vast majority of the code below with some modifications. My real motivation for posting this is to “get the code out there”, because I had a hard time finding it originally. A big thanks to the original author for posting his code on Stack Overflow.

All you need to do is change the three constants at the top, and add whatever validation/modification logic you want in the dataReceived and write methods. Those four methods are labeled so you know which “hop” the data is taking. A request is going to take the following path: client => proxy => server => proxy => client. For example, the first dataReceived method handles data travelling from the client to your proxy.

#!/usr/bin/env python

SERVER_ADDR = "server address"

from twisted.internet import protocol, reactor

# Adapted from http://stackoverflow.com/a/15645169/221061
class ServerProtocol(protocol.Protocol):
    def __init__(self):
        self.buffer = None
        self.client = None

    def connectionMade(self):
        factory = protocol.ClientFactory()
        factory.protocol = ClientProtocol
        factory.server = self

        reactor.connectTCP(SERVER_ADDR, SERVER_PORT, factory)

    # Client => Proxy
    def dataReceived(self, data):
        if self.client:
            self.buffer = data

    # Proxy => Client
    def write(self, data):

class ClientProtocol(protocol.Protocol):
    def connectionMade(self):
        self.factory.server.client = self
        self.factory.server.buffer = ''

    # Server => Proxy
    def dataReceived(self, data):

    # Proxy => Server
    def write(self, data):
        if data:

def main():
    factory = protocol.ServerFactory()
    factory.protocol = ServerProtocol

    reactor.listenTCP(LISTEN_PORT, factory)

if __name__ == '__main__':