D3.js is a powerful, extensible library for data visualization. It makes some
fairly advanced data visualization ideas available to anybody who can bind data
to DOM elements. The set of supported features is vast, including everything
from layouts like the stream graph to an arsenal of map projections for the
geodesist. However, D3 achieves this amazing breadth of utility through an
unconventional programming pattern. For example, to add new SVG circle elements
to an existing <svg> element, you’d do something like:
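A minimal sketch of that idiom, assuming a D3 v3-style selection API. The addCircles wrapper, its d3 argument, and the {x, y, r} point shape are hypothetical and exist only to keep the snippet self-contained:

```javascript
// Hypothetical helper: `d3` is passed in so the snippet does not assume how
// the library is loaded. `points` is an assumed array of {x, y, r} objects.
function addCircles(d3, points) {
  return d3.select("svg")          // the existing <svg> element
    .selectAll("circle")           // circles already present (possibly none)
    .data(points)                  // bind one datum per circle
    .enter()                       // data points that have no circle yet
    .append("circle")              // create a circle for each of them
    .attr("cx", function (d) { return d.x; })
    .attr("cy", function (d) { return d.y; })
    .attr("r", function (d) { return d.r; });
}
```

In a page that has D3 loaded, calling addCircles(d3, [{x: 10, y: 20, r: 5}]) would append one circle to the first <svg> element.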
This declarative style is a double-edged sword. It is terse and powerful: you can bind data to DOM elements, execute transitions, enter new elements, and remove obsolete elements all in a single block of code. On the other hand, it defies naive attempts to avoid repetition, which can lead to some pretty awful spaghetti code. Hence, it’s important to take a proactive stance on code repetition: noting where it happens and abstracting it away.

The first thing to do is to have a decent working definition for the task. This semi-formal definition is my working concept of what a data visualization is:
Chart: Let data be the set of possible data points. Let the set of graphical
elements be called graphics. Finally, let mappings be a collection of functions
transforming dimensions of data into graphics elements on the screen. Then a
Chart is a triple of (mappings, data, graphics).

Here, data is an exogenous variable, and choosing graphical elements is largely
a design decision. Of course, this can get complicated, but much of D3’s rich
feature set is targeted at resolving the intricacies of this problem. However,
the treatment of the mappings element of the triple is bare bones.
D3 implements the heavy lifting with the d3.scale object, which provides a
variety of common mappings from data space to screen space. For example, in this
area chart, courtesy of Mike Bostock, d3.scale.linear and d3.time.scale map data
to the vertical and horizontal dimensions of the screen, respectively.
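To make "data space to screen space" concrete, here is a hand-rolled stand-in for what a linear scale computes. It is not D3’s implementation: linearScale and its arguments are hypothetical, and it skips everything d3.scale.linear adds on top (clamping, ticks, inversion, and so on).

```javascript
// Hypothetical stand-in for a linear scale: maps [d0, d1] onto [r0, r1].
function linearScale(domain, range) {
  var d0 = domain[0], d1 = domain[1];
  var r0 = range[0], r1 = range[1];
  return function (value) {
    var t = (value - d0) / (d1 - d0);   // position within the domain, 0..1
    return r0 + t * (r1 - r0);          // same position within the range
  };
}

// Data values 0..100 mapped onto a 400px-tall chart. SVG's y axis points
// down, so the range is flipped to put larger values higher on the screen.
var y = linearScale([0, 100], [400, 0]);
console.log(y(0), y(25), y(100)); // 400 300 0
```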
However, d3.scale also leaves a few steps to the user, and for the most part,
people implement these steps over and over and over again, once for each mapping
function in their data visualization.
This isn’t good: it’s tedious, error-prone, and certainly not D.R.Y. There are essentially two operations that are responsible for most of the repetition: accessing data elements and computing the output of the mapping function.
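The repetition tends to look like the sketch below, where every callback redoes the same access-then-scale step by hand. The data shape and the two toy scale functions are made up for illustration and stand in for real D3 scales:

```javascript
// Made-up data and toy scales, standing in for real D3 scales.
var data = [{ date: 1, close: 10 }, { date: 2, close: 30 }];
var x = function (v) { return v * 100; };       // stand-in horizontal scale
var y = function (v) { return 400 - v * 4; };   // stand-in vertical scale

// The same two steps, access then scale, written out once per mapping...
var cx = function (d) { return x(d.date); };
var cy = function (d) { return y(d.close); };

// ...and typically again wherever the positions are needed (labels,
// tooltips, lines), which is where the copies start to drift apart.
var labelX = function (d) { return x(d.date); };
var labelY = function (d) { return y(d.close) - 10; };

console.log(data.map(cx), data.map(cy));
```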
Initially these were the only operations my solution addressed, but there are a
few additional things we almost always do once we have the accessor function,
the scale, and the data. The first is computing the extent of the data, so we
can figure out the ratio between units of data and units on the screen (whether
they be in Cartesian coordinates or RGB). Second, we should always include an
axis for the graph, although interaction can alleviate some of that pressure. In
any event, I can’t imagine a case in which you would use d3.svg.axis without
eventually using a scale, so it makes sense to keep these ideas together.
I think the amount of repeated thought involved in creating D3 scales and axes
is a bit of a wart, so I wrote a class, called Mapping, to relieve the pain.
This class lets you stop thinking about what a scale actually is, and provides a
few convenience methods for the common scale-related tasks described above.
You initialize the class with the base d3.scale object of your choice and the
accessor function responsible for computing inputs to the scale. Commonly the
accessor is as simple as function(d){return d.x}, but even in those cases, once
you’ve defined it you never need to think about it again. An example Mapping
might be:
If you need the value of a data element p, just call mapping.accessor(p). More
commonly you’d just call mapping.place to map p into the screen space.
Finally, there are two convenience methods: mapping.compute_domain, which takes
the data array, calculates the extent of the data set, and updates the domain of
the d3.scale object; and mapping.create_axis, which returns a newly created
d3.svg.axis() with the scale attribute already set.
Methods involving the scale or the axis return the appropriate object, so you can continue method chaining like you’re used to.
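For readers who want to see the shape of such a class, here is a minimal sketch reconstructed only from the description above. It is not the actual Mapping source: the constructor signature, the axisFactory argument, and the toy scale in the usage lines are all assumptions, and the extent calculation handles numeric values only.

```javascript
// Hypothetical sketch of a Mapping-like class; not the original source.
// `scale` is any D3-style scale: callable, with a domain() getter/setter.
// `axisFactory` is an assumed hook; by default it uses D3 v3's d3.svg.axis.
function Mapping(scale, accessor, axisFactory) {
  var self = this;
  self.scale = scale;
  self.accessor = accessor;

  // Access, then scale. Defined as a closure so it can be handed straight
  // to selection.attr(), where D3 rebinds `this` to the DOM element.
  self.place = function (d) {
    return self.scale(self.accessor(d));
  };

  // Set the scale's domain to the [min, max] of the accessed values and
  // return the scale so method chaining can continue. Numeric data only.
  self.compute_domain = function (data) {
    var values = data.map(function (d) { return self.accessor(d); });
    var extent = [Math.min.apply(null, values), Math.max.apply(null, values)];
    return self.scale.domain(extent);
  };

  // Return a new axis with its scale already set.
  self.create_axis = function () {
    var makeAxis = axisFactory || function () { return d3.svg.axis(); };
    return makeAxis().scale(self.scale);
  };
}

// Toy linear scale with a D3-like domain()/range() interface, used here only
// so the example runs without D3 loaded.
function toyLinearScale() {
  var domain = [0, 1], range = [0, 1];
  function scale(v) {
    var t = (v - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  }
  scale.domain = function (d) {
    if (!arguments.length) return domain;
    domain = d;
    return scale;
  };
  scale.range = function (r) {
    if (!arguments.length) return range;
    range = r;
    return scale;
  };
  return scale;
}

var mapping = new Mapping(toyLinearScale().range([0, 500]),
                          function (d) { return d.x; });
mapping.compute_domain([{ x: 0 }, { x: 5 }, { x: 10 }]);
console.log(mapping.accessor({ x: 5 })); // 5
console.log(mapping.place({ x: 5 }));    // 250
```

With real D3 v3 loaded, the toy scale would be replaced by something like d3.scale.linear().range([0, 500]), and mapping.place could be passed directly as an attribute callback.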