Visualizing OD mobilities & edgebundling in QGIS#

Intro#

Data showing spatial relations between places is common – in fact, we visualized flight connections between airports during week 2 and plotted mobilities between Flickr users’ home countries and protected areas as chord diagrams in non-carto-vis-Python.ipynb. An origin and a destination (OD) are the minimum information needed to draw a trajectory.

Working with big (or even small) OD line data can overcrowd a static map very quickly. For example, a naïve “hairball” visualization (Poorthuis, 2018) of this practical’s data without any styling looks like this:

Hairball visualization of lines

Our example data only contains 472 mobilities, each heading to one of three regions in Germany. Even with data this simple, the map above tells us very little about the mobility dynamics. Which of the three regions is the most popular? Where do the mobilities originate from?

We’ve seen in previous tutorials how fine-tuning the symbology by reducing linewidth, adding transparency and choosing a more pleasant color scheme improves the readability of such mobility maps. However, we can do more.

OD data capture simple, abstracted mobilities (e.g., city bike trips between stations). In these cases, it’s beneficial to show not just which places are connected but also the magnitude of the mobilities between them.

This tutorial will introduce two approaches to visualizing OD data in QGIS: (1) graduated line maps and (2) edgebundling. The tutorial closes by pointing to resources where you can delve deeper into the subjects.

Prerequisites#

Plugin#

  • Edgebundling plugin by Anita Graser. This plugin is not available in the public plugin repository. Instead, follow the instructions below to install it:

  1. Download processing_edgebundling.zip from this link or from the path QGIS-files/processing_edgebundling.zip

  2. Open the plugin installation window (Plugins > Manage and Install plugins)

  3. Go to Install from zip, select the zip file you just downloaded and install the plugin.

  4. The function Force-directed edge bundling has been added to the Processing Toolbox.

Data#

We’ll work with data that describes student mobilities in the European Union’s Erasmus exchange program at the level of the statistical NUTS2 regions. The full dataset has been processed by Tuomas Väisänen and Oula Inkeröinen as part of the Mobi-Twin project at the Digital Geography Lab, University of Helsinki.

QGIS files#

As always, there are several style files and a QGIS processing model file that runs the whole processing chain.

  • You can download all the files from this link or download them individually in the folder QGIS-files.

Graduated line map#

One way to represent quantity is, naturally, to make certain connections more prominent. With a graduated line map, we can use width and color for that purpose.

Add data from erasmus-mobility-data to the project. Examine the line layer in the map view and its attribute table.

The geopackage contains two layers:

  • 2018_student_mobility_NUTS2_germany_top3: Erasmus student exchanges that have their destination in German NUTS2 regions. The data has been filtered to only include mobilities towards the three most visited NUTS2 areas in Germany – Berlin (DE30), Köln (DEA2) and Oberbayern (DE21).

  • NUTS2 Centroids: Centroids of the three regions. Used for labelling.

Let’s calculate how many connections there are between each origin and destination region, similarly to how we did it in week 2’s global map:

  1. Run the processing tool Aggregate.

  2. Parameters:

    • Input layer: 2018_student_mobility_NUTS2_germany_top3

    • Group by expression: OD_ID

    • Aggregates:

      1. Keep only OD_ID and fid, remove others.

      2. Aggregate function

        • OD_ID: first_value

        • fid: count

      3. Name:

        • fid: mobilities
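To make the logic of the Aggregate step concrete, here is a minimal plain-Python sketch of the same idea: group rows by OD_ID and count them. It does not call QGIS at all. The sample rows and the OD_ID values are made up for illustration; only the field names OD_ID, fid and mobilities come from the steps above.

```python
from collections import Counter

# Hypothetical sample rows: each dict stands in for one line feature.
# The OD_ID values below are invented, not taken from the real data.
features = [
    {"fid": 1, "OD_ID": "FI1B_DE30"},
    {"fid": 2, "OD_ID": "FI1B_DE30"},
    {"fid": 3, "OD_ID": "SE11_DE21"},
    {"fid": 4, "OD_ID": "FI1B_DE30"},
    {"fid": 5, "OD_ID": "SE11_DEA2"},
]


def count_mobilities(rows):
    """Group rows by OD_ID and count the rows in each group.

    This mirrors using 'count' on fid with OD_ID as the group-by
    expression, and naming the resulting field 'mobilities'.
    """
    counts = Counter(row["OD_ID"] for row in rows)
    return [{"OD_ID": od_id, "mobilities": n} for od_id, n in counts.items()]


aggregated = count_mobilities(features)
for row in aggregated:
    print(row["OD_ID"], row["mobilities"])
```

Each unique OD pair ends up as a single row, and the mobilities field tells how many original lines it represents.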

Styling#

QGIS offers two ways to emphasize a graduated style: size or color. This will usually be enough, but what if you’d want to use both variables at the same time? It’s very much doable, but we’ll need a bit more fidgeting.

  1. Apply a graduated style to the Aggregated layer.

  2. Choose color as the styling method. Select what you think is an appropriate number of classes, classification method and color scale. (Example: 4, Natural breaks and Reds.)

  3. To modify linewidth, we’ll use data-defined overrides.

    1. Open up Symbol > Configure symbol > Width > DD override symbol > Edit.

    2. Paste the expression scale_linear( "mobilities", minimum("mobilities"), maximum("mobilities"), 0.2,2.5)

      1. Read this expression as: use values from the field mobilities and scale them to a new value between 0.2 and 2.5.

    3. Modify the transparency and other style definitions as you wish.
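To see what the width expression does with actual numbers, the sketch below re-implements linear scaling in plain Python. It is our own stand-in, not the QGIS source code. We assume that values outside the input domain are clamped to the output limits, and the count values are hypothetical.

```python
def scale_linear(value, domain_min, domain_max, range_min, range_max):
    """Linearly map value from [domain_min, domain_max] to [range_min, range_max].

    Assumption: values outside the input domain are clamped to the
    nearest output limit.
    """
    if domain_max == domain_min:
        # Assumption for this sketch: a degenerate domain gets the minimum width.
        return range_min
    if value <= domain_min:
        return range_min
    if value >= domain_max:
        return range_max
    share = (value - domain_min) / (domain_max - domain_min)
    return range_min + share * (range_max - range_min)


# Hypothetical 'mobilities' counts for four OD pairs.
counts = [1, 3, 12, 24]
widths = [scale_linear(c, min(counts), max(counts), 0.2, 2.5) for c in counts]

for count, width in zip(counts, widths):
    print(f"{count:>3} mobilities -> {width:.2f} mm")
```

With these made-up counts, the smallest flow is drawn at 0.2 mm, the largest at 2.5 mm, and a pair with 12 mobilities lands at about 1.3 mm.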

This data-defined method differs from rule-based or graduated approaches in that we’re not classifying the data, but rather smoothly scaling the linewidth from the minimum (0.2 mm) to the maximum (2.5 mm). However, styling like this complicates some other aspects: for example, it becomes harder to automatically create a legend that accurately shows the scaled line widths.

This example uses the following styles:

  • graduated_line_style.qml

  • nuts2_centroids_label_style.qml

Graduated line map

QGIS tip#

Wonder where that background world map came from?

Simply type “world” into the field that shows the current coordinate locations and press Enter. This will add a simple world map of country boundaries, most likely based on Natural Earth.

Edgebundling#

Instead of aggregating by attribute information, there are methods to aggregate, or cluster, by location. Edgebundling is a clustering technique for line features (see Graser, 2019). It can be used to lessen visual clutter in linemaps.

We’ll be using a plugin that implements force-directed edgebundling for QGIS. This is, to our knowledge, the only edgebundling implementation that has been published for QGIS, although it is by no means the only edgebundling technique available (see examples of EB algorithms in Wallinger 2021).

  1. Run Force-directed edge bundling from the processing toolbox.

  2. Parameters:

    • Input layer: 2018_student_mobility_NUTS2_germany_top3

    • Use cluster field: Leave this deselected

    • Initial step size: 1000

    • You may leave the other parameters as-is.

Edgebundling and parameters#

Finding descriptions of these parameters was a bit tough, but to our understanding, based on Graser et al. (2019), their effects are the following:

  • Initial step size [map units]: Larger values will cause more distortion, possibly also artifacts. It uses map units (meters in the data’s CRS).

  • Compatibility [0–1]: Defines how many edges are involved in the bundling. Lower values will take more processing time but have stronger bundling outcomes.

  • Cycles & iterations [>0]: higher values will result in better outcomes at the expense of processing time.

  • Cluster field [Yes / No]: A feature in this implementation of edgebundling – instead of bundling all lines together, it can bundle them within a set number of clusters. May reduce computation time at the expense of accuracy.
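To build some intuition for why a lower compatibility threshold pulls more edges into the bundling, here is a small plain-Python sketch. It uses angle compatibility (the absolute cosine of the angle between two edges), which is one measure commonly used in force-directed edge bundling. This is an illustration only: we have not checked how the plugin computes its score, and it most likely combines several criteria. The edge coordinates are hypothetical.

```python
import math


def angle_compatibility(edge_p, edge_q):
    """Return |cos(angle)| between two edges: 1 = parallel, 0 = perpendicular."""
    (px1, py1), (px2, py2) = edge_p
    (qx1, qy1), (qx2, qy2) = edge_q
    pvx, pvy = px2 - px1, py2 - py1
    qvx, qvy = qx2 - qx1, qy2 - qy1
    len_p = math.hypot(pvx, pvy)
    len_q = math.hypot(qvx, qvy)
    if len_p == 0 or len_q == 0:
        # Zero-length edges have no direction; treat them as incompatible.
        return 0.0
    return abs((pvx * qvx + pvy * qvy) / (len_p * len_q))


def compatible_pairs(edges, threshold):
    """List index pairs of edges whose compatibility reaches the threshold."""
    pairs = []
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            if angle_compatibility(edges[i], edges[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical edges given as ((x1, y1), (x2, y2)).
edges = [
    ((0, 0), (10, 0)),   # heading east
    ((0, 1), (10, 2)),   # almost parallel to the first edge
    ((5, -5), (5, 5)),   # heading north
]

strict = compatible_pairs(edges, threshold=0.6)
loose = compatible_pairs(edges, threshold=0.05)
print("threshold 0.6: ", strict)
print("threshold 0.05:", loose)
```

With the stricter threshold only the two nearly parallel edges interact. Lowering it lets more pairs pass, which means more pairwise work and stronger bundling.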

Edgebundling is a fickle craft. Good parameter values will be dependent on the dataset and its scale – finding a good mix will likely be a process of trial and error.

Of these, initial step size will be especially influential for the look of the map.

Below is our data processed with initial step sizes of 1000, 2000, 5000 and 10,000 while keeping other parameters constant. Notice how the larger values will cause larger distortions and larger bundles whereas the smaller values will produce a more conservative outcome.

These examples use the style: bundled_edges_thin.qml

Edgebundling starting step comparison

Styling#

Bundling the lines only helps somewhat to distinguish the routes (and even that’s up for debate!). We’ll still need smart styling of the layer to make our map more useful.

Some of the ideas that went into this style:

  • Categorized layer style with DESTINATION as the value field. A qualitative color scheme should be used here. For example:

    • DE21: #ffa719 (orange)

    • DE30: #ff23e9 (violet)

    • DEA2: #63bbff (blue)

  • High transparency (opacity 20 %) to make the clusters of lines stand out.

  • Exaggerated linewidth (0.5 mm)

  • Remember to consider layouting and map elements!

    • For example, remember to fit the printout page to the data. In this case, it’s rather square.
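If you want to reuse the same palette outside the QGIS style dialog, for example in a plotting script, it could be kept as a small lookup table. The sketch below is plain Python with names of our own choosing (DESTINATION_COLORS, hex_to_rgba). It only converts the hex codes and the 20 % opacity listed above into RGBA values on a 0–255 scale.

```python
# Colors per destination region, copied from the list above.
DESTINATION_COLORS = {
    "DE21": "#ffa719",  # orange
    "DE30": "#ff23e9",  # violet
    "DEA2": "#63bbff",  # blue
}

OPACITY = 0.20  # 20 % opacity, as in the style described above


def hex_to_rgba(hex_color, opacity):
    """Convert '#rrggbb' and an opacity in [0, 1] to an (r, g, b, a) tuple."""
    digits = hex_color.lstrip("#")
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return (r, g, b, round(opacity * 255))


rgba_by_destination = {
    code: hex_to_rgba(color, OPACITY)
    for code, color in DESTINATION_COLORS.items()
}

for code, rgba in rgba_by_destination.items():
    print(code, rgba)
```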

Edgebundling final

Compare the edgebundled linemap to the graduated one. What do they highlight well? What are their weaknesses? When would, for example, a chord diagram be better suited to describe mobilities between places?

Where to dig deeper into edgebundling?#

Edgebundling can be used to create some really striking flow maps – with the right data and a lot of parameter fidgeting. Force-directed edgebundling has some downsides, as well. For one, it doesn’t scale particularly well to large datasets. This is why the example data we used is a small extract of the whole, with only some hundreds of lines to three destinations – processing anything bigger might take anything from minutes to days. Also note that this example data only has movements to Germany: usually, OD data has mobilities both to and from each place! Finally, you are unlikely to find pre-made implementations of cutting-edge algorithms in QGIS. For those, the programmatic way is usually the wiser choice.

Below are a few examples of tools for programming languages. The repositories linked have various cool examples that use other algorithms, as well!

Replicating the processing flow of this notebook#

To replicate this processing flow, run the processing model lines-edgebundling-model.model3. Open the model in QGIS from the leftmost button below Processing toolbox -> Open existing model.

You will need to add the example data and have the style files shared in the folder QGIS-files at the ready to run the model. Please also note that this model includes some hard-coded field-names (as do most of these models!). They are meant for replicating this notebook, and repurposing them for general use might require some modifications.

Line map model