A few months back I had the honour of being asked to approve the use of a couple of excerpts from my London Surname Map in The Times Atlas of London. The wait was finally over last week when I received my copy in the post. It is a great book and an essential guide to the city. The Atlas begins with the first mapped representation of London in 1250 and covers a detailed history of London mapping that provides interesting context to both the development of the city and its mapping culture.
The Atlas includes detailed information on each of the London boroughs interspersed with great photos and fascinating insights into London life covering everything from its inhabitants (below) to football and public transport (below), through to London’s World Heritage Sites.
As you would expect from an atlas there are many detailed maps (see below) to break up the extensive commentary associated with the topics I mentioned above. I think the Atlas is a brilliant book (not just because I’m in it!) and should make a great addition to the coffee table of fellow London map addicts.
Another day, another Twitter map- this time showing the global distribution of tweets that link to academic journal articles. I am always a bit skeptical of Twitter data (especially with location information) but as an academic seeking to publish in many of the journals that feature in people’s tweets I was prepared to make an exception when producing these maps. The data come from a cool service called Altmetric.
I think mapping Twitter’s engagement with the academic literature is important as it echoes the map below and provides another example of the dominance of researchers (both in terms of access and production) from a few countries in the academic literature.
You can see how the “hotspots” of collaboration align with the hotspots of tweets below.
One thing academics strive for is “impact”. This can mean many things, but one often applied criterion is the number of people outside of your immediate academic community that read your work. Twitter is becoming increasingly cited as an example of impact (for example how many followers an academic has), but as these maps demonstrate, much of the engaged audience is already where work is relatively easily accessible.
About the data from Altmetric: “This dataset lists the ~ 58k tweets that mentioned a scientific article (broadly speaking anything with a DOI, PMID or arxiv ID) between the 1st and 31st of July 2011. Recall isn’t 100%: my best estimate is that it’s missing another ~ 6k tweets where the article couldn’t be identified, the link was malformed or the journal involved is new or gets very low traffic. Twitter’s TOS prohibit re-distribution of the tweets themselves but the dataset contains the extracted links, the tweet ID and some information about the tweeter (screen name, country & lat/lng derived from their location using Yahoo! Placemaker).”
This week I feature on the 4th Episode of the Global Lab podcast. The podcast is a great new initiative led by Martin and Steve from CASA where they talk about cities, global connectivity and the impact of technology on people’s lives. Episode four features some horrendous physics jokes, Einstein’s Garden at the Green Man festival and a few minutes of me sounding a little underwhelmed about my research. You can listen here.
A map doing the rounds at the moment (thanks to a plug from flowingdata) is Derek Watkins’s brilliant map of “generic” terms for rivers in the United States (below). The map shows how different cultural and linguistic factors have influenced the naming of geographic features in the US. For example, French settlers named the streams they encountered “bayous”.
The number of rivers in the US, combined with Derek’s brilliant design, makes this a really compelling map. Inspired by this work I have quickly (with much less cartographic flair) extracted the major rivers and streams in Great Britain from the Ordnance Survey’s Strategi dataset and coloured them according to whether they are a “river”, “canal” (not sure if this really counts in terms of naming), “water”, “afon” (Welsh for river) or “brook”. You can see that a clear geography exists. I was not surprised by all the “afons” being in Wales but I was surprised to see so many “waters” in Scotland.
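The colouring step boils down to picking the generic term out of each watercourse name. A minimal sketch of that idea in base R is below. It is not the code used for the map: the `name` column, the sample names and the `classify_generic` helper are assumptions for illustration, and the real Strategi attributes may be structured differently.

```r
# Hypothetical sample of watercourse names; the real Strategi attribute
# names and values may differ.
rivers <- data.frame(
  name = c("River Thames", "Afon Teifi", "Water of Leith",
           "Grand Union Canal", "Bourne Brook", "Allt Mor"),
  stringsAsFactors = FALSE
)

# Generic terms to look for, in the order they are tested.
generics <- c("afon", "canal", "brook", "water", "river")

# Return the first generic term found in a name, or "other" if none match.
classify_generic <- function(name) {
  hits <- vapply(generics, function(g) {
    # Match the term as a whole word, ignoring case.
    grepl(paste0("\\b", g, "\\b"), name, ignore.case = TRUE, perl = TRUE)
  }, logical(1))
  if (any(hits)) generics[which(hits)[1]] else "other"
}

rivers$generic <- vapply(rivers$name, classify_generic, character(1),
                         USE.NAMES = FALSE)
print(table(rivers$generic))
```

The resulting `generic` column can then be used as the colour variable when the lines are drawn.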
On the topic of naming, settlement names also have a clear geography as they, like rivers in the US, reflect the different settlers (or invaders!) of the British Isles over millennia. The map below (taken from my thesis) shows the different naming influences on settlements in Britain. The most striking aspect is the abrupt end to the Viking settlement names along what is called the Danelaw Line. So if you live north of this line you will be using more Viking words on a daily basis than those to the south. There are loads of people studying and recording the different place naming conventions in Britain- I would recommend you check out the “Institute for Name Studies” if you want more information.
“Treemaps display hierarchical (tree-structured) data as a set of nested rectangles. Each branch of the tree is given a rectangle, which is then tiled with smaller rectangles representing sub-branches. A leaf node’s rectangle has an area proportional to a specified dimension on the data. Often the leaf nodes are colored to show a separate dimension of the data. When the color and size dimensions are correlated in some way with the tree structure, one can often easily see patterns that would be difficult to spot in other ways, for example, if a certain color is particularly relevant. A second advantage of treemaps is that, by construction, they make efficient use of space. As a result, they can legibly display thousands of items on the screen simultaneously.” (source: Wikipedia)
I think treemaps are becoming an increasingly popular method of visualising multidimensional datasets and there are many ways to create them. This tutorial is going to focus on the use of R and builds on the one published on flowingdata.
Open R and install the following packages:
install.packages("portfolio")
install.packages("RColorBrewer")
Now load them:
library(portfolio)
library(RColorBrewer)
The next step is to load in the data file we are using. This is an edited version of the London Borough Profiles csv taken from the London Datastore. There are five columns of data. The three we are interested in are “pop”, “earnings” and “party”.
input<-read.csv("https://jcheshire.com/wp-content/uploads/2011/08/tree_eg_data.csv")
# attach the data so that its columns (id, pop, earnings, party) can be referred to by name in the calls below
attach(input)
A treemap generally requires 4 pieces of information:
the item- in this case the London boroughs or “id”- each will be assigned a rectangle,
a value to scale the size of the rectangle by- in this case the population or “pop”,
a value for assigning the colour- in this case the average earnings per person or “earnings”,
and a broader group to which the item belongs- in this case the ruling political party or “party”.
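To make the four inputs concrete, here is a small sketch with made-up numbers (they are not the London figures, and the `boroughs`, `share_of_plot` and `group_share` names are only for illustration). It shows how the size value translates into the share of the plot that each rectangle, and each group of rectangles, should occupy.

```r
# Made-up values for illustration only; these are not the real figures.
boroughs <- data.frame(
  id       = c("A", "B", "C", "D"),          # the item: one rectangle each
  pop      = c(300, 200, 250, 250),          # sizes the rectangle
  earnings = c(30, 25, 40, 35),              # drives the colour
  party    = c("Lab", "Lab", "Con", "Con"),  # the broader group
  stringsAsFactors = FALSE
)

# Share of the whole plot taken by each item's rectangle.
boroughs$share_of_plot <- boroughs$pop / sum(boroughs$pop)

# Share of the plot taken by each group's enclosing rectangle.
group_share <- tapply(boroughs$pop, boroughs$party, sum) / sum(boroughs$pop)

print(boroughs)
print(group_share)
```

The colour value plays no part in the layout, which is why size and colour can show two unrelated variables at once.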
Armed with this we can simply use the map.market function from the portfolio package (installed earlier) to produce a treemap.
map.market(id, pop, party, earnings, lab = c(TRUE, TRUE), main="London Earnings, Population and Politics")
The output looks OK but I don’t really like the colours. I have therefore edited the code so that colours can be chosen from the ColorBrewer palettes. You can either download the code or load it straight away with
source("http://dl.dropbox.com/u/10640416/treemapbrewer.r")
You have now loaded a new function called “treemap” that does a very similar job to the one above but has a few extra options, which you can see used below:
treemap(id, pop, party, earnings, lab = c(TRUE, TRUE), main="London Earnings, Population and Politics", pal="Oranges", linecol= "dark gray", textcol="white")
The plot above used the “Oranges” palette but there are many more such as “Blues”, “BuPu” and “Reds”. Try for example
treemap(id, pop, party, earnings, lab = c(TRUE, TRUE), main="London Earnings, Population and Politics", pal="Blues", linecol= "white", textcol="black")
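For anyone curious how a sequential palette ends up on the rectangles, the general idea is to cut the colour variable into classes and look up one colour per class. The sketch below uses only base R and made-up earnings values. The two hex colours are an arbitrary light-to-dark orange ramp of my choosing, not the exact ColorBrewer “Oranges” palette, and this is not the code inside the treemap function.

```r
# Made-up earnings values for illustration.
earnings <- c(420, 510, 610, 700, 980)

# A light-to-dark orange ramp built with base R. These two hex colours are
# an arbitrary choice, not the exact ColorBrewer "Oranges" palette.
ramp <- colorRampPalette(c("#FEE6CE", "#D94801"))
n_classes <- 5
cols <- ramp(n_classes)

# Cut the range of the data into equal-width classes and look up a colour
# for each value. The breaks start and end at the data minimum and maximum.
breaks <- seq(min(earnings), max(earnings), length.out = n_classes + 1)
class_id <- cut(earnings, breaks = breaks, include.lowest = TRUE,
                labels = FALSE)

print(data.frame(earnings, class_id, colour = cols[class_id]))
```

Because the breaks are built from the minimum and maximum of the data, no class falls outside the range of the values being plotted.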
When you are happy with the results save the plot as a pdf
pdf("my_tree_map2.pdf")
treemap(id, pop, party, earnings, lab = c(TRUE, TRUE), main="London Earnings, Population and Politics", pal="Oranges", linecol= "dark gray")
# close the graphics device so that the file is written to disk
dev.off()
or PNG
png("my_tree_map2.png")
treemap(id, pop, party, earnings, lab = c(TRUE, TRUE), main="London Earnings, Population and Politics", pal="Oranges", linecol= "dark gray")
# close the graphics device so that the file is written to disk
dev.off()
and then you can edit it using image/ vector editing software such as GIMP or Inkscape to get the following result:
I hope that the options to change colours make for more interesting treemaps than the standard red/green ones we are used to seeing. If anyone knows how to alter the scale bar so that it does not show values beyond the range of the data it would be great to see how it is done!
In the UK, August is exam results month for 16-18 year olds. Every year, photos of leaping teenagers clutching their results are accompanied by reports of record attainment rates, debates around how challenging modern exams are and, more so recently than ever, concerns about the number of sixth form and university places. Back in March the full list of the 2010 GCSE results (exams taken by UK 16 year olds [except in Scotland]) was released and I mapped them but never got round to sharing the maps with anyone. Now seems a good time to do this so here goes…
The map below uses the increasingly popular cartogram method to show the success of students in each Local Authority (LA) across England. The non cartogram version is also shown alongside.
This is quite a coarse map as England is only split into the 152 LAs and we know there is much greater variation between schools at a local level and even sometimes within individual schools. Moreover, schools on authority borders often serve communities from the areas on either side, limiting the application of LA data to their populations only. Independent (fee-charging) schools are also included in these broad LA results, which is significant when we take into account the predictably higher results of fee-paying pupils and the fact that these schools have not been established with regard for even distribution across the country. The size of the LA (in school-age population terms) does not seem to have a strong link to the results of its pupils. There must be other factors at play. Worryingly, existing evidence indicates that a pupil’s level of deprivation has a stark impact on his/her attainment. This is supported by the plot for London below that shows the relationship between a borough’s mean national deprivation rank (based on the index of multiple deprivation or IMD) and its GCSE results.
Another way to represent this information is by mapping the 2010 GCSE scores for each of the London Boroughs and resizing the borough so that it represents the levels of child poverty (measured by number of under 16s receiving means-tested benefits).
Again, the map above is not perfect as it is still quite generalised and shows only one of the many measures of child poverty that are used. Both maps also show only one measure of attainment: the “GCSE or Equivalent” score. The “or Equivalent” bit is important here as it covers a wide range of more vocational qualifications (called NVQs) that are often perceived as less academically challenging and can be a way for students to get the equivalent of 5 A* to C grades including English and maths (a key educational benchmark) without having to be proficient in these core subjects. This is important as schools in England are often ranked by the proportion of their students achieving this benchmark, resulting in a possible bias towards the schools offering more vocational subjects and against those offering more challenging ones such as modern languages. It is interesting to consider whether the nature of equivalent qualifications makes them more likely to be used by certain types of school, and to explore this further I have produced the plots below. The codes are as follows: AC= Academy, CTC= City Tech. College, CY= Community School, CYS= Community Special School, FD= Foundation School, FDS= Foundation Special School, IND= Registered Independent School, INDSS= Independent Special School, NMSS=Non-Maintained Special School, VA= Voluntary Aided School, VC= Voluntary Controlled School (if you are as baffled about these as I was see here or here).
The plot shows 9 regions of England. Each point represents a school in that region and is coloured by its type. On the x-axis is the inverse (higher= better) regional ranking of the school based on its GCSE scores only and on the y-axis is the regional ranking if “equivalents” are included. If the inclusion/ exclusion of equivalents made no difference to the rankings then the points would follow the grey lines perfectly. In reality we get schools falling either side of this line with those under it benefitting if equivalents are counted and those above benefitting if they are excluded. For example, broadly speaking independent schools (light blue) look worse when GCSE equivalents are used in the ranking criteria and therefore would benefit if such qualifications were excluded. This also seems to be the case for the voluntary controlled schools in pink. Academy Schools (orange) however do much better with the inclusion of equivalent qualifications perhaps reflecting a more vocational emphasis to their curriculum. There are also some interesting regional distinctions with independent schools, for example, in the South West and South East appearing to do well whatever the ranking criteria whilst the East/ West Midlands and the North East present a more mixed picture. I think a lot more can be said about these plots so I would welcome comments!
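The comparison behind these plots can be sketched in a few lines of base R. The scores below are made up for five hypothetical schools in one region, and the column names are assumptions, so treat this as an outline of the idea rather than the code that produced the figures.

```r
# Made-up scores for five hypothetical schools in one region.
schools <- data.frame(
  school     = c("S1", "S2", "S3", "S4", "S5"),
  gcse_only  = c(80, 55, 60, 30, 45),  # % reaching the benchmark, GCSEs only
  with_equiv = c(82, 75, 62, 70, 50),  # % reaching it with equivalents
  stringsAsFactors = FALSE
)

# rank() gives 1 to the lowest score, so a higher rank means a better
# result (the "inverse" ranking used on the plot axes).
schools$rank_gcse  <- rank(schools$gcse_only)
schools$rank_equiv <- rank(schools$with_equiv)

# Positive values: the school ranks higher once equivalents are counted.
schools$rank_change <- schools$rank_equiv - schools$rank_gcse
print(schools)
```

Plotting `rank_gcse` against `rank_equiv` for every school in a region, with a reference line where the two are equal, gives one panel of the kind described above.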
The Guardian have been keeping track of the magistrates’ court cases and convictions resulting from the recent rioting in England. Using this data I have produced the “tree map” below. For each magistrates’ court I have grouped the offences and represented each one as a square. The size of the square represents the number of people who have committed the offence and its colour is the mean age of the offenders. I have highlighted some of the most frequent/ serious offences in each area.
[Image: England Riots Tree Map, https://jcheshire.com/wp-content/uploads/2011/08/riots_tree.png]
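The numbers behind each square are just a count and a mean age per court and offence. A rough sketch of that summary step in base R is below. The records are invented for illustration and are not the Guardian’s data, and the column names (`court`, `offence`, `age`) are assumptions.

```r
# Made-up records for illustration; this is not the Guardian's data.
cases <- data.frame(
  court   = c("Court A", "Court A", "Court A", "Court B", "Court B"),
  offence = c("Burglary", "Burglary", "Theft", "Burglary",
              "Violent disorder"),
  age     = c(18, 22, 30, 17, 25),
  stringsAsFactors = FALSE
)

# Number of people per court/offence pair: this sizes the square.
counts <- aggregate(age ~ court + offence, data = cases, FUN = length)
names(counts)[names(counts) == "age"] <- "n"

# Mean age per court/offence pair: this colours the square.
means <- aggregate(age ~ court + offence, data = cases, FUN = mean)
names(means)[names(means) == "age"] <- "mean_age"

squares <- merge(counts, means, by = c("court", "offence"))
print(squares)
```

A table in this shape has the same four ingredients as any treemap input: an item (the offence), a size (`n`), a colour value (`mean_age`) and a group (the court).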
I was struck by the seriousness of the offences and the age of the perpetrators represented above. As time goes on many more squares will need to be added, but I think the plot provides a useful generalisation of the nature of the riots across England.
Migrations of people have existed for millennia and occur at a range of scales and time-periods (from small-scale journeys to work through to intercontinental resettlement). As a geographer I have long been interested in these and thought it was about time I mapped them! Using data from the Global Migrant Origin Database (thanks Adam for the tip) and R, my favourite stats software, I have produced the maps you see here (click on them for higher resolution). Each line shows the origins and destinations of at least 4000 people in a given year (2000 in this case). The redder the line, the more people it represents. I have used great circle distance to plot them onto the Earth. The map below shows the same magnitude of flows but just for Europe. The Earth has been flattened for this one so the flows are represented by arbitrary arcs.
These visualisations aren’t perfect. Firstly they are based on a dataset where many of the movements are best guesses rather than measured data. You can read more about this here. It would also be great to have actual flows rather than inferred flows based on the number of migrants in each country. If I made these maps again I might draw lines between capital cities or population centres to avoid the impression that the majority of migrations to/ from Russia start/end in Siberia for example. There are of course endless ways of partitioning the data/ selecting the colours. Despite this I am really pleased with the effect and the maps go some way to showing the dynamism in many 21st Century populations.
I think Paul Butler’s Facebook Map threw down the gauntlet to the R community in terms of the quality of visualisations that can be produced with the software so I was keen to see what I could do. To produce the maps I calculated the great circle distances using the geosphere package, calculated my own arcs for the second map and used the maps package for my World outline. The visualisations (including projections) were done using ggplot2. Over the next few months I plan to stick together a more complete tutorial (PhD write-up permitting!). **UPDATE** the flowingdata blog has beaten me to it, see here.
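Until then, here is a rough idea of what drawing a flow along a great circle involves. The sketch interpolates points between two places on a sphere using base R only. It is a hand-rolled stand-in for what the geosphere package does, not the code behind the maps, and the `great_circle_points` function and the example coordinates are mine.

```r
# Convert degrees to radians and back.
deg2rad <- function(d) d * pi / 180
rad2deg <- function(r) r * 180 / pi

# Points along the great circle between two lon/lat pairs (in degrees).
# Assumes a spherical Earth; it breaks down if the two points are
# identical or exactly opposite each other on the globe.
great_circle_points <- function(lon1, lat1, lon2, lat2, n = 50) {
  to_xyz <- function(lon, lat) {
    lon <- deg2rad(lon)
    lat <- deg2rad(lat)
    c(cos(lat) * cos(lon), cos(lat) * sin(lon), sin(lat))
  }
  a <- to_xyz(lon1, lat1)
  b <- to_xyz(lon2, lat2)
  # Angle between the two position vectors (clamped for numerical safety).
  omega <- acos(max(-1, min(1, sum(a * b))))
  f <- seq(0, 1, length.out = n)
  # Spherical linear interpolation between a and b, one column per point.
  pts <- sapply(f, function(t) {
    (sin((1 - t) * omega) * a + sin(t * omega) * b) / sin(omega)
  })
  data.frame(lon = rad2deg(atan2(pts[2, ], pts[1, ])),
             lat = rad2deg(asin(pmax(-1, pmin(1, pts[3, ])))))
}

# Example: London to New York (approximate coordinates).
arc <- great_circle_points(-0.13, 51.51, -74.01, 40.71, n = 20)
print(head(arc))
```

Each flow becomes a data frame of points like `arc`, which can be drawn as a line on top of the world outline.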
Flattening the Earth so that it can be easily drawn on a 2-dimensional surface is complicated. Over many years map projections have been developed to aid in this process, but they can only really estimate (albeit very accurately) the shape and dimensions of things on the Earth’s round surface. Whilst it is important to understand the technical aspects of map projections, it is also worth considering the effects that such transformations can have on people’s view of the world.
The image below shows an assortment of map projections of the UK (and one of Great Britain). These have all been taken from Wikipedia so the level of detail along the coastline varies a little. They demonstrate nicely the effect that different map projections can have on the shape of a country.
As you can see, some of the projections have squashed the UK whilst others have stretched it or changed its orientation. The British National Grid is the best representation because it has been designed specifically for Britain. It is the projection you will see used on Ordnance Survey maps and therefore most printed maps of the UK (it is rarer to find it online). Whilst excellent for Britain, the National Grid projection does not work on a global scale because it would cause massive distortions to the other countries. Instead, we should apply global projections, which need to be chosen carefully depending on their purpose and the scale of the map being produced. A poor choice of projection can have significant consequences because the relative size of a country on a map matters. Perception of a country’s size is a delicate issue: in international politics, for example, countries which appear small on the map fear being overlooked. Indeed, this has long been the argument against the commonly-used Mercator projection, especially with reference to Africa, which appears relatively small on maps of this style. This effect is seen in Kai Krause’s “True Size of Africa”.
However, even this map has been criticised for using an inappropriate projection. The Economist, for example, responded by producing a map that maintains the correct relative areal proportions between each of the countries included (with Gall’s Stereographic Cylindrical Projection). Nonetheless, both maps illustrate the way in which our perceptions of this vast continent have been altered by “mainstream” mapping practices.
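The size problem with Mercator can be put into numbers. On a spherical Earth the projection stretches both axes by 1/cos(latitude), so areas are inflated by 1/cos(latitude)^2 relative to the equator. The short sketch below tabulates this for a few latitudes. The function names and the choice of latitudes are mine (54°N is roughly the middle of Great Britain).

```r
# Mercator northing for a latitude in degrees, on a unit sphere.
mercator_y <- function(lat_deg) {
  phi <- lat_deg * pi / 180
  log(tan(pi / 4 + phi / 2))
}

# Mercator stretches both axes by 1 / cos(latitude), so areas are inflated
# by 1 / cos(latitude)^2 relative to the equator.
area_inflation <- function(lat_deg) {
  1 / cos(lat_deg * pi / 180)^2
}

lats <- c(0, 30, 54, 60, 72)
print(data.frame(latitude       = lats,
                 mercator_y     = round(mercator_y(lats), 3),
                 area_inflation = round(area_inflation(lats), 2)))
```

At 60° the inflation factor is exactly 4, which is why countries at high latitudes look so large next to those that straddle the equator, as most of Africa does.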
The most widely used projection (in the Western World at least) is the Mercator Projection, but as the West Wing explains this has a number of flaws that are often overlooked…
If you are interested in where people live, then the above picture may also be misleading. It is possible to alter projections so that the size of the country on the map is influenced by its total population.
It is clear that cartographers can produce different views of the world. We, as informed consumers of maps, need to be aware of this, to think twice about what we see and to consider how the information would look if projected differently. More importantly, by asking why the cartographer chose the projection they did, we may even be able to learn something beyond what we see on paper.