Supporting students and staff in the world of big data

Business, government and non-governmental organisations increasingly rely on large volumes of data from novel sources to make decisions. This includes data from mobile phones, web-scraping, social media, environmental sensors and satellites. As a result, the need for people with the technical skills to process and analyse such data, along with the soft skills to communicate findings and their implications, has given rise to a new profession: the data scientist. However, many of these large data sets are also spatial, meaning they comprise geographic information about the earth and its features, as a road map does. This requires people with a third skill: spatial data awareness.

Large spatial data sets present unique challenges, which have implications for what students need to learn and, consequently, for the way that Geography is taught. This is why Dr Jon Reades and Dr James Millington have developed a suite of undergraduate ‘Spatial Data Science’ modules for the Department of Geography.

Together with Dr Chen Zhong and Dr Zara Shabrina, they are making the most of online, digitally-enhanced learning to ensure that students graduate with the much-needed computational, as well as quantitative, skills for life beyond their degree.

Making big data accessible to students

Code is literally everywhere, and it is increasingly important that undergraduate students acquire computational skills alongside quantitative ones if they are to manoeuvre in the world of big data. However, very few undergraduate programmes currently offer more than a one-off Geographic Information Systems (GIS) class.

Geography's Spatial Data Science pathway is unique, offering students no fewer than three modules, including Foundations, Principles and Applications of Spatial Data Science, to develop both technical competence and critical rigour in dealing with (spatial) data.

This is important because students coming into a Geography undergraduate degree may not have much experience with quantitative data, nor with the current crop of tools used to analyse them. They may also find it challenging to visualise quantities or make maps, and although more students are coming to King’s with some kind of introductory Computer Science class under their belts, these classes have rarely been connected to the practical problems associated with collecting, analysing and interpreting data.

Consequently, the Spatial Data Science pathway has been designed to take students with little or no experience of programming and to empower them to understand and make use of code in real-world analytical challenges. To support students with no prior experience, the pathway also includes an optional introductory ‘Code Camp’ in which students learn the basics of the Python programming language online using geographical examples.

Although there are many ways to learn Python, our staff think that students learn best when they have access to familiar concepts, examples, applications and even names. To enable everyone to join in equally, Code Camp instruction is done using ‘Jupyter notebooks’, which allow students to run Python code in a web browser without having to install anything on their computer, tablet or phone.
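To give a flavour of the kind of short, geographical exercise a single notebook cell might hold, here is a minimal sketch. It is an illustration only and is not taken from the Code Camp materials: the function name `great_circle_km`, the city coordinates and the choice of the haversine formula are all assumptions made for this example.

```python
import math

# Mean Earth radius in kilometres (an approximation; the Earth is not a perfect sphere).
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Approximate distance in km between two points given in decimal degrees.

    Uses the haversine formula, which treats the Earth as a sphere.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)

    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Illustrative (latitude, longitude) pairs; approximate city-centre values.
london = (51.5074, -0.1278)
paris = (48.8566, 2.3522)

distance = great_circle_km(*london, *paris)
print(f"London to Paris: {distance:.0f} km")
```

In a notebook, a cell like this would sit between a paragraph explaining the idea and the printed result, so students can change the coordinates and re-run it straight away.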

The design of our BA and BSc degrees, and of the Spatial Data Science pathway, gives students a chance to ‘try before they buy’. Many students are inspired by the process of learning to code and decide to continue on to further study in Data Science and Geoinformatics master's programmes. Meanwhile, others go on to become data scientists or client partners for data-intensive companies. Yet, if students decide that it’s not for them, then they can stop at any time.

Why we use Jupyter notebooks to teach big data

Jupyter notebooks remove significant barriers to teaching coding skills and big data analytics because they foreground interaction with data rather than with the code itself. We can combine explanation, commentary, code and output (eg graphs, statistics and maps) in a single document. Plus, it all works through a web browser.

Making Python, via Jupyter notebooks, more accessible allows students to learn through ‘hacking’ (in the MIT sense) and thereby come to better understand spatial analytical methods. Students also don’t need to know where their code is running and, together with the virtualisation tools (eg Docker and Vagrant) heavily used by companies like Amazon, Google and Apple, it becomes possible to hide some of the complexity of managing local programming language installations.

The ability to provide rich media and contextual information along with the code is one of the reasons that practising data scientists have adopted Jupyter, and it is the same reason that it is a good tool for teaching.

Trying out Code Camp

We want to get students connected early to the tools they will be using throughout their degree and beyond. They can learn the basics of coding from home via our introductory Code Camp. There is no software to install and we think the camp offers huge benefits for students if they are to hit the ground running with coding when they start their degree.

In this story

James Millington

Reader in Landscape Ecology
