Big Data: What is it? How Can It Help Advance Sustainable Development?

What is big data and how could it help advance sustainable development? We sat down with Geoff Gunn from our Water Program to learn more.

By Geoffrey Gunn, Sumeep Bath on June 26, 2017

First things first. What is big data?

Many different streams of data are continually being collected and processed. Data counts as "big data" when it has one or more of the following characteristics:

  • High volume (there is a lot of it)
  • High velocity (it is collected at a fast rate)
  • High variety (it measures a lot of different things)

Can you give us some examples of big data?

There are so many examples in daily life that you may not recognize every instance as big data. They include data collected from satellites, Internet searches, mobile phones, remote sensing and more. Big data is often used to reveal patterns, trends and associations, especially relating to human behaviour and interactions.

Think about when you log into Twitter. Twitter develops a feed of "trending" topics for your location, showing the topics that are most searched for and posted about there at the current time. To populate that list, Twitter will have used big data. It will have collected data from millions of Twitter users (high volume), every second (high velocity), about what people are searching for, clicking on and posting about (high variety). All this data is processed very quickly to produce the real-time "trending" list.
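
As a rough illustration of the idea (not Twitter's actual system, which is not described here), a "trending" list can be thought of as counting recent mentions of each topic per location. The sketch below assumes a hypothetical stream of (timestamp, location, topic) records and a one-hour window; the function name, field layout and sample data are all invented for this example.

```python
from collections import Counter, defaultdict


def trending_topics(posts, now, window_seconds=3600, top_n=3):
    """Return the most-mentioned topics per location within a recent time window.

    posts is an iterable of (timestamp, location, topic) tuples; this layout
    is an assumption made for the example, not a real Twitter data format.
    """
    counts = defaultdict(Counter)
    for timestamp, location, topic in posts:
        # Only count posts that fall inside the recent window.
        if 0 <= now - timestamp <= window_seconds:
            counts[location][topic] += 1
    # Keep the top_n topics for each location, most frequent first.
    return {
        location: [topic for topic, _ in counter.most_common(top_n)]
        for location, counter in counts.items()
    }


# Hypothetical sample data: timestamps are seconds on an arbitrary clock.
posts = [
    (10, "Winnipeg", "#old"),      # too old to count when now=3700
    (100, "Winnipeg", "#flood"),
    (200, "Winnipeg", "#flood"),
    (250, "Winnipeg", "#jets"),
    (260, "Toronto", "#transit"),
]
result = trending_topics(posts, now=3700)
print(result)
```

At real scale the same counting would be spread across many machines and updated continuously, which is where the volume and velocity become the hard part.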

Just because data is tracked over a long period of time does not mean that it is big data. For example, at IISD Experimental Lakes Area, researchers have maintained vast datasets as they have monitored lakes over the last 50 years. Even though this data is unique and highly valuable, it would not count as big data, as it is collected periodically (once every few weeks) and measures only a few variables. It is also relatively small—you could fit it all on a USB stick.

What are some of the challenges that we face when dealing with big data?

One of the most pressing concerns regarding big data is storage and processing. Imagine collecting data from every citizen in Canada, every second. That is going to result in huge amounts of data that need to be stored in large data centres, which require a lot of space and money to maintain.

Then there are also the questions of privacy and security. How do people feel about data they perceive as private being stored, shared, and potentially analyzed and used without their knowledge or consent? Both proponents and critics of big data need to discuss how we can use it to help society while addressing these concerns.

Are there potential applications for sustainable development?

Given that one of the major uses for big data is to track and analyze human populations and movements, there are certainly many possible applications for the sustainable development agenda.

A recent survey from the United Nations Statistics Division looked at how non-governmental organizations are using big data to implement and report on the progress of the Sustainable Development Goals (SDGs). In fact, they have set up a whole task team to determine how big data can help us achieve the SDGs. Among the many interesting findings is that non-governmental organizations are primarily using big data from mobile phone use, which is ubiquitous in many developing countries, to inform their work. They don’t go into detail about what that information is used for, but we might expect that it could be used, for example, to track the movement of refugees and determine where large groups of people are moving to, so that there can be adequate preparation.

The United States Geological Survey is currently working on technology that would glean big data from smartphones and mobile devices in order to supplement existing earthquake detection systems. This technology could serve regions of the world that cannot afford higher-quality, but more expensive, conventional earthquake early warning systems, or could complement those systems.

This is somewhat related to the big data used to determine the locations of current traffic jams and update programs such as Google Maps in real time. This information can then be used to improve the design of road infrastructure, thus making driving less carbon intensive and safer for everyone involved.

How are IISD and its Water Program already using big data?

There are many ways we are already using big data to develop our work and reach information that would otherwise be out of reach. For example, when I was working on developing the Manitoba Bioeconomy Atlas, I needed a clear picture of the whole province over the last 10 years in order to determine where biomass can be found, and how its availability had changed over time. Thanks to public domain satellite data, I was able to access satellite images of the whole province for the last 10 years free of charge. I then used Google Earth Engine to process that data into a full picture of what the province has looked like over the last decade, so that I could accurately track biomass availability. Moreover, I could remove cloudy pixels and analyze the data in a fraction of the time that it would have taken on my desktop.
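
To make "removing cloudy pixels" a little more concrete, here is a minimal sketch in plain Python of one common approach: for each pixel, ignore the observations flagged as cloudy and take the median of the rest. This is not the Google Earth Engine API and not necessarily the method used for the Atlas; the function name, the tiny 2x2 images and the cloud flags are all made up for illustration.

```python
from statistics import median


def cloud_free_composite(images, cloud_masks):
    """Per-pixel median over time, skipping observations flagged as cloudy.

    images is a list of 2D grids of pixel values (one grid per date), and
    cloud_masks is a matching list of 2D grids of booleans (True = cloudy).
    Pixels that are cloudy on every date are left as None.
    """
    rows, cols = len(images[0]), len(images[0][0])
    composite = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect only the clear-sky observations for this pixel.
            clear = [
                image[r][c]
                for image, mask in zip(images, cloud_masks)
                if not mask[r][c]
            ]
            if clear:
                composite[r][c] = median(clear)
    return composite


# Three hypothetical 2x2 images of the same area on different dates.
images = [
    [[20, 90], [40, 50]],
    [[30, 30], [90, 50]],
    [[40, 10], [60, 90]],
]
# True marks a pixel that was covered by cloud on that date.
cloud_masks = [
    [[False, True], [False, False]],
    [[False, False], [True, False]],
    [[False, False], [False, True]],
]
composite = cloud_free_composite(images, cloud_masks)
print(composite)
```

A platform such as Google Earth Engine runs this kind of per-pixel operation across many images on remote servers, which is why it can be so much faster than a desktop.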

At IISD Experimental Lakes Area, we are actually helping to verify, or "ground-truth," data taken by satellites. Working with Natural Resources Canada (NRCan), we have set up cameras on some of our lakes so that we can compare their images with the (big) data that NRCan is collecting with its satellites about when the lakes freeze over and melt. It is exciting stuff, and just goes to show that big data has its limitations, which can be somewhat mitigated by on-the-ground work.

We look at big data from both sides: at the global scale so we may discover more about the complexity of the physical and human features of our world, and at the human scale to ensure it is meaningful, helpful and representative of the seven billion people on Earth.