Mobile Phone Positioning Data 101 – Data for Development Series

May 7, 2020
Posted in Data 4 Good

Mobile Phone Positioning or Location Data – Opportunities for the Development Sector.

By DevCAFE Geek Squad

There is increasing interest in the use of data science for development, and big data, artificial intelligence and machine learning are gaining ground. In this blog we explain one of the serious contenders in this space: ‘Mobile Phone Positioning Data’, or ‘Location Data’. Nearly all of us carry mobile phones and therefore continuously generate real-time data. This introductory blog explains the basics of what constitutes this data. Our next blog will cover how we apply it in development.

 

Ingredients of Mobile Phone Data | Source: The Development CAFÉ

Mobile location data provides a granular solution for consumer understanding. Combining this understanding with other datasets is helping to solve business problems and achieve goals across many different industries. For these reasons, location data has quickly become the holy grail of mobile. Its applications are broad and span many industries and verticals.

But before we get onto that, what exactly is location data?

What is location data?

The smartphone

The mobile device or smartphone has been revolutionary. Its growth has been incredible: by many estimates there are now more of these devices in the world than there are people.

Smartphones have transformed everything about our everyday lives. We rarely leave home without them, and they are always on our person, ready to provide us with instant information or guidance.

These devices have enabled the location data industry to understand how audiences move and behave in the real world. This information is location data. It comes in many different forms and from various sources.

What is location data?

Location data is geographical information about a specific device’s whereabouts, associated with a time identifier (a timestamp).

This device data is assumed to correlate to a person. A device identifier then acts as a pseudonym to separate the person’s identity from the insights generated from the data.

Location data is often aggregated to provide insights into audience movement at significant scale.
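To make the definition concrete, here is a minimal sketch of what a single location record and a simple aggregation might look like. The class name, field names, sample coordinates and the choice of grid size are all our own assumptions for illustration, not a standard schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LocationPing:
    """One observation: where a device was, and when (hypothetical schema)."""
    device_id: str       # pseudonymous identifier, not a name or phone number
    latitude: float      # decimal degrees
    longitude: float     # decimal degrees
    timestamp: datetime  # when the position was observed


def grid_cell(ping: LocationPing, decimals: int = 2) -> tuple[float, float]:
    """Round coordinates to a coarse grid cell (0.01 degrees of latitude is roughly 1 km)."""
    return (round(ping.latitude, decimals), round(ping.longitude, decimals))


def devices_per_cell(pings: list[LocationPing]) -> Counter:
    """Aggregate: count distinct devices seen in each grid cell."""
    seen = {(grid_cell(p), p.device_id) for p in pings}
    return Counter(cell for cell, _ in seen)


# Made-up sample data for three devices in central London.
pings = [
    LocationPing("device-a", 51.5074, -0.1278, datetime(2020, 5, 7, 8, 0, tzinfo=timezone.utc)),
    LocationPing("device-a", 51.5079, -0.1281, datetime(2020, 5, 7, 8, 5, tzinfo=timezone.utc)),
    LocationPing("device-b", 51.5072, -0.1276, datetime(2020, 5, 7, 8, 2, tzinfo=timezone.utc)),
    LocationPing("device-c", 51.5308, -0.1238, datetime(2020, 5, 7, 8, 3, tzinfo=timezone.utc)),
]
counts = devices_per_cell(pings)
```

Counting distinct devices per cell, rather than raw pings, keeps one chatty device from dominating the picture. It is only one of many possible ways to aggregate.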

How is location data generated?

Companies collect location data using several different techniques, which differ in reliability (but more on that later).

For now, the primary process of collecting location data requires the following ingredients.

A location source/signal

The first ingredient is a location signal. This signal is not a product of the device itself. It comes from another piece of technology that produces signals. The device listens to these external signals and uses them for positioning. These signals are as follows:

 

GPS

GPS is shorthand for the global positioning system and was first developed in the 1970s. The system is made up of over 30 satellites which are in orbit around the earth. This technology works in your device by receiving signals from the satellites.

The device can calculate where it is by measuring the time it takes for signals from several satellites to arrive.
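The idea of turning a signal's travel time into a distance can be sketched in a few lines. This is only the range step: a real receiver combines ranges from at least four satellites to solve for its position and its own clock error, which this sketch does not attempt. The function name and sample travel time are our own.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # radio signals travel at the speed of light


def signal_range_m(travel_time_s: float) -> float:
    """Distance covered by a radio signal in the given travel time."""
    return SPEED_OF_LIGHT_M_PER_S * travel_time_s


# GPS satellites orbit at roughly 20,200 km, so a signal from a satellite
# directly overhead takes on the order of 67 milliseconds to arrive.
distance = signal_range_m(0.067)

# Timing matters: a clock error of one microsecond shifts the range by about 300 m.
error_m = signal_range_m(1e-6)
```

The second calculation shows why very precise timing is needed for GPS to reach metre-level accuracy.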

GPS location data can be very accurate and precise under certain conditions, mostly in outdoor locations. In the best instances, the signal can be reliable to within a 4.9 metre radius under open sky (source).

 

Wi-fi

Wi-fi networks are another source of location signals that are great at providing accuracy and precision indoors. Devices can use this infrastructure for more accurate placement when GPS and cell towers aren’t available, or when these signals are obstructed.

 

Beacon

Beacons are small devices that are usually found in a single, static location. Beacons transmit low energy signals which smartphones can pick up.

As with Wi-fi, the device uses the strength of the signal to understand how far away from the beacon it is.

Beacons are incredibly accurate and, with optimal signal strength, can be used to place a device within half a meter.
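One common way to turn signal strength into a distance is the log-distance path loss model, sketched below. We are assuming this model for illustration; the calibration value of -59 dBm at one meter and the path loss exponent of 2 (free space) are typical placeholder values, and real beacons and environments will differ.

```python
def estimate_distance_m(
    rssi_dbm: float,
    rssi_at_1m_dbm: float = -59.0,    # assumed calibration: signal strength at 1 m
    path_loss_exponent: float = 2.0,  # assumed: 2.0 is free space, higher indoors
) -> float:
    """Estimate distance to a beacon from received signal strength (RSSI)."""
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))


near = estimate_distance_m(-59.0)  # signal as strong as the 1 m calibration value
far = estimate_distance_m(-79.0)   # 20 dB weaker, so ten times further in free space
```

Because walls, bodies and reflections all change signal strength, estimates like this are noisy in practice and are usually smoothed over several readings.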

 

Carrier data/cell towers

Mobile devices are usually connected to cell towers so that they can send and receive phone calls and messages. A device can often detect multiple cell towers, and triangulation based on their signal strengths can be used to estimate the device’s location.
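As a rough illustration of positioning from several towers, the sketch below uses a weighted centroid: the estimate is pulled towards the towers with the strongest signal. This is a simplified stand-in for triangulation, not how any particular operator does it, and the tower coordinates and signal strengths are made up.

```python
def dbm_to_milliwatts(dbm: float) -> float:
    """Convert signal strength from dBm (logarithmic) to milliwatts (linear)."""
    return 10 ** (dbm / 10)


def weighted_centroid(towers):
    """Estimate a position from (lat, lon, weight) tuples.

    Averaging latitudes and longitudes directly is only reasonable over
    small areas, which is the case for towers a device can hear at once.
    """
    total = sum(weight for _, _, weight in towers)
    if total <= 0:
        raise ValueError("at least one tower must have a positive weight")
    lat = sum(lat * weight for lat, _, weight in towers) / total
    lon = sum(lon * weight for _, lon, weight in towers) / total
    return lat, lon


# Hypothetical towers: (latitude, longitude, received signal strength in dBm).
observed = [(51.50, -0.12, -70.0), (51.52, -0.10, -85.0), (51.49, -0.15, -90.0)]
towers = [(lat, lon, dbm_to_milliwatts(dbm)) for lat, lon, dbm in observed]
est_lat, est_lon = weighted_centroid(towers)
```

Here the first tower is much stronger than the others, so the estimate lands very close to it.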

 

An identifier

Each smartphone needs to be associated with an identifier to understand movement over time. This identifier is called a device ID. For iOS, this is called an Identifier for Advertisers (IDFA), and for Android, it’s called an Android Advertising ID (AAID).

 

Metadata or additional dataset (optional)

A location signal combined with an identifier will allow you to see the movement of a device over time. However, for more detailed insights and to get more value from location data, you’ll need some metadata or an additional dataset.
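Seeing the movement of a device over time comes down to grouping observations by identifier and putting them in time order. A minimal sketch, assuming each observation is a simple tuple of device ID, Unix timestamp, latitude and longitude (our own layout, with made-up values):

```python
from collections import defaultdict


def build_trajectories(pings):
    """Group (device_id, timestamp_s, lat, lon) tuples into time-ordered tracks."""
    tracks = defaultdict(list)
    for device_id, timestamp_s, lat, lon in pings:
        tracks[device_id].append((timestamp_s, lat, lon))
    for points in tracks.values():
        points.sort()  # tuples sort by their first element, the timestamp
    return dict(tracks)


# Observations often arrive out of order and mixed across devices.
pings = [
    ("device-a", 1588838700, 51.5152, -0.1419),
    ("device-b", 1588838400, 51.5033, -0.1195),
    ("device-a", 1588838400, 51.5308, -0.1238),
]
tracks = build_trajectories(pings)
```

Each entry in `tracks` is then the path of one pseudonymous device, ready to be compared with other datasets.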

The most common dataset for this is a point of interest (POI) dataset. It contains points of interest that matter when comparing how audiences move and behave in the context of the real world.

For example, a series of latitudes and longitudes showing how Londoners move between 7-10am could be useful. Tying this to a dataset that included tube stations and key travel routes would allow you to do much more with the initial data.
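Tying a position to a POI dataset can be as simple as finding the nearest point of interest within some cut-off distance. The sketch below uses the haversine formula for distance; the station coordinates are approximate and the 150 m cut-off is an arbitrary assumption.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_poi(lat, lon, pois, max_distance_m=150.0):
    """Return the name of the closest POI, or None if none is within range."""
    name, poi_lat, poi_lon = min(pois, key=lambda p: haversine_m(lat, lon, p[1], p[2]))
    if haversine_m(lat, lon, poi_lat, poi_lon) <= max_distance_m:
        return name
    return None


# Approximate coordinates, for illustration only.
tube_stations = [
    ("King's Cross St Pancras", 51.5308, -0.1238),
    ("Oxford Circus", 51.5152, -0.1419),
]
at_station = nearest_poi(51.5310, -0.1240, tube_stations)
elsewhere = nearest_poi(51.5000, -0.2000, tube_stations)
```

The first position is a few tens of meters from King's Cross St Pancras and is matched to it; the second is more than a kilometer from either station and is left unmatched.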

Location data sources – where does location data come from?

So, we have already looked at the ingredients that combine to make location data, including the different types of location signals. However, what are the sources of location data? If you are looking to use location data in your organization, then you need to know the differences between every potential source.

The source can have a significant effect on the accuracy, scale and precision of the data. So, where does location data come from? There are four primary sources:

 

The bidstream

A sizeable proportion of location data comes from something called the bidstream (also referred to as the exchange). The bidstream is a part of the advertising ecosystem. Don’t worry if you’ve never heard of this – we’ll explain everything.

Explainer: The ad buying ecosystem

Before we talk about bidstream data, it’s helpful to understand how ads are bought and sold. There are three main routes:

  • Direct deals with publishers, such as an app, site, or network.
  • Ad networks, which group ad inventory to sell it to advertisers.
  • Ad exchanges, which let publishers offer up their inventory programmatically so that advertisers can buy it in real time. Purchasing advertising inventory in this way produces a bid request.

Why is this relevant for location data, I hear you ask? Every bid request passes on information, and this data contains several attributes used to determine whether to serve the ad on the device.

Source: The Development CAFE

Included in this dataset is a form of device location. A company will package up this location data, and the result is the bidstream location data that is available today.
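To give a feel for what "a form of device location" in a bid request might look like, here is a sketch that pulls the location out of a request. We are assuming a layout loosely modelled on the OpenRTB convention (a `device` object with an `ifa` advertising ID and a `geo` object with `lat`, `lon` and `type`); real exchanges vary, and the sample request is invented.

```python
import json

# Assumed meaning of geo "type" codes, following the OpenRTB convention.
GEO_TYPE_LABELS = {1: "GPS/location services", 2: "IP address", 3: "user provided"}


def extract_location(bid_request_json: str):
    """Return the device ID, position and location source from a bid request, if present."""
    request = json.loads(bid_request_json)
    device = request.get("device", {})
    geo = device.get("geo", {})
    if "lat" not in geo or "lon" not in geo:
        return None  # many requests carry no usable location at all
    return {
        "device_id": device.get("ifa"),
        "lat": geo["lat"],
        "lon": geo["lon"],
        "source": GEO_TYPE_LABELS.get(geo.get("type"), "unknown"),
    }


sample_request = json.dumps({
    "id": "example-request",
    "device": {
        "ifa": "00000000-0000-0000-0000-000000000000",
        "geo": {"lat": 51.5074, "lon": -0.1278, "type": 2},
    },
})
location = extract_location(sample_request)
```

The `source` field is worth keeping: a position derived from an IP address is far coarser than one from the device's location services, even though both arrive as a latitude and longitude.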

Bidstream location data is appealing because of the sheer amount of it: it can very quickly provide scale. Because it is captured programmatically, it is also immediately actionable. However, bidstream data comes with specific issues: it can be inaccurate, inconsistent, and even fraudulent.

“Up to 60% of ad requests contain some form of location data. Of these requests, less than a third are accurate within 50-100 meters of the stated location”

Telcos

Remember when we identified location signals in the last section? Cell tower location is one of these: the device is placed in a specific location by triangulating the strength of mobile cell tower signals.

This kind of location comes directly from a telecommunications company (telco). Usually, they have some demographic data associated with the location data.

As with bidstream data, the scale that telcos can offer is appealing: in many countries a few companies serve the entire population, so their reach is extensive.

However, as with bidstream data, this scale masks many issues with the accuracy of the data. Some studies have found that as little as 15% of data sampled was incorrect.

Location SDKs

A software development kit (SDK) is a toolkit that app publishers can add to their app to provide third-party functionality. Developers add location-based SDKs to their apps to access the most precise and accurate location data signals from the user’s device.

Location SDKs come in many shapes and forms – some make use of the core location functionality present in the OS, others do a degree of data processing on top, to boost accuracy.

Some SDKs only operate in the integrated app when the app is open. Others can run in the background to gain broader insights into the movement and behaviors of the device.

Location-based SDKs collect data with the user’s consent. The app’s native permissions often collect this consent, but some SDK providers offer consent tools to ensure that the location-based app is collecting data in accordance with relevant regulations.

The difference between SDK-generated data and other sources of data can be seen in the accuracy and precision of the datasets. Data collected by location SDKs is more accurate because the SDKs can listen for multiple location signals.

For example, SDKs can use the device’s built-in GPS to place the device and then, using Bluetooth signal strength from beacons, verify and fine-tune the location of the device down to within a meter of accuracy.

Location SDKs usually have a more sophisticated way of understanding how the device is behaving. For example, the Tamoco SDK uses motion behavior and other entry/exit events to know when a device visits a venue or location.
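We do not know how any particular SDK implements this, but the general idea of entry/exit events can be sketched as follows: a device "enters" when it comes within some radius of a venue, "exits" when it leaves, and the stay only counts as a visit if it lasts long enough. The radius, the minimum dwell time and the input layout are all our own assumptions.

```python
def detect_visits(samples, radius_m=50.0, min_dwell_s=300):
    """Find visits in time-ordered (timestamp_s, distance_to_venue_m) samples.

    Returns a list of (entered_at, last_seen_inside) pairs for stays that
    lasted at least min_dwell_s seconds.
    """
    visits = []
    entered_at = None
    last_inside = None
    for timestamp_s, distance_m in samples:
        if distance_m <= radius_m:
            if entered_at is None:
                entered_at = timestamp_s  # entry event
            last_inside = timestamp_s
        elif entered_at is not None:
            # exit event: keep the stay only if it was long enough
            if last_inside - entered_at >= min_dwell_s:
                visits.append((entered_at, last_inside))
            entered_at = None
    # handle a stay that is still ongoing when the samples end
    if entered_at is not None and last_inside - entered_at >= min_dwell_s:
        visits.append((entered_at, last_inside))
    return visits


# A six-minute stay, followed by a brief pass-by that should not count.
samples = [(0, 200), (60, 40), (240, 20), (420, 30), (480, 120), (600, 45), (660, 300)]
visits = detect_visits(samples)
```

The dwell-time threshold is what separates someone visiting a venue from someone simply walking past it.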

 

Why isn’t all data collected using SDKs?

If location SDKs are the most accurate and highly precise, then why don’t we use them to collect all location data?

The issue with many location SDKs is that they require integration into a publisher’s app. This app then needs to cover an adequate number of devices before the data is representative enough to gain any valuable insight or relevant patterns.

However, some SDKs have been built with functionality that benefits the publisher and limits battery usage to a minimal level. These SDKs are the ones that have achieved significant scale.

For example, the Tamoco SDK is optimised to send data in batches to minimise the number of requests. We also modify how data is collected depending on the current battery level.

All of these factors are a direct result of a close working relationship with our developer partners, and they allow the Tamoco SDK to scale along with our partners.
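The two battery-saving ideas mentioned here, sending data in batches and collecting less often when the battery is low, can be illustrated with a short sketch. This is not the Tamoco SDK's code: the class, the function, the batch size and the battery thresholds are hypothetical.

```python
class BatchingUploader:
    """Buffer observations and hand them to `send` one batch at a time."""

    def __init__(self, send, batch_size=50):
        self._send = send              # callable that receives a list of pings
        self._batch_size = batch_size
        self._buffer = []

    def add(self, ping):
        self._buffer.append(ping)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered, if anything."""
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()


def sampling_interval_s(battery_level: float) -> int:
    """Seconds between location samples, given battery level from 0.0 to 1.0."""
    if battery_level < 0.2:
        return 600  # low battery: sample rarely
    if battery_level < 0.5:
        return 120
    return 30       # plenty of battery: sample often


# Seven pings with a batch size of three produce three requests instead of seven.
sent_batches = []
uploader = BatchingUploader(send=sent_batches.append, batch_size=3)
for seq in range(7):
    uploader.add({"seq": seq})
uploader.flush()  # send the final, partial batch
```

Fewer, larger requests mean the device's radio wakes up less often, which is usually where the battery saving comes from.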

 

Publisher datasets

It’s possible to obtain location data directly from app publishers. Some publishers have developed methods of obtaining location by using the device’s built-in location services.

These will usually coincide with a location-based process within the app – such as looking up a nearby restaurant.

These are often not as accurate as the location SDKs that have been carefully built to collect verified location signals. However, they can be a good source of location data as long as you can validate and understand the process of data collection put in place by the publisher.

 

In our next blog we will talk specifically about how we can apply this in the development context, including a discussion of ethics and case studies. Watch this space!

 

DevCAFE – Geek Squad
