Location Accuracy: Who Can You Trust?
Thanks to the widespread availability of low-cost GPS technology, many mobile devices can precisely geo-locate themselves. When users of these devices opt in to share their location with applications and network services, they receive a better, location-aware user experience.
Location offers improved precision when targeting ads. Accurate location information, when combined with other data such as user interest profiles and time of day, can significantly improve the chances that ads are noticed by users and are relevant to them. For instance, store discount coupons may be shown only to users within walking distance of a particular store.
To enable location targeting, the location of the device (i.e., its latitude and longitude) is included in the ad request that the device sends to a server. An ad matching service uses the location to find ads that are appropriate to the user’s location. The ad server may also forward the request to other downstream ad servers, ad networks, or ad exchanges, which use the latitude and longitude in their own ad matching services to find relevant ads.
For location targeting to work correctly, the location of the device must be captured correctly and transmitted without modification all the way to every matching service. The ad service that receives a request directly from a handset is known as a first-party service. First-party services have a significant advantage in terms of the cleanliness and freshness of request data.
Privacy-compliant, reliable location data: the key for trust
At InMobi we run a large ad network that receives billions of first-party ad requests from hundreds of millions of devices every day. A large fraction of these requests come with latitude and longitude signals (which are sent only after the user has consented to the app collecting location information). A subset of these requests comes from devices that run our SDK. When our SDK runs as part of an application that has permission to geo-locate the user (see image above), it obtains the latitude and longitude from the device and sends them securely to our servers. As a result, we have a greater degree of assurance that the location signals are correct when a request is sent from our SDK. We may also receive location signals from the same device, through a different application, that reach our servers via other channels. This enables us to cross-verify different location signals. Our experience in dealing with location signals at scale has taught us that a lot of care is needed in interpreting these signals and identifying where users are.
In this article, we discuss some of the lessons we have learned, along with a few examples that illustrate the biggest challenges with location data, and offer insights into how important accuracy is for us at InMobi.
Detecting errors by comparing latitude/longitude country with IP country
A request that comes with a latitude/longitude (henceforth also referred to as lat/lng) value will also typically be associated with an IP address. Even in cases where the request is forwarded via multiple servers, the servers are expected to provide the IP address of the user’s device. If the country derived from the IP address does not match the country that contains the lat/lng, at least one of the two signals is probably wrong.
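As a rough illustration of this check (a sketch, not InMobi's actual implementation), the snippet below compares a lat/lng against a bounding box for the IP-derived country. The `COUNTRY_BOXES` table and the function name are hypothetical, and the boxes are deliberately coarse; a real system would presumably use proper country polygons and an IP-geolocation database.

```python
from typing import Optional

# Hypothetical, very coarse bounding boxes: (min_lat, max_lat, min_lng, max_lng).
# These are approximations for illustration only, not authoritative borders.
COUNTRY_BOXES = {
    "IN": (6.5, 35.7, 68.0, 97.5),
    "BR": (-34.0, 5.3, -74.0, -34.7),
    "US": (24.5, 49.4, -125.0, -66.9),
}


def latlng_matches_ip_country(lat: float, lng: float, ip_country: str) -> Optional[bool]:
    """Return True/False if the point is inside/outside the box of the
    IP-derived country, or None when no box is known for that country."""
    box = COUNTRY_BOXES.get(ip_country)
    if box is None:
        return None  # cannot decide; do not treat as a mismatch
    min_lat, max_lat, min_lng, max_lng = box
    return min_lat <= lat <= max_lat and min_lng <= lng <= max_lng
```

A `False` result only says that the two signals disagree; it does not say which one is wrong, so it is best treated as a trigger for further checks rather than a verdict.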
Detecting errors by comparing ad request volumes to population densities
Looking for anomalies in the geo-distribution of ad requests is one of the techniques we use to identify errors in location signals. The human population is not evenly distributed across the globe, and there are several low-density regions. Any time we see a large number of ad requests coming from a region of low user density, it is an anomaly that requires further investigation.
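One simple way to picture this comparison is to bucket requests into grid cells and flag cells whose request volume is out of proportion to the people living there. The sketch below assumes a hypothetical `population_by_cell` lookup and an arbitrary `max_requests_per_person` threshold; neither comes from the article.

```python
import math
from collections import Counter


def grid_cell(lat, lng, cell_deg=1.0):
    """Map a coordinate to the integer index of its grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lng / cell_deg))


def suspicious_cells(requests, population_by_cell,
                     max_requests_per_person=5.0, cell_deg=1.0):
    """requests: iterable of (lat, lng) pairs.
    population_by_cell: hypothetical dict mapping grid cell -> population.
    Returns the cells whose request count exceeds the allowed ratio."""
    counts = Counter(grid_cell(lat, lng, cell_deg) for lat, lng in requests)
    flagged = []
    for cell, n in counts.items():
        # Treat unknown or empty cells as having one inhabitant so that a
        # handful of requests from the open ocean is tolerated.
        population = max(population_by_cell.get(cell, 0), 1)
        if n > max_requests_per_person * population:
            flagged.append(cell)
    return sorted(flagged)
```

The threshold and cell size are tuning knobs; flagged cells are candidates for investigation, not proof of an error.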
One common anomaly in geo-distribution is a large number of requests reporting latitude = 0 and longitude = 0. The location (0,0) is in the middle of the ocean off the coast of Ghana. This anomaly occurs when the ad request carries the default system values for lat/lng because the precise user location is not available. Ideally, the lat/lng values should be omitted from the request when they are unavailable.
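On the receiving side, a defensive normalization step can treat such placeholder values as "no location". The following is a minimal sketch; the function name `clean_latlng` is an assumption made for illustration.

```python
from typing import Optional, Tuple


def clean_latlng(lat: Optional[float],
                 lng: Optional[float]) -> Optional[Tuple[float, float]]:
    """Return (lat, lng) if the pair looks usable, otherwise None."""
    if lat is None or lng is None:
        return None
    # Out-of-range values (and NaN, which fails every comparison) are rejected.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
        return None
    # (0, 0) is far more likely to be an unset default than a real user.
    if lat == 0.0 and lng == 0.0:
        return None
    return (lat, lng)
```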
Swapped or equal lat/lng values: typical errors
Another anomaly, shown in the pictures below, is a high volume of requests coming from the seas off the Scandinavian coast and from the South Atlantic Ocean. Upon further investigation, it became apparent that these ad requests had swapped the values of latitude and longitude, making requests from India appear to come from Scandinavia and requests from Brazil appear to come from the South Atlantic Ocean. These errors are typically observed in ad requests that come to us from other servers, possibly from applications that have a lat/lng swapping bug. We are able to confirm the existence of this bug when we receive other SDK-originated ad requests from the same devices containing correct lat/lng values. This kind of problem is tricky to detect when both the correct and the swapped locations fall in regions with reasonable population density, which adds complexity to location hygiene methodologies.
The picture below shows another interesting anomaly: a high density of ad requests originating along a line that cuts across the Mediterranean Sea. This issue seems to result from a bug where the application copies the same value into both the latitude and the longitude parameters. We have also seen such streaks along the equator and the prime meridian, where only one of the two values is populated while the other is left as zero.
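The confirmation step described above, checking a suspect location against a trusted one from the same device, could be sketched as follows. This is an illustrative assumption rather than InMobi's method: the function names and the 50 km tolerance are made up, and distances use the standard haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def looks_swapped(untrusted, trusted, near_km=50.0):
    """untrusted, trusted: (lat, lng) pairs for the same device.
    True when the untrusted point is far from the trusted one as sent,
    but lands close to it once latitude and longitude are exchanged."""
    lat, lng = untrusted
    if abs(lng) > 90.0:
        return False  # the swapped pair would not be a valid latitude
    as_sent = haversine_km(lat, lng, trusted[0], trusted[1])
    swapped = haversine_km(lng, lat, trusted[0], trusted[1])
    return as_sent > near_km and swapped <= near_km
```

For example, with a trusted sample near Delhi at `(28.61, 77.21)`, an untrusted sample of `(77.21, 28.61)` from the same device would be flagged as a likely swap.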
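These degenerate patterns are cheap to label on individual requests. The sketch below is a hypothetical helper (names and labels are assumptions); a single match proves nothing on its own, so the second function measures how often a given source produces such values.

```python
def degenerate_pattern(lat, lng):
    """Label suspicious coordinate shapes; return None if nothing stands out."""
    if lat == 0.0 and lng == 0.0:
        return "both_zero"
    if lat == lng:
        return "equal_values"      # same value copied into both fields
    if lat == 0.0:
        return "zero_latitude"     # streak along the equator
    if lng == 0.0:
        return "zero_longitude"    # streak along the prime meridian
    return None


def degenerate_share(points):
    """Fraction of (lat, lng) points from one source that look degenerate."""
    points = list(points)
    if not points:
        return 0.0
    hits = sum(1 for lat, lng in points if degenerate_pattern(lat, lng))
    return hits / len(points)
```

A source whose share is far above that of comparable sources is a reasonable candidate for the kind of bug described above.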
First-party location data from the SDK: manageable and reliable
Almost all of these observed bugs occur with ad requests that come to us via server-to-server integration, where our SDK is not involved. In these cases, when requests come from blinded apps, we are forced to discard all the lat/lng signals from those servers to avoid erroneously targeting location-specific ads. Wherever we are able to identify the application with the bugs described above, we try to help the developers fix the bug.
If we were to serve ads on requests that always set longitude to zero, we might end up showing a user in the western part of London ads meant for people on the eastern side, closer to longitude zero.
Detecting errors based on multiple requests from the same user
Observing the different lat/lng samples from the same user over time also helps in the identification of errors in the location signals. Here is an example of ad requests that we see from a certain device closely spaced in time but from two different applications. One request appears to come from the New York area and a short while later we see another request from the mountains of Kyrgyzstan.
It is unlikely that the user actually traveled that far!
Also, the IP country did not match Kyrgyzstan. The root cause seems to be an error in one of the apps, where the longitude value might have lost its negative sign.
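A check of this kind can be sketched by computing the speed implied by two samples from the same device and, when it is implausible, testing whether a lost sign on the longitude would explain the jump. The function names, the 1000 km/h limit, and the 50 km tolerance are illustrative assumptions, not values from the article.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def implied_speed_kmh(a, b):
    """a, b: (lat, lng, unix_seconds) samples from the same device."""
    distance = haversine_km(a[0], a[1], b[0], b[1])
    hours = abs(b[2] - a[2]) / 3600.0
    if hours == 0:
        return 0.0 if distance == 0 else math.inf
    return distance / hours


def is_implausible_jump(a, b, max_speed_kmh=1000.0):
    """True when the device would have had to move faster than the limit."""
    return implied_speed_kmh(a, b) > max_speed_kmh


def sign_flip_explains(a, b, near_km=50.0):
    """True when b is far from a as sent, but negating b's longitude
    brings it within near_km of a."""
    as_sent = haversine_km(a[0], a[1], b[0], b[1])
    flipped = haversine_km(a[0], a[1], b[0], -b[1])
    return as_sent > near_km and flipped <= near_km
```

With a sample in New York at roughly `(40.71, -74.01)` and a second one ten minutes later at `(40.71, 74.01)`, the implied speed is far beyond anything a traveller could manage, and the sign-flip test points to the likely cause.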
Detecting deliberate abuse
Advertisers are often willing to pay a premium for precisely location-targeted ads, and as a result, app developers who are able to send requests with precise lat/lng values can make more money. This creates an incentive for app developers to abuse the lat/lng signals and populate these fields with values that have nothing to do with where the user is actually located. While some of the techniques described above, especially looking at location points over time for the same user, or at high request density from sparse regions, can help identify such abuse, confirming it often requires a trustworthy location signal such as an SDK signal.
Not all users have apps that are integrated with the InMobi SDK. In such cases detecting the abuse is harder. We list a few methods of abuse that we have detected and eliminated.
- User lat/lngs are sometimes populated in the request through geo-coding of the corresponding IP addresses. Populating the lat/lng fields this way allows the supply source to command a higher premium by pretending to have access to precise user locations. Such abuse tends to place a large number of users at the centroid of the US (Potwin, Kansas), for instance, where these IP addresses are typically geo-coded. When we detect such abuse, we discard the lat/lng values from these requests.
- Some supply sources generate a random lat/lng value within a region where they think the user is located based on some other signal such as IP address. Detecting such abuse requires intense analytical efforts and clever algorithms.
An example of such abuse is depicted in the following snapshot, where the location signals appear to be evenly distributed across a wide region, with some of them spilling into the South China Sea.
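The first pattern in the list above, many users piled onto a single geo-coded default point, lends itself to a simple counting check. The sketch below is only an illustration under assumed thresholds (`min_devices`, `min_share`) and an assumed input shape; it does not address the randomized pattern, which needs heavier analysis.

```python
from collections import defaultdict


def hotspot_points(samples, decimals=4, min_devices=100, min_share=0.2):
    """samples: iterable of (device_id, lat, lng) from one supply source.
    Returns rounded points shared by an unusually large number of
    distinct devices, which real GPS fixes would rarely produce."""
    devices_by_point = defaultdict(set)
    all_devices = set()
    for device_id, lat, lng in samples:
        point = (round(lat, decimals), round(lng, decimals))
        devices_by_point[point].add(device_id)
        all_devices.add(device_id)
    total = len(all_devices)
    return sorted(
        point
        for point, devices in devices_by_point.items()
        if len(devices) >= min_devices and len(devices) / total >= min_share
    )
```

Counting distinct devices rather than raw requests keeps one chatty device at a fixed location from being mistaken for a hotspot.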
There are more types of abuse we have observed that are beyond the scope of this blog.
Best Practices Insights
Location hygiene is a complex task that requires a deep understanding of diverse, complex data sets and the use of sophisticated analytical algorithms for detecting and preventing abuse. Before subscribing to a vendor’s location solutions, it is important to understand (1) whether the solutions are based on first-party (SDK) data or on third-party (off-exchange) data, and (2) whether the vendor has methodologies in place to detect and guard against anomalies. Such anomalies, whether they arise from unintentional errors or from systematic abuse or fraud, can derail the promise of precisely geo-targeted services and geo-targeted advertising.
At InMobi, we take the user’s experience seriously, serving relevant content while assigning utmost importance to concerns such as propriety and privacy. Having a set of trusted signals helps us identify issues in the broader signal space. Our experience teaches us that unless service providers deploy safeguards against errors in geo signals, these errors can degrade the user experience.