Whether driven by data protection regulations or a general concern for user privacy, interest in safeguarding personal user information has been rising. Those efforts include curtailing the use of third-party cookies or otherwise redacting information about individuals that is shared with advertisers, data processors, and platforms.
Some organizations consider IPs a form of personal identification, although how true that is will depend on the type of connection – for instance, CGNAT means that some mobile carrier IPs are used by tens of thousands of users simultaneously and cannot be traced back to any single one of them. Nevertheless, some IPinfo users prefer to obfuscate individual IP addresses through the use of last digit IP anonymization as a form of data protection.
Since several organizations use this method as part of their privacy policies, the IPinfo team set out to research how zeroing out the last octet of IP addresses affects the accuracy of contextual IP address data.
Here’s what we discovered.
While the term IP anonymization more broadly can include techniques around hiding one’s own IP through the use of VPNs, proxies, and privacy relays, last-digit IP anonymization refers to obfuscation that happens “in the middle.” Let’s say you run an app with advertisers and you share information about a user with them: by obfuscating the last octet of the IP address (say, by sharing 205.22.12.0 instead of the full 205.22.12.121), you can preserve some level of anonymity while still keeping most of the context around the IP.
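As a minimal sketch of the obfuscation described above, the following Python snippet zeroes the last octet of an IPv4 address using the standard `ipaddress` module. The function name `anonymize_last_octet` is an illustrative choice, not part of any IPinfo or Google Analytics API.

```python
import ipaddress

def anonymize_last_octet(ip: str) -> str:
    """Zero the last octet of an IPv4 address, keeping the first 24 bits."""
    addr = ipaddress.IPv4Address(ip)
    # 0xFFFFFF00 is the integer form of the mask 255.255.255.0.
    masked = int(addr) & 0xFFFFFF00
    return str(ipaddress.IPv4Address(masked))

print(anonymize_last_octet("205.22.12.121"))  # 205.22.12.0
```

Every address from 205.22.12.0 to 205.22.12.255 maps to the same output, which is why the original address cannot be recovered from the anonymized one.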
Last-digit IP anonymization is a feature of Google Analytics, and can also be called IP obfuscation. Learn more about IP address information.
While obfuscating the last octet is a pretty straightforward method to protect individuals’ data, it also raises a few questions:
Thankfully, these are exactly the kinds of questions IPinfo’s wealth of IP data is well-prepared to answer! For most users, IP addresses are only as useful as the contextual information that can be pulled from them, such as geolocation or associated company, as we’ll address below.
If there is no change in those results for IPs with the same first three octets, data accuracy is not heavily affected by the anonymization process for APIs or database downloads.
IP addresses are routed by ASNs in blocks or prefixes. The smallest prefix size that can be announced by an ASN with BGP on the internet in IPv4 consists of 256 IPs, or a /24. A /24 is named that way because its mask length is 24 bits: all IPs from 205.22.12.0 to 205.22.12.255 are part of the same block, 205.22.12.0/24.
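To make the /24 arithmetic concrete, here is a short sketch using Python's standard `ipaddress` module to inspect the block from the example above. Nothing here is specific to IPinfo; it only illustrates the prefix notation.

```python
import ipaddress

block = ipaddress.ip_network("205.22.12.0/24")

print(block.prefixlen)      # 24 (mask length in bits)
print(block.num_addresses)  # 256 (2 ** (32 - 24))
print(block[0], block[-1])  # 205.22.12.0 205.22.12.255

# Any address that shares the first three octets belongs to the block.
print(ipaddress.ip_address("205.22.12.121") in block)  # True
```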
This means that IPs are not randomly, uniformly distributed; neighboring IPs tend to be operated by the same organizations and located in similar locations.
The result of our analysis is that our given geolocation for an IP is the same for the entire block of IPs 87.5% of the time, meaning that last-digit anonymization could feasibly alter the geolocation 12.5% of the time. However, this geolocation alteration tends to be small: changes in country are much rarer, occurring only in 0.3% of addresses.
However, when there is a geolocation distinction within the /24, even without a change in country, the distance can be significant. A typical range is between 20 and 700km, with a median of 232km. This can be a large enough distance to considerably affect use cases where location precision is important.
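One way to picture this kind of comparison is to group addresses by their /24 and measure how far apart the locations inside each block are. The sketch below is only an illustration under stated assumptions: the sample rows and coordinates are hypothetical (they are not IPinfo data), the helper names are made up for this example, and the distance uses the haversine formula on a spherical Earth with a radius of 6371 km. It is not a description of how the analysis above was actually performed.

```python
import ipaddress
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical sample rows: (ip, latitude, longitude). Not real IPinfo data.
ROWS = [
    ("198.51.100.10", 52.52, 13.405),
    ("198.51.100.200", 52.52, 13.405),
    ("203.0.113.5", 48.8566, 2.3522),
    ("203.0.113.77", 45.764, 4.8357),
]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def block_of(ip):
    """Return the /24 network that contains the given IPv4 address."""
    return ipaddress.ip_network(f"{ip}/24", strict=False)

def max_spread_by_block(rows):
    """Map each /24 to the largest distance between any two of its locations."""
    points_by_block = defaultdict(set)
    for ip, lat, lon in rows:
        points_by_block[block_of(ip)].add((lat, lon))
    return {
        str(block): max(
            (haversine_km(*a, *b) for a, b in combinations(points, 2)),
            default=0.0,  # a single location means no spread
        )
        for block, points in points_by_block.items()
    }

for block, km in max_spread_by_block(ROWS).items():
    print(block, round(km), "km")
```

A block whose spread is zero is unaffected by last-digit anonymization for geolocation purposes; a block with a non-zero spread is one where the anonymized address could resolve to a different place than the original.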
The takeaway: it’s not rare that obfuscating the last octet of an IP address will result in a different geolocation, but it won’t be the case most of the time. Moreover, the observed change will rarely be radical, and the whole block is very likely to be in the same country. However, for use cases where location precision matters, the distance, when it exists, may be large enough to cause problems.
Learn more about IP address location accuracy.
While the smallest prefixes announced in the public internet are /24s, as mentioned above, the reality is different within an ASN’s network, where they can be broken down further, with ranges attributed to different owners and businesses operating within that ASN. Through RWhois, this can get pretty granular, and even individual IP addresses can be assigned to different owners and companies.
That said, only 3.2% of ranges show any distinction between the attribution of any two given IPs within a /24, meaning that it is unlikely for last-digit anonymization to significantly impact data accuracy when it comes to company data.
When it comes to our mobile carrier data, last-digit anonymization has essentially no impact on whether an IP is reported as part of a mobile network or on its carrier name, country code, and network code. This is because mobile carriers generally do not allocate the IPs their users connect through in networks smaller than a /24, so where in the block an IP falls makes no difference.
There are two main ways of detecting VPN usage on a given IP: either we run the VPN service in question to find out which IPs it is using, or we run service scans of the entire internet to determine which IPs respond to handshakes from common VPN protocols. Both of these methods identify a lot of individual IPs, and so it is common for one or more IPs to be flagged within a /24, while others are not.
In this case, last-digit anonymization has a significant impact. Roughly 20% of the VPN flags in our privacy dataset show a potential difference between a given IP and its anonymized version.
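The effect is easy to see in a toy example. The sketch below assumes a hypothetical set of individually flagged addresses (`VPN_IPS`, drawn from a documentation-only range) and shows that a flagged address loses its flag once its last octet is zeroed. The names and data are illustrative and do not reflect the structure of IPinfo's privacy dataset.

```python
import ipaddress

# Hypothetical individually flagged addresses; illustrative only.
VPN_IPS = {"192.0.2.14", "192.0.2.201"}

def anonymize(ip):
    """Return the address with its last octet zeroed."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False).network_address)

def is_vpn(ip):
    """Look up an exact address in the hypothetical flag set."""
    return ip in VPN_IPS

for ip in ("192.0.2.14", "192.0.2.15"):
    # Columns: original IP, flag for the full IP, flag for the anonymized IP
    print(ip, is_vpn(ip), is_vpn(anonymize(ip)))
```

Here 192.0.2.14 is flagged when looked up in full, but its anonymized form 192.0.2.0 is not, so the detection is lost.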
Find out more about our privacy detection data.
Obfuscating IPs before sending them out to third-party processors like IPinfo does sometimes affect accuracy and reliability for certain use cases. However, these downsides can be avoided by simply using a downloadable database of IP data.
For instance, if users download IPinfo’s datasets (rather than using our API), they can safely use the full IP address from a user entirely within their own closed system, making the anonymization entirely unnecessary. And since IPinfo’s insights mostly exist around ranges of IP addresses, companies can avoid recording any specific IPs while gathering enough contextual information to ensure reliable use cases.
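As a rough sketch of what a local, range-based lookup might look like, the example below resolves a full IP address against an in-memory table and keeps only the resulting label, never the address itself. The table layout (`start`, `end`, `label`), the sample ranges, and the `lookup` helper are assumptions made for illustration; they are not the actual schema or tooling of IPinfo's database downloads.

```python
import bisect
import ipaddress

# Hypothetical rows, sorted by start address and non-overlapping.
RANGES = [
    ("192.0.2.0", "192.0.2.127", "range-A"),
    ("192.0.2.128", "192.0.2.255", "range-B"),
    ("198.51.100.0", "198.51.100.255", "range-C"),
]
STARTS = [int(ipaddress.ip_address(start)) for start, _, _ in RANGES]
ENDS = [int(ipaddress.ip_address(end)) for _, end, _ in RANGES]

def lookup(ip):
    """Return the label of the range containing ip, or None if there is none."""
    n = int(ipaddress.ip_address(ip))
    # Find the last range whose start is <= n, then check its end.
    i = bisect.bisect_right(STARTS, n) - 1
    if i >= 0 and n <= ENDS[i]:
        return RANGES[i][2]
    return None

# The full IP is used only in memory; only the label would be stored.
print(lookup("192.0.2.200"))  # range-B
print(lookup("203.0.113.1"))  # None
```

Because the lookup runs entirely inside the caller's environment, the full address never has to be truncated or sent to a third party.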
Another benefit of the downloadable database is that users don’t need to submit any IP information to a third-party data provider, such as IPinfo. Plus, the database is updated every 24 hours and is monitored by our data experts and proprietary algorithms.
Our data is the same in the API and the databases. However, a privacy-conscious customer might be uneasy about sharing a full IP over the API and opt to anonymize the last digit instead. They would not have that issue with the downloadable database because they would have full control over the environment and no data would have to leave their organization.
In other words, by choosing the downloadable database, customers won’t experience any change in accuracy and can avoid exposing IPs as personal identifiers.
Zeroing out the last octet of IPs can still work for some use cases, such as those that only need broad geographic information or can tolerate an occasional loss of precision. For others, such as VPN detection, doing so could mean missing out on a lot of detail.
Users who need a high degree of data accuracy for use cases where reliability is key (think online targeting or website security) can use the downloadable database to get full precision without needing to expose user data or experience a coarsening of the data.
Zeroing out the last octet of an IP does affect accuracy. That being said, anonymized IPs can be used to target broad geographic areas such as countries.
City-level pinpointing, however, won’t be as reliable. This limits use cases such as geotargeting or content restrictions. In other words, as we’ve helped customers build out use cases, we’ve noticed that most users require a high level of accuracy that last digit anonymization hinders.
For highly pinpointed use cases, users will maintain the best accuracy by using IPinfo’s Database Download. That being said, if you have questions, our data team can offer the best recommendations based on your specific use of IP data.
IP anonymization shouldn’t come at the cost of accuracy. Reliable IP data is crucial for security, compliance, and personalization. Using a trusted IP data provider ensures access to accurate, up-to-date information while respecting users who want to prioritize another level of privacy.
IPinfo offers high-quality, accurate data, global coverage, and the richest context in both API and downloadable database formats, helping businesses make informed decisions. By choosing IPinfo, you get precise IP intelligence without compromise.
Daniel Quandt leads the solutions engineering team at IPinfo, where he helps customers get the most out of internet data. Before IPinfo, he worked in data science in the hospitality industry.