IPinfo founder and CEO Ben Dowling sat down recently with the Geomob podcast, where he and host Ed Freyfogle discussed the many uses of IP geolocation and other IP address data, and how Ben was able to grow what started as a weekend project into a leading global provider of IP address datasets. The transcript is below, edited for readability. Listen to the full episode here.
Ed Freyfogle: Welcome to the Geomob podcast, where we discuss geo innovation in any and all forms. I’m very excited about today's episode because we have a Geomobster from way back in the day. He last spoke at Geomob 10 years ago in 2011. Then he left Europe for the U.S., where he started one of the world's leading IP geolocation companies, IPinfo. My guest today is Ben Dowling. Welcome back to Geomob. Really looking forward to hearing your story.
Ben Dowling: Really excited to be here. I’m the founder and CEO of IPinfo. We're an IP geolocation company and we primarily sell our services via an API. We have IP geolocation and other datasets, and we are one of the leading IP geolocation API providers. We do about 40 billion requests a month.
Ed Freyfogle: Congratulations. This is a great success story of a geo-based company, and Ben is a personal idol of mine as someone who runs a smaller API-based geo company. Let's start from the basics. What is IP geolocation? How does it work?
IP Geolocation: The Basics
Ben Dowling: IP geolocation is taking an IP address and working out what geographic region it’s in. The most obvious reason you might want this is, if you’re a website owner, you may want to customize the content to show a visitor something specific or track that location for marketing purposes. An IP address on its own doesn't have any context. It's just a number. With IP geolocation, you can passively get a geolocation for it. So the website owner may say, we've got a visitor from Barcelona, we've got one from Seattle, and then there'll be things that they want to do based on having that information.
Ed Freyfogle: How do you have that information? Where does the data come from?
Ben Dowling: The user sends an IP address to our API and we'll say, this is Barcelona, or this is Seattle. Behind the scenes, we do a lot of work to map that IP address to a location. We've got a bunch of different data sources, plus data processing, clean-up, and algorithms on top of that, and that gives us our output. We rebuild that database daily. Some of the data sources include hosting data, rDNS data, traceroute data, WHOIS data, and routing table data. We have a big probe network. We send out IP data and traceroute data and do some processing to say where we think each IP range or individual IP is. An ISP's range might look like it's in Barcelona, Seattle, etc. The output of that whole data pipeline is our geolocation data set.
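Editor's sketch: the request-and-response flow Ben describes, an IP address in and a city out, might look roughly like this from the caller's side. The endpoint shape and the field names (`ip`, `city`, `region`, `country`) are assumptions for illustration, and the helper names are hypothetical; check the provider's API documentation before relying on any of it.

```python
import ipaddress
import json
import urllib.request

# Assumed endpoint shape; verify against the provider's API documentation.
BASE_URL = "https://ipinfo.io"


def build_lookup_url(ip: str) -> str:
    """Validate the address and build the lookup URL for it."""
    address = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return f"{BASE_URL}/{address}/json"


def parse_location(payload: str) -> dict:
    """Pull the (assumed) location fields out of a JSON response body."""
    data = json.loads(payload)
    return {key: data.get(key) for key in ("ip", "city", "region", "country")}


def lookup(ip: str, timeout: float = 5.0) -> dict:
    """Fetch and parse the location for one IP address (needs network access)."""
    with urllib.request.urlopen(build_lookup_url(ip), timeout=timeout) as response:
        return parse_location(response.read().decode("utf-8"))
```

Splitting the URL building and the parsing from the network call keeps the two pure steps easy to check without sending any traffic.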
Ed Freyfogle: So some of it is publicly available, and some of it you're creating yourself and enhancing, if I understand?
Ben Dowling: A lot of the input source data is publicly available. There are a bunch of pieces that aren't, that we add to it. And then there is a set of algorithms and data processing on top of that, resulting in the final data set.
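Editor's sketch: a finished data set that maps IP ranges to cities can be queried with a binary search over the sorted range starts. This is a generic technique, not a description of IPinfo's internals. The rows below are hypothetical documentation-range addresses, and the sketch assumes the ranges are IPv4 only and do not overlap.

```python
import bisect
import ipaddress

# Hypothetical rows: (network in CIDR notation, city). Not real geolocation data.
ROWS = [
    ("192.0.2.0/25", "Barcelona"),
    ("192.0.2.128/25", "Seattle"),
    ("198.51.100.0/24", "Miami"),
]


def build_index(rows):
    """Sort ranges by first address so they can be binary-searched."""
    ranges = []
    for cidr, city in rows:
        network = ipaddress.ip_network(cidr)
        ranges.append(
            (int(network.network_address), int(network.broadcast_address), city)
        )
    ranges.sort()
    starts = [start for start, _, _ in ranges]
    return starts, ranges


def find_city(index, ip):
    """Return the city whose range contains ip, or None if no range matches."""
    starts, ranges = index
    value = int(ipaddress.ip_address(ip))
    # The candidate is the last range that starts at or before the address.
    position = bisect.bisect_right(starts, value) - 1
    if position < 0:
        return None
    start, end, city = ranges[position]
    return city if start <= value <= end else None
```

For example, `find_city(build_index(ROWS), "192.0.2.200")` falls in the second range, and an address outside every range yields `None`.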
How Accurate is IP Geolocation Data?
Ed Freyfogle: How precise is it? How well can you know exactly, right where I'm sitting?
Ben Dowling: It varies dramatically based on tons of factors. Those could be, how much does the IP address itself move? I'm sitting in my office now in Seattle. My office has probably had this IP address for months and will have it for many months in the future. Theoretically we could say, this IP address is definitely on this very specific street in Seattle. It never moves. Then there'll be other IP addresses on a mobile phone. Lots of mobile carriers might pool IP addresses across devices. It may be that the same IP address gets used on different devices throughout the whole country.
There are a lot of complexities that go into it. Different countries may route their traffic in different ways, and different factors determine how stable an IP address is. We guarantee a city-level result in our datasets. Even if, as in the case of my office IP, we may be able to tell the actual address in Seattle, we never expose that level of detail. We'll say, this IP address is in Seattle. We don't go more granular than that.
For the cases where it's broader than the city level, we'll still give you city-level details. So for my office example, Seattle is accurate. But sometimes, for an IP address that is pooled across devices, we still return a city-level result, but it will be based on other factors. If we have some raw data that shows us where an IP has actually traveled, we may say, we've seen this IP address in a bunch of different cities all over California, but most of the data shows that it's usually in LA, so we'll give it an LA result. We overlay population-level data and say, based on population density, it's most likely to be in LA. We always return a city-level result. Not every specific IP can be located to that level, but we abstract it so our customers don't have to worry about it. They give us an IP and we give them a city.
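Editor's sketch: one simple way to reduce a pooled IP address to a single city, in the spirit of the California example above, is to take the most frequently observed city and use population to break ties. This is an illustrative assumption, not IPinfo's actual algorithm, and the population figures are rough placeholders.

```python
from collections import Counter

# Hypothetical, rounded population figures used only to break ties in this sketch.
POPULATION = {"Los Angeles": 3_900_000, "San Diego": 1_400_000, "Fresno": 540_000}


def pick_city(observations, population=POPULATION):
    """Pick the most frequently observed city; break ties by larger population."""
    if not observations:
        return None
    counts = Counter(observations)
    # Compare by observation count first, then by population for equal counts.
    return max(counts, key=lambda city: (counts[city], population.get(city, 0)))
```

With observations of Los Angeles twice and San Diego and Fresno once each, the majority decides; with one sighting each of San Diego and Fresno, the larger city wins the tie.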
Ed Freyfogle: That’s similar to some of the issues we have with geocoding, the way we give what we call a confidence score to say, we know it's in this box, but if the box is really big, we don't know exactly which precise location.
Ben Dowling: We have a bunch of those confidence scores internally. We don't currently expose them in our API. We have all this data internally that tells us how we got to this location and how confident we are. I think that will probably be available at some point in the future.
Do IP Addresses Move or Change Over Time?
Ed Freyfogle: You said you've been at your office for a long time and had the same IP address. Imagine your connection is no longer good and you switch ISPs. Presumably they assign your IP address to a new customer. How do you find out about that?
Ben Dowling: There are a bunch of different ways things can change. This goes back to the question of how precise we can be and the challenges we have internally with measuring our accuracy. Maybe I just reboot my router in the office and I get a new IP address. The old one could go somewhere else, but depending on my ISP, they may just give it to someone else down the street, or to someone else in Seattle. So even though there's been a change in who's connected through that IP, or a change in business, there may not be a change in location. They could say, this is our pool of Seattle IP addresses. Or it could go back into the pool for the whole of the U.S. and end up anywhere. It could change when my ISP sells a bunch of IP space. As there is a shortage of IPv4 addresses, we see more and more trading of addresses and selling of blocks. We’re even seeing extreme cases where an ISP in some European country reassigns the IP address as part of a block that AWS has bought.
We’ll see an IP address that was in Germany or France suddenly be hosted on the East Coast of the U.S. These changes happen frequently. Our datasets change 5-10% a month. The changes will usually be reflected in one or more of the various data sources we have. We rebuild most of our datasets daily so the changes should automatically be picked up. We ship fresh data to our API daily. If an IP changed, we should capture it pretty quickly. If an IP address changed from Seattle to Miami, within a few days, we would show Miami. Even with the 5-10% change per month, there isn't much IP space that changes really frequently. We don’t see an IP address popping up in Seattle today, in Miami tomorrow, and in Germany the day after that. Things are relatively static.
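Editor's sketch: a month-over-month change figure like the one Ben quotes could be measured by comparing two snapshots of a range-to-city mapping. How IPinfo actually computes its figure is not stated in the interview; the snapshot contents and the function name here are hypothetical.

```python
# Hypothetical snapshots: range (CIDR) -> city. Not real geolocation data.
LAST_MONTH = {
    "192.0.2.0/25": "Seattle",
    "192.0.2.128/25": "Berlin",
    "198.51.100.0/24": "Barcelona",
    "203.0.113.0/24": "Miami",
}
THIS_MONTH = {
    "192.0.2.0/25": "Seattle",
    "192.0.2.128/25": "Ashburn",  # block moved to a new location
    "198.51.100.0/24": "Barcelona",
    "203.0.113.0/24": "Miami",
}


def changed_fraction(old, new):
    """Fraction of ranges in old whose city differs, or is missing, in new."""
    if not old:
        return 0.0
    changed = sum(1 for key, city in old.items() if new.get(key) != city)
    return changed / len(old)
```

Here one of the four ranges moved, so `changed_fraction(LAST_MONTH, THIS_MONTH)` is 0.25. A real comparison would also need to handle ranges that are split or merged between snapshots, which this sketch ignores.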
Stay tuned for part 2 of this interview, where Ben and Ed discuss how Ben grew IPinfo from a one-man operation into a global team -- and what’s next.