
Every IP address tells a story, but the story is often cryptic. The most reliable way to locate an IP is triangulation based on latency measurements, but it has limitations: RTT measurements are affected by cable paths, network congestion, rate-limiting, and packet drops. So we often look for alternative signals that give clues as to where an IP address actually resides. At IPinfo, we are dedicated to maximizing the utility of every piece of Internet metadata available. Our goal is to synthesize these diverse signals to achieve unparalleled coverage and accuracy in our IP geolocation data.
For decades, researchers have recognized the value of Reverse DNS (rDNS) names, also known as DNS PTR records. They offer valuable clues about where IPs are located, embedded in hostnames like brsp41.gru-lo100.antel.net.uy (gru is the IATA airport code for São Paulo) or ae-0.fastly.brslbe03.be.bb.gin.ntt.net (brslbe is a modified CLLI "silly" code for Brussels). These clues, however, are inconsistent, multilingual, and often ambiguous, differing from operator to operator.
At IPinfo, we collect rDNS data at massive scale, covering over 50,000 ASes across IPv4 and IPv6. However, using this data for geolocation is a hard problem. First, the structure of the information within these records is highly variable: hints can be encoded in different places within the hostname, and even when a hint can be extracted, the same hint string can have multiple meanings depending on who is using it.
Traditional systems can only go so far — they depend on manually defined patterns and universal hint maps, which can’t keep up with the global diversity of operator naming schemes. The same hint could mean different things based on the operator. For example, bva could signify either Beauvais, France or Boa Vista, Brazil; similarly, chc might refer to Christchurch, New Zealand or Chicago, Illinois. Furthermore, many city names are heavily overloaded, such as Toledo, which could refer to locations in Illinois, Iowa, Ohio, Oregon, Washington, Belize, Brazil, Canada, Portugal, the Philippines, Spain, or Uruguay.
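To make the scoping problem concrete, here is a minimal Python sketch of the difference between a single universal hint map and one keyed per operator. The ASNs and mappings are hypothetical, chosen only to illustrate the collisions described above.

```python
# A minimal sketch of the scoping problem, assuming hypothetical ASNs and
# mappings: a single global hint map forces one meaning per token, while a
# map keyed by (asn, hint) lets each operator keep its own "dialect".
GLOBAL_HINTS = {
    "bva": "Beauvais, FR",      # but another network means Boa Vista, BR
    "chc": "Christchurch, NZ",  # but another network means Chicago, US
}

PER_AS_HINTS = {
    (64500, "bva"): "Beauvais, FR",
    (64501, "bva"): "Boa Vista, BR",
    (64500, "chc"): "Christchurch, NZ",
    (64502, "chc"): "Chicago, US",
}

def resolve(asn: int, hint: str) -> str | None:
    """Prefer the operator-specific meaning, falling back to the global map."""
    return PER_AS_HINTS.get((asn, hint), GLOBAL_HINTS.get(hint))

print(resolve(64501, "bva"))  # Boa Vista, BR
print(resolve(64502, "chc"))  # Chicago, US
```

The hard part, of course, is filling in the per-operator entries at Internet scale, which is exactly what the pipeline described below automates.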
That’s where Aleph comes in: a system that uses Large Language Models (LLMs) to decode, classify, and map these hostnames into real-world locations. During my 2025 internship, I worked on extending Aleph’s capabilities, integrating it with IPinfo’s production pipeline, and evaluating how LLM-driven inference can improve IP geolocation accuracy, scalability, and cost efficiency.

The pipeline is described at a high level in the figure above. From an Autonomous System (AS), we pull all of its PTR records for a given IP version. From these, we take a weighted sample and use an LLM to group the hostnames by semantic similarity. From these groups, the model generates regular expressions that capture the parts of the hostname relevant to geographic disambiguation. We then apply those regular expressions to the entire space of hostnames for the AS, producing a list of candidate strings that may carry geographic meaning. In the final step, we use the model to disambiguate these candidate strings into structured location information, optionally guiding the LLM with RTT-derived hints via Retrieval-Augmented Generation (RAG).
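The LLM stages are prompt-driven, but the surrounding plumbing is ordinary code. Below is a rough Python sketch of the per-AS flow under simplifying assumptions: weighted_sample and extract_candidates are hypothetical helper names, and the regexes in the demo are hand-written stand-ins for the patterns the model would generate.

```python
# A rough sketch of the per-AS flow, not IPinfo's actual implementation.
import random
import re
from collections import Counter

def weighted_sample(hostnames: list[str], k: int = 200) -> list[str]:
    """Weight rarer naming skeletons more heavily so that small naming
    schemes inside the AS still show up in the sample sent to the LLM."""
    skeletons = [re.sub(r"\d+", "#", h) for h in hostnames]
    counts = Counter(skeletons)
    weights = [1.0 / counts[s] for s in skeletons]
    return random.choices(hostnames, weights=weights, k=min(k, len(hostnames)))

def extract_candidates(hostnames: list[str], patterns: list[str]) -> set[str]:
    """Apply the (LLM-generated) regexes to every PTR in the AS and collect
    the captured substrings that may carry geographic meaning."""
    regexes = [re.compile(p) for p in patterns]
    candidates = set()
    for host in hostnames:
        for rx in regexes:
            m = rx.search(host)
            if m and m.groups():
                candidates.add(m.group(1))
    return candidates

ptrs = ["brsp41.gru-lo100.antel.net.uy",
        "ae-0.fastly.brslbe03.be.bb.gin.ntt.net"]
print(extract_candidates(ptrs, [r"\.([a-z]{3})-lo\d+\.", r"\.([a-z]{6})\d+\.be\."]))
# -> {'gru', 'brslbe'}; these strings then go to the disambiguation stage.
```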
Aleph began as an experiment on 2,646 ASes, enough to geolocate around 90% of IPv4 PTR records. To reach Internet-wide coverage, we scaled to more than 14,000 ASes, which together account for 99.99% of all IPv4 PTRs. Among these, 4,738 ASes also publish IPv6 PTR data, letting us apply the same three-stage LLM pipeline (classification → regex generation → hint mapping) to both protocols.
To validate results, we compared Aleph's inferred locations against RTT polygons built from measurements taken by ProbeNet, IPinfo's internet measurement platform. These polygons serve as empirical ground-truth regions derived from latency and are available for roughly 20% of the analyzed IPs. The results show strong agreement:
In short, Aleph generalizes well across protocols and measurement types, demonstrating that rDNS-based location inference — when decoded through learned naming patterns — can be both scalable and highly reliable at Internet scale.
In total, we found 1.3 million unique hint strings across both protocols, of which about 800,000 were shared between IPv4 and IPv6. Interestingly, when comparing shared hints across IPv4 and IPv6, only 0.35% had conflicting meanings. This suggests IPv4-derived hint maps are robust enough to generalize to IPv6, a useful finding for multi-protocol geolocation.
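Checking for cross-protocol conflicts is straightforward once each protocol has its own hint map. A minimal sketch, using tiny illustrative maps rather than the real 1.3 million-hint data:

```python
# Assumes each protocol has a hint map of the form {hint: resolved_location}.
def compare_hint_maps(v4: dict[str, str], v6: dict[str, str]) -> tuple[int, float]:
    """Return (number of shared hints, fraction of shared hints that disagree)."""
    shared = v4.keys() & v6.keys()
    conflicts = sum(1 for h in shared if v4[h] != v6[h])
    return len(shared), (conflicts / len(shared)) if shared else 0.0

v4_hints = {"gru": "Sao Paulo, BR", "lon": "London, GB", "chc": "Christchurch, NZ"}
v6_hints = {"gru": "Sao Paulo, BR", "lon": "London, GB", "mad": "Madrid, ES"}
print(compare_hint_maps(v4_hints, v6_hints))  # (2, 0.0)
```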
As we expanded Aleph’s coverage from 2,646 to over 14,000 ASes, we began noticing patterns that don’t appear in any database: custom geo-hints. These are locally invented codes, abbreviations, and multilingual variations that operators use to describe geography in their own internal “dialect” of the Internet.
Some networks use short codes like longb for London or bairesargar for Buenos Aires. Others embed street addresses directly into hostnames (350ecermak, 111eighthave), or reference neighborhoods (mountsouris, tottenhamhub). Many even mix languages: we’ve seen London encoded as londen (Dutch), londra (Italian), or londres (Spanish).
Across all providers, this diversity is striking — more than 15,000 unique encodings for Moscow, 14,000 for London, and 7,000 for New York. Aleph’s LLM pipeline learns these patterns and generalizes them, recognizing that taksi-moksva-24 and moscowwitt both refer to the same city.
The result is a system that understands the Internet's geographic shorthand across languages, scripts, and naming conventions, and can translate these hints into consistent, structured location data. Taken together, these findings show that Aleph's learned mappings are robust enough to decode how operators around the world quite literally name their networks.
While LLMs can infer locations from hostnames, ambiguity often remains, especially for overloaded city names and for short codes like mtk, which could refer to Mitaka, Matsukawa, or Motoki, as shown in the example below.

To resolve these ambiguities, we integrated RTT-based geographic polygons generated by ProbeNet, IPinfo’s internet measurement platform, creating a RAG pipeline that constrains Aleph’s reasoning using latency-derived geographic boundaries. This allows the model to prefer interpretations that are not only linguistically plausible, but also physically consistent with measured network delay.
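Conceptually, the retrieval step reduces to a point-in-polygon filter before the model is asked to choose. Here is a minimal sketch using shapely, assuming we already have a ProbeNet RTT polygon for the IP and coordinates for each candidate reading; the polygon and coordinates below are illustrative only.

```python
from shapely.geometry import Point, Polygon

# Hypothetical latency-derived polygon roughly covering the Tokyo area (lon, lat).
rtt_polygon = Polygon([(138.9, 35.2), (140.3, 35.2), (140.3, 36.1), (138.9, 36.1)])

# Candidate readings of the ambiguous hint "mtk" with approximate coordinates.
candidates = {
    "Mitaka, JP":    (139.56, 35.68),
    "Matsukawa, JP": (137.91, 35.60),
    "Motoki, JP":    (139.82, 35.77),
}

# Keep only the candidates that are physically consistent with the measured
# delay; the survivors are handed to the LLM as retrieval context.
consistent = {name: (lon, lat) for name, (lon, lat) in candidates.items()
              if rtt_polygon.contains(Point(lon, lat))}
print(consistent)  # Matsukawa falls outside the polygon and is ruled out.
```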

The results were striking. Using hostname hints alone, Aleph geolocated over 52 million IPv4 PTRs with 90.8% accuracy. When augmented with RTT polygons, coverage expanded to nearly 68 million PTRs, a 30% increase, and accuracy rose to 93.9%. Across all ASes, 95% saw accuracy improvements or stability, with the largest gains in cases involving ambiguous abbreviations or multilingual hints. In practice, RTT guidance helps Aleph make more grounded decisions, such as distinguishing between Paris, Texas and Paris, France when both seem plausible from the text alone.
A key question for any geolocation system is how often the data needs to be refreshed. To find out, we compared Aleph’s artifacts (its learned regexes and hint maps) across four years of data, from 2021 to 2025.
We looked at three kinds of evolution:
The results show a perhaps surprisingly stable Internet. Most networks preserve over 95% of their PTR records from one year to the next, forming a stable core that rarely changes. A smaller fraction — roughly 5–10% of ASes — exhibit very high turnover, often triggered by mergers, re-branding, or large-scale network restructuring.
At the same time, new hints appear every year, reflecting the Internet’s ongoing growth: new datacenters, new prefixes, new naming conventions. To manage this, we built a staleness-detection framework that automatically flags ASes whose regexes or hint maps degrade over time. This lets us target retraining only where patterns drift, instead of recomputing the entire model — keeping Aleph both current and efficient.
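One simple way to detect drift, sketched below under the assumption that staleness shows up as falling regex coverage: re-apply last year's learned regexes to this year's PTR records and flag the AS when coverage drops noticeably. The function names and threshold are illustrative, not the production framework.

```python
import re

def match_coverage(patterns: list[str], hostnames: list[str]) -> float:
    """Fraction of PTR hostnames matched by at least one learned regex."""
    if not hostnames:
        return 0.0
    regexes = [re.compile(p) for p in patterns]
    matched = sum(1 for h in hostnames if any(rx.search(h) for rx in regexes))
    return matched / len(hostnames)

def needs_retraining(patterns: list[str], last_year: list[str],
                     this_year: list[str], max_drop: float = 0.10) -> bool:
    """Flag the AS when coverage on fresh PTRs falls well below last year's."""
    return match_coverage(patterns, last_year) - match_coverage(patterns, this_year) > max_drop
```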
Our findings on model performance have implications that reach beyond the Aleph pipeline: they offer a powerful insight into the rapidly evolving LLM ecosystem and the future of cost-efficient AI deployment.
While higher-tier models traditionally guaranteed better results, our experiments comparing Gemini 2.5 Flash and Gemini 2.5 Pro on a sample of 100 ASes revealed a critical convergence: their performance for this task is now virtually identical.
The rapid maturation of the LLM landscape offers significant economic advantages for large-scale applications, and it makes continuous performance evaluation essential for any AI-driven project. Our experiments covered all three tiers of the Gemini 2.5 family. The ultra-cost-effective Flash-Lite tier did not meet the accuracy benchmarks this complex task requires, but Flash and Pro have converged for the Aleph workload: with an accuracy difference of less than 0.3%, the far cheaper Gemini 2.5 Flash, especially when fortified with strategic prompting and retrieval (RAG), now delivers the same high-quality inference as the more expensive Pro model. By substituting Pro with Flash, we dramatically lowered the cost per inference, keeping an LLM-driven decoding process that aims to cover 99.99% of all IPv4 PTRs economically viable at Internet scale and translating model improvements directly into cost savings.
This project showed that LLMs can meaningfully improve the precision, automation, and interpretability of Internet infrastructure data. From decoding obscure hostnames to discovering hidden geographic patterns, Aleph demonstrates how machine learning can make cryptic Internet metadata more transparent, which is core to IPinfo’s mission of providing the highest quality and most accurate IP data available.
I was lucky to work with an incredibly supportive team — especially Oliver, Calvin, and William, who provided invaluable guidance at each step. This internship was an unforgettable experience that combined systems engineering, applied ML, and Internet measurement all towards the goal of making the Internet more understandable.

Kedar is a 2025 IPinfo Research Intern. He's also a third-year PhD student at Northwestern University in Chicago, focusing on internet measurement and machine learning for networking problems.