The Domain Name System or DNS is a system that stores information about host names and domain names in a kind of distributed database on networks, such as the Internet. Most importantly, it provides an IP address for each host name, and lists the mail exchange servers accepting e-mail for each domain.
The DNS provides a vital service on the Internet, because while computers and network hardware work with IP addresses to perform tasks such as addressing and routing, humans generally find it easier to work with host names and domain names, for example in URLs and e-mail addresses. The DNS therefore mediates between the needs and preferences of wetware and of software.
A brief history of the DNS
The practice of using a name as a more human-legible abstraction of a machine's address on the network predates even TCP/IP, and goes back to the ARPAnet era. Originally, each computer on the network retrieved a file called HOSTS.TXT from SRI (now SRI International), which mapped an address to a name. (Technically, this file still exists: most modern operating systems, either by default or through configuration, can check their Hosts file to match a host name to an IP address before checking the DNS.) However, such a system had inherent limitations: every time a given computer's address changed, every single system that wanted to communicate with that computer would need an update to its Hosts file.
The growth of networking called for a more scalable system: one which recorded a change in a host's address only in one place, and in which other hosts would learn about the change dynamically. Enter the DNS.
Paul Mockapetris invented the DNS in 1983; the original specifications appear in RFC 882 and 883. In 1987 the publication of RFC 1034 and RFC 1035 updated the DNS specification and made RFC 882 and RFC 883 obsolete. Several more recent RFCs have proposed various extensions to the core protocols.
How the DNS works in theory
Meet the players
The practical operation of the DNS involves three players:
- The DNS resolver, a DNS client program which runs on a user's computer, and which generates DNS requests on behalf of software programs;
- The recursive DNS server, which searches through the DNS in response to queries from resolvers, and returns answers to those resolvers;
- The authoritative DNS server which hands out answers to queries from recursors, either in the form of an answer, or in the form of a delegation (i.e. referral to another authoritative DNS server).
Understanding the parts of a domain name
A domain name usually consists of two or more parts (technically labels), separated by dots.
- The rightmost label conveys the top-level domain (for example, the address www.wikipedia.org has the top-level domain org).
- Each label to the left specifies a subdivision or subdomain of the domain above it. Note that "subdomain" expresses relative dependence, not absolute dependence: for example, wikipedia.org is a subdomain of the org domain, and en.wikipedia.org could form a subdomain of the domain wikipedia.org (in practice, however, en.wikipedia.org actually represents a hostname; see below). In theory, this subdivision can go down to 127 levels deep, and each label can contain up to 63 characters, as long as the whole domain name does not exceed a total length of 255 characters. In practice, however, some domain registries impose shorter limits.
- Finally, the leftmost part of the domain name (usually) expresses the hostname. The rest of the domain name simply specifies a way of building a logical path to the information required; the hostname is the actual target system name for which an IP address is desired. For example, the domain name www.wikipedia.org has the hostname "www".
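The label structure described above can be sketched in a few lines of Python. This is only an illustration: the function name split_domain_name is hypothetical, and the checks cover nothing beyond the 63-character and 255-character limits quoted in the text.

```python
MAX_LABEL_LENGTH = 63    # per-label limit quoted in the text
MAX_NAME_LENGTH = 255    # whole-name limit quoted in the text


def split_domain_name(name: str) -> dict:
    """Split a domain name into labels and check the length limits."""
    name = name.rstrip(".")  # tolerate a trailing root dot
    if not name or len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name is empty or too long: {name!r}")
    labels = name.split(".")
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL_LENGTH:
            raise ValueError(f"bad label length: {label!r}")
    return {
        "labels": labels,
        "top_level_domain": labels[-1],
        "leftmost_label": labels[0],
        # Each parent domain is the name with one more leading label removed.
        "parents": [".".join(labels[i:]) for i in range(1, len(labels))],
    }


parts = split_domain_name("www.wikipedia.org")
print(parts["top_level_domain"])
print(parts["leftmost_label"])
print(parts["parents"])
```

The "parents" list mirrors the logical path described above: each entry is the domain one level further up the hierarchy.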
The DNS consists of a hierarchical set of DNS servers. Each domain or subdomain has one or more authoritative DNS servers that publish information about that domain and the name servers of any domains "beneath" it. The hierarchy of authoritative DNS servers matches the hierarchy of domains. At the top of the hierarchy stand the root servers: the servers to query when looking up (resolving) a top-level domain name.
An example of theoretical DNS recursion
An example may clarify this process. Suppose an application needs to find the IP address of www.wikipedia.org. It puts this question to a local DNS recursor.
- Before starting, the recursor has to know where to find the root servers; administrators of recursive DNS servers manually specify (and periodically update) a file called the root hints zone which specifies the IP addresses of these servers.
- The process starts by the recursor asking one of these root servers - for example, the server with the IP address "188.8.131.52" - the question "what is the IP address for www.wikipedia.org?"
- The root server replies with a delegation, meaning roughly: "I don't know the IP address of www.wikipedia.org, but I do know that the DNS server at 184.108.40.206 has information on the org domain."
- The local DNS recursor then asks that DNS server the same question it had previously put to the root server, i.e. "what is the IP address for www.wikipedia.org?". It gets a similar reply - essentially, "I don't know the address of www.wikipedia.org, but I do know that the DNS server at 18.104.22.168 has information on the wikipedia.org domain."
- Finally the request goes to this third DNS server, which replies with the required IP address.
This process utilises recursive searching.
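The walk from the root down to the final answer can be modelled as a small simulation. The sketch below is a toy, not a real resolver: the server table, the addresses (all invented, from the 192.0.2.0/24 documentation range) and the function names are assumptions made for illustration, and nothing here touches the network.

```python
# Toy model of the delegation walk; every address is made up.
ROOT_SERVER = "192.0.2.1"

# Each hypothetical server maps a domain suffix to either a referral
# ("delegate", next server address) or a final answer ("answer", IP address).
SERVERS = {
    "192.0.2.1": {"org": ("delegate", "192.0.2.2")},
    "192.0.2.2": {"wikipedia.org": ("delegate", "192.0.2.3")},
    "192.0.2.3": {"www.wikipedia.org": ("answer", "192.0.2.80")},
}


def query(server: str, name: str):
    """Return the most specific record the server holds for name."""
    zone = SERVERS[server]
    labels = name.split(".")
    # Try the full name first, then progressively shorter suffixes.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in zone:
            return zone[suffix]
    return ("nxdomain", None)


def resolve(name: str, max_steps: int = 10):
    """Follow delegations from the root until an answer arrives."""
    server = ROOT_SERVER
    path = [server]
    for _ in range(max_steps):
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "delegate":
            server = value
            path.append(server)
            continue
        return None, path  # no server knows the name
    raise RuntimeError("too many delegations")


address, path = resolve("www.wikipedia.org")
print(address, path)
```

The recursor puts the same question to each server in turn; only the server it asks changes, which is why the path lists three servers for a three-label name.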
Understanding domain registration and glue records
Reading the example above, you might reasonably wonder: "how does the DNS server for the org domain know what IP address to give out for the wikipedia.org domain?" In the first step of the process, we noted that a DNS recursor has the IP addresses of the root servers more-or-less hard coded. Equally, the name servers that are authoritative for the Top-Level Domains change only very infrequently.
However, the name servers that provide authoritative answers for common domain names may change relatively often. As part of the process of registering a domain name (and at any time thereafter), a registrant provides the registry with the name servers that will be authoritative for that domain name; therefore, when registering wikipedia.org, that domain is associated with the name servers gunther.bomis.com and zwinger.wikipedia.org at the .org registry. Consequently, in the example above, when the DNS server for the org domain receives a request, it scans its list of domains, locates wikipedia.org, and returns the name servers associated with that domain.
Usually, name servers appear listed by name, rather than by IP address. This generates another string of DNS requests to resolve the name of the name server; when an IP address of a name server has a registration at the parent zone, network programmers call this a glue record.
DNS in practice
When an application (such as a web browser) wants to find the IP address of a domain name, it doesn't necessarily follow all of the steps outlined in the theory section above. We will first look at the concept of caching, then outline the operation of DNS in "the real world".
Caching and time to live
Because of the huge volume of requests generated by a system like the DNS, the designers wished to provide a mechanism to reduce the load on individual DNS servers. The mechanism devised provided that when a DNS resolver (i.e. client) received a DNS response, it would cache that response for a given period of time. A value called the time to live, or TTL, set by the administrator of the DNS server handing out the response, defines that period of time. Once a response goes into cache, the resolver will consult its cached (stored) answer; only when the TTL expires (or when an administrator manually flushes the response from the resolver's memory) will the resolver contact the DNS server again for the same information.
An important consequence of this distributed and caching architecture is that changes to the DNS are not necessarily immediately effective globally. An example explains this best: if an administrator has set a TTL of 6 hours for the host www.wikipedia.org, and then changes the IP address to which www.wikipedia.org resolves at 12:01pm, the administrator must consider that a person who cached a response with the old value at 12:00pm will not consult the DNS server again until 6:00pm. The period between 12:01pm and 6:00pm in this example is called propagation time: the period that begins when you make a change to a DNS record and ends once the maximum amount of time specified by the TTL has passed. This leads to an important logistical consideration when making changes to the DNS: not everyone necessarily sees the same thing you see. RFC 1537 offers guidance on choosing TTL values.
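The TTL behaviour can be illustrated with a minimal cache. In this sketch the class name TtlCache is hypothetical, the address is an invented documentation-range value, and the current time is passed in explicitly (in seconds) so that the 12:00pm to 6:00pm example can be replayed without waiting.

```python
class TtlCache:
    """Cache answers until their time to live runs out."""

    def __init__(self):
        self._entries = {}  # name -> (address, expiry time in seconds)

    def put(self, name, address, ttl, now):
        self._entries[name] = (address, now + ttl)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            del self._entries[name]  # expired: force a fresh lookup
            return None
        return address


HOUR = 3600
noon = 12 * HOUR
cache = TtlCache()
cache.put("www.wikipedia.org", "192.0.2.10", ttl=6 * HOUR, now=noon)

# One minute before 6:00pm the old answer is still served;
# at 6:00pm the entry has expired and the cache returns None.
print(cache.get("www.wikipedia.org", now=noon + 6 * HOUR - 60))
print(cache.get("www.wikipedia.org", now=noon + 6 * HOUR))
```

Any change made on the server during those six hours stays invisible to this client, which is the propagation delay described above.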
DNS in the real world
In the real world, users do not interface directly with a DNS resolver - they interface with programs like web browsers (Mozilla Firefox, Safari, Opera, Internet Explorer etc.) and mail clients (Outlook Express, Mozilla Thunderbird etc.). When users make a request which requires a DNS lookup (in effect, virtually any request that uses the Internet), such programs send a request to the DNS resolver built into their operating system.
The DNS resolver will almost invariably have a cache (see above) containing recent lookups. If the cache can provide the answer to the request, the resolver will return the value in the cache to the program that made the request. If the cache does not contain the answer, the resolver will send the request to a designated DNS server or servers. In the case of most home users, the Internet service provider to which the machine connects will usually supply this DNS server: such a user will either configure that server's address manually or allow DHCP to set it; however, where systems administrators have configured systems to use their own DNS servers, their DNS resolvers will generally point to their own nameservers. This name server will then follow the process outlined above in DNS in theory, until it either successfully finds a result, or does not. It then returns its results to the DNS resolver; assuming it has found a result, the resolver duly caches that result for future use, and hands the result back to the software which initiated the request.
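From a program's point of view, all of this machinery sits behind a single call into the operating system's resolver. The sketch below uses Python's standard socket.getaddrinfo; the wrapper name lookup_addresses is an assumption for illustration, and what it returns depends entirely on the local resolver configuration.

```python
import socket


def lookup_addresses(host_name: str) -> list:
    """Ask the operating system's resolver for the addresses of a host name."""
    try:
        results = socket.getaddrinfo(host_name, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # the resolver could not find an answer
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        if sockaddr[0] not in addresses:  # drop duplicate entries
            addresses.append(sockaddr[0])
    return addresses


print(lookup_addresses("localhost"))
```

The program never learns whether the answer came from a cache, the Hosts file or a remote name server; it simply receives a list of addresses.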
As a final level of complexity, some applications such as Web browsers also have their own DNS cache, in order to reduce use of the DNS resolver library itself. This can add extra difficulty to DNS debugging, as it obscures which data is fresh and which cache it lies in. These caches typically have very short caching times, of the order of 1 minute.
Other DNS applications
The system outlined above provides a somewhat simplified scenario. The DNS includes several other functions:
- Host names and IP addresses do not necessarily match on a one-to-one basis. Many host names may correspond to a single IP address: combined with virtual hosting, this allows a single machine to serve many web sites. Alternatively a single host name may correspond to many IP addresses: this can facilitate fault tolerance and load distribution, and also allows a site to move physical location seamlessly.
- There are many uses of DNS besides translating names to IP addresses. For instance, mail transfer agents use DNS to find out where to deliver e-mail for a particular address. The domain to mail exchanger mapping provided by MX records accommodates another layer of fault tolerance and load distribution on top of the name to IP address mapping. Sender Policy Framework, more controversially, takes advantage of another DNS record type, the TXT record.
- To provide resilience in the event of computer failure, multiple DNS servers provide coverage of each domain. In particular, thirteen root servers exist worldwide. DNS programs or operating systems have the IP addresses of these servers built in. The USA hosts, at least nominally, all but three of the root servers. However, because many root servers actually implement anycast, where many different computers can share the same IP address to deliver a single service over a large geographic region, most of the physical (rather than nominal) root servers now operate outside the USA.
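The fault tolerance that several addresses per host name can provide, as mentioned in the list above, may be sketched as follows. The helper first_reachable, the record list and the "down" set are all hypothetical; a real client would attempt actual connections rather than consult a set.

```python
def first_reachable(addresses, is_reachable):
    """Return the first address that answers, or None if all fail."""
    for address in addresses:
        if is_reachable(address):
            return address
    return None


# Hypothetical A records for one host name (documentation-range addresses).
records = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
down = {"192.0.2.10"}  # pretend the first machine has failed

chosen = first_reachable(records, lambda address: address not in down)
print(chosen)
```

A client that walks the list in this way keeps working as long as at least one of the listed machines is up.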
The DNS uses TCP and UDP on port 53 to serve requests. Almost all DNS queries consist of a single UDP request from the client followed by a single UDP reply from the server. TCP typically comes into play only when the response data size exceeds 512 bytes, or for such tasks as zone transfer.
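To give a sense of how small a typical request is, the sketch below encodes a minimal query for an A record. The wire layout (a 12-byte header followed by length-prefixed labels) comes from RFC 1035 rather than from this article, and the function name build_query and the query ID are arbitrary choices for illustration. The code only builds the bytes; it sends nothing.

```python
import struct


def build_query(name: str, query_id: int = 0x1234, record_type: int = 1) -> bytes:
    """Encode a minimal DNS query (record_type 1 = A, class 1 = IN)."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # Header: ID, flags, then question/answer/authority/additional counts.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = b""
    for label in name.rstrip(".").split("."):
        encoded = label.encode("ascii")
        question += bytes([len(encoded)]) + encoded  # length-prefixed label
    question += b"\x00"  # zero-length label marks the root
    question += struct.pack("!HH", record_type, 1)
    return header + question


packet = build_query("www.wikipedia.org")
print(len(packet), "bytes; fits in one UDP datagram:", len(packet) <= 512)
```

A client would send these bytes in a single UDP datagram to port 53 of its name server and expect a single datagram in reply.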
Types of DNS records
Important categories of data stored in the DNS include the following:
- An A record or address record maps a host name to its 32-bit IPv4 address.
- An AAAA record or IPv6 address record maps a host name to its 128-bit IPv6 address.
- A CNAME record or canonical name record makes one domain name an alias of another. The aliased domain gets all the subdomains and DNS records of the original.
- An MX record or mail exchange record maps a domain name to a list of mail exchange servers for that domain.
- A PTR record or pointer record maps a name to the canonical name for that host. Setting up a PTR record in the in-addr.arpa domain, under the name formed from an IP address with its octets in reverse order, implements reverse DNS lookup for that address. For example (at the time of writing), www.icann.net has the IP address 184.108.40.206, but the PTR record for the corresponding in-addr.arpa name maps it to its canonical name, referrals.icann.org.
- An NS record or name server record maps a domain name to a list of DNS servers for that domain. Delegations depend on NS records.
- An SOA record or start of authority record specifies the DNS server providing authoritative information about an Internet domain.
- A SRV record is a generalized service location record.
- A TXT record allows an administrator to insert arbitrary text into a DNS record; this record is also used in the Sender Policy Framework specification.
Other kinds of records simply provide information (for example, a LOC record gives the physical location of a host), or experimental data (for example, a WKS record gives a list of servers offering some well-known service such as HTTP or POP3 for a domain).
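The in-addr.arpa name used for a reverse lookup, as described under the PTR record above, is simply the address with its octets reversed. Python's standard ipaddress module can build it; the address below is an invented documentation-range value and the wrapper name reverse_lookup_name is hypothetical.

```python
import ipaddress


def reverse_lookup_name(address: str) -> str:
    """Return the name under which a PTR record for address would live."""
    return ipaddress.ip_address(address).reverse_pointer


print(reverse_lookup_name("192.0.2.10"))
```

A resolver performing a reverse lookup queries this name for a PTR record, just as it would query a host name for an A record.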
Internationalised domain names
Domain names must use only a subset of ASCII characters, preventing many languages from representing their names and words natively. ICANN has approved the Punycode-based IDNA system, which maps Unicode strings into the valid DNS character set, as a workaround to this issue, and some registries have adopted IDNA.
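As a brief illustration of the mapping, Python's built-in "idna" codec converts a Unicode name to its ASCII form and back. Note the assumption: this codec implements the original IDNA specification (RFC 3490), so it may differ from what a given registry applies today, and the name used here is an invented example.

```python
unicode_name = "bücher.example"

# Each non-ASCII label is converted to Punycode and prefixed with "xn--".
ascii_name = unicode_name.encode("idna")
print(ascii_name)

# Decoding reverses the mapping.
print(ascii_name.decode("idna"))
```

Only the ASCII form travels through the DNS; applications convert it back to Unicode for display.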
Various flavors of DNS software implement the DNS, including:
DNS-oriented utilities include:
- dig (the domain information groper)
Legal users of domains
No one in the world really "owns" a domain name except the Network Information Centre (NIC), or domain name registry. Most of the NICs in the world receive an annual fee from a legal user in return for the right to use the domain name (i.e. a sort of leasing agreement exists, subject to the registry's terms and conditions). Depending on the naming conventions of the various registries, legal users are commonly known as "registrants" or as "domain holders".
ICANN holds a complete list of domain registries in the world. One can find the legal user of a domain name by looking in the WHOIS database held by most domain registries.
For most of the more than 240 country code top-level domains (ccTLDs), the domain registries hold the authoritative WHOIS (registrant, name servers, expiry dates etc). For instance, DENIC, the German NIC, holds the authoritative WHOIS for a .DE domain name.
However, some domain registries, such as VeriSign, use a registry-registrar model. For .COM and .NET domain names, the registry, VeriSign, holds only a basic WHOIS (registrar, name servers etc). One can find the detailed WHOIS (registrant, name servers, expiry dates etc) at the registrars.
Since about 2001, most gTLD registries (.ORG, .BIZ, .INFO) have adopted a so-called "thick" registry approach, i.e. keeping the authoritative WHOIS with the various registries instead of the registrars.
A registrant usually designates an administrative contact to manage the domain name. Management functions delegated to the administrative contacts may include (for example):
- the obligation to conform to the requirements of the domain registry in order to retain the right to use a domain name
- authorisation to update the physical address, email address and telephone number etc in WHOIS
A technical contact manages the name servers of a domain name. The many functions of a technical contact include:
- making sure the configurations of the domain name conform to the requirements of the domain registry
- updating the domain zone
- providing the 24x7 functionality of the name servers (that leads to the accessibility of the domain name)
The billing contact is, self-explanatorily, the party whom a NIC invoices.
The name servers are the authoritative name servers that host the zone of a domain name.
Many investigators have voiced criticism of the methods used currently to control ownership of domains. Most commonly, critics claim abuse by monopolies or near-monopolies, such as VeriSign, Inc., and problems with assignment of top-level domains. The international body ICANN (the Internet Corporation for Assigned Names and Numbers) oversees the domain name industry.
U.S. Truth in Domain Names Act
The U.S. "Truth in Domain Names Act", in combination with the PROTECT Act, forbids knowingly using a misleading domain name with the intent of attracting people into viewing a visual depiction of sexually explicit conduct on the Internet.