The TCP/IP Guide - Version 3.0 (Contents) - 1297 - © 2001-2005 Charles M. Kozierok. All Rights Reserved.
Creating and Interpreting Relative URLs
It is for these reasons that URL syntax was extended to include a relative form. In simplest
terms, a relative URL is the same as an absolute URL but with pieces of information omitted
that are implied by context. Like our “go downstairs” instruction, a relative URL does not by
itself contain enough information to specify a resource. A relative URL must be interpreted
within a context that provides the missing information.
The context needed to find a resource from a relative URL takes the form of a base URL
that supplies the missing information. A base URL must be either a specific absolute
URL, or itself a relative URL that refers to some other absolute base. The base URL may be
either explicitly stated or inferred from use. The RFCs dealing with URLs define
three methods for determining the base URL, which are arranged in the following order of
precedence:
1. Base URL Within Document: Some documents allow the base URL to be explicitly
stated. If present, this specification is used for any relative URLs in the document.
2. Base URL From Encapsulating Entity: In cases where no explicit base URL is
specified in a document, but the document is part of a higher-level entity enclosing it,
the base URL is the URL of the “parent” document. For example, a document within a
body part of a MIME multipart message can use the URL of the message as a whole
as the base URL for relative references.
3. Base URL From Retrieval URL: If neither of those two methods is feasible, the
base URL is inferred from the URL used to retrieve the document containing the
relative URL.
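
The precedence above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name choose_base_url and its three parameters are hypothetical, and each parameter stands for the base URL that one of the three methods would yield, or None when that method does not apply.

```python
from typing import Optional


def choose_base_url(
    document_base: Optional[str],
    encapsulating_url: Optional[str],
    retrieval_url: Optional[str],
) -> Optional[str]:
    """Return the first available base URL in precedence order.

    Hypothetical helper: the candidates are checked in the order
    1) base stated within the document, 2) URL of the encapsulating
    entity, 3) URL used to retrieve the document. None means that no
    base URL could be determined at all.
    """
    for candidate in (document_base, encapsulating_url, retrieval_url):
        if candidate:
            return candidate
    return None


# No explicit base and no enclosing entity: fall back to the retrieval URL.
print(choose_base_url(None, None, "http://www.example.com/docs/index.htm"))
```

Returning None when all three candidates are missing corresponds to the case where no base URL can be determined.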
Of these three methods, #1 and #3 are the most common. HTML, the language used for the
Web, allows a base URL to be explicitly stated (with its BASE element), which removes any
doubt about how relative URLs are to be interpreted. Failing this, method #3 is commonly
used for images and other links in HTML documents that are specified in relative terms.
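
As a rough sketch of how methods #1 and #3 might combine for an HTML page, the Python below looks for a BASE element and falls back to the retrieval URL when none is present. The class name BaseFinder, the function effective_base and the example.com addresses are all assumptions made for illustration; only the standard library's html.parser and urllib.parse modules are used.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class BaseFinder(HTMLParser):
    """Records the href of the first <base> tag encountered."""

    def __init__(self) -> None:
        super().__init__()
        self.base_href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attrs is a list of (name, value) pairs.
        if tag == "base" and self.base_href is None:
            self.base_href = dict(attrs).get("href")


def effective_base(html_text: str, retrieval_url: str) -> str:
    """Hypothetical helper: explicit base if stated, else the retrieval URL."""
    finder = BaseFinder()
    finder.feed(html_text)
    if finder.base_href:
        # The stated base may itself be relative, so resolve it
        # against the retrieval URL before using it.
        return urljoin(retrieval_url, finder.base_href)
    return retrieval_url


page = '<html><head><base href="http://cdn.example.com/assets/"></head></html>'
print(effective_base(page, "http://www.example.com/docs/index.htm"))
```

The sketch skips method #2 entirely, since a page fetched directly over HTTP normally has no encapsulating entity.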
For example, let's go back to the poor slob maintaining
“http://www.longdomainnamesareirritating.com/index.htm”. By default, any images referenced
from that “index.htm” HTML document can use relative URLs; the base URL will be taken from
the URL of the document itself. So he can just say “companylogo.gif” instead of
“http://www.longdomainnamesareirritating.com/companylogo.gif”, as long as that file is in the
same directory on the same server as “index.htm”.
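
The same resolution can be tried out with the urljoin function from Python's standard urllib.parse module. This is a small illustration, not a description of how any particular browser works; the "images/" subdirectory and the "other.example.com" host are hypothetical additions, not part of the example above.

```python
from urllib.parse import urljoin

# Base URL inferred from the URL used to retrieve the document (method #3).
base = "http://www.longdomainnamesareirritating.com/index.htm"

# A bare file name is resolved relative to the directory of the base document.
print(urljoin(base, "companylogo.gif"))
# http://www.longdomainnamesareirritating.com/companylogo.gif

# A relative path into a (hypothetical) subdirectory works the same way.
print(urljoin(base, "images/companylogo.gif"))
# http://www.longdomainnamesareirritating.com/images/companylogo.gif

# An absolute URL carries all of its own information, so the base is ignored.
print(urljoin(base, "http://other.example.com/logo.gif"))
# http://other.example.com/logo.gif
```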
If all three of these methods fail for whatever reason, then no base URL can be determined.
Relative URLs in such a document will be interpreted as absolute URLs, and since they do
not contain complete information, they will not work properly.
Also, relative URLs only have meaning for certain URL schemes. For others, they make no
sense and cannot be used. In particular, relative URLs are never used for the “telnet”,
“mailto” and “news” schemes. They are very commonly used for HTTP documents, and
may also be used for FTP and file URLs.
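
A program that resolves relative URLs might guard against this by checking the scheme of the base URL first. The following is a minimal sketch in Python; the set RELATIVE_CAPABLE_SCHEMES and the function supports_relative are hypothetical names, and the set lists only the schemes named in this section rather than every scheme that permits relative forms.

```python
from urllib.parse import urlsplit

# Assumption: only the schemes this section names as allowing relative URLs.
RELATIVE_CAPABLE_SCHEMES = {"http", "ftp", "file"}


def supports_relative(base_url: str) -> bool:
    """Hypothetical check: can relative URLs be resolved against this base?"""
    return urlsplit(base_url).scheme.lower() in RELATIVE_CAPABLE_SCHEMES


for url in (
    "http://www.example.com/index.htm",
    "ftp://ftp.example.com/pub/readme.txt",
    "mailto:someone@example.com",
    "telnet://host.example.com:23/",
):
    print(url, supports_relative(url))
```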