Charles M. Kozierok The TCP-IP Guide

Encoding was a significant issue for MIME, because it was created for the specific purpose

of sending non-text data using the old RFC 822 e-mail message standard. RFC 822

imposes several significant restrictions on the messages it carries, the most important of

which is that data must be encoded using 7-bit ASCII. RFC 822 messages are also limited

to lines of no more than 1000 characters that end in a “CRLF” sequence.

These limitations mean that arbitrary binary files, which have no concept of lines and

consist of bytes which can each contain a value from 0 to 255, cannot be sent using RFC

822 in their native format. In order for MIME to transfer these files, they must be encoded

using a method such as base64, which converts three 8-bit characters to a set of four 6-bit

characters that can be represented in ASCII. When this sort of transformation is done, the

MIME Content-Transfer-Encoding header is included in the message so the recipient can

reverse the encoding to return the data to its normal form.

Now, while this technique works, it is less efficient than sending the data directly in binary,

because base64 encoding increases the size of the message by 33% (three bytes are

encoded using four ASCII characters, each of which takes one byte to transmit). HTTP

messages are transmitted directly between client and server over a TCP connection, and

do not use the RFC 822 standard. Thus, binary data can be sent between HTTP clients and

servers without the need for base64 encoding or other transformation techniques. Since it is

more efficient to send the data unencoded, this may be one reason why HTTP’s developers

decided to not make the protocol strictly MIME compliant.
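To make the 33% figure concrete, here is a minimal sketch using Python's standard base64 module. The sample bytes and the helper name overhead are my own illustrative choices, not anything taken from the MIME or HTTP standards.

```python
import base64

# Three arbitrary bytes, including values outside the 7-bit ASCII range.
raw = bytes([0x00, 0xFF, 0x80])

encoded = base64.b64encode(raw)      # four ASCII characters
decoded = base64.b64decode(encoded)  # back to the original three bytes


def overhead(n_bytes: int) -> float:
    """Fractional size increase after base64-encoding n_bytes of data."""
    data = bytes(n_bytes)
    return len(base64.b64encode(data)) / n_bytes - 1


print(len(raw), "bytes in,", len(encoded), "ASCII characters out")
print(f"overhead for 3000 bytes: {overhead(3000):.1%}")
```

Every byte of the encoded form fits in 7 bits, which is what RFC 822 needs; the price is one extra byte sent for every three bytes of data.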

HTTP's Two-Level Encoding Scheme

This would seem to be an area where HTTP was simpler than MIME—since there is no

need to encode the entity, there is no need for the Content-Transfer-Encoding header, and

we have one less thing to worry about. Ha, nice try! ☺ It is true that HTTP could have

simply been designed so that all entities were just sent one byte at a time with no need to

specify encodings. But the developers of the protocol recognized that this would have made

the protocol inflexible. There are situations where it might be useful to transform or encode

an entity or message for transmission, and then reverse the operation upon receipt.

This effort to make HTTP flexible resulted in a system of representing encodings that is

actually more complicated than MIME’s. The key to understanding it is to recognize that

HTTP/1.1 actually splits MIME’s notion of a “content transfer encoding” into two different

encoding levels:

☯ Content Encoding: This is an encoding that is applied specifically to the entity carried

in an HTTP message, to prepare or package it prior to transmission. Content

encodings are said to be “end-to-end”, because the encoding of the entity is done

once before it is sent by the client or server, and is decoded only upon receipt by the

ultimate recipient: server or client. When this type of encoding is done, the method is

identified in the special Content-Encoding entity header. A client may also specify what

content encodings it can handle, using the Accept-Encoding header, as we will see in

the topic on content negotiation.

☯ Transfer Encoding: This is an encoding that is done specifically for the purpose of

ensuring that data can be safely transferred between devices. It is applied across an

entire HTTP message, and not specifically to the entity. This type of encoding is “hop-

by-hop” because a different transfer encoding may be used for each hop of a message

that is transmitted through many intermediaries in the request/response chain. The

transfer encoding method, if any, is indicated in the Transfer-Encoding general header.

Use of Content and Transfer Encodings in HTTP

Since the two encodings are applied at different levels, it is possible for both to be used at

the same time. A content encoding may be applied to an entity and then placed into a

message. On some or all of the hops that are used to move the message containing that

entity, a transfer encoding may be applied to the entire message (of course including the

entity). The transfer encoding is removed first, and then the content encoding.

Okay, so what are these used for in practice? Not a great deal. The HTTP standard defines

a small number of content and transfer encodings, and specifies that additional methods

may be registered with the IANA. As of the time that I write this, however, only the ones

defined in the HTTP/1.1 standard are in use.

Content encodings are currently used only to implement compression. This is a good

example of an encoding that while not strictly necessary, can be useful since it improves

performance, dramatically so for some types of data. RFC 2616 defines three different

encoding algorithms: gzip (the compression used by the UNIX gzip program, and described

in RFC 1952); compress (again, representing the compression method used by the UNIX

program of that name); and deflate (a method defined in RFCs 1950 and 1951).
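As a rough sketch of how a gzip content encoding might be applied and reversed, the following uses Python's standard gzip module. The sample entity, the plain dictionary of headers and the decode_entity helper are assumptions made for illustration; a real server or client would do this inside its HTTP library.

```python
import gzip

# Hypothetical entity body; repetitive text compresses well.
entity = b"<p>The quick brown fox jumps over the lazy dog.</p>\n" * 200

# Sender: apply the content coding once, before transmission.
encoded_entity = gzip.compress(entity)
headers = {
    "Content-Type": "text/html",
    "Content-Encoding": "gzip",
    "Content-Length": str(len(encoded_entity)),
}


def decode_entity(headers: dict, body: bytes) -> bytes:
    """Undo the content coding named in Content-Encoding, if any."""
    coding = headers.get("Content-Encoding", "identity")
    if coding == "gzip":
        return gzip.decompress(body)
    if coding == "identity":
        return body
    raise ValueError(f"unsupported content coding: {coding}")


# Ultimate recipient: reverse the coding to recover the entity.
restored = decode_entity(headers, encoded_entity)
print(len(entity), "->", len(encoded_entity), "bytes on the wire")
```

Note that Content-Length here describes the compressed form, since that is what actually travels in the message.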

Note: It is also possible to apply compression to an entire HTTP message as a

transfer encoding. Obviously, if the entity is already compressed using content

encoding, this will result in some duplication of effort. Since the size of HTTP

headers is not that large compared to some entities that HTTP messages carry, it is usually

simpler just to compress the entity using content encoding.

Since transfer encodings are intended to be used to make data safe for transfer, and we’ve

already discussed the fact that HTTP can handle arbitrary binary data, this suggests that

transfer encodings are not really necessary. As it turns out, however, there is one situation

where “safe transport” does become an issue: the matter of identifying the end of a

message. This issue is the subject of the next topic.

Key Concept: HTTP supports two levels of codings for data transfer. The first is

content encoding, which is utilized in certain circumstances to transform the entity

carried in an HTTP message; the second is transfer encoding, which is used to

encode an entire HTTP message to assure its safe transport. Content encodings are often

employed when entities are compressed to improve communication efficiency; transfer

encoding is used primarily to deal with the problem of identifying the end of a message.

HTTP Data Length Issues, "Chunked" Transfers and Message Trailers

Two different levels of encodings are used in HTTP: content encodings, which are applied

to HTTP entities, and transfer encodings, which are used over entire HTTP messages.

Content encodings are used for convenience to package entities for transmission, whereas

transfer encodings are hop-specific, and are intended for use in situations where data

needs to be made “safe” for transfer.

However, we’ve already seen in the previous topic that HTTP can transport arbitrary binary

data, so unlike the situation where MIME had to make binary data “safe” for RFC 822, this is

not an issue. So why are transfer encodings needed at all? In theory they are not,

and HTTP/1.0 did not even have a Transfer-Encoding header (though it did use content

encodings). The concept of transfer encoding became important in HTTP/1.1 due to

another key feature of that version of HTTP: persistent connections.

Recall that HTTP uses the Transmission Control Protocol (TCP) for connections. One of the

key characteristics of TCP is that it transmits all data as a stream of unstructured bytes.

TCP itself does not provide any way of differentiating between the end of one piece of data

and the start of the next; this is left up to each application.

In HTTP/1.0 (and HTTP/0.9) this was not a problem, because those versions used only

transitory connections. Each HTTP session consisted of only one request and one

response; since client and server only each sent one piece of data, there was no need to

worry about differentiating HTTP messages on a connection. HTTP/1.1’s persistent

connections improve performance by letting devices send requests and responses one

after the other over a single TCP connection. However, the fact that messages are sent in

sequence makes differentiating them a concern.

Using The Content-Length Header

There are two usual approaches to dealing with this sort of data length issue: either using

an explicit delimiter to mark the end of the message, or including a length header or field to

tell the recipient how long each message is. The first approach could not really have been

done easily while maintaining compatibility with older versions of the protocol. This left the

second approach; since HTTP already had a Content-Length entity header, the solution

was to use this to indicate the length of each message at transmission time.
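To illustrate why a length header is enough to separate messages on a persistent connection, here is a small sketch that splits two back-to-back responses out of a single byte stream. The function name split_messages and the sample stream are hypothetical, and the parsing is deliberately simplistic (it assumes every message carries a Content-Length header).

```python
def split_messages(stream: bytes) -> list[tuple[dict, bytes]]:
    """Split back-to-back HTTP responses using Content-Length."""
    messages = []
    pos = 0
    while pos < len(stream):
        # Headers end at the first empty line (CRLF CRLF).
        head_end = stream.index(b"\r\n\r\n", pos)
        head_lines = stream[pos:head_end].decode("ascii").split("\r\n")
        headers = {}
        for line in head_lines[1:]:              # skip the status line
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        body_start = head_end + 4
        length = int(headers["content-length"])  # decimal byte count
        body = stream[body_start:body_start + length]
        messages.append((headers, body))
        pos = body_start + length                # next message starts here
    return messages


# Two responses as they might arrive, one after the other, over TCP.
stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye"
)
for headers, body in split_messages(stream):
    print(headers["content-length"], body)
```

Without the length, nothing in the byte stream itself says where "hello" stops and the second status line begins.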

This method works fine in cases where the size of the entity to be transferred is known in

advance, such as when a static file such as a text document, image or executable program

needs to be transmitted. However, there are many types of resources that are generated

dynamically; the total size of such a resource is not known until it has been completely

processed. While not typical in HTTP’s early days, these account for a large percentage of

World Wide Web traffic today.

Many Web pages are not static HTML files, but rather are created as output from

scripts or programs based on user input; discussion forums would be a good example.

Even HTML files today are often not static. They usually contain program elements such as

server-side includes (SSIs) that cause code to be generated on-the-fly, so their exact size

cannot be determined in advance.

Using "Chunked" Transfers

The problem of unknown message length could be resolved by buffering the entire resource

before transmission. However, this would be wasteful of server memory and would delay

the transmission of the entity unnecessarily—no part could be sent until the entire entity

was ready. Instead, a special transfer encoding method was developed to handle this

particular problem of “unsafe” transport: not knowing the length of a file. The method is

called chunking.

When this technique is used, instead of sending an entity as a raw sequence of bytes, it is

broken into, well, chunks. ☺ This allows HTTP to send a dynamically-generated resource,

such as output from a script, a piece at a time as the data becomes available from the

software processing it. To indicate that this method has been used, the special header

“Transfer-Encoding: chunked” is placed in the message. A special format is also used for

the body of the HTTP message to delineate the chunks:

<chunk-1-length>

<chunk-1-data>

<chunk-2-length>

<chunk-2-data>

...

<message-trailers>

Basically, instead of putting the whole entity in the body and indicating its length in a

Content-Length header, each chunk is placed in the body sequentially, each preceded by

the length of the chunk. The length is specified in hexadecimal, and represented using

ASCII characters. All chunk lengths and chunk data are terminated with a “CRLF”

sequence. The recipient knows it has received the last chunk when it sees a chunk-length

of zero.
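The framing just described can be sketched in a few lines of Python. The function name encode_chunked is my own, and the final empty line that closes the body comes from the chunked grammar in RFC 2616 rather than from the simplified layout shown above.

```python
def encode_chunked(pieces, trailers=None) -> bytes:
    """Frame an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would signal the end of the body
        out += f"{len(piece):x}".encode("ascii") + b"\r\n"  # length in hex
        out += piece + b"\r\n"                              # chunk data
    out += b"0\r\n"                                         # last chunk
    for name, value in (trailers or {}).items():
        out += f"{name}: {value}\r\n".encode("ascii")
    out += b"\r\n"                                          # end of body
    return bytes(out)


# Pieces become available one at a time, e.g. from a script.
body = encode_chunked([b"Hello, ", b"chunked ", b"world!"])
print(body)
```

Because each piece is framed on its own, the sender never needs to know the total size before it starts transmitting.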

Key Concept: Since HTTP/1.1 uses persistent connections that allow multiple

requests and responses to be sent over a TCP connection, clients and servers need

some way to identify where one message ends and the next begins. The easier

solution is to use the Content-Length header to indicate the size of a message, but this only

works when the length of a message can be easily determined in advance. For dynamic

content or other cases where message length cannot be easily computed before sending

the data, the special chunked transfer encoding can be used, where the message body is

sent as a sequence of chunks, each preceded by the length of the chunk.

Message Trailers

When chunked transfer encoding is used, the sender of the message may also choose to

specify one or more message trailers. These are the same as entity headers, describing the

contents of the message body, but appear after the entity instead of before it. They provide

flexibility in the same way that chunking itself does—they allow a device to include an HTTP

header that may contain information that was not available when the HTTP message

transmission began. A good example would be an integrity check field calculated based on the

byte values of the entire entity.

Trailers are optional, and will not always be needed. When they are used, they are processed

just like regular entity headers. To give the recipient of a message a “heads up” that trailers

have been used, the special Trailer header should be included at the start of the message,

which lists the names of each header that appears as a trailer.
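On the receiving side, a decoder reads chunks until it sees a length of zero and then treats whatever follows as trailers. The sketch below assumes a complete, well-formed body is already in memory; the name decode_chunked and the sample body are hypothetical, and chunk extensions (an optional part of the RFC 2616 syntax) are simply ignored.

```python
def decode_chunked(body: bytes) -> tuple[bytes, dict]:
    """Return (entity, trailers) from a chunked message body."""
    entity = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size_field = body[pos:line_end].decode("ascii")
        # Ignore any chunk extension after ';' when reading the size.
        size = int(size_field.split(";")[0].strip(), 16)
        pos = line_end + 2
        if size == 0:
            break                      # zero length marks the last chunk
        entity += body[pos:pos + size]
        pos += size + 2                # skip the chunk data and its CRLF
    trailers = {}
    while True:
        line_end = body.index(b"\r\n", pos)
        line = body[pos:line_end].decode("ascii")
        pos = line_end + 2
        if not line:
            break                      # an empty line ends the trailers
        name, _, value = line.partition(":")
        trailers[name.strip()] = value.strip()
    return bytes(entity), trailers


sample = (
    b"5\r\nHello\r\n"
    b"7\r\n, world\r\n"
    b"0\r\n"
    b"Expires: Sat, 27 Mar 2004 21:12:00 GMT\r\n"
    b"\r\n"
)
entity, trailers = decode_chunked(sample)
print(entity, trailers)
```

The trailer ends up in the same kind of name/value structure as an ordinary header, which is all that "processed just like regular entity headers" needs to mean here.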

Example Using the Content-Length Header and "Chunking"

Yes, I really did say that headers can actually be trailers, in which case a header called

Trailer lists each header that is actually a trailer. Perhaps an example would help clarify

matters somewhat? Suppose we have a server that contains a program that, when supplied

with a file name, returns a simple HTML response that contains the size and last

modification date of the file. This is obviously dynamic content, so the length of the response

cannot be determined in advance.

If the server were to buffer the entire output of this program (since it is small) it could

construct a conventional HTTP response using the Content-Length header, as shown in the

sample output of Table 279. Instead, chunking allows the server to send out parts of the

response as soon as they become available from the program. The equivalent output of

that example using chunked transfers is shown in Table 280; notice that the Expires header

is now a trailer, so it can be calculated based on the output of the program, and this is

indicated by the “Trailer: Expires” header. Remember that the Content-Length header

specifies the length as a decimal number while chunking specifies chunk lengths in

hexadecimal; the chunks in this example are 41, 5, 35, 29 and 19 bytes, respectively.
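A quick check of those sizes, and of the decimal versus hexadecimal distinction, can be done as follows. Exactly where the spaces fall between the five pieces is my assumption, chosen so that the pieces add up to the sizes quoted above.

```python
# The five pieces of the example response body, in the order sent.
pieces = [
    "<html><body><p>The file you requested is ",
    "3,400",
    " bytes long and was last modified: ",
    "Sat, 20 Mar 2004 21:12:00 GMT",
    ".</p></body></html>",
]

sizes = [len(p.encode("ascii")) for p in pieces]
hex_sizes = [format(n, "x") for n in sizes]

print(sizes)       # decimal size of each chunk
print(hex_sizes)   # the same sizes as written in a chunked body
print(sum(sizes))  # total, as a Content-Length header would state it
```

The total matches the Content-Length value used in the unchunked version of the response.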

Note: An HTTP/1.1 client can specify that it does not want to use persistent

connections by including the “Connection: close” header in its request. In this

case, the server does not have to use chunking in its response—since it will close

the connection after the first response message, the client knows that everything it receives

from the server is part of that response. However, some servers may use chunked transfers

anyway, even in this situation.

Table 279: Example HTTP Response Using Content-Length Header

HTTP/1.1 200 OK

Date: Mon, 22 Mar 2004 11:15:03 GMT

Content-Type: text/html

Content-Length: 129

Expires: Sat, 27 Mar 2004 21:12:00 GMT

<html><body><p>The file you requested is 3,400 bytes long and was

last modified: Sat, 20 Mar 2004 21:12:00 GMT.</p></body></html>

Table 280: Example HTTP Response Using Chunked Transfer Encoding

HTTP/1.1 200 OK

Date: Mon, 22 Mar 2004 11:15:03 GMT

Content-Type: text/html

Transfer-Encoding: chunked

Trailer: Expires

29

<html><body><p>The file you requested is 

5

3,400

23

 bytes long and was last modified: 

1d

Sat, 20 Mar 2004 21:12:00 GMT

13

.</p></body></html>

0

Expires: Sat, 27 Mar 2004 21:12:00 GMT

Key Concept: When chunked transfer encoding is used, the sender of the message

may move certain headers from the start of the message to the end, where they are

known as trailers. They are interpreted in the same way as normal headers by the

recipient. The special Trailer header is used in such messages to tell the recipient to look for

trailers after the body of the message.

HTTP Content Negotiation and "Quality Values"

Many Internet resources have only one representation, meaning a single way in which they

are stored or made available. In this situation, a client request to a server is an “all or

nothing” proposition. The client may specify conditions under which it would like the server

to send the resource, using the “If-” series of request headers. If the condition is met, the

resource will be sent in the server’s response in the one form in which it exists; if the

condition is not met, no entity will be returned.

Other resources, however, may have multiple representations. The most common example

would be a document that is available in multiple languages, or that is stored using more

than one character set. Similarly, a graphical image might exist in two different formats: one

a Tagged Image File Format (TIFF) file for those wanting maximum image quality despite

the large size of TIFF images; and a more compact JPEG file for those who need to see the

image quickly and don’t care as much about its quality level.

Content Negotiation Techniques

To provide flexibility in allowing clients to obtain the best version of resources that exist in

multiple forms, HTTP/1.1 defines a set of features that are collectively called content

negotiation. The standard defines two basic methods by which this negotiation may be performed:

☯ Server-Driven Negotiation: In this technique, the client includes headers in its

request that provide guidance to the server about its desired representation for the

resource. The server uses an algorithm that processes this information and provides

the version of the resource that it feels best matches the client’s preferences.

☯ Agent-Driven Negotiation: This method puts the client in charge of the negotiation

process. It first sends a preliminary request for the resource to the server. If the

resource is available in multiple forms, the server typically sends back a 300 (“Multiple

Choices”) response, which contains a list of the various representations in which the

resource is available. The client then sends a second request for the one that it

prefers.

Comparing Negotiation Methods

To draw an analogy, suppose a co-worker offers to go out at lunch-time to pick up lunch for

the two of you, at a new restaurant where neither of you has eaten before. You could

provide him with some parameters regarding what you like to eat—“I like roast beef

sandwiches, fish & chips, and pizza, but not chicken”—and then trust him to pick something

you will like. Or, he could go to the restaurant, call you on his cell phone, and read the menu

to you and let you make a selection. The first is like server-driven negotiation; the second,

like agent-driven negotiation.

I think this is a good analogy not only because it (hopefully) helps you see the differences

between the two methods, but it also highlights the key advantages and disadvantages of

each. Trusting your co-worker with your lunch selection is simple and efficient, but not

foolproof. It’s possible that the restaurant may not have any of the items you specified, or

that your friend may get you something containing another ingredient that you don’t like but

that you forgot to mention. Similarly, server-based negotiation is a “best-guess” process that

does not guarantee that the client will receive the resource in the format it wants. This is

exacerbated by the fact that there are only so many ways for the client to specify its

preferences using a handful of request headers.

Agent-based negotiation, on the other hand, allows the client to select exactly what it wants

from the available choices, just as you can choose your favorite dish from the menu of the

restaurant. The problem here is that it is inefficient, because two requests and responses

are required for each resource access. (Would you really want to read a restaurant’s menu

over the phone to someone so they could choose their ideal dish? ☺)

Key Concept: HTTP includes a feature called content negotiation that allows the

selection of a particular variation of a resource that has more than one representation.

There are two negotiation techniques: server-driven, where the client includes

headers in its request that indicate what it wants and the server does its best to select the

most appropriate variant; and agent-driven, where the server sends the client a list of the

available resource alternatives and the client chooses one.

Server-Based Negotiation in HTTP

In practice, server-based negotiation is the type that is most commonly used today. The

client specifies its preferences using a set of four request headers that indicate what it

would prefer in the representation of the resource. The headers each represent one

characteristic of a resource: Accept (media type); Accept-Charset (character set);

Accept-Encoding (content encoding); and Accept-Language (resource language). Any or all of

these may be included in the request.

Each “Accept-” header contains a list of acceptable values that is appropriate to the

characteristic that it specifies, separated by commas. For example, the Accept header lists media

types the client considers acceptable, while Accept-Language contains language tags.

Suppose you have a friend who is trilingual in English, French and Spanish. She can read a

particular document in any of these languages, so she might instruct her browser to include

the following header in her requests:

Accept-Language: en, fr, es

Weighting Preferences with "Quality Values"

Even better than simple acceptance lists, HTTP allows the client to weight each of the items

in such a list, to indicate which is preferred of the alternatives. This is done by adding a

decimal quality value after each parameter using the syntax “q=<value>”, which represents

the priority of that parameter relative to the others. The highest priority is 1 and the

lowest is 0; the default if no value is indicated is 1, while a value of 0 means the client is

specifically saying it is not willing to accept documents with that characteristic.

This is best illustrated by an example, so let’s take our trilingual friend again. This time, let’s

say she knows English, French and Spanish, but her French is a bit rusty (she hasn’t used

it in a while). Furthermore, she may need to share this document with a friend of hers who

only knows a little Spanish, so it would be best if she got the document in English. Finally,

she knows there is a German version of the resource that she definitely does not want. This

could be represented as follows:

Accept-Language: en, fr;q=0.3, es;q=0.7, de;q=0

Translated to English, this means “I would prefer that you send me the document in English. If not,

Spanish is okay, or French if that is all you have, but definitely don’t send it in German”.

Incidentally, the name “quality value” is the one used in the HTTP standard, but is really a

poor choice of terminology (which, to be fair, is also mentioned in the standard!). These

values do not really have anything to do with quality; for all we know, the German version of

this document may be the original and the others could be lousy translations. The “q” values

only specify the relative preference of the client making the request.
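Here is a rough sketch of how a server might read such a header and pick among the variants it has on hand. The function names parse_accept and choose are hypothetical, the header uses es (the registered language tag for Spanish), and real servers apply more elaborate matching rules than this.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-* header into (value, q) pairs, best first."""
    items = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        value, q = fields[0], 1.0            # q defaults to 1
        for param in fields[1:]:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val)
        if value:
            items.append((value, q))
    # sorted() is stable, so items with equal q keep their header order.
    return sorted(items, key=lambda item: item[1], reverse=True)


def choose(header: str, available: list[str]):
    """Pick the available variant with the highest non-zero q."""
    for value, q in parse_accept(header):
        if q > 0 and value in available:
            return value
    return None                              # nothing acceptable


prefs = "en, fr;q=0.3, es;q=0.7, de;q=0"
print(parse_accept(prefs))
print(choose(prefs, ["de", "fr", "es"]))     # no English variant on hand
print(choose(prefs, ["de"]))                 # only the refused variant
```

With no English version available, the Spanish one wins over French; if German is all the server has, nothing is acceptable.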

Finally, the “*” wildcard can be used in the Accept family of headers to represent “any

value”, or “everything else”. This is often used to tell the server “if you can’t find what I

specifically asked for, then here’s my preference level on the alternatives”. Let’s take an

example using the Accept header:

Accept: text/html, text/*;q=0.6, */*;q=0.1

This header represents the client saying “My preference (q=1) is an HTML text document. If

not available, I would prefer some other type of text document. Failing that, you may send

me any other type of document relevant to the requested resource.”
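The wildcard rules can be sketched the same way. This assumes the precedence rule from RFC 2616 that the most specific matching media range determines the q value; the function name quality is my own, and media type parameters other than q are ignored.

```python
def quality(media_type: str, accept: str) -> float:
    """Return the q value the Accept header assigns to media_type."""
    type_, _, subtype = media_type.partition("/")
    best_specificity, best_q = -1, 0.0
    for part in accept.split(","):
        fields = [f.strip() for f in part.split(";")]
        rng, q = fields[0], 1.0
        for param in fields[1:]:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val)
        r_type, _, r_sub = rng.partition("/")
        if r_type == "*" and r_sub == "*":
            specificity = 0                  # matches everything
        elif r_type == type_ and r_sub == "*":
            specificity = 1                  # matches the whole type
        elif r_type == type_ and r_sub == subtype:
            specificity = 2                  # exact match
        else:
            continue                         # range does not apply
        # The most specific matching range decides the q value.
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


accept = "text/html, text/*;q=0.6, */*;q=0.1"
for candidate in ["text/html", "text/plain", "image/png"]:
    print(candidate, quality(candidate, accept))
```

A type that matches no range at all gets a q of 0 in this sketch, meaning the client has not said it will accept it.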

Key Concept: Server-driven content negotiation is the type most often used in

HTTP. A client sending a request can include up to four different headers that provide

information about how the server should fill its request. These may include optional

quality values that specify the client’s relative preference amongst a set of alternative

resource characteristics such as media type, language, character set or encoding.

HTTP Features, Capabilities and Issues

The first four subsections of the large section covering the Hypertext Transfer Protocol were

meant to give you a good understanding of the fundamental concepts and basic operation

of the protocol. Modern HTTP, however, goes beyond the simple mechanics by which HTTP

requests and responses are exchanged. It includes a number of features and capabilities

that extend the basic protocol to improve performance and meet the various needs of

organizations using modern TCP/IP internetworks.

In this section, I complete my description of HTTP by discussing several important matters

that are essential to the operation of the modern World Wide Web. I begin with an overview

of HTTP caching, which is the single most important feature that promotes efficiency in Web

transactions. I discuss the different uses of proxies in HTTP and some of the issues

associated with them. I briefly examine the issues related to security and privacy in HTTP,

and conclude with a discussion of the matter of state management, and how it is

implemented despite HTTP being an inherently stateless protocol.

Background Information: This section assumes that you have already covered

the preceding ones in this larger section on HTTP. If you are not already familiar

with concepts such as the HTTP request/reply chain, HTTP message structure and

HTTP headers, you should review those materials first.

HTTP Caching Features and Issues

The explosive growth of the World Wide Web was a marvel for its users, but a nightmare for

networking engineers. The biggest problem that the burgeoning Web created was an

overloading of the internetworks over which it ran. Many of the features that were added to

HTTP/1.1 were designed specifically to improve the efficiency of the protocol and reduce

unnecessary bandwidth consumed by HTTP requests and responses. Arguably the most

important of these is a set of features designed to support caching.

The subject of caching comes up again and again in discussions of computers and

networking, because of a phenomenon that is widely observed in these technologies:

whenever a user, hardware device or software process requests a particular piece of data,

there is a good chance it will ask for it again in the near future. Thus, by storing recently-

retrieved items in a cache, we can eliminate duplicated effort. This is why caching plays an

important role in the efficiency of protocols such as ARP and DNS.

The Significance of Caching to HTTP

Caching is important to HTTP because Web users tend to request the same documents

over and over again. For example, in writing this section on HTTP, I made reference to RFC

2616 many, many times. Each time, I loaded it from a particular Web server. Since the

document never changes, it would be more efficient to just load it from a local cache rather

than having to retrieve it from the distant Web server each time.