<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns='http://www.w3.org/1999/xhtml'>
<head><title>HTTP/1.1: Introduction</title></head>
<body><address>part of <a rev='Section' href='rfc2616.html'>Hypertext Transfer Protocol -- HTTP/1.1</a><br />
RFC 2616 Fielding, et al.</address>
<h2><a id='sec1'>1</a> Introduction</h2>
<h3><a id='sec1.1'>1.1</a> Purpose</h3>
<p>
The Hypertext Transfer Protocol (HTTP) is an application-level
protocol for distributed, collaborative, hypermedia information
systems. HTTP has been in use by the World-Wide Web global
information initiative since 1990. The first version of HTTP,
referred to as HTTP/0.9, was a simple protocol for raw data transfer
across the Internet. HTTP/1.0, as defined by RFC 1945 <a rel='bibref' href='rfc2616-sec17.html#bib6'>[6]</a>, improved
the protocol by allowing messages to be in the format of MIME-like
messages, containing metainformation about the data transferred and
modifiers on the request/response semantics. However, HTTP/1.0 does
not sufficiently take into consideration the effects of hierarchical
proxies, caching, the need for persistent connections, or virtual
hosts. In addition, the proliferation of incompletely-implemented
applications calling themselves "HTTP/1.0" has necessitated a
protocol version change in order for two communicating applications
to determine each other's true capabilities.
</p>
<p>
This specification defines the protocol referred to as "HTTP/1.1".
This protocol includes more stringent requirements than HTTP/1.0 in
order to ensure reliable implementation of its features.
</p>
<p>
Practical information systems require more functionality than simple
retrieval, including search, front-end update, and annotation. HTTP
allows an open-ended set of methods and headers that indicate the
purpose of a request <a rel='bibref' href='rfc2616-sec17.html#bib47'>[47]</a>. It builds on the discipline of reference
provided by the Uniform Resource Identifier (URI) <a rel='bibref' href='rfc2616-sec17.html#bib3'>[3]</a>, as a location
(URL) <a rel='bibref' href='rfc2616-sec17.html#bib4'>[4]</a> or name (URN) <a rel='bibref' href='rfc2616-sec17.html#bib20'>[20]</a>, for indicating the resource to which a
method is to be applied. Messages are passed in a format similar to
that used by Internet mail [9] as defined by the Multipurpose
Internet Mail Extensions (MIME) <a rel='bibref' href='rfc2616-sec17.html#bib7'>[7]</a>.
</p>
<p>
HTTP is also used as a generic protocol for communication between
user agents and proxies/gateways to other Internet systems, including
those supported by the SMTP <a rel='bibref' href='rfc2616-sec17.html#bib16'>[16]</a>, NNTP <a rel='bibref' href='rfc2616-sec17.html#bib13'>[13]</a>, FTP <a rel='bibref' href='rfc2616-sec17.html#bib18'>[18]</a>, Gopher <a rel='bibref' href='rfc2616-sec17.html#bib2'>[2]</a>,
and WAIS <a rel='bibref' href='rfc2616-sec17.html#bib10'>[10]</a> protocols. In this way, HTTP allows basic hypermedia
access to resources available from diverse applications.
</p>
<h3><a id='sec1.2'>1.2</a> Requirements</h3>
<p>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 <a rel='bibref' href='rfc2616-sec17.html#bib34'>[34]</a>.
</p>
<p>
An implementation is not compliant if it fails to satisfy one or more
of the MUST or REQUIRED level requirements for the protocols it
implements. An implementation that satisfies all the MUST or REQUIRED
level and all the SHOULD level requirements for its protocols is said
to be "unconditionally compliant"; one that satisfies all the MUST
level requirements but not all the SHOULD level requirements for its
protocols is said to be "conditionally compliant."
</p>
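<p>
The three compliance classes can be read as a small decision rule. The
following Python sketch is illustrative only and is not part of the
specification: the <code>RequirementResult</code> type and the
<code>classify_compliance</code> function are hypothetical names, and the
sketch assumes each requirement has already been checked and tagged with its
level (REQUIRED is treated as MUST).
</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequirementResult:
    """Outcome of checking one requirement (hypothetical helper type)."""
    level: str        # "MUST" or "SHOULD"; REQUIRED is recorded as "MUST"
    satisfied: bool


def classify_compliance(results):
    """Map checked requirements onto the compliance classes of section 1.2."""
    must_ok = all(r.satisfied for r in results if r.level == "MUST")
    should_ok = all(r.satisfied for r in results if r.level == "SHOULD")
    if not must_ok:
        # Failing any MUST-level requirement means not compliant at all.
        return "not compliant"
    if should_ok:
        return "unconditionally compliant"
    # All MUST-level requirements hold, but some SHOULD-level one does not.
    return "conditionally compliant"
```

<p>
For instance, an implementation that meets every MUST but skips one SHOULD
would be classified as "conditionally compliant" by this sketch.
</p>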
<h3><a id='sec1.3'>1.3</a> Terminology</h3>
<p>
This specification uses a number of terms to refer to the roles
played by participants in, and objects of, the HTTP communication.
</p>
<dl>
<dt> connection
</dt> <dd> A transport layer virtual circuit established between two programs
for the purpose of communication.
</dd>
<dt> message
</dt> <dd> The basic unit of HTTP communication, consisting of a structured
sequence of octets matching the syntax defined in section 4 and
transmitted via the connection.
</dd>
<dt> request
</dt> <dd> An HTTP request message, as defined in section 5.
</dd>
<dt> response
</dt> <dd> An HTTP response message, as defined in section 6.
</dd>
<dt> resource
</dt> <dd> A network data object or service that can be identified by a URI,
as defined in section <a rel='xref' href='rfc2616-sec3.html#sec3.2'>3.2</a>. Resources may be available in multiple
representations (e.g. multiple languages, data formats, size, and
resolutions) or vary in other ways.
</dd>
<dt> entity
</dt> <dd> The information transferred as the payload of a request or
response. An entity consists of metainformation in the form of
entity-header fields and content in the form of an entity-body, as
described in section 7.
</dd>
<dt> representation
</dt> <dd> An entity included with a response that is subject to content
negotiation, as described in section 12. There may exist multiple
representations associated with a particular response status.
</dd>
<dt> content negotiation
</dt> <dd> The mechanism for selecting the appropriate representation when
servicing a request, as described in section 12. The
representation of entities in any response can be negotiated
(including error responses).
</dd>
<dt> variant
</dt> <dd> A resource may have one, or more than one, representation(s)
associated with it at any given instant. Each of these
representations is termed a `variant'. Use of the term `variant'
does not necessarily imply that the resource is subject to content
negotiation.
</dd>
<dt> client
</dt> <dd> A program that establishes connections for the purpose of sending
requests.
</dd>
<dt> user agent
</dt> <dd> The client which initiates a request. These are often browsers,
editors, spiders (web-traversing robots), or other end user tools.
</dd>
<dt> server
</dt> <dd> An application program that accepts connections in order to
service requests by sending back responses. Any given program may
be capable of being both a client and a server; our use of these
terms refers only to the role being performed by the program for a
particular connection, rather than to the program's capabilities
in general. Likewise, any server may act as an origin server,
proxy, gateway, or tunnel, switching behavior based on the nature
of each request.
</dd>
<dt> origin server
</dt> <dd> The server on which a given resource resides or is to be created.
</dd>
<dt> proxy
</dt> <dd> An intermediary program which acts as both a server and a client
for the purpose of making requests on behalf of other clients.
Requests are serviced internally or by passing them on, with
possible translation, to other servers. A proxy MUST implement
both the client and server requirements of this specification. A
"transparent proxy" is a proxy that does not modify the request or
response beyond what is required for proxy authentication and
identification. A "non-transparent proxy" is a proxy that modifies
the request or response in order to provide some added service to
the user agent, such as group annotation services, media type
transformation, protocol reduction, or anonymity filtering. Except
where either transparent or non-transparent behavior is explicitly
stated, the HTTP proxy requirements apply to both types of
proxies.
</dd>
<dt> gateway
</dt> <dd> A server which acts as an intermediary for some other server.
Unlike a proxy, a gateway receives requests as if it were the
origin server for the requested resource; the requesting client
may not be aware that it is communicating with a gateway.
</dd>
<dt> tunnel
</dt> <dd> An intermediary program which is acting as a blind relay between
two connections. Once active, a tunnel is not considered a party
to the HTTP communication, though the tunnel may have been
initiated by an HTTP request. The tunnel ceases to exist when both
ends of the relayed connections are closed.
</dd>
<dt> cache
</dt> <dd> A program's local store of response messages and the subsystem
that controls its message storage, retrieval, and deletion. A
cache stores cacheable responses in order to reduce the response
time and network bandwidth consumption on future, equivalent
requests. Any client or server may include a cache, though a cache
cannot be used by a server that is acting as a tunnel.
</dd>
<dt> cacheable
</dt> <dd> A response is cacheable if a cache is allowed to store a copy of
the response message for use in answering subsequent requests. The
rules for determining the cacheability of HTTP responses are
defined in section 13. Even if a resource is cacheable, there may
be additional constraints on whether a cache can use the cached
copy for a particular request.
</dd>
<dt> first-hand
</dt> <dd> A response is first-hand if it comes directly and without
unnecessary delay from the origin server, perhaps via one or more
proxies. A response is also first-hand if its validity has just
been checked directly with the origin server.
</dd>
<dt> explicit expiration time
</dt> <dd> The time at which the origin server intends that an entity should
no longer be returned by a cache without further validation.
</dd>
<dt> heuristic expiration time
</dt> <dd> An expiration time assigned by a cache when no explicit expiration
time is available.
</dd>
<dt> age
</dt> <dd> The age of a response is the time since it was sent by, or
successfully validated with, the origin server.
</dd>
<dt> freshness lifetime
</dt> <dd> The length of time between the generation of a response and its
expiration time.
</dd>
<dt> fresh
</dt> <dd> A response is fresh if its age has not yet exceeded its freshness
lifetime.
</dd>
<dt> stale
</dt> <dd> A response is stale if its age has passed its freshness lifetime.
</dd>
<dt> semantically transparent
</dt> <dd> A cache behaves in a "semantically transparent" manner, with
respect to a particular response, when its use affects neither the
requesting client nor the origin server, except to improve
performance. When a cache is semantically transparent, the client
receives exactly the same response (except for hop-by-hop headers)
that it would have received had its request been handled directly
by the origin server.
</dd>
<dt> validator
</dt> <dd> A protocol element (e.g., an entity tag or a Last-Modified time)
that is used to find out whether a cache entry is an equivalent
copy of an entity.
</dd>
<dt> upstream/downstream
</dt> <dd> Upstream and downstream describe the flow of a message: all
messages flow from upstream to downstream.
</dd>
<dt> inbound/outbound
</dt> <dd> Inbound and outbound refer to the request and response paths for
messages: "inbound" means "traveling toward the origin server",
and "outbound" means "traveling toward the user agent".
</dd>
</dl>
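<p>
The caching terms defined above (age, freshness lifetime, fresh, stale)
reduce to a simple comparison. The following Python sketch is illustrative
only: the function names and the use of plain seconds are assumptions, and
the boundary case in which the age equals the freshness lifetime is treated
as fresh here, following the wording "has not yet exceeded". The normative
calculations are the ones given in section 13.
</p>

```python
def freshness_lifetime(generated_at, expires_at):
    """Length of time between generation of a response and its expiration time.

    Both arguments are timestamps in seconds (an assumption of this sketch).
    """
    return expires_at - generated_at


def is_fresh(age, lifetime):
    """Fresh: the age has not yet exceeded the freshness lifetime."""
    return age <= lifetime


def is_stale(age, lifetime):
    """Stale: the age has passed the freshness lifetime."""
    return age > lifetime


# Illustrative values: a response generated at t=1000 s that expires at t=1600 s.
lifetime = freshness_lifetime(generated_at=1000, expires_at=1600)
print(is_fresh(age=30, lifetime=lifetime))    # age well inside the lifetime
print(is_stale(age=601, lifetime=lifetime))   # age past the lifetime
```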
<h3><a id='sec1.4'>1.4</a> Overall Operation</h3>
<p>
The HTTP protocol is a request/response protocol. A client sends a
request to the server in the form of a request method, URI, and
protocol version, followed by a MIME-like message containing request
modifiers, client information, and possible body content over a
connection with a server. The server responds with a status line,
including the message's protocol version and a success or error code,
followed by a MIME-like message containing server information, entity
metainformation, and possible entity-body content. The relationship
between HTTP and MIME is described in appendix <a rel='xref' href='rfc2616-sec19.html#sec19.4'>19.4</a>.
</p>
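<p>
As a rough illustration of the exchange just described, the Python sketch
below builds a request (method, URI, and protocol version followed by header
fields and an optional body) and splits a status line into its protocol
version, status code, and reason phrase. The helper names
<code>build_request</code> and <code>parse_status_line</code>, the host
<code>www.example.org</code>, and the path are hypothetical; the CRLF
framing follows the message syntax defined in later sections, and the sketch
performs no validation.
</p>

```python
def build_request(method, uri, headers, body=b"", version="HTTP/1.1"):
    """Serialize a request line, header fields, and an optional body."""
    lines = [f"{method} {uri} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # An empty line separates the header fields from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_status_line(line):
    """Split a status line into (protocol version, status code, reason phrase)."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


# Hypothetical request for an illustrative resource.
raw = build_request("GET", "/index.html", {"Host": "www.example.org"})
print(raw)
print(parse_status_line("HTTP/1.1 200 OK\r\n"))
```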
<p>
Most HTTP communication is initiated by a user agent and consists of
a request to be applied to a resource on some origin server. In the
simplest case, this may be accomplished via a single connection (v)
between the user agent (UA) and the origin server (O).
</p>
<pre>   request chain ------------------------&gt;
UA -------------------v------------------- O
   &lt;----------------------- response chain
</pre>
<p>
A more complicated situation occurs when one or more intermediaries
are present in the request/response chain. There are three common
forms of intermediary: proxy, gateway, and tunnel. A proxy is a
forwarding agent, receiving requests for a URI in its absolute form,
rewriting all or part of the message, and forwarding the reformatted
request toward the server identified by the URI. A gateway is a
receiving agent, acting as a layer above some other server(s) and, if
necessary, translating the requests to the underlying server's
protocol. A tunnel acts as a relay point between two connections
without changing the messages; tunnels are used when the
communication needs to pass through an intermediary (such as a
firewall) even when the intermediary cannot understand the contents
of the messages.
</p>
<pre>   request chain --------------------------------------&gt;
UA -----v----- A -----v----- B -----v----- C -----v----- O
   &lt;------------------------------------- response chain
</pre>
<p>
The figure above shows three intermediaries (A, B, and C) between the
user agent and origin server. A request or response message that
travels the whole chain will pass through four separate connections.
This distinction is important because some HTTP communication options
may apply only to the connection with the nearest, non-tunnel
neighbor, only to the end-points of the chain, or to all connections
along the chain. Although the diagram is linear, each participant may
be engaged in multiple, simultaneous communications. For example, B
may be receiving requests from many clients other than A, and/or
forwarding requests to servers other than C, at the same time that it
is handling A's request.
</p>
<p>
Any party to the communication which is not acting as a tunnel may
employ an internal cache for handling requests. The effect of a cache
is that the request/response chain is shortened if one of the
participants along the chain has a cached response applicable to that
request. The following illustrates the resulting chain if B has a
cached copy of an earlier response from O (via C) for a request which
has not been cached by UA or A.
</p>
<pre>   request chain ----------&gt;
UA -----v----- A -----v----- B - - - - - - C - - - - - - O
   &lt;--------- response chain
</pre>
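<p>
The shortening effect of a cache can be expressed as a small calculation
over the chain of participants. The Python sketch below is illustrative
only: the function names are hypothetical, participants are represented by
the labels used in the figures, and the sketch assumes that the participant
named in <code>cached_at</code> is not acting as a tunnel, since a tunnel
cannot use a cache.
</p>

```python
def effective_chain(chain, cached_at=None):
    """Return the part of the chain a request actually travels.

    chain lists participants from user agent to origin server.
    cached_at names the participant that holds an applicable cached
    response, or None when the request must reach the origin server.
    """
    if cached_at is None:
        return list(chain)
    return list(chain[: chain.index(cached_at) + 1])


def connections_used(chain, cached_at=None):
    """Each adjacent pair of participants is joined by one connection."""
    return len(effective_chain(chain, cached_at)) - 1


chain = ["UA", "A", "B", "C", "O"]
print(connections_used(chain))                  # whole chain, three intermediaries
print(effective_chain(chain, cached_at="B"))    # chain shortened at B's cache
```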
<p>
Not all responses are usefully cacheable, and some requests may
contain modifiers which place special requirements on cache behavior.
HTTP requirements for cache behavior and cacheable responses are
defined in section 13.
</p>
<p>
In fact, there are a wide variety of architectures and configurations
of caches and proxies currently being experimented with or deployed
across the World Wide Web. These systems include national hierarchies
of proxy caches to save transoceanic bandwidth, systems that
broadcast or multicast cache entries, organizations that distribute
subsets of cached data via CD-ROM, and so on. HTTP systems are used
in corporate intranets over high-bandwidth links, and for access via
PDAs with low-power radio links and intermittent connectivity. The
goal of HTTP/1.1 is to support the wide diversity of configurations
already deployed while introducing protocol constructs that meet the
needs of those who build web applications that require high
reliability and, failing that, at least reliable indications of
failure.
</p>
<p>
HTTP communication usually takes place over TCP/IP connections. The
default port is TCP 80 [19], but other ports can be used. This does
not preclude HTTP from being implemented on top of any other protocol
on the Internet, or on other networks. HTTP only presumes a reliable
transport; any protocol that provides such guarantees can be used;
the mapping of the HTTP/1.1 request and response structures onto the
transport data units of the protocol in question is outside the scope
of this specification.
</p>
<p>
In HTTP/1.0, most implementations used a new connection for each
request/response exchange. In HTTP/1.1, a connection may be used for
one or more request/response exchanges, although connections may be
closed for a variety of reasons (see section <a rel='xref' href='rfc2616-sec8.html#sec8.1'>8.1</a>).
</p>
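<p>
Connection reuse can be observed with a small local experiment. The Python
sketch below is an illustration under stated assumptions, not a reference
implementation: it starts a throwaway server on the loopback interface using
the standard library, sends two requests through one
<code>http.client.HTTPConnection</code>, and records the client's local port
after each exchange. The handler name <code>EchoPathHandler</code> and the
request paths are hypothetical.
</p>

```python
import http.client
import socketserver
import threading
from http.server import BaseHTTPRequestHandler


class EchoPathHandler(BaseHTTPRequestHandler):
    # Announcing HTTP/1.1 lets the standard-library handler keep the
    # connection open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        # A Content-Length is needed so the client knows where the body ends.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_twice_on_one_connection():
    """Send two requests over a single connection to a local test server."""
    server = socketserver.TCPServer(("127.0.0.1", 0), EchoPathHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    results, local_ports = [], set()
    try:
        for path in ("/first", "/second"):
            conn.request("GET", path)
            response = conn.getresponse()
            results.append((response.status, response.read()))
            # The same local port after each exchange indicates the same
            # underlying TCP connection was used.
            local_ports.add(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results, local_ports
```

<p>
If both exchanges report the same local port, the second request travelled
over the connection opened for the first rather than over a new one.
</p>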
</body></html>