You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
187 lines
8.8 KiB
187 lines
8.8 KiB
<!DOCTYPE html
|
|
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
|
|
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
|
<html xmlns='http://www.w3.org/1999/xhtml'>
|
|
<head><title>HTTP/1.1: Notational Conventions and Generic Grammar</title></head>
|
|
<body><address>part of <a rev='Section' href='rfc2616.html'>Hypertext Transfer Protocol -- HTTP/1.1</a><br />
|
|
RFC 2616 Fielding, et al.</address>
|
|
<h2><a id='sec2'>2</a> Notational Conventions and Generic Grammar</h2>
|
|
<h3><a id='sec2.1'>2.1</a> Augmented BNF</h3>
|
|
<p>
|
|
All of the mechanisms specified in this document are described in
|
|
both prose and an augmented Backus-Naur Form (BNF) similar to that
|
|
used by RFC 822 <a rel='bibref' href='rfc2616-sec17.html#bib9'>[9]</a>. Implementors will need to be familiar with the
|
|
notation in order to understand this specification. The augmented BNF
|
|
includes the following constructs:
|
|
</p>
|
|
<dl>
|
|
<dt> name = definition
|
|
</dt> <dd> The name of a rule is simply the name itself (without any
|
|
enclosing "<" and ">") and is separated from its definition by the
|
|
equal "=" character. White space is only significant in that
|
|
indentation of continuation lines is used to indicate a rule
|
|
definition that spans more than one line. Certain basic rules are
|
|
in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. Angle
|
|
brackets are used within definitions whenever their presence will
|
|
facilitate discerning the use of rule names.
|
|
</dd>
|
|
<dt> "literal"
|
|
</dt> <dd> Quotation marks surround literal text. Unless stated otherwise,
|
|
the text is case-insensitive.
|
|
</dd>
|
|
<dt> rule1 | rule2
|
|
</dt> <dd> Elements separated by a bar ("|") are alternatives, e.g., "yes |
|
|
no" will accept yes or no.
|
|
</dd>
|
|
<dt> (rule1 rule2)
|
|
</dt> <dd> Elements enclosed in parentheses are treated as a single element.
|
|
Thus, "(elem (foo | bar) elem)" allows the token sequences "elem
|
|
foo elem" and "elem bar elem".
|
|
</dd>
|
|
<dt> *rule
|
|
</dt> <dd> The character "*" preceding an element indicates repetition. The
|
|
full form is "<n>*<m>element" indicating at least <n> and at most
|
|
<m> occurrences of element. Default values are 0 and infinity so
|
|
that "*(element)" allows any number, including zero; "1*element"
|
|
requires at least one; and "1*2element" allows one or two.
|
|
</dd>
|
|
<dt> [rule]
|
|
</dt> <dd> Square brackets enclose optional elements; "[foo bar]" is
|
|
equivalent to "*1(foo bar)".
|
|
</dd>
|
|
<dt> N rule
|
|
</dt> <dd> Specific repetition: "<n>(element)" is equivalent to
|
|
"<n>*<n>(element)"; that is, exactly <n> occurrences of (element).
|
|
Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three
|
|
alphabetic characters.
|
|
</dd>
|
|
<dt> #rule
|
|
</dt> <dd> A construct "#" is defined, similar to "*", for defining lists of
|
|
elements. The full form is "<n>#<m>element" indicating at least
|
|
<n> and at most <m> elements, each separated by one or more commas
|
|
(",") and OPTIONAL linear white space (LWS). This makes the usual
|
|
form of lists very easy; a rule such as
|
|
<pre> ( *LWS element *( *LWS "," *LWS element ))
|
|
</pre> can be shown as
|
|
<pre> 1#element
|
|
</pre> Wherever this construct is used, null elements are allowed, but do
|
|
not contribute to the count of elements present. That is,
|
|
"(element), , (element) " is permitted, but counts as only two
|
|
elements. Therefore, where at least one element is required, at
|
|
least one non-null element MUST be present. Default values are 0
|
|
and infinity so that "#element" allows any number, including zero;
|
|
"1#element" requires at least one; and "1#2element" allows one or
|
|
two.
|
|
</dd>
|
|
<dt> ; comment
|
|
</dt> <dd> A semi-colon, set off some distance to the right of rule text,
|
|
starts a comment that continues to the end of line. This is a
|
|
simple way of including useful notes in parallel with the
|
|
specifications.
|
|
</dd>
|
|
<dt> implied *LWS
|
|
</dt> <dd> The grammar described by this specification is word-based. Except
|
|
where noted otherwise, linear white space (LWS) can be included
|
|
between any two adjacent words (token or quoted-string), and
|
|
between adjacent words and separators, without changing the
|
|
interpretation of a field. At least one delimiter (LWS and/or
|
|
</dd>
|
|
<dt></dt> <dd> separators) MUST exist between any two tokens (for the definition
|
|
of "token" below), since they would otherwise be interpreted as a
|
|
single token.
|
|
</dd>
|
|
</dl>
|
|
<h3><a id='sec2.2'>2.2</a> Basic Rules</h3>
|
|
<p>
|
|
The following rules are used throughout this specification to
|
|
describe basic parsing constructs. The US-ASCII coded character set
|
|
is defined by ANSI X3.4-1986 <a rel='bibref' href='rfc2616-sec17.html#bib21'>[21]</a>.
|
|
</p>
|
|
<pre> OCTET = <any 8-bit sequence of data>
|
|
CHAR = <any US-ASCII character (octets 0 - 127)>
|
|
UPALPHA = <any US-ASCII uppercase letter "A".."Z">
|
|
LOALPHA = <any US-ASCII lowercase letter "a".."z">
|
|
ALPHA = UPALPHA | LOALPHA
|
|
DIGIT = <any US-ASCII digit "0".."9">
|
|
CTL = <any US-ASCII control character
|
|
(octets 0 - 31) and DEL (127)>
|
|
CR = <US-ASCII CR, carriage return (13)>
|
|
LF = <US-ASCII LF, linefeed (10)>
|
|
SP = <US-ASCII SP, space (32)>
|
|
HT = <US-ASCII HT, horizontal-tab (9)>
|
|
<"> = <US-ASCII double-quote mark (34)>
|
|
</pre>
|
|
<p>
|
|
HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all
|
|
protocol elements except the entity-body (see appendix 19.3 for
|
|
tolerant applications). The end-of-line marker within an entity-body
|
|
is defined by its associated media type, as described in section <a rel='xref' href='rfc2616-sec3.html#sec3.7'>3.7</a>.
|
|
</p>
|
|
<pre> CRLF = CR LF
|
|
</pre>
|
|
<p>
|
|
HTTP/1.1 header field values can be folded onto multiple lines if the
|
|
continuation line begins with a space or horizontal tab. All linear
|
|
white space, including folding, has the same semantics as SP. A
|
|
recipient MAY replace any linear white space with a single SP before
|
|
interpreting the field value or forwarding the message downstream.
|
|
</p>
|
|
<pre> LWS = [CRLF] 1*( SP | HT )
|
|
</pre>
|
|
<p>
|
|
The TEXT rule is only used for descriptive field contents and values
|
|
that are not intended to be interpreted by the message parser. Words
|
|
of *TEXT MAY contain characters from character sets other than ISO-
|
|
8859-1 <a rel='bibref' href='rfc2616-sec17.html#bib22'>[22]</a> only when encoded according to the rules of RFC 2047
|
|
<a rel='bibref' href='rfc2616-sec17.html#bib14'>[14]</a>.
|
|
</p>
|
|
<pre> TEXT = <any OCTET except CTLs,
|
|
but including LWS>
|
|
</pre>
|
|
<p>
|
|
A CRLF is allowed in the definition of TEXT only as part of a header
|
|
field continuation. It is expected that the folding LWS will be
|
|
replaced with a single SP before interpretation of the TEXT value.
|
|
</p>
|
|
<p>
|
|
Hexadecimal numeric characters are used in several protocol elements.
|
|
</p>
|
|
<pre> HEX = "A" | "B" | "C" | "D" | "E" | "F"
|
|
| "a" | "b" | "c" | "d" | "e" | "f" | DIGIT
|
|
</pre>
|
|
<p>
|
|
Many HTTP/1.1 header field values consist of words separated by LWS
|
|
or special characters. These special characters MUST be in a quoted
|
|
string to be used within a parameter value (as defined in section
|
|
<a rel='xref' href='rfc2616-sec3.html#sec3.6'>3.6</a>).
|
|
</p>
|
|
<pre> token = 1*<any CHAR except CTLs or separators>
|
|
separators = "(" | ")" | "<" | ">" | "@"
|
|
| "," | ";" | ":" | "\" | <">
|
|
| "/" | "[" | "]" | "?" | "="
|
|
| "{" | "}" | SP | HT
|
|
</pre>
|
|
<p>
|
|
Comments can be included in some HTTP header fields by surrounding
|
|
the comment text with parentheses. Comments are only allowed in
|
|
fields containing "comment" as part of their field value definition.
|
|
In all other fields, parentheses are considered part of the field
|
|
value.
|
|
</p>
|
|
<pre> comment = "(" *( ctext | quoted-pair | comment ) ")"
|
|
ctext = <any TEXT excluding "(" and ")">
|
|
</pre>
|
|
<p>
|
|
A string of text is parsed as a single word if it is quoted using
|
|
double-quote marks.
|
|
</p>
|
|
<pre> quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
|
|
qdtext = <any TEXT except <">>
|
|
</pre>
|
|
<p>
|
|
The backslash character ("\") MAY be used as a single-character
|
|
quoting mechanism only within quoted-string and comment constructs.
|
|
</p>
|
|
<pre> quoted-pair = "\" CHAR
|
|
</pre>
|
|
</body></html>
|