Add material about the "r" and "u" prefixes for string literals; should

author Fred Drake <fdrake@acm.org>

Tue, 14 Aug 2001 21:43:31 +0000 (21:43 +0000)

committer Fred Drake <fdrake@acm.org>

Tue, 14 Aug 2001 21:43:31 +0000 (21:43 +0000)
author Fred Drake <fdrake@acm.org>
Tue, 14 Aug 2001 21:43:31 +0000 (21:43 +0000)
committer Fred Drake <fdrake@acm.org>
Tue, 14 Aug 2001 21:43:31 +0000 (21:43 +0000)
diff --git a/Doc/ref/ref2.tex b/Doc/ref/ref2.tex

index 0e4ea26ea7e86569deaa34aafa680730d0c49e7d..3ccfaefe2608e39c3e30fa8fb68d96d8f849002d 100644 (file)
--- a/Doc/ref/ref2.tex
+++ b/Doc/ref/ref2.tex
@@ -311,7 +311,9 @@ String literals are described by the following lexical definitions:
  \index{ASCII@\ASCII{}}
  \begin{productionlist}
    \production{stringliteral}
-             {\token{shortstring} | \token{longstring}}
+             {[\token{stringprefix}](\token{shortstring} | \token{longstring})}
+  \production{stringprefix}
+             {"r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"}
    \production{shortstring}
               {"'" \token{shortstringitem}* "'"
                | '"' \token{shortstringitem}* '"'}
@@ -325,11 +327,15 @@ String literals are described by the following lexical definitions:
    \production{shortstringchar}
               {<any ASCII character except "\e" or newline or the quote>}
    \production{longstringchar}
-             {<any ASCII character except "\e">}
+             {<any ASCII characteru except "\e">}
    \production{escapeseq}
               {"\e" <any ASCII character>}
  \end{productionlist}
  
+One syntactic restriction not indicated by these productions is that
+whitespace is not allowed between the \grammartoken{stringprefix} and
+the rest of the string literal.
+
  \index{triple-quoted string}
  \index{Unicode Consortium}
  \index{string!Unicode}
@@ -340,12 +346,14 @@ are generally referred to as \emph{triple-quoted strings}).  The
  backslash (\code{\e}) character is used to escape characters that
  otherwise have a special meaning, such as newline, backslash itself,
  or the quote character.  String literals may optionally be prefixed
-with a letter `r' or `R'; such strings are called
-\dfn{raw strings}\index{raw string} and use different rules for
+with a letter `r' or `R'; such strings are called \dfn{raw
+strings}\index{raw string} and use different rules for interpreting
  backslash escape sequences.  A prefix of 'u' or 'U' makes the string
  a Unicode string.  Unicode strings use the Unicode character set as
  defined by the Unicode Consortium and ISO~10646.  Some additional
  escape sequences, described below, are available in Unicode strings.
+The two prefix characters may be combined; in this case, `u' must
+appear before `r'.
  
  In triple-quoted strings,
  unescaped newlines and quotes are allowed (and are retained), except
@@ -374,10 +382,8 @@ to those used by Standard C.  The recognized escape sequences are:
         {Character named \var{name} in the Unicode database (Unicode only)}
  \lineii{\e r}  {\ASCII{} Carriage Return (CR)}
  \lineii{\e t}  {\ASCII{} Horizontal Tab (TAB)}
-\lineii{\e u\var{xxxx}}
-       {Character with 16-bit hex value \var{xxxx} (Unicode only)}
-\lineii{\e U\var{xxxxxxxx}}
-       {Character with 32-bit hex value \var{xxxxxxxx} (Unicode only)}
+\lineii{\e u\var{xxxx}}    {Character with 16-bit hex value \var{xxxx} (Unicode only)}
+\lineii{\e U\var{xxxxxxxx}}{Character with 32-bit hex value \var{xxxxxxxx} (Unicode only)}
  \lineii{\e v}  {\ASCII{} Vertical Tab (VT)}
  \lineii{\e\var{ooo}} {\ASCII{} character with octal value \var{ooo}}
  \lineii{\e x\var{hh}} {\ASCII{} character with hex value \var{hh}}
author	Fred Drake <fdrake@acm.org>
	Tue, 14 Aug 2001 21:43:31 +0000 (21:43 +0000)
committer	Fred Drake <fdrake@acm.org>
	Tue, 14 Aug 2001 21:43:31 +0000 (21:43 +0000)