Qindex Programming Tips
62 [Quick Reference] Strings & Escapes
written by Qindex at 2006-10-30 00:29

-----------------------------------------------------------------------------------
CSS2 Identifiers & Strings
---------------------

All CSS style sheets are case-insensitive, except for the parts that are not under the control of CSS: values of the HTML attributes "id" and "class", font names, and URIs.
Identifiers can contain only the characters [A-Za-z0-9] and ISO 10646 characters 161 and higher, plus the hyphen (-). They cannot start with a hyphen or a digit. They can also contain escaped characters and any ISO 10646 character as a numeric code.
Character escapes:

    * Inside a string, a backslash followed by a newline is ignored.
    * Any character (except a hexadecimal digit) can be escaped with a backslash to remove its special meaning.
    * Characters that authors cannot easily put in a document can be written as a backslash followed by at most six hexadecimal digits (0..9A..F), which stand for the ISO 10646 character with that number. If fewer than six hexadecimal digits are used, authors should add one whitespace character after the escape, which will be ignored.

Backslash escapes are always considered to be part of an identifier or a string (i.e., "\7B" is not punctuation, even though "{" is, and "\32" is allowed at the start of a class name, even though "2" is not).
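
The identifier and escape rules above can be sketched in Python. The helper name css_escape_identifier is hypothetical, and the sketch follows only the CSS2 rules as summarised in this section (later CSS revisions relax some of them), so treat it as an illustration rather than a complete serialiser.

```python
import string

# Characters a CSS2 identifier may contain unescaped (besides code points >= 161).
_PLAIN = set(string.ascii_letters + string.digits + "-")


def css_escape_identifier(name: str) -> str:
    """Escape `name` so it can be used as a CSS2 identifier (illustrative)."""
    out = []
    for index, ch in enumerate(name):
        allowed = ch in _PLAIN or ord(ch) >= 161
        # An identifier cannot start with a digit or a hyphen.
        if index == 0 and (ch in string.digits or ch == "-"):
            allowed = False
        if allowed:
            out.append(ch)
        else:
            # Hexadecimal escape; the trailing space ends the escape and is ignored.
            out.append("\\%X " % ord(ch))
    return "".join(out)


print(css_escape_identifier("2col"))  # \32 col
print(css_escape_identifier("a{b"))   # a\7B b
```

Always emitting the trailing space keeps the sketch simple: it is only required when the next character could be read as another hexadecimal digit.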



-----------------------------------------------------------------------------------
HTML 4 Attribute Value
-------------------

Attribute values are delimited using either double quotation marks or single quotation marks.
Single quote marks can be included within the attribute value when the value is delimited by double quote marks, and vice versa.
Attribute values are generally case-insensitive.
Characters may be represented as numeric character references (such as &#34;) or character entity references (such as &quot;).
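
As a rough illustration of these quoting rules, the Python standard library can produce and decode such references. The helper name quote_attribute is hypothetical; html.escape and html.unescape are standard library functions.

```python
import html


def quote_attribute(value: str) -> str:
    """Return `value` as a double-quoted HTML attribute value (illustrative)."""
    # html.escape replaces &, < and >, and with quote=True both quote characters.
    return '"' + html.escape(value, quote=True) + '"'


print(quote_attribute('a "b" & c'))  # "a &quot;b&quot; &amp; c"

# The numeric reference and the entity reference decode to the same character.
print(html.unescape("&#34;") == html.unescape("&quot;"))  # True
```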



-----------------------------------------------------------------------------------
JavaScript 1.5 String Literals
------------------------

A string literal is zero or more characters enclosed in double (") or single (') quotation marks.
Special Characters:

\b       Backspace
\f       Form feed
\n       New line
\r       Carriage return
\t       Tab
\v       Vertical tab
\'       Apostrophe or single quote
\"       Double quote
\\       Backslash character (\).
\XXX     The character with the Latin-1 encoding
         specified by up to three octal digits XXX between 0 and 377.
         For example, \251 is the octal sequence for the copyright symbol.
\xXX     The character with the Latin-1 encoding
         specified by the two hexadecimal digits XX between 00 and FF.
         For example, \xA9 is the hexadecimal sequence for the copyright symbol.
\uXXXX   The Unicode character specified by the four hexadecimal digits XXXX.
         For example, \u00A9 is the Unicode sequence for the copyright symbol.
         See Unicode Escape Sequences.

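For characters not listed above, a preceding backslash is ignored, but this usage is deprecated and should be avoided.
A string literal containing "</script>" or '</script>' inside an inline script in an HTML page will cause an error, because the HTML parser ends the script element at that point. Write the "<" as \x3C or split the string (for example "</scr" + "ipt>").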

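The escape table above can be turned into a small generator of JavaScript string literals. This is a sketch in Python under stated assumptions: the function name js_string_literal is hypothetical, the output is always double-quoted, and the input is assumed to contain no unpaired surrogates. It also rewrites "</" so the literal is safe inside an inline script.

```python
# Single-character escapes from the table above.
_JS_ESCAPES = {
    "\b": "\\b", "\f": "\\f", "\n": "\\n", "\r": "\\r", "\t": "\\t",
    "\v": "\\v", "'": "\\'", '"': '\\"', "\\": "\\\\",
}


def js_string_literal(text: str) -> str:
    """Return `text` as a double-quoted JavaScript string literal (illustrative)."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in _JS_ESCAPES:
            out.append(_JS_ESCAPES[ch])
        elif 0x20 <= code <= 0x7E:
            out.append(ch)                    # printable ASCII stays as it is
        elif code <= 0xFF:
            out.append("\\x%02X" % code)      # Latin-1 range: \xXX
        else:
            # \uXXXX per UTF-16 code unit (two units for characters above U+FFFF).
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                out.append("\\u%04X" % int.from_bytes(units[i:i + 2], "big"))
    # Avoid a literal "</" so an inline script is not ended early.
    body = "".join(out).replace("</", "\\x3C/")
    return '"' + body + '"'


print(js_string_literal("a\nb"))       # "a\nb"
print(js_string_literal("\u00a9"))     # "\xA9"
print(js_string_literal("</script>"))  # "\x3C/script>"
```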

-----------------------------------------------------------------------------------
MySQL String
------------

A string is a sequence of characters, surrounded by either single quote (') or double quote (") characters (only the single quote in ANSI mode). MySQL recognises the following escape sequences:

  \0  An ASCII 0 (NUL) character.
  \'  A single quote (') character.
  \"  A double quote (") character.
  \b  A backspace character.
  \n  A newline character.
  \r  A carriage return character.
  \t  A tab character.
  \Z  ASCII(26) (Control-Z).
      This character can be encoded to work around
      the problem that ASCII(26) stands for END-OF-FILE on Windows.
      (ASCII(26) will cause problems if you try to use mysql database < filename.)
  \\  A backslash (\) character.
  \%  A '%' character.
      This is used to search for literal instances of '%'
      in contexts where '%' would otherwise be interpreted as a wildcard character.
  \_  A '_' character.
      This is used to search for literal instances of '_'
      in contexts where '_' would otherwise be interpreted as a wildcard character.

There are several ways to include quotes within a string:
- A single quote inside a string quoted with single quotes may be written as '' (two single quotes).
- A double quote inside a string quoted with double quotes may be written as "" (two double quotes).
- You can precede the quote character with an escape character (\).
- A single quote inside a string quoted with double quotes need not be doubled or escaped. In the same way, a double quote inside a string quoted with single quotes needs no special treatment.
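
The escape sequences and quoting options above can be sketched as two small Python helpers. The names mysql_quote and mysql_quote_doubled are hypothetical, and the sketch assumes the default SQL mode in which the backslash acts as an escape character. It is meant to illustrate the rules; application code would normally rely on the database driver's parameter binding instead.

```python
# Escape sequences from the table above. \% and \_ are left out because they
# only matter in pattern-matching contexts. The Control-Z escape is a capital Z.
_MYSQL_ESCAPES = {
    "\0": "\\0", "'": "\\'", '"': '\\"', "\b": "\\b", "\n": "\\n",
    "\r": "\\r", "\t": "\\t", "\x1a": "\\Z", "\\": "\\\\",
}


def mysql_quote(text: str) -> str:
    """Single-quoted literal using backslash escapes (illustrative)."""
    return "'" + "".join(_MYSQL_ESCAPES.get(ch, ch) for ch in text) + "'"


def mysql_quote_doubled(text: str) -> str:
    """Single-quoted literal that doubles the quote instead of escaping it."""
    # The backslash is still special, so it has to be doubled as well.
    return "'" + text.replace("\\", "\\\\").replace("'", "''") + "'"


print(mysql_quote("O'Reilly"))          # 'O\'Reilly'
print(mysql_quote_doubled("O'Reilly"))  # 'O''Reilly'
```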



-----------------------------------------------------------------------------------
PHP 4 String Type
---------------

A string literal can be specified in three different ways.

1. Single Quoted: Use a backslash (\) to escape a literal single quote, and a double backslash (\\) to get a literal backslash before a single quote or at the end of the string. A backslash does not escape any other character. Variables are not expanded in single-quoted strings.

2. Double Quoted:

  \n                 linefeed (LF or 0x0A (10) in ASCII)
  \r                 carriage return (CR or 0x0D (13) in ASCII)
  \t                 horizontal tab (HT or 0x09 (9) in ASCII)
  \\                 backslash
  \$                 dollar sign
  \"                 double-quote
  \[0-7]{1,3}        escape sequence in octal notation
  \x[0-9A-Fa-f]{1,2} escape sequence in hexadecimal notation

3. Heredoc:
$str = <<< identifier
string
identifier;
Identifier must contain only alphanumeric characters and underscores, and must start with a non-digit character or underscore.
The closing identifier must begin in the first column of the line. There may not be any character except the identifier, a semicolon and a new line in the closing line.
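
To make the difference between the first two forms concrete, here is a Python sketch that writes a value as a PHP single-quoted or double-quoted literal. The helper names php_single_quoted and php_double_quoted are hypothetical. Escaping every backslash in the single-quoted form is a simplification that is always safe, and the remaining control characters use three-digit octal escapes so that a following digit cannot be absorbed into the escape.

```python
def php_single_quoted(text: str) -> str:
    """Return `text` as a PHP single-quoted literal (illustrative)."""
    # Only the backslash and the single quote are special in this form.
    body = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + body + "'"


# Escapes from the double-quoted table above.
_PHP_DQ = {
    "\n": "\\n", "\r": "\\r", "\t": "\\t",
    "\\": "\\\\", "$": "\\$", '"': '\\"',
}


def php_double_quoted(text: str) -> str:
    """Return `text` as a PHP double-quoted literal (illustrative)."""
    out = []
    for ch in text:
        if ch in _PHP_DQ:
            out.append(_PHP_DQ[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append("\\%03o" % ord(ch))  # other control characters: octal
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


print(php_single_quoted("it's"))         # 'it\'s'
print(php_double_quoted("cost: $5\n"))   # "cost: \$5\n"
```

Escaping the dollar sign also prevents variable expansion, including the "{$" form.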

* Variable Parsing
1. Simple Syntax
If a dollar sign ($) is encountered, the parser will greedily take as many tokens as possible to form a valid variable name. Enclose the variable name in curly braces to explicitly specify the end of the name.
With array indices, the closing square bracket (]) marks the end of the index.
2. Complex Syntax
This syntax is only recognised when the $ immediately follows the {. Use "{\$" or "\{$" to get a literal "{$".
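
The greedy matching of variable names can be imitated with a short Python sketch. It is not PHP's parser: the function name expand is hypothetical, only plain ASCII variable names are handled (no array indices or object properties), and an undefined variable is assumed to expand to an empty string.

```python
import re

# Simplified variable-name rule: a letter or underscore, then letters,
# digits or underscores (ASCII only in this sketch).
_NAME = r"[A-Za-z_][A-Za-z0-9_]*"
_VARIABLE = re.compile(
    r"\{\$(%s)\}|\$\{(%s)\}|\$(%s)" % (_NAME, _NAME, _NAME)
)


def expand(template: str, variables: dict) -> str:
    """Expand $name, ${name} and {$name} in `template` (illustrative)."""
    def replace(match):
        name = match.group(1) or match.group(2) or match.group(3)
        return str(variables.get(name, ""))
    return _VARIABLE.sub(replace, template)


values = {"fruit": "apple"}
print(expand("$fruit juice", values))  # apple juice
print(expand("${fruit}s", values))     # apples
print(expand("{$fruit}s", values))     # apples
print(repr(expand("$fruits", values))) # '' because the name read is "fruits"
```

The last line shows why the braces matter: without them the trailing "s" is taken as part of the variable name.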



-----------------------------------------------------------------------------------

URI RFC1738 (1994)

----------------

uchar          = unreserved | escape
xchar          = unreserved | reserved | escape
 
unreserved     = alpha | digit | safe | extra
alpha          = lowalpha | hialpha
lowalpha       = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                 "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                 "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                 "y" | "z"
hialpha        = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
                 "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
                 "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
digit          = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                 "8" | "9"
safe           = "$" | "-" | "_" | "." | "+"
extra          = "!" | "*" | "'" | "(" | ")" | ","
national       = "{" | "}" | "|" | "\" | "^" | "~" | "[" | "]" | "`"
punctuation    = "<" | ">" | "#" | "%" | <">
 
reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "="
 
escape         = "%" hex hex
hex            = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                 "a" | "b" | "c" | "d" | "e" | "f"
 
unreserved(73):
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789$-_.+!*'(),
reserved(7):
;/?:@&=
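
The grammar above translates directly into a percent-encoder. The Python sketch below builds the unreserved set from the productions and escapes every other octet as "%" hex hex. The function name percent_encode is hypothetical, and encoding text as UTF-8 before escaping is an assumption of this sketch, since RFC 1738 itself only talks about octets.

```python
import string

SAFE = "$-_.+"
EXTRA = "!*'(),"
UNRESERVED_1738 = string.ascii_letters + string.digits + SAFE + EXTRA
RESERVED_1738 = ";/?:@&="


def percent_encode(text: str, keep: str = UNRESERVED_1738,
                   encoding: str = "utf-8") -> str:
    """Escape every octet that is not in `keep` as %XX (illustrative)."""
    out = []
    for byte in text.encode(encoding):
        ch = chr(byte)
        if byte < 0x80 and ch in keep:
            out.append(ch)
        else:
            out.append("%%%02X" % byte)
    return "".join(out)


print(len(UNRESERVED_1738))         # 73
print(percent_encode("a b~c"))      # a%20b%7Ec
print(percent_encode("x=1&y"))      # x%3D1%26y
```

Passing UNRESERVED_1738 + RESERVED_1738 as keep leaves the reserved characters alone, which is what one would want when encoding a whole URL rather than a single component.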

 

URI RFC2396 (1998)

----------------

      uric        = reserved | unreserved | escaped
      reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                    "$" | ","
      unreserved  = alphanum | mark
      alphanum    = alpha | digit
      alpha       = lowalpha | upalpha

      lowalpha    = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
                    "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
                    "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"

      upalpha     = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
                    "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
                    "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
 
      digit       = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                    "8" | "9"
      mark        = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
 
      escaped     = "%" hex hex
      hex         = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                            "a" | "b" | "c" | "d" | "e" | "f"
 
      control     = <US-ASCII coded characters 00-1F and 7F hexadecimal>
      space       = <US-ASCII coded character 20 hexadecimal>
      delims      = "<" | ">" | "#" | "%" | <">
      unwise      = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
 
reserved(10):
;/?:@&=+$,
unreserved(71):
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.!~*'()
excluded(14+):
 <>#%"{}|\^[]`


URI RFC3986 (2005)

----------------

reserved    = gen-delims / sub-delims
gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "=" 
unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
 
reserved(18):
:/?#[]@!$&'()*+,;=
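
Because the three RFCs sort the same characters into different classes, a small lookup makes the changes easier to see. The Python sketch below copies the reserved and unreserved sets listed in this section; the names RFC_SETS and classify are hypothetical, and anything outside both sets is simply reported as "other".

```python
import string

ALNUM = string.ascii_letters + string.digits

RFC_SETS = {
    "RFC1738": {"unreserved": ALNUM + "$-_.+!*'(),", "reserved": ";/?:@&="},
    "RFC2396": {"unreserved": ALNUM + "-_.!~*'()", "reserved": ";/?:@&=+$,"},
    "RFC3986": {"unreserved": ALNUM + "-._~", "reserved": ":/?#[]@!$&'()*+,;="},
}


def classify(ch: str) -> dict:
    """Report how each RFC classifies the single character `ch`."""
    result = {}
    for rfc, sets in RFC_SETS.items():
        if ch in sets["unreserved"]:
            result[rfc] = "unreserved"
        elif ch in sets["reserved"]:
            result[rfc] = "reserved"
        else:
            result[rfc] = "other"
    return result


for ch in "~$!#":
    print(ch, classify(ch))
```

For example, "!" is unreserved in RFC 1738 and RFC 2396 but reserved in RFC 3986, while "~" only becomes unreserved from RFC 2396 onwards.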

-----------------------------------------------------------------------------------
VBScript String Constants
---------------------
vbCr          Chr(13)                      Carriage return.
vbCrLf        Chr(13) & Chr(10)            Carriage return-linefeed combination.
vbFormFeed    Chr(12)                      Form feed; not useful in Microsoft Windows.
vbLf          Chr(10)                      Line feed.
vbNewLine     Chr(13) & Chr(10) or Chr(10) Platform-specific newline character;
                                           whatever is appropriate for the platform.
vbNullChar    Chr(0)                       Character having the value 0.
vbNullString  String having value 0        Not the same as a zero-length string ("");
                                           used for calling external procedures.
vbTab         Chr(9)                       Horizontal tab.
vbVerticalTab Chr(11)                      Vertical tab; not useful in Microsoft Windows.
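
VBScript string literals have no backslash escapes, so control characters are joined in with the & operator using the constants above, and a double quote inside a literal is written by doubling it. The Python sketch below generates such an expression; the function name vbscript_expression is hypothetical, and control characters without a named constant fall back to Chr(n).

```python
# Constants from the table above, keyed by the character they stand for.
_VB_CONSTANTS = {
    "\r": "vbCr", "\n": "vbLf", "\t": "vbTab", "\f": "vbFormFeed",
    "\v": "vbVerticalTab", "\0": "vbNullChar",
}


def vbscript_expression(text: str) -> str:
    """Return a VBScript expression that evaluates to `text` (illustrative)."""
    parts = []
    literal = []

    def flush():
        # Close the literal collected so far, if any.
        if literal:
            parts.append('"' + "".join(literal) + '"')
            literal.clear()

    i = 0
    while i < len(text):
        ch = text[i]
        if text.startswith("\r\n", i):
            flush()
            parts.append("vbCrLf")
            i += 2
            continue
        if ch in _VB_CONSTANTS:
            flush()
            parts.append(_VB_CONSTANTS[ch])
        elif ord(ch) < 0x20:
            flush()
            parts.append("Chr(%d)" % ord(ch))
        else:
            literal.append('""' if ch == '"' else ch)  # double embedded quotes
        i += 1
    flush()
    return " & ".join(parts) if parts else '""'


print(vbscript_expression("a\r\nb"))        # "a" & vbCrLf & "b"
print(vbscript_expression('say "hi"\t'))    # "say ""hi""" & vbTab
```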

