The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

CSS grammar

I have a couple of questions about CSS grammar. I'm looking at http://www.w3.org/TR/CSS21/grammar.html

The grammar seems to be using square brackets to denote grouping, like parentheses in a regular expression. I don't understand this rule:

term
  : unary_operator?
    [ NUMBER S* | PERCENTAGE S* | LENGTH S* | EMS S* | EXS S* | ANGLE S* |
      TIME S* | FREQ S* ]
  | STRING S* | IDENT S* | URI S* | hexcolor | function
  ;

Why the square brackets? It looks to me like:

  "term is an optional unary_operator, followed by one of NUMBER or PERCENTAGE or etc., or STRING or IDENT or etc."

What's the effect of the square brackets? Are they superfluous? Isn't an expression like "(A|B)|C" just the same as "A|B|C"? The square brackets are part of the term definition in http://www.w3.org/TR/REC-CSS2/grammar.html as well, but not in http://www.w3.org/TR/REC-CSS1#appendix-b

There's also something I don't understand in the version 2.1 tokenizer definition. It includes:

h        [0-9a-f]
nonascii    [\200-\377]
unicode        \\{h}{1,6}(\r\n|[ \t\r\n\f])?
escape        {unicode}|\\[^\r\n\f0-9a-f]
string1        \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
nl        \n|\r\n|\r|\f

I don't see how this definition of string1 allows ordinary alphanumeric characters, for example "ABC"! I do understand the version 2 definition, which was:

h        [0-9a-f]
nonascii    [\200-\377]
unicode        \\{h}{1,6}[ \t\r\n\f]?
escape        {unicode}|\\[ -~\200-\377]
nmstart        [a-z]|{nonascii}|{escape}
nmchar        [a-z0-9-]|{nonascii}|{escape}
string1        \"([\t !#$%&(-~]|\\{nl}|\'|{nonascii}|{escape})*\"

In the version 2 definition, alphanumerics are allowed by the "(-~" range in the string1 character class, but I don't see anything equivalent in the 2.1 definition.
Christopher Wells
Friday, September 28, 2007
 
 
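Well, the grammar says that its productions have been "optimized for human consumption", so I think some groupings are there just to make it clear to a human reader that the alternatives are related. Note, though, that alternation binds more loosely than sequencing, so the brackets here also limit the optional unary_operator to the numeric alternatives; STRING, IDENT, URI, hexcolor, and function cannot take one.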
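
Whether the brackets are superfluous comes down to operator precedence. As in regular expressions, alternation in this notation binds more loosely than sequencing, so "unary_operator? [ A | B ] | C" need not mean the same thing as "unary_operator? A | B | C". Here is a minimal Python sketch of the regex analogy. The single-character stand-ins ('-' for unary_operator, N for NUMBER, P for PERCENTAGE, S for STRING) and the helper name accepts are made up for illustration; this shows precedence only and is not the CSS grammar itself.

```python
import re

# Hypothetical stand-ins: '-' = unary_operator, N = NUMBER,
# P = PERCENTAGE, S = STRING.
grouped = re.compile(r"-?(?:N|P)|S")   # like: unary_operator? [ N | P ] | S
ungrouped = re.compile(r"-?N|P|S")     # the same rule with the brackets removed


def accepts(pattern, text):
    """Return True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


# Print each input with the verdict of the grouped and ungrouped patterns.
for text in ["N", "-N", "P", "-P", "S", "-S"]:
    print(text, accepts(grouped, text), accepts(ungrouped, text))
```

The two patterns disagree only on "-P": with the group, the optional sign applies to every alternative inside it; without the group, it attaches to the first alternative alone. Neither pattern accepts "-S".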

The ^ at the beginning of that character class in the definition of string1 means "anything NOT in this character class."

So [^\n\r\f\\"] means any character that isn't \n, \r, \f, a backslash, or ", which includes every alphanumeric character.
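
A quick way to check this is to try the same character class in a regex engine. The sketch below uses Python's re module, whose character-class syntax is close enough to lex for this purpose; the names plain_char and is_plain are just for illustration.

```python
import re

# The character class from the 2.1 string1 rule: anything except
# newline, carriage return, form feed, backslash, or double quote.
plain_char = re.compile(r'[^\n\r\f\\"]')


def is_plain(ch):
    """Return True if the single character ch is matched by the class."""
    return plain_char.fullmatch(ch) is not None


# Characters from an ordinary string that the class rejects (expect none).
print([ch for ch in "ABC xyz 019" if not is_plain(ch)])

# The five excluded characters, each tested on its own.
print([is_plain(ch) for ch in ["\n", "\r", "\f", "\\", '"']])
```

Every letter, digit, and space in the sample passes, while each of the five listed characters is rejected.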
Stephen Caldwell
Wednesday, October 03, 2007
 
 
Ah. So it's "not whitespace, or it's backslash followed by newline or escape". Thanks.
Christopher Wells
Wednesday, October 03, 2007
 
 
> not whitespace

Sorry, I mean "not carriage return, linefeed, or formfeed, also not an unescaped '"' in the middle of the string".
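
To see the whole rule at work, here is a rough Python transcription of the 2.1 definitions quoted earlier in the thread (h, unicode, escape, nl, string1). It is a sketch, not a CSS tokenizer: the constant names and the helper is_string1 are invented for illustration, and re.IGNORECASE is my stand-in for the case-insensitive option that I believe the spec's scanner declares.

```python
import re

# Direct transcriptions of the lex macros from the 2.1 tokenizer.
H = r"[0-9a-f]"
NL = r"(?:\n|\r\n|\r|\f)"
UNICODE = rf"\\{H}{{1,6}}(?:\r\n|[ \t\r\n\f])?"
ESCAPE = rf"(?:{UNICODE}|\\[^\r\n\f0-9a-f])"

# string1: a double-quoted string made of plain characters,
# backslash-newline continuations, or escapes.
# IGNORECASE is an assumption standing in for the scanner's
# case-insensitive matching.
STRING1 = re.compile(rf'"(?:[^\n\r\f\\"]|\\{NL}|{ESCAPE})*"', re.IGNORECASE)


def is_string1(text):
    """Return True if the whole text is a valid string1 token."""
    return STRING1.fullmatch(text) is not None


samples = [
    '"ABC"',             # ordinary alphanumerics
    '"say \\"hi\\""',    # escaped double quotes inside the string
    '"line\\\nbreak"',   # backslash followed by a newline
    '"line\nbreak"',     # raw newline with no backslash
    '"a"b"',             # unescaped double quote in the middle
]
for s in samples:
    print(repr(s), is_string1(s))
```

The first three samples are accepted and the last two are rejected, which matches the corrected description: no bare newline, carriage return, or form feed, and no unescaped '"' in the middle of the string.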
Christopher Wells
Wednesday, October 03, 2007
 
 

This topic is archived. No further replies will be accepted.
