11.2 Tokenizing rules

Many of the tokens used in the BNF grammars are self-explanatory from their names: DATABLOCK is the literal string ‘data’, COMMA is a single ‘,’ character, and so on. The literal representation of each operator is also given in the operator precedence table.

A few tokens are less obvious and are defined here by regular expressions. Whitespace inside a pattern only separates sub-expressions and is not part of the token:

IDENTIFIER = [a-zA-Z] [a-zA-Z0-9_]*

STRINGLITERAL = ".*"

INTNUMERAL = [0-9]+ (_ [0-9]+)*

EXPLITERAL = [eE] [+-]? INTNUMERAL

REALNUMERAL = INTNUMERAL \. INTNUMERAL? EXPLITERAL?
            | \. INTNUMERAL EXPLITERAL?
            | INTNUMERAL EXPLITERAL

IMAGNUMERAL = (REALNUMERAL | INTNUMERAL) i
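
To illustrate how these definitions compose, the following is a minimal Python sketch that transcribes them into `re` patterns and classifies a single candidate token. It is not the actual lexer: the `classify` helper and the order in which patterns are tried are assumptions made for illustration, and keywords such as ‘data’ are not handled (they would be reported here as IDENTIFIER).

```python
import re

# Patterns transcribed from the definitions above. The spaces in the
# original notation are dropped because they are not part of the token.
INTNUMERAL = r"[0-9]+(?:_[0-9]+)*"
EXPLITERAL = rf"[eE][+-]?{INTNUMERAL}"
REALNUMERAL = (
    rf"(?:{INTNUMERAL}\.(?:{INTNUMERAL})?(?:{EXPLITERAL})?"
    rf"|\.{INTNUMERAL}(?:{EXPLITERAL})?"
    rf"|{INTNUMERAL}{EXPLITERAL})"
)
IMAGNUMERAL = rf"(?:{REALNUMERAL}|{INTNUMERAL})i"
IDENTIFIER = r"[a-zA-Z][a-zA-Z0-9_]*"
STRINGLITERAL = r'".*"'

# Hypothetical ordering: the most specific numeral forms are tried first.
TOKEN_PATTERNS = [
    ("IMAGNUMERAL", IMAGNUMERAL),
    ("REALNUMERAL", REALNUMERAL),
    ("INTNUMERAL", INTNUMERAL),
    ("IDENTIFIER", IDENTIFIER),
    ("STRINGLITERAL", STRINGLITERAL),
]


def classify(text):
    """Return the name of the first token class whose pattern matches
    the whole of `text`, or None if no pattern matches."""
    for name, pattern in TOKEN_PATTERNS:
        if re.fullmatch(pattern, text):
            return name
    return None


if __name__ == "__main__":
    for sample in ["1_000", "1.", ".5e-3", "2e10", "3i", "x_1", "_x", '"hello"']:
        print(f"{sample!r:10} -> {classify(sample)}")
```

Under these patterns, `1_000` is an INTNUMERAL, while `1.`, `.5e-3`, and `2e10` are all REALNUMERALs, since each of the three alternatives requires either a decimal point or an exponent. An input such as `_x` matches nothing, because IDENTIFIER must begin with a letter.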