To get started, define your tokens:

    var tokens = {
        Name: {
            name: "NAME",
            expression: /[A-Za-z][A-Za-z0-9]*/
        },
        String: {
            name: "STRING",
            expression: /\"[^\"]*\"/
        },
        Number: {
            name: "NUMBER",
            expression: /0|[1-9][0-9]*/
        },
        Equals: {
            name: "EQUALS",
            expression: /=/
        },
        WhiteSpace: {
            name: "WHITESPACE",
            expression: /\s/
        }
    };

Create the grammar:

    var grammar = $g(tokens);

Perform the lexical analysis on a string (this will throw an error if the string violates the lexicon):

    var lts = $lex('test = 0', grammar); // NAME WHITESPACE EQUALS WHITESPACE NUMBER

Confirm your suspicions:

    for (var i = 0; i < lts.length; i++) {
        alert('Token: ' + lts[i].token + ', Value: ' + lts[i].value);
    }

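For intuition, here is a minimal sketch of how a regex-driven lexer of this shape could work. It is not the library's implementation: `sketchLex`, `sampleTokens`, and the longest-match rule are assumptions made purely for illustration. Only the token shape (`name`, `expression`) and the result fields (`token`, `value`) are taken from the example above.

```javascript
// Hypothetical token table, mirroring the shape used in the example above.
var sampleTokens = {
  Name: { name: "NAME", expression: /[A-Za-z][A-Za-z0-9]*/ },
  String: { name: "STRING", expression: /"[^"]*"/ },
  Number: { name: "NUMBER", expression: /0|[1-9][0-9]*/ },
  Equals: { name: "EQUALS", expression: /=/ },
  WhiteSpace: { name: "WHITESPACE", expression: /\s/ }
};

// Hypothetical stand-in for a lexer; NOT the library's $lex.
function sketchLex(input, tokens) {
  var defs = Object.keys(tokens).map(function (key) {
    var t = tokens[key];
    // A sticky ("y") copy forces each pattern to match exactly at lastIndex.
    return { name: t.name, regex: new RegExp(t.expression.source, "y") };
  });

  var result = [];
  var pos = 0;
  while (pos < input.length) {
    var best = null;
    for (var i = 0; i < defs.length; i++) {
      defs[i].regex.lastIndex = pos;
      var m = defs[i].regex.exec(input);
      // Assumed rule: keep the longest match; on a tie the earlier definition wins.
      if (m && m[0].length > 0 && (!best || m[0].length > best.value.length)) {
        best = { token: defs[i].name, value: m[0] };
      }
    }
    if (!best) {
      // No token matches here, so the input violates the lexicon.
      throw new Error('Unexpected character "' + input[pos] + '" at position ' + pos);
    }
    result.push(best);
    pos += best.value.length;
  }
  return result;
}
```

With this sketch, `sketchLex('test = 0', sampleTokens)` yields five entries (NAME, WHITESPACE, EQUALS, WHITESPACE, NUMBER), while an input such as `'test = ?'` throws, because no token pattern matches `?`. Wrapping the call in `try`/`catch` is one way to handle that case.
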
Lexical analysis is your oyster! Go forth, and please report any issues you find on GitHub.