Friday, July 23, 2010

RegLex.js

I've released a regular-expression-based lexer for JavaScript on GitHub. The script is easy to use and has no dependencies. To give it a spin, simply download the code.

To get started, define your tokens:
var tokens = {
    Name: {
        name: "NAME",
        expression: /[A-Za-z][A-Za-z0-9]*/
    },
    String: {
        name: "STRING",
        expression: /"[^"]*"/
    },
    Number: {
        name: "NUMBER",
        expression: /0|[1-9][0-9]*/
    },
    Equals: {
        name: "EQUALS",
        expression: /=/
    },
    WhiteSpace: {
        name: "WHITESPACE",
        expression: /\s/
    }
};
Create the grammar:
var grammar = $g(tokens);
Perform the lexical analysis on a string (this will throw an error if the string violates the lexicon):
var lts = $lex('test = 0', grammar); // NAME WHITESPACE EQUALS WHITESPACE NUMBER
Confirm your suspicions:
for (var i = 0; i < lts.length; i++) {
    alert('Token: ' + lts[i].token + ', Value: ' + lts[i].value);
}
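
If you're curious how a lexer like this can work under the hood, here is a minimal sketch of the general technique: at each position, try every token expression anchored to the start of the remaining input, keep the longest match, and throw if nothing matches. This is not RegLex.js's actual source. The function name sketchLex, the sampleTokens object, the longest-match rule, and the error message are all assumptions made for illustration.

```javascript
// Hypothetical sketch of a regex-driven lexer; NOT the RegLex.js implementation.
// sketchLex, sampleTokens and the result shape are illustrative assumptions.
var sampleTokens = {
    Name: { name: 'NAME', expression: /[A-Za-z][A-Za-z0-9]*/ },
    Number: { name: 'NUMBER', expression: /0|[1-9][0-9]*/ },
    Equals: { name: 'EQUALS', expression: /=/ },
    WhiteSpace: { name: 'WHITESPACE', expression: /\s/ }
};

function sketchLex(input, tokens) {
    var results = [];
    var pos = 0;
    while (pos < input.length) {
        var rest = input.slice(pos);
        var best = null;
        for (var key in tokens) {
            if (!Object.prototype.hasOwnProperty.call(tokens, key)) continue;
            // Rebuild the expression with a leading ^ so it can only match
            // at the current position (any flags on the original are ignored).
            var anchored = new RegExp('^(?:' + tokens[key].expression.source + ')');
            var match = anchored.exec(rest);
            // Keep the longest non-empty match; the first token wins on ties.
            if (match && match[0].length > 0 &&
                    (best === null || match[0].length > best.value.length)) {
                best = { token: tokens[key].name, value: match[0] };
            }
        }
        if (best === null) {
            // No token matches here, so the input violates the lexicon.
            throw new Error('Unexpected character "' + input.charAt(pos) +
                '" at position ' + pos);
        }
        results.push(best);
        pos += best.value.length;
    }
    return results;
}

var sketchResult = sketchLex('test = 0', sampleTokens);
// sketchResult holds NAME, WHITESPACE, EQUALS, WHITESPACE, NUMBER in that order.
```

Preferring the longest match is one common design choice; a real lexer might instead rely on the order in which tokens are declared.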

Lexical analysis is your oyster! Go forth, and please report any issues you find on GitHub.
