Unlike commonly used regexp libraries, Felix does not represent regular expressions as strings: instead, a first-class syntax is used to define them.
Felix allows you to name regular expressions with the syntax:

    regexp <name> = <regexp> ;

The name is an identifier. A string used in a regexp stands for a match of each character of the string in sequence. The following symbols are special, and are listed from weakest to strongest binding:

| symbol | syntax | meaning |
|---|---|---|
| \| | infix | alternatives |
| * | postfix | 0 or more occurrences |
| + | postfix | 1 or more occurrences |
| ? | postfix | 0 or 1 occurrences |
| <juxtaposition> | infix | concatenation |
| <name> | atomic | regexp denoted by the name in a regexp definition |
| <string> | atomic | sequence of chars of the string |
| [<charset>] | atomic | any char of the charset |
| [^<charset>] | atomic | any char not in the charset |
| . | atomic | any char other than end of line |
| _ | atomic | any char |
| eof | atomic | end marker |
| (<regexp>) | atomic | brackets |

A <charset> is built from the following elements:

| symbol | meaning |
|---|---|
| <string> | any character in the string |
| <char>-<char> | any char between the two chars, inclusive |
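
The two charset forms in the table above can be modelled as plain sets of characters. The following Python sketch is only an illustration of the semantics, not Felix code: the function names `charset`, `in_charset` and `not_in_charset`, and the use of a `(low, high)` pair to stand for `<char>-<char>`, are assumptions made for this example.

```python
def charset(*parts):
    """Build the set of characters denoted by a charset.

    Each part is either a string (any character in the string) or a
    hypothetical (low, high) pair standing for <char>-<char>
    (every character from low to high, inclusive).
    """
    chars = set()
    for part in parts:
        if isinstance(part, str):
            chars.update(part)
        else:
            low, high = part
            chars.update(chr(c) for c in range(ord(low), ord(high) + 1))
    return chars


def in_charset(ch, chars):
    """Models [<charset>]: any char of the charset."""
    return ch in chars


def not_in_charset(ch, chars):
    """Models [^<charset>]: any char not in the charset."""
    return ch not in chars


# A range and the equivalent spelled-out string denote the same set.
lower = charset(("a", "z"))
assert lower == charset("abcdefghijklmnopqrstuvwxyz")
```

Under this model a charset mixing both forms, such as `charset("_", ("0", "9"))`, is simply the union of its parts.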

    #import <flx.flxh>
    regexp lower = ["abcdefghijklmnopqrstuvwxyz"];
    regexp upper = ["ABCDEFGHIJKLMNOPQRSTUVWXYZ"];
    regexp digit = ["0123456789"];
    regexp alpha = lower | upper | "_";
    regexp id = alpha (alpha | digit) *;

    print
      regmatch "identifier" with
      | digit+ => "Number"
      | id => "Identifier"
      endmatch
    ;
    endl;

    print
      regmatch "9999" with
      | digit+ => "Number"
      | id => "Identifier"
      endmatch
    ;
    endl;

    print
      regmatch "999xxx" with
      | digit+ => "Number"
      | id => "Identifier"
      | _* => "Neither"
      endmatch
    ;
    endl;

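
For readers more familiar with string-based regexp libraries, the three regmatch expressions above can be approximated in Python. This is a rough sketch under stated assumptions, not a description of how Felix implements regmatch: it assumes that a branch applies only when it matches the whole argument string, and that the first listed branch that matches is the one chosen. The helper `regmatch` and the `ValueError` raised when no branch matches are inventions of this example.

```python
import re

# String translations of the Felix definitions (approximate).
DIGIT = "[0-9]"
ALPHA = "[A-Za-z_]"                      # lower | upper | "_"
ID = f"{ALPHA}(?:{ALPHA}|{DIGIT})*"      # alpha (alpha | digit) *


def regmatch(text, branches):
    """Return the result of the first branch whose pattern matches all of text."""
    for pattern, result in branches:
        # DOTALL makes "." behave like Felix "_" (any char, including newline).
        if re.fullmatch(pattern, text, flags=re.DOTALL):
            return result
    raise ValueError(f"no branch matches {text!r}")  # assumed failure behaviour


BRANCHES = [
    (f"{DIGIT}+", "Number"),
    (ID, "Identifier"),
    (".*", "Neither"),                   # _* : matches anything
]

print(regmatch("identifier", BRANCHES))  # Identifier
print(regmatch("9999", BRANCHES))        # Number
print(regmatch("999xxx", BRANCHES))      # Neither
```

Note that `*` binds more tightly than juxtaposition, which is why the Python translation of `id` groups `(alpha | digit)` before applying `*`, exactly as the parentheses in the Felix definition do.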
Note: the generated code is *extremely* fast, within one or two memory fetches of the fastest possible code. Here is the generated code for the inner loop of a regmatch:

    while(state && start != end)
      state = matrix[*start++][state];
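
To make the loop concrete, here is a small Python sketch of the same table-driven technique for the single pattern `digit+`. Only the shape of the loop comes from the generated code shown above. The state numbering (0 as a dead state that stops the loop), the contents of `matrix`, and the final acceptance test are assumptions chosen for illustration, since the source does not show how the table is built or how the final state is interpreted.

```python
# Hypothetical state numbering for the pattern digit+.
DEAD, START, ACCEPT = 0, 1, 2

# matrix[byte][state] gives the next state, mirroring matrix[*start++][state].
# Every transition defaults to the dead state.
matrix = [[DEAD, DEAD, DEAD] for _ in range(256)]
for byte in b"0123456789":
    matrix[byte][START] = ACCEPT    # first digit
    matrix[byte][ACCEPT] = ACCEPT   # further digits


def matches_digits(data: bytes) -> bool:
    """Run the table-driven loop over data and test for acceptance."""
    state = START
    pos, end = 0, len(data)
    # Same structure as the generated inner loop: one table lookup per byte,
    # stopping early once the dead state (0) is reached.
    while state and pos != end:
        state = matrix[data[pos]][state]
        pos += 1
    return state == ACCEPT


print(matches_digits(b"9999"))    # True
print(matches_digits(b"999xxx"))  # False
```

Each input byte costs a single indexed load from the table, which is what makes this style of matcher fast in compiled code.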