I'm trying to create a simple tokenizer that transforms the following search expression (only part of it is shown) into tokens:
word1 near(1) word2
where word1 and word2 are arbitrary words and near(1) is a distance operator. The question is how this expression should be tokenized. I see two ways:
1. <WORD, word1> <WORD, near> <LPAREN> <NUMBER, 1> <RPAREN> <WORD, word2>
2. <WORD, word1> <NEAROP, 1> <WORD, word2>
But should I really try to recognize NEAR(\d+) as a single token in the tokenizer, or should I go the first way and handle the NEAR operator at the parser level, while building the parse tree?
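
To make the two options concrete, here is a minimal sketch in Python. The token names, the rule tables and the `tokenize` helper are all hypothetical and only for illustration; they don't come from any particular library. The only difference between the two options is one extra rule placed ahead of the generic ones.

```python
import re

# Option 1: only generic tokens. The parser would have to recognise the
# sequence WORD(near) LPAREN NUMBER RPAREN as a distance operator.
GENERIC_SPEC = [
    ("NUMBER", r"\d+"),
    ("WORD",   r"[A-Za-z_]\w*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP",   r"\s+"),          # whitespace, dropped from the output
]

# Option 2: a dedicated NEAROP rule, listed first so it wins over WORD.
# The capture group holds the distance.
NEAROP_SPEC = [("NEAROP", r"near\((\d+)\)")] + GENERIC_SPEC


def tokenize(text, spec):
    """Return (kind, value) pairs; the first rule in spec that matches wins."""
    rules = [(kind, re.compile(pattern, re.IGNORECASE)) for kind, pattern in spec]
    tokens, pos = [], 0
    while pos < len(text):
        for kind, regex in rules:
            m = regex.match(text, pos)
            if m:
                if kind != "SKIP":
                    # If the rule has a capture group, use it as the value
                    # (the distance for NEAROP); otherwise use the whole match.
                    value = m.group(1) if regex.groups else m.group(0)
                    tokens.append((kind, value))
                pos = m.end()
                break
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
    return tokens


option1 = tokenize("word1 near(1) word2", GENERIC_SPEC)
option2 = tokenize("word1 near(1) word2", NEAROP_SPEC)
print(option1)
print(option2)
```

With this sketch, option 1 yields six tokens and option 2 yields three, matching the two lists above. One thing the sketch makes visible: the NEAROP pattern as written would not match `near (1)` with a space before the parenthesis, so whitespace tolerance would have to be built into the regex, whereas in option 1 the tokenizer skips whitespace anyway and the parser never sees it.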