3

In most, if not all, implementations of regular expressions, the \w metacharacter matches all alphanumeric characters plus the underscore.

Historically speaking, why was the underscore character included in this character class? And why not include dashes too?

Alex
    Probably because C and most C-like languages have identifiers_like_this. Don't use \w if you need a specific meaning such as "letters"; use POSIX character classes like [[:alnum:]] or Unicode properties for that, depending on what your regex engine offers and what you need exactly. – amon May 02 '14 at 22:55

1 Answer

7

Because underscores are second nature in identifiers in almost all computer languages that matter. Dashes are not; they're typically used as the subtraction operator, and are specifically excluded from identifiers.
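
To see the difference in practice, here is a minimal sketch using Python's `re` module. The sample string and variable names are made up for illustration, and other regex engines may differ in detail (notably in how `\w` treats non-ASCII characters):

```python
import re

# Hypothetical sample text: one underscore identifier, one dashed word.
text = "snake_case kebab-case"

# \w includes the underscore, so snake_case stays in one piece;
# the dash is not a word character, so kebab-case is split in two.
words = re.findall(r"\w+", text)
print(words)  # ['snake_case', 'kebab', 'case']

# With re.ASCII, \w behaves like the explicit class [A-Za-z0-9_].
explicit = re.findall(r"[A-Za-z0-9_]+", text)
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)
print(explicit == ascii_words)  # True

# If dashes should count as well, spell the class out.
with_dash = re.findall(r"[\w-]+", text)
print(with_dash)  # ['snake_case', 'kebab-case']
```

In other words, `\w+` lines up with what a C-like language would accept as identifier characters, and anything beyond that has to be added to the class by hand.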

Robert Harvey