1

Often I come across documentation that says "use a regular expression here", and I then have to spend quite some time digging around to work out which regular expression format is expected.

As far as I can tell, there are many types of regular expression.

But, at my last place of work I was made to feel stupid when I suggested adding some text to our User Documentation to specify the type of regular expression to be used.

When someone says "a regular expression" what is the regular expression syntax most people expect and where is it documented?

Update: I was prompted to single out some examples, but I mean no disrespect to these great projects:

JW01
  • What docs are you referring to? What tools or languages are you using? For example, I know that the Java documentation clearly specifies the syntax and usage of that particular flavor of regular expression. – Thomas Owens Sep 14 '11 at 15:29
  • @Thomas Owens - I believe he is speaking of documentation in general. And he is right for the most part; quite often one encounters just "regular expression". Nothing on the expected regular expression flavor. (http://www.regular-expressions.info/refflavors.html) – Rook Sep 14 '11 at 15:31
  • 2
    Docs probably shouldn't specify because the programming environment and language probably won't give you much choice in the matter. – FrustratedWithFormsDesigner Sep 14 '11 at 15:32
  • 1
    @Rook I'm thinking along the same lines as FrustratedWithFormsDesigner. Somewhere, it's specified that a language uses PCRE or some other variant. I know it's in the Java docs, and I'm pretty sure it's in the MSDN docs for C#, and so on. I'm curious as to what tools or languages don't specify it somewhere. Besides, many are similar to each other - knowing one flavor makes it fairly straightforward to read and write another. – Thomas Owens Sep 14 '11 at 15:35
  • @Thomas and Rook - yes. At my last place of work I was glared at for wasting time adding some text that specifies the type of regular expression to our User Documentation. It made me feel stupid, hence the question. – JW01 Sep 14 '11 at 15:36
  • It depends on the language. Every one is different. But it would be nice if every mention of "regular expression" were a link to a page giving the accepted syntax. – kevin cline Sep 14 '11 at 15:52
  • 1
    @JW01 Can you reword or update your question, then? Your title question and closing question don't match. One is asking why tools/languages don't document the syntax they use, and in my experience, they all do. The second is asking what syntax people expect. Exactly what do you want answered here? – Thomas Owens Sep 14 '11 at 16:14
  • 1
    @FrustratedWithFormsDesigner - I think you hit the nail on the head with that. You should post it as an answer. – Craige Sep 14 '11 at 16:29
  • "which regular expression format"? "the type of regular expression"? What context would this be ambiguous or unclear? How is it even possible for a programming language or library to omit a reference to the specific regular expression syntax used? Can you provide an actual example where it's actually unclear? – S.Lott Sep 14 '11 at 17:02
  • 1
  • Added example - as requested. – JW01 Sep 14 '11 at 17:54
  • @JW01 - as per php regular expressions, the page clearly states: "As of PHP 5.3.0, the regex extension is deprecated in favor of the PCRE extension" – Craige Sep 14 '11 at 18:07
  • @Craige - well, I was just giving examples. It's generally handy to know what the syntax is rather than what it's not. – JW01 Sep 14 '11 at 18:09
  • 1
    @JW01 - also: http://www.php.net/manual/en/intro.regex.php "Regular expressions are used for complex string manipulation. PHP uses the POSIX extended regular expressions as defined by POSIX 1003.2. For a full description of POSIX regular expressions see the » regex man pages". Sometimes you have to hunt, but the documentation for regex libraries usually does have something on expected syntax. – Craige Sep 14 '11 at 18:17
  • 2
    I'm possibly being over-pedantic here, but it's most likely that the documentation should be talking about regexes rather than regular expressions anyway. Most languages use regex flavours which aren't regular, and some of them aren't even context-free. In fact I think the latest revisions of Perl have regexes which are Turing-complete. – Peter Taylor Oct 14 '12 at 22:18

2 Answers

2

I suspect documentation does not specify which flavour of regular expression to use because it's rarely a choice. The flavour is usually dictated by the language that has been chosen, or by other constraints of the programming environment. The exception is a language that implements more than one flavour, in which case it might be valid to specify a preference for one of them in the docs, to keep things consistent.

As a handy reference, here's a web page comparing regex flavours: http://www.regular-expressions.info/refflavors.html
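To illustrate how the language ends up dictating the flavour, here is a minimal sketch in Python. It assumes only the standard `re` module; the helper name `matches` and the sample strings are made up for illustration. The Perl-style `\d` works, while a POSIX bracket expression such as `[[:digit:]]` is not rejected but is silently read as something else.

```python
import re
import warnings

# Perl-style shorthand class, which Python's re module supports.
perl_style = r"\d+"
# POSIX bracket expression, as understood by POSIX ERE tools.
posix_style = r"[[:digit:]]+"


def matches(pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text under Python's re flavour."""
    with warnings.catch_warnings():
        # Python may warn about the nested "[" in the POSIX-style pattern;
        # silence warnings here so the demonstration stays quiet.
        warnings.simplefilter("ignore")
        return re.search(pattern, text) is not None


# The Perl-style pattern finds the digits.
print(matches(perl_style, "order 42"))
# The POSIX-style pattern does not: Python reads it as a set containing
# "[", ":", "d", "i", "g", "t", followed by one or more literal "]".
print(matches(posix_style, "order 42"))
# So it matches odd input such as "d]" instead.
print(matches(posix_style, "d]"))
```

A pattern copied from documentation written for another flavour can therefore compile cleanly and still match the wrong thing, which is one argument for a one-line note naming the flavour.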

FrustratedWithFormsDesigner
  • So, as a PHP developer (where there's been a choice of pattern for years) I find it weird that the flavour is not specified. But, for most people, there is often no choice within a particular context so it's weirder to specify it. Aaaaagh. I geddit now. Cheers for that. – JW01 Sep 14 '11 at 18:28
  • @JW01: Maybe it's up to the developer if they want to use ereg or preg (which as far as I can tell is the only way to "switch flavours" in PHP), and simply saying in the document "we will use PHP" restricts the number of flavours to 2 - and after that the developer can choose. – FrustratedWithFormsDesigner Sep 14 '11 at 18:41
  • @JW01: "But, for most people, there is often no choice". Stronger than that. Except in very, very weird edge cases, there is no choice. It's not "most". It's "nearly all". – S.Lott Sep 14 '11 at 19:57
1

I usually go with the flavor found in books dealing with Perl. Rarely fails.

Rook
  • Thanks for the reassuring reply. I must have wasted hours trying to discover which syntax to use and, like you say, it almost always ends up as PCRE... I think I'll just go with it from now on and save time. – JW01 Sep 14 '11 at 15:33