5

I'm using Ruby's scan() method to find text in a particular format. I then output it into a string separated by commas. The text I'm trying to find would look like this:

AB_ABCD_123456

Here's what I've come up with so far to find the above. It works fine:

matches = text.scan(/.._...._[0-9][0-9][0-9][0-9][0-9][0-9]/)
puts matches.uniq.sort.join(', ')

Now I need a regex that will find the above with or without a two-letter country designation at the end. For example, I would like to be able to find all three of the below:

AB_ABCD_123456
AB_ABCD_123456UK
AB_ABCD_123456DE

I know I could use two or three different scans to achieve my result, but I'm wondering if there's a way to get all three with one regex.

michaelmichael

4 Answers

13
/.._...._[0-9][0-9][0-9][0-9][0-9][0-9](?:[A-Z][A-Z])?/

You can also use {} to make the regex shorter:

/.{2}_.{4}_[0-9]{6}(?:[A-Z]{2})?/

Explanation: ? makes the preceding pattern optional. () groups expressions together (so Ruby knows the ? applies to both letters). The ?: after the opening ( makes the group non-capturing (a capturing group would change the values returned by scan).
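As a rough sketch of how the shorter pattern behaves with scan, here is a small example. The contents of `text` and the variable names are made up for illustration and are not from the question:

```ruby
# Hypothetical sample input; the IDs mirror the format shown in the question.
text = "AB_ABCD_123456 xx AB_ABCD_123456UK yy AB_ABCD_123456DE AB_ABCD_123456"

# Two characters, underscore, four characters, underscore, six digits,
# then an optional two-letter country code in a non-capturing group.
pattern = /.{2}_.{4}_[0-9]{6}(?:[A-Z]{2})?/

# With no capturing groups, scan returns the full matched strings.
matches = text.scan(pattern)

# Remove duplicates, sort, and join with commas as in the question.
result = matches.uniq.sort.join(', ')
puts result
# => AB_ABCD_123456, AB_ABCD_123456DE, AB_ABCD_123456UK
```

The duplicate bare ID is dropped by uniq, and the bare form sorts ahead of the suffixed ones because it is a prefix of them.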

sepp2k
1
 /.._...._\d{6}([A-Z]{2})?/
Avdi
  • 1
    If you don't make the group non-capturing scan will only yield the country-codes (or nil for the strings that didn't include one), not the entire string that was matched. – sepp2k Aug 05 '09 at 21:24
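To illustrate the point in the comment above, this sketch runs the capturing and non-capturing variants over the same input. The sample string and variable names are assumptions made for the example:

```ruby
# Hypothetical sample input for illustration only.
text = "AB_ABCD_123456 AB_ABCD_123456UK AB_ABCD_123456DE"

# Capturing group: scan returns one array per match, holding only the group.
capturing = text.scan(/.._...._\d{6}([A-Z]{2})?/)
p capturing
# => [[nil], ["UK"], ["DE"]]

# Non-capturing group: scan returns the full matched strings.
non_capturing = text.scan(/.._...._\d{6}(?:[A-Z]{2})?/)
p non_capturing
# => ["AB_ABCD_123456", "AB_ABCD_123456UK", "AB_ABCD_123456DE"]
```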
1

Why not just use split?

"AB_ABCD_123456".split(/_/).join(',')

Handles the cases you listed without modification.

ezpz
1

Try this:

text.scan(/\w{2}_\w{4}_\d{6}\w{0,2}/)
# matches AB_ABCD_123456UK, ab_abcd_123456uk, and so on

or

text.scan(/[A-Z]{2}_[A-Z]{4}_\d{6}[A-Z]{0,2}/)
# tighter: matches only uppercase forms such as AB_ABCD_123456UK,
# not mixed-case or lowercase ones such as ab_aBCd_123456UK or ab_abcd_123456uk
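One thing worth checking with the {0,2} quantifier is that it also accepts a single trailing character, and \w accepts digits as well as letters. A minimal sketch of the difference, using made-up inputs and pattern names, with the (?:[A-Z]{2})? form from the accepted answer as the stricter alternative:

```ruby
# Hypothetical inputs: an ID followed by one stray letter, and one followed by extra digits.
one_letter   = "AB_ABCD_123456U"
extra_digits = "AB_ABCD_12345678"

loose  = /[A-Z]{2}_[A-Z]{4}_\d{6}[A-Z]{0,2}/      # zero, one, or two trailing letters
strict = /[A-Z]{2}_[A-Z]{4}_\d{6}(?:[A-Z]{2})?/   # exactly zero or two trailing letters
word   = /\w{2}_\w{4}_\d{6}\w{0,2}/               # \w also matches digits and underscore

# String#[] with a Regexp returns the matched substring (or nil).
loose_match  = one_letter[loose]    # => "AB_ABCD_123456U"
strict_match = one_letter[strict]   # => "AB_ABCD_123456"
word_match   = extra_digits[word]   # => "AB_ABCD_12345678"

puts loose_match, strict_match, word_match
```

Whether the looser behavior matters depends on what else can follow an ID in the real input.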

To learn more about regex, refer to these:

Ruby gsub / regex modifiers?

http://ruby-doc.org/docs/ruby-doc-bundle/Manual/man-1.4/syntax.html#regexp

vulcan_hacker
  • I like that second regex example. Thanks for the links. I've gone through them, though not as thoroughly as I should. Real-life problems help my understanding a lot. – michaelmichael Aug 05 '09 at 22:01