1
votes

I'm looking to create some Python regexes by combining smaller reusable patterns, and I'd like the reusable patterns to use the verbose flag. For example, suppose I had a simple pattern for digits and one for a lowercase character:

import re

DIGIT_PATTERN = re.compile(r"""
    (?P<my_digit_pattern>        # start named group
      \d+                        # one or more digits
    )                            # close named group
    """, re.VERBOSE)
CHAR_PATTERN = re.compile(r"""
    (?P<my_char_pattern>         # start named group
      [a-z]                      # a single lowercase letter
    )                            # close named group
    """, re.VERBOSE)

Is there a way I can create a new pattern that is composed of the above patterns? Something like,

NEW_PATTERN = CHAR_PATTERN followed by DIGIT PATTERN followed by CHAR_PATTERN

which I'd want to match the string a937267t. The above examples are highly simplified, but the main point is how to combine regexes that were defined with the verbose flag.

UPDATE

This is what I have so far; it might be the only way:

NEW_PATTERN = re.compile(
    CHAR_PATTERN.pattern + 
    DIGIT_PATTERN.pattern + 
    CHAR_PATTERN.pattern,
    re.VERBOSE
)

I had to ditch the named groups because there can't be two groups with the same name in one pattern, but I think this is what I was looking for.
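One way to keep named groups while reusing the pieces is to store the fragments as plain strings and attach a group name at composition time. The sketch below is only an illustration under that assumption: the fragment constants and the `named` helper are hypothetical names, not part of the `re` module. Each fragment ends with a newline so that a trailing `#` comment cannot swallow the closing parenthesis in verbose mode.

```python
import re

# Hypothetical reusable fragments, kept as strings rather than compiled
# patterns. Each one ends with a newline so its comment is terminated.
DIGITS = r"""
    \d+        # one or more digits
"""
CHAR = r"""
    [a-z]      # a single lowercase letter
"""

def named(name, fragment):
    """Wrap a verbose fragment in a named group chosen by the caller."""
    return f"(?P<{name}>{fragment})"

# Every use of CHAR gets its own group name, so nothing is defined twice.
NEW_PATTERN = re.compile(
    named("first_char", CHAR) + named("digits", DIGITS) + named("last_char", CHAR),
    re.VERBOSE,
)

m = NEW_PATTERN.fullmatch("a937267t")
print(m.groupdict())
# {'first_char': 'a', 'digits': '937267', 'last_char': 't'}
```

Changing `CHAR` in one place then updates every pattern built from it.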

1
Please keep in mind that the regex language allows you to write regular expressions that look ahead beyond what you are actually matching, or that allow repetition skipping parts of what was matched. There is really no straightforward way to 'add' these to each other, and your solution wouldn't work for them. What is the actual use case you have in mind for this? - Grismar
You might need an __add__ method on the re module's Pattern class, but that's unnecessary; why don't you just write the whole thing in a single re.compile call? o(╯□╰)o - jia Jimmy
I have about a dozen different "NEW_PATTERN" objects to define [NP0, ..., NP11]. Some of the component pieces of the NP objects are the same, and I don't want to have to change code in multiple places if I decide the CHAR_PATTERN is wrong or needs to be updated. - Gabriel
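For the use case in the last comment (several composed patterns sharing components), one possible approach is to let each fragment carry its own verbose setting through a scoped inline flag group, `(?x:...)`, which Python supports from version 3.6. This is a rough sketch, not a recommendation from the thread; the fragment names and the two composed patterns `NP0` and `NP1` are made up for illustration.

```python
import re

# Hypothetical fragments. (?x:...) turns on verbose mode only inside the
# group, so the outer pattern does not need re.VERBOSE at all.
CHAR = r"""(?x:
    [a-z]      # a single lowercase letter
)"""
DIGITS = r"""(?x:
    \d+        # one or more digits
)"""

# Two illustrative composed patterns built from the same pieces.
NP0 = re.compile(CHAR + DIGITS + CHAR)   # letter, digits, letter
NP1 = re.compile(DIGITS + CHAR)          # digits, letter

print(bool(NP0.fullmatch("a937267t")))
print(bool(NP1.fullmatch("42z")))
```

Because the fragments are defined once, a correction to `CHAR` applies to every pattern that includes it.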

1 Answer

0
votes
import re

NEW_PATTERN = re.compile(r"""
    (?P<my_new_pattern>          # start named group
      [a-z]                      # a single lowercase letter
      \d+                        # one or more digits
      [a-z]                      # a single lowercase letter
    )                            # close named group
    """, re.VERBOSE)