PCRE2EZRegex

Official docs:
https://www.pcre.org/current/doc/html/pcre2syntax.html
https://www.pcre.org/current/doc/html/pcre2pattern.html

options

Usage:
    word + options(ignore_case=True)
    word + options('ignore_case')
    word + options('ignore_case', 'multiline')
    word + options('ignore_case', multiline=True)

            Args:
                global: Global mode. Matches everything in the given string, instead of just the first match
                multiline: Not recommended. Makes the '^' and '$' special characters match the start and end of lines, instead of the start and end of the string. This is inserted automatically when using line_start and line_end, so you shouldn't need to add it manually
                ignore_case: Performs case-insensitive matching, including expressions that explicitly use uppercase members. Full Unicode matching (such as Ü matching ü) also works unless the ASCII flag is used to disable non-ASCII matches. The current locale does not change the effect of this flag unless the LOCALE flag is also used
                verbose: Not recommended. Allows for comments and whitespace, neither of which does anything in this library
                single_line: Not recommended. Makes the '.' special character match any character at all, including a newline. It's recommended you simply use literally_anything instead
                lazy: The engine will default to lazy matching, instead of greedy. It's recommended you just specify greedy=False instead
                duplicate_groups: Allows the regex to accept duplicate group names. Each capture group still has its own ID, so the two capture groups produce their own matches instead of a single combined one
                noncapturing: Not recommended. Don't capture with any groups. Instead, simply don't use any groups
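
For orientation, the effect of three of these flags can be sketched with Python's built-in re module, whose IGNORECASE, MULTILINE, and DOTALL flags correspond to the ignore_case, multiline, and single_line behaviour described above. This is a stand-in using re, not this library's own API, and the sample text is made up.

```python
import re

text = "Hello\nhello world"

# ignore_case: "hello" matches regardless of letter case
case_hits = re.findall(r"hello", text, flags=re.IGNORECASE)

# multiline: '^' matches at the start of every line, not only the string
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# single_line: '.' matches a newline only when the flag is set
dot_default = re.search(r"Hello.hello", text)
dot_all = re.search(r"Hello.hello", text, flags=re.DOTALL)

print(case_hits)
print(line_starts)
print(dot_default is None, dot_all is not None)
```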

any_between

Aliases: amt_between, numBetween, num_between, anyBetween, amtBetween

Match any char between char and and_char, using the ASCII table for reference

            Args:
                char (str): the first character
                and_char (str): the second character
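
As a rough illustration of the range idea, here is the equivalent character class written by hand with Python's built-in re module (a stand-in, not this library's API). The endpoints 'a' and 'f' are arbitrary.

```python
import re

# A range class from 'a' through 'f'; ranges follow code-point order,
# which coincides with the ASCII table for these characters.
pattern = re.compile(r"[a-f]")

matches = pattern.findall("cab fig")
print(matches)
```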

any_char_except

Aliases: anythingExcept, any_except, anyExcept, anyCharExcept, anything_except

This matches any char that is NOT in chars. chars can be multiple parameters, or a single string of chars to split.

            Args:
                chars (str): any of the characters to match
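
The underlying regex construct is a negated character class. A minimal sketch with Python's built-in re module (a stand-in, not this library's API), excluding the vowels as an arbitrary example:

```python
import re

# '[^...]' matches any single character that is NOT listed
pattern = re.compile(r"[^aeiou]")

result = pattern.findall("regex")
print(result)
```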

any_of

Aliases: anyof, oneof, oneOf, anyOf, one_of

Match any of the given patterns. patterns can be passed as multiple parameters or as a single string. Also accepts the parameters chars and split. If chars is set to True, patterns must be a single string, which is interpreted as a set of characters and split up so that any one of those characters matches. If split is set to True, it forces the (?:...) regex syntax instead of the [...] syntax. The two should act the same way, but your output regex will look different. By default, the form is chosen for you automatically.

            Args:
                patterns: any of the patterns to match
                chars (bool): whether to interpret patterns as characters (default: auto)
                split (bool): whether to split patterns into characters (default: auto)
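
To see why the two output forms are interchangeable for single characters, the sketch below compares a character class with a grouped alternation using Python's built-in re module. It is a stand-in written by hand, not output generated by this library, and the sample strings are made up.

```python
import re

# Two spellings of "any one of a, b, or c"
char_class = re.compile(r"[abc]")
alternation = re.compile(r"(?:a|b|c)")

# Multi-character alternatives can only be written as an alternation
words = re.compile(r"(?:cat|dog|bird)")

sample = "a cab, a dog"
print(char_class.findall(sample))
print(alternation.findall(sample))
print(words.findall(sample))
```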

anything

Aliases: anychar, any_char, anyChar, char

Matches any single character, except a newline. To also match a newline, use literally_anything

at_least_none

Aliases: zero_or_more, atLeast0, any_amt, anyAmt, at_least_0, atLeastNone, noneOrMore, zeroOrMore, none_or_more

at_least_one

Aliases: atLeastOne, atLeast1, at_least_1, one_or_more, oneOrMore

chunk

Aliases: stuff

A "chunk": Any clump of characters up until the next newline

earlier_group

Aliases: same_as_group, same_as, sameAs, sameAsGroup, earlierGroup

Matches whatever the group referenced by num_or_name matched earlier. Must come after the group that num_or_name refers to

            Args:
                num_or_name (int | str): either the number or name of the previous group
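
In raw regex terms this is a backreference. The sketch below finds a doubled word with both a numbered and a named backreference, using Python's built-in re module as a stand-in rather than this library's API. The group name word is arbitrary.

```python
import re

# (\w+) captures a word; \1 then requires exactly the same text again
numbered = re.compile(r"\b(\w+) \1\b")

# Same idea with a named group and a named backreference
named = re.compile(r"\b(?P<word>\w+) (?P=word)\b")

text = "it was was a test"
numbered_hit = numbered.search(text).group(0)
named_hit = named.search(text).group("word")
print(numbered_hit, "|", named_hit)
```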

either

Aliases: or_, or

Match either pattern or or_pattern. To choose between more than 2 things, you can either chain multiple either calls, or use any_of. Note that the order here matters: it first tries pattern, and if that doesn't match, then it tries or_pattern.

            Args:
                pattern: a pattern to match
                or_pattern: a pattern to match if the first one fails
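
The ordering rule is easiest to see when one alternative is a prefix of the other. A minimal sketch with Python's built-in re module (a stand-in, not this library's API):

```python
import re

text = "foobar"

# The first alternative that matches wins, even if a later one is longer
short_first = re.match(r"foo|foobar", text).group(0)
long_first = re.match(r"foobar|foo", text).group(0)

print(short_first)
print(long_first)
```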

hex_digit

Aliases: hexDigit, hex

if_enclosed_with

Aliases: ifEnclosedWith, ifEnclosedBy, if_enclosed_by

if_not_proceded_by

Aliases: ifNotFollowedBy, ifNotProcededBy, if_not_followed_by

if_proceded_by

Aliases: if_followed_by, ifProcededBy, ifFollowedBy

is_exactly

Aliases: exactly, isExactly

letter

Aliases: alpha

Matches just a letter -- not numbers or _ like word_char

letter_num

Aliases: alphaNum, alphanum, alpha_num, letterNum

line_ends_with

Aliases: lineEndsWith, lineEnd, line_end

Matches at a line if it ends with pattern

            Args:
                pattern: the pattern to match

line_starts_with

Aliases: line_start, lineStart, lineStartsWith

Matches at a line if it starts with pattern

            Args:
                pattern: the pattern to match
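
Line anchors depend on multiline mode, which is why the options section notes that the flag is inserted automatically for line_start and line_end. The sketch below shows the raw '^' and '$' behaviour with Python's built-in re module (a stand-in, not this library's API); the log text is made up.

```python
import re

log = "ERROR disk full\nINFO ok\nERROR timeout"

# With MULTILINE, '^' anchors at the start of each line
starts = re.findall(r"^ERROR.*", log, flags=re.MULTILINE)

# With MULTILINE, '$' anchors at the end of each line
ends = re.findall(r".*ok$", log, flags=re.MULTILINE)

print(starts)
print(ends)
```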

match_at_least

Aliases: matchAtLeast, atLeast, at_least, matchMin, match_min

match_at_most

Aliases: atMost, matchAtMost, at_most

match_max

Aliases: matchMax, repeat

match_more_than

Aliases: more_than, matchMoreThan, moreThan, match_greater_than, matchGreaterThan

match_num

Aliases: amt, matchNum, num, matchAmt, match_amt

match_range

Aliases: matchRange, matchBetween, between, match_between

new_line

Aliases: newLine, newline

optional

Aliases: oneOrNone, one_or_none, opt

period

Aliases: dot

signed_integer

Aliases: signed, signed_int, integer, signedInt, signedInteger

A signed integer that also accepts e notation, like 123, -123e+10, or +123e-10
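
One pattern that accepts inputs of this shape is sketched below with Python's built-in re module. The pattern is an assumption written for illustration (optional sign, digits, optional signed exponent); the regex this library actually emits may differ.

```python
import re

# Hypothetical pattern, not taken from the library:
# optional sign, digits, then an optional exponent with its own optional sign
SIGNED_INT = re.compile(r"[+-]?\d+(?:[eE][+-]?\d+)?")

samples = ["123", "-123e+10", "+123e-10", "abc"]
results = [bool(SIGNED_INT.fullmatch(s)) for s in samples]
print(results)
```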

string_ends_with

Aliases: stringEnd, stringEndsWith, string_end

Matches the string if it ends with pattern

            Args:
                pattern: the pattern to match

string_starts_with

Aliases: stringStart, string_start, stringStartsWith

Matches the string if it starts with pattern

            Args:
                pattern: the pattern to match

unsigned_integer

Aliases: unsigned_int, unsignedInt, unsignedInteger, unsigned

An unsigned integer that also accepts e notation, like 123 or +123e-10

white_char

Aliases: whitechar, whiteChar

whitechunk

Aliases: white_space, white_chunk, whiteChunk, whitespace, whiteSpace

A "chunk" of whitespace. Just any amount of whitespace together

Replacement EZRegexs

replace_entire

Aliases: replaceAll, replaceEntire, replace_all

Puts in its place the entire match

rgroup

Aliases: replaceGroup, replace_group

Puts in its place the group specified, either by group number (for unnamed groups) or group name (for named groups). Named groups are typically also counted by number; check your specific dialect's docs for details. Group 0 is handled specially by this function: it always refers to the entire match, even if 0 doesn't mean the entire match in your dialect.

        Args:
            num_or_name (int | str): the number or name of the group you want to insert here

replace

Generates a valid regex replacement string, using Python f-string like syntax.

            Args:
                string (str): the templated replacement string
                compile (bool): whether to compile the string into an EZRegex subclass instance (default: True)

            Example:
                ``` replace("named: {group}, numbered: {1}, entire: {0}") ```

            Like Python f-strings, use {{ and }} to specify { and }

            Set the `compile` parameter to False to have it return a plain string instead of an EZRegex subclass instance

            Note: 0 is handled specially by this function, so it calls for the entire match,
                even if 0 doesn't mean the entire match in your dialect.

            There are a few advantages to using this instead of the regular regex replacement syntax:
            - It's consistent between dialects
            - It's closer to Python f-string syntax, which is cleaner and more familiar
            - It handles numbered, named, and entire replacement types the same
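
To make the template idea concrete, here is a minimal sketch of how an f-string-like template could be translated into a dialect's native replacement syntax. It targets Python's built-in re module (where \g<name>, \g<1>, and \g<0> are the native forms) purely as a stand-in; the helper name to_re_replacement is hypothetical and this is not the library's implementation.

```python
import re
from string import Formatter


def to_re_replacement(template: str) -> str:
    """Hypothetical helper: turn '{name}' / '{1}' / '{0}' into re's \\g<...> form."""
    out = []
    # Formatter().parse splits the template into literal text and field names,
    # and already unescapes '{{' and '}}' into '{' and '}'.
    for literal, field, _spec, _conversion in Formatter().parse(template):
        # Double any backslashes so re.sub treats literal text literally
        out.append(literal.replace("\\", "\\\\"))
        if field is not None:
            out.append(f"\\g<{field}>")
    return "".join(out)


replacement = to_re_replacement("{word}:{2}:{0}")
result = re.sub(r"(?P<word>\w+)@(\d+)", replacement, "ab@12")
print(replacement)
print(result)
```

Named, numbered, and whole-match references all go through the same branch here, which mirrors the third advantage listed above.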