Writing a high-quality implementation of POSIX Extended Regular Expressions does not seem easy.
Ideally, the following features would be offered at the same time:
* Strict standards compliance (see https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html and https://pubs.opengroup.org/onlinepubs/9799919799/functions/regexec.html).
* Predictable performance (polynomial in the length of the inputs, and linear in the length of the matched string).
* Limited resource usage (memory, CPU time (also related to the previous point)).
* Absence of vendor-specific syntax extensions (for portability).
In practice, most implementations fall short of these goals, due to a variety of issues:
* Unclear wording in the standards, leading to diverging outcomes between implementations.
* Exponential matching time due to backtracking implementation (sometimes even mandated by non-standard syntax extensions).
* Excessive consumption of stack memory.
* Spurious matching failures (either indication of a non-match, or an error).
* Incorrect capturing behaviour of parenthesised subexpressions.
Here we test a small sample of regular expressions that are known to be hard to match properly against a couple of popular regex engines.