safe-regex
Detect potentially catastrophic exponential-time regular expressions by limiting the star height to 1.
WARNING: This module has both false positives and false negatives. Use vuln-regex-detector for improved accuracy.
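To illustrate what "star height" means, here is a minimal sketch that estimates the deepest nesting of quantified sub-expressions in a pattern string. The function name starHeight and its hand-rolled scanner are assumptions made for illustration; this is not the parser or the analysis that safe-regex itself uses, and it ignores many corner cases of regex syntax.

```javascript
// Simplified star-height estimate: the deepest nesting of quantified
// sub-expressions. Illustrative only; not the analysis safe-regex performs.
function starHeight(pattern) {
  let i = 0;

  // Consume a quantifier (*, +, ?, {n}, {n,}, {n,m}, optionally lazy)
  // starting exactly at position i. Returns true when one was found.
  function readQuantifier() {
    const quantifier = /(?:[*+?]|\{\d+(?:,\d*)?\})\??/y; // sticky: match at lastIndex only
    quantifier.lastIndex = i;
    const m = quantifier.exec(pattern);
    if (!m) return false;
    i += m[0].length;
    return true;
  }

  // Scan atoms until ')' or end of input; return the largest height seen.
  function parseSequence() {
    let max = 0;
    while (i < pattern.length && pattern[i] !== ')') {
      let height = 0;
      const c = pattern[i];
      if (c === '\\') {
        i += 2; // escaped character, e.g. \b or \s
      } else if (c === '[') {
        i++; // character class: skip to the closing bracket
        while (i < pattern.length && pattern[i] !== ']') {
          i += pattern[i] === '\\' ? 2 : 1;
        }
        i++;
      } else if (c === '(') {
        i++; // group: its height is the height of its contents
        height = parseSequence();
        i++; // skip ')'
      } else {
        i++; // literal, '|', anchors, group prefixes such as ?:
      }
      if (readQuantifier()) height++; // a quantifier adds one level
      max = Math.max(max, height);
    }
    return max;
  }

  return parseSequence();
}

console.log(starHeight('(x+x+)+y'));     // 2
console.log(starHeight('(beep|boop)*')); // 1
console.log(starHeight('(a+){10}'));     // 2
```

Under this sketch, the patterns reported as unsafe in the Example section are the ones whose height exceeds 1.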
Example
Suppose you have a script named safe.js:
var safe = require('safe-regex');
var regex = process.argv.slice(2).join(' ');
console.log(safe(regex));
This is its behavior:
$ node safe.js '(x+x+)+y'
false
$ node safe.js '(beep|boop)*'
true
$ node safe.js '(a+){10}'
false
$ node safe.js '\blocation\s*:[^:\n]+\b(Oakland|San Francisco)\b'
true
Methods
const safe = require('safe-regex')
const ok = safe(re, opts={})
Returns a boolean ok indicating whether the regex re is safe, i.e. not possibly catastrophic.
re can be a RegExp object or just a string.
If re is a string and is an invalid regex, returns false.
opts.limit - maximum number of allowed repetitions in the entire regex. Default: 25.
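As a sketch of how the call described above might be used to guard user-supplied patterns, the helper below compiles a pattern only when the checker accepts it. The name compileIfSafe is hypothetical and not part of this module; the checker is passed in as an argument so the helper can be exercised without the package installed.

```javascript
// Hypothetical helper (not part of safe-regex): compile a pattern string
// only if the supplied checker reports it as safe.
// `safe` is expected to be the function exported by safe-regex.
function compileIfSafe(safe, pattern, limit = 25) {
  if (!safe(pattern, { limit })) return null; // unsafe or invalid pattern
  try {
    return new RegExp(pattern);
  } catch (err) {
    return null; // the JavaScript engine rejected the pattern
  }
}

// Usage, assuming safe-regex is installed:
//   const safe = require('safe-regex');
//   const re = compileIfSafe(safe, '(beep|boop)*');
//   if (re === null) { /* reject the pattern */ }
```

Returning null instead of throwing keeps the caller's handling of unsafe and invalid patterns in one place; whether that suits your project is a design choice.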
Install
With npm do:
npm install safe-regex
Resources
What should I do if my project has a super-linear regex?
- Confirm that it is reachable by untrusted input.
- If it is, you can consider whether you can prevent worst-case behavior by trimming the input, revising the regex, or replacing the regex with another algorithm like string functions. For examples, see Table 5 in this article.
- If none of those solutions looks feasible, you might also consider changing regex engines. The RE2 bindings might work, though test carefully to confirm there are no semantic portability problems.
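The first three mitigations listed above (trimming the input, revising the regex, replacing it with string functions) can be sketched for the unsafe pattern from the Example section. The names MAX_INPUT_LENGTH, testWithCap, and containsXXY are hypothetical, and the cap value is an assumption; an exponential pattern needs a very small cap for trimming to help, so revising or replacing the regex is often the more practical route.

```javascript
// Hypothetical cap; exponential patterns need a very small one.
const MAX_INPUT_LENGTH = 16;

// Pattern reported as unsafe in the Example section.
const original = /(x+x+)+y/;

// Option 1: reject oversized input before it reaches the regex.
function testWithCap(re, input) {
  if (input.length > MAX_INPUT_LENGTH) return false;
  return re.test(input);
}

// Option 2: revise the regex. "Two or more x's followed by y" accepts
// the same inputs without the nested quantifier.
const revised = /x{2,}y/;

// Option 3: replace the regex with a string function. An unanchored match
// of "two or more x's then y" exists exactly when "xxy" occurs in the input.
function containsXXY(input) {
  return input.includes('xxy');
}

console.log(testWithCap(original, 'xxy')); // true
console.log(revised.test('axxxyb'));       // true
console.log(containsXXY('xy'));            // false
```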
Further reading
The following documents may be edifying:
- Research brief on the extent of super-linear regexes in practice
- Research brief on the variability of super-linear regex behavior across programming languages
- Comparing regex matching algorithms
Project policies
Versioning
This project follows Semantic Versioning 2.0 (semver).
Here are the project-specific meanings of MAJOR, MINOR, and PATCH updates:
- MAJOR: "Incompatible" API changes were introduced. There are two types in this module:
  - Changes that modify the interface
  - Changes that cause any regexes to be marked as unsafe that were formerly marked as safe
- MINOR: Functionality was added in a backwards-compatible manner. There are two types in this module:
  - Refactoring the analyses but not changing their results
  - Modifying the analyses to reduce false positives, without affecting negatives (false or true)
- PATCH: I don't anticipate using PATCH for this module.