Crate ct_regex

A crate for creating types that match regular expressions at compile time. Regex-implementing types can be created with the regex! macro.

This crate was heavily inspired by ctreg, which provides named, infallible capture groups and syntax error checking at compile time.

I have yet to do a complexity analysis of this crate, but matching should generally have time complexity O(n*m), where n and m are the lengths of the pattern and the haystack respectively, the same as most regex crates.

§Approach

How does this crate differ from the many other regex crates on crates.io?

The answer is in the name: it creates types that match regular expressions at compile time, as opposed to runtime like most other implementations.

  1. As with most regex crates, this one starts by parsing the provided expression using the regex_syntax crate, producing an abstract syntax tree before translating and optimising it into a high-level intermediate representation (HIR).

  2. Rather than using NFAs or DFAs, the macro converts the HIR into a Rust type expression, made of Matcher components that describe the various actions needed to match / capture a regular expression. A simple example of this generated type expression can be seen at demo::Email::Pattern.

  3. The macro finishes and the binary is compiled normally, using a collection of associated functions on each Matcher to perform the relevant matching / capturing. In short, matching or capturing at runtime boils down to a series of function calls, which the Rust compiler can optimise as it sees fit.
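
The idea in steps 2 and 3 can be sketched in a few lines of plain Rust. This is not the crate's actual code: the `Matcher` trait below, its `step` function, and the `Lit`, `Seq` and `Alt` components are hypothetical stand-ins, and the `Pattern` type for `(a|ab)c` is written by hand rather than generated by the macro. It only shows how a pattern can live entirely in a type, with matching reduced to calls between associated functions.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the crate's `Matcher` components.
/// `step` tries to match a prefix of `hay` and passes what is left to the
/// continuation `k`; returning `false` lets an enclosing matcher backtrack.
trait Matcher {
    fn step(hay: &str, k: &mut dyn FnMut(&str) -> bool) -> bool;
}

/// Matches the single character `C`.
struct Lit<const C: char>;

/// Matches `A` and then `B`.
struct Seq<A, B>(PhantomData<(A, B)>);

/// Matches `A`, or `B` if `A` does not lead to a full match.
struct Alt<A, B>(PhantomData<(A, B)>);

impl<const C: char> Matcher for Lit<C> {
    fn step(hay: &str, k: &mut dyn FnMut(&str) -> bool) -> bool {
        match hay.strip_prefix(C) {
            Some(rest) => k(rest),
            None => false,
        }
    }
}

impl<A: Matcher, B: Matcher> Matcher for Seq<A, B> {
    fn step(hay: &str, k: &mut dyn FnMut(&str) -> bool) -> bool {
        // Whatever `A` leaves behind is handed to `B`, then to `k`.
        A::step(hay, &mut |rest: &str| B::step(rest, k))
    }
}

impl<A: Matcher, B: Matcher> Matcher for Alt<A, B> {
    fn step(hay: &str, k: &mut dyn FnMut(&str) -> bool) -> bool {
        // Try the first branch; fall back to the second if it fails.
        A::step(hay, k) || B::step(hay, k)
    }
}

/// Hand-written type expression for the pattern `(a|ab)c`.
type Pattern = Seq<Alt<Lit<'a'>, Seq<Lit<'a'>, Lit<'b'>>>, Lit<'c'>>;

/// True if `P` matches the whole of `hay`.
fn is_full_match<P: Matcher>(hay: &str) -> bool {
    P::step(hay, &mut |rest: &str| rest.is_empty())
}

fn main() {
    assert!(is_full_match::<Pattern>("ac"));
    assert!(is_full_match::<Pattern>("abc")); // needs the second branch of `Alt`
    assert!(!is_full_match::<Pattern>("ab"));
}
```

No value of `Pattern` is ever constructed here; the compiler sees only a chain of statically known function calls, which is what leaves it free to inline and optimise them.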

§When To Use This Crate

Use it if you’re writing a small piece of software with expressions known at compile time and you don’t want to package a whole regex interpreter into your binary. It’s also a good idea if you only use an expression a few times.

Parsing strings from an input file or command line arguments using the named captures is probably one of the major benefits, and the reason I started writing this in the first place.

§When Not To Use This Crate

For runtime regular expressions (gasp). Seriously though, most of the work done by this crate occurs when building the binary, so it isn’t possible to create expressions on the fly. See one of the other regex crates on crates.io if this is something you want.

Once other crates have compiled their regular expressions at runtime, they can also achieve speeds a fair bit faster than this crate, using parallel operations and such.

Some complex functionality isn’t implemented yet, primarily complex look-arounds. An error will occur at compile time if you try to use an unimplemented feature.

Modules§

demo
A demonstration of the types produced by the regex! macro.
haystack
A collection of traits and structs that form the haystack system. Although usually inferred, these types may be needed on occasion for full type names etc.
iter
A collection of Iterators used in return types for Regex methods. Although also usually inferred, these may be needed to name types in some cases.

Macros§

regex
A macro to create, at compile time, a type that can match the provided regular expression.

Traits§

AnonRegex
A trait that is automatically implemented for ‘anonymous’ regular expression types. There is only one difference between this and Regex: all functions take self as the first parameter, removing the need to name the expression itself.
Regex
A trait that is automatically implemented for types produced by the regex! macro. Various functions are included that test the pattern against a provided Haystack.
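
The difference between the two traits can be illustrated with a rough sketch. Everything in it is an assumption made for illustration: `TypeLevelRegex` and `ValueLevelRegex` are hypothetical simplified stand-ins for Regex and AnonRegex, the `is_match` name and `&str` haystack are invented, and `Digits` is a hand-written matcher rather than a macro-generated type.

```rust
/// Hypothetical stand-in for a trait whose functions are called on the type.
trait TypeLevelRegex {
    fn is_match(hay: &str) -> bool;
}

/// Hypothetical stand-in for the variant whose functions take `self`.
trait ValueLevelRegex {
    fn is_match(&self, hay: &str) -> bool;
}

/// Hand-written "pattern" accepting one or more ASCII digits.
struct Digits;

impl TypeLevelRegex for Digits {
    fn is_match(hay: &str) -> bool {
        !hay.is_empty() && hay.bytes().all(|b| b.is_ascii_digit())
    }
}

impl ValueLevelRegex for Digits {
    fn is_match(&self, hay: &str) -> bool {
        <Digits as TypeLevelRegex>::is_match(hay)
    }
}

/// Hands back a matcher whose concrete type the caller never names.
fn anonymous_digits() -> impl ValueLevelRegex {
    Digits
}

fn main() {
    // Type-level style: the caller has to name the type.
    assert!(<Digits as TypeLevelRegex>::is_match("2024"));

    // Value-level style: the call goes through a value instead.
    let re = anonymous_digits();
    assert!(re.is_match("2024"));
    assert!(!re.is_match("20x4"));
}
```

Taking `self` matters when the type has no convenient name at the call site; the method call on `re` works without one.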