What I Learned Today: Regular Expression Backreferences

This is another post in my challenge to learn something new every day and then share that in a blog post.

This is fairly simple, but even though I'm comfortable with regular expressions, I was not familiar with the "?:" syntax (aka: question mark colon). I was working on some Behat tests using the MinkExtension, and patterns like this one show up fairly often in its code: (?P<option>(?:[^"]|\\")*)

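The ?P<option> portion names the group enclosed by the parentheses "option", so in a Behat step definition the matched text is passed in as the $option argument (with a plain preg_match it shows up under the "option" key of the matches array). The ?: portion makes the inner group non-capturing: the parentheses still group the alternation so the * can repeat it, but the match is not stored, so the group gets no number and cannot be used as a backreference. Besides being a small optimization for the regular expression engine, this is helpful when doing a preg_match, because only the groups that you actually want end up in the results.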
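
To see the difference for myself, here is a minimal sketch that runs the pattern with and without the ?: and compares what preg_match returns. The sample sentence and the variable names ($subject, $nonCapturing, $capturing, $a, $b) are made up for illustration and are not taken from Behat or the MinkExtension.

```php
<?php
// Hypothetical step text, invented for this example.
$subject = 'I select "Red" from the list';

// Pattern from the post: named group "option" around a non-capturing group.
$nonCapturing = '/"(?P<option>(?:[^"]|\\")*)"/';

// Same pattern with a plain (capturing) inner group, for comparison.
$capturing = '/"(?P<option>([^"]|\\")*)"/';

preg_match($nonCapturing, $subject, $a);
preg_match($capturing, $subject, $b);

// PHP stores a named group under both its name and its number,
// so $a holds keys 0, 'option' and 1.
print_r($a);

// The capturing version has one extra entry (key 2) holding only the
// last character the inner group matched.
print_r($b);
```

With the non-capturing version, the matches array only contains the whole match and the "option" group. The capturing version adds a leftover entry for the inner group, which is rarely something you want to deal with.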
