Flow action regex

Syntax

The regex action has the following attributes:

  • pattern="..." to define the search expression, including delimiters and optional modifiers (required)
  • replace="..." to set the replacement text (required)
  • in="..." to define the input location (optional, default is fit://request/content)
  • out="..." to define the output location (optional, default is fit://request/content)

Examples:

<regex pattern="/abc$/ms" replace="123" />
<regex in="fit://request/content/test" out="fit://request/content/test" pattern="/foo/i" replace="bar" />

Usage

The regex action works on text content only, so it is typically run before the parse action. If the input is a DOM, it is automatically serialized into a string before the pattern is applied.

Be aware that the entire flow before the parse action is executed on all requests, including sub requests for JavaScript or CSS files. If you want to skip actions for sub requests, you have to do so explicitly using conditions, such as choose, if or if attributes. While you could base such decisions on the type of the loaded content, we recommend the DC property request/purpose. Its value is js or css for URLs that were rewritten in a script or link element. For main requests (e.g. when a user clicks on a link), the purpose is main. Note that this is also the case for direct links to images or download files.

Example:

<flow>
  <default-request />
  <if test="request/purpose = 'main'">
    <regex pattern="|class=&quot;link&quot;|" replace="class=&quot;url&quot;" />
  </if>
  <parse />
</flow>
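
The same kind of condition can restrict a replacement to rewritten stylesheet sub requests. The following is only a sketch that reuses the test syntax from the example above. The host names in the pattern and replacement are hypothetical and chosen purely for illustration:

```xml
<flow>
  <default-request />
  <!-- Hypothetical: only touch content requested with purpose css -->
  <if test="request/purpose = 'css'">
    <regex pattern="|http://static\.example\.com/|" replace="https://static.example.com/" />
  </if>
  <parse />
</flow>
```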

Caution: The output of this action is always a string. If you use it after parsing, you have to call the parse action again (or any other action that creates a DOM). This is particularly important for the main content, because the engine expects fit://request/content to be a DOM after the Flow has finished.
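
A minimal sketch of the situation described in the caution, assuming a replacement has to run after parsing. The pattern and replacement are placeholders taken from the syntax examples. The second parse action turns the string output back into a DOM:

```xml
<flow>
  <default-request />
  <parse />
  <!-- The DOM is serialized to a string; the output is a string, too -->
  <regex pattern="/foo/i" replace="bar" />
  <!-- Create a DOM again so that fit://request/content is a DOM at the end -->
  <parse />
</flow>
```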

Regular expressions

The replacement is carried out with the PCRE regular expression library.

The replacement string may contain backreferences to capture groups (...). A backreference is a backslash followed by the group's position in the pattern, starting with \1. \0 contains the entire matched text.

The following example converts a date of the form YYYY-MM-DD into DD.MM.YY. Note that the pattern only matches the years 2010 to 2019:

<regex pattern="|20(1\d)-(\d\d)-(\d\d)|" replace="\3.\2.\1" />
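
The \0 backreference can be used in the same way when the whole match should be kept and only wrapped. The following sketch is illustrative only: the time element is an arbitrary choice, and the angle brackets are written as entities because they appear inside an XML attribute value:

```xml
<!-- Wraps every YYYY-MM-DD date in a time element; \0 is the entire match -->
<regex pattern="|\d{4}-\d\d-\d\d|" replace="&lt;time&gt;\0&lt;/time&gt;" />
```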

Errors and debugging

If the pattern does not compile, the request will terminate with an error.

It is not an error if the pattern does not match or the replacement is empty.
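
An empty replacement simply deletes every match. As a sketch, the following action would strip HTML comments from the content. The pattern is an illustrative assumption: it uses the PCRE s modifier so that the dot also matches line breaks, and it leaves the content unchanged if nothing matches:

```xml
<!-- Removes HTML comment blocks; the empty replace attribute deletes each match -->
<regex pattern="|&lt;!--.*?--&gt;|s" replace="" />
```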

The regex action provides information in the debug channel regex. Activate debug with the ;d=pageend-regex-debug URL Mark.