PHP: PCRE regex syntax

10

18 years ago

For anyone who sees this error: 

Warning: preg_match() [function.preg-match]: Compilation failed: PCRE does not support \L, \l, \N, \P, \p, \U, \u, or \X at ...

As this manual page says, you need PHP 5.1.0 and the /u modifier in order to enable these features, but that isn't the only requirement! It is possible to install later versions of PHP (we have 5.1.4) while linking to an older PCRE install. A quick look at the PCRE changelog suggests that you probably need at least PCRE 5; we're running 4.5, while the latest is 7.1. You can find out your PCRE version by checking phpinfo().

I suspect this ancient PCRE version is included in some officially-supported Red Hat Enterprise package, which is probably why we are running it, so this might also affect other people.
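
As an alternative to reading through phpinfo(), the check can be done from a script. This is a minimal sketch, assuming the predefined PCRE_VERSION constant is available (it was added in PHP releases newer than the 5.1.4 mentioned above); the variable names are only illustrative.

```php
<?php
// PCRE_VERSION holds the version and release date of the PCRE library
// that this PHP build is linked against.
$pcreVersion = defined('PCRE_VERSION') ? PCRE_VERSION : 'unknown';
echo "PCRE version: $pcreVersion\n";

// Probe for Unicode property support directly. On a PCRE build without it,
// compilation fails and preg_match() returns false (the @ hides the warning).
// "h\xC3\xA9llo" is the word "héllo" encoded as UTF-8.
$result = @preg_match('/^\p{L}+$/u', "h\xC3\xA9llo");
$supportsUnicodeProperties = ($result === 1);

echo $supportsUnicodeProperties
    ? "Unicode properties (\\p) are supported\n"
    : "Unicode properties (\\p) are NOT supported\n";
```

If the probe reports no support even though the PHP version is recent enough, an old linked PCRE library is a likely cause.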

4

Hayley Watson ¶

8 years ago

As a rule of thumb, it's better to describe your regular expression patterns using single-quoted strings.

Using double-quoted strings, the interaction between PHP's and PCRE's interpretations of which bits of the string are escape sequences can get messy. Regular expressions can get messy enough as it is without another layer of escaping making it worse.
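
One case where the two layers of escaping collide is a literal dollar sign. The sketch below is only an illustration (the subject string and variable names are made up): the same characters between the quotes give PCRE two different patterns.

```php
<?php
$subject = '$5';

// Single-quoted: PHP leaves \$ and \d untouched, so PCRE receives ^\$\d+
// (a literal dollar sign followed by digits).
$single = '/^\$\d+/';

// Double-quoted: PHP consumes the backslash in \$, so PCRE receives ^$\d+
// (start anchor, end anchor, then digits), which cannot match '$5'.
$double = "/^\$\d+/";

$singleResult = preg_match($single, $subject);
$doubleResult = preg_match($double, $subject);

echo "single-quoted: $singleResult, double-quoted: $doubleResult\n";
```

To get the single-quoted behaviour from a double-quoted string, the pattern would have to be written with an extra backslash ("/^\\\$\d+/"), which is exactly the kind of mess described above.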

4

napalm at spiderfish dot net ¶

21 years ago

Note that some PCRE features, such as once-only subpatterns or recursive patterns, are not implemented in PHP versions prior to 5.00.

Napalm
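
For readers unsure what these two features look like, here is a short sketch for a PHP build that supports them. The patterns and sample strings are illustrative only and are not taken from the note above.

```php
<?php
// Once-only (atomic) subpattern: (?>a+) grabs every "a" and refuses to give
// any back, so the following "ab" can never match.
$atomic = preg_match('/(?>a+)ab/', 'aaab');

// Without the atomic group, a+ backtracks by one "a" and the match succeeds.
$backtracking = preg_match('/a+ab/', 'aaab');

// Recursive pattern: (?R) re-enters the whole pattern, which lets it match
// arbitrarily nested, balanced parentheses.
preg_match('/\((?:[^()]++|(?R))*\)/', 'x = (a + (b * c)) + d', $m);
$balanced = $m[0];

echo "atomic: $atomic, backtracking: $backtracking, balanced: $balanced\n";
```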

2

pstradomski at gmail dot com ¶

18 years ago

About strip_selected_tags function from two posts below:

It does not work if somebody uses tags without the ending ">" character, like this:

<p <b> bold text </b</p

This is even valid HTML (but not valid XHTML).
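
The strip_selected_tags function itself is not reproduced here, so the following is only a sketch of the general problem, using a hypothetical pattern ($naivePattern) that expects every tag to end with ">".

```php
<?php
// Input from the note: start and end tags without the closing ">".
$html = '<p <b> bold text </b</p';

// Hypothetical naive pattern (an assumption, not the original function):
// it removes <b> and </b>, but only when the ">" is present.
$naivePattern = '/<\/?b\s*>/i';

$stripped = preg_replace($naivePattern, '', $html);

// The opening <b> is removed, but the unterminated "</b" survives.
echo $stripped, "\n";
```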

2

onerob at gmail dot com ¶

20 years ago

If, like me, you tend to use the /U pattern modifier, then you will need to remember that using ? or * to test for optional characters will match zero characters if it means that the rest of the pattern can continue matching, even if the optional characters exist.

For instance, if we have this string:

a___bcde

and apply this pattern:

'/a(_*).*e/U'

The whole pattern is matched but none of the _ characters are placed in the sub-pattern. The way around this (if you still wish to use /U) is to use the ? greediness inverter, which under /U makes the quantifier greedy again, e.g.:

'/a(_*?).*e/U'
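
The two patterns above can be compared directly. This is a small sketch using the same string and patterns as the note; only the variable names are new.

```php
<?php
$subject = 'a___bcde';

// Under /U, (_*) is ungreedy: it matches zero underscores and lets .* take them.
preg_match('/a(_*).*e/U', $subject, $withoutInverter);

// Under /U, the extra ? flips (_*?) back to greedy, so it captures the underscores.
preg_match('/a(_*?).*e/U', $subject, $withInverter);

echo 'Without inverter: "', $withoutInverter[1], "\"\n";
echo 'With inverter:    "', $withInverter[1], "\"\n";
```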

1

J Daugherty ¶

21 years ago

In the character class meta-character documentation above, the circumflex (^) is described:

"^   negate the class, but only if the first character"

It should be a little more verbose to fully express the meaning of ^:

^    Negate the character class.  If used, this must be the first character of the class (e.g. "[^012]").
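
A quick sketch of the difference the position makes (the sample patterns are illustrative): as the first character ^ negates the class, anywhere else it is just a literal circumflex.

```php
<?php
// ^ first: the class matches any single character EXCEPT 0, 1 or 2.
$negatedMatchesThree = preg_match('/^[^012]$/', '3');
$negatedMatchesOne   = preg_match('/^[^012]$/', '1');

// ^ not first: the class matches 0, 1, 2 or a literal "^".
$literalMatchesCaret = preg_match('/^[0^12]$/', '^');
$literalMatchesThree = preg_match('/^[0^12]$/', '3');

echo "$negatedMatchesThree $negatedMatchesOne $literalMatchesCaret $literalMatchesThree\n";
```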

0

info at atjeff dot co dot nz ¶

20 years ago

I've never used regex expressions till now and had loads of difficulty trying to convert a [url]link here[/url] into an href for use with posting messages on a forum. Here's what I managed to come up with:

// $text holds the forum message to convert.
// (.*?) is a lazy match, so each tag pair is converted separately.
$patterns = array(
    '/\[link\](.*?)\[\/link\]/',
    '/\[url\](.*?)\[\/url\]/',
    '/\[img\](.*?)\[\/img\]/',
    '/\[b\](.*?)\[\/b\]/',
    '/\[u\](.*?)\[\/u\]/',
    '/\[i\](.*?)\[\/i\]/'
);
// \\1 refers to the text captured by (.*?) in the matching pattern.
$replacements = array(
    '<a href="\\1">\\1</a>',
    '<a href="\\1">\\1</a>',
    '<img src="\\1">',
    '<b>\\1</b>',
    '<u>\\1</u>',
    '<i>\\1</i>'
);
$newText = preg_replace($patterns, $replacements, $text);

At first it would collect ALL the tags into one link/bold/whatever, until I added the "?". I still don't fully understand it... but it works :)
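
What the "?" does here: without it, .* is greedy and runs to the LAST closing tag in the text; with it, .*? is lazy and stops at the FIRST one. A small sketch with a made-up sample string:

```php
<?php
$text = '[b]one[/b] and [b]two[/b]';

// Greedy: (.*) swallows everything up to the last [/b], merging both tags.
$greedy = preg_replace('/\[b\](.*)\[\/b\]/', '<b>$1</b>', $text);

// Lazy: (.*?) stops at the first [/b], so each pair is handled on its own.
$lazy = preg_replace('/\[b\](.*?)\[\/b\]/', '<b>$1</b>', $text);

echo $greedy, "\n"; // <b>one[/b] and [b]two</b>
echo $lazy, "\n";   // <b>one</b> and <b>two</b>
```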

-3

Daniel Vandersluis ¶

20 years ago

Concerning note #6 in "Differences From Perl", the \G token *is* supported as the last match position anchor. This has been confirmed to work at least in preg_replace(), though I'd assume it'd work in preg_match_all(), and other functions that can make more than one match, as well.
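
A minimal sketch of that behaviour on a current PHP build (the sample string is made up): \G only matches where the previous match ended, so matching stops at the first character that breaks the chain.

```php
<?php
$subject = '123a45';

// Each replacement must start exactly where the previous one ended,
// so only the leading run of digits is replaced.
$replaced = preg_replace('/\G\d/', 'x', $subject);

// The same anchor in preg_match_all(): the digits after "a" are not reached.
$count = preg_match_all('/\G\d/', $subject, $matches);

echo $replaced, "\n";
echo $count, ': ', implode(',', $matches[0]), "\n";
```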


User Contributed Notes 8 notes