preg_replace

(PHP 4, PHP 5, PHP 7)

preg_replace - Perform a regular expression search and replace

Description

preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] ) : mixed

Searches subject for matches to pattern and replaces them with replacement.

Parameters

pattern

The pattern to search for. It can be either a string or an array of strings.

Several PCRE modifiers are also available.

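replacement

The string or an array of strings to replace with. If this parameter is a string and pattern is an array, all patterns are replaced by that string. If both pattern and replacement are arrays, each pattern is replaced by its counterpart in replacement. If replacement has fewer elements than pattern, the extra patterns are replaced by an empty string.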
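
The pairing rules above can be sketched as follows. This is a minimal illustration; the sample strings and variable names are invented for the example.

```php
<?php
// One replacement string shared by every pattern.
$shared = preg_replace(array('/cat/', '/dog/'), 'pet', 'cat and dog');

// Fewer replacements than patterns: the pattern without a counterpart
// ('/dog/') is replaced with an empty string.
$short = preg_replace(array('/cat/', '/dog/'), array('lion'), 'cat and dog');

var_dump($shared); // string(11) "pet and pet"
var_dump($short);  // string(9) "lion and "
```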

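replacement may contain references of the form \\n or $n, with the latter form being the preferred one. Every such reference is replaced by the text captured by the n-th capturing subpattern. n can be from 0 to 99, and \\0 or $0 refers to the text matched by the whole pattern. Capturing subpatterns are numbered by counting their opening parentheses from left to right, starting from 1. To use a literal backslash in replacement, write four of them in the PHP string literal ("\\\\"; translator's note: PHP string escaping first reduces these to two, and preg_replace() then reads those two as one literal backslash).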
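
A short sketch of the backslash rule and of $0/$n references described above. The input strings are made up for illustration.

```php
<?php
// In a single-quoted PHP literal, '\\\\' holds two backslash characters.
// preg_replace() then reads those two as one literal backslash.
$path = preg_replace('/\//', '\\\\', 'a/b/c');
echo $path, "\n"; // a\b\c

// $0 is the whole match, $1 and $2 are the capturing groups.
$swapped = preg_replace('/(\w+)@(\w+)/', '$2 <- $1 [$0]', 'user@host');
echo $swapped, "\n"; // host <- user [user@host]
```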

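When working with a replacement pattern where a backreference is immediately followed by another digit (i.e. placing a literal number right after a matched pattern), you cannot use the \\1 notation for the backreference. \\11, for example, would confuse preg_replace(), since it cannot tell whether you want the \\1 backreference followed by a literal 1, or the \\11 backreference followed by nothing. In this case the solution is to use ${1}1. This creates an isolated $1 backreference followed by a literal 1.

When using the deprecated e modifier, this function escapes some characters (namely ', ", \ and NULL) in the strings that replace the backreferences. Make sure that, once the backreferences have been substituted, no syntax errors arise from single or double quotes (e.g. 'strlen(\'$1\')+strlen("$2")'). The result must be valid PHP string syntax and valid eval syntax, because after the substitution the engine evaluates the resulting string as PHP code with eval and uses the return value as the final replacement string.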

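subject

The string or an array of strings to search and replace.

If subject is an array, the search and replace is performed on every element of subject, and the return value is an array as well.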

limit

The maximum number of replacements for each pattern in each subject string. Defaults to -1 (no limit).
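
As a small illustration of how limit is counted separately for each pattern and each subject string (the inputs are invented for the example):

```php
<?php
// One replacement at most in each subject string.
$perSubject = preg_replace('/o/', '0', array('foo boo', 'zoo'), 1);
print_r($perSubject); // f0o boo, z0o

// One replacement at most for each pattern.
$perPattern = preg_replace(array('/a/', '/b/'), 'x', 'aabb', 1);
echo $perPattern, "\n"; // xaxb
```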

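count

If specified, this variable is filled with the number of replacements done.

Return Values

preg_replace() returns an array if subject is an array, or a string otherwise.

If matches are found, the new subject is returned; otherwise subject is returned unchanged. If an error occurs, NULL is returned.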

Errors/Exceptions

As of PHP 5.5.0, passing the "/e" modifier emits an E_DEPRECATED error. As of PHP 7.0.0 it emits an E_WARNING and the "/e" modifier has no effect.

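Changelog

Version	Description
7.0.0	Support for the /e modifier has been removed. Use preg_replace_callback() instead.
5.5.0	The /e modifier is deprecated. Use preg_replace_callback() instead. See the PREG_REPLACE_EVAL documentation for more information about the security risks.
5.1.0	Added the `count` parameter.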
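
The changelog points to preg_replace_callback() as the replacement for the /e modifier. A minimal sketch of such a migration, with an invented sample string, might look like this:

```php
<?php
// Old style, no longer supported as of PHP 7.0.0:
//   preg_replace('/\b(\w)/e', 'strtoupper("$1")', $text);
// New style: the callback receives the match array and returns the
// replacement text; $m[1] is the first capturing group.
$text = 'hello wide world';
$title = preg_replace_callback(
    '/\b(\w)/',
    function (array $m) {
        return strtoupper($m[1]);
    },
    $text
);

echo $title, "\n"; // Hello Wide World
```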

Examples

Example #1 Using backreferences followed by numeric literals


<?php
$string = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '${1}1,$3';
echo preg_replace($pattern, $replacement, $string);
?>

The above example will output:

April1,2003

Example #2 Using indexed arrays with preg_replace()


<?php
$string = 'The quick brown fox jumps over the lazy dog.';
$patterns = array();
$patterns[0] = '/quick/';
$patterns[1] = '/brown/';
$patterns[2] = '/fox/';
$replacements = array();
$replacements[2] = 'bear';
$replacements[1] = 'black';
$replacements[0] = 'slow';
echo preg_replace($patterns, $replacements, $string);
?>

The above example will output:

The bear black slow jumps over the lazy dog.

By sorting the patterns and replacements by key with ksort(), we get the expected result.


<?php
ksort($patterns);
ksort($replacements);
echo preg_replace($patterns, $replacements, $string);
?>

The above example will output:

The slow black bear jumps over the lazy dog.

Example #3 Replacing several values


<?php
$patterns = array ('/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/',
                   '/^\s*{(\w+)}\s*=/');
$replace = array ('\3/\4/\1\2', '$\1 =');
echo preg_replace($patterns, $replace, '{startDate} = 1999-5-27');
?>

The above example will output:

$startDate = 5/27/1999

Example #4 Strip whitespace

This example strips excess whitespace from a string.


<?php
$str = 'foo   o';
$str = preg_replace('/\s\s+/', ' ', $str);
// This will be 'foo o' now
echo $str;
?>

Example #5 Using the count parameter


<?php
$count = 0;

echo preg_replace(array('/\d/', '/\s/'), '*', 'xp 4 to', -1 , $count);
echo $count; //3
?>

The above example will output:

xp***to
3

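Notes

Note:
When using arrays with pattern and replacement, the keys are processed in the order they appear in the array. This is not necessarily the same as the numerical index order. If you use indexes to identify which pattern should be replaced by which replacement, you should perform a ksort() on each array prior to calling preg_replace().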

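See Also

PCRE Patterns
preg_quote() - Quote regular expression characters
preg_filter() - Perform a regular expression search and replace
preg_match() - Perform a regular expression match
preg_replace_callback() - Perform a regular expression search and replace using a callback
preg_split() - Split string by a regular expression
preg_last_error() - Returns the error code of the last PCRE regex execution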

User Contributed Notes

willbrownsberger at gmail dot com 20-Sep-2018 06:27


Worth knowing:  When arrays of patterns and replacements are provided, they are executed in the order they appear in the array -- so later array elements can act on the results of earlier array elements.



For example:



<?php



echo preg_replace(

    array( '#cat#', '#dog#', '#eel#', '#snowman#' ),

    array(  'dog1', 'eel2', 'snowman3', 'monster4' ),

    'the good cat and the bad dog wandered on the beach'



);



/* result: the good monster4321 and the bad monster432 wandered on the beach

*/

chrisbloom7 at gmail dot com 16-Aug-2018 08:01


Note that when given array arguments the replacement happens in sequence:



<?php

$p = array('/a/', '/b/', '/c/');

$r = array('b', 'c', 'd');

print_r(preg_replace($p, $r, 'a'));

// => d

?>

bublifuk at mailinator dot com 30-May-2018 03:36


A delimiter can be any ASCII non-alphanumeric, non-backslash, non-whitespace character:  !"#$%&'*+,./:;=?@^_`|~-  and  ({[<>]})
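
A few delimiter choices in practice; this is only a sketch with invented sample inputs:

```php
<?php
// '#' as delimiter: slashes inside the pattern need no escaping.
$a = preg_replace('#/+#', '/', 'usr//local///bin');

// Bracket-style delimiters end with the matching closing bracket.
$b = preg_replace('{\d+}', 'N', 'room 101');

// Modifiers still follow the closing delimiter.
$c = preg_replace('~HELLO~i', 'bye', 'Hello there');

echo $a, "\n"; // usr/local/bin
echo $b, "\n"; // room N
echo $c, "\n"; // bye there
```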

alves dot david at outlook dot com 28-Feb-2018 12:40


// Function to format a Brazilian tax ID (CPF or CNPJ) using preg_replace

if (!function_exists('cpf_cnpj')) {

    function cpf_cnpj($cpf_cnpj)

    {

        if (!in_array(strlen($cpf_cnpj), [11, 14])) {

            return $cpf_cnpj;

        }



        if (strlen($cpf_cnpj) == 11) {

            return preg_replace("/(\d{3})(\d{3})(\d{3})(\d{2})/", "$1.$2.$3-$4", $cpf_cnpj);

        } else {

            return preg_replace("/(\d{2})(\d{3})(\d{3})(\d{4})(\d{2})/", "$1.$2.$3/$4-$5", $cpf_cnpj);

        }

    }

}



echo cpf_cnpj(12345678901), ' - ', cpf_cnpj(12345678000190);



// 123.456.789-01 - 12.345.678/0001-90

creating SEO friendly Strings 09-Feb-2018 01:47


<?php



/**

  * prepares a string optimized for SEO

  * @see https://blog.ueffing.net/post/2016/03/14/string-seo-optimieren-creating-seo-friendly-url/

  * @param String $string 

  * @return String $string SEO optimized String

  */

function seofy ($sString = '')

{

    $sString = preg_replace('/[^\\pL\d_]+/u', '-', $sString);

    $sString = trim($sString, "-");

    $sString = iconv('utf-8', "us-ascii//TRANSLIT", $sString);

    $sString = strtolower($sString);

    $sString = preg_replace('/[^-a-z0-9_]+/', '', $sString);



    return $sString;

}



// Example

seofy('Straßenfest in München'); // => strassenfest-in-muenchen

seofy('José Ignacio López de Arriortúa'); // => jose-ignacio-lopez-de-arriortua



?>



The function seofy() creates an SEO-friendly version of a string. Umlauts and other letters not contained in the ASCII character set are either reduced to their base form (e.g. é becomes e and ú becomes u) or converted completely (e.g. ß becomes ss and ü becomes ue).



On the one hand this succeeds because the php function preg_replace performs the replacement by means of unicode - Unicode Regular Expressions - and on the other hand because an approximate translation is attempted by means of the php function iconv with the TRANSLIT option.



Quote php. net about iconv and TRANSLIT:

"If you append the character string //TRANSLIT to out_charset, transliteration is activated. This means that a character that cannot be displayed in the target character set can be approximated with one or more similar-looking characters.[...]"



Source:

https://blog.ueffing.net/post/2016/03/14/string-seo-optimieren-creating-seo-friendly-url/

aarefjev at gee mail.com 31-Oct-2017 12:54


To clean CSV document from double quotes, where numbers have "," (commas) inside.



Example: 489,62,"1,654",164.74,48.70,$80.56

Becomes: 489,62,1654,164.74,48.70,$80.56



$csv = preg_replace('%\"([^\"]*),([^\"]*)\"%','$1$2',$csv);

natan dot jesussouza at gmail dot com 11-Oct-2017 09:22


$firstname = htmlspecialchars($_POST['campo']);

$firstname = preg_replace("/[^a-zA-Z0-9]/", "", $firstname, -1, $count_fn);



// $count_fn counts how many characters were removed.

// $firstname is the variable that holds the input.

Benedikt dot Schoeffmann at gmail dot com 28-Jun-2017 09:06


If you want to filter out non-printable characters but keep German umlauts, use this:



$result = preg_replace("/[^[:print:]äöüÄÖÜß]/u", "", $string);

hortonmj at hotmail dot com 06-Jun-2016 06:22


If you want to avoid removing specific tags without allowing dangerous attributes you can replace them with a custom format before using strip_tags().



For example if you want to keep <p>:



 <?php

$text = "<p>hello world</p><script>alert('hacked')</script>";

$text = str_replace("<p>", "[[[temp-tag=p]]]", $text);

$text = str_replace("</p>", "[[[temp-tag=/p]]]", $text);

$text = strip_tags($text);

$text = str_replace("[[[temp-tag=p]]]", "<p>", $text);

$text = str_replace("[[[temp-tag=/p]]]", "</p>", $text);

echo $text; // displays <p>hello world</p>alert('hacked')

?>



If you wish to allow specific tag attributes, use preg_replace() like so:



 <?php

$text = preg_replace('/<(p class=".*?")>/', '[[[temp-tag=$1]]]', $text);

?>



Be careful doing this with href though. Make sure the attribute doesn't call JavaScript.



 <?php

$text = preg_replace('/href="\s*javascript:.*?"/i', "", $text);

$text = preg_replace('/<(a href=".*?")>/', '[[[temp-tag=$1]]]', $text);

?>

smcbride at msn dot com 28-Feb-2016 06:11


A nice and easy way to merge file path strings without worrying about if the path is suffixed with / or the file is prefixed with / is to use preg_replace.  There may be a function to get the endpoints, but this will also fixup any garbage in the middle as well.



Also...the previous post left out an important character to escape if you want to search for it ... / (slash).



   function mergepath($rootpath, $suffixpath){

      $completepath = $rootpath . "/" . $suffixpath;

      $completepath = preg_replace('/\/+/', '/', $completepath);

      return $completepath;

   }

helia at gmail dot com 08-Nov-2015 11:36


$pattern='/(09(1|2|3)\d{8})/';

$string ="n:09138660959 nu: 09371313317 nu:09211313317 n: 09393026988nu:09193472840nnu:09211313317nu:09211313317nu:09121772890";

$replacements='($1 code $2)';

echo  preg_replace($pattern, $replacements, $string);

safranil (at) safranil [dot] fr 20-May-2015 09:22


Warning, preg_replace() has an unexpected behaviour on UTF-8 strings when you use an empty regular expression like "/()/" !



If you build your regular expression in PHP like this : 



$words = array();

foreach (explode(" ", $what) as $w) {

    if (mb_strlen($w) > 0) {

        $words[] = preg_quote($w, "/");

    }

}

return preg_replace('/(' . implode("|",$words) . ')/iu', '<span class="text-maroon">\\1</span>', $text);



Always check if $words array isn't empty :



if (count($words) == 0)

    return $text;
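
Putting the two fragments above together, a self-contained version could look like the sketch below. The function name highlight_words() and the use of preg_split() are assumptions made for this example; they are not part of the original note.

```php
<?php
// Hypothetical helper: wraps every word of $what that occurs in $text in a <span>.
function highlight_words($what, $text)
{
    // PREG_SPLIT_NO_EMPTY drops empty pieces, so blank input yields an empty array.
    $words = preg_split('/\s+/u', $what, -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) === 0) {
        return $text; // nothing to search for; never build '/()/iu'
    }
    $quoted = array();
    foreach ($words as $w) {
        $quoted[] = preg_quote($w, '/');
    }
    return preg_replace(
        '/(' . implode('|', $quoted) . ')/iu',
        '<span class="text-maroon">$1</span>',
        $text
    );
}

echo highlight_words('fox', 'The Fox ran'), "\n";
// The <span class="text-maroon">Fox</span> ran
echo highlight_words('   ', 'The Fox ran'), "\n";
// The Fox ran
```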

logofero at gmail dot com 17-Apr-2015 04:13


Why not offset parameter to replace the string? It would be helpful



example:



mixed preg_replace (mixed $pattern, mixed $replacement, mixed $subject [, int $limit = -1 [, int & $count [, int $offset = 0]]]) 



1 $pattern

2 $replacement 

3 $subject

4 $limit

5 $count 

6 $offset <- it is planned?

Garrett Albright 30-Mar-2015 08:14


It may be useful to note that if you pass an associative array as the $subject parameter, the keys are preserved.



<?php

$replaced = preg_replace('/foo/', 'bar', ['first' => 'foobar', 'second' => 'barfoo']);

// $replaced is now ['first' => 'barbar', 'second' => 'barbar'].

?>

Nibbels 15-Mar-2015 02:46


I have been filtering every userinput with preg_replace since 6 Years now and nothing happened. I am running PHP 5.6.6 and because of historical reasons I still do not use mysqli.

Now i noticed that this filter [^0-9a-zA-Z_ -|:\.] won't filter anything from a Sleeping-Hack-String like `%' AnD sLeep(3) ANd '1%`:



preg_replace ( '/[^0-9a-zA-Z_ -|:\.]/', '', "%' AnD sLeep(3) ANd '1%" );



The reason is that the fourth minus sign (the one between the space and the |) has to be escaped! Left unescaped, it defines a character range from space to |, which covers almost every printable ASCII character.

Fix: [^0-9a-zA-Z_ \-|:\.]



I mention this because I did not know it, and I suspect that some older versions of PHP may not have required escaping this minus: these attacks did not get through in the old days, when I originally tested against them.
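
The difference between the two character classes can be checked directly. This sketch reuses the string from the note; the variable names are invented.

```php
<?php
$input = "%' AnD sLeep(3) ANd '1%";

// Unescaped: " -|" is a range from space (0x20) to "|" (0x7C), so every
// character of $input is allowed and nothing is removed.
$loose = preg_replace('/[^0-9a-zA-Z_ -|:\.]/', '', $input);

// Escaped: the hyphen is a literal character and the filter works.
$strict = preg_replace('/[^0-9a-zA-Z_ \-|:\.]/', '', $input);

var_dump($loose === $input); // bool(true)
var_dump($strict);           // string(17) " AnD sLeep3 ANd 1"
```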



Greetings

Jeroen 01-Feb-2015 01:25


Hello there,

I would like to share a snippet of regex (PHP) code that I wrote in 2012 for myself. It is also used in the Yerico sriptmerge plugin for Joomla, where it is marked as simple code. It compresses JavaScript code and removes all comments from it. It also works with MooTools. It is fast (in comparison to other PHP solutions), does not damage the JavaScript itself, and resolves lots of comment removal issues.



//START Remove comments.



   $buffer = str_replace('/// ', '///', $buffer);        

   $buffer = str_replace(',//', ', //', $buffer);

   $buffer = str_replace('{//', '{ //', $buffer);

   $buffer = str_replace('}//', '} //', $buffer);

   $buffer = str_replace('*//*', '*/  /*', $buffer);

   $buffer = str_replace('/**/', '/*  */', $buffer);

   $buffer = str_replace('*///', '*/ //', $buffer);

   $buffer = preg_replace("/\/\/.*\n\/\/.*\n/", "", $buffer);

   $buffer = preg_replace("/\s\/\/\".*/", "", $buffer);

   $buffer = preg_replace("/\/\/\n/", "\n", $buffer);

   $buffer = preg_replace("/\/\/\s.*.\n/", "\n  \n", $buffer);

   $buffer = preg_replace('/\/\/w[^w].*/', '', $buffer);

   $buffer = preg_replace('/\/\/s[^s].*/', '', $buffer);

   $buffer = preg_replace('/\/\/\*\*\*.*/', '', $buffer);

   $buffer = preg_replace('/\/\/\*\s\*\s\*.*/', '', $buffer);

   $buffer = preg_replace('/[^\*]\/\/[*].*/', '', $buffer);

   $buffer = preg_replace('/([;])\/\/.*/', '$1', $buffer);

   $buffer = preg_replace('/((\r)|(\n)|(\R)|([^0]1)|([^\"]\s*\-))(\/\/)(.*)/', '$1', $buffer);

   $buffer = preg_replace("/([^\*])[\/]+\/\*.*[^a-zA-Z0-9\s\-=+\|!@#$%^&()`~\[\]{};:\'\",<.>?]/", "$1", $buffer);

  $buffer = preg_replace("/\/\*/", "\n/*dddpp", $buffer);

  $buffer = preg_replace('/((\{\s*|:\s*)[\"\']\s*)(([^\{\};\"\']*)dddpp)/','$1$4', $buffer);

  $buffer = preg_replace("/\*\//", "xxxpp*/\n", $buffer);

  $buffer = preg_replace('/((\{\s*|:\s*|\[\s*)[\"\']\s*)(([^\};\"\']*)xxxpp)/','$1$4', $buffer);

  $buffer = preg_replace('/([\"\'])\s*\/\*/', '$1/*', $buffer);

  $buffer = preg_replace('/(\n)[^\'"]?\/\*dddpp.*?xxxpp\*\//s', '', $buffer);

  $buffer = preg_replace('/\n\/\*dddpp([^\s]*)/', '$1', $buffer);

  $buffer = preg_replace('/xxxpp\*\/\n([^\s]*)/', '*/$1', $buffer);

  $buffer = preg_replace('/xxxpp\*\/\n([\"])/', '$1', $buffer);

  $buffer = preg_replace('/(\*)\n*\s*(\/\*)\s*/', '$1$2$3', $buffer);

  $buffer = preg_replace('/(\*\/)\s*(\")/', '$1$2', $buffer);

  $buffer = preg_replace('/\/\*dddpp(\s*)/', '/*', $buffer);

  $buffer = preg_replace('/\n\s*\n/', "\n", $buffer);

  $buffer = preg_replace("/([^\'\"]\s*)<!--.*-->(?!(<\/div>)).*/","$1", $buffer);

  $buffer = preg_replace('/([^\n\w\-=+\|!@#$%^&*()`~\[\]{};:\'",<.>\/?\\\\])(\/\/)(.*)/', '$1', $buffer);



//END Remove comments.    



//START Remove all whitespaces



  $buffer = preg_replace('/\s+/', ' ', $buffer);

  $buffer = preg_replace('/\s*(?:(?=[=\-\+\|%&\*\)\[\]\{\};:\,\.\<\>\!\@\#\^`~]))/', '', $buffer);

  $buffer = preg_replace('/(?:(?<=[=\-\+\|%&\*\)\[\]\{\};:\,\.\<\>\?\!\@\#\^`~]))\s*/', '', $buffer);

  $buffer = preg_replace('/([^a-zA-Z0-9\s\-=+\|!@#$%^&*()`~\[\]{};:\'",<.>\/?])\s+([^a-zA-Z0-9\s\-=+\|!@#$%^&*()`~\[\]{};:\'",<.>\/?])/', '$1$2', $buffer);



//END Remove all whitespaces



I am of course not a programmer; I just wanted to make the plugin work the way I wanted it to.

(NOTE: For the webmaster: sorry, I posted this in the wrong topic before.)

eurosat7 at yahoo dot de 08-Aug-2014 10:12


If you want to limit multiple occurences of any char in a sequence you might want to use this function.

<?php

function limit_char_repeat($string,$maxrepeat){

    return preg_replace("/(.)\\1{".$maxrepeat.",}/ms",str_repeat('\1',$maxrepeat),$string);

}

?>



Example:

<?php

$string="

---------------------

Heeeeeeeeeeeeeeeeeeeello Woooooooooooooooorld!!!!!!!!!!!!!!!!!!!!!!!!

===============================================================================================================

~~~~~~~~~~~~~~~~ ~ ~ ~

";

echo limit_char_repeat($string,5);

?>

Output:

-----

Heeeeello Wooooorld!!!!!

=====

~~~~~ ~ ~ ~

elliot dot greene at att dot net 09-Jun-2014 04:37


preg_replace to only show alpha numeric characters



$info = "The Development of code . http://www.";



$info = preg_replace("/[^a-zA-Z0-9]+/", "", $info);



echo $info;



OUTPUTS: TheDevelopmentofcodehttpwww




spamthishard at wtriple dot com 12-Jun-2013 08:02


If you want to replace only the n-th occurrence of $pattern, you can use this function:



<?php



function preg_replace_nth($pattern, $replacement, $subject, $nth=1) {

    return preg_replace_callback($pattern,

        function($found) use (&$pattern, &$replacement, &$nth) {

                $nth--;

                if ($nth==0) return preg_replace($pattern, $replacement, reset($found) );

                return reset($found);

        }, $subject,$nth  );

}



echo preg_replace_nth("/(\w+)\|/", '${1} is the 4th|', "|aa|b|cc|dd|e|ff|gg|kkk|", 4);



?>



this outputs |aa|b|cc|dd is the 4th|e|ff|gg|kkk| 

backreferences are accepted in $replacement

Dustin 16-May-2013 11:53


Matching substrings where the match can exist at the end of the string was non-intuitive to me.



I found this because:

strtotime() interprets 'mon' as 'Monday', but Postgres uses interval types that return short names by default, e.g. interval '1 month' returns as '1 mon'.



I used something like this:



$str = "mon month monday Mon Monday Month MONTH MON";

$strMonth = preg_replace('~(mon)([^\w]|$)~i', '$1th$2', $str);

echo "$str\n$strMonth\n";



//to output:

mon month monday Mon Monday Month MONTH MON

month month monday Month Monday Month MONTH MONth

nik at rolls dot cc 17-Mar-2013 10:14


To split Pascal/CamelCase into Title Case (for example, converting descriptive class names for use in human-readable frontends), you can use the below function:



<?php

function expandCamelCase($source) {

  return preg_replace('/(?<!^)([A-Z][a-z]|(?<=[a-z])[^a-z]|(?<=[A-Z])[0-9_])/', ' $1', $source);

}

?>



Before:

  ExpandCamelCaseAPIDescriptorPHP5_3_4Version3_21Beta

After:

  Expand Camel Case API Descriptor PHP 5_3_4 Version 3_21 Beta

cincodenada at gmail dot dot dot com 30-Oct-2012 10:43


There seems to be some confusion over how greediness works.  For those familiar with Regular Expressions in other languages, particularly Perl: it works like you would expect, and as documented.  Greedy by default, un-greedy if you follow a quantifier with a question mark.



There is a PHP/PCRE-specific U pattern modifier that flips the greediness, so that quantifiers are by default un-greedy, and become greedy if you follow the quantifier with a question mark: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php



To make things clear, a series of examples:



<?php



$preview = "a bunch of stuff <code>this that</code> and more stuff <code>with a second code block</code> then extra at the end"; 



$preview_default = preg_replace('/<code>(.*)<\/code>/is', "<code class=\"prettyprint\">$1</code>", $preview);

$preview_manually_ungreedy = preg_replace('/<code>(.*?)<\/code>/is', "<code class=\"prettyprint\">$1</code>", $preview);



$preview_U_default = preg_replace('/<code>(.*)<\/code>/isU', "<code class=\"prettyprint\">$1</code>", $preview);

$preview_U_manually_greedy = preg_replace('/<code>(.*?)<\/code>/isU', "<code class=\"prettyprint\">$1</code>", $preview);



echo "Default, no ?: $preview_default\n";

echo "Default, with ?: $preview_manually_ungreedy\n";

echo "U flag, no ?: $preview_U_default\n";

echo "U flag, with ?: $preview_U_manually_greedy\n";



?>



Results in this:



Default, no ?: a bunch of stuff <code class="prettyprint">this that</code> and more stuff <code>with a second code block</code> then extra at the end

Default, with ?: a bunch of stuff <code class="prettyprint">this that</code> and more stuff <code class="prettyprint">with a second code block</code> then extra at the end

U flag, no ?: a bunch of stuff <code class="prettyprint">this that</code> and more stuff <code class="prettyprint">with a second code block</code> then extra at the end

U flag, with ?: a bunch of stuff <code class="prettyprint">this that</code> and more stuff <code>with a second code block</code> then extra at the end



As expected: greedy by default, ? inverts it to ungreedy.  With the U flag, un-greedy by default, ? makes it greedy.

Ray dot Paseur at SometimesUsesGmail dot com 13-Feb-2012 03:48


Please see Example #4 Strip whitespace.  This works as designed, but if you are using Windows, it may not work as expected.  The potential "gotcha" is the CR/LF line endings.  On a Unix system, where there is only a single character line ending, that regex pattern will preserve line endings.  On Windows, it may strip line endings.
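
The effect is easy to reproduce. In the sketch below the last pattern is one possible workaround and is an assumption of this example, not part of Example #4: it collapses only runs of spaces and tabs, so line endings survive.

```php
<?php
$unix    = "line one\nline two";
$windows = "line one\r\nline two";

// "\n" alone is a single whitespace character, so /\s\s+/ leaves it alone.
$unixOut = preg_replace('/\s\s+/', ' ', $unix);

// "\r\n" is two whitespace characters and collapses into one space.
$windowsOut = preg_replace('/\s\s+/', ' ', $windows);

// Possible workaround: only collapse horizontal whitespace.
$safeOut = preg_replace('/[ \t]{2,}/', ' ', "a  \t b\r\nc");

var_dump($unixOut === $unix); // bool(true)
var_dump($windowsOut);        // string(17) "line one line two"
var_dump($safeOut === "a b\r\nc"); // bool(true)
```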

hvishnu999 at gmail dot com 08-Jan-2012 06:25


To convert a string to an SEO-friendly form, do this:





<?php


$realname = "This is the string to be made SEO friendly!";





$seoname = preg_replace('/\%/',' percentage',$realname);


$seoname = preg_replace('/\@/',' at ',$seoname);


$seoname = preg_replace('/\&/',' and ',$seoname);


$seoname = preg_replace('/\s[\s]+/','-',$seoname);    // Strip off multiple spaces


$seoname = preg_replace('/[\s\W]+/','-',$seoname);    // Strip off spaces and non-alpha-numeric


$seoname = preg_replace('/^[\-]+/','',$seoname); // Strip off the starting hyphens


$seoname = preg_replace('/[\-]+$/','',$seoname); // // Strip off the ending hyphens


$seoname = strtolower($seoname);





echo $seoname;


?>





This will print: this-is-the-string-to-be-made-seo-friendly

denis_truffaut a t hotmail d o t com 23-Dec-2011 05:11


If you want to match characters from any script (European, Russian, Chinese, Japanese, Korean or whatever), just:

- use mb_internal_encoding('UTF-8');

- use preg_replace('`...`u', '...', $string) with the u (unicode) modifier



For further information, the complete list of preg_* modifiers could be found at :

http://php.net/manual/en/reference.pcre.pattern.modifiers.php
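
A small sketch of what the u modifier changes. The "\u{...}" escape used to build the test string needs PHP 7.0 or later and is only there to keep the example independent of the file encoding.

```php
<?php
// "\u{e4}" is U+00E4 (a with diaeresis), which takes two bytes in UTF-8.
$s = "\u{e4}b";

// Without /u the dot matches single bytes: three replacements.
$bytes = preg_replace('/./', '*', $s);

// With /u the dot matches whole UTF-8 characters: two replacements.
$chars = preg_replace('/./u', '*', $s);

echo $bytes, "\n"; // ***
echo $chars, "\n"; // **
```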

akarmenia at gmail dot com 22-Oct-2011 11:47


[Editor's note: in this case it would be wise to rely on the preg_quote() function instead which was added for this specific purpose]





If your replacement string has a dollar sign or a backslash. it may turn into a backreference accidentally! This will fix it.





I want to replace 'text' with '$12345' but this becomes a backreference to $12 (which doesn't exist) and then it prints the remaining '34'. The function down below will return a string that escapes the backreferences.





OUTPUT:


string(8) "some 345"


string(11) "some \12345"


string(8) "some 345"


string(11) "some $12345"





<?php





$a = 'some text';





// Either of these will backreference and fail


$b1 = '\12345'; // Should be '\\12345' to avoid backreference


$b2 = '$12345'; // Should be '\$12345' to avoid backreference





$d = array($b1, $b2);





foreach ($d as $b) {


    $result1 = preg_replace('#(text)#', $b, $a); // Fails


    var_dump($result1);


    $result2 = preg_replace('#(text)#', preg_escape_back($b), $a); // Succeeds


    var_dump($result2);


}





// Escape backreferences from string for use with regex


function preg_escape_back($string) {


    // Replace $ with \$ and \ with \\


    $string = preg_replace('#(?<!\\\\)(\\$|\\\\)#', '\\\\$1', $string);


    return $string;


}





?>
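
Another way to insert literal text, sketched here as an alternative rather than as the note author's method, is preg_replace_callback(): the value returned by the callback is used verbatim and is not scanned for $n or \n references.

```php
<?php
$literal = '$12345 \\1'; // the text "$12345 \1", full of would-be references

$result = preg_replace_callback(
    '#text#',
    function () use ($literal) {
        return $literal; // inserted as-is
    },
    'some text'
);

var_dump($result); // string(14) "some $12345 \1"
```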

erik dot stetina at gmail dot com 27-Sep-2011 03:25


simple function to remove comments from string



<?php

function remove_comments( & $string )

{

  $string = preg_replace("%(#|;|(//)).*%","",$string);

  $string = preg_replace("%/\*(?:(?!\*/).)*\*/%s","",$string); // google for negative lookahead

  return $string;

}

?>



USAGE:

<?php

$config = file_get_contents("config.cfg");

print "before:".$config;

remove_comments($config);

print "after:".$config;

?>



OUTPUT:

before:

/*

 *  this is config file

 */

; logdir

LOGDIR ./log/

// logfile

LOGFILE main.log

# loglevel

LOGLEVEL 3

after:



LOGDIR ./log/



LOGFILE main.log



LOGLEVEL 3

timitheenchanter 28-Jul-2011 01:34


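If you have issues where preg_replace returns an empty result (NULL, which prints as an empty string), please take a look at these two ini parameters:



pcre.backtrack_limit

pcre.recursion_limit



When this note was written both defaulted to 100K; pcre.backtrack_limit defaults to 1000000 as of PHP 5.3.7. If your buffer is larger than the limit, look to increase these two values.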
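
To notice such failures instead of silently working with an empty value, the NULL return can be checked explicitly. The wrapper below is a hypothetical helper written for this example; preg_last_error() and the PREG_*_ERROR constants are the standard PCRE error API.

```php
<?php
// Hypothetical wrapper: turns the NULL error return into an exception.
function checked_preg_replace($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_last_error() returns one of the PREG_*_ERROR constants,
        // e.g. PREG_BACKTRACK_LIMIT_ERROR or PREG_RECURSION_LIMIT_ERROR.
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }
    return $result;
}

echo checked_preg_replace('/\d+/', '#', 'order 66'), "\n"; // order #
```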

sreekanth at outsource-online dot net 19-Jul-2011 01:29


If you intend to encode and decode mod_rewrite URLs and handle them with PHP and MySQL, this should work.



to convert to url

$url = preg_replace('/[^A-Za-z0-9_-]+/', '-', $string);



To look up the URL value in MySQL, use the same expression without the '-'.

First rewrite the URL value in PHP using preg_replace, then use the result with MySQL REGEXP:



$sql = "select * from table where fieldname_to_check REGEXP '".preg_replace("/-+/",'[^A-Za-z0-9_]+',$url)."'";

anyvie at devlibre dot fr 17-Jun-2011 02:11


A variable can handle a huge quantity of data but preg_replace can't.



Example :

<?php

$url = "ANY URL WITH LOTS OF DATA";



// We get all the data into $data

$data = file_get_contents($url);



// We just want to keep the content of <head>

$head = preg_replace("#(.*)<head>(.*?)</head>(.*)#is", '$2', $data);

?>



$head can have the desired content, or be empty, depends on the length of $data.



For this application, just add :

$data = substr($data, 0, 4096);

before using preg_replace, and it will work fine.

someuser at dot dot com 08-Jun-2011 04:10


Replacement of line numbers, with replacement limit per line.



Solution that worked for me.

I have a file with tasks listed each starting from number, and only starting number should be removed because forth going text has piles of numbers to be omitted.



56 Patient A of 46 years suffering ... ...

57 Newborn of 26 weeks was ...

58 Jane, having age 18 years recollects onsets of ...

...

587 Patient of 70 years ...



etc.



<?php

// Array obtained from file    

$array = file($file, true);



// Decompile array with foreach loop

foreach($array as $value)

{

    //    Take away numbers 100-999

    //    Starting from biggest 

    //

    //    %            Delimiter

    //    ^            Make match from beginning of line

    //    [0-9]        Range of numbers

    //    {3}        Multiplication of digit range (For tree digit numbers)

    //

    if(preg_match('%^[0-9]{3}%', $value)) 

    {

        // Re-assing to value its modified copy

        $value = preg_replace('%^[0-9]{3}%', '-HERE WAS XXX NUMBER-', $value, 1);

    }

                

    // Take away numbers 10-99

    elseif(preg_match('%^[0-9]{2}%', $value)) {

        $value = preg_replace('%^[0-9]{2}%', '-HERE WAS XX NUMBER-', $value, 1);

    }

                

    // Take away numbers 0-9

    elseif(preg_match('%^[0-9]%', $value)) {

        $value = preg_replace('%^[0-9]%', '-HERE WAS X NUMBER-', $value, 1);

    }

                

    // Build array back

    $arr[] = array($value);

    

}

?>

craiga at craiga dot id dot au 15-May-2011 07:02


If there's a chance your replacement text contains any strings such as "$0.95", you'll need to escape those $n backreferences:



<?php

function escape_backreference($x)

{

    return preg_replace('/\$(\d)/', '\\\$$1', $x);

}

?>

me at perochak dot com 22-Feb-2011 09:57


If you would like to remove a tag along with the text inside it then use the following code.





<?php


preg_replace('/(<tag>.+?)+(<\/tag>)/i', '', $string);


?>





example


<?php $string='<span class="normalprice">55 PKR</span>'; ?>





<?php


$string = preg_replace('/(<span class="normalprice">.+?)+(<\/span>)/i', '', $string);


?>





This results in an empty string.





<?php


$string='My String <span class="normalprice">55 PKR</span>';





$string = preg_replace('/(<span class="normalprice">.+?)+(<\/span>)/i', '', $string);


?>





This results in "My String " (with a trailing space).

nospam at probackup dot nl 19-Jan-2011 05:03


Warning: a common mistake when trying to remove all characters except letters, digits and underscores from a string is to use a regex like preg_replace('[^A-Za-z0-9_]', '', ...). Without slashes around it, PHP takes the square brackets as the delimiters, so the pattern becomes ^A-Za-z0-9_ and the input typically comes back unchanged, for example when it contains two double quotes:



echo preg_replace('[^A-Za-z0-9_]', '', 'D"usseldorfer H"auptstrasse');



D"usseldorfer H"auptstrasse



It is important not to forget the leading and trailing forward slash in the regex (note that the space is removed as well, since it is not in the allowed set):



echo preg_replace('/[^A-Za-z0-9_]/', '', 'D"usseldorfer H"auptstrasse');



DusseldorferHauptstrasse



PS An alternative is to use preg_replace('/\W/', '', $t) for keeping all alpha numeric characters including underscores.
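
What the bracket-delimited pattern really matches can be shown with a contrived input that starts with the literal text A-Za-z0-9_ (the strings here are invented for the demonstration):

```php
<?php
// '[' and ']' act as delimiters, so the pattern is ^A-Za-z0-9_ :
// start of string followed by the literal text "A-Za-z0-9_".
$unchanged = preg_replace('[^A-Za-z0-9_]', '', 'D"usseldorf');
$stripped  = preg_replace('[^A-Za-z0-9_]', '', 'A-Za-z0-9_ D"usseldorf');

// With real delimiters the brackets form a negated character class.
$cleaned = preg_replace('/[^A-Za-z0-9_]/', '', 'D"usseldorf');

var_dump($unchanged); // string(11) "D"usseldorf"
var_dump($stripped);  // string(12) " D"usseldorf"
var_dump($cleaned);   // string(10) "Dusseldorf"
```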

ude dot mpco at wotsrabt dot maps-on 08-Dec-2010 10:30


I find it useful to output HTML form names to the user from time to time while going through the $_GET or $_POST on a user's submission and output keys of the GET or POST array... the only problem being in the name attribute I follow common programming guidelines and have names like the following: eventDate, eventTime, userEmail, etc. Not great to just output to the user-- so I came up with this function. It just adds a space before any uppercase letter in the string.





<?php


function caseSwitchToSpaces( $stringVariableName )


{





$pattern = '/([A-Z])/';


$replacement = ' ${1}';





return preg_replace( $pattern, $replacement, $stringVariableName );


}





//ex. 


echo( caseSwitchToSpaces( "helloWorld" ) );


?>





would output:





"hello World"





You could also do title-style casing to it if desired so the first word isn't lowercase.

Terminux (dot) anonymous at gmail 03-Dec-2010 09:58


This function will strip all the HTML-like content in a string.

I know you can find a lot of similar content on the web, but this one is simple, fast and robust. Don't simply use the built-in functions like strip_tags(), they dont work so good.



Careful, however: this does not validate a string. You should still use additional functions such as mysql_real_escape_string() (removed in PHP 7; use mysqli_real_escape_string() or prepared statements there) and filter_var(), as well as custom tests, before putting a submission into your database.



<?php 



$html = <<<END

<div id="function.preg-split" class="refentry"> Bonjour1 \t

<div class="refnamediv"> Bonjour2 \t

<h1 class="refname">Bonjour3 \t</h1>

<h1 class=""">Bonjour4 \t</h1>

<h1 class="*%1">Bonjour5 \t</h1>

<body>Bonjour6 \t<//body>>

</ body>Bonjour7 \t<////        body>>

<

a href="image.php" alt="trans" /        >

some leftover text...

     < DIV class=noCompliant style = "text-align:left;" >

... and some other ...

< dIv > < empty>  </ empty>

  <p> This is yet another text <br  >

     that wasn't <b>compliant</b> too... <br   />

     </p>

 <div class="noClass" > this one is better but we don't care anyway </div ><P>

    <input   type= "text"  name ='my "name' value  = "nothin really." readonly>

end of paragraph </p> </Div>   </div>   some trailing text 

END;



// This echoes correctly all the text that is not inside HTML tags

$html_reg = '/<+\s*\/*\s*([A-Z][A-Z0-9]*)\b[^>]*\/*\s*>+/i';

echo htmlentities( preg_replace( $html_reg, '', $html ) );



// This extracts only a small portion of the text

echo htmlentities(strip_tags($html));



?>

sergei dot garrison at gmail dot com 09-Mar-2010 01:40


If you want to add simple rich text functionality to HTML input fields, preg_replace can be quite handy.



For example, if you want users to be able to bold text by typing *text* or italicize it by typing _text_, you can use the following function.



<?php

function encode(&$text) {

    $text = preg_replace('/\*([^\*]+)\*/', '<b>\1</b>', $text);

    $text = preg_replace('/_([^_]+)_/', '<i>\1</i>', $text);

    return $text;

    }

?>



This works for nested tags, too, although it will not fix nesting mistakes.



To make this function more efficient, you could put the delimiters (* and _, in this case) and their HTML tag equivalents in an array and loop through them.
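The array-and-loop idea could look roughly like this. It is a sketch, not the author's code: the function name encodeMarkup and the $map layout are assumptions made for the example, and preg_quote() is used so that delimiters such as * are escaped automatically.

```php
<?php
// Hypothetical variant of encode() driven by a delimiter => tag map.
function encodeMarkup(string $text): string
{
    // delimiter character => HTML tag name
    $map = array('*' => 'b', '_' => 'i');

    foreach ($map as $delimiter => $tag) {
        // Escape the delimiter so it is safe inside the pattern.
        $d = preg_quote($delimiter, '/');

        $text = preg_replace(
            '/' . $d . '([^' . $d . ']+)' . $d . '/',
            '<' . $tag . '>$1</' . $tag . '>',
            $text
        );
    }

    return $text;
}

echo encodeMarkup('*bold* and _italic_'), "\n";
```

Adding a new delimiter then only means adding one entry to the map.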

hello at weblap dot ro 06-Mar-2010 04:10


Post slug generator, for creating clean urls from titles.


It works with many languages.





<?php


function remove_accent($str)


{


  $a = array('À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ', 'Ç', 'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î', 'Ï', 'Ð', 'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö', 'Ø', 'Ù', 'Ú', 'Û', 'Ü', 'Ý', 'ß', 'à', 'á', 'â', 'ã', 'ä', 'å', 'æ', 'ç', 'è', 'é', 'ê', 'ë', 'ì', 'í', 'î', 'ï', 'ñ', 'ò', 'ó', 'ô', 'õ', 'ö', 'ø', 'ù', 'ú', 'û', 'ü', 'ý', 'ÿ', 'Ā', 'ā', 'Ă', 'ă', 'Ą', 'ą', 'Ć', 'ć', 'Ĉ', 'ĉ', 'Ċ', 'ċ', 'Č', 'č', 'Ď', 'ď', 'Đ', 'đ', 'Ē', 'ē', 'Ĕ', 'ĕ', 'Ė', 'ė', 'Ę', 'ę', 'Ě', 'ě', 'Ĝ', 'ĝ', 'Ğ', 'ğ', 'Ġ', 'ġ', 'Ģ', 'ģ', 'Ĥ', 'ĥ', 'Ħ', 'ħ', 'Ĩ', 'ĩ', 'Ī', 'ī', 'Ĭ', 'ĭ', 'Į', 'į', 'İ', 'ı', 'Ĳ', 'ĳ', 'Ĵ', 'ĵ', 'Ķ', 'ķ', 'Ĺ', 'ĺ', 'Ļ', 'ļ', 'Ľ', 'ľ', 'Ŀ', 'ŀ', 'Ł', 'ł', 'Ń', 'ń', 'Ņ', 'ņ', 'Ň', 'ň', 'ŉ', 'Ō', 'ō', 'Ŏ', 'ŏ', 'Ő', 'ő', 'Œ', 'œ', 'Ŕ', 'ŕ', 'Ŗ', 'ŗ', 'Ř', 'ř', 'Ś', 'ś', 'Ŝ', 'ŝ', 'Ş', 'ş', 'Š', 'š', 'Ţ', 'ţ', 'Ť', 'ť', 'Ŧ', 'ŧ', 'Ũ', 'ũ', 'Ū', 'ū', 'Ŭ', 'ŭ', 'Ů', 'ů', 'Ű', 'ű', 'Ų', 'ų', 'Ŵ', 'ŵ', 'Ŷ', 'ŷ', 'Ÿ', 'Ź', 'ź', 'Ż', 'ż', 'Ž', 'ž', 'ſ', 'ƒ', 'Ơ', 'ơ', 'Ư', 'ư', 'Ǎ', 'ǎ', 'Ǐ', 'ǐ', 'Ǒ', 'ǒ', 'Ǔ', 'ǔ', 'Ǖ', 'ǖ', 'Ǘ', 'ǘ', 'Ǚ', 'ǚ', 'Ǜ', 'ǜ', 'Ǻ', 'ǻ', 'Ǽ', 'ǽ', 'Ǿ', 'ǿ');


  $b = array('A', 'A', 'A', 'A', 'A', 'A', 'AE', 'C', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I', 'D', 'N', 'O', 'O', 'O', 'O', 'O', 'O', 'U', 'U', 'U', 'U', 'Y', 's', 'a', 'a', 'a', 'a', 'a', 'a', 'ae', 'c', 'e', 'e', 'e', 'e', 'i', 'i', 'i', 'i', 'n', 'o', 'o', 'o', 'o', 'o', 'o', 'u', 'u', 'u', 'u', 'y', 'y', 'A', 'a', 'A', 'a', 'A', 'a', 'C', 'c', 'C', 'c', 'C', 'c', 'C', 'c', 'D', 'd', 'D', 'd', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'G', 'g', 'G', 'g', 'G', 'g', 'G', 'g', 'H', 'h', 'H', 'h', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'IJ', 'ij', 'J', 'j', 'K', 'k', 'L', 'l', 'L', 'l', 'L', 'l', 'L', 'l', 'l', 'l', 'N', 'n', 'N', 'n', 'N', 'n', 'n', 'O', 'o', 'O', 'o', 'O', 'o', 'OE', 'oe', 'R', 'r', 'R', 'r', 'R', 'r', 'S', 's', 'S', 's', 'S', 's', 'S', 's', 'T', 't', 'T', 't', 'T', 't', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'W', 'w', 'Y', 'y', 'Y', 'Z', 'z', 'Z', 'z', 'Z', 'z', 's', 'f', 'O', 'o', 'U', 'u', 'A', 'a', 'I', 'i', 'O', 'o', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'A', 'a', 'AE', 'ae', 'O', 'o');


  return str_replace($a, $b, $str);


}





function post_slug($str)


{


  return strtolower(preg_replace(array('/[^a-zA-Z0-9 -]/', '/[ -]+/', '/^-|-$/'), 


  array('', '-', ''), remove_accent($str)));


}


?>





Example: post_slug(' -Lo#&@rem  IPSUM //dolor-/sit - amet-/-consectetur! 12 -- ')


will output: lorem-ipsum-dolor-sit-amet-consectetur-12

arie dot benichou at gmail dot com 09-Mar-2009 12:50


<?php

//Be careful with UTF-8: even with Unicode and UTF-8 support enabled, a pretty odd bug occurs depending on your operating system

$str = "Hi, my name is Arié!<br />";

echo preg_replace('#\bArié\b#u', 'Gontran', $str);

//on Windows systems, the output is "Hi, my name is Gontran!<br />"

//on Unix systems, the output is "Hi, my name is Arié!<br />"

echo preg_replace('#\bArié(|\b)#u', 'Gontran', $str);

//on both Windows and Unix systems, the output is "Hi, my name is Gontran!<br />"

?>

arkani at iol dot pt 04-Mar-2009 11:00


Because I searched a lot for this:



The following characters should be escaped if you want to match them literally:



\ ^ . $ | ( ) [ ]

* + ? { } ,
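Rather than escaping these characters by hand, preg_quote() can do it. The snippet below is a small sketch with made-up sample strings; the second argument tells preg_quote() which delimiter the pattern uses so that it is escaped too.

```php
<?php
$needle  = '1+1=2 (really?)';
$subject = 'They say 1+1=2 (really?) every day';

// preg_quote() escapes regex metacharacters such as + = ( ) ?
// The second argument additionally escapes the "/" delimiter.
$pattern = '/' . preg_quote($needle, '/') . '/';

$result = preg_replace($pattern, '[math]', $subject);

echo $result, "\n"; // They say [math] every day
```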



Special Character Definitions

\ Quote the next metacharacter

^ Match the beginning of the line

. Match any character (except newline)

$ Match the end of the line (or before newline at the end)

| Alternation

() Grouping

[] Character class

* Match 0 or more times

+ Match 1 or more times

? Match 1 or 0 times

{n} Match exactly n times

{n,} Match at least n times

{n,m} Match at least n but not more than m times

More Special Character Stuff

\t tab (HT, TAB)

\n newline (LF, NL)

\r return (CR)

\f form feed (FF)

\a alarm (bell) (BEL)

\e escape (think troff) (ESC)

\033 octal char (think of a PDP-11)

\x1B hex char

\c[ control char

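\l lowercase next char (think vi) (Perl only; not supported by PCRE)

\u uppercase next char (think vi) (Perl only; not supported by PCRE)

\L lowercase till \E (think vi) (Perl only; not supported by PCRE)

\U uppercase till \E (think vi) (Perl only; not supported by PCRE)

\E end case modification (think vi); in PCRE, \E only ends a \Q quote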

\Q quote (disable) pattern metacharacters till \E

Even More Special Characters

\w Match a "word" character (alphanumeric plus "_")

\W Match a non-word character

\s Match a whitespace character

\S Match a non-whitespace character

\d Match a digit character

\D Match a non-digit character

\b Match a word boundary

\B Match a non-(word boundary)

\A Match only at beginning of string

\Z Match only at end of string, or before newline at the end

\z Match only at end of string

\G Match only where previous m//g left off (Perl, works only with /g); in PCRE, match only at the position where matching started

akam AT akameng DOT com 17-Feb-2009 04:02


<?php                    

$converted    = 

array(

//3 of special chars



'/(;)/ie', 

'/(#)/ie', 

'/(&)/ie', 



//MySQL reserved words!

//Check mysql website!

'/(ACTION)/ie', '/(ADD)/ie', '/(ALL)/ie', '/(ALTER)/ie', '/(ANALYZE)/ie', '/(AND)/ie', '/(AS)/ie', '/(ASC)/ie',



//remaining of special chars

'/(<)/ie', '/(>)/ie', '/(\.)/ie', '/(,)/ie', '/(\?)/ie', '/(`)/ie', '/(!)/ie', '/(@)/ie', '/(\$)/ie', '/(%)/ie', '/(\^)/ie', '/(\*)/ie', '/(\()/ie', '/(\))/ie', '/(_)/ie', '/(-)/ie', '/(\+)/ie', 

'/(=)/ie', '/(\/)/ie', '/(\|)/ie', '/(\\\)/ie', "/(')/ie", '/(")/ie', '/(:)/'

);



$input_text = preg_replace($converted, "UTF_to_Unicode('\\1')", $text);



function UTF_to_Unicode($data){



//return $data;

}

?>

The above example is useful for filtering input data before saving it into a MySQL database; it does not need to be decoded again, just use UTF-8 as the charset.

Please note the escaping of special characters between the delimiters.

mdrisser at gmail dot com 28-Nov-2008 10:13


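An alternative to the method suggested by sheri is to remember that the regex anchor '$', even with the /m modifier, only matches immediately before a "\n" or at the very end of the string, so in a CRLF-terminated line the "\r" sits between the dot and the '$'. The example given is a single string consisting of multiple lines, so you can match the line terminator explicitly.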



Try:

<?php

// Following is 1 string containing 3 lines

$s = "Testing, testing.\r\n"

   . "Another testing line.\r\n"

   . "Testing almost done.";



echo preg_replace('/\.\\r\\n/m', "@\r\n", $s); // double quotes, so the replacement contains a real CRLF

?>



This results in the string:

Testing, testing@\r\nAnother testing line@\r\nTesting almost done.

jette at nerdgirl dot dk 19-Nov-2008 04:47


I use this to prevent users from overdoing repeated text. The following function only allows 3 identical characters at a time and also takes care of repetitions with whitespace added.



This means that 'haaaaaaleluuuujaaaaa' becomes 'haaaleluuujaaa' and 'I am c o o o o o o l' becomes 'I am c o o o l'



<?php

//Example of user input

$str = "aaaaaaaaaaabbccccccccaaaaad d d d   d      d d ddde''''''''''''";



function stripRepeat($str) {

  //Do not allow repeated whitespace

  $str = preg_replace("/(\s){2,}/",'$1',$str);

  //Result: aaaaaaaaaaabbccccccccaaaaad d d d d d d ddde''''''''''''



  //Do not allow more than 3 identical characters separated by any whitespace 

  $str = preg_replace('{( ?.)\1{3,}}','$1$1$1',$str);

  //Final result: aaabbcccaaad d d ddde'''



  return $str;

}

?>



To prevent any repetitions of characters, you only need this:



<?php

$str = preg_replace('{(.)\1+}','$1',$str);

//Result: abcad d d d d d d de'

?>

7r6ivyeo at mail dot com 17-Nov-2008 09:25


String to filename:



<?php

function string_to_filename($word) {

    $tmp = preg_replace('/^\W+|\W+$/', '', $word); // remove all non-alphanumeric chars at begin & end of string

    $tmp = preg_replace('/\s+/', '_', $tmp); // compress internal whitespace and replace with _

    return strtolower(preg_replace('/[^\w-]/', '', $tmp)); // remove all non-alphanumeric chars except _ and -

}

?>



Returns a usable & readable filename.

tal at ashkenazi dot co dot il 14-Sep-2008 01:46


After a long time of trying to get rid of \n\r and <BR> stuff, I came up with this...

(I made some changes to the clicklein() function...)



<?php

    function clickable($url){

        $url                                    =    str_replace("\\r","\r",$url);

        $url                                    =    str_replace("\\n","\n<BR>",$url);

        $url                                    =    str_replace("\\n\\r","\n\r",$url);



        $in=array(

        '`((?:https?|ftp)://\S+[[:alnum:]]/?)`si',

        '`((?<!//)(www\.\S+[[:alnum:]]/?))`si'

        );

        $out=array(

        '<a href="$1"  rel=nofollow>$1</a> ',

        '<a href="http://$1" rel=\'nofollow\'>$1</a>'

        );

        return preg_replace($in,$out,$url);

    }



?>

dyer85 at gmail dot com 28-Aug-2008 07:41


There seems to be some unexpected behavior when using the /m modifier when the line terminators are win32 or mac format.



If you have a string like below, and try to replace dots, the regex won't replace correctly:



<?php

$s = "Testing, testing.\r\n"

   . "Another testing line.\r\n"

   . "Testing almost done.";



echo preg_replace('/\.$/m', '.@', $s); // only last . replaced

?>



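The /m modifier doesn't seem to work properly when CRLFs or CRs are used. PCRE normally treats only LF as a newline, so with CRLF line endings the "\r" sits between the dot and the end of the line and '$' never matches there. Make sure to convert line endings to LFs (*nix format) in your input string.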
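If converting the line endings is not an option, one possible workaround is to allow an optional "\r" before the end of the line. This is only a sketch of that idea using a lookahead; it is not part of the original report, and it does not cover old Mac line endings that use "\r" alone.

```php
<?php
$s = "Testing, testing.\r\n"
   . "Another testing line.\r\n"
   . "Testing almost done.";

// (?=\r?$) accepts an optional carriage return between the dot
// and the end of the line, without consuming it.
$result = preg_replace('/\.(?=\r?$)/m', '.@', $s);

echo $result, "\n";
```

All three dots are replaced and the CRLF line endings are left intact.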

Anonymous 30-Jul-2008 09:21


People who use functions like scandir() with user input and protect against "../" with preg_replace(): make sure you run the replacement repeatedly until preg_match() no longer finds a match, because otherwise the following can happen.



If a user gives the path:

"./....//....//....//....//....//....//....//"

then your script detects every "../" and removes them, leaving:

"./../../../../../../../"

Which is probably going back enough levels to reach the root.



I just found this vulnerability in an old script of mine, which was written several years ago.



Always do:

<?php

while( preg_match( [expression], $input ) )

{

   $input = preg_replace( [expression], "", $input );

}

?>
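A concrete version of that loop might look like the sketch below. The function name stripParentRefs and the pattern are assumptions chosen for the example; it uses the count argument of preg_replace() instead of a separate preg_match() call, and it is not a complete defence against path traversal on its own.

```php
<?php
// Hypothetical helper: keep removing "../" until none is left.
function stripParentRefs(string $path): string
{
    do {
        // $count receives the number of replacements made in this pass.
        $path = preg_replace('#\.\./#', '', $path, -1, $count);
    } while ($count > 0);

    return $path;
}

echo stripParentRefs('./....//....//....//'), "\n"; // ./
```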

marcin at pixaltic dot com 25-Jul-2008 03:56


<?php


    //:::replace with anything that you can do with searched string:::


    //Marcin Majchrzak


    //pixaltic.com


    


    $c = "2 4 8";


    echo ($c); //display:2 4 8





    $cp = "/(\d)\s(\d)\s(\d)/e"; //pattern


    $cr = "'\\3*\\2+\\1='.(('\\3')*('\\2')+('\\1'))"; //replacement


    $c = preg_replace($cp, $cr, $c);


    echo ($c); //display:8*4+2=34


?>
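Since the /e modifier is deprecated as of PHP 5.5 and no longer supported in PHP 7, the same calculation can be sketched with preg_replace_callback(). The closure below is an illustrative rewrite, not the original author's code.

```php
<?php
$c = '2 4 8';

$result = preg_replace_callback(
    '/(\d)\s(\d)\s(\d)/',
    function (array $m): string {
        // $m[1], $m[2] and $m[3] hold the three captured digits.
        $value = (int) $m[3] * (int) $m[2] + (int) $m[1];

        return "{$m[3]}*{$m[2]}+{$m[1]}={$value}";
    },
    $c
);

echo $result, "\n"; // 8*4+2=34
```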

David 11-Jul-2008 04:59


Take care when you try to collapse whitespace in UTF-8 text. Using something like:



<?php

$text = preg_replace( "{\s+}", ' ', $text );

?>



broke the letter à in my case: in UTF-8 it is encoded as the bytes c3 a0, and the byte a0 on its own can be treated as whitespace (it is the non-breaking space in Latin-1). So use



<?php

$text = preg_replace( "{[ \t]+}", ' ', $text );

?>



to strip all spaces and tabs, or better, use a multibyte function like mb_ereg_replace.
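Another option, assuming the input is valid UTF-8, is the /u modifier, which makes PCRE work on whole characters instead of single bytes. A minimal sketch with a made-up sample string:

```php
<?php
// "voilà" followed by mixed whitespace; \xC3\xA0 is the UTF-8 encoding of à.
$text = "voil\xC3\xA0 \t\n tout";

// With /u, \s is matched against characters, so the a0 byte
// inside à is never looked at in isolation.
$clean = preg_replace('/\s+/u', ' ', $text);

echo $clean, "\n";
```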

akniep at rayo dot info 07-Jul-2008 03:22


preg_replace (and other preg-functions) return null instead of a string when encountering problems you probably did not think about!

-------------------------



It may not be obvious to everybody that the function returns NULL if an error of any kind occurs. An error I stumble upon quite often is the backtracking limit:

http://de.php.net/manual/de/pcre.configuration.php#ini.pcre.backtrack-limit



When parsing HTML documents, you will encounter documents longer than 100,000 characters, which can cause certain regular expressions to fail because of the backtracking limit mentioned above.



A regular-expression that is ungreedy ("U", http://de.php.net/manual/de/reference.pcre.pattern.modifiers.php) often does the job, but still: sometimes you just need a greedy regular expression working on long strings ...



Since an unhandled return value of NULL usually causes follow-up errors in the application with unwanted and unforeseen consequences, I found the following solution quite helpful; at the very least it saves the application from crashing:



<?php



$string_after = preg_replace( '/some_regexp/', "replacement", $string_before );



// if some error occurred we go on working with the unchanged original string

if (PREG_NO_ERROR !== preg_last_error())

{

    $string_after = $string_before;

    

    // put email-sending or a log-message here

} //if



// free memory

unset( $string_before );



?>



You may, or rather should, also put a log message or an email notification into the if block so that you are informed as soon as one of your regular expressions does not have the desired effect.

da_pimp2004_966 at hotmail dot com 20-Jun-2008 11:09


A simple BB like thing..





<?php


function AddBB($var) {


        $search = array(


                '/\[b\](.*?)\[\/b\]/is',


                '/\[i\](.*?)\[\/i\]/is',


                '/\[u\](.*?)\[\/u\]/is',


                '/\[img\](.*?)\[\/img\]/is',


                '/\[url\](.*?)\[\/url\]/is',


                '/\[url\=(.*?)\](.*?)\[\/url\]/is'


                );





        $replace = array(


                '<strong>$1</strong>',


                '<em>$1</em>',


                '<u>$1</u>',


                '<img src="$1" />',


                '<a href="$1">$1</a>',


                '<a href="$1">$2</a>'


                );





        $var = preg_replace ($search, $replace, $var);


        return $var;


}


?>

Michael W 16-Apr-2008 03:35


For filename tidying I prefer to only ALLOW certain characters rather than converting particular ones that we want to exclude. To this end I use ...



<?php

  $allowed = "/[^a-z0-9\\040\\.\\-\\_\\\\]/i";

  $str = preg_replace($allowed, "", $str);

?>



Allows letters a-z, digits, space (\\040), hyphen (\\-), underscore (\\_) and backslash (\\\\), everything else is removed from the string.

php-comments-REMOVE dot ME at dotancohen dot com 29-Feb-2008 12:02


Below is a function for converting Hebrew final characters to their

normal equivalents should they appear in the middle of a word.

The \b assertion does not treat Hebrew letters as part of a word,

so I had to work around that limitation.



<?php



$text="????? ???????";



function hebrewNotWordEndSwitch ($from, $to, $text) {

   $text=

    preg_replace('/'.$from.'([?-?])/u','$2'.$to.'$1',$text);

   return $text;

}



do {

   $text_before=$text;

   $text=hebrewNotWordEndSwitch("?","?",$text);

   $text=hebrewNotWordEndSwitch("?","?",$text);

   $text=hebrewNotWordEndSwitch("?","?",$text);

   $text=hebrewNotWordEndSwitch("?","?",$text);

   $text=hebrewNotWordEndSwitch("?","?",$text);

}   while ( $text_before!=$text );



print $text; // ????? ??????!



?>



The do-while is necessary for multiple instances of letters, such

as "????" which would start off as "????". Note that there's still the

problem of acronyms with gershiim but that's not a difficult one

to solve. The code is in use at http://gibberish.co.il which you can

use to translate wrongly-encoded Hebrew, transliterate, and perform some

other Hebrew-related functions.



To ensure that there will be no regular characters at the end of a

word, just convert all regular characters to their final forms, then

run this function. Enjoy!

ulf dot reimers at tesa dot com 07-Dec-2007 10:28


Hi,



as I wasn't able to find another way to do this, I wrote a function converting any UTF-8 string into a correct NTFS filename (see http://en.wikipedia.org/wiki/Filename).



<?php

function strToNTFSFilename($string)

{

  $reserved = preg_quote('\/:*?"<>', '/');

  return preg_replace("/([\\x00-\\x1f{$reserved}])/", "_", $string);

}

?>



It converts all control characters and filename characters which are reserved by Windows ('\/:*?"<>') into an underscore.

This way you can safely create an NTFS filename out of any UTF-8 string.

mike dot hayward at mikeyskona dot co dot uk 18-Oct-2007 08:49


Hi.


Not sure if this will be a great help to anyone out there, but I thought I'd post it just in case.


I was having an issue with a project that relied on $_SERVER['REQUEST_URI'], which wasn't working on IIS.


(I am using mod_rewrite in Apache to call up pages from a database, and IIS doesn't set REQUEST_URI.) So I knocked up this simple little preg_replace to use the query string set by IIS when redirecting to a PHP error page.





<?php


//My little IIS hack :)


if(!isset($_SERVER['REQUEST_URI'])){ 


  $_SERVER['REQUEST_URI'] = preg_replace( '/404;([a-zA-Z]+:\/\/)(.*?)\//i', "/" , $_SERVER['QUERY_STRING'] );


}


?>





Hope this helps someone else out there trying to do the same thing :)

sternkinder at gmail dot com 24-Aug-2007 03:10


From what I can see, the problem is that if you simply substitute all 'A's with 'T's, you can no longer tell which 'T's to substitute with 'A's afterwards. This can be solved by first replacing all 'A's with a placeholder character (for instance '_' or whatever you like), then replacing all 'T's with 'A's, and finally replacing all '_'s (or whatever character you chose) with 'T's:





<?php


$dna = "AGTCTGCCCTAG";


echo str_replace(array("A","G","C","T","_","-"), array("_","-","G","A","T","C"), $dna); //output will be TCAGACGGGATC


?>





I don't know exactly how transliteration in Perl works (I remember that it is similar to the UNIX command "tr"), but I would suggest the following function for "switching" single chars:





<?php


function switch_chars($subject,$switch_table,$unused_char="_") {


    foreach ( $switch_table as $_1 => $_2 ) {


        $subject = str_replace($_1,$unused_char,$subject);


        $subject = str_replace($_2,$_1,$subject);


        $subject = str_replace($unused_char,$_2,$subject);


    }


    return $subject;


}





echo switch_chars("AGTCTGCCCTAG", array("A"=>"T","G"=>"C")); //output will be TCAGACGGGATC


?>
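For single-character swaps like this, PHP's built-in strtr() behaves much like the UNIX "tr" command: each character is translated once, so no placeholder is needed. A short sketch using the same DNA string:

```php
<?php
$dna = 'AGTCTGCCCTAG';

// Each character of the second argument maps to the character at the
// same position in the third: A->T, G->C, C->G, T->A.
$complement = strtr($dna, 'AGCT', 'TCGA');

echo $complement, "\n"; // TCAGACGGGATC
```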

rob at ubrio dot us 21-Aug-2007 01:48


Also worth noting is that you can use array_keys()/array_values() with preg_replace like:





<?php


$subs = array(


  '/\[b\](.+)\[\/b\]/Ui' => '<strong>$1</strong>',


  '/_(.+)_/Ui' => '<em>$1</em>'


  // ... more pattern => replacement pairs ...


);





$raw_text = '[b]this is bold[/b] and this is _italic!_';





$bb_text = preg_replace(array_keys($subs), array_values($subs), $raw_text);


?>

lehongviet at gmail dot com 25-Jul-2007 01:15


I had a problem echoing text that contains double quotes into a text field, because the quotes confuse the value attribute. I use the function below to match each pair of double quotes and replace it with smart quotes. Any leftover unpaired quote is replaced by a hyphen (-).





It works for me.





<?php


function smart_quotes($text) {


  $pattern = '/"((.)*?)"/i';


  $text = preg_replace($pattern,"&ldquo;\\1&rdquo;",stripslashes($text));


  $text = str_replace("\"","-",$text);


  $text = addslashes($text);


  return $text;


}


?>

131 dot php at cloudyks dot org 17-Jul-2007 04:37


Based on the previous comment, I suggest the following


(this function already exists in PHP 6)





<?php


function unicode_decode($str){


    return preg_replace(


        '#\\\u([0-9a-f]{4})#e',


        "unicode_value('\\1')",


        $str);


}





function unicode_value($code) {


    $value=hexdec($code);


    if($value<0x0080)


        return chr($value);


    elseif($value<0x0800)


        return chr((($value&0x07c0)>>6)|0xc0)


            .chr(($value&0x3f)|0x80);


    else


        return chr((($value&0xf000)>>12)|0xe0)


        .chr((($value&0x0fc0)>>6)|0x80)


        .chr(($value&0x3f)|0x80);


}


?>





[EDIT BY danbrown AT php DOT net:  This function originally written by mrozenoer AT overstream DOT net.]
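On PHP 7 and later, where /e is no longer available, the same decoding can be sketched with preg_replace_callback(). This is an assumption-laden rewrite, not the original function: the name unicodeDecodeCallback is made up, and html_entity_decode() is swapped in for the hand-written unicode_value() to turn the code point into UTF-8.

```php
<?php
// Hypothetical replacement for unicode_decode() without the /e modifier.
function unicodeDecodeCallback(string $str): string
{
    return preg_replace_callback(
        '#\\\\u([0-9a-f]{4})#i',
        function (array $m): string {
            // Build a numeric entity such as &#x00e9; and let PHP
            // convert it to the matching UTF-8 character.
            return html_entity_decode('&#x' . $m[1] . ';', ENT_QUOTES, 'UTF-8');
        },
        $str
    );
}

echo unicodeDecodeCallback('caf\u00e9'), "\n";
```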

mtsoft at mt-soft dot com dot ar 09-Jul-2007 02:30


This function takes a URL and returns a plain-text version of the page. It uses cURL to retrieve the page and a combination of regular expressions to strip all unwanted whitespace. This function will even strip the text from STYLE and SCRIPT tags, which are ignored by PHP functions such as strip_tags (they strip only the tags, leaving the text in the middle intact).





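The regular expressions are applied in 2 stages: the first pass strips scripts, styles, tags and comments and collapses runs of whitespace into a single newline, and the second pass cleans up leading and trailing whitespace. This avoids deleting single line breaks (which are also matched by \s) while still removing blank lines and repeated linefeeds or spaces.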





<?php


function webpage2txt($url)


{


$user_agent = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";





$ch = curl_init();    // initialize curl handle


curl_setopt($ch, CURLOPT_URL, $url); // set url to post to


curl_setopt($ch, CURLOPT_FAILONERROR, 1);              // Fail on errors


curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);    // allow redirects


curl_setopt($ch, CURLOPT_RETURNTRANSFER,1); // return into a variable


curl_setopt($ch, CURLOPT_PORT, 80);            //Set the port number


curl_setopt($ch, CURLOPT_TIMEOUT, 15); // times out after 15s





curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);





$document = curl_exec($ch);





$search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript


'@<style[^>]*?>.*?</style>@siU',    // Strip style tags properly


'@<[\/\!]*?[^<>]*?>@si',            // Strip out HTML tags


'@<![\s\S]*?-[ \t\n\r]*>@',         // Strip multi-line comments including CDATA


'/\s{2,}/',





);





$text = preg_replace($search, "\n", html_entity_decode($document));





$pat[0] = "/^\s+/";


$pat[2] = "/\s+\$/";


$rep[0] = "";


$rep[2] = " ";





$text = preg_replace($pat, $rep, trim($text));





return $text;


}


?>





Potential uses of this function are extracting keywords from a webpage, counting words and things like that. If you find it useful, drop us a comment and let us know where you used it.

ismith at nojunk dot motorola dot com 21-Mar-2007 09:47


Be aware that when using the "/u" modifier, if your input text contains any invalid UTF-8 byte sequences, then preg_replace will return NULL (which prints as an empty string), regardless of whether there were any matches.



This is due to the PCRE library returning an error code if the string contains bad UTF-8.
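The failure can be detected with preg_last_error(). The sketch below uses a deliberately broken sample string; what to do in the fallback branch (reject the input, re-encode it, log it) is up to the application.

```php
<?php
$bad = "abc\xFFdef"; // \xFF can never appear in valid UTF-8

$result = preg_replace('/b/u', 'x', $bad);

// Remember whether the call failed because of invalid UTF-8.
$failed = ($result === null && preg_last_error() === PREG_BAD_UTF8_ERROR);

if ($failed) {
    // Here we simply keep the original text.
    $result = $bad;
}

var_dump($failed);
```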

dani dot church at gmail dot youshouldknowthisone 07-Feb-2007 11:09


Note that it is in most cases much more efficient to use preg_replace_callback(), with a named function or an anonymous function (a closure, or create_function() on old PHP versions; create_function() is deprecated as of PHP 7.2), instead of the /e modifier.  When preg_replace() is called with the /e modifier, the interpreter must parse the replacement string into PHP code once for every replacement made, while preg_replace_callback() uses a function that only needs to be parsed once.

Alexey Lebedev 07-Sep-2006 02:21


Wasted several hours because of this:





<?php


$str='It&#039;s a string with HTML entities';


preg_replace('~&#(\d+);~e', 'code2utf($1)', $str);


?>





This code is supposed to convert numeric HTML entities to UTF-8, and it does, with one exception: it mishandles codes that start with &#0





The reason is that code2utf will be called with the leading zero, exactly as the pattern matched it: code2utf(039).


And it does matter! PHP treats an integer literal with a leading zero, such as 039, as an octal number.


Try <?php print(011); ?>





Solution:


<?php preg_replace('~&#0*(\d+);~e', 'code2utf($1)', $str); ?>
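With preg_replace_callback() the problem does not arise in the first place, because the capture reaches the callback as a string and is never parsed as PHP source. In this sketch html_entity_decode() stands in for the author's code2utf(), which is not shown in the note.

```php
<?php
$str = 'It&#039;s a string with HTML entities';

$decoded = preg_replace_callback(
    '~&#(\d+);~',
    function (array $m): string {
        // The capture is a string such as "039"; the (int) cast reads it
        // as decimal 39, so leading zeros are harmless here.
        return html_entity_decode('&#' . (int) $m[1] . ';', ENT_QUOTES, 'UTF-8');
    },
    $str
);

echo $decoded, "\n"; // It's a string with HTML entities
```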

gabe at mudbuginfo dot com 18-Oct-2004 01:39


It is useful to note that the 'limit' parameter, when used with 'pattern' and 'replace' which are arrays, applies to each individual pattern in the patterns array, and not the entire array.

<?php



$pattern = array('/one/', '/two/');

$replace = array('uno', 'dos');

$subject = "test one, one two, one two three";



echo preg_replace($pattern, $replace, $subject, 1);

?>



If limit were applied to the whole array (which it isn't), it would return:

test uno, one two, one two three



However, in reality this will actually return:

test uno, one dos, one two three

steven -a-t- acko dot net 08-Feb-2004 09:45


People using the /e modifier with preg_replace should be aware of the following weird behaviour. It is not a bug per se, but can cause bugs if you don't know it's there.



The example in the docs for /e suffers from this mistake in fact.



With /e, the replacement string is a PHP expression. So when you use a backreference in the replacement expression, you need to put the backreference inside quotes, or otherwise it would be interpreted as PHP code. Like the example from the manual for preg_replace:



preg_replace("/(<\/?)(\w+)([^>]*>)/e",

             "'\\1'.strtoupper('\\2').'\\3'",

             $html_body);



To make this easier, the data in a backreference with /e is run through addslashes() before being inserted in your replacement expression. So if you have the string



 He said: "You're here"



It would become:



 He said: \"You\'re here\"



...and be inserted into the expression.

However, if you put this inside a set of single quotes, PHP will not strip away all the slashes correctly! Try this:



 print ' He said: \"You\'re here\" ';

 Output: He said: \"You're here\"



This is because the sequence \" inside single quotes is not recognized as anything special, and it is output literally.



Using double-quotes to surround the string/backreference will not help either, because inside double-quotes, the sequence \' is not recognized and also output literally. And in fact, if you have any dollar signs in your data, they would be interpreted as PHP variables. So double-quotes are not an option.



The 'solution' is to manually fix it in your expression. It is easiest to use a separate processing function, and do the replacing there (i.e. use "my_processing_function('\\1')" or something similar as replacement expression, and do the fixing in that function).



If you surrounded your backreference with single quotes, it is the double quotes that end up corrupted (still escaped), which you can fix with:

$text = str_replace('\"', '"', $text);



People using preg_replace with /e should at least be aware of this.



I'm not sure how it would be best fixed in preg_replace. Because double-quotes are a really bad idea anyway (due to the variable expansion), I would suggest that preg_replace's auto-escaping is modified to suit the placement of backreferences inside single-quotes (which seemed to be the intention from the start, but was incorrectly applied).
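On current PHP versions the quoting problem can be avoided altogether with preg_replace_callback(), since the matched text is passed to the callback as-is and no addslashes() step is involved. A sketch of the manual's tag example rewritten that way, with a made-up sample string:

```php
<?php
$html_body = '<a title="You\'re here">link</a>';

$result = preg_replace_callback(
    '/(<\/?)(\w+)([^>]*>)/',
    function (array $m): string {
        // $m[1] = "<" or "</", $m[2] = tag name, $m[3] = rest of the tag.
        return $m[1] . strtoupper($m[2]) . $m[3];
    },
    $html_body
);

echo $result, "\n";
```

Both kinds of quotes in the attribute come through untouched.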

thewolf at pixelcarnage dot com 23-Oct-2003 07:38


I got sick of trying to replace just a word, so I decided I would write my own string replacement code. When that code became far too big and a little faulty, I decided to use a simple preg_replace:





<?php


/**


 * Written by Rowan Lewis


 * $search(string), the string to be searched for


 * $replace(string), the string to replace $search


 * $subject(string), the string to be searched in


 */


function word_replace($search, $replace, $subject) {


    return preg_replace('/[a-zA-Z]+/e', '\'\0\' == \'' . $search . '\' ? \'' . $replace . '\': \'\0\';', $subject);


}


?>





I hope that this code helps someone!
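Because the function above depends on the /e modifier, it will not run on PHP 7 and later. A possible equivalent, sketched with preg_replace_callback() and a hypothetical function name, compares each word inside a closure instead of building PHP code as a string:

```php
<?php
// Hypothetical /e-free version of word_replace().
function wordReplace(string $search, string $replace, string $subject): string
{
    return preg_replace_callback(
        '/[a-zA-Z]+/',
        function (array $m) use ($search, $replace): string {
            // Replace the word only when it matches $search exactly.
            return $m[0] === $search ? $replace : $m[0];
        },
        $subject
    );
}

echo wordReplace('cat', 'dog', 'cat concat cat.'), "\n"; // dog concat dog.
```

This also sidesteps the quoting issues that arise when $search or $replace contain quote characters.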