PHP sensitive word filtering advanced version

巴扎黑 (Original)
2016-11-10 13:34:43

We previously introduced a PHP program that filters certain special characters. This article upgrades that sensitive word filter so that it still works when users insert spaces or punctuation marks in the middle of a sensitive word.


Wherever users can post, advertisements or other sensitive words may appear, so the site needs a sensitive word filtering mechanism to stay "clean".

Filtering mechanism: regular-expression keyword matching in PHP

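// $str is the user data.
// Assumption: the original listing is cut off before it shows what happens on a
// match, so this version returns true when a sensitive word is found, else false.
function wordFilter($str)
{
    /*
     * Ways to store the sensitive words:
     * 1: in a TXT file (the common method)
     * 2: in a cache (the better method)
     *
     * I store them in memcached.
     */
    $words = getSensitiveWords();

    foreach ($words as $word) {
        $preg_letter = '/^[A-Za-z]+$/';
        if (preg_match($preg_letter, $word)) {
            // English word: compare in lower case and match it only as a whole word
            $str = strtolower($str);
            $w = strtolower($word);
            $pattern = '/([^A-Za-z]+' . $w . '[^A-Za-z]+)|([^A-Za-z]+' . $w . '\s+)|(\s+' . $w . '[^A-Za-z]+)|(^' . $w . '[^A-Za-z]+)|([^A-Za-z]+' . $w . '$)|(^' . $w . '$)/';
        } else {
            // Chinese word: match it anywhere in the string
            $pattern = '/\s*' . preg_quote($word, '/') . '\s*/';
        }
        if (preg_match($pattern, $str)) {
            return true;
        }
    }
    return false;
}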
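The listing calls getSensitiveWords() without showing it. As a rough sketch of the two storage methods mentioned in the comment, the loader below reads one word per line from a TXT file and, if the caller passes an already connected Memcached instance, keeps the list in the cache. The function name loadSensitiveWords, the cache key, and the one-hour lifetime are assumptions for illustration, not part of the original code.

```php
<?php
/**
 * Hypothetical loader for the sensitive word list.
 * $path  : TXT file with one word per line (assumed layout).
 * $cache : optional Memcached instance that is already connected.
 */
function loadSensitiveWords(string $path, ?Memcached $cache = null): array
{
    $key = 'sensitive_words'; // assumed cache key

    if ($cache !== null) {
        $cached = $cache->get($key); // returns false on a cache miss
        if (is_array($cached)) {
            return $cached;
        }
    }

    if (!is_readable($path)) {
        return [];
    }

    // Read the file line by line, skipping empty lines
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return [];
    }

    // Trim each word, drop duplicates and anything left empty
    $words = array_unique(array_map('trim', $lines));
    $words = array_values(array_filter($words, static function ($w) {
        return $w !== '';
    }));

    if ($cache !== null) {
        $cache->set($key, $words, 3600); // assumed lifetime: one hour
    }

    return $words;
}
```

With a loader like this, getSensitiveWords() could simply wrap loadSensitiveWords() with the site's own file path and cache connection.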
Existing problems:

Simple keyword matching is easy to defeat: users have many ways to evade the filter, such as inserting spaces or punctuation marks in the middle of a sensitive word.
Example:
Sensitive word: 扣扣

After user processing:
扣 扣
扣,扣
扣@扣
扣1扣
In these cases the regular expressions in the code above may fail to match.
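The weakness is easy to reproduce. The short sketch below uses a hypothetical sensitive word, badword, and a plain keyword pattern. Only the undisguised input is caught.

```php
<?php
// Hypothetical sensitive word and four disguised variants of it
$word = 'badword';
$pattern = '/' . preg_quote($word, '/') . '/i';

$inputs = ['badword', 'bad word', 'bad,word', 'bad@word', 'bad1word'];

$hits = [];
foreach ($inputs as $input) {
    // preg_match returns 1 when the pattern matches, 0 when it does not
    $hits[$input] = preg_match($pattern, $input) === 1;
}

foreach ($hits as $input => $hit) {
    echo $input, ' => ', $hit ? 'caught' : 'missed', PHP_EOL;
}
```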

Solution:

First remove all punctuation marks and certain special characters from the user data, then check the result for sensitive words.

Code:

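// Full-width punctuation and leftover "nbsp" text, which the [[:punct:]] pattern used below does not remove
$flag_arr = array('？', '！', '￥', '（', '）', '：', '‘', '’', '“', '”', '《', '》', '，', '…', '。', '、', 'nbsp', '】', '【', '～');
// Drop those characters, decode HTML entities, strip tags, then remove ASCII punctuation and all whitespace
$content_filter = preg_replace('/\s/', '', preg_replace("/[[:punct:]]/", '', strip_tags(html_entity_decode(str_replace($flag_arr, '', $content), ENT_QUOTES, 'UTF-8'))));

$content_filter is the processed user data. Pass it to wordFilter($content_filter) to run the sensitive word check.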
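Unrolled into separate steps, the clean-up is easier to read and to test. The following is only a sketch of the same idea: the function names normalizeContent and containsSensitiveWord and the word badword are hypothetical, the list of full-width characters is shortened, and a plain substring search stands in for wordFilter().

```php
<?php
// Remove punctuation, markup and whitespace before the sensitive word check.
function normalizeContent(string $content): string
{
    // Shortened list of full-width punctuation plus leftover "nbsp" text
    $flags = ['？', '！', '，', '。', '、', '：', '（', '）', '《', '》', 'nbsp'];

    $text = str_replace($flags, '', $content);              // 1. drop listed characters
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8'); // 2. decode HTML entities
    $text = strip_tags($text);                              // 3. remove HTML tags
    $text = (string) preg_replace('/[[:punct:]]/', '', $text); // 4. remove ASCII punctuation

    return (string) preg_replace('/\s/', '', $text);        // 5. remove whitespace
}

// Stand-in for wordFilter(): case-insensitive substring search on the cleaned text.
function containsSensitiveWord(string $content, string $word): bool
{
    return stripos(normalizeContent($content), $word) !== false;
}

var_dump(containsSensitiveWord('bad word', 'badword'));
var_dump(containsSensitiveWord('bad@word', 'badword'));
var_dump(containsSensitiveWord('bad1word', 'badword'));
```

Note that digits are not punctuation, so a variant such as bad1word passes through this clean-up unchanged. Stripping digits as well would catch it, but it could also alter legitimate content, so that choice depends on the site.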


