

PHP code for stripping HTML formatting

WBOY
2016-05-22 18:41:08
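When truncating a string, HTML markup often causes unexpected results, in ASP and in PHP alike. If the markup is simple and predictable, a plain replace is enough. For content such as an article body, which may contain any kind of HTML, the preg_replace code given in this article is the more efficient choice. It has been tested.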
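To make the truncation problem concrete, here is a minimal sketch. The sample string and variable names are made up for illustration, and it assumes the only tags present are the four listed, which is the "simple and predictable" case.

```php
<?php
// Hypothetical sample input, not from the original article.
$html = '<p>Hello <strong>world</strong></p>';

// Truncating the raw markup cuts a tag in half.
$cut = substr($html, 0, 12);
echo $cut, "\n"; // <p>Hello <st

// With a small, known set of tags, a plain replace is enough.
$plain = str_replace(array('<p>', '</p>', '<strong>', '</strong>'), '', $html);
echo substr($plain, 0, 8), "\n"; // Hello wo
```

Removing the tags before truncating means the cut can never land inside a tag.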

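$search = array ("'<script[^>]*?>.*?</script>'si", // strip JavaScript
         "'<[\/\!]*?[^<>]*?>'si",      // strip HTML tags
         "'([\r\n])[\s]+'",         // strip whitespace following a line break
         "'&(quot|#34);'i",         // replace HTML entities
         "'&(amp|#38);'i",
         "'&(lt|#60);'i",
         "'&(gt|#62);'i",
         "'&(nbsp|#160);'i",
         "'&(iexcl|#161);'i",
         "'&(cent|#162);'i",
         "'&(pound|#163);'i",
         "'&(copy|#169);'i");

$replace = array ("",
         "",
         "\\1",
         "\"",
         "&",
         "<",
         ">",
         " ",
         chr(161),
         chr(162),
         chr(163),
         chr(169));
// $document is the string to process. If it comes from a file, use $document = file_get_contents($filename);
$text = preg_replace($search, $replace, $document);
// Numeric entities such as &#65;: the original pattern "'&#(\d+);'e" ran "chr(\\1)" as PHP code.
// The /e modifier was removed in PHP 7.0, so a callback does the same job here.
$text = preg_replace_callback("'&#(\d+);'", function ($m) { return chr((int) $m[1]); }, $text);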
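For comparison, the same cleanup can be sketched with PHP's built-in strip_tags() and html_entity_decode() instead of the regex arrays. This is an alternative technique, not the article's method. The function name html_to_plain_text and the sample string are hypothetical, and the sketch assumes UTF-8 output is wanted, whereas chr() produces single-byte characters.

```php
<?php
// Hypothetical helper name; not part of the original article.
function html_to_plain_text($document)
{
    // Remove <script> blocks first, because strip_tags() keeps the text inside them.
    $text = preg_replace('#<script[^>]*?>.*?</script>#si', '', $document);
    // Drop all remaining tags.
    $text = strip_tags($text);
    // Decode named and numeric entities into UTF-8 characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return trim($text);
}

// Made-up sample input for illustration.
$sample = '<p>Tom &amp; Jerry &#169; 2016</p><script>alert(1);</script>';
echo html_to_plain_text($sample), "\n"; // Tom & Jerry © 2016
```

The built-ins cover every entity rather than a hand-picked list, at the cost of less control over how whitespace is collapsed.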