Encoding Chinese characters in URLs when fetching remote content with cURL in PHP

WBOY
Release: 2016-07-13 10:25:42

To encode URLs in PHP, you can use urlencode() or rawurlencode(). The difference is that urlencode() encodes a space as '+', while rawurlencode() encodes it as '%20'. Note that only part of the URL (typically a query value) should be encoded; otherwise the colons and slashes in the URL are escaped as well. Details follow:

string urlencode(string $str)

Returns a string in which every non-alphanumeric character except -_. is replaced with a percent sign (%) followed by two hexadecimal digits, and spaces are encoded as plus signs (+).
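To see why only part of a URL should be encoded, here is a minimal sketch. The host and search term are illustrative and not from the original article. It compares encoding the whole URL with encoding just the query value:

```php
<?php
// Illustrative URL; the search term contains a space.
$base  = 'http://www.example.com/s?wd=';
$value = 'hello world';

// Encoding the entire URL also escapes ':', '/', '?' and '='.
echo urlencode($base . $value), "\n";
// http%3A%2F%2Fwww.example.com%2Fs%3Fwd%3Dhello+world

// Encoding only the value leaves the URL structure intact.
echo $base . urlencode($value), "\n";
// http://www.example.com/s?wd=hello+world
```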
Example 1: The difference between urlencode function and rawurlencode function
$str = '博 客'; // Chinese for "blog", with a space between the two characters
echo urlencode($str);
echo "\n";
echo rawurlencode($str);

Results (these bytes correspond to a GBK-encoded source file):

%B2%A9+%BF%CD
%B2%A9%20%BF%CD
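Both functions work on raw bytes, so the same text produces different output depending on its character encoding. The %B2%A9 and %BF%CD sequences above match the GBK bytes of the two characters; that the original script was saved as GBK is an inference from those bytes, not something the article states. The sketch below writes the bytes out explicitly with hex escapes, so it does not depend on how the file itself is saved:

```php
<?php
// The text "博 客" written as explicit bytes in two encodings.
$gbk  = "\xB2\xA9 \xBF\xCD";         // GBK: 2 bytes per character
$utf8 = "\xE5\x8D\x9A \xE5\xAE\xA2"; // UTF-8: 3 bytes per character

echo urlencode($gbk), "\n";     // %B2%A9+%BF%CD
echo rawurlencode($gbk), "\n";  // %B2%A9%20%BF%CD
echo urlencode($utf8), "\n";    // %E5%8D%9A+%E5%AE%A2
echo rawurlencode($utf8), "\n"; // %E5%8D%9A%20%E5%AE%A2
```

If the remote site expects UTF-8 in its query string, the value has to be UTF-8 before it is encoded; percent-encoding does not convert between character sets.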

Example 2: URL Chinese encoding method
Convert the URL "http://www.baidu.com/s?wd=博 客" to "http://www.baidu.com/s?wd=%E5%8D%9A%20%E5%AE%A2":

$url = 'http://www.baidu.com/s?wd=博 客'; // source file saved as UTF-8
$arr = explode('=', $url, 2); // split on the first '=' only
$url = $arr[0] . '=' . rawurlencode($arr[1]);
echo $url;

Result:
http://www.baidu.com/s?wd=%E5%8D%9A%20%E5%AE%A2
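Splitting on '=' only works when the URL has a single parameter. When the parameters are available as an array, one possible alternative is the built-in http_build_query(), which encodes each value separately. In this sketch the 'pn' parameter is added purely for illustration, and the search term is written as UTF-8 bytes so the result does not depend on the file encoding:

```php
<?php
// Illustrative parameters; 'pn' is included only to show a second key.
$params = [
    'wd' => "\xE5\x8D\x9A \xE5\xAE\xA2", // UTF-8 bytes of "博 客"
    'pn' => 10,
];

// PHP_QUERY_RFC3986 encodes spaces as %20, in the style of rawurlencode().
$query = http_build_query($params, '', '&', PHP_QUERY_RFC3986);
$url   = 'http://www.baidu.com/s?' . $query;

echo $url, "\n";
// http://www.baidu.com/s?wd=%E5%8D%9A%20%E5%AE%A2&pn=10
```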
Alternatively, use the following URL-encoding function:

function cn_urlencode($url) {
    $pregstr = '/[\x{4e00}-\x{9fa5}]+/u'; // Chinese characters in a UTF-8 string
    if (preg_match_all($pregstr, $url, $matchArray)) { // collect each run of Chinese characters
        foreach ($matchArray[0] as $val) {
            $url = str_replace($val, urlencode($val), $url); // replace the run with its encoded form
        }
    }
    // Encode spaces as %20; str_replace() changes nothing when there are none
    return str_replace(' ', '%20', $url);
}

Result:
http://www.baidu.com/s?wd=%E5%8D%9A%20%E5%AE%A2
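To connect this back to fetching remote content, the following is a rough sketch and not the article's own method. encode_non_ascii() and fetch_url() are hypothetical helper names; the sketch assumes the cURL extension is installed and that the remote site accepts the URL's bytes in whatever character set they are already in. The helper percent-encodes every byte outside printable ASCII, spaces included, so it is not limited to the Chinese range used above:

```php
<?php
// Hypothetical helper: percent-encode spaces and all non-ASCII bytes,
// leaving ':', '/', '?', '=', '&' and existing %XX sequences untouched.
function encode_non_ascii($url) {
    return preg_replace_callback(
        '/[^\x21-\x7E]+/', // runs of bytes outside printable ASCII
        function ($m) { return rawurlencode($m[0]); },
        $url
    );
}

// Hypothetical helper: fetch a page with cURL after encoding the URL.
// Returns the response body, or false on failure.
function fetch_url($url) {
    $ch = curl_init(encode_non_ascii($url));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // seconds
    return curl_exec($ch);
}

echo encode_non_ascii("http://www.baidu.com/s?wd=\xE5\x8D\x9A \xE5\xAE\xA2"), "\n";
// http://www.baidu.com/s?wd=%E5%8D%9A%20%E5%AE%A2
```

Calling fetch_url() with the unencoded URL would then request the encoded form; the call is left out here because it needs network access.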
