1. Get the source code of the remote file (file_get_contents or fopen).
2. Analyze the code to extract the content you want (regular-expression matching is used here, usually to get the pagination first).
3. Download the extracted content and store it in the database.
The second step may have to be repeated several times: for example, we first parse the pagination addresses, and then parse each inner page to get the content we want.
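As a rough illustration of that two-pass matching, here is a minimal sketch that first extracts the pagination links and then parses each inner page; the URL and both regular expressions are hypothetical examples, not the author's actual patterns:
PHP code:
$list = @file_get_contents('http://example.com/list.php'); // fetch the listing page (hypothetical URL)
/* Pass 1: pull out the pagination links */
preg_match_all('/<a href="(list\.php\?page=\d+)">/is', $list, $pages);
foreach ($pages[1] as $page) {
    $inner = @file_get_contents('http://example.com/' . $page); // fetch one inner page
    /* Pass 2: extract the content you actually want */
    preg_match_all('/<h2>(.*?)<\/h2>/is', $inner, $items);
    /* ...download / insert $items[1] here... */
}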
Code:
I remember posting part of this code before; today I'll simply post it here again.
PHP code:
@$nl = file_get_contents($rs['url']); // Fetch the remote content
preg_match_all('/var url = "gameswf\/(.*?)\.swf";/is', $nl, $connect); // Regular-expression match to extract the content you want
mysql_query("insert...insert database part");
The above is the code used for all of the collection. Of course, you can also use fopen to do this; I personally prefer file_get_contents.
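For reference, here is a minimal sketch of the fopen alternative (my assumption of what it would look like, not the author's code); note that both approaches require allow_url_fopen to be enabled in php.ini:
PHP code:
$fp = @fopen($rs['url'], 'r');     // open the remote URL for reading
if ($fp) {
    $nl = '';
    while (!feof($fp)) {
        $nl .= fread($fp, 8192);   // read the remote content in chunks
    }
    fclose($fp);
}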
Now I'll share my method of downloading images and Flash files to the local machine. It is very simple: two lines of code.
PHP code:
if (@copy($url, $newurl)) {
    echo 'ok';
}
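For instance, with hypothetical URLs (copy() with a remote source also depends on allow_url_fopen being enabled):
PHP code:
$url = 'http://example.com/games/123.swf'; // remote file (hypothetical)
$newurl = '/var/www/swf/123.swf';          // local destination (hypothetical)
if (@copy($url, $newurl)) {
    echo 'ok';
}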
I previously posted an image-download function on the forum, and I'll post it here again for everyone.
PHP code:
/* Save a remote image locally */
function getimg($url, $filename){
    /* If the image URL is empty, stop the function */
    if ($url == "") {
        return false;
    }
    /* Get the image's extension and store it in the variable $ext */
    $ext = strrchr($url, ".");
    /* Check that it is a legal image file */
    if ($ext != ".gif" && $ext != ".jpg") {
        return false;
    }
    /* Read the image */
    $img = file_get_contents($url);
    /* Open the target file ("w" so an existing file is overwritten rather than appended to) */
    $fp = @fopen($filename . $ext, "w");
    if (!$fp) {
        return false;
    }
    /* Write the image to the target file */
    fwrite($fp, $img);
    /* Close the file */
    fclose($fp);
    /* Return the new file name of the image */
    return $filename . $ext;
}
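A quick usage example (the URL and path are hypothetical):
PHP code:
echo getimg('http://example.com/img/logo.jpg', '/var/www/images/logo');
// saves the file as /var/www/images/logo.jpg and prints the new file name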
Some personal collection experience to share:
1. Don't collect from sites that use hotlink protection. You can, in fact, fake the request's referer, but the collection cost for such sites is too high.
2. Prefer sites that respond as quickly as possible, and it is best to save the collected content locally.
3. When collecting, you can often save part of the data into the database first and process it in a later step.
4. You must handle errors when collecting. I usually skip an item if fetching it fails three times; in the past the collector would often get stuck on a single piece of content just because it could not be fetched (see the retry sketch after this list).
5. Before inserting into the database, you must validate the content carefully: check its legality and filter out unwanted strings.
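To illustrate points 4 and 5, here is a minimal sketch of a fetch-with-retry helper plus a simple pre-insert cleanup; the function names, the way the three-attempt limit is applied, and the filtering rules are my assumptions, not the author's original code:
PHP code:
/* Fetch a URL, retrying up to $max times before giving up (hypothetical helper) */
function fetch_with_retry($url, $max = 3) {
    for ($i = 0; $i < $max; $i++) {
        $html = @file_get_contents($url);
        if ($html !== false) {
            return $html;    // success
        }
        sleep(1);            // brief pause before the next attempt
    }
    return false;            // skip this item after $max failures
}

/* Strip tags and escape a string before it goes into the database (hypothetical helper) */
function clean_for_db($s) {
    $s = strip_tags($s);                 // filter out unwanted HTML
    $s = trim($s);
    return mysql_real_escape_string($s); // escape for the legacy mysql_* API used above
}

$html = fetch_with_retry($rs['url']);
if ($html !== false) {
    /* ...match, clean_for_db(), then insert... */
}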