PHP array PHP array implements code to exclude duplicate data from millions of data-PHP Tutorial-php.cn

PHP array PHP array implements code to exclude duplicate data from millions of data

WBOY

Release： 2016-07-29 08:43:02

Original

1097 people have browsed it

If you get a uid list with more than one million rows, the format is as follows:

Copy the code The code is as follows:

10001000
10001001
10001002
......
10001000
.... .
10001111　

In fact, it is easy to use the characteristics of PHP arrays to eliminate duplicates. Let’s first take a look at the definition of PHP arrays: The array in PHP is actually an ordered map. A map is a type that associates values to keys. This type is optimized in many ways, so it can be treated as a real array, or list (vector), hash table (which is an implementation of map), dictionary, set, stack, queue and many more possibilities. The value of an array element can also be another array. Tree structures and multidimensional arrays are also allowed.
　In PHP arrays, keys are also called indexes and are unique. We can use this feature to perform deduplication. The sample code is as follows:

Copy code The code is as follows:

< ;?php
//Define an array to store the results after deduplication
$result = array();
//Read the uid list file
$fp = fopen('test.txt', 'r') ;
while(!feof($fp))
{
$uid = fgets($fp);
$uid = trim($uid);
$uid = trim($uid, "r");
$uid = trim($uid, "n");
if($uid == '')
{
continue;
}
//Us uid as the key to see if the value exists
if(empty($result[$ uid]))
{
$result[$uid] = 1;
}
}
fclose($fp);
//Save the result to the file
$content = '';
foreach($result as $k => $v)
{
$content .= $k."n";
}
$fp = fopen('result.txt', 'w');
fwrite($fp, $content);
fclose($fp);
?>

With more than 20 lines of code, more than one million data can be deduplicated. The efficiency is also good and very practical. Mobile phone numbers and emails can also be deduplicated in this way.
Also, this method can also be used to deduplicate two files. If you have two uid list files, the format is the same as the uid list above. The sample program is as follows:

Copy code Code As follows:

//Define an array to store the results after deduplication
$result = array();
//Read the first uid list file and put it into $result_1
$fp = fopen('test_1.txt', 'r');
while(!feof($fp))
{
$uid = fgets($fp);
$uid = trim($uid);
$uid = trim($uid, "r");
$uid = trim($uid, "n");
if($uid == '')
{
continue;
}
//Write with uid as key $result, if there is a duplicate, it will be overwritten
$result[$uid] = 1;
}
fclose($fp);
//Read the second uid list file and perform deduplication operation
$fp = fopen ('test_2.txt', 'r');
while(!feof($fp))
{
$uid = fgets($fp);
$uid = trim($uid);
$uid = trim( $uid, "r");
$uid = trim($uid, "n");
if($uid == '')
{
continue;
}
//Use uid as the key to see the value Does it exist?
if(empty($result[$uid]))
{
$result[$uid] = 1;
}
}
fclose($fp);
//The ones saved in $result will be deduplicated. The results can be output to a file, and the code is omitted
?>

If you think about it carefully, it is not difficult to find that using this feature of arrays can solve more problems in our work.

The above introduces the implementation code of PHP array to eliminate duplicate data from millions of data, including the content of PHP array. I hope it will be helpful to friends who are interested in PHP tutorials.