Required to be imported into the database. The data table structure is:
The code is as follows: <?php /** * txt海量数据入库 * http://bbs.it-home.org */ $file = 'words.txt';//10W条记录的TXT源文件 $lines = file_get_contents($file); ini_set('memory_limit', '-1');//不要限制Mem大小,否则会报错 $line=explode("\n",$lines); $i=0; $sql="INSERT INTO words_sentiment (word,senti_type,senti_value,word_type) VALUES "; foreach($line as $key =>$li) { $arr=explode(" ",$li); $senti_value=$arr[2]-$arr[3]; if($senti_value!=0) { if($i>=20000&&$i<25000)//分批次导入,避免失败 { $mm=explode(" ",$arr[4]); foreach($mm as $m) //【adductive#1 adducting#1 adducent#1】这一个TXT记录要转换为3个SQL记录 { $nn=explode("#",$m); $word=$nn[0]; $sql.="(\"$word\",1,$senti_value,2),";//这个地方要注意到是 word有可能包含单引号(如jack's),因此我们要用双引号来包含word(注意转义) } } $i++; } } //echo $i; $sql=substr($sql,0,-1);//去掉最后一个逗号 //echo $sql; file_put_contents('20000-25000.txt', $sql); //批量导入数据库,5000条一次,大概需要40秒的样子;一次导入太多max_execution_time会不够,导致失败 ?> Copy after login Instructions: 1. When importing massive data, you should pay attention to some limitations of PHP. You can make temporary adjustments, otherwise an error will be reported. Allowed memory size of 33554432 bytes exhausted (tried to allocate 16 bytes) 2. PHP operates TXT files file_get_contents() file_put_contents() 3. When importing large amounts, it is best to import in batches to reduce the chance of failure. 4. Before mass import, the script must be tested multiple times before use, such as testing with 100 pieces of data. 5. After importing, if PHP’s mem_limit is still not enough, the program still cannot run. (It is recommended to increase mem_limit by modifying php.ini instead of using temporary statements) That’s it. Anyone who has the courage can also find a txt with a large amount of data to test, haha. |