Home > php教程 > php手册 > MySQL大批量数据插入,PHP之for不断插入时出现缓慢的解决方案及

MySQL大批量数据插入,PHP之for不断插入时出现缓慢的解决方案及

WBOY
Release: 2016-06-06 19:53:59
Original
1786 people have browsed it

很多时候为了测试数据库设计是否恰当,优化SQL语句,需要在表中插入大量的数据,怎么插入大量的数据就是个问题了。 最开始想到的办法就是写一个程序通过一个很大的循环来不停的插入,比如这样: 1 2 3 4 int i = LOOP_COUNT ; while ( i --= 0 ){ //insert d

很多时候为了测试数据库设计是否恰当,优化SQL语句,需要在表中插入大量的数据,怎么插入大量的数据就是个问题了。

最开始想到的办法就是写一个程序通过一个很大的循环来不停的插入,比如这样:

<span>1</span>
<span>2</span>
<span>3</span>
<span>4</span>
Copy after login
<code><span><span>int</span> <span>i</span> <span>=</span> <span>LOOP_COUNT</span><span>;</span>
</span><span><span>while</span><span>(</span><span>i</span><span>-->=</span><span>0</span><span>){</span>
</span><span>    <span>//insert data here.</span>
</span><span><span>}</span>
</span></code>
Copy after login

不过我在这么做的时候发现这样插入数据非常的慢,一秒钟插入的数据量还不到100条,于是想到不要一条一条的插入,而是通过

<span>1</span>
Copy after login
<code><span>INSERT</span> <span>INTO</span> <span>TABLE</span> <span>VALUES</span> <span>(),(),(),()...</span>
</code>
Copy after login

这样的方式来插入。于是修改程序为:

<span>1</span>
<span>2</span>
<span>3</span>
<span>4</span>
5
6
7
8
9
Copy after login
<code><span><span>int</span> <span>i</span> <span>=</span> <span>LOOP_COUNT</span><span>;</span>
</span><span><span>StringBuilder</span> <span>stringBuilder</span><span>;</span>
</span><span><span>while</span><span>(</span><span>i</span><span>-->=</span><span>0</span><span>){</span>
</span><span>    <span>if</span><span>(</span><span>LOOP_COUNT</span><span>!=</span><span>i</span> <span>&&</span> <span>i</span><span>%</span><span>5000</span><span>==</span><span>0</span><span>){</span>
</span><span>     <span>//通过insert values的方式插入这5000条数据并清空stringBuilder</span>
</span><span>    <span>}</span>
</span><span>    <span>stringBuilder</span><span>.</span><span>append</span><span>(</span><span>"(数据)"</span><span>);</span>
</span><span><span>}</span>
</span><span><span>//插入剩余的数据</span>
</span></code>
Copy after login

这样做的插入速度是上升了很多,不过如果想要插入大量的输入,比如上亿条,那么花费的时间还是非常长的。

查询MySQL的文档,发现了一个页面:LOAD DATA INFILE 光看这个名字,觉得有戏,于是仔细看了下。

官方对于这个命令的描述是:

<span>1</span>
<span>2</span>
<span>3</span>
<span>4</span>
5
6
7
8
9
10
11
12
13
14
15
16
Copy after login
<code><span><span>LOAD</span> <span>DATA</span> <span>[</span><span>LOW_PRIORITY</span> <span>|</span> <span>CONCURRENT</span><span>]</span> <span>[</span><span>LOCAL</span><span>]</span> <span>INFILE</span> <span>'</span><span>file_name</span><span>'</span>
</span><span>    <span>[</span><span>REPLACE</span> <span>|</span> <span>IGNORE</span><span>]</span>
</span><span>    <span>INTO</span> <span>TABLE</span> <span>tbl_name</span>
</span><span>    <span>[</span><span>CHARACTER</span> <span>SET</span> <span>charset_name</span><span>]</span>
</span><span>    <span>[{</span><span>FIELDS</span> <span>|</span> <span>COLUMNS</span><span>}</span>
</span><span>        <span>[</span><span>TERMINATED</span> <span>BY</span> <span>'</span><span>string</span><span>'</span><span>]</span>
</span><span>        <span>[[</span><span>OPTIONALLY</span><span>]</span> <span>ENCLOSED</span> <span>BY</span> <span>'</span><span>char</span><span>'</span><span>]</span>
</span><span>        <span>[</span><span>ESCAPED</span> <span>BY</span> <span>'</span><span>char</span><span>'</span><span>]</span>
</span><span>    <span>]</span>
</span><span>    <span>[</span><span>LINES</span>
</span><span>        <span>[</span><span>STARTING</span> <span>BY</span> <span>'</span><span>string</span><span>'</span><span>]</span>
</span><span>        <span>[</span><span>TERMINATED</span> <span>BY</span> <span>'</span><span>string</span><span>'</span><span>]</span>
</span><span>    <span>]</span>
</span><span>    <span>[</span><span>IGNORE</span> <span>number</span> <span>LINES</span><span>]</span>
</span><span>    <span>[(</span><span>col_name_or_user_var</span><span>,...)]</span>
</span><span>    <span>[</span><span>SET</span> <span>col_name</span> <span>=</span> <span>expr</span><span>,...]</span>
</span></code>
Copy after login

命令不复杂,具体的每个参数的意义和用法请看官方的解释 http://dev.mysql.com/doc/refman/5.5/en/load-data.html

那么现在做的就是生成数据了,我习惯用\t作为数据的分隔符、用\n作为一行的分隔符,所以生成数据的代码如下:

<span>1</span>
<span>2</span>
<span>3</span>
<span>4</span>
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
Copy after login
<code><span><span>long</span> <span>start</span> <span>=</span> <span>System</span><span>.</span><span>currentTimeMillis</span><span>()</span> <span>/</span> <span>1000</span><span>;</span>
</span><span><span>try</span> <span>{</span>
</span><span>    <span>File</span> <span>file</span> <span>=</span> <span>new</span> <span>File</span><span>(</span><span>FILE</span><span>);</span>
</span><span>    <span>if</span> <span>(</span><span>file</span><span>.</span><span>exists</span><span>())</span> <span>{</span>
</span><span>        <span>file</span><span>.</span><span>delete</span><span>();</span>
</span><span>    <span>}</span>
</span><span>    <span>file</span><span>.</span><span>createNewFile</span><span>();</span>
</span><span>    <span>FileOutputStream</span> <span>outStream</span> <span>=</span> <span>new</span> <span>FileOutputStream</span><span>(</span><span>file</span><span>,</span> <span>true</span><span>);</span>
</span><span>    <span>StringBuilder</span> <span>builder</span> <span>=</span> <span>new</span> <span>StringBuilder</span><span>(</span><span>10240</span><span>);</span>
</span><span>    <span>DateFormat</span> <span>dateFormat</span> <span>=</span> <span>new</span> <span>SimpleDateFormat</span><span>(</span><span>DATE_FORMAT</span><span>);</span>
</span><span>    <span>Random</span> <span>rand</span> <span>=</span> <span>new</span> <span>Random</span><span>();</span>
</span><span>    <span>String</span> <span>tmpDate</span> <span>=</span> <span>dateFormat</span><span>.</span><span>format</span><span>(</span><span>new</span> <span>Date</span><span>());</span>
</span><span>    <span>Long</span> <span>tmpTimestamp</span> <span>=</span> <span>System</span><span>.</span><span>currentTimeMillis</span><span>()</span> <span>/</span> <span>1000</span><span>;</span>
</span><span>    <span>int</span> <span>i</span> <span>=</span> <span>0</span><span>;</span>
</span><span>    <span>while</span> <span>(</span><span>i</span><span>++</span> <span> <span>LOOP</span><span>)</span> <span>{</span>
</span><span>        <span>if</span> <span>(</span><span>i</span> <span>></span> <span>0</span> <span>&&</span> <span>i</span> <span>%</span> <span>30000</span> <span>==</span> <span>0</span><span>)</span> <span>{</span>
</span><span>            <span>System</span><span>.</span><span>out</span><span>.</span><span>println</span><span>(</span><span>"write offset:"</span> <span>+</span> <span>i</span><span>);</span>
</span><span>            <span>outStream</span><span>.</span><span>write</span><span>(</span><span>builder</span><span>.</span><span>toString</span><span>().</span><span>getBytes</span><span>(</span><span>CHARCODE</span><span>));</span>
</span><span>            <span>builder</span> <span>=</span> <span>new</span> <span>StringBuilder</span><span>(</span><span>10240</span><span>);</span>
</span><span>        <span>}</span>
</span><span>        <span>if</span> <span>(</span><span>tmpTimestamp</span><span>.</span><span>compareTo</span><span>(</span><span>System</span><span>.</span><span>currentTimeMillis</span><span>()</span> <span>/</span> <span>1000</span><span>)</span> <span>!=</span> <span>0</span><span>)</span> <span>{</span>
</span><span>            <span>tmpDate</span> <span>=</span> <span>dateFormat</span><span>.</span><span>format</span><span>(</span><span>new</span> <span>Date</span><span>());</span>
</span><span>            <span>tmpTimestamp</span> <span>=</span> <span>System</span><span>.</span><span>currentTimeMillis</span><span>()</span> <span>/</span> <span>1000</span><span>;</span>
</span><span>        <span>}</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>tmpDate</span><span>);</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>"\t"</span><span>);</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>rand</span><span>.</span><span>nextInt</span><span>(</span><span>999</span><span>));</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>"\t"</span><span>);</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>Encrypt</span><span>.</span><span>md5</span><span>(</span><span>System</span><span>.</span><span>currentTimeMillis</span><span>()</span> <span>+</span> <span>""</span> <span>+</span> <span>rand</span><span>.</span><span>nextInt</span><span>(</span><span>99999999</span><span>)));</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>"\t"</span><span>);</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>rand</span><span>.</span><span>nextInt</span><span>(</span><span>999</span><span>)</span> <span>%</span> <span>2</span> <span>==</span> <span>0</span> <span>?</span> <span>"AA."</span> <span>:</span> <span>"BB"</span><span>);</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>"\t"</span><span>);</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>rand</span><span>.</span><span>nextFloat</span><span>()</span> <span>*</span> <span>2000</span><span>);</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>"\t"</span><span>);</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>rand</span><span>.</span><span>nextInt</span><span>(</span><span>9</span><span>));</span>
</span><span>        <span>builder</span><span>.</span><span>append</span><span>(</span><span>"\n"</span><span>);</span>
</span><span>    <span>}</span>
</span><span>    <span>System</span><span>.</span><span>out</span><span>.</span><span>println</span><span>(</span><span>"write data:"</span> <span>+</span> <span>i</span><span>);</span>
</span><span>    <span>outStream</span><span>.</span><span>write</span><span>(</span><span>builder</span><span>.</span><span>toString</span><span>().</span><span>getBytes</span><span>(</span><span>CHARCODE</span><span>));</span>
</span><span>    <span>outStream</span><span>.</span><span>close</span><span>();</span>
</span><span><span>}</span> <span>catch</span> <span>(</span><span>Exception</span> <span>e</span><span>)</span> <span>{</span>
</span><span>    <span>e</span><span>.</span><span>printStackTrace</span><span>();</span>
</span><span><span>}</span>
</span><span><span>System</span><span>.</span><span>out</span><span>.</span><span>println</span><span>(</span><span>System</span><span>.</span><span>currentTimeMillis</span><span>()</span> <span>/</span> <span>1000</span> <span>-</span> <span>start</span><span>);</span>
</span></span></code>
Copy after login

这段代码会生成一个数据文件,每一行为一条记录,然后再使用上面提到的 LOAD DATA 来导入数据就可以了,我在公司的电脑下(2G内存+垃圾双核CPU,MySQL直接装在windows下,没任何优化,developer模式)每秒能达到近万条的插入速度,比其他方式都快很多。

另外如果想直接用GUI工具操作也可以,比如SQLYog中,右键要导入的表,选择Import – Import CSV Data Using Load Local.. 然后设置好编码、分隔符后就可以直接导入了。


Related labels:
source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Recommendations
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template