I want to add more than 2,000 columns to an existing CSV file that already has more than 8,000 columns. The file is too large to load into memory at once, so I want to write the new columns into the existing CSV one column at a time. I have tried many approaches and none of them work. How can I solve this?
PHPz2017-05-18 10:56:39
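Read the file line by line and add the columns to each line as you go. A CSV is normally just a comma-separated text file, so you can process it like any other text file. The general procedure is:
1. Read one line.
2. Split the string on commas into an array.
3. Append the values for your new columns to the array.
4. Join the array back together with commas.
5. Write the line to a new file.
6. Repeat until you reach the end of the file.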
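The steps above could be sketched in Python roughly as follows. This is only an illustration: the function name `add_columns` and its parameters `new_headers` and `make_values` are hypothetical, and it assumes the first line of the file is a header row. It uses the standard `csv` module rather than a plain split on commas, because quoted fields may themselves contain commas.

```python
import csv

def add_columns(src_path, dst_path, new_headers, make_values):
    """Stream src_path to dst_path, appending new columns to every row.

    new_headers: names of the columns to add (hypothetical parameter).
    make_values: callable(row_number, row) returning the new cell values
                 for that row (hypothetical parameter).
    Returns the number of data rows written.
    """
    with open(src_path, newline='', encoding='utf-8') as src, \
         open(dst_path, 'w', newline='', encoding='utf-8') as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        # Assumption: the first row is the header
        header = next(reader)
        writer.writerow(header + list(new_headers))
        count = 0
        # Only one row is held in memory at a time
        for count, row in enumerate(reader, start=1):
            writer.writerow(row + list(make_values(count, row)))
    return count
```

Because the result goes to a new file, the original CSV stays untouched until you decide to replace it.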
怪我咯2017-05-18 10:56:39
pandas supports reading a file in chunks. Example code:
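import pandas as pd

# iterator=True returns a reader object instead of loading the whole file
reader = pd.read_csv('a.csv', iterator=True)
header = True
while True:
    try:
        # Read the next 10,000 rows
        df = reader.get_chunk(10000)
    except StopIteration:
        # No rows left
        break
    # Add the new columns to this chunk
    df['new_column'] = 'value'
    # Append the chunk to the new CSV
    df.to_csv('b.csv', mode='a', index=False, header=header)
    # Write the header row only once
    header = False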
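
The question does not say where the new columns come from. If they happen to live in a second CSV file with the same number of rows in the same order (an assumption, not something stated in the thread), a sketch along these lines could read both files in matching chunks and join them side by side. The function name `append_columns_chunked` and its parameters are hypothetical.

```python
import pandas as pd

def append_columns_chunked(base_path, extra_path, out_path, chunksize=10000):
    """Write base_path plus the columns of extra_path to out_path, chunk by chunk.

    Assumption: both files have the same number of rows in the same order.
    Returns the number of data rows written.
    """
    base_chunks = pd.read_csv(base_path, chunksize=chunksize)
    extra_chunks = pd.read_csv(extra_path, chunksize=chunksize)
    first = True
    rows = 0
    for base, extra in zip(base_chunks, extra_chunks):
        # Reset the indexes so the two chunks align by position
        combined = pd.concat(
            [base.reset_index(drop=True), extra.reset_index(drop=True)],
            axis=1,
        )
        # Overwrite on the first chunk, append afterwards;
        # write the header row only once
        combined.to_csv(out_path, mode='w' if first else 'a',
                        index=False, header=first)
        first = False
        rows += len(combined)
    return rows
```

Only one chunk from each file is in memory at a time, so `chunksize` can be lowered if rows with 10,000 columns are still too heavy.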