Should I use pickle or CSV for intermediate data in Python?
给我你的怀抱 2017-05-18 11:00:47

I have many CSV files on hand, each a few hundred MB, and I often need to read and plot this data with pandas and matplotlib. Before plotting, I usually have to preprocess, slice, and otherwise clean the data. Because the figures need to be interacted with and reported on frequently, I work in Jupyter Notebook with %matplotlib notebook. Should the intermediate data generated from the original data be saved as CSV, so that it can be read back directly for the next display, or should it be saved with pickle, which would be faster to read for later use?


All replies (2)
洪涛

CSV is the safe choice. A pickle file written under one Python version may fail to load under another, and pickle is not a universal format. For files of a few hundred megabytes, reading CSV is actually not slow. Beyond that, there is also HDF5. CSV and HDF5 are proper data exchange formats.
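To make the trade-off concrete, here is a minimal sketch that round-trips a small DataFrame through both formats. It assumes pandas is installed. The column names, file names, and sample values are made up for illustration and are not from the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical intermediate result; column names are illustrative only.
df = pd.DataFrame(
    {
        "timestamp": pd.date_range("2017-05-18", periods=4, freq="D"),
        "value": [0.5, 1.5, 2.5, 3.5],
    }
)

tmpdir = tempfile.mkdtemp()
csv_path = os.path.join(tmpdir, "intermediate.csv")
pkl_path = os.path.join(tmpdir, "intermediate.pkl")

# CSV: plain text that any tool can read, but dtypes are not stored.
df.to_csv(csv_path, index=False)
raw = pd.read_csv(csv_path)  # timestamp comes back as text
from_csv = pd.read_csv(csv_path, parse_dates=["timestamp"])  # re-parse dates

# pickle: keeps dtypes as they were, but the file is tied to Python/pandas.
df.to_pickle(pkl_path)
from_pkl = pd.read_pickle(pkl_path)

print(raw.dtypes)
print(from_csv.dtypes)
print(from_pkl.dtypes)
```

The practical cost of CSV is having to restate things like parse_dates on every load. The cost of pickle is that the file may only be readable from a compatible Python and pandas environment.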

Peter_Zhu

CSV is enough. If you find it is not fast enough, you can try an HDF5 file.
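A rough sketch of that suggestion, assuming pandas is installed: pandas writes HDF5 through the optional PyTables package, so the hypothetical helper below falls back to CSV when PyTables is missing. The helper name, the key "intermediate", and the sample data are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd


def save_and_reload(df, stem):
    """Round-trip df through HDF5 if PyTables is installed, else through CSV."""
    try:
        path = stem + ".h5"
        # mode="w" overwrites any existing file at this path.
        df.to_hdf(path, key="intermediate", mode="w")
        return pd.read_hdf(path, key="intermediate"), "hdf5"
    except ImportError:
        # PyTables is not available, so use plain CSV instead.
        path = stem + ".csv"
        df.to_csv(path, index=False)
        return pd.read_csv(path), "csv"


# Hypothetical cleaned data; column names are illustrative only.
df = pd.DataFrame({"sample": [1, 2, 3], "value": [0.5, 1.5, 2.5]})

stem = os.path.join(tempfile.mkdtemp(), "intermediate")
restored, used = save_and_reload(df, stem)
print(used, restored.shape)
```

Whether HDF5 is actually faster for a given dataset is worth timing in the notebook (for example with %timeit) before switching formats.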
