Detailed explanation of how Python reads text data and converts it into DataFrame format
This article explains how to read text data in Python and convert it into a pandas DataFrame, and what to watch out for along the way. It works through a practical case.
I saw a question like this in a technical Q&A, and I thought it was relatively common, so I wrote a separate article about it.
Read data from the plain text format file "file_in" in the following format:
The data needs to be written out as "file_out" in the following format:
In the original file each line has the form "Category:Content", and a blank line ("\n") separates one sub-entry from the next. After conversion, each sub-entry occupies a single line, with its contents written out in category order.
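To make the conversion concrete, here is a minimal sketch in plain Python. The sample text is hypothetical: the field names nam and age and the values lili and jim follow the comments in the script in this article, while the ages are made up for illustration. The helper name to_lines is also an assumption, not part of the original question.

```python
# Hypothetical sample in the "Category:Content" layout; the ages are made up.
SAMPLE = "nam:lili\nage:18\n\nnam:jim\nage:20\n"


def to_lines(text):
    """Turn blank-line-separated "key:value" blocks into one line per block."""
    lines_out = []
    # A blank line ("\n\n" in the raw text) separates one sub-entry from the next.
    for block in text.strip().split("\n\n"):
        values = []
        for line in block.splitlines():
            if not line.strip():
                continue
            # Split on the first colon only, so a value may itself contain ":".
            _, value = line.split(":", 1)
            values.append(value.strip())
        # Join the values in the order their categories appeared.
        lines_out.append(",".join(values))
    return lines_out


print(to_lines(SAMPLE))  # ['lili,18', 'jim,20']
```

This only shows the shape of the transformation. It assumes every sub-entry lists its categories in the same order.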
After reading the data, I recommend loading it into a pandas DataFrame, which makes later processing easier. Because the original layout is not the usual table format, some simple preprocessing is needed first.
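# coding: utf8
from pandas import DataFrame  # a DataFrame usually holds a two-dimensional table; pandas is a popular data-analysis package

# Build a dictionary whose keys and values are both read from the file.
# The keys are nam, age, ...; the values are lili, jim, ...
dict_data = {}

# Open the file
with open('file_in.txt', 'r') as df:
    # Read each line
    for line in df:
        # Skip blank lines (lines holding only a newline or whitespace)
        if not line.strip():
            continue
        # Strip surrounding whitespace, then split on the first ":" only,
        # so that a value may itself contain a colon
        key, value = line.strip().split(':', 1)
        # Append the value to the list stored under its key
        dict_data.setdefault(key, []).append(value)

# print(dict_data) to check the result

# Collect the keys into a list
columnsname = list(dict_data.keys())

# Create a DataFrame whose column names are the keys, i.e. nam, age, ...
frame = DataFrame(dict_data, columns=columnsname)

# Write the DataFrame to a file, without the row index or the column names
frame.to_csv('file_out0.txt', index=False, header=False)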
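The script above collects one list per category, which works when every sub-entry contains every category. If one sub-entry omits a line, the lists end up with different lengths and the DataFrame constructor raises a ValueError. The following variant is a sketch, not the original author's method: it builds one dict per sub-entry and lets pandas fill the missing cells with NaN. The function name read_records and the sample input are assumptions made for illustration.

```python
import io

import pandas as pd


def read_records(lines):
    """Group "key:value" lines into one dict per blank-line-separated entry."""
    records, current = [], {}
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line closes the current entry, if there is one.
            if current:
                records.append(current)
                current = {}
            continue
        # partition splits on the first ":" only.
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    # The file may not end with a blank line, so flush the last entry.
    if current:
        records.append(current)
    return records


# Hypothetical input: the second entry has no "age" line.
sample = io.StringIO("nam:lili\nage:18\n\nnam:jim\n")
frame = pd.DataFrame(read_records(sample))
print(frame)
```

To process a real file, pass the open file object instead of the StringIO sample, then call frame.to_csv with index=False and header=False as in the script above. Missing cells are written as empty fields by default.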