How to Handle UTF8 Encoding in Python When Reading CSV Files?

Mary-Kate Olsen
Release: 2024-11-02 14:10:30

Reading a UTF8 CSV File with Python

CSV files, commonly used for data exchange, often contain accented characters, which must be decoded with the correct encoding (here UTF-8) to survive intact. The csv module in Python 2, however, does not accept Unicode input: its reader works on 8-bit byte strings.

Problem

Reading a UTF-8 CSV file that contains accented French or Spanish characters raised the following exception, even though the code tried to handle the UTF-8 encoding:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 68: ordinal not in range(128)
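
The traceback above is from Python 2, where calling encode on a byte string first decodes it implicitly as ASCII. As a rough illustration, the Python 3 sketch below makes that decode step explicit and reproduces the same kind of failure. The sample text is made up for the example and is not taken from the original file.

```python
# Python 3 sketch; the sample text is an assumption for illustration.
text = "café"                     # a Unicode string
data = text.encode("utf-8")       # encode: Unicode string -> byte string
print(data)                       # b'caf\xc3\xa9'

# decode goes the other way: byte string -> Unicode string.
assert data.decode("utf-8") == text

# Decoding the same bytes as ASCII fails on the first byte above 0x7F.
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    error_message = str(exc)
    print(error_message)
```

The byte 0xc3 in the message is the first byte of the two-byte UTF-8 sequence for an accented letter, which is why it shows up so often in errors of this kind.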

Solution

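The solution lies in understanding the purpose of the encode method. It converts Unicode strings into byte strings, not vice versa, so calling it on data that is already a byte string makes Python 2 decode that data implicitly as ASCII first, which fails on accented characters. For reading UTF-8 text files in general, the codecs module, and codecs.open in particular, is the better tool. The Python 2 csv module, however, expects UTF-8 byte strings as input, which is exactly what a plainly opened file provides, so the code can be simplified to decode each cell after parsing: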

<code class="python">import csv

def unicode_csv_reader(utf8_data, dialect=csv.excel, **kwargs):
    # Python 2: csv.reader parses byte strings, so decode each cell afterwards.
    csv_reader = csv.reader(utf8_data, dialect=dialect, **kwargs)
    for row in csv_reader:
        yield [unicode(cell, 'utf-8') for cell in row]

filename = 'da.csv'
# Open in binary mode, as the Python 2 csv module expects.
with open(filename, 'rb') as csv_file:
    for field1, field2, field3 in unicode_csv_reader(csv_file):
        print field1, field2, field3</code>
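
The code above targets Python 2. In Python 3 the csv module works on Unicode strings, so the decoding can move to the point where the file is opened. The following is a minimal sketch under that assumption; the helper name read_utf8_csv and the sample rows are hypothetical and only serve to make the example run on its own.

```python
import csv
import os
import tempfile

def read_utf8_csv(path, dialect=csv.excel, **kwargs):
    """Yield rows of str cells from a UTF-8 encoded CSV file (Python 3)."""
    # newline="" lets the csv module handle line endings inside quoted fields.
    with open(path, encoding="utf-8", newline="") as handle:
        yield from csv.reader(handle, dialect=dialect, **kwargs)

# Write a small sample file (made-up data) so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", newline="", suffix=".csv", delete=False
) as tmp:
    csv.writer(tmp).writerows(
        [["José", "Málaga", "niño"], ["Zoé", "Orléans", "élève"]]
    )
    sample_path = tmp.name

rows = list(read_utf8_csv(sample_path))
os.remove(sample_path)

# Each cell is already a Unicode string; no per-cell decoding is needed.
first_names = [field1 for field1, field2, field3 in rows]
```

Passing the encoding to open keeps the decoding in one place, and the accented characters arrive intact in every cell.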

Note

If the input data is not UTF-8 but, for example, ISO-8859-1, each line must be transcoded to UTF-8 before it reaches the csv module:

<code class="python">line.decode('whateverweirdcodec').encode('utf-8')</code>

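However, this is often unnecessary: the csv module handles ISO-8859-* encoded byte strings directly, so it is usually enough to pass the file's actual encoding name instead of 'utf-8' when decoding each cell.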
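
Both routes can be sketched in Python 3 as follows. This is an illustration only: the sample bytes are made up, and ISO-8859-1 stands in for whatever encoding the real file uses.

```python
import csv
import io

# Sample bytes as they might sit in an ISO-8859-1 file (made-up data).
latin1_bytes = "nom,ville\nRenée,Besançon\n".encode("iso-8859-1")

# Route 1: transcode to UTF-8 bytes, mirroring the one-liner above.
utf8_bytes = latin1_bytes.decode("iso-8859-1").encode("utf-8")

# Route 2: skip the transcoding and decode with the file's real encoding.
text_stream = io.TextIOWrapper(
    io.BytesIO(latin1_bytes), encoding="iso-8859-1", newline=""
)
rows = list(csv.reader(text_stream))
```

The second route avoids an extra pass over the data, but it only works if the actual encoding of the file is known.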

source:php.cn