
How to Import and Process Nested JSON Data into Pandas DataFrames?

Linda Hamilton
Release: 2024-10-24 11:40:02

Reading Nested JSON Files as Pandas DataFrames

JSON data that contains nested objects usually has to be converted into a tabular form before it can be analyzed or manipulated. Pandas provides tools such as json_normalize for this.

Scenario:

Consider a JSON file with the following structure:

<code class="json">{
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        { ... },
        { ... },
        { ... }
    ]
}</code>

Using json_normalize:

The json_normalize function allows you to flatten nested JSON into a DataFrame. For the given JSON, you can do the following:

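<code class="python">import json

import pandas as pd

# Parse the file into a plain Python dictionary first
with open('myJson.json') as data_file:
    data = json.load(data_file)

# One row per entry in 'locations'; top-level fields are repeated on each row
df = pd.json_normalize(data, 'locations', ['date', 'number', 'name'],
                       record_prefix='locations_')
print(df)</code>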

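This creates a DataFrame with one row per entry in locations. Each field of a location becomes a column prefixed with locations_, and the top-level date, number, and name values are repeated on every row.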
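As a rough illustration of the resulting layout, the sketch below runs json_normalize on an in-memory dictionary. The fields inside each location (name and arrTime) and their values are assumptions made up for this example, since the original file does not show them.

```python
import pandas as pd

# Hypothetical sample data: the keys inside each location are assumed,
# because the original JSON elides them.
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Station A", "arrTime": "08:00"},
        {"name": "Station B", "arrTime": "08:15"},
        {"name": "Station C", "arrTime": "08:30"},
    ],
}

# record_path selects the nested list; meta lists the top-level fields to repeat
df = pd.json_normalize(
    data,
    record_path="locations",
    meta=["date", "number", "name"],
    record_prefix="locations_",
)

print(df.columns.tolist())
# ['locations_name', 'locations_arrTime', 'date', 'number', 'name']
print(len(df))
# 3
```

With this sample data the record_prefix is not optional: a location field called name would otherwise collide with the top-level name, and pandas raises a ValueError asking for a distinguishing prefix.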

Extending to Keep Nested Data:

If you prefer to keep the nested array intact, build the DataFrame from the parsed dictionary instead of flattening it. Wrapping data (loaded above with json.load) in a list produces a single row whose locations column holds the list of dictionaries.

<code class="python">df = pd.DataFrame([data])</code>

Alternatively, to work with the locations alone, pass the list to the DataFrame constructor. Each location becomes a row, and the top-level fields are dropped:

<code class="python">df = pd.DataFrame(data['locations'])</code>
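One way to get both shapes with the plain DataFrame constructor is sketched below. It is a minimal example under assumptions: the location fields and values are hypothetical, and the data is defined inline rather than read from myJson.json.

```python
import pandas as pd

# Hypothetical sample data: the keys inside each location are assumed.
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Station A", "arrTime": "08:00"},
        {"name": "Station B", "arrTime": "08:15"},
        {"name": "Station C", "arrTime": "08:30"},
    ],
}

# Wrapping the dictionary in a list gives a single row;
# the 'locations' cell keeps the original list of dictionaries.
nested = pd.DataFrame([data])

# Passing only the nested list gives one row per location,
# without the top-level fields.
locations_only = pd.DataFrame(data["locations"])

print(nested.shape)
print(locations_only.columns.tolist())
```

The first form is convenient when the record should stay a single unit, the second when only the nested entries matter.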

Concatenating Nested Values:

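If you want to join the location values into a single string per record, group by the top-level fields and join the values. This assumes the locations column has already been reduced to one string per row (for example, a single field extracted from each location object), because ','.join only works on strings: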

<code class="python">df = df.groupby(['date', 'name', 'number'])['locations'].apply(','.join).reset_index()</code>
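The sketch below shows the join end to end on a flattened frame. The location fields are hypothetical, and because the frame comes from json_normalize with a prefix, the joined column is locations_name rather than locations.

```python
import pandas as pd

# Hypothetical sample data: the keys inside each location are assumed.
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Station A", "arrTime": "08:00"},
        {"name": "Station B", "arrTime": "08:15"},
        {"name": "Station C", "arrTime": "08:30"},
    ],
}

# Flatten first: one row per location, top-level fields repeated
df = pd.json_normalize(
    data, "locations", ["date", "number", "name"], record_prefix="locations_"
)

# Collapse back to one row per record, joining the location names
joined = (
    df.groupby(["date", "name", "number"])["locations_name"]
    .apply(",".join)
    .reset_index()
)

print(joined.loc[0, "locations_name"])
# Station A,Station B,Station C
```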
