How to Import and Process Nested JSON Data into Pandas DataFrames?-Python Tutorial-php.cn

How to Import and Process Nested JSON Data into Pandas DataFrames?

Linda Hamilton

Release: 2024-10-24 11:40:02

Reading Nested JSON Files as Pandas DataFrames

When JSON data contains nested objects or arrays, you often need to flatten it into a tabular structure before you can analyze or manipulate it. Pandas provides json_normalize and read_json for this purpose.

Scenario:

Consider a JSON file with the following structure:

<code class="json">{
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        { ... },
        { ... },
        { ... }
    ]
}</code>

Using json_normalize:

The json_normalize function allows you to flatten nested JSON into a DataFrame. For the given JSON, you can do the following:

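<code class="python">import json

import pandas as pd

# Load the whole file into a Python dictionary.
with open('myJson.json') as data_file:
    data = json.load(data_file)

# One row per entry in 'locations'; date, number and name are repeated as metadata.
df = pd.json_normalize(data, 'locations', ['date', 'number', 'name'],
                       record_prefix='locations_')
print(df)</code>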

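This creates a DataFrame with one row per entry in locations. Each field of a location becomes a column prefixed with locations_, and date, number, and name are repeated on every row as additional columns.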
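As a minimal, self-contained sketch of the call above, the example below builds the same structure in memory instead of reading myJson.json. The keys inside each location (`name`, `arrTime`) and their values are hypothetical, because the source elides the contents of the locations entries.

```python
import pandas as pd

# Same top-level layout as the JSON in the scenario; the keys inside each
# location ("name", "arrTime") are hypothetical placeholders.
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Station A", "arrTime": "08:00"},
        {"name": "Station B", "arrTime": "08:15"},
    ],
}

# One row per element of "locations"; the top-level fields are repeated on
# every row as metadata columns.
flat = pd.json_normalize(
    data,
    record_path="locations",
    meta=["date", "number", "name"],
    record_prefix="locations_",
)

print(flat.columns.tolist())
print(flat)
```

In this sketch the record_prefix also avoids a clash: both the top level and each location have a `name` key, and json_normalize raises an error when a metadata column would overwrite a record column of the same name.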

Extending to Keep Nested Data:

If you prefer to keep each location as a nested object, read the file with read_json and its default settings. The top-level values are repeated on every row, and the locations column holds one dictionary per row.

<code class="python">df = pd.read_json("myJson.json")</code>
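To keep the whole locations array in a single cell, one option (a minimal sketch, not necessarily the author's method) is to wrap the parsed dictionary in a list before passing it to the DataFrame constructor. The inline data and the location key `name` are hypothetical stand-ins for the file contents.

```python
import pandas as pd

# Stand-in for the dictionary that json.load would return for myJson.json;
# the key inside each location is a hypothetical placeholder.
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [{"name": "Station A"}, {"name": "Station B"}],
}

# A list with a single record produces a single row, so the "locations"
# value is stored as-is: one cell holding the full list of dictionaries.
nested = pd.DataFrame([data])

print(nested.shape)
print(type(nested.loc[0, "locations"]))
```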

Alternatively, you can turn the locations array into its own DataFrame by passing the list of dictionaries to the DataFrame constructor, where data is the dictionary loaded with json.load:

<code class="python">locations_df = pd.DataFrame(data['locations'])</code>
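A short sketch of building a table from the locations array alone, assuming the file has already been parsed into a dictionary. The location keys (`name`, `arrTime`) and the `train` column name are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Stand-in for the parsed file; keys inside each location are hypothetical.
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Station A", "arrTime": "08:00"},
        {"name": "Station B", "arrTime": "08:15"},
    ],
}

# A list of dictionaries maps directly to rows, with the keys as columns.
locations_df = pd.DataFrame(data["locations"])

# Optionally attach parent fields as constant columns. The top-level "name"
# is stored under the hypothetical column "train" so it does not overwrite
# the "name" column of the locations.
locations_df = locations_df.assign(date=data["date"], train=data["name"])

print(locations_df)
```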

Concatenating Nested Values:

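If the locations column holds one string per row (for example, after replacing each location dictionary with one of its string fields), you can combine those strings into a single comma-separated value per record with the groupby and apply functions: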

<code class="python">df = df.groupby(['date', 'name', 'number'])['locations'].apply(','.join).reset_index()</code>
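The join only works on strings, so each location has to be reduced to a single string first. The sketch below shows the full sequence under the assumption that every location has a `name` field; that field and the sample values are hypothetical, since the source does not show what the locations contain.

```python
import pandas as pd

# Stand-in for the parsed file; "name" inside each location is hypothetical.
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [{"name": "Station A"}, {"name": "Station B"}],
}

# Scalars are broadcast against the list, giving one row per location with
# a dictionary in the "locations" column.
df = pd.DataFrame(data)

# Replace each dictionary with the string field that should be joined.
df["locations"] = df["locations"].apply(lambda loc: loc["name"])

# Collapse the rows of each record into one comma-separated string.
joined = (
    df.groupby(["date", "name", "number"])["locations"]
    .apply(",".join)
    .reset_index()
)

print(joined)
```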
