Convert dot-separated values to Go struct using Python

王林
Release: 2024-02-10 13:33:08


This article shows how to use Python to convert dot-separated keys (such as "key1.subkey1.subkey2") into Go struct definitions. This kind of conversion is useful when nested configuration keys, for example from configuration files or API responses, need matching Go types. The question below starts from a recursive Python function that mishandles repeated key names; the accepted solution builds a tree from the keys and traverses it level by level. Both scripts are reproduced in full.

Question content

This is a specific requirement for an application whose configuration can change (specifically WSO2 Identity Server, since I'm writing a Kubernetes operator for it in Go), but that's not really relevant here. I want a solution that makes it easy to manage a large number of configuration mappings and generate Go structs from them. The mappings are kept in a .csv file.

Link to .csv - my_configs.csv

I want to write a Python script that automatically generates the Go structs, so that any change to the application configuration can be picked up by simply re-running the script to regenerate the corresponding Go structs. I'm referring to the configuration of the application itself. For example, TOML key names in the CSV can change, or new values can be added.
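For orientation, here is a minimal sketch of what such a CSV might look like and how it could be read. The column names (yaml_key, go_var, go_type, toml_key, default_val) are the ones the scripts in this post look up; the row values, the SAMPLE_CSV constant, and the load_rows helper are made up for illustration and are not taken from the real my_configs.csv.

```python
import csv
import io

# Hypothetical sample rows; the real my_configs.csv is not reproduced here.
SAMPLE_CSV = """yaml_key,go_var,go_type,toml_key,default_val
authentication,Authentication,Authentication,authentication,
authentication.authenticator,Authenticator,Authenticator,authentication.authenticator,
authentication.authenticator.totp,Totp,Totp,authentication.authenticator.totp,
authentication.authenticator.totp.timeStepSize,TimeStepSize,string,authentication.authenticator.totp.timeStepSize,30
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per configuration key."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(SAMPLE_CSV)
for row in rows:
    # Indent each key by its nesting depth to make the hierarchy visible.
    depth = row["yaml_key"].count(".")
    print("  " * depth + row["yaml_key"].split(".")[-1], "->", row["go_type"])
```

Each row describes one node of the configuration hierarchy, so a row whose go_type is not a built-in Go type is expected to become a struct of its own.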

So far I have written a Python script that almost achieves my goal. The script is:

    import pandas as pd


    def convert_to_dict(data):
        result = {}
        for row in data:
            current_dict = result
            for item in row[:-1]:
                if item is not None:
                    if item not in current_dict:
                        current_dict[item] = {}
                    current_dict = current_dict[item]
        return result


    def extract_json_key(yaml_key):
        if isinstance(yaml_key, str) and '.' in yaml_key:
            return yaml_key.split('.')[-1]
        else:
            return yaml_key


    def add_fields_to_struct(struct_string, go_var, go_type, json_key, toml_key):
        struct_string += str(go_var) + " " + str(go_type) + ' `json:"' + str(json_key) + ',omitempty" toml:"' + str(toml_key) + '"` ' + "\n"
        return struct_string


    def generate_go_struct(struct_name, struct_data):
        struct_name = "Configurations" if struct_name == "" else struct_name
        struct_string = "type " + struct_name + " struct {\n"
        yaml_key = df['yaml_key'].str.split('.').str[-1]

        # Base case: generate fields for the current struct level
        for key, value in struct_data.items():
            selected_rows = df[yaml_key == key]
            if len(selected_rows) > 1:
                go_var = selected_rows['go_var'].values[1]
                toml_key = selected_rows['toml_key'].values[1]
                go_type = selected_rows['go_type'].values[1]
                json_key = selected_rows['json_key'].values[1]
            else:
                go_var = selected_rows['go_var'].values[0]
                toml_key = selected_rows['toml_key'].values[0]
                go_type = selected_rows['go_type'].values[0]
                json_key = selected_rows['json_key'].values[0]

            # Add fields to the body of the struct
            struct_string = add_fields_to_struct(struct_string, go_var, go_type, json_key, toml_key)

        struct_string += "}\n\n"

        # Recursive case: generate struct definitions for nested structs
        for key, value in struct_data.items():
            selected_rows = df[yaml_key == key]
            if len(selected_rows) > 1:
                go_var = selected_rows['go_var'].values[1]
            else:
                go_var = selected_rows['go_var'].values[0]

            if isinstance(value, dict) and any(isinstance(v, dict) for v in value.values()):
                nested_struct_name = go_var
                nested_struct_data = value
                struct_string += generate_go_struct(nested_struct_name, nested_struct_data)

        return struct_string


    # Read the CSV file
    csv_file = "~/Downloads/my_configs.csv"
    df = pd.read_csv(csv_file)

    # Remove rows where all columns are NaN
    df = df.dropna(how='all')

    # Create the 'json_key' column using the custom function
    df['json_key'] = df['yaml_key'].apply(extract_json_key)

    data = df['yaml_key'].values.tolist()  # Read the 'yaml_key' column
    data = pd.DataFrame({'column': data})  # Convert to DataFrame
    data = data['column'].str.split('.', expand=True)  # Split by '.'
    nested_list = data.values.tolist()  # Convert to nested list
    data = nested_list
    result_json = convert_to_dict(data)  # Convert to dict (JSON)

    # Generate the Go code
    go_struct = generate_go_struct("", result_json)

    # Write to file
    file_path = "output.go"
    with open(file_path, "w") as file:
        file.write(go_struct)

The problem is as follows (see this part of the CSV):

    authentication.authenticator.basic
    authentication.authenticator.basic.parameters
    authentication.authenticator.basic.parameters.showAuthFailureReason
    authentication.authenticator.basic.parameters.showAuthFailureReasonOnLoginPage
    authentication.authenticator.totp
    authentication.authenticator.totp.parameters
    authentication.authenticator.totp.parameters.showAuthFailureReason
    authentication.authenticator.totp.parameters.showAuthFailureReasonOnLoginPage
    authentication.authenticator.totp.parameters.encodingMethod
    authentication.authenticator.totp.parameters.timeStepSize

Here, because the parameters field appears under both basic and totp, the script gets confused and generates two structs for the totp parameters. The expected result is one struct for the basic parameters and one for the totp parameters. There are many similar duplicated names in the yaml_key column of the CSV.

I know this has to do with the index being hardcoded to 1 in go_var = selected_rows['go_var'].values[1], but I'm finding it hard to fix.
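The ambiguity can be reproduced without pandas. The following is a minimal sketch, assuming two hypothetical rows whose go_type values (BasicParameters, TotpParameters) are invented for illustration; the helper names are also not from the original script. It shows that matching on the last key segment returns both rows, while matching on the full dotted key returns exactly one.

```python
# Hypothetical rows: (yaml_key, go_type). The type names are made up for illustration.
rows = [
    ("authentication.authenticator.basic.parameters", "BasicParameters"),
    ("authentication.authenticator.totp.parameters", "TotpParameters"),
]


def lookup_by_last_segment(rows, segment):
    """Mimic the ambiguous lookup: match rows on the last key segment only."""
    return [go_type for key, go_type in rows if key.split(".")[-1] == segment]


def lookup_by_full_path(rows, path):
    """Unambiguous lookup: match rows on the complete dotted key."""
    return [go_type for key, go_type in rows if key == path]


# Both rows match "parameters", so a fixed index always picks the same one.
print(lookup_by_last_segment(rows, "parameters"))

# The full path identifies a single row.
print(lookup_by_full_path(rows, "authentication.authenticator.basic.parameters"))
```

With a fixed index of 1, the last-segment lookup always resolves to the second match, which is consistent with the totp struct being generated twice.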

Can someone help me with this? I suspect one of the following is the root cause:

  1. A problem with the recursive function
  2. A problem with the code that generates the JSON

Thanks!

I also tried using ChatGPT, but because this involves nesting and recursion, the answers it provided did not work well.

Update

I found that the problem occurs with the rows containing the properties, pooloptions, endpoint, and parameters fields, because these names are duplicated in the yaml_key column.
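One way to find such names up front is to count how often each last key segment occurs. This is only a sketch: the duplicated_last_segments helper is hypothetical, and the key list is a subset of the CSV excerpt shown in the question.

```python
from collections import Counter

# A few keys taken from the CSV excerpt in the question.
yaml_keys = [
    "authentication.authenticator.basic",
    "authentication.authenticator.basic.parameters",
    "authentication.authenticator.totp",
    "authentication.authenticator.totp.parameters",
    "authentication.authenticator.totp.parameters.timeStepSize",
]


def duplicated_last_segments(keys):
    """Return the last segments that occur under more than one full key."""
    counts = Counter(key.split(".")[-1] for key in keys)
    return sorted(name for name, n in counts.items() if n > 1)


print(duplicated_last_segments(yaml_keys))  # ['parameters']
```

Any name reported here cannot be resolved safely by its last segment alone and needs the full path to identify the right row.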

Solution

I was able to resolve this issue. However, I had to take a completely different approach: build a tree data structure from the keys and then traverse it level by level. This is the main logic behind it: https://www.geeksforgeeks.org/level-order-tree-traversal/
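Before the full script, the idea can be sketched in a few lines using only the standard library. This sketch uses nested dicts instead of a node class, and the build_tree and level_order_paths names are hypothetical; it only illustrates why tracking the full path of each node keeps the two parameters nodes apart.

```python
from collections import deque


def build_tree(keys):
    """Build a nested-dict tree (a trie) from dot-separated keys."""
    root = {}
    for key in keys:
        node = root
        for part in key.split("."):
            node = node.setdefault(part, {})
    return root


def level_order_paths(root):
    """Yield (full_path, child_names) for the root and every node with children, breadth first."""
    queue = deque([("", root)])
    while queue:
        path, node = queue.popleft()
        yield path, list(node)
        for name, child in node.items():
            if child:  # only nodes with children become structs of their own
                queue.append((f"{path}.{name}" if path else name, child))


keys = [
    "authentication.authenticator.basic.parameters.showAuthFailureReason",
    "authentication.authenticator.totp.parameters.timeStepSize",
]
for path, children in level_order_paths(build_tree(keys)):
    print(path or "<root>", "->", children)
```

Each yielded path corresponds to one struct, and its child names correspond to that struct's fields. Because the path is carried along in the queue, basic.parameters and totp.parameters are visited as two separate nodes.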

This is the working Python code:

    import pandas as pd
    from collections import deque

    structs = []


    class TreeNode:
        def __init__(self, name):
            self.name = name
            self.children = []
            self.path = ""

        def add_child(self, child):
            self.children.append(child)


    def create_tree(data):
        root = TreeNode('')
        for item in data:
            node = root
            for name in item.split('.'):
                existing_child = next((child for child in node.children if child.name == name), None)
                if existing_child:
                    node = existing_child
                else:
                    new_child = TreeNode(name)
                    node.add_child(new_child)
                    node = new_child
        return root


    def generate_go_struct(struct_data):
        struct_name = struct_data['struct_name']
        fields = struct_data['fields']
        go_struct = f"type {struct_name} struct {{\n"
        for field in fields:
            field_name = field['name']
            field_type = field['type']
            field_default_val = str(field['default_val'])
            json_key = field['json_key']
            toml_key = field['toml_key']
            tail_part = f"\t{field_name} {field_type} `json:\"{json_key},omitempty\" toml:\"{toml_key}\"`\n\n"
            if pd.isna(field['default_val']):
                go_struct += tail_part
            else:
                field_default_val = "\t// +kubebuilder:default:=" + field_default_val
                go_struct += field_default_val + "\n" + tail_part
        go_struct += "}\n\n"
        return go_struct


    def write_go_file(go_structs, file_path):
        with open(file_path, 'w') as file:
            for go_struct in go_structs:
                file.write(go_struct)


    def create_new_struct(struct_name):
        struct_name = "Configurations" if struct_name == "" else struct_name
        struct_dict = {
            "struct_name": struct_name,
            "fields": []
        }
        return struct_dict


    def add_field(struct_dict, field_name, field_type, default_val, json_key, toml_key):
        field_dict = {
            "name": field_name,
            "type": field_type,
            "default_val": default_val,
            "json_key": json_key,
            "toml_key": toml_key
        }
        struct_dict["fields"].append(field_dict)
        return struct_dict


    def traverse_tree(root):
        queue = deque([root])
        while queue:
            node = queue.popleft()
            filtered_df = df[df['yaml_key'] == node.path]
            go_var = filtered_df['go_var'].values[0] if not filtered_df.empty else None
            go_type = filtered_df['go_type'].values[0] if not filtered_df.empty else None
            if node.path == "":
                go_type = "Configurations"

            # The structs themselves
            current_struct = create_new_struct(go_type)

            for child in node.children:
                if (node.name != ""):
                    child.path = node.path + "." + child.name
                else:
                    child.path = child.name

                filtered_df = df[df['yaml_key'] == child.path]
                go_var = filtered_df['go_var'].values[0] if not filtered_df.empty else None
                go_type = filtered_df['go_type'].values[0] if not filtered_df.empty else None
                default_val = filtered_df['default_val'].values[0] if not filtered_df.empty else None

                # Struct fields
                json_key = filtered_df['yaml_key'].values[0].split('.')[-1] if not filtered_df.empty else None
                toml_key = filtered_df['toml_key'].values[0].split('.')[-1] if not filtered_df.empty else None

                current_struct = add_field(current_struct, go_var, go_type, default_val, json_key, toml_key)

                if (child.children):
                    # Add each child to the queue for processing
                    queue.append(child)

            go_struct = generate_go_struct(current_struct)
            # print(go_struct,"\n")
            structs.append(go_struct)

        write_go_file(structs, "output.go")


    csv_file = "~/Downloads/my_configs.csv"
    df = pd.read_csv(csv_file)

    sample_data = df['yaml_key'].values.tolist()

    # Create the tree
    tree = create_tree(sample_data)

    # Traverse the tree
    traverse_tree(tree)

Thank you for your help!

Source: stackoverflow.com