I want to convert a set of YAML files in a folder into an xlsx file. As a first step, I am trying to convert a single YAML file. All the YAML files in the folder follow the format given below:
info:
  city: Bangalore
  competition: IPL
  dates:
    - 2008-04-18
  gender: male
  match_type: T20
  outcome:
    by:
      runs: 140
    winner: Kolkata Knight Riders
  overs: 20
  player_of_match:
    - BB McCullum
  teams:
    - Royal Challengers Bangalore
    - Kolkata Knight Riders
  toss:
    decision: field
    winner: Royal Challengers Bangalore
  umpires:
    - Asad Rauf
    - RE Koertzen
  venue: M Chinnaswamy Stadium
innings:
  - 1st innings:
      team: Kolkata Knight Riders
      deliveries:
        - 0.1:
            batsman: SC Ganguly
            bowler: P Kumar
            extras:
              legbyes: 1
            non_striker: BB McCullum
            runs:
              batsman: 0
              extras: 1
              total: 1
The data continues in the same way for each ball of the match (0.2, 0.3, 0.4 ... 20.0), then moves on to the second innings and continues from there.
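To make the nesting concrete, here is a rough sketch of how the deliveries could be walked into one row per ball. It assumes the parser returns `innings` as a list of single-key dicts, as the sample suggests; the `innings` value below is typed by hand rather than read from a file, and the names `rows` and `deliveries_df` are just placeholders.

```python
import pandas as pd

# Hand-typed stand-in for data['innings'] (assumed shape, one delivery only).
innings = [
    {"1st innings": {
        "team": "Kolkata Knight Riders",
        "deliveries": [
            {0.1: {"batsman": "SC Ganguly", "bowler": "P Kumar",
                   "extras": {"legbyes": 1}, "non_striker": "BB McCullum",
                   "runs": {"batsman": 0, "extras": 1, "total": 1}}},
        ],
    }},
]

rows = []
for inning in innings:
    # Each list item is a dict with a single key such as "1st innings".
    for inning_name, details in inning.items():
        for delivery in details["deliveries"]:
            # Each delivery is a dict with a single key: the ball number.
            for ball, event in delivery.items():
                rows.append({"innings": inning_name,
                             "team": details["team"],
                             "ball": ball,
                             **event})

# json_normalize turns nested dicts into dotted columns such as "runs.total".
deliveries_df = pd.json_normalize(rows)
```

Each row then holds one ball, which seems like a shape that could be written to a sheet, though I have not checked this against a full match file.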
My attempt at converting one of these YAML files into an xlsx file:
import pandas as pd
import yaml as ya
with open(r"location of folder") as f:
    data = ya.load(f, Loader=ya.FullLoader)
df1 = pd.DataFrame(data['info'])
df1.to_excel(r"location of folder\output.xlsx")
However, after running the above code, I got the following error:
  File "c:\Users\kosal\hello\prj.py", line 8, in <module>
    df1=pd.DataFrame(data['info'])
  File "C:\Users\kosal\anaconda3\lib\site-packages\pandas\core\frame.py", line 529, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "C:\Users\kosal\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 287, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "C:\Users\kosal\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 80, in arrays_to_mgr
    index = extract_index(arrays)
  File "C:\Users\kosal\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 401, in extract_index
    raise ValueError("arrays must all be same length")
I understand why this error occurs, but I have no idea how to go about fixing it.
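To show what I think is going on, here is a minimal sketch that reproduces the problem without the file. The `info` dict is a trimmed, hand-typed stand-in for `data['info']`, so treat it as an assumption about what the loader returns. The second half tries `pd.json_normalize`, which looks like one possible way to get a single flat row, but I am not sure it is the right approach.

```python
import pandas as pd

# Trimmed, hand-typed stand-in for data['info'] (assumed shape).
info = {
    "city": "Bangalore",
    "dates": ["2008-04-18"],
    "teams": ["Royal Challengers Bangalore", "Kolkata Knight Riders"],
}

# "dates" has 1 item and "teams" has 2, so pandas cannot build equal-length columns.
try:
    pd.DataFrame(info)
    raised = False
except ValueError:
    raised = True

# Possible workaround: flatten the dict into one row.
# Nested dicts become dotted column names; lists stay as single cell values.
info_with_nesting = {
    **info,
    "outcome": {"by": {"runs": 140}, "winner": "Kolkata Knight Riders"},
}
flat = pd.json_normalize(info_with_nesting)
```

Here `flat` has one row with columns such as `outcome.by.runs` and `outcome.winner`, while `teams` remains a list inside one cell.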
P.S. I can't find an appropriate tag for this question and hence have used the 'python' tag.