python - pandas read_csv - rows with a variable number of columns
I have a CSV file whose rows have a variable number of columns (and no column headers). For example, the file begins with rows of 23 columns, later has rows of 83 columns, and so on. When read_csv() starts reading the file, it guesses the number of columns from the first few rows (I think), so if the rows at the beginning are shorter than those at the end, I get the exception below. Is there a way to pass a parameter to the function that sets the number of columns to a maximum value? Or is there a better way to do this?
Thanks.
CParserError: Error tokenizing data. C error: Expected 23 fields in line 150, saw 83
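One possible way around this, as a minimal sketch: scan the file once to find the widest row, then pass explicit column names to read_csv() so the width is fixed up front rather than guessed. The sample data in `raw` and the variable names are made up for illustration; with a real file you would open the path instead of using `io.StringIO`.

```python
import csv
import io

import pandas as pd

# Hypothetical ragged data standing in for the real file:
# rows with 2, 4 and 3 fields, and no header line.
raw = "1,2\n3,4,5,6\n7,8,9\n"

# First pass: find the widest row so the column count is known in advance.
max_cols = max(len(row) for row in csv.reader(io.StringIO(raw)))

# Second pass: explicit names fix the number of columns,
# and rows with fewer fields are padded with NaN.
df = pd.read_csv(io.StringIO(raw), header=None, names=list(range(max_cols)))
print(df)
```

If the maximum width is already known (83 in the question), the first pass can be skipped and the number passed directly, e.g. `names=list(range(83))`.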
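# coding: utf-8

# In[16]:
import pandas as pd

def params(text):
    # Split "key=value|key=value" into a Series indexed by key.
    pairs = text.split("|")
    print(pairs)
    out = {i.split("=")[0]: i.split("=")[1] for i in pairs}
    return pd.Series(out)

params("asd=2|qwe=5")

# In[27]:
aa = pd.DataFrame({'id': [1, 2], 'text': ["asd=2|qwe=5", "asd=20|qwe=5|qzxc=5"]})
aa

# In[29]:
# Each row's text becomes a set of columns; missing keys are filled with NaN.
aa['text'].apply(params)

# In[30]:
# Attach the parsed columns to the original frame.
pd.concat([aa, aa['text'].apply(params)], axis=1)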