python - Pandas data frame with repeating sequences: How to do a spaghetti plot?
I've got data that looks like the following in a pandas.DataFrame:
                         diff_1  diff_2
    1949-01-01 06:00:00  -0.555  -0.123
    1949-01-01 07:00:00  -0.654   0.230
    1949-01-02 06:00:00  -0.879   0.012
    1949-01-02 07:00:00  -0.459   0.672
    1949-01-03 06:00:00  -0.588   0.980
    1949-01-03 07:00:00  -0.068   0.375
    1950-01-01 06:00:00  -0.654   0.572
    1950-01-01 07:00:00  -0.544   0.092
    1950-01-02 06:00:00   0.374  -0.275
    1950-01-02 07:00:00   0.562  -0.260
    1950-01-03 06:00:00  -0.200   0.240
    1950-01-03 07:00:00  -0.226   0.202
Now I want a "spaghetti plot" in which each group of "spaghetti" has one color, determined by whether the curve is diff_1 or diff_2 (so the x-axis is the time from 01-01 to 01-03, the y-axis is the differences, and each "spaghetti" is one year). I tried to orient myself on this question:
plot pandas data frame year on year data
However, I fear I have got one dimension too many. Any ideas how to make this work?
Edit: The following simple image illustrates what I'm looking for. The multiple lines of one color result from the fact that the time period on the x-axis repeats annually.
This is the best I can do. I'm not totally satisfied with it, but it might be enough:
    import matplotlib.pyplot as plt

    # Add a column with the year so we can pivot on it later.
    tdf = df.assign(year=df.index.year)
    # Make all dates have the same year (a leap one, just in case).
    tdf.index = df.index.map(lambda x: x.replace(year=2004))
    # Pivot using the years as columns, and put them in the topmost level.
    tdf = tdf.pivot(columns='year').swaplevel(0, 1, axis='columns')
    print(tdf)

    year                  1949   1950   1949   1950
                        diff_1 diff_1 diff_2 diff_2
    2004-01-01 06:00:00 -0.555 -0.654 -0.123  0.572
    2004-01-01 07:00:00 -0.654 -0.544  0.230  0.092
    2004-01-02 06:00:00 -0.879  0.374  0.012 -0.275
    2004-01-02 07:00:00 -0.459  0.562  0.672 -0.260
    2004-01-03 06:00:00 -0.588 -0.200  0.980  0.240
    2004-01-03 07:00:00 -0.068 -0.226  0.375  0.202

    # Create a list of as many colors as there are columns in df.
    color = [c['color'] for c in plt.rcParams['axes.prop_cycle'][:df.columns.size]]
    # Plot one set of lines per year, reusing the same colors each time.
    ax = plt.subplot()
    for year in tdf.columns.levels[0]:
        tdf[year].plot(color=color, legend=False, ax=ax)
    # Label only the first line of each color, one entry per original column.
    plt.legend(ax.lines[:df.columns.size], df.columns, loc='best')
    plt.show()
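The "leap year, just in case" remark matters only if some year in the data contains 29 February. A minimal sketch of why, using only the standard library; the helper name `to_common_year` and the 1952 date are hypothetical and not part of the data above:

```python
from datetime import datetime


def to_common_year(ts, year=2004):
    """Map a timestamp onto a shared dummy year (hypothetical helper)."""
    return ts.replace(year=year)


# Hypothetical timestamp on a leap day; the sample data has no such row.
leap_day = datetime(1952, 2, 29, 6, 0)

# Mapping onto another leap year works.
print(to_common_year(leap_day))

# Mapping onto a non-leap year fails, because 2005-02-29 does not exist.
try:
    to_common_year(leap_day, year=2005)
except ValueError as err:
    print("ValueError:", err)
```

Choosing a leap year such as 2004 as the shared year therefore keeps the `replace` call safe for every possible calendar date.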
Now customize the tick labels to your heart's content.
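One possible way to do that, as a rough sketch: since every curve now lives in the dummy year 2004, the year can be hidden by formatting the ticks as month-day. This example draws with plain matplotlib on a small hypothetical stand-in series (`times` and `values` are made up for illustration) rather than on `tdf`; when plotting through pandas, passing `x_compat=True` to `.plot` may be needed for matplotlib date formatters to behave as expected.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical stand-in for one curve already mapped onto the year 2004.
times = [datetime(2004, 1, d, h) for d in (1, 2, 3) for h in (6, 7)]
values = [-0.555, -0.654, -0.879, -0.459, -0.588, -0.068]

fig, ax = plt.subplots()
ax.plot(times, values)

# One major tick per day, labelled as month-day so the dummy year is hidden.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d"))
fig.autofmt_xdate()  # rotate the labels so they do not overlap
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result.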