numpy - Python Pandas - Reformat Datetime Index -
i have 2 data frames both multi-indexed on 'date' , 'name', , want sql style join
combine them. i've tried
pd.merge(df1.reset_index(), df2.reset_index(), on=['name', 'date'], how='inner')
which results in empty dataframe.
if inspect data frames can see index of 1 represented '2015-01-01'
, other represented '2015-01-01 00:00:00'
explains issues joining.
is there way 'recast' index specific format within pandas?
i've included tables see data i'm working with.
df1= +-------------+------+------+------+ | date | name | col1 | col2 | +-------------+------+------+------+ | 2015-01-01 | mary | 12 | 123 | | 2015-01-02 | mary | 23 | 33 | | 2015-01-03 | mary | 34 | 45 | | 2015-01-01 | john | 65 | 76 | | 2015-01-02 | john | 67 | 78 | | 2015-01-03 | john | 25 | 86 | +-------------+------+------+------+ df2= +------------+------+-------+-------+ | date | name | col3 | col4 | +------------+------+-------+-------+ | 2015-01-01 | mary | 80809 | 09885 | | 2015-01-02 | mary | 53879 | 58972 | | 2015-01-03 | mary | 23887 | 3908 | | 2015-01-01 | john | 9238 | 2348 | | 2015-01-02 | john | 234 | 234 | | 2015-01-03 | john | 5325 | 6436 | +------------+------+-------+-------+
desired result:
+-------------+------+------+-------+-------+-------+ | date | name | col1 | col2 | col3 | col4 | +-------------+------+------+-------+-------+-------+ | 2015-01-01 | mary | 12 | 123 | 80809 | 09885 | | 2015-01-02 | mary | 23 | 33 | 53879 | 58972 | | 2015-01-03 | mary | 34 | 45 | 23887 | 3908 | | 2015-01-01 | john | 65 | 76 | 9238 | 2348 | | 2015-01-02 | john | 67 | 78 | 234 | 234 | | 2015-01-03 | john | 25 | 86 | 5325 | 6436 | +-------------+------+------+-------+-------+-------+
the reason cannot join because have different dtypes on indicies. pandas silently fails if indicies have different dtypes.
you can change indicies string representations of time proper pandas datetimes this:
df = pd.dataframe({"data":range(1,30)}, index=['2015-04-{}'.format(d) d in range(1,30)]) df.index.dtype dtype('o') df.index = df.index.to_series().apply(pd.to_datetime) df.index.dtype dtype('<m8[ns]')
now can merge dataframes on index:
pd.merge(left=df, left_index=true, right=df2, right_index=true)
assuming have df2, example omitting...
Comments
Post a Comment