pandas:How to get int by subtract timedelta64[ns] time inside MultiIndex

Published on:
Last updated:

This post is also available in: 日本語 (Japanese)

By using python and pandas, I wrote sample code how to get int value by subtract timedelta64[ns] time inside MultiIndex.

import pandas as pd


# sample data
data = [
        [1,'2023-03-28','7508'],
        [1,'2023-03-30','7600'],
        [2,'2023-06-18','9568'],
        [2,'2023-06-30','9586'],
        [3,'2023-07-10','7507'],
        [3,'2023-07-20','7520'],
    	]


# create dataframe
df = pd.DataFrame(data=data, columns=['id','date','val'])
print(df)
"""
   id        date   val
0   1  2023-03-28  7508
1   1  2023-03-30  7600
2   2  2023-06-18  9568
3   2  2023-06-30  9586
4   3  2023-07-10  7507
5   3  2023-07-20  7520
"""

# transform to timedelta64[ns]
print(df['date'].dtypes) # object
df['date'] = pd.to_datetime(df['date'])
print(df['date'].dtypes) # datetime64[ns]


# create MultiIndex
df_mulindex = df.set_index(['id','date'])
print(df_mulindex)
"""
                val
id date
1  2023-03-28  7508
   2023-03-30  7600
2  2023-06-18  9568
   2023-06-30  9586
3  2023-07-10  7507
   2023-07-20  7520
"""

# get int by subtract timedelta64[ns] time inside MultiIndex
print(df_mulindex.groupby('id').apply(lambda x: x.index[1][1]-x.index[0][1]).dt.days) # timedelta64[ns] to int
"""
id
1     2
2    12
3    10
dtype: int64
"""

About
Kuniyoshi Takemoto is the founder of Amelt.net LLC, and editor of this blog(www.amelt.net).Learn more and follow me on LinkedIn.