This may be a known limitation, but I'm struggling to calculate the cumulative minimum of a Series in Pandas when that Series contains NaTs. Is there a way to make this work?
A simple example is below:
import pandas as pd
s = pd.Series(pd.date_range('2008-09-15', periods=10, freq='m'))
s.loc[10] = pd.NaT
s.cummin()
ValueError: Could not convert object to NumPy datetime
This bug has been fixed in Pandas 0.15.2 (to be released).
As a workaround, you could use skipna=False and handle the NaTs "manually":
import pandas as pd
import numpy as np
np.random.seed(1)
s = pd.Series(pd.date_range('2008-09-15', periods=10, freq='m'))
s.loc[10] = pd.NaT
np.random.shuffle(s)
print(s)
# 0 2008-11-30
# 1 2008-12-31
# 2 2009-01-31
# 3 2009-06-30
# 4 2008-10-31
# 5 2009-03-31
# 6 2008-09-30
# 7 2009-04-30
# 8 NaT
# 9 2009-05-31
# 10 2009-02-28
# dtype: datetime64[ns]
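# Remember where the NaTs are before taking the cumulative minimum
mask = pd.isnull(s)
result = s.cummin(skipna=False)
# Put NaT back at the positions that were missing in the input
result.loc[mask] = pd.NaT
print(result)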
yields
0 2008-11-30
1 2008-11-30
2 2008-11-30
3 2008-11-30
4 2008-10-31
5 2008-10-31
6 2008-09-30
7 2008-09-30
8 NaT
9 2008-09-30
10 2008-09-30
dtype: datetime64[ns]
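If you would rather not depend on how a particular Pandas version treats NaT inside cummin, the same idea can be sketched directly on the underlying integers with NumPy. This is only a rough sketch: cummin_skip_nat is a hypothetical helper name (not a Pandas function), and it assumes a timezone-naive datetime64 Series.

```python
import numpy as np
import pandas as pd


def cummin_skip_nat(s):
    """Cumulative minimum of a datetime64 Series, ignoring NaT.

    Hypothetical helper for illustration; not part of Pandas.
    Assumes a timezone-naive datetime64 Series.
    """
    mask = s.isna().to_numpy()
    # Work on the raw nanosecond integers behind the timestamps.
    values = s.to_numpy(dtype="datetime64[ns]").astype("int64")
    # NaT is stored as the smallest int64, so replace it with the
    # largest one to keep it from ever winning the running minimum.
    values = np.where(mask, np.iinfo(np.int64).max, values)
    running = np.minimum.accumulate(values)
    result = pd.Series(running.astype("datetime64[ns]"), index=s.index)
    # Restore NaT at the positions that were missing in the input.
    result[mask] = pd.NaT
    return result


example = pd.Series(
    pd.to_datetime(["2008-11-30", "2008-12-31", "2008-10-31", None, "2008-09-30"])
)
print(cummin_skip_nat(example))
```

As in the workaround above, positions that held NaT stay NaT in the result, and the running minimum carries on past them.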