Converting a Timestamp to the datetime type
When working with time series in pandas, two functions come up most often:
pd.to_datetime()
pd.date_range()
Generating a time index with pandas
# pd.date_range()
import pandas as pd
index = pd.date_range("20210101", periods=20)
index
Out[29]:
DatetimeIndex(['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04',
'2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08',
'2021-01-09', '2021-01-10', '2021-01-11', '2021-01-12',
'2021-01-13', '2021-01-14', '2021-01-15', '2021-01-16',
'2021-01-17', '2021-01-18', '2021-01-19', '2021-01-20'],
dtype='datetime64[ns]', freq='D')
# pd.to_datetime()
df = pd.DataFrame(data=range(20210101,20210128),columns=["period"])
df["aa"] = pd.to_datetime(df["period"],format="%Y%m%d")
df
Out[24]:
period aa
0 20210101 2021-01-01
1 20210102 2021-01-02
2 20210103 2021-01-03
3 20210104 2021-01-04
4 20210105 2021-01-05
5 20210106 2021-01-06
6 20210107 2021-01-07
7 20210108 2021-01-08
8 20210109 2021-01-09
9 20210110 2021-01-10
10 20210111 2021-01-11
11 20210112 2021-01-12
12 20210113 2021-01-13
13 20210114 2021-01-14
14 20210115 2021-01-15
15 20210116 2021-01-16
16 20210117 2021-01-17
17 20210118 2021-01-18
18 20210119 2021-01-19
19 20210120 2021-01-20
20 20210121 2021-01-21
21 20210122 2021-01-22
22 20210123 2021-01-23
23 20210124 2021-01-24
24 20210125 2021-01-25
25 20210126 2021-01-26
26 20210127 2021-01-27
index[1]
Out[30]: Timestamp('2021-01-02 00:00:00', freq='D')
df["aa"][1]
Out[31]: Timestamp('2021-01-02 00:00:00')
df["aa"][1] == index[1]
Out[32]: True
type(df["aa"][1])
Out[33]: pandas._libs.tslibs.timestamps.Timestamp
type(index[1])
Out[34]: pandas._libs.tslibs.timestamps.Timestamp
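Timestamp and datetime
As the code above shows, time values in pandas have the type pandas._libs.tslibs.timestamps.Timestamp (exposed as pd.Timestamp),
whereas the type commonly used in plain Python is datetime.datetime.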
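The two types are closely related: pd.Timestamp is a subclass of datetime.datetime, which is why the two compare equal when they represent the same moment. A minimal sketch of that relationship (the variable names are only illustrative):

```python
from datetime import datetime

import pandas as pd

ts = pd.Timestamp("2021-01-02")

# Timestamp subclasses datetime.datetime, so isinstance checks pass
is_datetime = isinstance(ts, datetime)

# ...but the concrete type is still Timestamp, not datetime.datetime
is_exact_datetime = type(ts) is datetime

# Comparing against a plain datetime works directly
equal_to_plain = ts == datetime(2021, 1, 2)

print(is_datetime, is_exact_datetime, equal_to_plain)
```

An explicit conversion is still useful when downstream code checks for the exact type or cannot handle pandas objects.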
to_pydatetime()
from datetime import datetime
t = datetime(2021, 1, 2)
type(t)
Out[54]: datetime.datetime
t
Out[55]: datetime.datetime(2021, 1, 2, 0, 0)
r = (index[1].to_pydatetime())
type(r)
Out[57]: datetime.datetime
t == r
Out[58]: True
Converting a pandas Timestamp to the datetime type
In [11]: ts = pd.Timestamp('2014-01-23 00:00:00', tz=None)
In [12]: ts.to_pydatetime()
Out[12]: datetime.datetime(2014, 1, 23, 0, 0)
to_pydatetime() is also available on a DatetimeIndex:
rng = pd.date_range('1/10/2011', periods=3, freq='D')
rng.to_pydatetime()
Out[60]:
array([datetime.datetime(2011, 1, 10, 0, 0),
datetime.datetime(2011, 1, 11, 0, 0),
datetime.datetime(2011, 1, 12, 0, 0)], dtype=object)
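For a datetime column stored in a Series or DataFrame, one simple option is to convert element by element. The sketch below assumes a small illustrative Series named s; iterating over a datetime64 Series yields Timestamp objects, and each one is converted with to_pydatetime():

```python
from datetime import datetime

import pandas as pd

# Illustrative data: three consecutive days as a datetime64 Series
s = pd.Series(pd.date_range("20210101", periods=3))

# Each element is a Timestamp; convert each one to datetime.datetime
py_datetimes = [ts.to_pydatetime() for ts in s]

print(py_datetimes)
```

A list comprehension is slower than a vectorized call on large data, but its result type does not depend on the pandas version.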
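Extracting hours, minutes, and so on from a Timestamp in pandas
Official documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#from-timestamps-to-epoch
I recently needed the number of minutes between a time of day and 0:00 of that day. After reading the documentation, I came up with the following approach.
Suppose the data is: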
In [64]: stamps = pd.date_range('2012-10-08 18:15:05', periods=4, freq='h')
In [65]: stamps
Out[65]:
DatetimeIndex(['2012-10-08 18:15:05', '2012-10-08 19:15:05',
'2012-10-08 20:15:05', '2012-10-08 21:15:05'],
dtype='datetime64[ns]', freq='H')
First, get the number of seconds since 1970-01-01:
In [66]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')
Out[66]: Int64Index([1349720105, 1349723705, 1349727305, 1349730905], dtype='int64')
Take the remainder modulo one day (86400 seconds) to get the number of seconds since 0:00:
In [67]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400
Out[67]: Int64Index([65705, 69305, 72905, 76505], dtype='int64')
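To see why the modulo works, here is the same arithmetic in plain Python for the first value, 1349720105. It relies on the epoch (1970-01-01) starting at midnight and on the timestamps being timezone-naive, so whole days divide out exactly:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

epoch_seconds = 1349720105  # 2012-10-08 18:15:05 as seconds since 1970-01-01

# Dropping whole days leaves the seconds elapsed since 0:00
seconds_since_midnight = epoch_seconds % SECONDS_PER_DAY

# Split the remainder into hours, minutes, and seconds
hours, remainder = divmod(seconds_since_midnight, 3600)
minutes, seconds = divmod(remainder, 60)

print(seconds_since_midnight, hours, minutes, seconds)  # 65705 18 15 5
```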
Divide by 60 to get the number of minutes since 0:00:
In [68]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 / 60
Out[68]: Float64Index([1095.0833333333333, 1155.0833333333333, 1215.0833333333333,
1275.0833333333333], dtype='float64')
In the same way, you can get the number of hours:
In [69]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 / 3600
Out[69]: Float64Index([18.25138888888889, 19.25138888888889, 20.25138888888889,
21.25138888888889], dtype='float64')
To get the whole hour, use floor division. There are of course other ways to get the whole hour as well.
In [70]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 //3600
Out[70]: Int64Index([18, 19, 20, 21], dtype='int64')
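As one of those other ways, a DatetimeIndex exposes hour and minute fields directly, and normalize() returns midnight of each day. The following is a sketch of the alternatives, not the only way to do it. It uses the same stamps data, and the output variable names are only illustrative:

```python
import pandas as pd

stamps = pd.date_range("2012-10-08 18:15:05", periods=4, freq="h")

# Whole hour of the day, read directly from the index
hours = stamps.hour

# Whole minutes since 0:00, built from the hour and minute fields
minutes_int = stamps.hour * 60 + stamps.minute

# Fractional minutes since 0:00: subtract midnight of the same day,
# then divide the resulting timedeltas by one minute
minutes_frac = (stamps - stamps.normalize()) / pd.Timedelta(minutes=1)

print(list(hours))        # [18, 19, 20, 21]
print(list(minutes_int))  # [1095, 1155, 1215, 1275]
print(list(minutes_frac))
```

Note that Int64Index and Float64Index were removed in pandas 2.0, so on recent versions the results print as Index([...], dtype=...) instead.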
The above is based on my own experience; I hope it serves as a useful reference.