Data processing often involves time data, and because its formats are rarely uniform, it tends to cause problems. Python's standard library provides modules for handling it, but they are not very convenient or human-friendly. This column previously published an article on handling time data in R with the lubridate package, and Python has packages that offer similar functionality. This article shows how to process time data with arrow, a third-party Python library.
Arrow provides an easy-to-use, intelligent way to create, manipulate, format, and transform time data.
Basic use
Arrow can flexibly parse time data in many formats, for example dates written with different separators. Convert the data to an Arrow object before processing it:
>>> import arrow
>>> arrow.get('2017-01-05')
<Arrow [2017-01-05T00:00:00+00:00]>
>>> arrow.get('2017.01.05')
<Arrow [2017-01-05T00:00:00+00:00]>
>>> arrow.get('2017/01/05')
<Arrow [2017-01-05T00:00:00+00:00]>
>>> # mixed separators: recent Arrow releases need an explicit format string
>>> arrow.get('2017/01.05', 'YYYY/MM.DD')
<Arrow [2017-01-05T00:00:00+00:00]>
Dates whose fields appear in a different order can be parsed by passing a format string:
>>> arrow.get('05/2017.01', 'DD/YYYY.MM')
<Arrow [2017-01-05T00:00:00+00:00]>
>>> arrow.get('05/01/2017', 'DD/MM/YYYY')
<Arrow [2017-01-05T00:00:00+00:00]>
>>> arrow.get('01.05.2017', 'MM.DD.YYYY')
<Arrow [2017-01-05T00:00:00+00:00]>
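When the input mixes several known layouts, one approach is to try each format string in turn. The following is a minimal sketch, not something arrow ships: the helper name parse_with_formats and the candidate list are assumptions made for illustration, and an ambiguous input resolves to whichever listed format matches first.

```python
import arrow

# Hypothetical list of layouts expected in the input; order matters,
# because the first format that matches wins.
CANDIDATE_FORMATS = ["DD/MM/YYYY", "MM.DD.YYYY", "DD/YYYY.MM"]


def parse_with_formats(text, formats=CANDIDATE_FORMATS):
    """Return an Arrow object for the first format string that parses `text`."""
    for fmt in formats:
        try:
            return arrow.get(text, fmt)
        except (ValueError, arrow.parser.ParserError):
            # This layout did not match; try the next one.
            continue
    raise ValueError("no candidate format matched: %r" % (text,))


for raw in ["05/01/2017", "01.05.2017", "05/2017.01"]:
    print(raw, "->", parse_with_formats(raw).format("YYYY-MM-DD"))
```

All three sample strings describe the same date, 2017-01-05, so the loop prints that date three times.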
Unix timestamps can be converted as well:
>>> arrow.get(1586782011)
<Arrow [2020-04-13T12:46:51+00:00]>
>>> arrow.get(1586782011.123456)
<Arrow [2020-04-13T12:46:51.123456+00:00]>
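A timestamp often arrives as text rather than as a number. The sketch below assumes the "X" format token, which arrow uses for Unix timestamps, and shows that both routes yield the same moment.

```python
import arrow

# Numeric timestamps are interpreted as seconds since the Unix epoch (UTC).
from_int = arrow.get(1586782011)

# A timestamp that arrives as text can be parsed with the "X" format token.
from_text = arrow.get("1586782011", "X")

print(from_int)
print(from_int == from_text)  # True: both describe the same instant

# .datetime exposes the underlying timezone-aware datetime object,
# so standard-library methods remain available.
print(from_int.datetime.timestamp())
```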
A date embedded in a longer string can also be extracted:
>>> arrow.get('June was born in May 1980', 'MMMM YYYY')
<Arrow [1980-05-01T00:00:00+00:00]>
Getting data
Once the data is an Arrow object, its individual components are available as attributes such as year, month, day, hour, minute, second, and week:
>>> now = arrow.now()
>>> now
<Arrow [2017-02-04T13:47:58.114342+08:00]>
>>> now.year
2017
>>> now.month
2
>>> now.day
4
>>> now.hour
13
>>> now.minute
47
>>> now.second
58
>>> now.week
5
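Because these are ordinary attributes, they can be collected programmatically. The sketch below uses a fixed moment instead of arrow.now() so the result is reproducible; the FIELD_NAMES tuple is just an illustrative choice of which attributes to read.

```python
import arrow

# A fixed moment, so the values below do not depend on when the code runs.
moment = arrow.get("2017-02-04T13:47:58.114342+08:00")

# Illustrative selection of attributes to read.
FIELD_NAMES = ("year", "month", "day", "hour", "minute", "second", "week")

# Read each attribute by name and collect the results in a dict.
fields = {name: getattr(moment, name) for name in FIELD_NAMES}

for name in FIELD_NAMES:
    print(name, "=", fields[name])
```

The week value is the ISO week number, which is why 4 February 2017 falls in week 5.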
Modifying data
Time data inevitably needs to be manipulated, and arrow provides convenient methods for that. For example, the to() method converts a time to another time zone:
>>> utc = arrow.get('2017-02-03T13:47:58.114342+00:00')
>>> utc
<Arrow [2017-02-03T13:47:58.114342+00:00]>
>>> utc.to('local')
<Arrow [2017-02-03T21:47:58.114342+08:00]>
>>> utc.to('US/Pacific')
<Arrow [2017-02-03T05:47:58.114342-08:00]>
>>> utc.to('+02:00')
<Arrow [2017-02-03T15:47:58.114342+02:00]>
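A point worth seeing in code is that to() changes only the wall-clock representation, not the instant itself. The sketch below uses fixed offsets so that it does not depend on a time zone database being installed; named zones are passed to to() in the same way.

```python
import arrow

utc = arrow.get("2017-02-03T13:47:58+00:00")

# Convert the same instant to two fixed UTC offsets.
plus_two = utc.to("+02:00")
minus_eight = utc.to("-08:00")

# The wall-clock hour differs in each zone...
print(plus_two.hour, minus_eight.hour)  # 15 5

# ...but all three objects still compare equal, because they are one instant.
print(plus_two == utc == minus_eight)  # True

# The offset from UTC is available as a timedelta.
print(plus_two.utcoffset())
```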
The shift() method moves a time by a relative offset. Older Arrow releases accepted these plural arguments in replace(), which now only sets absolute values through singular names such as hour:
>>> utc = arrow.get('2017-02-03T13:47:58.114342+00:00')
>>> utc
<Arrow [2017-02-03T13:47:58.114342+00:00]>
>>> utc.shift(days=+1)
<Arrow [2017-02-04T13:47:58.114342+00:00]>
>>> utc.shift(days=+1, hours=-1)
<Arrow [2017-02-04T12:47:58.114342+00:00]>
>>> utc.shift(weeks=+1)
<Arrow [2017-02-10T13:47:58.114342+00:00]>
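The difference between relative and absolute changes is easy to mix up, so here is a small sketch under the assumption of a recent Arrow release: plural keyword arguments to shift() move the time, while singular keyword arguments to replace() overwrite a field. Neither call modifies the original object.

```python
import arrow

utc = arrow.get("2017-02-03T13:47:58+00:00")

# Relative change: move forward by one day.
tomorrow = utc.shift(days=+1)

# Absolute change: set the time-of-day fields to fixed values.
midnight = utc.replace(hour=0, minute=0, second=0)

# Shifting by months clamps to the last valid day of the target month.
end_of_january = arrow.get("2017-01-31")
one_month_later = end_of_january.shift(months=+1)

print(tomorrow.format("YYYY-MM-DD HH:mm:ss"))   # 2017-02-04 13:47:58
print(midnight.format("YYYY-MM-DD HH:mm:ss"))   # 2017-02-03 00:00:00
print(one_month_later.format("YYYY-MM-DD"))     # 2017-02-28

# The original object is unchanged.
print(utc.format("YYYY-MM-DD HH:mm:ss"))        # 2017-02-03 13:47:58
```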
Data operations
Arrow objects can be compared directly with operators such as > and <, which follow chronological order, and two objects that represent the same instant in different time zones compare equal:
>>> start = arrow.get('2017-02-03T15:47:58.114342+02:00')
>>> end = arrow.get('2017-02-02T07:17:41.756144+02:00')
>>> start
<Arrow [2017-02-03T15:47:58.114342+02:00]>
>>> end
<Arrow [2017-02-02T07:17:41.756144+02:00]>
>>> start > end
True
>>> start_to = start.to('+08:00')
>>> start == start_to
True
You can also use the '-' operator to obtain the difference in time, such as:
>>> start - end
datetime.timedelta(days=1, seconds=30616, microseconds=358198)
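The result of the subtraction is a standard-library datetime.timedelta, whose seconds field is not very readable on its own. A short sketch of breaking it down into hours, minutes, and seconds with divmod:

```python
import arrow

start = arrow.get("2017-02-03T15:47:58.114342+02:00")
end = arrow.get("2017-02-02T07:17:41.756144+02:00")

# Subtracting two Arrow objects yields a datetime.timedelta.
delta = start - end

# timedelta stores days, seconds (0..86399) and microseconds separately.
hours, remainder = divmod(delta.seconds, 3600)
minutes, seconds = divmod(remainder, 60)

print(delta.days, "day(s)", hours, "h", minutes, "min", seconds, "s")
# 1 day(s) 8 h 30 min 16 s

# total_seconds() folds everything into one float.
print(delta.total_seconds())
```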
Time interval
Arrow can also return the time span that contains a given moment, using the span() method:
>>> utc = arrow.get('2017-02-03T13:47:58.114342+00:00')
>>> utc
<Arrow [2017-02-03T13:47:58.114342+00:00]>
>>> utc.span('hour')
(<Arrow [2017-02-03T13:00:00+00:00]>, <Arrow [2017-02-03T13:59:59.999999+00:00]>)
>>> utc.span('year')
(<Arrow [2017-01-01T00:00:00+00:00]>, <Arrow [2017-12-31T23:59:59.999999+00:00]>)
>>> utc.span('day')
(<Arrow [2017-02-03T00:00:00+00:00]>, <Arrow [2017-02-03T23:59:59.999999+00:00]>)
You can also get just the start (floor) or the end (ceil) of a given time frame:
>>> utc = arrow.get('2017-02-03T13:47:58.114342+00:00')
>>> utc
<Arrow [2017-02-03T13:47:58.114342+00:00]>
>>> utc.floor('year')
<Arrow [2017-01-01T00:00:00+00:00]>
>>> utc.ceil('year')
<Arrow [2017-12-31T23:59:59.999999+00:00]>
>>> utc.floor('day')
<Arrow [2017-02-03T00:00:00+00:00]>
>>> utc.ceil('day')
<Arrow [2017-02-03T23:59:59.999999+00:00]>
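One practical use of floor() is bucketing records by day or hour. The sketch below is only an illustration: the sample timestamps are made up, and group_by_frame is a hypothetical helper name, not part of arrow.

```python
from collections import defaultdict

import arrow

# Made-up sample timestamps for illustration.
RAW_EVENTS = [
    "2017-02-03T01:15:00+00:00",
    "2017-02-03T13:47:58+00:00",
    "2017-02-04T09:30:00+00:00",
]


def group_by_frame(raw_events, frame="day"):
    """Group timestamps by the start of the time frame they fall into."""
    groups = defaultdict(list)
    for raw in raw_events:
        moment = arrow.get(raw)
        # floor() snaps the moment to the start of its frame,
        # which serves as the bucket key.
        bucket = moment.floor(frame).format("YYYY-MM-DD HH:mm")
        groups[bucket].append(moment)
    return dict(groups)


by_day = group_by_frame(RAW_EVENTS)
for bucket in sorted(by_day):
    print(bucket, len(by_day[bucket]))

# span() returns the same floor and ceil as a pair,
# and the moment always lies between them.
moment = arrow.get(RAW_EVENTS[1])
day_start, day_end = moment.span("day")
print(day_start <= moment <= day_end)  # True
```

Passing frame="hour" to the same helper would bucket the events by hour instead.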
Humanize
Arrow can also express times in human-friendly relative terms through the humanize() method:
>>> earlier = arrow.utcnow().shift(hours=-2)
>>> earlier.humanize()
'2 hours ago'
>>> later = earlier.shift(hours=4)
>>> later.humanize(earlier)
'in 4 hours'
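Called without arguments, humanize() compares against the current time, so its output changes from run to run. Passing an explicit reference moment makes the result reproducible, as in this small sketch with a fixed base time:

```python
import arrow

# A fixed reference moment, so the output does not depend on the clock.
base = arrow.get("2017-02-03T13:47:58+00:00")
later = base.shift(hours=+4)
much_later = base.shift(days=+3)

# humanize(other) describes this moment relative to `other`.
print(later.humanize(base))       # in 4 hours
print(base.humanize(later))       # 4 hours ago
print(much_later.humanize(base))  # in 3 days
```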
Putting the pieces together in one short session:
>>> import arrow
>>> utc = arrow.utcnow()
>>> utc
<Arrow [2013-05-11T21:23:58.970460+00:00]>
>>> utc = utc.shift(hours=-1)
>>> utc
<Arrow [2013-05-11T20:23:58.970460+00:00]>
>>> local = utc.to('US/Pacific')
>>> local
<Arrow [2013-05-11T13:23:58.970460-07:00]>
>>> arrow.get('2013-05-11T21:23:58.970460+00:00')
<Arrow [2013-05-11T21:23:58.970460+00:00]>
>>> local.int_timestamp
1368303838
>>> local.format('YYYY-MM-DD HH:mm:ss ZZ')
'2013-05-11 13:23:58 -07:00'
>>> local.humanize()
'an hour ago'
>>> local.humanize(locale='ko_kr')
'1시간 전'
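Format tokens are case-sensitive, which is an easy source of mistakes: MM is the month while mm is the minute. The sketch below rebuilds the local time from the session with a fixed -07:00 offset (an assumption made so that it runs without a time zone database) and prints a few token combinations.

```python
import arrow

# Same instant as in the session, expressed at a fixed -07:00 offset.
local = arrow.get("2013-05-11T20:23:58.970460+00:00").to("-07:00")

# Full pattern: date, 24-hour time, and offset with a colon.
print(local.format("YYYY-MM-DD HH:mm:ss ZZ"))  # 2013-05-11 13:23:58 -07:00

# Upper-case MM is the month, lower-case mm is the minute.
print(local.format("MM"))  # 05
print(local.format("mm"))  # 23

# Z gives the offset without a colon, ZZ with one.
print(local.format("Z"))   # -0700

# isoformat() returns the ISO 8601 representation.
print(local.isoformat())   # 2013-05-11T13:23:58.970460-07:00
```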