timestamp

Overview

The timestamp statement is used to create or update a column containing a timestamp.

The value of the timestamp is derived by applying a template to one or (optionally) two source columns. Timestamps generated by the timestamp statement can be UNIX epoch values or yyyyMMdd format strings.

Syntax

timestampTimeCol[offsetsecs]usingColName [ColName2]templateTemplate[format yyyymmdd]

Details

The timestamp statement populates the TimeCol column with either a UNIX timestamp (an integer representing a number of seconds since 00:00:00 on January 1st, 1970) or a yyyyMMdd format string.

If a column called TimeCol does not exist, then one will be created. If it does exist then the values within it will be overwritten.

The ColName argument must be the name of an existing column which contains date and/or time information to be extracted and used to derive the values in the TimeCol column. If the optional ColName2 argument is present then for each row of data the values in TimeCol and TimeCol2 are concatenated before the extraction of the data is done.

The TimeCol, ColName and ColName2 arguments may be fully qualified column names, but all three must be located in the same DSET. If TimeCol does not exist, then it will be created in the DSET that ColName is located in.

A UNIX timestamp will be generated unless format yyyymmdd is specified in which case the result will be a yyyyMMdd format timestamp.

If a secs parameter is provided and if the output is to be a UNIX timestamp, the specified number of seconds will be added to, or subtracted from, the result. As well as being useful for adjusting for timezones, this permits a time which falls on a midnight boundary to be 'nudged' back 1 second to fall on 23:59:59 of the previous day.

Currently, an offset can only be applied to UNIX format timestamps

Data used by timestamps

In order to create a UNIX timestamp value, the following data is used:

In order to create a yyyyMMdd timestamp value, the following data is used:

In both cases the fields are extracted from the ColName and (optionally) ColName2 columns using the Template argument as detailed below.

Templates

The Template argument is a string of characters defining which characters in the ColName column (and, if present, the ColName2 column) are to be extracted in order to obtain the field values shown in the tables above.

The template for a UNIX format timestamp consists of the following characters:

Upper-case Y, M and D characters are used for the date

Lower-case h, m and s characters are used for the time

The template for a yyyyMMdd format timestamp consists of the following characters:

Source values

For every row in the dataset, the template is applied to a source value which is constructed from the ColName and ColName2 columns as follows:

When applying the template to the source value the characters in the template are examined one at a time. A dot (.) causes the character in the same position in the source value to be ignored. Any of the other template characters will cause the character in the same position in the source value to be extracted and added to one of the fields used to create the timestamp.

Here are some sample template definitions:

The length of the template may be shorter than the value that it is being applied to. In the last example shown above, the year, month and date values occur at the start of the string, and the template therefore is only as long as is required to extract them.

The template must always contain four Y characters to define the year, although they do not have to be consecutive. For all the other characters (apart from .) there may be 0 - 2 of them present in the template.

If none of any given character is present, then the value will default to the lowest possible value. For example the template YYYYMM.., when applied to the input value 20160224 will result in a year of 2016, a month of 02 and a day of 01 (being the lowest possible value of a day in any given month).

Timestamp generation

Once the above fields have been extracted, they are converted into a UNIX or yyyyMMdd timestamp value as follows:

  • If a UNIX timestamp, then it is adjusted for the local time of the Exivity server

  • If a yyyyMMdd timestamp then it is treated 'as is'

This value is then placed in the TimeCol column. The TimeCol column may be the same as ColName or ColName2.

Examples

Example 1 - UNIX format timestamp

Given a dataset of the form:

start_date,end_date,start_time,end_time,subscriptionId, ...
20160630,20160630,14:00:00,14:59:59,a9470811-83f2-474b-9523-0ece853d8c3c, ...
20160630,20160630,15:00:00,15:59:59,a9470811-83f2-474b-9523-0ece853d8c3c, ...

The statements:

timestamp start_time using start_date start_time template "YYYYMMDDhh.mm.ss"
timestamp end_time using end_date end_time template "YYYYMMDDhh.mm.ss"
delete columns start_date end_date

Will produce the following result:

start_time,end_time,subscriptionId, ...
1467291600,1467295199,a9470811-83f2-474b-9523-0ece853d8c3c, ...
1467295200,1467298799,a9470811-83f2-474b-9523-0ece853d8c3c, ...

For verification, the converted values translate back to the following times:

Example 2 - yyyyMMdd format timestamp

Given a dataset of the form:

subscriptionId,effectiveDate ...
a9470811-83f2-474b-9523-0ece853d8c3c,2015-09-01T00:00:00Z
a9470811-83f2-474b-9523-0ece853d8c3c,2017-01-01T00:00:00Z

The statement:

timestamp effectiveDate using effectiveDate template YYYY.MM.DD format yyyymmdd

Will produce the following result:

subscriptionId,effectiveDate ...
a9470811-83f2-474b-9523-0ece853d8c3c,20150901
a9470811-83f2-474b-9523-0ece853d8c3c,20170101

Last updated