timecolumns
Overview
The timecolumns statement is used to set the start time and end time columns in a DSET.
Syntax
timecolumns start_time_col end_time_col
timecolumns clear
Details
The usage data stored in a DSET may or may not be time sensitive. By default it is not, and every record is treated as representing usage for the entire day. In many cases, however, the usage data contains start and end times for each record, which define the exact period within the day for which the record is valid.
If the usage data contains start and end times that are required for functions such as aggregation or reporting, the column(s) containing those times need to be marked so that they can be identified further on in the processing pipeline. This marking is done using the timecolumns statement.
The timecolumns statement does not perform any validation of the values in either of the columns it is flagging. This is by design, as the values in the columns may be updated by subsequent statements.
The values in the columns will be validated by the finish statement.
If the timecolumns statement is executed more than once, then only the columns named by the latest execution of the statement will be flagged. It is not possible to have more than one start time and one end time column.
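As a small illustration of this behaviour, the sketch below runs the statement twice. The column names are hypothetical and are assumed to exist already in the default DSET.

```
# Flag a first pair of columns (hypothetical names)
timecolumns first_start first_end

# Flag a different pair. This replaces the earlier flags, so only
# second_start and second_end remain marked as the time columns.
timecolumns second_start second_end
```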
Both the start_time_col and end_time_col parameters may be fully qualified column names, but they must both belong to the same DSET.
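A sketch of the fully qualified form follows. The DSET ID Azure.usage and the column names are hypothetical, and the source.alias.column layout of a fully qualified name is an assumption here rather than something this page defines.

```
# Both columns are qualified with the same (hypothetical) DSET ID
timecolumns Azure.usage.START_TIME Azure.usage.END_TIME
```

Naming columns from two different DSETs in one statement would not satisfy the rule above.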
It is possible to use the same column as both the start and end times. In such cases the usage record is treated as spanning 1 second of time. To do this, reference the column twice in the statement.
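For instance, assuming a single timestamp column with the hypothetical name event_time, the statement might look like this:

```
# The same column is named as both the start and the end time,
# so each record is treated as spanning 1 second
timecolumns event_time event_time
```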
Clearing the flagged timestamp columns
To clear both the start and end time columns, so that the default DSET reverts to treating each record as spanning the entire day, use the statement timecolumns clear.
Currently the statement timecolumns clear will only clear the timestamp columns in the default DSET.
This can be useful in the following use case:
1. The DSET is loaded and timestamp columns are created
2. finish is used to create a time-sensitive RDF
3. The timestamp columns are cleared
4. The DSET is renamed using the rename dset statement
5. Further processing is done on the DSET as required
6. finish is used to create a second RDF which is not time-sensitive
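The use case above could be sketched roughly as follows. This is an outline rather than a complete script: the column names and DSET IDs are hypothetical, the loading and timestamp-creation steps are left out, and the argument form shown for rename dset is an assumption that should be checked against that statement's own documentation.

```
# The DSET is assumed to be loaded already, with timestamp columns
# START_TIME and END_TIME (hypothetical names) already created
timecolumns START_TIME END_TIME

# Create the time-sensitive RDF
finish

# Clear the flagged columns in the default DSET
timecolumns clear

# Rename the DSET (argument form assumed, IDs hypothetical)
rename dset Azure.usage to Azure.daily

# ... further processing as required ...

# Create the second RDF, which is not time-sensitive
finish
```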
Example
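A minimal sketch of typical usage, assuming the default DSET already contains two timestamp columns with the hypothetical names START_TIME and END_TIME:

```
# Flag the start and end time columns. No validation of the
# values happens at this point.
timecolumns START_TIME END_TIME

# The values in the flagged columns are validated here
finish
```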