correlate

Overview

The correlate statement is used to enrich the default DSET by adding new columns to it, and/or updating existing columns with useful values. The new column names are derived from other DSETs and the values in those columns are set using a lookup function based on the value in a key column shared between the DSETs.

Syntax

correlateColName1 [ ... ColNameN]usingKeyColumn[assumingassumeDSET][defaultDefaultValue]

Details

The ColName1 ... ColNameN arguments are column names that will be copied from their original DSETs and merged into the default DSET.

Column names must be fully qualified, unless the assuming parameter is used, in which case any column names that are not fully-qualified will be assumed to belong to the DSET specified by assumeDSET.

Source and Destination columns

Source columns are those from which a cell is to be copied when the KeyColumn matches. Destination columns are columns in the default DSET into which a cell will be copied. Destination column names are derived from the names of the source columns as follows:

  • The source column is the argument in its original form, for example: Azure.usage.MeterName

  • The destination column is the same argument, but with the DSET ID replaced with that of the default DSET. For example if the default DSET is Custom.Services then the destination column for the above would be Custom.Services.MeterName.

If a destination column name doesn't exist in the default DSET then a new column with that name will automatically be created.

The Key Column

The KeyColumn argument is a column name that must not be fully qualified and which must exist in the default DSET and all of the DSETs referenced by the ColNameN arguments.

Default values

The DefaultValue argument, if present, specifies the value to write into the destination column if there is no match for the KeyColumn. If the DefaultValue argument is not specified then any rows where there is no match will result in a blank cell in the destination column.

For each row in the default DSET, the source DSET is searched for a matching KeyColumn value, and if a match is found then the value in the source column is used to update the default DSET. The row of the first match found in the source DSET will be used.

Overwriting

When matching the KeyColumn values, the logic in the following table is evaluated against every row in the destination DSET.

means no or disabled, means yes or enabled

Examples

Given two Datasets as follows, where the default DSET is MyData.Owners:

Dataset 'MyData.Owners'

owner,id
John,100
Tim,110
Fokke,120
Joost,130
Jon,140

Dataset 'Custom.Services'

service,description,id
Small_VM,Webserver,130
Medium_VM,App_Server,100
Large_VM,DB_Server,110
Medium_VM,Test_Server,120

The statement: correlate service description using id assuming Custom.Services

Will enrich the MyData.Owners Dataset such that it contains:

owner,id,service,description
John,100,Medium_VM,App_Server
Tim,110,Large_VM,DB_Server
Fokke,120,Medium_VM,Test_Server
Joost,130,Small_VM,Web_Server
Jon,140,,

The statement: correlate service description using id assuming Custom.Services default unknown

Will produce:

owner,id,service,description
John,100,Medium_VM,App_Server
Tim,110,Large_VM,DB_Server
Fokke,120,Medium_VM,Test_Server
Joost,130,Small_VM,Web_Server
Jon,140,unknown,unknown

Last updated