# correlate

## Overview

The `correlate` statement is used to enrich the default DSET by adding new columns to it, and/or updating existing columns with useful values. The new column names are derived from other DSETs and the values in those columns are set using a lookup function based on the value in a key column shared between the DSETs.

## Syntax

**`correlate`***`ColName1 [ ... ColNameN]`***`using`***`KeyColumn`***`[assuming`***`assumeDSET]`***`[default`***`DefaultValue]`*

## Details

The *ColName1 ... ColNameN* arguments are column names that will be copied from their original DSETs and merged into the default DSET.

Column names must be [fully qualified](https://olddocs.exivity.io/2.3.1/getting-started/concepts/datasets), unless the *assuming* parameter is used, in which case any column names that are not fully-qualified will be assumed to belong to the DSET specified by *assumeDSET*.

### Source and Destination columns

Source columns are those from which a cell is to be copied when the *KeyColumn* matches. Destination columns are columns in the default DSET into which a cell will be copied. Destination column names are derived from the names of the source columns as follows:

* The source column is the argument in its original form, for example: *Azure.usage.MeterName*
* The destination column is the same argument, but with the DSET ID replaced with that of the [default](https://olddocs.exivity.io/2.3.1/diving-deeper/transform/language/default) DSET. For example if the default DSET is *Custom.Services* then the destination column for the above would be *Custom.Services.MeterName*.

If a destination column name doesn't exist in the default DSET then a new column with that name will automatically be created.

### The Key Column

The *KeyColumn* argument is a column name which must not be fully qualified and which must exist in the default DSET and all of the DSETs referenced by the *ColNameN* arguments.

### Default values

The *DefaultValue* argument, if present, specifies the value to write into the destination column if there is no match for the *KeyColumn*. If the *DefaultValue* argument is not specified then any rows where there is no match will result in a blank cell in the destination column.

For each row in the default DSET, the source DSET is searched for a matching *KeyColumn* value, and if a match is found then the value in the source column is used to update the default DSET. The row of the first match found in the source DSET will be used.

### Overwriting

When matching the *KeyColumn* values, the logic in the following table is evaluated against every row in the destination DSET.

`✘` means *no* or *disabled*, `✔` means *yes* or *enabled*

| Match Found | Overwrite | Default Value | Result                                                                        |
| ----------- | --------- | ------------- | ----------------------------------------------------------------------------- |
| ✘           | ✘         | ✘             | No values will be updated                                                     |
| ✔           | ✘         | ✘             | Empty destination column cells will be updated                                |
| ✘           | ✘         | ✔             | Empty destination column cells will be set to the default value               |
| ✔           | ✘         | ✔             | Empty destination column cells will be set to the matched source column value |
| ✘           | ✔         | ✘             | No values will be updated                                                     |
| ✔           | ✔         | ✘             | Destination column cells will be updated                                      |
| ✘           | ✔         | ✔             | Destination column cells will be set to the default value                     |
| ✔           | ✔         | ✔             | Destination column cells will be set to the matched source column value       |

## Examples

Given two Datasets as follows, where the default DSET is *MyData.Owners*:

**Dataset 'MyData.Owners'**

```
owner,id
John,100
Tim,110
Fokke,120
Joost,130
Jon,140
```

**Dataset 'Custom.Services'**

```
service,description,id
Small_VM,Webserver,130
Medium_VM,App_Server,100
Large_VM,DB_Server,110
Medium_VM,Test_Server,120
```

The statement:\
&#x20;`correlate service description using id assuming Custom.Services`

Will enrich the *MyData.Owners* Dataset such that it contains:

```
owner,id,service,description
John,100,Medium_VM,App_Server
Tim,110,Large_VM,DB_Server
Fokke,120,Medium_VM,Test_Server
Joost,130,Small_VM,Web_Server
Jon,140,,
```

The statement:\
&#x20;`correlate service description using id assuming Custom.Services default unknown`

Will produce:

```
owner,id,service,description
John,100,Medium_VM,App_Server
Tim,110,Large_VM,DB_Server
Fokke,120,Medium_VM,Test_Server
Joost,130,Small_VM,Web_Server
Jon,140,unknown,unknown
```
