# split

## Overview

The `split` statement is used to create and/or update columns by splitting the textual value of an existing column into multiple parts.

## Syntax

**`split`***`ColName`***`using`***`sep`*

**`split`***`ColName`***`using`***`sep`***`retaining first`***`[column_count]`*

**`split`***`ColName`***`using`***`sep`***`retaining last`***`[column_count]`*

**`split`***`ColName`***`using`***`sep`***`retaining`***`first_column_index`***`[to`***`last_column_index]`*

{% hint style="info" %}
The keyword following *ColName* may be `using`, `separator` or `delimiter`. All three work in exactly the same way.
{% endhint %}

## Details

The *ColName* argument is the name of an existing column whose values are to be split and the *sep* argument (which must only be a single character in length) is the delimiter by which to split them.

For example a value `one:two:three:four`, if divided by a *sep* character of `:`, would result in the fields `one`, `two`, `three` and `four`.

{% hint style="info" %}
To specify a *sep* value of a space, use `" "` or`\<space>`\
(where `<space>`is a literal space character)
{% endhint %}

New columns are created if necessary to contain the values resulting from the split. These additional columns are named *`ColName_split1`* through *`ColName_splitN`* where *N* is the number of values resulting from the process.

If there is an existing column with a name conflicting with any of the columns that `split` creates then:

* If the [overwrite option](/3.4.3/data-pipelines/transform/language/option.md) is set then all values in that pre-existing column will be overwritten
* If the [overwrite option](/3.4.3/data-pipelines/transform/language/option.md) is not set:
  * If there are blank values in the existing column they will be updated
  * If there are no blank values in the existing column then no changes will be made to it

### Keeping only specific fields

If the `retaining` keyword is specified, `split` will discard one or more of the resulting columns automatically based on the specification that follows the keyword. The following specifications are supported:

| Specification         | Result                                                                |
| --------------------- | --------------------------------------------------------------------- |
| `first` or `1`        | Discard all but the first column                                      |
| `first N` or `1 to N` | Discard all but the first *N* columns                                 |
| `last`                | Discard all but the last (rightmost) column                           |
| `last N`              | Discard all but the last *N* columns                                  |
| `N`                   | Discard all but the \_N\_th column                                    |
| `N` to `M`            | Discard all but the columns from the \_N\_th to the \_M\_th inclusive |

{% hint style="info" %}
In the table above, *N* refers to a result column's number where the first column created by `split` has a number of `1`
{% endhint %}

## Example

Given an input dataset of the form:

```
Name,ID
VM-One,sales:2293365:37
VM-Two,marketing:18839:division:89AB745
VM-Three,development:34345:engineering:345345:Jake Smith
VM-Four,sales::38
VM-Five,marketing:234234234:testMachine
VM-Six,development:xxxx:test
VM-Seven,1234:5678
VM-Eight,test:::
VM-Nine,field::5
VM-Ten,test::3425:
```

The statement ...

`split ID using :`

... will result in the dataset:

```
Name,ID,ID_split1,ID_split2,ID_split3,ID_split4,ID_split5
VM-One,sales:2293365:37,sales,2293365,37,,
VM-Two,marketing:18839:division:89AB745,marketing,18839,division,89AB745,
VM-Three,development:34345:engineering:345345:Jake Smith,development,34345,engineering,345345,Jake Smith
VM-Four,sales::38,sales,,38,,
VM-Five,marketing:234234234:testMachine,marketing,234234234,testMachine,,
VM-Six,development:xxxx:test,development,xxxx,test,,
VM-Seven,1234:5678,1234,5678,,,
VM-Eight,test:::,test,,,,
VM-Nine,field::5,field,,5,,
VM-Ten,test::3425:,test,,3425,,
```

Using the same original dataset, the statement ...

`split ID using : retaining 3 to 5`

... will result in the dataset:

```
Name,ID,ID_split1,ID_split2,ID_split3
VM-One,sales:2293365:37,37,,
VM-Two,marketing:18839:division:89AB745,division,89AB745,
VM-Three,development:34345:engineering:345345:Jake Smith,engineering,345345,Jake Smith
VM-Four,sales::38,38,,
VM-Five,marketing:234234234:testMachine,testMachine,,
VM-Six,development:xxxx:test,test,,
VM-Seven,1234:5678,,,
VM-Eight,test:::,,,
VM-Nine,field::5,5,,
VM-Ten,test::3425:,3425,,
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://olddocs.exivity.io/3.4.3/data-pipelines/transform/language/split.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
