LogoLogo
3.6.9
3.6.9
  • Introduction
  • Getting started
    • Installation
      • Prerequisites
        • Server requirements
      • On-premises
        • Single-node
          • Directory structure
        • Multi-node
      • Azure Market Place
      • AWS Market Place
    • Tutorials
      • Amazon AWS CUR
      • Amazon AWS CUR (Athena)
      • Azure Stack
      • Azure EA
      • Azure CSP
      • Google Cloud
      • VMware vCloud
      • VMware vCenter
    • How-to guides
      • How to configure receiving a monthly billing report
      • How to automatically trigger a monthly billing report
      • How to update your license
      • How to store contract information with an Account in a report
      • How to automatically send workflow errors as webhooks to a monitoring system
    • Concepts
      • User interface
      • Services
    • Releases
      • Upgrading to version 3
      • Known issues
      • Announcements
      • Archive
  • Reports
    • Accounts
    • Services
    • Instances
    • Summary
    • Budget
  • Services
    • Manage
    • Rates
      • Tiered Services
        • Aggregation Levels and the Account Hierarchy
    • Adjustments
    • Subscriptions
  • ACCOUNTS
    • Budget management
  • Data pipelines
    • Extract
      • Configuration
      • Extractor templates
      • Script basics
      • Parslets
      • Subroutines
        • check_dateformat
        • check_dateargument
        • format_date
        • validate_response
      • Language
        • aws_sign_string
        • basename
        • buffer
        • csv
        • clear
        • decimal_to_ipv4
        • discard
        • encode
        • encrypt
        • environment
        • escape
        • exit_loop
        • foreach
        • generate_jwt
        • get_last_day_of
        • gosub
        • gunzip
        • hash
        • http
        • if
        • ipv4_to_decimal
        • json
        • loglevel
        • loop
        • lowercase
        • match
        • pause
        • print
        • return
        • save
        • set
        • subroutine
        • terminate
        • unzip
        • uppercase
        • uri
        • var
    • Transform
      • Configuration
      • Transformer templates
      • Transform Preview
      • Language
        • aggregate
        • append
        • calculate
        • capitalise
        • convert
        • copy
        • correlate
        • create
        • default
        • delete
        • dequote
        • environment
        • event_to_usage
        • export
        • finish
        • Functions
        • if
        • import
        • include
        • lowercase
        • normalise
        • option
        • rename
        • replace
        • round
        • services
        • set
        • sort
        • split
        • terminate
        • timecolumns
        • timerender
        • timestamp
        • update_service
        • uppercase
        • var
        • where
    • Datasets
    • Lookups
    • Metadata
    • Reports
    • Workflows
  • Administration
    • User management
      • Users
      • Groups
    • Notifications
      • Budget Notifications
      • Report notifications
      • Workflow notifications
    • Settings
      • Global Variables
      • White Labeling
  • Advanced
    • Integrate
      • GUI automation
        • Examples
      • API docs
      • Single sign-on
        • Claims-based identity provisioning: users, Account access and user groups
        • Azure-AD
        • Auth0
        • OKTA
        • OneLogin
        • ADFS
        • LDAP
    • Digging deeper
      • Authentication flows
      • Transformer datadate
      • Dataset lifecycle
      • Config.json
      • Databases
  • Security
    • Security
    • Authentication
      • Token
      • LDAP
      • SAML2
    • Password reset
    • Password policy
    • Announcements
  • Troubleshooting
    • Logs
  • Terms & Conditions
  • Privacy Policy
Powered by GitBook
On this page
  • Overview
  • Syntax
  • Details
  • Keeping only specific fields
  • Example

Was this helpful?

Export as PDF
  1. Data pipelines
  2. Transform
  3. Language

split

Overview

The split statement is used to create and/or update columns by splitting the textual value of an existing column into multiple parts.

Syntax

splitColNameusingsep

splitColNameusingsepretaining first[column_count]

splitColNameusingsepretaining last[column_count]

splitColNameusingsepretainingfirst_column_index[tolast_column_index]

The keyword following ColName may be using, separator or delimiter. All three work in exactly the same way.

Details

The ColName argument is the name of an existing column whose values are to be split and the sep argument (which must only be a single character in length) is the delimiter by which to split them.

For example a value one:two:three:four, if divided by a sep character of :, would result in the fields one, two, three and four.

To specify a sep value of a space, use " " or\<space> (where <space>is a literal space character)

New columns are created if necessary to contain the values resulting from the split. These additional columns are named ColName_split1 through ColName_splitN where N is the number of values resulting from the process.

If there is an existing column with a name conflicting with any of the columns that split creates then:

    • If there are blank values in the existing column they will be updated

    • If there are no blank values in the existing column then no changes will be made to it

Keeping only specific fields

If the retaining keyword is specified, split will discard one or more of the resulting columns automatically based on the specification that follows the keyword. The following specifications are supported:

Specification

Result

first or 1

Discard all but the first column

first N or 1 to N

Discard all but the first N columns

last

Discard all but the last (rightmost) column

last N

Discard all but the last N columns

N

Discard all but the _N_th column

N to M

Discard all but the columns from the _N_th to the _M_th inclusive

In the table above, N refers to a result column's number where the first column created by split has a number of 1

Example

Given an input dataset of the form:

Name,ID
VM-One,sales:2293365:37
VM-Two,marketing:18839:division:89AB745
VM-Three,development:34345:engineering:345345:Jake Smith
VM-Four,sales::38
VM-Five,marketing:234234234:testMachine
VM-Six,development:xxxx:test
VM-Seven,1234:5678
VM-Eight,test:::
VM-Nine,field::5
VM-Ten,test::3425:

The statement ...

split ID using :

... will result in the dataset:

Name,ID,ID_split1,ID_split2,ID_split3,ID_split4,ID_split5
VM-One,sales:2293365:37,sales,2293365,37,,
VM-Two,marketing:18839:division:89AB745,marketing,18839,division,89AB745,
VM-Three,development:34345:engineering:345345:Jake Smith,development,34345,engineering,345345,Jake Smith
VM-Four,sales::38,sales,,38,,
VM-Five,marketing:234234234:testMachine,marketing,234234234,testMachine,,
VM-Six,development:xxxx:test,development,xxxx,test,,
VM-Seven,1234:5678,1234,5678,,,
VM-Eight,test:::,test,,,,
VM-Nine,field::5,field,,5,,
VM-Ten,test::3425:,test,,3425,,

Using the same original dataset, the statement ...

split ID using : retaining 3 to 5

... will result in the dataset:

Name,ID,ID_split1,ID_split2,ID_split3
VM-One,sales:2293365:37,37,,
VM-Two,marketing:18839:division:89AB745,division,89AB745,
VM-Three,development:34345:engineering:345345:Jake Smith,engineering,345345,Jake Smith
VM-Four,sales::38,38,,
VM-Five,marketing:234234234:testMachine,testMachine,,
VM-Six,development:xxxx:test,test,,
VM-Seven,1234:5678,,,
VM-Eight,test:::,,,
VM-Nine,field::5,5,,
VM-Ten,test::3425:,3425,,
PrevioussortNextterminate

Last updated 3 years ago

Was this helpful?

If the is set then all values in that pre-existing column will be overwritten

If the is not set:

overwrite option
overwrite option