Example subroutines which can be used in your data Extractors
The gosub function allows a data Extractor script to invoke a subroutine. Subroutines are useful to avoid duplicating code snippets.
Our customers and Solution Architects have created several useful subroutines over the last few years. We are keeping a small library of the most useful ones on our public docs. You can find them listed on this page.
The aws_sign_string statement is used to generate an AWS4-HMAC-SHA256 signature, used as the signature component of the Authorization HTTP header when calling the AWS API.
aws_sign_string varName using secret_key date region service
The authentication method used by AWS requires the generation of an authorization signature which is derived from a secret key known to the client along with specific elements of the query being made to the API.
This is a fairly involved process and a full step-by-step walkthrough is provided by Amazon on the following pages (these should be read in the order listed below):
The aws_sign_string statement is used to generate the final signature as detailed on the calculate signature page listed above.
Note that in order to use this statement it is necessary to have the following strings available:
A string to sign, obtained by following the process described in creating a string to sign, containing metadata about the request being made
A secret_key, obtained from Amazon which is used by any client application authorising against their API
The date associated with the API request, in YYYYMMDD format
The AWS region associated with the API request (for example eu-central-1)
The AWS service being accessed (for example s3)
The aws_sign_string statement will use these inputs to generate the HMAC-SHA256 signature which is a component of the Authorization header when connecting to the API itself.
The varName parameter is the name of a variable containing the string to sign. After executing aws_sign_string, the contents of this same variable will have been updated to the base-16 encoded signature value.
If there are any errors in the string to sign, date, AWS region or AWS service strings used as input to aws_sign_string then a signature will still be generated, but the AWS API will reject the request. In this case it is necessary to review the process by which these strings were created as per the AWS guide referenced above.
The following is an example USE script that implements everything described above.
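As an illustrative sketch only (not the full script): assume the variables string_to_sign and aws_secret_key have already been created earlier in the script by following the AWS walkthrough; the date, region and service values below are placeholders.

```
# Copy the string to sign, since aws_sign_string overwrites its input
var signature = ${string_to_sign}

# Replace the contents of 'signature' with the base-16 encoded result
aws_sign_string signature using ${aws_secret_key} 20190101 eu-central-1 s3

# The value can now be used when building the Authorization header
print ${signature}
```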
Exivity provides a catalogue of USE extraction scripts that can be used to integrate with almost any cloud provider, hypervisor or legacy IT end point. We've published some of our templates on GitHub for your convenience.
This repository contains Extractors for VMware, Azure, Amazon and others. However, if you are currently missing an integration template and are unwilling or unable to create your own, feel free to drop us an e-mail at support@exivity.com.
The basename statement is used to extract the filename portion of a path + filename string.
basename varName
basename string as varName
Given a string describing the full path of a file, such as /extracted/test/mydata.csv, the basename statement is used to identify only the filename portion of that string (including the file extension, if any). If there are no path delimiters in the string then the original string is returned.
The basename statement supports both UNIX-style (forward slash) and Windows-style (backslash) delimiters.
When invoked as basename varName, the varName parameter must be the name of the variable containing the string to analyse. The value of the variable will be updated with the result, so care should be taken to copy the original value to a new variable beforehand if the full path may be required later in the script.
As a convenience in cases where the full path needs to be retained, the result of the operation can be placed into a separate variable by using the form basename string as varName, where string is the value containing the full path + filename and varName is the name of the variable to set as the result.
When invoked using basename string as varName, if a variable called varName does not exist then it will be created, else its value will be updated.
The following script ...
... will produce the following output:
The following script ...
... will produce the following output:
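To make the two invocation forms concrete, a small sketch (the path is the one used in the description above; the variable names are arbitrary):

```
# Form 1: the variable is updated in place
var path = /extracted/test/mydata.csv
basename path
print ${path}

# Form 2: the original value is preserved and the result goes
# into a separate variable (created if it does not exist)
var fullpath = /extracted/test/mydata.csv
basename ${fullpath} as fname
print ${fname}
```

Both print statements should output mydata.csv, while in the second form fullpath still holds the complete path.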
The buffer command is used to create and/or populate a named buffer with data.
buffer name = protocol protocol_parameter(s)
The first argument to the buffer statement is the name of the buffer to create. If a buffer with this name already exists then any data it contains will be overwritten.
There must be whitespace on both sides of the 'equals' symbol following the buffer name.
The following protocols are supported:
buffer buffername = file filename
The file protocol imports a file directly into a buffer. This can be very useful when developing USE scripts, as the USE script for processing a JSON file (for example) can be implemented without requiring access to a server.
If the specified buffer name already exists, then a warning will be logged and any data in it will be cleared before importing the file.
buffer buffername = data string
The data protocol populates the buffer with the literal text specified in string. This is useful when extracting embedded JSON. For example the JSON snippet below contains embedded JSON in the instanceData field:
In this case the instanceData field can be extracted using a parslet, placed into a new buffer and re-parsed to extract the values within it. Assuming the snippet is in a file called my_data.json, this would be done as follows:
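A sketch of that sequence, assuming my_data.json resides in the current directory (the path and the inner field name licenseType are illustrative assumptions):

```
# Load the file containing the outer JSON document
buffer outer = file "my_data.json"

# Copy the embedded JSON string into its own buffer
buffer inner = data $JSON{outer}.[instanceData]

# The embedded document can now be parsed like any other buffer
print $JSON{inner}.[licenseType]
```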
buffer buffername = http method url
!!! note For full details on the HTTP protocol and its parameters please refer to the http article.
Once the HTTP request has been executed, any data it returned will be contained in the named buffer, even if the data is binary in format (eg: images, audio files or anything else non-human readable).
If the HTTP request returned no data, one of the following will apply:
If the buffer does not already exist then the buffer will not be created
If the buffer already exists then it will be deleted altogether
For details of how to access the data in a named buffer, please refer to the USE script basics article.
buffer buffername = odbc dsn [username password] query
username and password are optional, but neither or both must be specified
where:
dsn is the ODBC Data Source Name (this should be configured at the OS level)
username and password are the credentials required by the DSN
query is an SQL query
Once the query has been executed, the resulting data is located in the named buffer. It can subsequently be saved as a CSV file to disk using:
save {buffername} as filename.csv
The resulting CSV uses a comma (,) as the separator and a double quote (") as the quoting character. Any fields in the data which contain a comma will be quoted.
buffer buffername = odbc_direct query
where query is an SQL query. This executes the query against the ODBC data source described by the set statement's odbc_connect parameter.
Once the query has been executed, the resulting data is located in the named buffer. It can subsequently be saved as a CSV file to disk using:
save {buffername} as filename.csv
The resulting CSV uses a comma (,) as the separator and a double quote (") as the quoting character. Any fields in the data which contain a comma will be quoted.
The following examples retrieve data from ODBC and HTTP sources:
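The sketches below are illustrative only: the DSN, credentials, SQL query and URL are all hypothetical.

```
# ODBC: run a query against a DSN configured at the OS level
buffer dbdata = odbc MyDSN dbuser dbpass "SELECT id, usage FROM metrics"
save {dbdata} as "extracted/metrics.csv"

# HTTP: retrieve a document from a REST endpoint into a buffer
buffer response = http GET "https://api.example.com/v1/usage"
print {response}
```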
The csv statement is used to create and populate CSV files. It is typically combined with foreach loops to write values extracted from an array in a JSON and/or XML document stored in a named buffer.
CSV files are produced via the use of multiple csv statements which perform the following functions:
Create a new empty CSV file
Define the headers
Finalise the headers
Write data to one or more rows of the file
Close the file
All CSV files created by the csv command use a comma as the separator character and a double quote as the quote character. Headers and data fields are automatically separated and quoted.
The following is used to create a new, empty CSV file:
csv label = filename
The label must not be associated with any other open CSV file. Any number of CSV files may be open simultaneously and the label is used by subsequent csv statements to determine which of the open files the statement should operate on. Labels are case sensitive and may be from 1 to 15 characters in length.
The specified filename is created immediately, and if it is the name of an existing file then it will be truncated to 0 bytes when opened.
The filename argument may contain a path component, but the csv statement does not create directories, so any path component in the filename must already exist. The path, if specified, will be local to the Exivity home directory.
csv usage = "${exportdir}/azure_usage.csv"
This section refers to add_headers as the action, but either add_header or add_headers may be used. Both variants work in an identical fashion.
csv add_headers label header1 [header2 ... headerN]
All CSV files created by a USE script must start with a header row which names the columns in the file. The number of columns can vary from file to file, but in any given file every data row must have the same number of columns as there are headers.
To create one or more columns in a newly created CSV file, the csv add_headers statement is used as shown above. The label must match the label previously associated with the file as described previously.
One or more header names can be specified as arguments to csv add_headers. Multiple instances of the csv add_headers statement may reference the same CSV file, as each statement will append additional headers to any headers already defined for the file.
No checks are done to ensure the uniqueness of the headers. It is therefore up to the script author to ensure that all the specified headers in any given file are unique.
csv add_headers usage username user_id subscription_id
This section refers to fix_headers as the action, but either fix_header or fix_headers may be used. Both variants work in an identical fashion.
csv fix_headers label
After csv add_headers has been used to define at least one header, the headers are finalised using the csv fix_headers statement. Once the headers have been fixed, no further headers can be added to the file, and until the headers have been fixed, no data can be written to the file.
csv fix_headers usage
This section refers to write_fields as the action, but either write_field or write_fields may be used. Both variants work in an identical fashion.
csv write_fields label value1 [value2 ... valueN]
After the headers have been fixed, the csv write_fields statement is used to write one or more fields of data to the CSV file. Currently it is not possible to write a blank field using csv write_fields; however, when extracting data from a buffer using a parslet, if the extracted value is blank then it will automatically be expanded to the string (no value).
USE keeps track of the rows and columns as they are populated using one or more csv write_fields statements, and will automatically write the fields from left to right, starting at the first column in the first data row, and will advance to the next row when the rightmost column has been written to.
It is the responsibility of the script author to ensure that the number of fields written to a CSV file is such that the last row is complete when the file is closed, in order to avoid malformed files with one or more fields missing from the last row.
csv close label
Once all fields have been written to a CSV file, it must be closed using the csv close statement. This will ensure that all data is properly flushed to disk, and will free the label for re-use.
csv close usage
Consider the file "\examples\json\customers.json" representing two customers:
Using a combination of foreach loops and parslets, the information in the above JSON can be converted to CSV format as follows:
The resulting CSV file is as follows:
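As a hedged sketch of the whole sequence (the customers array and its name and city fields, the buffer name and the output path are assumptions for illustration):

```
# Load the source JSON into a named buffer
buffer custdata = file "examples/json/customers.json"

# Create the CSV file and define its columns
csv custfile = "exported/customers.csv"
csv add_headers custfile name city
csv fix_headers custfile

# Write one row per element of the customers array
foreach $JSON{custdata}.[customers] as this_customer {
    csv write_fields custfile $JSON(this_customer).[name] $JSON(this_customer).[city]
}

# Flush to disk and free the label for re-use
csv close custfile
```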
The environment statement specifies the name of the environment to use for resolving global variables.

environment name
The environment statement selects the predefined environment to use for global variable lookup. It is an error to specify an environment that is not defined in the global database.
If no environment is specified, the default environment (the one marked as default in the global database) is assumed.
The environment can be changed any number of times, but a change affects only global variables that are referenced for the first time after the change; global variables that have already been resolved (copied to local variables) retain their values.
This article assumes knowledge of variables.
The encrypt statement is used to conceal the value of a variable, such that it does not appear in plain text in a USE script.

encrypt var name = value_to_be_encrypted
The encrypt statement differs from other statements in that it takes effect before execution of a USE script begins. In this regard it is effectively a directive to the internal script pre-processor which prepares a script for execution.
Comments, quotes and escapes in the value to be encrypted are treated as literal text up until the end of the line.
White-space following the value to be encrypted will therefore be included in the encrypted result.
White-space preceding the value to be encrypted will be ignored and will not be included in the encrypted result.
Any variable prefixed with the word encrypt will be encrypted by the pre-processor and the script file itself will be modified as follows:
All text (including trailing white-space) from the word following the = character up to the end of the line is encrypted
The encrypted value is base64 encoded
The original variable value in the USE script is substituted with the result
The encrypt keyword for that variable is changed to encrypted
The USE script is overwritten on disk in this new form
This process is repeated for all variables preceded by the encrypt keyword.
As a side effect of the encryption process, it is not currently possible to encrypt a value that begins with a space or a tab. This functionality will be implemented in due course.
Once encrypted, a variable can be used just as any other, the only requirement being that the encrypted keyword preceding its declaration is not removed or modified.
To change the value of an encrypted variable, simply replace the declaration altogether and precede the new declaration with encrypt. Upon first execution, the USE script will be updated with an encrypted version of the variable as described above.
Encrypted values can only be used on the system that they were created on. If an encrypted value is moved or copied to a different installation of Exivity then any attempt to reference or decrypt it will result in something other than the original value.
Firstly, create the script as usual, with encrypt
preceding any variables that are to be encrypted:
Secondly, run the script. Prior to execution the script will be automatically modified as shown below:
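For illustration only, a declaration such as this (the variable name and value are arbitrary):

```
encrypt var api_password = MySecretValue
```

might, after the first run, have been rewritten by the pre-processor into something of the form below. The base64 text here is a made-up placeholder, not a real encryption result, and the actual value is installation-specific:

```
encrypted var api_password = dGhpcyBpcyBhIHBsYWNlaG9sZGVy
```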
This article links to detailed descriptions of all the statements supported by USE script.
These descriptions assume knowledge of the USE script basics.
Extraction is the process by which USE (Unified Scriptable Extractor) retrieves data from external locations. The following types of data source are supported:
A USE script is required for USE to operate. Further information can be found via the links below:
An introductory overview of the scripting language:
A reference guide for the USE scripting language:
How to parse XML and JSON data
Template scripts that can be used as starting points for common data sources:
After populating a named buffer with data from an external source such as an HTTP request or a file, it is often necessary to extract fields from it, for uses such as creating subsequent HTTP requests or rendering output CSV files.
This is accomplished using parslets. There are two types of parslet, static and dynamic. In both cases, when a parslet is used in a script it is expanded such that it is replaced with the value it is referencing, just like a variable is.
Static parslets refer to a fixed location in XML or JSON data
Dynamic parslets are used in conjunction with foreach loops to retrieve values when iterating over arrays in XML or JSON data
Parslets can be used to query JSON or XML data. Although JSON is used for illustrative purposes, some additional notes specific to XML can be found further down in this article.
Consider the example JSON shown below:
The object containing all the data (known as the root node) contains the following children:
Objects and arrays can be nested to any depth in JSON. The children of nested objects and arrays are not considered as children of the object containing those objects and arrays, i.e. the children of the heading object are not considered as children of the root object.
Every individual 'thing' in JSON data, regardless of its type is termed a node.
Although different systems return JSON in different forms, the JSON standard dictates that the same basic principles apply universally to all of them. Thus, any valid JSON may contain arrays, objects, strings, boolean values (true or false), numbers and null children.
It is often the case that the number of elements in arrays is not known in advance, therefore a means of iterating over all the elements in an array is required to extract arbitrary data from JSON. This principle also applies to objects, in that an object may contain any number of children of any valid type. Valid types are:
Some systems return JSON in a fixed and predictable format, whereas others may return objects and arrays of varying length and content. The documentation for any given API should indicate which fields are always going to be present and which may or may not be so.
Parslets are the means by which USE locates and extracts fields of interest in any valid JSON data, regardless of the structure. For full details of the JSON data format, please refer to http://json.org
Static parslets act like variables in that the parslet itself is expanded such that the extracted data replaces it. Static parslets extract a single field from the data and require that the location of that field is known in advance.
In the example JSON above, let us assume that the data is held in a named buffer called example and that the title and heading children are guaranteed to be present. Further, the heading object always has the children category and finalised. Note that for all of these guaranteed fields, the value associated with them is indeterminate.
The values associated with these fields can be extracted using a static parslet which is specified using the following syntax:
$JSON{buffer_name}.[node_path]
Static parslets always specify a named buffer in curly braces immediately after the $JSON prefix.
The buffer_name is the name of the buffer containing the JSON data, which must have previously been populated using the buffer statement.
The node_path describes the location and name of the node containing the value we wish to extract. Starting at the root node, the name of each node leading to the required value is specified in square brackets. Each set of square brackets is separated by a dot.
The nodepaths for the fixed nodes described above are therefore as follows:
Putting all the above together, the parslet for locating the category in the heading is therefore:
$JSON{example}.[heading].[category]
When this parslet is used in a USE script, the value associated with the parslet is extracted and the parslet is replaced with this extracted value. For example:
print $JSON{example}.[heading].[category]
will result in the word Documentation being output by the statement, and:
var category = $JSON{example}.[heading].[category]
will create a variable called category with a value of Documentation.
Currently, a parslet must be followed by whitespace in order to be correctly expanded. If you want to embed the value into a longer string, create a variable from a parslet and use that instead:
When using JSON parslets that reference values that may contain whitespace it is sometimes necessary to enclose them in double quotes to prevent the extracted value being treated as multiple words by the script
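For example, to embed the extracted category value (from the example buffer above) into a longer string:

```
# Capture the parslet result in a variable first
var category = $JSON{example}.[heading].[category]

# The variable can then be embedded anywhere in a string
print "The category is ${category}"
```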
It may be required to extract values from a JSON array which contains values that do not have names as shown below:
Extraction of values that do not have names can be accomplished via the use of nested foreach loops in conjunction with an empty nodepath ([]) as follows:
The result of executing the above against the sample data is:
If the anonymous arrays have a known fixed length then it is also possible to simply stream the values out to the CSV without bothering to assign them to variables. Thus, assuming that the elements in the metrics array always have two values, the following would also work:
Which method is used will depend on the nature of the input data. Note that the special variable ${loopname.COUNT} (where loopname is the label of the enclosing foreach loop) is useful in many contexts for applying selective processing to each element in an array or object, as it will be automatically incremented every time the loop iterates. See foreach for more information.
Dynamic parslets are used to extract data from locations in the data that are not known in advance, such as when an array of unknown length is traversed in order to retrieve a value from each element in the array.
A dynamic parslet must be used in conjunction with a foreach loop and takes the following form:
Note the following differences between a static parslet and a dynamic parslet:
A dynamic parslet does not reference a named buffer directly, rather it references the name of a foreach loop
Parentheses are used to surround the name of the foreach loop (as opposed to curly braces)
The nodepath following a dynamic parslet is relative to the target of the foreach loop
The following script fragment will render the elements in the items array (in the example JSON above) to disk as a CSV file.
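A sketch of such a fragment, assuming the elements of the items array each carry id and name fields (these names, the CSV label and the output path are illustrative):

```
csv itemfile = "exported/items.csv"
csv add_headers itemfile id name
csv fix_headers itemfile

# this_item is the loop label; the dynamic parslets below use it
# (in parentheses) as the root for their node paths
foreach $JSON{example}.[items] as this_item {
    csv write_fields itemfile $JSON(this_item).[id] $JSON(this_item).[name]
}

csv close itemfile
```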
In the example above, the first foreach loop iterates over the elements in the items array, and each of the dynamic parslets extracts values from the current element in that loop. The dynamic parslets use the current element, this_item, as the root for their node paths.
If a parslet references a non-existent location in the XML or JSON data then it will resolve to the value EXIVITY_NOT_FOUND.
XML parslets work in exactly the same way that JSON parslets do, apart from the following minor differences:
XML parslets are prefixed $XML
When extracting data from XML, the foreach statement only supports iterating over XML arrays (whereas JSON supports iterating over objects and arrays)
An XML parslet may access an XML attribute
To access an XML attribute, the node_path should end with [@attribute_name], where attribute_name is the name of the attribute to extract. For example, given the following data in a buffer called xmlbuf:
The following script:
will produce the following output:
The foreach statement defines a block of zero or more statements and associates this block with multiple values. The block of statements is executed repeatedly, once for each value.
foreach parslet as loop_label {
}
The opening { may be placed on a line of its own if preferred.
The foreach statement is used to iterate over the values in an array or object (identified by a parslet) within the data in a named buffer.
The loop will execute for as many elements as there are in the array, or for as many members there are in the object. For the purposes of this documentation, the term child will be used to refer to a single array element or object member.
If the array or object is empty, then the body of the loop will be skipped and execution will continue at the statement following the closing }.
The loop_label can be any string, but must not be the same as any other loop_label values in the same scope (i.e. when nesting foreach loops, each loop must have a unique label). This label is used to uniquely identify any given loop level when loops are nested.
The foreach statement will execute the statements in the body of the loop once for every child. foreach loops can be nested, and at each iteration the loop_label can be used to extract values from an array or object in the current child using a dynamic parslet. See the examples at the end of this article for a sample implementation showing this in action.
As the foreach loop iterates over the children, a number of variables are automatically created or updated as follows:
Consider the following JSON in a file called samples/json/array.json:
To generate a list of IDs and names from the items array, the following would be used:
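One hedged possibility, assuming each element of the items array contains id and name fields:

```
buffer arraydata = file "samples/json/array.json"

foreach $JSON{arraydata}.[items] as this_item {
    var id = $JSON(this_item).[id]
    var name = $JSON(this_item).[name]
    print "${id} ${name}"
}
```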
To extract values from an array using nested loops:
Given the source JSON in a file called example.json, the following USE script:
will produce the following output:
The escape statement is used to escape quotes in a variable value or the contents of a named buffer.
escape quotes in varName|{bufferName} [using escape_char]
If a variable value or named buffer contains quotes then it may be desirable to escape them, either for display purposes (to prevent USE from removing them before rendering the data as output) or in order to satisfy the requirements of an external API.
The escape statement will precede all occurrences of the character " with a specified escape character (backslash by default) as shown in the example below. This operation is not just temporary: it will update the actual contents of the variable or named buffer.
The escape statement does not take into account the context of existing quote characters in the data. Running it multiple times against the same data will add an additional escape character each time to each occurrence of a quote.
Given an input file called 'escapeme.txt' containing the following data:
The following script:
will produce the following output:
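A sketch of the idea, loading the file into a buffer and escaping it in place (the choice of a buffer rather than a variable is arbitrary):

```
# Load the raw data, then escape all double quotes in place
buffer rawdata = file "escapeme.txt"
escape quotes in {rawdata}
print {rawdata}
```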
USE scripts are stored in <basedir>/system/config/use, and are ASCII files which can be created with any editor. Both UNIX and Windows end-of-line formats are supported, but in certain circumstances they may be automatically converted to UNIX end-of-line format.
Each statement in a USE script must be contained on a single line. Statements consist of a keyword followed by zero or more parameters separated by whitespace. The language reference contains documentation for each statement.
By default, a space, tab or newline will mark the end of a word in a USE script. To include whitespace in a word (for example to create a variable with a space in it), double quotes - " - or an escape - \ - must be used to prevent the parser from interpreting the space as an end-of-word marker. Unless within double quotes, a literal tab or space character must be escaped by preceding it with a backslash character - \.
Examples:
The following table summarises the behaviour:
Comments in a USE script start with a # character that is either:
the first character of a line
the first character in a word
Comments always end at the end of the line they were started on.
USE scripts often make use of variables. Variables have a name and a value. When a variable name is encountered on any given line during execution of the script, the name is replaced with the value before the line is executed.
To reference a variable, the name should be preceded with ${ and followed by }. For example, to access the value of a variable called username, it should be written as ${username}.
The length (in characters) of a variable can be determined by appending .LENGTH to the variable name when referencing it. Thus if a variable called result has a value of success then ${result.LENGTH} will be replaced with 7.
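The example from the paragraph above, as a snippet:

```
var result = success
# ${result.LENGTH} expands to 7, the length of "success"
print ${result.LENGTH}
```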
Variables may be exposed in the GUI by prefixing their declaration with the word public as follows:
Any variable so marked may be edited using a form in the GUI before the script is executed. If a public variable is followed by a comment on the same line, then the GUI will display that comment for reference. If there is no comment on the same line, then the line before the variable declaration is checked, and if it starts with a comment then this is used. Both variants are shown in the example below:
If a variable declaration has both kinds of comment associated with it then the comment on the same line as the variable declaration will be used
A named buffer (also termed a response buffer) contains data retrieved from an external source, such as an HTTP or ODBC request. Buffers are created with the buffer statement.
Once created, a buffer can be referenced by enclosing its name in { and } as follows:
Buffer names may be up to 31 characters in length
Up to 128 buffers may exist simultaneously
Up to 2Gb of data can be stored in any given buffer (memory permitting)
Parslets are used to extract data from the contents of a named buffer.
The discard statement is used to delete a named buffer.

discard {buffer_name}
The discard statement will delete the named buffer and free the memory used to store its contents. The statement takes immediate effect, and any attempt to reference the buffer afterwards (at least until such time as another buffer with the same name is created) will cause the USE script to log an error and fail.
The exit_loop statement will terminate the current loop.
Either exit_loop or loop_exit may be used. Both variants work identically.
exit_loop
The exit_loop statement will immediately terminate the current loop and script execution will jump to the statement following the } at the end of the current loop.
This can be done even if the exit_loop statement is within one or more constructs inside the loop.
If no loop is in effect then an error will be logged and the script will terminate.
The encode statement is used to base16 or base64 encode the contents of a variable or a named buffer.

encode base16|base64 varName|{buffer_name}
The encode statement will encode the contents of an existing variable or named buffer, replacing those contents with the encoded version.
The result of encoding the contents will increase their length. With base16 encoding the new length will be double the original. With base64 encoding the new length will be greater than the original, but the exact size increase will depend on the contents being encoded.
When encoding a variable, if the size of the result after encoding exceeds the maximum allowable length for a variable value (8095 characters) then the USE script will fail and an error will be returned.
Encoding an empty variable or buffer will produce an empty result
The following script ...
... produces the following output:
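A minimal sketch (the variable name and value are arbitrary; since the letter case of the base16 output is not specified here, no literal expected output is shown):

```
var sample = abc
encode base16 sample
# 'sample' now holds the hex encoding of "abc": three characters
# have become six, doubling the length as described above
print ${sample}
```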
The 'Data pipelines' menu allows an admin of the Exivity solution to manage USE 'Extractors'. USE has its own language reference, which is fully covered in a separate chapter of this documentation.
As described earlier in this documentation, you are free to use your editor of choice to create and modify USE Extractors. However, the GUI also comes with a built-in USE Extractor editor.
To create a new USE Extractor, follow these steps:
From the menu on the left, select "Data pipelines" > 'Extractors'
To create a new USE Extractor from which to pull usage or lookup data, click the 'Add Extractors' button
Provide a meaningful name for your USE Extractor. In the above example we're creating a USE Extractor for VMware vCenter 6.5 and higher, so we call it 'vCenter 6.5'
When you're done creating your USE Extractor, click the 'Insert' at the bottom of the screen
When you want to change or delete an existing USE Extractor, first select it from the list of USE Extractors:
After you select your USE Extractor, you can change its variable values at the 'Variables' tab.
At the "Editor" tab you can make more advanced changes to, or delete, the original USE script, such as:
changing existing API calls
changing csv output format
Don't forget to save any changes with the "SAVE" button.
To test your USE Extractor, you can execute or schedule it directly from the Glass interface:
After you have selected the USE Extractor that you would like to run, click the 'Run' tab next to the 'Editor' tab
Most Extractors require one or more parameters, usually in a date format such as 20171231. In this example, the USE Extractor requires two parameters: a from and to date
When you've provided the required run parameters, click 'Run Now' to execute the USE Extractor. After the USE Extractor has completed running, you will receive a success or failure message, after which you may need to make additional changes to your USE Extractor
Once you're happy with your output, you can schedule the USE Extractor via the 'Schedule' tab, which is located next to the 'Run' tab at the top of the screen.
USE Extractors can be scheduled to run once a day at a specific time. You should also provide a from and (optionally) a to date, which are expressed as offset values. For example, if you want to use the day before yesterday as the from date, use the down-pointing arrows on the right to select a value of -2. If the to date should always correspond to yesterday's date, provide a value of -1 there.
If your USE Extractor requires additional parameters, you may provide these as well in the 'Schedule with these arguments' text field.
When you're done with the schedule configuration, you may click the 'Schedule' button. In case you want to change or remove this schedule afterwards, click the 'Unschedule' button.
As of version 1.6, it is recommended to use the Workflow function instead of the Extractor schedule
The if
statement is used to conditionally execute one or more statements. In conjunction with an optional else
statement it can cause one or other of two blocks of statements to be executed depending on whether an expression is true or false.
if
(expression)
{
} [else {
}]
If the condition evaluates to true, then the first block of statements is executed, and the second block (if present) is skipped over. If the condition evaluates to false then the first block of statements is skipped and the second block (if present) is executed.
The opening {
character at the start of each block may be placed on a line of its own if preferred but the closing }
must be on a line of its own.
Multiple conditions can be used in a single expression and combined with the boolean operators &&
or ||
(for AND and OR respectively) so long as each condition is enclosed in braces. For example:
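As an illustrative sketch (the variables are hypothetical), two conditions combined with &&:

```
if ((${statuscode} == 200) && (${retries} < 5)) {
    print Request OK
} else {
    print Giving up after ${retries} retries
}
```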
Given the source JSON in a file called example.json
, the following USE script:
will produce the following output:
The get_last_day_of
statement sets a variable to contain the number of days in the specified month
get_last_day_of
yyyyMM
as
varName
The get_last_day_of
statement will set the value of the variable called varName to contain the number of days in the month specified by yyyyMM, where yyyy is a four-digit year and MM is a 2-digit month.
The statement will take leap years into account.
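For example, querying a leap-year February (the variable name is illustrative):

```
get_last_day_of 202402 as last_day
print February 2024 has ${last_day} days
# prints: February 2024 has 29 days
```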
While executing a USE script, various messages are written to a logfile. The loglevel
option determines the amount of detail recorded in that logfile.
loglevel
loglevel
The table below shows the valid values for the loglevel argument. Either the numeric level or the label can be specified. If the label is used then it must be specified in CAPITAL LETTERS.
The log levels are cumulative, in that higher log-level values include lower level messages. For example a level of INFO
will cause FATAL
, ERROR
, WARN
and INFO
level messages to be written to the log.
The loglevel
statement takes immediate effect and may be used multiple times within a USE script in order to increase or decrease the logging level at any time.
Currently, comments should not be used on the same line as the statement, as the comment will be considered part of the value to encrypt
Variables may be explicitly declared using the var statement, or may be created automatically as a consequence of actions performed in the script. Additionally, a number of variables are automatically created before a script is executed.
For a list of variables created automatically, please consult the article on the var statement
It may be desirable to conceal the value of some variables (such as passwords) rather than have them represented as plain text in a USE script. This can be accomplished via the encrypt statement.
Please refer to the full article on parslets for more information on their use.
When your Exivity instance has access to the Internet, it will pull in the latest set of Extraction Templates from our Github account. These templates are then presented to you, and you can pick one from the list to start extracting. If you don't have access to the internet, you can download them directly from Github. You are also free to create your own Extractor from scratch.
Statement
Description
Create an AWS4-HMAC-SHA256 signature value
Extract the filename from path + filename string
Create a named buffer
Delete any defined HTTP headers
Create a CSV file
Delete a named buffer
Base16 or base64 encode data
Encrypt a variable
Break out of a loop
Iterate over an array
Set a variable to contain the number of the last day of a specified month
Call a subroutine
Inflate GZIP data
Generate an SHA256 or HMACSHA256 hash
Execute an HTTP request
Conditionally execute statements
Format JSON data
Change the logging level
Execute statements repeatedly
Search using a regular expression
Suspend script execution
Echo text to standard output
Explicitly return from a subroutine
Save a named buffer to disk
Specify a protocol parameter
Define a subroutine
End script execution
Decompress ZIP data in a named buffer
URI (percent) encode a variable
Create or update a variable
| Child | Type |
| --- | --- |
| title | string |
| heading | object |
| items | array |
| Type | Description |
| --- | --- |
| object | A node encompassing zero or more child nodes (termed children) of any type |
| array | A list of children, which may be of any type (but all children in any given array must be of the same type) |
| string | Textual data |
| number | Numeric data, may be integer or floating point |
| boolean | A true or false value |
| null | A null value |
| Nodepath | Referenced value |
| --- | --- |
| .[title] | Example JSON data |
| .[heading].[category] | Documentation |
| .[heading].[finalised] | true |
| Type | Description |
| --- | --- |
| APIs | Typically, usage data is retrieved from the API or APIs provided by the cloud (or clouds) for which reports need to be generated. This is usually a REST API accessed via HTTP/S. |
| Files | A file on the local file-system or on a shared volume. This is usually a CSV, JSON or XML file. |
| Exivity | In some cases it is useful to retrieve information from Exivity itself, such that accounts and usage data that were created historically can be incorporated into the daily processing. |
| Database | Arbitrary SQL queries can be executed against an SQL server either via a direct connection string or via an ODBC DSN. |
| Web | Arbitrary HTTP queries can be invoked in order to retrieve information from any web page accessible from the Exivity server. |
| Label | Meaning |
| --- | --- |
| DEBUGX | Extended debugging information |
| DEBUG | Debugging information |
| INFO | Standard informational messages |
| WARN | Warnings and non-fatal errors |
| ERROR | Run-time errors |
| FATAL | Non-recoverable errors |
| Variable | Value |
| --- | --- |
| loop_label.COUNT | The number of times the loop has executed. If the object or array is empty then this variable will have a value of 0. |
| loop_label.NAME | The name of the current child |
| loop_label.VALUE | The value of the current child |
| loop_label.TYPE | The type of the current child |
Characters | Meaning |
| Anything inside the quotes, except for a newline, is treated as literal text |
| Whether within quotes or not, this is expanded to a double quote - |
| When used outside quotes, this is expanded to a TAB character |
| When used outside quotes, a space following the |
| When used outside quotes, this is expanded to a backslash - |
The pause
statement is used to suspend execution of a USE script for a specified time.
pause
delaytime
The delaytime parameter is the number of milliseconds to wait before continuing. A value of 0 is allowed, in which case no delay will occur.
The pause
statement may be useful in cases where an external data source imposes some form of rate limiting on the number of queries that can be serviced in a given time-frame, or to slow down execution at critical points when debugging a long or complex script.
This example makes use of script parameters which are provided when USE is executed. For more information on script parameters please refer to the Extract introduction.
The generate_jwt
statement is used to generate an RFC 7515-compliant JWT (JSON Web Token) which can be used, for example, for Google Cloud OAuth 2.0 Server to Server Authentication.
generate_jwt key
key component1 [... componentN]
as
result
The generate_jwt
statement performs the following actions:
encodes all components as Base64URL
concatenates all components using a dot separator (.
)
hashes the concatenated result using SHA256
signs the hash with a provided PEM-encoded key using the RSA algorithm
encodes the resulting signature as Base64URL
builds JWT by concatenating the two results using a dot separator (.
)
stores the final result in the variable specified by the result
parameter
The RSA key needs to be in PEM format. PEM format requires the header and footer to be on separate lines so it is important to separate the key contents with ${NEWLINE}
as shown below:
var key = "-----BEGIN PRIVATE KEY-----${NEWLINE}Key-data-goes-here${NEWLINE}-----END PRIVATE KEY-----"
To acquire a Google Cloud OAuth 2.0 access token:
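A hedged sketch of that flow follows. Only the generate_jwt syntax is as specified above; the variables holding the JSON header and claim set, and the token request body, are assumptions based on Google's documented JWT-bearer grant:

```
# jwt_header and jwt_claims are assumed to hold the JSON header and claim set
generate_jwt key ${private_key} ${jwt_header} ${jwt_claims} as assertion

set http_header "Content-Type: application/x-www-form-urlencoded"
set http_body data grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Ajwt-bearer&assertion=${assertion}
http POST https://oauth2.googleapis.com/token
```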
The json
statement is used to format JSON in a named buffer.
json format
{buffername}
In many cases an API or other external source will return JSON in a densely packed format which is not easy for the human eye to read. The json
statement is used to re-format JSON data that has been previously loaded into a named buffer (via the buffer statement) into a form that is friendlier to human eyes.
Given the following single packed line of JSON in a named buffer called myJSON:
The following USE script fragment:
will result in the following output:
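As a minimal sketch, assuming a buffer named myJSON has already been populated with packed JSON:

```
json format {myJSON}
# the buffer now contains the same JSON, indented across multiple lines
print {myJSON}
```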
The return
statement is used to exit a subroutine at an arbitrary point and return to the calling location
return
A subroutine will automatically return to the location it was called from when the end of its body is reached. However it may be desirable to explicitly exit the subroutine at some other point in which case the return
statement is used.
The return
statement cannot be used to return a value to the calling code (this should be done via the use of variables as described in the subroutine statement documentation)
The print
statement is used to display text to standard output while a USE script is executing.
print [-n]
word|{buffer_name} [... word|{buffer_name}]
The print
statement enables user-defined output to be generated during the execution of a USE script. When retrieving data from external sources it may take some time for a lengthy series of operations to complete, so one use of the print
statement is to provide periodic status updates during this time.
The print
statement will process as many arguments as it is given, but at least one argument is required. If the first argument is -n
then no newline will be output after the last argument has been echoed to standard output, else a newline is output after the last argument.
Arguments that are normal words will be sent to standard output followed by a space. Arguments referencing a named buffer will result in the contents of the buffer being displayed.
Note that print
will stop output of data from a named buffer as soon as a NUL
(ASCII value 0) character is encountered
The gosub
keyword is used to run a named subroutine
gosub
subroutineName
(
[argument1, ... argumentN]
)
The argument list may span multiple lines, so long as any given argument is contained on a single line and ends with a comma, eg:
The subroutineName provided to the gosub
statement must be that of a subroutine defined elsewhere in the script using the subroutine statement.
If any argument contains white-space or a comma then it must be quoted:
gosub getfile("directory with spaces/filename.txt")
It is permitted to call a subroutine from within another subroutine, therefore gosub
can be used within the body of a subroutine. This may be done up to 256 levels in depth.
The opening bracket after subroutineName may or may not be preceded with a space:
gosub getfile ("filename.txt")
To call a subroutine with no parameters, use empty brackets:
gosub dosomething()
Please refer to the example in the documentation for the subroutine statement
The subroutine
keyword is used to define a named subroutine
A subroutine is a named section of code that can be executed multiple times on demand from anywhere in the script. When called (via the gosub statement), execution of the script jumps to the start of the specified subroutine. When the end of the code in the subroutine body is reached or a return statement is encountered (whichever comes first), execution resumes at the statement following the most recent gosub statement that was executed.
The code in the body of a subroutine
statement is never executed unless the subroutine is explicitly called using gosub. If a subroutine is encountered during normal linear execution of the script then the code in it will be ignored.
Subroutines in USE do not return any values, but any variables that are set within the subroutine can be accessed from anywhere in the script and as such they should be used for returning values as needed.
When invoked via the gosub statement, arguments can be passed to the subroutine. These arguments are read-only but may be copied to normal variables if required.
Arguments are accessed using the same syntax as is used for variables as follows:
${SUBARG.COUNT}
contains the number of arguments that were passed to the subroutine
${SUBARG_N}
is the value of any given argument, where N
is the number of the argument starting at 1
Every time a subroutine is called, any number of arguments may be passed to it. These arguments are local to the subroutine and will be destroyed when the subroutine returns. However, copying an argument to a standard variable will preserve the original value as follows:
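A sketch of this pattern (the brace placement is assumed to follow the same conventions as the loop and if statements):

```
subroutine get_greeting {
    # copy the read-only argument into a normal variable
    # so that the value survives the return
    var return_value = "Hello ${SUBARG_1}"
}

gosub get_greeting("world")
print ${return_value}
```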
After the subroutine above has been executed the return_value
variable will retain the value it was set to.
It is not permitted to nest subroutine
statements. If used within the body of a subroutine statement, a subroutine
statement will cause the script to terminate with an error.
The following demonstrates using a subroutine to detect when another subroutine has been provided with an incorrect number of arguments:
The save
statement is used to write the contents of a named buffer to disk.
save
{buffer_name}
as
filename
The save
statement will write the contents of a named buffer to filename. As well as providing a means of direct-to-disk downloading this can be useful for retrieving server responses and capturing them for later examination, whether it be for analysis, debugging or audit purposes.
If the destination file already exists then it will be overwritten.
If the filename argument contains a path component, then any directories not present in the path will be created. If the destination path or file cannot be created then an error will be logged and the USE script will fail.
The save
statement is similar in effect to the http_savefile option supported by set, in that data from a server is written to disk. There is one important distinction however:
When set http_savefile has been used to specify a file to save, the next HTTP request will stream data to the file as it is received from the server
When a buffer statement is used to capture the server response, and a subsequent save
statement is used to write it to disk, all the buffered data will be written to the file immediately
The loop
statement executes one or more statements multiple times.
The opening {
may be placed on a line of its own if preferred but the closing }
must be on a line of its own
The loop
statement will loop indefinitely unless one of three exit conditions cause it to stop. These are as follows:
The number of loops specified by the count parameter are completed
At least as many milliseconds as are specified by the timelimit parameter elapse
An exit_loop statement explicitly exits the loop
In all three cases when the loop exits, execution of the script will continue from the first statement after the closing }
marking the end of the loop.
In the event that both count and timelimit parameters are specified, the loop will exit as soon as one or other of the limits have been reached, whichever comes first.
Both the count and timelimit parameters are optional. If omitted, both default to infinite.
The loop
statement will automatically create and update a variable called loop_label.COUNT
which can be referenced to determine how many times the loop has executed (as shown in the example below). This variable is not deleted when the loop exits which means that it is possible to know how many times any given loop executed, even after the loop has exited.
Any specified timelimit value is evaluated at the end of each iteration of the loop, so the actual time before the loop exits is likely to be slightly greater (typically by a few milliseconds) than the specified value. In practice this should be of no consequence.
The loop shown above will result in the following output:
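As an illustrative sketch (assuming the loop label precedes the count), a loop that runs three times and reports its iteration count:

```
loop mainloop 3 {
    print Iteration ${mainloop.COUNT}
}
# the variable survives the loop, so the total is still available here
print Loop executed ${mainloop.COUNT} times
```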
The functionality described in this article is not yet available. This notice will be removed when the appropriate release is made.
The gunzip
statement is used to inflate a GZIP file
gunzip
filename
as
filename
gunzip
{bufferName}
as
filename
The gunzip
statement can be used to extract the contents of a GZIP archive containing a single file. The GZIP archive may be a file on disk or may be the contents of a named buffer.
It is not possible to inflate GZIP data directly in memory, but the same effect can be achieved by extracting GZIP data in a named buffer to disk, and then loading the extracted data back into the named buffer as shown in the example below.
All paths and filenames are treated as relative to the Exivity home directory
The unzip
statement is used to unzip the data in a named buffer.
unzip
{buffer_name}
The unzip
statement will extract a single file from a zip archive stored in a named buffer. In order for this to succeed, the buffer must have been previously populated using the buffer statement, and the data within the buffer must be a valid ZIP file.
Only ZIP files are supported. To extract GZIP files, use gunzip
A warning will be logged, the buffer left intact and the script will continue to execute if any of the following conditions arise:
The buffer is empty or does not contain a valid ZIP archive
The ZIP archive is damaged or otherwise corrupted
More than 1 file is present within the archive
After the unzip
statement completes, the buffer will contain the unzipped data (the original ZIP archive is discarded during this process).
The filename of the unpacked file is also discarded, as the resulting data is stored in the buffer and can subsequently be saved using an explicit filename as shown in the example below.
The terminate
statement will exit the USE script immediately.
terminate [with error]
Normally a USE script will finish execution when an error is encountered or when the end of the script file is reached, whichever comes first.
When the terminate
statement is encountered, the script will finish at that point. No statements after the terminate
statement will be executed.
By default, the script will exit with a success status, however it may be useful to exit deliberately when an error such as an invalid or unexpected response from an HTTP session is detected. Adding the keywords with error
to the statement will cause it to exit with an error status.
The http
statement initiates an HTTP session using any settings previously configured using the set statement. It can also be used for querying response headers.
http
method url
http dump_headers
http get_header
headerName
as
varName
The http
statement performs an HTTP request against the server and resource specified in the url parameter. Any http-related settings previously configured using set will be applied to the request.
The method argument determines the HTTP method to use for the request and must be one of GET
, PUT
, POST
or DELETE
.
The url argument must start with either http:
or https:
. If https:
is used then SSL will be used for the request.
The url argument must also contain a valid IP address or hostname. Optionally, it may also contain a port number (preceded by a colon and appended to the IP address or hostname) and a resource.
The following defaults apply if no port or resource is specified:
The format of the http
statement is identical when used in conjunction with the buffer statement.
To dump a list of all the response headers returned by the server in the most recent session use the statement:
http dump_headers
This will render a list of the headers to standard output, and is useful when implementing and debugging USE scripts. The intention of this statement is to provide a tool to assist in script development, and as such it would normally be removed or suppressed with a debug mode switch in production environments.
To retrieve the value of a specific header, use the statement:
http get_header
headerName
as
varName
This will set the variable varName to be the value of the header headerName.
If headerName was not found in the response, then a warning will be written to the log-file. In this case varName will not be created but if it already exists then its original value will be unmodified.
The following shows the process of retrieving a header. The output of:
Takes the following form:
The set
statement is used to configure a setting for use by a subsequent http or buffer statements.
set
setting value
A protocol such as http offers a number of configuration options. Any given option is either persistent or transient:
The following settings can be configured using set
:
set http_progress yes|no
Persistent. If set to yes then dots will be sent to standard output to indicate that data is downloading while an HTTP session is in progress. When downloading large files, where a lengthy delay with no output may be undesirable, the dots show that the session is still active.
set http_username
username
Persistent. Specifies the username to be used to authenticate the session if the http_authtype
setting is set to anything other than none
. If the username contains any spaces then it should be enclosed in double quotes.
set http_password
password
Persistent. Specifies the password to be used to authenticate the session if the http_authtype
setting is set to anything other than none
. If the password contains any spaces then it should be enclosed in double quotes.
set http_authtype
type
Persistent. Specifies the type of authentication required when initiating a new connection. The type parameter can be any of the following:
set http_authtarget
target
Persistent. Specifies whether any authentication configured using the http_authtype
setting should be performed against a proxy or the hostname specified in the http URL.
Valid values for target are:
server
(default) - authenticate against a hostname directly
proxy
- authenticate against the proxy configured at the Operating System level
set http_header
"name: value"
Persistent. Used to specify a single HTTP header to be included in subsequent HTTP requests. If multiple headers are required, then multiple set http_header
statements should be used.
An HTTP header is a string of the form name: value.
There must be a space between the colon at the end of the name and the value following it, so the header should be enclosed in quotes
Example: set http_header "Accept: application/json"
Headers configured using set http_header
will be used for all subsequent HTTP connections. If a different set of headers is required during the course of a USE script then the clear statement can be used to remove all the configured headers, after which set http_header
can be used to set up the new values.
By default, no headers at all will be included with requests made by the http statement. For some cases this is acceptable, but often one or more headers need to be set in order for a request to be successful.
Typically these will be an Accept:
header for GET requests and an Accept:
and a Content-Type:
header for POST requests. However there is no hard and fast standard so the documentation for any API or other external endpoint that is being queried should be consulted in order to determine the correct headers to use in any specific scenario.
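For example (the endpoint and token variable are illustrative):

```
set http_header "Accept: application/json"
set http_header "Authorization: Bearer ${access_token}"
http GET https://api.example.com/v1/usage
```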
Headers are not verified as sane until the next HTTP connection is made
set http_body data
string
- use the specified string as the body of the request
set http_body file
filename
- send the specified file as the body of the request
set http_body
{named_buffer}
- send the contents of the named buffer as the body of the request
Transient. By default no data other than the headers (if defined) is sent to the server when an HTTP request is made. The http_body
setting is used to specify data that should be sent to the server in the body of the request.
When using http_body
a Content-Length:
header will automatically be generated for the request. After the request this Content-Length:
header is discarded (also automatically). This process does not affect any other defined HTTP headers.
After the request has been made the http_body
setting is re-initialised such that the next request will contain no body unless another set http_body
statement is used.
set http_savefile
filename
Transient. If set, any response returned by the server after the next HTTP request will be saved to the specified filename. This can be used in conjunction with the buffer statement, in which case the response will both be cached in the named buffer and saved to disk.
If no response is received from the next request after using set http_savefile
then the setting will be ignored and no file will be created.
Regardless of whether the server sent a response or not after the HTTP request has completed, the http_savefile
setting is re-initialised such that the next request will not cause the response to be saved unless another set http_savefile
statement is used.
No directories will be created automatically when saving a file, so if there is a pathname component in the specified filename, that path must exist.
set http_savemode
mode
Persistent.
If mode is overwrite
(the default) then if the filename specified by the set http_savefile
statement already exists it will be overwritten if the server returns any response data. If no response data is sent by the server, then the file will remain untouched.
If mode is append
then if the filename specified by the set http_savefile
statement already exists any data returned by the server will be appended to the end of the file.
set http_timeout
seconds
Persistent. After a connection has been made to a server it may take a while for a response to be received, especially on some older or slower APIs. By default, a timeout of 5 minutes (300 seconds) is applied before an error is generated.
This timeout may be increased (or decreased) by specifying a new timeout limit in seconds, for example:
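For example, to wait up to ten minutes for a response:

```
set http_timeout 600
```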
The minimum allowable timeout is 1 second.
set http_retry_count
count
Persistent. Sets the number of retries that will be made in case of transport-level failures, such as an inaccessible server or a name resolution issue. Server responses with non-200 HTTP code are not considered transport-level failures.
By default this option has a value of 1, which means one initial request and one retry. To disable retrying set the value to 0.
set http_retry_delay
milliseconds
Persistent. Sets the pause between retries in milliseconds. The default value is 5000 milliseconds. Used only if http_retry_count is non-zero.
set http_redirect_count
count
Persistent. Sets the maximum number of HTTP redirects to follow. Valid values are in the range 0-32, where 0 disables redirects completely. By default redirects are disabled.
set http_secure yes|no
Persistent. Switches several HTTPS server certificate validation checks on or off, such as:
the certificate is issued by a trusted CA (Certificate Authority), or the certificate chain of trust can be traversed to a trusted CA (the list of trusted CAs is located in the common/certificates/cacert.pem
file within the Exivity home directory)
the server name matches the name in the certificate
Other certificate checks, such as certificate expiration date, cannot be disabled.
Starting from Exivity version 3 this option is switched on by default.
set odbc_connect
connection_string
Persistent. Sets the ODBC connection string for use by the buffer statement's odbc_direct protocol. The connection string may reference an ODBC DSN or contain full connection details, in which case a DSN doesn't need to be created.
A DSN connection string must contain a DSN attribute and optional UID and PWD attributes. A non-DSN connection string must contain a DRIVER attribute, followed by driver-specific attributes.
Please refer to the documentation for the database to which you wish to connect to ensure that the connection string is well formed.
An example connection string for Microsoft SQL Server is:
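The following is an illustrative sketch only; the exact DRIVER name depends on the ODBC driver installed on the Exivity server, and the server, database and credential values are placeholders:

```
set odbc_connect "DRIVER={ODBC Driver 17 for SQL Server};SERVER=dbhost;DATABASE=usagedb;UID=exivity_reader;PWD=s3cret"
```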
The match
statement is used to search either a specified string or the contents of a named buffer using a regular expression.
match
label expression target
The three parameters serve the following purposes:
The label associates a meaningful name to the search. Once the match has been attempted, two variables will be created or updated as follows:
These variables can be checked after the match in order to determine the result status and access the results.
The regular expression must contain one or more characters enclosed in brackets - (
... )
- the contents of which are termed a subgroup. If a successful match is made then the portion of the target text that was matched by the subgroup will be returned in the label.RESULT
variable.
The target determines whether a supplied string or the contents of a named buffer are searched. By default the parameter will be treated as a string.
If the string contains white-space then it must be enclosed in double quotes
If the target argument is surrounded with curly braces - {
... }
- then it is taken to be the name of a buffer and the expression will be applied to the contents of that buffer.
Regular expressions are generally used for searching ASCII data. Searching binary data is possible but may be of limited usefulness.
Search the contents of a variable for the text following the word 'connection:' with or without a capital 'C':
Search a text file previously retrieved from a HTTP request to locate the word 'Error' or 'error'
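Sketches of those two searches (the labels, variable and buffer names are illustrative; results land in the label.RESULT variables as described above):

```
# search a variable; the target is quoted as it may contain white-space
match conn "[Cc]onnection: (\S+)" "${response_line}"
print ${conn.RESULT}

# search a named buffer previously populated via the buffer statement
match err "([Ee]rror)" {webpage}
```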
The uri
statement is used to encode the contents of a variable such that it does not contain any illegal or ambiguous characters when used in an HTTP request.
uri encode
varname
uri component-encode
varname
uri aws-object-encode
varname
As well as uri component-encode
you can use uri encode-component
(the two are identical in operation). Similarly, uri aws-object-encode
and aws-encode-object
are aliases for each other.
When sending a request to an HTTP server it is necessary to encode certain characters such that the server can accurately determine their meaning in context. The encoding involves replacing those characters with a percent symbol - %
- followed by two hexadecimal digits representing the ASCII value of that character.
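The same percent-encoding rule can be demonstrated with Python's standard library (for illustration only; USE performs the equivalent internally):

```python
from urllib.parse import quote

# Each unsafe character becomes '%' plus the two hex digits of its ASCII
# value; a space (0x20) therefore encodes to %20.
print(quote("my file.txt"))      # my%20file.txt
print(quote("a&b=c", safe=""))   # a%26b%3Dc - reserved chars encoded too
```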
Note that the last parameter to the uri statement is a variable name, so to encode the contents of a variable called my_query the correct statement would be uri encode my_query and not uri encode ${my_query} (the latter would only be correct if the value of my_query was the name of the actual variable to encode).
USE script provides the following methods for encoding the contents of a variable:
uri encode varname
This method will encode all characters except for the following:
This is typically used to encode a URI which contains spaces (spaces encode to %20) but doesn't contain any query parameters.
uri encode-component varname
This method will encode all characters except for the following:
This is typically used to encode query components of a URI, such as usernames and other parameters. Note that this method will encode the symbols =, & and ?, and as such a URL of the form:
server.com/resource?name=name_value&domain=domain_value
is usually constructed from its various components using the values of the parameters as shown in the example below.
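That component-by-component construction can be sketched in Python (the sample parameter values are invented for illustration):

```python
from urllib.parse import quote

name_value = "Jane Doe"            # hypothetical parameter values
domain_value = "example.com/unit"

# Encode each component separately so that '=', '&' and '?' inside the
# values cannot be confused with the URL's own delimiters.
url = ("server.com/resource?name=" + quote(name_value, safe="")
       + "&domain=" + quote(domain_value, safe=""))
print(url)  # server.com/resource?name=Jane%20Doe&domain=example.com%2Funit
```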
uri aws-object-encode varname
This method is specifically implemented to support the encoding of object names when downloading from Amazon S3 buckets. Amazon S3 buckets appear much like shared directories, but they do not have a hierarchical filesystem.
The 'files' in buckets are termed objects and to assist in organising the contents of a bucket, object prefixes may be used to logically group objects together.
These prefixes may include the forward slash character, making the resulting object name appear identical to a conventional pathname (an example might be billing_data/20180116_usage.csv). When downloading an object from S3 the object name is provided as part of the HTTP query string.
When referencing an S3 object name there is an explicit requirement not to encode any forward slashes in the object name. USE therefore provides the aws-object-encode
method to ensure that any S3 object names are correctly encoded. This method will encode all characters except for the following:
URI encode every byte. UriEncode() must enforce the following rules:
URI encode every byte except the unreserved characters: 'A'-'Z', 'a'-'z', '0'-'9', '-', '.', '_', and '~'.
The space character is a reserved character and must be encoded as "%20" (and not as "+").
Each URI encoded byte is formed by a '%' and the two-digit hexadecimal value of the byte.
Letters in the hexadecimal value must be uppercase, for example "%1A".
Encode the forward slash character, '/', everywhere except in the object key name. For example, if the object key name is photos/Jan/sample.jpg, the forward slash in the key name is not encoded.
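A minimal Python sketch of an encoder obeying the rules quoted above (the function name is illustrative; uri aws-object-encode applies the same rules with the slash left unencoded for object key names):

```python
def uri_encode(value: str, encode_slash: bool = True) -> str:
    """Percent-encode per the AWS rules above: the unreserved characters
    A-Z a-z 0-9 - . _ ~ pass through, every other byte becomes %XX with
    uppercase hex, and '/' is kept literal only for object key names."""
    unreserved = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                     "abcdefghijklmnopqrstuvwxyz0123456789-._~")
    out = []
    for byte in value.encode("utf-8"):
        ch = chr(byte)
        if ch in unreserved or (ch == "/" and not encode_slash):
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)

# Object key: slashes survive, the space becomes %20 (never '+')
print(uri_encode("billing_data/2018 usage.csv", encode_slash=False))
# billing_data/2018%20usage.csv
```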
The aws-object-encode method is compliant with the above requirements. For most trivial cases it should not be necessary to encode the AWS object name, as it is relatively straightforward to do by hand. However, using uri aws-object-encode to URI-encode the object name may be useful for object names that contain a number of characters not listed above, or for cases where the object name is provided as a parameter to the USE script.
The above script will output:
The var
statement is used to create or update a variable which can subsequently be referenced by name in the USE script.
[public] var name [ = value]
[public] var name operator number
[public] encrypt var name = value
For details on encrypted variables please refer to the article
Variables are created in one of two ways:
Manually via the var
command
Automatically, as a consequence of other statements in the script
If the word public
precedes a variable declaration then the variable will be shown in, and its value can be updated from, the Exivity GUI. Only variables prefixed with the word public
appear in the GUI (all others are only visible in the script itself). To make an automatic variable public, re-declare it with a value of itself as shown below:
A variable is a named value. Once defined, the name can be used in place of the value for the rest of the script. Amongst other things this permits configuration of various parameters at the top of a script, making configuration changes easier.
The = value portion of the statement is optional, but if used there must be white-space on each side of the = character. To include spaces in a variable value, the value should be enclosed in double quotes.
Once a variable has been defined it can be referenced by prefixing its name with ${ and post-fixing it with }. For example a variable called outputFile can be referenced using ${outputFile}. If no value is specified, then the variable will be empty, eg:
will result in the output:
Variable names are case sensitive, therefore ${variableName} and ${VariableName} are different variables.
If there is already a variable called name then the var
statement will update the value.
There is no limit to the number of variables that can be created, but any given variable may not have a value longer than 8095 characters.
Variables that contain a numeric value can have arithmetic operations performed on them in one of two ways.
The first, and recommended way is to use an expression, as demonstrated in the example code below:
When using expressions in this manner after a var statement it is necessary to enclose the expression in parentheses as shown above. Both integer and floating point arithmetic can be performed.
If working with integer arithmetic then one of the operators +=, -=, *=, /= or %= can be used, which will perform addition, subtraction, multiplication, (integer) division or modulo operations respectively.
For example the statement var x += 10 will add 10 to the value of x. Note that both the value in the variable and the value following the operator must be integers.
When performing arithmetic operations on a variable using this second method, any leading zeros in the value of that variable will be respected:
Currently, only integer arithmetic is supported by the +=, -=, *=, /= and %= operators.
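These operators correspond to ordinary integer arithmetic; in Python terms (illustrative, not USE syntax):

```python
x = 7
x += 10              # var x += 10  ->  x is now 17
assert x == 17

# Integer division and modulo, as /= and %= perform on a variable
assert 17 // 5 == 3  # quotient only, any fraction is discarded
assert 17 % 5 == 2   # remainder
```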
Automatic variables are referenced in exactly the same way as manually created ones; the only difference is in the manner of creation.
The following variables are automatically created during the execution of a USE script:
match day "(...)" ${DAY_NAME_UTC}
var short_day = ${day.RESULT}
On occasion it may be useful to determine the length (in characters) of the value of a variable. This can be done by appending the suffix .LENGTH to the variable name when referencing it. For example if a variable called result has a value of success then ${result.LENGTH} will be replaced with 7 (this being the number of characters in the word 'success').
A variable with no value will have a length of 0, therefore using the .LENGTH
suffix can also be used to check for empty variables as follows:
myvar.LENGTH is not a variable in its own right. The .LENGTH suffix merely modifies the manner in which the myvar variable is used.
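The .LENGTH suffix corresponds to an ordinary string-length check; in Python terms (illustrative only):

```python
result = "success"
print(len(result))  # 7 - what ${result.LENGTH} would expand to

empty = ""
# An empty variable has length 0, which is how the .LENGTH suffix can
# be used to test whether a variable has a value at all.
print(len(empty) == 0)  # True
```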
More information may be found in the relevant documentation, where it states:
To derive the short versions of the day and month names, use a match statement to extract the first 3 characters as follows:
| Field | Default |
| --- | --- |
| port | 80 if using http, 443 if using https |
| resource | / |
| Type | Meaning |
| --- | --- |
| Persistent | The setting remains active indefinitely and will be re-used over successive HTTP calls |
| Transient | The setting only applies to a single HTTP call, after which it is automatically reset |
| Value | Meaning |
| --- | --- |
| none (default) | no authentication is required or should be used |
| basic | use basic authentication |
| ntlm | use NTLM authentication |
| digest | use digest authentication |
| Parameter | Value |
| --- | --- |
| label | A unique name to associate with this match |
| expression | The regular expression to apply to the target |
| target | The data to search using the expression |
| Variable | Possible values | Notes |
| --- | --- | --- |
| label.STATUS | MATCH, NOMATCH or ERROR | The result of applying the expression (ERROR indicates an invalid expression) |
| label.RESULT | A string (may be empty) | The text matched by the subgroup in the expression, if any |
| Variable | Details |
| --- | --- |
| The number of parameters passed to the script |
| For each parameter passed to the script a variable called |
| The day of the current local date, padded to 2 digits if necessary |
| The full English name of the current day of the week |
| The day of the current date in UTC, padded to 2 digits if necessary |
| The full English name of the current day of the week in UTC |
| The current local time in 'friendly' format, eg |
| The hour of the current local time, padded to 2 digits if necessary |
| The hour of the current time in UTC, padded to 2 digits if necessary |
| In case of transport-level failures, this variable will contain an error message intended to assist in identifying the issue |
| The minute of the current local time, padded to 2 digits if necessary |
| The minute of the current time in UTC, padded to 2 digits if necessary |
| The month of the current local date, padded to 2 digits if necessary |
| The full English name of the current month of the year |
| The month of the current date in UTC, padded to 2 digits if necessary |
| The full English name of the current month of the year in UTC |
| A newline ( |
| The second of the current local time, padded to 2 digits if necessary |
| The second of the current time in UTC, padded to 2 digits if necessary |
| The milliseconds of the current local time, padded to 3 digits if necessary |
| The milliseconds of the current time in UTC, padded to 3 digits if necessary |
| The filename of the script being executed |
| The current UTC time in YYYYMMDD'T'HHMMSS'Z' format, eg: |
| The year of the current local date as a 4 digit number |
| The year of the current date in UTC as a 4 digit number |
| Current UNIX time (seconds since 1 January 1970 00:00:00 UTC) |
A loop creates this variable (where loop_name is the name of the loop). The value of the variable is updated every time the loop executes, with a value of 1 on the first iteration. If no loops are performed, then the variable will have a value of 0.
When iterating over the children of a JSON object (not an array) using , these variables are updated with the name and value respectively of the current child every time the loop is executed (either may be blank if the child has no name or value respectively)
When iterating over the children of a JSON object (not an array) using , this variable is updated to reflect the type of the current child every time the loop is executed. The type will be one of boolean, number, string, array, object or null.
The HTTP status code returned by the server in response to the most recent request executed. In the case of a transport-level failure this variable contains the value -1, and the HTTP_STATUS_TEXT variable contains an error message.
The hash
statement is used to generate a base-16 or base-64 encoded hash of data stored in a variable or named buffer.
hash sha256 [HMAC [b16|b64] key] target|{target} as result [b16|b64]
hash md5 target|{target} as result [b16|b64]
The hash
statement uses the contents of target as its input and places the final result into result. The SHA256 and MD5 hash algorithms are supported.
If target is surrounded with curly braces like {this}
then it is taken to be the name of a memory buffer and the contents of the buffer will be used as input. Otherwise, it is treated as the name of the variable, the value of which will be hashed.
By default the resulting hash is base-16 encoded and the result placed into the variable specified by the result argument.
result is the name of the variable to put the output into, and not a reference to the contents of that variable. This is why it is not ${result}
If the optional HMAC key arguments are provided when the hash type is sha256 then the secret in key will be used to generate an HMAC-SHA-256 result. The optional b64 or b16 argument following the HMAC option indicates that key is base-64 or base-16 encoded. By default, a clear-text key is assumed.
If the optional b64 argument is used (base64 may also be specified) after the result variable, then the result will be base-64 encoded. The optional b16 argument (base16 may also be used) after the result variable is provided for completeness, but need not be specified as this is the default encoding.
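The underlying operations are standard MD5, SHA-256 and HMAC-SHA-256 digests with base-16 or base-64 output; a Python illustration of the values hash produces (the input data and key are invented for the example):

```python
import base64
import hashlib
import hmac

data = b"hello"  # stands in for the contents of target

# hash md5 target as result          -> base-16 (hex) by default
print(hashlib.md5(data).hexdigest())
# 5d41402abc4b2a76b9719d911017c592

# hash sha256 target as result b64   -> same digest, base-64 encoded
print(base64.b64encode(hashlib.sha256(data).digest()).decode())

# hash sha256 HMAC key target as result  -> HMAC-SHA-256 with a
# clear-text key (b16/b64 after HMAC would describe the key's encoding)
print(hmac.new(b"secret", data, hashlib.sha256).hexdigest())
```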
Running the script:
results in the following output: