where
Overview
The where
statement is used to define a local filter, which restricts the scope of any statements contained within its body.
Syntax
Details
The expressions used by the where
statement take the same form as those used in the if statement.
The information below covers aspects of expression use that are unique to the where
statement.
The where
statement is used to define a local filter that restricts the effect of the statements located in the lines between the opening {
and closing }
.
These braces define the start and end of the body of the where
statement.
The opening
{
must be preceded with white-space may be positioned on the same line as thewhere
statement or may be on a line of its ownThe closing
}
must be located on a line of its own
Only statements that support a local filter may be located with the body. If a non-supported statement is included in the body then a log entry is generated and the task will fail.
The conditional expression associated with the where
statement must reference at least one column name as detailed in the Column names section below
Only rows in the DSET associated with the expression for which the conditional expression is true will be affected by the statements in the body of the where
statement.
Referencing column names
As the purpose of the where
statement is to selectively process rows of data based on the expression, it is a requirement that the condition references one or more column names.
To reference a column in an expression, its name should be specified, enclosed in square brackets [like_this]
.
If the column name contains white-space then it should be quoted within the brackets ["like this"]
. The reason that the quotes must be located within the brackets is that otherwise the brackets (along with their contents) would be treated as a literal string.
To illustrate this in use, consider a DSET called test.data which contains the following:
The following script will create a new column called "flag" and set that new column value to 1 for all rows where the user key is f.
A blank value in a column can be checked for using a pair of quotes with no content - ""
. The following will delete all rows in the DSET Azure.usage
where the quantity column is empty:
It is possible for one or more statements (such as set) within the statement block enclosed by where { ... }
to modify the contents of a column that is being used by the expression.
Doing this may lead to unexpected results as for each statement in the block the expression is re-evaluated for every row in the data being processed, so if set
is used to update the values in a column then the new values will be used for all subsequent evaluations of the expression, which will affect all following statements in the where
block.
If updating a value in any column(s) referenced by the expression, it is recommended to make a copy of the column before the where
statement, using
set copy_of_column as original_column
The values in the copy can then be freely modified, and the original column can be replaced with it after execution of the where
statement has completed.
Regular expressions
Regular expressions are often used to restrict processing to certain rows in a dataset.
For example:
As is the case with the if statement, the entire value in the column must match the expression in order for the row to be processed (there is no implicit .*
at the end of an expression).
Supported statements
The following statements may be used in the body of a where
statement:
The following statements may also be used in the body of a where
statement but are unaffected by the expression (and as a result work exactly the same as if they were used outside the where
statement body)
terminate
Examples
Last updated