Data Preparation¶
Data Marketplace offers you several methods to retrieve data from views in Virtual DataPort:
In the VQL Shell you express your information need using a query in VQL, the language required by Virtual DataPort.
In the Query pill of a view you define the different building blocks of a query in a wizard: its output columns, its filters and its sorting. When all parts are ready, you execute the query to obtain the results.
In the Assisted query pill of a view you define your information need in natural language and an engine based on large language models translates it into a VQL query.
Each method is intended to a different audience according to their knowledge on the view schema or their expertise in VQL or SQL.
The Data preparation pill provides you a new way for querying a view in Virtual DataPort, intended for business users with little knowledge on the view schema or VQL, based on the what you see is what you get principle.
Unlike other methods, this approach begins by retrieving data from the view. Instead of loading the entire data set, a subset is obtained that serves as a representative sample of the view.
To build the query, the user has access to several actions that can be applied to modify the data. Each action is processed the data subset in real time, allowing the immediate visualization of its effect and enabling the user to assess whether it meets the intended objective. Additional actions can then be applied iteratively until the desired result is achieved.
Once all actions have been applied and the final query has been defined, it is ready to be saved or executed on the full data set of the view.
In this section, we will see:
How to load data to start a data preparation session.
How to manipulate live data to build your query.
How to execute your query when it is ready.
Example of a practical scenario or video example
List of a filters and transformations definition
Note
The Data preparation pill is only available if you have privileges to execute the view and you have been
granted with the data_catalog_data_preparation role in Virtual DataPort.
Loading Data¶
Each data preparation session begins in the Columns section. This section displays two panels: Fields and Relationship Fields, which include all the fields available from the current view and its related views.
To define the fields that will be included in the query, drag the desired fields from either panel to the Output Column panel. Once you have selected the required fields, click Next to continue. The system will redirect you to the Definition section, where a preliminary preview of the query results is shown based on the selected output column.
Take into account that the Columns section is always available and you can go back there whenever you want. In case you want to discard your data preparation session and start from the beginning click the Start over button.
Wizard to select the fields that participate in the query¶
Let us see in more detail the available panels in the Columns section.
The Fields panel contains the list of all the fields in the schema of the view in Virtual DataPort. In addition, it
has an extra field called NUMBER_OF_RECORDS, which represents the number of rows in the query results. It is
equivalent to applying the Count transformation to a column in the Definition section.
Note
If there are some fields you are not allowed to use in a query according to your column privileges, they will not be listed here.
Fields of the view available to use for data preparation¶
The icon next to the field name describes the data type of the field. It can take one of the values below:
for the numeric types
int,long,float,doubleanddecimal.for the
texttype.for the date types
date,localdate,time,timestamp,timestamptz,intervalyearmonthandintervaldaysecond.for the
blobtype.for the
booleantype.for the
arraytype.for the
xmltype.
In addition, a field may be decorated with the icon, which means that the field is mandatory. In the
Definition section, you must create a filter on this field to provide a value for it.
The Relationship fields panel contains the list of related views, that is, the views with which the current view has
a to-one association. Notice that related views are labeled as <view name> (<role name>).
Note
It is possible to access deeper levels of associated fields, to configure this option go to Personalization Elements in the Queries pill.
Note
A related view may appear more than once in the list, since the current view can have several to-one associations with the same view. You can use the role name to differentiate each association.
Each related view has a icon next to its name. Click it to unfold the fields of the related view in
Virtual DataPort. Henceforth, we will call them the expanded fields.
Note
If there are some expanded fields you are not allowed to use in a query according to your column privileges in the related view, they will not be listed here.
Expanded fields of the view available to use for data preparation¶
Expanded fields may be mandatory as well. In case the query depends on a related view with a mandatory field, you must create a filter on the expanded field to provide a value for it.
The Output columns panel defines the schema of the query results, that is, which fields and expressions will be part of it and in which order they will appear. Initially, it only consists of fields from the view and its related views. As new columns are created in the Definition section they are automatically added to this panel to reflect the current schema of the query results.
The Output columns panel defines the schema of the query results¶
Notice that in the Output columns panel the expanded fields are labeled as <role name>/<field name>.
Now it is time to explore all the options that are available in the Columns section.
Add Fields to Output Columns¶
There are several ways of adding a field or expanded field to the Output columns panel:
Drag-and-drop a field or expanded field in the list to Output columns.
Drag-and-drop a related view from Relationships fields to Output columns. All its expanded fields will be added.
Select some fields or expanded fields in the list and drag-and-drop them to Output columns.
Select some fields or expanded fields in the list, click the
icon in the panel title and select the Add to output columns option in the menu.
Sort Fields¶
If the schema of the view or its related views contains a great number of fields, it could be difficult to find one specific field. You can use the search bar of each panel to filter fields by name. As an alternative, the field list can be sorted as follows:
Click the
icon in the panel title.
Select the option Sort ascending or Sort descending to sort the fields in alphabetical order.
Select the option Default sort to restore the original order of the fields in the view schema.
Reorder Fields in Output Columns¶
The order of the fields in the Output columns panel determines how they will appear in the query results. To reorder a field in the output schema, you have several options:
Click the
icon next to the field you want to reorder and select the Move up or Move down option. Repeat this process until the field is in the correct place.
Select those fields in Output columns you want to reorder, click the
icon in the panel title and select the Move up or Move down option. Repeat this process until the fields are in the correct place.
Drag-and-drop a field to its final place in the panel.
Select those fields you want to reorder and drag-and-drop them all to their final place.
Remove Fields From Output Columns¶
There are two ways of removing a field or expanded field from the Output columns panel:
Click the
icon next to the field you want to remove and select the Remove option.
Select those fields in Output columns you want to remove, click the
icon in the panel title and select the Remove option.
Preparing Data¶
Once your initial data has been loaded, you can start building your query from the Definition section. There you have a representative subset of the view data that works as a preview of the query results. Apply actions on these data, one at a time, to manipulate them until you are satisfied with the result. Every action is executed in real-time and the query results are updated accordingly, so you can immediately perceive the effect of the action on the data.
Data preparation screen to build queries by modifying real-time data¶
In this section there are three different areas: a table with live query results in the center, the Data preparation panel on the left and the Filters and Transformations panels on the right. Let us see these elements in more detail.
The Data preparation panel contains the actions history, i.e., the list of actions that were applied to build the
current query results in chronological order. Each time an action is applied, a new entry is added on this panel with a
little description of it. If you want to detach this panel, click the button in the panel title
and a floating window will appear with the actions history.
The table in the center shows a preview of the current query results after applying all the actions in the actions history to the view. Depending on your configuration, the data for the preview is obtained directly from the data source or from a sample of it that is locally stored. Either way, the data in the table is updated in real-time with each action.
You can select a column in the table by one of the following methods:
Click the column title.
Click the
icon in the column title and select the Open filters & transformations option.
Selecting a column in the table has two consequences. First, it opens its Filters and Transformations panels,
which show the applicable actions for the column. In addition, the actions related to the column in the actions history
are highlighted, so you can get an idea of the data lineage of the column. For a restricted data lineage of the column
that only considers transformations you can put the mouse over the icon in the column title.
To deselect a column you have several options too:
Click the column title again.
Click the
icon in the column title and select the Close filters & transformations option.
Click the Deselect column button in the toolbar.
The Filters and Transformations panels show a list of actions in alphabetical order to apply to the selected
column. The list of available actions may be different for each column as it depends on the column’s data type. If you
are not sure what an action is for, put the mouse over the icon next to the action name to get an explanation
about how it works. Panels can be expanded or collapsed with the
or
icons. In addition, you can use
the search bar of each panel to filter actions by name.
A filter is a condition on the column values that a row must meet for it to be returned in the query results. Some filters need some parameters to be defined. If that is the case, a dialog will appear asking for them when you select the filter.
Dialog to define the parameters of a filter¶
A transformation is a function that converts the column values. This conversion may modify the data type of the column value too. For instance, the length transformation converts a text in its number of characters. The icon next to the transformation name describes the data type of the value returned by the function. It can take one of the values below:
for the numeric types
int,long,float,doubleanddecimal.for the
texttype.for the date types
date,localdate,time,timestamp,timestamptz,intervalyearmonthandintervaldaysecond.for the
xmltype.for the
aggregationtype.
You can filter the list of transformations by return type. Click the icon in the panel title and select one
of the available return types.
A transformation usually works by converting the values in the column, row by row. It takes the value in the column for
the first row, converts it to a new value, and repeats this process for the rest of rows. However, there is a special
case of transformations that do not follow this pattern: aggregation transformations. An aggregation transformation takes
values in the column from several rows and produces a single value by combining them all. Examples of aggregation
transformations are Average, Count or Maximum. A column that uses an aggregation transformation is called an
aggregation column.
Note
Aggregation columns are decorated with the icon in the Output columns panel of the Columns
section, so you can differentiate them from non-aggregation columns.
Once an aggregation transformation is applied to the data the semantic of the query changes as follows:
The rows of the query results are partitioned in groups. For each distinct value of the non-aggregation columns, a group of rows is generated with all the rows that contain that value. In case all the columns in the query are aggregation columns, then one group is generated with all the rows of the query results.
For each group, aggregation columns are calculated by applying the corresponding aggregation transformation on the rows of the group.
There will be one row in the query results for each group. The row will consist of:
The values of the non-aggregation columns that generated the group.
The values of the aggregation columns calculated for the group.
For each distinct value in the non-aggregation columns, the results contain a row with all the aggregation columns¶
After selecting a transformation from the panel, you have to configure how it is applied in a new dialog. Transformations can have parameters that must be defined. In addition, you can indicate where to store the result of the transformation. By default, it is stored in the same column you are transforming. However, you can optionally create a new column with the new value.
Dialog to configure how a transformation is applied¶
Now it is time to explore all the actions that are available in the Definition section.
Apply a Filter¶
To apply a filter on the values of a column you must follow the steps below:
Select the column to filter by in the query results.
Select a filter in the Filters panel.
Fill in all the information requested in the dialog, in case the filter needs more configuration.
As an alternative, you have this other method to apply a filter on the values of a column:
Click the Add filter button in the toolbar.
Select the Add filter from wizard option.
Search for a column to filter by.
Select a filter from the options in the drop-down menu.
Fill in all the information requested, in case the selected filter needs more configuration.
Dialog to create a new filter¶
In the case of the Is one of and Is not one of filters, it is possible to add up to 1000 constant values at once. This
option can be enabled by activating the Add multiple constant values toggle, which shows an editor where the
values can be added. The Separator selector can be used to specify whether the values will be separated by
commas, semicolons, or newlines. The values can be added by clicking the button.
Dialog to create a new Is one of filter with the option to add multiple values enabled¶
Apply a Filter From Expression¶
In case you need an advanced filter that is not considered among the options in the Filters panel, you can create a filter from a VQL expression following the steps below:
Click the Add filter button in the toolbar.
Select the Add expression filter option.
Enter the VQL expression that represents the condition that the rows must meet to appear in the query results.
Dialog to create a new filter from a VQL expression¶
Take into account that the VQL editor in the dialog provides the following features to help you write an expression:
Highlighting. Each component of an expression is highlighted with a different color: functions, operators, literals, columns, etc.
Autocomplete. Press
Ctrl+Spaceand a menu with suggestions will appear.Validation. Every time you type in the VQL editor, the expression is validated. If the expression is correct, you will see a
icon. If it is incorrect, you will see a
icon. Put the mouse over the icon and a tooltip will explain why it is invalid.
See also
Further information on the capabilities of VQL to write conditions and the syntax a condition must follow can be found in the Virtual DataPort VQL Guide.
Apply Suggested Filters¶
Data Marketplace can suggest popular filters for you based on other users’ activity. To access all suggested filters proceed as follows:
Click the Add filter button in the toolbar.
Select the Suggest option.
If you are only interested in popular filters with a condition on a specific column, then you have to:
Select the column for which you want suggested filters in the query results.
Click the
icon in the title of the Filters panel.
Either way, a new dialog will appear with a list of recommended filters.
Dialog with suggested filters based on other users’ activity¶
This list considers two kind of suggested filters:
Simple filter. A filter that depends on a column currently in the query results and that can be expressed with a condition in the Filters panel. If you click a filter like this, its corresponding dialog will open.
Filter from expression. A filter represented with a VQL expression, since it does not meet the requirements of a simple filter. If you click this kind of filter, the dialog to create a filter from expression will open. Take into account that the VQL expression can depend on parameters represented as
?. These are placeholders that must be replaced with actual values.
Note
Suggestions about filters and derived fields feature is only available with the AI & Denodo Assistant FeaturePack. To find out the subscription bundle you have, open the About dialog of the Data Marketplace.
Apply a Transformation¶
To apply a transformation on the values of a column you must follow the steps below:
Select the column to transform in the query results.
Select a transformation in the Transformations panel.
Fill in all the information requested in the dialog.
Confirm if you want to store the new value in the same column you are transforming or to create a new column with it.
Apply a Transformation From Suggestions¶
Data Marketplace can suggest popular transformations for you based on other users’ activity. To access all suggested transformations proceed as follows:
Click the Add column button in the toolbar.
Select the Suggest option.
If you are only interested in popular transformations on a specific column, then you have to:
Select the column for which you want suggested transformations in the query results.
Click the
icon in the title of the Transformations panel.
Either way, a new dialog will appear with a list of suggested transformations.
Dialog with suggested transformations based on other users’ activity¶
This list considers two kind of suggested transformations:
Simple transformation. A transformation that depends on a column currently in the query results and that can be expressed with a transformation in the Transformations panel. If you click a transformation like this, its corresponding dialog will open.
Transformation from expression. A transformation represented with a VQL expression, since it does not meet the requirements of a simple transformation. If you click this kind of transformation, the dialog to create a new column from an expression will open. Take into account that the VQL expression can depend on parameters represented as
?. These are placeholders that must be replaced with actual values.
Sort by Column¶
The order of the rows in the query results is undefined by default. To establish an order you can sort the rows by the values in a column as follows:
Click the
icon of the column you want to sort by.
Select the Sort A to Z or Sort Z to A option.
For those rows that share the same value in the column you have selected, the order is still undefined. You can repeat this process on another column, adding a second level of sorting, to establish how those rows should be ordered.
Create a New Column¶
In the previous sections we have seen that you can apply a transformation or recommended transformation and store the result in a new column. Let us explore what other options you have to create a new column in the query result.
To create a new column from scratch and give it a value proceed as follows:
Click the Add column button in the toolbar.
Select the Add column from wizard option.
Enter a name for the new column.
Enter a value for the new column, which can be one of the following options:
A constant value. All rows will have the same value. You need to specify the column type.
A column value. The new column will have the same value as the column you choose.
A transformation. First select the column and the transformation. And then fill in all the information requested in the dialog, in case the transformation needs more configuration.
Nothing, it will be empty. All rows will have
NULLas a value. You need to specify the column type.An expression value. Define the value as a VQL expression as complex as you want.
Note
Because the syntax for the expanded fields is
<role_name>/<field>, you cannot use the/as an operator to create a VQL expression. Use the functionDIVinstead.See also
A comprehensive list with all the functions you can use to build a VQL expression can be found in the Virtual DataPort VQL Guide.
Dialog to create a new column from scratch¶
To create a new column with the same value of other column you have the following alternative:
Click the
icon of the column you want to duplicate.
Select the Duplicate column option.
Enter a name.
To create a new column from a VQL expression that depends on other column you have the following alternative:
Click the
icon of the column you want to use in the VQL expression of the new column.
Select the Create new column from expression option.
Enter a name.
Complete the VQL expression for the new column.
To add a field from the view or a related view as a new column follow the steps below:
Go to the Columns section.
Add a field or expanded field to the Output columns panel.
Click the Load data button.
Remove a Column¶
There are several methods to remove a column from the query results:
In the query results, click the
icon of the column you want to remove and select the Remove option.
In the actions history, click the
icon of an action that created a column and select the Remove column option.
In the actions history, select several actions that created a column, click the
icon in the panel title and select the Remove option.
Go to the Columns section, click the
icon of the column you want to remove from the Output columns panel and select the Remove option.
Go to the Columns section, select several columns in the Output columns panel, click the
icon in the panel title and select the Remove option.
Removing a column from the query results has the following effects:
The column is hidden from the query results.
A new action is added to the actions history indicating that the column was removed.
All actions in the actions history that depend on the removed column are disabled to indicate that they are not considered to build the query results.
If you are satisfied with the new query results, you have to confirm the removal of the column for its effects to be definitive. To do that, follow the steps below:
Remove all depending actions on the removed column.
Click the
icon of the removal action.
Select the Remove column option.
In this way the column is completely removed. Take into account that the action that created the column and the action that removed the column will not be visible anymore in the actions history.
Instead, if you changed your mind and want to revert the removal action, then proceed as follows:
Click the
icon of the removal action.
Select the Undo option.
Reorder a Column¶
The order in which the columns appear in the query results is not fixed. You can change it following one of the methods below:
Drag-and-drop the column title until it is placed in the appropriate location.
Click the
icon of the column you want to reorder and select the Move to the left or Move to the right option. Repeat this process until the column is in the correct place.
Go to the Columns section and reorder the columns in the Output columns panel.
Rename a Column¶
To change the name of a column follow the steps below:
Click the
icon of the column you want to rename.
Select the Rename column option.
Enter a new name for the column.
Edit an Action¶
Some actions in the actions history can be edited. For instance, applying a filter or a transformation that depends on some parameters or a VQL expression. To edit an action like this, follow the steps below:
Click the
icon of the action you want to edit.
Select the Edit option.
Enter a new value for the parameter in the dialog that just opened.
Remove an Action¶
We have already seen how to remove an action that creates a column. Any other action in the actions history can be removed too. You have the following alternatives:
Click the
icon of the action you want to remove and select the Remove option.
Select several actions in the actions history, click the
icon in the panel title and select the Remove option.
As a consequence, the actions are removed from the actions history and the query results updated accordingly.
Check the VQL of the Query¶
Even if you do not realize it, Data Marketplace is building a VQL query on the view as a consequence of your data preparation session. At any moment you can check it. Just click the VQL button in the toolbar. If you want to manually edit the VQL query, click the Open in VQL Shell button.
VQL query generated from your data preparation session¶
Save the Query¶
At any moment during your data preparation session you can save the current query for later, which would be useful in the following scenarios:
The query is not finished yet and you want to continue later.
The query is too complex and you will need to execute it often.
You want to share the query with other person in your company.
You want to deploy the query as a view in a Virtual DataPort server.
To save your data preparation session, proceed as follows:
Click the Save button in the toolbar.
Enter a name for the query and, optionally, a description.
Dialog to save a query for later¶
You will find your saved query in the My queries section. When you access the saved query, the name with which it has been saved is displayed on the pill.
Data preparation pill with the name of the saved query¶
Any modification to the query shows an asterisk next to the name, indicating that there are unsaved changes.
Data preparation pill with the modification notice next to the name of the query¶
Note
Take into account that:
No other user will have access to your saved query, unless you share it.
Your saved queries are associated to the Virtual DataPort server you are connected to.
See also
For further information on what actions you can perform with saved queries, check the My Queries section.
Executing the Query¶
During your data preparation session you are manipulating a representative subset of the view data. Once you are
satisfied with the query results in the Definition section you can perform the same manipulations on the entire view
data set. Click the Next button in the toolbar and you will be redirected to the Results section. There you
can check the query results executed on the view, export them to a file using the
Export button in the toolbar or navigate though the data of its related view
clicking the icons in the query results.
Query results executed on the entire view data set¶
Practical Scenario¶
Assume we need to extract the full address for all stores in our organization. Our first step is to review the View Address, as it is expected to contain this information.
Summary pill of the View Address results executed on the entire view data set¶
To verify the data included in the view, go to the Schema pill.
Schema pill in the View Address¶
The schema table shows that the country field is not present in the address view. To obtain this data, we need to execute a query, so we navigate to the Data Preparation pill.
Data preparation pill in the View Address¶
From the Data Preparation pill, select the fields and relationship fields needed to retrieve the full address. Once all fields have been selected, click Add to output columns, and then click Next.
Data Preparation table with the subset data¶
This table shows a data subset that includes the columns selected in the previous step. Because some of these columns provide information that is not required for this scenario, we can safely delete them.
Dropdown menu with the column actions¶
To delete a column, go to and click Remove.
Data preparation table column filters displayed¶
The table now contains only relevant information; however, some columns include incomplete rows. To use only complete rows,
click in the column header and apply the necessary filter. If you are uncertain about which filter to apply,
consult the description available in the .
Data preparation table column transformations displayed¶
The table now contains all the required information and only complete rows. To further complete the Address field with the country and
city values, select the column header and choose the desired transformation. You can check the description of each transformation from the .
Edit transformation popup¶
In this transformation, the address, city, and country fields are merged into a single value, with each part separated by a comma.
Data preparation table with the transformation applied¶
Now that we have the desired value, we can either review the VQL or save the query for future use.
VQL created by data preparation¶
Save query popup¶
Click the Next button to view the query results. In this case, the country and city columns have been removed because they are not necessary for the current scenario.
Data preparation table with the result of applying the query¶
Filters Definition¶
Like: Filters the data to show only the values whose text matches a pattern you type. You can use % to represent any sequence of characters and _ to represent a single character.
Contains: Filters the column values that contain a given text.
Starts with: Filters the column values that start with a given text.
Ends with: Filters the column values that end with a given text.
Matches Regexp: Allows you to use advanced text patterns to filter your data. Helpful for more complex searches. It distinguishes between uppercase and lowercase. Some examples: ^ca.* finds values starting with ca or [0-9]{3} finds values containing three digits in a row.
Matches Regexp case-insensitive: Allows you to use advanced text patterns to filter your data. Helpful for more complex searches. It does not distinguish between uppercase and lowercase. Some examples: ^ca.* finds values starting with ca or [0-9]{3} finds values containing three digits in a row.
Similarity: Finds text that is similar to what you typed. You set how similar it should be (a number from 0 to 1). Higher numbers require a closer match. Some examples: Searching “house” with threshold 0.8 may match hause, houze, but not mouse. Searching “auto” with threshold 0.5 might match auto, autor, alto.
Is missing: Filters the null column values.
Is not missing: Filter the non-null column values.
Is exactly: Filters the column values that are exactly the same as the given value.
Is one of: Filters the column values that are in the list of given values.
Is not one of: Filters the column values that are not included in the list of given values.
Is not: Filters the column values that are different than the given value.
Is greater than: Filters the column values that are greater than a given value.
Is equal or greater than: Filters the column values that are equal or greater than a given value.
Is lower than: Filters the column values that are lower than a given value.
Is equal or lower than: Filters the column values that are equal or lower than a given value.
Between: Filters the column values that are equal or greater than the initial value and are equal or lower than the final value.
Not between: Filters the column values that are lower than the initial value or are greater than the final value.
Transformation Definition¶
Common¶
Duplicate column: Specify the name of the new column to be created as a copy of {field}
Text¶
Convert to text: This transformation converts a column value to text data.
Append and prepend: This transformation concatenates into one string a column value with a text before and a text after.
Concat: This transformation concatenates into one string a column value with another texts or column values.
Split: Splits the text in a column using a separator (for example, a space, a comma, or a dash) and creates a new column for each resulting part.
Hash: Converts the column value into an encoded string using the MD5 algorithm and Base64. For the same input, this transformation always returns the same value.
Index of: Returns the position where a text first appears inside the column value. If the text is not found, it returns -1.
Length: This transformation returns the number of characters in a text column.
Change to lowercase: This transformation converts a text column to lowercase.
Remove accents: This transformation replaces all characters with an accent with the same characters without accent.
Repeat: This transformation repeats a text column value a given number of times.
Replace: This transformation replaces all the occurrences of the specified texts.
Replace with regular expression: Finds all parts of the text that match a regular expression and replaces them with the text you provide. Example: Value: ‘123-456-789’, expression: d{3}, replacement: ‘XXX’, Result: ‘XXX-XXX-XXX’
Position of text: This transformation returns the first position, if any, at which one string occurs within a text column value
Similarity: Finds text that is similar to what you typed. You set how similar it should be (a number from 0 to 1). Higher numbers require a closer match. Some examples: Searching “house” with threshold 0.8 may match hause, houze, but not mouse. Searching “auto” with threshold 0.5 might match auto, autor, alto.
Text fragment: Extracts part of the text from a column. You must provide the starting position (start index). You may optionally provide the ending position (end index). If no end index is provided, the fragment goes from the start index to the end of the text.
Trim: Removes any spaces or line breaks that appear at the beginning or the end of the column value.
Left trim: Removes any spaces or line breaks that appear at the beginning of the column value.
Right trim: Removes any spaces or line breaks that appear at the end of the column value.
Number¶
Absolute value: This transformation returns the absolute value of a numeric column.
Arc cosine: This transformation returns the arc cosine of an angle. The input has to be a number between -1.0 and +1.0.
Arc sine: This transformation returns the arc sine of an angle. The input has to be a number between -1.0 and +1.0.
Arc tangent: This transformation returns the arc tangent of an angle. The input has to be a number between -1.0 and +1.0.
Ceil: This transformation returns the smallest integer not less that the argument.
Cosine: This transformation returns the cosine of an angle in radians. The output is a double value between -1.0 and +1.0.
Cotangent: This transformation returns the cotangent of an angle in radians.
Degrees: This transformation, given an angle in radians, returns the corresponding angle in degrees.
Division: This transformation divides a numeric column value by a divisor. The column value will be the dividend of the operation.
Exponential: This transformation returns the exponential value of a number column.
Floor: This transformation returns the largest integer not greater than the column value.
Natural logarithm: This transformation returns the natural logarithm (base e) of a value.
Logarithm base-10: This transformation returns the logarithm of a number in base-10.
Logarithm base-n: This transformation returns the logarithm of a number in a given base.
Module: This transformation returns the result of the module operation: the remainder of the integer division of the column value and the divisor.
Multiplication: This transformation multiplies a column value by a number.
Power: This transformation returns the result of a number raised to a power.
Round: This transformation returns the column value rounded to the specified number of places to the right or left of the decimal place. If the number of places is omitted, the column value is rounded to 0 places. If the number of places is negative, the column value is rounded to digits left of the decimal point.
Sign: This transformation returns -1, 0 or 1, depending on whether the column value is negative, zero, or positive respectively.
Sine: This transformation returns the sine of an angle in radians. The output is a double value between -1.0 and +1.0.
Square root: This transformation returns a positive square root.
Subtract: This transformation subtracts a number from a column value.
Sum: This transformation adds a numeric quantity to the value of a column.
Tangent: This transformation returns the tangent of an angle in radians.
Truncate: This transformation returns the integer part of a column value.
Date¶
Add days: This transformation returns the datetime column value with its field day rolled up (or down, if the increment is negative) by the amount specified.
Add hours: This transformation returns the datetime column value with its field hour rolled up (or down, if the increment is negative) by the amount specified.
Add minutes: This transformation returns the datetime column value with its field minute rolled up (or down, if the increment is negative) by the amount specified.
Add months: This transformation returns the datetime column value with its field month rolled up (or down, if the increment is negative) by the amount specified.
Add seconds: The amount to increase the field second. If the number is negative, the field is decreased. The type of the expression can be int or long
Add weeks: This transformation returns the datetime column with its field week rolled up (or down, if the increment is negative) by the amount specified. That is, rolled up or down in multiples of 7 days.
Add years: This transformation returns the datetime column value with its field year rolled up (or down, if the increment is negative) by the amount specified.
Convert to timezone: This transformation converts a timestamp column from one timezone to another.
Division: This transformation divides an interval column by a divisor. The column value will be the dividend of the operation.
Format: This transformation returns a string containing a datetime column formatted using the given pattern.
First day of month: This transformation returns the datetime column with the field day rolled down to the first day of the month. If the datetime column value already is the first day of the month, it returns the parameter unchanged.
First day of week: This transformation returns the datetime column with the field day rolled down to the first day of the week. If the datetime column value already is the first day of the month, it returns the parameter unchanged. The first day of the week depends on the locale of the view and also may depend on the underlying database.
Get day: This transformation returns the day field of a given datetime column. The transformation returns a long data-type ranging from 1 to 31.
Get day of year: This transformation returns the number of the day in the year of the datetime column.
Get days between: This transformation returns the number of days between two dates. It returns 0 if both dates represent the same day. It returns a positive number, if the column value is first. It returns a negative number, if the datetime parameter is first.
Get day of week: This transformation returns the number of the day of the week of the datetime column. The first day of the week is 1 and the last day is 7. The first day of the week depends on the locale of the view and also may depend on the underlying database.
Get hour: This transformation returns the hour field of a given datetime column.The transformation returns a long data-type, ranging from 0 (12:00 A.M.) to 23 (11:00 P.M.).
Get microsecond: This transformation returns the microseconds field of a given datetime column.
Get millisecond: This transformation returns the milliseconds field of a given datetime column.
Get minute: This transformation returns the minute field of a given datetime column, returns a value of type long, ranging from 0 to 59.
Get month: This transformation returns the month field of a given datetime column. returns a long data-type, ranging from 1 (January) to 12 (December).
Get months between: This transformation returns the number of months between a datetime column and a given datetime. It returns 0 if both dates represent the same day. It returns a positive number, if the column value is first. It returns a negative number, if the datetime parameter is first.
Get nanosecond: This transformation returns the nanosecond field of a given datetime column.
Get quarter: This transformation returns the quarter of the year of a given datetime column.
Get second: This transformation returns the month field of a given datetime column.
Get time in milliseconds: This transformation returns the number of milliseconds from January 1, 1970, 00:00:00 GMT to the datetime column value.
Get week:This transformation returns the week of the year of a given datetime column. The first week of the year is 1. The first week of the year is that in which at least 4 days are in the year. As a result of this definition, depending on the year the day 1 of the year may be considered to belong to the previous year.
Get year: This transformation returns the year field of a given datetime column.
Last day of month: This transformation returns the datetime column with the field day rolled up to the last day of the week. The last day of the week depends on the locale of the view and also may depend on the underlying database.
Minimum: This transformation returns the minimum value between the datetime column and a given datetime parameter.
Maximum: This transformation returns the maximum value between the datetime column and a given datetime parameter.
Multiplication: This transformation multiplies a column value by a number.
Next week day: This transformation returns the datetime column with its field day rolled up to the day of the week indicated by the given parameter. If the column value already represents the day parameter, the function rolls up the date to the same day of the next week (Sunday= 0, Monday= 1…).
Previous week day: This transformation returns the datetime column with its field day rolled down to the day of the week indicated by the given parameter. If the column value already represents the day parameter, the function rolls down the date to the same day of the previous week (Sunday= 0, Monday= 1…).
Truncate: This transformation returns the column datetime value, truncated to a specific unit of measure. If the pattern is not present, the datetime is truncated to the day. Truncation units: CC SCC: Century, SYYYY YYYY YEAR SYEAR YYY YY Y: Year, IYYY IYY IY I: ISO Year, Q: Quarter, MONTH MON MM RM: Month, WW: Same day of the week as the first day of the year, IW: Same day of the week as the first day of the ISO year, W: Same day of the week as the first day of the month, DDD DD J: Day, DAY DY D: Starting day of the week, HH HH12 HH24: Hour, MI: Minute.
Subtract: This transformation subtracts a datetime value from a datetime typed column. The transformation returns the number of whole days between the date parameter and the column value. When subtracting two values of type time, the function will return the number of milliseconds between the two values.
Subtract interval year-month: This transformation subtracts a interval year-month value from a datetime typed column.
Subtract interval day-second: This transformation subtracts a interval day-second value from a datetime typed column.
Sum interval year-month: This transformation adds a interval year-month value from a datetime typed column.
Sum interval day-second: This transformation adds a interval day-second value from a datetime typed column.
Convert to localdate: This transformation converts a text column value containing a datetime in a specific format, into a value of type localdate.
Convert to time: This transformation converts a text column value containing a datetime in a specific format, into a value of type time.
Convert to timestamp: This transformation converts a text column value containing a datetime in a specific format, into a value of type timestamp.
Convert to timestamptz: This transformation converts a text column value containing a datetime in a specific format, into a value of type timestamptz.
Aggregation¶
Average: This transformation returns the average of the non-null values of the column.
Count: This transformation returns the number of non-null values of a column.
Maximum: This transformation returns the highest value of a column.
Median: This transformation returns the median value of a column.
Minimum: This transformation returns the minimum value of a column.
Standard deviation: This transformation returns the sample standard deviation of the values of a column. This transformation does not take into account null values to calculate the result.
Population standard deviation: This transformation returns the population standard deviation of the values of a column. This transformation does not take into account null values to calculate the result.
Sum of values: This transformation returns the sum of all non-null values of a column.
Variance: This transformation returns the sample variance of the values of a column. This transformation does not take into account null values to calculate the result.
Population variance: This transformation returns the population variance of the values of a column. This transformation does not take into account null values to calculate the result.
Group concat: This transformation returns, for each group, a text with the concatenation of all the column values of each group.
XPath: This transformation returns the nodes from an XML column selected by an XPath expression.
JSONPath: This transformation returns the nodes from an JSON document selected by an JSONPath expression. JSONPath expressions are similar to XPath expressions for XML documents
