USER MANUALS


Data Preparation

Data Marketplace offers you several methods to retrieve data from views in Virtual DataPort:

  • In the VQL Shell you express your information need using a query in VQL, the language required by Virtual DataPort.

  • In the Query pill of a view you define the different building blocks of a query in a wizard: its output columns, its filters and its sorting. When all parts are ready, you execute the query to obtain the results.

  • In the Assisted query pill of a view you define your information need in natural language and an engine based on large language models translates it into a VQL query.

Each method is intended to a different audience according to their knowledge on the view schema or their expertise in VQL or SQL.

The Data preparation pill provides you a new way for querying a view in Virtual DataPort, intended for business users with little knowledge on the view schema or VQL, based on the what you see is what you get principle.

Unlike other methods, this approach begins by retrieving data from the view. Instead of loading the entire data set, a subset is obtained that serves as a representative sample of the view.

To build the query, the user has access to several actions that can be applied to modify the data. Each action is processed the data subset in real time, allowing the immediate visualization of its effect and enabling the user to assess whether it meets the intended objective. Additional actions can then be applied iteratively until the desired result is achieved.

Once all actions have been applied and the final query has been defined, it is ready to be saved or executed on the full data set of the view.

In this section, we will see:

Note

The Data preparation pill is only available if you have privileges to execute the view and you have been granted with the data_catalog_data_preparation role in Virtual DataPort.

Loading Data

Each data preparation session begins in the Columns section. This section displays two panels: Fields and Relationship Fields, which include all the fields available from the current view and its related views.

To define the fields that will be included in the query, drag the desired fields from either panel to the Output Column panel. Once you have selected the required fields, click Next to continue. The system will redirect you to the Definition section, where a preliminary preview of the query results is shown based on the selected output column.

Take into account that the Columns section is always available and you can go back there whenever you want. In case you want to discard your data preparation session and start from the beginning click the Start over button.

Wizard to select the fields that participate in the query

Wizard to select the fields that participate in the query


Let us see in more detail the available panels in the Columns section.

The Fields panel contains the list of all the fields in the schema of the view in Virtual DataPort. In addition, it has an extra field called NUMBER_OF_RECORDS, which represents the number of rows in the query results. It is equivalent to applying the Count transformation to a column in the Definition section.

Note

If there are some fields you are not allowed to use in a query according to your column privileges, they will not be listed here.

Fields of the view available to use for data preparation

Fields of the view available to use for data preparation

The icon next to the field name describes the data type of the field. It can take one of the values below:

  • numeric-type for the numeric types int, long, float, double and decimal.

  • text-type for the text type.

  • date-type for the date types date, localdate, time, timestamp, timestamptz, intervalyearmonth and intervaldaysecond.

  • blob-type for the blob type.

  • boolean-type for the boolean type.

  • array-type for the array type.

  • xml-type for the xml type.

In addition, a field may be decorated with the mandatory icon, which means that the field is mandatory. In the Definition section, you must create a filter on this field to provide a value for it.

The Relationship fields panel contains the list of related views, that is, the views with which the current view has a to-one association. Notice that related views are labeled as <view name> (<role name>).

Note

It is possible to access deeper levels of associated fields, to configure this option go to Personalization Elements in the Queries pill.

Note

A related view may appear more than once in the list, since the current view can have several to-one associations with the same view. You can use the role name to differentiate each association.

Each related view has a expand-view icon next to its name. Click it to unfold the fields of the related view in Virtual DataPort. Henceforth, we will call them the expanded fields.

Note

If there are some expanded fields you are not allowed to use in a query according to your column privileges in the related view, they will not be listed here.

Expanded fields of the view available to use for data preparation

Expanded fields of the view available to use for data preparation

Expanded fields may be mandatory as well. In case the query depends on a related view with a mandatory field, you must create a filter on the expanded field to provide a value for it.

The Output columns panel defines the schema of the query results, that is, which fields and expressions will be part of it and in which order they will appear. Initially, it only consists of fields from the view and its related views. As new columns are created in the Definition section they are automatically added to this panel to reflect the current schema of the query results.

The Output columns panel defines the schema of the query results

The Output columns panel defines the schema of the query results

Notice that in the Output columns panel the expanded fields are labeled as <role name>/<field name>.


Now it is time to explore all the options that are available in the Columns section.

Add Fields to Output Columns

There are several ways of adding a field or expanded field to the Output columns panel:

  • Drag-and-drop a field or expanded field in the list to Output columns.

  • Drag-and-drop a related view from Relationships fields to Output columns. All its expanded fields will be added.

  • Select some fields or expanded fields in the list and drag-and-drop them to Output columns.

  • Select some fields or expanded fields in the list, click the menu-title icon in the panel title and select the Add to output columns option in the menu.

Sort Fields

If the schema of the view or its related views contains a great number of fields, it could be difficult to find one specific field. You can use the search bar of each panel to filter fields by name. As an alternative, the field list can be sorted as follows:

  • Click the menu-title icon in the panel title.

  • Select the option Sort ascending or Sort descending to sort the fields in alphabetical order.

  • Select the option Default sort to restore the original order of the fields in the view schema.

Reorder Fields in Output Columns

The order of the fields in the Output columns panel determines how they will appear in the query results. To reorder a field in the output schema, you have several options:

  • Click the menu-item icon next to the field you want to reorder and select the Move up or Move down option. Repeat this process until the field is in the correct place.

  • Select those fields in Output columns you want to reorder, click the menu-title icon in the panel title and select the Move up or Move down option. Repeat this process until the fields are in the correct place.

  • Drag-and-drop a field to its final place in the panel.

  • Select those fields you want to reorder and drag-and-drop them all to their final place.

Remove Fields From Output Columns

There are two ways of removing a field or expanded field from the Output columns panel:

  • Click the menu-item icon next to the field you want to remove and select the Remove option.

  • Select those fields in Output columns you want to remove, click the menu-title icon in the panel title and select the Remove option.

Preparing Data

Once your initial data has been loaded, you can start building your query from the Definition section. There you have a representative subset of the view data that works as a preview of the query results. Apply actions on these data, one at a time, to manipulate them until you are satisfied with the result. Every action is executed in real-time and the query results are updated accordingly, so you can immediately perceive the effect of the action on the data.

Data preparation screen to build queries by modifying real-time data

Data preparation screen to build queries by modifying real-time data


In this section there are three different areas: a table with live query results in the center, the Data preparation panel on the left and the Filters and Transformations panels on the right. Let us see these elements in more detail.

The Data preparation panel contains the actions history, i.e., the list of actions that were applied to build the current query results in chronological order. Each time an action is applied, a new entry is added on this panel with a little description of it. If you want to detach this panel, click the detach-actions-history button in the panel title and a floating window will appear with the actions history.

The table in the center shows a preview of the current query results after applying all the actions in the actions history to the view. Depending on your configuration, the data for the preview is obtained directly from the data source or from a sample of it that is locally stored. Either way, the data in the table is updated in real-time with each action.

You can select a column in the table by one of the following methods:

  • Click the column title.

  • Click the menu-title icon in the column title and select the Open filters & transformations option.

Selecting a column in the table has two consequences. First, it opens its Filters and Transformations panels, which show the applicable actions for the column. In addition, the actions related to the column in the actions history are highlighted, so you can get an idea of the data lineage of the column. For a restricted data lineage of the column that only considers transformations you can put the mouse over the info icon in the column title.

To deselect a column you have several options too:

  • Click the column title again.

  • Click the menu-title icon in the column title and select the Close filters & transformations option.

  • Click the Deselect column button in the toolbar.

The Filters and Transformations panels show a list of actions in alphabetical order to apply to the selected column. The list of available actions may be different for each column as it depends on the column’s data type. If you are not sure what an action is for, put the mouse over the info icon next to the action name to get an explanation about how it works. Panels can be expanded or collapsed with the expand or collapse icons. In addition, you can use the search bar of each panel to filter actions by name.

A filter is a condition on the column values that a row must meet for it to be returned in the query results. Some filters need some parameters to be defined. If that is the case, a dialog will appear asking for them when you select the filter.

Dialog to define the parameters of a filter

Dialog to define the parameters of a filter

A transformation is a function that converts the column values. This conversion may modify the data type of the column value too. For instance, the length transformation converts a text in its number of characters. The icon next to the transformation name describes the data type of the value returned by the function. It can take one of the values below:

  • numeric-type for the numeric types int, long, float, double and decimal.

  • text-type for the text type.

  • date-type for the date types date, localdate, time, timestamp, timestamptz, intervalyearmonth and intervaldaysecond.

  • xml-type for the xml type.

  • aggregation-type for the aggregation type.

You can filter the list of transformations by return type. Click the menu-title icon in the panel title and select one of the available return types.

A transformation usually works by converting the values in the column, row by row. It takes the value in the column for the first row, converts it to a new value, and repeats this process for the rest of rows. However, there is a special case of transformations that do not follow this pattern: aggregation transformations. An aggregation transformation takes values in the column from several rows and produces a single value by combining them all. Examples of aggregation transformations are Average, Count or Maximum. A column that uses an aggregation transformation is called an aggregation column.

Note

Aggregation columns are decorated with the info icon in the Output columns panel of the Columns section, so you can differentiate them from non-aggregation columns.

Once an aggregation transformation is applied to the data the semantic of the query changes as follows:

  1. The rows of the query results are partitioned in groups. For each distinct value of the non-aggregation columns, a group of rows is generated with all the rows that contain that value. In case all the columns in the query are aggregation columns, then one group is generated with all the rows of the query results.

  2. For each group, aggregation columns are calculated by applying the corresponding aggregation transformation on the rows of the group.

  3. There will be one row in the query results for each group. The row will consist of:

    • The values of the non-aggregation columns that generated the group.

    • The values of the aggregation columns calculated for the group.

Results for a query with aggregation columns

For each distinct value in the non-aggregation columns, the results contain a row with all the aggregation columns

After selecting a transformation from the panel, you have to configure how it is applied in a new dialog. Transformations can have parameters that must be defined. In addition, you can indicate where to store the result of the transformation. By default, it is stored in the same column you are transforming. However, you can optionally create a new column with the new value.

Dialog to configure how a transformation is applied

Dialog to configure how a transformation is applied


Now it is time to explore all the actions that are available in the Definition section.

Apply a Filter

To apply a filter on the values of a column you must follow the steps below:

  1. Select the column to filter by in the query results.

  2. Select a filter in the Filters panel.

  3. Fill in all the information requested in the dialog, in case the filter needs more configuration.

As an alternative, you have this other method to apply a filter on the values of a column:

  1. Click the Add filter button in the toolbar.

  2. Select the Add filter from wizard option.

  3. Search for a column to filter by.

  4. Select a filter from the options in the drop-down menu.

  5. Fill in all the information requested, in case the selected filter needs more configuration.

Dialog to create a new filter

Dialog to create a new filter

In the case of the Is one of and Is not one of filters, it is possible to add up to 1000 constant values at once. This option can be enabled by activating the Add multiple constant values toggle, which shows an editor where the values can be added. The Separator selector can be used to specify whether the values will be separated by commas, semicolons, or newlines. The values can be added by clicking the add-value button.

Dialog to create a new Is one of filter with the option to add multiple values enabled

Dialog to create a new Is one of filter with the option to add multiple values enabled

Apply a Filter From Expression

In case you need an advanced filter that is not considered among the options in the Filters panel, you can create a filter from a VQL expression following the steps below:

  1. Click the Add filter button in the toolbar.

  2. Select the Add expression filter option.

  3. Enter the VQL expression that represents the condition that the rows must meet to appear in the query results.

Dialog to create a new filter from a VQL expression

Dialog to create a new filter from a VQL expression

Take into account that the VQL editor in the dialog provides the following features to help you write an expression:

  • Highlighting. Each component of an expression is highlighted with a different color: functions, operators, literals, columns, etc.

  • Autocomplete. Press Ctrl + Space and a menu with suggestions will appear.

  • Validation. Every time you type in the VQL editor, the expression is validated. If the expression is correct, you will see a valid icon. If it is incorrect, you will see a invalid icon. Put the mouse over the icon and a tooltip will explain why it is invalid.

See also

Further information on the capabilities of VQL to write conditions and the syntax a condition must follow can be found in the Virtual DataPort VQL Guide.

Apply Suggested Filters

Data Marketplace can suggest popular filters for you based on other users’ activity. To access all suggested filters proceed as follows:

  1. Click the Add filter button in the toolbar.

  2. Select the Suggest option.

If you are only interested in popular filters with a condition on a specific column, then you have to:

  1. Select the column for which you want suggested filters in the query results.

  2. Click the denodo-assistant icon in the title of the Filters panel.

Either way, a new dialog will appear with a list of recommended filters.

Dialog with suggested filters based on other users' activity

Dialog with suggested filters based on other users’ activity

This list considers two kind of suggested filters:

  • Simple filter. A filter that depends on a column currently in the query results and that can be expressed with a condition in the Filters panel. If you click a filter like this, its corresponding dialog will open.

  • Filter from expression. A filter represented with a VQL expression, since it does not meet the requirements of a simple filter. If you click this kind of filter, the dialog to create a filter from expression will open. Take into account that the VQL expression can depend on parameters represented as ?. These are placeholders that must be replaced with actual values.

Note

Suggestions about filters and derived fields feature is only available with the AI & Denodo Assistant FeaturePack. To find out the subscription bundle you have, open the About dialog of the Data Marketplace.

Apply a Transformation

To apply a transformation on the values of a column you must follow the steps below:

  1. Select the column to transform in the query results.

  2. Select a transformation in the Transformations panel.

  3. Fill in all the information requested in the dialog.

  4. Confirm if you want to store the new value in the same column you are transforming or to create a new column with it.

Apply a Transformation From Suggestions

Data Marketplace can suggest popular transformations for you based on other users’ activity. To access all suggested transformations proceed as follows:

  1. Click the Add column button in the toolbar.

  2. Select the Suggest option.

If you are only interested in popular transformations on a specific column, then you have to:

  1. Select the column for which you want suggested transformations in the query results.

  2. Click the denodo-assistant icon in the title of the Transformations panel.

Either way, a new dialog will appear with a list of suggested transformations.

Dialog with suggested transformations based on other users' activity

Dialog with suggested transformations based on other users’ activity

This list considers two kind of suggested transformations:

  • Simple transformation. A transformation that depends on a column currently in the query results and that can be expressed with a transformation in the Transformations panel. If you click a transformation like this, its corresponding dialog will open.

  • Transformation from expression. A transformation represented with a VQL expression, since it does not meet the requirements of a simple transformation. If you click this kind of transformation, the dialog to create a new column from an expression will open. Take into account that the VQL expression can depend on parameters represented as ?. These are placeholders that must be replaced with actual values.

Sort by Column

The order of the rows in the query results is undefined by default. To establish an order you can sort the rows by the values in a column as follows:

  1. Click the menu-title icon of the column you want to sort by.

  2. Select the Sort A to Z or Sort Z to A option.

For those rows that share the same value in the column you have selected, the order is still undefined. You can repeat this process on another column, adding a second level of sorting, to establish how those rows should be ordered.

Create a New Column

In the previous sections we have seen that you can apply a transformation or recommended transformation and store the result in a new column. Let us explore what other options you have to create a new column in the query result.

To create a new column from scratch and give it a value proceed as follows:

  1. Click the Add column button in the toolbar.

  2. Select the Add column from wizard option.

  3. Enter a name for the new column.

  4. Enter a value for the new column, which can be one of the following options:

    • A constant value. All rows will have the same value. You need to specify the column type.

    • A column value. The new column will have the same value as the column you choose.

    • A transformation. First select the column and the transformation. And then fill in all the information requested in the dialog, in case the transformation needs more configuration.

    • Nothing, it will be empty. All rows will have NULL as a value. You need to specify the column type.

    • An expression value. Define the value as a VQL expression as complex as you want.

      Note

      Because the syntax for the expanded fields is <role_name>/<field>, you cannot use the / as an operator to create a VQL expression. Use the function DIV instead.

      See also

      A comprehensive list with all the functions you can use to build a VQL expression can be found in the Virtual DataPort VQL Guide.

Dialog to create a new column from scratch

Dialog to create a new column from scratch

To create a new column with the same value of other column you have the following alternative:

  1. Click the menu-title icon of the column you want to duplicate.

  2. Select the Duplicate column option.

  3. Enter a name.

To create a new column from a VQL expression that depends on other column you have the following alternative:

  1. Click the menu-title icon of the column you want to use in the VQL expression of the new column.

  2. Select the Create new column from expression option.

  3. Enter a name.

  4. Complete the VQL expression for the new column.

To add a field from the view or a related view as a new column follow the steps below:

  1. Go to the Columns section.

  2. Add a field or expanded field to the Output columns panel.

  3. Click the Load data button.

Remove a Column

There are several methods to remove a column from the query results:

  • In the query results, click the menu-title icon of the column you want to remove and select the Remove option.

  • In the actions history, click the menu-item icon of an action that created a column and select the Remove column option.

  • In the actions history, select several actions that created a column, click the menu-title icon in the panel title and select the Remove option.

  • Go to the Columns section, click the menu-item icon of the column you want to remove from the Output columns panel and select the Remove option.

  • Go to the Columns section, select several columns in the Output columns panel, click the menu-title icon in the panel title and select the Remove option.

Removing a column from the query results has the following effects:

  • The column is hidden from the query results.

  • A new action is added to the actions history indicating that the column was removed.

  • All actions in the actions history that depend on the removed column are disabled to indicate that they are not considered to build the query results.

If you are satisfied with the new query results, you have to confirm the removal of the column for its effects to be definitive. To do that, follow the steps below:

  • Remove all depending actions on the removed column.

  • Click the menu-item icon of the removal action.

  • Select the Remove column option.

In this way the column is completely removed. Take into account that the action that created the column and the action that removed the column will not be visible anymore in the actions history.

Instead, if you changed your mind and want to revert the removal action, then proceed as follows:

  • Click the menu-item icon of the removal action.

  • Select the Undo option.

Reorder a Column

The order in which the columns appear in the query results is not fixed. You can change it following one of the methods below:

  • Drag-and-drop the column title until it is placed in the appropriate location.

  • Click the menu-title icon of the column you want to reorder and select the Move to the left or Move to the right option. Repeat this process until the column is in the correct place.

  • Go to the Columns section and reorder the columns in the Output columns panel.

Rename a Column

To change the name of a column follow the steps below:

  1. Click the menu-title icon of the column you want to rename.

  2. Select the Rename column option.

  3. Enter a new name for the column.

Edit an Action

Some actions in the actions history can be edited. For instance, applying a filter or a transformation that depends on some parameters or a VQL expression. To edit an action like this, follow the steps below:

  1. Click the menu-item icon of the action you want to edit.

  2. Select the Edit option.

  3. Enter a new value for the parameter in the dialog that just opened.

Remove an Action

We have already seen how to remove an action that creates a column. Any other action in the actions history can be removed too. You have the following alternatives:

  • Click the menu-item icon of the action you want to remove and select the Remove option.

  • Select several actions in the actions history, click the menu-title icon in the panel title and select the Remove option.

As a consequence, the actions are removed from the actions history and the query results updated accordingly.

Check the VQL of the Query

Even if you do not realize it, Data Marketplace is building a VQL query on the view as a consequence of your data preparation session. At any moment you can check it. Just click the VQL button in the toolbar. If you want to manually edit the VQL query, click the Open in VQL Shell button.

VQL query generated from your data preparation session

VQL query generated from your data preparation session

Save the Query

At any moment during your data preparation session you can save the current query for later, which would be useful in the following scenarios:

  • The query is not finished yet and you want to continue later.

  • The query is too complex and you will need to execute it often.

  • You want to share the query with other person in your company.

  • You want to deploy the query as a view in a Virtual DataPort server.

To save your data preparation session, proceed as follows:

  • Click the Save button in the toolbar.

  • Enter a name for the query and, optionally, a description.

Dialog to save a query for later

Dialog to save a query for later

You will find your saved query in the My queries section. When you access the saved query, the name with which it has been saved is displayed on the pill.

Data preparation pill with the name of the saved query

Data preparation pill with the name of the saved query

Any modification to the query shows an asterisk next to the name, indicating that there are unsaved changes.

Data preparation pill with the modification notice next to the name of the query

Data preparation pill with the modification notice next to the name of the query

Note

Take into account that:

  • No other user will have access to your saved query, unless you share it.

  • Your saved queries are associated to the Virtual DataPort server you are connected to.

See also

For further information on what actions you can perform with saved queries, check the My Queries section.

Executing the Query

During your data preparation session you are manipulating a representative subset of the view data. Once you are satisfied with the query results in the Definition section you can perform the same manipulations on the entire view data set. Click the Next button in the toolbar and you will be redirected to the Results section. There you can check the query results executed on the view, export them to a file using the Export button in the toolbar or navigate though the data of its related view clicking the link icons in the query results.

Query results executed on the entire view data set

Query results executed on the entire view data set

Practical Scenario

Assume we need to extract the full address for all stores in our organization. Our first step is to review the View Address, as it is expected to contain this information.

Summary pill of the View Address

Summary pill of the View Address results executed on the entire view data set

To verify the data included in the view, go to the Schema pill.

Schema pill in the View Address

Schema pill in the View Address

The schema table shows that the country field is not present in the address view. To obtain this data, we need to execute a query, so we navigate to the Data Preparation pill.

Data preparation pill in the View Address

Data preparation pill in the View Address

From the Data Preparation pill, select the fields and relationship fields needed to retrieve the full address. Once all fields have been selected, click Add to output columns, and then click Next.

Data preparation table with the subset data

Data Preparation table with the subset data

This table shows a data subset that includes the columns selected in the previous step. Because some of these columns provide information that is not required for this scenario, we can safely delete them.

Dropdown menu in the more option column header

Dropdown menu with the column actions

To delete a column, go to menu-title and click Remove.

Data preparation table column filters displayed

Data preparation table column filters displayed

The table now contains only relevant information; however, some columns include incomplete rows. To use only complete rows, click in the column header and apply the necessary filter. If you are uncertain about which filter to apply, consult the description available in the info.

Data preparation table column transformations displayed

Data preparation table column transformations displayed

The table now contains all the required information and only complete rows. To further complete the Address field with the country and city values, select the column header and choose the desired transformation. You can check the description of each transformation from the info.

Edit transformation popup

Edit transformation popup

In this transformation, the address, city, and country fields are merged into a single value, with each part separated by a comma.

Data preparation table transformation applied

Data preparation table with the transformation applied

Now that we have the desired value, we can either review the VQL or save the query for future use.

VQL popup

VQL created by data preparation

Save query popup

Save query popup

Click the Next button to view the query results. In this case, the country and city columns have been removed because they are not necessary for the current scenario.

Data preparation results table

Data preparation table with the result of applying the query

Filters Definition

  • Like: Filters the data to show only the values whose text matches a pattern you type. You can use % to represent any sequence of characters and _ to represent a single character.

  • Contains: Filters the column values that contain a given text.

  • Starts with: Filters the column values that start with a given text.

  • Ends with: Filters the column values that end with a given text.

  • Matches Regexp: Allows you to use advanced text patterns to filter your data. Helpful for more complex searches. It distinguishes between uppercase and lowercase. Some examples: ^ca.* finds values starting with ca or [0-9]{3} finds values containing three digits in a row.

  • Matches Regexp case-insensitive: Allows you to use advanced text patterns to filter your data. Helpful for more complex searches. It does not distinguish between uppercase and lowercase. Some examples: ^ca.* finds values starting with ca or [0-9]{3} finds values containing three digits in a row.

  • Similarity: Finds text that is similar to what you typed. You set how similar it should be (a number from 0 to 1). Higher numbers require a closer match. Some examples: Searching “house” with threshold 0.8 may match hause, houze, but not mouse. Searching “auto” with threshold 0.5 might match auto, autor, alto.

  • Is missing: Filters the null column values.

  • Is not missing: Filter the non-null column values.

  • Is exactly: Filters the column values that are exactly the same as the given value.

  • Is one of: Filters the column values that are in the list of given values.

  • Is not one of: Filters the column values that are not included in the list of given values.

  • Is not: Filters the column values that are different than the given value.

  • Is greater than: Filters the column values that are greater than a given value.

  • Is equal or greater than: Filters the column values that are equal or greater than a given value.

  • Is lower than: Filters the column values that are lower than a given value.

  • Is equal or lower than: Filters the column values that are equal or lower than a given value.

  • Between: Filters the column values that are equal or greater than the initial value and are equal or lower than the final value.

  • Not between: Filters the column values that are lower than the initial value or are greater than the final value.

Transformation Definition

Common

  • Duplicate column: Specify the name of the new column to be created as a copy of {field}

Text

  • Convert to text: This transformation converts a column value to text data.

  • Append and prepend: This transformation concatenates into one string a column value with a text before and a text after.

  • Concat: This transformation concatenates into one string a column value with another texts or column values.

  • Split: Splits the text in a column using a separator (for example, a space, a comma, or a dash) and creates a new column for each resulting part.

  • Hash: Converts the column value into an encoded string using the MD5 algorithm and Base64. For the same input, this transformation always returns the same value.

  • Index of: Returns the position where a text first appears inside the column value. If the text is not found, it returns -1.

  • Length: This transformation returns the number of characters in a text column.

  • Change to lowercase: This transformation converts a text column to lowercase.

  • Remove accents: This transformation replaces all characters with an accent with the same characters without accent.

  • Repeat: This transformation repeats a text column value a given number of times.

  • Replace: This transformation replaces all the occurrences of the specified texts.

  • Replace with regular expression: Finds all parts of the text that match a regular expression and replaces them with the text you provide. Example: Value: ‘123-456-789’, expression: d{3}, replacement: ‘XXX’, Result: ‘XXX-XXX-XXX’

  • Position of text: This transformation returns the first position, if any, at which one string occurs within a text column value

  • Similarity: Finds text that is similar to what you typed. You set how similar it should be (a number from 0 to 1). Higher numbers require a closer match. Some examples: Searching “house” with threshold 0.8 may match hause, houze, but not mouse. Searching “auto” with threshold 0.5 might match auto, autor, alto.

  • Text fragment: Extracts part of the text from a column. You must provide the starting position (start index). You may optionally provide the ending position (end index). If no end index is provided, the fragment goes from the start index to the end of the text.

  • Trim: Removes any spaces or line breaks that appear at the beginning or the end of the column value.

  • Left trim: Removes any spaces or line breaks that appear at the beginning of the column value.

  • Right trim: Removes any spaces or line breaks that appear at the end of the column value.

Number

  • Absolute value: This transformation returns the absolute value of a numeric column.

  • Arc cosine: This transformation returns the arc cosine of an angle. The input has to be a number between -1.0 and +1.0.

  • Arc sine: This transformation returns the arc sine of an angle. The input has to be a number between -1.0 and +1.0.

  • Arc tangent: This transformation returns the arc tangent of an angle. The input has to be a number between -1.0 and +1.0.

  • Ceil: This transformation returns the smallest integer not less that the argument.

  • Cosine: This transformation returns the cosine of an angle in radians. The output is a double value between -1.0 and +1.0.

  • Cotangent: This transformation returns the cotangent of an angle in radians.

  • Degrees: This transformation, given an angle in radians, returns the corresponding angle in degrees.

  • Division: This transformation divides a numeric column value by a divisor. The column value will be the dividend of the operation.

  • Exponential: This transformation returns the exponential value of a number column.

  • Floor: This transformation returns the largest integer not greater than the column value.

  • Natural logarithm: This transformation returns the natural logarithm (base e) of a value.

  • Logarithm base-10: This transformation returns the logarithm of a number in base-10.

  • Logarithm base-n: This transformation returns the logarithm of a number in a given base.

  • Module: This transformation returns the result of the module operation: the remainder of the integer division of the column value and the divisor.

  • Multiplication: This transformation multiplies a column value by a number.

  • Power: This transformation returns the result of a number raised to a power.

  • Round: This transformation returns the column value rounded to the specified number of places to the right or left of the decimal place. If the number of places is omitted, the column value is rounded to 0 places. If the number of places is negative, the column value is rounded to digits left of the decimal point.

  • Sign: This transformation returns -1, 0 or 1, depending on whether the column value is negative, zero, or positive respectively.

  • Sine: This transformation returns the sine of an angle in radians. The output is a double value between -1.0 and +1.0.

  • Square root: This transformation returns a positive square root.

  • Subtract: This transformation subtracts a number from a column value.

  • Sum: This transformation adds a numeric quantity to the value of a column.

  • Tangent: This transformation returns the tangent of an angle in radians.

  • Truncate: This transformation returns the integer part of a column value.

Date

  • Add days: This transformation returns the datetime column value with its field day rolled up (or down, if the increment is negative) by the amount specified.

  • Add hours: This transformation returns the datetime column value with its field hour rolled up (or down, if the increment is negative) by the amount specified.

  • Add minutes: This transformation returns the datetime column value with its field minute rolled up (or down, if the increment is negative) by the amount specified.

  • Add months: This transformation returns the datetime column value with its field month rolled up (or down, if the increment is negative) by the amount specified.

  • Add seconds: The amount to increase the field second. If the number is negative, the field is decreased. The type of the expression can be int or long

  • Add weeks: This transformation returns the datetime column with its field week rolled up (or down, if the increment is negative) by the amount specified. That is, rolled up or down in multiples of 7 days.

  • Add years: This transformation returns the datetime column value with its field year rolled up (or down, if the increment is negative) by the amount specified.

  • Convert to timezone: This transformation converts a timestamp column from one timezone to another.

  • Division: This transformation divides an interval column by a divisor. The column value will be the dividend of the operation.

  • Format: This transformation returns a string containing a datetime column formatted using the given pattern.

  • First day of month: This transformation returns the datetime column with the field day rolled down to the first day of the month. If the datetime column value already is the first day of the month, it returns the parameter unchanged.

  • First day of week: This transformation returns the datetime column with the field day rolled down to the first day of the week. If the datetime column value already is the first day of the month, it returns the parameter unchanged. The first day of the week depends on the locale of the view and also may depend on the underlying database.

  • Get day: This transformation returns the day field of a given datetime column. The transformation returns a long data-type ranging from 1 to 31.

  • Get day of year: This transformation returns the number of the day in the year of the datetime column.

  • Get days between: This transformation returns the number of days between two dates. It returns 0 if both dates represent the same day. It returns a positive number, if the column value is first. It returns a negative number, if the datetime parameter is first.

  • Get day of week: This transformation returns the number of the day of the week of the datetime column. The first day of the week is 1 and the last day is 7. The first day of the week depends on the locale of the view and also may depend on the underlying database.

  • Get hour: This transformation returns the hour field of a given datetime column.The transformation returns a long data-type, ranging from 0 (12:00 A.M.) to 23 (11:00 P.M.).

  • Get microsecond: This transformation returns the microseconds field of a given datetime column.

  • Get millisecond: This transformation returns the milliseconds field of a given datetime column.

  • Get minute: This transformation returns the minute field of a given datetime column, returns a value of type long, ranging from 0 to 59.

  • Get month: This transformation returns the month field of a given datetime column. returns a long data-type, ranging from 1 (January) to 12 (December).

  • Get months between: This transformation returns the number of months between a datetime column and a given datetime. It returns 0 if both dates represent the same day. It returns a positive number, if the column value is first. It returns a negative number, if the datetime parameter is first.

  • Get nanosecond: This transformation returns the nanosecond field of a given datetime column.

  • Get quarter: This transformation returns the quarter of the year of a given datetime column.

  • Get second: This transformation returns the month field of a given datetime column.

  • Get time in milliseconds: This transformation returns the number of milliseconds from January 1, 1970, 00:00:00 GMT to the datetime column value.

  • Get week:This transformation returns the week of the year of a given datetime column. The first week of the year is 1. The first week of the year is that in which at least 4 days are in the year. As a result of this definition, depending on the year the day 1 of the year may be considered to belong to the previous year.

  • Get year: This transformation returns the year field of a given datetime column.

  • Last day of month: This transformation returns the datetime column with the field day rolled up to the last day of the week. The last day of the week depends on the locale of the view and also may depend on the underlying database.

  • Minimum: This transformation returns the minimum value between the datetime column and a given datetime parameter.

  • Maximum: This transformation returns the maximum value between the datetime column and a given datetime parameter.

  • Multiplication: This transformation multiplies a column value by a number.

  • Next week day: This transformation returns the datetime column with its field day rolled up to the day of the week indicated by the given parameter. If the column value already represents the day parameter, the function rolls up the date to the same day of the next week (Sunday= 0, Monday= 1…).

  • Previous week day: This transformation returns the datetime column with its field day rolled down to the day of the week indicated by the given parameter. If the column value already represents the day parameter, the function rolls down the date to the same day of the previous week (Sunday= 0, Monday= 1…).

  • Truncate: This transformation returns the column datetime value, truncated to a specific unit of measure. If the pattern is not present, the datetime is truncated to the day. Truncation units: CC SCC: Century, SYYYY YYYY YEAR SYEAR YYY YY Y: Year, IYYY IYY IY I: ISO Year, Q: Quarter, MONTH MON MM RM: Month, WW: Same day of the week as the first day of the year, IW: Same day of the week as the first day of the ISO year, W: Same day of the week as the first day of the month, DDD DD J: Day, DAY DY D: Starting day of the week, HH HH12 HH24: Hour, MI: Minute.

  • Subtract: This transformation subtracts a datetime value from a datetime typed column. The transformation returns the number of whole days between the date parameter and the column value. When subtracting two values of type time, the function will return the number of milliseconds between the two values.

  • Subtract interval year-month: This transformation subtracts a interval year-month value from a datetime typed column.

  • Subtract interval day-second: This transformation subtracts a interval day-second value from a datetime typed column.

  • Sum interval year-month: This transformation adds a interval year-month value from a datetime typed column.

  • Sum interval day-second: This transformation adds a interval day-second value from a datetime typed column.

  • Convert to localdate: This transformation converts a text column value containing a datetime in a specific format, into a value of type localdate.

  • Convert to time: This transformation converts a text column value containing a datetime in a specific format, into a value of type time.

  • Convert to timestamp: This transformation converts a text column value containing a datetime in a specific format, into a value of type timestamp.

  • Convert to timestamptz: This transformation converts a text column value containing a datetime in a specific format, into a value of type timestamptz.

Aggregation

  • Average: This transformation returns the average of the non-null values of the column.

  • Count: This transformation returns the number of non-null values of a column.

  • Maximum: This transformation returns the highest value of a column.

  • Median: This transformation returns the median value of a column.

  • Minimum: This transformation returns the minimum value of a column.

  • Standard deviation: This transformation returns the sample standard deviation of the values of a column. This transformation does not take into account null values to calculate the result.

  • Population standard deviation: This transformation returns the population standard deviation of the values of a column. This transformation does not take into account null values to calculate the result.

  • Sum of values: This transformation returns the sum of all non-null values of a column.

  • Variance: This transformation returns the sample variance of the values of a column. This transformation does not take into account null values to calculate the result.

  • Population variance: This transformation returns the population variance of the values of a column. This transformation does not take into account null values to calculate the result.

  • Group concat: This transformation returns, for each group, a text with the concatenation of all the column values of each group.

  • XPath: This transformation returns the nodes from an XML column selected by an XPath expression.

  • JSONPath: This transformation returns the nodes from an JSON document selected by an JSONPath expression. JSONPath expressions are similar to XPath expressions for XML documents

Add feedback