The bulk load subsystem of Virtual DataPort takes advantage of the FastLoad mechanism of Teradata and its JDBC driver to accelerate loading data into a Teradata database.
This feature is enabled by default in Virtual DataPort. To check that it is enabled, open the data source, click the Read & Write tab and make sure Use bulk data load APIs is selected. Teradata does not require any configuration changes to use FastLoad.
Virtual DataPort uses Teradata FastLoad for:
Creating and refreshing remote tables
Creating and refreshing summaries
At runtime, to load data, Virtual DataPort opens a new connection to Teradata with the parameter TYPE=FASTLOAD. This connection cannot be obtained from the pool of connections of the data source, so the execution engine creates a new one to do the bulk data load.
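As a rough illustration of what such a dedicated connection looks like, the sketch below builds a Teradata JDBC URL that carries TYPE=FASTLOAD. Only that parameter comes from the text above. The helper name, the host and the database are hypothetical, and the URL layout assumes the comma-separated parameter syntax of the Teradata JDBC driver. Virtual DataPort opens this connection internally, so this is not something you need to configure.

```python
def fastload_jdbc_url(host: str, database: str) -> str:
    """Return a Teradata JDBC URL that requests a FastLoad connection.

    Assumption: the driver accepts 'jdbc:teradata://<host>/<PARAM>=<value>,...'.
    """
    # TYPE=FASTLOAD is the parameter mentioned in this section;
    # DATABASE is included only to show how several parameters are combined.
    params = {"DATABASE": database, "TYPE": "FASTLOAD"}
    param_list = ",".join(f"{name}={value}" for name, value in params.items())
    return f"jdbc:teradata://{host}/{param_list}"


# Hypothetical host and database names, for illustration only.
url = fastload_jdbc_url("teradata.example.com", "sales_cache")
print(url)
```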
FastLoad imposes the following requirement: the target table has to be empty. This requirement does not affect data movements, summaries or remote tables because:
When performing a data movement, the execution engine creates a table specifically for this data movement, so the table is empty.
When creating a remote table or a summary, the table is created at that moment. When refreshing a remote table or a summary, the execution engine first “truncates” the table and then inserts the data. “Truncating” a table means deleting all the data in the table.
The requirement that the table be empty may be an issue when loading the cache of a view. In that case, the target table may already contain data, even if you add the parameter 'cache_invalidate' = 'all_rows' to the query that loads the cache. That is because, by default, the existing data remains in the table while the view is being cached, so that queries to this view still return rows while the new data is being loaded (see more about this process in The Full Cache Mode at Runtime).
To ensure FastLoad is always used when loading the cache of a view, add the parameter 'cache_atomic_operation'='false' to the query that loads the cache. With this parameter, the cache engine truncates the table of the cache database before inserting any data into it (the section Caching Very Large Data Sets explains in detail what the parameter “cache_atomic_operation” does).
SELECT *
FROM customer360.total_sales_by_country
CONTEXT (
    'cache_preload' = 'true',
    'cache_invalidate' = 'all_rows',
    'cache_atomic_operation' = 'false',
    'cache_wait_for_load' = 'true',
    'cache_return_query_results' = 'false');
If the query that loads the cache does not have the parameter 'cache_atomic_operation'='false' and the table in Teradata already contains data, Teradata will use the standard protocol to insert data instead of FastLoad.
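The rule described above can be summarized with a small model. This is only a sketch of the documented behavior, not the actual logic of the cache engine. The function name and the return labels are hypothetical, and it assumes that 'cache_atomic_operation' behaves as 'true' when it is omitted, which matches the default behavior described earlier. It ignores other conditions, such as unsupported data types.

```python
def cache_load_protocol(table_has_rows: bool, context: dict) -> str:
    """Model which protocol is used to load the cache table in Teradata."""
    # Assumption: an omitted 'cache_atomic_operation' is treated as 'true'.
    atomic = context.get("cache_atomic_operation", "true").lower() != "false"
    if not atomic:
        # The cache engine truncates the table first, so it is empty
        # and the FastLoad requirement is always met.
        return "FASTLOAD"
    # Atomic load: existing rows stay in the table while new data arrives.
    return "STANDARD" if table_has_rows else "FASTLOAD"


preload = {"cache_preload": "true", "cache_invalidate": "all_rows"}
print(cache_load_protocol(True, preload))
print(cache_load_protocol(True, {**preload, "cache_atomic_operation": "false"}))
```

In this model, the first call falls back to the standard protocol because the table still holds rows, while the second call uses FastLoad because the table is truncated beforehand.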
Supported Data Types by FastLoad
FastLoad is not capable of transferring values of the type intervaldaysecond. If the data to insert into Teradata includes at least one field of this type, Virtual DataPort will not try to use FastLoad.
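A minimal sketch of this check follows, assuming the field types of the data to insert are available as a list of type names. The function name and the sample schemas are hypothetical. The only unsupported type listed is the one named in this section.

```python
# Types that FastLoad cannot transfer, according to this section.
UNSUPPORTED_FASTLOAD_TYPES = {"intervaldaysecond"}


def can_use_fastload(field_types) -> bool:
    """Return False if any field has a type that FastLoad cannot transfer."""
    return not any(t.lower() in UNSUPPORTED_FASTLOAD_TYPES for t in field_types)


# Hypothetical schemas, for illustration only.
print(can_use_fastload(["text", "int", "date"]))
print(can_use_fastload(["text", "intervaldaysecond"]))
```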
Teradata has some limitations when loading data concurrently using FastLoad.
Make sure the primary keys defined in the views are correct. When the table has a primary key, Teradata FastLoad silently discards duplicated rows: it does not insert them and it does not return any errors or warnings. If the data honors the primary key, there cannot be duplicated rows.
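Because FastLoad drops such rows without any warning, it can be useful to verify beforehand that the declared primary key is really unique in the data. The sketch below shows one way to do this over a sample of rows held in memory. The function name, the column names and the sample rows are hypothetical. In practice, a GROUP BY query over the view would serve the same purpose.

```python
from collections import Counter


def duplicated_keys(rows, key_columns):
    """Return the primary key values that appear in more than one row."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample: 'country' is assumed to be the primary key of the view.
sample = [
    {"country": "ES", "total": 10},
    {"country": "FR", "total": 7},
    {"country": "ES", "total": 12},
]
print(duplicated_keys(sample, ["country"]))
```

An empty result means the sampled data honors the primary key. Any key reported here identifies rows that FastLoad would discard silently.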