Incremental Loading is the process of gradually loading the transactional data into your data warehouse and staging databases. This allows us to have faster loading times as only the most recent data is loaded. This can be useful in performing scheduled executions in less amount of time by executing the most recent data only. The standard execution process involves truncating all the tables and fields in staging database and data warehouse first, then the complete data is subsequently loaded from the datasource. This process is commonly known as Full Load.
During the Incremental Loading process, Jet Data Manager determines the fields inside different tables to detect the amount of data that will be moved over. Truncation is disabled in order to preserve the original data. This process decreases the execution time as Jet Data Manager only transfers the newly added records from data source into staging and data warehouse tables. The amount of time required to transfer data actually determines whether or not incremental loading should be applied. Normally, the tables on which incremental loading is applied are larger transaction tables containing huge volumes of inventory and general ledger transactions. We will use G/L Entry table from Dynamics NAV as an example in this document, but the concepts are universal and apply to every data source.
Enabling Incremental Loading for Staging Database Tables
- To enable incremental loading for tables in your staging database, first go to the table in the staging database and right-click on its name, then select Table Settings.
- Go to the Data Extraction tab and select Incremental Load.
- Multiple red X denoting error start to appear. To remove this uncheck the box before Truncate valid table before data cleansing at the bottom.
- The icon of the table in the staging database will now have an “I” icon identifying it as an incrementally loaded table.
In order to execute incremental loading properly, a primary key must be defined inside a table. A primary key can be defined after you right-click the field name and select Include in Primary Key. A table can have more than one field as primary key.
- Navigate to the Data tab at the bottom and find the table in the Data Source section. Right-click the table name and select Add Incremental Selection Rule.
In order to identify the records which have been added or changed since the last incremental load, you need to check the box or boxes of the respective fields. These fields are generated by the system and incremented sequentially after the new records are added.
The timestamp field is regarded as the best selection for Dynamics NAV. The DEX_ROW_ID and Modified_Datetime are generally regarded as the best selection for Dynamics GP and Dynamics AX respectively. These fields are automatically incremented when new records are added.
If the table that is set up for Incremental Loading is also a source for lookups on other tables, it is necessary to disable automatic index creation. For more information, please read Disabling Automatic Indexing by Insightsoftware.
- Right click the Business Unit and select Deploy and Execute.
Select Only modified tables and views to deploy and execute modified data instead of complete business unit.
Click the Start button to initiate the first full load of the tables with source based incremental loading now enabled.
Using the Incremental Load Wizard
Jet Data Manager 2017 and higher offers setting up incremental load for any number of tables in a wizard-style interface. This interface reduces the time for setting up incremental loading for multiple tables. To make use of the wizard, right-click on the applicable data source, go to Automate, and select Set Up Incremental Load:
Now you can set up incremental loading for any number of tables, select primary key fields and set our data selection rules easily.
Enabling Incremental Loading for Data Warehouse Tables
- In the staging database, you need to right-click the table that was previously configured to use for incremental loading. Go to Advanced → Show System Control Fields .
You will notice that a series of system generated fields have been added at the bottom of the table.
- You need to bring DW_TimeStamp from staging table into the table in the data warehouse where incremental loading has to be implemented. In this example, we moved the DW_TimeStamp field from the G/L Entry table to our Finance Transactions table.
- In the data warehouse, right-click the table you wish to implement incremental loading for and select Table Settings.
- Go to the Data Extraction tab and select Incremental Load.
- Multiple red X denoting error start to appear. To remove this uncheck the box before Truncate valid table before data cleansing at the bottom.
- Next, right-click the table and select Add Incremental Selection Rule.
Select the DW_TimeStamp field that we previously added to the table.
- After the selection of field for incremental loading, make sure to have a primary key in your table. The primary key will be indicated by a key. If the table does not contain a primary key then we need to create one.
There are two ways. The first one is to identify the primary key for that particular table in the data source, bring that field up into the data warehouse, and then set it up as the primary key. The second way is to select a field that already exists in the table. If there are more than one field that will result in a unique ID, then we should select these fields as a primary key. Check out the article on what Data Selection Rules are and how to create them.
To set a field as primary key in the data warehouse, you can simply right-click on the field and choose Include in Primary Key.
The field will have a little key on it. Then you can Deploy and Execute without an error.
If the table that is set up for Incremental Loading is also a source for lookups on other tables, it is necessary to disable automatic index creation. For more information, please read Disabling Automatic Indexing by Insightsoftware.
- Right-click the data warehouse node and select Deploy and Execute.
Select Only modified tables and views to deploy and execute modified data instead of complete data warehouse.
Click Start.