The staging framework, along with the Task Scheduler, is a set of tools to move and manipulate data all the way from the ERP database up to the production database. The framework exists in the staging database and is made up of advanced SQL code. It allows users to configure the data flow more easily by creating so-called elements and arranging the elements into runs. Each element does a specific operation and currently there are 3 types of operations: automatically generated data merges, stored procedure executions and server side elements executed by the application service.
The Data Flow is in the following layers
ERP -> RAW -> PREP -> INV | DBO
The ERP layer holds data from the customer's ERP system. If data is provided (e.g. through a linked server) straight from the ERP system, the data is copied into tables in this layer.
In the RAW layer we have defined the raw tables preferred for the AGR system. ERP layer data is mapped into the RAW layer. If the customer can provide already mapped data, it should be delivered in the RAW data layout. In the future the ERP connector will also link straight into the RAW layer.
The PREP layer is the start of AGR specific data mapping. Many of the IDs used in AGR are generated here. The PREP layer is intended to be the layer where all AGR modules have access to the same data and ID structure. For example, the Location Financial Planner and Inventory should use the same location_id, even though not all locations would be shared between modules. This might be because you want to include closed locations for the Location Financial Planner in order to get all sales history in the past.
The schema name for the final layer should represent the module the data is used for. Here we have a final version of each table required for the system/module; these tables are then merged over to the prod database. All module specific data mapping and business logic should reside here.
In the Inventory module we have one table that stores information about both internal locations (warehouses, stores) and external locations (vendors). However, ERPs usually store this information in separate tables, so the RAW layer is designed to receive the data that way.
nav.location -> raw.location -> prep.location_and_vendor -> inv.location | dbo.location
nav.vendor -> raw.vendor -> prep.location_and_vendor -> inv.location | dbo.location
If a customer has both the INV module and the FP module, you would use the same PREP layer but map it to both the INV tables and the FP tables:
nav.ItemCategory -> raw.PRODUCT_GROUP -> prep.product_group_column & prep.product_group_details -> Map to INV inv.item_group_columns & inv.item_group_details and FP fp.category_column & fp.category_details
nav.ProductGroup -> raw.PRODUCT_GROUP -> prep.product_group_column & prep.product_group_details -> Map to INV inv.item_group_columns & inv.item_group_details and FP fp.category_column & fp.category_details
The control tables to configure the AGR data flow are stored in the core schema on the STG database.
core.stg_run
Here a stg run is defined; everything that should be run together in one scheduled task should be grouped together in a stg run. Example: a daily job.
name
(nvarchar) - required: The name you give the run, must be unique
description
(nvarchar) - required: Description of the run, shown in the scheduler in the system. You can either hard code the description or enter a translation key that references the translations in the system
force_run
(bit) - not required: If true, all stg elements in the run are executed even if they appear twice, e.g. in separate groups.
run_stg_groups_in_sequence
(bit) - required: If true, the stg elements in each group are all executed before the next group is started. If false, types across groups are executed in the run order defined in the core.stg_element_type table.
include_in_system_health
(bit) - required: 1 if the status of the run should be included in the System Health feature on the Inventory dashboard; 0 otherwise
core.stg_group
In this table we define how we group together elements that are run as part of a stg run. An example of a stg group is inventory_raw, which groups together all merges/procedures that need to be run to populate the RAW layer.
name
(nvarchar) - required: The name you give the group, must be unique
description
(nvarchar) - required: Description of the stg group. You can either hard code the description or enter a translation key that references the translations in the system
system_health_group_id
(int) - not required: If the stg_group belongs to a stg_run that should be displayed in the System Health feature, here you enter the System Health group the elements should belong to.
core.stg_run_group
Here you link a stg_run (i.e. the daily job) to a stg_group.
stg_run_name
(nvarchar) - required: The name of the stg run in the core.stg_run table
stg_group_name
(nvarchar) - required: The name of the stg group in the core.stg_group table
run_order
(int) - required: A number defining in what order the groups are run.
active
(bit) - required: 1 if the group should be run when the stg_run is executed; 0 otherwise.
core.stg_element
In this control table the stg elements are defined. They can either be merge procedures (automatically created in the stg run), fixed procedures or server side elements. You define the merge procedures, i.e. where you want data to go (object) and where it should come from (src_data_object).
id
(int) - required: The id of the stg element, must be unique
type_name
(nvarchar) - required: The name of the stg element type the element belongs to
name
(nvarchar) - required: The name of the stg element, must be unique
description
(nvarchar) - not required: A description of the element
object
(nvarchar) - required: If auto merge, the object is the target table name; if you want to execute a procedure, you list the procedure name here.
operation_mode
(int) - required: 0 - Auto detect if procedure or data merge, 1 - data merge, 2 - execute stored procedure, 3 - element executed by the application service.
operation
(nvarchar) - autopopulated: 'DATA_MERGE_AUTO_GENERATE', 'EXECUTE_PROCEDURE' or 'SERVER_SIDE_EXECUTION'
parameters
(nvarchar) - not required: If executing a procedure directly, you list the input parameters here
src_data_object
(nvarchar) - not required: For an autogenerated merge, list here the source table/view you are merging from
break_on_error
(bit) - required: 1 if the stg run should fail if this element fails; 0 if it should continue
object_full
(nvarchar) - autopopulated: full name of the database object
src_data_object_full
(nvarchar) - autopopulated: full name of the source database object if merge
data_merge_column_values
(nvarchar) - Here it is possible to set values for columns, e.g. if a column does not exist in the source table. Example: column_name='value',column_name_2=0
data_merge_join_columns
(nvarchar) - Comma separated list of what columns to join on; if nothing is specified then the primary key of the table is used
data_merge_compare_columns
(nvarchar) - Comma separated list of what columns to compare in the merge statement generated. If you want to list what columns to omit, place ^ in front of the list. If empty, all columns available in the target table are used.
data_merge_update_columns
(nvarchar) - Comma separated list of what columns are updated in the merge statement generated, if matched and there is a difference in the comparison. If you want to list what columns to omit, place ^ in front of the list. If empty, all columns available in the target table are used.
data_merge_insert_columns
(nvarchar) - Comma separated list of what columns are used in the insert part of the merge statement generated, if not matched and present in source. If you want to list what columns to omit, place ^ in front of the list. If empty, all columns available in the target table are used.
data_merge_not_match_source_values
(nvarchar) - If a row is not present in source, you can add an update to a column here instead of deleting the row in target. E.g. active=0 to set the column active to 0 when the row is missing in source.
data_merge_filter
(nvarchar) - Add a filter here if you don't want to merge the whole table. For example, for historical data you might only want to refresh the last 30 days. This filter is applied to both target and source.
data_merge_source_only_filter
(nvarchar) - Filter only applied to the source table.
data_merge_pre_sql
(nvarchar) - not required: You can add code here that you want to run before the merge
data_merge_post_sql
(nvarchar) - not required: You can add code here that you want to run after the merge
last_run_time
(datetime) - not required: The last time the element was run
last_run_time_successful
(datetime) - not required: The last time the element was run successfully
last_run_event_id
(int) - Reference to the last event that was logged for this element. See more here
core.stg_group_element
Here you link each stg element (merge/procedure/other) to one or more stg groups. Here you also define in what order the elements should run.
stg_group_name
(nvarchar) - required: The name of the stg group in the core.stg_group table
stg_element_name
(nvarchar) - required: The name of the stg element in the core.stg_element table
run_order
(nvarchar) - required: A "number" defining in what order the elements are run within the group. It is possible to add sub-numbers (e.g. 3.2); this is useful when adding an element in between two previous elements without having to shift the order of all following elements. These sub-numbers can go down many layers, e.g. 1.2.2.2.2.2, but each layer number can be max 3 digits (999).
force_run
(bit) - required: If true, the element is always executed even if it is present in another group executed in the same run.
active
(bit) - required: If true, the element will be run when the group is executed. Default value for this column is true
run_serial
(nvarchar) - required: Automatically populated. This serial determines the order of execution of the elements. It is the run_order of the element's type combined with the run_order set in this table
core.stg_element_type
Different stg element types, used to group merge procedures and assign them a run_order.
name
(nvarchar) - required: The name of the type, must be unique
description
(nvarchar) - required: A description of the element type
run_order
(nvarchar) - required: A number defining in what order the element types are run.
core.stg_element_sql
This is not a configuration table but a table that stores a copy of the auto generated merge statements. This can be useful for debugging purposes, e.g. if a merge is failing or is slow. If the merge statement changes, which might happen if you add a new column, a new version of the merge statement is stored, giving you the ability to view all previous versions of the merge statements.
stg_element_name
(nvarchar) - required: The name of the staging element
sql_version
(int) - required: The version of the merge statement. Look for the highest version number for each element to locate the current merge statement
sql_text
(nvarchar) - required: The actual SQL merge statement.
insert_date
(datetime) - required: The date and time the merge was first generated.
The easiest way to run complete staging runs is probably by going to the Task Scheduler in the AGR Website and pressing the play button. This will ensure that all elements are run, including the server side elements. That being said, there are cases where you will need to execute a single element, or a couple of them, manually. Below is a guide on how to do so directly in the database.
It is not possible to run server side elements in SQL; you will need to do that in the Task Scheduler. You can run a single element by using a partial run like so:
EXEC [core].[execute_stg_element] @stg_element_name = 'inv_items'
This procedure has 6 parameters:
@batch_id, @origin_id & @parent_id
: These are the standard logging parameters. They can be ignored when running a single stg element.
@stg_element_name
: The name of the staging element you wish to run, defined in the core.stg_element table.
@stg_group_name
: Used internally when running runs, no need to use it when running a single element.
@debug
: The good old debug bit. Default is false, but if enabled it might print out some helpful information. When you define a new data merge element in the core.stg_element table it is good to use this procedure to test whether the data flows correctly from source to target. If you set the debug flag to 1 you get a printout of the merge statement and a table result that shows which columns are used in the merge statement generation and how.
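For example, to test a newly defined merge element and inspect the generated merge statement (reusing the element name from the example above), you could run:
EXEC [core].[execute_stg_element] @stg_element_name = N'inv_items', @debug = 1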
EXEC [core].[execute_stg_run] @stg_run_name = N'daily_run'
This procedure has the following parameters:
@origin_id & @parent_id
: These are the standard logging parameters. They can be ignored when running manually. The procedure generates a new @batch_id, and therefore it is not an input parameter as in other procedures.
@stg_run_name
: The name of the run you wish to execute. Available runs can be found in the [core].[stg_run] table
@element_type
: Optional - allows you to filter the list down to an element type (layer). Available types can be found in the [core].[stg_element_type] table
@element_list & @group_list
: If you use the above parameter, @stg_run_name, you cannot use @element_list or @group_list. It's an either/or scenario. If you choose to use @element_list or @group_list you can use either or both. These parameters accept a comma separated list of element names or group names.
@debug
: Default is false. If @debug = 1, the query will only print out the stg_elements in the result window, and not run them. This is for example useful to see in which order the stg_elements in a stg_run would run (an example follows the calls below).
EXEC [core].[execute_stg_run] @element_list = N'inv_product,inv_product_article,inv_product_article_ready_to_prod'
EXEC [core].[execute_stg_run] @group_list = N'inventory_run_abc,inventory_data_transfer,inventory_run_orders'
EXEC [core].[execute_stg_run] @element_list = N'inv_product,inv_product_article', @group_list = N'inventory_run_abc,inventory_run_orders'
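To preview which elements would run and in what order, without executing anything, combine any of the calls above with the debug flag:
EXEC [core].[execute_stg_run] @stg_run_name = N'daily_run', @debug = 1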
Notes:
Most setups only connect to a single ERP system, but there are setups that connect to multiple ERP systems. There is support for this in the core.erp_db_info table. The best way to show this is by example: imagine we have a chain of stores in the Nordics and we need to connect to 3 ERP systems, one in Norway, one in Denmark and one in Sweden. The core.erp_db_info table would then hold one row for each of the three ERP connections.
ERP data tables in the staging database are placed in separate schemas. For legacy reasons there is a difference between Nav setups and other ERP setups, as can be seen below:
There is a procedure called tb.create_erp_table (see example 1 below) that can help you create ERP tables, custom or not. We recommend that you use it when adding ERP tables.
In older versions (6.1 and older) consultants had almost a clean slate in the staging database to customize whatever they wanted, which made it almost impossible to update. With the introduction of the new staging framework, which resides almost exclusively in the staging database, we needed to make sure it was in the upgrade path. This meant we needed to create better placeholders for customizations, so keep on reading if you are interested in knowing how we did just that:
To make it short and simple: If you need to customize you should do so in the appropriate _cus schema. This is best explained by showing examples:
If you need to add a new ERP table we have a procedure that will help you with that. Let’s assume we are using Navision and the table we would like to add is Hagar$Additional Sales Data. What you will need to do to get the table over is call the procedure like so:
EXEC tb.create_erp_table @table = N'Additional Sales Data'
Note that you don't need to include the company name prefix (Hagar$); it will be detected using info from the core.erp_db_info table.
The procedure will do the following:
The tb.create_erp_table procedure has some additional optional parameters which you might find useful:
Let's say we want to change the [raw].[v_STOCK_LEVEL] view. This view is used to populate the [raw].[STOCK_LEVEL] table. To customize the view you make a copy of it in the appropriate _cus schema and apply your changes there.
The staging element used to move data from the view to the table will automatically detect the presence of the new customized view. There is no need to change the setup of the element.
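As a rough sketch only (the _cus schema name, here assumed to be raw_cus, and the quantity column are assumptions; in practice you may prefer to copy the full definition of the standard view and edit it), a customized view could be created along these lines:
-- Hypothetical sketch: schema name raw_cus and the quantity column are assumptions
CREATE VIEW [raw_cus].[v_STOCK_LEVEL]
AS
SELECT src.*                          -- keep the standard column set
FROM [raw].[v_STOCK_LEVEL] AS src
WHERE src.quantity >= 0;              -- example customization: filter out negative stock rows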
Before view was created:
After view was created:
Check out the Extra info chapter for more info on standard extra info dataflows
I'm guessing that if you have read the two examples above you can guess how this one goes. Just make a copy of the procedure, place it in the appropriate _cus schema, and the framework will detect its presence.
Should the need arise to add new elements, they should be added to the appropriate _cus schema. When adding lines to the [core].[stg_element] table for those new elements you should use the _cus name for the object. That way you can see that this object is an addition instead of a customization of an already existing object. Example:
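A hypothetical registration could look like the following; the element name, type name, object names, id and run_order below are made up for illustration, and the available type and group names can be found in core.stg_element_type and core.stg_group:
-- Hypothetical example: all names and values are illustrative, adjust to your setup
INSERT INTO [core].[stg_element] (id, type_name, name, description, object, operation_mode, src_data_object, break_on_error)
VALUES (90001, N'raw', N'cus_additional_sales_data', N'Customer specific additional sales data', N'raw_cus.ADDITIONAL_SALES_DATA', 1, N'raw_cus.v_ADDITIONAL_SALES_DATA', 1);
-- Hook the new element into an existing group so it runs as part of a stg run
INSERT INTO [core].[stg_group_element] (stg_group_name, stg_element_name, run_order, force_run, active)
VALUES (N'inventory_raw', N'cus_additional_sales_data', N'999', 0, 1);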
It is not strictly required, but a good idea, to also prefix the name of the staging element with **cus** or some other nice prefix to indicate that this is not a part of the standard setup.
By always making a copy of the original views, tables or procedures we ensure that all customizations are easily traceable, and it also means that we can update the original objects and framework without having to worry that we will mess everything up. There are, of course, pros and cons to this but if we focus on the pros:
To ensure that consultants don’t accidentally modify, create or drop objects in the non-customizable schemas (raw, prep, inv, core, etc.) we introduced the schema locking mechanism. This mechanism will monitor all DDL requests and check if the object being touched is in a locked schema. If so you will get the following error:
The schema locking is implemented with a database trigger (tr_ddl_core_schema_locking) and two tables (core.schema_locking & core.schema_locking_exception_list).
The schema locking is only meant to help you avoid making mistakes when customizing, but we realize that there might come a time when you really need to make some modifications to a schema. When that need arises you simply go into the [core].[schema_locking] table and disable the lock for the appropriate schema, but beware that there is nothing that re-enables the lock, so please remember to enable it again.
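The column layout of [core].[schema_locking] is not covered here, so the following is only a hypothetical sketch of the disable/re-enable pattern (the schema_name and is_locked column names are assumptions):
-- Hypothetical sketch: column names are assumptions, check the table definition in your environment
UPDATE [core].[schema_locking] SET is_locked = 0 WHERE schema_name = N'raw';
-- ... make your changes to the raw schema ...
UPDATE [core].[schema_locking] SET is_locked = 1 WHERE schema_name = N'raw';  -- remember to lock it again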
If there is an object that is frequently altered (possibly an autogenerated object) you can add that object to the [core].[schema_locking_exception_list] table and it will be excluded from the schema locking at all times
Notes: