Online Trainer block

Description

The Online Trainer block retrains a model in real time. It functions in the same way as the Offline Trainer block, except that it retrains the model at the configured retraining frequency. It trains models by using a training data set, which is composed of a set of input data spanning a certain time period and a set of target data spanning the same time period.

This block is only used while the model (blueprint) is in run-time.

Diagram of the online trainer block

Block Type

Rules & Models block

Input port

The Online Trainer block has two input ports: the top port contains the model inputs and the bottom port contains the target field. All Nonlinear model and Linear model blocks are displayed in a list box, from which one must be selected for the specific Online Trainer block. The properties that were used for the Offline Trainer of the selected model block, i.e. the name of the target field and the model input fields, are reused for the Online Trainer. The exception is the training database configuration, which must be re-configured. The block is only runnable once the database configuration has been completed.

Functions performed on tags

The Online Trainer does not modify the values, timestamps or qualities of the target or input data in any way. As the blueprint executes, it accumulates the data into the Optimised DB Sink block as configured in the Online Trainer block. When the retrain frequency criteria are met, the Online Trainer sources the accumulated model training data from the Optimised DB Source block.

The training data set is obtained by extracting the input and target data from the Optimised DB Source block. It is composed of good quality data only, and is used to train the Nonlinear or Linear model. The training data set covers the period from the current time back over a length of history that is configured by the user.
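The selection of the training data set can be illustrated with a minimal Python sketch. It is not the product's implementation: the Sample record, the select_training_set function and the 24-hour history length are hypothetical names and values chosen for illustration, and the accumulated data is assumed to be available as a simple list.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Sample(NamedTuple):
    """Hypothetical record for one accumulated row of data."""
    timestamp: datetime
    inputs: tuple
    target: float
    good_quality: bool


def select_training_set(samples, now, history):
    """Keep good-quality samples with timestamps in [now - history, now]."""
    start = now - history
    return [s for s in samples if s.good_quality and start <= s.timestamp <= now]


now = datetime(2025, 1, 1, 12, 0)
samples = [
    Sample(now - timedelta(hours=30), (1.0,), 2.0, True),   # older than the window
    Sample(now - timedelta(hours=5), (2.0,), 4.1, True),
    Sample(now - timedelta(hours=3), (3.0,), 0.0, False),   # bad quality, dropped
    Sample(now - timedelta(hours=1), (4.0,), 8.2, True),
]

# Assumed history length of 24 hours, standing in for the user-configured period.
training_set = select_training_set(samples, now, timedelta(hours=24))
print(len(training_set))  # prints 2
```

Only the two good-quality samples inside the window remain; the sample that is too old and the bad-quality sample are both excluded.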

After the new model has been trained, it is compared to the existing model. If the new model is better than the old model, the old model is replaced; otherwise the old model is kept.

The comparison criterion is the coefficient of determination (R²): if the R² value of the new model is greater than that of the old model, the old model is replaced. The R² values for both the old and the new model are determined with the same set of data, so that like is compared with like. A model that has been replaced by the Online Trainer only exists while the blueprint is running, because the new model is not saved (persisted) in the blueprint. To enable the latest model to be recreated, the set of data that was used to train it is written to a text file.
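The comparison rule can be sketched in Python as follows. This is an illustration under assumptions, not the product's code: r_squared and choose_model are hypothetical helper names, the models are represented as plain functions, and the standard definition of the coefficient of determination (1 minus the residual sum of squares divided by the total sum of squares) is assumed.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all target values are equal")
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_res / ss_tot


def choose_model(old_model, new_model, inputs, targets):
    """Score both models on the same data; keep the old model unless the new one is better."""
    old_r2 = r_squared(targets, [old_model(x) for x in inputs])
    new_r2 = r_squared(targets, [new_model(x) for x in inputs])
    return new_model if new_r2 > old_r2 else old_model


# Illustrative data and models (made up for this sketch).
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [2.1, 3.9, 6.2, 7.8]


def old_model(x):
    return 1.5 * x


def new_model(x):
    return 2.0 * x


active_model = choose_model(old_model, new_model, inputs, targets)
print(active_model is new_model)  # prints True
```

Because both R² values are computed from the same inputs and targets, the strict greater-than test means the old model is kept when the two values are equal.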

Terminology

  • Accumulated data - the blueprint first runs the source data through the blueprint up to the specific model block, so that all the appropriate model input fields are cleaned, filtered and calculated if necessary. It then accumulates this data to a database before running it through the model during training.


CSense 2023 - Last updated: June 24, 2025