Introduction
TRAC is a universal model orchestration solution that combines your existing data and compute infrastructure,
model development environments and repositories of versioned code to create a single ecosystem in
which to build and deploy models, orchestrate complex workflows and run analytics.
The platform is built around three key principles, selected to break the trade-off that has traditionally
been forced between flexible (but uncontrolled) analytics solutions and highly controlled (but
inflexible) production platforms.
SUFFICIENT
The same infrastructure, tools and business assets support both production and experimental model runs, as well as post-run analytics. TRAC therefore supports all possible uses of a model and no other deployment environments are required.

INCORRUPTIBLE
The platform’s design makes it impossible to accidentally damage or destroy deployed data, models or flows. Model developers and users can therefore self-serve with confidence, free from the constraints of traditional change control processes.

SELF-DOCUMENTING
TRAC automatically generates governance-ready documentation with no manual input required, eliminating the need to manually compile paper evidence for model deployment oversight, data lineage reporting and internal audit.
Because TRAC is sufficient, incorruptible and self-documenting, you get the best of both worlds: maximal
control and transparency plus analytical flexibility, in a single solution.
Virtual Deployment Framework
Self-describing Models
Models can be imported and used with zero code modifications or platform-level interventions, so long as
the model code contains a custom function which declares the model’s schema to the platform. A model schema
consists of:
The schema of any data inputs the model needs to run
The schema of any optional or required parameters which affect how the model runs
The schema of the output data which the model produces when it runs
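For example, using the open-source TRAC runtime for Python (tracdap.rt.api), a self-describing model might look like the sketch below. This is a minimal sketch: the field names, the parameter and the profit calculation are invented for illustration, and helper names may vary between runtime versions.

    import typing as tp
    import tracdap.rt.api as trac

    class ProfitModel(trac.TracModel):

        def define_parameters(self) -> tp.Dict[str, trac.ModelParameter]:
            # Parameters which affect how the model runs
            return trac.define_parameters(
                trac.P("discount_rate", trac.FLOAT, label="Discount rate"))

        def define_inputs(self) -> tp.Dict[str, trac.ModelInputSchema]:
            # Schema of the data inputs the model needs to run
            loans = trac.define_input_table(
                trac.F("loan_id", trac.STRING, label="Loan ID"),
                trac.F("balance", trac.FLOAT, label="Outstanding balance"))
            return {"customer_loans": loans}

        def define_outputs(self) -> tp.Dict[str, trac.ModelOutputSchema]:
            # Schema of the output data the model produces when it runs
            profit = trac.define_output_table(
                trac.F("loan_id", trac.STRING, label="Loan ID"),
                trac.F("profit", trac.FLOAT, label="Projected profit"))
            return {"loan_profit": profit}

        def run_model(self, ctx: trac.TracContext):
            rate = ctx.get_parameter("discount_rate")
            loans = ctx.get_pandas_table("customer_loans")
            loans["profit"] = loans["balance"] * rate
            ctx.put_pandas_table("loan_profit", loans[["loan_id", "profit"]])

The three define_* functions correspond to the three parts of the model schema listed above, and they are all TRAC needs in order to import and validate the model.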
Model Deployment Process
TRAC uses a ‘virtual’ model deployment framework, in which model code remains in an external repository
and is accessed at runtime. There are three main processes involved in this framework and TRAC performs
validations at each step. These validations replace the traditional route-to-live (RTL) process and
allow models to be deployed and used without platform-level interventions or code changes.
IMPORT MODELS
Importing a model creates an object in the TRAC metadata store which refers to and describes the model. This record includes the model schema. The model is not deployed (in the traditional, physical sense) because the code remains in the repository.
RTL validation: does the model code contain a properly constructed function declaring its schema?

BUILD FLOW
Flows can be built and validated on the platform using only the schema representations of the models. Flows exist only as metadata objects, so a flow is like a ‘virtual’ deployment of some models into an execution process.
RTL validation: is the model schema compatible with its proposed place in the calculation graph?

RUN JOBS
For a RunFlow job you first pick a flow, then select the data and model objects to use for each node, plus any required parameters. TRAC then fetches the model code and the data records from storage and orchestrates the calculations as a single job.
RTL validation: does the model code generate outputs which are consistent with the declared schema?
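To make the BUILD FLOW and RUN JOBS steps concrete, the sketch below expresses a single-model flow and a RunFlow job as plain Python data structures. This is illustrative only: the real TRAC metadata format and object reference syntax differ in detail, and the names are assumptions carried over from the model sketch above.

    # Illustrative only: the real TRAC metadata format differs in detail.

    # A flow is pure metadata: named nodes plus the edges connecting them.
    # It references models only through their schemas, not their code.
    flow = {
        "nodes": {
            "customer_loans": {"nodeType": "INPUT_NODE"},
            "profit_model":   {"nodeType": "MODEL_NODE"},
            "loan_profit":    {"nodeType": "OUTPUT_NODE"},
        },
        "edges": [
            {"source": "customer_loans", "target": "profit_model"},
            {"source": "profit_model",   "target": "loan_profit"},
        ],
    }

    # A RunFlow job binds concrete, versioned objects to the flow's nodes.
    run_flow_job = {
        "flow": "flow/customer-profit?version=2",
        "models": {"profit_model": "model/profit-model?version=4"},
        "inputs": {"customer_loans": "data/customer-loans?version=7"},
        "parameters": {"discount_rate": 0.05},
    }

The flow validates models against their declared schemas, while the job pins an exact version of every object it uses, which is what makes the run repeatable.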
In addition to these steps, the TRAC Runtime can be deployed to your IDE of choice,
giving you all the type safety of production and ensuring that models translate to production without
modification. Any model which executes via the TRAC Runtime service in the IDE with local data inputs
will run on the platform.
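For example, with the open-source TRAC runtime for Python, a model can be launched locally from a main guard. A minimal sketch, assuming the ProfitModel class above lives in a module called profit_model and that the two YAML config files exist in your project:

    # Local development entry point: runs the model via the TRAC runtime in the IDE.
    # Module and config file paths here are assumptions for illustration.
    import tracdap.rt.launch as launch

    from profit_model import ProfitModel

    if __name__ == "__main__":
        # The job config supplies parameter values and local data inputs;
        # the system config describes local storage and runtime settings.
        launch.launch_model(ProfitModel, "config/profit_model.yaml", "config/sys_config.yaml")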
TRAC Guarantee
TRAC offers a unique control environment which is characterised by three guarantees.
AUDITABLE ACTIONS
Any action that changes a tag or creates an object is recorded in a time-consistent fashion in the
metadata model. The metadata is designed to be easily understood by humans and machines, and
standard report formats can be used to create governance-ready documentation with no manual input
required.

REPEATABLE JOBS
Any RunModel or RunFlow job can be resubmitted and, because the inputs are immutable, you will
get the same result, guaranteed. We account for multiple factors that cause non-deterministic
model output: threading (don’t use it!), random number generation, time, external calls and
dynamic execution (these are disabled), and language and library versions (these are recorded
with the metadata). A seeding pattern for random number generation is sketched after the note below.

RISK FREE PLATFORM
Every version of every object (model, data, flow) remains permanently available to use and there is
no possibility of accidental loss or damage to deployed assets. Therefore, there is no change risk
(as traditionally defined) on TRAC.
Note
The repeatability guarantee applies to RunModel, RunFlow and ExportData jobs. A model cannot be
imported twice so an ImportModel job cannot be repeated. An ImportData job can be repeated but
due to the dependence on an external source, TRAC cannot guarantee that the same outputs will be produced.
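As an illustration of how the random number generation factor can be handled inside model code: deriving all randomness from a declared parameter means the seed is recorded with the job, so a resubmission reproduces the output exactly. This sketch reuses the tracdap-style API from the earlier example; the model and its schema are invented for illustration.

    import typing as tp
    import numpy as np
    import tracdap.rt.api as trac

    class MonteCarloModel(trac.TracModel):

        def define_parameters(self) -> tp.Dict[str, trac.ModelParameter]:
            # Declaring the seed as a parameter means TRAC records it with the job
            return trac.define_parameters(
                trac.P("random_seed", trac.INTEGER, label="Seed for all random draws"))

        def define_inputs(self) -> tp.Dict[str, trac.ModelInputSchema]:
            scenarios = trac.define_input_table(
                trac.F("scenario_id", trac.STRING, label="Scenario ID"))
            return {"scenarios": scenarios}

        def define_outputs(self) -> tp.Dict[str, trac.ModelOutputSchema]:
            shocked = trac.define_output_table(
                trac.F("scenario_id", trac.STRING, label="Scenario ID"),
                trac.F("shock", trac.FLOAT, label="Random shock"))
            return {"shocked_scenarios": shocked}

        def run_model(self, ctx: trac.TracContext):
            # All randomness derives from the declared seed, so resubmitting the
            # job draws the identical random sequence and reproduces the output.
            rng = np.random.default_rng(ctx.get_parameter("random_seed"))
            scenarios = ctx.get_pandas_table("scenarios")
            scenarios["shock"] = rng.normal(0.0, 1.0, size=len(scenarios))
            ctx.put_pandas_table("shocked_scenarios", scenarios)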
Experimentation & Analytics
In addition to supporting highly-controlled (or ‘production’) model execution processes, TRAC also provide two main ways to
construct ‘experimental’ model runs.
EXPERIMENTAL FLOWS
Separate flows can be created for any standardised analytic process, from sensitivity analysis
to periodic model monitoring. Under the virtual deployment framework, jobs which use
these experimental flows are safely executed on production data and infrastructure.

EXPERIMENTAL INPUTS
Using a ‘production’ flow, alternate model versions, data inputs
or parameter values can be selected. For quick and simple what-if analysis, old
jobs can be loaded, edited and resubmitted, for example to run last year’s models with
this year’s data, or vice versa.
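A sketch of this what-if pattern, reusing the illustrative job structure from the deployment process section (the object references are assumptions, not real TRAC syntax):

    import copy

    # Illustrative only: a previously run job, loaded for editing.
    last_years_job = {
        "flow": "flow/customer-profit?version=2",
        "models": {"profit_model": "model/profit-model?version=4"},
        "inputs": {"customer_loans": "data/customer-loans?version=7"},
        "parameters": {"discount_rate": 0.05},
    }

    # What-if: same models and parameters, this year's data.
    what_if_job = copy.deepcopy(last_years_job)
    what_if_job["inputs"]["customer_loans"] = "data/customer-loans?version=8"

    # Because every referenced object is immutable, both jobs remain
    # repeatable and can run concurrently on the same infrastructure.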
TRAC can execute as many parallel jobs as the underlying compute infrastructure will allow. Because jobs
are isolated and stateless, multiple runs can use different versions of the same model or dataset
concurrently, which greatly reduces the time required to complete more complex comparative analytics.