TracContext
- class tracdap.rt.api.TracContext
Interface that allows model components to interact with the platform at runtime.
TRAC supplies every model with a context when the model is run. The context allows models to access parameters, inputs, outputs and schemas, as well as other resources such as the Spark context (if the model is using Spark) and model logs.
TRAC guarantees that everything defined in the model (parameters, inputs and outputs) will be available in the context when the model is running. So, if a model defines a parameter called "param1" as an integer, the model will be able to call get_parameter("param1") and will receive an integer value.
When a model is running on a production deployment of the TRAC platform, parameters, inputs and outputs will be supplied by TRAC as part of the job. These may come from entries selected by a user in the UI or from settings configured as part of a scheduled task. To develop models locally, a job config file can be supplied (typically in YAML or JSON) to set the required parameters, inputs and output locations. In either case, TRAC will validate the supplied configuration against the model definition to make sure the context always includes exactly what the model requires.
All the context API methods are validated at runtime and will raise ERuntimeValidation if a model tries to access an unknown identifier or perform some other invalid operation.
- get_pandas_table(dataset_name, use_temporal_objects=None)
Get the data for a model input or output as a Pandas dataframe.
Model inputs can be accessed as Pandas dataframes using this method. The TRAC runtime will handle fetching data from storage and apply any necessary format conversions (to improve performance, data may be preloaded). Only defined inputs can be accessed; use define_inputs() to define the inputs of a model. Input names are case-sensitive.
Model inputs are always available and can be accessed at any time inside run_model(). Model outputs can also be retrieved using this method, but they are only available after they have been saved using put_pandas_table() (or another put method). In that case, this method simply returns the saved dataset.
Attempting to retrieve a dataset that is not defined as a model input or output will result in a runtime validation error, even if that dataset exists in the job config and is used by other models. Attempting to retrieve an output before it has been saved will also cause a validation error.
- Parameters:
dataset_name (str) – The name of the model input or output to get data for
use_temporal_objects (bool | None) – Use Python objects for date/time fields instead of the NumPy datetime64 type
- Returns:
A pandas dataframe containing the data for the named dataset
- Return type:
pandas.DataFrame
- Raises:
ERuntimeValidation
- get_parameter(parameter_name)
Get the value of a model parameter.
Model parameters defined using define_parameters() can be retrieved at runtime by this method. Values are returned as native Python types. Parameter names are case-sensitive.
Attempting to retrieve parameters not defined by the model will result in a runtime validation error, even if those parameters are supplied in the job config and used by other models.
- Parameters:
parameter_name (str) – The name of the parameter to get
- Returns:
The parameter value, as a native Python data type
- Raises:
ERuntimeValidation
- Return type:
Any
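As an illustration of the behaviour described for get_parameter(), the following is a minimal sketch of reading parameters inside a model. The parameter names and the StubContext class are hypothetical: StubContext only mimics the documented rules (case-sensitive names, an error for undefined parameters) so that the snippet runs without the TRAC runtime, and it raises KeyError where TRAC would raise ERuntimeValidation.

```python
class StubContext:
    """Hypothetical stand-in for TracContext, for illustration only."""

    def __init__(self, parameters):
        self._parameters = dict(parameters)

    def get_parameter(self, parameter_name):
        # The real runtime raises ERuntimeValidation for undefined names;
        # KeyError is used here to keep the sketch dependency-free
        if parameter_name not in self._parameters:
            raise KeyError(f"Parameter not defined by the model: {parameter_name}")
        return self._parameters[parameter_name]


def run_model(ctx):
    # Values arrive as native Python types, so no casting is needed
    rate = ctx.get_parameter("interest_rate")
    years = ctx.get_parameter("years")
    return 1000.0 * (1 + rate) ** years


ctx = StubContext({"interest_rate": 0.05, "years": 2})
projected = run_model(ctx)
```

Because names are case-sensitive, asking the stub for "Interest_Rate" fails in the same way as asking for a parameter that was never defined.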
- get_polars_table(dataset_name)
Get the data for a model input or output as a Polars dataframe.
This method has equivalent semantics to get_pandas_table(), but returns a Polars dataframe.
- Parameters:
dataset_name (str) – The name of the model input or output to get data for
- Returns:
A polars dataframe containing the data for the named dataset
- Return type:
polars.DataFrame
- Raises:
ERuntimeValidation
- get_schema(dataset_name)
Get the schema of a model input or output.
Use this method to get the SchemaDefinition for any input or output of the current model. For datasets with static schemas, these will be the same schemas that were defined using define_inputs() and define_outputs().
For inputs with dynamic schemas, the schema of the provided input dataset will be returned. For outputs with dynamic schemas, the schema must be set by calling put_schema(), after which this method will return that schema. Calling get_schema() for a dynamic output before the schema is set will result in a runtime validation error.
For optional inputs, use has_dataset() to check whether the input was provided. Calling get_schema() for an optional input that was not provided will always result in a validation error, regardless of whether the input has a static or dynamic schema. For optional outputs, get_schema() can be called; however, if an output is both optional and dynamic then the schema must first be set by calling put_schema().
Attempting to retrieve the schema for a dataset that is not defined as a model input or output will result in a runtime validation error, even if that dataset exists in the job config and is used by other models.
- Parameters:
dataset_name (str) – The name of the input or output to get the schema for
- Returns:
The schema definition for the named dataset
- Return type:
SchemaDefinition
- Raises:
ERuntimeValidation
- has_dataset(dataset_name)
Check whether a dataset is available in the current context.
This method can be used to check whether optional model inputs have been supplied or not. Models should use this method before calling get methods on optional inputs. For inputs not marked as optional, this method will always return true. For outputs, this method will return true after the model calls a put method for the dataset.
A runtime validation error will be raised if the dataset name is not defined as a model input or output.
- Parameters:
dataset_name (str) – The name of the dataset to check
- Returns:
True if the dataset exists in the current context, False otherwise
- Raises:
ERuntimeValidation
- Return type:
bool
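The check-before-get pattern for optional inputs can be sketched as follows. The dataset name "rate_overrides" and the StubContext class are hypothetical: the stub reproduces only the rules described for has_dataset() so the example runs without the TRAC runtime, it raises KeyError where TRAC would raise ERuntimeValidation, and a plain list stands in for a dataframe.

```python
class StubContext:
    """Hypothetical stand-in for TracContext, for illustration only."""

    def __init__(self, defined_inputs, supplied):
        self._defined = set(defined_inputs)  # names declared by the model
        self._supplied = dict(supplied)      # datasets actually provided

    def has_dataset(self, dataset_name):
        # The real runtime raises ERuntimeValidation for undefined names
        if dataset_name not in self._defined:
            raise KeyError(f"Not a model input or output: {dataset_name}")
        return dataset_name in self._supplied

    def get_pandas_table(self, dataset_name):
        if not self.has_dataset(dataset_name):
            raise KeyError(f"Dataset not available: {dataset_name}")
        return self._supplied[dataset_name]


def load_overrides(ctx):
    # Check the optional input first, so the get call cannot fail
    if ctx.has_dataset("rate_overrides"):
        return ctx.get_pandas_table("rate_overrides")
    return None


# A plain list stands in for a dataframe to keep the sketch dependency-free
with_input = StubContext(["rate_overrides"], {"rate_overrides": [{"rate": 0.04}]})
without_input = StubContext(["rate_overrides"], {})

overrides = load_overrides(with_input)
no_overrides = load_overrides(without_input)
```

Returning None when the input is missing is only one possible design; a model could equally fall back to default values at that point.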
- log()
Get a Python logger that can be used for writing model logs.
Logs written to this logger are recorded by TRAC. When models are run on the platform, these logs are assembled and saved with the job outputs as a dataset that can be queried through the regular TRAC data and metadata APIs.
- Returns:
A Python logger that can be used for writing model logs
- Return type:
logging.Logger
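Since log() returns a standard logging.Logger, the usual logging calls apply. The sketch below assumes a hypothetical StubContext whose log() returns an ordinary named logger; in the real runtime the logger is supplied by TRAC and its output is recorded with the job.

```python
import logging


class StubContext:
    """Hypothetical stand-in for TracContext, for illustration only."""

    def log(self):
        # The real runtime returns a logger whose output is recorded by TRAC
        return logging.getLogger("model_log_sketch")


def run_model(ctx):
    log = ctx.log()
    log.info("Starting calculation")
    row_count = 3  # placeholder for real work
    # Lazy %-style formatting is the standard logging idiom
    log.info("Processed %d rows", row_count)
    return row_count


# Only needed in this sketch, so that INFO messages are shown on the console
logging.basicConfig(level=logging.INFO)
result = run_model(StubContext())
```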
- put_pandas_table(dataset_name, dataset)
Save the data for a model output as a Pandas dataframe.
Model outputs can be saved as Pandas dataframes using this method. The TRAC runtime will validate the supplied data and send it to storage, applying any necessary format conversions. Only defined outputs can be saved; use define_outputs() to define the outputs of a model. Output names are case-sensitive. Once an output has been saved it can be retrieved by calling get_pandas_table() (or another get method).
Each model output can only be saved once and the supplied data must match the schema of the named output. Missing fields or fields of the wrong type will result in a data conformance error. Extra fields will be discarded with a warning. The schema of an output dataset can be checked using get_schema(). For dynamic outputs, the schema must first be set using put_schema().
Attempting to save a dataset that is not defined as a model output will cause a runtime validation error. Attempting to save an output twice, or to save a dynamic output before its schema is set, will also cause a validation error.
- Parameters:
dataset_name (str) – The name of the model output to save data for
dataset (pandas.DataFrame) – A pandas dataframe containing the data for the named dataset
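- Raises:
ERuntimeValidation; a data conformance error is raised if the data does not match the output schema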
- put_polars_table(dataset_name, dataset)
Save the data for a model output as a Polars dataframe.
This method has equivalent semantics to put_pandas_table(), but accepts a Polars dataframe.
- Parameters:
dataset_name (str) – The name of the model output to save data for
dataset (polars.DataFrame) – A polars dataframe containing the data for the named dataset
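- Raises:
ERuntimeValidation; a data conformance error is raised if the data does not match the output schema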
- put_schema(dataset_name, schema)
Set the schema of a dynamic model output.
For outputs marked as dynamic in define_outputs(), a SchemaDefinition must be supplied using this method before attempting to save the data. Once a schema has been set, it can be retrieved by calling get_schema() and data can be saved using put_pandas_table() or another put method.
TRAC API functions are available to help with building schemas, such as trac.F() to define individual fields or load_schema() to load predefined schemas. See the tracdap.rt.api package for a full list of functions that can be used to build and manipulate schemas.
Each schema can only be set once and the schema will be validated using the normal validation rules. If put_schema() is called for an optional output, the model must supply data for that output; otherwise TRAC will report a validation error after the model completes.
Attempting to set the schema for a dataset that is not defined as a dynamic model output for the current model will result in a runtime validation error. Supplying a schema that fails validation will also result in a validation error.
- Parameters:
dataset_name (str) – The name of the output to set the schema for
schema (SchemaDefinition) – A TRAC schema definition to use for the named output
- Raises:
ERuntimeValidation
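The required order of calls for a dynamic output (set the schema once, then save the data once) can be sketched with a toy context. Everything here is an assumption made for illustration: StubContext is a hypothetical stand-in that enforces only the ordering rules described for put_schema(), it raises ValueError where TRAC would raise ERuntimeValidation, and a list of column names and a list of rows stand in for a SchemaDefinition and a dataframe.

```python
class StubContext:
    """Hypothetical stand-in for TracContext, for illustration only."""

    def __init__(self, dynamic_outputs):
        self._dynamic = set(dynamic_outputs)
        self._schemas = {}
        self._data = {}

    def put_schema(self, dataset_name, schema):
        # The real runtime raises ERuntimeValidation in both of these cases
        if dataset_name not in self._dynamic:
            raise ValueError(f"Not a dynamic output: {dataset_name}")
        if dataset_name in self._schemas:
            raise ValueError(f"Schema already set: {dataset_name}")
        self._schemas[dataset_name] = schema

    def get_schema(self, dataset_name):
        if dataset_name not in self._schemas:
            raise ValueError(f"Schema not set: {dataset_name}")
        return self._schemas[dataset_name]

    def put_pandas_table(self, dataset_name, dataset):
        if dataset_name not in self._schemas:
            raise ValueError(f"Schema not set: {dataset_name}")
        if dataset_name in self._data:
            raise ValueError(f"Output already saved: {dataset_name}")
        self._data[dataset_name] = dataset


def run_model(ctx):
    # Placeholder values: real code would pass a SchemaDefinition and a dataframe
    schema = ["region", "total"]
    rows = [{"region": "north", "total": 10.0}]

    ctx.put_schema("summary", schema)      # 1. set the schema first
    ctx.put_pandas_table("summary", rows)  # 2. then save the data, once


ctx = StubContext(dynamic_outputs=["summary"])
run_model(ctx)
```

Reversing the two calls in run_model(), or repeating either of them, makes the stub fail, mirroring the validation errors described above.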