TracContext

class tracdap.rt.api.TracContext

Interface that allows model components to interact with the platform at runtime.

TRAC supplies every model with a context when the model is run. The context allows models to access parameters, inputs, outputs and schemas, as well as other resources such as the Spark context (if the model is using Spark) and model logs.

TRAC guarantees that everything defined in the model (parameters, inputs and outputs) will be available in the context when the model is running. So, if a model defines a parameter called "param1" as an integer, the model will be able to call get_parameter("param1") and will receive an integer value.

When a model is running on a production deployment of the TRAC platform, parameters, inputs and outputs will be supplied by TRAC as part of the job. These could come from entries selected by a user in the UI or from settings configured as part of a scheduled task. To develop models locally, a job config file can be supplied (typically in YAML or JSON) to set the required parameters, inputs and output locations. In either case, TRAC will validate the supplied configuration against the model definition to make sure the context always includes exactly what the model requires.

All the context API methods are validated at runtime and will raise ERuntimeValidation if a model tries to access an unknown identifier or perform some other invalid operation.

See also

TracModel
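As a rough illustration of the guarantee described above, the sketch below uses a small stand-in class instead of the real TRAC runtime. SketchContext, the local ERuntimeValidation class and the free-standing run_model() function are assumptions made for this example; only get_parameter() and the parameter name "param1" come from the text.

```python
class ERuntimeValidation(Exception):
    """Local stand-in for the validation error raised by the TRAC runtime."""


class SketchContext:
    """Hypothetical stand-in for TracContext that only knows about parameters."""

    def __init__(self, parameters):
        self._parameters = dict(parameters)

    def get_parameter(self, parameter_name):
        # Unknown identifiers are rejected, mirroring the documented behaviour
        if parameter_name not in self._parameters:
            raise ERuntimeValidation(f"Unknown parameter: {parameter_name}")
        return self._parameters[parameter_name]


def run_model(ctx):
    # "param1" is defined as an integer, so an integer comes back
    param1 = ctx.get_parameter("param1")
    return param1 + 1


ctx = SketchContext({"param1": 41})
result = run_model(ctx)

try:
    ctx.get_parameter("param2")  # not defined by the model
    unknown_rejected = False
except ERuntimeValidation:
    unknown_rejected = True
```

In a real model, run_model() would be a method on a TracModel and the context would be supplied by TRAC, so the model code itself never needs to construct one.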

get_pandas_table(dataset_name, use_temporal_objects=None)

Get the data for a model input or output as a Pandas dataframe.

Model inputs can be accessed as Pandas dataframes using this method. The TRAC runtime will handle fetching data from storage and apply any necessary format conversions (to improve performance, data may be preloaded). Only defined inputs can be accessed; use define_inputs() to define the inputs of a model. Input names are case-sensitive.

Model inputs are always available and can be accessed at any time inside run_model(). Model outputs can also be retrieved using this method, but they are only available after they have been saved using put_pandas_table() (or another put method). For a saved output, calling this method will simply return the saved dataset.

Attempting to retrieve a dataset that is not defined as a model input or output will result in a runtime validation error, even if that dataset exists in the job config and is used by other models. Attempting to retrieve an output before it has been saved will also cause a validation error.

Parameters:
  • dataset_name (str) – The name of the model input or output to get data for

  • use_temporal_objects (bool | None) – Use Python objects for date/time fields instead of the NumPy datetime64 type

Returns:

A Pandas dataframe containing the data for the named dataset

Return type:

pandas.DataFrame

Raises:

ERuntimeValidation
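The following is a minimal sketch of reading an input inside model code. It runs against a hypothetical in-memory stand-in (SketchContext) instead of the TRAC runtime, and the dataset name "customer_loans" and column "balance" are illustrative. The stand-in raises KeyError where the real runtime would raise ERuntimeValidation.

```python
import pandas as pd


class SketchContext:
    """Hypothetical stand-in for TracContext, backed by in-memory dataframes."""

    def __init__(self, inputs):
        self._inputs = dict(inputs)

    def get_pandas_table(self, dataset_name, use_temporal_objects=None):
        # The real runtime raises ERuntimeValidation here; KeyError is a stand-in
        if dataset_name not in self._inputs:
            raise KeyError(dataset_name)
        return self._inputs[dataset_name]


def total_balance(ctx):
    # "customer_loans" and "balance" are illustrative names
    customer_loans = ctx.get_pandas_table("customer_loans")
    return float(customer_loans["balance"].sum())


ctx = SketchContext({"customer_loans": pd.DataFrame({"balance": [100.0, 250.5]})})
total = total_balance(ctx)

# Input names are case-sensitive, so a different spelling is a different name
try:
    ctx.get_pandas_table("Customer_Loans")
    wrong_case_found = True
except KeyError:
    wrong_case_found = False
```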

get_parameter(parameter_name)

Get the value of a model parameter.

Model parameters defined using define_parameters() can be retrieved at runtime by this method. Values are returned as native Python types. Parameter names are case-sensitive.

Attempting to retrieve parameters not defined by the model will result in a runtime validation error, even if those parameters are supplied in the job config and used by other models.

Parameters:

parameter_name (str) – The name of the parameter to get

Returns:

The parameter value, as a native Python data type

Return type:

Any

Raises:

ERuntimeValidation
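Because values come back as native Python types, they can be used directly in ordinary expressions. The sketch below assumes a float, an integer and a date parameter (with the date arriving as datetime.date, which is an assumption here). The parameter names and the SketchContext stand-in are hypothetical.

```python
import datetime as dt


class SketchContext:
    """Hypothetical stand-in for TracContext; values are plain Python objects."""

    def __init__(self, parameters):
        self._parameters = dict(parameters)

    def get_parameter(self, parameter_name):
        return self._parameters[parameter_name]


# Illustrative parameter names and values, as a job config might supply them
ctx = SketchContext({
    "eur_usd_rate": 1.25,
    "default_weighting": 2,
    "reporting_date": dt.date(2024, 3, 31),  # assumed Python type for a date
})

rate = ctx.get_parameter("eur_usd_rate")
weighting = ctx.get_parameter("default_weighting")
reporting_date = ctx.get_parameter("reporting_date")

# Native types work in ordinary arithmetic without any conversion step
weighted_rate = rate * weighting
next_day = reporting_date + dt.timedelta(days=1)
```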

get_polars_table(dataset_name)

Get the data for a model input or output as a Polars dataframe.

This method has equivalent semantics to get_pandas_table(), but returns a Polars dataframe.

Parameters:

dataset_name (str) – The name of the model input or output to get data for

Returns:

A Polars dataframe containing the data for the named dataset

Return type:

polars.DataFrame

Raises:

ERuntimeValidation

get_schema(dataset_name)

Get the schema of a model input or output.

Use this method to get the SchemaDefinition for any input or output of the current model. For datasets with static schemas, these will be the same schemas that were defined using define_inputs() and define_outputs().

For inputs with dynamic schemas, the schema of the provided input dataset will be returned. For outputs with dynamic schemas the schema must be set by calling put_schema(), after which this method will return that schema. Calling get_schema() for a dynamic output before the schema is set will result in a runtime validation error.

For optional inputs, use has_dataset() to check whether the input was provided. Calling get_schema() for an optional input that was not provided will always result in a validation error, regardless of whether the input has a static or dynamic schema. For optional outputs, get_schema() can be called as normal. However, if an output is both optional and dynamic, the schema must first be set by calling put_schema().

Attempting to retrieve the schema for a dataset that is not defined as a model input or output will result in a runtime validation error, even if that dataset exists in the job config and is used by other models.

Parameters:

dataset_name (str) – The name of the input or output to get the schema for

Returns:

The schema definition for the named dataset

Return type:

SchemaDefinition

Raises:

ERuntimeValidation
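One possible use of get_schema() is to look up the field names an output expects. The sketch below is illustrative only: the stand-in classes and the attribute names table, fields and fieldName are assumptions about the layout of SchemaDefinition, not taken from this page, and the dataset and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


# Simplified stand-ins; the attribute names (table, fields, fieldName) are an
# assumption about how SchemaDefinition is laid out
@dataclass
class FieldSketch:
    fieldName: str


@dataclass
class TableSketch:
    fields: List[FieldSketch] = field(default_factory=list)


@dataclass
class SchemaSketch:
    table: TableSketch


class SketchContext:
    """Hypothetical stand-in for TracContext that only holds schemas."""

    def __init__(self, schemas):
        self._schemas = dict(schemas)

    def get_schema(self, dataset_name):
        return self._schemas[dataset_name]


def output_columns(ctx, dataset_name):
    # Collect the field names in the order the schema declares them
    schema = ctx.get_schema(dataset_name)
    return [f.fieldName for f in schema.table.fields]


ctx = SketchContext({
    "profit_by_region": SchemaSketch(
        TableSketch([FieldSketch("region"), FieldSketch("gross_profit")])
    )
})
columns = output_columns(ctx, "profit_by_region")
```

A list like this could be used to select columns from a dataframe before saving it, so that only the fields named in the schema are passed to a put method.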

has_dataset(dataset_name)

Check whether a dataset is available in the current context.

This method can be used to check whether optional model inputs have been supplied or not. Models should use this method before calling get methods on optional inputs. For inputs not marked as optional, this method will always return true. For outputs, this method will return true after the model calls a put method for the dataset.

A runtime validation error will be raised if the dataset name is not defined as a model input or output.

Parameters:

dataset_name (str) – The name of the dataset to check

Returns:

True if the dataset exists in the current context, False otherwise

Return type:

bool

Raises:

ERuntimeValidation
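The check-before-get pattern for optional inputs might look like the sketch below. SketchContext is a hypothetical stand-in, "rate_overrides" is an illustrative input name, and a plain list of rows stands in for a dataframe so the example has no dependencies.

```python
class SketchContext:
    """Hypothetical stand-in for TracContext with optional inputs."""

    def __init__(self, inputs):
        self._inputs = dict(inputs)

    def has_dataset(self, dataset_name):
        return dataset_name in self._inputs

    def get_pandas_table(self, dataset_name):
        # Returns whatever was stored; a real context returns a dataframe
        return self._inputs[dataset_name]


def load_overrides(ctx):
    # "rate_overrides" is an illustrative optional input
    if ctx.has_dataset("rate_overrides"):
        return ctx.get_pandas_table("rate_overrides")
    # Optional input not supplied: carry on without it
    return None


rows = [{"currency": "EUR", "rate": 1.25}]  # placeholder for a dataframe

with_overrides = load_overrides(SketchContext({"rate_overrides": rows}))
without_overrides = load_overrides(SketchContext({}))
```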

log()

Get a Python logger that can be used for writing model logs.

Logs written to this logger are recorded by TRAC. When models are run on the platform, these logs are assembled and saved with the job outputs as a dataset that can be queried through the regular TRAC data and metadata APIs.

Returns:

A Python logger that can be used for writing model logs

Return type:

logging.Logger
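Since log() returns a standard logging.Logger, the usual logging calls apply. In the sketch below, SketchContext and ListHandler are hypothetical helpers: the first hands out an ordinary logger, the second collects messages so the example can show what was written. How TRAC captures logs internally is not shown here.

```python
import logging


class SketchContext:
    """Hypothetical stand-in for TracContext that hands out a standard logger."""

    def log(self):
        return logging.getLogger("sketch_model")


class ListHandler(logging.Handler):
    """Collects formatted messages so the sketch can show what was logged."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


def run_model(ctx):
    log = ctx.log()
    log.info("Model is running")
    # Lazy %-style arguments are formatted only if the record is emitted
    log.warning("Input had %d rows with missing values", 3)


handler = ListHandler()
logger = logging.getLogger("sketch_model")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

run_model(SketchContext())
```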

put_pandas_table(dataset_name, dataset)

Save the data for a model output as a Pandas dataframe.

Model outputs can be saved as Pandas dataframes using this method. The TRAC runtime will validate the supplied data and send it to storage, applying any necessary format conversions. Only defined outputs can be saved; use define_outputs() to define the outputs of a model. Output names are case-sensitive. Once an output has been saved it can be retrieved by calling get_pandas_table() (or another get method).

Each model output can only be saved once and the supplied data must match the schema of the named output. Missing fields or fields of the wrong type will result in a data conformance error. Extra fields will be discarded with a warning. The schema of an output dataset can be checked using get_schema(). For dynamic outputs, the schema must first be set using put_schema().

Attempting to save a dataset that is not defined as a model output will cause a runtime validation error. Attempting to save an output twice, or to save a dynamic output before its schema is set, will also cause a validation error.

Parameters:
  • dataset_name (str) – The name of the model output to save data for

  • dataset (pandas.DataFrame) – A Pandas dataframe containing the data for the named dataset

Raises:

ERuntimeValidation, EDataConformance
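The save-once rule and the read-back behaviour can be sketched as follows. SketchContext and the local ERuntimeValidation class are hypothetical stand-ins that only track which outputs have been saved. They perform no schema checks, and the output name "profit_by_region" is illustrative.

```python
import pandas as pd


class ERuntimeValidation(Exception):
    """Local stand-in for the validation error raised by the TRAC runtime."""


class SketchContext:
    """Hypothetical stand-in for TracContext that tracks saved outputs."""

    def __init__(self, output_names):
        self._output_names = set(output_names)
        self._saved = {}

    def put_pandas_table(self, dataset_name, dataset):
        if dataset_name not in self._output_names:
            raise ERuntimeValidation(f"Unknown output: {dataset_name}")
        if dataset_name in self._saved:
            raise ERuntimeValidation(f"Output already saved: {dataset_name}")
        self._saved[dataset_name] = dataset

    def get_pandas_table(self, dataset_name):
        if dataset_name not in self._saved:
            raise ERuntimeValidation(f"Output not saved yet: {dataset_name}")
        return self._saved[dataset_name]


ctx = SketchContext(output_names=["profit_by_region"])  # illustrative name

profit_by_region = pd.DataFrame(
    {"region": ["north", "south"], "gross_profit": [10.0, 12.5]}
)
ctx.put_pandas_table("profit_by_region", profit_by_region)

# Once saved, the output can be read back
saved = ctx.get_pandas_table("profit_by_region")

# A second save of the same output is rejected
try:
    ctx.put_pandas_table("profit_by_region", profit_by_region)
    second_save_rejected = False
except ERuntimeValidation:
    second_save_rejected = True
```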

put_polars_table(dataset_name, dataset)

Save the data for a model output as a Polars dataframe.

This method has equivalent semantics to put_pandas_table(), but accepts a Polars dataframe.

Parameters:
  • dataset_name (str) – The name of the model output to save data for

  • dataset (polars.DataFrame) – A Polars dataframe containing the data for the named dataset

Raises:

ERuntimeValidation, EDataConformance

put_schema(dataset_name, schema)

Set the schema of a dynamic model output.

For outputs marked as dynamic in define_outputs(), a SchemaDefinition must be supplied using this method before attempting to save the data. Once a schema has been set, it can be retrieved by calling get_schema() and data can be saved using put_pandas_table() or another put method.

TRAC API functions are available to help with building schemas, such as trac.F() to define individual fields or load_schema() to load predefined schemas. See the tracdap.rt.api package for a full list of functions that can be used to build and manipulate schemas.

Each schema can only be set once and the schema will be validated using the normal validation rules. If put_schema() is called for an optional output, the model must supply data for that output, otherwise TRAC will report a validation error after the model completes.

Attempting to set the schema for a dataset that is not defined as a dynamic model output for the current model will result in a runtime validation error. Supplying a schema that fails validation will also result in a validation error.

Parameters:
  • dataset_name (str) – The name of the output to set the schema for

  • schema (SchemaDefinition) – A TRAC schema definition to use for the named output

Raises:

ERuntimeValidation
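The required ordering for a dynamic output (set the schema first, then save the data) can be sketched with a small stand-in. Everything here is hypothetical: SketchContext, the local ERuntimeValidation class and the name "dynamic_summary". A plain list stands in for a real SchemaDefinition and a list of rows stands in for a dataframe, since the sketch only shows the ordering rules.

```python
class ERuntimeValidation(Exception):
    """Local stand-in for the validation error raised by the TRAC runtime."""


class SketchContext:
    """Hypothetical stand-in for TracContext with dynamic outputs only."""

    def __init__(self, dynamic_outputs):
        self._dynamic_outputs = set(dynamic_outputs)
        self._schemas = {}
        self._data = {}

    def put_schema(self, dataset_name, schema):
        if dataset_name not in self._dynamic_outputs:
            raise ERuntimeValidation(f"Not a dynamic output: {dataset_name}")
        if dataset_name in self._schemas:
            raise ERuntimeValidation(f"Schema already set: {dataset_name}")
        self._schemas[dataset_name] = schema

    def get_schema(self, dataset_name):
        if dataset_name not in self._schemas:
            raise ERuntimeValidation(f"Schema not set: {dataset_name}")
        return self._schemas[dataset_name]

    def put_pandas_table(self, dataset_name, dataset):
        # Data for a dynamic output is rejected until its schema is set
        if dataset_name not in self._schemas:
            raise ERuntimeValidation(f"Schema not set: {dataset_name}")
        self._data[dataset_name] = dataset


ctx = SketchContext(dynamic_outputs=["dynamic_summary"])
schema = ["region", "gross_profit"]  # placeholder for a SchemaDefinition
rows = [{"region": "north", "gross_profit": 10.0}]  # placeholder for a dataframe

# Saving before the schema is set is rejected
try:
    ctx.put_pandas_table("dynamic_summary", rows)
    early_save_rejected = False
except ERuntimeValidation:
    early_save_rejected = True

# Correct order: schema first, then data
ctx.put_schema("dynamic_summary", schema)
ctx.put_pandas_table("dynamic_summary", rows)
schema_after = ctx.get_schema("dynamic_summary")
```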