tracdap.metadata

Module Contents

Classes

ArrayValue

An array value holds an array of other Values.

BasicType

Basic types provide the set of core types available in the TRAC type system.

CopyStatus

Status of an individual copy of a data storage item

CustomDefinition

Define a custom object that can be stored and managed in the TRAC metadata store

DataDefinition

Define a dataset that can be stored and managed in the TRAC platform

DateValue

Represent a date value.

DatetimeValue

Represent a date-time value.

DecimalValue

Represent a decimal value.

FieldSchema

Schema for an individual field in a tabular dataset

FileDefinition

Describes a file object stored in the TRAC platform

FlowDefinition

A flow defines an execution graph as a set of connections between models and data

FlowEdge

A connection between two nodes in a flow

FlowNode

Describes an individual node in a flow

FlowNodeType

Specify the type of an individual flow node

FlowSocket

A socket is a point of connection for wiring up the edges in a flow

ImportModelJob

Specification for an IMPORT_MODEL job

IncarnationStatus

Status of an individual incarnation of a data storage item

JobDefinition

Define a job to run on the TRAC platform

JobStatusCode

Indicate the status of a job in the TRAC platform

JobType

Specify the type of an individual TRAC job

LogicalExpression

Logical expression for a search of the TRAC metadata store.

LogicalOperator

Metadata search logical operator, used as part of a LogicalExpression.

MapValue

A map value holds a map of string keys to other Values.

MetadataFormat

Available formats for representing the TRAC metadata.

MetadataVersion

Explicit versioning of the metadata schema.

ModelDefinition

Define a model for execution on the TRAC platform

ModelInputSchema

Describes the data schema of a model input

ModelOutputSchema

Describes the data schema of a model output

ModelParameter

Describes an individual parameter of a model

ObjectDefinition

Object definitions are the core structural element of TRAC's metadata model

ObjectType

Enumeration of TRAC's core object types.

PartKey

Partition key for tabular datasets

PartType

Partitioning scheme applied to a dataset

RunFlowJob

Specification for a RUN_FLOW job

RunModelJob

Specification for a RUN_MODEL job

SchemaDefinition

A schema definition describes the schema of a dataset

SchemaType

Enumeration of the available types of data schema

SearchExpression

Search expression for a search of the TRAC metadata store.

SearchOperator

Metadata search term operator, used as part of a SearchTerm

SearchParameters

Parameters to define a metadata search.

SearchTerm

Individual search term for a search of the TRAC metadata store.

StorageCopy

Define physical storage for an individual copy of a data item

StorageDefinition

Defines the physical storage for a file or dataset object

StorageIncarnation

Define physical storage for an individual incarnation of a data item

StorageItem

Define physical storage for an individual data item

TableSchema

Schema for a tabular dataset

Tag

Tags are the core informational element of TRAC's metadata model.

TagHeader

A tag header describes the identity and version of an object.

TagOperation

Enumeration of available TagUpdate operations.

TagSelector

A tag selector describes the selection of a unique object at a point in time.

TagUpdate

A tag update is a request for a single update operation on a tag.

TenantInfo

Information about a tenant that is set up on the TRAC platform.

TypeDescriptor

A type descriptor describes a data type used in the TRAC platform.

Value

A value expressed in the TRAC type system.

class tracdap.metadata.ArrayValue

An array value holds an array of other Values.

All items in an array must have the same type.

See also

ARRAY

items

items

Type:

List[Value]

class tracdap.metadata.BasicType(*args, **kwds)

Bases: enum.Enum

Basic types provide the set of core types available in the TRAC type system.

ARRAY = 8

An array of values, which may be primitive or composite values.

All items in an array must have the same type (i.e. the same type descriptor).

BASIC_TYPE_NOT_SET = 0

BOOLEAN = 1

A true/false value

DATE = 6

A date value.

Dates do not take any account of time zones or offsets from UTC.

DATETIME = 7

A date-time value.

Date-time values may be expressed with an offset from UTC, as per ISO 8601. The available sub-second precision may vary depending on language / platform.

For metadata attributes, TRAC represents all date-times in UTC with microsecond precision. Incoming values will be converted to UTC if they are supplied with an offset.
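The UTC conversion described above can be illustrated with the Python standard library. This is a minimal sketch, not TRAC's implementation: the helper name `to_utc_micros` is hypothetical, and treating values without an offset as UTC is an assumption made only for this example.

```python
from datetime import datetime, timezone


def to_utc_micros(iso_datetime: str) -> str:
    """Hypothetical helper: normalise an ISO 8601 date-time to UTC, microsecond precision."""
    value = datetime.fromisoformat(iso_datetime)
    if value.tzinfo is None:
        # Assumption for this sketch: a value with no offset is already in UTC
        value = value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc).isoformat(timespec="microseconds")


# A value supplied with a +02:00 offset is shifted back two hours
utc_value = to_utc_micros("2020-04-01T12:37:05+02:00")
```

Here `utc_value` holds the same instant expressed in UTC, so two values supplied with different offsets can be compared directly.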

DECIMAL = 5

A fixed-point decimal value with known precision and scale.

The available precision and scale may vary between languages / platforms.

For metadata attributes, TRAC provides the following guarantees on precision:

precision >= 31
scale >= 10
precision - scale >= 21
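Read together, these guarantees mean a metadata attribute can always hold at least 21 integer digits and 10 fractional digits. The sketch below checks a value against that guaranteed range using Python's `decimal` module. The function name `fits_guaranteed_range` is hypothetical and the check is only an illustration; a given platform may accept larger values.

```python
from decimal import Decimal

# Minimum guarantees quoted above for metadata attributes
MIN_PRECISION = 31
MIN_SCALE = 10


def fits_guaranteed_range(value: Decimal) -> bool:
    """Hypothetical check: does value fit in 21 integer digits and 10 fractional digits?"""
    _sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        # NaN and Infinity have non-numeric exponents and never fit
        return False
    scale = max(-exponent, 0)                        # digits after the decimal point
    integer_digits = max(len(digits) + exponent, 0)  # digits before the decimal point
    return scale <= MIN_SCALE and integer_digits <= MIN_PRECISION - MIN_SCALE


small_ok = fits_guaranteed_range(Decimal("123.45"))
too_fine = fits_guaranteed_range(Decimal("0.12345678901"))  # 11 fractional digits
```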

FLOAT = 3

64 bit signed floating point number (referred to as ‘double’ in many languages)

INTEGER = 2

64 bit signed integer

MAP = 9

A key-value map with string keys; values may be primitive or composite values.

Maps may be uniform, in which case all the values are of the same type, or non-uniform, in which case values can be of any type. For uniform maps the type descriptor will specify the type contained in the map. For non-uniform maps the type descriptor can only specify that the map is non-uniform; values must be examined at run time to determine their type.

See also

TypeDescriptor

STRING = 4

UTF encoded string value of arbitrary length.

The encoding used (e.g. UTF-8, UTF-16, UCS-16) varies between languages / platforms, generally TRAC will present strings using the standard encoding for a given language or protocol.

class tracdap.metadata.CopyStatus(*args, **kwds)

Bases: enum.Enum

Status of an individual copy of a data storage item

COPY_AVAILABLE = 1

The copy of the data item is available in storage to access

COPY_EXPUNGED = 2

The copy of the data item has been expunged and is no longer available

COPY_STATUS_NOT_SET = 0

class tracdap.metadata.CustomDefinition

Define a custom object that can be stored and managed in the TRAC metadata store

customData

customData

Type:

bytes

customSchemaType

customSchemaType

Type:

str

customSchemaVersion

customSchemaVersion

Type:

int

class tracdap.metadata.DataDefinition

Define a dataset that can be stored and managed in the TRAC platform

class Delta

dataItem

dataItem

Type:

str

deltaIndex

deltaIndex

Type:

int

class Part

partKey

partKey

Type:

PartKey

snap

snap

Type:

DataDefinition.Snap

class Snap

deltas

deltas

Type:

List[DataDefinition.Delta]

snapIndex

snapIndex

Type:

int

parts

parts

Type:

Dict[str, DataDefinition.Part]

schema

schema

Type:

Optional[SchemaDefinition]

schemaId

schemaId

Type:

Optional[TagSelector]

storageId

storageId

Type:

TagSelector

class tracdap.metadata.DateValue

Represent a date value.

Dates are represented as strings in ISO 8601 format.

See also

DATE

isoDate

isoDate

Type:

str
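As a small illustration of the ISO 8601 string form, the sketch below converts a native Python date to a string and back using only the standard library. It shows the string format only and does not use the TRAC classes themselves.

```python
from datetime import date

accounting_date = date(2020, 3, 31)
iso_date = accounting_date.isoformat()      # string form, as would be held in isoDate
round_trip = date.fromisoformat(iso_date)   # back to a native date
```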

class tracdap.metadata.DatetimeValue

Represent a date-time value.

Date-times are represented as strings in ISO 8601 format.

See also

DATETIME

isoDatetime

isoDatetime

Type:

str

class tracdap.metadata.DecimalValue

Represent a decimal value.

See also

DECIMAL

decimal

decimal

Type:

str

class tracdap.metadata.FieldSchema

Schema for an individual field in a tabular dataset

See also

TableSchema

businessKey

businessKey

Type:

bool

categorical

categorical

Type:

bool

fieldName

fieldName

Type:

str

fieldOrder

fieldOrder

Type:

int

fieldType

fieldType

Type:

BasicType

formatCode

formatCode

Type:

Optional[str]

label

label

Type:

str

notNull

This could become mandatory with the next metadata update

Type:

Optional[bool]

class tracdap.metadata.FileDefinition

Describes a file object stored in the TRAC platform

dataItem

dataItem

Type:

str

extension

extension

Type:

str

mimeType

mimeType

Type:

str

name

name

Type:

str

size

size

Type:

int

storageId

storageId

Type:

TagSelector

class tracdap.metadata.FlowDefinition

A flow defines an execution graph as a set of connections between models and data

A flow describes the shape of the execution graph; it does not fix in advance the set of models and datasets that will go into it. When a RUN_FLOW job is created, the job matches the flow with a set of models, inputs, outputs and parameters.

See also

JobDefinition

edges

edges

Type:

List[FlowEdge]

inputs

inputs

Type:

Dict[str, ModelInputSchema]

nodes

nodes

Type:

Dict[str, FlowNode]

outputs

outputs

Type:

Dict[str, ModelOutputSchema]

parameters

parameters

Type:

Dict[str, ModelParameter]

class tracdap.metadata.FlowEdge

A connection between two nodes in a flow

See also

FlowSocket

source

source

Type:

FlowSocket

target

target

Type:

FlowSocket

class tracdap.metadata.FlowNode

Describes an individual node in a flow

See also

FlowDefinition

inputs

inputs

Type:

List[str]

label

label

Type:

str

nodeAttrs

nodeAttrs

Type:

List[TagUpdate]

nodeProps

nodeProps

Type:

Dict[str, Value]

nodeSearch

nodeSearch

Type:

SearchExpression

nodeType

nodeType

Type:

FlowNodeType

outputs

outputs

Type:

List[str]

parameters

parameters

Type:

List[str]

class tracdap.metadata.FlowNodeType(*args, **kwds)

Bases: enum.Enum

Specify the type of an individual flow node

See also

FlowNode

INPUT_NODE = 1

Input nodes describe inputs to the flow, such as files or datasets

MODEL_NODE = 3

Model nodes are placeholders for TRAC models that will be supplied at runtime

NODE_TYPE_NOT_SET = 0

OUTPUT_NODE = 2

Output nodes describe outputs the flow produces, such as files or datasets

PARAMETER_NODE = 4

Parameter nodes allow explicit mapping of parameters into models (TRAC can infer parameters by name if they are not defined explicitly)

class tracdap.metadata.FlowSocket

A socket is a point of connection for wiring up the edges in a flow

For parameter, input and output nodes the socket is just the node name and the socket field will be blank. For models, the node name refers to a model node and the socket is the name of the parameter, input or output being connected. E.g. these two sockets could be used to connect a flow input to a model, using an edge:

flow_input_socket = { "node": "my_input_dataset", "socket": "" }
model_input_socket = { "node": "my_model", "socket": "input_1" }

See also

FlowEdge
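The two sockets in the example above can be joined by an edge whose source is the flow input and whose target is the model input. The sketch below uses small stand-in dataclasses (`Socket` and `Edge` are hypothetical names, not the TRAC classes) to show the shape of that wiring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Socket:
    """Stand-in for FlowSocket: node name plus an optional socket name."""
    node: str
    socket: str = ""   # blank for parameter, input and output nodes


@dataclass(frozen=True)
class Edge:
    """Stand-in for FlowEdge: a connection from a source socket to a target socket."""
    source: Socket
    target: Socket


flow_input_socket = Socket(node="my_input_dataset")
model_input_socket = Socket(node="my_model", socket="input_1")

# Connect the flow input to input_1 of the model node
edge = Edge(source=flow_input_socket, target=model_input_socket)
```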

node

node

Type:

str

socket

socket

Type:

str

class tracdap.metadata.ImportModelJob

Specification for an IMPORT_MODEL job

entryPoint

entryPoint

Type:

str

language

language

Type:

str

modelAttrs

modelAttrs

Type:

List[TagUpdate]

package

package

Type:

str

packageGroup

packageGroup

Type:

Optional[str]

path

path

Type:

str

repository

repository

Type:

str

version

version

Type:

str

class tracdap.metadata.IncarnationStatus(*args, **kwds)

Bases: enum.Enum

Status of an individual incarnation of a data storage item

INCARNATION_AVAILABLE = 1

The incarnation of the data item has at least one copy available in storage

INCARNATION_EXPUNGED = 2

This incarnation of the data item is no longer available in storage, all copies have been expunged

INCARNATION_STATUS_NOT_SET = 0

class tracdap.metadata.JobDefinition

Define a job to run on the TRAC platform

importModel

importModel

Type:

Optional[ImportModelJob]

jobType

jobType

Type:

JobType

runFlow

runFlow

Type:

Optional[RunFlowJob]

runModel

runModel

Type:

Optional[RunModelJob]

class tracdap.metadata.JobStatusCode(*args, **kwds)

Bases: enum.Enum

Indicate the status of a job in the TRAC platform

CANCELLED = 10

The job was cancelled by a user of the platform

FAILED = 9

The job failed and has been terminated or rejected

FINISHING = 7

Job execution completed, the platform is cleaning up and validating the outputs

JOB_STATUS_CODE_NOT_SET = 0

PENDING = 3

The job is being set up

PREPARING = 1

The job is being set up

QUEUED = 4

The job is queued in TRAC, waiting for available resources

RUNNING = 6

The job is currently running

SUBMITTED = 5

The job has been submitted for execution but has not yet started

SUCCEEDED = 8

The job completed successfully and the results are available

VALIDATED = 2

The job has passed validation and is ok to run (dry-run operations may return this status)

class tracdap.metadata.JobType(*args, **kwds)

Bases: enum.Enum

Specify the type of an individual TRAC job

IMPORT_DATA = 4

Import data into the platform

IMPORT_MODEL = 3

Import a model into the platform

JOB_TYPE_NOT_SET = 0

RUN_FLOW = 2

Run a flow with all its models, parameters and inputs

RUN_MODEL = 1

Run a single model, with parameters and inputs

class tracdap.metadata.LogicalExpression

Logical expression for a search of the TRAC metadata store.

Applies a logical operator to one or more sub-expressions.

expr

A set of sub-expressions.

For AND or OR operations, there must be two or more sub-expressions. For NOT operations, there must be precisely one sub-expression.

Type:

List[SearchExpression]

operator

The logical operator to apply to sub-expressions

Type:

LogicalOperator

class tracdap.metadata.LogicalOperator(*args, **kwds)

Bases: enum.Enum

Metadata search logical operator, used as part of a LogicalExpression.

AND = 1

LOGICAL AND

The AND operator combines two or more search expressions, the logical expression will match only when all sub-expressions match. The order of sub-expressions is not important.

LOGICAL_OPERATOR_NOT_SET = 0

NOT = 3

LOGICAL NOT

The NOT operator applies to a single sub-expression, the logical expression will match precisely when the sub-expression does not match.

OR = 2

LOGICAL OR

The OR operator combines two or more search expressions, the logical expression will match when any of the sub-expressions match. The order of sub-expressions is not important.
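The rules for the three logical operators, including the number of sub-expressions each one accepts, can be sketched as a small function over already-evaluated sub-expression results. `Op` and `combine` are hypothetical names used for illustration; this is not how TRAC evaluates searches internally.

```python
from enum import Enum


class Op(Enum):
    """Stand-in mirroring the LogicalOperator values listed above."""
    AND = 1
    OR = 2
    NOT = 3


def combine(operator: Op, results: list) -> bool:
    """Combine sub-expression match results (booleans) with a logical operator."""
    if operator is Op.NOT:
        if len(results) != 1:
            raise ValueError("NOT requires precisely one sub-expression")
        return not results[0]
    if len(results) < 2:
        raise ValueError("AND and OR require two or more sub-expressions")
    # Order of sub-expressions does not affect the outcome of all() or any()
    return all(results) if operator is Op.AND else any(results)


and_result = combine(Op.AND, [True, False])
or_result = combine(Op.OR, [True, False])
not_result = combine(Op.NOT, [False])
```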

class tracdap.metadata.MapValue

A map value holds a map of string keys to other Values.

Maps may be uniform (holding all the same value type) or non-uniform (holding mixed value types) depending on the type descriptor of the Value that contains them.

See also

MAP, TypeDescriptor

entries

entries

Type:

Dict[str, Value]

class tracdap.metadata.MetadataFormat(*args, **kwds)

Bases: enum.Enum

Available formats for representing the TRAC metadata.

Used for communication between components, config files and metadata stored in the metadata database.

JSON = 2
METADATA_FORMAT_NOT_SET = 0
PROTO = 1
YAML = 3

class tracdap.metadata.MetadataVersion(*args, **kwds)

Bases: enum.Enum

Explicit versioning of the metadata schema.

A strictly increasing enumeration of metadata versions. The special value CURRENT is always set to the latest version and used by default, in API calls, config files and for storing in the metadata database.

TRAC can use this information to identify and handle old metadata found in the metadata database. In future it may also be possible to request old metadata versions in API calls, or to run upgrades of metadata stored in an older metadata format.
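Because CURRENT shares its value with the latest version, a Python `Enum` treats one of the two names as an alias of the other. The sketch below uses a stand-in enum (`Version` is a hypothetical name, and the declaration order is an assumption) to show the effect with standard `enum` behaviour.

```python
from enum import Enum


class Version(Enum):
    """Stand-in mirroring the MetadataVersion values; declaration order is assumed."""
    METADATA_VERSION_NOT_SET = 0
    V1 = 1
    CURRENT = 1   # same value as V1, so Python makes this name an alias


# Both names resolve to the same member
is_alias = Version.CURRENT is Version.V1

# Aliases are skipped when iterating over the enum
member_names = [member.name for member in Version]
```

With this declaration order, comparing a stored version against `Version.CURRENT` is the same as comparing it against `Version.V1`.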

CURRENT = 1
METADATA_VERSION_NOT_SET = 0
V1 = 1

class tracdap.metadata.ModelDefinition

Define a model for execution on the TRAC platform

entryPoint

entryPoint

Type:

str

inputs

inputs

Type:

Dict[str, ModelInputSchema]

language

language

Type:

str

outputs

outputs

Type:

Dict[str, ModelOutputSchema]

package

package

Type:

str

packageGroup

packageGroup

Type:

Optional[str]

parameters

parameters

Type:

Dict[str, ModelParameter]

path

path

Type:

Optional[str]

repository

repository

Type:

str

staticAttributes

Static attributes defined in model code

Type:

Dict[str, Value]

version

version

Type:

str

class tracdap.metadata.ModelInputSchema

Describes the data schema of a model input

In many cases models define the entire schemas of their inputs, in which case the input schema is just a wrapper around a schema definition. This is what is supported now.

Other approaches are possible. Models can define dynamic inputs, in which case the input schema is provided at runtime and can be interrogated by the model code. Models may also define inputs with some required fields and a dynamic portion. For non-tabular inputs, other options may be required. These capabilities may be added in future releases.

inputProps

inputProps

Type:

Dict[str, Value]

label

label

Type:

Optional[str]

optional

optional

Type:

bool

schema

schema

Type:

SchemaDefinition

class tracdap.metadata.ModelOutputSchema

Describes the data schema of a model output

In many cases models define the entire schemas of their outputs, in which case the output schema is just a wrapper around a schema definition. This is what is supported now.

Other approaches are possible. Models can define dynamic outputs, in which case the model decides at runtime what the output schema will be. Pass-through schemas (output X has the same schema as dynamic input Y) and pass-through-extend schemas (output X has the schema of dynamic input Y, plus one or more new columns) can also be useful. These capabilities may be added in future releases.

label

label

Type:

Optional[str]

optional

optional

Type:

bool

outputProps

outputProps

Type:

Dict[str, Value]

schema

schema

Type:

SchemaDefinition

class tracdap.metadata.ModelParameter

Describes an individual parameter of a model

defaultValue

defaultValue

Type:

Optional[Value]

label

label

Type:

str

paramProps

paramProps

Type:

Dict[str, Value]

paramType

paramType

Type:

TypeDescriptor

class tracdap.metadata.ObjectDefinition

Object definitions are the core structural element of TRAC’s metadata model

Definitions describe every object that is stored in the TRAC platform and there is a one-to-one relation between definitions and objects. I.e. every dataset has its own data definition, every model has its own model definition and so on. Definitions also describe actions that take place on the platform by way of job definitions, so a “job” is just another type of object. Each type of object has its own definition and definitions can be added or extended as the platform evolves.

The object definition container class allows different types of objects to be stored, indexed and accessed in the same way. Every object has a standard object header which contains enough information to identify the object.

TRAC object definitions can be versioned. In order to use versioning the semantics of versioning must be defined and those vary depending on the object type. Currently these semantics are defined for DATA objects, see DataDefinition for details. Versioning is also allowed for CUSTOM objects, in this case it is the responsibility of the application to define versioning semantics. Versioning is not currently permitted for other object types.

Object definitions are intended for storing structural data necessary to access data and run jobs on the TRAC platform. Informational data to catalogue and describe objects is stored in tags. Tags are a lot more flexible than object definitions, so applications built on the TRAC platform may choose to store structural information in tags where their required structure is not supported by TRAC’s core object definitions.

See also

Tag

custom

custom

Type:

Optional[CustomDefinition]

data

data

Type:

Optional[DataDefinition]

file

file

Type:

Optional[FileDefinition]

flow

flow

Type:

Optional[FlowDefinition]

job

job

Type:

Optional[JobDefinition]

model

model

Type:

Optional[ModelDefinition]

objectProps

objectProps

Type:

Dict[str, Value]

objectType

objectType

Type:

ObjectType

schema

schema

Type:

Optional[SchemaDefinition]

storage

storage

Type:

Optional[StorageDefinition]

class tracdap.metadata.ObjectType(*args, **kwds)

Bases: enum.Enum

Enumeration of TRAC’s core object types.

See also

ObjectDefinition

CUSTOM = 6
DATA = 1
FILE = 5
FLOW = 3
JOB = 4
MODEL = 2
OBJECT_TYPE_NOT_SET = 0
SCHEMA = 8
STORAGE = 7

class tracdap.metadata.PartKey

Partition key for tabular datasets

opaqueKey

opaqueKey

Type:

str

partRangeMax

partRangeMax

Type:

Optional[Value]

partRangeMin

partRangeMin

Type:

Optional[Value]

partType

partType

Type:

PartType

partValues

partValues

Type:

List[Value]

class tracdap.metadata.PartType(*args, **kwds)

Bases: enum.Enum

Partitioning scheme applied to a dataset

PART_BY_RANGE = 1

Partition by range over an ordered variable (not available yet)

PART_BY_VALUE = 2

Partition by value over a categorical variable (not available yet)

PART_ROOT = 0

Dataset has a single partition called the root partition (this is the default)

class tracdap.metadata.RunFlowJob

Specification for a RUN_FLOW job

flow

flow

Type:

TagSelector

inputs

inputs

Type:

Dict[str, TagSelector]

models

models

Type:

Dict[str, TagSelector]

outputAttrs

outputAttrs

Type:

List[TagUpdate]

outputs

outputs

Type:

Dict[str, TagSelector]

parameters

parameters

Type:

Dict[str, Value]

priorOutputs

priorOutputs

Type:

Dict[str, TagSelector]

class tracdap.metadata.RunModelJob

Specification for a RUN_MODEL job

inputs

inputs

Type:

Dict[str, TagSelector]

model

model

Type:

TagSelector

outputAttrs

outputAttrs

Type:

List[TagUpdate]

outputs

outputs

Type:

Dict[str, TagSelector]

parameters

parameters

Type:

Dict[str, Value]

priorOutputs

priorOutputs

Type:

Dict[str, TagSelector]

class tracdap.metadata.SchemaDefinition

A schema definition describes the schema of a dataset

Schema definitions can be top level objects (a type of object definition), in which case they can be referred to by multiple data definitions. Alternatively they can be embedded in a data definition to create datasets with one-off schemas.

A table schema describes the schema of a tabular data set. Other schema types may be added later, e.g. for matrices, tensors, curves, surfaces and structured datasets.

See also

DataDefinition

partType

partType

Type:

PartType

schemaType

schemaType

Type:

SchemaType

table

table

Type:

Optional[TableSchema]

class tracdap.metadata.SchemaType(*args, **kwds)

Bases: enum.Enum

Enumeration of the available types of data schema

Currently only table schemas are supported, other schema types may be added later.

See also

SchemaDefinition

SCHEMA_TYPE_NOT_SET = 0

TABLE = 1

Tabular data

class tracdap.metadata.SearchExpression

Search expression for a search of the TRAC metadata store.

A search expression is either a single search term or a logical combination of other expressions. Search expressions can be built up to express complex logical conditions.

logical

Set if this search expression is a logical expression

Type:

Optional[LogicalExpression]

term

Set if this search expression is a single term

Type:

Optional[SearchTerm]

class tracdap.metadata.SearchOperator(*args, **kwds)

Bases: enum.Enum

Metadata search term operator, used as part of a SearchTerm

See also

SearchTerm

EQ = 1

EQUALS

The EQ operator matches a tag when the tag has an attribute that matches the search term exactly, i.e. attribute name, type and value all match. For multi-valued attributes, the EQ operator will match if any of the attribute values match the search term. The search value for the EQ operator must be a primitive value.

Exact matches may behave erratically for FLOAT values due to rounding errors, for this reason it is not recommended to use the EQ operator with FLOAT values.
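The rounding problem is easy to reproduce in plain Python, independent of TRAC. In this sketch a stored FLOAT value that was computed arithmetically fails an exact comparison against the value a user would naturally search for.

```python
import math

stored = 0.1 + 0.2    # a FLOAT attribute produced by some calculation
search_value = 0.3    # the value a user might search for

exact_match = stored == search_value                        # what an exact match does
close_match = math.isclose(stored, search_value, rel_tol=1e-9)
```

The exact comparison fails even though the two values agree to within a tiny tolerance, which is why exact matching on FLOAT attributes is discouraged.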

EXISTS = 8

EXISTS

The EXISTS operator matches a tag when the tag has an attribute with the specified name. If an attribute type is provided, the attribute must also be of that type; if no attribute type is provided, the name alone is enough to match.

GE = 6

GREATER THAN OR EQUAL TO

The GE operator matches a tag when the tag has an attribute with a value greater than or equal to the search parameter. The GE operator will only match single-valued attributes.

The GE operator can only apply to ordered primitive values, it cannot be used with string or boolean values. The GE operator will never match a multi-valued attribute, even if one or more of the individual values matches the search term.

GT = 5

GREATER THAN

The GT operator matches a tag when the tag has an attribute with a value greater than the search parameter. The GT operator will only match single-valued attributes.

The GT operator can only apply to ordered primitive values, it cannot be used with string or boolean values. The GT operator will never match a multi-valued attribute, even if one or more of the individual values matches the search term.

IN = 7

IN

The IN operator matches a tag when the tag has an attribute whose value is matched exactly by an item in the list of values provided. For multi-valued attributes, the IN operator will match if any of the attribute values match the search term. The search value for the IN operator must be an array value whose array items are a primitive type. Arrays of BOOLEAN values are not supported, use EQ to match boolean attributes.

Exact matches may behave erratically for FLOAT values due to rounding errors, for this reason it is not recommended to use the IN operator with FLOAT values.

LE = 4

LESS THAN OR EQUAL TO

The LE operator matches a tag when the tag has an attribute with a value less than or equal to the search parameter. The LE operator will only match single-valued attributes.

The LE operator can only apply to ordered primitive values, it cannot be used with string or boolean values. The LE operator will never match a multi-valued attribute, even if one or more of the individual values matches the search term.

LT = 3

LESS THAN

The LT operator matches a tag when the tag has an attribute with a value less than the search parameter. The LT operator will only match single-valued attributes.

The LT operator can only apply to ordered primitive values, it cannot be used with string or boolean values. The LT operator will never match a multi-valued attribute, even if one or more of the individual values matches the search term.

NE = 2

DOES NOT EQUAL

The NE operator matches a tag precisely when the EQ operator does not match it. This could be because the tag attribute does not match the search term, or because the tag does not define the search attribute at all or defines it with a different type. For multi-valued attributes, the NE operator will only match if none of the attribute values match the search term. The search value for the NE operator must be a primitive value.

The NE operator is exactly equivalent to using the EQ operator inside a logical NOT operation. This equivalence holds for both single- and multi-valued attributes.

Exact matches may behave erratically for FLOAT values due to rounding errors, for this reason it is not recommended to use the NE operator with FLOAT values.
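The way these operators treat multi-valued and missing attributes can be summarised in a short sketch. The functions below are hypothetical illustrations that use Python lists for multi-valued attributes and `None` for a missing attribute; they are not TRAC's search implementation.

```python
def match_eq(attr, search_value) -> bool:
    """EQ: type and value must match; for a multi-valued attribute any value may match."""
    values = attr if isinstance(attr, list) else [attr]
    return any(type(v) is type(search_value) and v == search_value for v in values)


def match_ne(attr, search_value) -> bool:
    """NE: matches precisely when EQ does not, including when the attribute is missing."""
    return not match_eq(attr, search_value)


def match_in(attr, search_values: list) -> bool:
    """IN: matches when any attribute value equals any item in the search list."""
    return any(match_eq(attr, v) for v in search_values)


def match_gt(attr, search_value) -> bool:
    """GT: ordered comparison that never matches a multi-valued or missing attribute."""
    if attr is None or isinstance(attr, list):
        return False
    return attr > search_value


classifiers = ["confidential", "gdpr_pii", "audited"]   # a multi-valued attribute
eq_multi = match_eq(classifiers, "gdpr_pii")
ne_missing = match_ne(None, "gdpr_pii")
gt_multi = match_gt([1, 5], 3)
```

LT, LE and GE would follow the same pattern as `match_gt` with a different comparison.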

SEARCH_OPERATOR_NOT_SET = 0

class tracdap.metadata.SearchParameters

Parameters to define a metadata search.

objectType

The type of object to search for

Type:

ObjectType

priorTags

Include prior tags in the search.

By default, only the latest tag for each object version is considered in a search. If the as-of parameter is specified, latest tags are considered as of that time.

Setting this flag to true will cause TRAC to consider superseded tags in the search. If the as-of parameter is specified as well then all tags up to that time are considered. Only the latest matching tag will be included in the search result.

This flag can be combined with priorVersions to search across all tags and object versions. If neither flag is specified, only the latest version and latest tag are considered for each object.

Type:

bool

priorVersions

Include prior versions of objects in the search.

By default, only the latest version of each object is considered in a search. If the as-of parameter is specified, latest versions are considered as of that time.

Setting this flag to true will cause TRAC to consider superseded object versions in the search. If the as-of parameter is specified as well then all object versions up to that time are considered. Only the latest matching version will be included in the search result.

This flag can be combined with priorTags to search across all tags and object versions. If neither flag is specified, only the latest version and latest tag are considered for each object.

Type:

bool

search

A search expression based on tag attributes.

This field is optional. If no search parameters are given, then all objects are returned.

Type:

SearchExpression

searchAsOf

Perform the search as of a specific date/time.

Supplying this field will cause TRAC to ignore all metadata changes from the specified time onwards. The result will be the same as if a search was performed at the specified time with this field left blank.

If a zone offset is supplied as part of the timestamp, TRAC will apply the offset to search across all metadata items in UTC.

If this parameter is not supplied, the search will be executed as of the current time.

Type:

Optional[DatetimeValue]
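The interaction between searchAsOf and priorTags can be sketched for the tag history of a single object version. Everything here is illustrative: the history data and the function `tags_considered` are hypothetical, the history is assumed to be sorted oldest first, and a change made exactly at the as-of time is assumed to be ignored.

```python
from datetime import datetime, timezone

# Hypothetical tag history for one object version: (tag version, creation time)
history = [
    (1, datetime(2020, 4, 1, 10, 0, tzinfo=timezone.utc)),
    (2, datetime(2020, 4, 6, 9, 30, tzinfo=timezone.utc)),
    (3, datetime(2020, 5, 2, 14, 0, tzinfo=timezone.utc)),
]


def tags_considered(history, as_of=None, prior_tags=False) -> list:
    """Return the tag versions a search would consider under the assumptions above."""
    # Ignore every change made from the as-of time onwards
    visible = [version for version, created in history if as_of is None or created < as_of]
    if prior_tags:
        return visible        # superseded tags are considered as well
    return visible[-1:]       # only the latest visible tag


as_of = datetime(2020, 4, 30, tzinfo=timezone.utc)
latest_now = tags_considered(history)
latest_as_of = tags_considered(history, as_of=as_of)
all_as_of = tags_considered(history, as_of=as_of, prior_tags=True)
```

The same reasoning applies to object versions when priorVersions is set.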

class tracdap.metadata.SearchTerm

Individual search term for a search of the TRAC metadata store.

Applies a search operator against an individual tag attribute.

attrName

The name of the attribute to search for

Type:

str

attrType

The primitive type of the attribute being searched for

Type:

BasicType

operator

The search operator to apply

Type:

SearchOperator

searchValue

The search value to look for

Type:

Value

class tracdap.metadata.StorageCopy

Define physical storage for an individual copy of a data item

copyStatus

copyStatus

Type:

CopyStatus

copyTimestamp

copyTimestamp

Type:

DatetimeValue

storageFormat

storageFormat

Type:

str

storageKey

storageKey

Type:

str

storageOptions

storageOptions

Type:

Dict[str, Value]

storagePath

storagePath

Type:

str

class tracdap.metadata.StorageDefinition

Defines the physical storage for a file or dataset object

Each storage item corresponds to one logical data item, such as a version of a file or a snapshot of a data partition. Storage for each item is broken down into incarnations (data that has been expunged and recomputed) and copies (physical file-level copies for resilience, locality etc).

dataItems

dataItems

Type:

Dict[str, StorageItem]

storageOptions

storageOptions

Type:

Dict[str, Value]
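The item, incarnation and copy hierarchy can be pictured as nested data. The sketch below uses plain dictionaries whose keys mirror the field names in this module. The data item key, storage keys and the helper `available_copies` are hypothetical, and enum values are written as strings for brevity.

```python
# Plain-dict stand-in for a StorageDefinition with one data item
storage = {
    "dataItems": {
        "example_data_item": {
            "incarnations": [
                {
                    "incarnationIndex": 0,
                    "incarnationStatus": "INCARNATION_EXPUNGED",
                    "copies": [
                        {"storageKey": "store_a", "copyStatus": "COPY_EXPUNGED"},
                    ],
                },
                {
                    "incarnationIndex": 1,
                    "incarnationStatus": "INCARNATION_AVAILABLE",
                    "copies": [
                        {"storageKey": "store_a", "copyStatus": "COPY_AVAILABLE"},
                        {"storageKey": "store_b", "copyStatus": "COPY_EXPUNGED"},
                    ],
                },
            ]
        }
    }
}


def available_copies(storage: dict, data_item: str) -> list:
    """List (incarnationIndex, storageKey) for every copy that is still available."""
    found = []
    for incarnation in storage["dataItems"][data_item]["incarnations"]:
        if incarnation["incarnationStatus"] != "INCARNATION_AVAILABLE":
            continue
        for copy in incarnation["copies"]:
            if copy["copyStatus"] == "COPY_AVAILABLE":
                found.append((incarnation["incarnationIndex"], copy["storageKey"]))
    return found


copies = available_copies(storage, "example_data_item")
```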

class tracdap.metadata.StorageIncarnation

Define physical storage for an individual incarnation of a data item

copies

copies

Type:

List[StorageCopy]

incarnationIndex

incarnationIndex

Type:

int

incarnationStatus

incarnationStatus

Type:

IncarnationStatus

incarnationTimestamp

incarnationTimestamp

Type:

DatetimeValue

class tracdap.metadata.StorageItem

Define physical storage for an individual data item

incarnations

incarnations

Type:

List[StorageIncarnation]

class tracdap.metadata.TableSchema

Schema for a tabular dataset

fields

fields

Type:

List[FieldSchema]

class tracdap.metadata.Tag

Tags are the core informational element of TRAC’s metadata model.

A tag is a set of attributes (key-value pairs) associated with an object definition, intended for storing descriptive and informational data as well as application-level metadata that is not part of the object definition model. Here is an example of a set of tag attributes to illustrate some ways they can be used:

# A descriptive field intended for human users.

display_name: "Customer accounts for March 2020, corrected April 6th"

# A classification that can be used for searching or indexing.
# Client applications can also use this to find datasets of a certain
# type; typically an application will define a set of attributes that are
# "structural", i.e. the application uses those attributes to decide which
# objects to present for certain purposes.

dataset_class: "customer_accounts"

# Properties of an item can be added as individual attributes so they can
# be searched and displayed individually. This avoids the anti-pattern of
# putting multiple attributes into a single name/label field:
#    customer_accounts_mar20_scotland_commercial_approved

accounting_date: (DATE) 2020-03-31
region: "Scotland"
book: "commercial_property"
figures_approved: (BOOLEAN) true

# Attributes can be multi-valued. This can be helpful for applying
# regulatory classifiers, where multiple classifiers may apply to a
# single item.

data_classification: ["confidential", "gdpr_pii", "audited"]

# TRAC records a number of "controlled" attributes, these are set by the
# platform and cannot be modified directly through the metadata API.
# Controlled attributes start with the prefix "trac_".

trac_create_time: (DATETIME) 2020-04-01 10:37:05
trac_create_user_id: "jane.doe"
trac_create_user_name: "Jane Doe"

Tags use immutable versioning in the same way as objects - each version of a tag is immutable and “updating” a tag means creating a new version with one or more modified attributes. Each version of an object has its own series of tags starting at tag version 1.

As an example of this versioning, consider a partitioned dataset with daily account records. Version X of the dataset contains data up to a certain date and might have a tag saying it is signed off. A user/process then adds a new partition with the next day’s data, creating version X+1. In this case, object version X would still be signed off while version X+1 is awaiting approval. When version X+1 is approved, the tag for that version can be “updated”. The application could decide whether to show the most recent version of the data, or an earlier version that has the sign-off attribute set.
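The sign-off scenario above can be sketched in plain Python. This is an illustration only and does not use the tracdap API: the `tag_history` structure, the `signed_off` attribute and both helper functions are hypothetical names chosen for the example.

```python
from typing import Dict, List, Optional

# tag_history[object_version] lists that version's tags in order,
# so index 0 is tag version 1. "signed_off" is a hypothetical attribute.
tag_history: Dict[int, List[dict]] = {
    1: [{"signed_off": False}, {"signed_off": True}],  # version X, approved
    2: [{"signed_off": False}],                        # version X+1, pending
}


def update_tag(history: Dict[int, List[dict]], object_version: int, changes: dict) -> int:
    """Append a new tag version; earlier tag versions are left untouched."""
    tags = history[object_version]
    tags.append({**tags[-1], **changes})
    return len(tags)  # the new tag version number


def latest_signed_off(history: Dict[int, List[dict]]) -> Optional[int]:
    """Return the newest object version whose latest tag is signed off."""
    for object_version in sorted(history, reverse=True):
        if history[object_version][-1].get("signed_off") is True:
            return object_version
    return None


before = latest_signed_off(tag_history)  # object version 1 (X)
new_tag_version = update_tag(tag_history, 2, {"signed_off": True})
after = latest_signed_off(tag_history)   # object version 2 (X+1)
```

Each object version keeps its own tag series, so approving version X+1 adds tag version 2 for that object version and leaves the tags of version X unchanged.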

attrs

Tag attributes are key-value pairs where the value is a metadata Value.

Attribute values are restricted to primitive types (which are interpreted as single-valued attributes) or arrays of primitive types (which are interpreted as multi-valued attributes). Any attribute may be single- or multi-valued, except BOOLEAN attributes which are always single-valued.

An attribute may change from single- to multi-valued or vice versa when a tag is updated, e.g. if a classification is added or removed. An array containing a single item is treated as a single-valued attribute, i.e. there is no distinction between a single value and an array of one item. Single-valued attributes are always returned as primitive types when querying the metadata API.

Single- and multi-valued attributes have different search semantics. For example, inequalities are not defined on multi-valued attributes. See SearchParameters for more details.
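As a rough illustration of the single- and multi-valued rules, the sketch below uses plain Python values instead of metadata Values. The helper `normalise_attr` is hypothetical and not part of the tracdap API; it only mirrors the two documented rules that a one-item array is equivalent to a single value and that BOOLEAN attributes are always single-valued.

```python
from typing import List, Union

Primitive = Union[bool, int, float, str]
AttrValue = Union[Primitive, List[Primitive]]


def normalise_attr(value: AttrValue) -> AttrValue:
    """Hypothetical helper: collapse a one-item list to a single value."""
    if isinstance(value, list):
        if len(value) == 1:
            # No distinction between a single value and an array of one item
            return value[0]
        if any(isinstance(item, bool) for item in value):
            raise ValueError("BOOLEAN attributes are always single-valued")
    return value


single = normalise_attr(["confidential"])
multi = normalise_attr(["confidential", "gdpr_pii", "audited"])
```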

See also

SearchParameters

Type:

Dict[str, Value]

definition

The object definition that the tag is associated with.

Sometimes the definition may be omitted, for example the results of a metadata search include only headers and attributes.

See also

ObjectDefinition

Type:

Optional[ObjectDefinition]

header

The tag header uniquely identifies the current tag and the object it is associated with.

See also

TagHeader

Type:

TagHeader

class tracdap.metadata.TagHeader

A tag header describes the identity and version of an object.

See also

Tag, ObjectDefinition

isLatestObject

isLatest flag for the object the tag is associated with.

Type:

bool

isLatestTag

isLatest flag for the tag.

Type:

bool

objectId

Object ID of the object this tag is associated with.

Object IDs are UUIDs (RFC4122, https://www.ietf.org/rfc/rfc4122.txt)

Type:

str

objectTimestamp

Timestamp for when this version of the object was created.

Type:

DatetimeValue

objectType

Object type of the object this tag is associated with.

See also

ObjectType

Type:

ObjectType

objectVersion

Version of the object this tag is associated with.

Type:

int

tagTimestamp

Timestamp for when this version of the tag was created.

Type:

DatetimeValue

tagVersion

Version of this tag.

Type:

int

class tracdap.metadata.TagOperation(*args, **kwds)

Bases: enum.Enum

Enumeration of available TagUpdate operations.

See also

TagUpdate

APPEND_ATTR = 4

Append one or more values to an existing attribute, fail if the attribute does not exist.

The existing attribute may be single- or multi-valued and the append operation may add one value or multiple values (i.e. all combinations are permitted). The appended value(s) must be of the same basic type as the existing value(s).

CLEAR_ALL_ATTR = 6

Remove all the attributes from a tag.

This operation does not affect controlled attributes, which are still managed by TRAC according to its normal rules.

CREATE_ATTR = 2

Add an attribute to a tag, fail if the attribute already exists.

CREATE_OR_APPEND_ATTR = 1

Add an attribute to a tag or append to it if it already exists.

If the attribute does not exist it will be created using CREATE_ATTR, otherwise it will be appended to using APPEND_ATTR.

CREATE_OR_REPLACE_ATTR = 0

Add an attribute to a tag or replace it if it already exists.

This is the default operation if no operation is specified. If the attribute does not exist it will be created using CREATE_ATTR, otherwise it will be replaced using REPLACE_ATTR.

DELETE_ATTR = 5

Remove an attribute from a tag, fail if the attribute does not exist.

REPLACE_ATTR = 3

Replace an attribute on a tag, fail if the attribute does not exist.

When replacing an attribute, the new attribute must be of the same basic type as the old one. It is allowed to replace a single-valued attribute with a multi-valued one and vice-versa (this is not considered to be changing the basic type).

Changing the type of attributes is not recommended because it is likely to confuse applications that refer to those attributes. If you really need to change the type of an attribute (e.g. to correct an error), use DELETE_ATTR followed by CREATE_ATTR.
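The operation semantics described for this enum can be sketched against a plain dict of attributes. This is an illustration under stated assumptions, not the platform's implementation: the `TagOperation` class below is a local stand-in that copies the documented values, `apply_update` and its helpers are hypothetical, and Python types stand in for TRAC basic types.

```python
import enum
from typing import Any, Dict, List, Optional


class TagOperation(enum.Enum):
    # Local stand-in mirroring the documented enum values
    CREATE_OR_REPLACE_ATTR = 0
    CREATE_OR_APPEND_ATTR = 1
    CREATE_ATTR = 2
    REPLACE_ATTR = 3
    APPEND_ATTR = 4
    DELETE_ATTR = 5
    CLEAR_ALL_ATTR = 6


def _as_list(value: Any) -> List[Any]:
    return list(value) if isinstance(value, list) else [value]


def _collapse(values: List[Any]) -> Any:
    # An array of one item is treated as a single value
    return values[0] if len(values) == 1 else values


def _check_same_type(old: Any, new: Any) -> None:
    # Assumption: Python types approximate TRAC basic types
    if {type(v) for v in _as_list(old)} != {type(v) for v in _as_list(new)}:
        raise TypeError("new value must match the existing basic type")


def apply_update(attrs: Dict[str, Any], operation: TagOperation,
                 attr_name: Optional[str] = None, value: Any = None) -> Dict[str, Any]:
    """Return a new attribute dict; the input dict is not modified."""
    op = TagOperation
    result = dict(attrs)
    exists = attr_name in result

    if operation == op.CLEAR_ALL_ATTR:
        # Controlled attributes (prefix "trac_") are not affected
        return {k: v for k, v in result.items() if k.startswith("trac_")}

    # The combined operations delegate to the basic ones
    if operation == op.CREATE_OR_REPLACE_ATTR:
        operation = op.REPLACE_ATTR if exists else op.CREATE_ATTR
    elif operation == op.CREATE_OR_APPEND_ATTR:
        operation = op.APPEND_ATTR if exists else op.CREATE_ATTR

    if operation == op.CREATE_ATTR:
        if exists:
            raise KeyError(f"attribute already exists: {attr_name}")
        result[attr_name] = _collapse(_as_list(value))
    else:
        if not exists:
            raise KeyError(f"attribute does not exist: {attr_name}")
        if operation == op.DELETE_ATTR:
            del result[attr_name]
        elif operation == op.REPLACE_ATTR:
            _check_same_type(result[attr_name], value)
            result[attr_name] = _collapse(_as_list(value))
        elif operation == op.APPEND_ATTR:
            _check_same_type(result[attr_name], value)
            result[attr_name] = _collapse(_as_list(result[attr_name]) + _as_list(value))
    return result


attrs = {"region": "Scotland", "trac_create_user_id": "jane.doe"}
step1 = apply_update(attrs, TagOperation.CREATE_OR_APPEND_ATTR, "data_classification", "confidential")
step2 = apply_update(step1, TagOperation.APPEND_ATTR, "data_classification", ["gdpr_pii", "audited"])
cleared = apply_update(step2, TagOperation.CLEAR_ALL_ATTR)
```

Returning a new dict for every update loosely follows the immutable versioning of tags, where an update creates a new tag version instead of changing the old one.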

class tracdap.metadata.TagSelector

A tag selector describes the selection of a unique object at a point in time.

A tag selector refers to a single object ID and provides criteria for selecting the object version and tag version. The available selection criteria are:

  • Select an explicit version number

  • Select the latest available version

  • Select the version that was live at a specific point in time

A selector for an explicit version number will always match that exact version number. These “fixed” types of selector can be used to refer to elements of a repeatable job, because the versions they refer to will never change.

A selector for the latest version will select different versions over time, as they become available. These “variable” types of selector can be used by client applications that want to query the latest state of an object. If a job is set up using variable selectors, TRAC will convert them to fixed selectors for the particular versions that were selected before saving the job definition.

Criteria for object versions and tag versions can be “mixed and matched”, so e.g. latestObject = true with tagVersion = 1 is allowed.
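A minimal sketch of how the three criteria might resolve against a version history is shown below. The names `VersionCriteria` and `resolve_version` are hypothetical and this is not how TRAC is implemented. The sketch assumes that exactly one criterion is supplied per selection and that a version is live from its creation until the next version is created.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class VersionCriteria:
    # Hypothetical stand-in for one half of a selector (object or tag)
    version: Optional[int] = None
    latest: Optional[bool] = None
    as_of: Optional[datetime] = None


def resolve_version(history: List[Tuple[int, datetime]], criteria: VersionCriteria) -> int:
    """Resolve criteria against (version, created) pairs."""
    supplied = [c for c in (criteria.version, criteria.latest, criteria.as_of) if c is not None]
    if len(supplied) != 1:
        # Assumption of this sketch: exactly one criterion per selection
        raise ValueError("supply exactly one selection criterion")

    if criteria.version is not None:
        # Fixed selector: always matches that exact version
        if criteria.version not in [v for v, _ in history]:
            raise LookupError(f"no such version: {criteria.version}")
        return criteria.version

    if criteria.latest is not None:
        if criteria.latest is not True:
            raise ValueError("if the latest flag is specified it must be true")
        # Variable selector: the answer changes as versions are added
        return max(v for v, _ in history)

    live = [v for v, created in history if created <= criteria.as_of]
    if not live:
        raise LookupError("no version existed at that time")
    return max(live)


history = [
    (1, datetime(2020, 3, 1, tzinfo=timezone.utc)),
    (2, datetime(2020, 4, 1, tzinfo=timezone.utc)),
    (3, datetime(2020, 5, 1, tzinfo=timezone.utc)),
]
explicit = resolve_version(history, VersionCriteria(version=2))
latest = resolve_version(history, VersionCriteria(latest=True))
mid_april = resolve_version(
    history, VersionCriteria(as_of=datetime(2020, 4, 15, tzinfo=timezone.utc)))
```

Because object and tag criteria can be mixed, a caller of this sketch would resolve the object version and the tag version with two separate calls.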

See also

Tag, TagHeader

latestObject

Select the latest version of the object (the version that is live now).

If this flag is specified, it must be set to true.

Type:

Optional[bool]

latestTag

Select the latest version of the tag (the version that is live now).

If this flag is specified, it must be set to true.

Type:

Optional[bool]

objectAsOf

Select the version of the object that was live as of a particular point in time. Represented using ISO 8601.

Type:

Optional[DatetimeValue]

objectId

Object ID of the tag being selected.

Object IDs are UUIDs (RFC4122, https://www.ietf.org/rfc/rfc4122.txt)

Type:

str

objectType

Object type of the tag being selected.

See also

ObjectType

Type:

ObjectType

objectVersion

Select an explicit version of the object.

Type:

Optional[int]

tagAsOf

Select the version of the tag that was live as of a particular point in time. Represented using ISO 8601.

Type:

Optional[DatetimeValue]

tagVersion

Select an explicit version of the tag.

Type:

Optional[int]

class tracdap.metadata.TagUpdate

A tag update is a request for a single update operation on a tag.

Tag updates can be supplied to TRAC via an API call to request updates to a tag. They may also be included in TRAC policy objects or client application logic, to describe a set of operations that is performed in response to a particular action.

See also

MetadataWriteRequest

attrName

Name of the attribute this update refers to.

This field must be supplied for operations that refer to a single attribute, otherwise it should be left blank.

Type:

str

operation

The operation requested in this update

See also

TagOperation

Type:

TagOperation

value

Attribute value to use for this update.

This field must be supplied for operations that use a value, otherwise it should be omitted.

See also

Value

Type:

Optional[Value]
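The rules for attrName and value can be sketched as a small pre-flight check. The function `check_update` is hypothetical and uses operation names as plain strings. Which operations count as using a value is inferred from the TagOperation descriptions, and the sketch treats the "should be left blank" guidance as a hard rule, which may be stricter than the platform.

```python
from typing import Any, Optional

# Operation names as documented for TagOperation
SINGLE_ATTR_OPS = {
    "CREATE_OR_REPLACE_ATTR", "CREATE_OR_APPEND_ATTR", "CREATE_ATTR",
    "REPLACE_ATTR", "APPEND_ATTR", "DELETE_ATTR",
}
VALUE_OPS = SINGLE_ATTR_OPS - {"DELETE_ATTR"}  # inferred, see lead-in
ALL_OPS = SINGLE_ATTR_OPS | {"CLEAR_ALL_ATTR"}


def check_update(operation: str = "CREATE_OR_REPLACE_ATTR",
                 attr_name: Optional[str] = None,
                 value: Any = None) -> None:
    """Raise ValueError if the fields do not fit the operation."""
    if operation not in ALL_OPS:
        raise ValueError(f"unknown operation: {operation}")
    if (operation in SINGLE_ATTR_OPS) != bool(attr_name):
        raise ValueError(f"attrName does not fit {operation}")
    if (operation in VALUE_OPS) != (value is not None):
        raise ValueError(f"value does not fit {operation}")


check_update("CREATE_ATTR", "region", "Scotland")  # name and value
check_update("DELETE_ATTR", "region")              # name only
check_update("CLEAR_ALL_ATTR")                     # neither
```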

class tracdap.metadata.TenantInfo

Information about a tenant that is set up on the TRAC platform.

description

A short description of the tenant, suitable for displaying to users in lists.

Type:

str

tenantCode

Unique code used to identify the tenant, required by most API calls.

Type:

str

class tracdap.metadata.TypeDescriptor

A type descriptor describes a data type used in the TRAC platform.

For complex types, the descriptor holds a full type description. E.g. for array types, the type being held in the array is described. At a later point, precision fields may be introduced for decimals, or field types for structs.

arrayType

For array types only, describe the type contained in the array.

Type:

Optional[TypeDescriptor]

basicType

The basic type being described.

Type:

BasicType

mapType

For map types only, describe the type contained in the map.

To describe a uniform map, the mapType descriptor must be set to a valid type descriptor; in this case all values in the map must match the type of the descriptor. If mapType is not set, or is present but has basicType = BASIC_TYPE_NOT_SET, then the map is non-uniform and values must be inspected individually to determine their type.
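The uniform and non-uniform cases can be told apart with a check like the one below. It is a sketch only: the descriptor is modelled as a JSON-style dict, not as a tracdap TypeDescriptor object, and the function name `map_is_uniform` is hypothetical.

```python
from typing import Any, Dict, Optional

BASIC_TYPE_NOT_SET = "BASIC_TYPE_NOT_SET"


def map_is_uniform(map_type: Optional[Dict[str, Any]]) -> bool:
    """True if mapType is set and names a concrete basic type."""
    if map_type is None:
        return False
    return map_type.get("basicType", BASIC_TYPE_NOT_SET) != BASIC_TYPE_NOT_SET


uniform = map_is_uniform({"basicType": "DECIMAL"})
unset = map_is_uniform(None)
not_set = map_is_uniform({"basicType": BASIC_TYPE_NOT_SET})
```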

Type:

Optional[TypeDescriptor]

class tracdap.metadata.Value

A value expressed in the TRAC type system.

A value can express a primitive value, or a composite value such as an array. Arbitrary nesting of composite types is permitted, although most functions will limit the set of acceptable types during validation.

Values include a type descriptor field. For primitive values the type descriptor is optional. For composite types the root value must include a full valid type descriptor, i.e. a descriptor that goes down to the leaf types. Sub-values in a composite type are free to omit the type descriptor, even if there are multiple levels of nesting, however any extra descriptors that are provided must also be full and valid.

TRAC will always provide values with the type descriptors “normalised”. This means that a root value will always have a type descriptor (even if it is a primitive type) and sub-values will never have a type descriptor. Application code does not need to send values to TRAC with normalised type descriptors, and doing so is not preferred, because TRAC will always perform normalisation.
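To make the normalised form concrete, here is a sketch using simplified stand-in classes. The `TypeDescriptor` and `Value` dataclasses below are not the tracdap classes: they cover only integers, strings and arrays, hold array items in a plain list, and use strings for basic types. The `normalise` function is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TypeDescriptor:
    # Simplified stand-in: basic types are plain strings here
    basicType: str
    arrayType: Optional["TypeDescriptor"] = None


@dataclass
class Value:
    # Simplified stand-in covering integers, strings and arrays only
    type: Optional[TypeDescriptor] = None
    integerValue: Optional[int] = None
    stringValue: Optional[str] = None
    arrayValue: Optional[List["Value"]] = None


def _strip(value: Value) -> Value:
    """Copy a value without type descriptors, at every nesting level."""
    items = None
    if value.arrayValue is not None:
        items = [_strip(item) for item in value.arrayValue]
    return Value(None, value.integerValue, value.stringValue, items)


def normalise(root: Value) -> Value:
    """Return a copy with a descriptor on the root and none on sub-values."""
    descriptor = root.type
    if descriptor is None:
        if root.arrayValue is not None:
            raise ValueError("the root of a composite value needs a full type descriptor")
        # Primitive root without a descriptor: infer it from the field in use
        descriptor = TypeDescriptor("INTEGER" if root.integerValue is not None else "STRING")
    result = _strip(root)
    result.type = descriptor
    return result


array = Value(
    type=TypeDescriptor("ARRAY", arrayType=TypeDescriptor("INTEGER")),
    arrayValue=[
        Value(type=TypeDescriptor("INTEGER"), integerValue=1),  # extra descriptor
        Value(integerValue=2),                                  # descriptor omitted
    ],
)
normalised_array = normalise(array)
normalised_primitive = normalise(Value(integerValue=5))
```

The sketch does not validate that extra descriptors on sub-values are full and valid; it only drops them.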

See also

TypeDescriptor

arrayValue

An array of Values.

All items in an array must have the same type.

See also

ARRAY

Type:

Optional[ArrayValue]

booleanValue

A boolean value.

Represented natively in both protobuf and JSON.

Type:

Optional[bool]

dateValue

A date value.

See also

DATE, DateValue

Type:

Optional[DateValue]

datetimeValue

A date-time value.

Date-times are represented as strings in ISO 8601 format.

Date-time values support nanosecond precision, however in practice the available precision may be less than this. In particular, tag attributes are always stored at microsecond precision. Values passed into models or used in application code will be limited to the precision supported by date-time types in their own coding language.

Time zone offsets are supported as described in ISO 8601. Date-time values should always include a zone offset to guarantee behaviour. If a zone offset is not supplied, TRAC will normalise incoming values by applying a zone offset; currently the applied offset is UTC.
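Client code can avoid relying on that default by attaching an offset before sending a value. The sketch below uses only the Python standard library. The helper `with_explicit_offset` is hypothetical, and it applies UTC to match the behaviour currently described for TRAC.

```python
from datetime import datetime, timezone


def with_explicit_offset(iso_string: str) -> str:
    """Return an ISO 8601 string that always carries a zone offset."""
    parsed = datetime.fromisoformat(iso_string)
    if parsed.tzinfo is None:
        # No offset supplied: apply UTC, the currently documented default
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.isoformat()


naive = with_explicit_offset("2020-04-01T10:37:05")
aware = with_explicit_offset("2020-04-01T10:37:05+01:00")
```

Python's `datetime` holds microseconds at most, so a value round-tripped this way is limited to microsecond precision.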

Type:

Optional[DatetimeValue]

decimalValue

A decimal value with fixed precision and scale.

See also

DECIMAL

Type:

Optional[DecimalValue]

floatValue

A 64-bit signed floating point value.

Represented natively in both protobuf and JSON.

Type:

Optional[float]

integerValue

A 64-bit signed integer value.

Represented natively in both protobuf and JSON.

In JavaScript and JSON, native integers may be limited to less than the full range of a 64-bit integer. This is because JavaScript has a single Number type that represents both integers and floats. The safe limit for integers that can be expressed without a risk of rounding in JavaScript is Number.MAX_SAFE_INTEGER.
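The rounding risk can be reproduced in Python, since a JavaScript Number is a 64-bit float. The names `MAX_SAFE_INTEGER` and `is_js_safe` below are defined locally for illustration; the constant has the same value as JavaScript's Number.MAX_SAFE_INTEGER, 2^53 - 1.

```python
# Same value as JavaScript's Number.MAX_SAFE_INTEGER
MAX_SAFE_INTEGER = 2**53 - 1


def is_js_safe(n: int) -> bool:
    """True if n survives a round trip through a 64-bit float unchanged."""
    return -MAX_SAFE_INTEGER <= n <= MAX_SAFE_INTEGER


# Above the safe limit, distinct integers collapse to the same float
collides = float(2**53 + 1) == float(2**53)
safe = is_js_safe(9007199254740991)
unsafe = is_js_safe(2**63 - 1)  # largest 64-bit signed integer
```

One common way to handle larger values in JSON clients is to carry them as strings, although whether that fits a given TRAC API is not covered here.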

Type:

Optional[int]

mapValue

A map of string keys to Values.

Maps may be uniform (holding all the same value type) or non-uniform (holding mixed value types) depending on the type descriptor of this Value.

See also

MAP, TypeDescriptor

Type:

Optional[MapValue]

stringValue

A string value.

Protobuf encodes strings as UTF-8 on the wire. String values in JSON requests are also encoded as UTF-8 on the wire, as per RFC 8259. When reading or writing string values in application code, they will be presented in the normal encoding scheme of the application’s coding language.

Type:

Optional[str]

type

Type descriptor for the current value.

A type descriptor is always required for values that are the root of a composite type, otherwise it is optional.

Type:

TypeDescriptor