Chapter 3 - Using Data¶
This tutorial is based on the using_data.js example, which can be found in the TRAC GitHub Repository under examples/apps/javascript.
Connecting to the data API¶
In order to use the data API, we will need an RPC transport and an instance of the API class. Here is how to set them up for a browser-based app:
// Create the Data API
const dataTransport = tracdap.setup.transportForBrowser(tracdap.api.TracDataApi);
const dataApi = new tracdap.api.TracDataApi(dataTransport);
For a Node.js or standalone environment, create a connector pointing at your TRAC instance:
// Create the Data API
const dataTransport = tracdap.setup.transportForTarget(tracdap.api.TracDataApi, "http", "localhost", 8080);
const dataApi = new tracdap.api.TracDataApi(dataTransport);
Saving data from files¶
Suppose the user has a data file on disk that they want to load into the platform. For simplicity, let’s assume it is already in CSV format. If the file is reasonably small we can load it into memory as a blob, using either the FileReader API in a browser or the fs API for Node.js environments.
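For example, a minimal sketch of both approaches (the function names here are illustrative, not part of the example):

// Browser: read a File object (e.g. from an <input type="file"> element) as raw bytes
function loadCsvInBrowser(file) {
    return new Promise((resolve, reject) => {
        const reader = new FileReader();
        reader.onload = () => resolve(new Uint8Array(reader.result));
        reader.onerror = () => reject(reader.error);
        reader.readAsArrayBuffer(file);
    });
}

// Node.js: fs.promises.readFile returns the file content as a Buffer
const fs = require("fs");
function loadCsvInNode(path) {
    return fs.promises.readFile(path);
}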
In order to save data to TRAC we will need to supply a schema. This can be done either by providing a full SchemaDefinition to be embedded with the dataset, or by providing the ID of an existing schema in the TRAC metadata store. In this example we use the latter approach; see lessons 1 & 2 for examples of how to create a schema and search for its ID.
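For reference, the first approach embeds a full schema in the write request, using the schema field in place of schemaId. A minimal sketch, assuming the message and enum names from the metadata lessons (the fields shown are illustrative):

// A sketch of an embedded schema, to be set as "schema" on the DataWriteRequest
const embeddedSchema = tracdap.metadata.SchemaDefinition.create({
    schemaType: tracdap.metadata.SchemaType.TABLE,
    table: {
        fields: [
            { fieldName: "customer_id", fieldType: tracdap.metadata.BasicType.STRING, businessKey: true },
            { fieldName: "credit_limit", fieldType: tracdap.metadata.BasicType.DECIMAL }
        ]
    }
});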
Once both schema and data are available, we can create a DataWriteRequest.
export function saveDataToTrac(schemaId, csvData) {

    const request = tracdap.api.DataWriteRequest.create({

        tenant: "ACME_CORP",

        schemaId: schemaId,

        format: "text/csv",
        content: csvData,

        tagUpdates: [
            { attrName: "schema_type", value: { stringValue: "customer_records" } },
            { attrName: "business_division", value: { stringValue: "WIDGET_SALES" } },
            { attrName: "description", value: { stringValue: "A month-end snapshot of customer accounts" } },
        ]
    });
Here, schemaId is the TagHeader (or TagSelector) for a schema created earlier. The format field must be a MIME type for a supported data format and the content field contains binary data encoded in that format. Since our csvData blob contains data loaded from a CSV file, we know it is already in the right format.
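If only the raw object ID is available (for example, an ID string saved by an earlier session), an explicit selector can be built instead. A minimal sketch, assuming the TagSelector message from the metadata lessons (the variable schemaObjectId is illustrative):

// A sketch of an explicit selector for the latest version of a schema object
// (schemaObjectId is an assumed UUID string, e.g. from an earlier search)
const schemaSelector = tracdap.metadata.TagSelector.create({
    objectType: tracdap.metadata.ObjectType.SCHEMA,
    objectId: schemaObjectId,
    latestObject: true,
    latestTag: true
});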
When data is saved the platform will create a DATA object in the metadata store to describe the dataset. This DATA object can be indexed and searched for, so we use TagUpdate instructions to apply tag attributes just like any other type of object.
Now the data API can be used to send the new dataset to the platform:
    return dataApi.createSmallDataset(request).then(dataId => {

        console.log(`Created dataset ${dataId.objectId} version ${dataId.objectVersion}`);

        return dataId;
    });
}
Here we used createSmallDataset(), which assumes the content of the dataset is small enough to be sent as a single blob in the content field. Since the data has already been loaded into memory, this approach avoids the complexity of using streaming calls. (An equivalent client-streaming method, createDataset(), is available in the platform API but not currently supported over gRPC-Web.)
The createSmallDataset() method returns the ID of the newly created dataset as a TagHeader, which includes the object type, ID, version and timestamps. In this example we used the promise form of the method; the equivalent call using a callback would be:
dataApi.createSmallDataset(request, (err, dataId) => {
// Handle error or response
});
Loading data from TRAC¶
Now suppose we want to get the dataset back out of TRAC, for example to display it in a web page.
To do this we use a DataReadRequest.
export function loadDataFromTrac(dataId) {

    // Ask for the dataset in JSON format
    const request = tracdap.api.DataReadRequest.create({

        tenant: "ACME_CORP",

        selector: dataId,
        format: "text/json"
    });
As well as supplying a selector for the dataset (in this case the dataId created earlier), we use the format field to say what format the data should come back in, which must be a MIME type for a data format supported by the platform. In this case the request asks for data in JSON format.
Now let’s send the request to the data API:
    return dataApi.readSmallDataset(request).then(response => {

        // Decode JSON into an array of Objects
        const text = new TextDecoder().decode(response.content);
        const data = JSON.parse(text);

        return {schema: response.schema, data: data};
    });
}
Again, by using readSmallDataset() we are assuming that the content of the dataset can fit as a single blob in one response message. For relatively small datasets that will be displayed in a web page, this approach avoids the complexity of streaming calls. An equivalent server-streaming call, readDataset(), is available and supported in the web API package.
In order to use the data that comes back, it needs to be decoded. Since the data is in JSON format this is easily done using a TextDecoder and JSON.parse(), which will create an array of objects. The response also includes the full schema of the dataset; in this example we are returning both the schema and the decoded data.
Exactly how the data is rendered on the screen will depend on the application framework being used. One common approach is to use an “accessor” method, which allows a renderer to access elements of the dataset that it needs to display by row and column index. We can create an accessor function for the decoded dataset like this:
export function displayTable(schema, data) {

    const fields = schema.table.fields;

    const accessor = (data_, row, col) => {
        const fieldName = fields[col].fieldName;
        return data_[row][fieldName];
    }

    renderTable(schema, data, accessor);
}
Here rows are looked up by index; for columns, we must find the field name for the column and then do a dictionary lookup.
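The renderTable() function stands in for whatever rendering logic the application provides. As a plain-DOM illustration (a hypothetical sketch, not part of the example), it might look like this:

// A minimal renderTable sketch using plain DOM APIs (hypothetical)
function renderTable(schema, data, accessor) {

    const fields = schema.table.fields;
    const table = document.createElement("table");

    // Header row with one cell per field
    const header = table.insertRow();
    fields.forEach(field => header.insertCell().textContent = field.fieldName);

    // One row per record, with cells filled in via the accessor
    for (let row = 0; row < data.length; row++) {
        const tr = table.insertRow();
        for (let col = 0; col < fields.length; col++)
            tr.insertCell().textContent = accessor(data, row, col);
    }

    document.body.appendChild(table);
}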
Saving data from memory¶
To continue the example, let’s suppose the data has been displayed on screen, the user has edited some values and now wants to save their changes as a new version of the dataset. The modified data exists in memory as an array of JavaScript objects, so we need to encode it before it can be sent back. To encode the data using JSON:
function saveDataFromMemory(schemaId, originalDataId, newData) {

    // Encode array of JavaScript objects as JSON
    const json = JSON.stringify(newData);
    const bytes = new TextEncoder().encode(json);
Now we need to create a DataWriteRequest for the update:
    const request = tracdap.api.DataWriteRequest.create({

        tenant: "ACME_CORP",

        // The original version that is being updated
        priorVersion: originalDataId,

        // Schema, format and content are provided as normal
        schemaId: schemaId,
        format: "text/json",
        content: bytes,

        // Existing tags are retained during updates
        // Use tag updates if tags need to be added, removed or altered
        tagUpdates: [
            { attrName: "change_description", value: { stringValue: "Increase limit for customer A36456" } }
        ]
    });
Since we are updating an existing dataset, the priorVersion field must be used to specify the original object ID. This works the same way as for metadata updates: only the latest version of a dataset can be updated. So, if we are trying to update version 1 to create version 2, our update will fail if someone has already created version 2 before us. In this case, the user would have to reload the latest version and try the update again.
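When the update is sent (see below), an application might surface that conflict like this (a hedged sketch; the error handling details are assumptions, not part of the example):

// Sketch: catch a failed update, e.g. after a version conflict
dataApi.updateSmallDataset(request)
    .then(dataId => console.log(`Updated to version ${dataId.objectVersion}`))
    .catch(err => {
        // On a conflict, reload the latest version of the dataset,
        // re-apply the user's edits and submit a fresh update request
        console.log(`Update failed: ${err.message}`);
    });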
The schemaId, format and content fields are specified as normal. In this example the schema has not changed, so the schema ID will be the same. (Schema changes are restricted between dataset versions to ensure each version is backwards-compatible with the versions that came before.)
Since this is an update operation, the existing tags will be carried forward onto the new version of the object. We only need to specify tags we want to change in the tagUpdates field. In this example we add one new tag to describe the change.
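Tag updates can also alter or remove existing attributes by setting an explicit operation. A minimal sketch, assuming the TagOperation enum from the metadata lessons (the attribute names here are illustrative):

// A sketch of tag updates with explicit operations (not part of the example)
const tagUpdates = [

    // Replace the value of an existing attribute
    { operation: tracdap.metadata.TagOperation.REPLACE_ATTR,
      attrName: "description", value: { stringValue: "Updated month-end snapshot" } },

    // Remove an attribute from the new version of the tag
    { operation: tracdap.metadata.TagOperation.DELETE_ATTR, attrName: "schema_type" }
];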
To send the update to the platform, we use updateSmallDataset():
    return dataApi.updateSmallDataset(request).then(dataId => {

        console.log(`Updated dataset ${dataId.objectId} to version ${dataId.objectVersion}`);

        return dataId;
    });
}
Again we are assuming that the content of the dataset can be sent as a single blob in one message. (An equivalent client-streaming method, updateDataset(), is available in the platform API but not currently supported over gRPC-Web.)
The method returns the ID for the updated version of the dataset as a TagHeader.