Chapter 3 - Using Data¶
This tutorial is based on the using_data.js example, which can be found in the TRAC GitHub Repository under examples/apps/javascript.
Connecting to the data API¶
In order to use the data API, we will need an RPC transport and an instance of the API class. Here is how to set them up for a browser-based app:
// Create the Data API
const dataTransport = tracdap.setup.transportForBrowser(tracdap.api.TracDataApi);
const dataApi = new tracdap.api.TracDataApi(dataTransport);
For a Node.js or standalone environment, create a connector pointing at your TRAC instance:
// Create the Data API
const dataTransport = tracdap.setup.transportForTarget(tracdap.api.TracDataApi, "http", "localhost", 8080);
const dataApi = new tracdap.api.TracDataApi(dataTransport);
Saving data from files¶
Suppose the user has a data file on disk that they want to load into the platform. For simplicity, let’s assume it is already in CSV format. If the file is reasonably small we can load it into memory as a blob, using either the FileReader API in a browser or the fs API for Node.js environments.
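For example, a minimal sketch of both approaches (the function names here are illustrative, not part of the example):

// Browser: read a File object (e.g. from an <input type="file"> element) as raw bytes
function loadCsvInBrowser(file) {
    return new Promise((resolve, reject) => {
        const reader = new FileReader();
        reader.onload = () => resolve(new Uint8Array(reader.result));
        reader.onerror = () => reject(reader.error);
        reader.readAsArrayBuffer(file);
    });
}

// Node.js: fs.promises.readFile returns the file content as a Buffer
const fs = require("fs");
function loadCsvInNode(path) {
    return fs.promises.readFile(path);
}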
In order to save data to TRAC we will need to supply a schema. This can be done either by providing a full SchemaDefinition to be embedded with the dataset, or by providing the ID of an existing schema in the TRAC metadata store. In this example we use the latter approach; see lessons 1 & 2 for examples of how to create a schema and search for its ID.
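For reference, the first approach embeds a full schema in the write request, using the schema field in place of schemaId. A minimal sketch, assuming the message and enum names from the metadata lessons (the fields shown are illustrative):

// A sketch of an embedded schema, to be set as "schema" on the DataWriteRequest
const embeddedSchema = tracdap.metadata.SchemaDefinition.create({
    schemaType: tracdap.metadata.SchemaType.TABLE,
    table: {
        fields: [
            { fieldName: "customer_id", fieldType: tracdap.metadata.BasicType.STRING, businessKey: true },
            { fieldName: "credit_limit", fieldType: tracdap.metadata.BasicType.DECIMAL }
        ]
    }
});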
Once both schema and data are available, we can create a DataWriteRequest.
export function saveDataToTrac(schemaId, csvData) {

    const request = tracdap.api.DataWriteRequest.create({

        tenant: "ACME_CORP",

        schemaId: schemaId,

        format: "text/csv",
        content: csvData,

        tagUpdates: [
            { attrName: "schema_type", value: { stringValue: "customer_records" } },
            { attrName: "business_division", value: { stringValue: "WIDGET_SALES" } },
            { attrName: "description", value: { stringValue: "A month-end snapshot of customer accounts" } },
        ]
    });
Here, schemaId is the TagHeader (or TagSelector) for a schema created earlier. The format field must be a MIME type for a supported data format and the content field contains binary data encoded in that format. Since our csvData blob contains data loaded from a CSV file, we know it is already in the right format.
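If only the raw object ID is available (for example, an ID string saved by an earlier session), an explicit selector can be built instead. A minimal sketch, assuming the TagSelector message from the metadata lessons (the variable schemaObjectId is illustrative):

// A sketch of an explicit selector for the latest version of a schema object
// (schemaObjectId is an assumed UUID string, e.g. from an earlier search)
const schemaSelector = tracdap.metadata.TagSelector.create({
    objectType: tracdap.metadata.ObjectType.SCHEMA,
    objectId: schemaObjectId,
    latestObject: true,
    latestTag: true
});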
When data is saved the platform will create a DATA object in the metadata store to describe the dataset. This DATA object can be indexed and searched for, so we use TagUpdate instructions to apply tag attributes just like any other type of object.
Now the data API can be used to send the new dataset to the platform:
    return dataApi.createSmallDataset(request).then(dataId => {

        console.log(`Created dataset ${dataId.objectId} version ${dataId.objectVersion}`);

        return dataId;
    });
}
Here we used createSmallDataset(), which assumes the content of the dataset is small enough to be sent as a single blob in the content field. Since the data has already been loaded into memory, this approach avoids the complexity of using streaming calls. (An equivalent client-streaming method, createDataset(), is available in the platform API but not currently supported over gRPC-Web.)
The createSmallDataset() method returns the ID of the newly created dataset as a TagHeader, which includes the object type, ID, version and timestamps. In this example we used the promise form of the method; the equivalent call using a callback would be:
dataApi.createSmallDataset(request, (err, dataId) => {
// Handle error or response
});
Loading data from TRAC¶
Now suppose we want to get the dataset back out of TRAC, for example to display it in a web page.
To do this we use a DataReadRequest.
export function loadDataFromTrac(dataId) {

    // Ask for the dataset in JSON format
    const request = tracdap.api.DataReadRequest.create({

        tenant: "ACME_CORP",

        selector: dataId,
        format: "text/json"
    });
As well as supplying a selector for the dataset (in this case the dataId created earlier), we use the format field to say what format the data should come back in, which must be a MIME type for a data format supported by the platform. In this case the request asks for data in JSON format.
Now let’s send the request to the data API:
    return dataApi.readSmallDataset(request).then(response => {

        // Decode JSON into an array of Objects
        const text = new TextDecoder().decode(response.content);
        const data = JSON.parse(text);

        return {schema: response.schema, data: data};
    });
}
Again, by using readSmallDataset() we are assuming that the content of the dataset can fit as a single blob in one response message. For relatively small datasets that will be displayed in a web page, this approach avoids the complexity of streaming calls. An equivalent server-streaming call, readDataset(), is available and supported in the web API package.
In order to use the data that comes back, it needs to be decoded. Since the data is in JSON format this is easily done using a TextDecoder and JSON.parse(), which will create an array of objects. The response also includes the full schema of the dataset; in this example we are returning both the schema and the decoded data.
Exactly how the data is rendered on the screen will depend on the application framework being used. One common approach is to use an “accessor” method, which allows a renderer to access elements of the dataset that it needs to display by row and column index. We can create an accessor function for the decoded dataset like this:
export function displayTable(schema, data) {

    const fields = schema.table.fields;

    const accessor = (data_, row, col) => {
        const fieldName = fields[col].fieldName;
        return data_[row][fieldName];
    }

    renderTable(schema, data, accessor);
}
Here rows are looked up by index; for columns, we must find the field name for the column and then do a dictionary lookup.
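The renderTable() function stands in for whatever rendering logic the application provides. As a plain-DOM illustration (a hypothetical sketch, not part of the example), it might look like this:

// A minimal renderTable sketch using plain DOM APIs (hypothetical)
function renderTable(schema, data, accessor) {

    const fields = schema.table.fields;
    const table = document.createElement("table");

    // Header row with one cell per field
    const header = table.insertRow();
    fields.forEach(field => header.insertCell().textContent = field.fieldName);

    // One row per record, with cells filled in via the accessor
    for (let row = 0; row < data.length; row++) {
        const tr = table.insertRow();
        for (let col = 0; col < fields.length; col++)
            tr.insertCell().textContent = accessor(data, row, col);
    }

    document.body.appendChild(table);
}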
Saving data from memory¶
To continue the example, let’s suppose the data has been displayed on screen, the user has edited some values and now wants to save their changes as a new version of the dataset. The modified data exists in memory as an array of JavaScript objects, so we need to encode it before it can be sent back. To encode the data using JSON:
function saveDataFromMemory(schemaId, originalDataId, newData) {

    // Encode array of JavaScript objects as JSON
    const json = JSON.stringify(newData);
    const bytes = new TextEncoder().encode(json);
Now we need to create a DataWriteRequest for the update:
    const request = tracdap.api.DataWriteRequest.create({

        tenant: "ACME_CORP",

        // The original version that is being updated
        priorVersion: originalDataId,

        // Schema, format and content are provided as normal
        schemaId: schemaId,
        format: "text/json",
        content: bytes,

        // Existing tags are retained during updates
        // Use tag updates if tags need to be added, removed or altered
        tagUpdates: [
            { attrName: "change_description", value: { stringValue: "Increase limit for customer A36456" } }
        ]
    });
Since we are updating an existing dataset, the priorVersion field must be used to specify the original object ID. This works the same way as for metadata updates: only the latest version of a dataset can be updated. So, if we are trying to update version 1 to create version 2, our update will fail if someone has already created version 2 before us. In this case, the user would have to reload the latest version and try the update again.
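When the update is sent (see below), an application might surface that conflict like this (a hedged sketch; the error handling details are assumptions, not part of the example):

// Sketch: catch a failed update, e.g. after a version conflict
dataApi.updateSmallDataset(request)
    .then(dataId => console.log(`Updated to version ${dataId.objectVersion}`))
    .catch(err => {
        // On a conflict, reload the latest version of the dataset,
        // re-apply the user's edits and submit a fresh update request
        console.log(`Update failed: ${err.message}`);
    });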
The schemaId, format and content fields are specified as normal. In this example the schema has not changed, so the schema ID will be the same. (Schema changes are restricted between dataset versions to ensure each version is backwards-compatible with the versions that came before.)
Since this is an update operation, the existing tags will be carried forward onto the new version of the object. We only need to specify tags we want to change in the tagUpdates field. In this example we add one new tag to describe the change.
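Tag updates can also alter or remove existing attributes by setting an explicit operation. A minimal sketch, assuming the TagOperation enum from the metadata lessons (the attribute names here are illustrative):

// A sketch of tag updates with explicit operations (not part of the example)
const tagUpdates = [

    // Replace the value of an existing attribute
    { operation: tracdap.metadata.TagOperation.REPLACE_ATTR,
      attrName: "description", value: { stringValue: "Updated month-end snapshot" } },

    // Remove an attribute from the new version of the tag
    { operation: tracdap.metadata.TagOperation.DELETE_ATTR, attrName: "schema_type" }
];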
To send the update to the platform, we use updateSmallDataset():
    return dataApi.updateSmallDataset(request).then(dataId => {

        console.log(`Updated dataset ${dataId.objectId} to version ${dataId.objectVersion}`);

        return dataId;
    });
}
Again we are assuming that the content of the dataset can be sent as a single blob in one message. (An equivalent client-streaming method, updateDataset(), is available in the platform API but not currently supported over gRPC-Web.)
The method returns the ID for the updated version of the dataset as a TagHeader.