TIC 4.0

JSON to FLAT 2022.004

Introduction

TIC 4.0 uses JSON as its default format because it can express an array of subjects (for several sub-subjects) or of observed properties (for different combinations of timestamps, pom, pomt, names, etc.). JSON is also a common payload format for REST APIs and MQTT, which are widely used to share data or publish IoT data. The hierarchical structure makes these arrays highly flexible. Other hierarchical formats, such as XML, are possible in TIC 4.0, but we recommend using JSON as the default.

Although JSON is very powerful, low-level and program-level data processing often requires other formats that are not compatible with hierarchical structures. We therefore need a way to express the same content in a flat format. It is important to be able to identify the array elements, and their ids if necessary, unambiguously.

The hierarchical structure allows a single message to include, without limits, many timestamps for many concepts per subject, and many subjects under the main subject.

Hierarchical data structure example

With a flat format we are limited to one timestamp per message, and the variable name must carry the critical information that identifies the subject, concept, observed property and point of measurement it relates to.

Two flat messages equivalent to one JSON

You can download the example here

A flat format is more suitable for low-level implementation such as PLCs or software code. JSON is better for sharing extensive messages. Both are valid in TIC 4.0.

This document contains the rules to convert the JSON structure into a Flat one (2022.005 will cover Flat to JSON).

JSON to Flat: the general concept

The main goal of the conversion is to ensure that all the information in the JSON message is carried over to the flat messages. Because flat messages have limitations, one JSON message may require several flat messages. Because the JSON format can include arrays of data, the flat format must include a way to identify each array element. This is done by encoding the hierarchy and array information in the flat variable name.

Basic actions to transform JSON to Flat:

  1. Split the JSON into multiple flat messages (at least one per timestamp).

  2. Add an extension to the attribute name to specify what array value it refers to (there cannot be two variables with the same name).

  3. Keep all metadata in each message (to avoid losing metadata information related to the values).

Depending on the use made of the data, it will be more convenient to produce more messages split by time and by subjects, or to reduce the number of messages by adding more array information.

To illustrate, consider a JSON message with data for 10 CHEs: you can create one flat message per timestamp that includes all 10 CHEs, or you can split further and create one flat message per CHE and timestamp.

The first option will reduce the number of messages but will create a specific variable name for each CHE. The second option will create a lot of messages, but all of them will have the same variable names, which is very important if you want to compare the CHE values.
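The trade-off can be made concrete with a little arithmetic. The sketch below is only illustrative: the function name is hypothetical, the figure of 3 timestamps is invented for the example, and it assumes that every CHE reports at every timestamp.

```python
def count_flat_messages(timestamps: int, subjects: int, split_by_subject: bool) -> int:
    """Number of flat messages produced from one JSON message.

    Assumes every subject reports at every timestamp (a simplification).
    """
    if split_by_subject:
        # Option 2: one flat message per subject and timestamp.
        return timestamps * subjects
    # Option 1: one flat message per timestamp, all subjects inside it.
    return timestamps


# 10 CHEs (from the text) observed at 3 timestamps (illustrative figure).
print(count_flat_messages(3, 10, split_by_subject=False))  # 3
print(count_flat_messages(3, 10, split_by_subject=True))   # 30
```

With the first option each of the 3 messages needs a distinct variable name per CHE; with the second, all 30 messages share the same variable names.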

Clarifications

It is important to understand that one message with a hierarchical structure (JSON or similar) is transformed into one or more flat messages (and vice versa). The number of messages depends on the number of different timestamps and on the classification per subject (if required).

It is also relevant to distinguish between “metadata” and “value”. Metadata is data that does not change over time, whereas a “value” always needs a timestamp to define when it is valid.

The semantics combine 6 basic elements: HEADER, SUBJECT, CONCEPT, OBSERVED PROPERTY, POINT OF MEASUREMENT and VALUE to represent a unique reality. The TIC4.0 JSON format uses arrays for the SUBJECT and OBSERVED PROPERTY:

  • The SUBJECT never has timestamps.

  • The OBSERVED PROPERTY always has a timestamp.

  • The CONCEPT never has arrays.

  • The POINT OF MEASUREMENT and VALUE (with their units) are always part of the observed properties, so they cannot contain an array (but can be part of one).

Objects and values in JSON:

{ "msg": { "id": "001" } }

In this JSON example, “msg” is an object and “id” is a value in JSON terms. Objects contain values, and the values carry a payload with “data”. TIC4.0 distinguishes between two types of JSON “values”: “metadata” and “value”. The “metadata” does not change over time, whereas the “value” changes (it requires a timestamp).

Rule 1: One flat message per timestamp

A single message will be generated for each of the different timestamps present in the source JSON message. 

  1. The header of the message will be copied into each of the flat messages, adding the “msg.sample” enumerator.

  2. Every JSON object that has no timestamp (= metadata) will be copied into every flat message.

  3. The flat variable name will follow rules 2, 3 and 4 (see below).

For example, a simple message with just one subject, one concept, one observed property, one timestamp and some metadata:

{ "msg": { "id": "001", "timestamp": "2021-11-18T08:27:28.609Z" }, "che": [ { "name": "STS01", "on": { "totalcounter": [ { "pom": "ioutput", "pomt": "actual", "timestamp": "2021-11-18T08:27:28.609Z", "value": "0" } ] } } ] }

Would result in only one flat message: 

[ { "msg.id": "001", "msg.timestamp": "2021-11-18T08:27:28.609Z", "msg.sample": 1, "che.@.name": "STS01", "che.@.on.totalcounter.ioutput.actual.timestamp": "2021-11-18T08:27:28.609Z", "che.@.on.totalcounter.ioutput.actual.value": "0" } ]
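The conversion above can be sketched in a few lines of Python. This is a minimal illustration, not the TIC4.0 reference implementation: the function name `flatten` is hypothetical, subject arrays are always written as a bare “@” (no “arrayid” handling, so it only suits messages with one element per subject array), only “pom” and “pomt” are moved into the path, and numbering “msg.sample” in timestamp order is an assumption.

```python
import json


def flatten(message: dict) -> list:
    """Return one flat message (a dict) per distinct timestamp in `message`."""
    header = {f"msg.{k}": v for k, v in message.get("msg", {}).items()}
    metadata = {}  # path -> value; copied into every flat message
    samples = {}   # timestamp -> {path: value}

    def walk(node, path):
        if isinstance(node, dict):
            for key, child in node.items():
                walk(child, path + [key])
        elif isinstance(node, list) and node and all(isinstance(e, dict) for e in node):
            for element in node:
                if "timestamp" in element:
                    # Observed property element: pom and pomt become path segments.
                    prefix = path + [element[k] for k in ("pom", "pomt") if k in element]
                    bucket = samples.setdefault(element["timestamp"], {})
                    for key, value in element.items():
                        if key not in ("pom", "pomt"):
                            bucket[".".join(prefix + [key])] = value
                else:
                    # Array without timestamp: a subject, marked with "@".
                    walk(element, path + ["@"])
        else:
            # Anything else without a timestamp is metadata.
            metadata[".".join(path)] = node

    for key, child in message.items():
        if key != "msg":
            walk(child, [key])

    flat_messages = []
    # ISO 8601 timestamps in the same format sort chronologically as strings.
    for sample, timestamp in enumerate(sorted(samples), start=1):
        flat = dict(header)
        flat["msg.sample"] = sample
        flat.update(metadata)
        flat.update(samples[timestamp])
        flat_messages.append(flat)
    return flat_messages


example = json.loads("""
{ "msg": { "id": "001", "timestamp": "2021-11-18T08:27:28.609Z" },
  "che": [ { "name": "STS01", "on": { "totalcounter": [
    { "pom": "ioutput", "pomt": "actual",
      "timestamp": "2021-11-18T08:27:28.609Z", "value": "0" } ] } } ] }
""")
print(json.dumps(flatten(example), indent=2))
```

Run on the example message, the sketch yields a single flat message with the same keys and values as the one shown above.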

 

Rule 2: Pathname includes objects and properties

To define the VALUE, the SUBJECT, CONCEPT, OBSERVED PROPERTY and POINT OF MEASUREMENT must be combined in the path name. If an array contains values, it is treated as a primitive element; if it contains objects, the name of the final property includes the values of some of the attributes in the path.

Objects and properties are added to the path name. This name can be short or very long depending on how deep the hierarchy is.

Below is a JSON example with a very simple hierarchy containing only a subject and concepts (without observed properties or points of measurement):

Would result in

Rule 3: Subjects include the extension “@”

The SUBJECT is always an array. To be able to convert automatically from Flat to JSON, an identifier must be included in the path name to mark the subject as an array. Any array without a timestamp is a SUBJECT.

Consequently, for every array without a timestamp we add the character “@” to the path name, followed by the “arrayid” value if one exists. The “@” is always required; when the array element has an “arrayid”, its value must be included after the “@”.

The “arrayid” can be modified by the user, as its only purpose is to identify the array element for classification and use in a flat format. In most cases the array has only one element per message and it is not necessary to specify the “arrayid”.
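As a minimal sketch of this rule, the helpers below decide whether an array is a subject and build its path segment. The function names are hypothetical, and writing the “arrayid” value directly after the “@” with no separator is an assumption about the exact notation; the value “spreader1” is invented for illustration.

```python
def is_subject_array(elements: list) -> bool:
    """An array whose elements carry no timestamp is treated as a SUBJECT."""
    return all(isinstance(e, dict) and "timestamp" not in e for e in elements)


def subject_segment(element: dict) -> str:
    """Return "@" alone, or "@" followed by the arrayid value if one exists.

    Assumption: the arrayid value is appended directly after "@".
    """
    arrayid = element.get("arrayid")
    return "@" + str(arrayid) if arrayid else "@"


# A CHE without arrayid, and a spreader with a hypothetical arrayid.
print(".".join(["che", subject_segment({"name": "STS01"}), "name"]))
print(".".join(["che", "@", "spreader", subject_segment({"arrayid": "spreader1"}), "name"]))
```

The first path keeps the bare “@”, so messages from different CHEs share the same variable names; the second carries the identifier needed to tell two spreaders of the same CHE apart.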

Example 1:

These examples represent two messages from two different CHEs. As a flat message will never contain two different CHEs, the “@” has no extension (arrayid) and the path names can be identical in both messages. This is very useful in databases and dashboards.

Flat message 001

Flat message 002

Example 2:

Another example: a subject (unique, without arrayid) with a sub-subject that has an “arrayid” (a CHE with a specific spreader identification, because in some cases a CHE has two spreaders):

It would result in:

Example 3:

Below is a real TIC4.0 JSON example: a single JSON message with only one CHE that has only one powersource (engine), where we observe whether it is actually “on” at 2022-02-14T08:23:55.000Z.

JSON sample

 

FLAT sample. Because each array has only one element, there is no need to add the arrayid value to the path of the properties:

 

The previous example contains only 1 subject and 1 timestamp. If the JSON contains several subjects of the same type or several timestamps in the same message, it cannot be expressed in a single flat message. In that case, the JSON message must be split into several FLAT messages and/or use the “@arrayid” extension to express the same information in flat format while identifying the arrays.

Rule 4: Observed properties included in the pathname

The OBSERVED PROPERTY is always an array whose elements include at least the “timestamp”. Each array element is a unique combination of the values of the object: timestamp, POINT OF MEASUREMENT, unit, etc. The array element is therefore identified by combining part of its content to make it unique. In the present release 2022.004, the values to include in the path name to identify the array element are listed below (new values are expected to be added to this list in the future):

  1. #name#value

  2. pom: input, iinput, ioutput, output (closed list)

  3. pomt: schedule, proposal, request, estimated, planned, actual, performed, historic (closed list)

  4. #reference#value

  5. #unit#value

  6. #qualifier#value

“pom” and “pomt” belong to closed lists and don't need an #extension# to be identified. The rest can take any value and need to be identified by #____#.
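The list above can be sketched as a small helper that turns one observed property element into path segments. The function name is hypothetical, and two points are assumptions rather than statements from the specification: that the segments appear in the order listed, and that a tagged segment is written as “#attribute#” immediately followed by the attribute's value. The unit “kmh” and the value “12” are invented for illustration.

```python
# Closed lists taken from the rule above.
POM = {"input", "iinput", "ioutput", "output"}
POMT = {"schedule", "proposal", "request", "estimated",
        "planned", "actual", "performed", "historic"}


def observed_property_segments(element: dict) -> list:
    """Return the path segments that identify one observed property element."""
    segments = []
    if "name" in element:
        segments.append(f"#name#{element['name']}")
    for key, allowed in (("pom", POM), ("pomt", POMT)):
        if key in element:
            if element[key] not in allowed:
                raise ValueError(f"{key}={element[key]!r} is not in the closed list")
            # Closed-list values need no #...# tag.
            segments.append(element[key])
    for key in ("reference", "unit", "qualifier"):
        if key in element:
            # Free-text values are tagged so they can be recognised later.
            segments.append(f"#{key}#{element[key]}")
    return segments


element = {"pom": "output", "pomt": "actual", "unit": "kmh",
           "timestamp": "2022-02-14T08:23:55.000Z", "value": "12"}
print(".".join(["speed"] + observed_property_segments(element) + ["value"]))
```

For an element with only “pom” and “pomt”, such as the one in the Rule 1 example, the helper returns just those two segments, which matches the path “che.@.on.totalcounter.ioutput.actual.value” shown there.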

Example 1:

A JSON message with just one subject and sub-subject (drive), one concept (driving), one observed property (speed) but 2 timestamps and different combinations of “pom”, “pomt” and “unit”.

 

Would result in 2 messages because of the two different timestamps:

 

Rule 5: Split by subject

For many applications, it is necessary to create a message per subject (or sub-subject), as otherwise it is not possible to compare similar subjects, for example CHEs, spreaders or tires. It is therefore necessary to create a database based on independent records, one per specific subject.

The conversion from JSON to Flat allows the flat message to be split into individual messages for every entity found in the path, down to the one you wish to classify by.
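One way to picture this split is to cut the hierarchical message first and flatten each piece afterwards. The sketch below is an assumption about how such a step could look, not the published implementation: the function name is hypothetical, it splits by subject only (the split per timestamp would follow), and the second CHE name “RTG02” is invented for the example.

```python
import copy


def split_by_subject(node: dict, path: list):
    """Yield copies of `node` in which each subject array along `path`
    holds exactly one element; arrays outside the path are left untouched."""
    if not path:
        yield copy.deepcopy(node)
        return
    key, rest = path[0], path[1:]
    for element in node.get(key, []):
        for sub in split_by_subject(element, rest):
            # Rebuild the node with the chosen element only, keeping key order.
            yield {k: ([sub] if k == key else copy.deepcopy(v))
                   for k, v in node.items()}


message = {
    "msg": {"id": "001"},
    "che": [
        {"name": "RTG01",
         "powersource": [{"name": "Main_genset"}, {"name": "aux_genset"}],
         "spreader": [{"arrayid": "1"}, {"arrayid": "2"}]},
        {"name": "RTG02",
         "powersource": [{"name": "Main_genset"}, {"name": "aux_genset"}],
         "spreader": [{"arrayid": "1"}, {"arrayid": "2"}]},
    ],
}

for part in split_by_subject(message, ["che", "powersource"]):
    che = part["che"][0]
    print(che["name"], che["powersource"][0]["name"], len(che["spreader"]))
```

Each resulting piece holds one CHE with one powersource, while both spreaders stay in the piece and would still need their “@arrayid” when flattened.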

Example 1:

Example of a JSON message with 2 CHEs, each with 2 powersources and 2 spreaders, that we want to split per powersource (and timestamp).

 

Convert to Flat message per timestamp and “powersource”

It will result in 16 messages if we split only per timestamp and che.powersource, with the remaining arrays flattened by “@arrayid” (in this case by the spreader arrayid):

2022-02-14T07:52:50.616Z RTG01 Main_genset

 

2022-02-14T07:52:50.616Z RTG01 aux_genset

 

Message Validation

Validation is a process independent of the flattening procedure. It is usually done before flattening, or simply to check whether a message complies with the TIC4.0 schema.

Validation is based on a JSON Schema. The schema does not restrict the content to TIC4.0 attributes, but if an attribute is a TIC4.0 attribute, the schema checks that its format complies with TIC4.0.

 

Open Source Code

The backend of this code is publicly available at

GitHub - Fundacion-Valenciaport/TIC4.0: TIC4.0 Repository

 

© Copyright - TIC 4.0 All rights reserved | Design web by Fundación Valenciaport