The supported data sources are: SensorThings API, STAplus, OGC API Features/Records, OGC Catalogue Service for the Web, S3 services, Eclipse Data Connectors, CSV, DBF, JSON-LD, JSON and GeoJSON files. Once the data is represented as a table, it can be directly viewed, edited, semantically enriched, filtered, joined, grouped, aggregated, etc. Apart from the classical rows-and-columns tabular representation, data can be presented as bar charts, pie charts, scatter plots, and maps. TAPIS is integrated with NiMMbus (the MiraMon implementation of Geospatial User Feedback) and with the MiraMon Map Browser.
While the project is completely independent of the Orange data mining software, its GUI design was inspired by Orange. If you have used Orange in the past, you will immediately know how TAPIS works.
There are, however, some differences from Orange:
In TAPIS, the first thing you need to do is place a starting node in the network area. Examples of starting nodes are "Add STA service", "Import CSV" or "Import GeoJSON". Clicking on them in the button area will place them in the network area. Double-clicking on their representation in the network area will open a dialog box to specify the parameters necessary to import the data into TAPIS. After accepting the options, messages about progress and any errors will appear in the Information area. The result of the operation will be represented as a table in the Query and table area (a STA query will also be shown if the node is a STA node; the ones with blue icons).
The next step is to connect other nodes to the starting nodes.
"Table" is an example of a "leaf node". Leaf nodes cannot be the source node of any other node and represent the "leaves" of a network representation. Other examples of leaf nodes are "Scatter plot", "One value", "View query" or "Save table".
We will start by using the "Add STA service" tool to type the URL of the root page (landing page) of our favorite STAplus service. Alternatively, we can select one from the list of suggested STA services.
From the landing page, all entity types are available. This gives us the flexibility to explore the model from different points of view. In our case, we would like to explore Datastreams. We will right-click on the STA service and select the [DataStreams] button. This will give us access to all Datastreams.
While it is possible to use [View table] to see the properties of the available datastreams, we will use the [Select Row] button (the blue one) to directly select a particular Datastream.
We will select a datastream whose content we would like to explore. While selecting it, we can see that the description of the units of measure is part of the Datastream.
Now we are ready to reproduce the UML diagram by connecting the Datastream with all entities directly connected to it. We can do this by right-clicking on the Datastream and selecting the entity as we did before.
The content of the node as a table can be seen in the Query and table area directly by selecting the node with a single left click. It is also possible (but no longer necessary) to add a connection to [View table] to any of these entities to open a dialog with its content.
A Datastream is a collection of observations. This allows us to see the observations as a table.
All the observations in a datastream are done by the same Party...
...with the same device (Thing)...
...in the same current location...
...observing the same observed property...
...shared with the same License.
...and that is why they form a coherent time series that can be represented in a single scatter plot by adding the [Scatter plot] tool to the observations node.
We can extend the diagram further by selecting one observation with the blue [select row]...
... and then add a [FeaturesOfInterest] and an [ObservationsGroup].
This way we can know the position of the feature of interest...
... and read about the observation group.
This will create the "Observations" tool and give us access to all observations intermixed, regardless of who produced the data or what the observed variable is.
To know about the "Datastream" of the first Observation in the table, we can click on the link in the table under the "Datastream@iot.navigationLink" column name. As a result, the graph will be enriched with a "select resource" tool (with a label indicating the id of the Observation) and a "datastreams" tool.
To know about the "ObservedProperty" linked to the "Datastream", we will add the "ObservedProperties" tool manually by right-clicking on the "Datastream".
However, there are other ways to get the same "ObservedProperty". The "Datastream" also contains a link to the "ObservedProperty". This link is different from the one used before.
By clicking the link we will get a new "Datastreams" tool linked to the root of the STA server, a selection of the correct resource and the "ObservedProperties" tool automatically.
Finally, there is still an obvious way to present the same "ObservedProperty": clicking on the "@iot.selfLink" in one of the two representations of the "ObservedProperty".
As a result, an "ObservedProperties" tool is added to the STA root, together with a selection of the correct resource.
The final graph shows 3 branches with 3 leaf nodes that contain the same information because they are the same entity, despite the fact that they have 3 different URLs.
We will start by using the "Add STA service" tool to type the URL of the landing page of our favorite STAplus service. Alternatively, we can select one from the list of suggested STA services.
Then we can request the list of parties available by using the "Parties" tool. We can then see all parties by using the "Table" tool or even better, see the same list and select one from the list using the "Select Row" tool. We are selecting the "Joan Maso" party.
Now we can add the "Datastreams" tool to access the Datastreams generated by Joan Maso. It is a common mistake to try to connect a tool representing an entity (in plural) to another entity (in plural) directly. Please note that you can only connect two entities in plural if you select one entity with "Select Row" first, and only if they are directly connected in the entities diagram. We are only interested in one particular Datastream in this exercise, so we will use the "Select Row" tool again to select the Ambient Temperature variable.
At this point, it becomes difficult to remember our selections. We can use the "Rename" tool or simply press F2 to change the name of the selection tools to a more intuitive name.
Finally, we will be able to see the Observations about temperature made by Joan Maso by adding the "Observations" tool and adding the "Table" tool after it.
By double-clicking the "Table" tool, we will see the observations as a table with the properties that the entity "Observations" has in the diagram: phenomenonTime (the time when the sample was collected), result (the actual value of the temperature) and resultTime (the time the observation was determined and recorded). If we look at the table, we can see that we get 100 observations and that time is increasing, so we can assume they are in order starting with the earliest one.
Commonly, we are interested in the latest observations (possibly the current ones), so we will request them in descending order by using the "Sort By" tool and specifying that we want to sort by "phenomenonTime" in descending order. We can also change the number of records requested to a larger number.
Behind the scenes, the tool is creating STA queries and sending them to the STAplus service. The responses of these queries are converted into a tabular format (that is the common representation format in TAPIS). We can learn about the actual request done using the "View Query" tool.
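The query-building step can be sketched in a few lines of Python. This is an illustrative reconstruction, not TAPIS code: the service root is a placeholder and `observations_url` is a hypothetical helper.

```python
from urllib.parse import quote

# Hypothetical service root; any SensorThings v1.1 endpoint is queried the same way.
ROOT = "https://example.org/staplus/v1.1"

def observations_url(root, top=100, descending=True):
    """Sketch of the sorted Observations request that the Sort By tool issues."""
    order = "phenomenonTime desc" if descending else "phenomenonTime asc"
    # $orderby controls the sort field/direction, $top the number of records.
    return f"{root}/Observations?$orderby={quote(order)}&$top={top}"

url = observations_url(ROOT, top=200)
```

Fetching `url` with any HTTP client returns the JSON that TAPIS then converts into its tabular representation.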
We can even click on the provided link to see the actual response to the query in the original JSON format (or in a nicer representation as objects if we use Firefox).
We can also see the last temperatures as a scatter plot by using the "Scatter Plot" tool and pressing the "Draw" button.
Unfortunately, there is no mention of the variable or the units of measure in the graphic. This is because this information is not in the Observations entity but in the previous entity: Datastream. We can see the properties of Datastream such as name, description and unitOfMeasurement. Here we see one of the limitations of the tabular common representation used in TAPIS: a property that is an object has its values represented as a JSON serialization.
In case a JSON object needs to be used, the "Separate columns" tool splits all JSON-serialized columns into separate columns. In this case, unitOfMeasurement is split into unitOfMeasurement/name, unitOfMeasurement/symbol and unitOfMeasurement/definition.
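The behaviour of "Separate columns" can be approximated in plain Python. The `separate_columns` helper below is an illustrative sketch, not TAPIS code, and the sample row uses assumed values:

```python
import json

def separate_columns(rows, column):
    """Split a JSON-serialized column into one column per member,
    named parent/child as TAPIS does (e.g. unitOfMeasurement/name)."""
    out = []
    for row in rows:
        row = dict(row)                       # keep the input untouched
        obj = json.loads(row.pop(column))     # parse the serialized object
        for key, value in obj.items():
            row[f"{column}/{key}"] = value    # one new column per member
        out.append(row)
    return out

rows = [{"name": "Air temperature",
         "unitOfMeasurement": '{"name": "degree Celsius", "symbol": "\u00b0C", '
                              '"definition": "http://qudt.org/vocab/unit/DEG_C"}'}]
flat = separate_columns(rows, "unitOfMeasurement")
```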
Now we can link this split record to the scatter plot with the "Connect two nodes" tool, then clicking on the "Scatter plot" and the "Separate columns" nodes (in this order), creating a pentagonal shape. Now the scatter plot has more possibilities to select, and the title of the diagram is more expressive, showing both the name of the datastream and the units of measure.
Another way of showing the last observation value (the current value, if the sensor is live) is to use the "One value" tool. This tool represents the last captured value directly below the node icon as a title. If the selected STA service is WebSub enabled, the implementation refreshes the value every time a new observation arrives. That is particularly useful for datastreams that are constantly incorporating new observations. If the STA service is not WebSub enabled, it refreshes its value at the indicated fixed interval in seconds. The WebSub implementation is much more efficient than repeated HTTP requests, as it keeps a WebSocket connection with a webhook server permanently open. The refreshing can be stopped by double-clicking on the same node and pressing [Stop refresh].
We can see the last temperature captured in the icon in the center of the pentagon. We can also see the value in the information window, as well as the promise that it will be refreshed when the value is updated.
If the value is over the alerting value, the color of the icon and the text changes.
Using the FilterRowsSTA tool, we can formulate complex queries with little effort.
We formulate two conditions: having a Party with displayName equal to "Joan Masó" and an ObservedProperty with a definition equal to "http://vocabs.lter-europe.net/EnvThes/22035". We use the connector "and" to join these two conditions.
There is a selector to choose the right entity where the property used to formulate the condition exists.
This is what the internal query looks like: https://citiobs.demo.secure-dimensions.de/staplus/v1.1/Observations?$filter=((Datastream/Party/displayName%20eq%20%27Joan%20Mas%C3%B3%27)and(Datastream/ObservedProperty/definition%20eq%20%27http://vocabs.lter-europe.net/EnvThes/22035%27))
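Such a $filter expression can be composed programmatically before percent-encoding. The endpoint below and the `eq` helper are illustrative assumptions; the filter grammar itself follows the SensorThings/OData conventions shown in the query above:

```python
BASE = "https://example.org/staplus/v1.1/Observations"  # hypothetical endpoint

def eq(path, literal):
    # String literals are single-quoted in the STA/OData filter grammar.
    return f"({path} eq '{literal}')"

# Combine the two conditions with the "and" connector, as the dialog does.
flt = "(" + " and ".join([
    eq("Datastream/Party/displayName", "Joan Masó"),
    eq("Datastream/ObservedProperty/definition",
       "http://vocabs.lter-europe.net/EnvThes/22035"),
]) + ")"
url = BASE + "?$filter=" + flt  # TAPIS percent-encodes this before sending it
```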
The next step is to expand the link properties in the result, limiting the information extracted to the phenomenonTime, result, unitOfMeasurement, and the definition and name of the observedProperty.
Commonly, we are interested in the latest observations (possibly the current ones), so we will request them in descending order by using the "Sort By" tool and specifying that we want to sort by "phenomenonTime" in descending order. We can also change the number of records requested to a larger number.
This generates a complex query that looks like: https://citiobs.demo.secure-dimensions.de/staplus/v1.1/Observations?$filter=((Datastream/Party/displayName%20eq%20%27Joan%20Mas%C3%B3%27)and(Datastream/ObservedProperty/definition%20eq%20%27http://vocabs.lter-europe.net/EnvThes/22035%27))&$select=phenomenonTime,result&$expand=Datastream($select=unitOfMeasurement;$expand=ObservedProperty($select=name,definition))&$orderby=phenomenonTime%20desc
Now we are ready to separate the columns of the STA result and then show it as a table and as a scatter plot.
This is the resulting table, which contains a time series of temperature measurements and other columns with constant content.
And this is the result as a scatter plot.
We will start by using the "Add STA service" tool to type the URL of the landing page of our favorite STAplus service. We will connect it to observations to get the "Observations" stored in the service.
The problem with "Observations" is that it only shows values and times, with no mention of the meaning of the values. "Observations" returns links to the "Datastream" and the "FeatureOfInterest" that can be visited to get their actual properties. To reveal the actual meaning of the values in the same table, we need to expand the units of measure stored in the "Datastream" as well as the variable name stored in "ObservedProperties". We will also need to get the actual positions of the observed variable, which are stored in "FeaturesOfInterest". All this can be achieved by using the "Recursive expand" tool, which allows us to select the needed properties (in practice, removing the non-selected ones from the result) and expand the needed objects to see the properties directly in the table.
Commonly, we are interested in the latest observations (possibly the current ones), so we will request them in descending order by using the "Sort By" tool and specifying that we want to sort by "phenomenonTime" in descending order. We can also change the number of records requested to a larger number.
This has created a complex STA query that we can examine with the "View query" tool. This is the actual query URL:
https://citiobs.demo.secure-dimensions.de/staplus/v1.1/Observations?$select=phenomenonTime,result&$expand=Datastream($select=unitOfMeasurement;$expand=ObservedProperty($select=name,definition,description)),FeatureOfInterest($select=description,feature)&$orderby=phenomenonTime desc
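The nested $select/$expand structure of such a URL can be composed mechanically. The sketch below, with a hypothetical `expand` helper, rebuilds the query string of the URL above (without the service root):

```python
def expand(entity, select=None, nested=None):
    """Compose a SensorThings $expand term like
    Datastream($select=...;$expand=ObservedProperty($select=...))."""
    parts = []
    if select:
        parts.append("$select=" + ",".join(select))
    if nested:
        parts.append("$expand=" + nested)
    # Options go between parentheses, separated by semicolons.
    return entity + ("(" + ";".join(parts) + ")" if parts else "")

query = ("$select=phenomenonTime,result&$expand="
         + expand("Datastream", ["unitOfMeasurement"],
                  expand("ObservedProperty", ["name", "definition", "description"]))
         + "," + expand("FeatureOfInterest", ["description", "feature"])
         + "&$orderby=phenomenonTime desc")
```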
The same query can be generated directly by connecting the "Observations layer" tool to the "Add STA service". This tool acts as a macro that has the predefined query stored.
We still have the problem that the expanded entities are shown as serialized JSON objects. The "Separate columns" tool splits all JSON-serialized columns into separate columns. The final table contains all the fields needed to build a time series.
The resulting table is ready to be converted into a map.
By selecting the right fields and pressing the "Open" button, a map browser emerges with an extra layer that shows the last observations as a time series. By clicking on the point icons, we can see the time evolution of the different variables.
It is also possible to create a GeoJSON and a GeoJSON schema using the "Save layer" tool.
As a result, we get a GeoJSON with the features of interest as geometries and the observations as values. The corresponding GeoJSON schema is extended to include the meaning of each observed property (variable names, definitions and units of measure). Since the time series are represented as a list of properties whose names contain the datetime, the GeoJSON schema properties use name templates representing a generic datetime format. The extended properties of the properties definition are defined in a metaschema that can also be saved.
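A minimal sketch of the kind of GeoJSON structure described above. The property naming that embeds the phenomenonTime is illustrative; the exact convention and schema extensions used by "Save layer" may differ:

```python
import json

# Illustrative feature: the feature of interest becomes the geometry and each
# observation becomes a property whose name embeds its phenomenonTime.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [1.78, 41.78]},
    "properties": {
        "2024-05-01T10:00:00Z_result": 21.3,   # hypothetical observations
        "2024-05-01T11:00:00Z_result": 22.1,
    },
}
collection = {"type": "FeatureCollection", "features": [feature]}
doc = json.dumps(collection)
```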
To start the exercise, we need to find a CSV on the Internet. In this case, we go to the Open Data portal of the Catalan Government to find a dataset on the quantity of water in the Catalan reservoirs
(https://analisi.transparenciacatalunya.cat/ca/Medi-Ambient/Quantitat-d-aigua-als-embassaments-de-les-Conques-/gn9e-3qhr/about_data).
There is a way to get a direct link to the data in CSV format (https://analisi.transparenciacatalunya.cat/resource/gn9e-3qhr.csv).
First, we import a table from the Internet using the "Import CSV" tool.
We specify that this is a comma-separated file with headers defining the titles of the fields. We do not have any CSVW file to add semantics, so we will not select any file name. We will copy the URL of the CSV into the last box. The file is automatically loaded in the background.
Now we will include the semantics with the fields meaning tool.
We need to populate the description of each field, the URL of a vocabulary definition and the units of measure description, URL definition and symbol.
That is a complicated process because it requires some knowledge of the existing vocabulary services. We can use one of the predefined exemplary combinations to start with. Do not worry if you do not know the answers for some items now, as all fields are optional and can be left blank.
Once this is done for all fields, we can check that everything works by visualizing the table with the "Table" tool.
Field names are replaced by descriptions that are linked to the concepts they represent, and the units of measure are represented by a symbol linked to their definitions.
Now, by going back to the "meaning" tool, we can share these definitions on the Internet by pressing the "share" button.
The NiMMbus system will be opened. You are required to log in using an identity provider such as Authenix or by creating your own user name in the system.
Do not be alarmed by the number of fields you see on the screen. They are all pre-populated, so you only need to press save, and the NiMMbus system will close and you will be redirected back to TAPIS.
To check that all is correct, we can reload the window to reset its status to zero. By using the "Import CSV" tool again, we can reload the previous CSV. We will specify again that the file is comma-separated and has headers. We will re-enter the URL of the CSV, but this time we will also select "Automatic retrieve of shared meaning". This will force the system to look for sources of meaning in the NiMMbus system and retrieve our last definition in the background once we press "Done".
We can verify that the meanings of the fields have been loaded by adding and opening the "Meaning" tool again, just to see that the definitions were loaded with no extra human interaction.
To finalize, we can use the "Save Table" tool to save a CSVW file with the meaning definitions that we have provided as well as the original CSV for later use.
This is the result of saving the CSV (fragment) and the CSVW.
We will start by using the "Add STA service" tool to type the URL of the landing page of our favorite STAplus service. We will connect it to observations to get the "Observations" stored in the service.
Then we will apply the "Geospatial Filter Rows" tool to indicate that we will select observations by an area.
A geospatial filter requires a polygon as an input. To do that, we need to import a GeoJSON file that contains polygons. We are using a small GeoJSON file called "callus_munich.geojson" that contains two polygons, one of them in the Callus (Barcelona) area.
We need to import the GeoJSON file that contains the polygons with the [Import GeoJSON] tool.
Then, we need to select one polygon by using the "Select Row" (the one with the grey icon).
Then we need to create a second connection that starts on the geopatial filter and ends on the selected polygon
Now the selections is executed but still we have all observed properties mixed in the results list.
We need to concatenate the geospatial filter with a non-geospatial filter and specify that we would like to filter the observations that have an observed property with the definition "http://vocabs.lter-europe.net/EnvThes/22035".
Now all observations are temperatures.
Finally, we can sort the observations with the most recent first.
This results in this complex query URL: https://citiobs.demo.secure-dimensions.de/staplus/v1.1/Observations?$filter=geo.intersects(FeatureOfInterest/feature,geography'POLYGON ((1.78 41.78, 1.79 41.78, 1.79 41.79, 1.78 41.79, 1.78 41.78))') and ((Datastream/ObservedProperty/definition eq 'http://vocabs.lter-europe.net/EnvThes/22035'))&$orderby=phenomenonTime desc
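The geospatial part of that filter can be rebuilt from the polygon ring. The helpers below are an illustrative sketch, not TAPIS code; the `geo.intersects` syntax follows the query shown above:

```python
def polygon_wkt(coords):
    """Serialize a ring of (lon, lat) pairs as WKT; the ring must be closed."""
    pts = ", ".join(f"{x} {y}" for x, y in coords)
    return f"POLYGON (({pts}))"

def intersects_filter(wkt):
    # geo.intersects tests the feature of interest against the polygon.
    return f"geo.intersects(FeatureOfInterest/feature,geography'{wkt}')"

ring = [(1.78, 41.78), (1.79, 41.78), (1.79, 41.79), (1.78, 41.79), (1.78, 41.78)]
flt = intersects_filter(polygon_wkt(ring))
```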
Now that we have a time series, we can use the scatter plot tool to generate a temperature/time diagram.
To apply this filter, we need to select the entity to which we want to apply the filter (in this case, we will use Observations), the coordinates of a point used as a reference, and the distance we want to use.
To start, we have to indicate the STA service that we want to use by adding the "Add STA service" tool and typing there the URL of this service. We will connect it to observations to get the "Observations" stored in the service.
Then, to apply this filter we need to select the "Filter Rows by Distance" tool.
Adding a point manually
Let's start by adding a point manually. If we don't link any additional node, we will have a window like this.
On the top, we have to select the entity we want to use to apply the filter to our observations, and the path we want to follow. In other words, do we want to compare the coordinates of the feature (in FeatureOfInterest) or the coordinates of the location (in Locations) of each of our observations with the reference point that we will choose later? In this case, we select feature from FeatureOfInterest, for which there is only one path.
In the middle, we have to write the coordinates of the reference point: the point that every observation's coordinates (in this case, the feature of those observations) will be compared with to apply the filter. At the bottom, we will indicate the distance and the operator we want to use to apply the filter. In this case, we want to obtain the observations whose feature is at a distance lower than 5.
Click the [Ok] button to see the results. The filter has been applied:
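Assuming the standard SensorThings/OData geospatial functions, the generated distance filter can be sketched as follows. The helper name and the exact operator spelling are assumptions, not necessarily what TAPIS emits:

```python
def distance_filter(path, lon, lat, op, distance):
    """Sketch of a geo.distance filter: keep rows whose geometry at `path`
    satisfies `distance(path, point) <op> distance`."""
    point = f"geography'POINT ({lon} {lat})'"  # WKT reference point (lon lat)
    return f"geo.distance({path}, {point}) {op} {distance}"

flt = distance_filter("FeatureOfInterest/feature", 1.78, 41.78, "lt", 5)
```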
Now let's see different ways to indicate the reference point by linking nodes with different data sources.
In all cases, we need to take into account that only nodes with a single record can be linked to the "Filter Rows by Distance" node (apart from our reference entity node where we want to apply the filter).
Using a GeoJSON to indicate the point
We have already prepared a GeoJSON file with the coordinates of a reference point, so now let's import it with the [Import GeoJSON] tool.
As you can see below, when you link the "Import GeoJSON" node with "Filter Rows by Distance", you will see the same window as before, but the coordinates have been added automatically. The next steps are as explained before, so they will not be repeated here.
Using a STA service and a FeatureOfInterest as a reference to indicate the point
Let's use the same STA service to choose one FeatureOfInterest. Link the FeatureOfInterest node to the STA service and then select only one record using "Select Row" (STA) to obtain only one feature.
Link this to "Filter Rows by Distance" and observe that, again, your point has already been added automatically. The next steps are as explained before, so they will not be repeated here. The same procedure can be applied with the Locations entity (location).
Using a CSV to indicate the point
For this example, we will use a CSV containing the coordinates in two of its columns. We have chosen a totally random CSV that has this information. First of all, let's add this CSV with the "Import CSV" tool.
In our case, we have more than one record, so we have to select only one with the "Select Row" tool (for tables). As you can see below, in this example, the columns that contain coordinates are " and "long".
After selecting one row (one record), link it to "Filter Rows by Distance".
Unlike in the previous cases, we have more than one column that can contain our coordinates; therefore, we will have to choose them in the selectors that appear in the "Filter Rows by Distance" window. When we select the columns, their information will be added automatically. The next steps are as explained before, so they will not be repeated here.
Commonly, you cannot write to a STA service without permissions. The first thing that we need, in order to be able to write to the STA service, is to log in using the login button.
The login mechanism in TAPIS is called Authenix. Authenix does not manage users. Instead, it uses preexisting user names created on other platforms. In the Authenix window, select your platform or institution integrated in eduGAIN and follow the instructions to log in. If you have previously logged in, the Authenix window will open and close with no need for interaction.
Your name and organization will appear on the left hand side of the control area.
Now select the Add STA service button and select a STA service where you have writing rights.
Since the camera trap is owned by the same person that owns the pictures and does the identification, we will start by creating a single Party for everything. Do not worry: if the party already exists, it will be retrieved instead. A right click on the STAplus node will open a context menu to select Party. (Please note that you should select the family of buttons with the entity names in singular that have an icon with a little pencil.)
The creation of a Party is particularly easy, as it takes the authId from the authentication process and there is no need to add any extra information. You only need to press [Create]. Again, do not worry: if the party already exists, it will be retrieved instead.
The information area will show the identifier of the created or retrieved Party.
The Query and Table area will show the description of the Party.
The same procedure can be used to create an ObservedProperty. In this case, we start by creating the identification (occurrence) observed property. You should populate the required information before pressing [Create]. In particular, we use the Species definition in GBIF as the URI definition.
The other ObservedProperty is the picture itself. In this case, you should populate the required information before pressing [Create]. To make things simple, we use the Wikipedia definition of an image (https://en.wikipedia.org/wiki/Image).
Now we can create a Location where the Thing is placed. In this case, this is the position of a camera trap. Again, you should populate the required information, including the position (in this case represented by a point), before pressing [Create]. The encodingType property is fixed to application/geo+json.
Once the Location and the Party are created, it is possible to create a Thing that is connected to both elements. By right-clicking on the Location and selecting the [Thing] button we make the first connection, and the second connection can be made using the [Connect two nodes] button. In this case, the Thing is the camera trap itself. In the window, the connections to both objects are indicated, and you should populate the required information (only a name and a description) before pressing [Create].
The second Thing that we will need is the citizen, whose location when the identification was done is irrelevant (probably at home, reviewing the pictures downloaded from the camera trap). By right-clicking on the STAplus node, we will select the [Thing] button and create a second connection to the Party using the [Connect two nodes] button. In the window, the connections to both objects are indicated, and you should populate the required information (only a name and a description) before pressing [Create]. The illustration shows the result.
Now two Sensors are created, the first one for the human sensor, by right-clicking on the STAplus node and selecting the [Sensor] button. In the window, you should populate the required information (only a name and a description) before pressing [Create].
The second Sensor is the optics and CCD inside the camera trap. Again, right-click on the STAplus node and select the [Sensor] button. In the window, you should populate the required information (only a name and a description) before pressing [Create].
Some Creative Commons licenses are already created in the STAplus service, so in this case we will use the Licenses tool to request all the licenses in the STA. Then, by right-clicking on the Licenses node, we will add the Select Row tool (the one with a blue icon), and by double-clicking on it we can select the license we want. In this case, we will select CC-BY.
Now we have all the elements necessary to create two DataStreams.
A DataStream is a complicated object that is connected to Sensor, Thing, ObservedProperty, Party and License.
This can be a bit confusing, so, before starting, we will organize the entities that are related to the "camera/picture" on one side (at the top, in the illustration) and the ones related to the "citizen/identification" on the other side (at the bottom, in the illustration).
Then, we will start by right-clicking on one of the entities related to the "camera/picture" and selecting Datastream, and using the "Connect two nodes" button to generate the other four connections.
By double-clicking on the Datastream, we will see the five relations, and you should populate the required information, in our case only a name and a description (since there are no units of measure in this case), before pressing [Create].
Now we will do the same by right-clicking on one of the entities related to the "human/identification" and selecting Datastream, and using the "Connect two nodes" button to generate the other four connections.
By double-clicking on the Datastream, we will see the five relations, and you should populate the required information before pressing [Create].
Now everything is prepared to illustrate how to create the observations. We should start by creating one ObservationGroup that will be connected to occurrence observations and image observations. This entity can be connected at least to the Party and the License. You should populate the required information before pressing [Create].
We cannot start creating observations without a FeatureOfInterest. Since the FeatureOfInterest cannot be connected to anything, it should be created by right-clicking on the STAplus entity. You should populate the required information, including the coordinates of the occurrence, before pressing [Create].
Now all the elements to be connected to observations are there. To create a "picture" observation, you can right-click the ObservationGroup and select the Observation. Then we should connect the Observation to the FeatureOfInterest and the Datastream related to "camera/images".
You should populate the required information before pressing [Create], the most important being the "result".
In this case, the "result" is the full URL of the picture that was previously stored in an external repository, and the phenomenonTime is the time when the picture was taken.
To create an "occurrence" observation, you can right-click the ObservationGroup and select the Observation. Then we should connect the Observation to the FeatureOfInterest and the Datastream related to "identifications/occurrences".
By double-clicking on the observation, you should populate the required information before pressing [Create].
In this case, the "result" is the full URL of the species name in GBIF (https://www.gbif.org/species/7193910), and the phenomenonTime is the time when the picture was taken and the occurrence happened.
You can add as many Observations and ObservationGroups as needed by repeating the same procedure.
A partial result of the created observation group can be retrieved from the following query: https://citiobs.demo.secure-dimensions.de/staplustest/v1.1/ObservationGroups(232753)?$expand=Observations($select=result,phenomenonTime;$expand=Datastream($select=description;$expand=ObservedProperty($select=description)))
First we connect to the GeoNetwork AD4GD catalogue.
The result is a table with a list of records in the catalogue that have data associated with them.
We can select one of these records with the [select row] tool.
Now, depending on the format, we can select a tool to open the file. In this case, we use [ImportCSV]. By double-clicking on the tool, we can immediately open the resource without having to write its URL. We only need to press the two [load] buttons next to the URLs.
The table can be examined and analyzed. In this case, we can compute group-by statistics to know the number of occurrences per species, selecting the species both as GroupBy and as the attribute to aggregate, and using the "count defined" aggregation. The result is a small table with two columns: one with the species name and another with the number of occurrences per species.
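The "count defined" aggregation can be approximated in plain Python. The `count_defined` helper is an illustrative sketch, not TAPIS code, and the sample rows are hypothetical:

```python
from collections import Counter

def count_defined(rows, key):
    """Group rows by `key` and count how many rows have that value defined,
    mimicking the 'count defined' aggregation of the Group By tool."""
    counts = Counter(row[key] for row in rows if row.get(key) not in (None, ""))
    return [{key: k, "countDefined": v} for k, v in counts.items()]

rows = [{"species": "Mustela putorius"},
        {"species": "Mustela putorius"},
        {"species": "Vulpes vulpes"},
        {"species": ""}]  # undefined values are skipped
table = count_defined(rows, "species")
```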
Then we can create a bar chart that represents the table by selecting the species as categories and the countDefined as values. We can easily see that Mustela putorius is the species with the most registered occurrences.
Now we can report our little "finding" as geospatial user feedback. First, we need to connect the 'Feedback' tool to the 'Select row' element that contains the selected dataset record in the catalogue. By double-clicking the GUF tool, we can press the [Add feedback] button.
The interface of the NiMMbus system is called. After logging in, it is possible to report our "finding" as a "comment". We have to press [Save] to send the feedback to the feedback database and close the window.
To verify that the feedback is collected and recorded, we can use the [Get feedback] button to get the stored feedback as a table (with only one row if there was no previous feedback).
To read the record better, we can use [Edit record] to see the item as a record in a form.
First we connect to the S3 service.
The result is a table with a list of buckets offered.
We can select one of these records with the [select row] tool.
Now we need to load the bucket with the [S3 bucket] tool.
The result is a table with a list of files offered. Note that the folder structure has been flattened and the file names contain the folder names directly.
We can select one of these files with the [select row] tool.
Now, depending on the format, we can select a tool to open the file; in this case we use [Import CSV]. By double-clicking the tool, we can immediately open the resource without having to type its URL. We only need to press the two [load] buttons next to the URLs.
First we load a table as CSV.
We select the [Group By] tool.
We select grouping by two columns: "Need Type" and "Area". We select to aggregate the column "User requirement id" and we request the "Count defined" as operation.
The result is a table with three columns. The third column represents the number of records that share the same "Need Type" and "Area".
We select the [Bar plot] tool and set the Categories (the bar labels) to "Need Type", the Classifications (the data series names that become stacked segments in different colors in the bar chart) to "Area", and the Values (the size of the segments of the bars in the chart) to "User requirement id_CountDefined". After pressing [Plot], you will see the actual stacked bar chart.
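Conceptually, the table behind a stacked bar chart is a count over two grouping keys. A small Python sketch with invented sample rows (the column names are the ones used in this recipe):

```python
from collections import defaultdict

# Invented sample rows mimicking the user-requirements table.
rows = [
    {"Need Type": "Data", "Area": "Water", "User requirement id": "UR-01"},
    {"Need Type": "Data", "Area": "Water", "User requirement id": "UR-02"},
    {"Need Type": "Data", "Area": "Air",   "User requirement id": "UR-03"},
    {"Need Type": "Tool", "Area": "Water", "User requirement id": "UR-04"},
]

# "Count defined" over the pair (Need Type, Area): one stacked segment
# per pair, sized by the number of defined "User requirement id" values.
counts = defaultdict(int)
for r in rows:
    if r["User requirement id"] is not None:
        counts[(r["Need Type"], r["Area"])] += 1

for (need_type, area), n in sorted(counts.items()):
    print(need_type, area, n)
```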
In preparation for this example, we configured a SensorThings API (STA) endpoint as an asset available in the data space. Our demonstrative data space is composed of two nodes, each with an Eclipse Data Connector instance. To get access to the STA asset, a sequence of steps needs to be performed. Fortunately, TAPIS takes care of those steps automatically. The steps are:
First we need to select the data space catalogue.
The result is a list of assets that can be negotiated.
Now, we have to select the asset that we want to access. As the result of the catalogue query is a table, we will use the black "select row" tool to do it. By double-clicking the node, we will be able to select the row representing the asset.
Now, we have to operate with the asset using the "Dataspace asset" tool. The three remaining steps, related to contract negotiation and data transfer, are executed automatically (in this particular case, we provide a blank contract to negotiate because we know that the dataset has no restrictions).
If the resource type is recognized automatically, the resulting table will be shown. Note that the original URL of the asset is hidden from the client, which always gets the same endpoint. In this case we can see the result of the URL request as a STA root page listing all supported endpoints.
Now we can proceed as if this resource really were a STA endpoint. For example, we will request the Observations with the corresponding STA node. Note that the navigationLinks in the STA response show the original URL; TAPIS will not use the original URL but will build on the negotiated one.
We can continue exploring the STA service by selecting a particular observation with the blue "SelectRow".
Finally, we can request the Datastream of this observation in the STA service by selecting the "Datastreams" node to see the units of measurement and a description of this Datastream.
This recipe has demonstrated that it is possible to connect to a STA endpoint from a data space that contains the root of the STA service as an asset in the data space catalogue.
In this example, we have a table in which every record represents a measurement of the amount of water in a reservoir on a specific date. In order to allow some analysis, we need to reorganize the data and transform the values of the column that refers to a Station into new columns. The "Pivot table" tool can be applied to all data sources, as TAPIS always represents the output data as tables. In this example, data is imported from CSV with the "Import CSV" tool. A fragment of the original table is shown below:
Now we link the "Pivot table" tool to apply the desired changes to the current table. After clicking it, the dialogue allows adding values to the different places. It is not allowed to repeat names between Columns and Rows, and at least one column has to be added to each section. The tool is inspired by the Excel pivot tables tool, in which the columns that need to be preserved from the current table should be selected in one of the sections to appear in the new table.
There are three different sections to add columns and values to the new table:
Next you can see a fragment of the result of this execution. In this case, the "reference column" is "Day", with all the unique values of the original column (the attribute selected in the Rows section).
The rest of the columns are all the unique values of the original "Station" column (the attribute selected in the Columns section). Every unique value is now a new column. As only one attribute has been selected, these columns only appear once, with the value as the column name. Every result at the intersection was aggregated with the selected operation.
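The pivot operation described above can be sketched in plain Python: each unique value of the Rows attribute becomes a key, and each unique value of the Columns attribute becomes a new column. The station names and water levels below are invented for illustration:

```python
# Invented sample rows: (Day, Station, Level).
rows = [
    ("2023-01-01", "Sau",      120.5),
    ("2023-01-01", "Susqueda", 210.0),
    ("2023-01-02", "Sau",      119.8),
]

# Pivot: "Day" goes to Rows, "Station" goes to Columns, "Level" to Values.
pivot = {}
for day, station, level in rows:
    pivot.setdefault(day, {})[station] = level

# Each day now has one column per station that reported a value that day.
print(pivot)
```

Where several source rows share the same (Day, Station) pair, TAPIS applies the chosen aggregation operation; this sketch simply keeps the last value.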
List of columns (attributes) in the resulting table
With the data of the resulting table, it is now possible to create a scatter plot of the evolution of the water level of one or more reservoirs.
Example 3: adding 2 columns to Columns
Example 4: adding 2 columns to Rows
Example 5: adding Columns and Values, without any Row
Example 6: adding Rows and Values, without any Column
In this example, we will start from a geopackage that represents the content of a SensorThings API. The same exercise could be done by getting the data directly from a STA service. These are the necessary steps:
First we open the geopackage. The result is the list of tables inside:
We are interested in two tables, the observations table and the feature of interest table. We can open both tables by clicking on the link and request adding them to the graph:
The feature of interest table contains a GeoJSON object that describes the position of the observed feature:
The observations table contains the actual result of the observation, the phenomenonTime and the identifier of the feature of interest:
The next step is joining both tables into a single table using the feature of interest identifier as the join key. In the observations table the key is called FeatureOfInterest, and in the featuresOfInterest table it is called iot_id:
The result of the join is a table that contains data from both tables.
The join operation has produced too many columns, some of them confusing and unnecessary, so it is necessary to create a new table selecting only the necessary fields: position, time and result.
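The join plus column selection can be sketched in Python; the key names FeatureOfInterest and iot_id are the ones mentioned above, while the sample rows are invented:

```python
# Invented sample rows for the two geopackage tables.
observations = [
    {"result": 42.0, "phenomenonTime": "2023-05-01T10:00:00Z",
     "FeatureOfInterest": 7},
]
features = [
    {"iot_id": 7,
     "feature": {"type": "Point", "coordinates": [2.1, 41.4]}},
]

# Index the features by their identifier, then join and keep only the
# fields we need: position, time and result.
by_id = {f["iot_id"]: f for f in features}
joined = [
    {"result": o["result"],
     "phenomenonTime": o["phenomenonTime"],
     "feature": by_id[o["FeatureOfInterest"]]["feature"]}
    for o in observations
    if o["FeatureOfInterest"] in by_id
]
```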
The result is the consolidated table with only what we need.
The result has the right content, but the data type of result and phenomenonTime is not the intended one. To fix that, the column meaning tool has a reevaluate link that comes in handy. By pressing the link in both column descriptions, we will get the data types corrected.
Here we can see the table with the data types recalculated.
The feature element includes the position in a format that is not easy to deal with. With the "Add a geospatial column" tool, we will be able to create two columns with latitude and longitude.
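For reference, extracting latitude and longitude from a GeoJSON Point is straightforward, remembering that GeoJSON stores coordinates as [longitude, latitude]; the coordinates below are invented:

```python
import json

# An invented GeoJSON Point, as stored in the feature column.
feature = json.loads('{"type": "Point", "coordinates": [2.1, 41.4]}')

# GeoJSON coordinate order is [longitude, latitude].
longitude, latitude = feature["coordinates"]
```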
The two columns with latitude and longitude are visible at the end of the table.
Finally, we got a table with positions, times and observation values. We are ready to compute the uncertainties based on the redundancies in the dataset. We need to select the four columns that represent the thematic value, the time and the position (latitude and longitude), and specify a temporal period and a positional distance that determine which observations will be considered the same because they are close in space and time.
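As an illustration of the idea (not necessarily the exact algorithm used by the tool), once a group of observations is considered "the same" because its members are close in space and time, the aggregated value can be their mean and the thematic uncertainty their standard deviation:

```python
from statistics import mean, stdev

# Three hypothetical redundant measurements that fall within the chosen
# positional distance and temporal period.
values = [12.1, 12.4, 11.9]

aggregated = mean(values)    # condensed (aggregated) value
uncertainty = stdev(values)  # thematic uncertainty estimate
```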
The uncertainty tool computes the uncertainties for the three components: spatial, temporal and thematic. The table is now condensed, and the aggregated values and their uncertainties are added.
Now we can request the assessment of the positional accuracy of the dataset.
A diamond logo and the text Metadata appear at the top of the table. This represents the result of the quality assessment.
Now we can request the assessment of the temporal accuracy of the dataset.
The diamond logo and the text Metadata on the top of the table accumulates the result of the second quality assessment.
Finally we can request the assessment of the thematic accuracy of the dataset.
The diamond logo and the text Metadata on the top of the table accumulates the result of the three quality assessments.
Now we would like to send the result to the immutable catalogue. To do that, we have to log in first. There is a login button in the upper part of the window that will request authentication.
After a successful login, you will see your name in the upper part of the window.
The immutable catalogue window will register the data and the metadata in the catalogue. We need to provide a title, author and the positional and temporal column names.
If all goes well, the creation will be completed and you will get a URL of the STAC record in the information area.
You can take the first part of the URL to go to the STAC catalogue (in my case https://ic.ogc.secd.eu/), select the Search button, request the right collection (in my case Immutable Catalog with DQ4STA 'SensorThings' Records) and use the last part of the URL provided as the item Id (in my case f5e72cfb-056a-45c6-aecb-3ef9f7649f39).
The result will give us a link to the immutable asset that we can follow to see the complete metadata of the asset, which includes the quality parameters that we carefully crafted.
In the page we can also find a download button.
The download process produces a tabular result in JSON with the table of data that we have created, which includes the position, time and observation values and their uncertainties.
This is the final complete diagram of all the steps we made: