Get and process historical data (live values)



Historical data (in the sense of Live Values) are useful for many use cases. This article covers how to get them and what to watch out for when processing them.

Data sources and access

OPC UA

OPC UA is a standard for data exchange from sensors to applications. It is implemented by various vendors, e.g., Kepware (KEPServerEX). The OPC Foundation provides a reference implementation of the OPC UA standard on GitHub: [1] One detail specified by the OPC UA standard is how servers can provide historical data, and the example code in that repository shows how to access it: [2]
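
For orientation, here is a minimal sketch of a raw history read using the OPC Foundation .NET stack (Opc.Ua, Opc.Ua.Client). It assumes an already connected Session named session; the node ID and time range are placeholders.

  // Read raw historical values for one node over the last 24 hours.
  var details = new ReadRawModifiedDetails
  {
      IsReadModified = false,
      StartTime = DateTime.UtcNow.AddDays(-1),
      EndTime = DateTime.UtcNow,
      NumValuesPerNode = 1000,          // server may return a continuation point for more
      ReturnBounds = false
  };

  var nodesToRead = new HistoryReadValueIdCollection
  {
      new HistoryReadValueId { NodeId = new NodeId("ns=2;s=Demo.Sensor1") }  // placeholder
  };

  session.HistoryRead(
      null,                             // default request header
      new ExtensionObject(details),
      TimestampsToReturn.Source,
      false,                            // keep continuation points alive
      nodesToRead,
      out HistoryReadResultCollection results,
      out DiagnosticInfoCollection diagnostics);

  if (StatusCode.IsGood(results[0].StatusCode)
      && ExtensionObject.ToEncodeable(results[0].HistoryData) is HistoryData data)
  {
      foreach (DataValue dv in data.DataValues)
          Console.WriteLine($"{dv.SourceTimestamp:u}  {dv.Value}  ({dv.StatusCode})");
  }

Large time ranges are typically delivered in chunks: a non-empty ContinuationPoint on a result means the call has to be repeated until the server reports completion.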

Aveva (Osi) PI

Aveva has acquired OSIsoft, the original creator of the PI System for the management of real-time operations data. There is also documentation on how to access data streams from it: [3]
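
As an illustration, the following sketch queries recorded values through PI Web API over HTTPS. The server URL and the stream's WebId are placeholders, the authentication scheme depends on the server setup, and the code must run inside an async context.

  using System.Net.Http;
  using System.Text.Json;

  var client = new HttpClient();
  client.DefaultRequestHeaders.Add("Authorization", "Basic ...");  // placeholder; Kerberos/OIDC also common

  string webId = "P0AbCdE...";                                     // placeholder WebId of the PI stream
  string url = $"https://pi-server.example.com/piwebapi/streams/{webId}/recorded"
             + "?startTime=*-7d&endTime=*";                        // last 7 days of recorded values

  string json = await client.GetStringAsync(url);
  using JsonDocument doc = JsonDocument.Parse(json);
  foreach (JsonElement item in doc.RootElement.GetProperty("Items").EnumerateArray())
      Console.WriteLine($"{item.GetProperty("Timestamp")}: {item.GetProperty("Value")}");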

Data Warehouse or Operational Data Store

For managing huge amounts of data from multiple sources, there are software suites and services like Snowflake. How to access the data depends on the specific product. However, if a large customer wants you to process historical data, it is a good idea to ask about DW or ODS solutions already in place.
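
As one example, Snowflake offers an ADO.NET driver (the Snowflake.Data NuGet package). The connection parameters, table and column names below are placeholders.

  using Snowflake.Data.Client;

  using var conn = new SnowflakeDbConnection();
  conn.ConnectionString =
      "account=myaccount;user=myuser;password=...;db=PLANT;schema=PUBLIC;warehouse=COMPUTE_WH";
  conn.Open();

  using var cmd = conn.CreateCommand();
  cmd.CommandText = @"SELECT ts, value FROM sensor_history
                      WHERE tag = 'TI-4711' AND ts >= DATEADD(day, -7, CURRENT_TIMESTAMP())
                      ORDER BY ts";                    // placeholder tag and table names

  using var reader = cmd.ExecuteReader();
  while (reader.Read())
      Console.WriteLine($"{reader.GetDateTime(0):u}: {reader.GetDouble(1)}");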

Recording historical data with UBIK®

This is not recommended, because there are systems specialized in recording, storing and streaming operational data, and UBIK® isn't one of them.

Applications

Mobile access

Charts

There is a dedicated article on displaying historical values and trends on UBIK® clients as charts. That article ends where this one begins: where to get the data from. So, provided we know our data source and how to access it, how do we fill in the missing link? As the chart article describes, data is either imported using UBIK® proxies or fetched on demand via custom code. In both cases, we need to read the required values from the data source. In the import case, an IInterfaceExecutor implementation fetches the values from the external system and writes them to a proxy. In the on-demand case, the custom C# property overriding the UBIK® property's behavior needs to call code that reads the values from the data source and converts them to chart data. Depending on the frequency of read operations, even on-demand retrieval should use some caching to avoid excessive communication with the data source.
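
The IInterfaceExecutor and property-override signatures are UBIK®-specific and not shown here; the following is only a minimal, hypothetical sketch of the caching idea for on-demand retrieval, with the fetch delegate standing in for whatever code reads the external data source.

  using System;
  using System.Collections.Concurrent;

  // Hypothetical time-based cache: values older than the TTL are re-fetched.
  public class HistoryCache
  {
      private readonly ConcurrentDictionary<string, (DateTime FetchedAt, double[] Values)> cache = new();
      private readonly TimeSpan ttl;
      private readonly Func<string, double[]> fetch;   // reads from the actual data source

      public HistoryCache(TimeSpan ttl, Func<string, double[]> fetch)
      {
          this.ttl = ttl;
          this.fetch = fetch;
      }

      public double[] Get(string tag)
      {
          if (cache.TryGetValue(tag, out var entry) && DateTime.UtcNow - entry.FetchedAt < ttl)
              return entry.Values;                     // still fresh, no external call

          double[] values = fetch(tag);                // expensive call to the source system
          cache[tag] = (DateTime.UtcNow, values);
          return values;
      }
  }

A property override could then serve chart data from such a cache, e.g. new HistoryCache(TimeSpan.FromMinutes(5), ReadFromSource), so repeated chart refreshes within five minutes don't hit the source system again.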

Lookup

For maintenance purposes, it might be necessary to look up historical measurement values dynamically. This use case is very similar to charting, but the required data points or time frame can be more dynamic. This is a challenge because it means we either have to import all data that might be required or, if offline availability is not an issue, we need a good way to query the source system, both without causing performance issues. In all but the most esoteric cases, the best practice is not to persist any measurement data in UBIK®, but instead to create only temporary instances and to read the data from the source system based on customized query view items. However, offline capability is a feature often required in mobile UBIK® use cases: even without a network connection, the client has to remain fully functional. In that case, we not only have to read all potentially required data from the source system, we also have to send it to the client. It is of paramount importance to understand that these data sets can be huge. So whatever we do, we have to follow two guidelines:

  • Only use a subset of the data. Nobody needs all data every day. If the data can be organized into different topics, create separate hierarchy branches. Download only the relevant data in the office, to use it offline in the field.
  • Encode and structure the data efficiently. If the data is read-only and doesn't need to be queried, a single pre-compiled string containing many values will minimize the overhead that a large number of property data transfer objects would incur (see the sketch after this list).
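
As an illustration of the second guideline, here is a minimal sketch that packs timestamp/value pairs into one compact string and decodes it again. The separator characters and the tick-based timestamp format are arbitrary choices for this example, not a UBIK® convention.

  using System;
  using System.Globalization;
  using System.Linq;

  // Encode many (timestamp, value) pairs as "ticks:value;ticks:value;..."
  // so they travel as a single property instead of thousands of DTOs.
  static string Encode((DateTime Time, double Value)[] points) =>
      string.Join(";", points.Select(p =>
          p.Time.Ticks.ToString(CultureInfo.InvariantCulture) + ":" +
          p.Value.ToString("R", CultureInfo.InvariantCulture)));

  static (DateTime Time, double Value)[] Decode(string encoded) =>
      encoded.Split(';', StringSplitOptions.RemoveEmptyEntries)
             .Select(pair => pair.Split(':'))
             .Select(parts => (
                 new DateTime(long.Parse(parts[0], CultureInfo.InvariantCulture)),
                 double.Parse(parts[1], CultureInfo.InvariantCulture)))
             .ToArray();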

Challenges - approaches

Irregularity - interpolation

Operational data can arrive at irregular intervals, and there can be gaps. This makes it difficult to render a clean trend chart. Interpolation algorithms can help fill the gaps and resample the data at uniform intervals. Spline interpolation is widely used because, unlike high-degree polynomial interpolation, it behaves well even with many data points.
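
To make this concrete, here is a minimal natural cubic spline sketch (the textbook formulation, with second derivatives forced to zero at both ends) that resamples an irregular series at uniform steps. Input timestamps are assumed strictly increasing; production code would more likely use a library such as Math.NET Numerics.

  // Second derivatives for a natural cubic spline through (x[i], y[i]);
  // x must be strictly increasing.
  static double[] SecondDerivatives(double[] x, double[] y)
  {
      int n = x.Length;
      var y2 = new double[n];   // y2[0] = y2[n-1] = 0: natural boundary conditions
      var u = new double[n];
      for (int i = 1; i < n - 1; i++)                  // tridiagonal forward sweep
      {
          double sig = (x[i] - x[i - 1]) / (x[i + 1] - x[i - 1]);
          double p = sig * y2[i - 1] + 2.0;
          y2[i] = (sig - 1.0) / p;
          u[i] = (y[i + 1] - y[i]) / (x[i + 1] - x[i])
               - (y[i] - y[i - 1]) / (x[i] - x[i - 1]);
          u[i] = (6.0 * u[i] / (x[i + 1] - x[i - 1]) - sig * u[i - 1]) / p;
      }
      for (int k = n - 2; k >= 0; k--)                 // back substitution
          y2[k] = y2[k] * y2[k + 1] + u[k];
      return y2;
  }

  // Evaluate the spline at xv inside [x[0], x[n-1]].
  static double Spline(double[] x, double[] y, double[] y2, double xv)
  {
      int lo = 0, hi = x.Length - 1;
      while (hi - lo > 1)                              // bisect to the bracketing interval
      {
          int mid = (lo + hi) / 2;
          if (x[mid] > xv) hi = mid; else lo = mid;
      }
      double h = x[hi] - x[lo];
      double a = (x[hi] - xv) / h, b = (xv - x[lo]) / h;
      return a * y[lo] + b * y[hi]
           + ((a * a * a - a) * y2[lo] + (b * b * b - b) * y2[hi]) * h * h / 6.0;
  }

Uniform resampling is then a loop over the time range: compute y2 once with SecondDerivatives(x, y), then evaluate Spline(x, y, y2, t) for t = x[0], x[0] + step, and so on.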

Varying quality - filtering and smoothing

Measurements can be erroneous, imprecise and noisy, leading to outliers and plots that are hard to interpret. This can be mitigated by applying (smoothing) filters to the data. There are many smoothing techniques with different strengths; here are some you can explore: Gaussian kernel smoother, Whittaker-Eilers smoother, spline smoothing.
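
Of the techniques listed, the Gaussian kernel smoother (a Nadaraya-Watson style weighted average) is the simplest to sketch. The bandwidth sigma is a tuning parameter, and the naive O(n²) loop is fine for the moderate series sizes typical of charts.

  using System;

  // Replace each point by a Gaussian-weighted average of all points,
  // weighting by distance in time; works for irregular timestamps too.
  static double[] GaussianSmooth(double[] t, double[] v, double sigma)
  {
      var smoothed = new double[v.Length];
      for (int i = 0; i < v.Length; i++)
      {
          double weightedSum = 0, weightTotal = 0;
          for (int j = 0; j < v.Length; j++)
          {
              double d = (t[j] - t[i]) / sigma;
              double w = Math.Exp(-0.5 * d * d);       // Gaussian kernel weight
              weightedSum += w * v[j];
              weightTotal += w;
          }
          smoothed[i] = weightedSum / weightTotal;
      }
      return smoothed;
  }

A larger sigma smooths more aggressively; a smaller one preserves more detail (and more noise).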
