QueryPropertyHistory Service with Row Control/Limi...

pnash · ‎Apr 07, 2016

When you have Things with large numbers of properties and/or very frequent logging, it is often difficult to get a sensible data set that can be (time-series) charted in a timely manner at runtime.

We have experienced this with a number of projects, so came up with a service that gives you direct control over the rows returned, and therefore control over the browser performance/end user experience.

We created a service which could directly replace the built-in QueryPropertyHistory service but unlike the built-in service, this respects the startDate, endDate AND maxItems inputs.

Basically what it does is, taking a given startDate and endDate (or applying defaults if none are specified), it will divide the duration of the query up equally by the number rows you want. Next it will query single row values throughout that period and find the values of properties at that time. (The overriding assumption is, for example, that if my remote device tells me a value is 4, it is 4 until the device tells me it isn't 4. That's kind of how value streams work anyway.)

In short, we didn't attempt any interpolation on this version of the service.

If you want to use this service with a time-series chart in a mashup where you can select pens at runtime, you will can to wrap this service again inside a new one that provides a fixed-DataShape, InfoTable output to allow you to attached columns to pens. A second service would be required in the mashup to provide the Label fields for the Chart.

You will create the service on a Thing/Shape/Template and define the same inputs and outputs that the QueryPropertyHistory services use (maxItems (your required row count), startDate, endDate, query, oldestFirst). Default startDate is NOW, default endDate is 24 hours ago - this is done in the code below)

NOTE: If you start using with this with charts, and before you ask the service for 10,000 rows (which won't be fast to execute!), consider how many pixels your chart uses in the browser and remember that the point of this is to limit rows to sensible values (perhaps 1 per 1/2/3 pixels will give you something looks nice. For a most Trends, we've found that 250 to 500 rows is usually more than enough.)

/* *** QueryPropertyHistoryRowLimited *** */

//Does query of a fixed number of value stream points between startDate and endDate

//set default values for properties undefined

var ed = endDate;

var sd = startDate;

var mi = maxItems;

//if endDate is undefined, choose NOW as the endDate

if (ed == null) {

ed = new Date();

}

//if startDate is undefined or greater than the endDate, choose 24 hours ago as the startDate

if ((sd == null) || (sd > ed)) {

var dateValue = new Date();

sd = dateAddHours(dateValue, -24);

}

//if maxItems is null or undefined, set to 100 rows of data as default

if ((mi == null) || (mi == undefined)) {

mi = 100;

}

//calculate the time period between rows of data

var startUTC = Date.UTC(sd.getFullYear(), sd.getMonth(), sd.getUTCDate(), sd.getUTCHours(), sd.getUTCMinutes(), sd.getUTCSeconds());

var endUTC = Date.UTC(ed.getFullYear(), ed.getMonth(), ed.getUTCDate(), ed.getUTCHours(), ed.getUTCMinutes(), ed.getUTCSeconds());

var timeSlice = Math.round((endUTC -startUTC) / (mi - 1));

//setup the date for query on the first loop iteration

var loopDateUTC;

var count = 0;

//define the sort order, based on user selection

if(oldestFirst) {

loopDateUTC = startUTC;

} else {

loopDateUTC = endUTC;

}

//loop through for maxItems-1 iterations and get one row for the named properties each time

for (count=0; count < (mi); count++) {

var loopDate = new Date();

loopDate.setTime(loopDateUTC);

var params = {

oldestFirst: false /* BOOLEAN */,

maxItems: 1 /* NUMBER */,

endDate: loopDate /* DATETIME */,

query: query /* QUERY */,

startDate: undefined /* DATETIME */

};

//if this is the first iteration, create the output table from the first result set

if (count == 0) {

//get one row of data to obtain the structure

var outputTable = me.QueryPropertyHistory(params);

outputTable.RemoveAllRows(); //clear table rows for readding later

}

//get one row of data to add to our table

var propertyHistory = me.QueryPropertyHistory(params);

var row = propertyHistory.getRow(0);

//set the datetime of the row to match the Query endDate

row.timestamp = loopDate; //because we know that timestamp will be our column name

outputTable.AddRow(row);

//prepare for next iteration

if(oldestFirst) {

loopDateUTC = loopDateUTC + (timeSlice);

} else {

loopDateUTC = loopDateUTC - (timeSlice);

}

var result = outputTable;

jasong · ‎Apr 13, 2016

Patrick, from what I know about how data is stored in ThWx, and please, folks who know more than me please correct me if I am wrong, I think querying a Stream for a specific single row result over and over is very inefficient and no good for large datasets. This is because ThWx loads the entire dataset into memory on each query before returning the single row, and that is what you have looped there with QueryPropertyHistory. What might be more memory efficient would be query the full data set given your overall bounds into memory/javascript service just once, and then have your loop query against the in resulting infotable using the Infotable Functons resource. OR possibly even doing a for loop over the rows yourself looking for matches (I say that because Alex Ensminger has seen some of those infotable functions to be quite slow. He can comment further.)

See this entry from Adam Ressler about how data is stored in ThWx: Where Should I Store My Thingworx Data?

"... currently Thingworx will do a full scan of the data when querying because all of the logic for the query or filter is done on the platform, not on the database (this will likely change in a future version). So small amounts of data can be quickly loaded and parsed in memory."

Anyone ThWx devs have any thoughts on this? It might be that since you are querying a PropertyStream and only on endDate, that the platform can operate efficiently by itself. I'm not sure,

pnash · ‎Apr 13, 2016

Hi Jason

You may be right about the nature of the queries and data storage in ThingWorx.

This was much more about speeding up presentation of the results, than query execution time. In our tests, rendering took way longer for very large datasets than querying does (or did)... In our cases, we had the data on the same box as the platform, so queries are executed pretty quick, even when large datasets are stored.

For the moment, we are seeing these queries execute and produce results in 1-2 seconds, with many tens of thousands of rows of source data. This is versus several minutes if we used the 'stock' services to populate the time-series chart for a bounded range.

After looking at many posts, I am still unsure if bounding using startDate/endDate and/or maxRows in QueryNamedPropertyHistory DOES return a full dataset before the bounds are applied.

Can someone from ThingWorx comment on this please?

The services posted here ARE a quick and dirty way to get the number or rows you want, over the period you want them, and is provided 'as-is' for folks to consume. Should the need arise (for us), we will revise these two and repost sometime in the future (once our data quantities are significantly larger).

It is also worth highlighting, these services MAY NOT display any peaks and troughs in your data that occur during the period specified. Particularly if they are transient and the duration is less than (endDate - startDate) / maxRows

Thanks very much for the feedback Jason, much appreciated!

tcoufal · ‎Aug 11, 2016

Hi,

as far as my understanding goes, ValueStreams and Streams are Time-indexed in the DB so calling QueryPropertyHistory (QueryStreamEntries resp.) will cause to bind StartTime and EndTime, (i am not sure about MaxItem) directly to the SQL query it self (its like the ORM), thus returning limited amount of data set. Query is than applied as an internal InMemory process of Thingworx itself.

On another hand DataTables are not Time-indexed. You can choose to add your own indexes for that table, but they are applied only when FindDataTableEntries service is called, not on Query services.. Since DataTable Query services do not take StartDate nor EndDate, the whole content of that DataTable is retrieved to Thingworx and than Query is applied, (if your DataShape used for that table contains Timestamp you can filter it or sort it, but the whole table is there non the less). That is why DataTables should be used only for static, lookups or as a simple builtin sort-of DB tables (costumers, orders, contacts and so on.) and not for Sensor Data or any other TimeSeries data used for plotting.

Please feel free to comment if I made a mistake anywhere in this post

Tomas

preddy-2 · ‎Apr 14, 2016

Hi Patrick,

I saw TW's QueryPropertyHistory service takes inputs of 'start date', 'end date' and 'maxitems' and give the results accordingly.

Is your service behaves different than built in one?

Thank you.

With B/R,

Praveen.B

StefanBode · ‎Apr 15, 2016

Yes indeed it do. The TWX service always pushes ALL values in the given timeframe causing long refresh time on the client with larger datasets. The provided function is more intelligent because is reduce the data on the server by the average function and deliver only maxitems values to the graph. This can be viewed in lightspeed :-)

From my point of view definitely missing in TWX. additionally you don't need more points that pixel you show in your graph. This is a common function of most BI tools.

QueryPropertyHistory Service with Row Control/Limiting

QueryPropertyHistory Service with Row Control/Limiting