logged infotable properties

iguerra
14-Alexandrite


I have infotable property data that I would like to set as logged (to retrieve later with QueryInfoTablePropertyHistory), with these characteristics:

1. just one row

2. usually only a few fields, about 3 to 5

I would like to know about the database storage efficiency of logged infotables (changes will be saved into the value stream of the thing):
is it much less efficient to use logged infotables than to have direct logged properties on the thing?
(I know that the whole infotable is saved at every change ... not so efficient)
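
For reference, this is roughly how I would read the data back later - a minimal sketch in a ThingWorx service; "configTable" is just an example property name and the parameters follow the standard Query*PropertyHistory services as far as I know:

    // Minimal sketch of reading the logged history back from the value stream.
    var history = me.QueryInfoTablePropertyHistory({
        propertyName: "configTable",   // example name of the logged infotable property
        maxItems: 100,
        oldestFirst: false
    });
    // each returned row holds a timestamp plus the full infotable value saved at that change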

 

So the case is:

A. a thing with one logged infotable property, with 5 fields and just 1 row

B. a thing with 5 logged properties (with the same types as the fields of the infotable's data shape)

Data changes happen every 1 minute.
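
To make the two cases concrete, a single data change would look roughly like this in each case (just a sketch; Thing, property and data shape names are examples):

    // Case A: one logged infotable property, rewritten as a whole at every change
    var row = Resources["InfoTableFunctions"].CreateInfoTableFromDataShape({
        infoTableName: "configTable",
        dataShapeName: "DeviceConfigShape"   // example data shape with the 5 fields
    });
    row.AddRow({ field1: "test", field2: 0, field3: 1.5, field4: 2, field5: "x" });
    Things["DeviceA"].configTable = row;     // one value-stream entry containing the whole table

    // Case B: five plain logged properties, each write is its own value-stream entry
    var b = Things["DeviceB"];
    b.field1 = "test";
    b.field2 = 0;
    b.field3 = 1.5;
    b.field4 = 2;
    b.field5 = "x";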

 

What will the storage size increase (of the value stream) be between the two solutions?
Is there a preferred way? (I would prefer A, to group some data together better!)

 

Thanks

 

1 ACCEPTED SOLUTION

Accepted Solutions

There's a lot of context that can change the decision here and I can't offer a suggestion that's "the best", but you'd need to be aware of some things:

1. Infotables are the only primitive type that does not support concurrent writes. If you have high-speed, high-volume writes to the same infotable property, you might see errors related to this in the logs. You might encounter this if you have a storm of property writes, which can happen after a longer disconnection.

2. Infotables are also slightly more difficult to work with than a standard property, due to the nature of the manipulation services around them. They just require a bit more code.

3. If you want to check the storage size increase, create a load script and verify the table size directly - that's the most realistic way of determining this (see the sketch after this list). You can do the math, but databases optimize their storage, so this is not easily quantifiable as an equation.

4. Potentially be prepared for 5x more data if your infotable is sourced from 5 different properties and each receives its value with a different timestamp. If, on the other hand, you receive all those 5 properties with the exact same timestamp, then you're good on this point.
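
For point 3, the load script can be as simple as a loop in a test service that rewrites the property many times, after which you compare the value stream growth. A rough sketch - Thing, property and data shape names are placeholders:

    // Rough load-script sketch: generate many writes against a test Thing for each
    // option, then compare the value stream / database table growth afterwards.
    var iterations = 10000;   // pick something large enough to be measurable
    for (var i = 0; i < iterations; i++) {
        // option A: rewrite the whole one-row infotable
        var it = Resources["InfoTableFunctions"].CreateInfoTableFromDataShape({
            infoTableName: "it",
            dataShapeName: "DeviceConfigShape"   // placeholder data shape
        });
        it.AddRow({ field1: "test" + i, field2: i, field3: i * 2, field4: i % 10, field5: "x" });
        Things["TestThingA"].configTable = it;

        // option B: write the five separate logged properties
        var b = Things["TestThingB"];
        b.field1 = "test" + i;
        b.field2 = i;
        b.field3 = i * 2;
        b.field4 = i % 10;
        b.field5 = "x";
    }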

 

 


4 REPLIES


iguerra
14-Alexandrite
(To:VladimirRosu)

We use PTC-managed cloud instances ... so I probably can't read the size of a particular value stream (for a test).

For sure we don't have fast/frequent changes on this infotable data (maybe 4-5 per minute), and I know about the concurrent-writes limit.

I understood your point 4, and I'll take it into account.

Moreover, with 5 separate properties, each one will have its own timestamp on the logged data, whereas the infotable has just one timestamp for the whole table (a plus point for it) ... so if all 5 value changes come with the same timestamp, the final database growth will probably be similar.

Thanks

One last thing I would like to add.

The process of determining the infotable timestamp itself can be a bit difficult if the properties arrive at different times.

I'm saying this because I cannot tell you now which timestamp you should use out of all 5 properties, especially if they arrive in parallel (remember they can arrive in parallel due to ThingWorx's parallel nature). The easiest thing would be to aggregate at the source and send all of them together, but even there you still need to answer the same question (maybe it's easier there?).
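
Just to illustrate the kind of decision that has to be made somewhere: if the source aggregates the 5 values and sends their individual timestamps along, one possible (but arbitrary) policy is to stamp the whole row with the most recent of them. A plain sketch, with a hypothetical payload shape:

    // Hypothetical helper: given the 5 values aggregated at the source, each with its
    // own timestamp, stamp the whole row with the most recent one. This is only one
    // possible policy - choosing it is exactly the open question above.
    function pickRowTimestamp(values) {
        // values: e.g. [{ name: "field1", value: 12.3, time: 1696156800000 }, ...]
        var latest = 0;
        for (var i = 0; i < values.length; i++) {
            if (values[i].time > latest) {
                latest = values[i].time;
            }
        }
        return new Date(latest);   // single timestamp for the whole infotable row
    }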

iguerra
14-Alexandrite
(To:VladimirRosu)

For sure.
I consider the data saved in an infotable property as a single container: the whole infotable is saved with a single timestamp.

For me this is good for "configuration/setup parameters" of a specific device, where the whole data set matters, not the changes and timestamps of individual columns.

If data comes with different timestamps ... not a problem: when an update of a column arrives, I just write the whole updated infotable (it will be saved with the current timestamp).
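
In practice that update is trivial, something like this (just a sketch; property and field names are examples):

    // Sketch: update a single column and write the whole one-row infotable back;
    // the new value-stream entry gets the current timestamp.
    var updated = me.configTable.clone();   // copy of the current logged infotable value
    updated.rows[0].COL2 = newValue;        // "newValue" is whatever just arrived for that column
    me.configTable = updated;               // rewriting the property logs the whole table again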
When retrieving the infotable history, I get the same kind of data as if I had used QueryPropertyHistory on INTEGER or NUMBER properties, that is, "repeated" values for the properties that did not change, with the timestamp of the property (or properties) that did change, something like this:

TIMESTAMP           COL1     COL2
01-01-2001 10:00    test     0
01-01-2001 10:01    test     1
01-01-2001 10:02    test     2
01-01-2001 10:03    test1    2

 

Yes, parallel writes may be a problem ... but in my case changes are very, very slow, and if multiple changes have to be transferred they are aggregated together at the source ... so I do a single infotable update.

If parallel data could arrive, some queuing would be needed ... fortunately that is not my case.

 

 

 

 
