1 Reply Latest reply on Jan 18, 2018 4:19 PM by rkandasamy-2
    rkandasamy-2 Newbie

    Query based on the semi structured data

    Hi Team,


we have a scenario where we store a set of values, as shown below, in a ThingWorx DataTable.




Here the order number is the primary key and BoardsArray holds an array of serial numbers; each record is (OrderNumber, Client, BoardsArray, Status), as follows:


    ('Order1','Client1','{"SN1", "SN2"}','ACTIVE');

    ('Order2','Client2','{"SN2", "SN3","SN4","SN5","SN6","SN7","SN8","SN9","SN10"}','ACTIVE');

    ('Order3','Client3','{"SN2", "SN3","SN4"}','ACTIVE');

    ('Order4','Client4','{"SN2", "SN3","SN1"}','ACTIVE');


So, given only the value SN1 as input, is it possible to fetch the Order1 and Order4 records using a query?


In PostgreSQL this is possible as follows:


    SELECT * FROM TableName WHERE 'SN1' = ANY (BoardsArray::text[]);

    (This returns the Order1 and Order4 records, whose BoardsArray field contains SN1.)
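As an aside, if BoardsArray is stored as a native Postgres array column (text[]) rather than a plain string, the array containment operator is an equivalent form that can also use a GIN index (a sketch, assuming the table/column names from the example above):

```sql
-- Optional index; GIN supports array containment queries on text[]
CREATE INDEX IF NOT EXISTS idx_boards ON TableName USING GIN (BoardsArray);

-- Rows whose BoardsArray contains SN1 (same result as the ANY() form)
SELECT * FROM TableName WHERE BoardsArray @> ARRAY['SN1'];
```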



Straightforward method:


    1. Query the DataTable entries without any filter, which returns all the records in the table.

    2. Loop through those records and perform a string/array operation on each BoardsArray field to check whether the input serial number exists in it.
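The two steps above can be sketched in plain JavaScript. In ThingWorx the rows would come from the standard `QueryDataTableEntries` DataTable service (the table name `OrdersTable` below is an assumption); here plain objects stand in for the InfoTable rows so the filtering logic is self-contained:

```javascript
// Brute-force approach: fetch every row, then filter in memory.
// In a ThingWorx service the rows would come from something like:
//   var rows = Things["OrdersTable"].QueryDataTableEntries({ maxItems: 1000000 }).rows;
// (table name is illustrative)

function findOrdersWithSerial(rows, serial) {
  return rows
    .filter(function (row) {
      // BoardsArray is stored as a string like '{"SN1", "SN2"}';
      // strip braces/quotes and split it into individual serial numbers.
      var serials = row.BoardsArray
        .replace(/[{}"]/g, "")
        .split(",")
        .map(function (s) { return s.trim(); });
      return serials.indexOf(serial) !== -1;
    })
    .map(function (row) { return row.OrderNumber; });
}

// Sample data mirroring the records in the question
var rows = [
  { OrderNumber: "Order1", Client: "Client1", BoardsArray: '{"SN1", "SN2"}', Status: "ACTIVE" },
  { OrderNumber: "Order2", Client: "Client2", BoardsArray: '{"SN2", "SN3","SN4","SN5"}', Status: "ACTIVE" },
  { OrderNumber: "Order3", Client: "Client3", BoardsArray: '{"SN2", "SN3","SN4"}', Status: "ACTIVE" },
  { OrderNumber: "Order4", Client: "Client4", BoardsArray: '{"SN2", "SN3","SN1"}', Status: "ACTIVE" }
];

console.log(findOrdersWithSerial(rows, "SN1")); // → [ 'Order1', 'Order4' ]
```

This works, but as noted below it requires pulling the entire table into memory first, which is exactly the cost the question is trying to avoid.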


    With this method, I am bringing all the table entries into ThingWorx, which is a memory-intensive operation (consider the case of 1 million records).

    None of the filter methods described in this link (Digital Media Publisher) really helps in this situation.

    Please let me know if there is a better approach for this scenario.

      • Re: Query based on the semi structured data
        wposner-2 Creator

        If you are going to have more than 500k records, you should not use a DataTable. Instead, create a separate relational schema (you can do this in your Postgres DB), add that instance as a persistence provider via the JDBC connector, and create a new Thing to execute SQL commands and/or stored procedures directly against your custom schema.
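A sketch of what that separate relational schema could look like (all names here are illustrative, not part of the original answer): normalizing the serial numbers into a child table turns the lookup into an indexed join instead of an array scan.

```sql
-- Illustrative schema: one row per order, one row per board serial
CREATE TABLE orders (
    order_number text PRIMARY KEY,
    client       text NOT NULL,
    status       text NOT NULL
);

CREATE TABLE order_boards (
    order_number  text REFERENCES orders(order_number),
    serial_number text NOT NULL
);

CREATE INDEX idx_order_boards_sn ON order_boards (serial_number);

-- "Which orders contain SN1?" becomes an indexed lookup:
SELECT o.*
FROM orders o
JOIN order_boards b ON b.order_number = o.order_number
WHERE b.serial_number = 'SN1';
```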


        DataTables do not scale well and will severely affect your performance if you're planning on 1 million+ rows of data.