cancel
Showing results for 
Search instead for 
Did you mean: 
cancel
Showing results for 
Search instead for 
Did you mean: 

Community Tip - Your Friends List is a way to easily have access to the community members that you interact with the most! X

How to score new data with ThingWorx Analytics 52.x till 8.0 ?

No ratings

The following is valid  for ThingWorx Analytics (TWA) 52.0.2 till 8.0

For release 8.3.0 and above see How to score new data in ThingWorx Analytics 8.3.x ?

 

Overview

  • The main steps are as follow:
    1. Create a dataset
    2. Configure the dataset
    3. Upload data to the dataset
    4. Optimize the dataset
    5. Create filters for training and scoring data
    6. Train the model
    7. Execute scoring on existing data
    8. Upload new data to dataset
    9. Execute scoring on new data
  • TWA models are dataset centric, which means a model created with one dataset cannot be reused with a different dataset.
  • In order to be able to score new data, a specific feature, record purpose in the below example, is included in the dataset.
  • This feature needs to be included from the beginning when the data is first uploaded to TWA.
  • A filter on that feature can then be created to allow to isolate desired data.
  • When new data comes in, they are added to the original dataset but with a specific value for the filtered feature (record purpose), which allows to discriminate and score only those new records.

Process

  1. Create a dataset
    • This example uses the beanpro demo dataset
    • Create dataset is done through a POST on datasets REST API as below

createDataset.png

2. Configure dataset

    • This is done through a POST on <dataset>/configuration REST API

ConfigureDataset.png

3.      Upload data

        UploadData.png

4.      Optimize the dataset

        OptimizeDataset.png

5.      Create filters

    • The dataset includes a feature named record purpose created especially to differentiate between the rows to be used for training and the rows to be used for scoring.
    • New data to be added will have record purpose set to scoringnew, which will allow to execute a scoring job limited to those filtered new rows
    • Filter for training data:

TrainFilter.png

    • Filter for new scoring data

       scoreFilter.png

6.      Train the model

  • This is done through a POST on <dataset>/prediction API

       trainModel.png

7.      Score the training data

    • This is done through a POST on <dataset>/predictive_scores API.
    • Note the use of the filter TrainingData created earlier.
    • This allow to score only the rows with training as value for record purpose feature.
    • Note: scoring could also be done without filter at this stage, in which case all the data in the dataset will be scored and not just the ones with training fore record purpose

scoretraindata.png

 

  • Retrieving the scoring result show all the records in the dataset:

trainResult.png

 

8.      Upload new data

    • The newly uploaded csv file should only contains new record.
    • This will be appended to the existing ones.

uploadNewData.png

 

  • Note that the new record (it could be more than one) has a value scoringnew for the record purpose feature:

newrecord.png

  • This will allow to use the previously created filter ScoringNewData so that a new scoring job will only take into account this new record.

 

9.      Scoring new data

    • A POST on API predictive_scores is executed however using the filter ScoringNewData.
    • This results in only the newly added data to be scored and therefore a much quicker execution time too.

scoreNewData.png

    • Retrieving the scoring result shows only the new record:

newResult.png

Comments

Hi Christophe Morfin​,

This is a very good tutorial indeed for beginners as compared to the one explained in the guide https://developer.thingworx.com/resources/guides/thingworx-analytics-quickstart

I followed all step mentioned above, but when I executed the step no 9 i.e. Scoring new data ​with filter ScoringNewData, I am getting the below error.

com.coldlight.ai.pmml.InvalidDataException: Unexpected value(s) found in Record with identifier: [71]. Feature [record_purpose] had value of [scoringnew]


I have used serial to uniquely identify rows, [71] is the row number 71.

Thanks,

Azim

Hi Azim

I think I know why you are getting this and you are bringing a good point, that seems to be missing in the blog.

The definition of the record_purpose field (or whatever name you gave it) needs to be defined with displayOnly set to true so that the algorithm knows to not take this field into account in the analysis.
Indeed for string feature, ThingWorx Analytics needs to have seen all possible values during training otherwise it will report an error "Unexpected value(s) found ..."

In your case if the record_purpose field is not set with displayOnly to true, you will need to create a new dataset with this change in the metadata json file.

The difinition woudl look something like this:

{

        "dataType": "STRING",

        "description": "record purpose",

       "displayOnly": true,

        "fieldName": "record_purpose",

        "lever": false,

        "objective": false

    },

Hope this helps

Kind regards

Christophe

Hi Christophe Morfin​,

The above solution  had resolved the issue. However, as a beginner I am finding very difficulty to know the result as explained in step 9. What are these variables PREDICTED GOAL 'CONDITION' SCORE, PREDICTED GOAL 'CONDITION' MIN, PREDICTED GOAL 'CONDITION' MAX. What does they say. Can you please also share with us training and testing files. It will be really helpful for beginner to know all these parameter.

Thanks,

Azim

Hi Azim

The PREDICTED GOAL 'CONDITION' SCORE is probable the one you want to look at if you want a single value

PREDICTED GOAL 'CONDITION' MIN, PREDICTED GOAL 'CONDITION' MAX represent the range into which the predicted value can fall (remember this is using machine learning ie statistics  ).  Article Viewer | PTC might help here.

Also the Getting started article should be useful for you to find  references:  https://www.ptc.com/en/support/article?n=CS248832

Regards

Christophe

Hello Christophe Morfin​,

I wanted to know currently what are the machine learning algorithm supported by Thingworx Analytics.

Thanks,

Azim

Hi Azim

You can find the answer at https://www.ptc.com/en/support/article?n=CS254667

Kind regards

Christophe

Hi Christophe Morfin​,

How the pre processing of data is done in thingworx analytics. Like missing value treatment and outlier treatment for paramteric model and tunning and pruning for non paramteric model?

Does Optimize API of thingworx Analytics does the job for me?

Thanks,

Azim

Hi Azim

The following links should answer some of your questions:

https://community.thingworx.com/community/developers/blog/2016/08/30/best-practices-in-data-preparation-for-thingworx-an…

https://www.ptc.com/en/support/article?n=CS228295

If you still have questions, I would suggest you open a case to Technical Support

Kind regards

Christophe

Hi Christophe Morfin​,

Thanks for sharing this with me. Essentially the data has to be prepared before feeding the data into thingworx analytics.

Thanks,

Azim

Version history
Last update:
‎Aug 30, 2016 11:09 AM
Updated by:
Labels (2)