6 Replies Latest reply on Sep 19, 2017 10:35 AM by romanl
    yongkim Explorer

    Questions about profiling

    This question is about profiling.

    I want to know exactly what this profiling means.

    Can you explain why some features are grouped as shown below? Also, the four features each have Avg Goal and % Diff vs Avg values; what do they mean?

     

    Regards.

      • Re: Questions about profiling
        nsampat Apprentice

        Yong,

         

        Profiles are used to identify a group of records with similar characteristics leading to a similar/same outcome.

         

        The reason why the items are grouped in your screenshot is that these records have similar characteristics that result in the same outcome.

         

        The % Diff vs Avg values show that these items differ from the overall average in a similar way, which is why they can be grouped together.

         

        I have also reached out to you via email with the same information.

         

        Please let us know if this answers your questions.

         

        Regards,

         

        Neel

        • Re: Questions about profiling
          yongkim Explorer

          Hi~~Neel

          Thank you for your reply.

           

          I would like to ask a more specific question.

          In the table above, the % Diff value for Cutter Type is shown as -68.08%. What does this -68.08% mean? It would be helpful if you could explain it using the example above.

           

          Yongjune.

            • Re: Questions about profiling
              cmorfin Communicator

              Hi Yong June

               

              Let me try to explain in other words.

               

              The profile job was created to minimize the occurrence of the goal (see Maximize set to False). This means that ThingWorx Analytics attempted to find groups of records that are less likely to achieve the goal.

              In profile 1, if we select from the population the records with Cutter Type set to Laser, this represents 28474 records, and they have a 7% chance of achieving the goal. If within this sub-population we take only the records with Pump Head Type set to Twist_Range, then we have a 1% chance of achieving the goal.

              The complete profile takes the records for which Cutter Type = Laser, Pump Head Type = Twist_Range, Avg Cycles is between 218 and 290, and Avg Daily Cycle Variance is between -27 and 27. If all those conditions are met, then we have a 1% chance of achieving the goal, which is a good thing for a profile that is meant to minimize the goal.
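              The step-by-step narrowing described above can be sketched in Python. This is only an illustration of applying conditions cumulatively, not how ThingWorx Analytics computes profiles; the record values, the field names and the "Fixed" and "Blade" categories are hypothetical.

```python
# Hypothetical records; field names and values are illustrative only.
records = [
    {"cutter_type": "Laser", "pump_head_type": "Twist_Range", "avg_cycles": 250, "goal": 0},
    {"cutter_type": "Laser", "pump_head_type": "Twist_Range", "avg_cycles": 300, "goal": 1},
    {"cutter_type": "Laser", "pump_head_type": "Fixed", "avg_cycles": 240, "goal": 1},
    {"cutter_type": "Blade", "pump_head_type": "Twist_Range", "avg_cycles": 260, "goal": 1},
]

# Each condition narrows the subset left by the previous one.
conditions = [
    ("cutter_type = Laser", lambda r: r["cutter_type"] == "Laser"),
    ("pump_head_type = Twist_Range", lambda r: r["pump_head_type"] == "Twist_Range"),
    ("218 <= avg_cycles <= 290", lambda r: 218 <= r["avg_cycles"] <= 290),
]


def cumulative_profile(records, conditions):
    """Apply conditions in order; return (label, record count, average goal) per step."""
    rows = []
    subset = list(records)
    for label, predicate in conditions:
        subset = [r for r in subset if predicate(r)]
        avg_goal = sum(r["goal"] for r in subset) / len(subset) if subset else None
        rows.append((label, len(subset), avg_goal))
    return rows


steps = cumulative_profile(records, conditions)
for label, count, avg_goal in steps:
    print(label, count, avg_goal)
```

              Each printed row corresponds to one feature line of a profile: the record count shrinks as conditions are added, and the average goal is recomputed on the narrower subset.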

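              The % Diff vs Avg is the result of (Avg Goal - Avg value for goal) / Avg value for goal, where Avg Goal is the average goal within the selected group of records and Avg value for goal is the average goal over the whole population.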

              This gives a relative comparison with the rest of the population. Here the overall profile has a 96.63% lower chance of achieving the goal than the population average.
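              As a small worked example of the formula, here is a Python sketch. The function name and the population average of 0.20 are assumptions chosen for illustration; only the 7% figure comes from the example above, and the real population average is not given in this thread.

```python
def pct_diff_vs_avg(group_avg_goal, population_avg_goal):
    """Relative difference, in percent, between a group's average goal
    and the population's average goal."""
    if population_avg_goal == 0:
        raise ValueError("population average goal must be non-zero")
    return (group_avg_goal - population_avg_goal) / population_avg_goal * 100.0


# Hypothetical population average; 0.07 is the 7% quoted for Cutter Type = Laser.
population_avg = 0.20
laser_diff = pct_diff_vs_avg(0.07, population_avg)
print(round(laser_diff, 2))
```

              A negative result means the group achieves the goal less often than the population as a whole, which is what a minimizing profile looks for.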

               

              I hope this helps you understand better.

              Kind regards

              Christophe

                • Re: Questions about profiling
                  romanl Newbie

                  Hi Christophe Morfin,

                  You write "in the population we select the records with cutter type set to Laser, this represent 28474 records". If this is with respect to the whole population, why can two profiles then have different populations for the same primary feature? Shouldn't they always have the same number of records?

                  (screenshot is from the printer data set, with a maximizing goal)

                  Best regards

                  Roman

                    • Re: Questions about profiling
                      cmorfin Communicator

                      Hi Roman

                       

                      I think this extract from the API guide answers your question:

                      "Once a profile is found, all the records contained within it are excluded from subsequent profiles"

                       

                      So in your example, profile 8 has 7414 records with region=Western.

                      Those 7414 records are then removed from the dataset to compute the next profile.

                       

                      In the remaining population, it is quite possible that region=Western is still a main contributor. The population of this new profile will not be the same as that of the previous profile, since different records are now taken into account.
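                      The exclusion rule can be illustrated with a short Python sketch. It assumes that only the records matching every condition of a profile are removed before the next profile is computed; the data, the "model" feature and the helper names are hypothetical, and this is not the ThingWorx Analytics implementation.

```python
# Hypothetical data set: five Western records and two Eastern ones.
records = [
    {"region": "Western", "model": "A"},
    {"region": "Western", "model": "A"},
    {"region": "Western", "model": "A"},
    {"region": "Western", "model": "B"},
    {"region": "Western", "model": "B"},
    {"region": "Eastern", "model": "A"},
    {"region": "Eastern", "model": "B"},
]


def matches(record, conditions):
    """True if the record satisfies every feature=value condition of a profile."""
    return all(record[feature] == value for feature, value in conditions.items())


def exclude_profile(records, conditions):
    """Return the records left once a found profile has been removed."""
    return [r for r in records if not matches(r, conditions)]


def count_where(records, feature, value):
    """Count records with the given feature value."""
    return sum(1 for r in records if r[feature] == value)


# Assumed first profile: all of its conditions must hold for a record to belong to it.
first_profile = {"region": "Western", "model": "A"}

before = count_where(records, "region", "Western")
remaining = exclude_profile(records, first_profile)
after = count_where(remaining, "region", "Western")
print(before, after)
```

                      The count for region=Western differs before and after the exclusion, which is why a later profile with the same primary feature can report a different population.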

                       

                      Kind regards

                      Christophe