Profiles identify groups of records that share similar characteristics and lead to the same or a similar outcome.
The items in your screenshot are grouped because those records share characteristics that result in the same outcome.
The %Diff Vs Avg column shows how strongly the group differs from the overall average; because these records deviate from the average in the same way, they can be grouped together.
I have also reached out to you via email with the same information.
Please let us know if this answers your questions.
Hi Yong June
Let me try to explain it in other words.
The profile job was created to minimize the occurrence of the goal (see Maximize set to False). This means that ThingWorx Analytics attempted to find groups of records that are less likely to achieve the goal.
In profile 1, if we select from the population the records with Cutter Type set to Laser, this represents 28474 records, and they have a 7% chance of achieving the goal. If, within this sub-population, we only take the records with Pump Head Type set to Twist_Range, then we have a 1% chance of achieving the goal.
The complete profile takes the records for which Cutter Type = Laser, Pump Head Type = Twist_Range, Avg Cycles is between 218 and 290, and avg daily cycle variance is between -27 and 27. If all those conditions are met, then we have a 1% chance of achieving the goal, which is a good thing for a profile that is meant to minimize the goal.
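To make the step-by-step narrowing concrete, here is a minimal sketch in Python. It is not how ThingWorx Analytics computes profiles internally; it only mimics how each added condition shrinks the sub-population and changes the goal rate. The records, the field names (`cutter_type`, `pump_head_type`, `avg_cycles`, `goal`) and the helper functions are all made up for illustration, and the variance condition is left out to keep it short.

```python
# Toy records: field names echo the screenshot, values are invented for illustration.
RECORDS = [
    {"cutter_type": "Laser", "pump_head_type": "Twist_Range", "avg_cycles": 250, "goal": 0},
    {"cutter_type": "Laser", "pump_head_type": "Twist_Range", "avg_cycles": 230, "goal": 0},
    {"cutter_type": "Laser", "pump_head_type": "Twist_Range", "avg_cycles": 400, "goal": 1},
    {"cutter_type": "Laser", "pump_head_type": "Other", "avg_cycles": 250, "goal": 1},
    {"cutter_type": "Laser", "pump_head_type": "Other", "avg_cycles": 260, "goal": 0},
    {"cutter_type": "Blade", "pump_head_type": "Twist_Range", "avg_cycles": 250, "goal": 1},
    {"cutter_type": "Blade", "pump_head_type": "Other", "avg_cycles": 300, "goal": 1},
    {"cutter_type": "Blade", "pump_head_type": "Other", "avg_cycles": 240, "goal": 1},
]

# Conditions are applied in order, each one on top of the previous ones.
CONDITIONS = [
    ("Cutter Type = Laser", lambda r: r["cutter_type"] == "Laser"),
    ("Pump Head Type = Twist_Range", lambda r: r["pump_head_type"] == "Twist_Range"),
    ("Avg Cycles between 218 and 290", lambda r: 218 <= r["avg_cycles"] <= 290),
]


def goal_rate(rows):
    """Fraction of rows where the goal was achieved (0.0 for an empty list)."""
    return sum(r["goal"] for r in rows) / len(rows) if rows else 0.0


def narrow(rows, conditions):
    """Apply the conditions one at a time; record count and goal rate after each."""
    steps = []
    for label, predicate in conditions:
        rows = [r for r in rows if predicate(r)]
        steps.append((label, len(rows), goal_rate(rows)))
    return steps


steps = narrow(RECORDS, CONDITIONS)
for label, count, rate in steps:
    print(f"{label}: {count} records, goal rate {rate:.0%}")
```

Each printed line corresponds to one row of the profile: the record count can only go down as conditions are added, while the goal rate moves toward the value the profile is looking for (low, in a minimizing job).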
The %Diff vs Avg is the result of (Avg Goal of the profile - Avg value for goal over the whole population) / Avg value for goal over the whole population.
It gives a relative comparison with the rest of the population. Here the overall profile has a 96.63% lower chance of achieving the goal than the average population.
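As a small worked example of that formula, here is a sketch in Python. The function name `pct_diff_vs_avg` and the two input values are hypothetical (they are not read from your screenshot); the point is only to show the sign and scale of the result.

```python
def pct_diff_vs_avg(profile_avg, population_avg):
    """Relative difference, in percent, between a profile's average goal value
    and the average goal value over the whole population."""
    if population_avg == 0:
        raise ValueError("population average is zero; relative difference is undefined")
    return (profile_avg - population_avg) / population_avg * 100.0


# Hypothetical values: the profile reaches the goal 1% of the time,
# the whole population 25% of the time.
diff = pct_diff_vs_avg(0.01, 0.25)
print(f"%Diff vs Avg: {diff:.2f}%")
```

A negative result means the profile achieves the goal less often than the population as a whole, which is what a minimizing profile job is looking for.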
I hope this helps you understand better.
You write "in the population we select the records with cutter type set to Laser, this represents 28474 records". If this is with respect to the whole population,
why can two profiles then have different record counts for the same primary feature? Shouldn't they always have the same number of records?
(screenshot is from the printer data set, with a maximizing goal)
I think this extract from the API guide answers your question:
"Once a profile is found, all the records contained within it are excluded from subsequent profiles"
So in your example, profile 8 has 7414 records with region=Western.
Those 7414 records are then removed from the dataset before the next profile is computed.
In the remaining population it is quite possible that region=Western is still a main contributor. The population of this new profile will not be the same as that of the previous profile, since different records are now taken into account.
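Here is a minimal sketch of that exclusion step in Python. It is only an illustration of the sentence quoted from the API guide, not the actual ThingWorx Analytics algorithm: the records, the `model` field and the two profile definitions are invented, and I am assuming that the records removed are those matching the complete profile.

```python
# Toy records; "region" echoes the example, "model" is a made-up second feature.
ROWS = [
    {"region": "Western", "model": "A"},
    {"region": "Western", "model": "A"},
    {"region": "Western", "model": "B"},
    {"region": "Western", "model": "B"},
    {"region": "Western", "model": "C"},
    {"region": "Eastern", "model": "A"},
    {"region": "Eastern", "model": "B"},
]

# Each hypothetical profile: (name, primary-feature test, complete-profile test).
PROFILES = [
    ("profile 1",
     lambda r: r["region"] == "Western",
     lambda r: r["region"] == "Western" and r["model"] == "A"),
    ("profile 2",
     lambda r: r["region"] == "Western",
     lambda r: r["region"] == "Western" and r["model"] == "B"),
]


def extract_profiles(rows, profiles):
    """Evaluate profiles in order, excluding each profile's records
    from the population seen by the profiles that follow."""
    remaining = list(rows)
    report = []
    for name, primary, full in profiles:
        primary_count = sum(1 for r in remaining if primary(r))
        members = [r for r in remaining if full(r)]
        report.append((name, primary_count, len(members)))
        # Assumption: records inside the complete profile are the ones removed.
        remaining = [r for r in remaining if not full(r)]
    return report


report = extract_profiles(ROWS, PROFILES)
for name, primary_count, member_count in report:
    print(f"{name}: {primary_count} records match region=Western, "
          f"{member_count} records in the complete profile")
```

Both toy profiles start from region=Western, but the second one is evaluated on a smaller remaining population, so its count for the primary feature differs from the first.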