The metadata format for the 8.1 release has changed from previous releases as part of the architecture update introduced in 8.1. This blog post describes those changes and points you to more information on both the new metadata format and the other changes in the 8.1 release.

New Structure

 

The image below provides an excerpt of what the metadata file looks like in practice.

[Image: BP-1.PNG]

The table below defines each field in the metadata and explains how to populate it in your metadata file.

 

| Parameter | Description | Required/Optional |
| --- | --- | --- |
| fieldName | The exact name of the field as it appears in the data file. | Required |
| values | A list of the acceptable values for the field. Note: For Ordinal opTypes, the values must be presented in the correct order. | Required if the opType is Ordinal. Optional for Categorical opType. Do not use for Boolean and Continuous. |
| range | For a Continuous field, defines the minimum and maximum values the field can accept. For informational purposes only. | Optional |
| dataType | Describes what type of data the field contains. Options include: Long, Integer, Short, Byte, Double, Boolean, String, Other. Note: Select the most accurate dataType. Selecting the String dataType for numeric data can lead to undesirable results. | Required |
| opType | Describes how the data in the field can be used. Options include: Categorical, Boolean, Ordinal, Continuous, Informational, Temporal, Entity_ID. | Required |
| timeSamplingInterval | An integer representing the time between observations in a temporal field. | Required if the opType is Temporal. Do not use for other opTypes. |
| isStatic | A flag indicating whether or not the value in a temporal field can change over time. Marking a field as static reduces training time by removing redundant data points for fields that do not change. | Optional |
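If you want to check your entries against the rules in the table before loading the file, something like the following Python sketch may help. It is not part of the product: it assumes each field is described by a JSON object keyed by the parameter names above, and the function name `check_field` and the sample field names are made up for illustration.

```python
# Allowed options, copied from the dataType and opType rows of the table.
DATA_TYPES = {"Long", "Integer", "Short", "Byte", "Double", "Boolean", "String", "Other"}
OP_TYPES = {"Categorical", "Boolean", "Ordinal", "Continuous",
            "Informational", "Temporal", "Entity_ID"}


def check_field(entry):
    """Return a list of rule violations for one metadata entry (a dict)."""
    problems = []
    name = entry.get("fieldName")
    if not name:
        problems.append("fieldName is required")
    if entry.get("dataType") not in DATA_TYPES:
        problems.append(f"{name}: dataType is missing or not a listed option")
    op_type = entry.get("opType")
    if op_type not in OP_TYPES:
        problems.append(f"{name}: opType is missing or not a listed option")

    # values: required for Ordinal, not used for Boolean and Continuous.
    if op_type == "Ordinal" and not entry.get("values"):
        problems.append(f"{name}: values is required when opType is Ordinal")
    if op_type in {"Boolean", "Continuous"} and "values" in entry:
        problems.append(f"{name}: values should not be used with opType {op_type}")

    # timeSamplingInterval: an integer, required for Temporal and only Temporal.
    interval = entry.get("timeSamplingInterval")
    if op_type == "Temporal":
        if not isinstance(interval, int) or isinstance(interval, bool):
            problems.append(f"{name}: timeSamplingInterval must be an integer for Temporal")
    elif "timeSamplingInterval" in entry:
        problems.append(f"{name}: timeSamplingInterval is only for Temporal fields")
    return problems


# Hypothetical entries, for illustration only.
good = {"fieldName": "risk_level", "dataType": "String", "opType": "Ordinal",
        "values": ["low", "medium", "high"]}
bad = {"fieldName": "reading_time", "dataType": "Long", "opType": "Temporal"}

for entry in (good, bad):
    print(entry["fieldName"], check_field(entry))
```

The check returns a list of messages rather than stopping at the first problem, so you can fix everything in one pass.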

 

Things to Remember

 

The metadata file you create must match your data file: every column in your dataset must be represented in the metadata file.
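One way to check this yourself is to compare the column names in the data file with the fieldName values in the metadata. The Python sketch below assumes the data is a CSV with a header row and that the metadata entries are available as a list of objects, each with a fieldName; both are assumptions for illustration, and `compare_columns` is a made-up helper name.

```python
import csv
import io


def compare_columns(csv_text, metadata_entries):
    """Return (columns missing from the metadata, metadata fields not in the data)."""
    # The first CSV row is assumed to hold the column names.
    header = next(csv.reader(io.StringIO(csv_text)))
    described = [entry["fieldName"] for entry in metadata_entries]
    missing = [column for column in header if column not in described]
    extra = [field for field in described if field not in header]
    return missing, extra


# Hypothetical data and metadata, for illustration only.
sample_csv = "customer_id,age,color\n1,30,red\n2,41,blue\n"
sample_metadata = [{"fieldName": "customer_id"}, {"fieldName": "age"}]

missing, extra = compare_columns(sample_csv, sample_metadata)
print("Columns with no metadata entry:", missing)
print("Metadata entries with no column:", extra)
```

Because fieldName must be the exact name used in the data file, the comparison is case-sensitive and does not trim whitespace.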

 

The metadata file needs to be a JSON file.

 

Setting the opType parameter incorrectly can have a severe impact on system performance. For example, setting a numerical field that has thousands of different values as Categorical instead of Continuous will cause the system to handle each value as an independent category instead of as a number, which will result in significantly longer processing time.
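A rough way to spot this mistake before training is to count the distinct values in each numeric field that is marked Categorical. The Python sketch below is only a heuristic: the threshold of 1000 is an arbitrary assumption rather than a documented limit, and `flag_high_cardinality` is a made-up helper name.

```python
# Numeric options from the dataType list.
NUMERIC_TYPES = {"Long", "Integer", "Short", "Byte", "Double"}


def flag_high_cardinality(entries, rows, threshold=1000):
    """Return names of numeric Categorical fields with more than `threshold` distinct values."""
    flagged = []
    for entry in entries:
        if entry.get("opType") != "Categorical":
            continue
        if entry.get("dataType") not in NUMERIC_TYPES:
            continue
        name = entry["fieldName"]
        # Collect the distinct values seen for this field across all rows.
        distinct = {row[name] for row in rows if name in row}
        if len(distinct) > threshold:
            flagged.append(name)
    return flagged


# Hypothetical metadata and data, for illustration only.
entries = [
    {"fieldName": "account_number", "dataType": "Integer", "opType": "Categorical"},
    {"fieldName": "region_code", "dataType": "Integer", "opType": "Categorical"},
]
rows = [{"account_number": i, "region_code": i % 3} for i in range(5000)]

print(flag_high_cardinality(entries, rows))
```

A flagged field is not necessarily wrong, but it is worth checking whether Continuous describes it better.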

Additional References

 

For more information on the other changes in the 8.1 release, please follow this link to the complete reference document.

 

Feel free to use the blank example metadata file attached to this post to help you get started on your own.