ML Models



The default ML Model is a nearest centroid classifier. When trained, it calculates and stores a centroid term vector for each Tag, called the Tag's Trained Term Vector (TTV). When an ML Task uses the Model, the Confidence for each Tag is calculated as a weighted similarity measure between the input term vector and the Tag's TTV. The ML Model also stores other parameters, described below, that are used when an ML Task executes the Model.
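As a rough illustration of the nearest centroid idea, the following Python sketch averages the term vectors of each Tag's training examples into a centroid and scores new input against each centroid with cosine similarity. It is not the product's implementation: the whitespace tokenizer, the unweighted averaging, the function names (term_vector, train_centroids, classify) and the sample data are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def term_vector(text):
    """Count lowercase whitespace-separated terms (a stand-in tokenizer)."""
    return Counter(text.lower().split())

def train_centroids(examples):
    """Average the term vectors of each Tag's examples into one centroid per Tag."""
    sums = defaultdict(Counter)
    n_examples = Counter()
    for text, tag in examples:
        sums[tag].update(term_vector(text))
        n_examples[tag] += 1
    return {
        tag: {term: count / n_examples[tag] for term, count in vec.items()}
        for tag, vec in sums.items()
    }

def cosine(a, b):
    """Cosine similarity between two sparse term vectors stored as dicts."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def classify(text, centroids):
    """Return a score per Tag; the highest score marks the nearest centroid."""
    vec = term_vector(text)
    return {tag: cosine(vec, ttv) for tag, ttv in centroids.items()}

# Hypothetical training examples as (text, Tag) pairs.
examples = [
    ("invoice payment due", "Billing"),
    ("payment received invoice", "Billing"),
    ("password reset login", "Support"),
    ("login error password", "Support"),
]
centroids = train_centroids(examples)
scores = classify("reset my password", centroids)
best = max(scores, key=scores.get)
```

Here the input shares terms only with the "Support" examples, so "Support" receives the highest score and "Billing" scores zero. How the product turns a similarity into a Confidence value is not described here, so the sketch stops at the raw scores.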

If the LSA option is selected, then the ML Model partitions the input via Latent Semantic Analysis (available soon).

Click ML > ML Models to view, edit, delete and create ML Models.

What’s Needed

To create an ML Model, the following object must already exist:

·      Tagset – Contains the possible Tag outcomes for the ML Model. Not needed for an LSA model.

Creating an ML Model

Click ML > ML Models, then New ML Model, and set the following:

·      Name – Enter a name for the ML Model and, if needed, a Description.

·      LSA? – Select this for a Latent Semantic Analysis model, which requires no tagset.

·      Tagset – Choose an existing tagset.

·      Similarity Algorithm – Select the algorithm used to calculate similarity and Confidence.

o   Cosine Similarity

o   Euclidean Distance

o   Absolute Error

·      Term Count Dampening – Should term counts be reduced?

o   Count – No, use counts unchanged

o   Log – Use log2(count)

o   Binary – Use 0 or 1 (term absent or present)

·      IDF Weighting – Are terms weighted by frequency?

o   None – No term weighting

o   Inverse Document Frequency – Weight terms higher if they are in a smaller percentage of examples (higher weight to more unusual terms)

o   Inverse Tag Frequency – Weight terms higher if they are in a smaller percentage of Tag TTVs (higher weight to more unusual terms)

·      Train Term Count – How are term counts added to the TTV during training?

o   Total – TTV counts are cumulative over all trainings

o   Binary – TTV counts are 0 or 1

·      Use Custom Weights?

o   If checked, the user can upload a spreadsheet of term weights.

·      Modify Results Displayed?

o   If checked, then this ML Model stores default values for Multi-Value and Threshold – see ML Task for more information.
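The dampening, weighting and similarity options above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the product's code: the function names are hypothetical, the weighting formula log(N / df) is one common choice that this page does not confirm, and "Absolute Error" is interpreted here as the sum of absolute differences between term values.

```python
import math

def dampen(count, mode="Count"):
    """Apply a Term Count Dampening option to a raw term count."""
    if mode == "Count":
        return float(count)                      # counts unchanged
    if mode == "Log":
        # Literal log2(count); note that a count of 1 maps to 0.
        return math.log2(count) if count > 0 else 0.0
    if mode == "Binary":
        return 1.0 if count > 0 else 0.0         # term absent or present
    raise ValueError(f"unknown dampening mode: {mode}")

def inverse_frequency_weights(vectors):
    """Weight each term by log(N / df), where df is the number of vectors
    containing the term (assumed formula). Pass example vectors for Inverse
    Document Frequency, or Tag TTVs for Inverse Tag Frequency."""
    n = len(vectors)
    df = {}
    for vec in vectors:
        for term, count in vec.items():
            if count > 0:
                df[term] = df.get(term, 0) + 1
    return {term: math.log(n / d) for term, d in df.items()}

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors (dicts)."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def euclidean_distance(a, b):
    """Straight-line distance between two sparse term vectors."""
    terms = set(a) | set(b)
    return math.sqrt(sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in terms))

def absolute_error(a, b):
    """Sum of absolute differences (assumed meaning of Absolute Error)."""
    terms = set(a) | set(b)
    return sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in terms)

# Hypothetical input vector and Tag TTV.
input_vec = {"invoice": 2, "payment": 1}
ttv = {"invoice": 1, "login": 1}
weights = inverse_frequency_weights([input_vec, ttv])
```

Cosine similarity grows as vectors become more alike, while the other two measures are distances that shrink. How the product converts a distance into a Confidence value is not stated on this page, so the sketch returns the raw measures only.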

See ML Task Help for more information on how ML Tasks use ML Models.

See Task Help for other Task parameters.

Next: Term Vector Tasks