A Rules Task takes any Text column as input and assigns one or more Tags to each text item. It uses a Ruleset object, which contains a list of Rules, where each Rule contains:
a. A Rule Pattern (see: Patterns Help)
b. A Tag
c. A numeric Score (a double, set to 1.0 by default).
When the Rules Task is executed, it finds all Rule Pattern matches for each input Text cell and calculates a Total Score for each Tag by adding the Scores of all Rule matches for that Tag. Rules Tasks can generate up to 4 columns:
· Tag (text): the Tags selected for the input text
· Score (double): the sum of Scores for all Rules matched for each Tag
· Count (integer): the number of matched Rules for each Tag
· Matches (text): the matched text for each matched Rule for each Tag
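The scoring step described above can be sketched in Python. This is an illustration only, not the product's implementation: the `Rule` class and `score_text` function are hypothetical names, Python regular expressions stand in for the Rule Pattern syntax (see: Patterns Help), and the sketch assumes that each Rule contributes at most once per Text cell.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: str        # assumption: a Python regex stands in for a Rule Pattern
    tag: str
    score: float = 1.0  # Score defaults to 1.0, as described above


def score_text(text, rules):
    """Return {tag: {"score", "count", "matches"}} for one input Text cell."""
    totals = defaultdict(lambda: {"score": 0.0, "count": 0, "matches": []})
    for rule in rules:
        match = re.search(rule.pattern, text, flags=re.IGNORECASE)
        if match is None:
            continue  # this Rule did not match, so it adds nothing
        entry = totals[rule.tag]
        entry["score"] += rule.score            # Score column: sum of Scores
        entry["count"] += 1                     # Count column: matched Rules
        entry["matches"].append(match.group(0)) # Matches column: matched text
    return dict(totals)


rules = [
    Rule(r"\bthe\b", "English"),
    Rule(r"\bthank you\b", "English", 2.0),
    Rule(r"\bmerci\b", "French", 2.0),
]
result = score_text("Thank you for the quick reply", rules)
print(result)
```

In this sketch, two English Rules match and no French Rule does, so only the Tag "English" receives a Total Score (1.0 + 2.0 = 3.0) with a Count of 2.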
No Tags are displayed if there are no Rule Pattern matches. Otherwise, the number of Tags displayed is determined by the Tag Display Options, which include:
· Multi-Value (boolean): if False, then only the single Tag with the highest Score is displayed.
· Threshold (double): if set, then only Tags whose Score exceeds the Threshold are displayed.
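A minimal Python sketch of how these two options could filter the Tags for one text item is shown here. The function name `select_tags` is hypothetical. The order in which the options are applied (Threshold first, then Multi-Value) and the alphabetical tie-break between equal Scores are assumptions, since the text above does not specify them.

```python
def select_tags(scores, multi_value=False, threshold=None):
    """scores: {tag: total_score}. Returns the Tags to display, best first."""
    candidates = list(scores.items())
    if threshold is not None:
        # "exceeds" is read as strictly greater than the Threshold
        candidates = [(tag, s) for tag, s in candidates if s > threshold]
    # Highest Score first; ties broken alphabetically (assumption)
    ranked = sorted(candidates, key=lambda item: (-item[1], item[0]))
    if not multi_value:
        ranked = ranked[:1]  # keep only the single top-scoring Tag
    return [tag for tag, _ in ranked]


scores = {"English": 3.0, "French": 1.0}
print(select_tags(scores))                                   # single best Tag
print(select_tags(scores, multi_value=True))                 # all Tags
print(select_tags(scores, multi_value=True, threshold=1.0))  # Score must exceed 1.0
```

With an empty `scores` mapping the function returns an empty list, which corresponds to no Tags being displayed when there are no Rule Pattern matches.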
In the figure above, Rules Task “LangR” has been executed on input column “Response Text” to select a Language Tag of “English” or “French”. Results are shown in columns LangR[tag], LangR[score], LangR[matches] and LangR[count].
To create a Rules Task, the prerequisite objects are:
· Template – the Template to which the new Task will be added.
· Input Column – a text column in the Template.
· Ruleset – stores a list of Rules for Tags in a specified Tagset (see: Ruleset help, Tagset help).
Results from the Rules Task can be viewed in Data Views or by selecting Tasks or Columns on the Datasets page and then View or Summary.
Creating a Rules Task
Click Pipelines > Tasks, then Create Task, select Type “Rules”, and fill in the following:
· Name – Enter a name for the Rules Task and a Description, if needed.
· Template – Choose an existing Template.
· Ruleset – Choose an existing Ruleset.
· Input Column – Choose a text column in the Template.
· Change Results Displayed – click to view Tag display options:
o Threshold (double): if set, then only Tags whose Score exceeds the Threshold are displayed.
o Multi-Value (boolean): if unchecked, then only the single Tag with the highest Score is displayed.
o View Nulls (boolean): if unchecked, then Dataset rows with zero matches are left out of results.
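The effect of View Nulls can be sketched as a simple row filter. The function `visible_rows` and the representation of a row as a (text, tags) pair are hypothetical and chosen only for illustration.

```python
def visible_rows(rows, view_nulls=False):
    """rows: list of (text, tags) pairs; tags is empty when no Rule matched."""
    if view_nulls:
        return list(rows)  # keep every Dataset row
    # Leave out rows with zero matches
    return [row for row in rows if row[1]]


rows = [
    ("Thank you for the reply", ["English"]),
    ("12345", []),  # no Rule Pattern matched this text
]
print(visible_rows(rows))
print(visible_rows(rows, view_nulls=True))
```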
See Task Help for other Task parameters.