%DeepSee.PMML.Utils.TreeBuilder
Class %DeepSee.PMML.Utils.TreeBuilder Extends %RegisteredObject [ System = 4 ]
Utility class to build Tree models for a %DeepSee.PMML.Dataset object.
Properties
Dataset
Property Dataset As %DeepSee.PMML.Dataset;
The dataset to mine.
TargetField
Property TargetField As %String;
The target field whose value is to be derived through this tree. This field should be part of Dataset.
Tree
Property Tree As %Integer [ MultiDimensional ];
..Tree = $i
..Tree(NodeId) = $lb(parent, targetValue, confidence, count, isLeaf)
..Tree(NodeId, "condition") = [AND|OR|$lb(field, operator, value)]
..Tree(NodeId, "ch", ChildNode) = ""
..Tree(NodeId, "distribution", value) = count
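As a reading aid for the layout above, here is a rough Python analogue of the multidimensional array. It is only a sketch: the node IDs, the field name `age`, the operator names and all counts are invented, and the `children` helper is hypothetical. The root is assumed to have ID 0 and no condition, and confidence is assumed to be the share of the predicted target value.

```python
# Rough Python analogue of the ..Tree array; every value is made up.
# "node" mirrors $lb(parent, targetValue, confidence, count, isLeaf).
tree = {
    0: {  # root node (assumption: ID 0, no condition)
        "node": (None, "no", 0.60, 100, False),
        "ch": {1: "", 2: ""},
        "distribution": {"no": 60, "yes": 40},
    },
    1: {
        "node": (0, "yes", 0.75, 40, True),
        "condition": ("age", "lessThan", 30),
        "ch": {},
        "distribution": {"yes": 30, "no": 10},
    },
    2: {
        "node": (0, "no", 50 / 60, 60, True),
        "condition": ("age", "greaterOrEqual", 30),
        "ch": {},
        "distribution": {"no": 50, "yes": 10},
    },
}


def children(tree, node_id):
    """Return the child node IDs of node_id in ascending order."""
    return sorted(tree[node_id]["ch"])


def distribution_matches_count(tree, node_id):
    """Check that the per-value counts add up to the node's record count."""
    node = tree[node_id]
    return sum(node["distribution"].values()) == node["node"][3]
```

In this analogue, `children(tree, 0)` walks the same information the `"ch"` subscripts hold in the real array.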
SplitsPerNode
Property SplitsPerNode As %Integer [ InitialExpression = 2 ];
Default (maximum) number of branches per node.
MinimalSplitCoverage
Property MinimalSplitCoverage As %Numeric [ InitialExpression = 0.05 ];
Minimum proportion of the records represented by a node that a branch must cover. A branch that covers fewer records than this is ignored.
TrackDistributions
Property TrackDistributions As %Boolean [ InitialExpression = 1 ];
Whether to track distribution information for tree nodes.
ConsiderNullValues
Property ConsiderNullValues As %Boolean [ InitialExpression = 0 ];
Whether to consider null (missing) values for split criteria.
SplitScoringAlgorithm
Property SplitScoringAlgorithm As %String(VALUELIST = ",Confidence,GiniImpurity,InformationGain") [ InitialExpression = "GiniImpurity" ];
The metric used to judge split quality. FindSplits returns this metric as the "score" of each candidate split.
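The source does not spell out how each metric is computed. As a point of reference, the sketch below uses the textbook definitions of Gini impurity, entropy-based information gain, and a simple majority-share confidence. The function names are hypothetical, and the class may normalise or combine these values differently.

```python
from collections import Counter
from math import log2


def gini_impurity(labels):
    """Textbook Gini impurity: 1 - sum(p_i^2) over the target values."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the target value distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(parent, branches):
    """Parent entropy minus the size-weighted entropy of the branches."""
    n = len(parent)
    weighted = sum(len(b) / n * entropy(b) for b in branches if b)
    return entropy(parent) - weighted


def confidence(labels):
    """Share of records carrying the most frequent target value."""
    return max(Counter(labels).values()) / len(labels)
```

For a node holding two "a" and two "b" records, the Gini impurity is 0.5 and the entropy is 1 bit; a split that separates the two values perfectly has an information gain of 1 bit.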
Methods
%OnNew
Method %OnNew(pDataset As %DeepSee.PMML.Dataset, pTargetField As %String) As %Status [ Internal, ServerOnly = 1 ]
Reset
Method Reset() As %Status
After changing build parameters, run this method to erase the current tree structure so Build can be run again.
Build
Method Build(pMaxDepth As %Integer = 3)
Builds a tree structure with a maximum depth of pMaxDepth.
If a tree structure was already built, this method silently exits. Use Reset to erase an existing tree structure.
GetInvertedFilter
Method GetInvertedFilter(ByRef pFilter, Output pInverted) As %Status
Returns the inverse of pFilter, equivalent to a boolean NOT of the entire pFilter.
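One common way to negate a nested AND/OR filter is De Morgan's laws: swap AND and OR, then negate each leaf condition. The sketch below illustrates that idea only. The tuple representation, the operator names and the `INVERSE` table are assumptions, not the structure or algorithm the method actually uses, and null handling is ignored.

```python
# Hypothetical filter shape: ("AND" | "OR", [subfilters]) for a compound
# filter, or (field, operator, value) for a leaf condition.
INVERSE = {
    "equal": "notEqual",
    "notEqual": "equal",
    "lessThan": "greaterOrEqual",
    "greaterOrEqual": "lessThan",
    "greaterThan": "lessOrEqual",
    "lessOrEqual": "greaterThan",
}


def invert(flt):
    """Return the boolean NOT of flt using De Morgan's laws."""
    if len(flt) == 2:  # compound filter: (logic, [subfilters])
        logic, parts = flt
        flipped = "OR" if logic == "AND" else "AND"
        return (flipped, [invert(p) for p in parts])
    field, operator, value = flt  # leaf condition
    return (field, INVERSE[operator], value)


young_women = ("AND", [("age", "lessThan", 30), ("sex", "equal", "F")])
everyone_else = invert(young_women)
```

Inverting twice gives back the original filter, which is a quick sanity check for any inversion routine.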
MergeFilters
Method MergeFilters(ByRef pFilter, ByRef pOtherFilter, pLogic As %String = "AND") As %Status
Appends pOtherFilter to pFilter, combining the two with pLogic ("AND" by default).
GetNodeFilters
Method GetNodeFilters(pNode As %Integer, Output pFilters) As %Status
Returns the combination of filter conditions (pFilters) a record should satisfy to end up in node pNode. This is a combination of the node's own condition, its full ancestry and any prior siblings' conditions.
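The description above can be read as: walk from the node up to the root and, at each level, require the node's own condition while excluding the conditions of the siblings that come before it. The sketch below follows that reading. The dictionary layout, the condition strings and the `("NOT", ...)` / `("IS", ...)` markers are all hypothetical, and the real method returns its result in pFilters in whatever form the class uses.

```python
# Hypothetical tree: each node knows its parent, its ordered children and
# (except for the root) the condition that leads into it.
tree = {
    0: {"parent": None, "children": [1, 2]},
    1: {"parent": 0, "children": [], "condition": "age < 30"},
    2: {"parent": 0, "children": [3, 4], "condition": "income > 50000"},
    3: {"parent": 2, "children": [], "condition": "region = EU"},
    4: {"parent": 2, "children": [], "condition": "region = US"},
}


def node_filters(tree, node):
    """List the conditions a record must meet to land in node, root first."""
    filters = []
    while tree[node]["parent"] is not None:
        parent = tree[node]["parent"]
        siblings = tree[parent]["children"]
        # Siblings listed before this node must NOT match.
        level = [("NOT", tree[s]["condition"])
                 for s in siblings[:siblings.index(node)]]
        # The node's own condition must match.
        level.append(("IS", tree[node]["condition"]))
        filters = level + filters  # ancestors go in front
        node = parent
    return filters


path_to_4 = node_filters(tree, 4)
```

Here `path_to_4` lists four entries: not `age < 30`, `income > 50000`, not `region = EU`, and `region = US`.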
SplitNode
Method SplitNode(pNode As %Integer) As %Status
Splits node pNode into SplitsPerNode sub-nodes (or fewer, if not enough candidate splits satisfy coverage and other selection criteria).
FindSplits
Method FindSplits(pNode As %Integer, Output pSplits) As %Status
Returns an unsorted array of candidate splits for node pNode:
pSplits(n) = $lb(score, coverage, targetValue, confidence, recordCount, isLeaf)
pSplits(n, "condition") = ...
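Because the candidate array is unsorted, a caller has to rank it before choosing branches. The sketch below shows one plausible selection step using the tuple layout above. It assumes that coverage is a fraction of the node's records and that a higher score is better; neither is confirmed by the source. The `pick_splits` function and the sample candidates are invented.

```python
def pick_splits(candidates, splits_per_node=2, minimal_split_coverage=0.05):
    """Drop low-coverage candidates, then keep the best-scoring ones.

    Each candidate mirrors
    $lb(score, coverage, targetValue, confidence, recordCount, isLeaf).
    """
    eligible = [c for c in candidates if c[1] >= minimal_split_coverage]
    # Assumption: a higher score means a better split.
    eligible.sort(key=lambda c: c[0], reverse=True)
    return eligible[:splits_per_node]


candidates = [
    (0.30, 0.40, "yes", 0.8, 40, False),
    (0.45, 0.02, "no", 0.9, 2, True),   # best score, but covers too few records
    (0.10, 0.35, "no", 0.6, 35, False),
    (0.25, 0.23, "yes", 0.7, 23, False),
]
chosen = pick_splits(candidates)
```

With the defaults, the second candidate is discarded for low coverage even though it scores highest, and the two best remaining candidates are kept.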
Print
Method Print(pNode As %Integer = 0, pLevel As %Integer = 0, pPrintDistribution As %Boolean = 0) As %Status
Prints the tree (starting with pNode) to the terminal.
GetCondition
ClassMethod GetCondition(ByRef pArray) As %String [ Internal ]
GetFieldList
ClassMethod GetFieldList(ByRef pArray) As %List [ Internal ]