%DeepSee.PMML.Dataset

Class %DeepSee.PMML.Dataset Extends %RegisteredObject [ Abstract, System = 4 ]

A Dataset wraps a collection of records that can be analyzed in order to build or run a model. Implementations abstracting different sources of data can be found in the %DeepSee.PMML.Dataset package.

Properties

Name

Property Name As %String(MAXLEN = 200);

IdField

Property IdField As %DeepSee.PMML.Dataset.Field;

Fields

Property Fields As array Of %DeepSee.PMML.Dataset.Field;

Methods

GetValueCount

Method GetValueCount(pField As %String, pIncludeNull As %Boolean = 1, ByRef pFilters, Output pSC As %Status) As %Integer [ Abstract ]

Returns the number of distinct values of the categorical field pField.
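As a rough illustration of the counting semantics, the following Python sketch works on an in-memory list of records. It is not the InterSystems API: the `records` data, the `value_count` helper, and the reading of pIncludeNull (null counts as one extra distinct value when set) are assumptions made for this example.

```python
# Hypothetical in-memory records standing in for a dataset; None models a null value.
records = [
    {"Color": "red"},
    {"Color": "blue"},
    {"Color": None},
    {"Color": "red"},
]


def value_count(rows, field, include_null=True):
    """Count distinct values of `field`.

    Assumed semantics: null is counted as a distinct value only when
    include_null is true.
    """
    values = {row.get(field) for row in rows}
    if not include_null:
        values.discard(None)
    return len(values)


print(value_count(records, "Color"))         # counts red, blue and null
print(value_count(records, "Color", False))  # counts red and blue only
```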

Get1DDistribution

Method Get1DDistribution(pField As %String, Output pDistribution, ByRef pFilters) As %Status [ Abstract ]

Returns an array describing the distribution of values for the categorical field pField.

Accepts:
pFilters(n) = $lb(field, operator, key)

Returns:
pDistribution("total") = tTotalCount
pDistribution(n) = $lb(value, count)
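To make the documented layout concrete, here is a minimal Python sketch that builds the same shape from an in-memory list of records. It only models the data structure and is not the InterSystems implementation: the `records` data, the helper names, the restriction to an "=" operator, and the sorted ordering of entries are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
records = [
    {"Outcome": "yes", "Region": "north"},
    {"Outcome": "no", "Region": "north"},
    {"Outcome": "yes", "Region": "south"},
    {"Outcome": "yes", "Region": "north"},
]


def matches(row, filters):
    """Check a row against filters shaped like {n: (field, operator, key)}.

    Only the "=" operator is modelled here (an assumption of this sketch).
    """
    for field, operator, key in filters.values():
        if operator != "=":
            raise ValueError(f"operator {operator!r} is not modelled in this sketch")
        if row.get(field) != key:
            return False
    return True


def distribution_1d(rows, field, filters=None):
    """Return {"total": count, 1: (value, count), 2: ...}."""
    selected = [row for row in rows if matches(row, filters or {})]
    counts = Counter(row[field] for row in selected)
    result = {"total": len(selected)}
    # Entries are numbered in sorted value order; the real ordering is not documented.
    for n, (value, count) in enumerate(sorted(counts.items()), start=1):
        result[n] = (value, count)
    return result


dist = distribution_1d(records, "Outcome", {1: ("Region", "=", "north")})
print(dist)
```

The dictionary keys mirror the subscripts of pDistribution, and each tuple stands in for a $lb(value, count) list.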

GetAggregatesByCategory

Method GetAggregatesByCategory(pContField As %String, pCatField As %String, Output pAggregates, ByRef pFilters) As %Status [ Abstract ]

Returns an array listing aggregate values for a continuous field pContField for each value of a categorical field pCatField.

Accepts:
pFilters(n) = $lb(field, operator, key)

Returns:
pAggregates("total") = tTotalCount
pAggregates(n) = $lb(category value, count, average, sum, max, min, countNonNull)
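The seven-element entry per category can be sketched in Python as follows. This is an illustration of the documented layout only, not the InterSystems implementation: the `records` data and the `aggregates_by_category` helper are hypothetical, filters are left out for brevity, and the choice to compute average, sum, max, and min over non-null values is an assumption.

```python
# Hypothetical records; None models a null value of the continuous field.
records = [
    {"Group": "a", "Score": 2.0},
    {"Group": "a", "Score": 4.0},
    {"Group": "b", "Score": None},
    {"Group": "b", "Score": 5.0},
]


def aggregates_by_category(rows, cont_field, cat_field):
    """Return {"total": n, 1: (category, count, average, sum, max, min, countNonNull), ...}."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[cat_field], []).append(row[cont_field])

    result = {"total": len(rows)}
    # Categories are numbered in sorted order; the real ordering is not documented.
    for n, category in enumerate(sorted(grouped), start=1):
        values = grouped[category]
        non_null = [v for v in values if v is not None]
        total = sum(non_null) if non_null else None
        average = total / len(non_null) if non_null else None
        result[n] = (
            category,
            len(values),                  # count, nulls included
            average,
            total,
            max(non_null, default=None),
            min(non_null, default=None),
            len(non_null),                # countNonNull
        )
    return result


aggregates = aggregates_by_category(records, "Score", "Group")
print(aggregates)
```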

GetXDDistribution

Method GetXDDistribution(pFields As %List, Output pDistribution, ByRef pFilters) As %Status [ Abstract ]

Returns the multidimensional distribution of values across the fields listed in pFields.

Accepts:
pFilters(n) = $lb(field, operator, key)

Returns:
pDistribution = $lb(dim1Count, dim2Count, ...)
pDistribution("value", dim, i) = value
pDistribution(i, j, ...) = tCount
pDistribution("total", dim, i) = tDimTotal
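The multidimensional layout is easier to read with a small worked example. The Python sketch below models it with a dictionary keyed by tuples, where the empty tuple stands in for the root node. It is an assumption-laden illustration, not the InterSystems implementation: the `records` data and the `distribution_xd` helper are hypothetical, filters are omitted, values are indexed in sorted order, and empty cells are simply left out.

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
records = [
    {"Outcome": "yes", "Region": "north"},
    {"Outcome": "no", "Region": "north"},
    {"Outcome": "yes", "Region": "south"},
    {"Outcome": "yes", "Region": "north"},
]


def distribution_xd(rows, fields):
    """Build a dict mirroring the documented pDistribution subscripts."""
    # Distinct values per dimension, indexed from 1 in sorted order (assumed ordering).
    values = [sorted({row[f] for row in rows}) for f in fields]
    index = [{v: i for i, v in enumerate(vals, start=1)} for vals in values]

    # Root node: number of distinct values in each dimension.
    dist = {(): tuple(len(vals) for vals in values)}
    for dim, vals in enumerate(values, start=1):
        for i, v in enumerate(vals, start=1):
            dist[("value", dim, i)] = v
            dist[("total", dim, i)] = 0

    cells = Counter()
    for row in rows:
        coords = tuple(index[d][row[f]] for d, f in enumerate(fields))
        cells[coords] += 1                      # count for cell (i, j, ...)
        for dim, i in enumerate(coords, start=1):
            dist[("total", dim, i)] += 1        # per-dimension total for value i

    dist.update(cells)
    return dist


dist = distribution_xd(records, ["Outcome", "Region"])
for key in sorted(dist, key=repr):
    print(key, "=", dist[key])
```

In this example `dist[(2, 1)]` is the number of records whose first field has its second value ("yes") and whose second field has its first value ("north").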

Clear

Method Clear() As %Status

Clears all temporary structures created by this object. The dataset should remain usable after calling this method!

GetFieldBySpec

Method GetFieldBySpec(pFieldSpec As %String) As %DeepSee.PMML.Dataset.Field

GetRecordIds

Method GetRecordIds(Output pIds, ByRef pFilters) As %Status [ Abstract ]

Returns the record IDs matching pFilters as pIds(n) = rowid.

GetAsResultSet

Method GetAsResultSet(pFields As %List, Output pResultSet As %SQL.StatementResult, ByRef pFilters) As %Status [ Abstract, Internal ]

HasField

Method HasField(pFieldName As %String, Output pSC As %String) As %Boolean

GetPMMLDataSourceInternal

Method GetPMMLDataSourceInternal(Output pDataSource As %DeepSee.PMML.Definition.Extension.DataSource) As %Status [ Abstract, Internal, Private ]

GetPMMLDataSource

Method GetPMMLDataSource(Output pDataSource As %DeepSee.PMML.Definition.Extension.DataSource, pName As %String = "") As %Status [ Final, Internal ]

Returns a %DeepSee.PMML.Definition.Extension.DataSource element representing the mapping from data fields to source fields.