
%SQL.HLL

Class %SQL.HLL Extends %RegisteredObject

ObjectScript API for building HyperLogLog estimates of the number of unique elements (cardinality) in a group of data.

The estimates are kept in containers called sketches. Each sketch is identified by the id property of its %SQL.HLL instance.
Suppose you have one million pieces of data and want to know how many of them are unique:

  1. Use %New to instantiate a new HLL object:

    set hll= ##class(%SQL.HLL).%New()

  2. Feed one million pieces of data into the sketch with update:

    for i=1:1:1000000 {do hll.update(i)}

  3. Get an estimate of the cardinality by calling estimate:

    write hll.estimate()

    996537

Notes: InterSystems tests this class using MurmurHash with a seed of hll.#SEED:
$zcrc(yourdata,9,2059198193) or $zcrc(yourdata,9,hll.#SEED)
The underlying library uses 64 bits of this 128-bit hash.

Estimate Partitioning: if your data is distributed across many processes, build a sketch in each process and serialize it with get.
Pass a serialized sketch (in the standard serialized form, optionally Base64 encoded) into %New to initialize a new sketch with that state,
then call merge to combine the sketches into a single estimate.
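
The partitioning workflow described above can be sketched as follows. This is a minimal illustration that uses only the methods documented on this page. The two sketches stand in for two separate processes, the variable names and data ranges are arbitrary, and the code is assumed to run inside a method or routine.

```objectscript
    // Two sketches stand in for two processes, each seeing part of the data.
    // The ranges overlap on purpose: values 400001..600000 appear in both.
    set hllA = ##class(%SQL.HLL).%New()
    set hllB = ##class(%SQL.HLL).%New()
    for i=1:1:600000 { do hllA.update(i) }
    for i=400001:1:1000000 { do hllB.update(i) }

    // Serialize B as if it were being shipped to another process.
    // With the default ENCODE = 1 the result is Base64 encoded.
    set serialized = hllB.get()

    // Recreate B from its serialized form, then merge it into A.
    set hllB2 = ##class(%SQL.HLL).%New(serialized)
    set sc = hllA.merge(hllB2)
    if $system.Status.IsError(sc) { do $system.Status.DisplayError(sc) }

    // The merged estimate covers the distinct values seen by either sketch.
    write hllA.estimate(), !
```

Because the values shared by both ranges are counted once, the merged estimate should be close to one million rather than to the sum of the two individual estimates.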

Parameters

ENCODE

Parameter ENCODE = 1;

Whether or not to Base64 encode/decode by default during get and %New

SEED

Parameter SEED = 2059198193;

Murmur hash seed to use for $zcrc(,9,)

%MODULENAME

Parameter %MODULENAME [ Internal ] = 15;

Properties

id

Property id As %Integer [ Internal, ReadOnly ];

Internal identifier of allocated memory for this HLL sketch's representation as managed by the callout library

type

Property type As %String [ Calculated, ReadOnly ];

Whether the estimator is currently sparse or dense

precision

Property precision As %Integer [ Calculated, ReadOnly ];

Precision of the estimator
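
As a small illustration, the type and precision properties can be read at any point in a sketch's life. This page does not say which strings type returns or when a sketch changes from sparse to dense, so the sketch below prints the values without assuming them.

```objectscript
    set hll = ##class(%SQL.HLL).%New()

    // Inspect the metadata of an empty sketch.
    write "type: ", hll.type, ", precision: ", hll.precision, !

    // Feed data, then inspect again to see whether the representation changed.
    for i=1:1:1000000 { do hll.update(i) }
    write "type: ", hll.type, ", precision: ", hll.precision, !
```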

libIndex

Property libIndex As %Integer [ Internal, MultiDimensional, Private ];

Index of $zf(-4) addresses

Methods

getFunctionID

Method getFunctionID(function As %String) As %Integer [ Internal ]

getLibraryID

ClassMethod getLibraryID() As %Integer [ Internal ]

%OnNew

Method %OnNew(sketch As %Binary = "", decode As %Boolean = {..#ENCODE}, Output err As %String = "") As %Status

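Allocates the memory and sets id for a new sketch. If you pass the sketch parameter, the new sketch is initialized from the serialized sketch you passed in. The decode parameter controls whether that input is Base64 decoded first; it defaults to the ENCODE parameter.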

updateHash

Method updateHash(hash As %Binary) As %Integer [ Language = cpp ]

Updates this sketch with a user-supplied hash value.
Use $zcrc(yourdata,9,2059198193) or $zcrc(yourdata,9,hll.#SEED) to compute the hash.
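
A minimal sketch of hashing values yourself before calling updateHash, using the $zcrc form given above. The seed is read through ##class(%SQL.HLL).#SEED, and the loop bounds are arbitrary.

```objectscript
    set hll = ##class(%SQL.HLL).%New()
    set seed = ##class(%SQL.HLL).#SEED

    for i=1:1:1000 {
        // Compute the MurmurHash of the value with the class seed,
        // then feed the hash (not the raw value) into the sketch.
        set hash = $zcrc(i, 9, seed)
        do hll.updateHash(hash)
    }

    write hll.estimate(), !
```

When you have no reason to compute the hash separately, update takes the raw data and does the hashing for you.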

update

Method update(stringdata As %Binary) As %Integer [ Language = cpp ]

Updates this sketch with the $zcrc(,9,) hash of stringdata. The hash is computed inside the API, so pass the raw data rather than a precomputed hash.

merge

Method merge(other As %SQL.HLL, Output err As %String = "") As %Status

Merges the supplied sketch object into the current one. This merges the cardinality estimates.

estimate

Method estimate(Output err As %String = "") As %Integer

Returns the current unique value estimate (cardinality) for this sketch.

get

Method get(encode As %Boolean = {..#ENCODE}, Output err As %String = "") As %Binary

Returns the serialized form of the current sketch, so that sketches built separately (for example, in different processes) can be recreated with %New and merged.
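
The encode argument of get and the decode argument of %New need to agree. The sketch below round-trips a sketch in raw (non-Base64) form. It assumes that err is empty on success and that %New returns no object on failure; neither is stated on this page.

```objectscript
    set hll = ##class(%SQL.HLL).%New()
    for i=1:1:1000 { do hll.update(i) }

    // Serialize without Base64 encoding (encode = 0).
    set raw = hll.get(0, .err)
    if $get(err)'="" { write "get failed: ", err, ! }

    // Restore with decode = 0 so the raw form is not Base64 decoded.
    set copy = ##class(%SQL.HLL).%New(raw, 0, .err)
    if '$isobject(copy) {
        write "restore failed: ", $get(err), !
    } else {
        // The copy should report the same estimate as the original.
        write hll.estimate(), " ", copy.estimate(), !
    }
```

The Base64 form is the safer choice when the serialized sketch has to pass through a channel that does not handle arbitrary binary data.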

releaseSketch

Method releaseSketch(Output err As %String = "") As %Status [ Internal ]

Frees the memory associated with this sketch. After this method has been called, subsequent calls on this sketch yield an error. This method is called implicitly by the object destructor.

info

Method info(Output type As %String, Output precision As %String, Output err As %String) As %Status [ Internal ]

Helper method to retrieve metadata for the current sketch.

typeGet

Method typeGet() As %String [ Internal, ServerOnly = 1 ]

precisionGet

Method precisionGet() As %Integer [ Internal, ServerOnly = 1 ]

%OnClose

Method %OnClose() As %Status

version

ClassMethod version() As %Integer

Returns the version of the underlying callout library.