%SYSTEM.Sharding
Class %SYSTEM.Sharding Extends %SYSTEM.Help [ System = 4 ]
This class provides an API for "manually" configuring sharding, at the level of individual InterSystems IRIS Data Platform instances.
It can be used via the special $SYSTEM object, for example:
set status = $SYSTEM.Sharding.EnableSharding()
The $SYSTEM.Sharding API provides an alternative to using ICM (InterSystems Cloud Manager) to provision and configure sharded clusters, for scenarios ICM does not support, or situations where step-by-step manual control, or avoiding the use of containers, is preferred. Unlike ICM, which manages the entire process of installing and configuring InterSystems IRIS instances to function together as sharded clusters, the $SYSTEM.Sharding API adds only sharding-specific functionality, depending on other InterSystems IRIS tools and APIs for tasks such as installing InterSystems IRIS instances, configuring mirroring, or creating namespaces and databases.
Most of the $SYSTEM.Sharding API calls operate at the level of a specified master namespace. The two exceptions are EnableSharding and SetNodeIPAddress, which operate on an entire InterSystems IRIS instance (which may contain multiple master namespaces).
Terminology Sharding Transparent horizontal partitioning of tables across a set of shards. Each shard can be hosted on a separate InterSystems IRIS instance, on a separate machine, providing horizontal performance scaling for queries and data ingestion. Horizontal partitioning means partitioning by rows, as opposed to vertical partitioning by columns. Each partition of a horizontally partitioned table contains a subset of the table's rows, and each row is contained in exactly one partition.
Sharded Cluster A set of InterSystems IRIS instances configured to work together to support sharding. A sharded cluster is comprised of one shard master, one or more shard servers, each of which hosts one or more shards that are assigned to the master namespace hosted on the shard master, and zero or more master app servers, across which applications can be load balanced. Shard Master A InterSystems IRIS instance, or a mirrored pair of InterSystems IRIS instances, which hosts a master namespace. The set of InterSystems IRIS instances comprising a sharded cluster consists of the shard master itself, the set of shard server instances hosting shards assigned to the master namespace, and any shard app servers that have been configured with the shard master as their data server, and with a namespace whose globals and routine databases are mapped to those of the master namespace on the shard master. A single InterSystems IRIS instance can host more than one master namespace, in which case each of these master namespaces defines a separate sharded cluster, and the shard master instance participates in each of these sharded clusters. If a shard master is mirrored, then the globals and routine databases of each master namespace must be mirrored.
Shard Server A InterSystems IRIS instance, or a mirrored pair of InterSystems IRIS instances, which hosts one or more shards (either data shards, query shards, or both).
Master App Server A InterSystems IRIS instance that is configured as an ECP application server with the shard master as its data server, and hosting a namespace whose globals and routine databases are mapped to those of the master namespace on the shard master. Any sharding operation (including configuration, data definition, and data manipulation) that can be performed in the master namespace on the shard master can be performed identically in the namespace on a master app server whose database are mapped to those of the master namespace. The sole exception: the first-ever call to AssignShard for a given master namespace must be made locally on the shard master (and will return an error if attempted on a namespace whose globals database is not local).
Master Namespace A InterSystems IRIS namespace which has been assigned one or more shards, and can therefore contain sharded (that is, horizontally partitioned) tables. Other than the added ability to contain sharded tables, a master namespace has all of the characteristics and capabilities of an "ordinary" namespace (for example, it can contain tables that are not sharded; it can have mappings to globals residing in databases other than its default globals database; globals in its default globals database can be transparently accessed from a namespace on an ECP app server, whose globals database is a remote database mapped to the globals database of the master namespace; etc.). Any namespace that is not a shard namespace can become a master namespace by being assigned one or more shards.
Shard Namespace A InterSystems IRIS namespace which has been designated to play the role of shard, by being specified (by host and port of the InterSystems IRIS instance on which it resides, and the name of the namespace) in a call to AssignShard. Shard namespaces are transparent to end users of sharding, who access sharded tables via a master namespace, without needing to be aware of the names or locations of the shard namespaces assigned to that master namespace. Shard namespaces are visible to users who administer and configure sharding via this $SYSTEM.Sharding API. As namespaces, they are created and managed in the same ways as any other namespaces; their roles as shards are managed using this $SYSTEM.Sharding API. A namespace cannot be both a master namespace and a shard namespace. As recommended best practice, shard namespaces should not be accessed directly by end users, and should not be used for any purpose other than as shards of the master namespaces to which they are assigned.
Shard A partition of a master namespace, that contains one horizontal partition of each sharded table in the master namespace to which the shard is assigned. A shard is implemented as as shard namespace. A shard is either a data shard or a query shard. A data shard may or may not be a mirrored shard, but a query shard is never a mirrored shard.
Note: The term shard can also be used to refer to a horizontal partition of an individual sharded table, but the context of the $SYSTEM.Sharding API, it refers to a partition of the master namespace.
Data Shard A shard in which data is stored for one horizontal partition of each sharded table in the master namespace to which the data shard is assigned. A data shard may or may not be a mirrored shard. If it is a mirrored shard, its default globals database is a mirrored database. Otherwise, its default globals database is a non-mirrrored local database, or a remote database mapped to a non-mirrored database. Data shards are assigned a shard number 1 through number of shards, in the order that they are assigned calling AssignShard.
Query Shard A shard which does not store data, but provides remote access via ECP to the data stored in a data shard to which it is assigned. Zero or more query shards may be assigned to each data shard, by specifying the shard number of corresponding data shard in a call to AssignShard. A shard namespace used as a query shard must have as its default globals database a remote database that is mapped to the default globals database of the data shard to which the query shard is assigned. When SQL operations are executed on sharded tables, read-only queries are automatically executed on query shards for any shards that have one or more query shards assigned to them, but are executed on data shards if those data shards have no query shards assigned to them. If more than one query shard has been assigned to a shard, queries are automatically load balanced among them. Write operations (insert, update, delete, and DDL operations) are automatically executed on data shards. Query shards can be used to minimize interference betwe en query and data ingestion work loads, and to increase the band width of a sharded configuration for high volume multi-user query work loads.
Mirrored Shard A data shard whose default globals database is mirrored. Use of mirrored shards provides high availability for sharded configurations, with transparent failover between mirror failover members and transparent completion of query operations, in the event of failover of one or more shards occurring during execution of a query. When configuring shard namespaces for mirrored shards, the shard namespace for a given shard must have the same name on both mirror failover members, and its default globals database must be the same mirrored database.
API Usage
A InterSystems IRIS instance is enabled to act as a shard master or shard server by calling EnableSharding.
The set of shards belonging to a master namespace is defined by making repeated calls to AssignShard, one call for each shard.
Once shards have been assigned, VerifyShards can be called to verify that they are reachable and correctly configured.
If additional shards are assigned to a namespace that already contains sharded tables, and the new shards can't be reached for automatic verification during the calls to AssignShard, ActivateNewShards can be called to activate them, once they are reachable.
After new shards are assigned, existing sharded data can be rebalanced across all shards, including new ones, by calling Rebalance.
A shard can be removed from the set belonging to a master namespace by calling DeassignShard.
An existing data shard can be assigned a different shard namespace address by calling ReassignShard.
All the shards assigned to a master namespace can be listed by calling ListShards.
A non-default IP address or DNS name for connecting an InterSystems IRIS instance to other instances of sharded cluster can be specified by calling SetNodeIPAddress.
Sharding configuration options can be set by calling SetOption, and their values can be retrieved by calling GetOption.
Methods
EnableSharding
ClassMethod EnableSharding(MaxConn As %Integer = 64, EnableAsShardServer As %Boolean = 1, AllowedConnections As %String = "") As %Status
Enables the current InterSystems IRIS instance to act as a shard master or shard server.
This API call provides a convenient way to perform several configuration steps which would otherwise need to be performed separately:
- Enables the ECP service.
- Sets the config options MaxServers and MaxServerConn.
- Optionally enables the sharding service, so this InterSystems IRIS instance can act as a shard server. Optionally configures a list of allowed connections for the sharding and ECP services.
Parameters: MaxConn Maximum number of ECP connections needed for this InterSystems IRIS instance to communicate with other instances in this sharded cluster (default 64). If non-zero, this must be greater than or equal to the total number of InterSystems IRIS instances participating in the sharded cluster, but must be at least 2 (even if there is only one instance).
The value specified for MaxConn is used to set the config options MaxServers and MaxServerConn. Specifying 0 means "do not change these config options", in which case they are assumed to be already set to appropriate values.
EnableAsShardServer TRUE(1)/FALSE(0). If EnableAsShardServer is TRUE(1) (the default), the sharding service (%Service_Sharding) is enabled for this InterSystems IRIS instance, enabling it to act as a shard server or a shard master. If EnableAsShardServer is FALSE(0), the sharding service is disabled for this instance, enabling this instance to act as a shard master, but not as a shard server.
AllowedConnections List of hosts allowed to connect to this InterSystems IRIS instance in its role as shard server, specified as a semi-colon-separated list of IP addresses or hostnames. If this list is specified, the listed hosts are configured as the allowed incoming connections for the sharding service and the ECP service (replacing any lists of allowed connections previously configured for those services); else the sharding and ECP services are configured to have no lists of allowed incoming connections (this causes there to be no restriction on which hosts may connect to this InterSystems IRIS instance via these services). If AllowedConnections is specified, the list should include all hosts participating in the sharded cluster as masters, shard servers, master app servers, or query shard servers, and should include both failover members of any mirrored pairs.
Note: AllowedConnections only needs to be specified on InterSystems IRIS instances playing the role of shard server (either data or query shard server). If EnableAsShardServer is FALSE(0), the ECP service's list of allowed connections is not modified.
Returns:
Status code reporting success or failure of this API call.
Notes:
- A user must have administrative privileges in order to execute this API call.
- After this API call, the InterSystems IRIS instance must be restarted for all of the changes to take effect.
- This API call affects an entire InterSystems IRIS instance. If this instance participates in more than one sharded cluster (e.g. contains more than one master namespace, or shards belonging to more than one master namespace), the MaxConn and AllowedConnections arguments must be sufficient for all of the clusters in which this instance participates.
Examples:
- Enable as shard server or shard master, MaxServers=MaxServerConn=64, no restriction of allowed connections:
set status = $SYSTEM.Sharding.EnableSharding() - Enable as shard server or shard master, MaxServers=MaxServerConn=3, two allowed connections specified:
set status = $SYSTEM.Sharding.EnableSharding(3,1,"172.16.120.119;172.16.120.120") - Enable as shard master only, MaxServers=MaxServerConn=64:
set status = $SYSTEM.Sharding.EnableSharding(,0)
AssignShard
ClassMethod AssignShard(MasterNamespace As %String = {$namespace}, ShardHost As %String, ShardPort As %Integer, ShardNamespace As %String, ShardNumber As %Integer = "", ShardMirrorName As %String = "", ShardBackupHost As %String = "", ShardBackupPort As %Integer = "", ShardVIP As %String = "") As %Status
Assigns a shard to a master namespace.
This API call can be used to assign a data shard or a query shard.
A data shard can be a namespace on a single InterSystems IRIS instance, or it can be a namespace whose globals database is mirrored.
A query shard must be a namespace on a single InterSystems IRIS instance, whose globals database is a remote database mapped to the globals database of the corresponding data shard. When assigning a query shard, the ShardNumber of the corresponding data shard must be specified, and that data shard must already have been assigned.
New data shards cannot be assigned if any sharded tables with user-defined shard keys already exist in the specified master namespace. If new data shards are assigned when sharded tables without user-defined shard keys already exist, this API call attempts to connect to the newly assigned shards to verify them before they are activated; if this fails (either because the new shards are not reachable, or because they fail verification), this API call returns an error indicating that ActivateNewShards must be called to activate the new shards.
New query shards can be assigned at any time, regardless of whether sharded tables already exist.
If a data shard has been flagged for pending removal by calling DeassignShard while sharded tables exist, and the shard's removal has not yet been completed by calling Rebalance, the pending removal can be cancelled by calling AssignShard. In that case, if the shard is mirrored, it optional whether to specify the ShardMirrorName, ShardBackupHost, and ShardBackupPort parameters, but an error is returned if they are specified, and do not match the actual mirror name, or the host and port of the shard's current backup failover member. Call ListShards to determine whether a shard is flagged for pending removal.
Parameters: MasterNamespace The master namespace to which the shard is assigned. Defaults to the current namespace.
ShardHost The machine hosting the shard namespace, specified by hostname or IP address.
ShardPort The default port (Super Server port) of the InterSystems IRIS instance hosting the shard namespace.
ShardNamespace The namespace being assigned as a shard.
ShardNumber Specifying a ShardNumber indicates that this shard is being assigned as a query shard, corresponding to the data shard with the specified ShardNumber, which must already have been assigned. Do not specify a value for ShardNumber when assigning a data shard.
ShardMirrorName For a mirrored shard, ShardMirrorName specifies the mirror name (also known as mirror set name) of the mirror hosting the shard. This parameter must be specified when assigning a mirrored shard, and must not be specified otherwise.
ShardBackupHost For a mirrored shard, the machine hosting the backup failover member of the mirror, specified by hostname or IP address. This parameter must be specified when assigning a mirrored shard, and must not be specified otherwise.
ShardBackupPort For a mirrored shard, the default port (Super Server port) of the backup failover member of the mirror. This parameter must be specified when assigning a mirrored shard, and must not be specified otherwise. ShardVIP For a mirrored shard, the Virtual IP address of the mirror, if one has been configured (optional).
Returns:
Status code reporting success or failure of this API call.
Notes: This API call does not create or configure InterSystems IRIS instances, namespaces, mirrors, or remote database mappings, it just assigns specified namespaces on specified single or mirrored instances to play roles as shards of a specified master namespace. The specified namespaces, instances, mirrors, and remote database mappings must be configured separately using appropriate APIs in Management Portal or in classes in the Config package, either before or after calling AssignShard. The requisites are:
- Create InterSystems IRIS instance and namespace specified by ShardHost, ShardPort, and ShardNamespace.
- For mirrored shard, create InterSystems IRIS instances specified by ShardHost and ShardPort, and by ShardBackupHost and ShardBackupPort, and configure them as the failover members of a mirrored set. (Note: It doesn't matter which one is actually the primary failover member at the time AssignShard is called.) Create namespace specified by ShardNamespace on both failover members, with its globals database configured as a mirrored database.
- For query shard, create namespace specified by ShardNamespace with remote database mapped to the globals database of the namespace of the corresponding data shard.
- Enable sharding on the shard instance, by calling EnableSharding. For mirrored shard, enable sharding on both failover members.
- This API call returns an error if the namespace specified by ShardHost, ShardPort, and ShardNamespace, or by ShardBackupHost, ShardBackupPort, and ShardNamespace, has already been assigned as a data or query shard, or as the backup of a mirrored data shard, or if it is the same as the master namespace.
- By default, this API call does not attempt to connect to the newly assigned shard to confirm that it is reachable and is correctly configured, except in the case where a data shard is assigned and sharded tables already exist. This call can be made to automatically verify all assigned shards by calling SetOption(masterNamespace,"AutoVerify",1) (where masterNamespace is set to the appropriate master namespace), or verification can be performed in a separate call to VerifyShards.
- Shard numbers are assigned to data shards sequentially, starting from 1, in the order they are assigned. The shard number of a given shard can be determined by calling ListShards. If shards are de-assigned by calling DeassignShard, the shard numbers of any high-numbered shards are decremented, so that there are never any gaps in the sequence.
- As a convenience, "localhost" can be specified as the value of ShardHost, for a shard that resides on the current machine. This is always translated internally to the actual hostname or IP address. Sharding never uses the "loop-back" IP address 127.0.0.1. Examples:
- Assign a single-instance data shard:
set status = $SYSTEM.Sharding.AssignShard("MASTER","machine1",1972,"SHARD1") - Assign a mirrored shard to the current namespace:
set status = $SYSTEM.Sharding.AssignShard(,"machine2",1972,"SHARD2",,"MIRROR1","machine3",1972,"123.45.67.89") - Assign a query shard:
set status = $SYSTEM.Sharding.AssignShard("MASTER","machine4",1972,"SHARD1",1)
ReassignShard
ClassMethod ReassignShard(MasterNamespace As %String = {$namespace}, ShardHost As %String, ShardPort As %Integer, ShardNamespace As %String, ShardNumber As %Integer, ShardMirrorName As %String = "", ShardBackupHost As %String = "", ShardBackupPort As %Integer = "", ShardVIP As %String = "") As %Status
Re-assigns an existing data shard. This API call assigns a different shard namespace (specified by host, port, and namespace) to a shard number to which a data shard has previously been assigned.
The newly specified shard namespace can be a namespace on a single InterSystems IRIS instance, or it can be a namespace whose globals database is mirrored. It is expected to contain identical data to the namespace previously assigned to the specified ShardNumber. The caller is responsible for ensuring this.
This API call has the following use cases: The globals database of the shard namespace is being relocated on a different InterSystems IRIS instance and/or host machine. The move itself is done outside of this API, and can be done in two general ways: Dismount the source database, copy its IRIS.DAT file to its target location, and configure the new shard namespace with a globals database configured to use the copied IRIS.DAT file. This requires a maintenance window during which no applications access sharded tables in this master namespace. Temporarily configure the source and target InterSystems IRIS instances as a mirrored set, with the target instance as backup failover member, backup the source database to the target database, and then promote the target instance to primary failover member. Applications can continue to access sharded tables throughout this process.
The source and target versions of the relocated shard can each either be mirrored or not. If the target is mirrored, the relocation can be done in multiple steps: first call ReassignShard to specify the target shard as a namespace on a single InterSystems IRIS instance, then after configuring the target as a mirror and backing up the target database to the backup failover member, call ReassignShard to re-specify the target shard as a mirrored namespace. If the relocation is done during a maintenance window, then ReassignShard only needs to be called once, after the target mirror is configured and the target database is up to date on both failover members.
The ability to relocate shards can be used to allow for future horizontal scaling, by initially configuring more shards than the number of of host machines, which multiple shards on each machine (each hosted on a different InterSystems IRIS instance), and later adding machines to the sharded cluster, and relocating shards to separate host machines. A shard that was previously not mirrored is being reconfigured to be mirrored. In this case, specify the ShardHost, ShardPort, and ShardNamespace of the current shard namespace, specify the ShardMirrorName of the newly configured mirror, and specify the ShardBackupHost, ShardBackupPort, and (optionally) ShardVIP of the newly configured backup failover member. A shard that was previously mirrored is being reconfigured as not mirrored, leaving what was previously the primary failover member as the sole instance hosting the shard. In this case, specify the ShardHost, ShardPort, and ShardNamespace of the current primary failover member, and do not specify values for ShardMirrorName, ShardBackupHost, ShardBackupPort, or ShardVIP.
- A new backup failover member is being specified for a mirrored shard, following failure and replacement of the previous backup failover member. In this case, specify the ShardHost, ShardPort, and ShardNamespace of the current primary failover member, specify current ShardMirrorName, and specify the ShardBackupHost, ShardBackupPort, and (optionally) ShardVIP of the newly configured backup failover member.
Parameters: MasterNamespace The master namespace to which the shard is assigned. Defaults to the current namespace.
ShardHost The machine hosting the shard namespace, specified by hostname or IP address.
ShardPort The default port (Super Server port) of the InterSystems IRIS instance hosting the shard namespace.
ShardNamespace The namespace being assigned as a shard.
ShardNumber The shard number to which a new shard namespace is assigned. There must already be a data shard assigned to this shard number.
ShardMirrorName For a mirrored shard, ShardMirrorName specifies the mirror name (also known as mirror set name) of the mirror hosting the shard. This parameter must be specified when assigning a mirrored shard, and must not be specified otherwise.
ShardBackupHost For a mirrored shard, the machine hosting the backup failover member of the mirror, specified by hostname or IP address. This parameter must be specified when assigning a mirrored shard, and must not be specified otherwise.
ShardBackupPort For a mirrored shard, the default port (Super Server port) of the backup failover member of the mirror. This parameter must be specified when assigning a mirrored shard, and must not be specified otherwise. ShardVIP For a mirrored shard, the Virtual IP address of the mirror, if one has been configured (optional).
Returns:
Status code reporting success or failure of this API call.
Notes:
- This API call does not create or configure InterSystems IRIS instances, namespaces, or mirrors, or copy, backup, mount, or dismount databases. These operations must be performed separately using appropriate APIs in Management Portal or in classes in the Config package, either before or after calling ReassignShard.
- If any query shards have previously been configured and assigned to the ShardNumber that is being re-assigned, they must be re-configured so that their globals databases are mapped to the re-configured database of the data shard. This does not require de-assigning or re-assigning the query shards through this API, since this API associates query shards with data shards by shard number, not by database mappings.
- This API call returns an error if the namespace specified by ShardHost, ShardPort, and ShardNamespace, or by ShardBackupHost, ShardBackupPort, and ShardNamespace, has already been assigned as a data shard, or as the backup of a mirrored data shard, with a shard number other than the specified ShardNumber, or if it has been assigned as a query shard with any shard number.
- By default, this API call does not attempt to connect to the re-assigned shard to confirm that it is reachable and is correctly configured . This call can be made to automatically verify all assigned shards by calling SetOption(masterNamespace,"AutoVerify",1) (where masterNamespace is set to the appropriate master namespace), or verification can be performed in a separate call to VerifyShards.
- The ShardNumber to pass to this API call, if not already known, can be determined by calling ListShards.
- As a convenience, "localhost" can be specified as the value of ShardHost, for a shard that resides on the current machine. This is always translated internally to the actual hostname or IP address. Sharding never uses the "loop-back" IP address 127.0.0.1.
Examples:
- Re-assign shard number 1 as a single-instance data shard:
set status = $SYSTEM.Sharding.ReassignShard("MASTER","machine1",1972,"SHARD1",1) - Re-assign shard number 1 as a mirrored data shard:
set status = $SYSTEM.Sharding.ReassignShard("MASTER","machine2",1972,"SHARD2",1,"MIRROR1","machine3",1972,"123.45.67.89")
Reinitialize
ClassMethod Reinitialize(MasterNamespace As %String = {$namespace}, IgnoreMappings As %Boolean = 0) As %Status
Reinitializes the internal mappings, connections, and cluster metadata of a sharded cluster which has been "cloned".
A sharded cluster is cloned by copying its master and shard databases to a new set of IRIS instances, and reassigning the shards to the hosts, ports, and namespaces of their new locations. This method is called as the final step of the process.
Parameters: MasterNamespace The master namespace of the cluster to be reinitialized.
IgnoreMappings Flag indicating whether to ignore any user-defined global, routine, or package mappings that were defined in the master namespace of the original cluster that is being cloned. By default, this method returns an error if there were user-defined mappings in the original master namespace, because this might indicate that data accessed via these mappings will be unavailable in the targetcluster. If 1 is specified for IgnoreMappings, user-defined mappings in the original master namespace don't cause an error to be returned.
Returns:
Status code reporting success or failure of this API call.
Cluster cloning usage:
- Call GetConfig to get information about the directory paths of the master and shard databases, which need to be copied to the new cluster.
- Shut down the original cluster, and copy or backup the master and shard databases, making sure to keep track of which shard database belongs with which shard number. NOTE: Whether or not the original cluster is mirrored, the target cluster will not be mirrored initially. Therefore, only the databases from one failover member of each mirrored node need to be copied.
- Create IRIS instances of the target cluster, making sure to keep track of which target instance corresponds to which shard number of the original cluster. (NOTE: Do not create instances for compute nodes or mirror backups or DR asyncs - these can be added later if desired.) In each instance, create a namespace to be the shard namespace, and cause its globals database to be the copied shard database from the corresponding node in the original cluster (this can be done by restoring a backup of the original database, or by copying the IRIS.DAT file - any approach supported by IRIS for copying a database can be used). In whichever target instance corresponds to the original instance containing the master namespace and database, create a namespace to be the master namespace of the target cluster, and cause its globals database and (if different) routines database to be the copied master databases from the original cluster.
- In the master instance of the target cluster, call ReassignShard once per shard, to specify the hostname or IP address, Super Server port, and namespace of each target shard, making sure to specify the correct shard numbers to correspond to the original cluster.
- Call Reinitialize. This verifies that the target cluster is of a compatible version, sets up all the mappings, ECP connections, and metadata needed to activate the target cluster, and automatically calls VerifyShards to complete activation and confirm that the configuration is correct.
- NOTE: If the original master namespace has user-defined global, routine, or database mappings which need to be duplicated in the target cluster, create equivalent mappings in the target master namespace. Either copy the databases containing the items pointed to by the mappings, from the original cluster to the target cluster, or define the mappings in the target master namespace to point to the original databases containing the mapped items.
Examples:
- Reinitialize the cluster whose master namespace is "MASTER", returning an error if any user-defined mappings existed in the original master namespace:
set status = $SYSTEM.Sharding.Reinitialize("MASTER") - Reinitialize the cluster whose master namespace is "MASTER", ignoring any user-defined mappings that existed in the original master namespace:
set status = $SYSTEM.Sharding.Reinitialize("MASTER", 1)
DeassignShard
ClassMethod DeassignShard(MasterNamespace As %String = {$namespace}, ShardHost As %String, ShardPort As %Integer, ShardNamespace As %String) As %Status
De-assigns a shard from a master namespace to which it had previously been assigned. This removes the shard from the set of shards belonging to the master namespace.
This API call can be used to de-assign a data shard or a query shard.
If a data shard is de-assigned while any sharded tables exist, the data shard is not immediately removed, but is flagged for pending removal. The next call to Rebalance will move all sharded data from any shards that are flagged for pending removal, to other data shards, and will complete the process of removing the shards that are pending removal. The last remaining data shard in a cluster may not be de-assigned.
When a query shard is de-assigned, it is always immediately removed.
When de-assigning a data shard to which any query shards are currently assigned, those query shards are automatically de-assigned as well.
When de-assigning a mirrored data shard, the host and port of either failover member may be specified. In either case, both failover members are de-assigned as shards.
Parameters: MasterNamespace The master namespace to which the shard is currently assigned. Defaults to the current namespace.
ShardHost The machine hosting the shard namespace, specified by hostname or IP address.
ShardPort The default port (Super Server port) of the InterSystems IRIS instance hosting the shard namespace.
ShardNamespace The namespace being de-assigned as a shard.
Returns:
Status code reporting success or failure of this API call.
Notes:
- De-assigning a shard does not delete the shard namespace, or any data that may have been created in that namespace while it was serving as a shard. It simply causes that namespace to no longer serve as a shard of the specified master namespace, and makes the namespace available to be assigned as a shard of different master namespace. When a data shard is de-assigned, the shard count is decremented, and the shard numbers of any higher-numbered shards are decremented, so that there are no gaps in the sequence of shard numbers. When a query shard is de-assigned, this simply reduces by one the number of query shards assigned to the data shard to which that query shard had been assigned.
- Query shards can be de-assigned at any time, just as they can be assigned at any time. This provides a dynamic means of adjusting the query throughput capacity of a sharded cluster as multi-user workloads grow or shrink. Note that this does not change the degree of parallelism for the execution of an individual query (which always equals the number of data shards), but it changes the multi-user throughput by enabling different concurrent queries to execute on different sets of query shards. When a query shard is de-assigned, it continues to be used by any active user connections already in progress, but is not used by any new user connections. Therefore, query shards can safely be de-assigned while query applications are running.
- When a data shard is flagged for pending removal, due to calling DeassignShard while sharded tables exist, the pending removal can be cancelled by calling AssignShard. Call ListShards to determine whether a shard is currently flagged for pending removal.
- This API call returns an error if the the shard specified by ShardHost, ShardPort, and ShardNamespace is not currently assigned to the specified master namespace as either a query shard, a data shard, or either failover member of a mirrored data shard.
Examples:
- De-assign a shard from a specified master namespace:
set status = $SYSTEM.Sharding.DeassignShard("MASTER","machine1",1972,"SHARD1") - De-assign a shard from the current namespace:
set status = $SYSTEM.Sharding.DeassignShard(,"machine1",1972,"SHARD1")
VerifyShards
ClassMethod VerifyShards(MasterNamespace As %String = {$namespace}, ShardNumber As %Integer = "", QueryShardNumber As %Integer = "", ReturnFirstError As %Boolean = 0) As %Status
Verifies that assigned shards are reachable and are correctly configured.
Verifies either all shards that have been assigned to a specified master namespace (the simplest and recommended usage), or a specified data shard, or a specified query shard of a specified data shard.
For each shard verified, this API call verifies the following, and returns a specific error for each failure:
- The shard is reachable (the InterSystems IRIS instance hosting the shard is started, and can be reached via TCP/IP).
- The ECP and sharding services are enabled on the instance hosting the shard.
- If the sharding service on that instance has a list of allowed incoming connections, the host on which this API call is made is included in the list.
- The instance hosting the shard has the config parameters MaxServers and MaxServerConn set to sufficiently high values for the currently assigned set of shards. This checks for a value that is at least as great as the total number of InterSystems IRIS instances participating in the sharded cluster, including the shard master and each additional instance that hosts one or more shards, but not including any master app servers. For this purpose, a mirror counts as one instance. (If there are any master app servers participating in the sharded cluster, the number of master app servers must be added to the MaxServer and MaxServerConn settings on each instance in the sharded cluster, beyond the minimum value that is checked by this API call.)
- The instance hosting the shard does not require restart due to changes having been made to the CPF file and not yet activated. (Note that changes to MaxServers and MaxServerConn require restart before they are effective.)
- The shard namespace exists.
- The shard namespace is not the master namespace of some other sharded cluster, and has not already been assigned as a shard of some other sharded cluster. (AssignShard ensures that the same namespace is not assigned twice as a shard of the present sharded cluster.)
- For query shards, the shard namespace has as its default globals database a remote database that is mapped to the default globals database of the data shard to which the query shard is assigned. For mirrored shards, the instance assigned as primary failover member really is a mirror failover member, and really does have, as its backup failover member, the instance assigned as backup failover member (for this purpose, it does not matter which failover member is currently the primary; if the mirror is correctly configured, this API call transparently connects to whichever failover member is currently the primary, and verifies that the other failover member's host and port were correctly specified to AssignShard).
This API call also verifies that the master instance on which it is invoked has the ECP service enabled, has sufficient values for the config parameters MaxServers and MaxServerConn, and does not require restart due to changes not yet activated in the CPF file.
Parameters: MasterNamespace The master namespace whose shards are to be verified. Defaults to the current namespace.
ShardNumber The shard number of the shard to be verified (1 through number of data shards). By default, verifies all shards.
QueryShardNumber Query shard number, among the query shards of the specified shard, of the shard to be verified. ShardNumber must be specified if QueryShardNumber is specified. By default, verifies all query shards of the specified shard, or all query shards of all shards if ShardNumber is not specified.
ReturnFirstError TRUE(1)/FALSE(0). If ReturnFirstError is TRUE(1), this API call will return an error after the first shard that fails verification. If ReturnFirstError is FALSE(0) (the default), this API call will attempt to verify all data shards, and if any of them fail verification, it will return a nested status including the error for each shard which failed verification. If all data shards are successfully verified, this API call with then attempt to verify all query shards, and if any of them fail verfication, it will return a nested status including the error for each query shard which failed verification.
Returns:
Status code reporting success or failure of this API call. If multiple shards fail verification, this is a nested status code indicating how many shards failed, which shards failed, and the specific error for each failed shard.
Notes:
- The API call AssignShard can be made to automatically verify all assigned shards by calling SetOption(masterNamespace,"AutoVerify",1) (where masterNamespace is set to the appropriate master namespace). This has the advantage of avoiding the need for a separate call to VerifyShards, but means that shard instances and namespaces must already be reachable and correctly configured when calling AssignShard, or it will return verification errors, requiring a separate call to be made to VerifyShards once all shard instances and namespaces have been correctly configured and made reachable. Separating these two API calls provides the option of configuring shard instances and namespaces either before or after assigning them as shards.
Examples:
- Verify all shards assigned to the current namespace:
set status = $SYSTEM.Sharding.VerifyShards() - Verify shard 1 assigned to namespace "MASTER":
set status = $SYSTEM.Sharding.VerifyShards("MASTER",1) - Verify the second query shard assigned to shard 3 of the current namespace:
set status = $SYSTEM.Sharding.VerifyShards(,3,2) - Verify all shards assigned to namespace "MASTER", but stop and return error for the first failure:
set status = $SYSTEM.Sharding.VerifyShards("MASTER",,,1)
ListShards
ClassMethod ListShards(MasterNamespace As %String = {$namespace}) As %Status
Lists the shards assigned to a specified master namespace, to the console or current device.
The list contains a row for each shard, with information in columns under the following headings:
- Shard - the shard number (1 through number of shards).
- Host - the hostname or IP address of the machine hosting the shard.
- Port - the default port (Super Server port) of the InterSystems IRIS instance hosting the shard.
- Namespace- the shard namespace.
- Mirror - the mirror name, if the shard is mirrored. Role - the shard's role:
- (blank) - ordinary data shard.
- "Query" - query shard.
- "Primary" - primary failover member hosting mirrored shard.
- "Backup - backup failover member hosting mirrored shard. VIP - VIP for mirrored shard, if one is configured.
Parameters: MasterNamespace The master namespace whose shards are listed. Defaults to the current namespace.
Returns:
Status code reporting success or failure of this API call.
Notes:
- This API call can be used to determine the shard number corresponding to a shard's host, port, and namespace, for use as the ShardNumber parameter to AssignShard, when assigning a query shard to a shard specified by shard number.
- For mirrored shards, indicated primary and backup members are those at the time of initial shard assignment, or of the most recent operation that required connecting to shards. It is possible that the mirror has failed over since then, in which case the indicated primary member is now the backup member and vice versa.
- Shards which have been assigned but not yet activated are listed with their shard number in parentheses and followed by an asterisk, for example "(25*)". (See ActivateNewShards for an explanation of activating shards.)
- Shards which are pending removal, because they have been deassigned but Rebalance has not yet been called to moved sharded data from them to other shards, are listed with their shard number in brackets and followed by two asterisks, for example "[25**]".
Examples:
- List shards assigned to the current namespace to the console:
set status = $SYSTEM.Sharding.ListShards() - List shards asssigned to namespace "MASTER" to the file "shards.list" in the current directory:
open "shards.list":"NW"
use "shards.list" s status=$SYSTEM.Sharding.ListShards()
close "shards.list"
GetConfig
ClassMethod GetConfig(Namespace As %String = {$namespace}, ByRef ConfigInfo) As %Status
Retrieves configuration information about the sharded cluster to which the specified master or cluster namespace belongs.
The information is returned in an array, passed by reference, under the following top-level subscripts:
- "Master" - Information about the cluster's master namespace and database, and the node on which they reside. For mirrored clusters, this subscript appears as "Master:Primary" or "Master:Backup" to distinguish the two failover members. "Shard", shard number - Information about the shard with the indicated shard number. For mirrored clusters, the shard number subscript includes "Primary", "Backup", or "DRAsync", separated from the shard number by a colon, for example "1:Primary". For query shards, the shard number subscript includes "Query", separated from the shard number by a colon. For DR async nodes or query shards, the subscript includes an additional colon-separated number, to distinguish between multiple DR async nodes or query shards with the same shard number, for example "1:DRAync:2" or "3:Query:1". Configuration details for the master and each shard are returned under a second subscript (for the master) or third subscript (for shards) with the following values:
- "GlobalsDatabase" (master only) - the name of the master globals database.
- "GlobalsDirectory" (master only) - the directory path of the master globals database.
- "RoutinesDatabase" (master only) - the name of the master routines database.
- "RoutinesDirectory" (master only) - the directory path of the master routines database.
- "Database" (shards only) - the name of the shard database.
- "Directory" (shards only) - the directory path of the shard database.
- "Host" - the hostname or IP address of the machine hosting the node's InterSystems IRIS instance.
- "SuperServerPort" - the default port (Super Server port) of the InterSystems IRIS instance. "Namespace" - the name of the master or shard namespace.
Parameters: Namespace Master or shard (i.e. cluster) namespace belonging to the sharded cluster whose configuration information is to be returned.
ConfigInfo Array in which configuration information is returned, passed by reference.
Returns:
Status code reporting success or failure of this API call.
Notes:
- For query shards, only the "Host", "SuperServerPort", and "Namespace" are returned. For DR async nodes, only the "Host" and "SuperServerPort" are returned.
Example: Get configuration information for the 2-shard cluster whose master namespace is "IRISDM", returned in the array config: set status = $SYSTEM.Sharding.GetConfig("IRISDM",.config)
zw config
config("Master","GlobalsDatabase")="IRISDM"
config("Master","GlobalsDirectory")="/home/iris/mgr/irisdm/"
config("Master","Host")="machine1"
config("Master","Namespace")="IRISDM"
config("Master","RoutinesDatabase")="IRISDM"
config("Master","RoutinesDirectory")="/home/iris/mgr/irisdm/"
config("Master","SuperServerPort")=1972
config("Shard",1,"Database")="IRISSHARD"
config("Shard",1,"Directory")="/home/iris/mgr/irisshard/"
config("Shard",1,"Host")="machine1"
config("Shard",1,"Namespace")="IRISCLUSTER"
config("Shard",1,"SuperServerPort")=1972
config("Shard",2,"Database")="IRISSHARD"
config("Shard",2,"Directory")="/home/iris/mgr/irisshard/"
config("Shard",2,"Host")="machine2"
config("Shard",2,"Namespace")="IRISCLUSTER"
config("Shard",2,"SuperServerPort")=1972
SetOption
ClassMethod SetOption(MasterNamespace As %String = {$namespace}, OptionName As %String, OptionValue As %Integer = "") As %Status
Sets a specified sharding configuration option to a specified value, within the scope of a specified master namespace.
All option values are integers, except in the case of MasterIPAddress. The supported options, with their allowed values and defaults, are as follows:
Option Name | Description | Allowed Values | Default |
---|---|---|---|
AutoVerify | Should AssignShard automatically call VerifyShards? 1: Yes. 0: No. | 0/1 | No |
ConnectTimeout | Timeout when connecting to a shard, in seconds. | >=1 | 60 |
Debug | Enables debug trace. 1: Enable all debug trace messages. Higher numbers: enable increasingly selective debug trace. Only recommended when working directly with InterSystems support. | 1-10 | No debug trace. |
DropIgnoreError | Should errors occurring during DROP TABLE be ignored? 1: Yes. 0: No, return the error. | 0/1 | No |
MasterIPAddress | IP address to use for master data server, rather than using DNS resolution on hostname. Deprecated, use SetNodeIPAddress instead. | Valid IP address | Use hostname. |
MirrorConnectAttempts | Number of times to retry connecting to a mirrored shard. | >=1 | 1 |
QuiesceAllowReads | Should Backup.ShardedCluster.Quiesce allow reads? 1: Yes. 0: No, block reads as well as writes. | 0/1 | No |
RunQueriesAsync | (DEPRICATED) Should queries be run asynchronously? 1: Yes. 0: No, run queries synchronously. This permits transparent completion in event of shard failover, when executing queries on mirrored data shards, but it prevents cancelling queries and increases the risk of filling the IRISTEMP database. This option is depricated. The recommended approach for transparent query failover handling is to assign a query shard for each data shard. | 0/1 | Run queries asynchronously. |
Parameters: MasterNamespace The master namespace within whose scope the option is set.
OptionName The name of the option (case insensitive), validated against a list of supported options.
OptionValue The value to which to set the option, validated against allowed values for the specified option. Omitting this parameter, or specifying "", causes the specified option to be undefined, resulting in the default behavior described in the table above.
Returns:
Status code reporting success or failure of this API call.
Examples:
- Set the option "ConnectTimeout" to 15 in the namespace "MASTER":
set status = $SYSTEM.Sharding.SetOption("MASTER", "ConnectTimeout", 15)
GetOption
ClassMethod GetOption(MasterNamespace As %String = {$namespace}, OptionName As %String, ByRef OptionValue As %Integer) As %Status
Gets the value of a sharding configuration option specified by name, within the scope of a specified master namespace.
Parameters:
MasterNamespace The master namespace within whose scope the option's value is determined.
OptionName The name of the option (case insensitive), validated against a list of supported options.
OptionValue (Output) This parameter is set to the option's value.
Returns:
Status code reporting success or failure of this API call.
Notes:
- See SetOption for a table of supported options, the meanings of their values, and the default behavior if they are not set.
Examples:
- Set the variable connectTimeout to the value of of the option "ConnectTimeout" in the namespace "MASTER":
set status = $SYSTEM.Sharding.GetOption("MASTER","ConnectTimeout",connectTimeout)
ActivateNewShards
ClassMethod ActivateNewShards(MasterNamespace As %String = {$namespace}) As %Status
Activates shards that could not be activated by prior calls to AssignShard.
If new data shards are assigned when sharded tables already exist, AssignShard attempts to connect to the newly assigned shards to verify them before they are activated. If this fails (either because the new shards are not reachable, or because they fail verification), AssignShard returns an error indicating that ActivateNewShards must be called to activate the new shards.
If this API call returns an error, it can be called again, after correcting the problem reported in the error, until it succeeds. If there are multiple shards requiring activation, none of them is activated until all of them can be successfully activated.
Newly assigned shards are always automatically activated, except in the case where sharded tables already exist and the new shards cannot be reached or cannot be verified. Therefore, ActivateNewShards never needs to be called unless new shards are assigned while sharded tables already exist.
Parameters: MasterNamespace The master namespace whose new shards are to be activated. Defaults to the current namespace.
Returns:
Status code reporting success or failure of this API call.
Notes:
- When new shards have not yet been successfully activated, the master namespace's shard count has not been incremented to include them. This ensures that all sharded operations can continue executing on activated shards with no problems, without being affected by the existence of shards that have not yet been activated. When this API call succeeds in activating new shards, it increments the shard count by the number of newly-activated shards, making them available for use in all sharded operations. Existing data stored in other shards is not moved to newly activated shards, but data inserted after they are activated is evenly balanced across all shards including newly activated shards.
- Shards which have been assigned but not yet activated are listed by ListShards with their shard number in parentheses and followed by an asterisk, for example "(25*)".
Examples:
- Activate shards newly assigned to the current namespace:
set status = $SYSTEM.Sharding.ActivateNewShards()
Rebalance
ClassMethod Rebalance(Namespace As %String = {$namespace}, TimeLimit As %Integer = 0, ByRef Report, MinBuckets As %Integer = 1) As %Status
Rebalances existing sharded data across all current shards, including newly assigned ones. Causes future inserts to all sharded tables to be balanced across all shards that exist at the time Rebalance is called.
Rebalancing, together with the capability to assign new shards, provides elasticity to sharded clusters.
Data is rebalanced at the granularity of "buckets". For tables with user-defined shard keys, rows of data are assigned to buckets based on a hash of the shard key. There are always 2048 buckets, and the number of rows per bucket varies depending on the size of a table. For tables with system-assigned shard keys, rows are assigned to buckets based on ranges of rowids. Each range contains 256,000 rowid values, providing a fixed maximum bucket size, and the number of buckets varies with the size of the table.
Internally, the shard locations of a table's buckets are recorded in a "shard map". All tables with user-defined shard keys share the same shard map, ensuring that they are always mutually "cosharded" (that is, rows with the same user-defined shard key value will be located in the same shard, for tables with the same number and order of shard key fields). For tables with system-assigned shard keys, each table has its own shard map (which is shared by any table that is specified to "COSHARD WITH" that table).
Rebalancing moves buckets from shards with more than average number of buckets to shards with less than average, until the bucket counts of all shards differ by at most 1. Rebalancing modifies the shard maps accordingly, so that internal logic can always find rows in their current locations, and newly inserted data will be stored in the shard to which rebalancing has moved the bucket to which the data belongs.
When, for all sharded tables, the buckets counts of all shards differ by no more than one, the cluster is considered to be completely rebalanced. By default, Rebalance runs until the cluster is completely rebalanced. If the TimeLimit argument is specified to restrict rebalancing to a maintenance window, multiple calls to Rebalance may be required to reach a completely rebalanced state. Between these calls, the cluster and all sharded data it contains is in a fully usable state, but is not yet taking full advantage of newly-added shards.
If any data shards are flagged for pending removal, due to calling DeassignShard or the %SYSTEM.Cluster method Detach, all sharded data is moved from those shards to other shards that are not flagged for pending removal. In that case, the cluster is not considered to be completely rebalanced until all sharded data has been removed from shards that are flagged for pending removal. Once the cluster is completely rebalanced, Rebalance completes the process of removing the shards that are flagged for pending removal.
Parameters: Namespace A namespace in the current instance, which is a master or shard namespace of the sharded cluster to rebalance. Defaults to the current namespace.
TimeLimit Specifies, in seconds, the time window within which this call should aim to complete (defaults to 0, meaning no time limit). No new bucket move is started, whose expected execution time would exceed this time limit (based on the average execution time for moving previous buckets of the same shard map). A minimum of MinBuckets is moved, regardless of the time limit. A bucket move is an atomic operation which is never interrupted, once started, so total execution time may exceed the specified time limit, if the last bucket move takes longer than average for buckets of the same shard map. Report Returns detailed information about the rebalancing operation, in a subscripted variable, as follows:
- Report("Completed"): Set to 1 if all sharded tables have been fully rebalanced, or 0 if further rebalancing is required.
- Report("Elapsed Seconds"): Elapsed time for executing this call, in seconds.
- Report("Buckets Moved"): The cumulative total number of buckets moved by this and previous calls, since rebalancing started.
- Report("Buckets To Move"): The total number of buckets remaining to be moved, for rebalancing of all sharded tables to be completed.
- Report("Maps",mapname,"Buckets Moved"): The cumulative number of buckets moved by this and previous calls, for the shard map identified by mapname, which is the class name of one of the tables which share this map, or the special map name "||udsk" used for tables with user defined shard keys.
- Report("Maps",mapname,"Buckets To Move"): The number of buckets remaining to be moved for the shard map identified by mapname. Report("Maps",mapname,"Average Time"): The average time for moving a bucket of the shard map identified by mapname, computed cumulatively since rebalancing started. If Rebalance fails, leaving a bucket in an inconsistent, incompletely moved state, Report contains the following additional information:
- Report("Incomplete Move"): Map name identifying the shard map for which a move was incomplete.
- Report("Maps",mapname,"Incomplete Move","Bucket"): The bucket number of the shard map identified by mapname, which was incompletely moved.
- Report("Maps",mapname,"Incomplete Move","From Shard"): The shard number from which the bucket was being moved.
- Report("Maps",mapname,"Incomplete Move","To Shard"): The shard number to which the bucket was being moved. Report("Maps",mapname,"Incomplete Move","Action Needed"): The action needed to correct the situation, which will be automatically performed at the beginning of the next call to Rebalance. One of:
- "Delete New": Delete any data belonging to this bucket from the new location. A complete copy of this bucket is guaranteed to still exist at the old location.
- "Delete Old": Delete any data belonging to this bucket from the old location. A complete copy of this bucket is guaranteed to already exist at the new location.
- "Update Map": Update the shard map to record this bucket's new location. The bucket's data is guaranteed to have already been moved to the new location. MinBuckets
The minimum number of buckets to move during this call to Rebalance, regardless of the time limit (default 1). If 0 is specified, no buckets will be moved if the average bucket move time for every shard map is greater than the specified time limit.
Returns:
Status code reporting success or failure of this API call.
Notes:
- Rebalance can be called in any master or shard namespace of a cluster (that is, in the cluster namespace of any cluster node, to use the "node level" terminology used with the %SYSTEM.Cluster API).
- While rebalancing is in progress, JDBC batch inserts to sharded tables are not permitted, and will return an error if attempted. All other operations on sharded tables are permitted, including queries, updates, inserts, deletes, creating, altering and dropping tables, as well as assigning new shards. Performance of some operations may be slower than when rebalancing is not in progress, but all operations will execute correctly.
- When new shards are assigned, newly inserted data in tables with user-defined shard keys is not stored in the new shards until rebalancing has been performed, because the shard map records all hash buckets as being located on previously-existing shards, until rebalancing moves hash buckets to new shards, logically as well as physically. For tables with system-assigned shard keys, new inserts are evenly distributed across all shards, new as well as preexisting, whether or not rebalancing has yet been performed. Rebalancing redistributes the previously existing data of these tables, but does not affect where newly-inserted rows are stored. Rebalancing keeps track of its internal state persistently, so that if it fails for any reason, Rebalance can be called again to resume and complete the rebalancing process. This includes cases in which Rebalance returns an error, as well as cases where the job in which Rebalance was running has halted, or the IRIS instance in which Rebalance was running has been restarted, while rebalancing was in progress. After a failure, the cluster remains in "rebalancing in progress" mode until Rebalance is called again and either completes the entire rebalancing process, or reaches the specified TimeLimit. While in this mode:
- Queries, inserts, updates, deletes continue to be executed in "rebalancing in progress" mode, guaranteeing that they execute correctly.
- JDBC batch inserts continue to be prohibited.
- Inserts, updates, and deletes of data in buckets that have been moved continue to be forwarded to the new bucket locations. If one or more cluster nodes have been restarted, background daemons manage this transparently. These daemons automatically terminate once rebalancing is completed.
- When, due to TimeLimit being specified, Rebalance completes without error, but without completely rebalancing the cluster, the cluster is not left in "rebalancing in progress" state. Rather, it is left in a fully useable state that is identical to a completely rebalanced state, except that bucket counts differ between shards by more than 1. It is guaranteed that for tables that share the same shard map, if a bucket has been moved, the data belonging to that bucket has been moved for all tables sharing the shard map, guaranteeing that cosharded tables remain cosharded at all times. Examples:
- Completely rebalance the cluster to which the current namespace belongs:
set status = $SYSTEM.Sharding.Rebalance() Completely rebalance the cluster to which the current namespace belongs, returning a report in the variable report:
set status = $SYSTEM.Sharding.Rebalance(,,.report) - Rebalance the cluster to which namespace "SHARD1" belongs, moving as many buckets as can be moved within a one hour maintenance window, and returning a report in the variable report:
set status = $SYSTEM.Sharding.Rebalance("SHARD1",3600,.report,0) - Rebalance the cluster to which namespace "SHARD1" belongs, moving as many buckets as can be moved within a one minute maintenance window, but moving at least one bucket, even if moving that one bucket is expected to take more than one minute, and returning a report in the variable report:
set status = $SYSTEM.Sharding.Rebalance("SHARD1",60,.report,1)
CheckRebalanceProgress
ClassMethod CheckRebalanceProgress(Namespace As %String = {$namespace}, ByRef Report) As %Status
Reports the progress and status of the current Rebalance operation.
Parameters:
Namespace A namespace in the current instance, which is a master or shard namespace of the sharded cluster in which Rebalance is running. Defaults to the current namespace.
Report Returns information about Rebalance progress and status, in a subscripted variable, as follows: Report("State"): The overall state of the Rebalance operation. One of:
- "Setting Up": Rebalance is preparing to move data.
- "Moving Data": Rebalance is currently moving data.
- "Cleaning Up": Rebalance is deleting data from old locations and updating metadata to complete the rebalancing operation.
- "Timed Out": Rebalance was run with a time limit specified, and has stopped executing after reaching that time limit.
- "Terminated With Error": Rebalance stopped executing due to an error. The specific error is reported in Report("Status").
- "Not Running": Rebalance is not currently running. Either it has not yet started, or has completed with no error, or was never called.
- Report("Status"): The success or error status of the Rebalance operation. Report("Tables"): Information about tables being rebalanced or considered for rebalancing, subscripted by table name. The following is reported for each table:
- "Buckets Moved": The cumulative total number of buckets of this table moved by this and previous Rebalance calls, since rebalancing started.
- "Buckets To Move": The total number of buckets remaining to be moved, for rebalancing of this table to be completed. "Operation:": The type of rebalancing operation being performed on this table, one of:
- "Convert To Sharded": This table is being converted from nonsharded to sharded.
- "Convert To Nonsharded": This table is being converted from sharded to nonsharded.
- "Rebalance": This table is not being converted, it is just being rebalanced. If "Buckets To Move" and "Buckets Moved" both equal 0, this table was considered for rebalancing, but no rebalancing was needed.
SetNodeIPAddress
ClassMethod SetNodeIPAddress(NodeIPAddress As %String) As %Status
Specifies the IP address or DNS name to be used by this InterSystems IRIS instances, when interconnecting with other instances in a sharded cluster, rather than using the IP address resulting from resolving this machine's default hostname.
This API call does not need to be used in most sharded cluster deployment scenarios. It is only needed when neither DNS resolution of the default hostname, nor calling $SYSTEM.TCPDevice.LocalAddr() on an open TCP device, return an IP address that is usable for interconnecting nodes of a cluster. (For purposes of this discussion, the default hostname is the name returned by $piece($system,":",1) in an IRIS session).
When this API call is needed, it must be called in each instance of a sharded cluster, before any shards are assigned (in the case of a shard master instance), before any namespace in the current instance is assigned as a shard (in the case of a data shard or query shard instance), or before any sharded operations or other sharding API calls are made in this instance (in the case of a master app server instance). The same IP address or DNS name specified for a shard instance in this API call should also be passed as the ShardHost argument to AssignShard when assigning a namespace in the instance as a shard, rather than passing the machine's default hostname.
Parameters: NodeIPAddress The IP address or DNS name to be used by this InterSystems IRIS instance, when interconnecting with other instances in a sharded cluster.
Returns:
Status code reporting success or failure of this API call.
Notes:
- A user must have administrative privileges in order to execute this API call.
- Recommended practice: Specify an IP address, in scenarios where the machine's IP address will remain constant, but there is no hostname or DNS name available that resolves to this address. Specify a DNS name, in scenarios in which the machine's IP address may change regularly, but the default hostname cannot be counted on to resolve to the correct current IP address needed for interconnecting with other cluster nodes. It may be necessary to define a DNS name / IP address mapping outside of InterSystems IRIS (for example, in /etc/hosts on Unix systems), and update that mapping as the machine's IP address changes, but this avoids needing to reconfigure the cluster when IP addresses change.
- This API call affects an entire InterSystems IRIS instance. If this instance participates in more than one sharded cluster (e.g. contains more than one master namespace, or shards belonging to more than one master namespace), this call specifies the IP address used for this instance in all of the clusters in which this instance participates.
- This API call provides similar functionality to that provided by the setting the deprecated sharding option "MasterIPAddress", but this API call works in a wider range of scenarios. Sharded clusters that are already working using "MasterIPAddress" do not need to be modified, but it is recommended that SetNodeIPAddress be used when configuring new sharded clusters.
Example:
- Set this instance's IP address to "10.0.0.76":
set status = $SYSTEM.Sharding.SetNodeIPAddress("10.0.0.76") - Set this instance's DNS name to "mydnsname":
set status = $SYSTEM.Sharding.SetNodeIPAddress("mydnsname")
Reset
ClassMethod Reset(Namespace As %String = "") As %Status [ Internal ]
SetShardBrokeredHost
ClassMethod SetShardBrokeredHost(Namespace As %String = "", ShardNumber As %Integer = 0, BrokeredHost As %String = "", Backup As %Boolean = 0) As %Status
AddDatabasesToMirrors
ClassMethod AddDatabasesToMirrors(MasterNamespace As %String = {$namespace}) As %Status
RemoveMirroring
ClassMethod RemoveMirroring(MasterNamespace As %String = {$namespace}, Verbose As %Boolean = 1) As %Status
PurgeShard
ClassMethod PurgeShard(MasterNamespace As %String = {$namespace}, ShardHost As %String, ShardPort As %Integer, ShardNamespace As %String, ShardNumber As %Integer) As %Status
ClusterIsNodeLevel
ClassMethod ClusterIsNodeLevel(ClusterNamespace As %String = {$namespace}) As %Boolean
GetFederatedTableInfo
ClassMethod GetFederatedTableInfo(ShardNamespace As %String = {$namespace}, FederatedTableName As %String = "", Output Info As %List) As %Status
Experimental Feature -- InterSystems reserves the right to make backwards-incompatible changes to this API in future releases
Returns information about a specified federated table.
Parameters: ShardNamespace A shard namespace in the current InterSystems IRIS instance (which must belong to a sharded cluster). Defaults to the current namespace.
FederatedTableName The name of the federated table. Must be a valid table name.
Info
Output parameter which returns information about the specified federated table as an array subscripted by shard number. For each shard number, the format is:
Info(shardNumber) = $lb(sourceNamespace, sourceTableName, columnList)
where columnList is in the format: <
$lb( $lb(sourceColumnName1, federatedColumnName1), ...)
If no source table is connected to the specified federated table in a given shard, that shard's entry is:
Info(shardNumber) = ""
Returns:
Status code reporting success or failure of this API call.
Notes:
- Call ListShards or GetConfig to determine the hostname, port, or namespace corresponding to a shard number.
Examples: Get information about a federated table called "Sample.Person" in the current shard namespace, to be returned in the variable personInfo:
set status = $SYSTEM.Sharding.GetFederatedTableInfo(,"Sample.Person",.personInfo)
CreateFederatedTable
ClassMethod CreateFederatedTable(ShardNamespace As %String = {$namespace}, FederatedTableName As %String = "", SourceNamespace As %String, SourceTableName As %String, ColumnList As %List = "", Force As %Boolean = 0, Verbose As %Boolean = 1) As %Status
Experimental Feature -- InterSystems reserves the right to make backwards-incompatible changes to this API in future releases
Creates a read-only federated table, based on a specified source table.
If the specified federated table already exists, this method re-creates it, based on the current definition of the source table.
The federated table is visible in any shard namespace of the sharded cluster to which the specified ShardNamespace belongs. The list of columns determined by this method, either based on the list of columns of the specified source table, or customized with the optional ColumnList argument, becomes the list of columns for the overall federated table. Additional source tables can be attached to this federated table, on any instance of this sharded cluster, by calling ConnectFederatedTable in that instance, specifying a ShardNamespace and SourceNamespace on that instance. Queries of a federated table span all currently-connected source tables.
Parameters: ShardNamespace A shard namespace in the current InterSystems IRIS instance (which must belong to a sharded cluster). Must be on same instance as SourceNamespace. The specified namespace must be a data shard, not a query shard. Defaults to the current namespace.
FederatedTableName The name of the federated table to be created. Must be a valid table name.
SourceNamespace The namespace containing the source table. Must be on same instance as ShardNamespace. SourceTableName The name of the source table.
ColumnList Optionally specifies a customized set of columns for the federated table.
Format: $lb( $lb(sourceColumnName1 [, federatedColumnName1 ] ), ...)
- If ColummList is non-empty, the federated table will have one column for each entry in the list.
- If federatedColumnNameN is not provided, sourceColumnNameN is used.
- If any sourceColumnNameN does not exist in the source table, this method returns an error.
If ColumnList is not provided, the federated table's set of columns will match the source table's set of columns exactly.
Force Force the class of the specified source table, and any classes on which it depends, to be projected as proxies in the specified ShardNamespace, even if proxies for those classes already exist. If Force=0 (the default), new proxies are only created for classes whose proxies do not already exist, which can save significant time when creating multiple federated tables based on a source schema containing many interdependent classes. Specifying Force=1 ensures that proxies are up-to-date with the latest versions of source classes. Verbose If Verbose=1 (the default), report each proxy class that is created and compiled, to the output device. If Verbose=0, do not write any progess information.
Returns:
Status code reporting success or failure of this API call.
Notes:
- It is permitted for the source table to reside in a read-only database.
- If there are changes to the definition of source table, or of classes on which it depends, this method, or ConnectFederatedTable, must be called to re-create proxy classes to match the current source definitions.
Examples:
- Create a federated table called "Sample.Person" in the current shard namespace, whose list of columns exactly matches the source table Sample.Person from the SourceNamespace "Samples":
set status = $SYSTEM.Sharding.CreateFederatedTable(,"Sample.Person","Samples","Sample.Person")
- Create a federated table called "Myschema.Person" in the shard namespace "IRISCLUSTER", with a customized column list based on the source table Sample.Person in the SourceNamespace "Samples":
set status = $SYSTEM.Sharding.CreateFederatedTable(,"Myschema.Person","Samples","Sample.Person",$lb($lb("name"),$lb("home_city","homecity"),$lb("age")))
ConnectFederatedTable
ClassMethod ConnectFederatedTable(ShardNamespace As %String = {$namespace}, FederatedTableName As %String = "", SourceNamespace As %String, SourceTableName As %String, ColumnList As %List = "", Force As %Boolean = 0, Verbose As %Boolean = 1) As %Status
Experimental Feature -- InterSystems reserves the right to make backwards-incompatible changes to this API in future releases
Connect an additional source table to a read-only federated table, which has been created by a previous call to CreateFederatedTable.
If a source table has already been connected to the specified FederatedTableName in the specified ShardNamespace, it is replaced by the source table specified by SourceTableName.
Parameters: ShardNamespace A shard namespace in the current InterSystems IRIS instance (which must belong to a sharded cluster). Must be on same instance as SourceNamespace. The specified namespace must be a data shard, not a query shard. Defaults to the current namespace.
FederatedTableName The name of the federated table to which the specified source table will be connected. SourceNamespace The namespace containing the source table. Must be on same instance as ShardNamespace. SourceTableName The name of the source table.
ColumnList Optionally specifies a customized mapping of columns from the source table to the federated table.
Format: $lb( $lb(sourceColumnName1 [, federatedColumnName1 ] ), ...)
- If federatedColumnNameN is not provided, sourceColumnNameN is used.
- If any federatedColumnNameN does not exist in the federated table, this method returns an error.
- If any sourceColumnNameN does not exist in source table, this method returns an error.
- If ColumnList is not provided at all, all columns in the federated table for which a column of the same name exists in the source table will be mapped.
For any column in the federated table for which no corresponding column in the source is specified (either explicitly or because ColumnList is not provided), rows of this source table will be padded with NULL values when queried.
Force Force the class of the specified source table, and any classes on which it depends, to be projected as proxies in the specified ShardNamespace, even if proxies for those classes already exist. If Force=0 (the default), new proxies are only created for classes whose proxies do not already exist, which can save significant time when creating multiple federated tables based on a source schema containing many interdependent classes. Specifying Force=1 ensures that proxies are up-to-date with the latest versions of source classes. Verbose If Verbose=1 (the default), report each proxy class that is created and compiled, to the output device. If Verbose=0, do not write any progess information.
Returns:
Status code reporting success or failure of this API call.
Notes:
- It is permitted for the source table to reside in a read-only database.
- If there are changes to the definition of the source table, or of classes on which it depends, this method, or CreateFederatedTable, must be called to re-create proxy classes to match the current source definitions.
Examples:
- Connect the source table Sample.Person, from the SourceNamespace "Samples", to the federated table "Sample.Person" in the current shard namespace:
set status = $SYSTEM.Sharding.ConnectFederatedTable(,"Sample.Person","Samples","Sample.Person")
- Connect the source table Sample.Person, from the SourceNamespace "Samples", to the federated table "Myschema.Person" in the shard namespace "IRISCLUSTER", with a customized column list:
set status = $SYSTEM.Sharding.ConnectFederatedTable(,"Myschema.Person","Samples","Sample.Person",$lb($lb("name"),$lb("home_city","homecity"),$lb("age")))
DropFederatedTable
ClassMethod DropFederatedTable(ShardNamespace As %String = {$namespace}, FederatedTableName As %String = "") As %Status
Experimental Feature -- InterSystems reserves the right to make backwards-incompatible changes to this API in future releases
Drop a read-only federated table, which has been created by a previous call to CreateFederatedTable.
This method does not delete any data, and has no effect on any of the source tables that are connected to this federated table. It only drops the federated table's definition, and drops any proxy classes that were projected for this federated table in any shard namespace of this sharded cluster, unless those proxy classes are still needed for another federated table.
Parameters: ShardNamespace A shard namespace in the current InterSystems IRIS instance (which must belong to a sharded cluster). Defaults to the current namespace. FederatedTableName
The name of the federated table to be dropped.
Examples: Drop the federated table Myschema.Person, in the current shard namespace:
set status = $SYSTEM.Sharding.DropFederatedTable(,"Myschema.Person")
DisconnectFederatedTable
ClassMethod DisconnectFederatedTable(ShardNamespace As %String = {$namespace}, FederatedTableName As %String = "") As %Status
Experimental Feature -- InterSystems reserves the right to make backwards-incompatible changes to this API in future releases
Disconnect this shard from a read-only federated table, which has been created by a previous call to CreateFederatedTable.
This method does not delete any data, and has no effect on any of the source tables that are connected to this federated table. It only disconnects any source table that was connected to this federated table on this shard, by a previous call to CreateFederatedTable or ConnectFederatedTable, and drops any proxy classes that were projected for this federated table in this shard namespace, unless those proxy classes are still needed for another federated table.
Parameters: ShardNamespace A shard namespace in the current InterSystems IRIS instance (which must belong to a sharded cluster). The specified namespace must be a data shard, not a query shard. Defaults to the current namespace. FederatedTableName
The name of the federated table from which to disconnect.
Examples: Disconnect from the federated table Myschema.Person, in the current shard namespace:
set status = $SYSTEM.Sharding.DisconnectFederatedTable(,"Myschema.Person")
ConnectShard
ClassMethod ConnectShard(ClusterNamespace As %String = {$namespace}, ShardHost As %String, ShardPort As %Integer, ShardNamespace As %String) As %Status
Experimental Feature -- InterSystems reserves the right to make backwards-incompatible changes to this API in future releases
Connect a previously disconnected shard, specified by ShardHost, ShardPort, and ShardNamespace, to the sharded cluster specified by ClusterNamespace. The specified shard must previously have been assigned using AssignShard. If the specified shard was not previously disconnected (by calling DisconnectShard), this class method has no effect.
Parameters: ClusterNamespace A shard or master namespace in the current InterSystems IRIS instance, used to specify the sharded cluster to which the namespace belongs. Defaults to the current namespace. ShardHost The machine hosting the shard namespace, specified by hostname or IP address.
ShardPort The default port (Super Server port) of the InterSystems IRIS instance hosting the shard namespace.
ShardNamespace
The namespace of the shard being connected.
DisconnectShard
ClassMethod DisconnectShard(ClusterNamespace As %String = {$namespace}, ShardHost As %String, ShardPort As %Integer, ShardNamespace As %String) As %Status
Experimental Feature -- InterSystems reserves the right to make backwards-incompatible changes to this API in future releases
Disconnect a shard, specified by ShardHost, ShardPort, and ShardNamespace, from the sharded cluster specified by ClusterNamespace. A disconnected shard does not participate in queries of federated tables. Therefore, an inaccessible shard can be disconnected to prevent it from causing federated queries to fail, but the queries won't include results from the disconnected shard.
A shard can only be disconnected from a designated federated overlay cluster. A sharded cluster is designated as a federated overlay cluster by setting the option FederatedOverlay to 1 using SetOption after at least one shard has been assigned (by calling AssignShard), and before VerifyShards is called.
To reconnect a shard after it has been disconnected, call ConnectShard.
Parameters: ClusterNamespace A shard or master namespace in the current InterSystems IRIS instance, used to specify the sharded cluster to which the namespace belongs. Defaults to the current namespace. ShardHost The machine hosting the shard namespace, specified by hostname or IP address.
ShardPort The default port (Super Server port) of the InterSystems IRIS instance hosting the shard namespace.
ShardNamespace
The namespace of the shard being disconnected.
ShardIsConnected
ClassMethod ShardIsConnected(ClusterNamespace As %String = {$namespace}, ShardHost As %String, ShardPort As %Integer, ShardNamespace As %String, ByRef IsConnected As %Boolean) As %Status
Experimental Feature -- InterSystems reserves the right to make backwards-incompatible changes to this API in future releases
Determine whether a shard, specified by ShardHost, ShardPort, and ShardNamespace, is currently connected to the sharded cluster specified by ClusterNamespace. The specified shard must previously have been assigned using AssignShard. Returns 1 (if connected) or 0 (if disconnected) in by-reference argument IsConnected. A shard that has been assigned to a cluster is connected unless it has been disconnected by calling DisconnectShard.
Parameters: ClusterNamespace A shard or master namespace in the current InterSystems IRIS instance, used to specify the sharded cluster to which the namespace belongs. Defaults to the current namespace. ShardHost The machine hosting the shard namespace, specified by hostname or IP address.
ShardPort The default port (Super Server port) of the InterSystems IRIS instance hosting the shard namespace.
ShardNamespace The namespace of the shard being connected.
IsConnected
Returns, by reference, 1 if the shard is connected, or 0 if it is disconnected.