XAP71 == "XAP9NET" || XAP71 == "XAP9" || XAP71 == "XAP9NET")

The Elastic Service Manager

Summary: Getting started with the Elastic Service Manager
The Elastic Middleware Services functionality is provided as a technology preview in XAP 7.1. As such, it is subject to API and behavior changes in the next XAP releases without going through the usual deprecation process of the XAP API.

The Elastic Service Manager (ESM) is an implementation of the characteristics described in the Elastic Middleware Services High-Level Overview. It is built on top of the existing Administration and Monitoring API exposed by the GigaSpaces components. If the built-in functionality doesn't fit your needs, you can write your own custom behavior.

One possibility is to use the exposed Elastic Scale Handler hook for scaling out and scaling in. A custom implementation could, for example, send alerts that trigger a manual process for adding a new machine.

Getting Started

You don't need to have a private/public cloud. You don't even need a virtualization layer. All you need is a GigaSpaces installation and some machines in your local environment.

The GigaSpaces Grid Service Agent (GSA) manages the Lookup Service (LUS), the Grid Service Manager (GSM), and in addition the Elastic Service Manager (ESM). The ESM is used to deploy an elastic data-grid. The ESM works side-by-side with the GSM: a deployment is passed on to the GSM to manage, while the ESM monitors the deployment and performs the elastic runtime management (auto-scaling, rebalancing, and enforcing SLAs).

The ESM will use any available machine it discovers to start up a GSC to accommodate the deployment request. If no machine is available, it will request to scale out to a new machine. Scaling out to a new machine is handled by the Elastic Scale Handler implementation. A machine is discovered by starting up a single standalone GSA (no GSCs) on it. On this machine, the ESM will request the GSA to start up a new GSC. Scale-out/in requests occur either when the deployment is compromised or when an SLA has been breached.

To get started, the basic setup will consist of at least two machines with a GSA on them. One of the GSAs will globally manage a LUS, GSM and an ESM. To deploy an elastic data-grid service, locate the ESM using the administrative API, and deploy.

The default Elastic Scale Handler hook does nothing (it only logs requests). When a new machine is needed, simply run a Grid Service Agent (GSA) on it and it will automatically be considered a free resource. We will show an example of an Elastic Scale Handler that sends SSH commands to start an agent on a free machine on the network.

Logging

The logging level of the Elastic Service Manager is controlled by the org.openspaces.grid.esm configurable log property in <GigaSpaces>/config/gs_logging.properties.

org.openspaces.grid.esm.level = INFO
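
When troubleshooting, you can raise the verbosity using the standard java.util.logging levels, for example:

org.openspaces.grid.esm.level = FINE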

Logging of the different components can also be accessed via the Admin UI, by directly clicking on each of the components (ESM, GSA, GSM, GSC). This can be extremely useful for troubleshooting.

Step-By-Step

1. The basic setup

We will use the Grid Service Agent (GSA) to start the basic components - ESM, LUS, and a GSM. All of these are globally managed (which means that if one fails, the GSA will automatically start a new one).

<GigaSpaces>/bin/gs-agent gsa.global.esm 1 gsa.gsc 0

gsa.global.esm 1 - will globally manage 1 ESM instance across all grid service agents within the same lookup group/locators.

gsa.gsc 0 - will override the default and not start a new grid service container. The reason is simple - the ESM will start as many GSCs as needed to meet the required capacity. Moreover, the ESM starts up GSCs which are assigned to a specific zone and ignores the rest.

Note that the gsa.global.lus 2 gsa.global.gsm 2 defaults of the gs-agent script are omitted - these will globally manage 2 lookup services and 2 grid service managers across all grid service agents within the same lookup group/locators.
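
In other words, the command above is equivalent to spelling these defaults out explicitly:

<GigaSpaces>/bin/gs-agent gsa.global.lus 2 gsa.global.gsm 2 gsa.global.esm 1 gsa.gsc 0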

2. Deploying an Elastic Data-grid

First, discover the ESM using the administrative API.

Admin admin = new AdminFactory().createAdmin();
ElasticServiceManager esm = admin.getElasticServiceManagers().waitForAtLeastOne();

Once you get a hold of an ESM instance, you can use it to deploy your elastic data-grid.

ProcessingUnit pu = esm.deploy(new ElasticDataGridDeployment("mygrid"));

Wait for deployment

Space space = pu.waitForSpace();
space.waitFor(space.getTotalNumberOfInstances());
GigaSpace gigaSpace = space.getGigaSpace();

Based on the defaults, this will deploy a highly-available data-grid named "mygrid" with a capacity of 1-10 gigabytes, using GSCs of 512 megabytes each. This request will be mapped to a partitioned cluster of 10,1 (ten partitions, one backup each), without an automatic scaling policy. This means that in order to grow, you will need to manually start up a machine.
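
Putting the snippets above together, here is a minimal end-to-end sketch (a sketch only - the import paths are assumptions based on the XAP 7.1 admin API and may need adjusting to your installation):

import org.openspaces.admin.Admin;
import org.openspaces.admin.AdminFactory;
import org.openspaces.admin.esm.ElasticServiceManager;
import org.openspaces.admin.pu.ProcessingUnit;
import org.openspaces.admin.space.Space;
import org.openspaces.core.GigaSpace;
import org.openspaces.grid.esm.ElasticDataGridDeployment;

public class DeployElasticDataGrid {
    public static void main(String[] args) {
        // Discover the running grid (uses the default lookup group/locators)
        Admin admin = new AdminFactory().createAdmin();
        try {
            // Block until at least one ESM is discovered
            ElasticServiceManager esm = admin.getElasticServiceManagers().waitForAtLeastOne();

            // Deploy with the defaults: 1g-10g capacity, 512m GSCs, highly available
            ProcessingUnit pu = esm.deploy(new ElasticDataGridDeployment("mygrid"));

            // Wait until the space and all of its instances are up
            Space space = pu.waitForSpace();
            space.waitFor(space.getTotalNumberOfInstances());

            // Obtain a proxy for reading and writing data
            GigaSpace gigaSpace = space.getGigaSpace();
            System.out.println("deployed " + pu.getName());
        } finally {
            admin.close();
        }
    }
}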

3. Startup a new machine

When a new machine is needed (either to host the backups or to scale out), simply start up a standalone agent.

<GigaSpaces>/bin/gs-agent gsa.gsc 0

Note that we use gsa.gsc 0 to override the default and not start a new grid service container.

Data-Grid Deployment Options

Context Properties

Context deploy-time properties override any ${...} placeholders defined within a processing unit configuration. For example, to deploy a mirror-service, a cluster property can be specified:

new ElasticDataGridDeployment("mygrid")
	.addContextProperty("cluster-config.mirror-service.enabled", "true")

Capacity

The memory capacity is controlled by two parameters; a minimum capacity to start from, and a maximum capacity limit to scale to. By default, the capacity is set to 1-10 gigabytes.

new ElasticDataGridDeployment("mygrid")
        .capacity("1g", "10g")

Heap Size

The JVM initial and maximum memory size is the fixed memory size of a Grid Service Container (GSC) JVM. By default, this size is set to 512 megabytes for both initial and maximum size. These map to the -Xms and -Xmx JVM arguments, respectively.

new ElasticDataGridDeployment("mygrid")
        .initialJavaHeapSize("512m").maximumJavaHeapSize("512m")

VM Input Arguments

Additional JVM arguments can be passed to a GSC at startup.

new ElasticDataGridDeployment("mygrid")
        .vmInputArgument("-XX:-PrintGC").vmInputArgument("-XX:SurvivorRatio=8")

Multi-Tenant

Multi-tenancy controls the sharing and isolation of a deployment. Here are the supported options:

Dedicated - Restricts the data-grid deployment to machines not shared by other deployments. This option allows you to allocate the maximum amount of resources to a specific deployed data-grid. No other data-grid is allowed to be deployed on the same machine. This is the default mode.

Shared - Allows the data-grid deployment to co-exist and share the same machine resources with other deployments that have the same tenant ID. This option allows you to share hardware resources across selected data-grids with the same tenant ID.

Public - Allows the data-grid deployment to co-exist and share the same machine resources with any other deployment. This option allows you to utilize the hardware resources across all deployed data-grid instances, but it does not allocate dedicated resources to a specific deployed data-grid.

The Dedicated tenancy mode is the only available multi-tenancy mode supported with XAP 7.1. The public and shared tenancy modes are available with XAP 7.1.1.

Here is how you set each option:

Dedicated

new ElasticDataGridDeployment("mygrid")
        .dedicatedDeploymentIsolation();

Shared (by tenant)

new ElasticDataGridDeployment("mygrid")
        .sharedDeploymentIsolation("tenant X");

Public

new ElasticDataGridDeployment("mygrid")
        .publicDeploymentIsolation();

Auto-Scale

Built-in scaling SLAs can be added to a deployment. For example, the Memory SLA specifies a threshold that triggers scaling when breached.

Memory SLA

If memory utilization is above the threshold, an instance will be relocated to an alternate GSC, to a new GSC on an existing machine, or a new GSC will be requested on a new machine. If memory utilization is below the threshold, an instance will be relocated to an alternate GSC, and the removal of any excess machine no longer in use (left with an empty GSC) will be requested.

new ElasticDataGridDeployment("mygrid")
        .addSla(new MemorySla("75%"))

The memory usage is calculated using a moving average over a pre-defined number of samples. By default, this value is 6 - six samples taken five seconds apart, giving a 30-second window.
The subset size can be configured using the overloaded MemorySla constructor.

new ElasticDataGridDeployment("mygrid")
        .addSla(new MemorySla("75%", 6))

For more accurate memory statistics, consider installing Sigar. See "Using the SIGAR Library to Monitor Machine-Level Statistics".

High Availability

A cluster is highly available by default. One backup per partition is included as part of the total capacity. A backup always requires a machine separate from its primary to fail over to. Thus, you will need at least 2 available machines to start with.

new ElasticDataGridDeployment("mygrid")
        .highlyAvailable(true)

Scale Handler

A scale handler specifies the means for scaling out and scaling in. By default, the scaling handler takes no action (only logs the calls to the API). A scale handler can be configured per deployment.

First we supply the class name of a custom implementation of the ElasticScaleHandler interface. Properties can be passed to this implementation at deployment time - for example, to specify which machines are in the pool of this handler. Here, "MyElasticScaleHandler" takes in a list of machines.

new ElasticDataGridDeployment("mygrid")
        .elasticScaleHandler(
                new ElasticScaleHandlerConfig(MyElasticScaleHandler.class.getName())
                        .addProperty("machines", "lab-12,lab-13,lab-36,lab-37"))

The ElasticScaleHandler interface consists of a very simple API.

public void init(ElasticScaleHandlerConfig config);
public boolean accept(Machine machine);
public void scaleOut(ElasticScaleHandlerContext context);
public void scaleIn(Machine machine);
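
As a minimal sketch of such an implementation, the hypothetical handler below takes no provisioning action and only alerts an operator, in the spirit of the alerting example mentioned earlier. The package locations of the ESM interfaces are assumptions based on the XAP 7.1 API, and the class name is made up for illustration.

// Hypothetical example - import paths are assumptions based on the XAP 7.1 API
import org.openspaces.admin.machine.Machine;
import org.openspaces.grid.esm.ElasticScaleHandler;
import org.openspaces.grid.esm.ElasticScaleHandlerConfig;
import org.openspaces.grid.esm.ElasticScaleHandlerContext;

import java.util.logging.Logger;

public class AlertingScaleHandler implements ElasticScaleHandler {

    private static final Logger logger =
            Logger.getLogger(AlertingScaleHandler.class.getName());

    public void init(ElasticScaleHandlerConfig config) {
        // Deployment-time properties (e.g. "machines") arrive through the config
        logger.info("initialized with config: " + config);
    }

    public boolean accept(Machine machine) {
        // Consider every discovered machine a candidate for new GSCs
        return true;
    }

    public void scaleOut(ElasticScaleHandlerContext context) {
        // Instead of provisioning a machine, alert an operator to start a gs-agent manually
        logger.warning("scale-out requested - start a gs-agent on a free machine");
    }

    public void scaleIn(Machine machine) {
        logger.info("scale-in requested for machine " + machine.getHostAddress());
    }
}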

The implementation class and possibly any jars that need to be used by the implementation should be placed under the following library folder:

<GigaSpaces>/lib/platform/esm

Refer to the Custom Elastic Scale Handler Example for a detailed implementation which sends SSH commands to startup an agent on our lab machines.

Capacity Planning

For detailed information regarding capacity planning and best practices, please also refer to Capacity Planning.

Cluster Size

The size of a partitioned cluster is derived from the capacity and heap size deployment options. The maximum memory capacity divided by the maximum JVM size gives the number of partitions. If highly available, this number is divided by two to account for the backups - the total memory available for data is half of the requested maximum memory capacity.

For example, capacity("1g", "10g").maximumJavaHeapSize("512m").highlyAvailable(true) will yield a partitioned cluster of 10,1.
Calculated as: 10 gigabytes divided by 512 megabytes = 20; divided by 2 to account for backups = 10 partitions (each with 1 backup).

JVM Size

For a 32-bit system, a 2 gigabyte heap size is recommended.
For a 64-bit system, a 4-10 gigabyte heap size is recommended.
For optimal performance, set the initial heap size to the same value as the maximum heap size.

Initial GSCs

To meet the required initial capacity, a minimum number of GSCs must be started. The minimum memory capacity divided by the maximum JVM size gives the initial number of GSCs needed. For example, a 1 gigabyte minimum capacity divided by a 512 megabyte JVM size yields 2 GSCs needed to occupy a 10,1 cluster.

Maximum GSCs

The maximum memory capacity divided by the maximum JVM size gives the maximum number of GSCs needed. For example, a 10 gigabyte maximum capacity divided by a 512 megabyte JVM size yields 20 GSCs needed to occupy a 10,1 cluster.

Scaling Factor

To reach the capacity limit, we would need to span one instance per GSC. For a cluster of 10,1 this means 20 GSCs - each with one instance. The maximum number of GSCs divided by the initial number of GSCs is the scaling factor, which determines the maximum number of instances allowed per GSC. For example, a cluster of 10,1 will have no more than 10 instances (of the same deployment) co-existing in the same GSC.
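
The planning arithmetic above can be reproduced in a few lines; this sketch simply restates the calculations for the example deployment (plain Java, no XAP API involved):

public class CapacityPlan {
    public static void main(String[] args) {
        // capacity("1g", "10g").maximumJavaHeapSize("512m").highlyAvailable(true)
        int minCapacityMB = 1024;   // "1g" minimum capacity
        int maxCapacityMB = 10240;  // "10g" maximum capacity
        int jvmSizeMB = 512;        // maximum JVM heap per GSC
        boolean highlyAvailable = true;

        int instances = maxCapacityMB / jvmSizeMB;                     // 20 instances
        int partitions = highlyAvailable ? instances / 2 : instances; // 10 -> a cluster of 10,1
        int initialGSCs = minCapacityMB / jvmSizeMB;                   // 2 GSCs to start with
        int maxGSCs = maxCapacityMB / jvmSizeMB;                       // 20 GSCs at the limit
        int scalingFactor = maxGSCs / initialGSCs;                     // at most 10 instances per GSC

        System.out.printf("cluster %d,1 - initial GSCs: %d, max GSCs: %d, scaling factor: %d%n",
                partitions, initialGSCs, maxGSCs, scalingFactor);
    }
}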

Setting Partition Size

More advanced users can set the partition size, which overrides the default calculated number of partitions (derived from the capacity parameters). In a highly available deployment, this denotes the number of primaries.

For example, setting the partition size to 5:

//Request to deploy mygrid 5,1 capacity: 1g - 2g - DEDICATED
ProcessingUnit pu = esm.deploy(new ElasticDataGridDeployment("mygrid")
                         .capacity("1g", "2g")
                         .maximumJavaHeapSize("250m")
                         .highlyAvailable(true)
                         .setPartitions(5));
