GS == "XAP9NET" || GS == "XAP9" || GS == "XAP9NET")

SBA - Introduction and Example

  GigaSpaces 5.X

Documentation Home
Quick Start Guide
Release Notes

Previous release

  Search Here
Searching GigaSpaces Platform 5.X Documentation


Summary: An overview and implementation example of Space Based Architecture, a new approach to distributed computing which transforms scalability from dead end to open road.
For an in-depth discussion of SBA and scalability, see our White Paper: The Scalability Revolution – From Dead End to Open Road.


To skip to the example, click here.

In many application domains today, especially in financial services, the number of clients, the depth of services provided, and the data volumes are all growing simultaneously; in parallel, middle-office analytics applications are moving towards near-real-time processing. As a result, application workload is growing exponentially. One GigaSpaces customer is expecting to grow from 100K trades to 80 million trades – in only two years!

In order to understand the scalability problem, we must first define scalability: scalability is the ability to grow an application to meet growing demand, without changing the code, and without sacrificing the data affinity and service levels demanded by your users.

We identify two situations in which scalability is interrupted:

  • A scalability crash barrier – occurs if your application, as it is today, cannot scale up without reducing data affinity or increasing latency to unacceptable levels.
  • A marginal cost barrier – occurs when the cost of scaling your application progressively increases, until scaling firther is not economically justifiable.

For most contemporary applications, particularly transactional applications in a low-latency environment, these barriers are inevitable. But this is not the only possible case. Theoretically, an application can achieve linear scalability – the ability to grow as much as needed, at a fixed price per capacity unit – in which case it would never face scalability barriers.

Tier-Based Business-Critical Applications – Scalability Interrupted

Consider two typical business-critical applications – front-office applications and back/middle office analytics applications. It is clear that both types of applications do not have linear scalability, because scaling becomes progressively more difficult and expensive as the application grows.

Interestingly, these two very different business-critical applications have striking similarities: they are both stateful, and both use a messaging tier for coordination, a data tier for storage of state information, and a business tier for the actual processing – in other words, they are both founded on the tier-based architecture.

More interesting still, both types of applications encounter similar scalability problems: cluster nodes get overloaded by inefficient clustering; different clustering models for each tier cause unnecessary ping-pong inside the tiers; unknown scaling ratios between system components cause unexpected bottlenecks when the system scales; growing messaging volumes might overaload the processing components; the network becomes the botleneck; inability to virtualize the tiers causes coordination problems; and different H/A models for each tier makes it difficult to guarantee recovery from partial failure.

Each of these problems can cause a scalability crash barrier. To avoid hitting a crash barrier, application administrators are forced to apply temporary fixes-complex, resource-consuming coordination and clustering mechanisms – and scale up each tier just to accommodate the additional overhead. As the system scales, these fixes need to be applied again and again, making scalability progressively more expensive.

The application's administrators find themselves caught between a rock and a hard place: if they apply the scalability fixes, the application grows more and more complex, until it hits a marginal cost barrier; but if they don't apply these fixes, the application quickly hits a scalability crash barrier, and must be replaced.

This dilemma is inherent in tier-based applications: because the system is divided into separate tiers, it increases in complexity as it scales, requires more and more overhead just to manage this complexity, and makes it more and more costly to increase capacity.

Therefore, tier-based applications cannot be linearly scalable. This is proven by the well-known Amdahl's Law, which states that if all processors in a system spend some of their time on overhead – as is the case in all tier-based systems – the speed improvement yielded by additional processors quickly hits an upper boundary (for 10% overhead, maximal improvement is 10X).

If scalability is a road, and applications are cars driving on the road, non-linear scalability is a dead end. To avoid ending up as a wreck, tier-based business-critical applications must become linearly scalable – and the only way to do this is a change of architecture.

Towards a Linearly-Scalable Architecture

The key to a linearly-scalable architecture is to expand the application by adding self-sufficient units. This way, as the application grows, there is no complex coordination that consumes resources and leads to scalability barriers.

This is a clear departure from the tier-based architecture: instead of running each application tier in a separate cluster of computing resources, all the tiers are compressed into a single unit, which is duplicated when the application is scaled. The middleware problem simply evaporates when each component becomes self-sufficient.

But in a stateful environment, how can you manage a workflow when each machine is completely self-sufficient? How can machines share state information between them? The secret is to collocate all steps of the business process – putting them on the same machine, in the same JVM. This requires developing a processing unit – a mini-application which can perform the entire business process on its own and return a result.

The processing unit manages its own workflow, providing messaging and data storage facilities to its collocated service instances. This saves the need to contact external resources, and means that all process information can be stored in local memory, reducing latency to a minimum.

To ensure reliability, state data and partial results can be persisted to a database, or replicated to another, identical processing unit. Thus, the processing unit can be made as reliable as needed: transient data can be stored in memory only; very sensitive data can be persisted to a remote database. This improves on the tier-based model, which forces all processing components to pay the high price of persistency – even if this type of reliability is not really required.

In the processing unit model, user requests that enter the system are distributed between the processing units using content-based routing – if the request has value A it goes to Processing Unit 1; if it has value B it goes to Processing Unit 2. This type of routing is very simple and inexpensive, but most importantly, it does not increase in complexity as the application scales.

The concept of a processing unit makes linear scalability possible, but this is not enough. A critical requirement is that as the distributed application scales, it should behave and look as one server, whether viewed by designers, developers, administrators, or clients. Everyone involved should see the system as one coherent unit: scaling should be simplified to the point of transparency.

Space Based Architecture – Scalability as Open Road

Space-Based Architecture (SBA) is a way to implement processing units in your application, transforming scalability from dead end to open road.

At the core of an SBA processing unit is the space – a middleware infrastructure, collocated with the business process services, which provides in-memory messaging and data storage facilities for the processing unit. Like a conveyor belt in a production line, the space helps coordinate the business process, and allows business services to deposit their partial results on a shared work area, which can be accessed by other services in turn.

The space's messaging facility allows direct in-memory access to specific messages – this saves each business process service the overhead of connecting to an external messaging server and screening out unneeded messages from the queue.

When the space acts as data store, the services can save results at in-memory speeds, because they are collocated with the space on the same physical machine.

The space has a built-in clustering model, which allows it to share data and messages with other spaces on the network; this allows processing units to guarantee reliability by replicating state data between them. The space can also persist its data to a database. All this occurs in the background, without affecting the latency of the processing units.

This unified clustering model allows clients of the application to execute requests against a generic cluster proxy – the cluster matches the request at the object level, and routes it to the appropriate space. This routing is done in a distributed manner, without requiring a central server.

A central part of SBA is a deployment framework that takes care of running the processing unit on physical machines. The framework consists of SLA-driven containers that run on all the computers in the deployment scope.

Via the user interface of the deployment framework, it is easy to deploy as many processing units as needed – with one click. A general deployment descriptor describes a generic scaling policy for all possible deployment sizes, so deployment doesn't become even a bit more complex when scaling up. What's more, the framework dynamically responds to application workload.

Because of the flexibility of the deployment framework, SBA permits a combination of scaling-up (within a multi-core machine) and scaling-out, without breaking the SOA model.

SBA guarantees linear scalability, while ensuring simplicity – the application's scale becomes transparent to all involved, as if it was a single server:

  • Scalability ceases to be a design consideration.
  • The application presents a unified API and clustering model, so developers can write code as if for a single server – physical distribution is completely abstracted.
  • The entire application is deployed through a single deployment command.
  • Clients can perform operations on the entire cluster in a single operation.

The SBA Value Proposition

SBA removes the scalability dead-end facing today's tier-based applications, and guarantees improved scalability in four dimensions:

  1. Fixed, minimal latency;
  2. Low, predictable hardware/software costs;
  3. Reduced development costs;
  4. Consistent data affinity.

Example Structure

This example shows how SBA would be implemented for a simple distributed application: a trading system that accepts buy and sell orders from clients. This application is transactional (there are two sequential steps in the business process), and requires very low latency, so that buy and sell orders can be executed in real time.

A1. Business Process

In order to make this example as simple as possible, we were forced to sacrifice some realism in portraying the business process of a trading application. This section shows the real business process for a trading application, and the simplified business process implemented in this example.

Real Business Process for a Trading Application

  1. A customer submits a buy order or sell order for a certain asset.
  2. The application puts the order into an order book – this triggers other processing, such as calculation of depth of market.
  3. A Matching Engine attempts to match the order to an existing order of the opposite type (buy or sell).
  4. A Credit Engine calculates if the buyer has sufficient credit to execute the trade (this might involve forecasting the maximal future value of the trade – in some cases the credit check comes before matching).
  5. A Trading Engine executes the trade.
  6. The application sends a trade confirmation, typically to both sides of the trade.

Simplified Business Process Used in This Example

  1. A customer submits a buy order or sell order using a client application. There are two shares being traded between buyers and sellers: Google shares and Microsoft shares.
  2. Business Process Step A (Matching): The application tries to match 'buy orders' to 'sell orders'. A match is achieved if two orders were made on the same amount of the same shares, for the same price (for example, the buyer wants to buy 100 Google shares at $10, and the seller wants to sell 100 Google shares at $10).
  3. If a match is found, the application executes the trade and generates a settlement, which represents a deal between the buyer and seller.
  4. Business Process Step B (Routing): The application routes the settlement to the exchange or a persistent store, so it can be viewed by buyers and sellers.

A2. Application Components

Our sample SBA application is comprised of three components:

  • A client – allows the customer to submit an order using a command-line interface.
  • Processing Unit 1 – dedicated to Google shares. Contains:
    • A space – a memory unit that facilitates messaging and data storage.
    • A matching service – handles Business Process Step A (of the simplified business process).
    • A routing service – handles Business Process Step B (of the simplified business process)
  • Processing Unit 2 (running on a separate machine) – dedicated to Microsoft shares. Also contains a space, a matching service and a routing service.

Business Process Executed by Independent Services

It should be emphasized that the matching and routing operations are defined as independent, generic services – they are not hard-wired into the processing unit, and can easily be replaced with or supplemented by other services. They can also be independently discovered and managed, just like in a regular SOA. However, because these services are collocated, together with the space which facilitates coordination, they can cooperate within the processing unit to accomplish a structured workflow.

A3. Application Workflow

The simplified business process is executed as follows (to see the differences between the simplified business process used in this example and a typical business process of a real trading application, see the Business Process section):

  1. The client application accepts an order from the customer via a command-line interface, and generates an order object. It delivers the object to the application via the Spring Framework.
    In this example the client parses the order and converts it into an object the application can work with. In a real trading application, there is an additional business process component that receives orders in a textual format such as FIX, converts them to usable form, and validates each order.
  2. Using the space clustering model (operating in partitioned mode), the order is automatically routed to one of the processing units, based on the productId – if the share is Google, the order is routed to Processing Unit 1; if Microsoft, it is routed to Processing Unit 2. The order is written to a virtual messaging queue, in the space of the relevant processing unit.
  3. The matching service (BUSINESS PROCESS STEP A) takes the new order from the messaging queue on the space, and checks if the space contains another order that matches it.
  4. If there is a match, the matching service generates a settlement and writes it to a virtual in-memory database in the space.
  5. The routing service (BUSINESS PROCESS STEP B) is notified that a new settlement has been written to the virtual in-memory database, and takes the settlement from the space.
  6. The routing service writes the settlement to the exchange or to a persistent store, allowing interested parties to see which deals were executed.

The workflow is illustrated in the diagram below.

It is easy to see that the entire business process takes place inside the processing unit (in the diagram, Processing Unit 2 handles the order, because the shares are Microsoft). The processing unit is not dependent on any other component, so it can perform its entire process on a single machine, at in-memory speeds.

The clients are not aware of the clustering at all: they submit an order through the Spring Framework, and view the result as they always would, through the exchange or a legacy database. This means that even if this application were to scale from 2 processing units to 200, the client application and the entire process would remain the same.

Latency as Low as Possible, Reliability as High as Necessary

Because the services are collocated, communicate using the space's in-memory messaging and data store, and do not reference any external resources, the processing unit responds with the lowest possible latency – that of in-memory access.

The processing unit's data can be made as reliable as necessary, by either replicating it to another processing unit or persisting it to a database. But because the application is not forced to save all its data to a database, there is no need to suffer a latency penalty, incurred by disk access or network calls, for information that is transient in nature, or that needs to be highly available rather than persisted to hard disk.

FIFO Ordering Enables Virtual Messaging

This example uses the space's ability to record the order in which objects (representing trades in this case) are written to the space. When the matching service takes trades from the virtual 'messaging queue', it is in fact simply reading them from the space – because of the FIFO ordering, the space can return trades in the exact order they were submitted.

A4. Data Model

The application has two data classes: order and settlement. This section shows and explains these classes, and discusses the data lifecycle.


This class represents a buy or sell order submitted by a customer.

import com.gigaspaces.annotation.pojo.SpaceClass;
import com.gigaspaces.annotation.pojo.SpaceProperty;

public class Order {
    public static int BUY_ORDER = 1;
    public static int SELL_ORDER = 2;
    public static string GOOGLE = GOOG;
    public static string MICROSOFT = MSFT;
    private int amount;
    private int type;
    private double price;
    private String productId;
    private Boolean isNew;

As you can see, this is a Plain Old Java Object (POJO). Five data fields are declared using native types (with four additional static fields):

  • amount – the amount the buyer (seller) wants to buy (sell).
  • type – whether the order is a BUY_ORDER (=1) or SELL_ORDER (=2).
  • price – the price at which the buyer (seller) is prepared to buy (sell).
  • productId – what the buyer (seller) wants to buy (sell). In this example we have two products: GOOGLE shares (=GOOG) and MICROSOFT shares (=MSFT).
  • isNew – used to distinguish between new trades waiting in the matching queue, and old trades that have undergone matching, but no match was found for them.

@SpaceClass is an annotation that says this class's data should be represented in the space.

@SpaceProperty(null-value = "-1")
    public int getAmount() {
	return amount;
    public void setAmount(int amount) {
	this.amount = amount;

The @SpaceProperty annotation specifies that the amount field should be represented in the space. The null-value argument provides a specific value that corresponds to null (required because the field is of a native type).

The annotation is followed by two methods, getAmount() and setAmount() – the first is used when this field is written to the space (the space 'gets' the field value from the POJO); the second is used when this field is read from the space (the space 'sets' the field value into the POJO).

The same structure is repeated for the other fields, type, price, productId and isNew, so that they are represented in the space as well.

An XML mapping file can be used instead of inserting annotations into the code.


This object represents a deal executed between a buyer and a seller.

import java.util.Date;

import com.gigaspaces.annotation.pojo.Nullable;
import com.gigaspaces.annotation.pojo.SpaceClass;
import com.gigaspaces.annotation.pojo.SpaceField;

public class Settlement {
    private int amount;
    private String productId;
    private Date date;
    private double price;

Like Order, this is a POJO with four fields; the @SpaceClass annotation says the class should be represented in the space.

The Settlement class has a constructor (not shown), which accepts one sell order and one buy order object, asserts that they match, and sets the amount, date, and price of the settlement to those of the sell order.

@SpaceProperty(null-value = "-1")
    public int getAmount() {
	return amount;
    public void setAmount(int i) {
	this.amount = i;

Like in the Order object, the @SpaceProperty annotation and the get/setAmount() methods specify that the amount field should be represented in the space. There are similar annotations for the other data fields.

Data Lifecycle

In an SBA application, it is useful to examine the part played by different elements of the data model in the application workflow, and the relation between these elements:

  • Process relevance: Order objects are only relevant in Step A of the business process; Settlement objects are only relevant in Step B of the business process. What's more, the creation of a Settlement object signifies the start of Step B of the business process.
  • Accessibility: Order objects need to be accessible only to the matching component (Step A). Settlement objects needs to be accessible only to the routing component (Step B), and subsequently to buyers and sellers.
  • Latency: Both objects need to be accessed quickly for Step A and Step B to be executed.
  • Relation between data elements: Each Settlement object replaces two Order objects – once the settlement is generated, the orders are no longer relevant to anyone.

The following sections show how SBA allows the application to match these requirements precisely – each object is only stored for the duration required, and is only made accessible to the required parties, reducing the application's overhead to a minimum.

A5. Client Application


import java.util.Scanner;

import org.springmodules.javaspaces.gigaspaces.GigaSpacesTemplate;


The application imports Spring Framework context support, and org.springmodules.javaspaces.gigaspaces.GigaSpacesTemplate, the Spring abstraction of a GigaSpaces space. These two classes allow the application to connect to and work with the space, via a Spring DAO.

To learn more about Spring, refer to the Spring Framework Documentation.

Obtaining the Space

public class MarketClient {
    private static final long ORDER_LEASE_TIME = 60 * 1000;
    private GigaSpacesTemplate space;

The MarketClient class declares an ORDER_LEASE_TIME. This is a kind of timeout: how long (in ms) the order should remain in the space of the processing unit, before expiring. The space automatically cleans orders when their lease expires.

It also declares GigaSpacesTemplate space, a field that can accept the Spring abstraction of the space.

Further in the code, the client defines getSpace(), a simple method that references the Spring application context file (at path resources/client.xml), and obtains a GigaSpacesTemplate bean.

    private GigaSpacesTemplate getSpace() {
	if (space == null) {
	    ClassPathXmlApplicationContext ctx = 
                 new ClassPathXmlApplicationContext("/resources/client.xml");
	    space = (GigaSpacesTemplate) ctx.getBean("gigaspacesTemplate");
	return space;

In this example there are two spaces (one for each of the two processing units; one for Google shares and one for Microsoft shares). When getSpace() is called, the Spring Framework hands the relevant space to the application. Finally, the Order object is written to the space.

Submitting the Order

The customer enters order details using the command line (code not shown), and the application generates an Order object, setting the order details provided and setting isNew to true:

    Order order = createOrder(command[0]);
    fillOrder(order, command[1], command[2], command[3], true);

The order is then executed:

    private void execute(Order order) {
	getSpace().write(order, ORDER_LEASE_TIME);

The application obtains a GigaSpacesTemplate bean by calling the getSpace() method, and writes the order to the space, by passing a POJO and a lease time to the write() method exposed by the bean.

It is important to realize that the code above remains exactly the same if the application scales up, if the cluster topology changes, or even if GigaSpaces is replaced by a different vendor – the clustering model is completely transparent to the client, which always sees the application as one coherent unit.

A6. Delivering Orders to Processing Units

As we saw above, the client application obtains a GigaSpacesTemplate bean and 'writes' the order to it. But what does this mean in practice? How is the order actually delivered to the correct processing unit – the one that handles the specific shares specified in the order (either Google or Microsoft)?

A Partitioned Cluster

Behind the scenes, the spaces in the two processing units form a partitioned cluster – the orders flowing into the system, as well as the settlements generated by the system, are divided between the spaces in the cluster according to a partitioning scheme.

In this example, requests are partitioned according to the productId – orders for Google shares go to the space of Processing Unit 1; orders for Microsoft shares go to the space of Processing Unit 2.

In a real-world scenario, each node in the replicated cluster can have an additional node backing it up, to ensure high availability. The replication between the partitioned nodes and their backups can either be synchronous or asynchronous, depending on the reliability and performance requirements.

The Clustered Proxy

The cluster of spaces (two spaces in this case) exposes a clustered proxy, which allows clients to see the cluster as one coherent unit. This proxy accepts client requests, and routes them to the relevant space, based on the clustering mode – in this case, according to the partitioning scheme.

The GigaSpacesTemplate bean we saw earlier is merely a Spring abstraction of the cluster space proxy. When the client calls the write() method on the template, passing the Order object, the object is passed to the write() method of the cluster proxy. The proxy recognizes the productId of the request, and automatically routes it to the appropriate space.

Defining Cluster Mode and Topology

When the application is deployed, using a service deployment framework such as the GigaSpaces Service Grid, the administrator can specify how many processing units to deploy and how they should be clustered. The Service Grid offers several basic clustering patterns (partitioned, partitioned with backup, synchronous replication, asynchronous replication), and allows users to define custom patterns.

A7. The Processing Unit

The processing unit comprises three services: a space, a matching service, and a routing service.

The Space Service

The space is a uniform service that is injected by the deployment framework when the processing unit is deployed. It is tasked with providing messaging and data storage facilities for the business services (in this example, the matching service and the routing service).

The class which actually provides these middleware capabilities is This class exposes the JavaSpaces APIs: read(), write(), take(), and notify(). The space service, shown below, is a super-class that enables the space to be presented to the application as a Spring template.


import org.springframework.context.ApplicationContext;
import org.springmodules.javaspaces.gigaspaces.GigaSpacesTemplate;

public abstract class SpaceServiceImpl implements SpaceService {

    protected GigaSpacesTemplate spaceTemplate;
    public SpaceServiceImpl() {
    protected GigaSpacesTemplate getSpaceTemplate() {
	if (spaceTemplate == null) {
	return spaceTemplate;
    private void createSpace(String beanId) {
	ApplicationContext context = new ClassPathXmlApplicationContext(
	spaceTemplate = (GigaSpacesTemplate) context.getBean(beanId);
    public void setSpace(JavaSpace javaSpace) {

The Matching Service (Business Process Step A)

The matching service occupies a thread, and constantly repeats the following operation: It reads a FIFO-ordered Order object from the space, tries to find a matching order, and if it finds one, generates a settlement.

This is exactly as if the service subcribes to a messaging queue containing orders waiting to be processed. The nice thing is, there is no messaging server – only Order objects waiting in the space, on the matching service's local machine. This is a simple demonstration of how the space can facilitate implicit messaging between software components.

GigaSpaces offers space JMS support, which allows the messaging to be explicit: components can treat the space as if it were a traditional messaging service, and subscribe to queues or topics as usual.

Code Walkthrough

  1. The matching service starts by getting an order (any order) from the space. To do this, it defines a read template:
        public void match() {
    		Order readTemplate = new Order();

    The Order class's no-args constructor generates an order with all fields set to null, except for isNew which is set to true. This read template instructs the space to return any object of class Order for which isNew is true.

  2. The service obtains a GigaSpacesTemplate bean using the getSpaceTemplate() method (defined in the space service, which this service extends), and invokes the take() method, which reads an object from the space and deletes it.
    Order order = (Order) getSpaceTemplate().take(readTemplate, 10000);

    Notice that the template from the previous step is passed: this instructs the space to return an order (just one order), and then delete it.

    In this example the space is configured to perform FIFO ordering. Therefore, by default, the space returns the chronologically-next Order object (the order that was submitted by the client immediately after the last order taken by the matching service).

  3. The service checks the type (buy or sell) of the order retrieved from the space, and defines a variable with the opposite type.
    Int matchingOrderType
    	    if (order.getType() == Order.SELL_ORDER) {
    		matchingOrderType = Order.BUY_ORDER;
    	    } else {
    		matchingOrderType = Order.SELL_ORDER; }
  4. The service now tries to find a matching order in the space. To do this, it creates a new read template from the order taken from the space in step 2, only with the opposite type (buy or sell), and with isNew set to false. This instructs the space to find another order with the same price, amount and productId as the first order taken from the space, with an opposite type.
    Order readTemplate = order
    	    readTemplate.type = matchingOrderType
    	    readTemplate.isNew = false

    isNew is set to false to specify that the new trade should only be matched to existing trades in the space, not to new trades waiting in the queue.

  5. The service attempts to take a matching order from the space, with a timeout of one second.
    Order matchingOrder = (Order) getSpaceTemplate().take(readTemplate,

    If a matching order exists in the space, the space returns it; otherwise it returns null.

    In a real trading application, to enable more complex matching rules, the service could use GigaSpaces's space SQL support. Clients can pass a read template containing an SQL WHERE clause, and the space returns objects matching the query.
  6. If the take operation returned null, the service concludes there is no match, sets isNew to false and writes the order back to the space, in order to process it again in the future.
    if (matchingOrder == null) {
    		order.isNew = false
    For simplicity, this example does not use the space's local transaction mechanism. In a real implementation, the first take would be performed under a local transaction on the space – this would lock the original order, and if matching failed, the transaction would be rolled back and the lock would be released.
  7. If a matching order was found, the service generates a settlement, which represents a deal between buyer and seller:
    } else {
    		resolve(order, matchingOrder);
        private void resolve(Order order, Order matchingOrder) {
    	    Settlement settlement;
    	    settlement = new Settlement(matchingOrder, order);

    Since both the first order and the matching order were taken from the space (read and deleted), there is no danger that they'll be processed again in the next loop.

  8. Finally, the service writes the settlement to the space:

    This is exactly the same as if the service would write its results to a database. The space is now acting as a data store, holding the partial results of one step in the business process.

  9. Back to step 1 (while the service still occupies the thread).
    Once the matching service has finished working on the orders, it doesn't need to trigger the next step in the process; there is also no workflow engine managing the transition to the next business step. As we shall see, the transition occurs via the space's notification mechanism.

Lifecycle Methods

The matching service defines two lifecycle methods (not shown), invoked by the deployment framework, which perform user-defined maintenance operations when the processing unit is initialized and shutdown:

  • started() – called when the processing unit is deployed; opens a thread for the service and calls the match() procedure.
  • destroyed() – called when the processing unit is shut down on a certain machine; clears the thread.

The Routing Service (Business Process Step B)

The routing service asks to be notified by the space whenever a settlement is written to it. When a settlement appears in the space, the service receives a notification and takes it from the space.

Again, this is exactly as if the service subcribes to a messaging queue, this time a queue containing settlements. And again, there is no messaging server – there are only Settlement objects waiting in the space.

The routing service then writes the template to the exchange or to a persistent store.


This service imports a number of Jini and Spring classes that enable it to register for notifications on a space:

import java.rmi.RemoteException;

import net.jini.core.entry.UnusableEntryException;
import net.jini.core.event.RemoteEvent;
import net.jini.core.event.RemoteEventListener;
import net.jini.core.event.UnknownEventException;
import org.springframework.context.ApplicationContext;
import org.springmodules.javaspaces.gigaspaces.GigaSpacesTemplate;
import com.j_spaces.core.client.EntryArrivedRemoteEvent;
import com.j_spaces.core.client.ExternalEntry;
import com.j_spaces.core.client.NotifyModifiers;

The service also imports application-specific classes (not shown).

Code Walkthrough

  1. The routing service starts by asking to be notified when Settlement objects are written to the space. To do this, it first defines a read template containing a null Settlement (using the Settlement class's no-args constructor).
        public void route() {
    		Settlement readTemplate = new Settlement();

    This template defines what the routing service wants to be notified about – all Settlement objects.

  2. The service obtains a GigaSpacesTemplate and invokes the addNotifyDelegator() method, passing the null Settlement and various notification parameters.

    This sets up a notification listener, and registers the routing service for a notification on all Settlement objects that are written the space.

    The following methods, defined in the routing service, are invoked when the notification event is fired:

    public Notify(GigaSpacesTemplate spaceTemplate)
         this.template = template;
    public void notify(RemoteEvent theEvent) throws UnknownEventException,
           EntryArrivedRemoteEvent arrivedRemoteEvent = 
               (EntryArrivedRemoteEvent) theEvent;
           ExternalEntry newSettlement = 
               (ExternalEntry) arrivedRemoteEvent.getEntry(true);

    The second method actually retrieves the Settlement object that was written to the space, placing it in the variable newSettlement.

  3. When the matching service finishes its process and writes a new Settlement object to the space, an event is fired, the notification methods above are invoked by the space, and the newSettlement variable is populated.
  4. The routing service converts the newSettlement to the required format, and writes it to the exchange or persistent data store.
  5. Back to step 3, while the service is running.

Lifecycle Methods

Like the matching service, the routing service uses the startup() and shutdown() methods for thread management; these methods are invoked by the deployment framework when the processing unit is started and stopped, respectively.

A8. Why this Application is Linearly Scalable

The trading application we presented supports two shares, with one machine processing the trades for each share. From this starting point, the application might be asked to scale in two dimensions: the volume of trades might increase in one or both shares, and more products might be added. The new products might have the same business process as the old, or a completely different business process (which requires developing a new processing unit alongside the existing one).

Supporting Increased Trading Volume

Suppose that the maximum capacity of each existing machine is 1000 trades per second, that the current trading volume in Microsoft shares is 800, and that this volume is expected to triple from 800 to 2400 trades per second.

We are assuming that trade in Google shares will remain constant. If and when it grows as well, the Google machine can be scaled in the same way.

Initially, the application does not need to do anything, because there is still capacity to utilize on the Microsoft machine. Only when that machine reaches full utilization, does the application use a combination of scaling up and scaling out to increase capacity.

It is possible to scale up the machine running Processing Unit 2 (which is dedicated to Microsoft shares), by adding two more CPUs – Processing Unit 2 can then handle the additional load.

Alternatively, two more machines can be added – as the load increases, the deployment framework will automatically launch Processing Units 3 and 4 on these machines, and these units will process the additional Microsoft trades. It is also possible to combine scaling up and out – for example, adding one CPU to the Microsoft machine and adding one new machine.

The decision how to scale depends solely on economics, not on technical considerations. The application remains perfectly optimized, whether it scales by adding more threads or adding more machines – and neither option breaks the SOA model.

Most importantly, however the application scales, scaling is linear: each additional CPU or machine contributes 1000 more trades per second.

This is true linear scalability because:

Latency remains constant. The space cluster proxy routes user requests based on a simple rule, so it takes the same amount of time to distribute user requests between 2 spaces and 3 spaces – or 300 spaces (assuming the network environment doesn't change) No latency barrier
Data affinity remains constant. Each user request is routed to a processing unit that can perform the entire business process independently. This guarantees consistency. Processing units can be backed up using replication or persistency. No consistency barrier
Code remains constant. As the application grows there is no need to change the client, the business process components, or the middleware. No coding barrier
Cost of additional capacity units remains constant, because each hardware unit contributes the same amount of trades per second. No marginal cost barrier

Using the second definition of linear scalability (See The Scalability Revolution, SBA Concept Paper, pg. 5):

Supporting More Products (Same Business Process)

Suppose that there is a need to support trading in IBM shares, alongside Google and Microsoft. There are three ways to scale up:

  • Increase utilization of an existing machine. Suppose that the machine dedicated to Google shares has enough spare capacity to handle the IBM shares are well. In this case, the IBM trades can be routed to Processing Unit 1. The partitioning scheme will be changed to direct both IBM and Google trades to Processing Unit 1.
  • Scale up an existing machine. If neither machine has enough spare capacity, CPUs can be added to one of the existing machines – say, the machine running Processing Unit 1 – and then the partitioning scheme can be changed to route IBM trades to Processing Unit 1.
  • Scale out. Another machine can be added, Processing Unit 3 can be deployed on that machine, and then the partitioning scheme changed to route IBM trades to Processing Unit 3.
  • Combination of scaling up and out.

As shown above, capacity increases linearly as more machines or CPUs are added.

Supporting More Products (Different Business Process)

Suppose there is a need to support trading in derivatives. This has a different business process from trading of shares – the application needs to get an input of the derivatives base and perform computations on it.

In this case, a new type of processing unit needs to be developed. This new processing unit might have a computation service, a different matching algorithm, etc. (The development of a new processing unit does not affect the design of the existing processing unit, which can remain exactly the same.)

The new processing unit is defined in the deployment framework, and deployed in one of two ways:

  • Deployed to an existing machine. If a machine is under-utilized, or can be scaled up, it can host two processing units: one for shares and one for derivatives. The partitioning scheme is simply changed to route derivatives trades to the under-utilized processing unit.
  • Deployed to new machines. If new machines are added, and there is sufficient load in derivatives trading, the new processing unit will automatically be deployed to the new machines.

As shown above, capacity increases linearly (for each product type) as more machines or CPUs are added.

A9. Beyond this Example

This example shows a very simple implementation of SBA. You are invited to read about the following additional options and considerations, which would be the next things to consider in developing a similar application:

  • Mirror Service – allows the application's data to be synchronized with an external data store.
  • Hibernate Cache Plugin – allows transparent integration with external databases.
  • Aggregated SQL queries – allows external clients and applications to perform SQL queries on the entire cluster of processing units.
  • External notifications – external clients can register once to receive notifications on changes to data anywhere in the application (refer to JavaSpaces Notifications; Map Notifications).
  • GUI-based monitoring and deployment – the GigaSpaces Service Grid with its administration UI provides a single point of control for the entire application (refer to Service Grid; GigaSpaces Browser).
  • JMX-based monitoring – the GigaSpaces Focal Server allows for monitoring of the entire application using standard JMX clients.

Wiki Content Tree

Your Feedback Needed!

We need your help to improve this wiki site. If you have any suggestions or corrections, write to us at Please provide a link to the wiki page you are referring to.