Unit of Scalability
July 27th, 2008 by Oscar Huseyin
Recently, some of our clients have asked us to provide an early performance view of an application that they are either considering purchasing or have recently purchased. Given everything that makes up a J2EE application, this can be difficult to represent. So, how can we present a view of the system's performance to the customer and address their concerns about product performance?
The answer is relatively trivial. A performance view can be derived using one of two methods: bottom-up or top-down.
Let's consider the bottom-up approach. This typically involves an application architecture review, where all major architectural components of the system are identified and then analysed. Analysis of a component typically involves inspecting its configuration with a view to deriving the performance sociability of the component. A good example here is the Hibernate 2nd level cache. As Hibernate developers will testify, configuring the 2nd level cache can be very tricky and unforgiving. The cache configuration typically involves caching static and/or dynamic database data. Static data can be cached without much thought, as the data is read-only and will not change during the lifetime of the JVM. The same cannot be said of dynamic data. Dynamic data requires very careful thought, as the developer is contending with transaction management and isolation levels. A simple oversight in design can cause data integrity issues that are hard to find during performance testing.
With a view of all the architectural components and the way they are configured to interact with each other, a performance architect can derive a scalability view of the system. Depending on the level of performance detail required, a model can then be formulated that attempts to stochastically derive optimal configuration parameters for the system.
Now, let's look at the top-down view. Applying operational analysis, we can define the expected end-user usage characteristics and create a load model, also known as an Application Simulation Model. This load model is the input against which the system's performance is measured. The operational model is the method commonly used in performance and volume testing. By applying the load model and measuring system metrics such as CPU, memory and network utilisation, the performance engineer can view the system's scalability on a given platform.
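To make the idea of a load model more concrete, here is a minimal Python sketch. The transaction names, weights and timings are invented for illustration, and the `Transaction` class and `request_rate` function are hypothetical helpers rather than part of any testing tool. It assumes a closed system of virtual users and applies the interactive response time law from operational analysis, X = N / (R + Z), to estimate the request rate a given number of users would offer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One user action in the load model (all values are illustrative)."""
    name: str
    weight: float           # share of the workload mix; weights sum to 1.0
    think_time_s: float     # pause before the user issues the request
    response_time_s: float  # expected time for the system to respond


def request_rate(users: int, mix: list[Transaction]) -> float:
    """Requests per second offered by `users` virtual users.

    Uses the interactive response time law, X = N / (R + Z), with R and Z
    taken as weighted averages over the transaction mix.
    """
    total_weight = sum(t.weight for t in mix)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("transaction weights must sum to 1.0")
    # Average time one user spends on a full think-and-wait cycle.
    cycle_s = sum(t.weight * (t.think_time_s + t.response_time_s) for t in mix)
    return users / cycle_s


# Hypothetical transaction mix; replace with observed usage characteristics.
MIX = [
    Transaction("login", 0.1, 5.0, 1.0),
    Transaction("search", 0.6, 10.0, 2.0),
    Transaction("checkout", 0.3, 20.0, 4.0),
]

rate = request_rate(100, MIX)
print(f"100 virtual users offer about {rate:.2f} requests per second")
```

In practice the mix would come from the expected end-user usage characteristics rather than invented numbers, and the response times would be measured, not assumed.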
Having defined the two approaches, let's consider the analysis of the data and look at the results that we can obtain from each. To the skilful performance engineer, the data from the bottom-up architectural analysis will give clues and allow conclusions to be drawn about the system's performance capabilities. Continuing with the Hibernate 2nd level cache example, a performance engineer could conclude that the requirement to distribute updates to the other caches in a cluster of JVMs will incur an N+1 update overhead. This synchronise operation is, at best, an interprocess call or, at worst, a network call to write the cache state changes to the other JVMs. That is not very good for performance if the system requires a large number of JVMs. This heuristic view of a component's behaviour is usually enough to flag a performance issue. The point here is that although the performance engineer may be certain of a scalability issue, the results are theoretical and will require proof using operational model methods. Generally speaking, the performance engineer will not be able to derive a concrete scalability result from the bottom-up approach.
Operational analysis, or the top-down approach, will provide clearer, more tangible results for the scalability of the system. The results of the test will generally be represented as something like: 100 logged-in users, CPU at 80% capacity, memory at 80% capacity and network at 80% capacity. Normalising this view using well-defined, divisible entities is what I've found to provide the best view of system scalability.
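As a rough illustration of normalising such a result, the sketch below divides each measured utilisation figure by the number of logged-in users. The `normalise` function is a hypothetical helper, and the calculation assumes that resource usage grows in proportion to user count with no idle baseline, which a real system may not satisfy.

```python
def normalise(users: int, utilisation_pct: dict[str, float]) -> dict[str, float]:
    """Return the percentage of each resource consumed per logged-in user."""
    if users <= 0:
        raise ValueError("users must be positive")
    return {resource: pct / users for resource, pct in utilisation_pct.items()}


# Figures taken from the example test result in the text.
measured = {"cpu": 80.0, "memory": 80.0, "network": 80.0}
per_user = normalise(100, measured)

for resource, pct in per_user.items():
    print(f"{resource}: {pct:.2f}% per user")
```

With the example figures, each logged-in user accounts for 0.8% of each resource.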
Let me further clarify what is meant by well-defined, divisible entities. To obtain the most meaningful results, the software and hardware configuration needs to be as simple as possible and representative of the minimum components of the system; for example, a single node for the JVM, a single node for the web server and a single node for the database. This forms the fundamental unit of configuration. From this point, the application can be scaled horizontally or vertically. Here is a view of this configuration:

The diagram illustrates a simple system configuration and describes the method by which virtual users access the system.
By ensuring that the system remains simple and that each software and hardware component is configured to the minimum level required to access all possible system functions, we can run our Application Simulation Model load against the system and measure the key system metrics. These measured metrics, grouped together and presented as a whole, are what I call the Unit of Scalability. Once this fundamental view is acquired, other scalability attributes are easily derived. For example, the Unit of Scalability can be used to determine whether the system scales vertically or horizontally in a linear manner. To determine the horizontal scalability attributes, simply double the hardware and software components and execute the same simulation model; the results will reveal how close to linear the system's scalability is.
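The comparison described above can be sketched in Python as follows. The `UnitOfScalability` class, the `linearity` function and the figures for the doubled configuration are all hypothetical. The sketch assumes that under perfectly linear scaling, doubling the configuration while running the same load model would halve the utilisation of each resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitOfScalability:
    """Key metrics measured together under one load-model run."""
    users: int                         # virtual users applied by the load model
    utilisation_pct: dict[str, float]  # resource name -> percent utilised


def linearity(base: UnitOfScalability, scaled: UnitOfScalability,
              factor: float) -> dict[str, float]:
    """Ratio of ideal to observed utilisation after scaling by `factor`.

    1.0 means the resource scaled linearly; lower values mean the scaled
    configuration used more of the resource than linear scaling predicts.
    """
    if base.users != scaled.users:
        raise ValueError("both runs must use the same load model")
    return {
        resource: (pct / factor) / scaled.utilisation_pct[resource]
        for resource, pct in base.utilisation_pct.items()
    }


# Baseline figures from the text; the doubled run is invented for illustration.
base = UnitOfScalability(100, {"cpu": 80.0, "memory": 80.0, "network": 80.0})
doubled = UnitOfScalability(100, {"cpu": 50.0, "memory": 40.0, "network": 40.0})

result = linearity(base, doubled, factor=2)
for resource, ratio in result.items():
    print(f"{resource}: {ratio:.2f}")
```

In this invented example, memory and network scale linearly while CPU falls short, which would point the investigation at whatever is consuming extra CPU in the larger configuration.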
Having calculated the Unit of Scalability and further derived the system's vertical and/or horizontal scalability attributes, we can accurately quote system capacity requirements and, more importantly, present our customers with a simple view of a system's performance and scalability.
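As a final illustration, a capacity quote could be estimated along these lines. The `units_required` function is a hypothetical helper, and treating scaling efficiency as a single constant applied to every additional unit is a simplifying assumption; a measured system may degrade differently as it grows.

```python
import math


def units_required(target_users: int, users_per_unit: int,
                   efficiency: float = 1.0) -> int:
    """Number of units of configuration needed to serve `target_users`.

    `efficiency` is the fraction of a unit's capacity retained when scaling
    out (1.0 for perfectly linear scaling). Assumed constant per unit.
    """
    if target_users < 0 or users_per_unit <= 0:
        raise ValueError("user counts must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return math.ceil(target_users / (users_per_unit * efficiency))


# Hypothetical target of 1000 users, with one unit sustaining 100 users.
print(units_required(1000, 100))        # perfectly linear scaling
print(units_required(1000, 100, 0.9))   # 10% loss per unit when scaling out
```

With these invented figures, the quote rises from 10 units to 12 once a 10% scaling loss is taken into account.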