Friday, 20 March 2015

Disk Performance Testing with Jetstress 2013

Exchange Server Jetstress 2013 should be in the tool belt of every messaging administrator, since it allows the testing and validation of one of the most critical areas for Exchange Server performance: the disk subsystem.
Jetstress helps verify disk performance by simulating Exchange disk Input/Output (I/O) load, without requiring Exchange Server itself to be installed. Jetstress simulates the Exchange database and log file loads produced by a specific number of users, to help ensure that the disk subsystem is adequately sized against the performance criteria you established.
There are two categories of test scenarios available with the tool: disk subsystem throughput and the Exchange mailbox profile.
  • In the disk subsystem throughput test scenario, you can do the following types of tests:
    • Performance of database transactions (the performance test becomes a stress test when its duration is longer than 6 hours)
    • Database backup
    • Soft recovery
  • In the Exchange mailbox profile test scenario, you can specify the number of mailbox users and the I/O per second per mailbox to simulate the profiled Exchange mailbox load.
Before diving into disk performance testing just for the fun of it, please take some time to plan the tests correctly, and specifically to determine which performance goals must be met. The best place to start is, of course, the Exchange 2013 Server Role Requirements Calculator.

What’s New

Let’s start with what is not new… but should be! The previous version of Jetstress introduced a help file (Microsoft Exchange Server Jetstress.chm) that replaced the old Word document with instructions. It seems that someone forgot to update the .CHM file, since it looks exactly the same as in the old version, as depicted in Figure 1.
Image
Figure 1: Help File (.CHM)
This new version of Jetstress is intended to be used only with Exchange Server 2013. For previous versions of Exchange Server use Jetstress 2010. The 32-bit version of Jetstress 2010 should only be used to validate Exchange Server 2003, as depicted in the following table.
Version | Architecture | Usage | Link
14.01.0225.017 | 32 bit | Exchange 2003 | http://go.microsoft.com/fwlink/?LinkId=202341
14.01.0225.017 | 64 bit | Exchange 2007, Exchange 2010 | http://go.microsoft.com/fwlink/?LinkId=178616
15.0.658.4 | 64 bit | Exchange 2013 | http://aka.ms/Jetstress2013
Table 1: Jetstress version and download table
Jetstress 2013 introduces some improvements and bug fixes and, most importantly, supports the latest version of Exchange Server. Here is a brief list of the new features:
  • The Event log is captured and logged to the test log. These events show up in the Jetstress UI as the test is progressing.
  • Any errors are logged against the volume on which they occurred. The final report shows the error counts per volume in a new sub-section.
  • A single IO error anywhere will fail the test. CRC errors might be remapped; a re-run of Jetstress should verify that they were indeed remapped.
  • Jetstress detects errors -1018, -1019, -1021, -1022 and -1119, as well as hung IO, DbtimeTooNew and DbtimeTooOld.
  • Threads, which generate IO, are now controlled at a global level. Instead of specifying Threads/DB, you now specify a global thread count, which works against all databases. This improves the granularity of thread tuning and enables automatic tuning to work more effectively.
  • Jetstress configuration files (JetstressConfig.XML) generated by an older version of Jetstress are no longer allowed.

Installing Jetstress

Jetstress testing should be performed before you put the server in production, and the tool should be completely removed before you install the production version of Exchange Server.
To install Jetstress, follow these steps:
  1. Ensure that installation requirements are met:
    • .NET Framework Version 4.5 or higher installed.
    • Windows Server 2008 R2 or Windows Server 2012.
  1. Configure the storage subsystem:
    • Although not strictly a prerequisite, use Windows certified hardware. If the server is part of a cluster, the whole server/storage configuration must be Cluster Certified.
    • Verify that all the storage components have the latest firmware and that the drivers are current.
    • Format the LUNs within Windows with the NTFS file system (64 KB allocation unit size).
    • Verify that the HBA/SAN specific configuration is set correctly. Many HBAs use registry keys to customize the configuration to a specific SAN platform (for example, Queue Depth).
    • Set the RAID controller stripe size to 256 KB or greater (refer to the hardware vendor for guidance).
    • Set the read/write cache to 75% write and 25% read on all LUNs.
    • Configure the storage logical unit numbers (LUNs) (consider Exchange log devices and database devices).
    • Make sure NTFS compression is not enabled.
    • Configure file-level antivirus to exclude all Exchange data locations and any directories that Jetstress has been configured to use.
    • Update Storport.SYS to the latest supported version for your hardware.
  1. Download Jetstress and start the installer by double-clicking the Jetstress.msi file. Follow the prompts to install the tool on your computer.
  2. Copy the following ESE database modules to the directory where you installed Jetstress.
    • ESE.DLL
    • ESEPERF.DLL
    • ESEPERF.INI
    • ESEPERF.HXX
    • ESEPERF.XML
ESE.DLL can be found at C:\Program Files\Microsoft\Exchange Server\V15\Bin\, in a default Exchange Server 2013 installation. The rest of the files reside in C:\Program Files\Microsoft\Exchange Server\V15\Bin\perf\AMD64\.
If you forget to copy these files, you’ll get the following warning when you try to run the tool:
Image
Figure 2: Validating required files
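The file copy described above can also be scripted. What follows is only a minimal Python sketch, not part of Jetstress: the function name copy_ese_modules is hypothetical, the source folders are the default Exchange Server 2013 paths quoted in the text, and the destination folder is an assumption that you must replace with your actual Jetstress install path.

```python
import shutil
from pathlib import Path

# File names and source locations are taken from the article; the
# destination folder is an assumption and must match your install path.
ESE_BIN_FILES = ["ESE.DLL"]
ESE_PERF_FILES = ["ESEPERF.DLL", "ESEPERF.INI", "ESEPERF.HXX", "ESEPERF.XML"]


def copy_ese_modules(bin_dir, perf_dir, jetstress_dir):
    """Copy the ESE modules into jetstress_dir.

    Returns (copied, missing): two lists of file names.
    """
    copied, missing = [], []
    dest = Path(jetstress_dir)
    sources = [(Path(bin_dir), ESE_BIN_FILES), (Path(perf_dir), ESE_PERF_FILES)]
    for folder, names in sources:
        for name in names:
            src = folder / name
            if src.is_file():
                shutil.copy2(src, dest / name)
                copied.append(name)
            else:
                missing.append(name)
    return copied, missing


if __name__ == "__main__":
    bin_dir = r"C:\Program Files\Microsoft\Exchange Server\V15\Bin"
    perf_dir = bin_dir + r"\perf\AMD64"
    # Hypothetical install folder; adjust to where Jetstress was installed.
    jetstress_dir = r"C:\Program Files\Exchange Jetstress"
    copied, missing = copy_ese_modules(bin_dir, perf_dir, jetstress_dir)
    print("copied:", copied)
    print("missing:", missing)
```

Any name reported as missing points to a file that Jetstress would also complain about at startup.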
  1. Running the tool triggers some validation tests, such as the one mentioned in the previous step. Another test checks whether the LogicalDisk counters are enabled; if they are not, the tool fixes this automatically.
Image
Figure 3: Validating performance counters
  1. Every time the tool fixes something, you’ll need to restart Jetstress. Eventually, when all the prerequisites are met, you’ll be able to continue.
Image
Figure 4: Checking test system

Running the Test

Jetstress is available both as a command line tool and as a GUI application. The steps described in this article relate to the GUI version.
  1. To start Jetstress, simply click the shortcut on the Start menu or run JetstressWin.exe. On the Open Configuration page, you can either create a new test configuration or open an existing configuration file.
Image
Figure 5: Test configuration
  1. There are two categories of test scenarios, listed here, that you can select from the Define Test Scenario page:
    • Test a disk subsystem throughput
    • Test an Exchange mailbox profile
In this example we’ll use the first option, test a disk subsystem throughput.
Image
Figure 6: Define test scenario
  1. Since we’re performing a disk subsystem throughput test, the Select Capacity and Throughput page is available. There are options to size the test databases using the percentage of the maximum storage capacity, and target I/O throughput (IOPS) by the percentage of the maximum throughput capacity of the disk subsystem. Jetstress reserves 25% of the initial database file size for its future growth during test runs.
    You can also suppress tuning and specify a thread count manually if automatic tuning fails. To configure the initial number of threads, apply the following formula:

    Starting thread count = Target IOPS / 65

    For example, if the Target IOPS is 1000, the number of starting threads should be 1000 / 65 = 15.38 (round up to 16).
Image
Figure 7: Capacity and throughput
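The starting thread count formula from this step is easy to script when you have several targets to evaluate. Here is a minimal Python sketch; the function and constant names are made up for illustration, and only the divisor of 65 and the rounding up come from the text.

```python
import math

IOPS_PER_THREAD = 65  # divisor from the formula in the article


def starting_thread_count(target_iops):
    """Return the initial thread count for a given target IOPS, rounded up."""
    if target_iops <= 0:
        raise ValueError("target_iops must be positive")
    return math.ceil(target_iops / IOPS_PER_THREAD)


print(starting_thread_count(1000))  # 1000 / 65 = 15.38, rounded up to 16
```

Treat the result as a starting point only; it still has to be adjusted against what the storage actually delivers.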
  1. The next screen is where you select the kind of test you want to perform:
    • Performance - Generates the Exchange type of I/O by accessing a database that has Jet transactions.
    • Database backup - Measures the performance of a backup solution. Be aware that Jetstress can perform a streaming backup only to a device that can be mounted with a drive letter.
    • Soft recovery - Measures the log replay rate.
    • Multi-Host Test - Select this check box to pause the Jetstress tests (both performance and soft recovery) before the database checksums are run. By pausing, you can coordinate multiple hosts running Jetstress in parallel, and prevent the checksum of one host from interfering with the performance test on another host. Multiple host tests should only be used when testing multiple hosts against a common storage area network (SAN).
Image
Figure 8: Test type
  1. The Define Test Run page has the following options:
    • Output path for test results enables you to specify the directory where Jetstress will save the performance logs, and test reports.
    • Test duration (hours) enables you to specify the period for performance sample gathering.
Image
Figure 9: Test duration
  1. Jetstress limits the number of databases based on the physical memory of the system. For Exchange 2013, there’s a requirement of 256 MB of memory per database.
Image
Figure 10: Define Database Configuration
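As a rough illustration of the memory limit, the Python sketch below estimates an upper bound on the number of test databases. It assumes a plain division of total physical memory by the 256 MB figure quoted above; the exact calculation Jetstress performs may differ (for example, by reserving memory for the operating system), and the function name is hypothetical.

```python
MB_PER_DATABASE = 256  # Exchange 2013 figure quoted in the article


def max_databases(physical_memory_mb):
    """Upper bound on test databases, assuming a plain 256 MB-per-database rule."""
    return physical_memory_mb // MB_PER_DATABASE


print(max_databases(8 * 1024))  # 8 GB of RAM -> 32
```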
  1. The Select Database Source page has three options:
    • Create new databases
    • Restore backup databases
    • Attach existing databases
Since we are building this test from scratch we’ll select Create new databases. If we had previously prepared this test, we could just attach existing databases, saving some time.
Image
Figure 11: Database source
  1. The Review & Execute Test page gives you a summary of the test scenario that Jetstress will run. You have the following options available:
    • Prepare test - Creates the test database(s).
    • Execute test - Proceeds to prepare test databases, performs automatic tuning, and runs the configured test.
    • Save test - Saves the settings that you have configured to a new or an existing configuration file.
Image
Figure 12: Review & Execute Test
  1. When you hit Execute test, you get the Test in Progress page, where you have the option to cancel the test and exit from the application.
Image
Figure 13: Test in progress

Analyzing Results

After the test is completed, the performance data is analyzed and presented in a summary report, which is saved to a Performance_(DateTime).html file. All the performance counters collected are stored in a counter log file named Performance_(DateTime).blg that you can use for more advanced analysis.
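For more advanced analysis, one option is to convert the .blg counter log to CSV first (for example with the Windows relog utility and its CSV output format) and then summarize each counter. The Python sketch below assumes such a CSV, with a timestamp in the first column and one counter per remaining column. The function name and the sample column names are invented for illustration and are not what Jetstress produces.

```python
import csv
import io


def summarize_counters(csv_file):
    """Return {counter_name: (average, maximum)} for each numeric column.

    csv_file is an open text file whose first column is the timestamp.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    values = {name: [] for name in header[1:]}
    for row in reader:
        for name, cell in zip(header[1:], row[1:]):
            try:
                values[name].append(float(cell))
            except ValueError:
                pass  # blank or non-numeric sample, skip it
    return {
        name: (sum(v) / len(v), max(v))
        for name, v in values.items()
        if v
    }


# Tiny made-up sample standing in for an exported counter log.
sample = io.StringIO(
    "Time,Reads Latency,Writes Latency\n"
    "10:00:00,12.0,4.0\n"
    "10:00:15,18.0, \n"
    "10:00:30,15.0,6.0\n"
)
for name, (avg, peak) in summarize_counters(sample).items():
    print(f"{name}: avg={avg:.1f} max={peak:.1f}")
```

Blank samples are skipped rather than counted as zero, so a gap in the log does not pull the average down.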
Consider the following guidelines when examining the data collected.
Performance counter instance | Guidelines for performance test | Guidelines for stress test (>6h)
I/O Database Reads Average Latency (msec) | The average value should be less than 20 milliseconds (msec) (0.020 seconds) and the maximum values should be less than 100 msec. | The maximum value should be less than 200 msec.
I/O Log Writes Average Latency (msec) | Log disk writes are sequential, so average write latencies should be less than 10 msec, with a maximum of no more than 100 msec. | The maximum value should not exceed 200 msec.
% Processor Time | Average should be less than 80% and the maximum should be less than 90%. | Same as for performance test.
Table 2: Guidelines for examining Jetstress 2013 analysis reports
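The guidelines in Table 2 can be turned into a quick automated check. The following Python sketch is one possible way to do it, under two stated assumptions: the average limits are applied to stress runs as well (the table only restates the maximum for those), and every limit is treated as a strict "less than", which is slightly stricter than the table’s "no more than" wording for log writes. The dictionary keys and the function name are hypothetical.

```python
# Thresholds in milliseconds / percent, copied from Table 2.
# Assumption: the average limits also apply to stress runs, where the
# table only restates the maximum.
LIMITS = {
    "db_read_latency_ms": {"avg": 20, "max": 100, "stress_max": 200},
    "log_write_latency_ms": {"avg": 10, "max": 100, "stress_max": 200},
    "processor_time_pct": {"avg": 80, "max": 90, "stress_max": 90},
}


def check_results(measured, stress=False):
    """Return a list of human-readable threshold violations.

    measured maps a counter key from LIMITS to an (average, maximum) pair.
    """
    issues = []
    for key, (avg, peak) in measured.items():
        limit = LIMITS[key]
        max_limit = limit["stress_max"] if stress else limit["max"]
        if avg >= limit["avg"]:
            issues.append(f"{key}: average {avg} is not below {limit['avg']}")
        if peak >= max_limit:
            issues.append(f"{key}: maximum {peak} is not below {max_limit}")
    return issues


# Made-up measurements, for illustration only.
sample = {
    "db_read_latency_ms": (14.2, 130.0),
    "log_write_latency_ms": (3.1, 40.0),
    "processor_time_pct": (35.0, 60.0),
}
print(check_results(sample))               # performance limits: read maximum too high
print(check_results(sample, stress=True))  # stress limits: no issues
```

The same read latency spike that fails a performance test passes the looser stress test maximum, which is why the test duration matters when reading the report.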
If you open Performance_(DateTime).html you’ll see a nice HTML report with some tables with performance analysis, like the following example:
Image
Figure 14: Performance Test Result Report
How to interpret the summary report:
  • The Test Summary legend summarizes the test run.
  • The Test Issues legend indicates whether the test passed or failed. It also indicates any performance sample averages and spikes that have exceeded thresholds.
  • The Database Sizing and Throughput legend explains your planned Exchange database sizing in addition to the input/output (I/O) throughput target.
  • The Jetstress System Parameters legend explains the parameters that relate to the Exchange database engine.
  • The Database Configuration legend lists the database files and log paths, grouped by Jetstress 2013 instance.
  • The Transactional I/O Performance legend shows the throughput, average I/O size and latency for read and write disk operations that are related to user activity, and log replication. The numbers are grouped by Jetstress 2013 instance for both database and log operations.
  • The Background Database Maintenance I/O Performance legend shows the database read throughput and average size for operations triggered by background database maintenance.
  • The Log Replication I/O Performance legend explains log read throughput and average size for operations triggered by log replication.
  • The Total I/O Performance legend explains throughput, average I/O size and latency for read and write disk operations, taking into account all the operations triggered by Jetstress 2013.
  • The Host System Performance legend summarizes other non-I/O performance metrics. Be aware that these figures could be indirectly affected by storage performance.
Since I ran this test in a virtual environment without adequate storage performance, the test failed, as expected.

Conclusion

The storage subsystem is key to a successful Exchange deployment. Planning it and testing it should precede any Exchange deployment. The Jetstress tool, in conjunction with Performance Monitor and Event Viewer, allows you to validate how the server and storage infrastructure will behave once a real Exchange workload is applied.
Please remember that before a server that was tested with Jetstress can be put into production, you should follow this recommended procedure:
  • Uninstall Jetstress and reboot
  • Copy the Jetstress data to a safe location
  • Delete the Jetstress installation folder
  • Remove all test databases
