Cargo Cult Testing (testing as ritual rather than science)

May 15th, 2007

A cargo cult reproduces a landing strip complete with wooden plane.

Richard Feynman popularised the idea of the cargo cult in his essay on cargo cult science, which was “science” that followed all the forms of scientific investigation, but lacked real critical scientific thought. The idea was transferred to the world of software courtesy of the Jargon File with its entry on cargo cult programming.

The Jargon File defines cargo cult programming as:

A style of (incompetent) programming dominated by ritual inclusion of code or program structures that serve no real purpose. A cargo cult programmer will usually explain the extra code as a way of working around some bug encountered in the past, but usually neither the bug nor the reason the code apparently avoided the bug was ever fully understood.

As someone who works in one of the more technical areas of testing, I see the same thing in the testing world too.

Functional testing is pretty straightforward: ask a test manager why they are testing, and they will probably say something like “we are seeing if the business functions work” (really smart test managers will add that they are seeing that the actions that are meant to fail do actually fail too).

But once you add a layer of technology, it becomes a bit harder to grasp. If your functional test regression suite is automated, then the quality of the testing is less easy for most people to see. Were the test cases that were automated good test cases to begin with? After a few months or a year, regression testing becomes a little bit like a ritual; you wave the magical testing tool over a new version of the application and declare it “tested”. Whether it is tested well is an entirely different question…and one which is unlikely to occur to those deeply involved in the ritualistic behaviour.

In the stress/volume/performance/load/reliability testing world things are even worse. Occasionally I visit companies that I have consulted at in the past to see how they are going with their testing. They generally know to run a peak load test for new builds (and platform changes) before applying the change to Prod, but if you ask a question like “have the transaction volumes changed in Production since I wrote the original Detailed Test Plan, and have the LoadRunner scenarios been updated to reflect this?”, you will draw a blank.

Performance testing during the maintenance phase of software frequently seems to degrade to a situation where the testers are more interested in whether the test cases and LoadRunner scripts still run successfully than in whether the test cases reflect reality or even cover the point of change that is driving the current round of testing. And a lack of understanding of what each test case is designed to test leads to wasteful re-running of test cases that are not impacted by a change.

Unfortunately I don’t really have any solutions for this besides companies paying me to occasionally come in and review their performance testing activities (which kind of smacks of self-interest).

Connecting to a MySQL database with LoadRunner

May 14th, 2007

The Virtual Table Server is great for most situations where your virtual users need a common data pool, but the limitations of the API mean that it is a bad fit in some cases.

Say you need to find a row in the data table with a specific value. On a real database you could just do a simple SELECT statement. With VTS, you would have to write code to iterate through every row in the table and check the column value each time.

So sometimes you need to use a real database. Interfacing with a real database is not much harder than interfacing with VTS (as the attached script demonstrates). You will spend more time setting up the database than writing code.

Interfacing with a real database will also allow you to talk to the database your application uses (only do read-only operations, and be careful of any additional overhead you might introduce during your load test).

This example uses the Java vuser type, and interfaces with a MySQL database. The beauty of JDBC is that to port this code to another database just requires changing the line of code that specifies the database driver (assuming that you have not used database-specific SQL code).

On with the example…

Setting up the database

  1. Download and install MySQL.
  2. Create a database
    mysql> CREATE DATABASE loadtest;
    mysql> SHOW DATABASES;
    mysql> USE loadtest;
  3. Create a table
    mysql> CREATE TABLE message (time_sent BIGINT UNSIGNED, identifier VARCHAR(127));
    mysql> SHOW TABLES;
    mysql> DESCRIBE message;
  4. Load data into the table
    mysql> INSERT INTO message VALUES ('1178858071111','asdf1234');
    mysql> INSERT INTO message VALUES ('1178858071112','asdf1235');
    mysql> INSERT INTO message VALUES ('1178858071113','asdf1236');
    mysql> INSERT INTO message VALUES ('1178858071114','asdf1237');
    mysql> INSERT INTO message VALUES ('1178858071115','asdf1238');
    mysql> SELECT * FROM message;
  5. Create a user account with access to the database.
    mysql> GRANT ALL PRIVILEGES ON loadtest.* TO 'loadrunner'@'%' IDENTIFIED BY 'loadrunner' WITH GRANT OPTION;
    Note username/password is loadrunner/loadrunner.
    mysql> SELECT * FROM mysql.user where User = 'loadrunner';

Verifying external connectivity to the database

  1. Download an SQL query tool to verify that you can successfully connect to the database.
    I recommend that you check out SQLeonardo.
  2. Download the JDBC driver for MySQL.
    MySQL now produce their own JDBC driver (Connector/J)
  3. Configure the SQL query tool.
    nickyb (the creator of SQLeonardo) says…

    Suppose you have downloaded and unzipped sqleonardo into c:\sqleonardo and the mysql jdbc driver into c:\sqleonardo\mysql

    Run SQLeonardo and into the “metadata explorer” do:
    - choose the menu “actions>new driver…”
    - check “add library (browse filesystem)” and click “next >”
    - select the jar file into c:\sqleonardo\mysql and click “next >”
    - type into the textfield named “name:” => MySQL
    - select into the combobox named “driver:” => com.mysql.jdbc.Driver
    - into “example:” => jdbc:mysql://[host][:port]/[database]
    - click ok.
    now you have registered the driver!

    Select the item “MySQL” appeared into the tree and:
    - choose the menu “actions>new datasource…”
    - replace jdbc:mysql://[host][:port]/[database] => jdbc:mysql://localhost:3306/
    - put username and password and click ok.
    now you have added your database profile.

    Select the item under “MySQL”…you need now to test the connection!

    So our database connection string would be jdbc:mysql://www.myloadtest.com:3306/loadtest (assuming the database is installed on the default port).

  4. Test the connection.
    Try running SELECT * FROM message; in your SQL query tool.

Writing the code

  1. The java.sql Javadocs are here. There are numerous tutorials on the web, and you have my sample code (text, zipped). What more do you need?
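
For readers without the attached script to hand, here is a minimal sketch of the kind of lookup described above: finding a row with a specific value using a simple SELECT instead of iterating through every row. It is not the attached script. The class and method names (MessageLookup, jdbcUrl, findTimeSent) are made up for illustration, and it assumes the loadtest database and message table created earlier, with the Connector/J jar on the classpath.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper class; the names are illustrative, not from the attached script.
class MessageLookup {

    // Builds a connection string in the jdbc:mysql://[host][:port]/[database] form.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    // Returns time_sent for the row with the given identifier, or -1 if no row matches.
    // Assumes the "message" table created in the setup steps above.
    static long findTimeSent(Connection conn, String identifier) throws SQLException {
        String sql = "SELECT time_sent FROM message WHERE identifier = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, identifier); // bind the value rather than concatenating it
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getLong(1) : -1L;
            }
        }
    }
}
```

To use it, open a connection with DriverManager.getConnection(MessageLookup.jdbcUrl("localhost", 3306, "loadtest"), "loadrunner", "loadrunner") and pass it to findTimeSent along with an identifier such as "asdf1236". In a vuser script you would probably open the connection once during initialisation rather than on every iteration, so that connection setup does not add overhead to the load test.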

Scripting Exercise: Correlation Challenge

May 14th, 2007

Beaten by the correlation exercise.

Correlation is one of the fundamental LoadRunner scripting skills. LoadRunner novices are usually not very good at it (which is expected), but people who think they are LoadRunner experts are sometimes not very good at it either.

Mercury has done just about everything it can to remove the need for complicated correlation (correlation rules, the “scan script for correlations” option in VuGen, HTML mode recording, the Click and Script vuser type), but there will always be web applications that will require you to perform manual correlation when scripting.

Mercury’s training material kind of glosses over correlation, and makes it look overly easy – the correlation exercise from the training material never gives any of my students any problems; but if the training material were to include difficult exercises, it would be necessary to spend another day, and more people would be unable to complete the exercises.

I usually tell my students that they will encounter applications that are much more difficult to correlate than the Mercury Tours website, and they will need to spend some time improving their manual correlation skills using WDiff.

This exercise should really test your correlation skills. Every problem (or something very similar) has been seen “in the wild” while I have been creating scripts for LoadRunner. The first exercise is the same as the exercise from Mercury’s VuGen 8.1 Scripting for the Web training material.

Correlation exercise coming soon…

Until then, have a go at the previous scripting exercises:

The 10 Commandments of Load Testing

May 9th, 2007

I have made a list of the top ten things load testers frequently fail to do that make me feel like smiting them.

Lightning strike (Image ID: nssl0012, National Severe Storms Laboratory)

  1. Thou shalt know how thy test tool works.
    The worst performance testers I have met were always more concerned about whether they could get their scripts to run, rather than whether the tests they were running were realistic. Read the documentation, practice, spend some time figuring out what all the settings do, then relate how your scripts are running back to how real users exercise your application.
  2. Thou shalt gather realistic usage data.
    Garbage in, garbage out. If your transaction volumes are wrong, then your load test is wrong.
  3. Thou shalt have testable requirements.
    Non-functional requirements (especially load and performance-related requirements) are usually an afterthought for many projects. This shouldn’t stop you from trying to gather the requirements you need for your tests. The business approach of “let us know how fast it is, and we will let you know if that’s okay” isn’t good enough. Get some numbers. The numbers can change in the future (maybe call them “targets” or “guidelines” rather than “requirements”), but you need something to test against before you start.
  4. Thou shalt write a test plan.
    Even if you already know what you’re going to be doing, other people would probably like to know too – they might even be able to help; besides, a signed-off test plan has saved many a tester from the wrath of project management.
  5. Thou shalt test for the worst case.
    Don’t test with transactions from an average day, test for the busiest day your business has ever had. Add a margin for growth. Testing failover? A server doesn’t fall over at midnight when no one is using your application (would we care in this situation anyway?), it falls over in the middle of the day when lots of real people are using it.
  6. Thou shalt monitor your test environment infrastructure.
    I feel that I have to spell it out, because I still see people who don’t do this. Monitoring your servers allows you to more easily figure out where the problem is. You can also make neat observations like “response times for the new version of the application are identical to the previous version, but CPU utilisation on the servers has increased by 10%”. When I say “monitor your servers”, this includes your load generators.
  7. Thou shalt enforce change control on your environment.
    The final thing you tested should be what is deployed into Production – same application version, same system configuration. It’s easy to lose track of what you are actually testing against if people are making uncontrolled changes to your environment, or if people are making tuning changes without tracking what they are changing. Keep a list of changes that are made…even if you are in a hurry; and always make sure you know what you are testing against.
  8. Thou shalt use a defect tracking tool.
    An untracked defect is a little like a tree that falls in the forest when no-one is around – no-one cares. Raising defects lets everyone know there is a problem (not just the people who should be working to fix it). It also provides a neat repository to keep track of all the things that have been tried to fix the problem.
  9. Thou shalt rule out thy own errors before raising a defect.
    “Oops, my bad!” is a great way to lose credibility with the people who are going to be fixing your defects. If you don’t have credibility, you are going to have to work much harder to convince people that the problem you are seeing is due to a fault with the system rather than a fault with your test scripts. Don’t be so afraid of making a mistake that you test “around” errors (like people who see HTTP 500 errors under load and “solve” the problem by changing their scripts to put less load on the system). It always helps if you have followed commandment #1 Thou shalt know how thy test tool works.
  10. Thou shalt pass on your knowledge.
    Write a Test Summary Report and let management know what you found (and fixed) during testing, make some PowerPoint slides, hold a meeting. Let the Production monitoring group know which metrics are useful to monitor, let them re-use your LoadRunner scripts for Production monitoring with BAC. Leave some documentation for future testers; don’t make them gather requirements and transaction volumes again, or re-write all your scripts because they don’t understand them. Retain your test results until you are sure that no-one is going to ever ask about the results of that test you ran all those months ago.
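
As a small worked example of commandment 5, here is a sketch of turning “busiest day plus a margin for growth” into a number you can put in a test plan. The class name, the method name and the figures used are all hypothetical; substitute your own Production statistics.

```java
// Hypothetical helper; names and figures are illustrative only.
class LoadTarget {

    // busiestHourTransactions: the highest hourly volume ever seen in Production.
    // growthMargin: the allowance for growth as a fraction (0.25 means 25%).
    // Returns the hourly transaction volume the peak load test should drive.
    static long targetPerHour(long busiestHourTransactions, double growthMargin) {
        return (long) Math.ceil(busiestHourTransactions * (1.0 + growthMargin));
    }
}
```

For a busiest hour of 12,000 transactions and a 25% margin, targetPerHour(12000, 0.25) gives the volume to aim for in the scenario.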

SOAP over JMS LoadRunner script

April 30th, 2007

This is a quick walkthrough of a LoadRunner script I created to load test a “web” service that communicates with SOAP messages sent over JMS.

Note that this script supports the queuing/point-to-point model of JMS (rather than the publish and subscribe model), works with JMS version 1.1 rather than 1.02b (as it uses the BytesMessage.getBodyLength() method among others), and has been tested with Java 1.5 using the Tibco implementation of JMS.

The LoadRunner script can be downloaded here, the source code is available here, and a mirror of the JMS 1.1 specification can be downloaded here. People using Tibco’s JMS implementation can download their Javadocs here (10MB). Feedback is appreciated.

From looking at the script, you may wonder why I chose to write my own script using the Java vuser type instead of using the Web Services vuser (version 8.1, Feature Pack 4 supports web services using JMS as well as HTTP), but that is a story for some other time.

Anyway…for the tiny number of people that I haven’t lost already, here’s how it all works…

At a high level, we are trying to put a SOAP message on a queue (MyLoadTest.Warehouse.Request), receive the corresponding response from a different queue (MyLoadTest.Warehouse.Response), and check that the response message matches our expected message. Download my source code and follow along.

On with the detail…

The javax.jms.* classes provide interfaces only; vendors provide implementations of these interfaces (I am using Tibco’s implementation – found in com.tibco.tibjms.*). This should mean that any JMS code you write will be portable between implementations; whether this is true or not is something I haven’t tested.

Initially we perform a JNDI lookup to find a ConnectionFactory and get the details for the queues we will be using.

We then create a Connection and a Session, and then create the MessageProducer (using the request queue details) that we will be using to send our message to the request queue.

We create either a BytesMessage or a TextMessage (MapMessage, ObjectMessage and StreamMessage have not been implemented in the script), and set the JMS header properties. The important JMS headers are the JMSDestination (the request queue), and the JMSReplyTo (the response queue). We can also set Message properties, which are simple name/value pairs. We then set the Message body, using a string constant containing the SOAP XML message.

We only want to measure system response times, so we only put our LoadRunner transaction timing points around the code that sends and receives the message.

Sending the message is as simple as calling myMessageProducer.send(myMessage).

In the following extract from the Tibco EMS logs, you can see the JMS headers, the Message properties and the Message body (shown here with a size only). The JMS headers show the request and response queues along with a unique msgID.

2007-02-26 13:20:07 [MSG:73422]
received from user='anonymous':
connID=1669
prodID=16906
msgID='ID:SVP-EMS-SERVER.391E45DA4ADA6C8:3083'
Time=1172456256000
mode=PERSISTENT
queue='MyLoadTest.Warehouse.Request'
msg=BytesMessage=
{
  Header=
  {
    JMSDestination={QUEUE:'MyLoadTest.Warehouse.Request'}
    JMSReplyTo={QUEUE:'MyLoadTest.Warehouse.Response'}
    JMSDeliveryMode={PERSISTENT}
    JMSPriority={4}
    JMSMessageID={ID:SVP-EMS-SERVER.391E45DA4ADA6C8:3083}
    JMSType={BytesMessage}
    JMSTimestamp={1172456256000}
  }
  Properties=
  {
    "SOAPJMS_requestIRI"={string:''}
    "SOAPJMS_targetService"={string:''}
    "SOAPJMS_soapAction"={string:''}
    "SOAPJMS_isFault"={string:'false'}
    "SOAPJMS_contentType"={string:''}
    "SOAPJMS_soapMEP"={string:'http://www.w3.org/2003/05/soap/mep/request-response/'}
    "SOAPJMS_bindingVersion"={string:''}
  }
  Body=
  {
    byte[]:1126 bytes
  }
}

To receive a Message, we must create a MessageConsumer (using the response queue). Rather than simply receiving the messages from the queue in the order they arrived (FIFO – which would mean that we could be receiving the wrong response message when we are running load against the system), we can use a message selector to make sure that we receive the message that we want from the queue.

In the log above, we saw that each JMS message has a unique JMSMessageID. When a response is sent by the server, the response message has a JMSCorrelationID that matches the JMSMessageID of the message it is replying to. A message selector is defined in the same way as the WHERE clause of an SQL statement, so our selector would be
"JMSCorrelationID = '" + myBytesMessage.getMessageID() + "'".
Note that the unique message ID is only set when the message is sent so, if you are using a selector, you must create your MessageConsumer *after* you have called myMessageProducer.send(myMessage). A selector can use any of the JMS headers and Message properties, but cannot use the Message body.
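
As a sketch of building that selector string safely (this is not taken from the attached script, and the SelectorBuilder name is made up), the helper below wraps the message ID in quotes. Selector string literals follow SQL-92 quoting, so a single quote inside the value is escaped by doubling it. Message IDs like the one in the log above contain no quotes, so the escaping is just a precaution.

```java
// Hypothetical helper for building a JMS message selector string.
class SelectorBuilder {

    // Returns a selector that matches only the response to the given request.
    // messageId is the value returned by getJMSMessageID() after the request was sent.
    static String forCorrelationId(String messageId) {
        // Double any single quote so it cannot terminate the string literal early.
        return "JMSCorrelationID = '" + messageId.replace("'", "''") + "'";
    }
}
```

The resulting string is what you would pass as the selector argument when creating the MessageConsumer.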

Using the MessageConsumer, receiving the correct message from the queue is done by simply calling myMessageConsumer.receive(timeout). The script will wait to receive the message until it reaches the timeout value. If no message is received before the timeout, the response message will be null. Remember to close the MessageConsumer, or you will leak resources and end your test with thousands of open receivers.
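
To make the difference between FIFO and selector-based receiving concrete without needing a JMS server, here is a small in-memory simulation. It is not the JMS API and not code from the attached script. SelectiveQueue and Msg are hypothetical stand-ins that mimic only the two behaviours described above: picking out the message with a matching correlation ID, and returning null when the timeout expires.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical in-memory stand-in for a response queue; this is NOT the JMS API.
class SelectiveQueue {

    // Minimal message: the one header we select on, plus a body.
    static final class Msg {
        final String correlationId;
        final String body;

        Msg(String correlationId, String body) {
            this.correlationId = correlationId;
            this.body = body;
        }
    }

    private final List<Msg> messages = new ArrayList<>();

    synchronized void put(Msg m) {
        messages.add(m);
        notifyAll(); // wake any receiver that is waiting for a match
    }

    // Returns the first message whose correlation ID matches, skipping over any
    // earlier messages meant for other virtual users. Returns null on timeout.
    synchronized Msg receive(String correlationId, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            for (Iterator<Msg> it = messages.iterator(); it.hasNext();) {
                Msg m = it.next();
                if (m.correlationId.equals(correlationId)) {
                    it.remove();
                    return m;
                }
            }
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return null; // same contract as a timed-out receive
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
    }
}
```

With two responses on the queue, a receive for the second one's correlation ID returns that message even though it arrived later, which is exactly what you want when many virtual users share one response queue.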

Here is the Tibco EMS log entry for the response message. You can see that the message has a different JMSMessageID, and has a JMSCorrelationID with the message ID of the request it is replying to.

2007-02-26 13:20:09 [MSG:73425]
received from user='anonymous':
connID=1714
prodID=19956
msgID='ID:SVP-EMS-SERVER.391E45DA4ADA6F5:7835'
Time=1172456409014
mode=PERSISTENT
queue='MyLoadTest.Warehouse.Response'
msg=BytesMessage=
{
  Header=
  {
    JMSDestination={QUEUE:'MyLoadTest.Warehouse.Response'}
    JMSDeliveryMode={PERSISTENT}
    JMSPriority={4}
    JMSMessageID={ID:SVP-EMS-SERVER.391E45DA4ADA6F5:7835}
    JMSCorrelationID={ID:SVP-EMS-SERVER.391E45DA4ADA6C8:3083}
    JMSTimestamp={1172456409014}
  }
  Properties=
  {
    "SOAPJMS_requestIRI"={string:''}
    "SOAPJMS_contentType"={string:'text/xml'}
    "SOAPJMS_soapMEP"={string:'http://www.w3.org/2003/05/soap/mep/request-response/'}
    "SOAPJMS_bindingVersion"={string:''}
  }
  Body=
  {
    byte[]:681 bytes
  }
}

Extracting contents of the Message body from the JMS Message is done by calling myMessage.getText() if it is a TextMessage (if it is a BytesMessage, it is a bit trickier – see the attached script).

And finally comparing the received SOAP message to the expected message is just done using String.equals().

Remember to close the Session and Connection at the end of the script.

Modifying the script to send many different messages to different queues (which would mean you only have to maintain one script instead of one script per SOAP operation) is left as an exercise for the reader…

Java Thread Dump

April 10th, 2007

A Java thread dump is a way of finding out what every thread in the JVM is doing at a particular point in time. This is especially useful if your Java application sometimes seems to hang when running under load, as an analysis of the dump will show where the threads are stuck.

You can generate a thread dump under Unix/Linux by running kill -QUIT <pid>, and under Windows by hitting Ctrl + Break.

A great example of where this would be useful is the well-known Dining Philosophers deadlocking problem. Taking example code from Concurrency: State Models & Java Programs, we can cause a deadlock situation and then create a thread dump.

Dining Philosophers applet screenshot

In the example below (shown using tda), we can see that the 5 Philosopher threads each have a lock on a Fork object and are each waiting to obtain a lock on a second Fork object before they can eat. Unfortunately this never happens and all the philosophers starve.
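
If you want to reproduce this kind of deadlock yourself, here is a cut-down sketch with two threads and two locks instead of five philosophers. It is not the example code from the book. The class and method names are hypothetical, and it assumes Java 8 or later (for the lambdas). It asks the JVM which threads are deadlocked using ThreadMXBean.findDeadlockedThreads(), which reports the same lock cycle a thread dump would show.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical two-thread reduction of the Dining Philosophers deadlock.
class DeadlockDemo {

    // Takes `first`, waits until the other thread holds its own first lock,
    // then blocks forever trying to take `second`.
    private static void eat(Object first, Object second, CountDownLatch bothHoldOne) {
        synchronized (first) {
            bothHoldOne.countDown();
            try {
                bothHoldOne.await();
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                // Never reached: the other thread owns `second`.
            }
        }
    }

    // Starts the two threads and returns how many threads the JVM reports
    // as deadlocked (0 if no deadlock is detected within about 5 seconds).
    static int deadlockedThreadCount() {
        Object forkA = new Object();
        Object forkB = new Object();
        CountDownLatch bothHoldOne = new CountDownLatch(2);

        Thread p1 = new Thread(() -> eat(forkA, forkB, bothHoldOne), "Philosopher-1");
        Thread p2 = new Thread(() -> eat(forkB, forkA, bothHoldOne), "Philosopher-2");
        p1.setDaemon(true); // daemon threads so the stuck threads do not keep the JVM alive
        p2.setDaemon(true);
        p1.start();
        p2.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = threads.findDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }
}
```

Taking a thread dump while these two threads are stuck should show each one holding one lock and waiting on the other, just like the Philosopher threads above.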

Thread dump of the Dining Philosophers from Thread Dump Analyzer

Download the thread dump from here (7 KB).

Note that not all hangs are going to be due to deadlocks, and there are many tools (including Eclipse) that will help you analyse thread dumps.

Free packet sniffer

November 9th, 2006

Yet another application to add to your performance testing toolkit is Wireshark (previously called Ethereal). A packet sniffer like Wireshark will let you see your network traffic at all the different protocol layers.

Wireshark - a free packet sniffer

A naturally curious person might use it to investigate why the lights on their router started flashing all the time just after they installed that really useful piece of software that let them change their mouse cursor.

As a performance tester, I use it when I’m trying to figure out what the application I am trying to record with LoadRunner is sending and receiving, and the VuGen log doesn’t contain enough information (and I can’t get it by recording in Winsock mode). 99% of the time this isn’t necessary but occasionally, when the application uses some sort of weird (or custom) protocol, it helps to show what is going on.

Recently I was running a LoadRunner training session and was asked to take a look at the application the company was developing. It was a web-based application that had an embedded ActiveX object that also communicated with the server. Recording just HTTP did not record any traffic to or from the object. Recording with the Web/Winsocket Dual Protocol vuser type recorded a message from the object to the server (as well as all the HTTP traffic from the web page), but no communication from the server to the object; nothing appeared in the recording log either.

To double check that the object really was getting information from the server, rather than values being passed from the HTML, a network trace was run. This showed that there really was a message being sent to the ActiveX object, and that for some reason VuGen was not recording it correctly. In this situation, it is easy to give Mercury Support your script and your tracefile and they will pass it on to the R&D team who will usually give you a work-around or write you a patch in a couple of days.

So, anyway, sometimes you will come across an application that talks to the server in some kind of unconventional way; and sometimes a packet sniffer helps you figure out what is going on. Wireshark is your best option because it is free and its feature set is comparable to that of expensive proprietary tools like Sniffer Pro.

Download it from the Wireshark website.

Some other tools that I have talked about previously:

LoadRunner ContentCheck Rules

November 3rd, 2006

A LoadRunner feature that has made my life a lot easier has been ContentCheck rules, which are available in the script runtime settings. If you are using a web-based vuser type, you can configure your LoadRunner script to search through all returned pages for strings that match known error messages.

LoadRunner web content check rules

Using web_reg_find functions is fine, but when you get an error LoadRunner reports it as “failed to find text” instead of something more descriptive.

I will always create rules for any error messages I find during scripting and, if I receive an error while running a scenario, I will add the error message from the snapshot in the scenario results directory (the snapshot on error feature is very useful).

All this is pretty obvious if you have taken the time to explore LoadRunner’s features or you have attended a Mercury training session, but I recommend taking things a step further.

  • Ask your developers for a list of all the error messages that the application can throw. This should be easy for them to provide if the application is well designed and stores all the messages in some kind of message repository instead of sprinkling them throughout the source code.
  • Include error messages for functional errors that you are likely to encounter. Creating a rule for “incorrect username or password” may save someone 20 minutes of investigation when they first run the script after the database has been refreshed.

If you prefer to have the error messages you are checking for in the script (where you can add comments to them) instead of the runtime settings, you can use the web_global_verification function instead. The only difference between the two is the error message that LoadRunner will include in its log:

Action.c(737): Error -26368: “Text=A runtime error occurred” found for web_global_verification (“ARuntimeErrorOccurred”) (count=1), Snapshot Info [MSH 0 21]

…compared to:

Action.c(737): Error -26372: ContentCheck Rule “ARuntimeErrorOccurred” in Application “Webshop” triggered. Text “A runtime error occurred” matched (count=1), Snapshot Info [MSH 0 21]

And finally, ContentCheck rules can be easily exported and shared between scripts, which can be a nice time-saver.

Free WAN emulator

November 3rd, 2006

Mercury has just announced that they will no longer be re-selling the Shunra WAN Emulator. This means that I can let you in on a little secret – you can get most of the WAN emulation features for free by using a simple open-source program.

As performance testers, we know that an application operating over the network will perform poorly if there is not enough bandwidth available. We also know that response times for some applications are more affected by latency than others no matter how much bandwidth you have (e.g. an interactive multiplayer game like Quake is playable over a 56k modem, but completely useless over a satellite link that has 10 times the bandwidth). Sidenote: a good job interview question for a performance tester might be to explain the difference between latency and bandwidth, and their impact on application performance.
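
A back-of-the-envelope model makes the difference clear. The sketch below estimates the time for an exchange as (round trips × round-trip latency) + (payload size ÷ bandwidth). It is a deliberately crude model that ignores protocol overhead, TCP slow start and packet loss, and the class and method names are made up for illustration.

```java
// Hypothetical back-of-the-envelope estimate; ignores protocol overhead,
// TCP slow start, packet loss and everything else that makes real links messy.
class LinkEstimate {

    // roundTrips:    number of request/response exchanges needed
    // rttSeconds:    round-trip latency of the link, in seconds
    // payloadBytes:  total bytes transferred
    // bitsPerSecond: link bandwidth
    static double seconds(int roundTrips, double rttSeconds,
                          long payloadBytes, double bitsPerSecond) {
        double latencyCost = roundTrips * rttSeconds;
        double transferCost = payloadBytes * 8.0 / bitsPerSecond;
        return latencyCost + transferCost;
    }
}
```

As an illustration (the latency figures are assumptions, not measurements), compare a modem at 56,000 bits per second with 0.15 seconds of round-trip latency against a satellite link at 560,000 bits per second with 0.6 seconds. For a 100-byte game update, latency dominates and the modem comes out well ahead. For a 1,000,000-byte download, bandwidth dominates and the satellite link wins easily.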

Load testing with bandwidth limitations is easy; LoadRunner gives you this feature for free. Latency is harder as it requires either a real WAN link, or something to introduce an artificial delay. An artificial delay can be introduced by a black box that you plug into your network (like those offered by Anue, East Coast DataCom, and Apposite Technologies) or by a piece of software like the Shunra WAN Emulator.

Shunra WAN Emulator

The free alternative to Shunra’s software is Dummynet, which was created by an Italian academic researcher.

Unfortunately Dummynet only runs under FreeBSD, but a tiny version of FreeBSD with Dummynet that fits on a bootable floppy disk is available for download. Personally, I haven’t seen a floppy disk for years and I don’t quite trust that FreeBSD (let alone a tiny version of FreeBSD) will support the variety of hardware it will encounter.

My preferred solution is to install FreeBSD as a guest operating system inside VMware. The hardware in the virtual machine is virtualised, so you don’t have to worry about driver support, and it is easy to distribute a VMware image between computers. The only other thing you will need to do is to set up a second network card in your computer and add it to the virtual machine.

FreeBSD running inside VMware

The good thing about this solution is that it makes it easy to demonstrate latency and bandwidth-related performance problems manually, rather than expecting people to just accept your tool’s measurements. The only tricky part may be getting permission to plug your laptop into the network (or install the software) at a client site.

Using open source tools for performance testing (Google video)

November 2nd, 2006

I just found something interesting on Google Video; it is a 1 hour Google TechTalk presentation by Goranka Bjedov about Using Open Source Tools for Performance Testing.

Goranka has some interesting things to say. She makes the point that there is really no standard terminology in performance testing circles, and goes on to prove this by giving her own definitions of performance, stress, load, scalability, and reliability testing. As an example of reliability testing she notes that “typically, when I was at AT&T, we would run for about a month at a time after everything was done just to find out that the system can actually stand up and can work fine with the load for a prolonged period of time.” In my testing circles, we would call that a soak test, but I would have been interested to hear more about the types of systems she was testing at AT&T.

The main body of her talk is about the different tools available for load and performance testing. These can be broken down into in-house, vendor and open-source tools.

Google has 55 in-house load and performance testing tools that have been developed by different groups for testing different Google products. These are very expensive to maintain and may only be used in-house, which makes any benchmarks impossible to verify by a third-party. Goranka says “Before you decide to develop your own, please take a look at what is out there…”

Goranka slams vendor tools (like LoadRunner, SilkPerformer and WebLOAD) for being overly expensive and using proprietary scripting languages. Personally, I have always thought that it was pointless having to learn another language just to use a load testing tool. Unfortunately she uses LoadRunner’s scripting language as an example (“it’s C, minus the pointers”), and she is incorrect: unlike many other tools, Mercury uses standard C (and Java and VBScript).

Her recommended solution is the open source tools – “five years ago they just weren’t there, but today they are.” Her personal preference is JMeter, but she also recommends OpenSTA and The Grinder. Open-source tools have the advantage of being a good price, and having source code available; she also makes the point that they use standard programming languages for scripting (although this is incorrect when talking about OpenSTA).

The disadvantages of open-source tools are that they have a steep learning curve and do not support many protocols. “The vendor tools support far more protocols than the open-source tools, but as long as you are staying in the web space, and you’re looking at HTTP/S, IMAP and POP3, the open-source tools are pretty good”.

Goranka does not say that the open-source tools are free because it is occasionally necessary to write code to extend their features. “Free software is free in the sense that a puppy is free.” Features that Google engineers have written for JMeter have been added back to the main code tree by the people maintaining the project, meaning that Google is at least saved the cost of maintaining their forked code.

She uses JMeter for testing web-based applications through the GUI, uses The Grinder for API-based testing, and does not use OpenSTA because it only works on Windows.

Other points during the presentation:

  • You should use the same monitoring for Load testing that you use for Production monitoring (so you don’t have to account for the differences in load that a different monitoring system will put on the system).
  • If you are running Unix-based systems, don’t sustain CPU above 80%.
  • Google tracks a summary of every performance test in a central database. The database also contains information on every piece of software that is installed on the machines in the test environment.
  • If I am unfamiliar with the system, I don’t trust it. One of the things that I have realised is that
    A) the system will fail in the place where they tell me that nothing could go wrong.
    B) developers are totally delusional about their own software, and frequently they will just forget about things that they’ve done two weeks ago.
  • I run every test 5 times. I want to see that I have some sort of statistical consistency.
  • Performance testing should not be used as a tool to find memory leaks; but it can.
  • Performance testing without monitoring? Don’t bother. Why waste your time?
  • If you are going to do any performance testing, make sure that database sizes are somewhat realistic. They don’t have to be exactly the same, but they have to be the same order of magnitude otherwise the results you are getting are completely off.
  • Execute a stress test. Find how your system is failing. Find where it is failing. Do find out how the system handles overload. There are no good defence mechanisms against people out there, and you can’t predict sudden popularity (e.g. Google Earth).
  • Start a test after a decent warm-up period. Don’t start 100000 users all at once.
  • Quite often people don’t know about everything that is running on a complex system. Maybe there is a low priority process that is running with high priority. This can usually be fixed by niceing the process down. Quite often there are debug things that are still running also.
  • Monitor the machines that are collecting the monitoring data and the load generators (not just the system under test).
  • Performance Testing and QA is about risk analysis. If I believe it is high risk, I want to take a look at it.
  • When I am doing performance testing, the first thing I try to do is eliminate the network. I want to simplify my problem. I am interested in the machines, and my hope is that the network provided will handle everything I need. Once everything is profiled and understood, we will do some tests that include the network. If you can, put everything on the same subnet and same switch. It will make you a much happier performance tester in the first pass. Debugging networking problems is (not) fun.
  • (When talking about testing on smaller sized systems than Production). You can’t test on a 386. Extrapolation will kill you. You will run out of some resource that you never expected, and you can’t predict this ahead of time. For final validation, you really want to get some time on the Production hardware before it goes live. If the system is not being used for Production, it should not be that hard to get hold of it for a week or a weekend.
  • Find more open-source performance tools at opensourcetesting.org
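
The “run every test 5 times” point can be made concrete with a simple consistency check. The sketch below is my own illustration, not something shown in the talk. It takes one summary figure per run (say, the average response time of a key transaction) and calculates the coefficient of variation across the runs. The class name is hypothetical, and what counts as “consistent enough” is a judgement call for your own system.

```java
// Hypothetical helper for checking that repeated test runs agree with each other.
class RunConsistency {

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Sample standard deviation (divides by n - 1); needs at least two runs.
    static double sampleStdDev(double[] values) {
        double m = mean(values);
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += (v - m) * (v - m);
        }
        return Math.sqrt(sumOfSquares / (values.length - 1));
    }

    // Standard deviation relative to the mean; smaller means more consistent runs.
    static double coefficientOfVariation(double[] values) {
        return sampleStdDev(values) / mean(values);
    }
}
```

If five runs give response times of 2.0, 2.1, 1.9, 2.0 and 2.0 seconds, the coefficient of variation is only a few percent. If one run is wildly different from the others, the figure jumps and you know to investigate before reporting any results.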

There is another summary of her talk available on Robert Baillie’s blog.

You might also want to have a look at Becoming a Software Testing Expert; a 1 hour presentation delivered by software testing expert, and author of Lessons Learned in Software Testing, James Bach on June 13, 2006. His presentation is available for download from his website.