Practical Thoughts on Equivalency Class Partitioning and Boundary Testing

I recently had a friend ask for my thoughts on some software testing methodologies / tools and within that the topic of Equivalence Classes and Boundaries came up. The book he was using, while highly recommended, did a rather poor job of providing any practical example of when these tools would be used and why they were useful. Because of this, he wasn’t even sure if they had much practical value and was wondering if I had any input.

I’ll start this article with the same thing I told him: this is one of the most useful software testing tools you could possibly have! It is definitely not esoteric and it is certainly something you want to work through to understand.

Books don’t always do topics justice, and his book in particular didn’t do an “awful” job of presenting things, but it was very “text book”. Using a simple yet realistic example helped him and I’m hoping the same will be true for the readers of this post.

Guess The Number Game Example
Let’s say you have a program that randomly selects a number between 1 and 10 (inclusive) and then asks you to guess the number. Expected input by the user is any integer value between 1 and 10 (inclusive).

Thinking about this logically we have “good values” for the number between 1 and 10. Anything outside of this range is “bad”. This means that any value less than 1 is “bad”. Any value greater than 10 is “bad”. This yields three “equivalency classes” as shown below.

Equivalency Class / Partition   Guessed Number
Too Low                         x <= 0
Good                            1 <= x <= 10
Too High                        11 <= x
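
The three partitions translate directly into code. Here is a minimal Python sketch (the partition helper is hypothetical, not taken from any real program):

```python
def partition(x):
    """Map a guessed number to its equivalency class."""
    if x <= 0:
        return "Too Low"
    elif x <= 10:
        return "Good"
    else:
        return "Too High"

# One representative value per class stands in for the whole class.
assert partition(-284) == "Too Low"
assert partition(7) == "Good"
assert partition(5396) == "Too High"
```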

The basic idea is this: testing any value from within one of these Equivalency Classes is the same as testing any other value within it. In other words: the program should behave the same way for one value within the Class as it does for all values within the Class. By testing a part of the Class, you are in fact testing the whole Class.

Extrapolating further: to test a single value from each Class is to test every value in every Class. This is the same as testing all possible values but clearly takes a lot less time!

This is obviously an oversimplification because the program could react differently to values within the same Class. (There could be special logic for x == -5.) However, trying to test all the values would take too much time, and testing is about balancing risk vs. reward. Yes, there is a risk of missing special logic known only to the programmer, but the chance is low, and the reward for testing so many extra values would in no way compensate for the time spent. You’ll never find all the bugs, but testing using Equivalency Classes will help find many of them.

Examples of defects that could only be caught by testing every value include the Pentium Division Bug (per “Also, only certain numbers (whose binary representation show specific bit patterns) divide incorrectly. Consequently many users may never encounter the division error.”) and the Microsoft Excel Display Bug (only 12 out of 18,446,744,073,709,551,616 possible floating point binary numbers demonstrate the bug). Equivalency Classes / Value Partitioning will never catch these kinds of errors, and while they attract a lot of media attention, they really didn’t impact that many people in real life.

Boundaries and Catching Some of that “Special Logic”
While I just said that we don’t want to spend loads of time trying to find “special logic” within an Equivalency Class, there are certain areas within and between Equivalency Classes where special logic often creeps up. Based on historic evidence and some common sense, it would make sense to test these areas because the risk vs. reward balance makes sense. These areas are the “Boundaries” and are composed of “Boundary Values”.

Looking at the table above, we can begin to think that somewhere in the program, the programmer is going to have to handle the special Boundaries between the Equivalency Classes very carefully. For instance, the program needs to reject 0 but accept 1. Likewise, it needs to accept 10 but reject 11. This allows us to expand our test values as shown below.

Value Type         Value Name                        Number Range   Test Value
Equivalency Class  Too Low Class                     x < 0          -284
Boundary           Too Low to Good Lower Boundary    x = 0          0
Boundary           Too Low to Good Upper Boundary    x = 1          1
Equivalency Class  Good Class                        1 < x < 10     7
Boundary           Good to Too High Lower Boundary   x = 10         10
Boundary           Good to Too High Upper Boundary   x = 11         11
Equivalency Class  Too High Class                    11 < x         5,396

Notice that we’ve narrowed the Equivalency Classes just a little to exclude the boundary values. Since we’re testing those values as special Boundary Cases, we don’t want to pick the same values to represent the Equivalency Classes. We’re also testing 7 data points now rather than only 3. This is the end of our value selection because this is the smallest number of test points that will yield the highest number of defects. Programmers often make mistakes by not considering an Equivalency Class (“The user will never enter a negative number, so why handle it?”) or by mishandling Boundaries (IF x > 1 AND x < 10). The points above will exercise the weakest areas of the program, exposing any failure on the part of the developer.
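
The seven test points could drive a simple table-driven test. A sketch in Python follows; the validate_guess function is hypothetical, and a buggy version of it (say, the IF x > 1 AND x < 10 mistake) would fail exactly at the boundary points 1 and 10:

```python
# The seven test points from the table above, paired with the
# behavior we expect from a (hypothetical) guess validator.
TEST_POINTS = [
    (-284, False),  # Too Low class
    (0,    False),  # Too Low to Good lower boundary
    (1,    True),   # Too Low to Good upper boundary
    (7,    True),   # Good class
    (10,   True),   # Good to Too High lower boundary
    (11,   False),  # Good to Too High upper boundary
    (5396, False),  # Too High class
]

def validate_guess(x):
    # A correct implementation; writing "1 < x <= 10" or
    # "1 <= x < 10" instead would be caught by the boundary points.
    return 1 <= x <= 10

for value, expected in TEST_POINTS:
    assert validate_guess(value) == expected, value
```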

The Infamous Integer and Real Life Web Application Madness
So, what if you don’t have any clear Equivalency Classes or Boundaries within a program? What if you’re dealing with a program that accepts an integer input for Total Cost (cents are always “.00”) and, as far as you can see, “the sky’s the limit”? Without any stated Classes / Boundaries, where do you begin with testing?

This is where a little bit of programming knowledge will take you a long way as a tester. You won’t run into this every day, but when you do you will shine above the rest! (In my ever-so-humble opinion.)

First, a real-life example that occurred while I was in SQA. Someone was testing a web application and ran into an “unbounded” integer field that was for Total Cost or Total Hours or some such. As a way of testing, they entered the value 9,999,999,999. They then clicked Next and Back. When they came back, their outrageously large value had been “transformed” into the value 1,410,065,407!

The questions: What was happening? Why such an odd transformation? Furthermore, if 9,999,999,999 wasn’t accepted, what should be accepted?

The answers: the value was being truncated in binary, binary truncation yields results that are difficult for mortals to comprehend, any value between 0 and 4,294,967,295.

The why: If you look at the number 9,999,999,999 in binary, you get the following value.

10 0101 0100 0000 1011 1110 0011 1111 1111
What’s very interesting to note about this binary number is that it is 34 bits long. As we’ll discuss below, most programs use standard sized storage units that are either 8, 16, 32 or (more recently) 64 bits large. 34 bits crosses a “Boundary” from a 32-bit integer to a 64-bit integer (from a DWORD to a QWORD). Depending on exactly what size storage unit the program is using, this could pose a problem.

Let’s look at the outrageously long binary value in another way:

Bits 33 & 34   Bits 25 to 32   Bits 17 to 24   Bits 9 to 16   Bits 1 to 8
10             0101 0100       0000 1011       1110 0011      1111 1111

If binary truncation were to occur, it would happen because the back-end storage unit is either 8, 16 or 32 bits long and can’t hold the entire 34-bit number 9,999,999,999. Let’s look at what each of these values would be in decimal given the digits above. One of them is very interesting; namely, the very strange value into which the web application was transforming 9,999,999,999!

Data Type Size  Data Type Name          Binary Digits                               Decimal Digits
8-bit           BYTE                    1111 1111                                   255
16-bit          WORD                    1110 0011 1111 1111                         58,367
32-bit          DWORD (Double WORD)     0101 0100 0000 1011 1110 0011 1111 1111     1,410,065,407
64-bit          QWORD (Quadruple WORD)  10 0101 0100 0000 1011 1110 0011 1111 1111  9,999,999,999

In other words: even though the user was entering 9,999,999,999 the input was only being stored in a 32-bit storage unit. The 33rd and 34th binary digits were being lost and we were only getting the 1st through 32nd digits. This yielded a truncated value that fit into a 32-bit storage unit but in no way represented the original value!
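
This truncation is easy to reproduce. A quick check in Python (which has arbitrary-precision integers, so we mask off the bits explicitly):

```python
value = 9_999_999_999

# The value needs 34 bits of storage.
assert value.bit_length() == 34

# Keeping only the low 32 bits reproduces the web application's
# mysterious "transformation".
truncated = value & 0xFFFFFFFF
assert truncated == 1_410_065_407

# Truncating to 16 or 8 bits gives the other rows of the table.
assert value & 0xFFFF == 58_367
assert value & 0xFF == 255
```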

What Does All This Mean?
It means that many computer programs have Boundaries and Equivalency Classes even if they are not stated. Depending on the storage unit backing a particular input field, only certain values can be accepted. How a program handles input outside of these ranges is critical. In the case of the web application above, it failed to realize the value was outside of its maximum range. It should have checked the value and ensured it was within range before accepting the input.

Floating point ranges (those involving the decimal point) are a bit harder to test, but they are no less important. However, for purposes here, we’ll just focus on integers. Handling integers gets programmers into more trouble on a routine basis than floating point numbers. I speak from experience.

So, here are ranges for the different fundamental integer data types. What makes things a bit more convoluted is that these data types can hold either signed or unsigned data. The funny thing is, the underlying data type is actually identical! Signed vs. unsigned is a matter of “interpretation” of the value (specifically, the Most Significant Bit or MSB). Don’t worry yourself too much with such esoteric programmer rubbish except to the extent to know that you should probably always test both signed and unsigned ranges, just to see if the programmer misinterprets the value somewhere along the way.
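
A quick Python illustration of that “interpretation”: the standard struct module lets us write a bit pattern and read it back as either signed or unsigned, showing that the stored bits never change:

```python
import struct

# The same 8-bit pattern 0xFF is 255 unsigned but -1 signed.
raw = bytes([0xFF])
(unsigned,) = struct.unpack("B", raw)  # B = unsigned byte
(signed,) = struct.unpack("b", raw)    # b = signed byte
assert unsigned == 255
assert signed == -1

# Likewise for 32 bits: the unsigned maximum reads back as -1 signed.
raw32 = struct.pack("<I", 4_294_967_295)
assert struct.unpack("<i", raw32)[0] == -1
```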

Integer Data Type  Signed Range                                                Unsigned Range
BYTE (8-bit)       -128 to +127                                                0 to 255
WORD (16-bit)      -32,768 to +32,767                                          0 to 65,535
DWORD (32-bit)     -2,147,483,648 to +2,147,483,647                            0 to 4,294,967,295
QWORD (64-bit)     -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807    0 to 18,446,744,073,709,551,615
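
All of these ranges follow directly from the bit width, so you can compute your “standard” testing points rather than memorize them. A small Python sketch:

```python
def integer_ranges(bits):
    """Signed and unsigned ranges for a storage unit of the given width."""
    signed = (-(1 << (bits - 1)), (1 << (bits - 1)) - 1)
    unsigned = (0, (1 << bits) - 1)
    return signed, unsigned

# These reproduce the table above.
assert integer_ranges(8) == ((-128, 127), (0, 255))
assert integer_ranges(16) == ((-32_768, 32_767), (0, 65_535))
assert integer_ranges(32) == ((-2_147_483_648, 2_147_483_647),
                              (0, 4_294_967_295))
```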

I’m running way too long on this (as usual), but one last word of advice may help get you pointed in the right direction. We can talk about this more later if you’d like. You can use the above table to pick “standard” testing points. For example, knowing that something “funny” might happen if a program is using an 8-bit signed integer, you should probably always test the values -129 and -128. Finding the above error in the web application would come from testing the upper boundary of a 32-bit unsigned integer; namely, 4,294,967,295 and 4,294,967,296. In that case, you would have found that 4,294,967,296 turned into 0!

I hope this helps someone out there; it certainly was useful to my friend and a few others who ended up receiving a copy. I’m sure there are other ways to view this, and there are definitely plenty more (and probably better) examples out there, but this was something I put together and wanted to share with others who are interested.

Until next time,

Posted in Work

Sierra Wireless Overdrive™ Pro 3G/4G Mobile Hotspot (802S) with LinuxMint 13 (Maya)

So, I’ve recently signed up for a Virgin Mobile plan that gives me 2GB of data per month for just $35. I know, you’re probably asking how much 2GB can do for you. Well, I had my doubts as well, but nearly 20 days into my 30-day allotment, I’ve hardly used 500 MB. Now, this isn’t my primary connection. I’m using it to supplement my rather limiting connection at my current contract gig (lots of filtering). But at the end of the day, it’s doing what I need and $35 a month is a small price to pay for the freedom to move files back and forth from my server back home or send out the ever-so-critical email using those commonly filtered mail services, GMail and Hotmail.

At the same time, I’ve also moved to LinuxMint 13 (Maya). Sure, I’ve been able to connect to my Sierra Wireless 802S device over its built-in WiFi broadcast (it’s a mobile hotspot), but I wanted to see if I could get it to work straight through the USB connection. It simplifies the connection process, speeds things up a little and can also eliminate the need to turn the WiFi broadcast on. This is useful if I only need to connect one device (almost always the case) and would rather not have people trying to crack into my wireless network.

Well, I tried plugging it in a couple of times and almost immediately the OS would crash. No, not just the desktop manager (I’m using KDE) or the network manager, the whole Linux OS became unresponsive. This definitely isn’t a good sign and indicates a low-level kernel mode driver issue. Not pretty and not something I really have a lot of time to deal with. Oddly enough, as fate would have it, things did magically work yesterday afternoon. I had recently applied updates to Mint and figured that was what caused the change. I was really excited about coming in this morning and getting connected through a USB cable instead of a one-to-one WiFi network.

Until I plugged in and Linux froze again. What was going on? Well, luckily again /var/log/syslog contained lots of useful information. Comparing the two logs, I found this difference in the sequence of events.

The Bad Sequence of Events

kernel: [ 37.072075] usb 2-3: new high-speed USB device number 5 using ehci_hcd
mtp-probe: checking bus 2, device 5: "/sys/devices/pci0000:00/0000:00:1d.7/usb2/2-3"
mtp-probe: bus: 2, device: 5 was not an MTP device
kernel: [ 37.262305] usbcore: registered new interface driver usbserial
kernel: [ 37.262342] USB Serial support registered for generic
kernel: [ 37.262796] usbcore: registered new interface driver usbserial_generic
kernel: [ 37.262799] usbserial: USB Serial Driver core
kernel: [ 37.267187] USB Serial support registered for Sierra USB modem
kernel: [ 37.267364] sierra 2-3:1.0: Sierra USB modem converter detected
kernel: [ 37.267855] usb 2-3: Sierra USB modem converter now attached to ttyUSB0
kernel: [ 37.269565] sierra 2-3:1.1: Sierra USB modem converter detected
kernel: [ 37.269959] usb 2-3: Sierra USB modem converter now attached to ttyUSB1
kernel: [ 37.270131] usbcore: registered new interface driver sierra
kernel: [ 37.270135] sierra: v.1.7.16:USB Driver for Sierra Wireless USB modems
kernel: [ 37.330024] usbcore: registered new interface driver cdc_ether
modem-manager[804]: <info> (ttyUSB1) opening serial port...
modem-manager[804]: <info> (ttyUSB0) opening serial port...
kernel: [ 37.383446] sierra ttyUSB0: sierra_submit_rx_urbs: submit urb failed: -8

The Good Sequence of Events

kernel: [23840.000102] usb 2-2: new high-speed USB device number 5 using ehci_hcd
mtp-probe: checking bus 2, device 5: "/sys/devices/pci0000:00/0000:00:1d.7/usb2/2-2"
mtp-probe: bus: 2, device: 5 was not an MTP device
kernel: [23840.248888] usbcore: registered new interface driver usbserial
kernel: [23840.248908] USB Serial support registered for generic
kernel: [23840.248961] usbcore: registered new interface driver usbserial_generic
kernel: [23840.248964] usbserial: USB Serial Driver core
kernel: [23840.275138] cdc_ether 2-2:1.0: eth1: register 'cdc_ether' at
        usb-0000:00:1d.7-2, CDC Ethernet Device, 00:a0:d5:ff:ff:af
kernel: [23840.275503] usbcore: registered new interface driver cdc_ether
kernel: [23840.287483] USB Serial support registered for Sierra USB modem
kernel: [23840.287535] usbcore: registered new interface driver sierra
kernel: [23840.287538] sierra: v.1.7.16:USB Driver for Sierra Wireless USB modems
NetworkManager[834]: SCPlugin-Ifupdown: devices added (path:
        /sys/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/net/eth1, iface: eth1)

As can be seen, the critical difference is in the order that the USB core loads the cdc_ether and sierra drivers. In the bad sequence, sierra gets loaded first and subsequently drops into tty (aka “serial cable emulation”) mode. Things go bad quickly and apparently the whole kernel locks up. (Yes, even the other consoles…F1 through F6…lock up.)

In the good sequence, cdc_ether is loaded first and lays claim to the device. The new eth1 interface is configured and all is well. After doing some research, I decided to give something a go. What if we just disabled the sierra driver? It seemed like this latest Sierra Wireless device didn’t really need that driver as it could communicate via USB CDC. What could it hurt? Nothing, I figured, so I opened up /etc/modprobe.d/blacklist.conf (don’t forget to invoke vi with sudo!) and went to work. Here’s what I added at the bottom.

# added 2012-12-04 - prevent sierra USB driver from loading
blacklist sierra

After a reboot (via “sudo telinit 6”…how else would you do it?), I plugged in my device again. And now I’m finishing up this post on my Sierra Wireless connection via USB in LinuxMint.


Posted in Uncategorized

Microsoft Moles Crashes while Multithreaded

So I’ve been doing a lot of C#.NET lately and found one of the coolest tools ever earlier this week: Microsoft Moles. Seriously, this stuff is awesome.

The name’s a little strange. I mean, come on, when’s the last time you used the word “mole” in a good way? And when’s the last time you used it more than 25 times in a day? Yeah, that’s what I’m thinking, but still, the sacrifice of saying “mole” that many times is more than worth the power of this tool.

I’m not going to get into how to use Moles because there are all sorts of great resources for that. The one I found the most useful by far was over on Didactic Code by Dave Fancher. His article is excellent and I highly recommend reading it before beginning to use Microsoft Moles.

The main reason I’m posting is to share my rather troublesome experience getting started with Microsoft Moles. By default, the Moles “compiler” (moles.exe) will use multiple threads to reduce the overall time it takes to create the stub and/or mole assemblies. This can lead to a significant time savings because generating these assemblies takes a fair amount of time. Even when you restrict the framework to generating stubs or moles for only those classes you absolutely need, it can grind for a good bit of time (15-30 seconds per assembly…that’s an eternity when working on a small solution). Obviously, multithreading this costly operation is an awesome built-in feature.

That is to say, when it doesn’t crash.

But that’s exactly what it did to me. Not all the time, but 90% of the time, when you executed a build, moles.exe would crash. It left Visual Studio in a confused state until you went and manually killed the moles.exe process. Seeing as how Moles was just replaced by the built-in Fakes framework in Visual Studio 2012, I didn’t see much point in trying to find a patch. The Microsoft Moles site tells you point-blank that Moles is no longer developed. Oh well, sucks to be me here in a shop still using Visual Studio 2008…!

Or maybe not. Maybe I could isolate the issue…perhaps it’s the fact that it’s running in Visual Studio? So I tried running straight from the command line and received the same result. To further narrow down the possible causes, I kept editing the “moles.args” file which VS produces in the obj\Debug\Moles folder. I started removing the CLR specifications first to no avail. Then the multiple references (this project I was working on had a lot of those). No dice. Finally, I started reducing the number of mole assemblies being built until I got it down to one. Bingo! Success every time.

OK, so things will work if I’m only producing one mole assembly at a time. But it just so happens I’m not doing that…I’m building seven of them to support these unit tests. As soon as I add two, my chances of not crashing fall from 100% to less than 10%. I don’t care what the stakes are, those odds are horrible. That’s when I started looking into the moles.exe arguments and found the ProcessCount argument (short form, “/pc”). Per “moles.exe help compilation”, ProcessCount “[s]pecifies the number of concurrent processes to generate Moles”. I figured it couldn’t hurt…let’s try setting “/pc:1” in the moles.args file. After this much digging, I can hardly say I was surprised when it worked! Multithreading might not be such a good idea after all!

(My sarcastic tone here may imply I’m ungrateful to all those who’ve done research and development for Microsoft Moles. Let me state in no uncertain terms that this could not be farther from the truth. Such a minor issue in my book should not be held against a tool of this power. Moles has forever changed how I will approach Unit Testing in .NET and I plan on using it and its companion Fakes framework for years to come.)

So how can we get back into Visual Studio 2008 with the “/pc:1” argument set? I’m not sure if there is a better way, but here’s how I did it. I edited the “C:\Program Files\Microsoft Moles\bin\Microsoft.Moles.targets” file and changed one line. Inside the <GenerateMoles/> tag (the file is XML) there are several attributes. The one to change is CommandLineArguments. (The change is the /pc:1 argument added at the end.)

CommandLineArguments="$(MolesCommandLineArguments) /pc:1"

Sure enough, compilation of the moles assemblies in Visual Studio now works like a charm. Sure, it’s a little slower, especially the first time around, but it’s a small price to pay for stability. And that stability gives me a power I once only dreamed would be possible. Yes, Test Driven Design is absolutely key and approaches like MVVM help promote good unit test architecture. But at the end of the day it’s going to be a long time before everyone is doing it the “right way”. Until then, a tool like Microsoft Moles will forever be awesome. I mean, come on, hard coding a return value for System.DateTime.Now…how cool is that?

Till next time,

Posted in Geekery, Work

Installing ruby-debug-ide19 on Ruby 1.9.3 (p125) from RailsInstaller 2.1.0

This has been quite a little journey. In the end, I’m happy with the solution though as it appears to be pretty elegant and, most importantly, seems to work at the moment. First, some background.

I’ve recently had a big interest develop in Ruby. Not necessarily so much for the language itself as for Rails. I can only go so long hearing so much good about something that I have to go and check it out for myself. Last week I read through Why’s Poignant Guide over a day and a half and feel pretty good with the language (even metaprogramming). So, the time came to get my IDE up and running. I chose Aptana Studio 3 since it’s Eclipse based and I like Eclipse. Their setup instructions include how to install a Ruby debugger for use within Eclipse.

I installed Ruby on Windows using RailsInstaller 2.1.0, the latest available at the moment, which installs Ruby 1.9.3 p125. Based on Aptana’s instructions, I knew I needed to install the ruby-debug-ide19 gem and so I entered their given command.

gem install ruby-debug-ide19

And I was greeted with a warm error message.

ruby_debug.c:29:19: error: conflicting types for 'rb_iseq_compile_with_option'
  ruby-1.9.3-p125/vm_core.h:505:7: note: previous
  declaration of 'rb_iseq_compile_with_option' was here

“How quaint,” I thought. “This is exactly what turns some people off to open source software.” Well, not wanting to be dissuaded I started looking for solutions. After searching for a while, I came up with my own. Alter the “ext/ruby_debug/extconf.rb” within the ruby-debug-base19-0.11.25.gem file (it can be opened with 7-Zip). The alterations are to comment out lines 19 and 21.

#if RUBY_REVISION >= 26959 # rb_iseq_compile_with_option was [...]

I could now proceed…only to be met with a second, more interesting error.

  undefined reference to `ruby_threadptr_data_type'

Wow, how interesting is that! This thing just doesn’t want to compile OR link! It’d been a while since I’d done this much work trying to get something to compile in C/C++, but I had a good background from the past. So this should be easy, right? Four days later, I think I have a solution.

Many people who encountered this started installing ruby-debug-base19x to get around it. Apparently this is a fork off ruby-debug-base19 which is maintained with one of the IDEs to try and fix some of the challenges ruby-debug is having with Ruby 1.9. That’s all fine and good, and it would install for me, but I needed a ruby-debug-ide19 to go along with it that Aptana could use. No such luck for me, so I was back to the drawing board.

I also found this post and tried to use the information there to resolve my problem. This just led me down a rabbit hole of trying to compile Ruby on Windows from source. I didn’t like where that was headed so I started looking into the libraries in my Ruby/lib directory. Sure enough, libmsvcrt-ruby191.dll.a didn’t include a ruby_threadptr_data_type export. However, libmsvcrt-ruby191-static.a did include it. Strange, but still not a solution.

Finally, I compared the sources for ruby_debug.c between ruby-debug-base19-0.11.25 and ruby-debug-base19x-0.11.30.pre10. I found that there was a new function definition in the 19x version.

static inline const rb_data_type_t *
threadptr_data_type(void)
{
    static const rb_data_type_t *thread_data_type;
    if (!thread_data_type) {
        VALUE current_thread = rb_thread_current();
        thread_data_type = RTYPEDDATA_TYPE(current_thread);
    }
    return thread_data_type;
}

#define ruby_threadptr_data_type (*threadptr_data_type())

I’d already messed with the GEM file earlier, so why not again? I modified the ruby-debug-base19-0.11.25.gem file again, this time adding the above function to ruby_debug.c at line 195. I saved the GEM modifications and tried installing again. Voilà! MAGIC!

Until I tried to use the debugger in Aptana Studio (Eclipse).

Exception in DebugThread loop: undefined method `is_binary_data?'

Wow…this thing just didn’t want to work. Fortunately for me, someone else already had a great fix in this post on Stack Overflow. I edited my ruby-debug-ide19-0.4.12/lib/ruby-debug/xml_printer.rb file to include this definition (from the above article).

EDIT: To help others, I should clarify exactly where in the file this is done. I made the changes between the require statements at the beginning and the module Debugger line. In other words, this addition to the String class should occur outside the Debugger module.

class String
  def is_binary_data?
    ( self.count( "^ -~", "^\r\n" ).fdiv(self.size) > 0.3 ||
      self.index( "\x00" ) ) unless empty?
  end
end
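
For those who don’t speak Ruby’s String#count, the heuristic above counts the characters that are neither printable ASCII nor CR/LF and flags the string as binary when they exceed 30% of its length (or when a NUL byte is present). A rough Python port, my own approximation rather than part of the actual fix:

```python
def is_binary_data(s):
    """Rough port of the Ruby is_binary_data? heuristic: flag the
    string if it contains a NUL, or if more than 30% of its characters
    are neither printable ASCII nor CR/LF."""
    if not s:
        return False
    if "\x00" in s:
        return True
    suspicious = sum(1 for ch in s
                     if not (" " <= ch <= "~") and ch not in "\r\n")
    return suspicious / len(s) > 0.3

assert is_binary_data("plain text") is False
assert is_binary_data("line one\r\nline two") is False
assert is_binary_data("\x01\x02\x03x") is True
```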

And now, peace. The wonderful peace brought by having a debugger running inside an IDE of your choice.

For those who don’t know exactly how I’m editing the GEM file, look in your gem cache, usually located in <Ruby install path>/lib/ruby/gems/<version #>/cache.

In case it matters, here’s the versions I’m running of everything I could think would affect this.

  • Ruby = 1.9.3 p125
  • Ruby Gems = 1.8.16
  • DevKit = 4.5.2
  • ruby-debug dependencies
    • columnize = 0.3.6
    • archive-tar-minitar = 0.5.2
    • linecache19 = 0.5.12
    • ruby_core_source = 0.1.5
  • ruby-debug-base19 = 0.11.25
  • ruby-debug-ide19 = 0.4.12

Hope this helps someone else beginning in their Ruby journey!

– Archimedes

Posted in Geekery

JBoss 5.1.0 GA: The declaration for the entity “HTML.Version” must end with ‘>’.

So, my latest project is to create a working Struts2 site on a pre-existing JBoss 5.1.0 GA server instance. The server is fixed at this version because it’s packaged as part of a commercial-off-the-shelf (COTS) system. I’ve noticed over the past 3 years that it is becoming increasingly popular to bundle COTS software with JBoss rather than a straight-up Tomcat server. I’d say this is because of the larger feature set that JBoss supports and because it supports EAR files, similar to WebSphere Application Server.

All this is fine and dandy, I have no problem with the restriction of server version and it was my choice to go Struts2. All was good until I decided to deploy a simple application skeleton to a working JBoss 5.1.0 GA server. I’d already tested this app on Tomcat 6 since that helped me get started quicker (I had to find some beefier hardware to run JBoss…my main box is an antique). Everything worked great in Tomcat 6, but when deploying the EAR or embedded WAR, I received the following message.

DEPLOYMENTS IN ERROR:  Deployment “vfszip:/C:/JBoss/5.1.0.GA/server/default/deploy/AppX.war/” is in error due to the following reason(s): org.jboss.xb.binding.JBossXBRuntimeException: -1:-1 31:3 The declaration for the entity “HTML.Version” must end with ‘>’.


Needless to say, this isn’t what I expected. I found an article on the JBoss community site that seemed to indicate the problem might be with the loose.dtd hosted by the W3C. The proposed solution involved faking out one of the compiled libraries by downloading a file and changing a line to turn off XML validation. Sure, this would probably do the trick, I thought, but who likes this as a solution? Especially when you’re talking about building an application for a client?

After a lot of different tries, I was about to go down the road discussed above when I happened upon another article about the web.xml and web-app 3.0. This got me to thinking and I checked out my web.xml. Sure enough, the following line was right there at the top.

<web-app xmlns:xsi="" xmlns="" xmlns:web="" xsi:schemaLocation="" id="WebApp_ID" version="3.0">


I changed over to a 2.5 web-app spec and voilà! Both the WAR and EAR would now deploy.

<web-app xmlns:xsi="" xmlns="" xmlns:web="" xsi:schemaLocation="" id="WebApp_ID" version="2.5">


I think the bad web-app version crept in when I first started on the project. I didn’t know what my target servers were so I just spun up a project in Eclipse without too much thought. Obviously this lack of planning came back to bite me later! For what it’s worth, Eclipse does a good job of guiding you here because if your target is a JBoss 5 environment, it won’t even let you pick anything above 2.5 for the web-app version.

I’m sure there might be more learned from this project. I’ll be sure to post if I find anything worthwhile!


Posted in Uncategorized

Integrating LoadRunner 11 and SiteScope 11 with Authentication Enabled

So I’ve been silent for a while. Not because I haven’t had any good ideas but because I’ve just been too busy to sit down and write them out! Well, I’m taking the time this morning because this one is just too good to pass up. There seems to be a fair amount of confusion on this subject and after spending two hours this morning, I’m hoping that posting this will save someone else the time in the future.

I’ve been a performance tester now for a little under a year and I have to say that I really enjoy it. Even though I’m looking to move to another area pretty soon, performance testing is definitely challenging and very interesting. LoadRunner is a great tool (with its problems…but every tool has those) and I’ve really been fairly happy with SiteScope as well. After having a ton of trouble getting native LoadRunner Windows monitoring working (we never got it to work), we moved to SiteScope and it’s done what we needed.

Yesterday, I decided we should probably enable authentication for SiteScope since anyone in the company with the right URL could have accessed it. It’s not that I see a lot of danger in that but still, a completely unauthenticated site? There is definitely some sensitive information as well and it might be possible to use it to launch an internal attack. Best be safe and put a password in place!

Turning authentication “on” was really easy. All you have to do is set a password for the administrator account and voilà, you have a login screen presented to you on next access. Of the systems I’ve administered, that ranks at the top of “ease” in my book.

Tricky thing is this: now LoadRunner couldn’t pull over stats! I was receiving a <META> tag error (something about mismatching with <HEAD>). Rather than get into the details of the error message, which won’t do us any good, let’s jump to understanding the solution. There’s a lot of good material out there (this article, which is based on this doc from HP) but it’s all based on this strange idea of a “login1” which no one seems to explain. I tend to think I may have figured it out because I was able to get this working without using “login1” (not that there’s anything wrong with “login1”).

When you edit a SiteScope user, you specify the “login name”, but the “account” is assigned automatically. In the case of the built-in administrator, the “account” name is “administrator”.

A screenshot from SiteScope showing the built-in administrator account.

The HP doc mentions that Topaz (the system underlying SiteScope) uses a “user index” rather than a “user name”. I would like to propose the following equivalences to make this understandable for us mere mortals.

  • HP doc user name = SiteScope login name
  • HP doc user index = SiteScope account

We don’t have any control over the “account” field (unless we go editing the <SiteScope install path>\groups\users.config file…) but that doesn’t mean we’re forced into using some strange “login1” value without understanding it.

So, if we want to connect LoadRunner to SiteScope as the built-in administrator account (and skip the need for additional accounts), how do we do that? Open the LoadRunner monitor INI found here.

C:\Program Files\HP\LoadRunner\dat\monitors\xmlmonitorshared.ini

And add the required authentication entries alongside this existing line.

DlgTitle=SiteScope Monitor

Now when you connect to SiteScope in LoadRunner, you will be prompted for credentials. Enter the administrator account’s “login name” (admin in my case) and the password. You should be up and running!

While you certainly can create another account that will get a “user index” of “login1”, why doesn’t HP simply explain what’s going on here instead of having instructions that are dependent on you adding only one user ID? What if you have 50 SiteScope users and you’d like to use the 47th for LoadRunner access? It seems that a simple explanation of the system (maybe even a nice diagram?) could really help people out and lower the number of support calls made for this issue.

Best of luck to you all in the wonderful world of LoadRunner (and SiteScope)! Signing off for now,



Mounting your VOB 101: Slow down there, partner!

The purpose of this post is to help others who find absolutely no useful information when confronted with the following message while mounting a ClearCase VOB:

Mounting MVFS filesystem \<VOB>.
cleartool: Error: Unable to mount: File exists

I found this odd error message in my Build Forge logs this morning. I have several automated jobs that rely on scripts being pulled from ClearCase VOBs. When the Build Forge job went to mount the VOB, this was the error it encountered.

Now, keep in mind, this type of Build Forge job has been running for over a year. This particular job has been running for at least a week now. I’ve never seen this error before.

I Googled the error “Unable to mount: File exists” and what did I find? Nothing. Absolutely nothing. Never a good moment in the life of a ClearCase administrator! So what’s the next step? Well, I figured checking the MVFS logs might help, since mounting a VOB is an MVFS action. Voilà! Here, too, was a message I had never seen before.

[<timestamp>] mvfs: Ok: INFO: {<pid>/<tid>}
IoCreateSymlink failed: MVFS drive symlink already exists.

(That was all on one line…I just broke it up for formatting purposes.) This looked promising. Surely someone else has encountered this odd message before.

Awesome…6 results! Not so awesome: none of them refer to mounting a VOB in general. They are all about ClearCase and Distributed File System (DFS) support (or, rather, non-support). That doesn’t really help me out.

With no one else out there seeming to have had the same problem, I started asking myself what might cause it. Well, it just so happens that this Build Forge job is one of two new jobs I introduced last week. I already had two other jobs running on a similar schedule. All four jobs end up running at exactly the same time this one job failed, and all four end up trying to mount the same VOB on the same machine (they all run on the same server). Is it possible that ClearCase croaks under four nearly-simultaneous attempts to mount the same VOB? Or is it more general, with any simultaneous mount attempts triggering the error? Unless someone else is willing to devote some time to find out, I’ll quote the Tootsie Pop commercial narrator: “The world may never know.”
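If near-simultaneous mounts really are the trigger, one pragmatic workaround is to wrap the mount in a jittered retry so colliding jobs back off and try again rather than fail outright. A rough sketch, with the caveat that the `cleartool mount` invocation and the VOB tag are assumptions about your setup, not something from my actual jobs:

```python
import random
import subprocess
import time

def run_with_retry(cmd, attempts=5, base_delay=2.0):
    """Run cmd, retrying with a growing, jittered delay on failure."""
    for attempt in range(1, attempts + 1):
        if subprocess.call(cmd) == 0:
            return True  # command succeeded
        if attempt < attempts:
            # Back off a little longer each time, plus jitter so
            # colliding jobs don't retry in lockstep.
            time.sleep(base_delay * attempt + random.random())
    return False

# Hypothetical usage from a Build Forge step (VOB tag assumed):
# run_with_retry(["cleartool", "mount", r"\myvob"])
```

The jitter matters more than the backoff here: if all four jobs retry on the same schedule, they just collide again.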

The job is running fine now. I expect it was a “perfect storm” scenario and we might run into it every week or so. Am I happy with that? No. Will I live with it? Yes. And maybe you can too, if you found this page looking for help with this mysterious set of error messages.



Giving people some space…to talk!

So I’ve been thinking a lot about this today. Why? Well, for one I spent over 6 hours in meetings today. For another, many of those meetings were meant for me to try and convey a point while also eliciting feedback on whatever it was I happened to be proposing.

That can be a really hard thing to do. Those of you who might be experts at this already, I’m sure you’re laughing at me. Well, I’m still young, so I have time to learn this fine art. And what an art it is; I’m beginning to truly appreciate it.

You see, I’ve always known this, but it’s just now starting to really sink in. I’m sure being married, having a kid, and trying to raise a family while also working on moving forward with my career have helped with this. Or perhaps it’s just that time in my life; but, whatever the case, it’s starting to sink in. I’ve known that other people have opinions and that those opinions, like my own, should be expressed. The part I haven’t realized until perhaps even today is exactly how difficult it can be to let someone (or several people at once) express those opinions while still attempting to get your own opinion out there.

I’m also beginning to realize that managing people involves cooperating with them at a much higher level than I had previously realized. Well, again, maybe I’d realized it…I just hadn’t experienced it until recently.

The world of IT is certainly full of opinions, and most of them (there are rare exceptions, of course) are well thought out and accurate. Yet, somehow, we have to work together to turn those opinions into a particular direction we should head. That “somehow” part is the tricky part (it always is). How do you give someone space to talk while you keep thinking, “Man, this isn’t where I wanted this meeting to head…!”

Bottom line, you have to listen. And I mean really listen. Think about what the other person is saying rather than how much you’d rather they said something else. I’ve heard so many people respond to someone with something that’s completely unrelated to what the other person just said. Why? Because the person who responded wasn’t listening to the person who was just talking. Instead, they heard about 5 words of a 5-minute thought and already had their response worked up. What’s wrong with silence for a moment while everyone collects their thoughts? Is control of the conversation really worth the cost of not listening to someone else? Wouldn’t it be better to hear their opinions out rather than just trying to be the one who talks the most? I’m definitely guilty of this and I’m sure I will still be guilty of it throughout my life, well beyond this post. But that doesn’t mean I won’t think about it a bit more.

Yeah, this post is pretty random. Just thinking out loud and trying to remind myself to let other people have a chance to express themselves. Now I just have to learn how to control and channel that into a productive meeting without necessarily speaking more than everyone else.



Build Forge Agent Proxy

So I’ve been really busy the last few months. REALLY BUSY.

But, I’ve also had the chance to learn a lot of new things. One of the more interesting things I’ve learned lately (as in, the past week) is that Build Forge Agent communication can be routed through a reverse proxy. This isn’t officially supported by IBM, mind you, but it does seem like it can be done. And quite easily at that.

I had used Wireshark to look at the Build Forge Agent protocol a couple of times just so I could understand how it worked. This helped me tremendously when it came time to request firewall rules in different environments. I never saw anything that looked like the Console needed to really know what Agent it was talking to. Instead, the traffic appeared to be very simple and to the point: “Hey Agent! I need you to do X, Y, and then Z. Oh, and use this account while you’re at it!”

So we ran into a situation where I was only allowed one hole through a particular firewall. To clarify, this one hole meant one source server and one target server. If you know anything about Build Forge, you know that means you can only get to one Build Forge Agent on the other side of the firewall. This was a major bummer for us because we already had three agents installed on the other side, with plans for many, many more. But the security gods had spoken, and one hole it was.

All hope looked lost until we started tossing around the idea of a proxy server. I knew we couldn’t use a normal forward proxy, since we’re basically talking about routing telnet-style traffic here, and that just isn’t forward-proxy routable (nothing is built into the protocol to allow for it, the way there is with HTTP). So, reverse proxies came to mind. If we could open multiple ports to this one server and then have multiple reverse proxy listeners route the traffic on the other side of the firewall, this just might work.

The first step was to find a simple reverse proxy. We couldn’t use an HTTP proxy because, again, this isn’t HTTP traffic, and HTTP proxies alter the traffic. We needed the raw TCP packets to be passed through. I knew I could write something like this myself in Java quite easily, but I wanted to see if someone else had already done the work before investing a lot of time on this. After trying a few things, a Google search on “java simple tcp proxy” gave me what I wanted. And what exactly did I want? The Grinder TCP Proxy.

This thing is nice. Very easy to use and quick to test out. I set up the proxy for simple TCP routing and voilà, it worked! The Build Forge Console was able to communicate with my personal PC running The Grinder, which routed the traffic on to a real Build Forge Agent. I even ran several concurrent jobs through the reverse proxy connection and it didn’t miss a beat. Well, this was very promising. But what about SSL support? We had to use SSL for our implementation, and a lack of it would be a deal breaker.

The Grinder’s TCP Proxy can do a basic SSL “man in the middle” (not an attack per se, but the concept is the same). I attempted to use this, but it failed because it tried to connect to the Agent with SSLv2 rather than SSLv3/TLSv1. Well, my own implementation might be able to address that. Then I wondered…hmm, could just routing the TCP packets work? I knew it shouldn’t, because the Common Name on the SSL certificate wouldn’t match a reverse DNS lookup on the origin IP, but I wasn’t completely sure that Build Forge Agents actually did their due diligence to confirm the origin IP was associated with the Common Name on the certificate. It was worth a shot.

OK, I’ll admit, I’m lying. I initially forgot to pass the TCPProxy class the “-ssl” option, and everything worked. Only afterward did I realize that SSL traffic had gone through The Grinder in pure TCP routing mode! It was once I used “-ssl” (like I meant to originally) that I found the SSLv2 issue mentioned above.

So, pure TCP routing not only worked for regular connections, it worked for SSL connections too. Great! On a slightly less great note, IBM Rational Build Forge Agents (and the Console too) don’t actually do a reverse DNS lookup on the origin IP of SSL communication to verify that the name retrieved matches the Common Name on the certificate. This doesn’t compromise the encryption, since a passive “man in the middle” can’t do much with encrypted traffic; however, I’m still a bit shocked, to say the least. Perhaps IBM’s approach here is to use SSL purely for encryption and not truly for client/server verification. Still, I find that a bit odd, especially considering one of the Build Forge Agent parameters is ssl_verify_client. Even with that set to “true”, raw TCP routing through a reverse proxy didn’t cause an SSL alert.

So we’re now working on “productionizing” this and writing our own simple TCP router. It will listen on multiple ports and forward each to a different agent on the other side of the firewall. We might also look at putting in a true SSL “man in the middle” that decrypts the traffic (like a normal agent would) and then begins a new SSL channel to the real Agent (like a normal console would). This would prevent any issues later if IBM decides to start performing reverse DNS lookups to match the Common Name to the origin IP. It never hurts to plan ahead.
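The skeleton of such a router is small. Here is a minimal raw-TCP forwarder sketched in Python for brevity (not the router we’re actually deploying; the host names and port numbers are placeholders):

```python
import socket
import threading

def pipe(src, dst):
    """Copy bytes one way until the source side closes."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    finally:
        # Signal EOF downstream; ignore sockets that are already closed.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

def forward(listen_port, target_host, target_port):
    """Accept connections on listen_port and relay raw TCP to the target.

    Because the bytes pass through untouched, anything layered on top of
    TCP -- including SSL -- survives the trip, which matches what we saw
    with The Grinder in plain routing mode.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("0.0.0.0", listen_port))
    server.listen(5)
    while True:
        client, _ = server.accept()
        upstream = socket.create_connection((target_host, target_port))
        # One thread per direction keeps full-duplex traffic flowing.
        threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
        threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()

# Hypothetical mapping: one listener per agent behind the firewall, e.g.
# forward(5555, "agent1.internal", 5555)
```

Running one listener per (port, agent) pair gives you the “multiple ports in, multiple agents out” arrangement through the single permitted server.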

Alright, I’m headed to bed now.



Merry Christmas!

To all who celebrate Christmas,

I wish you a Merry Christmas this year, 2010. I know that my family and I have a lot to be thankful for, and I just stand in awe at how far I’ve come in such a short time. May each of you feel some magic this holiday season and get to enjoy one of the best gifts of all: family. Remember why the season is here and take the time to celebrate it in a way that will make it meaningful and everlasting.

To all, a good night!

