How would you test a random() function?

Let’s assume that it can be
initialized with a seed that you supply in order to generate different sequences
of random numbers, but that the same seed will always generate the same
sequence.

The first idea that comes to mind is to pick a constant seed, write down the
numbers returned, make them the "expected" values of your test, then run the test
with that same seed and compare the values one by one.  It’s a start, but it’s
a far cry from testing the actual specification of your method, which is
approximately "returns a
(pseudo)-random number between 0.0f and 1.0f".
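
For illustration, here is roughly what that naive test could look like, assuming for the sake of the example a java.util.Random seeded by hand rather than the random() method itself (and checking that two generators built from the same seed agree, instead of hard-coding values recorded from a reference run):

@Test
public void verifySameSeedSameSequence() {
  long seed = 42L;
  // The "expected" values would normally be recorded once from a reference run and
  // hard-coded here; to keep this sketch self-contained, we simply check that two
  // generators created with the same seed produce the same sequence.
  Random reference = new Random(seed);
  Random actual = new Random(seed);
  for (int i = 0; i < 100; i++) {
    assertEquals(actual.nextFloat(), reference.nextFloat());
  }
}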

Enter statistical testing.

First of all, you need to define exactly what you mean by "random".  One
way would be to define it in terms of average:  "For a big enough sample of
numbers, the average will be 0.5f with an error of 0.01f".

Here is a quick implementation:

@Test
public void verifyAverage() {
  float sum = 0;
  int count = 10000;
  for (int i = 0; i < count; i++) {
    sum += random();
  }
  float average = sum / count;
  // The average of a large sample of values spread over [0, 1] should be close to 0.5
  float tolerance = 0.01f;
  assertTrue(0.5f - tolerance <= average && average <= 0.5f + tolerance);
}

Of course, you should extend this test in many ways, such as testing on
bigger samples, using different seeds (I didn’t use any in this example) or using a
different metric.  For example, what if your algorithm is buggy and returns the
pairs 0.1, 0.9, 0.1, 0.9, etc…?  It will pass this test, but the
distribution is obviously not correct.  To address this, you might want to
measure the standard deviation of the returned values.
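
Here is a sketch of what such a test could look like, in the same style as verifyAverage() above (the expected value 1/sqrt(12), about 0.289, is the standard deviation of a uniform distribution on [0, 1]; the tolerance is an arbitrary choice on my part):

@Test
public void verifyStandardDeviation() {
  int count = 10000;
  float[] values = new float[count];
  float sum = 0;
  for (int i = 0; i < count; i++) {
    values[i] = random();
    sum += values[i];
  }
  float average = sum / count;
  float squares = 0;
  for (float v : values) {
    squares += (v - average) * (v - average);
  }
  float standardDeviation = (float) Math.sqrt(squares / count);
  // A uniform distribution on [0, 1] has a standard deviation of 1/sqrt(12), about 0.289
  float expected = (float) (1 / Math.sqrt(12));
  float tolerance = 0.01f;
  assertTrue(Math.abs(standardDeviation - expected) <= tolerance);
}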

Here is another
potential bug:  what if the values "bunch up" around the average, say, they
are always between 0.4 and 0.6?  Depending on how tight your tolerances are, both
verifyAverage() and verifyStandardDeviation() could still pass, so you might want
to introduce a third test for the distribution, say "verifyEntropy()".
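
Here is a sketch of what verifyEntropy() could look like, using one possible definition among many: bucket the values into a histogram and compute its Shannon entropy, which should be close to log2 of the number of buckets if the values are spread out evenly (the bucket count and tolerance are arbitrary choices on my part):

@Test
public void verifyEntropy() {
  int count = 10000;
  int bucketCount = 10;
  int[] buckets = new int[bucketCount];
  for (int i = 0; i < count; i++) {
    // Math.min guards against a value of exactly 1.0f landing outside the last bucket
    int bucket = Math.min((int) (random() * bucketCount), bucketCount - 1);
    buckets[bucket]++;
  }
  // Shannon entropy of the histogram, in bits
  double entropy = 0;
  for (int b : buckets) {
    if (b > 0) {
      double p = (double) b / count;
      entropy -= p * Math.log(p) / Math.log(2);
    }
  }
  // Evenly spread values yield an entropy close to log2(10), about 3.32 bits;
  // values bunched up between 0.4 and 0.6 only reach about 1 bit
  double tolerance = 0.05;
  assertTrue(entropy >= Math.log(bucketCount) / Math.log(2) - tolerance);
}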

Statistical testing comes in handy in many other situations.  Here are
two more examples.

How would you verify that your Web application can create a thousand users
simultaneously?  Imagine that your Web site is tremendously
popular and that people sign up in bursts.  All these pages are going
to try to insert/update rows in the database at roughly the same time, so how do
you make sure that your transactions are correctly isolated?

Again, statistical testing to the rescue.  Simulate all these users
accessing your database simultaneously and make sure your database contains the
right values at the end (this is slightly different from load testing, which
only makes sure that the performance of your server remains acceptable, but you
test both approaches similarly:  by firing a lot of simultaneous
requests at your server).
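
As a rough sketch, using java.util.concurrent and assuming hypothetical createUser() and countUsers() helpers that talk to your database (they are placeholders for your own data layer, not real APIs):

@Test
public void verifySimultaneousSignUps() throws Exception {
  int userCount = 1000;
  ExecutorService executor = Executors.newFixedThreadPool(50);
  List<Future<?>> results = new ArrayList<>();
  for (int i = 0; i < userCount; i++) {
    final int id = i;
    // createUser() stands in for whatever code path your sign-up page exercises
    results.add(executor.submit(() -> createUser("user-" + id)));
  }
  for (Future<?> result : results) {
    result.get();   // rethrows any exception raised while creating a user
  }
  executor.shutdown();
  // countUsers() stands in for a query against your database:
  // every simulated sign-up should have resulted in exactly one row
  assertEquals(countUsers(), userCount);
}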

And finally, I reach the point of this post:

How can you test that
your code is thread-safe?

Of course, your first reaction should be to understand the code you are trying to
test, analyze the various values that can come under contention and make sure these
values are adequately protected (typically with synchronization).  But as soon as
your code becomes complicated enough and starts calling into more methods (some
of which you might not even have the source of), this approach very quickly
becomes impractical and your confidence remains theoretical at best.
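
For a trivial illustration of what "adequately protected" means (my own example, not code from the post), a counter that several threads increment needs its read-modify-write sequence to be atomic:

public class Counter {
  private int count = 0;

  // Without "synchronized", two threads can read the same value of count and
  // both write back count + 1, silently losing one of the increments
  public synchronized void increment() {
    count++;
  }

  public synchronized int getCount() {
    return count;
  }
}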

If you multiply the number of "yield points" in your code (locations where
the JVM can preempt your thread) by the number of ways the JVM can preempt
you, you quickly realize that there is no way you can be 100% sure that you have
covered all the scenarios.

Again, statistical testing can help you increase the degree of comfort
you have in your testing.

The upcoming TestNG 4.6 contains a very powerful feature that makes this kind
of testing trivial:  individual method thread pools.

Consider the following code:

@Test(threadPoolSize = 10, invocationCount = 10000)
public void verifyMethodIsThreadSafe() {
  foo();
}

@Test(dependsOnMethods = "verifyMethodIsThreadSafe")
public void verify() {
  // make sure that nothing was broken
}

invocationCount has been in
TestNG for quite a few releases, but threadPoolSize is new:  it
instructs TestNG to create a pool of ten threads that will then be used to
invoke the test method ten thousand times.  Thanks to its dependency, the
verify() method will be invoked once all the verifyMethodIsThreadSafe() invocations
have completed, and it will double-check that the data modified by the
concurrent code is what we expect.
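
To make the idea concrete, here is one hypothetical way the skeleton above could be filled in (my own sketch, not part of TestNG), where the shared state is simply a counter that foo() increments:

// Hypothetical shared state modified by the code under test
private int count = 0;

// Remove "synchronized" and verify() below will usually (though not always) fail,
// because concurrent, unprotected increments lose updates
private synchronized void foo() {
  count++;
}

private synchronized int getCount() {
  return count;
}

@Test(threadPoolSize = 10, invocationCount = 10000)
public void verifyMethodIsThreadSafe() {
  foo();
}

@Test(dependsOnMethods = "verifyMethodIsThreadSafe")
public void verify() {
  // If foo() is thread-safe, all 10,000 invocations are accounted for
  assertEquals(getCount(), 10000);
}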

Here is a quick illustration of this feature, where the test method sleeps for a
random interval before exiting.  We call this method six times with a pool
of three threads:

private void log(String s) {
  System.out.println("[" + Thread.currentThread().getId() + "] " + s);
}
  
@Test(threadPoolSize = 3, invocationCount = 6)
public void f1() {
  log("start");
  try {
    int sleepTime = new Random().nextInt(500);
    Thread.sleep(sleepTime);
  }
  catch (Exception e) {
    log("  *** INTERRUPTED");
  }
  log("end");
}

Here is a sample
output:

[10] start
[8] start
[9] start
[10] end
[10] start
[9] end
[9] start
[8] end
[8] start
[8] end
[10] end
[9] end
PASSED: f1
PASSED: f1
PASSED: f1
PASSED: f1
PASSED: f1
PASSED: f1

As you can see, the first three invocations fill the thread pool, which then blocks
until one of the threads finishes.  Thread #10 finishes first and is
immediately reallocated to another run of the method, and so on.  Finally,
all the threads end and TestNG reports that all six invocations have passed.

What if one of these methods is taking too long to respond?

You can use
another feature of TestNG to make sure that your tests won’t be locked up
forever:  timeOut (this attribute already existed in older
versions of TestNG and it’s simply being reused here).

Let’s make things a bit more interesting:  specify a timeOut of 500ms but,
this time, make the method sleep a random number of milliseconds between 0 and
1000.  This means that whenever the method sleeps for less than
500ms, it will pass, but if it takes longer to wake up, TestNG will interrupt it
and mark it as a failure.

Here is the code:

@Test(threadPoolSize = 3, invocationCount = 6, timeOut = 500)
public void f1() {
  log("start");
  try {
    int sleepTime = new Random().nextInt(1000);
    if (sleepTime > 500) log("   should fail");
    Thread.sleep(sleepTime);
  }
  catch (Exception e) {
    log("  *** INTERRUPTED");
  }
  log("end");
}

And the output:

[11] start
[12] start
[12] should fail
[13] start
[13] should fail
[11] end
[14] start
[14] should fail
[12] *** INTERRUPTED
[12] end
[13] *** INTERRUPTED
[13] end
[15] start
[16] start
[16] end
[14] *** INTERRUPTED
[14] end
[15] end

===============================================
Test Suite
Total tests run: 6, Failures: 3, Skips: 0
===============================================

In this run, three methods came up with a sleep time greater than 500ms and
therefore announced that they should fail.  A few moments later, these
three methods were interrupted by TestNG and marked as failures.

Individual method thread pools will appear in TestNG 4.6, which will be
released very soon (beta versions are available if you are interested).

 

Update: Thanks to JB and David for pointing out that the property you want
to test about the returned values is entropy and not a Gaussian distribution.
I updated this article accordingly.

Update 2: TestNG 4.6 beta can be downloaded here.