Otaku, Cedric's weblog: August 2004 Archives

August 27, 2004

More on JUnit and multiple instantiations

In a recent entry, Martin Fowler explains the reason behind JUnit reinstantiating your JUnit class before invoking each test method. He concludes by showing the following example:

[TestFixture]
public class ServerTester
{
  private IList list = new ArrayList();

  [Test]
  public void first() {
    list.Add(1);
    Assert.AreEqual(1, list.Count);
  }

  [Test]
  public void second() {
    Assert.AreEqual(0, list.Count);
  }
}
and pointing out that this example passes with JUnit but fails with NUnit and TestNG.

Here is why.

Mentally remove the annotations from the code above ([TestFixture], [Test]) and read it as if it were standard Java or C# code. Now predict its output, knowing that first() is invoked first and then second().

It seems pretty obvious this code will fail in Java and in C#.
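As a minimal sketch of that thought experiment, the plain Java below strips the annotations and calls the two methods by hand, once on a single shared instance and once on a fresh instance per call (which is effectively what JUnit does). The class and method names are illustrative and not part of any framework.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the test class above, with the annotations removed.
class PlainServerTester {
  private final List<Integer> list = new ArrayList<Integer>();

  // Returns true if the check that first() makes would pass.
  boolean first() {
    list.add(1);
    return list.size() == 1;
  }

  // Returns true if the check that second() makes would pass.
  boolean second() {
    return list.size() == 0;
  }
}

class InstantiationDemo {
  // One instance for both calls, as in ordinary Java or C# code.
  static boolean[] runOnSharedInstance() {
    PlainServerTester tester = new PlainServerTester();
    return new boolean[] { tester.first(), tester.second() };
  }

  // A fresh instance per call, mimicking JUnit's reinstantiation.
  static boolean[] runOnFreshInstances() {
    return new boolean[] {
      new PlainServerTester().first(),
      new PlainServerTester().second()
    };
  }

  public static void main(String[] args) {
    boolean[] shared = runOnSharedInstance();
    boolean[] fresh = runOnFreshInstances();
    System.out.println("shared instance: first=" + shared[0] + ", second=" + shared[1]);
    System.out.println("fresh instances: first=" + fresh[0] + ", second=" + fresh[1]);
  }
}
```

On the shared instance, second() sees the element added by first() and its check fails; on fresh instances both checks pass.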

So why should your test framework behave differently?

By the way, here is the correct way to write this test with TestNG:

@Test
public class ServerTester
{
  private List list = null;

  // Invoked before every test method, so each test starts with an empty list
  @Configuration(beforeTestMethod = true)
  public void init() {
    list = new ArrayList();
  }

  public void first() {
    list.add(1);
    assert list.size() == 1;
  }

  public void second() {
    assert list.size() == 0;
  }
}
Quick explanation:
  • The @Test annotation at the class level indicates that all the public methods in this class are test methods, so there is no need to add an individual @Test annotation to each method.
  • The @Configuration annotation indicates that the method init() should be invoked every time before a test method is invoked.
How would you feel if Java invoked your constructor every time before invoking your methods?

Posted by cedric at 01:46 PM | Comments (7)

August 26, 2004

Static: friend or foe?

A comment in my previous entry asks why statics can cause problems. Here is a quick example of a program that wants to handle a -verbose option:

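public class Foo {
  private static boolean m_verbose = false;

  public static void main(String[] argv) {
    // Turn on verbose mode if -verbose appears anywhere on the command line
    if (java.util.Arrays.asList(argv).contains("-verbose")) {
      m_verbose = true;
    }

    runProgram();
  }

  private static void runProgram() {
    // ... the rest of the program, which reads m_verbose ...
  }
}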
Now imagine you invoke your program twice in the same build file as follows:
  <java classname="Foo"><arg line="-verbose"/></java>
  ... and somewhere else
  <java classname="Foo" />
You will probably wonder why the second invocation still runs in verbose mode...

The solution is of course:

  <java fork="yes" classname="Foo"><arg line="-verbose"/></java>
  ... and somewhere else
  <java fork="yes" classname="Foo" />
If you forget to fork a new JVM, the static variables set by previous invocations will not be reinitialized. The problem with the fork flag is that it creates a new JVM each time, which dramatically increases the time of your build.

Therefore, as a rule of thumb, it is much cleaner to avoid static variables, for example by storing all your command line variables in a singleton class and making sure this singleton gets reinitialized each time your program is invoked.
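To see the problem without Ant, the sketch below simulates two invocations in the same JVM by calling the entry point twice. StaticFoo mirrors the example above; Options and InstanceFoo are hypothetical names for one possible alternative, in which the options object is rebuilt from argv on every invocation instead of living in a static field.

```java
import java.util.Arrays;

// Static flag: once set, it survives until the JVM exits.
class StaticFoo {
  private static boolean m_verbose = false;

  // Stand-in for main(): returns the verbosity this "invocation" ends up with.
  static boolean run(String[] argv) {
    if (Arrays.asList(argv).contains("-verbose")) {
      m_verbose = true;
    }
    return m_verbose;
  }
}

// Hypothetical options holder, rebuilt from scratch for every invocation.
class Options {
  final boolean verbose;

  Options(String[] argv) {
    this.verbose = Arrays.asList(argv).contains("-verbose");
  }
}

class InstanceFoo {
  static boolean run(String[] argv) {
    Options options = new Options(argv);
    return options.verbose;
  }
}

class StaticLeakDemo {
  public static void main(String[] args) {
    String[] verbose = { "-verbose" };
    String[] quiet = {};
    System.out.println("static, 1st run:   " + StaticFoo.run(verbose));
    System.out.println("static, 2nd run:   " + StaticFoo.run(quiet));
    System.out.println("instance, 1st run: " + InstanceFoo.run(verbose));
    System.out.println("instance, 2nd run: " + InstanceFoo.run(quiet));
  }
}
```

The second call to StaticFoo.run() still reports verbose mode even though no flag was passed, while InstanceFoo.run() reflects only the arguments it was given.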

Now, does this mean you should never use static methods/fields? Of course not. Actually, I like to use static methods as much as I can as long as they don't modify static state, because const static methods decrease the coupling of your classes.

But this will be the subject of a future entry.

Posted by cedric at 11:21 PM | Comments (9)

August 25, 2004

TestSetup and evil static methods

TestSetup is a JUnit class you can use if you want to introduce initialization code in your JUnit tests that is invoked only once, rather than once before each test method, which is what JUnit does by default (making it difficult to ensure that an initialization method is invoked only once during the lifetime of the class). I wrote more about this oddity not long ago.

A recent weblog entry about TestSetup shows how to use it, which is pretty straightforward. I still have a big problem with this solution, and what JUnit forces you to do to work around this "design choice": it forces you to introduce static methods in your code.

I don't have anything against static methods in general. I tend to use them whenever they don't need access to any non-static fields, since this decreases the coupling between my classes. However, static methods are particularly evil when they modify state and, by extension, force you to declare static fields.

Which is exactly the only choice that JUnit leaves you.

The problem with non-const static methods is that they break badly in several cases:

  • Multi-thread access.
  • Different objects thinking they are accessing private content, while it is actually shared.
  • Reuse of the same JVM.
This last point is particularly important. By forcing you to introduce static methods in your code, JUnit guarantees that if somebody ever invokes your class several times from an ant task while not providing the flag fork="yes", your code will break in mysterious ways.
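The second bullet is easy to reproduce in a few lines. In this sketch (the class and field names are made up for illustration), each object appears to count its own hits, but the static counter is shared by every instance in the JVM.

```java
class HitCounter {
  private static int s_hits = 0; // one copy for the whole JVM
  private int m_hits = 0;        // one copy per instance

  void hit() {
    s_hits++;
    m_hits++;
  }

  int sharedHits() {
    return s_hits;
  }

  int ownHits() {
    return m_hits;
  }
}

class SharedStateDemo {
  public static void main(String[] args) {
    HitCounter a = new HitCounter();
    HitCounter b = new HitCounter();
    a.hit();
    b.hit();
    b.hit();
    System.out.println("a: own=" + a.ownHits() + ", shared=" + a.sharedHits());
    System.out.println("b: own=" + b.ownHits() + ", shared=" + b.sharedHits());
  }
}
```

The instance counters stay separate, but both objects report the same shared total, and that total keeps growing for as long as the JVM is reused.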

Classes in object-oriented programming obey very simple rules:

  • If you invoke your initialization from the constructor, it will only be invoked once.
  • If your content is not static, it is guaranteed to be private to your instance.
JUnit violates these two principles and I just can't get used to it. Instead of righting a wrong with another wrong, why not return to the most intuitive way of doing this and let test writers implement methods that are guaranteed to be invoked once?

Of course, there is also an alternative. :-)

Posted by cedric at 02:18 PM | Comments (9)

August 23, 2004

I quit BEA

Effective today, I have quit BEA and accepted a position at Google. In case you worry, this new role is not going to alter my interest in J2EE and related technologies, nor will it dramatically impact my blogging, so I am definitely planning on staying involved with the community.

Here's to a new beginning!

Posted by cedric at 02:50 PM | Comments (40)

August 19, 2004

Dependent test methods

Imagine I am trying to test a server.  In order to do this, my test class will contain the following test methods:

  • Check that we are running the correct JVM.
  • Check that the server started correctly.
  • About twenty methods making various calls to the server.

Obviously, the first two test methods listed above should be run before everything else.  The way you would do this with JUnit is to put the first two methods in the initialization code, which will probably have to be static since JUnit instantiates a new object before each test method invocation, and we can't use setUp() either since it's invoked before each test method.  This initialization code will set two booleans and the twenty test methods will have to test for these two booleans before proceeding, and if one of them is false, it will fail.

Now let's assume that on a test run, I used the wrong JVM.  In this case, JUnit will probably report this in the initialization code and it will then report twenty more failures, one for each test method.

This is very bad for a variety of reasons:

  • The initialization code is not itself a test method.  This is bad because its success or failure will be reported on a separate track.
  • When QA reads the result of this test run, they will see "1 SUCCESS, 21 FAILS" and they will, rightfully, get really scared.
  • Upon reading this result, QA will have to plan for fixing 20 tests, while the reality is that only one test failed.

This is why I believe that a test framework needs to provide support for "dependent test methods", where you can mark test method b() as depending on the successful run of test method a().  If a() failed, then b() will be marked as a SKIP, and not a FAIL.

With such a feature, the test run will be marked "1 SUCCESS, 1 FAIL, 20 SKIPS", which is much more accurate.
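To make the difference between FAIL and SKIP concrete, here is a small sketch of a runner that skips a test when any of its dependencies did not succeed. DependentRunner and everything in it are hypothetical and only illustrate the idea; this is not how TestNG is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical mini-runner, only to illustrate SKIP versus FAIL.
class DependentRunner {
  enum Status { SUCCESS, FAIL, SKIP }

  private static final class TestCase {
    final BooleanSupplier body;
    final String[] dependsOn;

    TestCase(BooleanSupplier body, String[] dependsOn) {
      this.body = body;
      this.dependsOn = dependsOn;
    }
  }

  // Insertion order is preserved; dependencies are assumed to be registered first.
  private final Map<String, TestCase> tests = new LinkedHashMap<String, TestCase>();

  void add(String name, BooleanSupplier body, String... dependsOn) {
    tests.put(name, new TestCase(body, dependsOn));
  }

  Map<String, Status> run() {
    Map<String, Status> results = new LinkedHashMap<String, Status>();
    for (Map.Entry<String, TestCase> entry : tests.entrySet()) {
      TestCase test = entry.getValue();
      boolean dependenciesPassed = true;
      for (String dependency : test.dependsOn) {
        if (results.get(dependency) != Status.SUCCESS) {
          dependenciesPassed = false;
        }
      }
      Status status;
      if (!dependenciesPassed) {
        status = Status.SKIP; // never run, so it is not reported as a real failure
      } else {
        status = test.body.getAsBoolean() ? Status.SUCCESS : Status.FAIL;
      }
      results.put(entry.getKey(), status);
    }
    return results;
  }

  static int count(Map<String, Status> results, Status wanted) {
    int n = 0;
    for (Status status : results.values()) {
      if (status == wanted) {
        n++;
      }
    }
    return n;
  }

  // The scenario from the text: wrong JVM, server fine, twenty dependent methods.
  static Map<String, Status> wrongJvmScenario() {
    DependentRunner runner = new DependentRunner();
    runner.add("correctVM", () -> false);
    runner.add("serverStartedOk", () -> true);
    for (int i = 1; i <= 20; i++) {
      runner.add("method" + i, () -> true, "correctVM", "serverStartedOk");
    }
    return runner.run();
  }

  public static void main(String[] args) {
    Map<String, Status> results = wrongJvmScenario();
    System.out.println(count(results, Status.SUCCESS) + " SUCCESS, "
        + count(results, Status.FAIL) + " FAIL, "
        + count(results, Status.SKIP) + " SKIPS");
  }
}
```

The sketch runs tests in registration order, which is a simplification; a real framework would have to order the methods from the declared dependencies.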

Here is how I would write this test using TestNG:

@Test
public void correctVM() {}

@Test
public void serverStartedOk() {}

@Test(dependsOnMethods = { "correctVM", "serverStartedOk" })
public void method1() {}

...

This is okay, but having to list all the methods that each new test method depends on is error-prone, so instead, let's use TestNG's groups:

@Test(groups = { "init" })
public void correctVM() {}

@Test(groups = { "init" })
public void serverStartedOk() {}

@Test(dependsOnGroups = { "init.*" })
public void method1() {}

...

We have gained a lot of flexibility with groups.  For example, imagine that I want to add another init test method, such as "firewall is on".  All I need to do is add this test method and declare that it is part of the group "init".

Also, note that I used a regular expression in the "dependsOnGroups" declaration, as a reminder that you can actually define several init groups (such as "initOS", "initJVM", etc...) and they will automatically be run before any test method is invoked.
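As a rough illustration of what such a pattern selects, the sketch below filters a list of group names with String.matches(), which requires the whole name to match. This assumes full-match semantics; the post does not show how TestNG itself evaluates the expression, and the group names are just the examples from the text plus one unrelated name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GroupPatternDemo {
  // Returns the group names that fully match the given regular expression.
  static List<String> matching(String pattern, List<String> groups) {
    List<String> result = new ArrayList<String>();
    for (String group : groups) {
      if (group.matches(pattern)) {
        result.add(group);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> groups = Arrays.asList("init", "initOS", "initJVM", "functest");
    System.out.println(matching("init.*", groups));
  }
}
```

With these names, "init", "initOS" and "initJVM" are selected and "functest" is not.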

But we can do even better.

In the above example, I don't like the fact that whenever I add a new "real" test method, I need to remember to specify that it depends on the group "init.*".  In TestNG, the traditional way to indicate that an annotation should apply to all test methods is to move this annotation to the class level.

Also, I don't like the fact that "init" methods and "real" test methods are in the same class, so I'd like to use inheritance to provide a cleaner separation of roles.

Therefore, I will restructure my tests like this:

@Test(groups = "init")
public class BaseTest {

  public void correctVM() {}

  public void serverStartedOk() {}
}
This is now the base class for my tests.  The @Test annotation is now on the class, which means it applies to all the public methods inside that class (so there is no need to repeat it on each individual method).  Therefore, each public method automatically becomes part of the group "init".

Next, I write my test class as a subclass of BaseTest:

@Test(dependsOnGroups = { "init.*" })
public class TestServer extends BaseTest {

  public void method1() { ... }

  public void method2() { ... }

  ... 
}
Here again, the @Test annotation is now on the class, which means that it applies to all the public test methods, making it easier to add new testing methods.  Also, since this class extends BaseTest, it "sees" not only the methods that are being inherited, but their annotations as well, so TestNG will include all the methods from the base class that belong to the "init.*" group to determine which methods need to be run first.

What's the overall result?

  • Two classes that have responsibilities that are very clearly delineated.
  • Very easy maintenance and evolution, since adding test methods (whether they are init or "real" test methods) boils down to just adding these methods to the right class (no need to annotate them).  A newcomer doesn't even need to know about the annotations that are needed to get all this to work.
  • A report that will accurately reflect the result of the test runs and will correctly distinguish the real failures from those that are caused by a cascade effect, and whose resolution should therefore be postponed until all the FAILs have been resolved.

 

Posted by cedric at 10:57 AM | Comments (12)

August 18, 2004

Using annotation inheritance for testing

Imagine I am trying to test a server.  In order to do this, my test class will contain the following test methods:

  • Check that we are running the correct JVM.
  • Check that the server started correctly.
  • About twenty methods making various calls to the server.

Obviously, the first two test methods listed above should be run before everything else.  The way you would do this with JUnit is to put the first two methods in the initialization code, which will probably have to be static since JUnit instantiates a new object before each test method invocation, and we can't use setUp() either since it's invoked before each test method.  This initialization code will set two booleans and the twenty test methods will have to test for these two booleans before proceeding, and if one of them is false, it will fail.

Now let's assume that on a test run, I used the wrong JVM.  In this case, JUnit will probably report this in the initialization code and it will then report twenty more failures, one for each test method.

This is very bad for a variety of reasons:

  • The initialization code is not itself a test method.  This is bad because its success or failure will be reported on a separate track.
  • When QA reads the result of this test run, they will see "1 SUCCESS, 21 FAILS" and they will, rightfully, get really scared.
  • Upon reading this result, QA will have to plan for fixing 20 tests, while the reality is that only one test failed.

This is why I believe that a test framework needs to provide support for "dependent test methods", where you can mark test method b() as depending on the successful run of test method a().  If a() failed, then b() will be marked as a SKIP, and not a FAIL.

With such a feature, the test run will be marked "1 SUCCESS, 1 FAIL, 20 SKIPS", which is much more accurate.

Here is how I would write this test using TestNG:

@Test
public void correctVM() {}

@Test
public void serverStartedOk() {}

@Test(dependsOnMethods = { "correctVM", "serverStartedOk" })
public void method1() {}

...

This is a good start, but having to list all the methods that each new test method depends on is error-prone, so instead, let's use TestNG's groups:

@Test(groups = { "init" })
public void correctVM() {}

@Test(groups = { "init" })
public void serverStartedOk() {}

@Test(dependsOnGroups = { "init.*" })
public void method1() {}

...

We have gained a lot of flexibility with groups.  For example, imagine that I want to add another init test method, such as "firewall is on".  All I need to do is add this test method and declare that it is part of the group "init".

Also, note that I used a regular expression in the "dependsOnGroups" declaration, as a reminder that you can actually define several init groups (such as "initOS", "initJVM", etc...) and they will automatically be run before any test method is invoked.

But we can do even better.

In the above example, I don't like the fact that whenever I add a new "real" test method, I need to remember to specify that it depends on the group "init.*".  In TestNG, the traditional way to indicate that an annotation should apply to all test methods is to move this annotation to the class level.

Also, I don't like the fact that "init" methods and "real" test methods are in the same class, so I'd like to use inheritance to provide a cleaner separation of roles.

Therefore, I will restructure my tests like this:

@Test(groups = "init")
public class BaseTest {

  public void correctVM() {}

  public void serverStartedOk() {}
}
This is now the base class for my tests.  The @Test annotation is now on the class, which means it applies to all the public methods inside that class (so there is no need to repeat it on each individual method).  Therefore, each public method automatically becomes part of the group "init".

Next, I write my test class as a subclass of BaseTest:

@Test(dependsOnGroups = { "init.*" })
public class TestServer extends BaseTest {

  public void method1() { ... }

  public void method2() { ... }

  ... 
}
Here again, the @Test annotation is now on the class, which means that it applies to all the public test methods, making it easier to add new testing methods.  Also, since this class extends BaseTest, it "sees" not only the methods that are being inherited, but their annotations as well, so TestNG will include all the methods from the base class that belong to the "init.*" group to determine which methods need to be run first.
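The reflection sketch below shows one way a framework could attribute inherited methods to the group declared on the class that defines them. The @Group annotation and the class names are hypothetical stand-ins invented for this example; they are not TestNG's @Test or its actual lookup logic.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class-level annotation, standing in for a framework's own.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Group {
  String value();
}

@Group("init")
class BaseChecks {
  public void correctVM() {}

  public void serverStartedOk() {}
}

@Group("real")
class ServerChecks extends BaseChecks {
  public void method1() {}

  public void method2() {}
}

class GroupScanner {
  // Maps each public method visible on 'type' to the group of the class that declares it.
  static Map<String, String> groupsOf(Class<?> type) {
    Map<String, String> groups = new TreeMap<String, String>();
    for (Method method : type.getMethods()) {
      // Methods inherited from Object have no @Group on their class and are skipped.
      Group group = method.getDeclaringClass().getAnnotation(Group.class);
      if (group != null) {
        groups.put(method.getName(), group.value());
      }
    }
    return groups;
  }

  public static void main(String[] args) {
    System.out.println(groupsOf(ServerChecks.class));
  }
}
```

Scanning only the subclass is enough to find the two init methods and their group, because getMethods() returns the inherited public methods and each one still points back to the class that declared it.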

What's the overall result?

  • Two classes that have responsibilities that are very clearly delineated.
  • Very easy maintenance and evolution, since adding test methods (whether they are init or "real" test methods) boils down to just adding these methods to the right class (no need to annotate them).  A newcomer doesn't even need to know which annotations are needed to get all this to work.
  • A report that will accurately reflect the result of the test runs and will correctly distinguish the real failures from those that are caused by a cascade effect, and whose resolution should therefore be postponed until all the FAILs have been resolved.

 

Posted by cedric at 09:41 AM

August 17, 2004

Projects on java.net

I have moved the following projects to java.net:

It will make it easier for users to download the latest versions from CVS or use the mailing-lists to ask questions.

Update: if you are looking for a Java project, read this.

 

Posted by cedric at 10:30 AM

August 15, 2004

Typing game

Today is Sunday, no serious stuff, so I give you... a typing game.

Posted by cedric at 01:19 PM | Comments (6)

August 12, 2004

The perils of split()

Can you spot why the following program:

public class Split {
  public static void main(String[] argv) {
    String STRING = "foo  bar";
    String[] s = STRING.split(" ");
    for (int i = 0; i < s.length; i++) {
      System.out.println(i + " '" + s[i] + "'");
    }
  }
}
displays:
0 'foo'
1 ''
2 'bar'

The reason is that split() works a little differently from StringTokenizer:  it accepts a regular expression as a separator.  In the code above, I define this regular expression as " " (one space character) but the input string contains two of them in a row, so split() finds an empty string between them.  Therefore, we can solve this problem by using " +" as the regular expression ("at least one space character").

Still, the fact that split() can return empty strings is deceiving, especially if you are converting your code from StringTokenizer.
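For comparison, here is a small sketch that runs the same input through split(" "), split(" +") and StringTokenizer. The class and helper names are only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class SplitDemo {
  // Collects the tokens StringTokenizer produces with a space delimiter.
  static List<String> withTokenizer(String input) {
    List<String> tokens = new ArrayList<String>();
    StringTokenizer tokenizer = new StringTokenizer(input, " ");
    while (tokenizer.hasMoreTokens()) {
      tokens.add(tokenizer.nextToken());
    }
    return tokens;
  }

  public static void main(String[] args) {
    String input = "foo  bar"; // two spaces between the words
    System.out.println(Arrays.asList(input.split(" ")));  // three elements, the middle one empty
    System.out.println(Arrays.asList(input.split(" +"))); // two elements
    System.out.println(withTokenizer(input));             // two elements
  }
}
```

StringTokenizer treats a run of delimiters as a single separator, which is why code ported from it to split(" ") can suddenly start seeing empty strings.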

There are a couple of good things about this behavior, though:

  • You can reconstitute the original string if you need to.
  • It makes it easier to parse strings with records that can be empty, such as lines from a log file.

Can you think of any other use?

Posted by cedric at 10:41 AM | Comments (3)

Eclipse plug-in for generics

I have just added support for concrete collections to J15:  use Ctrl-1 on the creation of a concrete collection and J15 will add the correct Generic parameters.

Before:
List l2 = new ArrayList();
Float n2 = new Float(42); 
l2.add(n2);
After:
ArrayList<Float> l2 = new ArrayList<Float>();
Float n2 = new Float(42); 
l2.add(n2);
Before:
Map m = new HashMap();
m.put(new Integer(42), "Cedric");
After:
HashMap<Integer, String> m = new HashMap<Integer, String>();
m.put(new Integer(42), "Cedric");
Let me know if you can think of other conversions that this plug-in could make for you.

Posted by cedric at 09:42 AM | Comments (1)

August 11, 2004

J15: an Eclipse plug-in to migrate to the JDK 1.5

I have been converting a lot of code to the new JDK 1.5 constructs recently and I decided it was too tedious, so I wrote a quick Eclipse plug-in to assist me.  It's called J15 ("J One Five") and I am happy to announce its first release.

Right now, J15 can convert for loops on arrays and collections but I have quite a few other enhancements in mind which I will disclose as I implement them.
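For readers who have not seen the new syntax, this is the kind of rewrite being described: an index-based loop over an array and an Iterator-based loop over a collection, each next to its JDK 1.5 "for each" equivalent. The code is a hand-written illustration with made-up names, not actual output from the plug-in.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForLoopMigration {
  // Before: explicit index over an array.
  static int sumBefore(int[] values) {
    int sum = 0;
    for (int i = 0; i < values.length; i++) {
      sum += values[i];
    }
    return sum;
  }

  // After: enhanced for loop over the same array.
  static int sumAfter(int[] values) {
    int sum = 0;
    for (int value : values) {
      sum += value;
    }
    return sum;
  }

  // Before: raw Iterator and a cast, as in pre-1.5 code.
  static int totalLengthBefore(List words) {
    int total = 0;
    for (Iterator it = words.iterator(); it.hasNext();) {
      String word = (String) it.next();
      total += word.length();
    }
    return total;
  }

  // After: enhanced for loop over a generic collection.
  static int totalLengthAfter(List<String> words) {
    int total = 0;
    for (String word : words) {
      total += word.length();
    }
    return total;
  }

  public static void main(String[] args) {
    int[] values = { 1, 2, 3 };
    List<String> words = Arrays.asList("foo", "bar");
    System.out.println(sumBefore(values) + " " + sumAfter(values));
    System.out.println(totalLengthBefore(words) + " " + totalLengthAfter(words));
  }
}
```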

In the meantime, please test it and let me know how it works for you, as I am confident there are quite a few cases that I haven't covered (manipulating a Java Abstract Syntax Tree is fun but it produces very ugly code...).

By the way, you will need a recent Eclipse build to run it (since it needs to support these new constructs).  I am currently running N20040806 which passed all the Windows tests.

Posted by cedric at 10:30 AM | Comments (3)

August 09, 2004

EJB 3 callbacks

We are currently trying to figure out how callbacks should be implemented with EJB 3.  So far, we have identified five different techniques and I'd like to get some feedback from readers about them.  Here is a quick rundown, edited from messages sent by Marc Fleury and Craig Russell, along with their respective pros and cons:

  1. Magic callbacks.  Method signature is the same as a container-defined one, e.g. public void ejbPassivate();
    +    Very lightweight, no template code, no base class, no interface dependency.
    -     Easily breakable (developers will mistype), doesn't express container dependency.
     
  2. Interface. Current state in 2.x spec.
    +   Clear dependency expression, not breakable.
    -    Lot of template code (today's code is cluttered with many lines of callback template code that does nothing).
     
  3. Base class. Provide template code in base class, have developers extend from it.
    +   No template code, not breakable.
    -    Single inheritance in java makes this base class a no-no.
     
  4. Annotations. Annotate any method as a callback (as in the example of "@remove" that Linda already showed)
    +    No template code, not breakable (annotations can be supported in IDE/compile), any name.
    -    ?
     
  5. EntityManager callbacks as opposed to POJO callbacks. e.g.

    EntityManager.registerLifeCycleCallbackListener
      (LifeCycleListener listener, Class classOfInterest, long lifeCycleEventsOfInterest);

    +    No interference with POJO class.
    +    Exploits well-known event listener paradigm.
    +    Very lightweight, only called if specific class/life cycle event happens.
    +/-   "aspect oriented".
    -    Not object-oriented (code outside POJO affects POJO).
    -    How can we extend this to Session beans?
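To make the trade-off between options 1 and 4 more tangible, here is a reflection sketch of how a container might locate each kind of callback. The @OnPassivate annotation, the bean classes and CallbackInvoker are all hypothetical names invented for this example; none of them comes from an EJB specification or from the proposals quoted above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical callback marker for illustration; not an annotation from any EJB spec.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface OnPassivate {}

// Option 1: relies on the exact method name, which is mistyped here on purpose.
class MagicNameBean {
  boolean saved = false;

  public void ejbPasivate() {
    saved = true;
  }
}

// Option 4: any method name works as long as it carries the annotation.
class AnnotatedBean {
  boolean saved = false;

  @OnPassivate
  public void saveState() {
    saved = true;
  }
}

class CallbackInvoker {
  // Looks up the "magic" name; returns false if the bean has no such method.
  static boolean invokeByName(Object bean) {
    try {
      Method method = bean.getClass().getMethod("ejbPassivate");
      method.invoke(bean);
      return true;
    } catch (NoSuchMethodException e) {
      return false; // the typo goes unnoticed: the callback is simply never called
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Invokes every public method marked with @OnPassivate; returns how many were called.
  static int invokeAnnotated(Object bean) {
    int called = 0;
    for (Method method : bean.getClass().getMethods()) {
      if (method.isAnnotationPresent(OnPassivate.class)) {
        try {
          method.invoke(bean);
        } catch (ReflectiveOperationException e) {
          throw new IllegalStateException(e);
        }
        called++;
      }
    }
    return called;
  }
}
```

With the name-based lookup, the misspelled ejbPasivate() is silently ignored; with the annotation, the method name no longer matters and a misspelled annotation would not even compile.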

Can you think of other options?  Which one would you favor?

 

Posted by cedric at 10:19 AM | Comments (19)

August 06, 2004

New TestNG, now with groups of groups!

I have just released TestNG 0.9.  Among the new features, the most interesting one is "groups of groups".  I'll quote the documentation directly:

Groups can also include other groups.  For example, you might want to define a group "all" that includes "checkintest" and "functest".  "functest" itself will contain the groups "windows" and "linux" while "checkintest" will only contain "windows".  Here is how you would define this in your property file:

testng.group.functest = windows linux
testng.group.checkintest = windows
testng.group.all = functest checkintest

testng.includedGroups = all
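One way to picture what such a definition means is to flatten it: starting from "all", follow each nested group until only groups with no definition of their own remain. The sketch below does that for the example above. GroupResolver is a hypothetical helper written for this illustration, and it makes no claim about how TestNG resolves groups internally.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class GroupResolver {
  // Returns the groups reachable from 'group' that have no definition of their own.
  static Set<String> expand(String group, Map<String, List<String>> definitions) {
    Set<String> leaves = new TreeSet<String>();
    collect(group, definitions, leaves, new HashSet<String>());
    return leaves;
  }

  private static void collect(String group, Map<String, List<String>> definitions,
      Set<String> leaves, Set<String> seen) {
    if (!seen.add(group)) {
      return; // already visited: avoids repeated work and guards against cycles
    }
    List<String> members = definitions.get(group);
    if (members == null) {
      leaves.add(group); // not defined in terms of other groups
      return;
    }
    for (String member : members) {
      collect(member, definitions, leaves, seen);
    }
  }

  // The definitions from the property file above.
  static Map<String, List<String>> sampleDefinitions() {
    Map<String, List<String>> definitions = new HashMap<String, List<String>>();
    definitions.put("functest", Arrays.asList("windows", "linux"));
    definitions.put("checkintest", Arrays.asList("windows"));
    definitions.put("all", Arrays.asList("functest", "checkintest"));
    return definitions;
  }

  public static void main(String[] args) {
    Map<String, List<String>> definitions = sampleDefinitions();
    System.out.println("all         -> " + expand("all", definitions));
    System.out.println("checkintest -> " + expand("checkintest", definitions));
  }
}
```

Under this reading, "all" flattens to "linux" and "windows", while "checkintest" flattens to "windows" only.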
I have also added new Javadocs and more code excerpts to the documentation.  For all the details, please refer to the TestNG main page.

Posted by cedric at 01:16 PM | Comments (0)

August 04, 2004

Spam software review, part 1

I have tried several anti-spam clients for Outlook recently; here are a few reviews.

I'll start with SpamBayes, the open-source Python add-in for Outlook.

First of all, I have to say I like the idea of an Outlook add-in written in Python.  I am not a big fan of Python myself (here is  why), preferring Groovy and Ruby, but this is a testament to the goodness of COM/.Net which gives you a lot of flexibility on your language of choice.

My initial contact with SpamBayes was pretty good but unfortunately, the honeymoon didn't last long.  After a few weeks, I became annoyed by the following shortcomings:

  • SpamBayes is slow.  I don't know if it's due to Python (or its executable compiler) or the code itself, but you can clearly see it processing a message.  While it's usually not an issue for individual messages, you will feel the pain when you haven't launched Outlook in several days and SpamBayes suddenly finds itself in front of over one hundred messages to filter.
     
  • Even after a few weeks of training, a high number of emails (about twenty per day) still ended up in the "Junk suspects" folder even though they were obviously spam that should have been detected by the Bayesian algorithm (meaning:  they were of reasonable size, in plain English text and contained quite a few keywords that should have made the filter take immediate notice).
     
  • But the number one reason that made me decide to give up is:  SpamBayes doesn't have a white list.

First of all, I was quite put off by the attitude of the developers when I asked for that feature.  The responses were typically along the lines of "SpamBayes doesn't need a white list, it's doing a great job already", "I've never needed it" and "Why don't you add the feature yourself?".

This is not the kind of response you typically get from commercial vendors, but well, the software is free so there is not much I can do.

But the worst part of this shortcoming is that it shows that the SpamBayes authors don't understand that a spam filter is simply useless without a white list.

It took me just a few days to realize this, when I started exchanging important emails with someone who tends to write very short, poorly-formatted emails that the filter was absolutely incapable of training against.  Despite all my efforts, this person's emails regularly ended up in the "Junk suspects" folder or, worse, in "Junk".

Another example a few days later: emails from a member of my family, which are tagged with several lines of self promotion / advertising at the bottom and which the filter systematically interpreted as spam.

After a few weeks of use, I realized that I just didn't trust my spam filter.  I kept dreading that I would miss an important email and therefore, applied extra caution when perusing my "Junk suspects" folder, which completely defeats the purpose of such a tool.  Added to the fact that SpamBayes doesn't offer extra goodies such as statistics or challenge/response, it became clear to me it was time to look for another option.

Next:  MailFrontier's Matador (and after that, IHateSpam).

Posted by cedric at 07:50 AM | Comments (12)

August 03, 2004

Are we winning the war against spam?

The latest spam that made it to my Inbox looks like this:

sielkundiges
xhdgkcuh`ezapatascratcht

u,_s^.a ph,a.-rm & next-d.~ay sh-'ipp^'ing




nagstoel
etaicurcinch-pounds http://...

In order to get past my antispam software, the spammers now need to disfigure their message to the point where it's barely legible.  When I see this, I start thinking that maybe we are slowly beginning to win this war...

Now if only we could find a way for these spams not to be sent in the first place...

 

Posted by cedric at 09:55 AM | Comments (5)