Extreme Perl: Chapter 13: Unit Testing
An Evolving Book about Extreme Programming with Perl

A successful test case is one that detects an as-yet undiscovered error.

-- Glenford Myers[1]

The second and third examples test a post office protocol (POP3) client available from CPAN. These two unit tests for Mail::POP3Client indicate some design issues, which are addressed in the Refactoring chapter. The third example also demonstrates how to use Test::MockObject, a CPAN module that makes it easy to test those tricky paths through the code, such as error cases.

Testing Isn't Hard

One of the common complaints I've heard about testing is that it is too hard for complex APIs, and the return on investment is therefore too low. The problem, of course, is that the more complex the API, the more it needs to be tested in isolation. The rest of the chapter demonstrates a few tricks that simplify testing complex APIs. What I've found, however, is that the more testing I do, the easier it is to write tests, especially for complex APIs.

Testing is also infectious. As your suite grows, there are more examples to learn from, and the harder it becomes to not test. Your test infrastructure also evolves to better match the language of your APIs. Once and only once applies to test software, too. This is how Bivio::Test came about. We were tired of repeating ourselves. Bivio::Test lets us write subject matter oriented programs, even for complex APIs.

Mail::POP3Client

The POP3 protocol[2] is a common way for mail user agents to retrieve messages from mail servers. As is often the case, there's a CPAN module available that implements this protocol.

Mail::POP3Client[3] has been around for a few years. The unit test shown below was written in the spirit of test first programming. Some of the test cases fail, and in Refactoring, we refactor Mail::POP3Client to make it easier to fix some of the defects found here.

This unit test shows how to test an interface that uses sockets to connect to a server and has APIs that write files. This test touches on a number of test and API design issues.

To minimize page flipping, the test is broken into pieces, one part per section. The first two sections discuss initialization and data selection. In Validate Basic Assumptions First and the next section, we test that the server capabilities and authentication mechanisms match our assumptions. We test basic message retrieval starting in Distinguish Error Cases Uniquely, followed by retrieving to files. The List, ListArray, and Uidl methods are tested in Relate Results When You Need To. Destructive tests (deletion) occur next, after we have finished testing retrieval and listing. We validate the accessors (Host, Alive, etc.) in Consistent APIs Ease Testing. The final test cases cover failure injection.

Make Assumptions

use strict;
use Test::More tests => 85;
use IO::File;
use IO::Scalar;
BEGIN {
    use_ok('Mail::POP3Client');
}

my($cfg) = {
    HOST => 'localhost',
    USER => 'pop3test',
    PASSWORD => 'password',
};

To access a POP3 server, you need an account, password, and the name of the host running the server. We made a number of assumptions to simplify the test without compromising the quality of the test cases. The POP3 server on the local machine must have an account pop3test, and it must support APOP, CRAM-MD5, CAPA, and UIDL.

The test that comes with Mail::POP3Client provides a way of configuring the POP3 configuration via environment variables. This makes it easy to run the test in a variety of environments. The purpose of that test is to test the basic functions on any machine. For a CPAN module, you need this to allow anybody to run the test. A CPAN test can't make a lot of assumptions about the execution environment.

In test-first programming, the most important step is writing the test. Make all the assumptions you need to get the test written and working. Do the simplest thing that could possibly work, and assume you aren't going to need to write a portable test. If you decide to release the code and test to CPAN, relax the test constraints after your API works. Your first goal is to create the API which solves your customer's problem.
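If the test is later released to CPAN, one way to relax the assumptions is to let environment variables override the hard-coded values. Here's a minimal sketch. The variable names POP3_TEST_HOST, POP3_TEST_USER, and POP3_TEST_PASSWORD and the helper pop3_test_config are hypothetical; they are not the names the test bundled with Mail::POP3Client uses.

```perl
use strict;
use warnings;

# Hypothetical environment variable names. The defaults are the
# assumptions this chapter's test makes about the local server.
sub pop3_test_config {
    my($env) = @_;
    return {
        HOST => defined($env->{POP3_TEST_HOST})
            ? $env->{POP3_TEST_HOST} : 'localhost',
        USER => defined($env->{POP3_TEST_USER})
            ? $env->{POP3_TEST_USER} : 'pop3test',
        PASSWORD => defined($env->{POP3_TEST_PASSWORD})
            ? $env->{POP3_TEST_PASSWORD} : 'password',
    };
}

my($cfg) = pop3_test_config(\%ENV);
```

Passing the environment in as a parameter keeps the helper itself easy to test with a plain hash.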

Test Data Dependent Algorithms

my($subject) = "Subject: Test Subject";
my($body) = <<'EOF';
Test Body
A line with a single dot follows
.
And a dot and a space
. 
EOF

open(MSG, "| /usr/lib/sendmail -i -U $cfg->{USER}\@$cfg->{HOST}")
    or die("cannot start sendmail: $!");
print(MSG $subject . "\n\n" . $body);
close(MSG)
    or die("sendmail failed: $!");
sleep(1);

my($body_lines) = [split(/\n/, $body)];
$body = join("\r\n", @$body_lines, '');


The POP3 protocol uses a dot (.) to terminate multi-line responses. To make sure Mail::POP3Client handles dots correctly, we put leading dots in the message body. The message should be retrieved in its entirety, including the lines with dots. It's important to test data dependencies like this.
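The data dependency comes from the byte-stuffing rule in the POP3 RFC: the server prefixes an extra dot to any line that begins with a dot, and the client strips it again after reading up to the lone-dot terminator. The sketch below models both sides on plain lists of lines without line endings. The names stuff_lines and unstuff_lines are made up for illustration and are not part of Mail::POP3Client.

```perl
use strict;
use warnings;

# Server side: add an extra dot to every line that starts with a dot.
sub stuff_lines {
    my(@stuffed) = @_;
    s/^\./../ foreach @stuffed;
    return @stuffed;
}

# Client side: stop at the lone-dot terminator and remove the extra
# leading dot from stuffed lines.
sub unstuff_lines {
    my(@result);
    foreach my $line (@_) {
        last if $line eq '.';
        my($copy) = $line;
        $copy =~ s/^\.//;
        push(@result, $copy);
    }
    return @result;
}

my(@body) = ('Test Body', '.', '. ', 'last line');
my(@wire) = (stuff_lines(@body), '.');
my(@received) = unstuff_lines(@wire);
```

A client that forgets to unstuff, or that treats the stuffed single dot as the terminator, would return a different list than the one sent.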

The test only sends one message. This is sufficient to validate the client implementation. Testing the server, however, would be much more complex, and would require multiple clients, messages, and message sizes.

The sleep(1) is used to give sendmail time to deliver the message before the test starts.

Validate Basic Assumptions First

my($pop3) = Mail::POP3Client->new(HOST => $cfg->{HOST});
$pop3->Connect;
is($pop3->State, 'AUTHORIZATION');
like($pop3->Capa, qr/UIDL.*CRAM.*|CRAM.*UIDL/is);
ok($pop3->Close);

The first case group validates some assumptions used in the rest of the cases. It's important to put these first to aid debugging. If the entire test fails catastrophically (due to a misconfigured server, for example), it's much easier to diagnose the errors when the basic assumptions fail first.

Bivio::Test allows you to ignore the return result of conformance cases by specifying undef. The return value of Connect is not well-defined, so it's unimportant to test it, and the test documents the way the API works.

This case raises a design issue. Perl subroutines always return a value. Connect does not have an explicit return statement, which means it returns the value of whatever expression it happens to evaluate last, an effectively arbitrary value. Perl has no void return type like C and Java do. It's always safe to put in an explicit return; in subroutines when you don't intend to return anything. This helps ensure predictable behavior in any calling context, and improves testability.
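To make the difference concrete, here's a small sketch with two hypothetical subroutines (they are not the Mail::POP3Client implementation). One leaks the value of its last expression, and the other uses a bare return.

```perl
use strict;
use warnings;

my(%state);

# No explicit return: the caller sees the value of the last expression,
# which here is the result of the assignment.
sub connect_implicit {
    $state{connected} = 1;
}

# Bare return: undef in scalar context, the empty list in list context.
sub connect_explicit {
    $state{connected} = 1;
    return;
}

my($implicit) = scalar(connect_implicit());
my($scalar) = scalar(connect_explicit());
my(@list) = connect_explicit();
```

The implicit version returns 1 only by accident. Change the last statement and the return value changes with it.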

The second case tests that the server supports CAPA (capabilities), UIDL (unique identifiers), and CRAM (challenge/response authentication). The capability list is unordered, so we check the list for UIDL then CRAM or the reverse. Bivio::Test allows us to specify a Regexp instance (qr//) as the expected value. The case passes if the expected regular expression matches the actual return, which is serialized by Data::Dumper.
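An alternative to spelling out both orders is one lookahead per capability, which scales better if you need to check three or four names. This is a sketch, not what the test uses, and the capability responses are illustrative strings rather than output captured from a real server.

```perl
use strict;
use warnings;

# One lookahead per capability, so the order in the response is irrelevant.
my($both) = qr/^(?=.*\bUIDL\b)(?=.*\bCRAM-MD5\b)/is;

# Illustrative capability responses (assumed, not from a real server).
my($uidl_first) = "TOP\r\nUIDL\r\nSASL CRAM-MD5 LOGIN\r\n";
my($cram_first) = "SASL CRAM-MD5 LOGIN\r\nUIDL\r\nTOP\r\n";
my($no_cram) = "TOP\r\nUIDL\r\nSASL LOGIN\r\n";

# 1 for each response that advertises both capabilities, else 0.
my(@matches) = map({$_ =~ $both ? 1 : 0}
    $uidl_first, $cram_first, $no_cram);
```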

Validate Using Implementation Knowledge

foreach my $mode (qw(BEST APOP CRAM-MD5 PASS)) {
    $pop3 = Mail::POP3Client->new(%$cfg, AUTH_MODE => $mode);
    is_deeply([$pop3->Body(1)], $body_lines);
    is($pop3->Close, 1);
}

$pop3 = Mail::POP3Client->new(%$cfg, AUTH_MODE => 'BAD-MODE');
like($pop3->Message, qr/BAD-MODE/);
is($pop3->State, 'AUTHORIZATION');
is($pop3->Close, 1);

$pop3 = Mail::POP3Client->new(
    %$cfg, AUTH_MODE => 'BEST', PASSWORD => 'BAD-PASSWORD');
like($pop3->Message, qr/PASS failed/);
is($pop3->State, 'AUTHORIZATION');
is($pop3->Close, 1);

$pop3 = Mail::POP3Client->new(
    %$cfg, AUTH_MODE => 'APOP', PASSWORD => 'BAD-PASSWORD');
like($pop3->Message, qr/APOP failed/);
is($pop3->Close, 1);


Once we have validated the server's capabilities, we test the authentication interface. Mail::POP3Client defaults to AUTH_MODE BEST, but we test each mode explicitly here. The other cases test the default mode. To be sure authentication was successful, we download the body of the first message and compare it with the value we sent. POP3 authentication implies authorization to access your messages. We only know we are authorized if we can access the mail user's data.

In BEST mode the implementation tries all authentication modes with PASS as the last resort. We use knowledge of the implementation to validate that PASS is the last mode tried. The Message method returns PASS failed, which gives the caller information about which AUTH_MODE was used.

The test doesn't know the details of the conversation between the server and client, so it assumes the implementation doesn't have two defects (using PASS when it shouldn't and returning incorrect Message values). We'll see in Mock Objects how to address this issue without such assumptions.

The authentication conformance cases are incomplete, because there might be a defect in the authentication method selection logic. We'd like to know that if we specify APOP, Mail::POP3Client doesn't try PASS first. The last case group in this section attempts to test this, and uses the knowledge that Message returns APOP failed when APOP fails. Again, it's unlikely Message will return the wrong error message.

Distinguish Error Cases Uniquely

sub _is_match {
    my($actual, $expect) = @_;
    return ref($expect) eq 'Regexp'
        ? like(ref($actual) ? join('', @$actual) : $actual, $expect)
        : is_deeply($actual, $expect);
}

$pop3 = Mail::POP3Client->new(%$cfg);
foreach my $params (
    [Body => $body_lines],
    [Head => qr/\Q$subject/],
    [HeadAndBody => qr/\Q$subject\E.*\Q$body_lines->[0]/s],
) {
    my($method, $expect) = @$params;
    _is_match([$pop3->$method(1)], $expect);
    is($pop3->Message(''), '');
    is_deeply([$pop3->$method(999)], []);
    like($pop3->Message, qr/No such message|Bad message number/i);
}


The Body method returns the message body, Head returns the message head, and HeadAndBody returns the entire message. We assume that 999 is a valid message number and that there aren't 999 messages in the mailbox.

Body returns an empty array when a message is not found. Should Body return something else or die in the deviance case? I think so. Otherwise, an empty message body is indistinguishable from a message which isn't found. The deviance test identifies this design issue. That's one reason why deviance tests are so important.

To work around this problem, we clear the last error Message saved in the Mail::POP3Client instance before calling the download method. We then validate that Message is set (non-blank) after the call.

The test case turned out to be successful unexpectedly. It detected a defect in Message: You can't clear an existing Message. This is a side-effect of the current test, but a defect nonetheless. One advantage of validating the results of every call is that you get bonuses like this without trying.
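Here's the ambiguity in miniature. The sketch uses a hypothetical in-memory mailbox and two made-up routines, body_ambiguous and body_or_die. It is not how Mail::POP3Client is implemented, only an illustration of the two API choices.

```perl
use strict;
use warnings;

# Hypothetical mailbox: message 1 exists and has an empty body.
my(%mailbox) = (1 => []);

# Ambiguous: an empty list means "empty body" and "no such message".
sub body_ambiguous {
    my($num) = @_;
    return @{$mailbox{$num} || []};
}

# Distinct: a missing message dies, so an empty list can only mean
# an empty body.
sub body_or_die {
    my($num) = @_;
    die("No such message: $num\n")
        unless exists($mailbox{$num});
    return @{$mailbox{$num}};
}

my(@empty) = body_ambiguous(1);
my(@missing) = body_ambiguous(999);
my($ok) = eval { body_or_die(999); 1 };
my($error) = $@;
```

With the first routine, the caller can't tell the two results apart without a side channel like Message. With the second, the deviance case needs no extra state.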

Avoid Context Sensitive Returns

foreach my $params (
    [Body => $body],
    [Head => qr/\Q$subject/],
    [HeadAndBody => qr/\Q$subject\E.*\Q$body/s],
) {
    my($method, $expect) = @$params;
    _is_match(scalar($pop3->$method(1)), $expect);
    is(scalar($pop3->$method(999)), undef);
}


When Body, Head, and HeadAndBody are invoked in a scalar context, the result is a single string, and undef is returned on errors, which simplifies deviance testing. (Note that Bivio::Test distinguishes undef from [undef]. The former ignores the result, and the latter expects a single-valued result of undef.)

Bivio::Test invokes methods in a list context by default. Setting want_scalar forces a scalar context. This feature was added to test non-bOP classes like Mail::POP3Client.

In bOP, methods are invocation context insensitive. Context sensitive returns like Body are problematic.[4] We use wantarray to ensure methods that return lists behave identically in scalar and list contexts. In general, we avoid list returns, and return array references instead.
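The following sketch contrasts the two styles. Both routines are hypothetical stand-ins, not code from Mail::POP3Client or bOP.

```perl
use strict;
use warnings;

my(@lines) = ('line one', 'line two');

# Context sensitive: a list of lines in list context, one joined
# string in scalar context.
sub body_by_context {
    return wantarray ? @lines : join("\r\n", @lines, '');
}

# Context insensitive: always a single array reference.
sub body_as_ref {
    return [@lines];
}

my(@list) = body_by_context();
my($string) = scalar(body_by_context());
my($ref_from_list) = (body_as_ref())[0];
my($ref_from_scalar) = scalar(body_as_ref());
```

The first routine needs two sets of test cases, one per context. The second needs one, because the caller gets the same structure either way.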

Use IO::Scalar for Files

foreach my $params (
    [BodyToFile => $body],
    [HeadAndBodyToFile => qr/\Q$subject\E.*\Q$body/s],
) {
    my($method, $expect) = @$params;
    my($buf) = '';
    is($pop3->$method(IO::Scalar->new(\$buf), 1), 1);
    _is_match($buf, $expect);
}


BodyToFile and HeadAndBodyToFile accept a file glob to write the message parts. This API design is easily testable with the use of IO::Scalar, an in-memory file object. It avoids file naming and disk clean up issues.

We create the IO::Scalar instance in compute_params, which Bivio::Test calls before each method invocation. check_return validates that the method returned true, and then calls actual_return to set the return value to the contents of the IO::Scalar instance. It's convenient to let Bivio::Test perform the structural comparison for us.
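If IO::Scalar isn't installed, Perl's built-in in-memory file handles (open on a scalar reference, available since Perl 5.8) do a similar job. This sketch swaps in that technique. The write_body routine is a hypothetical stand-in for a method like BodyToFile, not the real one.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method that writes a message to a handle.
# Returns 1 on success and 0 if any print fails.
sub write_body {
    my($fh, @lines) = @_;
    foreach my $line (@lines) {
        print {$fh} $line, "\r\n"
            or return 0;
    }
    return 1;
}

my($buf) = '';
open(my $fh, '>', \$buf)
    or die("in-memory open failed: $!");
my($result) = write_body($fh, 'Test Body', '.');
close($fh);
```

After the call, $buf holds exactly what was written, so the test can compare it with the expected text. There are no file names to invent and nothing to clean up.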

Perturb One Parameter per Deviance Case

foreach my $method (qw(BodyToFile HeadAndBodyToFile)) {
    is($pop3->$method(IO::Scalar->new(\('')), 999), 0);
    my($handle) = IO::File->new('> /dev/null');
    $handle->close;
    is($pop3->$method($handle, 1), 0);
}


We test an invalid message number and a closed file handle[5] in two separate deviance cases. You shouldn't perturb two unrelated parameters in the same deviance case, because you won't know which parameter causes the error.
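Here's the closed file handle perturbation on its own, as a sketch. The write_line routine is hypothetical. Only the handle differs between the two calls, so a failure can only be blamed on the handle.

```perl
use strict;
use warnings;
use File::Spec;
use IO::File;

# Hypothetical writer: returns 1 on success and 0 when print fails.
sub write_line {
    my($fh, $line) = @_;
    # Printing to a closed handle normally warns; the test expects it.
    no warnings qw(closed);
    my $ok = print {$fh} $line, "\n";
    return $ok ? 1 : 0;
}

my $open = IO::File->new(File::Spec->devnull, 'w')
    or die("open failed: $!");
my $closed = IO::File->new(File::Spec->devnull, 'w')
    or die("open failed: $!");
$closed->close;

# Same line both times; only the handle is perturbed.
my($good) = write_line($open, 'x');
my($bad) = write_line($closed, 'x');
```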

The second case uses a one-time compute_params closure in place of a list of parameters. Idioms like this simplify the programmer's job. Subject matter oriented programs use idioms to eliminate repetitious boilerplate that obscures the subject matter. At the same time, idioms create a barrier to understanding for outsiders. The myriad Bivio::Test idioms may seem overwhelming at first. For the test-first programmer, Bivio::Test clears away the clutter so you can see the API in action.

Relate Results When You Need To

foreach my $method (qw(Uidl List ListArray)) {
    my($first) = ($pop3->$method())[$method eq 'List' ? 0 : 1];
    ok($first);
    is_deeply([$pop3->$method(1)], [$first]);
    is_deeply([$pop3->$method(999)], []);
}


Uidl (Unique ID List), List, and ListArray return lists of information about messages. Uidl and ListArray lists are indexed by message number (starting at one, so the zeroth element is always undef). The values of these lists are the message's unique ID and size, respectively. List returns a list of unparsed lines with the zeroth being the first line. All three methods also accept a single message number as a parameter, and return the corresponding value. There's also a scalar return case which I didn't include for brevity in the book.

The first case retrieves the entire list, and saves the value for the first message. As a sanity check, we make sure the value is non-zero (true). This is all we can guarantee about the value in all three cases.

The second case requests the value for the first message from the POP3 server, and validates this value agrees with the value saved from the list case. The one-time check_return closure defers the evaluation of $_SAVE until after the list case sets it.

We cross-validate the results, because the expected values are unpredictable. Unique IDs are server specific, and message sizes include the head, which also is server specific. By relating two results, we are ensuring two different execution paths end in the same result. We assume the implementation is reasonable, and isn't trying to trick the test. These are safe assumptions in XP, since the programmers write both the test and implementation.
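The idea can be sketched against a fake server. Everything here is hypothetical: the %size_of table stands in for server state, and list_sizes and size_for stand in for the whole-list and single-message paths of a method like ListArray.

```perl
use strict;
use warnings;

# Hypothetical server state: message sizes keyed by message number.
my(%size_of) = (1 => 419, 2 => 1024);

# Whole-list path: indexed by message number, so element zero is undef.
sub list_sizes {
    my(@sizes);
    $sizes[$_] = $size_of{$_} foreach keys(%size_of);
    return @sizes;
}

# Single-message path: the empty list when the message doesn't exist.
sub size_for {
    my($num) = @_;
    return exists($size_of{$num}) ? ($size_of{$num}) : ();
}

# The test doesn't know the sizes; it only relates the two paths.
my($first) = (list_sizes())[1];
my(@single) = size_for(1);
my(@missing) = size_for(999);
```

The test never mentions 419. It only asserts that $first is true and that the single-message path agrees with it.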

Order Dependencies to Minimize Test Length

my($count) = $pop3->Count();
ok($count >= 1);
is($pop3->Delete(1), 1);
is($pop3->Delete(999), 0);
$pop3->Reset;
is($pop3->Close, 1);
$pop3->Connect;
is($pop3->Count, $count);
# Clear mailbox, which also cleans up aborted or bad test runs
foreach my $i (1 .. $count) {
    $pop3->Delete($i);
}
is($pop3->Close, 1);
$pop3->Connect;
is($pop3->Count, 0);
is($pop3->Close, 1);


We put the destructive cases (Delete) near the end. The prior tests all need a message in the mailbox. If we tested delete first, we'd have to resend a message to test the retrieval and list methods. The case ordering reduces test length and complexity.

Note that we cannot guarantee anything about Count except that it is at least one. A prior test run may have aborted prematurely and left another message in the test mailbox. What we do know is that if we Delete all messages from one to Count, the mailbox should be empty. The second half of this case group tests this behavior.

The empty mailbox case is important to test, too. By deleting all messages and trying to log in, we'll see how Mail::POP3Client behaves in this case.

Yet another reason to delete all messages is to reset the mailbox to a known state, so the next test run starts with a clean slate. This self-maintaining property is important for tests that access persistent data. Rerun the entire test twice in a row, and the second run should always be correct.

The POP3 protocol doesn't remove messages when Delete is called. The messages are marked for deletion, and the server deletes them on successful Close. Reset clears any deletion marks. We cross-validate the first Count result with the second to verify Reset does what it is supposed to do.
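A toy model makes the mark, reset, and commit sequence easier to see. ToyMailbox is a hypothetical class written for this illustration. Its methods are named after the POP3 commands DELE, RSET, and QUIT, and it models only the deletion semantics described above.

```perl
use strict;
use warnings;

package ToyMailbox;

sub new {
    my($class, @messages) = @_;
    return bless({messages => [@messages], marked => {}}, $class);
}

# DELE: mark only. Fails for a bad number or an already marked message.
sub dele {
    my($self, $num) = @_;
    return 0
        if $num < 1 || $num > @{$self->{messages}}
        || $self->{marked}->{$num};
    $self->{marked}->{$num} = 1;
    return 1;
}

# RSET: clear all deletion marks.
sub rset {
    my($self) = @_;
    $self->{marked} = {};
    return 1;
}

# QUIT: remove the marked messages for good.
sub quit {
    my($self) = @_;
    my($n) = 0;
    $self->{messages}
        = [grep(!$self->{marked}->{++$n}, @{$self->{messages}})];
    $self->{marked} = {};
    return 1;
}

# Number of messages actually stored, ignoring marks.
sub stored {
    my($self) = @_;
    return scalar(@{$self->{messages}});
}

package main;

my($box) = ToyMailbox->new('first', 'second');
$box->dele(1);
$box->rset;
$box->quit;
my($after_reset) = $box->stored;     # 2: the mark was cleared
$box->dele($_) foreach 1 .. $box->stored;
$box->quit;
my($after_delete) = $box->stored;    # 0: both marks were committed
```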

Consistent APIs Ease Testing

$pop3 = Mail::POP3Client->new;
is($pop3->State, 'DEAD');
is($pop3->Alive, '');
is($pop3->Host($cfg->{HOST}), $cfg->{HOST});
is($pop3->Host, $cfg->{HOST});
$pop3->Connect;
is($pop3->Alive, 1);
is($pop3->State, 'AUTHORIZATION');
is($pop3->User($cfg->{USER}), $cfg->{USER});
is($pop3->User, $cfg->{USER});
is($pop3->Pass($cfg->{PASSWORD}), $cfg->{PASSWORD});
is($pop3->Pass, $cfg->{PASSWORD});
is($pop3->Login, 0);
is($pop3->State, 'TRANSACTION');
is($pop3->Alive, 1);
is($pop3->Close, 1);
is($pop3->Alive, '');
is($pop3->Close, 0);

$pop3 = Mail::POP3Client->new;
$pop3->Connect;
is($pop3->Alive, '');
is($pop3->Login, 0);
is($pop3->State, 'DEAD');


This section not only tests the accessors, but also documents the State and Alive transitions after calls to Connect and Login.

There's a minor design issue to discuss. The accessor Pass does not match its corresponding named parameter, PASSWORD, like the Host and User do. The lack of uniformity makes using a map function for the accessor tests cumbersome, so we didn't bother.

The non-uniform return values of Alive and Close are also clear. While the empty list and zero (0) are both false in Perl, the difference makes testing for exact results more difficult than it needs to be.

Inject Failures

$pop3 = Mail::POP3Client->new(%$cfg);
is($pop3->POPStat, 0);
$pop3->Socket->close;
is($pop3->POPStat, -1);
is($pop3->Close, 0);


The final (tada!) case group injects a failure before a normal operation. Mail::POP3Client exports the socket that it uses. This makes failure injection easy, because we simply close the socket before the next call to POPStat. Subsequent calls should fail.

We assume error handling is centralized in the implementation, so we don't repeat all the previous tests with injected failures. That's a big assumption, and for Mail::POP3Client it isn't true. Rather than adding more cases to this test, we'll revisit the issue of shared error handling in Refactoring.

Failure injection is an important technique to test error handling. It is in a different class from deviance testing, which tests the API. Instead, we use extra-API entry points. It's like coming in through the back door without knockin'. It ain't so polite but it's sometimes necessary. It's also hard to do if there ain't no backdoor as there is in Mail::POP3Client.
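Here's the back door in miniature. ToyClient is a hypothetical class, not Mail::POP3Client. It writes to an in-memory handle instead of a socket, and exposes that handle so a test can break it between calls.

```perl
use strict;
use warnings;

package ToyClient;

sub new {
    my($class) = @_;
    my($sink) = '';
    # An in-memory handle stands in for the socket.
    open(my $fh, '>', \$sink)
        or die("in-memory open failed: $!");
    return bless({handle => $fh}, $class);
}

# The back door: callers can reach the underlying handle.
sub handle {
    my($self) = @_;
    return $self->{handle};
}

# Returns 0 on success and -1 if the command can't be written.
sub send_stat {
    my($self) = @_;
    # Printing to a closed handle normally warns; the test expects it.
    no warnings qw(closed);
    my $ok = print {$self->{handle}} "STAT\r\n";
    return $ok ? 0 : -1;
}

package main;

my($client) = ToyClient->new;
my($before) = $client->send_stat;
# Inject the failure from outside the API.
my($fh) = $client->handle;
close($fh);
my($after) = $client->send_stat;
```

Without the handle accessor there would be no way to force this path from a test, short of killing a real server mid-conversation.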

Mock Objects

Mock objects allow you to inject failures and to test alternative execution paths by creating doors where they don't normally exist. Test::MockObject[6] allows you to replace subroutines and methods on the fly for any class or package. You can manipulate calls to and return values from these faked entry points.

Here's a simple test that forces CRAM-MD5 authentication:

use strict;
use Test::More;
use Test::MockObject;
BEGIN {
    plan(tests => 3);
}
my($socket) = Test::MockObject->new;
$socket->fake_module('IO::Socket::INET');
$socket->fake_new('IO::Socket::INET');
$socket->set_true('autoflush')
    ->set_false('connected')
    ->set_series(getline => map({"$_\r\n"}
        # Replace this line with '+OK POP3 <my-apop@secret-key>' for APOP
        '+OK POP3',
        '+OK Capability list follows:',
        # Remove this line to disable CRAM-MD5
        'SASL CRAM-MD5 LOGIN',
        '.',
        '+ abcd',
        '+OK Mailbox open',
        '+OK 33 419',
    ))->mock(print => sub {
        my(undef, @args) = @_;
        die('invalid operation: ', @args)
            if grep(/(PASS|APOP)/i, join('', @args));
        return 1;
    });
use_ok('Mail::POP3Client');
my($pop3) = Mail::POP3Client->new(
    HOST => 'x', USER => 'x', PASSWORD => 'keep-secret'
);
is($pop3->State, 'TRANSACTION');
is($pop3->Count, 33);


In BEST authentication mode, Mail::POP3Client tries APOP, CRAM-MD5, and PASS. This test makes sure that if the server doesn't support APOP, CRAM-MD5 is used and PASS is not. Most POP3 servers always support APOP and CRAM-MD5 and you usually can't enable one without the other. Since Mail::POP3Client always tries APOP first, this test allows us to test the CRAM-MD5 fallback logic without finding a server that conforms to this unique case.

We use the Test::MockObject instance to fake the IO::Socket::INET class, which Mail::POP3Client uses to talk to the server. The faking happens before Mail::POP3Client imports the faked module so that the real IO::Socket::INET doesn't load.

The first three methods mocked are: new, autoflush, and connected. The mock new returns $socket, the mock object. We set autoflush to always return true. connected is set to return false, so Mail::POP3Client doesn't try to close the socket when its DESTROY is called.

We fake the return results of getline with the server responses Mail::POP3Client expects to see when it tries to connect and login. To reduce coupling between the test and implementation, keep the list of mock routines short. You can do this by trial and error, because Test::MockObject lets you know when a routine that isn't mocked has been called.

The mock print asserts that neither APOP nor PASS is attempted by Connect. By editing the lines as recommended by the comments, you can inject failures to see that the test and Mail::POP3Client work.
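To see what the mock is doing for us, here's roughly the same idea hand-rolled in plain Perl. FakeSocket is a hypothetical class written for this sketch. It only imitates the scripted getline series and the asserting print, and it doesn't replace IO::Socket::INET the way fake_module and fake_new do.

```perl
use strict;
use warnings;

package FakeSocket;

sub new {
    my($class, @responses) = @_;
    return bless({responses => [@responses], sent => []}, $class);
}

# Returns the next scripted server response, or undef when exhausted.
sub getline {
    my($self) = @_;
    return shift(@{$self->{responses}});
}

# Records what the client sends and dies on a forbidden command.
sub print {
    my($self, @args) = @_;
    my($line) = join('', @args);
    die("invalid operation: $line")
        if $line =~ /^(?:PASS|APOP)\b/i;
    push(@{$self->{sent}}, $line);
    return 1;
}

package main;

my($socket) = FakeSocket->new("+OK POP3\r\n", "+ abcd\r\n");
my($greeting) = $socket->getline;
$socket->print("AUTH CRAM-MD5\r\n");
my($ok) = eval { $socket->print("PASS keep-secret\r\n"); 1 };
my($error) = $@;
```

Test::MockObject saves you from writing a class like this for every collaborator, and it also tells you when an unmocked routine is called.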

There's a lot more to Test::MockObject than I can present here. It can make a seemingly impossible testing job almost trivial.

Does It Work?

As noted, several of the Mail::POP3Client test cases were successful, that is, they found defects in the implementation. In Refactoring, you'll see the fruits of this chapter's labor. We'll refactor the implementation to make it easier to fix the defects the test uncovered. We'll run the unit test after each refactoring to be sure we didn't break anything.

Footnotes

  1. Art of Software Testing, Glenford Myers, John Wiley & Sons, 1979, p. 16.

  2. The Post Office Protocol - Version 3 RFC can be found at http://www.ietf.org/rfc/rfc1939.txt. The Mail::POP3Client also implements the POP3 Extension Mechanism RFC, http://www.ietf.org/rfc/rfc2449.txt, and IMAP/POP AUTHorize Extension for Simple Challenge/Response RFC http://www.ietf.org/rfc/rfc2195.txt.

  3. The version being tested here is 2.12, which can be found at http://search.cpan.org/author/SDOWD/POP3Client-2.12.

  4. The book Effective Perl Programming by Joseph Hall discusses the issues with wantarray and list contexts in detail.

  5. We use IO::File instead of IO::Scalar, because IO::Scalar does not check if the instance is closed when Mail::POP3Client calls print.

  6. Version 0.9 used here is available at: http://search.cpan.org/author/CHROMATIC/Test-MockObject-0.09/

 
Copyright © 2004 Robert Nagler (nagler at extremeperl.org)
Licensed under a Creative Commons Attribution 4.0 International License.