Unit test input values - validate or not?

Question

I am manually creating a rather large data model as input to my unit tests. Data model is created using a number of builders. Developers use these builders in order to create the data model as they see fit for a particular unit test.

An example might be in order: when reporting work time, it is possible to report taking a break. This break must be within the working hours, no sense in taking a break at midnight if you worked a 9am-5pm shift.

I have kept from validating any of the inputs into the builders, because:

I believe developers should be familiar with data they create.
Makes the builders more complex. Testing infrastructure should be as simple as possible, in my opinion.

Of course, this means the developer can make a mistake and create a break that is outside of the working hours.

A colleague of mine proposed I do some validation, e.g. making sure the break falls into the working day. It is a minor change and introducing it won't complicate things, but it got me thinking.

Is it a smart move to validate input data for your unit tests, and what are the drawbacks here?

Rather than validators, which your production code should already have, you may want to have explicit builder verbs, such as addValidBreakTime() or addInvalidBreakTime(). This would extend the testing language available to the people writing unit tests. — BobDalgleish, Mar 29 '19 at 19:19
I like the idea of expanding the testing language. Please see my comment to mmathis's answer, regarding anemic data models. — robotron, Mar 29 '19 at 19:46

score 8 · Accepted Answer · answered Mar 29 '19 at 17:55

Unit tests don't always have valid input data - invalid data is a very common input for unit tests, in fact, to make sure it is handled properly. Presumably these data models are used in your production code as well, and your production code knows what to do with invalid data (e.g., a break at midnight). Thus, there is no need for the unit test to validate the data model, as that should be handled by the code under test - that could be the very point of the unit test, in fact.

If your production code doesn't use these same data models, I'd be wary of using them in your unit tests. Another piece of untested code introduces possible bugs in your unit tests, potentially leading to false positive test results.

No, unfortunately the code uses anemic data models. All the business logic is in various service classes which are out of scope for these particular unit tests. — robotron, Mar 29 '19 at 19:42

score 1 · Answer 2 · answered Mar 30 '19 at 03:02

Your class under the test can assume that received data already has been validated before passed in.

This kind of assumptions or if you call them conventions will simplify our code. For example many teams working with .NET stack have convention that any method which returns a collection will never returns null, instead filled or empty collection will be returned. With such convention we will save our code from null checks.

If you are not a fanatic of "one assertion per test", you can assert input values as well as output, which will explicitly tell other developers or readers of your tests that system under the test have some assumptions about the input data.

public void WorkingTimeCalculator_ExcludeBreaks()
{
    var day = 20.January(2019);
    var workingTime = 
        workingTimeBuilder.StartsShiftAt(day.At(8, 0))
                          .TakesBreak(from: day.At(10, 15), to: day.At(10, 30))
                          .TakesBreak(from: day.At(12, 0), to: day.At(13, 0)) 
                          .TakesBreak(from: day.At(14, 30), to: day.At(14, 15))
                          .FinishesShiftAt(day.At(16, 0));

    // Before executing system under the test - assert that input data
    workingTime.Breaks.Should().BeWithin(day.At(8, 0), day.At(16, 0));

    // Execute system under the test
    var calculatedTime = calculator.Calculate(workingTime);

     var expectedWorkingMinutes = 390;
    calculatedTime.Should().Be(expectedWorkingMinutes);        
}

Test above will fail with message that provided breaks are outside of expected period if provided data is invalid.

Laiv · Answer 3 · 2019-03-30T16:31:20.073

In addition to the @mmathis answer, I would like to add something else

I am manually creating a rather large data model as input to my unit tests. The data model is created using a number of builders. Developers use these builders in order to create the data model as they see fit for a particular unit test. [...]

I have kept from validating any of the inputs into the builders because: [...]*

This's concerning for different reasons.

We test to build confidence. Implementing a framework without due care might have defeated that because right now, you can not trust in what is being used as inputs in your tests. The problem derives from the fact that the framework has not been properly tested. Tests rise this sort of issues. Issues about functionality, design and adequacy. So, the problem is triple:
- Test the application code
- Test the framework
- Review all the tests to make sure they all use valid inputs.

when reporting work time, it is possible to report taking a break. This break must be within the working hours, no sense in taking a break at midnight if you worked a 9am-5pm shift.

This sounds a business rule to me. Does not it? You should be validating this already in the application. Somewhere this rule is being checked. Right? And yet, you have been suggested to duplicate the validation and place it out of the application code. The problem now is quadruple
- Maintain business rules in two different source codes

I would dare to say that's a violation of the DRY principle.

Altogether, makes me agree with @BobDalgleish when he says

Rather than validators, which your production code should already have, you may want to have explicit builder methods, such as addValidBreakTime() or addInvalidBreakTime(). This would extend the testing language available to the people writing unit tests

Or why not, to have such validation at hand for reuse. Make the rule validation a testable first-class citizen of the code, so that it can be reused on both sides.

I like your answer, it cuts down to the core of the problem, as well as mmathis: bad design. I agree, this is a business rule and should be enforced within business domain classes. Unfortunately, application is ripe with anemic models and all the logic is in the service classes, which themselves are out of scope for this particular set of tests. The rewrite is out of the question as well. — robotron, Mar 31 '19 at 09:10
To my understanding, you have two options. 1. perpetuate the design error until you have to ... 2. refactor/rewrite the code to make it testable. — Laiv, Mar 31 '19 at 14:44

Unit test input values - validate or not?

3 Answers3

Linked