Stan Math Library
We've written a framework for testing our univariate distribution functions. It's not well documented and the framework is pretty complicated, but it's what we use to make sure our distributions are correct (well, correct enough).
This is an attempt to document how to use the framework and how we ended up with it.
In this section, I'll describe how to run the tests and what happens when you do.
From within the Math library, run:
./runTests.py test/prob
For development on Windows, add GENERATE_DISTRIBUTION_TESTS=true to the make/local file.
This will take hours, perhaps many hours; most of the time is spent compiling. You might want to pass the parallel flag to the Python script.
Here, I'm just going to describe the steps that are taken to run the tests; details on how to write a test and what's inside the framework are further down. These are the steps taken when calling ./runTests.py test/prob, just broken out step by step.

1. make generate-tests is called.
2. make builds the test/prob/generate_tests executable from test/prob/generate_tests.cpp.
3. For each test file under test/prob/*/*, make calls that executable with the test file as the first argument and the number of template instantiations per generated file as the second argument. For example, for testing the bernoulli_lpmf() function, make will call: test/prob/generate_tests test/prob/bernoulli/bernoulli_test.hpp 100
4. Each call to the executable generates 5 different test files, all within the test/prob/bernoulli/ folder and named like bernoulli_00000_generated_<type>_test.cpp, where <type> is one of {fd, ffd, ffv, fv, v}. Those suffixes are the types of instantiations within the file and map to fvar<double>, fvar<fvar<double>>, fvar<fvar<var>>, fvar<var>, and var (sketched in code after this list). For distributions with many arguments, there will be more than one file per instantiation type.
5. The generated tests under test/prob/*/* are compiled.
6. The generated tests under test/prob/*/* are run.

To run the generated tests for a single distribution, the easiest way is to:
make generate-tests
./runTests.py test/prob/bernoulli/bernoulli_00000_generated_v_test.cpp
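For reference, here's a minimal sketch of how those file-name suffixes correspond to Stan Math's autodiff types. The alias names are purely illustrative (the framework doesn't define them), and I'm assuming the top-level <stan/math/mix.hpp> header, which pulls in both var and fvar.

```cpp
#include <stan/math/mix.hpp>

using stan::math::fvar;
using stan::math::var;

// File-name suffix -> scalar type instantiated in that generated test file.
using type_fd  = fvar<double>;        // *_generated_fd_test.cpp
using type_ffd = fvar<fvar<double>>;  // *_generated_ffd_test.cpp
using type_ffv = fvar<fvar<var>>;     // *_generated_ffv_test.cpp
using type_fv  = fvar<var>;           // *_generated_fv_test.cpp
using type_v   = var;                 // *_generated_v_test.cpp
```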
Before getting into how to write a distribution test, I'll walk through some of the reasons why we built a framework for testing distribution functions.
As with almost everything complicated in Stan, the testing framework was built partly out of necessity and partly as a reaction. The framework was built after we started vectorizing functions. In earlier versions of Stan, we didn't have vectorized functions. For a 3-argument function like normal_log(T1 y, T2 mu, T3 sigma), we had 8 instantiations (2^3, since each template argument can be double or stan::math::var) that we wrote out by hand. That's not individual tests: at that time, we probably had 2 tests per instantiation, one for valid arguments that checked the result, the gradients, and the propto flag, and a second for invalid arguments that checked for exceptions. So, with 8 instantiations, that was 16 hand-written tests. That was manageable, but when we started to vectorize, we realized it wasn't. With vectorization, that same function now allows 512 different instantiations, because we now also instantiate with different containers. For double, the options are: double, std::vector<double>, Eigen::Matrix<double, -1, 1>, and Eigen::Matrix<double, 1, -1>. And we have the same instantiations with double replaced by stan::math::var, giving 8 types per argument and 8^3 = 512 combinations for a 3-argument function.
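To make the counting concrete, here's a sketch of the eight types a single vectorized real argument can take; the variable names are illustrative only.

```cpp
#include <vector>
#include <Eigen/Dense>
#include <stan/math.hpp>

int main() {
  using stan::math::var;

  // Four container shapes with double as the scalar type...
  double y1;
  std::vector<double> y2;
  Eigen::Matrix<double, Eigen::Dynamic, 1> y3;  // column vector
  Eigen::Matrix<double, 1, Eigen::Dynamic> y4;  // row vector

  // ...and the same four shapes with var as the scalar type.
  var y5;
  std::vector<var> y6;
  Eigen::Matrix<var, Eigen::Dynamic, 1> y7;
  Eigen::Matrix<var, 1, Eigen::Dynamic> y8;

  // With three vectorized arguments, that's 8 * 8 * 8 = 512
  // template instantiations of normal_log(y, mu, sigma) to cover.
}
```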
I mentioned that part of this framework was created as a reaction. As we started vectorizing the distributions efficiently, we had buggy gradients in just some of the instantiations. Early on, we were also not as good with templated C++, so not every instantiation compiled. As a precaution, we started testing every instantiation and checking gradients across all of them. In the long run, it's actually saved us quite a bit of time.
Some of the original goals were to:
We ended up with our current framework mostly due to dealing with these constraints.
Recently, we added a few more goals:
This led to an expansion of the original framework, which may have added more complication than necessary.
Writing a single distribution test isn't overly difficult. It's a bit tedious, but it beats writing every possible instantiation out by hand.
I'll start by outlining a distribution test; writing a test for the CDFs is similar. The testing framework file associated with the distribution tests is located at test/prob/test_fixture_distr.hpp; if you want to know the details of what the tests actually check, look in there.
Before diving into details, the overall structure of the test file is:
- a comment describing the distribution's argument types (detailed below)
- a class derived from AgradDistributionTest (the name is a relic of old Stan)
- a member function void valid_values(vector<vector<double> >& parameters, vector<double>& log_prob)
- a member function void invalid_values(vector<size_t>& index, vector<double>& value)
Details:
- Name the file after the distribution: distribution_test.hpp. Our convention is to place this file at test/prob/distribution/distribution_test.hpp, where distribution is replaced with the distribution name.
- Each of the distribution tests should start with a comment: //, then a space, then the keyword Arguments, a colon :, then a comma-then-space separated list of argument types. The valid argument types are exactly: Int, Double, Ints, and Doubles. The first two indicate that the argument is not vectorized and only takes single, scalar values; the last two, plural versions indicate that the argument is vectorized.