Automatic Differentiation
 

◆ map_rect()

template<int call_id, typename F , typename T_shared_param , typename T_job_param , require_eigen_col_vector_t< T_shared_param > * = nullptr>
Eigen::Matrix< return_type_t< T_shared_param, T_job_param >, Eigen::Dynamic, 1 > stan::math::map_rect ( const T_shared_param &  shared_params,
const std::vector< Eigen::Matrix< T_job_param, Eigen::Dynamic, 1 > > &  job_params,
const std::vector< std::vector< double > > &  x_r,
const std::vector< std::vector< int > > &  x_i,
std::ostream *  msgs = nullptr 
)

Map N function evaluations to parameters and data which are in rectangular format.

Each function evaluation may return a column vector of a different size; the output is the concatenation of the vectors from all evaluations.

In addition to job-specific parameters and real and int data, a shared parameter vector is passed to every evaluation. All input parameters are stored as vectors, whereas data are stored as arrays.

For N jobs the output of this function is

[ f(shared_params, job_params[1], x_r[1], x_i[1]), f(shared_params, job_params[2], x_r[2], x_i[2]), ... ]'.
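The concatenation rule above can be illustrated with a minimal serial sketch. It is not the Stan Math implementation: it uses plain `std::vector<double>` instead of Eigen vectors and autodiff types, and the names `map_rect_sketch` and `ragged_job` are hypothetical. Indices in the code are 0-based, whereas the formula above counts jobs from 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical functor (not part of Stan Math): stateless and default
// constructible. The output length is x_i[0], so jobs may return vectors of
// different sizes. Entry k is shared[0] * job[0] + x_r[0] * k.
struct ragged_job {
  std::vector<double> operator()(const std::vector<double>& shared,
                                 const std::vector<double>& job,
                                 const std::vector<double>& x_r,
                                 const std::vector<int>& x_i) const {
    std::vector<double> out;
    for (int k = 0; k < x_i[0]; ++k) {
      out.push_back(shared[0] * job[0] + x_r[0] * k);
    }
    return out;
  }
};

// Serial sketch: evaluate F once per job and concatenate the outputs in job
// order. The functor is given as a template argument only and is default
// constructed for each call.
template <typename F>
std::vector<double> map_rect_sketch(
    const std::vector<double>& shared_params,
    const std::vector<std::vector<double>>& job_params,
    const std::vector<std::vector<double>>& x_r,
    const std::vector<std::vector<int>>& x_i) {
  const std::size_t n_jobs = job_params.size();
  // The first dimension of the data must match the number of jobs.
  assert(x_r.size() == n_jobs && x_i.size() == n_jobs);

  std::vector<double> result;
  for (std::size_t n = 0; n < n_jobs; ++n) {
    const std::vector<double> out
        = F()(shared_params, job_params[n], x_r[n], x_i[n]);
    result.insert(result.end(), out.begin(), out.end());
  }
  return result;
}
```

With two jobs whose int data request one and two outputs respectively, the result has three entries: the single output of the first job followed by both outputs of the second.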

The function is implemented with serial execution and with parallelism using threading or MPI (TODO). The threading version is used if the compiler flag STAN_THREADS is set during compilation while the MPI version is only available if STAN_MPI is defined. The MPI parallelism takes precedence over serial or threading execution of the function.

For threaded parallelism, the N jobs are chunked into T blocks, which are executed asynchronously using the C++11 std::async facility. This ensures that at most T threads are used, but the actual number of threads is controlled by the implementation of std::async provided by the compiler. Note that nested calls of map_rect lead to a multiplicative increase in the number of job chunks generated. The number of threads T is controlled at runtime via the STAN_NUM_THREADS environment variable; see the get_num_threads function for details.
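The chunking scheme can be sketched as follows. This is an illustration under assumptions, not the library's code: the helper name `run_chunked` is hypothetical, the thread count is passed in directly rather than read from the environment, and the jobs are split into contiguous blocks of nearly equal size, which is only one possible way to chunk them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper: splits n_jobs into at most num_threads contiguous
// blocks, runs each block through std::async, and concatenates the block
// results in job order. job(n) must return the output vector of job n and
// must be safe to call from several threads at once.
template <typename JobFn>
std::vector<double> run_chunked(std::size_t n_jobs, std::size_t num_threads,
                                JobFn job) {
  assert(num_threads >= 1);
  std::vector<double> result;
  if (n_jobs == 0) {
    return result;
  }

  const std::size_t n_chunks = std::min(num_threads, n_jobs);
  const std::size_t base = n_jobs / n_chunks;   // minimum jobs per block
  const std::size_t extra = n_jobs % n_chunks;  // first blocks get one more

  std::vector<std::future<std::vector<double>>> futures;
  std::size_t start = 0;
  for (std::size_t c = 0; c < n_chunks; ++c) {
    const std::size_t end = start + base + (c < extra ? 1 : 0);
    futures.push_back(std::async(std::launch::async, [start, end, &job] {
      std::vector<double> block;
      for (std::size_t n = start; n < end; ++n) {
        const std::vector<double> out = job(n);
        block.insert(block.end(), out.begin(), out.end());
      }
      return block;
    }));
    start = end;
  }

  // Collect the blocks in submission order so the output order does not
  // depend on which block finishes first.
  for (auto& f : futures) {
    const std::vector<double> block = f.get();
    result.insert(result.end(), block.begin(), block.end());
  }
  return result;
}
```

Because the blocks are collected in the order they were submitted, the concatenated result is the same for any thread count; only the scheduling differs. If `job` itself called `run_chunked`, each outer block would spawn its own inner blocks, which is the multiplicative increase mentioned above.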

For the MPI version to work this function has these special non-standard conventions:

  • The call_id template parameter serves as a label for the combination of functor F and data. Since MPI communication is expensive, the real and int data are transmitted to the workers only once per functor F / call_id combination.
  • The MPI implementation requires that the functor type fully specifies the functor and hence requires a default constructible function object. This choice reduces the need for communication across MPI as the type is sufficient and the state of the functor is not necessary to transmit. Thus, the functor is specified as template argument only.
  • The size of the returned vector of each job must stay consistent when performing repeated calls to map_rect (which usually vary the values of the parameters).
  • To achieve exactly the same results in the serial and the MPI evaluation schemes, both variants work in the same way: they use nested AD calls and pre-compute all gradients when the function is called. The usual evaluation scheme would instead build an AD graph rather than evaluate gradients on the spot. For large problems this yields speedups for the serial version even on a single core, because the AD graph is smaller.
  • In MPI operation mode, no outputs from the workers are streamed back to the root.
  • Finally, each map_rect call must be registered with the STAN_REGISTER_MAP_RECT macro. This is required to enable communication between processes for the MPI case. The macro definition is empty if MPI is not enabled.

The functor F is expected to have the usual operator() function with signature

    template <typename T1, typename T2>
    Eigen::Matrix<return_type_t<T1, T2>, Eigen::Dynamic, 1>
    operator()(const Eigen::Matrix<T1, Eigen::Dynamic, 1>& eta,
               const Eigen::Matrix<T2, Eigen::Dynamic, 1>& theta,
               const std::vector<double>& x_r,
               const std::vector<int>& x_i,
               std::ostream* msgs = 0) const { ... }

If an expression is passed as shared_params, the functor needs to accept expressions.

WARNING: For the MPI case, the data arguments are NOT checked for changes between repeated evaluations of a given call_id/functor F pair. The data are silently assumed to be immutable between evaluations.
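The consequence of this warning can be made concrete with a small stand-in for a per-worker data cache. It is a sketch under assumptions and does not reflect how the MPI layer in Stan Math is implemented: `data_cache_sketch`, `my_functor`, and `stale_data_demo` are hypothetical names, and no MPI is involved. The only point is that a cache keyed by the call_id/F pair keeps the data from the first call and never compares it with later arguments.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a per-worker data cache. Every distinct
// <call_id, F> instantiation gets its own static storage.
template <int call_id, typename F>
struct data_cache_sketch {
  using data_t = std::vector<std::vector<double>>;

  static bool& is_set() {
    static bool flag = false;
    return flag;
  }
  static data_t& stored() {
    static data_t data;
    return data;
  }

  // Stores the data on first use. Later calls return the cached copy and do
  // not compare it with the incoming argument.
  static const data_t& get(const data_t& incoming) {
    if (!is_set()) {
      stored() = incoming;
      is_set() = true;
    }
    return stored();
  }
};

struct my_functor {};  // hypothetical functor type, used only as a label here

// Usage sketch: the second call passes different data but still sees the
// values cached by the first call. A new call_id gets a fresh cache.
inline void stale_data_demo() {
  const std::vector<std::vector<double>> first = {{1.0}};
  const std::vector<std::vector<double>> changed = {{2.0}};
  assert((data_cache_sketch<0, my_functor>::get(first)[0][0] == 1.0));
  assert((data_cache_sketch<0, my_functor>::get(changed)[0][0] == 1.0));
  assert((data_cache_sketch<1, my_functor>::get(changed)[0][0] == 2.0));
}
```

In this sketch, reusing a call_id with modified data returns the old values without any error, whereas a distinct call_id picks up the new data.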

Template Parameters
    call_id           Label for functor/data combination. See above for details.
    F                 Functor which is applied to all job-specific parameters with conventions described.
    T_shared_param    Type of shared parameters.
    T_job_param       Type of job-specific parameters.

Parameters
    shared_params     Shared parameter vector passed as first argument to functor for all jobs.
    job_params        Array of job-specific parameter vectors. All job-specific parameters must have matching sizes.
    x_r               Array of real arrays for each job. The first dimension must match the number of jobs (which is determined by the first dimension of job_params) and each entry must have the same size.
    x_i               Array of int data with the same conventions as x_r.
    msgs              Output stream for messages.

Returns
    Concatenated results from all jobs.

Definition at line 127 of file map_rect.hpp.