## 5.4 Univariate data types and variable declarations

All variables used in a Stan program must have an explicitly declared data type. The form of a declaration includes the type and the name of a variable. This section covers univariate types, the next section vector and matrix types, and the following section array types.

### Unconstrained integer

Unconstrained integers are declared using the `int` keyword. For example, the variable `N` is declared to be an integer as follows.

`int N;`

### Constrained integer

Integer data types may be constrained to allow values only in a specified interval by providing a lower bound, an upper bound, or both. For instance, to declare `N` to be a positive integer, use the following.

`int<lower=1> N;`

This illustrates that the bounds are inclusive for integers.

To declare an integer variable `cond` to take only binary values, that is zero or one, a lower and upper bound must be provided, as in the following example.

`int<lower=0, upper=1> cond;`
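
As a minimal sketch of how these integer declarations might be used together, the following program declares a positive count and an array of binary outcomes as data. The names `y` and `theta` and the Bernoulli model are assumptions made for illustration, and the array type used for `y` is covered in a later section.

```stan
data {
  int<lower=1> N;                    // number of observations, at least 1
  array[N] int<lower=0, upper=1> y;  // binary outcomes, each 0 or 1
}
parameters {
  real<lower=0, upper=1> theta;      // success probability
}
model {
  y ~ bernoulli(theta);
}
```

Because the bounds are inclusive, data containing `N = 1` or `y` values equal to 0 or 1 satisfy the constraints, while any other value for an element of `y` is rejected when the data are read.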

### Unconstrained real

Unconstrained real variables are declared using the keyword `real`. The following example declares `theta` to be an unconstrained continuous value.

`real theta;`

### Constrained real

Real variables may be bounded using the same syntax as integers. In theory (that is, with arbitrary-precision arithmetic), the bounds on real values would be exclusive. Unfortunately, finite-precision arithmetic rounding errors will often lead to values on the boundaries, so they are allowed in Stan.

The variable `sigma` may be declared to be non-negative as follows.

`real<lower=0> sigma;`

The following declares the variable `x` to be less than or equal to \(-1\).

`real<upper=-1> x;`

To ensure `rho` takes on values between \(-1\) and \(1\), use the following declaration.

`real<lower=-1, upper=1> rho;`

#### Infinite constraints

Lower bounds that are negative infinity or upper bounds that are positive infinity are ignored. Stan provides constants `positive_infinity()` and `negative_infinity()` which may be used for this purpose, or they may be read as data in the dump format.
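
The following sketch illustrates an ignored infinite bound. The variable names `lb` and `x` and the normal model are assumptions chosen for illustration; the point is only that the lower bound, being negative infinity, has no effect.

```stan
transformed data {
  real lb = negative_infinity();  // an infinite lower bound
}
parameters {
  real<lower=lb, upper=1> x;      // lower bound is ignored; only upper=1 constrains x
}
model {
  x ~ normal(0, 1);
}
```

Under the rule stated above, this declaration of `x` should behave like `real<upper=1> x;`.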

### Affinely transformed real

Real variables may be declared on a space that has been transformed using an affine transformation \(x\mapsto \mu + \sigma * x\) with offset \(\mu\) and (positive) multiplier \(\sigma\), using a syntax similar to that for bounds. These transforms do not change the asymptotic sampling behavior of the resulting Stan program (in a sense, the model the program implements). They can nevertheless make sampling more efficient by giving the sampler a more natural offset and multiplier for the geometry of the problem, for instance by facilitating a non-centered parameterization. Although affine transformation declarations do not impose a hard constraint on variables, they behave like bounds constraints in many ways and could perhaps be viewed as a sort of soft constraint.

The variable `x` may be declared to have offset \(1\) as follows.

`real<offset=1> x;`

Similarly, it can be declared to have multiplier \(2\) as follows.

`real<multiplier=2> x;`

Finally, we can combine both declarations to declare a variable with offset \(1\) and multiplier \(2\).

`real<offset=1, multiplier=2> x;`

As an example, we can give `x` a normal distribution with non-centered parameterization as follows.

```
parameters {
  real<offset=mu, multiplier=sigma> x;
}
model {
  x ~ normal(mu, sigma);
}
```

Recall that the centered parameterization is achieved with the code

```
parameters {
  real x;
}
model {
  x ~ normal(mu, sigma);
}
```

or equivalently

```
parameters {
  real<offset=0, multiplier=1> x;
}
model {
  x ~ normal(mu, sigma);
}
```
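
For comparison, the effect of the offset and multiplier declaration can be approximated by writing the non-centered parameterization by hand. The sketch below assumes that `mu` and `sigma` are supplied as data and introduces a hypothetical auxiliary variable `x_raw`; neither choice comes from the examples above.

```stan
data {
  real mu;                       // offset (assumed to be data here)
  real<lower=0> sigma;           // multiplier, must be positive
}
parameters {
  real x_raw;                    // sampled on the standardized scale
}
transformed parameters {
  real x = mu + sigma * x_raw;   // affine transform back to the original scale
}
model {
  x_raw ~ std_normal();          // implies x ~ normal(mu, sigma)
}
```

The declaration `real<offset=mu, multiplier=sigma> x;` expresses the same idea more compactly: the sampler works on the standardized scale while the model block is written directly in terms of `x`.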

### Expressions as bounds and offset/multiplier

Bounds (and offset and multiplier) for integer or real variables may be arbitrary expressions. The only requirement is that they only include variables that have been declared (though not necessarily defined) before the declaration. If the bounds themselves are parameters, the behind-the-scenes variable transform accounts for them in the log Jacobian.

For example, it is acceptable to have the following declarations.

```
data {
  real lb;
}
parameters {
  real<lower=lb> phi;
}
```

This declares a real-valued parameter `phi` to take values greater than the value of the real-valued data variable `lb`. Constraints may be complex expressions, but must be of type `int` for integer variables and of type `real` for real variables (including constraints on vectors, row vectors, and matrices). Variables used in constraints can be any variable that has been defined at the point the constraint is used. For instance,

```
data {
  int<lower=1> N;
  array[N] real y;
}
parameters {
  real<lower=min(y), upper=max(y)> phi;
}
```

This declares a positive integer data variable `N`, an array `y` of real-valued data of length `N`, and then a parameter ranging between the minimum and maximum value of `y`. As shown in the example code, the functions `min()` and `max()` may be applied to containers such as arrays.

A more subtle case involves declarations of parameters or transformed parameters based on parameters declared previously. For example, the following program will work as intended.

```
parameters {
  real a;
  real<lower=a> b;  // enforces a <= b
}
transformed parameters {
  real c;
  real<lower=c> d;  // c is declared here but not yet defined
  c = a;
  d = b;
}
```

The parameters instance works because all parameters are defined externally before the block is executed. The transformed parameters case works even though `c` isn’t defined at the point it is used, because constraints on transformed parameters are only validated at the end of the block. Data variables work like parameter variables, whereas transformed data and generated quantity variables work like transformed parameter variables.
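
As a small sketch of the end-of-block validation described above, the following fragment declares a constrained transformed data variable before assigning it. The variable name `y_range` is hypothetical and chosen only for illustration.

```stan
data {
  int<lower=1> N;
  array[N] real y;
}
transformed data {
  real<lower=0> y_range;       // declared with a constraint but not yet defined
  y_range = max(y) - min(y);   // defined later in the same block
}
```

The constraint `lower=0` on `y_range` is checked only once the transformed data block has finished executing, so the undefined value it holds between declaration and assignment does not cause an error.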

### Declaring optional variables

A variable may be declared with a size that depends on a boolean constant. For example, consider the definition of `alpha` in the following program fragment.

```
data {
  int<lower=0, upper=1> include_alpha;
  // ...
}
parameters {
  vector[include_alpha ? N : 0] alpha;
  // ...
}
```

If `include_alpha` is true, the model will include the vector `alpha`; if the flag is false, the model will not include `alpha` (technically, it will include `alpha` of size 0, which means it won’t contain any values and won’t be included in any output).

This technique is not just useful for containers. If the value of `N` is set to 1, then the vector `alpha` will contain a single element and thus `alpha[1]` behaves like an optional scalar, the existence of which is controlled by `include_alpha`.

This coding pattern allows a single Stan program to define different models based on the data provided as input. This strategy is used extensively in the implementation of the RStanArm package.
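
To make the optional-scalar pattern concrete, here is a minimal sketch of a linear regression with an optional intercept. The regression model and the names `x`, `y`, `beta`, `sigma`, and `mu` are assumptions for illustration and are not taken from the fragment above; priors are omitted for brevity.

```stan
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
  int<lower=0, upper=1> include_alpha;   // 1 to include an intercept, 0 to omit it
}
parameters {
  vector[include_alpha ? 1 : 0] alpha;   // size 1 or size 0
  real beta;
  real<lower=0> sigma;
}
model {
  vector[N] mu = beta * x;
  if (include_alpha) {
    mu = mu + alpha[1];                  // only accessed when alpha has an element
  }
  y ~ normal(mu, sigma);
}
```

The access to `alpha[1]` is guarded by the same flag that controls the size of `alpha`, so the program never indexes into an empty vector.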