## 15.1 Floating-point representations

Stan’s arithmetic is implemented using double-precision floating-point arithmetic.
The behavior of most^{26} modern computers follows the *IEEE Standard for
Floating-Point Arithmetic* (IEEE 754).

### 15.1.1 Finite values

The double-precision component of the IEEE 754 standard specifies the representation of real values using a fixed pattern of 64 bits (8 bytes). All values are represented in base two (i.e., binary). The representation is divided into two signed components:

- *significand* (53 bits): base value representing significant digits
- *exponent* (11 bits): power of two multiplied by the base

The *value* of a finite floating-point number is

\[ v = (-1)^s \times c \, 2^q \]

where \(s\) is the sign bit, \(c\) is the significand, and \(q\) is the exponent.
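As a worked instance of the formula, take the purely illustrative values sign \(s = 1\), significand \(c = 5\), and exponent \(q = -2\) (chosen here for the arithmetic, not taken from the standard):

\[ v = (-1)^1 \times 5 \times 2^{-2} = -\frac{5}{4} = -1.25 \]

Because the exponent is a power of two, \(-1.25\) is represented exactly, whereas a decimal fraction such as \(0.1\) has no finite binary expansion and can only be approximated.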

### 15.1.2 Normality

A *normal* floating-point value does not use any leading zeros in
its significand; *subnormal* numbers may use leading zeros. Not all
I/O systems support subnormal numbers.

### 15.1.3 Ranges and extreme values

Some exponent values are reserved, so legal exponent values range between \(-(2^{10}) + 2 = -1022\) and \(2^{10} - 1 = 1023\). Legal significand values are between \(-2^{52}\) and \(2^{52} - 1\). Floating point can therefore represent values of both very large and very small magnitude. Some extreme values are

- *largest normal finite number*: \(\approx 1.8 \times 10^{308}\)
- *largest subnormal finite number*: \(\approx 2.2 \times 10^{-308}\)
- *smallest positive normal number*: \(\approx 2.2 \times 10^{-308}\)
- *smallest positive subnormal number*: \(\approx 4.9 \times 10^{-324}\)
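The following is a minimal sketch of what happens when a computation leaves this range. The literals `1e308`, `1e-300`, and `1e20` and the variable names are assumptions chosen for illustration. The statements sit in `transformed data` only so that they run once when the program starts.

```stan
transformed data {
  real big = 1e308;     // just below the largest normal finite number
  real small = 1e-300;  // a small but still normal positive number

  // 1e309 exceeds the largest finite double, so the product overflows
  // to positive infinity.
  print("big * 10 = ", big * 10);
  print("is_inf(big * 10) = ", is_inf(big * 10));  // evaluates to 1

  // 1e-320 lies between the smallest subnormal and the smallest normal
  // number, so the quotient is a nonzero subnormal value.
  print("small / 1e20 = ", small / 1e20);

  // 1e-330 is below the smallest positive subnormal number, so the
  // quotient underflows to zero.
  print("small / 1e30 == 0: ", small / 1e30 == 0);  // evaluates to 1
}
```

Neither overflow nor underflow raises an error here. The result silently becomes infinity or zero, which is why working on the log scale is usually preferable for very large or very small quantities.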

### 15.1.4 Signed zero

Because of the sign bit, there are two ways to represent zero, often
called “positive zero” and “negative zero.” This distinction is
irrelevant in Stan (as it is in R), because the two values are equal
(i.e., `0 == -0` evaluates to true).

### 15.1.5 Not-a-number values

A specially chosen bit pattern is used for the *not-a-number* value
(often written as `NaN` in programming language output, including
Stan’s).

Stan provides a value function `not_a_number()` that returns this special
not-a-number value. It is meant to represent error conditions, not
missing values. Usually when not-a-number is an argument to a
function, the result will be not-a-number if an exception (a rejection in
Stan) is not raised.

Stan also provides a test function `is_nan(x)` that returns 1 if `x`
is not-a-number and 0 otherwise.
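As a sketch of how the two functions might be used together, the program below calls `is_nan()` directly and also wraps it in a guard. The helper `reject_if_nan` is hypothetical, written for this example, and is not one of Stan's built-in functions.

```stan
functions {
  // Hypothetical helper (not built into Stan): passes x through unless it
  // is not-a-number, in which case the current evaluation is rejected.
  real reject_if_nan(real x) {
    if (is_nan(x)) {
      reject("reject_if_nan: argument is not-a-number");
    }
    return x;
  }
}
transformed data {
  print("is_nan(not_a_number()) = ", is_nan(not_a_number()));  // evaluates to 1
  print("is_nan(1.5) = ", is_nan(1.5));                        // evaluates to 0
  print("reject_if_nan(1.5) = ", reject_if_nan(1.5));          // passes 1.5 through
}
```

A guard of this kind turns a silently propagating not-a-number into an explicit rejection at the point where it first appears.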

Not-a-number values propagate under almost all mathematical
operations. For example, all of the built-in arithmetic
operations (addition, subtraction, multiplication, division, negation)
return not-a-number if any of their arguments are not-a-number. The
built-in functions such as `log` and `exp` have the same behavior,
propagating not-a-number values.
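The propagation described above can be checked with a short sketch. The variable names `x` and `y` are assumptions for illustration. The variables are declared in a nested local block so that they are ordinary local variables rather than saved transformed data.

```stan
transformed data {
  {
    real x = not_a_number();
    real y = 2.0;

    // Each expression below has a not-a-number operand, so each result
    // is not-a-number and every is_nan() call evaluates to 1.
    print("is_nan(x + y) = ", is_nan(x + y));
    print("is_nan(x - y) = ", is_nan(x - y));
    print("is_nan(x * y) = ", is_nan(x * y));
    print("is_nan(x / y) = ", is_nan(x / y));
    print("is_nan(-x) = ", is_nan(-x));
    print("is_nan(log(x)) = ", is_nan(log(x)));
    print("is_nan(exp(x)) = ", is_nan(exp(x)));
  }
}
```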

Most of Stan’s built-in functions will throw exceptions (i.e., reject) when any of their arguments is not-a-number.

Comparisons with not-a-number always return false, up to and including
comparison with itself. That is, `not_a_number() == not_a_number()`
somewhat confusingly returns false. That is why there is a built-in
`is_nan()` function in Stan (and in C++). The only exception
is the inequality operator `!=`, which remains coherent as the negation
of `==`. This means `not_a_number() != not_a_number()` returns true.
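A minimal sketch of these comparison rules follows. The local variable `x` is an assumption for illustration, and the comments give the value each comparison evaluates to.

```stan
transformed data {
  {
    real x = not_a_number();

    print("x == x: ", x == x);        // 0: equality is false, even with itself
    print("x != x: ", x != x);        // 1: inequality is the negation of ==
    print("x < 1: ", x < 1);          // 0: ordered comparisons are all false
    print("x >= 1: ", x >= 1);        // 0: so both x < 1 and x >= 1 fail
    print("is_nan(x): ", is_nan(x));  // 1: the reliable test
  }
}
```

Because both `x < 1` and `x >= 1` evaluate to 0, a conditional that assumes exactly one of them holds can take an unintended branch. Testing with `is_nan()` first avoids that.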

Undefined operations often return not-a-number values. For example,
`sqrt(-1)` will evaluate to not-a-number.

### 15.1.6 Positive and negative infinity

There are also two special values representing positive infinity
(\(\infty\)) and negative infinity (\(-\infty\)). These are not
as pathological as not-a-number, but are often used to represent error
conditions such as overflow and underflow. For example, rather than
raising an error or returning not-a-number, `log(0)` evaluates to
negative infinity. Exponentiating negative infinity leads back to
zero, so that `0 == exp(log(0))` evaluates to true. Nevertheless, this should not be
done in Stan because the chain rule used to calculate the derivatives
will attempt illegal operations and return not-a-number.
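A sketch of the reasoning behind the derivative problem, assuming the derivative is assembled by multiplying the partial derivatives of `exp` and `log` as the chain rule prescribes:

\[ \frac{d}{dx} \exp(\log x) = \exp(\log x) \times \frac{1}{x} \]

At \(x = 0\) the first factor is \(\exp(-\infty) = 0\) and the second factor is \(1 / 0 = \infty\), so the product is

\[ 0 \times \infty = \mathrm{NaN} \]

The value itself is well defined, but the product of partials is not, so the gradient comes back as not-a-number.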

There are value functions `positive_infinity()` and
`negative_infinity()` as well as a test function `is_inf()`.

Positive and negative infinity have the expected comparison behavior,
so that `negative_infinity() < 0` evaluates to true (represented with 1
in Stan). Also, negating positive infinity leads to negative infinity
and vice-versa.

Positive infinity added to either itself or a finite value produces positive infinity. Negative infinity behaves the same way. However, attempts to subtract positive infinity from itself produce not-a-number, not zero. Similarly, attempts to divide one infinite value by another result in a not-a-number value.
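The rules for infinite values can be sketched as follows. The local variable names `pos` and `neg` are assumptions for illustration, and the comments give the value each expression evaluates to.

```stan
transformed data {
  {
    real pos = positive_infinity();
    real neg = negative_infinity();

    print("is_inf(pos) = ", is_inf(pos));              // 1
    print("is_inf(neg) = ", is_inf(neg));              // 1
    print("neg < 0: ", neg < 0);                       // 1
    print("-pos == neg: ", -pos == neg);               // 1: negation flips the sign

    print("pos + 1 == pos: ", pos + 1 == pos);         // 1: finite values are absorbed
    print("pos + pos == pos: ", pos + pos == pos);     // 1
    print("neg + neg == neg: ", neg + neg == neg);     // 1

    print("is_nan(pos - pos) = ", is_nan(pos - pos));  // 1: not zero
    print("is_nan(pos / pos) = ", is_nan(pos / pos));  // 1: not one
    print("is_nan(pos / neg) = ", is_nan(pos / neg));  // 1
  }
}
```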

^{26} The notable exception is Intel’s optimizing compilers under certain optimization settings.