R/posterior_survfit.R
posterior_survfit.Rd
This function allows us to generate estimated survival probabilities
based on draws from the posterior predictive distribution. By default
the survival probabilities are conditional on an individual's
groupspecific coefficients (i.e. their individuallevel random
effects). If prediction data is provided via the newdataLong
and newdataEvent
arguments, then the default behaviour is to
sample new groupspecific coefficients for the individuals in the
new data using a Monte Carlo scheme that conditions on their
longitudinal outcome data provided in newdataLong
(sometimes referred to as "dynamic predictions", see Rizopoulos
(2011)). This default behaviour can be stopped by specifying
dynamic = FALSE
, in which case the predicted survival
probabilities will be marginalised over the distribution of the
groupspecific coefficients. This has the benefit that the user does
not need to provide longitudinal outcome measurements for the new
individuals, however, it does mean that the predictions will incorporate
all the uncertainty associated with betweenindividual variation, since
the predictions aren't conditional on any observed data for the individual.
In addition, by default, the predicted subjectspecific survival
probabilities are conditional on observed values of the fixed effect
covariates (ie, the predictions will be obtained using either the design
matrices used in the original stan_jm
model call, or using the
covariate values provided in the newdataLong
and newdataEvent
arguments). However, if you wish to average over the observed distribution
of the fixed effect covariates then this is possible  such predictions
are sometimes referred to as standardised survival probabilties  see the
standardise
argument below.
posterior_survfit( object, newdataLong = NULL, newdataEvent = NULL, extrapolate = TRUE, control = list(), condition = NULL, last_time = NULL, prob = 0.95, ids, times = NULL, standardise = FALSE, dynamic = TRUE, scale = 1.5, draws = NULL, seed = NULL, ... )
object  A fitted model object returned by the


newdataLong, newdataEvent  Optionally, a data frame (or in the case of

extrapolate  A logical specifying whether to extrapolate the estimated
survival probabilities beyond the times specified in the 
control  A named list with parameters controlling extrapolation
of the estimated survival function when

condition  A logical specifying whether the estimated
subjectspecific survival probabilities at time 
last_time  A scalar, character string, or 
prob  A scalar between 0 and 1 specifying the width to use for the
uncertainty interval (sometimes called credible interval) for the predictions.
For example 
ids  An optional vector specifying a subset of IDs for whom the
predictions should be obtained. The default is to predict for all individuals
who were used in estimating the model or, if 
times  A scalar, a character string, or 
standardise  A logical specifying whether the estimated
subjectspecific survival probabilities should be averaged
across all individuals for whom the subjectspecific predictions are
being obtained. This can be used to average over the covariate and random effects
distributions of the individuals used in estimating the model, or the individuals
included in the 
dynamic  A logical that is only relevant if new data is provided
via the 
scale  A scalar, specifying how much to multiply the asymptotic
variancecovariance matrix for the random effects by, which is then
used as the "width" (ie. variancecovariance matrix) of the multivariate
Studentt proposal distribution in the MetropolisHastings algorithm. This
is only relevant when 
draws  An integer indicating the number of MCMC draws to return. The default is to set the number of draws equal to 200, or equal to the size of the posterior sample if that is less than 200. 
seed  An optional 
...  Currently unused. 
A data frame of class survfit.stanjm
. The data frame includes
columns for each of the following:
(i) the median of the posterior predictions of the estimated survival
probabilities (survpred
);
(ii) each of the lower and upper limits of the corresponding uncertainty
interval for the estimated survival probabilities (ci_lb
and
ci_ub
);
(iii) a subject identifier (id_var
), unless standardised survival
probabilities were estimated;
(iv) the time that the estimated survival probability is calculated for
(time_var
).
The returned object also includes a number of additional attributes.
Note that if any variables were transformed (e.g. rescaled) in the data
used to fit the model, then these variables must also be transformed in
newdataLong
and newdataEvent
. This only applies if variables
were transformed before passing the data to one of the modeling functions and
not if transformations were specified inside the model formula.
Rizopoulos, D. (2011). Dynamic predictions and prospective accuracy in joint models for longitudinal and timetoevent data. Biometrics 67, 819.
plot.survfit.stanjm
for plotting the estimated survival
probabilities, ps_check
for for graphical checks of the estimated
survival function, and posterior_traj
for estimating the
marginal or subjectspecific longitudinal trajectories, and
plot_stack_jm
for combining plots of the estimated subjectspecific
longitudinal trajectory and survival function.
# \donttest{ # Run example model if not already loaded if (!exists("example_jm")) example(example_jm) # Obtain subjectspecific survival probabilities for a few # selected individuals in the estimation dataset who were # known to survive up until their censoring time. By default # the posterior_survfit function will estimate the conditional # survival probabilities, that is, conditional on having survived # until the event or censoring time, and then by default will # extrapolate the survival predictions forward from there. ps1 < posterior_survfit(example_jm, ids = c(7,13,15)) # We can plot the estimated survival probabilities using the # associated plot function plot(ps1)# If we wanted to estimate the survival probabilities for the # same three individuals as the previous example, but this time # we won't condition on them having survived up until their # censoring time. Instead, we will estimate their probability # of having survived between 0 and 5 years given their covariates # and their estimated random effects. # The easiest way to achieve the time scale we want (ie, 0 to 5 years) # is to specify that we want the survival time estimated at time 0 # and then extrapolated forward 5 years. We also specify that we # do not want to condition on their last known survival time. ps2 < posterior_survfit(example_jm, ids = c(7,13,15), times = 0, extrapolate = TRUE, condition = FALSE, control = list(edist = 5)) # Instead we may want to estimate subjectspecific survival probabilities # for a set of new individuals. To demonstrate this, we will simply take # the first two individuals in the estimation dataset, but pass their data # via the newdata arguments so that posterior_survfit will assume we are # predicting survival for new individuals and draw new random effects # under a Monte Carlo scheme (see Rizopoulos (2011)). ndL < pbcLong[pbcLong$id %in% c(1,2),] ndE < pbcSurv[pbcSurv$id %in% c(1,2),] ps3 < posterior_survfit(example_jm, newdataLong = ndL, newdataEvent = ndE, last_time = "futimeYears", seed = 12345)#> Drawing new random effects for 2 individuals. Monitoring progress: #>    0%  ===================================  50%  ====================================================================== 100%head(ps3)#> id year survpred ci_lb ci_ub #> 1 1 1.0951 1.0000 1.0000 1.0000 #> 2 1 2.0278 0.4827 0.0013 0.9483 #> 3 1 2.9604 0.2179 0.0000 0.8575 #> 4 1 3.8930 0.0697 0.0000 0.6929 #> 5 1 4.8257 0.0054 0.0000 0.4532 #> 6 1 5.7583 0.0002 0.0000 0.2547# We can then compare the estimated random effects for these # individuals based on the fitted model and the Monte Carlo scheme ranef(example_jm)$Long1$id[1:2,,drop=FALSE] # from fitted model#> (Intercept) #> 1 2.0591942 #> 2 0.6024433#> b[Long1(Intercept) id:1] b[Long1(Intercept) id:2] #> 2.0256299 0.6795806# Lastly, if we wanted to obtain "standardised" survival probabilities, # (by averaging over the observed distribution of the fixed effect # covariates, as well as averaging over the estimated random effects # for individuals in our estimation sample or new data) then we can # specify 'standardise = TRUE'. We can then plot the resulting # standardised survival curve. ps4 < posterior_survfit(example_jm, standardise = TRUE, times = 0, extrapolate = TRUE) plot(ps4)# }