This vignette describes the practical contract used by
SuperSurv() to call survival learners and screening
methods. The goal is to make small extensions possible without changing
the ensemble algorithm.
How Learners Are Found
Learners are supplied to SuperSurv() by name:
event.library <- c("surv.coxph", "my.surv.km")During cross-validation and final fitting, SuperSurv()
looks up each name as an R function and calls it directly. A custom
learner can therefore live in a user script, package, or analysis file,
as long as it is available in the R session before calling
SuperSurv().
Learner Wrapper Contract
A survival learner wrapper should have this shape:
my.surv.learner <- function(time, event, X, newdata, new.times,
obsWeights, id, ...) {
# fit model on X, time, event
# predict survival probabilities for newdata at new.times
list(pred = pred_matrix, fit = fitted_object)
}The required arguments are:
| Argument | Meaning |
|---|---|
time |
numeric observed follow-up time |
event |
event indicator, with 1 for the event and
0 for censoring |
X |
training covariates as a data frame |
newdata |
covariates for prediction as a data frame |
new.times |
time grid where survival probabilities are needed |
obsWeights |
observation weights, possibly NULL
|
id |
optional cluster or subject identifier, possibly
NULL
|
... |
learner-specific tuning parameters |
The returned pred must be a numeric matrix with:
- one row per row of
newdata, - one column per value of
new.times, - entries on the survival-probability scale, usually between
0and1.
If possible, each row should be non-increasing across time. Wrappers
commonly clamp small numerical drift back into [0, 1].
The returned fit can be any object needed for future
prediction. When control = list(saveFitLibrary = TRUE),
SuperSurv() stores these fitted objects and
predict.SuperSurv() later calls predict() on
them.
Prediction Method Contract
If a custom learner should support prediction on new data after
fitting, store a classed object in fit and define a
matching S3 prediction method:
predict.my.surv.learner <- function(object, newdata, new.times, ...) {
pred_matrix
}This method should return the same type of matrix as the wrapper’s
pred component: nrow(newdata) rows by
length(new.times) columns.
Direct Survival Curves vs Risk Scores
Some survival learners directly return survival curves. In that case,
the wrapper usually interpolates the native prediction grid to
new.times.
Other learners return a risk score or linear predictor. Those wrappers need one extra calibration step:
- fit the risk-score model,
- estimate a baseline cumulative hazard on the training data,
- predict risk scores for
newdata, - convert scores to survival curves with
S(t | x) = exp(-exp(eta(x)) H0(t)).
Several built-in Cox-style wrappers follow this pattern. The internal
helper safe_breslow_step() is used by some package
wrappers, but it is not exported; custom wrappers should either use a
standard package prediction method that already returns survival curves
or implement their own baseline-hazard calibration carefully.
Minimal Custom Learner
The following learner ignores covariates and fits a weighted
Kaplan-Meier curve. It is intentionally simple, but it satisfies the
full SuperSurv() learner contract.
my.surv.km <- function(time, event, X, newdata, new.times,
obsWeights = NULL, id = NULL, ...) {
if (is.null(obsWeights)) {
obsWeights <- rep(1, length(time))
}
fit_km <- survival::survfit(
survival::Surv(time, event) ~ 1,
weights = obsWeights
)
step_surv <- stats::stepfun(fit_km$time, c(1, fit_km$surv), right = FALSE)
surv_probs <- step_surv(new.times)
pred <- matrix(
surv_probs,
nrow = nrow(newdata),
ncol = length(new.times),
byrow = TRUE
)
fit <- list(object = fit_km)
class(fit) <- "my.surv.km"
list(pred = pred, fit = fit)
}
predict.my.surv.km <- function(object, newdata, new.times, ...) {
fit_km <- object$object
step_surv <- stats::stepfun(fit_km$time, c(1, fit_km$surv), right = FALSE)
surv_probs <- step_surv(new.times)
matrix(
surv_probs,
nrow = nrow(newdata),
ncol = length(new.times),
byrow = TRUE
)
}You can test the wrapper outside the ensemble before adding it to a library:
data("metabric", package = "SuperSurv")
dat <- metabric[1:40, ]
x_cols <- grep("^x", names(dat))[1:3]
X <- dat[, x_cols, drop = FALSE]
times <- seq(50, 150, by = 50)
wrapper_out <- my.surv.km(
time = dat$duration,
event = dat$event,
X = X,
newdata = X[1:5, , drop = FALSE],
new.times = times
)
dim(wrapper_out[["pred"]])
#> [1] 5 3
dim(predict(wrapper_out[["fit"]], newdata = X[1:5, , drop = FALSE], new.times = times))
#> [1] 5 3The custom learner can be mixed with built-in learners:
Screening Method Contract
Screening methods are also supplied by name. A screener should accept
the same basic training inputs and return a logical vector with one
value per column of X:
my.screen.first3 <- function(time, event, X, obsWeights = NULL, id = NULL, ...) {
keep <- rep(FALSE, ncol(X))
keep[seq_len(min(3, ncol(X)))] <- TRUE
names(keep) <- names(X)
keep
}Use a list entry to pair a learner with one or more screeners:
Development Checklist
Before using a custom wrapper in a large ensemble, check:
-
predis a numeric matrix with dimensionsnrow(newdata)bylength(new.times), - predictions are on the survival-probability scale,
-
predict(wrapper_out[["fit"]], newdata, new.times)returns the same shape when saved fits are needed, - the wrapper tolerates
obsWeights = NULLandid = NULL, - all required packages are checked with
requireNamespace(), - any screener returns a logical vector aligned with the columns of
X.
These rules are deliberately small. They allow new learners to be added without changing the SuperSurv ensemble machinery.
