### INTRODUCTION

### SECTION 1: AN ILLUSTRATIVE CASE

*β*-blocker for at least 2 years. We defined a person as adhering to secondary medication for a given drug if a prescription was filled within 30 days of discharge or if the patient was in possession of a sufficient quantity of the drug to cover the initial 30 days after discharge (see Hansen et al. [5] for further references). As the recommended secondary mediation includes 4 drugs, we have 4 variables that may function as potential mediators. The outcome was death from all causes during follow-up. As this was an observational study, we attempted to control confounding by including information on age, sex, calendar year, net household income, educational level, cohabitation status, myocardial infarction, cardiac arrhythmia, heart failure, pulmonary oedema, cardiogenic shock, valvular heart disease, cerebrovascular disease, cancer, chronic obstructive pulmonary disease, diabetes with complications, acute and chronic renal failure, sepsis, pneumonia, anaemia, respiratory insufficiency, prior revascularization, prior in-hospital bleeding, and the use of antihypertensive medications, aspirin, lipid-lowering drugs, vitamin K antagonists, glucose-lowering drugs, loop diuretics, or chronic obstructive pulmonary disease medication.

*β*-blocker) after discharge was observed in 56% of patients (68 vs. 45% in the early and conservative invasive groups, respectively). Receiving an early invasive treatment strategy was associated with a lower incidence rate of all-cause death (3.1 vs. 8.1 deaths per 100 person-years; adjusted hazard ratio [HR], 0.71; 95% confidence interval [CI], 0.67 to 0.74; p<0.001) compared to a conservative approach.

### SECTION 2: THE MATHEMATICAL FRAMEWORK FOR MEDIATION ANALYSIS

*A*→

*B*, can loosely be interpreted as: “if we actively change the variable

*A*, the distribution of

*B*might change.” Note that the real assumption in the DAG

*A*→

*B*is that an intervention on

*B*will not change

*A*. For a more detailed introduction to DAGs see Pearl [6]. The defining feature of a mediator is that it is positioned between the exposure and the outcome when following the directions of the arrows in the DAG. The DAG must also include all likely common causes of any pair constructed from an exposure, mediator, and outcome. In Figure 1, early intervention is assumed to affect the secondary treatment strategy (here defined as drug initiation within 30 days), which in turn affects mortality; this effect is called an indirect effect. There might also be other mechanisms, not involving the secondary treatment strategy, through which early intervention can affect mortality. These are subsumed within the direct arrow from early invasive strategy to death, which therefore can be thought of as mediation through all mediators except for the secondary treatment strategy; this is called a direct effect.

•

*Y*(_{i}*a, m*) is the outcome achieved for person*i*if, possibly contrary to fact, exposure had been set to a and mediator to*m*.•

*M*(a) is the mediator achieved for person_{i}*i*if, possibly contrary to fact, exposure had been set to*a*.

*i*will be omitted when referring to a randomly picked person. The counterfactual variable

*Y*(1,

_{i}*m*) corresponds to the death time observed in a double-intervention randomized trial where early intervention had been used and secondary medication set to

*m*. Likewise, the counterfactual variable

*M*(1) is the secondary medication observed in a single-intervention randomized trial where early intervention had been used. One can combine the two counterfactuals, yielding so-called nested counterfactuals defined as

_{i}*Y*(

*a, M (a*)). When a=a

^{*}^{*}, the nested counterfactual simply corresponds to the observations one would observe if early intervention had been set to

*a*. In the mediation analysis literature, the effect one would observe in a simple randomized trial is referred to as the total effect of treatment (i.e., early intervention), and it is defined as a comparison of the distribution of

*Y*(1,

*M*(1)) with that of

*Y*(0,

*M*(0)). The comparison could be done as a comparison of average values, but with a survival outcome, it would be more common to compare the 2 arms of the trial using a Cox model, leading to a causal HR quantifying the effect of treatment. The books by Pearl [6] and Hernán & Robins [7] provide a thorough introduction to why 1 arm of a randomized trial can be used to estimate the distribution of the counterfactual variable

*Y*(1,

*M*(1)), which is a quantity defined for the whole population, not only the people in the

*A*=1 arm.

*a*=0 arm, treatment is set to 0 and the mediator to the value it would naturally take for that person when treatment is set to 0, leads to the following definition of the so-called natural direct and indirect effects. For ease of presentation alone, we compared the counterfactuals using their average values, but other scales such as odds ratios (ORs) could equally well have been used.

*E*[

*Y*(1,

*M*(1))]−

*E*[

*Y*(0,

*M*(0))]

*E*[

*Y*(1,

*M*(1))]−

*E*[

*Y*(1,

*M*(0))])+(

*E*[

*Y*(1,

*M*(0))]−

*E*[

*Y*(0,

*M*(0))])

*E*[

*Y*(

*a*,

*M*(

*a*))] for

^{*}*a*≠

*a*we can give a precise mathematical definition of mediation. This definition was originally introduced by Pearl [8] and much work has since been published on identification, estimation, and applications, culminating in the recent book by VanderWeele [9], where a comprehensive list of references can be found. As the definition of natural direct and indirect effects at its core builds on comparing distributions of nested counterfactuals, these effects can just as easily be expressed on other scales than the averages. For a survival outcome, it would, for instance, be more common to decompose the HR as follows:

^{*}*Y*(

*a, M (a*)) for potentially different

^{*}*a*and

*a*. As in the DAG in Figure 1, we will allow non- randomized study settings as well. The following assumptions are sufficient to identify natural direct and indirect effects on any scale from independent observations of the triplet (

^{*}*C, A, Y*) [9].

### No uncontrolled confounding

### Positivity

*A*or

*M*are continuous.

### Consistency

### Identification of natural effects

*Y (a, m*) is independent of the counterfactual mediator,

*M (a*) when ever

^{*}*a*and

*a*are different. Mathematically, the condition is:

^{*}*a*and

*a*are assumed to be different). From an applied perspective, assumption (4a) can be replaced by assuming that there are no confounders of the mediator-outcome relationship that are themselves affected by exposure. Or, perhaps more practically, one can assume that the indirect and direct effects are created by distinct and causally unrelated mechanisms.

^{*}*E[g (Y (a, M (a*))]. The arbitrary measurable function

^{*}*g*: R → R is included to demonstrate that it is the full distribution of the nested counterfactual that we have identified, not only the mean. For ease of exposition only, we will assume that

*C*and

*M*are discrete, with state space

*C*and

*M*, respectively.

*i*follows from (4a), equality

*ii*from (1a-1c), and equality

*iii*from (3a). The final expression only depends on the observed data and can therefore be estimated from the observed data. It appears as if the positivity assumption is not needed; however, it is precisely the positivity assumption that guarantees that, in large samples, all quantities in the final expression can be non-parametrically estimated. If one is only interested in a given function

*g*and contrasts such as

*E[g (Y (1, M (1)))] − E[g (Y (0, M (1)))]*, then the identification assumption can be reduced to certain no-interaction assumptions.

*g*is reduced to the identity function, the formula is known as Pearl’s mediation formula or just the mediation formula. While the mediation formula in principle allows non-parametric estimation of any mediation analysis, it can rarely be applied directly, as one will suffer the curse of dimensionality when trying to estimate

*E [g(Y) | A=a, M=m, C=c]*in all strata. As in all other branches of statistics, the curse of dimensionality is countered by introducing parametric modelling assumptions. This will be the theme of the next section.

### SECTION 3: ESTIMATING NATURAL EFFECTS MODELS

*W*is a deterministic function. The class of NEM models, along with the corresponding estimation techniques [11,12], also applies to survival models, such as the Cox model. Estimation algorithms for NEMs are all derived under the assumptions listed in the preceding section, and build on the trick that first we duplicate the original data set, then we create an artificial exposure,

*A*, which takes on different values in the 2 replications of each observation. Finally, we use an auxiliary model to link the artificial observations (i.e., those where

^{*}*A ≠ A*) to the mediators, which is either done through weighting or through imputation. Once this is done, the NEM can be estimated by simply applying standard software applied to the extended data set and using both

^{*}*A*and

*A*, possibly along with

^{*}*C*, as the model specification. This entire procedure has been automated and implemented in the R package medflex for any GLM type outcome model. In the following, we describe in detail how to estimate a NEM for a survival outcome with a multidimensional mediator. This is the situation we have in the illustrative case.

- Using the original data alone, fit a parametric survival model to the outcome conditioned on confounders, exposure, and mediator. This could, for instance, be a Weibull based parametric time-to-event model.

- Construct a new data set by repeating each observation in the original data set twice and including an additional variable a

^{*}, which is equal to the original exposure for the first replication and equal to the opposite of the actual exposure for the second replication. In addition, add an identification variable to indicate which data rows originate from the same subject.- Use the predict functionality, possibly along with the Weibull distribution function, to impute possible survival times for the rows where

*A ≠ A*. In the imputations, the value of the exposure is set to a^{*}^{*}, while mediators and confounders are set to their observed values; that is, impute values for the survival times*Y*)._{i}(a^{*}, M (a)- Fit a Cox model to the extended data set including

*A, A*, and C, but not the mediator. The coefficient of^{*}*A*will be the natural indirect log-HR and the coefficient of*a*will be the natural direct log-HR.^{*}- Repeat steps 3 and 4 ten times and combine the obtained log- HRs as with ordinary multiple imputation; that is, take the average of the log-HR estimates.

- CIs for the natural effect estimates, as well as derived quantities such as mediated proportions, can be obtained by bootstrapping, which involves repeating steps 1-5 a total of 1,000 times, each time creating a new data set by random sampling with replacement from the original data set.

### SECTION 4: ANALYSING THE ILLUSTRATIVE CASE

*β*-blockers could not reasonably be said to constitute distinct causal pathways, but were highly interdependent, mediation was only assessed through the combined 4-dimensional mediator. Accordingly, the counterfactual mediator was

*M (a)*є {0, 1}

^{4}, where a=0 indicates conservative invasive treatment and a=1 indicates early invasive treatment. The nested counterfactual was death time in years, starting 30 days after hospital discharge. To accommodate censoring, the nested counterfactual outcome technically had 2 dimensions, namely, an event time and an event indicator

*Y (a, M (a*)). For ease of exposition, we will only refer to the underlying event time and event indicator when required by context. Our effect measure of interest was natural direct and indirect HRs decomposing the total effect, which had a HR of 0.71 (95% CI, 0.67 to 0.74). Accordingly, the final natural effects model (i.e., the one fitted in step 4 of our suggested approach) should be a Cox proportional hazard model.

^{*}))=((T, δ) (a, M (a^{*}**Step 1:**For the actual mediation analysis, we fit a parametric survival model with a Weibull error distribution to the survival times using the treatment group, mediators, and a long list of potential confounders as explanatory variables (Appendix 1). The Appendix 2 presents both the employed R code and the full table of estimated parameters. Note that for technical reasons relating to R, it is better to use a copy of the exposure variable when fitting this model.

**Step 2:**The data set can be duplicated and the auxiliary variable inserted by copying the original data set twice. In both copies a new variable is created (called, say, exposure Star). In the first copy, the new variable is set to the values of the actual received treatment (exposure Star=exposure) while in the second copy, the new variable is set to the opposite value of the actual received treatment. Finally, the two copies are appended producing a single data set with twice as many rows as in the original data set. See the R code in Appendix 2 for coding advice.

**Step 3:**We now set the temporary exposure variable, used when fitting the imputation model in step 1, to the values in the just created exposure variable (i.e., exposure Star). As the employed survival model is parametric, we can randomly draw survival times conditional on observed mediators and the just-defined temporary exposure variable. This corresponds to randomly drawing the nested counterfactuals variables

*Y*, where a

_{a*,a}^{*}and

*a*are potentially different. To avoid extrapolating the imputation model outside what is supported by the data, any imputed survival time longer than the decided maximum follow-up (7 years in this case) will be artificially censored at the time of maximal follow-up. For the R implementation, see lines 21 to 32 of the Appendix 2.

**Step 4-5:**For each draw of the imputed nested counterfactuals, we fit a Cox model including the exposure that was actually received (the coefficient of this variable will estimate the natural indirect log-HR), the created exposure variable (the coefficient of this variable will estimate the natural direct log-HR), and all confounders. Note that the mediators are not included in this model. This is repeated 10 times, and the 10 resulting model fits are combined using standard formulas for multiple imputation. The resulting estimates are reported in Appendix 1. The fitted model is a natural effects Cox model.

**Step 6:**Finally, CIs are established by 5,000 bootstrap repetitions of steps 1-5. From the bootstrapped replications, we also directly obtain CIs for derived quantities, as the total effect (the sum of natural direct and indirect log-HRs) and the mediated proportion (natural indirect log-HR divided by total effect log-HR). The results are reported in Table 2.

### R package medflex to avoid own coding

fitAux ← glm (BinaryMort ~ dhrkag3 + asatreat30 + adptreat30

+ statintreat30 + betatreat30 + i_alder + factor (sex)

+ factor (indkgrp) + factor (uddankat) + boralene + factor (fi_aar)

+ mi + card + cochf + puled + shock + cervas + mal + diabet

+ arf + crf + anemi + pneumoni + sepsis + klap + bleed

+ Antihyp_12mb + Lipidlow_12mb + ASA_12mb + VitKant_12mb

+ Diureti_loop_12mb + COPD_12mb + tidl_reva, data=workData, family=“binomial”)

extendedData ← neImpute(fitAux, nMed=4)

fit NEM_binaryOutcome ← neModel (BinaryMort ~ dhrkag30 + dhrkag31

+ i_alder + factor (sex) + factor (indkgrp) + factor (uddankat)

+ boralene + factor (fi_aar) + mi + card + cochf + puled + shock

+ cervas + mal + diabet + arf + crf + anemi + pneumoni + sepsis

+ klap + bleed + Antihyp_12mb + Lipidlow_12mb + ASA_12mb

+ VitKant_12mb + Diureti_loop_12mb + COPD_12mb + tidl_reva, expData=extendedData, family=“binomial”, se=“robust”