3  Mediation with latent factors

We define mediation with latent factors as a two-step approach: we first perform dimensionality reduction on the omics data, and then use the resulting factors or clusters as latent mediators in the mediation analysis between the exposure and the outcome.

3.1 Load data and packages for Mediation Analysis with Latent factors

Before starting, load the data and packages using the following code.

Code
# Load R package
library(EnvirOmix)

# Load simulated data
data(simulated_data)

# Extract exposure and outcome vectors
exposure <- simulated_data[["phenotype"]]$hs_hg_m_scaled
outcome  <- simulated_data[["phenotype"]]$ck18_scaled

# Get numeric matrix of covariates 
covs <- simulated_data[["phenotype"]][c("e3_sex_None", "hs_child_age_yrs_None")] 
covs$e3_sex_None <- ifelse(covs$e3_sex_None == "male", 1, 0)

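# Create list of omics data (every element except the phenotype data)
omics_lst <- simulated_data[-which(names(simulated_data) == "phenotype")]

# Create data frame of omics data by joining all layers on sample name
# (functions are namespace-qualified because only EnvirOmix is attached above)
omics_df <- omics_lst |> 
  purrr::map(~ tibble::as_tibble(.x, rownames = "name")) |>
  purrr::reduce(dplyr::left_join, by = "name") |>
  tibble::column_to_rownames("name")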

3.2 Early integration

For early integration, we used principal component analysis (PCA) on the combined omics data as the dimensionality reduction step and selected the top i principal components, which together explained >80% of the variance. Following the joint dimensionality reduction step, we used the R package HIMA (Zhang et al. 2016) to examine whether the variance components mediated associations of in utero mercury exposure with MAFLD.

In this analysis, the selected principal components together explained >80% of the variance in the combined omics datasets. Of these components, 7 significantly mediated the relationship between maternal mercury and childhood liver injury (Figure 3.1).
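The component selection rule described above can be sketched in base R. This is a minimal illustration on small synthetic matrices, not the internals of lafamum; the object names (toy_omics, n_keep, scores) and the samples-by-features layout are assumptions made for the example.

```r
# Minimal sketch of early integration: concatenate layers, run PCA, and keep
# the smallest number of components whose cumulative variance exceeds 80%.
# The data are synthetic and all object names are hypothetical.
set.seed(1)
n <- 50
toy_omics <- list(
  layer_a = matrix(rnorm(n * 20), nrow = n),
  layer_b = matrix(rnorm(n * 30), nrow = n)
)

# Early integration: bind all features into one samples x features matrix
combined <- do.call(cbind, toy_omics)

# PCA on centered and scaled features
pca <- prcomp(combined, center = TRUE, scale. = TRUE)

# Cumulative proportion of variance explained by the components
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest i such that the top i components explain more than 80%
n_keep <- which(cum_var > 0.80)[1]

# Component scores that would serve as candidate latent mediators
scores <- pca$x[, seq_len(n_keep), drop = FALSE]
```

The columns of scores play the role of the latent mediators that are then passed to the mediation step.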

3.2.1 HIMA Early Integration

Code
# Run Analysis
result_med_with_latent_fctrs_early <-
  lafamum(exposure, 
          outcome,
          omics_lst, 
          covs = covs,
          Y.family = "gaussian",
          fdr.level = 0.05, 
          integration = "early")

3.2.2 Plot Early Integration

Code
# Plot
(plot_lafamum(result_med_with_latent_fctrs_early))

Figure 3.1: Mediation analysis with latent factors and early integration identifies joint components which mediate the association between maternal mercury and childhood liver injury. Panel A shows the mediation effects, where Alpha represents the coefficient estimates of the exposure to the mediator, Beta indicates the coefficient estimates of the mediators to the outcome, and TME (%) represents the percent total effect mediated calculated as alpha*beta/gamma. Panel B shows the individual correlation between the omic feature and the joint component.
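To make the quantities in the caption concrete, the sketch below computes alpha, beta, and the percent total effect mediated for a single simulated mediator using ordinary least squares. It assumes that gamma denotes the total effect of the exposure on the outcome; the variable names and simulated effect sizes are hypothetical, and HIMA itself uses a penalized, multi-mediator procedure rather than these three simple regressions.

```r
# Single-mediator illustration of alpha, beta, gamma, and TME (%).
# Synthetic data; gamma is ASSUMED to be the total effect of x on y.
set.seed(2)
n <- 500
x <- rnorm(n)                      # exposure
m <- 0.5 * x + rnorm(n)            # mediator (e.g., a component score)
y <- 0.4 * m + 0.2 * x + rnorm(n)  # outcome

alpha <- coef(lm(m ~ x))[["x"]]    # exposure -> mediator

fit_y  <- lm(y ~ m + x)
beta   <- coef(fit_y)[["m"]]       # mediator -> outcome, adjusted for exposure
direct <- coef(fit_y)[["x"]]       # direct effect of exposure

gamma <- coef(lm(y ~ x))[["x"]]    # total effect of exposure on outcome

# Percent of the total effect that is mediated
tme_pct <- 100 * alpha * beta / gamma
```

With linear models fitted by least squares, the total effect decomposes as gamma = direct + alpha * beta, which is why alpha * beta / gamma can be read as the mediated share of the total effect.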

3.3 Intermediate integration

Intermediate integration starts with a joint dimensionality reduction step using Joint and Individual Variation Explained (JIVE) (Lock et al. 2013). Following the joint dimensionality reduction step, we used the R package HIMA (Zhang et al. 2016) to examine whether the variance components mediated associations of in utero mercury exposure with MAFLD.

3.3.1 Conduct JIVE and Perform Mediation Analysis

3.3.1.1 Conduct JIVE

For this step, the jive function can estimate the optimal number of joint and individual ranks; the estimation approach is chosen with its method argument. For the simulated HELIX data, the optimal ranks, determined by setting method = "perm", were 22 joint ranks and 6, 9, 5, 5, and 8 individual ranks for the methylome, transcriptome, miRNA, proteome, and metabolome, respectively.
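As a rough sketch of the rank estimation described above, the code below calls jive from the r.jive package with method = "perm" on small synthetic layers. It assumes that r.jive is installed and that each layer is a matrix with features in rows and samples in columns; layers stored as samples by features would need to be transposed with t() first. The object names are hypothetical, and the permutation approach can be slow on real data.

```r
# Sketch of permutation-based rank estimation with r.jive (assumed installed).
# Synthetic layers: features in rows, samples in columns, same samples in each.
set.seed(3)
n <- 30
toy_layers <- list(
  layer_a = matrix(rnorm(40 * n), nrow = 40),
  layer_b = matrix(rnorm(25 * n), nrow = 25)
)

# Only run the estimation when the package is available
if (requireNamespace("r.jive", quietly = TRUE)) {
  fit <- r.jive::jive(toy_layers, method = "perm")
  print(fit$rankJ)  # estimated joint rank
  print(fit$rankA)  # estimated individual rank for each layer
}
```

The estimated ranks can then be supplied to the mediation step so that the permutation procedure does not need to be repeated.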

3.3.1.2 Perform mediation analysis

In this analysis, 6 joint components and 1 transcriptome-specific component significantly mediated the relationship between maternal mercury and childhood liver injury (Figure 3.2).

Code
# Run analysis with rankJ and rankA provided
result_med_with_latent_fctrs_JIVE <- 
  lafamum(exposure, 
          outcome,
          omics_lst,
          covs = covs, 
          Y.family = "gaussian",
          fdr.level = 0.05, 
          integration = "intermediate",
          jive.rankJ = 22,
          jive.rankA = c(6, 9, 5, 5, 8))

3.3.2 Plot Intermediate Integration

Code
(plot_lafamum(result_med_with_latent_fctrs_JIVE))

Figure 3.2: Mediation analysis with latent factors and intermediate integration identifies joint and individual variance components which mediate the association between maternal mercury and childhood liver injury. Panel A shows the mediation effects, where Alpha represents the coefficient estimates of the exposure to the mediator, Beta indicates the coefficient estimates of the mediators to the outcome, and TME (%) represents the percent total effect mediated calculated as alpha*beta/gamma. Panel B shows the individual correlation between the omic feature and the joint and individual components.

3.4 Late integration

For late integration, we used principal component analysis (PCA) as a dimensionality reduction step on each omics layer separately, and selected the top i principal components of each layer, which together explained >80% of the variance. Following the dimensionality reduction step, we used the R package HIMA (Zhang et al. 2016) to examine whether the variance components mediated associations of in utero mercury exposure with MAFLD.

This analysis identified 2 methylated CpG site, 1 miRNA, 1 protein, and 2 expressed gene transcript clusters that significantly mediated the association between mercury and MAFLD (Figure 3.3).
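The per-layer reduction described above can be sketched as follows. This is an illustration on synthetic data rather than the code used inside lafamum; the helper top_pcs and all object names are hypothetical, and layers are assumed to be stored as samples by features.

```r
# Minimal sketch of late integration: run PCA separately within each layer
# and keep, per layer, the top components explaining more than 80% of variance.
set.seed(4)
n <- 50
toy_omics <- list(
  layer_a = matrix(rnorm(n * 20), nrow = n),
  layer_b = matrix(rnorm(n * 30), nrow = n)
)

# Hypothetical helper: scores of the leading components of one layer
top_pcs <- function(x, threshold = 0.80) {
  pca <- prcomp(x, center = TRUE, scale. = TRUE)
  cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  k <- which(cum_var > threshold)[1]
  pca$x[, seq_len(k), drop = FALSE]
}

# One score matrix per layer
layer_scores <- lapply(toy_omics, top_pcs)

# Prefix component names with the layer name so each mediator stays traceable
for (nm in names(layer_scores)) {
  colnames(layer_scores[[nm]]) <-
    paste(nm, colnames(layer_scores[[nm]]), sep = "_")
}

# Candidate mediators from all layers, side by side
mediators <- do.call(cbind, layer_scores)
```

Unlike early integration, each component here summarizes a single omics layer, so a significant mediator can be attributed directly to that layer.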

3.4.1 HIMA Late Integration

Code
result_med_with_latent_fctrs_late <- lafamum(exposure, 
                                             outcome,
                                             omics_lst, 
                                             covs = covs,
                                             Y.family = "gaussian",
                                             fdr.level = 0.05, 
                                             integration = "late")

3.4.2 Plot Late Integration

Code
(plot_lafamum(result_med_with_latent_fctrs_late))

Figure 3.3: Mediation analysis with latent factors and late integration identifies features in each omics layer individually which mediate the association between maternal mercury and childhood liver injury. Panel A shows the mediation effects, where Alpha represents the coefficient estimates of the exposure to the mediator, Beta indicates the coefficient estimates of the mediators to the outcome, and TME (%) represents the percent total effect mediated calculated as alpha*beta/gamma. Panel B shows the individual correlation between the omic feature and components.

3.5 Pathway analysis

Following mediation analysis, you can use the correlation p-values to perform pathway analysis with appropriate pathway analysis software.
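As one possible way to prepare such an input, the sketch below computes a correlation p-value between each feature and a component score, and ranks the features by p-value. It uses synthetic data and base R only; the object names and the table layout are assumptions, since the format expected depends on the pathway analysis software you choose.

```r
# Sketch: feature-level correlation p-values for a single component,
# arranged as a ranked table. Synthetic data; all names are hypothetical.
set.seed(5)
n <- 60
features <- matrix(
  rnorm(n * 10), nrow = n,
  dimnames = list(NULL, paste0("feature_", 1:10))
)

# Toy component score driven mainly by the first feature
component <- features[, 1] + rnorm(n, sd = 0.5)

# Pearson correlation test of each feature against the component
p_values <- apply(features, 2, function(f) cor.test(f, component)$p.value)

# Ranked table with Benjamini-Hochberg adjusted p-values
input_tbl <- data.frame(
  feature = names(p_values),
  p_value = unname(p_values),
  p_adj   = unname(p.adjust(p_values, method = "BH"))
)
input_tbl <- input_tbl[order(input_tbl$p_value), ]
```

The ranked table can then be exported, for example with write.csv, in whatever format the chosen software expects.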