No association between genetically predicted C-reactive protein levels and colorectal cancer survival in Korean: two-sample Mendelian randomization analysis

OBJECTIVES Elevated C-reactive protein (CRP) levels are associated with an increased risk for colorectal cancer (CRC), as well as a poor prognosis, but it remains unclear whether these associations are causal. This study examined the potential causality between CRP levels and CRC survival using 2-sample Mendelian randomization (MR). METHODS From the Korean Genome and Epidemiology Study, a genome-wide association study (n=59,605), 7 single-nucleotide polymorphisms (SNPs) related to log2-transformed CRP levels were extracted as instrumental variables for CRP levels. The associations between the genetically predicted CRP and CRC-specific and overall mortality among CRC patients (n=6,460) were evaluated by Aalen’s additive hazard model. The sensitivity analysis excluded a SNP related to the blood lipid profile. RESULTS During a median of 8.5 years of follow-up, among 6,460 CRC patients, 2,676 (41.4%) CRC patients died from all causes and 1,622 (25.1%) died from CRC. Genetically predicted CRP levels were not significantly associated with overall or CRC-specific mortality in CRC patients. The hazard difference per 1,000 person-years for overall and CRC-specific mortality per 2-fold increase in CRP levels was -2.92 (95% confidence interval [CI], -14.05 to 8.21) and -0.76 (95% CI, -9.61 to 8.08), respectively. These associations were consistent in a subgroup analysis according to metastasis and a sensitivity analysis excluding possible pleiotropic SNPs. CONCLUSIONS Our findings do not support a causal role for genetically predisposed CRP levels in CRC survival.


INTRODUCTION
CRP is a variable that must be included when constructing a prognostic prediction model [10], including the Glasgow Prognostic Score, which is known to effectively predict the prognosis of CRC patients [14]. However, in these observational studies, it is difficult to differentiate between the deterioration of cancer-related inflammation and the clinical impact of elevated CRP levels themselves, because most studies analyzed peri-treatment CRP levels. The best timing for evaluating CRP-related markers remains unclear [6]. In addition, since the association between cancer progression and an immune response is often bidirectional and multifactorial, it is difficult to avoid reverse causality and residual confounding effects in observational studies [15]. To overcome these limitations and clarify the true causality, Mendelian randomization (MR) using genotypes as instrumental variables has been widely applied. Because genetic variants are randomly allocated by Mendel's law, an MR study using genetic variants as instrumental variables can be independent of potential confounders and can exclude the possibility of reverse causality. However, constructing large data sets with intermediate phenotypes and genetic instruments is challenging due to the high cost of measurements and/or the lack of suitable biological specimens. In this context, 2-sample MR can evaluate the association between exposure and outcome using 2 independent existing genome-wide association studies (GWASs), and 2-sample MR is steadily becoming more common in research using MR analysis [16].
GWASs have reported that several single-nucleotide polymorphisms (SNPs) were associated with CRP levels [17][18][19]; in the largest recent GWAS, those SNPs explained about 7.0% of the variance in CRP levels [17]. However, limited epidemiological data have been reported on the association between CRP-related genetic variants and the prognosis of CRC, and the findings are still inconsistent [20][21][22][23].
The International Survival Analysis in Colorectal Cancer Consortium recently reported that genetically predicted CRP levels were not significantly associated with CRC-specific mortality in a GWAS of 16,918 European CRC cases [24]. However, since CRP levels vary among ethnicities [25] and several SNPs were found to be related to CRP levels only in a GWAS of East Asians [19,26,27], there remains a need for further research to clarify the association between CRP levels and survival of CRC in other ancestries. Therefore, we evaluated the causal role of genetically predicted CRP levels in the survival of CRC in Koreans by conducting a 2-sample MR study with representative Korean GWAS datasets.

Sources of the C-reactive protein genome-wide association study data
Supplementary Material 1 shows the flow chart of the CRP GWAS. The association between SNPs and CRP levels was determined using the GWAS dataset from the Korean Genome and Epidemiology Study (KoGES) [28], a consortium project consisting of 6 prospective cohort studies supported by government funding. Over 223,000 participants have been recruited, with 72,298 from population-based studies (58,700 from the KoGES Health Examinee [KoGES_HEXA] study, 8,105 from the KoGES Cardiovascular Association Study [KoGES_CAVAS], and 5,493 from the KoGES Ansan and Ansung Study) who provided epidemiological information and genome-wide arrays after a quality control procedure. The KoGES_CAVAS and KoGES Ansan and Ansung Study consisted of community inhabitants, while the KoGES_HEXA study included participants recruited from the national health examinee registry. From the original sample, 10,358 participants whose serum CRP level was not measured, 146 participants with a serum CRP level ≥ 10 mg/L, 2,462 participants with a previous history of cancer, and 81 participants with missing values for a previous history of cancer were excluded from the analysis. Consequently, 59,605 participants were included in the final analysis (Supplementary Material 2). Genomic DNA was extracted from peripheral blood, and the GWAS was conducted using the Korean Biobank Array (K-CHIP) customized for the Korean population. Details on genotyping, GWAS quality control, and imputation have been described elsewhere [29]. K-CHIP contains 833,535 SNPs, including 89,413 SNPs present in East Asians. Imputation was conducted using the 1000 Genomes Phase 3 dataset of the East Asian population as a reference panel.

Source of the colorectal cancer genome-wide association study data
The CRC GWAS data were obtained from the Hwasun Cancer Epidemiology Study-Colon and Rectum Cancer (HCES-CRC). The Hwasun Cancer Epidemiology Study (HCES) is a hospital-based case-control study aiming to identify serologic and genetic risk factors for multiple cancers, including esophageal [30], breast [31], gastric [32], and colorectal cancers [33]. The HCES-CRC consisted of 7,089 hospital-based CRC cases and 4,979 population-based cancer-free controls. Details of genotyping and GWAS quality control have been described elsewhere [33]. The baseline characteristics of the CRC GWAS are presented in Supplementary Material 3. In brief, the subjects were patients diagnosed with histologically confirmed CRC at Chonnam National University Hwasun Hospital between 2004 and 2014. Germline DNA genotyping was performed using the Infinium OncoArray-500K BeadChip (Illumina Inc., San Diego, CA, USA) in 3,158 CRC cases, and the Infinium Multi-Ethnic Global BeadChip (MEGA, Illumina Inc.) in 3,465 cases. Of those, 163 cases without information on the tumor, node, metastasis stage were excluded, and 6,460 CRC cases were finally included in the analysis. The cause and date of death were obtained from the National Statistical Office. The date of death was ascertained until December 31, 2020. The cause of death was coded according to the International Classification of Diseases, 10th revision. The details of the HCES-CRC and imputation procedure have been described previously [34]. The analysis included SNPs with an info score greater than 0.4.

Associations between genetic variants, C-reactive protein levels, and colorectal cancer survival
This study consisted of a discovery cohort of 47,258 individuals from KoGES_HEXA and a replication cohort of 12,347 individuals from KoGES_CAVAS and the KoGES Ansan and Ansung Study. Multivariate linear regression was performed to evaluate the association between genetic variants and log2-transformed serum CRP levels. Age, sex, survey year, and the assessment centers of cohort studies were adjusted. The first 10 principal components were also adjusted to correct for the possible population structure in the GWAS. The statistical analyses were performed using PLINK version 1.90b6.0 (https://www.cog-genomics.org/plink/). SNPs with a minor allele frequency (MAF) < 0.05 were excluded from the analysis, leaving 1,859 SNPs significantly associated with log2-transformed serum CRP levels in KoGES (p < 5 × 10⁻⁸). We used a linkage disequilibrium (LD)-based clumping cut-off of r² < 0.001 and a window size of 10,000 kb directly from the KoGES genotyping data. The results of the discovery and replication phases are presented in Supplementary Material 4. Discovery analysis identified 13 significant SNPs associated with serum CRP levels, of which 7 were replicated. The replicated SNPs were rs2794520 near CRP, rs12133641 in IL6R, rs71086917 in LINC02819, rs1260326 in GCKR, rs7383869 near IL6, rs79320731 in HNF1A, and rs429358 in APOE.
The association between the selected SNPs and log2-transformed serum CRP levels was re-evaluated in a pooled analysis, and these results were used for the MR study. The minimum F-statistic among the selected SNPs was 61.2, so bias from weak instruments in the main analysis was expected to be negligible.
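The text does not state how the F-statistics were derived. As a minimal sketch, assuming the common single-SNP approximation F ≈ (beta / SE)², instrument strength could be screened as follows. The SNP labels and numbers are hypothetical and are not values from the study.

```python
def f_statistic(beta: float, se: float) -> float:
    """Approximate single-SNP F-statistic as the squared t-statistic (beta / se)^2."""
    return (beta / se) ** 2

# Hypothetical SNP-CRP summary statistics (beta, standard error); not study values.
snp_stats = {
    "snp_a": (0.120, 0.010),
    "snp_b": (0.050, 0.006),
    "snp_c": (0.040, 0.005),
}

f_values = {snp: f_statistic(b, s) for snp, (b, s) in snp_stats.items()}
weakest = min(f_values, key=f_values.get)

# Conventional rule of thumb: F > 10 suggests limited weak-instrument bias.
all_strong = all(f > 10 for f in f_values.values())
```

Reporting the minimum F across instruments, as the paragraph above does, is the conservative summary: if the weakest SNP clears the threshold, all of them do.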
Because Aalen's additive hazard model preserves linearity, it can be used in 2-sample MR analysis regardless of the proportional hazard assumption [35]. Therefore, we conducted Aalen additive hazard regression to evaluate the association between CRP-related SNPs and CRC survival using the R package "timereg." Statistical analyses were performed using R version 4.2.0 (R Foundation for Statistical Computing, Vienna, Austria).
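The role of linearity can be sketched as follows, under assumptions that are not spelled out in the original: a single variant $G$, exposure $X$ (log2-transformed CRP), an unmeasured confounder $U$ and an error term $\varepsilon$ that are both independent of $G$, and time-constant coefficients. All symbols are introduced here for illustration only.

```latex
\begin{align}
\lambda(t \mid X, U) &= \lambda_0(t) + \beta_X X + \beta_U U, \\
X &= \alpha_0 + \alpha_G G + \alpha_U U + \varepsilon, \\
\lambda(t \mid G) &= \lambda_0^{*}(t) + \beta_X \alpha_G \, G, \\
\beta_X &= \frac{\beta_{GY}}{\alpha_G}, \qquad \beta_{GY} = \beta_X \alpha_G .
\end{align}
```

Under these assumptions the terms involving $U$ and $\varepsilon$ are absorbed into the baseline $\lambda_0^{*}(t)$, so the SNP-outcome coefficient $\beta_{GY}$ is the product of the two per-allele effects. The ratio of the SNP-outcome to the SNP-exposure coefficient then gives the hazard difference per unit of log2 CRP without requiring proportional hazards.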

Two-sample Mendelian randomization and genetic risk score
We estimated the association between genetically predicted CRP levels and CRC survival with the inverse-variance weighted (IVW) method, using the R package "MendelianRandomization" [36].
The estimated associations of genetically predicted CRP levels with hazard differences (HDs) in mortality were expressed with respect to a 2-fold increase in the serum CRP level.
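As an illustration of the IVW calculation, the sketch below implements the fixed-effect form: a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin, with weights 1/SE². It is a stand-in written in Python, not the "MendelianRandomization" code itself, whose default settings may differ. The input values are hypothetical.

```python
import math

def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW estimate: weighted regression of beta_y on beta_x through the origin."""
    weights = [1.0 / s ** 2 for s in se_y]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_x, beta_y))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_x))
    estimate = num / den
    se = math.sqrt(1.0 / den)
    return estimate, se, (estimate - 1.96 * se, estimate + 1.96 * se)

# Hypothetical per-allele summary statistics for three SNPs (not study values).
beta_x = [0.10, 0.06, 0.04]            # SNP effect on log2 CRP
beta_y = [-0.0004, 0.0002, -0.0001]    # SNP effect on the hazard (per person-year)
se_y = [0.0006, 0.0007, 0.0008]        # standard error of beta_y

hd, hd_se, (ci_low, ci_high) = ivw_estimate(beta_x, beta_y, se_y)

# beta_x is on the log2 scale, so one unit corresponds to a 2-fold increase in CRP.
hd_per_1000_py = 1000 * hd
```

Because the exposure is log2-transformed, no further rescaling is needed to express the estimate per 2-fold increase. Only the multiplication by 1,000 is required to report it per 1,000 person-years.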
Seven SNPs were selected as instrumental variables to calculate the weighted genetic risk score (GRS) for the log2-transformed serum CRP level. Supplementary Material 5 shows the distribution of the weighted GRS for CRP and CRC GWAS. In PLINK, the weights are the estimated beta coefficients associated with each copy of the minor allele in a linear regression analysis. The mean GRS per non-missing genetic marker was calculated and the GRS was divided into quintiles (Q1 to Q5).
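The scoring step can be sketched in a few lines of Python. This is an illustrative re-implementation under stated assumptions, not PLINK's scoring routine, whose normalisation may differ by a constant factor. Missing genotypes are marked as `None`, and all dosages and weights below are hypothetical.

```python
import statistics

def weighted_grs(dosages, weights):
    """Mean weighted allele score over non-missing markers (None marks a missing genotype)."""
    scored = [d * w for d, w in zip(dosages, weights) if d is not None]
    return sum(scored) / len(scored) if scored else None

def quintile_labels(scores):
    """Assign Q1..Q5 using the four quintile cut points of the observed scores."""
    cuts = statistics.quantiles(scores, n=5)
    return ["Q%d" % (1 + sum(s > c for c in cuts)) for s in scores]

# Hypothetical per-SNP weights (beta per minor allele) and minor-allele counts.
weights = [0.12, 0.05, 0.04]
genotypes = [
    [2, 1, 0],
    [1, None, 2],   # second marker missing for this person
    [0, 0, 1],
    [2, 2, 2],
    [1, 1, 1],
]

scores = [weighted_grs(g, weights) for g in genotypes]
labels = quintile_labels(scores)
```

Averaging over non-missing markers keeps scores comparable between individuals with different numbers of successfully genotyped SNPs.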

Statistical power
To the best of our knowledge, there is no available tool to estimate statistical power for survival outcomes in MR. Instead, we used a conservative tool that considered binary survival outcomes [37]. Of a total of 6,460 CRC cases, 2,676 (41.4%) deaths occurred over an 8-year follow-up period. In previous meta-analyses of CRC patients, the hazard ratios (HRs) for overall survival associated with elevated CRP levels and an elevated CRP-to-albumin ratio were 2.04 [12] and 2.03 [13], respectively. We had more than 90% power to detect an odds ratio (OR) of 1.50 for the association between CRP levels and overall mortality at a significance level of 0.05, assuming that the GRS would explain 4.0% of the variance in CRP levels.
In addition, we ran a simulation using an additive hazards model for power calculation. With 6,460 CRC cases and 2,676 deaths accrued over an 8-year follow-up, the population-averaged hazard was estimated to be 2,676/(6,460 × 8) = 0.052 per person-year (PY). We had at least 90% power to detect a 50.0% difference in hazard (HD, 0.026) for every 1 standard deviation (SD) increase in log2-transformed CRP levels, assuming that 4.0% of the variance of CRP was explained by the GRS. The R code for the simulation was modified from the R code in the study of Hua et al. [24].
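The arithmetic behind the target effect size can be reproduced directly. The short Python sketch below only restates the numbers in the text and does not re-implement the power simulation itself. It follows the same simplification as the text, namely that every case contributes the full 8 years of follow-up.

```python
deaths = 2676
cases = 6460
follow_up_years = 8

# Approximation used in the text: every case is followed for the full 8 years.
person_years = cases * follow_up_years        # 51,680 PY
avg_hazard = deaths / person_years            # about 0.052 per PY

# A 50% difference relative to the average hazard, per 1-SD increase in log2 CRP.
target_hd = 0.5 * avg_hazard                  # about 0.026 per PY
target_hd_per_1000_py = 1000 * target_hd      # same quantity per 1,000 PY
```

On the per-1,000-PY scale used in the results, this detectable difference is roughly 26.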

Sensitivity analysis
MR relies on 3 assumptions. First, genetic variants are associated with the exposure (CRP). Second, genetic variants are not associated with potential confounders. Third, genetic variants are not directly associated with the outcome (death), except through the exposure (a lack of horizontal pleiotropic effects). In our study, the selected SNPs were validated in the CRP GWAS and were independent of each other (not in LD). In addition, the pleiotropic effects of the selected SNPs were checked using Phenoscanner, a database of GWAS results [38], and an association of rs429358 in APOE with the blood lipid profile was reported [39,40]. Therefore, we performed an additional sensitivity analysis excluding rs429358.
For 2-sample MR, 3 sensitivity analyses (simple and weighted median, and MR-Egger regression) were performed. Although the IVW method is sensitive to violations of the assumption regarding pleiotropy, the results from the simple and weighted median are consistent even when up to 50% of the information comes from invalid instrumental variables [41]. The intercept estimated from the MR-Egger regression provides an estimate of the horizontal pleiotropic effect [42].
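The three sensitivity estimators can be sketched from summary statistics as follows. This is an illustrative Python re-implementation under stated assumptions (first-order weights, at least 2 SNPs, no standard errors computed), not the code used in the analysis.

```python
import statistics

def ratio_estimates(beta_x, beta_y):
    """Per-SNP Wald ratios (SNP-outcome effect divided by SNP-exposure effect)."""
    return [by / bx for bx, by in zip(beta_x, beta_y)]

def weighted_median(beta_x, beta_y, se_y):
    """Weighted median of the Wald ratios with weights (beta_x / se_y)^2; needs >= 2 SNPs."""
    rows = sorted((by / bx, (bx / s) ** 2) for bx, by, s in zip(beta_x, beta_y, se_y))
    total = sum(w for _, w in rows)
    cum, running = [], 0.0
    for _, w in rows:
        running += w
        cum.append((running - 0.5 * w) / total)   # mid-point cumulative weight
    k = max(i for i, c in enumerate(cum) if c < 0.5)
    t0, t1 = rows[k][0], rows[k + 1][0]
    # Linear interpolation to the point where the cumulative weight reaches 50%.
    return t0 + (t1 - t0) * (0.5 - cum[k]) / (cum[k + 1] - cum[k])

def mr_egger(beta_x, beta_y, se_y):
    """Weighted least squares of beta_y on beta_x with an intercept; returns (intercept, slope)."""
    # Orient every SNP so that its effect on the exposure is positive.
    pts = [(abs(bx), by if bx > 0 else -by, 1.0 / s ** 2)
           for bx, by, s in zip(beta_x, beta_y, se_y)]
    w_sum = sum(w for _, _, w in pts)
    x_bar = sum(w * x for x, _, w in pts) / w_sum
    y_bar = sum(w * y for _, y, w in pts) / w_sum
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in pts)
    sxx = sum(w * (x - x_bar) ** 2 for x, _, w in pts)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical summary statistics (not study values).
bx = [0.10, 0.06, 0.04, 0.08]
by = [-0.0004, 0.0002, -0.0001, 0.0001]
se = [0.0006, 0.0007, 0.0008, 0.0005]

simple_med = statistics.median(ratio_estimates(bx, by))
weighted_med = weighted_median(bx, by, se)
egger_intercept, egger_slope = mr_egger(bx, by, se)
```

An MR-Egger intercept close to zero is read as little evidence of directional pleiotropy, while the slope is the pleiotropy-adjusted causal estimate.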

Ethics statement
The KoGES was reviewed and approved by the Korea Centers for Disease Control and Prevention in Korea (IRB No. 2015-08EXP-01-C-A and 2016-02-20-C-A). Informed consent was obtained from the participants.
There are 2 possible explanations for the association between the remaining novel SNP, rs71086917 (LINC02819), and CRP levels. First, although, to our knowledge, an association between LINC02819 and CRP has not been reported in previous studies, rs10908724 in LINC02819 and in the LD block for rs71086917 (R² = 0.180) was related to MCP-1 [44], which mediates the chemotaxis of CRP [45]. However, since the results of the GWAS do not reveal the function of the SNP, the association between the regulation of LINC02819 and CRP or MCP-1 still needs to be evaluated. Second, rs10908724 may be a proxy SNP for rs3093068 (CRP) identified in a previous GWAS [43]. In our genotype data, rs10908724 was correlated with rs3093068 (R² = 0.02).
Table 2 presents the effects of genetically predicted CRP levels on the risk of death using summary statistics and the GRS. The effects estimated by the IVW method showed that a 2-fold increase in serum CRP levels was not significantly associated with the risk of overall or CRC-specific mortality (HD per 1,000 PY: -2.92 and -0.76, respectively; 95% CI, -14.05 to 8.21 and -9.61 to 8.08, respectively). Furthermore, the results of sensitivity analyses using the simple median and weighted median estimation were consistent with the main results. The MR-Egger intercept showed no significant evidence of pleiotropic effects (p = 0.669 for overall mortality and p = 0.876 for CRC-specific mortality). These non-significant results were similar in the sensitivity analysis that excluded the SNP related to the blood lipid profile. The scatter plot of SNP-specific associations with CRC-specific survival against coefficients of SNP-CRP associations and the regression line depicting the association between genetically predicted CRP levels and survival are visualized in Supplementary Materials 6 and 7, respectively.
A 1-SD increment in the GRS for CRP levels was not significantly associated with CRC-specific mortality (HD per 1,000 PY, -2.09; 95% CI, -4.26 to 0.08). The GRS for CRP levels demonstrated a significant association with overall mortality (HD per 1,000 PY, -2.09; 95% CI, -5.77 to -0.59). Additionally, compared to the third quintile of the GRS for CRP levels, the HD for the first quintile of the GRS for CRP levels was 8.53 (95% CI, 1.12 to 15.94). However, these associations were not observed in a sensitivity analysis that excluded rs429358, a variant known to be associated with the blood lipid profile. Table 3 presents the subgroup analysis according to metastasis. In the analyses using summary statistics and the individual GRS, the association between genetically predicted CRP levels and mortality was not significant, regardless of metastasis.
Table footnotes: 1 The positions of the SNPs were derived from GRCh37. 2 Age, sex, study centers, survey years, and first 10 principal components were adjusted in all models. 3 Age, sex, genotyping array, tumor, node, metastasis stage, and first 10 principal components were adjusted in all models.

DISCUSSION
This study did not find evidence for an association between genetically elevated CRP levels and survival among CRC patients in Korea. Consistent results were found in a sensitivity analysis excluding a possible pleiotropic SNP and a subgroup analysis according to metastasis.
Table footnotes: Values are presented as hazard difference per 1,000 person-years (95% confidence interval). CRC, colorectal cancer; CRP, C-reactive protein; GRS, genetic risk score; IVW, inverse-variance weighted method; SD, standard deviation; SNP, single-nucleotide polymorphism. 1 Results are expressed per 2-fold increase in serum CRP levels. 2 Age, sex, genotyping array, and tumor, node, metastasis stage were adjusted.
Table footnotes: Values are presented as hazard difference per 1,000 person-years (95% confidence interval). CRC, colorectal cancer; CRP, C-reactive protein; GRS, genetic risk score; IVW, inverse-variance weighted method; SD, standard deviation. 1 Results are expressed per 2-fold increase in serum CRP levels. 2 Age, sex, and genotyping array were adjusted.
The effect of CRP levels on survival was recently evaluated in CRC cases of European ancestry using 2-sample MR [24]. Similar to our results, Hua et al. [24] reported that genetically predicted CRP levels were not associated with CRC-specific mortality regardless of metastasis. Compared to the instrumental variants used by Hua et al. [24], among the 7 CRP-related SNPs used in our study, 2 SNPs (rs2794520 and rs1260326) were the same, and 3 correlated SNPs (rs12133641, rs79320731, and rs429358) were located in the same LD blocks. In addition, the effect alleles showed a consistent directionality in terms of their impact on CRP levels. However, the effect allele frequency (EAF) and effect size of selected SNPs were different, which is presumed to be due to ethnic differences. The EAF and effect size of rs1260326 were 0.450 and 0.008, respectively, in our study, compared to 0.610 and -0.050, respectively, in the study of Hua et al. [24]. In particular, rs1880241, which was included in the study of Hua et al. [24], was excluded from our CRP GWAS because of its low MAF in the East Asian population (< 0.01). Nonetheless, we could not find a causal effect of CRP levels on survival in CRC patients in this East Asian population, similar to findings in the previously studied European population.
Although not MR studies, several previous genetic studies have investigated survival in CRC patients. Two studies evaluated associations between CRP-related SNPs and mortality in CRC patients, but the results were also not significant [21,22]. Two GWASs examined survival in CRC patients. In the Scottish Colorectal Cancer Study [46], no variants reached the p-value threshold for statistical significance. In contrast, Phipps et al. [47] reported that rs209489 was associated with poor survival in patients with distant metastatic CRC. However, since rs209489 was not significantly associated with log 2 -transformed CRP levels in our GWAS, we did not consider this SNP.
Regarding the association between CRP levels and CRC survival, contrary to our findings, previous observational studies have reported that high CRP levels were associated with a poor CRC prognosis. A meta-analysis of 21 observational studies reported that an elevated preoperative CRP level was associated with poor survival with pooled HRs of 2.04 (95% CI, 1.45 to 2.85) for overall survival and 4.37 (95% CI, 2.63 to 7.27) for CRC-specific survival [12]. In a study of CRC patients treated with neoadjuvant therapy and surgery, CRP levels were associated with disease-free survival independently of carcinoembryonic antigen levels or resection margins [5]. Another meta-analysis using the CRP-to-albumin ratio had similar results [13]. However, since the treatment, stage, and CRP cut-off varied in those studies, it is difficult to distinguish between the effects of cancer-related inflammation and circulating CRP. The prognostic role of CRP levels in CRC patients remains a matter of debate in studies of cancer-free general populations. In the general population, the effects of cancer-related inflammation on the association between CRP and CRC prognosis would be reduced, although inconsistent findings have been reported regarding a positive association between elevated pre-diagnostic CRP levels and CRC mortality. In the National Health and Nutrition Examination Survey III, CRP levels were positively associated with CRC-specific mortality in the general population [3], while the Apolipoprotein Mortality Risk Study reported a null association [48], and the Copenhagen City Heart Study [49] reported a possible association between baseline CRP levels and CRC-specific mortality in their cohort.
Chronic inflammation induces cancer invasion, progression, and metastasis, and influences the efficacy of chemotherapy and immunotherapy [50]. CRP, which is synthesized in the liver, is an acute-phase protein that reflects inflammation. However, we did not find an effect of genetically predicted levels of CRP, a systemic inflammatory mediator, on CRC survival, unlike many previous case-control studies [5,7,15], prospective studies [3,4], and meta-analyses [10,12,13]. Since the potential causal role of CRP was also not confirmed in previous MR studies, the observed effect of CRP elevation on CRC mortality may be due to residual confounding or reverse causality.
Our study had several limitations. First, since information on recurrence was not available, disease-free survival was not included in the analysis. Second, the batch effect may not have been excluded in CRP and CRC GWAS. Therefore, to minimize the batch effect, we evaluated the association between genetic variants and serum CRP levels by statistically adjusting for survey years and study sites in the CRP GWAS, and the association between genetic variants and survival by statistically adjusting for genotyping arrays in the CRC GWAS. Third, because there were only 804 metastatic CRC cases in our study, a further evaluation is needed to clarify the effect of CRP levels on survival in patients with metastatic CRC. Fourth, to confirm the null association between CRP levels and mortality in CRC patients, future studies with higher statistical power are needed. In particular, to improve the power of the MR analysis, the CRP variance explained by genetic instruments should be discussed in terms of the biological effects of selected SNPs or CRP levels on cancer mortality to compensate for the lack of functional analysis of SNPs.
In summary, we found that genetically predicted CRP levels were not associated with the overall or CRC-specific survival of CRC patients. Therefore, our results suggest that genetically predisposed circulating CRP levels do not play a causal role in the prognosis of CRC.