Data profile: the Korean Workers’ Compensation-National Health Insurance Service (KoWorC-NHIS) cohort
Abstract
The Korean Workers’ Compensation-National Health Insurance Service (KoWorC-NHIS) cohort was established to investigate the longitudinal health outcomes of Korean workers who have been compensated for occupational injuries or diseases. The cohort merged workers’ compensation insurance claim data from 2004 to 2015 with the National Health Insurance Database (NHID) and encompasses 858,793 participants. The data include socio-demographic factors such as age, sex, income, address, insurance type, and disability grade. They also cover the type of occupational accident, International Classification of Diseases, 10th revision codes for diseases or accidents, work tenure, industry, occupation code, and company size. Additional details such as the hire date, date of claim, date of recognition, and affected body parts are recorded. The cohort consists predominantly of male workers (80.0%), with the majority experiencing their first occupational accident in their 40s (27.6%) or 50s (25.3%). Notably, 93.1% of the cases were classified as occupational injuries. Through linkage with the NHID, health utilization, employment status, and income changes are updated annually. The follow-up period for this study is set to conclude in 2045.
INTRODUCTION
Industrial accidents have declined but remain disproportionately prevalent among vulnerable, low-status workers due to labor market polarization, which frequently involves irregular or short-term employment through subcontractors [1-5]. Outsourcing complicates the legal responsibility for worker safety and disrupts safety management systems, potentially leading to regulatory failures [6-10]. Precarious employment complicates the tracking and assessment of long-term occupational health effects due to frequent job changes and the difficulty in identifying employers [10].
Using data from the National Health Insurance Database (NHID), we constructed a cohort to explore the health effects of occupational characteristics on precarious workers in Korea. The country’s single-payer social insurance system, which encompasses both workers’ compensation insurance and national health insurance, facilitates detailed tracking of workers across various employment statuses, including day laborers and part-time workers [11,12]. Despite the extensive coverage of 19.87 million wage workers, representing 67.7% of the workforce, challenges remain in covering the entire economically active population [13,14]. We conducted a longitudinal assessment of healthcare behaviors and disease incidence by linking health utilization data with workers’ compensation insurance records. This dataset includes all Korean industrial accident workers and offers unique insights into changes in income, occupational transitions, health behaviors, and workplace risks. Our research aimed to elucidate both the immediate and long-term health impacts of industrial accidents, particularly for precarious workers, who are often overlooked in traditional studies.
DATA RESOURCE AND POPULATION COVERAGE
Data collected
Workers’ compensation insurance claim data were merged with the NHID using resident registration numbers in this cohort. The merging was conducted by the Expert Data Combination Agency in accordance with the Enforcement Decree of the Personal Information Protection Act.
The Korea Workers’ Compensation and Welfare Service (COMWEL) administers workers’ compensation under the Industrial Accident Compensation Insurance Act. This act provides coverage for medical expenses and survivor compensation in cases of occupational injuries and diseases. The COMWEL assesses claims by examining employment records, as well as statements from workers and their coworkers, and conducts site inspections when necessary. Decisions on industrial accidents are determined according to established guidelines or, in more complex situations, by a committee of experts.
All information was collected from the compensation data when occupational injuries occurred. This cohort study included workers who received compensation from COMWEL during the period from 2004 to 2015.
To track health outcomes, we combined approved occupational accident (occupational injury and occupational disease) compensation data with records from the NHID. We linked the NHID and compensation databases using personal IDs. The data merging was conducted by a specialized agency, the National Health Insurance Service (NHIS), in accordance with the Enforcement Decree of the Personal Information Protection Act. All data were encrypted and processed securely.
Korea operates a universal health insurance system, under which the entire population is covered by the National Health Insurance Act [12]. In this mandatory system, all medical institutions in Korea are required to contract with the government. Medical services are paid for through a regulated fee-for-service system, and insurance premiums are determined by income level. A population-based database from the NHIS was compiled using the NHID; it includes sociodemographic characteristics, medical treatment records, and income proportions of the population. Because the data are not updated in real time, the data period is selected according to the timing of the analysis. For this study, data were included through December 31, 2022. The follow-up period is subject to updates.
This cohort included workers whose workers’ compensation claims were approved. We excluded cases involving survivor benefits. If a worker had more than one approved claim, only the first approved case was included. Among the total population, 781 individuals were approved for workers’ compensation twice, 15 were approved 3 times, and 2 were approved 4 times. Individuals who received medical assistance were also excluded. Finally, we excluded workers whose accidents occurred before 2002, because NHIS claims data from before that year were not in a format suitable for our research (Figure 1).
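The selection rules in this paragraph can be sketched in code. The following is a minimal illustration only: the `Claim` record and all of its field names are hypothetical and do not correspond to actual COMWEL or NHID column names. It also assumes that the exclusions are applied before the first approved claim is chosen, an ordering the text does not specify.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Claim:
    # All field names are hypothetical placeholders, not real COMWEL columns.
    person_id: str
    accident_date: date
    approved: bool
    survivor_benefit: bool
    medical_aid: bool  # True if the person received medical assistance


def select_cohort(claims):
    """Return one claim per person: the earliest approved, eligible claim."""
    eligible = [
        c for c in claims
        if c.approved
        and not c.survivor_benefit
        and not c.medical_aid
        and c.accident_date >= date(2002, 1, 1)  # drop accidents before 2002
    ]
    first = {}
    # Sorting by date means setdefault keeps the earliest claim per person.
    for c in sorted(eligible, key=lambda c: c.accident_date):
        first.setdefault(c.person_id, c)
    return first
```

Under these assumptions, a worker with two approved claims contributes only the earlier one, which mirrors the handling of the 781 workers approved twice.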
Variables
The Korean Workers’ Compensation-National Health Insurance Service (KoWorC-NHIS) cohort study incorporated sociodemographic variables such as sex, age, insurance type, income, and region of residence and socioeconomic indicators from the NHID. This setup facilitated longitudinal analyses (Table 1).
Variables for the Korean Workers’ Compensation-National Health Insurance Service (KoWorC-NHIS) cohort
We extracted the database from the NHID, spanning from the cohort entry point to the censoring point, using personal ID as the merging key. The cohort database comprises 6 tables: eligibility, cause of death, medical treatment, health examination, medical care institution, and occupational accidents. The eligibility table contains data on sex, age, residence, income, insurance type, and disability status. The cause of death table records the cause and date of death. The medical treatment tables catalog diagnosis codes for diagnosed conditions, classifications for inpatient, outpatient, and emergency room visits, treatments and medications administered during those encounters, and associated medical costs. The health examination table includes information on past medical and family history, health behaviors such as smoking, drinking, and physical activity, as well as health examination measurements like height, weight, body mass index, chest X-rays, blood pressure, fasting blood sugar, and other laboratory test results. The medical care institution table includes information about the type of institution, location, number of beds, facilities, and number of physicians.
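As a rough illustration of how a per-person record might be assembled from these tables, the sketch below gathers rows from toy tables that share a key. The key name `person_id`, the column names, the example values, and the helper `link_person` are all assumptions made for this example; the actual NHID schema is not described here.

```python
def link_person(person_id, tables):
    """Collect every row for one person from each table, keyed by table name."""
    return {
        name: [row for row in rows if row["person_id"] == person_id]
        for name, rows in tables.items()
    }


# Toy data; column names and values are hypothetical.
tables = {
    "eligibility": [
        {"person_id": "A", "year": 2010, "insurance_type": "employee"},
    ],
    "medical_treatment": [
        {"person_id": "A", "visit_date": "2011-02-03", "diagnosis_code": "S62"},
        {"person_id": "B", "visit_date": "2012-07-09", "diagnosis_code": "J45"},
    ],
    "occupational_accidents": [
        {"person_id": "A", "accident_date": "2010-05-01"},
    ],
}

record = link_person("A", tables)
```

The occupational accidents table contributes a single row per person, while the eligibility and medical treatment tables can contribute many rows over the follow-up period.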
Data in the occupational accidents table are sourced from the COMWEL compensation database. This table details occupational injuries or diseases, with variables fixed at the time of the incident. Consequently, each ID is assigned a single value that reflects the circumstances at the time of the industrial accident. Occupational data and company information, including details such as International Classification of Diseases, 10th revision codes, occupation codes, tenure, accident/diagnosis dates, and duration of compensation claims, were sourced from the Workers’ Compensation Database. Occupation and industry classifications were manually converted to facilitate international comparisons.
To adjust for comorbidities that significantly affect disease prognosis and may act as confounders or mediators, both disability severity and the Charlson comorbidity index score are available. Comorbidity data are obtained from national insurance records, with the ascertainment window set according to the design of each study. Impairment severity was assessed post-treatment by COMWEL physicians, who evaluated the compensation levels.
Entry and censoring points
The cohort entry date was defined as either the date of the occupational injury or the date of occupational disease diagnosis, as recorded in the COMWEL database. The censoring point was set at the earlier of 2 dates: the date of death or December 31, 2022. All participants were followed from their entry date to the censoring point. Person-years were calculated using the follow-up period (Table 2). The total person-years amounted to 13,223,234. Additionally, the average follow-up duration was 13.5 years.
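The entry and censoring rule can be expressed as a short calculation. This is a sketch under stated assumptions: follow-up is measured in days and converted using 365.25 days per year, a convention the text does not specify, and the function names are hypothetical.

```python
from datetime import date

STUDY_END = date(2022, 12, 31)  # administrative censoring date given in the text


def follow_up_years(entry, death=None, study_end=STUDY_END):
    """Years from cohort entry to the earlier of death or the study end."""
    censor = min(death, study_end) if death is not None else study_end
    # Assumed convention: 365.25 days per year; negative spans count as zero.
    return max((censor - entry).days, 0) / 365.25


def total_person_years(participants):
    """Sum follow-up over (entry_date, death_date_or_None) pairs."""
    return sum(follow_up_years(entry, death) for entry, death in participants)
```

For example, `follow_up_years(date(2012, 12, 31))` gives roughly 10 years for a survivor, while passing `death=date(2015, 12, 31)` shortens this to roughly 3 years.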
Ethics statement
This study involving human participants was reviewed and approved by the Institutional Review Board of Hanyang University, Korea (study No. HYUIRB-202012-005-4).
DATA RESOURCE USE
The baseline characteristics of the study participants are detailed in Table 3. The study included a total of 858,793 participants, with males comprising 80.0% of the sample. The age distribution showed that most participants experienced their first occupational accident either in their 30s (n=180,490, 21.0%) or their 50s (n=216,928, 25.3%). The majority were paid workers (n=641,730, 74.7%), while self-employed workers accounted for 25.3% of the sample (n=217,063). Regarding income levels, 24.3% of the participants (n=205,634) were in the lowest 20%. Geographically, 38.5% of the participants resided in the Seoul metropolitan area (n=329,027), 30.4% lived in other metropolitan areas (n=260,174), and the remaining 31.1% lived in non-metropolitan cities (n=265,759).
Regarding the characteristics of occupational accidents, the majority were occupational injuries (n=799,521, 93.1%), as opposed to occupational diseases (n=59,272, 6.9%). COMWEL approved 98.4% of all reported occupational injuries or diseases within 3 months of the application date. Across the cohort, trauma (S00-T98, V01-V98) was the most common diagnostic category (92.1%).
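The reported shares of injuries and diseases follow directly from the counts above. The short check below uses only the figures given in the text; the helper name `pct` is introduced for illustration.

```python
injuries = 799_521
diseases = 59_272
total = injuries + diseases


def pct(part, whole):
    """Percentage of the whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(total)                 # 858793
print(pct(injuries, total))  # 93.1
print(pct(diseases, total))  # 6.9
```

The two counts sum to the full cohort of 858,793 participants.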
Regarding job characteristics, 35.4% of the participants were employed in the manufacturing industry. Elementary workers (n=329,855, 38.4%) constituted the largest occupational group in the cohort. The majority of participants had been employed for less than 5 years (n=730,508, 85.1%). Workers who had not yet been fully trained were particularly vulnerable to occupational accidents [15].
Regarding comorbidities and the severity of disability, 60.7% of the study population had no disabilities. Additionally, 68.6% had no other diagnosed diseases at the time of the occupational accidents (Table 3). There were 10,387 deaths, representing 1.2% of the 858,793 individuals in the cohort.
STRENGTHS AND LIMITATIONS
Occupational accidents tend to occur among relatively vulnerable workers. This cohort makes it possible to monitor long-term health outcomes and socioeconomic changes, such as variations in income, residence, and employment status, particularly among workers in lower-status employment who have experienced occupational accidents. Numerous studies have indicated that precarious employment, which includes contingent, indirectly employed, and subcontractor workers, carries a greater risk of exposure to industrial accidents or diseases than direct employment [2,5,16].
Despite the significance of precarious employment, constructing a long-term follow-up cohort study for workers in such positions presents considerable challenges. It is particularly difficult for researchers to track variables like employment status, income, and participation in long-term surveys. Consequently, this cohort, which integrates 2 social insurance systems, offers valuable insights into the health impacts and rates of occupational accidents through its ability to conduct long-term follow-up with large study populations.
Due to employment insecurity, atypical workers often face excessive workloads and understaffing compared to their regular counterparts. Additionally, many hazardous tasks that regular workers avoid are frequently outsourced to small companies [5,17]. The unclear nature of the employment relationship results in both parties shirking responsibilities related to safety control and management [17]. In addition, workers in the early stages of labor market entry are often unskilled [2]. For these reasons, numerous studies have highlighted a strong correlation between atypical labor and occupational accidents [2,5].
This study allows the examination of various topics due to the availability of information about the workplaces where workers were employed at the time of their occupational injuries. It is feasible to assess specific industries for occupational cancers, acute poisonings, and respiratory diseases, among others. Cohorts can be utilized to evaluate both short-term and long-term health effects over periods of 10 years or more, taking into account the time it takes for exposure to an occupational hazard to result in an occupational disease. Moreover, occupational injuries have a significant impact on workers’ lives. It is possible to examine whether these injuries lead to changes in employment status, employment levels, income, and residence, including job losses. For instance, researchers can investigate the risk factors that influence worker unemployment and changes in employment levels following an industrial accident. If the goal is to identify the risk of cancer incidence associated with high chemical exposure, researchers can link these data to occupational and industry-specific exposure data, such as the Korean Carcinogen Exposure (CAREX) dataset [18].
This cohort reflects comprehensive data on workers who have received compensation. The baseline data were collected from all compensated workers between 2004 and 2015. Moreover, this is the inaugural cohort study in Korea dedicated to monitoring the long-term health outcomes of workers who have suffered occupational injuries or diseases. By integrating this cohort with NHIS data, researchers can evaluate the incidence of specific diseases, such as cancer or psychiatric disorders, which are challenging to assess over short-term follow-up periods. The integration with NHIS data, consisting of administratively collected insurance claims, ensures that the data are not subject to recall bias, a significant limitation often encountered in survey-based studies.
However, the current dataset lacks a control group, which complicates the determination of causality. Depending on the specific hazard, it may be feasible to define a non-exposed group within the cohort for comparative purposes. Given the substantial presence of service industry workers in the cohort, they could potentially serve as a control group. This arrangement would enable comparisons across different industries, such as manufacturing or construction, particularly concerning physical and chemical hazards. However, because cohort members do not represent the entire workforce, such findings apply only to compensated workers and cannot be broadly generalized.
In addition, because of the systematic procedures of workers’ compensation, selection bias may have occurred, as minor occupational accidents or diseases might have been overlooked. Under the Industrial Accident Compensation Insurance Act, only work-related injuries or diseases requiring more than 3 days of medical care are eligible for compensation [11]. For this reason, as a previous study pointed out, less severe occupational injuries or diseases, such as contact dermatitis, intoxication manifesting merely as dizziness, and corneal damage from chemical spills, are under-reported in the compensation database and were therefore not captured in the cohort. This selection bias should be carefully considered when interpreting our results. In this study, control groups were not selected. Future research utilizing this cohort could address this limitation by selecting control groups from the NHID using various matching methods.
Despite these limitations, a cohort that integrates data from Korea’s workers’ compensation insurance system with data from the NHIS can be used to assess more closely the health effects and changes in the socioeconomic status of vulnerable labor groups.
DATA ACCESSIBILITY
The data used in this study are not freely available. Because of the Personal Information Protection Act and the NHIS database policy, only pre-authorized researchers can access cohort data. Proposals for possible collaboration for further data analysis should be sent to Professor Inah Kim (inahkim@hanyang.ac.kr).
Notes
Conflict of interest
The authors have no conflicts of interest to declare for this study.
Funding
This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2021R1A2C1008227).
Author contributions
Conceptualization: Min J, Kim I. Data curation: Min J, Kim EM, Kim I. Funding acquisition: Kim I. Methodology: Min J, Choi Y, Kim I. Visualization: Jang J, Kim J. Writing – original draft: Min J. Writing – review & editing: Kim EM, Kim J, Jang J, Choi Y, Kim I.
Acknowledgements
This study used NHIS-NSC data (NHIS-2022-1-129) from the National Health Insurance Service (NHIS).
