Effect change of obesity on diabetes depending on measurement: self-reported body mass index from 2012 Community Health Survey vs. directly measured from the Korea National Health and Nutrition Examination Survey

OBJECTIVES: Obesity is a well-recognized risk factor for type 2 diabetes mellitus (DM) among young and middle-aged adults in South Korea. Studies of the association between obesity and DM generally use either subjective data from self-reported surveys or objective data from health examinations. This study was conducted to examine how the association changes depending on which of these measurements is used. METHODS: Community Health Survey data and Korea National Health and Nutrition Examination Survey data were used as subjective and objective data, respectively. Residents of Seoul aged 45 years and over were selected for the study, and the association between obesity and DM was estimated using a multivariate logistic regression model. RESULTS: In the subjective data, DM prevalence was 12.4% (male, 14.7%; female, 10.6%) and obesity prevalence was 26.0% (male, 29.2%; female, 23.4%). In the objective data, DM prevalence was 15.0% (male, 17.8%; female, 12.9%) and obesity prevalence was 32.4% (male, 34.4%; female, 30.8%). Comparing the effect of obesity on DM prevalence in each data set, the impact of obesity was greater when objective data were used. The difference in the relative risk of obesity between subjective and objective data was greater in females than in males and was statistically significant. CONCLUSIONS: The pattern of association differed between subjective and objective data, owing to the higher obesity prevalence in the objective data and to discrepancies in socioeconomic status. These discrepancies may be inevitable. Therefore, we have to face them proactively and understand how variables behave differently under different measurements.


INTRODUCTION
While studies of obesity diagnosed from body mass index (BMI) have been conducted, self-reported anthropometric information regarding the diagnosis of obesity is subject to systematic errors [1][2][3], and particularly to discrepancies regarding the characteristics of participants. Nevertheless, self-reported information obtained through questionnaire or interview has the advantages of low cost and higher availability; moreover, these surveys are easy to administer and are a good method for studying large numbers of individuals [4].
It is important to assess the prevalence of obesity in various socio-demographic groups. Health policies, programs, and interventions can be more effective if they are targeted at high-risk populations and adopt a practical approach. In South Korea, the leading representative national health survey is the Korea National Health and Nutrition Examination Survey (KNHANES). It consists of an interview for demographic information, including socioeconomic status and health behavior, a health examination including a laboratory test, and a dietary assessment. In contrast, at the regional municipality level, Community Health Surveys (CHS) comprise interviews conducted with computer-assisted personal interviewing (CAPI), and thus all information from CHS is self-reported. Consequently, the prevalence of obesity from each survey might be different.
It is therefore necessary to determine whether different measurements, such as self-reporting and direct measurement, would give rise to different information or values and whether a difference in association among relevant factors would consequently occur. Answering these questions would address common misgivings about the quality of self-reported anthropometric information and might lead to suggestions for how best the results from studies using self-reported data can be interpreted and used.
Obesity is a well-recognized risk factor for type 2 diabetes mellitus (DM) among young and middle-aged adults both worldwide [5][6][7][8] and in South Korea [9]. DM is one of the diseases most directly and strongly associated with obesity. This association may therefore differ depending on the tool used to measure obesity. This study was conducted to examine the difference in the association between obesity and DM prevalence arising from different measurements: objective data, in which obesity and DM were diagnosed during an actual health examination, and subjective data from respondents' self-reports.

Data and study population
Objective data were obtained from the KNHANES conducted in 2012, and subjective data were obtained from the Korean CHS. The CHS was a cross-sectional interview survey conducted by the Korea Centers for Disease Control and Prevention (KCDC) and each municipality. The questionnaire consisted primarily of questions about personal health behaviors regarding leading causes of health problems in Korean communities as well as health status and disease condition [10].
The target age of this study was 45 years and over, since chronic diseases such as hypertension and DM are relatively rare in younger generations. Only people living in Seoul, South Korea were selected in order to minimize regional effects while maximizing the amount of data. Records were excluded if they had missing values in education level, marital status, self-rated health status, employment status, income, BMI, blood pressure, smoking, or DM morbidity. With these criteria, 587 objective records and 10,833 subjective records were obtained from the 8,058 KNHANES and 228,921 CHS records, respectively.
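The inclusion criteria above amount to an age filter, a region filter, and a complete-case filter. A minimal sketch follows, assuming each respondent is a dictionary; the field names (`age`, `region`, `education`, and so on) are hypothetical and do not correspond to the actual KNHANES or CHS variable names.

```python
# Hypothetical field names; the real survey variable names differ.
REQUIRED_FIELDS = (
    "education", "marital_status", "self_rated_health", "employment",
    "income", "bmi", "blood_pressure", "smoking", "dm",
)


def is_eligible(record):
    """Return True if a respondent meets the inclusion criteria in the text."""
    age = record.get("age")
    if age is None or age < 45:          # target age: 45 years and over
        return False
    if record.get("region") != "Seoul":  # Seoul residents only
        return False
    # Complete-case selection: drop records with any missing required value.
    return all(record.get(field) is not None for field in REQUIRED_FIELDS)


def select_sample(records):
    """Keep only the eligible respondents."""
    return [r for r in records if is_eligible(r)]
```

Applying `select_sample` separately to each survey's records would yield the two analysis samples; the counts reported above come from the authors' own processing, not from this sketch.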

Definition of variables
As explanatory variables, we selected education level, living status, self-rated health, employment status, equivalized income, smoking, obesity, and hypertension.
Education level was categorized by the highest school level completed. Living status was defined by whether subjects lived with their spouse at the time, regardless of marital status. Self-rated health status was measured on a five-point scale: very good, good, moderate, poor, and very poor. We re-categorized this scale into a three-point scale in order to avoid small sample sizes in each category. The employment variable was determined by working status, with the employed group including the self-employed, salaried employees, and unpaid family workers. Equivalized household income was calculated as the total household income divided by family size. Smoking status was divided into three groups: current smokers, past smokers, and non-smokers. Non-smokers were people who had never smoked or had smoked fewer than five packs in their lifetime.
All of the above demographic and socioeconomic variables were obtained by interviewing; however, for obesity, DM, and hypertension, the definition was different between subjective and objective data. In the subjective data, reported height and weight were used for obesity, and diagnosis experience was used to measure DM and hypertension prevalence. On the other hand, in the objective data, measured height/weight and blood pressure were used for BMI calculation and hypertension, respectively, and the prevalence of DM was measured by laboratory test. Obesity was defined as a BMI ≥ 25.0 kg/m² according to the standard of the World Health Organization Western Pacific Regional Office [11]. DM was classified as a fasting (8 hours or more) blood glucose level of ≥ 126 mg/dL or as taking medication for DM. Hypertension was defined as a systolic blood pressure ≥ 140 mmHg or a diastolic pressure ≥ 90 mmHg or as taking medication for hypertension.
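The cut-offs used for the objective data can be written as three small classification rules. The sketch below is illustrative only; the function and parameter names are assumptions, and the glucose value is assumed to come from a sample taken after at least 8 hours of fasting.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2 from weight (kg) and height (cm)."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def is_obese(weight_kg, height_cm):
    """Obesity: BMI >= 25.0 kg/m^2 (WHO Western Pacific standard)."""
    return bmi(weight_kg, height_cm) >= 25.0


def has_dm(fasting_glucose_mg_dl, on_dm_medication):
    """DM: fasting blood glucose >= 126 mg/dL, or taking DM medication."""
    return fasting_glucose_mg_dl >= 126 or on_dm_medication


def has_hypertension(systolic_mmhg, diastolic_mmhg, on_bp_medication):
    """Hypertension: SBP >= 140 mmHg, DBP >= 90 mmHg, or on medication."""
    return systolic_mmhg >= 140 or diastolic_mmhg >= 90 or on_bp_medication
```

For the subjective data, `is_obese` would be applied to reported rather than measured height and weight, while DM and hypertension would come directly from the respondent's reported diagnosis.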

Statistical analysis
Frequency analysis and logistic regression models were mainly used in this study.
Frequency analysis was used in order to understand the demographic characteristics of the data and their association with DM prevalence. In univariate analysis, the associations were examined with the chi-square test; multivariate analysis was then conducted with logistic regression. In both frequency and logistic analyses, we separated the male and female populations because females' economic activity rates are generally lower than males' and because risky health behaviors such as smoking and drinking are underreported among females. To avoid multicollinearity in the regression model, the marital status and employment status variables were excluded due to strong correlations with age and income, respectively. The smoking variable was also excluded in the female model since the results of the frequency analysis indicated a severely unbalanced proportion. SAS version 9.3 (SAS Institute Inc., Cary, NC, USA) was used for all statistical analysis in this study.

Table 1 shows the demographic and socioeconomic characteristics from the 2012 Seoul CHS data and their association with self-reported DM prevalence. Of the participants, 12.4% had experienced a DM diagnosis, and the reported obesity prevalence was 26.0%. Both DM and obesity prevalence were higher in males than in females. The association between self-reported DM prevalence and levels of each variable was statistically significant except for living status among males and smoking status among females. Table 2 shows the demographic and socioeconomic characteristics of the 2012 Seoul KNHANES data and their association with measured DM prevalence. Among the participants in the KNHANES, 15.0% had DM and 32.4% were obese. Both DM prevalence and obesity prevalence were higher in males than in females.
The variables which had a significant association with measured DM prevalence were age, education, income, smoking, obesity, and hypertension in the whole population; age, employment, and smoking in the male population; and age, education, living status, income, obesity, and hypertension in females.
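The univariate chi-square test named in the statistical analysis can be illustrated for a single 2×2 table, for example obesity (yes/no) against DM (yes/no). The sketch below computes the Pearson statistic without continuity correction using only the standard library; it is not the authors' SAS procedure, and the counts in the usage lines are invented for illustration.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows = obese / not obese, columns = DM / no DM.
statistic, p_value = chi_square_2x2(30, 70, 20, 80)
print(f"chi-square = {statistic:.3f}, p = {p_value:.3f}")
```

The multivariate step requires a fitted logistic regression model, which needs a statistics package and is not reproduced here.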

RESULTS
In the multivariate logistic regression model for DM prevalence (Table 3), males showed higher odds than females (odds ratio [OR], 1.46; 95% confidence interval [CI], 1.21 to 1.76) only in the subjective data. All subjects aged over 55 years showed higher odds than the reference group (subjects aged 45 to 54 years) in the subjective data, and a similar association was noted in the objective data. Although the association in the 55 to 64 years age group was not statistically significant, it was marginally significant (OR, 2.05; 95% CI, 0.98 to 4.28). Education level was not statistically significant, with only the middle school group showing higher odds than university graduates in the objective data (OR, 2.32; 95% CI, 1.00 to 5.40). Groups with lower self-rated health status showed significantly higher odds than the higher self-rated health group, and this tendency was found only in the subjective data, whereas the association between income and DM was shown in the objective data. A significant association of obesity and smoking behavior with DM was noted in both sets of data, but the association of hypertension was shown only in the subjective data.
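The significance statements above can be read directly from the reported intervals: an OR is significant at the 5% level when its 95% CI excludes 1. The sketch below also recovers an approximate two-sided p-value from a reported OR and CI, assuming the interval is a Wald interval that is symmetric on the log scale; that assumption, and the function names, are not taken from the paper.

```python
import math


def ci_excludes_null(lower, upper, null_value=1.0):
    """True if the confidence interval does not contain the null value."""
    return lower > null_value or upper < null_value


def approx_p_from_or_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate two-sided p-value from an OR and its 95% CI,
    assuming a Wald interval symmetric on the log-odds scale."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values reported in the text for sex (subjective data) and for the
# 55 to 64 years age group.
print(ci_excludes_null(1.21, 1.76), approx_p_from_or_ci(1.46, 1.21, 1.76))
print(ci_excludes_null(0.98, 4.28), approx_p_from_or_ci(2.05, 0.98, 4.28))
```

Under this approximation the first interval gives a p-value far below 0.05, while the second gives a value slightly above 0.05, which is consistent with the description of that estimate as marginally significant. Because the published bounds are rounded, the recovered p-values are approximate.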
In the multivariate logistic regression model of the male population (Table 4), only age difference and smoking behavior showed significant associations with DM in both sets of data. No significant association was found with education level. On the other hand, the associations of self-rated health, obesity, and hypertension were shown only in the subjective data. In the female population (Table 4), the associations of income level, obesity, and hypertension were significant in both sets of data, although for income level only the second quartile group showed the association in the subjective data. In contrast, age, education level, and self-rated health showed a significant association only in the subjective data.
For both males and females, self-rated health was significant only in the subjective data. When we contrasted results from subjective and objective data, for males, the ORs of age and smoking in the objective data were considerably higher than in the subjective data. In females, the ORs of income, obesity, and hypertension in the objective data were statistically significant and were higher than in the subjective data.

Table notes: Values are presented as odds ratio (95% confidence interval). 1, equivalized household income weighted by family number; 2, based on body mass index from reported height and weight (BMI ≥ 25); *p < 0.05.

DISCUSSION
According to subjective data obtained from residents aged 45 years and older living in Seoul, DM prevalence was 12.4% (male, 14.7%; female, 10.6%). The proportion of the population who were obese (BMI ≥ 25 kg/m²) was 26.0% (male, 29.2%; female, 23.4%). In contrast, according to objective data from the KNHANES from residents aged 45 years and older living in Seoul, DM prevalence was 15.0% (male, 17.8%; female, 12.9%), and the prevalence of obesity (BMI ≥ 25 kg/m²) was 32.4% (male, 34.4%; female, 30.8%). The discrepancy in DM prevalence between the two sets of data was just 2.6 percentage points, but that in obesity prevalence was 6.4 percentage points, more than twice that of DM.
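The comparison in this paragraph is simple arithmetic on the four overall prevalence figures. As a small worked check using only the values quoted above (the variable names are illustrative):

```python
# Overall prevalence (%) reported in the text for each data source.
subjective = {"dm": 12.4, "obesity": 26.0}   # CHS, self-reported
objective = {"dm": 15.0, "obesity": 32.4}    # KNHANES, measured


def gap_in_points(measure):
    """Objective minus subjective prevalence, in percentage points."""
    return round(objective[measure] - subjective[measure], 1)


dm_gap = gap_in_points("dm")
obesity_gap = gap_in_points("obesity")
ratio = obesity_gap / dm_gap

print(dm_gap, obesity_gap, round(ratio, 2))
```

The obesity gap is about 2.5 times the DM gap, which is what "more than twice" refers to.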
Based on the degree of the relationship of obesity with DM prevalence observed in each set of data, we can infer that using objective data increased the observed impact of obesity. The difference in the relative risk of obesity between subjective and objective data was greater in females than in males, and all of the relative risks were statistically significant. This pattern is most likely due to the greater prevalence of obesity in the objective data. It should also be noted that other socioeconomic information was collected from the questionnaires in both sets of data. Therefore, the socioeconomic factors that affect reporting error and those that affect DM prevalence are entangled through multidirectional interactions.
In this study, which compares subjective data with objective data, measurement discrepancies were shown in both obesity, the covariate, and DM, the outcome variable. These discrepancies should result in variation in the association between the survey methods. However, it is very difficult to determine which method is more accurate, since every survey procedure has problems leading to systematic errors.
As for the evaluation of obesity using BMI calculated from height and weight, self-reported and measured values for height and weight are strongly correlated; thus, many studies assume that self-reported data are valid [12][13][14]. However, researchers have expressed concerns regarding discrepancies and systematic errors in self-reported values, as these are unreliable in population subgroups with a high prevalence of obesity (e.g., overweight females and middle-aged and elderly individuals) [1,2,15-18]. Being overweight is also a predictor of errors in reporting height measurements [3,19]. In addition to intentional reporting errors, many people, particularly females and older adults, are unaware of anthropometric changes such as atrophy after middle age [20]. In fact, systematic reporting errors can result from both intended and unintended misreporting, affecting data reliability and, in turn, introducing bias.
Furthermore, measurements of height and weight have normal circadian variation [21,22]. The accuracy of measurement instruments and the competence of personnel remain critical despite regular quality control [23]. Selection bias is a well-known problem in population studies. Participants in surveys are likely to be different from those who decline to participate. However, it is not understood how self-selection affects the information. Individuals who are more obese and at greater risk of health problems tend to refuse participation [24]. The time and effort required for measurement and examination may also influence selection bias. In this way, epidemiological surveys dealing with obesity and other features related to the social desirability of health outcomes may be missing an important portion of the normal distribution of the population.
Accordingly, differences in measuring tools must be highlighted before the results of such surveys are compared and interpreted for any policy decision. This point is very important when decision makers assign priorities among public health policies, practices, or interventions.
Because self-perceptions and social implications of obesity are in the background of self-reports, understanding the potential influences of personal characteristics such as sex, age, ethnicity, and sociocultural and socioeconomic status is important for designing studies and for interpreting data obtained from self-reported information [25,26]. To determine which factors influence certain health outcomes, researchers collect measurements of related, interactive, and potential causes. Some factors present in the survey process need to be considered when researchers use data from a survey and interpret results from each set of data. These include sampling frames, data collection modes (CAPI, computer-assisted self-interviewing, paper-assisted personal interviewing, etc.), and the non-response rate [27].
The goal of the CHS is to produce health indicators that are comparable among municipalities for monitoring community health status. The community-level health indicators produced by monitoring CHS results in each municipality are the principal evidence used for community health plans, intervention planning, outcome assessments, and effect evaluations. This goal seems to have been achieved successfully, based on a review of annual community health figures in each municipality since 2008 [28]. However, many researchers hesitate to use CHS data because of its data collection method, that is, interviewing with a questionnaire. The KNHANES is the leading nationally representative health survey that provides directly measured information on community-dwelling people through laboratory or physical exams. Generally, directly measured information seems more accurate than reported information because of reduced reporting error, and thus many Korean researchers use data from the KNHANES rather than from the CHS. Many previous studies, however, have also provided insights into significant variability in biomarkers from laboratory and physical health exams [22,29-31]. Moreover, the self-reporting method is still used in population-based studies for its low cost and relative ease in practice [4,32-34].
It is worth asking whether the lack of information from particular populations due to differences in survey design and execution is reversible. We should take into account that biased data can be adjusted to a comparably accurate level through sex and age standardization. We now need a debate on how to understand and reduce measurement bias arising from regional differences in demographic structures. We should do all we can to diminish bias in the design and administration of surveys and be consistent in reporting the level of accuracy of estimation and the potential for making predictions from each survey. There is no perfect survey, since measurement error is an inevitable part of surveys. We have to face the discrepancy among different measuring tools proactively and develop a comprehensive understanding of the different factors involved in the variables from different measurements. Therefore, we call for widespread use of the information thus obtained and strong consideration of the effect of measurement method on results.
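The text does not specify how sex and age standardization would be carried out. One common approach is direct standardization, sketched below as an assumption: stratum-specific prevalences from a survey are re-weighted by a shared standard population so that two surveys with different demographic structures become comparable. The strata, prevalences, and weights in the usage lines are invented for illustration.

```python
def directly_standardized_rate(stratum_rates, standard_weights):
    """Weighted average of stratum-specific rates, using the population
    counts (or shares) of a common standard population as weights."""
    if set(stratum_rates) != set(standard_weights):
        raise ValueError("rates and weights must cover the same strata")
    total_weight = sum(standard_weights.values())
    weighted_sum = sum(
        stratum_rates[stratum] * standard_weights[stratum]
        for stratum in standard_weights
    )
    return weighted_sum / total_weight


# Hypothetical sex-by-age strata and values, for illustration only.
rates = {
    ("male", "45-54"): 0.10, ("male", "55+"): 0.20,
    ("female", "45-54"): 0.05, ("female", "55+"): 0.15,
}
standard = {
    ("male", "45-54"): 30, ("male", "55+"): 20,
    ("female", "45-54"): 30, ("female", "55+"): 20,
}
print(directly_standardized_rate(rates, standard))
```

Standardization of this kind addresses differences in demographic structure only; it does not remove reporting error within a stratum.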