Data

We use nationally representative cross-sectional survey data from the 2001–2012 Behavioral Risk Factor Surveillance System (BRFSS), the largest telephone health survey in the United States (Mokdad, 2009). The BRFSS employs a multistage, stratified sampling approach to collect uniform, state-specific data on health, health behaviors, and risk factors for disease, injury, and infection in the adult population (CDC, 2022). The dataset includes annual samples from all 50 states and the District of Columbia, with annual sample sizes ranging from roughly 200,000 to 400,000 respondents. BRFSS questionnaires consist of three parts: 1) the core component, 2) optional modules, and 3) state-added questions. We use data from the core component to examine associations between racial/ethnic status, SES, and health. We are limited to the 2001–2012 BRFSS data, as the detailed race data necessary to construct measures of multiracial status are not publicly available for more recent years. We use supplemental data from the optional modules and state-added questions to explore the contribution of three non-socioeconomic pathways (early life social conditions, race-related experiences, and health behaviors) to racial/ethnic differences in health. We provide a detailed breakdown of missing data for each variable in Supplementary Table 1. While demographic characteristics, SES measures, and health behaviors belong to the core component of the questionnaires, the measures of early life social conditions and race-related experiences come from optional modules administered only in select years and by select states (see Supplementary Table 2 for more information).

Health outcomes

Our analysis examines two self-reported health measures from BRFSS. The first measures days of poor mental health. Respondents were asked “thinking about your mental health, which includes stress, depression, and problems with emotions, for how many days during the past 30 days was your mental health not good?”, with response categories ranging from 0 to 30 days. The second measures days of poor physical health, based on the question “thinking about your physical health, which includes physical illness and injury, for how many days during the past 30 days was your physical health not good?”, with responses again ranging from 0 to 30 days. These measures are widely used in population health research and avoid possible bias from relying on diagnosed conditions that could reflect differential healthcare access (Zahran et al., 2005; Pierannunzi et al., 2013). We report the distribution of these two dependent variables, as well as major demographic controls, in Table 1.

Race and ethnicity classification

We use detailed race and ethnicity questions in BRFSS to construct our primary independent variables. Respondents were first asked “Are you Hispanic or Latino?” and then “Which one or more of the following would you say is your race?” with options to select multiple categories: (1) White, (2) Black, (3) Asian, (4) Native Hawaiian or Other Pacific Islander, (5) American Indian or Alaska Native, and (6) Other Races. To create mutually exclusive categories, we first classify all respondents of Hispanic origin as Hispanic regardless of racial identification, which means we cannot identify multiracial patterns among Hispanic respondents. Among non-Hispanic respondents, we then classify participants into 10 categories: monoracial White, monoracial Black, monoracial Asian, monoracial Native Hawaiian/Pacific Islander, monoracial American Indian or Alaska Native, Other monoracial race, Black multiracial, American Indian or Alaska Native multiracial, Asian multiracial, and Other multiracial. The non-Hispanic specification is assumed for all racial categories in our analyses.

For multiracial respondents (those selecting multiple races), we create categories using a hierarchical approach that reflects historical patterns of racial classification. Following previous research on multiracial health disparities (Roth, 2005; James Davis, 2010; Bratter and Gorman, 2011), respondents identifying as Black and any other race(s) are classified as Black multiracial, reflecting the historical legacy of hypodescent rules in U.S. racial classification. Among remaining multiracial respondents, those identifying as American Indian or Alaska Native and any other race(s) are classified as American Indian or Alaska Native multiracial, though we acknowledge this may not fully capture Indigenous identity complexity (Quint et al., 2023). Next, remaining respondents selecting Asian and any additional race(s) are classified as Asian multiracial. All other multiracial combinations are classified as Other multiracial.
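The classification rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: the function name, the short race codes ("AIAN", "NHPI"), and the "Missing" label for respondents with no recorded race are all hypothetical.

```python
from typing import Iterable

# Hypothetical short codes for the six BRFSS race options; "AIAN" and "NHPI"
# are abbreviations chosen for this sketch, not BRFSS variable values.
MONORACIAL_LABELS = {
    "White": "White",
    "Black": "Black",
    "Asian": "Asian",
    "NHPI": "Native Hawaiian/Pacific Islander",
    "AIAN": "American Indian or Alaska Native",
    "Other": "Other race",
}

# Order matters: the first race found in this list decides the multiracial
# category (Black first, then AIAN, then Asian), mirroring the hierarchy in the text.
MULTIRACIAL_PRIORITY = [
    ("Black", "Black multiracial"),
    ("AIAN", "American Indian or Alaska Native multiracial"),
    ("Asian", "Asian multiracial"),
]


def classify_race_ethnicity(hispanic: bool, races: Iterable[str]) -> str:
    """Return one mutually exclusive racial/ethnic category for a respondent."""
    selected = set(races)
    # Hispanic origin takes precedence over any racial identification.
    if hispanic:
        return "Hispanic"
    # Assumption of this sketch: respondents with no race recorded are flagged.
    if not selected:
        return "Missing"
    # Exactly one race selected: monoracial category.
    if len(selected) == 1:
        return MONORACIAL_LABELS[next(iter(selected))]
    # Two or more races selected: apply the hierarchy in order.
    for race, label in MULTIRACIAL_PRIORITY:
        if race in selected:
            return label
    return "Other multiracial"
```

For example, a non-Hispanic respondent selecting White and Asian is assigned to Asian multiracial, while one selecting Black, White, and Asian is assigned to Black multiracial because Black is checked first.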

Four hypothetical pathways

Socioeconomic status

We include three interrelated measures of SES: income, education, and employment status, which are widely used in health disparities research (Williams, 2012). Income is reported in eight categories ranging from less than $10,000 to $75,000 or more. To facilitate interpretation, we convert these categories to a continuous measure using the midpoint value of each category (e.g., $12,500 for the $10,000–$14,999 category) and express income in units of $1,000 for regression analyses. Educational attainment is coded into three categories: less than high school (reference), high school graduate or GED equivalent, and any college education. Finally, employment status classifies respondents as employed (reference), not employed, or student.
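The midpoint conversion can be illustrated as follows. This is a sketch only: the text confirms the eight-category structure, the end points, and the $12,500 midpoint for the second category, whereas the bracket boundaries for categories 3 to 7, the integer category codes, and the value assigned to the open-ended top category are assumptions made for illustration.

```python
from typing import Optional

# Midpoints in US dollars, keyed by a hypothetical category code 1-8.
INCOME_MIDPOINTS_USD = {
    1: 5_000,   # less than $10,000
    2: 12_500,  # $10,000-$14,999 (midpoint stated in the text)
    3: 17_500,  # assumed bracket: $15,000-$19,999
    4: 22_500,  # assumed bracket: $20,000-$24,999
    5: 30_000,  # assumed bracket: $25,000-$34,999
    6: 42_500,  # assumed bracket: $35,000-$49,999
    7: 62_500,  # assumed bracket: $50,000-$74,999
    8: 75_000,  # $75,000 or more: open-ended, lower bound used as a placeholder
}


def income_in_thousands(category: Optional[int]) -> Optional[float]:
    """Map an income category code to its midpoint in units of $1,000.

    Codes outside 1-8 (e.g., refusals or "don't know") return None so they
    can be treated as missing rather than silently converted.
    """
    midpoint = INCOME_MIDPOINTS_USD.get(category)
    if midpoint is None:
        return None
    return midpoint / 1000
```

With this coding, a one-unit change in the income variable corresponds to a $1,000 difference in the assigned midpoint. The choice of value for the top category affects the scale of the income coefficient and would need to be checked against the original analysis.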

Early life social conditions

A subsample (2009–2012) of BRFSS respondents completed questions about early life experiences. These questions assess childhood exposure to family instability, abuse, and household challenges. We include a series of measures that serve as proxies for disadvantaged early life social conditions. Family structure is measured by parental marital status (parents married [reference], parents divorced, parents never married). Child abuse is assessed through questions about physical abuse from parents (never [reference], once, more than once) and verbal abuse from parents (never [reference], once, more than once). Respondents also reported witnessing violence between parents or adults in the household (never [reference], once, more than once). Additional measures capture childhood household experiences, including living with people who used illegal street drugs or misused prescription medications, had problems with alcohol use, experienced depression or mental illness, or had served time in a correctional facility (each coded as never lived with [reference] or lived with).

Race-related experiences

Another subsample (2004–2012) completed questions about race-related experiences and attitudes. These measures include: frequency of thinking about race (never [reference], yearly, monthly, weekly, daily, hourly, constantly), perceived treatment when seeking healthcare (same as other races [reference], worse than other races, better than other races, worse than some races but better than others, only encountered same race, did not seek healthcare in past 12 months), experiences of race-related emotional distress (no [reference], yes), and experiences of race-related physical distress (no [reference], yes).

Health behaviors

From the core survey component, we include measures of health-related behaviors: smoking status (never smoked [reference], former smoker, current occasional smoker, current regular smoker), alcohol consumption (heavy drinking defined as 2+ drinks daily for men, 1+ drinks daily for women), health insurance coverage (uninsured [reference], insured), and physical activity (any exercise in past month: no [reference], yes).

Control variables

All models include demographic characteristics that could confound the relationship between racial/ethnic status and health outcomes. We control for gender (male [reference], female) as health reporting patterns and healthcare utilization differ systematically by gender (Read and Gorman, 2010). Age (in years, range 18–80) is included to account for life course patterns in health status and because age distributions vary across racial groups. Survey year indicators and state fixed effects adjust for temporal trends in health outcomes and unobserved state-level factors that might affect both racial composition and health (e.g., healthcare policies, environmental conditions). In sensitivity analyses, including marital status produced substantively similar results.

Estimation and statistical procedures

We employ a two-part analytical strategy designed to both identify and explain racial/ethnic health disparities. First, we estimate the association between racial/ethnic status and health outcomes using negative binomial regression models. We choose negative binomial models over Poisson regression because our dependent variables (days of poor mental/physical health, range 0–30) show significant overdispersion, with variance exceeding the mean. For each health outcome, we follow a stepwise model-building strategy to examine how different pathways contribute to racial/ethnic health disparities. We begin with base models that adjust for demographic characteristics (age, gender) and include both state and year fixed effects to account for geographic and temporal variation. These base models establish the baseline racial/ethnic differences in health outcomes. We then sequentially add sets of variables related to socioeconomic status (income, education, employment), early life conditions (family structure, abuse exposure, residential environment), race-related experiences (discrimination, healthcare perceptions), and health behaviors (smoking, drinking, physical activity). This sequential approach allows us to observe how the coefficients associated with racial/ethnic status change as we account for each set of factors. The detailed results of these models are reported in Supplementary Tables 3–10.
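The overdispersion argument for preferring negative binomial over Poisson models can be illustrated with a simple descriptive check. The sketch below is unweighted and uses made-up values rather than BRFSS data; the function name and the method-of-moments estimate of the dispersion parameter are illustrative choices, not the procedure used in the analysis.

```python
from statistics import mean, variance


def dispersion_summary(days):
    """Summarize how far a count variable departs from the Poisson assumption.

    Under a Poisson model the variance equals the mean. Under the common
    negative binomial (NB2) parameterization, Var(Y) = mu + alpha * mu**2,
    so a rough method-of-moments estimate is alpha = (var - mean) / mean**2.
    """
    m = mean(days)
    if m == 0:
        raise ValueError("mean is zero; dispersion is undefined")
    v = variance(days)  # sample variance (n - 1 denominator)
    return {
        "mean": m,
        "variance": v,
        "variance_to_mean": v / m,       # equals 1 under Poisson
        "alpha_mom": (v - m) / m ** 2,   # equals 0 under Poisson
    }


# Hypothetical responses: most respondents report 0 days, a few report many.
poor_health_days = [0, 0, 0, 0, 0, 0, 2, 3, 5, 30]
summary = dispersion_summary(poor_health_days)
```

A variance-to-mean ratio well above 1, as in this toy sample, is the pattern that motivates a negative binomial specification. A formal assessment would use the survey weights and a likelihood-based test of the dispersion parameter.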

For our comprehensive mediation analyses (presented in Figs. 2 and 3 and Supplementary Tables 11 and 12), we estimate models that simultaneously include all relevant pathway variables to quantify their mediating effects while accounting for potential interdependence between pathways. Throughout all analyses, we incorporate BRFSS survey weights to ensure nationally representative estimates.

In these models, the coefficient estimates represent the difference in the expected log count of poor health days between each racial/ethnic group and the reference category. For ease of interpretation, we also present the results as incidence rate ratios (IRRs), obtained by exponentiating the coefficients. An IRR gives the ratio of expected days of poor health for a given racial/ethnic category relative to the reference group, so values above 1 indicate more expected days and values below 1 indicate fewer.
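The relationship between a log-count coefficient and its incidence rate ratio can be shown with a short calculation. The coefficient value used here is hypothetical and chosen only for illustration.

```python
import math


def incidence_rate_ratio(beta: float) -> float:
    """Convert a negative binomial (log-link) coefficient to an IRR."""
    return math.exp(beta)


def percent_difference(beta: float) -> float:
    """Percent difference in expected days relative to the reference group."""
    return (math.exp(beta) - 1) * 100


# Hypothetical coefficient for one racial/ethnic category.
example_beta = 0.25
example_irr = incidence_rate_ratio(example_beta)
example_pct = percent_difference(example_beta)
```

A coefficient of 0.25 corresponds to an IRR of about 1.28, that is, roughly 28% more expected days of poor health than the reference group, holding the other covariates constant. A coefficient of 0 corresponds to an IRR of exactly 1 (no difference).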

Second, we conduct formal mediation analysis using the Karlson-Holm-Breen (KHB) routine to further quantify how different pathways contribute to racial/ethnic differences in health. The KHB method decomposes the total effect of racial/ethnic status on health outcomes into direct effects and indirect effects operating through each set of mediators (Kohler et al., 2011; Karlson et al., 2012). This approach is particularly appropriate for our analysis because it (1) allows comparison of coefficients across nested nonlinear models, (2) handles multiple mediators simultaneously, and (3) provides formal tests of mediation effects.
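The logic of the decomposition can be written out as follows. This is a schematic summary of the general KHB approach applied to a log-link count model, not a statement of the exact specification estimated here; the symbols X (racial/ethnic indicators), Z (a set of mediators), C (controls), and the subscripts F and R (full and reduced models) are notation introduced for this sketch.

```latex
% Full model: racial/ethnic indicators X, mediators Z, controls C
\log E[Y \mid X, Z, C] = \alpha_F + \beta_F X + \gamma_F Z + \delta_F C

% Reduced model: Z is replaced by its residual \tilde{Z} from a linear
% regression of Z on X (plus any controls), so both models share the
% same scale and their coefficients are comparable
\log E[Y \mid X, \tilde{Z}, C] = \alpha_R + \beta_R X + \gamma_R \tilde{Z} + \delta_R C

% Decomposition
\text{total effect} = \beta_R, \qquad
\text{direct effect} = \beta_F, \qquad
\text{indirect effect} = \beta_R - \beta_F

% Share of the total effect operating through Z
\text{percent mediated} = 100 \times \frac{\beta_R - \beta_F}{\beta_R}
```

Because the residualized mediators are uncorrelated with X by construction, the difference between the two coefficients on X reflects mediation rather than the rescaling that arises when covariates are added to a nonlinear model.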
