Dahyun Choi

I study core questions at the intersection of politics, markets, and governance: how government institutions and organized interests produce and evaluate information, and how they strategically shape the scientific and technical foundations of public policy. Using machine learning, causal inference, and formal theory, I examine settings where regulatory agencies evaluate expertise, interest groups compete over policy implementation and knowledge production, and legislatures direct public investment in innovation.

I also develop machine learning methods for measuring political and organizational behavior.

Recent News

May 2026 Scheduled to give a talk at Northwestern Kellogg (Political Economy Rookiefest)
May 2026 Scheduled to give a talk at Cornell (Congress & History Conference)

Peer-Reviewed Publications

Fine-tuned Large Language Models Can Replicate Expert Coding Better than Trained Coders: A Study on Informative Signals Sent by Interest Groups
Understanding the political process in the United States requires examining how information is provided to politicians and the general public. While existing studies point to interest groups as strategic information providers, studying this aspect empirically has been challenging due to the need for expert-level annotation in measurement. We make two contributions. First, we demonstrate that fine-tuned large language models (LLMs) can replicate expert-level annotation in a specialized area with greater accuracy than lightly trained workers, crowd-workers, and zero-shot LLMs. Second, we quantify two types of interest group signals that are difficult to separate empirically by other means: 1) informative signals that help agents improve political decisions, and 2) associative signals that influence preference formation but lack direct relevance to the substantive topic of interest. We demonstrate the utility of this approach in two applications where our classifier generalizes out of distribution. Methodologically, this study demonstrates the applicability of large language models to complex expert-driven measurement tasks; substantively, it shows that interest groups strategically tailor the composition of their signals under different institutional settings.

with Brandon Stewart and Denis Peskoff

Forthcoming in Political Science Research and Methods

Why Interest Groups With Divergent Goals Collaborate: Evidence From Climate Regulation
Why do interest groups with contrasting interests and policy goals work together? I present a theory of collaborative policy production and show that interest groups can achieve higher policy gains through collaboration, even when their ideal policy goals diverge significantly. To test the theoretical results, I introduce original measurement strategies that reveal systematic patterns in which firms and environmental groups invest in joint efforts to improve fine-grained details of policy to achieve greenhouse gas emissions targets. The analysis, using public comments spanning 2010–2020, demonstrates that comments written jointly by environmental groups and firms contain more information that can contribute to the quality of policy implementation than individual efforts alone, despite compromises on policy preferences. These findings highlight the hidden dynamics of regulatory politics, wherein divergent political goals are reconciled in the service of high-quality policy implementation.

2026. Economics & Politics, 38: 46–61.

Working Papers

Partisan Bias and the Resilience of High-Impact Science
How do partisan bias and scholarly impact within academic communities jointly shape the use of evidence in regulatory policymaking? I investigate this question using a novel dataset of 16,783 peer-reviewed studies evaluated by the Environmental Protection Agency for the Integrated Science Assessments (ISA), which inform the National Ambient Air Quality Standards. I find that Democratic administrations are 15.4% more likely to cite pro-regulatory studies, while Republican administrations are 17.5% less likely to do so. These partisan effects are comparable in magnitude to the effect of a two-standard-deviation change in study impact measures. Yet partisan bias is moderated by evidentiary impact: high-impact studies are cited consistently across administrations, while lower-impact studies are less penalized when aligned with an administration's policy agenda. Evidence on participant selection in the ISA process suggests a plausible mechanism underlying this pattern. Together, the findings suggest that while science retains epistemic authority, its application is shaped by political context within the administrative state.

Revise & Resubmit at American Journal of Political Science

How Much Data Is Enough? A Design-aware Approach to Empirical Sample Complexity
How much data is needed to ensure that a model performs reliably on new, unseen data? Despite their central importance to empirical research design, sample size decisions are often made heuristically, guided more by resource constraints than by principled diagnostics. Existing tools like power analysis and cross-validation offer limited insight into how predictive performance scales with sample size. We introduce a design-aware, empirical framework for estimating sample complexity bounds tailored to applied settings. By fitting smooth extrapolation functions to model performance on resampled pilot data, our method estimates the sample size needed to achieve researcher-specified generalization guarantees. Through applications to supervised learning tasks involving extensive human-annotated data, we show that generalization often stabilizes with as little as 10% of typical labeling costs. This approach provides a statistically grounded, interpretable diagnostic for generalization performance and a practical tool for political scientists designing data-intensive studies under resource constraints or design uncertainty.
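The core idea of fitting a smooth extrapolation function to pilot-data performance can be sketched as follows. This is an illustrative Python toy, not the paper's or the scR package's implementation: it assumes a power-law learning curve (err(n) = a·n^(−b) + c) and entirely hypothetical pilot error rates, then inverts the fitted curve to estimate the sample size needed for a target error.

```python
# Illustrative sketch only (assumed power-law form and made-up pilot data),
# not the scR implementation.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    """Assumed learning-curve form: error decays as a * n^(-b) toward floor c."""
    return a * n ** (-b) + c

# Hypothetical pilot results: held-out error at increasing subsample sizes.
sizes = np.array([100, 200, 400, 800, 1600], dtype=float)
errors = np.array([0.30, 0.24, 0.19, 0.16, 0.14])

# Fit the smooth extrapolation function to the pilot learning curve.
params, _ = curve_fit(power_law, sizes, errors, p0=[1.0, 0.5, 0.1], maxfev=10000)
a, b, c = params

def required_n(target_error):
    """Smallest n whose extrapolated error falls at or below the target."""
    if target_error <= c:
        raise ValueError("Target is below the fitted asymptote; unreachable.")
    return int(np.ceil((a / (target_error - c)) ** (1.0 / b)))

n_star = required_n(0.13)  # estimated labels needed for 13% held-out error
```

The payoff of this style of diagnostic is that the answer comes from the researcher's own task and model, rather than from a generic rule of thumb: if the curve flattens early, additional labeling buys little.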

with Perry Carter

Revise & Resubmit at American Journal of Political Science

When Algorithms Govern
Innovation by Design: How Legislative Institutions Shape the Direction of Federal R&D
Politics of Academic Experts: Evidence from Antitrust Regulations

with Nolan McCarty

Interest Group Ecologies and Ideological Niches

with Charles Cameron

Sample Complexity for Open-Ended Responses

with Perry Carter and Narrelle Gilchrist


Software

scR
scR is an R package developed by Carter & Choi (2025) that helps researchers determine how much data is needed for reliable generalization. It provides a design-aware empirical framework that estimates sample complexity by smoothly extrapolating model performance from resampled pilot data. Additionally, scR offers theoretical guidance by calculating the Vapnik–Chervonenkis dimension (VCD). This interpretable diagnostic is particularly helpful for empirical researchers designing data-intensive studies under resource constraints or uncertainty. For more details, see Carter & Choi (2025), "How Much Data Is Enough? A Design-aware Approach to Empirical Sample Complexity" (doi:10.31219/osf.io/evrcj_v2).

Available on CRAN · with Perry Carter