{"id":493131,"date":"2026-02-27T06:30:08","date_gmt":"2026-02-27T06:30:08","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/493131\/"},"modified":"2026-02-27T06:30:08","modified_gmt":"2026-02-27T06:30:08","slug":"large-language-model-reveals-an-increase-in-climate-contrarian-speech-in-the-united-states-congress","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/493131\/","title":{"rendered":"Large language model reveals an increase in climate contrarian speech in the United States Congress"},"content":{"rendered":"<p>Data<\/p>\n<p>This section outlines our corpus on congressional floor speeches, the annotation procedure used to create the test set for assessing model performance from these data, and the covariates used in the statistical analysis. Section\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S.3<\/a> of the\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">Supplementary Information<\/a> provides detailed information on the data utilized in this study, including the source, variable descriptions, and descriptive statistics for all variables.<\/p>\n<p>Congressional speeches<\/p>\n<p>We utilize a publicly available Congressional Record scraper and parser<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 33\" title=\"Judd, N., Drinkard, D., Carbaugh, J. &amp; Young, L. Congressional-record: a parser for the congressional record. 
&#010;                  https:\/\/github.com\/unitedstates\/congressional-record&#010;                  &#010;                 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR33\" id=\"ref-link-section-d261351822e2205\" rel=\"nofollow noopener\" target=\"_blank\">33<\/a> to extract structured data from HTML files that contain the text of the Congressional Record<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 34\" title=\"U.S. Government Publishing Office. Congressional record. &#010;                  https:\/\/www.govinfo.gov\/app\/collection\/crec&#010;                  &#010;                 (Accessed 5 February 2025).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR34\" id=\"ref-link-section-d261351822e2209\" rel=\"nofollow noopener\" target=\"_blank\">34<\/a>. This scraped data includes transcripts of all speeches delivered on the House and Senate floor by Congress Members from 1994\u2014i.e., when the Congressional Record was first digitized\u2014to 2024, as well as each speech\u2019s date and the speaker\u2019s bioguide ID. Each speech was divided into paragraphs (or utterances), resulting in a total of 2,515,806 paragraphs over the sample period. Next, to identify relevant paragraphs for our subsequent analysis, we utilize the ClimateBERT model<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 35\" title=\"Webersinke, N., Kraus, M., Bingler, J. A. &amp; Leippold, M. Climatebert: a pretrained language model for climate-related text. 
arXiv preprint &#010;                  https:\/\/arxiv.org\/abs\/2110.12010&#010;                  &#010;                 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR35\" id=\"ref-link-section-d261351822e2213\" rel=\"nofollow noopener\" target=\"_blank\">35<\/a> to classify climate and non-climate change paragraphs (see Section\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S.4<\/a> in the\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">Supplementary Information<\/a> for details). The process resulted in 110,837 relevant paragraphs over the sample period.<\/p>\n<p>Test set annotation<\/p>\n<p>To assess our model performance against a human gold standard, we manually annotated a random sample of 2151 paragraphs from the floor speech dataset using the revised CARDS taxonomy. Given that Democrats are far more likely to discuss climate change in floor speeches, we stratify by party and slightly over-sample Republican speeches. We trained a total of 12 annotators, half of whom contributed substantially to the labeling process. All annotators are or have been research assistants in the Climate and Development Lab at Brown University and have taken a semester-long class on climate obstruction. Additionally, all annotators reviewed the previous CARDS taxonomy and associated video, as well as participated in a lecture and subsequent discussion on the taxonomy updates before starting the labeling process. Each annotator is given one random instance from the climate-relevant paragraphs to label at a time, and each instance is labeled by at least 3 coders. 
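<\/p>
<p>With at least three coders per instance, labels can be consolidated by majority vote and reliability summarized by pooled pairwise agreement. The following is a minimal sketch with hypothetical labels and a simplified tie rule, not the exact study procedure:<\/p>

```python
from collections import Counter
from itertools import combinations

def majority_label(labels):
    # Most common label among coders; None signals a tie that needs expert review.
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie -> route to expert adjudication
    return ranked[0][0]

def pairwise_agreement(instances):
    # Share of coder pairs assigning identical labels, pooled over instances.
    agree = total = 0
    for labels in instances:
        for a, b in combinations(labels, 2):
            agree += int(a == b)
            total += 1
    return agree / total

coded = [['5.2.1', '5.2.1', 'no_claim'],  # hypothetical CARDS codes, not study data
         ['7.1.0', '7.1.0', '7.1.0']]
print(majority_label(coded[0]))             # 5.2.1
print(round(pairwise_agreement(coded), 3))  # 4 of 6 pairs agree -> 0.667
```

<p>Instances without a majority are exactly the kind of disagreement that the expert adjudication step described below resolves.<\/p>
<p>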
The labeling occurred between October 2023 and February 2024.<\/p>\n<p>As previously noted in Coan et al.<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 21\" title=\"Coan, T. G., Boussalis, C., Cook, J. &amp; Nanko, M. O. Computer-assisted classification of contrarian claims about climate change. Sci. Rep. 11, 22320 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR21\" id=\"ref-link-section-d261351822e2234\" rel=\"nofollow noopener\" target=\"_blank\">21<\/a>, this is a challenging annotation task even for experts in climate change skepticism. Unsurprisingly, the agreement among the student coders was low\u2014with an overall (pairwise) agreement of 79.8% and marginal reliability based on Krippendorff\u2019s alpha (\u03b1\u00a0=\u00a00.501)\u2014and varied considerably across pairs of coders. To mitigate the challenge of accurately assessing model performance in the face of noisy label data, we had an expert in climate change skepticism and misinformation review instances where there was disagreement among the coders (N\u00a0=\u00a0510) and provide the final label for each such instance. Although not completely error-free, this procedure greatly reduced label noise and helped to ensure a more accurate estimate of model performance.<\/p>\n<p>Covariates<\/p>\n<p>Building on past scholarship on interest group influence in Congress<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 36\" title=\"Roscoe, D. D. &amp; Jenkins, S. A meta-analysis of campaign contributions&#x2019; impact on roll call voting*. Soc. Sci. Q. 
86, 52&#x2013;68 (2005).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR36\" id=\"ref-link-section-d261351822e2252\" rel=\"nofollow noopener\" target=\"_blank\">36<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 37\" title=\"Whittlestone, J. A. H. &amp; Klitgaard, M. B. Interest group resources, access, and influence: an empirical review. Scand. Political Stud. 47, 308&#x2013;333 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR37\" id=\"ref-link-section-d261351822e2255\" rel=\"nofollow noopener\" target=\"_blank\">37<\/a> and on the correlates of congressional speech on climate change<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 15\" title=\"Nanko, M. O. &amp; Coan, T. G. Defeating cap-and-trade: how the fossil fuel industry and climate change counter movement obstruct US climate change legislation. Glob. Environ. Change 89, 102919 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR15\" id=\"ref-link-section-d261351822e2259\" rel=\"nofollow noopener\" target=\"_blank\">15<\/a>, we collected data on a range of covariates at the MoC level. First, we draw on metadata obtained from Congress.gov<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 38\" title=\"Congress.gov. Biographical directory of the united states congress. &#010;                  https:\/\/bioguide.congress.gov&#010;                  &#010;                 (Accessed 8 March 2025).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR38\" id=\"ref-link-section-d261351822e2263\" rel=\"nofollow noopener\" target=\"_blank\">38<\/a> to extract information on the chamber, term, party, age, and state of the MoC associated with each speech. 
Second, to measure ideology, we use the first dimension of DW-NOMINATE scores provided by Lewis et al.<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 39\" title=\"Lewis, J. B. et al. Voteview: Congressional roll-call votes database. &#010;                  https:\/\/voteview.com\/&#010;                  &#010;                 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR39\" id=\"ref-link-section-d261351822e2267\" rel=\"nofollow noopener\" target=\"_blank\">39<\/a>, which represents legislators\u2019 positions on the dominant ideological dimension in American politics, primarily reflecting liberal-conservative differences on economic and social issues. This dimension explains the largest share of variance in congressional voting behavior and ranges from \u22121 (most liberal) to +1 (most conservative). Third, we collected data on campaign contributions from the oil and gas industry and the coal mining industry received by the speech\u2019s MoC during the Congress term in which the speech was delivered. The campaign contribution data comprises contributions from Political Action Committees and individuals associated with these industries.\u00a0The data was obtained from OpenSecrets\u2019s bulk data<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 40\" title=\"OpenSecrets.org. Campaign finance for interest groups (oil &amp; gas, coal mining, and electric utilities). &#010;                  https:\/\/www.opensecrets.org\/industries&#010;                  &#010;                 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR40\" id=\"ref-link-section-d261351822e2271\" rel=\"nofollow noopener\" target=\"_blank\">40<\/a>, which derives the data from Federal Election Commission records. 
The OpenSecrets industry codes we included as part of the fossil fuel industry were: energy production and distribution (E1000), oil and gas (E1100), major (multinational) oil and gas producers (E1110), independent oil and gas producers (E1120), natural gas transmission and distribution (E1140), oilfield service, equipment and exploration (E1150), petroleum refining and marketing (E1160), gasoline service stations (E1170), fuel oil dealers (E1180), LPG\/liquid propane dealers and producers (E1190), and coal mining (E1210). We linked the OpenSecrets bulk data to our congressional speech dataset by matching each member\u2019s Bioguide ID to their corresponding CID (opensecrets_id) in the OpenSecrets data using an existing key<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 41\" title=\"Github.com\/unitedstates. Congress legislators: Comprehensive data on members of the u.s. congress. &#010;                  https:\/\/github.com\/unitedstates\/congress-legislators&#010;                  &#010;                 (n.d.).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR41\" id=\"ref-link-section-d261351822e2276\" rel=\"nofollow noopener\" target=\"_blank\">41<\/a>. Finally, we collected district-level employment data from the Bureau of Labor Statistics\u2019 Quarterly Census of Employment and Wages (QCEW) to capture the economic importance of fossil fuel industries within each member\u2019s constituency. Specifically, we obtained annual employment data for industries classified under specific NAICS codes associated with fossil fuel extraction and related activities: oil and gas extraction (211), coal mining (2121), drilling oil and gas wells (213111), support activities for oil and gas operations (213112), support activities for coal mining (213113), fossil fuel power generation (221112), natural gas distribution (2212), and pipeline transportation (486). 
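<\/p>
<p>The allocation of these county-level series to congressional districts and the normalization by total employment can be sketched as follows, with hypothetical figures and a simple area-share table (the actual merge uses geographic intersections of county and district boundaries):<\/p>

```python
def allocate_to_districts(county_emp, overlap_share):
    # Allocate county-level employment to districts in proportion to the
    # share of each county area falling inside the district.
    district_emp = {}
    for (county, district), share in overlap_share.items():
        district_emp[district] = district_emp.get(district, 0.0) + county_emp[county] * share
    return district_emp

# Hypothetical QCEW-style figures and area-overlap shares.
county_emp = {'county_a': 1000, 'county_b': 400}
overlap = {('county_a', 'CD-1'): 0.75, ('county_a', 'CD-2'): 0.25,
           ('county_b', 'CD-2'): 1.0}
ff_emp = allocate_to_districts(county_emp, overlap)
print(ff_emp)  # {'CD-1': 750.0, 'CD-2': 650.0}

# Normalize by total constituency employment to obtain the covariate share.
total_emp = {'CD-1': 300000, 'CD-2': 260000}
ff_share = {d: ff_emp[d] / total_emp[d] for d in ff_emp}
print(round(ff_share['CD-1'], 4))  # 0.0025
```

<p>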
The county-level employment data was spatially merged with congressional district boundaries using geographic intersection methods, with employment statistics allocated to districts based on the proportion of county area within each district. We normalize these employment figures by total employment for each constituency (district or state depending on House or Senate member). For more information on the procedure used to build the fossil fuel employment data, see Section\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S.3<\/a> of the\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">Supplementary Information<\/a>.<\/p>\n<p>Classifying specific contrarian claims<\/p>\n<p>We build on and extend the machine learning framework developed in Coan et al., drawing on recent advances in large language models (LLMs) to overcome several shortcomings of the original CARDS model. First, as made clear in the original Coan et al. study, their model was only appropriate for categorizing contrarian claims within a set of known climate skeptics (e.g., Conservative Think Tanks and Skeptical Blogs). The original model is thus less appropriate for classifying claims when source position is unknown and may produce false positives when classifying non-skeptical content<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 22\" title=\"Rojas, C. et al. Hierarchical machine learning models can identify stimuli of climate change misinformation on social media. Commun. Earth Environ. 
5, 436 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR22\" id=\"ref-link-section-d261351822e2298\" rel=\"nofollow noopener\" target=\"_blank\">22<\/a>. Second, the original model relies on multi-class, not multi-label data\u2014i.e., the model assumes the presence of a single category for each paragraph under consideration. This decision proved challenging when dealing with paragraphs containing multiple claims, which is the rule, not the exception. For example, consider the following excerpt:<\/p>\n<p>\u201cAnd the tragedy is if we were allowed to produce, if this Congress would stop locking up the Outer Continental Shelf, if they would open up the reserves in the Midwest which some of them are taking off in the energy bill, we could have adequate natural gas in this country; the price could be affordable; Americans could be warm; and, the very best jobs in America like petrochemical and polymers and plastic and fertilizer and glass and steel plants and bricks could be made in America, and middle-class working Americans could continue to have the jobs that have historically allowed them to live a quality of life and raise their families.\u201d<\/p>\n<p>The claim that Congress should \u201cstop locking up the Outer Continental Shelf\u201d suggests that fossil fuels are plentiful and should be used (category 7.1.0), while the statement \u201cwe could have adequate natural gas in this country\u201d suggests that fossil fuels are necessary to meet energy demand (category 7.3.0). The excerpt also makes claims related to the economy and jobs. 
The statement that \u201cthe very best jobs in America&#8230;could be made in America\u201d as a result of expanding fossil fuel supply suggests that fossil fuels are important for economic growth and development (category 7.2.1), while the suggestion that the current energy bill undermines the ability of \u201cmiddle-class working Americans [to] continue to have the jobs that have historically allowed them to live a quality of life\u201d implies that policies related to climate change could kill the \u201cvery best jobs\u201d (category 4.1.1.2).<\/p>\n<p>Lastly, though the taxonomy developed in Coan et al. outlines examples of detailed claims challenging climate science and policy solutions, the machine learning model only classifies texts based on 27 level-2 claims. Yet, classifying claims at a granular level is essential for developing an effective response to climate misinformation<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 24\" title=\"Cook, J., Ellerton, P. &amp; Kinkead, D. Deconstructing climate misinformation to identify reasoning errors. Environ. Res. Lett. 13, 024018 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR24\" id=\"ref-link-section-d261351822e2315\" rel=\"nofollow noopener\" target=\"_blank\">24<\/a>, while detailed classifications related to climate solutions allow researchers to more clearly connect specific claims to the general discourses of delay proposed in Lamb et al.\u2019s study<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 3\" title=\"Lamb, W. F. et al. Discourses of climate delay. Glob. Sustain. 3, e17 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR3\" id=\"ref-link-section-d261351822e2319\" rel=\"nofollow noopener\" target=\"_blank\">3<\/a>. 
We address these limitations by developing a multi-label LLM-based framework that classifies texts down to the lowest level of the CARDS taxonomy.<\/p>\n<p>In-context learning with foundation models<\/p>\n<p>The concept of in-context learning (ICL) has emerged as a central paradigm for task adaptation in LLMs<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 42\" title=\"Dong, Q. et al. A Survey on In-context Learning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 1107&#x2013;1128 (Association for Computational Linguistics, 2024).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR42\" id=\"ref-link-section-d261351822e2331\" rel=\"nofollow noopener\" target=\"_blank\">42<\/a>. At the core of this paradigm is the model\u2019s capacity to adapt its behavior based on provided examples, rather than having its internal parameters systematically modified via resource-intensive fine-tuning, which is a notable shift away from traditional machine learning approaches in NLP. ICL leverages the \u201ccontext\u201d embedded within the model\u2019s prompt to adapt the LLM to specific downstream tasks, spanning a spectrum from zero-shot learning (where no additional examples are provided) to few-shot learning (where several examples are offered). In essence, LLMs are able to execute an array of tasks by conditioning them on a few examples (few-shot) or task-descriptive instructions (zero-shot). This method of conditioning or \u201cprompting\u201d the LLM can be performed either manually<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 43\" title=\"Schick, T. &amp; Sch&#xFC;tze, H. True few-shot learning with prompts-a real-world perspective. Trans. Assoc. Comput. Linguist. 
10, 716&#x2013;731 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR43\" id=\"ref-link-section-d261351822e2335\" rel=\"nofollow noopener\" target=\"_blank\">43<\/a> or automatically<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 44\" title=\"Shin, T., Razeghi, Y., Logan IV, R. L., Wallace, E. &amp; Singh, S. AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 4222&#x2013;4235 (Association for Computational Linguistics, 2020).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR44\" id=\"ref-link-section-d261351822e2339\" rel=\"nofollow noopener\" target=\"_blank\">44<\/a>.<\/p>\n<p>Zero-shot classification. To assess arguably the best case scenario for LLM-based classification performance using ICL, we start by utilizing two frontier models: OpenAI\u2019s GPT4o model and Anthropic\u2019s Claude Sonnet 3.5 model. As of this writing, these models provide near state-of-the-art (SOTA) performance across a range of benchmarks and thus offer a useful baseline for understanding the zero-shot capabilities of large foundation models for classifying contrarian claims.<\/p>\n<p>To utilize an LLM-based approach, we must start by specifying a suitable prompt. We follow current best-practices in the prompt engineering literature<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 45\" title=\"Anthropic. 
Build with claude: Prompt engineering overview, &#010;                  https:\/\/docs.anthropic.com\/en\/docs\/build-with-claude\/prompt-engineering\/overview&#010;                  &#010;                 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR45\" id=\"ref-link-section-d261351822e2351\" rel=\"nofollow noopener\" target=\"_blank\">45<\/a> to iteratively develop the base prompt used in this study (see Section\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S.5<\/a> of the\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">Supplementary Information<\/a> for the final prompt and a discussion of the prompt development). Our experiments suggest that the use of chain-of-thought (CoT) prompting is particularly helpful for prompt development and classification using foundation models. Zhou et al.<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 46\" title=\"Zhou, Y. et al. Large language models are human-level prompt engineers. In The Eleventh International Conference on Learning Representations, &#010;                  https:\/\/openreview.net\/forum?id=92gvk82DE-&#010;                  &#010;                 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR46\" id=\"ref-link-section-d261351822e2361\" rel=\"nofollow noopener\" target=\"_blank\">46<\/a> carried out a systematic evaluation of different chain-of-thought prompt triggers, including human-designed and Automatic Prompt Engineer (APE). 
In their experiments, allowing the model to break a problem into steps on its own proved more effective than other prompt triggers. For this paper, we utilized the following APE prompt trigger suggested by Zhou et al.<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 46\" title=\"Zhou, Y. et al. Large language models are human-level prompt engineers. In The Eleventh International Conference on Learning Representations, &#010;                  https:\/\/openreview.net\/forum?id=92gvk82DE-&#010;                  &#010;                 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR46\" id=\"ref-link-section-d261351822e2365\" rel=\"nofollow noopener\" target=\"_blank\">46<\/a>: \u201cLet\u2019s work this out in a step by step way to be sure we have the right answer.\u201d<\/p>\n<p>Dynamic few-shot learning. Research demonstrates that providing examples can allow LLMs to better understand the correlation between the question and the response, thereby improving model performance<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 47\" title=\"Brown, T. B. et al. Language models are few-shot learners. In Proceedings of the 34th International Conference on Neural Information Processing Systems (Curran Associates Inc., 2020).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR47\" id=\"ref-link-section-d261351822e2374\" rel=\"nofollow noopener\" target=\"_blank\">47<\/a>. Traditional approaches to few-shot learning augment zero-shot prompts by including a handful of examples for each class, often relying on domain experts to hand-pick specific examples or randomly selecting from a pool of annotated data. There are practical challenges, however, to implementing the traditional approach, especially for an extensive taxonomy such as CARDS. 
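<\/p>
<p>The zero-shot setup reduces to simple prompt assembly; apart from the APE trigger quoted above, all wording below is a hypothetical stand-in for the actual prompt (which is given in Section S.5 of the Supplementary Information):<\/p>

```python
APE_TRIGGER = ('Let\u2019s work this out in a step by step way '
               'to be sure we have the right answer.')

def build_zero_shot_prompt(paragraph, taxonomy):
    # Assemble a zero-shot CoT classification prompt. The instruction wording
    # is a hypothetical stand-in; only the APE trigger is quoted from Zhou et al.
    return (
        'Classify the following congressional floor-speech paragraph against '
        'the CARDS taxonomy of contrarian climate claims. Return every '
        'applicable claim code, or No claim.\n\n'
        'Taxonomy:\n' + taxonomy + '\n\n'
        'Paragraph:\n' + paragraph + '\n\n' + APE_TRIGGER
    )

prompt = build_zero_shot_prompt(
    'We could have adequate natural gas in this country.',
    '7.1.0 Fossil fuels are plentiful\n7.3.0 Fossil fuels are necessary')
print(prompt.endswith('right answer.'))  # True
```

<p>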
Providing examples of each claim in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#Fig1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a> would significantly increase the size of the prompt needed for classification, which would increase the cost and latency of model implementation.<\/p>\n<p>Dynamic few-shot learning extends the traditional few-shot approach by sending only examples that are semantically similar to a user\u2019s input text. This approach utilizes retrieval augmented generation (RAG) to (1) retrieve relevant examples by computing the cosine similarity between the input text and a pool of annotated examples, and (2) generate few-shot classifications using the retrieved examples. Dynamic few-shot learning is particularly useful when one has access to a diverse set of training examples\u2014i.e., either sourced through previous research or by hiring annotators\u2014and may provide the only practical approach to implementing few-shot learning for large, multi-label classification problems.<\/p>\n<p>Fine-tuning scalable alternatives<\/p>\n<p>Although frontier models and in-context learning have been shown to offer competitive performance across a range of classification tasks, these models are expensive to use for large text datasets, making them difficult to scale in practice (see Section\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S.7<\/a> of the\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">Supplementary Information<\/a> for a detailed breakdown of costs). 
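<\/p>
<p>A minimal sketch of the retrieval step, using toy embedding vectors (a real implementation would embed texts with a sentence-embedding model and retrieve from the pool of annotated examples):<\/p>

```python
import numpy as np

def top_k_examples(query_vec, example_vecs, k=5):
    # Indices of the k annotated examples most similar to the query,
    # ranked by cosine similarity of their embeddings.
    q = query_vec / np.linalg.norm(query_vec)
    E = example_vecs / np.linalg.norm(example_vecs, axis=1, keepdims=True)
    return np.argsort(-(E @ q))[:k]

# Toy embeddings; in practice these come from an embedding model.
rng = np.random.default_rng(0)
examples = rng.normal(size=(100, 8))
query = examples[42] + 0.01 * rng.normal(size=8)  # query close to example 42
print(top_k_examples(query, examples, k=3)[0])  # 42
```

<p>The retrieved examples are then inserted into the few-shot prompt in place of a fixed, hand-picked set, keeping the prompt short regardless of taxonomy size.<\/p>
<p>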
Given challenges associated with scaling current SOTA models, we developed a procedure for fine-tuning a smaller, more scalable (i.e., cheaper and faster) alternative.<\/p>\n<p>Fine-tuning data. We start by curating a dataset of high-quality examples for each claim in the revised CARDS taxonomy, aiming for roughly 5 examples per claim. To source examples, we draw on data from the original Coan et al.<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 21\" title=\"Coan, T. G., Boussalis, C., Cook, J. &amp; Nanko, M. O. Computer-assisted classification of contrarian claims about climate change. Sci. Rep. 11, 22320 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR21\" id=\"ref-link-section-d261351822e2410\" rel=\"nofollow noopener\" target=\"_blank\">21<\/a> study, focusing on paragraph level data from conservative think tanks (CTTs). When selecting examples, the research team prioritized those that a) effectively captured the claim of interest and b) provided a diverse representation of the ways in which claims are made in real-world texts. We included a carefully selected set of \u201cNo claim\u201d examples, again drawing on data from Coan et al.<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 21\" title=\"Coan, T. G., Boussalis, C., Cook, J. &amp; Nanko, M. O. Computer-assisted classification of contrarian claims about climate change. Sci. Rep. 11, 22320 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR21\" id=\"ref-link-section-d261351822e2414\" rel=\"nofollow noopener\" target=\"_blank\">21<\/a>. Specifically, we select a set of semantically and substantively similar texts that express the opposite view of each claim in the taxonomy. 
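<\/p>
<p>Curated examples of this kind are ultimately serialized for supervised fine-tuning. A minimal sketch using a chat-messages JSONL layout (an assumption for illustration; the instruction text and answer format here are hypothetical, not those used in the study):<\/p>

```python
import json

def to_finetune_record(paragraph, assistant_answer):
    # One training example in a chat-messages JSONL layout for supervised
    # fine-tuning; the system instruction is a hypothetical stand-in.
    return {'messages': [
        {'role': 'system',
         'content': 'Identify contrarian climate claims using the CARDS taxonomy.'},
        {'role': 'user', 'content': paragraph},
        {'role': 'assistant', 'content': assistant_answer},
    ]}

rec = to_finetune_record(
    'Fossil fuels are plentiful and should be used.',
    'The text argues fossil fuel supply is abundant. Final answer: 7.1.0')
line = json.dumps(rec)  # one line of the .jsonl training file
print(json.loads(line)['messages'][2]['content'].endswith('7.1.0'))  # True
```

<p>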
Many of these examples were sourced from quotations and references made to mainstream arguments in CTT text. This procedure resulted in 1691 examples representing the full range of claims in the taxonomy. Note that in addition to using these data for fine-tuning, we also use these examples when carrying out dynamic few-shot learning.<\/p>\n<p>Given that the focus of our analysis is congressional speech, we held out an additional 100 examples from the 2151 manually annotated test set paragraphs to provide additional context on the textual characteristics commonly found in congressional testimonies. OpenAI documentation suggests that we should observe improvements from fine-tuning on 50 to 100 examples, depending on the use case<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 48\" title=\"OpenAI. Supervised fine-tuning. &#010;                  https:\/\/platform.openai.com\/docs\/guides\/supervised-fine-tuning&#010;                  &#010;                 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#ref-CR48\" id=\"ref-link-section-d261351822e2421\" rel=\"nofollow noopener\" target=\"_blank\">48<\/a>. Considering the potential difficulties associated with our classification task, we start with the upper-bound of this recommendation. As such, this leaves a total of 2051 out-of-sample paragraphs to assess model performance.<\/p>\n<p>Reverse engineered chain-of-thought prompting. While large foundation models can apply APE straightaway, smaller models must be taught to \u201creason\u201d. We developed a procedure\u2014reverse\u00a0engineered chain-of-thought prompting (RECoT)\u2014that uses examples to solve the dual objectives of teaching a scalable model to \u201creason\u201d, while also training the model to identify specific contrarian claims on climate change. RECoT is carried out in two steps. 
First, we draw on our fine-tuning dataset and a SOTA model to analyze example text and \u201creason\u201d through to the final answer provided by the researcher. We explored both Claude 3.5 Sonnet and GPT-4o for this step, as these were high-performing and affordable options at the time of this analysis. Importantly, the model is instructed to \u201cact\u201d as if it has not been given the correct answers. Second, we fine-tune a smaller, more scalable base model on the full set of CoT responses. After assessing several candidate open- and closed-source base models for RECoT fine-tuning, we found that GPT-4o-mini provided an effective model for this task: it is large enough to learn efficiently from our fine-tuning dataset, yet cheap and fast enough to ensure model scalability. Supplementary Section\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S.6<\/a> provides a detailed explanation of how we employed the RECoT procedure.<\/p>\n<p>Model performance<\/p>\n<p>Table\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#Tab3\" rel=\"nofollow noopener\" target=\"_blank\">3<\/a> compares the overall performance of the various modeling approaches described above, classifying claims down to level 3 of the taxonomy (see Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#Fig1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>). Several notable findings emerge from this analysis. First, and somewhat surprisingly, few-shot learning underperforms relative to its zero-shot counterparts. 
This finding suggests that the procedures used to select suitable context or examples hinder rather than help classification performance. The result is particularly striking given that fine-tuning on the same examples yields substantially better results.<\/p>\n<p>Second, zero-shot classification with Claude-Sonnet-3.7 (one of\u00a0the most advanced frontier\u00a0models at the time of this analysis) delivers remarkable overall performance, consistently achieving the highest scores across our full range of metrics. While this performance is promising, the inference costs for Claude-Sonnet-3.7 are prohibitively high, making this model difficult\u2014if not impossible\u2014to scale to even moderate-sized classification tasks. As demonstrated in Supplementary Section\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s44458-025-00029-z#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S.7<\/a>, using Claude-Sonnet-3.7 is nearly 20 times more expensive than GPT-4o-mini and 10 times more expensive than our fine-tuned alternatives.<\/p>\n<p>In light of these scalability constraints, we turn to our fine-tuned alternatives. We took the original fine-tuning dataset of 1691 examples curated from CTTs and generated chain-of-thought responses using both GPT-4o and Claude 3.5 Sonnet, creating two distinct datasets. Using GPT-4o-mini as our base model, we fine-tuned two models\u2014one with each dataset\u2014which we term CARDS-mini-GPT and CARDS-mini-Sonnet. These models demonstrated substantial performance improvements compared to the base model (GPT-4o-mini): CARDS-mini-GPT achieved a 30.9 percentage point increase in F1-score relative to the GPT-4o-mini baseline, while CARDS-mini-Sonnet achieved a 31.9 percentage point increase. Moreover, incorporating the 100 congressional testimony-specific training examples further enhanced these metrics. 
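For reference, multi-label F1 and Hamming loss of the kind reported here can be computed as in the following sketch, using scikit-learn's standard metrics on toy label matrices (the values are invented for illustration; this is not the authors' evaluation code):

```python
import numpy as np
from sklearn.metrics import f1_score, hamming_loss

# Toy multi-label indicator matrices: rows are paragraphs, columns are
# taxonomy claims (1 = claim present). Values are invented for illustration.
y_true = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [0, 0, 0]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]])

micro_f1 = f1_score(y_true, y_pred, average="micro")  # 2*TP / (|pred| + |true|)
loss = hamming_loss(y_true, y_pred)  # fraction of label slots predicted wrongly
print(round(micro_f1, 3), round(loss, 3))  # prints "0.667 0.222"
```

Micro-averaged F1 pools true positives across all claim columns, so rare claims do not dominate; Hamming loss penalizes every wrong label slot equally.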
The best-performing CARDS model (CARDS-mini-Sonnet-2024-12-05) achieved performance metrics that closely approximate (or equal, in the case of Hamming loss) those of the Claude-Sonnet-3.7 alternative. Crucially, its inference cost is approximately one-tenth that of Claude-Sonnet-3.7, making CARDS-mini-Sonnet-2024-12-05 a viable alternative for classifying large volumes of text data.<\/p>\n<p>Statistical methods<\/p>\n<p>To examine the relationship between contrarian claims and key demographic, political, and economic covariates in Congressional speech, we estimated a series of Bayesian mixed effects models. Our unit of analysis is the speech, and the dependent variable of interest is a binary measure for the presence (1) or absence (0) of a contrarian claim in a given speech. While there are several ways to operationalize contrarian speech in our data (each with benefits and drawbacks), our approach captures the adage that it only takes a little dirt to muddy the waters.<\/p>\n<p>Specifically, we examine the correlates of climate change claims using a Bayesian logistic regression model with random effects for legislators and years. Let y<sub>it<\/sub> represent whether member of Congress i made a climate change claim during year t. We assume the following probability model: <\/p>\n<p>$${y}_{it} \\sim \\,{{\\rm{Bernoulli}}}\\,({p}_{it})$$<\/p>\n<p>\n                    (1)\n                <\/p>\n<p>where p<sub>it<\/sub> represents the probability that a member of Congress makes a contrarian claim. 
To model this probability as a function of covariates, we use the logit link function: <\/p>\n<p>$$\\,{{\\rm{logit}}}({p}_{it})={{\\rm{log}}}\\,\\left(\\frac{{p}_{it}}{1-{p}_{it}}\\right)={\\beta }_{0}+{{\\bf{X}}}_{it}{{\\boldsymbol{\\beta }}}+{\\alpha }_{i}+{\\gamma }_{t}$$<\/p>\n<p>\n                    (2)\n                <\/p>\n<p>where \u03b2<sub>0<\/sub> is the global intercept, X<sub>it<\/sub> is a vector of covariates for member i in year t, \u03b2 is a vector of fixed-effect coefficients, \u03b1<sub>i<\/sub> is a member-specific random intercept, and \u03b3<sub>t<\/sub> is a year-specific random intercept. The covariates include party affiliation (Republican, Independent), chamber (Senate), sex (Female), standardized age, standardized fossil fuel employment, standardized utility sector presence, and standardized fossil fuel campaign contributions. We estimate the model using Bayesian inference with the following priors: <\/p>\n<p>$${y}_{it} \\sim \\,{{\\rm{Bernoulli}}}\\,({p}_{it}) \\qquad \\qquad {{\\rm{(likelihood)}}}$$<\/p>\n<p>\n                    (3)\n                <\/p>\n<p>$$\\,{{\\mbox{logit}}}\\,({p}_{it})={\\beta }_{0}+{{\\bf{X}}}_{it}{{\\boldsymbol{\\beta }}}+{\\alpha }_{i}+{\\gamma }_{t}$$<\/p>\n<p>\n                    (4)\n                <\/p>\n<p>$${\\beta }_{0},{{\\boldsymbol{\\beta }}} \\sim \\,{{\\mbox{Student-}}}\\,t(4,0,2.5) \\qquad \\qquad {{\\rm{(priors)}}}$$<\/p>\n<p>\n                    (5)\n                <\/p>\n<p>$${\\alpha }_{i} \\sim \\,{\\mbox{Normal}}\\,(0,{\\sigma }_{\\alpha })$$<\/p>\n<p>\n                    (6)\n                <\/p>\n<p>$${\\gamma }_{t} \\sim \\,{\\mbox{Normal}}\\,(0,{\\sigma }_{\\gamma })$$<\/p>\n<p>\n                    (7)\n                <\/p>\n<p>$${\\sigma }_{\\alpha } \\sim {{\\mbox{Normal}}}^{+}(0,1)$$<\/p>\n<p>\n                    (8)\n                <\/p>\n<p>$${\\sigma }_{\\gamma } \\sim \\,{{\\mbox{Student-}}}\\,{t}^{+}(0,0.5)$$<\/p>\n<p>\n                    (9)\n                <\/p>\n<p>where Student-t+ denotes the half Student\u2019s t 
distribution constrained to be positive. We use weakly informative priors for all parameters, with Student-t distributions for the fixed effects and half-Normal and half-Student-t distributions for the random-effect standard deviations. The model was implemented in Stan (<a href=\"http:\/\/mc-stan.org\" rel=\"nofollow noopener\" target=\"_blank\">http:\/\/mc-stan.org<\/a>) and estimated using Hamiltonian Monte Carlo with 4 chains of 3000 iterations each (including 1000 warm-up iterations).<\/p>\n
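To make Eqs. (1) and (2) concrete, the modeled claim probability can be evaluated numerically; a minimal sketch with invented coefficient, intercept, and covariate values (this is not the fitted model, and the covariate ordering is only illustrative):

```python
import numpy as np

def claim_probability(x, beta0, beta, alpha_i, gamma_t):
    """Inverse logit of Eq. (2): p_it = 1 / (1 + exp(-(b0 + x'b + a_i + g_t)))."""
    eta = beta0 + x @ beta + alpha_i + gamma_t  # linear predictor
    return 1.0 / (1.0 + np.exp(-eta))

# Illustrative covariate vector: [Republican, Senate, Female, std. age,
# std. fossil fuel employment, std. utility presence, std. contributions]
x = np.array([1.0, 0.0, 0.0, 0.5, 1.2, 0.0, 0.8])
beta = np.array([1.1, -0.2, -0.4, 0.1, 0.3, 0.05, 0.25])  # invented values

p = claim_probability(x, beta0=-3.0, beta=beta, alpha_i=0.4, gamma_t=-0.1)
# p is the modeled probability that this member makes a contrarian claim
```

The member- and year-specific intercepts shift the linear predictor before the inverse logit maps it into (0, 1), which is exactly the role of \u03b1<sub>i<\/sub> and \u03b3<sub>t<\/sub> in Eq. (2).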