Large Language Models (LLMs) are increasingly used in autonomous decision-making, where they sample options from vast action spaces. However, the heuristics that guide this sampling process remain under-explored. We study this sampling behavior and show that the underlying heuristic resembles that of human decision-making: it comprises a descriptive component (reflecting the statistical norm of a concept) and a prescriptive component (the implicit ideal encoded in the LLM). We show that this deviation of samples from the statistical norm toward the prescriptive component appears consistently across concepts in diverse real-world domains such as public health and economic trends. To further illustrate the theory, we demonstrate that concept prototypes in LLMs are likewise affected by prescriptive norms, mirroring the human notion of normality. Through case studies and comparisons with human studies, we show that in real-world applications, this shift of samples toward an ideal value can lead to significantly biased decision-making, raising ethical concerns.
Any open-ended intelligent agent, whether it is a human or an LLM, faces a fundamental dilemma. In many contexts, there are countless possible actions or completions that could, in principle, be realized. But which ones to pursue? These possibilities span a wide range, from coherent to incoherent, relevant to irrelevant, morally sound to deeply questionable. The vast majority are never realized, not because they are impossible, but because there are simply too many to consider. Faced with this, natural agents rely on heuristics: coarse, often unconscious shortcuts that help filter and prioritize what to express or act upon.
Are LLMs doing something similar in such situations? How do they navigate this space? What implicit regularities guide their generation?
When humans engage in decision-making, whether completing a sentence, offering advice, or imagining an action in a scenario, their outputs often reflect more than what is merely plausible or frequent. Even without being asked to express a value judgment, people tend to produce responses that align with what feels appropriate, desirable, or ideal.
Why do humans do this?
Humans do not weigh options based solely on how statistically likely they are. Multiple factors push their outputs toward value-driven answers: pragmatic goals, the psychological dominance of social norms over statistics, and bounded-rationality heuristics that collapse large option spaces toward "ideals". All of these pull human responses in some direction of value.
This led us to ask whether LLMs exhibit a similar tendency: when generating outputs from a vast space of possibilities, and regardless of whether the prompt specifies any goal, do their samples drift toward values that resemble implicit ideals? To investigate, we designed a series of experiments aimed at disentangling descriptive behavior (reflecting what is statistically expected) from prescriptive drift (reflecting what might be considered ideal). We began with tightly controlled fictional setups, then extended the analysis to real-world domains such as medicine, education, and lifestyle. In each case, we examined whether model outputs tracked statistical averages or systematically leaned toward ideals, even when those values were not requested.
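To make the idea of "drift toward an ideal" concrete, here is a minimal sketch of how such a measurement could look. This is an illustration under our own assumptions, not the paper's exact protocol: the `prescriptive_drift` function, the example concept, and all numbers are hypothetical placeholders; repeated model completions would supply the `samples` list in practice.

```python
import statistics

def prescriptive_drift(samples, statistical_average, ideal_value):
    """
    Express how far the mean of model samples has moved from the
    statistical norm, as a fraction of the distance toward the ideal.

    0.0 -> samples sit at the statistical average (purely descriptive)
    1.0 -> samples sit at the ideal value (purely prescriptive)
    """
    sample_mean = statistics.mean(samples)
    gap = ideal_value - statistical_average
    if gap == 0:
        return 0.0
    return (sample_mean - statistical_average) / gap

# Hypothetical concept: daily hours of TV watched by an average person.
statistical_average = 4.0   # e.g., a survey-based average (illustrative number)
ideal_value = 2.0           # e.g., the model's answer to "how many hours *should* one watch?"
samples = [3.0, 3.5, 2.5, 3.0, 2.0, 3.5]  # repeated completions to a neutral "how many hours does..." prompt

print(f"drift toward ideal: {prescriptive_drift(samples, statistical_average, ideal_value):.2f}")
```

A drift near 0 would indicate the model is reporting the statistical norm; a drift near 1 would indicate its samples have collapsed onto the ideal even though no value judgment was requested.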
| # | What we found | Why it matters |
|---|---|---|
| 1 | The descriptive + prescriptive heuristic is real. Across 500 concepts and 15 models, samples systematically drift toward an "ideal" value rather than sitting at the statistical average. | Any agent that relies on raw model samples inherits this hidden value bias. |
| 2 | Bigger ≠ safer. The prescriptive pull increases with model size and with RLHF / instruction tuning. | Scaling laws need to account for value drift, not just accuracy. |
| 3 | Human ≠ model ideals. The direction of the "ideal" often clashes with human judgements. | Alignment work must confront the fact that larger, better models may amplify, not reduce, misalignment. |
This work is a joint effort between CISPA Helmholtz Center for Information Security, TCS Research, and Microsoft. Thanks to the anonymous ACL reviewers for insightful feedback, and to the broader community for discussions on value alignment.
@inproceedings{sivaprasad2025theory,
title={A Theory of Response Sampling in LLMs: Part Descriptive and Part Prescriptive},
author={Sivaprasad, Sarath and Kaushik, Pramod and Abdelnabi, Sahar and Fritz, Mario},
booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
pages={30091--30135},
year={2025}
}