From theory to data
\[ \newcommand{\expr}[3]{\begin{array}{c} #1 \\ \bbox[lightblue,5px]{#2} \end{array} ⊢ #3} \newcommand{\ct}[1]{\bbox[font-size: 0.8em]{\mathsf{#1}}} \newcommand{\updct}[1]{\ct{upd\_#1}} \newcommand{\abbr}[1]{\bbox[transform: scale(0.95)]{\mathtt{#1}}} \newcommand{\pure}[1]{\bbox[border: 1px solid orange]{\bbox[border: 4px solid transparent]{#1}}} \newcommand{\return}[1]{\bbox[border: 1px solid black]{\bbox[border: 4px solid transparent]{#1}}} \def\P{\mathtt{P}} \def\Q{\mathtt{Q}} \def\True{\ct{T}} \def\False{\ct{F}} \def\ite{\ct{if\_then\_else}} \def\Do{\abbr{do}} \]
Semantic theory has achieved remarkable success in characterizing the compositional structure of natural language meaning. Through decades of careful theoretical work, semanticists have developed elegant formal systems that capture how complex meanings arise from the systematic combination of simpler parts. These theories explain two fundamental types of judgments that speakers make: acceptability judgments about whether strings are well-formed, and inference judgments about what follows from what speakers say.
The field now stands at an exciting juncture. The rise of large-scale experimental methods and computational modeling opens new opportunities to test and refine these theoretical insights against rich behavioral data. The challenge—and opportunity—is to connect our elegant formal theories to the messy, gradient patterns we observe when hundreds of speakers make thousands of judgments. How can we maintain the theoretical insights that formal semantics has achieved while extending them to account for this new empirical richness?
Probabilistic Dynamic Semantics (PDS) aims to provide a systematic bridge between these theoretical insights and behavioral data. It takes the compositional analyses developed using traditional Montagovian methods and maps them to probabilistic models that can be quantitatively evaluated against experimental results. The goal is not to replace traditional semantics but to extend its reach, allowing us to test theoretical predictions at unprecedented scale while maintaining formal rigor.
Traditional Semantic Methodology: Foundations of Success
Semanticists study the systematic relationships between linguistic expressions and the inferences they support. The field’s methodology centers on two types of judgments:
Acceptability judgments assess whether strings are well-formed relative to a language and in a particular context of use (Chomsky 1957; see Schütze 2016). For example, in a context where a host asks what a guest wants with coffee, (1) is clearly acceptable, while (2) is not (Sprouse and Villata 2021):
- What would you like with your coffee?
- #What would you like and your coffee?
Inference judgments assess relationships between strings (see Davis and Gillon 2004). When speakers hear (3), they typically infer (4) (White 2019):
- Jo loved that Mo left.
- Mo left.
Observational Adequacy
A core desideratum for semantic theories is observational adequacy (Chomsky 1964): for any string \(s \in \Sigma^*\), we should predict how acceptable speakers find it in context, and for acceptable strings \(s, s'\), we should predict whether speakers judge \(s'\) inferable from \(s\). Achieving observational adequacy requires mapping vocabulary elements to abstractions that predict judgments parsimoniously.
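One way to make these two prediction targets explicit (the function names are ours, purely for illustration) is as a pair of mappings, where \(C\) is a set of contexts of use:

\[ \ct{acc} : \Sigma^* \times C \to [0, 1] \qquad\qquad \ct{inf} : \Sigma^* \times \Sigma^* \times C \to [0, 1] \]

Whether the codomains should be continuous, as here, or small discrete sets of judgment categories is itself a theoretical commitment, and one that the gradience data discussed below bear on directly.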
These abstractions may be discrete or continuous, simple or richly structured. Through careful analysis of consistent inference patterns, semanticists have identified powerful generalizations. For instance, examining predicates like love, hate, be surprised, and know, theorists observed they all give rise to inferences about their complement clauses that survive under negation and questioning. This led to positing that they all share a property that predicts systematic inferential behavior across diverse predicates (Kiparsky and Kiparsky 1970; cf. Karttunen 1971).
Descriptive Adequacy and Theoretical Depth
Beyond observational adequacy lies descriptive adequacy: capturing data “in terms of significant generalizations that express underlying regularities in the language” (Chomsky 1964, 63). This drive for deeper explanation motivates the field’s emphasis on parsimony and formal precision.
The history of generative syntax illustrates two approaches to achieving descriptive adequacy:
- Analysis-driven: Start with observationally adequate analyses in expressive formalisms, then extract generalizations as constraints
- Hypothesis-driven: Begin with constrained formalisms (like CCG or minimalist grammars) and test their empirical coverage
The hypothesis-driven approach, which PDS adopts for semantics, aims to delineate phenomena through representational constraints. This becomes crucial when developing models that both accord with theoretical assumptions and can be evaluated quantitatively (Baroni 2022; Pavlick 2023).
The Power and Natural Boundaries of Traditional Methods
This methodology has yielded profound insights into semantic composition, scope phenomena, discourse dynamics, and the semantics-pragmatics interface more generally. By focusing on carefully constructed examples and native speaker intuitions, theorists have uncovered deep regularities in how meaning is constructed and interpreted.
Yet every methodology has natural boundaries. Traditional semantic methods excel at identifying patterns and building theories but face practical constraints when we ask:
- How well do our generalizations, based on examining 5-10 predicates, extend to the thousands of predicates in the lexicon?
- What factors beyond semantic knowledge influence the judgments we observe?
- How exactly does abstract semantic knowledge produce concrete behavioral responses?
The Experimental Turn: New Opportunities for Semantic Theory
The traditional methodology’s success has created a foundation solid enough to support exciting new extensions. Experimental semantics brings the tools of behavioral experimentation to bear on questions about meaning, allowing us to test and refine theoretical insights at unprecedented scale.
Scaling Semantic Investigation
Where traditional methods might examine a handful of predicates, experimental approaches can investigate entire lexical domains. Extending our example involving the verb love: English has thousands of similar clause-embedding predicates, each potentially varying in its inferential properties. We can now test whether generalizations based on canonical examples extend across these vast lexicons.
The MegaAttitude project (White and Rawlins 2016, 2018, 2020; White et al. 2018; An and White 2020; Moon and White 2020; Kane, Gantt, and White 2022) is one example of this approach. This project aims to collect inference judgments for hundreds of predicates across multiple contexts and inference types. Data at this scale reveal patterns that are difficult to detect, and whose robustness is hard to evaluate, with traditional methods: subtle distinctions between near-synonyms, unexpected predicate clusters, and systematic variation across semantic domains.
Teasing Apart Contributing Factors
Experimental methods also allow us to investigate the rich array of factors that influence inference judgments:
- Semantic knowledge: The core meanings of expressions
- World knowledge: Prior beliefs about plausibility
- Contextual factors: The discourse context and question under discussion (QUD)
- Individual differences: Variation in how speakers interpret expressions
- Response strategies: How participants use rating scales
Rather than viewing these as confounds, we can see them as windows into the cognitive processes underlying semantic interpretation. For instance, Degen and Tonhauser (2021) systematically manipulated world knowledge to show how prior beliefs modulate the strength of factive inferences, revealing the interplay between semantic and pragmatic factors.
Making Linking Hypotheses Explicit
Perhaps most importantly, experimental approaches force us to make explicit what traditional methods leave implicit: the link between semantic representations and behavioral responses (Jasbi, Waldon, and Degen 2019; Waldon and Degen 2020; Phillips et al. 2021). When we say speakers judge that an inference follows, what cognitive processes produce that judgment? How do abstract semantic representations map onto the responses on some scale?
This is not merely a methodological detail—it’s a substantive theoretical question. Different linking hypotheses make different predictions about response patterns, allowing us to test not just our semantic theories but our assumptions about how those theories connect to behavior. Even if our real interest is in characterizing the semantic representations of speakers, we can’t ignore the way those representations map onto their responses in some task.
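To make the stakes concrete, here is a minimal sketch in Python of two toy linking hypotheses (all function and parameter names are invented for illustration). Both assume the same underlying semantic value, the probability \(p\) that an inference holds, but they predict very different response distributions:

```python
import numpy as np

rng = np.random.default_rng(0)

def link_direct(p, n, noise=0.05):
    """Linking hypothesis A: participants report p directly,
    perturbed by response noise and truncated to the slider range."""
    return np.clip(rng.normal(p, noise, size=n), 0, 1)

def link_threshold(p, n, noise=0.05):
    """Linking hypothesis B: participants first make a binary judgment
    (the inference holds with probability p), then report near 1 or 0."""
    holds = rng.random(n) < p
    return np.clip(rng.normal(np.where(holds, 0.9, 0.1), noise, size=n), 0, 1)

p = 0.6
a = link_direct(p, 10_000)     # unimodal, centered near 0.6
b = link_threshold(p, 10_000)  # bimodal, with modes near 0.1 and 0.9
print(f"means: {a.mean():.2f} vs {b.mean():.2f}")  # nearly identical
print(f"stds:  {a.std():.2f} vs {b.std():.2f}")    # very different
```

The two hypotheses are nearly indistinguishable on mean ratings but trivially distinguishable on full response distributions, which is why distribution-level data will matter throughout what follows.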
Understanding Gradience: A Taxonomy of Uncertainty
One of the most striking findings from experimental semantics is the pervasiveness of gradience in aggregated measures. While semanticists have long recognized gradience in some domains (e.g., gradable adjectives), we often assume categorical distinctions in others (e.g., factivity). And even where traditional approaches assume categorical distinctions, experimental methods often reveal continuous variation. For the reasons laid out above, understanding this gradience is crucial for developing theories that connect formal semantics to behavioral data.
Examples of Potentially Unexpected Gradience
The kinds of distributionally and inferentially defined properties we develop generalizations around are not always readily apparent in large-scale datasets. An example we will look at in depth in our second case study of the course is that, when attempting to measure veridicality/factivity, we end up with more gradience than we might have expected. We can illustrate this using the MegaAttitude datasets.
Figure 1 shows veridicality judgments collected by White and Rawlins (2018) as part of their MegaVeridicality dataset.

One thing White and Rawlins (2018) note is the apparent gradience in these measures. This gradience presents a challenge if we want to use these measures to evaluate generalizations about the relationship between two properties. For instance, say we are interested in understanding the relationship between factivity and neg(ation)-raising. A predicate is neg-raising if it licenses inferences from sentences like (5) to sentences like (6):
- Jo doesn’t think that Mo left.
- Jo thinks that Mo didn’t leave.
One way of deriving a factivity measure from the MegaVeridicality dataset is to take the max along both polarity dimensions, as shown in Figure 2. The idea is that a factive predicate should give rise to veridicality inferences under both positive and negative matrix polarity.
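As a minimal sketch of this kind of derivation (in Python; the predicate names, the ratings, and the use of the minimum to combine the two dimensions are all our illustrative assumptions, not the actual derivation pipeline):

```python
import numpy as np

# Hypothetical per-predicate veridicality scores in [0, 1]: one row per
# predicate, one column per item. All names and values are invented.
predicates = ["know", "think", "pretend"]
pos_ratings = np.array([[0.95, 0.90, 0.97],   # "x V-ed that p" => p
                        [0.60, 0.55, 0.70],
                        [0.10, 0.15, 0.05]])
neg_ratings = np.array([[0.92, 0.88, 0.90],   # "x didn't V that p" => p
                        [0.20, 0.25, 0.15],
                        [0.08, 0.10, 0.05]])

# Step 1 (as in the text): take the max along each polarity dimension,
# i.e., each predicate's strongest veridicality reading per polarity.
pos_max = pos_ratings.max(axis=1)
neg_max = neg_ratings.max(axis=1)

# Step 2 (one illustrative way to combine the dimensions): a predicate is
# factive only to the extent that the inference survives under BOTH
# polarities, so score it by the smaller of its two maxima.
factivity = np.minimum(pos_max, neg_max)
for pred, f in zip(predicates, factivity):
    print(f"{pred}: {f:.2f}")   # know: 0.92, think: 0.25, pretend: 0.10
```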

Now let’s suppose we’re interested in generalizations about the relationship between two measures. For instance, we might want to evaluate the relationship between factivity and neg-raising, where we might tend to suspect that factives are not neg-raisers.
Figure 3 shows a comparison of the measure of neg(ation)-raising from the MegaNegRaising dataset collected by An and White (2020) and the derived factivity measure from the MegaVeridicality dataset collected by White and Rawlins (2018).

The challenge is that, once we move to relating continuous measures, rather than categorical distinctions, we don’t know what the relationship between measures should look like in any particular case. To illustrate, let’s consider another example. Anand and Hacquard (2014) propose that, if a predicate gives rise to inferences about both beliefs and preferences, it backgrounds the belief inferences. To evaluate this hypothesis, we might try to derive a measure of belief inferences and preference inferences and then relate them.
To this end, we can use the MegaIntensionality dataset collected by Kane, Gantt, and White (2022). Figure 4 shows a measure of belief inferences and Figure 5 shows a measure of desire inferences.

Figure 6 shows the relationship between the desire and belief measures.

There are two main takeaways from this example. First, the generalization proposed by Anand and Hacquard (2014) is indeed supported by the data. Second, the relationship between these two measures is strikingly different from the relationship we observe between the continuous measures of factivity and neg-raising. We need some way of theorizing about these continuous relationships.
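Mechanically, relating two continuous per-predicate measures is straightforward; a minimal sketch (the scores below are invented for illustration, not drawn from the datasets):

```python
import numpy as np
from scipy.stats import spearmanr

# Invented per-predicate scores, for illustration only.
belief = np.array([0.91, 0.85, 0.40, 0.88, 0.35, 0.77])
desire = np.array([0.20, 0.15, 0.82, 0.30, 0.90, 0.25])

# Rank correlation between the two derived measures.
rho, pval = spearmanr(belief, desire)
print(f"Spearman rho = {rho:.2f} (p = {pval:.3f})")
```

The hard part is not computing such statistics but saying what value, or what shape of relationship, a generalization like Anand and Hacquard's predicts for them. That is the gap a framework for theorizing about continuous relationships must fill.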
Two Fundamental Types of Uncertainty
The framework we’ll explore distinguishes two general types of uncertainty that can produce gradience: resolved (or type-level) uncertainty and unresolved (or token-level) uncertainty, both of which can arise from multiple sources.
```
Sources of Gradience in Inference Judgments
├── Resolved (Type-Level) Uncertainty
│   ├── Ambiguity
│   │   ├── Lexical (e.g., "run" = locomote vs. manage)
│   │   ├── Syntactic (e.g., attachment ambiguities)
│   │   └── Semantic (e.g., scope ambiguities)
│   └── Discourse Status
│       └── QUD (Question Under Discussion)
└── Unresolved (Token-Level) Uncertainty
    ├── Vagueness (e.g., height of a "tall" person)
    ├── World knowledge (e.g., likelihood that facts are true)
    └── Task effects
        ├── Response strategies
        └── Response error
```
Resolved Uncertainty: Multiple Discrete Possibilities
Resolved uncertainty arises when speakers must choose among discrete interpretations. Consider (7):
- My uncle is running the race.
The verb run is ambiguous—the uncle might be a participant (locomotion) or the organizer (management). Asked “How likely is it that my uncle has good managerial skills?”, participants who interpret run as locomotion might respond near 0.2, while those interpreting it as management might respond near 0.8. The population average might be 0.5, but this reflects a mixture of discrete interpretations, not genuine gradience.
This uncertainty is “resolved” because once speakers fix an interpretation, the inference follows determinately. The gradience emerges from averaging across different resolutions, not from uncertainty within any single interpretation.
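A small simulation makes the point (the response values 0.2 and 0.8 follow the example above; the even mixing weight is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Each simulated participant resolves the ambiguity one way or the other...
management = rng.random(n) < 0.5  # half adopt the management sense
# ...and responds according to that single, fixed interpretation.
responses = np.clip(np.where(management, 0.8, 0.2) + rng.normal(0, 0.05, n), 0, 1)

print(f"mean response: {responses.mean():.2f}")  # ~0.50
```

The average sits near 0.5 even though almost no individual response does; a histogram would show two modes, not one.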
A similar phenomenon is observable with anaphora. Consider (8):
- Whenever anyone laughed, the magician scowled and their assistant smirked. They were secretly pleased.
One is quite likely to infer from (8) that the magician’s assistant is secretly pleased, but not necessarily that the magician is pleased, even though, in principle, it may be that both are, or even that only the magician is. Ultimately, the ambiguity is resolved when we fix the referent.
Unresolved Uncertainty: Gradient Within Interpretations
Unresolved uncertainty contrasts with resolved uncertainty in that it persists even after fixing all ambiguities. Consider (9):
- My uncle is tall.
Even with no ambiguity about tall’s meaning, speakers remain uncertain whether the uncle exceeds any particular height threshold. This is classic vagueness—the predicate’s application conditions are inherently gradient (Fine 1975; Graff 2000; Christopher Kennedy 2007; Rooij 2011; Sorensen 2023).
World knowledge creates another layer: even knowing someone runs races (locomotion sense), we remain uncertain about their speed, endurance, or likelihood of finishing. These uncertainties appear within individual trials, not just across participants.
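Anticipating the degree-based analysis introduced in the first case study below, this within-interpretation uncertainty can be modeled with a distribution over thresholds. As a worked instance (the normal threshold distribution and its parameters are illustrative assumptions):

\[ P(\ct{tall}(x)) \;=\; P(d_\text{tall} \leq \ct{height}(x)) \;=\; \Phi\!\left(\frac{\ct{height}(x) - \mu}{\sigma}\right) \]

With \(\mu = 180\,\text{cm}\) and \(\sigma = 6\,\text{cm}\), someone who is 182 cm tall is judged tall with probability \(\Phi(1/3) \approx 0.63\): a gradient value produced within a single, fixed interpretation.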
Why This Distinction Matters
The type of uncertainty has profound implications for semantic theory:
- Resolved uncertainty suggests discrete semantic representations with probabilistic selection
- Unresolved uncertainty suggests gradient representations or probabilistic reasoning within fixed meanings
Different phenomena may involve different uncertainty types. As we’ll see, vagueness seems to give rise to unresolved uncertainty (the conditions of application of tall seem inherently uncertain), while factivity’s gradience is perhaps more puzzling: is it resolved uncertainty from ambiguous predicates, or unresolved uncertainty in projection itself?
Case Studies: Testing Semantic Theory at Scale
To illustrate how PDS bridges formal semantics and experimental data, we’ll examine two case studies that exemplify different aspects of the framework.
Case Study 1: Vagueness and Gradable Adjectives
Vague predicates provide an ideal starting point because everyone agrees they involve gradient uncertainty. Expressions like tall, expensive, and old lack sharp boundaries—there’s no precise height at which someone becomes tall (Lakoff 1973; Sadock 1977; Lasersohn 1999; Krifka 2007; Solt 2015).
Formal semantic theories have long recognized this gradience. Degree-based approaches (Klein 1980; Bierwisch 1989; Kamp 1975; Chris Kennedy 1999; Christopher Kennedy and McNally 2005; Christopher Kennedy 2007; Barker 2002) analyze gradable adjectives as expressing relations to contextual thresholds:
- tall is true of \(x\) iff \(\ct{height}(x) \geq d_\text{tall}\), where \(d_\text{tall}\) is a contextually supplied threshold
The threshold \(d_\text{tall}\) varies with context—what counts as tall for a basketball player differs from tall for a child. But even within a fixed context, speakers show gradient judgments about borderline cases.
This makes vagueness ideal for demonstrating how PDS works. The framework can (see the sketch following this list):

- Maintain the compositional degree-based analysis from formal semantics
- Add probability distributions over thresholds to capture gradient judgments
- Model how context shifts these distributions
- Link threshold distributions to slider scale responses
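Here is a minimal sketch of the last three ingredients in Python (all distributions, contexts, and parameter values are illustrative assumptions, not fitted to data):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def p_tall(height, mu, sigma):
    """P(height >= d_tall) with threshold d_tall ~ Normal(mu, sigma),
    the (assumed) threshold distribution for the current context."""
    return norm.cdf((height - mu) / sigma)

def slider_responses(height, mu, sigma, n, noise=0.05):
    """One simple linking hypothesis: each participant reports p_tall
    plus response noise, truncated to the slider's [0, 1] range."""
    return np.clip(rng.normal(p_tall(height, mu, sigma), noise, size=n), 0, 1)

# Context shifts the threshold distribution: the same 185 cm individual
# rated "tall for a gymnast" vs. "tall for a basketball player".
gym = slider_responses(185, mu=170, sigma=5, n=1_000)
nba = slider_responses(185, mu=198, sigma=6, n=1_000)
print(f"mean rating, gymnast context:    {gym.mean():.2f}")
print(f"mean rating, basketball context: {nba.mean():.2f}")
```

The same height receives very different gradient ratings in the two contexts, which is the qualitative pattern the degree analysis plus threshold uncertainty predicts.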
Recent experimental work reveals additional complexity. Different adjective types show distinct patterns:

- Relative adjectives (tall, wide): Maximum gradience in positive form
- Absolute adjectives (clean, dry): Different threshold distributions
- Minimum vs. maximum standard: Asymmetric patterns of imprecision
These patterns both support and refine formal theories, showing how experimental data can advance theoretical understanding. Recent years have seen partial integration into computational models (Lassiter and Goodman 2013, 2017; Qing and Franke 2014; Kao et al. 2014; Bumford and Rett 2021). We’ll show that PDS allows us to synthesize and compare these different partial approaches.
Case Study 2: Factivity and Projection
While vagueness involves expected gradience, factivity presents a puzzle. Traditional theory treats factivity as discrete—predicates either trigger presuppositions or they don’t (Kiparsky and Kiparsky 1970; Karttunen 1971).1 Yet experimental data reveals pervasive gradience.
A predicate is factive if it triggers inferences about its complement that project through entailment-canceling operators. Love appears factive because Mo left is inferable from the standard family of sentences in (10)–(12):
- Jo loves that Mo left.
- Jo doesn’t love that Mo left.
- Does Jo love that Mo left?
But when White and Rawlins (2018) (discussed above) and Degen and Tonhauser (2022) collected projection judgments at scale, they found continuous variation (Xue and Onea 2011, Smith and Hall 2011, and Djärv and Bacovcin 2017 observe similar patterns). Qualitatively, Degen and Tonhauser (2022) argue that no clear line separates factive from non-factive predicates. Mean projection ratings vary continuously from pretend (lowest) to be annoyed (highest).
This gradience poses a theoretical challenge (Simons 2007; Simons et al. 2010, 2017; Tonhauser, Beaver, and Degen 2018).
Kane, Gantt, and White (2022) later showed that this gradience is likely due to task effects. They demonstrate that when one applies a clustering model that accounts for noise from various sources to these data, many of the standard subclasses of factives emerge. Some of these subclasses (e.g., the cognitive factives, which Karttunen (1971) observes do not always trigger factive inferences) appear themselves to be associated with non-necessary factive inferences.
In this case study, we’ll focus on understanding what gives rise to this gradience. We’ll consider two hypotheses that PDS allows us to state precisely and test against the data collected by Degen and Tonhauser (2021), which uses the same experimental paradigm as Degen and Tonhauser (2022):
The Fundamental Discreteness Hypothesis: Factivity remains discrete; gradience reflects:

- Multiple predicate senses (factive and non-factive variants)
- Structural ambiguity affecting projection (Varlokosta 1994; Giannakidou 1998, 1999, 2009; Roussou 2010; Farudi 2007; Abrusán 2011; Kastner 2015; Ozyildiz 2017)
- Contextual variation in whether complements are at-issue (Simons et al. 2017; Roberts and Simons 2024; Qing, Goodman, and Lassiter 2016)
The Fundamental Gradience Hypothesis: No discrete factivity property exists. Gradient patterns reflect different degrees to which predicates support complement truth inferences (Tonhauser, Beaver, and Degen 2018).
PDS allows us to implement both hypotheses formally and test their predictions against fine-grained response distributions—not just means, but entire judgment patterns including multimodality that might indicate mixture distributions. We’ll show how this approach can be applied to judgment data aimed at capturing factivity using various experimental paradigms (Tonhauser 2016; Djärv and Bacovcin 2017; Djärv, Zehr, and Schwarz 2018; White and Rawlins 2018; White et al. 2018; White 2021; Degen and Tonhauser 2021, 2022; Jeong 2021; Kane, Gantt, and White 2022).
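As a toy illustration of that model-comparison logic (in Python, with all parameters invented; this is not the PDS implementation itself), we can simulate responses under a discrete two-reading mixture and check that distribution-level fit, unlike the mean, favors the mixture:

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(0)

# Simulated slider responses for one predicate under the discreteness
# hypothesis: a mixture of a "factive" reading (responses near 1) and a
# "non-factive" reading (responses near 0). Values are illustrative.
x = np.concatenate([rng.beta(8, 2, size=120),   # factive resolution
                    rng.beta(2, 8, size=80)])   # non-factive resolution

def beta_loglik(data, a, b):
    return beta.logpdf(data, a, b).sum()

# Model 1: a single Beta distribution, fit by method of moments
# (good enough for the illustration).
m, v = x.mean(), x.var()
c = m * (1 - m) / v - 1
ll_single = beta_loglik(x, m * c, (1 - m) * c)

# Model 2: the two-component mixture (components assumed known here;
# a real analysis would fit them, e.g. with EM).
w = 120 / 200
ll_mix = np.log(w * beta.pdf(x, 8, 2) + (1 - w) * beta.pdf(x, 2, 8)).sum()

# BIC = k * log(n) - 2 * log-likelihood; lower is better.
bic_single = 2 * np.log(len(x)) - 2 * ll_single
bic_mix = 5 * np.log(len(x)) - 2 * ll_mix
print(f"BIC single Beta: {bic_single:.1f}   BIC mixture: {bic_mix:.1f}")
```

On means alone the two models are nearly indistinguishable; the multimodal shape of the response distribution is what carries the evidence.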
The Need for New Frameworks
These case studies illustrate what we need from a framework connecting formal semantics to experimental data:
Maintain Compositionality: Theories must derive complex meanings compositionally, preserving insights from decades of formal semantic research. We cannot abandon compositionality just because judgments are gradient.
Model Uncertainty Explicitly: The framework must represent both types of uncertainty—resolved ambiguities and unresolved gradience—and show how they interact during interpretation.
Make Linking Hypotheses Precise: We need explicit theories of how semantic representations produce behavioral responses. What cognitive processes intervene between computing a meaning and moving a slider?
Enable Quantitative Evaluation: Theories must make testable predictions about response distributions, not just average ratings. Different theories should be comparable using standard statistical metrics.
As we’ll see in the next section, existing computational approaches like Rational Speech Act (RSA) models attempt to bridge formal semantics with probabilistic reasoning (Frank and Goodman 2012; Goodman and Stuhlmüller 2013). While valuable, these approaches face challenges in maintaining the modularity that makes formal semantic theories powerful. This motivates the development of Probabilistic Dynamic Semantics—a framework that preserves semantic insights while adding the probabilistic tools needed to model gradient behavioral data.
References
Footnotes
We’ll spend a lot of time on Day 4 saying exactly what we mean by discrete here. Karttunen (1971), of course, classically argues that there are predicates that sometimes trigger presuppositions and sometimes don’t. For our purposes, we’ll say that this behavior is discrete in the sense that it’s more like ambiguity than vagueness. That is, we’ll show that uncertainty around factivity displays the hallmarks of resolved uncertainty.↩︎