Overview

Compositional dynamic semantic theories often model utterance meanings as maps from discourse states into sets of discourse states.1 PDS inherits this functional view of utterances but, following much work in the probabilistic semantics and pragmatics literature (van Benthem, Gerbrandy, and Kooi 2009; Lassiter 2011; Frank and Goodman 2012; Zeevat 2013; Bergen, Levy, and Goodman 2016; Lassiter and Goodman 2017, i.a.), translates this idea into a probabilistic setting: in PDS, utterances denote maps from discourse states to probability distributions over discourse states. Thus, in comparison with traditional dynamic semantics, PDS introduces a weighting on discourse states, allowing one to model preferences for certain resolutions of ambiguity over others.
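This contrast can be summarized in two type synonyms. The following is a minimal sketch; the type names are ours, for illustration, and are not those of the PDS implementation.

    -- Traditional dynamic semantics: an utterance maps a discourse
    -- state to a set of output discourse states (sets modeled as lists).
    type DynamicUpdate s = s -> [s]

    -- PDS: an utterance maps a discourse state to a probability
    -- distribution over output discourse states, for a probability monad p.
    type ProbabilisticUpdate p s = s -> p s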

Probability distributions as monadic values

In and of itself, this extension is not novel. What is more novel is that we view probability distributions as monadic values inhabiting types that arise from a probability monad (see, e.g., Giorgolo and Asudeh 2014; Bernardy et al. 2019; Grove and Bernardy 2023). We formalize this view below, but the gist is that viewing probability distributions this way allows PDS (i) to map linguistic expressions of a particular type to probability distributions over objects of that type, so that the usual compositional structure of semantic analyses is retained; thereby (ii) to compose probabilistic analyses with other analyses of, say, anaphora; and (iii) to define explicit linking models that map probability distributions over discourse states to probability distributions over judgments recorded using some response instrument.2
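To make the notion of a probability monad concrete, here is a textbook finite-distribution monad in Haskell. It is a minimal sketch written for these notes; the actual PDS implementation may define its probability monad differently.

    -- A finite probability distribution: outcomes paired with weights.
    newtype Dist a = Dist { runDist :: [(a, Double)] }

    instance Functor Dist where
      fmap f (Dist xs) = Dist [(f x, w) | (x, w) <- xs]

    instance Applicative Dist where
      pure x = Dist [(x, 1)]
      Dist fs <*> Dist xs = Dist [(f x, v * w) | (f, v) <- fs, (x, w) <- xs]

    instance Monad Dist where
      -- Sequencing (bind) multiplies the weight of each outcome by the
      -- weights of the outcomes it leads to.
      Dist xs >>= k = Dist [(y, v * w) | (x, v) <- xs, (y, w) <- runDist (k x)]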

Crucially for PDS, because probability distributions are characterized by a monad, they may themselves be stacked while retaining the properties important for semantic composition.3 That is, the types derived from a probability monad may be inhabited by distributions over familiar types of objects—entities, truth values, functions from entities to truth values, and the like—or by distributions over such distributions. This stacking can be as deep as is necessary to model the sorts of uncertainty of interest to the analyst.
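In terms of the Dist type sketched above, a stacked distribution is simply a value of type Dist (Dist a). As a toy illustration (with invented numbers), consider uncertainty about which of two biased coins is in play, where each coin itself determines a distribution over truth values:

    -- A coin with bias p: a distribution over truth values.
    coin :: Double -> Dist Bool
    coin p = Dist [(True, p), (False, 1 - p)]

    -- Outer layer: uncertainty about which coin is in play.
    -- Inner layer: each coin's own distribution over outcomes.
    whichCoin :: Dist (Dist Bool)
    whichCoin = Dist [(coin 0.5, 0.9), (coin 0.8, 0.1)]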

Two kinds of uncertainty

We argue here that at least two levels of stacking are necessary to appropriately model two kinds of interpretive uncertainty, which we refer to as resolved (or type-level) uncertainty and unresolved (or token-level) uncertainty. Resolved uncertainty is any kind of uncertainty that relates to lexical, structural, or semantic (e.g., scopal) ambiguity. For example, a polysemous word gives rise to resolved uncertainty. Based on the content of its direct object, ran in (1) seems likely to take on its locomotion sense, though it remains plausible that it has a management sense if Jo is understood to be the race’s organizer.

  1. Jo ran a race.

In contrast, unresolved uncertainty is that which is associated with an expression in view of some fixed meaning it has. Vague adjectives may give rise to unresolved uncertainty, for example, as witnessed by the vague inferences they support: the minimum degree of height that tall requires of the entities it truthfully applies to remains uncertain on any use of (2), even while the adjective’s meaning plausibly does not always vary across such uses.

  2. Jo is tall.

In general, we conceptualize unresolved uncertainty as reflecting the uncertainty that one has about a given inference at a particular point in some discourse, having fixed the meanings of the linguistic expressions.

Put slightly differently, resolved uncertainty is a property of one’s knowledge about the meanings of expressions qua expressions. Sometimes run means this; sometimes it means that. Thus, any analysis of the uncertainty about the meaning of run should capture that it is uncertainty about types of utterance act. In contrast, unresolved uncertainty encompasses any semantic uncertainty which remains, having fixed the type of utterance act—it is uncertainty pertaining to the semantically licensed inferences themselves.4

To capture this idea, our approach regards these types of uncertainty as interacting with each other in a restricted fashion by taking advantage of the fact that distributions may be stacked. Because resolved uncertainty must be resolved in order for one to draw semantically licensed inferences from uses of particular expressions, we take resolved parameters to be fixed in the computation of unresolved uncertainty. This rigid connection among sources of uncertainty is a natural consequence of structuring probabilistic reasoning in terms of stacked probability distributions.
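A schematic rendering of this idea in terms of the Dist monad sketched earlier, using (1) as an example (the probabilities are invented for illustration):

    data Sense = Locomotion | Management

    -- Resolved (type-level) uncertainty: which sense of "ran" is in play.
    senseOfRan :: Dist Sense
    senseOfRan = Dist [(Locomotion, 0.8), (Management, 0.2)]

    -- Unresolved (token-level) uncertainty, computed with the sense held
    -- fixed: e.g., the probability that some inference goes through.
    inference :: Sense -> Dist Bool
    inference Locomotion = Dist [(True, 0.9), (False, 0.1)]
    inference Management = Dist [(True, 0.3), (False, 0.7)]

    -- Stacked: the outer layer ranges over senses; each inner layer is a
    -- distribution over inferences with the sense fixed.
    stacked :: Dist (Dist Bool)
    stacked = fmap inference senseOfRan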

Discourse states

We follow a practice common in dynamic semantics by regarding discourse states as lists of parameters. We depart slightly from the usual assumption that these lists are homogeneous by treating them as potentially arbitrarily complex, i.e., heterogeneous (though see Bumford and Charlow 2022). As such, they can be structured according to a variety of models sometimes employed in formal pragmatics (e.g., Farkas and Bruce 2010). For example, we will define one parameter of this list to be a representation of the Stalnakerian common ground (or more aptly, the “context set”: Stalnaker 1978 et seq.) and another parameter to be a stack of Questions Under Discussion (QUDs: Ginzburg 1996; Roberts 2012).
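As a first sketch (with placeholder types of our own devising; the representation of indices is fleshed out just below), a discourse state might be rendered as a record with these two parameters:

    type QUD = String  -- placeholder representation of a question

    data DiscourseState = DiscourseState
      { commonGround :: Dist Index  -- the context set, as a distribution over indices
      , qudStack     :: [QUD]       -- the stack of Questions Under Discussion
      }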

We represent common grounds as probability distributions over indices encoding information about possible worlds, as well as what we call contexts. The possible world part of an index represents facts about how the (non-linguistic) world is—e.g., a particular individual’s height—while the context part encodes certain facts about lexical meaning—e.g., the identity of the height threshold conveyed by a vague adjective, such as tall (see, i.a.: Kennedy and McNally 2005; Kennedy 2007; Lassiter 2011).
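Continuing the sketch, an index can be rendered as a pair of a world part and a context part; the field names and types here are purely illustrative:

    -- Non-linguistic facts, e.g., each individual's height.
    data World = World { heightOf :: String -> Double }

    -- Facts about lexical meaning, e.g., the threshold for "tall".
    data Context = Context { tallThreshold :: Double }

    data Index = Index { world :: World, context :: Context }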

Utterances—and more broadly, discourses—map tuples of parameters onto probability distributions over new tuples of parameters. Moreover, complex linguistic acts may be sequenced; in general, the effect of multiple linguistic acts on an ongoing discourse may be computed using the sequencing operation (bind) native to the probability monad. In this sense, compositionality of interpretation obtains in PDS from the level of individual morphemes all the way up to the level of complex exchanges. For example, a discourse may consist of (i) making an assertion, which (perhaps, under a simplified model) modifies the common ground; (ii) asking a question, which adds a QUD to the top of the QUD stack; or (iii) sequencing these acts, as in the sketch below. Regardless, we require the functions encoding discourses to return probabilistic values, in order to capture their inherent uncertainty.
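The following sketch renders such a sequence in terms of the toy types above, with deliberately simplified updates; the definitions are ours, for illustration, not those of the PDS implementation:

    type Prop = Index -> Bool

    -- Asserting p conditions the common ground on p (simplified:
    -- deterministic, and the resulting weights are left unnormalized).
    assert :: Prop -> DiscourseState -> Dist DiscourseState
    assert p s = pure s { commonGround = conditioned }
      where conditioned = Dist [(i, w) | (i, w) <- runDist (commonGround s), p i]

    -- Asking a question pushes it onto the QUD stack.
    ask :: QUD -> DiscourseState -> Dist DiscourseState
    ask q s = pure s { qudStack = q : qudStack s }

    -- A two-act discourse, sequenced with the monad's bind.
    discourse :: Prop -> QUD -> DiscourseState -> Dist DiscourseState
    discourse p q s = assert p s >>= ask q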

Linking models

A linking model takes a discourse as conceived above, together with an initial probability distribution over discourse states, and links them to a distribution over responses to the current QUD. The possible responses to the QUD are determined by a data collection instrument, which could be a Likert scale, a slider scale, or something else. Furthermore, the distribution over responses is fixed by a likelihood function whose choice is constrained by the nature of the values encoded by the instrument. Thus a Bernoulli distribution for instruments that produce binary values; a categorical distribution for instruments that produce unordered, multivalued discrete responses; a linked logit distribution for instruments that produce ordered, multivalued discrete responses; and so on.
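For instance, for a binary instrument, the linking step must collapse a distribution over answers to the current QUD into the parameter of a Bernoulli response distribution. A minimal sketch in terms of the Dist type above (normalizing, since the simplified updates sketched earlier leave weights unnormalized):

    -- The Bernoulli parameter: the normalized probability mass on True.
    probTrue :: Dist Bool -> Double
    probTrue (Dist xs) = sum [w | (b, w) <- xs, b] / sum [w | (_, w) <- xs]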

Haskell

Throughout these notes, we include code snippets in the Haskell programming language to illustrate the concepts we introduce. There is a working Haskell implementation of PDS, currently undergoing further development, which can translate PDS models into minimal pieces of code in the Stan programming language for several of the example modeling cases that we discuss. Since the components of PDS are presented with their computational implementation in mind, we think it is particularly revealing to see the code itself. We therefore interleave relevant code with the prose and semantic formulae.

References

Beaver, David I. 1999. “Presupposition Accommodation: A Plea for Common Sense.” In Logic, Language, and Computation, edited by Lawrence S. Moss, Jonathan Ginzburg, and Maarten de Rijke, 2:21–44. Stanford: CSLI Publications.
———. 2001. Presupposition and Assertion in Dynamic Semantics. Studies in Logic, Language and Information. Stanford: CSLI Publications. https://semanticsarchive.net/Archive/jU1MDVmZ.
Bergen, Leon, Roger Levy, and Noah Goodman. 2016. “Pragmatic Reasoning Through Semantic Inference.” Semantics and Pragmatics 9 (May): 20. https://doi.org/10.3765/sp.9.20.
Bernardy, Jean-Philippe, Rasmus Blanck, Stergios Chatzikyriakidis, Shalom Lappin, and Aleksandre Maskharashvili. 2019. “Predicates as Boxes in Bayesian Semantics for Natural Language.” In Proceedings of the 22nd Nordic Conference on Computational Linguistics, 333–37. Turku, Finland: Linköping University Electronic Press. https://www.aclweb.org/anthology/W19-6137.
Bumford, Dylan, and Simon Charlow. 2022. “Dynamic Semantics with Static Types.” LingBuzz. https://ling.auf.net/lingbuzz/006884.
Charlow, Simon. 2019. “Where Is the Destructive Update Problem?” Semantics and Pragmatics 12 (November): 10:1–24. https://doi.org/10.3765/sp.12.10.
Farkas, Donka F., and Kim B. Bruce. 2010. “On Reacting to Assertions and Polar Questions.” Journal of Semantics 27 (1): 81–118. https://doi.org/10.1093/jos/ffp010.
Frank, Michael C., and Noah D. Goodman. 2012. “Predicting Pragmatic Reasoning in Language Games.” Science 336 (6084): 998. https://doi.org/10.1126/science.1218633.
Ginzburg, Jonathan. 1996. “Dynamics and the Semantics of Dialogue.” In Logic, Language, and Computation, edited by Jerry Seligman and Dag Westerståhl, 1:221–37. Stanford: CSLI Publications.
Giorgolo, Gianluca, and Ash Asudeh. 2014. “One Semiring to Rule Them All.” In CogSci 2014 Proceedings. https://cogsci.mindmodeling.org/2014/papers/031/.
Grove, Julian, and Jean-Philippe Bernardy. 2023. “Probabilistic Compositional Semantics, Purely.” In New Frontiers in Artificial Intelligence, edited by Katsutoshi Yada, Yasufumi Takama, Koji Mineshima, and Ken Satoh, 242–56. Lecture Notes in Computer Science. Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-36190-6_17.
Kennedy, Christopher. 2007. “Vagueness and Grammar: The Semantics of Relative and Absolute Gradable Adjectives.” Linguistics and Philosophy 30 (1): 1–45. https://doi.org/10.1007/s10988-006-9008-0.
Kennedy, Christopher, and Louise McNally. 2005. “Scale Structure, Degree Modification, and the Semantics of Gradable Predicates.” Language 81 (2): 345–81. https://doi.org/10.1353/lan.2005.0071.
Lassiter, Daniel. 2011. “Vagueness as Probabilistic Linguistic Knowledge.” In Vagueness in Communication, edited by Rick Nouwen, Robert van Rooij, Uli Sauerland, and Hans-Christian Schmitz, 127–50. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-18446-8_8.
Lassiter, Daniel, and Noah D. Goodman. 2017. “Adjectival Vagueness in a Bayesian Model of Interpretation.” Synthese 194 (10): 3801–36. https://doi.org/10.1007/s11229-015-0786-1.
Phillips, Colin, Phoebe Gaston, Nick Huang, and Hanna Muller. 2021. “Theories All the Way Down: Remarks on Theoretical and Experimental Linguistics.” In The Cambridge Handbook of Experimental Syntax, edited by Grant Goodall, 587–616. Cambridge Handbooks in Language and Linguistics. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108569620.023.
Roberts, Craige. 2012. “Information Structure: Towards an Integrated Formal Theory of Pragmatics.” Semantics and Pragmatics 5 (December): 6:1–69. https://doi.org/10.3765/sp.5.6.
Stalnaker, Robert. 1978. “Assertion.” In Pragmatics, edited by Peter Cole, 9:315–32. New York: Academic Press.
van Benthem, Johan, Jelle Gerbrandy, and Barteld Kooi. 2009. “Dynamic Update with Probabilities.” Studia Logica 93 (1): 67–96. https://doi.org/10.1007/s11225-009-9209-y.
Zeevat, Henk. 2013. “Implicit Probabilities in Update Semantics.” In The Dynamic, Inquisitive, and Visionary Life of φ, ?φ, and ⋄φ: A Festschrift for Jeroen Groenendijk, Martin Stokhof, and Frank Veltman, 319–24. Amsterdam: ILLC.

Footnotes

  1. In its distributive implementations, that is. For a discussion of distributive vs. non-distributive variants of dynamic semantics, see, e.g., Charlow (2019).↩︎

  2. This type of capability is often discussed in the experimental linguistics literature under the heading of linking hypotheses or linking assumptions (see Phillips et al. 2021). For our purposes, we define linking models to be statistical models that relate a PDS analysis (which determines a probability distribution over the inferences supported by a linguistic expression) to comprehenders’ judgments, as recorded using a particular instrument.↩︎

  3. More to the point, every monad is a functor, and functors compose; this composition is what gives rise to the “stacking”.↩︎

  4. See Beaver (1999) and Beaver (2001), which describe an analogous bifurcation of orders of pragmatic reasoning in the representation of the common ground.↩︎