Modeling vagueness
\[ \newcommand{\expr}[3]{\begin{array}{c} #1 \\ \bbox[lightblue,5px]{#2} \end{array} ⊢ #3} \newcommand{\ct}[1]{\bbox[font-size: 0.8em]{\mathsf{#1}}} \newcommand{\updct}[1]{\ct{upd\_#1}} \newcommand{\abbr}[1]{\bbox[transform: scale(0.95)]{\mathtt{#1}}} \newcommand{\pure}[1]{\bbox[border: 1px solid orange]{\bbox[border: 4px solid transparent]{#1}}} \newcommand{\return}[1]{\bbox[border: 1px solid black]{\bbox[border: 4px solid transparent]{#1}}} \def\P{\mathtt{P}} \def\Q{\mathtt{Q}} \def\True{\ct{T}} \def\False{\ct{F}} \def\ite{\ct{if\_then\_else}} \def\Do{\abbr{do}} \]
Our next model addresses how speakers reason about the likelihood that gradable adjectives apply. We’ll start with a realistic model of the vagueness data, of the kind one might design to analyze that dataset. As with the norming model, we’ll build up the model block by block, explaining each line. Then we’ll turn to how we might analyze this experiment using PDS, showing which components of the model correspond to the PDS kernel model and which are extensions by the analyst.
Understanding the experimental setup
Before diving into the Stan code, let’s consider how we’ll represent the vagueness data. Here’s a sample:
participant | item | item_number | adjective | adjective_number | scale_type | scale_type_number | condition | condition_number | response |
---|---|---|---|---|---|---|---|---|---|
1 | 9_high | 25 | quiet | 9 | absolute | 1 | high | 1 | 0.82 |
1 | 4_low | 11 | wide | 4 | relative | 2 | low | 2 | 0.34 |
1 | 5_mid | 15 | deep | 5 | relative | 2 | mid | 3 | 0.77 |
Each row represents one likelihood judgment:
- participant: Which person made this judgment
- item: A unique identifier combining adjective and condition (e.g., “9_high” for quiet in the high condition)
- adjective: The gradable adjective being tested
- condition: Whether this is a high/mid/low standard context
- response: The participant’s likelihood judgment (0-1)
The key difference from norming: we now distinguish between items (specific adjective-condition pairs) and adjectives themselves. This structure lets us model properties that belong to adjectives (like how context-sensitive they are) separately from properties of specific items—a distinction motivated by the semantic theory.
The structure of the Stan program
Let’s build up our vagueness model block by block. Since you’re already familiar with Stan’s architecture from the norming model, we’ll focus on what’s different for modeling likelihood judgments.
The data block
The data block for vagueness extends the norming structure with adjective-level information:
data {
int<lower=1> N_item; // number of items (adjective × condition)
int<lower=1> N_adjective; // number of unique adjectives
int<lower=1> N_participant; // number of participants
int<lower=1> N_data; // responses in (0,1)
int<lower=1> N_0; // boundary responses at 0
int<lower=1> N_1; // boundary responses at 1
// Response data
vector<lower=0, upper=1>[N_data] y; // slider responses
// NEW: Mapping structure
array[N_item] int<lower=1, upper=N_adjective> item_adj; // which adjective for each item
// Indexing arrays for responses
array[N_data] int<lower=1, upper=N_item> item;
array[N_0] int<lower=1, upper=N_item> item_0;
array[N_1] int<lower=1, upper=N_item> item_1;
array[N_data] int<lower=1, upper=N_adjective> adjective;
array[N_0] int<lower=1, upper=N_adjective> adjective_0;
array[N_1] int<lower=1, upper=N_adjective> adjective_1;
array[N_data] int<lower=1, upper=N_participant> participant;
array[N_0] int<lower=1, upper=N_participant> participant_0;
array[N_1] int<lower=1, upper=N_participant> participant_1;
}
The key addition is the adjective-level structure. We need to track both items (e.g., “tall_high”) and adjectives (e.g., “tall”) because our semantic theory posits that context-sensitivity is a property of adjectives, not individual items.
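The split between interior responses and boundary responses can be sketched outside Stan. The following is a minimal illustration, not the preprocessing actually used for this dataset: the `Row` type, its field names, and `splitRows` are hypothetical, and the last two sample rows are invented so that each group is non-empty.

```haskell
import Data.List (partition)

-- One row of long-format data (hypothetical type): participant,
-- item number, adjective number, and slider response in [0, 1].
data Row = Row
  { rParticipant :: Int
  , rItem        :: Int
  , rAdjective   :: Int
  , rResponse    :: Double
  } deriving (Show, Eq)

-- Split rows into interior responses, exact 0s, and exact 1s,
-- preserving the original order within each group.
splitRows :: [Row] -> ([Row], [Row], [Row])
splitRows rows = (interior, zeros, ones)
  where
    (zeros, rest)    = partition ((== 0) . rResponse) rows
    (ones, interior) = partition ((== 1) . rResponse) rest

sampleRows :: [Row]
sampleRows =
  [ Row 1 25 9 0.82
  , Row 1 11 4 0.34
  , Row 1 15 5 0.77
  , Row 2 25 9 1.0   -- invented boundary response, for illustration only
  , Row 2 11 4 0.0   -- invented boundary response, for illustration only
  ]

main :: IO ()
main = do
  let (interior, zeros, ones) = splitRows sampleRows
  putStrLn ("N_data = " ++ show (length interior))
  putStrLn ("N_0    = " ++ show (length zeros))
  putStrLn ("N_1    = " ++ show (length ones))
  putStrLn ("item   = " ++ show (map rItem interior))
  putStrLn ("item_0 = " ++ show (map rItem zeros))
  putStrLn ("item_1 = " ++ show (map rItem ones))
```

Each group then supplies its own index arrays (item, adjective, participant), which is why the data block declares three parallel versions of every mapping.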
The parameters block
The parameters capture both semantic quantities and statistical variation:
parameters {
// SEMANTIC PARAMETERS
// Each item has a degree on its scale
vector<lower=0, upper=1>[N_item] d;
// Global vagueness: how fuzzy are threshold comparisons?
real<lower=0> sigma_guess;
// Adjective-specific context sensitivity
vector<lower=0>[N_adjective] spread;
// PARTICIPANT VARIATION
// How much participants vary in their thresholds
real<lower=0> sigma_epsilon_mu_guess;
// Each participant's standardized deviation
vector[N_participant] z_epsilon_mu_guess;
// RESPONSE NOISE
real<lower=0, upper=1> sigma_e;
// CENSORED DATA
array[N_0] real<upper=0> y_0; // latent values for 0s
array[N_1] real<lower=1> y_1; // latent values for 1s
}
The parameters include:
- d: The degree each item has on its scale (e.g., how tall basketball players are)
- sigma_guess: Global vagueness parameter controlling threshold fuzziness
- spread: How much each adjective’s standard shifts across contexts
The transformed parameters block
This block computes the semantic judgments from our parameters:
transformed parameters {
// Convert standardized participant effects to natural scale
vector[N_participant] epsilon_mu_guess = sigma_epsilon_mu_guess * z_epsilon_mu_guess;
// STEP 1: Set up base thresholds for each item
vector[N_item] mu_guess0;
// This assumes our data has 3 conditions per adjective in order:
// high (index 1), low (index 2), mid (index 3)
for (i in 0:(N_adjective-1)) {
// High condition: positive threshold shift
mu_guess0[3 * i + 1] = spread[i + 1];
// Low condition: negative threshold shift
mu_guess0[3 * i + 2] = -spread[i + 1];
// Mid condition: no shift (baseline)
mu_guess0[3 * i + 3] = 0;
}
// STEP 2: Transform thresholds to probability scale
vector<lower=0, upper=1>[N_data] mu_guess;
vector<lower=0, upper=1>[N_0] mu_guess_0;
vector<lower=0, upper=1>[N_1] mu_guess_1;
// STEP 3: Compute predicted responses
vector<lower=0, upper=1>[N_data] response_rel;
vector<lower=0, upper=1>[N_0] response_rel_0;
vector<lower=0, upper=1>[N_1] response_rel_1;
// For each response in (0,1)
for (i in 1:N_data) {
// Add participant adjustment to base threshold
real threshold_logit = mu_guess0[item[i]] + epsilon_mu_guess[participant[i]];
// Convert from logit scale to probability scale
mu_guess[i] = inv_logit(threshold_logit);
// KEY SEMANTIC COMPUTATION:
// P(adjective applies) = P(degree > threshold)
// Using normal CDF for smooth threshold crossing
response_rel[i] = 1 - normal_cdf(d[item[i]] | mu_guess[i], sigma_guess);
}
// Repeat for censored data
for (i in 1:N_0) {
mu_guess_0[i] = inv_logit(mu_guess0[item_0[i]] + epsilon_mu_guess[participant_0[i]]);
response_rel_0[i] = 1 - normal_cdf(d[item_0[i]] | mu_guess_0[i], sigma_guess);
}
for (i in 1:N_1) {
mu_guess_1[i] = inv_logit(mu_guess0[item_1[i]] + epsilon_mu_guess[participant_1[i]]);
response_rel_1[i] = 1 - normal_cdf(d[item_1[i]] | mu_guess_1[i], sigma_guess);
}
}
The crucial line is response_rel[i] = 1 - normal_cdf(d[item[i]] | mu_guess[i], sigma_guess). This implements the likelihood that the adjective applies, using a smooth threshold crossing via the normal CDF. We’ll return to this shortly.
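The index arithmetic that fills mu_guess0 and the logit transform that follows it can be mirrored in a few lines of Haskell. This is only a sketch under the same assumption the Stan comments state (three items per adjective, ordered high, low, mid); the function names and the spread values are hypothetical.

```haskell
-- Logistic function, as in Stan's inv_logit.
invLogit :: Double -> Double
invLogit x = 1 / (1 + exp (negate x))

-- Base thresholds on the logit scale, three per adjective in the
-- order high, low, mid. This mirrors the loop that fills mu_guess0.
baseThresholds :: [Double] -> [Double]
baseThresholds = concatMap (\s -> [s, negate s, 0])

-- Threshold on the (0, 1) scale for a 1-based item index and a
-- participant offset on the logit scale.
threshold :: [Double] -> Int -> Double -> Double
threshold spread item epsilon =
  invLogit (baseThresholds spread !! (item - 1) + epsilon)

main :: IO ()
main = do
  let spread = [2.0, 0.5]            -- hypothetical spreads for two adjectives
  print (baseThresholds spread)      -- [2.0,-2.0,0.0,0.5,-0.5,0.0]
  print (threshold spread 3 0)       -- mid condition, no offset: 0.5
  print (threshold spread 1 (-0.3))  -- high condition with a participant offset
```

Because the layout is positional, the item numbering in the data has to follow the same high, low, mid order for every adjective; otherwise the shifts would be attached to the wrong items.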
The model block
The model block specifies our priors and likelihood:
model {
// PRIORS
// Vagueness: smaller values = more precise thresholds
sigma_guess ~ exponential(5);
// Context effects: how much standards shift
spread ~ exponential(1);
// Participant variation
sigma_epsilon_mu_guess ~ exponential(1);
z_epsilon_mu_guess ~ std_normal();
// LIKELIHOOD
// Observed responses are noisy measurements of semantic judgments
for (i in 1:N_data) {
y[i] ~ normal(response_rel[i], sigma_e);
}
// Censored responses
for (i in 1:N_0) {
y_0[i] ~ normal(response_rel_0[i], sigma_e);
}
for (i in 1:N_1) {
y_1[i] ~ normal(response_rel_1[i], sigma_e);
}
}
The likelihood connects our semantic computation (response_rel) to the observed data through a measurement model.
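One way to read the constraints on y_0 and y_1 is as a censoring scheme: the slider can only report values in [0, 1], so a latent judgment at or beyond a boundary is recorded at that boundary. The sketch below illustrates that reading only; `report` is a hypothetical name and the latent values are made up.

```haskell
-- A slider can only report values in [0, 1], so a latent judgment
-- outside that range is recorded at the nearest boundary.
report :: Double -> Double
report = max 0 . min 1

main :: IO ()
main = mapM_ (print . report) [-0.12, 0.0, 0.43, 1.0, 1.07]
```

Under this reading, an observed 0 only tells us the latent value was at most 0, which is what the declaration of y_0 with an upper bound of 0 expresses (and likewise for y_1 with a lower bound of 1).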
The complete model
Here’s our complete vagueness model—the exact model we’ll use for analysis:
data {
int<lower=1> N_item; // number of items
int<lower=1> N_adjective; // number of adjectives
int<lower=1> N_participant; // number of participants
int<lower=1> N_data; // number of data points in (0, 1)
int<lower=1> N_0; // number of 0s
int<lower=1> N_1; // number of 1s
vector<lower=0, upper=1>[N_data] y; // response in (0, 1)
array[N_item] int<lower=1, upper=N_adjective> item_adj; // map from items to adjectives
array[N_data] int<lower=1, upper=N_item> item; // map from data points to items
array[N_0] int<lower=1, upper=N_item> item_0; // map from 0s to items
array[N_1] int<lower=1, upper=N_item> item_1; // map from 1s to items
array[N_data] int<lower=1, upper=N_adjective> adjective; // map from data points to adjectives
array[N_0] int<lower=1, upper=N_adjective> adjective_0; // map from 0s to adjectives
array[N_1] int<lower=1, upper=N_adjective> adjective_1; // map from 1s to adjectives
array[N_data] int<lower=1, upper=N_participant> participant; // map from data points to participants
array[N_0] int<lower=1, upper=N_participant> participant_0; // map from 0s to participants
array[N_1] int<lower=1, upper=N_participant> participant_1; // map from 1s to participants
}
parameters {
//
// FIXED EFFECTS
//
// items:
vector<lower=0, upper=1>[N_item] d;
real<lower=0> sigma_guess;
vector<lower=0>[N_adjective] spread;
//
// RANDOM EFFECTS
//
real<lower=0> sigma_epsilon_mu_guess; // global scaling factor
vector[N_participant] z_epsilon_mu_guess; // by-participant z-scores
real<lower=0, upper=1> sigma_e;
//
// CENSORED DATA
//
array[N_0] real<upper=0> y_0;
array[N_1] real<lower=1> y_1;
}
transformed parameters {
vector[N_participant] epsilon_mu_guess;
vector[N_item] mu_guess0;
vector<lower=0, upper=1>[N_data] mu_guess;
vector<lower=0, upper=1>[N_0] mu_guess_0;
vector<lower=0, upper=1>[N_1] mu_guess_1;
vector<lower=0, upper=1>[N_data] response_rel;
vector<lower=0, upper=1>[N_0] response_rel_0;
vector<lower=0, upper=1>[N_1] response_rel_1;
//
// DEFINITIONS
//
// non-centered parameterization of the participant random intercepts:
epsilon_mu_guess = sigma_epsilon_mu_guess * z_epsilon_mu_guess;
for (i in 0:N_adjective-1) {
mu_guess0[3 * i + 1] = spread[i + 1];
mu_guess0[3 * i + 2] = -spread[i + 1];
mu_guess0[3 * i + 3] = 0;
}
for (i in 1:N_data) {
mu_guess[i] = inv_logit(mu_guess0[item[i]] + epsilon_mu_guess[participant[i]]);
response_rel[i] = 1 - normal_cdf(d[item[i]] | mu_guess[i], sigma_guess);
}
for (i in 1:N_0) {
mu_guess_0[i] = inv_logit(mu_guess0[item_0[i]] + epsilon_mu_guess[participant_0[i]]);
response_rel_0[i] = 1 - normal_cdf(d[item_0[i]] | mu_guess_0[i], sigma_guess);
}
for (i in 1:N_1) {
mu_guess_1[i] = inv_logit(mu_guess0[item_1[i]] + epsilon_mu_guess[participant_1[i]]);
response_rel_1[i] = 1 - normal_cdf(d[item_1[i]] | mu_guess_1[i], sigma_guess);
}
}
model {
//
// FIXED EFFECTS
//
// scale estimate standard deviations:
sigma_guess ~ exponential(5);
// scale estimate spread
spread ~ exponential(1);
//
// RANDOM EFFECTS
//
// by-participant random intercepts:
sigma_epsilon_mu_guess ~ exponential(1);
z_epsilon_mu_guess ~ std_normal();
//
// LIKELIHOOD
//
for (i in 1:N_data) {
y[i] ~ normal(response_rel[i], sigma_e);
}
for (i in 1:N_0) {
y_0[i] ~ normal(response_rel_0[i], sigma_e);
}
for (i in 1:N_1) {
y_1[i] ~ normal(response_rel_1[i], sigma_e);
}
}
generated quantities {
vector[N_data] ll; // log-likelihoods (needed for WAIC/PSIS calculations)
// definition:
for (i in 1:N_data) {
ll[i] = normal_lpdf(
y[i] |
response_rel[i],
sigma_e
);
}
}
This model for vagueness treats each item as having a degree on its scale, with participants making likelihood judgments based on comparing these degrees to contextually shifted thresholds. The vagueness parameter controls how fuzzy these comparisons are.
PDS-to-Stan
So what components of the above model are derived from PDS? To answer this, we need to define our PDS model of the likelihood judgment task itself. Here it is:
-- From Grammar.Parser and Grammar.Lexica.SynSem.Adjectives
expr1 = ["jo", "is", "a", "soccer", "player"]
expr2 = ["how", "likely", "that", "jo", "is", "tall"]
s1 = getSemantics @Adjectives 0 expr1
q1 = getSemantics @Adjectives 0 expr2
discourse = ty tau $ assert s1 >>> ask q1
likelihoodExample = asTyped tau (betaDeltaNormal deltaRules . adjectivesRespond likelihoodPrior) discourse
This code:
1. Asserts that Jo is a soccer player (updating the common ground)
2. Asks “how likely (is it) that Jo is tall?” using the likelihood operator
3. Applies beta and delta reduction rules via betaDeltaNormal
4. Uses likelihoodPrior to generate prior distributions
5. Applies adjectivesRespond to specify the response function
The PDS implementation: Gradable adjectives
Now that we understand Stan’s structure and how to translate from PDS, let’s look at how we might capture vagueness. For this, we’ll need to add to our lexicon from the previous section a new denotation for the adjective tall and the wh-word how. We’ll also need an entry for likely.
instance Interpretation Adjectives SynSem where
  combineR = Convenience.combineR
  combineL = Convenience.combineL

  lexica = [lex]
    where lex = \case
            ...
            "tall" -> [ SynSem {
                          syn = AP :\: Deg,
                          sem = ty tau (purePP (lam d (lam x (lam i (sCon "(≥)" @@ (sCon "height" @@ i @@ x) @@ d)))))
                          }
                      , SynSem {
                          syn = AP,
                          sem = ty tau (lam s (purePP (lam x (lam i (sCon "(≥)" @@ (sCon "height" @@ i @@ x) @@ (sCon "d_tall" @@ s)))) @@ s))
                          } ]
            ...
            "likely" -> [ SynSem {
                            syn = S :\: Deg :/: S,
                            sem = ty tau (lam s (purePP (lam p (lam d (lam _' (sCon "(≥)" @@ (Pr (let' i (CG s) (Return (p @@ i)))) @@ d)))) @@ s))
                            } ]
            "how" -> [ SynSem {
                         syn = Qdeg :/: (S :/: AP) :/: (AP :\: Deg),
                         sem = ty tau (purePP (lam x (lam y (lam z (y @@ (x @@ z))))))
                         }
                     , SynSem {
                         syn = Qdeg :/: (S :\: Deg),
                         sem = ty tau (purePP (lam x x))
                         } ]
            ...
The key components of the gradable adjective entries are:
- sCon "height" represents an \(e \rightarrow r\) function from individuals to their heights
- sCon "d_tall" represents the contextual threshold from the discourse state
- sCon "(≥)" represents the comparison relation
The lexical entry for tall is related to the one we talked about in the last section but has a few key differences:
- Syntactic type: AP indicates this is an adjective phrase (in contrast to the degree-question version AP :\: Deg)
- Semantic computation: The meaning is a function that:
  - Takes a discourse state s (containing threshold information)
  - Returns a function from entities x to propositions (functions from indices i to truth values)
  - The proposition is true when the entity’s height meets or exceeds the contextual threshold
- Semantic components:
  - \(\ct{height}\): A function from indices to entity-to-degree mappings (type: \(\iota \to e \to r\))
  - \(\ct{d\_tall}\): Extracts the threshold for “tall” from the discourse state (type: \(\sigma \to r\))
  - \(\ct{(≥)}\): Comparison operator (type: \(r \to r \to t\))
This implements degree-based semantics where gradable adjectives denote relations between degrees and contextually determined thresholds. The use of the discourse state for threshold storage captures the context-sensitivity of standards.
Important also is the lexical entry for likely:
- Syntactic type: S :\: Deg :/: S indicates this takes a sentence and a degree to give back a sentence (this syntactic type is not completely realistic, but it serves our purposes)
- Semantic computation: The meaning is a function that:
  - Takes a discourse state s (which supplies the common ground)
  - Returns a function from propositions p to functions from degrees d to propositions
  - The resulting proposition is true (at any index) when the probability of the original proposition given the common ground of s meets or exceeds the degree d
- Semantic components:
  - \(\ct{Pr}\): A function from probability distributions over truth values to real numbers that computes the probability of True
  - \(\ct{CG}\): Grabs the common ground of the current state, a value of type \(\P ι\)
  - \(\ct{(≥)}\): Comparison operator (type: \(r \to r \to t\))
Working through likelihood judgments
Having seen the PDS code for likelihood judgments, we now trace through how delta rules transform these complex λ-terms into forms suitable for Stan compilation. Delta rules, as introduced in Lambda.Delta, are partial functions from terms to terms that implement semantic computations.
For likelihood judgments like how likely (is it) that Jo is tall?, the compositional semantics first produces the embedded proposition Jo is tall:
(sCon "(≥)" @@ (sCon "height" @@ i @@ j) @@ (sCon "d_tall" @@ s))
\[\ct{(≥)}(\ct{height}(i)(\ct{j}))(\ct{d\_tall}(s))\]
This undergoes delta reduction using the states rule from Lambda.Delta:
-- From Lambda.Delta (lines 167-183)
states :: DeltaRule
states = \case
  CG (UpdCG cg s) -> Just cg
  CG (UpdDTall _ s) -> Just (CG s)
  DTall (UpdDTall d _) -> Just d
  DTall (UpdCG _ s) -> Just (DTall s)
  _ -> Nothing
Applied to extract the threshold for “tall”:
(sCon "(≥)" @@ (sCon "height" @@ i @@ j) @@ d)
\[\ct{(≥)}(\ct{height}(i)(\ct{j}))(d)\]
Next, the indices rule extracts Jo’s height:
-- From Lambda.Delta (lines 106-120)
indices :: DeltaRule
indices = \case
  Height (UpdHeight p _) -> Just p
  _ -> Nothing
This yields:
(sCon "(≥)" @@ h @@ d)
\[\ct{(≥)}(h)(d)\]
where \(h\) is Jo’s height.
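The two reductions above can be replayed on a toy term language. The sketch below is not the PDS implementation: the `Term` type, the `Start` constructor, `rules`, and `normalize` are hypothetical stand-ins, kept just large enough to show delta rules as partial functions that are tried until none applies.

```haskell
import Control.Applicative ((<|>))

-- Toy term language (hypothetical; far smaller than the PDS one).
data Term
  = Var String
  | Start                  -- an initial state or index with no updates
  | UpdCG Term Term        -- state updated with a common ground
  | UpdDTall Term Term     -- state updated with a threshold for "tall"
  | UpdHeight Term Term    -- index updated with a height function
  | CG Term
  | DTall Term
  | Height Term
  | App Term Term
  | GE Term Term
  deriving (Show, Eq)

-- A delta rule is a partial function from terms to terms.
type DeltaRule = Term -> Maybe Term

states :: DeltaRule
states t = case t of
  CG (UpdCG cg _)      -> Just cg
  CG (UpdDTall _ s)    -> Just (CG s)
  DTall (UpdDTall d _) -> Just d
  DTall (UpdCG _ s)    -> Just (DTall s)
  _                    -> Nothing

indices :: DeltaRule
indices t = case t of
  Height (UpdHeight p _) -> Just p
  _                      -> Nothing

-- Try each rule at the root of a term, first match wins.
rules :: DeltaRule
rules t = states t <|> indices t

-- Normalize subterms first, then keep rewriting at the root.
normalize :: Term -> Term
normalize t = maybe t' normalize (rules t')
  where
    t' = case t of
      UpdCG a b     -> UpdCG (normalize a) (normalize b)
      UpdDTall a b  -> UpdDTall (normalize a) (normalize b)
      UpdHeight a b -> UpdHeight (normalize a) (normalize b)
      CG a          -> CG (normalize a)
      DTall a       -> DTall (normalize a)
      Height a      -> Height (normalize a)
      App a b       -> App (normalize a) (normalize b)
      GE a b        -> GE (normalize a) (normalize b)
      other         -> other

-- "Jo is tall" at an index i and state s, both built from updates.
example :: Term
example = GE (App (Height i) (Var "j")) (DTall s)
  where
    s = UpdCG (Var "cg") (UpdDTall (Var "d") Start)
    i = UpdHeight (Var "heightFn") Start

main :: IO ()
main = print (normalize example)
```

In this toy version the threshold lookup passes through the common-ground update before reaching the stored value, and the height lookup returns the stored height function, which stays applied to j.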
The probabilities delta rule
The comparison stays symbolic: it cannot be reduced without concrete values. This is where the probabilities rule becomes crucial. Here is the full probabilities rule:
-- From Lambda.Delta (lines 156-164)
probabilities :: DeltaRule
probabilities = \case
  Pr (Return Tr) -> Just 1
  Pr (Return Fa) -> Just 0
  Pr (Bern x) -> Just x
  Pr (Disj x t u) -> Just (x * Pr t + (1 - x) * Pr u)
  Pr (Let v (Normal x y) (Return (GE t (Var v')))) | v' == v -> Just (NormalCDF x y t)
  Pr (Let v (Normal x y) (Return (GE (Var v') t))) | v' == v -> Just (NormalCDF (- x) y t)
  _ -> Nothing
This handles the case where we compute the probability that a normally distributed variable exceeds (or is exceeded by) a threshold. When both degrees and thresholds are uncertain, the system computes the appropriate probabilistic comparison.
Binding
The bind operator allows us to sequence probabilistic computations:
\[ \begin{array}{l} d \sim \ct{Normal}(\mu_d, \sigma_d) \\ h \sim \ct{Normal}(\mu_h, \sigma_h) \\ \pure{h \geq d} \end{array} \]
When we have probabilistic comparisons like height vs threshold, both drawn from distributions, the system can compute the probability that one exceeds the other. If both are normally distributed:
- \(h \sim \ct{Normal}(\mu_h, \sigma_h)\) (height)
- \(d \sim \ct{Normal}(\mu_d, \sigma_d)\) (threshold)
Then \(P(h \geq d)\) can be computed using the fact that \(h - d \sim \ct{Normal}(\mu_h - \mu_d, \sqrt{\sigma_h^2 + \sigma_d^2})\).
This property allows the compilation to Stan code that efficiently computes these probabilities:
real p = normal_cdf((mu_h - mu_d) / sqrt(sigma_h^2 + sigma_d^2) | 0, 1);
target += normal_lpdf(y | p, sigma_response);
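The closed form relies on h and d being independent, which is what lets their variances add. As a rough check of that step, the sketch below compares the closed form with a direct numerical integration of P(h ≥ t) against the density of d. All names and the example values are hypothetical, and since base has no erf, the error function uses the Abramowitz and Stegun 7.1.26 polynomial approximation.

```haskell
-- Error function via the Abramowitz-Stegun 7.1.26 polynomial
-- (absolute error around 1.5e-7).
erf :: Double -> Double
erf x
  | x < 0     = negate (erf (negate x))
  | otherwise = 1 - poly * exp (negate (x * x))
  where
    t    = 1 / (1 + 0.3275911 * x)
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
             + t * (-1.453152027 + t * 1.061405429))))

-- Standard normal CDF.
phi :: Double -> Double
phi z = 0.5 * (1 + erf (z / sqrt 2))

-- Normal density with mean mu and standard deviation sigma.
normalPdf :: Double -> Double -> Double -> Double
normalPdf mu sigma x = exp (negate (z * z) / 2) / (sigma * sqrt (2 * pi))
  where z = (x - mu) / sigma

-- Closed form for P(h >= d) with independent normal h and d.
closedForm :: Double -> Double -> Double -> Double -> Double
closedForm muH sigmaH muD sigmaD =
  phi ((muH - muD) / sqrt (sigmaH * sigmaH + sigmaD * sigmaD))

-- Midpoint-rule integral of P(h >= t) weighted by the density of d.
numeric :: Double -> Double -> Double -> Double -> Double
numeric muH sigmaH muD sigmaD =
  sum [ f (lo + (fromIntegral k + 0.5) * step) * step | k <- [0 .. n - 1] ]
  where
    n    = 4000 :: Int
    lo   = muD - 8 * sigmaD
    step = 16 * sigmaD / fromIntegral n
    f t  = normalPdf muD sigmaD t * (1 - phi ((t - muH) / sigmaH))

main :: IO ()
main = do
  -- Hypothetical values: height around 1.80, threshold around 1.75.
  print (closedForm 1.8 0.1 1.75 0.05)
  print (numeric 1.8 0.1 1.75 0.05)
```

The two printed values should agree to several decimal places; any remaining gap comes from the erf approximation and the integration grid.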
Input PDS:
expr1 = ["jo", "is", "a", "soccer", "player"]
expr2 = ["how", "likely", "that", "jo", "is", "tall"]
s1 = getSemantics @Adjectives 0 expr1
q1 = getSemantics @Adjectives 0 expr2
discourse = ty tau $ assert s1 >>> ask q1
likelihoodExample = asTyped tau (betaDeltaNormal deltaRules . adjectivesRespond likelihoodPrior) discourse
Delta reductions:
- Parse “jo is tall” (embedded in likelihood) → \(\ct{(≥)}(\ct{height}(i)(\ct{j}))(\ct{d\_tall}(s))\)
- Apply states rule → \(\ct{(≥)}(\ct{height}(i)(\ct{j}))(d)\)
- Apply indices rule → \(\ct{(≥)}(h)(d)\)
- Wrap in Pr operator for likelihood computation
- Apply probabilities rule → NormalCDF computation
Kernel output (see footnote 1):
model {
// FIXED EFFECTS
v ~ normal(0.0, 1.0);
// LIKELIHOOD
target += normal_lpdf(y | 1 - normal_cdf(v, 0.0, 1.0), sigma);
}
The PDS kernel model
The PDS system outputs the following kernel model:
model {
v ~ normal(0.0, 1.0);
target += normal_lpdf(y | 1 - normal_cdf(v, 0.0, 1.0), sigma);
}
This kernel captures the effect of the likelihood question, where v represents Jo’s height and normal_cdf implements the probability that Jo counts as tall given vagueness in the threshold.
The full model
But as we saw for the norming model and in our discussion above, reality is complicated: we need to handle multiple adjectives, context effects, participant variation, and censored data.
The full model with analyst augmentations looks like:
for (i in 1:N_data) {
mu_guess[i] = inv_logit(mu_guess0[item[i]] + epsilon_mu_guess[participant[i]]);
response_rel[i] = 1 - normal_cdf(d[item[i]] | mu_guess[i], sigma_guess); // PDS KERNEL
}
The line marked PDS KERNEL is the kernel model from PDS. Everything else in our complete model adds the statistical machinery needed for real data:
model {
// FIXED EFFECTS (analyst-added structure)
sigma_guess ~ exponential(5); // PDS vagueness parameter
spread ~ exponential(1); // analyst-added context effects
// RANDOM EFFECTS (analyst-added)
sigma_epsilon_mu_guess ~ exponential(1);
z_epsilon_mu_guess ~ std_normal();
// LIKELIHOOD
for (i in 1:N_data) {
// Wraps PDS computation
y[i] ~ normal(response_rel[i], sigma_e);
}
}
The transformed parameters block computes the semantic judgment (PDS kernel), while the model block adds measurement noise and hierarchical structure.
How the model components map to semantic theory
Let’s trace through a specific example to see how this model works:
1. Item degree: Suppose we’re modeling “tall” in the high condition. The parameter d[item["tall_high"]] might be 0.85, representing that basketball players (high condition) have high degrees on the height scale.
2. Adjective spread: The parameter spread["tall"] might be 2.0, meaning “tall” is highly context-sensitive: its threshold shifts dramatically between conditions.
3. Threshold computation:
   - Base threshold (logit scale): mu_guess0["tall_high"] = spread["tall"] = 2.0
   - Participant adjustment: Say participant 5 has epsilon_mu_guess[5] = -0.3
   - Final threshold (logit): 2.0 + (-0.3) = 1.7
   - Final threshold (probability): inv_logit(1.7) ≈ 0.85
4. Semantic judgment (THE PDS KERNEL): response_rel = 1 - normal_cdf(0.85 | 0.85, sigma_guess)
   - Because the degree and the threshold coincide here, this gives ≈ 0.5 whatever the value of sigma_guess
   - sigma_guess controls how quickly the judgment moves away from 0.5 as the degree and the threshold come apart: quickly if sigma_guess = 0.1 (precise threshold), more gradually if sigma_guess = 0.3 (vague threshold)
5. Response generation: The participant’s actual slider response is a noisy measurement of this semantic judgment, with noise sigma_e.
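The numbers in this walkthrough can be reproduced with a short script. This is a sketch under stated assumptions rather than part of the model: the function names are hypothetical, d = 0.95 is an extra value added purely for contrast, and the error function is the Abramowitz and Stegun 7.1.26 approximation because base has no erf.

```haskell
-- Logistic function, as in Stan's inv_logit.
invLogit :: Double -> Double
invLogit x = 1 / (1 + exp (negate x))

-- Error function via the Abramowitz-Stegun 7.1.26 polynomial
-- (absolute error around 1.5e-7).
erf :: Double -> Double
erf x
  | x < 0     = negate (erf (negate x))
  | otherwise = 1 - poly * exp (negate (x * x))
  where
    t    = 1 / (1 + 0.3275911 * x)
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
             + t * (-1.453152027 + t * 1.061405429))))

-- Normal CDF with mean mu and standard deviation sigma.
normalCdf :: Double -> Double -> Double -> Double
normalCdf mu sigma x = 0.5 * (1 + erf ((x - mu) / (sigma * sqrt 2)))

-- The expression used for response_rel in the Stan program:
-- 1 - normal_cdf(d | mu, sigma).
responseRel :: Double -> Double -> Double -> Double
responseRel d mu sigma = 1 - normalCdf mu sigma d

main :: IO ()
main = do
  let mu = invLogit (2.0 + (-0.3))   -- threshold from the worked example
  putStrLn ("threshold = " ++ show mu)
  mapM_ (\(d, sigma) ->
           putStrLn ("d = " ++ show d ++ ", sigma_guess = " ++ show sigma
                     ++ ", response_rel = " ++ show (responseRel d mu sigma)))
        [ (d, sigma) | d <- [0.85, 0.95], sigma <- [0.1, 0.3] ]
```

With d = 0.85 both settings of sigma_guess print a value close to 0.5, while with d = 0.95 the smaller sigma_guess moves the value much further from 0.5 than the larger one does.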
This vagueness model extends our baseline by capturing threshold uncertainty through probabilistic computation. The kernel represents the core semantic judgment—comparing degrees to thresholds—while the augmentations handle experimental realities.
data {
// Basic counts
int<lower=1> N_item; // number of items (adjective × condition)
int<lower=1> N_adjective; // number of unique adjectives
int<lower=1> N_participant; // number of participants
int<lower=1> N_data; // responses in (0,1)
int<lower=1> N_0; // boundary responses at 0
int<lower=1> N_1; // boundary responses at 1
// Response data
vector<lower=0, upper=1>[N_data] y; // slider responses
// NEW: Mapping structure
array[N_item] int<lower=1, upper=N_adjective> item_adj; // which adjective for each item
// Indexing arrays for responses
array[N_data] int<lower=1, upper=N_item> item;
array[N_0] int<lower=1, upper=N_item> item_0;
array[N_1] int<lower=1, upper=N_item> item_1;
array[N_data] int<lower=1, upper=N_adjective> adjective;
array[N_0] int<lower=1, upper=N_adjective> adjective_0;
array[N_1] int<lower=1, upper=N_adjective> adjective_1;
array[N_data] int<lower=1, upper=N_participant> participant;
array[N_0] int<lower=1, upper=N_participant> participant_0;
array[N_1] int<lower=1, upper=N_participant> participant_1;
}
parameters {
// SEMANTIC PARAMETERS
// Each item has a degree on its scale
vector<lower=0, upper=1>[N_item] d;
// Global vagueness: how fuzzy are threshold comparisons?
real<lower=0> sigma_guess;
// Adjective-specific context sensitivity
vector<lower=0>[N_adjective] spread;
// PARTICIPANT VARIATION
// How much participants vary in their thresholds
real<lower=0> sigma_epsilon_mu_guess;
// Each participant's standardized deviation
vector[N_participant] z_epsilon_mu_guess;
// RESPONSE NOISE
real<lower=0, upper=1> sigma_e;
// CENSORED DATA
array[N_0] real<upper=0> y_0; // latent values for 0s
array[N_1] real<lower=1> y_1; // latent values for 1s
}
transformed parameters {
// Convert standardized participant effects to natural scale
vector[N_participant] epsilon_mu_guess = sigma_epsilon_mu_guess * z_epsilon_mu_guess;
// STEP 1: Set up base thresholds for each item
vector[N_item] mu_guess0;
// This assumes our data has 3 conditions per adjective in order:
// high (index 1), low (index 2), mid (index 3)
for (i in 0:(N_adjective-1)) {
// High condition: positive threshold shift
mu_guess0[3 * i + 1] = spread[i + 1];
// Low condition: negative threshold shift
mu_guess0[3 * i + 2] = -spread[i + 1];
// Mid condition: no shift (baseline)
mu_guess0[3 * i + 3] = 0;
}
// STEP 2: Transform thresholds to probability scale
vector<lower=0, upper=1>[N_data] mu_guess;
vector<lower=0, upper=1>[N_0] mu_guess_0;
vector<lower=0, upper=1>[N_1] mu_guess_1;
// STEP 3: Compute predicted responses
vector<lower=0, upper=1>[N_data] response_rel;
vector<lower=0, upper=1>[N_0] response_rel_0;
vector<lower=0, upper=1>[N_1] response_rel_1;
// For each response in (0,1)
for (i in 1:N_data) {
// Add participant adjustment to base threshold
real threshold_logit = mu_guess0[item[i]] + epsilon_mu_guess[participant[i]];
// Convert from logit scale to probability scale
mu_guess[i] = inv_logit(threshold_logit);
// KEY SEMANTIC COMPUTATION:
// P(adjective applies) = P(degree > threshold)
// Using normal CDF for smooth threshold crossing
response_rel[i] = 1 - normal_cdf(d[item[i]] | mu_guess[i], sigma_guess);
}
// Repeat for censored data
for (i in 1:N_0) {
mu_guess_0[i] = inv_logit(mu_guess0[item_0[i]] + epsilon_mu_guess[participant_0[i]]);
response_rel_0[i] = 1 - normal_cdf(d[item_0[i]] | mu_guess_0[i], sigma_guess);
}
for (i in 1:N_1) {
mu_guess_1[i] = inv_logit(mu_guess0[item_1[i]] + epsilon_mu_guess[participant_1[i]]);
response_rel_1[i] = 1 - normal_cdf(d[item_1[i]] | mu_guess_1[i], sigma_guess);
}
}
model {
// PRIORS
// Vagueness: smaller values = more precise thresholds
sigma_guess ~ exponential(5);
// Context effects: how much standards shift
spread ~ exponential(1);
// Participant variation
sigma_epsilon_mu_guess ~ exponential(1);
z_epsilon_mu_guess ~ std_normal();
// LIKELIHOOD
// Observed responses are noisy measurements of semantic judgments
for (i in 1:N_data) {
y[i] ~ normal(response_rel[i], sigma_e);
}
// Censored responses
for (i in 1:N_0) {
y_0[i] ~ normal(response_rel_0[i], sigma_e);
}
for (i in 1:N_1) {
y_1[i] ~ normal(response_rel_1[i], sigma_e);
}
}
generated quantities {
// Log-likelihood for model comparison
vector[N_data] ll;
for (i in 1:N_data) {
ll[i] = normal_lpdf(y[i] | response_rel[i], sigma_e);
}
// We could also compute other quantities of interest:
// - Average vagueness per adjective
// - Predicted responses for new items
// - Posterior predictive checks
}
This model for vagueness implements several key components:
- Vagueness as threshold uncertainty: The sigma_guess parameter captures the participant’s uncertainty about people’s heights
- Context sensitivity: The spread parameters capture how standards shift across contexts
- Individual differences: Participants can have systematically different thresholds
- Measurement error: Slider responses are noisy measurements of semantic judgments
The model thus operationalizes the theoretical distinctions introduced earlier while adding the statistical machinery needed for real experimental data.
Footnotes
Actual PDS output:
model { v ~ normal(0.0, 1.0); target += normal_lpdf(y | normal_cdf(v, -0.0, 1.0), sigma); }