12 Listening Comprehension

12.1 Task Description

Children listen to a sentence and are shown three or four pictures. They are asked to choose the picture that best represents the meaning of the sentence they heard.

12.2 Construct

The Listening Comprehension task measures grammatical comprehension of sentences. The sentences cover three broad areas of grammar: phrasal syntax, temporal relationships, and complex (modified) noun phrases.

12.3 Item Development

While many available tests use production tasks to tap into children’s grammatical knowledge in their language(s), considerably fewer use receptive tasks to measure these skills. Researchers reviewed existing receptive grammar tests, as well as the literature on easy and difficult grammatical constructions, and used both reviews to inform item design. Grammatical constructions were considered for inclusion if (1) they had the potential to differentiate low vs. high comprehension and (2) they could be tested using a receptive format (e.g., with appropriate foils). This resulted in a blueprint of selected grammatical constructions, presented below:

12.3.1 English

  • Phrasal syntax: Parsing the sentence correctly required linking the subject (‘the doer’) and object (‘the one being acted upon’) in the sentence. This included alternations in ditransitive sentences (direct object before indirect object vs. indirect object before direct object); passive constructions; subject/object relative clauses with reversible and nonreversible noun phrases; and interrogatives (direct vs. embedded questions).
  • Temporal comprehension: Comprehension required linking the order of events and ranged from easy (with events linearly matching the real-world chronological order) to difficult (mismatch). This was done using causal clauses (because/so), temporal clauses, future tense (will), and conditional clauses (if).
  • Complex noun phrases: Comprehension required identifying modified noun phrases. This was done using prepositional phrases (with the red hat), adjectives (long striped), and quantifiers (none, all).

12.3.2 Spanish

  • Phrasal syntax: Parsing the sentence correctly required linking the subject (‘the doer’) and object (‘the one being acted upon’) in the sentence. This included alternations in ditransitive sentences (direct object before indirect object vs. indirect object before direct object); passive constructions; subject/object relative clauses with reversible and nonreversible noun phrases; and interrogatives (direct vs. embedded questions).
  • Temporal comprehension: Comprehension required linking the order of events and ranged from easy (with events linearly matching the real-world chronological order) to difficult (mismatch). This was done using causal clauses (porque/para que), temporal clauses (hasta que), future tense (va a), and conditional clauses (si, mientras).
  • Complex noun phrases: Comprehension required identifying modified noun phrases. This was done using prepositional phrases (entre el árbol y la casa), adjectives (en el carro rojo), and quantifiers (todo, ninguno).

The sentences were represented by illustrations created specifically for the assessment. The following guidelines were provided for the development of the illustrations:

  • Easily Recognizable. Emphasis was placed on developing illustrations that could be easily identified.
  • Removal of Irrelevant Information. Unnecessary elements were omitted so that each image contained only the components essential to the content of the sentence.
  • Diversity Representation. The illustrations were designed to reflect various aspects of diversity, including different racial backgrounds (through variations in skin tones and hair textures) and diverse abilities (through characters who use wheelchairs, prosthetics, or hearing devices). Cultural representation was carefully considered and encompassed a range of clothing styles, skin tones, and physical features to reflect various backgrounds. The Justice, Equity, Diversity, and Inclusion (JEDI) team reviewed all the developed illustrations to enhance diversity in representation.

Dialectal considerations.

12.4 Scoring

Items are scored dichotomously in a fixed-response format: 1 point for a correct response and 0 points for an incorrect response or a non-response.
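The scoring rule can be illustrated with a minimal sketch. The function name and the use of None to represent a non-response are assumptions made for illustration; they are not taken from the assessment software.

```python
def score_response(selected_picture, correct_picture):
    """Return 1 for a correct choice, 0 for an incorrect choice or a non-response.

    A non-response is represented here as None (an assumption for this sketch).
    """
    if selected_picture is None:
        return 0
    return 1 if selected_picture == correct_picture else 0


# Example: three children respond to an item whose correct picture is number 2.
scores = [score_response(choice, 2) for choice in (2, 3, None)]
```

In this example the three responses are scored 1, 0, and 0, respectively.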

12.5 Calibration Samples

Table 12.1: Demographic Characteristics of Calibration Samples for the English and Spanish Listening Comprehension Tasks

| Characteristic | English K (N = 294) | English G1 (N = 301) | English G2 (N = 290) | Spanish K (N = 239) | Spanish G1 (N = 278) | Spanish G2 (N = 260) |
| --- | --- | --- | --- | --- | --- | --- |
| Timepoint | | | | | | |
|     Fall 2023 | 294 (100%) | 301 (100%) | 290 (100%) | 239 (100%) | 278 (100%) | 260 (100%) |
| Administration Format | | | | | | |
|     Forms | 294 (100%) | 301 (100%) | 290 (100%) | 239 (100%) | 278 (100%) | 260 (100%) |
| Race | | | | | | |
|     American/Alaskan Native | 7 (2.4%) | 3 (1.0%) | 3 (1.1%) | 2 (0.8%) | 3 (1.1%) | 4 (1.6%) |
|     Asian | 39 (13%) | 49 (16%) | 13 (4.8%) | 6 (2.5%) | 3 (1.1%) | 0 (0%) |
|     Black/African American | 29 (10.0%) | 31 (10%) | 55 (20%) | 1 (0.4%) | 0 (0%) | 0 (0%) |
|     Not reported | 44 (15%) | 49 (16%) | 20 (7.4%) | 134 (57%) | 183 (67%) | 171 (66%) |
|     Other | 77 (26%) | 50 (17%) | 13 (4.8%) | 43 (18%) | 11 (4.0%) | 20 (7.8%) |
|     White | 95 (33%) | 119 (40%) | 165 (61%) | 51 (22%) | 75 (27%) | 63 (24%) |
|     Unknown | 3 | 0 | 21 | 2 | 3 | 2 |
| Ethnicity | | | | | | |
|     Hispanic/Latin(o/a) | 133 (45%) | 127 (42%) | 170 (59%) | 215 (91%) | 255 (92%) | 243 (98%) |
|     Intentional nonreport | 6 (2.0%) | 2 (0.7%) | 0 (0%) | 1 (0.4%) | 0 (0%) | 2 (0.8%) |
|     Not Hispanic/Latin(o/a) | 155 (53%) | 172 (57%) | 116 (41%) | 20 (8.5%) | 22 (7.9%) | 2 (0.8%) |
|     Unknown | 0 | 0 | 4 | 3 | 1 | 13 |
| Gender | | | | | | |
|     Female | 145 (49%) | 139 (46%) | 144 (50%) | 128 (54%) | 138 (50%) | 139 (53%) |
|     Male | 149 (51%) | 162 (54%) | 143 (50%) | 111 (46%) | 139 (50%) | 121 (47%) |
|     Unknown | 0 | 0 | 3 | 0 | 1 | 0 |
| Home Language | | | | | | |
|     English | 213 (74%) | 225 (75%) | 189 (79%) | 27 (11%) | 33 (12%) | 16 (6.2%) |
|     Spanish | 39 (14%) | 36 (12%) | 40 (17%) | 207 (88%) | 241 (88%) | 241 (93%) |
|     Other | 36 (13%) | 38 (13%) | 10 (4.2%) | 2 (0.8%) | 1 (0.4%) | 1 (0.4%) |
|     Unknown | 6 | 2 | 51 | 3 | 3 | 2 |
| English Proficiency Label | | | | | | |
|     (Re-)Classified Proficient | 25 (10%) | 23 (7.8%) | 16 (6.7%) | 29 (13%) | 43 (16%) | 33 (14%) |
|     English Learner | 58 (23%) | 56 (19%) | 34 (14%) | 184 (83%) | 204 (74%) | 189 (79%) |
|     English-only | 165 (67%) | 215 (73%) | 188 (79%) | 9 (4.1%) | 27 (9.9%) | 16 (6.7%) |
|     Unknown | 46 | 7 | 52 | 17 | 4 | 22 |
| Ever IEP/504 | 20 (8.4%) | 27 (10%) | 27 (11%) | 20 (9.8%) | 21 (9.8%) | 15 (13%) |
|     Unknown | 56 | 43 | 51 | 35 | 63 | 143 |

12.6 Psychometric Analysis

12.6.1 Basic Item Statistics

No items were excluded from either the English or the Spanish task for low response counts (n < 90) or for lack of variance. We excluded 12 items from the English task and 14 items from the Spanish task because of low point-biserial correlations (r < 0.2). Table 12.2 summarizes the basic item characteristics, and Figure 12.1 shows the relationship between point-biserial correlations and the proportion of correct responses for each item.
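The three exclusion criteria can be sketched as a simple screening function. This is an illustration only: the thresholds come from the text above, but the function names, the order of the checks, and the use of an uncorrected item-total Pearson correlation as the point-biserial are assumptions, since the report does not specify how the total score was formed across forms.

```python
from math import sqrt

MIN_RESPONSES = 90  # items with fewer responses are excluded (n < 90)
MIN_PBIS = 0.2      # items with a lower point-biserial are excluded (r < 0.2)


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def screen_item(item_scores, total_scores):
    """Return the reason an item would be excluded, or None if it is retained.

    item_scores:  0/1 scores from the children who responded to the item
    total_scores: the same children's total scores (hypothetical input)
    """
    if len(item_scores) < MIN_RESPONSES:
        return "low response count"
    if len(set(item_scores)) < 2:
        return "no variation"
    if pearson(item_scores, total_scores) < MIN_PBIS:
        return "low point-biserial"
    return None
```

The point-biserial correlation is the Pearson correlation between a dichotomous item score and a continuous score, which is why a single correlation helper is enough here.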

Table 12.2: Basic Item Statistics Before and After Application of Exclusion Criteria, for the English and Spanish Listening Comprehension Tasks

| Characteristic | English, Before Excl. (N = 117) | English, After Excl. (N = 105) | Spanish, Before Excl. (N = 120) | Spanish, After Excl. (N = 106) |
| --- | --- | --- | --- | --- |
| No. of Responses | 171 (101) | 172 (103) | 151 (91) | 152 (90) |
| Proportion Correct | 0.81 (0.15) | 0.80 (0.14) | 0.72 (0.18) | 0.72 (0.16) |
| Point-biserial Correlation | 0.38 (0.13) | 0.41 (0.10) | 0.36 (0.12) | 0.39 (0.09) |
| Excluded (n < 90) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Excluded (pbis < .2) | 12 (10%) | 0 (0%) | 14 (12%) | 0 (0%) |
| Excluded (no variation) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |

Figure 12.1: Scatterplot Showing Point-biserial (Item-total) Correlations and Proportion of Correct Responses for the English (Panel A) and Spanish (Panel B) Listening Comprehension Tasks

12.6.2 Rasch Analysis

12.6.2.1 Item Location Estimates

Figure 12.2: Scatterplot Showing Item Location and Proportion of Correct Response for the English (Panel A) and Spanish (Panel B) Listening Comprehension Tasks

12.6.2.2 Item Fit Statistics

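Table 12.3: Frequencies of Item Misfit Categories Based on Infit/Outfit MSE Values for the English and Spanish Listening Comprehension Tasks

Rows are Outfit MSE categories; columns are Infit MSE categories.

| Outfit MSE | English: Infit A | English: Infit B | English: Infit C | English: Infit D | English: Total | Spanish: Infit A | Spanish: Infit B | Spanish: Infit C | Spanish: Infit D | Spanish: Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | 98 | 0 | 0 | 0 | 98 | 105 | 0 | 0 | 0 | 105 |
| B | 6 | 0 | 0 | 0 | 6 | 1 | 0 | 0 | 0 | 1 |
| C | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| D | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 105 | 0 | 0 | 0 | 105 | 106 | 0 | 0 | 0 | 106 |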
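For readers unfamiliar with the fit statistics tabulated above, the following is a minimal sketch of how infit and outfit mean squares are conventionally computed for one item under the dichotomous Rasch model. It is not the estimation software used for this report; the function names are hypothetical, and the cutoffs that define misfit categories A through D are not reproduced because they are not stated in this section.

```python
from math import exp


def rasch_probability(theta, b):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit, outfit) mean squares for a single item.

    responses: 0/1 scores from the persons who responded to the item
    thetas:    those persons' location estimates, in logits
    b:         the item's location estimate, in logits
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit
```

Both statistics have an expected value near 1 when the data fit the model. Outfit is more sensitive to unexpected responses from persons located far from the item, while infit gives more weight to persons located near it.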

12.6.2.3 Person Location Estimates

Figure 12.3: Scatterplot Showing Person Location Estimates (Obtained using the MLE method) and the Proportion of Correct Responses for English and Spanish Listening Comprehension Tasks
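As an illustration of maximum likelihood estimation (MLE) of a person location, the sketch below applies Newton-Raphson updates under the dichotomous Rasch model, treating the item locations as known. The function names, starting value, and convergence settings are assumptions, and returning None for zero and perfect scores is a choice made for this sketch; the report does not state how such scores were handled.

```python
from math import exp


def rasch_p(theta, b):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + exp(-(theta - b)))


def mle_theta(responses, difficulties, tol=1e-6, max_iter=50):
    """Newton-Raphson MLE of a person location, in logits.

    responses:    0/1 scores on the items the person saw
    difficulties: location estimates of those items, in logits
    """
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        return None  # the MLE is not finite for zero or perfect scores
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_p(theta, b) for b in difficulties]
        gradient = raw_score - sum(probs)              # first derivative of the log-likelihood
        information = sum(p * (1.0 - p) for p in probs)  # negative second derivative
        step = gradient / information
        theta += step
        if abs(step) < tol:
            break
    return theta
```

Because the raw score is a sufficient statistic in the Rasch model, the estimate is the value of theta at which the expected score on the administered items equals the observed raw score.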

12.6.2.4 Person Fit Statistics

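Table 12.4: Frequencies of Person Misfit Categories Based on Infit/Outfit MSE Values for the English and Spanish Listening Comprehension Tasks

Rows are Outfit MSE categories; columns are Infit MSE categories.

| Outfit MSE | English: Infit A | English: Infit B | English: Infit C | English: Infit D | English: Total | Spanish: Infit A | Spanish: Infit B | Spanish: Infit C | Spanish: Infit D | Spanish: Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | 557 | 0 | 0 | 0 | 557 | 641 | 0 | 0 | 0 | 641 |
| B | 187 | 97 | 0 | 0 | 284 | 60 | 39 | 0 | 0 | 99 |
| C | 29 | 0 | 4 | 0 | 33 | 25 | 0 | 1 | 0 | 26 |
| D | 3 | 0 | 2 | 1 | 6 | 1 | 0 | 1 | 0 | 2 |
| Total | 776 | 97 | 6 | 1 | 880 | 727 | 39 | 2 | 0 | 768 |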

12.6.2.5 Distribution of Theta Estimates

Figure 12.4: Distribution of Theta Estimates for the English and Spanish Listening Comprehension Tasks

12.6.2.6 Wright Maps

Figure 12.5: Wright Maps Showing the Relationship Between Item and Person Location Estimates for the English Listening Comprehension Task
Figure 12.6: Wright Maps Showing the Relationship Between Item and Person Location Estimates for the Spanish Listening Comprehension Task

12.6.2.7 Model Summary

Table 12.5: Summary of Rasch Model Statistics for the English and Spanish Listening Comprehension Tasks

| Characteristic | English: Item (N = 105) | English: Person (N = 880) | Spanish: Item (N = 106) | Spanish: Person (N = 768) |
| --- | --- | --- | --- | --- |
| Logit Scale Location | -2.07 (1.22) | -0.06 (-0.81, 1.17) | -1.28 (1.02) | 0.06 (-0.66, 0.77) |
| Outfit | 0.93 (0.26) | 0.69 (0.39, 0.98) | 0.98 (0.15) | 0.86 (0.66, 1.06) |
| Infit | 0.99 (0.09) | 0.86 (0.71, 1.02) | 1.00 (0.07) | 0.92 (0.79, 1.05) |
| Reliability of Separation | 0.6992 | 0.5419 | 0.7060 | 0.6316 |

12.6.2.7.1 Final Number of Items

Following the exclusion of items with point-biserial correlations < .20 and items with poor fit statistics, the final versions of the task contain 105 items (English) and 106 items (Spanish).

12.7 Criterion Validity Evidence

12.7.1 Sample

Table 12.6: Demographic Characteristics of the Concurrent Criterion Validity Evidence Samples for the English and Spanish Listening Comprehension Tasks

| Characteristic | English K (N = 261) | English G1 (N = 231) | English G2 (N = 201) | Spanish K (N = 242) | Spanish G1 (N = 226) | Spanish G2 (N = 261) |
| --- | --- | --- | --- | --- | --- | --- |
| Timepoint | | | | | | |
|     Winter 2024 | 261 (100%) | 231 (100%) | 201 (100%) | 242 (100%) | 226 (100%) | 261 (100%) |
| Race | | | | | | |
|     American/Alaskan Native | 5 (1.9%) | 3 (1.3%) | 1 (0.5%) | 2 (0.8%) | 4 (1.8%) | 4 (1.5%) |
|     Asian | 35 (14%) | 36 (16%) | 8 (4.4%) | 8 (3.3%) | 2 (0.9%) | 0 (0%) |
|     Black/African American | 27 (10%) | 30 (13%) | 32 (17%) | 1 (0.4%) | 0 (0%) | 0 (0%) |
|     Not reported | 30 (12%) | 32 (14%) | 13 (7.1%) | 133 (55%) | 154 (69%) | 168 (65%) |
|     Other | 73 (28%) | 45 (19%) | 3 (1.6%) | 40 (17%) | 8 (3.6%) | 18 (6.9%) |
|     White | 88 (34%) | 85 (37%) | 126 (69%) | 56 (23%) | 55 (25%) | 69 (27%) |
|     Unknown | 3 | 0 | 18 | 2 | 3 | 2 |
| Ethnicity | | | | | | |
|     Hispanic/Latin(o/a) | 109 (42%) | 98 (42%) | 121 (61%) | 218 (91%) | 210 (93%) | 244 (98%) |
|     Intentional nonreport | 7 (2.7%) | 2 (0.9%) | 0 (0%) | 1 (0.4%) | 0 (0%) | 2 (0.8%) |
|     Not Hispanic/Latin(o/a) | 145 (56%) | 131 (57%) | 79 (40%) | 20 (8.4%) | 16 (7.1%) | 2 (0.8%) |
|     Unknown | 0 | 0 | 1 | 3 | 0 | 13 |
| Gender | | | | | | |
|     Female | 129 (49%) | 110 (48%) | 97 (48%) | 128 (53%) | 110 (49%) | 134 (51%) |
|     Male | 132 (51%) | 121 (52%) | 104 (52%) | 114 (47%) | 116 (51%) | 127 (49%) |
| Home Language | | | | | | |
|     English | 190 (74%) | 174 (76%) | 126 (82%) | 29 (12%) | 23 (10%) | 23 (8.9%) |
|     Spanish | 34 (13%) | 24 (10%) | 22 (14%) | 208 (87%) | 199 (89%) | 235 (91%) |
|     Other | 32 (13%) | 32 (14%) | 5 (3.3%) | 2 (0.8%) | 1 (0.4%) | 1 (0.4%) |
|     Unknown | 5 | 1 | 48 | 3 | 3 | 2 |
| English Proficiency Label | | | | | | |
|     (Re-)Classified Proficient | 12 (5.6%) | 17 (7.5%) | 11 (7.2%) | 31 (14%) | 24 (11%) | 39 (16%) |
|     English Learner | 49 (23%) | 40 (18%) | 16 (10%) | 185 (81%) | 177 (80%) | 177 (74%) |
|     English-only | 155 (72%) | 169 (75%) | 126 (82%) | 11 (4.8%) | 19 (8.6%) | 23 (9.6%) |
|     Unknown | 45 | 5 | 48 | 15 | 6 | 22 |
| Ever IEP/504 | 20 (10%) | 23 (13%) | 18 (12%) | 20 (9.4%) | 23 (11%) | 16 (12%) |
|     Unknown | 63 | 47 | 48 | 30 | 15 | 123 |

Scores on the English Listening Comprehension task were correlated with scores on the Sentence Comprehension subtest of the Clinical Evaluation of Language Fundamentals, 5th Edition (CELF 5) test (Wiig, Semel, and Secord 2013). Scores on the Spanish Listening Comprehension task were correlated with scores on the Sentence Comprehension subtest of the Clinical Evaluation of Language Fundamentals, 4th Edition, Spanish (CELF 4 Spanish) test (Semel et al. 2006).

Table 12.7: Concurrent Criterion Validity Correlations for the English and Spanish Listening Comprehension Tasks

| Grade | English, All: n | English, All: r [CI] | English, EL: n | English, EL: r [CI] | Spanish, All: n | Spanish, All: r [CI] |
| --- | --- | --- | --- | --- | --- | --- |
| K | 261 | 0.51 [0.41, 0.59] | 49 | 0.58 [0.35, 0.74] | 242 | 0.45 [0.35, 0.55] |
| G1 | 231 | 0.37 [0.26, 0.48] | 40 | 0.44 [0.15, 0.66] | 226 | 0.42 [0.31, 0.52] |
| G2 | 201 | 0.42 [0.29, 0.52] | NA | NA | 261 | 0.40 [0.29, 0.50] |
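
The report does not state how the confidence intervals in Table 12.7 were computed. One common approach for a Pearson correlation is the Fisher z-transformation, sketched below under that assumption; the function name and the 95% default level are illustrative choices.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation.

    r:     observed correlation
    n:     sample size (must be greater than 3)
    level: confidence level (0.95 assumed here)
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    z = atanh(r)                  # Fisher z-transformation of r
    se = 1.0 / sqrt(n - 3)        # standard error on the z scale
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


lower, upper = fisher_ci(0.51, 261)
```

For r = 0.51 and n = 261, this yields an interval of approximately 0.41 to 0.59, which is consistent with the kindergarten English row of the table. Small discrepancies for other rows can arise because the tabled correlations are rounded to two decimals.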