Marc Goodrich: “Addressing One Research Question Using Multiple Methodological Approaches”
Author: Nebraska Center for Research on Children, Families and Schools
Added: 04/23/2019
Description
Marc Goodrich
Assistant Professor, Department of Special Education & Communication Disorders
This presentation will focus on how different data analysis strategies can be used to approach the same research question or theory. Following one research question in particular, the presentation will demonstrate how both experimental and correlational designs can address the question, as well as unique information that can be obtained by using advanced data analysis techniques.
Searchable Transcript
- [00:00:00.649](upbeat music)
- [00:00:06.070]Thank you very much, everyone.
- [00:00:07.140]Thank you, Natalie, for the introduction.
- [00:00:11.230]You know, Natalie said
- [00:00:12.310]I have considerable methodological expertise.
- [00:00:14.180]I would like to preface this with
- [00:00:15.100]I am not a methodologist by training,
- [00:00:17.600]but I am really interested in statistics
- [00:00:19.780]and how sort of advanced and complex modeling techniques
- [00:00:24.150]can be used to address research questions.
- [00:00:26.770]And so, as Natalie mentioned, the majority of my work
- [00:00:31.550]has focused on the development of language
- [00:00:33.140]and literacy skills in Spanish-speaking children.
- [00:00:35.610]And so a lot of times when I learn about a new approach
- [00:00:39.140]or a new technique, I think to myself,
- [00:00:41.470]okay, well how can I apply this to my area of interest?
- [00:00:46.040]And so a lot of times I'm addressing the same theory
- [00:00:48.530]or the same type of research question.
- [00:00:50.210]Hence, the title of my talk,
- [00:00:51.610]Addressing one research question
- [00:00:53.230]using multiple methodological approaches.
- [00:00:57.380]So, I also noticed they saved,
- [00:00:59.080]she said my talk was the last,
- [00:01:00.390]so hopefully they saved the best for last,
- [00:01:02.300]but maybe not.
- [00:01:03.820]We'll find out.
- [00:01:05.720]But just to give you a brief overview,
- [00:01:06.861]I know this is a methodology talk,
- [00:01:11.250]but I'm going to be grounding it in a lot of content.
- [00:01:15.390]So, I'm gonna start with background and theory
- [00:01:17.480]on development of language and literacy skills
- [00:01:20.000]in dual language learners.
- [00:01:22.280]I think it's sort of important to understand the theory
- [00:01:25.550]that I'm evaluating to really connect how these different
- [00:01:28.680]approaches all are addressing the same question.
- [00:01:33.123]I'll talk about using regression-based approaches at first,
- [00:01:35.250]and really almost all of the approaches
- [00:01:37.270]I'm going to talk about are regression-based approaches,
- [00:01:40.600]but these are simple regression-based approaches, right?
- [00:01:43.050]So, there's multiple regression, maybe some moderation.
- [00:01:46.130]I'm going to start at sort of a basic level,
- [00:01:49.290]hopefully most of you know what multiple regression
- [00:01:51.360]and moderation is, but maybe a couple of you
- [00:01:53.380]aren't super familiar with it, so I'll start
- [00:01:55.530]at a basic level before getting a little bit more advanced.
- [00:01:58.860]I'll talk about looking at different units of analysis
- [00:02:01.220]in your data, so looking at scale-level data
- [00:02:04.300]versus item-level data and how the answers to a question
- [00:02:07.740]you might want to address differ depending on how you look
- [00:02:10.110]at the exact same data.
- [00:02:13.750]I'll talk about using factor analysis a little bit
- [00:02:15.537]and different information that can be obtained
- [00:02:17.500]from factor analysis than traditional regression,
- [00:02:20.220]and then I'll talk about experimental methods, as well.
- [00:02:24.650]And those might be the best approaches
- [00:02:27.010]for actually examining causality
- [00:02:28.530]if you have a causal research question.
- [00:02:31.010]So, just to give you a little background,
- [00:02:33.040]I'm gonna talk about dual language learners,
- [00:02:35.840]so those are any kids who are learning
- [00:02:37.850]more than one language.
- [00:02:38.840]I tend to focus on Spanish-speaking kids
- [00:02:40.830]'cause that's the largest population
- [00:02:42.010]of dual language learners in the United States.
- [00:02:44.340]But they have significantly lower academic achievement
- [00:02:46.570]than their peers do across grade level, across subject area,
- [00:02:51.150]and so that presents a problem. What do we do about that?
- [00:02:54.433]How do we improve these kids' skills?
- [00:02:56.500]So, if you look at reading achievement by students
- [00:02:59.840]who are and are not designated as English language learners
- [00:03:02.720]in fourth grade, this top line is students
- [00:03:04.780]who are not English language learners,
- [00:03:06.130]this bottom line is students
- [00:03:07.150]who are English language learners.
- [00:03:09.230]There's a big achievement gap, this is from
- [00:03:10.970]the National Assessment of Educational Progress,
- [00:03:14.440]and the standard deviation for these scale scores
- [00:03:16.840]is in the upper 30s to low 40s range
- [00:03:20.070]depending on the year and the group.
- [00:03:22.300]So, it's about a full standard deviation difference
- [00:03:24.610]between the groups, pretty large achievement gap.
- [00:03:27.890]When you look at eighth grade, it's even bigger.
- [00:03:30.930]So as these kids are going through school,
- [00:03:34.010]they're falling further and further behind
- [00:03:35.670]their monolingual peers.
- [00:03:38.140]Math, slightly smaller achievement gap,
- [00:03:40.750]but still an achievement gap, right?
- [00:03:42.557]You know, we might expect a slightly smaller achievement gap
- [00:03:45.070]in a subject area like math,
- [00:03:46.270]it's not quite as language-heavy as reading is.
- [00:03:51.580]But, still a significant gap, nonetheless.
- [00:03:55.570]So, when thinking about how to address this problem,
- [00:03:59.280]I think a lot of people fall into the trap of thinking
- [00:04:02.140]that this subgroup is a homogeneous group,
- [00:04:04.120]that dual language learners are all the same.
- [00:04:07.160]So, the typical conceptualization
- [00:04:08.840]of an English language learner is well,
- [00:04:10.230]these kids speak Spanish or they speak Chinese
- [00:04:12.410]or they speak whatever language.
- [00:04:13.840]They must have strong language proficiency in that language,
- [00:04:16.830]they must have strong academic skills in that language,
- [00:04:19.100]and what they need help with is acquiring English.
- [00:04:22.280]That might not be the case, so we had some data
- [00:04:25.047]and we were interested in exploring this
- [00:04:27.220]with latent profile analysis
- [00:04:29.400]to see is this a homogeneous group
- [00:04:31.410]or are there different groups of children.
- [00:04:35.140]When we're looking at whether there's heterogeneity
- [00:04:36.860]among DLLs and we did our latent profile analysis,
- [00:04:38.750]we found nine groups.
- [00:04:39.980]That's a lot of groups.
- [00:04:41.150]But when you look at it in a little bit more detail,
- [00:04:42.850]and some of you might've actually seen this before,
- [00:04:45.310]there's actually four groups.
- [00:04:47.420]So I call these supergroups, not like musical supergroups,
- [00:04:50.610]like Audioslave or something,
- [00:04:51.840]but other kinds of supergroups.
- [00:04:56.030]I'm trying to tie it in to Matt Fritz' talk
- [00:04:58.460]on how rock bands can inform our research design.
- [00:05:05.370]So, these would be like the traditional DLL kids.
- [00:05:08.640]They have higher Spanish proficiency
- [00:05:10.330]than English proficiency, but you see even within this group
- [00:05:13.250]there's a lot of differences in absolute level of skill.
- [00:05:16.430]These kids have typical levels of Spanish skills,
- [00:05:18.780]and English skills in the low average range.
- [00:05:23.030]These kids are really low in both languages, right?
- [00:05:25.740]So, even though they have the same pattern,
- [00:05:27.890]these kids certainly look to be more at risk
- [00:05:30.090]than do these kids.
- [00:05:32.562]You know, we also see a supergroup of kids
- [00:05:35.390]that have relatively even skills in both of their languages.
- [00:05:38.830]These kids might be thought of as balanced bilinguals.
- [00:05:41.590]You especially look at this group,
- [00:05:42.730]they look like they're balanced bilinguals.
- [00:05:44.030]They have typical proficiency in both languages.
- [00:05:46.700]It's sort of weird to call these kids balanced bilinguals,
- [00:05:49.280]though, because they are balanced, but they're really low.
- [00:05:52.960]And then we also have kids who have higher levels
- [00:05:55.160]of English proficiency than Spanish proficiency.
- [00:05:57.250]And these are much smaller numbers,
- [00:05:59.030]this is only like 10% of the sample in these two groups.
- [00:06:02.500]But, nonetheless, we see really different patterns of skills
- [00:06:06.470]in this population, so we might need different approaches
- [00:06:11.180]depending on the individual child
- [00:06:13.120]for what's going to help us narrow the achievement gap.
- [00:06:16.220]So, what do we do about the achievement gap?
- [00:06:18.840]We have to identify approaches that help,
- [00:06:21.330]that work best for promoting achievement.
- [00:06:24.120]Is that doing English-only instruction?
- [00:06:26.030]Is it doing dual language instruction?
- [00:06:27.520]Maybe it differs depending on the individual child.
- [00:06:30.330]If it's dual language instruction, do we transition?
- [00:06:33.370]We start off in the first language
- [00:06:34.780]and then transition to English?
- [00:06:36.030]Or do we keep doing instruction in the first language
- [00:06:38.480]throughout school while also doing English instruction
- [00:06:40.940]to maintain kids' skills in their first language?
- [00:06:43.140]I don't know.
- [00:06:44.160]In order to figure this out, we have to understand
- [00:06:47.130]how academic skills actually develop for these kids
- [00:06:50.880]and if this development is different than it is
- [00:06:53.147]for monolingual children.
- [00:06:54.900]The one really prominent theory of how academic skills
- [00:06:58.850]develop for dual language learners
- [00:07:00.850]is the Developmental Interdependence Hypothesis.
- [00:07:03.750]And according to this theory, there's a direct quote here,
- [00:07:07.460]the level of L2 competence which a bilingual child attains
- [00:07:10.390]is partially a function of the type of competence
- [00:07:13.320]the child has developed in their first language
- [00:07:15.320]at the time when intensive exposure
- [00:07:17.170]to the second language begins.
- [00:07:19.340]And so according to this hypothesis,
- [00:07:21.620]children with high levels of skills in their first language,
- [00:07:24.690]when they're exposed to a second language will also develop
- [00:07:26.920]high levels of skills in that language.
- [00:07:28.900]But, for kids with low levels of proficiency
- [00:07:31.020]in their first language, they're not going to develop
- [00:07:33.356]strong skills in their second language
- [00:07:35.530]when they're exposed to their second language.
- [00:07:38.020]So, in other words, a lot of people interpret this
- [00:07:41.980]as saying that dual language learners can transfer knowledge
- [00:07:44.750]across languages.
- [00:07:46.130]If they have knowledge in their first language,
- [00:07:47.920]they can apply it directly to their second language.
- [00:07:51.930]And so this sort of intuitively makes sense
- [00:07:53.720]especially for some skills.
- [00:07:55.690]If you're not familiar with the concept
- [00:07:56.980]of phonological awareness, it is sort of the ability
- [00:07:59.720]to detect and manipulate the sounds of words.
- [00:08:01.740]So can I segment the word cat into its constituent sounds?
- [00:08:06.160]Like kuh, ae, tuh.
- [00:08:08.630]I can do that in English.
- [00:08:11.450]I can do that because I have a knowledge
- [00:08:12.740]that words are made up of smaller sound units.
- [00:08:15.080]So, I can do that in another language, right?
- [00:08:17.930]I don't know German at all, but I can take a German word,
- [00:08:21.960]like I do know what the word nein means, it means no, right?
- [00:08:24.530]But I can take that and go en, i, en
- [00:08:28.240]and break that word down into its component sounds.
- [00:08:30.080]So, it's sort of intuitive that for certain skills,
- [00:08:32.760]kids can apply knowledge that they've developed
- [00:08:35.220]and utilize that knowledge when learning a second language.
- [00:08:38.210]But for others, it might not be so intuitive.
- [00:08:42.220]One thing that I'm really interested in
- [00:08:44.640]is does this actually happen?
- [00:08:46.410]Do dual language learners transfer knowledge
- [00:08:49.010]from their first to their second language?
- [00:08:51.560]And I particularly focused on reading-related skills
- [00:08:53.523]or pre-reading skills, early literacy skills.
- [00:08:56.180]Things like letter knowledge, alphabet knowledge,
- [00:08:57.879]vocabulary, phonological awareness.
- [00:09:01.440]So, we're looking at things like word reading,
- [00:09:03.280]reading comprehension.
- [00:09:05.450]I know a few of you in the room
- [00:09:06.470]were at the Pacific Coast Research Conference
- [00:09:08.220]where we found out from the keynote address
- [00:09:09.950]that we have no idea what reading comprehension is.
- [00:09:13.163]But we're gonna pretend like we do
- [00:09:14.330]and see if it can transfer across languages.
- [00:09:19.730]I'll talk a little bit about vocabulary knowledge,
- [00:09:22.440]alphabet knowledge, knowledge of letter sounds,
- [00:09:25.010]and then phonological awareness, which I just described.
- [00:09:28.870]So, the problem with a lot of the early research
- [00:09:32.700]in this area is that it really relied
- [00:09:34.530]on zero order correlations and it's like well,
- [00:09:36.800]these things are positively correlated across languages,
- [00:09:39.400]therefore they transfer.
- [00:09:41.290]So this is pretty typical of what you would see.
- [00:09:42.840]This is like a seminal paper in the area,
- [00:09:45.770]Cross-Language Transfer of Phonological Awareness.
- [00:09:49.010]And you see statements like this,
- [00:09:50.590]multiple-regression analysis revealed that the readers'
- [00:09:52.820]performance on English word and pseudoword recognition tests
- [00:09:55.660]was predicted by the levels
- [00:09:56.900]of both Spanish phonological awareness
- [00:09:58.397]and Spanish word recognition,
- [00:10:00.080]thus indicating cross-language transfer.
- [00:10:02.840]Similarly, in this paper they controlled
- [00:10:05.440]for prior English skills and found that Spanish vocabulary
- [00:10:07.697]knowledge predicted English reading comprehension.
- [00:10:10.260]And so you see this similar language,
- [00:10:12.610]this study demonstrates the existence
- [00:10:14.310]of literacy skills transfer from the first
- [00:10:16.230]to the second language.
- [00:10:17.790]These are pretty strong statements
- [00:10:20.030]to be making in your abstract.
- [00:10:21.640]We know that these things transfer
- [00:10:23.610]based on a multiple-regression
- [00:10:25.120]that is open to lots of other alternative explanations.
- [00:10:30.431]There is some meta-analytic work in this area, as well,
- [00:10:34.600]that shows that for these word-reading types of skills,
- [00:10:38.040]skills that are focused around decoding actual words,
- [00:10:43.650]breaking apart words into their constituent sounds
- [00:10:45.470]in order to read them,
- [00:10:47.240]there are pretty strong cross-language correlations.
- [00:10:50.730]But for the more language-based aspects of reading,
- [00:10:53.470]like comprehension, like vocabulary knowledge,
- [00:10:56.350]the correlations are much smaller and often not there.
- [00:11:00.670]The correlation from L1 language skills
- [00:11:02.900]to L2 reading comprehension, .04.
- [00:11:06.156]All right, so that's a non-significant effect.
- [00:11:11.810]So, when we're thinking about these correlations, though,
- [00:11:15.040]we've seen all these studies that say
- [00:11:17.300]these things are related,
- [00:11:18.280]therefore they must transfer across languages.
- [00:11:21.120]But they might not.
- [00:11:22.940]Maybe there's an influence of a third variable.
- [00:11:25.650]Maybe these relations are moderated by something else.
- [00:11:28.390]In fact, the hypothesis,
- [00:11:29.900]the Developmental Interdependence Hypothesis,
- [00:11:31.360]if you think back says that kids with high proficiency
- [00:11:34.750]in their first language will transfer skills
- [00:11:36.500]across languages and kids with low proficiency
- [00:11:38.440]in their first language won't.
- [00:11:39.407]All right, so that's a testable hypothesis of moderation
- [00:11:42.630]whether cross-language transfer is moderated.
- [00:11:45.010]So, if you're not familiar with statistical moderation,
- [00:11:48.560]you have a relation between an independent
- [00:11:50.494]and a dependent variable and the moderator
- [00:11:53.180]affects the strength of that relation.
- [00:11:55.480]It's pretty straightforward.
- [00:11:57.090]You have this nice example, it's a little bit blurry,
- [00:11:59.000]but I thought it was funny.
- [00:12:01.810]On the x-axis, we have time, so before something was cool
- [00:12:04.960]and after something was cool.
- [00:12:06.910]And on the y-axis we have plotted
- [00:12:08.070]how much you like that thing.
- [00:12:09.850]So, for the control group, after something becomes cool
- [00:12:12.440]you like it a little bit more.
- [00:12:13.870]But if you're a hipster, you really liked it
- [00:12:15.770]before it was cool, don't like it at all after it was cool.
- [00:12:20.290]So whether or not you're a hipster moderates the effect
- [00:12:23.770]or moderates the relation between whether something is cool
- [00:12:26.300]and how much you like it.
- [00:12:27.380]So that's just sort of a funny example of moderation.
- [00:12:32.040]So how do we apply this sort-of-funny example
- [00:12:33.950]to the research question?
- [00:12:34.880]Well, I sort of already described it.
- [00:12:37.420]But what I did was I looked at are children's
- [00:12:39.130]phonological awareness skills correlated across languages?
- [00:12:42.110]And then does this correlation differ depending on the level
- [00:12:45.360]of proficiency in the first language?
- [00:12:50.690]Do kids with higher Spanish proficiencies
- [00:12:52.450]show a stronger cross-language correlation
- [00:12:54.400]in phonological awareness
- [00:12:55.920]than kids with lower Spanish proficiency?
- [00:12:57.940]So we looked at this with a sample
- [00:12:59.270]of 466 Spanish-speaking preschoolers.
- [00:13:03.890]They completed measures of phonological awareness
- [00:13:05.590]in Spanish and English and measures
- [00:13:07.420]of expressive language skills in both languages.
- [00:13:09.530]I'm only gonna report data on their Spanish language skills.
- [00:13:12.990]So, when we just look at whether they're correlated
- [00:13:15.530]across languages, these bold correlations in the red box
- [00:13:19.190]show the cross-language correlations
- [00:13:21.080]in phonological awareness.
- [00:13:23.060]So, they're correlated in the .3 to .4 range,
- [00:13:26.330]that's pretty substantial.
- [00:13:27.510]Maybe it suggests transfer,
- [00:13:28.880]maybe it suggests something else.
- [00:13:31.550]But is the strength of these relations affected
- [00:13:33.880]by their first language proficiency?
- [00:13:38.070]And so what we have plotted here
- [00:13:41.200]is English phonological awareness on the y-axis,
- [00:13:43.860]Spanish phonological awareness on the x-axis,
- [00:13:46.290]and the two separate lines represent children
- [00:13:48.260]with high and low Spanish oral language proficiency.
- [00:13:51.597]And this dotted line is children with high Spanish
- [00:13:54.210]oral language proficiency.
- [00:13:55.640]And so what you can see is that the correlation,
- [00:13:58.320]the slope of the line between Spanish and English
- [00:14:01.270]phonological awareness is much stronger for kids
- [00:14:04.050]with high Spanish oral language skills than it is
- [00:14:07.020]for children with low Spanish oral language skills.
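
The moderation analysis described above can be sketched as a regression with an interaction term: English phonological awareness regressed on Spanish phonological awareness, Spanish oral language, and their product. The sketch below is only an illustration on simulated data. The generating coefficients (0.35, 0.20, 0.25), the variable names, and the small `ols` helper are assumptions made for this example and are not taken from the study; only the sample size of 466 echoes the talk.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * w for v, w in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

rng = random.Random(0)
n = 466  # same size as the sample in the talk; the data are simulated

# Hypothetical z-scored predictors.
spanish_pa = [rng.gauss(0, 1) for _ in range(n)]
spanish_oral = [rng.gauss(0, 1) for _ in range(n)]

# Assumed generating model: the Spanish-to-English PA slope grows
# with Spanish oral language (positive interaction).
english_pa = [0.35 * x + 0.20 * m + 0.25 * x * m + rng.gauss(0, 1)
              for x, m in zip(spanish_pa, spanish_oral)]

# Columns: intercept, predictor, moderator, predictor x moderator.
design = [[1.0, x, m, x * m] for x, m in zip(spanish_pa, spanish_oral)]
b0, b1, b2, b3 = ols(design, english_pa)

# Simple slopes one SD below and one SD above the moderator mean.
slope_low = b1 - b3
slope_high = b1 + b3
print(f"interaction = {b3:.2f}")
print(f"slope at low Spanish oral language  = {slope_low:.2f}")
print(f"slope at high Spanish oral language = {slope_high:.2f}")
```

A positive interaction coefficient corresponds to the pattern in the plot: a steeper line for children with high Spanish oral language than for children with low Spanish oral language.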
- [00:14:10.150]And so it's interesting.
- [00:14:10.983]It's consistent with that hypothesis,
- [00:14:13.240]but it's also a little bit depressing.
- [00:14:15.520]It's sort of like a Matthew Effect.
- [00:14:17.190]It's like these kids who are already doing well in Spanish
- [00:14:19.573]also do well in English.
- [00:14:21.590]But the kids who are struggling continue to struggle.
- [00:14:25.740]And so we might need to dig a little bit more into that
- [00:14:27.450]and figure out what do we do for these kids
- [00:14:29.700]who have low Spanish oral language and might not benefit
- [00:14:33.920]from this cross-language transfer as much as other kids.
- [00:14:37.300]But what we do about this would be the topic
- [00:14:40.330]for a different presentation.
- [00:14:44.810]So, there are definite limitations that I sort of alluded to
- [00:14:48.310]to these correlational-based approaches.
- [00:14:50.130]So, we might have moderation, these correlations might vary
- [00:14:53.220]based on something else.
- [00:14:54.760]They're open to alternative explanations.
- [00:14:57.770]So, these kids might have similar quality
- [00:15:00.110]language environments across their first and second language
- [00:15:02.680]and that might help explain some of this.
- [00:15:04.900]Or they might just have a general underlying
- [00:15:07.180]language learning capacity, or G, right?
- [00:15:09.990]This might just be G intelligence
- [00:15:12.550]explaining these cross-language correlations
- [00:15:14.480]and so even though we have these sort of
- [00:15:15.660]interesting patterns that seem to be consistent with theory,
- [00:15:19.710]I don't know how much we can still even say
- [00:15:21.530]from moderation analysis.
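
The third-variable concern can be made concrete with a small simulation. In the sketch below, a single general factor `g` drives both first-language and second-language scores, and there is no transfer between the languages at all. Every number here (loadings of 0.6, residual SD of 0.8, 2,000 simulated children) is an assumption chosen so that the raw cross-language correlation lands near the .3 to .4 range mentioned in the talk.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(y, x):
    """Residuals of y after a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

rng = random.Random(1)
n = 2000

# Hypothetical general ability factor shared by both languages.
g = [rng.gauss(0, 1) for _ in range(n)]

# Each language score depends on g plus independent noise.
# Neither score enters the equation for the other: no transfer.
l1_skill = [0.6 * gi + rng.gauss(0, 0.8) for gi in g]
l2_skill = [0.6 * gi + rng.gauss(0, 0.8) for gi in g]

r_raw = pearson(l1_skill, l2_skill)
# Partial correlation: correlate what is left after removing g.
r_partial = pearson(residuals(l1_skill, g), residuals(l2_skill, g))

print(f"raw cross-language correlation: {r_raw:.2f}")
print(f"correlation after removing g:   {r_partial:.2f}")
```

Under these assumptions the raw correlation is substantial while the partial correlation sits near zero, which is why a positive cross-language correlation alone cannot distinguish transfer from a shared underlying cause.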
- [00:15:24.460]So, more longitudinal
- [00:15:25.507]or experimental evidence is needed here.
- [00:15:28.080]These regression-based approaches aren't telling us a ton.
- [00:15:31.400]But before I get to the experimental or longitudinal pieces,
- [00:15:35.090]I wanna talk a little bit about quantile regression.
- [00:15:37.143]This is a methodology talk, and so I wanna talk
- [00:15:39.610]about using some advanced methodological approaches.
- [00:15:45.800]Ordinary least squares regression examines the effect
- [00:15:48.490]of X at the mean of Y.
- [00:15:50.630]So, the effect of X is conditionalized on the mean of Y.
- [00:15:55.310]It assumes constant variance in the outcome and it assumes
- [00:15:59.480]that there are normally distributed residuals.
- [00:16:01.320]As we know, we don't always have normally-distributed data.
- [00:16:04.160]This is a paranormal distribution,
- [00:16:05.760]so we might have that kind of distribution.
- [00:16:08.126](laughs)
- [00:16:09.300]We might also have skewed data or kurtotic data,
- [00:16:14.250]but either way multiple-regression might,
- [00:16:17.760]standard ordinary least squares regression
- [00:16:19.240]might not be appropriate
- [00:16:20.840]depending on the nature of the data.
- [00:16:23.290]And so, if the variance in the dependent variable
- [00:16:25.700]differs across the distribution of the independent variable
- [00:16:30.547]our regression estimate isn't going to explain the data
- [00:16:34.050]equally well at all points across the distribution.
- [00:16:36.550]And I'll show a graph to help you visualize this.
- [00:16:41.030]Quantile regression is a little bit different.
- [00:16:43.460]It gives you a slope estimate at each quantile,
- [00:16:45.960]or percentile, of the distribution of the outcome.
- [00:16:48.650]So, if your variance isn't constant across the level
- [00:16:51.460]of the outcome, then you can get a different slope estimate
- [00:16:55.470]at each level of the outcome.
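
As a rough sketch of what "a slope estimate at each quantile" means computationally: quantile regression picks the line that minimizes an asymmetric absolute loss (often called the check or pinball loss) instead of squared error. The brute-force fitter below is an illustration only, not how statistical packages do it. It relies on the standard property that, for a two-parameter line, at least one minimizer passes through two of the data points. The simulated data and all names are assumptions for this example.

```python
import random
from itertools import combinations

def check_loss(residual, tau):
    """Pinball loss: tau * r for r >= 0, (1 - tau) * |r| for r < 0."""
    return residual * (tau - (residual < 0))

def quantile_line(x, y, tau):
    """Bivariate quantile regression by exhaustive search.

    Tries the line through every pair of points and keeps the one
    with the smallest total check loss. Returns (intercept, slope).
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(check_loss(b - intercept - slope * a, tau)
                   for a, b in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

rng = random.Random(2)
n = 120

# Simulated data whose spread in y grows with x, so the
# conditional quantiles fan out and have different slopes.
x = [rng.uniform(0, 2) for _ in range(n)]
y = [0.5 * a + (0.1 + 1.0 * a) * rng.gauss(0, 1) for a in x]

fits = {tau: quantile_line(x, y, tau) for tau in (0.25, 0.50, 0.75)}
for tau, (intercept, slope) in fits.items():
    print(f"tau={tau:.2f}  intercept={intercept:.2f}  slope={slope:.2f}")
```

Because the variance is not constant in this simulated sample, the fitted slope at the 75th percentile should come out larger than at the 25th. With constant variance the three lines would be roughly parallel and ordinary least squares would summarize the relation adequately.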
- [00:16:57.430]Yaacov Petscher and Jessica Logan have a really great paper
- [00:17:00.060]that describes this really succinctly,
- [00:17:04.020]at least for someone who's not a methodologist like me,
- [00:17:06.200]I could actually understand this paper,
- [00:17:07.620]which was fantastic.
- [00:17:08.850]I've tried to read other papers on quantile regression
- [00:17:10.890]and they're just like woo, I have no idea
- [00:17:12.850]what these equations mean.
- [00:17:15.900]So, this is an example of Y plotted against X
- [00:17:21.170]and a fitted regression line.
- [00:17:22.880]But as you can see here, the variance in Y
- [00:17:25.000]differs across the level of X.
- [00:17:27.090]So, down here at the bottom end of the distribution of X,
- [00:17:30.820]there's a really tight, strong correlation between X and Y,
- [00:17:33.410]and this regression line
- [00:17:34.520]does a really nice job describing that.
- [00:17:37.770]But up here, at the higher end of the distribution of X,
- [00:17:42.030]there's a lot more variance in Y and our regression line
- [00:17:44.420]that we fitted isn't doing as nice of a job
- [00:17:46.710]explaining this relationship.
- [00:17:49.200]So, we might need to plot different regression lines
- [00:17:52.150]at different points across the distribution.
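
The fan-shaped pattern described for that graph can be checked numerically. This sketch simulates data in which the spread of Y increases with X, fits one ordinary least squares line, and compares the size of the residuals in the lower and upper halves of X. The data and parameter values are invented for illustration; the slide's actual data are not available in the transcript.

```python
import random

rng = random.Random(3)
n = 400

# Sorted x values so the first half of the lists is the low end of X.
x = sorted(rng.uniform(0, 10) for _ in range(n))
# Assumed model: same mean trend everywhere, noise SD grows with x.
y = [1.0 + 0.5 * a + rng.gauss(0, 0.2 + 0.3 * a) for a in x]

# One OLS line for the whole sample.
mx, my = sum(x) / n, sum(y) / n
slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
intercept = my - slope * mx
resid = [b - intercept - slope * a for a, b in zip(x, y)]

def rms(values):
    """Root mean square of a list of residuals."""
    return (sum(v * v for v in values) / len(values)) ** 0.5

low_half = rms(resid[: n // 2])
high_half = rms(resid[n // 2:])
print(f"OLS slope: {slope:.2f}")
print(f"typical residual, low X:  {low_half:.2f}")
print(f"typical residual, high X: {high_half:.2f}")
```

The single line recovers the average trend, but its residuals are much larger at the high end of X, which is the sense in which it describes the data less well there.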
- [00:17:58.370]That was an abstract example.
- [00:18:00.410]This is sort of a real-world example
- [00:18:01.960]of where quantile regression might be useful.
- [00:18:04.430]So, this graph, I pulled it from Jessica and Yaacov's paper,
- [00:18:08.890]it is plotting height by age for girls age two
- [00:18:14.450]at the far left end of the x-axis
- [00:18:16.540]to 20 at the right end of the x-axis.
- [00:18:19.130]And so what you see is, until girls are about 13 years old,
- [00:18:22.580]there's a really strong, positive relationship
- [00:18:25.950]between how old you are and how tall you are.
- [00:18:28.552]I can predict how old you are by how tall you are
- [00:18:31.720]or vice versa.
- [00:18:32.580]But when I get above 13 years of age,
- [00:18:36.330]that slope really changes.
- [00:18:39.810]I no longer can really tell how old someone is
- [00:18:42.420]just by knowing how tall they are.
- [00:18:44.910]And then that, of course, continues past age 20, as well.
- [00:18:48.620]The age breakdown might be a little bit different for boys,
- [00:18:51.030]but the same general pattern holds for boys, as well.
- [00:18:54.090]This is an instance in which quantile regression
- [00:18:56.900]might be more appropriate than using OLS regression.
- [00:19:00.210]So, back to the theory
- [00:19:01.270]about the Developmental Interdependence Hypothesis
- [00:19:04.410]and cross-language transfer.
- [00:19:06.140]Cummins in that same paper where he described
- [00:19:08.570]this cross-language transfer theory, also described
- [00:19:10.710]this threshold hypothesis where he essentially says
- [00:19:13.790]in order for these positive effects to occur,
- [00:19:16.730]you have to have high levels of proficiency
- [00:19:18.990]in both languages.
- [00:19:20.410]Again, sort of like a Matthew Effect.
- [00:19:23.170]In order for this transfer to occur,
- [00:19:25.220]you have to have high levels of proficiency.
- [00:19:27.460]And other people have interpreted this to say
- [00:19:29.920]cross-language relations are not constant
- [00:19:32.020]across the continuum of second language proficiency.
- [00:19:35.610]So, for kids with higher proficiency
- [00:19:37.740]in their second language, they're more able
- [00:19:39.340]to utilize knowledge from their first language
- [00:19:41.440]than kids with lower proficiency in their second language.
- [00:19:46.000]So, again, we could use quantile regression to look at this.
- [00:19:50.920]So, does the correlation between first and second language
- [00:19:53.620]academic skills differ for children
- [00:19:55.750]with different levels of skill in the second language?
- [00:19:59.850]And so we're gonna examine the correlation at various
- [00:20:02.630]quantiles, or percentiles, of second language ability.
- [00:20:09.490]One reason why you might not wanna use quantile regression
- [00:20:11.700]is it's a lot less efficient than OLS regression.
- [00:20:14.780]So, if you don't actually have differences in the data
- [00:20:18.580]across the distributions,
- [00:20:20.210]or if you have small sample sizes
- [00:20:24.590]you might be really underpowered to detect effects.
- [00:20:26.670]So, I had a big sample here that I was fortunate enough
- [00:20:29.240]to be able to use to look at this.
- [00:20:34.420]I'm looking at cross-language transfer of oral language,
- [00:20:36.870]phonological awareness, and print knowledge separately.
- [00:20:40.770]And I'm just looking at bivariate correlations.
- [00:20:42.740]You can also do quantile multiple regression,
- [00:20:45.160]it gets a little more complicated.
- [00:20:46.400]I'm not gonna talk about that today,
- [00:20:49.500]but if you're interested in quantile regression
- [00:20:51.384]that's something you can do, as well,
- [00:20:52.530]is include multiple predictors,
- [00:20:53.850]control for different things.
- [00:20:56.380]I wanna go into a little bit of how to interpret
- [00:20:58.980]quantile regression, it's a little strange.
- [00:21:01.640]So bear with me for a moment
- [00:21:02.850]and hopefully I'll walk you through this slowly,
- [00:21:06.220]or what I think is slow, maybe I'm going way too fast.
- [00:21:09.430]And hopefully it will make sense, we'll find out.
- [00:21:13.400]So, the interpretation of
- [00:21:15.540]an unstandardized regression coefficient is that,
- [00:21:17.930]and let's say that that regression coefficient is .5.
- [00:21:21.170]That means for every one unit increase in X,
- [00:21:23.540]we have a corresponding .5 increase in Y.
- [00:21:28.440]But there's an alternative way to interpret this
- [00:21:30.320]and I'm actually going to flip this to a standardized
- [00:21:33.440]coefficient so that we're thinking
- [00:21:34.600]in standard deviation units here.
- [00:21:37.290]So, the alternative interpretation is that the coefficient
- [00:21:40.610]is actually the difference in the outcome
- [00:21:44.110]for individuals who are at the mean of X
- [00:21:46.370]when compared to individuals
- [00:21:47.670]who are one standard deviation above the mean of X.
- [00:21:50.360]'Cause with a standardized variable, a one unit increase
- [00:21:53.860]is a one standard deviation increase.
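
The two readings of a regression coefficient can be compared directly. The sketch below fits a line to simulated raw scores (with an assumed true slope of .5, echoing the example value in the talk), then z-scores both variables and refits. With standardized variables the intercept is zero, and the slope equals the difference in predicted outcome between someone at the mean of X and someone one standard deviation above it. All data and names are hypothetical.

```python
import random

def zscore(values):
    """Standardize to mean 0 and (sample) standard deviation 1."""
    n = len(values)
    m = sum(values) / n
    sd = (sum((v - m) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - m) / sd for v in values]

def fit_line(x, y):
    """Simple OLS regression of y on x. Returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

rng = random.Random(4)
x_raw = [rng.gauss(50, 10) for _ in range(500)]
y_raw = [20 + 0.5 * a + rng.gauss(0, 8) for a in x_raw]

# Unstandardized: change in Y per one raw unit of X.
intercept_raw, slope_raw = fit_line(x_raw, y_raw)

# Standardized: one unit of X is now one standard deviation.
intercept_z, slope_z = fit_line(zscore(x_raw), zscore(y_raw))
at_mean = intercept_z + slope_z * 0       # predicted Y at the mean of X
one_sd_above = intercept_z + slope_z * 1  # predicted Y one SD above

print(f"raw slope: {slope_raw:.2f}")
print(f"standardized slope: {slope_z:.2f}")
print(f"difference (one SD above vs. mean): {one_sd_above - at_mean:.2f}")
```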
- [00:21:57.030]And so, that alternative interpretation is really important
- [00:22:00.150]for quantile regression.
- [00:22:01.460]If you use this standard interpretation,
- [00:22:04.181]it doesn't really make sense, right?
- [00:22:05.726]'Cause I'm saying I'm going to estimate the slope
- [00:22:08.410]at the 25th percentile.
- [00:22:10.880]But if I say, okay, so at the 25th percentile,
- [00:22:14.030]for every one unit increase in X
- [00:22:15.540]there's a two unit increase in Y.
- [00:22:16.980]Well it's like as soon as I have
- [00:22:18.100]that two unit increase in Y,
- [00:22:19.220]I'm no longer at the 25th percentile.
- [00:22:21.450]So that doesn't make a lot of sense.
- [00:22:23.440]So you really want to use this different interpretation
- [00:22:26.130]and you're saying the difference
- [00:22:27.660]in the 25th percentile of Y
- [00:22:33.000]is whatever the coefficient is, comparing students
- [00:22:35.220]who are at the mean of X
- [00:22:36.160]with students one standard deviation above the mean of X.
- [00:22:38.690]And so, if we have two z-scored variables, X and Y,
- [00:22:41.420]the mean of both is zero
- [00:22:43.130]and the standard deviation of both is one.
- [00:22:45.989]And then we estimate the slope at the 75th percentile of Y.
- [00:22:49.990]Our estimated slope coefficient is .8
- [00:22:51.850]and our intercept is .1, so this is our regression equation
- [00:22:54.576]at the 75th percentile of Y.
- [00:22:56.970]It's pretty simple.
- [00:22:58.770]So, at the mean of X, we're gonna multiply .8 times zero,
- [00:23:01.650]and that means our Y value is .1. And one standard deviation
- [00:23:05.200]above the mean of X,
- [00:23:06.070]we'll multiply .8 times one, and our predicted Y value is .9.
- [00:23:12.290]So, the difference in the 75th percentile of Y
- [00:23:15.880]between individuals at the mean and one standard deviation
- [00:23:18.380]above the mean of X is .8.
- [00:23:20.710]Does that make sense?
- [00:23:21.620]And so that's how you have to,
- [00:23:23.887]it was hard to wrap my brain around for a long time
- [00:23:26.940](laughs) and it's sort of hard for me to figure out
- [00:23:30.310]how to actually explain it.
- [00:23:32.560]But this is how you sort of have to think to interpret
- [00:23:35.060]the quantile regression coefficients accurately.
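
The arithmetic in this example can be written out directly. This is a minimal sketch using the intercept (.1) and slope (.8) quoted in the talk for the 75th percentile of Y; the function and variable names are made up for illustration.

```python
def predicted_quantile(intercept, slope, x):
    """Predicted value of the chosen quantile of Y at a given value of X."""
    return intercept + slope * x

# Values quoted in the talk for the 75th percentile of Y (X and Y are z-scored)
intercept, slope = 0.1, 0.8

at_mean = predicted_quantile(intercept, slope, 0.0)        # X at its mean
one_sd_above = predicted_quantile(intercept, slope, 1.0)   # X one SD above the mean
difference = one_sd_above - at_mean                        # equals the slope

print(round(at_mean, 2), round(one_sd_above, 2), round(difference, 2))
```

Read this way, the coefficient compares the same quantile of Y across two groups that differ by one standard deviation in X, rather than following one individual up the distribution.
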
- [00:23:37.960]When I looked at this for oral language skills,
- [00:23:40.930]I estimated ability at the 25th percentile,
- [00:23:43.530]at the 50th percentile, at the 75th percentile.
- [00:23:46.110]And then I also got an ordinary least squares estimate.
- [00:23:49.974]And so I can see are these different than each other?
- [00:23:52.710]Are they different than the ordinary least squares estimate?
- [00:23:55.530]And so the only time that there was
- [00:23:58.520]a significant relationship between first
- [00:24:00.340]and second language oral language skills
- [00:24:02.280]was at the upper end of the distribution
- [00:24:04.410]of second language oral language skills.
- [00:24:06.770]All right?
- [00:24:07.603]So that's sort of consistent with that theory
- [00:24:10.640]at the beginning, but what you actually see
- [00:24:13.150]when you graph it is that that's not actually different
- [00:24:15.890]than the OLS regression estimate.
- [00:24:17.810]So what you have here, there's three black dots.
- [00:24:20.910]One over here at the 25th percentile,
- [00:24:22.976]one here at the 50th percentile,
- [00:24:25.430]and one here at the 75th percentile.
- [00:24:27.570]And so our x-axis is actually our percentile
- [00:24:30.130]of English oral language skills.
- [00:24:33.220]Now, our y-axis is the correlation between Spanish
- [00:24:36.520]and English oral language skills at each percentile.
- [00:24:39.930]So, here and then this red line
- [00:24:43.540]is the OLS regression estimate.
- [00:24:46.144]So, the dotted red lines are the confidence intervals
- [00:24:48.840]for the OLS estimate.
- [00:24:50.290]This big, gray blob is basically our confidence interval
- [00:24:53.870]for the quantile regression coefficients.
- [00:24:57.160]So, what you can see here is the estimates are falling
- [00:24:59.880]within the confidence interval
- [00:25:01.310]of the OLS regression estimate and the confidence interval
- [00:25:04.420]of the quantile regression estimates are overlapping
- [00:25:06.898]with the confidence intervals from the OLS regression.
- [00:25:11.170]Did I just say the same thing twice?
- [00:25:12.880]I can't remember.
- [00:25:13.860]Basically, the quantile regression estimates overlap
- [00:25:16.140]with the OLS regression estimates.
- [00:25:17.547]So, quantile regression isn't actually
- [00:25:19.750]that useful to us here.
- [00:25:22.500]But when I look at phonological awareness
- [00:25:24.220]I get a little bit different picture
- [00:25:25.450]and it's actually the opposite pattern
- [00:25:27.110]of what I would predict based on that threshold hypothesis.
- [00:25:30.200]So, the relationship is actually stronger
- [00:25:32.440]between the first and second language phonological awareness
- [00:25:35.140]for kids at the lower ends of the distribution
- [00:25:37.660]of English phonological awareness.
- [00:25:39.190]And that's sort of actually interesting
- [00:25:40.670]and not quite as depressing
- [00:25:41.720]as some of these other effects, right?
- [00:25:43.260]So that means for kids who have low English proficiency,
- [00:25:45.820]if we can improve their skills in their first language,
- [00:25:48.700]that's gonna have a bigger impact on their second language
- [00:25:51.690]than it would for these kids who already have higher levels
- [00:25:53.693]of English proficiency.
- [00:25:55.220]Sort of makes sense.
- [00:25:56.510]When we look at this graph,
- [00:25:57.790]we see non-overlapping confidence intervals.
- [00:26:00.520]So, in this case, quantile regression actually is
- [00:26:03.870]a useful way to conceptualize these data.
- [00:26:07.500]And then we see an identical pattern or similar pattern,
- [00:26:11.150]but actually a stronger trend with letter knowledge
- [00:26:15.310]and letter sound correspondence.
- [00:26:17.777]So, here we have a really strong relationship
- [00:26:20.480]at the lower end of the distribution,
- [00:26:22.280]a much weaker relationship at the higher end
- [00:26:24.170]of the distribution of English print knowledge.
- [00:26:30.120]So, what's really cool about quantile regression is
- [00:26:32.760]I just looked at three points here, the 25th percentile,
- [00:26:35.900]the 50th percentile, and the 75th percentile.
- [00:26:39.030]But I can look at however many points I want.
- [00:26:42.598]I mean granted the more points you look at,
- [00:26:44.309]the larger sample size you need.
- [00:26:47.030]But I can look at every 10th quantile,
- [00:26:48.670]every 10th percentile, right?
- [00:26:49.887]And so I say every tenth, I started at .2 and went to .8.
- [00:26:53.560]Would I get an effect at the 20th percentile?
- [00:26:55.350]30th percentile?
- [00:26:56.183]40th and so on.
- [00:26:57.820]And I can see if the trend is looking the same.
- [00:27:00.700]Or I can take the exact same data
- [00:27:02.210]and look at every 100th percentile.
- [00:27:04.780]And so, here I estimated a hundred different
- [00:27:07.430]slope coefficients for the relationship
- [00:27:11.330]between Spanish and English print knowledge.
- [00:27:13.480]And I actually, what I see is by cutting off
- [00:27:16.790]this last graph at the 25th percentile,
- [00:27:19.940]I'm missing something over here.
- [00:27:22.890]So, I want to look at the full continuum of the distribution
- [00:27:27.330]to really see what's going on in the data.
- [00:27:31.550]So, this is a really interesting approach.
- [00:27:33.210]I think it can be applied in a lot of situations.
- [00:27:37.030]A, it doesn't rely on the assumptions
- [00:27:39.370]that OLS regression makes.
- [00:27:41.920]And then it could be particularly helpful
- [00:27:43.740]when we have floor or ceiling effects in our data.
- [00:27:47.590]When we have floor or ceiling effects,
- [00:27:48.930]we might see very different correlations
- [00:27:52.350]at different aspects of the distribution.
- [00:27:56.850]And then another analysis or another thing
- [00:27:58.840]that's great about it is it's easy to implement.
- [00:28:02.830]It takes two lines of code in R to run these models.
- [00:28:06.320]You know it's really easy.
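
The talk does not name the R package; a common choice is the `rq()` function in the `quantreg` package, which takes the quantile level as its `tau` argument. The sketch below illustrates the idea behind the method rather than that function. Quantile regression minimizes a "check" (pinball) loss instead of squared error, and for a model with only a constant, the value that minimizes that loss is the sample quantile. The scores are hypothetical and the function names are made up for illustration.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def best_constant(y, tau):
    """Observed value that minimizes total pinball loss at quantile level tau."""
    return min(y, key=lambda q: sum(pinball_loss(v - q, tau) for v in y))

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical outcome scores

# One fit per quantile level, as with the 25th, 50th, and 75th percentiles
fits = {tau: best_constant(scores, tau) for tau in (0.25, 0.50, 0.75)}
print(fits)
```

A full quantile regression replaces the constant with an intercept plus a slope times X and minimizes the same loss, which is why a separate slope can be estimated at each quantile level.
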
- [00:28:09.410]But going back to it, I talked about some of the limitations
- [00:28:12.850]of correlations and limitations of regressions.
- [00:28:15.260]We still have those same limitations here.
- [00:28:17.260]It's open to some alternative explanations.
- [00:28:20.120]It doesn't tell us anything about the causal relations
- [00:28:22.420]between these variables.
- [00:28:23.930]But it's an interesting way to dig a little bit deeper
- [00:28:26.680]into your data and see if your standard correlations
- [00:28:31.320]or your standard regressions that you're used to running
- [00:28:33.160]are actually doing an adequate job of describing
- [00:28:36.336]the developmental nature
- [00:28:37.710]between the constructs that you have.
- [00:28:39.450]So now I wanna shift gears a little bit.
- [00:28:41.410]I wanna talk about looking at different units of analysis.
- [00:28:45.050]And I'm gonna do so in the context of vocabulary knowledge
- [00:28:47.570]because what I often see is that there aren't
- [00:28:50.330]cross-language correlations for vocabulary knowledge.
- [00:28:52.630]But I wanted to dig a little bit deeper and figure out
- [00:28:54.352]if I look at item-level data is that different?
- [00:28:57.280]So, (mumbles) can vocabulary knowledge
- [00:28:59.770]transfer across languages?
- [00:29:03.710]If you know twice as many languages,
- [00:29:06.180]do you know twice as many words?
- [00:29:07.800]Maybe you know twice as many swear words
- [00:29:09.240]and that's really fun 'cause you can use them more often
- [00:29:11.810]and in more contexts.
- [00:29:15.230]But maybe this works for cognates,
- [00:29:17.350]so I speak Spanish and English.
- [00:29:19.520]I don't speak Spanish as well
- [00:29:20.890]as I hope I would speak Spanish.
- [00:29:23.660]But I know the word chocolate in English
- [00:29:26.210]and so it might be really easy for me to acquire words
- [00:29:28.620]that are direct cognates like chocolate.
- [00:29:30.530]It's like the exact same orthographic form,
- [00:29:33.000]different phonological form, but very similar.
- [00:29:35.600]And so having that knowledge of the words
- [00:29:38.150]in my first language might help me fast map those words
- [00:29:40.630]a little bit better when they're cognates.
- [00:29:43.697]But what about other words.
- [00:29:44.530]So like casa and house.
- [00:29:46.429]What can the word casa tell me about the fact
- [00:29:49.813]that the English translation equivalent is house.
- [00:29:52.360]Well, there's an S in both of 'em.
- [00:29:54.340]I don't know, maybe that helped.
- [00:29:55.774](laughs)
- [00:29:57.250]But other than that, there's not much.
- [00:29:59.270]I mean maybe you can build on the conceptual knowledge
- [00:30:01.530]that you have about what a casa is to help you acquire
- [00:30:04.930]the word house and fast map that concept
- [00:30:06.900]a little bit easier.
- [00:30:08.240]But there's nothing specifically
- [00:30:09.520]about the forms of the words
- [00:30:10.910]that are going to be useful to you when you're learning
- [00:30:13.580]a second language.
- [00:30:15.370]And so, the questions that I asked around this situation
- [00:30:19.930]were does information regarding words that you only know
- [00:30:22.390]in your first language provide unique information
- [00:30:24.920]about your future vocabulary development
- [00:30:26.770]in your second language?
- [00:30:27.730]So words you know in your first language
- [00:30:29.240]but not in your second language,
- [00:30:30.630]does that knowledge transfer over
- [00:30:33.000]to your second language essentially?
- [00:30:36.270]And then similarly, at the individual word level,
- [00:30:39.980]are children more likely to acquire translation equivalents
- [00:30:42.940]for words they already know in their first language
- [00:30:45.310]than they are to acquire words in English
- [00:30:47.060]that they don't know in their first language?
- [00:30:50.190]Two very, almost identical questions, just one is looking
- [00:30:53.320]at scale level data in the first one,
- [00:30:54.850]and the second one is going to be looking
- [00:30:56.190]at item-level data.
- [00:30:58.870]So I did this in two samples and kids completed receptive
- [00:31:02.140]and definitional vocabulary assessments at two time points
- [00:31:05.040]in each sample in both languages.
- [00:31:06.820]So not surprisingly, I just talked about that situation
- [00:31:09.310]of casa and house.
- [00:31:10.390]And not surprisingly, you often see negative
- [00:31:13.260]or non-significant cross-language correlations
- [00:31:15.350]of vocabulary knowledge.
- [00:31:16.700]There's no real intuitive reason
- [00:31:18.030]why vocabulary should transfer especially when you think
- [00:31:20.750]about the more time kids are hearing one language,
- [00:31:23.960]there's only a finite amount of time they're hearing
- [00:31:25.840]language in general, so the more time
- [00:31:27.630]they're hearing Spanish,
- [00:31:28.463]the less time they're hearing English.
- [00:31:29.970]So that sort of helps me think
- [00:31:31.660]about the negative correlation across languages.
- [00:31:35.490]But people have developed alternative methods
- [00:31:37.330]of interpreting this and assessing vocabulary knowledge,
- [00:31:40.490]one of which is conceptual vocabulary.
- [00:31:42.880]So in conceptual vocabulary you figure out
- [00:31:45.730]which words kids only know in Spanish,
- [00:31:48.220]which words kids only know in English,
- [00:31:49.835]and which words kids know in both languages.
- [00:31:52.290]And I guess you can look at also which words
- [00:31:54.050]kids don't know in either language on your assessment.
- [00:31:57.100]So it gives you a little bit more information, though,
- [00:31:59.130]than just here's an English vocabulary assessment.
- [00:32:01.620]Which words did you know in English?
- [00:32:03.260]'Cause we can see they knew this word in English
- [00:32:04.980]but they didn't know it in Spanish or vice versa.
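
The four-way split described here maps naturally onto set operations. This is a minimal sketch with hypothetical items and responses for one child; the function name and the "known in at least one language" tally are assumptions for illustration, not the scoring rules of the assessments used in the study.

```python
def conceptual_vocabulary(known_spanish, known_english, all_items):
    """Sort assessed concepts by the language(s) in which a child knows them."""
    spanish, english = set(known_spanish), set(known_english)
    return {
        "spanish_only": spanish - english,
        "english_only": english - spanish,
        "both": spanish & english,
        "neither": set(all_items) - spanish - english,
    }

# Hypothetical item set and responses for one child
items = ["house", "dog", "chocolate", "water"]
result = conceptual_vocabulary(
    known_spanish=["house", "chocolate"],
    known_english=["chocolate", "dog"],
    all_items=items,
)

# Concepts known in at least one language
known_in_any = len(items) - len(result["neither"])
print(result, known_in_any)
```
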
- [00:32:08.580]So, when I look at scale level data,
- [00:32:11.198]I'm just looking at the relations
- [00:32:12.860]between L1 and L2 vocabulary
- [00:32:14.710]using longitudinal multiple regression.
- [00:32:16.410]So, I'm predicting time two English vocabulary
- [00:32:19.700]from time one Spanish vocabulary and vice versa.
- [00:32:23.391]And so there's a lot of coefficients here,
- [00:32:24.740]but what I'll draw your attention to are these ones in red
- [00:32:27.450]that I highlighted in red.
- [00:32:29.430]These are the unique effects of Spanish vocabulary
- [00:32:33.661]at time one on English vocabulary at time two
- [00:32:37.540]after controlling for words kids already knew in English.
- [00:32:40.690]There's no effects.
- [00:32:43.940]At the scale level, knowing which words, how many words
- [00:32:47.110]kids knew in Spanish only does not tell us anything
- [00:32:50.610]about what they're going to do at a later time point
- [00:32:52.980]and what they're gonna know
- [00:32:53.813]at a later time point in English.
- [00:32:55.720]Same thing goes the other way around, right?
- [00:32:57.551]This is words known in English at time one
- [00:32:59.960]predicting words known in Spanish at time two.
- [00:33:02.370]So, knowing what words kids knew in English but not Spanish
- [00:33:06.200]doesn't tell us anything
- [00:33:07.630]about their future Spanish vocabulary.
- [00:33:10.450]And so if you are thinking about this
- [00:33:12.350]the way other people think about cross-language transfer
- [00:33:14.830]in that a significant positive regression coefficient
- [00:33:17.700]means kids transfer knowledge across languages,
- [00:33:19.736]this means they don't.
- [00:33:21.440]I don't necessarily agree with that interpretation,
- [00:33:25.140]but that's what this interpretation would be.
- [00:33:27.399]But I can use hierarchical generalized linear models
- [00:33:30.360]to look at item-level data, and "generalized,"
- [00:33:33.230]it's sort of weird to call it a generalized linear model.
- [00:33:36.450]At least in the HLM framework, what that actually means
- [00:33:39.380]is it's a log linear model.
- [00:33:40.670]You're predicting a dichotomous outcome.
- [00:33:42.390]So I don't know why they call it a generalized linear model,
- [00:33:44.400]but that's what they do.
- [00:33:45.377]And so items are crossed with participants.
- [00:33:47.630]We actually have a cross-classified model
- [00:33:49.100]not a nested model 'cause every kid gets every item
- [00:33:51.340]on the assessment and we're predicting a probability
- [00:33:54.920]of responding correctly to English vocabulary items
- [00:33:58.420]at time two.
- [00:33:59.490]And so because we have a dichotomous outcome,
- [00:34:02.000]it's just whether or not you got
- [00:34:03.370]each individual item correct.
- [00:34:05.600]The results are reported as odds ratios.
- [00:34:08.940]And so, I'll draw your attention to these middle rows.
- [00:34:14.610]This represents the interaction of whether you knew the word
- [00:34:17.940]in English at time one and whether you knew the word
- [00:34:20.500]in Spanish at time one.
- [00:34:22.100]And so these interactions are significant.
- [00:34:24.670]And what they look like when you graph them is,
- [00:34:29.343]so this is the probability of knowing a word in English
- [00:34:32.700]at time two on the x-axis when it was known
- [00:34:35.680]in Spanish in time one or not known in Spanish at time one.
- [00:34:38.380]And then the solid line is
- [00:34:39.570]when it's not known in English at time one.
- [00:34:41.780]The dotted line is when it is known in English at time one.
- [00:34:45.190]So, when a word is known in English and Spanish
- [00:34:49.250]at time one, really high probability of knowing it.
- [00:34:51.790]Pretty high probability of knowing it in English at time two
- [00:34:55.280]when you knew it in English at time one
- [00:34:56.650]but it wasn't known in Spanish at time one.
- [00:34:59.200]When you don't know it in either language at time one,
- [00:35:02.400]only like a 25% chance
- [00:35:03.890]you're gonna know that word at time two.
- [00:35:06.040]But what's interesting is when you knew the word in Spanish
- [00:35:09.380]at time one, but you didn't know it in English at time one,
- [00:35:12.560]this dot right here,
- [00:35:14.040]you're almost as likely to know it in English at time two
- [00:35:16.870]as you are for words that you knew in both languages.
- [00:35:20.018]And actually, which I can't really explain,
- [00:35:22.060]more likely to know it than words you did know
- [00:35:23.970]in English at time one but not in Spanish. (laughs)
- [00:35:26.770]But right, this is a really sort of powerful effect.
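
The step from coefficients to the plotted probabilities can be sketched as follows. Only the roughly 25% baseline comes from the talk; every coefficient below is a made-up value chosen to reproduce the qualitative pattern described, and the random effects for children and items in the cross-classified model are left out.

```python
import math

def probability(log_odds):
    """Convert log-odds to a probability (inverse logit)."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Baseline: word not known in either language at time one, about 25% (from the talk)
b0 = math.log(0.25 / 0.75)
# Hypothetical coefficients on the log-odds scale (not the study's estimates)
b_english = 2.0        # known in English at time one
b_spanish = 2.5        # known in Spanish at time one
b_interaction = -1.5   # known in both at time one

def p_known_at_time_two(english_t1, spanish_t1):
    """Predicted probability of knowing the English word at time two."""
    log_odds = (b0 + b_english * english_t1 + b_spanish * spanish_t1
                + b_interaction * english_t1 * spanish_t1)
    return probability(log_odds)

for english_t1 in (0, 1):
    for spanish_t1 in (0, 1):
        p = p_known_at_time_two(english_t1, spanish_t1)
        print(f"English={english_t1} Spanish={spanish_t1} p={p:.2f}")

# An odds ratio is the exponentiated coefficient
odds_ratio_spanish = math.exp(b_spanish)
```

With an interaction in the model, the effect of knowing a word in Spanish depends on whether it was already known in English, so a single odds ratio does not summarize it; comparing the four predicted probabilities does.
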
- [00:35:31.610]I don't know that this means kids transfer knowledge
- [00:35:33.810]across languages, but it certainly means that these words
- [00:35:37.060]that they know in Spanish are relevant to them
- [00:35:39.630]and so they're probably likely to seek out
- [00:35:41.880]and try and figure out what the label is
- [00:35:44.370]for that concept in English
- [00:35:46.110]when they're in an English-speaking environment.
- [00:35:51.010]Now you might say this is transfer.
- [00:35:52.752]I don't know that I would, but some people might say
- [00:35:54.470]this is cross-language transfer and so when you're trying
- [00:35:57.030]to address this research question of cross-language transfer
- [00:35:59.210]of vocabulary knowledge,
- [00:36:00.370]you might get two radically different answers
- [00:36:02.560]depending on whether you look at individual items
- [00:36:05.200]on the assessment or whether you look at the total scores
- [00:36:08.410]on the assessment.
- [00:36:10.520]So, I know I sort of already summarized that
- [00:36:14.530]but the point I'm trying to make here is it's important
- [00:36:17.350]to look at your data in different ways,
- [00:36:19.790]but I also wanna couch that with don't just pick the way
- [00:36:23.240]that confirms your theory or hypothesis.
- [00:36:26.124]The fact that you see differences in and of itself
- [00:36:29.020]is interesting and I think you should report both of those
- [00:36:33.810]if you do see dramatic differences in your results
- [00:36:38.110]depending on the unit of analysis.
- [00:36:40.650]Now if you don't see any differences,
- [00:36:41.970]you know maybe that's not as interesting,
- [00:36:43.380]you just go with the traditionally reported analysis,
- [00:36:45.870]which in this case would probably be the scale-level data.
- [00:36:52.620]So, that's a little bit about unit of analysis,
- [00:36:55.170]but now back to theory a little bit because I wanna go into
- [00:36:57.544]latent variable models and factor analysis.
- [00:37:00.220]So, just like there's
- [00:37:01.300]this Developmental Interdependence Hypothesis
- [00:37:03.270]and there's threshold hypothesis,
- [00:37:04.570]Cummins also says well, the way I explain this transfer
- [00:37:08.710]is that kids have a common underlying proficiency
- [00:37:10.990]for their languages.
- [00:37:12.390]So I included this picture here
- [00:37:13.840]'cause it's kind of creepy looking and funny.
- [00:37:17.050]So it's this really creepy, eyeless head,
- [00:37:22.880]like disembodied head and in the brain we have
- [00:37:25.970]a common underlying proficiency
- [00:37:27.130]and input in both languages increases that.
- [00:37:30.760]It's not like there is a separate proficiency for L1
- [00:37:34.107]and a separate proficiency for L2.
- [00:37:36.130]It's input in both of the languages improves
- [00:37:38.900]this common underlying proficiency for language skills.
- [00:37:42.670]You can also conceptualize this as this dual iceberg model
- [00:37:46.810]where you have surface features of the first language
- [00:37:48.940]and surface features of the second language
- [00:37:50.280]that are sort of apparent in how language gets used,
- [00:37:53.400]but then kids have this common underlying proficiency
- [00:37:56.330]where the icebergs are actually connected under water.
- [00:38:01.290]And so I sort of like this analogy 'cause you'll see
- [00:38:04.130]when I talk about latent variable models,
- [00:38:05.720]this looks very similar to a model I'm going to show you.
- [00:38:08.010]I learned about bifactor modeling at one point
- [00:38:10.150]and I was like, huh, this is perfect to address
- [00:38:13.331]this common underlying proficiency theory.
- [00:38:17.060]In a traditional confirmatory factor analysis
- [00:38:19.350]if I wanna look at phonological awareness in Spanish
- [00:38:21.427]and English, I could have a one-factor model
- [00:38:23.720]where all my Spanish and English phonological awareness
- [00:38:26.140]variables are loading onto the same factor.
- [00:38:28.170]And that just tells me about the child's
- [00:38:29.990]general phonological awareness ability.
- [00:38:32.780]And I could compare that to a two-factor model
- [00:38:35.320]where Spanish phonological awareness items load
- [00:38:38.370]onto their own construct,
- [00:38:40.070]English phonological awareness items load
- [00:38:41.630]onto their own construct,
- [00:38:42.730]and I can estimate the correlation between the two
- [00:38:45.100]Spanish and English phonological awareness constructs.
- [00:38:47.500]And maybe that correlation is like a common
- [00:38:49.783]underlying proficiency.
- [00:38:53.315]Well, what I can then do is compare that two-factor model
- [00:38:56.060]to a bifactor model.
- [00:38:58.410]And if you rotate this mentally
- [00:39:01.870]90 degrees counter-clockwise, you will see this looks
- [00:39:04.700]exactly like that dual iceberg model.
- [00:39:07.490]You have these surface features of
- [00:39:08.950]Spanish phonological awareness that are specific to Spanish.
- [00:39:12.200]So there's features of English phonological awareness
- [00:39:13.970]that are specific to English, and then you have this general
- [00:39:16.710]phonological awareness ability that's common
- [00:39:18.857]across all of the Spanish and English indicators.
- [00:39:25.000]So, if we look at this model,
- [00:39:26.257]and if this model provides a better fit to the data
- [00:39:29.070]than this model, then that suggests
- [00:39:31.160]that there is a common underlying proficiency
- [00:39:33.170]for phonological awareness for these kids.
- [00:39:36.750]So, again I had a big sample for this study
- [00:39:39.410]and what you can see over here,
- [00:39:41.900]these first two groups are phonological awareness,
- [00:39:44.920]this one is print knowledge,
- [00:39:46.270]and then down here these bottom two are vocabulary.
- [00:39:48.737]And you can see in the bifactor model there's a significant
- [00:39:52.200]chi-square difference test for blending, elision,
- [00:39:55.070]which are both phonological awareness, and print knowledge.
- [00:39:58.080]And that suggests that the bifactor model provides
- [00:40:00.380]the best fit to the data for those skills.
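
The comparison described here is a chi-square difference test between models treated as nested. This is a minimal sketch using only the Python standard library. The fit statistics are hypothetical, not the study's values, and `chi2_sf` is a helper written for this illustration. With robust estimators a scaled version of the test is normally needed, which this sketch does not cover.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, k = math.exp(-half), 2                  # exact tail for df = 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1       # exact tail for df = 1
    # Recurrence: Q(k + 2) = Q(k) + half**(k/2) * exp(-half) / gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q

# Hypothetical fit statistics, not the values from the study
two_factor = {"chi2": 112.4, "df": 34}
bifactor = {"chi2": 85.1, "df": 28}

delta_chi2 = two_factor["chi2"] - bifactor["chi2"]
delta_df = two_factor["df"] - bifactor["df"]
p_value = chi2_sf(delta_chi2, delta_df)
print(f"delta chi2 = {delta_chi2:.1f}, delta df = {delta_df}, p = {p_value:.5f}")
```

A small p-value indicates that the less constrained model (here, the bifactor model) fits significantly better than the more constrained one.
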
- [00:40:01.990]So, for those more code-related aspects of literacy skills,
- [00:40:05.380]there does appear to be this common underlying proficiency.
- [00:40:08.980]But for vocabulary, a two-factor model provided the best fit
- [00:40:14.580]to the data, suggesting there wasn't actually
- [00:40:18.200]a common underlying proficiency for vocabulary knowledge.
- [00:40:20.510]And actually, when we had that two-factor model,
- [00:40:22.590]the correlation between the two vocabulary factors
- [00:40:26.090]was not statistically significant.
- [00:40:28.310]So they are just like two separate things,
- [00:40:29.980]Spanish and English vocabulary.
- [00:40:33.994]But what's really cool about bifactor modeling is not only
- [00:40:36.510]can you just describe this and say,
- [00:40:37.920]hey, there is this common underlying proficiency,
- [00:40:41.120]but we can use statistical metrics to figure out
- [00:40:43.365]how much of children's abilities are accounted for
- [00:40:46.940]by that common underlying proficiency,
- [00:40:48.440]by that general factor
- [00:40:49.990]and how much is unique to each language.
- [00:40:53.770]And you could do this for anything.
- [00:40:55.190]If you have intelligence scales,
- [00:40:56.920]you look at you have general intelligence and then fluid
- [00:40:59.150]and crystallized intelligence
- [00:41:00.300]and you could see how much variance is general intelligence
- [00:41:04.570]and how much variance in the items
- [00:41:06.070]is specific to each of those things.
- [00:41:08.410]And so you do this using omega.
- [00:41:10.360]So, omega is an index of reliability that's been proposed
- [00:41:14.180]as an alternative to alpha
- [00:41:15.290]because alpha has some limitations.
- [00:41:17.070]Alpha assumes your data are unidimensional.
- [00:41:19.880]We just saw in those factor models
- [00:41:23.300]the data aren't unidimensional.
- [00:41:25.366]So alpha wouldn't be appropriate.
- [00:41:28.330]Alpha also assumes that you have equal factor loadings
- [00:41:30.700]across items, and so that means that each item or indicator
- [00:41:34.370]is equally related to the latent variable.
- [00:41:38.220]And that's almost always not going to be true.
- [00:41:41.590]So, that's another limitation of alpha.
- [00:41:44.300]Whereas omega is computed from the factor loadings
- [00:41:48.520]from the details of a specific model.
- [00:41:51.320]And so, therefore, it doesn't rely on these assumptions.
- [00:41:53.990]It is specific to whatever model that you have.
- [00:41:59.249]And so here's where this gets a little more methodsy
- [00:42:00.910]than some other things I've talked about.
- [00:42:03.810]But these equations aren't as scary as they look.
- [00:42:06.460]So, omega total.
- [00:42:07.340]These lambdas are really just the factor loadings.
- [00:42:11.140]So you sum the factor loadings for each factor, square those sums,
- [00:42:13.180]and add them up across all the factors.
- [00:42:14.140]And so in this model there is a general factor
- [00:42:18.000]and then there are four specific factors.
- [00:42:19.640]I just had two specific factors, Spanish and English.
- [00:42:21.910]But in this model, they have four specific factors.
- [00:42:25.580]And then this term over here is the error variance.
- [00:42:30.050]And so what omega total tells you is the percent
- [00:42:33.110]of the variance that is actually in the true score
- [00:42:35.490]that is actually reliable variance and not due to error.
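
The slide itself is not part of the transcript. The commonly used form of omega total for a bifactor model, assumed here to be what the slide shows, is given below. The notation is an assumption: $\lambda_{i,G}$ is item $i$'s loading on the general factor, $\lambda_{i,k}$ its loading on specific factor $k$, and $\theta_i$ its error variance.

$$
\omega_{\text{total}} =
\frac{\left(\sum_{i} \lambda_{i,G}\right)^{2} + \sum_{k}\left(\sum_{i \in k} \lambda_{i,k}\right)^{2}}
     {\left(\sum_{i} \lambda_{i,G}\right)^{2} + \sum_{k}\left(\sum_{i \in k} \lambda_{i,k}\right)^{2} + \sum_{i} \theta_{i}}
$$

The numerator is the variance attributable to all factors together, and the denominator adds the error variance, so the ratio is the proportion of total score variance that is reliable.
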
- [00:42:40.960]Now, omega hierarchical is used to get
- [00:42:44.340]the amount of variance in the true score
- [00:42:46.140]that is due to any individual factor.
- [00:42:49.180]So here, we're computing the percentage
- [00:42:52.390]of the total variance that's attributable
- [00:42:53.880]to the general factor.
- [00:42:55.360]We could sub in this numerator
- [00:42:58.410]with group one or group two
- [00:43:00.240]or any of the other group factors
- [00:43:02.810]and get the variance that's attributable to that factor.
- [00:43:06.670]And then if you divide omega hierarchical by omega total,
- [00:43:10.280]you get the percent of reliable variance
- [00:43:12.770]that's accounted for by each factor.
- [00:43:14.330]So the percent of variance that's not due to error
- [00:43:16.370]that's accounted for by each factor.
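
These quantities can be computed from a set of loadings. This is a minimal sketch that assumes the usual formulation, in which the loadings on each factor are summed and that sum is squared. The loadings are hypothetical standardized values for six indicators, three Spanish and three English; they are not the study's estimates, and the function name is made up for illustration.

```python
def bifactor_omegas(general, groups, errors):
    """Return omega total, omega hierarchical, and group-factor omegas."""
    general_var = sum(general) ** 2
    group_var = {name: sum(loads) ** 2 for name, loads in groups.items()}
    common_var = general_var + sum(group_var.values())
    total_var = common_var + sum(errors)
    omega_total = common_var / total_var
    omega_h = general_var / total_var
    omega_groups = {name: v / total_var for name, v in group_var.items()}
    return omega_total, omega_h, omega_groups

# Hypothetical standardized loadings for six indicators
general = [0.6] * 6
groups = {"spanish": [0.5, 0.5, 0.5], "english": [0.3, 0.3, 0.3]}
# Error variance per item: 1 minus the squared loadings on its two factors
errors = [1 - 0.6**2 - 0.5**2] * 3 + [1 - 0.6**2 - 0.3**2] * 3

omega_total, omega_h, omega_groups = bifactor_omegas(general, groups, errors)

# Share of the reliable variance accounted for by each factor
share_general = omega_h / omega_total
share_groups = {name: v / omega_total for name, v in omega_groups.items()}
print(round(omega_total, 3), round(omega_h, 3), round(share_general, 3))
```

Dividing each factor's omega by omega total rescales it so that the shares of the general and specific factors add up to one.
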
- [00:43:18.150]Similarly, you can do this for just the different subscales.
- [00:43:21.610]So I can do this for just the Spanish items
- [00:43:23.800]or I can do this for just the English items
- [00:43:26.010]in my assessment.
- [00:43:28.480]So this is an example where they had an anxiety scale
- [00:43:31.010]and here's their general anxiety factor,
- [00:43:33.100]and then they had physical symptoms of anxiety.
- [00:43:35.180]Things like sweating, shaking, things like that.
- [00:43:38.850]So that's one of their subscales is
- [00:43:41.770]what are your physical symptoms of anxiety?
- [00:43:44.180]And so here they're computing the percent of variance
- [00:43:47.070]across all those physical symptoms items
- [00:43:49.380]that's not due to error.
- [00:43:53.160]Here you can compute the percent of variance
- [00:43:56.260]in those items that's due
- [00:43:57.560]to that specific physical symptoms factor.
- [00:44:00.130]You could also swap in this variable
- [00:44:02.860]and get the percent of variance
- [00:44:04.060]that's due to the general anxiety factor.
- [00:44:06.760]When I did this, I see that
- [00:44:09.110]when I look at my omega total values for my models
- [00:44:11.750]in Spanish and English phonological awareness
- [00:44:13.310]and print knowledge, almost all of the variance
- [00:44:15.800]is true score variance that is reliable.
- [00:44:18.670]You know only two percent,
- [00:44:19.860]two to three percent is error variance.
- [00:44:22.600]And then I have my omega hierarchical for each one
- [00:44:25.380]and so I can divide .35 by .98 and I get
- [00:44:29.100]the percent of variance in blending skills
- [00:44:31.590]that's accounted for by the general blending factor,
- [00:44:34.990]and it's 36%.
- [00:44:36.670]I'm almost dividing it by one
- [00:44:37.910]so it's not changing very much.
- [00:44:39.620]Almost 50% of the variance in blending skills
- [00:44:42.250]is being accounted for by the Spanish factor.
- [00:44:44.470]And only 12% being accounted for by the English factor.
- [00:44:47.380]This sort of makes sense to me.
- [00:44:48.550]These kids are preschoolers and most
- [00:44:50.400]of these dual language learners show up to preschool
- [00:44:52.520]having not been exposed to English hardly at all.
- [00:44:55.570]And so there's very little variance
- [00:44:57.130]that's being accounted for
- [00:44:58.060]by these specific English factors.
- [00:45:01.100]I'll skip this subscale part
- [00:45:02.440]because it's sort of the same explanation.
- [00:45:04.620]But I can gain a lot more information about what's going on
- [00:45:09.830]across my Spanish and English assessments by using
- [00:45:11.960]this sort of approach than I can with others.
- [00:45:14.920]Then, I can extend this bifactor modeling approach
- [00:45:18.120]into structural equation models.
- [00:45:20.210]And look at mediation and look at
- [00:45:21.730]my actual cross-language transfer.
- [00:45:23.770]So I can look at do those specific
- [00:45:26.450]Spanish phonological awareness skills predict
- [00:45:28.990]later English reading outcomes?
- [00:45:32.350]After controlling for that general phonological awareness
- [00:45:35.270]ability or that English specific
- [00:45:36.650]phonological awareness ability.
- [00:45:38.429]And so I can do that using mediation analysis.
- [00:45:41.550]So, I'll show you some models really quickly here.
- [00:45:44.320]We have this one-factor model in kindergarten,
- [00:45:46.210]it doesn't fit the data very well.
- [00:45:47.590]This is word reading, not phonological awareness,
- [00:45:49.840]so shifting a little bit.
- [00:45:50.730]But I have Spanish and English word reading.
- [00:45:53.020]Just saying that Spanish and English word reading
- [00:45:54.630]are the same doesn't describe the data very well.
- [00:45:57.800]I've got a pretty good fit for my two-factor model.
- [00:46:03.000]But I have even better fit for my bifactor model.
- [00:46:05.100]What's sort of interesting, though,
- [00:46:06.700]is that the specific English factor, none of the loadings
- [00:46:10.410]were significant and so they dropped out
- [00:46:11.900]and so my bifactor model actually says
- [00:46:13.440]there is no specific English phonological awareness ability
- [00:46:16.010]or word reading ability at kindergarten.
- [00:46:18.090]It's just all being captured
- [00:46:19.350]by this general decoding factor.
- [00:46:22.030]And so this is my model in kindergarten.
- [00:46:24.500]And then I also did this for first grade skills.
- [00:46:27.790]So, again my one-factor model doesn't fit the data well.
- [00:46:30.530]My two-factor model fits the data really well,
- [00:46:33.140]and this one also fits the data well,
- [00:46:35.920]but it didn't fit the data any better
- [00:46:38.440]than the two-factor model.
- [00:46:39.930]So, for sake of parsimony,
- [00:46:41.830]this was selected as the best model.
- [00:46:44.680]So, I'm gonna skip through.
- [00:46:46.550]Here's our final model for kindergarten.
- [00:46:48.320]Oops.
- [00:46:49.720]Our final model for kindergarten, we have general decoding
- [00:46:51.860]and Spanish decoding.
- [00:46:53.530]Our final model for first grade, we have English decoding
- [00:46:55.747]and Spanish decoding.
- [00:46:56.850]And I'm gonna look at if these things
- [00:46:58.270]predict English reading comprehension at grade three.
- [00:47:00.910]And so here's the final structural model
- [00:47:02.810]with only the significant effects in the model.
- [00:47:05.870]I also had observed English vocabulary skills
- [00:47:08.080]in kindergarten.
- [00:47:09.710]So, we can see it doesn't look like
- [00:47:14.070]there's cross language transfer, right?
- [00:47:15.700]We don't see any effects of the Spanish skills
- [00:47:18.430]on English reading comprehension after we control
- [00:47:21.110]for English decoding in first grade
- [00:47:23.230]and English vocabulary in kindergarten.
- [00:47:25.730]So, it sort of looks like transfer is not occurring,
- [00:47:29.330]but we have this significant path
- [00:47:31.410]from unique Spanish word-reading skills in kindergarten
- [00:47:35.050]to unique English word-reading skills in the first grade.
- [00:47:38.560]And so, if we use mediation analysis,
- [00:47:40.920]we see that there actually is an indirect effect
- [00:47:43.900]of Spanish word reading on English reading comprehension
- [00:47:48.370]that goes through English word reading.
- [00:47:50.110]So kids' Spanish word-reading skills improve
- [00:47:52.610]their English word-reading skills,
- [00:47:53.730]which in turn predict their English vocabulary at all,
- [00:47:57.260]or reading comprehension.
- [00:47:58.810]Sorry, gettin' tripped up on my words here.
- [00:48:01.040]Same thing with the general factor,
- [00:48:02.270]we see an even larger indirect effect
- [00:48:04.850]of just general decoding skills at kindergarten.
- [00:48:09.740]So we can take that bifactor approach,
- [00:48:11.533]which wasn't really looking at transfer,
- [00:48:13.490]was more just looking at that common underlying proficiency,
- [00:48:16.020]but then I can use that approach and extend it
- [00:48:19.570]to look at cross language transfer.
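
The indirect effect described above (kindergarten Spanish word reading, through first-grade English word reading, to later English reading comprehension) can be sketched as a product of two regression slopes with a percentile bootstrap interval. This is a simplified illustration on simulated observed scores, not the latent-variable model from the talk. The variable names, sample size, and effect sizes are all assumptions made for the example.

```python
import random


def cross_products(u, v):
    """Sum of centered cross-products of two equal-length lists."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """a*b, where a is the slope of M on X and b is the slope of Y on M controlling for X."""
    sxx, smm = cross_products(x, x), cross_products(m, m)
    sxm, sxy, smy = cross_products(x, m), cross_products(x, y), cross_products(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b


# Simulated data (hypothetical): X affects Y only through M, true a*b = 0.5 * 0.4.
rng = random.Random(0)
n = 1000
x = [rng.gauss(0, 1) for _ in range(n)]        # e.g. Spanish word reading, kindergarten
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]   # e.g. English word reading, first grade
y = [0.4 * mi + rng.gauss(0, 1) for mi in m]   # e.g. English reading comprehension

estimate = indirect_effect(x, m, y)

# Percentile bootstrap: resample children with replacement and re-estimate a*b.
boot = []
for _ in range(200):
    idx = [rng.randrange(n) for _ in range(n)]
    boot.append(indirect_effect([x[i] for i in idx],
                                [m[i] for i in idx],
                                [y[i] for i in idx]))
boot.sort()
ci_low = boot[int(0.025 * len(boot))]
ci_high = boot[int(0.975 * len(boot)) - 1]
```

If the interval from `ci_low` to `ci_high` excludes zero, that is the usual evidence for an indirect effect, even when the direct path from X to Y is not significant.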
- [00:48:24.530]So, there might be some other experimental approaches
- [00:48:28.700]that might be more appropriate.
- [00:48:29.620]So, you guys, I'm sure you know
- [00:48:32.080]that randomized controlled trials are sort of the gold standard
- [00:48:35.840]in evaluating whether one variable
- [00:48:38.350]is causally related to another.
- [00:48:39.930]So, if I randomly assign kids to get
- [00:48:42.600]Spanish reading instruction and that's all they got
- [00:48:46.590]was Spanish reading instruction and the control group
- [00:48:48.363]just got whatever they normally get.
- [00:48:50.250]And the kids in my Spanish reading instruction group
- [00:48:52.080]improved in English reading above the control group,
- [00:48:55.100]that suggests that transfer occurred.
- [00:48:58.130]But also, if level of skill in Spanish at pre-test moderates
- [00:49:03.620]the impact of the intervention on English skills post-test,
- [00:49:06.630]that would also suggest transfer.
- [00:49:08.210]So, if kids who had started off with higher Spanish skills
- [00:49:10.483]benefited more from the intervention than kids
- [00:49:13.030]who started off with low Spanish skills,
- [00:49:14.980]that would suggest transfer.
- [00:49:17.070]So, I wanted to look at this
- [00:49:19.000]to have some stronger evidence of transfer.
- [00:49:22.480]And so here kids were randomly assigned
- [00:49:23.960]to receive early literacy instruction and then we looked at
- [00:49:26.624]whether there was a significant moderation.
- [00:49:29.530]Here in this row, there's a significant positive effect
- [00:49:32.730]of the intervention group.
- [00:49:33.563]So, kids in the intervention group always outperformed
- [00:49:35.710]control group regardless of the outcome, vocabulary,
- [00:49:38.410]phonological awareness, or letter knowledge.
- [00:49:41.310]Here what we see is that for some vocabulary
- [00:49:43.107]and phonological awareness assessments,
- [00:49:47.160]there was a significant moderation
- [00:49:49.080]of the intervention impact by Spanish skills at pre-test.
- [00:49:53.530]And so what that looks like again,
- [00:49:54.830]so here on the x-axis we have the intervention groups,
- [00:49:58.140]on the y-axis we have English vocabulary
- [00:50:00.030]or English phonological awareness,
- [00:50:01.510]and then these dotted lines are kids
- [00:50:03.650]with high Spanish skills at pre-test.
- [00:50:07.700]The solid lines are kids
- [00:50:08.533]with low Spanish skills at pre-test.
- [00:50:10.480]And so what we see is there's a stronger relation
- [00:50:12.700]between intervention group and English skills at post-test
- [00:50:15.708]for the kids who have the higher levels of Spanish skills
- [00:50:18.910]at pre-test, suggesting that those kids were able to draw
- [00:50:21.860]on their Spanish skills to benefit from the intervention.
- [00:50:25.190]Sort of like that Matthew Effect I was describing.
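
The moderation test described above amounts to a regression with a treatment-by-pretest interaction term. The sketch below is a minimal illustration on simulated data, not the analysis from the study: the coefficients, sample size, and variable names are assumptions, and the small `ols` helper stands in for whatever statistical software was actually used.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Simulated trial (hypothetical): the intervention helps everyone,
# and helps more when standardized Spanish pre-test skill is higher.
rng = random.Random(1)
rows, outcome = [], []
for _ in range(2000):
    treat = rng.randint(0, 1)       # 1 = intervention group, 0 = control
    spanish = rng.gauss(0, 1)       # Spanish skill at pre-test (z-score)
    rows.append([1.0, treat, spanish, treat * spanish])
    outcome.append(0.3 * treat + 0.2 * spanish
                   + 0.25 * treat * spanish + rng.gauss(0, 1))

b0, b_treat, b_spanish, b_interaction = ols(rows, outcome)

# Simple slopes: intervention effect one SD above and below the Spanish mean.
effect_high = b_treat + b_interaction * 1.0
effect_low = b_treat + b_interaction * -1.0
```

A positive `b_interaction` corresponds to the pattern in the plot: the gap between intervention and control is wider for children with high Spanish skills at pre-test (`effect_high`) than for children with low skills (`effect_low`).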
- [00:50:27.820]So, to wrap things up quickly
- [00:50:29.830]'cause I know I'm running short on time here,
- [00:50:32.890]the longitudinal mediation models.
- [00:50:35.023]Longitudinal mediation or the experimental models
- [00:50:39.080]provide the strongest evidence for transfer
- [00:50:40.870]and there's some more studies coming out recently
- [00:50:43.060]that show experimental evidence.
- [00:50:46.270]But despite some of the limitations, you can really get
- [00:50:49.470]some interesting information out of some of those
- [00:50:51.360]more correlational or concurrent methods
- [00:50:53.570]of evaluating the data.
- [00:50:54.877]So, with moderation we can look at whether the relation
- [00:50:58.310]between X and Y varies based on Z.
- [00:51:01.970]For quantile regression, we can look at whether the relation
- [00:51:04.540]between X and Y varies depending on the level of Y.
- [00:51:08.130]Conceptually, it's similar to a moderation analysis
- [00:51:10.660]except you're not using a third variable as a moderator,
- [00:51:13.190]you're using a level of your outcome as the moderator.
- [00:51:18.280]Depending on your unit of analysis for the data,
- [00:51:20.330]you might see interesting patterns
- [00:51:23.070]that differ across units of analysis.
- [00:51:27.120]And then bifactor modeling can give you
- [00:51:28.760]some sort of unique insights into multidimensionality
- [00:51:32.370]of constructs and how much of variance in constructs
- [00:51:37.800]is actually accounted for by these different
- [00:51:40.320]underlying abilities, whether it's your
- [00:51:43.210]group-specific abilities or your generalized ability.
- [00:51:47.424]And all of these things can be applied to a number
- [00:51:50.530]of research questions or theories.
- [00:51:52.800]I was talking to Mackenzie this morning
- [00:51:54.520]and it was like the trick is sort of finding a vague theory
- [00:51:57.883]that you can apply all these things to.
- [00:52:01.620]But I think this is not applicable
- [00:52:03.970]just to what I'm talking about,
- [00:52:05.020]I think it's applicable in a lot of areas.
- [00:52:08.440]So, does cross language transfer occur?
- [00:52:10.140]Maybe, but I think it depends on the skill.
- [00:52:12.140]I think if you go all the way back
- [00:52:13.360]to that meta-analytic evidence, it looks like there's
- [00:52:15.400]a lot more evidence for a potential transfer
- [00:52:17.240]for things like word reading,
- [00:52:18.630]things like phonological awareness,
- [00:52:19.980]these sort of aspects of reading
- [00:52:21.830]that are about sounding out words
- [00:52:25.240]and doing those sorts of tasks
- [00:52:27.200]rather than the more language-heavy aspects of reading
- [00:52:29.830]like vocabulary or reading comprehension.
- [00:52:33.320]But again, more research is needed to determine
- [00:52:35.160]how to leverage transfer to narrow
- [00:52:36.700]that problematic achievement gap.
- [00:52:39.270]And so I wanted to come back to the achievement gap
- [00:52:41.470]and just give a final reason to be optimistic about this.
- [00:52:45.520]This looks like we haven't done anything
- [00:52:47.310]since, this is 2002 right here, to 2017.
- [00:52:51.420]We haven't moved the needle at all
- [00:52:52.740]with these kids in 15 years.
- [00:52:54.890]It turns out that's not true
- [00:52:56.380]and that these data are really misleading.
- [00:52:58.470]So, these data are kids who are classified by schools
- [00:53:01.270]as English language learners, but it doesn't account
- [00:53:04.810]for these kids, the former English language learners who,
- [00:53:07.390]as they moved throughout school, they were once classified
- [00:53:12.260]as English language learners,
- [00:53:13.670]but their English skills improve
- [00:53:15.830]and they graduate from that distinction essentially.
- [00:53:19.350]So, this achievement gap data only looks at the kids
- [00:53:23.330]who are currently classified.
- [00:53:25.340]And so, a recent paper came out and did show
- [00:53:29.530]that when you look at all dual language learners
- [00:53:31.990]and not just kids classified
- [00:53:33.960]by English language learner status,
- [00:53:36.000]the achievement gap is shrinking.
- [00:53:38.330]It's still there, .3 is still a much bigger achievement gap
- [00:53:41.610]than we would like to see, but down from .45 15 years ago
- [00:53:46.840]is substantial progress.
- [00:53:48.120]Now we can't tell from these data
- [00:53:49.380]exactly what we're doing that is working,
- [00:53:51.700]but something we're doing is working, right?
- [00:53:53.410]Maybe we've used this information about transfer
- [00:53:55.610]and we're using dual language instruction more.
- [00:53:58.570]Maybe it's something else, but something's working.
- [00:54:01.840]We just need to figure out how to get that achievement gap
- [00:54:03.940]from .3 down to zero.
- [00:54:08.200]Just some last minute acknowledgements.
- [00:54:11.440]Almost all of these data that I presented
- [00:54:13.080]came from work done
- [00:54:14.260]at the Florida Center for Reading Research
- [00:54:16.362]with the Preschool Research Group down there.
- [00:54:18.930]Lots of grad students and RAs helped me out with this stuff
- [00:54:21.490]and helped others out.
- [00:54:22.950]And I also wanted to thank MAP Academy
- [00:54:24.980]and Natalie in particular for inviting me to give this talk.
- [00:54:28.750]So, thank you very much.
- [00:54:31.076](audience applause)