Statistical Methodology in the Social Sciences 2016: Session 1


So we’re having three sessions, with a coffee break at 10:30, and then lunch will be provided outside. We tried to group speakers by area of interest.
I wanted to say one thing at the start, which is that there’s a lot going on on campus. There is the Data Science Initiative, which is looking to develop data science minors at the undergraduate level, a graduate designated emphasis, and encouraged courses.
The university has had a hiring incentive program, and three of nine awards went to the data science side of things. In particular, for this one on research and academic programs, there’s a hope that eventually there’ll be a program or department within Letters and Science. And these other two are also related to data.
The Business School is starting a Master of Science in Business Data Analytics, which to begin with will be in San Francisco, but ultimately it will be here. The Statistics Department has three new data science courses at the undergraduate level, right? So there’s a lot going on.
Okay, so the first session has studies that I would say, from the [INAUDIBLE] point of view, which I am, are non-economics, maybe with a psychology bent. So we’ll see different types of work.
Our first speaker is Jingwen, from Communication, so why don’t you come on up. And the screen is small, right? So if people want to move to see the screen more easily, feel welcome to do that, okay.
The plan is that each speaker has 30 minutes. If they finish at 25 minutes, which I recommend, then we have time for a bit of feedback. Is there a clicker? Yes, there is a clicker. [SOUND] [SOUND] [INAUDIBLE]
Okay, so what you should do is put this mic on, yeah. And yes, I should have brought my extension, okay. [INAUDIBLE]
Hello, everyone. Thank you all for coming this early on a Friday morning to attend my talk. [LAUGH] I’m very glad to be here. I’m a new assistant professor at [INAUDIBLE] and just joined the [INAUDIBLE], so it’s my first time here, and I’m very excited about starting my research [INAUDIBLE].
Today I’m going to talk about my research over the past three years. The general idea is to use an online experimental approach to study online social influence on offline behaviors.
I will present three studies, and I think the goal for you is just to know how I do research, how we use persuasive technologies to change people’s behaviors, and how we utilize the findings. So the basic theoretical foundation for
this work is social influence, and we know that social influence has been studied on different levels: on a micro level, through interpersonal communication, and on a macro level, the level of social networks and how social influence flows through social networks.
We know that classical work from psychology and communication has examined a lot of interpersonal influence techniques. Cialdini talks about a lot of [INAUDIBLE] strategies for persuading people [INAUDIBLE], and the [INAUDIBLE] school had this classic book about personal influence, mostly in [INAUDIBLE] contexts.
On the macro level, social networks have become a very hot topic, and in communication there is also a lot of work being done to understand how people communicate with each other in their immediate and also extended social networks.
The book Connected actually attracted a lot of attention, and the argument is that your behaviors, ideas, and beliefs are influenced not only by your friends, but also by the friends of your friends, and their friends. So just to give a very simple
example of smoking. Smoking oftentimes happens in social settings, so people sometimes smoke under pressure from other people.
In several cultures, like back in China, smoking can be set up in different contexts to put pressure on other people. For example, at the dinner table, offering cigarettes to other people sometimes shows respect and an effort to build a relationship. So in that context you may not be able to resist taking the cigarette even if you don’t smoke.
This study by Christakis and Fowler in 2008 tracked 1,000 people from 1971 to 2000 on their smoking status. We can clearly see from this diagram that smokers were clustered in their social networks in [INAUDIBLE], where social norms around smoking were shared.
Then, when we look at the later one here, we see that many of the previous smokers have become non-smokers, and the green nodes show the non-smokers, who then cluster together [INAUDIBLE]. So this is the basic idea: we’re looking at social
influence on two different dimensions. One is the interpersonal level and the other is the macro level. And we can clearly observe that human clustering demonstrates social influence among our circles of friends and family members.
For me, the research question is how to design persuasive technologies that leverage networked social influence to change behavior.
From a communication perspective, we are looking at how to design, for example, websites and mobile applications to change people’s behaviors by utilizing the social influences in their social networks, either real social networks or social networks constructed through an experimental approach.
The behavior that I’m looking at, and that we tested in the previous studies, is physical activity, to address
the problem of sedentary lifestyles. You may have noticed that about 38% of American adults were obese in 2013 and 2014, and that people on average took 500 steps a day.
Do you know how many steps you take per day, more than that or less than that? More than that, okay.
And about 69% of young Americans failed to meet the national guidelines for physical activity [INAUDIBLE]. So a sedentary lifestyle is very common, also among certain minorities. So we’re going to see how our experiments can change people’s physical activity behaviors.
Previous interventions promoting physical activity focused on different strategies, including mass media, behavioral interventions, and financial incentive interventions.
And in recent years there have been more eHealth and mHealth interventions, including online forums for people to exchange information about how to increase their fitness, and also all those wearable devices that people can wear, like Fitbit.
Those devices are connected with apps with which people can track their physical activity and compare with other people, as shown in this picture. You can add other people as connections, and then you can compare your steps with other people’s. The issue with these self-initiated
social connections is that individuals’ choices oftentimes limit the value of their resources. People often select social contacts based on similar characteristics, including fitness levels.
This is basically based on the idea of homophily: you choose your social contacts based on similar [INAUDIBLE] and behavioral traits.
So if you are the person using the [INAUDIBLE], you connect with your friends and people that you know, and oftentimes your behaviors are at similar levels.
In this case you have all these people, but your behaviors may be very similar to each other, so from them you may not receive new information. For example, you may not see other people who are much more active than you, or much less active than you.
Our experimental approach is to see whether social networks can change behavioral outcomes through psychological mechanisms.
This model is a simplified version of the Berkman model of how social networks affect health. The logic is that social-structural conditions, including the political environment and the cultural environment, influence social network formation, and that influences the psychological mechanisms, which lead to the health outcomes.
In my research we focus on the behavioral outcome: we use persuasive technology to change the social networks, and then we change the behavioral outcomes through psychological [INAUDIBLE].>>[COUGH]>>[INAUDIBLE] We construct online social networks. So this is just a topology of how we
construct people’s social networks. For example, if you go to a website as, say, user M, and you start to see other social contacts, as researchers we control who you can see, who will be your social contacts.
So if you are in a study and you are assigned a node here, so this represents you, then your social connections are determined by us as researchers.
From an experimental perspective, we can now manipulate the social network configuration for your social contacts, and then your information exposure will be determined by your social network. Any questions on this? No, okay.
So after that we basically have people enroll in the experiments, and then they are assigned to different nodes in the network.
Just a little bit of clarification: in this [INAUDIBLE] you can see the blue nodes and red nodes. This is based on an algorithm under which people will be put together as social contacts if they’re similar in age, and this is male and this is female.
Basically we want to see whether this algorithmic construction of social networks can produce better behavior change. Yeah.>>Are you manipulating existing networks and controlling them, or constructing a network online from scratch?>>It’s from scratch. It’s a totally new network, because we don’t want to confound people’s existing network characteristics with our experimental manipulation.
So all this construction is from scratch, and when people come to the study they start the program basically seeing new people from the experiment.>>So do the people know that they’re part of an experiment?>>Yeah, but they don’t know exactly where they are in the network, so this is unknown to them. When we get to the study,
you will see how it looks.>>Okay fine.>>Yeah.
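As a rough illustration of what building a network from scratch and placing participants in it at random can look like, here is a minimal Python sketch. It is not the software used in the study. The ring-lattice topology, the function names `build_ring_network` and `assign_participants`, and the equal number of contacts per person are all assumptions made for the example; the talk describes four to six contacts per person and a similarity-based matching rule, neither of which is reproduced here.

```python
import random


def build_ring_network(n_nodes, degree):
    """Return {node: [neighbour nodes]} for a ring lattice.

    Each node is tied to degree/2 nodes on either side, so every
    position has the same number of contacts (an assumption; the
    networks described in the talk did not all have equal degree).
    """
    if degree % 2 != 0 or not 0 < degree < n_nodes:
        raise ValueError("degree must be even and smaller than n_nodes")
    half = degree // 2
    return {
        node: sorted((node + step) % n_nodes
                     for step in range(-half, half + 1) if step != 0)
        for node in range(n_nodes)
    }


def assign_participants(participants, degree=4, seed=None):
    """Randomly place participants on nodes; return {participant: contacts}."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # random assignment to network positions
    network = build_ring_network(len(shuffled), degree)
    return {
        shuffled[node]: [shuffled[neighbour] for neighbour in neighbours]
        for node, neighbours in network.items()
    }


if __name__ == "__main__":
    # Hypothetical usernames; each person's contacts are fixed by the design.
    contacts = assign_participants([f"user{i}" for i in range(10)], seed=1)
    for person, buddies in sorted(contacts.items()):
        print(person, "sees", buddies)
```

Because the researcher fixes the topology first and only then shuffles people into positions, who ends up seeing whom does not depend on participants’ existing friendships, which is the point made in the answer above.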
And then, this study is longitudinal, so we observe people’s behavior change over time in the network, in [INAUDIBLE].
I’ll talk about three studies, and this is study one. In study one we wanted to compare two different approaches to promoting behavior change, as a first step to see whether constructing an online network is more effective or less effective than a traditional communication strategy, like media messaging.
In the media messaging condition, we basically give people messages that promote physical activity. These messages are from the CDC, saying that you need this amount of activity to gain health benefits. The assumption is that individuals are independent from each other in receiving the messages from the researcher.
In the online network condition, as I just explained, when participants came into the study they were assigned to different positions, connecting with each other.
This is not the exact configuration: in the study people were placed in networks with four to six individuals, because it’s a large [INAUDIBLE], so not everyone has the same degree of connection.
The idea is that if one person changes his or her behavior, all the other connections will receive the behavior signals from that individual.
So, for example, this person starts to attend a yoga class, and attending that class, as a signal, will be sent to all the connections of that individual who starts to exercise.
And then you can envision that if the signal changes the connections’ behavior, then the behavior will cascade into the rest of the network. So, this study was done at the University of
Pennsylvania, where I did my PhD program. It was a 13-week program at the university level, and we collaborated with the center to initiate this program, so it’s a real field experiment.
The center offered 40 classes and redesigned the website for the participants to enroll in classes as part of the experiment.
The primary outcome we’re looking at is the number of class enrollments. When the students enrolled in the program, they could use the website to register for exercise classes, and we use that registration behavior as an outcome in the study.
After they enrolled in a class, they also showed up for classes, and we know that because the trainers tracked whether they attended.>>So is the class just a one-hour class, or is it a whole quarter long?>>So there are 49 free classes through the semester, and an individual can choose to attend as many as they want. One class is one hour.>>One class, one hour.>>Yeah, it’s just a training session, so the students have multiple opportunities to attend the classes. We also included a control
condition. This is the [INAUDIBLE] the participants would see when they entered the experiment. It would allow them to see all the class information and then register, and this is a tracker of how many classes they registered for and attended.
So that’s the control condition, just for individuals to use the registration process. In the media messaging condition, the basic website was the same for an individual to use, and then on a weekly basis they received a promotional message through email and on the website as well.
In this example, an email including how to do burpees is a promotional message, and all the class information was included in the email as well.
In the online social network condition, individuals were assigned [INAUDIBLE], in this case four social connections. Here they could see how other people were performing in the program, and they also received real-time signals about how other people were doing.
For example, this email says, dear Olympia, your buddy has just registered for the following class, and then gives the class information.
So, for example, if we’re in the same network and I register for one class, you all will receive that registration notification through your email, knowing that I’m going to the class, so maybe you’re inspired to go to the class as well. And this is the distribution
of class enrollment. We enrolled 217 students, a majority of them female.
Here we can see that in the fourth quartile, for the participants who attended more than six classes through the semester, there was a significant difference between the social network condition and the control condition.
Our argument is that for people who are in the same network, if individuals start to attend more and more classes, that will form a reinforcement of behavior change, so that specific network performed much better than individuals who were in the control, who just did things on their own.
And if we look at the cumulative class enrollment throughout the whole program, this is for the control, so it’s the cumulative class enrollment for all the participants who were assigned to the control condition, this is for media messaging, and this is for the online social network.
From this diagram we can see that in the first half of the program we didn’t see much difference between the messaging condition and the social network condition. They were almost the same, but they were both significantly more effective than the control condition. In the second half, however,
the social network showed more impact than the media messaging condition [INAUDIBLE].>>[INAUDIBLE] when you were talking about the specific information via media, when did that happen? [INAUDIBLE] throughout the program or at some specific point?>>So I would say it’s throughout the whole program, but also at specific times.
That means that, for example, if I’m Olivia and you’re [INAUDIBLE], which is the username, and you and I have a connection, then whenever I register for a class, you will receive my behavior notification.
So, for example, if I register for one class this week and another class next week, then this week you’ll receive one and next week you’ll receive another.>>Got it, [INAUDIBLE] throughout.>>Throughout, and also in real time, meaning that whenever I did something on the website, you immediately received that email.
So it’s throughout the whole program, and how many times you receive the email is based on how many times I register.
Okay, so that’s the first study, and we’re glad to see that the network tended to perform much better in the later half of our program, because that shows the reinforcement once people start to get active and more involved, getting to the state of [INAUDIBLE]. Yeah?>>Sorry, can you go back a second?>>Yeah.
>>So this is cumulative across a whole group of people.>>Yeah.>>Because your cumulative numbers are around 300 to 400 at the end, but I thought from a couple of slides ago that most people didn’t even attend six classes [INAUDIBLE], is that right?>>That’s right. So we had a bunch of people who were the most active in the study, so they [INAUDIBLE]
>>So this effect is being driven by [INAUDIBLE]?>>Right, driven by the people who are really active, and that also makes sense when you think about the dynamics of social influence. As people start to be active, their network gets activated toward being active.
If people’s social contacts, for example, are really sedentary, the cascade of activity won’t be activated.>>Well, you said you had 278 people.>>Correct.>>Does that include those who did nothing?>>Yeah. Okay.
>>Yeah, that’s the total sample.>>So it is a small group.>>Yeah, it’s a small group.>>And then, how is that group picked?>>We didn’t pick the group. We just recruited participants into the study and assigned them randomly to networks, to network conditions.>>So these are any students who chose to take an extra class?>>Who were interested in it, yes.>>Yeah, so there was no selection bias when they came into the network positions.
Okay, so in study two we looked into the mechanisms that could be primed in the social network condition. We compared two of them: one was social support, one was social comparison. These are two common [INAUDIBLE] used in physical activity promotion interventions in health and health communication.
The social support idea is simple. People want to help each other with behavior change, so you can give emotional support, or sometimes give very detailed suggestions on how to change behavior.
Social comparison is that people sometimes look to other people’s behaviors to evaluate their own behaviors. These are very different approaches, and we wanted to compare whether they could be utilized in social networks to change behavior. Again, the same program
and class structure. This time, in a second semester, it was 11 weeks, and the center decided to offer more classes, so this time it was 93 exercise classes. Again we used the same website for students to enroll in classes.
The primary outcome is class attendance. We looked at the number of class enrollments as well, but this time we were also tracking the number of class attendances.
In the first study, when students registered for a class, they may not necessarily have shown up for class, but in the second study we looked at class attendance as a stronger behavioral indicator of behavior change.
In this study we had four conditions. It is a two-by-two design, with a control condition using the basic website.
The social comparison condition [INAUDIBLE] is the same as the online network condition in the first study: individuals used the basic website, and they were also assigned online peers, to see [INAUDIBLE] of those peers.
In the social support condition, individuals started with the basic website and were assigned to a team with five other teammates. So, basically, they were told they would be on a team to take part in this program, and we also provided a chatting box
so that they could exchange information and emotional support.
In the combined condition, meaning combining the incentives of social comparison and social support, individuals used the basic website to register for classes, and they were assigned to a team with five other teammates as well.
But also in this condition, each team could see five other teams’ performances. So basically, within your own team you can exchange support with each other, but your team is also competing with other teams for better performance in the program. And individuals could also chat with their team members.
So in these two conditions individuals were competing for themselves as individuals, while in the other two conditions they were on a team, so individuals were given chances to talk with each other, give support, and compare their performances with each other as well.
So you can see [INAUDIBLE] for the study. Here in the team support condition, individuals were on a team, and they could exchange chat information with regard to the classes, okay.
In social comparison, they don’t have a chat box, but they can compare individual performance. And in the combined condition, this shows the team performance, so your team is competing with other teams.
It was very clear that the two conditions with the comparison incentive performed much better than the other two, the control and social support, and that social support really didn’t work for physical activity promotion at all.
And this is the cumulative class attendance again. Clearly, the combined and social comparison conditions performed
much better than the other two. Very briefly about the third study: it is a mobile app-based social network intervention for young African American women in a community in Philadelphia.
This time we used Fitbits, so we gave participants Fitbits, and we designed an app called Planet Fit to track information from the Fitbit that participants wear. So now we have this whole infrastructure for gathering all the data, with the front end being the app. I’ll skip this.
The study design is individual control versus online social network again. It’s a very basic idea, because this time we’re extending the program to a different population, so we wanted to keep the experimental design very simple.
This is what the app looks like. Individuals can check their own activities and their progress day to day, and this time, in the online network, they also see other people’s performances. So it’s the same idea, but implemented in a mobile phone solution.
This shows the mean login numbers by [INAUDIBLE] and by experimental condition. We can clearly see that throughout the whole program the login numbers decreased, but the online network decreased less than the control condition.
Basically, people in online networks log into the app more because they have social contacts. If they’re on their own, they’re less likely to log back into it.
And we found some significant effects: the online network condition increased achievement of the 90 light active minutes goal in comparison with the individual control condition. So individuals in the online network would do better at achieving the goal that we gave them than those in the individual condition.
So, some discussion and conclusions. We found that online networks can increase fitness program [INAUDIBLE]. Also, with different framings and designs we can induce different levels of social [INAUDIBLE]. And lastly, more research is needed on mediation effects to [INAUDIBLE].
In the second study we designed the interface to induce different psychological states, but we actually didn’t measure them using, say, survey measures that ask whether you feel more supported or more competitive. So further research should ask those questions and see whether that’s the real mediation observed in the study.
We are extending this program again, mostly using mobile apps, because now we can get a lot of data from people’s mobile phones and also [INAUDIBLE].
So that’s all about my research. I know it’s a lot. If you have questions, you can approach me individually, and we can talk about future collaboration. Thank you.
>>Thank you.>>Okay, we have time for one or two questions, if anyone wants to ask any. Yes.>>Thank you for your presentation. It’s a really interesting study. I’m wondering if you think any individual characteristics of the people in the networks, the people they might be comparing themselves to, would influence whether or not they engage in the physical activity [INAUDIBLE].>>Yeah, so I think your question can be tested with a moderation analysis, to see whether individuals’ characteristics moderate the effects.
We didn’t have those data, but I think it’s very interesting to ask those questions in a survey before the study.
In this case, because all individuals were randomized into the network positions, the results wouldn’t be driven by their individual psychological states. But you are right that if we had those measurements we would be able to [INAUDIBLE] that.>>Do you have a way to quantify
the extent of the results, what some people call effect sizes? Some of the results, although they’re statistically significant, seem to be on the small side. Is there a way to quantify that?>>Yes, so our effects were very small. Because it’s a field experiment, we thought even a small effect would be very significant for [INAUDIBLE]. [LAUGH] Yeah.
But if we were able to expand our programs, I think [INAUDIBLE], because it was a very minimal website, and all the [INAUDIBLE] information went out by [INAUDIBLE]. Individuals are acting [INAUDIBLE] based on the extent to which they perceive [INAUDIBLE].>>So, related to that, if you had a bigger sample, do you think you could go beyond just the summary to looking at spillover effects? So, what is the effect of friends of friends?>>Yeah, yeah, we would be able to see that.>>And you’re hoping to get to that point, or?>>Yeah, but I think in this case, the network we constructed is not as extended as what we observe in real life, where people have extensions of [INAUDIBLE]. In this case it’s only one step further, so one social connection.
But designing experiments using larger and extended networks is very difficult, because you have to have a very large sample in order to have one observation of one social network.
In this case, one network was very small in size; the degree is four to five for each person. But if we’re talking about extended networks, it’s now a size of forty, and forty people will become one data point. So you have to have thousands of people in order to have multiple observations [INAUDIBLE].>>Okay, well thank you very much.>>Thank you.
>>[APPLAUSE]>>Okay, next. Paul, yeah. Okay, the next speaker is Annie, and I’ll give you a five-minute warning.>>Okay, thank you very much. [COUGH] Excuse me. I believe you want the microphone.>>Well, can you hear me if I’m talking this loud?>>Yeah.
>>All right, I’m gonna continue talking loud. So, my name is Nicole Sparachane. I’m also a new assistant faculty member in the School of Education, and I also hold an appointment at the MIND Institute. I’m excited to be here today to talk to you guys about latent class analysis.
Just as an outline, I’m going to give you a very brief background on latent class analysis, and primarily I’m gonna go through one study that looks at using latent class analysis in first-grade students, and then finally just a preview of latent class analysis in the ASD, or autism, population.
Okay, so to begin, what is latent class analysis? I know some of you might know this really well and others might not know it at all. It’s primarily a person-centered approach. That means that rather than thinking about
[COUGH] examining a population mean, it’s a statistical approach that looks at identifying different classes of individuals within one specific population.
So why do we do latent class analysis? It’s really ideal for heterogeneous populations. The idea is that if you have a population, you want to know whether you really do have just one single population, or whether you have multiple populations within one data set. [INAUDIBLE]
The nice thing about using latent class analysis in educational research is that classrooms are very, very diverse. There’s a lot of heterogeneity in classrooms. Students come in with varying strengths and needs, and we know that a one-size-fits-all approach to instruction doesn’t work for kids.
So teachers are often faced with this problem where they’re supposed to individualize their student instruction to meet students’ needs. Latent class analysis is really exciting because it gives us the opportunity to think about, statistically, how we can characterize patterns of student behaviors.
In the research literature, just recently, there’s been a lot of latent class analysis happening in early childhood settings, where we are doing just that: characterizing patterns of student characteristics in hopes of bettering social, emotional, and academic outcomes by improving instructional methods.
The first study that I am going to talk to you about is a recent publication. It’s called latent profile analysis of foundational learning skills among first graders.
The difference in terminology is that we refer to profiles when we have continuous data, excuse me, and we refer to classes when we have categorical data. So this really came about from
this need: we know that these school readiness skills are important in young kids who are entering school.
School readiness is a term used to conceptualize the constellation of skills, behaviors, and attitudes that students need when they enter kindergarten, and it’s said that they set the foundation for learning.
But when we move beyond school entry, there’s really not a lot of data that says whether there may be variability, or whether these skills may be important for academic success. So there’s a need to look at school readiness beyond kindergarten.
For this study, we termed these foundational learning skills, which represent the broad idea of school readiness. The three main areas that we were looking into are social and emotional development, language, and self-regulation. So, a little bit of background on each.
Social-emotional development we know is important. It’s been linked to academic achievement. And then there’s literature suggesting that students who don’t have the adaptive behaviors to cope with classroom change may exhibit problematic behaviors, like externalizing and internalizing behaviors, which the literature links to long-term negative outcomes, whether social or academic.
So social-emotional development in the classroom is important, yet it is an area that very much needs research. The same goes for language. We know that language is important in the classroom: students who have stronger language skills are typically better social communicators, and language is also related to reading achievement.
And then with self-regulation, we know that it is important at school entry, and there’s been some research that says self-regulation is important for learning outcomes.
This study looked primarily at two research aims. One, to evaluate whether there were distinct profiles that we could find within a sample of first graders. [SOUND]
And two, we were interested in whether those profiles differentially predict students’
achievement across first grade. So, the data came from a lot to most of the study on individualizing
student instruction. Students were Part of the larger study
over the first through third grade. This study primarily just
looked at the first graders. So there were 324 first graders in total, with 28 teachers in general
education classrooms across five schools; 45% were male, and they were between 5 and
7. The descriptors. In regard to developmental abilities,
the sample was fairly diverse. 14% had individualized education plans, and 36% were eligible for
free and reduced price lunch. So that’s typically used as a proxy for socio-economic status. As for measures used in the latent
profile analysis, we used the SSRS, the social skills rating scale, to
look at students’ social behaviors and problematic behaviors. It characterizes social behaviors
across three domains (cooperation, assertion, self-control) and
problematic behaviors across three as well: externalizing, internalizing,
and hyperactive behaviors. As for language, we used the DELV-S,
which is a screening tool that provides two primary scores. One looks at level of risk, so the higher the score, the more
risk that a student has for having a language delay or disorder. And then the other one looks at
how variable their language is, dialectal variation. And then finally, for self-regulation
we used the Head-Toes-Knees-Shoulders task, which is a direct,
individually administered, fun little game that measures
behavioral inhibition in students. Are you guys familiar with that at all? It’s fun to give,
it’s not really fun to take. And then for outcome measures we used
the Woodcock Johnson vocabulary and letter-word measures. So in order to identify latent profiles
within the sample of 324 first-grade students, we compared models with two profiles, three
profiles, four profiles, and five profiles. We used Mplus software, and then we
compared the model fit statistics and looked at the entropy. The idea is that for the model fit
statistics, the lower the better. For entropy, values that are closer
to one indicate clearer classification. So if you have a one, that means you
have very clear distinct profiles. Yep.
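
To make the entropy criterion concrete, here is a minimal sketch of the relative entropy statistic that is commonly reported for mixture models: one minus the total classification uncertainty, scaled by N times the log of the number of profiles. The function name `relative_entropy` and the two toy posterior matrices are assumptions made for illustration; nothing here is output from the study.

```python
import numpy as np

def relative_entropy(posteriors):
    """Relative entropy of a classification, between 0 and 1.

    posteriors: (n_students, n_profiles) array; each row holds one
    student's posterior probabilities of profile membership (rows sum to 1).
    """
    p = np.asarray(posteriors, dtype=float)
    n, k = p.shape
    # Treat 0 * log(0) as 0 so perfectly classified students add no uncertainty.
    safe_log = np.log(np.where(p > 0, p, 1.0))
    uncertainty = -np.sum(p * safe_log)
    return 1.0 - uncertainty / (n * np.log(k))

# Hypothetical extremes: every student assigned with certainty, and every
# student equally likely to belong to each of four profiles.
crisp = np.eye(4)[[0, 1, 2, 3, 0, 1]]
fuzzy = np.full((6, 4), 0.25)

print(relative_entropy(crisp))
print(relative_entropy(fuzzy))
```

The first call sits at the "one" end of the scale described in the talk, and the second sits at the other end, where the profiles do not separate the students at all.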
>>So, second session a little bit more. First of all, which aspect of the data was
longitudinal versus just having one measure?>>Excellent.
So, to go back, we’re looking at profile analysis at the beginning of the year,
just at the first measure. So the outcome measure,
which was the Woodcock Johnson, is the one that we’re looking
at at the end of first grade. So the profile analysis is just for the beginning of first grade with data
used at the beginning of the year.>>And then you’re trying
to predict the [INAUDIBLE]>>Next occasion.>>Yep, yep. Okay, so what we found is that a four-profile model,
four profiles, best represented the data. So if you look at the numbers,
you see that the model fit statistics are better than for the two- and
the three-profile models. If you look at the entropy,
it’s 0.92, which is high. The reason why four beat out five profiles
is because in the five-profile model, which is saying that this sample of 324
students can be divided into 5 subgroups, the fifth subgroup was very, very small. It was only a couple of students,
and so the recommendation was that if anything was under 1% of
the sample, then it’s not ideal. So we went with four in this case.>>So if I think of this
as say a linear regression, you could just have one linear
regression with one set of coefficients, two linear regressions with
two sets of coefficients. And in that analogy what you find
is best is to have four linear regressions with four
different sets of coefficients.>>I might have to process that. I’m not quite sure. So the graph that I want to show you
in a second will probably clarify what all this is. Pretty much this is saying,
given 324 students, I’m looking at the beginning of the year,
I have a series of variables. I’m looking at social behaviors,
cooperation, assertion, self-control, externalizing,
internalizing, hyperactivity, language,
and self-regulation. So I have this constellation of behaviors, and
I want to know if they actually cluster differently or
if it’s just one large sample.>>How exactly is this profile done? Is there a maximal I could pull
information to find out if it’s true?>>Yeah, in Mplus it’s actually
fairly simple to fit, statistically. You’re just identifying how many
classes you want to observe. You’re using maximum likelihood, and
then you’re asking for output in the end. It’s a mixture model.>>And then these labels that you’re
giving like Emergent Hyperactive, this is after looking at what sort of
people are in there and you provide the->>Right, absolutely.>>Okay.
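
The fitting procedure described in this exchange (choose a number of classes, estimate by maximum likelihood, compare fit across solutions) can be sketched roughly as follows. This is only an illustration under stated assumptions: a Gaussian mixture with class-specific diagonal variances, fit by EM on simulated data. It is not the Mplus routine used in the study, whose defaults may differ, and `log_joint`, `fit_profiles`, and the toy data are hypothetical names introduced here.

```python
import numpy as np

def log_joint(x, weights, means, variances):
    """log(weight_k) + log N(x_i | mean_k, diag(variance_k)) for every i, k."""
    diff = x[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=2)
    return np.log(weights) + log_pdf

def fit_profiles(x, k, n_iter=200, seed=0):
    """Fit a k-profile mixture by EM; return (log-likelihood, BIC, posteriors)."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    means = x[rng.choice(n, size=k, replace=False)].copy()  # start at random rows
    variances = np.tile(x.var(axis=0), (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior probability of each profile for each student.
        lj = log_joint(x, weights, means, variances)
        post = np.exp(lj - np.logaddexp.reduce(lj, axis=1, keepdims=True))
        # M-step: weighted updates of proportions, means, and variances.
        nk = post.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (post.T @ x) / nk[:, None]
        variances = np.maximum((post.T @ x**2) / nk[:, None] - means**2, 1e-6)
    lj = log_joint(x, weights, means, variances)
    log_norm = np.logaddexp.reduce(lj, axis=1, keepdims=True)
    loglik = float(log_norm.sum())
    post = np.exp(lj - log_norm)
    n_params = (k - 1) + 2 * k * d  # proportions + means + variances
    bic = -2.0 * loglik + n_params * np.log(n)
    return loglik, bic, post

# Simulated data with two well-separated subgroups (not the study's data).
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(-2.0, 0.5, size=(150, 2)),
                  rng.normal(2.0, 0.5, size=(150, 2))])

bic_by_k = {k: fit_profiles(data, k)[1] for k in (1, 2, 3)}
for k, bic in bic_by_k.items():
    print(k, round(bic, 1))
```

Lower BIC is better, matching the "lower the better" reading of the fit statistics. The returned posteriors are the membership probabilities that one would inspect before naming the profiles.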
>>Again, this, I probably should have shown you first,
but this is where it makes sense. It started to make sense for me. So what we found is with our 324
students we didn’t have one population, we actually had four. Four different subgroups, or
profiles, of students, and they were characterized differently. So the majority of the students,
179, we call the GGS; we named them generally good students. These are the students that
teachers want in their classrooms. So the measures have been z-scored, so we’re looking at everything
compared to the sample mean. So slightly above typical social
behaviors, overall very well behaved. The group of kids that I am most intrigued
by is this one, the externalizing profile. These are the kids that I have been, as a clinician, called into classrooms for
years and years. These are our guys that are showing
a lot of externalizing behavior in the classroom, they’re
exhibiting hyperactive behaviors and they’re not socially, at onset but
they’re not cooperative. They don’t have the self
regulation skills. And then the two groups that
are also very interesting, this emergent hyperactive,
which is the second largest group, they were exhibiting some
hyperactive and externalizing behavior and then some areas
where we can sense special behaviors. And then finally this is the internalizing
profile, where the students are exhibiting more internalizing behavior
than any of the other profiles, and less assertive behavior than
any of the other profiles. And interestingly, in this profile,
they were at a higher risk for having a language delay or disorder. And their self-regulation skills were
also lower than the rest of the group. And so that’s it for that. And then looking back, we
also looked at the probability of membership within the analysis. We looked to see: do these students
belong in these profiles? What’s the probability? And
they were all very high. So, for group one, there is
a 0.937 probability that these students belonged in the emergent hyperactive group, and
a very low probability that they belong in any other group, and
you can see that across each of the four profiles. A few descriptives that are pretty
interesting: in the emergent hyperactive group, where the students had some
indications of social and emotional risks, 50% were male, 13% had IEPs, 33% were eligible for
a free or reduced price lunch. And then if you jump down here to that
externalizing group, where the students had a lot of externalizing behaviors: mainly male, which is consistent
with the research literature. But this is interesting to me:
zero percent of students in that group were identified, so they did not
have an individualized education plan. 50% were eligible for
free and reduced price lunch. Generally good kids: higher percentage
of female, some students had IEPs, 30% were eligible for
free and reduced price lunch. And then finally,
in this last group where we saw across all that we looked at,
59% male, 11% had IEPs, and they had the highest percentage of
students that had free and reduced price lunch. So then we next looked at the growth
across the first grade years. We used latent class growth analysis,
and the idea was just to see if these profiles could differentially
predict their growth. We controlled for gender and SES and, again,
used Mplus software. And so,
interestingly, that externalizing group, compared to the other ones, grew the most in letter-word decoding. So
they showed the most growth, significantly more than
all the other profiles, when it comes to letter-word decoding. And then looking at their vocabulary,
both the externalizing group and the generally good kids or generally good
students group showed more growth in vocabulary over the course of
first grade than the other groups. The hyperactive and the internalizing
groups grew at a less rapid rate than the other profiles. So in conclusion to this study, we found
that foundational learning skills do cluster differently among students
at the beginning of first grade, and that these profiles did
differentially predict their developing literacy
competencies across the year. This is exciting because this idea
provides some supporting evidence that the profile-
based approach might be helpful in identifying specific students
that might be at risk, beyond what they’re doing academically. And then for some future directions
with this we really want to understand. I was excited about this study but
it’s really just the beginning. It opens up this door for
a lot of questions. So really understanding
the individual components and how they’re relating to one
another, whether there are any of the skills that are more
important than the others, or if it really is a kind of
pattern within these students. And then looking at the interaction
with what’s happening in the classroom environment. So we have teacher data as well, so
we can see the interaction
between instruction and how profiles change and grow in time. And then finally to be able to use
latent class analysis across different populations which goes right into the next
couple slides I’m gonna show you. So I recently had the opportunity to
use latent profile analysis looking at Students with Autism Spectrum Disorder, I
don’t know how familiar you guys are with this population but
it is a wildly heterogeneous population. So across multiple developmental
domains, students vary. So as part of this larger project, it’s called the Classroom SCERTS
Intervention Project out of Florida State, we looked at 196 students with autism,
which, for students with autism, is a large sample. And they were in kindergarten
through second grade. We used observational data. So we created this measure,
the classroom measure of active engagement, that evaluated specific learning challenges that we
see in students with autism. So, the examples here are looking at
how productive they are during the day. What we had hoped for in this observational data is: are they
doing anything that is productive? How many times they’re
looking up at their peers or they’re looking up at their teachers,
that was tracked. How well they’re responding. So not really with comprehension, but just are they responding when they
are given a verbal direction. And then their level of communication, whether they’re directing communication or
they’re generating some kind of language. And then again their level
of flexible behavior. So we coded these behaviors
within samples of video-recorded observation. And again, just as an example to show you
how cool this can be with heterogeneous samples, we found that a four-profile model best
represented the sample as well. So yeah, just briefly,
the majority of the sample, 126 out of 196 of the students, were
not very actively engaged at all. And then 23 out of 196 students showed a lot of eye gaze, so they weren’t
doing much else but they were looking. So looking, but not really doing
much else in the classroom. Pretty interesting: another profile was characterized by
relatively more generative language than any other active
engagement variable. And then finally, profile four,
of 13 students, showed fairly good active engagement compared to
the rest of the profiles overall. So this is exciting because, well, actually I will go
into that in a second. Again, the four-profile model
here showed best fit overall, and the probability was
very strong throughout. Entropy is very high. So again, this is very exciting because
this is where my passion comes into play, working with and
studying students with autism. In the classroom context, and
if we can start to identify these specific observable behaviors that
students are presenting in the classroom we can not only categorize these patterns,
but we can think about how to match instructional opportunities to
maximize that active engagement. And I’m really interested in seeing in
this population too how we can change active engagement [COUGH] And
whether students [COUGH] excuse me, whether students are actually
staying within those profiles or if that’s something that’s more
fluid as part of an intervention. And that’s it.>>Thank you.
>>[APPLAUSE]>>Okay, so we do have time for a few questions. Any? Yeah.>>Yeah, I wondered if you
considered how students fall into categories, if you think that’s based on their
personality characteristics, or do you think there’s some kind of
social factors that are putting them. Boys are more likely to be in one group,
Students proposals and assisted [INAUDIBLE] and other groups.>>So there is some literature that is
thinking about the typical population and
looking at socioeconomic status and how that’s actually predicting
certain things, right. And that’s kind of a very
common finding. And then with latent classes, being male
is something that has been predicting group in the past. So, yes, I absolutely think that there are
specific characteristics that are placing kids into groups. And there is some literature
that is looking at that. For the first study I
did not look into that. For the students with autism, we have,
the dataset is just phenomenal. We have so many measures that
are characterizing the sample. So we have autism symptomatology, we have
intelligence, we have, I mean there’s just so many measures that characterize
the sample, so it would be very fun to look at what is predicting membership and
if membership is stable over time. If it is something like intelligence, there’s
been recent literature that says even in students with autism we
can change intelligence; it’s just a score that can actually build and
can actually change as students grow. So I think that it opens the door for
some nice questions.>>[INAUDIBLE]
>>Yeah, you go ahead.>>Yeah, [COUGH] do you have
any idea how sensitive this is, the classification, with regard to the measures? If you were to take out
some of the variables, to include some others, would you
have a consistent pattern? And if so, great; if not, I think
it would be crucial to have a key theoretical notion, so then you
would want to use it for justification?>>Yeah, absolutely. So, two comments. This whole idea came from
practicing in the field and seeing those guys in the externalizing
profile and really conceptualizing and thinking about, as I’m practicing,
what is important for learning. What are these foundational learning skills? So the tools that we used, again, are just
a beginning. One of our follow-up questions was to see how
we can replicate it with different measures, using
observational measures. Are these the key components, or is it being biased by a teacher
rating measure, that kind of thing. So again this is the same, but it also is the very beginning so
our plans are formidable. But it provides some support
that these profiles can be just as important to
provide in the future. [SOUND]
>>You signed one of the identified students in profiles, but when you did the
prediction. How does it explain
things I’ve been taught? I thought that particular part
of it would be well suited to machine learning methods. So in particular, regression trees,
right? You’re looking at outliers. The regression tree will say, the first variable, if I split it, is
what gives the most predictive power. It might be, say, hyperactivity. I may know these methods very broadly, but
I thought, separate from your studies, that it might be interesting to
apply it to your data, regression.>>That’s where collaboration comes in.>>Any other questions? Okay thank you very much.>>[APPLAUSE]>>Now the problem is. [INAUDIBLE] Okay and do you want a mike or
are you good?>>No, I think I’m fine.
Can you hand me that? Yeah.
So, I want to talk today about how we measure psychological constructs. And yes, I focus on psychological
constructs because that’s my context. I do think that what I will talk about
is more generally applicable perhaps not to economists but
certainly to many social sciences. So I think what I’m gonna argue is
that we are steeped in the classical test theory assumption that every variable
you measure contains the thing that we’re trying to get at, the construct we intended to
measure, as well as some amount of error. And what I want to argue
is that I think for a lot of the constructs we’re interested
in, that’s not the relationship that we actually have, that we don’t have
a bunch of variables that all look like one construct plus error, and
that making that mistake has led us to some models that perhaps don’t represent
what we want them to represent. So consider measurement error, so this is classical [INAUDIBLE] theory, the
idea being that observed x and observed y are, by assumption, the true score x that we
want to measure plus some measurement error, and the true score y that we want
to measure plus the measurement error. And we can go a little bit more
expansive in the model and we can add some specific reliable
error that is not related, or, not error: specific, reliable variance
that’s not related to the construct. But we tend to accept and take for granted that every measure contains t and
it contains e. And the problem with measurement
error that we’re all very sensitive to is that it gives us things like
biased regression coefficients, right? If we want to estimate the relationship
between construct x and construct y, and we only have observed variables
that are measured with error, then this regression coefficient of y on
x is gonna be biased, because it’s not an unbiased estimate of
the regression of true score y on true score x. And we know exactly to what extent that
bias occurs, right? So
this observed regression coefficient is a function of the true regression
coefficient and the reliability of x. So the more error that is present in x,
the less our observed regression coefficient corresponds to
the true-score regression coefficient. And this becomes even more of a problem
as our models get more complicated. So this is a simple regression But
if you look at something like a PATH analysis where you have one variable
predicting some other variables, that those ones can in turn
predict some other variables. You can have a model that
has many variables in it. Those various paths will
be affected by measurement error in very difficult to determine ways. And so for example, a regression
path that connects two variables might be affected by the measurement error
in some other variable in the model. So we can get overestimated and
underestimated regression coefficients, and this can be a real problem. And so the recommendation that’s often
given by methodologists in psychology, and I think other fields, is that we should get
away from observed variable models, and start relying more on
latent variable models. So here’s what a latent
variable model looks like, your typical reflective latent variable
model, and it relies on this assumption that we can have multiple indicators of
every construct that we wanna measure. So if x is some construct that you
wanna measure, but you have a scale and you have three items on that scale, then
all you need is the assumption that all three of those items contain
the same true score. So x1 and x2 and
x3 contain exactly the same true score and they contain different
unique measurement error. y1, y2, and
y3 contain true score y and some other measurement error. And if we do that, this model allows us to disattenuate the
regression coefficient b here, right? So the idea is that what these variables
have in common is the true score. That is the construct
you’re trying to measure. And if you can make that assumption,
then it means that your latent variables represent exactly what
you want them to represent. They are your true construct,
disattenuated and measured without error. And so these models are so convenient
that we start relying on them a lot. We think all we need is several
indicators of my construct, and then I can get this model that lets me assess the construct that I want to
assess without measurement error. So I want to focus, or I wanna just point
out a few implications of the latent variable model that I think we don’t
often spend enough time thinking about. So the first is that all items
have to share the same true score. So for example, if they capture
diverse facets of the construct, those won’t be represented. And so if you look at how we actually
construct scales in psychology, what people often do is they
actually pick items to try to get at different aspects of a construct,
right? So they understand that many
psychological constructs are complicated, and that they are multifaceted. They’re not unidimensional. And so we aim to pick different
items that actually assess all corners of a construct. And as soon as we do that, those unique
facets of the construct are not represented in the latent variable, because
they’re not shared among all items, right? So we’ve violated the assumptions
that allow us to use this model. A second implication is that all pairs of
observed variables are expected to be uncorrelated after conditioning on the
latent variable. Right, so the idea is that if you wanna make a change in x1 or
x2, you work through the common factor, the latent variable. Right, so you
can change the latent variable and in so doing you will change the predicted
scores on all of the observed variables. If you change the construct
you expect everything on the scale to go up or go down. And if you change just one item on the
scale, if you affect someone’s behavior or someone’s score on one item of the scale, you don’t expect other items on
the scale to change accordingly, because there’s no direct
link between them. Whereas I think in reality, for a lot of different constructs,
there are direct effects that we expect. So if you were to change one part
of a construct, one item that you’re assessing, you could actually
expect to see changes in other things, right. So again, I think this model is
violated more often than we expect. So I just wanna mention that there
are alternative possibilities. So I wanna talk about three possible
alternative construct models, and these are not the only ones,
but here are some. So one is a causal indicator model. So it could be the case
that your latent variable, your construct that you’re interested
in is not the cause of the indicators that you see, but
in fact, it’s the result. Right?
And so for example, socioeconomic status is
a really good example. We often measure socioeconomic
status using the indicators income, education, and occupational prestige. And in this case, what you actually expect is a compensatory
relation among these things, right? You can increase income and socioeconomic
status will increase accordingly. Or you can increase education, and you would also expect
socioeconomic status to increase. These things don’t need to
increase in tandem, right? Even though income and
education are correlated, it’s not the case that you need to,
that increasing socioeconomic status requires increasing all three
of these indicators, right? And it’s also much more intuitive
to think of socioeconomic status as a result of
changes in these indicators, rather than a cause of
changing [INAUDIBLE]. A second possible alternative
construct model is a MIMIC model. That stands for multiple indicator and
multiple causes, which is just a combination of a reflective model and
the causal model. So the idea here is that
you have some indicators. That are better thought of as
causes of the latent variable and others that are better thought
of as reflections of it. So, healthy lifestyle is an example here, where there are multiple behaviors you can
assess that lead to a healthy lifestyle. Things like exercise,
vegetable intake, and hours of sleep. These are all things that if you
change them they will lead you to be more healthy, and not the other way around. On the other side there are indicators
that you could conceive of that are dependent on healthy lifestyle, like ratings for example. So if you ask someone to rate, how healthy
are you or how healthy is your partner. Or ask someones peer to rate how healthy
they are, those are reflections, presumably, of someones health. They’re not causes of their health,
they’re reflections of them. So there are two different
kinds of indicators here, and sometimes we have access to
these multiple indicators. A final type of analysis
is a network model, this is sort of the most radically
different kind of analysis, because now the construct we’re interested in is
not even pictured in the model, right. It’s not a unique variable. But according to the network perspective, the idea is that it’s
an emergent property. So here’s a network
analysis of depression. And the idea here is that the symptoms
of depression have direct effects on each other. They have feedback loops, and
that changing one symptom in a depression network will lead to other
symptoms being different. So these things are not the result of
some underlying depression construct that leads to changes in all of these symptoms
simultaneously and independently, but instead they affect each other. And depression, the construct, arises
from this System of interacting symptoms. Okay so you might be wondering whether
this is actually a problem to what extent there are constructs that we’re
interested in that really fit these other alternative constructs. I don’t have very much
time to talk about it now. I did, just to answer this question for
myself, look at the most recent issues of
six psychology journals, the biggest ones,
the most recent issue of each. I just found the first article in each
that included a structural equation model with a latent variable, and
I looked at what sorts of variables were used. And it seemed to me that,
first of all, in every case there was no justification given for the model. So we tend to take for granted that if we
have some variables, we can throw them into a model and not have to justify that that’s
actually a reasonable thing to do. So at least in psychology, and maybe this
is different in other social sciences, we take for granted that any set of
variables that are related to the same construct can be put into a reflective
latent variable model, and apparently reviewers don’t have
a problem with this, right? Secondly, in at least four of the six
cases, maybe five of the six cases that I read, it seemed to me that the direction
of the relation from the observed variables to the construct was
questionable at best. So, authors were saying things
like, we used a latent variable to represent the sum of these
constructs, essentially, right? So, they have a couple of different
constructs, they would talk about them as an aggregate construct, and then say, we
wanted to use latent variable analysis to represent the combination
of these things, right. So people tend to talk about latent
variables as a combination and if you looked at the indicators and the
latent variables that they were talking about it often seemed much more
plausible, from a lay perspective anyway, from the perspective of someone who is
not substantively in any of these fields, that a reflective model was not obviously
the best model for, like I said, four or five out of the six of
these models that I saw. And so I think this is actually
a problem worth taking seriously, because at the very least, we should justify to ourselves the models
that we use before employing them. Okay, so what I wanna know is
what are the consequences. And so I basically designed
a simulation study just to look at what happens if we have data that are generated
by some alternative measurement model, and fitted to one of these. Okay. So in particular, what
bias that incurs. So, my models are really simple ones. I’m just having a mediation
model where X predicts M, M predicts Y, and M is the mediator. I’m going to generate data for that mediator
according to two different models. Okay. Study one. The model that I’m using is the model here. So as an example, suppose you have
a mediation hypothesis where parent permissiveness leads teens to have
more exposure to sexual media content, which in turn leads to higher
rates of teen sexual activity. All right, exposure to sexual media content is an
example of a formative construct, right? So you could have indicators like
teens’ exposure to television content, Internet content,
magazines and books, right? So four different kinds of media content, which summed up together give you their
exposure to media content in total. If that’s the model, then what it means is
that any effect of parental permissiveness on exposure is itself
mediated through these paths. So it’s mediated by these components. And so
this is the model that I generated from. And in particular, what I did is, so these are the paths labeled a1 to
a4, I randomly drew 100,000 sets of values of a1 to a4 just from
a uniform distribution, and then I solved basically for
gamma and the covariances among these causal indicators, so that the
total effect of x on m was constant at 0.6. So I was able to have a population
value and see, in all of the different conditions, what bias there was
from that population value. For each set, I’m looking at
the population covariance matrix, so I’m not actually
generating any sample data. I’m just looking at, given this
range of different population values, what bias is incurred
at the level of the population. And for each covariance matrix
we fit these two models. So the composite model, where m is
just a sum of the four indicators, and the reflective model, where m is
modeled as a reflective latent variable. Neither of these is the true
model in this case, because m over here also has
some residual variance, and so the construct is not just
the pure sum of those indicators, but is the sum of those
indicators plus a little bit. Okay, so, to read these results:
going from left to right, this is bias in the a path, bias in the b path,
and bias in the c path. So horizontal black lines
reflect the population value. Along the x axis is correlation
among the causal indicators. And as you might imagine, as
the correlation among these indicators gets higher, it gets more and more appropriate to use
a reflective latent variable model, because the latent variable then
captures what’s affecting them. So as they share more variance, that latent variable can actually
capture more and more of the construct. So what we see as we move from left to
right is that the latent variable model, whose results are these
blue to red swarms of dots, gets a little bit more accurate. Along the y axis, this is just the estimate. So the green lines reflect
the composite model. And what we see is that
in the composite model, there’s a particular degree of
overestimate here, which again just comes because m is not just
the sum of those four items. And we’ve got a little bit of
an underestimate here, but the total x to y path is perfectly
reflected in the composite model, and so we have a perfect
estimate of the indirect effect, this is a times b. The color gradient from red to blue here reflects the amount
of variation in these a paths. And so you can think about it a little
bit like differential item functioning. So if x has a huge effect on m one,
and no effect on m four, a latent variable model is going
to have a really difficult time representing that different relationship, because according to the latent variable
model, x predicts the latent variable and the latent variable predicts
changes in the indicators. So if x has different effects on
different indicators, then the latent variable model just won’t be able to
reflect those. So what we see is that the bias increases to the extent that differential
item functioning gets bigger, okay? In general, what we’re seeing here is that
there’s a lot more bias in the reflective latent variable model than there
is in the composite model. I also looked at model fit, and
this is of the latent variable model; the composite model fit perfectly. In terms of model fit, on the y axis we have just the population RMSEA, where zero means perfect fit and higher is worse. And what we see here is that there’s actually no correlation in the latent variable model between the degree of fit and the degree of bias. And so in general we’re getting a lot of poorly fitting models, but those poorly fitting models don’t necessarily correspond to the most biased results. Okay, just as a follow-up, so
in that previous study I had assumed that the indicators m1 to m4 were measured without error themselves, and that is a pretty strong assumption, right? Most of our indicators of anything, at least in psychology, are measured with error. So now I’ve just done a variation of this model where those indicators themselves have reliability .8, so each one of those indicators is a causal indicator but is also affected by measurement error. All right, the true model in this case cannot be identified, so we couldn’t fit the true model even if we knew what it was. But I’m again looking at just the composite model [SOUND]
And so what we see is that when those indicators have error variance, where they have less than perfect reliability, the green line reflecting the composite model now bends a little bit, right? So it has some additional sort of. The reflective model looks more. Okay, in terms of fit, we now have a green line that reflects the fit of the composite model. We no longer have a perfect fit, although it looks like a very good fit. So it’s very difficult to tell that the composite model is not the right model for the data. The fit of the reflective model
Taking a different example. So this is a MIMIC model, which corresponds to the Multiple Indicators, Multiple Causes example that I showed before. So now I’ve got two causal indicators and two reflective indicators in the generating model. And again I randomly drew values of these two causal paths and I randomly drew values of these two factor loadings. And we’re gonna look at all of those results. Again, fitting the same two models, and again, neither of these is the true model. Okay, so now in the top row, we have the results
from the composite model, the sum score, and in the bottom row we have the results from the latent variable model. Along the x axis now is the covariance between those two causal indicators, so, going back, just the covariance between m1 and m2. And the color gradient now shows the mean of these two factor loadings. So the color gradient is actually not very helpful in this respect. But what we see overall is that the causal model, sorry, the composite model, those results show some bias, but there is very little variation in what kind of results you get out of the composite model. There’s bias, but it’s very little bias. In terms of variation, there’s much more variability in the results from the reflective latent variable model. So they’re much more affected by the strength of the causal paths and of the latent variable loadings.
In both cases, the overall estimate of the indirect effect, so the a path times the b path, is roughly unbiased. So actually, in this and many of these cases, using the latent variable model actually seems to matter less than you might think it would, right, even though it’s patently the wrong model for the data. So is the composite model. In most cases, we don’t get a very biased estimate of the entire indirect effect. This is model fit. We get much worse fit for the latent variable model than we do for the composite model. Okay, one final example is
a network-type analysis. So I have a really simple network here, where x predicts m1, m1 predicts m2 and m3, and m2 and m3 together predict m4. Now, I’m just going to be looking at the indirect effect of x on y, because there’s no way to separate what the a path and the b path are when there’s no one variable there that represents the [INAUDIBLE] effects. Here it is. And again, we actually find that both of these models, although both result in biased parameter estimates, neither results in a very dramatic amount of bias, right? So both models result in an indirect path estimate that’s a little bit biased, a bit of an overestimate, but not by much. So the results are actually less dramatic than I expected. And it could be the particular model that I’ve chosen, or it could be that latent variables, even though conceptually and theoretically they’re the wrong model in a lot of cases, actually somehow manage to capture enough of the variance in the construct that they manage to work as a stand-in. So I wanna conclude and say that the composite
model showed equal or less bias in each of these cases
than the reflective latent model. So it was never the case that the
reflective model produced a better result. And obviously, I’m only considering
cases where the reflective model is not the right model, right? When the reflective model is the right model, there’s plenty of evidence suggesting that if you form a composite, if you form a scale and use that variable when the true model is a latent variable, you will get very biased estimates. But I guess what I wanna say is that when the true model is clearly not a reflective latent variable model, you might be better off choosing to use a composite rather than a reflective latent variable. Whether or not the model is appropriate is, of course, a difficult question, but I think it can often be answered by theory, at least to the extent that you want to know: is this a reflective latent variable, or is it maybe something else? Determining what the actual correct relationship among the indicators is can often be very helpful. That’s it, thank you.
>>[APPLAUSE]
>>Okay, so we’ve got enough time for a couple of questions. Okay, so I have a question, which was, I didn’t quite see what the composite model was.
>>I’m sorry.
>>I mean, it didn’t register.
>>That’s a very important bit, I’m sorry. So, the composite model, if we can show it. The composite model is just the sum score, so m is just the sum of the four indicators. So this is m1 to m4. The composite model just sums them up. And so the question here is, I’m looking in each case at two options: either you can just sum these variables up and use the observed score, or you can model it as latent.
>>So the reason I ask this is that, I think in economics, what we would do in that setting is that we’d go with the first principal component, right, or maybe the second. Essentially, that’s what your composite score is doing, right? It’s summing them up?
>>Yeah, it’s summing them up. I mean, your first principal component is not just the sum score.
>>Yeah, it’s a weighted score.
>>It’s a weighted sum, it’s the weighted sum that captures the most variance, right? So the question then is, is the weighting correct?
>>Yes.
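The difference between the two scores being discussed can be sketched numerically. This is a minimal illustration on made-up data, not anything from the talk: the indicator strengths and sample size are hypothetical, and the sum score is rescaled to unit-length weights only so that its variance is comparable with that of the first principal component.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical data: four indicators sharing one common source with unequal strength.
common = rng.standard_normal(n)
strengths = np.array([0.9, 0.7, 0.4, 0.2])   # illustrative values only
items = common[:, None] * strengths + rng.standard_normal((n, 4))

cov = np.cov(items, rowvar=False)

# Sum score: every item gets the same weight (scaled to unit length for comparison).
unit_w = np.ones(4) / np.sqrt(4)

# First principal component: eigenvector of the covariance matrix
# that belongs to the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns eigenvalues in ascending order
pc1_w = eigvecs[:, -1]
if pc1_w.sum() < 0:                          # eigenvector sign is arbitrary; make it positive
    pc1_w = -pc1_w

var_sum = unit_w @ cov @ unit_w              # variance of the unit-weighted composite
var_pc1 = pc1_w @ cov @ pc1_w                # variance of the first principal component

print("unit weights:", np.round(unit_w, 2))
print("PC1 weights: ", np.round(pc1_w, 2))
print(f"variance captured: sum score {var_sum:.3f}, PC1 {var_pc1:.3f}")
```

The first principal component weights the strongly related items more heavily and, by construction, captures at least as much variance as any other unit-length weighting, including the equal weights of the sum score. Whether those variance-maximizing weights are the right ones for the construct is the open question raised in the exchange.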
>>If the weighting is gonna be
>>Based on the-
>>Yeah, yeah.
>>Now I wanna add that condition in.
>>Your composite is a unity weight.
>>Easier if you just let it be data determined in your simulations.
>>Okay, so you-
>>And so at first it’s awkward.
>>Very controversial because of The Bell Curve, all right, which looked at, and it just used the single principal component, all right? And used that to predict outcomes. It’s a book by Murray-
>>Murray-
>>Herrnstein, right.
>>So, at least in certain situations that’s not going to work, right? So if, for example, these variables are not correlated at all, your principal component is just gonna be one of the variables, right? So the principal component is based on the concept of shared variance, and so if they don’t share variance, then your principal component analysis is not gonna give you something useful.
>>So I wasn’t guaranteeing that it was gonna work, but I just thought that we are talking about different approaches today. And in economics, generally people wouldn’t use your methodology, though; I think they would use principal components.
>>Do economists ever use latent [INAUDIBLE]
>>In certain settings, yes. For instance, if you had a multinomial choice, so you’re choosing over ten categories and the errors are correlated, then you have all these errors and the big covariance matrix. It’s just too hard to deal with, so instead you just assume a latent structure for that error, just to reduce the dimensionality of it. So, many examples. It’s used in certain settings, but it’s not-
>>In certain settings, but in a purely pragmatic way, with no meaning attached to latent variables, you might say.
>>Yes.
>>Okay.
>>Yeah, I think that’s [INAUDIBLE]
>>I’ll think about it during the project. Okay, any other questions?
>>I have a very basic question. So, thinking back to the beginning slides, just conceptualizing and thinking about a latent variable: I remember when I took my first class in SEM, the instructor said that with a true latent variable you won’t see a strong correlation between indicators, because every one will have its unique variance that it’s contributing to that latent variable. Does that make sense? No? So this has been stuck in my head for years, and then I was going through reading the literature, and this is not something you see over and over again. So I thought, because you were talking about high correlations between the indicators. So I guess I have two questions. One, I just wanted to kinda throw that out there, because that’s been in my head for years, that with a true latent variable you kinda don’t see those correlations between the indicators, because each one is contributing something unique. Not necessarily two latent.
>>Right.
>>Two related latent variables. And honestly, those were his words, a true latent variable. So that was one. So you’re saying that that’s not an accurate thing I should hold onto.
>>No, I would say not.
>>Okay.
Is there any truth in that?
>>I don’t know, it sounds like, so the latent variable, and you can look at the math and what it’s doing, it represents the shared variance. So in the case where you have no shared variance, you cannot extract it, right? And if you look at, like, a factor loading there, typically in [INAUDIBLE] analysis you want high factor loadings. The factor loading represents basically the square root of the covariance between each item and the other [INAUDIBLE]. And so, if you have low correlations among the items, you’re going to have very low factor loadings, which is an indication that your latent variable model is not right, right? So yeah, I would say that’s kind of the opposite, right?
>>And yeah, that’s what it is.
>>Well and.
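The link between loadings and inter-item correlations described in this answer can be checked with a few lines of arithmetic. The sketch assumes a one-factor model with standardized items, in which the implied correlation between two items is the product of their loadings; the loading values are hypothetical.

```python
import numpy as np


def implied_correlations(loadings):
    """Correlation matrix implied by a one-factor model with standardized items.

    Off-diagonal entries are products of loadings; the diagonal is 1 because
    each item's unique variance is 1 - loading**2.
    """
    lam = np.asarray(loadings, dtype=float)
    corr = np.outer(lam, lam)
    np.fill_diagonal(corr, 1.0)
    return corr


high = implied_correlations([0.8, 0.8, 0.8, 0.8])   # hypothetical strong indicators
low = implied_correlations([0.2, 0.2, 0.2, 0.2])    # hypothetical weak indicators

print("loadings 0.8 -> inter-item correlation", round(high[0, 1], 2))
print("loadings 0.2 -> inter-item correlation", round(low[0, 1], 2))
```

With equal loadings, the loading is the square root of the inter-item correlation, so indicators that barely correlate can only have low loadings on a common factor.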
>>Exactly the opposite, yeah.
>>What is too high, then? Because I have also seen, again in the literature, really high correlations between indicators, to the point where it’s like, is there ever a too-high correlation between two indicators?
>>I mean, I think that a too-high correlation would probably indicate that you’re just repeating the same item over and over again.
>>Exactly.
>>In terms of the reliability, in terms of the statistical model, it’s great, cuz you’re getting one unidimensional thing, and you are getting at it with a high amount of reliability, cuz you have items that are themselves highly reliable because they’re highly correlated with each other. But I think there might be a pragmatic problem in there when it sounds like you’re asking people 12 variations of the same item. It’s what they call bloated specifics, and the idea is that your factor is probably something very, very, very specific. Which, if you’re a statistician, is a good thing. And if you’re a psychologist, it’s probably not a good thing.
>>Okay, so just to follow up on that question about latent variables. In economics we use latent variables as the y variable, right? So it’s generally not this stuff along the way but the outcome. And so one example would be with the choice between two alternatives. You’d have this latent utility for each, and then if one utility is greater than the other then we’ll [INAUDIBLE]. The second example would be looking at the expenditures on cars, and there is this latent desire to purchase a car. And if you cross the threshold, you’ll actually [INAUDIBLE] into zeros. We’re not using things very much quite factorial [INAUDIBLE]
>>Okay. Okay.
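The two uses of a latent outcome mentioned here can be sketched as follows. This is an illustration under assumptions, since parts of the remark are inaudible: the binary choice is taken to be "pick the alternative with the higher latent utility", the expenditure example is taken to be a censored outcome that is recorded as zero when the latent desire does not exceed a threshold, and all numeric values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Binary choice: a latent utility for each of two alternatives (never observed).
utility_a = 0.3 + rng.standard_normal(n)       # 0.3 is an illustrative mean advantage
utility_b = rng.standard_normal(n)
choice = (utility_a > utility_b).astype(int)   # only the chosen alternative is observed

# Censored outcome: latent desire to spend; observed spending cannot go below zero.
latent_desire = -0.5 + rng.standard_normal(n)  # illustrative location
threshold = 0.0
expenditure = np.where(latent_desire > threshold, latent_desire, 0.0)

print("share choosing A:", choice.mean())
print("share with zero expenditure:", (expenditure == 0).mean())
```

In both cases the latent variable sits on the outcome side of the model and only a coarsened version of it (a choice, or a spending amount piled up at zero) reaches the data.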
>>Okay. Any other questions? Okay, last question.
>>[INAUDIBLE] I learned a lot from your presentation. So I’m more interested in the actual model than when you [INAUDIBLE] to the process and relationships [INAUDIBLE]. So what’s [INAUDIBLE] Here is the model that we want to have. So [INAUDIBLE]. So what determined the decision that we wanted to use this network model?
>>So I used this network model because I wanted to do something with four variables that was just kind of like a network, and that I could stick in a path analysis. So this is just [INAUDIBLE]. With the example from depression, you could fit a network model there in various different ways. You can fit either a directed or an undirected network, and there are pieces of software that can do that for you, whether that’s cross-sectional data or longitudinal data, that will basically estimate what are the most likely relations among the data. And you can get programs that will estimate directions among those variables as well. It’s a little bit fraught, right, if you have cross-sectional data, to have a program that comes and tells you this predicts this and this predicts this. Well, a lot of those arrows will be going in the wrong direction.
>>So you wanna know how to estimate them or you just?
>>I mean, I guess, so you could also have a theory that predicts a particular network. And that would just be a confirmatory kind of analysis, right? If you think, I have this construct that predicts this other one that predicts this other one. So you could test in a confirmatory way a particular network structure if you had a theory. Generally, what’s more likely to be done is an exploratory analysis: if you have some data, then some software will estimate a particular network structure for the data.
>>Okay, well, thank you very much for a great session. [APPLAUSE] And we’ll start at 11 on the dot.