This post assumes that you have installed and are able to load the ggplot2 package, that you have been able to download the ML subject scores database and can read it in to have it available as a dataframe in the workspace, and that you have already tried out some plotting using ggplot2.

In this post, we will be:

1. examining the relationship between pairs of variables using scatterplots
2. modifying the appearance of the scatterplots for presentation.

While doing these things – mostly achieving the presentation of relationships between variables – we should also consider what statistical insights the plots teach us.

1. Variation in the values of subject scores

We know that the participants tested in the ML study of reading varied on measures of gender, age, reading skill (we used the TOWRE test of ability, Torgesen, Wagner & Rashotte, 1999) and print exposure, a proxy for reading history (measured using the ART, Masterson & Hayes, 2007; Stanovich & West, 1989). What does the observation that participants varied mean?

In previous posts, we have seen how to use R to get a sense of the average for and spread of values on these measures for our sample, and we have seen how to show the distribution of values using histograms:

- using the function call describe(subjects) to get the mean, standard deviation etc.
- using the ggplot code discussed previously to get some histograms.

What these numbers and these histograms show us is that participants in the sample varied broadly in age (from 20 years to 60+ years), though most of the middle-aged people in the sample were male. They also show that the measures of reading ability – TOWRE word and nonword naming accuracy – tended to vary around the top end of the range. Most people tested got at least three quarters of the items in each test correct, and most of them did much better than that.

2. How do the variables relate to each other?

Let’s suppose that we are concerned about the relationship between two variables. How is variation in the values of one variable associated with change (or not) in the values of the other variable?

I might expect to see that the ability to read aloud words correctly actually increases as:

1. people get older – increased word naming accuracy with increased age
2. people read more – increased word naming accuracy with increased reading history (ART score)
3. people show increased ability to read aloud made-up words (pseudowords) or nonwords – increased word naming accuracy with increased nonword naming accuracy.

We can examine these relationships and, in this post, we will work on doing so using scatterplots, starting with a scatterplot of word reading accuracy against nonword reading accuracy.

What does the scatterplot tell us, and how can it be made?

- if people are more accurate on nonword reading they are also often more accurate on word reading
- at least one person is very bad on nonword reading but OK on word reading
- most of the data we have for this group of ostensibly typically developing readers is up in the good nonword readers and good word readers zone.

I imposed a line of best fit indicating the relationship between nonword and word reading estimated using linear regression, the line in blue, as well as a shaded band to show the uncertainty – the confidence interval – over the relationship. You can see that the uncertainty increases as we go towards low accuracy scores on either measure.

We will return to these key concepts in other posts.

For now, we will stick to showing the relationships between all the variables in our sample, modifying plots over a series of steps as follows:

Let’s draw a scatterplot using geom_point():

ggplot(subjects, aes(x = Age, y = TOWRE_wordacc)) + geom_point()

And you should see the scatterplot in the RStudio Plots window.

How does ability vary with age here? There might be no relationship between age and reading ability. Maybe the few older people, in their 60s, pull the relationship towards the negative i.e. …

Maybe I’m curious if gender has a role here. Is the potential age effect actually a gender effect? Get the scatterplot to show male and female observations in different colours using the following line of code:

ggplot(subjects, aes(x = Age, y = TOWRE_wordacc, colour = Gender)) + geom_point()

All I did was to add colour = Gender to the specification of what data variables must be mapped to what aesthetic features: the x position for a point is mapped to age, the y position to reading ability, and now the colour of the point is mapped to gender. Males and females in the sample are now distinguished by colour.
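The nonword-versus-word scatterplot with a line of best fit, described earlier in this post, could be drawn along the following lines. This is only a minimal sketch, not necessarily the code used for the original figure. It assumes the nonword accuracy column is called TOWRE_nonwordacc (a hypothetical name; only TOWRE_wordacc appears in the code in this post), and it simulates a small stand-in dataframe so that it runs on its own.

```r
library(ggplot2)

# Simulated stand-in for the ML `subjects` dataframe, so the sketch runs on its own.
# The values are invented; swap in the real data once you have it loaded.
set.seed(1)
n <- 40
subjects <- data.frame(
  TOWRE_nonwordacc = round(runif(n, min = 30, max = 60))  # hypothetical column name
)
subjects$TOWRE_wordacc <- round(60 + 0.6 * subjects$TOWRE_nonwordacc + rnorm(n, sd = 4))

# One point per participant, plus a linear-regression line with its confidence band
p <- ggplot(subjects, aes(x = TOWRE_nonwordacc, y = TOWRE_wordacc)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)

print(p)
```

With method = "lm", geom_smooth() fits a straight line by linear regression, and se = TRUE adds the shaded confidence band around it. With the real data you would drop the simulation lines and keep only the ggplot() call.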