t-tests, ANOVA and regression
Lorelei Howard and Nick Wright, MfD 2008
Overview - Why do we need statistics?
- P values
- T-tests
- ANOVA
Why do we need statistics? - To enable us to test experimental hypotheses
- H0 = null hypothesis
- H1 = experimental hypothesis
- In terms of fMRI
- Null = no difference in brain activation between these 2 conditions
- Exp = there is a difference in brain activation between these 2 conditions
Two types of statistics - Descriptive statistics
- e.g. mean and standard deviation (SD)
- Inferential statistics
Issues when making inferences - We study a sample but want to draw conclusions about the whole population
- So how do we know whether the effect observed in our sample was genuine?
- We can never be completely certain
- Instead we use p values to indicate our level of certainty that our results represent a genuine effect present in the whole population
P values - P values = the probability that the observed result was obtained by chance
- i.e. when the null hypothesis is true
- α level is set a priori (Usually 0.05)
- If p < α level then we reject the null hypothesis and accept the experimental hypothesis
- i.e. there is less than a 5% probability that a result this extreme would arise by chance if the null hypothesis were true
- If however p > α level, we fail to reject the null hypothesis: the data do not provide evidence for the experimental hypothesis
Two types of errors - Type I error = false positive
- α level of 0.05 means that there is 5% risk that a type I error will be encountered
- Type II error = false negative
t-tests Hypothetical experiment - Q: does viewing pictures of the Simpson family and the Griffin family activate the same brain regions?
- Condition 1 = Simpson family faces
- Condition 2 = Griffin family faces
Calculating T - Difference between the means divided by the pooled standard error of the mean
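- In symbols (standard pooled two-sample form, reconstructed here rather than taken from the slide image): t = (x̄1 − x̄2) / √(sp2 (1/n1 + 1/n2)), where sp2 is the pooled variance of the two samples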
How do we apply this to fMRI data analysis?
Degrees of freedom - = number of unconstrained data points
- Which in this case = number of data points − 1
- Can use t value and df to find the associated p value
- Then compare to the α level
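A minimal sketch of these steps in Python (the data values are hypothetical, invented for illustration; scipy is assumed to be available):

```python
import numpy as np
from scipy import stats

# Hypothetical responses for the two conditions (invented values)
simpson = np.array([2.1, 1.8, 2.4, 2.0, 2.2, 1.9])
griffin = np.array([1.5, 1.7, 1.4, 1.8, 1.6, 1.3])

# Pooled-variance two-sample t-test: difference in means / pooled SE
t, p = stats.ttest_ind(simpson, griffin)

# Degrees of freedom: n1 + n2 - 2 for two independent samples
# (for a one-sample test it is n - 1, as on the slide above)
df = len(simpson) + len(griffin) - 2

alpha = 0.05
print(f"t({df}) = {t:.2f}, p = {p:.4f}")
print("reject H0" if p < alpha else "fail to reject H0")
```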
Different types of t-test - Two-sample t-tests
- compare the means of two samples
- One-sample t-tests
- compare the mean of one sample to a given value
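A one-sample version, again as a sketch with invented numbers:

```python
import numpy as np
from scipy import stats

sample = np.array([101.2, 98.7, 103.5, 99.9, 102.1])  # hypothetical scores

# H0: the population mean is 100
t, p = stats.ttest_1samp(sample, popmean=100.0)
print(f"t({len(sample) - 1}) = {t:.2f}, p = {p:.4f}")  # df = n - 1
```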
Another approach to group differences - ANalysis Of VAriance (ANOVA)
- Multiple groups
- e.g. Different facial expressions
- H0 = no differences between groups
- H1 = differences between groups
Calculating F - F = the between group variance divided by the within group variance
- the model variance/error variance
- for F to be significant the between group variance should be considerably larger than the within group variance
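- In symbols (standard one-way form, reconstructed rather than taken from the slides): F = MSbetween / MSwithin = (SSbetween / dfbetween) / (SSwithin / dfwithin), where dfbetween = k − 1 and dfwithin = N − k for k groups and N observations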
What can be concluded from a significant ANOVA? - There is a significant difference between the groups
- NOT where this difference lies
- Finding exactly where the differences lie requires further statistical analyses
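A minimal one-way ANOVA sketch in Python, with a Tukey HSD post-hoc test to locate the differences (data invented; scipy.stats.tukey_hsd requires SciPy 1.8 or later):

```python
import numpy as np
from scipy import stats

# Hypothetical ratings for three facial expressions (invented values)
happy   = np.array([4.1, 3.8, 4.5, 4.0, 4.2])
sad     = np.array([3.1, 2.9, 3.4, 3.0, 3.2])
neutral = np.array([3.0, 3.2, 2.8, 3.1, 2.9])

# One-way ANOVA: between-group variance / within-group variance
f, p = stats.f_oneway(happy, sad, neutral)
print(f"F = {f:.2f}, p = {p:.4f}")

# Post-hoc pairwise comparisons show where the differences lie
print(stats.tukey_hsd(happy, sad, neutral))
```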
Different types of ANOVA - One-way ANOVA
- One factor with more than 2 levels
- Factorial ANOVAs
- more than one factor
- Mixed design ANOVAs
- some factors independent (between subjects), others related (within subjects)
Conclusions - T-tests assess if two group means differ significantly
- Can compare two samples or one sample to a given value
- ANOVAs compare more than two groups or more complicated scenarios
- They test for differences between means by comparing variances (between-group vs within-group)
Further reading - Howell, D. C. Statistical Methods for Psychology
- Howitt, D. & Cramer, D. An Introduction to Statistics in Psychology
- Huettel, S. A., Song, A. W. & McCarthy, G. Functional Magnetic Resonance Imaging (especially chapter 12)
PART 2 - Correlation
- Regression
- Relevance to GLM and SPM
Correlation - Strength and direction of the relationship between variables
- Scattergrams
Describe correlation: covariance - A statistic representing the degree to which 2 variables vary together
- Covariance formula: cov(x,y) = Σ(x − x̄)(y − ȳ) / (N − 1)
- cf. variance formula: sx2 = Σ(x − x̄)2 / (N − 1)
- but…
- the absolute value of cov(x,y) is also a function of the standard deviations of x and y, so it cannot be compared across pairs of variables
Describe correlation: Pearson correlation coefficient (r) - Equation: r = cov(x,y) / (sx sy)
- r = −1 (max. negative correlation); r = 0 (no constant relationship); r = +1 (max. positive correlation)
- Limitations: only describes linear relationships, and is sensitive to outliers
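A short sketch of both statistics in Python (invented data):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

cov_xy = np.cov(x, y)[0, 1]   # sample covariance, N - 1 denominator
r, p = stats.pearsonr(x, y)   # r = cov(x, y) / (sx * sy)
print(f"cov = {cov_xy:.2f}, r = {r:.3f}, p = {p:.4f}")
```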
Summary - Correlation
- Regression
- Relevance to SPM
Regression - Regression: Prediction of one variable from knowledge of one or more other variables.
- Regression v. correlation: Regression allows you to predict one variable from the other (not just say if there is an association).
- Linear regression aims to fit a straight line to the data such that, for any value of x, it gives the best prediction of y
Best fit line, minimising sum of squared errors - Describing the line as in GCSE maths: y = m x + c
- Here, ŷ = bx + a
- ŷ : predicted value of y
- b: slope of regression line
- a: intercept
- Residual error (ε): Difference between obtained and predicted values of y (i.e. y- ŷ).
- Best fit line (values of b and a) is the one that minimises the sum of squared errors (SSerror): Σ(y − ŷ)2
How to minimise SSerror - Minimise Σ(y − ŷ)2, which is Σ(y − bx − a)2
- Plotting SSerror for each possible regression line gives a parabola
- Minimum SSerror is at the bottom of the curve, where the gradient is zero, and this can be found with calculus
- Take partial derivatives of Σ(y − bx − a)2 with respect to b and a, set them to zero and solve the simultaneous equations, giving:
- b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)2 = cov(x,y) / sx2
- a = ȳ − b x̄
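These closed-form solutions can be checked numerically; a minimal sketch with invented data, compared against numpy's built-in line fit:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

b = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # slope = cov(x, y) / sx^2
a = y.mean() - b * x.mean()                 # intercept = ybar - b * xbar
print(f"b = {b:.3f}, a = {a:.3f}")

print(np.polyfit(x, y, 1))  # same answer: [slope, intercept]
```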
How good is the model? - We can calculate the regression line for any data, but how well does it fit the data?
- Total variance = predicted variance + error variance
- sy2 = sŷ2 + ser2
- Also, it can be shown that r2 is the proportion of the variance in y that is explained by our regression model
- r2 = sŷ2 / sy2
- Insert sŷ2 = r2 sy2 into sy2 = sŷ2 + ser2 and rearrange to get:
- ser2 = sy2 (1 – r2)
- From this we can see that the greater the correlation the smaller the error variance, so the better our prediction
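The partition can be verified numerically; a sketch with invented data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

b, a = np.polyfit(x, y, 1)
y_hat = b * x + a

s_y2    = np.var(y, ddof=1)          # total variance
s_yhat2 = np.var(y_hat, ddof=1)      # predicted variance
s_er2   = np.var(y - y_hat, ddof=1)  # error variance

r = np.corrcoef(x, y)[0, 1]
print(s_y2, s_yhat2 + s_er2)  # equal, up to rounding
print(r**2, s_yhat2 / s_y2)   # r^2 = proportion of variance explained
```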
Is the model significant? - i.e. do we get a significantly better prediction of y from our regression equation than by just predicting the mean?
- F-statistic: F(1, n − 2) = r2(n − 2) / (1 − r2)
- So all we need to know are r and n!
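So the test can be run from r and n alone; a sketch with hypothetical values:

```python
from scipy import stats

r, n = 0.85, 20  # hypothetical correlation and sample size

F = (r**2 * (n - 2)) / (1 - r**2)   # F with (1, n - 2) degrees of freedom
p = stats.f.sf(F, 1, n - 2)         # upper-tail p-value
print(f"F(1, {n - 2}) = {F:.2f}, p = {p:.5f}")
```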
Summary - Correlation
- Regression
- Relevance to SPM
General Linear Model - Linear regression is actually a form of the General Linear Model where the parameters are b, the slope of the line, and a, the intercept.
- y = bx + a +ε
- A General Linear Model is just any model that describes the data in terms of a straight line
- One voxel: the GLM is Y = Xβ + ε, where Y is the voxel's BOLD time series, X is the design matrix and ε is the error
- Our aim: solve the equation for β, which tells us how much BOLD signal is explained by X
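A minimal sketch of that step for a single simulated voxel (illustrative numpy, not SPM code; the regressor and noise are invented):

```python
import numpy as np

n = 100
rng = np.random.default_rng(0)

x = rng.random(n)                     # hypothetical predicted BOLD regressor
X = np.column_stack([x, np.ones(n)])  # design matrix: regressor + intercept
Y = 2.0 * x + 0.5 + 0.1 * rng.standard_normal(n)  # simulated voxel time series

beta, *_ = np.linalg.lstsq(X, Y, rcond=None)  # least-squares estimate of beta
print(beta)  # close to [2.0, 0.5]
```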
Multiple regression - Multiple regression is used to determine the effect of a number of independent variables, x1, x2, x3 etc., on a single dependent variable, y
- The different x variables are combined in a linear way and each has its own regression coefficient:
- y = b0 + b1x1 + b2x2 + … + bnxn + ε
- The b parameters reflect the independent contribution of each independent variable, x, to the value of the dependent variable, y
- i.e. the amount of variance in y that is accounted for by each x variable after all the other x variables have been accounted for
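As a sketch, the same least-squares machinery extends to several predictors (invented data again):

```python
import numpy as np

n = 50
rng = np.random.default_rng(1)
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + 0.3 * rng.standard_normal(n)

X = np.column_stack([np.ones(n), x1, x2])  # columns: intercept, x1, x2
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # close to [1.0, 2.0, -0.5]
```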
SPM - Linear regression is a GLM that models the effect of one independent variable, x, on one dependent variable, y
- Multiple Regression models the effect of several independent variables, x1, x2 etc, on one dependent variable, y
- Both are types of General Linear Model
- This is what SPM does and will be explained soon…
Summary - Correlation
- Regression
- Relevance to SPM
- Thanks!