
Lorelei Howard and Nick Wright MfD 2008 t-tests, ANOVA and regression

Overview  Why do we need statistics?
 P values
 T-tests
 ANOVA
Why do we need statistics?  To enable us to test experimental hypotheses
 H0 = null hypothesis
 H1 = experimental hypothesis
 In terms of fMRI
 Null = no difference in brain activation between these 2 conditions
 Exp = there is a difference in brain activation between these 2 conditions
2 types of statistics  Descriptive Stats
 e.g., mean and standard deviation (S.D)
 Inferential statistics
Issues when making inferences  So how do we know whether the effect observed in our sample was genuine?
 We cannot know for certain. Instead we use p values to quantify how surprising our results would be if there were no genuine effect in the whole population
P values  P value = the probability of obtaining a result at least as extreme as the one observed by chance alone
 i.e. when the null hypothesis is true
 α level is set a priori (usually 0.05)
 If p < α level then we reject the null hypothesis and accept the experimental hypothesis
 i.e. a result this extreme would occur less than 5% of the time if the null hypothesis were true (this is not the same as being 95% certain that the effect is genuine)
 If however p ≥ α level then we fail to reject the null hypothesis (this does not show that the null hypothesis is true)
Two types of errors  Type I error = false positive
 α level of 0.05 means that there is a 5% risk of a type I error when the null hypothesis is true
 Type II error = false negative
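The link between the α level and the type I error rate can be checked with a small simulation. This is only a sketch under stated assumptions: the data are simulated (normally distributed, true mean 0, so H0 holds), the test is a one sample t-test with 10 observations, and the critical value 2.262 is the two-tailed t value for α = 0.05 and df = 9 taken from standard t tables. All names are illustrative.

```python
import math
import random
import statistics

N = 10          # observations per simulated experiment (assumption)
T_CRIT = 2.262  # two-tailed critical t for alpha = 0.05, df = N - 1 = 9

def one_sample_t(values, mu0=0.0):
    """t statistic for the mean of `values` against the value mu0."""
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return (statistics.mean(values) - mu0) / standard_error

def false_positive_rate(n_experiments=2000, seed=0):
    """Fraction of experiments declared significant although H0 is true."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_experiments):
        # H0 is true by construction: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(N)]
        if abs(one_sample_t(sample)) > T_CRIT:
            false_positives += 1
    return false_positives / n_experiments

rate = false_positive_rate()
print(f"Type I error rate under H0: {rate:.3f}")
```

The printed rate should land close to 0.05, which is what setting α = 0.05 is meant to guarantee in the long run.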
T-tests
Hypothetical experiment  Q: does viewing pictures of the Simpson family and the Griffin family activate the same brain regions?
 Condition 1 = Simpson family faces
 Condition 2 = Griffin family faces
Calculating T  Difference between the means divided by the pooled standard error of the mean
How do we apply this to fMRI data analysis?
Degrees of freedom  = number of unconstrained data points
 Which in this case = number of data points – 1.
 Can use t value and df to find the associated p value
 Then compare to the α level
Different types of t-test  2 sample t-tests
 One sample t-tests
 compare the mean of one sample to a given value
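As a rough illustration of the calculation, the sketch below computes t and the degrees of freedom by hand. It assumes a paired design (each subject sees both conditions), because that matches df = number of data points − 1; a paired test is simply a one sample t-test on the per-subject differences against 0. The activation values are made up for illustration and the function names are hypothetical.

```python
import math
import statistics

def one_sample_t(values, mu0=0.0):
    """Return (t, df) for comparing the mean of `values` to the value mu0."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    t = (statistics.mean(values) - mu0) / standard_error
    return t, n - 1

def paired_t(condition1, condition2):
    """Paired t-test: a one sample t-test on the per-subject differences."""
    differences = [a - b for a, b in zip(condition1, condition2)]
    return one_sample_t(differences, 0.0)

# Made-up activation values for five subjects (illustration only).
simpsons = [3.0, 5.0, 7.0, 9.0, 11.0]
griffins = [2.0, 3.0, 4.0, 5.0, 6.0]

t, df = paired_t(simpsons, griffins)
print(f"t({df}) = {t:.3f}")  # prints "t(4) = 4.243"
```

The resulting t and df would then be looked up in a t table (or passed to a statistics package) to obtain the p value that is compared with the α level.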
Another approach to group differences  Analysis Of VAriance (ANOVA)
 Multiple groups
 e.g. Different facial expressions
 H0 = no differences between groups
 H1 = differences between groups
Calculating F  F = the between group variance divided by the within group variance
 the model variance/error variance
 for F to be significant the between group variance should be considerably larger than the within group variance
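The F ratio can be sketched for the simplest case, a one-way ANOVA with independent groups. The between and within group variances are computed as mean squares (sum of squares divided by degrees of freedom); the three groups and their values are invented for illustration, and the function name is hypothetical.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)

    # Between group (model) sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Within group (error) sum of squares: values around their own group mean.
    ss_within = sum(
        (value - statistics.mean(group)) ** 2
        for group in groups
        for value in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Made-up responses to three facial expressions (illustration only).
happy = [1.0, 2.0, 3.0]
sad = [2.0, 3.0, 4.0]
angry = [5.0, 6.0, 7.0]

f, df_b, df_w = one_way_anova_f([happy, sad, angry])
print(f"F({df_b}, {df_w}) = {f:.2f}")  # prints "F(2, 6) = 13.00"
```

Here the group means (2, 3 and 6) are far apart relative to the spread inside each group, so the between group variance is much larger than the within group variance and F is large.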
What can be concluded from a significant ANOVA?  There is a significant difference between the groups
 NOT where this difference lies
 Finding exactly where the differences lie requires further statistical analyses
Different types of ANOVA  One-way ANOVA
 One factor with more than 2 levels
 Factorial ANOVAs
 Mixed design ANOVAs
 Some factors independent, others related
Conclusions  T-tests assess if two group means differ significantly
 Can compare two samples or one sample to a given value
 ANOVAs compare more than two groups or more complicated scenarios
 They compare variances rather than comparing means directly
Further reading  Howell. Statistical methods for psychologists
 Howitt and Cramer. An introduction to statistics in psychology
 Huettel. Functional magnetic resonance imaging (especially chapter 12)
PART 2  Correlation
 Regression
 Relevance to GLM and SPM
Correlation  Strength and direction of the relationship between variables
 Scattergrams
Describe correlation: covariance  A statistic representing the degree to which 2 variables vary together
 Covariance formula
 cf. variance formula
 but…
 the absolute value of cov(x,y) is also a function of the standard deviations of x and y.
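The scale dependence of covariance can be seen in a short sketch. It assumes the sample form of covariance (dividing by n − 1; the slides do not show which divisor they use) and uses made-up data. Changing the units of x changes the covariance, whereas dividing by the two standard deviations, which gives the Pearson coefficient r, removes that dependence.

```python
import math

def covariance(x, y):
    """Sample covariance of x and y (divides by n - 1; an assumption)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def pearson_r(x, y):
    """Covariance standardised by the standard deviations of x and y."""
    sd_x = math.sqrt(covariance(x, x))  # cov(x, x) is the variance of x
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

# Made-up data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
x_rescaled = [value * 100 for value in x]  # same data, different units

cov_xy = covariance(x, y)
cov_rescaled = covariance(x_rescaled, y)
r_xy = pearson_r(x, y)
r_rescaled = pearson_r(x_rescaled, y)

print(f"cov = {cov_xy:.2f}, after rescaling x: {cov_rescaled:.2f}")
print(f"r   = {r_xy:.3f}, after rescaling x: {r_rescaled:.3f}")
```

The covariance grows by a factor of 100 when x is rescaled, while r stays the same.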
Describe correlation: Pearson correlation coefficient (r)  $r = \frac{\mathrm{cov}(x,y)}{s_x s_y}$
 r = −1 (max. negative correlation); r = 0 (no linear relationship); r = 1 (max. positive correlation)
 Limitations:
Summary  Correlation
 Regression
 Relevance to SPM
Regression  Regression: Prediction of one variable from knowledge of one or more other variables.
 Regression v. correlation: Regression allows you to predict one variable from the other (not just say if there is an association).
 Linear regression aims to fit a straight line to data that for any value of x gives the best prediction of y.
Best fit line, minimising sum of squared errors  Describing the line as in GCSE maths: y = m x + c
 Here, ŷ = bx + a
 ŷ : predicted value of y
 b: slope of regression line
 a: intercept
 Residual error (ε): Difference between obtained and predicted values of y (i.e. $y - \hat{y}$).
 Best fit line (values of b and a) is the one that minimises the sum of squared errors (SSerror): $\sum (y - \hat{y})^2$
How to minimise SSerror  Minimise $\sum (y - \hat{y})^2$, which is $\sum (y - bx - a)^2$
 Plotting SSerror for each possible regression line gives a parabola.
 Minimum SSerror is at the bottom of the curve where the gradient is zero, and this can be found with calculus.
 Take partial derivatives of $\sum (y - bx - a)^2$ with respect to b and a, set them to 0 and solve as simultaneous equations, giving:
 $b = \frac{\mathrm{cov}(x,y)}{s_x^2} = r\,\frac{s_y}{s_x}$ and $a = \bar{y} - b\bar{x}$
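A minimal sketch of the least squares fit, assuming the standard closed-form solution of the two simultaneous equations (slope = sum of cross-products divided by sum of squares of x, intercept = mean of y minus slope times mean of x). The data are made up and the function names are hypothetical.

```python
def fit_line(x, y):
    """Return (b, a): slope and intercept that minimise SSerror."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum_xy / sum_xx
    a = mean_y - b * mean_x
    return b, a

def ss_error(x, y, b, a):
    """Sum of squared differences between observed y and predicted y."""
    return sum((yi - (b * xi + a)) ** 2 for xi, yi in zip(x, y))

# Made-up data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b, a = fit_line(x, y)
best = ss_error(x, y, b, a)
print(f"y_hat = {b:.1f}x + {a:.1f}, SSerror = {best:.2f}")

# Any other line gives a larger SSerror: we are at the bottom of the curve.
print(ss_error(x, y, b + 0.1, a) > best, ss_error(x, y, b, a - 0.1) > best)
```

Nudging either b or a away from the fitted values increases SSerror, which is what being at the minimum of the parabola means.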
How good is the model?  We can calculate the regression line for any data, but how well does it fit the data?
 Total variance = predicted variance + error variance
 $s_y^2 = s_{\hat{y}}^2 + s_{er}^2$
 Also, it can be shown that $r^2$ is the proportion of the variance in y that is explained by our regression model
 $r^2 = s_{\hat{y}}^2 / s_y^2$
 Substitute $s_{\hat{y}}^2 = r^2 s_y^2$ into $s_y^2 = s_{\hat{y}}^2 + s_{er}^2$ and rearrange to get:
 $s_{er}^2 = s_y^2 (1 - r^2)$
 From this we can see that the greater the correlation the smaller the error variance, so the better our prediction
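These relationships can be checked numerically. The sketch below fits a least squares line to made-up data and compares the three variances; it assumes all variances use the same divisor (here n, via `statistics.pvariance`), which is what makes the decomposition hold exactly. Names are illustrative.

```python
import statistics

def fit_line(x, y):
    """Least squares slope b and intercept a."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum_xy / sum_xx
    return b, mean_y - b * mean_x

# Made-up data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b, a = fit_line(x, y)
predicted = [b * xi + a for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

var_y = statistics.pvariance(y)            # total variance
var_pred = statistics.pvariance(predicted)  # predicted variance
var_err = statistics.pvariance(residuals)   # error variance
r_squared = var_pred / var_y

print(f"total = {var_y:.2f}, predicted = {var_pred:.2f}, error = {var_err:.2f}")
print(f"r^2 = {r_squared:.2f}, total * (1 - r^2) = {var_y * (1 - r_squared):.2f}")
```

For these data the predicted and error variances add up to the total variance, and the error variance equals the total variance times (1 − r²).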
Is the model significant?  i.e. do we get a significantly better prediction of y from our regression equation than by just predicting the mean?
 F-statistic: $F_{(1,\,n-2)} = \frac{r^2 (n - 2)}{1 - r^2}$
 So all we need to know are r and n!
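As a sketch of why r and n are enough, the function below assumes the standard identity for simple linear regression with one predictor, F = r²(n − 2) / (1 − r²) with 1 and n − 2 degrees of freedom, and the equivalent t value (the signed square root of F). The function name and the example values of r and n are hypothetical.

```python
import math

def regression_f(r, n):
    """Return (F, df_model, df_error) for a simple regression, from r and n."""
    if n < 3:
        raise ValueError("need at least 3 data points")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    f = (r ** 2) * (n - 2) / (1 - r ** 2)
    return f, 1, n - 2

r, n = 0.5, 20  # illustrative values only
f, df_model, df_error = regression_f(r, n)
t = math.copysign(math.sqrt(f), r)  # equivalent t statistic with n - 2 df

print(f"F({df_model}, {df_error}) = {f:.2f}, t({df_error}) = {t:.3f}")
```

F is then compared against the F distribution with those degrees of freedom to get a p value, exactly as for t.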
Summary  Correlation
 Regression
 Relevance to SPM
General Linear Model  Linear regression is actually a form of the General Linear Model where the parameters are b, the slope of the line, and a, the intercept.
 $y = bx + a + \varepsilon$
 A General Linear Model is any model that describes the data as a linear combination of predictors plus error; a straight line is the simplest case
 One voxel: The GLM
 Our aim: Solve equation for β – tells us how much BOLD signal is explained by X
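The equation referred to here is not reproduced in this text. In the usual matrix notation (an assumption about what the slide showed), the GLM for one voxel and its ordinary least squares solution can be written as below, where Y is the voxel's time series, X is the design matrix, β holds the parameters and ε is the error. The solution assumes $X^{\top}X$ is invertible.

$$
Y = X\beta + \varepsilon, \qquad \hat{\beta} = (X^{\top}X)^{-1}X^{\top}Y
$$

With a single predictor plus a constant column in X, the two elements of $\hat{\beta}$ are the intercept a and slope b of the regression line above.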
Multiple regression  Multiple regression is used to determine the effect of a number of independent variables, x1, x2, x3 etc., on a single dependent variable, y
 The different x variables are combined in a linear way and each has its own regression coefficient:
 $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + \varepsilon$
 The b parameters reflect the independent contribution of each independent variable, x, to the value of the dependent variable, y.
 i.e. the amount of variance in y that is accounted for by each x variable after all the other x variables have been accounted for
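A minimal sketch of how the coefficients could be estimated, assuming ordinary least squares via the normal equations (XᵀX)b = Xᵀy, solved here with a small hand-written Gaussian elimination so that no external library is needed. The data are made up, noise-free and generated from known coefficients so the result is easy to check; all function names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfectly collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (m[row][n] - tail) / m[row][row]
    return solution

def multiple_regression(predictors, y):
    """predictors: rows of [x1, x2, ...]. Returns [b0, b1, b2, ...]."""
    design = [[1.0] + list(row) for row in predictors]  # constant column for b0
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Made-up, noise-free data generated from y = 1 + 2*x1 - 0.5*x2.
predictors = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0],
              [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]]
y = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in predictors]

b0, b1, b2 = multiple_regression(predictors, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```

Because the data contain no noise, the fit recovers the generating coefficients; with real data each b would be an estimate of that predictor's contribution after the others have been accounted for.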
SPM  Linear regression is a GLM that models the effect of one independent variable, x, on one dependent variable, y
 Multiple Regression models the effect of several independent variables, x1, x2 etc, on one dependent variable, y
 Both are types of General Linear Model
 This is what SPM does and will be explained soon…
Summary  Correlation
 Regression
 Relevance to SPM
 Thanks!

