Summaries

Class 9, Module II

30 March 2021, 14:30 Jorge Filipe Campinos Landerset Cadima

Online class on Monday, April 12, 14h30-17h00
[Slides 343-370] Some final comments regarding the ANOVA context. The Analysis of Covariance (ANCOVA): the general idea. A detailed study of the ANCOVA model to compare different simple linear regressions in k contexts defined by the k levels of a factor: the concept, notation and terminology; the full ANCOVA model allowing for different regression lines in each factor level and its relation to the simple linear regression models fitted only with the subset of the data for each factor level. Submodels of interest, resulting from different hypotheses regarding the parameters of the full ANCOVA model. The role of partial F and Student's t tests in choosing an appropriate model. An example with the iris dataset, relating Petal.Length and Sepal.Length: the full ANCOVA model assuming different regression lines in the factor levels; the model assuming parallel lines, but possibly different intercepts; and the single regression line model; choosing the model. A warning about the model assumptions. A formula relating the coefficients of determination of the ANCOVA model and of the individual simple regressions in each factor level. A word on more general contexts. ANCOVA Exercise 1.
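For reference, the three nested models compared in this class can be written as follows. This is a sketch in one common parameterisation (level 1 of the factor taken as the reference level); the symbols \beta_0, \beta_1, \alpha_{0:i}, \alpha_{1:i} are assumptions and may differ from the notation used in the slides.

```latex
% Full ANCOVA model: a separate line for each factor level i = 1, ..., k
% (observation j within level i), with \alpha_{0:1} = \alpha_{1:1} = 0.
Y_{ij} = (\beta_0 + \alpha_{0:i}) + (\beta_1 + \alpha_{1:i})\, x_{ij} + \epsilon_{ij},
\qquad \epsilon_{ij} \sim \mathcal{N}(0, \sigma^2) \ \text{independent}
% 2k parameters: intercept \beta_0 + \alpha_{0:i} and slope \beta_1 + \alpha_{1:i} in level i.

% Submodel 1, parallel lines with possibly different intercepts (k + 1 parameters):
H_0 : \ \alpha_{1:i} = 0 \quad \forall\, i = 2, \dots, k

% Submodel 2, a single regression line for all levels (2 parameters):
H_0 : \ \alpha_{0:i} = \alpha_{1:i} = 0 \quad \forall\, i = 2, \dots, k
```

Each submodel is nested in the full model, so each pair can be compared with a partial F test; a hypothesis on a single parameter (for example, when k = 2) can also be tested with a t test.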


Class 8, Module II

29 March 2021, 14:30 Jorge Filipe Campinos Landerset Cadima

Online class on Tuesday, March 30, 14h30-17h00
[Slides 278-342] A final note on one-way ANOVA models. Some general principles of experimental design: randomization, repetitions and pseudo-repetitions, additional predictors as ways of controlling unexplained variance (blocks in an ANOVA context). Two-way factorial designs: definition, notation. A first model (without interaction effects): the model equation and restrictions; additional assumptions. The two tests of interest and the corresponding statistics with F distribution. Some additional formulas. A word of warning about the order of factors in unbalanced designs. An example with R. The interpretation of the model parameters and the lack of flexibility of the model: the need for interaction effects. The two-way ANOVA model with interaction effects: model equation and restrictions. The interpretation of the model parameters. The three F tests, their test statistics and distribution under H0. Interaction plots. Two examples with R. Some formulas.
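As a reminder of the structure discussed in this class, the two-way model with interaction can be sketched as below. The notation (factor A with a levels, factor B with b levels, reference-cell restrictions, MS for Mean Squares) is an assumption and may not match the slides.

```latex
% Two-way ANOVA with interaction: level i of factor A, level j of factor B,
% replicate k = 1, ..., n_{ij}; n = total number of observations.
Y_{ijk} = \mu_{11} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk},
\qquad \epsilon_{ijk} \sim \mathcal{N}(0, \sigma^2) \ \text{independent}

% Restrictions (cell (1,1) as reference):
\alpha_1 = 0, \quad \beta_1 = 0, \quad
(\alpha\beta)_{1j} = 0 \ \ \forall j, \quad (\alpha\beta)_{i1} = 0 \ \ \forall i

% The three F tests, each with statistic MS_{effect} / MSRE:
H_0 : (\alpha\beta)_{ij} = 0 \ \ \forall i,j
  \quad\Rightarrow\quad F_{AB} \sim F_{[(a-1)(b-1),\; n-ab]}
H_0 : \alpha_i = 0 \ \ \forall i
  \quad\Rightarrow\quad F_{A} \sim F_{[a-1,\; n-ab]}
H_0 : \beta_j = 0 \ \ \forall j
  \quad\Rightarrow\quad F_{B} \sim F_{[b-1,\; n-ab]}
```

Dropping the (\alpha\beta)_{ij} terms gives the first model (without interaction), whose residual degrees of freedom become n - (a + b - 1).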


Class 7, Module II

23 March 2021, 14:30 Jorge Filipe Campinos Landerset Cadima

Online class on Monday, March 29, 14h30-17h00
[Slides 222-277] Diagnostic tools for linear regression models: again the leverages; Cook's distances and influence. The adjusted R^2. Discussion, examples and R commands. Some final comments. The Analysis of Variance (ANOVA) context of the linear model: motivating examples, terminology and notation. The one-way ANOVA: the model; the role of the indicator (dummy) variables; the need for a restriction to avoid an excess number of parameters and a brief discussion of alternatives. The one-way ANOVA model and the F test for the existence of factor effects. Special formulas that arise in a one-way ANOVA context, for fitted values, residuals, Sums of Squares, Mean Squares. A discussion of the F test statistic. Examples and R commands. Specificities in model-checking.
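The special one-way ANOVA formulas mentioned above can be made concrete with a short numeric sketch. The class used R; this standard-library Python version only spells out the arithmetic. The function name `one_way_anova`, the abbreviations SSF/SSRE/MSF/MSRE, and the toy data are all illustrative assumptions, not taken from the slides.

```python
from statistics import mean


def one_way_anova(groups):
    """Compute one-way ANOVA quantities.

    `groups` is a list with one list of observations per factor level.
    """
    k = len(groups)                       # number of factor levels
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    level_means = [mean(g) for g in groups]

    # Fitted values are the level means, so the factor Sum of Squares
    # measures how far the level means are from the grand mean.
    ssf = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, level_means))

    # Residuals are deviations of each observation from its own level mean.
    ssre = sum((y - m) ** 2
               for g, m in zip(groups, level_means) for y in g)

    msf = ssf / (k - 1)                   # factor Mean Square
    msre = ssre / (n - k)                 # residual Mean Square
    return {"SSF": ssf, "SSRE": ssre, "MSF": msf, "MSRE": msre,
            "F": msf / msre, "df": (k - 1, n - k)}


# Made-up data: 3 factor levels with 3 observations each.
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
result = one_way_anova(toy)
print(result)  # SSF = 42, SSRE = 6, F = 21 on (2, 6) degrees of freedom
```

Under H0 (no factor effects) the statistic F = MSF/MSRE follows an F distribution with (k-1, n-k) degrees of freedom, and large values of F lead to rejection.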


Class 6, Module II

22 March 2021, 14:30 Jorge Filipe Campinos Landerset Cadima

Online class on Tuesday, March 23, from 14h30 to 17h00
[Slides 183-221] The partial F test: hypotheses and their meaning; the test statistic (in two alternative expressions); the distribution under H0; outline of the proof. Examples: Exercise 9 (brix data). The selection of submodels: full searches using the leaps and bounds algorithm (the leaps package); heuristic algorithms using stepwise selection. The Akaike Information Criterion (AIC): its interpretation and the illustration of a backward elimination algorithm using the AIC; the R function step. Model checking: the distribution of residuals in a Linear Model: its proof and its use in checking the validity of the model assumptions. Three kinds of residuals. Plots of (usual) residuals vs. fitted values; Q-Q plots of standardized residuals; their interpretation. The R command plot for objects of class lm. Other diagnostics: the notion of outliers. Leverage and leverage points: properties and interpretation.
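The partial F statistic is usually written in two equivalent ways, one based on residual Sums of Squares and one based on coefficients of determination. The sketch below, in standard-library Python rather than the R used in class, checks numerically that the two agree. It assumes a full model with p predictors, a submodel with k < p predictors, and n observations; the function names and the numbers are hypothetical, and the slides' two expressions may be written differently.

```python
def partial_f_from_ss(ssre_sub, ssre_full, n, p, k):
    """Partial F from the residual Sums of Squares of submodel and full model."""
    return ((ssre_sub - ssre_full) / (p - k)) / (ssre_full / (n - (p + 1)))


def partial_f_from_r2(r2_sub, r2_full, n, p, k):
    """Partial F from the coefficients of determination of the two models."""
    return ((n - (p + 1)) / (p - k)) * (r2_full - r2_sub) / (1 - r2_full)


# Made-up numbers: n = 30 observations, full model with p = 4 predictors,
# submodel with k = 2 predictors, total Sum of Squares SST = 100.
n, p, k = 30, 4, 2
sst, ssre_full, ssre_sub = 100.0, 20.0, 30.0

f_ss = partial_f_from_ss(ssre_sub, ssre_full, n, p, k)
f_r2 = partial_f_from_r2(1 - ssre_sub / sst, 1 - ssre_full / sst, n, p, k)
print(f_ss, f_r2)  # both approximately 6.25
```

Under H0 (the p-k extra predictors all have zero coefficients) the statistic follows an F distribution with (p-k, n-(p+1)) degrees of freedom, so the observed value would be compared with the corresponding right-tail quantile.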


Class 5, Module II

16 March 2021, 14:30 Jorge Filipe Campinos Landerset Cadima

Online class on Monday, March 22, from 14h30 to 17h00

Exercise 18: a) b) c) d) (partial) e) f) (just discussion). [Slides 156-182] The theory for inference on any linear combination of the model parameters. Three specific cases of interest: individual parameters, sum/difference of two parameters, the expected value of Y for given values of the predictors. Confidence intervals and hypothesis tests for the general result and for the specific cases. Examples and R commands. Confidence bands for the population regression line. Prediction intervals and bands for individual values of Y: examples and R command. A general result to compare linear models and submodels: Cochran's Theorem. The F goodness-of-fit test and a discussion of its proof. Alternative ways of writing the test hypotheses and the test statistic. Justification of the right-sided rejection region in the F test.
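The main results of this class can be summarised in standard textbook form as follows. The notation is an assumption (p predictors, n observations, model matrix X, MSR and MSRE for the regression and residual Mean Squares) and may differ from the slides.

```latex
% (1-\alpha) \times 100\% confidence interval for a linear combination \vec{a}^T\vec{\beta}:
\vec{a}^{\,T}\hat{\vec{\beta}} \;\pm\; t_{\alpha/2\,[n-(p+1)]}\,
\sqrt{MSRE \cdot \vec{a}^{\,T}(\mathbf{X}^T\mathbf{X})^{-1}\vec{a}}
% Special cases: \vec{a} = \vec{e}_j gives the individual parameter \beta_j;
% \vec{a} = (1, x_1, \dots, x_p)^T gives E[Y \mid x_1, \dots, x_p].

% Prediction interval for an individual value of Y at \vec{a} = (1, x_1, \dots, x_p)^T:
\vec{a}^{\,T}\hat{\vec{\beta}} \;\pm\; t_{\alpha/2\,[n-(p+1)]}\,
\sqrt{MSRE \cdot \left[\, 1 + \vec{a}^{\,T}(\mathbf{X}^T\mathbf{X})^{-1}\vec{a} \,\right]}

% F goodness-of-fit test, H_0 : \beta_1 = \dots = \beta_p = 0:
F = \frac{MSR}{MSRE} = \frac{n-(p+1)}{p} \cdot \frac{R^2}{1-R^2}
\;\sim\; F_{[p,\; n-(p+1)]} \quad \text{under } H_0
```

The second form of F makes the right-sided rejection region easy to see: F increases with R^2, so only large values of F are evidence against H0.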