EVALUATION METHODS
Your performance in the course will be evaluated through two assignments. Both assignments will involve simple data management tasks, as well as statistical analysis and interpretation. You will be expected to apply simple techniques to validate data quality and the accuracy of any data manipulations. There will be no final examination.
Assignment #1 25%
Assignment #2 75%
Total 100%
The two assignments will both use data from the same source: an extract of the Framingham Study. The first assignment will relate to non-regression methods of survival analysis. It is the shorter of the two assignments and is intended to provide some interim feedback on progress. It is due to be handed in on March 20. I will attempt to hand back the marked assignments on March 27 although, given the anticipated size of this year’s class, that may not prove feasible.
The second assignment is a larger analysis task and will involve application of Cox regression models. It is due to be handed in on April 27.
Please note that the data set you will be using involves real data. It is not a made-up data set designed to give simple or neat answers. Rather, it is a selection of real data from the Framingham Study; the data may well be messy or show inconclusive relationships. As such, you should not jump in with a complex model without first looking at the data using descriptive and similar simple methods. That is, you are expected to undertake appropriate exploratory data analyses before doing the full regression analysis.
I encourage you to include your full SAS code so that I can give you feedback on the code if there are problems with your analysis.
A major aspect of these assignments (especially assignment #2) is the interpretation of the results. I do not want you to simply give me a computer print-out: I may know how to interpret the print-out but you have to show me that you know how to interpret it. You should not ‘cut and paste’ the SAS output. Rather, re-type the relevant parts of the output into a suitable table. Nor should you just type out the regression coefficients and leave it at that. Again, I may know how to interpret the coefficients: your task is to show me that you also know how. At a minimum, the results must be converted back into epidemiologically meaningful parameters (e.g. the incidence density ratio, IDR). However, what I really am looking for is that you can interpret what the coefficients and analyses mean in epidemiological terms. This is a key objective of the course. For example, what do the analyses mean in terms of confounding? How do they bear on the underlying hypotheses?
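As a rough illustration of the minimum conversion described above, the sketch below turns a single Cox regression coefficient into an IDR with an approximate 95% confidence interval. The coefficient (0.40) and its standard error (0.10) are hypothetical values chosen only to show the arithmetic; they do not come from the Framingham extract, and the Wald-type interval is an assumption about how the interval would be formed.

```latex
% Hypothetical values for illustration only: \hat\beta = 0.40, SE(\hat\beta) = 0.10
\[
\widehat{\mathrm{IDR}} = e^{\hat\beta} = e^{0.40} \approx 1.49
\]
% Approximate (Wald) 95% confidence interval, formed on the log scale and then exponentiated
\[
95\%\ \mathrm{CI}:\quad
\exp\!\bigl(\hat\beta \pm 1.96\,\mathrm{SE}(\hat\beta)\bigr)
= \bigl(e^{0.204},\; e^{0.596}\bigr)
\approx (1.23,\; 1.81)
\]
```

Read in epidemiological terms, these hypothetical figures would mean that the exposed group experiences the event at roughly 1.5 times the rate of the unexposed group, holding the other covariates in the model constant, with the data compatible with a rate ratio anywhere from about 1.2 to 1.8. The interpretation I am looking for goes beyond this: for instance, comparing the adjusted estimate with the crude one and commenting on what any change suggests about confounding.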