Statistical Modeling
Dr. Courtney Brown
Assignment #2
In this assignment, you will again analyze the on-year and off-year pattern of voting participation in U.S. congressional elections. Remember that you must explain the statistics in plain English, so write clearly. This time you will conduct and evaluate t-tests for differences between means. You are to examine differences in congressional mobilization between two periods: 1932-1970 and 1972-1988. Think about why these two periods are useful, and especially why you may want to begin the first period in 1932. The programs below are set up to analyze total congressional mobilization, but you should also look at total Republican congressional mobilization and total Democratic congressional mobilization. Examine all three when you compare the two periods. Use the same data set that you used for the last assignment.
You will be doing this assignment with R. Immediately below is an R script that does some of the analysis to get you started. Further below is a SAS program that performs the same analysis; it is included for your reference. The R code is initially set up assuming unequal variances between time periods, but the modification for equal variances is also shown. How would you know which test to use? You need to finish the script to include the tests for off-year elections, where the variable ON equals 0 instead of 1.
You are to analyze these data using t-tests to determine whether there really is a difference in congressional mobilization between the time periods. That is, compare the on-year results for the early period to the on-year results for the later period, and likewise compare the off-year results across the two periods.
# First we get our data.
mydata <- read.table("usparty.txt")
names(mydata) # Lets us see all the variable names.
mysubsetdata <- subset(mydata, select = c(YEAR, MTOTCONG, ON)) # This keeps only the three variables that we need.
summary(mysubsetdata) # Since no variables are listed, a summary for all variables in the data frame is printed.
mysubsetdata # This prints out all the variable values.
# Now let us work with just the years 1932 through 1970, on-year data only.
my3270onyeardata <- subset(mysubsetdata, YEAR >= 1932 & YEAR <= 1970 & ON == 1)
my3270onyeardata
# Now let us work with just the years 1972 through 1988, on-year data only.
my7288onyeardata <- subset(mysubsetdata, YEAR >= 1972 & YEAR <= 1988 & ON == 1)
my7288onyeardata
# Here is the t-test between the earlier and later periods for on-year elections.
t.test(my3270onyeardata$MTOTCONG, my7288onyeardata$MTOTCONG)
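# A short sketch (not part of the original script) of how you might pull the
# individual pieces out of a t-test result so that you can describe them in plain English.
# The two vectors below are made-up illustration values, NOT the course data;
# in your own work you would use my3270onyeardata$MTOTCONG and my7288onyeardata$MTOTCONG instead.
early_example <- c(0.52, 0.55, 0.58, 0.60, 0.57) # Hypothetical values for an earlier period.
late_example <- c(0.50, 0.48, 0.47, 0.49) # Hypothetical values for a later period.
example_result <- t.test(early_example, late_example) # Unequal variances (Welch) is the default.
example_result$estimate # The two sample means.
example_result$statistic # The t statistic.
example_result$parameter # The degrees of freedom.
example_result$p.value # The two-sided p-value.
example_result$conf.int # The 95% confidence interval for the difference between the means.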
# If you were assuming equal variances across time periods, then you would write this test as below.
t.test(my3270onyeardata$MTOTCONG, my7288onyeardata$MTOTCONG, var.equal=TRUE)
# You could also conduct an F-test to see if the variances really are equal, to be safe.
var.test(my3270onyeardata$MTOTCONG, my7288onyeardata$MTOTCONG)
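# What follows is one possible sketch (not the required solution) of how the F-test result
# might guide the choice between the two t-tests, and how the same steps could be reused
# for other variables and for off-year elections.
# Assumptions: compare_periods is a hypothetical helper name, the 0.05 cutoff is an assumed
# convention, and the toy data frame holds made-up values, NOT the course data.
compare_periods <- function(data, varname, on_value, alpha = 0.05) {
  # Pull out the chosen variable for each period, for the chosen election type.
  early <- subset(data, YEAR >= 1932 & YEAR <= 1970 & ON == on_value)[[varname]]
  late <- subset(data, YEAR >= 1972 & YEAR <= 1988 & ON == on_value)[[varname]]
  # F-test for equal variances between the two periods.
  ftest <- var.test(early, late)
  # Pool the variances only if the F-test does not reject equal variances.
  equal_variances <- ftest$p.value >= alpha
  t.test(early, late, var.equal = equal_variances)
}
# Made-up illustration data with the same column names as the script above.
toy <- data.frame(YEAR = seq(1932, 1988, by = 2))
toy$ON <- as.numeric(toy$YEAR %% 4 == 0) # Presidential election years are on-years.
toy$MTOTCONG <- 0.40 + 0.10 * toy$ON + 0.02 * sin(toy$YEAR) # Hypothetical values only.
on_year_result <- compare_periods(toy, "MTOTCONG", on_value = 1)
off_year_result <- compare_periods(toy, "MTOTCONG", on_value = 0)
on_year_result
off_year_result
# With the real data you would pass mydata and the name of whichever mobilization variable you are examining.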
* Below is the comparable SAS code that conducts this analysis;
libname windata 'e:\';
GOPTIONS lfactor=10 hsize=6 in vsize=6 in horigin=1 in vorigin=1 in;
options nocenter;
**********************************************************;
* CLASS, NOTE THAT IF YOU BEGIN A LINE WITH AN ASTERISK *
* THEN YOU CAN PUT NOTES IN YOUR PROGRAM FILES. THIS IS
* LIKE A COMMENT CARD IN SPSS. HOWEVER, REMEMBER
* TO EVENTUALLY PUT A FINAL SEMICOLON AT THE END OF YOUR COMMENTS.;
***********************************************************;
* NOTE THAT I INDENT SOME STATEMENTS. THIS
* IS JUST FOR NEATNESS.;
***********************************************************;
* COPYRIGHT (c) Courtney Brown 2005, All Rights Reserved;
* Permission granted to use this file and computer code for any nonprofit and
* educational purposes, including classroom instruction.
* No further permission required.
* Please cite source as "From www.courtneybrown.com";
***********************************************************;
DATA USPARTY;SET windata.USPARTY;
IF (YEAR LE 1930) THEN PERIOD=1;
IF ((YEAR GE 1932) AND (YEAR LE 1970)) THEN PERIOD=2;
IF ((YEAR GE 1972) AND (YEAR LE 1988)) THEN PERIOD=3;
DATA ONYEAR;SET USPARTY;
IF (ON = 1);
IF (PERIOD NE 1);
DATA OFFYEAR;SET USPARTY;
IF (ON = 0);
IF (PERIOD NE 1);
proc print DATA=ONYEAR;var year period mtotcong ON;
proc print DATA=OFFYEAR;var year period mtotcong ON;
DATA ONYEAR;SET ONYEAR;
PROC SORT;BY PERIOD;
PROC MEANS; VAR YEAR PERIOD MTOTCONG ON;
BY PERIOD;
TITLE "ON-YEAR ELECTIONS";
PROC TTEST;
CLASS PERIOD;
VAR MTOTCONG;
DATA OFFYEAR;SET OFFYEAR;
PROC SORT;BY PERIOD;
PROC MEANS; VAR YEAR PERIOD MTOTCONG ON;
BY PERIOD;
TITLE "OFF-YEAR ELECTIONS";
PROC TTEST;
CLASS PERIOD;
VAR MTOTCONG;
run;
quit;