Statistical Modeling

Dr. Courtney Brown

Assignment #3

You will be working with cross-tabulation tables and a new SAS data set in this assignment. Remember that this is an assignment in scientific writing, so explain your results clearly enough that anyone can understand your findings.

First, download the survey data set for the Reagan vs. Carter election in 1980. (Right-click this link and choose "Save Target As.") This data set is called a "panel study" because there were four waves of interviews that took place throughout 1980 to capture the sense of the campaign (1) at the entry into the primary season, (2) after the primaries, (3) during the main campaign in September, and (4) after the vote in November. After you download the data set, you will need to extract it, since it is "zipped." Use Windows Explorer to do this by right-clicking the zipped file and choosing "Extract." Put the extracted data set on your USB drive.

Now you can run the SAS program at the end of this assignment. Copy and paste everything from the libname statement through the final quit statement into your SAS editor, and then run the program. The libname statement assumes that your USB drive is drive e:, so change the drive letter if yours is different. You will see the variable labels in the output for Proc Means.

For this assignment, you will search for an interesting relationship between two or more variables by constructing cross-tabulation tables using variations of the program below. Try all sorts of variables and variable combinations. Use the variable labels from the Proc Means output to find new and interesting variables to examine with cross-tabulation tables. Look at which relationships have significant chi-square tests. Find a story to tell from your exploratory analysis of these data. Then write up what you find in a three- to five-page analysis. Be sure to include at least one revealing cross-tabulation table in your analysis. (More than one is fine.) Discuss the relevant percentages. Remember that when you describe a relationship between two variables, it is normal to compare the numbers within a column when you are using row percentages, and vice versa.
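
As a sketch of how you might focus on row percentages, the variation below asks Proc Freq to suppress the column and overall percentages. It assumes that you have already run the main program in the same SAS session, so that the panel80 data set and the two formats exist. The title is only a suggestion.

```sas
/* Sketch only: run the main program first so that panel80, GenderFmt, and VoteFmt exist. */
proc freq data=panel80;
  /* NOCOL and NOPERCENT drop the column and overall percentages, */
  /* leaving the cell counts and the row percentages.             */
  tables gender*vote / chisq nocol nopercent;
  format gender GenderFmt. vote VoteFmt.;
  title 'Vote by Gender: Row Percentages Only';
run;
```

Here each row is a gender, so the row percentages in the Reagan column can be compared directly between men and women.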

Here is some help in interpreting the variable values.
Feeling thermometers: 0 to 100, with 50 being neutral.
Liberal/conservative scales: 1=extreme liberal, 7=extreme conservative.
Inter1-Inter3: respondent's interest in the campaign, low to high.
P1 through P4: These refer to the panel waves: January, July, September, and November.
Expectation to vote: 5 = will vote, 1 = will not vote.
Education: years of education.
Income: a scale from low to high, not thousands of dollars.
Frequency of church attendance: low to high.
R: This refers to the respondent.
Generally, all of the variables go from low to high. Thus, if you see a variable and you do not know the coding scheme, assume that a small number means less and a larger number means more. The other codes are in the variable labels.
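
A variable with many values, such as a 0 to 100 feeling thermometer, makes an unwieldy cross-tabulation table, so you may want to collapse it into a few categories first. The sketch below shows one possible way to do that. The variable name THERM is hypothetical, so substitute a real thermometer variable from the Proc Means output. The cut points follow the note above that 50 is neutral, and the sketch assumes that the main program has already been run.

```sas
/* Sketch only: THERM is a hypothetical variable name, not one taken from the data set. */
data recoded;
  set panel80;
  length feeling $ 7;      /* declare the length so that 'neutral' is not truncated */
  if missing(therm) then feeling = ' ';            /* leave missing ratings blank */
  else if (therm lt 50) then feeling = 'cool';
  else if (therm eq 50) then feeling = 'neutral';
  else feeling = 'warm';
run;

proc freq data=recoded;
  tables feeling*vote / chisq;
  format vote VoteFmt.;
  title 'Vote by Thermometer Rating (Hypothetical Variable)';
run;
```

By default, Proc Freq leaves the blank (missing) category out of the table and out of the chi-square test.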

Most of the variables in this data set come from a panel study supplied by the Inter-university Consortium for Political and Social Research (ICPSR). Emory University is a member of the ICPSR. I have added some contextual variables to the survey data set by extracting these contextual data from a separate ICPSR data set.

libname windata 'e:\';
GOPTIONS lfactor=10 hsize=6 in vsize=6 in horigin=1 in vorigin=1 in;
options nocenter ls=120;
**********************************************************;
* CLASS, NOTE THAT IF YOU BEGIN A LINE WITH AN ASTERISK *
* THEN YOU CAN PUT NOTES IN YOUR PROGRAM FILES. THIS IS
* LIKE A COMMENT CARD IN SPSS. HOWEVER, REMEMBER
* TO EVENTUALLY PUT A FINAL SEMICOLON AT THE END OF YOUR COMMENTS.;
***********************************************************;
* NOTE THAT I INDENT SOME STATEMENTS. THIS
* IS JUST FOR NEATNESS.;
***********************************************************;
* COPYRIGHT (c) Courtney Brown 2005, All Rights Reserved;
* Permission granted to use this file and computer code for any nonprofit and
* educational purposes, including classroom instruction.
* No further permission required.
* Please cite source as "From www.courtneybrown.com";
***********************************************************;
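DATA panel80;
  SET windata.panel80;
* KEEP ONLY THE REAGAN (1) AND CARTER (2) VOTERS.;
  if ((vote eq 1) or (vote eq 2));
* SAS SORTS MISSING VALUES BELOW EVERY NUMBER, SO EXCLUDE THEM EXPLICITLY WHEN RECODING.;
  if ((not missing(age)) and (age le 43)) then generation = 'youth';
  if (age gt 43) then generation = 'older';
  if ((not missing(partyid)) and (partyid le 2)) then party = 'Dem.';
  if ((partyid ge 3) and (partyid le 5)) then party = 'Ind.';
  if (partyid ge 6) then party = 'Rep.';
  gender=sex;
run;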
proc format;
  value VoteFmt   1 = 'Reagan'
                  2 = 'Carter'
                  3 = 'Clark'
                  4 = 'Anderson';
  value GenderFmt 1 = 'Male'
                  2 = 'Female';
run;
proc means data=panel80;
run;
proc contents data=panel80;
run;
proc freq data=panel80;
  tables gender*vote generation*vote gender*party generation*party / chisq;
  format gender GenderFmt. vote VoteFmt.;
  title 'Gender Gap';
run;
quit;
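
When a two-variable relationship looks interesting, one way to probe it is to control for a third variable. The following is a minimal sketch, assuming the program above has already been run in the same session. In a three-variable table request, Proc Freq treats the first variable as a stratifying variable and prints a separate table of the second variable by the third for each of its values. The title is only a suggestion.

```sas
/* Sketch only: assumes panel80 and the formats created by the program above. */
proc freq data=panel80;
  /* One gender-by-vote table, with its own chi-square test, for each party group. */
  tables party*gender*vote / chisq nocol nopercent;
  format gender GenderFmt. vote VoteFmt.;
  title 'Gender Gap Within Party Groups';
run;
```

Keep in mind that splitting the sample leaves fewer respondents in each table, so check the cell counts before relying on a chi-square test.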