Mathematical Modeling of Complex Systems

Dr. Courtney Brown

Assignment #2

This next assignment is closely related to the previous assignment. In the previous assignment, you worked with a first-order difference equation with constant coefficients. You had R produce and plot a collection of data points using that difference equation. Then you used those data points in a regression program to see that you could recover the intercept and slope of the difference equation.

In this assignment, you will work with real data. All of the data are for the United States, and are supplied by your instructor. You will again work with R. The data which you will use for this assignment are (1) labor union membership as a proportion of the total labor force from 1930 to 1970, (2) the proportion of females in the total labor force from 1948 to 1968, (3) the proportion of females aged 45 to 64 in the total female labor force from 1958 to 1968, (4) the proportion of married females in the female U.S. labor force from 1948 to 1968, (5) the infant mortality rates for nonwhites per 1,000 live births from 1921 to 1970, (6) infant mortality rates for whites per 1,000 live births from 1921 to 1970, (7) the proportion of national electrical energy use per capita in the U.S. in million kilowatt-hours from 1920 to 1970, and (8) the proportion of national outlays for national defense from 1948 to 1968.

For each of these items above, you will plot the first differences and get a regression line fitted to the differences. Note that your parameter estimates are of great interest to you. Use them to plot a difference equation trajectory on top of a plot of the actual data plotted over time. Thus, you will be handing in a pair of plots for each of the above items (a total of 16 plots).

Note that sometimes the difference equation plots a trajectory that goes right through the data when they are plotted over time, and sometimes things are not so tidy. Why to you think that should happen? (Hint: Look at the fit of the regression for the first differences.) Interpret your results for each pair of plots with a paragraph of two of text. Try to figure out from each pair of plots what you have learned about the world, and what have you learned about first-order linear difference equations. Write up what you have learned, and hand it in.

NOTE: The concept of this ingenious assignment was originally developed by Professor John Sprague.

GRADUATE STUDENTS ONLY: Experiment with different graphing approaches using R. Be creative. Find examples from the vast R help literature, and try to copy the most intriguing approaches.

Below is the R code that does the entire first part of this assignment (for the labor union membership). The SAS version of the same code is further below. Just cut and paste the R version into the R console and run it. If you are using SAS, place the code into the SAS editor and run it. Print out the plots and output, and you are ready to write up a bit of interpretation. Use this code below as a model for all of the other parts of the assignment. Follow the links above to get the data for each of the other parts. Cut and paste the new data for each part into the code below, change your variable labels appropriately, plus a few odds and ends, and you will be ready to do all of the parts in no time. Below is the code for the first part.

# AGAIN, WE START WITH THE R VERSION.

mydata <- read.table(file.choose(), header=TRUE) # first we get the data
attach(mydata)
# Now we create a function that allows us to lag a variable
lagvar <- function(x,y){
NAtotal <- rep(NA, y)
result <- c(NAtotal,x[-((length(x)-y+1):length(x))])
return(result)}

lagvar1 <- lagvar(laborun,1) # Here we call the lag function to give us a lagged variable
newdata <- cbind(mydata, lagvar1) # We include the new variable with our data set
newdata # we print out our data set
first.model <- lm(laborun ~ lagvar1)
first.model
summary(first.model)

# Now we produce a plot of the first differences of the data
plot(lagvar1, laborun, xlab="", ylab="", xlim=c(0,30), ylim=c(0,30), pch=19)
title(xlab="Labor Union Membership at t", ylab="Labor Union Membership at t+1", main="Figure 1: Plot of the first differences", cex=1.5, col="black", font=2)
abline(first.model, lwd=2)

# Now we create the first-order difference equation that fits our data.
y1 <- 6 # This is the initial condition. Approximate this by looking at the early data numbers.
y2 <- 0
t <- 0
a <- .93971 # This is the slope gotten from the first.model regression above.
b <- 1.55045 # This is the intercept gotten from the first.model regression above.
ystar <- b/(1-a) # This is the equilibrium value.
ystar # This prints out the equilibrium value.
timeserieslength <- 41 # Count the number of data values to get this number.
for (i in 1:timeserieslength) {
y2[i] <- (a*y1[i])+b
t[i] <- i
if (i < timeserieslength) y1[i+1]=y2[i]
}

windows() #This gives you a new graphics window so that your second plot does not erase your first plot.

plot(year, laborun, xlab="", ylab="", ylim=c(0,30), pch=19)
lines(year, y1, xlab="", ylab="", lwd=2)
title(xlab="Years", ylab="Labor Union Membership", main="Figure 2: U.S. Labor Union Membership, 1930-70", cex=1.5, col="black", font=2)

 

*BELOW IS THE SAS VERSION;

GOPTIONS lfactor=10 hsize=6 in vsize=6 in horigin=1 in vorigin=1 in;
options nocenter;
**********************************************************;
* CLASS, NOTE THAT IF YOU BEGIN A LINE WITH AN ASTERISK *
* THEN YOU CAN PUT NOTES IN YOUR PROGRAM FILES. THIS IS
* LIKE A COMMENT CARD IN SPSS. HOWEVER, REMEMBER
* TO EVENTUALLY PUT A FINAL SEMICOLON AT THE END OF YOUR COMMENTS.;
***********************************************************;
* NOTE THAT I INDENT SOME STATEMENTS. THIS
* IS JUST FOR NEATNESS.;
***********************************************************;
* COPYRIGHT (c) Courtney Brown 2004, All Rights Reserved;
* Permission granted to use this file and computer code for any nonprofit and
* educational purposes, including classroom instruction.
* No further permission required.
* Please cite source as "From www.courtneybrown.com";
***********************************************************;
*** DATA SET..... Labor Union Membership as a proportion of total labor force
*** from 1930-70 *****************
************************************************************;
DATA LABOR;
INPUT YEAR 1-4 LABORUN 6-10;
CARDS;
1930 6.8
1931 6.5
1932 6.0
1933 5.2
1934 5.9
1935 6.7
1936 7.4
1937 12.9
1938 14.6
1939 15.8
1940 15.5
1941 17.7
1942 17.2
1943 20.5
1944 21.4
1945 21.9
1946 23.6
1947 23.9
1948 23.1
1949 22.7
1950 22.3
1951 24.5
1952 24.2
1953 25.5
1954 25.4
1955 24.7
1956 25.2
1957 24.9
1958 24.2
1959 24.1
1960 23.6
1961 22.3
1962 22.6
1963 22.2
1964 22.2
1965 22.4
1966 22.7
1967 22.7
1968 23.0
1969 22.6
1970 22.6
;
DATA LABOR;SET;
LLABORUN=LAG(LABORUN);
symbol1 color=black v=NONE f=centb i=join;
symbol2 color=black f=centb v='.' height=2 interpol=R;
symbol3 color=black f=centb v='.' height=2;

PROC REG;
MODEL LABORUN=LLABORUN;
PROC GPLOT;
axis1 color=black
value=(h=1.5 f=swissb c=black)
label=(h=1.3 a=90 r=0 f=swissb c=black 'Labor Union Membership at t+1');
axis2 color=black
value=(h=1.5 f=swissb c=black)
label=(h=1.3 f=swissb c=black 'Labor Union Membership at t');
PLOT LABORUN*LLABORUN=2 /
vaxis=axis1 haxis=axis2 vminor=0 hminor=0;
TITLE 'Figure 1a: Union Membership in the U.S. Labor Force';
DATA TRAJECT;
A=.939710;
B=1.550449;
Y1=6;
DO T=1 TO 41;
Y2=(A*Y1)+B;
OUTPUT;
Y1=Y2;
END;
PROC SORT;
BY T;
DATA LABOR;SET LABOR;
T=YEAR-1929;
DATA COMBINE;
MERGE TRAJECT LABOR;
BY T;
PROC PRINT;
TITLE 'Figure 1b: Union Membership in the U.S. Labor Force';
PROC GPLOT;
axis1 color=black
value=(h=1.5 f=swissb c=black)
label=(h=1.3 a=90 r=0 f=swissb c=black 'Labor Union Membership');
axis2 color=black
value=(h=1.5 f=swissb c=black)
label=(h=1.3 f=swissb c=black 'Year');
PLOT LABORUN*YEAR='A' Y2*YEAR=1 / overlay
vaxis=axis1 haxis=axis2 vminor=0 hminor=0;
run;
quit;