# [Solution] STATS Module 5 Problem Set


Module 5 Problem Set
Due Date: May 04, 2016 23:59:59
Max Points:85
Details:
Some commonly employed statistical
analyses include correlation and regression. In this assignment, you will
practice correlation and regression techniques from an SPSS data set.
General Requirements:
Use the following information to
ensure successful completion of the assignment:
Review “SPSS Access
Instructions” for information on how to access SPSS for this
assignment.
Access the document, “Introduction
to Statistical Analysis Using IBM SPSS Statistics, Student Guide” to
complete the assignment.
Open “Bank.sav” with SPSS and use the data to complete the
assignment.
Open “Census.sav” with SPSS and use the data to complete the
assignment.
Directions:
Locate the data set
“Bank.sav” and open it with SPSS. Follow the steps in section 10.15
Learning Activity as written. Answer all of the questions in the activity
and include supporting graphs or tables from the SPSS output for submission to
the instructor.
Locate the data set
“Census.sav” and open it with SPSS. Follow the steps in section 11.16
Learning Activity as written. Answer all of the questions in the activity
and include supporting graphs or tables from the SPSS output for submission to
the instructor.
10.15 Learning Activity
The overall goal of this learning
activity is to visualize the relationship between two scale variables by creating
scatterplots and to quantify this relationship with the correlation
coefficient. In this set of learning activities you will use the data file Bank.sav.
The file Bank.sav is a PASW Statistics
data file that contains information on employees of a major bank. Included is
data on beginning and current salary position, time working, and demographic
information.
1. Suppose you are interested in
understanding how an employee's demographic characteristics, beginning salary,
and time at the bank and in the work force are related to current salary. Start
by producing scatterplots of salbeg, sex, time, age, edlevel, and work with
salnow. Add a fit line to each plot. Check on the variable labels for time and
work so you understand what these variables are measuring.
2. Describe the relationships based
on the scatterplots. Do they all appear to be linear? Are any relationships
negative? What is the strongest relationship?
3. Now produce correlations with all
these variables. Which correlations with salnow are significant? What is the
largest correlation in absolute value with salnow? Did this match what you
thought based on the scatterplots?
4. Examine the correlations between
the other variables. Which variables are most strongly related? Create
scatterplots for these as well to check for linearity.
5. For those with more time: Go back
and review the scatterplots with salnow. Are there any employees who are
outliers, far from the fit line, in any of the scatterplots? How might they be
affecting the relationship?
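To check the SPSS output from steps 1 to 3 by hand, the fit line and the correlation coefficient can be sketched in plain Python. This is only an illustration under stated assumptions: the salary figures are made-up values, not rows from Bank.sav, and `pearson_r` and `fit_line` are hypothetical helper names rather than anything SPSS provides.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares fit line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data (NOT taken from Bank.sav): beginning and current salary.
salbeg = [4000, 5000, 6000, 7000, 8000]
salnow = [9000, 10500, 13000, 14500, 17000]

r = pearson_r(salbeg, salnow)
intercept, slope = fit_line(salbeg, salnow)

# t statistic for testing whether the correlation is zero (n - 2 degrees of freedom).
t_stat = r * math.sqrt((len(salbeg) - 2) / (1 - r ** 2))

print(f"r = {r:.3f}, t = {t_stat:.2f}")
print(f"fit line: salnow = {intercept:.1f} + {slope:.2f} * salbeg")
```

The sign of `r` matches the direction of the fit line's slope, and its absolute value is what step 3 asks you to compare across variables. The significance level that SPSS reports is based on the same t statistic.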
11.16 Learning Activity
The overall goal of this learning
activity is to run linear regressions and to interpret the output. You will use
the PASW Statistics data file Census.sav.
The file Census.sav is a PASW
Statistics data file from a survey done on the general adult population.
Questions were included about various attitudes and demographic
characteristics.
Supporting Materials
For regression analysis, see:
Allison, Paul D. 1998. Multiple
Regression: A Primer. Thousand Oaks, CA: Pine Forge.
Draper, Norman and Smith, Harry.
1998. Applied Regression Analysis. 3rd ed. New York: Wiley.
1. Run a linear regression to
predict total family income (income06) with highest year of education (educ).
First, do a scatterplot of these two variables and superimpose a fit line. Does
the relationship seem linear? How would you characterize the relationship?
2. Now run the linear regression.
What is the Adjusted R square value? Is the regression significant? What is the
B coefficient for educ? Interpret it.
3. Now add the variables born (born
in the U.S. or overseas), age, sex, and number of brothers and
sisters (sibs). Check the coding on born so you can interpret its
coefficient. First, do a scatterplot of age and sibs with income06.
Superimpose a fit line. Does the relationship seem linear? How would you
characterize the relationship? Why not do scatterplots of income06 with sex
and born?
4. Use all these variables to
predict income06. Request residual statistics including the histogram of
errors and the scatterplot of standardized values. Also request casewise
diagnostics. What is the Adjusted R square? How much has it increased from
above?
5. Which variables are significant
predictors? What is the effect of each on income06? Which variable is
the strongest predictor? The weakest?
6. Examine the casewise diagnostics.
Do you see any pattern? Are there more cases with large errors than we would
expect?
7. Examine the histogram and
scatterplot. Are the errors normally distributed? Do you see any pattern in the
scatterplot? What might that mean?
8. What is the prediction equation
for income06?
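The quantities these steps ask about (the B coefficient, Adjusted R square, standardized residuals, and the prediction equation) can be sketched for the one-predictor case in plain Python. This is a minimal illustration under stated assumptions, not the SPSS procedure itself: the `educ` and `income` values are invented rather than taken from Census.sav, `simple_regression` is a hypothetical helper, and the cutoff of 3 standard deviations for flagging cases is assumed to mirror the casewise diagnostics default. The multiple regression in step 4 would need a matrix solver and is not shown.

```python
import math


def simple_regression(x, y):
    """Fit y = a + b * x by least squares and return summary statistics."""
    n = len(x)
    k = 1  # number of predictors in the model
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx              # B coefficient (slope)
    a = mean_y - b * mean_x    # constant (intercept)

    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)

    r2 = 1 - ss_res / ss_tot
    # Adjusted R square penalizes R square for the number of predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

    # Standard error of the estimate, used to standardize the residuals.
    se_est = math.sqrt(ss_res / (n - k - 1))
    std_resid = [e / se_est for e in residuals]

    return {"a": a, "b": b, "r2": r2, "adj_r2": adj_r2,
            "residuals": residuals, "std_resid": std_resid}


# Hypothetical data (NOT taken from Census.sav).
educ = [8, 10, 12, 12, 14, 16, 16, 18, 20]
income = [5, 8, 10, 12, 13, 15, 18, 19, 21]

fit = simple_regression(educ, income)

# Assumed cutoff: flag cases whose standardized residual exceeds 3 in absolute value.
flagged = [i for i, z in enumerate(fit["std_resid"]) if abs(z) > 3]

print(f"prediction equation: income = {fit['a']:.2f} + {fit['b']:.2f} * educ")
print(f"R square = {fit['r2']:.3f}, Adjusted R square = {fit['adj_r2']:.3f}")
print(f"cases flagged as large errors: {flagged}")
```

The B coefficient is read as the predicted change in the outcome for a one-unit increase in the predictor. If the errors are roughly normal, only a small fraction of cases should fall beyond 3 standard deviations, which is the comparison step 6 asks for.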