Regression

Please check the attached file.

Mall

Sales Size Windows Competitors Mall_Size Nearest_Competitor
4453 3860 39 12 943700 227
4770 4150 41 15 532500 142
4821 3880 39 15 390500 263
4912 4000 39 13 545500 219
4774 4140 40 10 329600 232
4638 4370 48 14 802600 257
4076 3570 37 16 463300 241
3967 3870 39 16 855200 220
4000 4020 44 21 443000 188
4379 3990 38 16 613400 209
5761 4930 50 15 420300 220
3561 3540 34 15 626700 167
4145 3950 36 14 601500 187
4406 3770 36 12 593000 199
4972 3940 38 11 347100 204
4414 3590 35 10 355900 146
4363 4090 38 13 490100 206
4499 4580 45 16 649200 144
3573 3580 35 18 685900 178
5287 4380 42 15 106200 149
5339 4330 40 10 354900 231
4656 4060 37 11 598700 225
3943 3380 34 16 381800 163
5121 4760 44 17 597900 224
4557 3800 36 14 745300 195

Data (Bank.xlsx)

Customer Balance Direct
1 1.22 0
2 1.56 0
3 2.10 0
4 2.25 0
5 2.89 0
6 3.55 0
7 3.56 0
8 3.65 1
9 3.77 0
10 3.88 0
11 3.98 0
12 4.09 0
13 4.26 0
14 4.57 0
15 4.91 0
16 5.24 0
17 5.54 0
18 5.73 0
19 5.88 0
20 6.17 0
21 6.28 0
22 6.52 0
23 6.70 0
24 6.74 1
25 6.80 0
26 6.85 1
27 6.93 0
28 6.98 0
29 7.26 1
30 7.42 0
31 7.66 0
32 7.76 0
33 7.78 1
34 7.94 0
35 8.10 1
36 8.60 0
37 9.75 1
38 9.79 0
39 10.11 1
40 10.15 0
41 10.19 1
42 10.37 1
43 10.92 0
44 11.35 1
45 11.69 0
46 14.43 1
47 15.07 1
48 18.45 1
49 24.98 0
50 26.05 1

Data (Lakeland.xlsx)

Student GPA Program Return
1 3.78 1 1
2 2.38 0 1
3 1.30 0 0
4 2.19 1 0
5 3.22 1 1
6 2.68 1 1
7 2.72 0 0
8 1.74 0 0
9 1.86 0 0
10 3.53 1 1
11 3.12 0 1
12 1.21 0 0
13 3.01 1 1
14 2.70 0 0
15 2.47 1 1
16 2.37 0 1
17 2.28 1 0
18 2.76 0 0
19 2.24 0 0
20 3.10 1 1
21 2.86 0 1
22 2.62 0 1
23 3.07 1 1
24 2.18 0 1
25 3.30 1 1
26 2.43 1 0
27 2.38 0 0
28 1.23 0 0
29 2.63 1 1
30 4.00 1 1
31 2.21 1 0
32 1.73 0 0
33 2.62 1 0
34 2.65 1 1
35 2.82 1 1
36 3.77 1 1
37 2.38 1 1
38 3.11 1 1
39 2.82 0 1
40 2.16 1 0
41 2.66 1 0
42 3.33 1 1
43 2.77 1 0
44 4.00 1 1
45 2.33 0 0
46 3.41 0 0
47 1.98 1 0
48 3.06 1 1
49 3.98 1 1
50 3.24 1 1
51 3.22 0 0
52 3.05 1 1
53 2.66 0 0
54 2.88 1 1
55 2.60 1 1
56 3.12 1 1
57 3.12 1 1
58 2.72 1 1
59 3.38 1 1
60 3.29 0 1
61 3.49 1 1
62 2.89 1 1
63 2.46 1 1
64 1.90 1 1
65 2.77 1 1
66 2.61 0 0
67 2.44 1 0
68 3.10 1 1
69 3.87 1 1
70 3.25 1 1
71 2.44 1 1
72 2.66 1 1
73 3.06 1 1
74 2.75 0 1
75 3.65 1 1
76 1.66 0 0
77 3.41 1 1
78 2.91 1 1
79 2.15 0 1
80 1.54 0 0
81 3.45 1 1
82 1.98 0 0
83 2.22 0 0
84 2.51 0 0
85 2.99 1 1
86 3.48 1 1
87 2.75 1 1
88 3.71 1 1
89 2.11 0 0
90 2.86 1 1
91 2.87 1 1
92 2.64 0 1
93 1.61 0 0
94 3.48 1 1
95 3.04 1 1
96 2.25 0 1
97 2.54 0 0
98 2.57 1 1
99 1.70 1 1
100 3.85 1 1

By beginning this assignment, you affirm that you will not give or receive any
unauthorized help, and that all work will be your own. You agree to abide by Seneca’s
Academic Integrity Policy, and you understand any violation of academic integrity will be
subject to the penalties outlined in the policy.

Problem 1 (35%) – File: MALL.XLS

A national chain of women’s clothing stores with locations in the large shopping malls
thinks that it can do a better job of planning more renovations and expansions if it
understands what variables impact sales. It plans a small pilot study on stores in 25
different mall locations. The data it collects consist of monthly sales, store size (sq. ft),
number of linear feet of window display, number of competitors located in the mall, size of
the mall (sq. ft), and distance to nearest competitor (ft).

1. Define a multiple regression model for the data. (6 marks)
2. Interpret the values of the coefficients in the model. (15 marks)
3. Test whether the model as a whole is significant. At the 0.05 level of significance,
what is your conclusion? (2 marks)
4. Use the model to predict monthly sales for each of the stores in the study. (6 marks)
5. Find and interpret the value of R² for this model. (2 marks)
6. Test the individual regression coefficients (i.e., check the test statistics
that SAS or Excel provides). At the 0.05 level of significance, what are your
conclusions? (2 marks)

7. If you were going to drop just one variable from the model, which one would you
choose? Why? (2 marks)
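
The fitting steps in items 1, 4 and 5 can be sketched in plain Python as a cross-check on the SAS or Excel output. This is a minimal sketch, not the required method: it assumes the model Sales = b0 + b1*Size + b2*Windows + b3*Competitors + b4*MallSize + b5*NearestCompetitor, uses the 25 rows from the Mall sheet above, and standardizes the predictors before solving the normal equations purely for numerical stability. The helper name `solve` and all variable names are illustrative. It does not compute the F or t tests in items 3 and 6.

```python
# Rows copied from the Mall sheet: sales, store size, window feet,
# competitors in mall, mall size, distance to nearest competitor.
ROWS = """
4453 3860 39 12 943700 227
4770 4150 41 15 532500 142
4821 3880 39 15 390500 263
4912 4000 39 13 545500 219
4774 4140 40 10 329600 232
4638 4370 48 14 802600 257
4076 3570 37 16 463300 241
3967 3870 39 16 855200 220
4000 4020 44 21 443000 188
4379 3990 38 16 613400 209
5761 4930 50 15 420300 220
3561 3540 34 15 626700 167
4145 3950 36 14 601500 187
4406 3770 36 12 593000 199
4972 3940 38 11 347100 204
4414 3590 35 10 355900 146
4363 4090 38 13 490100 206
4499 4580 45 16 649200 144
3573 3580 35 18 685900 178
5287 4380 42 15 106200 149
5339 4330 40 10 354900 231
4656 4060 37 11 598700 225
3943 3380 34 16 381800 163
5121 4760 44 17 597900 224
4557 3800 36 14 745300 195
"""
NAMES = ["Size", "Windows", "Competitors", "Mall Size", "Nearest Competitor"]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


data = [[float(v) for v in line.split()] for line in ROWS.strip().splitlines()]
y = [row[0] for row in data]
X = [row[1:] for row in data]
n, k = len(y), len(X[0])

# Center and scale each predictor so the normal equations are well conditioned.
means = [sum(row[j] for row in X) / n for j in range(k)]
scales = [sum((row[j] - means[j]) ** 2 for row in X) ** 0.5 for j in range(k)]
Z = [[(row[j] - means[j]) / scales[j] for j in range(k)] for row in X]
y_mean = sum(y) / n

# Normal equations (Z'Z) g = Z'(y - y_mean) on the standardized predictors.
ztz = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
zty = [sum(z[i] * (yi - y_mean) for z, yi in zip(Z, y)) for i in range(k)]
gamma = solve(ztz, zty)

# Convert back to coefficients on the original units.
slopes = [g / s for g, s in zip(gamma, scales)]
intercept = y_mean - sum(b * mu for b, mu in zip(slopes, means))

# Item 4: predicted monthly sales for each store. Item 5: R squared.
predicted = [intercept + sum(b * x for b, x in zip(slopes, row)) for row in X]
residuals = [yi - pi for yi, pi in zip(y, predicted)]
sse = sum(e * e for e in residuals)
sst = sum((yi - y_mean) ** 2 for yi in y)
r_squared = 1.0 - sse / sst

print("Intercept:", intercept)
for name, b in zip(NAMES, slopes):
    print(name, b)
print("R squared:", r_squared)
```

Each slope is the estimated change in monthly sales for a one-unit increase in that variable with the other four held fixed, which is the interpretation item 2 asks for.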

Problem 2 (35%) – File: Bank.xlsx

Community Bank would like to increase the number of customers who use payroll
deposit. Management is considering a new sales campaign that will require each branch
manager to call each customer who does not currently use payroll direct deposit. As an
incentive to sign up for payroll direct deposit, each customer contacted will be offered
free checking for two years. Because of the time and cost associated with the new
campaign, management would like to focus their efforts on customers who have the
highest probability of signing up for payroll direct deposit. Management believes that the
average monthly balance in a customer’s checking account may be a useful predictor of
whether the customer will sign up for direct payroll deposit. To investigate the
relationship between these two variables, Community Bank tried the new campaign
using a sample of 50 checking account customers who do not currently use payroll
direct deposit. The sample data show the average monthly checking account balance
(in hundreds of dollars) and whether the customer contacted signed up for payroll direct
deposit (coded 1 if the customer signed up for payroll direct deposit and 0 if not).

1. For the Community Bank data, use SAS to formulate the estimated logistic
regression equation. (5 marks)

2. Estimate the probability that customers with an average monthly balance of
$1000 will sign up for direct payroll deposit. (5 marks)

3. Suppose Community Bank only wants to contact customers who have a 0.50 or
higher probability of signing up for direct payroll deposit. What is the average
monthly balance required to achieve this level of probability? (10 marks)

4. What is the estimated odds ratio? What is the interpretation? (15 marks)
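
The four items above follow one chain of calculations, sketched here in plain Python as a cross-check on the SAS output. This is a minimal sketch under stated assumptions: the model is logit(p) = b0 + b1*Balance, the coefficients are found by Newton-Raphson starting from zero (an assumed fitting method that maximizes the same likelihood as standard logistic regression software), and the 50 observations are copied from the Bank data above. All variable names are illustrative.

```python
import math

# Average monthly balance (hundreds of dollars), in customer order 1..50.
balance = [1.22, 1.56, 2.10, 2.25, 2.89, 3.55, 3.56, 3.65, 3.77, 3.88,
           3.98, 4.09, 4.26, 4.57, 4.91, 5.24, 5.54, 5.73, 5.88, 6.17,
           6.28, 6.52, 6.70, 6.74, 6.80, 6.85, 6.93, 6.98, 7.26, 7.42,
           7.66, 7.76, 7.78, 7.94, 8.10, 8.60, 9.75, 9.79, 10.11, 10.15,
           10.19, 10.37, 10.92, 11.35, 11.69, 14.43, 15.07, 18.45, 24.98, 26.05]
# 1 if the customer signed up for payroll direct deposit, 0 if not.
direct = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
          0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
          0, 0, 0, 1, 0, 1, 0, 0, 1, 0,
          0, 0, 1, 0, 1, 0, 1, 0, 1, 0,
          1, 1, 0, 1, 0, 1, 1, 1, 0, 1]


def sigmoid(z):
    """Logistic function: converts a logit into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Newton-Raphson on the log-likelihood of logit(p) = b0 + b1 * balance.
b0, b1 = 0.0, 0.0
for _ in range(50):
    p = [sigmoid(b0 + b1 * x) for x in balance]
    # Gradient of the log-likelihood.
    g0 = sum(y - pi for y, pi in zip(direct, p))
    g1 = sum(x * (y - pi) for x, y, pi in zip(balance, direct, p))
    # Entries of the 2x2 information matrix.
    w = [pi * (1.0 - pi) for pi in p]
    h00 = sum(w)
    h01 = sum(wi * x for wi, x in zip(w, balance))
    h11 = sum(wi * x * x for wi, x in zip(w, balance))
    det = h00 * h11 - h01 * h01
    # Newton step: solve the 2x2 system in closed form.
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (h00 * g1 - h01 * g0) / det
    b0, b1 = b0 + d0, b1 + d1
    if abs(d0) + abs(d1) < 1e-10:
        break

# Item 2: a $1000 balance is 10 in the data's units (hundreds of dollars).
p_1000 = sigmoid(b0 + b1 * 10.0)
# Item 3: p = 0.50 where the logit is zero, so balance = -b0 / b1.
balance_50 = -b0 / b1
# Item 4: odds ratio for a one-unit ($100) increase in balance.
odds_ratio = math.exp(b1)

print("b0:", b0, "b1:", b1)
print("P(sign up | $1000):", p_1000)
print("Balance for p = 0.50 (hundreds of dollars):", balance_50)
print("Odds ratio per $100:", odds_ratio)
```

Because Balance is recorded in hundreds of dollars, the odds ratio describes the multiplicative change in the odds of signing up for each additional $100 of average monthly balance.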

Problem 3 (30%) – File: Lakeland.xlsx

Over the past few years, the percentage of students who leave Lakeland College at the
end of the first year has increased. Last year Lakeland started a voluntary one-week
orientation program to help first-year students adjust to campus life. If Lakeland can
show that the orientation program has a positive effect on retention, they will consider
making the program a requirement for all first-year students. Lakeland’s administration
also suspects that students with lower GPAs have a higher probability of leaving
Lakeland at the end of the first year. To investigate the relation of these variables to
retention, Lakeland selected a random sample of 100 students from last year’s entering
class. The data are contained in the data set named Lakeland.

1. Write the logistic regression equation relating the predictors x (GPA and Program) to
y (Return). (5 marks)
2. For the Lakeland data, use SAS to compute the estimated logistic regression
equation. (5 marks)
3. Use the estimated logit computed above to estimate the probability that students
with a 2.5 grade point average who did not attend the orientation program will
return to Lakeland for their sophomore year. What is the estimated probability for
students with a 2.5 grade point average who attended the orientation program?
(10 marks)

4. What is the estimated odds ratio for the orientation program? Interpret it. (5 marks)

5. Would you recommend making the orientation program a required activity? Why
or why not? (5 marks)
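
Once SAS reports the estimated logit, items 3 and 4 reduce to two small calculations, sketched below in Python. The sketch assumes the model logit(p) = b0 + b1*GPA + b2*Program, with Program coded 1 for students who attended orientation and Return coded 1 for students who came back (an assumption based on the column names in the Lakeland data). The coefficient values are placeholders chosen only to make the arithmetic easy to follow; they are NOT estimates from the Lakeland data and should be replaced with the SAS output. The function names are illustrative.

```python
import math


def return_probability(b0, b1, b2, gpa, program):
    """P(Return = 1) for the logit b0 + b1*gpa + b2*program."""
    logit = b0 + b1 * gpa + b2 * program
    return 1.0 / (1.0 + math.exp(-logit))


def odds_ratio(coef):
    """Odds ratio for a one-unit increase in the variable with this coefficient."""
    return math.exp(coef)


# PLACEHOLDER coefficients for illustration only, not fitted values.
b0, b1, b2 = -5.0, 2.0, 1.0

# Item 3: GPA 2.5, without and with the orientation program.
p_no = return_probability(b0, b1, b2, gpa=2.5, program=0)
p_yes = return_probability(b0, b1, b2, gpa=2.5, program=1)

# Item 4: odds ratio for the program, holding GPA fixed.
program_odds_ratio = odds_ratio(b2)

print("P(return | GPA 2.5, no program):", p_no)
print("P(return | GPA 2.5, program):", p_yes)
print("Odds ratio for program:", program_odds_ratio)
```

The odds ratio for Program equals the odds of returning for attendees divided by the odds for non-attendees at the same GPA, so a value above 1 would indicate higher odds of returning among students who attended.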
