2 Fundamentals of Statistical Analysis question, R required, 20 hours deadline

Questions:

Question 1 (40 points): Regression and MLE

We are interested in estimating the median home value in New England. For this, we employ a regression through the origin ($\beta_1 = 0$), as presented below:

$$Y_i = \beta X_i + \varepsilon_i$$

where $Y_i$ is the median home value in New England town $i$, and $X_i$ is a binary variable that equals 1 if the house is in town $i$ and 0 otherwise.

Let $Y_1, Y_2, \ldots, Y_n$ be independent, where

$$Y_i \sim N(\beta X_i, \sigma^2), \quad \varepsilon_i \sim N(0, \sigma^2).$$
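As a starting point for the parts that follow, the stated independence and normality assumptions imply the likelihood and log-likelihood below. This is only a sketch of the setup, with lowercase $x_i, y_i$ denoting observed values; the maximization itself is left to the parts of the question.

```latex
L(\beta, \sigma^2)
  = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left( -\frac{(y_i - \beta x_i)^2}{2\sigma^2} \right)

\ell(\beta, \sigma^2)
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta x_i)^2
```

Setting the partial derivatives of $\ell$ with respect to $\beta$ and $\sigma^2$ to zero gives the two estimating equations.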

  1. (15 points) Find the MLE of $\beta$, $\hat{\beta}_{MLE}$.

  2. (15 points) Find the MLE of $\sigma^2$, $\hat{\sigma}^2_{MLE}$.

  3. (10 points) Show that the sum of squares of error, SSE, can be written as:

$$SSE = \sum_{i=1}^{n} y_i^2 - \hat{\beta} \sum_{i=1}^{n} x_i y_i$$
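The SSE identity in part 3 can be sanity-checked numerically before proving it. The following is a minimal sketch in R on simulated data: the sample size, the coefficient value, the noise level, and all variable names are illustrative assumptions, not part of the assignment. It fits the no-intercept model with `lm()` and compares the residual sum of squares with the right-hand side of the identity.

```r
# Numerical check of SSE = sum(y^2) - beta_hat * sum(x * y)
# for a regression through the origin.
# All data below are simulated for illustration only.
set.seed(1)
n <- 50
x <- rbinom(n, size = 1, prob = 0.5)              # binary regressor, as in the model
y <- 300000 * x + rnorm(n, mean = 0, sd = 50000)  # Y_i = beta * X_i + eps_i

fit <- lm(y ~ 0 + x)                # "0 +" drops the intercept
beta_hat <- unname(coef(fit)["x"])  # least-squares estimate of beta

sse_direct   <- sum(residuals(fit)^2)             # SSE from the fitted residuals
sse_identity <- sum(y^2) - beta_hat * sum(x * y)  # right-hand side of the identity

all.equal(sse_direct, sse_identity)  # compares the two up to floating-point tolerance
```

A numerical match does not replace the algebraic proof, but it is a quick way to catch a sign error in the derivation.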

Question 2 (40 points): Confidence Interval

Let $Y_i$ still be the median home value in New England town $i$. Let the generated $Y$ below be the entire population data on the median value of New England homes, where $\mu = \$329{,}108$ and $\sigma = \$50{,}000$.

set.seed(12)
Y=rnorm(1000, mean=329108, sd=50000)

For steps 1 and 2, let's pretend we do not know $\mu$.

  1. (5 points) Take 100 samples of size 30 (without replacement) from the population of $Y$'s.

  2. (10 points) Calculate a 95% confidence interval for $\mu$ for each of the 100 samples.

  3. (10 points) How many of these confidence intervals include the true mean $\mu$?

  4. (15 points) Repeat steps 2 and 3 for 90% confidence intervals.
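The four steps above can be sketched in R as follows. This is one possible approach, not the assignment's official solution. It assumes that $\sigma$ is treated as unknown along with $\mu$, so a t-interval based on the sample standard deviation is used; if $\sigma = 50{,}000$ is taken as known, `qt(...) * sd(y)` would be replaced by `qnorm(...) * 50000`. It also assumes "the true mean" refers to the stated parameter 329108 rather than `mean(Y)` of the generated population. The helper names `ci_mean` and `coverage_count` are hypothetical.

```r
# Population as generated in the assignment.
set.seed(12)
Y <- rnorm(1000, mean = 329108, sd = 50000)

n_samples <- 100  # number of samples (step 1)
n <- 30           # size of each sample (step 1)

# Each column is one sample of size 30, drawn without replacement.
samples <- replicate(n_samples, sample(Y, size = n, replace = FALSE))

# t-based confidence interval for the mean of one sample
# (assumption: sigma is unknown); returns c(lower, upper).
ci_mean <- function(y, level = 0.95) {
  half_width <- qt(1 - (1 - level) / 2, df = length(y) - 1) * sd(y) / sqrt(length(y))
  c(lower = mean(y) - half_width, upper = mean(y) + half_width)
}

# Count how many of the intervals contain a given value of mu.
coverage_count <- function(samples, mu, level) {
  cis <- apply(samples, 2, ci_mean, level = level)  # 2 x n_samples matrix
  sum(cis["lower", ] <= mu & mu <= cis["upper", ])
}

mu_true <- 329108
coverage_count(samples, mu_true, level = 0.95)  # steps 2 and 3
coverage_count(samples, mu_true, level = 0.90)  # step 4
```

Because each 90% interval is nested inside the corresponding 95% interval, the 90% count can never exceed the 95% count for the same samples.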

Question 3 (20 points): Regression Estimation

  1. (7 points) Using the synthetic data provided below on median home values ($Y$) and towns in New England ($X$), estimate the regression from Question 1, i.e.,

$$Y_i = \beta X_i + \varepsilon_i$$

Are the coefficients statistically significant? Do not forget to use factor(X) as opposed to X in your regression!

housing=read.table("https://unh.box.com/shared/static/twmyqbvx0toxhvdv0n23c55e5cc3ipe4.csv", header = TRUE, sep=",", dec=".")
head(housing)
##          Y X
## 1 426419.3 7
## 2 416306.1 8
## 3 344116.1 9
## 4 453613.3 7
## 5 303323.9 5
## 6 314420.3 6
  2. (13 points) Check the residuals of the model. Are the assumptions satisfied? Why or why not?
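A minimal sketch of how both parts of Question 3 might be approached in R is shown below. To stay runnable offline it uses a simulated stand-in data frame, `housing_demo`, with the same two columns as `housing`; the stand-in values are invented for illustration and say nothing about the real data. It also assumes the no-intercept form from Question 1 carries over when `factor(X)` is used, and the choice of the Shapiro-Wilk and Bartlett tests is one common option among several.

```r
# Stand-in data with the same columns as the assignment's `housing` data
# frame (Y = median home value, X = town code). Simulated for illustration
# only; substitute the real `housing` object when running the assignment.
set.seed(3)
housing_demo <- data.frame(X = sample(5:9, size = 200, replace = TRUE))
housing_demo$Y <- 250000 + 25000 * housing_demo$X + rnorm(200, sd = 30000)

# Regression through the origin with town as a categorical regressor:
# "0 +" removes the intercept, so each coefficient is a town-level mean.
fit <- lm(Y ~ 0 + factor(X), data = housing_demo)
summary(fit)$coefficients  # estimates, standard errors, t values, p-values

res <- residuals(fit)

# Normality of residuals: Shapiro-Wilk test and a normal Q-Q plot.
shapiro.test(res)
qqnorm(res)
qqline(res)

# Constant variance across towns: Bartlett's test and residuals vs fitted.
bartlett.test(res, g = factor(housing_demo$X))
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

With a no-intercept factor model, the t-tests in the coefficient table test whether each town's mean home value is zero, which is worth keeping in mind when interpreting "statistically significant".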
