Bus 226 Assignment #1

You can work in groups of up to 4 people. Submit one assignment per group. Attach one compressed folder (a .zip file) containing all your data files.

Question #1:

Fun with distributions: Take random draws from the chi-squared distribution with 4 degrees of freedom (use =CHIINV(RAND(),4) in Excel) and plot three well-made histograms showing how the shape is affected by sample size. Use samples of 15, 150, and 1500 draws.
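For anyone who wants to cross-check their Excel draws, the following is a minimal Python sketch of the same idea. It relies on the fact that a chi-squared variable with 4 degrees of freedom is the sum of 4 squared standard normal draws. The function name `chi2_draws` and the seed value are illustrative choices, not part of the assignment.

```python
import random
import statistics

def chi2_draws(n, df=4, rng=None):
    """Return n draws from a chi-squared distribution with df degrees of freedom."""
    rng = rng if rng is not None else random.Random()
    # A chi-squared(df) variable is the sum of df squared standard normal draws.
    return [sum(rng.gauss(0, 1) ** 2 for _ in range(df)) for _ in range(n)]

rng = random.Random(226)  # arbitrary fixed seed so the run is repeatable
samples = {n: chi2_draws(n, rng=rng) for n in (15, 150, 1500)}

for n, draws in samples.items():
    # The theoretical mean is 4 and the theoretical standard deviation is sqrt(8).
    print(n, round(statistics.mean(draws), 2), round(statistics.stdev(draws), 2))
```

The lists in `samples` can then be passed to any plotting tool to draw the three histograms.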

Then, using the Central Limit Theorem, make three high-quality histograms of the sampling distributions created with the sample sizes above. Use 30 samples in each case.
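One reading of this step, sketched below in Python, is to draw 30 independent samples at each sample size and record the mean of each one, so that each histogram shows 30 sample means. That interpretation, the function name `sample_means`, and the seed are assumptions made for illustration.

```python
import random
import statistics

def sample_means(sample_size, n_samples=30, df=4, rng=None):
    """Return the means of n_samples chi-squared samples, each of size sample_size."""
    rng = rng if rng is not None else random.Random()
    means = []
    for _ in range(n_samples):
        # Each draw is a sum of df squared standard normals, i.e. chi-squared(df).
        draws = [sum(rng.gauss(0, 1) ** 2 for _ in range(df))
                 for _ in range(sample_size)]
        means.append(statistics.mean(draws))
    return means

rng = random.Random(226)  # arbitrary fixed seed so the run is repeatable
sampling = {n: sample_means(n, rng=rng) for n in (15, 150, 1500)}

for n, means in sampling.items():
    # The spread of the sample means should shrink roughly like 1/sqrt(n).
    print(n, round(statistics.stdev(means), 3))
```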

In each of the 6 cases, include a box-plot as well. Use the same scale across all 6 box-plots. What is the ratio of the standard deviation to the interquartile range (IQR) in each case?
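The ratio itself can be checked with a short Python sketch like the one below. The helper name `sd_to_iqr` is hypothetical, and the "inclusive" quartile method is chosen on the assumption that it matches Excel's QUARTILE function; other quartile conventions give slightly different answers. For reference, a normal distribution has a ratio of about 0.74, since its IQR is roughly 1.349 standard deviations.

```python
import statistics

def sd_to_iqr(data):
    """Return the sample standard deviation divided by the interquartile range."""
    # method="inclusive" interpolates between data points and treats the
    # minimum and maximum as the 0th and 100th percentiles.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return statistics.stdev(data) / (q3 - q1)

toy_data = [1, 2, 3, 4, 5]  # made-up values, only to show the call
ratio = sd_to_iqr(toy_data)
print(round(ratio, 4))
```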

Question #2:

In class, we discussed the formulas for the sample mean and standard error for quantitative data. If the data are qualitative, we instead use the sample proportion and the standard error of the proportion. The formula for the sample proportion is p = (number of items with the characteristic of interest)/(sample size). The formula for the standard error of the proportion is $\sigma_p = \sqrt{\pi(1-\pi)/n}$, where π is the population proportion and n is the sample size.
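As a rough illustration of these two formulas, here is a small Python sketch. The function names and all of the numbers are made up for the example and are not the assignment data; the standard error uses the usual form sqrt(π(1 − π)/n).

```python
import math

def sample_proportion(responses, target="Y"):
    """Fraction of responses equal to target."""
    return sum(1 for r in responses if r == target) / len(responses)

def se_proportion(pi, n):
    """Standard error of the proportion: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi * (1 - pi) / n)

# Hypothetical responses, NOT the Netflix data from the question.
responses = "Y N Y Y N Y N Y Y Y".split()
p = sample_proportion(responses)   # 7 of the 10 made-up responses are "Y"
se = se_proportion(0.5, 100)       # hypothetical pi = 0.5 and n = 100
print(p, round(se, 4))
```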

Suppose you have the following data where Y represents yes and N represents no for a sample of 50 college students responding to the question “Do you currently have access to Netflix?”

Y Y Y Y Y N Y N Y N Y N Y Y Y Y Y Y Y Y Y Y Y Y Y N Y Y Y Y Y Y Y N Y Y Y Y Y N Y Y Y Y Y Y Y N Y N

Determine the sample proportion, p, of college students who have access to Netflix.

If the population proportion is 0.45, determine the standard error of the proportion.

If the total population of the university is 16,524, what is the finite population correction (fpc) factor? Knowing the fpc factor allows for a more accurate estimate of the standard error, i.e. standard error = $\sqrt{\pi(1-\pi)/n} \times \text{fpc}$.
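The sketch below shows how the correction could be applied in Python, using the standard definition fpc = sqrt((N − n)/(N − 1)), where N is the population size and n is the sample size. The function names and the numbers are hypothetical and chosen only to show the mechanics.

```python
import math

def fpc(N, n):
    """Finite population correction factor: sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

def se_proportion_fpc(pi, n, N):
    """Standard error of the proportion multiplied by the fpc factor."""
    return math.sqrt(pi * (1 - pi) / n) * fpc(N, n)

# Hypothetical values: population of 1000, sample of 100, pi = 0.5.
factor = fpc(1000, 100)
adjusted = se_proportion_fpc(0.5, 100, 1000)
print(round(factor, 4), round(adjusted, 4))
```

Because the factor is below 1 whenever n is greater than 1, the corrected standard error is always smaller than the uncorrected one, and the difference is small when the sample is a small share of the population.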

The Z-value for the sampling distribution of the proportion is $Z = (p - \pi)/\sqrt{\pi(1-\pi)/n}$. How does the Z-value change when the fpc is taken into account?
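A minimal Python sketch of the Z-value calculation follows, with an optional population size N that switches the correction on. The function name `z_value` and the inputs are hypothetical; the formulas are the standard Z = (p − π)/SE and fpc = sqrt((N − n)/(N − 1)).

```python
import math

def z_value(p, pi, n, N=None):
    """Z-value of a sample proportion p; applies the fpc when N is given."""
    se = math.sqrt(pi * (1 - pi) / n)
    if N is not None:
        # Shrink the standard error by the finite population correction factor.
        se *= math.sqrt((N - n) / (N - 1))
    return (p - pi) / se

# Hypothetical values: p = 0.6, pi = 0.5, n = 100, N = 1000.
z_plain = z_value(0.6, 0.5, 100)
z_fpc = z_value(0.6, 0.5, 100, N=1000)
print(round(z_plain, 3), round(z_fpc, 3))
```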

Question #3:

Diamonds are categorized according to the “four C’s”: carats, clarity, color, and cut. Each diamond stone that is sold on the open market is provided a certificate by an independent diamond assessor that lists these characteristics. Data for 308 diamonds were extracted from Singapore’s Business Times and are saved in the Diamonds.xls file uploaded to our course files at myclass.ufv.ca.

Color is classified as D, E, F, G, H, or I, while clarity is classified as IF, VVS1, VVS2, VS1, or VS2. Use a graphical technique to summarize the color and clarity of the 308 diamond stones. Which color and which clarity occur most often? Least often? Do any of the colors or clarities look like they affect prices? Which ones?

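Form 95% confidence intervals around the mean price of each color. As an empirical rule, about 95% of data lies within two standard deviations of the mean; for a sample mean, that is between $\bar{x} - 2s/\sqrt{n}$ and $\bar{x} + 2s/\sqrt{n}$. How many of the intervals overlap? What does this tell you?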
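One way to sketch the interval and overlap check in Python is shown below. It assumes the interval is the sample mean plus or minus two standard errors, where the standard error is s/sqrt(n). The function names are hypothetical and the prices are invented for illustration; they are not taken from Diamonds.xls.

```python
import math
import statistics

def ci95(values, k=2.0):
    """Approximate 95% interval for the mean: mean +/- k standard errors."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return (mean - k * se, mean + k * se)

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Made-up prices for two colors, NOT values from the diamond data set.
hypothetical_prices = {
    "D": [900, 1100, 1000, 1200, 800],
    "I": [400, 500, 450, 550, 600],
}
intervals = {color: ci95(prices) for color, prices in hypothetical_prices.items()}
print(intervals)
print(intervals_overlap(intervals["D"], intervals["I"]))
```

With groups as small as these invented ones, a t-based multiplier would be more appropriate than k = 2; the value is left as a parameter for that reason.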