Purchase Solution

Descriptive Statistics & Charts

Not what you're looking for?

Ask Custom Question

See Data file attached.

The file batonrougesufa10.xlsx contains data on houses sold in Baton Rouge, Louisiana in mid-2005. The dataset contains the following variables:

Price: The price the house sold for
Square Feet: The size of the house in square feet
Beds: The number of bedrooms in the house
Baths: The number of bathrooms in the house
Year Built: The year in which the house was built
Age: The age of the house in years
Pool: Does the house have a pool (yes/no)
Fireplace: Does the house have a fireplace (yes/no)
Waterfront: Is the house on the waterfront (yes/no)
Days on Market: How many days the house spent on the market before it was ultimately sold
Occupancy: Is the house occupied by the owner, a tenant, or is it vacant
Style: What is the architectural style of the house?

Problem 1.1
Use the Baton Rouge house price data to do the following:

(a) Which of the variables are categorical and which are numerical?
(b) Which of the categorical variables are nominal and which are ordinal?
(c) Which of the numerical variables are ratio and which are interval?

Problem 1.2
Use the Baton Rouge house price data to do the following:

(a) Create a summary table and an ordered summary table of the architectural styles of the homes sold.
(b) Construct a bar chart, a pie chart, and a Pareto diagram.
(c) Which graphical method do you think is best to portray these data?
(d) Based on this data, what conclusions can you make about architectural styles in Baton Rouge?

Excel tips:
The easiest way to create the frequency summary table is with the COUNTIF command. The command works like this: =COUNTIF(x,y), where x is the set of cells you want to look in for a particular value, and y is the value you are looking for. For example, =COUNTIF(K:K, "Vacant") would look in the K column for all instances of the term "Vacant", count them up, and give you the number.

Excel doesn't have a "canned" option to create a Pareto Diagram, so this is what you need to do: First, construct a column chart with data for both percentage and cumulative percentage. Then, click on one of the columns that represents data from one of the cumulative percentages, choose "change series chart type," and set it to line.

Problem 1.3
Use the Baton Rouge house price data to do the following:

(a) Construct a frequency distribution and a percentage distribution of the number of bedrooms in the sold houses.
(b) Construct a histogram and a percentage polygon.
(c) Plot a cumulative percentage polygon.

Excel tips:
As a general rule of thumb, the fewer times you type out a formula, the better. If you can accomplish a task by writing one formula and then filling it down with the fill bar, do it that way so you can minimize your chances of making mistakes. One feature that is very useful for accomplishing this task is making use of relative and absolute cell references. Say, for example, in cell C1 you have the expression =A1+B1. If you fill C1 down to C2, C2 will have the equation =A2+B2...when filling down, Excel viewed your cell references as relative, as if you said in C1 to make C1 equal to the sum of the two cells to the left of it. When you fill down to C2, Excel said that C2 should equal the sum of the two cells to the left of it, in this case A2 and B2. In some cases this is exactly what you want Excel to do, but in others you do not want relative cell references. For example, cell D1 may have total nationwide sales for your company, A1:A51 may have the names of the 50 states (plus DC!), and B1:B51 may have total sales within each state. In column C you want to have the percentage of total sales within that state. If in C1 you type =B1/D1, you will get the correct result, but then if you fill D1 down to D51, you will get error messages (or wrong answers) everywhere else. The easiest fix is to use an absolute reference in your equation, which you accomplish with the dollar sign ($). The $ is essentially a way of "locking" in either a row or column in a cell reference. If you type in C1 =B1/D$1, and then fill down, Excel will "lock in" the first row in the reference, so all of your formulae will compute correctly! With knowledge and appropriate application of relative and absolute references, it is possible to create the table in part (a) by typing exactly 4 equations (and filling them down) and nothing else!

Problem 1.4
Use the Baton Rouge house price data to do the following:

(a) Compute the mean, median, first quartile, and third quartile for the price variable.
(b) Compute the variance, standard deviation, range, interquartile range, coefficient of variation, skewness, and Z scores for the price variable.
(c) Are the data skewed? If so, how?
(d) Based on the results of (a) through (c), what conclusions can you reach concerning price?
(e) Calculate the proportion of house prices that are +/- 1, +/- 2, and +/- 3 standard deviations of the mean.
(f) Compare and contrast your findings with what would be expected on the basis of the empirical rule.
Excel Tips:
Excel has built-in functions to calculate the mean (AVERAGE), median (MEDIAN), quartiles (QUARTILE), variance (VAR for samples, VARP for populations), and standard deviation (STDEV for samples, STDEVP for populations). You might also think to use the MIN and MAX commands in calculating range. Search the Excel helpfile for the appropriate syntax of these commands. You should note that the method Excel uses to calculate quartiles differs slightly from the method outlined in the book.

I mentioned the COUNTIF command above, and you may have thought to use COUNTIF to accomplish part (e). However, a single COUNTIF command cannot handle more than one condition, so if you want to use COUNTIF you should absolutely think outside the box a bit. Or, if you are using Excel 2007 or later, there is a COUNTIFS command that handles multiple conditions.

Problem 1.5
Use the Baton Rouge house price data to do the following:

(a) Compute the covariance between house price and house size.
(b) Compute the coefficient of correlation between house price and house size.
(c) Construct a scatterplot between house price and house size with house price on the Y-Axis.
(d) Which of (a)-(c) do you think is the most valuable means to understand the relationship between house price and house size?
(e) What conclusions can you reach about the relationship between house price and house size?
Excel Tips:
Excel has built-in functions to calculate the covariance (COVAR) and correlation coefficient (CORREL). Search the Excel helpfile for the appropriate syntax of these commands.

Purchase this Solution

Solution Summary

The solution provides step by step method for the calculation of descriptive statistics and the construction of bar chart, pie chart, pareto distribution frequency polygon, histogram etc. Formula for the calculation and Interpretations of the results are also included.

Purchase this Solution


Free BrainMass Quizzes
Terms and Definitions for Statistics

This quiz covers basic terms and definitions of statistics.

Measures of Central Tendency

Tests knowledge of the three main measures of central tendency, including some simple calculation questions.

Measures of Central Tendency

This quiz evaluates the students understanding of the measures of central tendency seen in statistics. This quiz is specifically designed to incorporate the measures of central tendency as they relate to psychological research.

Know Your Statistical Concepts

Each question is a choice-summary multiple choice question that presents you with a statistical concept and then 4 numbered statements. You must decide which (if any) of the numbered statements is/are true as they relate to the statistical concept.