Chemistry Homework Solutions
Problem
#33439

how to use excel to perform Q test and generate histogram

Please see attachment

Thanks

Attached file(s):
Attachments
s1.xls  View File
S2.xls  View File
Appendix D.pdf  View File

Attachment Content Summary (Note: view attachment at the above link before purchasing. Actual attachment content may vary slightly from that shown below.)

Appendix D.pdf
D-1

Appendix D

Statistical Treatment of Experimental Data


This Appendix is a very brief review of this topic. A more detailed review can be found in, for example, Fundamentals of
Analytical Chemistry, 7th ed., by D. A. Skoog, D. M. West, and F. J. Holler, Saunders College Publishing, chapters 2, 3,
and 4.

Every laboratory measurement has associated with it some level of uncertainty that cannot be exactly determined but must
be estimated to give the investigator a handle on the reliability of the measurement. Uncertainties can arise due to
determinate and indeterminate errors.

Determinate errors are those with assignable causes and definite values which, if it is realized that they exist, can be
eliminated or at least compensated for. These errors, however, left unchecked, can give erroneous answers and poor results.
Instrumental and equipment errors belong to this category. If a 250 mL volumetric flask really contains 249 mL, an error
will always result. This can be checked and corrected each time the flask is used. Calibrations can eliminate these errors.
Note, however, if this is not corrected for, experimental results may be very precise (reproducible) but not very accurate
(close to the correct value). Keep in mind the difference between precision and accuracy in this course.

Other major causes of determinate errors include methodology (slow reactions, incomplete reactions, side reaction,
selectivity instead of specificity, and chemical instabilities) and personal errors (bias, color blindness, incorrect readings,
and inaccurate recordings). These must be minimized if not eliminated. Too often, however, it is not realized that they are
present, and they are left unchecked.

Even if all determinate errors are handled, indeterminate errors are always present. Indeterminate errors are due to
inherent limitations in experimental equipment and procedures. These must be minimized but can never be totally
eliminated.

Consider a problem as simple as measuring 50-mL of water. If you use a beaker (probably not a good choice since beaker
markings are approximate at best), you cannot fill it to "exactly" the line. What does that mean? The line has a finite width
and your eyes have a finite ability to see the exact same spot on the line reproducibly. Even if you used a pipet (a much
better choice) you could never measure exactly 50-mL. Volumes are difficult to measure, so get it by measuring the mass
and look up the density and calculate volume. Still, mass measurements contain indeterminate errors (our balances read to
0.0001 g) but that does not mean if the water weighs 49.9000 g, it is exactly that. We often report this as 49.9000 g ± .0001
g showing the uncertainty in the measurement. More accurate balances are available but further problems occur
(temperature dependencies, humidity, etc.).

We are stuck with indeterminate errors but hope to minimize these with use of precision equipment. Also, indeterminate
errors are often referred to as random errors since positive and negative errors should be equally likely (note that this was
not true for most determinate errors). Therefore, by doing more than one analysis and obtaining an "average" value, we can
not only get idea of where the true value (central tendency) is and hope that the errors will somewhat cancel out or at least
minimized. If all determinate errors are minimized and we carefully analyze a sample a number of times, good precision
should imply good accuracy.

Beginning students in analyses often feel that if three trials are good, why not ten or twenty? We will find, however, that
three careful measurements averaged to give the reported value will be superior to more trials (where fatigue leads to
sloppiness and gigantic wastes of time). One well done trial would suffice except that you have no feeling for accuracy
since you have no idea of precision. Two trials are fine except that if precision is poor, you are uncertain which trial is
better. With three, often two are close and one is off, and we will discuss guidelines on how to determine if you should
throw out the third one in calculating your "best value." Do not be too quick in discarding it however (unless you can
identify a determinate error that led to the deviation). Remember, one trial always has perfect precision, but not necessarily
good accuracy.
D-2
Formulas

Central Tendency:
n

xi =1
i
Average or mean ( x) = , median = middle value, mode = most frequently occurred value.
n

Range:
n n

(x
i =1
i - x) (x
i =1
i - x) 2
deviation = x i - x , average deviation = , standard deviation (s) =
n n -1


Sy
relative standard deviation (S y ) r =
y
x × 100
coefficient of variance (CV) =
x
t×s
confidence limit = x ± , (t taken from Table I, next page)
n

n n n

i =1
xi y i - i =1
xi × y
i =1
i /n
least squares line slope (m) = 2
n n
i =1
x i2 -
xi / n
i =1


n n

i =1
yi x
i =1
i
least squares line intercept (b) = -m× , where m is least squares line slope.
n n

Propagation of errors
Sums and differences S y = S a + S b + S c + ...
2 2 2


Products and quotients (S y ) r = (S a ) r + (S b ) r + (S c ) r + ...
2 2 2




Note: If both sums and differences as well as products and quotients are involved, the uncertainties associated with the
former are evaluated first.
D-3

Table I. t-values for Various Levels of Confidence

t×s
x± = confidence limit
n

x is the average value, t is the t-value, s is the standard deviation, and n is the number of data points.

Student-t values at various Confidence Interval
(N-1)
Degress of
Freedom 80% 90% 95% 99% 99.9%
1 3.08 6.31 12.7 63.7 637
2 1.89 2.92 4.30 9.92 31.6
3 1.64 2.35 3.18 5.84 12.9
4 1.53 2.13 2.78 4.60 8.60
5 1.48 2.02 2.57 4.03 6.86
6 1.44 1.94 2.45 3.71 5.96
7 1.42 1.90 2.36 3.50 5.40
8 1.40 1.86 2.31 3.36 5.04
9 1.38 1.83 2.26 3.25 4.78
10 1.37 1.81 2.23 3.17 4.59
11 1.36 1.80 2.20 3.11 4.44
12 1.36 1.78 2.18 3.06 4.32
13 1.35 1.77 2.16 3.01 4.22
14 1.34 1.76 2.14 2.98 4.14
1.29 1.64 1.96 2.58 3.29



Table II. Critical Values for Rejection Quotient Q

Qcrit (Reject if Qexp > Qcrit)
Number of
Observations 90% Confidence 96% Confidence 99% Confidence
3 0.94 0.98 0.99
4 0.76 0.85 0.93
5 0.64 0.73 0.82
6 0.56 0.64 0.74
7 0.51 0.59 0.68
8 0.47 0.54 0.63
9 0.44 0.51 0.60
10 0.41 0.48 0.57


Before doing the Q test, arrange the data in increasing (or decreasing) order.

difference between questionab le point and its nearest neighbor
Qexp =
range of data (largest minus smallest)
D-4

Practice problem. Before doing this problem, make sure you are familiar with what is mentioned in Spreadsheet and Word
Processing ­ Short Version.

A B C D E F G H I
1 92.65 Bin range Bin Frequency
2 84.31 n= 59 70 70 0
3 78.73 max = 94.96 75 75 0
4 94.02 min = 75.11 80 80 4
5 75.11 mean = 86.38 85 85 18
6 89.58 median = 86.44 90 90 26
7 83.62 mode = 84.31 95 95 11
8 83.54 sd = 4.24 More 0
9 87.72 90%CL = 87.28 85.47
10 89.73 99%CL = 87.80 84.95
11 82.83
12 88.85
13 88.56 Histogram
14 89.00
15 83.99 30
16 92.71
17 90.12 25
Frequency




18 88.10 20
19 87.15 15
20 87.35
21 78.76 10
22 87.91 5
23 90.26
0
24 85.70
25 80.13 70 75 80 85 90 95
26 83.83 Bin
27 88.42
28 86.16
29 92.98
30 93.13
31 83.86
32 83.68
33 93.34
34 86.44
35 81.56
36 84.31
37 85.76
38 94.96
39 79.91
40 80.13
41 90.27
42 88.08
43 85.02
44 85.07
45 81.39
46 88.94
47 87.54
48 88.73
49 89.54
50 85.00
51 80.24
52 83.36
53 85.43
54 88.02
55 90.30
56 88.16
57 84.12
58 85.94
59 82.25
D-5
Assuming that the values in column A are collected from an experiment, these numbers are now subjected to certain
statistical treatment. Given this set of numbers you are suppose to try to reproduce the results and graph shown on the
previous page.

To obtain the numbers:
1. Go to WebCT Chem6CL Homepage and click on Submit/Receive Information.
2. On the next page click on AppendixD_data.
3. Save the spreadsheet on your hard drive or floppy disk before doing the following. Note all shaded cells must contain a
formula.

1. D2 gives the total number of data points. It is calculated by using the formula =count(A1:A59).

2. D3 shows the largest of the 59 values. What formula should be used to obtain the value 94.96?

3. D4 shows the smallest of the 59 values. What formula should be used?

4. D5 ­ D10, E9, and E10 represent average, median, mode, standard deviation, 90% and 99% confidence limits,
respectively, of the 59 numbers. Use the appropriate formulae to calculate each of these values. Consult a statistical text
on the meaning and significance of average, median, mode, and confidence limits and their relationship, if any.

The remaining part of this problem involves generating a histogram for this set of numbers. Microsoft Excel provides a
number of data analysis tools called the Analysis ToolPak. Histogram is one of the tools. It can be found under Data
Analysis... when Tools on the menu bar is clicked. If the Data Analysis is not there, you need to install the Analysis
ToolPak in Microsoft Excel. Here are the directions for installing the ToolPak.

a. On the Tools menu, click Add-Ins.
If Analysis ToolPak is not listed in the Add-Ins dialog box, click Browse and locate the drive, folder name, and file
name for the Analysis ToolPak add-in, Analys32.xll - usually located in the Library\Analysis folder - or run the Setup
program if it isn't installed. The Microsoft Office CD may be required.
b. Select the Analysis ToolPak check box.


5. Bin Range. Bin range refers to a column of equally spaced numbers, bracketing the maximum and minimum values of
the data set. Since max = 94.96 and min = 75.11, you should set the bin range to be 70, 75, 80, 85, 90, and 95. Enter these
six numbers in F2 ­ F7 one at a time or set up a formula to do this. Note that they are in ascending order. In general,
· Use max and min to bracket the x-axis. If this is done correctly, the number next to More (see cell H8) will be zero.
· Try to keep the number of bins 20 or less.
· The bins should be whole numbers, if possible.

6. Histogram Table. The next step is to set up a table that counts the number of times (i.e. frequency) the numbers in the
data set fall in a certain bin range. This is done automatically by the spreadsheet:

a. On the spreadsheet, click Tools, on the drop down menu, click Data Analysis....
b. On the Data Analysis dialog box, highlight Histogram (as shown in the Data Analysis dialog box) and click OK.
D-6

c. Enter the three cell references exactly as they appear below. Make sure you understand what each of them mean. If not,
ask.




d. When OK is clicked, the histogram table (i.e., G1 ­ H8) will appear automatically.

e. Construct the histogram. In constructing the histogram you will come across the ChartWizard ­ Step 1 of 4 ­ Chart
Type dialog box. Select the chart type as indicated.
D-7

f. In the next two dialog boxes, enter the cell references exactly as indicated.
D-8

g. Finally, enter the title, x- and y-axes labels. Make sure you understand what boundary values are for a particular bin.
For example, what does bin 80 mean?


For several of the experiments, class data will be provided. For these experiments you will be asked to perform the same
type of operations as you have done here. Namely, calculate the parameters indicated in C2 ­ C10 and generate the
histogram (see page D-4) on a spreadsheet. These parameters and the histogram are in turn copied and pasted into the
discussion section of your report.
Solution
What is this?
By OTA - Overall OTA Rating
Suraj Joshi, PhD (IP) - 4.7/5
Purchase Cost Now
$2.19 CAD (was ~$19.95)
Included in Download
  • Plain text response
  • Attached file(s):
    • s1.xls
    • S2.xls
$2.19 Instant Download
Add to Cart
Why you can trust BrainMass.com
  • Your Information is Secure
  • Best Online Academic Help Service
  • Students find real academic Success
Related Solutions
Browse