<?xml version="1.0" encoding="UTF-8" ?>
  <resource>
  <id>6411</id>
  <path>/www/nrich/html/content/id/6411/</path>
  <resourceTypeID>1</resourceTypeID>
  <last_published>2011-06-16T17:25:13</last_published>
  <indexXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;
The probability density functions for two related, but unknown,
distributions are given in the following accurately plotted
chart.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
It is known that the means of the distributions are whole numbers,
and that the two pdfs only have a single turning point.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
By numerically estimating the required integrals, what can you
deduce with certainty about the two means?&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;mdo:image height=&quot;383&quot; width=&quot;591&quot; src=&quot;curves.jpg&quot; alt=&quot;&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;&lt;/mdoxml&gt;</indexXML>
  <solutionXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;
&lt;h4&gt;Probability Density Functions&lt;/h4&gt;
&lt;br&gt;&lt;/br&gt;
&lt;p&gt;The probability density function, or PDF, of a continuous random variable describes how probability is spread over the values the variable can take. The probability that the variable lies between two values is given by the integral of the density function between these values. &lt;br&gt;&lt;/br&gt;
 &lt;br&gt;&lt;/br&gt;
We know that the sum of the probabilities of all possible outcomes is 1. So the integral of the PDF over all possible values of the variable is equal to 1.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
We also need to know how to calculate the mean of the variable from the PDF. We first recall the definition of the mean of a discrete variable: $ \bar{x}=\sum_x x \,Pr(X=x)$&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
For a continuous variable, probabilities come from integrating the PDF, so the sum becomes an integral: $$ \bar{x}=\int xf(x) \,dx $$  where $f(x)$ is the density function. &lt;br&gt;&lt;/br&gt;
 &lt;/p&gt;
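&lt;p&gt;As a rough numerical check of this formula, the sketch below replaces the integral by a sum over narrow strips, using the midpoint of each strip as the value of the variable. The density $f(x)=2x$ on $[0,1]$ is an assumed example (it is not one of the curves in the problem), chosen because its exact mean is $2/3$. The function names and the number of strips are illustrative choices.&lt;/p&gt;

```python
def mean_from_pdf(f, lower, upper, strips=1000):
    """Approximate the integral of x*f(x) over [lower, upper] with midpoint strips."""
    width = (upper - lower) / strips
    total = 0.0
    for i in range(strips):
        midpoint = lower + (i + 0.5) * width
        # The probability of landing in this strip is roughly f(midpoint) * width.
        total += midpoint * f(midpoint) * width
    return total


def example_density(x):
    """Assumed example density f(x) = 2x on [0, 1]; it integrates to 1."""
    return 2 * x


estimate = mean_from_pdf(example_density, 0.0, 1.0)
print(round(estimate, 4))
```

&lt;p&gt;The printed estimate agrees with the exact value $2/3$ to four decimal places.&lt;/p&gt;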
&lt;h4&gt;Integrating using area approximation &lt;/h4&gt;
&lt;div&gt;We are now ready to find the means of our two PDFs. However, because we do not know their exact form, we will have to approximate the integrals. &lt;/div&gt;
&lt;br&gt;&lt;/br&gt;
&lt;p&gt; For example, consider a variable with the density function shown below.&lt;mdo:image alt=&quot;below&quot; height=&quot;284&quot; src=&quot;6411_extra.jpg&quot; width=&quot;369&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
We wish to calculate $ Pr(1/2 \leq X \leq 3/4) $ which we can find by  calculating the area of the shaded rectangle:&lt;br&gt;&lt;/br&gt;
 $$ Pr(1/2 \leq X \leq 3/4) =\int^{3/4}_{1/2} 1 \,dx= \hbox{base} \times \hbox{height} = \left({3\over 4} - {1\over 2}\right) \times 1 = {1\over 4} $$&lt;br&gt;&lt;/br&gt;
 &lt;/p&gt;
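&lt;p&gt;The rectangle calculation above can be written as a small Python sketch. It assumes, as in the example, a constant density of height 1; the function name is illustrative, and exact fractions are used so that the result is not affected by rounding.&lt;/p&gt;

```python
from fractions import Fraction


def uniform_probability(lower, upper, height=Fraction(1)):
    """Area of the rectangle under a constant density between lower and upper."""
    base = upper - lower
    return base * height


probability = uniform_probability(Fraction(1, 2), Fraction(3, 4))
print(probability)
```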
&lt;h4&gt;Red Line Mean&lt;/h4&gt;
&lt;p&gt;Applying the same idea to the red line in the problem, we can estimate the area under the curve using rectangles and trapeziums. Two such trapeziums are marked below in green.&lt;br&gt;&lt;/br&gt;
&lt;mdo:image alt=&quot;&quot; height=&quot;390&quot; src=&quot;curves_extra.jpg&quot; width=&quot;584&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
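To find the area of a trapezium, we use the result $ \hbox{Area(trapezium)} = {h \times (a+b) \over 2} $, where $a$ and $b$ are the lengths of the parallel sides and $h$ is the distance between them.&lt;br&gt;&lt;/br&gt;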
&lt;br&gt;&lt;/br&gt;
This gives us the probability that our variable lies within the interval covered by a small trapezium of height 1. To find that interval's contribution to the mean, we multiply this probability by the value of the variable in the interval, which we approximate by the midpoint of the interval.&lt;br&gt;&lt;/br&gt;
 &lt;br&gt;&lt;/br&gt;
Take for example the above trapezium on the right, where the variable ranges from 10 to 11. We approximate by taking the value of the variable as 10.5, and multiply this by the probability of the region to get its contribution to the mean. The table below gives our estimates of these values.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
 &lt;/p&gt;
&lt;table style=&quot;&quot; border=&quot;1&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;h&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;a&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;b&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;Area&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;Midpoint&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;Contribution to mean&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.01&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.005&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.75&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.00375&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.015&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.0575&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.08625&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.15&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.125&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;2.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.3125&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.15&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.155&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.1525&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;3.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.53375&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.155&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.135&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.145&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;4.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.6525&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.135&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.12&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.1275&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;5.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.70125&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.12&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.085&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.1025&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;6.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.66625&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.085&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.06&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.0725&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;7.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.54375&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.06&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.045&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.0525&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;8.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.44625&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.045&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.035&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.04&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;9.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.38&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.035&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.025&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.03&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;10.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.315&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.025&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.02&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.0225&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;11.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.25875&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.02&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.015&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.0175&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;12.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.21875&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.015&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.01&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.0125&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;13.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.16875&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.01&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.01&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.01&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;14.5&lt;/td&gt;
&lt;td style=&quot;text-align: center;&quot;&gt;0.145&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
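&lt;p&gt;The arithmetic in the table can be reproduced with a short Python sketch. The midpoints and areas below are copied from the Midpoint and Area columns; the function and variable names are illustrative. The values are estimates read from the chart, so the result is only as accurate as those readings.&lt;/p&gt;

```python
def trapezium_area(h, a, b):
    """Area of a trapezium with parallel sides a and b a distance h apart."""
    return h * (a + b) / 2


# (midpoint, area) for each strip, copied from the Midpoint and Area columns.
# The third area is taken as 0.125, the value given by a = 0.1 and b = 0.15
# and implied by the 0.3125 in the final column.
strips = [
    (0.75, 0.005), (1.5, 0.0575), (2.5, 0.125), (3.5, 0.1525), (4.5, 0.145),
    (5.5, 0.1275), (6.5, 0.1025), (7.5, 0.0725), (8.5, 0.0525), (9.5, 0.04),
    (10.5, 0.03), (11.5, 0.0225), (12.5, 0.0175), (13.5, 0.0125), (14.5, 0.01),
]

# Example: the strip from 10 to 11 has parallel sides 0.035 and 0.025.
example_area = trapezium_area(1, 0.035, 0.025)

# Each strip contributes (midpoint * area) to the estimate of the mean.
contributions = [midpoint * area for midpoint, area in strips]
partial_mean = sum(contributions)
total_area = sum(area for _, area in strips)

print(round(example_area, 4), round(partial_mean, 4), round(total_area, 4))
```

&lt;p&gt;The total area is a useful sanity check: it should be a little below 1, because the strips stop at 15.&lt;/p&gt;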
&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;p&gt;The sum of the values in the right-hand column is 5.4325, which covers only values of the variable up to 15. Because the question tells us the mean is a whole number, and 5.4325 is not close to one, we should also estimate the contribution from the region 15 to 20. &lt;br&gt;&lt;/br&gt;
 &lt;br&gt;&lt;/br&gt;
As the probabilities in this range are so low, it is easier to approximate the area as a very flat rectangle. Remembering that the area under the PDF is the same as the probability of the variable being in that region, we find $$ Pr(15 \leq X \leq 20) \approx 5 \times 0.005=0.025 $$ Again we use the midpoint approximation, and find that the contribution to the mean from this region is approximately $$ 17.5 \times 0.025 = 0.4375 $$&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
Adding the two contributions gives $ \bar{x} \approx 5.4325 + 0.4375 = 5.87 $, so the nearest whole number, 6, is our estimate of the mean.&lt;br&gt;&lt;/br&gt;
 &lt;br&gt;&lt;/br&gt;
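The two steps above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the rectangle height of 0.005 and the partial sum 5.4325 are the estimates quoted in the text, and the variable names are illustrative.&lt;br&gt;&lt;/br&gt;

```python
# Sum of the strip contributions up to 15, as quoted in the text.
partial_mean = 5.4325

# Tail from 15 to 20, treated as a flat rectangle of estimated height 0.005.
tail_probability = (20 - 15) * 0.005
tail_midpoint = (15 + 20) / 2
tail_contribution = tail_midpoint * tail_probability

estimated_mean = partial_mean + tail_contribution
print(round(tail_contribution, 4), round(estimated_mean, 2), round(estimated_mean))
```
&lt;br&gt;&lt;/br&gt;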
We leave the grey line for you to compute. You might want to find a more accurate estimate of the mean, and then find the relationship between the two PDFs.&lt;/p&gt;&lt;/mdoxml&gt;</solutionXML>
  <noteXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;
 

&lt;h3&gt;Why do this problem?&lt;/h3&gt;

This &lt;a href=&quot;http://nrich.maths.org/public/viewer.php?obj_id=6411&amp;amp;part=&quot;&gt;
problem&lt;/a&gt; gives an opportunity to practise numerical integration
in the context of probability distributions. It allows
students to explore the meaning of probability density functions
in terms of areas and probabilities. Instead of simply requiring an
explicit calculation, students will need to engage with decisions
concerning limits and integration.&lt;br&gt;&lt;/br&gt;
 

&lt;h3&gt;Possible approach&lt;/h3&gt;

The first stage of the problem is to realise that a numerical
integration is needed to calculate the mean. Once the class has
realised that this is the case, they will need to start to perform
the integrations. This will require various choices as there are
many ways in which this can be done. To facilitate this, you might
like to print off copies of the graph for students to draw on. 

&lt;h3&gt;Key questions&lt;/h3&gt;

&lt;div&gt;How do we relate a probability density function to a
probability?&lt;/div&gt;

&lt;div&gt;How do the two graphs relate to each other?&lt;/div&gt;

&lt;div&gt;What is the graphical interpretation of an integral?&lt;/div&gt;

&lt;div&gt;How important will the effect of the second graph be?&lt;/div&gt;

&lt;div&gt;What happens for values larger than $20$? Are these values
relevant?&lt;/div&gt;

&lt;br&gt;&lt;/br&gt;
 

&lt;h3&gt;Possible extension&lt;/h3&gt;

How might you try to estimate the variance for these distributions
numerically? 
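
&lt;div&gt;One possible approach, sketched here in Python, reuses the strip midpoints and probabilities from the mean calculation and applies $ Var(X) = E(X^2) - \bar{x}^2 $. The values below are the estimates for the red curve given in the solution, including the flat rectangle from 15 to 20; the variable names are illustrative, and the result inherits the errors in those readings.&lt;/div&gt;

```python
# (midpoint, estimated probability) for each strip of the red curve,
# taken from the solution's estimates, including the 15 to 20 tail.
strips = [
    (0.75, 0.005), (1.5, 0.0575), (2.5, 0.125), (3.5, 0.1525), (4.5, 0.145),
    (5.5, 0.1275), (6.5, 0.1025), (7.5, 0.0725), (8.5, 0.0525), (9.5, 0.04),
    (10.5, 0.03), (11.5, 0.0225), (12.5, 0.0175), (13.5, 0.0125), (14.5, 0.01),
    (17.5, 0.025),
]

# Midpoint approximations to the integrals of x*f(x) and x*x*f(x).
mean = sum(m * p for m, p in strips)
mean_of_squares = sum(m * m * p for m, p in strips)

variance = mean_of_squares - mean ** 2
print(round(mean, 2), round(variance, 2))
```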

&lt;h3&gt;Possible support&lt;/h3&gt;

&lt;div&gt;First try to show that numerically the area under the red
curve is 1. You can then use the decomposition into rectangles and
trapezia to try to work out the mean.&lt;/div&gt;

&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;&lt;/mdoxml&gt;</noteXML>
  <clueXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;
Although numerical integration is not exact, you might like to try
to do a numerical integration which you KNOW is smaller than the
mean and then do another integration which you can be very sure
gives a value which is larger than the mean.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
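A minimal sketch of this idea in Python, assuming the strip probabilities have already been estimated: within each strip the variable is at least the left edge and at most the right edge, so multiplying each probability by the left edge gives an underestimate of the mean and multiplying by the right edge gives an overestimate. The numbers below are the strip estimates for the red curve from the solution; the variable names are illustrative.&lt;br&gt;&lt;/br&gt;

```python
# (left edge, right edge, estimated probability) for each strip of the red curve.
strips = [
    (0.5, 1, 0.005), (1, 2, 0.0575), (2, 3, 0.125), (3, 4, 0.1525),
    (4, 5, 0.145), (5, 6, 0.1275), (6, 7, 0.1025), (7, 8, 0.0725),
    (8, 9, 0.0525), (9, 10, 0.04), (10, 11, 0.03), (11, 12, 0.0225),
    (12, 13, 0.0175), (13, 14, 0.0125), (14, 15, 0.01), (15, 20, 0.025),
]

# Using the left edge of every strip underestimates the mean;
# using the right edge overestimates it.
lower_bound = sum(left * p for left, _, p in strips)
upper_bound = sum(right * p for _, right, p in strips)

print(round(lower_bound, 4), round(upper_bound, 4))
```

These bounds hold only as far as the strip probabilities are accurate, and they ignore any probability beyond 20. Narrower strips bring the two bounds closer together.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;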
Don't forget that you can break an area down into rectangles or
trapezia.&lt;br&gt;&lt;/br&gt;&lt;/mdoxml&gt;</clueXML>
  <canonXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;
These are actually lognormal distributions.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;mdo:image height=&quot;127&quot; width=&quot;327&quot; src=&quot;params.jpg&quot; alt=&quot;&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
The means are 5 and 6. &lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;&lt;/mdoxml&gt;</canonXML>
  <end_user_role>5</end_user_role>
  <difficulty>3</difficulty>
  <keystage1>0</keystage1>
  <keystage2>0</keystage2>
  <keystage3>0</keystage3>
  <keystage4>0</keystage4>
  <keystage4plus>1</keystage4plus>
  <title>What's your mean?</title>
  <description>Can you work out the means of these distributions using numerical
methods?</description>
  <spec_group>Advanced Probability and Statistics
    <specifier>Probability distributions, expectation and variance</specifier>
  </spec_group>
  <spec_group>Pre-Calculus and Calculus
    <specifier>Numerical integration</specifier>
  </spec_group>
</resource>