Statistics and Probability

  • Families,

    When you collect data, or results, organizing it can be very helpful for predicting future events. For example, a basketball player may expect to score some points each game, and the teams that play against that player would like to know who is likely to score how many points. The players expected to score the most get lots of defensive attention, while those expected to score the least might not get as much coverage. On the former Trailblazers, Brandon Roy, who is 6 feet 6 inches tall, scored a lot of points, so other teams wanted to slow him down. Joel Przybilla, on the other hand, is seven feet two inches tall but typically scored few points, so no one really guarded him that much. Let's look a little closer. Say Brandon Roy's points are collected over 5 games, and in those games he scores 27, 22, 23, 27, and 31 points. There are some good ways to organize that information to get an idea of what he will do next. We will talk about some terms your child will hear and learn.

    Data: The information collected. Here it is the points scored: 27, 22, 23, 27, and 31.

    Order: Data should be organized from smallest to largest unless otherwise noted. In this case, we would see 22, 23, 27, 27, 31.

    Range: The largest number minus the smallest. Here it is 31 - 22 = 9, so 9 is our range.
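
    For families who like to check the numbers on a computer, here is a minimal Python sketch of the order and range steps above. The variable names (points, ordered, data_range) are only illustrative choices, not something used in class.

```python
# Points Brandon Roy scored in the five example games.
points = [27, 22, 23, 27, 31]

# Order: arrange the data from smallest to largest.
ordered = sorted(points)

# Range: the largest value minus the smallest value.
data_range = max(points) - min(points)

print(ordered)     # [22, 23, 27, 27, 31]
print(data_range)  # 9
```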

    Average: An expected result based on the information collected. Average can be measured in 3 different ways: mean, median, and mode.

    Mean: The most commonly used average. Add up the data and divide by the number of data points collected. In our case, 22 + 23 + 27 + 27 + 31 = 130 is the sum. There are 5 data points, so we divide 130 by 5 to get 26 points. Our mean is 26, so we should expect Brandon Roy to score about 26 points in the next game. The mean can be thought of as the sum divided by the number of data points.

    Median: This is the value in the middle of the order, like the median in the middle of the road. In our case, 22, 23, 27, 27, 31 is our collection in order. The first 27 is right in the middle, with 2 data points to the left, 22 and 23, and 2 data points to the right, 27 and 31. Our median is 27.

    Mode: The mode is the value that shows up the most. "Mode = most" is an easy way to remember. In our case, 27 shows up twice and every other value shows up once, so 27 is our mode. Sometimes there is no mode, if no value shows up more than any other.

    So we have 3 different ways to predict what Brandon Roy can be expected to do and the values are pretty close.
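
    The three averages above can also be checked with Python's built-in statistics module. This is just one possible sketch. The variable names are illustrative, and the comments repeat the hand calculations from the definitions.

```python
from statistics import mean, median, mode

# Points Brandon Roy scored in the five example games.
points = [27, 22, 23, 27, 31]

mean_points = mean(points)      # sum (130) divided by the number of data points (5)
median_points = median(points)  # middle value once the data is put in order
mode_points = mode(points)      # value that shows up the most

print(mean_points, median_points, mode_points)
```

    The three results are 26, 27, and 27, matching the values worked out by hand.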

    That is a basic look at some measures of statistics. Now a look at probability.

    An event or an outcome is a result that may or may not happen.

    A four-color spinner with equal-size parts, for example, has 4 different outcomes, say red, white, blue, and green. The spinner can land on only 1 of those 4 colors. There is a 1 in 4 chance of landing on red, since there are 4 colors to choose from but only 1 red. The outcome you want is the favorable or desired outcome. We have called it the winner or winning outcome in class. When we talk about probability, we mean the likelihood of a result or outcome, as mentioned earlier. In math terms, the probability is the number of favorable or winning outcomes (for us that is 1, the red) divided by the number of possible outcomes, which is 4. This makes our probability equal to 0.25, since 1 divided by 4 is 0.25, which we can put on a probability number line from 0 to 1. The value of 0 represents an impossible outcome, while the value of 1 represents a certain outcome. An outcome that is just as likely to happen as not has a value of 0.5, which is equal to one half. This is the probability of winning a coin toss, since there is 1 winning outcome, heads for example, while there are 2 possible outcomes, heads or tails.
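
    The rule "favorable outcomes divided by possible outcomes" can be sketched in a few lines of Python. The helper name probability is made up for this illustration, and it assumes every outcome in the list is equally likely, as with the spinner and the coin above.

```python
from fractions import Fraction

def probability(winning_outcome, outcomes):
    """Favorable outcomes divided by possible outcomes (all equally likely)."""
    favorable = outcomes.count(winning_outcome)
    possible = len(outcomes)
    return Fraction(favorable, possible)

spinner = ["red", "white", "blue", "green"]  # four equal-size parts
coin = ["heads", "tails"]

p_red = probability("red", spinner)
p_heads = probability("heads", coin)

print(p_red, float(p_red))      # 1/4 0.25
print(p_heads, float(p_heads))  # 1/2 0.5
```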

    Still reading? Well, thank you. On that number line from 0 to 1, the middle is 0.5, as mentioned earlier. Anything between 0 and 0.5 is considered unlikely, while anything between 0.5 and 1 is considered likely.
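
    As a rough sketch, the number line can be turned into a small Python function that puts a probability into words. The function name describe is hypothetical. The cutoffs are the ones given above: 0 is impossible, 1 is certain, and 0.5 is the middle.

```python
def describe(p):
    """Put a probability between 0 and 1 into words."""
    if p < 0 or p > 1:
        raise ValueError("a probability must be between 0 and 1")
    if p == 0:
        return "impossible"
    if p == 1:
        return "certain"
    if p < 0.5:
        return "unlikely"
    if p > 0.5:
        return "likely"
    return "equally likely"

print(describe(0.25))  # unlikely (landing on red on the spinner)
print(describe(0.5))   # equally likely (heads on a coin toss)
```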

    Let's look at a "number cube" with numbers 1 through 6 having equally likely outcomes. If you talk about the probability of rolling a 4, you can think of it like this. There is 1 favorable or winning outcome, the 4, while there are 6 equally likely outcomes. So your probability of rolling a 4 is 1 (favorable outcomes) divided by 6 (possible outcomes). This is the fraction 1/6 or one sixth. I don't know how to make that fraction on a computer, sorry.
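
    For what it is worth, Python can show this probability both as a fraction and as a decimal. A short sketch, with illustrative variable names:

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]  # the six equally likely outcomes

# One favorable outcome (the 4) out of six possible outcomes.
p_four = Fraction(faces.count(4), len(faces))

print(p_four)                   # 1/6
print(round(float(p_four), 3))  # 0.167
```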

    As we talk about probability, fairness must be considered. For a spinner to be fair, for example, all the outcomes must be equally likely.
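
    One way to picture fairness is to compare the sizes of the spinner's parts. The sketch below assumes each part is measured in degrees of the circle. The function name is_fair and both example spinners are made up for illustration.

```python
def is_fair(section_sizes):
    """A spinner is fair when every section is the same size."""
    sizes = list(section_sizes.values())
    return all(size == sizes[0] for size in sizes)

# Hypothetical spinners, with each color's share given in degrees (360 in all).
equal_spinner = {"red": 90, "white": 90, "blue": 90, "green": 90}
lopsided_spinner = {"red": 180, "white": 60, "blue": 60, "green": 60}

print(is_fair(equal_spinner))     # True
print(is_fair(lopsided_spinner))  # False
```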

    Of course, we need to talk about variability in the data points collected. If you collect data points and most of them are clustered together (sorry, I know that word has other meanings), but one data point is far away from the rest, then maybe that data point is not so good. The distance between the biggest and smallest points is called the range. Range equals the biggest data point minus the smallest data point: Range = Max - Min. A point that lies far away from the cluster or clusters is called an outlier. Sometimes that point is ignored because of errors in the collection of points, but sometimes you have a point that just is much larger or smaller than the others. These are the main ideas your child should know.
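
    To see how much a single outlier can change things, here is a small Python sketch. The sixth game, with only 2 points, is a made-up value added purely for illustration and is not real data.

```python
from statistics import mean

points = [22, 23, 27, 27, 31]  # the five example games, in order
with_outlier = points + [2]    # hypothetical sixth game, far from the cluster

def data_range(values):
    """Range = Max - Min."""
    return max(values) - min(values)

range_before = data_range(points)
range_after = data_range(with_outlier)
mean_before = mean(points)
mean_after = mean(with_outlier)

print(range_before, range_after)
print(mean_before, mean_after)
```

    With the made-up outlier included, the range grows from 9 to 29 and the mean drops from 26 to 22, even though five of the six games are unchanged.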

    If you have suggestions or comments, please let me know by sending me an email at ajaquiss@pps.net and I will try to get back to you as soon as possible.