Data Collection and Presentation: Frequency Tables, Measures of Central Tendency, and Calculations

Content:

  1. Formulation of Frequency Table for Ungrouped Data
  2. Measures of Central Tendency

1. Formulation of Frequency Table for Ungrouped Data

Ungrouped Data

Ungrouped data refers to a set of raw data that is not organized into class intervals. It represents individual observations, and the occurrences of each unique value can be counted.

To formulate a frequency table for ungrouped data, two main steps are involved:

  1. Prepare a Tally Sheet
  2. Prepare a Frequency Table

i. Preparation of a Tally Sheet

A tally sheet is used to record the occurrence of each value or variable. Each time a value occurs, a stroke (or tally) is marked against it. Every five tallies (represented as lllll) form a bundle, which makes counting easier.

Example:
The following are the scores of 30 students in an economics test:

2, 4, 8, 8, 2, 6, 6, 8, 2, 4,
8, 0, 8, 6, 0, 10, 2, 2, 0, 10,
4, 6, 0, 10, 2, 2, 6, 6, 4, 2.

We begin by tallying each number:

Scores    Tally        Frequency
0         llll         4
2         lllll lll    8
4         llll         4
6         lllll l      6
8         lllll        5
10        lll          3
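
The tallying step can be sketched in Python. This is only an illustration under stated assumptions: the helper name `tally_marks` is hypothetical, bundles are separated by a space, and the letter `l` stands in for a stroke as in the text above.

```python
# Scores of the 30 students, copied from the example above.
scores = [
    2, 4, 8, 8, 2, 6, 6, 8, 2, 4,
    8, 0, 8, 6, 0, 10, 2, 2, 0, 10,
    4, 6, 0, 10, 2, 2, 6, 6, 4, 2,
]

# Tally: add one to a score's running count each time it occurs.
tally = {}
for score in scores:
    tally[score] = tally.get(score, 0) + 1


def tally_marks(count):
    """Return `count` strokes grouped into bundles of five (hypothetical helper)."""
    bundles, remainder = divmod(count, 5)
    groups = ["lllll"] * bundles
    if remainder:
        groups.append("l" * remainder)
    return " ".join(groups)


# Print one tally-sheet row per distinct score, in ascending order.
for score in sorted(tally):
    print(f"{score:<8}{tally_marks(tally[score]):<12}{tally[score]}")
```

Counting in a loop mirrors what is done by hand: one stroke per observation, with the strokes bundled only when the sheet is written out.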

ii. Preparation of a Frequency Table

A frequency table is created by counting the tallies for each number. This table shows the total number of occurrences for each score.

The frequency table for the given scores is as follows:

Scores    Frequency
0         4
2         8
4         4
6         6
8         5
10        3
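
As a cross-check, the same frequency table can be produced with Python's standard `collections.Counter`. This is a minimal sketch, not part of the original lesson; the variable names are illustrative.

```python
from collections import Counter

# Scores of the 30 students, copied from the example above.
scores = [
    2, 4, 8, 8, 2, 6, 6, 8, 2, 4,
    8, 0, 8, 6, 0, 10, 2, 2, 0, 10,
    4, 6, 0, 10, 2, 2, 6, 6, 4, 2,
]

frequency = Counter(scores)                   # score -> number of occurrences
frequency_table = sorted(frequency.items())   # list of (score, frequency) pairs

print("Scores    Frequency")
for score, freq in frequency_table:
    print(f"{score:<10}{freq}")

# The frequencies should add up to the number of students.
total = sum(freq for _, freq in frequency_table)
print("Total     " + str(total))
```

Checking that the frequencies add up to the number of observations (30 here) is a quick way to catch a miscounted tally.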

2. Measures of Central Tendency

Measures of central tendency are statistical values that describe the center or typical value of a set of data. They include the mean, median, and mode. These measures help us understand the distribution of data by identifying a central point around which the data is clustered.


i. The Mean

The mean (also called the arithmetic mean) is the average of a set of numbers. It is calculated by adding all the values together and dividing by the number of values.

Formula for Mean:

$$\text{Mean} = \frac{\text{Sum of all observations}}{\text{Number of observations}}$$

Example: Calculate the mean of the following scores: 14, 18, 24, 16, 30, 12, 20, and 10.

Solution:

  1. Add up all the scores:
    $14 + 18 + 24 + 16 + 30 + 12 + 20 + 10 = 144$
  2. Count the number of observations (students):
    There are 8 students.
  3. Divide the sum by the number of observations:

    $$\text{Mean} = \frac{144}{8} = 18$$

So, the mean score is 18.
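
The three steps of the solution map directly onto a few lines of Python. This is a minimal sketch; the variable names are illustrative.

```python
scores = [14, 18, 24, 16, 30, 12, 20, 10]

total = sum(scores)   # step 1: add up all the scores
n = len(scores)       # step 2: count the observations
mean = total / n      # step 3: divide the sum by the count

print(total, n, mean)  # 144 8 18.0
```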


Advantages of the Mean:
  1. Easy to calculate.
  2. Easy to interpret.
  3. Widely used and well-known.
  4. Provides an exact value.
  5. Useful for comparing different sets of data.
Disadvantages of the Mean:
  1. Can be difficult to compute manually if there are many values.
  2. Extreme values (outliers) can distort the mean.
  3. Cannot always be interpreted visually (e.g., in a graph).
  4. May not accurately reflect the data if there are missing or incorrect values.
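
The effect of an extreme value on the mean can be seen in a short sketch. The scores below are made up for illustration and do not come from the lesson.

```python
# Hypothetical test scores clustered around 70.
typical_scores = [70, 72, 68, 71, 69]
# The same scores with one extreme low value added.
with_outlier = typical_scores + [10]

mean_typical = sum(typical_scores) / len(typical_scores)
mean_with_outlier = sum(with_outlier) / len(with_outlier)

print(mean_typical)       # 70.0
print(mean_with_outlier)  # 60.0
```

A single score of 10 pulls the mean down by 10 marks, even though five of the six students scored close to 70.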

ii. The Median

The median is the middle value of a data set when the values are arranged in either ascending or descending order. If the total number of observations is odd, the median is the value in the middle. If the total number of observations is even, the median is the average of the two middle values.

Example 1: Find the median of the following scores: 12, 8, 15, 9, 3, 7, and 1.

Solution:

  1. Arrange the data in ascending order:
    1, 3, 7, 8, 9, 12, 15
  2. Since there are 7 observations (an odd number), the median is the middle value, which is in the 4th position:
    Median = 8

Example 2: Find the median of the following scores: 36, 42, 10, 15, 9, 32, 16, and 12.

Solution:

  1. Arrange the data in ascending order:
    9, 10, 12, 15, 16, 32, 36, 42
  2. Since there are 8 observations (an even number), the median is the average of the 4th and 5th values:

    $$\text{Median} = \frac{15 + 16}{2} = \frac{31}{2} = 15.5$$

So, the median score is 15.5.
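
Both cases (odd and even number of observations) can be handled by one small function. This is a sketch of the rule described above; the function name `median` is chosen for illustration.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)   # arrange the data in ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[middle]
    # Even count: the average of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([12, 8, 15, 9, 3, 7, 1]))          # 8
print(median([36, 42, 10, 15, 9, 32, 16, 12]))  # 15.5
```

Python lists are indexed from 0, so the "4th position" of seven values is index 3, and the 4th and 5th of eight values are indices 3 and 4.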


Advantages of the Median:
  1. Simple to calculate, especially for small data sets.
  2. It is not affected by extreme values or outliers.
  3. It provides a better central value for skewed data distributions.
Disadvantages of the Median:
  1. It is not useful for further statistical analysis (e.g., in calculating other averages).
  2. It ignores some values in the data set.
  3. It may not represent the “typical” value of the set of data in some cases.

iii. The Mode

The mode is the value that appears most frequently in a data set. It is the most common observation in the set. The mode is particularly useful when analyzing categorical data.

Example:
Consider the following frequency distribution of scores:

Scores    Frequency
1         2
2         3
3         6
4         2
5         3
6         2
9         1
10        1

The mode is 3, as it appears the most frequently (6 times).
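
Reading the mode off a frequency table amounts to finding the score with the largest frequency. A minimal Python sketch, with the table stored as a dictionary (an illustrative choice):

```python
# Frequency table from the example above: score -> frequency.
frequency = {1: 2, 2: 3, 3: 6, 4: 2, 5: 3, 6: 2, 9: 1, 10: 1}

# The mode is the score whose frequency is largest.
mode = max(frequency, key=frequency.get)

print(mode, frequency[mode])  # 3 6
```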


Advantages of the Mode:
  1. Easy to understand and interpret.
  2. Not affected by extreme values.
  3. Can be determined easily from a frequency table or graph.
Disadvantages of the Mode:
  1. It may not be unique (i.e., more than one mode can exist).
  2. It may not be useful for further statistical calculations.
  3. It does not consider all values in the data set.
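
The first disadvantage (a mode that is not unique) can be illustrated with `statistics.multimode` from Python's standard library (available from Python 3.8), which returns every value tied for the highest frequency. The two small data sets are chosen for illustration.

```python
from statistics import multimode

one_mode = multimode([5, 6, 7, 7, 8])    # 7 occurs twice, everything else once
two_modes = multimode([3, 3, 5, 5, 7])   # 3 and 5 both occur twice

print(one_mode)   # [7]
print(two_modes)  # [3, 5]
```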

Evaluation Questions

  1. Define mean.
  2. What are the disadvantages of the mean?
  3. Define median and calculate the median of the following set of numbers: 14, 18, 20, 22, 24.
  4. What are the advantages of the median?
  5. Define mode and calculate the mode of the following data set: 3, 3, 4, 5, 5, 5, 6, 7.
  6. List three disadvantages of the mode.
  7. Explain measures of central tendency.
  8. Calculate the mean of the following numbers: 11, 15, 18, 20, 22, 30.

Reading Assignment

  • Amplified and Simplified Economics for SSS by Femi Longe, pages 28-33
  • Comprehensive Economics for SSS by J.V. Anyaele, chapter 2, pages 61-70; chapter 2, pages 48-50

General Evaluation

  1. Find the mean of the following set of numbers:
    14, 11, 12, 13, 11, 14, 12, 20, 24, 21, 22, 23, 20, 11, 13, 23.
  2. Define the median and state four advantages of the median.
  3. Calculate the mean of: 18, 14, 14, 15, 13, 18, 19, 19, 19, 21.
  4. What are the disadvantages of the mean?
  5. Define the mode.
  6. List three advantages of the mode.

Weekend Assignment

  1. A group of five (5) strokes makes up a:
    a) Tally
    b) Frequency
    c) Mode
    d) Observation
  2. The number that occurs most often in a data set is the:
    a) Median
    b) Mode
    c) Mean
    d) Frequency

Conclusion

In this lesson, we covered how to formulate a frequency table from ungrouped data and explored three measures of central tendency: the mean, the median, and the mode.

Evaluation Questions

  1. The measure of central tendency that is calculated by adding all the values and dividing by the total number of values is called the __________.
    a) Mode
    b) Median
    c) Mean
    d) Range
  2. The measure of central tendency that is the middle value when the data is arranged in ascending or descending order is called the __________.
    a) Mode
    b) Median
    c) Mean
    d) Range
  3. The measure of central tendency that represents the most frequent value in a data set is the __________.
    a) Mode
    b) Median
    c) Mean
    d) Range
  4. A __________ is a table that lists all the values of a data set and their frequencies.
    a) Frequency distribution
    b) Bar chart
    c) Histogram
    d) Frequency table
  5. To find the mean, you must __________ the total sum of the values by the number of observations.
    a) Multiply
    b) Add
    c) Divide
    d) Subtract
  6. In a data set with an odd number of values, the median is the value located at the __________.
    a) Start of the data set
    b) End of the data set
    c) Middle of the data set
    d) Average of the first and last value
  7. If a data set has two values that appear most frequently, it is called a __________ distribution.
    a) Unimodal
    b) Bimodal
    c) Uniform
    d) Skewed
  8. A set of scores has the values: 2, 5, 5, 6, 6, 6, 7, 7. The mode of the data set is __________.
    a) 5
    b) 6
    c) 7
    d) 5 and 6
  9. If the total number of observations in a data set is 12, the median will be the average of the __________ and __________ values.
    a) 6th, 7th
    b) 5th, 6th
    c) 4th, 5th
    d) 3rd, 4th
  10. The mean of the numbers 12, 15, 18, 10, and 20 is __________.
    a) 15
    b) 17
    c) 18
    d) 16
  11. A data set has the values: 3, 3, 5, 5, 7, 7, 7, 8, 8. The median is __________.
    a) 5
    b) 6
    c) 7
    d) 5.5
  12. If a frequency table lists data values and their corresponding tallies, it is important to remember that each group of five tallies represents __________.
    a) One
    b) Two
    c) Three
    d) Five
  13. A data set with extreme values that are much higher or lower than the rest is called a __________ distribution.
    a) Skewed
    b) Normal
    c) Symmetrical
    d) Uniform
  14. The mean can be affected by __________.
    a) The mode
    b) Outliers
    c) The range
    d) The median
  15. The range is calculated by subtracting the __________ value from the __________ value in a data set.
    a) Lowest, highest
    b) Highest, lowest
    c) Middle, lowest
    d) Middle, highest

Class Activity Discussion

  1. What is the mean, and how do we calculate it?
    The mean is the sum of all values divided by the number of observations. For example, if the scores of 5 students are 12, 15, 18, 10, and 20, the mean is:
    $\frac{12 + 15 + 18 + 10 + 20}{5} = \frac{75}{5} = 15$.
    So, the mean score is 15.
  2. What is the median, and how is it different from the mean?
    The median is the middle value of a data set when arranged in ascending or descending order. For an odd number of observations, it’s the central value, and for an even number of observations, it’s the average of the two middle values. Unlike the mean, which uses the size of every value, the median depends only on which value falls in the middle of the ordered data.
    Example: If the data set is 2, 3, 5, 6, 8, the median is 5.
  3. How do we calculate the mode?
    The mode is the value that occurs most frequently. If a data set is 5, 6, 7, 7, 8, the mode is 7 because it appears most often.
  4. Can a data set have more than one mode?
    Yes, a data set can be bimodal (two modes) or multimodal (more than two modes). For example, the data set 3, 3, 5, 5, 7 has two modes: 3 and 5.
  5. What is a frequency table, and how is it used?
    A frequency table lists all the values in a data set and the number of times each value occurs. This table helps organize the data and makes it easier to analyze. For example, if 2 appears 4 times, 3 appears 3 times, etc., a frequency table will show these occurrences clearly.
  6. What is the range of a data set, and how is it calculated?
    The range is the difference between the highest and lowest values in a data set. For example, if the data set is 1, 5, 9, 10, the range is $10 - 1 = 9$.
  7. Why might the mean not represent the “typical” value in a data set?
    The mean can be distorted by extreme values (outliers). For example, if most students score around 70 on a test, but one student scores 10, the mean is pulled below the “typical” score of about 70.
  8. What happens when a data set has an even number of values?
    When there is an even number of values, the median is the average of the two middle values. For example, in the data set 3, 5, 7, 8, the median is $\frac{5 + 7}{2} = 6$.
  9. Can the mode be useful for all types of data?
    The mode can be found for both numerical and categorical data. It is particularly useful for categorical data, where you need to know the most frequent category, such as the most popular color or the most frequent product preference.
  10. How do tallies help in creating a frequency table?
    Tallies make it easier to count occurrences in a data set. Each bundle of five tallies represents five occurrences, so the bundles can be counted in fives, making the counting process faster and more organized.
  11. What is a skewed data distribution?
    A skewed distribution occurs when the data is not symmetrical. In a positively skewed distribution, most values are clustered on the left side, with a few larger values on the right. In a negatively skewed distribution, most values are on the right side, with a few smaller values on the left.
  12. Why is it important to understand the median?
    The median is useful when data is skewed or has outliers, as it gives a better representation of the center of the data than the mean in such cases.
  13. What is a frequency distribution?
    A frequency distribution shows the frequency of each value or group of values in a data set. It helps in organizing and summarizing the data efficiently.
  14. How can the mode help in analyzing data?
    The mode helps identify the most common value in a data set, which is useful for determining trends or preferences in categorical data.
  15. What is the advantage of using the median over the mean?
    The median is less affected by outliers, making it a better measure of central tendency for skewed data.
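
Two of the answers above, the range and the median's resistance to outliers, can be checked with a short Python sketch. The first data set is the range example from the discussion; the second is made up for illustration.

```python
from statistics import median

# Range: highest value minus lowest value.
data = [1, 5, 9, 10]
data_range = max(data) - min(data)
print(data_range)  # 9

# Hypothetical data with one extreme high value (100).
skewed = [3, 4, 5, 6, 100]
skewed_mean = sum(skewed) / len(skewed)
skewed_median = median(skewed)

print(skewed_mean)    # 23.6
print(skewed_median)  # 5
```

The mean of 23.6 is larger than four of the five values, while the median of 5 stays in the middle of the bulk of the data.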

Evaluation

  1. What is the formula for calculating the mean?
  2. How do you calculate the median for a data set with an even number of values?
  3. What is the mode of the following data set: 10, 12, 12, 13, 14, 15, 15, 15?
  4. How is a frequency table different from a histogram?
  5. What is the range of the data set: 5, 9, 2, 4, 7?
  6. How does the presence of outliers affect the mean?
  7. If a data set is 3, 6, 9, 6, 3, 6, what is the mode?
  8. What is the median of the following data set: 12, 5, 9, 8, 15?
  9. How do you find the mean for the data set: 8, 6, 9, 7, 5, 8, 9, 10?
  10. What is the mode of this data set: 4, 5, 5, 4, 5, 6, 7, 6?