3

I'm trying to get the weighted average of CpM (Cost per Mile) for the dataset below: enter image description here

I'm calculating the average by:

1st approach: Average CpM = sum(TotalCost / Miles) / count = (2 + 2.85 + 3.1 + 2.9) / 4 = $2.715

However, I've seen a colleague calculate the weighted average CpM the following way:

2nd approach: Average CpM = sum(TotalCost) / sum(Miles) = (100 + 200 + 310 + 290) / (50 + 70 + 100 + 100) = $2.81

When we introduce a new outlier that has extremely high CpM to the dataset, the 1st approach tends to shoot up along with the outlier, while the 2nd is more robust to the outlier:

enter image description here

1st approach: Average CpM = sum(TotalCost / Miles) / count = (2 + 2.85 + 3.1 + 2.9 + 10) / 5 = $5.21

2nd approach: Average CpM = sum(TotalCost) / sum(Miles) = (100 + 200 + 310 + 290 + 100) / (50 + 70 + 100 + 100 + 10) = $3.03

I'm not sure if the 2nd approach of calculating Average CpM is a valid approach, and if not, what this formula means and what it's getting at. I also wonder if there are any other ways to calculate the weighted arithmetic mean.

2 Answers2

2

Your first method does not give a meaningful result.

Imagine you want to build a house. Someone calculates the average cost of the building materials, by finding out the cost of one plank of wood, one brick, one doorbell, etc., and taking the average. That won't be useful because you'll need much more wood and bricks than doorbells.

You are doing the same here. You are taking the cost of one mile from the first trip, one mile from the second trip, etc, and taking the average. You are however paying for many more miles on some trips than others.

2

I'm trying to get the weighted average of CpM (Cost per Mile) for the dataset below: enter image description here

1st approach: Average CpM = sum(TotalCost / Miles) / count = (2 + 2.85 + 3.1 + 2.9) / 4 = $2.715

This is merely the simple average of the four CpM values, not any weighted average.

2nd approach: Average CpM = sum(TotalCost) / sum(Miles) = (100 + 200 + 310 + 290) / (50 + 70 + 100 + 100) = $2.81

This is the arithmetic mean of the four CpM values weighted by their mileage.

$$\frac{m_1}{m_1+m_2+m_3+m_4}\left(\dfrac{c_1}{m_1}\right)+\frac{m_2}{m_1+m_2+m_3+m_4}\left(\dfrac{c_2}{m_2}\right)+\frac{m_3}{m_1+m_2+m_3+m_4}\left(\frac{c_3}{m_3}\right)+\frac{m_4}{m_1+m_2+m_3+m_4}\left(\frac{c_4}{m_4}\right)\\=\boxed{\frac{c_1+c_2+c_3+c_4}{m_1+m_2+m_3+m_4}}.$$

what does this formula means and what is it getting at

It gives the representative (i.e., 'average') value of the four CpM values such that the trips with higher mileage are given more weight than the trips with lower mileage.

If you require more intuition, please skim the middle section of this answer: Calculating marks using simple average versus weighted average.

I also wonder if there are any other ways to calculate the weighted arithmetic mean.

You could also compute the arithmetic mean of the four CpM values weighted by their associated costs, but not useful.

ryang
  • 38,879
  • 14
  • 81
  • 179
  • Hi ryang, I really appreciate your detailed explanation. So if I were to take the 2nd approach to calculate "weighted Cost" not "weighted CpM", how would it look then? – user9532692 May 17 '23 at 16:09
  • @user9532692 We did not calculate weighted CpM (what do that and weighted cost mean?). Can you see how that unwieldy expression relates to the boldfaced label above it? – ryang May 17 '23 at 16:23
  • Yes this makes sense. I think there is a typo in the denominator (two m_4) in your expression. As for my previous question in my comment, I was asking if we could use a similar approach to get arithmetic mean of the four TotalCost values weighted by their associated TotalCost which in this example, becomes 255.64? – user9532692 May 17 '23 at 17:13
  • c_1(c_1/(c_1+c_2+c_3+c_4)) + c_2(c_2/(c_1+c_2+c_3+c_4)) + ... + c_4(c_4/(c_1+c_2+c_3+c_4)) = 255.64; This is how I'm calculating the arithmetic mean of the four TotalCost values weighted by their associated TotalCost – user9532692 May 17 '23 at 17:42
  • Assuming that I have more data with more weeks (WeekNum=2, 3, ...), will my proposed arithmetic mean of the TotalCost values weighted by their associated TotalCost by each week be considered a "volume" weighted mean of TotalCost? – user9532692 May 17 '23 at 18:55
  • I don't think that's the accurate way to calculate "volume" weighted mean of TotalCost for each week. I might as well create a new post with a new example for this. Thanks for all your help ryang! – user9532692 May 17 '23 at 20:11
  • I created a new post somewhat related to this post. Thanks in advance for your feedback! – user9532692 May 17 '23 at 22:18