The Python numpy average() function helps compute the weighted average of a given array along the specified axis.
Python numpy average Syntax
The syntax of this statistical method is
numpy.average(a, axis = None, weights = None, returned = False, *, keepdims = <no value>)
The arguments are
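- a – The input array (or any array-like object).
- axis – If you specify the axis value, this function will only calculate the average along that axis. By default (None), it averages all the elements.
- weights – It will calculate the weighted average if you define the weights. Otherwise, every element counts equally.
- returned – If you set this to True, it will return a tuple of the average and the sum of weights.
- keepdims – If you set this to True, the averaged axes are kept in the result as dimensions of size one.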
Difference between average() and mean()
Both the average() and mean() functions will calculate the average of the ndarray. However, Python numpy average() has a weights argument to compute the weighted average, and mean() does not.
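As a quick check of this difference, the sketch below compares the two functions on a small sample array. The variable names are only for illustration.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

# Without weights, average() and mean() give the same result.
plain_average = np.average(a)
plain_mean = np.mean(a)
print(plain_average, plain_mean)

# Equal weights also reduce the weighted average to the plain mean.
equal_weighted = np.average(a, weights = [1, 1, 1, 1, 1])
print(equal_weighted)
```

The results only start to differ once the weights are unequal.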
Python numpy average Example
In this example, we declared a one-dimensional ndarray of five integers. Next, we calculate both the normal and weighted average of the array.
import numpy as np

a = np.array([10, 20, 30, 40, 50])

b = np.average(a)
print(b)

c = np.average(a, weights = [2, 4, 6, 8, 10])
print(c)

d = np.average(a, weights = [2, 4, 6, 8, 10], returned = True)
print(d)
30.0
36.666666666666664
(36.666666666666664, 30.0)
(10 * 2 + 20 * 4 + 30 * 6 + 40 * 8 + 50 * 10) / (2 + 4 + 6 + 8 + 10)
= (20 + 80 + 180 + 320 + 500) / 30
= 1100 / 30
= 36.666666666666664
In the last line, returned = True makes the function return a tuple of the weighted average and the sum of the given weights. Here, the sum = 2 + 4 + 6 + 8 + 10 = 30.
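The hand calculation above can be reproduced in code. This is a minimal sketch that assumes the same array and weights. The names w, manual, avg, and weight_sum are ours.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])
w = np.array([2, 4, 6, 8, 10])

# Weighted average by hand: sum of (value * weight) / sum of weights.
manual = np.sum(a * w) / np.sum(w)
print(manual)

# returned = True gives a tuple, so we can unpack both values.
avg, weight_sum = np.average(a, weights = w, returned = True)
print(avg, weight_sum)
```

Both approaches should agree, and weight_sum holds the total of the weights.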
Two-Dimensional example
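In this program, we will declare a two-dimensional square matrix and calculate its normal and weighted averages. The weights have the same shape as the array, so each element gets its own weight.
import numpy as np

a = np.array([[10, 20], [30, 40]])

b = np.average(a)
print(b)

c = np.average(a, weights = [[3, 5], [6, 2]])
print(c)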
25.0
24.375
b = (10 + 20 + 30 + 40) / 4 = 100 / 4 = 25
c = (10 * 3 + 20 * 5 + 30 * 6 + 40 * 2) / (3 + 5 + 6 + 2) = 390 / 16
c = 24.375
By default, this Python function calculates the average for the whole numpy array. However, we can compute the row and column-wise by specifying the axis value. For instance, in the below example, axis = 0 calculates the average for each column, and axis = 1 calculates each row.
If you set the keepdims argument to True, the result keeps the averaged axis as a dimension of size one, so it has the same number of dimensions as the original array.
import numpy as np

a = np.arange(20).reshape(4, 5)
print(a)

b = np.average(a)
print(b)

c = np.average(a, axis = 0)
print(c)

d = np.average(a, axis = 1)
print(d)

e = np.average(a, axis = 1, keepdims = True)
print(e)
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
9.5
[ 7.5  8.5  9.5 10.5 11.5]
[ 2.  7. 12. 17.]
[[ 2.]
 [ 7.]
 [12.]
 [17.]]
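To see what keepdims changes, the sketch below compares the shapes of the two row-wise results and then uses the kept dimension to subtract each row's average from that row. It assumes NumPy 1.23 or later, where average() accepts keepdims. The names row_avg, row_avg_kept, and centered are only for illustration.

```python
import numpy as np

a = np.arange(20).reshape(4, 5)

# Without keepdims, the averaged axis is dropped from the result.
row_avg = np.average(a, axis = 1)
print(row_avg.shape)

# With keepdims = True, the averaged axis stays as a size-one dimension.
row_avg_kept = np.average(a, axis = 1, keepdims = True)
print(row_avg_kept.shape)

# The (4, 1) result broadcasts against the (4, 5) array row by row.
centered = a - row_avg_kept
print(centered)
```

The kept dimension is what lets the subtraction line up with the rows without any manual reshape.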
In this example, we used arange and reshape methods to generate a 2D array of 4 * 3 size with the numbers from 0 to 11. Next, we used axis and weights so that Python would calculate the weighted average of numpy array rows and columns. Because the weights are one-dimensional, their length must match the array size along the chosen axis: four weights for axis = 0 and three weights for axis = 1.
import numpy as np

a = np.arange(12).reshape(4, 3)
print(a)

b = np.average(a)
print(b)

c = np.average(a, axis = 0, weights = [3, 5, 7, 9])
print(c)

d = np.average(a, axis = 1, weights = [2, 4, 6])
print(d)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
5.5
[5.75 6.75 7.75]
[ 1.33333333  4.33333333  7.33333333 10.33333333]
For b = (0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11) / 12 = 66 / 12 = 5.5
First Column in c = (0 * 3 + 3 * 5 + 6 * 7 + 9 * 9) / (3 + 5 + 7 + 9) = 138 / 24 = 5.75
First Row in d = (0 * 2 + 1 * 4 + 2 * 6) / (2 + 4 + 6) = 16 / 12 = 1.33333333
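The same arithmetic can be checked for every column and row at once. The sketch below is one way to do it by hand, assuming the same array and weights as above. The names row_w, col_w, manual_c, and manual_d are ours.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)
row_w = np.array([3, 5, 7, 9])   # one weight per row, used with axis = 0
col_w = np.array([2, 4, 6])      # one weight per column, used with axis = 1

# axis = 0: scale each row by its weight, then sum down the columns.
manual_c = (a * row_w[:, np.newaxis]).sum(axis = 0) / row_w.sum()
print(manual_c)

# axis = 1: scale each column by its weight, then sum across the rows.
manual_d = (a * col_w).sum(axis = 1) / col_w.sum()
print(manual_d)

# Compare against the built-in function.
print(np.allclose(manual_c, np.average(a, axis = 0, weights = row_w)))
print(np.allclose(manual_d, np.average(a, axis = 1, weights = col_w)))
```

Both comparisons should print True, which confirms that the weights are applied along the axis being averaged.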