Broadcasting in NumPy
Broadcasting is a useful NumPy feature that lets arrays with different shapes be combined in element-wise operations. Rather than changing the arrays themselves, NumPy treats the smaller array as if it were stretched to a common shape. This is useful when you are working with several arrays at once and need their dimensions to line up. A few rules determine when this works.
- All input arrays whose ndim is smaller than that of the input with the largest ndim have 1s prepended to their shapes. In other words, an array with fewer dimensions is treated as if it had extra leading dimensions of size 1.
- The size in each dimension of the output shape is the maximum of all the input sizes in that dimension. In other words, the result is as large as the largest input along every dimension.
- An input can be used in the calculation if its size in a particular dimension either matches the output size in that dimension or is exactly 1.
- If an input has a dimension size of 1 in its shape, the first data entry in that dimension will be used for all calculations along that dimension. In other words, the stepping machinery of the ufunc will simply not step along that dimension (the stride will be 0 for that dimension).
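The four rules above can be checked directly in a short session. The following is a minimal sketch, not a definitive recipe: the arrays x and y and their shapes are made up for illustration, and np.broadcast_shapes assumes NumPy 1.20 or newer.

```python
import numpy as np

# Illustrative arrays (not from the text above).
x = np.arange(3).reshape(3, 1)   # shape (3, 1)
y = np.arange(4)                 # shape (4,), treated as (1, 4)

# Rules 1 and 2: prepend 1s to the shorter shape, then take the
# maximum size in each dimension.
out_shape = np.broadcast_shapes(x.shape, y.shape)

# Rule 4: a size-1 dimension is stretched without copying data,
# so the stride along that dimension is 0 in the broadcast view.
x_view = np.broadcast_to(x, out_shape)
x_strides = x_view.strides

# Rule 3: sizes must match or be exactly 1, so (3,) and (4,) fail.
try:
    np.broadcast_shapes((3,), (4,))
    incompatible = False
except ValueError:
    incompatible = True

# The same rules apply implicitly in ordinary arithmetic.
total = x + y
print(out_shape, x_strides, incompatible, total.shape)
```

Here np.broadcast_to returns a read-only view, so no memory is spent on repeating the column of x.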
For example, if a.shape is (5,1), b.shape is (1,6), c.shape is (6,), and d.shape is () so that d is a scalar, then a, b, c, and d are all broadcastable to shape (5,6):
- a acts like a (5,6) array where a[:,0] is broadcast to the other columns,
- b acts like a (5,6) array where b[0,:] is broadcast to the other rows,
- c acts like a (1,6) array and therefore like a (5,6) array where c[:] is broadcast to every row, and finally,
- d acts like a (5,6) array where the single value is repeated.
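The example with a, b, c, and d can be reproduced as follows. This is a sketch under assumptions: the shapes come from the text, but the values filling each array are arbitrary choices made here so that the results are easy to inspect.

```python
import numpy as np

# Shapes match the example; the values are arbitrary.
a = np.arange(5).reshape(5, 1)   # shape (5, 1)
b = np.arange(6).reshape(1, 6)   # shape (1, 6)
c = np.arange(6) * 10            # shape (6,)
d = np.array(100)                # shape (), a zero-dimensional array

# All four inputs broadcast to (5, 6) in one expression.
result = a + b + c + d

# broadcast_arrays makes the implicit stretching visible by
# returning each input as a (5, 6) view.
a_b, b_b, c_b, d_b = np.broadcast_arrays(a, b, c, d)

print(result.shape)
print(a_b[:, 4])   # same values as a[:, 0]
print(b_b[3, :])   # same values as b[0, :]
```

Each element of result is the sum of the corresponding row entry of a, column entry of b, column entry of c, and the scalar d.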
Broadcasting, together with reshaping arrays so that their shapes are compatible, is especially useful for modeling tasks such as linear regression.
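As one illustration of how this can show up in a regression setting, the sketch below centers the columns of a small data matrix and computes linear predictions. The data X, the weights w, and the intercept bias are all hypothetical values invented for this example; nothing here is fitted to real data.

```python
import numpy as np

# Hypothetical data: 4 samples, 2 features.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0]])
w = np.array([0.5, 0.1])   # hypothetical weights, shape (2,)
bias = 2.0                 # hypothetical intercept, a scalar

# The column means have shape (2,) and are broadcast across all 4 rows.
X_centered = X - X.mean(axis=0)

# X @ w has shape (4,); the scalar bias is broadcast to every prediction.
y_pred = X @ w + bias

print(X_centered)
print(y_pred)
```

Without broadcasting, both steps would need the means and the intercept to be repeated explicitly to match the shape of X or of the predictions.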