Many built-in R functions are vectorized, as are many functions from external packages.
A vectorized function operates on all elements of an object.
Vectorization is very efficient: it can save both human (your) time and machine time.
In many cases, applying a function to all elements at once may seem like the obvious or expected behavior. However, not all functions are vectorized, so check the documentation and/or test whether a function is vectorized using a simple example.
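One quick way to run such a check is to call the function on a short vector and see whether one result comes back per element. The sketch below is illustrative only: `halve_if_even()` and `halve_if_even_vec()` are hypothetical functions made up for this example, not part of R or any package.

```r
# Vectorized: sqrt() returns one result per input element
sqrt(c(1, 4, 9))

# Hypothetical non-vectorized function: `if` expects a single TRUE/FALSE
halve_if_even <- function(v) {
  if (v %% 2 == 0) v / 2 else v
}
halve_if_even(4)  # fine for a single value

# With a longer vector the `if` condition has length > 1:
# this is an error in R >= 4.2.0 and a warning in older versions
res <- try(halve_if_even(1:4), silent = TRUE)
inherits(res, "try-error")

# A vectorized alternative built on ifelse()
halve_if_even_vec <- function(v) ifelse(v %% 2 == 0, v / 2, v)
halve_if_even_vec(1:4)
```

Using `ifelse()` here is one possible design choice; it evaluates the condition for every element, which is what makes the second version safe to call on a whole vector.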
Arithmetic operations between two vectors of equal length are applied between corresponding elements of each vector:
x <- 1:10
z <- 11:20
x + z
[1] 12 14 16 18 20 22 24 26 28 30
i.e. the above is equal to c(x[1] + z[1], x[2] + z[2], ..., x[n] + z[n]).
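To make the element-wise equivalence concrete, the following sketch spells out the same addition with an explicit loop and compares it to the vectorized form. The name `out` is introduced here for illustration only.

```r
x <- 1:10
z <- 11:20

# Explicit loop: add corresponding elements one at a time
out <- integer(length(x))
for (i in seq_along(x)) {
  out[i] <- x[i] + z[i]
}

# The vectorized expression gives the same result in a single call
identical(out, x + z)
```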
Weight <- rnorm(20, mean = 80, sd = 1.7)
Weight
[1] 79.36801 81.06268 79.41230 80.10540 80.87167 79.12432 79.71534 79.99593
[9] 80.22232 78.98860 79.92995 80.51565 76.70282 80.84724 78.15982 81.09863
[17] 76.51254 79.88208 76.89691 80.46485
Height <- rnorm(20, mean = 1.7, sd = .1)
Height
[1] 1.712619 1.825532 1.740072 1.746919 1.831087 1.634120 1.607686 1.880719
[9] 1.655010 1.641103 1.686898 1.699648 1.810535 1.694773 1.666255 1.575280
[17] 1.562882 1.463314 1.549886 1.911374
BMI <- Weight/Height^2
BMI
[1] 27.05975 24.32441 26.22728 26.24921 24.12005 29.63071 30.84177 22.61621
[9] 29.28829 29.32869 28.08873 27.87161 23.39902 28.14765 28.15145 32.68119
[17] 31.32421 37.30561 32.01175 22.02494
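Because `rnorm()` draws random numbers, re-running the code above will give different values. A minimal sketch of a reproducible version follows; the seed value is arbitrary and is an assumption of this example, not something used to produce the output shown above. It also checks that each BMI value comes from the matching pair of elements, with `^` applied before `/`.

```r
set.seed(2024)  # arbitrary seed, chosen only for this example

Weight <- rnorm(20, mean = 80, sd = 1.7)
Height <- rnorm(20, mean = 1.7, sd = .1)

# `^` has higher precedence than `/`, so this is Weight / (Height^2),
# computed element by element
BMI <- Weight / Height^2

length(BMI)
all.equal(BMI[3], Weight[3] / Height[3]^2)
```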
In operations between a vector and a scalar, the scalar is repeated to match the length of the vector, i.e. it is recycled:
x + 10
[1] 11 12 13 14 15 16 17 18 19 20
x * 2
[1] 2 4 6 8 10 12 14 16 18 20
x / 10
[1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x ^ 2
[1] 1 4 9 16 25 36 49 64 81 100
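Conceptually, recycling a scalar is the same as first repeating it to the length of the vector. The sketch below makes that explicit with `rep()`; the variable `ten` is introduced here for illustration only.

```r
x <- 1:10

# Repeat the scalar so it has one value per element of x
ten <- rep(10, length(x))

# Both forms give the same result
identical(x + 10, x + ten)
identical(x^2, x^rep(2, length(x)))
```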
Operations between a vector and a scalar are a special case of operations between vectors of unequal length. Whenever you perform an operation between two objects of different length, the shorter object’s elements are recycled:
x + c(2:1)
[1] 3 3 5 5 7 7 9 9 11 11
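The recycled values can be inspected directly with `rep_len()`, which repeats a vector until it reaches a given length. This is a sketch of what recycling amounts to conceptually; the names `short` and `recycled` are introduced here for illustration.

```r
x <- 1:10
short <- 2:1

# Repeat the shorter vector until it matches the length of x
recycled <- rep_len(short, length(x))
recycled

# Adding the recycled vector explicitly matches the implicit recycling
identical(x + short, x + recycled)
```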
Operations between objects of unequal length can occur by mistake. If the longer object’s length is a multiple of the shorter object’s length, there will be no error or warning, as above. Otherwise, there is a warning (which may be confusing at first), but recycling still happens, and in that case it is highly unlikely to be intentional.
x + c(1, 3, 9)
Warning in x + c(1, 3, 9): longer object length is not a multiple of shorter
object length
[1] 2 5 12 5 8 15 8 11 18 11
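Since recycling proceeds even when it is probably a mistake, one defensive option is to check lengths before operating. The sketch below uses a hypothetical helper, `add_strict()`, written for this example only; it allows equal lengths or a scalar and stops otherwise. It also shows that the warning case above is simply the partially recycled sum.

```r
# Hypothetical helper: only allow equal lengths or a length-1 operand
add_strict <- function(a, b) {
  stopifnot(length(a) == length(b) || length(a) == 1 || length(b) == 1)
  a + b
}

x <- 1:10
add_strict(x, 10)  # scalar: allowed

# Mismatched lengths now raise an error instead of a warning
res <- try(add_strict(x, c(1, 3, 9)), silent = TRUE)
inherits(res, "try-error")

# What the warning case computes: the short vector is cut off mid-cycle
partial <- rep_len(c(1, 3, 9), length(x))
identical(suppressWarnings(x + c(1, 3, 9)), x + partial)
```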
Operations between matrices of the same dimensions are similarly vectorized, i.e. performed between corresponding elements.
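A minimal sketch of element-wise matrix operations, using two small matrices, `A` and `B`, made up for this example. Note that `*` multiplies corresponding elements; matrix multiplication uses the separate `%*%` operator.

```r
A <- matrix(1:4, nrow = 2)    # columns: (1, 2) and (3, 4)
B <- matrix(11:14, nrow = 2)  # columns: (11, 12) and (13, 14)

# Element-wise addition and multiplication
A + B
A * B

# Matrix multiplication is a different operation
A %*% B
```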
Some examples of common mathematical operations that are vectorized:
log(x)
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595 1.9459101
[8] 2.0794415 2.1972246 2.3025851
sqrt(x)
[1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751 2.828427
[9] 3.000000 3.162278
sin(x)
[1] 0.8414710 0.9092974 0.1411200 -0.7568025 -0.9589243 -0.2794155
[7] 0.6569866 0.9893582 0.4121185 -0.5440211
cos(x)
[1] 0.5403023 -0.4161468 -0.9899925 -0.6536436 0.2836622 0.9601703
[7] 0.7539023 -0.1455000 -0.9111303 -0.8390715
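Calling a vectorized function on a whole vector gives the same values as applying it to each element separately. The sketch below checks this for `log()` using `sapply()`, and shows that vectorized calls can be nested; `log_each` is a name introduced here for illustration.

```r
x <- 1:10

# Apply log() to one element at a time
log_each <- sapply(x, function(xi) log(xi))

# Same values as the single vectorized call
all.equal(log_each, log(x))

# Vectorized functions compose: round() is applied to every square root
round(sqrt(x), 2)
```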