For better or worse, having lots of time inside during the COVID-19 Pandemic has led me to tackle some of my outstanding R coding problems. Here’s one possible solution I’ve come up with for generating random community abundance matrices following one of the suggested approaches in Ulrich and Gotelli 2010.

# Implementing the IT random matrix filling / creation algorithm from Ulrich and Gotelli 2010, Ecology.

To quote from the text:

IT assigns individuals randomly to matrix cells with probabilities proportional to observed row and column abundance totals until, for each row and column, total abundances are reached.

## Approach

To sample with “probabilities proportional to observed row and column abundance totals”, I used the `chisq.test`

function. The expected values in a matrix (or contingency table) are a function of the row and column abundances. I believe this approach is supported by Ulrich and Gotelli’s comparison of the results they found using the IT algorithm compared to \(\chi^2\) analysis for 200 random matrices (p. 3395).

Using these **expected** values, we can generate matrices comprised of all zero values *except* a single 1 value for a random matrix element. The element that is assigned a value of 1 will be decided based on a random draw from the matrix indices that is weighted by the *expected* values in the original matrix. Next, by adding a series of these random matrices we can make a new matrix with the same **matrix sum** as the original matrix. Finally, using an `if`

statement, we can do this while assuring that the row and column abundances remain the same as the original matrix.

## Working through an example

Below I present two examples of using this approach. The first comes from the help file for the `chisq.test`

function, with some small modifications.

`chisq.test`

example

First thing to do is to use `chisq.test`

to get the **expected** counts for any matrix. These expected counts are a function of the row and column sums:

\[ \text{expected count} = \frac{\text{row sum} \times \text{column sum}}{\text{matrix sum}} \]

```
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
Xsq <- chisq.test(M) # Prints test summary
# Xsq$observed # observed counts (same as M)
Xsq$expected # expected counts under the null
```

```
## A B C
## A 703.6714 319.6453 533.6834
## B 542.3286 246.3547 411.3166
```

Next, we use a `while`

loop to make a new matrix with a matrix sum equal to that of the original matrix. Within the `while`

loop, new matrices are created with all zeros *except* a randomly placed value of 1. We’re adding an `if`

statement to “throw out” new zero matrices that result in row or column sums that are greater than those for the original matrix.

```
# Make a new matrix of all zeros
new_mat <- M * 0
while(sum(new_mat) < sum(M)){
zero_mat <- M * 0
zero_mat[sample(1:length(Xsq$expected), size = 1, prob = Xsq$expected)] <- 1
new_mat_temp <- new_mat + zero_mat
if( !any((rowSums(new_mat_temp) - rowSums(Xsq$expected)) > 0 ) &
!any((colSums(new_mat_temp) - colSums(Xsq$expected)) > 0 )) {
new_mat <- new_mat_temp
}
}
new_mat
```

```
## A B C
## A 707 309 541
## B 539 257 404
```

Finally, let’s do just a few checks to make sure the matrix sums are the same and the row and column sums are the same.

`sum(new_mat) == sum(M)`

`## [1] TRUE`

`all(rowSums(new_mat) == rowSums(Xsq$expected))`

`## [1] TRUE`

`all(colSums(new_mat) == colSums(Xsq$expected))`

`## [1] TRUE`

### A larger random matrix

Here I’m applying the same setup as above on a 10 x 10 matrix, with shuffled values between 1 and 100.

```
rand_matr <- matrix(sample(1:100), nrow = 10, ncol = 10)
rand_matr_exp <- chisq.test(rand_matr)$expected
# Make a new matrix of all zeros
new_mat <- rand_matr * 0
while(sum(new_mat) < sum(rand_matr)){
zero_mat <- rand_matr * 0
zero_mat[sample(1:length(rand_matr_exp), size = 1, prob = rand_matr_exp)] <- 1
new_mat_temp <- new_mat + zero_mat
if( !any((rowSums(new_mat_temp) - rowSums(rand_matr_exp)) > 0 ) &
!any((colSums(new_mat_temp) - colSums(rand_matr_exp)) > 0 )) {
new_mat <- new_mat_temp
}
}
new_mat
```

```
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 32 36 44 66 65 45 52 30 43 24
## [2,] 47 52 68 61 70 70 82 40 58 50
## [3,] 35 56 44 48 75 51 61 40 53 49
## [4,] 38 53 70 70 67 69 67 43 55 45
## [5,] 39 47 58 48 84 42 64 48 50 45
## [6,] 15 39 32 48 48 35 51 22 34 27
## [7,] 48 63 73 75 106 83 67 48 52 55
## [8,] 44 50 67 60 75 67 59 41 50 49
## [9,] 29 33 22 42 38 39 34 22 35 26
## [10,] 41 42 54 53 52 52 50 43 66 45
```

`sum(new_mat) == sum(rand_matr)`

`## [1] TRUE`

`all(rowSums(new_mat) == rowSums(rand_matr_exp))`

`## [1] TRUE`

`all(colSums(new_mat) == colSums(rand_matr_exp))`

`## [1] TRUE`

## Making a function for this

Let’s make this into a function

```
it_randomize <- function(matr){
# perform chisq test to get expected values
matr_exp <- chisq.test(matr)$expected
# make a new matrix of all zeros
new_mat <- matr * 0
# use a while loop to add matrices with a single, randomly
# placed value of 1 until
# the new_mat sum is the same as the original matrix
while(sum(new_mat) < sum(matr)){
# make a random matrix
# this is done by randomly choosing a matrix index value, weighted
# by the expected matrix values at each index
zero_mat <- matr * 0
zero_mat[sample(1:length(matr_exp), size = 1, prob = matr_exp)] <- 1
# add the random matrix to the new_mat temporalily
new_mat_temp <- new_mat + zero_mat
# keep new_mat_temp if it doesn't result in over filling the
# row or column sums
if( !any((rowSums(new_mat_temp) - rowSums(matr_exp)) > 0 ) &
!any((colSums(new_mat_temp) - colSums(matr_exp)) > 0 )) {
new_mat <- new_mat_temp
}
}
return(new_mat)
}
```

Apply the new function

```
new_mat <- it_randomize(M)
new_mat
```

```
## A B C
## A 717 306 534
## B 529 260 411
```

`sum(new_mat) == sum(M)`

`## [1] TRUE`

`all(rowSums(new_mat) == rowSums(Xsq$expected))`

`## [1] TRUE`

`all(colSums(new_mat) == colSums(Xsq$expected))`

`## [1] TRUE`

```
new_mat <- it_randomize(rand_matr)
new_mat
```

```
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 39 38 37 49 61 51 56 32 33 41
## [2,] 49 61 51 63 90 59 61 51 50 63
## [3,] 33 45 55 59 73 56 51 40 59 41
## [4,] 31 63 70 71 76 58 70 39 52 47
## [5,] 43 53 62 59 59 58 48 39 60 44
## [6,] 38 34 43 44 33 35 35 24 36 29
## [7,] 40 67 78 73 87 68 83 51 70 53
## [8,] 35 35 54 62 75 71 82 42 59 47
## [9,] 24 31 33 32 45 37 40 26 32 20
## [10,] 36 44 49 59 81 60 61 33 45 30
```

`sum(new_mat) == sum(rand_matr)`

`## [1] TRUE`

`all(rowSums(new_mat) == rowSums(rand_matr_exp))`

`## [1] TRUE`

`all(colSums(new_mat) == colSums(rand_matr_exp))`

`## [1] TRUE`