For better or worse, having lots of time inside during the COVID-19 Pandemic has led me to tackle some of my outstanding R coding problems. Here’s one possible solution I’ve come up with for generating random community abundance matrices following one of the suggested approaches in Ulrich and Gotelli 2010.

# Implementing the IT random matrix filling / creation algorithm from Ulrich and Gotelli 2010, Ecology.

To quote from the text:

IT assigns individuals randomly to matrix cells with probabilities proportional to observed row and column abundance totals until, for each row and column, total abundances are reached.

## Approach

To sample with “probabilities proportional to observed row and column abundance totals”, I used the `chisq.test`

function. The expected values in a matrix (or contingency table) are a function of the row and column abundances. I believe this approach is supported by Ulrich and Gotelli’s comparison of the results they found using the IT algorithm compared to \(\chi^2\) analysis for 200 random matrices (p. 3395).

Using these **expected** values, we can generate matrices comprised of all zero values *except* a single 1 value for a random matrix element. The element that is assigned a value of 1 will be decided based on a random draw from the matrix indices that is weighted by the *expected* values in the original matrix. Next, by adding a series of these random matrices we can make a new matrix with the same **matrix sum** as the original matrix. Finally, using an `if`

statement, we can do this while assuring that the row and column abundances remain the same as the original matrix.

## Working through an example

Below I present two examples of using this approach. The first comes from the help file for the `chisq.test`

function, with some small modifications.

`chisq.test`

example

First thing to do is to use `chisq.test`

to get the **expected** counts for any matrix. These expected counts are a function of the row and column sums:

\[ \text{expected count} = \frac{\text{row sum} \times \text{column sum}}{\text{matrix sum}} \]

```
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
Xsq <- chisq.test(M) # Prints test summary
# Xsq$observed # observed counts (same as M)
Xsq$expected # expected counts under the null
```

```
## A B C
## A 703.6714 319.6453 533.6834
## B 542.3286 246.3547 411.3166
```

Next, we use a `while`

loop to make a new matrix with a matrix sum equal to that of the original matrix. Within the `while`

loop, new matrices are created with all zeros *except* a randomly placed value of 1. We’re adding an `if`

statement to “throw out” new zero matrices that result in row or column sums that are greater than those for the original matrix.

```
# Make a new matrix of all zeros
new_mat <- M * 0
while(sum(new_mat) < sum(M)){
zero_mat <- M * 0
zero_mat[sample(1:length(Xsq$expected), size = 1, prob = Xsq$expected)] <- 1
new_mat_temp <- new_mat + zero_mat
if( !any((rowSums(new_mat_temp) - rowSums(Xsq$expected)) > 0 ) &
!any((colSums(new_mat_temp) - colSums(Xsq$expected)) > 0 )) {
new_mat <- new_mat_temp
}
}
new_mat
```

```
## A B C
## A 706 329 522
## B 540 237 423
```

Finally, let’s do just a few checks to make sure the matrix sums are the same and the row and column sums are the same.

`sum(new_mat) == sum(M)`

`## [1] TRUE`

`all(rowSums(new_mat) == rowSums(Xsq$expected))`

`## [1] TRUE`

`all(colSums(new_mat) == colSums(Xsq$expected))`

`## [1] TRUE`

### A larger random matrix

Here I’m applying the same setup as above on a 10 x 10 matrix, with shuffled values between 1 and 100.

```
rand_matr <- matrix(sample(1:100), nrow = 10, ncol = 10)
rand_matr_exp <- chisq.test(rand_matr)$expected
# Make a new matrix of all zeros
new_mat <- rand_matr * 0
while(sum(new_mat) < sum(rand_matr)){
zero_mat <- rand_matr * 0
zero_mat[sample(1:length(rand_matr_exp), size = 1, prob = rand_matr_exp)] <- 1
new_mat_temp <- new_mat + zero_mat
if( !any((rowSums(new_mat_temp) - rowSums(rand_matr_exp)) > 0 ) &
!any((colSums(new_mat_temp) - colSums(rand_matr_exp)) > 0 )) {
new_mat <- new_mat_temp
}
}
new_mat
```

```
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 41 35 53 39 39 50 42 39 48 35
## [2,] 44 50 73 37 39 54 45 55 46 40
## [3,] 35 45 70 39 53 58 54 52 40 42
## [4,] 50 41 88 61 52 62 48 35 52 44
## [5,] 60 55 64 46 71 74 46 68 51 64
## [6,] 42 38 67 45 51 54 47 53 53 48
## [7,] 48 38 74 53 71 88 53 48 49 44
## [8,] 36 33 51 40 51 61 42 53 43 40
## [9,] 38 33 55 50 45 50 42 40 49 34
## [10,] 50 43 100 49 49 60 60 73 49 43
```

`sum(new_mat) == sum(rand_matr)`

`## [1] TRUE`

`all(rowSums(new_mat) == rowSums(rand_matr_exp))`

`## [1] TRUE`

`all(colSums(new_mat) == colSums(rand_matr_exp))`

`## [1] TRUE`

## Making a function for this

Let’s make this into a function

```
it_randomize <- function(matr){
# perform chisq test to get expected values
matr_exp <- chisq.test(matr)$expected
# make a new matrix of all zeros
new_mat <- matr * 0
# use a while loop to add matrices with a single, randomly
# placed value of 1 until
# the new_mat sum is the same as the original matrix
while(sum(new_mat) < sum(matr)){
# make a random matrix
# this is done by randomly choosing a matrix index value, weighted
# by the expected matrix values at each index
zero_mat <- matr * 0
zero_mat[sample(1:length(matr_exp), size = 1, prob = matr_exp)] <- 1
# add the random matrix to the new_mat temporalily
new_mat_temp <- new_mat + zero_mat
# keep new_mat_temp if it doesn't result in over filling the
# row or column sums
if( !any((rowSums(new_mat_temp) - rowSums(matr_exp)) > 0 ) &
!any((colSums(new_mat_temp) - colSums(matr_exp)) > 0 )) {
new_mat <- new_mat_temp
}
}
return(new_mat)
}
```

Apply the new function

```
new_mat <- it_randomize(M)
new_mat
```

```
## A B C
## A 689 320 548
## B 557 246 397
```

`sum(new_mat) == sum(M)`

`## [1] TRUE`

`all(rowSums(new_mat) == rowSums(Xsq$expected))`

`## [1] TRUE`

`all(colSums(new_mat) == colSums(Xsq$expected))`

`## [1] TRUE`

```
new_mat <- it_randomize(rand_matr)
new_mat
```

```
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 38 36 70 38 50 41 39 46 30 33
## [2,] 55 38 78 36 46 61 36 46 47 40
## [3,] 35 28 77 43 53 67 59 40 48 38
## [4,] 40 45 65 47 50 70 42 62 58 54
## [5,] 58 40 72 51 68 69 56 78 55 52
## [6,] 47 25 72 48 45 70 50 40 55 46
## [7,] 61 58 78 64 55 65 47 53 48 37
## [8,] 37 49 57 40 38 48 61 36 45 39
## [9,] 27 33 58 34 55 49 46 44 43 47
## [10,] 46 59 68 58 61 71 43 71 51 48
```

`sum(new_mat) == sum(rand_matr)`

`## [1] TRUE`

`all(rowSums(new_mat) == rowSums(rand_matr_exp))`

`## [1] TRUE`

`all(colSums(new_mat) == colSums(rand_matr_exp))`

`## [1] TRUE`