Here, I introduce an R function for calculating minor allele frequencies (MAF).

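calcmaf <- function(M, col1ID = TRUE) {
    # Mean genotype code (0, 1 or 2) divided by 2 gives the allele frequency
    if (col1ID) {
        # Skip the ID column; drop = FALSE keeps a single remaining SNP as a column
        maf <- colMeans(M[, -1, drop = FALSE]) / 2
    } else {
        maf <- colMeans(M) / 2
    }
    # A frequency above 0.5 belongs to the major allele, so take its complement
    maf[maf > 0.5] <- 1 - maf[maf > 0.5]
    return(unname(maf))
}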

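The calcmaf function takes two arguments, M and col1ID. M is the genotype data frame, with genotypes coded as 0, 1 or 2 (the number of copies of the counted allele). col1ID is TRUE or FALSE; if TRUE (the default), the first column of M is treated as the animal ID and excluded from the calculation.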
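The calculation can also be written with pmin(), which makes the "smaller of p and 1 - p" step explicit. The following is only a sketch: the name calcmaf_na, the toy data frame G, and the choice to ignore missing genotypes via na.rm = TRUE are my assumptions for illustration and are not part of the function above.

```r
# Hypothetical variant of calcmaf(); the name and the NA handling are
# illustrative assumptions, not part of the original function.
calcmaf_na <- function(M, col1ID = TRUE) {
    if (col1ID) M <- M[, -1, drop = FALSE]   # drop the ID column
    # Frequency of the counted allele: mean genotype code / 2,
    # computed over the non-missing genotypes of each SNP.
    p <- colMeans(as.matrix(M), na.rm = TRUE) / 2
    # The minor allele frequency is the smaller of p and 1 - p.
    unname(pmin(p, 1 - p))
}

# Toy data (made up): 4 animals, 2 SNPs, one missing genotype.
G <- data.frame(id = 1:4, snp1 = c(0, 1, 2, 2), snp2 = c(0, 0, NA, 1))
calcmaf_na(G)   # snp1: 0.375; snp2: 1/6 (about 0.167)
```

Whether missing genotypes should be skipped, imputed, or treated as an error depends on the data set, so take the na.rm choice as one option rather than a recommendation.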

Let’s create an example genotype data frame for 10 animals and 20 SNPs, where the first column is animal ID.

set.seed(995)
M <- matrix(sample(0:2, 200, replace = TRUE), nrow = 10)
(M <- as.data.frame(cbind(101:110, M)))
    V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21
1  101  2  0  2  1  0  2  1  1   1   0   1   1   2   1   0   1   2   0   1   1
2  102  1  2  0  0  1  1  1  2   2   0   0   2   2   0   1   2   2   2   1   0
3  103  0  0  2  1  0  1  2  1   0   2   2   1   1   0   0   0   0   2   0   1
4  104  1  1  2  1  0  2  2  1   1   1   0   1   2   1   0   2   2   0   2   0
5  105  0  2  1  1  2  2  2  1   1   2   1   0   2   2   2   1   2   0   1   1
6  106  0  1  2  0  0  0  2  0   0   0   0   1   0   1   0   1   1   1   1   2
7  107  2  2  2  2  0  0  1  0   0   2   1   0   2   2   1   0   2   0   0   0
8  108  2  1  2  1  1  1  0  1   0   0   2   1   0   1   1   2   2   2   0   1
9  109  2  1  0  2  0  0  0  0   0   0   0   0   0   2   0   1   0   0   2   2
10 110  2  2  0  2  0  0  1  0   0   1   0   1   1   2   0   1   1   2   0   2

Now, calculate MAF:

calcmaf(M) # equivalent to calcmaf(M, col1ID = TRUE)
 [1] 0.40 0.40 0.35 0.45 0.20 0.45 0.40 0.35 0.25 0.40 0.35 0.40 0.40 0.40 0.25
[16] 0.45 0.30 0.45 0.40 0.50
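As a quick sanity check, the first value can be reproduced by hand from the printed data frame. The vector v2 below is simply column V2 copied from the output above; the variable names are mine.

```r
# Genotypes of the first SNP (column V2), copied from the printed data frame.
v2 <- c(2, 1, 0, 1, 0, 0, 2, 2, 2, 2)

# 12 copies of the counted allele out of 2 * 10 = 20 alleles, so p = 0.6.
p <- sum(v2) / (2 * length(v2))

# The minor allele is the other one, with frequency 1 - 0.6 = 0.4.
min(p, 1 - p)
```

This agrees with the first element returned by calcmaf(M).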

Suppose instead that the data frame contains only genotypes, with no animal ID column (here, M[,-1]). Then:

calcmaf(M = M[,-1], col1ID = FALSE)
 [1] 0.40 0.40 0.35 0.45 0.20 0.45 0.40 0.35 0.25 0.40 0.35 0.40 0.40 0.40 0.25
[16] 0.45 0.30 0.45 0.40 0.50
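One thing to watch: because col1ID defaults to TRUE, calling the function on a genotype-only data frame without setting col1ID = FALSE silently drops the first SNP. A small sketch of this, with calcmaf() repeated so the snippet runs on its own and a made-up data frame G (the seed and the name are arbitrary):

```r
# calcmaf() as defined earlier, repeated so this snippet is self-contained.
calcmaf <- function(M, col1ID = TRUE) {
    if (col1ID) {
        maf <- colMeans(M[, -1]) / 2
    } else {
        maf <- colMeans(M) / 2
    }
    maf[maf > 0.5] <- 1 - maf[maf > 0.5]
    return(unname(maf))
}

# Hypothetical genotype-only data frame: 10 animals, 20 SNPs, no ID column.
set.seed(1)
G <- as.data.frame(matrix(sample(0:2, 200, replace = TRUE), nrow = 10))

length(calcmaf(G, col1ID = FALSE))  # 20: one MAF per SNP
length(calcmaf(G))                  # 19: the first SNP was treated as an ID
```

Comparing the length of the result with the expected number of SNPs is a cheap way to catch this.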