In this short blog post, I’m going to show you a super simple way to count genotypes at each SNP, which is the first step in calculating genotype frequencies.

Let’s say we’re looking at a biallelic locus, that is, a single position in the DNA with two possible alleles, “A” and “a”. That gives us three possible genotypes: “AA”, “Aa”, and “aa”.

What we want to do is count how many times each of these genotypes shows up at each SNP.

Let’s create a random genotype matrix for 10 genotyped animals (rows) and 20 SNPs (columns), with genotypes coded as 0 (“aa”), 1 (“Aa”), and 2 (“AA”).

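set.seed(995)  # make the random draw reproducible
# 10 animals (rows) x 20 SNPs (columns); the outer parentheses print the result
(M <- matrix(sample(0:2, 200, replace = TRUE), nrow = 10))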

Which gives us (the exact numbers may differ depending on your R version):

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
 [1,]    2    0    2    1    0    2    1    1    1     0     1     1     2     1     0     1     2     0     1     1
 [2,]    1    2    0    0    1    1    1    2    2     0     0     2     2     0     1     2     2     2     1     0
 [3,]    0    0    2    1    0    1    2    1    0     2     2     1     1     0     0     0     0     2     0     1
 [4,]    1    1    2    1    0    2    2    1    1     1     0     1     2     1     0     2     2     0     2     0
 [5,]    0    2    1    1    2    2    2    1    1     2     1     0     2     2     2     1     2     0     1     1
 [6,]    0    1    2    0    0    0    2    0    0     0     0     1     0     1     0     1     1     1     1     2
 [7,]    2    2    2    2    0    0    1    0    0     2     1     0     2     2     1     0     2     0     0     0
 [8,]    2    1    2    1    1    1    0    1    0     0     2     1     0     1     1     2     2     2     0     1
 [9,]    2    1    0    2    0    0    0    0    0     0     0     0     0     2     0     1     0     0     2     2
[10,]    2    2    0    2    0    0    1    0    0     1     0     1     1     2     0     1     1     2     0     2
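
Before counting everything at once, it may help to see what the count looks like for a single SNP. Here is a small sketch for the first column of the output above, typed in by hand so it doesn't depend on the random number generator. The names snp1, geno, and tab are just made up for illustration, and the labels assume the 0 = “aa”, 1 = “Aa”, 2 = “AA” coding that the counting code in this post uses.

```r
# First SNP (column 1) of the example matrix, typed in by hand
snp1 <- c(2, 1, 0, 1, 0, 0, 2, 2, 2, 2)

# Attach genotype labels to the 0/1/2 codes (assumed coding: 0 = aa, 1 = Aa, 2 = AA)
geno <- factor(snp1, levels = 0:2, labels = c("aa", "Aa", "AA"))

# Tabulate how many animals carry each genotype at this SNP
tab <- table(geno)
print(tab)
# aa Aa AA
#  3  2  5
```

Doing this one column at a time works, but it gets tedious with many SNPs.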

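M can be a regular matrix, a Matrix (from the Matrix package, with the package loaded), a data.frame, or a data.table. The comparison M == 0 returns TRUE wherever the genotype is 0, and colSums() adds up the TRUEs in each column, so we can count each genotype at each SNP in just three simple lines of R code: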

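(aa <- colSums(M == 0))  # animals with genotype "aa" at each SNP
(Aa <- colSums(M == 1))  # animals with genotype "Aa" at each SNP
(AA <- colSums(M == 2))  # animals with genotype "AA" at each SNP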

And here are the results, one line each for “aa”, “Aa”, and “AA”, with one count per SNP:

[1] 3 2 3 2 7 4 2 4 6 5 5 3 3 2 6 2 2 5 4 3
[1] 2 4 1 5 2 3 4 5 3 2 3 6 2 4 3 5 2 1 4 4
[1] 5 4 6 3 1 3 4 1 1 3 2 1 5 4 1 3 6 4 2 3
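
If you want frequencies rather than counts, one way is to divide each count by the number of animals with a genotype call at that SNP. The snippet below is a minimal sketch, not part of the three-liner above: it re-enters the first three SNPs of the example matrix by hand, and the names M3, counts, n_called, and freqs are made up for illustration. It also assumes that missing genotypes, if there are any, are coded as NA.

```r
# First three SNPs (columns) of the example matrix, typed in by hand
M3 <- cbind(c(2, 1, 0, 1, 0, 0, 2, 2, 2, 2),
            c(0, 2, 0, 1, 2, 1, 2, 1, 1, 2),
            c(2, 0, 2, 2, 1, 2, 2, 2, 0, 0))

# Genotype counts per SNP; na.rm = TRUE skips missing calls (assumed coded as NA)
counts <- rbind(aa = colSums(M3 == 0, na.rm = TRUE),
                Aa = colSums(M3 == 1, na.rm = TRUE),
                AA = colSums(M3 == 2, na.rm = TRUE))

# Number of animals with a non-missing genotype at each SNP
n_called <- colSums(!is.na(M3))

# Divide each column of counts by that SNP's number of called genotypes
freqs <- sweep(counts, 2, n_called, "/")
print(freqs)
```

Each column of freqs should add up to 1, so colSums(freqs) makes for an easy sanity check.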

There you have it! A super quick way to get those genotype counts. Hope that was helpful!