Wondering how to count the number of generations in a population? Pedigree depth is an important quantity in population studies. Overlapping generations make the count less obvious, but the logic and the process are simple.
There are two ways to approach this: top-bottom (starting from the founders) and bottom-top (starting from the animals that are not parents). They differ only in the direction of counting.
Top-bottom approach
- Discard animals with both parents missing.
- Set parents that no longer appear in the pedigree (i.e., not in the first column) to missing.
- Count one generation.
- Repeat until no animals remain in the pedigree.
Here’s the R code, assuming ped is a pedigree data frame with animal, sire, and dam columns (in that order, since the code indexes columns by position), and missing parents are denoted as 0:
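i <- 0
while (nrow(ped) > 0) {
    i <- i + 1
    nrowped <- nrow(ped)
    # Step 1: discard animals with both parents missing
    ped <- ped[ped[, 2] != 0 | ped[, 3] != 0, ]
    # Step 2: set parents that are no longer listed as animals to missing
    ped[!ped[, 2] %in% ped[, 1], 2] <- 0
    ped[!ped[, 3] %in% ped[, 1], 3] <- 0
    # Step 3: the animals removed in this pass form generation i
    print(paste(nrowped - nrow(ped), "individuals in generation", i))
}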
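The loop above prints counts and shrinks ped as it goes. If a generation number per animal is more useful, the same steps can be sketched on plain vectors so that ped stays intact. This is only a minimal sketch: the function name gen_top_bottom and the toy pedigree are made up for illustration, it assumes the same layout as above (animal, sire, dam in that order, 0 for missing), and the stop() guard against looped pedigrees is an extra safeguard that is not part of the steps listed.

```r
# Hypothetical helper: top-bottom generation number for every animal.
# Assumes columns animal, sire, dam (in that order) and 0 = missing parent.
gen_top_bottom <- function(ped) {
    id <- ped[, 1]
    sire <- ped[, 2]
    dam <- ped[, 3]
    gen <- rep(NA_integer_, length(id))
    left <- rep(TRUE, length(id))   # animals still in the pedigree
    i <- 0L
    while (any(left)) {
        i <- i + 1L
        # Animals with both parents missing leave the pedigree in this pass
        founder <- left & sire == 0 & dam == 0
        # From pass 2 onwards a valid pedigree always has at least one founder
        if (i > 1L && !any(founder)) stop("no progress: pedigree loop?")
        gen[founder] <- i
        left <- left & !founder
        # Parents that are no longer in the pedigree become missing
        sire[!sire %in% id[left]] <- 0
        dam[!dam %in% id[left]] <- 0
    }
    gen
}

# Toy pedigree with overlapping generations (animal 1 sires both 3 and 4,
# and animal 3 is the dam of 4 and 5)
toy <- data.frame(animal = 1:6,
                  sire   = c(0, 0, 1, 1, 4, 4),
                  dam    = c(0, 0, 2, 3, 3, 5))
gen <- gen_top_bottom(toy)
gen          # 1 1 2 3 4 5
max(gen)     # pedigree depth: 5
```

Here table(gen) gives the number of individuals per generation, which matches what the loop prints.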
Bottom-top approach
- Discard non-parents.
- Count one generation.
- Repeat until no animals remain in the pedigree.
Here’s the R code. Start again from the full pedigree, because the top-bottom loop leaves ped empty:
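i <- 0
while (nrow(ped) > 0) {
    nrowped <- nrow(ped)
    # Keep only animals that appear as a sire or dam of an animal still in ped
    ped <- ped[ped[, 1] %in% c(unique(ped[, 2]), unique(ped[, 3])), ]
    # The non-parents removed in this pass form generation n - i
    print(paste(nrowped - nrow(ped), "individuals in generation n -", i))
    i <- i + 1
}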
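As with the top-bottom loop, this one can be sketched as a function that leaves ped untouched and returns, for each animal, how many passes from the bottom it was removed in (0 for the first batch of non-parents). The name gen_bottom_top and the toy pedigree are hypothetical, the column layout is assumed to be animal, sire, dam with 0 for missing, and the stop() guard is an added safeguard rather than part of the steps above.

```r
# Hypothetical helper: bottom-top position for every animal
# (0 = removed in the first pass, i.e. generation n - 0).
# Assumes columns animal, sire, dam (in that order) and 0 = missing parent.
gen_bottom_top <- function(ped) {
    id <- ped[, 1]
    sire <- ped[, 2]
    dam <- ped[, 3]
    back <- rep(NA_integer_, length(id))
    left <- rep(TRUE, length(id))   # animals still in the pedigree
    i <- 0L
    while (any(left)) {
        # Parents of the animals that are still in the pedigree
        is_parent <- id %in% c(sire[left], dam[left])
        non_parent <- left & !is_parent
        # A valid pedigree always has at least one non-parent left
        if (!any(non_parent)) stop("no progress: pedigree loop?")
        back[non_parent] <- i
        left <- left & is_parent
        i <- i + 1L
    }
    back
}

# Toy pedigree with overlapping generations
toy <- data.frame(animal = 1:6,
                  sire   = c(0, 0, 1, 1, 4, 4),
                  dam    = c(0, 0, 2, 3, 3, 5))
back <- gen_bottom_top(toy)
back             # 4 4 3 2 1 0
max(back) + 1    # pedigree depth: 5
```

The depth is max(back) + 1 because counting starts at 0, just as the printed labels start at "n - 0".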
I tested both approaches on a large pedigree and found that the bottom-top approach is considerably faster, mainly because it does not need the second step of the top-bottom method (setting parents that no longer appear in the pedigree to missing).
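To repeat this kind of comparison on your own machine, something like the sketch below could be used. Everything in it beyond the two loops is an assumption made for illustration: the simulated pedigree has discrete generations of equal size, the names sim, depth_top_bottom and depth_bottom_top are made up, and the early break in the top-bottom function is only there to skip the parent update once nothing is left. Timings depend on the machine and the pedigree, so none are quoted here.

```r
# Simulate a simple pedigree with discrete generations (an assumption made
# only to have something to time; real pedigrees usually overlap).
set.seed(1)
n_gen <- 10
n_per <- 1000
half <- n_per %/% 2
sim <- data.frame(animal = seq_len(n_gen * n_per), sire = 0, dam = 0)
for (g in 2:n_gen) {
    rows <- ((g - 1) * n_per + 1):(g * n_per)
    prev <- ((g - 2) * n_per + 1):((g - 1) * n_per)
    # First half of the previous generation act as sires, second half as dams
    sim$sire[rows] <- sample(prev[1:half], n_per, replace = TRUE)
    sim$dam[rows] <- sample(prev[(half + 1):n_per], n_per, replace = TRUE)
}

# Top-bottom loop without printing; returns the number of generations
depth_top_bottom <- function(ped) {
    i <- 0
    while (nrow(ped) > 0) {
        i <- i + 1
        ped <- ped[ped[, 2] != 0 | ped[, 3] != 0, ]
        if (nrow(ped) == 0) break   # nothing left to update
        ped[!ped[, 2] %in% ped[, 1], 2] <- 0
        ped[!ped[, 3] %in% ped[, 1], 3] <- 0
    }
    i
}

# Bottom-top loop without printing; returns the number of generations
depth_bottom_top <- function(ped) {
    i <- 0
    while (nrow(ped) > 0) {
        ped <- ped[ped[, 1] %in% c(ped[, 2], ped[, 3]), ]
        i <- i + 1
    }
    i
}

# Each function works on its own copy of sim, so sim is not modified
time_tb <- system.time(depth_tb <- depth_top_bottom(sim))["elapsed"]
time_bt <- system.time(depth_bt <- depth_bottom_top(sim))["elapsed"]
c(top_bottom = depth_tb, bottom_top = depth_bt)   # both should equal n_gen
c(top_bottom = time_tb, bottom_top = time_bt)     # elapsed seconds
```

Both functions should report the same depth; only the elapsed times are expected to differ.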