In total, 5,693,067 phylogenetic trees from 12 databases, covering 14,887,069 genes from 2,435 species, were processed to compute a comprehensive set of 2,286,510,661 orthologs and paralogs. In brief, computing meta-predictions involves three steps:

  1. cross-link proteomes to establish which ID (gene/protein) in one repository corresponds to which ID (gene/protein) in another repository
  2. parse trees from all repositories and retrieve homologs
  3. combine homology information from multiple repositories into meta-predictions
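
The three steps above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the repository names (`repoA`, `repoB`), the IDs, the cross-link table `XREF`, and the function `combine_predictions` are hypothetical, and tree parsing (step 2) is replaced by hand-written homolog pairs. The vote counting shown here is one simple way to combine evidence, not necessarily the scoring used to build the actual meta-predictions.

```python
from collections import defaultdict

# Step 1 (assumed input): cross-link table mapping
# (repository, repository-specific ID) -> shared reference ID.
XREF = {
    ("repoA", "A1"): "P1",
    ("repoA", "A2"): "P2",
    ("repoA", "A3"): "P3",
    ("repoB", "B1"): "P1",
    ("repoB", "B2"): "P2",
    ("repoB", "B3"): "P3",
}

# Step 2 (stand-in): homolog pairs as they might look after parsing
# each repository's trees; here they are simply written by hand.
REPO_PREDICTIONS = {
    "repoA": [("A1", "A2", "ortholog"), ("A1", "A3", "paralog")],
    "repoB": [("B2", "B1", "ortholog"), ("B1", "B3", "ortholog")],
}


def combine_predictions(xref, repo_predictions):
    """Step 3: count, per gene pair, how many repositories support each relation."""
    support = defaultdict(lambda: defaultdict(set))
    for repo, predictions in repo_predictions.items():
        for id1, id2, relation in predictions:
            ref1 = xref.get((repo, id1))
            ref2 = xref.get((repo, id2))
            if ref1 is None or ref2 is None:
                continue  # skip IDs that could not be cross-linked
            pair = tuple(sorted((ref1, ref2)))  # make the pair order-independent
            support[pair][relation].add(repo)
    # Convert sets of supporting repositories into plain counts.
    return {
        pair: {relation: len(repos) for relation, repos in by_relation.items()}
        for pair, by_relation in support.items()
    }


if __name__ == "__main__":
    for pair, counts in sorted(combine_predictions(XREF, REPO_PREDICTIONS).items()):
        print(pair, counts)
```

In this toy data both repositories agree that `P1` and `P2` are orthologs, while they disagree about `P1` and `P3`. Disagreements of this kind are what a meta-prediction has to resolve.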