GRAF: GRAF (Genetic Relationship and Fingerprinting) is a C++ program that uses genotypes to quickly find closely related subjects, infer subject ancestry, and determine subject sex, and that compares these genotype-derived results with those reported in the phenotype datasets. It includes the following three features:
  1. GRAF-rel (included in all versions).
    Finds duplicate samples (or identical twins) and closely related subjects using genotypes and compares them with those reported in the pedigree file and the subject-sample mapping (SSM) file.

  2. GRAF-pop (included in version 2.0 and newer).
    Infers subject ancestry from genotypes, estimates population structure and uses the results to validate the self-reported populations in the phenotype datasets.

  3. GRAF-sex (included in version 2.4).
    Determines subject sexes using the genotypes and uses them to validate the self-reported sexes in phenotype datasets.

For a more detailed description, see GRAF_README.

Click the following link to download GRAF 2.4.

  1. Jin Y, Schäffer AA, Sherry ST, and Feolo M (2017). Quickly identifying identical and closely related subjects in large databases using genotype data. PLoS One. 12(6):e0179106.
  2. Jin Y, Schäffer AA, Feolo M, Holmes JB, and Kattman BL (2019). GRAF-pop: A Fast Distance-based Method to Infer Subject Ancestry from Multiple Genotype Datasets without Principal Components Analysis. G3: Genes | Genomes | Genetics. DOI: 10.1534/g3.118.200925.

TransEAV: dbGaP requires that submitted phenotypic datasets be rectangular tables in which each row represents one subject or sample, each column represents a phenotypic trait or attribute (called a variable in dbGaP), and each cell stores one attribute value. However, datasets are sometimes collected and recorded using the Entity-Attribute-Value (EAV) model. In the EAV model, a dataset table usually has three columns: subject (or sample), attribute, and value, and each row stores only one attribute value for one subject or sample. This script converts a dataset in the EAV model to a rectangular table that can be submitted to dbGaP. For a more detailed description, please see README.txt in the package.

Click the following link to download TransEAV.
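
The EAV-to-rectangular conversion described above can be illustrated with a minimal Python sketch. This is not the TransEAV script itself, and it does not reflect TransEAV's actual options or file formats. The function name `eav_to_rectangular`, the `SUBJECT_ID` column header, the empty-string placeholder for missing values, and the choice to reject duplicate subject/attribute pairs are all assumptions made for the example.

```python
import csv
import io


def eav_to_rectangular(rows, missing=""):
    """Pivot (subject, attribute, value) triples into one row per subject.

    Hypothetical helper for illustration only; not part of TransEAV.
    """
    subjects = []    # subject IDs in first-seen order
    attributes = []  # attribute names in first-seen order
    table = {}       # subject -> {attribute: value}

    for subject, attribute, value in rows:
        if subject not in table:
            table[subject] = {}
            subjects.append(subject)
        if attribute not in attributes:
            attributes.append(attribute)
        # Assumption: a rectangular cell holds exactly one value,
        # so a repeated subject/attribute pair is treated as an error.
        if attribute in table[subject]:
            raise ValueError(f"duplicate value for {subject!r}/{attribute!r}")
        table[subject][attribute] = value

    header = ["SUBJECT_ID"] + attributes  # assumed ID column name
    body = [
        [s] + [table[s].get(a, missing) for a in attributes]
        for s in subjects
    ]
    return header, body


# Made-up EAV input: one attribute value per row.
eav_rows = [
    ("S1", "AGE", "54"),
    ("S1", "SEX", "F"),
    ("S2", "AGE", "61"),
    ("S2", "SMOKER", "no"),
]

header, body = eav_to_rectangular(eav_rows)

# Write the rectangular table as tab-delimited text.
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(header)
writer.writerows(body)
print(out.getvalue(), end="")
```

Each subject becomes one row and each distinct attribute becomes one column. Cells with no recorded value are left empty here, which is one possible convention; a real submission would follow whatever missing-value coding the dataset's data dictionary defines.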