Scala Machine Learning Projects by Md. Rezaul Karim

1000 Genomes Project dataset description

The 1000 Genomes Project data is a very large catalog of human genetic variants. The project aims to identify genetic variants with frequencies higher than 1% in the populations studied. The data is openly and freely available to scientists worldwide through public data repositories. It is also widely used to screen variants discovered in exome data from individuals with genetic disorders and in cancer genome projects.

The genotype dataset, in Variant Call Format (VCF), provides data on human individuals (that is, samples) and their genetic variants, along with the global allele frequencies as well as the ones for the super ...
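To make the VCF layout described above more concrete, here is a minimal Scala sketch that parses a single tab-separated VCF data line into a case class. It assumes the standard fixed VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO), followed by a FORMAT column and one column per sample, and it assumes the global allele frequency is stored under the INFO key `AF`. The names `VcfRecord`, `VcfSketch`, `parseLine`, and `alleleFrequencies` are hypothetical helpers introduced for illustration only; this is not the book's own code, and a real project would more likely rely on a dedicated genomics library.

```scala
// Hypothetical illustration: a simplified view of one VCF data line.
final case class VcfRecord(
  chrom: String,
  pos: Long,
  id: String,
  ref: String,
  alt: List[String],             // one or more alternate alleles
  info: Map[String, String],     // INFO key -> value ("" for flag-style keys)
  genotypes: Map[String, String] // sample ID -> genotype call, e.g. "0|1"
)

object VcfSketch {

  // Parses one data line. Sample IDs normally come from the "#CHROM ..." header
  // line; here the caller passes them in explicitly to keep the sketch short.
  def parseLine(line: String, sampleIds: Seq[String]): VcfRecord = {
    val cols = line.split("\t", -1)
    require(cols.length >= 8, s"expected at least 8 columns, got ${cols.length}")

    // INFO is a semicolon-separated list of key=value pairs or bare flags;
    // a single "." means the field is missing.
    val info: Map[String, String] = cols(7)
      .split(";")
      .filter(entry => entry.nonEmpty && entry != ".")
      .map { entry =>
        entry.split("=", 2) match {
          case Array(k, v) => k -> v
          case _           => entry -> ""
        }
      }
      .toMap

    // FORMAT (column index 8) names the colon-separated sub-fields of each sample column.
    val formatKeys = if (cols.length > 8) cols(8).split(":") else Array.empty[String]
    val gtIndex = formatKeys.indexOf("GT")
    val genotypes: Map[String, String] =
      if (gtIndex < 0) Map.empty[String, String]
      else sampleIds.zip(cols.drop(9)).map { case (sample, field) =>
        sample -> field.split(":")(gtIndex)
      }.toMap

    VcfRecord(cols(0), cols(1).toLong, cols(2), cols(3),
      cols(4).split(",").toList, info, genotypes)
  }

  // Assumption: allele frequencies live under the INFO key "AF",
  // with one comma-separated value per alternate allele.
  def alleleFrequencies(record: VcfRecord): List[Double] =
    record.info.get("AF").toList
      .flatMap(_.split(","))
      .flatMap(s => scala.util.Try(s.toDouble).toOption)
}
```

As a usage example, calling `VcfSketch.parseLine` on a made-up line such as `"1\t10177\t.\tA\tAC\t100\tPASS\tAF=0.25\tGT\t1|0\t0|0"` with the sample IDs `Seq("sampleA", "sampleB")` yields a record whose `genotypes` map links each sample to its call, and `alleleFrequencies` then exposes the `AF` values so that variants can be compared against a frequency threshold such as the 1% mentioned earlier.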
