Use Case: Adding Pangenome (Advanced)
This section describes the typical steps for loading a pangenome with multiple annotated assemblies. In this exercise, we will load a dataset for the cucumber (Cucumis sativus) pangenome with 37 annotated assemblies, described in the paper "Graph-based pangenome reveals structural variation dynamics during cucumber breeding" (Nature Genetics).
We will:
- add organism(s),
- add genomic sequences,
- add gene annotation,
- run functional annotation,
- and calculate ortholog/paralog pairs between genes in all assemblies.
Most of these steps require control files in INI format, which we will generate with the build ini command. We will also:
- add variants with 446 genotypes from a VCF file,
- and assign functional annotation to gene models by matching them to the proteins in SwissProt.
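For readers unfamiliar with INI control files, the sketch below shows the general shape of such a file and how one file per assembly could be produced in a loop. It is only an illustration written with Python's standard configparser module: the section names, keys, assembly names, and file names are all hypothetical, and the actual control files for this exercise come from the build ini command, whose layout may differ.

```python
import configparser
import io


def build_control_ini(organism, assembly, fasta_path, gff_path):
    """Return the text of an illustrative INI control file for one assembly.

    The section and key names here are hypothetical placeholders,
    not the layout produced by the real build ini command.
    """
    config = configparser.ConfigParser()
    config["organism"] = {"name": organism}
    config["assembly"] = {"name": assembly, "fasta": fasta_path}
    config["annotation"] = {"gff": gff_path}

    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder names for the 37 annotated assemblies of the pangenome.
assemblies = [f"assembly_{i:02d}" for i in range(1, 38)]

ini_texts = {
    name: build_control_ini(
        "Cucumis sativus", name, f"{name}.fa", f"{name}.gff3"
    )
    for name in assemblies
}

print(ini_texts["assembly_01"])
```

Keeping one control file per assembly makes it easy to rerun a single failed step without touching the other assemblies.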
Step 4: Create orthologs
Step 5: Add variants
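Before loading variants, it can be worth confirming that the VCF file really contains the expected number of genotypes (446 in this dataset). The following is a minimal sketch, independent of the loading tool itself, that reads the sample names from the VCF header line; the function name and the file name in the usage comment are assumptions made for illustration.

```python
import gzip

# A VCF header line starts with 9 fixed columns:
# CHROM POS ID REF ALT QUAL FILTER INFO FORMAT; sample names follow.
FIXED_COLUMNS = 9


def vcf_sample_names(path):
    """Return the sample names listed on the #CHROM header line of a VCF."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                return line.rstrip("\n").split("\t")[FIXED_COLUMNS:]
            if not line.startswith("#"):
                break  # reached data records without seeing a header
    raise ValueError(f"no #CHROM header line found in {path}")


# Example usage (hypothetical file name):
# names = vcf_sample_names("cucumber_variants.vcf.gz")
# print(len(names))
```

If the reported count differs from 446, the file is probably not the one described in this exercise, or it was subset during download.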