Add

The add command is used to add objects to the Persephone database. The syntax is as follows:

add {target} -c controlFile [(-v | -t) -d]

where "controlFile" is a file with an ".ini" extension that contains the data needed to add the target. (See Control Files for more information.)

The table below lists the definitions for the add command parameters.

Add Command Parameters

Parameter	Required or Optional?	Definition
{target}	Required	A target is the object type you want to add, which can be alignment, annotation, annotation_qualifier, annotation_search, bam, bed, expression, genetic_map, map, mapsettreenode, marker, marker_qualifier, ontology, organism, ortholog, qtl, qualifier_link, quantitative, ribbon, sample, sequence, sequencedatabase, synteny, track_qualifier, tracktreenode, variant
-c controlFile	Required most of the time	Loads data from the specified control file. As described in Control Files, control files are files with an ".ini" extension. Please note, the supported data format of the control file varies depending on the target type you select.
-v	Optional	Executes the add command in verbose mode with extra information printed on screen.
-t	Optional	Executes the add command in test mode.
-d	Optional	Executes the add command in debug mode. If needed, you can send the debug output to Persephone Software, LLC. at http://persephonesoft.com/contact.
-e	Optional	Test and run. Runs the test and, if it succeeds, immediately runs the real load. This keeps the database locked and prevents any concurrent job from modifying the data between the test and the load.
-f	Optional	Force mode. Normally, when creating some objects from the command line, the program asks for confirmation. Force mode skips this confirmation. The flag is typically used in batch mode, when the data is added via a script.
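
The parameters above can be combined programmatically when add commands are generated by a script. The following Python sketch is illustrative only: build_add_command is a hypothetical helper, not part of Persephone, and it only composes the command text. The target set is copied from the table above; extend it if your installation accepts other target names. How the resulting line is passed to Persephone is left to the reader.

```python
# Hypothetical helper (not part of Persephone): composes the text of an
# add command from the parameters listed in the table above.

# Target names copied from the {target} row of the parameter table.
TARGETS = {
    "alignment", "annotation", "annotation_qualifier", "annotation_search",
    "bam", "bed", "expression", "genetic_map", "map", "mapsettreenode",
    "marker", "marker_qualifier", "ontology", "organism", "ortholog", "qtl",
    "qualifier_link", "quantitative", "ribbon", "sample", "sequence",
    "sequencedatabase", "synteny", "track_qualifier", "tracktreenode",
    "variant",
}

# Keyword names are this sketch's own; the flags come from the table.
FLAGS = {
    "verbose": "-v",
    "test": "-t",
    "debug": "-d",
    "test_and_run": "-e",
    "force": "-f",
}


def build_add_command(target, control_file=None, argument=None, **flags):
    """Return an add command line as a string."""
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target}")
    if control_file is not None and not control_file.lower().endswith(".ini"):
        raise ValueError("control files are expected to have an .ini extension")
    unknown = set(flags) - set(FLAGS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")

    parts = ["add", target]
    if argument is not None:
        # Interactive commands take a quoted argument such as a map set path.
        parts.append(f'"{argument}"')
    if control_file is not None:
        parts += ["-c", control_file]
    # Flags are emitted in the fixed order of the FLAGS dictionary.
    parts += [flag for name, flag in FLAGS.items() if flags.get(name)]
    return " ".join(parts)


print(build_add_command("organism", "indica.ini", verbose=True))
print(build_add_command("annotation_search", argument="Oryza sativa/IRGSP1.0"))
```

The helper does not check which flag combinations Persephone accepts; the syntax line at the top of this page remains the reference for that.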

Adding Data with the Add Command

See the following use cases for examples of using the add command to add data.

Purpose	Example command	Notes
Add an Organism	add organism -c indica.ini -v	Before entering any data, create a parent organism.
Add Map Set, Maps, and Sequences	add sequence -c indica.ini -v	Map sets can be of different kinds. If the maps are based on genomic sequences, use this command to add the map set itself and the maps with sequences
Add Gene Annotations	add annotation -c indica_bgi.ini -v	Add gene model tracks using this command. Each map can contain several tracks with gene models predicted by different annotation methods.
Add qualifiers to existing gene models	add annotation_qualifier -c indica_pfam.ini -v	Add extra qualifiers to the gene models already loaded into the database. This information may include functional annotation or hyperlinks to external resources.
Adding new qualifiers by extracting values from the existing qualifiers	add annotation_qualifier "Oryza sativa/IRGSP1.0"	Add new qualifiers interactively. Name the map set and a track with gene models, provide text modification rules (regular expressions) on how to extract the new values from existing qualifiers and store them under new qualifiers.
Adding qualifier links	add qualifier_link	Interactive command to nominate a qualifier as a hyperlink. Normally, this allows users to open external web pages with extra information about the gene.
Adding annotation search terms	add annotation_search "Oryza sativa/IRGSP1.0"	Interactively mark some qualifiers as gene name or function. This info will be used to narrow down the search.
Add Markers (GFF files)	add marker -c clinvar.ini -v	Create a marker track with markers positioned on a map. The mapping coordinates can be in bp for maps based on sequence or in cM for genetic maps. To add marker tracks to genetic maps, first create the map set using the add map command.
Add Markers (Delimited Text Files)	add marker -c clinvar.ini -v	Same as above, but the marker coordinates are provided in the form of a tab-delimited file.
Adding SequenceDatabase	add sequencedatabase -c arabidopsis.ini -v	Add map set with sequences and gene annotation in one step by providing data in Genbank format
Add genetic maps	add map -c linkage.ini -v	Adding genetic maps is done in two steps: first, add the empty maps, then add marker mapping. The information about the maps can be provided in a separate file listing the sizes of maps or can be derived from the file with marker positions.
Adding Expression Data	add expression -c tissues.ini -v	This command adds gene expression data at the level of one value per gene per experiment. One such job can load multiple values for each gene.
Adding Variants	add variant -c 1000genomes.ini -v	Load variants that contain SNPs or indels. To save space, the data is highly compressed, so that each position in each sample carries the alleles and coverage values only. The position names and other properties can be stored as an additional marker track.
Adding Info for Genotyping Samples	add sample -c extra_info.ini -v	The genotyping samples can have additional qualifiers and description.
Adding Ontology Terms	add ontology -c gramene.ini -v	Each QTL is linked to a trait that is placed in a trait ontology. Before loading any QTL, provide the trait ontology in OBO format.
Adding QTLs	add qtl -c heat.ini -v	QTLs must have a trait listed in the trait ontology. A QTL should be assigned to a study that groups multiple QTLs. The QTL data can be read from files in text or Excel format, which may also include the study information.
Adding Synteny Ribbons	add ribbons -c irgsp_ir64.ini -v	Synteny ribbon-like connectors link related intervals between sequences. Note that the web version of Persephone can find such regions at run time.
Adding a Track Tree Node	add tracktreenode "Oryza sativa/IRGSP1.0"	For better organization, the tracks can be grouped to form a tree-like structure. This interactive command will ask to name the new group node and to list the tracks to be grouped.
Adding BED file	add bed -c regions.ini -v	The data in a BED file with additional annotation information can be displayed as colored elements on the maps.
Adding protein or nucleotide alignments	add alignment -c swissprot.ini -v	A special track with protein or cDNA alignments can help with annotating genomic regions. A typical example is a set of pre-calculated tblastn hits for well-characterized proteins, such as SwissProt.
Adding map set tree nodes	add mapsettreenode "Oryza sativa/Genetic" -v	Map sets can be organized by introducing more nodes in the map set tree. One command can introduce more than one node if a child node references a parent that does not exist yet.
Adding quantitative tracks	add quantitative rna-seq-coverage.ini -v	Quantitative tracks can contain values displayed in the form of a chart along the sequence, such as RNA-seq coverage.
Adding alignments in sam format	add sam ESTs.ini -v	If you want to load a "countable" number of spliced alignments (not millions), use this command. Note: the web version of Persephone allows users to visualize large bam files.
Adding sequence storage	add storage	If the database is not Oracle, storing sequences in the database is not allowed. The compressed sequence data is stored in the file system, and the metadata is loaded into the database. If the default storage space is not enough, add another storage location interactively using this command.
Adding orthologous gene pairs	add ortholog -c rice-corn.ini -v	The orthologous gene pairs can be calculated by different methods. Use this command to load this information supplied in a tab-delimited file.
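
Several notes in the table above imply a loading order: the organism comes first, the trait ontology precedes QTLs, gene models precede their qualifiers, and genetic maps precede their marker tracks. The Python sketch below shows one way a batch script could order its jobs accordingly. It is an illustration under stated assumptions: the PREREQUISITES table is one reading of the notes above (not an official dependency list), ordered_commands is a hypothetical helper, and a prerequisite missing from the job list is assumed to be loaded already.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed prerequisites, inferred from the notes in the use-case table.
# Each target maps to the targets that should be loaded before it.
PREREQUISITES = {
    "organism": set(),
    "sequence": {"organism"},
    "annotation": {"sequence"},
    "annotation_qualifier": {"annotation"},
    "map": {"organism"},
    "marker": {"map"},        # applies to marker tracks on genetic maps
    "ontology": {"organism"},
    "qtl": {"ontology"},
}


def ordered_commands(jobs):
    """Order add commands so that prerequisites are loaded first.

    jobs maps a target name to its control file. Prerequisites that are
    not in jobs are ignored (assumed to be in the database already).
    """
    graph = {
        target: PREREQUISITES.get(target, set()) & set(jobs)
        for target in jobs
    }
    order = TopologicalSorter(graph).static_order()
    return [f"add {target} -c {jobs[target]} -v" for target in order]


jobs = {
    "qtl": "heat.ini",
    "annotation": "indica_bgi.ini",
    "ontology": "gramene.ini",
    "organism": "indica.ini",
    "sequence": "indica.ini",
}
for command in ordered_commands(jobs):
    print(command)
```

Independent jobs, such as the ontology and the sequences in this example, may come out in either relative order; only the stated prerequisites are guaranteed to come first.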