The most common scenario is running PersephoneShell inside the Docker container.

The persephone.sh script provided with the Docker image includes a command that starts PersephoneShell. If you run it from the directory that contains the script, the command is:

./persephone.sh psh

This command starts PersephoneShell inside the Docker container environment. You can now type PersephoneShell commands at the prompt.

To get familiar with the settings, let's run the command 'list config':

PS> list config
Database Connection:
            name in config file: persephone
                   database url: localhost:3306/persephone
                       username: persephone
            sequence storage(s):

Search index for current connection:
                            url: http://localhost:8983/solr/
                      core name: persephone
               multiple threads: True

File location settings for current connection:
                   file storage: /data/FileStorage
               sequence storage: /data/sequences
               BLAST db storage: /data/BlastDB

BLAST settings:
         BLAST binary directory: /data/NCBIBlast
               BLAST parameters: -evalue 1e-5 -max_target_seqs 5 -max_hsps 3 -word_size 4 -threshold 99 -num_threads 8

Other settings:
                 temp directory: /data/Temp
                 data directory: /data/Data
          user config directory:
                ortholog finder: DIAMOND
             DIAMOND parameters: -e 1e-5 --max-target-seqs 5 --max-hsps 2

Recall that when we created the container, we used the directory-mounting parameter (-v). Mounted directories let you refer to data or sample files that live outside the container by the container's internal directory names. By default, the internal location /data/Data points to the mounted directory. If, for instance, the command that created the container included this parameter:

-v /share:/data/Data

then the internal path /data/Data/tomato corresponds to the directory /share/tomato outside the container.

In the configuration listing, you can see that "data directory" is set to /data/Data. This is the value of PersephoneShell's internal variable $DATA. This means that, to reference the outside file /share/tomato/genome.fa, you can specify its path inside the container as $DATA/tomato/genome.fa. The $DATA variable can be used on the PersephoneShell command line or in the control files with loading instructions, where it points to the source data files. For example, the command 'analyze fasta' can look like this:

analyze fasta $DATA/tomato/genome.fa
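
The mapping between host paths and $DATA paths can be sketched as a small shell helper that runs on the host, outside the container. It assumes the mount from the example above (-v /share:/data/Data). The function name host_to_data and the variable HOST_DIR are illustrative only and are not part of PersephoneShell or persephone.sh.

```shell
# Assumption: the container was created with -v /share:/data/Data,
# so $DATA inside PersephoneShell corresponds to /share on the host.
HOST_DIR="/share"

# Print the $DATA-based form of a host path (hypothetical helper).
host_to_data() {
  local path="$1" rest
  case "$path" in
    "$HOST_DIR")
      printf '%s\n' '$DATA' ;;
    "$HOST_DIR"/*)
      rest=${path#"$HOST_DIR"}   # strip the host prefix, keep "/tomato/..."
      printf '%s%s\n' '$DATA' "$rest" ;;
    *)
      printf 'not under %s: %s\n' "$HOST_DIR" "$path" >&2
      return 1 ;;
  esac
}

host_to_data /share/tomato/genome.fa
```

Under these assumptions, the last line prints $DATA/tomato/genome.fa, which is the form used in the 'analyze fasta' example. A path outside the mounted directory is rejected, because PersephoneShell cannot see it from inside the container.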

The sample files

PersephoneShell comes with a set of sample files (in the INI format) for different loading jobs. They are located in the directory /data/psh/Samples inside the container. The files are organized into separate directories by category: Sequence, Annotation, Genetic, etc. For your convenience, we recommend opening two terminal windows: one for PersephoneShell and the other for working with the control INI files. To enter the container environment in the second window, use this command:

docker exec -it persephone bash

The Docker container comes with the Midnight Commander file manager pre-installed. It is launched by the command 'mc'. You can use it to navigate the file system. Its internal editor (mcedit, opened with the F4 key) shows the INI files with syntax highlighting. If you are comfortable with it, you can work with the sample files in the Docker environment without copying them outside the container for editing.

It is typical to create a copy of an existing sample file and modify it to your needs. Some files contain a suffix in their names, such as '-ncbi' or '-phytozome'. These files are tailored to the specifics of the data files from different providers. For example, if you are going to load a file from NCBI, use a sample file with the suffix -ncbi as the basis of your edits. This allows you to reuse many instructions, saving you time and effort.

To make a copy of a file in the same directory using Midnight Commander, press Shift-F5.

If you prefer to use your favorite text editor outside the Docker container, it makes sense to copy the sample files from the container to a sub-folder of the mounted directory. This way, the files will be accessible by PersephoneShell from inside the container. For example, if we have mounted the directory /share, we can create the directory for the sample files as /share/Samples. From PersephoneShell, this directory will be accessible as $DATA/Samples.

To facilitate transferring the files between the container and the outside world, use the command persephone.sh copysamples:

./persephone.sh copysamples -s /share/Samples
Welcome to Persephone docker tool V 1.1.7878
Found container 'persephone'
Successfully copied sample files to /share/Samples

Adding the first organism

Let's add our first organism entry, which will serve as the foundation for other entries, including sequences and markers. The information about the organism should be provided in an INI file. We will use an existing sample file in the sub-folder Organism as a template and make a copy of it. Suppose we are adding the organism Vitis vinifera. We will call the new file vitis-vinifera.ini.
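
If you make the copy from an ordinary shell rather than from Midnight Commander, a small guard against overwriting an earlier edit can help. The following is a minimal sketch; the function name copy_template is hypothetical and is not part of PersephoneShell or persephone.sh.

```shell
# Copy a sample control file to a new name, refusing to overwrite
# an existing file (hypothetical helper).
copy_template() {
  local src="$1" dest="$2"
  if [ ! -f "$src" ]; then
    printf 'template not found: %s\n' "$src" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    printf 'refusing to overwrite: %s\n' "$dest" >&2
    return 1
  fi
  cp -- "$src" "$dest"
}
```

For example, in the Organism sub-folder you could run copy_template <sample>.ini vitis-vinifera.ini, where <sample>.ini stands for whichever sample file you choose as the template.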

Now we will edit the text in the file to look like this:


[Organism]
; Organism ID (optional, if not specified, it will be autogenerated)
OrganismId=29760 
; Look up taxonomy information in http://www.ncbi.nlm.nih.gov/taxonomy
; Taxonomy ID (optional but recommended)
TaxonomyId=29760 
; Alternative ID: user defined ID
;AlternativeId=""
; Scientific name (required)
ScientificName="Vitis vinifera"
; Common name (optional)
CommonName="wine grape"
;If plant, specify if the organism is monocot(0) or eudicot(1)
PlantClassification=1
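
Before handing the file to PersephoneShell, a quick check from an ordinary shell can catch a missing section header or an empty or commented-out required field. The sketch below is not part of PersephoneShell: the function name check_organism_ini is hypothetical, and it only looks for the [Organism] header and the ScientificName key, which the sample marks as required. PersephoneShell's own validation remains authoritative.

```shell
# Hypothetical pre-flight check for an organism control file.
check_organism_ini() {
  local file="$1"
  # The section header must be present at the start of a line.
  if ! grep -q '^\[Organism\]' "$file"; then
    printf 'missing [Organism] section: %s\n' "$file" >&2
    return 1
  fi
  # ScientificName must be set, non-empty, and not commented out with ';'.
  if ! grep -Eq '^ScientificName="?[^"[:space:]]' "$file"; then
    printf 'ScientificName is missing or empty: %s\n' "$file" >&2
    return 1
  fi
  printf 'ok: %s\n' "$file"
}
```

Assuming the samples were copied to /share/Samples on the host, the check could be run as check_organism_ini /share/Samples/Organism/vitis-vinifera.ini.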

The command to add the organism is:

add organism -c <control.ini>

If the sample files are located outside the container, the control file path will be $DATA/Samples/Organism/vitis-vinifera.ini.

If you prefer to work with the files inside the container, first check the current directory with the command pwd. You can then refer to a file either by its full path, such as /data/psh/Samples/Organism/vitis-vinifera.ini, or by a path relative to the current directory, such as Samples/Organism/vitis-vinifera.ini when the current directory is /data/psh.

Please remember to use the auto-complete feature (pressing TAB) to speed up typing the file path and the command parameters.

Most commands can be executed in test mode (-t). Before loading the data, test the command first:

add organism -c Samples/Organism/vitis-vinifera.ini -t

If the test is successful, execute the full command, optionally using the verbose output (-v):

add organism -c Samples/Organism/vitis-vinifera.ini -v

Now you can list the organisms with the command

list organism

to confirm that the new organism has been successfully added to the system.

For more details, see the other sections of the PersephoneShell documentation, especially the page about customization. Note that the PersephoneShell configuration files inside the Docker container are located in /data/psh.