Install
PersephoneShell relies on several helper programs, such as NCBI MagicBLAST and BLASTN. The install command accepts the following targets:
blast
magic_blast
diamond
swissprot
The packages are normally installed during the first initialization with the command init. If BLAST functionality was not enabled at that time, or the system was initialized long ago, these additional programs can be installed later by running the command:
install magic_blast
or
install blast
and, for the diamond protein sequence aligner:
install diamond
or, to download the SwissProt collection of proteins used with the command 'create function':
install swissprot
The procedures download the software archives from the web and unpack the binaries into the directory specified in psh.exe.config as BlastBinDir:
<!-- Advanced configuration for PersephoneShell -->
<PersephoneShell DeleteOrphanData="true"
TempDir="/tmp"
DataDir="~/bin/psh/data"
BlastBinDir="~/bin/blast"
BlastParams="-evalue 1e-5 -max_target_seqs 5 -max_hsps 1 -word_size 4 -threshold 100 -num_threads 8"
DiamondParams="-e 1e-5 --max-target-seqs 5 --max-hsps 2"
OrthologFinder="DIAMOND"
PromptFormat="$g"
>
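To check which directory the binaries will be unpacked into, the attributes above can be read with a short script. The following is a minimal sketch, not part of PersephoneShell: it assumes that psh.exe.config is well-formed XML and that the PersephoneShell element appears somewhere inside it. The function names read_psh_settings and blast_bin_dir are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def read_psh_settings(config_path):
    """Return the attributes of the first <PersephoneShell> element as a dict."""
    root = ET.parse(config_path).getroot()
    # iter() also yields the root itself, so this works whether the
    # element is the document root or nested deeper (an assumption
    # about the file layout, which is not shown in full here).
    for element in root.iter("PersephoneShell"):
        return dict(element.attrib)
    raise KeyError("no <PersephoneShell> element found in " + config_path)


def blast_bin_dir(settings):
    """Expand a leading '~' in BlastBinDir to the user's home directory."""
    return os.path.expanduser(settings["BlastBinDir"])
```

For example, blast_bin_dir(read_psh_settings("psh.exe.config")) would return the expanded form of "~/bin/blast" for the configuration shown above.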
When configuring BLAST, you will need to provide the location of the BLAST index files (also used by MagicBLAST, diamond, and SwissProt):
PS> install blast
Created directory /home/user1/psh/blast
Do you want to download the BLAST binary files from https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.10.0/ncbi-blast-2.10.0+-x64-linux.tar.gz? (Y/N) Y
Downloading BLAST binary files...
Extracting BLAST binary files into /home/user1/psh/blast...
Success
Please specify a directory where BLAST data files will be stored.
BLAST data directory? /mnt/d/data/blastdb/toy
Created directory /mnt/d/data/blastdb/toy
After the installation, a file named blast.ini is created in the PersephoneShell installation directory. For reference, the contents of a sample blast.ini are simple:
[toy]
BLastDataDir=/mnt/d/data/blastdb/toy
where toy is the name of the connection used to start PersephoneShell. The file can have data for several connections.
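Because blast.ini uses the common INI layout of one section per connection, the data directory for a given connection can be looked up with Python's standard configparser module. This is a sketch under that assumption only; the helper name blast_data_dir is hypothetical, and PersephoneShell itself may read the file differently.

```python
import configparser


def blast_data_dir(ini_path, connection):
    """Return the BLAST data directory recorded for one connection."""
    # Interpolation is disabled so that a '%' in a path is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    # read() silently skips files it cannot open, so check the result.
    if not parser.read(ini_path):
        raise FileNotFoundError(ini_path)
    # configparser matches option names case-insensitively by default,
    # so the spelling "BLastDataDir" from the sample file is found as is.
    return parser[connection]["BLastDataDir"]
```

With the sample file above, blast_data_dir("blast.ini", "toy") returns "/mnt/d/data/blastdb/toy". A connection with no section in the file raises KeyError.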
The command
install diamond
will download and install diamond, a very fast protein sequence aligner used for finding orthologs. It runs much faster than BLASTP with similar sensitivity. To make diamond the aligner, set the value of OrthologFinder in the main PersephoneShell configuration file to "DIAMOND". Extra parameters that control the output of diamond can be provided in the DiamondParams variable. If no value is provided for DiamondParams, a default set of parameters is used: "-e 1e-5 --max-target-seqs 5 --max-hsps 2".
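The fallback to the default parameters can be illustrated with a small sketch. It is not PersephoneShell's own code: the function name diamond_extra_args is hypothetical, the settings are assumed to be available as a plain dictionary of configuration attributes, and treating an empty DiamondParams value the same as a missing one is an assumption.

```python
import shlex

# Default documented for the case where DiamondParams is not provided.
DEFAULT_DIAMOND_PARAMS = "-e 1e-5 --max-target-seqs 5 --max-hsps 2"


def diamond_extra_args(settings):
    """Split DiamondParams into an argument list, using the default if unset."""
    # Assumption: a blank value is treated like a missing one.
    raw = (settings.get("DiamondParams") or "").strip()
    return shlex.split(raw or DEFAULT_DIAMOND_PARAMS)
```

Called with an empty dictionary, the function returns the default parameters split into individual arguments. Called with {"DiamondParams": "-e 1e-3"}, it returns only the user-supplied arguments.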