This section provides the steps and guidelines needed to install and configure the Persephone system.

Note

The Persephone System Setup Guide is intended for administrators only.

Note

We recently introduced a back-end installation in which all components are supplied as a single Docker image. Once the Docker image is installed, you can use PersephoneShell to populate the data and then run the Persephone web client.


Logical System Diagram

The following figure shows a logical (conceptual) representation of the Persephone system.

Most of the data is stored in the database. Users can drag and drop external files onto the client application, or reference them by URL, to create private tracks or entire map sets (genomes).

The main components

The required components of the Persephone system are a database, an API server, a Solr server, and the loader application PersephoneShell. Most of the genomic data is stored in the database. The API server reads the data and sends it to the client application in response to its requests. The Apache Solr server provides fast search services. PersephoneShell loads the data into the database and checks it for consistency.

The back end

  • Oracle or MySQL (MariaDB) database server
    You need to configure either an in-house or a cloud-based Oracle or MariaDB database server. Supported Oracle versions are 11g or higher, 64-bit: Standard Edition, the lower-cost Standard Edition One (for one to five users), or Enterprise Edition. Multiple operating systems are supported: Red Hat Linux 5.4, Ubuntu 22.04, Amazon Linux, Solaris 10, Windows 7 or later (32- or 64-bit, all editions), Windows Server 2011, Windows Server 2008 and 2008 R2, and Windows Server 2003. Consult the documentation of your database server for installation details.
  • Apache Solr search indexing engine
    Solr, a powerful search system that runs on Java, is used for the text search of Persephone objects. Install it from the Solr website, create a new core, and reference the core in the configuration of both the API server and the loader application PersephoneShell.
  • BLAST
    We use a local copy of the NCBI BLAST binaries, which are used by the Persephone client and PersephoneShell. The installation takes a single command, 'install blast', issued in PersephoneShell. The application then downloads and installs the BLAST binaries.
  • PersephoneShell - the data loader application
    The command-line tool loads data from standard bioinformatics file formats, including FASTA, GFF3, BAM/CRAM, BED, bedGraph, and VCF, and checks the data for consistency.
  • API Server
    The API server (known as "WebCerberus") communicates with the database(s). It performs data caching, data compression, and other optimizations that dramatically improve performance. The server can be cloud-based or installed in-house. The API server (SelfHostingWebCerberus) can run as a stand-alone application that does not require an extra web server such as nginx or Apache. It can be installed on Windows (Windows 7 or later, Windows Server 2011 or later) or on Linux. The Windows installation requires Microsoft .NET Framework 4.7.2 or later. On Linux, the Mono framework is needed to run the .NET applications.
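
After creating the Solr core mentioned in the list above, it can be useful to confirm that the core is reachable before referencing it in the API server and PersephoneShell configuration. The following is a minimal sketch, not part of Persephone itself: it queries Solr's CoreAdmin STATUS endpoint using only the Python standard library. The base URL and the core name "persephone" are assumptions for illustration; substitute the values from your own installation.

```python
import json
import urllib.parse
import urllib.request


def core_status_url(solr_base_url: str, core_name: str) -> str:
    """Build the Solr CoreAdmin STATUS URL for a single core."""
    query = urllib.parse.urlencode(
        {"action": "STATUS", "core": core_name, "wt": "json"}
    )
    return f"{solr_base_url.rstrip('/')}/admin/cores?{query}"


def core_exists(status_response: dict, core_name: str) -> bool:
    """Interpret a STATUS response.

    Solr reports an unknown core as an empty entry under "status",
    so a non-empty entry is treated as "the core exists".
    """
    return bool(status_response.get("status", {}).get(core_name))


def check_core(solr_base_url: str, core_name: str, timeout: float = 5.0) -> bool:
    """Ask a running Solr server whether the named core is loaded."""
    url = core_status_url(solr_base_url, core_name)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return core_exists(json.load(response), core_name)
```

With Solr running on its default port, a call such as `check_core("http://localhost:8983/solr", "persephone")` would return `True` once the core has been created. The core name here is hypothetical.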

The main application

  • Persephone Web Client
    The web client replaces the desktop version. It runs in any popular web browser and is OS-independent. The client is hosted by the SelfHostingWebCerberus server: launch the server and navigate to its URL to start using the client application. No installation of the client is necessary.

Installation steps (overview)

1. Install the database server, create an empty database and a new user.

2. Install the Solr search engine (requires Java). Create a new core.

3. If hosting on Linux, install Mono (it is needed to run .NET applications).

4. Unpack PersephoneShell from our archive and update configuration values: the database connection string, path for BLAST binaries, URL to Solr, location of external files, etc.

5. Unpack WebCerberus from our archive and modify the configuration: the database connection string, path for BLAST binaries and index files, URL to Solr, etc. 
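
Steps 4 and 5 both depend on the same handful of values being filled in correctly. As an illustration only, the sketch below checks such values before the applications are started. The setting names (`connection_string`, `blast_path`, `solr_url`, `external_files_dir`) are hypothetical; the real keys are defined by the PersephoneShell and WebCerberus configuration files, so treat this as a checklist in code form rather than a parser for those files.

```python
from pathlib import Path

# Hypothetical setting names, chosen for this example only.
REQUIRED_SETTINGS = (
    "connection_string",
    "blast_path",
    "solr_url",
    "external_files_dir",
)
DIRECTORY_SETTINGS = ("blast_path", "external_files_dir")


def _text(settings: dict, key: str) -> str:
    """Return the setting as stripped text, or "" when missing or None."""
    value = settings.get(key)
    return "" if value is None else str(value).strip()


def find_problems(settings: dict) -> list:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    # Every required value must be present and non-empty.
    for key in REQUIRED_SETTINGS:
        if not _text(settings, key):
            problems.append(f"{key} is not set")
    # Directory settings must point at existing directories.
    for key in DIRECTORY_SETTINGS:
        value = _text(settings, key)
        if value and not Path(value).is_dir():
            problems.append(f"{key} does not exist: {value}")
    # The Solr location must be a full HTTP(S) URL.
    solr_url = _text(settings, "solr_url")
    if solr_url and not solr_url.startswith(("http://", "https://")):
        problems.append(f"solr_url is not an HTTP(S) URL: {solr_url}")
    return problems
```

The function only inspects the values it is given. It does not connect to the database or to Solr, so a clean result means the values are plausible, not that the services are reachable.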

Persephone Software can help you with the configuration, usually via an online meeting.

The advanced security configuration steps that enable user registration are described separately.

As noted above, if you are comfortable with Docker, we can provide a Persephone Docker image that has all required components pre-installed and configured. Start the container and navigate to its URL to see the live application.

System Requirements

The Persephone web client runs on any desktop OS that provides a modern web browser. The recommended hardware for the client machine is:

Client:

  Processor Type     Dual core
  Processor Speed    2.8 GHz
  Memory             4 GB minimum
  Local Storage      1 GB for program files and the data cache


Server:

The server-side requirements are higher and depend on the load. For small teams, the server can run on a regular laptop.

  Processor Type     Quad core
  Processor Speed    2.8 GHz
  Memory             16 GB minimum
  Local Storage      100 GB for program files and the data cache
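
The server recommendations above can be compared against a candidate host with a short script. The sketch below is an illustration under stated assumptions, not a Persephone tool: it counts logical CPUs (which may exceed the number of physical cores), does not check processor speed, and can read physical memory only on platforms that expose `os.sysconf`, such as Linux. On other platforms the memory check is skipped.

```python
import os
import shutil

# Recommended server minimums taken from the requirements listed above.
MIN_CORES = 4
MIN_MEMORY_GB = 16
MIN_FREE_DISK_GB = 100

GIB = 1024 ** 3


def shortfalls(cores, memory_gb, free_disk_gb):
    """Return a list of recommendations the host does not meet.

    memory_gb may be None when it could not be measured; that check is skipped.
    """
    issues = []
    if cores < MIN_CORES:
        issues.append(f"CPU: {cores} logical cores, {MIN_CORES} recommended")
    if memory_gb is not None and memory_gb < MIN_MEMORY_GB:
        issues.append(f"Memory: {memory_gb:.1f} GB, {MIN_MEMORY_GB} GB recommended")
    if free_disk_gb < MIN_FREE_DISK_GB:
        issues.append(
            f"Disk: {free_disk_gb:.1f} GB free, {MIN_FREE_DISK_GB} GB recommended"
        )
    return issues


def probe_host(data_path="."):
    """Measure logical CPU count, physical memory (if available), and free disk."""
    cores = os.cpu_count() or 0
    try:
        memory_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GIB
    except (AttributeError, ValueError, OSError):
        memory_gb = None  # os.sysconf is unavailable, e.g. on Windows
    free_disk_gb = shutil.disk_usage(data_path).free / GIB
    return cores, memory_gb, free_disk_gb
```

Running `shortfalls(*probe_host("/path/to/data"))` lists any unmet recommendation, where the path is the volume intended for program files and the data cache. A machine sold as "16 GB" often reports slightly less usable memory, so a small memory shortfall is not necessarily a problem.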

While most of the data is stored in the database, bulky entries such as genomic sequences, BAM tracks, and binary variant data are stored outside the database, in the file system. This organization reduces the load on the database and considerably shortens the backup process. There is no need to "export" massive data records from the database as SQL text, which is very time-consuming: the compressed binary files are ready for backup without additional preparation.
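
Because the external files are already compressed, a backup can simply collect them as they are. The following sketch shows one possible approach, assuming the external data lives under a single directory (the directory layout and file names are hypothetical): it packs the directory into a plain, uncompressed tar archive. Any other file-level backup tool would serve equally well.

```python
import tarfile
from pathlib import Path


def archive_external_files(data_dir: str, archive_path: str) -> int:
    """Pack every file under data_dir into a tar archive; return the file count.

    The archive should be written outside data_dir so that it does not
    end up including itself.
    """
    root = Path(data_dir)
    count = 0
    # Mode "w" writes a plain tar without compression: the data files are
    # already compressed, so a second compression pass would mostly cost time.
    with tarfile.open(archive_path, "w") as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to data_dir so the archive is relocatable.
                archive.add(path, arcname=path.relative_to(root).as_posix())
                count += 1
    return count
```

The database itself still needs its own backup using the tools of the database server. This sketch covers only the files kept in the file system.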