Apache Solr is a more advanced search engine than the Lucene version used by Persephone. Using it requires one additional step: installing the Solr server, which runs on Java. With Solr you also benefit from a much faster indexing process, performed by the SolrIndexUpdater tool.

Install Java

To run Solr, you will need the Java Runtime Environment (JRE) version 1.8 or higher. At the command line, check your Java version like this:

C:\>java -version

If Java is installed, the command prints its version; in that case, go to the step “Install Solr”. If Java is not installed, the command fails with an error message instead.

Download and install Java from: https://www.java.com/inc/BrowserRedirect1.jsp?locale=en

Then check your Java version again with the same command.
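
The version check can also be scripted. The following is a minimal Python sketch, not part of Persephone; the function names are hypothetical. It relies on the fact that `java -version` writes its report to stderr and that Java 8 reports itself as "1.8.x", while newer releases report "11.x", "17.x", and so on.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy scheme: "1.8.0_181" means Java 8. Newer scheme: "11.0.2" means Java 11.
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Run `java -version`; return the major version, or None if Java is not found."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    # The version report goes to stderr, so look at both streams.
    return parse_java_major(result.stderr + result.stdout)
```

A result of 8 or higher from installed_java_major() corresponds to the "1.8 or higher" requirement above; None means Java was not found on the PATH or its output was not recognized.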

Install Solr

Download Solr from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

For Windows, download the zip file (solr-7.5.0.zip); for Linux, download the tgz file (solr-7.5.0.tgz).

For more information about installing Solr under Linux, see here: https://tecadmin.net/install-apache-solr-on-ubuntu/

Unzip solr-7.5.0.zip to a local folder, for example C:\Data.

You might need to modify the file bin\solr.in.cmd to include the line:

set SOLR_JAVA_HOME="c:\Program Files (x86)\Java\jre1.8.0_181"

pointing to the location of your Java installation directory.


Start the Server

Open a command prompt (cmd.exe), change to the directory with the Solr executables, and start the server:

cd c:\Data\solr-7.5.0\bin\
solr.cmd start



To check that Solr is running, open the following URL in your browser: http://localhost:8983
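
The same check can be done from a script. The sketch below is an illustration only, with hypothetical function names; it assumes the default port 8983 and queries Solr's system information endpoint, which reports the server version.

```python
import json
import urllib.error
import urllib.request


def system_info_url(base_url="http://localhost:8983"):
    """Build the URL of Solr's system information endpoint."""
    return base_url.rstrip("/") + "/solr/admin/info/system?wt=json"


def solr_version(base_url="http://localhost:8983", timeout=5):
    """Return the Solr version string, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(system_info_url(base_url), timeout=timeout) as resp:
            info = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
    return info.get("lucene", {}).get("solr-spec-version")
```

Calling solr_version() returns None while the server is down and a version string once it answers.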



Create the Core (Collection)

Create a new core by running the following command:

solr.cmd create -c <core_name>

You can name the core anything you want. Let’s name the core “CORENAME”.

After this, the core will be available in the dropdown list in the Solr web interface.
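
The list of cores can also be read from Solr's CoreAdmin API (action STATUS), whose JSON reply contains a "status" object keyed by core name. The sketch below only builds the request URL and parses a reply; the function names are hypothetical and the default port 8983 is assumed.

```python
import json


def status_url(base_url="http://localhost:8983"):
    """Build the CoreAdmin STATUS request URL."""
    return base_url.rstrip("/") + "/solr/admin/cores?action=STATUS&wt=json"


def core_names(status_json_text):
    """Return the sorted core names found in a CoreAdmin STATUS reply."""
    status = json.loads(status_json_text).get("status", {})
    return sorted(status)
```

Fetching status_url() with any HTTP client and passing the response text to core_names() should list “CORENAME” once the core has been created.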


Make sure that the parameter maxBooleanClauses in the file {PATH_TO_SOLR}\server\solr\{CORE_NAME}\conf\solrconfig.xml is set to at least 32768. For example, for a Solr engine installed at g:\solr and a core called CORENAME, the path to the file would be g:\solr\server\solr\CORENAME\conf\solrconfig.xml.

If the value is lower, increase it. The setting in the <query> section should look like this:

    <maxBooleanClauses>32768</maxBooleanClauses>
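
To verify the setting without opening the file by hand, a small script can read it. This is a minimal sketch with hypothetical function names; it assumes that <query> is a direct child of the root element of solrconfig.xml and that the value is a plain integer.

```python
import xml.etree.ElementTree as ET

REQUIRED_CLAUSES = 32768  # minimum value required above


def max_boolean_clauses(solrconfig_xml):
    """Return maxBooleanClauses from the given XML text, or None if absent or not an integer."""
    root = ET.fromstring(solrconfig_xml)
    node = root.find("./query/maxBooleanClauses")
    if node is None or node.text is None:
        return None
    try:
        return int(node.text.strip())
    except ValueError:
        # For example, a property placeholder instead of a plain number.
        return None


def clauses_ok(solrconfig_xml):
    """True if maxBooleanClauses is present and at least REQUIRED_CLAUSES."""
    value = max_boolean_clauses(solrconfig_xml)
    return value is not None and value >= REQUIRED_CLAUSES
```

For the example installation above, the file g:\solr\server\solr\CORENAME\conf\solrconfig.xml would be read in binary mode and its contents passed to clauses_ok().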

After changing the core configuration as described above, you need to reload the core or restart the Solr server. To reload the core, open the “Solr Admin” page at http://localhost:8983/solr/#/, click on “Core Admin”, select the core, and click the “Reload” button.

Alternatively, restart the server with the command:

solr.cmd restart -p 8983
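
A single core can also be reloaded without the browser through Solr's CoreAdmin API (action RELOAD). The sketch below only builds the request URL; the function name is hypothetical and the default port 8983 is assumed.

```python
from urllib.parse import urlencode


def reload_url(core_name, base_url="http://localhost:8983"):
    """Build the CoreAdmin RELOAD request URL for the given core."""
    query = urlencode({"action": "RELOAD", "core": core_name})
    return base_url.rstrip("/") + "/solr/admin/cores?" + query
```

Opening reload_url("CORENAME") with a browser or any HTTP client asks Solr to reload that core.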

The Solr core is now ready for indexing.

Configuration of SolrIndexUpdater

SolrIndexUpdater is a tool to create the search index of Persephone data for the Solr search engine. The utility collects the data from the database and passes it to the Solr server. The Persephone client application or the Cerberus server will delegate the search tasks to the Solr server.

Before using SolrIndexUpdater, add the database connection string to its config file in the usual form. If you name the connection “Default”, SolrIndexUpdater will use it when no other connection is specified on the command line.

The URL of the core is given in the “SolrSearchSettings” section of SolrIndexUpdater’s config file:

  <SolrSearchSettings Enabled="true" ServerUrl="http://localhost:8983/solr/CORENAME" UpdateMode="APPEND" />

You can copy this URL from “Solr Admin” at http://localhost:8983/solr/#/. On this page, select the core from the combo box on the left, click on “Query”, click on “Execute Query”, and copy the URL shown at the top of the results. Paste this link into the “ServerUrl” field without the trailing “select?q=*:*” part, so that it ends in the core name.
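
Trimming the copied link can be sketched as follows. The function name is hypothetical; the sketch simply drops the query string and a trailing “/select” segment, which leaves a URL ending in the core name.

```python
from urllib.parse import urlsplit, urlunsplit


def core_url_from_query_url(query_url):
    """Turn a Solr query URL (.../solr/CORE/select?q=...) into the core URL."""
    parts = urlsplit(query_url)
    path = parts.path.rstrip("/")
    if path.endswith("/select"):
        path = path[: -len("/select")]
    # Drop the query string and fragment; keep scheme, host, port, and core path.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

A URL that already ends in the core name is returned unchanged.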

SolrIndexUpdater is now ready to use.

Using SolrIndexUpdater


Usage: SolrSearchIndexUpdater [OPTIONS]+
Update search index in Solr server.

Options:
  -t, --type=IndexType       (required) the IndexType, must be one of {MARKER,
                               ANNOTATION, QTL, ALL}
  -l, --url=URL              Full URL to the Solr server, which must end in the
                               Solr Core name
  -u, --updateMode=UpdateMode
                             the UpdateMode, must be one of {APPEND, OVERWRITE}
                               (by default, will use the mode specified in the
                               config file)
  -s, --connectionString=VALUE
                             name of connection string stored in config file (
                               by default, will use the connection string
                               specified in the config file under the 'Default'
                               key)
  -m, --mapSets=VALUE        list of map set ACCESSION_NOs or MAP_SET_IDs to be
                               indexed, separated by ',', (by default, ALL map
                               sets will be indexed)
      --tr, --tracks=VALUE   list of tracks to be indexed, separated by ',', (
                               by default, ALL tracks will be indexed)
  -q, --qualifiers=VALUE     list of qualifier names to be indexed, separated
                               by ',', (by default, ALL qualifiers will be
                               indexed)
  -a, --async                build the index asynchronously; see Solr
                               documentation for more info (default is true)
  -d, --debug                show debug information
  -h, --help                 show this message and exit

Most common scenarios:

       Update all the data:

               SolrIndexUpdater -t ALL

       Update all data for a specific map set:

               SolrIndexUpdater -t ALL -m 11111

Note

Like other tools written for Windows, SolrIndexUpdater can run on Linux using the Mono framework:
mono SolrIndexUpdater.exe -t ALL -m 11111
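
When indexing is launched from a script, the command line can be assembled from the options listed above. The following is a hedged sketch: the function name and its parameters are hypothetical, only the flags -t, -m, -u and -l come from the help text, and passing -u and -l as separate arguments is an assumption made by analogy with the -t and -m examples.

```python
import sys

INDEX_TYPES = {"MARKER", "ANNOTATION", "QTL", "ALL"}


def build_command(index_type, map_sets=None, update_mode=None, url=None, use_mono=None):
    """Return the SolrIndexUpdater command line as a list of arguments."""
    if index_type not in INDEX_TYPES:
        raise ValueError("index type must be one of " + ", ".join(sorted(INDEX_TYPES)))
    if use_mono is None:
        # Outside Windows, the tool is started through the Mono framework.
        use_mono = not sys.platform.startswith("win")
    command = ["mono", "SolrIndexUpdater.exe"] if use_mono else ["SolrIndexUpdater.exe"]
    command += ["-t", index_type]
    if map_sets:
        command += ["-m", ",".join(str(m) for m in map_sets)]
    if update_mode:
        command += ["-u", update_mode]
    if url:
        command += ["-l", url]
    return command
```

The resulting list can be passed to subprocess.run(); for example, build_command("ALL", map_sets=[11111], use_mono=True) reproduces the Mono command shown above.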


Set up Persephone/Cerberus to use Solr Search Index

Normally, communication with Solr is performed via Cerberus. In this case, the Cerberus configuration should be updated.

The <configSections> element of SelfHostingCerberus.exe.config should contain:

    <section name="SolrSearchSettings" type="CrsDao.Search.SolrSearch.SolrSearchConfSection, CrsDao" />

Also, add the following node to SelfHostingCerberus.exe.config:

<!--
    Search engine configuration.
   
         ServerURL:    Required. Full URL to the Solr server, which must end in the Solr Core name,
                       e.g. http://localhost:8983/solr/persephone
                       To create a core, you can run "solr create_core -c desiredCoreName"
   
           Enabled:    If "false", search engine will be disabled, and all of the other
                       parameters will be ignored. Default is "false".
-->
  <SolrSearchSettings Enabled="true" ServerUrl="http://localhost:8983/solr/persephone" />
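
Before restarting Cerberus, the node can be sanity-checked with a short script. This is an illustrative sketch only: the function name is hypothetical, and the check that the URL path has the form /solr/<core> is an assumption based on the example URLs in this document.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def check_solr_settings(node_xml):
    """Return a list of problems found in a <SolrSearchSettings .../> node."""
    node = ET.fromstring(node_xml)
    problems = []
    # "Enabled" defaults to "false", so a missing attribute disables the search engine.
    if node.get("Enabled", "false").lower() != "true":
        problems.append("search engine is disabled")
    url = node.get("ServerUrl", "")
    segments = [s for s in urlsplit(url).path.split("/") if s]
    # Assumed URL shape: http://host:port/solr/<core>
    if len(segments) != 2 or segments[0] != "solr":
        problems.append("ServerUrl does not end in a Solr core name")
    return problems
```

An empty list means the node passed both checks.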

If you want Persephone to communicate with the Solr server directly, configure the Solr options in “config.json” or in the “*.config” files. The only required option is ServerUrl, the URL of the Solr server.

config.json (for Persephone):
  // Solr Search settings
  "SolrSearch": {
    // Required. Full URL to the Solr server, which must end in the Solr Core name,
    //   e.g. http://somehost:8983/solr/persephone
    //   To create a core, you can run "solr create_core -c desiredCoreName"
    "ServerUrl": "http://somehost:8983/solr/persephone"
  }