The Variants dialog

Persephone organizes variants (such as SNPs or insertions/deletions) into samples, which usually correspond to columns in the original VCF file. You can filter all available samples (by name, qualifier value, and other properties); select samples of interest; preview their contents; and then display them alongside other tracks on a map in the form of a Variants track.
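As a rough illustration of how samples map to VCF columns, the sketch below lists the sample names found on a VCF header line. It relies only on the VCF convention that the nine fixed columns (#CHROM through FORMAT) are followed by one column per sample. The function name vcf_sample_names and the example header are made up for this illustration and are not part of Persephone.

```python
def vcf_sample_names(header_line):
    """Return the sample names from a VCF '#CHROM' header line."""
    fields = header_line.rstrip("\n").split("\t")
    # The first nine columns are fixed by the VCF format; sample columns follow.
    fixed = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]
    if fields[:9] != fixed:
        raise ValueError("not a VCF header line with genotype columns")
    return fields[9:]


# Hypothetical header with two sample columns.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tB001\tIRIS_313-11710"
print(vcf_sample_names(header))
```

Each name returned here would typically correspond to one sample in the Variants dialog.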

To select the samples you wish to display, open the Variants dialog by selecting Tools | Variants from the main toolbar:

Alternatively, right-click any map to open its context menu. If the Persephone database contains samples for this map, you will see the Show variants menu option:

The counter shows the number of currently selected samples (in this case, 0) as well as the total number of samples available for the current map (2,548 in this case). 

The Variants dialog displays summary information for all available samples in the database:

 1:  The Map set selector lists all map sets with available variant samples, along with the count of samples for each map set; it supports the standard table filtering controls.

 2:  The Map list shows all maps with samples for the currently selected map set, along with the count of selected and available samples for each map.

 3:  The main Selected samples table lists all the currently selected samples (it is described in detail below). Click the Add button to open the Select samples dialog, where you can select among the samples for the current map set.

 4:  The Preview panel displays a preview of the currently selected samples on the current map.

 5:  The Track settings panel controls display options for variants in each sample.

Adding and removing samples

Click the Add button to open the Select samples dialog:

In this dialog, the main table on the left contains all available samples for the currently selected map set; this table supports all the standard search and filtering controls. The property selector on the right lists all the basic properties of the samples, as well as their qualifiers. Check the checkbox next to a property or qualifier to display its values in the table; doing so allows you to filter and sort the samples by this value. For example, you could find all samples whose "source" qualifier is "MC":

Choose the samples you wish to add, then click the Select button. The list of samples will appear in the Selected samples table, and a graphical preview of their contents will appear in the Preview panel (as described in more detail below). 

Note that the columns in the main Selected samples table will reflect the columns you have previously chosen to display in the Select samples dialog. To choose a different set of columns, click the Add button again, and check or uncheck additional properties/qualifiers. The changes you make will be immediately reflected in the main table:

You can select one or more samples in the main table and click the Remove button to remove them (alternatively, you can also right-click a row and choose Remove from the context menu). 

To clear all currently selected samples, first make sure no rows in the table are selected; as usual, you can Ctrl-click (or Cmd-click on Mac) a row to deselect it. The Remove button will then change to say Clear all:

This button's appearance (with a thicker border and gradient fill) indicates that you must hold it down for about one second instead of clicking it. As you do, the button will begin filling up with a darker blue color, from left to right:

Once the button is completely filled up, the entire table will be cleared of samples.

Assigning colors to samples

Each column in the main Selected samples table has a radio button under its header; the number of distinct values for that column is listed there as well. Click this radio button to assign a color to each sample based on this column's value:

Each sample will be marked with a color corresponding to a value (e.g. "Thailand" is marked with the teal color). These colors are assigned automatically, but you can edit them by clicking the color swatches in the column header. Hover the mouse over the swatch to see its corresponding value...

...then click it to change its color. For example, you could choose to mark "Thailand" in blue, and all others in white:

If there are 10 or more distinct values in a column, they will not receive individual colors; instead, they will be marked with a two-tone gradient:

In this example, the smallest value (32.4) is shown in teal; the largest value (43.6) in red; and values in between are shaded accordingly (we've also sorted the table by the "depth" column to make this gradient easier to see).
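One plausible way to picture the two-tone gradient is as a linear interpolation between two colors. The sketch below is only an illustration under assumptions: the RGB values chosen for teal and red, the linear blend, and the function name gradient_color are all hypothetical, and Persephone's actual shading may differ.

```python
def gradient_color(value, lo, hi, start=(0, 128, 128), end=(255, 0, 0)):
    """Linearly blend two RGB colors: 'start' at lo, 'end' at hi.

    The default colors (teal and red) are assumed RGB values, not
    necessarily the exact shades used by the application.
    """
    if hi == lo:
        return start  # a single distinct value: no gradient to compute
    t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))  # clamp values outside the [lo, hi] range
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


depths = [32.4, 38.0, 43.6]
for d in depths:
    print(d, gradient_color(d, min(depths), max(depths)))
```

With these assumed colors, the smallest value maps to pure teal, the largest to pure red, and anything in between to a blend of the two.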

Click the radio button in another column to use that column to determine sample colors; click a radio button that is already selected to clear its selection and return to the default view.

You can also assign a unique color to an individual sample. To do so, right-click the sample's row in the main table, then select Select custom color:

In this case, this individual sample (named "IRIS_313-11710") is marked in green, even though the value of its "ORI_COUNTRY" qualifier ("Thailand") would normally be marked in blue. Right-click the sample again and choose Clear custom color to revert to the default coloring mode:

The Preview panel and variant display options

The Preview panel displays a graphical view of the currently selected samples on the currently selected map; this view can be customized in the Track settings panel, as described below. These samples will be ordered according to their position in the main Selected samples table.

Roll the mouse wheel to zoom the view in or out (similar to zooming a map in the main view); drag the mouse over the view to scroll it (alternatively, you can use the scroll bars on the edges of the view). Click the Reset zoom button to reset the zoom so that the entire map fits on the screen.

By default, the variants for each sample are shown as a chart that consists of three lines:

The top line (blue) contains homozygous variants that are identical to the reference (by default, the reference sequence); the middle line (red) contains homozygous variants that are different from the reference; and the bottom line (green) contains heterozygous variants. When the map is zoomed out, these variants are shown as heat maps (white spaces indicate areas with no variants); however, you can zoom in to resolve more detail:

Each sample is assigned a distance score based on its variants: the more variants it has in common with the reference, the lower its distance. Thus, you can enable the Distance column in the Select samples dialog, then sort the samples by distance in the main table:

Samples with more "blue" variants (same as reference) will be placed closer to the top, whereas samples with more "red" variants (different from reference) or "green" variants (heterozygous) will be placed closer to the bottom.
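The three-way split and the sort by distance can be sketched as follows. This is an illustration under assumptions, not Persephone's implementation: genotypes are taken as VCF GT strings (for example "0/0" or "0|1"), and the distance formula (the share of called genotypes that are not "blue") is a hypothetical stand-in, since the actual scoring formula is not given here. The names classify_genotype and distance are made up for the example.

```python
import re


def classify_genotype(gt):
    """Map a VCF GT string to 'blue', 'red', or 'green'; None if the call is missing."""
    alleles = re.split(r"[/|]", gt)
    if "." in alleles:
        return None  # missing call, not drawn on any line
    if len(set(alleles)) > 1:
        return "green"  # heterozygous
    # Homozygous: allele 0 is the reference allele in VCF.
    return "blue" if alleles[0] == "0" else "red"


def distance(genotypes):
    """Hypothetical score: fraction of called genotypes that differ from the reference."""
    calls = [c for c in map(classify_genotype, genotypes) if c is not None]
    if not calls:
        return 0.0
    return sum(c != "blue" for c in calls) / len(calls)


# Made-up genotype calls for two samples.
samples = {
    "S1": ["0/0", "0/0", "0|1", "0/0"],
    "S2": ["1/1", "0/1", "1/1", "./."],
}
for name in sorted(samples, key=lambda n: distance(samples[n])):
    print(name, distance(samples[name]))
```

Sorting by this stand-in score puts S1 (mostly "blue") ahead of S2, mirroring the ordering described above.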

Preview options

You can use the Thickness slider in the Options panel to make the preview lines thinner (to show more samples on the screen at the same time), as shown in the example above. In addition, you can check the Vertical view checkbox to rotate the sample lines vertically:

(These options affect only the appearance of variants in the Preview panel, but not the Variants track in the main view.)

Display options

Instead of displaying the variants as heat maps, you can check the Sliding window checkbox to display them as bar charts. In this mode, a sliding window filter (currently, 15 variants in length) is passed over all variants in the sample, and the number of "red" and "green" variants is used to determine the height of the corresponding bar; the "blue" background is still rendered as a heat map of samples that are identical to the reference:

You can also check the Flat color checkbox to draw each sample as a flat line. In this mode, if the majority of the variants in the sliding window are "red" (different from reference), that entire segment of the chart will be drawn in red (the same applies to "green" variants that are heterozygous):

This display mode is the most compact, making it possible to fit many more samples on the screen and review them at a glance.
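The sliding window and flat color modes can be approximated with the sketch below. The window length of 15 variants comes from the description above; everything else is an assumption made for illustration, including the one-variant step, the handling of the ends of the sample, the strict "more than half" rule, and the function names sliding_counts and flat_color.

```python
from collections import Counter

WINDOW = 15  # window length in variants, as stated in the text


def sliding_counts(calls, window=WINDOW):
    """Yield (red, green) counts for each full window of consecutive calls."""
    for i in range(len(calls) - window + 1):
        counts = Counter(calls[i:i + window])
        yield counts["red"], counts["green"]


def flat_color(calls, window=WINDOW):
    """For each window, return the class held by more than half of its calls, else None."""
    result = []
    for i in range(len(calls) - window + 1):
        label, n = Counter(calls[i:i + window]).most_common(1)[0]
        result.append(label if n > window / 2 else None)
    return result


# Made-up calls for one sample: 5 "blue", then 12 "red", then 3 "green".
calls = ["blue"] * 5 + ["red"] * 12 + ["green"] * 3
print(list(sliding_counts(calls))[:3])
print(flat_color(calls))
```

In the first function, the two counts stand in for the bar heights; in the second, each window collapses to a single color, which is what makes the flat view so compact.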

You can also check the Hide reference variants checkbox to hide the "blue" variants (those identical to the reference) completely. In this mode, the background of each sample line becomes a heat map of the "red" variants, and only the "green" (heterozygous) variants in the sliding window are displayed on the chart:

This option also works in the regular, non-sliding-window mode, although in that case it merely hides the top ("blue") line on each chart:

(These options affect only the appearance of variants in the Preview panel, but not the Variants track in the main view.)

Alternative references

By default, variants are compared to the reference sequence. However, you can instead compare them to a specific sample. To do so, drag the handle in the sample's row and drop it on the Parent 1 box:

Alternatively, you could right-click the sample's row, and select Set as Parent 1 from the context menu:

The Coloring schema will automatically change to Reference genotype, and the preview display will update. In this mode, variants that are the same as the ones in the selected Parent 1 sample (in this case, B001) are colored dark blue; variants that differ from Parent 1 are dark red; and the rest are gray:

The name of the sample that is set as Parent 1 (e.g. B001) is highlighted in blue, for ease of identification (naturally, this sample will contain only "dark blue" variants). In addition, the values in the Distance column change to reflect the distance between each sample and the currently selected parent (as opposed to the default measure of distance to the reference sequence). Note that in this coloring schema, the Distance column is only populated for samples that are currently selected in the main table; its value in the Select samples dialog will be blank:

You can also select a second reference sample to serve as Parent 2; when you do, the coloring schema will be automatically set to Two parents. In this coloring schema, samples that are identical to Parent 1 are still colored blue; samples that are identical to Parent 2 are colored red; and the rest are gray. The value of the Distance column also changes, scoring the first parent as 0.0, the second parent as 1.0, and all the other samples somewhere in between. The names of the two parent samples will be highlighted in blue and red respectively, for ease of identification:

You can switch between different coloring schemas by clicking the appropriate radio button. To remove a sample as a parent, click the button next to its name:

Displaying variants on maps

Once you are satisfied with your coloring schema and display options, click the Display map button to open the currently selected map in the main view (click Display map and close to display the map and close the Variants dialog). The Variants track will be added to the map: 

If you open the Variants dialog again and select a different coloring schema, add/remove samples, sort the samples by a different column, or assign different background colors to the samples, these changes will immediately take effect on the map in the main view.

The Variants track will be automatically added to any map containing any of the currently selected samples, although you can always hide this track if it is not needed.