As noted in the 'add sequences' section, some maps can also be designated as chromosomes. This designation helps sort the maps correctly in the data grids and determines which maps are shown as representatives of the full genome in the graphical output of various search results.

Persephone lists the maps in the grid below the map set tree as follows:

- First, the chromosomes, ordered by the order_no assigned when the sequences were loaded.

- Then the rest of the maps, in reverse order of size (largest first).
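The two rules above can be sketched in a few lines of Python. This is only an illustration of the ordering logic, not Persephone's actual code: the MapRecord class, its field names, and the order_no values given to the non-chromosome maps are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MapRecord:
    # Hypothetical record type; the field names are illustrative only.
    name: str
    length: int
    order_no: int
    is_chromosome: bool = False


def grid_order(maps):
    """Chromosomes first (by order_no), then all other maps, largest first."""
    chromosomes = sorted((m for m in maps if m.is_chromosome),
                         key=lambda m: m.order_no)
    others = sorted((m for m in maps if not m.is_chromosome),
                    key=lambda m: m.length, reverse=True)
    return chromosomes + others


maps = [
    MapRecord("ChrSy", 592_136, order_no=13),
    MapRecord("Chr2", 35_937_250, order_no=1, is_chromosome=True),
    MapRecord("ChrUn", 633_585, order_no=12),
    MapRecord("Chr1", 43_270_923, order_no=0, is_chromosome=True),
]
ordered_names = [m.name for m in grid_order(maps)]
print(ordered_names)  # ['Chr1', 'Chr2', 'ChrUn', 'ChrSy']
```

Note that the order_no of ChrUn and ChrSy plays no role here: for maps without the chromosome flag, only the size matters.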

Mark some sequences as chromosomes - add chromosome records

The decision to identify chromosomes may come after the sequences have been loaded. To add chromosome records to an existing map set, use the command edit chromosomes:

edit chromosomes <{mapSetId | path}>

For example, suppose we decide to mark ChrUn as a chromosome entry:


PS> edit chromosomes "Oryza sativa japonica/Rice IRGSP-1"
Existing chromosomes for Oryza sativa japonica/Rice IRGSP-1:
   CHROM_NAME   CHROM_LENGTH    MAP_NAME
         Chr1     43,270,923        Chr1
         Chr2     35,937,250        Chr2
         Chr3     36,413,819        Chr3
         Chr4     35,502,694        Chr4
         Chr5     29,958,434        Chr5
         Chr6     31,248,787        Chr6
         Chr7     29,697,621        Chr7
         Chr8     28,443,022        Chr8
         Chr9     23,012,720        Chr9
        Chr10     23,207,287       Chr10
        Chr11     29,021,106       Chr11
        Chr12     27,531,856       Chr12

There are 2 more maps that can be designated as chromosomes.
A - Add chromosomes, O - change order, R - remove chromosome flag (unmark), ESC - cancel: A

LINE_NO     MAP_NAME    CHROM_LENGTH
[ 0]          ChrUn          633,585
[ 1]          ChrSy          592,136

Type line number(s) of maps to be designated as chromosomes.
Use comma as a separator or use range of numbers like 0..10 : 0
ChrUn

Choose one of the options:
R - use regular expression to extract chromosome name from map name
F - (not implemented) use file with map and chromosome name pairs (one line for each pair, comma- or tab-delimited)
M - (not implemented) type the names manually
S - (not implemented) sort maps by name and use the order number
Your choice: R
Regular expression to extract chromosome name from the map name: (.+)
ChrUn  ==>      ChrUn
Do you want to insert the chromosome records listed above? (Y/N)Y
Inserted 1 chromosome record(s)

In the example above, the program first lists the maps that are already designated as chromosomes. It then reports 2 maps that do not have an associated chromosome record. The first of them, ChrUn, listed under LINE_NO 0, is selected.
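The selection prompt accepts comma-separated line numbers and ranges written as 0..10. As a rough illustration of that input syntax, the sketch below expands such a string into a list of line numbers. The function name parse_selection is hypothetical, and treating the range as inclusive at both ends is an assumption; the program's own parsing may differ.

```python
def parse_selection(text):
    """Expand input such as "0, 2..4" into a sorted list of line numbers."""
    selected = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if ".." in part:
            start, end = part.split("..", 1)
            # Assumption: a range like 2..4 includes both 2 and 4.
            selected.update(range(int(start), int(end) + 1))
        else:
            selected.add(int(part))
    return sorted(selected)


print(parse_selection("0"))        # [0]
print(parse_selection("0, 2..4"))  # [0, 2, 3, 4]
```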

In the current version, only one method of naming chromosomes is implemented: using a regular expression to extract the chromosome name from the map name. The program searches for a common prefix among the selected maps and suggests a regular expression based on it, so that the resulting chromosome name is as short as possible. Chromosome names are normally shown in the graphical representation of a genome, where they appear together with other chromosome names, so it is important to keep them short.
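To make the idea of a prefix-based suggestion concrete, here is a minimal Python sketch. It is a guess at the general approach, not the program's actual algorithm: the function name suggest_pattern is hypothetical, and the real tool may pick the prefix differently (for example, it may keep part of the common prefix in the name).

```python
import os
import re


def suggest_pattern(map_names):
    """Suggest a regular expression whose first group is the chromosome name."""
    # With a single map there is nothing to compare, so keep the whole name.
    prefix = os.path.commonprefix(map_names) if len(map_names) > 1 else ""
    # Escape the prefix so characters such as '.' are matched literally.
    return re.escape(prefix) + "(.+)"


print(suggest_pattern(["ChrUn"]))                  # (.+)
print(suggest_pattern(["Chr1", "Chr2", "Chr10"]))  # Chr(.+)
```

For a single selected map the sketch returns (.+), which matches the suggestion shown in the transcript above.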

The suggested regular expression (.+) copies the map name into the chromosome name verbatim. To shorten the chromosome name, you can remove the leading Chr by using the regular expression Chr(.+) instead.
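The effect of the two expressions can be checked outside the program, for example in Python. The sketch assumes that the chromosome name is taken from the first capture group and that the expression must match the whole map name; both are assumptions, and the regular expression engine used by the program may differ in details from Python's re module.

```python
import re


def chromosome_name(map_name, pattern):
    """Return the first capture group, or None if the pattern does not match."""
    match = re.fullmatch(pattern, map_name)
    if match is None:
        return None
    return match.group(1)


print(chromosome_name("ChrUn", r"(.+)"))          # ChrUn
print(chromosome_name("ChrUn", r"Chr(.+)"))       # Un
print(chromosome_name("scaffold_7", r"Chr(.+)"))  # None
```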

Reorder existing chromosomes

The order_no values of the sequences are used only to sort the chromosomes, which are shown first, at the top of the list of all maps. The remaining sequences, which are not chromosomes, are ordered by size.

When sequences are loaded, their order_no is assigned according to their order in the original FASTA file. It is not uncommon for the sequences in the file to be given in an arbitrary order.

To change the chromosome order, run a command like:


PS> edit chromosome "Oryza sativa/IRGSP-1.0.31"

This lists the current order of the sequences:


Existing chromosomes for Oryza sativa/IRGSP-1.0.31:
    CHROM_NAME   CHROM_LENGTH    MAP_NAME  ORDER_NO
             9     23,012,720           9    49
            10     23,207,287          10    50
            12     27,531,856          12    51
             8     28,443,022           8    52
            11     29,021,106          11    53
             7     29,697,621           7    54
             5     29,958,434           5    55
             6     31,248,787           6    56
             4     35,502,694           4    57
             2     35,937,250           2    58
             3     36,413,819           3    59
             1     43,270,923           1    60

There are several ways of reordering the maps. Most of the time, the chromosomes can be put in their "natural" order, which sorts the map names while taking into account any numerical values that are part of the names. This ensures, for example, that Chr2 precedes Chr10. Note that plain alphanumeric sorting puts Chr2 after Chr10.
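The difference between the two sort orders is easy to see in a short Python sketch. The natural_key function below is one common way to implement natural ordering and is shown for illustration only; the program's own implementation is not documented here and may handle edge cases differently.

```python
import re


def natural_key(name):
    """Split a name into text and digit runs; digit runs compare as integers."""
    return [int(token) if token.isdigit() else token.lower()
            for token in re.split(r"(\d+)", name)]


names = ["Chr10", "Chr2", "Chr1", "Chr12"]
plain = sorted(names)
natural = sorted(names, key=natural_key)
print(plain)    # ['Chr1', 'Chr10', 'Chr12', 'Chr2']
print(natural)  # ['Chr1', 'Chr2', 'Chr10', 'Chr12']
```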

  • To use the natural ordering, type 'O' at the first prompt and then 'N':


There are 49 more maps that can be designated as chromosomes.
A - Add chromosomes, O - change order,R - remove chromosome flag (unmark), ESC - cancel: O
N - natural order, F - use records from file, ESC - cancel: N
    CHROM_NAME   CHROM_LENGTH    MAP_NAME  ORDER_NO
             1     43,270,923           1     0
             2     35,937,250           2     1
             3     36,413,819           3     2
             4     35,502,694           4     3
             5     29,958,434           5     4
             6     31,248,787           6     5
             7     29,697,621           7     6
             8     28,443,022           8     7
             9     23,012,720           9     8
            10     23,207,287          10     9
            11     29,021,106          11    10
            12     27,531,856          12    11

Save the new order? (Y/N) Y

  • An alternative way of reordering the chromosomes uses the records in a tab-delimited file, where each line holds a map name followed by its order_no, such as:

Chr1        0
Chr2        1
...

Type 'F' at the prompt to select this ordering mode, and provide the path to the file:


N - natural order, F - use records from file, M - order manually, ESC - cancel: F
Path to the file with chromosome order (mapName TAB orderNo):? /tmp/chrom-order.csv
    CHROM_NAME   CHROM_LENGTH    MAP_NAME  ORDER_NO
             1     43,270,923           1     1
             2     35,937,250           2     2
             3     36,413,819           3     3
             4     35,502,694           4     4
             5     29,958,434           5     5
             6     31,248,787           6     6
             7     29,697,621           7     7
             8     28,443,022           8     8
             9     23,012,720           9     9
            10     23,207,287          10    10
            11     29,021,106          11    11
            12     27,531,856          12    12

Save the new order? (Y/N) Y
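A tab-delimited order file of the kind described above can be generated with a short script instead of being typed by hand. The sketch below writes one "mapName TAB orderNo" line per map, in the order given. The function name write_order_file, the output file name, and the choice to start numbering at 0 are assumptions for the example; the sketch writes no header line, on the assumption that the program expects only data lines.

```python
import csv
import os
import tempfile


def write_order_file(map_names, path, first_order_no=0):
    """Write 'mapName<TAB>orderNo' lines, numbering maps in the given order."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for order_no, name in enumerate(map_names, start=first_order_no):
            writer.writerow([name, order_no])


# Hypothetical output location, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "chrom-order.tsv")
write_order_file(["Chr1", "Chr2", "Chr3"], path)

with open(path) as handle:
    content = handle.read()
print(content)
```

The resulting path can then be supplied at the "Path to the file with chromosome order" prompt.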

Removing chromosome records

You can remove the chromosome flag from maps to mark them as non-chromosomes. For example, to downgrade the map FLA1.3ch00, which currently has the status of a chromosome (called ch00), to the level of a scaffold, run:

PS> edit chromosome "Solanum lycopersicum/FLA1.3"
13 existing chromosomes for Solanum lycopersicum/FLA1.3:
    CHROM_NAME   CHROM_LENGTH      MAP_NAME  ORDER_NO
          ch00      5,490,904    FLA1.3ch00     0
          ch01     95,309,210    FLA1.3ch01     1
          ch02     52,158,778    FLA1.3ch02     2
          ch03     66,828,682    FLA1.3ch03     3
          ch04     67,650,907    FLA1.3ch04     4
          ch05     66,930,101    FLA1.3ch05     5
          ch06     46,398,775    FLA1.3ch06     6
          ch07     69,121,753    FLA1.3ch07     7
          ch08     63,731,143    FLA1.3ch08     8
          ch09     67,978,353    FLA1.3ch09     9
          ch10     68,636,165    FLA1.3ch10    10
          ch11     56,952,951    FLA1.3ch11    11
          ch12     68,816,593    FLA1.3ch12    12

O - change order, R - remove chromosome flag (unmark), ESC - cancel: R
LINE_NO     CHROM_NAME   CHROM_LENGTH      MAP_NAME  ORDER_NO
[0 ]              ch00      5,490,904    FLA1.3ch00     0
[1 ]              ch01     95,309,210    FLA1.3ch01     1
[2 ]              ch02     52,158,778    FLA1.3ch02     2
[3 ]              ch03     66,828,682    FLA1.3ch03     3
[4 ]              ch04     67,650,907    FLA1.3ch04     4
[5 ]              ch05     66,930,101    FLA1.3ch05     5
[6 ]              ch06     46,398,775    FLA1.3ch06     6
[7 ]              ch07     69,121,753    FLA1.3ch07     7
[8 ]              ch08     63,731,143    FLA1.3ch08     8
[9 ]              ch09     67,978,353    FLA1.3ch09     9
[10]              ch10     68,636,165    FLA1.3ch10    10
[11]              ch11     56,952,951    FLA1.3ch11    11
[12]              ch12     68,816,593    FLA1.3ch12    12

Select line number(s) of maps to remove the chromosome flag.
Use comma as a separator or use range of numbers like 0..10 : 0
FLA1.3ch00
The following 1 chromosome record(s) will be cleared

    CHROM_NAME   CHROM_LENGTH      MAP_NAME
          ch00      5,490,904    FLA1.3ch00
Do you want to proceed? (Y/N) Y
DATA_VERSION updated
Deleted 1 chromosome record(s)