cleanup temp_blast_folder

During data loading, some temporary files are created that can later be deleted. Sometimes it makes sense to reuse them to save time on lengthy operations. For example, when finding gene ortholog pairs, the protein sequences are generated from the records in the database and saved as FASTA files in a sub-folder of the main temporary folder. These protein files can be reused across multiple ortholog-finding runs.

The results of the lengthy BLAST searches performed during ortholog generation are also stored in the temporary BLAST folder. If the BLAST process completes successfully, these files can be reused. For example, in the rare case that writing to the database fails (for instance, because of network issues), the ortholog search can be repeated, reusing the temporary protein files and BLAST search results. However, if the BLAST parameters (specified in the PersephoneShell configuration file) have changed, the old search results should not be reused, because BLAST needs to be rerun with the new parameters. In that case, it is important to delete the BLAST results from the temporary directory. Use the cleanup temp_blast_folder command (with the optional test mode, -t) to force the search results to be regenerated.
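The decision described above (reuse the temporary BLAST results unless the parameters have changed) can be sketched as a small helper script. This is only an illustration and not part of PersephoneShell: the function names and the idea of a "stamp" file holding a fingerprint of the configuration file are assumptions made for this example. Note that hashing the whole configuration file flags any edit, not only changes to the BLAST parameters, so the check errs on the side of rerunning.

```python
import hashlib
from pathlib import Path


def config_fingerprint(config_path: Path) -> str:
    """Return a SHA-256 hex digest of the configuration file contents."""
    return hashlib.sha256(config_path.read_bytes()).hexdigest()


def blast_results_are_stale(config_path: Path, stamp_path: Path) -> bool:
    """Return True if no fingerprint was recorded or the configuration changed.

    stamp_path is a hypothetical file written by record_fingerprint();
    PersephoneShell itself does not create it.
    """
    if not stamp_path.exists():
        return True
    return stamp_path.read_text().strip() != config_fingerprint(config_path)


def record_fingerprint(config_path: Path, stamp_path: Path) -> None:
    """Store the current fingerprint, e.g. after a successful BLAST run."""
    stamp_path.write_text(config_fingerprint(config_path))
```

If blast_results_are_stale() returns True, that would be the moment to run cleanup temp_blast_folder before repeating the ortholog search.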

cleanup blast_folder

In older versions of PersephoneShell, deleting a map set or an annotation track left the corresponding BLAST files behind. To save disk space and avoid confusion, run the command cleanup blast_folder [-t] when the BLAST index files are no longer needed and cannot be reused.

Note

The latest version of PersephoneShell automatically deletes BLAST files that are no longer associated with existing map sets or tracks. This happens as part of the delete mapset and delete tracktreenode commands.

cleanup vcf_temp_folder

The test mode of the add variant command parses the VCF file and prepares binary blocks to be loaded into the database later. This binary data is stored in temporary folders and can take up considerable disk space. Normally, once the variant data has been loaded successfully, these temporary files are no longer needed and can be deleted with the cleanup vcf_temp_folder command.

cleanup sequence_storage

Removes "orphan" files from the storage, that is, files that do not have corresponding records in the database. This situation may arise after upgrading from an older version of PersephoneShell.
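The notion of an orphan file can be illustrated with a short sketch. It is not PersephoneShell's implementation: the function name, the flat matching by file name, and the known_names set (standing in for whatever the database records reference) are all assumptions made for this example. The function only lists candidates and deletes nothing, similar in spirit to a test mode.

```python
from pathlib import Path


def find_orphan_files(storage_dir: Path, known_names: set[str]) -> list[Path]:
    """Return files under storage_dir whose names are not in known_names.

    known_names is a hypothetical stand-in for the file names that the
    database still references. The result is sorted for stable output.
    """
    return sorted(
        p for p in storage_dir.rglob("*")
        if p.is_file() and p.name not in known_names
    )
```

Reviewing such a list before removing anything is a cautious way to confirm that only unreferenced files would be affected.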