[feature] : script to automatically download annotated genome files of interest from NCBI
[improvement] : do not allow very long gaps within syntenies; the option -max_gap_size_for_creation_penalty (-mgsc) defines the gap length to which the gap_opening penalty is applied before the gap_extension penalty is applied
[improvement] : run_Insyght_pipeline.pl: check that the child processes launched are still alive while waiting for them, else die with error → useful to detect kernel kills due to lack of resources.
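Note on the entry above: the check itself lives in the Perl script run_Insyght_pipeline.pl and is not shown in this changelog. The following is only a minimal C++/POSIX sketch of the same idea, polling a child with a non-blocking waitpid so that a child killed by the kernel is reported as a failure instead of being waited on forever. The names ChildState, poll_child, spawn and wait_by_polling are hypothetical and chosen for illustration.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ChildState { Running, ExitedOk, Failed };

// Non-blocking status check for one child process started with fork().
ChildState poll_child(pid_t pid) {
    if (pid <= 0) return ChildState::Failed;      // fork() itself failed
    int status = 0;
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == 0) return ChildState::Running;       // child is still alive
    if (r == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return ChildState::ExitedOk;              // normal exit, code 0
    // Killed by a signal (e.g. SIGKILL from the OOM killer),
    // non-zero exit code, or waitpid error.
    return ChildState::Failed;
}

// Hypothetical helper: fork a child that runs `job` and exits with its result.
pid_t spawn(int (*job)()) {
    pid_t pid = fork();
    if (pid == 0) _exit(job());                   // child branch
    return pid;                                   // parent branch (or -1)
}

// Wait for a child by polling instead of blocking; returns its final state.
ChildState wait_by_polling(pid_t pid, unsigned poll_interval_us = 10000) {
    for (;;) {
        ChildState s = poll_child(pid);
        if (s != ChildState::Running) return s;
        usleep(poll_interval_us);
    }
}
```

A caller would stop with an error as soon as wait_by_polling returns ChildState::Failed, which is the "else die with error" behaviour described in the entry.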
[improvement] : generate_sorted_list_comp_elements.pl is now compatible with the absence of mirror data
[improvement] : Task_blast_all_generator_for_IDRIS.pl : option to use environment variables in bash script (${QSUB_LOG_DIR}, ${PLAST_EXEC_PATH}, ${FASTA_MAKEBLASTDB_INPUT_DIR}, ${BLAST_OUTPUT_DIR}, ${MARKER_DONE_BLAST_DIR})
[bug] : file names generated by scripts when using CLUSTER_ORGANISM could be very long and problematic for most unix file systems (the filename length limit is 255, see ‘getconf NAME_MAX /’, and the path length limit is 4096, see ‘getconf PATH_MAX /’) : keep only a representative organism id for the name of the file /list_cluster/list_cluster.txt
[improvement] : make align able to handle blast output files that include many different organisms
[bug] : for the software that finds syntenies (ComProMix), turn on the flags -Wall -Wshadow -Wextra -std=c++11 and correct the associated warnings ; get rid of warnings at compile time
[bug] : for the software that finds syntenies (ComProMix), if several syntenies share the highest score, take the longest one among them
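Note on the entry above: the tie-breaking rule can be sketched as follows. This is not the ComProMix code, whose data structures are not shown here; the Synteny record and the function pick_best are hypothetical, and "length" is assumed to be whatever size measure the software uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical minimal record, for illustration only.
struct Synteny {
    int score;
    std::size_t length;
};

// Returns the index of the synteny with the highest score; when several
// syntenies share that score, the longest one wins. Returns -1 if empty.
std::ptrdiff_t pick_best(const std::vector<Synteny>& candidates) {
    if (candidates.empty()) return -1;
    auto best = std::max_element(
        candidates.begin(), candidates.end(),
        [](const Synteny& a, const Synteny& b) {
            if (a.score != b.score) return a.score < b.score;  // primary key
            return a.length < b.length;                        // tie-breaker
        });
    return best - candidates.begin();
}
```

If score and length are both equal, std::max_element keeps the first such candidate.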
[improvement] : for the software that finds syntenies (ComProMix), possibility to print tandem duplications if they involve a bdbh (successive homologs not on the diagonal but next to each other or one above the other in the matrix). Remark : _tandem_dups_table.tsv
[improvement] : for the software that finds syntenies (ComProMix), print syntenies that branch to other, bigger syntenies in a separate file. Remark : _isBranchedToAnotherSynteny_table
[improvement] : replace atoi with std::stoi and handle the related exceptions accordingly in HomologyMatrix.cc
[improvement] : modify MakeFile so that it does not depend on psql libraries when compiling main_align
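Note on the entry above: atoi silently returns 0 on invalid input, whereas std::stoi throws std::invalid_argument or std::out_of_range. A minimal sketch of the pattern is given below; the wrapper parse_int is hypothetical, and the way HomologyMatrix.cc actually reacts to a failed conversion is not shown in this changelog.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Parses the whole string as an int. Returns false (and leaves `out`
// untouched) instead of throwing when the text is not a valid int.
bool parse_int(const std::string& text, int& out) {
    try {
        std::size_t pos = 0;
        int value = std::stoi(text, &pos);
        if (pos != text.size()) return false;  // trailing characters, e.g. "12abc"
        out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;                          // no number at all, e.g. "" or "abc"
    } catch (const std::out_of_range&) {
        return false;                          // value does not fit in an int
    }
}
```

Checking the position returned by std::stoi is optional; it is included here so that partially numeric fields are rejected rather than truncated.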
[improvement] : [Pipeline] update to blast 2.6
[bug] : remove special characters that cause problems while parsing genome files (e.g. a gene name containing a backslash)
[improvement] : update scripts to new database schema 3
[improvement] : modified script Task_add_alignment_integrator_for_IDRIS.pl with option -TRIM_UNUSED_ALIGNMENT_PARAMS_DATA. When turned ON, do not insert a row in the table alignment_params for a particular pairwise comparison of elements if it doesn't lead to any synteny or ortholog (tables alignments and alignment_pairs are empty) ; disk usage is reduced by 5-40% depending on the average number of elements per organism (the gain is largest for organisms with many contigs) ; the script is 3-4% slower with this option
[improvement] : modified script Task_add_alignment_integrator_for_IDRIS.pl with option -FILE_BLACKLISTED_ORGANISM_IDS to not insert data for the organism_ids listed in the blacklist file (1 organism id per line)
[improvement] : better handling of concurrent file access for script process_Task_add_alignment_parse_tsv_output_Replacing_Forward_Reverse.pl
[improvement] : added RESTRICT_TO_LIST_ASSEMBLY_ACCESSION option to download_genome_files_from_ncbi.pl
[improvement] : add options -OUTPUT_DIR and -INPUT_DIR to generate_sorted_list_comp_elements.pl and integrate_sorted_list_comp_orga_whole.pl
[improvement] : added a database check to avoid redoing the computation for a given element id if the flag CONTINUE_MULTI_RUNS_COMPUTATION is activated
[bug] : fixed an error in generate_sorted_list_comp_elements.pl : wrong computation of the alignment score when there is no mirror data
[improvement] : database integrity is not affected when an error occurs and data were already inserted in previous runs