Release notes on database schema


  • rev 3 (December 2018)
    • major modifications
      • added tables for the new types of data generated by the pipeline: protein fusions, genes in tandem duplication, protein alignments within 1% of the best bit score
      • added information to flag organisms that are being inserted or are only partially inserted
    • minor modifications
      • alignment_param_id changed from integer to bigint
      • cleaned up unused tables
      • simplified the table ncbi_taxonomy_tree
      • moved parameter information into its own table to save disk space
      • fixed minor errors when restoring the database from scratch (origami_init)
  • rev 2 (August 2017)
    • modified tables genes, elements, and organisms to reduce cross-table queries: added columns locus_tag, accession, is_pseudo, length_residues to table genes
    • added genome assembly info to help group molecules into living organisms; new columns in table organisms: NCBI_assemblyaccession_IT, NCBI_assemblyname_IT, NCBI_lastupdatedate_IT, NCBI_seqreleasedate_IT, NCBI_isolate_IT, NCBI_speciestaxid_IT, NCBI_biosampleid_IT, NCBI_biosampleaccn_IT, NCBI_list_bioprojectid_IT, NCBI_list_bioprojectaccn_IT, NCBI_assemblyclass_IT, NCBI_assemblystatus_IT
    • added more links from NCBI; new columns in table elements: NCBI_internal_id, NCBI_sourcedb, NCBI_tech, NCBI_geneticcode, NCBI_topology, NCBI_completeness, NCBI_status, NCBI_comment
    • removed columns to avoid redundancy: elements.description (→ sequences.definition), elements.date_seq (→ sequences.date_seq), elements.version (→ sequences.version), genes.residue (→ prot_feat.proteine)
    • fixed a problem with long gene names and locus_tag values
    • fixed ownership of tables seq_group_id, seq_user_id and functions truncate_tables_public, truncate_tables_micado
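
The rev 3 change of alignment_param_id from integer to bigint is, presumably, about the range of values the column can hold. The sketch below illustrates that range difference. It assumes a 4-byte signed integer and an 8-byte signed bigint, as in PostgreSQL; these notes do not name the database engine, and the helper `fits` and the value `example_id` are hypothetical names used only for illustration.

```python
# Assumption: "integer" is a 4-byte signed type and "bigint" an 8-byte
# signed type (as in PostgreSQL). The release notes do not state this.
INTEGER_MAX = 2**31 - 1
BIGINT_MAX = 2**63 - 1


def fits(value: int, max_value: int) -> bool:
    """Return True if value lies in the signed range whose upper bound is max_value."""
    return -max_value - 1 <= value <= max_value


# Hypothetical identifier one past the largest 4-byte signed value.
example_id = INTEGER_MAX + 1

print(fits(example_id, INTEGER_MAX))  # False: too large for integer
print(fits(example_id, BIGINT_MAX))   # True: still valid as bigint
```

Under these assumptions, any identifier above 2147483647 cannot be stored in an integer column but fits comfortably in a bigint column.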
release_notes_on_database_schema.txt · Last modified: 2019/01/04 08:11 by thomas.lacroix@inra.fr