Release notes on database schema
rev 3 (December 2018)
major modifications
added tables for the new types of data generated by the pipeline: protein fusions, genes in tandem duplication, and protein alignments within 1% of the best bit score
added information to flag organisms that are being inserted or are only partially inserted
minor modifications
alignment_param_id changed from integer to bigint
cleaned up unused tables
simplified the table ncbi_taxonomy_tree
moved parameter information into its own table to save disk space
fixed minor errors when restoring the database from scratch (origami_init)
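The change of alignment_param_id from integer to bigint can affect client code that stores the identifier in a fixed-width type. The following is a minimal sketch, assuming the usual SQL widths (4-byte signed integer, 8-byte signed bigint); the function name is hypothetical and not part of the schema.

```python
# Signed ranges, assuming a 4-byte "integer" and an 8-byte "bigint".
INT4_MIN, INT4_MAX = -2**31, 2**31 - 1
INT8_MIN, INT8_MAX = -2**63, 2**63 - 1


def smallest_integer_type(value: int) -> str:
    """Return the narrowest of the two column types that can hold value.

    Hypothetical helper, for illustration only.
    """
    if INT4_MIN <= value <= INT4_MAX:
        return "integer"
    if INT8_MIN <= value <= INT8_MAX:
        return "bigint"
    raise OverflowError(f"{value} does not fit in a bigint column")


# An identifier one past the 4-byte limit needs the wider type.
print(smallest_integer_type(INT4_MAX))      # fits in integer
print(smallest_integer_type(INT4_MAX + 1))  # needs bigint
```

Client code that reads alignment_param_id into a 32-bit variable may need the same widening.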
rev 2 (August 2017)
modified tables genes, elements, and organisms to reduce cross-table queries: added columns locus_tag, accession, is_pseudo, length_residues to table genes
added genome assembly info to help organize molecules into living organisms; additional info in table organisms: NCBI_assemblyaccession_IT, NCBI_assemblyname_IT, NCBI_lastupdatedate_IT, NCBI_seqreleasedate_IT, NCBI_isolate_IT, NCBI_speciestaxid_IT, NCBI_biosampleid_IT, NCBI_biosampleaccn_IT, NCBI_list_bioprojectid_IT, NCBI_list_bioprojectaccn_IT, NCBI_assemblyclass_IT, NCBI_assemblystatus_IT
added more links from NCBI; additional info in table elements: NCBI_internal_id, NCBI_sourcedb, NCBI_tech, NCBI_geneticcode, NCBI_topology, NCBI_completeness, NCBI_status, NCBI_comment
removed columns to avoid redundancy: elements.description (→ sequences.definition), elements.date_seq (→ sequences.date_seq), elements.version (→ sequences.version), genes.residue (→ prot_feat.proteine)
fixed problem with long gene names and locus_tag values
fixed ownership of tables seq_group_id, seq_user_id and of functions truncate_tables_public, truncate_tables_micado
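When updating queries written against the rev 1 schema, the rev 2 column removals can be kept as a small lookup table. This is only a sketch: the mapping is copied from the notes above, while the helper name is hypothetical and the joins needed to reach the new tables are not described here.

```python
# Old column -> replacement column, as listed in the rev 2 notes.
MOVED_COLUMNS = {
    "elements.description": "sequences.definition",
    "elements.date_seq": "sequences.date_seq",
    "elements.version": "sequences.version",
    "genes.residue": "prot_feat.proteine",
}


def new_location(column: str) -> str:
    """Return where a column lives after rev 2.

    Columns that were not removed are returned unchanged.
    Hypothetical helper, for illustration only.
    """
    return MOVED_COLUMNS.get(column, column)


for old in ("elements.description", "genes.residue", "genes.locus_tag"):
    print(old, "->", new_location(old))
```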
release_notes_on_database_schema.txt · Last modified: 2019/01/04 08:11 by thomas.lacroix@inra.fr