NMP-db

  • Which labs, and how many, are using NMPdb?

    - Rost lab

  • Who is the main “database authority” for NMPdb?

    - Sven Mika (mika@rostlab.org)

  • What kind of database is it? (flat-file, relational, XML, etc)

    - flat-file

  • Size of database?

    - 5 MB

  • Anticipated yearly growth? (Megabytes/Gigabytes)

    - 10%

  • Backup procedures? How often?
    • Database backups (Hot, Cold, Both) [and/or]
    • Operating system backup

      - OS backup whenever the database is updated

  • What servers/operating systems are hosting them (IP addresses)?

    - IP: 156.111.70.150
    - Red Hat Enterprise Linux WS Release 3 (Taroon Update 8)

  • Approximately how many *active* users?

    - difficult to predict

  • How often is the database used? (Daily, Weekly, Monthly)

  • What platforms are being used? (Oracle, MySQL, PostgreSQL, etc)

    - N/A

  • What applications are using these databases?
    • Web interface?
    • Application (GUI)?
    • Command line interface (CUI)?

      - CGI web interface

  • Is it accessible from outside the firewall to public users?

    - YES (NMPdb)

  • What is the primary purpose of the database & types of data stored?
    • The nuclear matrix (NM) is a structure resulting from the aggregation of proteins and RNA in the nucleus of eukaryotic cells; it is the ‘sticky bit’ that remains after aggressive DNase digestion and salt extraction protocols.
    • Owing to the important role of the NM in DNA replication, DNA transcription and RNA splicing, the expression pattern of NM proteins has become an important early indicator for numerous cancers/tumors.
    • NMPdb currently contains details of 398 NM proteins. These were collected through a semi-automated analysis of over 3000 scientific articles in PubMed.
    • The 398 proteins were matched to 302 protein sequences in UniProt or GenBank. NMPdb records these links together with the following annotations: organism, cell type, PubMed identifier, sequence-based predictions of structural and functional features and, for some entries, the explicit sequence segment responsible for localization (the nuclear matrix targeting signal).
    • NMPdb has been formatted in an EMBL-like flat file format. Each NM protein is represented by one entry. All entries in the database contain the following fields: (i) origin (organism and cell types), (ii) type of nuclear matrix interaction/involvement (INM, ASC, MIX or NUS), (iii) molecular mass and known or calculated pI for locating the protein on a 2D gel and (iv) reference (PubMed IDs of articles describing the interaction).
    • The complete NMPdb database can be downloaded via FTP.
  • Are there any issues or problems with the database?
    • Specific error messages popping up?
    • Problems connecting from the application or web interface?
    • Performance issues (queries are slow, freezes at times, etc)

      - no

  • Would they like help in administering the database?

    - no

  • What additional features or changes would users like to see?
    • new tables or queries
    • additional screens on application or web interface
    • migrate to different database platform (i.e. MySQL to Oracle)

      - no feedback has been provided by users
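
The EMBL-like flat-file layout described under the primary-purpose question (one entry per NM protein) could be read with a few lines of Python. The following is only a minimal sketch: it assumes the usual EMBL conventions of a two-letter line code followed by the value, with `//` ending each entry. The tags used here (`ID`, `OS`, `CT`, `NT`, `MW`, `PI`, `RX`) and all sample values are hypothetical placeholders, not the actual NMPdb field codes or data.

```python
from collections import defaultdict

# Hypothetical sample text. The tags and values are illustrative assumptions,
# not real NMPdb field codes or records.
SAMPLE = """\
ID   NMP_EXAMPLE_1
OS   Homo sapiens
CT   HeLa
NT   INM
MW   68000
PI   5.4
RX   PUBMED; 0000001
RX   PUBMED; 0000002
//
ID   NMP_EXAMPLE_2
OS   Mus musculus
NT   ASC
//
"""


def parse_entries(text):
    """Split EMBL-style text into entries; each entry maps tag -> list of values."""
    entries = []
    current = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):
            # "//" terminates the current entry.
            if current:
                entries.append(dict(current))
            current = defaultdict(list)
            continue
        if not line.strip():
            continue
        # The line code is everything before the first space.
        tag, _, value = line.partition(" ")
        current[tag].append(value.strip())
    if current:
        # Keep a trailing entry that lacks a closing "//".
        entries.append(dict(current))
    return entries


entries = parse_entries(SAMPLE)
for entry in entries:
    print(entry["ID"][0], entry.get("NT", ["?"])[0], len(entry.get("RX", [])))
```

Each value is kept in a list because a tag such as the reference line may repeat within one entry. The real tag names would need to be taken from the downloaded NMPdb file.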
         