CaIntegrator


This page provides a quick introduction to caIntegrator, covering its architecture, installation notes, and bugs.

caIntegrator - Getting Started/Overview

 Downloads (caIntegrator gForge site): http://gforge.nci.nih.gov/frs/?group_id=154

The caIntegrator knowledge framework lets researchers perform ad hoc querying and reporting across multiple domains. The overall goal of the caIntegrator project is to provide a framework with the infrastructural components needed to develop enterprise-level translational applications such as Rembrandt and I-SPY. More broadly, the goals are to:

  • adopt caIntegrator as a warehouse to store analysis results from clinical studies involving genotypic/expression data
  • adopt caArray to store the raw array data (this has yet to be decided, as the Cancer Center currently uses GeneTraffic for this purpose and we may decide to go with that option instead)
  • adopt caTissue for managing the storage of tissue data
  • adopt geWorkbench for facilitating analyses and for providing access to finding data stored in caIntegrator
  • build a caBIG-compatible generic framework that allows retrieval and transformation of data from a variety of heterogeneous data sources that house:
    • microarray data
    • genomic data
    • tissue array data
    • imaging and clinical data
  • build a user-centric, high-performance search, retrieval and analysis platform for translational data:
    • build an analytical tool that allows clinicians, scientists, and biostatisticians to conduct translational analysis of study-specific data in a user-friendly manner
    • caIntegrator-derived applications:
      • Rembrandt
      • I-SPY
      • CGEMS
      • DCEG/EAGLE

Architecture

The application framework is an n-tier, service-oriented architecture comprising pluggable web-based graphical user interfaces, a business object layer, server components that process queries and result sets, a data access layer, and a robust data warehouse.

  • caIntegrator Architecture Guiding Principles
    • build a framework with the infrastructural components needed to develop enterprise-level translational applications such as Rembrandt, I-SPY, and CGEMS
    • driven by user requirements
    • user-friendly for a wide range of audiences (physician scientists, programmers, statisticians)
    • standards-based and pattern-driven
    • extensible and scalable
    • reuse/extend existing open-source technologies
    • caBIG silver-level compatibility
  • caIntegrator Architecture Summary
    • n-tiered architecture (J2EE)
    • rich user-friendly web tier (Struts, XML/XSL, AJAX)
    • clinical-genomics service layer that handles both fine and coarse grained, strongly typed objects
    • scalable run-time analysis service (JMS/R-Server/R-Binary)
    • high-performance query service (multi-threaded query processing, hybrid star schema)
    • remote interface with WebGenome (EJB)

Hardware and Software Requirements

Java Software Development Kit (JDK) version 1.5.0_04

  http://java.sun.com/j2se/1.5.0/download.jsp

JBoss Container (recommended: JBoss version 4.0.4)

http://labs.jboss.com/jbossas/downloads

Jakarta Ant version 1.6.2

http://archive.apache.org/dist/ant/binaries/

Oracle 9i Release 2 (9.2.0.5)

http://www.oracle.com

caIntegrator v1.0

http://gforge.nci.nih.gov/frs/?group_id=154

caIntegrator WGS 1.2 Source Bundle

http://gforge.nci.nih.gov/frs/?group_id=154

Weka 3.4.10 Data Mining Software

http://www.cs.waikato.ac.nz/~ml/weka/index.html

Installation Notes

Create Database and Load Seed Data

  • Check to make sure that database is running and can be connected to:
C:\>tnsping biodb1_adora
TNS Ping Utility for 32-bit Windows: Version 9.2.0.1.0 - Production on 05-JUN-2007 19:03:42
Copyright (c) 1997 Oracle Corporation.  All rights reserved.
Used parameter files:
C:\OraClient92\network\admin\sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = ADORA)(PORT = 1521))) (CONNECT_DA
TA = (SID = BIODB1) (SERVER = DEDICATED)))
OK (20 msec)
  • Logged into Oracle9iR2 on ADORA with DBA account
  • Created a tablespace and created user "integrator" with this tablespace as default
  • Downloaded the wgs_db.zip file from the caIntegrator gForge site specified above
  • Unzipped the file, moved it to D:\Michael on ADORA, and ran the import:
D:\Michael>imp integrator/<password>@biodb2 file=wgs.dmp log=wgs.log full=y
Import: Release 9.2.0.5.0 - Production on Tue May 29 18:20:37 2007
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
Connected to: Oracle9i Enterprise Edition Release 9.2.0.5.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.5.0 - Production
Export file created by EXPORT:V09.02.00 via conventional path
import done in WE8MSWIN1252 character set and AL16UTF16 NCHAR character set
. importing CGEMSQA's objects into INTEGRATOR
. . importing table                "CHR_START_END"         26 rows imported
. . importing table                 "DNA_SPECIMEN"          0 rows imported
. . importing table                   "GENE_ALIAS"      71418 rows imported
. . importing table                     "GENE_DIM"      26850 rows imported
. . importing table                "GENE_SNP_ASSO"     698058 rows imported
. . importing table                "GENOTYPE_FACT"          9 rows imported
. . importing table           "GENOTYPE_STATUS_LU"          2 rows imported
. . importing table                    "HISTOLOGY"          0 rows imported
. . importing table               "SNPID_GENE_MAP"     586388 rows imported
. . importing table    "SNP_ANALYSIS_FINDING_FACT"         10 rows imported
. . importing table           "SNP_ANALYSIS_GROUP"         28 rows imported
. . importing table                    "SNP_ASSAY"    1617414 rows imported
. . importing table     "SNP_ASSOCIATION_ANALYSIS"         10 rows imported
. . importing table                      "SNP_DIM"    1062062 rows imported
. . importing table           "SNP_FREQUENCY_FACT"          3 rows imported
. . importing table                      "SNP_MAP"     647002 rows imported
. . importing table                    "SNP_PANEL"          4 rows imported
. . importing table                     "SPECIMEN"       9454 rows imported
. . importing table        "STDPT_ANALYSIS_GRP_AS"      22922 rows imported
. . importing table                    "STUDY_DIM"          3 rows imported
. . importing table             "STUDY_PANEL_ASSO"          5 rows imported
. . importing table            "STUDY_PARTICIPANT"       6902 rows imported
. . importing table             "STUDY_POPULATION"         12 rows imported
. . importing table    "STUDY_STDPOPUPLATION_ASSO"         12 rows imported
. . importing table         "STUDY_TIMECOURSE_DIM"          0 rows imported
Import terminated successfully without warnings.

Result:

  • 25 tables
  • 4 views
  • 61 indexes
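
The table count in the result can be cross-checked against the import log rather than counted by hand. The following is a minimal sketch, assuming wgs.log uses the same line format as the imp transcript shown above; the function names are made up for illustration and are not part of caIntegrator or Oracle.

```python
import re

# Matches imp log lines such as:
#   . . importing table   "GENE_DIM"   26850 rows imported
TABLE_LINE = re.compile(r'importing table\s+"([^"]+)"\s+(\d+) rows imported')


def summarize_import_log(log_text):
    """Return {table_name: row_count} for every table reported in an imp log."""
    return {name: int(rows) for name, rows in TABLE_LINE.findall(log_text)}


def empty_tables(summary):
    """Return the sorted names of tables that were imported with zero rows."""
    return sorted(name for name, rows in summary.items() if rows == 0)


if __name__ == "__main__":
    # Assumes the log file written by "log=wgs.log" is in the current directory.
    with open("wgs.log") as handle:
        summary = summarize_import_log(handle.read())
    print(len(summary), "tables imported")
    print("tables with 0 rows:", ", ".join(empty_tables(summary)))
```

For the transcript above, the number of matched lines should agree with the 25 tables listed in the result. Tables reported with 0 rows are not necessarily a problem, but they are worth a second look.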

Download required software packages and install (as root)

(1) downloaded JBoss 4.0.4 to /opt/downloads and unzipped it in /opt to create directory "jboss-4.0.4"
(2) downloaded Jakarta Ant 1.6.2 to /opt/downloads and unzipped it in /opt to create directory "ant-1.6.2"
(3) downloaded caIntegrator v1.0 to /opt/downloads and unzipped it in /opt to create directory "caintegrator"
(4) downloaded caIntegrator WGS 1.2 Source bundle and unzipped it in /opt to yield four new zip files:
caintegrator-analysis-commons.zip
caintegrator-application-commons.zip
caintegrator-spec.zip
cgems.zip
(5) downloaded Weka 3.4.10 to /opt/downloads and unzipped it in /opt to create directory "weka-3-4-10"
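
The four build steps that follow each run Ant in a different source directory, in a fixed order. As a rough sketch, the sequence can be written down once and replayed. The directory names and Ant targets are taken from the transcripts on this page; the helper names, and the idea of driving the builds from Python at all, are assumptions made for illustration.

```python
import os
import subprocess

ANT = "/opt/ant-1.6.2/bin/ant"  # Ant location used on this page

# (working directory, Ant target), in the order the steps on this page follow.
# Note: Step 2 also expects weka.jar to have been copied into
# /opt/caintegrator-spec/deployed_jars before Ant is run.
BUILD_STEPS = [
    ("/opt/caintegrator-analysis-commons", "build_dependency"),
    ("/opt/caintegrator-spec", "build_dependency"),
    ("/opt/caintegrator-application-commons", "build_dependency"),
    ("/opt/cgems", "build_war_anthill"),
]


def build_commands(steps=BUILD_STEPS, ant=ANT):
    """Return a list of (cwd, argv) pairs, one per build step."""
    return [(cwd, [ant, target]) for cwd, target in steps]


def run_builds(commands, java_home="/opt/java"):
    """Run each command in order; raises CalledProcessError on the first failure."""
    env = dict(os.environ, JAVA_HOME=java_home)
    for cwd, argv in commands:
        subprocess.run(argv, cwd=cwd, env=env, check=True)


if __name__ == "__main__":
    run_builds(build_commands())
```

Stopping at the first failure matters here because each later build picks up jars that the earlier ones copy into /opt/artifacts.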

Step 1: Building caintegrator-analysis-commons

[rmhonig@afdev opt]# export JAVA_HOME=/opt/java

[rmhonig@afdev caintegrator-analysis-commons]# /opt/ant-1.6.2/bin/ant build_dependency
Buildfile: build.xml
jar_check:
warning:
build_jar:
  [delete] Deleting directory /opt/caintegrator-analysis-commons/bin
   [mkdir] Created dir: /opt/caintegrator-analysis-commons/bin
   [javac] Compiling 53 source files to /opt/caintegrator-analysis-commons/bin
     [jar] Building jar: /opt/caintegrator-analysis-commons/caintegrator-analysis-commons.jar
  [delete] Deleting directory /opt/caintegrator-analysis-commons/bin
   [mkdir] Created dir: /opt/caintegrator-analysis-commons/bin
   [javac] Compiling 53 source files to /opt/caintegrator-analysis-commons/bin
     [jar] Building jar: /opt/caintegrator-analysis-commons/caintegrator-analysis-commons.jar
build_dependency:
    [echo] 
    [echo]                     Artifacts copied to ../artifacts
    [echo] 
    [copy] Copying 1 file to /opt/artifacts
BUILD SUCCESSFUL
Total time: 5 seconds

*** CONFIRM *** caintegrator-analysis-commons.jar was successfully created under /opt/artifacts directory

Step 2: Building caintegrator-spec

  • [rmhonig@afdev weka-3-4-10]# cp weka.jar /opt/caintegrator-spec/deployed_jars
  • [rmhonig@afdev weka-3-4-10]# cd /opt/caintegrator-spec/deployed_jars
  • [rmhonig@afdev opt]# cd caintegrator-spec
[rmhonig@afdev caintegrator-spec]# /opt/ant-1.6.2/bin/ant build_dependency 
Buildfile: build.xml
jar_check:
warning:
config_application_context:
    [copy] Copying 1 file to /opt/caintegrator-spec
build_jar_anthill:
   [mkdir] Created dir: /opt/caintegrator-spec/bin
   [javac] Compiling 278 source files to /opt/caintegrator-spec/bin
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/annotation/snp/bean/PlatformTechnology.java:42: 
warning: unmappable character for encoding UTF8
   [javac]    * The SNPlex� Genotyping System enables the simultaneous genotyping of up to 48 SNPs (single nucleotide 
   [javac]                ^
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/common/bean/Measurement.java:46: 
warning: unmappable character for encoding UTF8
   [javac]    * such as ml, kg, mm, m/s, �F, etc. 
   [javac]                               ^
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/common/bean/Measurement.java:53: 
warning: unmappable character for encoding UTF8
   [javac]    * such as ml, kg, mm, m/s, �F, etc. 
   [javac]                               ^
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/finding/clinical/breastCancer/bean/BreastCancerClinicalFinding.java:113:
warning: unmappable character for encoding UTF8
   [javac]    * Estrogen Receptor Status � Total Score Total Score = ER_PS+ ER_IS Considered Allred Score; = 3 is 
   [javac]                               ^
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/finding/clinical/breastCancer/bean/BreastCancerClinicalFinding.java:120:
warning: unmappable character for encoding UTF8
   [javac]    * Estrogen Receptor Status � Total Score Total Score = ER_PS+ ER_IS Considered Allred Score; = 3 is 
   [javac]                               ^
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/finding/clinical/breastCancer/bean/BreastCancerClinicalFinding.java:288:
warning: unmappable character for encoding UTF8
   [javac]    * Size of Largest Palpable Node (cm) � Clinical Assessment at Baseline
   [javac]                                         ^
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/finding/clinical/breastCancer/bean/BreastCancerClinicalFinding.java:293:
warning: unmappable character for encoding UTF8
   [javac]    * Size of Largest Palpable Node (cm) � Clinical Assessment at Baseline
   [javac]                                         ^
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/finding/clinical/breastCancer/bean/BreastCancerClinicalFinding.java:394:
warning: unmappable character for encoding UTF8
   [javac]    * Progesterone Receptor Status � Total Score Total Score = PgR_PgS+ PgR_IS Considered Allred Score; 
   [javac]                                   ^
   [javac]   /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/finding/clinical/breastCancer/bean/BreastCancerClinicalFinding.java:401:
warning: unmappable character for encoding UTF8
   [javac]    * Progesterone Receptor Status � Total Score Total Score = PgR_PgS+ PgR_IS Considered Allred Score; 
   [javac]                                   ^
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/study/bean/ProcedureName.java:97: 
warning: unmappable character for encoding UTF8
   [javac]    * the health of the heart�s major pumping chambers. 
   [javac]                             ^
   [javac] /opt/caintegrator-spec/src/gov/nih/nci/caintegrator/domain/study/bean/ProcedureName.java:106: 
warning: unmappable character for encoding UTF8
   [javac]    * history � an account of the symptoms as experienced by the patient. Together with the medical history, 
   [javac]              ^
   [javac] Note: Some input files use or override a deprecated API.
   [javac] Note: Recompile with -Xlint:deprecation for details.
   [javac] Note: Some input files use unchecked or unsafe operations.
   [javac] Note: Recompile with -Xlint:unchecked for details.
   [javac] 11 warnings
   [jar] Building jar: /opt/caintegrator-spec/caintegrator-spec.jar
build_dependency:
    [echo] 
    [echo]                     Artifacts copied to ../artifacts
    [echo] 
    [copy] Copying 1 file to /opt/artifacts
BUILD SUCCESSFUL
Total time: 11 seconds

*** CONFIRM *** caintegrator-spec.jar was successfully created under /opt/artifacts directory
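
The "unmappable character for encoding UTF8" warnings in the build output indicate that some source files contain bytes that are not valid UTF-8. A likely cause (an assumption, not something this page establishes) is that those files were saved in a single-byte encoding such as Windows-1252. The sketch below lists the offending lines so they can be inspected; the function names are hypothetical.

```python
import os


def bad_lines(data):
    """Return 1-based line numbers in `data` (bytes) that are not valid UTF-8."""
    bad = []
    for number, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


def scan_tree(root, suffix=".java"):
    """Map each matching file under `root` to its list of offending lines."""
    report = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffix):
                path = os.path.join(folder, name)
                with open(path, "rb") as handle:
                    lines = bad_lines(handle.read())
                if lines:
                    report[path] = lines
    return report


if __name__ == "__main__":
    # Source directory taken from the build transcript above.
    for path, lines in sorted(scan_tree("/opt/caintegrator-spec/src").items()):
        print(path, lines)
```

The warnings affect only comments in the transcript above and the build still succeeds. If they need to be silenced, Ant's javac task accepts an encoding attribute; which value matches these sources would have to be confirmed against the files themselves.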

Step 3: Building caintegrator-application-commons

[rmhonig@afdev opt]# cd caintegrator-application-commons

[rmhonig@afdev caintegrator-application-commons]# /opt/ant-1.6.2/bin/ant build_dependency
Buildfile: build.xml
build_jar_anthill:
   [mkdir] Created dir: /opt/caintegrator-application-commons/bin
   [javac] Compiling 70 source files to /opt/caintegrator-application-commons/bin
   [javac] Note: Some input files use unchecked or unsafe operations.
   [javac] Note: Recompile with -Xlint:unchecked for details.
     [jar] Building jar: /opt/caintegrator-application-commons/caintegrator-application-commons.jar
retrieve_deployment_artifacts:
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
build_dependency:
    [echo] 
    [echo]                     Artifacts copied to ../artifacts
    [echo] 
    [copy] Copying 1 file to /opt/artifacts
BUILD SUCCESSFUL
Total time: 3 seconds

*** CONFIRM *** caintegrator-application-commons.jar was successfully created under /opt/artifacts directory

Step 4: Building cgems.war

[rmhonig@afdev ~]# cd /opt/cgems/

[rmhonig@afdev cgems]# /opt/ant-1.6.2/bin/ant build_war_anthill
Buildfile: build.xml
config_application_context:
    [copy] Copying 1 file to /opt/cgems
    [move] Moving 1 files to /opt/cgems/src
config_common_security_module:
    [echo] Configuring Common Security Module
    [echo] Setting ApplicationSecurityConfig.xml
    [copy] Copying 1 file to /opt/cgems/csm_deploy
    [echo] Setting cgems.hibernate.cfg.xml
    [copy] Copying 1 file to /opt/cgems/csm_deploy
    [echo] Configuring oracle-ds.xml
    [copy] Copying 1 file to /opt/cgems/csm_deploy
    [echo] Configuring properties-service.xml
    [copy] Copying 1 file to /opt/cgems/csm_deploy
[replaceregexp] The following file is missing: '/opt/cgems/csm_deploy/properties-service.xml'
    [echo] Configuring login-config.xml
    [copy] Copying 1 file to /opt/cgems/csm_deploy
configure_cgems-properties-service:
    [echo] Setting caIntegratorConfig.xml
    [copy] Copying 1 file to /opt/cgems/caintegrator_deploy
    [echo] Configuring properties-service.xml
    [copy] Copying 1 file to /opt/cgems/caintegrator_deploy
    [copy] Copying 1 file to /opt/cgems/caintegrator_deploy
    [copy] Copying 1 file to /opt/cgems/caintegrator_deploy
    [copy] Copying 1 file to /opt/cgems/caintegrator_deploy
deploy_artifacts:
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
    [copy] Copying 1 file to /opt/artifacts
build_war_anthill:
   [mkdir] Created dir: /opt/cgems/bin
   [javac] Compiling 49 source files to /opt/cgems/bin
    [copy] Copying 54 files to /opt/cgems/WebRoot/WEB-INF/classes
    [copy] Copying 3 files to /opt/cgems/WebRoot/WEB-INF/classes
     [war] Building war: /opt/cgems/cgems.war
     [war] Warning: selected war files include a WEB-INF/web.xml which will be ignored (please use webxml attribute to war task)
    [copy] Copying 1 file to /opt/artifacts
BUILD SUCCESSFUL
Total time: 13 seconds

*** CONFIRM *** the following were successfully created under the /opt/artifacts directory:
(1) cgems.war
(2) ApplicationSecurityConfig.xml
(3) oracle-ds.xml
(4) properties-service.xml
(5) login-config.xml
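
The CONFIRM checks from Steps 1 to 4 can be run in one pass. This is a small sketch, assuming the artifact names given in the CONFIRM notes on this page; the helper name and its exists parameter are illustrative only.

```python
import os

# Files the CONFIRM notes in Steps 1-4 expect under /opt/artifacts.
EXPECTED_ARTIFACTS = [
    "caintegrator-analysis-commons.jar",
    "caintegrator-spec.jar",
    "caintegrator-application-commons.jar",
    "cgems.war",
    "ApplicationSecurityConfig.xml",
    "oracle-ds.xml",
    "properties-service.xml",
    "login-config.xml",
]


def missing_artifacts(directory, expected=EXPECTED_ARTIFACTS, exists=os.path.isfile):
    """Return the expected file names that are not present in `directory`."""
    return [name for name in expected
            if not exists(os.path.join(directory, name))]


if __name__ == "__main__":
    missing = missing_artifacts("/opt/artifacts")
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all expected artifacts are present")
```

The build log in Step 4 reports that '/opt/cgems/csm_deploy/properties-service.xml' was missing at one point, so properties-service.xml is the entry most worth checking.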

Step 5: Configure JBoss for WGS CGEMS application

  • Modify /opt/artifacts/oracle-ds.xml with database information (IP address, db_instance, user-name, password)
<datasources>
  <local-tx-datasource>
              <jndi-name>cgems</jndi-name>
              <connection-url>jdbc:oracle:thin:@156.111.188.180:1521:BIODB2</connection-url>
              <user-name>integrator</user-name>
              <password>XXXXXXXXX</password>
              <driver-class>oracle.jdbc.driver.OracleDriver</driver-class>
              <exception-sorter-class-name>org.jboss.resource.adapter.jdbc.vendor.OracleExceptionSorter</exception-sorter-class-name>
  </local-tx-datasource>
</datasources>
  • # cp oracle-ds.xml /opt/jboss-4.0.4/server/default/deploy/
  • # mkdir caintegrator/externalized_properties_folder
  • # cp mail.properties zip.properties ../caintegrator/externalized_properties_folder/
  • copied the following to /opt/jboss-4.0.4/server/default/deploy/properties-service.xml
 <attribute name="Properties">
     gov.nih.nci.cgems.zip.properties=/opt/caintegrator/externalized_properties_folder/zip.properties
     gov.nih.nci.cgems.mail.properties=/opt/caintegrator/externalized_properties_folder/mail.properties
     gov.nih.nci.caintegrator.configFile=/opt/caintegrator/externalized_properties_folder/caIntegratorConfig.xml
 </attribute>
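
Instead of editing oracle-ds.xml by hand, the datasource fragment shown earlier in this step can be generated from the connection details. The sketch below reproduces the same elements using Python's standard xml.etree.ElementTree module. The function name and the example host and password are placeholders, and JBoss may accept further elements that are not shown here.

```python
import xml.etree.ElementTree as ET


def render_oracle_ds(host, port, sid, user, password, jndi_name="cgems"):
    """Return the text of an oracle-ds.xml with one local-tx-datasource."""
    root = ET.Element("datasources")
    datasource = ET.SubElement(root, "local-tx-datasource")
    fields = [
        ("jndi-name", jndi_name),
        ("connection-url", "jdbc:oracle:thin:@%s:%d:%s" % (host, port, sid)),
        ("user-name", user),
        ("password", password),
        ("driver-class", "oracle.jdbc.driver.OracleDriver"),
        ("exception-sorter-class-name",
         "org.jboss.resource.adapter.jdbc.vendor.OracleExceptionSorter"),
    ]
    for tag, text in fields:
        ET.SubElement(datasource, tag).text = text
    # ElementTree escapes characters such as & and < in the values.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values; substitute the real host, SID and password.
    print(render_oracle_ds("db.example.org", 1521, "BIODB2", "integrator", "changeme"))
```

Generating the file has one practical advantage over hand editing: a password containing & or < is escaped automatically, so the XML stays well-formed.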

Step 6: Deploy WGS CGEMS application under JBoss

  • [rmhonig@afdev artifacts]# cp cgems.war /opt/jboss-4.0.4/server/default/deploy/
  • [rmhonig@afdev bin]# export JAVA_HOME=/opt/java
  • [rmhonig@afdev ~]# cd /opt/jboss-4.0.4/bin
  • [rmhonig@afdev bin]# nohup ./run.sh & (to start JBoss)
  • Point the browser to http://afdev:8080/cgems/ to get to CGEMS About page.
  • [rmhonig@afdev bin]# ./shutdown.sh -S (to stop JBoss)
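
JBoss takes some time to deploy cgems.war after run.sh is started, so the About page may not answer immediately. A simple polling loop, sketched below, can confirm when the application is reachable. The URL is the one given above; the retry count, delay and function names are arbitrary choices for illustration.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=5):
    """Return True if a GET on `url` answers with HTTP status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_up(url, attempts=30, delay=2.0, probe=http_ok, sleep=time.sleep):
    """Poll `url` until `probe` succeeds.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


if __name__ == "__main__":
    result = wait_until_up("http://afdev:8080/cgems/")
    print("up after attempt %d" % result if result else "not reachable")
```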