Difference between revisions of "FAQ"

Latest revision as of 18:39, 23 January 2015

Using geWorkbench

Q. Can I run geWorkbench on a 32/64-bit Windows system?

A. Yes. geWorkbench runs on both 32-bit and 64-bit systems. Starting with geWorkbench 2.6.0, both 32-bit and 64-bit installers are provided for Windows. The 64-bit version is preferred because it allows larger data sets to be used. Mac OS X itself supports only 64-bit versions. For Linux, a 64-bit installer is provided; for 32-bit Linux, please use the "noJRE" Linux installer. In that case you will need to install the Java 7 JRE yourself; it can be downloaded from java.oracle.com.

Further instructions can be found on the Download and Installation page.
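If you are unsure whether the JVM on your machine is 32-bit or 64-bit, the following minimal Python sketch may help. It is not part of geWorkbench. It assumes that the JVM reports "64-Bit" in its `java -version` output, as Oracle HotSpot JVMs typically do; the function names are illustrative only.

```python
import shutil
import subprocess


def jvm_is_64bit(version_output: str) -> bool:
    """Return True if the `java -version` text mentions a 64-bit VM.

    Assumption: 64-bit HotSpot JVMs print "64-Bit" in their version banner.
    """
    return "64-bit" in version_output.lower()


def read_java_version(java_cmd: str = "java"):
    """Return the text printed by `java -version`, or None if java is not on PATH."""
    if shutil.which(java_cmd) is None:
        return None
    # `java -version` writes its report to stderr rather than stdout.
    result = subprocess.run([java_cmd, "-version"], capture_output=True, text=True)
    return result.stderr + result.stdout
```

Calling `jvm_is_64bit(read_java_version())` when the second function returns text gives a quick indication; if no "64-Bit" marker is present, the JVM is probably 32-bit.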

Q. Will calculations run faster using a 64-bit JVM than on a 32-bit JVM?

A. In our tests, yes. We compared single runs of ARACNe on 32-bit and 64-bit JVMs and found a significant speed increase when using the 64-bit JVM.

Testing conditions:

  • Dataset: We tested using a dataset with 176 microarrays of type HG-U133A, with 22,283 probesets. A set of 2013 hub markers was used.
  • ARACNe parameters: p-value cutoff 0.01, Bonferroni correction, DPI 0.15, no bootstrapping.
  • Machines: two machines, each with a Core 2-6700 CPU (2.66 GHz) and 4 or 6 GB of memory.
  • OS: Windows 7 Enterprise Edition, 64-bit.
  • JVM: Oracle 1.6.0_32 or 1.6.0_31.

Results: On both machines tested, the ARACNe jobs always finished faster when running on the 64-bit JVM than on the 32-bit JVM. In paired, back-to-back tests on each machine, the time to finish was roughly 20% to 95% longer on the 32-bit JVM (actual results from 3 tests: 21%, 51%, and 95% longer on the 32-bit JVM).
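To make the "percent longer" figures concrete, here is a small sketch of the arithmetic. The run times below are hypothetical values chosen only to reproduce the reported percentages; the actual timings are not given in this FAQ.

```python
def percent_longer(t32_seconds: float, t64_seconds: float) -> float:
    """Return how much longer the 32-bit run took, as a percentage of the 64-bit time."""
    if t64_seconds <= 0:
        raise ValueError("the 64-bit run time must be positive")
    return 100.0 * (t32_seconds - t64_seconds) / t64_seconds


# Hypothetical (32-bit, 64-bit) timings in seconds, not measured values.
runs = [(121.0, 100.0), (151.0, 100.0), (195.0, 100.0)]
slowdowns = [percent_longer(t32, t64) for t32, t64 in runs]
```

Read the other way around, a run that takes 95% longer on the 32-bit JVM means the 64-bit run needed only about half the time (1 / 1.95 ≈ 0.51).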

Possible explanation: Current AMD and Intel CPUs, when operating using the 32-bit x86 instruction set, provide 8 general purpose registers. When operating in 64-bit mode (x86-64), code has access to an additional 8 general purpose registers (1). These may provide an advantage to code with tight loops in the calculation (2). For 64-bit operation, both a 64-bit operating system and a 64-bit JVM must be used.

References:

1. http://en.wikipedia.org/wiki/X86-64#Architectural_features

2. http://en.wikipedia.org/wiki/64-bit#Pros_and_cons

Q. I get a Java error when I try to start geWorkbench.

A. This is almost always caused by the Java JRE not being installed or not being found (when you use a version of geWorkbench that does not include the JRE). You can either try a geWorkbench installer that includes the JRE, or make sure that an appropriate JRE is installed on your system.
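One way to check whether a JRE can be found is sketched below. It is not part of geWorkbench. It assumes the common conventions of looking in the JAVA_HOME environment variable and then on the PATH; whether a particular geWorkbench launcher uses the same lookup order is an assumption.

```python
import os
import shutil


def find_java(env=None):
    """Return the path of a java executable, or None if none is found.

    Checks JAVA_HOME/bin first, then the PATH (assumed lookup order).
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if java_home:
        exe = "java.exe" if os.name == "nt" else "java"
        candidate = os.path.join(java_home, "bin", exe)
        if os.path.isfile(candidate):
            return candidate
    # Fall back to searching the directories listed in PATH.
    return shutil.which("java", path=env.get("PATH"))
```

If `find_java()` returns None, no JRE was found by either route, which is consistent with the start-up error described above.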

Q. Where can I obtain the latest annotation files for my microarray platform?

A. Affymetrix annotation files can be downloaded from their support site, at www.affymetrix.com. A free Netaffx account sign-up is required.

There are several ways to reach the annotation files, two of which are shown next:

1. The following link will take you directly to a list of most (but not all) current annotation files.

http://www.affymetrix.com/support/technical/annotationfilesmain.affx

2. Some arrays, for example the HG-U95Av2 array used in some geWorkbench tutorials (for the BCell-100.exp data set), are not shown in the above list. However, the files can be obtained from the catalog of individual array products at the following link:

http://www.affymetrix.com/support/technical/byproduct.affx?cat=arrays

For the HG-U95Av2 annotation file, browse down to 3' Gene Expression Analysis and look for Human Genome Arrays. Then find the section "Current NetAffx Annotation Files".


The file format required is CSV (comma separated values).
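As a quick sanity check that a downloaded file really is in CSV form, something like the following sketch can be used. It assumes that comment lines start with "#" and that the columns are named "Probe Set ID" and "Gene Symbol"; both are assumptions about the file layout and should be verified against your actual file. The sample rows are made up.

```python
import csv
import io


def read_annotation_symbols(csv_text, id_column="Probe Set ID", symbol_column="Gene Symbol"):
    """Map each probe set ID to its gene symbol from annotation CSV text."""
    # Drop blank lines and comment lines (assumed to start with "#").
    lines = [line for line in csv_text.splitlines() if line and not line.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)))
    return {row[id_column]: row[symbol_column] for row in reader}


# Made-up sample content, for illustration only.
SAMPLE = '''#%example_comment=hypothetical
"Probe Set ID","Gene Symbol"
"probe_1","GENE1"
"probe_2","GENE2"
'''
symbols = read_annotation_symbols(SAMPLE)
```

If the expected columns are missing, `DictReader` raises a `KeyError`, which suggests the file is not in the layout assumed here.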

Q. Is the Affymetrix Gene 2.0 ST annotation file type supported?

A. Yes. geWorkbench 2.4.0 added a parser to support the Affymetrix Gene and Exon 1.0 ST transcript-level, CSV-format annotation files. The Gene 2.0 ST transcript-level CSV annotation file uses the same format and can be read in using the Gene/Exon 1.0 ST parser choice in geWorkbench.

Q. How do I increase the amount of memory available to Java to run geWorkbench?

A. Windows and Mac OS X versions of geWorkbench now include .bat and .command files, respectively, in the installation directory. These allow geWorkbench to be started directly with anywhere from 1 to 16 GB of Java heap memory.

1. If you instead wish to change the default maximum amount of Java heap memory when using the InstallAnywhere launcher, you can edit the "geworkbench.lax" file. This does not apply to the Mac OS X version. Note, however, that the values found in the geworkbench.lax files are already set to the maximum values that work with the InstallAnywhere launchers, depending on whether the 32-bit or 64-bit version is being used.

lax.nl.java.option.java.heap.size.max=950MB (32-bit versions)

lax.nl.java.option.java.heap.size.max=2147483647 (64-bit versions; can also be entered as "2G")
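To compare the two settings above, the following sketch converts heap-size strings to bytes. It assumes 1024-based K/M/G suffixes, as used by the JVM's -Xmx option; whether the InstallAnywhere launcher interprets suffixes in exactly the same way is an assumption.

```python
def heap_size_to_bytes(value: str) -> int:
    """Convert a heap-size string such as "950MB", "2G" or "2147483647" to bytes."""
    text = value.strip().upper()
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    if text.endswith("B"):
        # Accept "950MB" as well as "950M".
        text = text[:-1]
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    # No suffix: the value is already a byte count.
    return int(text)
```

Under these assumptions, 2147483647 is 2^31 - 1 bytes, one byte short of the 2^31 bytes that "2G" denotes.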

2. If you are running geWorkbench from the Generic or source distributions using Ant, you can edit the build.xml file found in the geWorkbench root directory to alter the memory requested using the jvmarg value="-Xmx1024M" option:

<target name="run" depends="init" description="Runs geWorkbench.">
    <java fork="true" classname="org.geworkbench.engine.config.UILauncher">
       <jvmarg value="-Xmx1024M"/>
       <jvmarg value="-Djava.library.path=lib"/>
       <arg value="all_release.xml"/>
       <classpath refid="run.classpath"/>
    </java>
</target>

Here it is shown requesting 1 GB.
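If you prefer to script this change rather than edit build.xml by hand, a minimal sketch using Python's standard XML library follows. The function name is illustrative only, and the sample is the "run" target shown above rather than a complete build.xml. Note that ElementTree does not preserve XML comments or the original formatting, so keep a backup of the file.

```python
import xml.etree.ElementTree as ET


def set_max_heap(build_xml: str, new_size: str) -> str:
    """Return the XML text with every -Xmx jvmarg set to the new size, e.g. "2048M"."""
    root = ET.fromstring(build_xml)
    changed = 0
    for jvmarg in root.iter("jvmarg"):
        if jvmarg.get("value", "").startswith("-Xmx"):
            jvmarg.set("value", "-Xmx" + new_size)
            changed += 1
    if changed == 0:
        raise ValueError("no -Xmx jvmarg found")
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<target name="run" depends="init" description="Runs geWorkbench.">
    <java fork="true" classname="org.geworkbench.engine.config.UILauncher">
        <jvmarg value="-Xmx1024M"/>
        <jvmarg value="-Djava.library.path=lib"/>
        <arg value="all_release.xml"/>
        <classpath refid="run.classpath"/>
    </java>
</target>"""
updated = set_max_heap(SAMPLE, "2048M")
```

Only the jvmarg whose value starts with -Xmx is touched; the java.library.path argument is left unchanged.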

Q. Is there a shortcuts menu for available commands?

A. Yes. On a PC, press F12.


Q. geWorkbench appears to be frozen.

A. Sometimes a "modal" dialog box can appear but by chance get hidden behind other windows (for example, if the user clicks on some other window without noticing the dialog box). The hidden dialog box is waiting for user interaction, and geWorkbench is waiting for the dialog to be dismissed, so the application is unresponsive. This is a feature of how Java works. One way to see if this has happened is to minimize all open windows on your desktop, for all applications, and then maximize the geWorkbench window. If there is an open dialog box, it should now appear in front of the geWorkbench main window. (cf. Mantis entry 1959).


Q. How can I download the geWorkbench source?

A. The latest geWorkbench source code can be downloaded from GitHub. Please see the instructions in the Download the geWorkbench Source Code section of the Download and Installation page. Instructions for compiling the code are available on the same page, under Compiling the geWorkbench Source Code.

Q. How do I turn on logging in installer-based versions of geWorkbench?

A. Copies of geWorkbench installed using the InstallAnywhere-generated install file contain a configuration file called UILauncher.lax in their root directory. Make changes like the following to turn on logging to files:

#   LAX.STDERR.REDIRECT
#   -------------------
#   leave blank for no output, "console" to send to a console window,
#   and any path to a file to save to the file

lax.stderr.redirect=log/stderr.log

#   LAX.STDOUT.REDIRECT
#   -------------------
#   leave blank for no output, "console" to send to a console window,
#   and any path to a file to save to the file

lax.stdout.redirect=log/stdout.log
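The two settings above can also be applied with a small script. The following is a minimal sketch that rewrites key=value lines in the text of a .lax file and leaves comment lines alone; the function name is illustrative only, and it operates on text you have read from the file yourself.

```python
def set_lax_properties(lax_text: str, updates: dict) -> str:
    """Return the .lax text with the given properties set, adding any that are missing."""
    remaining = dict(updates)
    out = []
    for line in lax_text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_comment = line.lstrip().startswith("#")
        if not is_comment and "=" in line and key in remaining:
            # Replace the existing value for this property.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Append any property that was not already present in the file.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


ORIGINAL = "#   LAX.STDERR.REDIRECT\nlax.stderr.redirect=\n#   LAX.STDOUT.REDIRECT\nlax.stdout.redirect=\n"
LOGGING = {
    "lax.stderr.redirect": "log/stderr.log",
    "lax.stdout.redirect": "log/stdout.log",
}
result = set_lax_properties(ORIGINAL, LOGGING)
```

Keep a backup copy of the original file before overwriting it with the result.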


Q. How can I reference geWorkbench in a publication?

A1: geWorkbench is described in the following publication:

http://www.ncbi.nlm.nih.gov/pubmed/20511363

Floratos A, Smith K, Ji Z, Watkinson J, Califano A. (2010). geWorkbench: an open source platform for integrative genomics. Bioinformatics 26(14):1779-80. Epub 2010 May 28.


A2: Further support statement:

"Analysis was carried out using geWorkbench (http://www.geworkbench.org), a free open source genomic analysis platform developed at Columbia University with funding from the NIH Roadmap Initiative (1U54CA121852-01A1) and the National Cancer Institute".


Q. What differences are there in how genes without symbols are handled in GO Terms analysis vs a CNKB query?

A. In the GO Terms analysis component, only genes with a gene symbol are used in the analysis, and only genes with a gene symbol will appear in the list of genes associated with a selected term. By contrast, in the CNKB component, markers which match the particular id (e.g. Entrez ID) used in the query are returned even if they do not have an associated gene symbol. (Mantis issue #2478).
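The difference can be illustrated with a toy example. This is only a sketch of the filtering behavior described above, not the actual geWorkbench code; the `Marker` type, the function names, and all the data are hypothetical.

```python
from typing import List, NamedTuple, Optional, Set


class Marker(NamedTuple):
    marker_id: str
    entrez_id: Optional[str]
    symbol: Optional[str]


def go_terms_input(markers: List[Marker]) -> List[Marker]:
    """GO Terms-style filtering: keep only markers that have a gene symbol."""
    return [m for m in markers if m.symbol]


def cnkb_matches(markers: List[Marker], query_ids: Set[str]) -> List[Marker]:
    """CNKB-style matching: keep markers whose ID matches, with or without a symbol."""
    return [m for m in markers if m.entrez_id in query_ids]


# Hypothetical markers; the second one has an Entrez ID but no gene symbol.
markers = [
    Marker("probe_1", "1001", "GENE1"),
    Marker("probe_2", "1002", None),
]
go_ids = [m.marker_id for m in go_terms_input(markers)]
cnkb_ids = [m.marker_id for m in cnkb_matches(markers, {"1001", "1002"})]
```

Here the marker without a symbol is dropped by the GO Terms-style filter but is still returned by the CNKB-style query.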