Frequently Asked Questions


General

  1. What is the latest version of GenePattern?
  2. Where can I find the hardware and software prerequisites for GenePattern?
  3. Where can I find the GenePattern release notes?
  4. How can I get help with GenePattern or provide feedback?
  5. How do I cite GenePattern?
  6. If I am a member of the press, how can I get more information?
  7. How can I find out about upcoming workshops or new module releases?
  8. Is there a version of GenePattern that can run in the cloud?
  9. Is the source code available?
  10. Why is GenePattern slow on my Mac OSX 10.9 Mavericks machine?

Public Server

  1. Are there other public GenePattern servers?
  2. What can I do if my job keeps running out of memory or fails with no error on the public server?
  3. Why is my GenePattern job stuck in the PENDING state?
  4. Why is my module taking so long to run?
  5. Why did my GenePattern job fail?
  6. How do I run several files through a set of modules in parallel?
  7. How can I easily run the same analysis on many different data files?

Installation

  1. How do I install (uninstall) GenePattern?
  2. How do I upgrade to the latest version of GenePattern without losing my data (such as jobs, uploaded files and modules)?
  3. How can I work around a LaunchAnywhere error?
  4. Does GenePattern support the international settings on my computer?
  5. I am behind a web proxy/firewall and my GenePattern server says it cannot connect to the module repository to load the modules. What do I do?
  6. I want to install GenePattern into our corporate/departmental/other Web server and not have GenePattern run in its own Web server. How do I install it?
  7. When should I choose to install the GenePattern server on a different port than the default 8080?
  8. How do I install GenePattern on a 64-bit Windows machine if I want to use a version before 3.2.2?
  9. Why can't I connect to my GenePattern server on Windows 7 or Vista?
  10. Why doesn't clicking StartGenePatternServer launch GenePattern on Linux?
  11. Where are the dock icons for my GenePattern server on my Mountain Lion (OS X 10.8) machine?
  12. Why can't I install a licensed module on my GenePattern server?
  13. Why can't I install GenePattern on my Mac OSX 10.9 Mavericks machine?
  14. I just installed Java 8. Why does my new GenePattern install say I need Java 7+ ?

Configuration

  1. How do I increase the memory allocated to the GenePattern server or client?
  2. How do I increase the memory allocated to a module?
  3. Can I run more than one instance of the GenePattern server on a machine?
  4. How can I set up a GenePattern server for others to use remotely?
  5. Why do my remote users get malformed URL errors when accessing visualizers? 
  6. How do I configure GenePattern to work with a queuing system (or grid engine)?
  7. How do I configure the GenePattern server on a machine with multiple IP addresses? Can I keep the GenePattern URL from changing when the server hostname changes?
  8. How do I modify the GenePattern session timeout interval?
  9. How do I modify how often result files are deleted from my GenePattern server?
  10. I'm getting 'error "connection refused"': what is the problem?

R/Java/Perl

  1. I already have R/Perl/Java on my machine. Will the versions of R/Perl/Java that GenePattern installs interfere with these?
  2. Can I configure GenePattern to work with versions of R/Perl/Java other than those installed by GenePattern?
  3. Can my module use a different version of R than GenePattern?
  4. How can I get R to install correctly on my Mac?

File Formats/Uploading Files/Zip Files

  1. Does GenePattern support cDNA and other 2-channel microarray data?
  2. Where can I find information about file formats used by GenePattern?
  3. How can I convert between RES, GCT, and ODF formats?
  4. Is there an easier way to create a CLS file than creating it by hand in a text editor?
  5. How do I convert a file to GenePattern format?
  6. How can I use CEL, MAGE-ML, and MAGE-TAB files in GenePattern?
  7. How do I zip my files for use in GenePattern?
  8. Why is nothing happening when I try to upload my large file?
  9. Why did the module I tried fail to run with my ZIP file as input?
  10. How do I format my GenePattern output for submission to GEO?
  11. What does "Error in subfiles: subscript out of bounds" mean?
  12. What is a CSV file?
  13. I keep getting file errors when I run a module. What are common reasons for file errors that I can check?

Modules, Pipelines, and Suites

  1. How do I upgrade to the latest version of GenePattern without losing my modules/pipelines/suites?
  2. Why does the "no such module" error occur for a module on the server?
  3. I have installed a module/pipeline/suite, but I do not see it. What's wrong?
  4. My pipeline requires an input file, but displays a file-not-found error when I enter a file name. What's wrong?
  5. Can I use a file path as input for a GenePattern module?
  6. Why does the Specify File Path or URL option not work in Internet Explorer?
  7. Why can't I use a directory as input for all modules?
  8. What is the current list of deprecated GenePattern modules?

Annotation Modules

  1. What versions of genomic databases is GeneCruiser currently using?

Clustering Modules

  1. Why am I seeing a "Process existed with status code: 138" error when I try to run ConsensusClustering?

Gene List Selection Modules

  1. Why do the scores from ComparativeMarkerSelection and ClassNeighbors differ?
  2. I have used ComparativeMarkerSelection to construct gene lists representing different experimental conditions. Is there a GenePattern module that can determine whether upstream non-coding motifs are overrepresented in those gene lists?
  3. What does a missing value error for ComparativeMarkerSelection mean?
  4. Why do I receive an error when running my preprocessed GCT file and CLS file in ComparativeMarkerSelection?
  5. Why am I getting the "none of the gene sets passed the size thresholds" error in GSEA?

IGV Modules

  1. How can I pre-process my RNA-seq data for IGV?
  2. How can I properly view my GCT or RES file in IGV?

Preprocess Modules

  1. Why does ExpressionFileCreator fail?
  2. Does ExpressionFileCreator support Exon arrays?
  3. What does "Could not obtain CDF environment" mean?
  4. Why did I get a warning stating that my index is older than my BAM file?
  5. Can I process raw Illumina BeadChip data in GenePattern?

Projection Modules

  1. How can I run non-negative matrix factorization (NMF) on data that contains negative values, such as log-ratio or unthresholded Affymetrix data?

RNA-seq Modules

  1. How can I use the RNA-seq modules available in GenePattern?
  2. I am running a large number of RNA sequencing jobs, and I'd like to be able to look at the quality of the data. Is there a tool I could use for this?
  3. How do I find reference genomes to use in TopHat, Bowtie, or BWA?
  4. How do I find reference genome annotation files or whole genome files to use with the GenePattern RNA-seq tools?

SNP Analysis Modules

  1. How do I resolve GISTIC errors?
  2. What does the GISTIC MATLAB error "Matrix dimensions must agree." mean?
  3. I get "??? Attempted to access rl(:,2); index out of bounds because size(rl)=[0,1]" in my stderr file when running GISTIC, what does this mean?
  4. I get "??? Index exceeds matrix dimensions." in my stderr file when running GISTIC, what does this mean?
  5. Does GISTIC support SNP 6.0 data?
  6. Does GISTIC support Agilent data?
  7. What does the GISTIC error, "Invalid file identifier" mean?
  8. How can I install GISTIC_2.0 on my own GenePattern server?
  9. Can you send me the source code for GISTIC?
  10. I got this error when running GISTIC_2.0 "All input data were removed after NaN processing", what does it mean?
  11. How can I create a GISTIC markers file for my segmented data file?
  12. Is there a markers file I can use with the TCGA level 3 data?

Visualization and Image Creator Modules

  1. How do I get a heat map with a high enough resolution for publication?
  2. How can I export a Heat Map image with gene annotations?
  3. When I do a Hierarchical Clustering analysis, two files are produced, but the Hierarchical Cluster Viewer (JavaTreeView) looks like it needs three files. Do I need another one?
  4. Why can't I run the GenePattern visualizers?

Module Creation/Programming Language Environments

  1. How can I retrieve external database information from GenePattern?
  2. My MATLAB figures are not appearing in the MATLAB visualizer I created. Why?
  3. How can I share a GenePattern module?
  4. How can I make my GenePattern module available on the GenePattern public server?
  5. Why can't I specify a 64-bit platform for a module?
  6. Where can I find out more about how to launch GenePattern modules from other programming languages?
  7. Can I use the GenePattern APIs to create a web service that programmatically accesses the GenePattern server?
  8. Why can't I call my pipeline/module from MATLAB?

Other

If you haven't found what you are looking for, please send an email to gp-help(at)broadinstitute.org.


What is the latest version of GenePattern?

For information about the latest version of GenePattern and its components, see the Release Notes.

Where can I find the hardware and software prerequisites for GenePattern?

The Release Notes list hardware requirements, supported operating systems, and supported browsers.

Where can I find the GenePattern release notes?

See the Release Notes for the latest release.

How can I get help with GenePattern or provide feedback?

In addition to this FAQ, the GenePattern team provides the following online resources:

  • Video tutorials
  • Concepts provides a brief introduction to GenePattern and its primary objects: modules, pipelines, suites.
  • Quick Start provides a 10-minute tour of GenePattern.
  • The Tutorial provides an extended 40-minute hands-on introduction to expression analysis in GenePattern.
  • The User Guide describes how to run analyses, create pipelines and generally work with GenePattern.
  • The Administrators Guide describes how to configure a local or networked server. If you are using the GenePattern public server, you do not need this information.
  • The Programmers Guide provides guidelines for writing modules and instructions for accessing GenePattern from the Java, MATLAB, and R programming environments.
  • The Modules page lists the modules and pipelines available from the Broad Institute, with links to their documentation.
  • The File Formats Guide describes all file formats and provides instructions for creating input files.
  • The Release Notes describe new features in the current release.

To provide feedback or ask a question not addressed by the online resources, send email to gp-help(at)broadinstitute.org.

How do I cite GenePattern?

To cite GenePattern, please use the following citation:
Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP. GenePattern 2.0. Nature Genetics. 2006;38(5):500-501. doi:10.1038/ng0506-500.

To cite a GenePattern analysis or visualization module, cite the GenePattern software and the original paper or other source for the module as specified in the module documentation. Documentation for each module is available on the Modules page and in GenePattern (click Help when prompted to enter the module's parameters).

If I am a member of the press, how can I get more information?

If you are a member of the press and need additional information about GenePattern, please contact our Public Relations department.

What can I do if my job keeps running out of memory or fails with no error on the public server?

Jobs with large datasets, or with parameter settings that increase the computational load, can fail because they run out of memory. Usually you will get an error message stating that the module ran out of memory, but sometimes the failure is "silent": you see that your job failed, but there are no error files. In either case, GenePattern administrators can track down and resolve the problem. They can assign memory settings on a per-user basis, allowing your jobs to run with the amount of memory your analysis requires.

To have an administrator look into your errors and adjust the memory settings for your jobs, please contact us at gp-help(at)broadinstitute.org, making sure to provide your username and job ID.

Are there other public GenePattern servers?

The public server maintained by the GenePattern development team can be found at http://genepattern.broadinstitute.org. Other organizations also maintain GenePattern servers that they have made publicly available. These organizations include:

  • SMD: Stanford Microarray Database
    The SMD provides an extensive microarray database and has integrated with GenePattern to provide tools for analyzing the data. More information about SMD can be found at http://smd.stanford.edu. The paper that describes the integration of GenePattern into their environment is Hubble et al. (2009), listed under References. Please send questions and comments regarding SMD to array(at)genome.stanford.edu.
  • NuGO: NBX
    NuGO has developed a Black Box environment that uses GenePattern as its preferred analysis tool and has deployed NuGO-modified versions of some GenePattern modules on the GenePattern servers installed on the Black Boxes. For more information about the NuGO NBX, please visit http://www.nugo.org/NBX/. The paper that describes the modules NuGO has contributed to GenePattern is De Groot et al. (2008), listed under References. Please direct comments or questions regarding the NuGO NBX and the NuGO GenePattern modules to sian.astley(at)bbsrc.ac.uk.
  • Garvan Institute
    The Peter Wills Bioinformatics Centre at the Garvan Institute of Medical Research in Sydney, Australia, has set up a public GenePattern server.

References

  • Hubble J, Demeter J, Jin H, Mao M, Nitzberg M, Reddy TB, Wymore F, Zachariah ZK, Sherlock G, Ball CA. Implementation of GenePattern within the Stanford Microarray Database. Nucleic Acids Res. 2009;37(Database issue):D898-901. PMCID: 2686537.
  • De Groot PJ, Reiff C, Mayer C, Muller M. NuGO contributions to GenePattern. Genes Nutr. 2008;3:143-6. PMCID: 2593018.

How can I find out about upcoming workshops or new module releases?

You can follow GenePattern on Twitter, Facebook, or our RSS feed. Check our Twitter feed in the news items at genepattern.org, join our mailing list, or add yourself to the Workshops notification list.

Is there a version of GenePattern that can run in the cloud?

Yes. The GenePattern team has released a GenePattern Amazon Machine Image (AMI), which is available in the us-east-1 region.

How do I install (uninstall) GenePattern?

To install GenePattern, go to GenePattern Download and follow the instructions for your operating system.

To uninstall GenePattern, use the utility provided as part of the GenePattern installation. If the GenePattern uninstall utility is unavailable, deleting the GenePattern installation folder removes all GenePattern files other than the desktop icons.

Mac users: If R2.5 is not already installed, GenePattern installs it in the /Library/Frameworks/R.framework/Versions/2.5 folder. Uninstalling GenePattern does not uninstall R. To uninstall R, use the utility provided by R.

How do I upgrade to the latest version of GenePattern without losing my data (such as jobs, uploaded files and modules)?

Simply install the new version of GenePattern into the same directory as your previous version. Do not uninstall first: uninstalling is unnecessary and deletes your existing modules, pipelines and suites. When you overwrite the previous version:

  • Existing modules, pipelines and suites are preserved.
  • The following settings are read from your existing genepattern.properties file and displayed as default values for your new installation: settings for R, Java, Perl, LSID Authority, proxy settings, HSQL database URL and port, and file purge frequency and time.
  • The default value for the webserver (Tomcat) port used by the GenePattern server is always 8080. If your existing installation uses a different port number, you can specify that port number during the installation.
  • The require.password setting from your existing genepattern.properties file is preserved in your new installation.
  • Backup copies of the following configuration files are created: genepattern.properties[.backup (before GenePattern 3.4) or .save (3.4 and up)], permissionMap.xml[.backup], and userGroups.xml[.backup]. To recreate your previous settings after installing GenePattern, compare the saved files with the newly installed files and modify the new files as necessary. Do not replace the newly installed configuration files with the saved copies.

User groups: The userGroups.xml file for GenePattern 3.2 omits the group named Public. In GenePattern 3.2, all users are now in a predefined group named Public. To avoid confusion, do not recreate the group named Public.

R versions: Installing GenePattern 3.1 (or later) installs R2.5 and sets the full path to R2.5. See Using Different Versions of R for information on how to create and/or use GenePattern modules written for other versions of R.

What is the recommended method of upgrading my GenePattern server to 3.3.3 or higher?

Large uploaded data files or output files will significantly slow down your GenePattern upgrade installation. If you have less than approximately 10 GB of data files either uploaded (via the Upload tab) or output by GenePattern jobs, you can just follow the GenePattern server installation instructions. However, if you have more than 10 GB of uploaded data files or output files, we suggest that you:

  1. Move your uploaded files to a non-default location, for instance:
    mv <GenePatternServer>/Tomcat/temp <GenePatternServer>_data/temp
  2. Edit the following property in both StartGenePatternServer.lax and genepattern.properties:
    java.io.tmpdir=<GenePatternServer>_data/temp

    Replace <GenePatternServer> with an actual path.

  3. Then follow the GenePattern server installation instructions.
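
To judge which side of the roughly 10 GB threshold an installation falls on, you can total the size of the files under a directory. The following is a minimal Python sketch, not part of GenePattern; the function names are hypothetical, and you would point it at your own upload or job results directory.

```python
import os

TEN_GB = 10 * 1024 ** 3  # the approximate threshold mentioned above


def directory_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def exceeds_threshold(root: str, threshold: int = TEN_GB) -> bool:
    """Return True if the files under root total more than threshold bytes."""
    return directory_size_bytes(root) > threshold
```

For example, calling exceeds_threshold on the directory that holds your uploaded files (<GenePatternServer>/Tomcat/temp in the steps above) indicates whether moving the files first is worthwhile.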

Is the source code available?

Yes. Source code for GenePattern and its modules is available under the GenePattern software license. Source code for the GenePattern Server application is on GitHub. Source code for most modules is available directly from GenePattern. From the job input form, open the gear menu to 'Export' the module or view 'Properties' to select source files individually. 

Why is GenePattern slow on my Mac OSX 10.9 Mavericks machine?

Please see this blog post for information regarding this issue.

I already have R/Perl/Java on my machine. Will the versions of R/Perl/Java that GenePattern installs interfere with these?

No. The R, Perl, and Java installations that come with GenePattern are installed within the GenePattern directory and do not affect any other versions that you may currently have.

Can I configure GenePattern to work with versions of R/Perl/Java other than those installed by GenePattern?

You can configure GenePattern to work with other versions of R/Perl/Java; however, the versions of R, Perl, and Java bundled with GenePattern are the ones that have been fully tested. We cannot guarantee that other versions will work.

Java VM: If you install a GenePattern server without the Java VM, choosing instead to use a Java VM that you have already installed, ensure that the file tools.jar (provided by Sun separately from the JRE and JDK) is on your classpath. When you install a GenePattern server with an included VM, the GenePattern installation does this for you. If this file is not on your classpath, the MatlabComponentRuntime (MCR) Installer fails when you attempt to install a module that requires it.

R versions: GenePattern modules can be written for any version of R. For details on how to specify which version to use, see Using Different Versions of R.

Does GenePattern support the international settings on my computer?

GenePattern supports the Basic Latin character set. Characters other than those in the Basic Latin character set may not be displayed correctly. Asian character sets are not currently supported.

All analysis and visualization modules support the decimal point (.) as the separator between the integral and fractional parts of a decimal number. Using a decimal comma (,) may cause unexpected behavior in some modules.
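
If your data files were exported with decimal commas, one option is to convert them before uploading. The sketch below is an illustration only and is not part of GenePattern. It assumes a tab-delimited file and treats a field as a number only when it consists of digits, a single comma, and more digits. A value that uses the comma as a thousands separator (for example 1,234) would be converted incorrectly, so check your data first.

```python
import re

# A field counts as a decimal-comma number only if it is nothing but
# an optional sign, digits, one comma, and more digits, e.g. "1,5".
_DECIMAL_COMMA = re.compile(r"^([+-]?\d+),(\d+)$")


def normalize_decimal_commas(line: str) -> str:
    """Rewrite decimal commas as decimal points in one tab-delimited line."""
    fields = line.rstrip("\n").split("\t")
    fixed = [_DECIMAL_COMMA.sub(r"\1.\2", field) for field in fields]
    return "\t".join(fixed)


sample = "gene_1\tdescription, with comma\t1,5\t-0,25\t3.0"
print(normalize_decimal_commas(sample))
```

Text fields that merely contain a comma, such as the description column in the sample, are left unchanged because the pattern must match the whole field.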

I am behind a web proxy/firewall and my GenePattern server says it cannot connect to the module repository to load the modules. What do I do?

If you did not indicate that you were behind a web proxy/firewall when you installed GenePattern, you must update the proxy settings for your server before you can install the modules:

  1. Start GenePattern (http://localhost:8080/gp is usually the URL).
  2. Click Administration>Server Settings to display the server settings.
  3. Click Proxy to display the proxy settings.
  4. On the Proxy Settings page, enter the hostname and port of your web proxy server. If you do not know them, contact your IT help desk to get the values. If you need to log into the proxy server, also enter your username and password. These are NOT saved to a file, so you must reenter them after a server restart before you can connect again.
  5. Click Save to update the proxy settings.
  6. Click Modules & Pipelines>Install from Repository to install the modules.

If you still cannot connect to the repository, email us at gp-help(at)broadinstitute.org.

I want to install GenePattern into our corporate/departmental/other Web server and not have GenePattern run in its own Web server. How do I install it?

You need to use the war file installation. Instructions are available here.

When should I choose to install the GenePattern server on a different port than the default 8080?

If you already have a server such as Tomcat running on this port, you need to install the GenePattern server on a different port to avoid conflicts.

How do I install GenePattern on a 64-bit Windows machine if I want to use a version before 3.2.2?

When GenePattern is installed on a 64-bit Windows system in the default C:\Program Files (x86) directory, modules fail because some code expects only "C:\Program Files" and truncates that location to "C:\Progra~1". There is a similar bug in ComparativeMarkerSelection. These errors are corrected in the 3.2.2 release of GenePattern. If you do not upgrade to release 3.2.2 or later, the workaround is to reinstall GenePattern in a directory that has no spaces in its name.

Why can't I connect to my GenePattern server on Windows 7 or Vista?

On Windows 7 and Vista, the StartGenePatternServer and StopGenePatternServer applications must be run as an administrator. To start or stop the GenePattern server, right-click on StartGenePatternServer.exe or StopGenePatternServer.exe and select Run as administrator.

To launch GenePattern in your browser, you can double-click the GenePatternHome.html icon located with the StartGenePatternServer and StopGenePatternServer icons.

How can I set up a GenePattern server for others to use remotely?

Two sections of the GenePattern User Guide explain how to do this.

Why do my remote users get malformed URL errors when accessing visualizers?

The downloadable installer assumes that GenePattern will run on an individual's PC or laptop and configures it accordingly. Remote client computers cannot reach the files that visualizers need because those files are served from URLs on the server's loopback network interface (127.0.0.1). For shared server use, one setting must be changed. In the directory where GenePattern is installed on your server, find the genepattern.properties file in the "resources" directory and open it in a text editor such as emacs or vi. Find the property named GenePatternURL and replace the 127.0.0.1 part with the network name of the server. Leave the ':8080/gp' part in place unless you have changed the Tomcat port configuration. Stop and restart the GenePattern server; the URLs should then point to the right location.
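
The edit itself is a one-line change. As an illustration, the following Python sketch applies it to the text of a properties file. The function name and the host name gp.example.org are hypothetical, and the sample line is only an assumption about what the property looks like in your file. Make a backup of genepattern.properties before changing it.

```python
def replace_loopback_host(properties_text: str, hostname: str) -> str:
    """Replace 127.0.0.1 with hostname, but only on the GenePatternURL line."""
    out = []
    for line in properties_text.splitlines():
        if line.startswith("GenePatternURL="):
            line = line.replace("127.0.0.1", hostname)
        out.append(line)
    return "\n".join(out) + "\n"


# Assumed sample content; the ':8080/gp' part is left untouched.
before = "GenePatternURL=http\\://127.0.0.1\\:8080/gp/\nsome.other.setting=127.0.0.1\n"
print(replace_loopback_host(before, "gp.example.org"))
```

Only the GenePatternURL line is rewritten; other properties that happen to mention 127.0.0.1 are left as they are.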

Why doesn't clicking StartGenePatternServer launch GenePattern on Linux?

On Linux machines, the StartGenePatternServer application only starts the server (on Mac it starts the server and launches a web browser). To access the web client interface for your GenePattern server, click the GenePatternHome.html shortcut icon. If you did not install icons in your task bar or on your desktop, open GenePatternHome.html from the top level of your GenePattern install directory.

How can I get R to install correctly on my Mac?

Some Mac users have found that the R library is not installing correctly when they try to install GenePattern. Even after making sure that the folder into which GenePattern is installing R has write permissions, upon running a module, they receive the following error message:

java.io.IOException: Cannot run program "/Library/Frameworks/R.framework/Versions/2.5/Resources/bin/R": error=2, 
No such file or directory while running R command [/Library/Frameworks/R.framework/Versions/2.5/Resources/bin/R, 
--no--save, --quiet, --slave, --no-restore]

This may be a simple GenePattern server configuration problem. First, check that something is installed at that path. Open the Terminal.app and run the following commands:

ls /Library/Frameworks/R.framework/Versions/2.5

ls /Library/Frameworks/R.framework/Versions

If there is something installed at this path, check that the path to R is correctly configured in your GenePattern server. In GenePattern, go to Administration>Server Settings>Programming Languages and verify that:

R 2.5 Home: /Library/Frameworks/R.framework/Versions/2.5/Resources

If this is configured correctly, you may be able to correct the problem by manually downloading and installing R 2.5.

If you have further issues installing R, contact us at gp-help(at)broadinstitute.org.

How do I increase the memory allocated to the GenePattern server or client?

See Increasing Memory Allocation.

How do I increase the memory allocated to a module?

See Increasing Memory Allocation.

Can I run more than one instance of the GenePattern server on a machine?

Yes. If you run more than one installation of GenePattern on the same machine, the port numbers for the GenePattern server and the HSQL server must be unique to each installation. By default, the Tomcat server listens on two ports, 8080 (requests) and 8005 (shutdown), and the HSQL server listens on port 9001. All three ports must be changed for the second installation. For example, you can set the GenePattern server ports to 8080 and 8005 on one installation and 8081 and 8086 on the other, and set the HSQL port to 9001 on one and 9002 on the other. You can configure these port numbers when you install the server.
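
Before choosing ports for a second installation, it can help to see which of the default ports are already taken. The following Python sketch is an illustration and not a GenePattern tool. It simply tries to open a TCP connection to each port on the local machine, and the function names are hypothetical.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def conflicting_ports(ports):
    """Return the subset of ports that already have a listener."""
    return [port for port in ports if port_in_use(port)]


# Default Tomcat request and shutdown ports, and the default HSQL port.
print(conflicting_ports([8080, 8005, 9001]))
```

Any port that appears in the printed list is already in use, so pick a different value for it when installing the second server.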

How do I configure the GenePattern server on a machine with multiple IP addresses? Can I keep the GenePattern URL from changing when the server hostname changes?

Choose one hostname for the GenePattern server; for example, http://servername.domainname.edu:8080/gp/. Edit the genepattern.properties file and set the following properties:

  • GenePatternURL=http\://servername.domainname.edu:8080/gp/
  • GENEPATTERN_PORT=8080
  • gpServerHostAddress=servername.domainname.edu
  • fqHostName=servername.domainname.edu
  • fullyQualifiedHostName=servername.domainname.edu
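
Because the same hostname appears in several properties, a mismatch is easy to introduce. The Python sketch below is a hypothetical helper, not part of GenePattern. It parses simple key=value lines, including the escaped colon used above, and reports any of the five properties that disagree with GenePatternURL.

```python
from urllib.parse import urlparse

HOST_KEYS = ("gpServerHostAddress", "fqHostName", "fullyQualifiedHostName")


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines; comments and blank lines are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Properties files may escape ':' as '\:'; undo that here.
        props[key.strip()] = value.strip().replace("\\:", ":")
    return props


def hostname_problems(text: str) -> list:
    """List the settings that do not agree with GenePatternURL."""
    props = parse_properties(text)
    url = urlparse(props.get("GenePatternURL", ""))
    problems = []
    for key in HOST_KEYS:
        value = props.get(key)
        # urlparse lower-cases the hostname, so compare case-insensitively.
        if value is None or value.lower() != url.hostname:
            problems.append(f"{key}={value!r} does not match URL host {url.hostname!r}")
    if props.get("GENEPATTERN_PORT") != str(url.port):
        problems.append(f"GENEPATTERN_PORT does not match URL port {url.port!r}")
    return problems


EXAMPLE = "\n".join([
    r"GenePatternURL=http\://servername.domainname.edu:8080/gp/",
    "GENEPATTERN_PORT=8080",
    "gpServerHostAddress=servername.domainname.edu",
    "fqHostName=servername.domainname.edu",
    "fullyQualifiedHostName=servername.domainname.edu",
])
print(hostname_problems(EXAMPLE))  # prints [] because all five settings agree
```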

How do I modify the GenePattern session timeout interval?

Session timeout is set in the Tomcat configuration file of the GenePattern server. To modify this setting for a local GenePattern server:

  1. Edit the Tomcat configuration file in the GenePattern installation directory: GenePatternServer/Tomcat/conf/web.xml.
  2. Modify the session-timeout property. Enter a value in minutes, where 0 disables session timeout. For example, to set the timeout to one day:
    <session-config>
       <session-timeout>1440</session-timeout>
    </session-config>
  3. Save the Tomcat configuration file.
  4. Restart the GenePattern server.

On the public GenePattern server, session timeout is set to four hours and cannot be modified by a user.
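
After editing web.xml on a local server, you may want to confirm the value that is actually set. The following Python sketch is an illustration only. It reads the session-timeout element from the text of a web.xml file, whether or not the file declares an XML namespace. The function name and the namespace in the sample are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_session_timeout(web_xml_text: str) -> Optional[int]:
    """Return the session timeout in minutes, or None if it is not set."""
    root = ET.fromstring(web_xml_text)
    for element in root.iter():
        # Strip any "{namespace}" prefix so this works with or without xmlns.
        if element.tag.rsplit("}", 1)[-1] == "session-timeout":
            text = (element.text or "").strip()
            return int(text) if text else None
    return None


# Assumed sample; a real web.xml contains many more elements.
SAMPLE = """<web-app xmlns="http://java.sun.com/xml/ns/javaee">
  <session-config>
    <session-timeout>1440</session-timeout>
  </session-config>
</web-app>"""
print(read_session_timeout(SAMPLE))  # prints 1440
```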

How do I configure GenePattern to work with a queuing system (or grid engine)?

Queuing systems such as the Load Sharing Facility (LSF) and the Sun Grid Engine (SGE) allow computational resources to be used effectively. If you have such a queuing system installed at your site and you have installed a local GenePattern server, you can configure the GenePattern server to work with the queuing system. For instructions on how to do so, see Using a Queuing System.

How do I modify how often result files are deleted from my GenePattern server?

From your GenePattern server, go to Administration>Server Settings and click File Purge in the menu at the left. From here you can specify when analysis result files are deleted from the server:

  • Use Purge Jobs After to specify the number of days the server keeps the analysis result files. To prevent the server from automatically deleting the files, set this value to -1.
  • Use Purge Time to specify what time of day (24-hour format) the server deletes the files.

Click Save to save your changes. Click Restore to return to the values set at the installation.

Note: This setting can only be modified on local GenePattern servers for which you have administrative rights. You cannot change this setting on the Public Server.

I'm getting 'error "connection refused"': what is the problem?

A refused connection is most likely due to a proxy issue. If you are behind a proxy or firewall, verify that you have correctly configured GenePattern and, if necessary, ask your local system administrator to allow GenePattern access to your machine.

To configure a proxy connection in GenePattern please do the following:

  1. In the GenePattern Web Client, click Administration>Server Settings to display the server settings.
  2. Click Proxy to display the proxy settings.
  3. On the Proxy Settings page, enter the hostname and port of your web proxy server. If you do not know them, contact your IT help desk to get the values. If you need to log into the proxy server, also enter your username and password. These are NOT saved to a file, so you must reenter them after a server restart before you can connect again.
  4. Click Save to update the proxy settings.

If this does not resolve the issue, please contact us at gp-help(at)broadinstitute.org.

Can I use a file path as input for a GenePattern module?

If you install your own GenePattern server, the default setting is not to allow input file paths. To change this, if you have administrator privileges on the server, add or edit the following in your genepattern.properties file:

allow.input.file.paths=true

Then restart your server. This will allow users to input an arbitrary network file path (such as file:///server/directory/file.gct) as the value for an input file parameter. When input file paths are allowed, you can use the server.browse.file.system.root property to set a root directory where the GenePattern server begins browsing for the specified network file path.

Note: On the Broad public server, users cannot enter a file path (a file:// URL) as an input file for a module. This restriction helps secure the machine that runs the public server.
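
On your own server, "add or edit" means the property may or may not already be present in genepattern.properties. The Python sketch below illustrates one way to handle both cases on the text of the file. The function name is hypothetical and this is not a GenePattern utility. Commented-out lines are left alone.

```python
def set_property(properties_text: str, key: str, value: str) -> str:
    """Set key=value, replacing an existing entry or appending a new one."""
    lines = properties_text.splitlines()
    updated = False
    for i, line in enumerate(lines):
        stripped = line.lstrip()
        if stripped.startswith(("#", "!")) or "=" not in stripped:
            continue  # skip comments and lines that are not key=value
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            updated = True
    if not updated:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


existing = "allow.input.file.paths=false\nother.setting=1\n"
print(set_property(existing, "allow.input.file.paths", "true"))
```

As described above, restart the server after saving the file so that the new value takes effect.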

 Back to top

How can I work around a LaunchAnywhere error?

If you tried to install GenePattern on Ubuntu, you may have received an installation error: "An internal LaunchAnywhere application error has occurred and this application cannot proceed. (LAX)" with "java.lang.IllegalArgumentException: Malformed \uxxxx encoding." in the stack trace.

LaunchAnywhere can interfere with the prompt string formatter PS1. In order to work around this problem, you need to use the following command:

$ export PS1=">"
>sudo sh ./GPServer.bin

This workaround applies not only to installing GenePattern on Ubuntu but also to launching the GenePattern server. Run the same export command before the server startup command, like so:

$ export PS1=">"
>./StartGenePatternServer

 Back to top

What is the current list of deprecated GenePattern modules?

Updated Oct 14, 2021

Please direct any questions about these modules to our help forum.

ABSOLUTE
ABSOLUTE.review
ABSOLUTE.summarize
BlastNPipeline
BlastParser
BlastSubtraction
BlastSubtractionLoop
BlastXml
Bowtie.aligner
Bowtie.indexer
BWA.Unmapped
BWAPipeline
BWASubtraction
CaArray2.1.0ImportViewer
CaArray2.3.0ImportViewer
CaArray2ImportViewer
caArrayImportViewer
CatalogueReads
CBS
ConcatenateFiles
Cuffcompare
Cuffdiff
Cufflinks
CufflinksCuffmergePipeline
Cuffmerge
ExternalSort
ExtractFullQuery
ExtractPairsBam
ExtractUnmapped
ExtractUnmappedAdapterBlast
Fasta2FQone
Fastq2FQone
FilterLength
FQone2Fasta
FQone2Fastq
GENE-E
GeneCruiser
GetUnmappedReads
GISTICPreprocess
HAPSEG
HierarchicalClustering.MATLAB
HierarchicalClusteringImage.MATLAB
Lu.Getz.Miska.Nature.June.2005.clustering.ep.mRNA.pipeline
Lu.Getz.Miska.Nature.June.2005.clustering.ep.miRNA.pipeline
Lu.Getz.Miska.Nature.June.2005.clustering.ALL.pipeline
Lu.Getz.Miska.Nature.June.2005.clustering.miGCM218.pipeline
Lu.Getz.Miska.Nature.June.2005.PDT.miRNA.pipeline
Lu.Getz.Miska.Nature.June.2005.PDT.mRNA.pipeline
MAGeCK
MegaBlastPipeline
MultiplotPreprocess
ParallelCBS
ParsedBlastParser
PathSeq.BlastN
PathSeq.BlastX
PathSeq.BWA.aln
PathSeq.MegaBlast
PathSeqPrototype
PathseqReport
PNN
PNNXValidationOptimization
PostBlastN.All
PostBlastX.All
PostMegaBlast.Bacterial
PostMegaBlast.Ribosomal
PostSubtraction
PostSubtraction.Assembly
PostSubtraction.Contigs
PostSubtraction.Contigs.Scatter
PostSubtraction.Contigs.Step
PostSubtraction.Unmapped
PostSubtraction.Unmapped.Scatter
PostSubtraction.Unmapped.Step
PreSubtraction
QualFilter
RankNormalize
RemoveDuplicates
RepeatMaskerFormatChange
RepeatMaskerRead
SAMTools.FastaIndex
ScatterBlastSubtraction
ScatterBWASubtraction
Sit2Pairs
SraToFastQ
Subtraction
TopHat
Trinity_r2012.06.08
UniqueIdentifier

 Back to top

 

Why does the Specify File Path or URL option not work in Internet Explorer?

This is a known issue: when users click the Browse Server File System button, the Internet Explorer web browser window (instead of a pop-up window) becomes the file system browser.

If you want to continue using Internet Explorer, you can copy and paste or manually enter the server file path rather than clicking the Browse Server File System button. We recommend using another browser for full functionality.

 Back to top

Does GenePattern support cDNA and other 2-channel microarray data?

Yes. Most GenePattern analyses can run on 2-channel or ratio-based data as easily as on single channel or absolute value data. To run 2-channel data in GenePattern, do the following:

  • Convert your ratio-based data to a GenePattern GCT file. This tab-delimited text file format contains features (genes or probes), samples, and a computed ratio value for each feature in each sample.
  • GenePattern modules cannot analyze files with missing values. If your data has missing values, one way to address the issue is to use the ImputeMissingValues.KNN module to impute the missing values.

Your data is now in a GCT file that can be analyzed by most GenePattern modules. (If you want to use non-negative matrix factorization (NMF) and your data contains negative values, see the NMF note in the Modules & Pipelines section below.)

Ratio values for cDNA data can be computed using a variety of methods. How the ratios are computed determines whether it is possible to create a class (CLS) file for the cDNA ratio data. For example:

  • If ratios for all samples are computed against a common reference, as shown below, each sample can be assigned a distinct class and it is possible to create a class (CLS) file.

    normal sample (Cy3) / common reference (Cy5) = phenotype 1
    treated sample (Cy3) / common reference (Cy5) = phenotype 2

  • If ratios are computed by comparing conditions, as shown below, it may not be possible to create a CLS file.

    normal sample (Cy3) / treated sample (Cy5) = phenotype

If you cannot create a CLS file, you can analyze your data using modules that do not require class files (such as ConsensusClustering), but will not be able to use modules that require the CLS file (such as ComparativeMarkerSelection).

 Back to top

Where can I find information about file formats used by GenePattern?

Information on file formats supported by the modules currently in GenePattern is available in File Formats.

 Back to top

How can I convert between RES, GCT, and ODF formats?

Run your file through PreprocessDataset. Select the desired output format for your file. If you only want to convert the file type without filtering, select "no filter" as the choice for the "filter flag" parameter.

 Back to top

How do I convert a file to GenePattern format?

File Formats describes the file formats used in GenePattern and, where applicable, suggests methods for converting files to these formats.

How can I use CEL, MAGE-ML, and MAGE-TAB files in GenePattern?

The ExpressionFileCreator module converts a set of individual CEL files into an expression data set that is usable by GenePattern modules. The MAGEMLImportViewer module imports data in MAGE-ML format into GenePattern, and similarly, the MAGETABImportViewer module imports data in MAGE-TAB format into GenePattern.

 Back to top

I have installed a module/pipeline/suite, but I do not see it. What's wrong?

This generally occurs for one of two reasons:

  • If the same zip file is installed twice, by two users, the second installation overwrites the first. Although the contents are identical (including the LSID), the ownership and privacy settings can change. If the second user installs the module as private, it may be hidden from the original installer.
  • The same suite cannot be installed as a "private" suite for more than one user. If you install a private suite and do not see it, it may already be installed as a private suite by another user.

 Back to top

My pipeline requires an input file, but displays a file-not-found error when I enter a file name. What's wrong?

Pipeline input files with spaces in their names may give file-not-found errors. If this happens, use DOS' "dir /x" command to get the 8.3 version of the directory and filename and use that instead of the long filename. If you are using a Unix-based platform, you may need to quote the filename parameters on the command line definition.

 Back to top

How can I run non-negative matrix factorization (NMF) on data that contains negative values, such as log-ratio or unthresholded Affymetrix data?

To run NMF on data that contains negative values, you must do the following (using the method of Kim, P. M. & Tidor, B. (2003) Genome Res. 13, 1706-1718):

  • Create one dataset with all negative numbers zeroed
  • Create another dataset with all positive numbers zeroed and the signs of all negative numbers removed
  • Merge the two (e.g., by concatenation), resulting in a dataset twice as large as the original that contains only positive values and zeros, and is therefore appropriate for NMF.

To do this in MATLAB, you can execute the following: anew=[max(a,0);-min(a,0)]; where a is the original data.
We are currently developing a GenePattern module to perform this operation as well.
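For readers without MATLAB, the same transformation can be sketched in plain Python. This mirrors the MATLAB expression above by stacking the positive part of the matrix on top of the negated negative part; the function name split_signs is hypothetical, and the matrix is assumed to be a list of rows of numbers.

```python
def split_signs(matrix):
    """Stack the positive part of a matrix on top of its negated negative part."""
    # Negative entries become zero in the first half.
    positive = [[x if x > 0 else 0.0 for x in row] for row in matrix]
    # Positive entries become zero in the second half; negatives lose their sign.
    negative = [[-x if x < 0 else 0.0 for x in row] for row in matrix]
    return positive + negative


a = [[1.0, -2.0],
     [-3.0, 4.0]]
anew = split_signs(a)
# anew has twice as many rows as a, and no negative values.
```

Note that the result has twice as many rows (features) as the original, so row labels need to be duplicated or suffixed accordingly before writing a GCT file.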

 Back to top

When I do a Hierarchical Clustering analysis, two files are produced, but the Hierarchical Cluster Viewer (JavaTreeView) looks like it needs three files. Do I need another one?

No, you can use the two files that are created and leave the remaining input box blank. HierarchicalClustering creates a cdt file and one or two additional files: an atr file if you clustered by samples (columns), a gtr file if you clustered by genes (rows), or both atr and gtr files if you clustered by both samples and genes (columns and rows). The JavaTreeView module accepts the two or three files created by HierarchicalClustering.

 Back to top

How can I export a Heat Map image with gene annotations?

The HeatMapViewer module currently does not include gene annotations with the saved image. Use the HeatMapImage module to include gene annotations.

 Back to top

Why do the scores from ComparativeMarkerSelection and ClassNeighbors differ?

When computing the t-test or signal to noise ratio, ClassNeighbors thresholds the standard deviation to ensure that it is at least twenty percent of the mean. Additionally, if the standard deviation is zero, ClassNeighbors sets it to 0.1.
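The adjustment described above can be sketched as follows. This is an illustration, not the ClassNeighbors source code. It makes two assumptions that the text does not confirm: the twenty percent floor uses the absolute value of the mean, and the zero check is applied after the floor. The signal-to-noise formula shown is the standard (mean_a - mean_b) / (std_a + std_b) definition.

```python
def adjusted_std(mean: float, std: float) -> float:
    """Apply the two standard deviation adjustments described in the text."""
    # Assumption: the floor is 20% of the absolute value of the mean.
    std = max(std, 0.2 * abs(mean))
    # Assumption: the zero check runs after the floor, so it only
    # triggers when both the standard deviation and the mean are zero.
    if std == 0:
        std = 0.1
    return std


def signal_to_noise(mean_a, std_a, mean_b, std_b):
    """Signal-to-noise ratio using the adjusted standard deviations."""
    return (mean_a - mean_b) / (adjusted_std(mean_a, std_a) + adjusted_std(mean_b, std_b))
```

Because the adjusted standard deviations are never smaller than the raw ones, scores computed this way are generally no larger in magnitude than the unadjusted scores, which is one reason the two modules can disagree.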

 Back to top

I have used ComparativeMarkerSelection to construct gene lists representing different experimental conditions. Is there a GenePattern module that can determine if there are upstream non-coding motifs overrepresented in those gene lists?

Yes. You can use the GSEA module with the c3 (motif) gene sets. The GSEA module is documented on the Modules page.

 Back to top

How do I resolve GISTIC errors?

Most errors reported by users running the GISTIC module are caused by a mismatch between the segmentation and markers files. If an error occurs, verify that all markers indicated in the segmentation file appear in the markers file and only those markers indicated by the segmentation file appear in the markers file.

The CBS and GLAD segmentation methods produce GISTIC-friendly marker positions. Partek's latest beta version also produces GISTIC-friendly marker positions. However, if you used an earlier version of the Partek algorithm to create the segmentation file, the algorithm did not report the exact physical position of the first and last markers of the segments. If you run GISTIC on a segmentation file generated using the earlier version of the algorithm, the physical positions of the marker file will not agree with the start or stop positions of the segmentation file. Note that Partek also uses the control probes in the generation of the CN/segmentation.

 Back to top

What does the GISTIC MATLAB error "Matrix dimensions must agree." mean?

??? Error using ==> plus
Matrix dimensions must agree.
Error in ==> make_D_from_seg at 158
Error in ==> run_gistic_from_seg at 58
Error in ==> gp_gistic_from_seg at 177
MATLAB:dimagree

If you are running GISTIC and get the error above in your stderr.txt file, you should verify that your segmentation file and markers file are exactly matched. Only the markers from the markers file should be indicated in the segmentation file and only those markers indicated by the segments should be in the markers file.

For example, the segmentation file should contain:

1-4
5-6

and the markers file should contain:

1
2
3
4
5
6
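The matching rule illustrated above can be checked mechanically. The sketch below works on the simplified form used in this example: segments as (start, end) position pairs and markers as a list of positions. It is an assumption-laden illustration; real segmentation and markers files also carry sample and chromosome columns, which this sketch ignores, and check_markers is a hypothetical helper name.

```python
def check_markers(segments, markers):
    """Compare marker positions against segment boundaries.

    segments: list of (start, end) positions.
    markers: list of marker positions.
    Returns (missing_boundaries, uncovered_markers).
    """
    marker_set = set(markers)
    # Segment start/end positions that do not appear among the markers.
    missing = sorted({pos for seg in segments for pos in seg} - marker_set)
    # Markers that fall outside every segment.
    uncovered = sorted(
        m for m in marker_set
        if not any(start <= m <= end for start, end in segments)
    )
    return missing, uncovered


# The example above: both lists come back empty, so the files agree.
result = check_markers([(1, 4), (5, 6)], [1, 2, 3, 4, 5, 6])
```

Any position reported in either list points to a mismatch of the kind that can trigger the error. For real data, run the check separately for each chromosome.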

 Back to top

I get "??? Attempted to access rl(:,2); index out of bounds because size(rl)=[0,1]" in my stderr file when running GISTIC, what does this mean?

If your run of GISTIC fails with the error below in the stderr.txt file, check your segmentation file format. Please see the sections on the segmentation file format in the GISTIC documentation for more details and examples.

??? Attempted to access rl(:,2); index out of bounds because size(rl)=[0,1].
Error in ==> derunlength at 25
Error in ==> smooth_cbs at 148
Error in ==> run_gistic_from_seg at 125
Error in ==> gp_gistic_from_seg at 177
MATLAB:badsubscript

If this does not resolve the issue, please contact us at gp-help(at)broadinstitute.org.

 Back to top

I get "??? Index exceeds matrix dimensions." in my stderr file when running GISTIC, what does this mean?

If your run of GISTIC fails with the error below in the stderr.txt file, check your markers file format. Please see the sections on the markers file format in the GISTIC documentation for more details and examples.

??? Index exceeds matrix dimensions.
Error in ==> check_if_has_header at 13
Error in ==> make_D_from_seg at 21
Error in ==> run_gistic_from_seg at 58
Error in ==> gp_gistic_from_seg at 179
MATLAB:badsubscript

If this does not resolve the issue, please contact us at gp-help(at)broadinstitute.org.

 Back to top

Does GISTIC support SNP 6.0 data?

Yes, GISTIC supports the Affymetrix Human SNP 6.0 array.

If you have further questions please contact us at gp-help(at)broadinstitute.org.

 Back to top

Why can't I run the GenePattern visualizers?

Java applet-based visualizers no longer function in any browser. For details, see the full blog post.

 Back to top

Why is my module taking so long to run?

Some computationally intensive modules can take a day or more to run. Examples include FLAMEMetacluster, NMFConsensusClustering, GISTIC, and GLAD. In addition, server load can affect queuing times on the Broad public server, which can increase the time a module takes to complete.

If your job does not use a computationally intensive module or a large data set, and it takes longer than about 4 hours to complete, please contact us at gp-help(at)broadinstitute.org.

 Back to top

What does a missing value error for ComparativeMarkerSelection mean?

If you receive the following errors while performing an analysis with ComparativeMarkerSelection:

Error in if (min(p) < 0 || max (p) > 1) {: missing value where TRUE/FALSE needed
Execution halted

or

ERROR: The estimated pi0 <=0. Check that you have valid p-values or use another lambda method.

then a gene in your data has insufficient variation in its expression values. Use the PreprocessDataset module with a filter that is more stringent than you have previously used on your data set before running ComparativeMarkerSelection.

If you continue to experience problems, please contact us at gp-help(at)broadinstitute.org.

 Back to top

Why does ExpressionFileCreator fail?

If ExpressionFileCreator fails on the GenePattern public server, please contact us at gp-help(at)broadinstitute.org.

If ExpressionFileCreator fails on your local server, but works on the public server, you need a more recent version (version 8 or 9) of ExpressionFileCreator. Only version 7 (and earlier versions) is available in the public repository (via Pipelines & Modules>Install from repository) because versions 8 and 9, which support the updated CEL file formats, require R 2.8, and GenePattern installs with R 2.5.

You can either use ExpressionFileCreator on the GenePattern public server or install a more recent version of ExpressionFileCreator on your local server.

Instructions for installing ExpressionFileCreator on your local server are available at ftp://ftp.broadinstitute.org/pub/genepattern/public_module_installation/efc_install_instructions.txt.

 Back to top

What does "Could not obtain CDF environment" mean?

If ExpressionFileCreator gives you the following error when you try to convert Affymetrix CEL files to GCT format:

Error in getCdfInfo(object) :
Could not obtain CDF environment, problems encountered:
Specified environment does not contain MoGene-1_0-st-v1
Library - package mogene10stv1cdf not installed
Bioconductor - mogene10stv1cdf not available
Calls: parseCmdLine ... .local -> indexProbes -> indexProbes -> .local -> getCdfInfo
Execution halted

then the CDF for your array was not found in the Broad-hosted CDF library, and you need to supply a custom CDF to support the conversion. CDF files are available from the Affymetrix website. For instance, if you are analyzing the Mouse Gene 1.0 ST Array, enter that name as the search term on the Affymetrix page; on the result page, the CDF file is listed under the Library Files section.

Provide this CDF file as the input for the cdf file parameter in ExpressionFileCreator.

Please note that a number of newer Affymetrix array types are not currently supported by ExpressionFileCreator, including the 1.1, 2.0, and 2.1 ST arrays, Exon arrays, and HTA 2.0 arrays. This is the case even if a CDF file is provided. Please see the ExpressionFileCreator documentation for details and future plans.

 Back to top

What does the GISTIC error, "Invalid file identifier" mean?

The usual cause of this error is spaces in any of the input file names.

 Back to top

Why does the "no such module" error occur for a module on the server?

If you run an imported pipeline on your own GenePattern server, and you get the error, "No such module [module name]", when you know you have that module on your server, then the pipeline requires a version of the module that is not on your server. If you return to the pipeline page and click Properties, you can view the modules that are required but not installed. If you install these module versions from the repository, the pipeline will run.

 Back to top

How can I properly view my GCT or RES file in IGV?

The default IGV display option for a GCT or RES file is the Heatmap. For the heatmap to make sense, the data must be row-centered, scaled and possibly have a threshold applied.
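As a rough illustration of what such preprocessing involves, the sketch below centers one row of expression values on its mean, scales it by its standard deviation, and optionally clips extreme values. This is only one common reading of "row-centered, scaled, and thresholded"; the use of the population standard deviation and the clip parameter are assumptions, and normalize_row is a hypothetical name rather than a GenePattern or IGV function.

```python
from statistics import mean, pstdev


def normalize_row(values, clip=None):
    """Center a row on its mean, scale by its standard deviation, optionally clip."""
    m = mean(values)
    s = pstdev(values)  # Assumption: population standard deviation.
    # A constant row has zero spread, so map it to all zeros.
    scaled = [(v - m) / s if s > 0 else 0.0 for v in values]
    if clip is not None:
        # Limit values to the range [-clip, clip].
        scaled = [max(-clip, min(clip, v)) for v in scaled]
    return scaled
```

Applying this to every row of a GCT or RES data matrix gives each gene a comparable color range in the heatmap. In GenePattern itself, PreprocessDataset is the module normally used for this kind of step.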

For complete information, see the blog post about Using IGV Through GenePattern.

 Back to top

Why is nothing happening when I try to upload my large file?

There are limitations on file upload size. Files uploaded via the Browse button on the module input page must be under 1.2 GB. To use larger files, there are a few options:

  • Select Upload files from the blue arrow in the Uploads tab to upload your file. This invokes the large file upload system. Add the files you want to upload to the Java applet and click the upload arrow. This makes your files available in the Uploads tab, and you can set them as input for appropriate GenePattern modules. More information about the Uploads tab can be found in our User Guide.
  • Download and install GenePattern on a local machine. Put your files on a server that is accessible to your GenePattern server – that is, on the same file system or via a network share – and use the file path as input for the GenePattern modules. (Note: you will have to enable file paths on your server.)
  • Put your files on a web-accessible machine or FTP site and specify a URL or FTP address for the input file. Make sure that the machine you use is accessible to the GenePattern server.

 Back to top

What does "Error in subfiles: subscript out of bounds" mean?

This error can be produced if there are hidden files or directories in the ZIP archive. This usually occurs on a Mac when using the "Compress" option from the right-click pop-up menu. If this is the case, you may want to use the zip command from the terminal window to zip files instead. If you didn't Compress on a Mac, then you should check that there are no hidden files in the ZIP archive.
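To see whether an archive contains hidden files or directories, you can list its entries. The sketch below uses Python's standard zipfile module; problem_entries is a hypothetical helper, and the check for a __MACOSX folder reflects what the macOS Compress option typically adds.

```python
import zipfile


def problem_entries(zip_path):
    """List archive entries that are directories, nested in a directory, or hidden."""
    problems = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            parts = name.rstrip("/").split("/")
            is_dir = name.endswith("/")
            nested = len(parts) > 1
            # Hidden entries start with a dot; macOS also adds a __MACOSX folder.
            hidden = parts[0] == "__MACOSX" or any(p.startswith(".") for p in parts)
            if is_dir or nested or hidden:
                problems.append(name)
    return problems
```

An empty list means the archive is flat and contains no hidden entries. Anything else in the list is a candidate cause of the error.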

 Back to top

How can I pre-process my RNA-seq data for IGV?

The recommended format for RNA-seq data in IGV is the BAM file. The SortSam module sorts and indexes a SAM or BAM input file, and can also convert a SAM file to BAM.

In addition, the IGVTools.sort and IGVTools.index modules can sort and index a SAM file. These modules are currently in beta. If you would like to use them, please contact us at gp-help(at)broadinstitute.org.

 Back to top

Why did the module I tried fail to run with my ZIP file as input?

If your ZIP file has a directory in it, GenePattern cannot resolve it. Unfortunately, if you generated your ZIP archive using the Finder on the Macintosh OS, the Mac builds a directory structure into your ZIP archive and GenePattern cannot resolve it. To zip on a Mac, use the zip command from a terminal window; for example, if you wanted to create a ZIP archive called "all_foo" that contains the files all_foo.cls and all_foo.gct, you could use the following command:

zip all_foo all_foo.cls all_foo.gct

Some other reasons that your ZIP file may fail include spaces in the names of the files or hidden files. If you cannot locate the issue with your ZIP file, please contact us at gp-help(at)broadinstitute.org.
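If you would rather script the archive creation, the following sketch builds a flat ZIP archive with Python's standard zipfile module. It stores each file at the top level of the archive and rejects hidden files and names containing spaces. The function name make_flat_zip is hypothetical.

```python
import os
import zipfile


def make_flat_zip(zip_path, file_paths):
    """Write files into a ZIP archive at the top level, with no directories."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in file_paths:
            name = os.path.basename(path)
            if name.startswith(".") or " " in name:
                raise ValueError(f"hidden file or space in name: {name!r}")
            # arcname drops any directory component from the stored name.
            archive.write(path, arcname=name)
```

For example, make_flat_zip("all_foo.zip", ["all_foo.cls", "all_foo.gct"]) produces the same kind of archive as the zip command shown above.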

 Back to top

Why did my GenePattern job fail?

The first place to look for the reason is the stderr.txt file, which should be available in the job summary or job status page. This file often contains plain text indicating what went wrong with a job, such as formatting or filtering errors. If you find that this file does not help you resolve the error, please contact us at gp-help(at)broadinstitute.org.

 Back to top

How can I use the RNA-seq modules available in GenePattern?

You can use the RNA-seq modules either on the Broad public server or by installing them on a GenePattern server installed on your machine or a network-accessible server.

 Back to top

If You Choose to Run RNA-seq Modules on Your Own GenePattern Server

If you have not installed GenePattern on your local machine, instructions for installing a local GenePattern server are provided on the Download GenePattern page.

If you have already installed a GenePattern server, select Modules & Pipelines>Install from repository. The page will present all available modules. You only need to select the checkboxes for the modules you want and click Install Checked.

Note: The main analysis RNA-seq modules (Bowtie, BWA, Cufflinks, TopHat, and Scripture) currently only run on Macintosh and Linux. If you do not have access to machines with these operating systems, you can use the modules on the Broad public server. The conversion/utility modules that are related to the RNA-seq modules are available for Macintosh, Linux, and Windows.

You may find it helpful to enable your GenePattern server to accept file paths in order to handle large input files that are already present on the system where your local server is installed. To do this, edit genepattern.properties (located in the resources directory under your GenePattern server directory) and set allow.input.file.paths=true. This allows users to input a network file path (such as file:///server/directory/file.gct) as the value for an input file parameter. When this value is set to true, you can define a root directory where the GenePattern server begins browsing for network files by setting server.browse.file.system.root to the root directory you want to specify.

Example: In genepattern.properties, setting server.browse.file.system.root=/Users/mydata/ngs will cause the browser window to open to /Users/mydata/ngs when a user chooses Specify File Path or URL.

Example: In the config_default.yaml file, setting server.browse.file.system.root: [ "/Users/mydata/ngs", "/Users/shared" ] will add two folders to the browser window.

 Back to top

Why is my GenePattern job stuck in the PENDING state?

There are a few reasons why this might occur. Jobs are often PENDING because GenePattern is a shared resource. When your job is in the PENDING state, it means that it is waiting in the queue behind other jobs for the GenePattern server to submit the job to the server farm. Jobs that use large files and access them via an external URL may hold up the line while those files are transferred to the GenePattern server, even keeping jobs that normally take a few seconds in PENDING.

The job will run when the queue clears up.

If this is a common issue on your GenePattern server, it is possible to configure it to help reduce the wait. If you want to reconfigure your GenePattern server in this way, please contact us at gp-help(at)broadinstitute.org.

 Back to top

Why do I receive an error when running my preprocessed GCT file and CLS file in ComparativeMarkerSelection?

If you run your preprocessed GCT file and CLS file in ComparativeMarkerSelection and receive the following error:

An error occurred while reading the file ClassFile.cls.

Cause: Header line needs three numbers!

make sure your CLS file is space-delimited, not tab-delimited. Tab delimiters are the most common cause of this error. If this does not resolve the error, please contact us at gp-help(at)broadinstitute.org.
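A tab-delimited CLS file can be converted with a few lines of Python. The sketch below rewrites every line so that its fields are separated by single spaces; cls_tabs_to_spaces is a hypothetical helper, and the sample content in the comment (4 samples, 2 classes) is made up for illustration.

```python
def cls_tabs_to_spaces(text: str) -> str:
    """Rewrite CLS text so that fields on each line are separated by single spaces."""
    # str.split() with no arguments splits on any run of whitespace, including tabs.
    return "\n".join(" ".join(line.split()) for line in text.splitlines()) + "\n"


# Hypothetical tab-delimited input:
#   "4\t2\t1\n#\tA\tB\n0\t0\t1\t1\n"
# becomes:
#   "4 2 1\n# A B\n0 0 1 1\n"
```

Read your CLS file into a string, pass it through the function, and save the result under a new name before rerunning ComparativeMarkerSelection.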

 Back to top

Why did I get a warning stating that my index is older than my BAM file?

If you try to run an indexed BAM file through a module and receive a warning that your index file (BAI) is older than your BAM file, it means that the timestamps for these files are out of sync. If you receive this warning, you should index your BAM file by using the SortSam module.

How do I run several files through a set of modules in parallel?

You can do this by creating a pipeline for the jobs you want to run in parallel.

Then you can submit your set of data files to the pipeline as a batch job. For more information, see Batch Processing.

There are additional features that make it easier to work with large input files and to run batches of jobs in parallel:

  • The User Guide describes how to work with large files in GenePattern
  • GenePattern also has a feature (disabled by default) that allows you to access input files on the server's file system. With this feature turned on, you don't need to directly upload your input files via the job input form. See Using File Paths for details.
  • GenePattern has a programming interface (with versions for Java, R, and MATLAB) that allows you to submit your jobs in parallel. See the Programmers Guide for more details.

 Back to top

How do I zip my files for use in GenePattern?

On Windows, you need to select the files to be added to the ZIP archive (hold down the Control or Shift key while selecting to select a group). Then right-click on the group and select WinZip (or whichever zip application you have on your machine). Do not select a folder and zip it – that will create a directory inside the ZIP archive; if your ZIP archive has a directory in it, GenePattern cannot resolve it.

On Macintosh, if you generate your ZIP archive using the Finder, Mac builds a directory structure into your ZIP archive and GenePattern cannot resolve it. To zip files on a Mac, use the zip command from a terminal window (launched from Applications/Utilities); for example, if you wanted to create a ZIP archive called "all_foo" that contains the files all_foo.cls and all_foo.gct, you could use the following command:

zip all_foo all_foo.cls all_foo.gct

If you follow these instructions and find that GenePattern does not accept your ZIP file, check for spaces in the names of the files or hidden files in the ZIP archive. If you cannot locate the issue with your ZIP file, please contact us. The GenePattern team plans to develop a ZIP module to help users with creating ZIP archives.

 Back to top

How can I easily run the same analysis on many different data files?

As of GenePattern 3.3.3, GenePattern supports batch jobs. To use this feature:

  1. Select the Uploads tab.
  2. Click the arrow next to the uploads directory, name a subdirectory, and click Create.
  3. Click the arrow next to your subdirectory and select Upload. This launches the GenePattern file uploader.
  4. Click Add in the top of the uploader window and select all the files you want to run as a batch.
  5. Click the upload arrow. This will upload all your files into the subdirectory you just made on the GenePattern server. Do not close the uploader window while the file upload is in progress.
  6. Once the files are uploaded, click the blue arrow next to the directory containing the files, and select the module or pipeline you want to use for your analysis.
  7. If there is more than one input file field you need to populate, you can select "send to as batch" for those parameters that accept batch inputs. Make sure that files that need to be paired for a given analysis have the same base name; for instance, "file1.gct" would be processed with "file1.cls".
  8. Run the module.

The module will be run once for each file selected. All the job results for the batch will be listed under a single batch ID.
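Before uploading a batch with paired inputs, you may want to confirm that every file has a partner with the same base name. The sketch below is a local pre-check only and does not interact with GenePattern; unpaired_files is a hypothetical helper, and the default extensions are taken from the example in step 7.

```python
import os


def unpaired_files(filenames, extensions=(".gct", ".cls")):
    """Return base names that are missing a file for one of the given extensions."""
    seen = {}
    for name in filenames:
        base, ext = os.path.splitext(name)
        if ext.lower() in extensions:
            seen.setdefault(base, set()).add(ext.lower())
    # A base name is complete only if every expected extension was found.
    return sorted(base for base, exts in seen.items() if exts != set(extensions))
```

For example, unpaired_files(os.listdir("my_batch_dir")) (with a directory name of your choosing) lists the base names that would not pair up in a batch run.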

If you have difficulties with the batch upload function in GenePattern 3.3.3, please contact us at gp-help(at)broadinstitute.org.

 Back to top

Why can't I use a directory as input for all modules?

As of GenePattern 3.3.3, GenePattern supports the use of directories as input for modules, but not all modules support this feature.

A few quick ways to tell if a module does accept directories are:

  • Click on the arrow next to a directory in the Uploads tab; the modules listed in that drop-down will accept directories as input.
  • Check the caption under the input parameter; if the module accepts directories as input, it will indicate that here.
  • Check the module documentation (available from the help link in the upper right-hand corner of the module's page); the input parameters section will make it clear if a directory is accepted as input for the module.

 Back to top

Does ExpressionFileCreator support Exon arrays?

ExpressionFileCreator does not currently support Exon arrays. The GenePattern module development team is working on a module for this.

 Back to top

How do I format my GenePattern output for submission to GEO?

There are currently no modules in GenePattern for submitting data to GEO. The NCBI has webtools for this purpose, such as GEOarchive.

 Back to top

How do I get a heat map with a high enough resolution for publication?

To generate a new heat map image at a resolution near 300 dpi, you can:

  1. Select the HeatMapImage module in GenePattern.
  2. Change both column size and row size to 33 pixels.
  3. For best results, change show grid to "no". (The grid does not scale as much as the column/row size does, and so may look suboptimal for print publication.)
  4. Generate your heat map image. Open the image file in an image manipulation application that can scale images (such as Adobe Photoshop or GIMP) and increase the image resolution to 300 dpi. This will reduce the printed size of the image by about 4 times (which is why you enlarged the cells above) and leave it at a resolution of 300 dpi, which is optimal for print publication.

If you already have a 72 dpi heat map image that you cannot recreate, you can use an image manipulation application that can scale images (such as Adobe Photoshop or GIMP) to increase the resolution to 180 dpi. This will shrink the printed image to 40% of its original width and height (72/180), but 180 dpi is usually the minimum resolution necessary for print publication.
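The size changes mentioned in this answer follow from simple arithmetic: when the resolution is raised without resampling, the pixel count stays the same and the printed dimensions shrink by the ratio of the old resolution to the new one. The sketch below works through the numbers, assuming a starting resolution of 72 dpi; the function names are hypothetical.

```python
def print_scale(old_dpi: float, new_dpi: float) -> float:
    """Fraction of the original printed width and height after changing dpi."""
    return old_dpi / new_dpi


def printed_inches(pixels: int, dpi: float) -> float:
    """Printed length in inches of an image dimension at a given resolution."""
    return pixels / dpi


to_300 = print_scale(72, 300)  # 0.24, i.e. roughly 4 times smaller
to_180 = print_scale(72, 180)  # 0.4, i.e. 2.5 times smaller
```

For instance, an image 3300 pixels wide prints 11 inches wide at 300 dpi (printed_inches(3300, 300)), which helps when deciding how large the column and row sizes need to be for a given figure width.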

 Back to top

I am running a large number of RNA sequencing jobs, and I'd like to be able to look at the quality of the data. Is there a tool I could use for this?

Yes: the RNAseQC module in GenePattern calculates standard RNA-seq related metrics, including depth of coverage, ribosomal RNA contamination, continuity of coverage and GC bias. See the module documentation for the recommended data processing workflow for optimal use of this QC analysis.

 Back to top

Is there an easier way to create a CLS file than creating it by hand in a text editor?

Try the ClsFileCreator module in GenePattern. The ClsFileCreator is a wizard-based tool that can be used to create class label (CLS) files from array data in the GCT or RES file formats.

 Back to top

How can I install GISTIC_2.0 on my own GenePattern server?

To install GISTIC, follow the instructions below. Note that you must install it on a 64-bit Linux machine.

First, export GISTIC from the public GenePattern server: select GISTIC from the list of modules, then click the export link to save the module in a zip file.

If you do not already have MATLAB installed, you will need to install it. An executable and instructions can be found on the GISTIC 2.0 publication page.

Once MATLAB is installed, you will need to add lines like the following to your <GenePatternServer>/resources/genepattern.properties or custom.properties file:

MATLAB_LIBRARY=/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/sys/java/jre/glnxa64/jre/lib/amd64\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/sys/java/jre/glnxa64/jre/lib/amd64/server\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/sys/java/jre/glnxa64/jre/lib/amd64/native_threads\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/sys/os/glnxa64\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/runtime/glnxa64\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/bin/glnxa64

and

APPLERES_DIR=/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/X11/app-defaults
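The MATLAB_LIBRARY value above is six directories under a single MATLAB Component Runtime root, joined with an escaped colon (\:). As a minimal sketch, assuming your runtime uses the same directory layout as the example paths, the two property lines could be generated from the root directory. The function name gistic_properties is hypothetical and is not part of GenePattern; substitute the root directory of your own installation.

```python
# Sub-directories of the MATLAB Component Runtime root, in the order
# they appear in the example MATLAB_LIBRARY value above.
LIBRARY_SUBDIRS = [
    "sys/java/jre/glnxa64/jre/lib/amd64",
    "sys/java/jre/glnxa64/jre/lib/amd64/server",
    "sys/java/jre/glnxa64/jre/lib/amd64/native_threads",
    "sys/os/glnxa64",
    "runtime/glnxa64",
    "bin/glnxa64",
]


def gistic_properties(mcr_root):
    """Return the two property lines for a given runtime root (hypothetical helper)."""
    root = mcr_root.rstrip("/")
    # The properties file in the example separates entries with "\:".
    library = "\\:".join(f"{root}/{sub}" for sub in LIBRARY_SUBDIRS)
    return [
        f"MATLAB_LIBRARY={library}",
        f"APPLERES_DIR={root}/X11/app-defaults",
    ]


if __name__ == "__main__":
    example_root = (
        "/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/"
        "MATLAB_Component_Runtime/v714"
    )
    for line in gistic_properties(example_root):
        print(line)
```

Paste the printed lines into genepattern.properties or custom.properties, after checking that each directory actually exists on your machine.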

Then restart your GenePattern Server. (To avoid restarting, enter these properties on the Administration>Custom page instead.)

Then connect to your GenePattern Server, and import GISTIC from the zip file you just exported.

If you then wish to run it from the command line, use one of the programming interfaces described in the Programmers Guide.

 Back to top

How can I retrieve external database information from GenePattern?

The GenePattern server itself does not connect to any database, but modules can connect to databases and retrieve data from them. Modules already exist for caArray (caArrayImportViewer) and Gene Expression Omnibus (GEOImporter). To connect to a database of your choice, write a simple command-line program that connects to the database and saves the retrieved data to a file, then install this program as a module in GenePattern (see Creating Modules).

 Back to top

My MATLAB figures are not appearing in the MATLAB visualizer I created. Why?

When creating a MATLAB visualizer from compiled M-code with MATLAB 7.0 (or any release before 7.4), any figures that you create in MATLAB must have the Visible property set to on, or they will not be drawn to the screen.

 Back to top

Can my module use a different version of R than GenePattern?

GenePattern modules can be written for any version of R. For details on how to specify which version to use, see Using Different Versions of R.

 Back to top

How can I share a GenePattern module?

You can submit your module to the GenePattern Archive (GParc). You will need to register to use all the features of GParc. Check the resources for module developers for best practices in testing and documentation before submitting your module.

 Back to top

Why can't I specify a 64-bit platform for a module?

GenePattern does not have a valid CPU Type for 64-bit platforms. If you specify a 64-bit CPU Type, the module will fail on 64-bit platforms, whether or not they are running in compatibility mode. Instead, set the CPU Type to 'any' and describe the appropriate platforms in your documentation. If the module still fails on appropriate platforms, contact us at gp-help(at)broadinstitute.org.

 Back to top

Where can I find out more about how to launch GenePattern modules from other programming languages?

The reference guide for accessing GenePattern modules from Java, MATLAB, and R is the Programmers Guide.

 Back to top

Can I use the GenePattern APIs to create a web service that programmatically accesses the GenePattern server?

GenePattern provides a REST API for use by web applications. A WADL file for the REST API can be accessed at the URL below:

http://your_server:your_port/gp/rest/application.wadl
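As a minimal sketch of how a client might retrieve that WADL file, the following uses only the Python standard library. The helper names wadl_url and fetch_wadl are hypothetical, and the sketch assumes the server answers this URL without authentication, which may not hold for your installation.

```python
from urllib.parse import urlunsplit
from urllib.request import urlopen


def wadl_url(host, port, scheme="http"):
    """Build the WADL URL from the pattern given above (hypothetical helper)."""
    return urlunsplit((scheme, f"{host}:{port}", "/gp/rest/application.wadl", "", ""))


def fetch_wadl(host, port, timeout=10):
    """Download the WADL document as text. Needs a reachable GenePattern server."""
    with urlopen(wadl_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Replace localhost and 8080 with your own server and port.
    print(wadl_url("localhost", 8080))
```

The returned XML lists the available REST resources, which you can then call from the web application of your choice.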

GenePattern also provides an older SOAP API. This API is deprecated, but is still available. The WSDL file for the GenePattern SOAP API is available at:

http://your_server:your_port/gp/services

For more information about the programming libraries, see the Programmers Guide.

 Back to top

Why can't I call my pipeline/module from MATLAB?

A pipeline or module with a period in its name cannot be called from MATLAB.

 Back to top

What is a CSV file?

CSV stands for "comma-separated values". While CSV files will open in Excel or similar spreadsheet applications, it is important to remember that the values in these files are comma-delimited, not space- or tab-delimited.
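To illustrate why the delimiter matters, here is a short sketch using Python's standard csv module. The sample values and the helper name parse_delimited are made up for this example.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Split delimited text into rows of fields."""
    return [row for row in csv.reader(io.StringIO(text), delimiter=delimiter)]


if __name__ == "__main__":
    sample = "gene,sample1,sample2\nTP53,1.5,2.0\n"
    # Read as comma-delimited: three fields per row.
    print(parse_delimited(sample))
    # Read as tab-delimited by mistake: each row collapses into one field.
    print(parse_delimited(sample, delimiter="\t"))
```

A program that expects tabs will see each line of a CSV file as a single value, which is a common source of file errors.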

 Back to top

Can I process raw Illumina BeadChip data in GenePattern?

Several modules are available in updated form on the Broad GenePattern server for processing raw Illumina scan data into GCT files that are usable by GenePattern: IlluminaScanExtractor, IlluminaNormalizer, and IlluminaConcatenator. At this time, these modules support only the 6k Transcriptionally Informative Gene (TIG) panel (GEO accession: GPL5474), not other DASL gene panels. The IlluminaDASLPipeline is a workflow that chains these three modules together, making it easy to process zipped Illumina scan data files produced by a DNA-mediated Annealing, Selection, extension and Ligation (DASL) assay.

IlluminaExpressionFileCreator extracts the mean value for each probe from a set of Illumina expression IDAT files and puts them into GCT format.

 Back to top

What versions of genomic databases is GeneCruiser currently using?

UniGene and SwissProt are at the current versions listed on their websites and are updated regularly. We are working on restoring regular updates for Entrez Gene. If you would like to know the version of another database accessed by GeneCruiser, please contact us.

 Back to top

Can you send me the source code for GISTIC?

We do not currently distribute the source code for GISTIC. The executable is available and can be found on the GISTIC page. You can also export the GISTIC module from the Broad's public GenePattern server. Note that the GISTIC module and executable are currently compiled only for 64-bit Linux.

The GISTIC developers are working on a version that will allow us to distribute the source code, but it is still in development.

 Back to top

How can I make my GenePattern module available on the GenePattern public server?

First, please look at GenePattern Archive (GParc) to see if this will satisfy your requirements. If it seems that GParc is not the right answer for you, please contact the GenePattern team at gp-help(at)broadinstitute.org to begin discussing the possibility of releasing your module on the GenePattern public server. When you contact us, please provide your code and any documentation you currently have.

 Back to top

Does GISTIC support Agilent data?

Yes, GISTIC supports Agilent data. However, you must first convert your aCGH data into SEG (segmented) format. GenePattern does not currently provide a module for converting Agilent data to SEG format.

 Back to top

Why am I getting the "none of the gene sets passed the size thresholds" error in GSEA?

There are several things to check in your gene sets. For example, if you are not using the collapse to gene symbols option, check that your gene identifiers are all uppercase. For the full list, see the error 1001 FAQ for GSEA.

 Back to top

I keep getting file errors when I run a module. What are common reasons for file errors that I can check?

There are several things you can check in your files that commonly cause file errors:

  • Do your files/directories have spaces in their names?
    • Remove them or replace them with _ (underscore) or periods.
  • Are there characters such as parentheses or pound signs (#) in your file names?
    • Remove them or replace them with _ (underscore) or periods.
  • What type of file is the module expecting? Is your file the correct type?
    • Check the module documentation for more information.
  • Does the file have the correct extension for the file type the module is expecting?
    • Sometimes Excel or similar programs can add a ".txt" or other extension to the file name. Remove it (rename the file on your desktop and delete the .txt extension) and make sure the file name ends with the correct extension.
  • Is your data delimited in the way the module expects it to be?
    • Check the module documentation to see if it expects tab-, comma-, or space-delimited data (or something else), then make sure your file is formatted appropriately.
  • Did you edit your file in Excel or similar program?
    • Such applications can sometimes add extra spaces or tabs. Open your file in a text-editing application and look for these extra invisible characters that can cause errors.
  • Do the contents of your file match the expected file format?
    • Check that your file contains all the expected columns and header information in the expected order for the given format. See File Formats.
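The file-name checks in the list above can be automated before you upload anything. The following is a minimal sketch, not a GenePattern tool: the helper names check_path and sanitize_name are hypothetical, and the set of risky characters covers only the ones named in the list (spaces, parentheses, and pound signs).

```python
import re

# Characters the checklist flags: whitespace, parentheses, and pound signs.
RISKY = re.compile(r"[\s()#]")


def check_path(path, expected_ext):
    """Return a list of likely problems with a file path (empty if none found)."""
    problems = []
    if RISKY.search(path):
        problems.append("path contains spaces, parentheses, or '#'")
    if not path.lower().endswith(expected_ext.lower()):
        problems.append(f"name does not end with {expected_ext}")
    return problems


def sanitize_name(name):
    """Replace each risky character with an underscore."""
    return RISKY.sub("_", name)


if __name__ == "__main__":
    # A name with spaces, parentheses, and an extra .txt extension.
    print(check_path("my data (v2).gct.txt", ".gct"))
    print(sanitize_name("my data (v2).gct"))
```

This checks names only. The delimiter, invisible characters, and file format items in the list still require opening the file in a text editor.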

 Back to top

Where are the dock icons for my GenePattern server on my Mountain Lion (OS X 10.8) machine?

When you install GenePattern 3.4.0 or earlier and select the option to install icons in the dock, the icons will not appear in the dock after installation. They appear there only while the server is running. You can, however, place them there manually.

 Back to top

Why am I seeing a "Process existed with status code: 138" error when I try to run ConsensusClustering?

The ConsensusClustering module does not work with Java 1.6.0_33 on Macintosh. As a workaround, you can run ConsensusClustering on the GenePattern public server, on a server running on a Windows machine, or on a Macintosh with a Java version other than 1.6.0_33.

 Back to top

Why can't I install a licensed module on my GenePattern server?

Licensed modules can only be installed on servers running GenePattern 3.5 or higher. Upgrade your GenePattern server and try again.

 Back to top

Why can't I install GenePattern on my Mac OSX 10.9 Mavericks machine?

Please refer to this blog entry for assistance with this question.

 Back to top

I got the error "All input data were removed after NaN processing" when running GISTIC_2.0. What does it mean?

GISTIC expects the segments for each sample to cover almost all of its genome, including the regions where the copy number is normal. Any gaps in coverage for any sample are removed from the GISTIC analysis, so extensive gaps can leave no data to analyze, which produces this error.

 Back to top

How do I find reference genomes to use in TopHat, Bowtie, or BWA?

The TopHat, Bowtie, and BWA GenePattern modules provide easy access to the reference genome index bundles for a number of species. If we aren't yet hosting the index for the species you need, you can email us at gp-help@broadinstitute.org and we will add your species to the available indexes. Alternatively, reference genome bundles for other species are available from the Illumina iGenomes website. Note that the GenePattern modules cannot use the iGenomes bundles directly as packaged there: you will need to unpack the bundle and repackage the pertinent files (for example, the Bowtie2 Index files) as a ZIP archive. Remember that there are some special considerations for creating ZIP archives for use in GenePattern.
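The repackaging step can be scripted. The sketch below is one possible approach, not the official procedure: it assumes that the index files should sit at the top level of the ZIP archive with no enclosing folder, which you should confirm against the GenePattern guidance on ZIP archives. The helper name zip_flat and the file names in the example are hypothetical.

```python
import os
import zipfile


def zip_flat(file_paths, zip_path):
    """Write the given files into a ZIP archive with no directory structure."""
    names = [os.path.basename(p) for p in file_paths]
    if len(set(names)) != len(names):
        raise ValueError("two input files share the same file name")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, name in zip(file_paths, names):
            # arcname drops the folder part so the file lands at the top level.
            archive.write(path, arcname=name)
    return zip_path


if __name__ == "__main__":
    # Hypothetical example: package every .bt2 file in the current folder.
    index_files = sorted(f for f in os.listdir(".") if f.endswith(".bt2"))
    if index_files:
        print(zip_flat(index_files, "bowtie2_index.zip"))
```

Building the archive in a script also avoids the hidden files that some desktop ZIP tools add.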

 Back to top

How do I find reference genome annotation files or whole genome files to use with the GenePattern RNA-seq tools?

Several of the modules also accept reference genome annotation files (GTF files) and/or whole genome FASTA files. Lists of these files are available from our FTP site in the following locations:

The modules can usually accept an FTP URL directly wherever a file input is allowed, so there is no need for you to download the reference file; instead, just copy and paste the file's FTP URL into the file input parameter.

How can I create a GISTIC markers file for my segmented data file?

The best way to create a markers file that matches your data correctly is to take the first 3 columns from the copy number file that you used as input to the segmentation method that produced the seg file for GISTIC.
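Taking the first 3 columns can be done in a spreadsheet, or with a short script such as the sketch below. It assumes the copy number file is tab-delimited and copies every row as-is, including any header row; check the GISTIC documentation to see whether your markers file should keep a header. The helper name first_three_columns and the file names are hypothetical.

```python
import csv


def first_three_columns(src, dst):
    """Copy the first 3 tab-delimited columns of each row from src to dst.

    src and dst are open text streams. Returns the number of rows written.
    """
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    written = 0
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or incomplete lines
        writer.writerow(row[:3])
        written += 1
    return written


if __name__ == "__main__":
    import os

    # Hypothetical file names; replace with your own.
    if os.path.exists("copy_number.txt"):
        with open("copy_number.txt", newline="") as src, \
                open("markers.txt", "w", newline="") as dst:
            print(first_three_columns(src, dst), "rows written")
```

Because the markers come from the same file that was segmented, they are consistent with the segments in your seg file.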

 Back to top

Is there a markers file that I can use with the level 3 TCGA data?

The SNP 6.0 markers file used for our TCGA GISTIC analyses is available here: ftp://ftp.broadinstitute.org/pub/GISTIC2.0/hg19_support/genome.info.6.0_hg19.na31_minus_frequent_nan_probes_sorted_2.1.txt

 Back to top

I just installed Java 8. Why does my new GenePattern install say I need Java 7+?

It is likely that you installed the Java 8 JRE via your browser. The JRE allows you to run Java apps but is not sufficient for running GenePattern; you need the Java JDK (https://java.com/en/download/manual.jsp). An easy way to see which Java version you have installed is to open a terminal window and type "java -version" (without the quotes).

 Back to top