MAExplorer Newsletter #5, January 15, 2002
==========================================

This newsletter is issued periodically to announce major changes and enhancements to the MicroArray Explorer (MAExplorer). Since you have expressed an interest in MAExplorer in the past, you were included in this newsletter mailing. If you don't wish to receive further mailings, let us know and we will remove you from the list. Old newsletters are available in the Web site archive, so if you are interested in following developments in MAExplorer but don't want the E-mail, you can get the information there.

WEB SITE: http://www.lecb.ncifcrf.gov/MAExplorer
CONTACT: mae@ncifcrf.gov
PHONE: 301-846-5535 (Peter Lemkin)
       301-846-5539 (Greg Thornwall)
FAX: 301-846-5598
NEWSLETTER ARCHIVE: http://www.lecb.ncifcrf.gov/MAExplorer/Newsletters/

Enhancements in the current Version 0.94.04 since the last Newsletter
=====================================================================

At the time of the last newsletter (#4, October 17, 2001), the version was 0.92.09. Since then, a number of improvements and new features have been added, as well as the beta release of a data format conversion tool (Cvt2Mae). Details on these features are described in the Reference Manual.

1. Installing MAExplorer
------------------------

We now use the ZeroG Corporation InstallAnywhere 4.5 program to install MAExplorer. MAExplorer is now distributed with the JDK 1.3 Java Virtual Machine rather than the older JDK 1.1.8 used by the previous versions. InstallAnywhere 4.5 requires this to work across all platforms including MacOS-X, IBM AIX, HP-UX, and some new Unix systems. This fixes several problems that have occurred with Windows-NT and may fix some other installation problems.

2. Menu reorganization, name changes and new commands
-----------------------------------------------------

2.1 We reorganized the Analysis menu to make the user interface more consistent. We moved the "Cluster Plots" submenu out of the "Plots" submenu, up one level into the "Analysis" menu.

2.2 A number of new gene data filters have been added. The (Analysis | Filter | Filter per-sample Good Spot data) and the (Analysis | Filter by spot CV) filters have been extended. New options for these commands now allow filtering by 1) "HP-X 'sets'" or "HP-Y 'set'", 2) "HP-X or HP-Y", and 3) "HP-X or HP-Y 'sets'". The previous filters only allowed filtering by "HP-X and HP-Y" and "HP-X and HP-Y 'sets'".

2.3 You may now pre-edit the scrollable thresholds prior to using them in data filters and clustering. A new (Edit | Preferences | Adjust all Filter threshold scrollers) command pops up the state scroller window with all of the thresholds available to adjust. This is useful when you want to adjust thresholds before you enable data filtering or clustering. A bug was fixed where the state of the previous clustering method was not always cleared if you aborted clustering by clicking on the delete window button (e.g. in Windows or on the Mac).

2.4 A new pseudoarray image display command (Analysis | Plots | Pseudocolor (HP-X,HP-Y) 'sets' p-Value) has been added. It displays a pseudocolor value proportional to the p-Value in a t-Test of the HP-X 'set' vs HP-Y 'set'. This is only available if HP-X 'set' vs HP-Y 'set' data is available.

2.5 Two new optional popup logging windows were added that log query messages and responses, and the command history. The (View | Show log messages) command logs messages that occur during a session. These could be the query and response when interrogating the database for particular genes. The (View | Show log of command history) command shows the commands you have invoked during your data mining session. The contents of the windows may be saved in log files or cut and pasted into other programs.

3. Clustering changes
---------------------

3.1 When using K-means clustering, you may optionally use the median instead of the mean when computing the center of a cluster by enabling the (Analysis | Cluster | Use median instead of mean for K-means clustering) option.

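To illustrate the difference between the two ways of computing a K-means cluster center, here is a minimal sketch in Java. It is not MAExplorer source code: the class name ClusterCenterSketch, its method names, and the data layout (one row per gene, one column per sample) are assumptions made for this example only. It assumes the cluster has at least one gene.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and data layout are hypothetical, not MAExplorer's. */
final class ClusterCenterSketch {

    /** Median of the values; for an even count, the average of the two middle values. */
    static double median(double[] values) {
        double[] sorted = values.clone();   // do not reorder the caller's data
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1) ? sorted[n / 2]
                            : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /** Arithmetic mean of the values. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /**
     * Center of one cluster, computed sample by sample.
     * members[g][s] is the expression value of gene g in sample s
     * (assumed layout; the cluster must contain at least one gene).
     */
    static double[] center(double[][] members, boolean useMedian) {
        int nSamples = members[0].length;
        double[] center = new double[nSamples];
        double[] column = new double[members.length];
        for (int s = 0; s < nSamples; s++) {
            for (int g = 0; g < members.length; g++) {
                column[g] = members[g][s];
            }
            center[s] = useMedian ? median(column) : mean(column);
        }
        return center;
    }

    public static void main(String[] args) {
        // Three genes, two samples; the third gene is an outlier in sample 0.
        double[][] cluster = { {1.0, 10.0}, {2.0, 20.0}, {100.0, 30.0} };
        System.out.println("mean center:   " + Arrays.toString(center(cluster, false)));
        System.out.println("median center: " + Arrays.toString(center(cluster, true)));
    }
}
```

In this made-up example the median center is [2.0, 20.0], while the mean center is pulled toward the outlier value of 100 in the first sample. This is the usual reason for preferring the median: a few extreme expression values have less influence on where the cluster center ends up.
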
3.2 After you have created a ClusterGram using hierarchical clustering, you can select a subset of genes that appear significant to you in the ClusterGram and save them in the Edited Gene List for further analysis. They need not be contiguous. Clicking on a row in the ClusterGram with the Control (Shift) key pressed will add (remove) that gene to (from) the Edited Gene List. If (View | Show 'Edited Gene List') is enabled, a magenta '*' is drawn before the gene name to indicate that you have selected it.

4. Data Conversion issues
-------------------------

4.1 The Cvt2Mae "wizard" data format conversion tool has been enhanced and made more flexible. This tool lets users convert their array data to MAExplorer format so it may be used with MAExplorer, and it is described in the Cvt2Mae home page (http://www.lecb.ncifcrf.gov/Cvt2Mae). The <User-defined> data option is now available to convert non-standard data. A default array layout has been added for GenePix data. Since each array is different, this can serve as a basis to define an array layout for your data that was quantified using GenePix. The Cvt2Mae Affymetrix array layouts offer the option of using genomic identifiers in the Description field.

4.2 Restrictions on the length of sample file names have been removed for MacOS 8/9. This resolves a potential problem for MacOS 8/9 users. The naming of samples and sample file IDs has been made more flexible. If the "Database_File" is longer than 32 characters, there may be problems reading that file with MacOS 8/9. Therefore, you can use the "DatabaseFileID" to specify the "Database_File" (in which case they would be the same). Then, the labels used for the samples are the "Sample_ID" fields. This is upwards compatible with the previous method, which used the "Database_File" for both the file name and the label name.

4.3
A new Gene Class "Empty Wells" has been added; it contains a list of empty spots on the array (i.e. spots with no genes or with the names "Empty", "EmptyWell", or "Empty Well"). Also, alternate names were added for easier automatic detection of ESTs and of ESTs similar to named genes. These changes are documented in the Reference Manual (see Automatic Gene Class naming based on Gene Name in Appendix C Table C.4.1).

4.4 Additional input data error detection of bad (field, grid, row, column) data was added to MAExplorer when reading the GIPO and Quant input files. This can be useful in detecting corrupted input data.

5. Documentation
----------------

5.1 The Reference Manual and tutorials have been revised to reflect the renaming of Hybridization Probes to Hybridized Samples. They have also been clarified, and new material has been added.

5.2 The Web page devoted to MAExplorer PDF (Adobe Acrobat) documents has been updated.

5.3 The Web page for MAExplorer plugins describes and documents MAEPlugins (http://www.lecb.ncifcrf.gov/MAEPlugins). Some initial MAEPlugins are undergoing alpha-testing and will be made available on this Web site.

Future enhancements under development
=====================================

* Added functionality in the Cvt2Mae data format conversion tool for "User-defined" data.
* User-extensible analysis methods using Java plugins, by having users write to a MAExplorer Java plugin API (Application Programming Interface). Check the Web page http://www.lecb.ncifcrf.gov/MAEPlugins for status and more information.
* New data filtering methods
* New normalization methods
* New clustering methods
* Improved graphics
* Improved documentation and training films
* Access to additional genomic Web databases
* Generation of extracted image spot regions for specific genes across a set of samples
* etc.

Please contact us at mae@ncifcrf.gov with your comments, suggestions, and any problems running MAExplorer.