MAExplorer Newsletter #3, August 30, 2001
=========================================

This newsletter is issued periodically to announce major changes and
enhancements to the MicroArray Explorer (MAExplorer). Since you have
expressed an interest in MAExplorer in the past, you were included in
this newsletter mailing. If you don't wish to receive further mailings,
let us know and we will remove you from the list. Old newsletters are
available in the Web site archive, so if you are interested in
following developments in MAExplorer but don't want the E-mail, you
can get the information there.

WEB SITE: http://www.lecb.ncifcrf.gov/MAExplorer
CONTACT: mae@ncifcrf.gov
PHONE: 301-846-5535 (Peter Lemkin)
       301-846-5539 (Greg Thornwall)
FAX: 301-846-5598
NEWSLETTER ARCHIVE: http://www.lecb.ncifcrf.gov/MAExplorer/Newsletters/

Enhancements in the current Version 0.90.08 since the last Newsletter
=====================================================================

At the time of the last newsletter (#2, April 5, 2001), the version was
0.89.18. Since then, a number of improvements and new features have
been added, as well as the Beta-release of a data format conversion
tool (Cvt2Mae). Details on these features are described in the
Reference Manual.

1. A number of new gene data filters have been added
----------------------------------------------------

1.1 A new data filter "Filter by 'Good Spot data'" was added that may
be used to eliminate bad spot data on a per-gene set basis. It uses the
"QualCheck" field in the quantified data table if that field is
present. QualCheck may hold one of three encodings: 1) an integer
numeric code (see Appendix C of the Reference Manual), 2) an alphabetic
code (e.g. Affymetrix "Abs Call") that maps "P" (or "G" or "T") to Good
Spot, "A" (or "B" or "F") to Bad Spot, and "M" to Marginal Spot, or
3) a continuous quality value. In this latter case, QualCheck may be a
continuous monotonically increasing floating point value (e.g. 0.0 to
100.0, or 0.0 to 1.0, etc.),
in which case a "Spot Quality" State threshold slider will pop up when
the filter is invoked. The NCI/CIT mAdb array database server will be
generating GOOD/BAD (case 1 above) QualCheck data so it can take
advantage of this data filter.

1.2 Additional change-limit constraints "Compare channels meeting
range" were added to the "Filter by spot intensity [SI1:SI2] sliders"
to allow filtering of spot intensity independently on the Cy3 and Cy5
channels of (Cy3/Cy5) ratio data. This allows filtering by spot
intensity on some or all of the samples by one of the (ALL, ANY,
AT MOST, AT LEAST) constraints. For the (AT MOST, AT LEAST)
constraints, a new "Percent SI ok" threshold slider allows you to
define a fuzzy definition of present or absent that may be useful in
handling noisy low intensity spots that have corresponding spots that
are present.

1.3 Some of the State Scrollers were too sensitive when setting low
threshold values. These now have a non-linear range vernier control at
the low end. The dynamic range of some of the scrollers has also been
increased, making it possible to set the desired values over a wider
range.

1.4 A data filter "Filter by positive intensity data" has been added
that allows the user to use either "positive only" or "positive or
negative" intensity data (this may be used with Affymetrix or other
data that contains negative values). It is only available for data
that contains negative numbers.

1.5 A data filter "Filter by Cy3/Cy5 HP-X ratio or Zdiff sliders" has
been added to filter genes by ratio or Zdiff values within the
[CR1:CR2] or [CZ1:CZ2] threshold ranges (depending on the
normalization method). This is useful for filtering ratio data from a
single sample.

2. Changes in genomic identifiers and popup genomic server links
----------------------------------------------------------------

2.1 The underlying convention of associating a Gene with a Clone ID or
GenBank ID has been changed to use a more general "MasterID" as the
master gene index.
MAExplorer determines what genomic IDs are available in your particular
database and then assigns the MasterID to what is actually there,
including a generic Genomic ID assigned by a user. This makes it more
flexible than when it used the Clone ID as the master gene index, since
the Clone ID may not exist in some databases (e.g. Oligo arrays). In
addition, the View menu "Enable display current gene in popup XXXX Web
browser" only shows the options that are actually available, rather
than graying out the commands that you may not use.

2.2 LocusLink has been added to the possible popup Web pages. It is
reached from the LocusID (if it exists) or, as an alternative when the
LocusID is not available, from LocusLink mapped from the GenBank
identifier.

2.3 A new way of dynamically adding genomic Web server browser views
linked to particular genomic identifiers has been added. The views are
specified in the configuration file as lists of (Menu entry/Web server
URL/genomic identifiers).

3. New graphic plots and changes in the pseudo array image
----------------------------------------------------------

3.1 Four new scatter plots were added for comparing individual channels
of Cy3/Cy5 ratio data between HP-X and HP-Y samples. This allows you to
compare either the Cy3 or Cy5 channel of HP-X against the Cy3 or Cy5
channel of HP-Y samples.

3.2 The pseudo array image generation paradigm has been changed so it
now uses a fixed size spot and spacing and makes the underlying pseudo
array image canvas size dependent on the number of grids, rows, columns
and fields. This results in a more consistent interface and does not
require the user to define this region.

3.3 Added a "Pseudocolor Red-Yellow-Green HP-XY plot" as an alternative
visualization of HP-X/HP-Y data. The color is the additive sum of the
HP-X (red) and HP-Y (green) intensities.

4. The Cvt2Mae data format conversion tool
------------------------------------------

A Java stand-alone Cvt2Mae data format conversion tool is being
beta-tested.
This tool lets the user convert their array data to MAExplorer format
so it may be used with MAExplorer. It is described in Appendix C.6 of
the Reference Manual and on the Cvt2Mae home page
(http://www.lecb.ncifcrf.gov/Cvt2Mae), where you may download the
converter to your computer. It currently handles Incyte and Affymetrix
layouts (beta-testing). It is being generalized to allow users to
define and save their own array layouts for their own special arrays or
array data.

5. Improvements on selecting sets of sample hybridizations
----------------------------------------------------------

5.1 The HP-X, HP-Y sets and HP-E list of sample hybridizations may now
be manipulated more easily. The HybProbe menu command "Choose HP-X,
HP-Y and HP-E" popup sample chooser now has additional options to make
it easier to select, remove, and reorder the three sample lists.

5.2 Added HybProbe menu command "Edit use (Cy5/Cy3) else (Cy3/Cy5) for
each HP" for use with ratio data. This selectively swaps (Cy3,Cy5) data
entries so that you may (carefully!) use dye-swap data for replicates.
Added UP/DOWN buttons in the HP chooser to make it easier to adjust the
order of the HP-E list.

6. Clustering methods
---------------------

6.1 Changed the name of the "NPN clustering" method in the menus and
reports to "K-means clustering" since that is essentially what it is.

6.2 Added "SaveAs GeneSets" button to the K-means clustering report.
This lets you save all of the clusters as named Gene sets
("Cluster #1", "Cluster #2", etc.) for further processing. Note that
named gene sets may be renamed and deleted, allowing you to keep the
clusters you want and delete those you don't want to save.

7. Improvements in the gene sets and sample condition lists operations
----------------------------------------------------------------------

7.1 Added commands for renaming Gene Sets (Edit Menu | Sets of Genes |
Rename gene set) and Condition lists (Edit Menu | Sets of Conditions |
Rename condition list).
7.2 Gene set and Condition set operations now let you select the sets
by picking an entry in a selection list as well as by typing in the
names of sets.

8. Misc improvements
--------------------

8.1 Improved HP sample ID reporting in various lists, reports and
plots.

8.2 Fixed problem with popup Web browser for Macintoshes.

8.3 Optimized gene name and identifier sorting on startup so it starts
much faster when working with large arrays.

8.4 Enhanced configuration file parser and sample name menu defaults to
simplify the data that is actually required.

8.5 Added better error checking on input files to help catch some
errors in manually edited user data files and when reading non-standard
data.

8.6 Additional error checking added for input data files to try to
catch bad data errors. This is useful when you are editing the data
files manually.

9. Documentation
----------------

9.1 The Reference Manual and tutorials have been clarified and revised
and new material added.

9.2 A Web page devoted to MAExplorer PDF (Adobe Acrobat) documents has
been set up and is accessible from the MAExplorer home page.

Future enhancements under development
=====================================

* Added functionality in the Cvt2Mae data format conversion tool
* User extensible analysis methods using Java Plugins by having users
  write to a MAExplorer Java plugin API (check for status on the web
  page http://www.lecb.ncifcrf.gov/MAEPlugins)
* New data filtering methods
* New normalization methods
* New clustering methods
* Improved graphics
* Improved documentation and training films
* Access additional genomic Web databases
* Generation of extracted image spot regions for specific genes across
  a set of samples
* etc.

Please contact us with your comments, suggestions and problems in
running MAExplorer at mae@ncifcrf.gov.