MAExplorer Newsletter #2, April 5, 2001
=======================================

This newsletter is issued periodically to announce major changes and
enhancements to the MicroArray Explorer (MAExplorer). Since you have
expressed an interest in MAExplorer in the past, you were included in
this newsletter mailing. If you don't wish to receive further mailings,
let us know and we will remove you from the list. Old newsletters will
be made available in the Web site archive, so if you are interested in
following developments in MAExplorer but don't want the mail, you can
get the information there.

WEB SITE: http://www.lecb.ncifcrf.gov/MAExplorer
CONTACT: mae@ncifcrf.gov
PHONE: 301-846-5535 (Peter Lemkin)
       301-846-5539 (Greg Thornwall)
FAX: 301-846-5598
NEWSLETTER ARCHIVE: http://www.lecb.ncifcrf.gov/MAExplorer/Newsletters/

Some recent enhancements in the current Version 0.89.18
=======================================================

1. Manipulating genes with multiple copies on the array
-------------------------------------------------------
The MAExplorer model has been extended to handle multiple copies
(replicates) of selected genes (as opposed to duplicating each gene in
multiple fields - see discussion in Reference manual). These include:

a) Finding all copies of a named gene in the popup gene guesser using
   the "Set E.G.L." button. This assigns all copies of the specified
   gene(s) to the Edited Gene List.
b) The gene class "Replicate genes" has been added to the
   (Analyze | GeneClass) menu. This lets you quickly find all
   replicated genes using the GeneClass data Filter.
c) Finally, the "Filter by Genes with replicates" data Filter was added
   so it could be used in conjunction with other GeneClass filters
   (e.g. all named genes AND replicated genes).

2. Additional data Filter to handle negative intensity values
-------------------------------------------------------------
Quantified intensity values may be negative for some types of arrays.
MAExplorer can now handle this using:

a) a configuration parameter "allowNegQuantDataFlag" which is set to
   TRUE if the data will contain negative numbers;
b) a new data Filter "Filter by positive quantified data" that filters
   out raw quantified data that contain negative values.

3. Improved data input
----------------------
This includes:

a) fixing several consistency checks when reading non-standard array
   data;
b) ignoring leading and trailing space characters in SamplesDB,
   Configuration and .mae startup files;
c) removing various dependencies in the Configuration and SampleDB
   input files so that missing fields may either be ignored or be
   derived from data fields which are included. (For example, the
   SubMenus parameter need no longer be specified and may be derived
   from the SampleDB data if sub-menus are specified.)

4. Additional genomic database IDs in the GIPO table
----------------------------------------------------
We added (GeneBankAcc, SwissProt) in addition to the previous (dbEST3',
dbEST5', GeneBankAcc3', GeneBankAcc5', Clone_ID, Unigene_ID). If neither
GeneBankAcc3' nor GeneBankAcc5' is present but the GeneBankAcc id is,
MAExplorer will use the latter. If Clone_ID is not present, but
GeneBankAcc is, it will use the latter as the identifier. In addition,
if the GIPO database does not contain some of these genomic identifiers,
then the menus, plots and reports that used them no longer do so. If
Clone_IDs are available, but dbESTs and GeneBank ids are not, then
GeneBank and dbEST may be accessed via the mAdb Clone report or UniGene
report.

5. Various Graphical User Interface improvements
------------------------------------------------
Some of the popup window displays are now updated when the data Filter,
normalization, EGL, or current clone changes.
For example, additional information, such as the correlation coefficient
between HP-X and HP-Y, or between the HP-X and HP-Y 'sets', is computed
for the current set of genes passing the data Filter and is updated in
the scatter plot if the data Filter or normalization method changes.

6. New /Report directory
------------------------
A new Report directory has been added to hold SaveAs text (.txt) and
plot window image (.gif) files. It is created automatically if it does
not exist at the time it is needed. In addition, if a fatal error
(called a DRYROT error) ever occurs, MAExplorer reports it in a text
file which may be saved in the Report directory (it can then be used to
help analyze the problem).

7. Documentation
----------------
The Reference Manual documentation has been updated. In particular:

a) Figures were updated to use Genes instead of the older Clone
   notation (Version 0.88.*).
b) The Short Tutorial (Appendix A) was updated and clarified, including
   a new Short Tutorial Table Of Contents. The latter appears on the
   left when you invoke the tutorial from the MAExplorer home page.
c) The description in Appendix C of how to set up the information files
   needed for use with various arrays has been extensively edited.
d) The Section 3 introduction to data mining with MAExplorer has been
   extended.

A report "Introduction to Data Mining of cDNA Microarrays using the
MicroArray Explorer" is available as a PDF document. This gives a very
brief introduction to data mining microarrays. It then describes some of
the types of operations that one might do when data mining, as well as
how to download and install MAExplorer and how to use it with the
NCI/CIT mAdb microarray database (for registered users).

http://www.lecb.ncifcrf.gov/MAExplorer/PDF/IntroDataMiningWithMAExplorer.pdf

Future enhancements under development
=====================================

* A data conversion tool to import a variety of hybridized array data
  types and automatically generate all files for running MAExplorer.
* New clustering methods
* Extensible analysis methods
* Improved graphics
* Improved documentation and training
* etc.

Please contact us with your comments, suggestions and problems in
running MAExplorer at mae@ncifcrf.gov.
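
Note on item 4 (genomic database IDs): the identifier fallback described
there can be sketched in a few lines of Python. This is only an
illustration and is not taken from the MAExplorer source. The function
names (pick_genbank_acc, pick_clone_identifier), the representation of a
GIPO row as a dictionary, the choice to try GeneBankAcc3' before
GeneBankAcc5', and the sample accession value are all assumptions made
for the example.

```python
from typing import Mapping, Optional, Sequence


def _first_present(row: Mapping[str, Optional[str]],
                   fields: Sequence[str]) -> Optional[str]:
    """Return the first non-blank value among the named fields, else None."""
    for field in fields:
        value = (row.get(field) or "").strip()
        if value:
            return value
    return None


def pick_genbank_acc(row: Mapping[str, Optional[str]]) -> Optional[str]:
    """Use the 3' or 5' accession if present, otherwise fall back to GeneBankAcc.

    Trying 3' before 5' is an assumption; the newsletter does not state an order.
    """
    return _first_present(row, ("GeneBankAcc3'", "GeneBankAcc5'", "GeneBankAcc"))


def pick_clone_identifier(row: Mapping[str, Optional[str]]) -> Optional[str]:
    """Use Clone_ID if present, otherwise fall back to GeneBankAcc."""
    return _first_present(row, ("Clone_ID", "GeneBankAcc"))


if __name__ == "__main__":
    # Hypothetical row: no Clone_ID and no 3'/5' accessions, only GeneBankAcc.
    row = {"Clone_ID": "", "GeneBankAcc": "AA000001"}
    print(pick_genbank_acc(row))
    print(pick_clone_identifier(row))
```

A row that carries none of these fields yields None from both functions,
which corresponds to the case where menus, plots and reports no longer
use the missing identifier.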