MAExplorer Newsletter #3, August 30, 2001
=========================================

This newsletter is issued periodically to announce major changes and
enhancements to the MicroArray Explorer (MAExplorer).

Since you have expressed an interest in MAExplorer in the past, you
were included in this newsletter mailing. If you don't wish to receive
further mailings, let us know and we will remove you from the list.
Old newsletters are available in the Web site archive, so if you are
interested in following developments in MAExplorer but don't want the
E-mail, you can get the information there.

WEB SITE:     http://www.lecb.ncifcrf.gov/MAExplorer
CONTACT:      mae@ncifcrf.gov
PHONE:        301-846-5535 (Peter Lemkin)
              301-846-5539 (Greg Thornwall)
FAX:          301-846-5598
NEWSLETTER 
ARCHIVE:      http://www.lecb.ncifcrf.gov/MAExplorer/Newsletters/



Enhancements in the current Version 0.90.08 since the last Newsletter
===================================================================== 

At the time of newsletter #2 (April 5, 2001), the version was
0.89.18.

There were a number of improvements and new features added as well as
the Beta-release of a data format conversion tool (Cvt2Mae). Details
on these features are described in the Reference Manual.


1. A number of new gene data filters have been added
----------------------------------------------------

1.1 A new data filter "Filter by 'Good Spot data'" was added that may
be used to eliminate bad spot data on a per-gene set basis. It uses
the "QualCheck" field in the quantified data table, if present.
QualCheck may hold 1) an integer numeric code (see Appendix C of the
Reference Manual), 2) an alphabetic code (e.g. Affymetrix "Abs Call")
where "P" (or "G" or "T") maps to Good Spot, "A" (or "B" or "F") to
Bad Spot, and "M" to Marginal Spot, or 3) a continuous quality value.
In the last case, QualCheck is a monotonically increasing floating
point value (e.g. 0.0 to 100.0, or 0.0 to 1.0, etc.) and a "Spot
Quality" State threshold slider will pop up when the filter is
invoked. The NCI/CIT mAdb array database server will be generating
GOOD/BAD (case 1 above) QualCheck data so it can take advantage of
this data filter.
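
The alphabetic and continuous cases above can be sketched in a few
lines of Python. This is only an illustration, not MAExplorer code:
the function name classify_spot and the rule "value >= threshold is
Good" are assumptions, and the integer codes of Appendix C are not
reproduced here.

```python
# Illustrative sketch only; names and the threshold rule are assumptions.
GOOD_CODES = {"P", "G", "T"}
BAD_CODES = {"A", "B", "F"}
MARGINAL_CODES = {"M"}


def classify_spot(qual_check, threshold=None):
    """Return 'GOOD', 'BAD' or 'MARGINAL' for one QualCheck value."""
    if isinstance(qual_check, str):
        # Case 2: alphabetic code such as an Affymetrix "Abs Call".
        code = qual_check.strip().upper()
        if code in GOOD_CODES:
            return "GOOD"
        if code in BAD_CODES:
            return "BAD"
        if code in MARGINAL_CODES:
            return "MARGINAL"
        raise ValueError("unknown QualCheck code: %r" % qual_check)
    if isinstance(qual_check, int):
        # Case 1: integer codes are defined in Appendix C, not here.
        raise NotImplementedError("integer QualCheck codes not covered")
    # Case 3: continuous quality value compared with a slider threshold.
    if threshold is None:
        raise ValueError("a threshold is needed for continuous values")
    return "GOOD" if qual_check >= threshold else "BAD"
```

For example, classify_spot("P") gives "GOOD" and
classify_spot(0.2, threshold=0.5) gives "BAD".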

1.2 Additional change-limit constraints "Compare channels meeting
range" were added to the "Filter by spot intensity [SI1:SI2] sliders"
to allow filtering of spot intensity independently on the Cy3 and Cy5
channels of (Cy3/Cy5) ratio data. This allows filtering by spot
intensity on some or all of the samples by one of (ALL, ANY, AT MOST,
AT LEAST) constraints. For the (AT MOST, AT LEAST) constraints, a new
"Percent SI ok" threshold slider lets you set a fuzzy definition of
present or absent that may be useful in handling noisy low intensity
spots that have corresponding spots that are present.
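
One possible reading of the four constraints, for one channel of one
gene across several samples, is sketched below in Python. The
function name passes_range and the interpretation of "Percent SI ok"
as the percentage of samples whose intensity lies in [SI1:SI2] are
assumptions made for illustration; see the Reference Manual for the
actual definition.

```python
# Illustrative sketch only; the percent semantics are an assumption.
def passes_range(intensities, si1, si2, mode, percent_ok=100.0):
    """Test one gene's intensities (one value per sample) against
    the [si1:si2] range under the given constraint mode."""
    if not intensities:
        return False
    n_ok = sum(1 for v in intensities if si1 <= v <= si2)
    pct = 100.0 * n_ok / len(intensities)
    if mode == "ALL":
        return n_ok == len(intensities)
    if mode == "ANY":
        return n_ok > 0
    if mode == "AT LEAST":
        return pct >= percent_ok
    if mode == "AT MOST":
        return pct <= percent_ok
    raise ValueError("unknown mode: %r" % mode)
```

With intensities [100, 200, 5000] and range [50:1000], two of three
samples are in range, so "ANY" passes, "ALL" fails, and "AT LEAST"
with percent_ok=60 passes.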

1.3 Some of the State Scrollers were too sensitive when setting low
threshold values. These now have a non-linear range vernier control at
the low end.  The dynamic range of some of the scrollers has been 
increased making it possible to set the desired values over a
wider range.

1.4 A data filter "Filter by positive intensity data" has been added
that allows the user to use either "positive only" or "positive or
negative" intensity data (this may be used with Affymetrix or other
data that contains negative values). It is only available for data
that contains negative numbers.

1.5 A data filter "Filter by Cy3/Cy5 HP-X ratio or Zdiff sliders" has
been added that filters genes by ratios or Zdiff values within
[CR1:CR2] or [CZ1:CZ2] threshold ranges (depending on the
normalization method). This is useful for filtering ratio data from a
single sample.
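
As a rough illustration, the range test amounts to keeping the genes
whose value lies between the two slider settings. The Python sketch
below assumes the per-gene ratios are held in a dictionary; the name
filter_by_ratio and the inclusive bounds are assumptions.

```python
# Illustrative sketch only; data layout and names are assumptions.
def filter_by_ratio(ratios, cr1, cr2):
    """Return the genes whose ratio (or Zdiff) lies in [cr1:cr2].

    ratios maps a gene name to its Cy3/Cy5 ratio or Zdiff value.
    """
    return [gene for gene, r in ratios.items() if cr1 <= r <= cr2]
```

For example, filter_by_ratio({"g1": 0.5, "g2": 2.0, "g3": 4.5},
1.5, 4.0) keeps only "g2".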


2. Changes in genomic identifiers and popup genomic server links
----------------------------------------------------------------

2.1 The underlying convention of associating a Gene with a Clone ID or
GenBank ID has been changed to now use a more general "MasterID" as
the master gene index. MAExplorer determines what genomic IDs are
available in your particular database and then assigns the MasterID
to what is actually there - including a generic Genomic ID assigned
by a user. This is more flexible than using the Clone ID as the
master gene index, since the Clone ID may not exist in some databases
(e.g. Oligo arrays). In addition, the View menu "Enable display
current gene in popup XXXX Web browser" only shows the options that
are actually available - rather than graying out the commands that
you may not use.
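
The idea of assigning the MasterID to whatever identifier is present
can be sketched as a simple priority search. The field names and the
priority order below are hypothetical and are not taken from
MAExplorer; they only illustrate the fallback behavior.

```python
# Illustrative sketch only; field names and order are hypothetical.
ID_PRIORITY = ("Clone ID", "GenBank ID", "LocusID", "Generic ID")


def choose_master_id(available_fields):
    """Return the first identifier type found in the database,
    or None if none of the known types is present."""
    for name in ID_PRIORITY:
        if name in available_fields:
            return name
    return None
```

A database with no Clone ID column, e.g. {"GenBank ID", "Generic ID"},
would then use "GenBank ID" as its MasterID.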

2.2 LocusLink has been added to the possible popup Web pages. It is
reached from the LocusID if it exists, or otherwise from the GenBank
identifier mapped to LocusLink.

2.3 A new way of dynamically adding genomic Web server browser views
linked to particular genomic identifiers has been added. The views
are specified in the configuration file as lists of (Menu entry/Web
server URL/genomic identifiers).
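
The Python sketch below shows how such a (Menu entry/Web server
URL/genomic identifier) triple might be turned into a browser link
for the current gene. It does not reproduce the actual configuration
file syntax; the tuple layout, the example.org URL, and the name
build_url are all placeholders chosen for illustration.

```python
# Illustrative sketch only; entry layout and URL are placeholders.
from urllib.parse import quote


def build_url(entry, gene):
    """Build the popup URL for one menu entry and one gene.

    entry is (menu label, base URL, identifier field name);
    gene maps identifier field names to values.
    Returns None when the gene lacks the needed identifier.
    """
    _menu_label, base_url, id_field = entry
    value = gene.get(id_field)
    if not value:
        return None
    return base_url + quote(str(value))
```

For example, the entry ("Demo server", "https://example.org/q?id=",
"GenBank ID") and a gene with GenBank ID "AA123456" give
"https://example.org/q?id=AA123456".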


3. New graphic plots and changes in the pseudo array image
----------------------------------------------------------

3.1 Four new scatter plots were added for comparing individual
channels of Cy3/Cy5 ratio data between HP-X and HP-Y samples. This
allows you to compare either the Cy3 or Cy5 channel of HP-X against
the Cy3 or Cy5 channel of HP-Y samples.

3.2 The pseudo array image generation paradigm has been changed so it
now uses a fixed size spot and spacing and makes the underlying pseudo
array image canvas size dependent on the number of grids, rows,
columns and fields. This results in a more consistent interface and
does not require the user to define this region.

3.3 Added a "Pseudocolor Red-Yellow-Green HP-XY plot" as an
alternative visualization of HP-X/HP-Y data. The color is the additive
sum of the HP-X (red) and HP-Y (green) intensities.
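
The additive color mix can be sketched as follows in Python. Scaling
both intensities linearly to 0..255 by a common maximum is an
assumption; MAExplorer may scale differently. Equal HP-X and HP-Y
intensities give yellow, HP-X alone gives red, and HP-Y alone gives
green.

```python
# Illustrative sketch only; the linear 0..255 scaling is an assumption.
def pseudocolor(hp_x, hp_y, max_intensity):
    """Return an (R, G, B) tuple: red from HP-X, green from HP-Y."""
    if max_intensity <= 0:
        raise ValueError("max_intensity must be positive")

    def scale(v):
        # Clamp to the displayable 0..255 range.
        return max(0, min(255, round(255.0 * v / max_intensity)))

    return (scale(hp_x), scale(hp_y), 0)
```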


4. The Cvt2Mae data format conversion tool
------------------------------------------

A Java stand-alone Cvt2Mae data format conversion tool is being
beta-tested. This tool lets the user convert their array data to
MAExplorer format so it may be used with MAExplorer. This is described
in the Reference Manual Appendix C.6 and the Cvt2Mae home page
(http://www.lecb.ncifcrf.gov/Cvt2Mae) where you may download the
converter to your computer. It currently handles Incyte and Affymetrix
layouts (beta-testing). It is being generalized to allow users to
define and save their own array layouts for their own special arrays
or array data.


5. Improvements on selecting sets of sample hybridizations
----------------------------------------------------------

5.1 The HP-X, HP-Y sets and HP-E list of sample hybridizations may now
be manipulated more easily. The HybProbe menu command "Choose HP-X,
HP-Y and HP-E" popup sample chooser now has additional options to make
it easier to select, remove, and reorder the three sample lists.

5.2 Added HybProbe menu command "Edit use (Cy5/Cy3) else (Cy3/Cy5) for
each HP" for use with ratio data. This selectively swaps (Cy3,Cy5)
data entries so you may use (carefully!) dye-swap data for
replicates. Added UP/DOWN buttons in the HP chooser to make it easier
to adjust the order of the HP-E list.
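
The per-sample swap can be pictured with the short Python sketch
below. The data layout (a dictionary of sample name to a list of
(Cy3, Cy5) pairs) and the name apply_dye_swap are assumptions made
for illustration only.

```python
# Illustrative sketch only; data layout and names are assumptions.
def apply_dye_swap(samples, swap_flags):
    """Return a copy of samples with (Cy3, Cy5) exchanged for every
    sample whose flag in swap_flags is True.

    samples maps a sample name to a list of (cy3, cy5) pairs.
    """
    result = {}
    for name, spots in samples.items():
        if swap_flags.get(name, False):
            result[name] = [(cy5, cy3) for cy3, cy5 in spots]
        else:
            result[name] = list(spots)
    return result
```

After the swap, a ratio computed as first/second for a flagged sample
is Cy5/Cy3 instead of Cy3/Cy5, which puts a dye-swapped replicate in
the same orientation as its partner.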


6. Clustering methods
---------------------

6.1 Changed the name of the "NPN clustering" method in the menus and
reports to "K-means clustering" since that is essentially what it is.

6.2 Added "SaveAs GeneSets" button to K-means clustering report. This lets
you save all of the clusters as named Gene sets ("Cluster #1",
"Cluster #2", etc.) for further processing. Note that named gene sets
may be renamed and deleted allowing you to keep the clusters you want
and delete those you don't want to save.
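
The naming scheme can be illustrated with a small Python sketch that
groups genes by cluster number into sets named "Cluster #1",
"Cluster #2", etc. The input layout (gene name to cluster number) and
the name clusters_to_gene_sets are assumptions, not MAExplorer
internals.

```python
# Illustrative sketch only; data layout and names are assumptions.
def clusters_to_gene_sets(assignments):
    """Group genes into named gene sets, one per cluster.

    assignments maps a gene name to its cluster number (1, 2, ...).
    """
    gene_sets = {}
    for gene, cluster in assignments.items():
        gene_sets.setdefault("Cluster #%d" % cluster, []).append(gene)
    return gene_sets
```

For example, {"g1": 1, "g2": 2, "g3": 1} becomes
{"Cluster #1": ["g1", "g3"], "Cluster #2": ["g2"]}.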


7. Improvements in the gene sets and sample condition lists operations
----------------------------------------------------------------------

7.1 Added commands for renaming Gene Sets (Edit Menu | Sets of Genes |
Rename gene set) and Condition lists (Edit Menu | Sets of Conditions |
Rename condition list).

7.2 Gene set and Condition set operations now let you select the sets
by picking an entry in a selection list as well as by typing in the
names of sets.



8. Misc improvements
--------------------

8.1 Improved HP sample ID reporting in various lists, reports and
plots.

8.2 Fixed problem with popup Web browser for Macintoshes.

8.3 Optimized gene name and identifier sorting on startup so it starts
much faster when working with large arrays.

8.4 Enhanced configuration file parser and sample name menu defaults
to simplify the data that is actually required.

8.5 Added better error checking on input files to help catch some
errors in manually edited user data files and when reading
non-standard data.

8.6 Additional error checking added for input data files to try to
catch bad data errors. This is useful when you are editing the data 
files manually.


9. Documentation
----------------

9.1 The Reference Manual and tutorials have been clarified and revised
and new material added.

9.2 A Web page devoted to MAExplorer PDF (Adobe Acrobat) documents
has been set up and is accessible from the MAExplorer home page.


Future enhancements under development
=====================================

 * Added functionality in the Cvt2Mae data format conversion tool

 * User extensible analysis methods using Java Plugins by having users
   write to a MAExplorer Java plugin API (check for status on the web page
   http://www.lecb.ncifcrf.gov/MAEPlugins)

 * New data filtering methods

 * New normalization methods

 * New clustering methods

 * Improved graphics

 * Improved documentation and training films

 * Access additional genomic Web databases

 * Generation of extracted image spot regions for specific genes
   across a set of samples

 * etc.

Please contact us with your comments, suggestions and problems in
running MAExplorer at mae@ncifcrf.gov.