Class MJAstatistics

java.lang.Object
  |
  +--MJAbase
        |
        +--MJAstatistics

public class MJAstatistics
extends MJAbase

MAExplorer Gather Scatter API class that provides access to the statistics methods and their data structures.

List of methods available to Plugin-writers

 get_f() - CALC: calculated f statistic 
 get_t() - CALC: t or t' statistic previously computed 
 get_pT() - CALC: t-test p-value w/NULL hypoth previously computed 
 get_pF() - CALC: f-test p-value w/NULL hypoth previously computed 
 get_fStat() - CALC: f-statistic previously computed  
 get_dF() - CALC: degrees of freedom previously computed  

 get_useTest() - CALC: 'B' or 'T' - t-test to use, previously computed 
 get_title() - title for data used in histogram previously computed  
 get_meanIdx() - index of mean in hist[] previously computed 
 get_medianIdx() - index of median in hist[] previously computed 
 get_modeIdx() - index of mode in hist[] previously computed  
 get_nBinsH() - 0 if none. size of histogram previously computed  
 get_hist() - histogram of size [0:nBinsH-1] previously computed  
 get_medianH() - median of data[] for histogram previously computed 
 get_modeH() - mode of data[] for histogram previously computed   
 get_meanH() - mean of data[] for histogram previously computed   
 get_stdDevH() - standard deviation of data[] for histogram 
 get_meanAbsDevH() - mean absolute deviation for histogram 
 get_minDataH() - min value in data[] for histogram previously computed  
 get_maxDataH() - max value in data[] for histogram previously computed  
 get_deltaBinH() - width of the histogram bins

 get_nCond_pValue() - get p-value computed in N-condition F-test
 get_nCondFstat() - get f-statistic computed in N-condition F-test
 get_nCondMeanSqWithinVariance() - get mnSqWithin variance computed in N-condition F-test
 get_nCondMeanSqBetweenVariance() - get mnSqBetween variance computed in N-condition F-test 
 get_nCond_dfWithin() - get dfWithin (deg. of freedom) computed in N-condition F-test 
 get_nCond_dfBetween() - get dfBetween (deg. of freedom) computed in N-condition F-test 
 get_nConditions() - get nConditions used in computation of N-condition F-test
 get_conditionsData() - get samples for each condition for N-cond F-test
 get_nSamplesAllConditions() - # samples for each condition for N-cond F-test
 get_meansAllConditions() - get means of conditions for N-cond F-test
 get_varianceAllConditions() - get variances of conditions for N-cond F-test
 get_stdDevAllConditions() - get std devs of conditions for N-cond F-test
 --------------------------------------------------------
 calcMeanAndVariance() - compute mean & var of dataS[] 
 calcNCondFtestStat() - calc. F-test statistics of data[0:nConditions-1][samples]
 calcFprobFromVariances() - calc 2-tailed f prob. that vars. are same.
 calcTandPvalues() - given (n1,m1,s1) and (n2,m2,s2), calc f, t, p, dF.
 calcHistStats() - compute and analyze histogram.
 calcHistStats() - compute and analyze histogram.

This work was produced by Peter Lemkin of the National Cancer Institute, an agency of the United States Government. As a work of the United States Government there is no associated copyright. It is offered as open source software under the Mozilla Public License (version 1.1) subject to the limitations noted in the accompanying LEGAL file. This notice must be included with the code. The MAExplorer Mozilla and Legal files are available on http://maexplorer.sourceforge.net/.

Version:
$Date: 2003/02/18 18:02:33 $ $Revision: $
Author:
P. Lemkin (NCI), J. Evans (CIT), C. Santos (CIT), G. Thornwall (SAIC), NCI-Frederick, Frederick, MD
See Also:
MAExplorer Home


Fields inherited from class MJAbase
COMPARE_ALL, COMPARE_ANY, COMPARE_AT_LEAST, COMPARE_AT_MOST, COMPARE_PRODUCT, COMPARE_SUM, DATA_F1TOT, DATA_F2TOT, DATA_MEAN_F1F2TOT, DATA_RATIO_F1F2TOT, DRAW_BIN, DRAW_BOX, DRAW_CIRCLE, DRAW_PLUS, EDIT_ADD, EDIT_NOP, EDIT_RMV, GENE_ATCC_ID, GENE_BAD_DATA, GENE_BAD_LOCAL_SPOT_BKGRD, GENE_BAD_MID, GENE_BAD_SPOT, GENE_BAD_SPOT_GEOMETRY, GENE_DUP_SPOT, GENE_GOOD_MID, GENE_IMAGE_ID, GENE_IS_CUR_GENE, GENE_IS_EGL_GENE, GENE_IS_FILTERED, GENE_IS_KMEANS, GENE_IS_NOT_FILTERED, GENE_LOW_SPOT_REF_SIGNAL, GENE_MARGINAL_SPOT, GENE_USE_GBID_FOR_CLONEID, HIER_CLUST_NEXT_MIN_LNKG, HIER_CLUST_PGMA_LNKG, HIER_CLUST_PGMC_LNKG, MARKER_CIRCLE, MARKER_CURRENT, MARKER_GENES, MARKER_KMEANS_CLUSTER, MARKER_NONE, MARKER_PLUS, MARKER_SQUARE, MASTER_CLONE_ID, MASTER_DBEST3, MASTER_DBEST5, MASTER_GENBANK, MASTER_GENBANK3, MASTER_GENBANK5, MASTER_GENE_NAME, MASTER_GENERIC_ID, MASTER_LOCUSLINK, MASTER_SWISS_PROT, MASTER_UG_ID, MASTER_UG_NAME, MAX_COLORS, PLOT_CLUSTER_GENES, PLOT_CLUSTER_HIER, PLOT_CLUSTER_HYBSAMPLES, PLOT_CLUSTERGRAM, PLOT_EXPR_PROFILE, PLOT_F1_F2_INTENS, PLOT_F1_F2_MVSA, PLOT_HIST_F1F2_RATIO, PLOT_HIST_HP_XY_RATIO, PLOT_HIST_HP_XY_SETS_RATIO, PLOT_HP_XY_INTENS, PLOT_INTENS_HIST, PLOT_KMEANS_CLUSTERGRAM, PLOT_PSEUDO_F1F2_IMG, PLOT_PSEUDO_F1F2_RYG_IMG, PLOT_PSEUDO_HP_XY_IMG, PLOT_PSEUDO_HP_XY_RYG_IMG, PLOT_PSEUDOIMG, PRPROP_CUR_GENE, PRPROP_FILTER, PRPROP_LABEL, PRPROP_SLIDER, PRPROP_TIMEOUT, PRPROP_UNIQUE, QUALTYPE_ALPHA, QUALTYPE_PROP_CODE, QUALTYPE_THR, RANGE_INSIDE, RANGE_OUTSIDE, RPT_FMT_DYN, RPT_FMT_TAB_DELIM, RPT_NONE, RPT_TBL_ALL_GENES_CLUSTER, RPT_TBL_CALIB_DNA_STAT, RPT_TBL_CUR_GENE_CLUSTER, RPT_TBL_EDITED_GENE_LIST, RPT_TBL_EXPR_PROFILE, RPT_TBL_FILTERED_GENES, RPT_TBL_GENE_CLASS, RPT_TBL_HIER_CLUSTER, RPT_TBL_HIGH_F1F2, RPT_TBL_HIGH_RATIO, RPT_TBL_HP_DB_INFO, RPT_TBL_HP_HP_CORR, RPT_TBL_HP_MN_VAR_STAT, RPT_TBL_HP_XY_SET_STAT, RPT_TBL_KMEANS_CLUSTER, RPT_TBL_LOW_F1F2, RPT_TBL_LOW_RATIO, RPT_TBL_MAE_PRJ_DB, RPT_TBL_MN_KMEANS_CLUSTER, RPT_TBL_NAMED_GENES, 
RPT_TBL_NORMALIZATION_GENE_LIST, RPT_TBL_OCL_STAT, RPT_TBL_SAMPLES_DB_INFO, RPT_TBL_SAMPLES_WEB_LINKS, SS_MODE_ELIST, SS_MODE_MS, SS_MODE_XANDY_SETS, SS_MODE_XORY_SETS, SS_MODE_XSET, SS_MODE_XY, SS_MODE_YSET
 
Method Summary
 boolean calcFprobFromVariances(int n1, int n2, double var1, double var2)
          calcFprobFromVariances() - calc 2-tailed f prob. that vars. are same.
 int calcHistStats(java.lang.String title, int nBins, float[] data, int nData)
          calcHistStats() - compute and analyze histogram.
 int calcHistStats(java.lang.String title, int nBins, float[] data, int nData, int[] hist)
          calcHistStats() - compute and analyze histogram.
 boolean calcMeanAndVariance(float[] dataS, int nSamples, int classK, boolean calcStdDevFlag)
          calcMeanAndVariance() - compute the mean and variance of dataS[] and save in mean[classK] and variance[classK] arrays.
 boolean calcNCondFtestStat(float[][] data, int[] nData, int nConditions)
          calcNCondFtestStat() - calc. F-test statistics of data[0:nConditions-1][samples]
 boolean calcTandPvalues(int n1, int n2, double m1, double m2, double s1, double s2)
          calcTandPvalues() - given (n1,m1,s1) and (n2,m2,s2), calc f, t, p, dF.
 float[][] get_conditionsData()
          get_conditionsData() - data[0:nConditions-1][sampleNbrInClass] for computations of N-condition F-test
 float get_deltaBinH()
          get_deltaBinH() - width of the histogram bins computed as: nBinsH/(maxDataH-minDataH) previously computed
 double get_dF()
          get_dF() - CALC: degrees of freedom previously computed
 double get_f()
          get_f() - CALC: calculated f statistic previously computed
 double get_fStat()
          get_fStat() - CALC: f-statistic previously computed
 int[] get_hist()
          get_hist() - histogram of size [0:nBinsH-1] previously computed
 float get_maxDataH()
          get_maxDataH() - max value in data[] for histogram previously computed
 float get_meanAbsDevH()
          get_meanAbsDevH() - mean absolute deviation for histogram previously computed
 float get_meanH()
          get_meanH() - mean of data[] for histogram previously computed
 int get_meanIdx()
          get_meanIdx() - index of mean in hist[] previously computed
 double[] get_meansAllConditions()
          get_meansAllConditions() - means of samples in each of the nConditions used in computation of N-condition F-test
 float get_medianH()
          get_medianH() - median of data[] for histogram previously computed
 int get_medianIdx()
          get_medianIdx() - index of median in hist[] previously computed
 float get_minDataH()
          get_minDataH() - min value in data[] for histogram previously computed
 float get_modeH()
          get_modeH() - mode of data[] for histogram previously computed
 int get_modeIdx()
          get_modeIdx() - index of mode in hist[] previously computed
 int get_nBinsH()
          get_nBinsH() - 0 if none. size of histogram previously computed
 double get_nCond_dfBetween()
          get_nCond_dfBetween() - get dfBetween degrees of freedom used in computation of N-condition F-test
 double get_nCond_dfWithin()
          get_nCond_dfWithin() - get dfWithin degrees of freedom used in computation of N-condition F-test
 double get_nCond_pValue()
          get_nCond_pValue() - get p-value computed of N-condition F-test
 double get_nCondFstat()
          get_nCondFstat() - get f-statistic computed in N-condition F-test
 int get_nConditions()
          get_nConditions() - get nConditions used in computation of N-condition F-test
 double get_nCondMeanSqBetweenVariance()
          get_nCondMeanSqBetweenVariance() - get mnSqBetween variance used in computation of N-condition F-test
 double get_nCondMeanSqWithinVariance()
          get_nCondMeanSqWithinVariance() - get mnSqWithin variance used in computation of N-condition F-test
 int[] get_nSamplesAllConditions()
          get_nSamplesAllConditions() - # of samples in each of the nConditions used in computation of N-condition F-test
 double get_pF()
          get_pF() - CALC: f-test p-value w/NULL hypoth previously computed
 double get_pT()
          get_pT() - CALC: t-test p-value w/NULL hypoth previously computed
 double[] get_stdDevAllConditions()
          get_stdDevAllConditions() - stdDev of samples in each of the nConditions used in computation of N-condition F-test
 float get_stdDevH()
          get_stdDevH() - standard deviation of data[] for histogram previously computed
 double get_t()
          get_t() - CALC: t or t' statistic previously computed
 java.lang.String get_title()
          get_title() - title for data used in histogram previously computed
 char get_useTest()
          get_useTest() - CALC: 'B' or 'T' - t-test to use, previously computed using the F-statistic
 double[] get_varianceAllConditions()
          get_varianceAllConditions() - variance of samples in each of the nConditions used in computation of N-condition F-test
 
Methods inherited from class MJAbase
cvtHashtable2SimpleTable, cvtTable2Hashtable
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

get_f

public final double get_f()
get_f() - CALC: calculated f statistic previously computed
See Also:
calcFprobFromVariances(int, int, double, double), calcTandPvalues(int, int, double, double, double, double)

get_t

public final double get_t()
get_t() - CALC: t or t' statistic previously computed
See Also:
calcTandPvalues(int, int, double, double, double, double)

get_pT

public final double get_pT()
get_pT() - CALC: t-test p-value w/NULL hypoth previously computed
See Also:
calcTandPvalues(int, int, double, double, double, double)

get_pF

public final double get_pF()
get_pF() - CALC: f-test p-value w/NULL hypoth previously computed
See Also:
calcFprobFromVariances(int, int, double, double), calcTandPvalues(int, int, double, double, double, double)

get_fStat

public final double get_fStat()
get_fStat() - CALC: f-statistic previously computed
See Also:
calcFprobFromVariances(int, int, double, double), calcTandPvalues(int, int, double, double, double, double)

get_dF

public final double get_dF()
get_dF() - CALC: degrees of freedom previously computed
See Also:
calcTandPvalues(int, int, double, double, double, double)

get_useTest

public final char get_useTest()
get_useTest() - CALC: 'B' or 'T' - t-test to use, previously computed using the F-statistic
See Also:
calcTandPvalues(int, int, double, double, double, double)

get_title

public final java.lang.String get_title()
get_title() - title for data used in histogram previously computed

get_meanIdx

public final int get_meanIdx()
get_meanIdx() - index of mean in hist[] previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_medianIdx

public final int get_medianIdx()
get_medianIdx() - index of median in hist[] previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_modeIdx

public final int get_modeIdx()
get_modeIdx() - index of mode in hist[] previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_nBinsH

public final int get_nBinsH()
get_nBinsH() - 0 if none. size of histogram previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_hist

public final int[] get_hist()
get_hist() - histogram of size [0:nBinsH-1] previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_medianH

public final float get_medianH()
get_medianH() - median of data[] for histogram previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_modeH

public final float get_modeH()
get_modeH() - mode of data[] for histogram previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_meanH

public final float get_meanH()
get_meanH() - mean of data[] for histogram previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_stdDevH

public final float get_stdDevH()
get_stdDevH() - standard deviation of data[] for histogram previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_meanAbsDevH

public final float get_meanAbsDevH()
get_meanAbsDevH() - mean absolute deviation for histogram previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_minDataH

public final float get_minDataH()
get_minDataH() - min value in data[] for histogram previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_maxDataH

public final float get_maxDataH()
get_maxDataH() - max value in data[] for histogram previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_deltaBinH

public final float get_deltaBinH()
get_deltaBinH() - width of the histogram bins computed as: nBinsH/(maxDataH-minDataH) previously computed
See Also:
calcHistStats(java.lang.String, int, float[], int)

get_nCond_pValue

public final double get_nCond_pValue()
get_nCond_pValue() - get p-value computed of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_nCondFstat

public final double get_nCondFstat()
get_nCondFstat() - get f-statistic computed in N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_nCondMeanSqWithinVariance

public final double get_nCondMeanSqWithinVariance()
get_nCondMeanSqWithinVariance() - get mnSqWithin variance used in computation of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_nCondMeanSqBetweenVariance

public final double get_nCondMeanSqBetweenVariance()
get_nCondMeanSqBetweenVariance() - get mnSqBetween variance used in computation of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_nCond_dfWithin

public final double get_nCond_dfWithin()
get_nCond_dfWithin() - get dfWithin degrees of freedom used in computation of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_nCond_dfBetween

public final double get_nCond_dfBetween()
get_nCond_dfBetween() - get dfBetween degrees of freedom used in computation of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_nConditions

public final int get_nConditions()
get_nConditions() - get nConditions used in computation of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_conditionsData

public final float[][] get_conditionsData()
get_conditionsData() - data[0:nConditions-1][sampleNbrInClass] for computations of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_nSamplesAllConditions

public final int[] get_nSamplesAllConditions()
get_nSamplesAllConditions() - # of samples in each of the nConditions used in computation of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_meansAllConditions

public final double[] get_meansAllConditions()
get_meansAllConditions() - means of samples in each of the nConditions used in computation of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_varianceAllConditions

public final double[] get_varianceAllConditions()
get_varianceAllConditions() - variance of samples in each of the nConditions used in computation of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

get_stdDevAllConditions

public final double[] get_stdDevAllConditions()
get_stdDevAllConditions() - stdDev of samples in each of the nConditions used in computation of N-condition F-test
See Also:
calcMeanAndVariance(float[], int, int, boolean)

calcMeanAndVariance

public boolean calcMeanAndVariance(float[] dataS,
                                   int nSamples,
                                   int classK,
                                   boolean calcStdDevFlag)
calcMeanAndVariance() - compute the mean and variance of dataS[] and save in mean[classK] and variance[classK] arrays. Also save the dataS in data[classK] and nSamples in nData[classK].
Parameters:
dataS - array of size [0:nSamples-1] of data
nSamples - size of dataS
classK - is the class # associated with this data (start at 0)
calcStdDevFlag - if true, also compute stdDev[classK]
Returns:
true if successful
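
As an illustration of the computation described above, the following is a minimal standalone sketch, not the MAExplorer implementation. The class name MeanVarianceSketch and the method meanVarianceStdDev() are hypothetical, and dividing by (nSamples-1) for the variance is an assumption, since the source does not say which divisor is used.

```java
final class MeanVarianceSketch {

    /**
     * Returns {mean, variance, stdDev} for dataS[0:nSamples-1],
     * or null when the input is unusable (assumed to need >= 2 samples).
     */
    static double[] meanVarianceStdDev(float[] dataS, int nSamples) {
        if (dataS == null || nSamples < 2 || nSamples > dataS.length) {
            return null;
        }
        double sum = 0.0;
        for (int i = 0; i < nSamples; i++) {
            sum += dataS[i];
        }
        double mean = sum / nSamples;

        double sumSqDev = 0.0;
        for (int i = 0; i < nSamples; i++) {
            double d = dataS[i] - mean;
            sumSqDev += d * d;
        }
        // Assumption: unbiased sample variance (divide by n-1).
        double variance = sumSqDev / (nSamples - 1);
        return new double[] {mean, variance, Math.sqrt(variance)};
    }

    public static void main(String[] args) {
        float[] dataS = {2f, 4f, 4f, 4f, 5f, 5f, 7f, 9f};
        double[] r = meanVarianceStdDev(dataS, dataS.length);
        System.out.println("mean=" + r[0] + " variance=" + r[1] + " stdDev=" + r[2]);
    }
}
```

In the class documented here the results are stored per class index (mean[classK], variance[classK]) rather than returned; the sketch returns them only to stay self-contained.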

calcNCondFtestStat

public boolean calcNCondFtestStat(float[][] data,
                                  int[] nData,
                                  int nConditions)
calcNCondFtestStat() - calc. F-test statistics of data[0:nConditions-1][samples]. It computes:
    pFnConds - p value
    fStatNconds - f statistic
    mnSqWithin - mean within class variance
    mnSqBetween - mean between class variance
    dF1 - degrees of freedom df1
    dF2 - degrees of freedom df2
 
Parameters:
data - sample data[nConditions][sampleNbrInCondition]
nData - # samples in each [nConditions]
nConditions - # of conditions
Returns:
false if any of the data is invalid (need >1 sample/Condition)
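
The quantities listed above correspond to a standard one-way analysis of variance. The following is a minimal sketch of that textbook calculation, assuming this is what the N-condition F-test computes; the class name OneWayAnovaSketch and the method fTest() are hypothetical, and the p-value step is left out.

```java
final class OneWayAnovaSketch {

    /**
     * Returns {fStat, mnSqWithin, mnSqBetween, dfBetween, dfWithin},
     * or null if there are fewer than 2 conditions or any condition
     * has fewer than 2 samples.
     */
    static double[] fTest(float[][] data, int[] nData, int nConditions) {
        if (data == null || nData == null || nConditions < 2) {
            return null;
        }
        double[] means = new double[nConditions];
        double grandSum = 0.0;
        int nTotal = 0;
        for (int k = 0; k < nConditions; k++) {
            if (nData[k] < 2) {
                return null;
            }
            double sum = 0.0;
            for (int i = 0; i < nData[k]; i++) {
                sum += data[k][i];
            }
            means[k] = sum / nData[k];
            grandSum += sum;
            nTotal += nData[k];
        }
        double grandMean = grandSum / nTotal;

        double ssBetween = 0.0;
        double ssWithin = 0.0;
        for (int k = 0; k < nConditions; k++) {
            double dm = means[k] - grandMean;
            ssBetween += nData[k] * dm * dm;
            for (int i = 0; i < nData[k]; i++) {
                double d = data[k][i] - means[k];
                ssWithin += d * d;
            }
        }
        double dfBetween = nConditions - 1;
        double dfWithin = nTotal - nConditions;
        double mnSqBetween = ssBetween / dfBetween;
        double mnSqWithin = ssWithin / dfWithin;
        // Note: fStat is infinite or NaN when every condition has zero variance.
        double fStat = mnSqBetween / mnSqWithin;
        return new double[] {fStat, mnSqWithin, mnSqBetween, dfBetween, dfWithin};
    }

    public static void main(String[] args) {
        float[][] data = {{1f, 2f, 3f}, {2f, 3f, 4f}, {5f, 6f, 7f}};
        double[] r = fTest(data, new int[] {3, 3, 3}, 3);
        System.out.println("F=" + r[0] + " mnSqWithin=" + r[1] + " mnSqBetween=" + r[2]);
    }
}
```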

calcFprobFromVariances

public final boolean calcFprobFromVariances(int n1,
                                            int n2,
                                            double var1,
                                            double var2)
calcFprobFromVariances() - calc 2-tailed f prob. that vars. are same. It computes:
    fStat - the f-statistic
    pF    - probability that the two variances are the same
Parameters:
n1 - # samples class 1
n2 - # samples class 2
var1 - variance of class 1
var2 - variance of class 2
Returns:
false if any problems.
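
A two-tailed F probability of this kind can be obtained from the regularized incomplete beta function. The sketch below follows the usual textbook formulation (larger variance in the numerator, probability doubled and folded back into [0,1]); whether MAExplorer uses exactly this formulation is an assumption. The class name FProbSketch and all method names in it are hypothetical.

```java
final class FProbSketch {

    /** ln(Gamma(x)) for x > 0 via a Lanczos approximation. */
    static double gammln(double x) {
        final double[] cof = {76.18009172947146, -86.50532032941677, 24.01409824083091,
                              -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5};
        double y = x;
        double tmp = x + 5.5;
        tmp -= (x + 0.5) * Math.log(tmp);
        double ser = 1.000000000190015;
        for (double c : cof) {
            ser += c / ++y;
        }
        return -tmp + Math.log(2.5066282746310005 * ser / x);
    }

    /** Continued fraction for the incomplete beta function (modified Lentz method). */
    static double betacf(double a, double b, double x) {
        final double eps = 1e-12, tiny = 1e-300;
        double qab = a + b, qap = a + 1.0, qam = a - 1.0;
        double c = 1.0;
        double d = 1.0 - qab * x / qap;
        if (Math.abs(d) < tiny) d = tiny;
        d = 1.0 / d;
        double h = d;
        for (int m = 1; m <= 300; m++) {
            int m2 = 2 * m;
            double aa = m * (b - m) * x / ((qam + m2) * (a + m2));   // even step
            d = 1.0 + aa * d; if (Math.abs(d) < tiny) d = tiny;
            c = 1.0 + aa / c; if (Math.abs(c) < tiny) c = tiny;
            d = 1.0 / d;
            h *= d * c;
            aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)); // odd step
            d = 1.0 + aa * d; if (Math.abs(d) < tiny) d = tiny;
            c = 1.0 + aa / c; if (Math.abs(c) < tiny) c = tiny;
            d = 1.0 / d;
            double del = d * c;
            h *= del;
            if (Math.abs(del - 1.0) < eps) break;
        }
        return h;
    }

    /** Regularized incomplete beta function I_x(a,b) for 0 <= x <= 1. */
    static double betai(double a, double b, double x) {
        if (x <= 0.0) return 0.0;
        if (x >= 1.0) return 1.0;
        double bt = Math.exp(gammln(a + b) - gammln(a) - gammln(b)
                             + a * Math.log(x) + b * Math.log(1.0 - x));
        if (x < (a + 1.0) / (a + b + 2.0)) {
            return bt * betacf(a, b, x) / a;
        }
        return 1.0 - bt * betacf(b, a, 1.0 - x) / b;
    }

    /** Returns {fStat, pF}, or null if the inputs are unusable. */
    static double[] fProbFromVariances(int n1, int n2, double var1, double var2) {
        if (n1 < 2 || n2 < 2 || var1 <= 0.0 || var2 <= 0.0) {
            return null;
        }
        double f, df1, df2;
        if (var1 > var2) { f = var1 / var2; df1 = n1 - 1; df2 = n2 - 1; }
        else             { f = var2 / var1; df1 = n2 - 1; df2 = n1 - 1; }
        double p = 2.0 * betai(0.5 * df2, 0.5 * df1, df2 / (df2 + df1 * f));
        if (p > 1.0) p = 2.0 - p;
        return new double[] {f, p};
    }

    public static void main(String[] args) {
        double[] r = fProbFromVariances(3, 3, 3.0, 1.0);
        System.out.println("fStat=" + r[0] + " pF=" + r[1]);
    }
}
```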

calcTandPvalues

public final boolean calcTandPvalues(int n1,
                                     int n2,
                                     double m1,
                                     double m2,
                                     double s1,
                                     double s2)
calcTandPvalues() - given (n1,m1,s1) and (n2,m2,s2), calc f, t, p, dF. Use the Behrens-Fisher/Satterthwaite estimate for t and dF if the f-test p-value is < 0.05, i.e. the variances are judged to be different. Otherwise use the standard Student t-statistic with dF = (n1+n2-2). To force a particular test, set useTest to 'B' or 'T'; otherwise the method picks the test to use (i.e. TB or TP). It uses the algorithm described in Numerical Recipes in C (1st Ed) for estimating the p-value given the t-statistic using the incomplete beta function betai(). It computes:
    f - calculated f statistic
    t - t or t' statistic
    pT - t-test p-value w/NULL hypoth
    pF - f-test p-value w/NULL hypoth
    dF - degrees of freedom
    useTest - is 'B' or 'T' depending on f statistic
 
Parameters:
n1 - # samples in class 1
n2 - # samples in class 2
m1 - sample mean class 1
m2 - sample mean class 2
s1 - sample std dev class 1
s2 - sample std dev class 2
Returns:
false if any of the data is invalid (need >= 2 samples/class) or the beta fct fails.
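
The two branches described above can be sketched as follows. This is a minimal illustration of the standard formulas, not the MAExplorer code: the class name TTestSketch and its methods are hypothetical, the reading of 'B' as the Behrens-Fisher/Satterthwaite test and 'T' as the pooled Student t-test is an assumption, and the p-value step is omitted.

```java
final class TTestSketch {

    /** Picks the test from the f-test p-value, using the 0.05 threshold. */
    static char chooseTest(double pF) {
        return (pF < 0.05) ? 'B' : 'T';
    }

    /**
     * Returns {t, dF} for the requested test, or null if the inputs are
     * unusable (fewer than 2 samples per class, or zero standard error).
     * s1 and s2 are sample standard deviations.
     */
    static double[] tAndDf(int n1, int n2, double m1, double m2,
                           double s1, double s2, char useTest) {
        if (n1 < 2 || n2 < 2) {
            return null;
        }
        double v1 = s1 * s1;
        double v2 = s2 * s2;
        if (useTest == 'B') {
            // Unequal variances: t' with the Satterthwaite dF estimate.
            double a = v1 / n1;
            double b = v2 / n2;
            double se = Math.sqrt(a + b);
            if (se == 0.0) {
                return null;
            }
            double dF = (a + b) * (a + b)
                      / (a * a / (n1 - 1) + b * b / (n2 - 1));
            return new double[] {(m1 - m2) / se, dF};
        }
        // Equal variances: pooled Student t with dF = n1+n2-2.
        double dF = n1 + n2 - 2;
        double pooledVar = ((n1 - 1) * v1 + (n2 - 1) * v2) / dF;
        double se = Math.sqrt(pooledVar * (1.0 / n1 + 1.0 / n2));
        if (se == 0.0) {
            return null;
        }
        return new double[] {(m1 - m2) / se, dF};
    }

    public static void main(String[] args) {
        char test = chooseTest(0.01);
        double[] r = tAndDf(10, 20, 5.0, 4.0, 1.0, 2.0, test);
        System.out.println("useTest=" + test + " t=" + r[0] + " dF=" + r[1]);
    }
}
```

Note that the Satterthwaite dF is generally not an integer, which is consistent with get_dF() returning a double.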

calcHistStats

public final int calcHistStats(java.lang.String title,
                               int nBins,
                               float[] data,
                               int nData)
calcHistStats() - compute and analyze histogram for whatever range of data is given. This computes and returns the results in variables of this instance:
   hist[0:nBins-1], medianH, modeH, meanH,
   stdDevH, meanAbsDevH, minDataH, maxDataH, deltaBinH
 
Parameters:
title - for data
nBins - size of hist[]
data - of size [0:nData-1]
nData - size of data array
Returns:
nBins if successful.

calcHistStats

public final int calcHistStats(java.lang.String title,
                               int nBins,
                               float[] data,
                               int nData,
                               int[] hist)
calcHistStats() - compute and analyze histogram for whatever range of data is given. This computes and returns the results in variables of this instance:
   hist[0:nBins-1], medianH, modeH, meanH,
   stdDevH, meanAbsDevH, minDataH, maxDataH, deltaBinH.
 Note: if hist is int[nBins+1], then it will not be allocated.
 
Parameters:
title - for data
nBins - size of hist[]
data - of size [0:nData-1]
nData - size of data array
hist - optional array of size [nBins+1]; if null, it is allocated locally
Returns:
nBins if successful else 0.
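
To make the histogram quantities concrete, here is a minimal standalone sketch that bins the data and derives the summary values. It is not the MAExplorer implementation: the class name HistStatsSketch and its fields are hypothetical, the bin width is taken as (max-min)/nBins, the mode is reported as the center of the fullest bin, and the standard deviation divides by (nData-1). All three choices are assumptions.

```java
import java.util.Arrays;

final class HistStatsSketch {
    int[] hist;
    int modeIdx;
    float minData, maxData, binWidth, mean, stdDev, meanAbsDev, median, mode;

    /** Fills the fields above from data[0:nData-1]; returns nBins, or 0 on failure. */
    int calc(int nBins, float[] data, int nData) {
        if (nBins < 1 || data == null || nData < 2 || nData > data.length) {
            return 0;
        }
        float[] sorted = Arrays.copyOf(data, nData);
        Arrays.sort(sorted);
        minData = sorted[0];
        maxData = sorted[nData - 1];
        if (maxData == minData) {
            return 0; // no range to bin
        }
        binWidth = (maxData - minData) / nBins;

        hist = new int[nBins];
        double sum = 0.0;
        for (int i = 0; i < nData; i++) {
            sum += sorted[i];
            // The maximum value would land in bin nBins, so clamp it into the last bin.
            int bin = Math.min((int) ((sorted[i] - minData) / binWidth), nBins - 1);
            hist[bin]++;
        }
        mean = (float) (sum / nData);

        double sumSq = 0.0, sumAbs = 0.0;
        for (int i = 0; i < nData; i++) {
            double d = sorted[i] - mean;
            sumSq += d * d;
            sumAbs += Math.abs(d);
        }
        stdDev = (float) Math.sqrt(sumSq / (nData - 1));
        meanAbsDev = (float) (sumAbs / nData);

        median = (nData % 2 == 1)
                ? sorted[nData / 2]
                : (sorted[nData / 2 - 1] + sorted[nData / 2]) / 2f;

        modeIdx = 0;
        for (int b = 1; b < nBins; b++) {
            if (hist[b] > hist[modeIdx]) {
                modeIdx = b;
            }
        }
        mode = minData + (modeIdx + 0.5f) * binWidth;
        return nBins;
    }

    public static void main(String[] args) {
        HistStatsSketch h = new HistStatsSketch();
        float[] data = {1f, 2f, 2f, 3f, 3f, 3f, 4f, 4f, 5f, 9f};
        int n = h.calc(4, data, data.length);
        System.out.println(n + " bins " + Arrays.toString(h.hist)
                + " mean=" + h.mean + " median=" + h.median + " mode=" + h.mode);
    }
}
```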