Between Group Eigen Analysis of Microarray Data

BGA Web Tutorial


Running BGA using ADE-4

This tutorial describes how to run between group eigen analysis (BGA) using the multivariate software package ADE-4, which is freely available for Mac and Windows operating systems.

Preparing the data

Start up ADE-4. Use File -> Change data folder to specify the working directory (the folder in which your microarray data files are saved).

Open the microarray data in a spreadsheet package (such as Excel) and remove the column and row headings. The column and row labels should be saved in a separate text file. Export the data in tab-delimited format.
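
For readers who would rather script this step than use a spreadsheet, the following is a minimal Python sketch of the same idea: separate the row and column labels from the numeric body and write the body out as tab-delimited text. It is not part of ADE-4. The example table, the function names split_labels and to_tab_delimited, and the assumption that the first row and first column hold the labels are all hypothetical.

```python
import csv
import io

# Hypothetical example table: the first row holds array (sample) names and
# the first column holds gene names. Real data would be read from a file.
raw_text = (
    "gene\tsample1\tsample2\tsample3\n"
    "geneA\t10\t12\t7\n"
    "geneB\t3\t0\t5\n"
)


def split_labels(text):
    """Separate row and column labels from the body of a tab-delimited table."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    column_labels = rows[0][1:]                # names from the header row
    row_labels = [row[0] for row in rows[1:]]  # names from the first column
    values = [row[1:] for row in rows[1:]]     # remaining cells, kept as text
    return row_labels, column_labels, values


def to_tab_delimited(values):
    """Render the unlabelled cells as tab-delimited text, one row per line."""
    return "\n".join("\t".join(row) for row in values) + "\n"


row_labels, column_labels, values = split_labels(raw_text)
data_text = to_tab_delimited(values)
```

Here data_text is the unlabelled table that would be saved for conversion to binary format, while row_labels and column_labels would be saved in a separate text file.
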
Convert the raw data into binary format by dragging and dropping the file onto the ADETrans icon,

[Figure: Converting a text file to binary format using ADETrans]

or by using TextToBin -> Text->Binary.

[Figure: Converting a text file to binary format using TextToBin]

The converted binary file can be viewed by dragging and dropping it onto the ADEBin icon.

Transpose the binary data file using FilesUtil -> transpose.

Between group analysis requires that the categories (groups of samples) are in rows. Hence, when running a between group analysis on data in the standard microarray format (genes in rows, arrays in columns), the data must be transposed. Transposing the data is not required for COA analysis.
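
To make the change of orientation concrete, here is a small Python sketch of what transposition does to a table. It only illustrates the operation; in the tutorial the transposition itself is carried out by ADE-4. The matrix values and the function name transpose are hypothetical.

```python
def transpose(matrix):
    """Swap the rows and columns of a rectangular list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


# Hypothetical table with 3 genes (rows) and 2 arrays (columns).
genes_by_arrays = [
    [10, 12],
    [3, 0],
    [8, 5],
]

# After transposition each row is one array (sample), which is the
# orientation that between group analysis expects.
arrays_by_genes = transpose(genes_by_arrays)
```

After this step the number of rows equals the number of arrays, which is also the number of rows the category file must have.
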

Create a category file. This is a text file containing a single column with the same number of rows as the transposed data file. For example, if you have 35 microarrays, it should contain 35 rows. Each row assigns the corresponding sample to a category (or group); thus, if you have four groups, the file should contain the values 1, 2, 3 and 4, where 1 assigns a sample to group 1, and so on.
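
The category column can be typed by hand, but as an illustration the following Python sketch builds it from a list of group names. The group names, the function name category_column and the choice to number groups in order of first appearance are assumptions made for this example, not requirements stated by ADE-4.

```python
def category_column(group_labels):
    """Map group names to integer codes (1, 2, ...) in order of first appearance."""
    codes = {}
    column = []
    for label in group_labels:
        if label not in codes:
            codes[label] = len(codes) + 1
        column.append(codes[label])
    return column, codes


# Hypothetical group names, listed in the same order as the rows
# (samples) of the transposed data file.
group_labels = ["A", "A", "B", "C", "B", "D"]

column, codes = category_column(group_labels)

# One integer per line, with no header, ready to be saved as a text file.
category_text = "\n".join(str(code) for code in column) + "\n"
```

The important property is that the column has exactly one entry per sample and that the entries follow the same order as the samples in the data file.
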

Convert this file into binary format using ADETrans (as above). Then read this file using CategVar -> Read Categ File.

[Figure: Reading the category file using CategVar -> Read Categ File]

Running the analysis

Run correspondence analysis (COA) or principal component analysis (PCA) on the transposed data.

For COA, the data must be non-negative (usually integer) values. If required, a scalar can be added to the data using MatAlg -> Scalar Addition.
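
As an illustration of what scalar addition achieves, the Python sketch below adds a single constant to every cell so that the smallest value becomes zero. The tutorial does not say which scalar to use, so the choice made here (the negative of the minimum) is an assumption, and the data values and the function name shift_to_non_negative are hypothetical.

```python
def shift_to_non_negative(matrix):
    """Add one scalar to every cell so that the smallest value becomes zero.

    Returns the shifted matrix and the scalar that was added. A matrix that
    is already non-negative is returned unchanged (scalar 0).
    """
    smallest = min(min(row) for row in matrix)
    shift = -smallest if smallest < 0 else 0
    shifted = [[value + shift for value in row] for row in matrix]
    return shifted, shift


# Hypothetical data containing a negative value (for example log ratios).
data = [[-2.0, 1.5], [0.5, 3.0]]
shifted, shift = shift_to_non_negative(data)
```

The same scalar is added to every cell, so differences between values are preserved and only their origin moves.
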

A standard COA will produce 8 output files (.fcta, .fcpl, .fcpc, .fcma, .fcvp, .fcpa, .fcli and .fcco) and a log file. A PCA will likewise produce 8 output files; these are labelled .cnXX, where the last two letters are the same as those from a COA.

The .XXvp file contains the eigenvalues and relative inertia for each axis. The .XXco and .XXli files contain the column and row scores respectively. Note the naming convention: .XXli (rows), .XXco (columns). Details about the output files are available in the ADE-4 help files, which can be accessed from each menu.
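
When reading the eigenvalues, it can help to see how relative inertia relates to them. The Python sketch below takes relative inertia to mean each eigenvalue expressed as a fraction of the sum of all eigenvalues, and also accumulates those fractions. The eigenvalues are made-up numbers, the function names are hypothetical, and the sketch says nothing about the layout of the .XXvp file itself.

```python
def relative_inertia(eigenvalues):
    """Express each eigenvalue as a fraction of the total inertia."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def cumulative(fractions):
    """Running total of the fractions, axis by axis."""
    running = 0.0
    totals = []
    for fraction in fractions:
        running += fraction
        totals.append(running)
    return totals


# Hypothetical eigenvalues for three axes, largest first.
eigenvalues = [4.0, 3.0, 1.0]
fractions = relative_inertia(eigenvalues)
cumulative_fractions = cumulative(fractions)
```

An axis with a large fraction accounts for a large share of the structure in the data, which is a common guide when deciding how many axes to plot.
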

Link the .XXta file (the output file from COA, .fcta, or from PCA, .cnta) and the .cat (category) file using Discrimin -> Initialize:LinkPrep.

This outputs a .dis file, which is the input file for between analysis.

Then run the between analysis using Discrimin -> Between analysis:Run. A Monte Carlo (permutation) test on the data can be run using Discrimin -> Between analysis:test.
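
The general idea of a permutation test on group structure can be sketched in a few lines of Python. This is not a description of ADE-4's implementation: the test statistic (between-group inertia divided by total inertia, with equal row and column weights), the number of permutations, the data and all function names are assumptions made for illustration.

```python
import random


def column_means(rows):
    """Mean of each column of a list-of-lists matrix."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def between_ratio(rows, groups):
    """Between-group inertia divided by total inertia (equal weights)."""
    overall = column_means(rows)
    total = sum(squared_distance(row, overall) for row in rows)
    between = 0.0
    for group in set(groups):
        members = [row for row, label in zip(rows, groups) if label == group]
        between += len(members) * squared_distance(column_means(members), overall)
    return between / total


def permutation_p_value(rows, groups, n_permutations=999, seed=0):
    """Compare the observed ratio with ratios obtained after shuffling labels."""
    rng = random.Random(seed)
    observed = between_ratio(rows, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if between_ratio(rows, shuffled) >= observed:
            hits += 1
    # The observed labelling counts as one of the possible arrangements.
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical data: 6 samples (rows), 2 genes (columns), two groups.
rows = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1], [3.0, 2.0], [3.1, 2.2], [2.9, 1.9]]
groups = [1, 1, 1, 2, 2, 2]
observed, p_value = permutation_p_value(rows, groups)
```

A small p-value indicates that random reassignments of samples to groups rarely separate the groups as well as the real labels do. With only six samples, as here, the number of distinct labellings is small, so the p-value cannot become very small.
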

Between analysis produces 9 output files and a log file. These files are .bec1, .beco, .beli, .bels, .bepa, .bevp, .bepc, .bepl and .beta. Again, the .bevp file contains the eigenvalues and relative inertia for each axis. The .beco file contains the standard column scores (gene co-ordinates), the .beli file contains the standard scores of the groups (centroids of the groups) and the .bels file contains the sampling unit scores resulting from the averaging on the column scores (individual sample co-ordinates). The .bepl and .bepc files contain the weights of the groups and columns.

Projecting Test data

Prepare the test data as above. Save the test data as a tab-delimited file with no column or row names, and convert it to binary format (as above).

Then use DDUtil -> Supplementary Rows to transform the test data in the same way as the original data were transformed in the initial analysis, for example COA (chi-square) or PCA (column centring). Select the .XXvp file (.fcvp for COA or .cnvp for PCA) and your binary transposed test data. The number of rows must match. This will produce 2 files: the transformed data matrix (_tab) and another output file.

Project the _tab file onto the BGA axes using DDUtil -> Row Projections. Select the .bevp file and the _tab file. This will produce one output file, which contains the projections of the test data onto the BGA axes.
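
The two steps above (transform the test data like the original data, then project onto the axes) can be illustrated with a short Python sketch for the PCA (column centring) case. It is a simplified picture rather than ADE-4's computation: the data, the axis and the function names are hypothetical, and the axis is simply assumed to be a unit vector with one coefficient per gene.

```python
def column_means(rows):
    """Mean of each column of a list-of-lists matrix."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def project(test_rows, original_means, axis):
    """Centre test samples on the original column means, then project on an axis."""
    scores = []
    for row in test_rows:
        if len(row) != len(original_means):
            raise ValueError("test samples need one value per gene of the original data")
        centred = [value - mean for value, mean in zip(row, original_means)]
        scores.append(sum(c * a for c, a in zip(centred, axis)))
    return scores


# Hypothetical original data (samples in rows, genes in columns).
original_rows = [[1.0, 4.0], [3.0, 2.0]]
original_means = column_means(original_rows)

# Hypothetical unit-length axis with one coefficient per gene.
axis = [0.6, 0.8]

# Hypothetical test samples, transposed in the same way as the original data.
test_rows = [[4.0, 3.0], [0.0, 5.0]]
test_scores = project(test_rows, original_means, axis)
```

The key point is that the test samples are centred with the means of the original data, not with their own means, so that they are placed in the same space as the samples used to build the axes.
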

Graphing the results

If you have only two groups, the analysis will produce a single axis, which discriminates between the two groups. A simple graph can be produced using Graph 1D -> Between Graph. To plot the samples and genes, plot the .bels and .beco files.
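
Why two groups give a single axis can be seen in a small Python sketch. It assumes PCA-style column centring with equal column weights, in which case the one between-group axis points along the difference between the two group centroids. The sketch is not ADE-4's algorithm, and the data and function names are hypothetical.

```python
import math


def column_means(rows):
    """Mean of each column of a list-of-lists matrix."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def two_group_axis(rows, groups):
    """Unit vector along the difference between the two group centroids."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two groups")
    first = column_means([r for r, g in zip(rows, groups) if g == labels[0]])
    second = column_means([r for r, g in zip(rows, groups) if g == labels[1]])
    difference = [b - a for a, b in zip(first, second)]
    length = math.sqrt(sum(d * d for d in difference))
    return [d / length for d in difference]


def sample_scores(rows, axis):
    """Project each column-centred sample onto the axis."""
    overall = column_means(rows)
    return [
        sum((value - mean) * a for value, mean, a in zip(row, overall, axis))
        for row in rows
    ]


# Hypothetical data: 6 samples (rows), 2 genes (columns), two groups.
rows = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1], [3.0, 2.0], [3.1, 2.2], [2.9, 1.9]]
groups = [1, 1, 1, 2, 2, 2]

axis = two_group_axis(rows, groups)
scores = sample_scores(rows, axis)
```

In this example the samples of one group receive negative scores and those of the other group positive scores, which is the separation a one-dimensional between graph displays. The axis coefficients play the role of gene co-ordinates: genes with large absolute coefficients contribute most to the separation.
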

If you have more than one axis, use ScatterClass or Scatters, depending on whether or not you have category information (a .cat file). ScatterClass -> Stars will produce a graph such as the following.

[Figure: Between analysis graph with more than one axis]

Again, for more information on these modules, refer to the help provided in each menu.

The Higgins Bioinformatics Lab, Updated 26th March 2004
