The main functions of affyAnalysisQC are:
- to compute array quality information;
- to plot images that allow identifying any aberrations present in the dataset;
- to return pre-processed data and QC reports.
If you encounter an issue while using the code, you can report it at any time or, once you have your own account, through our internal tracking system. You can also use this system to post comments or feature suggestions.
Note that three example datasets have been made available on our Download page. They include:
• dataset raw CEL files,
• description file,
• affyAnalysisQC output files:
  - execution logfile,
  - report file (pdf),
  - zip archive with images and tables,
  - normalized data (text file).
The R software can be downloaded from http://www.r-project.org. From this website, follow the link to a local CRAN mirror in order to download the program. affyAnalysisQC is compatible with R version 2.12.0 and higher.
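Whether the running R installation meets this requirement can be checked from within R. The snippet below is a minimal sketch using base R only; the variable names are illustrative and not part of affyAnalysisQC.

```r
# Minimum R version stated in the text above.
required_version <- "2.12.0"

# getRversion() returns the running R version as a comparable object.
version_ok <- getRversion() >= required_version

if (version_ok) {
  message("R ", as.character(getRversion()), " is recent enough.")
} else {
  message("R ", as.character(getRversion()), " is older than ",
          required_version, "; please upgrade.")
}
```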
Several libraries (packages) need to be installed before affyAnalysisQC can be executed.
For your convenience, we prepared a script that installs almost all of the libraries needed. You
can remove from the list any libraries that are already present in your R installation.
The next line could be entered in R to execute the script (installing libraries will take a while):
Note that this script does not install some of the annotation packages needed. These are chip-type specific, and installing all of them would take a large amount of disk space. Depending on your system, R may install them automatically when running the script. If not, you will need to install these libraries manually (see the instructions that follow).
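To find out which libraries still have to be installed, the installed packages can be compared with the list of required ones. The following is a minimal sketch: missing_packages is a hypothetical helper, and the package names are placeholders to be replaced by the libraries your chip type needs.

```r
# Return the subset of 'pkgs' that is not found in any library on .libPaths().
missing_packages <- function(pkgs) {
  installed <- rownames(utils::installed.packages())
  setdiff(pkgs, installed)
}

# Placeholder names: "stats" ships with R, the other two are made up.
wanted <- c("stats", "libraryname1", "libraryname2")
to_install <- missing_packages(wanted)
print(to_install)
```

Only the names returned by the helper need to be installed manually.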
In the Windows GUI, open the Packages menu and choose Select repositories... In the dialog window that pops up, select 'BioC software'. Then go back to the Packages menu and choose Install package(s). Select the required packages from the list below and click OK.
In the R terminal, you can do exactly the same as above, by using the following procedure. Use:
and make sure that at least 'BioC software' is selected. Next, use:
to install a package or:
install.packages(c("libraryname1", "libraryname2", "libraryname3"))
to install - for example - three packages named libraryname1, libraryname2 and libraryname3 respectively.
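As a sketch of the terminal procedure, the standard R functions that correspond to the GUI steps are setRepositories() and install.packages(). That these are the commands meant above is an assumption, and the package names are the placeholders used in the text.

```r
# Placeholder package names taken from the text; replace with real ones.
pkgs <- c("libraryname1", "libraryname2", "libraryname3")

if (interactive()) {
  # Opens the repository chooser; make sure 'BioC software' is selected.
  setRepositories()
  # Accepts a single name or a character vector of names.
  install.packages(pkgs)
} else {
  message("Run this block in an interactive R session to choose repositories.")
}
```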
Alternatively, the required packages can be installed through Bioconductor with the following command:
biocLite(c("libraryname1", "libraryname2", "libraryname3"))
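Note that biocLite() only becomes available after the Bioconductor installation script has been sourced, and that recent Bioconductor releases use the BiocManager package instead. The sketch below wraps both routes in a hypothetical helper, bioc_install. The R 3.5.0 cutoff and the script URL are assumptions that should be checked against the Bioconductor website.

```r
# Pick the Bioconductor installer that matches the running R version.
# Assumption: R >= 3.5.0 uses BiocManager; older versions use biocLite().
bioc_install <- function(pkgs) {
  if (getRversion() >= "3.5.0") {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(pkgs)
  } else {
    # Sourcing this script defines biocLite() in the current session.
    source("http://bioconductor.org/biocLite.R")
    biocLite(pkgs)
  }
}

# Usage (not run here, as it downloads packages):
# bioc_install(c("libraryname1", "libraryname2", "libraryname3"))
```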
This section describes the settings to be provided in the
affyAnalysisQC.R script. Comparable settings and parameters are
requested and used by the web portal and the GenePattern module.
First of all, three directories need to be defined:
- a directory containing the CEL files (DATA.DIR)
- a directory containing the affyAnalysisQC.R helper scripts (SCRIPT.DIR)
- a directory to which the output tables and images should be written (WORK.DIR)
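A minimal sketch of how the three directories might be defined and checked before running the analysis is given below. The paths are hypothetical (temporary directories are used so that the example runs as-is), and the check is an illustration, not part of affyAnalysisQC.

```r
# Hypothetical locations; replace with the paths on your own machine.
DATA.DIR   <- file.path(tempdir(), "cel_files")
SCRIPT.DIR <- file.path(tempdir(), "scripts")
WORK.DIR   <- file.path(tempdir(), "output")

# Create the example directories so that the sketch runs as-is.
for (d in c(DATA.DIR, SCRIPT.DIR, WORK.DIR)) {
  dir.create(d, showWarnings = FALSE, recursive = TRUE)
}

# TRUE for every path that exists and is a directory.
dirs_ok <- sapply(
  c(DATA.DIR = DATA.DIR, SCRIPT.DIR = SCRIPT.DIR, WORK.DIR = WORK.DIR),
  function(p) isTRUE(file.info(p)$isdir)
)
print(dirs_ok)
```

This check only makes sense for local paths; it does not apply when SCRIPT.DIR points to a remote location.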
By default, SCRIPT.DIR points to the repository, so that the up-to-date script files (mainly
functions_images.R) are collected from there.
If you would like to make changes to these functions, you can download them to your local
machine and set SCRIPT.DIR to the corresponding location.
affyAnalysisQC can use a description file containing information about the arrays (samples) in the dataset. The location of this file is given by the
arrayGroup parameter, which requires the full path to the file.
If arrayGroup is not set, the CEL file names are used as sample names,
and no distinct group colours will be used in the images produced.
The description file is a tab-delimited text file containing three columns with the following layout:
ArrayDataFile   SourceName   FactorValue
Array1.CEL      patient1     patient
Array14.CEL     control1     control
Array23.CEL     patient2     patient
Array7.CEL      patient3     patient
...             ...          ...
The first column contains the names of the CEL files (or any type that can be read by the ReadAffy
(affy) function, e.g. CEL.gz) that are in the DATA.DIR. The second column contains the names to be
used for each array in the plots and tables produced. The third column contains the names of the
groups the samples belong to.
The column headers should be present, but may be named otherwise (as long as the order is the same, and no spaces are used in the names). If there are more than three columns, all further ones are ignored.
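The layout rules above can be checked before starting the analysis. The sketch below writes a small description file to a temporary location and reads it back with read.delim. How affyAnalysisQC itself reads the file is not shown here, so this is only an illustration of the expected format.

```r
# Write a three-row example in the layout described above (tab-delimited).
desc_file <- tempfile(fileext = ".txt")
writeLines(c(
  "ArrayDataFile\tSourceName\tFactorValue",
  "Array1.CEL\tpatient1\tpatient",
  "Array14.CEL\tcontrol1\tcontrol",
  "Array7.CEL\tpatient3\tpatient"
), desc_file)

# check.names = FALSE keeps the headers exactly as written in the file.
description <- read.delim(desc_file, stringsAsFactors = FALSE,
                          check.names = FALSE)

# At least three columns are required; any further ones are dropped.
stopifnot(ncol(description) >= 3)
description <- description[, 1:3]

# Header names must not contain spaces.
stopifnot(!any(grepl(" ", names(description))))
print(description)
```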
The next parameter, reorder, indicates whether the arrays have to be reordered by experimental group first in the images and tables produced, as this may ease interpretation.
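The effect of reordering by group can be illustrated on a small description table. The sketch below sorts the groups alphabetically, which is an assumption; the ordering used inside affyAnalysisQC may differ.

```r
# Small description table in the three-column layout described above.
description <- data.frame(
  ArrayDataFile = c("Array1.CEL", "Array14.CEL", "Array7.CEL"),
  SourceName    = c("patient1", "control1", "patient3"),
  FactorValue   = c("patient", "control", "patient"),
  stringsAsFactors = FALSE
)

reorder <- TRUE

if (reorder) {
  # Sort rows by group; arrays within a group keep their original order.
  description <- description[order(description$FactorValue), ]
}
print(description$SourceName)
```

With reorder set to FALSE, the arrays stay in the order in which they appear in the description file.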
Choice of the plots to be computed
Most of the remaining parameters are Booleans that indicate whether a certain plot or table has to be computed. Information on these plots can be found in the comment lines of the
affyAnalysisQC.R file itself, and on the
arrayanalysis.org website (see: "module description").
Options required for some plots
A few other parameters provide options for the plots:
The MA-plot option and normOption1 indicate whether MA plots and normalization, respectively, should be computed for the whole
dataset (“dataset”) or per experimental group (“group”).
The clustering parameters and normMeth give settings for clustering and
normalization, respectively (see the help given in the script itself).
customCDF indicates whether, before normalization, the array annotation should be updated
with a custom cdf environment from the BrainArray lab, as is advisable. This is done by default.
Two settings are needed when an updated cdf is
requested, the second of which is species: the first indicates the database for which the updated cdf should be chosen (when
selecting “ENSG”, the common gene name and description will also be added to the normalized data
table), the second indicates the species (if not given, the script will try to deduce it from the
The last line of the affyAnalysisQC.R script, starting with source, loads the
run_affyAnalysisQC.R script, which creates all the images and output tables.
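As an illustration, such a source call could look as follows when the helper scripts are stored locally. The path is hypothetical, and building the location with file.path is an assumption; the actual last line of affyAnalysisQC.R may be written differently.

```r
# Hypothetical local copy of the helper scripts; replace with your own path.
SCRIPT.DIR <- file.path(tempdir(), "scripts")

# Build the location of the core script from SCRIPT.DIR.
core_script <- file.path(SCRIPT.DIR, "run_affyAnalysisQC.R")

if (file.exists(core_script)) {
  # Runs the core script, which produces the images and output tables.
  source(core_script)
} else {
  message("Core script not found at: ", core_script)
}
```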
After opening R (by either running the R GUI or typing R in a command shell),
affyAnalysisQC can be initiated by entering:
For a detailed description of the
run_affyAnalysisQC.R script, which is the core script of the module, and of all functions it calls, we refer to the function guide at doc_affyQC_func.php.