Technical documentation

Path workflow - document version: 1.0

Table of Contents

[Introduction]
[Use the on-line Path module]
[Interpret the results of the Path module]

Introduction

Overview of the documentation

This technical documentation has two main objectives:
- to guide you in the use of the Path module
- to give interpretative help on the outputs of the module

All source code has been written in R and is available at https://github.com/BiGCAT-UM/Path_Module.

The Path module can be run:
- on-line via the http://www.arrayanalysis.org web portal (follow "Get started" and choose "Pathway analysis")
- or as an automated R workflow on a local computer

The main functions of the Path module are:
  1. to import a dataset;
  2. to create a visualization;
  3. to calculate z-scores based on a user-defined criterion;
  4. to return a list of pathways sorted by z-score.
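
This documentation does not state the z-score formula. The sketch below assumes the over-representation z-score commonly used by MAPPFinder and PathVisio, which may differ from what the module actually computes. The function name, pathway names and all counts are hypothetical and only illustrate how functions 3 and 4 fit together.

```r
# Sketch only: assumes a MAPPFinder/PathVisio-style z-score, which this
# documentation does not confirm. All names and numbers are made up.
pathway_zscore <- function(r, n, R, N) {
  # r: genes on the pathway that meet the criterion
  # n: genes measured on the pathway
  # R: genes in the whole dataset that meet the criterion
  # N: genes measured in the whole dataset
  expected <- n * R / N
  variance <- n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
  (r - expected) / sqrt(variance)
}

pathways <- data.frame(
  pathway = c("Pathway A", "Pathway B", "Pathway C"),
  r = c(12, 3, 5),
  n = c(40, 50, 20),
  stringsAsFactors = FALSE
)
R_total <- 500
N_total <- 10000

# Score every pathway, then sort from highest to lowest z-score.
pathways$zscore <- pathway_zscore(pathways$r, pathways$n, R_total, N_total)
ranked <- pathways[order(pathways$zscore, decreasing = TRUE), ]
print(ranked)
```

A pathway whose number of changed genes equals the number expected by chance gets a z-score of 0; higher scores indicate more changed genes than expected.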

How to use the documentation

As shown in the Table of Contents, the rest of this documentation has two sections:

  1. Using the on-line Path module
  2. Interpreting the results provided

Bug tracking system

If you encounter an issue while using the code, you can report it at any time on our internal tracking system: http://trac.bigcat.unimaas.nl/arrayanalysis/newticket. You can also use this system to post comments or feature suggestions.

Example gene level statistics file

An example dataset is available. When running the module, you can tick a checkbox to use this dataset (Example1) in order to explore the functionality of the module.

Use the on-line Path module

Pathway Analysis
You can access the on-line module on the arrayanalysis.org web portal (follow "Get started" and choose "Pathway analysis").
You don't need to log in; you only need to prepare a gene level statistics file containing the statistical contrasts between the different groups of your Affymetrix .CEL files (you can also obtain this file by running the statistical analysis module).
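
The exact columns the module expects are not listed in this documentation. As a minimal sketch, a tab-delimited gene level statistics file could be written from R as follows. The column names (GeneID, logFC, P.Value), the values and the file name are illustrative assumptions, not a prescribed format.

```r
# Hypothetical gene level statistics table; the column names and values are
# illustrative and are not prescribed by this documentation.
stats <- data.frame(
  GeneID  = c("1017", "7157", "672", "5594"),
  logFC   = c(1.8, -0.4, -2.1, 0.1),
  P.Value = c(0.001, 0.30, 0.0004, 0.85),
  stringsAsFactors = FALSE
)

# Write a tab-delimited text file without row names or quotes.
stats_file <- file.path(tempdir(), "gene_level_statistics.txt")
write.table(stats, stats_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm that the header and columns survive the round trip.
# Identifiers are read as text so that they are not converted to numbers.
check <- read.delim(stats_file, colClasses = c(GeneID = "character"))
str(check)
```

Writing without quotes and row names keeps the file a plain table with one header line, which is the usual shape for tab-delimited uploads.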

The on-line module takes you through four steps before the analysis is launched:
- Step 1: Load the gene level statistics file and select the species. Alternatively, select the example dataset to explore the module; in that case you do not need to select a species, because the example dataset is human. Click "Run Path" to proceed.
- Step 2: Choose the column in the data file that contains the identifiers. If all identifiers come from the same database, also choose that database. If the dataset is annotated with several identifier systems, choose the system code column instead; the system code specifies which database each identifier belongs to.
- Step 3: Specify a criterion for calculating the z-scores.
- Step 4: Select colour criteria for visualizing the uploaded data on the pathways.
Then:
- Execution: The module is executed with the settings you chose.
- Results: After execution, you get the results on the website or by e-mail.

First step: load the data file and select species

The following picture shows the screen for the first step:

[Screenshot: step 1]

This dialog allows you to upload a tab-delimited text file with gene level statistics data and to choose the relevant species. Alternatively, you can run the module with the example dataset by ticking the checkbox provided.
The question mark button gives you contextual help.

Second step: Identifier Mapping

The following part of the online form is used for the second step:

[Screenshot: step 2]


Your dataset has been uploaded. To map the uploaded data to the pathways, the annotation information needs to be filled in.
"Identifier Column": Choose the column in the uploaded data file that contains the identifiers used for annotation.
"Database": If the identifiers are all from the same database, select that database.
OR "System Code": If the identifiers come from different databases, select the column that contains the system code of the database for each identifier.
The question mark button gives you contextual help.
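
This documentation does not list the system codes the module accepts. The sketch below assumes BridgeDb-style codes ("En" for Ensembl, "L" for Entrez Gene), which should be checked against the contextual help before use. It shows one way to add a system code column to a dataset with mixed identifiers; the column names and values are hypothetical.

```r
# Sketch: add a system code column to a dataset with mixed identifiers.
# Assumption: BridgeDb-style codes ("En" = Ensembl, "L" = Entrez Gene).
stats <- data.frame(
  ID    = c("ENSG00000141510", "1017", "ENSG00000012048", "5594"),
  logFC = c(-0.4, 1.8, -2.1, 0.1),
  stringsAsFactors = FALSE
)

# Human Ensembl gene identifiers start with "ENSG"; purely numeric identifiers
# are treated as Entrez Gene here. Anything else is left as NA for manual review.
stats$SystemCode <- ifelse(
  grepl("^ENSG[0-9]+$", stats$ID), "En",
  ifelse(grepl("^[0-9]+$", stats$ID), "L", NA_character_)
)
print(stats)
```

Leaving unrecognized identifiers as NA, rather than guessing a code, makes them easy to spot before the file is uploaded.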

Third step: Set criterion for z-score calculation

The following part of the online form is used for the third step:

[Screenshot: step 3]


Select a criterion for calculating the z-score. For example, you could specify a criterion based on a fold change threshold. You can either type the expression in the "Expression" field or build it by clicking on the listed parameters and operators.
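
The syntax of the "Expression" field is not described here and may differ from R. As an illustration of what such a criterion does, the plain R sketch below flags the genes that meet a hypothetical fold change and p-value threshold. It assumes that logFC holds log2 fold changes; the column names and values are made up.

```r
# Sketch in plain R; the syntax of the on-line "Expression" field may differ.
stats <- data.frame(
  GeneID  = c("1017", "7157", "672", "5594"),
  logFC   = c(1.8, -0.4, -2.1, 0.1),
  P.Value = c(0.001, 0.30, 0.0004, 0.85),
  stringsAsFactors = FALSE
)

# Hypothetical criterion: at least a two-fold change (|log2 FC| >= 1)
# combined with a p-value below 0.05.
meets_criterion <- with(stats, abs(logFC) >= 1 & P.Value < 0.05)

stats$GeneID[meets_criterion]  # genes that count as "changed"
sum(meets_criterion)           # number of genes meeting the criterion
```

The criterion splits the dataset into genes that qualify and genes that do not; the z-score of a pathway then depends on how many qualifying genes it contains.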

Fourth step: Creating a visualization

The following part of the online form is used for the fourth step:

[Screenshot: step 4]


Data can be visualized on pathways using colours. A gradient colouring scheme visualizes a range of values on each gene (e.g. fold change), while a rule colours only the genes that meet a given criterion (e.g. P Value "<" 0.05).
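
The difference between the two colouring modes can be sketched in base R as follows. The colours, the [-2, 2] range and the column names are arbitrary choices for illustration and are not the module's defaults.

```r
# Sketch of the two colouring modes in base R; colours and thresholds are
# arbitrary choices, not the module's defaults.
stats <- data.frame(
  GeneID  = c("1017", "7157", "672", "5594"),
  logFC   = c(1.8, -0.4, -2.1, 0.1),
  P.Value = c(0.001, 0.30, 0.0004, 0.85),
  stringsAsFactors = FALSE
)

# Gradient: map logFC in [-2, 2] onto a blue-white-red scale with 101 steps.
palette_fn <- colorRampPalette(c("blue", "white", "red"))
shades <- palette_fn(101)
clipped <- pmax(pmin(stats$logFC, 2), -2)  # clip values outside the range
stats$gradient <- shades[round((clipped + 2) / 4 * 100) + 1]

# Rule: colour only the genes that satisfy P.Value < 0.05.
stats$rule <- ifelse(stats$P.Value < 0.05, "yellow", NA_character_)
print(stats)
```

With the gradient every gene receives a colour that reflects its value, whereas with the rule genes that fail the test receive no colour at all.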

Execution

After clicking 'Run' the module is executed.

[Screenshot: execution]

Results

Upon completion, a results page is displayed on your screen.

[Screenshots: results 1, results 2, results 3]

The first part of the screen recalls your settings. Below that are links to the log file of the run and to the zip file containing all results (index file, pathway images, and related backpages). The results are described in the next section of this documentation.

[Top]

Interpret the results of the Path module

[Screenshot: index]

The output consists of :

[Top]