TFactS Manual

CONTENTS:

  1. DESCRIPTION
  2. AUTHORS
  3. INPUT
    1. TFactS MODES
    2. Supported organisms
    3. Gene specification
  4. ANNOTATED SIGNATURES
    1. Data sources
    2. Target gene signatures
  5. METHOD
    1. Comparison Statistics
    2. Statistical significance
    3. Example
  6. PUBLICATIONS
  7. ACKNOWLEDGEMENTS


DESCRIPTION

TFactS is designed to predict which transcription factors are regulated, inhibited or activated in a biological system, based on lists of upregulated and downregulated genes generated in microarray experiments.
TFactS takes as input lists of up- and/or down-regulated genes (query genes), compares them with a catalogue of annotated target genes, and returns three lists of transcription factors (TF) whose annotated target genes show a significant overlap with the query genes. The first list shows the regulated TF, using the Sign-Less catalogue; the second list shows the activated TF; and the third list shows the repressed TF. Both the activated and repressed lists are produced using the Sign-Sensitive catalogue.

AUTHORS

Ahmed Essaghir & Jean-Baptiste Demoulin

INPUT

TFactS MODES:

TFactS can be used in 2 modes (Batch Mode and Simple Mode):
  1. BatchTFactS is made to simplify multiple submissions:
  2. TFactS (Simple Mode) takes as input one or two lists of query genes:
  3. The user can specify which catalogue to use. The "Add Data" button can be used to specify custom catalogues.
  4. The user can specify how many random selections to use for the negative control (RC: see Methods section)
  5. The user can specify the desired thresholds for positive control and false discovery control (see Methods section)
  6. The user can limit the comparison list of TFs to the ones having more than a certain number of target genes (target gene # threshold)

Supported organisms:

TFactS primarily aims at interpreting human data. Rat and mouse orthologues of human genes are also supported.

Gene specification:

Genes should be specified by their official HGNC names or by their Entrez IDs.
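As a rough illustration, the two accepted identifier types can be told apart before submission. The sketch below assumes that Entrez IDs are written as plain integers and treats every other token as a gene name. The function name `classify_gene_id` is hypothetical and is not part of TFactS.

```python
def classify_gene_id(identifier: str) -> str:
    """Return 'entrez' for an all-digit identifier, 'symbol' otherwise.

    Assumption: Entrez IDs are plain ASCII integers; anything else is
    treated as an official gene name. No lookup against HGNC is done.
    """
    token = identifier.strip()
    if not token:
        raise ValueError("empty gene identifier")
    if token.isascii() and token.isdigit():
        return "entrez"
    return "symbol"


# Mixed query list: names and numeric IDs, with stray whitespace.
queries = ["TP53", "7157", " FOXO3 "]
kinds = [classify_gene_id(q) for q in queries]
```

A check like this only separates the two notations; it does not confirm that a name or ID actually exists.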

ANNOTATED SIGNATURES

TFactS relies on catalogues of transcription factor signatures. A signature is defined here as the list of genes regulated by a transcription factor.
The Sign-Sensitive catalogue enumerates pairwise relationships between transcription factors (TF) and their target genes (TG), documented with the regulation type (up or down). The Sign-Less catalogue does not contain regulation type information.

Data sources

Lists of TG are obtained from literature (see references) and databases like , , curated and . We annotated some of these lists with the regulation type (up or down) based on the original publications (from PubMed). The SREBP and p53 TG lists were based on our previous reports[1,2] and were complemented by data in TRED, with the regulation type taken from the original publications. The FOXO and STAT TG lists were built from published papers[3-21].

Target gene signatures

In our catalogue of TF target genes, we define the signature of a TF as the set of all target genes of this TF. A Generic TF is defined as a TF representing a family of very closely related transcription factors: STAT, for example, is a Generic TF for STAT1, STAT3 and STAT5.

Specific signatures

A Specific Signature is the normal signature as defined above for each TF.

Generic signatures

A Generic Signature is the union of the Specific Signatures of all the TFs represented by a Generic TF.
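The union that defines a Generic Signature can be sketched with Python sets. The gene names below are placeholders and not catalogue data, and `generic_signature` is a hypothetical helper used only for illustration.

```python
# Hypothetical specific signatures; gene names are placeholders.
specific_signatures = {
    "STAT1": {"GENE_A", "GENE_B"},
    "STAT3": {"GENE_B", "GENE_C"},
    "STAT5": {"GENE_C", "GENE_D"},
}


def generic_signature(members, signatures):
    """Return the union of the specific signatures of all member TFs."""
    merged = set()
    for tf in members:
        merged |= signatures[tf]
    return merged


# Generic TF "STAT" represents STAT1, STAT3 and STAT5.
stat_signature = generic_signature(["STAT1", "STAT3", "STAT5"], specific_signatures)
```

Because the result is a set, a target gene shared by several family members is counted once in the Generic Signature.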

METHOD

The program compares the list(s) of query genes (up and/or down) with a catalogue of signatures.

Comparison Statistics

For each TF, three hypotheses are tested using a contingency table as follows:

TF                   User: Present   User: Absent   Total
Catalogue: Present   k               m-k            m
Catalogue: Absent    n-k             N+k-n-m        N-m
Total                n               N-n            N

m: # TG of TF
n: # submitted genes
N: # signatures in the catalogue
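k: # submitted genes that match TG of the TF; it differs according to the hypothesis to be tested as follows (as illustrated in the Example section):
  1. Regulation: all submitted genes that match a TG, without taking the regulation type into account (sign-less catalogue)
  2. Activation: submitted genes that match a TG and match its regulation type (sign-sensitive catalogue)
  3. Repression: submitted genes that match a TG and mismatch its regulation type (sign-sensitive catalogue)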
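The layout of the contingency table can be sketched in a few lines of Python. The function name `contingency_table` is hypothetical; the cell formulas are those of the table above, and the input values are those of the FOXO3 activation case in the Example section.

```python
def contingency_table(k, m, n, N):
    """Build the 3x3 table (cells plus margins) as a list of rows.

    k: submitted genes counted as matches for the tested hypothesis
    m: number of TG of the TF
    n: number of submitted genes
    N: catalogue total, as defined in the legend above
    """
    rows = [
        [k, m - k, m],                  # Catalogue: Present
        [n - k, N + k - n - m, N - m],  # Catalogue: Absent
        [n, N - n, N],                  # Total
    ]
    if any(cell < 0 for row in rows for cell in row):
        raise ValueError("inconsistent counts: a cell would be negative")
    return rows


# Values of the FOXO3 activation example: k=8, m=66, n=43, N=2429.
table = contingency_table(k=8, m=66, n=43, N=2429)
```

Only k changes between the activation and repression tables of a given TF; m, n and N stay the same.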

Statistical significance

TFactS gives three tables as output, one for each of the three hypotheses: Regulation (sign-less catalogue), Activation and Repression (sign-sensitive catalogue) of a TF. Each table gives the statistics for the corresponding hypothesis.

Example

Total Submitted genes (UP + DOWN): 43;
FOXO3 has 66 TG in the sign-sensitive catalogue;
among the submitted genes, 11 matched FOXO3 TG;
among these 11 genes : 8 matched the regulation type and 3 mismatched the regulation type.
Total number of Tests : 24;
Number of Repetitions (random simulation): 100;

Activation of FOXO3
FOXO3                User: Present   User: Absent   Total
Catalogue: Present   8               58             66
Catalogue: Absent    35              2328           2363
Total                43              2386           2429
  1. P-value = 1.40e-4;
  2. E-value = 3.36e-3;
  3. Q-value = 3.36e-3;
  4. FDR control = 2.08e-3 (≥ P-value);
  5. RC ≤1%

Repression of FOXO3
FOXO3                User: Present   User: Absent   Total
Catalogue: Present   3               63             66
Catalogue: Absent    40              2323           2363
Total                43              2386           2429
  1. P-value = 2.15e-1;
  2. E-value = 5.15e+0;
  3. Q-value = 3.06e-1;
  4. FDR control = 1.67e-2 (< P-value);
  5. RC ≤1%
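The E-values and FDR control values listed for the two hypotheses are numerically consistent with two standard multiple-testing quantities: an E-value equal to the P-value multiplied by the number of tests, and a Benjamini-Hochberg style threshold equal to alpha × rank / number of tests. This is an inference from the displayed numbers, not a formula stated in this manual. In the sketch below, alpha = 0.05 and the ranks 1 and 8 are assumptions, chosen because they reproduce the displayed FDR control values; the function names are hypothetical.

```python
def e_value(p_value, n_tests):
    """Expected number of false positives: P-value times number of tests."""
    return p_value * n_tests


def bh_threshold(rank, n_tests, alpha=0.05):
    """Benjamini-Hochberg style threshold for the P-value of a given rank."""
    return alpha * rank / n_tests


N_TESTS = 24  # "Total number of Tests" in the example

activation_e = e_value(1.40e-4, N_TESTS)    # compare with 3.36e-3
repression_e = e_value(2.15e-1, N_TESTS)    # compare with 5.15e+0
activation_fdr = bh_threshold(1, N_TESTS)   # compare with 2.08e-3 (assumed rank 1)
repression_fdr = bh_threshold(8, N_TESTS)   # compare with 1.67e-2 (assumed rank 8)
```

The repression E-value computed this way is 5.16 rather than the listed 5.15, presumably because the displayed P-value is rounded to three significant digits.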

If we set thresholds such as P-value ≤ 0.05, E-value ≤ 0.05, Q-value ≤ 0.05, FDR ≤ 0.05 and RC ≤ 5%, then we can say that FOXO3 is activated in this system.
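The decision described here can be sketched as a simple filter over the statistics reported for each hypothesis. The class and function names are hypothetical. The FDR criterion is interpreted as "P-value ≤ FDR control value", which is an assumption suggested by the "(≥ P-value)" and "(< P-value)" annotations in the example, not a rule stated explicitly in this manual.

```python
from dataclasses import dataclass


@dataclass
class HypothesisResult:
    p_value: float
    e_value: float
    q_value: float
    fdr_control: float
    rc_percent: float  # upper bound reported for RC, in percent


def passes(result, p_max=0.05, e_max=0.05, q_max=0.05, rc_max=5.0):
    """Return True when every criterion of the example is satisfied."""
    return (
        result.p_value <= p_max
        and result.e_value <= e_max
        and result.q_value <= q_max
        and result.p_value <= result.fdr_control  # assumed FDR criterion
        and result.rc_percent <= rc_max
    )


# Statistics copied from the FOXO3 example.
activation = HypothesisResult(1.40e-4, 3.36e-3, 3.36e-3, 2.08e-3, 1.0)
repression = HypothesisResult(2.15e-1, 5.15e+0, 3.06e-1, 1.67e-2, 1.0)

foxo3_activated = passes(activation)
foxo3_repressed = passes(repression)
```

With these inputs the activation hypothesis passes every criterion, while the repression hypothesis already fails on the P-value threshold.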
The regulation hypothesis is based on the sign-less catalogue; its contingency table is built in the same way as above, without taking into account the sign of the regulation.
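How the match count differs between the three hypotheses can be sketched as follows, based on the example above, where 8 of the 11 matched genes agree with the catalogued regulation type (activation table) and 3 disagree (repression table). The gene names are placeholders and `count_k` is a hypothetical helper, not TFactS code. For simplicity the sketch uses one set of targets for all three counts, whereas TFactS uses a separate sign-less catalogue for the regulation hypothesis.

```python
def count_k(up_genes, down_genes, signed_targets):
    """Count matches for one TF under the three hypotheses.

    signed_targets maps each target gene to 'up' or 'down'
    (one TF's entries in a sign-sensitive catalogue).
    """
    observed = {gene: "up" for gene in up_genes}
    observed.update({gene: "down" for gene in down_genes})

    matched = [gene for gene in observed if gene in signed_targets]
    same_sign = sum(1 for gene in matched if observed[gene] == signed_targets[gene])

    return {
        "regulation": len(matched),              # sign ignored
        "activation": same_sign,                 # regulation type matches
        "repression": len(matched) - same_sign,  # regulation type mismatches
    }


# Placeholder data: three catalogued targets, four query genes.
targets = {"GENE_A": "up", "GENE_B": "up", "GENE_C": "down"}
counts = count_k(
    up_genes=["GENE_A", "GENE_C", "GENE_X"],
    down_genes=["GENE_B"],
    signed_targets=targets,
)
```

In this toy case GENE_A matches its catalogued sign, GENE_B and GENE_C do not, and GENE_X is not a target, so the three counts are 3, 1 and 2.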

PUBLICATIONS

Essaghir A, Toffalini F, Knoops L, Kallin A, van Helden J, Demoulin JB: Transcription factor regulation can be accurately predicted from the presence of target gene signatures in microarray gene expression data. Nucleic Acids Res. 2010 Jun 1;38(11):e120. Epub 2010 Mar 9.
Essaghir A, Demoulin JB: A Minimal Connected Network of Transcription Factors Regulated in Human Tumors and Its Application to the Quest for Universal Cancer Biomarkers. PLoS One. 2012;7(6):e39666.

ACKNOWLEDGEMENTS