VAN Package
User Guide Version 1.0


  • Page 1 VAN Package User Guide Version 1.0...
  • Page 2: Table Of Contents

    Table of Contents
    1. Installation instructions
    2. Example dataset
    3. Example analysis
       Protein-protein interactome
       MicroRNA-target interactome
    4. Network visualization using R or Cytoscape – an example
    5. Meta-analysis of multiple datasets – an example
    6. Generating microRNA/protein based interactome – an example
    7. Understanding input data and parameters
    8. Conversion of gene symbols to Entrez IDs
    9. Understanding output data
    10.
  • Page 3: Installation Instructions

    Section 1: Installation instructions
    Packages to download
      For Windows: VAN_1.0.0.zip, VANData_1.0.0.zip
      For Unix: VAN_1.0.0.tar.gz, VANData_1.0.0.tar.gz
      For Mac: VAN_1.0.0.tgz, VANData_1.0.0.tgz
    Example dataset to download
      Example_DataSet.zip
  • Page 4 Installation Steps
    Download and install R version 2.15.1 or higher from the website http://www.r-project.org
    At the R command prompt type
        chooseCRANmirror()
        ## Select one of the options from the pop-up menu, e.g. Australia (Canberra)
        setRepositories()
        ## Select the following three options from the pop-up menu:
        ## CRAN, BioC software, BioC annotation
        install.packages("annotate")
        install.packages("doParallel")
  • Page 5: Example Dataset

    Section 2: Example dataset
    The Example_DataSet.zip contains:
      Gene_Expr_Two_Conditions.txt: An example gene expression dataset with samples corresponding to two conditions, StateA and StateB.
      Gene_Expr_Four_Conditions.txt: An example gene expression dataset with samples corresponding to four conditions, StateA, StateB, StateC, and StateD.
      Micro_Expr_Two_Conditions.txt: An example microRNA expression dataset with samples corresponding to two conditions...
  • Page 7: Example Analysis

    Section 3: Example analysis
    For all the analyses, we assume that the example dataset (Section 2) is saved in the folder C:/My_Packages. We also assume that the VAN package has been loaded and the current working directory has been set appropriately, as shown below.
    Load the VAN package and set the working directory
    At the R command prompt type
        setwd("C:/My_Packages")
  • Page 8
        "Micro_Expr_Two_Conditions.txt")
        , labelIndex = 1
        , mapFile = "Mirnome_Map.txt"
        , outFile = "Test_Output_Mirnome.txt"
        , randomizeCount = 10)
    The order of the two expression data files is important. As shown in the above example, the first file should correspond to gene expression data and the second file to microRNA expression data.
  • Page 9 Section 4: Network Visualization using R and Cytoscape – An example
    We provide two options for visualizing the changes in associations between a protein/microRNA hub and its interactors. Typically, the input file for data visualization will correspond to the output correlation file generated by the...
  • Page 10
        obtainPairSubset(filePrefix="Gene_Output_1"
        , useAdjustedProb=FALSE
        , probThresh=0.05)
    The above command generates an output file “Gene_Output_1_Cor_Signif.txt”. This file contains the hub-interactor pairs for only those hubs that have an unadjusted p-value less than 0.05.
    We provide an example layout file and a ‘color-blind safe’ edge palette (created with the aid of http://www.colorbrewer2.org/ and http://jfly.iam.u-tokyo.ac.jp/color/) for visualization...
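
The p-value filter described for obtainPairSubset can be pictured with a few lines of base R. This is only an illustration of the thresholding idea, not the package's implementation: the data frame, its column names (`hub`, `interactor`, `hub_p`) and the gene labels are hypothetical, and the real output file layout may differ.

```r
# Hypothetical hub-interactor pairs with one (unadjusted) p-value per hub.
pairs <- data.frame(
  hub        = c("HUB1", "HUB1", "HUB2", "HUB3"),
  interactor = c("G1", "G2", "G3", "G4"),
  hub_p      = c(0.01, 0.01, 0.20, 0.04),
  stringsAsFactors = FALSE
)

# Keep only the pairs whose hub p-value is below the threshold
# (the guide's example uses probThresh = 0.05).
prob_thresh  <- 0.05
signif_pairs <- pairs[pairs$hub_p < prob_thresh, ]
print(signif_pairs)
```

All pairs belonging to a hub are kept or dropped together, because the p-value is attached to the hub rather than to the individual pair.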
  • Page 11 For multiple conditions, the above procedure is followed with additional Columns activated at the data import step (refer to 3f, above). This should enable multiple States to be available for viewing as described at 7, above.
  • Page 12: Meta-Analysis Of Multiple Datasets - An Example

    Section 5: Meta-analysis of multiple datasets – An example
    To combine the results obtained using multiple datasets, we have implemented two meta-analysis methods: Fisher’s combined test and RankProd [3]. Since both methods assume independence of datasets, the output p-values should be interpreted with caution if the same expression dataset was combined with multiple protein or microRNA interactomes.
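
As a rough illustration of Fisher's combined test mentioned above, the following base R sketch combines independent p-values through the statistic -2 * sum(log(p)), which follows a chi-squared distribution with 2k degrees of freedom under the null hypothesis. The function name `fisher_combine` and the example p-values are made up for this sketch; this is not the VAN package's own code.

```r
# Fisher's combined probability test for k independent p-values.
fisher_combine <- function(p) {
  stopifnot(all(p > 0 & p <= 1))
  statistic <- -2 * sum(log(p))      # chi-squared with 2k df under the null
  pchisq(statistic, df = 2 * length(p), lower.tail = FALSE)
}

# Hypothetical p-values for one hub, obtained from three independent datasets.
p_values <- c(0.04, 0.10, 0.03)
combined <- fisher_combine(p_values)
print(combined)
```

The independence assumption matters here: if the three p-values came from the same expression dataset paired with different interactomes, the combined value would tend to be too optimistic.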
  • Page 13 Section 6: Generating microRNA-target or protein-protein interaction interactome
    a. Protein-protein interactions: To generate an input protein interactome file, perform the following steps:
      Download a MiTab Lite file from the website http://wodaklab.org/iRefWeb/search/index
      Apply the function generatePpiMap to the MiTab Lite file. To illustrate this, we use the example file “MiTabLite_Example.txt”...
  • Page 14: Understanding Input Data And Parameters

    Section 7: Understanding input data and parameters
    Input file formats
    Expression data: VAN does not provide any functions for preprocessing user expression data, such as data normalization to remove systematic variation, or other potential steps including data transformation and/or filtering. These functions are widely available elsewhere, and users should provide appropriately normalized and filtered data as input to VAN.
  • Page 15 the carriage return (i.e. the key labeled “Enter” on the keyboard) should be pressed immediately after LABEL_END.
    Figure 2: An example input expression data file with two sets of labels (Set 1 and Set 2)
    Interactome data: The interactome data file is tab-separated and has two columns: the first column corresponds to hubs and the second column to interactors.
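
To make the two-column interactome layout concrete, here is a small base R sketch that reads tab-separated hub/interactor pairs and groups the interactors by hub. The gene labels are placeholders, the column names are chosen for this example, and the absence of a header row is an assumption; check the actual file before reusing this.

```r
# A tiny in-memory stand-in for a tab-separated interactome file
# (column 1 = hub, column 2 = interactor). Labels are placeholders.
interactome_text <- "HUB1\tGENE_A\nHUB1\tGENE_B\nHUB2\tGENE_C\n"

interactome <- read.delim(
  text = interactome_text,
  header = FALSE,                     # assumption: no header row
  col.names = c("hub", "interactor"),
  stringsAsFactors = FALSE
)

# One network module per hub: the hub and its interaction partners.
modules <- split(interactome$interactor, interactome$hub)
print(modules)
```

For a file on disk, the `text =` argument would be replaced by the file path.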
  • Page 16 that increasing the number of permutations will increase the execution time. The number of permutations can also be lowered, but this is not recommended.
    adjustMethod (default "BH"): By default, the p-values for modules are adjusted using the Benjamini-Hochberg (or false discovery rate) adjustment method [16], as implemented in R.
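
The Benjamini-Hochberg adjustment is available in base R through `p.adjust`, which is presumably what "as implemented in R" refers to. The sketch below applies it to a made-up vector of module p-values; how VAN calls it internally is not shown in this guide.

```r
# Hypothetical unadjusted p-values for five modules.
p_raw <- c(0.001, 0.008, 0.039, 0.041, 0.20)

# Benjamini-Hochberg (false discovery rate) adjustment.
p_bh <- p.adjust(p_raw, method = "BH")
print(p_bh)
```

Adjusted values are never smaller than the raw ones, so a module that passes a 0.05 cut-off before adjustment may fail it afterwards.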
  • Page 17 correspond to gene symbols. However, if the gene names correspond to Entrez IDs, this parameter should be changed to "ENTREZ". It should be noted that if two expression data files are provided (one for gene expression and another for microRNA expression), then both should contain the same type of gene names, i.e.
  • Page 18: Conversion Of Gene Symbols To Entrez Ids

    Section 8: Conversion of gene symbols to Entrez IDs
    Expression data
    If the expression data and interactome data contain gene labels in different formats, i.e. one corresponds to Entrez IDs and the other to gene symbols, then the gene symbols are mapped to Entrez IDs.
  • Page 19: Understanding Output Data

    Section 9: Understanding output data
    All the output data files are tab-separated and can be viewed using a text editor or MS Excel.
    Enriched Modules
    The function identifySignificantHubs is used to evaluate the network modules (i.e. hubs and their associated interaction partners) for a given combination of expression and interactome data.
  • Page 20: Combining Output Data With Known Cancer Annotation

    Section 10: Combining output data with known cancer annotation
    For cancer-related datasets, we provide a function obtainCancerInfo to facilitate the biological interpretation of the enriched hubs. This function maps the hubs (corresponding to enriched modules) to the catalogue of genes already causally associated with cancer(s), provided that the catalogue file is supplied as an input parameter in Excel format.
  • Page 21: Measures Of Association

    Section 11: Measures of association
    Inside our R package, every row of the expression data is median-centered and its variance is set to one prior to the calculation of the association measure. Before we describe the various association measures implemented in our package, we introduce some notation.
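
The row standardization described above can be sketched in base R as follows. This is a minimal illustration, assuming the sample variance (n - 1 denominator) is the one set to one; the function name `standardize_rows` and the toy matrix are hypothetical, and rows with zero variance would need separate handling.

```r
# Median-center each row, then scale it to unit (sample) variance.
standardize_rows <- function(x) {
  centered <- sweep(x, 1, apply(x, 1, median), FUN = "-")
  sweep(centered, 1, apply(centered, 1, sd), FUN = "/")
}

# Toy expression matrix: rows are genes, columns are samples.
expr <- matrix(c(1, 2, 3, 4, 10,
                 5, 5, 6, 9, 7),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("gene1", "gene2"), paste0("s", 1:5)))

z <- standardize_rows(expr)
print(apply(z, 1, median))  # row medians after standardization
print(apply(z, 1, var))     # row variances after standardization
```

Dividing by a positive constant leaves a median of zero unchanged, so both properties hold at once.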
  • Page 22 We test the null hypothesis that the average change in association between a hub and its interactors (i.e. ρ) is not stronger than expected by chance. For this purpose, we randomly assign the samples to B1 and B2 and recalculate ρ.
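
The permutation idea above can be sketched in base R. The association measure here is an assumption made for illustration (mean absolute difference in Pearson correlation between the hub and each interactor across the two groups); the package's own measures are defined elsewhere in the guide. The function names, the simulated data and `n_perm` are all hypothetical.

```r
set.seed(1)

# Assumed statistic: average absolute change in hub-interactor correlation
# between group B1 and group B2.
assoc_change <- function(hub, interactors, groups) {
  in_b1 <- groups == "B1"
  r1 <- cor(hub[in_b1],  t(interactors[, in_b1,  drop = FALSE]))
  r2 <- cor(hub[!in_b1], t(interactors[, !in_b1, drop = FALSE]))
  mean(abs(r1 - r2))
}

# Permutation p-value: shuffle the group labels and recompute the statistic.
permutation_p <- function(hub, interactors, groups, n_perm = 1000) {
  observed <- assoc_change(hub, interactors, groups)
  permuted <- replicate(n_perm, assoc_change(hub, interactors, sample(groups)))
  (sum(permuted >= observed) + 1) / (n_perm + 1)
}

# Simulated data: 5 interactors that track the hub in B1 only.
n <- 20
groups <- rep(c("B1", "B2"), each = n / 2)
hub <- rnorm(n)
interactors <- matrix(rnorm(5 * n), nrow = 5)
b1 <- groups == "B1"
interactors[, b1] <- interactors[, b1] +
  3 * matrix(hub[b1], nrow = 5, ncol = sum(b1), byrow = TRUE)

p_value <- permutation_p(hub, interactors, groups, n_perm = 200)
print(p_value)
```

Adding one to both numerator and denominator keeps the estimated p-value strictly positive, and more permutations give a finer p-value resolution at the cost of run time.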
  • Page 23: References

    Section 12: References
    Turner B, Razick S, Turinsky AL, Vlasblom J, Crowdy EK, Cho E, Morrison K, Donaldson IM, Wodak SJ: iRefWeb: interactive analysis of consolidated protein interaction data and their supporting evidence. Database: The Journal of Biological Databases and Curation 2010.
  • Page 24
    Taylor IW, Linding R, Warde-Farley D, Liu YM, Pesquita C, Faria D, Bull S, Pawson T, Morris Q, Wrana JL: Dynamic modularity in protein interaction networks predicts breast cancer outcome. Nature Biotechnology 2009, 27(2):199-204.
    Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge YC, Gentry J et al: Bioconductor: open software development for computational biology and bioinformatics.
