FAQ - Using IPA

FAQs about Using IPA

To assist you with using the features of IPA, here are answers to some frequently asked questions.

Core, Tox, and Metabolomics Analysis FAQs

What file formats can I use to upload my data into IPA?
IPA accepts Excel 1997-2003 files for data upload. Please see Data Upload Workflow for more details.

What species identifiers are accepted for analysis by IPA?
IPA supports upload of identifiers from the following species:
Arabidopsis thaliana
Bovine
Caenorhabditis elegans
Chicken
Chimp
Danio rerio
Dog
Drosophila melanogaster
Human
Mouse
Rat
Rhesus Monkey
Saccharomyces cerevisiae
Schizosaccharomyces pombe
IPA also accepts chemical identifiers. For more information, see Data Upload Definitions.

Why does IPA map my input molecules?
Mapping takes the identifiers from your dataset file and compares them to all of the molecules in the Ingenuity Knowledge Base. This process unambiguously identifies the input molecules and ensures that the correct molecules are considered for the analysis.

Can datasets containing multiple identifier types be mapped?
Yes. For more details, please see the Data Upload Workflow.

Why don’t all of the molecules in my dataset map to the Ingenuity Knowledge Base?
This could be due to one of several reasons:
1. The gene ID does not correspond to a known gene product. For example, most ESTs are not found in the Ingenuity Knowledge Base (exception: ESTs that have a corresponding Entrez Gene identifier are found in the Ingenuity Knowledge Base).
2. There are insufficient Findings in the literature regarding this molecule.
3. Findings for this molecule have not been entered in the Ingenuity Knowledge Base.
4. A gene/protein ID corresponds to several loci or more than one gene. Such identifiers are left unmapped in the application because their identity is ambiguous.

I re-ran an analysis and now some of my identifiers that used to map do not. Why?
We are now unable to clearly disambiguate the mapping.
This can be caused by the identifier mapping to several genes, with no clear way of determining which mapping should be favored over another. Alternatively, the identifier that we previously used to disambiguate the mapping has now been deprecated. Lastly, in some cases the use of a specific identifier may be deprecated by the vendor.

What are Network Eligible molecules? What are Functions/Pathways Eligible molecules?
Network Eligible molecules are molecules from your dataset that meet the following criteria:
1. They have been designated as being of interest (e.g. by the values in the Expression Value, Absent, and Override columns).
2. They interact with other molecules in the Ingenuity Knowledge Base.
Functions/Pathways Eligible molecules are molecules from your dataset file that meet the following criteria:
1. They have been designated as being of interest (e.g. by the values in the Expression Value, Absent, and Override columns).
2. They have at least one functional annotation or disease association in the Ingenuity Knowledge Base.

Why aren’t all the IDs in my dataset that meet the expression value cutoff selected as Network Eligible molecules?
Only those molecules that have demonstrated relationships to other genes, proteins, or endogenous chemicals can be integrated into the analysis. In other words, a molecule that is not known to have a relationship with any other molecule cannot be incorporated into a network. In addition, microarrays frequently include multiple Gene IDs that correspond to the same gene; although these would all be mapped, they refer to a single gene. Finally, in some cases this information may not reside in the Ingenuity Knowledge Base at this time.

How may the number of Network Eligible molecules be increased?
The number of Network Eligible molecules may be increased by:
1. Expression Value Cutoff: Change the cutoff value to include more molecules.
2. Override: Annotate more molecules with an "X" in the Override column of the input file.
3. Focus On: Include both upregulated and downregulated molecules instead of only one or the other.
4. Absent: Specify fewer molecules that meet the cutoff as absent.
5. Additional IDs: Add identifiers to your input file that meet the cutoff or override criteria.

What if my molecule-of-interest is not in the Ingenuity Knowledge Base?
IPA may provide other information about molecules that are not Network Eligible or Functions/Pathways Eligible, such as subcellular localization, tissue expression, and protein family membership. This information is available for all mapped identifiers and is contained in the Gene View (for genes) or Chemical View (for chemicals). Molecules that are not eligible for network generation or that are not yet incorporated in the Ingenuity Knowledge Base may be added as custom nodes to networks and pathways by using the Add Molecule feature.

Is there a cutoff for the number or size of networks generated in an analysis?
In IPA, there are network size parameters that enable you to select the number of molecules per network and the number of networks that are returned. You have the flexibility to build larger networks, consolidating key molecular events and highlighting central regulatory molecules. You can also build smaller networks if you prefer to home in on key events. Within the Create Analysis page (Core, Tox, and Metabolomics Analysis), you can select the size and number of networks you would like generated for a dataset. Networks containing 35, 70, or 140 molecules can now be created. The network generation algorithm will pull in molecules from your dataset and the Ingenuity Knowledge Base based on molecular relationships until it reaches the network size limit you have specified. Depending on the dataset, not all networks generated will have the maximum number of molecules.
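
To illustrate why a network may stop short of the size limit, here is a toy sketch in Python. It is not IPA's algorithm: it simply grows a network breadth-first from seed molecules and stops at the cap or when no more related molecules remain. The function name grow_network, the neighbors dictionary (a stand-in for relationships in the Ingenuity Knowledge Base), and the toy data are all assumptions made for illustration.

```python
from collections import deque

ALLOWED_SIZES = (35, 70, 140)  # network sizes named in this FAQ


def grow_network(seeds, neighbors, max_size):
    """Toy breadth-first growth from seed molecules, capped at max_size.

    `neighbors` is a hypothetical dict mapping a molecule to the molecules
    it has relationships with; it stands in for the Knowledge Base.
    """
    if max_size not in ALLOWED_SIZES:
        raise ValueError(f"max_size must be one of {ALLOWED_SIZES}")
    network = []
    seen = set()
    queue = deque(seeds)
    while queue and len(network) < max_size:
        molecule = queue.popleft()
        if molecule in seen:
            continue
        seen.add(molecule)
        network.append(molecule)
        # Queue related molecules that have not been added yet.
        queue.extend(n for n in neighbors.get(molecule, ()) if n not in seen)
    return network


# Hypothetical toy data: a chain of 50 molecules M0 - M1 - ... - M49.
toy_neighbors = {f"M{i}": [f"M{i + 1}"] for i in range(49)}
capped = grow_network(["M0"], toy_neighbors, 35)      # stops at the size limit
exhausted = grow_network(["M40"], toy_neighbors, 35)  # runs out of relationships
print(len(capped), len(exhausted))
```

In the toy data, the first call hits the cap, while the second ends early because the seed has too few reachable molecules, which mirrors the point that not every network reaches the maximum size.
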
You can specify how many networks should be returned for an analysis.
If you choose to create networks with 35 molecules, you may generate 10, 25, or 50 networks. Note: 25 networks is the default setting for networks with 35 molecules.
If you choose to create networks with 70 molecules, you may generate 10 or 25 networks. Note: 10 networks is the default setting for networks with 70 molecules.
If you choose to create networks with 140 molecules, you may generate 10 or 25 networks. Note: 10 networks is the default setting for networks with 140 molecules.

Are networks pre-computed and compared to the dataset or are they generated de novo based on the input data?
Networks are generated de novo and are dependent upon the input data. For example, by setting the cutoff value higher you can focus on networks that are centered around the most differentially regulated genes in your dataset.

Network Generation Algorithm FAQs

What are the steps in the Network Generation Algorithm?
1. The user designates molecules of interest on the Create Analysis page before running the analysis. Molecules of interest that interact with each other and with molecules in the Ingenuity Knowledge Base are identified as Network Eligible Molecules. Network Eligible Molecules serve as "seeds" for generating networks.
2. Network Eligible Molecules are combined into networks that maximize their specific connectivity, which is their interconnectedness with each other relative to all molecules they are connected to in the Ingenuity Knowledge Base.
3. Additional molecules from the Ingenuity Knowledge Base are used to specifically connect two or more smaller networks by merging them into a larger one. Networks can be built with 35, 70, or 140 molecules each to keep them to a usable size. (Note: You may select to include endogenous chemicals in networks on the Create Analysis page. If this is deselected, only genes, RNA, or proteins will be used for Network Generation.)
4. Networks are scored based on the number of Network Eligible Molecules they contain. The higher the score, the lower the probability of finding the observed number of Network Eligible Molecules in a given network by random chance.
For more details, see the IPA Network Generation Algorithm whitepaper.

How does the Network Generation Algorithm work if I have uploaded a list of molecules without expression values?
IPA considers all Network Eligible Molecules on your list to be of equal importance when generating networks for gene lists. Network Eligible Molecules are uploaded genes that have interactions with other molecules in the Ingenuity Knowledge Base. See "What are the steps in the Network Generation Algorithm?" above.

Are all relationships displayed in a network for a particular set of molecules?
Networks show relevant relationships as specified by the Analysis Components settings in the Create Analysis page when the analysis was run. The relationships that are displayed are direct interactions (two molecules that make physical contact with each other, such as binding or phosphorylation) and indirect interactions (which do not require physical contact between the two molecules, such as signaling events). Some Findings present in the Ingenuity Knowledge Base are not used in the network generation process, such as localization, expression, and mutant information. You may often find these additional relationships helpful and biologically important. There are several ways to view additional relationships:
1. You can view the full complement of direct and indirect interactions for a gene by double-clicking it to see its Node View summary and then clicking the Neighborhood Explorer link. See the Mutant Information section of the Node View for interactions involving functionally mutant forms of the gene.
2. You may add interactions to a network by clicking the Build button and using the Grow (to add new molecules) or Connect tools after selecting molecules of interest.
Alternatively, you can add custom interactions by clicking the Draw button and using the Add Relationship feature.
3. You may select multiple networks of interest in the Networks tab and then click the Merge Networks button to combine them into one network, which adds and highlights all interactions between genes in different local networks.

How does IPA use my expression values in the Network Generation Algorithm?
If you set a cutoff value in the Create Analysis page, IPA compares the expression values of your genes to it to identify the Network Eligible Molecules. Expression values are also used along with specific connectivity to prioritize addition of molecules that are not Network Eligible into networks.

What is the "Score" for a network, how is it calculated, and how should I interpret this?
The score is a numerical value used to rank networks according to their degree of relevance to the Network Eligible Molecules in your dataset. The score takes into account the number of Network Eligible Molecules in the network and its size, as well as the total number of Network Eligible Molecules analyzed and the total number of molecules in the Ingenuity Knowledge Base that could potentially be included in networks. In the Networks view, networks are ordered according to their score, with the highest scoring network displayed at the top of the page.
The network Score is based on the hypergeometric distribution and is calculated with the right-tailed Fisher's Exact Test. The score is the -log10(Fisher's Exact Test result). For example, suppose that a network of 35 molecules has a Fisher's Exact Test result of 1 x 10^-6. The network's Score = -log10(Fisher's Exact Test result) = 6. This can be interpreted as, "There is a 1 in a million chance of getting a network containing at least the same number of Network Eligible molecules by chance when randomly picking 35 genes that can be in networks from the Ingenuity Knowledge Base".
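
The calculation described above can be sketched in Python using only the standard library. This is a reading of the FAQ text, not IPA's implementation: the function name network_score and its parameter names are assumptions, and the whitepaper remains the authoritative definition. The right tail is the hypergeometric probability of drawing at least the observed number of Network Eligible molecules when picking a network-sized set at random.

```python
from math import comb, log10


def network_score(k_in_network, network_size, total_eligible, population):
    """Return -log10 of the right-tailed hypergeometric probability P(X >= k).

    k_in_network   : Network Eligible molecules observed in the network
    network_size   : molecules in the network (e.g. 35)
    total_eligible : Network Eligible molecules analyzed
    population     : Knowledge Base molecules that could be in networks
    (All four names are illustrative assumptions.)
    """
    upper = min(network_size, total_eligible)
    # Count the draws that contain k, k+1, ..., upper eligible molecules.
    tail = sum(
        comb(total_eligible, i)
        * comb(population - total_eligible, network_size - i)
        for i in range(k_in_network, upper + 1)
    )
    p_value = tail / comb(population, network_size)
    return -log10(p_value)


# Hypothetical numbers: 10 of 35 network molecules are eligible, out of
# 200 eligible molecules in a population of 10,000.
print(round(network_score(10, 35, 200, 10000), 2))
```

With these inputs, a network that contains more Network Eligible molecules than expected by chance gets a higher score, and a network with none gets a score of zero.
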
The score is not an indication of the quality or biological relevance of the network; it simply calculates the approximate "fit" between each network and your Network Eligible Molecules.

How do "hub" molecules affect the Network Generation Algorithm?
The network generation algorithm optimizes for specific connectivity, so when a "hub" molecule is included in a network it connects a higher fraction of Network Eligible molecules relative to all genes the "hub" gene is connected to. Biologically, many such "hub" genes exist in multiple protein complexes, and this is represented in networks as many genes connected to the "hub" gene rather than many protein complex nodes. If you see a "hub" molecule that is not a Network Eligible Molecule and believe it is unlikely to be present or active in your biological context, we recommend flagging the gene as Absent in your input file, which will cause the algorithm to exclude it. Additionally, "hub" molecules often have many indirect molecular signaling effects, so running analyses with direct interactions only also reduces the likelihood of hub genes by including them only in cases where they directly physically interact with Network Eligible molecules.
One way to see the specific connectivity significance of a hub molecule is to add it to a new MyPathway. Use the Grow function with the same relationship types (e.g. direct and indirect interactions or direct only) and molecule types as in the analysis to get the total number of molecules. Then overlay the expression values from the analysis to determine the number of nodes that are Network Eligible molecules.

Why do I get significant network scores when I submit a random list of molecules into IPA?
The purpose of IPA's network generation algorithm is to find networks of highly connected Network Eligible molecules.
If you submit molecules chosen at random, IPA will still do its best to bring as many Network Eligible Molecules into a single network as possible, because it assumes that these molecules are of some interest to you. If the number of Network Eligible Molecules is large, resulting networks can often receive a high score because of the generally high interconnectivity of molecules in the Ingenuity Knowledge Base. When looking at the List of Networks generated for a random list of molecules, the highest-scoring network typically has a lower score than that of a non-random list. Additionally, the distribution of network scores (e.g. when comparing sorted order) typically is lower and falls off more sharply for random networks relative to actual data (see the algorithm whitepaper).
It's important to keep the biological context in mind when evaluating the networks, since the goal of the algorithm is to come up with the best hypotheses it can about how the Network Eligible Molecules may be interacting biologically, using the principle of specific connectivity. The algorithm assumes there are some biological commonalities and attempts to identify and highlight them for you. The score and the processes associated with each network are intended to help you identify the most striking and relevant networks given your biological context. The network score is not intended to prove that a particular network represents what is happening in a biological system. It only indicates that the network is rare relative to all hypotheses the algorithm could come up with.

References
1.) For more information on the IPA Network Generation Algorithm, click here.
2.) For a peer-reviewed explanation of the IPA Network Generation Algorithm, see: Calvano SE, Xiao W, Richards DR, Felciano RM, Baker HV, Cho RJ, Chen RO, Brownstein BH, Cobb JP, Tschoeke SK, Miller-Graziano C, Moldawer LL, Mindrinos MN, Davis RW, Tompkins RG, Lowry SF; Inflamm and Host Response to Injury Large Scale Collab. Res. Program. "A network-based analysis of systemic inflammation in humans". Nature. 2005 Oct 13;437(7061):1032-7. PMID: 16136080.