Clustering of co-expressed genes based on RNA-Seq data
_______________________________________________________________________
Position: Post-doctoral
Duration: 12 months
Starting date: September 2016
Environment: The candidate will be based at the Toulouse Mathematics Institute
(Institut de Mathématiques de Toulouse, IMT) in Toulouse, France
Contact: Cathy Maugis-Rabusseau ([email protected])
Andrea Rau ([email protected])
MixStatSeq ANR project: http://perso.math.univ-toulouse.fr/maugis/mixstatseq/
______________________________________________________________________
Topic: Significant advances in next-generation sequencing technologies have made RNA
sequencing (RNA-seq) a popular choice for studies of gene expression. Although
microarrays and RNA-seq both aim to characterize transcriptional activity, the statistical
tools developed for the analysis of the former are ill-suited to the latter, and methodological
developments specific to RNA-seq data have been an active area of research in recent
years. In the French National Research Agency (ANR) project MixStatSeq, we are
interested in detecting clusters of co-expressed genes that share similar expression
profiles across several experimental conditions from RNA-seq data. Identifying these
groups of co-expressed genes is of great biological interest, as they may share similar
transcriptional regulatory mechanisms. In addition, such co-expression analyses present
a variety of statistical challenges; in the MixStatSeq project, we focus on the use of
model-based clustering methods to explore RNA-seq data, but several obstacles must still be
addressed in this context.
The post-doctoral researcher will first focus on identifying the most appropriate strategy to
adopt for co-expression analyses of RNA-seq data, including the choice of appropriate
transformations and mixture model collections for RNA-seq data, as well as the definition
of a suitable criterion for selecting the number of clusters present in the data. Second, the
post-doctoral researcher will focus on the comparison and aggregation of related
co-expressed gene clustering results from RNA-seq, microarray, and functional annotation
data to improve the biological interpretability and robustness of co-expression analyses.
Throughout this work, novel statistical or computational developments, such as an R
package with graphical tools for data visualization, are expected to be produced as
needed. The post-doc will make use of publicly available RNA-seq data as well as data
generated in the Animal Genetics and Integrative Biology (GABI) research unit at INRA.
Keywords: clustering, mixture models, model selection, clustering aggregation, RNA-seq
and microarray datasets
Skills: The candidate should have a Ph.D. or equivalent degree in biostatistics or statistics
by the start date and written proficiency in English. We are looking for a highly skilled
candidate who is strongly motivated by challenging research topics and
applications in biology. Strong programming skills in R are expected, and some experience
in Python would be appreciated. Familiarity with RNA-seq data or mixture models is
desirable but not required.
Additional information:
To apply, send an email to Cathy Maugis-Rabusseau with a CV and a letter of motivation
describing your background and interest in the project, and the names of two references.