Lectures at Centro de Ciencias Genómicas-UNAM, November 14, 2005
Seminar for students:
The essence of multicellularity - Introduction to concepts of
gene network dynamics and attractors
Sui Huang, Children’s Hospital and Harvard Medical School, Boston, USA
Pre-lecture remarks
A hallmark of multicellular organisms (metazoa) is the differentiation of cells into various
cell types, each of which exerts specialized functions within the various organs: liver cells,
brain cells, muscle cells, or skin cells. How do these different cells develop from a
single totipotent fertilized egg cell? The most primitive form of differentiation in evolution
is the differentiation into somatic cells and germline cells in simple metazoa. Somatic cells
are the building blocks of an individual organism, whereas the germline cells form sperm
and eggs and are responsible for reproduction of the organism; hence they serve to
transfer the genetic material to the next generation. In other words, germline cells form a
quasi-immortal line, while somatic cells die with the death of the individual organism.
How do the different cell types differ from each other? Using the nematode
Ascaris as a model for differentiation between somatic and germline cells, Theodor Boveri
found in 1910 that the somatic cells lose part of their genome (as we would call it today),
as evidenced by what he called “chromatin diminution”. Of course, this is not the case for
the germline cells, which have to maintain the entire genome and pass it on to the next
generation. Thus, one could envision a mechanism of cell differentiation in which
specialized somatic cells keep only the genes needed to exert their function and lose the DNA
containing the genes that are not needed, whereas the germline cells would contain and maintain
the entire genomic information. It turned out that the Ascaris mechanism is rather an
exception, applying only to primitive organisms but not to higher organisms, such as vertebrates.
In higher organisms, all somatic cells in the body contain the entire genomic
information, i.e., no genomic DNA is lost during
development from the fertilized egg to a specialized cell of the multicellular adult organism
(with some notable exceptions). All genes are present, so to speak, as dormant instructions
in all cells. It is their activation or, in molecular biology terms, expression that differs
from cell type to cell type. Expression of a gene (a segment of genomic DNA) involves the
transcription of the gene into an mRNA molecule (“transcript”), which serves as a template
for translation into a protein, which executes the instruction encoded by the gene.
To understand the differential expression of genes between cell types, imagine a
single cell as a computer. The body would then be made up of zillions of tiny little
computers. Then, think of the nucleus of each cell, with the chromosomes containing the
DNA that defines the genome, as the hard drive: every cell, be it a liver or a brain cell, would
contain the same, entire genomic information in its nucleus, i.e., they would all have the
same hard drive: they are clones of the very same computer. However, although each
cell/computer has all software programs, different application programs are opened and
loaded into the memory in the different cell types because they work on different “projects”
that require different programs. A program that is open is the equivalent of a gene that is
expressed. Some programs have specific functions while others, such as the operating
system and basic communication software, are loaded in every computer. In biological
terms, each cell type in the body expresses a different set of genes. Genes that are
expressed only in the liver are called “liver-specific genes”. For instance, alcohol
dehydrogenase, or ADH, is an enzyme that degrades alcohol and thus exerts a metabolic
function of the liver. The gene that encodes ADH is in every cell, but it is transcribed and
translated only in liver cells. In contrast, genes that fulfill basic functions, akin to programs
of the operating system (OS) in computers, are expressed in every cell. For instance, the
gene that encodes the protein actin, which acts as a building block for the cytoskeleton,
or the genes that govern protein synthesis, are expressed in every cell. Such OS-like genes
are called “housekeeping genes”. Thus, overall, the profile of expressed proteins in a cell
is distinct for each cell type, but the profiles overlap because of the shared housekeeping genes.
In summary, a cell type, or more generally, a phenotypic cell state (such as proliferating,
senescent, apoptotic, differentiated cells) is characterized by a gene expression profile
unique to that cell state.
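
The idea of overlapping expression profiles can be sketched with plain sets. This is only an illustration: ADH and actin come from the text above, while the other gene labels (`protein_synthesis_genes`, `hypothetical_brain_gene`) are placeholders invented for the example, not real gene names.

```python
# Housekeeping genes: expressed in every cell type (the "operating system").
housekeeping = {"actin", "protein_synthesis_genes"}  # second label is a placeholder

# Each cell type expresses the housekeeping set plus its own specific genes.
liver_profile = housekeeping | {"ADH"}
brain_profile = housekeeping | {"hypothetical_brain_gene"}  # placeholder name

# The overlap between two profiles is the shared housekeeping set.
shared = liver_profile & brain_profile

# Genes expressed in liver but not in brain are the liver-specific ones.
liver_specific = liver_profile - brain_profile

print("shared:", sorted(shared))
print("liver-specific:", sorted(liver_specific))
```

The two profiles are distinct, yet their intersection is exactly the housekeeping set, which is the overlap described above.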
The human genome contains roughly N = 30,000 genes (a rather conservative
estimate). How many gene expression profiles can there possibly be? Let’s make the
following simplifying assumption: each gene is either expressed (i.e., the protein it
encodes is present in the cell) or not expressed (repressed). Let’s symbolize each gene
as a bit, G_i, where i = 1, 2, ..., N; then G_i = 1 means “gene i is turned ON (expressed)”
and, accordingly, G_i = 0 represents the OFF (repressed) state of that gene. Each string
S of length N, e.g., S = {1010110...}, represents a genome-wide gene expression
profile. With N = 30,000 such binary genes we have 2^30000 ≈ 10^9000 possible
configurations of strings, or genome-wide gene expression profiles. (Remember that we
have made a simplifying assumption of ON/OFF genes; the reality is much more
complicated and thus ‘worse’: each gene can have many levels of expression; moreover, its
transcript can be differentially spliced to give rise to not one but several proteins, which
themselves are subject to post-translational modification.) Each of these gene
expression profiles S is a unique configuration and could theoretically represent a
cell type. Despite our simplification, the number of gene expression profiles we can get,
about 10^9000, is an astronomical number! (Compare: there are ~10^80 protons in the universe.) In
other words, we would have an almost endless continuum of cell types. Many cell types
would be very similar to each other. For instance, the one specified by the string
{00000...001} would be almost indistinguishable from the one specified by {00000...011}.
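
The counting argument can be checked with a few lines of Python. This is a minimal sketch under the ON/OFF assumption made above; the function names are chosen for this example only. It computes the order of magnitude of 2^N for N = 30,000 (since 2^N = 10^(N · log10 2), the exponent comes out near 9031) and measures how similar two profiles are by counting the genes in which they differ (the Hamming distance).

```python
import math


def log10_num_profiles(n_genes: int) -> float:
    """Return log10 of the number of ON/OFF profiles, i.e. log10(2**n_genes)."""
    return n_genes * math.log10(2)


def hamming_distance(profile_a: str, profile_b: str) -> int:
    """Count the positions (genes) at which two equal-length bit strings differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return sum(a != b for a, b in zip(profile_a, profile_b))


N = 30_000
exponent = log10_num_profiles(N)
print(f"2^{N} is about 10^{exponent:.0f}")

# Two shortened example profiles that differ in a single gene.
print("genes that differ:", hamming_distance("0000001", "0000011"))
```

A Hamming distance of 1 out of N genes is what "almost indistinguishable" means in this picture.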
The reality, however is different: we don’t observe a quasi-continuum of cell
phenotypes, but clearly, we see distinct and almost discrete entities: liver cells, brain cells;
proliferating cells or dying cells - and no intermediates. Even if there may be different
subtypes of liver cells, they are all similar to each other and collectively, much more distinct
from, say a nerve cell. Thus, despite sharing the identical genomic information, cells have
their own type identities and do not easily morph from one type into another. In other
words, we have a finite number of distinct cell (pheno)types. This discreteness was first
noticed by the embryologist Conrad H. Waddington in the 1940s, who spoke of “well-recognizable
types” whereas “intermediates are rare and unstable”. In modern cell biology
textbooks, an often encountered estimate is that the human body (which contains roughly
10^15 cells) has around 200 cell types. Today, thanks to the molecular analysis of entire gene
expression profiles using DNA microarray technology (e.g., GeneChips™), we know that
this is clearly an underestimate. As mentioned above, many cell types actually encompass
a set of subtypes. For instance, endothelial cells, which form the lining of the blood
vessels, are distinct from each other in different parts of the body. Nevertheless, it is
intriguing that there is such a low, finite number of quasi-discrete cell types given the vast
combinatorial possibilities of gene expression profiles.
Where does the discreteness of cell type entities come from? How can a totipotent
zygote generate the diversity of distinct cell types? And why are cell types in general so
stable, instead of gradually “drifting” into one another? The main reason why an egg can
so robustly develop into a multicellular organism with distinct and stable cell types is that
gene expression is regulated. Often the expression of one gene is regulated by the gene
product of another gene. Thus, the genes are influencing each other’s expression: the 0's
and 1's in the string S cannot flip independently. These regulatory interactions among the
genes of a genome form an almost genome-wide gene regulatory network. Concretely,
a gene can encode a protein that acts as a transcription factor which specifically activates
or represses the expression of another gene. Such gene-gene interactions lead to “dynamic
constraints” or “network frustrations”, so that only a minority of the roughly 10^9000
combinatorially possible profiles are realizable. In other words: the vast space of possible
expression profiles collapses into a small number of stable profiles - the discrete cell states.
In the 1960s, Kauffman introduced the concept of random Boolean networks to
study the fundamental aspects of a system consisting of a large number of interacting elements, e.g.,
genes. He showed that under some circumstances (e.g., certain schemes of ‘wiring’ of the
genes), the network will not behave “chaotically” (despite the random, irregular regulatory
interactions between the genes) but will exhibit ordered behavior in the form of discrete
“attractor states” - which mirror the properties of the discrete cell types.
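
To make the notion of attractor states concrete, here is a small sketch of a random Boolean network in Python. It is an illustration under assumptions, not Kauffman's original setup: the network size (10 genes), the number of inputs per gene (K = 2), the synchronous update rule, and the random seed are all choices made for this example. Each gene reads the ON/OFF state of its input genes and looks up its next state in a random truth table. Because the state space is finite and the update is deterministic, every trajectory must eventually revisit a state and then cycle; those cycles are the attractors.

```python
import itertools
import random


def make_random_network(n_genes, k_inputs, seed):
    """Give each gene k random input genes and a random Boolean truth table."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_genes), k_inputs) for _ in range(n_genes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)]
              for _ in range(n_genes)]
    return inputs, tables


def step(state, inputs, tables):
    """Update all genes synchronously; state is a tuple of 0/1 values."""
    new_state = []
    for gene_inputs, table in zip(inputs, tables):
        index = 0
        for g in gene_inputs:
            index = (index << 1) | state[g]  # encode input states as an integer
        new_state.append(table[index])
    return tuple(new_state)


def find_attractors(n_genes, inputs, tables):
    """Follow every initial state until it repeats; collect the resulting cycles."""
    attractors = set()
    for start in itertools.product((0, 1), repeat=n_genes):
        first_seen = {}
        state, t = start, 0
        while state not in first_seen:
            first_seen[state] = t
            state = step(state, inputs, tables)
            t += 1
        cycle_start = first_seen[state]
        cycle = frozenset(s for s, time in first_seen.items()
                          if time >= cycle_start)
        attractors.add(cycle)
    return attractors


n_genes = 10
inputs, tables = make_random_network(n_genes, k_inputs=2, seed=1)
attractors = find_attractors(n_genes, inputs, tables)
print(f"{2 ** n_genes} possible states, {len(attractors)} attractor(s)")
for cycle in attractors:
    print("attractor with", len(cycle), "state(s)")
```

All 2^10 = 1024 possible profiles drain into the attractors that the script lists, which are necessarily far fewer than the number of states when trajectories converge. This is the sense in which regulatory interactions collapse the space of possible profiles into a few stable ones.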
In these two lectures, you will learn some generic concepts that help explain how
gene regulatory networks lead to the emergence of discrete behavior, which underlies the
very basic nature of cell types in multicellular organisms. Such knowledge will be useful for
a future “systems biology” view of multicellular organisms, and specifically, for stem cell
biology. Therefore, it is paramount to understand the generic natural principles of
differentiation and cell type identity in addition to simply “listing” the genes that
participate in determining particular cell types.
What you will get out of these lectures (learning objectives):
• Knowledge of basic principles of non-linear dynamics in molecular pathways that
  generate discrete responses, such as cell identities. Familiarity with the concepts of
  bistability and multistability.
• Adoption of an overarching conceptual framework of cell regulation that provides
  deeper insight when reading “all these signaling pathway papers”.
• Understanding of generic features of cell fate regulation, in particular, stem cell
  differentiation.
How you will get to that level of understanding:
The central question that you will learn to answer is how a continuous (gradually changing)
system produces discrete, qualitatively distinct behavior. The lectures will take the following
small steps, which can be followed by anyone with a basic calculus background, to achieve the
above learning objectives:
• Introduction to simple one- and two-dimensional non-linear systems and their
  qualitative behavior, notably multistability and bifurcations.
• Brief introduction to the dynamics of large networks (discrete, high-dimensional dynamics).
• Discussions about multicellularity, stem cell potential, the trans-differentiation debate,
  and tumorigenesis.
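
As a preview of how a continuous system can produce discrete behavior, consider a minimal one-dimensional example. The model below is a hypothetical illustration chosen for this sketch, not taken from the lectures: a gene product x activates its own production through a sigmoidal term and is removed at a constant rate, dx/dt = x^2 / (1 + x^2) - 0.4 x. Setting dx/dt = 0 gives three steady states, x = 0, x = 0.5 and x = 2. The outer two are stable and the middle one is unstable, so the system is bistable.

```python
def rate_of_change(x, decay_rate=0.4):
    """dx/dt for a self-activating gene product (hypothetical toy model).

    Production is sigmoidal, x^2 / (1 + x^2); removal is linear in x.
    """
    return x ** 2 / (1.0 + x ** 2) - decay_rate * x


def simulate(x0, dt=0.01, n_steps=10_000):
    """Integrate dx/dt with the forward Euler method; return the final value."""
    x = x0
    for _ in range(n_steps):
        x += dt * rate_of_change(x)
    return x


low_state = simulate(0.4)   # starts just below the unstable point x = 0.5
high_state = simulate(0.6)  # starts just above it
print(f"start at 0.4 -> ends near {low_state:.3f}")
print(f"start at 0.6 -> ends near {high_state:.3f}")
```

Two nearly identical starting values end up in qualitatively different states (OFF near 0, ON near 2), with no stable intermediate in between.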
Suggested preparatory reading:
• This overview - pay attention to the terms in bold italics that will be used and discussed
  in the lectures.
• Original results to be presented in the main lecture:
  - Huang S, Eichler G, Bar-Yam Y, Ingber DE: Cell fates as high-dimensional attractor
    states of a complex gene regulatory network. Phys Rev Lett 2005,
    94(12):128701.
• Recent overview/introductory articles:
  - Huang S: Back to the biology in systems biology: what can we learn from
    biomolecular networks. Brief Funct Genomics Proteomics 2004, 2(4):279-297. This
    is more a critical commentary on how networks are treated in modern biology.
  - Huang S: Multistability and Multicellularity: Cell Fates as High-dimensional
    Attractors of Gene Regulatory Networks. In: Computational Systems Biology.
    Edited by Kriete A, Eils R. Elsevier Academic Press; 2005. This is a fairly comprehensive
    introductory overview.
© Sui Huang, Nov 2005