L07v01a complete export
[00:00:00.00]
[00:00:01.08] SPEAKER: Hi there. In today's series of videos, we're going to talk about
gene-specific transcription. In the previous class, we talked about general features of transcription and
translation: what, in general, has to happen for genes to get turned on or off. But it's not
enough to be able to turn genes on or off. You have to turn on selective genes when you want
them, and keep others off when you need them to be off.
[00:00:27.82] So this is about a higher level of control. And the very basic switch in that control
is a DNA-binding protein binding to a specific DNA sequence to turn on a gene that's right in
that locality. Call this a lock and a key. The pieces have to fit perfectly. They're made to match
each other. And that's how a gene can be activated.
[00:00:56.12] We'll look at some of the molecular details, in terms of how proteins contact DNA,
which I say is reading the code of the DNA in the major and minor grooves to recognize this is
the gene that I'm supposed to turn on. And then lastly, in this video we'll talk extremely briefly
about some different classes of DNA binding proteins or transcription factors.
[00:01:22.38] And this slide sort of highlights the point. Let's start with the carrot. Here we have
a single cell with one nucleus, with one copy of the genome, and yet all the information is there
so that it, over time, develops from a young embryo, young plant, finally to a mature carrot plant.
This is a timed orchestration of turning genes on and off so they produce the proper proteins at
the proper time, the proper molecules. They build the proper structures, and in the end, allow the
life cycle to continue. And they have produced an adult functioning organism.
[00:02:07.51] The frog and the cow examples focus in on the cloning techniques, and we'll
discuss those later in the course. But it's the same question: how do the various instructions that
are contained in the DNA get deployed in a time-dependent manner to produce the final result of
an adult organism?
[00:02:31.52] So here we see a slide again. Six levels where genes can be controlled and their
functions modulated so that you can have, in the end, an active protein doing a job that you want
it to do. We'll talk mostly today about this very first step, transcriptional control. Because of its
primacy, it's one of the best studied of these features. And it is, as you know, in essence, of your
20,000 to 25,000 genes, which ones are we going to turn on at this particular time to perform the
next steps that we need to do in order to respond to the environment or carry out the plan that is
inherent in our genome?
[00:03:17.97] Once you've made the RNA, there's lots of other ways to influence this process of
getting to the active protein, processing the RNA, transporting the RNA, binding to it,
localization, making it free and available, or sequestering it from being used. There is control at
how much protein is made for each mRNA, how actively or quickly the mRNA is degraded, how
folded proteins' activity is controlled. Because an inactive versus an active protein can produce
completely different results. And by the second part of this course, we will have discussed all of
these in pretty decent detail.
[00:04:10.55] OK, transcriptional control. Two basic components, a lock and key. The biological
equivalents are a short stretch of a defined DNA sequence. This is the gatekeeper for a gene. And
a gene regulatory protein that recognizes and binds to that sequence. This is the key that's sort of
opening the lock or turning the gene on.
[00:04:42.51] So how will the proteins read the sequence that is present? We know-- this is a
picture from a previous slide-- how DNA recognizes itself. A and T participate in hydrogen
bonding. And T recognizes an A by two hydrogen bonds. And C recognizes G by three hydrogen
bonds. But the protein is not able to recognize these bases based on this base pairing. It is in the
center here of the DNA double helix, and they are occupied with each other. That is not
available.
[00:05:21.12] What is available to the proteins is the ends that are exposed in the major and
minor grooves. And although it's poorly drawn, this is what the protein sees looking at the
end-- this is what's in the major groove. Same here. Down here, this side is, in the way it's drawn in
this book, which is a poor representation of the actual geometry, what's seen in the
minor groove.
[00:05:56.14] So let's look here. Let's look at a particular slice, right here, end-on at a base
pair. You could see hydrogens; carbons, the darker blue; nitrogens, the lighter blue; carbon,
hydrogen, nitrogen, and oxygen. And this pattern, which we will schematize on the next slide, is
how DNA binding protein recognizes whether it is a G, a C, an A, or a T, at that location. And it
has the ability to query several base pairs at a particular time. You'll see, because of the geometry
of the double helix, because it is twisting, that it's going to be hard for a single protein, for
instance, to recognize 20 consecutive bases.
[00:06:58.64] Because of the approach, it might be able to recognize quite a few. But it might
recognize five here, down here, and five here, just because that is the face of the DNA that is
exposed to the protein. And here we make that clearer. This is drawn to better perspective. And
it's color coded in a very convenient way. So let's imagine the protein is approaching this GC
base pair from the major groove. It'll see a hydrogen, nitrogen, oxygen, a hydrogen, a hydrogen,
and a hydrogen.
[00:07:39.09] Also note that, if we go left to right, a GC base pair will look different to it than a
CG base pair. So the proteins distinguish between these two binding situations. If we compare
the GC to the AT, we can see, for instance, that if the protein is looking at a G, it's going to
see an oxygen here where, for an A, it's seeing a hydrogen in sort of the complementary position. And then
there's this methyl group, a hydrophobic group, for the T base, which is not present for C. So this
is the way that proteins can recognize these differences in DNA sequence without trying to
access the hydrogen bonding, which we think of as synonymous with defining the G base.
[00:08:33.66] And now, we've completely schematized the view. The helix of the DNA is
running up and down. And we're looking in the major groove from the side. And we'll see for a
GC base pair: hydrogen bond acceptor, acceptor, donor, and hydrogen atom. And you can see
that the patterns for the other base pairs are all unique.
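
To make the schematic concrete, here is a minimal sketch in Python of the major-groove "readout" as a lookup table. The GC pattern (acceptor, acceptor, donor, hydrogen) is the one read off the slide above. The other three patterns are the standard textbook assignments and are filled in here as an assumption, since the slide itself is not reproduced in this transcript. The names MAJOR_GROOVE, all_patterns_unique, and identify_base_pair are made up for illustration.

```python
# Major-groove patterns, read left to right across each base pair.
# A = hydrogen-bond acceptor, D = hydrogen-bond donor,
# H = nonpolar hydrogen atom, M = methyl group.
# "GC" comes from the lecture; the other three are standard textbook
# assignments, included here as an assumption.
MAJOR_GROOVE = {
    "GC": "AADH",
    "CG": "HDAA",
    "AT": "ADAM",
    "TA": "MADA",
}


def all_patterns_unique(patterns):
    """Return True if no two base pairs present the same pattern."""
    return len(set(patterns.values())) == len(patterns)


def identify_base_pair(pattern):
    """Return the base pair that shows this pattern, or None if no match."""
    for pair, known_pattern in patterns_items():
        if known_pattern == pattern:
            return pair
    return None


def patterns_items():
    """Small helper so the lookup always reads from MAJOR_GROOVE."""
    return MAJOR_GROOVE.items()


if __name__ == "__main__":
    print(all_patterns_unique(MAJOR_GROOVE))
    print(identify_base_pair("AADH"))
    # A CG pair is a GC pair seen in the opposite left-to-right order.
    print(MAJOR_GROOVE["CG"] == MAJOR_GROOVE["GC"][::-1])
```

Because the four strings are all different, a protein that can sense acceptors, donors, hydrogens, and methyl groups in the major groove has enough information to tell the four base-pair arrangements apart, including GC from CG.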
[00:09:02.96] This is the DNA. This is the ones and zeros, essentially, of the DNA code. Of
course, that analogy works on different levels. The information content of DNA is two bits per
base pair. So a single one or zero could not distinguish a G from an A, just like a single
nucleotide can't code for a single amino acid. Two digits, a one and a zero, could code for the
four different bases.
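
The two-bits-per-base-pair point can be checked with a short sketch. The particular assignment of bit pairs to bases below is arbitrary and chosen only for illustration; any one-to-one assignment would work equally well.

```python
import math

# Arbitrary two-bit code for the four bases (an illustrative assumption,
# not a biological convention).
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def bits_per_symbol(alphabet_size):
    """Bits needed to distinguish equally likely symbols."""
    return math.log2(alphabet_size)


def encode(sequence):
    """Turn a DNA string into a string of ones and zeros, two per base."""
    return "".join(BASE_TO_BITS[base] for base in sequence)


def decode(bits):
    """Turn a bit string back into DNA, reading two bits at a time."""
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


if __name__ == "__main__":
    print(bits_per_symbol(4))   # four bases
    print(encode("GATC"))
    print(decode(encode("GATC")))
```

One bit gives only two states, so it cannot separate four bases. Two bits give four states, which is exactly enough.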
[00:09:36.87] Just building in some molecular details, in terms of how a DNA binding protein
will access or read the information. And that's by positioning a collection of amino acids in three
dimensional positions such that they can interact properly with the sequence. And then several
stacks of bases on top of this or below this. In this case, the amino acid asparagine is making two
hydrogen bond interactions with the base from a single amino acid.
[00:10:08.64] Now there is a great deal of interest in learning the code. Learning which amino
acids are contacting the base so you can predict which is the sequence that's going to respond to
a particular DNA binding protein. And that's a super challenging problem, and we haven't gotten
there. And in my mind, it's an open question if we'll get there or not.
[00:10:36.41] So in this slide, we start to look at different classes of DNA binding proteins.
There are, I believe-- I'll have to check some of these numbers-- about 1,000 proteins which are
known to bind to DNA. And a lot of them are general factors, like histone proteins. Of the types
of transcription factors that bind DNA in a sequence specific manner, with the express intent of
regulating gene expression, I think the human genome contains about 400 of those different
proteins.
[00:11:11.90] But of course, humans build complexity and diversity by forming heterodimers of
two different proteins or by alternative splicing, which could possibly influence the base
pairs which are recognized. Anyway, there are about five major classes. I thought it's worth
highlighting about three of them. One of the very common ones is called a helix-turn-helix motif.
And the larger of these two helices lies right in the major groove, allowing many amino acids to
potentially interact with the exposed edges of the DNA bases.
[00:11:57.18] These proteins do not regularly act as monomers. Here, it's only making contact
with, at most, about six bases. But a dimer of these, either a homodimer or a heterodimer, could
interact with about 12 base pairs and start to give you reasonable specificity in terms of binding
the sequence that you're interested in.
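
A rough back-of-the-envelope sketch shows why going from about 6 to about 12 base pairs matters. It assumes a random genome with equal base frequencies, exact matching on one strand, and a genome of roughly 3.2 billion base pairs. None of these figures are from the lecture, and real genomes are not random, so treat the output as an order-of-magnitude illustration only. The function names are made up for this example.

```python
# Approximate haploid human genome size in base pairs (an assumption
# for illustration, not a figure given in the lecture).
GENOME_BP = 3.2e9


def possible_sites(length):
    """Number of distinct DNA sequences of the given length."""
    return 4 ** length


def expected_random_matches(length, genome_bp=GENOME_BP):
    """Expected chance occurrences of one exact site in a random genome.

    Assumes equal base frequencies and counts one strand only.
    """
    return genome_bp / possible_sites(length)


if __name__ == "__main__":
    for length in (6, 12):
        print(length, possible_sites(length), expected_random_matches(length))
```

Under these assumptions, a 6-base site would turn up by chance on the order of hundreds of thousands of times, while a 12-base site would turn up on the order of a couple of hundred times, which is the sense in which a dimer starts to give reasonable specificity.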
[00:12:22.66] Another class of proteins that bind DNA specifically are called zinc fingers. Here
you see a picture of a zinc finger with three fingers, if you will, three portions of an alpha helix
binding over about 180 degrees, or more, actually, of the major groove of the
DNA. You see a schematic. These are three relatively independent helices contacting
approximately six bases still, even though it's spread out over nine or so. And they're called zinc
fingers because these three balls are atoms of zinc. They are coordinated by cysteine and
histidine amino acids. And their positions are relatively conserved.
[00:13:18.23] This class of proteins is of extreme interest to biotechnologists because you have
the ability to engineer these three different fingers relatively independently. That is, if I make a
change here in trying to alter which base it recognizes, to the first approximation, I'm not going
to greatly affect which bases these other two fingers recognize.
[00:13:49.60] And this gives engineers the ability to start designing circuits where specific genes
that you want turned on only at a certain time could be controlled by creating
the protein that you want and a unique sequence, which exists only where you want it in the
genome, and so really creating and controlling that interaction. And here we see a detailed view of
the zinc finger, the zinc atom, the two cysteines, and the two histidine residues. Overall, there
are about a hundred of these proteins in the human genome, in this class.
[00:14:33.31] The last class that we want to introduce ourselves to is leucine zipper proteins.
And you see these two long alpha helices are grabbing the DNA, kind of like chopsticks. And
here, the alpha helices will fit into two portions of the major groove. And these helices interact
by hydrophobic faces. In this case, it's right here on the side. Now this would be rotating round to
the back. And then it would be coming over to this side. Hydrophobic faces between these two
residues.
[00:15:10.79] Leucine is one of the smaller hydrophobic amino acids.
And leucines predominate on the interface between these two alpha helices. That's true whether these
two helices form a homodimer of the same subunit or a heterodimer, meaning that the two
helices belong to two separate proteins. This class of proteins sort of leapt into scientific
consciousness when a couple of the very important cancer promoting genes, jun and fos, were
recognized to be leucine zippers.
[00:15:53.49] So now we've talked a little bit about how proteins can recognize DNA. And in the
next video, we will see why the proteins sometimes will bind and sometimes won't bind. So
we're getting further along in our quest to control DNA in a time-specific manner. Thanks.