TOPIC: Assessing the effectiveness of drug treatments using
mathematical modeling - Application of Mathematical Models Cell
Application
TUTOR GUIDE
MODULE CONTENT: This module contains simple exercises for biology majors
to begin applying and interpreting mathematical models of biological systems.
TABLE OF CONTENTS
Alignment to HHMI Competencies for Entering Medical Students
Outline of concepts covered, module activities, and implementation
Module: Worksheet for completion in class
Pre-laboratory Review Questions (optional)
Suggested Questions for Assessment
Guidelines for Implementation
Contact Information for Module Developers
Alignment to HHMI Competencies for Entering Medical Students:
Competency: E1. Apply quantitative reasoning and appropriate mathematics to
describe or explain phenomena in the natural world.
Learning Objectives (with the activities that address them):
- E1.2. Interpret data sets and communicate those interpretations using visual
and other appropriate tools. (Activities 1, 2)
- E1.3. Make statistical inferences from data sets (evaluating best fit linear
relationships based on calculating error sums of squares). (Activity 3)
- E1.5. Make inferences about natural phenomena using mathematical models.
(Activities 1, 3)
Mathematical Concepts covered:
- mathematical modeling in a biological context
- linear models
- regression models
In class activities:
- group discussion
- graphing and interpreting data
- construction of linear models
- using regression approach calculations for determining “best fit” relationships
between two variables.
Components of module:
- preparatory assignment to complete and turn in as homework before class
- in class worksheet:
- discussion questions
- plotting and interpreting data
- calculations of sums of squared error values to quantitatively assess the
goodness of fit of lines to observed data.
- suggested assessment questions
- guidelines for implementation
Estimated time to complete in class worksheet
- 60 minutes
Targeted students:
- first-year biology majors in an introductory biology course covering cell and
molecular biology
Quantitative Skills Required:
- Basic arithmetic
- Logical reasoning
- Interpreting data from tables
- Graph/Data Interpretation
Worksheet: Introduction to Mathematical Modeling in Biology
Biological processes, such as the conversion of sunlight to plant biomass, the
transcription of DNA to RNA, the growth of cells in an organism and the rate of
growth of populations, are influenced by many factors (variables). Although the
scientific fields of chemistry and physics have long relied on the use of
mathematical models, the use of mathematical models in biology has been much
less extensive. This is changing, however, and all practicing biologists in the
future are going to need to be skilled in the use of mathematical and statistical
models to understand the biological processes they are studying. Our ability to
gather more sophisticated data on more variables regulating biological systems
requires that we find ways of integrating, organizing, and evaluating this
information. Mathematical models provide tools for doing just that, as they
precisely describe relationships among variables that drive biological systems
and provide us with a means to test our understanding of how these systems
work. In this module we will begin to introduce you to the use of mathematical
models in biology.
Discussion Questions
To begin, form groups of 3 students. Discuss and write down the answers to the
following questions:
1. What is a mathematical model? What is a mathematical model of a biological
system?
2. How does a scientist know if a model of a biological system is a “good”
model?
Lab Exercises – Part I
Students: To receive credit for this exercise, provide short answers and
supporting graphs and calculations for all questions below.
Data Set – plotting data, model evaluation
The drug Angioblock was developed to treat some forms of cancer. Angioblock
acts by inhibiting the formation of blood vessels; cancer cells require the
formation of many new blood vessels in order to form tumors, so blocking the
formation of new vessels preferentially harms cancer cells. To see if Angioblock
has an effect on tumor cell division and if the effect depends on tumor size,
colorectal tumors of different sizes were treated for one week with Angioblock
and then the number of tumor cells that were dividing (the proliferation index)
was measured in each tumor. Consider the following experimental data:
Observation (i)   Tumor Diameter (mm)   Proliferation Index      Proliferation Index
                                        without Angioblock       with Angioblock
1                 8                     375                      325
2                 11                    455                      380
3                 13                    722                      337
4                 15                    318                      248
5                 19                    216                      147
6                 23                    161                      103
7                 27                    232                      152
8                 33                    245                      175
9                 35                    223                      153
10                44                    154                      79
Activities and Questions:
1. In your groups, first outline two hypotheses that you might test based on the
description above.
The first step to address these hypotheses is to plot the data. In your groups,
decide what types of plots could be made. In each case, what is the dependent
variable of interest (what will you plot on the y-axis)? What is the independent
variable (to be plotted on x-axis)? Why did you choose one variable for the y
axis and the other one for the x axis (in other words, what is the question the
researchers are trying to answer and how could these data be used to provide
them with an answer)?
2. a. Each individual in the group should now take a piece of graph paper and
plot all the data on one graph and use different symbols to distinguish data from
the tumors that are treated with the drug from those that were not (e.g., use
diamonds for controls, squares for Angioblock treatment).
b. Based on the graphs produced, do you think the effect of Angioblock
depended on the tumor size? Why or why not?
c. Now, looking at the graphs, draw a line through each data set that you feel
best fits the data points (one line through the proliferation index data without
Angioblock and the other for the data that came from the Angioblock treatment).
Now use the formula for a line (y=mx+b) to come up with a linear model for each
line, where y is the dependent variable, b is the value of y when x = 0, and m =
the slope of the line (change in y / change in x). Write a sentence explaining in
words exactly what the relationship between tumor diameter and proliferation
index is for each treatment group.
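The step from a drawn line to the formula y = mx + b can be sketched in a few lines of Python. This is only an illustration: the helper name `line_through` and the two points are hypothetical (they stand in for any two points read off a hand-drawn line and are not taken from the data set).

```python
def line_through(p1, p2):
    """Return slope m and intercept b of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no slope-intercept form")
    m = (y2 - y1) / (x2 - x1)  # slope: change in y / change in x
    b = y1 - m * x1            # intercept: value of y when x = 0
    return m, b

# Hypothetical points read off a hand-drawn line (illustration only)
m, b = line_through((10, 350), (40, 110))
print(f"y = {m:.1f}x + {b:.1f}")  # prints "y = -8.0x + 430.0"
```

In words, this hypothetical line says that the proliferation index drops by 8 for every additional millimeter of tumor diameter.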
Lab Exercises Part 2
3. We will now examine how well the line you drew fits the actual data for
the Angioblock treatment group (your group's line will be compared to the
lines drawn by other groups, using the Error Sum of Squares calculation
described below, to see who drew the line that best fits the data). Each group
must first decide which group member drew the line that best fits the data.
Take a moment now to do this.
The formula below (the error sum of squares formula) is a procedure that is
used in statistical analyses to find the line that best fits the data in
experiments of this nature.
$$\text{ErrorSumOfSquares} = \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2$$
Here $Y_i$ is the observed value at $X_i$, $\hat{Y}_i$ is the value of Y on the
line at $X_i$, and n is the number of observations.
Use this formula to calculate the error sum of squares for the line that was
drawn through the data from the Angioblock treatment group in the Data Set
above. For each observation i, take the observed value of Y (the proliferation
index) at the given value of X (tumor diameter) from the Data Set. Then
determine the difference between the observed value of Y and the value of Y on
the line that you drew at that X value. Square each difference and add the
squared differences over all observations. Calculate a total error sum of
squares separately for each model (A and B). In the end you should have an
error sum of squares value for each line.
When your group is finished, raise your hand. When all groups are finished, one
member of each group will give the formula for their line and the error sum of
squares value for that line (please include this information on the sheet you
will hand in for this question, and include your calculations). The winning
group will have the lowest error sum of squares value, which means its line is
the one that best fits the data.
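The calculation described above can also be checked by computer. The sketch below uses the Angioblock columns of the Data Set; the function name `error_sum_of_squares` and the slope and intercept of the example line are assumptions made for illustration, not values from the module.

```python
# Data Set values: tumor diameter (mm) and proliferation index with Angioblock
diameter_mm = [8, 11, 13, 15, 19, 23, 27, 33, 35, 44]
index_with_drug = [325, 380, 337, 248, 147, 103, 152, 175, 153, 79]

def error_sum_of_squares(xs, ys, m, b):
    """Sum over all observations of (observed Y - Y on the line y = m*x + b)^2."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = m * x + b      # value of Y on the line at this X
        error = y - predicted      # vertical distance from the point to the line
        total += error ** 2        # square so positive and negative errors add up
    return total

# Hypothetical hand-drawn line (illustration only)
m, b = -8.0, 430.0
ess = error_sum_of_squares(diameter_mm, index_with_drug, m, b)
print(f"Error sum of squares for y = {m}x + {b}: {ess:.0f}")
```

Running the same function with each group's slope and intercept gives directly comparable values; the smallest one identifies the best-fitting line among those submitted.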
MODULE FEEDBACK - Each year we work to improve the modules in the active
learning "discussion" sections. Please answer the following question with regard
to this module on this sheet and turn in your answer to the TA. You can do this
anonymously if you like by turning in this sheet separately from your module
answers.
How helpful was this module in helping you understand the fundamental
concepts in mathematical modeling of biological data?
A = Extremely helpful
B = Very helpful
C = Moderately helpful
D = A little bit helpful
E = Not helpful at all
Module Rating ____________
Thank you!
Pre-laboratory Exercise: To be completed before you come to class and
handed in at the beginning of class.
This homework is designed to review lines and linear models and to prepare you
for the upcoming module on mathematical modeling and subsequent models in
Ecology that you will have to work with. If you encounter an unfamiliar term,
please refer to a textbook for high school algebra or pre-calculus.
On her way to her volleyball game, Joanne stops by the grocery store for a
healthy snack that will give her energy for the game. She decides to get a jar of
peanut butter and some bananas. She notices that the peanut butter costs $3.99
for a jar and that the bananas cost $0.49 per pound of bananas.
1. How much will one pound of bananas and a jar of peanut butter cost?
2. Joanne decides to bring a snack for each of the 6 girls on the volleyball team.
How much will it cost for six pounds of bananas and a jar of peanut butter?
3. On the grid provided below, draw a Cartesian coordinate system with number
of pounds of bananas on the x-axis and the total cost of the peanut butter and
bananas on the y-axis. Plot the two points that you found for the previous two
questions.
[Blank grid for plotting, with axis ticks from 0 to 8]
4. Plot the line on the grid above through the two points that you found. Now
compute the slope of the line that goes through these two points (remember,
“rise over run”: m = (y2 - y1) / (x2 - x1), where m is the slope of the line).
Now write the equation of the line in slope-intercept form, y= mx + b, where b is
the y-intercept (the place where the line crosses the y axis).
Congratulations! You have just made a mathematical model of peanut butter and
bananas!
5. What does the slope correspond to in terms of the scenario above? What is
the y-intercept of the line? What does it correspond to in the scenario above?
6. What is the independent variable?
What is the dependent variable? Respond in symbols as well as in words.
7. How would the line change if the price of bananas increased or decreased?
How would the line change if the cost of the jar of peanut butter increased or
decreased?
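For checking answers to this exercise, the scenario can be written as a small linear model. This is a sketch only: the function and constant names are made up for illustration, and the alternative banana price of $0.59 is a hypothetical value used to show the effect asked about in question 7.

```python
JAR_PRICE = 3.99     # dollars for one jar of peanut butter (the y-intercept)
BANANA_PRICE = 0.49  # dollars per pound of bananas (the slope)

def total_cost(pounds, per_pound=BANANA_PRICE, jar=JAR_PRICE):
    """Total cost y for x pounds of bananas plus one jar: y = m*x + b."""
    return per_pound * pounds + jar

# The two points from questions 1 and 2
x1, x2 = 1, 6
y1, y2 = total_cost(x1), total_cost(x2)

# Slope from "rise over run" recovers the price per pound
slope = (y2 - y1) / (x2 - x1)
print(round(y1, 2), round(y2, 2), round(slope, 2))  # prints: 4.48 6.93 0.49

# Hypothetical price change: a higher banana price steepens the line,
# while the intercept (cost at 0 pounds) stays at the jar price
print(round(total_cost(6, per_pound=0.59), 2), total_cost(0, per_pound=0.59))
```

Changing `per_pound` changes the slope of the line, while changing `jar` shifts the whole line up or down without changing its steepness.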
Suggested Questions for Assessment
Competency: E1. Apply quantitative reasoning and appropriate mathematics to
describe or explain phenomena in the natural world.
Learning Objectives (with the questions that assess them):
- E1.2. Interpret data sets and communicate those interpretations using visual
and other appropriate tools. (Questions 1, 2.a. – c.)
- E1.5. Make inferences about natural phenomena using mathematical models.
(Question 3)
Guide for implementation: Discussion
Have students break up into groups of 3 to discuss and come up with answers to
the questions. Groups talk through each question; stop the work after 10
minutes or as soon as you think each group has come up with something for all
questions (no longer than 15 minutes). The TA should then pick one person from
each of 3 groups chosen at random to share with the class their group’s answer
to one of the questions. Tell them that everyone in the group should be
prepared to share the answer with the class, as you will choose the speaker at
random from within the group. Suggested ideas that should emerge from each
question are listed below.
An alternative way to run this discussion section of the class would be to run it
“Question Time” style where you let them discuss each question in turn for three
minutes, then ring a bell or use another method to cut off discussion, then have
the whole group report their answer. Then move on to the next question.
1. What is a mathematical model? A mathematical description of a system. A
model of a biological system is a description of a biological process using
mathematics. Mathematical models typically involve formulas that describe
relationships among variables and also provide information on the relative
strength of the effect of each variable on a given biological observation
(population growth rate, mutation rates of DNA). Models are typically organized
into dependent and independent variables (but not always). TAs should guide
discussion toward mathematical models as ways to evaluate 1) how well we
understand the biotic or abiotic factors that influence biological systems and
2) our ability to predict how changes in certain variables influence biological
systems.
2. How does a scientist know if a model is good?
- scientists can evaluate how well the model explains the observations it was
based on – how well does the model fit the observations.
- scientists can evaluate how accurate the model is in predicting future events
(either through new experiments or through new observations that were not used
to construct the original model).
Guide for Implementation: Lab Activity
1) TA hands out a copy of the first data sheet and a piece of graph paper to
each member of a group. Each individual is instructed to plot the data points and
construct lines through the data points based on the model equations. Each
individual will then decide which model (line) more accurately “fits” the data and
provide rationale. One or two groups share their decision with the class.
a) The graph paper should have x and y axes drawn in lower left when held
in landscape orientation.
b) If possible, project the data plot and regression lines for discussion, for
example via the instructor’s computer and projector or an overhead projector.
c) Discussion of possible data outliers may come up – explain why the data
may not fall exactly along the best fit line.
2) Instructor provides brief explanation of least squares method of evaluating
regression lines, writing the equation for this calculation on the board.
Students then perform these calculations in their groups (can divide up who
does computation for which model).
a) Sample explanation: By plotting the data it looks like y varies with x. The
models describe a linear relationship where y varies with x. But the data
points don’t actually all fall on the line. The distance a point falls from the
line is a form of error. The model that best explains, or fits, the data is the
one that the data points are closest to, or has the least error (the smallest
error sum of squares).
$$\text{ErrorSumOfSquares} = \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2$$
$Y_i$ is a data point at $X_i$, and $\hat{Y}_i$ is the value of Y that falls on
the line at $X_i$ as defined by the model; n is the number of observations.
3) Instructor-led discussion (as a class) about which group’s model is the “best
fit”.
a) The actual best fit lines are shown in the answer key. This information
may or may not be shared and is up to the instructor.
b) Instructor may foreshadow whether relationships may be linear or not, and
whether the regression is valid beyond the range of the data.
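For instructors who want to compute a best fit line directly, the sketch below applies the standard closed-form solution for ordinary least squares with one predictor to the Angioblock columns of the Data Set. It is not a reproduction of the answer key, which is not included here; the function names and the comparison line (slope -8.0, intercept 430.0) are hypothetical.

```python
# Data Set values for the Angioblock treatment group
diameter_mm = [8, 11, 13, 15, 19, 23, 27, 33, 35, 44]
index_with_drug = [325, 380, 337, 248, 147, 103, 152, 175, 153, 79]

def error_sum_of_squares(xs, ys, m, b):
    """Sum of squared vertical distances from the points to y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

def least_squares_line(xs, ys):
    """Slope and intercept that minimize the error sum of squares."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = y_mean - m * x_mean  # the fitted line passes through the point of means
    return m, b

m, b = least_squares_line(diameter_mm, index_with_drug)
fitted_ess = error_sum_of_squares(diameter_mm, index_with_drug, m, b)

# Hypothetical student line, for comparison only
student_ess = error_sum_of_squares(diameter_mm, index_with_drug, -8.0, 430.0)

print(f"least-squares line: y = {m:.2f}x + {b:.2f}, ESS = {fitted_ess:.0f}")
print(f"hypothetical student line ESS = {student_ess:.0f}")
print(f"data cover diameters {min(diameter_mm)} to {max(diameter_mm)} mm")
```

No hand-drawn line can have a lower error sum of squares than the least-squares line, which makes it a useful benchmark for the group comparison. The last printed line is a reminder that the fit says nothing reliable about tumors outside the measured range of diameters.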
Module Developers:
Please contact us if you have comments/suggestions/corrections
Kathleen Hoffman
Department of Mathematics and Statistics
University of Maryland Baltimore County
[email protected]
Jeff Leips
Department of Biological Sciences
University of Maryland Baltimore County
[email protected]
Sarah Leupen
Department of Biological Sciences
University of Maryland Baltimore County
[email protected]
Acknowledgments:
This module was developed as part of the National Experiment in Undergraduate
Science Education (NEXUS) through Grant No. 52007126 to the University of
Maryland, Baltimore County (UMBC) from the Howard Hughes Medical Institute.