PEDPACK: USER'S MANUAL
by
Alun Thomas
TECHNICAL REPORT No. 99
July 1987
Department of Statistics, GN-22
University of Washington
Seattle, Washington 98195 USA
ABSTRACT
Pedpack is a package of programs for pedigree analysis which uses the
UNIX operating system. The programs are written in C. This manual is
intended for users familiar with the UNIX commands and file handling system
but not necessarily with programming.
Following a description of the programs provided by Pedpack, the
transcripts of three demonstration sessions are given. These cover basic pedigree
management, drawing pedigrees and calculation of probability functions. There
is also a bibliography relevant to those aspects of pedigree analysis covered by
the package.
The development of PEDPACK and production of this report were supported in part by a grant
entitled "Pedigree Analysis in Genetic Isolates" to Elizabeth Thompson, from the Graduate
School Research Fund of the University of Washington.
Table of Contents
1. Introduction to Pedpack
2. Main pedigree handling commands provided by Pedpack
3. Description of all commands provided by Pedpack
4. Demonstration 1: Basic pedigree management
5. Demonstration 2: Drawing the pedigree
6. Demonstration 3: Peeling
7. Bibliography
1. Introduction to Pedpack.
Pedpack is a set of programs for creating, managing and analysing databases for pedigrees
and genetic traits. It runs under the UNIX operating system and the user requires familiarity
with the basic UNIX file structure and command language.
When called for the first time the command Pedpack creates a directory called Ped.work
and takes the user into this directory. Future calls from the same directory move the user into the
already existing Ped.work. A C shell is entered and the pedigree handling programs are made
available. The default prompt inside Pedpack is a question mark: ?
Standard UNIX commands should also be available, except for cd which is disabled to
prevent the user from leaving directory Ped.work until Pedpack is exited completely using either
quit or exit. Once inside Pedpack there should be no need to move to other directories.
Two subdirectories called Pedigrees and Traits are created in Ped.work; these will contain
the pedigree and genetic trait databases. While files in Ped.work itself can be safely moved,
edited etc. using the usual UNIX commands, those in Pedigrees and Traits should be created and
changed only through the commands provided by Pedpack.
Users can edit the .cshrc file in Ped.work to make aliases or to modify the path so that their
own commands can also be available in Pedpack.
Upon typing quit or exit the Pedpack C shell is exited and the user is returned to the
directory from which Pedpack was originally called. Any databases or output files created
during the session remain in Ped.work or its subdirectories and will be available when Pedpack
is next called.
The command help can be used from inside Pedpack for more information about the
package.
2. Main pedigree handling commands provided by Pedpack.
newped
Takes a raw pedigree, specified in standard triplet form,
and creates a database for use by subsequent programs.
newtrait
Creates a database representing the structure of a
genetic trait.
getgendat
Gets genetic data about a pedigree from an input file
and incorporates this into the pedigree database.
checkgendat
Checks genetic data on a pedigree for coding errors and for
genotypes on nuclear families inconsistent with Mendelian
segregation.
browse
Allows the user to inquire about the structure of and
the genetic data on a pedigree. This includes calculation
of inbreeding and kinship coefficients and enumeration
of relationships between individuals.
update
Allows the user to add new individuals to a pedigree and
to alter data already present.
setcoords &
draw
Used together these produce a drawing of the pedigree as
a marriage node graph. Simulated annealing is used to
optimise the clarity of the picture.
peel
Uses the peeling method to calculate probability functions
for genetic data on simple or complex pedigrees. This
command includes options to generate random peeling
sequences and to improve the efficiency of sequences
using simulated annealing.
A fuller description of these programs and all the other Pedpack commands is given in the
following section.
3. Description of all commands provided by Pedpack.
3.1. browse
Brings a pedigree on line and allows the user to inspect various aspects of it. Available
options are
inf          prints information about the pedigree
inf x        prints information about individual x
inf all      prints information about each individual in the pedigree
inb x        prints the inbreeding coefficient of x
inb all      prints all the inbreeding coefficients for the pedigree
inb pro      prints a profile of inbreeding by generation
inb gen x    prints all the inbreeding coefficients for generation x
kin x y      prints the kinship coefficient of x and y
rel x y      prints all the relationships between x and y
quit (q)     leaves browse and returns to Pedpack
Output from browse can be redirected using >. For example:
? browse Bison >outfile
?? inf 123
?? inb all
?? q
?
This prints information about Bison 123 and all the inbreeding coefficients for this pedigree
into the file called outfile.
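The kinship and inbreeding coefficients reported by browse can be illustrated with the standard recursive definition of kinship. The sketch below is not Pedpack's C code; the function names are hypothetical. The triplets are the ancestors of individual 24 in the Test pedigree of Demonstration 1, as far as they can be read from the listing and the peeling output. It assumes each individual appears after its parents, as newped requires.

```python
from functools import lru_cache

# (name, father, mother) triplets; 0 marks the unknown parents of a founder.
# Assumed subset of the Test pedigree: the ancestors of individual 24.
TRIPLETS = [
    (1, 0, 0), (3, 0, 0), (5, 0, 0), (6, 0, 0),
    (9, 5, 1), (10, 9, 6), (13, 9, 6), (15, 9, 6),
    (18, 3, 15), (20, 10, 13), (21, 10, 15),
    (22, 21, 20), (23, 21, 18), (24, 22, 23),
]
PARENTS = {name: (father, mother) for name, father, mother in TRIPLETS}
ORDER = {name: i for i, (name, _, _) in enumerate(TRIPLETS)}


@lru_cache(maxsize=None)
def kinship(a, b):
    """Probability that random alleles drawn from a and b are identical by descent."""
    if a == 0 or b == 0:
        return 0.0  # unknown parents of founders are taken to be unrelated
    if a == b:
        father, mother = PARENTS[a]
        return 0.5 * (1.0 + kinship(father, mother))
    if ORDER[a] < ORDER[b]:
        a, b = b, a  # always expand the individual listed later
    father, mother = PARENTS[a]
    return 0.5 * (kinship(father, b) + kinship(mother, b))


def inbreeding(x):
    """Inbreeding coefficient of x: the kinship of its two parents."""
    father, mother = PARENTS[x]
    return kinship(father, mother)


for name in (22, 23, 24):
    print(name, f"{inbreeding(name):.5f}")
```

Under these assumptions the sketch gives 0.31250, 0.18750 and 0.31250 for individuals 22, 23 and 24, which agrees with the inbreeding column of the Test pedigree listing.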
3.2. checkgendat
Checks the data for the given trait on the given pedigree. Coding errors and segregation
errors in nuclear families are looked for. This command should be used after inputting genetic
data using either getgendat or update. For example:
? checkgendat ABO Samaritans
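To make the idea of a segregation error concrete, here is a minimal sketch of a nuclear-family check for a co-dominant two-allele trait. It is an illustration only, not the method checkgendat actually uses. The phenotype codes (0 unknown, 1 m, 2 mn, 3 n) follow the MN trait of Demonstration 1, and the function names are hypothetical.

```python
from itertools import product

GENOTYPES = [("M", "M"), ("M", "N"), ("N", "N")]

# Genotypes compatible with each phenotype code, assuming full co-dominance.
COMPATIBLE = {
    0: set(GENOTYPES),      # unknown phenotype: any genotype
    1: {("M", "M")},        # phenotype m
    2: {("M", "N")},        # phenotype mn
    3: {("N", "N")},        # phenotype n
}


def offspring_genotypes(father, mother):
    """All genotypes a child can receive from this pair of parental genotypes."""
    return {tuple(sorted((a, b))) for a in father for b in mother}


def family_consistent(father_code, mother_code, child_codes):
    """True if some pair of parental genotypes can explain every child."""
    for father, mother in product(COMPATIBLE[father_code], COMPATIBLE[mother_code]):
        possible = offspring_genotypes(father, mother)
        if all(possible & COMPATIBLE[code] for code in child_codes):
            return True
    return False


# Parents with phenotypes 0 and 3, children typed 0, 0, 1, 0 (as in Demonstration 1).
print(family_consistent(0, 3, [0, 0, 1, 0]))
# The same family after the third child is recoded as phenotype 3.
print(family_consistent(0, 3, [0, 0, 3, 0]))
```

The first family is inconsistent because a parent of genotype NN cannot have a child of genotype MM; recoding the child removes the error.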
3.3. commands
Gives a list of available commands.
3.4. draw
Takes a pedigree with precalculated coordinates and produces a standard device
independent plot file containing a marriage node graph picture of the pedigree. The file that the
plot instructions are sent to must be the standard output file. If this is not specified using > then
draw will try to put the picture on the screen immediately, causing nasty errors.
An example of a set of commands for drawing a pedigree might be as follows:
? setcoords ...
? draw ... >outplot
? plot outplot
Where ... appears above the commands will prompt the user for required options.
The plot command is a standard UNIX command, and output from it can be redirected from
the screen to a hardcopy output device.
3.5. getgendat
The user is prompted for the name of the file where genetic data for a pedigree is held. The
columns of the file where the data for a particular trait appear are then asked for along with the
coding system used, i.e. which integer refers to which phenotype.
The command checkgendat should be used after inputting genetic data. For example:
? getgendat Rh Tristan
? checkgendat Rh Tristan >outchecks
3.6. help
Gives the following information:
For a brief introduction to Pedpack type spec intro or just intro.
For a list of the main pedigree handling programs type spec list.
For a list of all available commands type commands.
For a description of a command type spec followed by the name of the command.
For a list of C subroutines used by the package type subs.
For a description of a subroutine type spec followed by the name of the subroutine.
For information about installing Pedpack type spec install.
3.7.
Calculates linkage disequilibrium for two loci each with two co-dominant alleles.
3.8. newped
This program interrogates the user for the location of a new pedigree, and the name by
which the pedigree is to be known. The pedigree is then read in and the structure set up. Some
calculations are made to determine characteristics of the pedigree and it is then output, in a
standard form, to the file Ped.work/Pedigrees/"name", where "name" is the identifier for the
pedigree for future use.
It is assumed that the pedigree is in the standard triplet form of name, father's name,
mother's name and that it satisfies the usual rules for specifying a pedigree, i.e.
(1) One line of input must correspond to one individual.
    On each line the names of the individual and its parents must
    appear before any other data, and must be separated by spaces.
(2) Names of individuals are positive integers while 0 is used to
    represent the unknown parents of founders.
(3) All founders must be listed. For each individual who is not a
    founder both parents must be listed.
(4) An individual always appears later in the list than its parents.
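The four rules above can be checked mechanically before a file is given to newped. The following is a minimal sketch under that reading of the rules; the function name and messages are hypothetical and are not part of Pedpack.

```python
def check_triplets(lines):
    """Return a list of (line_number, message) problems in triplet-form input."""
    problems = []
    seen = set()  # names of individuals read so far
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) < 3:
            problems.append((lineno, "need name, father and mother"))
            continue
        try:
            # Rule (1): the three names come before any other data on the line.
            name, father, mother = (int(f) for f in fields[:3])
        except ValueError:
            problems.append((lineno, "names must be integers"))
            continue
        # Rule (2): names are positive integers, 0 marks an unknown parent.
        if name <= 0:
            problems.append((lineno, "name must be a positive integer"))
        if name in seen:
            problems.append((lineno, f"individual {name} listed twice"))
        for parent in (father, mother):
            if parent < 0:
                problems.append((lineno, "parent names cannot be negative"))
            elif parent != 0 and parent not in seen:
                # Rules (3) and (4): parents must be listed, and listed earlier.
                problems.append((lineno, f"parent {parent} not listed earlier"))
        # Rule (3): a non-founder needs both parents.
        if (father == 0) != (mother == 0):
            problems.append((lineno, "give both parents or neither"))
        seen.add(name)
    return problems


print(check_triplets(["1 0 0", "2 0 0", "3 1 2"]))
print(check_triplets(["3 1 2", "1 0 0", "2 0 0"]))
```

The first call reports no problems; the second reports that both parents of individual 3 are named before they have been listed.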
3.9. newtrait
Sets up a new genetic trait for future reference. Each trait is set up independently of any
pedigree. When prompted the user must input the number of alleles for the trait and the names.
Then the number of phenotypes and their names are inputted. The program will then prompt the
user for the penetrance matrix.
There is a final option to delete the trait if mistakes have been made in inputting the
parameters.
3.10. parameters
Outputs the penetrance and Mendelian segregation probabilities, for any trait that has
already been set up, to the standard output file.
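The Mendelian segregation probabilities mentioned here follow directly from each parent passing on one of its two alleles with probability one half. The sketch below builds them for an autosomal locus; it illustrates the quantities only, not the program's implementation, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, product


def segregation(alleles):
    """Map (father genotype, mother genotype) to a distribution over child genotypes."""
    alleles = sorted(alleles)
    genotypes = list(combinations_with_replacement(alleles, 2))
    table = {}
    for father, mother in product(genotypes, repeat=2):
        dist = {g: Fraction(0) for g in genotypes}
        # Each parent transmits either of its two alleles with probability 1/2.
        for a, b in product(father, mother):
            dist[tuple(sorted((a, b)))] += Fraction(1, 4)
        table[(father, mother)] = dist
    return table


table = segregation(["M", "N"])
MN = ("M", "N")
for child, prob in table[(MN, MN)].items():
    print("".join(child), float(prob))
```

For two MN parents this gives 0.25, 0.5 and 0.25 for MM, MN and NN children, the values that appear in the segregation matrices printed by parameters MN in Demonstration 1.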
3.11. peds
Gives a list of the pedigrees that have been set up.
3.12. peel
Uses the peeling method to calculate probability functions for genetic data on a pedigree.
Most of the output from this program can be re-routed using >. Some will always go to the
standard error file.
Warning: this is a large number crunching program and should only be used when sensible
peeling sequences are available. If you don't know what a peeling sequence is then you should
not use this program. If c is the size of the largest cutset in the sequence and g is the number of
genotypes then the program should run on pollux if:
g**c < 1,600,000
The peeling sequence used is originally set to be the order of marriages on the stack. There
are, however, options to randomise this order and to improve the computational efficiency of the
sequence using simulated annealing.
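The storage rule above is easy to evaluate before running peel. The sketch below assumes the bound of 1,600,000 quoted in this section; the on-line help in Demonstration 3 quotes 2,000,000, so treat the limit as a parameter. The function names are hypothetical.

```python
def cutset_terms(genotypes, cutset_size):
    """Number of terms stored for a cutset: one per joint genotype assignment."""
    return genotypes ** cutset_size


def fits(genotypes, cutset_size, limit=1_600_000):
    """True if g**c is below the storage limit."""
    return cutset_terms(genotypes, cutset_size) < limit


def largest_cutset(genotypes, limit=1_600_000):
    """Largest cutset size c for which g**c stays below the limit."""
    if genotypes < 2:
        raise ValueError("need at least two genotypes")
    size = 0
    while cutset_terms(genotypes, size + 1) < limit:
        size += 1
    return size


# The MN trait has 3 genotypes; a cutset of 5 individuals needs 3**5 terms.
print(cutset_terms(3, 5))
print(largest_cutset(3))
```

With 3 genotypes a cutset of 5 individuals needs 243 terms, the figure reported in the peeling output of Demonstration 3, and the largest cutset under the limit has 13 individuals.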
3.13. setcoords
Sets up the pedigree with coordinates suitable for drawing a marriage node graph. There is
an option to use annealing to clarify the picture by attempting to minimise the total squared line
length of the picture.
See draw for an example of how setcoords might be used.
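The annealing option can be pictured with a small sketch. It is a generic simulated annealing loop with geometric cooling and makes no claim to match Pedpack's own routine. It swaps the horizontal positions of movable nodes and accepts uphill moves with probability exp(-delta / temperature). The default iteration count, starting temperature and cooling factor are those typed in Demonstration 2; the graph, node names and function names are hypothetical.

```python
import math
import random


def total_squared_length(x, edges):
    """Sum of squared horizontal distances over all lines in the picture."""
    return sum((x[u] - x[v]) ** 2 for u, v in edges)


def anneal_positions(x, movable, edges, iterations=10000,
                     temperature=1.0, cooling=0.99, seed=0):
    """Swap positions of movable nodes to reduce the total squared length."""
    rng = random.Random(seed)
    x = dict(x)
    cost = total_squared_length(x, edges)
    best, best_cost = dict(x), cost
    for _ in range(iterations):
        a, b = rng.sample(movable, 2)
        x[a], x[b] = x[b], x[a]              # propose a swap
        new_cost = total_squared_length(x, edges)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            cost = new_cost                   # accept the move
            if cost < best_cost:
                best, best_cost = dict(x), cost
        else:
            x[a], x[b] = x[b], x[a]          # reject: undo the swap
        # Geometric cooling, floored so the temperature never reaches zero.
        temperature = max(temperature * cooling, 1e-300)
    return best, best_cost


# Three movable nodes a, b, c in one row; p and q are fixed in the next row.
start = {"a": 0, "b": 1, "c": 2, "p": 0, "q": 2}
edges = [("a", "q"), ("b", "p"), ("c", "p")]
best, best_cost = anneal_positions(start, ["a", "b", "c"], edges)
print(total_squared_length(start, edges), best_cost)
```

Keeping the best arrangement seen, rather than the final one, is a design choice of this sketch; it guarantees the result is never worse than the starting picture.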
3.14. spec
Gives a brief description of the keyword following it. For example
? spec browse
3.15. traits
Gives a list of the traits that have been set up.
3.16. update
Brings a pedigree on line and allows the user to update parts of it. Available options are
inf          prints information about the pedigree
inf x        prints information about individual x
add x        adds individual x to the pedigree
alt x        alters the data for individual x
new          appends data from a new file to the original pedigree
quit (q)     leaves update and returns to Pedpack
There is an option to overwrite the original pedigree or to create a new one. The command
checkgendat should be used whenever update has been used to change any genetic data.
4. Demonstration 1: Basic pedigree management.
The following is a transcript of a session, using Pedpack, which
demonstrates how new pedigree and trait databases are created and updated.
Examples of the help and spec commands are given, and ls is used to show the
subdirectories Pedigrees and Traits.
The user would type those words following the prompts %, ?, ?? or ...
pollux% Pedpack
Creating a working directory called Ped.work for pedigree analysis.
Introduction to Pedpack.
Pedpack is a set of programs for creating, managing and analysing
databases for pedigrees and genetic traits. It runs under the UNIX operating
system and the user requires familiarity with the basic UNIX file structure
and command language.
When called for the first time the command Pedpack creates a directory
called Ped.work and takes the user into this directory. Future calls from the
same directory move the user into the already existing Ped.work. A C shell
is entered and the pedigree handling programs are made available. The
default prompt inside Pedpack is a question mark: ?
Standard UNIX commands should also be available, except for cd which is
disabled to prevent the user from leaving directory Ped.work until
Pedpack is exited completely using either quit or exit. Once inside Pedpack
there should be no need to move to other directories.
Two subdirectories called Pedigrees and Traits are created in Ped.work;
these will contain the pedigree and genetic trait databases. While files in
Ped.work itself can be safely moved, edited etc. using the usual UNIX
commands, those in Pedigrees and Traits should be created and changed
only through the commands provided by Pedpack.
Users can edit the .cshrc file in Ped.work to make aliases or to modify the
path so that their own commands can also be available in Pedpack.
Upon typing quit or exit the Pedpack C shell is exited and the user is
returned to the directory from which Pedpack was originally called. Any
databases or output files created during the session remain in Ped.work or
its subdirectories and will be available when Pedpack is next called.
The command help can be used from inside Pedpack for more information
about the package.
Directory Ped.work created.
? ls
Pedigrees Traits
? help
For a brief introduction to Pedpack type spec intro or just intro.
For a list of the main pedigree handling programs type spec list.
For a list of all available commands type commands.
For a description of a command type spec followed by the name of the
command.
For a list of C subroutines used by the package type subs.
For a description of a subroutine type spec followed by the name of the
subroutine.
For information about installing Pedpack type spec install.
To leave Pedpack type quit.
? commands
Available commands:
draw peds intro newtrait setcoords triplets checkgendat
update commands getgendat parameters spec newped peel
subs doublepeel help
? spec list
Main pedigree handling commands provided by Pedpack.
newped
Takes a raw pedigree, specified in standard triplet form and
creates a database for use by subsequent programs.
newtrait
Creates a database representing the structure of a genetic trait.
getgendat
Gets genetic data about a pedigree from an input file and
incorporates this into the pedigree database.
checkgendat
Checks genetic data on a pedigree for coding errors and for
genotypes on nuclear families inconsistent with Mendelian
segregation.
browse
Allows the user to inquire about the structure of and
the genetic data on a pedigree. This includes calculation
of inbreeding and kinship coefficients and enumeration of
relationships between individuals.
update
Allows the user to add new individuals to a pedigree and
to alter data already present.
setcoords &
draw
Used together these produce a drawing of the pedigree
as a marriage node graph. Simulated annealing is used to
optimise the clarity of the picture.
peel
Uses the peeling method to calculate probability
functions for genetic data on simple or complex
pedigrees. This command includes options to
generate random peeling sequences and to improve
the efficiency of sequences using simulated annealing.
? spec newped
SPEC: newped
This program interrogates the user for the location of a new pedigree, and
the name by which the pedigree is to be known. The pedigree is then read in
and the structure set up. Some calculations are made to determine
characteristics of the pedigree and it is then output, in a standard form, to
the file Ped.work/Pedigrees/"name", where "name" is the identifier for the
pedigree for future use.
It is assumed that the pedigree is in the standard triplet form of name,
father's name, mother's name and that it satisfies the usual rules for
specifying a pedigree, i.e.
(1) One line of input must correspond to one individual. On each
line the names of the individual and its parents must appear
before any other data, and must be separated by spaces.
(2) Names of individuals are positive integers while 0 is used to
represent the unknown parents of founders.
(3) All founders must be listed. For each individual who is not
a founder both parents must be listed.
(4) An individual always appears later in the list than its parents.
? cp /pollux/pedpack/Pedigrees/Test/triplets triplets
? ls
triplets
? peds
Currently existing pedigrees
? newped
This program reads in a new pedigree specified in standard triplet form.
name of the file that the pedigree is in ... triplets
name of the pedigree ... Test
Pedigree Test has been set up, 23 individuals inputted.
? spec browse
SPEC: browse
Brings a pedigree on line and allows the user to inspect various aspects
of it.
Available options are
inf       : prints information about the pedigree
inf x     : prints information about individual x
inf all   : prints information about each individual in the pedigree
inb x     : prints the inbreeding coefficient of x
inb all   : prints all the inbreeding coefficients for the pedigree
inb pro   : prints a profile of inbreeding by generation
inb gen x : prints all the inbreeding coefficients for generation x
kin x y   : prints the kinship coefficient of x and y
rel x y   : prints all the relationships between x and y
quit (q)  : leaves browse and returns to Pedpack
Output from browse can be redirected using >.
E.g.
? browse Bison >outfile
?? inf 123
?? inb all
?? q
?
This prints information about Bison 123 and all the inbreeding coefficients
for this pedigree into the file called outfile.
? browse
?? inf 1
Name : 1
Parents : 0 0
Spouse : 5 Children: 9
Sex : female
Generation : 1
Degree : 1
Inbreeding : 0.00000
Coordinates : (0.00000,0.00000)
?? inf 6
Name : 6
Parents : 0 0
Spouse : 14 Children: 16
Spouse : 9 Children: 15 13 12 10
Sex : female
Generation : 1
Degree : 2
Inbreeding : 0.00000
Coordinates : (0.00000,0.00000)
?? inb 1
Inbreeding(1) = 0.00000
?? kin 1 2
Kinship(1,2) = 0.00000
?? q
? peds
Currently existing pedigrees
Test
? spec newtrait
SPEC: newtrait
Sets up a new genetic trait for future reference. Each trait is set up
independently of any pedigree. When prompted the user must input the
number of alleles for the trait and the names. Then the number of
phenotypes and their names are inputted. The program will then prompt the
user for the penetrance matrix.
There is a final option to delete the trait if mistakes have been made in
inputting the parameters.
? traits
Currently set up traits
? newtrait
I
of
... MN
Check the penetrance matrix.
   m       mn      n
MM 1.00000 0.00000 0.00000
MN 0.00000 1.00000 0.00000
NN 0.00000 0.00000 1.00000
Setting up genetic trait MN.
alleles ... 2
Number of phenotypes ... 3
Unless you type no this trait will be installed ... y
names when prompted.
? traits
1 ... M
Currently set up traits
2 ... N
MN
phenotype names when prompted.
? spec parameters
Phenotype 1 ... m
SPEC: parameters
Phenotype 2 ... mn
Phenotype 3 ... n
penetrance probabilities, one genotype at a time
m mn n
MM 1 0 0
m mn n
MN 0 1 1
m mn n
010
m mn n
001
Outputs the penetrance and Mendelian segregation probabilities, for any
trait that has already been set up, to the standard output file.
? parameters MN
Number of alleles 2, namely: M N
Number of genotypes 3
Number of phenotypes 3
Penetrance matrix:
           Phenotypes
Genotypes  unknown  m     mn    n
MM         1.00     1.00  0.00  0.00
MN         1.00     0.00  1.00  0.00
NN         1.00     0.00  0.00  1.00
Segregation matrices:
genotype: MM
     MM    MN    NN
MM   1.00  0.50  0.00
MN   0.50  0.25  0.00
NN   0.00  0.00  0.00
genotype: MN
     MM    MN    NN
MM   0.00  0.50  1.00
MN   0.50  0.50  0.50
NN   1.00  0.50  0.00
genotype: NN
     MM    MN    NN
MM   0.00  0.00  0.00
MN   0.00  0.25  0.50
NN   0.00  0.50  1.00
?
getgendat
[Using open mode]
"input" [New file]
 1 2
 2 2
 3 1
 4 3
12 1
:wq
"input" [New file] 5 lines, 25 characters
? Is
Pedigrees Traits
I
input
triplets
? getgendat
Name of trait ... MN
Name of pedigree ... Test
Type full name of input file ... input
The user is prompted for the name of the file where genetic data for a
pedigree is held. The columns of the file where the data for a particular trait
appear are then asked for along with the coding system used, i.e. which
integer refers to which phenotype.
Ready to input data for MN.
Data for individuals must be on separate lines in the file.
When prompted give the column numbers where the data items named are
listed.
The command checkgendat should be used after inputting genetic data.
First column of name ... 1
? vi
Checks the data for the given trait on the given pedigree. Coding errors and
segregation errors in nuclear families are looked for.
Last column of name ... 2
First column of genetic data ... 4
Last column of genetic data ... 4
What is the largest code number for a phenotype? ... 3
numbers for each of the phenotypes.
This command should be used after inputting genetic data using either
getgendat or update.
? checkgendat MN Test
Checking data for trait MN on pedigree Test.
Phenotype m ... 1
Checking individual data for coding errors.
No coding errors.
Phenotype mn ... 2
Phenotype n ... 3
Type ! to overwrite Test or give a new pedigree name .... !
Observed phenotype frequencies.
unknown
18
m
2
mn
2
n
1
Finding allele, genotype and phenotype frequencies.
? Is
EM algorithm converged to an accuracy of 0.00000e+00 in 1 iterations
for 5 individuals were considered, 5 items were input and 0 were
input
?
Currently
?
triplets
Alleles.
M
N
Frequency
0.60000
0.40000
pedigrees
Genotypes.
M M 0.36000
M N 0.48000
N N 0.16000
Phenotypes
Std. error
0.15492
0.15492
       Observed Expected Chi-sqd
m      2        1.800    0.02222
mn     2        2.400    0.06667
n      1        0.800    0.05000
Total  5        5.000    0.13889 on 1 degrees of freedom.
Checking each nuclear family for segregation errors.
No segregation errors found.
?
Pedigree data for Test :
Number of individuals == 23
Number of marriages == 12
Number of indlists == 16
Number of marlists == 24
Highest label == 23
Number of components == 1
Highest generation == 5
Coordinates set == NO
Generations reset == NO
Inbreeding calculated == YES
Number of genetic traits set == 1 namely: MN.
?? inf 1
Brings a pedigree on line and allows the user to update parts of it.
Available options are
inf      : prints information about the pedigree
inf x    : prints information about individual x
add x    : adds individual x to the pedigree
alt x    : alters the data for individual x
new      : appends data from a new file to the original pedigree.
quit (q) : leaves update and returns to Pedpack
There is an option to overwrite the original pedigree or to create a new one.
The command checkgendat should be used whenever update has been used
to change any genetic data.
'1
Pedigree data.
Name: 1
Parents: 0 0
Spouse: 5 Children: 9
Sex: female
Generation: 1
Degree: 1
Inbreeding: 0.00000
Coordinates: (0.00000,0.00000)
Genetic data. MN:2
?? inf 6
Pedigree data.
Name: 6
Parents: 0 0
Spouse: 14 Children: 16
Spouse: 9 Children: 15 13 12 10
Sex: female
Generation: 1
Degree: 2
Inbreeding: 0.00000
Coordinates: (0.00000,0.00000)
?? add 24
Give names of parents,
father ... 22
mother ... 23
MN:O
O.K., 24 with parents 22 and 23 added.
6
?? inf 24
J"\nta IUl;
individual 6.
Coordinates of 6 are (0.00000,0.00000).
or C to change ... !
Type ! to
phenotypic data is not checked at this stage.
MN is phenotype number O.
it or C to change ... C
phenotype number ... 3
6
Pedigree data.
Name: 24
Parents: 22 23
Sex: unknown
Generation: 6
Degree: 1
Inbreeding: 0.31250
Coordinates: (0.00000,0.00000)
Genetic data. MN:0
?? alt 24
Name: 6
Parents: 0 0
Spouse: 14 Children: 16
Spouse: 9 Children: 15 13 12 10
Sex: female
Generation: 1
Degree: 2
Inbreeding: 0.00000
Coordinates: (0.00000,0.00000)
Altering individual 24.
Sex of 24 is unknown.
Type ! to leave it and M, F or U to change it ... M
Coordinates of 24 are (0.00000,0.00000).
Type ! to leave them or C to change ... !
Warning: altered phenotypic data is not checked at this stage.
Data for trait MN is phenotype number 0.
Type ! to leave it or C to change ... C
phenotype number ... 2
m
mn
n
2
3
2
Finding allele, genotype and phenotype frequencies.
Pedigree data.
Name: 24
Parents: 22 23
Sex: male
Generation: 6
Degree: 1
Inbreeding: 0.31250
Coordinates: (0.00000,0.00000)
Genetic data. MN:2
EM algorithm converged to an accuracy of 0.00000e+00 in 0 iterations
Alleles.
M
N
Frequency
0.50000
0.50000
Std. error
0.13363
0.13363
Genotypes.
M M 0.25000
M N 0.50000
N N 0.25000
q
Number of individuals added == 1.
Number of individuals altered == 2.
Type ! to overwrite Test or name a new pedigree .... !
? checkgendat MN Test
Checking data for trait MN on pedigree Test.
Checking individual data for coding errors.
No coding errors.
Observed phenotype frequencies.
unknown 17
Phenotypes
       Observed Expected Chi-sqd
m      2        1.750    0.03571
mn     3        3.500    0.07143
n      2        1.750    0.03571
Total  7        7.000    0.14286 on 1 degrees of freedom.
Checking each nuclear family for segregationerrors.
Parents 9 and 6 with phenotypes 0 3
with children
name 15 type 0
name 13 type 0
name 12 type 1
name 10 type 0
Errors found in 1 families.
Checking individual data for codingerrors.
No codingerrors.
? update
?? alt 12
Altering individual 12.
Sex of 12 is unknown.
Type ! to leave it and M, F or U to change it ... !
Coordinates of 12 are (0.00000,0.00000).
Type ! to leave them or C to change ... !
Observed phenotype frequencies.
unknown
17
m
1
mn
3
n
3
Findingallele, genotype and phenotype frequencies.
EM algorithm converged to an accuracy of 0.00000e+00 in 1 iterations
Alleles.
Warning: altered phenotypic data is not checked at this stage.
Data for trait MN is phenotype number 1.
Type ! to leave it or C to change ... C
Frequency
Std. error
0.35714
0.64286
0.12806
0.12806
M
N
phenotype number... 3
Genotypes.
M M 0.12755
M N 0.45918
N N 0.41327
q
Number of individuals added == 0.
Number of individuals altered == 1.
Type ! to overwrite Test or name a new pedigree.
? checkgendat MN Test
Checking data for trait MN on pedigree Test.
Phenotypes
       Observed Expected Chi-sqd
m      1        0.893    0.01286
mn     3        3.214    0.01429
n      3        2.893    0.00397
Total  7        7.000    0.03111 on 1 degrees of freedom.
Checking each nuclear family for segregation errors.
No segregation errors found.
? browse Test >alldata
q
? ls
alldata input triplets
? more alldata
of pedigree Test.
name pa ma sex gen inb-cf  traits
 1    0  0  2   1  0.00000 2
 2    0  0  2   1  0.00000 2
 3    0  0  1   1  0.00000 1
 4    0  0  1   1  0.00000 3
 5    0  0  1   1  0.00000 0
 6    0  0  2   1  0.00000 3
 7    0  0  2   1  0.00000 0
 8    4  7  2   2  0.00000 0
 9    5  1  1   2  0.00000 0
10    9  6  1   3  0.00000 0
11    5  8  2   3  0.00000 0
12    9  6  0   3  0.00000 3
13    9  6  2   3  0.00000 0
14    5  8  1   3  0.00000 0
15    9  6  2   3  0.00000 0
16   14  6  0   4  0.00000 0
17   10  2  0   4  0.00000 0
18    3 15  2   4  0.00000 0
19    9 11  0   4  0.12500 0
20   10 13  2   4  0.25000 0
21   10 15  1   4  0.25000 0
22   21 20  1   5  0.31250 0
23   21 18  2   5  0.18750 0
24   22 23  1   6  0.31250 2
? quit
pollux%
5. Demonstration 2: Drawing the pedigree.
In this session transcript the Pedpack commands setcoords and draw
are used in conjunction with the standard UNIX command plot to produce a
marriage node graph of the pedigree set up in demonstration 1. The
marriage node graph produced is shown at the end of this section.
pollux% Pedpack
Directory Ped.work entered
? peds
Currently existing pedigrees
Test
? spec setcoords
SPEC: setcoords
Sets up the pedigree with coordinates suitable for drawing a marriage node
graph. There is an option to use annealing to clarify the picture by
attempting to minimise the total squared line length of the picture.
See draw for an example of how setcoords might be used.
?
Resetting generations.
New coordinates have been calculated for Test.
Do you wish to use annealing to improve the picture? (y/n) ... y
Using the annealing algorithm to minimise total squared line length.
To fix marriagesatsofile generations type f ... f
Takes a pedigree with precalculated coordinates and produces a standard
device independent plot file containing a marriage node graph picture of the
pedigree. The file that the plot instructions are sent to must be the standard output
file. If this is not specified using > then draw will try to put the picture on
the screen immediately, causing nasty errors.
Type number of generation to be fixed.
Type quit or q when finished
first ... 3
next ... q
Number of iterations ... 10000
An examole of a set of commands for drawing a pedigree might be as
Starting temperature ... 1
? setcoords ...
Cooling factor ....99
Searching for a minimum total squared line length graph
»ourplot
?
Total squared x distances at start
Starting temperature
Cooling factor
1.05286
1.00000
0.99000
?
Where ... appears above the commands will prompt the user for required
options.
The command plot is a standard UNIX command, and output from it can be
redirected from the screen to a hardcopy output device.
? setcoords
Total squared x distances after searching 0.26651
Finishing temperature
""
0.00000
Freezing temperature
0.00527
"" 10000
Number of tries
""
149
Number of downhill steps
""
683
Last downhill step
130
Number of uphill steps
==
523
Last uphill step
steps
step
5
332
10.120
Type ! to overwrite Test or name a new pedigree ..
? ls
alldata input triplets
? draw >picture
Do you want a frame? (y/n) ... y
Do you want labels? (y/n) ... y
graph of pedigree drawn to standard output file.
?
?
pollux%
5.1. Marriage node graph of Test pedigree.
6. Demonstration 3: Peeling.
In this session transcript we use the pedigree and trait set up in
demonstration 1 to demonstrate the use of the peeling program.
Directory Ped.work entered
? spec peel
Uses the peeling method to calculate probability functions for genetic data
on a pedigree.
Most of the output from this program can be re-routed using >. Some will
always go to the standard error file.
Warning: this is a large number crunching program and should only be used
when sensible peeling sequences are available. If you don't know what a
peeling sequence is then you should not use this program.
If c is the size of the largest cutset in the sequence and g is the number of
genotypes then the program should run if:
g**c < 2,000,000
The peeling sequence used is originally set to be the order of marriages on
the stack. There are, however, options to randomise this order and to
improve the computational efficiency of the sequence using simulated
annealing.
? peel
Name of trait ... MN
Name of pedigree ... Test
Peeling data for trait MN on pedigree Test.
Type names of individuals in reference set.
Type quit or q when finished.
first ... 3
next ... 5
next ... 22
next ... q
Peeling order is being set to order of marriages in stack.
Largest cutset cost for this sequence is 8.000.
Do you want to improve the sequence? (y/n) ... y
Type:
ann to anneal the sequence,
ran to randomise the sequence,
use to use the current sequence for peeling,
q to dump the sequence and quit peel altogether.
ann, ran, use or q ... ann
Using simulated annealing to improve the peeling sequence.
When prompted input the annealing parameters.
Number of iterations ... 10000
Starting temperature
Cooling
n ••
.n
10
? Is
Pedigrees alldata
99
Traits
input
outlikes picture
outpeel
triplets
cutset cost is 5.000000
? more outpeel
ann, ran, use or q n. use
Using simulated annealing to improve the peeling sequence.
Total: 450000e+00
18
Total: 1.35000e+Ol
\.-UU>CL. 3 1521 22 Total: 1.72194e+00
101521322 Total: 1.72194e+00
LmSCI.: 20
10 153 Total: 1.43495e+00
153 Total: 6.58905e-Ol
13
153 Total: 6.25960e-Ol
\"'U'bI;;;,; 9 6
3 Total: 1.09611e-02
Cutset: 9 11 6
3 Total: 3.29032e-02
\.-Ulbca; 5 11 6
3 Total: 6.06578e-03
145 11 3 Total: 1.81973e-02
\"'U'bti,; 5 8
3 Total: 6.06578e-03
\.-UI2'lCI. 5
3 Total: 8.35592e-04
Order of 3 individuals in final cutset is:
5
3
Number of iterations
10000
Starting temperature
10.000000
Cooling factor
0.990000
Maximum storage at start
8.000000
Starting time
0.240000
Maximum storage at end
Number of uphill steps
Last uphill step
Freezing temperature
Number of lateral steps
Last lateral step
Number of downhill steps
Last downhill step
Stopping time
5.000000
121
632
0.017612
805
9999
308
9982
18.339996
19.000 seconds.
likelihood has 27 terms.
Type name of output file for likelihoods or type ! to dump result.
r\
name or ! n. outlikes
l:._
Outputting result to outlikes,
Checking individual data for coding errors.
No coding errors.
Observed phenotype frequencies.
unknown
17
1
m
mn
n
Time used for this operation:
genotype and phenotype frequencies.
Peeling marriage of 21 and 18 with children: 23
Segregation for 23
R function for 22 23
Number of individuals involved is 4
Number summed out is 1, namely: 23
Number in output cutset is 3, namely: 21 18 22
Number of terms in output cutset is 27
Number of non-zero contributions : 37
Number of non-zero terms : 25
Total likelihood : 1.35000e+01
Time used for this operation: 0.020 seconds.
3
algorithm converged to an accuracy of O.OOOOOe+OO in 1 iterations
M
N
M
Frequency
0.35714
0.64286
Std. error
0.12806
0.12806
M 0.12755
M N 0.45918
N N 0.41327
m
mn
n
Expected Chi-sqd
1 0.893 0.01286
3 3.214 0.01429
3 2.893 0.00397
7
7.000 0.03111 on 1 degrees of'freedom.
of 22 and 23 with children: 24
24
24
individuals involved is 3
Number summed out is I, namely: 24
Number in outputcutset is 2, namely: 22 23
terms in outputcutset is 9
non-zero contributions : 7
non-zero terms
:7
: 4.50000e+00
J)",,,,lino
0.020 seconds.
3
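The allele frequencies, expected counts and chi-squared value printed before the first peeling step can be checked by direct gene counting on the seven typed individuals (1 m, 3 mn, 3 n). The following is a minimal sketch under the assumption of a codominant two-allele system in Hardy-Weinberg proportions; the structure and function names are hypothetical, and this is not necessarily the algorithm Pedpack itself runs.

```c
/* Summary of a two-allele codominant fit; all names here are illustrative. */
struct mn_fit {
    double p_m;         /* estimated frequency of allele M */
    double p_n;         /* estimated frequency of allele N */
    double variance;    /* binomial variance of either estimate; the
                           "Std. error" in the transcript is its square root */
    double expected[3]; /* Hardy-Weinberg expected counts of m, mn, n */
    double chi_sq;      /* goodness-of-fit statistic */
};

/* Fit allele frequencies by gene counting from observed phenotype counts. */
struct mn_fit fit_mn(int n_m, int n_mn, int n_n)
{
    struct mn_fit f;
    double observed[3];
    double total = n_m + n_mn + n_n;
    int i;

    observed[0] = n_m;
    observed[1] = n_mn;
    observed[2] = n_n;

    /* Each m individual carries two M alleles, each mn individual one. */
    f.p_m = (2.0 * n_m + n_mn) / (2.0 * total);
    f.p_n = 1.0 - f.p_m;
    f.variance = f.p_m * f.p_n / (2.0 * total);

    f.expected[0] = total * f.p_m * f.p_m;
    f.expected[1] = total * 2.0 * f.p_m * f.p_n;
    f.expected[2] = total * f.p_n * f.p_n;

    f.chi_sq = 0.0;
    for (i = 0; i < 3; i++) {
        double d = observed[i] - f.expected[i];
        f.chi_sq += d * d / f.expected[i];
    }
    return f;
}
```

With `fit_mn(1, 3, 3)` the sketch gives an M frequency of about 0.35714, expected counts of about 0.893, 3.214 and 2.893, and a chi-squared value of about 0.03111, in agreement with the transcript.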
Peeling marriage of 3 and 15 with children: 18
Penetrance for 3
Segregation for 18
Prior for 3
R function for 21 18 22
Number of individuals involved is 5
Number summed out is 1, namely: 18
Number in output cutset is 4, namely: 3 15 21 22
Number of terms in output cutset is 81
Number of non-zero contributions : 34
Number of non-zero terms         : 26
Total likelihood                 : 1.72194e+00
Time used for this operation: 0.020 seconds.
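The first two peeling steps of this transcript (the marriage of 22 and 23, then the marriage of 21 and 18) are small enough to reproduce by hand. The sketch below assumes a two-allele codominant locus and assumes that individual 24 is typed mn and individual 23 is untyped; neither phenotype is stated in the transcript, but these choices reproduce the reported totals and counts. The function names and the interpretation of a "contribution" as a non-zero product added during the summation are likewise assumptions, not Pedpack's own code.

```c
#define NG 3 /* genotypes coded 0 = MM, 1 = MN, 2 = NN */

struct peel_summary {
    double total;      /* sum of the output R function over all its terms */
    int nonzero_terms; /* output terms that are greater than zero */
    int contributions; /* non-zero products added while summing out */
};

/* Mendelian segregation P(child = c | parents = a, b) at a two-allele locus. */
double segregation(int c, int a, int b)
{
    double pa = a / 2.0; /* chance that parent a transmits allele N */
    double pb = b / 2.0; /* chance that parent b transmits allele N */

    if (c == 0)
        return (1.0 - pa) * (1.0 - pb);
    if (c == 2)
        return pa * pb;
    return pa * (1.0 - pb) + (1.0 - pa) * pb;
}

/* Step 1: sum out child 24 (ASSUMED typed mn) onto parents 22 and 23. */
struct peel_summary peel_child_24(double r1[NG][NG])
{
    struct peel_summary s = {0.0, 0, 0};
    int g22, g23, g24;

    for (g22 = 0; g22 < NG; g22++) {
        for (g23 = 0; g23 < NG; g23++) {
            r1[g22][g23] = 0.0;
            for (g24 = 0; g24 < NG; g24++) {
                /* Codominant penetrance: phenotype mn only from genotype MN. */
                double penetrance = (g24 == 1) ? 1.0 : 0.0;
                double term = segregation(g24, g22, g23) * penetrance;
                if (term > 0.0) {
                    r1[g22][g23] += term;
                    s.contributions++;
                }
            }
            if (r1[g22][g23] > 0.0)
                s.nonzero_terms++;
            s.total += r1[g22][g23];
        }
    }
    return s;
}

/* Step 2: sum out child 23 (ASSUMED untyped) onto parents 21 and 18,
   carrying individual 22 through from the step 1 cutset. */
struct peel_summary peel_child_23(double r1[NG][NG], double r2[NG][NG][NG])
{
    struct peel_summary s = {0.0, 0, 0};
    int g21, g18, g22, g23;

    for (g21 = 0; g21 < NG; g21++) {
        for (g18 = 0; g18 < NG; g18++) {
            for (g22 = 0; g22 < NG; g22++) {
                double sum = 0.0;
                for (g23 = 0; g23 < NG; g23++) {
                    double term = segregation(g23, g21, g18) * r1[g22][g23];
                    if (term > 0.0) {
                        sum += term;
                        s.contributions++;
                    }
                }
                r2[g21][g18][g22] = sum;
                if (sum > 0.0)
                    s.nonzero_terms++;
                s.total += sum;
            }
        }
    }
    return s;
}
```

Under these assumptions step 1 yields a total of 4.5 with 7 non-zero terms and 7 contributions, and step 2 yields a total of 13.5 with 25 non-zero terms and 37 contributions, matching the figures reported for the first two operations.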
Peeling marriage of 10 and 15 with children: 21
Index for 3
Segregation for 21
R function for 3 15 21 22
Number of individuals involved is 5
Number summed out is 0, namely:
Number in output cutset is 5, namely: 10 15 21 3 22
Number of terms in output cutset is 243
Number of non-zero contributions : 43
Number of non-zero terms         : 43
Total likelihood                 : 1.72194e+00
Time used for this operation: 0.040 seconds.
Peeling marriage of 21 and 20 with children: 22
3
Segregation for 22
R function for 10 15 21 3 22
Number of individuals involved is 6
Number summed out is 1, namely: 21
Number in output cutset is 5, namely: 20 22 10 15 3
Number of terms in output cutset is 243
Number of non-zero contributions : 77
Number of non-zero terms         : 55
Total likelihood                 : 1.43495e+00
Time used for this operation: 0.080 seconds.
Peeling marriage of 10 and 2 with children: 17
2
Segregation for 17
2
R function for 20 22 10 15 3
Number of individuals involved is 7
Number summed out is 2, namely: 2 17
Number in output cutset is 5, namely: 10 20 22 15 3
Number of terms in output cutset is 243
Number of non-zero contributions : 131
Number of non-zero terms         : 55
Total likelihood                 : 6.58905e-01
Time used for this operation: 0.120 seconds.
Peeling marriage of 10 and 13 with children: 20
Index for 3
Segregation for 20
R function for 10 20 22 15 3
Number of individuals involved is 6
Number summed out is 1, namely: 20
Number in output cutset is 5, namely: 10 13 22 15 3
Number of terms in output cutset is 243
Number of non-zero contributions : 99
Number of non-zero terms         : 66
Total likelihood                 : 6.25960e-01
Time used for this operation: 0.080 seconds.
Peeling marriage of 9 and 6 with children: 15 13 12 10
Penetrance for 6
Penetrance for 12
Index for 3
Segregation for 15
Segregation for 13
Segregation for 12
Segregation for 10
Prior for 6
R function for 10 13 22 15 3
Number of individuals involved is 8
Number summed out is 4, namely: 15 13 12 10
Number in output cutset is 4, namely: 9 6 22 3
Number of terms in output cutset is 81
Number of non-zero contributions : 21
Number of non-zero terms         : 4
Total likelihood                 : 1.09677e-02
Time used for this operation: 0.020 seconds.
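The "Number of terms in output cutset" lines follow directly from the cutset size: with three genotypes per individual at this locus, a cutset of k individuals needs 3^k stored terms (9, 27, 81 and 243 for cutsets of 2, 3, 4 and 5). A small sketch of that count is given below; the function name is hypothetical. If the "Maximum storage" figures in the annealing summary are read as the largest cutset size in the sequence, which is an assumption and is not stated in the transcript, then annealing reduced the largest table from 3^8 to 3^5 terms.

```c
/* Number of genotype combinations stored for a cutset of `size` individuals
   when each individual has `genotypes` possible genotypes. */
unsigned long cutset_terms(unsigned int size, unsigned int genotypes)
{
    unsigned long terms = 1;

    while (size-- > 0)
        terms *= genotypes;
    return terms;
}
```

For example, `cutset_terms(5, 3)` gives 243, the size reported for every five-member cutset in this run.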
Peeling marriage of 9 and 11 with children: 19
Index for 6
Index for 3
Segregation for 19
R function for 9 6 22 3
Number of individuals involved is 6
Number summed out is 1, namely: 19
Number in output cutset is 5, namely: 9 11 6 22 3
Number of terms in output cutset is 243
Number of non-zero contributions : 25
Number of non-zero terms         : 12
Total likelihood                 : 3.29032e-02
Time used for this operation: 0.040 seconds.
Peeling marriage of 5 and 1 with children: 9
1
Segregation for 9
5
1
R function for 9 11 6 22 3
Number of individuals involved is 7
Number summed out is 2, namely: 1 9
Number in output cutset is 5, namely: 5 11 6 22 3
Number of terms in output cutset is 243
Number of non-zero contributions : 33
Number of non-zero terms         : 27
Total likelihood                 : 6.06578e-03
Time used for this operation: 0.060 seconds.
Peeling marriage of 14 and 6 with children: 16
Index for 6
Index for 3
Segregation for 16
R function for 5 11 6 22 3
Number of individuals involved is 7
Number summed out is 2, namely: 6 16
Number in output cutset is 5, namely: 14 5 11 22 3
Number of terms in output cutset is 243
Number of non-zero contributions : 108
Number of non-zero terms         : 81
Total likelihood                 : 1.81973e-02
Time used for this operation: 0.060 seconds.
Peeling marriage of 5 and 8 with children: 14 11
Index for 3
Segregation for 14
Segregation for 11
R function for 14 5 11 22 3
Number of individuals involved is 6
Number summed out is 2, namely: 14 11
Number in output cutset is 4, namely: 5 8 22 3
Number of terms in output cutset is 81
Number of non-zero contributions : 87
Number of non-zero terms         : 27
Total likelihood                 : 6.06578e-03
Time used for this operation: 0.040 seconds.
Peeling marriage of 4 and 7 with children: 8
Penetrance for 4
Index for 3
Segregation for 8
Prior for 4
Prior for 7
R function for 5 8 22 3
Number of individuals involved is 6
Number summed out is 3, namely: 4 7 8
Number in output cutset is 3, namely: 5 22 3
Number of terms in output cutset is 27
Number of non-zero contributions : 36
Number of non-zero terms         : 9
Total likelihood                 : 8.35592e-04
Time used for this operation: 0.040 seconds.
taken:
19.000 seconds.
? more outlikes
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
I:lO.,oe- O.OOOOOe+OO
3.3:i98ioe-()4 O.OOOOOe+OO
51740e-05
3.0'166:Ze-{)S
1.39iQ2ge-05
>.06262Ie-05
14Yl:Se-IJQ
• L.U'''-U.J
1U:l
?
'1
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
0.00000e+00
7. Bibliography.
7.1. General.
Cavalli-Sforza, L.L. & Bodmer, W.F. 1971. The Genetics of Human Populations. W.H.
Freeman and Company, San Francisco.
Thompson, E.A. 1986. Pedigree Analysis in Human Genetics. The Johns Hopkins
University Press.
Mourant, A.E., Kopec, A.C., & Domaniewska-Sobczak, K. 1976. The Distribution of
the Human Blood Groups and other polymorphisms. Oxford University Press, London.
7.2. Programming.
Kennedy, W.J. & Gentle, J.E. 1980. Statistical computing. New York: M. Dekker.
Kernighan, B.W. & Pike, R. 1984. The UNIX programming environment. Englewood
Cliffs, N.J.: Prentice-Hall.
Kernighan, B.W. & Ritchie, D.M. 1978. The C programming language. Englewood
Cliffs, N.J.: Prentice-Hall.
7.3. Inbreeding.
Darlington, C.D. 1960. Cousin marriages and the evolution of the breeding system in
man. Heredity. 14, 297-332.
Hajnal, J. 1963. Random mating and the frequency of consanguineous marriages.
Proceedings of the Royal Society. 159 (B), 125-177.
Morton, N.E., Crow, J.F. & Muller, H.J. 1956. An estimate of the mutational damage in
man from data on consanguineous marriages. Proceedings of the National Academy of Science.
42, 855-863.
Roberts, D.F. 1968. Genetic effects of population size reduction. Nature. 220, 1084-1088.
Schull, W.J. 1958. Empirical risks in consanguineous marriages; sex ratio, malformation
and viability. American Journal of Human Genetics. 10, 294-343.
7.4. Peeling.
Cannings, C., Thompson, E.A. & Skolnick, M.H. 1976. Recursive derivation of
likelihoods on pedigrees of arbitrary complexity. Adv. appl. Probab. 8, 622-625.
Cannings, C., Thompson, E.A. & Skolnick, M.H. 1978. Probability functions on
complex pedigrees. Adv. appl. Probab. 10, 26-61.
Elston, R.C. & Stewart, J. 1971. A general model for the genetic analysis of
pedigree data. Human Heredity. 21, 523-542.
Lange, K. & Elston, R.C. 1975. Extensions to pedigree analysis I. Likelihood
calculations for simple and complex pedigrees. Human Heredity. 23, 105-112.
Thomas, A. 1986. Optimal computation of Probability Functions for Pedigree Analysis.
IMA Journal of Mathematics Applied in Medicine & Biology. 3, 167-178.
7.5. Other programs.
Thomas, A. 1986. PEDPACK: An ALGOL68C package of procedures for pedigree
analysis. Technical Report Number 20, Department of Biophysics and Medical Computing,
University of Utah.
Thompson, E.A. 1977. Peeling programs for pedigrees of arbitrary complexity.
Technical Report Number 6, Department of Biophysics and Medical Computing, University of
Utah.
Thompson, E.A. 1980. Package of recursive routines for computation on pedigrees.
Technical Report Number 17, Department of Biophysics and Medical Computing, University of
Utah.
7.6. Case studies.
Bonne, B. 1963. The Samaritans: a demographic study. Human Biology. 35, 61-89.
Goldschmidt, E., Ronen, A. & Ronen, I. 1960. Changing marriage systems in the Jewish
communities of Israel. Annals of Human Genetics. 24, 191-204.
Roberts, D.F. 1971. The demography of Tristan da Cunha. Population Studies. 25,
465-479.
Roberts, D.F. & Bonne, B. 1973. Reproduction and inbreeding among the Samaritans.
Social Biology. 20, 64-70.
Thompson, E.A. & Roberts, D.F. 1980. Kinship structures and heterozygosity on Tristan
da Cunha. American Journal of Human Genetics. 32, 445-452.