2D-3D Registration Methods for
Computer-Assisted Orthopaedic Surgery
by
Ren Hui Gong
A thesis submitted to the
School of Computing
in conformity with the requirements for
the degree of Doctor of Philosophy
Queen’s University
Kingston, Ontario, Canada
September 2011
Copyright © Ren Hui Gong, 2011

Abstract
2D-3D registration is one of the underpinning technologies that enables image-guided
intervention in computer-assisted orthopaedic surgery (CAOS). Preoperative 3D images and surgical plans need to be mapped to the patient in the operating room before
they can be used to augment the surgical intervention, and this task is generally fulfilled by using 2D-3D registration which spatially aligns a preoperative 3D image to
a set of intraoperative fluoroscopic images.
The key problem in 2D-3D registration is to define an accurate similarity metric
between the 2D and 3D data, and to choose an appropriate optimization algorithm.
Various similarity metrics and optimization algorithms have been proposed for 2D-3D registration; however, current techniques have several critical limitations. First,
a good initial guess, usually within a few millimetres of the true solution, is required, and such a capture range is often not wide enough for clinical use. Second,
for currently used optimization algorithms, it is difficult to achieve a good balance
between the computation efficiency and registration accuracy. Third, most current
techniques register a 3D image of a single bone to a set of fluoroscopic images, but in
many CAOS procedures, such as multi-fragment fracture treatment, multiple bone
pieces are involved.
In this thesis, research has been conducted to investigate the above problems: 1) two new registration techniques are proposed that use recently developed optimization techniques, namely the Unscented Kalman Filter (UKF) and the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), to improve the capture range for the 2D-3D
registration problem; 2) a multiple-object 2D-3D registration technique is proposed
that simultaneously aligns multiple 3D images of fracture fragments to a set of fluoroscopic images of the fracture ensemble; 3) a new method is developed for fast and
efficient construction of anatomical atlases; and 4) a new atlas-based multiple-object
2D-3D registration technique is proposed to aid fracture reduction in the absence
of preoperative 3D images. Experimental results showed that: 1) by using the new
optimization algorithms, the robustness against noise and outliers was improved, and
the registrations could be performed more efficiently; 2) the simultaneous registration of multiple bone fragments could achieve a clinically acceptable global alignment
among all objects with reasonable computation cost; 3) the new atlas construction method could construct and update intensity atlases accurately and efficiently; and 4) the use of an atlas in multiple-object 2D-3D registration is feasible.
Acknowledgements
First of all, I would like to express my sincere appreciation and thanks to my supervisors, Purang Abolmaesumi and James Stewart. Without their supervision, support
and patience, this thesis would not have been possible.
In addition, I would like to thank Paul St. John, Burton Ma, Manuela Kunz
and David Pichora for their great assistance in collecting or providing medical data,
which is a key prerequisite of my research.
Many thanks to Burton Ma for developing and providing the fracture treatment
planning tool in support of my research. Thanks also to Manuela Kunz for her
knowledge and assistance throughout my research work.
I am thankful to Mehdi H. Moghari, my former colleague, for his assistance in
understanding the Kalman Filter and for useful discussions on other mathematical
problems.
Statement of Originality
I hereby declare that this submission is my own work and to the best of my knowledge
it contains no materials previously published or written by another person, except
where due acknowledgement is made in the thesis. Any contribution made to the
research by others, with whom I have worked, is explicitly acknowledged in the thesis.
Contents

Abstract
Acknowledgements
Statement of Originality
List of Tables
List of Figures
Glossary

Chapter 1: Introduction
1.1 Motivation
1.2 Objectives
1.3 Contributions
1.4 Thesis Organization

Chapter 2: Background
2.1 Overview of CAOS
2.1.1 Components
2.1.2 Procedures
2.1.3 2D-3D Registration in CAOS
2.2 Overview of 2D-3D Registration
2.2.1 Registration in General
2.2.2 2D-3D Registration
2.3 2D and 3D Data in CAOS
2.3.1 X-ray fluoroscopy
2.3.2 Computed Tomography (CT)
2.3.3 Anatomical Atlases
2.4 Transformations in 2D-3D Registration
2.4.1 Transformations within the Fixed Data
2.4.2 Transformation of the Moving Data
2.4.3 Parametrization of the Output Transformation
2.5 DRR Generation
2.5.1 Ray-casting
2.5.2 GPU-based Texture Mapping
2.5.3 Other Techniques
2.5.4 DRR Generation for Multiple Objects
2.6 2D-3D Similarity Metrics
2.6.1 Correlation-based Metrics
2.6.2 Information-theory Metrics
2.6.3 Metrics using Spatial Information
2.7 Optimization Algorithms
2.7.1 Techniques Not using Derivatives
2.7.2 Techniques using Derivatives
2.7.3 Robust and Efficient Optimization

Chapter 3: 2D-3D Registration with Unscented Kalman Filter
3.1 Overview
3.2 Introduction
3.3 Method
3.3.1 Transform and its Initial Value
3.3.2 Hardware-based Volume Rendering Engine
3.3.3 Similarity Measure
3.3.4 Unscented Kalman Filter
3.4 Experiment, Results, and Discussion
3.4.1 Data Sets
3.4.2 Experiments
3.4.3 Results
3.4.4 Discussion
3.5 Summary

Chapter 4: 2D-3D Registration with the CMA-ES Method
4.1 Overview
4.2 Introduction
4.3 Method
4.3.1 Algorithm Overview
4.3.2 Optimization with CMA-ES
4.4 Experiments, Results and Discussion
4.4.1 Registration of CT to Simulated Fluoroscopy
4.4.2 Registration of CT to Real Fluoroscopy
4.4.3 The Impact of Initial Search Size
4.5 Summary

Chapter 5: Multiple-Object 2D-3D Registration
5.1 Overview
5.2 Introduction
5.3 Methods
5.3.1 Preoperative Treatment Plan
5.3.2 Tracked Intraoperative Fluoroscopic Images
5.3.3 Transforms
5.3.4 Similarity Metric
5.3.5 DRR Computation
5.3.6 Optimization Scheme
5.4 Results
5.4.1 Error Measurement
5.4.2 Experiments with Synthetic Fractures
5.4.3 Experiments with Fracture Phantoms
5.4.4 Validation Against Outliers in Fluoroscopic Images
5.5 Discussion
5.6 Summary

Chapter 6: Modelling of 3D Intensity Atlas with B-Spline FFD
6.1 Overview
6.2 Introduction
6.3 Method
6.3.1 Atlas Representations
6.3.2 Atlas Construction
6.3.3 Instance Generation
6.4 Experiments, Results and Discussion
6.4.1 Accuracy of Intensity Approximation
6.4.2 Performance of Instance Generation
6.4.3 Performance of Atlas Construction
6.5 Summary

Chapter 7: Atlas-based Multiple-object 2D-3D Registration
7.1 Overview
7.2 Introduction
7.3 Method
7.3.1 The Atlas of Distal Radius
7.3.2 Multiple-Fragment Deformable 2D-3D Registration
7.4 Experiments and Results
7.5 Discussion and Summary

Chapter 8: Conclusion
8.1 Summary of Contributions
8.1.1 Robust and Efficient 2D-3D Registration
8.1.2 Multiple-object 2D-3D Registration
8.2 Future Work

Bibliography
List of Tables

3.1 Testing data specifications (UKF)
3.2 Comparison of capture range (UKF)
3.3 Comparison of accuracy (UKF)
3.4 Comparison of performance (UKF)
4.1 Testing data specifications (CMA-ES)
4.2 Generation of testing cases
4.3 Results with simulated X-rays: registration error
4.4 Results with simulated X-rays: capture range, accuracy and computation time
4.5 Results with real X-rays: registration error
4.6 Results with real X-rays: capture range, accuracy and computation time
4.7 Impact of initial search size: capture range, accuracy and computation time
5.1 Error statistics - two-fragment synthetic fracture
5.2 Error statistics - three-fragment synthetic fracture
5.3 Error statistics - two-fragment fracture phantom
5.4 Error statistics - three-fragment fracture phantom
5.5 Results for outlier studies
6.1 Accuracies of intensity approximation with B-spline FFD
6.2 Performance of GPU-accelerated instance generation
7.1 Preliminary results of atlas-based 2D/3D registration
List of Figures

2.1 A typical CAOS setup
2.2 Overview of general registration
2.3 Overview of DRR-based 2D-3D registration
2.4 An example of C-arm calibration drum
2.5 An X-ray example before and after calibration
2.6 Transformations in 2D-3D registration
3.1 Overview of UKF-based 2D-3D registration
3.2 Overview of the UKF algorithm
3.3 Testing data (UKF)
3.4 Experimental results (UKF)
3.5 Visual check of registration results
4.1 Overview of CMA-ES-based 2D-3D registration
4.2 Overview of the CMA-ES algorithm
4.3 Impact of initial search size: registration error
5.1 Overview of multiple-object 2D-3D registration
5.2 The testing cycle for each single experiment
5.3 Simulated two-fragment wrist fracture
5.4 Simulated three-fragment wrist fracture
5.5 Formation of a simulated X-ray
5.6 Experimental results - two-fragment synthetic fracture
5.7 Experimental results - three-fragment synthetic fracture
5.8 Simulated X-rays and the corresponding DRRs
5.9 Two-fragment fracture phantom
5.10 Three-fragment fracture phantom
5.11 X-ray images used for phantom studies
5.12 Experimental results - two-fragment fracture phantom
5.13 Experimental results - three-fragment fracture phantom
5.14 Visual check of registration results - two-fragment fracture phantom
5.15 Visual check of registration results - three-fragment fracture phantom
5.16 Visual check of registration results - outlier studies
6.1 Training examples for CT atlas of distal radius
6.2 Approximated CT data with different B-spline spacings
7.1 Training examples for building a CT atlas of distal radius
7.2 Overview of atlas-based multiple-object 2D-3D registration
7.3 Modelling of a fracture fragment
7.4 Generation of synthetic fracture
7.5 Experimental results
Glossary

2D: 2-dimensional.
3D: 3-dimensional.
AABB: Axis-aligned Bounding Box, a bounding box that is aligned with the coordinate axes.
ABS: Acrylonitrile Butadiene Styrene.
ACL: Anterior Cruciate Ligament.
AP: Anterior-posterior, a camera orientation that looks straight down the patient's chest.
ASGTM: Adaptive Slice Geometry Texture Mapping, an improved texture mapping method for volume rendering.
CAOS: Computer-Assisted Orthopaedic Surgery.
CART: Computer-Assisted Radiotherapy.
CAS: Computer-Assisted Surgery.
CF: Coordinate Frame.
CMA-ES: Covariance Matrix Adaptation Evolution Strategy, an algorithm for estimating the parameters of nonlinear systems.
Co-registered 2D images: A set of 2D images within a same coordinate frame.
CPU: Central Processing Unit.
CT: Computed Tomography, a 3D image modality that is dependent on X-ray imaging.
CUDA: Compute Unified Device Architecture, an architecture developed by NVIDIA for performing parallel computations on GPUs.
CVR: Coefficients-to-voxels Ratio.
DirectX: A component of Microsoft Windows that handles multimedia-related tasks such as video, audio and input/output.
DOF: Degree of Freedom.
DRB: Dynamic Reference Base.
DRO: Distal Radius Osteotomy.
DRR: Digitally Reconstructed Radiograph, a projection image obtained by simulating the X-ray imaging process on a CT image.
FFD: Free-form Deformation.
Fixed data: The data set in a registration that is used as the reference and is kept fixed during the registration.
GC: Gradient Correlation, a similarity metric between two images that employs the correlations between the gradient images.
GD: Gradient Difference, a similarity metric between two images that employs the differences between the gradient images.
GPGPU: General-Purpose computation on Graphics Processing Units.
GPU: Graphics Processing Unit.
Group-wise registration: A registration that simultaneously aligns multiple images.
GRV: Gaussian Random Variable.
GUI: Graphical User Interface.
HRA: Hip Resurfacing Arthroplasty.
IIR: Infinite Impulse Response.
ips: Instances per second, the average speed to generate instances from an atlas.
MBA: Multi-level B-spline Approximation.
MI: Mutual Information, a similarity metric between two images that uses the joint entropy.
Moving data: The data set in a registration whose pose is to be determined by the registration.
MRI: Magnetic Resonance Imaging, a 3D image modality that is dependent on nuclear magnetic resonance.
mTRE: mean Target Registration Error.
NCC: Normalized Correlation Coefficients, a similarity metric between two images that evaluates the cross-correlation of the two images.
NMI: Normalized Mutual Information, a variant of MI.
OBB: Oriented Bounding Box, a bounding box that is aligned with the principal axes of an anatomy.
OIVR: Order Independent Volume Rendering.
OpenCL: Open Computing Language, a framework managed by Khronos Group for programming on heterogeneous devices such as CPUs, GPUs and other processors.
OpenGL: Open Graphics Library, a standard programming interface managed by Khronos Group for writing programs to produce 2D and 3D graphics.
OR: Operating Room.
Pair-wise registration: A registration that aligns two images.
PCA: Principal Component Analysis.
PI: Pattern Intensity, a similarity metric between two images that counts the number of a special intensity pattern.
ROI: Region of Interest.
SLNC: Sum of Local Normalized Correlation, a variant of NCC that evaluates and combines local cross-correlations.
SRC: Stochastic Rank Correlation, a similarity metric between two images that calculates the cross-correlation between the intensity ranks of the two images.
SVD: Singular Value Decomposition.
THA: Total Hip Arthroplasty.
TKA: Total Knee Arthroplasty.
TRE: Target Registration Error.
UKF: Unscented Kalman Filter, an algorithm for estimating the parameters of nonlinear systems.
UT: Unscented Transform.
Volume rendering: Generation of a 2D image from a 3D image.
VWC: Variance-Weighted Correlation, a variant of SLNC that performs a special weighting when combining the local cross-correlations.
Chapter 1
Introduction
Registration of medical images enables automatic augmentation and/or fusion of multiple images of an object in order to gain new insights about the trauma or anatomical structures under the skin. It is one of the underpinning technologies for computer-assisted surgery (CAS). One important type of registration is 2D-3D registration, which spatially aligns 3D CT images to 2D X-ray fluoroscopy images; CT and fluoroscopy are the most widely available imaging modalities that produce high-quality images of bony structures. 2D-3D registration is especially important in computer-assisted orthopaedic surgery (CAOS), a major subspecialty of CAS that deals with human bones such as the spine, hip, knee, and wrist.
In this thesis, new 2D-3D registration techniques for CAOS are investigated, particularly for multi-fragment fracture fixation applications.
1.1 Motivation
Despite the importance of 2D-3D registration in CAOS, the use of 2D-3D registration
in clinics has been limited due to several challenges:
First, most registration techniques solve the problems in an iterative fashion that
requires a sufficiently good initial alignment between the two images being registered.
Large errors in initial alignment would significantly increase the failure rate of the
registration; in many cases, such as complex fracture cases, finding an initially close
alignment is a difficult task.
Second, various types of system noise and artefacts can be introduced during image acquisition and the registration process. For example, imaging sensors can produce system noise due to their physical limitations; image acquisition from inappropriate orientations can introduce outliers that appear in one image but are missing from another; and some registration techniques require post-processing of the images, which may introduce undesired artefacts. When the accumulated noise and imaging
artefacts become dominant, the registration faces problems such as high failure rate,
inaccurate final results, sensitivity to initial alignment, and slow convergence.
Third, 2D-3D registration in CAOS is often used with a preoperative treatment
plan to guide the intraoperative intervention. The problem is to identify the intraoperative positions of all involved bones in the coordinate space of the preoperative
treatment plan. Due to the challenges mentioned above, most reported 2D-3D registration techniques handle only a single bony structure during each registration in order
to achieve a good compromise among performance, robustness and accuracy. When
using these techniques to treat cases that involve multiple bone fragments, more
complex procedures are needed, and a good global alignment for all involved bone
fragments may be difficult to achieve.
1.2 Objectives
The overall objective of this thesis is to develop and evaluate new techniques of 2D-3D registration in order to tackle the challenges presented in the previous section.
Specifically, the following objectives have been defined:
• To seek new optimization algorithms for 2D-3D registration that are not only
robust against various types of noise but are also efficient to run.
• To develop 2D-3D registration techniques that can simultaneously handle multiple bony structures and have reasonable computation time.
1.3 Contributions
The main contributions of this thesis, on which I am the principal author, consist of the following published works:
1. A 2D-3D registration method that utilises the Unscented Kalman Filter (UKF)
was proposed and evaluated. The method showed better robustness against noise and a wider capture range without compromising computation speed.
This work was published in the proceedings of MICCAI 2006, with J. Stewart
and P. Abolmaesumi as co-authors [38].
2. A 2D-3D registration method that takes advantage of the robust Covariance
Matrix Adaptation Evolution Strategy (CMA-ES) algorithm was proposed and
evaluated. Similar to the UKF-based method, this method demonstrated strong robustness against noise, a wide capture range and good computational performance. In addition, this method had better usability because fewer user parameters were involved. Preliminary and extended results of this work were
published in the proceedings of IEEE EMBC 2006 and SPIE 2008, both with
J. Stewart and P. Abolmaesumi as co-authors [37, 36].
3. A multiple-object 2D-3D registration method was proposed and evaluated.
This method is especially useful in CAOS for identifying the relative positions
of the intraoperative bone fragments with respect to preoperative treatment
plans. Such capability can lead to novel computer-assisted image-guided multi-fragment fracture fixation techniques, and to methods for postoperatively assessing the treatment errors. This work was published in the journal IEEE TBME in
2011, with J. Stewart and P. Abolmaesumi as co-authors [41].
4. The proposed multiple-object 2D-3D registration method depends on two independent procedures: developing the treatment plan and performing the registration. Using a statistical shape model of the bone being treated as a reference can significantly simplify the entire process. As the performance of constructing and using such a statistical shape model is very important, a new method was proposed
to model, construct and use 3D intensity atlases. This work was published in
the proceedings of IEEE EMBC 2010, with J. Stewart and P. Abolmaesumi as
co-authors [40].
5. A new multiple-object 2D-3D registration method that incorporates the statistical atlas of the bone was proposed and evaluated. This method integrates
automatic planning with the registration, and thus aims to improve clinical usability. This work was published in the proceedings of SPIE 2009, with P.
Abolmaesumi as the co-author [39].
1.4 Thesis Organization
The rest of this thesis is organized into seven chapters:
Chapter 2 gives an overview of CAOS and provides a review of current 2D-3D
registration techniques. Key problems in 2D-3D registration that motivated this
thesis are also discussed.
Chapters 3 and 4 present two new 2D-3D registration techniques that employ the
UKF and CMA-ES optimization algorithms to improve the registration performance.
Chapter 5 describes the first multiple-object 2D-3D registration method for pose
identification in CAOS.
Chapter 6 presents the new method for modelling, constructing and using 3D
intensity-based anatomical atlases.
Chapter 7 presents the multiple-object 2D-3D registration method that incorporates the atlas of the bone for improved user experience.
Finally, Chapter 8 gives some concluding remarks and points out the potential
directions of future research.
Chapter 2
Background
This chapter first provides an overview of CAOS and highlights how 2D-3D registration is situated in the big picture of CAOS. Then, a brief introduction to 2D-3D
registration is given and the key problems are described in more detail.
2.1 Overview of CAOS
Over the past two decades, traditional surgery has been rapidly moving toward computer-assisted surgery (CAS) because CAS enables the development of more accurate, minimally invasive procedures [30, 83, 105]. One important subspecialty of CAS is computer-assisted orthopaedic surgery (CAOS), which focuses on the integration of
CAS technologies into the field of orthopaedic surgery. Applications of CAOS have
been found in many orthopaedic therapies such as spinal surgery, total hip arthroplasty (THA), hip resurfacing arthroplasty (HRA), total knee arthroplasty (TKA),
anterior cruciate ligament (ACL) reconstruction, distal radius osteotomy (DRO), and
so on. A large number of articles, in the form of clinical trials and retrospective reviews, have been published, and the use of CAOS has demonstrated benefits such as revolutionized operating room (OR) configurations and surgical procedures, enhanced preoperative planning, improved intraoperative effectiveness and efficiency, increased speed of postoperative recovery, and improved clinical outcomes [1, 24, 79].

Figure 2.1: A typical CAOS navigation system [26].
2.1.1 Components
Fig. 2.1 illustrates a typical CAOS setup in the OR. As shown in the figure, a CAOS
system has three major constituents:
• A camera or 3D localizer that is used to determine the reference coordinate
frame (CF) of the OR, and to integrate all trackable OR devices;
• Various trackable devices that are monitored by the camera to provide the
positions or poses of the patient or other OR objects. Example devices include
pointers, dynamic reference bases (DRB), surgical tools, intraoperative imaging
devices, and so on;
• A computer workstation that stores preoperative surgical plans and connects
all OR devices to provide image-based navigation.
2.1.2 Procedures
Using CAOS technologies for medical treatment generally consists of three phases or
procedures:
Preoperative planning. Before a surgery is conducted, a digital “patient” is created
on the computer workstation and is subsequently used as the basis for making surgical
plans. The digital “patient” consists of various information about the patient and
trauma such as medical images, 3D models of the involved anatomical structures,
geometry information of implants, and relevant functional data. Multiple
imaging modalities may be used concurrently for better visualization of different
anatomic structures. All information is mapped into a common coordinate frame
before surgical plans can be made. The planning process is patient and procedure
specific, and has variable complexity. A plan can be as simple as specifying a
target, or as complex as determining the final shape of a bone fracture and specifying
the movement trajectories of various fracture fragments.
Intraoperative intervention. Once the patient is in the OR, the digital “patient” on the computer is mapped to the physical patient; that is, the coordinate frame of the digital “patient” is aligned to the OR coordinate frame that is determined by the camera. This task is performed either by aligning a set of anatomical or artificial landmarks that are visible both preoperatively and intraoperatively, or by capturing a
few intraoperative images and aligning them with the preoperative images. During
the intervention, this task may also be performed periodically to update the surgical
plan to reflect critical anatomical deformations or patient pose changes. Once the two
coordinate frames are aligned, the navigation system provides visual assistance to the
surgical team by tracking the surgical tools and visualizing the spatial relationships
of the anatomical structures. The digital “patient” and the associated surgical plans
are used as references to provide visual or quantitative feedback on the treatment
accuracy.
Postoperative assessment. After the surgery, it is necessary to quantitatively or
visually evaluate the outcome of the intervention with respect to the preoperative surgical plan. This task is often achieved using non-invasive methods because the evaluation may be repeated periodically over the healing period. In the most commonly used
method, postoperative images are captured and compared with the preoperative images and the surgical plan.
Each of the procedures described above employs various technologies. Collectively,
several key technologies can be identified [83, 120]:
• Medical imaging and image processing that provide appropriate 2D or 3D images
to aid the therapy;
• Segmentation and anatomical modelling that extract and construct anatomical
models from medical images to aid planning, visualization, navigation, and so
on;
• Surgical planning that performs tasks such as trauma visualization, determination of surgical paths, simulation of surgical procedures, and so on, before an
intervention is conducted;
• Data registration that integrates different images or anatomical models into
a common context such that more comprehensive understanding about the
trauma can be obtained;
• Visualization that renders 3D images and anatomical models to highlight the
structures of interest;
• Tracking that reports the positions or poses of various OR objects such as
surgical tools, intraoperative imaging devices, DRBs, and so on;
• Human-computer interaction that accepts inputs from the surgical team and
provides visual or quantitative feedback.
Though all the technologies listed above are important for CAOS, this thesis focuses on registration; specifically, on 2D-3D registration.
2.1.3 2D-3D Registration in CAOS
One popular type of registration is 2D-3D registration, which aligns high-quality preoperative 3D data sets such as CT, MRI and 3D ultrasound to portable intraoperative 2D image modalities such as X-ray fluoroscopy and ultrasound. In CAOS, registration of CT to X-ray is especially important because these two modalities are not only the most suitable for imaging bony structures, but also widely available in hospitals.
2D-3D registration is a fundamental task in many CAOS procedures. In preoperative
planning, it can be used to provide 3D visualizations of the trauma by registering
bone models to fluoroscopic images. During intraoperative surgical intervention, it
can be used to map the preoperative plan on the computer to the physical patient
in the OR by registering the digital “patient” to a set of intraoperative fluoroscopic
images. This task is the most important use of 2D-3D registration in CAOS, and
it is the key prerequisite that enables image-based navigation and surgical guidance.
In postoperative assessment, 2D-3D registration can be used to measure or evaluate
the treatment errors by registering the preoperative plan to a set of postoperative
fluoroscopic images.
2.2 Overview of 2D-3D Registration
2D-3D registration is a special case of the general image registration problem. Therefore, a short introduction to image registration is given first, followed by a focused discussion of the 2D-3D registration problem.
2.2.1 Registration in General
The registration task integrates multiple data sets of an object into a common context in order to provide useful information to physicians. This task aligns two or more
independent data sets in a single reference coordinate frame and establishes spatial
correspondences among the points in different data sets. The data to be aligned can
be raw medical images or anatomical models that are derived from the images.
One fundamental type of registration is called Pair-wise registration, which is a
process of geometrically aligning two correlated data sets. One of the two data sets serves as the reference data, and the other data set is transformed into the reference
data’s coordinate frame such that the two data sets are aligned. Fig. 2.2 illustrates
the key components as well as the general work-flow of a pair-wise registration.
Figure 2.2: Key components and general work-flow of a pair-wise registration.

The key components of a pair-wise registration are two data objects and three processes:
• A Fixed data set that serves as the reference data;
• A Moving data set that is transformed by the registration to match the reference
data;
• A transformation process, parameterized as a vector of scalar parameters, that
positions the moving data in the coordinate frame of the reference data;
• A similarity calculation process that evaluates a user-defined cost function,
named the similarity metric, to quantitatively report the accuracy of alignment
under a set of transformation parameters; and
• An optimization process that uses a user-selected algorithm, called the optimizer, to compute values of the transformation parameters which produce an
optimal alignment (i.e. maximally accurate) between the two data sets.
In practice, the processes of transformation and similarity calculation are usually
combined into a single process for better computational efficiency. The objective of
registration is to determine the transformation parameters. The problem is often
solved iteratively: starting from an initial guess of the transformation parameters,
the moving data is first transformed and the similarity metric is calculated, then
the transformation parameters are refined by the optimizer such that the similarity
metric is improved. The process is repeated, and the final result of the transformation
parameters is reported when the similarity metric satisfies some predefined criteria.
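To make this work-flow concrete, the sketch below (Python with NumPy; an illustration, not code from the thesis) shows how the three processes interact in a generic iterative pair-wise registration. The callables `transform`, `similarity` and `optimizer_step` are hypothetical placeholders for the components described above.

```python
import numpy as np

def register(fixed, moving, theta0, transform, similarity, optimizer_step,
             max_iters=200, tol=1e-6):
    """Generic iterative pair-wise registration loop (illustrative sketch)."""
    theta = np.asarray(theta0, dtype=float)
    prev_score = -np.inf
    for _ in range(max_iters):
        warped = transform(moving, theta)     # transformation process
        score = similarity(fixed, warped)     # similarity calculation process
        if abs(score - prev_score) < tol:     # predefined stopping criterion
            break
        theta = optimizer_step(theta, score)  # optimization process refines theta
        prev_score = score
    return theta
```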
Group-wise registration is another type of registration that is widely used in medical image analysis and CAS. It simultaneously aligns more than two data sets in order to obtain an optimal global alignment among all involved data. Group-wise registration can be treated as a special case of pair-wise registration, where all data sets are
collectively treated as the moving data, and the fixed data is unknown in advance
but is dynamically determined throughout the registration process. Group-wise registration is usually implemented using special algorithms that iteratively optimize the
registration parameters and the fixed data, but it can also be implemented by using
a series of pair-wise registrations.
Registration has been used in a wide range of applications. It is one of the underpinning technologies in computer-assisted surgery that enables non-invasive or minimally invasive medical procedures. Registration techniques can be classified according
to different criteria, with the commonly used ones being the modalities of individual
data (single-modality versus multiple-modality), the number of data to be aligned
(pair-wise versus group-wise), the nature of the transformation that positions the
data in the reference coordinate frame (rigid, affine or deformable), the dimensionalities of individual data (2D-2D, 3D-3D or 2D-3D), and the nature of the optimization
algorithm (analytic versus iterative). Comprehensive surveys on pair-wise registration can be found in the literature [17, 43, 48, 61, 69, 128], and several methods on
group-wise registration have also been reported [4, 10, 66, 71, 81, 100, 109, 130].

Figure 2.3: Key components and work-flow of DRR-based 2D-3D registration.
2.2.2 2D-3D Registration
2D-3D registration is a special type of pair-wise registration. It is a common practice
to register the 3D data to the 2D data. That is, the 2D and 3D data are used as the
fixed and moving data, respectively. Traditionally, the 3D moving data is a single 3D
image; however, in certain applications such as CAOS for fracture treatment, there
is an increasing need for 3D data that consists of multiple 3D data members such
as 3D images of multiple fracture fragments or multiple anatomical models. When
multiple 3D data members are involved, the registration is called multiple-object 2D-3D registration.
As a 3D data set contains significantly more information than a 2D image, the 2D
fixed data usually consists of a set of Co-registered 2D images rather than a single 2D
image. The term “co-registered” means that the 2D images are in the same coordinate
frame, taken from the same subject but from different viewing directions.
2D data and 3D data cannot be directly compared, so special processing is needed
when defining the similarity metric. Two solutions have been used to deal with this
issue: either constructing an intermediate 3D data set from the 2D data [90, 107, 126], or generating an intermediate 2D data set from the 3D data [9, 13, 34, 50, 55, 56, 112].
In the first solution, an intermediate 3D data set is computed from the 2D fixed data,
and the similarity metric is defined between the 3D moving data and the intermediate
3D data. This solution requires some prior knowledge to reconstruct the intermediate 3D data from the 2D fixed data because the 2D fixed data itself may not contain
enough information to perform an accurate reconstruction. As the similarity calculation is performed in 3D, the computation cost is also increased. Because of these limitations, only a small number of methods have used this solution.
A more popular solution is to compute an intermediate 2D data from the 3D
moving data and define the similarity metric between two sets of 2D data. The
intermediate 2D data can be obtained by simulating the X-ray imaging process on
the 3D moving data with the same imaging parameters that produced the 2D fixed
data. For each fluoroscopic image in the 2D fixed data, a corresponding 2D projection
image, known as a digitally reconstructed radiograph (DRR), is synthesized. As this solution depends heavily on intermediate DRRs and the similarity calculation is performed on raw intensities or features derived from the intensities (such as the image
gradient), the 2D-3D registration based on this scheme is also called DRR-based or
intensity-based 2D-3D registration, or gradient-based 2D-3D registration if the image
gradient is primarily used. Fig. 2.3 shows an updated diagram for DRR-based 2D-3D
registration.
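As an illustration of the DRR-based scheme (a sketch under simplifying assumptions, not the thesis implementation), the cost function below synthesizes one DRR per fluoroscopic view for a candidate pose and averages a similarity score over all views; `render_drr` is a hypothetical DRR generator (see Section 2.5), and normalized cross-correlation stands in for the metrics of Section 2.6.

```python
import numpy as np

def ncc(a, b):
    """Normalized correlation coefficient between two images."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

def drr_based_cost(theta, ct_volume, fluoro_images, view_geometries, render_drr):
    """Mean similarity between each fluoroscopic image and the DRR synthesized
    from the CT volume under candidate pose theta with that view's geometry."""
    scores = [ncc(fluoro, render_drr(ct_volume, theta, geom))
              for fluoro, geom in zip(fluoro_images, view_geometries)]
    return float(np.mean(scores))
```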
In the subsequent sections, each of the components in 2D-3D registration is briefly described. As comprehensive reviews on 2D-3D registration can be found in
several publications [70, 114, 129], the goal in this chapter is not to give a detailed
review, but to highlight the technologies and problems that are relevant to this thesis.
In each of the following chapters, further references that are relevant to the particular
chapter are provided.
2.3 2D and 3D Data in CAOS
As mentioned in Section 2.1.3, the most important and commonly used imaging
modalities in CAOS are 2D fluoroscopy and 3D CT. They are summarized below to
show that the data can be an important source of errors in 2D-3D registration.
2.3.1 X-ray fluoroscopy
X-ray imaging has been used for medical applications for over a century. X-rays are high-energy electromagnetic waves. When the radiation travels through the human body, the rays interact with the tissue layers and
are gradually attenuated. The types of interaction include photoelectric absorption,
Rayleigh scattering and Compton scattering, and the combined interaction is measured using the term attenuation coefficient. Different tissues have different attenuation coefficients. After penetrating the human body, the attenuated rays hit an
X-ray sensitive detector and are converted into a visible projection image. Traditionally, the projection image is recorded on film, which is not a suitable medium for
computer-assisted applications. Nowadays, electronic media are used and the image
is instantly displayed on a monitor. This type of X-ray device is commonly known as fluoroscopy, or C-arm, because the detector and radiation source are mounted on
the two ends of a C-shaped frame.
Let $I_0$ be the initial energy of an X-ray $x$ at the source point, $I_1$ be the attenuated energy at the detector plane, and $A$ and $B$ be the incident and exit points of the ray through the tissue. Then, the following relationship between the two energies exists:

$$I_1 = I_0 \, e^{-\int_A^B \mu(s)\,ds}, \qquad (2.1)$$

where $s$ is a point on the ray $x$, $\mu(\cdot)$ is the tissue absorption coefficient at the given point, and $ds$ is an infinitesimal length along the ray. A fluoroscopy device measures the photon fluence (determined by $I_1$) reaching the detector plane, and the quantized pixel value is determined by the physical characteristics of the imager.
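In DRR computation, for example, the line integral in Eq. (2.1) is discretized into a sum over small steps along the ray; the sketch below (with hypothetical numbers) evaluates this discrete form:

```python
import numpy as np

def attenuate(I0, mu_samples, ds):
    """Discretized Eq. (2.1): I1 = I0 * exp(-sum_i mu_i * ds), where mu_samples
    are attenuation coefficients sampled along the ray between A and B."""
    return I0 * np.exp(-np.sum(mu_samples) * ds)

# Example: a ray crossing 50 mm of tissue with mu = 0.02 per mm (a hypothetical
# coefficient) retains exp(-1), about 37%, of its initial energy.
I1 = attenuate(I0=1.0, mu_samples=np.full(50, 0.02), ds=1.0)
```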
Eq. (2.1) shows the X-ray imaging process for a monochromatic X-ray beam. In
practice, the X-ray beams used in fluoroscopy devices are polychromatic and have
a moderately broad energy spectrum. When such a beam travels through a tissue,
X-rays with different energies are attenuated differently: some are more easily attenuated (called soft X-rays), and some are more penetrating (called hard X-rays).
The net effect is that the total amount of attenuation is determined by the tissue
thickness and the beam spectrum, and the thicker the tissue is, the more penetrating
the beam will be. This phenomenon is called beam hardening, and can cause artefacts
in recorded images [97].
Fluoroscopic images are characterized by a small field of view and generally exhibit
geometric and intensity distortions. The distortions vary for different imaging orientations, because C-arm devices are heavy objects and rotating the C-arm will cause
deformations in the shape of the C-arm frame. Newer C-arm devices can produce images with less distortion, but they are expensive and not yet widely available in the OR.

Figure 2.4: An example of a calibration drum for correcting C-arm distortions [60].

For traditional C-arm devices, it has been reported that a different distortion
correction for every imaging orientation is necessary [46]. Two approaches have been
proposed for correcting C-arm distortions: offline calibration [121] and online calibration [60, 63, 102, 104]. The offline approach computes the calibration parameters for a
fixed set of C-arm orientations. It produces cleaner images, but its main drawback is that the available imaging orientations are limited. The more popular approach is
the online calibration, which computes the calibration parameters for every captured
image. This is usually realized by mounting a two-layer calibration drum on top of
the C-arm intensifier. Each of the layers contains a number of radio-opaque markers
with different diameters (usually two) and known geometrical configurations. When
an image is captured, some of the markers on each layer are detected from the image,
and subsequently used to correct the geometry distortions as well as to compute the
X-ray source location. To detect the markers from the image, the prior knowledge of
the marker configurations is utilized. Fig. 2.4 shows a C-arm calibration drum.
Figure 2.5: An example of X-ray fluoroscopy before (left) and after (right) calibration.
While the use of a calibration drum provides a good trade-off between the calibration accuracy and the accessibility of the device, it also introduces side-effects.
First, the markers on both layers appear in the captured images, and must be removed and the affected pixels interpolated before further processing. This process introduces
additional noise into the images. Fig. 2.5 shows an example of an X-ray image before
and after removing the calibration markers, where the introduced noise is small but
still visible. Second, the calibration requires enough markers to be detected, which
may not be possible for some imaging orientations due to occlusions between the
markers and the imaged tissues. Furthermore, the drum is not only used for calibration, but is also used to report the imaging orientations, which means that the drum
must be visible to the camera in order to produce valid images. However, this is
often a problem in the crowded OR environment. So the use of a calibration drum also limits the available imaging orientations. For the experiments in this thesis, a commercial
product was used to calibrate the acquired X-ray images, and the calibration results
were dependent on various factors such as the source-to-object distance, the imaging
parameters, the type of bones, and so on.
2.3.2 Computed Tomography (CT)
CT is a 3D X-ray modality that is reconstructed from 2D X-ray projections. The scanner acquires cross-sectional projection images of the human body, and a 3D tomographic data set is then computed from the projection images by inverting the Radon Transform [106]. The
projection images are acquired by rapidly rotating the X-ray tube around the patient,
and the transmitted radiation is measured using an array of X-ray detectors that are
mounted on the device gantry.
There are two basic types of CT machines: single-slice CT and spiral CT. Traditionally, the X-ray source rotates through 360◦ within the gantry and the patient
table is moved through the X-ray beam in discrete steps. At each table position, one
image slice is acquired. In modern CT scanners, the X-ray source generates a fan
beam of X-rays, and multiple detectors are used to record the image data simultaneously. Compared with the first generation CT scanners that use a single X-ray beam,
this design greatly improves the image acquisition speed.
In spiral CT scanners, the X-ray source continuously rotates within the gantry
while the patient table is moved through the X-ray beam at a constant speed. So the
radiation passing through the patient takes on a spiral or helix form. As a continuous
volume is acquired in one go, compared with conventional single-slice scanners,
the image acquisition speed is significantly improved. To further improve the scanning
efficiency, multislice spiral CT scanners were developed. In such devices, an array of detectors in the z direction is used and multiple spiral slices are acquired simultaneously.
As anatomical regions of interest can be imaged within a single breath hold, the
possible artefacts due to patient movement can be reduced.
CT imaging is considered to be geometrically accurate, so user calibrations are
usually not necessary. However, it can exhibit intensity artefacts when metallic objects are present in the field of view. These artefacts are the result of reconstruction
using corrupted projection data, which is caused by the X-rays being greatly attenuated by the metal. There are a number of approaches to metal artefact reduction,
including the use of higher-energy X-ray beams and interpolation of the missing projection
data. CT imaging can also present artefacts that are caused by beam hardening.
However, for most recent CT scanners, such artefacts will be corrected internally
when CT images are acquired.
CT is mainly a preoperative modality, but intraoperative CT is also available.
Similar to X-ray fluoroscopy, the main drawback of CT imaging is the ionizing radiation.
2.3.3 Anatomical Atlases
An anatomical atlas [20], or statistical shape model, is a special type of data set that
is generalized from a set of images or anatomical models. It captures, in a compact
form, the mean and variability of an anatomy within a population of subjects or across
multiple studies of the same subject over time. It not only represents the subjects that are used to construct the atlas, but can also predict the shapes of unseen new subjects. In registration applications, this property of an atlas is very useful because an atlas instance can be used in place of a missing data set and the concrete
shape of the instance can be dynamically determined during the registration.
An atlas of a particular anatomy is constructed from a set of subjects of the
anatomy called training examples. The training examples are geometry or intensity
models of the anatomy, so the quality of the anatomical modelling is a key factor
that affects the quality of the constructed atlas. When geometry models are used,
the model representations are more compact, thus the segmentation error introduced
during model construction is a main quality factor for the constructed atlas. When
intensity models are used, accurate and efficient representations are needed to model
both geometry and intensity, thus the selection of an appropriate model representation
is a key factor. Another important factor that affects the quality of an atlas is the
set of training examples that are used to construct the atlas. They need to be diverse
enough to cover all possible shapes, and should contain minimal artefacts in shape or intensity.
2.4 Transformations in 2D-3D Registration
2D-3D registration in CAOS involves a number of different coordinate frames. Fig. 2.6
illustrates the involved coordinate frames and their relationships for a typical multiple-object 2D-3D registration. The primary coordinate frames include patient, camera,
C-arm intensifier, fluoroscopic image, and 3D data. If the 3D data is a collection of 3D
objects, then there will be additional coordinate frames for the member objects. The
available coordinate frames can be split into two groups: those that are associated
with the fixed data, and those that are associated with the moving data. The goal
of registration is to establish a link between the two groups of coordinate frames,
or specifically, to find a spatial transformation that places the moving data into the
coordinate frame of the fixed data.
Figure 2.6: Transformations in a multiple-object 2D-3D registration.
2.4.1 Transformations within the Fixed Data
In general, the patient coordinate frame is used as the reference coordinate frame
of the fixed data, and it is defined by a DRB that is mounted on the patient. For
a captured fluoroscopic image i, its pose in the reference coordinate frame can be
written as a concatenation of three transformations
intensif ier
camera −1
camera
)i ,
(Tfpatient
luoro )i = (Tpatient ) (Tintensif ier )i (Tf luoro
(2.2)
where i = 1...N and N is the number of fluoroscopic images. In the above equation,
camera
camera
Tpatient
and (Tintensif
ier )i are reported by the camera, and respectively represent the
poses of the patient and the C-arm intensifier (or calibration drum) in the coordinate
ier
frame of the camera. (Tfintensif
)i is the conversion from the X-ray image coordinate
luoro
2.4. TRANSFORMATIONS IN 2D-3D REGISTRATION
24
frame (usually defined with respect to the X-ray source) to the intensifier coordinate
frame (usually defined with respect to the center of the detector plane). This transformation consists of two parts: a constant part that is known when the two involved
coordinate frames are determined, and a variable part that is reported by the calibration procedure to compensate for the X-ray source deviation for the current imaging
orientation. The transformations within the fixed data are shown in blue dotted lines
in Fig. 2.6.
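With every pose expressed as a 4x4 homogeneous matrix, Eq. (2.2) reduces to a chain of matrix products. A minimal sketch (assuming the tracker and calibration already supply the three matrices; the argument names are illustrative) is:

```python
import numpy as np

def fluoro_pose_in_patient(T_cam_patient, T_cam_intensifier_i, T_int_fluoro_i):
    """Eq. (2.2): pose of fluoroscopic image i in the patient coordinate frame.

    T_cam_patient       -- patient (DRB) pose reported by the camera
    T_cam_intensifier_i -- intensifier pose for image i, reported by the camera
    T_int_fluoro_i      -- image-to-intensifier conversion from calibration
    All arguments are 4x4 homogeneous transformation matrices.
    """
    return np.linalg.inv(T_cam_patient) @ T_cam_intensifier_i @ T_int_fluoro_i
```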
2.4.2 Transformation of the Moving Data
Depending on the type of the moving data, the transformation that is computed by
registration has different forms.
A single 3D object as the moving data
When the moving data is a single 3D image, the transformation is a single rigid
transformation which brings the 3D image into the patient coordinate frame:
$$T(\,\cdot\,; \theta) = T^{patient}_{mdata}, \qquad (2.3)$$

where $\theta$ is a vector of scalars which represents a user-selected parametrization of the rigid transformation.
If the 3D image is not available for some reason, it is a common practice to use
an anatomical atlas as the replacement. In such cases, an additional transformation
needs to be determined, and the general transformation is composed as:
$$T(\,\cdot\,; \theta, \theta_{atlas}) = T^{patient}_{mdata} \, T^{instance}_{atlas}, \qquad (2.4)$$

where $T^{instance}_{atlas}$ models the process of producing instances from the mean shape of the atlas, which is a deformable transformation represented using parameters $\theta_{atlas}$, derived from a Principal Component Analysis (PCA) of the anatomical shape and/or intensity variations.
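As a concrete (illustrative) reading of this transformation, a linear PCA model realizes $T^{instance}_{atlas}$ by adding weighted principal variation modes to the atlas mean; the representation actually used in this thesis is the subject of Chapter 6.

```python
import numpy as np

def atlas_instance(mean_data, modes, theta_atlas):
    """Generate an atlas instance from PCA mode weights (sketch).

    mean_data   -- flattened mean shape/intensity of the training set, shape (d,)
    modes       -- principal variation modes as columns, shape (d, m)
    theta_atlas -- mode weights, i.e. the parameters theta_atlas of Eq. (2.4)
    """
    return mean_data + modes @ np.asarray(theta_atlas, dtype=float)
```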
Multiple 3D objects as the moving data
For moving data that contains multiple 3D images, the general transformation consists
of a global transformation and a set of local transformations:
$$T_{global}(\,\cdot\,, \theta_g) = T^{patient}_{mdata}; \qquad (2.5)$$
$$T_{local}(\,\cdot\,, \{\theta_k\}) = \{(T^{mdata}_{CT})_k\}, \quad k = 1 \ldots M. \qquad (2.6)$$

$T^{patient}_{mdata}$ is a rigid transformation which brings the moving data as an entirety into the patient coordinate frame, $\{(T^{mdata}_{CT})_k\}$ is a set of rigid transformations that position individual member objects within the moving data, and $M$ is the number of member objects in the moving data. When implementing the registration algorithm, the global transformation can be merged into individual local transformations to reduce the number of parameters to be estimated. However, having a separate global transformation can improve the performance and robustness of the registration. Fig. 2.6 illustrates such a case, and the transformations to be computed are shown in red dotted lines.
When an atlas is used instead of a set of 3D images, each member object is now a
dynamic instance, and the corresponding local transformation needs to be extended
to include the process of instance generation from the atlas. The updated general
transformation can be written as:
$T_{global}(.; \theta_g) = T^{patient}_{mdata}$;   (2.7)

$T_{local}(.; \{\theta_k\}, \theta_{atlas}) = \{(T^{mdata}_{instance})_k (T^{instance}_{atlas})_k\}, \quad k = 1...M$.   (2.8)
For a given member object $k$, the deformable PCA transformation $(T^{instance}_{atlas})_k$ produces an instance from the atlas, and the rigid transformation $(T^{mdata}_{instance})_k$ positions the instance within the moving data.
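To make the composition concrete, the following minimal NumPy sketch (an illustration only, not the thesis implementation; the parameter values and the helper rigid_matrix are hypothetical, and the deformable atlas step of Eq. (2.8) is omitted) maps a point of member object $k$ into the patient coordinate frame by chaining a local and a global rigid transformation, as in Eqs. (2.5) and (2.6):

    import numpy as np

    def rigid_matrix(rx, ry, rz, tx, ty, tz):
        # Build a 4x4 rigid transformation from three Euler angles
        # (radians, composed in Z*Y*X order) and a translation.
        cx, sx = np.cos(rx), np.sin(rx)
        cy, sy = np.cos(ry), np.sin(ry)
        cz, sz = np.cos(rz), np.sin(rz)
        Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
        Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
        Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
        T = np.eye(4)
        T[:3, :3] = Rz @ Ry @ Rx
        T[:3, 3] = (tx, ty, tz)
        return T

    # Hypothetical poses: one global transformation for the whole moving
    # data (T^patient_mdata) and one local transformation for fragment k.
    T_global = rigid_matrix(0.10, 0.00, -0.05, 4.0, 1.5, 0.0)
    T_local_k = rigid_matrix(0.00, 0.20, 0.00, -2.0, 0.0, 3.0)

    p_k = np.array([10.0, 5.0, 2.0, 1.0])     # homogeneous point on fragment k
    p_patient = T_global @ T_local_k @ p_k    # the point in the patient frame

Merging the global transformation into the local ones, as mentioned above, would simply mean pre-multiplying every local matrix by T_global and optimizing the products directly.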
2.4.3 Parametrization of the Output Transformation
The general transformation of the moving data is the main output of the registration. It needs to be parametrized before the optimization algorithm can search for a solution. Depending on the optimization algorithm being used, the parametrization method can be a key factor that affects the performance and robustness of the optimization process. In general, a good parametrization has a small number of parameters, little ambiguity (e.g., parameters that are orthogonal or independent of each other), similar dynamic ranges for all parameters, uniform behaviour across all regions of the parameter domain, and so on.
As described in the previous section, the general transformation to be computed
in this thesis may involve two types of transformations: 3D rigid transformations of
rigid bone fragments, and 3D deformable transformations of a statistical atlas derived from PCA-based parametrization of the bone shapes in a population. While
deformable transformations have a unique way of PCA-based parametrization, the
parametrization of rigid transformations has various forms. A rigid transformation
with parameters θ can be decomposed into two consecutive sub-transformations: a
translation with parameters $\theta_t$, and a rotation with parameters $\theta_r$. The parametrization of translation is simply a three-component vector that describes the offsets with respect to the three coordinate axes, that is, $\theta_t = (t_x, t_y, t_z)$. For the rotation sub-transformation, a number of parametrizations exist, with the most general one being
the 3 × 3 rotation matrix. This section discusses several parametrizations of the
rotation that are commonly used in 2D-3D registration.
Euler Angles
Euler angles [57, 54] represent the rotation using a three-component vector $\theta_r = (r_x, r_y, r_z)$, where $r_x$, $r_y$ and $r_z$ are rotation angles of three sequential rotations around
each of the coordinate axes. The relationship between this representation and the
rotation matrix can be seen by writing each of the individual rotations as a matrix,
and then composing those matrices. The individual matrices can be composed in
different orders; however, their impact on optimization is not significant.
A rotation represented using Euler angles has a minimal number of parameters, so it
is an efficient representation for optimization. Another advantage of this parametrization is that, in most medical applications, the values of the rotation and translation
parameters have similar ranges if the angles are in degrees and the translations are
in millimetres. This is a preferred property for many optimization algorithms such
as Gradient Descent and Downhill-Simplex.
The drawback of using Euler angles is that the angles are coupled to each other.
That is, for a given rotation, there are multiple sets of Euler angle representations
that can produce the same rotation. This ambiguity can cause a problem known as
“gimbal lock” [119], which is a loss of one rotational degree-of-freedom (DOF) when
certain parameter values of the representation are encountered.
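For illustration, a minimal NumPy sketch (not part of the thesis; the X-Y-Z composition order is one arbitrary choice) shows how Euler angles are turned into a rotation matrix, and how the gimbal-lock ambiguity manifests itself when the middle angle is 90°:

    import numpy as np

    def euler_to_matrix(rx, ry, rz):
        # Compose three axis rotations (radians) in X, then Y, then Z order.
        cx, sx = np.cos(rx), np.sin(rx)
        cy, sy = np.cos(ry), np.sin(ry)
        cz, sz = np.cos(rz), np.sin(rz)
        Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
        Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
        Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
        return Rz @ Ry @ Rx

    # Gimbal lock: with ry = 90 degrees, rx and rz act about the same
    # effective axis, so two different parameter triples give one rotation.
    R1 = euler_to_matrix(np.radians(30), np.radians(90), 0.0)
    R2 = euler_to_matrix(0.0, np.radians(90), np.radians(-30))
    assert np.allclose(R1, R2)   # one rotational DOF has been lost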
Unit Quaternion and Angle-Axis
Unit quaternion [54] was formulated to overcome the “gimbal lock” problem in the
Euler angle representation. In this parametrization, the rotation is represented using
a unit quaternion $\theta_r = (X, Y, Z, W)$. A unit quaternion is a four-element vector having unit magnitude. Three of the elements, $(X, Y, Z)$, determine the rotation axis, and the fourth, $W$, determines the rotation angle about that axis. The use of unit quaternions avoids the “gimbal lock” problem because each rotation can be represented by an unambiguous quadruple whose parameters are independent of each other. Unit quaternions also have other nice properties, such as easy composition and differentiation, and the ability to interpolate smoothly between any two rotations.
As the unit quaternion has four parameters of two types (i.e. axis and angle), it is slightly more expensive to optimize than the Euler angles; in order to achieve good optimization performance, it is preferable to use dedicated optimization algorithms that can exploit the special properties of the unit quaternion.
Another representation that closely relates to the unit quaternion is Angle-Axis
[118]. The two representations are essentially the same, but Angle-Axis explicitly
represents the rotation angle using degrees or radians. This representation shares the
same advantages and disadvantages as the unit quaternion.
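As a sketch of these properties (again an illustration under the $(X, Y, Z, W)$ ordering used above, not code from the thesis), the following shows the construction of a unit quaternion from an axis and angle, and the composition of two rotations by the Hamilton product:

    import numpy as np

    def quat_from_axis_angle(axis, angle):
        # Unit quaternion (X, Y, Z, W) for a rotation of `angle` radians.
        axis = np.asarray(axis, dtype=float)
        axis /= np.linalg.norm(axis)
        return np.append(axis * np.sin(angle / 2.0), np.cos(angle / 2.0))

    def quat_multiply(q1, q2):
        # Hamilton product: composing two rotations is a single product.
        x1, y1, z1, w1 = q1
        x2, y2, z2, w2 = q2
        return np.array([
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 + y1 * w2 + z1 * x2 - x1 * z2,
            w1 * z2 + z1 * w2 + x1 * y2 - y1 * x2,
            w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        ])

    qa = quat_from_axis_angle([0, 0, 1], np.radians(45))
    qb = quat_from_axis_angle([1, 0, 0], np.radians(30))
    q = quat_multiply(qa, qb)
    assert np.isclose(np.linalg.norm(q), 1.0)   # stays on the unit sphere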
Versor
Versor [45] is derived from the unit quaternion, and can be written as $\theta_r = (v_x, v_y, v_z)$. It encodes the rotation angle into the rotation axis by scaling the unit rotation axis with a factor which is the sine of half of the rotation angle. The advantage is that the
number of parameters to be optimized is reduced by one, and general optimization
algorithms work well with this representation because all parameters are in the same
range. During optimization, changing one parameter will simultaneously change the
rotation axis and the rotation angle, which may not be a desired behaviour for some
applications.
Spherical
Spherical representation [3] describes a rotation using three angles $\theta_r = (\alpha, \beta, \gamma)$. It can be seen as a variant of the versor or unit quaternion, where $\alpha$ and $\beta$ are used
to represent the rotation axis in the spherical coordinate system, and γ represents
the rotation angle about the rotation axis. This representation removes the coupling
between the axis and angle in versors while maintaining three parameters. Similar to
the Euler angles, the use of angles can benefit the optimization algorithms because
the rotation and translation parameters can be scaled to have the same range. One
drawback of this representation is that, because a spherical coordinate system is used, the optimization does not behave uniformly across all areas of the parameter domain.
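A minimal conversion sketch (an assumption about the exact angle conventions, which vary between implementations) makes the decoupling explicit: $\alpha$ and $\beta$ select the axis, and $\gamma$ is the angle about it:

    import numpy as np

    def spherical_to_axis_angle(alpha, beta, gamma):
        # alpha: azimuth and beta: polar angle of the rotation axis on the
        # unit sphere; gamma: rotation angle about that axis (all radians).
        axis = np.array([np.sin(beta) * np.cos(alpha),
                         np.sin(beta) * np.sin(alpha),
                         np.cos(beta)])
        return axis, gamma

The non-uniform behaviour mentioned above stems from the sphere parametrization itself: near the poles ($\beta \approx 0$ or $\pi$), large changes in $\alpha$ move the axis very little.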
2.5 DRR Generation
DRR generation is the operation of simulating the X-ray imaging process on 3D
images to produce a 2D X-ray view of the 3D data. There are two key requirements
for DRR generation. Firstly, it must be fast enough, as many algorithms heavily
depend on dynamically generated DRRs. Secondly, the generated DRRs must closely
resemble the real X-ray images so that DRR generation is not a major source of errors.
For many years, generation of realistic DRRs has been the performance bottleneck for
a large number of 2D-3D registration methods. Recent technology advancement in
graphics processing units (GPUs) has greatly improved the speed of DRR generation.
However, the accuracy of the generated DRRs still needs to be improved. In many
publications, DRR generation is also known as volume rendering. It should be noted
that volume rendering has a broader meaning and embraces a variety of rendering
techniques, of which X-ray imaging simulation is only one. This section provides
an overview of the commonly used DRR generation techniques.
2.5.1 Ray-casting
As CT and fluoroscopic images are generated with different X-ray energy spectra
which are usually unknown to the user, it is difficult to exactly simulate the X-ray
imaging process (Eq. 2.1) on a CT image. Instead, approximate methods are used.
One commonly used method is ray-casting (sometimes known as ray-tracing, though
the latter is more general and more complex) which is defined as follows
$I_x = \sum_{i=0}^{n} C_i \alpha_i \prod_{j=0}^{i-1} (1 - \alpha_j)$,   (2.9)
where $x$ is the X-ray passing through the tissue, $n$ is the number of voxels on the ray, $C_i$ is the CT value in Hounsfield units at the $i$-th voxel along the ray, and $\alpha_i$ is the opacity at the corresponding voxel. The opacities for individual voxels are used
to simulate the tissue absorption coefficients from the CT numbers, and their values
are assigned by using a transfer function. The selection of an appropriate transfer
function is very important for generating realistic DRRs, and a few functions have
been suggested [73].
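As an illustration of Eq. (2.9), a minimal CPU sketch in Python (not an implementation from the literature; the linear ramp transfer function is a made-up example) accumulates the opacity-weighted CT values along one ray:

    import numpy as np

    def cast_ray(ct_values, transfer_function):
        # Evaluate Eq. (2.9) for one ray. `ct_values` holds the CT samples
        # (e.g. trilinearly interpolated) at successive points along the ray;
        # `transfer_function` maps a CT value to an opacity in [0, 1].
        intensity = 0.0
        transparency = 1.0            # running product of (1 - alpha_j)
        for c in ct_values:
            alpha = transfer_function(c)
            intensity += c * alpha * transparency
            transparency *= 1.0 - alpha
            if transparency < 1e-4:   # early ray termination
                break
        return intensity

    # Hypothetical ramp: soft tissue nearly transparent, bone mostly opaque.
    ramp = lambda c: float(np.clip((c - 300.0) / 1000.0, 0.0, 1.0))
    pixel = cast_ray([0.0, 250.0, 900.0, 1200.0, 400.0], ramp)

Even this simple per-ray loop hints at the cost: it must run once per output pixel, with one interpolation and one transfer-function lookup per step.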
The ray-casting approach can produce highly realistic DRRs. However, direct
implementation of Eq. (2.9) on the CPU is a time-consuming task due to expensive
operations such as interpolation and iteration. With recent quad-core CPUs, it still
takes more than one second to compute a DRR with moderate resolution (such as
256 × 256) from a CT image with usual resolution (such as 512 × 512 × 200), which
is a speed that is not appropriate for interactive use.
To improve the performance of ray-casting, several methods have been proposed.
One method accelerates the ray-tracing process by applying several techniques [125]:
replacing most floating-point computations with integer operations; removing non-interesting voxels; and early ray termination. The method can improve the computation speed by up to a few times with negligible compromise in DRR quality, but the
magnitude of improvement largely depends on the image contents and it is still not
significant for registration applications. Another method simulates the ray-casting
process by using graphics hardware [98]. It takes advantage of the GPU parallel
rendering mechanism, and ray-casting is implemented using GPU shader programs
such as DirectX Pixel Shader 3.0 and NVIDIA Fragment Program 2. With recent
consumer-grade GPUs, the speed improvement with respect to the original CPU-based method can go up to 100-200 times while maintaining good DRR quality.
However, using shader programs for such a task has become outdated since the more powerful GPGPU (General-Purpose computation on Graphics Processing Units) techniques, such as CUDA [2] and OpenCL [99], were developed a few years ago. In the latest improvement [96], ray-casting is implemented within the
GPU using the CUDA technology. Compared with the old GPU-based ray-casting
method, the new method marginally improved the DRR generation speed and the
DRR quality, but it significantly simplified the implementation and brought great
potential for further improvement with future GPUs.
2.5.2 GPU-based Texture Mapping
Texture mapping is a technique widely used in computer graphics. It maps a bitmap
image, called a texture, to a polygon, which is usually an expensive operation that
involves interpolation. Most recent GPUs provide hardware support for texture mapping as well as the alpha blending operation that combines two bitmaps. These two
features can be used together to accelerate the DRR generation process. The process
generally involves three steps: 1) slice the bounding box of the 3D image into polygons
along a particular direction; 2) map each polygon with the corresponding 2D texture
taken from the 3D image; and 3) blend all textured polygons into a final image. The
first step is often done by the CPU and the remaining steps are done by the GPU.
Texture mapping can be viewed as an iterative implementation of the ray-casting
process (Eq. 2.9). It starts from the furthest voxel (with respect to the source) on
the ray, and recursively accumulates the attenuation coefficients (simulated from the
CT numbers and opacities) of the voxels towards the source
$I_x^{(n)} = 0, \qquad I_x^{(i)} = C_i \alpha_i + (1 - \alpha_i) I_x^{(i+1)}, \quad i = n-1, ..., 0$,   (2.10)
where $C_i$ and $\alpha_i$ are the CT value and opacity of a voxel $i$ on the ray $x$, and the opacities are specified by the user via a transfer function.
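For illustration, the iteration of Eq. (2.10) can be sketched in NumPy as a back-to-front blend of the texture-mapped slices (a sketch only; on the GPU the same accumulation is performed by the hardware alpha-blending stage):

    import numpy as np

    def blend_back_to_front(slices, opacities):
        # `slices` and `opacities` have shape (n, H, W) and are ordered from
        # the nearest (i = 0) to the furthest (i = n - 1) slice.
        image = np.zeros_like(slices[0])          # I_x^(n) = 0
        for c, a in zip(slices[::-1], opacities[::-1]):
            image = c * a + (1.0 - a) * image     # Eq. (2.10)
        return image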
Early graphics hardware supports only 2D textures. In this case, the 3D image
is sliced along each of the three main axes, and each slice is stored as a 2D texture.
When a DRR is requested, the image axis that is closest to the viewing direction is
selected, and texture mapping and blending are performed along that direction. This
technique is called object-aligned texture mapping and has a major drawback: if the
angle between the slicing and viewing directions is too large, artefacts will appear.
Recent GPUs support 3D texture mapping, in which case the entire 3D image
is stored in the texture memory, and texture mapping and blending is performed
along the viewing direction. Compared with the 2D texture mapping technique, the
difference is that the 2D slices are now constructed on-the-fly out of the 3D image,
and therefore can be oriented perpendicular to the viewing direction, resulting in an
image with fewer artefacts. This technique is called view-aligned texture mapping
and is currently the most popular technique for DRR generation.
The performance of 3D texture mapping does not depend on the contents of the
3D image, which is a good property if constant computation time is important for the
context applications. However, for 3D images that contain a lot of empty voxels, it
is a kind of waste of the computation power. An improved method, called Adaptive
Slice Geometry for Hardware Assisted Volume Rendering, has been proposed to solve
this problem [9]. The method removes all empty voxels and computes Axis-aligned
Bounding Boxes (AABBs) for structures of interest in a pre-processing step. During
volume rendering, texture mapping is only applied for polygons that are obtained by
slicing the AABBs. The improvement in computation speed is obvious, as most 3D images contain a certain amount of non-relevant structures.
The slicing operation in texture mapping is usually done within the CPU using
some computational geometry algorithms. A simple method is to use a sweeping
plane to cut the bounding box of the 3D image along the viewing direction, and then
compute the intersections as well as the polygons for texture-mapping [9]. To improve
the slicing performance, a new method was proposed and the Marching Cubes [65]
algorithm along with a special look-up table was used to aid the computation of the
intersection polygons [6].
2.5.3 Other Techniques
Aside from the ray-tracing and texture-mapping techniques described above, several
other DRR generation techniques are also available for registration applications:
Shear-warp. This algorithm [55] transforms the 3D image to an intermediate coordinate system called the “sheared object space”. In this space the projective viewing
rays are transformed into axis-aligned parallel rays for easy walk-through along the
rays. The algorithm involves shearing, scaling and resampling the volume slices. The
slices are composed together in front-to-back order resulting in an intermediate 2D
image. The final step is to “warp” this image in order to transform it back to the
original image space. The rendering is very efficient, as the voxels in the intermediate slices correspond with the scan-lines of the final image, and can be composed
immediately. On the other hand this algorithm produces artefacts under certain
circumstances, and therefore may be problematic if used for accurate registration.
Splatting. This method [116] is similar to the ray-casting method but uses a
different projection scheme. Each voxel of interest is projected onto the 2D viewing
plane, and a Gaussian splat is used to approximate the projection result. Then,
all resulting splats are composed together in the order of back-to-front to produce
the final image. To improve the computation efficiency, only voxels that effectively
contribute to the final image are used during the calculation. This method has better
computation performance than the ray-casting method. However, artefacts such as
aliasing can exist. Several variants [131, 13, 44, 62, 113, 115] have been proposed to
further improve the speed and accuracy of the original splatting method.
Transgraph. Transgraph, or light-field, is a pre-computed data structure that
stores the pixel intensities of a large number of DRR rays in an efficient way [56, 93].
Each ray is represented using two points in the 3D space, and the associated DRR
pixel intensity is computed using the ray-casting function (Eq. 2.9) or any other
DRR formulation functions. This pre-computation is performed for many different
viewing directions around a user-selected reference direction. Rendering of a DRR is
then reduced to, for each ray in the output DRR, retrieving the closest rays from the transgraph and computing an interpolated pixel intensity from the retrieved rays. The
more DRR rays that are pre-computed, the longer the pre-computation will take (up to several hours or even days) and the more accurate the result that can be achieved. Another limitation of this
method is that, when the content of the 3D data is changed, the transgraph has to
be recomputed.
Cylindrical harmonics. This technique [112] transforms the 3D image into a cylindrical harmonics representation, which consists of a series of orthonormal cylindrical
harmonics basis functions and their corresponding coefficients. Then a reference projection orientation is selected, and each of the harmonics is projected along the selected orientation. This will produce a set of 2D projections (called harmonic DRRs),
and their superposition is the DRR of the 3D image in its reference orientation.
When a DRR from an arbitrary direction is requested, the harmonic DRRs are exponentially weighted by the orientation of the requested DRR (represented as the
relative angle with respect to the reference orientation), and superposed to produce the output DRR. This method can produce DRRs with good quality (the actual
quality is dependent on the number of harmonics being used as well as the 3D image
content), and the main advantage is that, once the harmonic DRRs are generated
from a chosen reference direction, new DRRs can be quickly obtained by simply
superposing the harmonic DRRs. Another advantage is that the number of harmonics used can be truncated so that DRRs can be composed even faster, with a certain trade-off in image quality.
2.5.4 DRR Generation for Multiple Objects
In all above-described methods, only a single 3D image is involved during DRR
generation. This should work fine for most medical applications. However, some
applications involve multiple moving objects (as is the case for bone fragments in
fracture treatment), where rendering of multiple 3D images or anatomical models
can be a key requirement.
DRR generation for multiple objects can be done using one of two approaches. The
first approach is to modify the existing methods developed for a single 3D image to
simultaneously handle multiple objects during DRR generation. As different DRR
methods have different complexities, the efforts required for the modifications also
vary. The advantage is that, with this approach, DRRs of multiple objects can be
produced without trade-offs in image quality. The second approach is to use an existing single object rendering method to produce object DRRs for individual objects,
and then combine the individual object DRRs. This approach is simple to implement,
and can take advantage of all existing DRR methods. However, simply combining
the individual object DRRs may introduce artefacts to the final DRR, because occlusions among individual objects can happen. In most existing DRR methods, the
rendering is performed along the viewing direction in either back-to-front or front-to-back order, and the accumulation of attenuation coefficients for different objects is
not a separable operation. The simplest way to combine the individual object DRRs
is to compute their mean. It may blur out some useful structures, but it is
the function that will minimize the overall artefacts for all possible DRR directions.
Some single object DRR implementations use order independent volume rendering
(OIVR), in which case the choice of the combination function is not important.
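As a trivial sketch of this mean-based combination (an illustration only; `drrs` is a hypothetical array holding one pre-rendered DRR per member object):

    import numpy as np

    def combine_object_drrs(drrs):
        # `drrs` has shape (M, H, W), one DRR per member object. Averaging
        # ignores depth ordering, so overlapping objects are blurred rather
        # than composited in the correct front-to-back order.
        return np.mean(drrs, axis=0)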
2.6 2D-3D Similarity Metrics
A similarity metric is computed from all the data that are involved in registration, and
is a function of the transformation parameters. It can be single-valued or multiple-valued, and can be intensity-based or feature-based. Feature-based similarity metrics
are more efficient to compute; however, accurate feature extraction is necessary and
the errors in feature extraction propagate to the similarity measure. Intensity-based
similarity metrics are more accurate in general because raw image intensities provide
more information than extracted features; however, the computation speed is usually
a bottleneck in registration applications that use intensity-based similarity metrics.
In 2D-3D registration, the most commonly used similarity metrics are intensity-based, and they are computed from X-ray images and DRRs. This section gives an
overview of such metrics.
2.6.1 Correlation-based Metrics
Correlation is a good metric for intra-modality registrations because it is invariant
to linear intensity differences between two images. That is, the metric value remains
unchanged even if the pixel intensities in one or both of the images are multiplied by
a positive constant, or are increased or decreased by a constant.
Normalized Correlation Coefficient. The simplest form of correlation-based metrics is the Normalized Correlation Coefficient (NCC) [114]. The NCC of two images is
computed by first normalizing each image to have zero mean and unit variance, then
multiplying each pixel in one image by the corresponding pixel in the other image,
and summing the products. Let A and B be the two images, and Ω be the domain or
region-of-interest within which the similarity will be calculated, then NCC is defined
as
$NCC(A, B, \Omega) = \frac{\sum_{p\in\Omega} A(p)B(p) - \frac{1}{|\Omega|}\sum_{p\in\Omega}A(p)\sum_{p\in\Omega}B(p)}{\sqrt{\sum_{p\in\Omega}A(p)^2 - \frac{1}{|\Omega|}\left(\sum_{p\in\Omega}A(p)\right)^2}\sqrt{\sum_{p\in\Omega}B(p)^2 - \frac{1}{|\Omega|}\left(\sum_{p\in\Omega}B(p)\right)^2}}$.   (2.11)
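A compact NumPy sketch of Eq. (2.11) (an illustration; it uses the algebraically equivalent mean-centred form):

    import numpy as np

    def ncc(a, b, mask=None):
        # Normalized Correlation Coefficient over an optional boolean
        # region-of-interest mask (the domain Omega of Eq. 2.11).
        if mask is not None:
            a, b = a[mask], b[mask]
        a = np.ravel(a - a.mean())
        b = np.ravel(b - b.mean())
        denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
        return np.sum(a * b) / denom if denom > 0 else 0.0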
Sum of Local Normalized Correlation. One problem associated with NCC is that
it is not resistant to non-uniform intensity distortions that appear in different areas
of the intensifiers. To address this problem, a revised version of NCC, called Sum of
Local Normalized Correlation (SLNC), was proposed [57]. The new metric computes a
local NCC in a small neighbourhood for each pair of pixels in the two images, and
then reports the mean of the local NCC values as the measure. The SLNC for a pair
of images A and B with region-of-interest Ω is defined as
$SLNC(A, B, \Omega) = \frac{1}{|\Omega|}\sum_{p\in\Omega} NCC(A, B, R(p))$,   (2.12)
where $R(p)$ is the small neighbourhood around a pixel $p$, and $NCC$ is defined in
Eq. (2.11). One good by-product of this revision is that the metric calculation can
now be optimized for multi-threaded or parallel execution, because the calculation of
a local NCC only involves a small number of neighbouring pixels.
Variance-Weighted Correlation. With SLNC, individual local NCC values are
equally weighted, which is not a reasonable behaviour in some situations. For example, if neighbourhood R(p) contains pure background intensities and neighbourhood
R(q) contains boundaries of anatomic structures, the two regions will be considered
equally important when computing the SLNC metric. However, R(q) contains more
useful information than R(p) for registration and should be assigned more weight.
To address such concerns, a revision of SLNC, named Variance-Weighted Correlation
(VWC), was proposed [56]. In VWC, local NCC values are weighted by the variances
of the corresponding regions when composing the final similarity value. One of the
two images is chosen as the control image, and is used to compute the weights. This
modification effectively concentrates attention in those regions of the control image
where the signal strengths are high. When DRR is used as the control image, VWC
is especially useful for excluding foreign objects such as DRBs and surgical tools that
appear only in X-ray images, since the DRR (as the control image) does not contain
the DRBs and tools. Mathematically, VWC is defined as
$VWC(A, B, \Omega) = \frac{\sum_{p\in\Omega} C(I, R(p))\, NCC(A, B, R(p))}{\sum_{p\in\Omega} C(I, R(p))}$,   (2.13)

$C(I, R(p)) = \frac{1}{|R(p)|}\sum_{q\in R(p)} I(q)^2 - \left(\frac{1}{|R(p)|}\sum_{q\in R(p)} I(q)\right)^2$,   (2.14)
where $I$ is the selected control image (that is, either $A$ or $B$), and $C(I, R(p))$ is the variance of the neighbourhood region of the point $p$ in the control image.
Stochastic Rank Correlation. All of the above-described metrics are designed for intra-modality registration problems, where intensities of the corresponding pixels from two images have a nearly linear relationship. To use the correlation technique in other registration problems, such as inter-modality registration and DRR-based 2D-3D registration with low quality DRRs, Stochastic Rank Correlation (SRC) was proposed [14].
This metric calculates the correlation on the intensity ranks of two images instead of
raw intensities. For each of the two images, the pixels are sorted according to the
intensity values, then the rank of an intensity value is computed as the mean index of
all pixels with the same intensity value. Let ρA (p) and ρB (p) be the intensity ranks
of the same pixel p in two corresponding images A and B, then the SRC metric is
defined as
SRC(A, B, Ω) = 1 −
6
2
p∈Ω (ρA (p) − ρB (p))
,
|Ω|(|Ω|2 − 1)
P
(2.15)
where Ω is a mask that indicates the pixels of interest. The mask is randomly generated by uniformly sampling the fixed image domain, and its use is mainly for improving the computation performance. As the similarity measure is computed from
intensity ranks instead of original intensities, this metric is known to be robust against
intensity non-linearity between the two images, and robust against outliers that only
appear in one image.
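A sketch of the SRC computation (illustrative only; the sample size and random seed are arbitrary, and scipy.stats.rankdata implements the mean-rank handling of tied intensities described above):

    import numpy as np
    from scipy.stats import rankdata

    def src(a, b, n_samples=1000, seed=0):
        # Spearman rank correlation (Eq. 2.15) on a random pixel subset.
        rng = np.random.default_rng(seed)
        idx = rng.choice(a.size, size=n_samples, replace=False)
        ra = rankdata(a.ravel()[idx])   # mean rank for tied intensities
        rb = rankdata(b.ravel()[idx])
        n = float(n_samples)
        return 1.0 - 6.0 * np.sum((ra - rb) ** 2) / (n * (n * n - 1.0))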
2.6.2 Information-theory Metrics
A group of similarity metrics are based on information theory [87]. These metrics
evaluate the amount of information that is contained in the joint intensity distribution of two images. Let X and Y be the variables representing intensities in two
correlated images, then the joint intensity distribution of the two images is defined
as the probability of $X$ and $Y$ co-occurring at the same pixel location. It
will contain the most information when the two images are aligned, and the least
information when they are completely independent.
Entropy-based metrics. One commonly used function to measure the amount
of information within a message is Shannon entropy. The similarity metric that
computes the Shannon entropy for the joint distribution of two images is called joint
entropy [86]. The computation of joint entropy is straightforward; however, if the two
images have no overlap at all, the joint entropy will still report a high response, which
is not a desired behaviour. This problem can be solved by using Mutual Information
(MI) or Normalized Mutual Information (NMI) [86], which combines the joint entropy
with the entropies of individual images. Let p(a) be the histogram of an image A, and
p(a, b) be the joint intensity distribution of two images A and B, then the entropy of
a single image, H(A), and the joint entropy of two images, H(A, B), are defined as
$H(A) = -\sum_{a\in A} p(a)\log p(a)$,   (2.16)

$H(A, B) = -\sum_{a\in A,\, b\in B} p(a, b)\log p(a, b)$,   (2.17)

and the metrics MI and NMI are defined as:

$MI(A, B) = H(A) + H(B) - H(A, B)$,   (2.18)

$NMI(A, B) = \frac{H(A) + H(B)}{H(A, B)}$.   (2.19)
The difference between MI and NMI is that MI computes the difference between the
two types of entropies, and NMI computes the ratio. It has been reported [86] that
NMI is more stable than MI when the overlapping area between the two images varies.
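For illustration, both MI and NMI can be estimated from a discretized joint histogram in a few lines of NumPy (a sketch; the bin count is an arbitrary choice and the entropies are in nats):

    import numpy as np

    def mi_nmi(a, b, bins=64):
        # Joint histogram -> joint and marginal entropies -> Eqs. (2.18)/(2.19).
        pab, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
        pab /= pab.sum()
        pa, pb = pab.sum(axis=1), pab.sum(axis=0)

        def entropy(p):
            p = p[p > 0]
            return -np.sum(p * np.log(p))

        ha, hb, hab = entropy(pa), entropy(pb), entropy(pab.ravel())
        return ha + hb - hab, (ha + hb) / hab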
Combining entropy with spatial information. Entropy-based metrics consider only
the statistical properties of the joint histogram and ignore the information of local
neighbourhoods. To incorporate useful spatial information such as edges, gradients,
and so on, a number of extensions of the entropy-based metrics have been suggested.
In [91], MI calculations were performed over blocks of pixels in the images. In [75],
PCA was performed in order to incorporate local spatial information into MI. In
[85, 49], MI was combined with gradient information to obtain new metrics such
as Asymmetric Gradient-based Mutual Information and Symmetric Gradient-based
Mutual Information.
f-function metrics. Several other metrics are based on joint distribution but
do not use entropy [87]. Such metrics include $V$-information, $I_\alpha$-information, $M_\alpha$-information, $X_\alpha$-information and $R_\alpha$-information. In those metrics, different parameter-controlled functions are used to compute the information contained within the joint distribution image, and each metric is aimed at a special type of problem.
2.6.3 Metrics using Spatial Information
This type of metrics takes into account some kind of neighbourhood information at
every pixel location [114]. This can be done by adding all pixel differences within a
certain radius or by calculating gradient images for further examination.
Pattern Intensity (PI). This metric computes the difference between two images,
and counts the amount of a special pattern contained in the difference image. When
computing the difference, one of the images is dynamically scaled (that is, each registration step has a different scaling factor) such that the difference image has the least contrast. The pattern is defined over a small neighbourhood of radius $r$ for every pixel in the difference image, and its shape is controlled by a constant $\sigma$. The metric
is defined as follows:
$PI(A, B) = \frac{1}{|\Omega|}\sum_{p\in\Omega}\sum_{q\in R(p)} \frac{\sigma^2}{\sigma^2 + (D(p) - D(q))^2}$,   (2.20)

$D = A - sB$,   (2.21)
where $D$ is the difference image, $s$ is a dynamically computed scaling factor, and $R(p)$ is a small neighbourhood of point $p$.
Gradient Correlation (GC). This metric calculates the horizontal and vertical
gradient images for each of the two images. Then, normalized correlation is calculated
for each pair of the horizontal and vertical gradient images. The final value of the
metric is computed as the average of the two calculated correlation values. Let $G_{A,x}$ and $G_{A,y}$ be the two gradient images of image $A$, and $G_{B,x}$ and $G_{B,y}$ be the two gradient images of image $B$; then the final value of the metric is computed as:

$GC(A, B) = \frac{1}{2}\left(NCC(G_{A,x}, G_{B,x}) + NCC(G_{A,y}, G_{B,y})\right)$,   (2.22)
where $NCC$ is defined in Eq. (2.11).
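A sketch of GC (illustrative; np.gradient is used here as a generic finite-difference operator, with a small local NCC helper inlined to keep the example self-contained):

    import numpy as np

    def _ncc(a, b):
        a = a - a.mean()
        b = b - b.mean()
        return np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b))

    def gradient_correlation(a, b):
        # Eq. (2.22): average NCC of the vertical and horizontal gradients.
        ga_y, ga_x = np.gradient(a)   # derivatives along rows and columns
        gb_y, gb_x = np.gradient(b)
        return 0.5 * (_ncc(ga_x, gb_x) + _ncc(ga_y, gb_y))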
Gradient Difference (GD). Similar to GC, this metric also depends on the horizontal and vertical gradient images of the two images. However, instead of computing an NCC value for each pair of gradient images, a difference image is computed for each pair, and the same pattern as in PI is applied. Let $D_x$ and $D_y$ be the two difference images between the horizontal and vertical gradient images; the metric is defined as follows:
$GD(A, B) = \sum_{p\in\Omega}\frac{\sigma_x^2}{\sigma_x^2 + D_x(p)^2} + \sum_{p\in\Omega}\frac{\sigma_y^2}{\sigma_y^2 + D_y(p)^2}$,   (2.23)

$D_x = G_{A,x} - sG_{B,x}$,   (2.24)

$D_y = G_{A,y} - sG_{B,y}$,   (2.25)
where $\sigma_x^2$ and $\sigma_y^2$ are constants that control the pattern shapes in the horizontal and vertical directions, and $s$ is a dynamically computed scaling factor.
2.7 Optimization Algorithms
Optimization is the process that searches for a value of the transformation parameters
such that the similarity metric reaches a pre-defined target. The target can be a
minimal, maximal, or constant value. Depending on how the similarity metric is
defined, the problem may have a closed-form solution or may have to be solved in
an iterative fashion. A closed-form solution is only available for a few problems,
such as point-based registration with known point correspondences [8, 18]. The vast
majority of problems are solved iteratively by starting from an initial guess of the
solution and proceeding towards the optimal solution. There are two common ways to
drive the optimization process in iterative optimization techniques. When derivative
information is not available or is unreliable, the proceeding direction at each iteration
step is learned by evaluating the metric function at sampled points in the parameter
domain. On the other hand, the proceeding directions can be obtained more efficiently
if derivative information is available. This section gives an overview of the popular
optimization techniques [89] that have been used for 2D-3D registration.
2.7.1 Techniques Not using Derivatives
Hill-Climbing. This is the simplest optimization algorithm that uses no derivatives.
In each iteration loop, the parameter value in each dimension is altered by a specific
step size in both directions and new values of the similarity metric at this position
are calculated. After having evaluated all 2N neighbours (with N being the number
of parameters), the one that improves the metric value most is chosen and set as the
base position for the next iteration. If none of the neighbours achieves a better value
than the current, either a downscaling of the step size is executed, or the algorithm
terminates, assuming that an optimal position has been found.
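A minimal sketch of the procedure just described (illustrative; the step sizes and termination threshold are arbitrary), written as a maximizer of a metric function f over an N-dimensional parameter vector:

    import numpy as np

    def hill_climb(f, x0, step=1.0, min_step=1e-3):
        x = np.asarray(x0, dtype=float)
        fx = f(x)
        while step > min_step:
            best_x, best_f = x, fx
            for i in range(x.size):           # evaluate all 2N neighbours
                for s in (+step, -step):
                    y = x.copy()
                    y[i] += s
                    fy = f(y)
                    if fy > best_f:
                        best_x, best_f = y, fy
            if best_f > fx:                   # move to the best neighbour
                x, fx = best_x, best_f
            else:                             # no improvement: shrink step
                step *= 0.5
        return x, fx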
Downhill-Simplex. Hill-Climbing requires a large number of function evaluations, and is thus not an efficient process. The Simplex algorithm [89] improves Hill-Climbing
by reducing the number of function evaluations. A simplex is the simplest geometric
shape consisting of $N+1$ corners in $N$-dimensional space. A starting simplex is defined
at the initial position; next, the metric is evaluated at all corners; then, depending on
the results of metric evaluations, the shape of the simplex is changed. This algorithm
is mostly known for its simple and elegant implementation; however, the improvement
in computation cost is still not significant.
2.7.2 Techniques using Derivatives
Gauss-Newton. When derivatives are available, Newton’s method can be used to find
a solution efficiently. At each iteration step, the proceeding direction is computed
from the first and second derivatives of the metric. Let $\mu$ be the parameters to be optimized; the Newton update can be defined as

$\mu_{k+1} = \mu_k - (\nabla^2 f(\mu_k))^{-1} \nabla f(\mu_k)$,   (2.26)
where $\nabla f(.)$ and $\nabla^2 f(.)$ are the first and second derivatives, respectively. The Newton method is known for its efficiency. However, it is not guaranteed to converge to an
optimum. When the second derivatives are not available or difficult to compute, the
Gauss-Newton algorithm can be used, where the second derivatives are approximated
with the Jacobians of the metric function.
Gradient-Descent. This algorithm is similar to Gauss-Newton, but uses the gradient instead of Jacobians. It approaches a local minimum of the metric function by
taking steps proportional to the negative of the gradient at the current position. The
Gradient-Descent method can be described as follows
$\mu_{k+1} = \mu_k - \Gamma \nabla f(\mu_k)$,   (2.27)
where $\nabla f(.)$ is the gradient, and $\Gamma$ is a diagonal scaling matrix that determines the step size at each iteration. This optimization method is guaranteed to converge to a local optimum. However, if the scaling matrix is not appropriately updated during the optimization, the convergence speed can be quite slow. To overcome this problem, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm was developed; at each step, it calculates the step size using an efficient scaling matrix and performs a more sophisticated search.
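A sketch of the update of Eq. (2.27) (illustrative; a fixed diagonal $\Gamma$ and a fixed iteration count are assumed, whereas practical implementations adapt the step size):

    import numpy as np

    def gradient_descent(grad_f, x0, scales, n_iters=200):
        # `scales` is the diagonal of Gamma, e.g. different step sizes for
        # the rotational and translational parameters.
        x = np.asarray(x0, dtype=float)
        for _ in range(n_iters):
            x = x - scales * grad_f(x)
        return x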
Levenberg-Marquardt. This algorithm automatically shifts between the Gauss-Newton and Gradient-Descent methods during execution. It is more robust than Gauss-Newton, which means that it can find a solution even if the initial guess of the solution
is far from the final solution. On the other hand, for well-behaved metric functions
and reasonable starting positions, the Levenberg-Marquardt algorithm tends to be a
bit slower than the Gauss-Newton method. The Levenberg-Marquardt method can
be described as follows
$\mu_{k+1} = \mu_k - (H(\mu_k))^{-1} \nabla f(\mu_k)$,   (2.28)

$H(\mu) = \nabla^2 f(\mu)(1 + \lambda\Delta)$,   (2.29)
where $\nabla f(.)$ and $\nabla^2 f(.)$ are the first and second derivatives, respectively, $\lambda \in [0, +\infty)$ is a user parameter that controls the compromise between the Newton method ($\lambda = 0$) and the gradient method ($\lambda \to +\infty$), $\Delta$ is a matrix of Kronecker symbols, and $H(.)$ represents a modified Hessian matrix.
2.7.3 Robust and Efficient Optimization
Optimization can be a very time-consuming process, and may frequently become trapped in local minima if the similarity metric is not well defined. A couple of techniques have been
used together with the usual optimization algorithms to address such problems.
Multi-resolution Strategy. This technique [68] starts with a fast but coarse estimation of the solution, and gradually refines the solution with more precise but slower
estimations. Often a pyramid of sampled images at different resolutions is created
from the full-resolution images. The first optimization is carried out with the coarsest
images at the pyramid top, terminating early when respective stopping criteria are
met. Then, the optimization is restarted from the resulting position using the more
precise images down the pyramid hierarchy, this time with smaller tolerance values
for termination. The optimization is repeated until the finest resolution images at
the pyramid bottom are used. When moving from one level to the next, more information is incorporated into the metric, and the metric becomes more accurate and
more specific. A key advantage of this optimization scheme is that the metric shape
at low resolutions is smoother and may contain fewer local optima. Therefore, the
use of a multi-resolution strategy not only can speed up the registration, but also is
a key means to overcome the problems of local optima.
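The scheme can be sketched as a simple driver loop (an illustration; register_at_level stands for any single-level routine, such as the hill climber sketched earlier):

    def multi_resolution_register(pyramid, register_at_level, x0):
        # `pyramid` is a list of (fixed, moving) image pairs ordered from
        # the coarsest to the finest level; each level is seeded with the
        # solution found at the previous one.
        x = x0
        for fixed, moving in pyramid:
            x = register_at_level(fixed, moving, x)
        return x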
Simulated Annealing. This is another commonly used technique for avoiding local
optima [52]. The idea is that, at each iteration step, the current solution is replaced by
a random “nearby” solution, chosen with a probability that depends on the distance
between the current and target metric values, and on a decreasing global parameter.
When the current solution is close to a local minimum, the random perturbation of the current solution increases the chance of escaping from it in the next iteration step.
Chapter 3
2D-3D Registration with Unscented Kalman Filter
3.1 Overview
In this chapter, a robust 2D-3D registration method with a wide capture range is
presented.1 The method registers preoperatively collected 3D Computed Tomography
(CT) data sets of a single bone fragment to its intra-operative fluoroscopic images.
The registration technique relies on hardware rendering of CT data on consumer-grade graphics cards to generate digitally reconstructed radiographs (DRRs) in real
time. We also employ the Unscented Kalman Filter to solve for the non-linear dynamics
governing this 2D-3D registration problem. The method is validated on phantom
models of three different anatomies, namely scaphoid, pelvis and femur. We show
that, under the same testing conditions, our proposed technique outperforms the
conventional simplex-based method in capture range and robustness while providing
comparable accuracy and computation time.
1 This work has been published in Proceedings of MICCAI: R. H. Gong, J. Stewart, and P.
Abolmaesumi, “A new method for CT to fluoroscope registration based on unscented Kalman filter”,
Medical Image Computing and Computer-Assisted Intervention (MICCAI), 9(1):891-898, 2006.
3.2 Introduction
Registration of CT to fluoroscopic images is a fundamental task in Computer-Assisted
Orthopaedic Surgery (CAOS) and Radiotherapy (CART). In the case of CAOS, registration of pre-operative CT to a set of intraoperative fluoroscopic images can be
used to create a precise link between the virtual patient (i.e. the pre-operative CT)
displayed on a screen and the physical patient in the operating room (OR) so that the
CT image can be used to guide the intervention. In the case of CART, registration
of CT to portal images allows precise configuration of treatment beams so that the
radiation is focused on tumors/lesions; thus the damage to healthy tissues remains
minimal.
The CT-to-fluoroscopy registration problem can be briefly described as finding a
geometric transform that positions the CT in the patient’s coordinate space so that
a user-defined similarity measure between the CT and a set of fluoroscopic images is
optimal. Usually the CT is captured preoperatively and used for surgical/treatment
planning, and the fluoroscopic images are captured intra-operatively and used to
update the surgical/treatment plan dynamically. A clinically usable CAOS or CART
system requires accurate, fast and robust registration between the two data sets.
One can formulate CT-to-fluoroscopy registration as a 2D-3D registration problem. A number of methods have been proposed to address this problem in the literature. More detailed reviews on this topic can be found in [5, 25, 28, 33, 53, 56, 64,
82, 107]. A commonly adopted approach is to generate intermediate simulated 2D
fluoroscopic images, called digitally reconstructed radiographs (DRRs), from the 3D
CT and compare the simulated fluoroscopic images with the real ones. Registration is
achieved when the simulated fluoroscopic images closely resemble the real ones. Multiple fluoroscopic images from different viewing angles are often used in the process to
compensate for the loss of depth information in the acquired 2D fluoroscopic images.
The majority of the previous work has focused on how to generate DRRs quickly
and realistically [9, 13, 55, 56, 94, 98] or on how to define/select a similarity measure
between DRRs and fluoroscopic images. Those methods often relied on either the
simplex or, if calculation of derivatives is possible, the gradient-descent optimization
method to search for an optimal registration. In this chapter, due to the non-linear nature of the CT-to-fluoroscopy registration problem, we propose to use the Unscented
Kalman Filter (UKF) as the optimization method. The proposed registration method
requires no calculation of derivatives, deals with multiple observations simultaneously,
estimates the variance along with the state, and uses an improved hardware-based
technique for fast DRR generation. We believe that these features could potentially
benefit the CT-to-fluoroscopy and other 2D-3D registration problems. To validate our
approach, we extensively test our method on various phantom data sets and compare
it with a conventional simplex-based approach.
The remainder of this chapter is organized as follows: Section 3.3 gives the details of our technique; Section 3.4 describes the testing scenarios and presents the
experimental results; and Section 3.5 provides a summary.
3.3 Method
Our method consists of four major components: a transform that positions the CT
in the patient’s coordinate space; a hardware-based volume rendering engine that
generates DRRs at interactive rates; a similarity measure that compares the DRRs
with the corresponding fluoroscopic images; and the UKF that recursively optimizes
the transform parameters. Figure 3.1 shows how these components interact with each
other to search for an optimal registration solution. Once the algorithm is initialized,
it runs iteratively until some pre-defined stopping criteria are met. One iteration of
the algorithm works as follows:
1. Apply the current transform to CT;
2. The transformed CT is fed to the volume rendering engine along with the fluoroscopic images’ imaging parameters, including the C-arm focal position and
the C-arm orientation;
3. For each fluoroscopic image, a corresponding DRR is generated by the graphics
hardware;
4. A set of similarity measures are computed for all (DRR, fluoroscopy) pairs;
5. The computed similarity measures are fed to the UKF along with the current
transform parameters as well as their variances;
6. The UKF updates the transform parameters and the variances.
The above process repeats until a set of optimal similarity measures is achieved or
the updates to the parameters or variances are sufficiently small. In the following
sections, we discuss each of the components in detail.
3.3.1 Transform and its Initial Value
The transform used to position the CT in OR is a simple rigid transform with six
parameters: three Euler angles for rotation and three scalars for translation. Most
Figure 3.1: The UKF-based approach for CT-to-fluoroscopy registration.
2D-3D registration methods require an initial transform that is close to the unknown
real solution to start the registration. Our method is no exception. We find an initial
guess of the transform parameters by manually selecting a few landmarks from both
CT and fluoroscopic images, e.g., three to four visible points on the bone surface,
and solving an absolute orientation problem using the singular value decomposition
(SVD) based technique [103]. If the landmarks are selected carefully, the computed
initial parameters can yield an initial mean Target Registration Error (mTRE) [110]
within 3 cm, which is often sufficient to start our registration method.
3.3.2 Hardware-based Volume Rendering Engine
We use an improved hardware-based technique, i.e. the Adaptive Slice Geometry
Texture Mapping (ASGTM) algorithm [9], for fast DRR generation. The algorithm
improves the common view-aligned 3D texture-mapping based method by adaptively
slicing the volume based on image content. First, the CT is partitioned into a set
of axis-aligned bounding boxes (AABBs) based on a user-defined transfer function
that removes the non-interesting voxels and highlights the anatomical structures of
interest. Then, the AABBs are sliced along the viewing direction, i.e. the focal axis
of the C-arm, in order from back to front. Finally, the powerful OpenGL features of
consumer-grade graphics cards, including 3D texture-mapping, multi-texturing, and
fragment program are employed to render and blend the slices into a final DRR. We
tested our implementation on an ATI Radeon X800 card with 256 MB video memory
by rendering CT volumes of size 512 × 512 × 256 into DRRs of size 473 × 473, and
have achieved speeds of 20-50 frames per second, depending on the image content
in the CT data.
3.3.3 Similarity Measure
A variety of similarity measures [56, 82, 129] have been proposed for comparing a
DRR with a fluoroscopic image. A few examples are normalized correlation, variance-weighted correlation, mutual information, pattern intensity, gradient difference,
and gradient correlation. Different measures have very different behaviours in the
parameter space, depending on the type of transform used, the initial conditions, and
the contents of original image data. We do not bias towards any particular similarity
measure. Any one of them, or a combination, can be used with our method in a plug-and-play fashion.
3.3.4 Unscented Kalman Filter
UKF [111] is a sequential least squares optimization technique employed for solving
non-linear systems. It estimates both the state and its covariance matrix, and no
calculations of Jacobian or Hessian are required. Instead, UKF assumes that the
unknown state is a Gaussian-distributed random variable (GRV) and uses a minimal
set of carefully chosen sample points along with the corresponding observations to
learn about the behaviors of a true non-linear system. The sample points, which
are generated using the Unscented Transform (UT) [111], completely capture the
true mean and variance of the GRV and, when propagating through the non-linear
system, capture the posterior mean and variance accurately up to at least the second
order Taylor series approximation. Figure 3.2 illustrates the workflow of UKF, which
contains the following three stages:
1. Calculate sigma points from the current state and variance using UT;
2. Propagate the sigma points through the known dynamic state and observation
models;
3. Compute the gain and update the state as well as its covariance matrix using
the computed gain and the known observations.
The general equations and details about the UKF can be found in [111]. In our
proposed method for CT-to-fluoroscopy registration, the state and observation models
Figure 3.2: The UKF algorithm.
have the following forms:
$x = [\theta_x, \theta_y, \theta_z, t_x, t_y, t_z]^T$,   (3.1)

$x_i = x_{i-1} + N(0, \sigma_x^2)$,   (3.2)

$y_i = SM(x_i) + N(0, \sigma_y^2)$,   (3.3)
where $x$ is the vector of transform parameters to be estimated, $SM$ is the non-linear similarity measure between the DRRs and the corresponding fluoroscopic images as a function of the transform parameters, and $\sigma_x^2$ and $\sigma_y^2$ are the variances of the process and measurement noises intrinsic to the system. As multiple fluoroscopic images are used in our method, $y$ is a multi-dimensional measurement vector.
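As a sketch of the unscented transform at the heart of the filter (the simplest symmetric sigma-point weighting is assumed here, which is not necessarily the exact variant of [111]), the sigma points for the six-parameter state of Eq. (3.1) would be generated as follows; propagating them through Eq. (3.3) then amounts to rendering DRRs at each sampled pose and evaluating the similarity measure:

    import numpy as np

    def sigma_points(x, P, kappa=0.0):
        # 2n+1 sigma points for a state x (an n-vector) with covariance P.
        n = x.size
        S = np.linalg.cholesky((n + kappa) * P)   # matrix square root
        points = [x]
        for i in range(n):
            points.append(x + S[:, i])
            points.append(x - S[:, i])
        weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
        weights[0] = kappa / (n + kappa)
        return np.array(points), weights

    x = np.zeros(6)                      # [theta_x, theta_y, theta_z, tx, ty, tz]
    P = np.diag([4.0] * 3 + [25.0] * 3)  # hypothetical initial variances
    pts, w = sigma_points(x, P)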
Table 3.1: Data specifications.

                    Size (pixels)   Resolution (mm³)
  Scaphoid CT       256² × 64       0.375² × 0.525
  Pelvis CT         256³            1.176² × 0.766
  Femur CT          256² × 128      0.625² × 1.445
  Fluoroscopy/DRR   256²            0.836²
Since the registration goal is to achieve an optimal similarity value for all (DRR, fluoroscopy) pairs, the known observations are constant in our method: each is the optimal value of the selected similarity measure. For example, in the case of normalized correlation, the value is 1.0.
3.4 Experiment, Results, and Discussion

3.4.1 Data Sets
We evaluate our method using three different phantoms: a small scaphoid bone,
a large pelvis, and a long and thin femur, all with embedded fiducial markers for
gold-standard validations. For the scaphoid and pelvis phantoms, the CT data were
registered to synthetic fluoroscopic images, i.e. DRRs. The DRRs were generated
from three orthogonal views with the CT positioned in the origin of the reference
space. For the femur phantom, the CT was registered to three real fluoroscopic
images. Table 3.1 lists the specifications of all testing images, and Figure 3.3 shows
the CT and the corresponding DRRs/fluoroscopic images for each phantom.
Figure 3.3: The CT volumes and the corresponding DRRs/fluoroscopic images. Left:
scaphoid; middle: pelvis; right: femur.
3.4.2 Experiments
A conventional simplex-based method was implemented along with our UKF-based
approach for the purpose of comparison. For each phantom data set, 100 experiments
were performed for both methods with the initial transforms generated by adding
random rotations (±12◦ ) and translations (±12 mm) to the gold-standard. As the
simplex-based method has a much smaller capture range than the UKF-based method,
100 additional experiments were performed for the simplex-based method with perturbations of the initial position in the range of ±5◦ and ±5 mm. For the CT-to-DRRs
registrations, the gold-standards were known. For the CT-to-fluoroscopy registration,
the gold-standard was computed by a fiducial registration [103] using the embedded
markers. Normalized correlation was used as the similarity measure in all experiments, and the same set of process and measurement noise assumptions was made
for all UKF experiments.
Table 3.2: Comparison of capture range (unit: mm).

  Method          Scaphoid   Pelvis   Femur
  UKF-based       12.0       20.0     11.0
  Simplex-based   5.8        9.6      8.0

3.4.3 Results
We recorded the initial misalignment error (hereafter called the initial mTRE) and final
mTREs for all experiments. Each mTRE was computed as the average difference
between the positions of some CT points mapped by the evaluated transform and the
gold-standard. We used 20 randomly selected points from the region bounding the
bones for the mTRE calculation. Figure 3.4 shows the registration results, and Figure
3.5 shows the image differences between the DRRs and the corresponding fluoroscopic
images before and after registration for one experiment of the femur data with our
proposed method.
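For reference, the mTRE computation can be sketched as follows (an illustration consistent with the description above; T_est and T_gold are 4 × 4 homogeneous transforms):

    import numpy as np

    def mtre(points, T_est, T_gold):
        # Mean distance between target points mapped by the evaluated
        # transform and by the gold-standard. `points` has shape (N, 3).
        ph = np.hstack([points, np.ones((len(points), 1))])
        diff = (ph @ T_est.T)[:, :3] - (ph @ T_gold.T)[:, :3]
        return np.linalg.norm(diff, axis=1).mean()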
We compare the capture range, accuracy and computation time of the two methods. Capture range is defined as the distance from the gold-standard in which at least
95% of registrations are successful. It is measured as an initial mTRE and reflects
the robustness of an algorithm. A registration is said to be successful if the final mTRE is
within 2 mm for the scaphoid and pelvis data and 3 mm for the femur data. Accuracy was evaluated using the statistics of the successful registrations, which include
the mean and standard deviation of the final mTREs. Since DRR generation is the
dominant operation in both methods, the computation time was measured as the average number of DRR generations required to reach successful registrations. Tables
3.2 - 3.4 show the results of capture range, accuracy and computation time for each
phantom data and each method.
Figure 3.4: Registration results: initial mTREs vs final mTREs. All units are in
mm. Left column: UKF-based; right column: Simplex-based. First row:
scaphoid; second row: pelvis; third row: femur.
Table 3.3: Comparison of accuracy (unit: mm).

  Method          Scaphoid      Pelvis        Femur
  UKF-based       0.80 ± 0.58   0.11 ± 0.10   2.42 ± 0.57
  Simplex-based   0.32 ± 0.21   0.25 ± 0.20   2.53 ± 0.56
Table 3.4: Comparison of required number of DRRs.

  Method          Scaphoid   Pelvis   Femur
  UKF-based       760        650      700
  Simplex-based   620        630      580
Figure 3.5: The image differences between the DRRs and the corresponding fluoroscopic images for one experiment of the femur data with the UKF-based
method. Top: before registration; Bottom: after registration.
3.4.4 Discussion
From Figure 3.4 and Table 3.2, it is obvious that the UKF-based approach consistently
has a much larger capture range than that of the simplex-based method. Tables 3.3
and 3.4 also indicate that the two methods have similar performance in accuracy and
computation cost, though the simplex-based method is slightly faster.
3.5 Summary
We presented a new method for registering 3D CT data sets to 2D fluoroscopic images
that uses UKF for robust optimization and a hardware-based adaptive geometric
slicing technique for fast DRR generation. The experimental results showed that
our method outperforms the conventional simplex-based method by having a larger
capture range while providing comparable accuracy and computation time. Future
work will include the extension of the proposed method to register multi-fragment
bone fractures simultaneously to a set of intra-operative fluoroscopic images, which
will be used in computer-assisted trauma surgery.
Chapter 4
2D-3D Registration with the CMA-ES Method
4.1 Overview
In this chapter, we propose a new method for 2D-3D registration and report its
experimental results.1,2 The method employs the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm to search for an optimal transformation that
aligns the 2D and 3D data. The similarity calculation is based on Digitally Reconstructed Radiographs (DRRs), which are dynamically generated from the 3D data
using a hardware-accelerated technique - Adaptive Slice Geometry Texture Mapping
(ASGTM). Three bone phantoms of different sizes and shapes were used to test our
method: a long femur, a large pelvis, and a small scaphoid. Experiments were performed to register CT to fluoroscopy and DRRs of these phantoms using the proposed
method and two other methods, i.e. our previously proposed Unscented Kalman Filter (UKF) based method (from Chapter 3) and a commonly used simplex-based
1 Preliminary results of this work have been published in Proceedings of EMBC: R. H. Gong, P. Abolmaesumi, and J. Stewart, “A robust technique for 2D-3D registration”, Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 1:1433-1436, 2006.
2 Final results of this work have been published in Proceedings of SPIE: R. H. Gong and P. Abolmaesumi, “2D-3D registration with the CMA-ES method”, SPIE Medical Imaging, pages 69181M1-69181M9, Feb. 2008.
4.2. INTRODUCTION
65
method. The experimental results showed that: 1) with slightly more computation
overhead, the proposed method was significantly more robust to local minima than
the simplex-based method; 2) while as robust as the UKF-based method in terms of
capture range, the new method was not sensitive to the initial values of its exposed
control parameters, and does not need the knowledge about the system noise within
the similarity metric; 3) the proposed method was fast and consistently achieved the
best accuracies in all compared methods.
4.2 Introduction
2D-3D registration is a fundamental task in computer assisted surgery (CAS). In such
surgeries, in order to use pre-operative CT to guide the surgical procedure during the
intervention, the CT must first be mapped to the physical patient in the operating
room, and this can be done through registering the 3D CT to a set of intra-operative
2D fluoroscopic images. Another important application is in computer assisted radiotherapy, where registration of CT to a few portal images is used to focus the treatment beams on the lesion area, thus minimizing the damage to the surrounding
healthy tissues.
The goal of 2D-3D registration is to find a spatial transformation that takes one data set (usually the 3D data set) from its local coordinate space to the coordinate space of the other data set (usually a set of 2D images), so that the two data sets are aligned in terms of some similarity metric.
A 2D-3D registration method generally involves determining three components: a
transformation that spatially correlates the two data sets, a similarity metric that
evaluates how well the two data sets are aligned under a particular transformation,
and an optimization technique that iteratively searches for an optimal solution of the
transformation.
A variety of methods have been proposed for 2D-3D registration [37, 58, 82].
Most of the methods have focused on defining an accurate and efficient similarity
metric, and have relied on simple search algorithms, such as simplex and gradient-descent, to find the final solution. Early methods [58] used geometric features (e.g., edges and surfaces) to define the similarity metric in order to obtain acceptable computation speed. The main drawback of those methods is the need for accurate feature extraction, where errors in segmentation propagate through the registration process. Due to the fast increase in computation power in recent years, most
current methods [37, 82] compute the similarity directly from image intensities to
achieve better robustness and accuracy. This group of methods dynamically generates the simulated 2D data, called Digitally Reconstructed Radiographs (DRRs),
from the 3D data and computes the similarity from a set of 2D image pairs. A variety
of functions, including normalized correlation coefficients (NCC), variance-weighted
correlation (VWC), gradient correlation (GC), gradient difference (GD), pattern intensity (PI) and mutual information (MI), have been used to define the similarity
between a 2D image and its corresponding DRR. To achieve interactive computation
performance, hardware-accelerated techniques [9] are usually employed to speed up
the DRR-generation process. Finally, some recent work reconstructs 3D data from
the 2D data to take advantage of the variety of existing 3D registration techniques.
However, one limitation of this type of method is that it needs a large number of
2D images or the statistical information about the studied object for accurate 3D
reconstruction.
While simple optimization techniques are known for their ease of use, they are
sensitive to local minima. When used in 2D-3D registration, they work well only
if a good initial guess of the solution can be found. The main reason is that, due
to different dimensionalities and modalities involved in this registration problem,
the 2D-3D similarity metrics are usually highly nonlinear and have a rugged search
landscape. In real applications, finding such an initial alignment usually involves using a user interface, which is time-consuming, or using known geometric objects
in the field. To develop a more robust approach, in our previous work an Unscented
Kalman Filter (UKF) based method was proposed and the UKF was used as the
optimization strategy [37, 38]. The method demonstrated significant improvement in
capture range compared to a commonly used simplex-based method. It also provided
a possibility to estimate the registration errors using a closed-form solution [74] after
the registration was finalized. However, the method is only suitable for situations where knowledge about the system noise can be easily obtained and the similarity metric has a known target value.
In this work, we propose a fast and more general method that uses the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) technique [42] as the optimization strategy to achieve high robustness and better usability. In Section 4.3, we provide the details of the algorithm. In Section 4.4, we validate the proposed method and compare it with two prior methods: our previous UKF-based method and a simplex-based method. Finally, a summary is provided in Section 4.5.
4.3 Method

4.3.1 Algorithm Overview
Without loss of generality, in the subsequent sections we assume orthopaedic surgery
as the common application of CT to fluoroscopy registration in CAS. In this case,
the 3D data is the pre-operative CT, quantized in the coordinate space of the CT
machine, and the 2D data is a series of intra-operative fluoroscopic images, captured
from different orientations in the fluoroscopy coordinate space. Fig. 4.1 shows the
overall method as well as the interactions between its components. The inputs are the
two data sets being registered and an initial guess for the registration transformation.
Then the optimizer, CMA-ES, iteratively refines the transformation according to the
similarity between the 2D data and the dynamically generated DRRs. We briefly
describe the transformation and similarity metric in the paragraphs that follow. The
details about searching for an optimal transformation with CMA-ES will be given in
Section 4.3.2.
The transformation takes the 3D data from its local coordinate space to the 2D
data’s coordinate space. Depending on the application, the transformation can be of
any type including rigid, similarity, affine, non-rigid, or a combination. In the context
of CT to fluoroscopy registration, this is usually a 3D rigid transformation consisting
of rotational and translational components. The translation has a fixed form with
three parameters, while the rotation can have various representations with different numbers of parameters, such as Euler angles and versors with three parameters, unit quaternions and angle-axis with four parameters, and so on. Our method is not tied to a particular representation. For compactness and intuitiveness, the Euler-angle representation was selected in this work.
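To make the parameterization concrete, the following is a minimal Python sketch that builds a 4 × 4 rigid transform from the six parameters; the Z-Y-X composition order is an assumption for illustration, since the text does not fix a convention.

```python
import numpy as np

def rigid_transform(theta_x, theta_y, theta_z, tx, ty, tz):
    """Build a 4x4 rigid transform from Euler angles (radians) and a
    translation (mm). The Z-Y-X composition order is an assumption."""
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    cy, sy = np.cos(theta_y), np.sin(theta_y)
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx      # rotation about the volume origin
    T[:3, 3] = (tx, ty, tz)       # translation in mm
    return T
```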
Figure 4.1: CMA-ES based 2D-3D registration method.
We have adopted the intensity-based approach in the proposed method to take
advantage of its robustness and high accuracy. For each fluoroscopic image, one DRR
is generated from the CT using the current transformation and the fluoroscopic image
settings, then a similarity is computed between each fluoroscopic image and DRR pair.
The final similarity between the 2D and 3D data is formulated as a linear combination
of the similarities between each pair. All current similarity metrics (NCC, VWC, GC,
GD, MI, PI) can be used with our method, and the selection usually depends on the
image quality or content. Because DRR generation is the dominant operation during
the registration process, a hardware-accelerated technique, named Adaptive Slice
Geometry Texture Mapping (ASGTM) [9], is used to accelerate the task. ASGTM is
an improvement of the commonly used 3D texture mapping technique: it excludes the irrelevant voxels of the 3D data from rendering by generating, in a preprocessing step, a set of Axis-Aligned Bounding Boxes (AABBs), which further accelerates the DRR generation process.
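To illustrate the per-pair formulation, the sketch below combines NCC values over the fluoroscopic image/DRR pairs; equal weighting and the choice of NCC are assumptions, as any of the metrics listed above could be substituted.

```python
import numpy as np

def ncc(a, b):
    """Normalized correlation coefficient between two images."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

def overall_similarity(fluoros, drrs):
    """Linear combination (here: equal-weight average) of the per-pair
    similarities, one DRR per fluoroscopic image."""
    return sum(ncc(f, d) for f, d in zip(fluoros, drrs)) / len(fluoros)
```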
4.3.2 Optimization with CMA-ES
The main contribution of this work is to use the CMA-ES optimization technique
in 2D-3D registration for improved robustness and better usability. CMA-ES [42]
is a sampling-based search algorithm known for robust and efficient operation in a
rugged search landscape. The method requires no calculation of derivatives; instead
the learning is done through taking random samples around the current solution
according to a multivariate normal distribution. In each iteration of the optimization
process, the solution is refined by sampling, selection and recombination, and the
search distribution is adaptively deformed according to both new information from
the selected samples and the information from previous steps. Fig. 4.2 shows the key
steps of the CMA-ES algorithm.
In Fig. 4.2, the initial guess of the solution and search distribution are provided
by the user, which are the initial position of the 3D data and its uncertainty. The
search distribution is represented using a covariance matrix C, which determines the
distribution shape, and a scalar s, which determines the distribution size (a scaling
factor that is applied to the variances of the distribution). Initially, only s is specified
and the search distribution has a spherical shape with an isotropic standard deviation
in all directions. Each iteration consists of three key steps. First, a population
of parameter samples is drawn according to the current search distribution. The population size, λ, is determined by the dimension of the solution parameters, n, and
Figure 4.2: The CMA-ES algorithm.
a common choice is λ = 4 + ⌊3 ln n⌋. Next, the samples are evaluated and sorted according to the metric values, and the first µ samples are selected and recombined. The value of µ is user selectable, with the default being µ = ⌊λ/2⌋. The recombination refines the solution, that is, the mean of the search distribution, by computing the weighted mean of the selected samples with the coefficients {wi}, i = 1, ..., µ, within which better-valued samples are assigned larger weights. Finally, the covariance matrix of the search distribution is updated. This is the core part of the CMA-ES algorithm, and the update is based on three sources: the search distribution of
the previous iteration, the accumulated evolution path of the solution from the first
iteration to the current iteration, and the distribution of the selected samples at the
current iteration. Each source is assigned a weight and the assignment is controlled by
the parameters cc and cσ , which are computed from the recombination weights {wi }.
Except for the initial guess, all control parameters can be automatically determined, and appropriate default values have been suggested [42]. The stopping criteria are user-defined.
The commonly used ones are the maximum number of iterations, the tolerance of
function update (with respect to similarity value), and the tolerance of parameter
update (with respect to the solution parameters). In summary, our CMA-ES based
2D-3D registration method works as follows:
Inputs: 2D data, 3D data, initial transformation parameters T0, and initial search distribution size σ0;
Output: final transformation parameters T.

1. Initialize the search distribution N(T, σ²I) (where I is the identity matrix) with T = T0 and σ = σ0, and the evolution path p to be null; compute the population size λ, the selection size µ, the recombination weights {wi}, i = 1, ..., µ, and the parameters cc and cσ.

2. Until the stopping criteria are met, do the following:

   (a) Draw a population of samples {Ti}, i = 1, ..., λ, according to the distribution N;
   (b) For each sample Ti, transform the 3D data, generate DRRs, and compute the similarity measure;
   (c) Select the µ best samples according to the similarity values;
   (d) Update T by recombining the selected samples with the weights {wi};
   (e) Update N by linearly combining the following three components with the parameters cc and cσ: the previous N, the covariance of the µ selected samples, and the covariance of p;
   (f) Update p (see [42] for more details).
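The following is a much-simplified numerical sketch of this loop, not the full algorithm of [42]: it implements the sampling, selection, and weighted recombination of steps (a)-(d) with a rank-µ style covariance update, but omits the evolution path p, the cc/cσ weighting, and step-size adaptation. The `similarity` callback (which would transform the 3D data and generate DRRs internally, returning a cost to minimize, e.g. a negated similarity) and the fixed 0.8/0.2 mixing weights are assumptions for illustration.

```python
import numpy as np

def cma_es_register(similarity, T0, sigma0, max_iter=200, tol=1e-6):
    """Simplified CMA-ES-style search over transformation parameters."""
    n = len(T0)
    lam = 4 + int(3 * np.log(n))                 # population size, 4 + floor(3 ln n)
    mu = lam // 2                                # selection size
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()                                 # recombination weights, best first
    T, sigma = np.asarray(T0, dtype=float), float(sigma0)
    C = np.eye(n)                                # initially spherical distribution
    for _ in range(max_iter):
        L = np.linalg.cholesky(C)
        samples = T + sigma * (np.random.randn(lam, n) @ L.T)  # (a) draw population
        costs = np.array([similarity(s) for s in samples])     # (b) evaluate samples
        best = samples[np.argsort(costs)[:mu]]                 # (c) select mu best
        T_new = w @ best                                       # (d) weighted recombination
        Y = (best - T) / sigma
        C = 0.8 * C + 0.2 * (Y.T * w) @ Y        # (e) rank-mu update; path terms omitted
        if np.linalg.norm(T_new - T) < tol:      # parameter-update tolerance
            return T_new
        T = T_new
    return T
```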
4.4
Experiments, Results and Discussion
We used three bone phantoms of different sizes and shapes, including a long femur,
a large pelvis, and a small scaphoid, to evaluate the proposed method. Four pairs of
2D and 3D data were acquired or synthesized from the phantoms. The 3D data were
CTs, captured using a GE LightSpeed Plus machine. The 2D data were simulated
fluoroscopic images and real fluoroscopic data. The simulated fluoroscopic images
were generated from CTs along coordinate axes using the ASGTM technique. The
real fluoroscopic images were acquired using an OEC-9800 fluoroscopy device. Table
4.1 lists the specifications of the data used in this study.
Three types of experiments were conducted. First, registration of CT to simulated
fluoroscopic images was performed for each phantom using the proposed method and
two other methods: our previous UKF-based method and a commonly used simplex-based method. Next, registration of CT to real fluoroscopic images was performed for
the pelvis phantom. Finally, additional experiments for studying the impact of the
initial search distribution of the CMA-ES algorithm were conducted. All experiments
were done on a Dell OptiPlex GX270 computer equipped with 2 GB RAM and an
ATI Radeon X800 (256 MB video RAM) graphics card.
Table 4.1: Data specifications.
Mean Target Registration Error (mTRE), capture range, accuracy and computation time were used to evaluate each method. mTRE was used to measure the initial
and final misalignments, and was calculated using the segmented surface points of
the corresponding bone in CT. Capture range measures the robustness of a method
under a collection of experiments. It was chosen as the range of initial mTRE within which 95% of registrations would succeed. A registration was defined as successful if the final mTRE was ≤ 2 mm for experiments using simulated fluoroscopic images, and ≤ 4 mm for real fluoroscopic images. The accuracy was measured using the mean and standard deviation of the final mTREs of the successful registrations. The computation time was measured as the mean and standard deviation of the time required to achieve a successful registration.
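Since this definition of capture range leaves some room for interpretation, the sketch below shows one plausible reading: the largest initial mTRE such that at least 95% of the registrations started within that bound succeed.

```python
import numpy as np

def capture_range(initial_mtre, final_mtre, success_mm=2.0):
    """Largest initial mTRE X for which >= 95% of registrations with
    initial mTRE <= X reach final mTRE <= success_mm."""
    init = np.asarray(initial_mtre, dtype=float)
    fin = np.asarray(final_mtre, dtype=float)
    order = np.argsort(init)
    ok = fin[order] <= success_mm
    rate = np.cumsum(ok) / np.arange(1, ok.size + 1)   # success rate within X
    good = np.nonzero(rate >= 0.95)[0]
    return float(init[order][good[-1]]) if good.size else 0.0
```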
4.4.1 Registration of CT to Simulated Fluoroscopy
For each phantom, three simulated fluoroscopic images were generated from CT.
The CT data was placed at the origin, and the simulated fluoroscopic images were
generated along the coordinate axes with a focal length of 920 mm and the origin at half the focal length. The gold standards were thus identity transformations with all parameters equal to zero. One hundred experiments with random initial CT positions
were conducted for each phantom and each method. The initial CT positions were
obtained by applying small perturbations to the gold standard. Table 4.2 lists the
magnitude of perturbations for each phantom. NCC was used as the similarity metric.
Table 4.3 shows the initial and final mTREs. Table 4.4 shows the capture ranges,
accuracies, and computation time.
From Tables 4.3 and 4.4, we have the following observations: 1) the proposed
Table 4.2: Perturbations used to generate random initial CT positions for CT to
simulated fluoroscopy registrations. The perturbations were made around
the six components (3 rotations, 3 translations) of the gold standard.
                               Femur   Pelvis   Scaphoid
Rotational Components (°)      ±20     ±50      ±30
Translational Components (mm)  ±20     ±20      ±10
Table 4.3: Experimental results of CT to simulated fluoroscopy registrations: initial
mTREs vs final mTREs (unit: mm).
Table 4.4: Experimental results of CT to simulated fluoroscopy registrations: capture range (unit: mm), accuracy (unit: mm), and computation time (unit: s).

Phantom    Measure            CMA-based     UKF-based     Simplex-based
Scaphoid   Capture Range      > 40          > 40          9
           Accuracy           0.26 ± 0.48   0.87 ± 0.54   1.24 ± 0.99
           Computation Time   147 ± 53      187 ± 16      95 ± 27
Pelvis     Capture Range      80            72            50
           Accuracy           0.07 ± 0.07   0.40 ± 0.10   0.30 ± 0.47
           Computation Time   94 ± 26       154 ± 38      86 ± 22
Femur      Capture Range      > 100         > 100         10
           Accuracy           0.42 ± 0.63   1.87 ± 0.63   1.31 ± 0.82
           Computation Time   99 ± 27       124 ± 14      65 ± 14
method achieved capture ranges similar to or better than those of the UKF-based method
for all testing phantoms, and both methods were significantly more robust than the
simplex-based method in terms of capture range; 2) the CMA-ES based method
consistently achieved the best accuracy; 3) the CMA-ES based method took slightly
longer than the simplex-based method to converge, but on average the difference
was within one minute for a single registration.
4.4.2 Registration of CT to Real Fluoroscopy
To examine the method’s performance in a simulated surgical environment, in this
test the real fluoroscopic images of the pelvis phantom were used. Three fluoroscopic
images were acquired from arbitrary viewing directions that were about 45 degrees apart from each other, and the pose information was reported by the tracking camera.
Four embedded fiducials, visible in CT and tracked during fluoroscopy acquisition,
were used to obtain the gold standard. Similar to the CT to simulated fluoroscopy
registration experiments, 100 experiments with random initial CT positions around
the gold standard were performed for each of the three methods. The perturbations were ±15° for rotational components and ±20 mm for translational components. VWC was used as the similarity metric. The results are shown in Tables 4.5 and 4.6.

Table 4.5: Experimental results of CT to real fluoroscopy registrations for pelvis: initial mTREs vs final mTREs (unit: mm).

Table 4.6: Experimental results of CT to real fluoroscopy registrations for pelvis: capture range (unit: mm), accuracy (unit: mm), and computation time (unit: s).

                Capture Range   Accuracy      Computation Time
CMA-based       22              3.19 ± 0.44   114 ± 40
UKF-based       10              2.56 ± 0.74   156 ± 40
Simplex-based   9               3.24 ± 0.65   90 ± 8
The CMA-ES based method clearly achieved the largest capture range. However, the UKF-based method achieved the best accuracy but did not show much
improvement in capture range. This can be explained by one important property of
the UKF-based method: it requires a good understanding about the error sources
in the system to work robustly and efficiently. In CT to fluoroscopy registration, a
variety of sources (CT acquisition, fluoroscopy acquisition, DRR generation, outliers
in fluoroscopic images, and so on) would cause the generated DRRs to not exactly
match the corresponding fluoroscopic images. Finding the statistics about the combined errors is usually not an easy task. In this study, they were determined by trial
Figure 4.3: Experimental results for different initial distributions of the CMA-ES
based method: initial mTREs vs final mTREs (unit: mm). The pelvis
phantom was used as the testing data.
and error, and may not have been optimally chosen. In our previous work [37], it was
demonstrated that the UKF-based method was able to achieve significant improvements in capture range and computation time if such knowledge could be accurately obtained.
4.4.3 The Impact of Initial Search Size
While values for most control parameters of the CMA-ES algorithm have been suggested previously
[42], the user has to provide an initial search distribution in the form of σ. The
parameter indicates the uncertainty about the user-supplied initial transformation,
and the value of 1.0 has been recommended. Here, we analyze the sensitivity of
the algorithm to this parameter. The CT and simulated fluoroscopy of the pelvis
phantom and three different values of σ, i.e. 0.5, 1.0, and 2.0, were used in this
testing. Similar to the previous cases, 100 experiments were performed for each value
of σ. The results are shown in Figure 4.3 and Table 4.7.
These experiments demonstrate that the recommended value of 1.0 gives the best
balance between capture range, accuracy and computation time; however, the differences caused by the parameter were not significant. This observation was anticipated
because the CMA-ES algorithm is able to adaptively change its search distribution according to the local search landscape.

Table 4.7: Experimental results for different initial distributions of the CMA-ES based method: capture range (unit: mm), accuracy (unit: mm) and computation time (unit: s). The pelvis phantom was used as the testing data.

           Capture Range   Accuracy      Computation Time
σ = 0.5    75              0.12 ± 0.27   119 ± 33
σ = 1.0    80              0.07 ± 0.07   94 ± 26
σ = 2.0    78              0.10 ± 0.28   115 ± 27
4.5 Summary
In this chapter, we presented a new 2D-3D registration method that takes advantage
of the CMA-ES searching algorithm to achieve improved robustness, fast computation
speed and better usability. From the experimental results, we have the following
conclusions:
1. The proposed method is able to achieve highly accurate results and is significantly more robust with respect to local minima than the simplex-based method.
It is a fast method as most registrations can be finished in 1-2 minutes;
2. The UKF-based method is able to achieve the same capture range as the CMA-ES based method if a good understanding of the system errors can be obtained. However, the CMA-ES based method is a more general solution because
most of its control parameters can be automatically determined and it is not
sensitive to the only exposed parameter.
Chapter 5
Multiple-Object 2D-3D Registration
5.1 Overview
This chapter presents a multiple-object 2D-3D registration technique for non-invasively
identifying the poses of fracture fragments in the space of a preoperative treatment
plan.1 The treatment plan is generated from tessellation of computed tomography images of fracture fragments. The registration technique recursively updates the treatment plan and matches its digitally reconstructed radiographs (DRRs) to a small
number of intraoperative fluoroscopic images. The proposed approach combines an
image similarity metric that integrates edge information with mutual information,
and a global-local optimization scheme, to deal with challenges associated with the
registration of multiple small fragments and limited imaging orientations in the operating room. The method is easy to use, as minimal user interaction is required.
Experiments on simulated fractures and two distal radius fracture phantoms demonstrate clinically acceptable target registration errors with a capture range as large as 10 mm.

1. This work has been published in IEEE Transactions on Biomedical Engineering: R. H. Gong, J. Stewart, and P. Abolmaesumi, “Multiple-Object 2D-3D Registration for Non-invasive Pose Identification of Fracture Fragments”, IEEE Transactions on Biomedical Engineering, volume 99, Jan. 2011.

5.2 Introduction
The emergence of computer-assisted surgery (CAS) enables the use of a preoperative
treatment plan to guide surgical operations. It is nowadays the preferred choice of
many surgeons because of demonstrated advantages such as low incidence of surgery-induced infection, short healing time, and high union rates [117]. A fundamental task
in such procedures is to accurately and responsively establish a spatial correspondence
between the preoperative treatment plan and the patient in the Operating Room (OR).
In the context of fracture treatment, the task becomes to identify the poses of the
fracture fragments in the space of the treatment plan in order to obtain knowledge
about spatial deviations between the actual and planned positions of the fragments.
Conventionally, optical tracking is used to identify the pose of each fracture fragment.
This pose identification technology is accurate, fast and reliable. However, it requires
line-of-sight between the camera and a number of reference bodies that are mounted
on the fracture fragments. Such an approach has several drawbacks: First, the size
and weight of the reference bodies may limit their application in some types of fractures that contain multiple, small fragments; second, the line-of-sight constraint limits
the surgeon’s flexibility in the OR; third, mounting of reference bodies on bones is
invasive, which may lead to longer recovery time.
2D-3D registration is an alternative pose identification technique that overcomes
the limitations of optical tracking. To identify the poses, a treatment plan generated
from preoperative 3D data, such as a Computed Tomography (CT) of the trauma region, is mapped to intraoperative fluoroscopic images through an image registration
process. This process is less responsive and less accurate than optical tracking; however, it provides a non-invasive, suboptimal solution for cases where optical tracking
is impossible or costly. In the case of orthopaedic surgery, the registration process is
usually performed by maximizing a similarity metric between simulated fluoroscopic
images generated from the 3D CT data, called digitally reconstructed radiographs
(DRRs), and the actual fluoroscopic images. A number of DRR-based 2D-3D registration techniques have been suggested (see e.g., [53, 64, 82, 84, 93, 127]). However,
none of these techniques has been reported for the pose identification problem in
fracture treatment, primarily due to the following challenges:
• Involvement of multiple moving objects. Most current 2D-3D techniques handle
only one large or long bone, such as the pelvis or femur. In fracture treatment, multiple, and possibly small, bone fragments are involved, which not only increases the computation complexity, but also increases the likelihood of occlusion in fluoroscopic
images.
• Limitations from the constrained OR environment. The imaging orientations
are very limited due to the collision of the fluoroscopy imaging device with
the OR table. So, the fluoroscopic imaging views that are optimal for
registration of multiple fracture fragments may not be available.
Previously, the multiple-object 2D-3D registration problem has been explored in
a few studies. In [22], phase-based mutual information was used as the similarity
metric for registering femur and tibia to fluoroscopic images of knee joint. Each bone
was registered to the joint images separately as a conventional 2D-3D registration,
and the phase information was used to reduce the impact of the outliers, i.e. the other
bone shown in the joint images. In addition, a very good initialization (±3◦ and ±3
pixels) was required. In [67], multiple bones were registered to two fluoroscopic images (one AP view and one lateral view). Correlation of edge information was used to
deal with overlaps between bones, and an optimization algorithm that takes advantage of the orthogonal fluoroscopic images was used. Rough user segmentations from
fluoroscopic images were needed, and only preliminary results with synthetic fluoroscopic images were reported. In our previous work [39], multiple fracture fragments
of unknown shapes were registered to a set of fluoroscopic images. The registration
determined not only the poses of the fragments, but also their true shapes by simultaneously deforming a bone atlas through automatic planning (i.e. planning and
registration were combined into a single procedure). Mutual information was used as
the cost function, and a small amount of user interaction was necessary to remove the impact of outliers on fluoroscopic images. However, validation was only performed with a simple synthetic fracture. In addition to intensity-based registration
techniques above, feature-based registration has been proposed [31]. However, this
approach depends on accurate segmentation of the fracture within the image, and
is specifically designed for tubular-shaped bone structures such as the femur. Hybrid
pose identification methods have also been proposed where a combination of optical
tracking and 2D-3D registration is used [78].
In this chapter, we describe a new multiple-object 2D-3D registration technique,
aiming at solving the pose identification problem in fracture treatment. A similarity
metric that integrates edge information with mutual information is used in order to
obtain a more accurate and smoother cost function in a very noisy image environment.
A key to the success of the approach is the use of a global-local alternating optimization scheme that is based on the Covariance Matrix Adaptation Evolution Strategy
(CMA-ES) algorithm [42], which can handle rugged objective functions. Treatment
planning is done in a separate step in order to achieve reliable results. We evaluate
the proposed technique with synthetic fractures as well as actual fracture phantoms.
The rest of this chapter is organized as follows: Section 5.3 presents the details of
our method. The experimental results are reported in Section 5.4, and discussed in
Section 5.5. Finally, Section 5.6 provides a summary.
5.3 Methods
The main components of our multiple-object 2D-3D registration algorithm are illustrated in Fig. 5.1. The three inputs are: the preoperative treatment plan as the
moving data (Section 5.3.1); a set of intraoperative fluoroscopic images as the fixed
data (Section 5.3.2); and an initial transform on the treatment plan that roughly
aligns the treatment plan with the fluoroscopic images (Section 5.3.3). To find the
poses of the intraoperative fragments in the space of the treatment plan, the transform is recursively refined until the generated DRRs of the transformed treatment
plan (Section 5.3.5) match the corresponding fluoroscopic images in terms of the similarity metric we have defined (Section 5.3.4). The process is steered by a global-local
alternating optimization scheme (Section 5.3.6).
5.3.1 Preoperative Treatment Plan
The goal of the preoperative treatment plan is the surgeon’s ideal shape of the bone
after the treatment. The plan is made by manipulating and aligning computer models
Figure 5.1: The main components and flowchart of our multiple-object 2D-3D registration algorithm. The poses of the intraoperative fragments in the space of the treatment plan are computed by registering the treatment plan to a set of intraoperative fluoroscopic images.
of individual fracture fragments that are segmented from a diagnostic CT.
We chose to use intensity models to represent the fragments. We used a semi-automatic, active contour-based technique [124] to segment the fragments, then manually corrected the boundaries where large segmentation errors occurred.
The goal of planning is to find a set of rigid-body transforms, one per fragment,
that transform the fragment models from their local coordinate frames in the diagnostic CT to the coordinate frame of the treatment plan such that an ideal bone shape
is obtained.
We denote the transforms as {^plan T_model(i)}, i = 1, ..., N, where N is the number
of fragments. A number of planning methods can be used to obtain such transforms.
In the simplest method, the fragment models are interactively manipulated on a
computer screen, and the final shape of the bone is determined according to the user’s
expertise [11, 12, 35, 47]. For automatic planning, a template bone model is used as
a reference, and the fragments are concurrently registered to the template using 3D
registration techniques. The commonly used template is the reflected contra-lateral
bone [77, 80], or a statistical shape model of the bone [20, 23, 39, 92]. Given the focus of our work, the accuracy of surgical planning and the
approach taken to generate the plan are irrelevant to the accuracy of the proposed
registration method. However, we use the transforms obtained during planning as
the ground truth for validation of 2D-3D registration.
5.3.2 Tracked Intraoperative Fluoroscopic Images
In this study, fluoroscopic images are captured using a GE OEC 9800 C-arm that is
commonly available in ORs. We assume that the pose of the C-arm is tracked, hence,
the relative positions of the fluoroscopic images are known. We also assume that the
images are calibrated and distortion free, and that the imaging parameters such as
the distance of the X-ray source to the OR table are known. These assumptions are
similar to the ones made in prior work and can be satisfied by tracking the position
of the C-arm relative to the OR table using optical, or magnetic, tracking techniques,
or by using specialized fiducials in the image.
5.3.3 Transforms
The registration process estimates two types of transforms: one global transform, ^OR T_plan, that is applied to the entire treatment plan, and a set of local transforms, {^plan T_model′(i)}, i = 1, ..., N, that are applied to the individual fragment models. Note that the models have been placed at their planned positions after planning, and we use model′ to distinguish them from their original positions in the diagnostic CT. After registration, the pose of a fragment i in the space of the OR is formed as

T_i = ^OR T_plan · ^plan T_model′(i),  i = 1, ..., N.   (5.1)
The global transform maps the treatment plan to the patient in the OR, and
the local transforms reposition the models within the treatment plan so that their
final positions match the intraoperative positions of the corresponding fragments. For
a given fragment, if the registration is accurate, the local transform represents the
spatial deviation between its true pose in the OR and the planned pose in the OR
according to the treatment plan, which is the information that surgeons are most
interested to know.
Each of the global and local transforms is represented using a rigid transform with
six parameters p_i = (θx, θy, θz, tx, ty, tz)_i, i = 0, 1, ..., N (i = 0 for the global transform), where the first three parameters determine the rotation in Euler angles.2 Here, rotation is with respect to the geometric center of the treatment plan (for i = 0) or the
corresponding fragment (for i > 0). In the case of N fragments, there are (1 + N) × 6 parameters to be determined.

2. We found Euler angles to work well (taking into account the singularities) for our optimization scheme. Other representations, such as quaternions, may also be used.
5.3.4 Similarity Metric
A similarity metric is defined over the fluoroscopic images and the corresponding
DRRs of the transformed treatment plan. Our design goal is to take advantage of
the wide capture range of mutual information (MI) while making use of the edge
information for better robustness against noise and outliers. An approach similar to
Munbodh et al. [76] is adopted, where images (both fluoroscopic images and DRRs)
are first processed with an edge-enhancement technique, and then a similarity measure
is computed from the modified images. We use the gradient magnitude to derive
the edge information, and the infinite impulse response (IIR) filter is used for fast
computation. The edge thickness is controlled by the standard deviation, σ, of a
smoothing Gaussian kernel. For a pixel location (x, y), the edge-likelihood factor is
computed as
w(x, y) = max[0, e^((G(x,y) − A)/B) − e^(1/B)],   (5.2)
where G(x, y) is the gradient magnitude, A is a constant that affects the contrast of
the resulting image, and B ∈ [0, 1) is a threshold that reduces the shadow generated
by the soft tissues surrounding the trauma region.
The computed factor is a non-negative real number: if it is smaller than one, the pixel will
be suppressed; if it is greater than one, the pixel will be enhanced. The values of σ, A
and B are determined through empirical testing, and we have found that the values
2, 15 and 0.2, respectively, are suitable for most of our experiments. Once the edge
likelihood factor is computed, the original pixel is modified by weighting the pixel
with the factor
I′(x, y) = I(x, y) w(x, y).   (5.3)
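A sketch of Eqs. (5.2)-(5.3) follows; a Gaussian gradient magnitude stands in for the IIR filter, and the exponent is clipped to avoid numerical overflow (both are implementation assumptions).

```python
import numpy as np
from scipy.ndimage import gaussian_gradient_magnitude

def edge_enhance(image, sigma=2.0, A=15.0, B=0.2):
    """Weight each pixel by the edge-likelihood factor of Eq. (5.2)."""
    G = gaussian_gradient_magnitude(image.astype(float), sigma)
    expo = np.clip((G - A) / B, -50.0, 50.0)             # clipped for stability
    w = np.maximum(0.0, np.exp(expo) - np.exp(1.0 / B))  # Eq. (5.2)
    return image * w                                     # Eq. (5.3)
```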
After performing the edge-enhancement process, Mattes Mutual Information (MMI)
[72] is computed for every pair of fluoroscopic image and DRR. MMI is an implementation of mutual information that computes the measure from a small set of pixel
samples and estimates the image histograms with the Parzen windowing technique.
We have slightly modified the original MMI algorithm by adding pixel samples from
the edges detected during edge-enhancement. The gradient magnitude of the fluoroscopic image is thresholded to keep only structures of strong gradient magnitudes,
using the threshold value
t = Gmax − α(Gmax − Gmean),   (5.4)
where Gmax and Gmean are the maximum and mean values of the gradient magnitude,
and α determines how much edge information to keep. We have chosen a fixed value
of 0.3 for α and found that it was appropriate in most cases. When the images contain extremely fine or coarse details, it is helpful to tune this parameter so that a suitable amount of edge information is used. In total, 5% of uniformly sampled pixels plus
the edge samples generated by Eq. (5.4) are used to compute the MMI for each pair
of images, and the overall metric is formed as
E(p; fluoros, plan) = (1/M) Σ_{j=1..M} MMI_j(p; fluoro_j, DRR_j),   p = {p0, p1, ..., pN},   (5.5)
where M is the number of fluoroscopic images used for registration. Note that in the above function we did not apply any constraints, such as collision avoidance, to the transformation parameters, so the fragments can overlap each other during the registration, which not only increases the failure rate of the registration but also slows convergence. This decision was made mainly because an efficient algorithm for collision detection among multiple 3D objects of irregular shapes is currently lacking.
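For concreteness, the sketch below selects edge samples per Eq. (5.4) and averages the per-pair measures per Eq. (5.5); the `mmi` callback stands in for a Mattes MI implementation (e.g., the one in ITK) and is an assumption.

```python
import numpy as np

def edge_sample_locations(fluoro_grad, alpha=0.3):
    """Pixel locations kept by the threshold of Eq. (5.4)."""
    g_max, g_mean = fluoro_grad.max(), fluoro_grad.mean()
    t = g_max - alpha * (g_max - g_mean)
    return np.argwhere(fluoro_grad >= t)

def overall_metric(fluoros, drrs, mmi):
    """Eq. (5.5): average Mattes MI over all fluoro/DRR pairs."""
    return sum(mmi(f, d) for f, d in zip(fluoros, drrs)) / len(fluoros)
```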
5.3.5 DRR Computation
As a large number of DRRs are required for registration, a 3D texture-mapping technique that employs modern Graphics Processing Units (GPUs) is used to accelerate
the production of DRRs [9]. To generate a DRR of an X-ray view, the transformed
treatment plan is sliced along the viewing direction with all fragment models being
sectioned simultaneously, and then the slices are blended together in the order of
back to front. In order to highlight particular structures of interest (the bone in our
case) and to produce DRRs that better simulate real fluoroscopic images, a transfer
function is used during slicing and blending. All those operations (i.e. slicing and
blending) are performed within the GPU hardware.
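The GPU slice-blending pipeline itself is not reproduced here, but the underlying line-integral idea can be sketched with a toy orthographic CPU ray sum; the bone transfer-function window is an assumed placeholder, and the real implementation uses perspective projection [9].

```python
import numpy as np

def drr_orthographic(ct, axis=0, bone_window=(300.0, 1500.0)):
    """Toy DRR: apply a transfer function highlighting bone, then
    integrate along one axis (orthographic rays)."""
    lo, hi = bone_window
    attenuation = np.clip((ct - lo) / (hi - lo), 0.0, 1.0)  # transfer function
    return attenuation.sum(axis=axis)                       # ray integral
```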
5.3.6 Optimization Scheme
The goal of optimization is to find a set of transform parameters that minimize the
similarity metric defined in Eq. (5.5). As the metric function is highly non-linear,
and also due to the involvement of multiple fragments, we use a coarse-to-fine, two-level optimization scheme. At each level, we perform two types of optimizations in
an alternating and iterative fashion:
1. Global optimization - the parameters of the global transform are estimated for
a certain number of iterations, so that all fragment models are transformed as
an entirety.
2. Local optimization - the parameters of the local transforms are estimated in two
steps. First, the parameters for individual fragments are sequentially estimated
for a certain number of iterations, in the order from the largest fragment to
the smallest one, where the size of a fragment is computed as the number of
non-zero voxels in the fragment model. Second, all parameters are estimated in
parallel for a certain number of iterations.
The number of iterations for the global, sequential and parallel optimizations are
user-defined, and we have used 20, 20 and 10, respectively, for all of our experiments.
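One resolution level of the alternating scheme can be sketched as follows; `optimize_subset` is a hypothetical callback that runs the CMA-ES optimizer over the selected 6-parameter blocks (index 0 for the global transform, 1..N for the fragments, assumed pre-sorted from largest to smallest), and the fixed number of rounds is an assumption.

```python
def alternating_scheme(optimize_subset, n_fragments, n_rounds=5,
                       global_iters=20, seq_iters=20, par_iters=10):
    """Global-local alternating optimization for one resolution level."""
    fragments = list(range(1, n_fragments + 1))
    for _ in range(n_rounds):
        optimize_subset([0], global_iters)        # 1. global transform
        for i in fragments:                       # 2a. sequential, largest first
            optimize_subset([i], seq_iters)
        optimize_subset(fragments, par_iters)     # 2b. all local params in parallel
```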
All optimizations are performed using the CMA-ES algorithm [42], which is known
for robust estimation of nonlinear functions that have a rugged search landscape.
The algorithm requires no derivative calculation; instead, the proceeding directions
are learned by sampling the parameter space according to a probability search distribution and selecting the samples that best predict the convergence direction. As
estimation advances, the parameters and the search distribution are progressively
updated according to the newly added samples. Compared with traditional optimization algorithms such as simplex and gradient-descent, although each iteration
of the CMA-ES algorithm is computationally more expensive, the total number of
iterations required for convergence is much smaller because more informative samples
are considered in each iteration, and history information is also carried forward.
In our previous work on single-object 2D-3D registration (Chapter 4) [36], the CMA-ES algorithm demonstrated a large capture range of about 20 mm, with convergence within a couple of minutes.
To improve the capture range as well as the performance, the optimizations are
performed for two resolution levels. In the first level, fluoroscopic images are down-sampled to have a resolution of 128 × 128 pixels with isotropic spacing, and fragment
models are down-sampled to have a resolution of 128×128×64 voxels with anisotropic
spacing. In the second level, the resolution along each coordinate axis is doubled for
both fluoroscopic images and fragment models.
Our algorithm requires a rough initial alignment between the treatment plan and
the fluoroscopic images. Initialization is performed interactively from a graphical user
interface (GUI), which provides good initial guesses for both global and local parameters with only a few user interactions. Alternatively, the inverse of the transformations
obtained during planning can be used as a good initial guess of the local parameters,
and the traditional optical tracking or landmark-based registration techniques can be
used to supply an initial guess of the global parameters.
5.4 Results
We evaluate our method with three types of experiments. First, we use two synthetic
fractures to test the method’s behavior under ideal conditions. Second, we use two
fracture phantoms, simulating real patient cases, to test the method’s behavior in a
clinical environment. Lastly, a patient fracture with simulated fluoroscopic images is used to study the behavior of our method in the presence of outliers in fluoroscopic
images.
For each fracture case in each type of experiment, a variety of treatment plans
is used to evaluate the method. The treatment plans are randomly generated on a computer rather than obtained through an actual planning procedure,
which greatly simplifies our experimental process, and allows extensive testing of
our method. As we mentioned before, the focus of the current work is on image
registration and, hence, the quality of the generated plans does not affect the conclusions
drawn in this study.
Figure 5.2: The process of planning, registration (including fluoroscopic imaging and fragment pose identification), and error calculation for every single experiment.
Fig. 5.2 shows the complete testing cycle for every single experiment:
1. Planning - A treatment plan is generated from the fragment models of the
fracture.
2. Registration - This step consists of 2a) fluoroscopic imaging of the fracture, and
2b) pose identification of the fragments with our proposed method.
3. Error evaluation - The identified fragment poses are compared with their true
poses.
5.4.1 Error Measurement
Fig. 5.2 also shows the associated transforms in each testing stage as well as the error calculation for a fragment i. The planning yields the transform ^plan T_model(i) (Section 5.3.1), the intraoperative fluoroscopic imaging yields the transform ^OR T_model, which takes all fragment models as one fracture into the OR, and the pose identification obtains the transform T_i (Section 5.3.3). The composition of the transforms from planning and intraoperative imaging is the gold standard of our registration

T_i^GS = ^OR T_model · (^plan T_model(i))^(-1),  i = 1, ..., N.   (5.6)
We use the mean Target Registration Error (mTRE) [27] to report the pose identification error e_i. It is determined by the gold-standard transform and the transform obtained during registration, and is evaluated over all surface points of the fragment:

e_i = mTRE_i = (1/|Ω|) Σ_{x∈Ω} ||T_i x − T_i^GS x||,  i = 1, ..., N,   (5.7)
where Ω is the surface point set. The above error represents the mean surface-to-surface distance between corresponding points of the true fragment in the OR and the
estimated fragment after registration.
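For concreteness, the sketch below evaluates Eq. (5.7) for one fragment, with the recovered and gold-standard transforms given as 4 × 4 homogeneous matrices (the matrix representation is an assumption).

```python
import numpy as np

def mtre(T_i, T_gs, surface_points):
    """Mean Target Registration Error of Eq. (5.7).
    T_i, T_gs: 4x4 recovered and gold-standard transforms;
    surface_points: (K, 3) segmented surface points of the fragment."""
    pts = np.c_[surface_points, np.ones(len(surface_points))]  # homogeneous coords
    diff = (pts @ T_i.T)[:, :3] - (pts @ T_gs.T)[:, :3]
    return float(np.linalg.norm(diff, axis=1).mean())
```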
A pose identification for a fragment is deemed successful if the final mTRE is below
3 mm. In order to study the performance of the method under different distances
of planning, mTREs are correlated to the model-to-plan distances, and success rates
are analyzed for different ranges of such distances. The model-to-plan distance of
a fragment di is the mean displacement that is generated by the planning. It is
calculated from the transform obtained during planning and all surface points of the
fragment:

d_i = (1/|Ω|) Σ_{x∈Ω} ||(^plan T_model(i))^(-1) x − x||,  i = 1, ..., N.   (5.8)

5.4.2 Experiments with Synthetic Fractures
We first perform experiments with synthetic fractures and synthetic fluoroscopic images. The goal of this type of experiment is to investigate our method’s behavior
under optimal conditions, where the system noise introduced during the planning and pose identification stages is minimal, and the best imaging views with minimal fragment occlusion are available.
The right wrist CT of a human cadaver is used to generate the synthetic fracture
cases. The CT has a resolution of 512 × 512 × 72 voxels and a spacing of 0.174 ×
0.174 × 1 mm3 (lower resolution CT could also be used as long as fragment models
can be accurately extracted). The radius is first segmented from the CT, then is
cut with a plane to generate two fracture cases: a two-fragment fracture case (Fig.
5.3) and a three-fragment fracture case (Fig. 5.4). This process also produces a
total of five fragment models, each of which is resampled and cropped to have a resolution
of 256 × 256 × 128 voxels and a spacing of 0.2 × 0.2 × 0.6 mm3 . The two-fragment
fracture case simulates a diaphyseal segmental fracture that consists of two major
bone fragments: one with an irregular shape, and one with a tubular shape. The three-fragment fracture case simulates a trauma that includes an additional peri-articular
oblique fracture. These simulated fractures create fragments of different shapes and
sizes, and are common types of fractures in clinics.
Fluoroscopic images are simulated from the fragment models with a virtual GE
Figure 5.3: Simulated wrist fracture that contains a two-fragment diaphyseal segmental fracture.
Figure 5.4: Simulated wrist fracture that consists of two fracture surfaces and three
fracture fragments: one diaphyseal segmental fracture surface and one
peri-articular oblique fracture surface.
OEC 9800 C-arm (Section 5.3.2). The virtual device has a fixed source-to-detector
distance of 920 mm, and a detector size of 213.7 × 213.7 mm2 . Fluoroscopic images
are simulated with a resolution of 256 × 256 pixels and a spacing of 0.83 × 0.83 mm2 ,
with the fragment models being placed at a location with a source-to-object distance
of 400 mm. To obtain a simulated fluoroscopic image Fj∗ from a direction j, a DRR,
DRRj , is first generated from the fragment models along the direction, then it is
composed with a true fluoroscopic image Fempty that is captured from an empty
field, and finally a certain amount of random noise is added to the final image. The
composition at a pixel location (x, y) is formulated as
Figure 5.5: Formation of a simulated fluoroscopic image. Left: DRR of a fracture CT.
Middle: empty true fluoroscopic image. Right: the simulated fluoroscopic
image.
F*_j(x, y) = DRR_j(x, y) F_empty(x, y)/D + N(µ, σ²),  j = 1, ..., M,   (5.9)

where D is a constant used to control the contrast of the resulting image, N(µ, σ²)
controls the amount of noise added, and M is the number of fluoroscopic views. When
generating the DRRs from the fragment models, we also incorporated the heel effect, a physical phenomenon that further introduces intensity distortions in the resulting X-ray images. Fig. 5.5 shows the formation of one simulated fluoroscopic image.
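The composition of Eq. (5.9) can be sketched directly (the heel-effect distortion mentioned above is applied at the DRR stage and is not modelled here):

```python
import numpy as np

def simulate_fluoro(drr, f_empty, D=256.0, mu=0.0, sigma=5.0, rng=None):
    """Eq. (5.9): pixel-wise product of the DRR and an empty-field
    fluoroscopic image, scaled by D, plus Gaussian noise N(mu, sigma^2)."""
    rng = rng or np.random.default_rng()
    return drr * f_empty / D + rng.normal(mu, sigma, drr.shape)
```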
For each of the synthetic fracture cases, the poses of all fragment models are
independently and randomly perturbed to obtain a set of 100 virtual treatment plans.
The perturbation ranges are ±15◦ (maximal value) for the rotation parameters (with
rotation centers at the geometry centers of the models), and ±10 mm (maximal value)
for the translation parameters. Two simulated fluoroscopic images are produced:
one from the x-direction (AP view), one from the y-direction (lateral view), and
composition parameters of D = 256, µ = 0 and σ = 5 are used. The coordinate
axes of both fracture cases have been shown in Fig. 5.3 and Fig. 5.4. As fluoroscopic
images are directly simulated from the fragment models, the transform for fluoroscopic imaging, ^OR T_model (Fig. 5.2), is the identity. So, calculation of the gold standard (Eq. 5.6) is simplified.
For each fracture case and each generated treatment plan, the perturbed poses
are recovered with our method. Registrations with mTRE > 3 mm are considered
unsuccessful, and are excluded from the calculation of error statistics. The results
are summarized in Fig. 5.6 and Table 5.1 for the two-fragment fracture case, and
in Fig. 5.7 and Table 5.2 for the three-fragment fracture case. For a visual check
of the final result, Fig. 5.8 shows the intensity differences between the simulated
fluoroscopic images and the corresponding DRRs before and after registration for one
typical experiment.
Figure 5.6: Pose identification errors versus distances of planning for the two-fragment synthetic fracture experiments.
                                      Fragment A   Fragment B
Errors in           θx                0.24±0.44    0.37±0.46
Rigid-Body          θy                0.52±0.61    1.03±0.96
Parameters          θz                2.06±1.26    0.64±0.68
                    tx                0.78±0.85    1.83±1.71
                    ty                0.33±0.61    0.66±0.83
                    tz                0.27±0.54    0.05±0.06
mTRE (mm)                             0.43±0.49    0.24±0.18
Successful          di ≤ 10 mm        91           100
Registrations (%)   di ≤ 15 mm        90           98
                    di ≤ 20 mm        84           85

Table 5.1: Error statistics for the two-fragment synthetic fracture experiments.
Figure 5.7: Pose identification errors versus distances of planning for the three-fragment synthetic fracture experiments.
                                      Fragment A   Fragment B   Fragment C
Errors in           θx                1.03±1.15    0.51±0.77    0.89±0.69
Rigid-Body          θy                0.58±0.46    1.03±0.81    1.42±0.98
Parameters          θz                2.08±1.27    0.60±0.77    1.40±0.90
                    tx                0.84±0.68    1.79±1.44    2.76±1.95
                    ty                1.49±1.69    0.91±1.37    1.79±1.40
                    tz                0.38±1.37    0.08±0.08    0.13±0.11
mTRE (mm)                             0.62±1.33    0.21±0.16    0.31±0.12
Successful          di ≤ 10 mm        68           100          93
Registrations (%)   di ≤ 15 mm        57           87           65
                    di ≤ 20 mm        50           80           61

Table 5.2: Error statistics for the three-fragment synthetic fracture experiments.
5.4.3 Experiments with Fracture Phantoms
Two fracture phantoms are made to study our method’s performance in a lifelike clinical environment, where various system noises exist, and orientations of fluoroscopic
imaging are more restricted.
Two wrist fracture cases are used as templates to create the fracture phantoms.
The first case is a peri-articular distal radius fracture which has one fracture surface and two involved fragments (Fig. 5.9). The second case is a more complicated
comminuted distal radius fracture which has two irregular fracture surfaces and three
involved fragments (Fig. 5.10). In this fracture case, two involved fragments have
Figure 5.8: Difference images (AP view and lateral view) between the simulated fluoroscopic images and the corresponding DRRs for one typical case of the three-fragment synthetic fracture experiments. The background of the simulated fluoroscopic images is removed in order to visualize the details. First row: before registration. Second row: after registration. The model-to-plan distances for fragments A, B and C are 10.7 mm, 7.8 mm and 13.0 mm, respectively. The corresponding pose identification errors are 0.4 mm, 0.1 mm and 0.3 mm, respectively. Registration took about four minutes on an Intel Core 2 PC with a GeForce 8600 GPU.
Figure 5.9: One phantom replicates a peri-articular segmental fracture which has one
fracture surface with two involved fragments.
Figure 5.10: The second phantom replicates a comminuted fracture case which has
two irregular fracture surfaces with three involved fragments.
small sizes, and one is embedded inside a large bone. This is a rather difficult problem for registration, because it is impossible to obtain fluoroscopic images without
occlusions, and also the images of the small fragments do not provide rich information
for registration.
The construction of a fracture phantom starts from the diagnostic CT of the
fracture case.
First, the fragments are segmented from the CT, and the corresponding mesh
models are created.
Next, the mesh models are printed out with Acrylonitrile Butadiene Styrene (ABS)
plastic using a rapid prototyping 3D printer. Then, the surfaces of the printed bones
are coated with a barium sulfate solution. A small amount of lacquer is added into
the solution to prevent the barium sulfate from dissolving in hot water. The use of
this solution is to simulate bone density under X-ray imaging, as plastic materials are hardly visible in CT and fluoroscopic images.
Next, the models are positioned inside a plastic container, tissue-mimicking material is poured in, and the container is cooled down. The use of the tissue-mimicking
material is to fix the models’ positions as well as to simulate the surrounding soft
tissues. The tissue-mimicking material is made by mixing 4% agar with 96% boiled water, which hardens quickly as it cools down. During the
hardening process, the models are manipulated to create the desired fracture layout.
Finally, five labeled CT fiducials are attached on three faces of the phantom container for computing the gold standard during the experiments.
Note that the constructed fracture phantom replicates the shapes and sizes of the
fragments in the fracture case, but with a customized spatial relationship among the
fragments.
For each created fracture phantom, a CT scan is acquired, then fragments are
segmented, down-sampled and cropped. Created fragment models have a resolution
of 256 × 256 × 128 voxels and a spacing of 0.35 × 0.35 × 0.7 mm³ for both phantoms. Fifty random treatment plans are generated from the fragment models, with a perturbation range of ±15° for rotation and ±10 mm for translation for the two-fragment fracture case, and ±10°/±5 mm for the three-fragment fracture case.

Figure 5.11: Four fluoroscopic images are used for each phantom case. First row: the two-fragment fracture case. Second row: the three-fragment fracture case.
Four fluoroscopic images are acquired for each phantom with the phantom being
placed roughly at the iso-centre of the C-arm. The imaging orientations are about 45° apart from each other, starting from the x-direction (AP view) and rotating counter-clockwise around the z-axis (Fig. 5.9 and Fig. 5.10). So, the third imaging direction is close to the lateral view. Due to the constraints within the OR, these views are the most readily available orientations. Other orientations may also be available, but either they do not add much information for registration, or the patient’s pose has to be
changed. Images are captured with a resolution of 473 × 473 pixels and a spacing of
0.45 × 0.45 mm2 , and down-sampled to have a resolution of 256 × 256 pixels and a
spacing of 0.83 × 0.83 mm2 . As the C-arm rotates around the phantoms, the focal
length varies between 916 mm and 923 mm due to the changes in the position of the
C-arm. The fluoroscopic images used for this study are shown in Fig. 5.11.
                                      Fragment A   Fragment B
Errors in           θx                1.79±1.21    1.29±0.86
Rigid-Body          θy                1.23±1.16    1.82±1.78
Parameters          θz                1.72±1.14    1.53±1.18
                    tx                0.80±0.49    0.99±0.54
                    ty                1.05±0.72    2.44±0.73
                    tz                0.45±0.34    0.71±0.52
mTRE (mm)                             1.53±0.64    2.79±0.60
Successful          di ≤ 10 mm        98           92
Registrations (%)   di ≤ 15 mm        91           86
                    di ≤ 25 mm        87           77

Table 5.3: Error statistics for the two-fragment fracture phantom experiments.
To compute the gold standard for each phantom case, the transform that positions the phantom in the OR for fluoroscopic imaging (i.e., ^OR T_model in Fig. 5.2) must be
known. This is obtained by using the CT markers that are attached on the phantom
container surfaces. During fluoroscopic acquisition, the markers’ positions in the OR
coordinate frame are also recorded using optical tracking. After the phantom CT is
acquired, the positions of the same set of markers in the phantom coordinate frame are
obtained by segmenting the markers from the phantom CT. As the correspondences
of the two sets of points are known, the transform that correlates the two coordinate
frames can be easily computed by aligning the two point sets.
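This paired-point alignment is a standard least-squares problem; the sketch below uses the SVD-based Kabsch/Horn solution, which is one common way to solve it (the thesis does not name the exact solver used, and the marker coordinates shown are hypothetical):

```python
import numpy as np

def align_point_sets(P, Q):
    """Least-squares rigid transform (R, t) mapping points P onto Q.

    P, Q: (N, 3) arrays of corresponding points (N >= 3, not collinear).
    Returns R (3x3 rotation) and t (3-vector) such that Q ~= P @ R.T + t.
    """
    Pc, Qc = P.mean(axis=0), Q.mean(axis=0)      # centroids
    H = (P - Pc).T @ (Q - Qc)                    # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = Qc - R @ Pc
    return R, t

# Example with five hypothetical fiducial positions in the phantom CT frame
# and the same markers "recorded" in the OR frame by the optical tracker.
ct_pts = np.array([[0., 0., 0.], [50., 0., 0.], [0., 60., 0.],
                   [0., 0., 40.], [50., 60., 40.]])
true_R, _ = np.linalg.qr(np.random.randn(3, 3))
true_R *= np.sign(np.linalg.det(true_R))         # make it a proper rotation
or_pts = ct_pts @ true_R.T + np.array([10., -5., 120.])
R, t = align_point_sets(ct_pts, or_pts)
print(np.allclose(or_pts, ct_pts @ R.T + t))     # True
```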
Registrations are performed using the default edge-enhancement and optimization parameters. Results with mTRE > 3 mm are considered unsuccessful, and are excluded from the calculation of error statistics. Fig. 5.12 and Table 5.3 show the experimental results for the two-fragment fracture phantom, and Fig. 5.13 and Table 5.4 show the results for the three-fragment fracture phantom. Figures 5.15 and 5.14 show examples of the registration results for the two phantoms.
Figure 5.12: Pose identification errors versus distances of planning for the two-fragment fracture phantom experiments. 6% of the total points are cropped in the fragment A chart because their mTRE values are out of the display range (i.e. > 25 mm).
Figure 5.13: Pose identification errors versus distances of planning for the three-fragment fracture phantom experiments. 2% of the total points are cropped in the fragment C chart because their mTRE values are out of the display range (i.e. > 15 mm).
Table 5.4: Error statistics for the three-fragment fracture phantom experiments.

                                        Fragment A    Fragment B    Fragment C
Errors in         θx                    1.61±1.67     2.00±2.29     1.99±1.34
Rigid-Body        θy                    1.18±1.05     2.10±1.81     1.33±1.81
Parameters        θz                    2.68±1.82     1.58±1.19     2.59±2.18
                  tx                    0.66±0.35     1.44±0.99     0.87±0.59
                  ty                    0.76±0.50     0.96±1.52     0.96±0.50
                  tz                    1.13±0.52     0.90±0.68     0.85±0.47
mTRE (mm)                               1.74±0.52     1.98±1.80     1.71±0.41
Successful        di ≤ 5 mm             100           98            100
Registrations     di ≤ 10 mm            93            94            97
(%)               di ≤ 15 mm            87            86            93
5.4.4 Validation Against Outliers in Fluoroscopic Images
To study the behavior of our method in the presence of outliers in fluoroscopic images, we generated simulated fluoroscopic images using the two-fragment patient CT, which includes surrounding bones (Fig. 5.9). First, both fracture fragments were rotated and shifted manually to create a large displacement. Then, using the method described before (Section 5.4.2), a set of simulated fluoroscopic images was created from four directions that are perpendicular to the z-axis, starting from -x and rotating towards +y, and roughly 75° apart (Fig. 5.16, first row). Those directions were selected because they are usually available in the OR and provide rich information about the fracture.
Two types of studies were performed. First, the original fragment positions in the patient CT were used as the planned positions, in which case the manually created displacements (i.e. model-to-plan distances, or initial errors) were the registration gold standard. Second, similar to previous experiments, a total of 100 random treatment plans were generated around the manually created fracture configuration (±15° for the rotation parameters and ±10 mm for the translation parameters), in which case the perturbed displacements were the registration gold standards. Fig. 5.16 shows the simulated fluoroscopic images used in both studies as well as the treatment plan for the first study, and Table 5.5 summarizes the experimental results for both studies.

Figure 5.14: Result of registration for the two-fragment phantom. The top row shows the fluoro images used. The middle row shows the treatment plan. The bottom row shows the result of the registration of the plan to the fluoro images. Note that, to demonstrate the robustness of the proposed registration technique, we have chosen a treatment plan that differs significantly from the pose of the fractures in the fluoro images.
Figure 5.15: Result of registration for the three-fragment phantom. The top row shows the fluoro images used. The middle row shows the treatment plan. The bottom row shows the result of the registration of the plan to the fluoro images. Note the adjustment of the location of the small fracture fragment in the bottom row.
Table 5.5: Summary of experimental results for the studies with outliers present in fluoroscopic images (initial and final errors in Study II are denoted as mean ± standard deviation).

                                        Fragment A      Fragment B
Study I      Displacement (mm)          12.34           8.22
             mTRE (mm)                  0.74            0.45
Study II     Displacement (mm)          10.57±6.24      15.95±7.18
             mTRE (mm)                  0.83±0.64       0.56±0.49
             Success Rate (%)           86              91
Figure 5.16: First row: four simulated fluoroscopic images were generated from the two-fragment patient CT, which includes surrounding bones as outliers (fragment positions were manually displaced). Second row: the treatment plan shown in the corresponding imaging views (the surrounding bones were not present in the DRRs during registration; they are included for illustration).
5.5 Discussion
The effectiveness of our method is strongly supported by our experimental results. For the synthetic fracture cases, an average pose identification error of 0.4 ± 0.5 mm (i.e. the average mTRE over all synthetic fracture experiments and all fragments) is achieved (see Tables 5.1 and 5.2) with only two simulated fluoroscopic images from the best imaging directions. For the phantom fracture cases, an average pose identification error of 2.0 ± 0.8 mm (i.e. the average mTRE over all phantom fracture experiments and all fragments) is achieved (Tables 5.3 and 5.4) with only four fluoroscopic images from restricted imaging directions. According to the orthopaedic physicians we consulted at Kingston General Hospital, these accuracies are sufficient
and meet most clinical needs. As for capture range, for a planning distance below 10 mm, almost all registrations are successful for most fragments in the synthetic fracture experiments (Tables 5.1 and 5.2). Two exceptions are fragment A in both fracture cases, where larger failure rates are observed. However, had we relaxed the registration success criterion to 3 mm, all registrations for fragment A in both fracture cases would have been successful.
For the phantom fracture cases, all experiments are successful for a planning distance below 5 mm (Tables 5.3 and 5.4). Given the difficulty of multiple-object 2D-3D registration in a noisy environment, we consider those acceptable capture ranges. Furthermore, a planning distance of 5 mm can normally be achieved by orthopaedic surgeons within our institution using manual alignment of bone fragments.
In Tables 5.1 and 5.2, fragment A in both fracture cases has higher failure rates and a slightly larger mTRE distribution. A closer look at the errors in rigid-body parameters reveals that the main error source is rotation about the z-axis. This is because fragment A in both cases has a cylindrical shape, and the AP and lateral imaging views provide relatively little information for correcting this kind of rotational error. In Table 5.2, the failure rates for the two small fragments (i.e. B and C) are larger than that of the large fragment. The results show that the shape and size of the fracture do affect the performance of our method. However, the impact can be reduced by using fluoroscopic images from better imaging directions.
Comparing all three fragments in Table 5.4, we see that fragment C has a similar mTRE distribution (1.71 ± 0.41 mm) to that of A (1.74 ± 0.52 mm), but fragment B has a larger error distribution (1.98±1.80 mm) than that of C, though its size is even slightly larger. This observation can be explained by the imaging views (Fig. 5.11,
second row) used for registration: the irregular shadows of fragment C are more informative for registration, so good pose identification accuracy could be achieved, though with a larger failure rate; fragment B, on the other hand, is heavily occluded by the large bone in the fluoroscopic images, so a larger error distribution is reasonable due to the uncertainties caused by the occlusions. Our algorithm is designed to handle these kinds of issues, so the overall pose identification performance for fragment B is still acceptable.
Given the 0.4±0.5 mm error in the synthetic fracture experiments and the 2.0±0.8 mm error in the phantom fracture experiments, we believe that the major source of error in our algorithm comes from a few factors related to the image similarity metric. In the synthetic fracture experiments, the best imaging directions are used, and the discrepancies between the fluoroscopic images and DRRs are minimal, as the fluoroscopic images are simulated from DRRs. So the registration errors should come from sources such as the empty fluoroscope background, artificially introduced noise, nonlinear function transformations (including image enhancement and mutual information), and premature termination of the optimization, which collectively produced an error of 0.4 ± 0.5 mm. In our phantom fracture studies, the acquired fluoroscopic images are post-processed in order to determine the imaging parameters as well as to correct the geometric distortions. During registration, though the quality of the produced DRRs can be controlled by using a transfer function or a more accurate DRR generation method, the discrepancies between the fluoroscopic images and DRRs will still be large. Also, in the phantom studies, the involved fragments have different shapes and sizes, and the available imaging views are constrained.
It should be noted that, in our phantom studies, the phantoms are built without
including the surrounding bones such as the ulna and carpal bones. Since in distal radius fracture treatment fluoroscopic images are usually acquired from the AP and lateral directions, the surrounding bones can be maximally cropped from the images before the registration is initiated. Therefore, excluding the surrounding bones should not have a critical impact on our experimental results.
In cases where many surrounding bones do appear in the fluoroscopic images, the reported final errors (Table 5.5, Study II) are slightly worse than those from a similar synthetic case without outliers (Table 5.1, row "mTRE"). However, the impact of the outliers is not significant, and the overall final errors are still within 1 mm. We also observe that the results are even better than those from the two-fragment phantom case (Table 5.3, row "mTRE"). This is mainly because simulated fluoroscopic images of good quality were used; nevertheless, our studies show that, if fluoroscopic images of good quality and from appropriate imaging views can be acquired, the impact of outliers will not be significant.
Our method has a number of parameters that control the behavior of the edge-enhancement filter and the optimization, and they are currently determined through empirical testing. We have found that the registration results are somewhat sensitive to some of the parameters, such as σ, A and B in the edge-enhancement filter. Comparing the failure rates in Table 5.2 and Table 5.4, we see that the phantom fracture experiments achieve better capture ranges than the synthetic fracture experiments. The same observation can also be made for fragment B in Tables 5.1 and 5.3. Although more fluoroscopic images are used in the phantom fracture experiments, the main reason for this observation is that the user parameters were tuned for the phantom experiments. The values of those parameters are mainly determined
by the structural details in the fluoroscopic images and the type of device used for fluoroscopic imaging. Therefore, for each type of fracture and each type of imaging device, the parameters need to be tuned only once.
Our method employs several techniques in order to achieve a good balance between performance and robustness. For example, GPU-based DRR generation and the IIR gradient are employed for speed, while Mattes mutual information, CMA-ES, and the multi-resolution and global-local alternating optimization scheme are adopted for both performance and robustness. Each registration in our experiments takes about 3 to 9 minutes to complete; the experiments were performed on a personal computer with the following configuration: Windows XP, Intel Core 2 2.4 GHz CPU, 3 GB RAM, and an nVidia GeForce 8600 GT graphics card with 256 MB VRAM. As the computational power of CPUs and GPUs is rapidly increasing, we are optimistic that completing one registration within one minute will soon be possible.
In a clinical environment, there are three potential ways to use the transformations reported by our method. First, our method can be used like traditional CAS systems to guide the intraoperative reduction procedure with a preoperative treatment plan, where the obtained transformations are used to compute the treatment errors with respect to a carefully made plan and provide both visual and quantitative feedback to the surgeon. The advantage is that our method is non-invasive; the drawback is that the computational efficiency as well as the robustness need to be further improved. The second usage is for postoperative evaluation of treatment errors, which is similar to the first usage but has fewer requirements on computation speed, and for which our method is immediately suitable. Lastly, the most ambitious and prospective usage
is to use the obtained transformations to directly control a treatment robot in order
to achieve automatic fracture reduction, and there is still a long way to go to achieve
this goal.
5.6 Summary
We present a new multiple-object 2D-3D registration method to identify the poses of fracture fragments in the space of the treatment plan with only 2-4 fluoroscopic images. Two key techniques are used to deal with occlusions, outliers, noise and small fragment sizes. First, a similarity metric that integrates edge information with mutual information is used to obtain an accurate and smooth cost function. Second, a global-local alternating optimization scheme that employs CMA-ES is used to achieve a good balance between capture range and convergence speed. A mean pose identification error of 0.4 mm with a capture range up to 10 mm is achieved for synthetic fracture studies that simulate optimal treatment setups, and a mean pose identification error of 2.0 mm with a capture range up to 5 mm is achieved for phantom fracture studies that simulate lifelike treatment setups.
Using 2D-3D registration in place of optical tracking provides minimally invasive treatment for complex bone fractures with a reduced amount of irradiation. The method can be used as a first trial before more invasive and costly procedures are attempted, or it can be used to evaluate post-treatment errors against the treatment plan.
In future work, we will seek more systematic ways to determine the user parameters that control the edge-enhancement filter and the optimization behavior. We will also further optimize the software by using the latest GPU parallel computation technologies to improve our method's performance.
Chapter 6
Modelling of 3D Intensity Atlas with B-Spline FFD
6.1 Overview
Fast instance generation is a key requirement in atlas-based registration and other problems that need a large number of atlas instances.¹ This chapter describes a new method to represent and construct intensity atlases. Both geometry and intensity information are represented using B-spline free-form deformation (FFD) lattices; intensities are approximated using the multi-level B-spline approximation algorithm during model creation, and the parallel computation capability of modern graphics processing units is used to accelerate the process of instance generation. Experiments with distal radius CTs show that, with a coefficients-to-voxels ratio² of 0.16, intensities can be approximated up to an average accuracy of 2 ± 17 grey-levels (mean ± stdev, out of 3072 total grey-levels), and instances of resolution 256 × 256 × 200 can be produced at a rate of 25 instances per second with a GeForce GTX 285 video card, which is about a 500-times performance improvement over the traditional method that uses a plain CPU.

¹ This work has been published in Proceedings of EMBC: R. H. Gong, J. Stewart, and P. Abolmaesumi, "A new representation of intensity atlas for GPU-accelerated instance generation", Annual International Conference of Engineering in Medicine and Biology Society, pages 4399-4402, 2010.
² Defined as the ratio between the number of B-spline intensity deformation parameters and the number of original voxels; see Section 6.4.1 for more details.
6.2 Introduction
Anatomical atlases (or statistical shape models) are widely used in Computer Assisted
Surgery (CAS) for registration, segmentation, planning, interpretation, and so on. An
atlas captures the statistics, including mean and variations, of a set of instances of the
same anatomy. Captured information can be geometry only (geometry atlas hereafter)
or both geometry and intensity (intensity atlas hereafter). While geometry atlases
are more efficient for use due to their compact sizes, intensity atlases offer greater
reliability and accuracy because of the additional intensity information they provide.
One key problem associated with an atlas is to find an appropriate representation
of the atlas such that correspondences across a set of training shapes can be easily
established, and new instances can be efficiently generated.
Geometry atlases have been well studied during the past two decades. Reported representations or underpinning techniques include point distribution models [20], minimum description length [23], medial axes [32], spherical harmonics [15], tetrahedral meshes [123], connected in-spheres [101], and deformation fields [92]. For most of those techniques, the process of atlas construction and the quality of the resulting model depend on the geometry of the shape being studied. One exception is the technique that uses deformation fields, which are the result of B-spline deformable registration. However, that technique uses multiple full-resolution volumetric data components as the model representation, which not only takes large storage space, but also slows down the process of instance generation. This limitation can become the performance bottleneck when the application context requires a large number of atlas instances, such as with atlas-based 3D registration.
Adding intensity information into atlases further complicates the problem. Thus, intensities are usually sampled or mathematically approximated in order to reduce the size of the models. The Active Appearance Models suggested by Cootes et al. [19] capture texture variations within a sampled "shape-free" patch (i.e. the region of interest). Berks et al. [7] use a thin-plate spline to approximate intensities in their Mammographic Appearance Models. For a 3D intensity atlas, Yao et al. [122] choose to use Bernstein polynomials to approximate intensity distributions within each tetrahedron. In all these methods, while intensity information is captured with satisfactory accuracy, efficiently producing instances from the models remains a problem.
In recent years, the processing power of consumer-grade graphics processing units (GPUs) has been improving rapidly, and GPUs are increasingly used for general-purpose computation. In this chapter, we describe a simple representation for intensity atlases that takes advantage of this trend. Both geometry and intensity information are represented using B-spline deformation lattices: intensities are approximated with the Multi-level B-spline Approximation (MBA) algorithm [59] during atlas creation, and the parallel computation capability of modern GPUs is used to accelerate the process of instance generation. We evaluate our method with a set of distal radius CT data, and report the accuracy of intensity approximation as well as the performance of instance generation.
Our primary contribution is the use of a B-spline lattice for representing the intensity information of atlas models, which is an accurate and compact representation that is well suited to GPU parallel processing. Our experiments showed that, compared with the traditional CPU-based method, the use of B-splines and the GPU can achieve about a 500-times improvement in performance.
6.3 Method

6.3.1 Atlas Representations
Every atlas has two representations: an internal representation that describes the mean and variations of the atlas, and an external representation, known as the shape parameters, through which user applications request instances. The former has different formats for different atlases, and determines the space/time costs needed to produce each instance. The latter has a compact format, and its dimension has a great impact on the performance and behavior of the user applications.

The internal representation of any instance in our atlas can be generally described using a function S = f(V; C), with C = {S, M} and V = {Φg, Φi}. C and V represent the constant and variable aspects of the model, respectively:

• C consists of a volume that describes the mean geometry and intensities (S), and a binary mask that segments the anatomy of interest (M).

• V describes the geometry and intensity variations between this instance and the mean volume, and consists of two uncorrelated (in terms of the positions of the control points) B-spline control lattices, one for the geometry deviation (Φg) and one for the intensity deviation (Φi). The two lattices cover the same region of interest, but with different resolutions and spacings. The selection of resolutions depends on the accuracy of the variations to be captured, but the resolutions are always coarser than the original volume grid in order to achieve data reduction.

To obtain an instance from the model, the two control lattices are sequentially applied to the mean volume, with intensity first, followed by geometry.
The goal of the external representation is to use the fewest parameters to represent the most variation within a set of training examples. This is done by performing Principal Component Analysis (PCA) on the variable part of the internal representation. For each training example Sk, k = 1...N, we form a column vector vk = (φg(k)^x, φg(k)^y, φg(k)^z, φi(k))^T, where the first three are the X, Y and Z components of the geometry B-spline lattice Φg, each linearized as a row vector. Similarly, the last component is the linearized row vector of the intensity B-spline lattice Φi. Performing PCA on v with all training examples, we obtain the following generalized linear model:

    v = P b                                                            (6.1)

where the columns of P describe a set of orthogonal modes of shape variation corresponding to the given set of training examples, and b is the projection of a known v in the new coordinate space, which is the external representation of the instance that user applications see. The modes are sorted in descending order of variance. The dimension of b is less than or equal to N, and the values of b are bounded by the convex hull that contains the training examples. All values of b within the convex hull collectively represent the valid shapes that the given set of training examples can predict.
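A minimal sketch of this PCA parameterization is given below; the lattice sizes are hypothetical stand-ins, and the explicit mean-centring is our addition (Eq. (6.1) leaves the mean implicit):

```python
import numpy as np

# Hypothetical sizes: a 10x10x10 geometry lattice (x, y, z components) plus
# a 12x12x12 intensity lattice, for N = 9 training examples (Section 6.4).
# Random columns stand in for the real linearized lattice vectors v_k.
N, dim = 9, 3 * 10**3 + 12**3
rng = np.random.default_rng(0)
V = rng.standard_normal((dim, N))

# PCA via SVD of the centred data matrix; the columns of P are the modes of
# variation, sorted by decreasing variance.
v_mean = V.mean(axis=1, keepdims=True)
P, s, _ = np.linalg.svd(V - v_mean, full_matrices=False)

# External representation: project one training example into mode space ...
v = V[:, [0]]
b = P.T @ (v - v_mean)            # shape parameters of this instance
# ... and back-project the shape parameters to recover the lattices (v = P b).
v_rec = v_mean + P @ b
print(np.allclose(v, v_rec))      # True: training examples are reproduced
```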
6.3.2 Atlas Construction
Given a set of volumetric training examples, the goal of atlas construction is to mutually align all training examples, and to transform each into the internal and external representations. The following steps give an overview of the process (a sketch of the principal-axes alignment in step 1 follows the list):

1. Segmentation and initial alignment.

• Segment the anatomy from all training examples, and select an arbitrary one to define the atlas Coordinate Frame (CF). The principal axes of the selected shape are used to define the atlas CF: shortest → X, medium → Y, longest → Z (Fig. 6.1).

• For each other training example, the principal axes of the shape are automatically aligned with the atlas CF. As multiple mappings between the two CFs exist, some training examples will be misaligned. However, the misalignments are only a 180° rotation about the X-axis and/or a 90° rotation about the Z-axis. They are corrected interactively in a graphical user interface (GUI).
2. Group-wise registration to align all training examples. The results include a common mean volume with the corresponding anatomy mask and, for each training example, a sequence of transformations that map the training example into the mean:

• A rigid transform. It is discarded after registration, as it is not an intrinsic property of any shape.

• A free-form geometry deformation in the form of a B-spline lattice.

• An anisotropic scaling. It is merged into the above transform after registration, as our atlas does not explicitly capture scaling.

• An intensity deformation in the form of a B-spline lattice. The lattice is constructed using the MBA algorithm (described later).
The method by Balci et al. [4] is ideal for this task. However, it is sensitive to a number of user settings. So, we adopted a sub-optimal solution by appointing (rather than dynamically computing) one training example as the mean, and performing pair-wise registrations to align all training examples. The downside of this choice is that the appointed mean may be far away from the true mean, which leads to larger total variations within the constructed atlas, such that more modes of variation are needed to capture all possible shapes. However, the choice does not affect the accuracy in reproducing the training examples from the resulting shape parameters. Moreover, the negative impact can be minimized by visualizing all training examples on the computer screen and choosing one that is average-sized and not obviously abnormal.
3. Once all training examples are aligned, PCA is performed on the geometry and
intensity B-spline lattices as described in Section 6.3.1.
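The principal-axes initial alignment in step 1 can be sketched as follows; this is a minimal illustration assuming a binary mask array and a known voxel spacing, and the axis-flip ambiguities noted above are left to the interactive correction step:

```python
import numpy as np

def principal_axes_alignment(mask, spacing=(1.0, 1.0, 1.0)):
    """Rotation aligning a shape's principal axes with the atlas CF.

    mask: 3D binary segmentation; spacing: voxel size in mm.
    Convention of step 1: shortest axis -> X, medium -> Y, longest -> Z.
    The 180/90 degree sign ambiguities remain and, as described above,
    are corrected interactively in a GUI.
    """
    pts = np.argwhere(mask) * np.asarray(spacing)       # voxel coords in mm
    c = pts.mean(axis=0)                                # geometric centre
    evals, evecs = np.linalg.eigh(np.cov((pts - c).T))  # ascending variances
    R = evecs.T                                         # rows: shortest..longest
    if np.linalg.det(R) < 0:                            # keep a proper rotation
        R[2] *= -1
    return R, c

# Aligned coordinates: (pts - c) @ R.T maps the shortest/medium/longest
# spread of the shape onto the X/Y/Z axes respectively.
```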
One main contribution of this work is to represent and approximate intensity values with a B-spline lattice. This decision is motivated by two facts: first, intensity is used as supplementary information to the geometry, so certain approximation errors are allowed; second, we approximate the intensity differences between the training examples and the mean, which are usually smooth. The use of a B-spline lattice is therefore able to achieve good data reduction while maintaining satisfactory accuracy.
For a given training example, the intensities are approximated using the MBA algorithm [59] as follows (a minimal code sketch follows the steps):

1. Initialize D to be the intensity difference between the training example and the mean, choose an initial size for the output lattice Φ, compute Φ from D using the B-spline Approximation (BA) algorithm, and set Ψ = Φ;

2. Approximate D with Ψ to obtain D′, and compute the residual R = D − D′;

3. If R is satisfactory, finish the approximation and report Φ as the final lattice; otherwise do the following:

• Double the resolution of Ψ in each dimension, and compute its new value from R using the BA algorithm;

• Refine Φ to have the same resolution as Ψ, and merge Ψ into Φ (see [59] for the details);

• Set D = R;

• Repeat steps 2-3.
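The sketch below mimics only the coarse-to-fine residual-fitting structure of these steps: skimage's cubic-spline resize stands in for the true BA step of [59], and the per-level lattices are kept in a list rather than merged into a single refined lattice:

```python
import numpy as np
from skimage.transform import resize

def mba_like(D, lattice_shape=(8, 8, 8), max_levels=5, tol=1.0):
    """Coarse-to-fine residual fitting in the spirit of the MBA algorithm.

    D: 3D array of intensity differences to approximate.
    NOTE: cubic-spline `resize` is only a stand-in for B-spline
    Approximation, so this is an illustration, not the exact method.
    """
    lattices, R = [], D.astype(float)
    shape = tuple(lattice_shape)
    for _ in range(max_levels):
        Phi = resize(R, shape, order=3, anti_aliasing=True)  # coarse fit
        lattices.append(Phi)
        R = R - resize(Phi, D.shape, order=3)                # new residual
        if np.abs(R).mean() < tol:                           # satisfactory?
            break
        shape = tuple(min(2 * s, d) for s, d in zip(shape, D.shape))
    return lattices, R

def evaluate(lattices, out_shape):
    """Reconstruct the approximated difference volume from all levels."""
    return sum(resize(Phi, out_shape, order=3) for Phi in lattices)
```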
6.3.3 Instance Generation
Given a vector of shape parameters b, a corresponding instance can be generated from the atlas. First, b is back-projected into a vector v with Eq. (6.1), which is in turn re-formatted to obtain a geometry B-spline lattice and an intensity B-spline lattice. Then, the two lattices are sequentially applied to the mean volume.

Applying the B-spline lattices to the mean volume is the most time-consuming task during instance generation, especially when the resolution of the instance is large. One good property of B-spline lattices is that they are regular data structures, which makes them well suited to parallel processing.
We use CUDA-capable GPUs from nVidia (Santa Clara, CA) [2] to aid instance generation. Such GPUs contain a number of processor cores, each of which can execute multiple threads in parallel. To configure a GPU for instance generation, the GPU is programmed with a kernel [2] that maps each voxel of the resulting volume to one GPU thread, each of which performs the geometry and intensity transformations independently and simultaneously. As the total number of concurrent threads is limited, the resulting volume is generated slice by slice; within each slice all voxels are computed in parallel. The mean volume, the anatomy mask, and the coefficients of the B-spline lattices are stored on the GPU as 3D textures for fast computation. When an instance is requested, the coefficients of the B-spline lattices are sent from the host memory to the GPU, then the kernel is executed, and finally the result is transferred back to the host memory.
While all threads are executed in parallel, the computation each thread does is relatively simple: it transforms the thread's coordinates with the geometry B-spline lattice, fetches the corresponding texel from the mean volume texture, and performs the intensity transformation with the intensity B-spline lattice if the texel is within the anatomy mask. It should be noted that, in the actual implementation, we perform PCA on the inverses of the geometry and intensity transformations, so the above computation is performed directly at each output voxel location.
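A CPU reference of this per-voxel computation is sketched below; the array names and lattice shapes are hypothetical, and dense lattice evaluation via cubic-spline resizing stands in for the true per-thread B-spline evaluation of the kernel:

```python
import numpy as np
from scipy.ndimage import map_coordinates
from skimage.transform import resize

def generate_instance(mean_vol, mask, geom_lattice, int_lattice):
    """CPU reference of the per-voxel kernel (a hedged sketch).

    mean_vol:     3D mean volume; mask: 3D binary anatomy mask.
    geom_lattice: (3, gx, gy, gz) coarse displacement coefficients (voxels).
    int_lattice:  (ix, iy, iz) coarse intensity-offset coefficients.
    Cubic-spline resizing stands in for true per-voxel B-spline lattice
    evaluation, which the real kernel performs independently per thread.
    """
    shape = mean_vol.shape
    # Dense inverse displacement field from the coarse geometry lattice.
    disp = np.stack([resize(geom_lattice[i], shape, order=3)
                     for i in range(3)])
    coords = np.indices(shape).astype(float) + disp   # sample locations
    out = map_coordinates(mean_vol, coords, order=1, mode='nearest')
    inside = map_coordinates(mask.astype(float), coords, order=0) > 0.5
    # The intensity deviation is applied only inside the anatomy mask.
    out[inside] += resize(int_lattice, shape, order=3)[inside]
    return out
```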
Figure 6.1: Training examples for building an atlas of distal radius (after group-wise
registration). Also shown is the reference coordinate frame of the atlas.
6.4 Experiments, Results and Discussion
We built an atlas of distal radii from nine wrist CTs using the method described above. All volumes are re-sampled and cropped to have a resolution of 256 × 256 × 200 voxels and a spacing of 0.45 × 0.45 × 1.18 mm³. Fig. 6.1 shows the training set as well as the reference coordinate frame used for atlas construction.
We report on the accuracy of intensity approximation with B-spline lattice, and
the performance gain when a GPU is used for instance generation.
6.4.1 Accuracy of Intensity Approximation
Two training examples were chosen to test the accuracy of the reproduced intensities. B-spline lattices of three different sizes were used to study the relationship between the approximation accuracy and the data reduction rate. Data reduction rates are evaluated using the coefficients-to-voxels ratio (CVR), computed as the ratio between the total number of coefficients in the B-spline lattice and the total number of voxels in the training example. Table 6.1 shows the results, and Fig. 6.2 gives a visual check of the approximation results.
Table 6.1: Accuracies of the intensity approximation with different final B-spline lattice sizes (mean ± stdev; unit: Hounsfield units; intensity range: [-1024, 2047]; accuracies were computed over all voxels within the bones).

Final Lattice Size    Coefficients/Voxels Ratio    Training Example 1    Training Example 2
35 × 35 × 35          0.003                        5 ± 34                5 ± 28
67 × 67 × 67          0.02                         4 ± 28                4 ± 19
131 × 131 × 131       0.16                         2 ± 17                2 ± 11

Figure 6.2: Approximated volumes (one axial slice is shown) for different final B-spline lattice sizes. The top image shows the volume to be approximated.

When approximating a data set with a B-spline lattice, the spacing of the lattice
determines the approximation errors [59]. When the spacing is larger than the minimum distance between any two voxels, approximation will occur and approximation errors will be present; otherwise, interpolation will occur and no approximation errors will be present. This property makes the B-spline an ideal choice for sparse data approximation. However, if the data values are smooth, a dense data set can still be approximated with a large lattice spacing while maintaining small approximation errors. We can observe this phenomenon in Table 6.1, where small approximation errors were achieved even when the coefficients-to-voxels ratios were much smaller than 1. This observation can be explained by the fact that the intensity difference between any two images of the same anatomy and modality will be smooth if they are deformed to be accurately aligned.
6.4.2 Performance of Instance Generation
As the size of the output volume and the data transfer between the CPU and GPU are important factors that affect the performance of atlas-based applications, we performed three types of testing with different instance resolutions:

• CPU only - instances are generated and processed on the CPU (the traditional method);

• CPU+GPU - the GPU generates instances and the CPU does the processing (there is data transfer between the CPU and GPU); and

• GPU only - the GPU performs both instance generation and processing (there is negligible data transfer between the CPU and GPU).
Table 6.2: Performance of instance generation under different usage scenarios.

                     128×128×100    256×256×200    512×512×400
CPU only (ips)       0.4            0.05           0.006
CPU+GPU (ips)        6.82           5.37           2.27
GPU only (ips)       112.71         25.58          4.19
In this study we were only concerned with the speed of instance generation, so no actual "processing" was applied to the generated instances during the testing. The performance is evaluated as the number of instances per second (ips) within the following environment: Intel i7-920, 6 GB RAM, GeForce GTX 285 with 1 GB VRAM, Windows 7 64-bit, CUDA SDK 2.3, and Visual C++ 2008. 25 random instances were generated for each "CPU only" experiment, and 500 random instances were generated for each of the other experiments. Table 6.2 shows the averaged results.
From Table 6.2 we can see large performance improvements when the GPU was used for instance generation. Also observed is that the data transfer between the CPU and GPU is an important performance factor, especially when frequent and large data transfers exist. This suggests that there is still room for performance improvement if the subsequent tasks are also performed on the GPU. For example, in atlas-based registration, additional performance may be gained if the similarity computation is also done on the GPU.
6.4.3 Performance of Atlas Construction
As described in Section 6.3.2, our atlas method depends on pair-wise registration to incrementally incorporate the training examples. While the quality of the constructed atlas may not be as good as with the method that uses group-wise registration, the performance of atlas construction can be an advantage because it is linear in the number of training examples; moreover, new training examples can be added in constant time, because it is not necessary to repeat the registrations for the previous training examples. In our experiment with 9 training examples of moderate CT resolution, the construction took around 3 hours.
6.5 Summary
In this chapter, we described a new representation for intensity atlases. Both geometry and intensity information are represented using B-spline lattices, intensities are approximated using the MBA algorithm, and CUDA-capable GPUs are used for fast instance generation. Testing with human wrist CTs showed that the B-spline lattice is an appropriate and compact representation for carrying intensity information in intensity atlases. The use of GPU parallel computation also demonstrated significant speed improvements over the traditional CPU-based method.
Chapter 7
Atlas-based Multiple-object 2D-3D Registration
7.1 Overview
In this chapter, we describe a method to guide the surgical fixation of distal radius fractures.¹ The method registers the fracture fragments to a volumetric intensity-based statistical anatomical atlas of the distal radius, reconstructed from human cadavers and patient data, using a few intra-operative X-ray fluoroscopic images of the fracture. No pre-operative Computed Tomography (CT) images are required, hence the radiation exposure to patients is substantially reduced. Intra-operatively, each bone fragment is roughly segmented from the X-ray images by a surgeon, and a corresponding segmentation volume is created from the back-projections of the 2D segmentations. An optimization procedure positions each segmentation volume at the appropriate pose on the atlas, while simultaneously deforming the atlas such that the overlap of the 2D projection of the atlas with the individual fragments in the segmented regions is maximized. Our simulation results show that this method can accurately identify the pose of large fragments using only two X-ray views, but for small fragments, more than two X-rays may be needed. The method does not assume any prior knowledge about the shape of the bone or the number of fragments, and is thus also potentially suitable for the fixation of other types of multi-fragment fractures.

¹ This work has been published in Proceedings of SPIE: R. H. Gong and P. Abolmaesumi, "Reduction of multi-fragment fractures of distal radius using an atlas-based 2D-3D registration technique", SPIE Medical Imaging, pages 726137-1 - 726137-9, 2009.
7.2 Introduction
Distal radius fractures account for about 15% of all fractures in the emergency room [16, 21]. Treatment is usually performed using minimally invasive techniques that are based on imaging and tracking technologies. Compared with other types of fractures that have been extensively studied in the literature (for example, those in the hip and knee), distal radius fractures involve multiple and small fracture fragments. Thus the treatment is more challenging.
Traditionally, fracture reduction is controlled by intraoperative fluoroscopy [11] or
intraoperative CT [51]. A serious drawback of solely using intraoperative imaging is
the need for a large number of X-ray images, which exposes the surgical team and
patient to excessive irradiation.
In recent years, preoperative 3D data, including bone fragment models [21] and treatment templates [77], have been incorporated and used as the main modalities that guide the surgical process. The use of intraoperative imaging is minimized, as it is used only when establishing or updating the spatial correspondence between the preoperative 3D data and the patient. Fragment models are usually created from a diagnostic CT of the fracture, and a CT of the reflected contra-lateral bone of the fractured bone is commonly used as the treatment template [77]. This new treatment technique not only reduces the amount of radiation, but also provides intuitive 3D
views of the fracture region without intraoperatively acquiring or reconstructing an instant 3D image. However, it assumes mirror symmetry between the corresponding bones from both sides of the human body, while in reality those bones usually differ in size and shape. As a result, using the contra-lateral bone to guide the treatment may lead to significant misalignments.
We propose a new method for treating distal radius fractures which uses a more case-specific template to guide the treatment process. An atlas (statistical shape model) of healthy distal radii is used as a deformable treatment template, and a multiple-fragment deformable 2D-3D registration algorithm that depends on a small number of intraoperative X-ray images is used to compute the relative poses between the template and the individual fracture fragments in the operating room (OR). As the shape of the template is simultaneously estimated during registration to match the fracture being treated, the template is more accurate. In addition, the use of the atlas eliminates preoperative CT imaging and 3D segmentation, which significantly simplifies the treatment process.
A few atlas-based or deformable 2D-3D registration techniques have been proposed in the literature. Sadowsky et al. [95] proposed a method that uses statistical shapes to replace CT in circumstances where the field of view of the X-ray images is limited. Tang et al. [101] proposed a method that uses a hybrid atlas and a few X-ray images for 3D surface reconstruction. Those methods deal with only a single bone piece and are thus not suitable for fracture treatment. To the best of our knowledge, our technique is the first that treats multi-fragment fractures using an atlas-based 2D-3D registration technique.
We present the proposed method in Section 7.3, and report preliminary results with a synthetic fracture in Section 7.4. Finally, Section 7.5 concludes the chapter.
7.3 Method
Our method is based on an atlas of distal radii, which is used as a dynamic template to guide the treatment, and a multiple-fragment deformable 2D-3D registration algorithm, which is used to compute the relative poses between the template and the individual fracture fragments in the OR.
7.3.1 The Atlas of Distal Radius
An atlas captures the statistical information, including the mean and variations, of a group of objects. It not only represents the objects used to construct the atlas, but can also produce "interpolated" new objects that did not exist during atlas construction. An atlas can capture either the statistical information of the geometrical silhouette (hereafter geometry atlas) [20, 23, 29, 88, 92], or the statistical information of both the geometrical silhouette and the internal density values (hereafter intensity atlas) [122]. While geometry atlases are known for good efficiency in model-fitting applications, intensity atlases are more robust to segmentation errors introduced during atlas creation and model-fitting.
Our atlas is an intensity atlas that represents a family of CT volumes containing
the distal radii (about 1/4 of the full radius) of both right and left arms. Though
Yao [122] has developed an efficient method for building intensity atlases using tetrahedral mesh (for representing geometry) and Bernstein polynomials (for representing
density), we used a simpler but more general approach that is based on B-spline deformable transformation. It extends Rueckert's method [92] by capturing additional information, including the scale of the bone and the CT Hounsfield values within the bone. Our method does not tightly depend on the content of the training data, and thus can be used to build an atlas for objects of any shape.

Figure 7.1: A few examples of the training data used to build the atlas of distal radius. All data have been reoriented such that the principal axes are aligned with the coordinate axes. Reflection has also been performed for the data from the left arm. (Data courtesy of Kingston General Hospital, Ontario, Canada)
A total of 16 training data sets (CT), six from Distal Radius Osteotomy (DRO) surgeries and 10 from cadavers, were used to build the atlas. The DRO data contained only the two ends of the radius (intended to reduce the irradiation to the patients), while the cadaver data contained the full radius. Nine data sets were from the right arm, and seven from the left arm. A few examples of the training data are shown in Figure 7.1.
To build the atlas, all training data were first normalized into a common coordinate
frame. Then, Principal Component Analysis (PCA) was performed and the atlas was
created. Once the atlas is constructed, new instances can be generated from atlas
coefficients.
Normalization of Training Data
This process transforms all training data into a common coordinate frame, i.e. the atlas coordinate frame, such that all data are aligned. This is a group registration problem, and we use the term "normalization" to distinguish it from other types of registration problems. After normalization, each data set is decomposed into a common mean shape, and a transformation that determines the variation of the data away from the mean shape. We model the transformation as a concatenation of five sub-transforms that sequentially transform the data into the mean shape: rigid transform one → anisotropic scaling → rigid transform two → B-spline deformation → intensity transform. We use the following method to compute the mean and the individual transformations:
1. Compute the first rigid sub-transforms by initially aligning all training data:

(a) Segment the radius from each training data set;

(b) Reorient each segmented radius such that its principal axes are aligned with the coordinate axes (longest → Z, medium → Y, shortest → X), and its geometric center is at the origin;

(c) Due to the special shape of the radius, most training data will be uniformly oriented after the previous step; a few may be mis-oriented by 90° or 180°, and those are identified and corrected in a graphical user interface (GUI);

(d) Reflect the radius in the Y-direction if it is from the left arm.
2. Compute the mean size of the radius using the Axis-Aligned Bounding Boxes (AABBs) of all training data and, for each training data set, compute the scaling factors and scale/crop the radius.

3. Perform pair-wise non-rigid registrations to accurately align all training data. This computes the common mean shape and, for each training data set, the second rigid sub-transform and the B-spline deformation. For more accurate alignments, a group-wise registration algorithm [4] can be used instead.
4. For each training data set and each voxel, compute the intensity difference with respect to the mean shape. To reduce the memory usage, polynomials (e.g., power polynomials or Bernstein polynomials) can also be used to approximate the intensity values [122] with a greatly reduced number of coefficients.
Atlas Construction
As described in the previous section, after normalization, the variation of each training data set with respect to the mean shape is represented using a sequence of five sub-transforms. We build the atlas to capture the statistical information of the last four sub-transforms, because the first one is not an intrinsic part of the radius and will be determined by the specific user application. For each training data set, the inverses of the four sub-transforms were first computed, and then parameterized and concatenated to form one column of a matrix X. Since the inverse of a B-spline deformable transform is not analytically available, we approximated the inverse by registering the mean shape back to the training data. A fast but less accurate pseudo-inversion algorithm could also be used [108]. Finally, we perform Principal Component Analysis (PCA) on X and project each training data set (i.e. some column xi) into the eigenspace:

    ai = diag(σ1 ... σN) ([v1 ... vN]^T xi)                            (7.1)

where N is the number of training data sets (16 in our case), (v1 ... vN) are the eigenvectors computed from XX^T, and (σ1² ... σN²) are the corresponding variances along each eigenvector. ai is the N-dimensional projected point, whose entries are also called the atlas coefficients. The convex hull of all projected points contains all valid shapes the atlas can generate, based on the given set of training data.
Instance Generation
To generate an instance from the atlas, a set of atlas coefficients is provided, and the inverse process of Equation (7.1) is performed to compute a sequence of sub-transforms. Then, sequentially and in reverse order, each of the sub-transforms is applied to the mean shape. If a left-arm radius is requested, the generated volume is Y-reflected. When supplying the atlas coefficients, it is important to constrain the point to lie within the convex hull shaped by the training data; otherwise, the generated shape could be unrealistic.
When generating an instance from a set of atlas coefficients, only the first few significant eigenmodes can be used. Depending on the application requirements, using an appropriate number of eigenmodes can achieve a good trade-off between the quality of the generated instances and the computation cost for solving the user problem. In our distal radius atlas, the first five eigenmodes accounted for about 70% of the total variation in the training data (computed as Σ(i=1..5) σi² / Σ(i=1..N) σi²), which is acceptable for testing our fracture reduction method.
7.3.2 Multiple-Fragment Deformable 2D-3D Registration
The inputs to our method are a set of co-registered intraoperative X-ray images and a dynamic treatment template, i.e. an instance of the distal radius atlas with changing shape. The goals are to: 1) determine the real shape of the template based on the X-ray images of the fracture, and 2) find the relative poses between the determined template and the individual fracture fragments in the OR.
An overview of the method is illustrated in Figure 7.2. The fixed data is the set of intraoperative X-ray images. The moving data is the set of fracture fragments in the OR to be reassembled, each modeled as a segmentation volume on the dynamic template (Figure 7.3). To search for a solution, first, the shape of the template is deformed and the segmentation volumes of the individual fragments are moved; then, simulated X-ray images of the fragments, also called Digitally Reconstructed Radiographs (DRRs), are generated and combined; next, similarity values between the combined DRRs and the X-ray images are computed; finally, based on the similarity values under the current transformation, an optimizer is used to update the shape of the template and the poses of the fragments. We describe each of the involved components in the following sections.

Figure 7.2: Overview of the multiple-fragment deformable 2D-3D registration algorithm.
The Fragment Model
Each bone fragment is modelled as a segmentation volume on the dynamic template. The segmentation volume (i.e. the region of interest for one fragment) is constructed interactively from the X-ray images: the bone fragment is roughly segmented by hand from the X-ray images, then the 3D back-projections of these 2D segmentations are
intersected to produce a bounding volume around the fragment. The 2D segmentations are convex polygons of (typically) four or five edges. Each polygon defines a cone in 3D with its apex at the known focal point of the X-ray. For each fragment, the cones from the different X-ray images are intersected to form a convex 3D volume. During optimization, the Graphics Processing Unit (GPU) is used to accelerate the operation of applying the segmentation volumes to the template. Since the GPU we used was limited to six clipping planes, we chose six planes that conservatively enclose the 3D volume (the planes are usually not aligned with the coordinate axes). Figure 7.3 shows six clipping planes positioned over the atlas to model one fracture fragment.
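The planes of one such cone can be built directly from the polygon's edges and the focal point; the sketch below is a simplified illustration for a single view (the method additionally intersects the cones of all views and then selects six conservative planes):

```python
import numpy as np

def cone_planes(focal_pt, poly3d):
    """Bounding planes of the cone with apex `focal_pt` over one 2D outline.

    poly3d: (N, 3) vertices of the convex segmentation polygon, expressed
    in 3D world coordinates on the detector plane, in consistent winding
    order. Returns a list of (n, d) with n . x >= d for points inside.
    """
    centroid = poly3d.mean(axis=0)
    planes = []
    for i in range(len(poly3d)):
        a, b = poly3d[i], poly3d[(i + 1) % len(poly3d)]
        n = np.cross(a - focal_pt, b - focal_pt)   # plane through edge and apex
        n = n / np.linalg.norm(n)
        if np.dot(n, centroid - focal_pt) < 0:     # orient the normal inward
            n = -n
        planes.append((n, float(np.dot(n, focal_pt))))
    return planes
```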
Figure 7.3: A fractured bone fragment is modelled as a segmentation volume on the
atlas. The pose of the segmentation volume and the shape of the atlas
are to be determined.
Transformation Parameters
There are two types of transforms to be determined during optimization: a global transform that models the shape of the dynamic template as well as its position in the OR coordinate space, and a set of local transforms, one per fragment, that model the poses of the individual fragments within the template. The template shape is modelled as a non-rigid transform with respect to the atlas mean (see Section 7.3.1), and is represented using atlas coefficients. In our experiments, the first five eigenmodes were used to generate instances from the atlas. The template position and the local transforms for the individual fragments are rigid transforms, each with six coefficients: three rotation angles and three translations. In a two-fragment fracture, for example, there are 5 + 6 + 2 × 6 = 23 parameters to be estimated.
To start the optimization, an initial value of the parameters is required. For the template shape, the initial parameters were set to all zeros (corresponding to the mean shape). For the template position and the local transforms, the initial parameters were determined interactively using a graphical user interface.
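A sketch of how such a flat parameter vector might be packed and unpacked is given below; the layout is illustrative, not the exact ordering used in the implementation:

```python
import numpy as np

N_MODES = 5          # atlas eigenmodes used for the template shape
N_RIGID = 6          # 3 rotation angles + 3 translations

def unpack(x, n_fragments):
    """Split the flat optimizer vector into its semantic pieces.

    Assumed layout (two-fragment case: 5 + 6 + 2*6 = 23 parameters):
      [atlas coefficients | global template pose | pose of each fragment]
    """
    x = np.asarray(x, dtype=float)
    shape = x[:N_MODES]
    template_pose = x[N_MODES:N_MODES + N_RIGID]
    frag = x[N_MODES + N_RIGID:].reshape(n_fragments, N_RIGID)
    return shape, template_pose, frag

x0 = np.zeros(N_MODES + N_RIGID * (1 + 2))   # mean shape, identity poses
shape, template_pose, frag_poses = unpack(x0, n_fragments=2)
print(len(x0), frag_poses.shape)             # 23 (2, 6)
```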
Similarity Measure
For each transformation, we compute a similarity measure as follows. The transformed bounding volume of each fragment is used to clip the deformed atlas. For each of the X-ray images, each clipped volume is rendered as a DRR with the same camera parameters as used by the X-ray image, and a combinatorial DRR is composed from the individual fragment DRRs. For each pair of X-ray image and combinatorial DRR, a similarity value is computed. Finally, the sum of the individual similarity values provides the overall similarity measure.

With n fragments and m X-ray images, n × m DRRs and m combinatorial DRRs are generated for each transformation. Since many transformations are considered in the optimization process, we use GPU-accelerated 3D clipping and texture-mapping techniques to quickly generate the DRRs. An nVidia GeForce 8600 GT with 256 MB of video memory was used. To compare an X-ray image with its corresponding DRR, we have used Normalized Cross Correlation (NCC), Variance-Weighted Correlation (VWC), Mutual Information (MI), and Gradient Difference (GD).
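Of these metrics, NCC is the simplest to illustrate; the sketch below computes NCC for one view and sums the per-view values to form the overall measure, as described above:

```python
import numpy as np

def ncc(xray, drr):
    """Normalized cross correlation between an X-ray image and a DRR."""
    a = xray.astype(float).ravel() - xray.mean()
    b = drr.astype(float).ravel() - drr.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def overall_similarity(xrays, drrs):
    """Sum of per-view similarities (the overall similarity measure)."""
    return sum(ncc(x, d) for x, d in zip(xrays, drrs))
```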
Optimization
The similarity metrics described above are highly nonlinear. We use the robust Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [42] to search for an optimal solution of the transformation parameters. The algorithm requires no derivatives. Instead, it determines the direction in which to search by taking samples in the parameter domain according to a multi-variate normal distribution around the current state. During the optimization process, the search distribution is adaptively deformed according to the local function landscape. The inputs to the algorithm are an initial guess of the solution and an initial size of the search distribution.

We employ CMA-ES in a two-stage optimization scheme. In the first stage, the deformation and pose parameters are optimized alternately: only the pose parameters are varied for a number of optimization iterations; then, only the atlas deformation parameters are varied for a number of iterations. This is repeated until convergence. This first stage permits the optimization to quickly bring the fragments into a reasonable position, permitting the atlas deformation parameters to be more easily optimized. In the second stage of optimization, the result is further refined by allowing all parameters to vary simultaneously.
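The two-stage scheme can be sketched with the open-source cma package (Hansen's reference CMA-ES implementation); the cost function, index sets, step sizes and iteration counts below are placeholders, and the cost is assumed to be the negated overall similarity so that CMA-ES minimizes it:

```python
import numpy as np
import cma  # pip install cma

def optimize_subset(cost, x, active, sigma0, max_iter):
    """Run CMA-ES over only the `active` entries of x; the rest stay frozen.

    `cost` maps a full parameter vector to a scalar to MINIMIZE, e.g. the
    negated sum of per-view similarities. All values here are illustrative.
    """
    es = cma.CMAEvolutionStrategy(x[active], sigma0,
                                  {'maxiter': max_iter, 'verbose': -9})
    while not es.stop():
        candidates = es.ask()
        values = []
        for c in candidates:
            full = x.copy()
            full[active] = c
            values.append(cost(full))
        es.tell(candidates, values)
    out = x.copy()
    out[active] = es.result.xbest
    return out

def two_stage_registration(cost, x0, pose_idx, shape_idx, rounds=5):
    """Stage 1: alternate pose-only / shape-only CMA-ES; stage 2: refine all."""
    x = np.asarray(x0, dtype=float)
    for _ in range(rounds):
        x = optimize_subset(cost, x, pose_idx, sigma0=0.5, max_iter=20)
        x = optimize_subset(cost, x, shape_idx, sigma0=0.3, max_iter=20)
    all_idx = np.arange(x.size)
    return optimize_subset(cost, x, all_idx, sigma0=0.1, max_iter=100)
```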
7.4 Experiments and Results
For this research, we focus on demonstrating the functioning of our proposed method, so only preliminary validation with a synthetic fracture and synthetic X-ray images is provided. We used a synthetic fracture to test our method, in which case the "ground truth" was known. One of the training data sets was cut with a plane, and a set of 30 fragment layouts was generated by randomly perturbing the two bone fragments on either side of the plane. This simulated a paediatric physeal distal radius fracture. The CT and the simulated fracture location are shown in Figure 7.4a.
The pose of each separate fragment was randomly rotated by up to 5° and randomly translated by up to 3 mm. Four examples are shown in Figure 7.4(b-e). These displacements appear to be reasonable since they create clinically realistic simulated X-ray images similar to those seen in an orthopaedic trauma case.
Figure 7.4: A simulated fracture that was used to test our method. (a) One of the training data sets was cut with a plane close to the distal end of the radius. (b-e) The two resulting fragments were randomly rotated and translated.

For each fragment layout, two simulated X-ray images were generated in the AP and lateral directions. The two fracture fragments were roughly outlined on each of the two X-rays. Finally, using the NCC similarity metric, we estimated the atlas shape and moved the fragments toward their correct positions in the atlas.
The error measure was evaluated for each fragment using the surface points of
the fragment. It was defined as the surface-to-surface distance between the initial or
registered position of the fragment and the “ground truth” position of the fragment,
ideally 0 mm. The initial errors of our 30 simulated fractures were 3.05 ± 0.87 mm
(written as mean±standard deviation) for the large fragment, and 2.99 ± 0.82 mm
for the small fragment. The final errors, after applying our method to determine
the template shape and move each fragment toward its correct position, were 0.94 ±
7.4. EXPERIMENTS AND RESULTS
146
Table 7.1: Preliminary results of atlas-based 2D/3D registration.

    Fragment   Initial mTRE (mm)   Final mTRE (mm)
    Head       2.99 ± 0.82         1.64 ± 1.67
    Body       3.05 ± 0.87         0.94 ± 0.66
Figure 7.5: One experiment case with two X-ray views: before reduction (left, initial error 7.3 mm) and after reduction (right, final error 3.1 mm). The result was considered a failure as the final error was > 2.0 mm, which was mainly caused by the small fragment.
0.66 mm for the large fragment, and 1.64 ± 1.07 mm for the small fragment. Table 7.1
summarizes the experimental results, and Figure 7.5 shows the initial and final views
of one experiment case.
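Because the same surface points of each fragment are available at both the estimated and the ground-truth poses, the surface-to-surface distance above reduces to a mean point-to-point distance. A minimal sketch, assuming the poses are given as rotation matrices and translation vectors:

    import numpy as np

    def mean_surface_error(points, R_est, t_est, R_gt, t_gt):
        # Mean distance (mm) between the fragment's surface points placed
        # at the estimated pose and at the ground-truth pose; ideally 0 mm.
        p_est = points @ R_est.T + t_est
        p_gt = points @ R_gt.T + t_gt
        return float(np.mean(np.linalg.norm(p_est - p_gt, axis=1)))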
In clinical practice, a final error within 2.0 mm is considered successful. As the
method performed much better for the large fragment than for the small fragment,
additional experiments using eight simulated X-ray images were performed to further
reduce the final errors for the small fragment. With eight X-ray images, the final errors
were reduced to 1.10 ± 0.43 mm, which satisfies the clinical requirement.
7.5 Discussion and Summary
We have described a method to guide the reduction of distal radius fractures. The
use of a deformable atlas potentially provides a more accurate template than the
commonly-used reflected radius of the contralateral arm, and does not require a large
radiation exposure from CT imaging. No accurate 3D segmentations are necessary; instead, only rough 2D segmentations on the X-ray images are performed. The method should be extensible to other types of fractures containing multiple fragments.
The preliminary results show that the proposed method is able to accurately
find the correct poses of fracture fragments for synthetic 2-fragment fractures with
simulated X-ray images. When large errors occurred for smaller fragments, using
additional X-ray images improved the result.
Chapter 8
Conclusion
This chapter summarizes the contributions made by this thesis, and proposes directions for future research.
8.1 Summary of Contributions
The main goal of this thesis is to investigate two major limitations of current 2D-3D registration techniques: the lack of efficient optimization algorithms
that are also robust against noise and outliers, and the lack of 2D-3D registration
techniques that accurately and efficiently align multiple anatomical models to X-ray
images for use in cases such as fracture treatment. To address the first problem,
two 2D-3D registration techniques that use recently proposed advanced optimization algorithms are investigated. For the second problem, two 2D-3D registration
techniques that simultaneously register multiple objects are proposed.
8.1.1 Robust and Efficient 2D-3D Registration
Though a variety of similarity metrics have been proposed for 2D-3D registration,
finding an accurate and well-shaped function to model the similarity between X-ray
and DRR images is still a challenging task. When a similarity metric is highly nonlinear, the selection of the optimization algorithm becomes critical. Chapter 3 described a 2D-3D registration technique that uses the UKF along with GPU-accelerated DRR generation for robust and fast registration. As it requires no derivatives and simulates the process of simulated annealing (that is, random noise is artificially added at each
iteration step), it is simple to use and robust against local minima. The method was
evaluated using three bone phantoms of different shapes and sizes, and was compared
with a conventional method that uses the down-hill simplex algorithm. Preliminary
experimental results confirmed that the UKF is superior to the simplex algorithm in dealing with ill-posed similarity metrics. With similar registration accuracies and computation costs,
the UKF-based method achieved 1.3 to 2 times wider capture ranges than the traditional method, which is a significant improvement as it will greatly increase the
success rate of registration and simplify the initialization process.
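The intensity-based UKF formulation itself is more involved; purely to illustrate its mechanics (derivative-free sigma-point sampling, and a random-walk process model whose injected noise plays the annealing-like role noted above), the following simplified sketch estimates a rigid pose from projected landmarks using the filterpy package, an illustrative choice rather than the thesis implementation. The pinhole geometry, landmark coordinates, and noise levels are all invented for the example.

    import numpy as np
    from scipy.spatial.transform import Rotation
    from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

    landmarks = np.array([[10.0, 0.0, 5.0], [-8.0, 6.0, 2.0],
                          [3.0, -7.0, 9.0], [0.0, 4.0, -6.0]])  # toy 3D points (mm)
    FOCAL, DEPTH = 1000.0, 500.0  # invented pinhole geometry

    def hx(x):
        # Measurement model: project the landmarks at pose x
        # (3 rotation-vector components, 3 translations) onto the detector.
        R = Rotation.from_rotvec(x[:3]).as_matrix()
        p = landmarks @ R.T + x[3:]
        return (FOCAL * p[:, :2] / (p[:, 2:3] + DEPTH)).ravel()

    def fx(x, dt):
        return x  # static pose: a random-walk process model

    points = MerweScaledSigmaPoints(n=6, alpha=0.1, beta=2.0, kappa=0.0)
    ukf = UnscentedKalmanFilter(dim_x=6, dim_z=8, dt=1.0,
                                fx=fx, hx=hx, points=points)
    ukf.x = np.zeros(6)       # initial guess of the transformation
    ukf.P = np.eye(6) * 0.1
    ukf.Q = np.eye(6) * 1e-3  # process noise: the artificially injected noise
    ukf.R = np.eye(8) * 2.0   # measurement noise of the 2D feature positions

    true_pose = np.array([0.05, -0.03, 0.02, 2.0, -1.5, 1.0])
    z = hx(true_pose)         # "observed" 2D features
    for _ in range(50):       # re-filtering the same measurement mimics
        ukf.predict()         # the iterative registration loop
        ukf.update(z)
    print("estimated pose:", ukf.x)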
In the UKF-based 2D-3D registration technique, the noise in transformation parameters is used to drive the optimization process. Thus, the prior knowledge about
such noise is an important factor that affects the registration. This is a double-edged sword. On the one hand, when such knowledge is available, the registration can be completed quickly and robustly, and the final covariance matrix of the transformation parameters can be used to analytically estimate the final registration error. On the other hand, accurate knowledge about such noise is sometimes difficult to obtain, because errors arise from many sources, such as imaging, calibration, and the definition of the similarity metric.
As a complement to the UKF-based method, Chapter 4 described a 2D-3D registration technique that uses CMA-ES as the optimization algorithm. Similar to the
UKF-based method, the new method requires no derivative calculations, and learns the search direction from a minimal set of sample points in the parameter domain. However, the method is easier to use, as it has only a single user parameter. The method was evaluated with the same set of test data as the UKF-based method, and new experiments were performed for the UKF-based method with
more accurate knowledge about the ratio between the process noise and the measurement noise. The experimental results showed that the two methods achieved similar
capture ranges. However, the CMA-ES-based method marginally outperformed the
UKF-based method in terms of accuracy and computation cost. The results were also
compared with those of the simplex-based method, and both former methods showed
significant improvements in capture range.
8.1.2 Multiple-object 2D-3D Registration
One main use of multiple-object 2D-3D registration in computer-assisted fracture
treatment is to identify the relative poses of the fracture fragments in the operating room (OR) in the coordinate space of the preoperative plan. Chapter 5 described such a technique, which simultaneously aligns all fragments to a set of X-ray images. To
achieve better robustness against noise and occlusions among fragments, edge structures in the X-ray images were enhanced before an MI-based similarity metric was
applied. To obtain a fast global alignment among all fragments, a global-local alternating optimization scheme that is based on CMA-ES was adopted, and the GPU was
used to accelerate DRR generation. Both synthetic fractures and fracture phantoms
were used to test the proposed technique. The experimental results showed that, for
fractures in small bones such as distal radius, the proposed method could achieve a
capture range of up to 10 mm for optimal treatment setups, and of up to 5 mm for realistic treatment setups.
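As a minimal sketch of these two ingredients, assuming a Sobel gradient magnitude for the edge enhancement and a joint-histogram MI estimate (the blending weight and the bin count are invented for the example):

    import numpy as np
    from scipy import ndimage

    def edge_enhance(img, weight=0.5):
        # Blend the image with its Sobel gradient magnitude so that bone
        # contours contribute more to the similarity metric.
        img = np.asarray(img, dtype=float)
        edges = np.hypot(ndimage.sobel(img, axis=0),
                         ndimage.sobel(img, axis=1))
        edges /= edges.max() + 1e-12
        img = (img - img.min()) / (np.ptp(img) + 1e-12)
        return (1.0 - weight) * img + weight * edges

    def mutual_information(a, b, bins=32):
        # MI estimated from the joint histogram of two equally shaped images.
        joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
        p = joint / joint.sum()
        px = p.sum(axis=1, keepdims=True)   # marginal over rows
        py = p.sum(axis=0, keepdims=True)   # marginal over columns
        nz = p > 0
        return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))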
The multiple-object 2D-3D registration technique presented in Chapter 5 requires
a preoperative treatment plan to be used as the registration reference. One automatic
planning method is to use an intensity atlas of the bone being treated as a dynamic
template to guide the planning process. To enable this operation, Chapter 6 described
a new method that constructs intensity anatomical atlases from 3D images. A B-spline FFD lattice was used to model both the geometry and intensity information of
atlas instances, and the GPU was used to accelerate instance generation. A CT atlas
of distal radii was constructed to test the method. The results showed that B-spline
FFD is a compact, accurate, and efficient representation for modelling CT intensities and that, compared with traditional CPU-only methods, the use of the GPU significantly improved the speed of instance generation.
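A minimal 2D sketch of how such a lattice drives deformation is given below; the thesis works in 3D and also attaches intensity values to the lattice, and the loop-based evaluation here favours clarity over the GPU-accelerated form.

    import numpy as np

    # Cubic B-spline basis functions on the local coordinate u in [0, 1).
    B = (lambda u: (1 - u) ** 3 / 6.0,
         lambda u: (3 * u**3 - 6 * u**2 + 4) / 6.0,
         lambda u: (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0,
         lambda u: u ** 3 / 6.0)

    def ffd_displacement(points, lattice, spacing):
        # Evaluate a cubic B-spline FFD at `points` (N x 2, in mm).
        # `lattice` holds control-point displacement vectors, (ny, nx, 2),
        # spaced `spacing` mm apart. Lattice index 0 is taken to be the
        # control point one spacing before the image origin, so the 4 x 4
        # neighbourhood i..i+3, j..j+3 stays in bounds on a padded lattice.
        out = np.zeros((len(points), 2))
        for k, (x, y) in enumerate(points):
            i, u = divmod(x / spacing, 1.0)   # cell index, local coordinate
            j, v = divmod(y / spacing, 1.0)
            i, j = int(i), int(j)
            for a in range(4):
                for b in range(4):
                    out[k] += B[a](u) * B[b](v) * lattice[j + b, i + a]
        return out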
By incorporating the atlas generation method described in Chapter 6, a new atlas-based multiple-object 2D-3D registration technique was developed and presented in
Chapter 7. The planning was performed implicitly and automatically by using an
intensity atlas of the bone being treated and integrating the planning process into
the registration process. The registration estimates not only the poses of individual
fracture fragments, but also the final shape of the fractured bone. Fracture fragments
were modelled as coarsely bounded volumes on top of the bone atlas, and the volumes
were constructed by roughly segmenting the fragments on X-ray images and backprojecting the segmented 2D shapes into the 3D space. To improve the computation
speed, the GPU was used to accelerate the processes of fragment modelling and
DRR generation. One major benefit of this new technique is that it removed the
separate step of preoperative planning, and only simple user interaction was involved.
Preliminary results with a synthetic fracture showed that the proposed method can
accurately identify the poses of individual fragments.
8.2 Future Work
For clinical use, a 2D-3D registration technique needs to satisfy requirements such as
ease of use, high accuracy, fast computation speed, and robustness to the existence
of multiple bones or external surgical tools in the fluoroscopic view. Without a
doubt, the research conducted in this thesis consists of pilot studies, and further improvements can be made.
First, in the UKF-based 2D-3D registration technique, the knowledge about the
noise in transformation parameters is a critical factor that affects the behaviour and
performance of the registration. In this thesis, this parameter was determined empirically through trial-and-error. As the error in transformation parameters has an
intrinsic connection with the errors introduced during image acquisition and similarity metric formation, there should be a systematic way to project the original errors
into the parameter domain. Once accurate knowledge about the noise is obtained,
it will not only significantly improve the performance and robustness of the registration, but also provide an analytic approach for estimating the final registration error from
the final covariance matrix of the transformation parameters [74].
Second, in some of the proposed techniques, user parameters were used to control the registration behaviour: for example, the edge-enhancement parameters in the multiple-object 2D-3D registration technique, the starting size of the covariance matrix in all techniques that use CMA-ES, and so on. Those parameters are case-dependent
and were determined empirically in this thesis. Further in-depth studies on those
parameters can be an important step to improve the usability or performance of the
techniques.
Third, the atlas-based multiple-object 2D-3D registration technique described in
Chapter 7 demonstrated a new idea to use 2D-3D registration for fracture treatment.
The method is still in a preliminary stage and many improvements can be made. For
example, mutual exclusion between fragments (fragments cannot overlap in space) was not taken into account when modelling the fragments, and the fragments were modelled using only very coarse bounding volumes. These simplifications were made because of the technique's high computation cost. In recent years, the computational power of both CPUs and GPUs has greatly increased, which provides the potential to further improve the technique.
Finally, although GPU-accelerated computations were extensively used in this
thesis, the technology used for DRR generation is becoming outdated as more powerful, next-generation GPUs are developed. New GPGPU computation technologies can be used not only to improve the performance and quality of DRR generation, but also to accelerate other computations such as image pre-processing, similarity calculation, and optimization. Fast computation was an important goal when the 2D-3D registration techniques in this thesis were developed; however, there is still a large gap between the current computation performance and interactive clinical use. Using the new GPGPU technologies to improve the proposed techniques is a promising direction for future research.
Bibliography
[1] Computer assisted surgery: Precision technology for improved patient care.
Technical report, Advanced Medical Technology Association, 2005.
[2] Nvidia CUDA compute unified device architecture - programming guide, June
2008.
[3] The matrix and quaternions FAQ. http://www.j3d.org/matrix_faq/matrfaq_latest.html, July 2011.
[4] S. K. Balci, P. Golland, and W. M. Wells. Non-rigid groupwise registration
using B-spline deformation model. In Proceedings of the 10th International
Conference on Medical Image Computing and Computer Assisted Intervention
(MICCAI), volume 10, pages 105–121, 2007.
[5] R. Bansal, L. H. Staib, Z. Chen, A. Rangarajan, J. Knisely, R. Nath, and
J. S. Duncan. A novel approach for the registration of 2D portal and 3D CT
images for treatment setup verification in radiotherapy. In Proceedings of the 1st
International Conference on Medical Image Computing and Computer Assisted
Intervention (MICCAI), volume 1496, pages 1075–1086, 1998.
[6] A. Benassarou, E. Bittar, N. W. John, and L. Lucas. MC slicing for volume
rendering applications. In International Conference on Computational Science
(2), pages 314–321, 2005.
[7] M. Berks, S. Caulkin, R. Rahim, C. Boggis, and S. Astley. Statistical appearance models of mammographic masses. In Proceedings of the 9th International
Workshop on Digital Mammography, pages 401–408, 2008.
[8] P. J. Besl and N. D. McKay. A method for registration of 3-D shapes. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 14:239–256, 1992.
[9] C. Bethune and J. Stewart. Adaptive slice geometry for hardware-assisted
volume rendering. Journal of Graphics Tools, 10(1):55–70, 2005.
[10] K. K. Bhatia, J. Hajnal, A. Hammers, and D. Rueckert. Similarity metrics
for groupwise non-rigid registration. In Proceedings of the 10th International
Conference on Medical Image Computing and Computer Assisted Intervention
(MICCAI), volume 10, pages 544–552, 2007.
[11] R. Bilic and V. Zdravkovic. Planning corrective osteotomy of the distal end of
the radius. Unfallchirurg, 91:571–574, 1988.
[12] W. Birkfellner, W. Burgstaller, J. Wirth, B. Baumann, A. L. Jacob, K. Bieri,
S. Traud, M. Strub, P. Regazzoni, and P. Messmer. LORENZ: a system for
planning long-bone fracture reduction. In Robert L. Galloway, Jr., editor,
Proceedings of SPIE Medical Imaging, volume 5029, pages 500–503, 2003.
[13] W. Birkfellner, R. Seemann, M. Figl, J. Hummel, C. Ede, P. Homolka, X. H.
Yang, P. Niederer, and H. Bergmann. Wobbled splatting - a fast perspective
volume rendering method for simulation of X-ray images from CT. Physics in
Medicine and Biology, 50(9):N73, 2005.
[14] W. Birkfellner, M. Stock, M. Figl, C. Gendrin, J. Hummel, S. Dong, J. Kettenbach, D. Georg, and H. Bergmann. Stochastic rank correlation: A robust merit
function for 2D-3D registration of image data obtained at different energies.
Medical Physics, 36(8):3420–3428, 2009.
[15] C. Brechbühler, G. Gerig, and O. Kübler. Parametrization of closed surfaces for
3-D shape description. Computer Vision and Image Understanding, 61(2):154–
170, 1995.
[16] A. J. Bronstein, T. E. Trumble, and A. F. Tencer. The effects of distal radius
fracture malalignment on forearm rotation: a cadaveric study. The Journal of
Hand Surgery, 22A(2):258–262, March 1997.
[17] L. G. Brown. A survey of image registration techniques. ACM Computing
Surveys, 24:325–376, 1992.
[18] D. Chetverikov, D. Svirko, D. Stepanov, and P. Krsek. The trimmed iterative
closest point algorithm. In International Conference on Pattern Recognition,
pages 545–548, 2002.
[19] T. F. Cootes, G. J. Edwards, and C. J. Taylor. Active appearance models. In
5th European Conference on Computer Vision, volume 2, pages 484–498, 1998.
[20] T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham. Active shape models
- their training and application. Computer Vision and Image Understanding,
61(1):38–59, 1995.
[21] H. Croitoru, R. E. Ellis, R. Prihar, C. F. Small, and D. R. Pichora. Fixationbased surgery: A new technique for distal radius osteotomy. Computer Aided
Surgery, 6:160–169, 2001.
[22] R. Dalvi, R. Abugharbieh, M. Pickering, J. Scarvell, and P. Smith. Registration
of 2D to 3D joint images using phase-based mutual information. In Proceedings
of SPIE Medical Imaging, volume 6512, page 651209, 2007.
[23] R. H. Davies, C. J. Twining, T. F. Cootes, J. C. Waterton, and C. J. Taylor.
A minimum description length approach to statistical shape modeling. IEEE
Transactions on Medical Imaging, 21(5):525–537, 2002.
[24] A. DiGioia. Computer and robotic assisted hip and knee surgery. Oxford University Press, New York, 2004.
[25] J. Feldmar, N. Ayache, and F. Betting.
3D-2D projective registration of
free-form curves and surfaces. Computer Vision and Image Understanding,
65(3):403–424, 1997.
[26] G. Fichtinger. Surgical navigation, registration, and tracking. http://cisstweb.cs.jhu.edu/people/gabor/Cs-600.145/Lectures/RegTrack.pdf, 2006.
[27] J. M. Fitzpatrick, J. B. West, and C. R. Maurer. Predicting error in rigid-body
point-based registration. IEEE Transactions on Medical Imaging, 17(5):694–
702, October 1998.
[28] M. Fleute, S. Lavallée, and L. Desbat. Integrated approach for matching statistical shape models with intra-operative 2D and 3D data. In Proceedings of
the 5th International Conference on Medical Image Computing and Computer
Assisted Intervention (MICCAI), volume 2489, pages 364–372, 2002.
[29] M. Fleute, S. Lavallee, and R. Julliard. Incorporating a statistically based shape
model into a system for computer-assisted anterior cruciate ligament surgery.
Medical Image Analysis, 3(3):209–222, 1999.
[30] R. L. Galloway. The process and development of image-guided procedures.
Annual Review of Biomedical Engineering, 3:83–108, 2001.
[31] P. Gamage, S. Q. Xie, P. Delmas, and P. Xu. Pose estimation of femur fracture
segments for image guided orthopedic surgery. In IEEE International Conference on Image and Vision Computing, pages 288–292, 2005.
[32] P. Giblin and B. B. Kimia. A formal classification of 3D medial axis points
and their local geometry. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 26(2):238–251, 2004.
[33] K. G. Gilhuijs, P. J. van de Ven, and M. van Herk. Automatic three-dimensional
inspection of patient setup in radiation therapy using portal images, simulator images, and computed tomography data. Medical Physics, 23(3):389–399,
March 1996.
[34] R. Gocke, J. Weese, and H. Schumann. Fast volume rendering methods for
voxel-based 2D/3D registration - a comparative study. In Workshop on Biomedical Image Registration, Bled, Slovenia, August 1999.
[35] M. Goitein, M. Abrams, D. Rowell, H. Pollari, and J. Wiles. Multidimensional treatment planning (II): Beam eye-view, back projection, and projection through CT sections. International Journal of Radiation Oncology Biology
Physics, 9:789–797, 1983.
[36] R. H. Gong and P. Abolmaesumi.
2D/3D registration with the CMA-ES
method. In Proceedings of SPIE Medical Imaging, volume 6918, pages 69181M1–
69181M9, 2008.
[37] R. H. Gong, P. Abolmaesumi, and J. Stewart. A robust technique for 2D-3D
registration. In IEEE International Conference on Engineering in Medicine and
Biology (EMBC), volume 1, pages 1433–1436, 2006.
[38] R. H. Gong, J. Stewart, and P. Abolmaesumi. A new method for CT to fluoroscope registration based on unscented Kalman filter. In Proceedings of the 9th
International Conference on Medical Image Computing and Computer Assisted
Intervention (MICCAI), volume 9, pages 891–898, 2006.
[39] R. H. Gong, J. Stewart, and P. Abolmaesumi. Reduction of multi-fragment
fractures of distal radius using an atlas-based 2D/3D registration technique. In
Proceedings of SPIE Medical Imaging, volume 7261, pages 371–379, 2009.
[40] R. H. Gong, J. Stewart, and P. Abolmaesumi. A new representation of intensity atlas for GPU-accelerated instance generation. In IEEE International
Conference on Engineering in Medicine and Biology (EMBC), pages 4399–4402,
2010.
[41] R. H. Gong, J. Stewart, and P. Abolmaesumi. Multiple-object 2D-3D registration for non-invasive pose identification of fracture fragments. IEEE Transaction
on Biomedical Engineering (TBME), 58(6):1592–1601, June 2011.
[42] N. Hansen. The CMA evolution strategy: A comparing review. In Towards
a new evolutionary computation. Advances on estimation of distribution algorithms, pages 75–102. Springer, 2006.
[43] D. L. G. Hill, P. G. Batchelor, M. Holden, and D. J. Hawkes. Medical image
registration. Physics in Medicine and Biology, 46(3):R1, 2001.
[44] J. Huang, R. Crawfis, and D. Stredney. Edge preservation in volume rendering using splatting. In Proceedings of the 1998 IEEE Symposium on Volume
Visualization (VVS), pages 63–69, 1998.
[45] L. Ibanez, W. Schroeder, L. Ng, and J. Cates. The ITK Software Guide. Kitware, Inc. ISBN 1-930934-15-7, http://www.itk.org/ItkSoftwareGuide.pdf, second edition, 2005.
[46] A. Jain, R. Kon, Y. Zhou, and G. Fichtinger. C-arm calibration–is it really
necessary? In Proceedings of the 8th International Conference on Medical Image
Computing and Computer Assisted Intervention (MICCAI), volume 8, pages
639–646, 2005.
[47] L. Joskowicz, C. Milgrom, A. Simkin, L. Tockus, and Z. Yaniv. Fracas: a system
for computer-aided image-guided long bone fracture surgery. Computer Aided
Surgery, 3(6):271–288, 1998.
[48] B. G. Kashef and A. A. Sawchuk. A survey of new techniques for image registration and mapping. In Proceedings of SPIE Medical Imaging, volume 432,
pages 222–239, 1983.
[49] D. Skerl, B. Likar, and F. Pernus. Evaluation of similarity measures for 3D/2D
image registration. In Proceedings of SPIE Medical Imaging, volume 6144, pages
61442F–61442F–11, 2006.
[50] E. Kerrien, M. O. Berger, E. Maurincomme, L. Launay, R. Vaillant, and L. Picard. Fully automatic 3D/2D subtracted angiography registration. In Proceedings of the 2nd International Conference on Medical Image Computing and
Computer Assisted Intervention (MICCAI), volume 1679, pages 664–671, 1999.
[51] A. Khoury, J. H. Siewerdsen, C. M. Whyne, M. J. Daly, H. J. Kreder, D. J.
Moseley, and D. A. Jaffray. Intraoperative cone-beam CT for image-guided
tibial plateau fracture reduction. Computer Aided Surgery, 12(4):195–207, 2007.
[52] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated
annealing. Science, 220(4598):671–680, 1983.
[53] D. Knaan and L. Joskowicz. Effective intensity-based 2D/3D rigid registration
between fluoroscopic X-ray and CT. In Proceedings of the 6th International
Conference on Medical Image Computing and Computer Assisted Intervention
(MICCAI), volume 2878, pages 351–358, 2003.
[54] J. J. Kuffner. Effective sampling and distance metrics for 3D rigid body path
planning. In IEEE International Conference on Robotics and Automation, pages
3993–3998, 2004.
[55] P. Lacroute and M. Levoy. Fast volume rendering using a shear-warp factorization of the viewing transformation. In Proceedings of the 21st Annual
Conference on Computer Graphics and Interactive Techniques, pages 451–458,
1994.
[56] D. LaRose, J. Bayouth, and T. Kanade. Transgraph: interactive intensity-based
2D/3D registration of X-ray and CT data. In Proceedings of SPIE Medical
Imaging, volume 3979, pages 385–396, 2000.
[57] D. A. LaRose. Iterative X-Ray/CT Registration Using Accelerated Volume Rendering. PhD thesis, Carnegie Mellon University, 2001.
[58] S. Lavallee, R. Szeliski, and L. Brunie. Matching 3-D smooth surfaces with
their 2-D projections using 3-D distance maps. In Geometric Reasoning for
Perception and Action, volume 708, pages 217–238. 1993.
[59] S. Y. Lee, G. Wolberg, and S. Y. Shin. Scattered data interpolation with multilevel B-splines. IEEE Transactions on Visualization and Computer Graphics,
3:228–244, 1997.
[60] T. Leloup, W. E. Kazzi, O. Debeir, F. Schuind, and N. Warzee. Automatic
fluoroscopic image calibration for traumatology intervention guidance. In International Conference on Computer as a Tool, volume 1, pages 374–377, 2005.
[61] H. Lester. A survey of hierarchical non-linear medical image registration. Pattern Recognition, 32(1):129–149, January 1999.
[62] P. P. Li, S. Whitman, R. Mendoza, and J. Tsiao. ParVox - a parallel splatting
volume rendering system for distributed visualization. In Proceedings of the
IEEE Symposium on Parallel Rendering (PRS), pages 7–ff, Los Alamitos, CA,
USA, 1997. IEEE Computer Society.
[63] H. Livyatan, Z. Yaniv, and L. Joskowicz. Robust automatic C-arm calibration for fluoroscopy-based navigation: A practical approach. In Proceedings of
the 5th International Conference on Medical Image Computing and Computer
Assisted Intervention (MICCAI), volume 2489, pages 60–68, 2002.
[64] H. Livyatan, Z. Yaniv, and L. Joskowicz. Gradient-based 2D-3D rigid registration of fluoroscopic X-ray to CT. IEEE Transactions on Medical Imaging,
22(11):1395–1406, November 2003.
[65] W. E. Lorensen and H. E. Cline. Marching Cubes: A high resolution 3D surface
construction algorithm. Computer Graphics, 21(4):163–169, 1987.
[66] P. Lorenzen, M. Prastawa, B. Davis, G. Gerig, E. Bullitt, and S. Joshi. Multimodal image set registration and atlas formation. Medical Image Analysis,
10(3):440–451, June 2006.
[67] B. Ma, J. Stewart, D. Pichora, R. Ellis, and P. Abolmaesumi. 2D/3D registration of multiple bones. In IEEE International Conference on Engineering in
Medicine and Biology (EMBC), pages 860–863, 2007.
[68] F. Maes, D. Vandermeulen, and P. Suetens. Comparative evaluation of multiresolution optimization strategies for multimodality image registration by maximization of mutual information. Medical Image Analysis, 3(4):373–386, December 1999.
[69] J. B. Maintz and M. A. Viergever. A survey of medical image registration.
Medical Image Analysis, 2(1):1–36, March 1998.
[70] P. Markelj, D. Tomazevic, B. Likar, and F. Pernus. A review of 3D/2D registration
methods for image-guided interventions. Medical Image Analysis, April 2010.
[71] S. Marsland, C. Twining, and C. Taylor. Groupwise non-rigid registration using
polyharmonic clamped-plate splines. In Proceedings of the 6th International
Conference on Medical Image Computing and Computer Assisted Intervention
(MICCAI), volume 2879, pages 771–779. 2003.
[72] D. Mattes, D. R. Haynor, H. Vesselle, T. K. Lewellyn, and W. Eubank. Nonrigid
multimodality image registration. In Milan Sonka and Kenneth M. Hanson,
editors, Proceedings of SPIE Medical Imaging, volume 4322, pages 1609–1620,
2001.
[73] N. Milickovic, D. Baltas, S. Giannouli, M. Lahanas, and N. Zamboglou. CT
imaging based digitally reconstructed radiographs and their application in
brachytherapy. Physics in Medicine and Biology, 45(10):2787, 2000.
[74] M. H. Moghari and P. Abolmaesumi. A high-order solution for the distribution
of target registration error in rigid-body point-based registration. In Proceedings
of the 9th International Conference on Medical Image Computing and Computer
Assisted Intervention (MICCAI), volume 4191, pages 603–611, 2006.
[75] E. D. Momi, K. Eckman, B. Jaramaz, and A. DiGioia. Improved 2D/3D registration robustness using local spatial information. In Proceedings of SPIE
Medical Imaging, volume 6144, pages 977–984, 2006.
[76] R. Munbodh, D. A. Jaffray, D. J. Moseley, Z. Chen, J. P. S. Knisely, P. Cathier,
and J. S. Duncan. Automated 2D-3D registration of a radiograph and a cone
beam CT using line-segment enhancement. Medical Physics, 33(5):1398–411,
2006.
[77] Y. Nakajima, T. Tashiro, T. Okada, Y. Sato, N. Sugano, M. Saito, K. Yonenobu,
H. Yoshikawa, T. Ochi, and S. Tamura. Computer-assisted fracture reduction
of proximal femur using preoperative CT data and intraoperative fluoroscopic
images. In International Congress Series - Computer Assisted Radiology and
Surgery, volume 1268, pages 620–625, 2004.
[78] Y. Nakajima, T. Tashiro, N. Sugano, K. Yonenobu, T. Koyama, Y. Maeda,
Y. Tamura, M. Saito, S. Tamura, M. Mitsuishi, N. Suigta, I. Sakuma, T. Ochi,
and Y. Matsumoto. Fluoroscopic bone fragment tracking for surgical navigation in femur fracture reduction by incorporating optical tracking of hip
joint rotation center. IEEE Transactions on Biomedical Engineering (TBME),
54(9):4173–4178, 2007.
[79] L. Nolte. Computer assisted orthopedic surgery : (CAOS). Hogrefe & Huber,
Seattle, 1999.
[80] T. Okada, Y. Iwasaki, T. Koyama, N. Sugano, Y. W. Chen, K. Yonenobu,
and Y. Sato. Computer-assisted preoperative planning for reduction of proximal femoral fracture using 3D-CT data. IEEE Transactions on Biomedical
Engineering (TBME), 56(3):749–759, 2009.
[81] J. Orchard and R. Mann. Registering a multisensor ensemble of images. IEEE
Transactions on Image Processing, 19:1236–1247, May 2010.
[82] G. P. Penney, J. Weese, J. A. Little, P. Desmedt, D. L. Hill, and D. J. Hawkes. A
comparison of similarity measures for use in 2-D-3-D medical image registration.
IEEE Transactions on Medical Imaging, 17(4):586–595, August 1998.
[83] T. M. Peters. Image-guidance for surgical procedures. Physics in Medicine and
Biology, 51(14):R505–R540, 2006.
[84] M. R. Pickering, A. A. Muhit, J. M. Scarvell, and P. N. Smith. A new multimodal similarity measure for fast gradient-based 2D-3D image registration.
In IEEE International Conference on Engineering in Medicine and Biology
(EMBC), pages 5821–5824, 2009.
[85] J. P. Pluim, J. B. Maintz, and M. A. Viergever. Image registration by maximization of combined mutual information and gradient information. IEEE
Transactions on Medical Imaging, 19(8):809–814, August 2000.
[86] J. P. W. Pluim, J. B. A. Maintz, and M. A. Viergever. Mutual-informationbased registration of medical images: a survey. IEEE Transactions on Medical
Imaging, 22(8):986–1004, August 2003.
[87] J. P. W. Pluim, J. B. A. Maintz, and M. A. Viergever. f-information measures in medical image registration. IEEE Transactions on Medical Imaging,
23(12):1508–1516, December 2004.
[88] T. D. Potma. Explorations of the motion and geometry of the human knee.
Master’s thesis, Queen’s University, Kingston, Canada, 2007.
[89] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical
Recipes in C++. Cambridge University Press, 2002.
[90] M. Prümmer, J. Hornegger, M. Pfister, and A. Dörfler. Multi-modal 2D-3D
non-rigid registration. In Proceedings of SPIE Medical Imaging, volume 6144,
pages 297–308, 2006.
[91] D. Rueckert, M. J. Clarkson, D. L. G. Hill, and D. J. Hawkes. Non-rigid
registration using higher-order mutual information. In Proceedings of SPIE
Medical Imaging, volume 3979, pages 438–447, 2000.
[92] D. Rueckert, A. F. Frangi, and J. A. Schnabel. Automatic construction of 3D
statistical deformation models using non-rigid registration. In Proceedings of
the 4th International Conference on Medical Image Computing and Computer
Assisted Intervention (MICCAI), volume 22, pages 77–84, 2001.
[93] D. B. Russakoff, T. Rohlfing, and C. R. Maurer. Fast intensity-based 2D-3D
image registration of clinical data using light fields. In Proceedings of 9th IEEE
International Conference on Computer Vision (ICCV), volume 1, pages 416–
422, 2003.
[94] D. B. Russakoff, T. Rohlfing, D. Rueckert, R. Shahidi, D. Kim, and C. R.
Maurer, Jr. Fast calculation of digitally reconstructed radiographs using light
fields. In Proceedings of SPIE Medical Imaging, volume 5032, pages 684–695,
2003.
[95] O. Sadowsky, G. Chintalapani, and R. H. Taylor. Deformable 2D-3D registration of the pelvis with a limited field of view, using shape statistics. In
Proceedings of the 10th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 10, pages 519–26,
2007.
[96] N. Schubert and I. Scholl. Comparing GPU-based multi-volume ray casting
techniques. Computer Science - Research and Development, 26:39–50, February
2011.
[97] A. J. Seibert and J. M. Boone. X-Ray imaging physics for nuclear medicine
technologists. Part 2: X-Ray interactions and image formation. Journal of
Nuclear Medicine Technology, 33(1):3–18, 2005.
[98] S. Stegmaier, M. Strengert, T. Klein, and T. Ertl. A simple and flexible volume rendering framework for graphics-hardware-based raycasting. International
Workshop on Volume Graphics, pages 187–241, 2005.
[99] J. E. Stone, D. Gohara, and G. C. Shi. OpenCL: A parallel programming
standard for heterogeneous computing systems. Computing in Science and Engineering, 12:66–73, 2010.
[100] C. Studholme. Simultaneous population based image alignment for template
free spatial normalisation of brain anatomy. In Biomedical Image Registration,
volume 2717, pages 81–90, 2003.
[101] T. S. Tang and R. E. Ellis. 2D/3D deformable registration using a hybrid atlas.
In Proceedings of the 8th International Conference on Medical Image Computing
and Computer Assisted Intervention (MICCAI), volume 8, pages 223–230, 2005.
[102] T. S. Y. Tang. Calibration and point-based registration of fluoroscopic images.
Master’s thesis, Queen’s University, Kingston, Canada, 1999.
[103] T. S. Y. Tang, R. E. Ellis, and G. Fichtinger. Fiducial registration from a single
X-Ray image: A new technique for fluoroscopic guidance and radiotherapy. In
Proceedings of the 3rd International Conference on Medical Image Computing
and Computer Assisted Intervention (MICCAI), volume 1935, pages 502–511,
2000.
[104] P. M. Tate, V. Lachine, L. Q. Fu, H. Croitoru, and M. Sati. Performance
and robustness of automatic fluoroscopic image calibration in a new computer
assisted surgery system. In Proceedings of the 4th International Conference
on Medical Image Computing and Computer Assisted Intervention (MICCAI),
volume 2208, pages 1130–1136, 2001.
[105] R. H. Taylor and D. Stoianovici. Medical robotics in computer-integrated
surgery. IEEE Transactions on Robotics and Automation, 19(5):765–781, 2003.
[106] P. Toft. The Radon Transform - Theory and Implementation. PhD thesis,
Technical University of Denmark, Lyngby, Denmark, 1996.
[107] D. Tomazevic, B. Likar, and F. Pernus. 3-D/2-D registration by integrating
2-D information in 3-D. IEEE Transactions on Medical Imaging, 25(1):17–27,
January 2006.
[108] A. Tristan and J. I. Arribas. A fast B-spline pseudo-inversion algorithm for
consistent image registration. In Computer Analysis of Images and Patterns,
pages 768–775, 2007.
[109] C. J. Twining, T. Cootes, S. Marsland, V. Petrovic, R. Schestowitz, and C. J.
Taylor. A unified information-theoretic approach to groupwise non-rigid registration and model building. Information Processing in Medical Imaging, 19:1–
14, 2005.
[110] E. B. van de Kraats, G. P. Penney, D. Tomazevic, T. van Walsum, and W. J.
Niessen. Standardized evaluation methodology for 2-D/3-D registration. IEEE
Transactions on Medical Imaging, 24(9):1177–1189, 2005.
[111] E. A. Wan and R. van der Merwe. The unscented Kalman filter for nonlinear
estimation. In IEEE Adaptive Systems for Signal Processing, Communications,
and Control Symposium (AS-SPCC), pages 153–158, 2000.
[112] F. Wang, T. Davis, and B. Vemuri. Real-time DRR generation using cylindrical
harmonics. In Proceedings of the 5th International Conference on Medical Image
Computing and Computer Assisted Intervention (MICCAI), volume 2489, pages
671–678, 2002.
[113] W. C. Wang and E. H. Wu. Adaptable splatting for irregular volume rendering.
Computer Graphics Forum, 18(4):213–222, 1999.
[114] W. Wein. Intensity based rigid 2D-3D registration algorithms for radiation
therapy. Master's thesis, Technische Universität München, München, Germany,
2003.
[115] M. A. Westenberg and J. B. T. M. Roerdink. X-ray volume rendering by hierarchical wavelet splatting. In Proceedings of the 15th International Conference
on Pattern Recognition, volume 3, pages 159–162, 2000.
[116] L. Westover. Interactive volume rendering. In Proceedings of the 1989 IEEE
Symposium on Volume Visualization (VVS), pages 9–16, 1989.
[117] R. Westphal, T. Gösling, M. Oszwald, J. Bredow, D. Klepzig, S. Winkelbach,
T. Hufner, C. Krettek, and F. Wahl. Robot assisted fracture reduction. Experimental Robotics, 3(6):153–163, 2008.
[118] Wikipedia. Axis-angle representation. http://en.wikipedia.org/wiki/Axis-angle_representation, July 2011.
[119] Wikipedia. Gimbal lock. http://en.wikipedia.org/wiki/Gimbal_lock, July
2011.
[120] Z. Yaniv and K. Cleary. Image-guided procedures: A review. http://isiswiki.georgetown.edu/zivy/writtenMaterial/CAIMR-TR-2006-3.pdf, November 2006.
[121] Z. Yaniv, L. Joskowicz, A. Simkin, M. A. Garza-Jinich, and C. Milgrom. Fluoroscopic image processing for Computer-Aided Orthopaedic Surgery. In Proceedings of the 1st International Conference on Medical Image Computing and
Computer Assisted Intervention (MICCAI), volume 1496, pages 325–334, 1998.
[122] J. H. Yao. A Statistical Bone Density Atlas And Deformable Medical Image
Registration. PhD thesis, Johns Hopkins University, Baltimore, USA, 2001.
[123] J. H. Yao and R. H. Taylor. Tetrahedral mesh modeling of density data for
anatomical atlases and intensity-based registration. In Proceedings of the 3rd
International Conference on Medical Image Computing and Computer Assisted
Intervention (MICCAI), pages 531–540, 2000.
[124] P. A. Yushkevich, J. Piven, C. H. Heather, G. S. Rachel, S. Ho, J. C. Gee, and
G. Gerig. User-guided 3D active contour segmentation of anatomical structures:
Significantly improved efficiency and reliability. Neuroimage, 31(3):1116–1128,
2006.
[125] H. X. Zhao and A. J. Reader. Fast projection algorithm for voxel arrays with
object dependent boundaries. In IEEE International Symposium on Nuclear
Science, volume 3, pages 1490–1494, 2002.
[126] G. Y. Zheng, M. A. G. Ballester, M. Styner, and L. P. Nolte. Reconstruction
of patient-specific 3D bone surface from 2D calibrated fluoroscopic images and
point distribution model. In Proceedings of the 9th International Conference
on Medical Image Computing and Computer Assisted Intervention (MICCAI),
volume 9, pages 25–32, 2006.
[127] D. Zikic, B. Glocker, O. Kutter, M. Groher, N. Komodakis, A. Khamene,
N. Paragios, and N. Navab. Markov random field optimization for intensitybased 2D-3D registration. In Proceedings of SPIE Medical Imaging, volume
7623, pages 762334–762334–8, 2010.
[128] B. Zitova. Image registration methods: a survey. Image and Vision Computing,
21(11):977–1000, October 2003.
[129] L. Zöllei, E. Grimson, A. Norbash, and W. Wells III. 2D-3D rigid registration of
X-ray fluoroscopy and CT images using mutual information and sparsely sampled histogram estimators. In Proceedings of the 2001 IEEE Computer Society
Conference on Computer Vision and Pattern Recognition (CVPR), volume 2,
pages II–696–II–703, 2001.
[130] L. Zöllei, E. Learned-Miller, E. Grimson, and W. Wells. Efficient population
registration of 3D data. Proceedings of the International Conference on Computer Vision (ICCV), 3765:291–301, 2005.
[131] M. Zwicker, H. Pfister, J. V. Baar, and M. H. Gross. EWA splatting. IEEE
Transactions on Visualization and Computer Graphics, 8:223–238, 2002.