2D-3D Registration Methods for Computer-Assisted Orthopaedic Surgery

by

Ren Hui Gong

A thesis submitted to the School of Computing in conformity with the requirements for the degree of Doctor of Philosophy

Queen's University
Kingston, Ontario, Canada
September 2011

Copyright © Ren Hui Gong, 2011

Abstract

2D-3D registration is one of the underpinning technologies that enable image-guided intervention in computer-assisted orthopaedic surgery (CAOS). Preoperative 3D images and surgical plans must be mapped to the patient in the operating room before they can be used to augment the surgical intervention. This task is generally accomplished by 2D-3D registration, which spatially aligns a preoperative 3D image to a set of intraoperative fluoroscopic images. The key problems in 2D-3D registration are to define an accurate similarity metric between the 2D and 3D data and to choose an appropriate optimization algorithm.

Various similarity metrics and optimization algorithms have been proposed for 2D-3D registration; however, current techniques have several critical limitations. First, a good initial guess, usually within a few millimetres of the true solution, is required, and such a capture range is often not wide enough for clinical use. Second, with currently used optimization algorithms, it is difficult to achieve a good balance between computational efficiency and registration accuracy. Third, most current techniques register a 3D image of a single bone to a set of fluoroscopic images, but many CAOS procedures, such as multi-fragment fracture treatment, involve multiple bone pieces.

In this thesis, research has been conducted to investigate the above problems: 1) two new registration techniques are proposed that use recently developed optimization techniques, namely the Unscented Kalman Filter (UKF) and the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), to improve the capture range of 2D-3D registration; 2) a multiple-object 2D-3D registration technique is proposed that simultaneously aligns multiple 3D images of fracture fragments to a set of fluoroscopic images of the fracture ensemble; 3) a new method is developed for fast and efficient construction of anatomical atlases; and 4) a new atlas-based multiple-object 2D-3D registration technique is proposed to aid fracture reduction in the absence of preoperative 3D images.

Experimental results showed that: 1) with the new optimization algorithms, robustness against noise and outliers was improved and registrations could be performed more efficiently; 2) simultaneous registration of multiple bone fragments could achieve a clinically acceptable global alignment among all objects at reasonable computational cost; 3) the new atlas construction method could build and update intensity atlases accurately and efficiently; and 4) the use of an atlas in multiple-object 2D-3D registration is feasible.

Acknowledgements

First of all, I would like to express my sincere appreciation and thanks to my supervisors, Purang Abolmaesumi and James Stewart. Without their supervision, support and patience, this thesis would not have been possible. In addition, I would like to thank Paul St. John, Burton Ma, Manuela Kunz and David Pichora for their great assistance in collecting or providing medical data, a key prerequisite of my research. Many thanks to Burton Ma for developing and providing the fracture treatment planning tool in support of my research. Thanks also to Manuela Kunz for her knowledge and assistance throughout my research work. I am thankful to Mehdi H. Moghari, my former colleague, for his assistance in understanding the Kalman Filter and for useful discussions on other mathematical problems.
Statement of Originality

I hereby declare that this submission is my own work and, to the best of my knowledge, it contains no materials previously published or written by another person, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked, is explicitly acknowledged in the thesis.

Contents

Abstract
Acknowledgements
Statement of Originality
List of Tables
List of Figures
Glossary

Chapter 1: Introduction
  1.1 Motivation
  1.2 Objectives
  1.3 Contributions
  1.4 Thesis Organization

Chapter 2: Background
  2.1 Overview of CAOS
    2.1.1 Components
    2.1.2 Procedures
    2.1.3 2D-3D Registration in CAOS
  2.2 Overview of 2D-3D Registration
    2.2.1 Registration in General
    2.2.2 2D-3D Registration
  2.3 2D and 3D Data in CAOS
    2.3.1 X-ray Fluoroscopy
    2.3.2 Computed Tomography (CT)
    2.3.3 Anatomical Atlases
  2.4 Transformations in 2D-3D Registration
    2.4.1 Transformations within the Fixed Data
    2.4.2 Transformation of the Moving Data
    2.4.3 Parametrization of the Output Transformation
  2.5 DRR Generation
    2.5.1 Ray-casting
    2.5.2 GPU-based Texture Mapping
    2.5.3 Other Techniques
    2.5.4 DRR Generation for Multiple Objects
  2.6 2D-3D Similarity Metrics
    2.6.1 Correlation-based Metrics
    2.6.2 Information-theory Metrics
    2.6.3 Metrics using Spatial Information
  2.7 Optimization Algorithms
    2.7.1 Techniques Not using Derivatives
    2.7.2 Techniques using Derivatives
    2.7.3 Robust and Efficient Optimization

Chapter 3: 2D-3D Registration with Unscented Kalman Filter
  3.1 Overview
  3.2 Introduction
  3.3 Method
    3.3.1 Transform and its Initial Value
    3.3.2 Hardware-based Volume Rendering Engine
    3.3.3 Similarity Measure
    3.3.4 Unscented Kalman Filter
  3.4 Experiment, Results, and Discussion
    3.4.1 Data Sets
    3.4.2 Experiments
    3.4.3 Results
    3.4.4 Discussion
  3.5 Summary

Chapter 4: 2D-3D Registration with the CMA-ES Method
  4.1 Overview
  4.2 Introduction
  4.3 Method
    4.3.1 Algorithm Overview
    4.3.2 Optimization with CMA-ES
  4.4 Experiments, Results and Discussion
    4.4.1 Registration of CT to Simulated Fluoroscopy
    4.4.2 Registration of CT to Real Fluoroscopy
    4.4.3 The Impact of Initial Search Size
  4.5 Summary

Chapter 5: Multiple-Object 2D-3D Registration
  5.1 Overview
  5.2 Introduction
  5.3 Methods
    5.3.1 Preoperative Treatment Plan
    5.3.2 Tracked Intraoperative Fluoroscopic Images
    5.3.3 Transforms
    5.3.4 Similarity Metric
    5.3.5 DRR Computation
    5.3.6 Optimization Scheme
  5.4 Results
    5.4.1 Error Measurement
    5.4.2 Experiments with Synthetic Fractures
    5.4.3 Experiments with Fracture Phantoms
    5.4.4 Validation Against Outliers in Fluoroscopic Images
  5.5 Discussion
  5.6 Summary

Chapter 6: Modelling of 3D Intensity Atlas with B-Spline FFD
  6.1 Overview
  6.2 Introduction
  6.3 Method
    6.3.1 Atlas Representations
    6.3.2 Atlas Construction
    6.3.3 Instance Generation
  6.4 Experiments, Results and Discussion
    6.4.1 Accuracy of Intensity Approximation
    6.4.2 Performance of Instance Generation
    6.4.3 Performance of Atlas Construction
  6.5 Summary

Chapter 7: Atlas-based Multiple-object 2D-3D Registration
  7.1 Overview
  7.2 Introduction
  7.3 Method
    7.3.1 The Atlas of Distal Radius
    7.3.2 Multiple-Fragment Deformable 2D-3D Registration
  7.4 Experiments and Results
  7.5 Discussion and Summary

Chapter 8: Conclusion
  8.1 Summary of Contributions
    8.1.1 Robust and Efficient 2D-3D Registration
    8.1.2 Multiple-object 2D-3D Registration
  8.2 Future Work

Bibliography

List of Tables

3.1 Testing data specifications (UKF)
3.2 Comparison of capture range (UKF)
3.3 Comparison of accuracy (UKF)
3.4 Comparison of performance (UKF)
4.1 Testing data specifications (CMA-ES)
4.2 Generation of testing cases
4.3 Results with simulated X-rays: registration error
4.4 Results with simulated X-rays: capture range, accuracy and computation time
4.5 Results with real X-rays: registration error
4.6 Results with real X-rays: capture range, accuracy and computation time
4.7 Impact of initial search size: capture range, accuracy and computation time
5.1 Error statistics: two-fragment synthetic fracture
5.2 Error statistics: three-fragment synthetic fracture
5.3 Error statistics: two-fragment fracture phantom
5.4 Error statistics: three-fragment fracture phantom
5.5 Results for outlier studies
6.1 Accuracies of intensity approximation with B-spline FFD
6.2 Performance of GPU-accelerated instance generation
7.1 Preliminary results of atlas-based 2D-3D registration

List of Figures

2.1 A typical CAOS setup
2.2 Overview of general registration
2.3 Overview of DRR-based 2D-3D registration
2.4 An example of C-arm calibration drum
2.5 An X-ray example before and after calibration
2.6 Transformations in 2D-3D registration
3.1 Overview of UKF-based 2D-3D registration
3.2 Overview of the UKF algorithm
3.3 Testing data (UKF)
3.4 Experimental results (UKF)
3.5 Visual check of registration results
4.1 Overview of CMA-ES-based 2D-3D registration
4.2 Overview of the CMA-ES algorithm
4.3 Impact of initial search size: registration error
5.1 Overview of multiple-object 2D-3D registration
5.2 The testing cycle for each single experiment
5.3 Simulated two-fragment wrist fracture
5.4 Simulated three-fragment wrist fracture
5.5 Formation of a simulated X-ray
5.6 Experimental results: two-fragment synthetic fracture
5.7 Experimental results: three-fragment synthetic fracture
5.8 Simulated X-rays and the corresponding DRRs
5.9 Two-fragment fracture phantom
5.10 Three-fragment fracture phantom
5.11 X-ray images used for phantom studies
5.12 Experimental results: two-fragment fracture phantom
5.13 Experimental results: three-fragment fracture phantom
5.14 Visual check of registration results: two-fragment fracture phantom
5.15 Visual check of registration results: three-fragment fracture phantom
5.16 Visual check of registration results: outlier studies
6.1 Training examples for CT atlas of distal radius
6.2 Approximated CT data with different B-spline spacings
7.1 Training examples for building a CT atlas of distal radius
7.2 Overview of atlas-based multiple-object 2D-3D registration
7.3 Modelling of a fracture fragment
7.4 Generation of synthetic fracture
7.5 Experimental results

Glossary

2D: 2-dimensional.
3D: 3-dimensional.
AABB: Axis-aligned Bounding Box, a bounding box that is aligned with the coordinate axes.
ABS: Acrylonitrile Butadiene Styrene.
ACL: Anterior Cruciate Ligament.
AP: Anterior-posterior, a camera orientation that looks straight down the patient's chest.
ASGTM: Adaptive Slice Geometry Texture Mapping, an improved texture mapping method for volume rendering.
CAOS: Computer-Assisted Orthopaedic Surgery.
CART: Computer-Assisted Radiotherapy.
CAS: Computer-Assisted Surgery.
CF: Coordinate Frame.
CMA-ES: Covariance Matrix Adaptation Evolution Strategy, an algorithm for estimating the parameters of nonlinear systems.
Co-registered 2D images: A set of 2D images within the same coordinate frame.
CPU: Central Processing Unit.
CT: Computed Tomography, a 3D imaging modality based on X-ray imaging.
CUDA: Compute Unified Device Architecture, an architecture developed by NVIDIA for performing parallel computations on GPUs.
CVR: Coefficients-to-voxels Ratio.
DirectX: A component of Microsoft Windows that handles multimedia-related tasks such as video, audio and input/output.
DOF: Degree of Freedom.
DRB: Dynamic Reference Base.
DRO: Distal Radius Osteotomy.
DRR: Digitally Reconstructed Radiograph, a projection image obtained by simulating the X-ray imaging process on a CT image.
FFD: Free-form Deformation.
Fixed data: The data set in a registration that is used as the reference and is kept fixed during the registration.
GC: Gradient Correlation, a similarity metric between two images that employs the correlations between the gradient images.
GD: Gradient Difference, a similarity metric between two images that employs the differences between the gradient images.
GPGPU: General-Purpose computation on Graphics Processing Units.
GPU: Graphics Processing Unit.
Group-wise registration: A registration that simultaneously aligns multiple images.
GRV: Gaussian Random Variable.
GUI: Graphical User Interface.
HRA: Hip Resurfacing Arthroplasty.
IIR: Infinite Impulse Response.
ips: Instances per second, the average speed of generating instances from an atlas.
MBA: Multi-level B-spline Approximation.
MI: Mutual Information, a similarity metric between two images that uses the joint entropy.
Moving data: The data set in a registration whose pose is to be determined by the registration.
MRI: Magnetic Resonance Imaging, a 3D imaging modality based on nuclear magnetic resonance.
mTRE: mean Target Registration Error.
NCC: Normalized Correlation Coefficient, a similarity metric that evaluates the cross-correlation of two images.
NMI: Normalized Mutual Information, a variant of MI.
OBB: Oriented Bounding Box, a bounding box that is aligned with the principal axes of an anatomy.
OIVR: Order Independent Volume Rendering.
OpenCL: Open Computing Language, a framework managed by the Khronos Group for programming heterogeneous devices such as CPUs, GPUs and other processors.
OpenGL: Open Graphics Library, a standard programming interface managed by the Khronos Group for writing programs that produce 2D and 3D graphics.
OR: Operating Room.
Pair-wise registration: A registration that aligns two images.
PCA: Principal Component Analysis.
PI: Pattern Intensity, a similarity metric between two images that counts the occurrences of a special intensity pattern.
ROI: Region of Interest.
SLNC: Sum of Local Normalized Correlation, a variant of NCC that evaluates and combines local cross-correlations.
SRC: Stochastic Rank Correlation, a similarity metric that calculates the cross-correlation between the intensity ranks of two images.
SVD: Singular Value Decomposition.
THA: Total Hip Arthroplasty.
TKA: Total Knee Arthroplasty.
TRE: Target Registration Error.
UKF: Unscented Kalman Filter, an algorithm for estimating the parameters of nonlinear systems.
UT: Unscented Transform.
Volume rendering: Generation of a 2D image from a 3D image.
VWC: Variance-Weighted Correlation, a variant of SLNC that applies a special weighting when combining the local cross-correlations.

Chapter 1

Introduction

Registration of medical images enables automatic augmentation and/or fusion of multiple images of an object in order to gain new insights into the trauma or anatomical structures under the skin. It is one of the underpinning technologies of computer-assisted surgery (CAS). One important type of registration is 2D-3D registration, which spatially aligns 3D CT images to 2D X-ray fluoroscopy images; CT and fluoroscopy are the most widely available imaging modalities that produce high-quality images of bony structures.
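The fluoroscopy-like images that such a registration compares against are digitally reconstructed radiographs (DRRs), simulated X-rays computed from the CT volume. As a hedged toy illustration only (this thesis uses hardware-accelerated ray-casting, not the parallel projection shown here), a DRR can be approximated by integrating voxel intensities along one axis of a synthetic volume:

```python
import numpy as np

def toy_drr(volume, axis=0):
    """Toy parallel-projection DRR: line integrals of attenuation along
    one coordinate axis (real C-arm geometry uses perspective ray-casting)."""
    return volume.sum(axis=axis)

# A tiny synthetic "CT": a uniformly attenuating cube in an empty volume.
ct = np.zeros((32, 32, 32))
ct[8:24, 8:24, 8:24] = 1.0

drr = toy_drr(ct, axis=0)   # project along the first axis
print(drr.shape)            # (32, 32): a 2D projection image
print(drr.max())            # 16.0: sixteen unit voxels along each central ray
```

Rays passing through the cube accumulate intensity in proportion to the material they traverse, which is why bone-like dense structures dominate both real fluoroscopy and DRRs.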
2D-3D registration is especially important in computer-assisted orthopaedic surgery (CAOS), a major subspecialty of CAS that deals with human bones such as the spine, hip, knee, and wrist. In this thesis, new 2D-3D registration techniques for CAOS are investigated, particularly for multi-fragment fracture fixation applications.

1.1 Motivation

Despite the importance of 2D-3D registration in CAOS, its use in clinics has been limited due to several challenges.

First, most registration techniques solve the problem in an iterative fashion that requires a sufficiently good initial alignment between the two images being registered. Large errors in the initial alignment significantly increase the failure rate of the registration, and in many cases, such as complex fractures, finding an initially close alignment is difficult.

Second, various kinds of noise and artefacts can be introduced during image acquisition and the registration process. For example, imaging sensors produce system noise due to their physical limitations; image acquisition from inappropriate orientations can introduce outliers that appear in one image but are missing in another; and some registration techniques require post-processing of the images, which may introduce undesired artefacts. When the accumulated noise and imaging artefacts become dominant, the registration suffers from problems such as a high failure rate, inaccurate final results, sensitivity to the initial alignment, and slow convergence.

Third, 2D-3D registration in CAOS is often used with a preoperative treatment plan to guide the intraoperative intervention. The problem is to identify the intraoperative positions of all involved bones in the coordinate space of the preoperative treatment plan.
Due to the challenges mentioned above, most reported 2D-3D registration techniques handle only a single bony structure per registration in order to achieve a good compromise among performance, robustness and accuracy. When such techniques are used to treat cases that involve multiple bone fragments, more complex procedures are needed, and a good global alignment of all involved bone fragments may be difficult to achieve.

1.2 Objectives

The overall objective of this thesis is to develop and evaluate new 2D-3D registration techniques that tackle the challenges presented in the previous section. Specifically, the following objectives have been defined:

• To seek new optimization algorithms for 2D-3D registration that are not only robust against various types of noise but also efficient to run.

• To develop 2D-3D registration techniques that can simultaneously handle multiple bony structures with reasonable computation time.

1.3 Contributions

The main contributions of this thesis, on all of which I was the principal author, consist of the following published works:

1. A 2D-3D registration method that utilises the Unscented Kalman Filter (UKF) was proposed and evaluated. The method showed better robustness against noise and a wide capture range without compromising computation speed. This work was published in the proceedings of MICCAI 2006, with J. Stewart and P. Abolmaesumi as co-authors [38].

2. A 2D-3D registration method that takes advantage of the robust Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm was proposed and evaluated. Like the UKF-based method, this method demonstrated great robustness against noise, a wide capture range and good computational performance. In addition, it had better usability because fewer user parameters were involved. Preliminary and extended results of this work were published in the proceedings of IEEE EMBC 2006 and SPIE 2008, both with J. Stewart and P.
Abolmaesumi as co-authors [37, 36].

3. A multiple-object 2D-3D registration method was proposed and evaluated. This method is especially useful in CAOS for identifying the relative positions of intraoperative bone fragments with respect to preoperative treatment plans. Such capability can lead to novel computer-assisted, image-guided multi-fragment fracture fixation techniques, and to methods for postoperatively assessing treatment errors. This work was published in IEEE TBME 2011, with J. Stewart and P. Abolmaesumi as co-authors [41].

4. The proposed multiple-object 2D-3D registration method depends on two independent procedures: developing the treatment plan and performing the registration. Using a statistical shape model of the bone being treated as a reference can significantly simplify the entire process. Because the performance of constructing and using such a statistical model is very important, a new method was proposed to model, construct and use 3D intensity atlases. This work was published in the proceedings of IEEE EMBC 2010, with J. Stewart and P. Abolmaesumi as co-authors [40].

5. A new multiple-object 2D-3D registration method that incorporates the statistical atlas of the bone was proposed and evaluated. This method integrates automatic planning with the registration and thus aims to improve clinical usability. This work was published in the proceedings of SPIE 2009, with P. Abolmaesumi as the co-author [39].

1.4 Thesis Organization

The rest of this thesis is organized into seven chapters. Chapter 2 gives an overview of CAOS and provides a review of current 2D-3D registration techniques; the key problems in 2D-3D registration that motivated this thesis are also discussed. Chapters 3 and 4 present two new 2D-3D registration techniques that employ the UKF and CMA-ES optimization algorithms to improve registration performance.
Chapter 5 describes the first multiple-object 2D-3D registration method for pose identification in CAOS. Chapter 6 presents the new method for modelling, constructing and using 3D intensity-based anatomical atlases. Chapter 7 presents the multiple-object 2D-3D registration method that incorporates the atlas of the bone for an improved user experience. Finally, Chapter 8 gives some concluding remarks and points out potential directions for future research.

Chapter 2

Background

This chapter first provides an overview of CAOS and highlights how 2D-3D registration fits into the big picture of CAOS. Then, a brief introduction to 2D-3D registration is given and the key problems are described in more detail.

2.1 Overview of CAOS

Over the past two decades, traditional surgery has been rapidly moving toward computer-assisted surgery (CAS) because CAS enables the development of more accurate, minimally invasive procedures [30, 83, 105]. One important subspecialty of CAS is computer-assisted orthopaedic surgery (CAOS), which focuses on integrating CAS technologies into the field of orthopaedic surgery. Applications of CAOS are found in many orthopaedic therapies such as spinal surgery, total hip arthroplasty (THA), hip resurfacing arthroplasty (HRA), total knee arthroplasty (TKA), anterior cruciate ligament (ACL) reconstruction, and distal radius osteotomy (DRO). A large number of clinical trials and retrospective reviews have been published, and the use of CAOS has demonstrated benefits such as revolutionized operating room (OR) configurations and surgical procedures, enhanced preoperative planning, improved intraoperative effectiveness and efficiency, faster postoperative recovery, and improved clinical outcomes [1, 24, 79].

[Figure 2.1: A typical CAOS navigation system [26].]

2.1.1 Components

Fig. 2.1 illustrates a typical CAOS setup in the OR.
As shown in the figure, a CAOS system has three major constituents:

• A camera or 3D localizer that is used to determine the reference coordinate frame (CF) of the OR and to integrate all trackable OR devices;

• Various trackable devices that are monitored by the camera to provide the positions or poses of the patient or other OR objects. Example devices include pointers, dynamic reference bases (DRBs), surgical tools, and intraoperative imaging devices;

• A computer workstation that stores preoperative surgical plans and connects all OR devices to provide image-based navigation.

2.1.2 Procedures

Using CAOS technologies for medical treatment generally consists of three phases or procedures:

Preoperative planning. Before a surgery is conducted, a digital "patient" is created on the computer workstation and is subsequently used as the basis for making surgical plans. The digital "patient" consists of various information about the patient and trauma, such as medical images, 3D models of the involved anatomical structures, geometry of implants, and relevant functional data. Multiple imaging modalities may be used concurrently for better visualization of different anatomical structures. All information is mapped into a common coordinate frame before surgical plans can be made. The planning process is patient- and procedure-specific, and has variable complexity: a plan can be as simple as specifying a target, or as complex as determining the final shape of a bone fracture and specifying the movement trajectories of the fracture fragments.

Intraoperative intervention. Once the patient is in the OR, the digital "patient" on the computer is mapped to the physical patient, i.e. the coordinate frame of the digital "patient" is aligned to the OR coordinate frame determined by the camera.
This task is performed either by aligning a set of anatomical or artificial landmarks that is visible both preoperatively and intraoperatively, or by capturing a 2.1. OVERVIEW OF CAOS 9 few intraoperative images and aligning them with the preoperative images. During the intervention, this task may also be performed periodically to update the surgical plan to reflect critical anatomical deformations or patient pose changes. Once the two coordinate frames are aligned, the navigation system provides visual assistance to the surgical team by tracking the surgical tools and visualizing the spatial relationships of the anatomical structures. The digital “patient” and the associated surgical plans are used as references to provide visual or quantitative feedback on the treatment accuracy. Postoperative assessment. After the surgery, it is necessary to quantitatively or visually evaluate the outcome of intervention with respect to the preoperative surgical plan. This task is often achieved using non-invasive methods because the evaluation may be repeated periodically over the healing time. In the most commonly used method, postoperative images are captured and compared with the preoperative images and the surgical plan. Each of the procedures described above employs various technologies. Collectively, several key technologies can be identified [83, 120]: • Medical imaging and image processing that provide appropriate 2D or 3D images to aid the therapy; • Segmentation and anatomical modelling that extract and construct anatomical models from medical images to aid planning, visualization, navigation, and so on; • Surgical planning that performs tasks such as trauma visualization, determination of surgical paths, simulation of surgical procedures, and so on, before an intervention is conducted; 2.1. 
• Data registration that integrates different images or anatomical models into a common context such that a more comprehensive understanding of the trauma can be obtained;

• Visualization that renders 3D images and anatomical models to highlight the structures of interest;

• Tracking that reports the positions or poses of various OR objects such as surgical tools, intraoperative imaging devices, DRBs, and so on;

• Human-computer interaction that accepts inputs from the surgical team and provides visual or quantitative feedback.

Though all the technologies mentioned above are important for CAOS, this thesis focuses on the registration topic; specifically, on 2D-3D registration.

2.1.3 2D-3D Registration in CAOS

One popular type of registration is 2D-3D registration, which aligns high-quality preoperative 3D data sets such as CT, MRI and 3D ultrasound to portable intraoperative 2D image modalities such as X-ray fluoroscopy and ultrasound. In CAOS, registration of CT to X-ray is especially important because these two modalities are not only the most suitable for imaging bony structures, but also widely available in hospitals.

2D-3D registration is a fundamental task in many CAOS procedures. In preoperative planning, it can be used to provide 3D visualizations of the trauma by registering bone models to fluoroscopic images. During intraoperative surgical intervention, it can be used to map the preoperative plan on the computer to the physical patient in the OR by registering the digital “patient” to a set of intraoperative fluoroscopic images. This is the most important use of 2D-3D registration in CAOS, and it is the key prerequisite that enables image-based navigation and surgical guidance. In postoperative assessment, 2D-3D registration can be used to measure or evaluate the treatment errors by registering the preoperative plan to a set of postoperative fluoroscopic images.
2.2 Overview of 2D-3D Registration

2D-3D registration is a special case of the general image registration problem. Therefore, a short introduction to image registration is given first, followed by a special focus on the 2D-3D registration problem.

2.2.1 Registration in General

The registration task integrates multiple data of an object into a common context in order to provide useful information to the physicians. It aligns two or more independent data sets in a single reference coordinate frame and establishes spatial correspondences among the points in the different data sets. The data to be aligned can be raw medical images or anatomical models that are derived from the images.

One fundamental type of registration is pair-wise registration, the process of geometrically aligning two correlated data sets. One of the two data sets serves as the reference data, and the other is transformed into the reference data’s coordinate frame such that the two data sets are aligned. Fig. 2.2 illustrates the key components as well as the general work-flow of a pair-wise registration.

Figure 2.2: Key components and general work-flow of a pair-wise registration.

The key components of a pair-wise registration are two data objects and three processes:
• A Fixed data set that serves as the reference data;

• A Moving data set that is transformed by the registration to match the reference data;

• A transformation process, parameterized as a vector of scalar parameters, that positions the moving data in the coordinate frame of the reference data;

• A similarity calculation process that evaluates a user-defined cost function, named the similarity metric, to quantitatively report the accuracy of alignment under a set of transformation parameters; and

• An optimization process that uses a user-selected algorithm, called the optimizer, to compute values of the transformation parameters which produce an optimal alignment (i.e. maximally accurate) between the two data sets.

In practice, the processes of transformation and similarity calculation are usually combined into a single process for better computational efficiency. The objective of registration is to determine the transformation parameters. The problem is often solved iteratively: starting from an initial guess of the transformation parameters, the moving data is first transformed and the similarity metric is calculated; the transformation parameters are then refined by the optimizer such that the similarity metric is improved. The process is repeated, and the final transformation parameters are reported when the similarity metric satisfies some predefined criteria.

Group-wise registration is another type of registration that is widely used in medical image analysis and CAS. It simultaneously aligns more than two data sets in order to obtain an optimal global alignment among all involved data. Group-wise registration can be treated as a special case of pair-wise registration, where all data are collectively treated as the moving data, and the fixed data is unknown in advance but is dynamically determined throughout the registration process.
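The iterative pair-wise work-flow described above (transform, evaluate the similarity metric, let the optimizer refine the parameters) can be sketched on a toy 1D problem. The SSD metric, the hill-climbing optimizer, and all values below are illustrative choices for the sketch, not the methods used in this thesis:

```python
import numpy as np

def register_1d(fixed, moving, t0=0.0, step=1.0, iters=50):
    """Toy pair-wise registration: find the shift t that best aligns
    `moving` to `fixed` under a sum-of-squared-differences metric."""
    def ssd(t):
        shifted = np.roll(moving, int(round(t)))       # transformation process
        return float(np.sum((fixed - shifted) ** 2))   # similarity metric
    t = t0
    for _ in range(iters):                             # optimization process
        t = min([t - step, t, t + step], key=ssd)      # greedy local refinement
    return t

signal = np.zeros(64)
signal[20:30] = 1.0
fixed = signal
moving = np.roll(signal, -5)       # moving data is displaced by -5 samples
print(register_1d(fixed, moving))  # recovers a shift of +5
```

Real 2D-3D registration replaces the shift by a six-parameter rigid pose and the SSD by a metric computed between DRRs and fluoroscopic images, but the loop structure is the same.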
Group-wise registration is usually implemented using special algorithms that iteratively optimize the registration parameters and the fixed data, but it can also be implemented using a series of pair-wise registrations.

Registration has been used in a wide range of applications. It is one of the underpinning technologies in computer-assisted surgery that enables non-invasive or minimally invasive medical procedures. Registration techniques can be classified according to different criteria, the most common being the modalities of the individual data (single-modality versus multiple-modality), the number of data sets to be aligned (pair-wise versus group-wise), the nature of the transformation that positions the data in the reference coordinate frame (rigid, affine or deformable), the dimensionalities of the individual data (2D-2D, 3D-3D or 2D-3D), and the nature of the optimization algorithm (analytic versus iterative). Comprehensive surveys on pair-wise registration can be found in the literature [17, 43, 48, 61, 69, 128], and several methods on group-wise registration have also been reported [4, 10, 66, 71, 81, 100, 109, 130].

Figure 2.3: Key components and work-flow of DRR-based 2D-3D registration.

2.2.2 2D-3D Registration

2D-3D registration is a special type of pair-wise registration. It is common practice to register the 3D data to the 2D data; that is, the 2D and 3D data are used as the fixed and moving data, respectively. Traditionally, the 3D moving data is a single 3D image; however, in certain applications such as CAOS for fracture treatment, there is an increasing need for 3D data that consists of multiple 3D data members, such as 3D images of multiple fracture fragments or multiple anatomical models. When multiple 3D data members are involved, the registration is called multiple-object 2D-3D registration.
As 3D data contains significantly more information than 2D data, the 2D fixed data usually consists of a set of co-registered 2D images rather than a single 2D image. The term “co-registered” means that the 2D images are in the same coordinate frame, taken from the same subject but from different viewing directions.

2D data and 3D data cannot be directly compared, so special processing is needed when defining the similarity metric. Two solutions have been used to deal with this issue: either constructing an intermediate 3D data set from the 2D data [90, 107, 126], or generating an intermediate 2D data set from the 3D data [9, 13, 34, 50, 55, 56, 112].

In the first solution, an intermediate 3D data set is computed from the 2D fixed data, and the similarity metric is defined between the 3D moving data and the intermediate 3D data. This solution requires some prior knowledge to reconstruct the intermediate 3D data from the 2D fixed data, because the 2D fixed data itself may not contain enough information to perform an accurate reconstruction. As the similarity calculation is performed in 3D, the computation cost is also increased. Because of these limitations, only a small number of methods have used this solution.

A more popular solution is to compute an intermediate 2D data set from the 3D moving data and define the similarity metric between two sets of 2D data. The intermediate 2D data can be obtained by simulating the X-ray imaging process on the 3D moving data with the same imaging parameters that produced the 2D fixed data. For each fluoroscopic image in the 2D fixed data, a corresponding 2D projection image, known as a digitally reconstructed radiograph (DRR), is synthesized.
As this solution depends tightly on intermediate DRRs, and the similarity calculation is performed on raw intensities or on features derived from the intensities (such as the image gradient), 2D-3D registration based on this scheme is also called DRR-based or intensity-based 2D-3D registration, or gradient-based 2D-3D registration if the image gradient is primarily used. Fig. 2.3 shows an updated diagram for DRR-based 2D-3D registration.

In the subsequent sections, each of the components in 2D-3D registration is briefly described. As comprehensive reviews on 2D-3D registration can be found in several publications [70, 114, 129], the goal in this chapter is not to give a detailed review, but to highlight the technologies and problems that are relevant to this thesis. In each of the following chapters, further references that are relevant to the particular chapter are provided.

2.3 2D and 3D Data in CAOS

As mentioned in Section 2.1.3, the most important and commonly used imaging modalities in CAOS are 2D fluoroscopy and 3D CT. They are summarized below to show that the data itself can be an important source of errors in 2D-3D registration.

2.3.1 X-ray fluoroscopy

X-ray imaging has been used for medical applications for over a century. X-rays are a form of high-energy electromagnetic radiation. When the radiation travels through the human body, the rays interact with the tissue layers and are gradually attenuated. The types of interaction include photoelectric absorption, Rayleigh scattering and Compton scattering, and the combined interaction is quantified by the attenuation coefficient. Different tissues have different attenuation coefficients. After penetrating the human body, the attenuated rays hit an X-ray sensitive detector and are converted into a visible projection image. Traditionally, the projection image was recorded onto film, which is not a suitable medium for computer-assisted applications.
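The attenuation process just described can be illustrated numerically with a discrete form of the exponential attenuation law, where the line integral of the attenuation coefficient is approximated by a sum over samples along the ray. The coefficients and step size below are illustrative values, not tissue measurements:

```python
import numpy as np

def attenuate(I0, mu, ds):
    """Discrete attenuation along one ray: I1 = I0 * exp(-sum_i mu_i * ds)."""
    return I0 * np.exp(-np.sum(mu) * ds)

# Illustrative attenuation coefficients (1/cm) sampled along a single ray
mu = np.array([0.2, 0.2, 0.5, 0.5, 0.2])
I1 = attenuate(1000.0, mu, ds=0.5)   # 0.5 cm sampling step
print(round(I1, 1))                  # roughly 45% of the input energy remains
```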
Nowadays, electronic media are used and the image is instantly displayed on a monitor. This type of X-ray device is commonly known as fluoroscopy, or C-arm, because the detector and radiation source are mounted on the two ends of a C-shaped frame.

Let $I_0$ be the initial energy of an X-ray $x$ at the source point, $I_1$ be the attenuated energy at the detector plane, and $A$ and $B$ be the incident and exit points of the ray through the tissue. Then the following relationship between the two energies exists:

$I_1 = I_0 \, e^{-\int_A^B \mu(s)\,ds}$, (2.1)

where $s$ is a point on the ray $x$, $\mu(\cdot)$ is the tissue absorption coefficient at the given point, and $ds$ is an infinitesimal length along the ray. A fluoroscopy device measures the photon fluence (determined by $I_1$) reaching the detector plane, and the quantized pixel value is determined by the physical characteristics of the imager.

Eq. (2.1) describes the X-ray imaging process for a monochromatic X-ray beam. In practice, the X-ray beams used in fluoroscopy devices are polychromatic and have a moderately broad energy spectrum. When such a beam travels through a tissue, X-rays with different energies are attenuated differently: some are more easily attenuated (called soft X-rays), and some are more penetrating (called hard X-rays). The net effect is that the total amount of attenuation is determined by the tissue thickness and the beam spectrum, and the thicker the tissue, the more penetrating the remaining beam becomes. This phenomenon is called beam hardening, and it can cause artefacts in recorded images [97].

Fluoroscopic images are characterized by a small field of view and generally exhibit geometric and intensity distortions. The distortions vary for different imaging orientations, because C-arm devices are heavy objects and rotating the C-arm causes deformations in the shape of the C-arm frame. Newer C-arm devices can produce
images with less distortion, but they are expensive and not yet widely available in the OR. For traditional C-arm devices, it has been reported that a different distortion correction is necessary for every imaging orientation [46].

Two approaches have been proposed for correcting C-arm distortions: offline calibration [121] and online calibration [60, 63, 102, 104]. The offline approach computes the calibration parameters for a fixed set of C-arm orientations. It produces cleaner images but has the main drawback that the available imaging orientations are limited. The more popular approach is online calibration, which computes the calibration parameters for every captured image. This is usually realized by mounting a two-layer calibration drum on top of the C-arm intensifier. Each of the layers contains a number of radio-opaque markers with different diameters (usually two) and known geometrical configurations. When an image is captured, some of the markers on each layer are detected in the image, and subsequently used to correct the geometric distortions as well as to compute the X-ray source location. To detect the markers in the image, prior knowledge of the marker configurations is utilized. Fig. 2.4 shows a C-arm calibration drum.

Figure 2.4: An example of a calibration drum for correcting C-arm distortions [60].

Figure 2.5: An example of X-ray fluoroscopy before (left) and after (right) calibration.

While the use of a calibration drum provides a good trade-off between calibration accuracy and device accessibility, it also introduces side-effects. First, the markers on both layers appear in the captured images, and they should be removed and then interpolated before further processing. This process introduces additional noise into the images. Fig.
2.5 shows an example of an X-ray image before and after removing the calibration markers, where the introduced noise is small but still visible. Second, the calibration requires enough markers to be detected, which may not be possible for some imaging orientations due to occlusions between the markers and the imaged tissues. Furthermore, the drum is not only used for calibration, but also to report the imaging orientations, which means that the drum must be visible to the camera in order to produce valid images. However, this is often a problem in the crowded OR environment, so the use of a calibration drum also limits the available imaging orientations. For the experiments in this thesis, a commercial product was used to calibrate the acquired X-ray images, and the calibration results were dependent on various factors such as the source-to-object distance, the imaging parameters, the type of bones, and so on.

2.3.2 Computed Tomography (CT)

CT is a 3D X-ray modality that is reconstructed from 2D X-ray projections. It generates cross-sectional projection images of the human body, and a 3D tomographic data set is then computed from the projection images using the inverse Radon transform [106]. The projection images are acquired by rapidly rotating the X-ray tube around the patient, and the transmitted radiation is measured using an array of X-ray detectors that are mounted on the device gantry.

There are two basic types of CT machines: single-slice CT and spiral CT. Traditionally, the X-ray source rotates through 360° within the gantry and the patient table is moved through the X-ray beam in discrete steps. At each table position, one image slice is acquired. In modern CT scanners, the X-ray source generates a fan beam of X-rays, and multiple detectors are used to record the image data simultaneously. Compared with the first-generation CT scanners that use a single X-ray beam, this design greatly improves the image acquisition speed.
In spiral CT scanners, the X-ray source continuously rotates within the gantry while the patient table is moved through the X-ray beam at a constant speed, so the radiation passing through the patient takes on a spiral or helical form. As a continuous volume is acquired in one go, the image acquisition speed is significantly improved compared with the conventional single-slice scanners. To further improve the scanning efficiency, multi-slice spiral CT scanners have been developed. In such devices, an array of detectors in the z direction is used and multiple spiral slices are acquired simultaneously. As anatomical regions of interest can be imaged within a single breath hold, possible artefacts due to patient movement are reduced.

CT imaging is considered to be geometrically accurate, so user calibrations are usually not necessary. However, it can exhibit intensity artefacts when metallic objects are present in the field of view. These artefacts are the result of reconstruction from corrupted projection data, caused by the X-rays being greatly attenuated by the metal. There are a number of approaches to metal artefact reduction, including the use of higher-energy X-ray beams and interpolation of the missing projection data. CT imaging can also present artefacts caused by beam hardening; however, most recent CT scanners correct such artefacts internally when CT images are acquired. CT is mainly a preoperative modality, but intraoperative CT is also available. Similar to X-ray fluoroscopy, the main drawback of CT imaging is the ionizing radiation.

2.3.3 Anatomical Atlases

An anatomical atlas [20], or statistical shape model, is a special type of data set that is generalized from a set of images or anatomical models. It captures, in a compact form, the mean and variability of an anatomy within a population of subjects or across multiple studies of the same subject over time.
It not only represents the subjects that are used to construct the atlas, but can also predict the shapes of unknown new subjects. In registration applications, this property is very useful because an unknown atlas instance can be used in place of a missing data set, and the concrete shape of the instance can be dynamically determined during the registration.

An atlas of a particular anatomy is constructed from a set of subjects of the anatomy called training examples. The training examples are geometry or intensity models of the anatomy, so the quality of the anatomical modelling is a key factor that affects the quality of the constructed atlas. When geometry models are used, the model representations are more compact, and the segmentation error introduced during model construction is a main quality factor for the constructed atlas. When intensity models are used, accurate and efficient representations are needed to model both geometry and intensity, so the selection of an appropriate model representation is a key factor. Another important factor that affects the quality of an atlas is the set of training examples used to construct it. They need to be diverse enough to cover all possible shapes, and should include minimal artefacts in shape or intensity.

2.4 Transformations in 2D-3D Registration

2D-3D registration in CAOS involves a number of different coordinate frames. Fig. 2.6 illustrates the involved coordinate frames and their relationships for a typical multiple-object 2D-3D registration. The primary coordinate frames include patient, camera, C-arm intensifier, fluoroscopic image, and 3D data. If the 3D data is a collection of 3D objects, then there are additional coordinate frames for the member objects. The available coordinate frames can be split into two groups: those associated with the fixed data, and those associated with the moving data.
The goal of registration is to establish a link between the two groups of coordinate frames or, specifically, to find a spatial transformation that places the moving data into the coordinate frame of the fixed data.

Figure 2.6: Transformations in a multiple-object 2D-3D registration.

2.4.1 Transformations within the Fixed Data

In general, the patient coordinate frame is used as the reference coordinate frame of the fixed data, and it is defined by a DRB that is mounted on the patient. For a captured fluoroscopic image $i$, its pose in the reference coordinate frame can be written as a concatenation of three transformations:

$(T_{fluoro}^{patient})_i = (T_{patient}^{camera})^{-1} \, (T_{intensifier}^{camera})_i \, (T_{fluoro}^{intensifier})_i$, (2.2)

where $i = 1...N$ and $N$ is the number of fluoroscopic images. In the above equation, $T_{patient}^{camera}$ and $(T_{intensifier}^{camera})_i$ are reported by the camera, and respectively represent the poses of the patient and the C-arm intensifier (or calibration drum) in the coordinate frame of the camera. $(T_{fluoro}^{intensifier})_i$ is the conversion from the X-ray image coordinate frame (usually defined with respect to the X-ray source) to the intensifier coordinate frame (usually defined with respect to the center of the detector plane). This transformation consists of two parts: a constant part that is known once the two involved coordinate frames are determined, and a variable part that is reported by the calibration procedure to compensate for the X-ray source deviation at the current imaging orientation. The transformations within the fixed data are shown in blue dotted lines in Fig. 2.6.

2.4.2 Transformation of the Moving Data

Depending on the type of the moving data, the transformation that is computed by registration has different forms.
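Pose concatenations such as Eq. (2.2) are conveniently carried out with 4 × 4 homogeneous matrices. The sketch below chains illustrative (made-up) camera and calibration poses to obtain the image pose in the patient frame:

```python
import numpy as np

def rigid(rz_deg, t):
    """4x4 homogeneous transform: rotation about z by rz_deg, then translation t."""
    a = np.radians(rz_deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]]
    T[:3, 3] = t
    return T

# Illustrative poses (made-up values, units in mm)
T_patient_cam   = rigid(0,  [100, 0, 0])    # patient DRB in the camera frame
T_intens_cam    = rigid(30, [200, 50, 0])   # intensifier in the camera frame
T_fluoro_intens = rigid(0,  [0, 0, -1000])  # image frame in the intensifier frame

# Eq. (2.2): pose of fluoroscopic image i in the patient frame
T_fluoro_patient = np.linalg.inv(T_patient_cam) @ T_intens_cam @ T_fluoro_intens
print(np.round(T_fluoro_patient[:3, 3], 1))
```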
A single 3D object as the moving data

When the moving data is a single 3D image, the transformation is a single rigid transformation which brings the 3D image into the patient coordinate frame:

$T(\cdot\,; \theta) = T_{mdata}^{patient}$, (2.3)

where $\theta$ is a vector of scalars which represents a user-selected parametrization of the rigid transformation. If the 3D image is not available for some reason, it is common practice to use an anatomical atlas as a replacement. In such cases, an additional transformation needs to be determined, and the general transformation is composed as:

$T(\cdot\,; \theta, \theta_{atlas}) = T_{mdata}^{patient} \, T_{atlas}^{instance}$, (2.4)

where $T_{atlas}^{instance}$ models the process of producing instances from the mean shape of the atlas. It is a deformable transformation represented by parameters $\theta_{atlas}$, derived from a Principal Component Analysis (PCA) of the anatomical shape and/or intensity variations.

Multiple 3D objects as the moving data

For moving data that contains multiple 3D images, the general transformation consists of a global transformation and a set of local transformations:

$T_{global}(\cdot\,; \theta_g) = T_{mdata}^{patient}$; (2.5)

$T_{local}(\cdot\,; \{\theta_k\}) = \{(T_{CT}^{mdata})_k\}, \quad k = 1...M$. (2.6)

$T_{mdata}^{patient}$ is a rigid transformation which brings the moving data as a whole into the patient coordinate frame, $\{(T_{CT}^{mdata})_k\}$ is a set of rigid transformations that position the individual member objects within the moving data, and $M$ is the number of member objects in the moving data. When implementing the registration algorithm, the global transformation can be merged into the individual local transformations to reduce the number of parameters to be estimated. However, having a separate global transformation can improve the performance and robustness of the registration. Fig. 2.6 illustrates such a case, and the transformations to be computed are shown in red dotted lines.
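A minimal sketch of the global/local decomposition in Eqs. (2.5)-(2.6), again with 4 × 4 homogeneous matrices and made-up poses for two fragments (rotations are omitted for brevity):

```python
import numpy as np

def translate(t):
    """4x4 homogeneous translation (rotation parts omitted for brevity)."""
    T = np.eye(4)
    T[:3, 3] = t
    return T

T_global = translate([5.0, 0.0, 0.0])     # moving data as a whole -> patient frame
T_local = [translate([0.0, 10.0, 0.0]),   # fragment 1 within the moving data
           translate([0.0, -10.0, 0.0])]  # fragment 2 within the moving data

# Effective patient-frame pose of each member object k
poses = [T_global @ Tk for Tk in T_local]
print([list(p[:3, 3]) for p in poses])
```

Merging the global transformation into the locals would replace each local matrix by the product above, trading fewer parameters against the robustness benefit noted in the text.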
When an atlas is used instead of a set of 3D images, each member object is now a dynamic instance, and the corresponding local transformation needs to be extended to include the process of instance generation from the atlas. The updated general transformation can be written as:

$T_{global}(\cdot\,; \theta_g) = T_{mdata}^{patient}$; (2.7)

$T_{local}(\cdot\,; \{\theta_k\}, \theta_{atlas}) = \{(T_{instance}^{mdata})_k \, (T_{atlas}^{instance})_k\}, \quad k = 1...M$. (2.8)

For a given member object $k$, the deformable PCA transformation $(T_{atlas}^{instance})_k$ produces an instance from the atlas, and the rigid transformation $(T_{instance}^{mdata})_k$ positions the instance within the moving data.

2.4.3 Parametrization of the Output Transformation

The general transformation of the moving data is the main output of the registration. It needs to be parametrized before the optimization algorithm can search for a solution. Depending on the optimization algorithm being used, the parametrization method can be a key factor that affects the performance and robustness of the optimization process. In general, a good parametrization has a small number of parameters, little ambiguity (e.g., parameters that are orthogonal or independent of each other), similar dynamic ranges for all parameters, uniform behaviour across all regions of the parameter domain, and so on.

As described in the previous section, the general transformation to be computed in this thesis may involve two types of transformations: 3D rigid transformations of rigid bone fragments, and 3D deformable transformations of a statistical atlas derived from PCA-based parametrization of the bone shapes in a population. While deformable transformations have a unique way of PCA-based parametrization, the parametrization of rigid transformations has various forms. A rigid transformation with parameters $\theta$ can be decomposed into two consecutive sub-transformations: a
translation with parameters $\theta_t$, and a rotation with parameters $\theta_r$. The parametrization of the translation is simply a three-component vector that describes the offsets with respect to the three coordinate axes, that is, $\theta_t = (t_x, t_y, t_z)$. For the rotation sub-transformation, a number of parametrizations exist, the most general being the $3 \times 3$ rotation matrix. This section discusses several parametrizations of the rotation that are commonly used in 2D-3D registration.

Euler Angles

Euler angles [57, 54] represent the rotation using a three-component vector $\theta_r = (r_x, r_y, r_z)$, where $r_x$, $r_y$ and $r_z$ are the rotation angles of three sequential rotations around each of the coordinate axes. The relationship between this representation and the rotation matrix can be seen by writing each of the individual rotations as a matrix, and then composing those matrices. The individual matrices can be composed in different orders; however, their impact on optimization is not significant. A rotation represented using Euler angles has a minimal number of parameters, so it is an efficient representation for optimization. Another advantage of this parametrization is that, in most medical applications, the values of the rotation and translation parameters have similar ranges if the angles are in degrees and the translations are in millimetres. This is a preferred property for many optimization algorithms such as Gradient Descent and Downhill-Simplex. The drawback of using Euler angles is that the angles are coupled to each other; that is, for a given rotation, there are multiple Euler angle representations that produce the same rotation. This ambiguity can cause a problem known as “gimbal lock” [119], which is a loss of one rotational degree-of-freedom (DOF) when certain parameter values of the representation are encountered.
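The gimbal-lock ambiguity can be demonstrated directly. With a Z-Y-X composition (one of several possible orders), setting the middle angle to 90 degrees makes two different parameter triples produce the identical rotation matrix:

```python
import numpy as np

def euler_zyx(rx, ry, rz):
    """Rotation matrix from Euler angles (degrees), composed as Rz @ Ry @ Rx."""
    x, y, z = np.radians([rx, ry, rz])
    Rx = np.array([[1, 0, 0], [0, np.cos(x), -np.sin(x)], [0, np.sin(x), np.cos(x)]])
    Ry = np.array([[np.cos(y), 0, np.sin(y)], [0, 1, 0], [-np.sin(y), 0, np.cos(y)]])
    Rz = np.array([[np.cos(z), -np.sin(z), 0], [np.sin(z), np.cos(z), 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Gimbal lock: at ry = 90 degrees, the rx and rz rotations act about the
# same axis, so one rotational DOF is lost.
A = euler_zyx(30, 90, 0)
B = euler_zyx(0, 90, -30)
print(np.allclose(A, B))   # True: two different triples, one rotation
```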
Unit Quaternion and Angle-Axis

The unit quaternion [54] was formulated to overcome the “gimbal lock” problem of the Euler angle representation. In this parametrization, the rotation is represented using a unit quaternion $\theta_r = (X, Y, Z, W)$, a four-element vector of unit magnitude. Three of the elements, $(X, Y, Z)$, determine the rotation axis, and the fourth, $W$, determines the rotation angle about that axis. The use of unit quaternions avoids the “gimbal lock” problem because each rotation can be represented by an unambiguous quadruple whose parameters are independent of each other. Unit quaternions also have other nice properties, such as easy composition and differentiation and the ability to perform smooth interpolation between any two rotations. As the unit quaternion has four parameters of two types (i.e. axis and angle), it is slightly more expensive to optimize than Euler angles and, in order to achieve good optimization performance, it is preferable to use dedicated optimization algorithms that can exploit the special properties of the unit quaternion.

Another representation that closely relates to the unit quaternion is Angle-Axis [118]. The two representations are essentially the same, but Angle-Axis explicitly represents the rotation angle using degrees or radians. This representation shares the same advantages and disadvantages as the unit quaternion.

Versor

The versor [45] is derived from the unit quaternion, and can be written as $\theta_r = (v_x, v_y, v_z)$. It encodes the rotation angle into the rotation axis by scaling the rotation axis with a factor equal to the sine of half the rotation angle. The advantage is that the number of parameters to be optimized is reduced by one, and general optimization algorithms work well with this representation because all parameters are in the same range.
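As an illustration of the relationship between the two representations, the sketch below builds a unit quaternion from an axis and an angle (using the standard half-angle convention) and shows that the versor's dropped scalar part is recoverable for rotation angles up to 180 degrees:

```python
import numpy as np

def quat_from_axis_angle(axis, angle_deg):
    """Unit quaternion (X, Y, Z, W): vector part sin(a/2)*axis, scalar cos(a/2)."""
    axis = np.asarray(axis, float) / np.linalg.norm(axis)
    half = np.radians(angle_deg) / 2.0
    return np.append(np.sin(half) * axis, np.cos(half))

def versor_from_quat(q):
    """Versor: keep only the vector part; W follows from the unit constraint."""
    return q[:3]

q = quat_from_axis_angle([0, 0, 1], 90)  # 90-degree rotation about z
v = versor_from_quat(q)
W = np.sqrt(1.0 - v @ v)                 # recover the dropped scalar part
print(np.allclose(W, q[3]))              # True
```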
During optimization, however, changing one parameter simultaneously changes the rotation axis and the rotation angle, which may not be a desired behaviour for some applications.

Spherical

The spherical representation [3] describes a rotation using three angles $\theta_r = (\alpha, \beta, \gamma)$. It can be seen as a variant of the versor or unit quaternion, where $\alpha$ and $\beta$ represent the rotation axis in the spherical coordinate system, and $\gamma$ represents the rotation angle about that axis. This representation removes the coupling between the axis and angle in versors while maintaining three parameters. Similar to Euler angles, the use of angles can benefit the optimization algorithms because the rotation and translation parameters can be scaled to have the same range. One drawback of this representation is that, because a spherical coordinate system is used, the optimization does not behave uniformly across all areas of the parameter domain.

2.5 DRR Generation

DRR generation is the operation of simulating the X-ray imaging process on 3D images to produce a 2D X-ray view of the 3D data. There are two key requirements for DRR generation. Firstly, it must be fast enough, as many algorithms depend heavily on dynamically generated DRRs. Secondly, the generated DRRs must closely resemble the real X-ray images so that DRR generation is not a major source of errors. For many years, the generation of realistic DRRs has been the performance bottleneck for a large number of 2D-3D registration methods. Recent advancements in graphics processing units (GPUs) have greatly improved the speed of DRR generation; however, the accuracy of the generated DRRs still needs to be improved. In many publications, DRR generation is also known as volume rendering. It should be noted that volume rendering has a broader meaning and embraces a variety of rendering techniques beyond X-ray imaging simulation.
This section provides an overview of the commonly used DRR generation techniques.

2.5.1 Ray-casting

As CT and fluoroscopic images are generated with different X-ray energy spectra, which are usually unknown to the user, it is difficult to exactly simulate the X-ray imaging process (Eq. 2.1) on a CT image. Instead, approximate methods are used. One commonly used method is ray-casting (sometimes known as ray-tracing, though the latter is more general and more complex), which is defined as follows:

I_x = \sum_{i=0}^{n} C_i \alpha_i \prod_{j=0}^{i-1} (1 - \alpha_j),  (2.9)

where x is the X-ray passing through the tissue, n is the number of voxels on the ray, C_i is the CT value in Hounsfield units at the i-th voxel along the ray, and \alpha_i is the opacity at the corresponding voxel. The opacities for individual voxels are used to simulate the tissue absorption coefficients from the CT numbers, and their values are assigned by using a transfer function. The selection of an appropriate transfer function is very important for generating realistic DRRs, and a few functions have been suggested [73].

The ray-casting approach can produce highly realistic DRRs. However, direct implementation of Eq. (2.9) on the CPU is time-consuming due to expensive operations such as interpolation and iteration. Even with recent quad-core CPUs, it takes more than one second to compute a DRR of moderate resolution (such as 256 × 256) from a CT image of typical resolution (such as 512 × 512 × 200), a speed not appropriate for interactive use. To improve the performance of ray-casting, several methods have been proposed. One method accelerates the ray-tracing process by applying several techniques [125]: replacing most floating-point computations with integer operations; removing non-interesting voxels; and early ray termination.
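The per-ray accumulation of Eq. (2.9) can be sketched as follows (a minimal illustration, assuming the voxels along a ray have already been sampled and using a hypothetical linear-ramp transfer function; the front-to-back traversal makes early ray termination straightforward):

```python
import numpy as np

def transfer_function(ct_values, lo=300.0, hi=2000.0):
    """Hypothetical linear-ramp transfer function: CT numbers (HU) -> opacities."""
    return np.clip((ct_values - lo) / (hi - lo), 0.0, 1.0)

def composite_ray(ct_values):
    """Evaluate Eq. (2.9) for one ray: I_x = sum_i C_i*a_i * prod_{j<i} (1 - a_j)."""
    alphas = transfer_function(ct_values)
    intensity, transmittance = 0.0, 1.0
    for c, a in zip(ct_values, alphas):
        intensity += c * a * transmittance   # contribution of voxel i
        transmittance *= (1.0 - a)           # attenuation by voxel i
        if transmittance < 1e-4:             # early ray termination
            break
    return intensity
```

A full DRR would evaluate this loop once per pixel, over the ray cast from the X-ray source through that pixel, which is why the cost grows quickly with image resolution.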
The method can improve the computation speed by up to a few times with negligible compromise in DRR quality, but the magnitude of the improvement largely depends on the image contents and is still not significant for registration applications. Another method simulates the ray-casting process using graphics hardware [98]. It takes advantage of the GPU's parallel rendering mechanism, and ray-casting is implemented using GPU shader programs such as DirectX Pixel Shader 3.0 and NVIDIA Fragment Program 2. With recent consumer-grade GPUs, the speed improvement with respect to the original CPU-based method can be up to 100-200 times while maintaining good DRR quality. However, using shader programs for such a task is becoming outdated now that the more powerful GPGPU (General-Purpose computation on Graphics Processing Units) techniques, such as CUDA [2] and OpenCL [99], have become available. In the latest improvement [96], ray-casting is implemented on the GPU using the CUDA technology. Compared with the older GPU-based ray-casting method, the new method marginally improved the DRR generation speed and quality, but it significantly simplified the implementation and brought great potential for further improvements with future GPUs.

2.5.2 GPU-based Texture Mapping

Texture mapping is a technique widely used in computer graphics. It maps a bitmap image, called a texture, to a polygon, which is usually an expensive operation that involves interpolation. Most recent GPUs provide hardware support for texture mapping as well as the alpha-blending operation that combines two bitmaps. These two features can be used together to accelerate the DRR generation process. The process generally involves three steps: 1) slice the bounding box of the 3D image into polygons along a particular direction; 2) map each polygon with the corresponding 2D texture taken from the 3D image; and 3) blend all textured polygons into a final image.
The first step is often done by the CPU and the remaining steps are done by the GPU. Texture mapping can be viewed as an iterative implementation of the ray-casting process (Eq. 2.9). It starts from the furthest voxel (with respect to the source) on the ray, and recursively accumulates the attenuation coefficients (simulated from the CT numbers and opacities) of the voxels towards the source:

I_x^{(n)} = 0,  I_x^{(i)} = C_i \alpha_i + (1 - \alpha_i) I_x^{(i+1)},  i = n-1, \ldots, 0,  (2.10)

where C_i and \alpha_i are the CT value and opacity of voxel i on the ray x, and the opacities are specified by the user via a transfer function.

Early graphics hardware supports only 2D textures. In this case, the 3D image is sliced along each of the three main axes, and each slice is stored as a 2D texture. When a DRR is requested, the image axis that is closest to the viewing direction is selected, and texture mapping and blending are performed along that direction. This technique is called object-aligned texture mapping and has a major drawback: if the angle between the slicing and viewing directions is too large, artefacts will appear. Recent GPUs support 3D texture mapping, in which case the entire 3D image is stored in texture memory, and texture mapping and blending are performed along the viewing direction. Compared with the 2D texture mapping technique, the difference is that the 2D slices are now constructed on-the-fly from the 3D image, and therefore can be oriented perpendicular to the viewing direction, resulting in an image with fewer artefacts. This technique is called view-aligned texture mapping and is currently the most popular technique for DRR generation. The performance of 3D texture mapping does not depend on the contents of the 3D image, which is a good property if constant computation time is important to the application. However, for 3D images that contain many empty voxels, much of the computation power is wasted.
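The back-to-front recurrence of Eq. (2.10) produces exactly the same result as the front-to-back sum of Eq. (2.9); a minimal sketch (opacities supplied directly rather than through a transfer function, function names hypothetical):

```python
import numpy as np

def composite_back_to_front(ct_values, alphas):
    """Eq. (2.10): I^(n) = 0; I^(i) = C_i*a_i + (1 - a_i) * I^(i+1)."""
    intensity = 0.0
    for c, a in zip(reversed(ct_values), reversed(alphas)):
        intensity = c * a + (1.0 - a) * intensity
    return intensity

def composite_front_to_back(ct_values, alphas):
    """Eq. (2.9): I = sum_i C_i*a_i * prod_{j<i} (1 - a_j)."""
    intensity, transmittance = 0.0, 1.0
    for c, a in zip(ct_values, alphas):
        intensity += c * a * transmittance
        transmittance *= (1.0 - a)
    return intensity

# The two formulations agree for any ray:
c = np.array([100.0, 400.0, 1200.0, 300.0])
a = np.array([0.1, 0.5, 0.9, 0.2])
assert abs(composite_back_to_front(c, a) - composite_front_to_back(c, a)) < 1e-9
```

The back-to-front form needs no running transmittance, which is why it maps naturally onto the GPU's alpha-blending of successive textured slices.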
An improved method, called Adaptive Slice Geometry for Hardware-Assisted Volume Rendering, has been proposed to solve this problem [9]. The method removes all empty voxels and computes axis-aligned bounding boxes (AABBs) for the structures of interest in a pre-processing step. During volume rendering, texture mapping is only applied to polygons obtained by slicing the AABBs. The improvement in computation speed is considerable, as most 3D images contain a certain amount of non-relevant structures.

The slicing operation in texture mapping is usually done on the CPU using computational geometry algorithms. A simple method is to use a sweeping plane to cut the bounding box of the 3D image along the viewing direction, and then compute the intersections as well as the polygons for texture mapping [9]. To improve the slicing performance, a new method was proposed in which the Marching Cubes algorithm [65], along with a special look-up table, is used to aid the computation of the intersection polygons [6].

2.5.3 Other Techniques

Aside from the ray-casting and texture-mapping techniques described above, several other DRR generation techniques are available for registration applications:

Shear-warp. This algorithm [55] transforms the 3D image to an intermediate coordinate system called the "sheared object space". In this space the projective viewing rays are transformed into axis-aligned parallel rays for easy walk-through along the rays. The algorithm involves shearing, scaling and resampling the volume slices. The slices are composed together in front-to-back order, resulting in an intermediate 2D image. The final step is to "warp" this image in order to transform it back to the original image space. The rendering is very efficient, as the voxels in the intermediate slices correspond with the scan-lines of the final image, and can be composed immediately.
On the other hand, this algorithm produces artefacts under certain circumstances, and therefore may be problematic if used for accurate registration.

Splatting. This method [116] is similar to the ray-casting method but uses a different projection scheme. Each voxel of interest is projected onto the 2D viewing plane, and a Gaussian splat is used to approximate the projection result. Then, all resulting splats are composed together in back-to-front order to produce the final image. To improve the computation efficiency, only voxels that effectively contribute to the final image are used during the calculation. This method has better computation performance than the ray-casting method; however, artefacts such as aliasing can exist. Several variants [131, 13, 44, 62, 113, 115] have been proposed to further improve the speed and accuracy of the original splatting method.

Transgraph. Transgraph, or light-field, is a pre-computed data structure that stores the pixel intensities of a large number of DRR rays in an efficient way [56, 93]. Each ray is represented using two points in 3D space, and the associated DRR pixel intensity is computed using the ray-casting function (Eq. 2.9) or any other DRR formulation function. This pre-computation is performed for many different viewing directions around a user-selected reference direction. Rendering of a DRR is then reduced to, for each ray in the output DRR, retrieving the closest rays from the transgraph and computing an interpolated pixel intensity from the retrieved rays. The more pre-computed DRR rays there are, the longer the pre-computation takes (up to several hours or even days), and the more accurate the result. Another limitation of this method is that, when the content of the 3D data changes, the transgraph has to be recomputed.

Cylindrical harmonics.
This technique [112] transforms the 3D image into a cylindrical harmonics representation, which consists of a series of orthonormal cylindrical harmonic basis functions and their corresponding coefficients. A reference projection orientation is then selected, and each of the harmonics is projected along the selected orientation. This produces a set of 2D projections (called harmonic DRRs), whose superposition is the DRR of the 3D image in its reference orientation. When a DRR from an arbitrary direction is requested, the harmonic DRRs are exponentially weighted by the orientation of the requested DRR (represented as the relative angle with respect to the reference orientation), and superposed to produce the output DRR. This method can produce DRRs of good quality (the actual quality depends on the number of harmonics used as well as the 3D image content), and its main advantage is that, once the harmonic DRRs are generated from a chosen reference direction, new DRRs can be quickly obtained by simply superposing the harmonic DRRs. Another advantage is that the number of harmonics used can be truncated so that DRRs can be composed even faster, with a certain trade-off in image quality.

2.5.4 DRR Generation for Multiple Objects

In all of the above-described methods, only a single 3D image is involved during DRR generation. This works fine for most medical applications. However, some applications involve multiple moving objects (as is the case for bone fragments in fracture treatment), where rendering of multiple 3D images or anatomical models can be a key requirement. DRR generation for multiple objects can be done in one of two ways. The first approach is to modify the existing methods developed for a single 3D image to simultaneously handle multiple objects during DRR generation. As different DRR methods have different complexities, the effort required for the modifications also varies.
The advantage of this approach is that DRRs of multiple objects can be produced without trade-offs in image quality. The second approach is to use an existing single-object rendering method to produce DRRs for the individual objects, and then combine the individual object DRRs. This approach is simple to implement, and can take advantage of all existing DRR methods. However, simply combining the individual object DRRs may introduce artefacts into the final DRR, because occlusions among the individual objects can occur. In most existing DRR methods, the rendering is performed along the viewing direction in either back-to-front or front-to-back order, and the accumulation of attenuation coefficients for different objects is not a separable operation. The simplest way to combine the individual object DRRs is to compute their mean. This may blur out some useful structures, but it is the function that minimizes the overall artefacts over all possible DRR directions. Some single-object DRR implementations use order-independent volume rendering (OIVR), in which case the choice of the combination function is not important.

2.6 2D-3D Similarity Metrics

A similarity metric is computed from all the data involved in registration, and is a function of the transformation parameters. It can be single-valued or multiple-valued, and can be intensity-based or feature-based. Feature-based similarity metrics are more efficient to compute; however, accurate feature extraction is necessary, and errors in feature extraction propagate to the similarity measure. Intensity-based similarity metrics are more accurate in general, because raw image intensities provide more information than extracted features; however, computation speed is usually a bottleneck in registration applications that use intensity-based similarity metrics.
In 2D-3D registration, the most commonly used similarity metrics are intensity-based, and they are computed from X-ray images and DRRs. This section gives an overview of such metrics.

2.6.1 Correlation-based Metrics

Correlation is a good metric for intra-modality registration because it is invariant to linear intensity differences between two images. That is, the metric value remains unchanged even if the pixel intensities in one or both of the images are multiplied by a positive constant, or are increased or decreased by a constant.

Normalized Correlation Coefficient. The simplest form of correlation-based metric is the Normalized Correlation Coefficient (NCC) [114]. The NCC of two images is computed by first normalizing each image to have zero mean and unit variance, then multiplying each pixel in one image by the corresponding pixel in the other image, and summing the products. Let A and B be the two images, and Ω be the domain or region-of-interest within which the similarity is calculated; then NCC is defined as

NCC(A, B, \Omega) = \frac{\sum_{p \in \Omega} A(p)B(p) - \frac{1}{|\Omega|} \sum_{p \in \Omega} A(p) \sum_{p \in \Omega} B(p)}{\sqrt{\sum_{p \in \Omega} A(p)^2 - \frac{1}{|\Omega|} \big(\sum_{p \in \Omega} A(p)\big)^2} \, \sqrt{\sum_{p \in \Omega} B(p)^2 - \frac{1}{|\Omega|} \big(\sum_{p \in \Omega} B(p)\big)^2}}.  (2.11)

Sum of Local Normalized Correlation. One problem associated with NCC is that it is not resistant to non-uniform intensity distortions that appear in different areas of the intensifiers. To address this problem, a revised version of NCC, called Sum of Local Normalized Correlation (SLNC), was proposed [57]. The new metric computes a local NCC in a small neighbourhood for each pair of pixels in the two images, and then reports the mean of the local NCC values as the measure. The SLNC for a pair of images A and B with region-of-interest Ω is defined as

SLNC(A, B, \Omega) = \frac{1}{|\Omega|} \sum_{p \in \Omega} NCC(A, B, R(p)),  (2.12)

where R(p) is the small neighbourhood around pixel p, and NCC is defined in Eq. (2.11).
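Eq. (2.11) is equivalent to correlating the mean-subtracted, variance-normalized images; a minimal NumPy sketch (the function name is hypothetical):

```python
import numpy as np

def ncc(a, b):
    """Normalized Correlation Coefficient (Eq. 2.11) of two equally shaped arrays."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    a = a - a.mean()                         # zero-mean
    b = b - b.mean()
    denom = np.sqrt((a * a).sum()) * np.sqrt((b * b).sum())
    if denom == 0.0:                         # constant image: correlation undefined
        return 0.0
    return float((a * b).sum() / denom)

# NCC is invariant to linear intensity changes:
rng = np.random.default_rng(0)
img = rng.random((32, 32))
assert abs(ncc(img, 3.0 * img + 10.0) - 1.0) < 1e-9
```

SLNC (Eq. 2.12) would simply average this function over sliding windows R(p) instead of evaluating it once over the whole region of interest.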
One good by-product of this revision is that the metric calculation can now be optimized for multi-threaded or parallel execution, because the calculation of a local NCC only involves a small number of neighbouring pixels.

Variance-Weighted Correlation. With SLNC, individual local NCC values are equally weighted, which is not reasonable in some situations. For example, if neighbourhood R(p) contains pure background intensities and neighbourhood R(q) contains boundaries of anatomical structures, the two regions are considered equally important when computing the SLNC metric. However, R(q) contains more useful information than R(p) for registration, and should be assigned more weight. To address such concerns, a revision of SLNC, named Variance-Weighted Correlation (VWC), was proposed [56]. In VWC, local NCC values are weighted by the variances of the corresponding regions when composing the final similarity value. One of the two images is chosen as the control image and is used to compute the weights. This modification effectively concentrates attention on those regions of the control image where the signal strengths are high. When the DRR is used as the control image, VWC is especially useful for excluding foreign objects, such as DRBs and surgical tools, that appear only in the X-ray images, since the DRR (as the control image) does not contain the DRBs and tools. Mathematically, VWC is defined as

VWC(A, B, \Omega) = \frac{\sum_{p \in \Omega} C(I, R(p)) \, NCC(A, B, R(p))}{\sum_{p \in \Omega} C(I, R(p))},  (2.13)

C(I, R(p)) = \frac{1}{|R(p)|} \sum_{q \in R(p)} I(q)^2 - \left( \frac{1}{|R(p)|} \sum_{q \in R(p)} I(q) \right)^2,  (2.14)

where I is the selected control image (that is, either A or B), and C(I, R(p)) is the variance of the neighbourhood region of point p in the control image.

Stochastic Rank Correlation. All of the above-described metrics are designed for intra-modality registration problems, where the intensities of corresponding pixels in the two images have a nearly linear relationship.
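The variance weighting of Eqs. (2.13)-(2.14) can be sketched as follows (a minimal, deliberately slow loop-based illustration with hypothetical names; a real implementation would vectorize the sliding windows):

```python
import numpy as np

def local_variances(img, r):
    """Eq. (2.14): variance E[I^2] - (E[I])^2 over each (2r+1)x(2r+1) window."""
    h, w = img.shape
    win = 2 * r + 1
    out = np.empty((h - 2 * r, w - 2 * r))
    for i in range(h - 2 * r):
        for j in range(w - 2 * r):
            out[i, j] = img[i:i + win, j:j + win].var()
    return out

def vwc(a, b, control, r=2):
    """Eq. (2.13): local NCCs weighted by the local variance of the control image."""
    h, w = a.shape
    win = 2 * r + 1
    weights = local_variances(control, r)
    num, den = 0.0, 0.0
    for i in range(h - 2 * r):
        for j in range(w - 2 * r):
            pa = a[i:i + win, j:j + win].ravel()
            pb = b[i:i + win, j:j + win].ravel()
            pa, pb = pa - pa.mean(), pb - pb.mean()
            norm = np.sqrt((pa * pa).sum() * (pb * pb).sum())
            if norm == 0.0:
                continue                      # flat patch: local NCC undefined
            num += weights[i, j] * ((pa * pb).sum() / norm)
            den += weights[i, j]
    return num / den if den > 0 else 0.0
```

Low-variance (background) windows of the control image contribute almost nothing to the sum, which is the intended "attention" effect described above.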
To use the correlation technique in other registration problems, such as inter-modality registration and DRR-based 2D-3D registration with low-quality DRRs, Stochastic Rank Correlation (SRC) was proposed [14]. This metric calculates the correlation on the intensity ranks of the two images instead of the raw intensities. For each of the two images, the pixels are sorted according to their intensity values, and the rank of an intensity value is computed as the mean index of all pixels with that intensity value. Let ρ_A(p) and ρ_B(p) be the intensity ranks of the same pixel p in the two corresponding images A and B; then the SRC metric is defined as

SRC(A, B, \Omega) = 1 - \frac{6 \sum_{p \in \Omega} (\rho_A(p) - \rho_B(p))^2}{|\Omega|(|\Omega|^2 - 1)},  (2.15)

where Ω is a mask that indicates the pixels of interest. The mask is randomly generated by uniformly sampling the fixed image domain, and is used mainly to improve the computation performance. As the similarity measure is computed from intensity ranks instead of original intensities, this metric is robust against intensity non-linearity between the two images, and against outliers that appear in only one image.

2.6.2 Information-theory Metrics

A group of similarity metrics is based on information theory [87]. These metrics evaluate the amount of information contained in the joint intensity distribution of two images. Let X and Y be the variables representing intensities in two correlated images; then the joint intensity distribution of the two images is defined as the probability of X and Y co-occurring at the same pixel location. It contains the most information when the two images are aligned, and the least information when they are completely independent.

Entropy-based metrics. One commonly used function to measure the amount of information within a message is the Shannon entropy.
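Eq. (2.15) is the Spearman rank-correlation formula evaluated on a random subset of pixels; a minimal sketch (hypothetical names, simple tie handling by averaging):

```python
import numpy as np

def ranks(values):
    """Rank of each value; tied values receive the mean of their sorted indices."""
    order = np.argsort(values, kind="stable")
    r = np.empty(len(values))
    r[order] = np.arange(len(values), dtype=float)
    for v in np.unique(values):               # average ranks over ties
        mask = values == v
        r[mask] = r[mask].mean()
    return r

def src(a, b, n_samples=1000, seed=0):
    """Stochastic Rank Correlation (Eq. 2.15) over a random pixel mask."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(a.size, size=min(n_samples, a.size), replace=False)
    ra, rb = ranks(a.ravel()[idx]), ranks(b.ravel()[idx])
    n = len(idx)
    return 1.0 - 6.0 * np.sum((ra - rb) ** 2) / (n * (n ** 2 - 1))
```

Because only the ordering of intensities matters, any monotone intensity mapping between the two images leaves the metric unchanged, which is the robustness property claimed above.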
The similarity metric that computes the Shannon entropy of the joint distribution of two images is called joint entropy [86]. The computation of joint entropy is straightforward; however, if the two images have no overlap at all, the joint entropy still reports a high response, which is not a desired behaviour. This problem can be solved by using Mutual Information (MI) or Normalized Mutual Information (NMI) [86], which combine the joint entropy with the entropies of the individual images. Let p(a) be the normalized intensity histogram of an image A, and p(a, b) be the joint intensity distribution of two images A and B; then the entropy of a single image, H(A), and the joint entropy of two images, H(A, B), are defined as

H(A) = -\sum_{a \in A} p(a) \log p(a),  (2.16)

H(A, B) = -\sum_{a \in A, b \in B} p(a, b) \log p(a, b),  (2.17)

and the metrics MI and NMI are defined as

MI(A, B) = H(A) + H(B) - H(A, B),  (2.18)

NMI(A, B) = \frac{H(A) + H(B)}{H(A, B)}.  (2.19)

The difference between MI and NMI is that MI computes the difference between the two types of entropies, while NMI computes their ratio. It has been reported [86] that NMI is more stable than MI when the overlapping area between the two images varies.

Combining entropy with spatial information. Entropy-based metrics consider only the statistical properties of the joint histogram and ignore the information of local neighbourhoods. To incorporate useful spatial information such as edges and gradients, a number of extensions of the entropy-based metrics have been suggested. In [91], MI calculations were performed over blocks of pixels in the images. In [75], PCA was performed in order to incorporate local spatial information into MI. In [85, 49], MI was combined with gradient information to obtain new metrics such as Asymmetric Gradient-based Mutual Information and Symmetric Gradient-based Mutual Information.

f-function metrics. Several other metrics are based on the joint distribution but
do not use entropy [87]. Such metrics include V-information, Iα-information, Mα-information, Xα-information and Rα-information. In these metrics, different parameter-controlled functions are used to compute the information contained within the joint distribution image, and each metric aims to solve a specific type of problem.

2.6.3 Metrics using Spatial Information

This type of metric takes into account some kind of neighbourhood information at every pixel location [114]. This can be done by adding all pixel differences within a certain radius, or by calculating gradient images for further examination.

Pattern Intensity (PI). This metric computes the difference between two images, and counts the amount of a special pattern contained in the difference image. When computing the difference, one of the images is dynamically scaled (that is, each registration step has a different scaling factor) such that the difference image has the least contrast. The pattern is defined over a small neighbourhood of radius r for every pixel in the difference image, and its shape is controlled by a constant σ. The metric is defined as follows:

PI(A, B) = \frac{1}{|\Omega|} \sum_{p \in \Omega} \sum_{q \in R(p)} \frac{\sigma^2}{\sigma^2 + (D(p) - D(q))^2},  (2.20)

D = A - sB,  (2.21)

where D is the difference image, s is a dynamically computed scaling factor, and R(p) is a small neighbourhood of point p.

Gradient Correlation (GC). This metric calculates the horizontal and vertical gradient images for each of the two images. Then, normalized correlation is calculated for each pair of horizontal and vertical gradient images. The final value of the metric is computed as the average of the two correlation values.
Let G_{A,x} and G_{A,y} be the two gradient images of image A, and G_{B,x} and G_{B,y} be the two gradient images of image B; then the final value of the metric is computed as

GC(A, B) = \frac{1}{2} \left( NCC(G_{A,x}, G_{B,x}) + NCC(G_{A,y}, G_{B,y}) \right),  (2.22)

where NCC is defined in Eq. (2.11).

Gradient Difference (GD). Similar to GC, this metric also depends on the horizontal and vertical gradient images of the two images. However, instead of computing an NCC value for each pair of gradient images, a difference image is computed for each pair, and the same pattern function as in PI is applied. Let D_x and D_y be the two difference images between the horizontal and vertical gradient images; the metric is defined as follows:

GD(A, B) = \sum_{p \in \Omega} \frac{\sigma_x^2}{\sigma_x^2 + D_x(p)^2} + \sum_{p \in \Omega} \frac{\sigma_y^2}{\sigma_y^2 + D_y(p)^2},  (2.23)

D_x = G_{A,x} - s G_{B,x},  (2.24)

D_y = G_{A,y} - s G_{B,y},  (2.25)

where σ_x² and σ_y² are constants that control the pattern shapes in the horizontal and vertical directions, and s is a dynamically computed scaling factor.

2.7 Optimization Algorithms

Optimization is the process that searches for a value of the transformation parameters such that the similarity metric reaches a pre-defined target. The target can be a minimal, maximal, or constant value. Depending on how the similarity metric is defined, the problem may have a closed-form solution or may have to be solved iteratively. A closed-form solution is available only for a few problems, such as point-based registration with known point correspondences [8, 18]. The vast majority of problems are solved iteratively by starting from an initial guess of the solution and proceeding towards the optimal solution. There are two common ways to drive the optimization process in iterative techniques. When derivative information is not available or is unreliable, the direction to proceed at each iteration step is determined by evaluating the metric function at sampled points in the parameter domain.
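Gradient correlation (Eq. 2.22) can be sketched using simple finite-difference gradient images (hypothetical names; Sobel or other derivative kernels could equally be used, and the NCC helper is the one from Eq. 2.11):

```python
import numpy as np

def ncc(a, b):
    """Normalized correlation coefficient of two equally shaped arrays (Eq. 2.11)."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def gradient_correlation(a, b):
    """Eq. (2.22): average NCC of the horizontal and vertical gradient images."""
    ga_y, ga_x = np.gradient(a.astype(float))   # per-axis central differences
    gb_y, gb_x = np.gradient(b.astype(float))
    return 0.5 * (ncc(ga_x, gb_x) + ncc(ga_y, gb_y))
```

Because gradients discard constant offsets and NCC discards scale, the metric responds to edge alignment rather than absolute intensity agreement.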
On the other hand, the proceeding directions can be obtained more efficiently if derivative information is available. This section gives an overview of the popular optimization techniques [89] that have been used for 2D-3D registration.

2.7.1 Techniques Not using Derivatives

Hill-Climbing. This is the simplest optimization algorithm that uses no derivatives. In each iteration, the parameter value in each dimension is altered by a specific step size in both directions, and the new values of the similarity metric at these positions are calculated. After all 2N neighbours have been evaluated (with N being the number of parameters), the one that improves the metric value most is chosen and set as the base position for the next iteration. If none of the neighbours achieves a better value than the current position, either the step size is downscaled or the algorithm terminates, on the assumption that an optimal position has been found.

Downhill-Simplex. Hill-Climbing requires a large number of function evaluations and is thus not an efficient process. The Simplex algorithm [89] improves on Hill-Climbing by reducing the number of function evaluations. A simplex is the simplest geometric shape, consisting of N + 1 corners in N-dimensional space. A starting simplex is defined at the initial position; next, the metric is evaluated at all corners; then, depending on the results of the metric evaluations, the shape of the simplex is changed. This algorithm is mostly known for its simple and elegant implementation; however, the improvement in computation cost is still not significant.

2.7.2 Techniques using Derivatives

Gauss-Newton. When derivatives are available, Newton's method can be used to find a solution efficiently. At each iteration step, the proceeding direction is computed from the first and second derivatives of the metric. Let µ be the parameters to be optimized; the Newton update can be defined as

\mu_{k+1} = \mu_k - (\nabla^2 f(\mu_k))^{-1} \nabla f(\mu_k),  (2.26)

where ∇f(·)
and ∇²f(·) are the first and second derivatives, respectively. The Newton method is known for its efficiency; however, it is not guaranteed to converge to an optimum. When the second derivatives are not available or are difficult to compute, the Gauss-Newton algorithm can be used, in which the second derivatives are approximated using the Jacobian of the metric function.

Gradient-Descent. This algorithm is similar to Gauss-Newton, but uses the gradient instead of the Jacobian. It approaches a local minimum of the metric function by taking steps proportional to the negative of the gradient at the current position. The Gradient-Descent method can be described as follows:

\mu_{k+1} = \mu_k - \Gamma \nabla f(\mu_k),  (2.27)

where ∇f(·) is the gradient, and Γ is a diagonal scaling matrix that determines the step size at each iteration. This optimization method is guaranteed to converge to a local optimum. However, if the scaling matrix is not appropriately updated during the optimization, convergence can be quite slow. To overcome this problem, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm was developed; at each step, the step size is calculated using an efficiently updated scaling matrix, and a more sophisticated search is performed.

Levenberg-Marquardt. This algorithm automatically shifts between the Gauss-Newton and Gradient-Descent methods during execution. It is more robust than Gauss-Newton, meaning that it can find a solution even if the initial guess is far from the final solution. On the other hand, for well-behaved metric functions and reasonable starting positions, the Levenberg-Marquardt algorithm tends to be a bit slower than the Gauss-Newton method. The Levenberg-Marquardt method can be described as follows:

\mu_{k+1} = \mu_k - (H(\mu_k))^{-1} \nabla f(\mu_k),  (2.28)

H(\mu) = \nabla^2 f(\mu)(1 + \lambda \Delta),  (2.29)

where ∇f(·) and ∇²f(·)
are the first and second derivatives, respectively, λ ∈ [0, +∞) is a user parameter that controls the compromise between the Newton method (λ = 0) and the gradient method (λ → +∞), ∆ is a matrix of Kronecker symbols, and H(·) represents a modified Hessian matrix.

2.7.3 Robust and Efficient Optimization

Optimization can be a very time-consuming process, and may frequently become trapped in local minima if the similarity metric is not well defined. A couple of techniques have been used together with the usual optimization algorithms to address these problems.

Multi-resolution Strategy. This technique [68] starts with a fast but coarse estimation of the solution, and gradually refines the solution with more precise but slower estimations. Often a pyramid of down-sampled images at different resolutions is created from the full-resolution images. The first optimization is carried out with the coarsest images at the top of the pyramid, terminating early when the respective stopping criteria are met. The optimization is then restarted from the resulting position using the more precise images down the pyramid hierarchy, this time with smaller tolerance values for termination. The optimization is repeated until the finest-resolution images at the bottom of the pyramid are used. When moving from one level to the next, more information is incorporated into the metric, and the metric becomes more accurate and more specific. A key advantage of this scheme is that the metric shape at low resolutions is smoother and may contain fewer local optima. Therefore, the use of a multi-resolution strategy not only speeds up the registration, but is also a key means of overcoming the problem of local optima.

Simulated Annealing. This is another commonly used technique for avoiding local optima [52].
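The multi-resolution scheme can be sketched as a coarse-to-fine loop over an image pyramid (a minimal illustration with hypothetical names; the per-level optimizer is supplied as a callback, and the tolerance schedule is an arbitrary example):

```python
import numpy as np

def downsample(img):
    """Halve the resolution by 2x2 block averaging (one pyramid level)."""
    h, w = img.shape
    img = img[:h - h % 2, :w - w % 2]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, levels):
    """Return images ordered coarsest first, full resolution last."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr[::-1]

def coarse_to_fine(image, levels, optimize_at_level, x0):
    """Run the optimizer at each level, seeding it with the previous solution
    and tightening the termination tolerance as resolution increases."""
    x = x0
    for lvl, img in enumerate(build_pyramid(image, levels)):
        tol = 10.0 ** -(lvl + 2)              # example schedule: finer level, smaller tol
        x = optimize_at_level(img, x, tol)    # user-supplied optimizer callback
    return x
```

Each coarse solve is cheap (a quarter of the pixels per level) and smooths the metric, so the expensive full-resolution optimization starts close to the optimum.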
The idea is that, at each iteration step, the current solution is replaced by a random "nearby" solution, chosen with a probability that depends on the distance between the current and target metric values, and on a decreasing global parameter. When the current solution is close to a local minimum, the random perturbation of the current solution increases the chance of escaping that local minimum in the next iteration step.

Chapter 3

2D-3D Registration with Unscented Kalman Filter

3.1 Overview

In this chapter, a robust 2D-3D registration method with a wide capture range is presented.1 The method registers preoperatively collected 3D Computed Tomography (CT) data sets of a single bone fragment to its intra-operative fluoroscopic images. The registration technique relies on hardware rendering of CT data on consumer-grade graphics cards to generate digitally reconstructed radiographs (DRRs) in real time. We also employ the Unscented Kalman Filter to solve for the non-linear dynamics governing this 2D-3D registration problem. The method is validated on phantom models of three different anatomies, namely the scaphoid, pelvis and femur. We show that, under the same testing conditions, our proposed technique outperforms the conventional simplex-based method in capture range and robustness while providing comparable accuracy and computation time.

1 This work has been published in Proceedings of MICCAI: R. H. Gong, J. Stewart, and P. Abolmaesumi, "A new method for CT to fluoroscope registration based on unscented Kalman filter", Medical Image Computing and Computer-Assisted Intervention (MICCAI), 9(1):891-898, 2006.

3.2 Introduction

Registration of CT to fluoroscopic images is a fundamental task in Computer-Assisted Orthopaedic Surgery (CAOS) and Computer-Assisted Radiotherapy (CART).
In the case of CAOS, registration of pre-operative CT to a set of intra-operative fluoroscopic images can be used to create a precise link between the virtual patient (i.e. the pre-operative CT) displayed on a screen and the physical patient in the operating room (OR), so that the CT image can be used to guide the intervention. In the case of CART, registration of CT to portal images allows precise configuration of treatment beams so that the radiation is focused on tumors/lesions and the damage to healthy tissues remains minimal. The CT-to-fluoroscopy registration problem can be briefly described as finding a geometric transform that positions the CT in the patient’s coordinate space so that a user-defined similarity measure between the CT and a set of fluoroscopic images is optimal. Usually the CT is captured pre-operatively and used for surgical/treatment planning, and the fluoroscopic images are captured intra-operatively and used to update the surgical/treatment plan dynamically. A clinically usable CAOS or CART system requires accurate, fast and robust registration between the two data sets. One can formulate CT-to-fluoroscopy registration as a 2D-3D registration problem. A number of methods have been proposed to address this problem in the literature. More detailed reviews on this topic can be found in [5, 25, 28, 33, 53, 56, 64, 82, 107]. A commonly adopted approach is to generate intermediate simulated 2D fluoroscopic images, called digitally reconstructed radiographs (DRRs), from the 3D CT and compare the simulated fluoroscopic images with the real ones. Registration is achieved when the simulated fluoroscopic images closely resemble the real ones. Multiple fluoroscopic images from different viewing angles are often used in the process to compensate for the loss of depth information in the acquired 2D fluoroscopic images.
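The compare-and-optimize loop described above can be illustrated with a deliberately simplified Python sketch; the orthographic projection, the one-parameter shift transform, and the exhaustive search are stand-ins for perspective DRR rendering, a six-parameter rigid transform, and a real optimizer:

```python
import numpy as np

def drr(volume, shift):
    """Toy DRR generator: orthographic projection (summation along rays) of a
    volume translated by an integer shift. A stand-in for perspective
    ray-casting through a rigidly transformed CT."""
    return np.roll(volume, shift, axis=1).sum(axis=0)

def ncc(a, b):
    """Normalized correlation between a DRR and a fluoroscopic image."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

rng = np.random.default_rng(0)
ct = rng.random((8, 16, 16))   # toy "CT volume"
fluoro = drr(ct, 3)            # "fluoroscopic image" at an unknown shift of 3

# Exhaustive search over the single parameter stands in for the optimizer:
# registration is achieved where the simulated image best resembles the real one.
best_shift = max(range(-5, 6), key=lambda s: ncc(drr(ct, s), fluoro))
print(best_shift)  # → 3
```

In a real system the search runs over six rigid-transform parameters and several views, but the structure of the loop is the same: generate simulated images, score them against the real ones, and move the transform toward a better score.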
The majority of the previous work has focused on how to generate DRRs quickly and realistically [9, 13, 55, 56, 94, 98], or on how to define/select a similarity measure between DRRs and fluoroscopic images. Those methods often relied on either the simplex method or, if calculation of derivatives is possible, the gradient-descent optimization method to search for an optimal registration. In this chapter, due to the non-linear nature of the CT-to-fluoroscopy registration problem, we propose to use the Unscented Kalman Filter (UKF) as the optimization method. The proposed registration method requires no calculation of derivatives, deals with multiple observations simultaneously, estimates the variance along with the state, and uses an improved hardware-based technique for fast DRR generation. We believe that these features could potentially benefit the CT-to-fluoroscopy and other 2D-3D registration problems. To validate our approach, we extensively test our method on various phantom data sets and compare it with a conventional simplex-based approach. The remainder of this chapter is organized as follows: Section 3.3 gives the details of our technique; Section 3.4 describes the testing scenarios and presents the experimental results; and Section 3.5 provides a summary.

3.3 Method

Our method consists of four major components: a transform that positions the CT in the patient’s coordinate space; a hardware-based volume rendering engine that generates DRRs at interactive rates; a similarity measure that compares the DRRs with the corresponding fluoroscopic images; and the UKF that recursively optimizes the transform parameters. Figure 3.1 shows how these components interact with each other to search for an optimal registration solution. Once the algorithm is initialized, it runs iteratively until some pre-defined stopping criteria are met. One iteration of the algorithm works as follows: 1. Apply the current transform to the CT; 2.
The transformed CT is fed to the volume rendering engine along with the fluoroscopic images’ imaging parameters, including the C-arm focal position and the C-arm orientation; 3. For each fluoroscopic image, a corresponding DRR is generated by the graphics hardware; 4. A set of similarity measures is computed for all (DRR, fluoroscopy) pairs; 5. The computed similarity measures are fed to the UKF along with the current transform parameters as well as their variances; 6. The UKF updates the transform parameters and the variances. The above process repeats until a set of optimal similarity measures is achieved or the updates to the parameters or variances are sufficiently small. In the following sections, we discuss each of the components in detail.

3.3.1 Transform and its Initial Value

The transform used to position the CT in the OR is a simple rigid transform with six parameters: three Euler angles for rotation and three scalars for translation.

Figure 3.1: The UKF-based approach for CT-to-fluoroscopy registration.

Most 2D-3D registration methods require an initial transform that is close to the unknown real solution to start the registration. Our method is no exception. We find an initial guess of the transform parameters by manually selecting a few landmarks from both CT and fluoroscopic images, e.g., three to four visible points on the bone surface, and solving an absolute orientation problem using the singular value decomposition (SVD) based technique [103]. If the landmarks are selected carefully, the computed initial parameters can yield an initial mean Target Registration Error (mTRE) [110] within 3 cm, which is often sufficient to start our registration method.

3.3.2 Hardware-based Volume Rendering Engine

We use an improved hardware-based technique, i.e. the Adaptive Slice Geometry Texture Mapping (ASGTM) algorithm [9], for fast DRR generation. The algorithm
improves the common view-aligned 3D texture-mapping based method by adaptively slicing the volume based on image content. First, the CT is partitioned into a set of axis-aligned bounding boxes (AABBs) based on a user-defined transfer function that removes the non-interesting voxels and highlights the anatomical structures of interest. Then, the AABBs are sliced along the viewing direction, i.e. the focal axis of the C-arm, in order from back to front. Finally, the powerful OpenGL features of consumer-grade graphics cards, including 3D texture-mapping, multi-texturing, and fragment programs, are employed to render and blend the slices into a final DRR. We tested our implementation on an ATI Radeon X800 card with 256 MB video memory by rendering CT volumes of size 512 × 512 × 256 into DRRs of size 473 × 473, and achieved speeds of 20-50 frames per second, depending on the image content in the CT data.

3.3.3 Similarity Measure

A variety of similarity measures [56, 82, 129] have been proposed for comparing a DRR with a fluoroscopic image. A few examples are normalized correlation, variance-weighted correlation, mutual information, pattern intensity, gradient difference, and gradient correlation. Different measures have very different behaviours in the parameter space, depending on the type of transform used, the initial conditions, and the contents of the original image data. We do not bias towards any particular similarity measure. Any one, or a combination of them, can be used with our method in a plug-and-play fashion.

3.3.4 Unscented Kalman Filter

The UKF [111] is a sequential least-squares optimization technique employed for solving non-linear systems. It estimates both the state and its covariance matrix, and no calculations of Jacobian or Hessian matrices are required.
Instead, the UKF assumes that the unknown state is a Gaussian-distributed random variable (GRV) and uses a minimal set of carefully chosen sample points along with the corresponding observations to learn about the behaviour of the true non-linear system. The sample points, which are generated using the Unscented Transform (UT) [111], completely capture the true mean and variance of the GRV and, when propagated through the non-linear system, capture the posterior mean and variance accurately up to at least the second-order Taylor series approximation. Figure 3.2 illustrates the workflow of the UKF, which contains the following three stages: 1. Calculate sigma points from the current state and variance using the UT; 2. Propagate the sigma points through the known dynamic state and observation models; 3. Compute the gain and update the state as well as its covariance matrix using the computed gain and the known observations. The general equations and details about the UKF can be found in [111].

Figure 3.2: The UKF algorithm.

In our proposed method for CT-to-fluoroscopy registration, the state and observation models have the following forms:

x = [θ_x, θ_y, θ_z, t_x, t_y, t_z]^T,   (3.1)
x_i = x_{i-1} + N(0, σ_x^2),   (3.2)
y_i = SM(x_i) + N(0, σ_y^2),   (3.3)

where x is the vector of transform parameters to be estimated, SM is the non-linear similarity measure between the DRRs and the corresponding fluoroscopic images as a function of the transform parameters, and σ_x^2 and σ_y^2 are the variances of the process and measurement noises intrinsic to the system. As multiple fluoroscopic images are used in our method, y is a multi-dimensional measurement vector.

Table 3.1: Data specifications.
                  Scaphoid CT      Pelvis CT        Femur CT         Fluoroscopy/DRR
Size (pixels)     256² × 64        256³             256² × 128       256²
Resolution (mm³)  0.375² × 0.525   1.176² × 0.766   0.625² × 1.445   0.836²

Since the registration goal is to achieve an optimal similarity value for all (DRR, fluoroscopy) pairs, the known observations are constant in our method: the optimal value of the selected similarity measure. For example, in the case of normalized correlation, the value is 1.0.

3.4 Experiment, Results, and Discussion

3.4.1 Data Sets

We evaluate our method using three different phantoms: a small scaphoid bone, a large pelvis, and a long and thin femur, all with embedded fiducial markers for gold-standard validations. For the scaphoid and pelvis phantoms, the CT data were registered to synthetic fluoroscopic images, i.e. DRRs. The DRRs were generated from three orthogonal views with the CT positioned at the origin of the reference space. For the femur phantom, the CT was registered to three real fluoroscopic images. Table 3.1 lists the specifications of all testing images, and Figure 3.3 shows the CT and the corresponding DRRs/fluoroscopic images for each phantom.

Figure 3.3: The CT volumes and the corresponding DRRs/fluoroscopic images. Left: scaphoid; middle: pelvis; right: femur.

3.4.2 Experiments

A conventional simplex-based method was implemented along with our UKF-based approach for the purpose of comparison. For each phantom data set, 100 experiments were performed for both methods with the initial transforms generated by adding random rotations (±12°) and translations (±12 mm) to the gold-standard. As the simplex-based method has a much smaller capture range than the UKF-based method, 100 additional experiments were performed for the simplex-based method with perturbations of the initial position in the range of ±5° and ±5 mm. For the CT-to-DRRs registrations, the gold-standards were known.
For the CT-to-fluoroscopy registration, the gold-standard was computed by a fiducial registration [103] using the embedded markers. Normalized correlation was used as the similarity measure in all experiments, and the same set of process and measurement noise assumptions was made for all UKF experiments.

Table 3.2: Comparison of capture range (unit: mm).

Method          Scaphoid   Pelvis   Femur
UKF-based       12.0       20.0     11.0
Simplex-based    5.8        9.6      8.0

3.4.3 Results

We recorded the initial misalignment error (hereafter called the initial mTRE) and the final mTREs for all experiments. Each mTRE was computed as the average difference between the positions of a set of CT points mapped by the evaluated transform and by the gold-standard. We used 20 randomly selected points from the region bounding the bones for the mTRE calculation. Figure 3.4 shows the registration results, and Figure 3.5 shows the image differences between the DRRs and the corresponding fluoroscopic images before and after registration for one experiment of the femur data with our proposed method. We compare the capture range, accuracy and computation time of the two methods. Capture range is defined as the distance from the gold-standard within which at least 95% of registrations are successful. It is measured as an initial mTRE and reflects the robustness of an algorithm. A registration is considered successful if the final mTRE is within 2 mm for the scaphoid and pelvis data and 3 mm for the femur data. Accuracy was evaluated using the statistics of the successful registrations, which include the mean and standard deviation of the final mTREs. Since DRR generation is the dominant operation in both methods, the computation time was measured as the average number of DRR generations required to reach a successful registration. Tables 3.2 - 3.4 show the results of capture range, accuracy and computation time for each phantom data set and each method.
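The mTRE computation described above can be sketched as follows; the 4×4 homogeneous-matrix representation of the transforms and the synthetic point set are assumptions made purely for illustration:

```python
import numpy as np

def mtre(points, T_eval, T_gold):
    """Mean Target Registration Error: average distance between target points
    mapped by the evaluated transform and by the gold standard. 4x4 homogeneous
    matrices are an assumed representation of the rigid transforms."""
    pts_h = np.c_[points, np.ones(len(points))]           # N x 4 homogeneous
    diff = (pts_h @ T_eval.T - pts_h @ T_gold.T)[:, :3]   # N x 3 displacements
    return float(np.linalg.norm(diff, axis=1).mean())

# 20 random target points inside a bone-sized box (mm), as in the text.
rng = np.random.default_rng(1)
pts = rng.uniform(-50.0, 50.0, size=(20, 3))
T_gold = np.eye(4)                 # gold standard: identity
T_eval = np.eye(4)
T_eval[0, 3] = 2.0                 # evaluated transform: 2 mm translation along x
print(mtre(pts, T_eval, T_gold))   # → 2.0 (a pure translation moves every point 2 mm)
```

For a rotational misalignment the per-point errors would differ, which is why the metric averages over points spread across the region bounding the bone.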
Figure 3.4: Registration results: initial mTREs vs final mTREs. All units are in mm. Left column: UKF-based; right column: Simplex-based. First row: scaphoid; second row: pelvis; third row: femur.

Table 3.3: Comparison of accuracy (unit: mm).

Method          Scaphoid      Pelvis        Femur
UKF-based       0.80 ± 0.58   0.11 ± 0.10   2.42 ± 0.57
Simplex-based   0.32 ± 0.21   0.25 ± 0.20   2.53 ± 0.56

Table 3.4: Comparison of required number of DRRs.

Method          Scaphoid   Pelvis   Femur
UKF-based       760        650      700
Simplex-based   620        630      580

Figure 3.5: The image differences between the DRRs and the corresponding fluoroscopic images for one experiment of the femur data with the UKF-based method. Top: before registration; bottom: after registration.

3.4.4 Discussion

From Figure 3.4 and Table 3.2, it is evident that the UKF-based approach consistently has a much larger capture range than the simplex-based method. Tables 3.3 and 3.4 also indicate that the two methods have similar performance in accuracy and computation cost, though the simplex-based method is slightly faster.

3.5 Summary

We presented a new method for registering 3D CT data sets to 2D fluoroscopic images that uses the UKF for robust optimization and a hardware-based adaptive geometric slicing technique for fast DRR generation. The experimental results showed that our method outperforms the conventional simplex-based method by having a larger capture range while providing comparable accuracy and computation time. Future work will include the extension of the proposed method to simultaneously register multi-fragment bone fractures to a set of intra-operative fluoroscopic images, which will be used in computer-assisted trauma surgery.
Chapter 4
2D-3D Registration with the CMA-ES Method

4.1 Overview

In this chapter, we propose a new method for 2D-3D registration and report its experimental results.1,2 The method employs the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm to search for an optimal transformation that aligns the 2D and 3D data. The similarity calculation is based on Digitally Reconstructed Radiographs (DRRs), which are dynamically generated from the 3D data using a hardware-accelerated technique, Adaptive Slice Geometry Texture Mapping (ASGTM). Three bone phantoms of different sizes and shapes were used to test our method: a long femur, a large pelvis, and a small scaphoid. Experiments were performed to register CT to fluoroscopy and DRRs of these phantoms using the proposed method and two other methods, i.e. our previously proposed Unscented Kalman Filter (UKF) based method (from Chapter 3) and a commonly used simplex-based method. The experimental results showed that: 1) with slightly more computation overhead, the proposed method was significantly more robust to local minima than the simplex-based method; 2) while as robust as the UKF-based method in terms of capture range, the new method was not sensitive to the initial values of its exposed control parameters, and does not need knowledge about the system noise within the similarity metric; 3) the proposed method was fast and consistently achieved the best accuracy of all compared methods.

1 Preliminary results of this work have been published in Proceedings of EMBC: R. H. Gong, P. Abolmaesumi, and J. Stewart, “A robust technique for 2D-3D registration”, Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 1:1433-1436, 2006.
2 Final results of this work have been published in Proceedings of SPIE: R. H. Gong and P. Abolmaesumi, “2D-3D registration with the CMA-ES method”, SPIE Medical Imaging, pages 69181M1-69181M9, Feb. 2008.
4.2 Introduction

2D-3D registration is a fundamental task in computer-assisted surgery (CAS). In such surgeries, in order to use pre-operative CT to guide the surgical procedure during the intervention, the CT must first be mapped to the physical patient in the operating room, and this can be done by registering the 3D CT to a set of intra-operative 2D fluoroscopic images. Another important application is in computer-assisted radiotherapy, where registration of CT to a few portal images is used to focus the harmful treatment beams on the lesion area, thus minimizing the damage to the surrounding healthy tissues. The goal of 2D-3D registration is to find a spatial transformation that transforms one data set (usually a 3D data set) from its local coordinate space to another coordinate space (usually that of a 2D data set consisting of a set of 2D images) so that the two data sets are aligned in terms of some similarity metric. A 2D-3D registration method generally involves determining three components: a transformation that spatially correlates the two data sets, a similarity metric that evaluates how well the two data sets are aligned under a particular transformation, and an optimization technique that iteratively searches for an optimal solution of the transformation. A variety of methods have been proposed for 2D-3D registration [37, 58, 82]. Most of the methods have focused on defining an accurate and efficient similarity metric, and have relied on simple search algorithms, such as simplex and gradient-descent, to find the final solution. Early methods [58] used geometric features (e.g., edges and surfaces) to define the similarity metric in order to obtain acceptable computation speed. The main drawback of those methods is the need for accurate feature extraction, where errors in segmentation propagate through the registration process.
Due to the fast increase in computation power in recent years, most current methods [37, 82] compute the similarity directly from image intensities to achieve better robustness and accuracy. This group of methods dynamically generates the simulated 2D data, called Digitally Reconstructed Radiographs (DRRs), from the 3D data and computes the similarity from a set of 2D image pairs. A variety of functions, including normalized correlation coefficients (NCC), variance-weighted correlation (VWC), gradient correlation (GC), gradient difference (GD), pattern intensity (PI) and mutual information (MI), have been used to define the similarity between a 2D image and its corresponding DRR. To achieve interactive computation performance, hardware-accelerated techniques [9] are usually employed to speed up the DRR-generation process. Finally, some recent work reconstructs 3D data from the 2D data to take advantage of the variety of existing 3D registration techniques. However, one limitation of this type of method is that it needs a large number of 2D images, or statistical information about the studied object, for accurate 3D reconstruction. While simple optimization techniques are known for their ease of use, they are sensitive to local minima. When used in 2D-3D registration, they work well only if a good initial guess of the solution can be found. The main reason is that, due to the different dimensionalities and modalities involved in this registration problem, the 2D-3D similarity metrics are usually highly non-linear and have a rugged search landscape. In real applications, finding such an initial alignment usually involves using a user interface, which is a time-consuming task, or using known geometric objects in the field. To develop a more robust approach, in our previous work an Unscented Kalman Filter (UKF) based method was proposed in which the UKF was used as the optimization strategy [37, 38].
The method demonstrated significant improvement in capture range compared to a commonly used simplex-based method. It also provided the possibility of estimating the registration errors using a closed-form solution [74] after the registration was finalized. However, the method is only suitable for situations in which knowledge about the system noise can be easily obtained and the similarity metric has a known target value. In this work, we propose a fast and more general method that uses the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) technique [42] as the optimization strategy to achieve high robustness and better usability. In Section 4.3, we provide the details of the algorithm. In Section 4.4, we validate the proposed method, and compare it with two prior methods: our previous UKF-based method and a simplex-based method. Finally, a summary is provided in Section 4.5.

4.3 Method

4.3.1 Algorithm Overview

Without loss of generality, in the subsequent sections we assume orthopaedic surgery as the common application of CT-to-fluoroscopy registration in CAS. In this case, the 3D data is the pre-operative CT, quantized in the coordinate space of the CT machine, and the 2D data is a series of intra-operative fluoroscopic images, captured from different orientations in the fluoroscopy coordinate space. Fig. 4.1 shows the overall method as well as the interactions between its components. The inputs are the two data sets being registered and an initial guess for the registration transformation. The optimizer, CMA-ES, then iteratively refines the transformation according to the similarity between the 2D data and the dynamically generated DRRs. We briefly describe the transformation and similarity metric in the paragraphs that follow. The details of searching for an optimal transformation with CMA-ES are given in Section 4.3.2. The transformation takes the 3D data from its local coordinate space to the 2D data’s coordinate space.
Depending on the application, the transformation can be of any type, including rigid, similarity, affine, non-rigid, or a combination of these. In the context of CT-to-fluoroscopy registration, it is usually a 3D rigid transformation consisting of rotational and translational components. The translation has a fixed form with three parameters, while the rotation can have various representations with different numbers of parameters, such as Euler angles and versors with three parameters, unit quaternions and angle-axis with four parameters, and so on. Our method is not tied to a particular representation. For compactness and intuitiveness, the Euler-angle form was selected in this work.

Figure 4.1: CMA-ES based 2D-3D registration method.

We have adopted the intensity-based approach in the proposed method to take advantage of its robustness and high accuracy. For each fluoroscopic image, one DRR is generated from the CT using the current transformation and the fluoroscopic image settings; then a similarity is computed between each fluoroscopic image and DRR pair. The final similarity between the 2D and 3D data is formulated as a linear combination of the similarities between each pair. All current similarity metrics (NCC, VWC, GC, GD, MI, PI) can be used with our method, and the selection usually depends on the image quality or content. Because DRR generation is the dominant operation during the registration process, a hardware-accelerated technique, named Adaptive Slice Geometry Texture Mapping (ASGTM) [9], is used to accelerate the task. ASGTM is an improvement of the commonly used 3D texture-mapping technique. It excludes the non-interesting voxels of the 3D data from rendering by generating and using a set of axis-aligned bounding boxes (AABBs) in a preprocessing step to further accelerate the DRR generation process.
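The per-view combination just described can be sketched as follows; NCC is used as the per-view metric, the equal weighting is an assumption, and random images merely stand in for the fluoroscopic images and DRRs:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two images
    (1.0 = identical up to an affine intensity change)."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

def combined_similarity(fluoros, drrs, weights=None):
    """Final 2D-3D similarity: a linear combination of the per-view
    similarities between each fluoroscopic image and its DRR."""
    if weights is None:                                   # equal weights assumed
        weights = [1.0 / len(fluoros)] * len(fluoros)
    return sum(w * ncc(f, d) for w, f, d in zip(weights, fluoros, drrs))

rng = np.random.default_rng(2)
fluoros = [rng.random((64, 64)) for _ in range(3)]        # stand-ins for 3 views
aligned = combined_similarity(fluoros, fluoros)           # perfect match, ≈ 1.0
shifted = [np.roll(f, 7, axis=1) for f in fluoros]        # misaligned "DRRs"
misaligned = combined_similarity(fluoros, shifted)        # much lower score
print(aligned, misaligned)
```

Using multiple views in the sum is what compensates for the depth ambiguity of any single projection: a transform must explain all views at once to score well.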
4.3.2 Optimization with CMA-ES

The main contribution of this work is the use of the CMA-ES optimization technique in 2D-3D registration for improved robustness and better usability. CMA-ES [42] is a sampling-based search algorithm known for robust and efficient operation in a rugged search landscape. The method requires no calculation of derivatives; instead, the learning is done by taking random samples around the current solution according to a multivariate normal distribution. In each iteration of the optimization process, the solution is refined by sampling, selection and recombination, and the search distribution is adaptively deformed according to both new information from the selected samples and information from previous steps. Fig. 4.2 shows the key steps of the CMA-ES algorithm. In Fig. 4.2, the initial guess of the solution and the initial search distribution are provided by the user; these correspond to the initial position of the 3D data and its uncertainty. The search distribution is represented using a covariance matrix C, which determines the distribution shape, and a scalar s, which determines the distribution size (a scaling factor that is applied to the variances of the distribution). Initially, only s is specified and the search distribution has a spherical shape with an isotropic standard deviation in all directions. Each iteration consists of three key steps. First, a population of parameter samples is drawn according to the current search distribution. The population size, λ, is determined by the dimension of the solution parameters, n; a common choice is λ = 4 + ⌊3 ln n⌋.

Figure 4.2: The CMA-ES algorithm.

Next, the samples are evaluated and sorted according to the metric values, and the first µ samples are selected and recombined. The value of µ is user-selectable, with the default being µ = ⌊λ/2⌋.
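The sampling-selection-recombination skeleton can be sketched in heavily simplified form; note that the full covariance-matrix and step-size adaptation of CMA-ES is replaced here by a fixed geometric shrinkage of σ, so this is only an illustration of the loop structure, not the real algorithm:

```python
import math
import numpy as np

def cma_es_sketch(f, x0, sigma0, iters=80, seed=0):
    """Simplified evolution-strategy loop: sample λ candidates, select the
    best µ, recombine them with log-rank weights. Covariance adaptation is
    omitted; sigma simply shrinks geometrically (an illustrative shortcut)."""
    rng = np.random.default_rng(seed)
    n = len(x0)
    lam = 4 + int(3 * math.log(n))              # population size λ = 4 + ⌊3 ln n⌋
    mu = lam // 2                               # selection size µ = ⌊λ/2⌋
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()                                # recombination weights, best first
    m = np.asarray(x0, dtype=float)
    sigma = float(sigma0)
    for _ in range(iters):
        pop = m + sigma * rng.standard_normal((lam, n))   # sampling
        pop = pop[np.argsort([f(x) for x in pop])]        # evaluate and sort
        m = w @ pop[:mu]                                  # weighted recombination
        sigma *= 0.97                     # crude stand-in for step-size control
    return m

# Toy objective: sphere function with minimum at (1, 2, 3),
# standing in for the (negated) image similarity.
target = np.array([1.0, 2.0, 3.0])
sol = cma_es_sketch(lambda x: float(((x - target) ** 2).sum()),
                    x0=[0.0, 0.0, 0.0], sigma0=1.0)
print(np.round(sol, 1))
```

In the real algorithm the covariance matrix is also updated from the selected samples and the evolution path, which is what lets the search distribution elongate along promising directions of the similarity landscape.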
The recombination refines the solution, that is, the mean of the search distribution, by computing the weighted mean of the selected samples with the coefficients {w_i}, i = 1, ..., µ, where better-valued samples are assigned larger weights. Finally, the covariance matrix of the search distribution is updated. This is the core part of the CMA-ES algorithm, and the update is based on three sources: the search distribution of the previous iteration, the accumulated evolution path of the solution from the first iteration to the current iteration, and the distribution of the selected samples at the current iteration. Each source is assigned a weight, and the assignment is controlled by the parameters c_c and c_σ, which are computed from the recombination weights {w_i}. Except for the initial guess, all control parameters can be determined automatically, and appropriate defaults have been suggested [42]. The stopping criteria are user-defined. The commonly used ones are the maximum number of iterations, the tolerance of the function update (with respect to the similarity value), and the tolerance of the parameter update (with respect to the solution parameters). In summary, our CMA-ES based 2D-3D registration method works as follows: Inputs: 2D data, 3D data, initial transformation parameters T0, and initial search distribution size σ0; Output: final transformation parameters T. 1. Initialize the search distribution N(T, σ²I) (where I is the identity matrix) with T = T0 and σ = σ0, and the evolution path p to be null; compute the population size λ, the selection size µ, the recombination weights {w_i}, i = 1, ..., µ, and the parameters c_c and c_σ. 2. Until the stopping criteria are met, do the following: (a) Draw a population of samples {T_i}, i = 1, ..., λ, according to the distribution N; (b) For each sample T_i, transform the 3D data, generate DRRs, and compute the similarity measure; (c) Select the µ best samples according to similarity values;
(d) Update T by recombining the selected samples with the weights {w_i}; (e) Update N by linearly combining the following three components with the parameters c_c and c_σ: the previous N, the covariance of the µ selected samples, and the covariance of p; (f) Update p (see [42] for more details).

4.4 Experiments, Results and Discussion

We used three bone phantoms of different sizes and shapes, including a long femur, a large pelvis, and a small scaphoid, to evaluate the proposed method. Four pairs of 2D and 3D data were acquired or synthesized from the phantoms. The 3D data were CTs, captured using a GE LightSpeed Plus machine. The 2D data were simulated fluoroscopic images and real fluoroscopic data. The simulated fluoroscopic images were generated from the CTs along the coordinate axes using the ASGTM technique. The real fluoroscopic images were acquired using an OEC-9800 fluoroscopy device. Table 4.1 lists the specifications of the data used in this study. Three types of experiments were conducted. First, registration of CT to simulated fluoroscopic images was performed for each phantom using the proposed method and two other methods: our previous UKF-based method and a commonly used simplex-based method. Next, registration of CT to real fluoroscopic images was performed for the pelvis phantom. Finally, additional experiments for studying the impact of the initial search distribution of the CMA-ES algorithm were conducted. All experiments were done on a Dell OptiPlex GX270 computer equipped with 2 GB RAM and an ATI Radeon X800 (256 MB video RAM) graphics card.

Table 4.1: Data specifications.

Mean Target Registration Error (mTRE), capture range, accuracy and computation time were used to evaluate each method.
mTRE was used to measure the initial and final misalignments, and was calculated using the segmented surface points of the corresponding bone in the CT. Capture range measures the robustness of a method over a collection of experiments. It was chosen as the range of initial mTRE within which 95% of registrations succeed. A registration was defined as successful if the final mTRE was ≤ 2 mm for experiments using simulated fluoroscopic images, and ≤ 4 mm for real fluoroscopic images. The accuracy was measured using the mean and standard deviation of the final mTREs of the successful registrations. The computation time was measured as the mean and standard deviation of the time required to achieve a successful registration.

4.4.1 Registration of CT to Simulated Fluoroscopy

For each phantom, three simulated fluoroscopic images were generated from the CT. The CT data was placed at the origin, and the simulated fluoroscopic images were generated along the coordinate axes with a focal length of 920 mm and the origin at half the focal length. The gold standards were therefore identity transformations with all parameters being zero. One hundred experiments with random initial CT positions were conducted for each phantom and each method. The initial CT positions were obtained by applying small perturbations to the gold standard. Table 4.2 lists the magnitude of the perturbations for each phantom. NCC was used as the similarity metric. Table 4.3 shows the initial and final mTREs. Table 4.4 shows the capture ranges, accuracies, and computation times. From Tables 4.3 and 4.4, we have the following observations: 1) the proposed

Table 4.2: Perturbations used to generate random initial CT positions for CT to simulated fluoroscopy registrations. The perturbations were made around the six components (3 rotations, 3 translations) of the gold standard.
Femur Pelvis Scaphoid Rotational Components (◦ ) ±20 ±50 ±30 Translational Components (mm) ±20 ±20 ±10 Table 4.3: Experimental results of CT to simulated fluoroscopy registrations: initial mTREs vs final mTREs (unit: mm). 4.4. EXPERIMENTS, RESULTS AND DISCUSSION 77 Table 4.4: Experimental results of CT to simulated fluoroscopy registrations: capture range (unit: mm), accuracy (unit: mm), and computation time (unit: s). Phantom Capture Range Scaphoid Accuracy Computation Time Capture Range Pelvis Accuracy Computation Time Capture Range Femur Accuracy Computation Time CMA-based > 40 0.26 ± 0.48 147 ± 53 80 0.07 ± 0.07 94 ± 26 > 100 0.42 ± 0.63 99 ± 27 UKF-based > 40 0.87 ± 0.54 187 ± 16 72 0.40 ± 0.10 154 ± 38 > 100 1.87 ± 0.63 124 ± 14 Simplex-based 9 1.24 ± 0.99 95 ± 27 50 0.30 ± 0.47 86 ± 22 10 1.31 ± 0.82 65 ± 14 method achieved similar to or better capture ranges than the UKF-based method for all testing phantoms, and both methods were significantly more robust than the simplex-based method in terms of capture range; 2) the CMA-ES based method consistently achieved the best accuracy; 3) the CMA-ES based method took slightly longer time than the simplex-based method to converge, but on average the difference was within one minute for a single registration. 4.4.2 Registration of CT to Real Fluoroscopy To examine the method’s performance in a simulated surgical environment, in this test the real fluoroscopic images of the pelvis phantom were used. Three fluoroscopic images were acquired from arbitrary viewing directions that were apart about 45 degrees each other, and the pose information was reported by the tracking camera. Four embedded fiducials, visible in CT and tracked during fluoroscopy acquisition, were used to obtain the gold standard. Similar to the CT to simulated fluoroscopy registration experiments, 100 experiments with random initial CT positions around 4.4. 
EXPERIMENTS, RESULTS AND DISCUSSION 78 Table 4.5: Experimental results of CT to real fluoroscopy registrations for pelvis: initial mTREs vs final mTREs (unit: mm). Table 4.6: Experimental results of CT to real fluoroscopy registrations for pelvis: capture range (unit: mm), accuracy (unit: mm), and computation time (unit: s). Capture Range Accuracy Computation Time CMA-based 22 3.19 ± 0.44 114 ± 40 UKF-based 10 2.56 ± 0.74 156 ± 40 Simplex-based 9 3.24 ± 0.65 90 ± 8 the gold standard were performed for each of the three methods. The perturbations were ±15◦ for rotational components and ±20 mm for translational components. VWC was used as the similarity metric. The results are shown in Tables 4.5 and 4.6. Obviously, the CMA-ES based method achieved the largest capture range. However, the UKF-based method achieved the best accuracy but did not show much improvement in capture range. This can be explained by one important property of the UKF-based method: it requires a good understanding about the error sources in the system to work robustly and efficiently. In CT to fluoroscopy registration, a variety of sources (CT acquisition, fluoroscopy acquisition, DRR generation, outliers in fluoroscopic images, and so on) would cause the generated DRRs to not exactly match the corresponding fluoroscopic images. Finding the statistics about the combined errors is usually not an easy task. In this study, they were determined by trial 4.4. EXPERIMENTS, RESULTS AND DISCUSSION 79 Figure 4.3: Experimental results for different initial distributions of the CMA-ES based method: initial mTREs vs final mTREs (unit: mm). The pelvis phantom was used as the testing data. and error, and may not have been optimally chosen. In our previous work [37], it was demonstrated that the UKF-based method was able to achieve significant improvements in capture range and computation time if such knowledge can be accurately obtained. 
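The evaluation protocol used throughout these experiments (mTRE over surface points, a 95%-success capture range, and accuracy statistics over successful runs) can be sketched in a few lines of numpy. This is a hypothetical illustration: the function names and the cumulative scan over initial mTRE bounds are our own assumptions, not code from the thesis.

```python
import numpy as np

def mtre(T_est, T_true, surface_pts):
    """Mean Target Registration Error: mean distance between surface points
    mapped by the estimated and the ground-truth 4x4 rigid transforms."""
    P = np.c_[surface_pts, np.ones(len(surface_pts))]   # homogeneous Nx4
    d = (P @ T_est.T)[:, :3] - (P @ T_true.T)[:, :3]
    return np.linalg.norm(d, axis=1).mean()

def capture_range(initial_mtre, final_mtre, success_mm=2.0, step=2.0):
    """Largest initial-mTRE bound within which at least 95% of the
    registrations succeed (final mTRE <= success_mm)."""
    init = np.asarray(initial_mtre, dtype=float)
    ok = np.asarray(final_mtre, dtype=float) <= success_mm
    best, bound = 0.0, step
    while bound <= init.max() + step:
        mask = init <= bound
        if mask.any() and ok[mask].mean() >= 0.95:
            best = bound
        bound += step
    return best

def accuracy(final_mtre, success_mm=2.0):
    """Mean and standard deviation of final mTRE over successful runs only."""
    f = np.asarray(final_mtre, dtype=float)
    good = f[f <= success_mm]
    return good.mean(), good.std()
```

The success threshold would be 2 mm for the simulated-fluoroscopy experiments and 4 mm for the real-fluoroscopy experiments, per the criteria above.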
4.4.3 The Impact of the Initial Search Size

While default values have been suggested for most control parameters of the CMA-ES algorithm [42], the user has to provide an initial search distribution in the form of σ. This parameter indicates the uncertainty about the user-supplied initial transformation, and a value of 1.0 has been recommended. Here, we analyze the sensitivity of the algorithm to this parameter. The CT and simulated fluoroscopy of the pelvis phantom and three different values of σ, i.e. 0.5, 1.0, and 2.0, were used in this test. As in the previous cases, 100 experiments were performed for each value of σ. The results are shown in Figure 4.3 and Table 4.7. These experiments demonstrate that the recommended value of 1.0 gives the best balance between capture range, accuracy and computation time; however, the differences caused by the parameter were not significant. This observation was anticipated because the CMA-ES algorithm is able to adaptively change its search distribution according to the local search landscape.

Table 4.7: Experimental results for different initial distributions of the CMA-ES based method: capture range (unit: mm), accuracy (unit: mm) and computation time (unit: s). The pelvis phantom was used as the testing data.

                       σ = 0.5       σ = 1.0       σ = 2.0
  Capture Range        75            80            78
  Accuracy             0.12 ± 0.27   0.07 ± 0.07   0.10 ± 0.28
  Computation Time     119 ± 33      94 ± 26       115 ± 27

4.5 Summary

In this chapter, we presented a new 2D-3D registration method that takes advantage of the CMA-ES search algorithm to achieve improved robustness, fast computation and better usability. From the experimental results, we draw the following conclusions:

1. The proposed method is able to achieve highly accurate results and is significantly more robust with respect to local minima than the simplex-based method. It is also fast, as most registrations can be finished in 1-2 minutes;

2.
The UKF-based method is able to achieve the same capture range as the CMA-ES based method if a good understanding of the system errors can be obtained. However, the CMA-ES based method is a more general solution, because most of its control parameters can be automatically determined and it is not sensitive to the only exposed parameter.

Chapter 5

Multiple-Object 2D-3D Registration

5.1 Overview

This chapter presents a multiple-object 2D-3D registration technique for non-invasively identifying the poses of fracture fragments in the space of a preoperative treatment plan.1 The treatment plan is generated from tessellation of computed tomography images of the fracture fragments. The registration technique recursively updates the treatment plan and matches its digitally reconstructed radiographs (DRRs) to a small number of intraoperative fluoroscopic images. The proposed approach combines an image similarity metric that integrates edge information with mutual information, and a global-local optimization scheme, to deal with the challenges associated with the registration of multiple small fragments and the limited imaging orientations in the operating room. The method is easy to use, as minimal user interaction is required. Experiments on simulated fractures and two distal radius fracture phantoms demonstrate clinically acceptable target registration errors with a capture range as large as 10 mm.

1 This work has been published as: R. H. Gong, J. Stewart, and P. Abolmaesumi, "Multiple-Object 2D-3D Registration for Non-invasive Pose Identification of Fracture Fragments", IEEE Transactions on Biomedical Engineering, vol. 99, Jan. 2011.

5.2 Introduction

The emergence of computer-assisted surgery (CAS) enables the use of a preoperative treatment plan to guide surgical operations.
It is nowadays the preferred choice of many surgeons because of demonstrated advantages such as a low incidence of surgery-induced infection, short healing time, and high union rates [117]. A fundamental task in such procedures is to accurately and responsively establish a spatial correspondence between the preoperative treatment plan and the patient in the operating room (OR). In the context of fracture treatment, the task becomes identifying the poses of the fracture fragments in the space of the treatment plan, in order to obtain knowledge about the spatial deviations between the actual and planned positions of the fragments.

Conventionally, optical tracking is used to identify the position of each fracture fragment. This pose identification technology is accurate, fast and reliable. However, it requires line-of-sight between the camera and a number of reference bodies that are mounted on the fracture fragments. Such an approach has several drawbacks: First, the size and weight of the reference bodies may limit their application in some types of fractures that contain multiple, small fragments; second, the line-of-sight constraint limits the surgeon's flexibility in the OR; third, mounting reference bodies on bones is invasive, which may lead to longer recovery time.

2D-3D registration is an alternative pose identification technique that overcomes the limitations of optical tracking. To identify the poses, a treatment plan generated from preoperative 3D data, such as a computed tomography (CT) scan of the trauma region, is mapped to intraoperative fluoroscopic images through an image registration process. This process is less responsive and less accurate than optical tracking; however, it provides a non-invasive, if suboptimal, solution for cases where optical tracking is impossible or too costly.
In the case of orthopaedic surgery, the registration process is usually performed by maximizing a similarity metric between the actual fluoroscopic images and simulated fluoroscopic images generated from the 3D CT data, called digitally reconstructed radiographs (DRRs). A number of DRR-based 2D-3D registration techniques have been proposed (see e.g., [53, 64, 82, 84, 93, 127]). However, none of these techniques has been reported for the pose identification problem in fracture treatment, primarily due to the following challenges:

• Involvement of multiple moving objects. Most current 2D-3D techniques handle only one large or long bone, such as the pelvis or femur. In fracture treatment, multiple, possibly small, bone fragments are involved, which not only increases the computational complexity, but also increases the likelihood of occlusion in the fluoroscopic images.

• Limitations of the constrained OR environment. The imaging orientations are very limited due to potential collision of the fluoroscopy imaging device with the OR table, so the fluoroscopic imaging views that are optimal for registration of multiple fracture fragments may not be available.

Previously, the multiple-object 2D-3D registration problem has been explored in a few studies. In [22], phase-based mutual information was used as the similarity metric for registering the femur and tibia to fluoroscopic images of the knee joint. Each bone was registered to the joint images separately as a conventional 2D-3D registration, and the phase information was used to reduce the impact of outliers, i.e. the other bone shown in the joint images. In addition, a very good initialization (±3° and ±3 pixels) was required. In [67], multiple bones were registered to two fluoroscopic images (one AP view and one lateral view). Correlation of edge information was used to deal with overlaps between bones, and an optimization algorithm that takes advantage of the orthogonal fluoroscopic images was used.
Rough user segmentations of the fluoroscopic images were needed, and only preliminary results with synthetic fluoroscopic images were reported. In our previous work [39], multiple fracture fragments of unknown shapes were registered to a set of fluoroscopic images. The registration determined not only the poses of the fragments, but also their true shapes, by simultaneously deforming a bone atlas through automatic planning (i.e. planning and registration were combined into a single procedure). Mutual information was used as the cost function, and a small amount of user interaction was necessary to remove the impact of outliers in the fluoroscopic images. However, validation was only performed with a simple synthetic fracture.

In addition to the intensity-based registration techniques above, feature-based registration has been proposed [31]. However, this approach depends on accurate segmentation of the fracture within the image, and is specifically designed for tubular bone structures such as the femur. Hybrid pose identification methods have also been proposed, where a combination of optical tracking and 2D-3D registration is used [78].

In this chapter, we describe a new multiple-object 2D-3D registration technique aimed at solving the pose identification problem in fracture treatment. A similarity metric that integrates edge information with mutual information is used in order to obtain a more accurate and smoother cost function in a very noisy image environment. A key to the success of the approach is the use of a global-local alternating optimization scheme based on the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm [42], which can handle rugged objective functions. Treatment planning is done in a separate step in order to achieve reliable results. We evaluate the proposed technique with synthetic fractures as well as actual fracture phantoms.
The rest of this chapter is organized as follows: Section 5.3 presents the details of our method. The experimental results are reported in Section 5.4, and discussed in Section 5.5. Finally, Section 5.6 provides a summary.

5.3 Methods

The main components of our multiple-object 2D-3D registration algorithm are illustrated in Fig. 5.1. The three inputs are: the preoperative treatment plan as the moving data (Section 5.3.1); a set of intraoperative fluoroscopic images as the fixed data (Section 5.3.2); and an initial transform on the treatment plan that roughly aligns the treatment plan with the fluoroscopic images (Section 5.3.3). To find the poses of the intraoperative fragments in the space of the treatment plan, the transform is recursively refined until the generated DRRs of the transformed treatment plan (Section 5.3.5) match the corresponding fluoroscopic images in terms of the similarity metric we have defined (Section 5.3.4). The process is steered by a global-local alternating optimization scheme (Section 5.3.6).

Figure 5.1: The main components and flowchart of our multiple-object 2D-3D registration algorithm. The poses of the intraoperative fragments in the space of the treatment plan are computed by registering the treatment plan to a set of intraoperative fluoroscopic images.

5.3.1 Preoperative Treatment Plan

The preoperative treatment plan captures the surgeon's ideal shape of the bone after treatment. The plan is made by manipulating and aligning computer models of the individual fracture fragments that are segmented from a diagnostic CT. We chose to use intensity models to represent the fragments. We used a semiautomatic active contour-based technique [124] to segment the fragments, then manually corrected the boundaries where large segmentation errors occurred.
The goal of planning is to find a set of rigid-body transforms, one per fragment, that map the fragment models from their local coordinate frames in the diagnostic CT to the coordinate frame of the treatment plan such that an ideal bone shape is obtained. We denote these transforms as {^{plan}T_{model(i)}}, i = 1, ..., N, where N is the number of fragments. A number of planning methods can be used to obtain such transforms. In the simplest method, the fragment models are interactively manipulated on a computer screen, and the final shape of the bone is determined according to the user's expertise [11, 12, 35, 47]. For automatic planning, a template bone model is used as the planning reference, and the fragments are concurrently registered to the template using 3D registration techniques; the commonly used templates are the reflected contra-lateral bone [77, 80] and a statistical shape model of the bone [20, 23, 39, 92]. Given the focus of our work, the accuracy of surgical planning and the approach taken to generate the plan are irrelevant to the accuracy of the proposed registration method. However, we use the transforms obtained during planning as the ground truth for validation of the 2D-3D registration.

5.3.2 Tracked Intraoperative Fluoroscopic Images

In this study, fluoroscopic images are captured using a GE OEC 9800 C-arm, which is commonly available in ORs. We assume that the pose of the C-arm is tracked and, hence, the relative positions of the fluoroscopic images are known. We also assume that the images are calibrated and distortion-free, and that the imaging parameters, such as the distance of the X-ray source to the OR table, are known. These assumptions are similar to the ones made in prior work and can be satisfied by tracking the position of the C-arm relative to the OR table using optical or magnetic tracking techniques, or by using specialized fiducials in the image.
5.3.3 Transforms

The registration process estimates two types of transforms: one global transform, ^{OR}T_{plan}, that is applied to the entire treatment plan, and a set of local transforms, {^{plan}T_{model'(i)}}, i = 1, ..., N, that are applied to the individual fragment models. Note that the models have been placed at their planned positions after planning, and we use model' to distinguish them from their original positions in the diagnostic CT. After registration, the pose of a fragment i in the space of the OR is formed as

    T_i = ^{OR}T_{plan} ^{plan}T_{model'(i)},   i = 1, ..., N.    (5.1)

The global transform maps the treatment plan to the patient in the OR, and the local transforms reposition the models within the treatment plan so that their final positions match the intraoperative positions of the corresponding fragments. For a given fragment, if the registration is accurate, the local transform represents the spatial deviation between its true pose in the OR and its planned pose in the OR according to the treatment plan, which is the information that surgeons are most interested in.

Each of the global and local transforms is represented as a rigid transform with six parameters p_i = (θx, θy, θz, tx, ty, tz)_i, i = 0, 1, ..., N (i = 0 for the global transform), where the first three parameters determine the rotation in Euler angles2. Here, rotation is with respect to the geometric center of the treatment plan (for i = 0) or of the corresponding fragment (for i > 0). In the case of N fragments, there are (1 + N) × 6 parameters to be determined.

5.3.4 Similarity Metric

A similarity metric is defined over the fluoroscopic images and the corresponding DRRs of the transformed treatment plan. Our design goal is to take advantage of the wide capture range of mutual information (MI) while making use of edge information for better robustness against noise and outliers.
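Before the metric itself, the transform machinery of Section 5.3.3 can be made concrete. The helper below is a hypothetical numpy sketch: it builds a 4x4 rigid transform from the six parameters (a ZYX Euler convention is assumed here, as the thesis does not fix one) with rotation about a supplied center, and composes a fragment pose as in Eq. (5.1).

```python
import numpy as np

def rigid_matrix(params, center=(0.0, 0.0, 0.0)):
    """4x4 rigid transform from (theta_x, theta_y, theta_z, tx, ty, tz),
    rotating about `center` (ZYX Euler convention assumed)."""
    ax, ay, az, tx, ty, tz = params
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    R = Rz @ Ry @ Rx
    c = np.asarray(center, dtype=float)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = c - R @ c + np.array([tx, ty, tz])  # rotate about c, then translate
    return T

def fragment_pose(global_params, local_params, plan_center, frag_center):
    """Eq. (5.1): T_i = T_global . T_local; the local transform repositions
    the fragment within the plan, the global one maps the plan into the OR."""
    return rigid_matrix(global_params, plan_center) @ rigid_matrix(local_params, frag_center)
```

With N fragments this yields the (1 + N) × 6 parameters described above: one call for the global transform and one per fragment.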
2 We found Euler angles to work well (taking into account the singularities) for our optimization scheme. Other representations, such as quaternions, may also be used.

An approach similar to that of Munbodh et al. [76] is adopted, in which the images (both fluoroscopic images and DRRs) are first processed with an edge-enhancement technique, and a similarity measure is then computed from the modified images. We use the gradient magnitude to derive the edge information, and an infinite impulse response (IIR) filter is used for fast computation. The edge thickness is controlled by the standard deviation, σ, of a smoothing Gaussian kernel. For a pixel location (x, y), the edge-likelihood factor is computed as

    w(x, y) = max[0, e^((G(x,y)−A)/B) (1 − e^(−B))],    (5.2)

where G(x, y) is the gradient magnitude, A is a constant that affects the contrast of the resulting image, and B ∈ [0, 1) is a threshold that reduces the shadow generated by the soft tissues surrounding the trauma region. The computed factor is a non-negative real value: if it is smaller than one, the pixel is suppressed; if it is greater than one, the pixel is enhanced. The values of σ, A and B were determined through empirical testing; we have found the values 2, 15 and 0.2, respectively, to be suitable for most of our experiments. Once the edge-likelihood factor is computed, the original pixel is weighted by the factor:

    I'(x, y) = I(x, y) w(x, y).    (5.3)

After the edge-enhancement process, Mattes Mutual Information (MMI) [72] is computed for every pair of fluoroscopic image and DRR. MMI is an implementation of mutual information that computes the measure from a small set of pixel samples and estimates the image histograms with the Parzen windowing technique. We have slightly modified the original MMI algorithm by adding pixel samples from the edges detected during edge-enhancement.
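The edge-enhancement step can be sketched as follows. Since the printed form of Eq. (5.2) did not survive typesetting cleanly, the weighting below is a stand-in with the qualitative behaviour the text describes (w = 1 at G = A, w < 1 suppresses weaker pixels, w > 1 enhances stronger ones), and scipy's `gaussian_gradient_magnitude` plays the role of the IIR gradient filter; function name and exact form are our own assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_gradient_magnitude

def edge_enhance(image, sigma=2.0, A=15.0, B=0.2):
    """Weight each pixel by an edge-likelihood factor w(x, y) derived from
    the Gaussian gradient magnitude G (cf. Eq. 5.2-5.3). The analytic form
    of w is a reconstruction: a clipped exponential centred at A, so pixels
    with G < A get w < 1 (suppressed) and G > A get w > 1 (enhanced)."""
    img = np.asarray(image, dtype=float)
    G = gaussian_gradient_magnitude(img, sigma=sigma)
    w = np.maximum(0.0, np.exp(np.clip((G - A) * B, -20.0, 20.0)))
    return img * w
```

With the thesis' settings (σ = 2, A = 15, B = 0.2), flat soft-tissue regions are damped while bone contours are amplified before MMI is computed.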
The gradient magnitude of the fluoroscopic image is thresholded to keep only structures with strong gradient magnitudes, using the threshold value

    t = Gmax − α(Gmax − Gmean),    (5.4)

where Gmax and Gmean are the maximum and mean values of the gradient magnitude, and α determines how much edge information to keep. We chose a fixed value of 0.3 for α and found it to be appropriate in most cases. When the images contain extremely fine or coarse details, it is helpful to tune this parameter so that a suitable amount of edge information is used. In total, 5% uniformly sampled pixels plus the edge samples generated by Eq. (5.4) are used to compute the MMI for each pair of images, and the overall metric is formed as

    E(p; fluoros, plan) = (1/M) Σ_{j=1}^{M} MMI_j(p; fluoro_j, DRR_j),   p = {p_0, p_1, ..., p_N},    (5.5)

where M is the number of fluoroscopic images used for registration. Note that in the above function we did not apply any constraints, such as collision avoidance, to the transformation parameters, so the fragments can overlap each other during the registration, which not only increases the failure rate of the registration, but also causes slower convergence. This decision was made mainly because an efficient algorithm for collision detection among multiple 3D objects of irregular shapes is currently lacking.

5.3.5 DRR Computation

As a large number of DRRs are required for registration, a 3D texture-mapping technique that employs modern Graphics Processing Units (GPUs) is used to accelerate the production of DRRs [9]. To generate a DRR for an X-ray view, the transformed treatment plan is sliced along the viewing direction, with all fragment models being sectioned simultaneously, and the slices are then blended together in back-to-front order.
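On the GPU this slicing and blending is done with 3D texture mapping [9]; as a rough CPU stand-in, a parallel-projection DRR reduces to an attenuation sum along the viewing axis. The sketch below is our own simplification (orthographic projection, no transfer function) intended only to illustrate the idea:

```python
import numpy as np

def drr_orthographic(volumes, axis=2):
    """Crude DRR: overlay all fragment volumes (all fragments 'sectioned'
    together), then integrate attenuation along the viewing axis. The GPU
    version in the text instead slices perpendicular to the view and blends
    the slices back to front."""
    combined = np.zeros_like(np.asarray(volumes[0], dtype=float))
    for v in volumes:
        combined += np.asarray(v, dtype=float)
    return combined.sum(axis=axis)
```

A perspective DRR with the C-arm geometry would instead cast diverging rays from the X-ray source; the orthographic sum is only the simplest illustrative case.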
In order to highlight particular structures of interest (the bone in our case) and to produce DRRs that better simulate real fluoroscopic images, a transfer function is applied during slicing and blending. All of these operations (i.e. slicing and blending) are performed within the GPU hardware.

5.3.6 Optimization Scheme

The goal of optimization is to find a set of transform parameters that minimize the similarity metric defined in Eq. (5.5). As the metric function is highly non-linear, and also due to the involvement of multiple fragments, we use a coarse-to-fine two-level optimization scheme. At each level, we perform two types of optimizations in an alternating and iterative fashion:

1. Global optimization - the parameters of the global transform are estimated for a certain number of iterations, so that all fragment models are transformed as a whole.

2. Local optimization - the parameters of the local transforms are estimated in two steps. First, the parameters for the individual fragments are sequentially estimated for a certain number of iterations, in order from the largest fragment to the smallest, where the size of a fragment is computed as the number of non-zero voxels in the fragment model. Second, all parameters are estimated in parallel for a certain number of iterations.

The numbers of iterations for the global, sequential and parallel optimizations are user-defined; we used 20, 20 and 10, respectively, for all of our experiments. All optimizations are performed using the CMA-ES algorithm [42], which is known for robust estimation of nonlinear functions that have a rugged search landscape. The algorithm requires no derivative calculation; instead, the search directions are learned by sampling the parameter space according to a probability search distribution and selecting the samples that best predict the convergence direction.
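A compact sketch of this machinery: `simple_es` is a heavily simplified (mu/mu, lambda) evolution strategy in the sample-rank-recombine style of CMA-ES (full covariance and step-size adaptation are omitted), and `alternate` runs one global/sequential/parallel round of the scheme above with the 20/20/10 iteration counts. Both functions and their signatures are our own illustration, not the thesis implementation.

```python
import numpy as np

def simple_es(cost, x0, sigma=1.0, popsize=12, mu=6, iters=60, seed=0):
    """Simplified evolution strategy: sample around the mean, rank by cost,
    recombine the best mu samples, decay the step size."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        samples = x + sigma * rng.standard_normal((popsize, x.size))  # sample
        ranks = np.argsort([cost(s) for s in samples])                # rank
        x = samples[ranks[:mu]].mean(axis=0)                          # recombine
        sigma *= 0.95                                                 # crude step-size decay
    return x

def alternate(params, cost, n_global=20, n_seq=20, n_par=10):
    """One round of the global-local scheme: refine the global transform p0,
    then each fragment's local transform in turn, then all local transforms
    jointly as one stacked vector."""
    params = [np.asarray(p, dtype=float) for p in params]
    params[0] = simple_es(lambda x: cost([x] + params[1:]), params[0], iters=n_global)
    for i in range(1, len(params)):                                   # sequential, per fragment
        params[i] = simple_es(
            lambda x, i=i: cost(params[:i] + [x] + params[i + 1:]),
            params[i], iters=n_seq)
    flat = simple_es(                                                 # parallel over all locals
        lambda v: cost([params[0]] + list(np.split(v, len(params) - 1))),
        np.concatenate(params[1:]), iters=n_par)
    return [params[0]] + list(np.split(flat, len(params) - 1))
```

In the real method each `cost` evaluation renders DRRs and computes the metric of Eq. (5.5); here any callable over the parameter list can be plugged in. The sequential pass would iterate fragments from largest to smallest, which the sketch leaves to the caller's ordering of `params`.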
As the estimation advances, the parameters and the search distribution are progressively updated according to the newly added samples. Compared with traditional optimization algorithms such as simplex and gradient descent, each iteration of the CMA-ES algorithm is computationally more expensive, but the total number of iterations required for convergence is much smaller, because more informative samples are considered in each iteration and history information is carried over. In our previous work on single-object 2D-3D registration (Chapter 4) [36], the CMA-ES algorithm demonstrated a large capture range of about 20 mm with convergence within a couple of minutes.

To improve the capture range as well as the performance, the optimizations are performed at two resolution levels. In the first level, the fluoroscopic images are down-sampled to a resolution of 128 × 128 pixels with isotropic spacing, and the fragment models are down-sampled to a resolution of 128 × 128 × 64 voxels with anisotropic spacing. In the second level, the resolution along each coordinate axis is doubled for both the fluoroscopic images and the fragment models.

Our algorithm requires a rough initial alignment between the treatment plan and the fluoroscopic images. Initialization is performed interactively from a graphical user interface (GUI), which provides good initial guesses for both the global and local parameters with only a few user interactions. Alternatively, the inverse of the transformations obtained during planning can be used as a good initial guess for the local parameters, and traditional optical tracking or landmark-based registration techniques can be used to supply an initial guess for the global parameters.

5.4 Results

We evaluate our method with three types of experiments. First, we use two synthetic fractures to test the method's behavior under ideal conditions.
Second, we use two fracture phantoms, simulating real patient cases, to test the method's behavior in a clinical environment. Lastly, a patient fracture with simulated fluoroscopic images is used to study the behavior of our method in the presence of outliers in the fluoroscopic images. For each fracture case in each type of experiment, a variety of treatment plans are used to evaluate the method. The treatment plans are randomly generated on the computer rather than obtained through an actual planning procedure, which greatly simplifies our experimental process and allows extensive testing of our method. As we mentioned before, the focus of the current work is on image registration; hence, the quality of the generated plans does not affect the conclusions drawn in this study.

Figure 5.2: The process of planning, registration (including fluoroscopic imaging and fragment pose identification), and error calculation for every single experiment.

Fig. 5.2 shows the complete testing cycle for every single experiment:

1. Planning - A treatment plan is generated from the fragment models of the fracture.

2. Registration - This step consists of 2a) fluoroscopic imaging of the fracture, and 2b) pose identification of the fragments with our proposed method.

3. Error evaluation - The identified fragment poses are compared with their true poses.

5.4.1 Error Measurement

Fig. 5.2 also shows the transforms associated with each testing stage, as well as the error calculation for a fragment i. The planning yields the transform ^{plan}T_{model(i)} (Section 5.3.1), the intraoperative fluoroscopic imaging yields the transform ^{OR}T_{model}, which takes all fragment models as one fracture into the OR, and the pose identification obtains the transform T_i (Section 5.3.3). The composition of the transforms from planning and intraoperative imaging is the gold standard of our registration:

    T_i^{GS} = ^{OR}T_{model} (^{plan}T_{model(i)})^(−1),   i = 1, ..., N.    (5.6)
We use the mean Target Registration Error (mTRE) [27] to report the pose identification error e_i. It is determined by the gold-standard transform and the transform obtained during registration, and is evaluated over all surface points of the fragment:

    e_i = mTRE_i = (1/|Ω|) Σ_{x∈Ω} ||T_i x − T_i^{GS} x||,   i = 1, ..., N,    (5.7)

where Ω is the surface point set. This error represents the mean surface-to-surface distance between corresponding points of the true fragment in the OR and the estimated fragment after registration. A pose identification for a fragment is deemed successful if the final mTRE is below 3 mm.

In order to study the performance of the method under different distances of planning, the mTREs are correlated to the model-to-plan distances, and success rates are analyzed for different ranges of such distances. The model-to-plan distance d_i of a fragment is the mean displacement generated by the planning. It is calculated from the transform obtained during planning and all surface points of the fragment:

    d_i = (1/|Ω|) Σ_{x∈Ω} ||(^{plan}T_{model(i)})^(−1) x − x||,   i = 1, ..., N.    (5.8)

5.4.2 Experiments with Synthetic Fractures

We first perform experiments with synthetic fractures and synthetic fluoroscopic images. The goal of this type of experiment is to investigate our method's behavior under optimal conditions, where the system noise introduced during the planning and pose identification stages is minimal, and the best imaging views, with minimal fragment occlusion, are available. The right wrist CT of a human cadaver is used to generate the synthetic fracture cases. The CT has a resolution of 512 × 512 × 72 voxels and a spacing of 0.174 × 0.174 × 1 mm3 (lower-resolution CT could also be used, as long as the fragment models can be accurately extracted). The radius is first segmented from the CT, and is then cut with a plane to generate two fracture cases: a two-fragment fracture case (Fig. 5.3) and a three-fragment fracture case (Fig. 5.4).
This process produces a total of five fragment models, each resampled and cropped to a resolution of 256 × 256 × 128 voxels and a spacing of 0.2 × 0.2 × 0.6 mm3. The two-fragment fracture case simulates a diaphyseal segmental fracture that consists of two major bone fragments: one with an irregular shape, and one with a tubular shape. The three-fragment fracture case simulates a trauma that includes an additional peri-articular oblique fracture. These simulated fractures create fragments of different shapes and sizes, and are common types of fractures in the clinic.

Figure 5.3: Simulated wrist fracture that contains a two-fragment diaphyseal segmental fracture.

Figure 5.4: Simulated wrist fracture that consists of two fracture surfaces and three fracture fragments: one diaphyseal segmental fracture surface and one peri-articular oblique fracture surface.

Fluoroscopic images are simulated from the fragment models with a virtual GE OEC 9800 C-arm (Section 5.3.2). The virtual device has a fixed source-to-detector distance of 920 mm and a detector size of 213.7 × 213.7 mm2. Fluoroscopic images are simulated with a resolution of 256 × 256 pixels and a spacing of 0.83 × 0.83 mm2, with the fragment models placed at a location with a source-to-object distance of 400 mm. To obtain a simulated fluoroscopic image F*_j for a direction j, a DRR, DRR_j, is first generated from the fragment models along that direction; it is then composed with a true fluoroscopic image F_empty captured from an empty field; and finally a certain amount of random noise is added.

Figure 5.5: Formation of a simulated fluoroscopic image. Left: DRR of a fracture CT. Middle: empty true fluoroscopic image. Right: the simulated fluoroscopic image.

The composition at a pixel location (x, y) is formulated as
    F^*_j(x, y) = \mathrm{DRR}_j(x, y) \, F_{\mathrm{empty}}(x, y) / D + N(\mu, \sigma^2), \quad j = 1, \ldots, M,    (5.9)

where D is a constant used to control the contrast of the resulting image, N(\mu, \sigma^2) controls the amount of noise added, and M is the number of fluoroscopic views. When generating the DRRs from the fragment models, we also incorporated the heel effect, a physical phenomenon that introduces additional intensity distortions into the resulting X-ray images. Fig. 5.5 shows the formation of one simulated fluoroscopic image.

For each of the synthetic fracture cases, the poses of all fragment models are independently and randomly perturbed to obtain a set of 100 virtual treatment plans. The perturbation ranges are ±15° (maximal value) for the rotation parameters (with rotation centers at the geometric centers of the models), and ±10 mm (maximal value) for the translation parameters. Two simulated fluoroscopic images are produced: one from the x-direction (AP view) and one from the y-direction (lateral view), using the composition parameters D = 256, µ = 0 and σ = 5. The coordinate axes of both fracture cases are shown in Fig. 5.3 and Fig. 5.4. As the fluoroscopic images are directly simulated from the fragment models, the transform for fluoroscopic imaging, OR_T_model (Fig. 5.2), is the identity, which simplifies the calculation of the gold standard (Eq. 5.6).

For each fracture case and each generated treatment plan, the perturbed poses are recovered with our method. Registrations with mTRE > 3 mm are considered unsuccessful, and are excluded from the calculation of error statistics. The results are summarized in Fig. 5.6 and Table 5.1 for the two-fragment fracture case, and in Fig. 5.7 and Table 5.2 for the three-fragment fracture case. For a visual check of the final result, Fig. 5.8 shows the intensity differences between the simulated fluoroscopic images and the corresponding DRRs before and after registration for one typical experiment.
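The image-formation model of Eq. (5.9) can be sketched as follows; this is a simplified Python version operating on images stored as nested lists (the thesis implementation additionally models the heel effect, which is omitted here):

```python
import random

def simulate_fluoro(drr, f_empty, D=256.0, mu=0.0, sigma=5.0, seed=0):
    # Eq. (5.9): compose a DRR with an empty-field fluoroscopic image,
    # scale the contrast by D, and add Gaussian noise N(mu, sigma^2).
    rng = random.Random(seed)
    return [[drr[y][x] * f_empty[y][x] / D + rng.gauss(mu, sigma)
             for x in range(len(drr[0]))]
            for y in range(len(drr))]
```

Setting sigma to zero makes the composition deterministic, which is handy for testing the contrast scaling in isolation.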
Figure 5.6: Pose identification errors versus distances of planning for the two-fragment synthetic fracture experiments.

                                        Fragment A    Fragment B
  Errors in        θx (deg)             0.24±0.44     0.37±0.46
  Rigid-Body       θy (deg)             0.52±0.61     1.03±0.96
  Parameters       θz (deg)             2.06±1.26     0.64±0.68
                   tx (mm)              0.78±0.85     1.83±1.71
                   ty (mm)              0.33±0.61     0.66±0.83
                   tz (mm)              0.27±0.54     0.05±0.06
  mTRE (mm)                             0.43±0.49     0.24±0.18
  Successful       di ≤ 10 mm           91            100
  Registrations    di ≤ 15 mm           90            98
  (%)              di ≤ 20 mm           84            85

Table 5.1: Error statistics for the two-fragment synthetic fracture experiments.

Figure 5.7: Pose identification errors versus distances of planning for the three-fragment synthetic fracture experiments.

                                        Fragment A    Fragment B    Fragment C
  Errors in        θx (deg)             1.03±1.15     0.51±0.77     0.89±0.69
  Rigid-Body       θy (deg)             0.58±0.46     1.03±0.81     1.42±0.98
  Parameters       θz (deg)             2.08±1.27     0.60±0.77     1.40±0.90
                   tx (mm)              0.84±0.68     1.79±1.44     2.76±1.95
                   ty (mm)              1.49±1.69     0.91±1.37     1.79±1.40
                   tz (mm)              0.38±1.37     0.08±0.08     0.13±0.11
  mTRE (mm)                             0.62±1.33     0.21±0.16     0.31±0.12
  Successful       di ≤ 10 mm           68            100           93
  Registrations    di ≤ 15 mm           57            87            65
  (%)              di ≤ 20 mm           50            80            61

Table 5.2: Error statistics for the three-fragment synthetic fracture experiments.

Figure 5.8: Difference images (AP and lateral views) between the simulated fluoroscopic images and the corresponding DRRs for one typical case of the three-fragment synthetic fracture experiments. The background of the simulated fluoroscopic images is removed in order to visualize the details. First row: before registration. Second row: after registration. The model-to-plan distances for fragments A, B and C are 10.7 mm, 7.8 mm and 13.0 mm, respectively. The corresponding pose identification errors are 0.4 mm, 0.1 mm and 0.3 mm, respectively. Registration took about four minutes on an Intel Core 2 PC with a GeForce 8600 GPU.

5.4.3 Experiments with Fracture Phantoms

Two fracture phantoms are made to study our method's performance in a lifelike clinical environment, where various sources of system noise exist and the orientations of fluoroscopic imaging are more restricted.

Two wrist fracture cases are used as templates to create the fracture phantoms. The first case is a peri-articular distal radius fracture which has one fracture surface and two involved fragments (Fig. 5.9). The second case is a more complicated comminuted distal radius fracture which has two irregular fracture surfaces and three involved fragments (Fig. 5.10). In this fracture case, two of the involved fragments are small, and one is embedded inside a large bone. This is a rather difficult problem for registration, because it is impossible to obtain fluoroscopic images without occlusions, and the images of the small fragments do not provide rich information for registration.

Figure 5.9: One phantom replicates a peri-articular segmental fracture which has one fracture surface with two involved fragments.

Figure 5.10: The second phantom replicates a comminuted fracture case which has two irregular fracture surfaces with three involved fragments.

The construction of a fracture phantom starts from the diagnostic CT of the fracture case. First, the fragments are segmented from the CT, and the corresponding mesh models are created. Next, the mesh models are printed in Acrylonitrile Butadiene Styrene (ABS) plastic using a rapid-prototyping 3D printer. Then, the surfaces of the printed bones are coated with a barium sulfate solution; a small amount of lacquer is added to the solution to prevent the barium sulfate from dissolving in hot water. This solution simulates bone density under X-ray imaging, as plastic materials are hardly visible in CT and fluoroscopic images.
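The success-rate rows of Tables 5.1 and 5.2 can be derived from the per-registration results by binning the final mTRE against the model-to-plan distance; a small Python sketch (function and variable names are illustrative):

```python
def success_rates(results, thresholds=(10.0, 15.0, 20.0), mtre_max=3.0):
    # results: (model-to-plan distance d_i in mm, final mTRE in mm) pairs.
    # A registration counts as successful when its final mTRE is below
    # mtre_max; a rate is reported for each planning-distance threshold.
    rates = {}
    for t in thresholds:
        subset = [m for d, m in results if d <= t]
        ok = sum(1 for m in subset if m < mtre_max)
        rates[t] = 100.0 * ok / len(subset) if subset else float("nan")
    return rates
```

For example, three registrations with (distance, mTRE) pairs (5, 1), (12, 4) and (18, 2) give a 100% rate for plans within 10 mm, 50% within 15 mm, and about 67% within 20 mm.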
Next, the models are positioned inside a plastic container, tissue-mimicking material is poured in, and the container is cooled down. The tissue-mimicking material both fixes the models' positions and simulates the surrounding soft tissues. It is made by mixing 4% Agar with 96% boiled water, and hardens quickly as it cools. During the hardening process, the models are manipulated to create the desired fracture layout. Finally, five labeled CT fiducials are attached to three faces of the phantom container for computing the gold standard during the experiments. Note that the constructed fracture phantom replicates the shapes and sizes of the fragments in the fracture case, but with a customized spatial relationship among the fragments.

For each created fracture phantom, a CT scan is acquired, then the fragments are segmented, down-sampled and cropped. The created fragment models have a resolution of 256×256×128 voxels and a spacing of 0.35×0.35×0.7 mm3 for both phantoms. 50 random treatment plans are generated from the fragment models, with a perturbation range of ±15° for rotation and ±10 mm for translation for the two-fragment fracture case, and ±10° / ±5 mm for the three-fragment fracture case.

Figure 5.11: Four fluoroscopic images are used for each phantom case. First row: the two-fragment fracture case. Second row: the three-fragment fracture case.

Four fluoroscopic images are acquired for each phantom, with the phantom placed roughly at the iso-centre of the C-arm. The imaging orientations are about 45° apart from each other, starting from the x-direction (AP view) and rotating counterclockwise around the z-axis (Fig. 5.9 and Fig. 5.10), so the third imaging direction is close to the lateral view. Due to the constraints within the OR, these views are the most available orientations.
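Generating the random treatment plans described above amounts to drawing independent perturbations within the stated bounds for each fragment; a minimal sketch follows (the thesis states only the maxima, so uniform sampling is an assumption, and the names are illustrative):

```python
import random

def random_plan(rng, rot_max_deg=15.0, trans_max_mm=10.0):
    # One virtual plan for one fragment: a rotation (deg) about each axis
    # and a translation (mm) along each axis, drawn within the maxima.
    rot = [rng.uniform(-rot_max_deg, rot_max_deg) for _ in range(3)]
    trans = [rng.uniform(-trans_max_mm, trans_max_mm) for _ in range(3)]
    return rot, trans

rng = random.Random(7)
plans = [random_plan(rng) for _ in range(50)]                  # two-fragment case
plans_3frag = [random_plan(rng, 10.0, 5.0) for _ in range(50)] # three-fragment case
```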
Other orientations may also be available, but either they do not add much information for registration, or the patient's pose has to be changed. Images are captured with a resolution of 473 × 473 pixels and a spacing of 0.45 × 0.45 mm2, and are down-sampled to a resolution of 256 × 256 pixels and a spacing of 0.83 × 0.83 mm2. As the C-arm rotates around the phantoms, the focal length varies between 916 mm and 923 mm due to changes in the position of the C-arm. The fluoroscopic images used for this study are shown in Fig. 5.11.

                                        Fragment A    Fragment B
  Errors in        θx (deg)             1.79±1.21     1.29±0.86
  Rigid-Body       θy (deg)             1.23±1.16     1.82±1.78
  Parameters       θz (deg)             1.72±1.14     1.53±1.18
                   tx (mm)              0.80±0.49     0.99±0.54
                   ty (mm)              1.05±0.72     2.44±0.73
                   tz (mm)              0.45±0.34     0.71±0.52
  mTRE (mm)                             1.53±0.64     2.79±0.60
  Successful       di ≤ 10 mm           98            92
  Registrations    di ≤ 15 mm           91            86
  (%)              di ≤ 25 mm           87            77

Table 5.3: Error statistics for the two-fragment fracture phantom experiments.

To compute the gold standard for each phantom case, the transform that positions the phantom in the OR for fluoroscopic imaging (i.e. OR_T_model in Fig. 5.2) must be known. This is obtained using the CT markers attached to the phantom container surfaces. During fluoroscopic acquisition, the markers' positions in the OR coordinate frame are recorded using optical tracking. After the phantom CT is acquired, the positions of the same set of markers in the phantom coordinate frame are obtained by segmenting the markers from the phantom CT. As the correspondences between the two point sets are known, the transform that relates the two coordinate frames can be easily computed by aligning the two point sets.

Registrations are performed using the default edge-enhancement and optimization parameters. Results with mTRE > 3 mm are considered unsuccessful, and are excluded from the calculation of error statistics. Fig. 5.12 and Table 5.3 show the experimental results for the two-fragment fracture phantom, and Fig.
5.13 and Table 5.4 show the results for the three-fragment fracture phantom. Figures 5.14 and 5.15 show examples of the registration results for the two phantoms.

Figure 5.12: Pose identification errors versus distances of planning for the two-fragment fracture phantom experiments. 6% of the total points are cropped from the fragment A chart because their mTREs are outside the display range (i.e. > 25 mm).

Figure 5.13: Pose identification errors versus distances of planning for the three-fragment fracture phantom experiments. 2% of the total points are cropped from the fragment C chart because their mTREs are outside the display range (i.e. > 15 mm).

                                     Fragment A    Fragment B    Fragment C
  Errors in        θx (deg)          1.61±1.67     2.00±2.29     1.99±1.34
  Rigid-Body       θy (deg)          1.18±1.05     2.10±1.81     1.33±1.81
  Parameters       θz (deg)          2.68±1.82     1.58±1.19     2.59±2.18
                   tx (mm)           0.66±0.35     1.44±0.99     0.87±0.59
                   ty (mm)           0.76±0.50     0.96±1.52     0.96±0.50
                   tz (mm)           1.13±0.52     0.90±0.68     0.85±0.47
  mTRE (mm)                          1.74±0.52     1.98±1.80     1.71±0.41
  Successful       di ≤ 5 mm         100           98            100
  Registrations    di ≤ 10 mm        93            94            97
  (%)              di ≤ 15 mm        87            86            93

Table 5.4: Error statistics for the three-fragment fracture phantom experiments.

5.4.4 Validation Against Outliers in Fluoroscopic Images

To study the behavior of our method in the presence of outliers in fluoroscopic images, we generated simulated fluoroscopic images using the two-fragment patient CT, which includes the surrounding bones (Fig. 5.9). First, both fracture fragments were rotated and shifted manually to create a larger amount of displacement. Then, using the method described before (Section 5.4.2), a set of simulated fluoroscopic images was created from four directions that are perpendicular to the z-axis, starting from -x and rotating towards +y, roughly 75° apart (Fig. 5.16, first row). These directions were selected because they are usually available in the OR and provide rich information about the fracture. Two types of studies were performed.
First, the original fragment positions in the patient CT were used as the planned positions, in which case the manually created displacements (i.e. the model-to-plan distances, or initial errors) were the registration gold standard. Second, similar to the previous experiments, a total of 100 random treatment plans were generated around the manually created fracture configuration (±15° for the rotation parameters and ±10 mm for the translation parameters), in which case the perturbed displacements were the registration gold standards. Fig. 5.16 shows the simulated fluoroscopic images used in both studies as well as the treatment plan for the first study, and Table 5.5 summarizes the experimental results for both studies.

Figure 5.14: Result of registration for the two-fragment phantom. Top row: the fluoro images used. Middle row: the treatment plan. Bottom row: the result of the registration of the plan to the fluoro images. Note that, to demonstrate the robustness of the proposed registration technique, we have chosen a treatment plan that differs significantly from the pose of the fractures in the fluoro images.

Figure 5.15: Result of registration for the three-fragment phantom. Top row: the fluoro images used. Middle row: the treatment plan. Bottom row: the result of the registration of the plan to the fluoro images. Note the adjustment of the location of the small fracture fragment in the bottom row.

                                   Fragment A      Fragment B
  Study I     Displacement (mm)    12.34           8.22
              mTRE (mm)            0.74            0.45
  Study II    Displacement (mm)    10.57±6.24      15.95±7.18
              mTRE (mm)            0.83±0.64       0.56±0.49
              Success Rate (%)     86              91

Table 5.5: Summary of experimental results for the studies with outliers present in the fluoroscopic images (initial and final errors in Study II are denoted as mean ± standard deviation).
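The gold-standard transform for the phantom studies is obtained by aligning two corresponded fiducial point sets (Section 5.4.3). One standard way to compute such a least-squares rigid alignment is the SVD-based Kabsch method, sketched here with NumPy; the thesis does not name the specific algorithm used, so this is an illustrative stand-in:

```python
import numpy as np

def align_point_sets(P, Q):
    # Least-squares rigid transform (R, t) with Q ~= R @ P + t, for two
    # point sets with known correspondences (SVD-based Kabsch method).
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```

For noise-free, non-degenerate marker configurations this recovers the exact rotation and translation; with measurement noise it returns the least-squares optimum.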
Figure 5.16: First row: four simulated fluoroscopic images were generated from the two-fragment patient CT, which includes the surrounding bones as outliers (the fragment positions were manually displaced). Second row: the treatment plan shown in the corresponding imaging views (the surrounding bones were not present in the DRRs during registration; they are included here for illustration).

5.5 Discussion

The effectiveness of our method is strongly supported by our experimental results. For the synthetic fracture cases, an average pose identification error of 0.4 ± 0.5 mm (i.e. the average mTRE over all synthetic fracture experiments and all fragments) is achieved (see Tables 5.1 and 5.2) with only two simulated fluoroscopic images from the best imaging directions. For the phantom fracture cases, an average pose identification error of 2.0 ± 0.8 mm (i.e. the average mTRE over all phantom fracture experiments and all fragments) is achieved (Tables 5.3 and 5.4) with only four fluoroscopic images from restricted imaging directions. According to the orthopaedic physicians we consulted at Kingston General Hospital, these accuracies are sufficient and meet most clinical needs.

As for capture range, for planning distances below 10 mm, almost all registrations are successful for most fragments in the synthetic fracture experiments (Tables 5.1 and 5.2). Two exceptions are fragment A in both fracture cases, where larger failure rates are observed. Had we relaxed the registration success criterion to 3 mm, however, all registrations for fragment A in both fracture cases would have been successful. For the phantom fracture cases, all experiments are successful for planning distances below 5 mm (Tables 5.3 and 5.4). Given the difficulty of multiple-object 2D-3D registration in a noisy environment, we consider these acceptable capture ranges.
Furthermore, a planning distance of 5 mm can normally be achieved by the orthopaedic surgeons within our institution using manual alignment of bone fragments.

In Tables 5.1 and 5.2, fragment A in both fracture cases has higher failure rates and a slightly larger mTRE distribution. A closer look at the errors in the rigid-body parameters reveals that the main error sources originate from rotation about the z-axis. This is because fragment A in both cases has a cylindrical shape, and the AP and lateral imaging views provide relatively little information for correcting this kind of rotational error. In Table 5.2, the failure rates for the two small fragments (i.e. B and C) are larger than that of the large fragment. These results show that the shape and size of the fracture do affect the performance of our method. However, the impact can be reduced by using fluoroscopic images from better imaging directions. Comparing all three fragments in Table 5.4, we see that fragment C has a similar mTRE distribution (1.71 ± 0.41 mm) to that of A (1.74 ± 0.52 mm), but fragment B has a larger error distribution (1.98 ± 1.80 mm) than that of C, even though its size is slightly larger. This observation can be explained by the imaging views (Fig. 5.11, second row) used for registration: the irregular shadows of fragment C are more informative for registration, so a good pose identification accuracy could be achieved, albeit with a larger failure rate; fragment B, on the other hand, is heavily occluded by the large bone in the fluoroscopic images, so a larger error distribution is reasonable given the uncertainties caused by occlusions. Our algorithm is designed to handle these kinds of issues, so the overall pose identification performance for fragment B is still acceptable.
Given the 0.4±0.5 mm error in the synthetic fracture experiments and the 2.0±0.8 mm error in the phantom fracture experiments, we believe that the major source of error in our algorithm comes from a few factors related to the image similarity metric. In the synthetic fracture experiments, the best imaging directions are used, and the discrepancies between the fluoroscopic images and the DRRs are minimal because the fluoroscopic images are simulated from DRRs. The registration errors should therefore come from sources such as the empty fluoroscope background, the artificially introduced noise, nonlinear function transformations (including image enhancement and mutual information), and premature termination of the optimization, which collectively produced an error of 0.4 ± 0.5 mm. In our phantom fracture studies, the acquired fluoroscopic images are post-processed in order to determine the imaging parameters and to correct the geometric distortions. During registration, though the quality of the produced DRRs can be controlled by using a transfer function or a more accurate DRR generation method, the discrepancies between the fluoroscopic images and the DRRs remain large. Also, in the phantom studies, the involved fragments have different shapes and sizes, and the available imaging views are constrained.

It should be noted that, in our phantom studies, the phantoms are built without including the surrounding bones such as the ulna and the carpal bones. Since, in distal radius fracture treatment, fluoroscopic images are usually acquired from the AP and lateral directions, the surrounding bones can be maximally cropped from the images before the registration is initiated. Therefore, excluding the surrounding bones should not have a critical impact on our experimental results.
In cases where many surrounding bones do appear in the fluoroscopic images, the reported final errors (Table 5.5, Study II) are slightly worse than those from a similar synthetic case without outliers (Table 5.1, row "mTRE"). However, the impact of the outliers is not significant, and the overall final errors are still within 1 mm. We also observe that the results are even better than those from the two-fragment phantom case (Table 5.3, row "mTRE"). This is mainly because simulated fluoroscopic images of good quality were used; nevertheless, our studies show that, if fluoroscopic images of good quality and from appropriate imaging views can be acquired, the impact of outliers will not be significant.

Our method has a number of parameters that control the behavior of the edge-enhancement filter and the optimization, and they are currently determined through empirical testing. We have found that the registration results are somewhat sensitive to some of these parameters, such as σ, A and B in the edge-enhancement filter. Comparing the failure rates in Table 5.2 and Table 5.4, we see that the phantom fracture experiments achieve better capture ranges than the synthetic fracture experiments. The same observation holds for fragment B in Tables 5.1 and 5.3. Although more fluoroscopic images are used in the phantom fracture experiments, the main reason for this observation is that the user parameters were tuned for the phantom experiments. The values of these parameters are mainly determined by the structural details in the fluoroscopic images and by the type of device used for fluoroscopic imaging. Therefore, for each type of fracture and each type of imaging device, the parameters need only be tuned once. Our method employs several techniques in order to achieve a good balance between performance and robustness.
For example, GPU-based DRR generation and the IIR gradient are employed for speed, while Mattes mutual information, CMA-ES, and the multi-resolution and global-local alternating optimization scheme are adopted for both performance and robustness. Each registration in our experiments takes about 3 to 9 minutes to complete; the experiments are performed on a personal computer with the following configuration: Windows XP, an Intel Core 2 2.4 GHz CPU, 3 GB RAM, and an nVidia GeForce 8600 GT graphics card with 256 MB VRAM. As the computational power of CPUs and GPUs is rapidly increasing, we are optimistic that it will not be long before one registration can be completed within one minute.

In a clinical environment, there are three potential ways to use the transformations reported by our method. First, our method can be used like traditional CAS systems to guide the intraoperative reduction procedure with a preoperative treatment plan, where the obtained transformations are used to compute the treatment errors with respect to a carefully made plan and to provide both visual and quantitative feedback to the surgeon. The advantage is that our method is non-invasive; the drawback is that the computational efficiency and robustness need to be further improved. The second usage is postoperative evaluation of treatment errors, which is similar to the first usage but places fewer requirements on computation speed; here our method is immediately suitable. Lastly, the most ambitious and prospective usage is to use the obtained transformations to directly control a treatment robot in order to achieve automatic fracture reduction; there is still a long way to go to achieve this goal.

5.6 Summary

We present a new multiple-object 2D-3D registration method to identify the poses of fracture fragments in the space of the treatment plan with only 2-4 fluoroscopic images.
Two key techniques are used to deal with occlusions, outliers, noise and small fragment sizes. First, a similarity metric that integrates edge information with mutual information is used to obtain an accurate and smooth cost function. Second, a global-local alternating optimization scheme that employs CMA-ES is used to achieve a good balance between capture range and convergence speed. A mean pose identification error of 0.4 mm with a capture range of up to 10 mm is achieved for the synthetic fracture studies that simulate optimal treatment setups, and a mean pose identification error of 2.0 mm with a capture range of up to 5 mm is achieved for the phantom fracture studies that simulate lifelike treatment setups.

Using 2D-3D registration in place of optical tracking provides minimally invasive treatment for complex bone fractures with a reduced amount of irradiation. The method can be used as a first trial before more invasive and costly procedures are attempted, or it can be used to evaluate post-treatment errors against the treatment plan. In future work, we will seek more systematic ways to determine the user parameters that control the edge-enhancement filter and the optimization behavior. We will also further optimize the software by using the latest GPU parallel computation technologies to improve our method's performance.

Chapter 6

Modelling of 3D Intensity Atlas with B-Spline FFD

6.1 Overview

Fast instance generation is a key requirement in atlas-based registration and other problems that need a large number of atlas instances.¹ This chapter describes a new method to represent and construct intensity atlases.
Both geometry and intensity information are represented using B-spline free-form deformation (FFD) lattices: intensities are approximated using the multi-level B-spline approximation algorithm during model creation, and the parallel computation capability of modern graphics processing units (GPUs) is used to accelerate the process of instance generation. Experiments with distal radius CTs show that, with a coefficients-to-voxels ratio² of 0.16, intensities can be approximated to an average accuracy of 2 ± 17 grey-levels (mean ± stdev, out of 3072 total grey-levels), and that instances with a resolution of 256 × 256 × 200 voxels can be produced at a rate of 25 instances per second on a GeForce GTX 285 video card, which is about a 500-fold performance improvement over the traditional method that uses the CPU alone.

¹ This work has been published in Proceedings of EMBC: R. H. Gong, J. Stewart, and P. Abolmaesumi, "A new representation of intensity atlas for GPU-accelerated instance generation", Annual International Conference of Engineering in Medicine and Biology Society, pages 4399-4402, 2010.

² Defined as the ratio between the number of B-spline intensity deformation parameters and the number of original voxels; see Section 6.4.1 for more details.

6.2 Introduction

Anatomical atlases (or statistical shape models) are widely used in Computer Assisted Surgery (CAS) for registration, segmentation, planning, interpretation, and so on. An atlas captures the statistics, including the mean and variations, of a set of instances of the same anatomy. The captured information can be geometry only (hereafter, a geometry atlas) or both geometry and intensity (hereafter, an intensity atlas). While geometry atlases are more efficient to use due to their compact sizes, intensity atlases offer greater reliability and accuracy because of the additional intensity information they provide.
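The coefficients-to-voxels ratio used in the overview is simply the lattice parameter count divided by the voxel count of the original volume; a trivial sketch (the lattice dimensions below are hypothetical, since the actual resolutions are given in Section 6.4.1):

```python
def coeffs_to_voxels_ratio(lattice_dims, volume_dims):
    # Number of B-spline intensity deformation parameters divided by the
    # number of voxels in the original volume (see footnote 2).
    n_coeffs = 1
    for d in lattice_dims:
        n_coeffs *= d
    n_voxels = 1
    for d in volume_dims:
        n_voxels *= d
    return n_coeffs / n_voxels

# Hypothetical example: an 8x8x8 lattice over a 16x16x16 volume.
ratio = coeffs_to_voxels_ratio((8, 8, 8), (16, 16, 16))  # 0.125
```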
One key problem associated with an atlas is to find an appropriate representation such that correspondences across a set of training shapes can be easily established and new instances can be efficiently generated. Geometry atlases have been well studied during the past two decades. Reported representations or underpinning techniques include point distribution models [20], minimum description length [23], medial axes [32], spherical harmonics [15], tetrahedral meshes [123], connected in-spheres [101], and deformation fields [92]. For most of these techniques, the process of atlas construction and the quality of the resulting model depend on the geometry of the shape being studied. One exception is the technique that uses deformation fields, which are the result of B-spline deformable registration. However, that technique uses multiple full-resolution volumetric data components as the model representation, which not only takes a large amount of storage space, but also slows down the process of instance generation. This limitation can become a performance bottleneck when the application context requires a large number of atlas instances, as in atlas-based 3D registration.

Adding intensity information into atlases further complicates the problem. Thus, intensities are usually sampled or mathematically approximated in order to reduce the size of the models. The Active Appearance Models suggested by Cootes et al. [19] capture texture variations within a sampled "shape-free" patch (i.e. the region of interest). Berks et al. [7] use a thin-plate spline to approximate intensities in their Mammographic Appearance Models. For a 3D intensity atlas, Yao et al. [122] choose Bernstein polynomials to approximate the intensity distributions within each tetrahedron. In all these methods, while intensity information is captured with satisfactory accuracy, efficiently producing instances from the models remains a problem.
In recent years, the processing power of consumer-grade graphics processing units (GPUs) has been improving rapidly, and GPUs are increasingly used for general-purpose computations. In this chapter, we describe a simple representation for intensity atlases that takes advantage of this trend. Both geometry and intensity information are represented using B-spline deformation lattices: intensities are approximated with the Multi-level B-spline Approximation (MBA) algorithm [59] during atlas creation, and the parallel computation capability of modern GPUs is used to accelerate the process of instance generation. We evaluate our method with a set of distal radius CT data, and report the accuracy of the intensity approximation as well as the performance of instance generation. Our primary contribution is the use of a B-spline lattice for representing the intensity information of atlas models, which is an accurate and compact representation that is well suited to GPU parallel processing. Our experiments show that, compared with the traditional CPU-based method, the use of B-splines and the GPU achieves about a 500-fold improvement in performance.

6.3 Method

6.3.1 Atlas Representations

Every atlas has two representations: an internal representation that describes the mean and variations of the atlas, and an external representation, known as the shape parameters, through which user applications request instances. The former has a different format for each kind of atlas, and determines the space/time costs needed to produce each instance. The latter has a compact format, and its dimension has a great impact on the performance and behavior of the user applications. The internal representation of any instance in our atlas can be generally described by a function S = f(V; C), with C = {S, M} and V = {Φ_g, Φ_i}.
C and V represent the constant and variable aspects of the model, respectively:

• C consists of a volume that describes the mean geometry and intensities (S), and a binary mask that segments the anatomy of interest (M).

• V describes the geometry and intensity variations between this instance and the mean volume. It consists of two uncorrelated (in terms of the positions of the control points) B-spline control lattices, one for the geometry deviation (Φ_g) and one for the intensity deviation (Φ_i). The two lattices cover the same region of interest, but with different resolutions and spacings. The selection of the resolutions depends on the accuracy of the variations to be captured, but the resolutions are always coarser than the original volume grid in order to achieve data reduction.

To obtain an instance from the model, the two control lattices are sequentially applied to the mean volume: intensity first, followed by geometry.

The goal of the external representation is to use the fewest parameters to represent the most variation within a set of training examples. This is done by performing Principal Component Analysis (PCA) on the variable part of the internal representation. For each training example S_k, k = 1...N, we form a column vector v_k = (Φ^x_g(k), Φ^y_g(k), Φ^z_g(k), Φ_i(k))^T, where the first three components are the X, Y and Z components of the geometry B-spline lattice Φ_g, each linearized as a row vector, and the last component is the linearized row vector of the intensity B-spline lattice Φ_i. Performing PCA on v over all training examples, we obtain the following generalized linear model

    v = P b,    (6.1)

where the columns of P describe a set of orthogonal modes of shape variation corresponding to the given set of training examples, and b is the projection of a known v into the new coordinate space, which is the external representation of the instance that user applications see.
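The PCA step behind Eq. (6.1) can be sketched with NumPy's SVD as follows; the function and variable names are illustrative, and each row of V stands for the linearized geometry-plus-intensity lattice vector v_k of one training example (Eq. (6.1) is stated for deviations, so the mean is carried separately here):

```python
import numpy as np

def build_shape_model(V):
    # Rows of V are the training vectors v_k. PCA yields orthonormal modes
    # P (columns, variance-sorted by the SVD) so that v - v_mean = P @ b.
    V = np.asarray(V, float)
    v_mean = V.mean(axis=0)
    _, _, Vt = np.linalg.svd(V - v_mean, full_matrices=False)
    return v_mean, Vt.T

def shape_parameters(v, v_mean, P):
    # External representation b of a known internal vector v.
    return P.T @ (v - v_mean)
```

Any training vector is then reconstructed exactly from its shape parameters as v_mean + P @ b, since its centered form lies in the span of the modes.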
The modes are sorted in descending order according to their variance. The dimension of b is less than or equal to N, and the values of b are bounded by the convex hull that contains the training examples. All values of b within the convex hull collectively represent the valid shapes that the given set of training examples can predict.

6.3.2 Atlas Construction

Given a set of volumetric training examples, the goal of atlas construction is to mutually align all training examples, and to transform each into the internal and external representations. The following steps give an overview of the process:

1. Segmentation and initial alignment.

• Segment the anatomy from all training examples, and select an arbitrary one to define the atlas Coordinate Frame (CF). The principal axes of the selected shape are used to define the atlas CF: shortest → X, medium → Y, longest → Z (Fig. 6.1).

• For each other training example, the principal axes of the shape are automatically aligned with the atlas CF. As multiple mappings between the two CFs exist, some training examples will be misaligned. However, the misalignments are only 180° rotations about the X-axis and/or 90° rotations about the Z-axis, and they are corrected interactively in a graphical user interface (GUI).

2. Group-wise registration to align all training examples. The results include a common mean volume with the corresponding anatomy mask and, for each training example, a sequence of transformations that map the training example onto the mean:

• A rigid transform. It is discarded after registration, as it is not an intrinsic property of any shape.

• A free-form geometry deformation in the form of a B-spline lattice.

• An anisotropic scaling. It is merged into the above deformation after registration, as our atlas does not explicitly capture scaling.

• An intensity deformation in the form of a B-spline lattice.
The lattice is constructed using the MBA algorithm (described later).

The method by Balci et al. [4] is ideal for this task. However, it is sensitive to a number of user settings, so we adopted a sub-optimal solution: one training example is appointed (rather than dynamically computed) as the mean, and pair-wise registrations are performed to align all training examples to it. The downside of this choice is that the appointed mean may be far from the true mean, which leads to larger total variations within the constructed atlas, so that more modes of variation are needed to capture all possible shapes. However, the choice does not affect the accuracy in reproducing the training examples from the resulting shape parameters. Moreover, the negative impact can be minimized by visualizing all training examples on screen and choosing one that is of average size and not obviously abnormal.

3. Once all training examples are aligned, PCA is performed on the geometry and intensity B-spline lattices as described in Section 6.3.1.

One main contribution of this work is to represent and approximate intensity values with a B-spline lattice. This decision is motivated by two facts: first, intensity is used as supplementary information to the geometry, so certain approximation errors are tolerable; second, we approximate the intensity differences between the training examples and the mean, which are usually smooth. The use of a B-spline lattice is therefore able to achieve good data reduction while maintaining satisfactory accuracy. For a given training example, the intensities are approximated using the MBA algorithm [59] as follows:

1. Initialize D to be the intensity difference between the training example and the mean, choose an initial size for the output lattice Φ, compute Φ from D using the B-spline Approximation (BA) algorithm, and set Ψ = Φ;

2. Approximate D with Ψ to obtain D′, and compute the residual R = D − D′;

3.
If R is satisfactory, finish the approximation and report Φ as the final lattice; otherwise do the following:

• Double the resolution of Ψ in each dimension, and compute its new values from R using the BA algorithm;

• Refine Φ to have the same resolution as Ψ, and merge Ψ into Φ (see [59] for details);

• Set D = R;

• Repeat steps 2-3.

6.3.3 Instance Generation

Given a vector of shape parameters, b, a corresponding instance can be generated from the atlas. First, b is back-projected into a vector v with Eq. (6.1), which is in turn re-formatted to obtain a geometry B-spline lattice and an intensity B-spline lattice. Then, the two lattices are sequentially applied to the mean volume.

Applying the B-spline lattices to the mean volume is the most time-consuming task during instance generation, especially when the resolution of the instance is large. One good property of B-spline lattices is that they are regular data structures, which suits parallel processing. We use CUDA-capable GPUs from nVidia (Santa Clara, CA) [2] to accelerate instance generation. Such GPUs contain a number of processor cores, each of which can execute multiple threads in parallel. To configure a GPU for instance generation, the GPU is programmed with a kernel [2] that maps each voxel of the resulting volume to one GPU thread; each thread performs the geometry and intensity transformations independently and simultaneously. As the total number of concurrent threads is limited, the resulting volume is generated slice by slice; within each slice all voxels are computed in parallel. The mean volume, the anatomy mask, and the coefficients of the B-spline lattices are stored on the GPU as 3D textures for fast computation. When an instance is requested, the coefficients of the B-spline lattices are sent from the host memory to the GPU, the kernel is executed, and finally the result is transferred back to the host memory.
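The coarse-to-fine MBA loop of steps 1-3 above can be sketched schematically. In this 1-D sketch, least-squares fits to a linearly interpolated control grid stand in for the volumetric data and the cubic B-spline BA step, and the lattice merge is replaced by summing the per-level fits (equivalent when only the final approximation is evaluated):

```python
import numpy as np

def fit_level(x, d, n_ctrl):
    """Least-squares fit of a piecewise-linear control grid to data d(x)
    (a stand-in for the BA step on one lattice level)."""
    knots = np.linspace(0.0, 1.0, n_ctrl)
    # design matrix of hat (linear interpolation) basis functions
    B = np.maximum(0.0, 1.0 - np.abs((x[:, None] - knots[None, :]) * (n_ctrl - 1)))
    c, *_ = np.linalg.lstsq(B, d, rcond=None)
    return B @ c

x = np.linspace(0.0, 1.0, 400)
target = np.sin(2 * np.pi * x) + 0.3 * np.sin(9 * np.pi * x)  # toy difference D

residual = target.copy()
approx = np.zeros_like(x)
n_ctrl = 4
for level in range(5):            # refine until the residual is small
    d_hat = fit_level(x, residual, n_ctrl)
    approx += d_hat               # "merge" this level into the result
    residual = target - approx
    n_ctrl *= 2                   # double the lattice resolution
print(float(np.abs(residual).max()) < 0.1)
```

Each level only has to fit what the coarser levels missed, which is why a very coarse initial lattice suffices.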
While all threads are executed in parallel, the computation each thread does is relatively simple: it transforms the thread's coordinates with the geometry B-spline lattice, fetches the corresponding texel from the mean volume texture, and performs the intensity transformation with the intensity B-spline lattice if the texel is within the anatomy mask. It should be noted that, in the actual implementation, we perform PCA on the inverses of the geometry and intensity transformations, so the above computation can be performed directly at each output voxel location.

Figure 6.1: Training examples for building an atlas of the distal radius (after group-wise registration). Also shown is the reference coordinate frame of the atlas.

6.4 Experiments, Results and Discussion

We built an atlas of distal radii from nine wrist CTs using the method described above. All volumes were re-sampled and cropped to a resolution of 256 × 256 × 200 voxels and a spacing of 0.45 × 0.45 × 1.18 mm³. Fig. 6.1 shows the training set as well as the reference coordinate frame used for atlas construction. We report on the accuracy of intensity approximation with a B-spline lattice, and the performance gain when a GPU is used for instance generation.

6.4.1 Accuracy of Intensity Approximation

Two training examples were chosen to test the accuracy of the reproduced intensities. B-spline lattices of three different sizes were used to study the relationship between the approximation accuracy and the data reduction rate, where data reduction rates are evaluated using the coefficients-to-voxels ratio (CVR), computed as the ratio between the total number of coefficients in the B-spline lattice and the total number of voxels in the training example. Table 6.1 shows the results, and Fig. 6.2 gives a visual check of the approximation results.

When approximating a data set with a B-spline lattice, the spacing of the lattice
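A CPU sketch of the per-thread computation just described — warp the voxel's coordinates with the (inverse) geometry deformation, fetch the mean-volume value there, and apply the intensity offset inside the mask. Toy constant fields and nearest-neighbour sampling stand in for B-spline evaluation and trilinear texture fetches:

```python
import numpy as np

def sample(vol, pts):
    """Nearest-neighbour sampling (stand-in for a trilinear texture fetch)."""
    idx = np.clip(np.rint(pts).astype(int), 0, np.array(vol.shape) - 1)
    return vol[idx[..., 0], idx[..., 1], idx[..., 2]]

shape = (16, 16, 16)
mean_vol = np.fromfunction(lambda i, j, k: i + j + k, shape)  # toy mean volume
mask = np.ones(shape, dtype=bool)                             # toy anatomy mask

grid = np.stack(np.meshgrid(*[np.arange(s) for s in shape], indexing="ij"), axis=-1)
geom_disp = 0.5 * np.ones_like(grid, dtype=float)  # toy inverse geometry field
inten_off = 10.0 * np.ones(shape)                  # toy intensity field

# What each GPU thread does for its voxel, vectorized over the whole grid:
warped = grid + geom_disp                  # geometry transform of the coordinates
out = sample(mean_vol, warped).astype(float)
inside = sample(mask, warped)
out[inside] += inten_off[inside]           # intensity transform inside the mask

print(out.shape == shape)
```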
determines the approximation errors [59]. When the spacing is larger than the minimum distance between any two voxels, approximation occurs and approximation errors arise; otherwise, interpolation occurs and no approximation errors are present. This property makes the B-spline an ideal choice for sparse data approximation. Moreover, if the data values are smooth, a dense data set can still be approximated with a large lattice spacing while maintaining small approximation errors. We can observe this phenomenon in Table 6.1, where small approximation errors were achieved even when the coefficients-to-voxels ratios were much smaller than 1. This observation can be explained by the fact that the intensity difference between any two images of the same anatomy and modality will be smooth if they are deformed to be accurately aligned.

Table 6.1: Accuracy of the intensity approximation with different final B-spline lattice sizes (mean ± stdev; unit: Hounsfield units; intensity range: [-1024, 2047]; accuracies were computed over all voxels within the bones).

Final Lattice Size    Coefficients/Voxels Ratio    Training Example 1    Training Example 2
35 × 35 × 35          0.003                        5 ± 34                5 ± 28
67 × 67 × 67          0.02                         4 ± 28                4 ± 19
131 × 131 × 131       0.16                         2 ± 17                2 ± 11

Figure 6.2: Approximated volumes (one axial slice is shown) for different final B-spline lattice sizes. The top image shows the volume to be approximated.
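The coefficients-to-voxels ratios in Table 6.1 follow directly from the lattice and volume sizes; a quick check of the first two rows at the table's precision, for the 256 × 256 × 200 training volumes:

```python
# CVR = (number of lattice coefficients) / (number of voxels)
voxels = 256 * 256 * 200
cvr_35 = 35 ** 3 / voxels
cvr_67 = 67 ** 3 / voxels
print(round(cvr_35, 3), round(cvr_67, 2))  # 0.003 0.02
```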
6.4.2 Performance of Instance Generation

As the size of the output volume and the data transfer between the CPU and GPU are important factors that affect the performance of atlas-based applications, we performed three types of tests with different instance resolutions:

• CPU only - instances are generated and processed on the CPU (the traditional method);

• CPU+GPU - the GPU generates instances and the CPU does the processing (there is data transfer between CPU and GPU); and

• GPU only - the GPU performs both instance generation and processing (there is negligible data transfer between CPU and GPU).

In this study we were only concerned with the speed of instance generation, so no actual "processing" was applied to the generated instances during testing. The performance is evaluated as the number of instances per second (ips) within the following environment: Intel i7-920, 6 GB RAM, GeForce GTX 285 with 1 GB VRAM, Windows 7 64-bit, CUDA SDK 2.3, and Visual C++ 2008. 25 random instances were generated for each "CPU only" experiment, and 500 random instances were generated for each of the other experiments. Table 6.2 shows the averaged results.

Table 6.2: Performance of instance generation under different usage scenarios.

Instance Resolution    CPU only (ips)    CPU+GPU (ips)    GPU only (ips)
128 × 128 × 100        0.4               6.82             112.71
256 × 256 × 200        0.05              5.37             25.58
512 × 512 × 400        0.006             2.27             4.19

From Table 6.2 we can see large performance improvements when the GPU was used for instance generation. Also observed is that the data transfer between the CPU and GPU is an important performance factor, especially when frequent and large transfers exist. This suggests that there is still room for performance improvement if subsequent tasks are also performed on the GPU. For example, in atlas-based registration, additional performance may be gained if the similarity computation is also done on the GPU.
6.4.3 Performance of Atlas Construction

As described in Section 6.3.2, our atlas method depends on pair-wise registration to incrementally incorporate the training examples. While the quality of the constructed atlas may not be as good as with a method that uses group-wise registration, the performance of atlas construction is an advantage because it is linear in the number of training examples; moreover, a new training example can be added in constant time because it is not necessary to repeat the registrations for the previous training examples. In our experiment with nine training examples of moderate CT resolution, the construction took around three hours.

6.5 Summary

In this chapter, we described a new representation for intensity atlases. Both geometry and intensity information are represented using B-spline lattices, intensities are approximated using the MBA algorithm, and CUDA-capable GPUs are used for fast instance generation. Testing with human wrist CTs showed that the B-spline lattice is an appropriate and compact representation for carrying intensity information in intensity atlases. The use of GPU parallel computation also demonstrated significant speed improvements over the traditional CPU-based method.

Chapter 7

Atlas-based Multiple-object 2D-3D Registration

7.1 Overview

In this chapter, we describe a method to guide the surgical fixation of distal radius fractures.¹ The method registers the fracture fragments to a volumetric intensity-based statistical anatomical atlas of the distal radius, reconstructed from human cadaver and patient data, using a few intra-operative X-ray fluoroscopic images of the fracture. No pre-operative Computed Tomography (CT) images are required, hence the radiation exposure to patients is substantially reduced. Intra-operatively, each bone fragment is roughly segmented from the X-ray images by a surgeon, and a corresponding segmentation volume is created from the back-projections of the 2D segmentations.
An optimization procedure positions each segmentation volume at the appropriate pose on the atlas, while simultaneously deforming the atlas such that the overlap of the 2D projections of the atlas with the individual fragments in the segmented regions is maximized. Our simulation results show that this method can accurately identify the pose of large fragments using only two X-ray views, but for small fragments, more than two X-rays may be needed. The method does not assume any prior knowledge about the shape of the bone or the number of fragments; thus it is also potentially suitable for the fixation of other types of multi-fragment fractures.

¹This work has been published in Proceedings of SPIE: R. H. Gong and P. Abolmaesumi, "Reduction of multi-fragment fractures of distal radius using an atlas-based 2D-3D registration technique", SPIE Medical Imaging, pages 726137-1 - 726137-9, 2009.

7.2 Introduction

Distal radius fractures account for about 15% of all fractures seen in the emergency room [16, 21]. Treatment is usually performed using minimally invasive techniques that are based on imaging and tracking technologies. Compared with other types of fractures that have been extensively studied in the literature (for example, those in the hip and knee), distal radius fractures involve multiple, small fracture fragments; thus the treatment is more challenging. Traditionally, fracture reduction is controlled by intraoperative fluoroscopy [11] or intraoperative CT [51]. A serious drawback of solely using intraoperative imaging is the need for a large number of X-ray images, which exposes the surgical team and the patient to excessive irradiation. In recent years, preoperative 3D data, including bone fragment models [21] and treatment templates [77], have been incorporated and used as the main modalities that guide the surgical process.
The use of intraoperative imaging is minimized, as it is needed only to establish or update the spatial correspondence between the preoperative 3D data and the patient. Fragment models are usually created from a diagnostic CT of the fracture, and a CT of the reflected contra-lateral bone is commonly used as the treatment template [77]. This new treatment technique not only reduces the amount of radiation, but also provides intuitive 3D views of the fracture region without intraoperatively acquiring or reconstructing an instant 3D image. However, it assumes mirror symmetry between the corresponding bones from both sides of the human body, while in reality those bones usually differ in size and shape. As a result, using the contra-lateral bone to guide the treatment may lead to significant misalignments.

We propose a new method for treating distal radius fractures which uses a more case-specific template to guide the treatment process. An atlas (statistical shape) of healthy distal radii is used as a deformable treatment template, and a multiple-fragment deformable 2D-3D registration algorithm that depends on a small number of intraoperative X-ray images is used to compute the relative poses between the template and the individual fracture fragments in the operating room (OR). As the shape of the template is simultaneously estimated during registration to match the fracture being treated, the template is more accurate. In addition, the use of the atlas eliminates preoperative CT imaging and 3D segmentation, which significantly simplifies the treatment process.

A few atlas-based or deformable 2D-3D registration techniques have been proposed in the literature. Sadowsky et al. [95] proposed a method that uses statistical shapes to replace CT in circumstances where the field of view of the X-ray images is limited. Tang et al.
[101] proposed a method that uses a hybrid atlas and a few X-ray images for 3D surface reconstruction. These methods deal with only a single bone piece and are thus not suitable for fracture treatment. To the best of our knowledge, our technique is the first to treat multi-fragment fractures using an atlas-based 2D-3D registration technique.

We present the proposed method in Section 7.3, and report preliminary results with a synthetic fracture in Section 7.4. Finally, Section 7.5 concludes the chapter.

7.3 Method

Our method is based on an atlas of distal radii, which is used as a dynamic template to guide the treatment, and a multiple-fragment deformable 2D-3D registration algorithm, which is used to compute the relative poses between the template and the individual fracture fragments in the OR.

7.3.1 The Atlas of Distal Radius

An atlas captures the statistical information, including the mean and variations, of a group of objects. It not only represents the objects used to construct the atlas, but can also produce "interpolated" new objects that did not exist during atlas construction. An atlas can capture either the statistical information of the geometrical silhouette (hereafter, a geometry atlas) [20, 23, 29, 88, 92], or the statistical information of both the geometrical silhouette and the internal density values (hereafter, an intensity atlas) [122]. While geometry atlases are known for good efficiency in model-fitting applications, intensity atlases are more robust to segmentation errors introduced during atlas creation and model-fitting. Our atlas is an intensity atlas that represents a family of CT volumes containing the distal radii (about 1/4 of the full radius) of both right and left arms. Though Yao [122] developed an efficient method for building intensity atlases using tetrahedral meshes (for representing geometry) and Bernstein polynomials (for representing density), we used a simpler yet more general approach that is based on the B-spline
deformable transformation. It extends Rueckert's method [92] by capturing additional information, including the scale of the bone and the CT Hounsfield values within the bone. Our method does not tightly depend on the content of the training data, and thus can be used to build an atlas for objects of any shape.

Figure 7.1: A few examples of the training data used to build the atlas of the distal radius. All data have been reoriented such that the principal axes are aligned with the coordinate axes. Reflection has also been performed for the data from the left arm. (Data courtesy of Kingston General Hospital, Ontario, Canada)

A total of 16 training data sets (CT), six from Distal Radius Osteotomy (DRO) surgeries and ten from cadavers, were used to build the atlas. The DRO data contained only the two ends of the radius (intended to reduce the irradiation to the patients), while the cadaver data contained the full radius. Nine data sets were from the right arm, and seven from the left arm. A few examples of the training data are shown in Figure 7.1.

To build the atlas, all training data were first normalized into a common coordinate frame. Then, Principal Component Analysis (PCA) was performed and the atlas was created. Once the atlas is constructed, new instances can be generated from atlas coefficients.

Normalization of Training Data

This process transforms all training data into a common coordinate frame, i.e. the atlas coordinate frame, such that all data are aligned. This is a group registration problem, and we use the term "normalization" to distinguish it from other types of registration problems. After normalization, each data set is decomposed into a common mean shape and a transformation that determines the variation of the data away from the mean shape.
We model the transformation as a concatenation of five sub-transforms that sequentially transform the data into the mean shape: rigid transform one → anisotropic scaling → rigid transform two → B-spline deformation → intensity transform. We use the following method to compute the mean and the individual transformations:

1. Compute the first rigid sub-transforms by initially aligning all training data:

(a) Segment the radius from each training data set;

(b) Reorient each segmented radius such that its principal axes are aligned with the coordinate axes (longest → Z, medium → Y, shortest → X), and its geometric center is at the origin;

(c) Due to the special shape of the radius, most training data will be uniformly oriented after the previous step; a few may be mis-oriented by 90° or 180°, and these are identified and corrected in a graphical user interface (GUI);

(d) Reflect the radius in the Y-direction if it is from the left arm.

2. Compute the mean size of the radius using the Axis-Aligned Bounding Boxes (AABBs) of all training data and, for each training data set, compute the scaling factors and scale/crop the radius.

3. Perform pair-wise non-rigid registrations to accurately align all training data. This computes the common mean shape and, for each training data set, the second rigid sub-transform and the B-spline deformation. For more accurate alignment, a group-wise registration algorithm [4] can be used instead.

4. For each training data set and each voxel, compute the intensity difference with respect to the mean shape. To reduce memory usage, polynomials (e.g., power polynomials or Bernstein polynomials) can also be used to approximate the intensity values [122] with a greatly reduced number of coefficients.

Atlas Construction

As described in the previous section, after normalization the variation of each training data set with respect to the mean shape is represented using a sequence of five sub-transforms.
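The five-sub-transform chain from the normalization can be viewed as a function composition; a schematic with toy coordinate mappings standing in for the real registration outputs (the intensity transform, which acts on voxel values rather than coordinates, is noted in a comment but omitted):

```python
import numpy as np

def compose(*fs):
    """Apply the given sub-transforms in sequence."""
    def g(p):
        for f in fs:
            p = f(p)
        return p
    return g

t = np.array([1.0, 2.0, 3.0])
rigid1 = lambda p: p + t                             # toy rigid (translation only)
scaling = lambda p: p * np.array([1.1, 0.9, 1.0])    # anisotropic scaling
rigid2 = lambda p: p - t
bspline = lambda p: p + 0.01 * np.sin(p)             # toy free-form deformation

# The fifth sub-transform (intensity) modifies voxel values, not coordinates,
# so it is not part of this coordinate chain.
to_mean = compose(rigid1, scaling, rigid2, bspline)

p = np.array([0.5, 0.5, 0.5])
print(to_mean(p).shape == (3,))
```

The order matters: composing in the stated sequence is what makes the rigid part separable from the intrinsic shape variation captured by the atlas.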
We build the atlas to capture the statistical information of the last four sub-transforms, because the first one is not an intrinsic part of the radius and will be determined by specific user applications. For each training data set, the inverses of the four sub-transforms were first computed, and then parameterized and concatenated to form one column of a matrix X. Since the inverse of a B-spline deformable transform is not analytically available, we approximated the inverse by registering the mean shape back to the training data; a fast but less accurate pseudo-inversion algorithm could also be used [108]. Finally, we perform Principal Component Analysis (PCA) on X and project each training data set (i.e. some column xi) into the eigenspace:

ai = diag(σ1...σN)([v1...vN]^T xi)    (7.1)

where N is the number of training data sets (16 in our case), (v1...vN) are the eigenvectors computed from XX^T, and (σ1²...σN²) are the corresponding variances along the eigenvectors. ai is the N-dimensional projected point, also called the atlas coefficients. The convex hull of all projected points contains all valid shapes the atlas can generate, based on the given set of training data.

Instance Generation

To generate an instance from the atlas, a set of atlas coefficients is provided, and the inverse of Equation (7.1) is applied to compute a sequence of sub-transforms. Then, sequentially and in reverse order, each of the sub-transforms is applied to the mean shape. If a left-arm radius is requested, the generated volume is Y-reflected. When supplying the atlas coefficients, it is important to constrain the point to within the convex hull defined by the training data; otherwise, the generated shape could be unrealistic.

When generating an instance from a set of atlas coefficients, only the first few significant eigenmodes can be used.
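Eq. (7.1), and the inverse process used for instance generation, can be sketched with toy data; here the per-mode standard deviations are taken from the singular values of X, and the particular normalization (division by √N) is an assumption of the sketch rather than a detail stated in the text:

```python
import numpy as np

rng = np.random.default_rng(2)
N, D = 6, 40                      # toy: 6 training sets, 40 concatenated parameters
X = rng.normal(size=(D, N))       # one column of sub-transform parameters per set

# Columns of U are the eigenvectors of X X^T (via the SVD of X).
U, s, _ = np.linalg.svd(X, full_matrices=False)
sigma = s / np.sqrt(N)            # per-mode standard deviations (assumed scaling)

# Eq. (7.1): a_i = diag(sigma_1..sigma_N) [v_1..v_N]^T x_i, for all i at once.
A = np.diag(sigma) @ (U.T @ X)    # atlas coefficients, one column per training set

# Inverse process (instance generation): undo the scaling and back-project.
x0 = U @ (A[:, 0] / sigma)
print(np.allclose(x0, X[:, 0]))
```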
Depending on the application requirements, using an appropriate number of eigenmodes can achieve a good trade-off between the quality of the generated instances and the computation cost of solving the user's problem. In our distal radius atlas, the first five eigenmodes accounted for about 70% of the total variation in the training data (computed as Σ_{i=1}^{5} σi² / Σ_{i=1}^{N} σi²), which is acceptable for testing our fracture reduction method.

7.3.2 Multiple-Fragment Deformable 2D-3D Registration

The inputs of our method are a set of co-registered intraoperative X-ray images and a dynamic treatment template, i.e. an instance of the distal radius atlas with changing shape. The goal is to: 1) determine the real shape of the template based on the X-ray images of the fracture, and 2) find the relative poses between the determined template and the individual fracture fragments in the OR. An overview of the method is illustrated in Figure 7.2. The fixed data is the set of intraoperative X-ray images. The moving data is the set of fracture fragments in the OR to be reassembled, each modeled as a segmentation volume on the dynamic template (Figure 7.3). To search for a solution: first, the shape of the template is deformed and the segmentation volumes of the individual fragments are moved; then, simulated X-ray images of the fragments, also called Digitally Reconstructed Radiographs (DRRs), are generated and combined; next, similarity values between the combined DRRs and the X-ray images are computed; finally, based on the similarity values under the current transformation, an optimizer is used to update the shape of the template and the poses of the fragments. We describe each of the components involved in the following sections.

The Fragment Model

Each bone fragment is modelled as a segmentation volume on the dynamic template. The segmentation volume (i.e.
the region of interest for one fragment) is constructed interactively from the X-ray images: the bone fragment is roughly segmented by hand in the X-ray images, then the 3D back-projections of these 2D segmentations are intersected to produce a bounding volume around the fragment. The 2D segmentations are convex polygons of (typically) four or five edges. Each polygon defines a cone in 3D with its apex at the known focal point of the X-ray. For each fragment, the cones from the different X-ray images are intersected to form a convex 3D volume. During optimization, the Graphics Processing Unit (GPU) is used to accelerate the operation of applying the segmentation volumes to the template. Since the GPU we used was limited to six clipping planes, we chose six planes that conservatively enclose the 3D volume (the planes are usually not aligned with the coordinate axes). Figure 7.3 shows six clipping planes positioned over the atlas to model one fracture fragment.

Figure 7.2: Overview of the multiple-fragment deformable 2D-3D registration algorithm.

Figure 7.3: A fractured bone fragment is modelled as a segmentation volume on the atlas. The pose of the segmentation volume and the shape of the atlas are to be determined.

Transformation Parameters

There are two types of transforms to be determined during optimization: a global transform that models the shape of the dynamic template as well as its position in the OR coordinate space, and a set of local transforms, one per fragment, that model the poses of the individual fragments within the template. The template shape is modelled as a non-rigid transform with respect to the atlas mean (see Section 7.3.1), and is represented using atlas coefficients. In our experiments, the first five eigenmodes were used to generate instances from the atlas.
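The inside/outside test for such a six-plane convex bounding volume reduces to half-space checks; a minimal sketch where, for simplicity, the six planes bound a unit cube rather than a back-projected cone intersection:

```python
import numpy as np

# Each plane is (nx, ny, nz, d) with "inside" defined by n.p + d <= 0.
# These six planes bound the unit cube [0,1]^3 (a stand-in for the six
# conservative clipping planes around a fragment's cone intersection).
planes = np.array([
    [-1, 0, 0, 0], [1, 0, 0, -1],
    [0, -1, 0, 0], [0, 1, 0, -1],
    [0, 0, -1, 0], [0, 0, 1, -1],
], dtype=float)

def inside(p, planes):
    """A point is inside the convex volume iff it satisfies every half-space."""
    return bool(np.all(planes[:, :3] @ p + planes[:, 3] <= 0))

print(inside(np.array([0.5, 0.5, 0.5]), planes),
      inside(np.array([1.5, 0.5, 0.5]), planes))  # True False
```

On the GPU the same half-space test is what the hardware clipping planes evaluate per vertex/fragment.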
The template position and the local transforms for the individual fragments are rigid transforms, each with six parameters: three rotation angles and three translations. In a two-fragment fracture, for example, there are 5 + 6 + 2 × 6 = 23 parameters to be estimated. To start the optimization, an initial value of the parameters is required. For the template shape, the initial parameters were set to all zeros (corresponding to the mean shape). For the template position and the local transforms, the initial parameters were determined interactively using a graphical user interface.

Similarity Measure

For each transformation, we compute a similarity measure as follows. The transformed bounding volume of each fragment is used to clip the deformed atlas. For each of the X-ray images, each clipped volume is rendered as a DRR with the same camera parameters as used by the X-ray image, and a combined DRR is composed from the individual fragment DRRs. For each pair of X-ray and combined DRR, a similarity value is computed. Finally, the sum of the individual similarity values provides the overall similarity measure. With n fragments and m X-ray images, n × m DRRs and m combined DRRs are generated for each transformation. Since many transformations are considered in the optimization process, we use GPU-accelerated 3D clipping and texture-mapping techniques to quickly generate the DRRs. An nVidia GeForce 8600 GT with 256 MB of video memory was used. To compare an X-ray image with its corresponding DRR, we have used Normalized Cross-Correlation (NCC), Variance-Weighted Correlation (VWC), Mutual Information (MI), and Gradient Difference (GD).

Optimization

The similarity metrics described above are highly nonlinear. We use the robust Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [42] to search for an optimal solution of the transformation parameters. The algorithm requires no derivatives.
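Of the four similarity metrics listed, NCC is the simplest to illustrate; a per-image-pair sketch (the overall measure is the sum over all X-ray/combined-DRR pairs, and the random images below are stand-ins for real X-rays and DRRs):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized images."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

rng = np.random.default_rng(3)
xray = rng.random((64, 64))
drr_same = 2.0 * xray + 5.0        # affine intensity change: NCC is invariant
drr_diff = rng.random((64, 64))    # unrelated image: NCC near zero

print(round(ncc(xray, drr_same), 3), abs(ncc(xray, drr_diff)) < 0.1)
```

The invariance to affine intensity changes is what makes NCC usable across the differing intensity scales of real X-rays and DRRs.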
Instead, it determines the search direction by drawing samples in the parameter domain from a multivariate normal distribution centred on the current state. During the optimization process, the search distribution is adaptively reshaped according to the local function landscape. The inputs of the algorithm include an initial guess of the solution and an initial size of the search distribution.

We employ CMA-ES in a two-stage optimization scheme. In the first stage, the deformation and pose parameters are optimized alternately: only the pose parameters are varied for a number of optimization iterations; then, only the atlas deformation parameters are varied for a number of iterations. This is repeated until convergence. The first stage permits the optimization to quickly bring the fragments into a reasonable position, allowing the atlas deformation parameters to be more easily optimized. In the second stage, the result is further refined by allowing all parameters to vary simultaneously.

7.4 Experiments and Results

In this research, we focus on demonstrating the feasibility of our proposed method, so only preliminary validation with a synthetic fracture and synthetic X-ray images is provided. We used a synthetic fracture to test our method, in which case the "ground truth" was known. One of the training data sets was cut with a plane, and a set of 30 fragment layouts was generated by randomly perturbing the two bone fragments on either side of the plane. This simulated a paediatric physeal distal radius fracture. The CT and the simulated fracture location are shown in Figure 7.4a. The pose of each separate fragment was randomly rotated by up to 5° and randomly translated by up to 3 mm. Four examples are shown in Figure 7.4(b-e).
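The two-stage scheme can be sketched schematically; a toy quadratic cost and a simple accept-if-better random perturbation search stand in for the DRR similarity and for CMA-ES itself (which additionally adapts its sampling covariance):

```python
import numpy as np

rng = np.random.default_rng(4)
target = rng.normal(size=8)                 # toy: 5 shape + 3 pose parameters
cost = lambda p: float(np.sum((p - target) ** 2))

def search(p, idx, iters=300, step=0.3):
    """Perturb only the parameters in idx, keeping improvements."""
    p = p.copy()
    for _ in range(iters):
        q = p.copy()
        q[idx] += step * rng.normal(size=len(idx))
        if cost(q) < cost(p):
            p = q
    return p

p = np.zeros(8)
shape_idx, pose_idx = np.arange(5), np.arange(5, 8)
for _ in range(3):                          # stage 1: alternate the two blocks
    p = search(p, pose_idx)
    p = search(p, shape_idx)
p = search(p, np.arange(8), step=0.05)      # stage 2: joint refinement

print(cost(p) < 0.5)
```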
These displacements are reasonable in that they create clinically realistic simulated X-ray images similar to those seen in an orthopaedic trauma case.

Figure 7.4: A simulated fracture that was used to test our method. (a) One of the training data sets was cut with a plane close to the distal end of the radius. (b-e) The two resulting fragments were randomly rotated and translated.

For each fragment layout, two simulated X-ray images were generated in the AP and lateral directions. The two fracture fragments were roughly outlined on each of the two X-rays. Finally, using the NCC similarity metric, we estimated the atlas shape and moved the fragments toward their correct positions in the atlas. The error measure was evaluated for each fragment using the surface points of the fragment. It was defined as the surface-to-surface distance between the initial or registered position of the fragment and the "ground truth" position of the fragment, ideally 0 mm. The initial errors of our 30 simulated fractures were 3.05 ± 0.87 mm (written as mean ± standard deviation) for the large fragment, and 2.99 ± 0.82 mm for the small fragment. The final errors, after applying our method to determine the template shape and move each fragment toward its correct position, were 0.94 ± 0.66 mm for the large fragment, and 1.64 ± 1.07 mm for the small fragment. Table 7.1 summarizes the experimental results, and Figure 7.5 shows the initial and final views of one experiment case.

Table 7.1: Preliminary results of atlas-based 2D-3D registration.

Fragment    Initial mTRE (mm)    Final mTRE (mm)
Head        2.99 ± 0.82          1.64 ± 1.67
Body        3.05 ± 0.87          0.94 ± 0.66

Figure 7.5: One experiment case with two X-ray views: before reduction (left, initial error 7.3 mm) and after reduction (right, final error 3.1 mm). The result was considered a failure as the final error was > 2.0 mm, which was mainly caused by the small fragment.
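The error measure is a mean surface-to-surface distance under a rigid perturbation; a sketch with an invented point cloud for the fragment surface and the study's perturbation magnitudes (a 5° rotation and a 3 mm translation):

```python
import numpy as np

rng = np.random.default_rng(5)
surface = rng.normal(size=(500, 3)) * np.array([5.0, 5.0, 15.0])  # toy fragment surface (mm)

def rigid(points, angle_deg, t):
    """Rotate about the Z axis by angle_deg, then translate by t."""
    a = np.deg2rad(angle_deg)
    R = np.array([[np.cos(a), -np.sin(a), 0.0],
                  [np.sin(a),  np.cos(a), 0.0],
                  [0.0,        0.0,       1.0]])
    return points @ R.T + t

# Displace the fragment as in the simulated fractures, then measure the mean
# point-wise distance back to the ground-truth position.
moved = rigid(surface, 5.0, np.array([3.0, 0.0, 0.0]))
err = float(np.linalg.norm(moved - surface, axis=1).mean())
print(0.0 < err < 10.0)
```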
In clinical practice, a final error within 2.0 mm is considered successful. As the method performed much better for the large fragment than for the small fragment, additional experiments using eight simulated X-ray images were performed to further reduce the final errors for the small fragment. With eight X-ray images, the final errors were reduced to 1.10 ± 0.43 mm, which satisfies the clinical requirement.

7.5 Discussion and Summary

We have described a method to guide the reduction of distal radius fractures. The use of a deformable atlas potentially provides a more accurate template than the commonly used reflected radius of the contralateral arm, and does not require the large radiation exposure of CT imaging. No accurate 3D segmentations are necessary; instead, only rough 2D segmentations on the X-ray images are required. The method should be extensible to other types of fractures containing multiple fragments.

The preliminary results show that the proposed method is able to accurately find the correct poses of fracture fragments for synthetic two-fragment fractures with simulated X-ray images. When large errors occurred for smaller fragments, using additional X-ray images improved the result.

Chapter 8

Conclusion

This chapter summarizes the contributions made by this thesis, and proposes directions for future research.

8.1 Summary of Contributions

The main goal of this thesis is to investigate two major limitations of current 2D-3D registration techniques: the lack of efficient optimization algorithms that are also robust against noise and outliers, and the lack of 2D-3D registration techniques that accurately and efficiently align multiple anatomical models to X-ray images for use in cases such as fracture treatment. To address the first problem, two 2D-3D registration techniques that use recently proposed advanced optimization algorithms are investigated.
For the second problem, two 2D-3D registration techniques that simultaneously register multiple objects are proposed.

8.1.1 Robust and Efficient 2D-3D Registration

Though a variety of similarity metrics have been proposed for 2D-3D registration, finding an accurate and well-shaped function to model the similarity between X-ray and DRR images is still a challenging task. When a similarity metric is highly non-linear, the selection of the optimization algorithm becomes critical. Chapter 3 described a 2D-3D registration technique that uses the UKF along with GPU-accelerated DRR generation for robust and fast registration. As it requires no derivatives and mimics simulated annealing (random noise is artificially added at each iteration step), it is simple to use and robust against local minima. The method was evaluated using three bone phantoms of different shapes and sizes, and was compared with a conventional method that uses the downhill simplex algorithm. Preliminary experimental results confirmed that the UKF is superior to the simplex in dealing with ill-posed similarity metrics. With similar registration accuracies and computation costs, the UKF-based method achieved 1.3 to 2 times wider capture ranges than the traditional method, a significant improvement as it greatly increases the success rate of registration and simplifies the initialization process.

In the UKF-based 2D-3D registration technique, the noise in the transformation parameters is used to drive the optimization process. Thus, prior knowledge about this noise is an important factor that affects the registration. This is both good and bad. On the good side, when such knowledge is available, the registration can be completed quickly and robustly, and the final covariance matrix of the transformation parameters can be used to analytically estimate the final registration error.
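The idea of estimating registration error from the final parameter covariance can be illustrated with first-order error propagation: the covariance of a mapped target point is approximately J Σ Jᵀ, where J is the Jacobian of the mapped point with respect to the transformation parameters. The sketch below uses a numerical Jacobian and a hypothetical transform parameterization; it illustrates the idea only and is not the closed-form analysis of [74].

```python
import numpy as np

def target_error_cov(point, param_cov, transform, eps=1e-5):
    """First-order propagation of transformation-parameter covariance Σ to a
    target point: Cov_target ≈ J Σ Jᵀ. `transform(params, point)` maps a 3D
    point under the given parameters; J is estimated by central differences
    around the zero (identity) parameter vector."""
    p0 = np.zeros(param_cov.shape[0])
    J = np.empty((3, p0.size))
    for k in range(p0.size):
        dp = p0.copy()
        dp[k] = eps
        J[:, k] = (transform(dp, point) - transform(-dp, point)) / (2 * eps)
    return J @ param_cov @ J.T
```

For a translation-only parameterization the Jacobian is the identity, so the target-point covariance equals the parameter covariance; for rotations, points far from the rotation centre pick up proportionally larger error.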
On the bad side, accurate knowledge about this noise is sometimes difficult to obtain, because there are many sources of error, such as imaging, calibration, and the definition of the similarity metric.

As a complement to the UKF-based method, Chapter 4 described a 2D-3D registration technique that uses CMA-ES as the optimization algorithm. Like the UKF-based method, the new method requires no derivative calculations, and it learns the search directions from a minimal set of sample points in the parameter domain. However, the method is easier to use as it has only a single user parameter. The method was evaluated with the same set of test data as the UKF-based method, and new experiments were performed for the UKF-based method with more accurate knowledge about the ratio between the process noise and the measurement noise. The experimental results showed that the two methods achieved similar capture ranges. However, the CMA-ES-based method marginally outperformed the UKF-based method in terms of accuracy and computation cost. The results were also compared with those of the simplex-based method, and both former methods showed significant improvements in capture range.

8.1.2 Multiple-object 2D-3D Registration

One main use of multiple-object 2D-3D registration in computer-assisted fracture treatment is to identify the relative poses of the fracture fragments in the operating room in the coordinate space of the preoperative plan. Chapter 5 described such a technique, which simultaneously aligns all fragments to a set of X-ray images. To achieve better robustness against noise and mutual occlusions among fragments, edge structures in the X-ray images were enhanced before an MI-based similarity metric was applied. To obtain a fast global alignment among all fragments, a global-local alternating optimization scheme based on CMA-ES was adopted, and the GPU was used to accelerate DRR generation.
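As a concrete illustration of the kind of intensity-based metric involved, a minimal histogram-based mutual information estimate between two images might look like the following. The bin count and implementation details are illustrative, not those of the thesis, which applies the metric after edge enhancement.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Mutual information between two images, estimated from their joint
    intensity histogram. Higher values indicate stronger statistical
    dependence between the intensity patterns, without assuming a linear
    intensity relationship (unlike NCC)."""
    hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = hist / hist.sum()                      # joint distribution
    p_a = p_ab.sum(axis=1, keepdims=True)         # marginal of img_a
    p_b = p_ab.sum(axis=0, keepdims=True)         # marginal of img_b
    nz = p_ab > 0                                 # avoid log(0)
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])))
```

An image compared with itself yields its own entropy (the maximal value), while a comparison against an unrelated intensity pattern yields a value near zero; a registration cost is typically the negated MI.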
Both synthetic fractures and fracture phantoms were used to test the proposed technique. The experimental results showed that, for fractures in small bones such as the distal radius, the proposed method could achieve a capture range of up to 10 mm for optimal treatment setups, and of up to 5 mm for lifelike treatment setups.

The multiple-object 2D-3D registration technique presented in Chapter 5 requires a preoperative treatment plan to be used as the registration reference. One automatic planning method is to use an intensity atlas of the bone being treated as a dynamic template to guide the planning process. To enable this, Chapter 6 described a new method that constructs anatomical intensity atlases from 3D images. A B-spline FFD lattice was used to model both the geometry and the intensity information of atlas instances, and the GPU was used to accelerate instance generation. A CT atlas of distal radii was constructed to test the method. The results showed that the B-spline FFD is a compact, accurate and efficient representation for modelling CT intensities and that, compared with traditional methods that use a plain CPU, the use of the GPU significantly improved the speed of instance generation.

By incorporating the atlas generation method described in Chapter 6, a new atlas-based multiple-object 2D-3D registration technique was developed and presented in Chapter 7. The planning is performed implicitly and automatically by using an intensity atlas of the bone being treated and integrating the planning process into the registration process. The registration estimates not only the poses of the individual fracture fragments, but also the final shape of the fractured bone. Fracture fragments were modelled as coarsely bounded volumes on top of the bone atlas; the volumes were constructed by roughly segmenting the fragments on the X-ray images and back-projecting the segmented 2D shapes into 3D space.
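To give a flavour of how a B-spline lattice encodes intensities, the sketch below evaluates a uniform cubic B-spline from a 1D coefficient lattice; a 3D FFD applies the same construction separably along x, y and z. The function names and edge handling (coefficient clamping) are illustrative, not the thesis implementation.

```python
import numpy as np

def bspline_basis(u):
    """The four uniform cubic B-spline basis weights for fractional
    position u in [0, 1); they are non-negative and sum to 1."""
    return np.array([(1 - u) ** 3,
                     3 * u ** 3 - 6 * u ** 2 + 4,
                     -3 * u ** 3 + 3 * u ** 2 + 3 * u + 1,
                     u ** 3]) / 6.0

def ffd_intensity_1d(coeffs, x):
    """Evaluate a 1D cubic B-spline intensity model at position x (in lattice
    units): a weighted blend of the 4 nearest control coefficients, with
    indices clamped at the lattice boundary."""
    i = int(np.floor(x))
    w = bspline_basis(x - i)
    idx = np.clip(np.arange(i - 1, i + 3), 0, len(coeffs) - 1)
    return float(w @ coeffs[idx])
```

Because the basis forms a partition of unity with linear precision, a constant coefficient lattice reproduces a constant intensity and a linearly increasing lattice reproduces a linear ramp (away from the clamped boundary), which is part of what makes the representation compact and accurate.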
To improve the computation speed, the GPU was used to accelerate the processes of fragment modelling and DRR generation. One major benefit of this new technique is that it removes the separate step of preoperative planning; only simple user interactions are required. Preliminary results with a synthetic fracture showed that the proposed method can accurately identify the poses of the individual fragments.

8.2 Future Work

For clinical use, a 2D-3D registration technique needs to satisfy requirements such as ease of use, high accuracy, fast computation, and robustness to the presence of multiple bones or external surgical tools in the fluoroscopic view. Without a doubt, the research conducted in this thesis consists of pilot studies, and further improvements can be made.

First, in the UKF-based 2D-3D registration technique, knowledge about the noise in the transformation parameters is a critical factor that affects the behaviour and performance of the registration. In this thesis, this parameter was determined empirically through trial and error. As the error in the transformation parameters has an intrinsic connection with the errors introduced during image acquisition and similarity metric formation, there should be a systematic way to project the original errors into the parameter domain. Once accurate knowledge about the noise is obtained, it not only significantly improves the performance and robustness of the registration, but also provides an analytic approach to estimate the final registration error from the final covariance matrix of the transformation parameters [74].

Second, in some of the proposed techniques, user parameters were used to control the registration behaviour: for example, the edge-enhancement parameters in the multiple-object 2D-3D registration technique, and the starting size of the covariance matrix in all techniques that use CMA-ES. Those parameters are case dependent
and were determined empirically in this thesis. Further in-depth study of those parameters would be an important step toward improving the usability and performance of the techniques.

Third, the atlas-based multiple-object 2D-3D registration technique described in Chapter 7 demonstrated a new way to use 2D-3D registration for fracture treatment. The method is still at a preliminary stage and many improvements can be made. For example, the mutual exclusion between fragments was not taken into account when modelling the fragments, and the fragments were modelled only with very coarse bounding volumes. These limitations were imposed by the expensive computation cost of the technique. In recent years, the computational power of both CPUs and GPUs has greatly increased, which provides the potential to further improve the technique.

Finally, although GPU-accelerated computation was used extensively in this thesis, the technology used for DRR generation is becoming outdated as more powerful next-generation GPUs are developed. New GPGPU technologies can be used not only to improve the performance and quality of DRR generation, but also to accelerate other computations such as image pre-processing, similarity calculation, and optimization. Fast computation was an important goal in developing the 2D-3D registration techniques in this thesis; however, there is still a large gap between the current computation performance and interactive clinical usage. Using the new GPGPU technologies to improve the proposed techniques would be a promising direction for future research.

Bibliography

[1] Computer assisted surgery: Precision technology for improved patient care. Technical report, Advanced Medical Technology Association, 2005.
[2] Nvidia CUDA compute unified device architecture - programming guide, June 2008.
[3] The matrix and quaternions FAQ.
http://www.j3d.org/matrix_faq/matrfaq_latest.html, July 2011.
[4] S. K. Balci, P. Golland, and W. M. Wells. Non-rigid groupwise registration using B-spline deformation model. In Proceedings of the 10th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 10, pages 105–121, 2007.
[5] R. Bansal, L. H. Staib, Z. Chen, A. Rangarajan, J. Knisely, R. Nath, and J. S. Duncan. A novel approach for the registration of 2D portal and 3D CT images for treatment setup verification in radiotherapy. In Proceedings of the 1st International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 1496, pages 1075–1086, 1998.
[6] A. Benassarou, E. Bittar, N. W. John, and L. Lucas. MC slicing for volume rendering applications. In International Conference on Computational Science (2), pages 314–321, 2005.
[7] M. Berks, S. Caulkin, R. Rahim, C. Boggis, and S. Astley. Statistical appearance models of mammographic masses. In Proceedings of the 9th International Workshop on Digital Mammography, pages 401–408, 2008.
[8] P. J. Besl and N. D. McKay. A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14:239–256, 1992.
[9] C. Bethune and J. Stewart. Adaptive slice geometry for hardware-assisted volume rendering. Journal of Graphics Tools, 10(1):55–70, 2005.
[10] K. K. Bhatia, J. Hajnal, A. Hammers, and D. Rueckert. Similarity metrics for groupwise non-rigid registration. In Proceedings of the 10th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 10, pages 544–552, 2007.
[11] R. Bilic and V. Zdravkovic. Planning corrective osteotomy of the distal end of the radius. Unfallchirurg, 91:571–574, 1988.
[12] W. Birkfellner, W. Burgstaller, J. Wirth, B. Baumann, A. L. Jacob, K. Bieri, S. Traud, M. Strub, P. Regazzoni, and P. Messmer. LORENZ: a system for planning long-bone fracture reduction.
In R. L. Galloway, Jr., editor, Proceedings of SPIE Medical Imaging, volume 5029, pages 500–503, 2003.
[13] W. Birkfellner, R. Seemann, M. Figl, J. Hummel, C. Ede, P. Homolka, X. H. Yang, P. Niederer, and H. Bergmann. Wobbled splatting - a fast perspective volume rendering method for simulation of X-ray images from CT. Physics in Medicine and Biology, 50(9):N73, 2005.
[14] W. Birkfellner, M. Stock, M. Figl, C. Gendrin, J. Hummel, S. Dong, J. Kettenbach, D. Georg, and H. Bergmann. Stochastic rank correlation: A robust merit function for 2D-3D registration of image data obtained at different energies. Medical Physics, 36(8):3420–3428, 2009.
[15] C. Brechbühler, G. Gerig, and O. Kübler. Parametrization of closed surfaces for 3-D shape description. Computer Vision and Image Understanding, 61(2):154–170, 1995.
[16] A. J. Bronstein, T. E. Trumble, and A. F. Tencer. The effects of distal radius fracture malalignment on forearm rotation: a cadaveric study. The Journal of Hand Surgery, 22A(2):258–262, March 1997.
[17] L. G. Brown. A survey of image registration techniques. ACM Computing Surveys, 24:325–376, 1992.
[18] D. Chetverikov, D. Svirko, D. Stepanov, and P. Krsek. The trimmed iterative closest point algorithm. In International Conference on Pattern Recognition, pages 545–548, 2002.
[19] T. F. Cootes, G. J. Edwards, and C. J. Taylor. Active appearance models. In 5th European Conference on Computer Vision, volume 2, pages 484–498, 1998.
[20] T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham. Active shape models - their training and application. Computer Vision and Image Understanding, 61(1):38–59, 1995.
[21] H. Croitoru, R. E. Ellis, R. Prihar, C. F. Small, and D. R. Pichora. Fixation-based surgery: A new technique for distal radius osteotomy. Computer Aided Surgery, 6:160–169, 2001.
[22] R. Dalvi, R. Abugharbieh, M. Pickering, J. Scarvell, and P. Smith.
Registration of 2D to 3D joint images using phase-based mutual information. In Proceedings of SPIE Medical Imaging, volume 6512, page 651209, 2007.
[23] R. H. Davies, C. J. Twining, T. F. Cootes, J. C. Waterton, and C. J. Taylor. A minimum description length approach to statistical shape modeling. IEEE Transactions on Medical Imaging, 21(5):525–537, 2002.
[24] A. DiGioia. Computer and robotic assisted hip and knee surgery. Oxford University Press, New York, 2004.
[25] J. Feldmar, N. Ayache, and F. Betting. 3D-2D projective registration of free-form curves and surfaces. Computer Vision and Image Understanding, 65(3):403–424, 1997.
[26] G. Fichtinger. Surgical navigation, registration, and tracking. http://cisstweb.cs.jhu.edu/people/gabor/Cs-600.145/Lectures/RegTrack.pdf, 2006.
[27] J. M. Fitzpatrick, J. B. West, and C. R. Maurer. Predicting error in rigid-body point-based registration. IEEE Transactions on Medical Imaging, 17(5):694–702, October 1998.
[28] M. Fleute, S. Lavallée, and L. Desbat. Integrated approach for matching statistical shape models with intra-operative 2D and 3D data. In Proceedings of the 5th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 2489, pages 364–372, 2002.
[29] M. Fleute, S. Lavallée, and R. Julliard. Incorporating a statistically based shape model into a system for computer-assisted anterior cruciate ligament surgery. Medical Image Analysis, 3(3):209–222, 1999.
[30] R. L. Galloway. The process and development of image-guided procedures. Annual Review of Biomedical Engineering, 3:83–108, 2001.
[31] P. Gamage, S. Q. Xie, P. Delmas, and P. Xu. Pose estimation of femur fracture segments for image guided orthopedic surgery. In IEEE International Conference on Image and Vision Computing, pages 288–292, 2005.
[32] P. Giblin and B. B. Kimia. A formal classification of 3D medial axis points and their local geometry.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2):238–251, 2004.
[33] K. G. Gilhuijs, P. J. van de Ven, and M. van Herk. Automatic three-dimensional inspection of patient setup in radiation therapy using portal images, simulator images, and computed tomography data. Medical Physics, 23(3):389–399, March 1996.
[34] R. Gocke, J. Weese, and H. Schumann. Fast volume rendering methods for voxel-based 2D/3D registration - a comparative study. In Workshop on Biomedical Image Registration, Bled, Slovenia, August 1999.
[35] M. Goitein, M. Abrams, D. Rowell, H. Pollari, and J. Wiles. Multidimensional treatment planning (II): Beam eye-view, back projection, and projection through CT sections. International Journal of Radiation Oncology Biology Physics, 9:789–797, 1983.
[36] R. H. Gong and P. Abolmaesumi. 2D/3D registration with the CMA-ES method. In Proceedings of SPIE Medical Imaging, volume 6918, pages 69181M1–69181M9, 2008.
[37] R. H. Gong, P. Abolmaesumi, and J. Stewart. A robust technique for 2D-3D registration. In IEEE International Conference on Engineering in Medicine and Biology (EMBC), volume 1, pages 1433–1436, 2006.
[38] R. H. Gong, J. Stewart, and P. Abolmaesumi. A new method for CT to fluoroscope registration based on unscented Kalman filter. In Proceedings of the 9th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 9, pages 891–898, 2006.
[39] R. H. Gong, J. Stewart, and P. Abolmaesumi. Reduction of multi-fragment fractures of distal radius using an atlas-based 2D/3D registration technique. In Proceedings of SPIE Medical Imaging, volume 7261, pages 371–379, 2009.
[40] R. H. Gong, J. Stewart, and P. Abolmaesumi. A new representation of intensity atlas for GPU-accelerated instance generation. In IEEE International Conference on Engineering in Medicine and Biology (EMBC), pages 4399–4402, 2010.
[41] R. H. Gong, J. Stewart, and P. Abolmaesumi.
Multiple-object 2D-3D registration for non-invasive pose identification of fracture fragments. IEEE Transactions on Biomedical Engineering (TBME), 58(6):1592–1601, June 2011.
[42] N. Hansen. The CMA evolution strategy: A comparing review. In Towards a New Evolutionary Computation: Advances on Estimation of Distribution Algorithms, pages 75–102. Springer, 2006.
[43] D. L. G. Hill, P. G. Batchelor, M. Holden, and D. J. Hawkes. Medical image registration. Physics in Medicine and Biology, 46(3):R1, 2001.
[44] J. Huang, R. Crawfis, and D. Stredney. Edge preservation in volume rendering using splatting. In Proceedings of the 1998 IEEE Symposium on Volume Visualization (VVS), pages 63–69, 1998.
[45] L. Ibanez, W. Schroeder, L. Ng, and J. Cates. The ITK Software Guide. Kitware, Inc., ISBN 1-930934-15-7, http://www.itk.org/ItkSoftwareGuide.pdf, second edition, 2005.
[46] A. Jain, R. Kon, Y. Zhou, and G. Fichtinger. C-arm calibration - is it really necessary? In Proceedings of the 8th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 8, pages 639–646, 2005.
[47] L. Joskowicz, C. Milgrom, A. Simkin, L. Tockus, and Z. Yaniv. FRACAS: a system for computer-aided image-guided long bone fracture surgery. Computer Aided Surgery, 3(6):271–288, 1998.
[48] B. G. Kashef and A. A. Sawchuk. A survey of new techniques for image registration and mapping. In Proceedings of SPIE Medical Imaging, volume 432, pages 222–239, 1983.
[49] D. Škerl, B. Likar, and F. Pernuš. Evaluation of similarity measures for 3D/2D image registration. In Proceedings of SPIE Medical Imaging, volume 6144, pages 61442F-1–61442F-11, 2006.
[50] E. Kerrien, M. O. Berger, E. Maurincomme, L. Launay, R. Vaillant, and L. Picard. Fully automatic 3D/2D subtracted angiography registration. In Proceedings of the 2nd International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 1679, pages 664–671, 1999.
[51] A. Khoury, J.
H. Siewerdsen, C. M. Whyne, M. J. Daly, H. J. Kreder, D. J. Moseley, and D. A. Jaffray. Intraoperative cone-beam CT for image-guided tibial plateau fracture reduction. Computer Aided Surgery, 12(4):195–207, 2007.
[52] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.
[53] D. Knaan and L. Joskowicz. Effective intensity-based 2D/3D rigid registration between fluoroscopic X-ray and CT. In Proceedings of the 6th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 2878, pages 351–358, 2003.
[54] J. J. Kuffner. Effective sampling and distance metrics for 3D rigid body path planning. In IEEE International Conference on Robotics and Automation, pages 3993–3998, 2004.
[55] P. Lacroute and M. Levoy. Fast volume rendering using a shear-warp factorization of the viewing transformation. In Proceedings of the 21st Annual Conference on Computer Graphics and Interactive Techniques, pages 451–458, 1994.
[56] D. LaRose, J. Bayouth, and T. Kanade. Transgraph: interactive intensity-based 2D/3D registration of X-ray and CT data. In Proceedings of SPIE Medical Imaging, volume 3979, pages 385–396, 2000.
[57] D. A. LaRose. Iterative X-Ray/CT Registration Using Accelerated Volume Rendering. PhD thesis, Carnegie Mellon University, 2001.
[58] S. Lavallée, R. Szeliski, and L. Brunie. Matching 3-D smooth surfaces with their 2-D projections using 3-D distance maps. In Geometric Reasoning for Perception and Action, volume 708, pages 217–238, 1993.
[59] S. Y. Lee, G. Wolberg, and S. Y. Shin. Scattered data interpolation with multilevel B-splines. IEEE Transactions on Visualization and Computer Graphics, 3:228–244, 1997.
[60] T. Leloup, W. E. Kazzi, O. Debeir, F. Schuind, and N. Warzee. Automatic fluoroscopic image calibration for traumatology intervention guidance. In International Conference on Computer as a Tool, volume 1, pages 374–377, 2005.
[61] H. Lester.
A survey of hierarchical non-linear medical image registration. Pattern Recognition, 32(1):129–149, January 1999.
[62] P. P. Li, S. Whitman, R. Mendoza, and J. Tsiao. ParVox - a parallel splatting volume rendering system for distributed visualization. In Proceedings of the IEEE Symposium on Parallel Rendering (PRS), pages 7–ff, Los Alamitos, CA, USA, 1997. IEEE Computer Society.
[63] H. Livyatan, Z. Yaniv, and L. Joskowicz. Robust automatic C-arm calibration for fluoroscopy-based navigation: A practical approach. In Proceedings of the 5th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 2489, pages 60–68, 2002.
[64] H. Livyatan, Z. Yaniv, and L. Joskowicz. Gradient-based 2D-3D rigid registration of fluoroscopic X-ray to CT. IEEE Transactions on Medical Imaging, 22(11):1395–1406, November 2003.
[65] W. E. Lorensen and H. E. Cline. Marching Cubes: A high resolution 3D surface construction algorithm. Computer Graphics, 21(4):163–169, 1987.
[66] P. Lorenzen, M. Prastawa, B. Davis, G. Gerig, E. Bullitt, and S. Joshi. Multi-modal image set registration and atlas formation. Medical Image Analysis, 10(3):440–451, June 2006.
[67] B. Ma, J. Stewart, D. Pichora, R. Ellis, and P. Abolmaesumi. 2D/3D registration of multiple bones. In IEEE International Conference on Engineering in Medicine and Biology (EMBC), pages 860–863, 2007.
[68] F. Maes, D. Vandermeulen, and P. Suetens. Comparative evaluation of multiresolution optimization strategies for multimodality image registration by maximization of mutual information. Medical Image Analysis, 3(4):373–386, December 1999.
[69] J. B. Maintz and M. A. Viergever. A survey of medical image registration. Medical Image Analysis, 2(1):1–36, March 1998.
[70] P. Markelj, D. Tomaževič, B. Likar, and F. Pernuš. A review of 3D/2D registration methods for image-guided interventions. Medical Image Analysis, April 2010.
[71] S. Marsland, C. Twining, and C. Taylor.
Groupwise non-rigid registration using polyharmonic clamped-plate splines. In Proceedings of the 6th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 2879, pages 771–779, 2003.
[72] D. Mattes, D. R. Haynor, H. Vesselle, T. K. Lewellyn, and W. Eubank. Nonrigid multimodality image registration. In M. Sonka and K. M. Hanson, editors, Proceedings of SPIE Medical Imaging, volume 4322, pages 1609–1620, 2001.
[73] N. Milickovic, D. Baltas, S. Giannouli, M. Lahanas, and N. Zamboglou. CT imaging based digitally reconstructed radiographs and their application in brachytherapy. Physics in Medicine and Biology, 45(10):2787, 2000.
[74] M. H. Moghari and P. Abolmaesumi. A high-order solution for the distribution of target registration error in rigid-body point-based registration. In Proceedings of the 9th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 4191, pages 603–611, 2006.
[75] E. D. Momi, K. Eckman, B. Jaramaz, and A. DiGioia. Improved 2D/3D registration robustness using local spatial information. In Proceedings of SPIE Medical Imaging, volume 6144, pages 977–984, 2006.
[76] R. Munbodh, D. A. Jaffray, D. J. Moseley, Z. Chen, J. P. S. Knisely, P. Cathier, and J. S. Duncan. Automated 2D-3D registration of a radiograph and a cone beam CT using line-segment enhancement. Medical Physics, 33(5):1398–1411, 2006.
[77] Y. Nakajima, T. Tashiro, T. Okada, Y. Sato, N. Sugano, M. Saito, K. Yonenobu, H. Yoshikawa, T. Ochi, and S. Tamura. Computer-assisted fracture reduction of proximal femur using preoperative CT data and intraoperative fluoroscopic images. In International Congress Series - Computer Assisted Radiology and Surgery, volume 1268, pages 620–625, 2004.
[78] Y. Nakajima, T. Tashiro, N. Sugano, K. Yonenobu, T. Koyama, Y. Maeda, Y. Tamura, M. Saito, S. Tamura, M. Mitsuishi, N. Sugita, I. Sakuma, T. Ochi, and Y. Matsumoto.
Fluoroscopic bone fragment tracking for surgical navigation in femur fracture reduction by incorporating optical tracking of hip joint rotation center. IEEE Transactions on Biomedical Engineering (TBME), 54(9):4173–4178, 2007.
[79] L. Nolte. Computer assisted orthopedic surgery (CAOS). Hogrefe & Huber, Seattle, 1999.
[80] T. Okada, Y. Iwasaki, T. Koyama, N. Sugano, Y. W. Chen, K. Yonenobu, and Y. Sato. Computer-assisted preoperative planning for reduction of proximal femoral fracture using 3D-CT data. IEEE Transactions on Biomedical Engineering (TBME), 56(3):749–759, 2009.
[81] J. Orchard and R. Mann. Registering a multisensor ensemble of images. IEEE Transactions on Image Processing, 19:1236–1247, May 2010.
[82] G. P. Penney, J. Weese, J. A. Little, P. Desmedt, D. L. Hill, and D. J. Hawkes. A comparison of similarity measures for use in 2-D-3-D medical image registration. IEEE Transactions on Medical Imaging, 17(4):586–595, August 1998.
[83] T. M. Peters. Image-guidance for surgical procedures. Physics in Medicine and Biology, 51(14):R505–R540, 2006.
[84] M. R. Pickering, A. A. Muhit, J. M. Scarvell, and P. N. Smith. A new multi-modal similarity measure for fast gradient-based 2D-3D image registration. In IEEE International Conference on Engineering in Medicine and Biology (EMBC), pages 5821–5824, 2009.
[85] J. P. Pluim, J. B. Maintz, and M. A. Viergever. Image registration by maximization of combined mutual information and gradient information. IEEE Transactions on Medical Imaging, 19(8):809–814, August 2000.
[86] J. P. W. Pluim, J. B. A. Maintz, and M. A. Viergever. Mutual-information-based registration of medical images: a survey. IEEE Transactions on Medical Imaging, 22(8):986–1004, August 2003.
[87] J. P. W. Pluim, J. B. A. Maintz, and M. A. Viergever. f-information measures in medical image registration. IEEE Transactions on Medical Imaging, 23(12):1508–1516, December 2004.
[88] T. D. Potma.
Explorations of the motion and geometry of the human knee. Master's thesis, Queen's University, Kingston, Canada, 2007.
[89] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C++. Cambridge University Press, 2002.
[90] M. Prümmer, J. Hornegger, M. Pfister, and A. Dörfler. Multi-modal 2D-3D non-rigid registration. In Proceedings of SPIE Medical Imaging, volume 6144, pages 297–308, 2006.
[91] D. Rueckert, M. J. Clarkson, D. L. G. Hill, and D. J. Hawkes. Non-rigid registration using higher-order mutual information. In Proceedings of SPIE Medical Imaging, volume 3979, pages 438–447, 2000.
[92] D. Rueckert, A. F. Frangi, and J. A. Schnabel. Automatic construction of 3D statistical deformation models using non-rigid registration. In Proceedings of the 4th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 22, pages 77–84, 2001.
[93] D. B. Russakoff, T. Rohlfing, and C. R. Maurer. Fast intensity-based 2D-3D image registration of clinical data using light fields. In Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV), volume 1, pages 416–422, 2003.
[94] D. B. Russakoff, T. Rohlfing, D. Rueckert, R. Shahidi, D. Kim, and C. R. Maurer, Jr. Fast calculation of digitally reconstructed radiographs using light fields. In Proceedings of SPIE Medical Imaging, volume 5032, pages 684–695, 2003.
[95] O. Sadowsky, G. Chintalapani, and R. H. Taylor. Deformable 2D-3D registration of the pelvis with a limited field of view, using shape statistics. In Proceedings of the 10th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 10, pages 519–526, 2007.
[96] N. Schubert and I. Scholl. Comparing GPU-based multi-volume ray casting techniques. Computer Science - Research and Development, 26:39–50, February 2011.
[97] A. J. Seibert and J. M. Boone.
X-ray imaging physics for nuclear medicine technologists. Part 2: X-ray interactions and image formation. Journal of Nuclear Medicine Technology, 33(1):3–18, 2005.
[98] S. Stegmaier, M. Strengert, T. Klein, and T. Ertl. A simple and flexible volume rendering framework for graphics-hardware-based raycasting. In International Workshop on Volume Graphics, pages 187–241, 2005.
[99] J. E. Stone, D. Gohara, and G. C. Shi. OpenCL: A parallel programming standard for heterogeneous computing systems. Computing in Science and Engineering, 12:66–73, 2010.
[100] C. Studholme. Simultaneous population based image alignment for template free spatial normalisation of brain anatomy. In Biomedical Image Registration, volume 2717, pages 81–90, 2003.
[101] T. S. Tang and R. E. Ellis. 2D/3D deformable registration using a hybrid atlas. In Proceedings of the 8th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 8, pages 223–230, 2005.
[102] T. S. Y. Tang. Calibration and point-based registration of fluoroscopic images. Master's thesis, Queen's University, Kingston, Canada, 1999.
[103] T. S. Y. Tang, R. E. Ellis, and G. Fichtinger. Fiducial registration from a single X-ray image: A new technique for fluoroscopic guidance and radiotherapy. In Proceedings of the 3rd International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 1935, pages 502–511, 2000.
[104] P. M. Tate, V. Lachine, L. Q. Fu, H. Croitoru, and M. Sati. Performance and robustness of automatic fluoroscopic image calibration in a new computer assisted surgery system. In Proceedings of the 4th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 2208, pages 1130–1136, 2001.
[105] R. H. Taylor and D. Stoianovici. Medical robotics in computer-integrated surgery. IEEE Transactions on Robotics and Automation, 19(5):765–781, 2003.
[106] P. Toft.
The Radon Transform - Theory and Implementation. PhD thesis, Technical University of Denmark, Lyngby, Denmark, 1996. [107] D. Tomazevic, B. Likar, and F. Pernus. 3-D/2-D registration by integrating 2-D information in 3-D. IEEE Transactions on Medical Imaging, 25(1):17–27, January 2006. [108] A. Tristan and J. I. Arribas. A fast B-spline pseudo-inversion algorithm for consistent image registration. In Computer Analysis of Images and Patterns, pages 768–775, 2007. [109] C. J. Twining, T. Cootes, S. Marsland, V. Petrovic, R. Schestowitz, and C. J. Taylor. A unified information-theoretic approach to groupwise non-rigid registration and model building. Information Processing in Medical Imaging, 19:1– 14, 2005. BIBLIOGRAPHY 170 [110] E. B. van de Kraats, G. P. Penney, D. Tomazevic, T. van Walsum, and W. J. Niessen. Standardized evaluation methodology for 2-D/3-D registration. IEEE Transactions on Medical Imaging, 24(9):1177–1189, 2005. [111] E. A. Wan and R. V. D. Merwe. The unscented Kalman filter for nonlinear estimation. In IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC), pages 153–158, 2000. [112] F. Wang, T. Davis, and B. Vemuri. Real-time DRR generation using cylindrical harmonics. In Proceedings of the 5th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 2489, pages 671–678, 2002. [113] W. C. Wang and E. H. Wu. Adaptable splatting for irregular volume rendering. Computer Graphics Forum, 18(4):213–222, 1999. [114] W. Wein. Intensity based rigid 2D-3D registration algorithms for radiation therapy. Master’s thesis, Technische Universitat Munchen, Munchen, Germany, 2003. [115] M. A. Westenberg and J. B. T. M. Roerdink. X-ray volume rendering by hierarchical wavelet splatting. In Proceedings of the 15th International Conference on Pattern Recognition, volume 3, pages 159–162, 2000. [116] L. Westover. Interactive volume rendering. 
In Proceedings of the 1989 IEEE Symposium on Volume Visualization (VVS), pages 9–16, 1989. BIBLIOGRAPHY 171 [117] R. Westphal, T. Gsling, M. Oszwald, J. Bredow, D. Klepzig, S. Winkelbach, T. Hufner, C. Krettek, and F. Wahl. Robot assisted fracture reduction. Experimental Robotics, 3(6):153–163, 2008. [118] Wikipedia. Axis-angle representation. http://en.wikipedia.org/wiki/ Axis-angle_representation, July 2011. [119] Wikipedia. Gimbal lock. http://en.wikipedia.org/wiki/Gimbal_lock, July 2011. [120] Z. Yaniv and K. Cleary. Image-guided procedures: A review. http:// isiswiki.georgetown.edu/zivy/writtenMaterial/CAIMR-TR-2006-3.pdf, November 2006. [121] Z. Yaniv, L. Joskowicz, A. Simkin, M. A. Garza-Jinich, and C. Milgrom. Fluroscopic image processing for Computer-Aided Orthopaedic Surgery. In Proceedings of the 1st International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 1496, pages 325–334, 1998. [122] J. H. Yao. A Statistical Bone Density Atlas And Deformable Medical Image Registration. PhD thesis, Johns Hopkins University, Baltimore, USA, 2001. [123] J. H. Yao and R. H. Taylor. Tetrahedral mesh modeling of density data for anatomical atlases and intensity-based registration. In Proceedings of the 3rd International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), pages 531–540, 2000. [124] P. A. Yushkevich, J. Piven, C. H. Heather, G. S. Rachel, S. Ho, J. C. Gee, and G. Gerig. User-guided 3D active contour segmentation of anatomical structures: BIBLIOGRAPHY 172 Significantly improved efficiency and reliability. Neuroimage, 31(3):1116–1128, 2006. [125] H. X. Zhao and A. J. Reader. Fast projection algorithm for voxel arrays with object dependent boundaries. In IEEE International Symposium on Nuclear Science, volume 3, pages 1490–1494, 2002. [126] G. Y. Zheng, M. A. G. Ballester, M. Styner, and L. P. Nolte. 
Reconstruction of patient-specific 3D bone surface from 2D calibrated fluoroscopic images and point distribution model. In Proceedings of the 9th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), volume 9, pages 25–32, 2006. [127] D. Zikic, B. Glocker, O. Kutter, M. Groher, N. Komodakis, A. Khamene, N. Paragios, and N. Navab. Markov random field optimization for intensitybased 2D-3D registration. In Proceedings of SPIE Medical Imaging, volume 7623, pages 762334–762334–8, 2010. [128] B. Zitova. Image registration methods: a survey. Image and Vision Computing, 21(11):977–1000, October 2003. [129] L. Zöllei, E. Grimson, A. Norbash, and W. Wells-III. 2D-3D rigid registration of X-ray fluoroscopy and CT images using mutual information and sparsely sampled histogram estimators. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, pages II–696–II–703, 2001. BIBLIOGRAPHY 173 [130] L. Zöllei, E. Learned-Miller, E. Grimson, and W. Wells. Efficient population registration of 3D data. Proceedings of the International Conference on Computer Vision (ICCV), 3765:291–301, 2005. [131] M. Zwicker, H. Pfister, J. V. Baar, and M. H. Gross. EWA splatting. IEEE Transactions on Visualization and Computer Graphics, 8:223–238, 2002.