LONG-TERM SECONDARY ALVEOLAR BONE GRAFT EVALUATION IN COMPLETE CLEFTS USING A NEW RADIOGRAPHIC SCALE AND DETERMINING OPTIMAL GRAFT ASSESSMENT TIMING

Julianne K. Ruppel, D.D.S.

An Abstract Presented to the Graduate Faculty of Saint Louis University in Partial Fulfillment of the Requirements for the Degree of Master of Science in Dentistry (Research)

2012

ABSTRACT

Purpose: This study evaluates the effect of length of follow-up on alveolar cleft bone graft outcomes at two cleft lip and palate treatment centers. The Americleft SWAG scale for assessing graft outcomes was also evaluated for reliability and validity.

Methods: A total of 164 occlusal radiographs representing short-term (T1) and long-term (T2) follow-ups from 82 consecutively grafted patients (43 from Center 1, 39 from Center 2) were rated using the SWAG scale from 0 (failed graft) to 6 (ideal). Mean grafting age was 9y10m (9y7m Center 1, 10y1m Center 2). Mean age at T1, in the mixed dentition, was 11y1m, 1y3m post-graft (10y10m Center 1, 11y3m Center 2). Mean age at T2, in the permanent dentition, was 17y7m, 7y9m post-graft (20y2m Center 1, 14y6m Center 2). Six trained and calibrated raters scored each radiograph twice. The rating for each graft at T1 and T2 was the average of 12 ratings. Reliability was calculated at T1 and T2 using weighted Kappa. Paired t-tests (p<.05) were used to test mean T1 and T2 differences for each Center. Correlation tested the relationship between T1 and T2 ratings. Linear regression was used to determine possible factors that might contribute to graft rating changes.

Results: Paired t-tests found no statistically significant difference between T1 and T2 scores for either Center. There was a significant correlation between ratings at T1 and T2 (r=0.68). Twenty-seven patients’ ratings became better or worse by more than one point. Linear regression identified several treatment cofactors of interest.
There was a greater chance of bone graft score improvement with completion of canine eruption and canine substitution for missing lateral incisors. Mean inter- and intra-rater Kappa measurements were good (inter-rater overall = 0.705; intra-rater overall = 0.788). Mean Center 1 scores were significantly better than Center 2 scores at both T1 (5.21 vs. 3.19) and T2 (5.17 vs. 3.43).

Conclusions: Short-term follow-up ratings of graft outcomes for groups of patients from different centers identified significant differences between centers. These differences did not change over time: similar differences were identified at short-term and long-term follow-up. The rating system was reliable in the mixed and permanent dentitions. Outcome comparisons might be made effectively as early as one year post-grafting.

LONG-TERM SECONDARY ALVEOLAR BONE GRAFT EVALUATION IN COMPLETE CLEFTS USING A NEW RADIOGRAPHIC SCALE AND DETERMINING OPTIMAL GRAFT ASSESSMENT TIMING

Julianne K. Ruppel, D.D.S.

A Thesis Presented to the Graduate Faculty of Saint Louis University in Partial Fulfillment of the Requirements for the Degree of Master of Science in Dentistry (Research)

2012

© Copyright by Julianne K. Ruppel
ALL RIGHTS RESERVED
2012

COMMITTEE IN CHARGE OF CANDIDACY:
Associate Clinical Professor Donald Oliver, Chairperson and Advisor
Professor Eustáquio Araújo
Adjunct Professor Ross E. Long, Jr.
Clinical Professor Gus Sotiropoulos

DEDICATION

This thesis is dedicated to my wonderful fiancé Michael Durkin. He has made me both laugh and contemplate, and always wonder at my great luck. Thank you for the encouragement, patience, and love. I would also like to thank my parents James and Barbara Ruppel for their unceasing love, support, and enthusiasm. I am so grateful to have had your guidance throughout my life. You have made all of this possible.

ACKNOWLEDGEMENTS

The author would like to thank Dr. Ross E. Long for his tireless guidance.
His generous contribution of his time and energy in the production of this thesis is much appreciated. The cleft patients and families of Lancaster, Pennsylvania are very lucky to have the Lancaster Cleft Palate Clinic to help them through the many phases of cleft lip and palate treatment.

The author thanks Dr. Donald Oliver for his guidance, advice, and alacrity in the development of this thesis, and for the thoughtful clinical instruction throughout residency.

The author thanks Dr. Eustáquio Araújo for his encouragement and participation in this thesis committee, and for the tireless efforts to educate the University’s orthodontic residents.

The author thanks Dr. Gus Sotiropoulos for participation in this thesis committee and especially for treating the challenging occlusions of the University’s cleft palate patients.

TABLE OF CONTENTS

List of Tables
List of Figures
CHAPTER 1: INTRODUCTION
CHAPTER 2: REVIEW OF THE LITERATURE
  Cleft Palate in Humans
    Definitions and Etiology
    Incidence
    Dental and Alveolar Effects
  Cleft Palate Treatment Protocol
    Overview of Treatment Phases
    Timing of Secondary Bone Grafting
    Surgical Technique for Bone Graft Construction
    Changes in Grafted Bone Over Time
  Assessment of Alveolar Bone Graft Outcomes
    Use of Radiographs
    Use of Cone Beam Computerized Tomography
    Current Popular Assessment Methods
  Summary and Statement of Thesis
  References
CHAPTER 3: JOURNAL ARTICLE
  Abstract
  Introduction
  Materials and Methods
    Sample
    Ratings
    Statistics
  Results
  Discussion
    Scale Reliability and Validity
    Changes in Grafted Bone Over Time
    Impact of Variables on Graft Score Change
  Conclusions
  Literature Cited
Vita Auctoris

List of Tables

Table 1: Sample Demographics
Table 2: Americleft SWAG Scale Scores
Table 3: Interpretation of Kappa Statistics
Table 4: Overall Kappa scores for SWAG scale
Table 5: Distribution of Scores at Centers 1 and 2 at T1 and T2
Table 6: Change in score from T1-T2 multivariate analysis
Table 7: Comparison of Kappa Scores for SWAG, Bergland, Kindelan, and Chelsea Scales

List of Figures

Figure 1: Example of SWAG Scale Method With Scoring
Figure 2: Correlation of mean T1 and T2 scores
Figure 3: Comparison of outcome scores for Center 1 and Center 2
Figure 4: Distribution of scores for Center 1 and Center 2 at T1 and T2
Figure 5: Linear Plot of Multivariate Analysis Results
Figure 6: Example of a score decrease of >2
Figure 7: Example of a score increase of >2
Figure 8: Actual Change In Score Grouped By Initial T1 Score
Figure 9: Difference Between T1 and T2 Ratings Versus Mean of Both Ratings

CHAPTER 1: INTRODUCTION

Secondary alveolar bone grafting of the cleft alveolar ridge in the mixed dentition is a well-established treatment for patients with cleft lip and palate (CLP). The graft surgery has many reported benefits including periodontal support for the cleft-adjacent teeth (Boyne and Sands, 1972; Turvey et al., 1986; Tan et al., 1996), establishment of an osseous matrix for the eruption of permanent teeth (Boyne and Sands, 1972; Bergland et al., 1986; Long et al., 1995), closure of oronasal fistulae (Bergland et al., 1986; Turvey et al., 1986), and stabilization of the maxillary segments in cases of bilateral CLP (Turvey et al., 1986). Although most CLP treatment centers use similar secondary alveolar grafting surgical procedures, based on the method described by Boyne and Sands (1972), clinical graft success continues to vary among centers (Long et al., 2011).
As long as alveolar grafting continues in some cases to result in less than ideal bony fill-in, clinicians will find it necessary to assess graft prognosis and decide whether re-graft surgery is necessary. Assessment of a bone graft is typically done using one of the following popular scales: the Bergland Scale (Bergland et al., 1986), the Kindelan Scale (Kindelan et al., 1997), or the Chelsea Scale (Witherow et al., 2002). These existing scales for bone graft assessment have several drawbacks, including a lack of information on the location of grafted bone within a cleft site (Bergland et al., 1986; Kindelan et al., 1997), a requirement for the canine to be fully erupted before assessment (Bergland et al., 1986), overly complicated methods for assessment (Witherow et al., 2002; Long et al., 1995), and relatively poor inter-observer agreement (Boley et al., 2010; Nightingale et al., 2003).

Currently it is unknown whether there is an optimal time to assess a bone graft so that the assessment reflects the long-term condition of the graft. Grafted bone continues to change several years after placement (Tan et al., 1996; Honma et al., 1999; Feichtinger et al., 2007), and it would be helpful if the clinician knew at what point after surgery an assessment score will remain stable. A score assigned in the mixed dentition could be substantially different from a score assigned after orthodontic treatment is completed, or the score could remain relatively constant. The benefit of such a study and scale is that the earlier the final quality of a bone graft can be assessed, the sooner a decision can be made regarding future treatment needs, such as the management of the lateral incisor space and the possible use of a regrafting procedure.
If reliable and valid, the scale could be used for inter-center graft outcomes studies, allowing cleft palate treatment centers to compare results and modify treatment protocols if necessary in order to achieve the best possible outcomes.

CHAPTER 2: REVIEW OF THE LITERATURE

Cleft Palate in Humans

Definitions and Etiology

The embryogenesis and etiology of cleft lip and palate have been described in many texts and publications (Thornton et al., 1996; Mitchell, 2009). Cleft lip and palate is a congenital defect that results from the failure of the median nasal process to fuse with the maxillary processes and the failure of the bilateral palatal shelves of the maxillary process to fuse to one another. The cleft can occur in the lip alone, in the palate alone, or in both.

A cleft of the lip and primary palate occurs in the fifth to eighth week of embryonic development. During this time in normal fetal development the median nasal process and bilateral maxillary processes fuse to unite the upper lip. Merging of the two medial nasal processes with one another results in formation of a single median nasal process, also known as the intermaxillary segment, which gives rise to the primary palate, the anterior alveolus with the upper incisors, the philtrum of the upper lip, the columella, and the tip of the nose. A cleft of the secondary palate occurs in utero between the ninth and twelfth weeks of life when the palatal shelves fail to migrate superiorly towards one another and fuse. Since fusion of the palatal shelves occurs roughly 2 weeks after the fusion of the primary palate, a disruption in lip and primary palate fusion can also affect the secondary palate. Accordingly, approximately sixty percent of individuals with cleft lip also have a cleft palate (Mitchell, 2009).

Fusion of the lip can be disrupted by a lack of mesenchymal tissue proliferation in the maxillary process.
This mesenchymal tissue is derived from neural crest cell tissue, and its absence or reduction could inhibit maxillary fusion with the median nasal process. In contrast, the secondary palate is formed solely by midline fusion of the palatal shelves and is not associated with a lack of cell proliferation (Thornton et al., 1996).

Oral clefts are generally considered to be multifactorial disorders, though they can be genetic in origin or associated with a teratogen or medical condition. It is thought that a threshold model controls expression of multifactorial disorders, in which increases in genetic predisposition and in disease-causing environmental factors raise the likelihood of exceeding the threshold and expressing the disorder (Thornton et al., 1996).

Clefting of the palate can be a consequence of several factors that lead to impeded fusion (Mitchell, 2009). Failure of the palatal shelves to elevate has been attributed to a number of gene mutations, most notably mutations in MSX1 and DLX5, both of which encode homeobox transcription factors. Impaired FGF signaling has also been found to contribute to cleft lip and palate. Palatal clefting can also be due to an absence or deficiency in the hyoglossus muscle caused by the effect of a HOX gene mutation. This effect can prevent the embryonic tongue from flattening and thus impede palatal shelf elevation. Currently no specific tests are available to assess genetic susceptibility to orofacial clefts (Mitchell et al., 2009).

A positive family history of CL/P is considered one of the strongest risk factors for its occurrence. The risk of occurrence and the severity of the cleft both increase if an immediate family member is affected. The risk of nonsyndromic CL/P development is 3% in first-degree relatives of an affected individual. The risk is also increased for second- and third-degree relatives of the individual (Mitchell, 2009).
However, despite the increased familial occurrence, approximately 75% of CL/P and 80% of isolated CP instances have been reported as both nonfamilial and nonsyndromic (Thornton et al., 1996).

Many environmental factors have been found to increase the risk of cleft development. Teratogens are infrequent causes but nonetheless have been implicated. Isotretinoin, an acne treatment, and the diet drug amphetamine have been shown to cause the condition, as have benzodiazepines and steroids, though the latter two are considered weak teratogens (Thornton et al., 1996; Mitchell, 2009). Additionally, smoking during pregnancy, second-hand smoke, obesity, diabetes, and Vitamin A deficiency have been found to increase the risk (Boley et al., 2010). Other agents produce orofacial clefts that are classified as part of a syndrome based on the presentation of the developmental condition. These include the anticonvulsant drugs phenytoin, hydantoin, and trimethadione, as well as alcohol consumed by the mother during pregnancy, which produces fetal alcohol syndrome (Thornton et al., 1996; Mitchell, 2009).

With respect to syndromes, the literature agrees that orofacial clefts associated with a syndrome have a different etiology from the isolated orofacial cleft of multifactorial origin (Vanderas, 1987). In contrast to the isolated cleft, oral clefts associated with a syndrome have an etiology that can be easily determined by understanding the involved syndrome’s transmission, whether it is due to single gene abnormalities such as Van der Woude’s syndrome, microdeletions such as 22q deletion syndrome, or chromosomal abnormalities such as Down syndrome (Cohen, 2002). There is debate concerning the number of clefts that occur as part of a syndrome, but the proportion has been reported to be as high as 30% of CLP and 50% of CP (Mitchell, 2009).
Incidence

Cleft lip and cleft palate are the most common congenital malformations of the head and neck, with an incidence of approximately 1 in 940 infants born in the United States each year (CDC, 2011). Incidence is highest among Native Americans, followed in decreasing order by Asians, Caucasians, and African Americans. Incidence also differs between sexes, with males more often affected by cleft lip and palate than females, while females are more often affected by a cleft palate only (Vanderas, 1987; Thornton et al., 1996).

Dental and Alveolar Effects

Dental and alveolar anomalies can occur in widely varying arrays in cleft palate patients and are much more common than in the general population. It has been reported that dental abnormalities occur in approximately 54% of cleft patients versus 15% of non-cleft patients (Jordan et al., 1966). A number of authors have described abnormalities in dental development in cleft patients. These abnormalities invariably relate to the disruption of the embryonic formation of the dental lamina and are the result of the failure of the merging and fusion of the median nasal, lateral nasal, and maxillary processes of the embryo (Ross and Johnston, 1972). In addition to the discontinuity and absence of bone in the maxillary alveolar ridge at the cleft site, the resulting dental abnormalities can include congenitally missing teeth, supernumerary teeth, malformed teeth, and ectopic teeth (Long et al., 2000). It is reported that 20% of cleft patients are missing teeth in the primary dentition, which increases to 40% of patients missing teeth in the permanent dentition (Ross and Johnston, 1972). The maxillary lateral incisor is the most commonly affected permanent tooth due to its position near the cleft site.
The lateral incisor is missing in approximately 30% to 50% of cleft patients in the permanent dentition and in 10% to 20% in the primary dentition (Jordan et al., 1966; Ross and Johnston, 1972). When present, the lateral incisor is often malformed, and it can occur on either side of the cleft (Ross and Johnston, 1972). Missing teeth result in the atrophy of the associated alveolar ridge and underdevelopment of the premaxilla. In patients with clefts, abnormal muscle attachments are combined with the absence of teeth, resulting in negatively affected bone growth (Boley et al., 2010).

Cleft Palate Treatment Protocol

Overview of Treatment Phases

Cleft lip and palate (CLP) management is a process that can begin before the birth of the affected child and continues into adulthood. Treatment is complex and best managed by an interdisciplinary team of specialists that together are able to address the comprehensive needs of the patient (ACPA, 2009). A cleft team typically consists of a geneticist, a plastic surgeon, an oral and maxillofacial surgeon, an otolaryngologist, an orthodontist, a pediatric dentist, a speech therapist, an audiologist, a psychologist, a nutritionist, and often a social worker to aid in the rendering of services.

During the pregnancy the parents of the child may begin genetic counseling, in which they are screened by a geneticist for craniofacial syndromes and advised on the probability that future children may have these conditions. Following birth, primary surgical repair of the lip takes place in the first three months of life and can occur as soon as is deemed safe for the infant (Posnick and Ruiz, 2002). The goal of the lip repair is to restore functionality and improve esthetics. In some cases pre-surgical orthopedics may be beneficial to align the segments of the cleft maxillary alveolus before lip surgery can occur, particularly in patients with a bilateral cleft.
Soft tissue cleft palate closure is usually done as a second procedure at approximately 12 months of life. Closure of the palate at this time is done in order to restore functionality and allow for development of normal speech. Reconstruction of the muscles of the soft palate is done as part of the palate repair in order to restore this function (Posnick and Ruiz, 2002).

Restoration of the cleft alveolar ridge is done by alveolar bone grafting, which commonly uses autogenous bone to fill the remaining alveolar defect. The procedure is classified as an early secondary bone graft if done between the ages of 2-5 years, intermediate secondary if done between the ages of 6-15 years in the late mixed dentition, or late secondary (also called tertiary) if done from adolescence onward (Mercado and Vig, 2009).

The secondary alveolar bone grafting procedure is an essential part of the cleft palate treatment protocol, with numerous benefits when completed at an appropriate age and with a proper technique. A successful graft supplies bone for erupting teeth and periodontal support for teeth adjacent to the cleft. It also gives more support and elevation of the alar base on the affected side, thereby improving nasal symmetry. Alveolar grafting also stabilizes the separated maxillary segments, provides proper alveolar contour, and prevents maxillary arch collapse. The graft also connects the disconnected segments to the mobile premaxilla in cases of bilateral cleft (Long et al., 1995; Mercado and Vig, 2009).

Transverse palatal expansion with an orthodontic appliance is often deemed necessary before secondary bone grafting due to the absence of palatal bone and the constrictive effect of palatal scar tissue following surgery (Lidral and Vig, 2002). The goal of expansion is to align the posterior segment more favorably with the remaining alveolus in order to improve the surgeon’s access to the cleft.
Before the alveolus is grafted, the greatest amount of separation of the segments during transverse expansion will occur at the cleft site, as it is the area of least resistance. The separation is done with an orthodontic palatal appliance that provides low continuous force, thereby allowing the existing scar tissue to gradually stretch (Mercado and Vig, 2009).

Some clinicians argue, however, that presurgical expansion is often undesirable because the widened cleft makes surgical success less predictable. Postgraft expansion is advocated instead, with the additional suggestion that the distraction forces of expansion will stress the graft and thus stimulate graft maturation (Turvey et al., 1984). In a study of 56 clefts that underwent presurgical expansion, undertaken in an attempt to resolve this issue, there were significant but low correlations between cleft width and outcome of bone grafting for cleft widths ranging from 1.0 mm to 11.2 mm (Long et al., 1995). The low correlation, though statistically significant, suggests there is limited clinical significance to the concern that an expanded cleft will be more difficult to graft successfully, and that if necessary, clinicians should not hesitate to initiate gentle presurgical palatal expansion for improved surgical access. A study by Aurouze and colleagues (2000) also supports this conclusion. In their study of 31 cleft sites, there was no statistically significant correlation between pre-surgical cleft size and graft success as evaluated on pre-operative and 6-month post-operative digitized radiographs.

Limited orthodontic alignment before grafting is also often necessary if the maxillary incisors are severely tipped or rotated. Alignment to move these teeth away from the cleft can improve surgical access to the site (Mercado and Vig, 2009). Following alveolar grafting in the mixed or permanent dentition the patient will typically receive comprehensive orthodontic treatment for a number of years.
The goals of orthodontic treatment after grafting are to optimize oral esthetics and function as much as possible (Mercado and Vig, 2009). As previously mentioned, if the lateral incisor is present, it frequently is malformed, hypoplastic, or shows poor root formation. In these cases a decision needs to be made as to the value of maintaining the incisor for possible future dental restoration. If the maxillary lateral incisor is missing, treatment can include space closure with canine substitution for the missing lateral incisor, or space maintenance for future osseointegrated dental implants or other prosthetic replacement. It may also include orthognathic surgery to correct severe skeletal anteroposterior discrepancies.

The decision whether to close or maintain the missing lateral incisor space largely depends on the patient’s age at grafting. Gap closure of the lateral incisor space results in less graft resorption than space maintenance, but gap closure is most often done in younger patients with unerupted canines (Dempf et al., 2002; Schultze-Mosgau et al., 2003). Sufficient alveolar bone at the area of the cleft is necessary for implant placement, and in cases planned for an implant prosthesis it is suggested that the graft be placed at a later age so that an implant can be placed a few months after grafting surgery (Mercado and Vig, 2009). Loss of grafted bone over time often makes a follow-up graft necessary if the implant is to be placed more than a few months after the grafting surgery (Mercado and Vig, 2009). In a retrospective study by Kearns et al. (1997), nine of fourteen grafted patients who were planned to receive dental implants required a re-grafting procedure due to inadequate bone volume. The patients requiring a re-graft had waited an average of 26.4 months between secondary grafting and planned implant placement, versus an average interval of 15.75 months for patients with adequate bone.
Timing of Secondary Bone Grafting

The most common current strategy for secondary alveolar bone grafting surgery is based on the procedure as it was first described by Boyne and Sands in 1972 and advocates use of autogenous bone to fill the alveolar defect (Boyne and Sands, 1972). Currently, intermediate secondary bone grafting between 8-12 years of age is the procedure of choice for most cleft centers. There is general agreement that grafting at this time tends to improve alveolar and facial contour, and better facilitates permanent tooth eruption when compared to early secondary and tertiary grafting. This opinion has come about as the result of several classical research studies and continues to be supported by current data.

Bergland and colleagues, in a study of 378 consecutive patients who had undergone secondary alveolar bone grafting, noted that the best graft outcomes were observed in those patients who had the procedure done before eruption of the permanent maxillary canine (Bergland et al., 1986). This observation has been reported frequently in the literature (Turvey et al., 1984; Enemark et al., 1985; Mercado and Vig, 2009), but other studies have found the effect not to be significant (Long et al., 1996; Schultze-Mosgau et al., 2002).

In a landmark international multicenter outcome study called Eurocleft, outcomes of prior surgical treatment on maxillomandibular relationships, soft and hard tissue morphology, and nasolabial esthetics were compared in 169 subjects with complete unilateral cleft lip and palate (UCLP). Concerning bone grafting, this initial Eurocleft study concluded that patients who had undergone early primary alveolar bone grafting had less optimal outcomes in the mixed dentition when compared to patients treated with intermediate secondary grafting (Shaw et al., 1992).
A long-term follow-up Eurocleft study of 127 consecutively treated UCLP patients from the original Eurocleft study also found poorer results in sagittal interarch relationships in the cleft palate center that used primary bone grafting, with almost fifty percent of the center’s patients requiring orthognathic surgery by age 17 (Molsted et al., 2005; Shaw et al., 2005). This also agrees with the conclusions of the Americleft study, a similar multicenter outcome comparison, which found that of 169 patients among five centers, the patients from a center which used primary bone grafting had the poorest interarch relationship (Long et al., 2011; Hathaway et al., 2011).

Late secondary (also known as tertiary) bone grafting takes place in the permanent dentition and is done in situations in which the patient for various reasons could not be treated earlier in the mixed dentition. A bone graft placed at this time does not benefit from the bony remodeling that occurs during tooth eruption, and although usually adequate, the crestal bone height is typically lower post-graft than in patients who received secondary grafts in the mixed dentition (Dempf et al., 2002). Dempf and colleagues (2002) retrospectively compared graft outcomes using occlusal radiographs in 60 patients treated with secondary bone grafting and 25 patients treated with tertiary bone grafting. Following the grafting procedure, the secondary group was treated with orthodontic closure of the missing lateral incisor space, while 14 patients of the tertiary group received dental implants and the remaining nine patients of the tertiary group were treated with a fixed prosthesis. In a follow-up exam at least three years post-graft, it was found that 85% of secondary grafts had a crestal bone height 50-100% of the height of adjacent alveolar bone, and that 68% of tertiary bone grafts had this same outcome.
The study concluded that the functional stress of orthodontics and/or tooth eruption that takes place in grafted bone prevents progressive bone graft resorption. Furthermore, the tertiary bone grafts that later received dental implants had better crestal bone height results than those restored with a fixed prosthesis.

Surgical Technique for Bone Graft Construction

Boyne and Sands introduced the basic technique for contemporary alveolar bone grafting in 1972. They advocated intermediate secondary bone grafting based on their perception that anterior and transverse maxillary growth were 80% completed by eight years of age, thereby avoiding the adverse effect of scarring on midfacial growth that was reported with early grafting (Turvey et al., 2009). Many contemporary surgeons now time the grafting procedure so that the permanent lateral incisor, if present, and/or the permanent canine will erupt into the grafted bone shortly thereafter. This is accomplished by placing the graft when the unerupted canine root is one-half to two-thirds developed (Mercado and Vig, 2009), although Waite and Waite (1996) have suggested one-third root development is appropriate.

As described by Turvey et al. (2009), the procedure begins with mucoperiosteal flaps on the palate and vestibular surfaces of the maxillary segments while avoiding the gingiva of the teeth adjacent to the cleft. Reflection of the vestibular flaps exposes the crestal alveolar bone, while the palatal mucosa and tissue lining the cleft are reflected to expose the bony margins of the cleft alveolus. A nasal floor is reconstructed for the defect by superiorly elevating the tissues lining the cleft to the nasal cavity floor and suturing them in place. When construction of the watertight nasal floor is completed the area is ready to receive grafted bone. At this point the lateral surfaces of the cleft are exposed, as well as the anterior and posterior bony margins.
Autogenous cancellous bone particles are densely packed into the cleft and extended over the cleft margins in anticipation of some resorption. The iliac crest has remained a popular choice in children for harvesting cancellous particulate bone, although there is some debate on this topic. In adults the cranium is often used instead because of a very low level of morbidity associated with healing versus the iliac crest. In children, however, the iliac crest provides a large amount of accessible autogenous cancellous bone that is usually unavailable in the cranium (Boyne and Sands, 1972; Turvey et al., 1984; Waite and Waite, 1996).

Changes in Grafted Bone Over Time

Once the graft has been placed in the cleft some amount of resorption is expected, though the reported amount and timeline of resorption vary widely. A 2007 article by Feichtinger et al. studied the volumetric change of twenty-four cases of grafted clefts treated by orthodontic space closure (when possible) over three years. The analysis used computed tomography slices to calculate the volume of the defect preoperatively and the volume of the grafted bony bridge postoperatively, and observed a mean bone loss of 49.5% in the first year and a loss of 52% at three years. Despite this, in three of the twenty-four cleft sites there was an increase in bone volume of 8.0% at three years. The authors hypothesized that this increase could be due to the stimulating effect of erupting teeth, as these patients had retained both the permanent lateral incisor and canine. In contrast, the two patients who were habilitated with fixed prostheses instead of orthodontic space closure had graft volume decrease by an average of 95.2% at three years. This study suggests that even when treated with space closure, a fairly large amount of bone loss can be expected within the first year, with minimal continued loss in the second and third years postoperatively. Another study by Honma et al.
(1999) assessed grafted bone bridge volume of 15 cleft sites using CT slices at 3 months and 1 year postoperatively. Unlike the findings of Feichtinger et al. (2007), 9 of the 15 grafts maintained a volume of bone very close to the preoperative defect volume. The mean preoperative volume was 1.1±0.3 cm³ with a range of 0.6 to 1.8 cm³, while at 3 months the bone bridge volume was a mean of 1.2±0.6 cm³, and at 1 year was 1.1±0.5 cm³. The study also demonstrated the varying degrees of success with grafting, however, because the bone volumes at 1 year ranged from 0.3 cm³ to 2.0 cm³. In two of the fifteen patients there was a significant volume decrease from 3 months to 1 year, but the authors did not offer any insight as to why this may have occurred, or what type of treatment was done to habilitate the cleft space.

A one-year postoperative follow-up of 65 secondary alveolar grafts by Trindade et al. (2005) also found some resorption within the first year after surgery. Of the 65 grafts, 68%-71% were classified as Bergland type I, meaning the interdental septum height was normal, and 15% as Bergland type II, with the height 75% of normal. The authors stated that it was impossible to rate all of the cases because of ongoing orthodontic treatment and continuing eruption of permanent teeth, implying that the bone graft was expected to change during this time.

In a five-year prospective study of 100 cases by Tan et al. (1996), long-term changes in grafted bone were assessed radiographically and by periodontal examination of the adjacent teeth. At five years post-graft, the authors found a mean probing depth of 2.28 mm for cleft-adjacent teeth, compared to an average of 2.14 mm for contralateral teeth not adjacent to the cleft. The grafted bone was assessed on periapical radiographs using the Bergland scale (Bergland et al., 1986), which assigns a score of I-IV based on the level of septal bone height between adjacent erupted cleft teeth.
Using these criteria, 88.9% of their patients with unilateral clefts and 84.6% of those with bilateral clefts were scored as type I, meaning the septal bone height was essentially normal at five years post-graft. The authors stated that whenever possible the missing lateral incisor space was closed orthodontically upon eruption of the adjacent teeth. The findings in this study would suggest that even five years after grafting the amount of bone fill-in is adequate for periodontal tooth support if the remaining cleft space is closed (Tan et al., 1996). The study did not compare the five-year evaluation to any short-term post-graft follow-up, however, so it is impossible to assess the amount of change in bone bridging over time.

More long-term data were gathered by Sindet-Pedersen and Enemark (1985) and Enemark et al. (1987), in which 95 UCLP and BCLP patients were followed after secondary grafting in the mixed dentition. Of the 95 patients, 76 were deemed to have no change in alveolar bone height between the short-term and long-term (greater than 4 years post-graft) evaluations. Another 14 patients changed from a normal alveolar bone height to a height that is 75% of normal, and the remaining 5 grafts were less successful. The authors concluded that one can reasonably assume a final treatment result after a fairly short amount of time postoperatively.

It is logical to expect some change in the amount of grafted bone over time, although as demonstrated by these studies the literature reports a wide range of change, from very stable bone grafts to decreases in bone volume of over fifty percent (Enemark et al., 1987; Tan et al., 1996; Honma et al., 1999; Feichtinger et al., 2007). Whether the bone graft remains stable or not, however, there also appears to be disagreement on how long after graft placement the bone will continue to change. Tan and colleagues assessed grafts at five years postoperatively, but did not assess the grafts at any other timepoints.
Perhaps the graft had stabilized at some timepoint before the assessment done at five years, or conversely maybe the graft continued to change afterward. Feichtinger et al. assessed grafts longitudinally at one, two, and three years, but it is reasonable to question whether the graft may continue to change even after three years. Most often the patient has several years of growth and dental eruption remaining after the placement of a secondary bone graft and may be completing orthodontic treatment during much of that time. An examination of changes in bone graft outcomes over a longer period of time would be helpful in determining at what point postoperatively the assessment can be considered a valid reflection of the graft’s outcome.

Assessment of Alveolar Bone Graft Outcomes

Restoration of the alveolar defect with a secondary bone graft in the patient with a cleft can vary from a highly successful graft with minor resorption to a relatively poor outcome with only a small amount of bone bridging the cleft or no bridge at all. In these latter cases the graft bone fill-in is less than acceptable, and in some instances the clinician may consider recommending a re-graft procedure. However, in order to do so some objective method of assessing the graft outcome is necessary. In 1986 Bergland and colleagues introduced what is now known as the Bergland Scale in an attempt to quantify the amount of grafted bone successfully filling in the defect and providing bony support of the teeth adjacent to the cleft. The Bergland Scale and subsequent scales and rating systems have been developed for the purpose of accurately measuring the success of the bone grafting procedure in order to aid in this clinical decision-making for the cleft palate team (Bergland et al., 1986; Long et al., 1995; Kindelan et al., 1997; Witherow et al., 2002; Hynes and Early, 2003).
The number of studies testing the validity and reliability of rating methods has increased since the introduction of the Bergland Scale with the popularization of evidence-based care and advances in medical and dental imaging. Currently there are many proposed methods for assessing alveolar bone graft outcomes, though most use either dental radiographs or cone-beam computed tomography (CBCT).

Use of Radiographs

Dental radiographs are taken regularly during the orthodontist’s lengthy involvement in treatment of CLP, especially during preoperative transverse palatal expansion, before and after the grafting procedure, and during the following orthodontic treatment. This abundance of information is useful in obtaining adequate sample sizes for assessing bone graft outcomes. Radiographs allow inspection of the amount and distribution of bone in the grafted sites, the height of the interdental septum formed during tooth eruption, and the response of grafts to orthodontic treatment, all at very little cost to the patient (Trindade et al., 2005).

No medical device is perfect, however, and use of dental radiographs in bone graft assessment has some inherent limitations. These include image enlargement and distortion, superimposition of adjacent structures, a limited number of identifiable landmarks, and positioning problems (Feichtinger et al., 2007). Because a radiograph is a two-dimensional representation of a three-dimensional structure, there can also be inaccuracies in judging the amount and volume of bone bridging the cleft and in determining the buccal-lingual position of the adjacent cleft teeth relative to the bridge (Lee et al., 1995). A 1997 study by Rosenstein et al. compared the use of dental radiographs and CT scans in bone graft assessment. The investigation used radiographs and CTs taken within six months of each other from fourteen patients with UCLP.
Measurements of root coverage were done on 1.5 mm CT slices, while the ratio of root support was estimated from the periapical radiographs by dividing the percentage of estimated bone support by the percentage of estimated root length using the method developed by Long et al. (1995). There was no statistically significant difference between the two sets of measurements, and furthermore the differences found were clinically insignificant when using the statistical method of Bland and Altman (1986) to test for agreement between methods. The study provided evidence that the use of dental radiographs for graft assessment for groups of patients is valid and exposes the patients to much less radiation when compared to CT and CBCT. Although the study supported the continued use of dental radiographs, agreement on the amount of bone support as measured on radiographs was somewhat worse for the grafts that had middle ranges of root support, such as between 70% and 90% of root coverage. The authors attributed this difference in part to the variation inherent in taking measurements (Rosenstein et al., 1997).

Use of Cone Beam Computed Tomography

Recently cone-beam computed tomography has become more popular in bone graft assessment. Compared to CT, CBCT has a lower cost and much lower patient radiation exposure. A 2009 systematic review of CBCT oral and maxillofacial literature found that 7% of articles published in peer-reviewed journals dealt with clinical applications of CBCT for cleft pathology (De Vos et al., 2009). These articles most commonly reported CBCT use for general assessment of the cleft region, but also for assessing nasal and piriform ridge deformities and for evaluating alveolar bone grafts.
In comparing the effective radiation dose to the patient between CBCT and conventional panoramic and lateral cephalometric radiographs, the reported CBCT effective dose ranged from 56.2 µSv to 61.1 µSv, roughly 6 times more radiation than panoramic radiographs (Silva et al., 2008). This effective radiation dose for CBCT of the oral region is also much larger than the 9 µSv that is the typical effective dose of a single intraoral dental radiograph with D-speed film (Avendanio et al., 1996). Thus, although CBCT is a valuable tool in evaluating the cleft area in three dimensions, repeated exposure to CBCT for longitudinal evaluation is perhaps of questionable benefit.

Current Popular Assessment Methods

The outcome of a secondary alveolar bone graft is of great interest to the orthodontist. The status of the graft affects the treatment decisions of the orthodontist, such as how to address a missing lateral incisor. The graft outcome impacts whether the clinician decides to close the missing lateral incisor space orthodontically, restore the space with an osseointegrated dental implant, or perhaps have a fixed partial denture made if the space is not suitable for either of the first two options. Furthermore, measuring bone graft outcomes allows the clinician to reflect on the bone grafting protocol that led to the outcome and to apply statistics that provide information concerning the impact of various treatments. In short, outcome measures provide the data necessary to make evidence-based decisions regarding the treatment protocols used, and how said protocols can be improved.

Several popular bone graft assessment methods that use conventional dental radiographs have been developed since the surgical method of Boyne and Sands was first introduced in the literature in 1972. There are two types of scales that can be applied to these assessment methods, categorical and continuous.
The majority of scales introduced use categorical variables, including the Bergland Scale (Bergland et al., 1986), Kindelan Scale (Kindelan et al., 1997), and Chelsea Scale (Witherow et al., 2002), while a scale introduced by Long and colleagues uses continuous variables (Long et al., 1995).

The Bergland Scale utilizes periapical dental radiographs to assign a score to the graft ranging from I-IV based on the height of the interdental bone septum between the adjacent cleft teeth (usually a central incisor and canine). A score of I indicates that the interdental septum height is essentially normal, while II indicates an interdental height that is ¾ of normal, III is less than ¾, and IV indicates a lack of a continuous bony bridge and is considered a failure (Bergland et al., 1986). The Bergland Scale has remained relatively popular because it is very easy to use, but it also has several disadvantages. The method requires that the permanent canine be completely erupted before applying the scale. In many instances it is useful to the clinician to assess the graft outcome before the eruption of the permanent canine. If the graft needs to be redone, it is favorable to have the procedure done before the eruption of permanent teeth, so that the tooth eruption is able to have a positive effect on the grafted bone. Repeated bone grafting after the eruption of permanent teeth is frequently associated with a higher rate of failure (Witherow et al., 2002). By waiting to score the graft until after tooth eruption, this advantage is lost. Furthermore, by only taking into account the height of the interdental septum, the scale may inaccurately reflect the outcome of situations in which the bone height is close to normal but there is an apical defect. The amount of bone present in the defect is not taken into consideration with this scale.
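As a rough illustration of how the Bergland categories described above partition graft outcomes, the following Python sketch maps a septal height, expressed as a fraction of normal, to a Bergland type. The function name `bergland_type`, its parameters, and in particular the `normal_cutoff` value used to decide what counts as "essentially normal" are assumptions made for this example; Bergland et al. (1986) describe a visual rating of radiographs, not a numeric threshold.

```python
def bergland_type(septum_fraction, has_bony_bridge, normal_cutoff=0.95):
    """Return a Bergland type ("I" to "IV") for one graft site.

    septum_fraction: interdental septum height as a fraction of normal (0 to 1).
    has_bony_bridge: False when no continuous bony bridge spans the cleft.
    normal_cutoff:   hypothetical threshold for "essentially normal" height;
                     this value is an assumption, not part of the published scale.
    """
    if not 0.0 <= septum_fraction <= 1.0:
        raise ValueError("septum_fraction must be between 0 and 1")
    if not has_bony_bridge:
        return "IV"   # no continuous bridge: considered a failure
    if septum_fraction < 0.75:
        return "III"  # less than three-quarters of normal height
    if septum_fraction >= normal_cutoff:
        return "I"    # essentially normal height
    return "II"       # roughly three-quarters of normal height


# Hypothetical example sites: (fraction of normal height, bridge present)
for site in [(1.0, True), (0.8, True), (0.5, True), (0.9, False)]:
    print(site, bergland_type(*site))
```

Because the returned type depends only on septal height, a site with near-normal crestal height but an apical defect would still come back as type I, which is the limitation of the scale noted above.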
The Kindelan Scale was introduced in the literature in 1997 with the intent to improve some aspects of the Bergland Scale (Kindelan et al., 1997). The Kindelan Scale assigns a score to the graft according to the degree of bony fill-in as determined by comparing pre- and postoperative occlusal radiographs. Grade 1 is assigned if >75% of the alveolar cleft area is filled with bone, grade 2 if 50-75% is filled, grade 3 if <50%, and grade 4 if there is no complete bony bridge present. The Kindelan Scale also was developed for use more quickly after surgery, with scores assigned from radiographs that were taken an average of 4 months postoperatively in the article (Kindelan et al., 1997). Unlike the Bergland Scale, this allows assessment of the graft before the eruption of the permanent canine. Other advantages of the scale include its ease of use and consideration of the overall amount of bone filling the cleft defect. The scale’s main drawback is that it does not describe the location of the possible defects within the cleft site. For example, it may be desirable to have bony tooth support coronally rather than apically, but either of these situations could be scored a 2 or a 3 with the Kindelan Scale.

The Chelsea Scale was made to clarify the position of bone when assessing grafts and to allow assessment before eruption of the cleft teeth (Witherow et al., 2002). The scale involves two stages. In the first stage the observer divides the cleft vertically and also divides the roots of the adjacent teeth into fourths. Each fourth of root is then assigned a score: 1 if the bone extends from the root surface to the midline, 0.5 if there is bone present but it does not extend to the midline, and 0 if there is no bone present on the root surface. The second stage involves assigning a letter grade to the cleft, from A to F, that reflects the position of the bone.
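The scoring rules of the two scales just described can be sketched in Python as follows. This is a minimal illustration only: the function names, the dictionary `CHELSEA_QUARTER_SCORES`, and the labels `"to_midline"`, `"partial"`, and `"none"` are hypothetical names chosen for the example, and in practice the percentage of fill-in and the per-quarter findings come from an observer's visual judgment of the radiograph.

```python
def kindelan_grade(percent_filled, has_complete_bridge=True):
    """Kindelan grade (1-4) from the percentage of the cleft area filled with bone."""
    if not 0.0 <= percent_filled <= 100.0:
        raise ValueError("percent_filled must be between 0 and 100")
    if not has_complete_bridge:
        return 4  # no complete bony bridge
    if percent_filled > 75:
        return 1  # more than 75% bony fill-in
    if percent_filled >= 50:
        return 2  # 50-75% bony fill-in
    return 3      # less than 50% bony fill-in


# Chelsea Scale, first stage: score for each fourth of a cleft-adjacent root.
# The label strings are hypothetical; the numeric scores follow the text above.
CHELSEA_QUARTER_SCORES = {
    "to_midline": 1.0,  # bone extends from the root surface to the midline
    "partial": 0.5,     # bone present but not reaching the midline
    "none": 0.0,        # no bone on the root surface
}


def chelsea_quarter_scores(observations):
    """Translate four per-quarter observations for one root into scores."""
    if len(observations) != 4:
        raise ValueError("expected one observation per fourth of the root")
    return [CHELSEA_QUARTER_SCORES[label] for label in observations]


print(kindelan_grade(60.0))
print(chelsea_quarter_scores(["to_midline", "to_midline", "partial", "none"]))
```

The letter grade of the Chelsea Scale's second stage is not modelled here because the rules for assigning it are not given in this section.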
Although this method increases the amount of information gathered from radiographic graft assessment, it is not easily agreed upon between observers. Visually sectioning roots into fourths and dividing the cleft vertically is not a simple task on a small periapical or occlusal radiograph, and thus the scale is prone to poor inter-observer agreement (Nightingale et al., 2003).

The study by Nightingale et al. (2003) examined the reproducibility of the Bergland, Kindelan, and Chelsea Scales and found that none of the three scales was more reproducible than the others. The authors applied each scale to 59 bone graft sites and then measured the intra-observer agreement with the weighted Kappa statistic, and inter-observer agreement with multiple weighted Kappas as described by Landis and Koch (1977). According to Landis and Koch, the level of agreement is described by the Kappa statistic in accordance with the following values: greater than 0.80, almost perfect agreement; 0.61-0.80, substantial agreement; 0.41-0.60, moderate agreement; less than 0.40, poor agreement. Nightingale and co-authors found that the intra-observer Kappas were an average of 0.67 for the Bergland Scale, 0.61 for the Chelsea Scale, and 0.70 for the Kindelan Scale. Inter-observer Kappas averaged 0.48 for the Bergland Scale, 0.50 for the Chelsea Scale, and 0.49 for the Kindelan Scale. The results indicate that there is little difference in reliability between the scales, and that although each scale has substantial intra-observer agreement, there is only moderate agreement between observers.

A major drawback of all three of these categorical scales is the lack of objective assessment methods of bone graft appearance, and thus the inability to carry out parametric statistical analyses of the results (Long et al., 1995). One scale introduced by Long and colleagues (1995) uses ratios of linear measurements, resulting in continuous variables on which parametric statistics can be used.
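For readers unfamiliar with the agreement statistic used in the reproducibility comparison above, the following Python sketch computes a weighted Kappa for two sets of ordinal ratings and labels the result with the agreement bands quoted in this section. It is an illustration under stated assumptions: linear disagreement weights are assumed (the weighting scheme used by Nightingale et al. is not stated here), categories are assumed to be coded as integers starting at 0, and the function names and the handling of values falling exactly on a band boundary are choices made for this example.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Linearly weighted Kappa for two equal-length lists of ordinal ratings.

    Ratings must be integers in the range 0 .. n_categories - 1.
    """
    k = n_categories
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("need two non-empty rating lists of equal length")
    if k < 2:
        raise ValueError("need at least two categories")
    if not all(0 <= r < k for r in list(ratings_a) + list(ratings_b)):
        raise ValueError("ratings must lie in 0 .. n_categories - 1")

    # Joint proportion table: observed[i][j] = share of cases rated i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    p_obs = 0.0  # weighted observed agreement
    p_exp = 0.0  # weighted agreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = 1.0 - abs(i - j) / (k - 1)  # assumed linear weights
            p_obs += weight * observed[i][j]
            p_exp += weight * row[i] * col[j]
    if p_exp == 1.0:
        raise ValueError("Kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


def agreement_label(kappa):
    """Label a Kappa value using the bands quoted in the text above."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    return "poor"


# Reported averages from the text: intra-observer 0.67, inter-observer 0.48.
print(agreement_label(0.67), agreement_label(0.48))
```

Weighting the disagreements means that two observers who differ by one category are penalized less than two who differ by several, which suits ordered scales such as those compared here.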
In the Long method, proportional bone support of adjacent cleft teeth is determined by dividing the relative height of bone supporting a tooth by the length of the root, resulting in a ratio. The amount of bony bridge is determined by another ratio that divides the distance of the interseptal height from the CEJ line by the length of adjacent cleft tooth roots. By using ratios, this method eliminates the problems of radiographic image elongation and foreshortening and also improves the statistical information made available by data reported in ratios. Despite this, the method is time-consuming and cumbersome, and the typical clinician may be unlikely to use it.

A negative aspect of all of the aforementioned scales is that they are each based on cross-sectional data, i.e. a radiograph taken at a single point in time post-surgery. As has been demonstrated by Tan et al. (1996), Honma et al. (1999), and Feichtinger et al. (2007), the grafted bone may change for several years following placement. Therefore, it would be useful if there were a bone graft scale that would enable identification of a postsurgical time at which a score would reflect the long-term outcome of the graft and would no longer be expected to change over time. Currently no scale or study has determined whether a score assigned in the mixed dentition is substantially different from a score assigned after orthodontic treatment is completed, or whether the score would remain relatively constant between the mixed and permanent dentitions. The benefit of such a study and scale is that the earlier the final quality of a bone graft can be estimated, the sooner a decision can be made regarding future treatment needs, such as the management of the lateral incisor space and the possible use of a regrafting procedure.

Summary and Statement of Thesis

The purpose of this study is twofold.
First, it will assess secondary alveolar bone graft success or failure longitudinally using long-term post-surgery radiographs and, if possible, conclude from the data an optimal time post-surgery to assess the graft outcome. This study intends to determine when measurements of bone fill-in accurately reflect the long-term result of the graft. It will also address the effect, if any, of post-surgical growth, tooth eruption, or orthodontic movement of teeth adjacent to or into the grafted bone on ultimate graft success. This is a retrospective study of data gathered on patients with cleft lip and palate who were treated at two cleft palate treatment centers that were included in a previous bone graft study. Secondly, the study will also test the validity and reproducibility of a new scale developed by the Americleft Project, the first North American intercenter comparison of treatment outcomes (Russell et al., 2011). This scale was first introduced at the American Cleft Palate Craniofacial Association 2011 annual meeting.

References

1. American Cleft Lip and Palate Association. Parameters for the evaluation and treatment of patients with cleft lip/palate or other craniofacial anomalies. Cleft Palate Craniofac J. 2009.
2. Aurouze C, Moller KT, Revis RR, Rehem K, Rudney J. The presurgical status of the alveolar cleft and success of secondary bone grafting. Cleft Palate Craniofac J. 2000;37:179-184.
3. Avendanio B, Frederiksen NL, Benson BW, Sokolowski TW. Effective dose and risk assessment from detailed narrow beam radiography. Oral Surg Oral Med Oral Pathol Oral Radiol Endod. 1996;82:713-719.
4. Bergland O, Semb G, Abyholm FE. Elimination of the residual alveolar cleft by secondary bone grafting and subsequent orthodontic treatment. Cleft Palate J. 1986;23:175-205.
5. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307-310.
6. Boley S, Grossman D, Long RE Jr.
A new method for assessing outcomes of bone grafting in cleft patients and intra-center audit of alveolar bone grafting outcomes from different surgeons. Philadelphia, PA: Albert Einstein Medical Center; 2010. Dissertation.
7. Boyne PJ, Sands NR. Secondary bone grafting of residual alveolar and palatal clefts. J Oral Surg. 1972;30:87-92.
8. Centers for Disease Control and Prevention (US). Birth defects, Data and Statistics. Department of Health and Human Services. 2011.
9. Cohen MM Jr. Syndromes with orofacial clefting. In: Wyszynski DF, ed. Cleft Lip and Palate: From Origin To Treatment. New York: Oxford University Press; 2002:53-65.
10. Dempf R, Teltzrow T, Kramer FJ, Hausamen JE. Alveolar bone grafting in patients with complete clefts: a comparative study between secondary and tertiary bone grafting. Cleft Palate Craniofac J. 2002;39:18-25.
11. Enemark H, Sindet-Pedersen S, Bundgaard M. Long-term results after secondary bone grafting of alveolar clefts. J Oral Maxillofac Surg. 1987;45:913-919.
12. Feichtinger M, Mossböck R, Kärcher H. Assessment of bone resorption after secondary alveolar bone grafting using three-dimensional computed tomography: a three-year study. Cleft Palate Craniofac J. 2007;44:142-148.
13. Hathaway R, Daskalogiannakis J, Mercado A, Russell K, Long RE, Cohen M, Semb G, Shaw W. The Americleft study: an inter-center study of treatment outcomes for patients with unilateral cleft lip and palate. Part 2. Dental arch relationships. Cleft Palate Craniofac J. 2011;48:244-251.
14. Honma K, Kobayashi T, Nakajima T, Hayasi T. Computed tomographic evaluation of bone formation after secondary bone grafting of alveolar clefts. J Oral Maxillofac Surg. 1999;57:1209-1213.
15. Hynes PJ, Early MJ. Assessment of secondary alveolar bone grafting using a modification of the Bergland grading system. Br J Plast Surg. 2003;56:630-636.
16. Jordan RE, Kraus BS, Neptune CM. Dental abnormalities associated with cleft lip and/or palate. Cleft Palate J.
1966;3:22-55.
17. Kearns MB, Perrot DH, Sharma A, Kaban LB, Vargervik K. Placement of endosseous implants in grafted alveolar clefts. Cleft Palate Craniofac J. 1997;34:520-525.
18. Kindelan JD, Nashed RR, Bromige MR. Radiographic assessment of secondary autogenous alveolar bone grafting in cleft lip and palate patients. Cleft Palate Craniofac J. 1997;34:195-198.
19. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.
20. Lee C, Crepeau RJ, Williams HB, Schwartz S. Alveolar cleft bone grafts: results and imprecision of the dental radiograph. Plast Reconstr Surg. 1995;96:1534-1538.
21. Lidral AC, Vig KW. Role of the orthodontist in the management of patients with cleft lip and/or palate. In: Wyszynski DF, ed. Cleft Lip and Palate: From Origin To Treatment. New York: Oxford University Press; 2002:381-396.
22. Long RE Jr, Spangler BE, Yow M. Cleft width and secondary alveolar bone graft success. Cleft Palate Craniofac J. 1995;32:420-427.
23. Long RE Jr, Paterno M, Vinson B. Effect of cuspid positioning in the cleft at the time of secondary alveolar bone grafting on eventual graft success. Cleft Palate Craniofac J. 1996;33:226-230.
24. Long RE Jr, Semb G, Shaw W. Orthodontic treatment of the patient with complete clefts of lip, alveolus, and palate: lessons of the past 60 years. Cleft Palate Craniofac J. 2000;37:533.
25. Long RE Jr, Deacon SA. Assessment of orthodontic outcomes in patients with clefts. In: Losee JE, Kirschner RE, eds. Comprehensive Cleft Care. New York: McGraw-Hill Medical; 2009:1061-1077.
26. Long RE Jr, Hathaway R, Daskalogiannakis J, Mercado A, Russell K, Cohen M, Semb G, Shaw W. The Americleft study: an inter-center study of treatment outcomes for patients with unilateral cleft lip and palate. Part 1. Principles and study design. Cleft Palate Craniofac J. 2011;48:239-243.
27. Mercado AM, Vig KWL. Orthodontic principles in the management of orofacial clefts. In: Losee JE, Kirschner RE, eds.
Comprehensive Cleft Care. New York: McGraw-Hill Medical; 2009:721-747.
28. Mitchell LE. Epidemiology of cleft lip and palate. In: Losee JE, Kirschner RE, eds. Comprehensive Cleft Care. New York: McGraw-Hill Medical; 2009:35-42.
29. Mølsted K, Brattström V, Prahl-Andersen B, Shaw WC, Semb G. The Eurocleft study: intercenter study of treatment outcome in patients with complete cleft lip and palate. Part 3: dental arch relationships. Cleft Palate Craniofac J. 2005;42:78-82.
30. Nightingale C, Witherow H, Reid FD, Edler R. Comparative reproducibility of three methods of radiographic assessment of alveolar bone grafting. Eur J Orthod. 2003;25:35-41.
31. Posnick JC, Ruiz RL. Staging of cleft lip and palate reconstruction: infancy through adolescence. In: Wyszynski DF, ed. Cleft Lip and Palate: From Origin To Treatment. New York: Oxford University Press; 2002:319-353.
32. Rosenstein SW, Long RE Jr, Dado DV, Vinson B, Alder ME. Comparison of 2-D calculations from periapical and occlusal radiographs versus 3-D calculations from CAT scans in determining bone support for cleft-adjacent teeth following early alveolar bone grafts. Cleft Palate Craniofac J. 1997;34:199-205.
33. Ross RB, Johnston MC. Cleft Lip and Palate. Baltimore: Williams and Wilkins; 1972.
34. Russell K, Long RE Jr, Hathaway R, Daskalogiannakis J, Mercado A, Cohen M, Semb G, Shaw W. The Americleft study: an inter-center study of treatment outcomes for patients with unilateral cleft lip and palate. Part 5. General discussion and conclusions. Cleft Palate Craniofac J. 2011;48:265-270.
35. Schultze-Mosgau S, Nkenke E, Schlegel AK, Hirschfelder U, Wiltfang J. Analysis of bone resorption after secondary alveolar cleft bone grafts before and after canine eruption in connection with orthodontic gap closure or prosthodontic treatment. J Oral Maxillofac Surg. 2003;61:1245-1248.
36. Shaw WC, Dahl E, Asher-McDade C, Brattström V, Mars M, McWilliam J, Mølsted K, Plint DA, Prahl-Andersen B, Roberts C.
A six-center international study of treatment outcome in patients with clefts of the lip and palate: part 5. General discussion and conclusions. Cleft Palate Craniofac J. 1992;29:413-418.
37. Shaw WC, Brattström V, Mølsted K, Prahl-Andersen B, Roberts CT. The Eurocleft study: intercenter study of the treatment outcome in patients with complete cleft lip and palate. Part 5: discussion and conclusions. Cleft Palate Craniofac J. 2005;42:93-98.
38. Silva MA, Wolf U, Heinicke F, Bumann A, Visser H, Hirsch E. Cone-beam computed tomography for routine orthodontic treatment planning: a radiation dose evaluation. Am J Orthod Dentofacial Orthop. 2008;133:640.e1-640.e5.
39. Sindet-Pedersen S, Enemark H. Comparative study of secondary and late secondary bone-grafting in patients with residual cleft defects. Short term evaluation. Int J Oral Surg. 1985;14:389-398.
40. Tan AES, Brogan WF, McComb HK, Henry PJ. Secondary alveolar bone grafting: five-year periodontal and radiographic evaluation in 100 consecutive cases. Cleft Palate Craniofac J. 1996;33:513-518.
41. Thornton JB, Nimer S, Howard PS. The incidence, classification, etiology, and embryology of oral clefts. Semin Orthod. 1996;2:162-168.
42. Trindade IK, Mazzottini R, Silva Filho OG, Trindade IE, Deboni MC. Long-term radiographic assessment of secondary alveolar bone grafting outcomes in patients with alveolar clefts. Oral Surg Oral Med Oral Pathol Oral Radiol Endod. 2005;100:271-277.
43. Turvey TA, Vig K, Moriarty J, Hoke J. Delayed bone grafting in the cleft maxilla and palate: a retrospective multidisciplinary analysis. Am J Orthod. 1984;86:244-256.
44. Turvey TA, Ruiz RL, Tiwana PS. Bone graft construction of the cleft maxilla and palate. In: Losee JE, Kirschner RE, eds. Comprehensive Cleft Care. New York: McGraw-Hill Medical; 2009:837-865.
45. Vanderas AP. Incidence of cleft lip, cleft palate, and cleft lip and palate among races: a review. Cleft Palate Craniofac J. 1987;24:216-225.
46.
Waite PE, Waite DE. Bone grafting for the alveolar defect. Semin Orthod. 1996;2(3):192-196.
47. Witherow H, Cox S, Jones E, et al. A new scale to assess radiographic success of secondary alveolar bone grafts. Cleft Palate Craniofac J. 2002;39:255-260.

CHAPTER 3: JOURNAL ARTICLE

Abstract

Purpose: This study evaluates the effect of length of follow-up on alveolar cleft bone graft outcomes of two cleft lip and palate treatment centers that were part of a previous four-center comparison of bone grafting outcomes. The Americleft SWAG scale for assessing graft outcomes was also evaluated for reliability and validity.

Methods: 164 occlusal radiographs representing short-term (T1) and long-term (T2) follow-up from 82 consecutively grafted patients (43 from Center 1, 39 from Center 2) were rated using the SWAG scale from 0 (failed graft) to 6 (ideal). Mean grafting age was 9 years 10 months (9y7m Center 1, 10y1m Center 2). Average T1 was 11y1m in the mixed dentition, and 1y3m post-graft (10y10m Center 1, 11y3m Center 2). T2 was 17y7m in permanent dentition, and 7y9m post-graft (20y2m Center 1, 14y6m Center 2). Six trained/calibrated raters scored each radiograph twice. Rating for each graft at T1 and T2 was the average of 12 ratings. Reliability was calculated at T1 and T2 using weighted Kappa. Paired t-tests (p<.05) were used to test mean T1 and T2 differences for each Center. Correlation tested the relationship between T1 and T2 ratings. Linear regression was used to determine possible factors that might contribute to graft rating changes over time.

Results: Paired t-test failed to find a statistical difference between T1 and T2 scores for either Center. There was a significant correlation between scores at T1 and T2 (r=0.68). Fourteen patients’ ratings became worse by more than 1 scale point, and 13 patients’ ratings became better by more than 1 point. Linear regression identified several treatment cofactors of interest.
There was a greater chance of a bone graft score improving with completion of canine eruption and substitution of canines for missing lateral incisors. Mean inter- and intra-rater Kappa measurements were good (inter-rater overall = 0.705; intra-rater overall = 0.788). Mean scores for Center 1 were significantly better than Center 2 at both T1 (5.21 vs. 3.19) and T2 (5.17 vs. 3.43).
Conclusions: Short-term follow-up ratings of graft outcomes of groups of patients from different centers identified significant differences between the centers, and similar differences were identified at the long-term assessment. The rating system was reliable in the mixed and permanent dentitions. Outcome comparisons might be made effectively as early as one year post-grafting.

Introduction

Secondary alveolar bone grafting in patients with complete clefts of the lip, alveolus and palate (CLP) in the mixed dentition is a well-established treatment. The graft surgery, when done in combination with orthodontic treatment, has many reported benefits including periodontal support for the cleft-adjacent teeth (Boyne and Sands, 1972; Turvey et al., 1984; Tan et al., 1996), establishment of an osseous matrix for the eruption of permanent teeth (Boyne and Sands, 1972; Bergland et al., 1986; Long et al., 1995), closure of oronasal fistulas (Bergland et al., 1986; Turvey et al., 1984), and stabilization of the maxillary segments, especially in cases of bilateral CLP (Turvey et al., 1984). Although most CLP treatment centers perform secondary alveolar grafting surgery in a method similar to the description by Boyne and Sands (1972), clinical graft success continues to vary among centers (Long et al., 2011). As long as alveolar grafting continues in some cases to result in less than ideal bony fill-in, clinicians will need to decide whether re-graft surgery is necessary.
Assessment of a bone graft is typically done using one of the popular proposed scales: the Bergland Scale (Bergland et al., 1986), the Kindelan Scale (Kindelan et al., 1997), or the Chelsea Scale (Witherow et al., 2002). The existing scales for bone graft assessment have several shortcomings, including a lack of information on the location of grafted bone within a cleft site (Bergland et al., 1986; Kindelan et al., 1997), a requirement that the canine be fully erupted before assessment (Bergland et al., 1986), overly complicated methods for assessment (Witherow et al., 2002; Long et al., 1995), and relatively poor inter-observer agreement (Nightingale et al., 2003).
Currently it is unknown whether there is an optimal time to assess a bone graft so that the assessment reflects the long-term condition of the graft. Grafted bone continues to change several years after placement (Enemark et al., 1987; Tan et al., 1996; Honma et al., 1999; Feichtinger et al., 2007), and it would be helpful if the clinician knew if and at what point post-surgery an assessment score is stable. Perhaps a score assigned in the mixed dentition would be substantially different from a score assigned after orthodontic treatment is completed, or perhaps the score would remain relatively constant.
The purpose of this study is twofold. First, it will assess secondary alveolar bone graft success or failure longitudinally using long-term post-surgery radiographs and, if possible, determine from the longitudinal data an optimal time post-surgery to assess the graft outcome. This study intends to determine when measurements of bone fill-in accurately reflect the long-term result of the graft. It addresses the effect, if any, of post-surgical growth, tooth eruption, or management of the lateral incisor space on ultimate graft success.
This is a retrospective study of data gathered on patients with cleft lip and palate who were treated at two cleft palate treatment centers that were included in a previous bone graft study. Secondly, the study will also test the validity and reproducibility of a new scale introduced at the American Cleft Palate Craniofacial Association 2011 annual meeting.

Materials and Methods

Sample

The sample was gathered from two cleft palate craniofacial treatment centers that were used in a previous study comparing bone grafting outcomes between four treatment centers. The sample consisted of 82 consecutively grafted non-syndromic patients (31 female and 51 male) with unilateral or bilateral clefts. Forty-three subjects were from Center 1 and 39 subjects from Center 2. There were 4 subjects with bilateral clefts, and on these only the side most clearly visible on the radiograph was included in the study. The mean age at the time of ABG (alveolar bone grafting) was 9.85 years (9y7m Center 1, 10y1m Center 2) with an age range of 6.08 to 15.92 years. The mean length of time between surgery and the first post-operative radiograph (T1) was 14.43 months with a range of 3 months to 4.5 years. The mean length of time between surgery and the second post-operative radiograph (T2) was 7.75 years with a range of 1.58 to 13.0 years. At T2 the subjects were 17.64 years of age on average (20y2m Center 1, 14y6m Center 2). The length of time between T1 and T2 radiographs was a mean of 6.54 years with a range of 6 months to 12 years (Table 1).

Table 1. Sample Demographics

                            Center 1   Center 2   Overall
N (total)                   43         39         82
Mean Age at ABG (years)     9.58       10.08      9.83
Mean years ABG to T1        1.25       1.17       1.25
Mean years ABG to T2        10.58      4.42       7.75
Mean years T1 to T2         8.95       3.40       6.54

Intraoral periapical and occlusal radiographs taken after secondary alveolar bone grafting surgery were analyzed from the eighty-two cleft sites. All radiographs were taken using conventional film by experienced technicians.
T2 films selected were those taken at least 12 months after the T1 radiograph when possible, with the majority taken at the mean of 6.54 years after the T1 radiograph. The T1 and T2 radiographs for each cleft site were assigned a number, scanned to digital form and converted to a .tif file (high quality digital photo) by a research assistant. Where possible, radiographs were digitally enhanced to improve contrast, brightness, and edge sharpness. Each coded radiograph was placed in a PowerPoint slide to comprise a slide show given to volunteer examiners.
In an attempt to overcome the shortcomings of the previous bone graft rating scales, a new bone graft scale has been developed as part of the Americleft Project (a Task Force of the American Cleft Palate-Craniofacial Association). It is called “A Standardized Way of Assessing Grafts” (SWAG) (Table 2). The scale assigns a score from 0 to 6, with 0 representing a failed graft with a poor re-graft prognosis, and 6 representing a completely filled cleft site with an essentially normal alveolar bone height. The scale divides the cleft visually into thirds: apical, middle, and coronal. First, the clinician looks for the presence of a bony bridge spanning the cleft site. For a bony bridge to be counted, the presence of bone has to be unequivocal, although the entire third does not have to be filled completely with bone. If there is no bridge present that completely crosses the cleft, the clinician then evaluates the site for root exposure of either of the cleft-adjacent teeth. If there is any root exposure of a permanent central incisor, canine, or a usable lateral incisor, the graft is rated a “0”. If there is complete bony root coverage despite the lack of a bony bridge, the graft is rated a “1”. If a bony bridge is present, the clinician then scores that given third of the cleft a “2”, and then assigns a score for the other two sections.
The remaining thirds are assigned a “0” if there is no bony bridge present and there is any permanent tooth root exposure, a “1” if there is no bony bridge but both adjacent permanent roots are covered with bone or the permanent tooth roots are separated from the cleft margin by primary or unusable teeth that will be extracted, or a “2” if the given third also has a bony bridge spanning the cleft. For example, if a grafted cleft has had some resorption apically but is otherwise successful, it would be assigned a SWAG score of “5” as shown in Figure 1. The coronal and middle thirds of the cleft are bridged with bone (so both the coronal and middle thirds are assigned scores of “2”), but the apical third lacks a bony bridge while still having root coverage of the adjacent teeth (so the apical third is scored a “1”).
The SWAG scale attempts to provide the clinician with information on both the quantity and location of bone within a cleft site. The scale also assesses the prognosis for a regrafting procedure by including a method to score sites that lack a bone bridge but provide root coverage for permanent teeth, as opposed to those sites in which unbridged thirds also have exposed root surface, which would likely require removal of exposed teeth for a successful attempt at regrafting.

Table 2. Americleft SWAG Scale Scores (Russell et al., 2011)

Score  Description
0      No bone bridge. Permanent tooth roots or crown exposed in cleft margin.
1      No bone bridge. No permanent tooth roots or crown exposed.
2      Bone bridge present in one of the cleft thirds (avg. 1/3 entire cleft site filled but less than ½); permanent tooth root or crown exposed in both other unbridged thirds.
3      Bone bridge present: avg. 1/3 cleft site filled but less than ½; permanent tooth root or crown exposure in one of the remaining unbridged thirds; no permanent tooth root or crown exposure in the other unbridged third.
4      Bone bridge present: avg. 1/3 cleft site filled but less than ½; no permanent tooth root or crown exposure in the other unbridged thirds OR bone bridge present in two of the cleft thirds: avg. 2/3 cleft site filled (more than ½ filled); permanent tooth root or crown exposure in the other unbridged third.
5      Bone bridge present in two of the cleft thirds: avg. 2/3 cleft site filled (more than ½ filled); no permanent tooth root or crown exposure in the remaining unbridged third.
6      Complete bone fill-in: definitely more than 2/3 cleft site filled, including up to and beyond actual or projected root apices.

Figure 1. Example of SWAG Scale Method with Scoring

Ratings

Six examiners, five experienced CLP orthodontists from the Americleft Project team and one orthodontic resident, were given a digital slide tutorial in order to learn the new SWAG scale for rating alveolar bone grafts, and were then calibrated by discussion of the use of the scale and by rating a few sample radiographs. Examiners were then asked to use the rating system to rate each digitized radiograph in the slide show, and to mark their ratings on a Microsoft Excel spreadsheet. After the first ratings were completed, the radiographs were rearranged in a random sequence and re-rated at a later date. The random rearrangement was done to minimize the chance of bias due to memory and to allow for determination of intra-examiner reliability. All examiners were also blinded to the center of origin of each radiograph.
Each radiograph was rated 12 times, twice by each of six examiners. The mean of the twelve scores was used as the final graft site score for each time point. Inter-rater reliability was determined by comparing the scores given to each of the cleft sites at each examiner’s first rating session. Intra-rater reliability was determined by comparing the first and second session scores for each graft site given by each examiner.
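The scoring rule described for the SWAG scale, and the averaging of the twelve ratings, can be illustrated with a short sketch. The names `Third`, `swag_score`, and `final_rating` are hypothetical and are not part of the Americleft materials; the logic is only a reading of the written description and Table 2, and it assumes each third has already been judged by a rater as bridged or not and as having covered or exposed permanent roots.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Third:
    """A rater's findings for one vertical third of the cleft (hypothetical structure)."""
    bridged: bool        # unequivocal bony bridge spans the cleft in this third
    roots_covered: bool  # no permanent tooth root or crown exposed in this third


def swag_score(apical: Third, middle: Third, coronal: Third) -> int:
    """Return a 0-6 score following the rule as described in the text (a sketch)."""
    thirds = (apical, middle, coronal)
    if not any(t.bridged for t in thirds):
        # No bridge anywhere: 1 if all permanent roots are covered, otherwise 0.
        return 1 if all(t.roots_covered for t in thirds) else 0
    total = 0
    for t in thirds:
        if t.bridged:
            total += 2   # bridged third
        elif t.roots_covered:
            total += 1   # unbridged third, roots covered
        # unbridged third with root exposure adds 0
    return total


def final_rating(ratings):
    """Mean of all ratings for one radiograph (12 in this study: 6 raters x 2 sessions)."""
    return mean(ratings)


# The worked example from the text: coronal and middle thirds bridged,
# apical third unbridged but with root coverage.
example = swag_score(
    apical=Third(bridged=False, roots_covered=True),
    middle=Third(bridged=True, roots_covered=True),
    coronal=Third(bridged=True, roots_covered=True),
)
print(example)  # 5
```

Summing per-third values in this way reproduces the combinations listed in Table 2, including the two alternative routes to a score of 4.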
Statistics

Analysis of intra- and inter-examiner reliability was done using weighted Kappa statistics (Landis and Koch, 1977), interpreted as listed in Table 3. Kruskal-Wallis tests were used to analyze the difference in scores between centers at T1 and at T2. A paired t-test was used to test the difference in scores within each center between T1 and T2, and a correlation statistic was run to assess the agreement between T1 and T2 scores. Finally, multiple linear regression was used for multivariate analysis of the impact of related variables on the change between T1 and T2 scores. The variables evaluated were canine eruption status at T2 (1=incompletely erupted versus 2=erupted) and management of the lateral incisor at T2 (1=lateral incisor present and functional; 2=canine substitution for a missing lateral incisor; 3=missing lateral incisor space held for prosthetic replacement). In addition, the relationship between the amount of change in scores over time and the initial score at T1 was included in the regression. For simplification, initial scores were categorized according to their clinical implications. Scores of less than 3 (category 1) would be considered graft failures with poor prognosis for regrafting, with either no bone bridge at all or at most only 1/3 of the cleft bridged but with roots of teeth exposed. Scores between 3 and 4 (category 2) would suggest a poor graft but with a bone bridge, more bone coverage over roots in unbridged thirds, and a better prognosis for regrafting. Scores between 4 and 5 (category 3) would suggest a successful but less than perfect graft with less need for augmentation grafting, and scores greater than 5 (category 4) would be considered very successful grafts. For this and all tests, a p-value of <0.05 was selected as the indication of statistical significance.

Table 3. Interpretation of Kappa Statistics (modified from Landis and Koch, 1977)

Value of Kappa   Strength of Agreement
<0.20            Poor
0.21-0.40        Fair
0.41-0.60        Moderate
0.61-0.80        Good
0.81-0.99        Very Good
1.00             Perfect Agreement

Results

The overall mean intra-rater reliability as measured with the Kappa statistic was 0.788 and ranged from 0.755 to 0.837. The T1 mean intra-rater Kappa was 0.790 and the T2 mean intra-rater Kappa was 0.805. The overall mean inter-rater Kappa was 0.705 (range 0.681 to 0.735). The T1 mean inter-rater Kappa was 0.713, and the T2 mean Kappa was 0.701 (Table 4). Referring to Table 3, all of these Kappa averages fall in the “Good” to “Very Good” range.

Table 4. Overall Kappa Scores for SWAG scale. Mean Inter-rater=0.705. Mean Intra-rater=0.788

Rater   1       2       3       4       5       6       AVG
1       0.809   0.79    0.753   0.674   0.657   0.731   0.721
2       0.79    0.795   0.706   0.72    0.71    0.749   0.735
3       0.753   0.706   0.837   0.658   0.711   0.688   0.703
4       0.674   0.72    0.658   0.755   0.704   0.695   0.690
5       0.657   0.71    0.711   0.704   0.729   0.622   0.681
6       0.731   0.749   0.688   0.695   0.622   0.804   0.697

Overall, the average T1 graft score was 4.322 with a standard deviation of 1.370. The average score given to the T2 grafts was 4.347 with a standard deviation of 1.269. The average score change from T1 to T2 was 0.024 ± 1.053. Fourteen graft ratings became worse by 1 category or greater, and 13 ratings became better by 1 category or greater. Fifty-five grafts changed by less than 1 category, randomly varying better or worse. A paired t-test failed to find a statistical difference between the T1 and T2 scores (p=0.835). The correlation between T1 and T2 was 0.68 (p<.001), which indicates a strong and highly significant positive correlation (Figure 2).

[Figure 2: scatterplot of mean T1 rating versus mean T2 rating, both axes 0 to 6]
Figure 2. Correlation of mean T1 and T2 Scores. r=0.68 (p<.001)

The mean scores for Center 1 were significantly higher than for Center 2 at both T1 (5.21 vs. 3.19) and T2 (5.17 vs. 3.43).
A Kruskal-Wallis test found similarly highly significant differences between the two centers at both T1 and T2, which can be seen in Figure 3.

Figure 3. Comparison of outcome scores for Center 1 and Center 2

The Kruskal-Wallis test, using the mean of each patient’s 12 ratings, demonstrated similar significant differences between centers at both T1 and T2. A graph of the actual distribution of the categorical scores for each center was therefore constructed to see whether the frequency of each score, from 0 to 6, within each center was also similar in the short-term (T1) and long-term (T2) ratings. Table 5 and Figure 4 show the similar score distribution for each center at both time periods.

Table 5. Distribution of Scores for Centers 1 and 2 at T1 and T2. The scale ranges from 6 (ideal) to 0 (complete failure). Note similar distribution from T1 to T2 within each center

Score            6        5        4        3        2        1       0
Center 1 (T1)    55.04%   21.32%   14.73%   8.14%    0.58%    0.00%   0.19%
Center 1 (T2)    57.56%   8.14%    30.43%   2.52%    1.36%    0.00%   0.00%
Center 2 (T1)    7.05%    17.09%   23.29%   27.14%   14.32%   2.99%   8.12%
Center 2 (T2)    7.48%    13.03%   29.49%   28.85%   13.46%   2.35%   5.34%

Figure 4. Distribution of scores for Center 1 and Center 2 at T1 and T2

The results of the multivariate analysis by multiple linear regression are listed in Table 6 and shown in Figure 5. When holding all other variables constant, for those grafts with a T1 score of <3 the change in score from T1 to T2 was a mean 2.2 points greater than for those with an initial score of ≥5 (p<0.001). Grafts with a T1 score between 3 and 4 had a mean 1.29 point higher change than those with an initial score of ≥5 (p<0.001). Grafts with a T1 score between 4 and 5 had a mean 0.90 point higher change in score than those with an initial score of ≥5 (p=0.005). Center 2 also had a mean 0.75 point lower change in score than Center 1 (p=0.02).
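To make the reading of these coefficients concrete, the following sketch adds the Table 6 coefficients to the constant to obtain a predicted T1-to-T2 change for a given combination of categories. It assumes the usual additive, dummy-coded form of a multiple linear regression in which each referent category contributes zero; the function name and dictionary keys are illustrative only, and the sketch is not the software used in the study.

```python
# Coefficients as reported in Table 6; referent categories contribute 0.0.
CONSTANT = -0.99
INITIAL_SCORE = {"<3": 2.20, "3 to <4": 1.29, "4 to <5": 0.90, ">=5": 0.0}
CENTER = {1: 0.0, 2: -0.75}
INCISOR_MGMT = {1: 0.0, 2: 0.38, 3: -0.30}   # 1=present, 2=canine substitution, 3=space held
CANINE_ERUPTION = {1: 0.0, 2: 0.48}          # 1=incomplete, 2=complete


def predicted_change(initial: str, center: int, incisor: int, canine: int) -> float:
    """Predicted T2 minus T1 score change under the assumed additive model."""
    return (
        CONSTANT
        + INITIAL_SCORE[initial]
        + CENTER[center]
        + INCISOR_MGMT[incisor]
        + CANINE_ERUPTION[canine]
    )


# A Center 1 graft with an initial score below 3, a present lateral incisor,
# and incomplete canine eruption at T2: only the constant and the
# initial-score coefficient apply.
print(round(predicted_change("<3", 1, 1, 1), 2))  # 1.21
```

Because the constant is negative, a coefficient of 2.20 corresponds to a predicted gain of only about 1.2 points for the referent center and referent tooth categories, and several of the individual coefficients were not statistically significant.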
When holding all other variables constant, compared to the use of a present and functional permanent lateral incisor in the graft site (Incisor Management Group 1), missing lateral incisor spaces that were managed by substitution of the permanent canine (Incisor Management Group 2) had a mean 0.38 higher, but not statistically significant, change in score from T1 to T2 (p=0.105). The change in score from T1 to T2 when a bridge was planned to replace a missing lateral incisor was lower (-0.30) but not statistically significant when compared to a present lateral incisor (Table 6). Grafts that had completed canine eruption at T2 had an average 0.48 increase in score change compared to grafts with incomplete canine eruption at T2, but this was also not statistically significant.

Table 6. Change in score from T1-T2 multivariate analysis

Variable                                    Coefficient   p-value
Constant                                    -0.99
Initial Score <3                            2.20          <0.001
Initial Score 3 to <4                       1.29          <0.001
Initial Score 4 to <5                       0.90          0.005
Initial Score ≥5                            Referent
Center 1                                    Referent
Center 2                                    -0.75         0.02
Incisor Mgmt 1 (present lateral incisor)    Referent
Incisor Mgmt 2 (canine substitution)        0.38          0.105
Incisor Mgmt 3 (space held)                 -0.30         0.345
Canine Eruption Status 1 (incomplete)       Referent
Canine Eruption Status 2 (complete)         0.48          0.167

Figure 5. Linear Plot of Multivariate Analysis Results. Constant = -0.99

Discussion

Scale Reliability and Validity

This study used the SWAG scale, a new method of assessing alveolar cleft graft outcomes. The purpose of developing a new assessment method was to increase the amount of useful data obtained with increased reliability over previous methods, and to test the scale for possible use in inter-center outcome comparisons and at varied points in time and dental eruption following the graft placement.
The scale’s reliability was measured using weighted Kappa statistics, and the mean intra- and inter-rater reliability were 0.788 and 0.705, respectively, both within the “good” category (Landis and Koch, 1977) (Table 3). The SWAG scale Kappa scores for both inter- and intra-rater reliability were also higher than those reported for other graft assessment scales such as the Bergland Scale, Chelsea Scale, and Kindelan Scale (Nightingale et al., 2003; Boley et al., 2010) (Table 7).
The new SWAG scale provides additional information regarding the location and amount of bone bridging across the cleft when compared to the scale with the next highest Kappa statistics, the Bergland scale. The Bergland scale assigns a score based on the height of interdental bone, and is therefore easy to use. It does, however, provide misleading information in situations where there is normal bone height but an apical bone defect (Bergland et al., 1986). Furthermore, the SWAG scale can be used before eruption of the permanent canine, whereas the Bergland scale requires that canine eruption be complete before assessment of interdental septum height.
The SWAG scale also received higher Kappa statistic scores than both the Kindelan and Chelsea scales. The Kindelan and Chelsea scales provide more detailed information on the condition of the graft but with less inter-rater reliability and so have limited use for assessing inter-center outcomes. The Kindelan scale assigns a score based on the estimated percentage of bony fill-in, and does not take into account those situations in which there is no bony bridge but the adjacent cleft teeth have adequate bony root coverage. This information is useful in determining when a group of grafted clefts with poor outcomes still have a favorable prognosis for a regrafting surgery. The Chelsea scale does provide this information, but involves two steps and thus is time-consuming and typically has the lowest reliability of the four scales (Table 7).
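For readers unfamiliar with the statistic, the sketch below shows one common form of weighted Kappa for two sets of ordinal ratings, together with a lookup against the Table 3 categories. The study does not state which weighting scheme was used, so the linear weights here are an assumption, as are the function names and the handling of values that fall between the printed ranges in Table 3 (for example, between 0.80 and 0.81).

```python
def weighted_kappa(ratings_a, ratings_b, n_categories=7):
    """Linearly weighted Kappa for two equal-length lists of integer scores
    in 0..n_categories-1 (an assumed weighting; the source does not specify one)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    k = n_categories
    # Observed joint proportions of (score from a, score from b).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # disagreement weight: 0 on the diagonal
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j]
    if weighted_expected == 0.0:
        return float("nan")  # undefined when both raters use one identical category
    return 1.0 - weighted_observed / weighted_expected


def interpret_kappa(kappa):
    """Map a Kappa value to the Table 3 labels (boundary handling is an assumption)."""
    if kappa >= 1.0:
        return "Perfect Agreement"
    if kappa > 0.80:
        return "Very Good"
    if kappa > 0.60:
        return "Good"
    if kappa > 0.40:
        return "Moderate"
    if kappa > 0.20:
        return "Fair"
    return "Poor"


# The two overall means reported for the SWAG scale.
print(interpret_kappa(0.788), interpret_kappa(0.705))  # Good Good
```

With linear weights, a disagreement of one scale point is penalized far less than a disagreement of several points, which suits an ordinal scale such as SWAG better than unweighted Kappa.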
Table 7. Comparison of Kappa Scores for SWAG, Bergland, Kindelan, and Chelsea Scales

Scale (source)                          Intra-rater Kappa score   Inter-rater Kappa score
SWAG                                    0.788                     0.705
Bergland (Nightingale et al., 2003)     0.67                      0.70
Bergland (Boley et al., 2010)           0.711                     0.627
Kindelan (Nightingale et al., 2003)     0.48                      0.49
Kindelan (Boley et al., 2010)           0.671                     0.671
Chelsea                                 0.61                      0.50

A major factor that may play a role in lowering levels of reliability is the variability in the quality and type of radiographs themselves (occlusal versus periapical). Additionally, the amount of bone density and extent of periodontal attachment can impact the overall success of the graft and are impossible to assess on radiographs (Rosenstein et al., 1997).
There were several outliers in this study’s results that demonstrate the variability in using conventional radiographs for graft assessment, shown in Figure 6 and Figure 7. The graft in Figure 6a depicts an eruption sac around the permanent canine crown but with apparent complete bone fill-in extending to the alveolar crest. In most situations one might assume that the area of the eruption sac will become healthy bone after the canine erupts and thus score the graft as highly successful. However, Figure 6b depicts a much less successful graft outcome at T2. The dramatic change from T1 to T2 is likely due to many factors, such as the rater’s inability to assess depth of bone on a 2D radiograph, or the effect of space opening versus closure on the graft. Figure 7 shows a graft that substantially increased in score from T1 to T2, again likely due to the limitation of assessing a 3D volume of bone at T1 from a 2D radiograph and the difficulty of distinguishing radiolucency from a cuspid eruption sac, which might be expected to look better with completion of canine eruption, from failure of bone bridging, which would not be expected to improve.

Figure 6. Example of a score decrease of >2. T1 (a.) was rated a mean 5.9. T2 (b.) was rated a mean 3.5.
Figure 7.
Example of a score increase of >2. T1 (a.) was rated a mean 3.6 while T2 (b.) was rated a mean 5.7.

The amount of variation possible when scoring individual grafts indicates that the SWAG scale is more appropriate for assessing groups of patients rather than individuals. Although the results found no overall change when rating groups of patients from T1 to T2, the variation among individual patients’ ratings between T1 and T2 may not be acceptable for clinical treatment planning purposes.
Kruskal-Wallis tests indicated that the SWAG scale could accurately score grafts from various follow-up periods and from different centers as significantly different. The scale is able to differentiate graft outcomes at both ages examined, even when comparing the two centers’ T1 and T2 radiographs (Figure 3). The method is able to detect differences between separate samples with different lengths of follow-up, making it well suited for use in inter-center studies of graft outcomes.
The SWAG scale provides information on the location and amount of bony bridging with increased reliability over other methods as measured by weighted Kappa scores (Table 7). The method provides a total score for the graft outcome, and also an individual score for each vertical third of the cleft: coronal, middle, and apical. In addition to assessing individual graft outcomes for patients, the scale scores can be used to generate overall data for cleft treatment centers in order to compare treatment methods and subsequent outcomes.

Changes In Grafted Bone Over Time

The lack of statistically significant change between the short-term follow-up radiographs taken at T1 and the long-term T2 radiographs indicates that grafts can be evaluated for long-term success or failure as early as 1 year post-graft (the mean T1 post-surgery length was 1y3m). This was also true when each center was examined separately, with the distribution of scores at T1 and at T2 within each center nearly the same (Table 5).
The mean time lapse between T1 and T2 was 6.5 years, during which time a typical cleft patient’s remaining unerupted teeth will emerge and orthodontic work will be completed. Both of these instances are known to create bone (Posnick and Ruiz, 2002), and can thus influence the long-term success or failure of the graft. The results indicate that in spite of various post-surgical events that can change the bony architecture, the outcomes of a center’s bone grafting protocol will change very little over time when looking at large samples of patients. In short, a center with mediocre graft outcomes on average will continue to have mediocre average outcomes over time, and a center with highly successful grafts shortly after surgery will continue to show the same outcome long-term.
The study by Feichtinger and colleagues (2007) found similar results with a mean bone loss of 49.5% in the first year, and a loss of 52% at three years. Within the two years between timepoints the bone volume changed only by 3.5%, an amount that is not clinically significant. This also agrees with the studies by Sindet-Pedersen and Enemark (1985) and Enemark et al. (1987) in which 80% of the patients followed were deemed to have no change in alveolar bone height between the short-term and long-term (greater than 4 years post-graft) evaluations. In the present study 55 of 82 patients (67%) had changes in their graft outcomes of 1 category or less between short- and long-term follow-ups.
This study’s results support the idea that after a short-term post-surgical follow-up of approximately one year, clinicians can presume there will be minimal change to a center’s graft outcomes over the long-term, and that conclusions made from a group of short-term follow-up grafts would be essentially the same as those made from the same group in a long-term follow-up. Although there is occasional lack of agreement on individual cases, the randomness of the variation balances any individual outliers.
This is useful in that centers can compare graft outcomes with only a relatively short period of time needed to gather an appropriately sized sample, and can therefore more quickly learn from the comparisons if corrections need to be made to a center’s protocol.

Impact of Variables on Graft Score Change

There are many factors that can possibly affect the success of alveolar bone grafting. In addition to the center of origin of the samples, the other variables investigated in this study were (1) the initial score category at T1; (2) the management of the present or missing permanent lateral incisor at T2; and (3) the eruption status of the permanent canine at T2 (Table 6). The multivariate analysis using linear regression revealed some statistically significant differences in changes in score as well as some interesting trends for the covariates identified in this study which might be considered as “predictors” of change on an individual basis.
As was obvious from the significant differences evident in the comparison of the outcomes between centers, there was also a statistically significant likelihood of a 0.75 lower change in score from T1 to T2 for patients coming from Center 2 as compared to Center 1. This disparity, or “center effect,” demonstrates the importance of collaborative efforts to compare the treatment outcomes of different centers. Both centers included in this study used similar pre- and post-surgical treatment protocols, underlining the effect of other variables on outcomes. These may include supplementary infant orthopedics, timing and techniques of surgical procedures, the extent of the original deformity, and importantly, the skill of the surgeon (Russell et al., 2011).
Also of interest was the comparison of the changes in score that would be predicted depending on which score category a patient started in at T1.
Using the highest starting score category as the “gold standard”, with scores between 5 and 6 indicative of perfect or near-perfect graft outcomes against which others were compared, the changes in scores predicted for each of the categories of lower starting scores were all significantly higher than those predicted for the grafts with the best starting scores. On the one hand, this would not be unexpected since there would be little or no room for a near perfect graft at T1 to get even better and much more room for a poorer graft at T1 to show greater change in score. While this appears promising in suggesting that improvement in scores can occur with grafts that start out poor, the actual amount of predicted change (the coefficient added to the constant of -0.99) is modest: the poorest grafts would be predicted to gain little more than 1.2 points for initial scores less than 3 (still a poor graft at T2), with a negligible change in score for those with initial ratings between 3 and 5 (Figure 8).

Figure 8. Actual Change In Score Grouped By Initial T1 Score. 1: <3; 2: 3 to <4; 3: 4 to <5; 4: ≥5.

When examining the predictors of incisor management and canine eruption status, while no statistically significant relationships were found, there were some interesting trends. The management of a missing lateral incisor space appeared to have a more positive effect on long-term graft success when canine substitution was used, with a predicted higher change in score as compared to use of a present lateral incisor in the cleft site (+0.38). Conversely, when a missing lateral incisor was managed with plans for a replacement tooth there was a predicted lower change in score than that found when a present lateral incisor was used (-0.30). However, it should be emphasized that in this sample, replacement lateral incisors were done with fixed bridgework rather than a dental implant, which in fact may have a different impact on graft appearance at T2.
This finding agrees with that reported by Dempf et al. (2002) and Schultze-Mosgau et al. (2003). Dempf et al. suggested that stress introduced to the bone during orthodontic space closure prevents resorption, and that those patients treated with space maintenance for a bridge or future implant had higher rates of resorption. Finally, grafts in which the canine was completely erupted at T2 had a slightly higher increase in change when compared to incomplete canine eruption, but again this was not statistically significant.
Although several variables were statistically significant predictors of change in bone graft ratings over time, the clinical significance of the multivariate analysis for treatment planning in individual patients is less impressive. To place into perspective the validity of making decisions about an individual patient’s long-term bone graft outcome based on short-term radiographic assessment, the agreement method of Bland and Altman (1986) in Figure 9 depicts the difference between the T1 and T2 ratings versus the mean of the two ratings for each of the patients in the study. Setting the limits of agreement at two standard deviations, Figure 9 shows that the vast majority of the patients’ ratings fall within an acceptable range of agreement, equally distributed above and below the mean and within two standard deviations of the mean difference. This of course is the explanation for the ratings at short- and long-term assessments not being significantly different for the group as a whole. However, for this rating method, when done at a short-term follow-up (about 1 year post-op), to be useful in making treatment decisions concerning an individual patient’s graft success, an error rate of up to ±2 points for approximately 7% (6 of 82) of the patients and an error rate of up to ±1 point for 33% (27 of 82) of the patients would have to be acceptable.
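The limits of agreement used in this kind of analysis can be sketched as follows. The function names are hypothetical, the difference is taken as T2 minus T1 (an assumption about sign convention), and the summary-value call assumes that the reported change of 0.024 ± 1.053 is the mean and standard deviation of the individual T1-to-T2 differences.

```python
from statistics import mean, stdev


def limits_from_summary(bias, sd, n_sd=2.0):
    """Limits of agreement from a mean difference and its standard deviation."""
    return bias - n_sd * sd, bias + n_sd * sd


def bland_altman(t1, t2, n_sd=2.0):
    """Return (bias, lower limit, upper limit, number of points outside the limits)
    for paired ratings; differences are computed as T2 minus T1."""
    if len(t1) != len(t2) or len(t1) < 2:
        raise ValueError("need at least two paired ratings of equal length")
    diffs = [b - a for a, b in zip(t1, t2)]
    bias = mean(diffs)
    lower, upper = limits_from_summary(bias, stdev(diffs), n_sd)
    outside = sum(1 for d in diffs if d < lower or d > upper)
    return bias, lower, upper, outside


# Using the summary values reported in the Results (assumed to be mean and SD
# of the differences), the limits fall at roughly -2.08 and +2.13 scale points.
lower, upper = limits_from_summary(0.024, 1.053)
print(round(lower, 3), round(upper, 3))
```

Under those assumptions the limits span roughly two scale points in either direction, which is consistent with the ±2 point error rate discussed for individual patients.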
Therefore, although clinical significance is unlikely, as shown in Figure 9, there are, as noted previously, many possible confounders affecting alveolar bone grafting success. Further studies are needed to investigate, among other things, the possible effects of canine eruption status at surgery, orthodontic tooth movement around the grafted cleft site both before and after bone grafting, and pre- or post-grafting expansion and the method of expansion.

Figure 9. Difference Between T1 and T2 Ratings Versus Mean of the Ratings (Bland and Altman, 1986)

From the results of this investigation one could conclude that the long-term outcome of an alveolar graft may be assessed as soon as one year after surgery, although treatment variations, especially management of a missing lateral incisor space, can also have some effect on individual outcomes. It is suggested that canine substitution of the missing lateral incisor may result in superior bone bridging in the long term and that, when possible, orthodontists should consider this a viable treatment alternative. Future investigations of this nature are needed to compare the expected outcomes of canine substitution with those of a dental implant in the graft site, which is an option when a lateral incisor needs to be replaced.

Conclusions

1. Outcome comparisons can be made effectively as early as one year after grafting.
2. Alveolar bone grafts do not significantly change post-operatively between short-term (mean 1.25 years) and long-term (mean 6.54 years) follow-up times.
3. Short- and long-term graft outcomes are highly correlated (r=0.68).
4. Grafts with less successful short-term ratings may improve over time. Grafts with successful short-term ratings do not change significantly over time.
5. Space closure with canine substitution of missing lateral incisors may positively affect the long-term graft outcome, but this effect was not statistically significant.
6. The Americleft Project’s SWAG scale appears to be a valid and reliable method of assessing groups of alveolar bone graft outcomes and can be used for inter-center outcome comparison studies.

Literature Cited

1. Bergland O, Semb G, Abyholm FE. Elimination of the residual alveolar cleft by secondary bone grafting and subsequent orthodontic treatment. Cleft Palate J. 1986;23:175-205.
2. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307-310.
3. Boley S, Grossman D, Long RE Jr. A new method for assessing outcomes of bone grafting in cleft patients and intra-center audit of alveolar bone grafting outcomes from different surgeons. Philadelphia, PA: Albert Einstein Medical Center; 2010. Dissertation.
4. Boyne PJ, Sands NR. Secondary bone grafting of residual alveolar and palatal clefts. J Oral Surg. 1972;30:87-92.
5. Dempf R, Teltzrow T, Kramer FJ, Hausamen JE. Alveolar bone grafting in patients with complete clefts: a comparative study between secondary and tertiary bone grafting. Cleft Palate Craniofac J. 2002;39:18-25.
6. Enemark H, Sindet-Pedersen S, Bundgaard M. Long-term results after secondary bone grafting of alveolar clefts. J Oral Maxillofac Surg. 1987;45:913-919.
7. Feichtinger M, Mossböck R, Kärcher H. Assessment of bone resorption after secondary alveolar bone grafting using three-dimensional computed tomography: a three year study. Cleft Palate Craniofac J. 2007;44:142-148.
8. Honma K, Kobayashi T, Nakajima T, Hayasi T. Computed tomographic evaluation of bone formation after secondary bone grafting of alveolar clefts. J Oral Maxillofac Surg. 1999;57:1209-1213.
9. Kindelan JD, Nashed RR, Bromige MR. Radiographic assessment of secondary autogenous alveolar bone grafting in cleft lip and palate patients. Cleft Palate Craniofac J. 1997;34:195-198.
10. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.
11. Long RE Jr, Spangler BE, Yow M. Cleft width and secondary alveolar bone graft success. Cleft Palate Craniofac J. 1995;32:420-427.
12. Long RE Jr, Hathaway R, Daskalogiannakis J, Mercado A, Russell K, Cohen M, Semb G, Shaw W. The Americleft study: an inter-center study of treatment outcomes for patients with unilateral cleft lip and palate. Part 1. Principles and study design. Cleft Palate Craniofac J. 2011;48:239-243.
13. Nightingale C, Witherow H, Reid FD, Edler R. Comparative reproducibility of three methods of radiographic assessment of alveolar bone grafting. Eur J Orthod. 2003;25:35-41.
14. Posnick JC, Ruiz RL. Staging of cleft lip and palate reconstruction: infancy through adolescence. In: Wyszynski DF, ed. Cleft Lip and Palate: From Origin to Treatment. New York: Oxford University Press; 2002:319-353.
15. Rosenstein SW, Long RE Jr, Dado DV, Vinson B, Alder ME. Comparison of 2-D calculations from periapical and occlusal radiographs versus 3-D calculations from CAT scans in determining bone support for cleft-adjacent teeth following early alveolar bone grafts. Cleft Palate Craniofac J. 1997;34:199-205.
16. Russell K, Long RE Jr, Hathaway R, Daskalogiannakis J, Mercado A, Cohen M, Semb G, Shaw W. The Americleft study: an inter-center study of treatment outcomes for patients with unilateral cleft lip and palate. Part 5. General discussion and conclusions. Cleft Palate Craniofac J. 2011;48:265-270.
17. Schultze-Mosgau S, Nkenke E, Schlegel AK, Hirschfelder U, Wiltfang J. Analysis of bone resorption after secondary alveolar cleft bone grafts before and after canine eruption in connection with orthodontic gap closure or prosthodontic treatment. J Oral Maxillofac Surg. 2003;61:1245-1248.
18. Sindet-Pedersen S, Enemark H. Comparative study of secondary and late secondary bone-grafting in patients with residual cleft defects. Short term evaluation. Int J Oral Surg. 1985;14:389-398.
19. Tan AES, Brogan WF, McComb HK, Henry PJ. Secondary alveolar bone grafting: five-year periodontal and radiographic evaluation in 100 consecutive cases. Cleft Palate Craniofac J. 1996;33:513-518.
20. Turvey TA, Vig K, Moriarty J, Hoke J. Delayed bone grafting in the cleft maxilla and palate: a retrospective multidisciplinary analysis. Am J Orthod. 1984;86:244-256.
21. Witherow H, Cox S, Jones E, et al. A new scale to assess radiographic success of secondary alveolar bone grafts. Cleft Palate Craniofac J. 2002;39:255-260.

Vita Auctoris

Julianne Kathryn Ruppel was born on January 5th, 1982 and grew up with her parents, James and Barbara, and brothers, David and John, in Galesburg, Illinois. In 2004 she received her Bachelor of Science with a major in Biology from Illinois Wesleyan University in Bloomington, Illinois. Julianne then entered dental school at the University of Illinois at Chicago, College of Dentistry, and received her Doctor of Dental Surgery in 2008. She completed a hospital-based General Practice Residency at Advocate Illinois Masonic Medical Center in Chicago in 2009. She received her Master of Science in Dentistry (Research) degree on January 7, 2012 from Saint Louis University after completing a residency in orthodontics and dentofacial orthopedics. Julianne is planning to move to North Carolina in the summer of 2012 with her fiancé, Michael Durkin, M.D., where he will begin a specialty fellowship in Infectious Diseases at the Duke University Medical Center. Julianne will be associating in a private orthodontic practice in the Raleigh-Durham area in North Carolina, with plans to buy her own practice in the future.