EDUCATIONAL ADVANCE

Learning Curves in Emergency Ultrasound Education

David J. Blehar, MD, Bruce Barton, PhD, and Romolo J. Gaspari, MD, PhD

Abstract

Objectives: Proficiency in the use of bedside ultrasound (US) has become standard in emergency medicine residency training. While milestones have been established for this training, supporting data for minimum standard experience are lacking. The objective of this study was to characterize US learning curves to identify performance plateaus for both image acquisition and interpretation, as well as to compare performance characteristics of learners to those of expert sonographers.

Methods: A retrospective review of an US database was conducted at a single academic institution. Each examination was scored for agreement between the learner and expert reviewer interpretation and given a score for image quality. A locally weighted scatterplot smoothing method was used to generate a model of predicted performance for each individual examination type. Performance characteristics for expert sonographers at the site were also tracked and used, in addition to performance plateaus, as benchmarks for learning curve analysis.

Results: There were 52,408 US examinations performed between May 2007 and January 2013 and included for analysis. Performance plateaus occurred at different points for different US protocols, from 18 examinations for soft tissue image quality to 90 examinations for right upper quadrant image interpretation. For the majority of examination types, a range of 50 to 75 examinations resulted in both excellent interpretation (sensitivity > 84% and specificity > 90%) and good image quality (90% of the image quality benchmark of expert sonographers).

Conclusions: Educational performance benchmarks occur at variable points for image interpretation and image quality for different examination types. These data should be considered when developing training standards for US education as well as experience requirements for US credentialing.

ACADEMIC EMERGENCY MEDICINE 2015;22:574–582; doi: 10.1111/acem.12653. © 2015 by the Society for Academic Emergency Medicine

The use of clinician-performed ultrasound (US) examination has increased dramatically over the past several decades. Initially adopted by relatively few physicians, it has become part of the standard practice of emergency medicine (EM) in academic and community settings alike and is now considered a requisite skill for graduating EM residents. The American College of Emergency Physicians (ACEP) instated US training guidelines based on expert consensus in 2001 and again in 2008.1 In 2008 the Council of Emergency Medicine Residency Directors (CORD) introduced minimum training guidelines for clinician-performed US. In 2012 the Accreditation Council for Graduate Medical Education (ACGME) designated US as one of the milestone competencies for graduating EM residents.2

The initial ACEP guidelines focused on the number of US examinations that were required to be performed by a physician prior to being considered competent. In the most recent ACEP guidelines, it is recommended that 25 to 50 examinations be performed for each of the core applications. While the ACEP guidelines note that other metrics may be used to determine competency, the absolute number of US examinations performed remains the most common (and easiest to obtain) metric. Likewise, while the ACGME milestone competencies focus on a variety of metrics, they do include a cutoff of 150 total examinations as minimum experience to complete residency training.
Choosing a specific number as a benchmark for competency remains a theme for national guidelines in EM and other specialties.

From the Department of Emergency Medicine (DJB, RJG) and the Department of Quantitative Health Sciences (BB), University of Massachusetts Medical School, Worcester, MA. Received July 29, 2014; revision received November 7, 2014; accepted November 10, 2014. Presented at the Society for Academic Emergency Medicine, Dallas, TX, May 2014. The authors have no relevant financial information or potential conflicts to disclose. Supervising Editor: John H. Burton, MD. Address for correspondence and reprints: David J. Blehar, MD; e-mail: [email protected]. A related article appears on page 597.

The recommended 25 to 50 examination cutoff was chosen based on expert consensus, as there has been very little published literature regarding learning curves for US. Previous studies on learning curves for clinician-performed US have focused primarily on interpretation metrics without consideration of image acquisition skill.3 A consensus conference on how to evaluate US competency identified a number of elements, including image optimization and image interpretation, but little data exist on how these metrics change as individuals gain experience.4

The objective of this study was to characterize learning curves for novice physician sonographers to identify experience levels where educational performance plateaus occur. We additionally sought to identify experience levels requisite for training sonographers to approach the performance standards of expert physician sonographers for image acquisition and interpretation. This study uses a large educational database from a single site to calculate learning curves for each of the core EM US applications. We compare learner metrics to expert metrics to introduce a discussion on learning curves and competency for each individual US application.

METHODS

Study Design

This study was a retrospective review of an educational database from a single EM residency training program over a 5-year period, from May 2007 to January 2013. The study was reviewed by the local institutional review board and was determined to be exempt from informed consent requirements.

Study Setting and Population

Ultrasound examinations in the database are a combination of those performed in the course of clinical care as well as those performed by learning sonographers solely for educational purposes across four separate emergency departments (EDs) staffed by a single academic EM physician group. The four sites ranged in character from a small community ED with an annual volume of 20,000 patients to a large urban Level I trauma center with an annual volume over 100,000 patients.

The US machines used during the study period varied by location and over time, with acquisition of newer equipment as older machines were removed from service. Equipment used included a range of Sonosite machines (Titan and MicroMaxx, Sonosite, Bothell, WA) and Zonare machines (Zone Ultra, ZS3, Zonare Medical Systems, Inc., Mountain View, CA). Still and video images were captured in DICOM video file format stored to DVD storage or USB drive (Zonare) or recorded as continuous video clips to a DVD recorder (Sonosite).
Changes in equipment did not have a significant effect on image quality or accuracy metrics. There was no significant difference in image quality (p = 0.62, Student's t-test) between the first and last year of our study data. Similarly, sensitivity (0.86 vs. 0.88) and specificity (0.96 vs. 0.97) did not differ between the first and last year of the study.

Learning sonographers in the database included not only training residents, but also attending physician staff without prior US training or experience. All sonographers, regardless of whether they were residents or faculty, underwent similar education. Prior to inclusion in the database, they underwent a 1-day educational session focusing on basic principles of the core US applications. Residents participated in one dedicated month with a focus on US in each of their 3 years of training, during which time they attended weekly educational sessions. These periodic educational sessions focused on core US applications, including limited US of the aorta, chest wall, endovaginal uterus, focused assessment with sonography in trauma (FAST), right upper quadrant (RUQ), lower-extremity duplex, renal, soft tissue, and limited cardiac echo. Each educational session consisted of a 1-hour lecture, 60 to 90 minutes of hands-on education, and 60 to 90 minutes of image review. Faculty learners attended identical educational sessions over a similar 3-year time period. All learners, including both residents and faculty, attended between eight and 12 of these complete sessions. Additional hands-on sessions were available for learners. An additional US elective was available to third-year residents as either a 2- or a 4-week rotation. All learners were able to perform and record US during clinical shifts and received timely automated feedback on their images using the system described in the next section.

Study Protocol

Digital video of every US examination performed in the ED was recorded at the time of performance and uploaded into an electronic database for review. All pertinent patient information, the indication for imaging, and the initial image interpretation of the diagnostic study by the sonographer performing the examination were recorded. Examinations performed solely for the purposes of the educational experience of the learning sonographer were logged as such by the sonographer and designated as educational US in the database. All other examination indications were designated as clinical US.

All images were reviewed in a standardized fashion by one of five physicians expert in bedside US. The US experience of the expert reviewers ranged from 5 years and over 4,000 US examinations to 15 years and over 20,000 US performed and/or reviewed. Image reviews were performed unblinded to the initial interpretation, the identity of the patient, and the performing sonographer.

Required images and image elements differed for the different US protocols performed in this study. These requirements did not change regardless of the experience level or type of learner (resident or faculty). Some protocols required a specific number of images, such as four images for the FAST exam and two images for the soft tissue examination, while other protocols could require a variable number of images. For example, chest wall US required at least one image of each hemithorax, but multiple views of each hemithorax could be submitted if needed.
Lower-extremity duplex could include between six and 12 (or more) paired images of compressed and uncompressed veins in the lower leg.

The image review focused on image interpretation, image acquisition skill, and resultant image quality. Image interpretation consisted of a predefined structured interpretation focusing on a primary finding specific to each individual US application. Interpretation was dichotomous, either positive or negative for the primary finding for that application. A listing of the core US applications and their primary findings is included in Table 1. Each examination was scored for agreement between the initial interpretation (learner) and final review (expert). During the initial study period (the first 3 years of the database), image review was performed twice weekly by one of three expert physician reviewers. During the final 2 years of the study period, image review was performed on a daily basis, Monday through Friday, by one of five reviewers.

The image review performed by the expert reviewer also focused on metrics related to image acquisition. These metrics included image quality ratings for the images as a whole on a predetermined ordinal scale from 1 to 8, with 8 representing perfect image quality and 1 representing poor image quality. Any score of 4 or less was predefined to indicate that the images were of sufficiently poor quality to adversely affect the interpretation. All data were recorded in the electronic database and sent back to the initial sonographer via e-mail for feedback at the time of the review.

Data were downloaded from the electronic database for all users in the system. US examinations performed by any individual with experience that preceded inclusion in the electronic tracking system were excluded from data analysis. In addition, all US of noncore applications were excluded from analysis (e.g., musculoskeletal, US-guided nerve blocks, ocular US). For analysis of interpretative skill, those US that did not include an interpretation by the initial sonographer were excluded.

Table 1: Definition of Ultrasound Terms and Their Findings
Aorta (abdominal aortic aneurysm): diameter of aorta measuring ≥3 cm.
Cardiac (pericardial effusion): anechoic fluid visualized in pericardial space.
Chest wall (pneumothorax): lack of visualized sliding of interface between chest wall and lung tissue.
Endovaginal uterine (intrauterine pregnancy): visualized fetus and/or yolk sac in endometrial cavity.
FAST (free fluid): anechoic fluid visualized in peritoneal or pericardial space.
Lower-extremity duplex (deep vein thrombosis): inability to compress deep veins of lower extremity.
Renal (hydronephrosis): visualized branched anechoic center to kidney.
Right upper quadrant (gallstones): mobile hyperechoic structures in lumen of gallbladder.
Soft tissue (abscess): disruption of superficial soft tissue with focal collection of anechoic fluid.
FAST = focused assessment with sonography in trauma.

Image quality metrics were converted to a dichotomous quality metric: poor quality limiting interpretation (score of 4 or below) or good quality (score of 5 or greater).

Data Analysis

The data were not normally distributed (Kolmogorov-Smirnov goodness-of-fit test for uniform distributions) and are presented as medians with interquartile ranges (IQR).
Learner interpretation was analyzed using sensitivity and specificity analysis with the findings from expert review as the criterion standard. All examinations were included for analysis, regardless of image quality. Learner image quality was assessed by comparing their image quality to the image quality of the expert reviewers' independent imaging performed during the study.

To describe expert reviewer benchmarks related to image quality, the US images obtained and interpreted by the expert reviewers were analyzed separately. Each image was interpreted and reviewed in a fashion identical to those included in the study database. Data on the average image quality for each US application were calculated. Internal validity of the rating scale was analyzed by comparing the agreement of the five expert reviewers. All reviewers independently and blindly reviewed and rated 100 randomly selected US images from the database for image quality. Agreement between reviewers was analyzed using a kappa analysis of those ratings.

A locally weighted scatterplot smoothing method (loess) was used to generate a model of predicted performance (with 95% CI) for each individual examination type. Performance curves were analyzed to determine plateau points where experiential benefit diminished. In addition, performance curves were analyzed with reference to the performance of the expert reviewers. SAS software (version 9.3) was used for data analyses.

Not all patients will have perfect imaging, so data for image quality were normalized to the image quality for images obtained by the experts. In the displayed data, 100% references the average image quality of the experts for that US protocol and 0 represents the lowest possible score for the scale.

While a plateau point within a curve can be estimated visually, for the purposes of this study the plateau was mathematically defined as the point where there was a change in slope for the interval immediately preceding a specific point of US experience (x-axis) compared to the slope immediately following that point. Slopes were calculated as the change in percent performance (y-axis) over the number of US performed (x-axis). For the purposes of this study, only slope changes greater than 25% were considered eligible as a plateau point. For curves with multiple changes in slope, the point of the greatest slope change was used as the plateau point.

RESULTS

A total of 191 EPs performed 89,052 US examinations. After excluding experienced physicians (those who began US imaging prior to starting this study) and noncore applications, 101 EPs and 52,408 examinations were included in the data set for analysis. US examinations in the database included those performed for educational purposes (33%) as well as clinical purposes (66%). There were nine imaging applications included in the final data set, representing the core US applications as defined by ACEP. The number of US examinations in each application ranged from 12,963 (FAST) to 1,253 (endovaginal uterus). Overall, 12.5% of imaging was positive for pathology, but the percentage of positive findings varied by US application, with a range of 4.5% (chest wall) to 56% (soft tissue; see Table 2). The median image rating (rated from 1 to 8) for images obtained by expert sonographers provided a benchmark for the image ratings of learners.
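The curve fitting and plateau identification described under Data Analysis can be illustrated with a minimal sketch. It is not the SAS code used for the study analyses; the Python library, data frame, column names, smoothing fraction, slope window, and simulated data below are assumptions introduced purely for demonstration.

```python
# Minimal sketch (not the authors' SAS analysis): fit a loess learning curve and
# locate a plateau using the >25% slope-change rule described under Data Analysis.
# Assumes a hypothetical pandas DataFrame `df` with one row per examination and
# columns `exam_number` (the learner's cumulative experience for one protocol)
# and `agreement` (1 if the learner's interpretation matched expert review, else 0).
import numpy as np
import pandas as pd
from statsmodels.nonparametric.smoothers_lowess import lowess


def fit_learning_curve(df: pd.DataFrame, frac: float = 0.3) -> np.ndarray:
    """Return an (n, 2) array of (experience, predicted agreement) from a loess fit."""
    return lowess(df["agreement"], df["exam_number"], frac=frac, return_sorted=True)


def find_plateau(curve: np.ndarray, window: int = 10, min_change: float = 0.25):
    """Return the experience level with the largest slope decrease exceeding `min_change`.

    Slopes are computed over `window` examinations immediately before and after each
    candidate point (experience values are assumed distinct); returns None when no
    sufficient slope change exists, mirroring curves without a definable plateau.
    """
    x, y = curve[:, 0], curve[:, 1]
    best_point, best_drop = None, 0.0
    for i in range(window, len(x) - window):
        before = (y[i] - y[i - window]) / (x[i] - x[i - window])
        after = (y[i + window] - y[i]) / (x[i + window] - x[i])
        if before <= 0:
            continue
        drop = (before - after) / before  # fractional decrease in slope
        if drop > min_change and drop > best_drop:
            best_point, best_drop = x[i], drop
    return best_point


# Example usage with simulated data (illustration only):
rng = np.random.default_rng(0)
exam_number = np.arange(1, 201)
true_agreement = 0.80 + 0.15 * (1 - np.exp(-exam_number / 40))
df = pd.DataFrame({
    "exam_number": exam_number,
    "agreement": rng.binomial(1, true_agreement),
})
curve = fit_learning_curve(df)
print("Estimated plateau at ~", find_plateau(curve), "examinations")
```

The sketch reports the point of greatest slope decrease exceeding 25%, matching the rule that the greatest slope change is used when a curve has more than one candidate point, and returns no plateau when no sufficient slope change is present.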
The median image quality score for each of the US applications for those images obtained by expert physicians was as follows: aorta, 7 (IQR = 5 to 8); cardiac, 6 (IQR = 5 to 8); chest wall, 8 (IQR = 7 to 8); endovaginal uterus, 8 (IQR = 7 to 8); FAST, 7 (IQR = 6 to 8); lower-extremity duplex, 7 (IQR = 6 to 8); renal, 7 (IQR = 6 to 8); RUQ, 7 (IQR = 6 to 8); and soft tissue, 8 (IQR = 7 to 8).

The overall image quality and agreement with final interpretation varied by US application for US acquired by learners. Agreement between reviewers on the image rating scale was good (κ = 0.81). The median image rating for all US in the database was 6 on the 8-point ordinal scale. The US applications with the worst image quality were the cardiac and aorta examinations, with median scores of 6 (IQR = 5 to 8 for both), while the chest wall and renal examinations had the best image ratings, with medians of 7 (IQR = 7 to 8) for chest wall and 7 (IQR = 6 to 8) for renal. The overall agreement between the initial sonographer interpretation and the expert review was 95.9%, but there was variability by US application.

The learning curves for image interpretation differed by US application with regard to initial agreement level, slope of the learning curve, and overall shape of the learning curve. Most of the learning curves demonstrate a slow, steady improvement in agreement as experience increases until a point of plateau, where further experience is associated with little or no improvement. All image interpretation learning curves are displayed in Figure 1. Interpretation performance plateaus differed for the different imaging protocols (Table 3). Some US protocols had plateau points that occurred at relatively lower experience levels. Interpretation plateaus for soft tissue (for abscess) and cardiac (for pericardial effusion) occurred at 27 and 30 US, respectively. Other examinations such as FAST, chest wall, and aorta required more experience (57, 60, and 66 US, respectively). Renal (78 US) and RUQ (90 US) required the most experience to reach interpretation plateaus. Plateau points for endovaginal uterus and lower-extremity duplex were not definable.

Another way to analyze the changes in interpretation performance for learners over time is to compare the sensitivity and specificity of learners as they accumulate experience. For most protocols, the sensitivity and specificity improved over time (Figures 2 and 3). Soft tissue and endovaginal uterus were the easiest to learn to interpret, with sensitivities in the mid-90% range. The FAST exam was the hardest protocol to learn to interpret, as it demonstrated the lowest sensitivity, with a peak around 80%. All protocols had excellent specificity, with soft tissue and renal the only protocols below a specificity of 96%. Performing 50 US is a common goal for many learners based on current guidelines, and this would produce a sensitivity and specificity greater than 84% and 90%, respectively, for all US protocols with the exception of the FAST exam, where performing 50 US produces a sensitivity of 80% and a specificity of 96%.

The learning curves for image acquisition also differed by application (see Figure 4). Performance plateaus were identified for five of nine examination types and occurred earliest for soft tissue (18 US) and latest for aorta (84 US; see Table 3). Of the four learning curves that did not display plateau points, three demonstrated no sufficient change in slope and one had insufficient data.
Image quality was easiest to obtain for chest wall US, where performing three US resulted in 95% of the image quality obtained by experts. Image quality was hardest to learn for cardiac, endovaginal uterus, and lower-extremity duplex, where learners never surpassed 90% of the image quality of the experts. Both FAST and RUQ demonstrated long learning curves, where it took 183 and 96 US, respectively, to reach 90% of the image quality of an expert. As mentioned previously, performing 50 US is a common goal for many learners based on current guidelines, and this would produce different skill levels for the different US protocols. Performing 50 US resulted in an image quality relative to expert as follows: aorta (73%), cardiac (68%), chest wall (93%), endovaginal uterus (>76%), FAST (84%), lower-extremity duplex (51%), renal (87%), RUQ (79%), and soft tissue (94%).

Table 2: Characteristics of Ultrasounds Included in the Database
Aorta: 6,183 US examinations; 99 sonographers; 354 (5.3%) positive findings; mean image rating 5.8 (95% CI = 5.75–5.85); interpretation agreement 98.7%.
Cardiac: 5,689 US examinations; 100 sonographers; 505 (8.9%) positive findings; mean image rating 5.4 (95% CI = 5.35–5.45); interpretation agreement 95.6%.
Chest wall: 6,713 US examinations; 97 sonographers; 304 (4.5%) positive findings; mean image rating 7.0 (95% CI = 6.97–7.03); interpretation agreement 98.6%.
Endovaginal uterine: 1,253 US examinations; 89 sonographers; 308 (24.6%) positive findings; mean image rating 5.8 (95% CI = 5.7–5.9); interpretation agreement 90.7%.
FAST: 12,963 US examinations; 99 sonographers; 1,290 (10.0%) positive findings; mean image rating 6.2 (95% CI = 6.17–6.23); interpretation agreement 95.7%.
Lower-extremity duplex: 2,871 US examinations; 98 sonographers; 174 (6.1%) positive findings; mean image rating 5.8 (95% CI = 5.73–5.87); interpretation agreement 96.9%.
Renal: 6,173 US examinations; 99 sonographers; 802 (13.0%) positive findings; mean image rating 6.3 (95% CI = 6.26–6.34); interpretation agreement 93.2%.
Right upper quadrant: 8,118 US examinations; 100 sonographers; 1,432 (17.6%) positive findings; mean image rating 6.0 (95% CI = 5.96–6.04); interpretation agreement 95.5%.
Soft tissue: 2,445 US examinations; 97 sonographers; 1,377 (56.3%) positive findings; mean image rating 6.8 (95% CI = 6.74–6.86); interpretation agreement 92.2%.
Total: 52,408 US examinations; 101 sonographers; 6,546 (12.5%) positive findings; mean image rating 6.1 (95% CI = 6.09–6.11); interpretation agreement 95.9%.
FAST = focused assessment with sonography for trauma.

Figure 1. Image interpretation learning curves. Predicted percentage agreement with expert review is plotted as a function of examination experience with surrounding 95% CI.

Table 3: Plateau Points for Interpretation and Image Acquisition Metrics
Aorta: interpretation 66; image acquisition 84.
Cardiac: interpretation 30; image acquisition 27.
Chest wall: interpretation 60; image acquisition 39.
Endovaginal uterine: interpretation none; image acquisition none.
FAST: interpretation 57; image acquisition none.
Lower-extremity duplex: interpretation none; image acquisition none.
Renal: interpretation 78; image acquisition 75.
Right upper quadrant: interpretation 90; image acquisition none.
Soft tissue: interpretation 27; image acquisition 18.
Data represent the number of ultrasounds performed prior to reaching a plateau in the learning curve of the indicated metric. A plateau is defined as a decrease in slope of the learning curve >25%. The curve for endovaginal uterus did not have enough data points to calculate a plateau point. All other curves without a plateau point did not demonstrate sufficient changes in slope to calculate a plateau point.

Figure 2. Sensitivity of ultrasound by examination type compared to interpretation by expert reviewer.

Figure 3. Specificity of ultrasound by examination type compared to interpretation by expert reviewer.

DISCUSSION

It is logical to expect that individuals learning US will demonstrate an increase in skill level over time. The learning curves for interpretation in this article demonstrate a gradual improvement over time, but the degree of improvement was relatively shallow. This may be related to the fact that agreement was relatively high for most US applications, even for the first few US in the learning experience. For some of the examination types, this initial high performance level may in part explain the absence of a discernible performance plateau (e.g., FAST exam image quality).

A few of the learning curves (FAST, cardiac) demonstrate a small decrease over the later stages of experience. It is unclear what is responsible for this decrease in interpretation performance. One possibility is that more experienced learners interpreted trace fluid as negative, mistaking the clinical significance for the actual finding. Another possibility is that patient selection for learning sonographers with more relative experience includes difficult patients who are avoided by more novice sonographers. It is also possible that not all individuals will demonstrate increases in skill over time. A study examining cardiologists found no association between experience and proficiency.5 One final possibility relates to the fact that FAST and cardiac examinations demonstrated proportionally more educational US during the earlier learning phases. These would be more likely to be obtained during nonclinical times, when the sonographer has more time to obtain and interpret images, thus artificially enhancing performance.

Even including the initial US experience, our overall interpretation agreement rate of 95.9% compares favorably to published rates for other specialties. Discrepancy rates of US interpretation between radiology residents and faculty range from 0.2% to 4.0%.6–9 A renal US study in 2012 found good agreement between two experienced radiologists (κ = 0.82), similar to the agreement in our study.10 Studies comparing cardiology fellows using portable US to cardiology faculty found agreement comparable to that in our study (κ = 0.66 to 0.89).11,12 Echocardiography studies involving internists or EPs demonstrate similar agreement (κ = 0.79).13,14 A study of novice surgeons learning RUQ US demonstrated less agreement (κ = 0.40), significantly lower than the agreement in this article (κ = 0.87).15 The most likely explanation for this difference is that their training was significantly less than the training of the learners in this article.

Similar to image interpretation, it is logical to expect that imaging technique will improve as experience increases. However, not all of the learning curves for image quality improved over time. Three of the US applications (aorta, lower-extremity duplex, and endovaginal uterus) demonstrated decreases in image quality over the initial experience before increasing back to baseline. We speculate that this decrease relates to patient selection, as new learners choose patients who are easier to image (thinner, fasting patients) and more experienced sonographers recorded examinations from more challenging patients based on clinical necessity. It is also possible that the decrease in image quality is independent of the sonographer, and some of the patients at different times had higher percentages of characteristics that degrade US imaging (for example, obesity, recent food intake, increased intestinal gas, pain tolerance, and ability to cooperate with emergent imaging).

There are few published data related to image acquisition for US. One study focusing on acquisition of images for the FAST exam found that US technique improved even after 75 US examinations.16 The learning curve in this article demonstrates similar findings, in that the image quality for the FAST exam increased even after 200 US.
A study of internists found that even after 35 cardiac US, their image quality was not equal to that of experienced sonographers.17 Our learning curve demonstrates that learners do not reach the quality equivalent of expert sonographers even after 120 US. A study exploring learning curves for US-guided interventions found that students needed between 37 and 109 US-guided procedures to gain competency.18

Figure 4. Image quality learning curves. Image quality is expressed as percentage of examinations at a given experience level that are predicted to be of high quality (rating scale of 5 or greater) with surrounding 95% CI.

Many national groups recommend or require a certain number of US for training, in effect using the number of US as a surrogate for length of experience and, by extension, competency. In most of these cases, the specific numbers are chosen by expert consensus and are not based on any data. In 2008 ACEP recommended that physicians perform 25 to 50 US for each application except US-guided procedures, which required 10 US each.1 The American Registry for Diagnostic Medical Sonography requires that physicians perform 800 US prior to taking a test to certify competency (www.ardms.org). The American College of Cardiology and the American Heart Association published recommendations in 2003 that learners perform 150 cardiac echoes prior to independently performing echocardiography.19 The American College of Radiology requires that physicians document at least 500 US to certify competency (www.acr.org).

Translating the performance curves into a metric for competency is more complicated, as the definition of competency with regard to US imaging is unclear. Performance plateaus found in our study offer a guide for US education to understand at what point additional experience offers minimal improvement in image acquisition or interpretation. We have additionally assessed the performance of the expert reviewers as a yardstick to serve as a comparison for the individuals who are learning US. Although the ideal rating for image quality on our scale would be 8.0, there are many contributors to decreased image quality, including patient factors (oral intake, body habitus, painful condition, urgency of imaging) and sonographer factors (skill level, attention to detail), to name a few. Similarly, the ideal agreement would be 100%, but even expert reviewers sometimes disagree with each other. Some statisticians define a kappa of 0.80 or greater as "nearly perfect."20

Some of the newer guidelines on proficiency in US have moved away from requiring a specific number of examinations. The guidelines for proficiency in critical care US describe educational goals without reference to a number of US performed.21 CORD has instituted "milestones," where US competency is assessed through simulation, direct observation, and examinations.2 This trend makes logical sense, as not every individual learns at the same speed, and periodic assessments will allow greater certainty that an individual is competent. From a group perspective, the data presented in this article provide some understanding of how a given experience level translates into a predicted level of performance.

LIMITATIONS

This study was conducted at a single center with a common educational process for all learning sonographers, supported by rapid interpretative and technical feedback.
This potentially limits applicability in other centers with different educational programs. Specific educational styles can affect both individual skill acquisition rates and practice patterns of US utilization. However, it is unclear how unevenly distributed imaging, or periods of inactivity during a period of learning US, affect learning. It is possible that skill decay occurs following periods of inactivity, but our study does not address this issue. In this study our learners sometimes benefited from more experienced individuals (residents and staff) present during image acquisition and interpretation. Our database does not specifically track such interactions, and therefore the effect of this interaction cannot be quantified in the analysis of our data. Although it is possible that interactions with more experienced staff during image acquisition could influence learning curves, this situation exists in many EDs, and it should not limit the generalizability of our findings. The image rating scale used in this study has not been previously validated and is potentially unique to our center. Simplifying the scale to a dichotomous variable of good versus poor image quality not only assists in statistical analysis, but also converts the scale to one we believe is easily applied to other educational programs. Our primary outcome measure relates to the performance of expert sonographers at our center. While all five experts in this study have extensive training and experience (as detailed under Methods), there is no standard to define expertise in emergency US, and this creates another limitation to the generalizability of the results of this study to other centers.

CONCLUSIONS

Educational performance curves vary by ultrasound application, not only for image acquisition skill but also for image interpretation. While not providing an absolute cutoff for requisite examination experience, our results would suggest that for the majority of ultrasound examination types, a minimum of 50 examinations, as suggested by the current American College of Emergency Physicians guidelines, will result in a reasonable performance level compared to expert sonographers.

References

1. Emergency ultrasound guidelines. Ann Emerg Med 2009;53:550–70.
2. Lewiss RE, Pearl M, Nomura JT, et al. CORD-AEUS: consensus document for the emergency ultrasound milestone project. Acad Emerg Med 2013;20:740–5.
3. Gaspari RJ, Dickman E, Blehar D. Learning curve of bedside ultrasound of the gallbladder. J Emerg Med 2009;37:51–6.
4. Tolsgaard MG, Todsen T, Sorensen JL, et al. International multispecialty consensus on how to evaluate ultrasound competence: a Delphi consensus survey. PLoS One 2013;8:e57687.
5. Nair P, Siu SC, Sloggett CE, Biclar L, Sidhu RS, Yu EH. The assessment of technical and interpretative proficiency in echocardiography. J Am Soc Echocardiogr 2006;19:924–31.
6. Ruma J, Klein KA, Chong S, et al. Cross-sectional examination interpretation discrepancies between on-call diagnostic radiology residents and subspecialty faculty radiologists: analysis by imaging modality and subspecialty. J Am Coll Radiol 2011;8:409–14.
7. Ruutiainen AT, Durand DJ, Scanlon MH, Itri JN. Increased error rates in preliminary reports issued by radiology residents working more than 10 consecutive hours overnight. Acad Radiol 2013;20:305–11.
8. Ruchman RB, Jaeger J, Wiggins EF 3rd, et al. Preliminary radiology resident interpretations versus final attending radiologist interpretations and the impact on patient care in a community hospital. Am J Roentgenol 2007;189:523–6.
9. Ruutiainen AT, Scanlon MH, Itri JN. Identifying benchmarks for discrepancy rates in preliminary interpretations provided by radiology trainees at an academic institution. J Am Coll Radiol 2011;8:644–8.
10. Rud O, Moersler J, Peter J, et al. Prospective evaluation of interobserver variability of the hydronephrosis index and the renal resistive index as sonographic examination methods for the evaluation of acute hydronephrosis. BJU Int 2012;110:E350–6.
11. Giusca S, Jurcut R, Ticulescu R, et al. Accuracy of handheld echocardiography for bedside diagnostic evaluation in a tertiary cardiology center: comparison with standard echocardiography. Echocardiography 2011;28:136–41.
12. Borges AC, Knebel F, Walde T, Sanad W, Baumann G. Diagnostic accuracy of new handheld echocardiography with Doppler and harmonic imaging properties. J Am Soc Echocardiogr 2004;17:234–8.
13. Bustam A, Noor Azhar M, Singh Veriah R, Arumugam K, Loch A. Performance of emergency physicians in point-of-care echocardiography following limited training. Emerg Med J 2014;31:369–73.
14. Vignon P, Mucke F, Bellec F, et al. Basic critical care echocardiography: validation of a curriculum dedicated to noncardiologist residents. Crit Care Med 2011;39:636–42.
15. Eiberg JP, Grantcharov TP, Eriksen JR, et al. Ultrasound of the acute abdomen performed by surgeons in training. Minerva Chir 2008;63:17–22.
16. Jang T, Kryder G, Sineff S, Naunheim R, Aubin C, Kaji AH. The technical errors of physicians learning to perform focused assessment with sonography in trauma. Acad Emerg Med 2012;19:98–101.
17. Martin LD, Howell EE, Ziegelstein RC, Martire C, Shapiro EP, Hellmann DB. Hospitalist performance of cardiac hand-carried ultrasound after focused training. Am J Med 2007;120:1000–4.
18. de Oliveira Filho GR, Helayel PE, da Conceicao DB, Garzel IS, Pavei P, Ceccon MS. Learning curves and mathematical models for interventional ultrasound basic skills. Anesth Analg 2008;106:568–73.
19. Quinones MA, Douglas PS, Foster E, et al. ACC/AHA clinical competence statement on echocardiography: a report of the American College of Cardiology/American Heart Association/American College of Physicians-American Society of Internal Medicine Task Force on clinical competence. J Am Soc Echocardiogr 2003;16:379–402.
20. Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa statistic. Fam Med 2005;37:360–3.
21. Mayo PH, Beaulieu Y, Doelken P, et al. American College of Chest Physicians/La Societe de Reanimation de Langue Francaise statement on competence in critical care ultrasonography. Chest 2009;135:1050–60.