Survey
Contents

1 Introduction
2 Medical Background
  2.1 Anatomy of the Human Heart
  2.2 Atrial fibrillation
    2.2.1 Electrical Activity in NSR and in AF
    2.2.2 Classification of AF
  2.3 ECG Signal
    2.3.1 Formation of the ECG Signal
    2.3.2 ECG in Normal Sinus Rhythm and Fibrillatory Rhythm
3 Methodology
  3.1 Database
  3.2 Noise Level Estimation
  3.3 The Pan-Tompkins algorithm for QRS detection
  3.4 Feature Extraction
    3.4.1 Feature Selection
  3.5 Classification
    3.5.1 Bayes Decision Theory
    3.5.2 Artificial Neural Network
    3.5.3 K Nearest Neighbor Classifier
  3.6 Post-processing and Diagnostic Decision
4 QRST Cancellation
  4.1 Straight Forward Averaging Algorithm for QRST Cancellation
  4.2 Improved QRST Cancellation
    4.2.1 Digital Filters
    4.2.2 QRS Clustering
    4.2.3 Sub-clustering based analysis on RR-interval
    4.2.4 Appropriate Templates and Subtraction
    4.2.5 Frequency Spectrum of AF
    4.2.6 Fourier Transform and Power Spectrum
5 Results and Discussion
  5.1 Results on Belt Database
  5.2 Results on MIT Database
  5.3 Discussion
    5.3.1 Limitation of QRST Cancellation
  5.4 The first and the beat window II
  5.5 QRS clustering Methods
  5.6 Survey of QRST cancellation using appropriate templates
  5.7 Results summary
6 Summary and Perspective
A Abbreviations and Acronyms
B COOKING BOOK FOR AF DETECTION TOOLBOX
C The M-file structure of AF detection using MIT [1] and OWN database
References

List of Figures

2.1 The cardiac conduction system
2.2 Diagram of electrical activity in NSR and during atrial fibrillation
2.3 Representative human ECG waveform
2.4 The generation of the ECG signal in the Einthoven limb leads
2.5 ECG in sinus rhythm and fibrillatory rhythm
3.1 An example from the MIT-BIH AF database
3.2 ECG miniature monitor and traces
3.3 Noise Detection
3.4 Block diagram of the Pan-Tompkins algorithm for QRS detection
3.5 Feature Extraction
3.6 Decision Tree
3.7 A neuron with a single scalar input and bias
3.8 Layers of back propagation
3.9 An example of 5-Nearest Neighbor classifier
3.10 Post-processing classifier output using moving non-overlapping window containing 6 beats in this example
3.11 Example of receiver operating characteristic curve
4.1 Example of AF episode acquired by wearable belt system
4.2 ECG and their respective remainder ECG are shown for an example of NSR and AF
4.3 ECG Example, uniform template and its respective residual signal after subtraction
4.4 Flow chart of QT cancellation algorithm
4.5 Frequency response of the 50 Hz low-pass filter
4.6 Frequency response of the 50 Hz Notch filter
4.7 Noisy ECG signal and filtered ECG signal
4.8 Schematic representation of QRS clustering
4.9 Result of QRS clustering for the ECG example in Fig. 4.3
4.10 The appropriate templates superimposed on heart beats of each cluster or subgroup for ECG example in Fig. 4.3
4.11 Flow chart of computing appropriate templates
4.12 An AF example, appropriate templates for QRST Cancellation and its respective residual signal
4.13 Remainder of the AF Example using the forward averaging algorithm and improved QRST Cancellation
4.14 An NSR example, template for QRST Cancellation and its respective residual signal
4.15 Definition of beat window
4.16 Frequency spectra from the residual signals in the AF and non-AF cases
5.1-5.17
  QT features averaged over AF and NSR records
  Histogram of QTCan14 and QTCan11
  ROC curve for combining features of QTCan14 and RR interval
  Plot features of AF and NSR episode from record 04746
  AF example with not apparent fibrillatory waves
  NSR example with chaotic atrial activity
  Noisy ECG signal causes considerable error in QRST cancellation
  K-means clustering partitions the QRS complexes into three clusters
  Frequency spectrum of the entire P wave
  Decision curves comparison
  Performance comparison
  QDC Density Estimation
  Density function
  Decision curves comparison
  QDC Density Estimation
  Density function

List of Tables

2.1 Classification of AF
5.1 Features of QRST Cancellation of chronic AF patients for each record
5.2 Features of QRST Cancellation of NSR patients for each record
5.3 Averaging features for AF and NSR records
5.4 Results of AF detection using QRST Cancellation features on wearable belt system
5.5 Results of AF detection using features of RR interval and QRST Cancellation on wearable belt system
5.6 The duration of annotated segment in minutes and the number of heart beats for each MIT AF database
5.7 AF detection using feature of RR interval, QRST Cancellation and combining the both features on MIT database
5.8 AF detection
5.9 AF detection
1 Introduction

Atrial fibrillation (AF) is a common arrhythmia with a prevalence of approximately 0.4-1% in the general population. Prevalence increases with age: AF is estimated to be present in 5% of those older than 65 and in 10% of those older than 70. It is associated with an increased risk of stroke and mortality, as well as congestive heart failure and cardiomyopathy. AF gives rise to a significant increase in mortality [2].

To help in the fight against AF, a system has been developed for the continuous monitoring of health status, based on non-invasive wearable sensors integrated in the Philips strap. Knowledge about the disease is enriched over time as the system learns the patient's behavior, for example by monitoring and "remembering" the heartbeat during daily activities. Biometric signals, primarily the ECG (electrocardiogram), are collected in the hospital or at home and displayed on a central station.

The ECG signal is a graphical representation, versus time, of the potential differences measured between two points on the body surface. It is produced by the activation fronts of cardiac depolarization and repolarization. ECG signals are widely employed as a diagnostic tool in clinical practice to assess the cardiac status of the patient. As one of the most important pieces of vital information, the ECG signal plays an important role in the continuous monitoring of patients who suffer from chronic cardiovascular diseases.

The abnormal excitation propagation in AF patients results in morphology changes in the ECG. The ECG of AF patients is characterized by irregular RR intervals, caused by chaotic atrial depolarization waves penetrating the AV node in an irregular manner. Because of the chaotic atrial activity, no consistent P wave can be seen; it is replaced by a fibrillatory wave caused by random reentry wavelets. These features enable automatic diagnosis of AF.
This novel continuous ECG monitoring supports automatic analysis of the ECG signal on a mobile platform such as a pocket PC. ECG processing algorithms suited to this setting are therefore in demand. This study proposes a solution for the automatic detection of AF that can run on a pocket PC. The AF detection uses features extracted from the ECG which reflect the electrophysiological changes that manifest in the ECG signal during AF. The most descriptive features were selected and evaluated using various classifiers on the MIT-BIH atrial fibrillation database, as well as on a database collected from atrial fibrillation patients using a Philips one-lead ECG strap.

This thesis contains the following chapters:

• Medical Background: The anatomy of the human heart, AF pathology and the formation of the ECG signal are introduced to aid in comprehension of the electrophysiological changes in the ECG of AF patients.

• Methodology: This chapter describes all the components in the framework of AF detection: database, preprocessing, feature extraction, post-processing and evaluation.

• QRST cancellation: The vast majority of AF detectors are based on ventricular irregularity, but the drawback is that rhythms other than AF can also have irregular ventricular responses. Therefore, attempts have been made to detect atrial fibrillatory activity using QRST cancellation.

• Results and Discussion: This chapter describes the results of AF detection on the two databases, using diverse features and combined features along with various classifiers. It also discusses the drawbacks of QRST cancellation and other possible approaches.

2 Medical Background

AF is the result of fractionated atrial electrical activity, mainly due to the shortening of the atrial refractory period, which allows multiple wavelets to pass through the atrial mass.
AF can probably cause molecular modifications of electrophysiological activity as well as structural and functional changes, which contribute to the disturbance of the initiation and propagation of the excitation pattern in atrial tissue. In normal sinus rhythm (NSR), the excitation originates from the sinoatrial node and propagates from the right atrium to the left atrium as a uniform wave; after a delay of about 0.1 second in the AV node, it travels along the His bundle to both ventricles. In AF, however, reentry wavelets occur instead of uniform excitation propagation. The excitation spreads throughout the atrium in a random pattern. This chaotic atrial depolarization causes rapid atrial activity at 300 to 600 bpm. Fortunately, the AV node does not permit all of the excitation to propagate to the ventricles. Only 1 or 2 of every 3 atrial signals can pass the AV node, due to the Wenckebach effect. Nevertheless, the ventricles contract at a high rate of 110 to 180 bpm, with a varying cycle length.

To aid in comprehension of the methods for AF detection, this chapter introduces the medical background, including the anatomy of the human heart, AF pathology, the formation of the ECG, and the characteristics of the ECG signal that distinguish atrial fibrillation from normal sinus rhythm.

2.1 Anatomy of the Human Heart

Together with the blood vessels, the heart constitutes the cardiovascular system, which has the task of transporting blood through the body. In this system the heart acts as a cyclically working pump and as a blood reservoir. Macroscopically, the mammalian heart is located inside the thorax, near the lungs, and is enclosed in the pericardium. Large blood vessels are connected to the heart. It is subdivided by septa into two functionally and anatomically similar structures, the right and left halves, which reflect the division of the blood circulation system into two parts.
The right half collects the deoxygenated blood from the body and pumps it to the lungs; the left half receives the oxygenated blood from the lungs and delivers it to the body. A constriction subdivides each half of the organ into two muscular regions enclosing a cavity. The upper region is called the atrium, the lower the ventricle. The heart therefore consists of four chambers: the left and right atria and the left and right ventricles. The atria collect the incoming blood, which is transported to the ventricles. From there the blood is moved to supply the body and the heart itself. The atria and ventricles are composed of walls surrounding a cavity, which is normally filled with blood [4].

Figure 2.1: The cardiac conduction system. The numbers in the brackets are the conduction times of the excitation propagation in seconds [3].

The excitation propagation in the human heart produces the electrocardiogram (ECG) and regulates the heart contraction. Figure 2.1 depicts the cardiac conduction system. The sinoatrial (SA) node emits an impulse from the leading pacemaker site. The impulse spreads immediately into the atrial cardiomyocytes and is transmitted through the entire atrial muscle mass to the atrioventricular (AV) node. After a brief delay in the AV node, during which the atria can contract and fill the ventricles with blood, the impulse is conducted through the His bundle via the Tawara bundles and a subendocardial network (Purkinje fibers). Finally, both ventricles are activated from endocardium to epicardium.

2.2 Atrial fibrillation

Atrial fibrillation (AF) is a common arrhythmia with a prevalence of approximately 0.4-1% in the general population. Prevalence increases with age and is estimated to be 5% in those older than 65 and 10% in those older than 70. AF is associated with an increased risk of stroke and mortality, as well as congestive heart failure and cardiomyopathy [5][6].
In normal individuals, a brief episode of AF may cause palpitations, chest discomfort and light-headedness. Palpitations (a sensation of a rapid and irregular heart beat) are the most frequent symptom in patients with paroxysmal AF. When AF is persistent or permanent, patients more often suffer non-specific symptoms such as poor effort tolerance, breathlessness on exertion, and lack of energy. AF can, by itself, cause severe congestive heart failure (CHF) after several weeks to months. The loss of atrial contraction also leads to enlargement of the atria and thus to stasis of blood in the atria, which promotes clot formation and the occurrence of thromboemboli. AF is the single most important cause of ischaemic stroke in people older than 75 [5].

2.2.1 Electrical Activity in NSR and in AF

Figure 2.2: Diagram of electrical activity in NSR (a) and during atrial fibrillation (b). Representative APs are shown from the SA node, atrial myocytes, AV node and ventricles. The vertical line on each AP recording corresponds to a common time reference. LA, left atrium; LV, left ventricle; RA, right atrium; RV, right ventricle [5].

The heart is a large muscular pump that drives blood around the body (see section 2.1). To achieve this effectively, the heart's chambers must be precisely controlled electrically. Figure 2.2 a) illustrates the normal regular activity in the physiological heart. The normal heart beat is initiated in the SA node at the normal sinus rhythm and is then conducted regularly throughout the atria, causing them to contract. The contraction of the atria propels blood into the ventricles. After a delay of about 0.1 s in the AV node, the excitation spreads rapidly through the His bundle and the connected branches to the ventricles and initiates their contraction. As a consequence, the blood is pumped from the ventricles to all of the organs of the body.
Figure 2.2 b) reflects the disturbed excitation propagation in the heart with AF. When episodes of AF occur, the heart beat is no longer initiated regularly in the SA node, and there is no single place in the atrium where activation starts. The wavefronts of excitation spread throughout the atrium in a random pattern (re-entry wavelets: multiple wavefronts of depolarisation), each finding another small region of tissue to depolarize. The atria are constantly activated in this chaotic manner; some of the wavefronts are captured by the AV node and propagate to the ventricles. Although the AV node filters most of these extra atrial signals, the heart rate still reaches 110 to 180 bpm. There is no effective contraction of the atrial muscle in this situation.

2.2.2 Classification of AF

It has long been recognized that an episode of AF may be self-terminating or non-self-terminating. The terms chronic and paroxysmal have been used, but these definitions sometimes lead to difficulties regarding the effectiveness of treatments and therapeutic strategies. It is important for clinicians to ascertain whether an incident of AF is the very first episode, that is, the initial event; whether it is symptomatic or not; and whether it is self-terminating or not. If the patient has had two or more episodes, AF is said to be recurrent. AF can be classified as follows (see Table 2.1):

| Terminology | Clinical features | Arrhythmia pattern |
|---|---|---|
| Initial event (first detected episode) | Symptomatic; asymptomatic (first detected); onset unknown (first detected) | May not recur |
| Paroxysmal | Spontaneous termination < 7 days and most often < 48 hours | Recurrent |
| Persistent | Not self-terminating | Recurrent |
| Permanent | Not terminated, or terminated but relapsed | Established |

Table 2.1: Classification of AF [7].

Paroxysmal AF: Episodes of paroxysmal AF usually self-terminate within 48 hours and, by definition, in fewer than 7 days. The heart changes from SR to AF episodes lasting from seconds to days.
The patient may have only 1 episode a year or be in AF most of the time, but the essential feature is that most episodes terminate spontaneously.

Persistent AF: When an episode of AF has lasted longer than 7 days, AF is designated as persistent. Persistent AF may be the first presentation of the arrhythmia or may be preceded by recurrent episodes of paroxysmal AF. When AF is persistent, termination by electrical cardioversion may be required to restore NSR. Cardioversion delivers an electrical shock to the heart, resulting in the momentary, simultaneous depolarisation of most cardiac cells. This allows the SA node to resume normal pacemaker activity.

Permanent AF: When AF has been present for some time and either fails to terminate with cardioversion or is terminated but relapses within 24 hours, it is said to be established or permanent [7].

2.3 ECG Signal

Figure 2.3: Representative human ECG waveform, adapted from [8].

Each individual heartbeat comprises a number of distinct cardiological stages, which in turn give rise to a set of distinct features in the ECG waveform. These features represent either depolarization (electrical discharging) or repolarization (electrical recharging) of the muscle cells in particular regions of the heart. This activation sequence generates the ECG, which is measured by electrodes placed on the patient's skin at specific positions, e.g. the left hand, right hand and left leg. These electrode locations on the extremities produce three different leads: Einthoven leads I, II and III. The activation front contributing to the ECG signal is shown in Fig. 2.4.

Figure 2.3 shows a human ECG waveform and the associated features. The standard features of the ECG waveform are the P wave, the QRS complex and the T wave. Additionally, a small U wave (following the T wave) is occasionally present. The timing between the onset and offset of particular features of the ECG (referred to as an interval) is of great importance.
The two most important intervals in the ECG waveform are the QT interval and the PR interval. The QT interval is defined as the time from the start of the QRS complex to the end of the T wave, i.e. $T_{\mathrm{off}} - Q$, and corresponds to the total duration of electrical activity (both depolarization and repolarization) in the ventricles. Similarly, the PR interval is defined as the time from the start of the P wave to the start of the QRS complex, i.e. $Q - P_{\mathrm{on}}$, and corresponds to the time from the onset of atrial depolarization to the onset of ventricular depolarization. In normal sinus rhythm [9][10][11]:

• P-R interval: 120-200 milliseconds (0.12 to 0.20 seconds)
• QRS interval: under 120 milliseconds (0.12 seconds)
• Q-T interval: under 380 milliseconds (0.38 seconds)

2.3.1 Formation of the ECG Signal

The cardiac cycle begins with the P wave (the start and end points of which are referred to as $P_{\mathrm{on}}$ and $P_{\mathrm{off}}$), which corresponds to the period of atrial depolarization in the heart. After the electric activation of the heart has begun at the sinus node, it spreads along the atrial walls. The resultant vector of the atrial electric activity is illustrated with a thick arrow in Fig. 2.4 (a). The projection of this resultant vector on each of the three Einthoven limb leads is positive. After the depolarization has propagated over the atrial walls, it reaches the AV node. The propagation through the AV junction is very slow and involves a negligible amount of tissue; it results in a delay in the progress of activation. (This is a desirable pause which allows completion of ventricular filling.)

The P wave is followed by the QRS complex, which is generally the most recognisable feature of an ECG waveform and corresponds to the period of ventricular depolarization. Once activation has reached the ventricles, propagation proceeds along the Purkinje fibers to the inner walls of the ventricles.
The ventricular depolarization starts from the left side of the interventricular septum, and therefore the resultant dipole from this septal activation points to the right. The right part of Fig. 2.4 (a) shows that this causes a negative signal in leads I and II. In the next phase, depolarization waves occur on both sides of the septum, and the resultant vector points to the apex. After a while the depolarization front has propagated through the wall of the right ventricle. Because the left ventricular wall is thicker, activation of the left ventricular free wall continues even after depolarization of a large part of the right ventricle. Because there are no compensating electric forces on the right, the resultant vector reaches its maximum in this phase (see the R peak in Fig. 2.4 (b)), and it points leftward. The depolarization front continues to propagate along the left ventricular wall toward the back. Because its surface area now decreases continuously, the magnitude of the resultant vector also decreases until the whole ventricular muscle is depolarized. The last regions to depolarize are the basal regions of both the left and right ventricles. Because there is no longer a propagating activation front, there is no signal either (see Fig. 2.4 (c)).

Figure 2.4: The generation of the ECG signal in the Einthoven limb leads. (a) The cardiac cycle begins with the P wave, which corresponds to the period of atrial depolarization in the heart. (b) and (c) The QRS complex is caused by ventricular depolarization. (d) The T wave represents the ventricular repolarization [8].

Ventricular repolarization begins from the outer side of the ventricles, and the repolarization front "propagates" inward (see Fig. 2.4 (d)). The inward spread of the repolarization front generates a positive signal, denoted as the T wave.
Because of the diffuse form of the repolarization, the amplitude of the signal is much smaller than that of the depolarization wave, and it lasts longer [8].

2.3.2 ECG in Normal Sinus Rhythm and Fibrillatory Rhythm

Figure 2.5: Top: ECG in sinus rhythm. Bottom: ECG in fibrillatory rhythm, adapted from [12][13].

Normal heart rhythm is termed sinus rhythm (SR) or normal sinus rhythm (NSR). The ECG in sinus rhythm (see the upper part of Fig. 2.5) is characterized by a conducted P wave with a P-R interval between 0.12 and 0.20 seconds. The QRS width should be 0.04 to 0.12 seconds and the Q-T interval less than 0.40 seconds. The rate for a normal sinus rhythm is 60 to 100 beats a minute. If the rate is below 60 beats a minute but the rest is the same, the rhythm is a sinus bradycardia. If the rate is between 100 and 150 beats a minute with the same intervals, it is a sinus tachycardia.

AF in the ECG (lower part of Fig. 2.5) is indicated by the absence of consistent P waves, due to the chaotic atrial depolarization. Chaotic atrial depolarization waves penetrate the AV node in an irregular manner, resulting in irregular ventricular contractions. The QRS complexes have a normal shape, due to normal ventricular conduction. However, the RR intervals vary from beat to beat [7].

In AF, the atria are excited rapidly and irregularly at a rate of 400 to 600 bpm. Fortunately, the AV node does not permit all of the excitations to propagate to the ventricles. Only 1 or 2 of every 3 atrial signals can pass the AV node (Wenckebach effect). Nevertheless, the ventricles contract at a high rate of 110 to 180 bpm in the absence of drug therapy [6]. The ventricular rate during AF (the effective "heart rate") is thus no longer under the physiological control of the SA node, but is instead determined by the interaction between the atrial rate and the filtering function of the AV node.

3 Methodology

The approach for AF detection uses data collected with the Philips strap and data from the MIT-BIH atrial fibrillation (AF) database [14].
The ECG signals were pre-processed by an R peak detection algorithm before the features were extracted. Beat-to-beat features were chosen to reflect the physiological changes that manifest in the ECG signal. Various classifiers taking these features as input were applied for pattern recognition. Finally, the accuracy of the classification decision was measured by a statistical analysis.

3.1 Database

The first evaluation database is the MIT-BIH arrhythmia database. This database was the first generally available set of standard test material for the evaluation of arrhythmia detectors, and it has been used for that purpose as well as for basic research into cardiac dynamics at about 500 sites worldwide [12]. The original analog recordings were made at Boston's Beth Israel Hospital (now the Beth Israel Deaconess Medical Center) using ambulatory ECG recorders with a typical recording bandwidth of approximately 0.1 Hz to 40 Hz. The individual recordings are each ten hours in duration and include two channels of ECG signals, each sampled at 250 samples per second with 12-bit resolution over a range of ±10 millivolts. Eighteen long-term ECG signals of channel one were chosen from the MIT-BIH atrial fibrillation database, which contains recordings of human subjects with paroxysmal AF. The remaining five recordings, which contain fewer AF episodes, were not used. Fig. 3.1 shows an example from the MIT-BIH AF database.

Figure 3.1: An example from the MIT-BIH Atrial Fibrillation Database, record 04015. Grid intervals: 0.2 seconds (horizontal) and 0.5 mV (vertical) [13].

Another database was collected from chronic AF patients using a one-lead ECG strap. This Philips belt has been developed with three integrated dry electrodes. The electrodes, based on carbon-loaded rubber, were integrated in the strap with a miniaturized shielded cable. The strap was worn around the chest. The representative P wave, QRS complex, distinct R peak and T wave can be seen in Fig. 3.2, with only slight morphology variation compared to the standard ECG leads [14].

Figure 3.2: (a) ECG monitoring strap designed for convenience. (b) ECG traces in sinus rhythm of a subject at rest, measured by the wearable belt on the chest [14].

In total, seventeen patients were measured in a clinic in Aachen, Germany. Nine of them suffer from chronic AF; the rest are healthy subjects. Three recordings were not used because of their bad quality. From the remaining fourteen recordings, 9120 beats with a total duration of 130 minutes were used for processing. The data come from regular but short recordings. The patients were resting during the measurement, and noisy data were not taken into account.

3.2 Noise Level Estimation

To determine the noisy parts of the signal, which were to be excluded, the discrete wavelet transform with the Daubechies 4 mother wavelet was used. We consider the following model of a discrete noisy signal [15]:

\[ y(n) = f(n) + \sigma e(n), \quad n = 1, \dots, N \tag{3.1} \]

The vector $y$ represents the noisy signal and $f$ is an unknown, deterministic signal. We suppose that $e$ is Gaussian white noise with distribution $N(0, 1)$, so that $\sigma$ is the noise level. Donoho and Johnstone [15] propose the "universal threshold"

\[ \delta = \sqrt{2 \ln N}\, \bar{\sigma} \tag{3.2} \]

where $\bar{\sigma}$ is an estimate of the noise standard deviation $\sigma$ given by

\[ \bar{\sigma} = \operatorname{median}(|C(1, k)|) / 0.6745 \tag{3.3} \]

The first scale $C(1, k)$ of the wavelet transform contains high frequencies, which are usually characteristic of noise. The energy function of the first scale is computed to amplify the noisy parts of the signal, and the "average" white noise level is estimated by taking the median of the absolute wavelet coefficients at this finest scale, as in (3.3); see Fig. 3.3.

Figure 3.3: Noise detection. The parts with saturation noise and high frequency noise have been successfully detected.
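The estimate in Eqs. (3.2) and (3.3) can be illustrated with a small sketch. It is not the thesis implementation: to stay dependency-free it uses a one-level Haar transform as a stand-in for the Daubechies 4 wavelet, and the test signal, the noise level `true_sigma` and all function names are assumptions chosen for illustration.

```python
import math
import random
import statistics


def haar_detail(signal):
    """Finest-scale detail coefficients of a one-level Haar DWT.

    Assumption: Haar is used here only as a simple stand-in for the
    Daubechies 4 wavelet named in the text.
    """
    return [(signal[i] - signal[i + 1]) / math.sqrt(2.0)
            for i in range(0, len(signal) - 1, 2)]


def estimate_sigma(detail):
    """Robust noise estimate of Eq. (3.3): median(|C(1, k)|) / 0.6745."""
    return statistics.median(abs(c) for c in detail) / 0.6745


def universal_threshold(sigma, n):
    """Universal threshold of Eq. (3.2): sqrt(2 ln N) * sigma."""
    return math.sqrt(2.0 * math.log(n)) * sigma


# Synthetic example (assumed values): a slow sine plus Gaussian white noise.
random.seed(0)
n = 4096
true_sigma = 0.1
clean = [math.sin(2.0 * math.pi * 0.5 * k / 250.0) for k in range(n)]
noisy = [c + random.gauss(0.0, true_sigma) for c in clean]

sigma_hat = estimate_sigma(haar_detail(noisy))
delta = universal_threshold(sigma_hat, n)
print(f"estimated sigma: {sigma_hat:.3f}, universal threshold: {delta:.3f}")
```

Because the slowly varying signal contributes almost nothing to the finest-scale coefficients, the median-based estimate should land close to the noise level that was injected, which is the property the method relies on.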
3.3 The Pan-Tompkins algorithm for QRS detection

Pan and Tompkins [16][17] proposed a real-time QRS detection algorithm based on analysis of the slope, amplitude, and width of QRS complexes. The algorithm includes a series of filters and methods that perform low-pass and high-pass filtering, differentiation, squaring, integration, adaptive thresholding and search procedures. Fig. 3.4 illustrates the steps of the algorithm in schematic form.

Figure 3.4: Block diagram of the Pan-Tompkins algorithm for QRS detection [17].

Bandpass filter: The bandpass filter reduces the influence of muscle noise, power-line interference, baseline wander, and T wave interference. The desired pass band to maximize the QRS energy is approximately 5-15 Hz.

Derivative operator: The derivative procedure suppresses the low-frequency components of the P and T waves and provides a large gain to the high-frequency components arising from the steep slopes of the QRS complexes.

Squaring: The squaring operation makes the result positive and emphasizes the large differences resulting from QRS complexes; the small differences arising from P and T waves are suppressed. The high-frequency components in the signal related to the QRS complex are further enhanced.

Integration: The output of the derivative-based operation exhibits multiple peaks within the duration of a single QRS complex. The Pan-Tompkins algorithm smooths the output of the preceding operations with a moving-window integration filter and produces the transformed ECG.

Adaptive threshold: Two sets of thresholds are used to detect QRS complexes in the transformed signals, which improves reliability compared with using a single threshold. The thresholds continuously adapt to the current characteristics of the ECG signal, since they are based upon the most recent signal and noise peaks. If a peak exceeds THRESHOLD I1 during the first step of analysis, it is classified as a QRS peak.
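As a brief illustration of the derivative, squaring and integration stages described above, the following sketch applies them to a toy signal. It is a simplification under stated assumptions, not the published filter design: the bandpass stage is omitted, a plain first difference stands in for the derivative filter, and the window width of 30 samples and the synthetic "QRS" and "T wave" shapes are illustrative choices.

```python
import math


def derivative(x):
    """First difference; an assumed simple stand-in for the derivative filter."""
    return [0.0] + [x[i] - x[i - 1] for i in range(1, len(x))]


def squaring(x):
    """Point-wise squaring: makes the signal positive and emphasizes steep slopes."""
    return [v * v for v in x]


def moving_window_integration(x, width):
    """Causal moving average over the last `width` samples."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= width:
            acc -= x[i - width]
        out.append(acc / width)
    return out


# Toy signal (assumed shapes): a sharp triangular "QRS" centred at sample 100
# and a broad, low "T wave" between samples 160 and 240.
ecg = [0.0] * 400
for k in range(6):
    ecg[95 + k] = 0.2 * k
    ecg[105 - k] = 0.2 * k
for k in range(160, 241):
    ecg[k] = 0.3 * math.sin(math.pi * (k - 160) / 80.0)

width = 30  # illustrative window length in samples
integrated = moving_window_integration(squaring(derivative(ecg)), width)
peak_index = integrated.index(max(integrated))
print("largest response near sample", peak_index)
```

In this toy case the transformed signal responds strongly to the steep QRS slopes and only weakly to the slow T wave, which is why a threshold on the integrated waveform can separate the two.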
If the search-back technique (described in the next paragraph) is used, the peak should be above THRESHOLD I2 to be called a QRS. For irregular heart rates, the first threshold of each set is reduced by half so as to increase the detection sensitivity and avoid missing beats. To be identified as a QRS complex, a peak must be recognized as such a complex in both the integration and the bandpass-filtered waveforms.

Search-back procedure: The Pan-Tompkins algorithm maintains two RR-interval averages: RR AVERAGE1 is the average of the eight most recent beats, and RR AVERAGE2 is the average of the eight most recent beats having RR intervals within the range specified by

$$\text{RR LOW LIMIT} = 92\% \cdot \text{RR AVERAGE2}, \qquad \text{RR HIGH LIMIT} = 116\% \cdot \text{RR AVERAGE2} \tag{3.4}$$

Whenever a QRS is not detected for a certain interval specified as

$$\text{RR MISSED LIMIT} = 166\% \cdot \text{RR AVERAGE2} \tag{3.5}$$

the maximal peak lying between the two established thresholds is taken to be the QRS.

3.4 Feature Extraction

There are three important feature groups used in the detection of atrial fibrillation: features using RR interval information, features using P-wave morphology, and features using QRST cancellation. We consider a combination of features from all these groups.

The features were extracted in a sliding window consisting of 30 beats, rather than by breaking the heart beats into separate blocks. Each time, the window was shifted forward by one heart beat (one R-R interval). In this way, an attempt was made to label each beat individually rather than in a group: the first ECG block, extending from the first to the thirtieth heart beat, formed the 1st feature array, the second ECG block, extending from the second to the thirty-first heart beat, formed the 2nd feature array, and so on. This technique results in a one-to-one correspondence between features and beats in the stream; see Fig. 3.5.

Figure 3.5: Features were calculated in moving window containing 30 beats.

1. An attractive approach for describing ventricular activity is to model the R-R interval sequence as a three-state Markov process [18]. Each interval is characterized as representative of one of the three states S, R, L by classifying it as short, regular or long. Intervals were called short if they did not exceed 85% of the mean interval, long if they exceeded 115% of the mean, and regular otherwise. The mean interval is determined recursively, for all observed R-R intervals rr(i) which do not exceed 1.5 seconds, by the relation

$$rr_{mean}(i) = 0.75 \cdot rr_{mean}(i-1) + 0.25 \cdot rr(i) \tag{3.6}$$

Assume that the R-R interval sequence

$$T = \{t_1, t_2, \dots, t_n\}, \quad t_i \in \{S, R, L\} \tag{3.7}$$

is controlled by a stationary first-order Markov process characterized by the transition probability matrix

$$P_{i,j,R} = P(t_i \mid t_j, R), \quad R \in \{\text{AF}, \text{other}\} \tag{3.8}$$

where AF and other denote AF and the other rhythms of the databases, respectively. This matrix gives the probabilities of moving between the states i and j under rhythm R.

Further features apart from Moody's matrix were calculated. In the time domain, the following parameters were extracted: the standard deviation of the NN intervals (SDNN), the standard deviation of the average NN interval (SDANN), the square root of the mean squared differences of successive NN intervals (RMSSD), and the number of differences of successive NN intervals greater than 50 ms (NN50). In the frequency domain, the power in the very low (VLF [0-0.01 Hz]), low (LF [0.01-0.15 Hz]) and high (HF [0.15-0.5 Hz]) frequency ranges and the ratio LF/HF were estimated. Furthermore, the following two non-linear parameters were computed as well: approximate entropy, a measure of complexity [19], and detrended fluctuation analysis [20], a measure of long-term correlations.

2. The second feature group is a test for the presence of a P wave. In normal sinus rhythm, the P wave can be observed before the QRS complex, while in the case of AF no P wave is present.
The P wave detection is done using template matching, where the correlation coefficient is used as a similarity measure between the actual P wave and the template. A threshold had to be chosen (0.1) to allow acceptance of very similar beats. In this way, each beat was labelled as a beat with P wave present or P wave absent.

3. Finally, the last feature group consists of frequency-domain properties of the ECG remainder obtained after QRST cancellation. The frequency spectra of ventricular and atrial activity overlap. The remainder electrogram is needed to cancel the ventricular component and isolate the atrial activity component of the signal. The remainder was calculated by the averaging method [21]. Fiducial points for ventricular complexes were marked using a method based on the algorithm presented by Pan and Tompkins [22]. This involved calculating the first and second derivatives of the electrocardiogram, adding their absolute values together, and marking the maxima as fiducial points. Basically, the average beat was aligned with the fiducial points of all dominant beat windows and subtracted. The other features derived from QRST cancellation will be discussed in chapter 4.

Figure 3.6: Example of feature selection using decision tree algorithm. To demonstrate the selection process, the data set on which the tree was built was much smaller than the data set for training and testing classifiers.

3.4.1 Feature Selection

In total we obtained 45 features. In order to reduce the dimension of the feature space, we applied the C4.5 decision tree algorithm using the WEKA package [23]. We retained the two most significant features by looking at the first levels of the resulting decision tree. One simplified example of the decision tree process is shown in Figure 3.6, where two features, one from the R-R interval analysis and one from the P wave template matching, were selected.
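As an illustration of the first feature group in Section 3.4, the labelling of R-R intervals and the estimation of a transition matrix can be sketched as follows. This is a simplified sketch under stated assumptions, not the code used in this work: the function names are hypothetical, the running mean is seeded with the first interval, and each interval is compared with the mean of the preceding intervals before the mean is updated.

```python
STATES = ("S", "R", "L")


def label_rr_states(rr_intervals):
    """Label each R-R interval (in seconds) as short (S), regular (R) or long (L)."""
    if not rr_intervals:
        return []
    mean = rr_intervals[0]  # assumption: seed the running mean with the first interval
    states = []
    for rr in rr_intervals:
        if rr <= 0.85 * mean:
            states.append("S")
        elif rr > 1.15 * mean:
            states.append("L")
        else:
            states.append("R")
        if rr <= 1.5:  # Eq. (3.6) only uses intervals not exceeding 1.5 s
            mean = 0.75 * mean + 0.25 * rr
    return states


def transition_matrix(states):
    """Relative frequencies of moving from one state to the next (rows sum to 1 or 0)."""
    counts = {a: {b: 0 for b in STATES} for a in STATES}
    for prev, cur in zip(states, states[1:]):
        counts[prev][cur] += 1
    matrix = {}
    for a in STATES:
        total = sum(counts[a].values())
        matrix[a] = {b: (counts[a][b] / total if total else 0.0) for b in STATES}
    return matrix
```

In a 30-beat window, the entries of such a matrix estimated from the window could serve as features to be compared against matrices learned separately for AF and for other rhythms.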
3.5 Classification

These features were fed to classifiers to categorize the ECG data into two classes: patients with or without AF. The database was split randomly into a training set (30%) and a test set (70%). For each beat, the correct classification was known and encoded as 1 for AF and 0 for non-AF. The classifier was trained using the ECG signal from the training set and evaluated on the test set. Different classifiers from a toolbox for pattern recognition (PRTools4) were tested to obtain the highest specificity and sensitivity. In the first case, equal covariance matrices were assumed for the Bayes classifier, which results in a linear discriminant function based on Bayes normal densities (LDC). In the second case, the covariance matrices are different for each category. The Bayes classifier for normally distributed classes with unequal covariance matrices is termed the quadratic classifier based on Bayes normal densities (QDC). The third classifier is a back propagation (BP) neural network with one hidden layer of 10 neuron units and one output neuron unit (10-ANN). The fourth one is the 3-nearest neighbor classifier (3-KNN). These classifiers are described in the subsections below.

3.5.1 Bayes Decision Theory

Bayes decision theory is a fundamental statistical approach to the problem of pattern classification. For two-category classification (AF and non-AF), we let $\omega_1$ and $\omega_2$ denote the two states to be classified, with prior probabilities $P(\omega_1)$ and $P(\omega_2)$, and suppose x is the feature value. The joint probability density of $\omega_j$ and x can be written in two ways: $P(\omega_j, x) = P(\omega_j|x)P(x) = P(x|\omega_j)P(\omega_j)$. Therefore the Bayes formula is given by

$$P(\omega_j|x) = \frac{P(x|\omega_j)P(\omega_j)}{P(x)} \tag{3.9}$$

where in this case of two categories

$$P(x) = \sum_{j=1}^{2} P(x|\omega_j)P(\omega_j) \tag{3.10}$$

The posterior probability $P(\omega_j|x)$ is the probability of the state being $\omega_j$ given that feature value x is measured. $P(x|\omega_j)$ is the density function of x given $\omega_j$.
P(x) is unimportant as far as making a decision is concerned. It is basically just a scale factor that states how frequently we will actually measure a pattern with feature value x. It only guarantees that $P(\omega_1|x) + P(\omega_2|x) = 1$. By eliminating this scale factor, we obtain the following completely equivalent decision rule:

$$\text{Decide } \omega_1 \text{ if } P(x|\omega_1)P(\omega_1) > P(x|\omega_2)P(\omega_2); \text{ otherwise decide } \omega_2 \tag{3.11}$$

There are many ways to represent pattern classifiers. One of the most useful is in terms of a set of discriminant functions $g_i(x)$, $i = 1, \dots, c$. The classifier is said to assign a feature vector x to class $\omega_i$ if

$$g_i(x) > g_j(x) \quad \text{for all } j \neq i \tag{3.12}$$

For the maximum a posteriori (MAP) rule, the associated discriminant functions become, after dropping the common factor P(x),

$$\tilde{g}_i(x) = P(x|\omega_i)P(\omega_i) \propto P(\omega_i|x) \tag{3.13}$$

Since the logarithm is monotonically increasing, the classification is unchanged if we take natural logs:

$$g_i(x) = \log \tilde{g}_i(x) = \log P(x|\omega_i) + \log P(\omega_i) \tag{3.14}$$

Let us assume that the likelihood densities are Gaussian:

$$P(x|\omega_i) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right] \tag{3.15}$$

The normal density is completely specified by two parameters: its mean µ and variance $\sigma^2$. The general multivariate normal density in d dimensions is written as

$$P(x|\omega_i) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_i|^{1/2}} \exp\left[-\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i)\right] \tag{3.16}$$

where x is a d-component column vector, $\mu_i$ is the d-component mean vector, $\Sigma_i$ is the d-by-d covariance matrix, and $|\Sigma_i|$ and $\Sigma_i^{-1}$ are its determinant and inverse. Eliminating the constant term, the MAP discriminant functions become

$$\tilde{g}_i(x) = |\Sigma_i|^{-1/2} \exp\left[-\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i)\right] P(\omega_i) \tag{3.17}$$

or, expressed in logarithmic form,

$$g_i(x) = -\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i) - \frac{1}{2}\log|\Sigma_i| + \log P(\omega_i) \tag{3.18}$$

This is known as the quadratic discriminant function. The quadratic term is called the Mahalanobis distance:

$$\|x - y\|^2_{\Sigma_i^{-1}} = (x-y)^T \Sigma_i^{-1} (x-y) \tag{3.19}$$

$\Sigma_i^{-1}$ can be thought of as a stretching factor on the space.
Note that for an identity covariance matrix ($\Sigma_i = I$), the Mahalanobis distance becomes the familiar Euclidean distance. Now consider the case in which the features are statistically independent and each feature has the same variance $\sigma^2$, i.e. $\Sigma_i = \sigma^2 I$. In this case the term $-\frac{1}{2}\log|\Sigma_i|$ is the same for all classes and can be dropped, and $g_i$ can be rewritten as

$$g_i(x) = -\frac{(x-\mu_i)^T (x-\mu_i)}{2\sigma^2} + \log P(\omega_i) = -\frac{1}{2\sigma^2}\left[x^T x - 2\mu_i^T x + \mu_i^T \mu_i\right] + \log P(\omega_i) \tag{3.20}$$

However, the quadratic term $x^T x$ is the same for all i, making it an ignorable additive constant. Thus, we obtain the equivalent linear discriminant functions

$$g_i(x) = w_i^T x + \omega_{i0} \tag{3.21}$$

where

$$w_i = \frac{1}{\sigma^2}\mu_i \tag{3.22}$$

and

$$\omega_{i0} = -\frac{1}{2\sigma^2}\mu_i^T \mu_i + \log P(\omega_i) \tag{3.23}$$

In short, the Bayes classifier for normally distributed classes is quadratic, whereas the Bayes classifier for normally distributed classes with equal covariance matrices is a linear classifier [24][25].

3.5.2 Artificial Neural Network

Artificial neural networks are computational systems, either hardware or software, which mimic the computational abilities of biological systems by using large numbers of simple, interconnected artificial neurons. Artificial neurons are simple emulations of biological neurons [26]. Fig. 3.7 shows a neuron unit with a single input. The scalar input p is transmitted through a connection that multiplies its strength by the scalar weight w to form the product wp, which is the argument of the transfer function f. The bias b is viewed as a threshold: if wp is greater than the threshold, the output is 1, otherwise it is 0.

Back propagation (BP) is the most widely used training algorithm. A BP network consists of at least three layers: an input layer, at least one intermediate hidden layer, and an output layer (see Fig. 3.8). With BackProp networks, learning occurs during a training phase in which each input pattern in a training set is applied to the input units and then propagated forward.
The pattern of activation arriving at the output layer is then compared with the correct (associated) output pattern to calculate an error signal. The error signal for each such target output pattern is then backpropagated from the outputs to the inputs in order to appropriately adjust the weights in each layer of the network. After a BackProp network has learned the correct classification for a set of inputs, it can be tested on a second set of inputs to see how well it classifies untrained patterns [28].

Figure 3.7: A neuron with a single scalar input and bias [27]

Figure 3.8: A back propagation network consists of at least three layers: an input layer, at least one intermediate hidden layer, and an output layer.

The simplest implementation of back propagation learning updates the network weights and biases in the direction in which the performance function decreases most rapidly, i.e. the negative of the gradient. One iteration of this algorithm can be written [29]

$$x_{k+1} = x_k - \alpha g_k \tag{3.24}$$

where $x_k$ is a vector of current weights and biases, $g_k$ is the current gradient, and α is the learning rate.

Learning in a backpropagation network takes place in two steps. First, each pattern is presented to the network and propagated forward to the output. Second, a method called gradient descent is used to minimize the total error on the patterns in the training set. In gradient descent, weights are changed in proportion to the negative of the error derivative with respect to each weight [28]:

$$\Delta w_{ji} = -\varepsilon \frac{\partial E}{\partial w_{ji}} \tag{3.25}$$

where $w_{ji}$ is the weight connecting unit i to unit j, and ε is a constant. Weights move in the direction of steepest descent on the error surface defined by the total error:

$$E = \frac{1}{2}\sum_p \sum_j (t_{pj} - o_{pj})^2 \tag{3.26}$$

where $o_{pj}$ is the activation of output unit $u_j$ in response to pattern p and $t_{pj}$ is the target output value for unit $u_j$.
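The update rule (3.25) with the total error (3.26) can be illustrated on the smallest possible case, a single linear output unit with one weight and one bias. This is only a sketch of batch gradient descent under those simplifying assumptions, not the 10-ANN used in this work; the function name and default values are hypothetical.

```python
def train_linear_unit(patterns, targets, epsilon=0.1, epochs=500):
    """Batch gradient descent for one linear unit o = w * p + b.

    Minimises E = 1/2 * sum_p (t_p - o_p)^2, cf. Eq. (3.26).
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for p, t in zip(patterns, targets):
            o = w * p + b
            grad_w += -(t - o) * p  # dE/dw for this pattern
            grad_b += -(t - o)      # dE/db for this pattern
        # Eq. (3.25): move against the summed gradient
        w -= epsilon * grad_w
        b -= epsilon * grad_b
    return w, b
```

For the patterns 0, 1, 2 with targets 1, 3, 5, the unit converges towards w = 2 and b = 1. In a multi-layer network the same update is applied to every weight, with the derivatives obtained by propagating the error backwards through the layers.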
The gradient is computed by summing the gradients calculated at each training example, and the weights and biases are only updated after all training examples have been presented. In summary, the BP network learns by example: it is provided with a learning set that consists of input examples and the known correct output for each case. The computational cycle is repeated until the network learns the problem "well enough", which means that the overall error value drops below some pre-determined threshold. However, this operation is unpredictable, since the network finds out how to solve the problem by itself.

3.5.3 K Nearest Neighbor Classifier

The KNN classifier is a very intuitive method. It requires an integer k, a set of labelled examples and a measure of "closeness" calculated by a distance function. For a given unlabelled example x, the algorithm finds the k "closest" labelled examples in the training data set and assigns the new point x to the class that appears most frequently within the k-subset. In other words, a decision is made by examining the labels of the k nearest neighbors and taking a vote. Expressed as a formula, the discriminant functions are given by [24]:

$$g_i(x) = \frac{k_i}{k} \tag{3.27}$$

where $k_i$ is the number of examples labelled as class $\omega_i$ and enclosed in a spherical volume around the unlabelled example x, and k is the total number of examples inside the spherical region. Fig. 3.9 illustrates an example of a 5-nearest neighbor classifier.

Figure 3.9: The test point is labelled by a majority vote of the samples enclosed in a spherical region. In the case k = 5, the test point is labelled as red [30].

Advantages:
• Analytically tractable, simple implementation
• Uses local information, which can yield highly adaptive behavior
• Lends itself very easily to parallel implementation

Disadvantages:
• Large storage requirements
• Computationally intensive recall
• Highly susceptible to the curse of dimensionality

3.6 Post-processing and Diagnostic Decision

The results of the classifier on the testing data are further post-processed as shown in Fig. 3.10. The number of AF beats detected in the sliding non-overlapping window of 30 beats was counted. An interval is marked as an AF interval if the number of AF beats detected in the ECG block exceeds a particular threshold. This threshold, designated the AF number threshold, was consequently used as the decision variable for receiver operating characteristic (ROC) analysis. The threshold is applied to each block containing 30 heart beats, rather than evaluating each heart beat, in order to smooth out small fluctuations in the classifier output.

Figure 3.10: Post-processing of the classifier output using a moving non-overlapping window containing 6 beats in this example. A segment that contains more normal sinus beats than a threshold is classified as AF absent, otherwise as AF present.

The diagnostic accuracy can be expressed through sensitivity, specificity and predictability in a certain study population. Let the prior probabilities P(A) and P(N) represent the fraction of data with AF and the fraction of data without AF, respectively. Let $T^+$ represent a positive test result (indication of the presence of AF) and $T^-$ a negative result (indication of the absence of AF). The following possibilities arise.

• A true positive (TP) is the situation when the test is positive for data with AF. The true-positive fraction (TPF) or sensitivity $S^+$ is given as $P(T^+|A)$ or

$$S^+ = \frac{\text{number of TP decisions}}{\text{number of data with AF}} = \frac{TP}{TP + FN} \tag{3.28}$$

The sensitivity of a test represents its capability to detect the presence of AF.

• A true negative (TN) represents the case when the test is negative for data without AF.
The true-negative fraction (TNF) or specificity $S^-$ is given as $P(T^-|N)$ or

$$S^- = \frac{\text{number of TN decisions}}{\text{number of data without AF}} = \frac{TN}{TN + FP} \tag{3.29}$$

The specificity of a test represents its accuracy in identifying the absence of AF.

• A false negative (FN) is said to occur when the test is negative for data with AF; that is, the test has missed the case. Its probability is $P(T^-|A)$.

• A false positive (FP) is defined as the case where the result of the test is positive although the individual being tested does not have AF. The probability of this type of error or false alarm, known as the false-positive fraction (FPF), is $P(T^+|N)$.

The efficiency of a test may also be indicated by its predictive values. The positive predictive value (predictability) PPV of a test, defined as

$$PPV = 100 \cdot \frac{TP}{TP + FP} \tag{3.30}$$

represents the percentage of the cases labelled as positive by the test that are actually positive [31].

Figure 3.11: Example of receiver operating characteristic curve

It is desired to have a diagnostic test that is both highly sensitive and highly specific. An ROC (receiver operating characteristic) curve is a graph that plots the (FPF, TPF) points, i.e. (1 - specificity, sensitivity), obtained for a range of decision thresholds or cut points of the decision method. An example of an ROC curve is illustrated in Fig. 3.11. In this case, the decision variable is the threshold on the number of detected AF beats contained in an ECG block (the AF number threshold), as mentioned above. By tuning the threshold parameter in the window of 30 beats, the optimal trade-off between sensitivity and specificity can be found. The optimal trade-off is defined as the point on the ROC curve which has minimal distance to the point (0,1); see the point connected to (0,1) by the red line in Fig. 3.11. Here we assume that sensitivity and specificity are equally important. In this example, the test has a sensitivity of 99.6% and a specificity of 98.7% when the threshold is set to 27.
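The measures (3.28) to (3.30) and the choice of the operating point closest to (0,1) can be written down in a few lines. The sketch below is illustrative only; the function names are hypothetical and any numbers passed to it are made-up inputs, not results of this study.

```python
import math


def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, PPV in percent), Eqs. (3.28)-(3.30)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = 100.0 * tp / (tp + fp)
    return sensitivity, specificity, ppv


def best_operating_point(roc_points):
    """Pick the (threshold, sensitivity, specificity) triple closest to (0, 1).

    The ROC plane has FPF = 1 - specificity on the x axis and
    TPF = sensitivity on the y axis.
    """
    return min(roc_points,
               key=lambda p: math.hypot(1.0 - p[2], 1.0 - p[1]))
```

Using the Euclidean distance to (0,1) weights sensitivity and specificity equally, which matches the assumption stated above; a different weighting would require a different criterion.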
By varying the decision threshold, we obtain different decision fractions, i.e. we can choose to operate at any sensitivity and specificity along the curve. The ROC curve is independent of the prevalence of AF.

4 QRST Cancellation

4.1 Straight Forward Averaging Algorithm for QRST Cancellation

AF is indicated by the absence of consistent P waves, due to the chaotic atrial depolarization. The RR intervals vary in time. In AF, the atria are excited rapidly and irregularly at a fibrillatory rate of 300 to 600 bpm caused by reentry wavelets (see section 2.2). Fig. 4.1 depicts the ECG of a chronic AF patient measured by the Philips strap. To sum up, AF is characterized by

1. irregular ventricular rhythm
2. presence of atrial fibrillatory activity

Figure 4.1: Example of AF episode acquired by wearable belt system

The most important feature in detecting AF is ventricular irregularity. A vast majority of the cases of AF do, in fact, have marked ventricular irregularity, but the drawback of this criterion is that rhythms other than AF can also have irregular ventricular responses. Many studies therefore propose to detect the fibrillatory activity in the surface electrogram. The challenge in the detection of fibrillatory waves lies in their chaotic nature and their small amplitude in comparison to the ventricular activity; in some cases they are buried in the QRS complexes and T waves. Sometimes, the amplitude of the fibrillatory waves is small enough to be invisible to the ECG reader.

Figure 4.2: High-voltage lead ECG and their respective remainder ECG are shown for an example of (top) NSR and (bottom) AF. The single heart beat on the right is the template for subtraction in each case [32].

Janet Slocum [21] presented a method to cancel the ventricular activity from the ECG, and used the power spectrum of the atrial fibrillatory wave to detect AF. A mean beat was generated by averaging over all beat windows aligned by the fiducial point, which is defined as the R peak location.
For all rhythms, the mean beat was aligned with the fiducial points of all the beat windows and subtracted. Figure 4.2 shows a result of QRST cancellation. Observe that the fluctuating waves in place of the P waves, and sometimes also the T waves, have a mean value close to the baseline. The average beat subtraction approach uses the fact that AF is uncoupled from ventricular activity; therefore the average is subtracted to produce a residual signal which contains the fibrillation waveforms, whereas the ECG of NSR has a small remainder after subtraction.

The above QRST cancellation relies on the assumption that the average beat can represent each individual beat accurately. However, QRS morphology often varies dynamically, caused by respiration, premature ventricular beats, fusion beats, etc. The signal may be polluted by a wide range of phenomena including myopotentials, electromagnetic noise and several other acquisition-related events. Fig. 4.3(a) depicts an ECG example of 20 seconds, collected from a chronic AF patient using the Philips strap (see section 3.1). Fig. 4.3(b) illustrates a template (red dashed line) superimposed on the ECG example (blue solid line). Obviously this template, calculated by averaging all the heart beats, does not match the ECG data well. The poor fits are due to dissimilar QRS complexes, i.e. high variation in QRS morphology and a different ST interval. Fig. 4.3(c) is the residual signal after subtraction, which ranges from -1635 to 710 mV.

Figure 4.3: (a) An ECG example of a chronic AF patient with a length of 20 s, measured by wearable ECG monitoring. (b) The template obtained by averaging all the heart beats, superimposed on the original ECG segment. (c) Remainder using the uniform template
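The straightforward averaging scheme described above can be sketched as follows. This is a minimal illustration, not the code used in this work: the function name and the `pre`/`post` window parameters are hypothetical, beats whose window does not fit into the signal are skipped, and the beat windows are assumed not to overlap.

```python
def average_beat_subtraction(ecg, r_peaks, pre, post):
    """Subtract the mean beat, aligned at the R peaks, from every beat window.

    ecg      -- list of samples
    r_peaks  -- sample indices of the fiducial points (R peaks)
    pre/post -- samples kept before/after each R peak (assumed parameters)
    Returns (residual, template).
    """
    starts = [r - pre for r in r_peaks
              if r - pre >= 0 and r + post < len(ecg)]
    if not starts:
        return list(ecg), []
    length = pre + post + 1
    # Mean beat: sample-wise average over all aligned windows
    template = [sum(ecg[s + k] for s in starts) / len(starts)
                for k in range(length)]
    residual = list(ecg)
    for s in starts:
        for k in range(length):
            residual[s + k] -= template[k]
    return residual, template
```

If all beats have the same morphology, the residual inside the windows contains only the activity that is uncoupled from the R peaks; with varying morphology the single template leaves QRS and T wave remnants, which is the limitation discussed in this section.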
The uniform-template cancellation in Fig. 4.3(c) performs poorly: the amplitudes of some residual QRS complexes are even greater than the amplitudes of the original ones, and residual T waves are clearly present in the ECG signal. For this reason, improvements to the cancellation method are carried out in this study, which are discussed below.

4.2 Improved QRST Cancellation

Figure 4.4: Flow chart of QRST cancellation algorithm.

Fig. 4.4 represents the main sequence of the improved QRST cancellation algorithm. The fundamental frequency of the residual signal of fibrillatory baseline is well below the 4-10 Hz range. However, a preliminary study revealed that harmonics arising from the QRST often produce more energy in the relevant frequency range than the much lower amplitude fibrillatory activity [33]. Therefore, accurate assessment of the frequency spectrum of the atrial component requires selective attenuation of the QRST. In general, the QRST complex is attenuated using a template matching and subtraction technique. The improvement focused on computing appropriate templates, meaning that more than one template is used, one for each morphology. Nevertheless, only one template is calculated if the signal is very regular. Linear-phase low-pass and high-pass filtering were used to reduce baseline wander and suppress noise before performing QRST cancellation. A notch filter eliminated the power-line interference. Clustering and RR-interval based beat classification were performed so that beats with different morphology and cycle length were separated into different classes. Anomalous beats were rejected from further analysis. A beat average was then computed for each of the classes. Finally, the remainder was subjected to the Fourier transform and displayed as a power spectrum. Each step is discussed in detail in the following subsections.

4.2.1 Digital Filters

ECG signals are often polluted by noise of various origins.
Noise includes muscle noise, artifacts due to electrode motion, power-line interference, baseline wander, and T waves with high-frequency characteristics similar to QRS complexes. The linear digital filters reduce the influence of these noise sources, and thereby improve the signal-to-noise ratio. In this study recursive digital filters with only small integer multipliers are used, which makes them both simple to program and fast in execution [34].

Low-pass filter: This class of filters is designed based on the formula

$$y_n = x_n - x_{n-m} \tag{4.1}$$

where $y_n$ represents the current (filtered) output sample value from the filter, $x_n$ represents the current input sample, and $x_{n-m}$ represents the input sample delivered to the filter m sampling periods previously. This time-domain description of the filter is converted into a transfer function G(z), which is derived as

$$G(z) = \frac{Y(z)}{X(z)} = 1 - z^{-m} \tag{4.2}$$

where X(z) and Y(z) are the z-transforms of the input x(n) and the output y(n), respectively. In this case, there are m zeros equally spaced around the unit circle, each of which gives rise to a transmission zero in the corresponding filter frequency response. Cancellation of the zero at z = 1 by a coincident pole gives the low-pass characteristic. The addition of the pole causes G(z) to be modified to

$$G(z) = \frac{Y(z)}{X(z)} = \frac{1 - z^{-m}}{1 - z^{-1}} \tag{4.3}$$

so that the time-domain recurrence formula becomes

$$y_n = y_{n-1} + x_n - x_{n-m} \tag{4.4}$$

The filter is recursive, since each output depends upon a previous output as well as on inputs. The integer m can be adjusted to give the desired cutoff frequency. Filters of this type have the advantage of a pure linear phase characteristic. A filter is said to have a linear phase response if its phase response satisfies a relationship of the following form [35]:

$$\theta(\omega) = \beta - \alpha\omega \tag{4.5}$$

The cutoff slope and the attenuation may be greatly improved by using higher-order zeros and cancelling poles instead of first-order ones.
The transfer function becomes:

$$G(z) = \frac{(1 - z^{-m})^n}{(1 - z^{-1})^n} \tag{4.6}$$

The transfer function of the second-order low-pass filter applied in this algorithm is

$$G(z) = \left[\frac{1 - z^{-2}}{2(1 - z^{-1})}\right]^2 \tag{4.7}$$

with a cutoff frequency of 50 Hz at the sampling frequency of 250 Hz. The frequency response of the filter is shown in Fig. 4.5.

Figure 4.5: Top: Magnitude response of the 50 Hz low-pass filter, Bottom: Phase response of the 50 Hz low-pass filter

High-pass filter: This class of filters can be extended to a high-pass filter, which is designed to remove the baseline drift in the ECG signal. The design of the high-pass filter is based on subtracting the output of a first-order low-pass filter from that of an all-pass filter. The transfer function of such a high-pass filter is

$$G(z) = \frac{-\frac{1}{128} + z^{-64} - z^{-65} + \frac{1}{128} z^{-128}}{1 - z^{-1}} \tag{4.8}$$

The low cutoff frequency of this filter is 0.5 Hz at the sampling frequency of 250 Hz. Components at 0.05 Hz are reduced in amplitude by about 50 dB.

Removal of power-line interference: A well-known method capable of reducing power-line interference is the use of a notch filter, characterized by unity gain at all frequencies except at the notch frequency, where the gain is zero. The transfer function of a second-order notch filter is given by [10]:

$$G(z) = \frac{Y(z)}{X(z)} = \frac{1}{2} \cdot \frac{(1 + a_2) - 2a_1 z^{-1} + (1 + a_2) z^{-2}}{1 - a_1 z^{-1} + a_2 z^{-2}} \tag{4.9}$$

The notch frequency $\omega_0$ and the 3-dB rejection bandwidth Ω are related to the filter coefficients $a_1$ and $a_2$ by

$$a_1 = \frac{2\cos(\omega_0)}{1 + \tan(\Omega/2)}, \qquad a_2 = \frac{1 - \tan(\Omega/2)}{1 + \tan(\Omega/2)} \tag{4.10}$$

In application, if the sampling frequency, the sinusoidal interference frequency (50 Hz in this case) and the notch bandwidth are $f_s$, $f_d$ and BW Hz, then

$$\omega_0 = 2\pi\left(\frac{f_d}{f_s}\right), \qquad \Omega = 2\pi\left(\frac{BW}{f_s}\right) \tag{4.11}$$

The desired rejection bandwidth can be obtained by adjusting BW. The magnitude and phase response of the 50 Hz notch filter are plotted in Fig. 4.6.
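The notch filter of Eqs. (4.9) to (4.11) can be checked numerically with a short sketch. The function names are hypothetical, and the 2 Hz bandwidth used in the usage note is an assumed example value, since the bandwidth chosen in this work is not stated here.

```python
import cmath
import math


def notch_coefficients(fs, fd, bw):
    """Coefficients a1, a2 of Eq. (4.10) for notch frequency fd and bandwidth bw (Hz)."""
    w0 = 2.0 * math.pi * fd / fs      # Eq. (4.11)
    omega = 2.0 * math.pi * bw / fs   # Eq. (4.11)
    t = math.tan(omega / 2.0)
    a1 = 2.0 * math.cos(w0) / (1.0 + t)
    a2 = (1.0 - t) / (1.0 + t)
    return a1, a2


def notch_gain(f, fs, a1, a2):
    """Magnitude of G(z), Eq. (4.9), evaluated on the unit circle at frequency f."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^-1
    num = 0.5 * ((1.0 + a2) - 2.0 * a1 * z1 + (1.0 + a2) * z1 * z1)
    den = 1.0 - a1 * z1 + a2 * z1 * z1
    return abs(num / den)


def notch_filter(x, a1, a2):
    """Apply Eq. (4.9) as a difference equation with zero initial conditions."""
    b0 = b2 = 0.5 * (1.0 + a2)
    b1 = -a1
    y = []
    for n in range(len(x)):
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(b0 * x[n] + b1 * x1 + b2 * x2 + a1 * y1 - a2 * y2)
    return y
```

With `notch_coefficients(250.0, 50.0, 2.0)`, the gain is essentially zero at 50 Hz and close to one at 0 Hz and 10 Hz, and a 50 Hz sinusoid passed through `notch_filter` decays to a negligible amplitude once the start-up transient has died out.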
Figure 4.6: Top: Magnitude response of the 50 Hz notch filter, Bottom: Phase response of the 50 Hz notch filter

Figure 4.7 compares the noisy ECG signal with the filtered signal after removal of baseline wandering, low frequency noise and power-line interference using the filters described above.

Figure 4.7: (a) Noisy ECG segment (b) ECG signal after removal of baseline wandering, low frequency noise and power-line interference.

4.2.2 QRS Clustering

Instead of subtracting a uniform template for the whole block, several templates are designed because of differences in ECG morphology, even within one subject. For this purpose, unsupervised clustering of QRS morphologies is performed on the filtered ECG signal. Clustering creates groups of objects, or clusters, in such a way that the morphologies of objects in the same cluster are very similar and the morphologies of objects in different clusters are quite distinct. The cluster analysis on a data set is performed by the following procedure:

1. At first, the QRS complex is extracted from QRS onset to QRS offset, which are detected by a segmentation program. Each QRS complex is aligned with the fiducial point and stored in an array of equal length. Short QRS complexes are padded with zeros to bring them to the required length.

2. A dissimilarity matrix stores the distances for all pairs of objects. This matrix is represented by an m × m table, where m is the total number of heart beats in a segment. The subscripts of the distance matrix are consistent with the indices of the R peaks. Equ. (4.12) is the dissimilarity matrix, where each element d(i, j) represents the difference or dissimilarity between the objects i and j. The rows and columns represent the objects.

$$DM = \begin{pmatrix} 0 & d(1,2) & d(1,3) & \cdots & d(1,m) \\ & 0 & d(2,3) & \cdots & d(2,m) \\ & & \ddots & & \vdots \\ & & & 0 & d(m-1,m) \\ & & & & 0 \end{pmatrix} \tag{4.12}$$

For all points x, y and z, a distance function must have the following properties.
• Nonnegativity: D(x, y) ≥ 0
• Reflexivity: D(x, y) = 0 if and only if x = y
• Symmetry: D(x, y) = D(y, x)
• Triangle inequality: D(x, y) + D(y, z) ≥ D(x, z)

d(i, j) is very small when objects i and j are very similar to each other, and becomes larger the more they differ. To calculate the dissimilarity between the objects i and j, the most popular distance measure is used, called the Manhattan or city block distance. The distance between i and j is given by

$$d(i, j) = \sum_{k=a}^{b} |x_{ik} - x_{jk}| \tag{4.13}$$

where a and b are the first and last index of the QRS array, respectively.

3. The distance information generated in Equ. (4.12) is used to determine the proximity of objects to each other. A pair of QRS complexes is denoted as similar when their distance is less than a predefined cut-off, termed the similarity threshold. The similarity threshold values are selected empirically to assure that a sufficient number of QRS complexes is included in the several initial clusters. Fig. 4.8 illustrates the way the algorithm groups objects into clusters. Let us treat the QRS complexes as green circles in space (see Fig. 4.8 a). The algorithm searches for similar pairs of QRS complexes along each row and clusters them at the first level (see the circles enclosed by the blue ellipses in Fig. 4.8 b). These newly formed clusters are linked to other objects to create bigger clusters at a higher level until there are no overlapping elements in these groups, presented as the contents enclosed by the purple ellipse.

Figure 4.8: Schematic representation of QRS clustering

4. The remaining objects, which cannot be grouped into any cluster, are labelled as abnormal QRS complexes (see the isolated circles in Fig. 4.8 b). Unfortunately, the number of QRS complexes labelled as abnormal depends upon the similarity threshold. To assure a strong correlation between the objects of the same cluster, a small initial similarity threshold is selected.
In some instances, most QRS complexes could then be categorized as abnormal because of differences in scale, irregularity of the ECG or subtle abnormalities in shape. Therefore a second, patient-specific threshold is used.

5. To verify the constructed clusters, a second threshold is set to control the number of rejected QRS complexes. The similarity threshold (first threshold) is increased and the clustering is recomputed at each iteration until the final number of rejected QRS complexes is less than the second threshold, called the abnormal threshold. More circles are gathered together in the enlarged blue ellipse in Fig. 4.8. The abnormal threshold is specified for the individual data. In progressive iterations, the similarity threshold is increased. In later iterations, if the increment of the similarity threshold does not change the clustering, the abnormal threshold is also increased. A greater abnormal threshold is assigned to a longer ECG segment.

Each beat in the ECG block is classified as either a dominant or an anomalous beat. Heart beats that contain QRS complexes of the most common morphology and amplitude for the rhythm strip are defined as dominant beats, in other words, all heart beats except the outlier beats derived from QRS clustering. Premature ventricular beats, fusion beats and beats that saturate the amplifiers are defined as anomalous beats, in other words, the outlier beats derived from QRS clustering.

Figure 4.9: (a) The ensemble of QRS complexes of the ECG example in Fig. 4.3. (b) Result of QRS clustering for this ECG example.

Fig. 4.9 (a) plots the ensemble of QRS complexes of the ECG example in Fig. 4.3. Fig. 4.9 (b) depicts the result of QRS clustering for this ECG example, which contains 22 heart beats. The algorithm above divides the 22 heart beats into two clusters: cluster 1 in magenta and cluster 2 in green, made up of 14 and 5 objects, respectively.
Three heart beats are labelled as abnormal and plotted in blue.

Figure 4.10: The appropriate templates (red line) superimposed on the heart beats (blue line) of each cluster or subgroup for the ECG example in Fig. 4.3. The QRS complexes of this example are grouped into clusters in Fig. 4.9. (a) and (b) Cluster 1 in Fig. 4.9, which contains medium, long and short RR intervals, is divided further into two subgroups for short and medium rhythm. (c) Cluster 2, made up of only medium heart beats, is not divided further. (d) The template and residual signal for anomalous heart beats are set to zero.

4.2.3 Sub-clustering based analysis on RR-interval

AF is always associated with an irregular heart rhythm, meaning that the RR intervals can have very different lengths. Here we want to subtract an average full-beat template from the ECG signal. To accommodate the different beat lengths, we calculate three templates of short, medium and long length for every QRS cluster, provided that the subgroups contain a sufficient number of beats. Each RR interval is therefore characterized as one of three states by classifying it as short (S), medium (M) or long (L). Intervals are called short if they do not exceed 85% of the mean interval, long if they exceed 115% of the mean, and medium otherwise. The running mean interval is given by [18]:

\[
rr_{mean}(i) = 0.75 \cdot rr_{mean}(i-1) + 0.25 \cdot rr(i) \tag{4.14}
\]

where i is the current heart beat index. Subsequently, each cluster is analyzed based on the RR interval classification. A cluster that contains heart beats of different RR classes is divided into subgroups. For instance, cluster 1, which contains 9 medium, 1 long and 4 short RR intervals, is divided into two subgroups for short and medium rhythm (see Fig. 4.10 (a) and (b)). The single long heart beat is merged with the medium subgroup.

4.2.4 Appropriate Templates and Subtraction

Figure 4.11: Flow chart of computing appropriate templates for the example in Fig. ??. Each heart beat is segmented by two kinds of beat windows.

The heart beats of each class are averaged to generate appropriate templates. Cluster 2 in Fig. 4.10 (c) includes only medium heart beats, thus one template is calculated for this class. In this way, appropriate templates are computed for similar QRS morphology and RR interval. The template and residual signals for anomalous beats in Fig. 4.10 (d) are set to zero to prevent their large remainders from contributing to the evaluation. Fig. 4.10 plots the computed templates (red line) superimposed on the heart beats (blue line) of each cluster or subgroup for the ECG example in Fig. 4.3. Fig. 4.11 illustrates the steps of computing the appropriate templates for this example in schematic form.

Figure 4.12: (a) An ECG example of a chronic AF patient with a length of 20 s, measured by wearable ECG monitoring. (b) Appropriate templates (red line) superimposed on the original signal (blue line). (c) Remainder after QRST cancellation using appropriate templates.

Figure 4.13: (a) Remainder of the AF example using the forward averaging algorithm with a uniform template. (b) Remainder of the AF example using the improved QRST cancellation with several appropriate templates.

Figure 4.14: (a) An ECG example of an NSR patient with a length of 20 s, measured by wearable ECG monitoring. (b) The template (red line) superimposed on the original signal (blue line). (c) Remainder after QRST cancellation.

Figure 4.15: (a) Beat window I is defined as the interval from $Q_{on}(i)$ to $Q_{on}(i+1)$. (b) Beat window II is defined as the interval from $PR + Q_{on}(i)$ to $PR + Q_{on}(i+1)$.

Two different definitions of the beat window are established. Beat window I is specified as the interval from the current QRS onset to the subsequent QRS onset (from $Q_{on}(i)$ to $Q_{on}(i+1)$), see Fig. 4.15 (a).
Beat window II is specified as the interval from the onset of the current P wave $P_{on}(i)$ to the onset of the subsequent P wave $P_{on}(i+1)$, see Fig. 4.15 (b). The two beat windows differ only in their boundaries.

Afterwards, the corresponding templates are subtracted from the original sequence after alignment at the fiducial point of each QRS complex. The template is padded with zeros so that its length is equal to that of the longest heart beat in the same class. Fig. 4.12 (b) depicts the template in red superimposed on the original ECG data in blue. The three blue heart beats observed in Fig. 4.12 (b) are abnormal heart beats rejected in averaging, whose template and remainder are set to zero. Fig. 4.12 (c) illustrates the residual signal, which ranges from -279 mV to 349 mV. Fig. 4.13 compares the residual signal obtained with the QRST cancellation algorithm described in section 4.1 (see Fig. 4.13 (a)) and with the improved QRST cancellation algorithm (see Fig. 4.13 (b)). This comparison shows that the improved algorithm performs considerably better than the forward averaging algorithm of section 4.1. The QRS remainder is reduced significantly by clustering and appropriate templates. The small residual of the T wave is achieved by taking the heart-rate-related variation of the ST interval into account.

In comparison to the AF example in the previous figures, Fig. 4.14 (a) shows an ECG example of a patient in normal sinus rhythm with a length of 20 s, measured by wearable ECG monitoring. The averaging template in red is superimposed on the original signal in blue in Fig. 4.14 (b). The remainder in Fig. 4.14 (c) contains very small pieces of QRS complexes and noise.

4.2.5 Frequency Spectrum of AF

The fibrillatory wave is usually the result of multiple simultaneous reentrant activation wavefronts.
The average size of reentrant pathways during AF depends on the atrial wavelength, defined as the product of conduction velocity and refractory period. Long wavelengths are associated with larger and fewer wavefronts, occurring in paroxysmal AF patients, whereas short wavelengths result in a great number of small circuits, occurring in chronic AF patients. The vast majority of ECG recordings analyzed in previous investigations have exhibited a single narrow-banded frequency spectrum when applying QRST cancellation [33][36][37][38][39][21]. Spectral decomposition of the residual ECG signal revealed that the dominant frequency may result from repetitive activations by a reentrant source such as a single focal source within a pulmonary vein. The lower frequency components may represent various degrees of spatiotemporal organization of waves propagating from that source to the rest of the atria [40].

The fibrillatory frequency in human AF determined from the surface ECG exhibits a marked individual variability, ranging between 4 and 10 Hz or between 5 and 10 Hz [33][36][37][38][39][21][41]. This is in agreement with the values obtained from intra-atrial recordings in previous investigations [42] and with physiological investigations: fibrillatory waves typically have a rate between 300 and 600 bpm, that is, 5 to 10 Hz [43][21][44][39][6]. Asano et al. induced AF with rapid pacing in 30 patients undergoing an electrophysiological study [42]. Patients in whom AF terminated spontaneously had an average fibrillatory frequency of 5.7 Hz, significantly lower than the 6.4 Hz recorded in the group of patients in whom the arrhythmia persisted. Estimating the fibrillatory frequency from the surface ECG, Bollmann et al. showed that the fibrillatory frequency increases from paroxysmal (mean 5.1 Hz) to chronic AF (mean 6.9 Hz). Chronic atrial fibrillation had the highest frequency.
Low-frequency fibrillation is more likely to terminate spontaneously or to respond to antiarrhythmic therapy, while high-frequency fibrillation is more often persistent and drug refractory [36].

4.2.6 Fourier Transform and Power Spectrum

The Fourier transform decomposes a signal into purely harmonic waves and is given by [45]:

\[
X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt \tag{4.15}
\]

This transform breaks down a signal into its frequency spectrum as a set of sinusoidal components, converting it from the time domain to the frequency domain. In practice the Fourier components of data are obtained by digital computation rather than by analogue processing. The Fourier transform in Equ. 4.15 cannot be applied directly because it is defined for continuous data. However, a Discrete Fourier Transform (DFT) is available for discrete data. Assume that a waveform has been sampled at regular time intervals T to produce the sample sequence x(nT) = x(0), x(T), ..., x[(N − 1)T] of N sample values, where n is the sample number from n = 0 to n = N − 1. The DFT of x(nT) is then defined as the sequence of complex values X(kΩ) = X(0), X(Ω), ..., X[(N − 1)Ω] in the frequency domain, where Ω = 2π/(NT). The transform of the N real data values (in the time domain) to N complex DFT values (in the frequency domain) is given by:

\[
X(k) = \sum_{n=0}^{N-1} x(nT)\, e^{-jk\Omega nT} = \sum_{n=0}^{N-1} x(nT)\, e^{-j 2\pi nk/N}, \quad k = 0, \ldots, N-1 \tag{4.16}
\]

However, a large number of multiplications and additions is required for the calculation of the DFT. For an N-point DFT, there are $N^2$ multiplications and $N(N-1)$ additions. If N = 1024, approximately one million complex multiplications and one million complex additions are required. Clearly some means of reducing these numbers is desirable. The fast Fourier transform (FFT) is a discrete Fourier transform algorithm which reduces the number of computations needed for N points from $2N^2$ to $2N \lg N$, where lg is the base-2 logarithm, by exploiting symmetry properties.
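To make the operation counts above concrete, the following sketch evaluates Equ. 4.16 term by term and compares the result with a textbook radix-2 decimation-in-time FFT. It is an illustration in plain Python under stated assumptions, not the implementation used in this work; the function names `dft_direct`, `fft_radix2` and `operation_counts` are hypothetical.

```python
import cmath
import math


def dft_direct(x):
    """Evaluate Equ. 4.16 term by term (N complex multiplications per output value)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * n * k / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n_samples = len(x)
    if n_samples == 1:
        return [complex(x[0])]
    if n_samples % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])   # DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])    # DFT of the odd-indexed samples
    half = n_samples // 2
    out = [0j] * n_samples
    for k in range(half):
        # Periodicity of the complex exponential lets one product serve two outputs.
        twiddle = cmath.exp(-2j * math.pi * k / n_samples) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out


def operation_counts(n_samples):
    """Counts quoted in the text: 2N^2 for the direct DFT, 2N lg N for the FFT."""
    return 2 * n_samples ** 2, 2 * n_samples * int(math.log2(n_samples))
```

For N = 1024, `operation_counts(1024)` gives 2 097 152 operations for the direct DFT against 20 480 for the FFT, while both functions return the same spectrum up to rounding error.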
The computational redundancy in the DFT is reduced by exploiting the periodicity of $e^{-j2\pi/N}$ [46]. A 1024-point fast Fourier transform was used to calculate the power spectrum in the frequency band from 0 to $f_s/2$ Hz, where $f_s$ is the sampling frequency. The ith element of the power spectrum (PS) is computed from the element $X_i$ of the Fourier transform X as [45]:

\[
PS_i = \frac{X_i X_i^{*}}{1024} \tag{4.17}
\]

Fig. 4.16 depicts the frequency analysis of the residual signals from different records of AF patients in the left panel (Fig. 4.16 (a), (c), (e)) and those of NSR patients in the right panel (Fig. 4.16 (b), (d), (f)). Among them, Fig. 4.16 (e) and 4.16 (f) show the power spectra of the remainders from Fig. 4.12 (c) and Fig. 4.14 (c). These six records were collected with the Philips strap from independent subjects (see section 3.1). Observe that in the AF case the absolute power is concentrated in the frequency band 4 – 10 Hz, exhibiting either a narrow peak or a large power component in this band, whereas the residual signal in the non-AF case has a lower power component in this band. Noise enhances the power components over the frequency band 0 – 4 Hz in Fig. 4.16 (b) and (d). The residual QRS signals in the remainder contribute to the high power content in Fig. 4.16 (f).

Figure 4.16: (a), (c) and (e): Frequency spectra of the residual signals, estimated using the FFT, in the AF cases. (b), (d) and (f): Frequency spectra of the residual signals in the non-AF cases.

The PS is analyzed in the frequency bands extending from 4 to 10 Hz and from 5 to 10 Hz, since the start frequency of the band is still ambiguous (see section 4.2.5). Two feature groups are extracted from the power spectrum: mean values over the dominant heart beats (see subsection 4.2.2) in the ECG block (QTCan1 – QTCan10) and the evaluation of the whole ECG block (QTCan11 – QTCan16), including 30 heart beats.
$PR_5^{10}$ is the percentage of the absolute power in the frequency band 5 – 10 Hz relative to the total power from 0 – 125 Hz in the ECG remainder, and $PR_4^{10}$ is the corresponding power ratio for the frequency band 4 – 10 Hz:

\[
PR_5^{10} = \frac{\sum_{f=5}^{10\,\mathrm{Hz}} P}{\sum_{f=0}^{125} P}, \qquad
PR_4^{10} = \frac{\sum_{f=4}^{10\,\mathrm{Hz}} P}{\sum_{f=0}^{125} P}
\]

Peak ratio denotes the ratio of the number of heart beats exhibiting maximal power in the 4 – 10 Hz frequency band to the number of dominant beats in the ECG block.

The following list enumerates the features extracted from the PS of the remainder.

1. QTCan1: The mean $PR_5^{10}$ using the beat window I (see Fig. 4.15 (a)).
2. QTCan2: The mean $PR_4^{10}$ using the beat window I (see Fig. 4.15 (a)).
3. QTCan3: Peak ratio using the beat window I.
4. QTCan4: The mean $PR_5^{10}$ using the beat window II.
5. QTCan5: The mean $PR_4^{10}$ using the beat window II.
6. QTCan6: Peak ratio using the beat window II.
7. QTCan7: The sum of QTCan1 and QTCan3.
8. QTCan8: The sum of QTCan2 and QTCan3.
9. QTCan9: The sum of QTCan4 and QTCan6.
10. QTCan10: The sum of QTCan5 and QTCan6.
11. QTCan11: $PR_5^{10}$ using the beat window I.
12. QTCan12: $PR_4^{10}$ using the beat window I.
13. QTCan13: Peak frequency in the PS using the beat window I.
14. QTCan14: $PR_5^{10}$ using the beat window II.
15. QTCan15: $PR_4^{10}$ using the beat window II.
16. QTCan16: Peak frequency in the PS using the beat window II.

5 Results and Discussion

Most of the features obtained with the QRST cancellation algorithm described in chapter 4 were fed into various classifiers, which categorize the ECG data into two classes: AF or non-AF. This chapter discusses the results obtained with diverse features and several classifiers on both databases: the belt data acquired with the Philips strap and the MIT-BIH AF database. The features are selected by statistical methods and visualized. This chapter also covers the drawbacks of the QRST cancellation algorithm. Possible approaches for QRS clustering and QRST cancellation are introduced together with their performance.
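The power-ratio and peak-frequency features of section 4.2.6, on which the results in this chapter are based, can be sketched as follows. This is a minimal illustration, not the code used in this work: the function names are hypothetical, zero-padding of short remainders is an assumption, and the spectrum is computed with a direct DFT for brevity where the text uses a 1024-point FFT. The upper limit of 125 Hz in the power ratio suggests a sampling frequency of 250 Hz, which is likewise an assumption.

```python
import cmath
import math


def power_spectrum(signal, n_fft=1024):
    """Power spectrum PS_i = X_i * conj(X_i) / n_fft for bins 0 .. n_fft/2 (Equ. 4.17)."""
    # Zero-pad or truncate to n_fft samples (padding is an assumption of this sketch).
    x = list(signal[:n_fft]) + [0.0] * max(0, n_fft - len(signal))
    spectrum = []
    for k in range(n_fft // 2 + 1):
        # Direct DFT for clarity; an FFT yields the same values much faster.
        x_k = sum(x[n] * cmath.exp(-2j * math.pi * n * k / n_fft)
                  for n in range(n_fft))
        spectrum.append((x_k * x_k.conjugate()).real / n_fft)
    return spectrum


def band_power_ratio(ps, fs, f_low, f_high, n_fft=1024):
    """Percentage of the power in [f_low, f_high] Hz relative to the power in 0 .. fs/2."""
    resolution = fs / n_fft  # width of one frequency bin in Hz
    band = sum(p for i, p in enumerate(ps) if f_low <= i * resolution <= f_high)
    total = sum(ps)
    return 100.0 * band / total if total > 0 else 0.0


def peak_frequency(ps, fs, n_fft=1024):
    """Frequency in Hz of the largest power spectrum component."""
    return max(range(len(ps)), key=ps.__getitem__) * fs / n_fft
```

With these helpers, a block-wise feature such as QTCan14 would correspond to `band_power_ratio(ps, fs, 5, 10)` evaluated on the remainder obtained with beat window II, and QTCan16 to `peak_frequency(ps, fs)`. With fs = 250 Hz and 1024 points the frequency resolution is about 0.24 Hz.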
5.1 Results on Belt Database

First, the QRST cancellation algorithm was validated using the database acquired with the Philips strap. A total of 6930 beats from 14 records with a duration of 130 minutes were processed, including 8 records of chronic AF patients (4650 heart beats) and 6 records from patients in normal sinus rhythm (2280 heart beats). For each heart beat, the 16 features defined in section 4.2.6 were extracted based on the QRST cancellation algorithm. Table 5.1 shows the mean value of the features for each AF record, as well as the number of heart beats, whereas Table 5.2 shows the same analysis for the 6 records in normal sinus rhythm. In general, the energy and spectral peak of the remainders are concentrated in the frequency band 4 – 10 Hz or 5 – 10 Hz in the case of chronic AF patients, as captured by QT1 – QT6 and QT11 – QT16. The remainders of NSR, however, have a lower frequency content in this band. These results are in agreement with the statements in section 4.2.5. QT7 – QT10, which combine peak and power percentage features, should be greater in the case of AF than in NSR. Higher values of QT3, QT6, QT13 and QT16 indicate that a peak frequency between 4 and 10 Hz occurs more frequently in the AF records than in the NSR records. We expect the remainder to exhibit a spectral peak in the 4 – 10 Hz frequency band. In fact, this is not always the case. In some instances the peak frequency of the AF ECG is shifted due to noise, insufficient subtraction, etc. The reasons are discussed in section 5.3.1. Bollmann et al. [33] did not detect the peak from 4 to 10 Hz in every record, either. The features QT7 – QT10 display a wide range of values over the different patients. The 16 features are averaged over all AF and all NSR records, as shown in Table 5.3. These results for AF and NSR are also plotted in Fig. 5.1, as purple and red bars respectively.
As expected, the mean values of all 16 features are greater in the AF data than in the NSR data, consistent with the higher spectral component in the relevant region.

Table 5.1: Features of QRST Cancellation for each AF record. HB: heart beats.

         AF(1)   AF(2)   AF(3)   AF(4)   AF(5)   AF(6)   AF(7)   AF(8)
QT1      55.44   45.06   39.06   35.31   47.38   23.12   29.24   38.35
QT2      63.90   53.48   47.58   40.05   55.19   29.53   36.57   46.05
QT3      81.21   57.56   41.51   31.42   60.28   23.44   22.33   44.06
QT4      55.73   43.80   39.02   37.16   46.84   22.20   32.39   35.60
QT5      63.87   51.92   46.44   40.24   54.46   28.20   40.25   42.91
QT6      81.27   54.41   38.29   30.07   59.59   23.17   28.58   37.36
QT7     136.65  102.62   80.57   60.73  107.67   36.56   51.57   82.41
QT8     145.11  111.04   89.09   68.02  115.67   42.97   58.91   90.12
QT9     147.01   98.22   77.32   50.23  106.43   35.37   60.97   72.95
QT10    145.14  106.33   84.73   61.31  114.05   41.37   68.83   80.27
QT11     46.92   32.34   32.36   29.28   38.97   16.63   24.46   26.37
QT12     52.17   39.09   38.29   32.49   43.85   22.78   32.03   32.29
QT13      6.06    4.80    4.93    3.55    5.08    3.29    3.09    3.60
QT14     47.83   31.76   33.34   30.58   38.91   16.07   25.70   24.87
QT15     52.03   37.88   39.34   36.67   43.67   21.36   33.40   30.71
QT16      6.18    4.82    4.31    4.04    5.09    3.02    3.10    3.20
HB         540     690     900     330    1470     120     240     360

Table 5.2: Features of QRST Cancellation for each NSR record. HB: heart beats.

        NSR(1)  NSR(2)  NSR(3)  NSR(4)  NSR(5)  NSR(6)
QT1      36.20   16.21   21.32   44.81   33.89   27.37
QT2      42.05   21.89   30.32   52.38   41.81   34.80
QT3      39.71    2.82   31.26   54.93   17.92   16.14
QT4      16.67   14.39   17.83   44.64   32.38   26.63
QT5      19.16   20.19   24.03   52.93   40.07   33.36
QT6       5.77    1.79    7.29   52.18   15.15   15.09
QT7      75.90   19.03   37.58   99.74   51.81   43.52
QT8      81.76   24.71   43.58  107.31   59.73   50.98
QT9      22.44   16.19   25.12   96.82   47.53   41.73
QT10     24.93   21.99   31.33  105.11   55.23   48.46
QT11     25.36   12.48   19.19   36.10   30.65   20.86
QT12     29.16   17.16   24.13   41.89   38.72   27.40
QT13      7.49    1.87    2.39    3.83    2.82    2.28
QT14     14.84   10.96   13.01   32.21   30.30   20.55
QT15     16.46   15.42   17.75   44.23   38.05   27.06
QT16     10.78    1.94    2.18    3.79    2.69    2.31
HB         300     690     480     210     300     300

Table 5.3: Averaged features for AF and NSR records

Feature      AF     NSR      Feature      AF     NSR
QT1       42.82   30.08      QT9       91.94   38.88
QT2       50.83   36.78      QT10      99.56   45.00
QT3       51.50   22.90      QT11      34.07   23.79
QT4       42.26   24.42      QT12      39.82   29.41
QT5       49.88   30.54      QT13       4.66    3.43
QT6       49.68   14.45      QT14      34.14   29.19
QT7       94.31   52.98      QT15      39.62   25.4543
QT8      102.32   56.89      QT16       4.92    3.34

Figure 5.1: The 16 QT features averaged over all AF and all NSR records

Apart from the mean values of the features, histograms are displayed to show the distribution of the features. A histogram counts the number of elements within each range of feature values and displays each range as a rectangular bin. The height of a bin specifies the number of values that fall within its range. To simplify comparison, the Y axis in Fig. 5.2 is normalized by the total number of heart beats and expressed as a percentage. Fig. 5.2 (a) refers to a distinct feature, QTCan14, which is centered on the range 30 – 45 in the case of AF (upper) and lies well below 20 in the case of NSR (bottom). In comparison, the distributions of QTCan11 in Fig. 5.2 (b) overlap over the range 20 – 40 in both cases, thus little differentiation appears to be present.
The histograms show that QTCan1 and QTCan2, QTCan4 and QTCan5, QTCan7 and QTCan8, QTCan9 and QTCan10, and QTCan14 and QTCan15 are highly correlated.

Figure 5.2: Histogram of QTCan14 (a) and QTCan11 (b) comparing the AF records (upper) with the NSR records (bottom). QTCan11 and QTCan14 are the power percentages in the frequency band 5 – 10 Hz using the beat window I and the beat window II, respectively.

Table 5.4: Results of AF detection using QRST Cancellation on the wearable belt system. The measures are expressed in %. Legend: Met.: classification method, Se.: sensitivity, Sp.: specificity, Pr.: predictability.

Feature    Met.      Se.      Sp.      Pr.
QTCan1     QDC      80.77    78.19    64.29
QTCan2     3-KNN    83.21    85.66    78.43
QTCan3     QDC      80.77    81.90    78.57
QTCan4     3-KNN    91.56    80.71    71.43
QTCan5     3-KNN    86.04    86.50    78.57
QTCan6     QDC      84.19    86.19    78.57
QTCan7     3-KNN    89.61    83.43    71.43
QTCan8     3-KNN    91.80    83.04    78.57
QTCan9     3-KNN    93.59    83.57    78.57
QTCan10    3-KNN    92.45    83.57    78.57
QTCan11    3-KNN    86.77    88.47    64.29
QTCan12    3-KNN    88.72    86.22    71.43
QTCan13    QDC      98.07    57.74    50.00
QTCan14    KNN      81.51    96.60    85.71
QTCan15    3-KNN    85.61    91.53    78.57
QTCan16    QDC     100.00    55.71    50.00

These features are fed into classifiers that are trained on the training set to distinguish AF data from NSR data. Different classifiers were applied: the normal density based linear classifier (LDC), the normal density based quadratic classifier (QDC), a back-propagation neural network with one hidden layer of 10 neuron units (10-ANN) and the K-nearest neighbor classifier (KNN). Classification decisions on the testing set were evaluated by statistical parameters: sensitivity, specificity and predictability. Table 5.4 summarizes the best results of the four classifiers noted above, each of which received one single QTCan feature at a time. With one input feature, KNN and QDC have the best performance on this problem.
The highest specificity and predictability are achieved using the feature QTCan14 (sensitivity 81.51%, specificity 96.60% and predictability 85.71%). The features using the beat window II (QTCan4 – QTCan6 and QTCan14 – QTCan16) provide higher sensitivity and specificity than the ones using the beat window I (QTCan1 – QTCan3 and QTCan11 – QTCan13). The features QTCan7 – QTCan11 distinguish AF and NSR data more efficiently than the single features QTCan1, QTCan2, QTCan4 and QTCan5. The power spectra of the whole ECG block of 30 heart beats (QTCan11 – QTCan12 and QTCan14 – QTCan15) yield better sensitivity and specificity than the mean power spectra (QTCan1 – QTCan2 and QTCan4 – QTCan5). Here we assume that sensitivity and specificity are equally important, and the optimal cut-off between sensitivity and specificity is determined by the ROC curve.

Ventricular irregularity plays an important role in differentiating between AF and other irregular rhythms. To test the effect of combining features, the RR interval features (see section 3.4) were input to the classifiers in addition to the features of QRST cancellation. The results of AF detection using the combined features are given in Table 5.5.

Table 5.5: Results of AF detection using features of RR interval and QRST Cancellation on the wearable belt system. The measures are expressed in %. Legend: Met.: classification method, Se.: sensitivity, Sp.: specificity, Pr.: predictability.

Features                 Met.      Se.      Sp.      Pr.
RR interval              QDC      95.24    94.64    92.61
RR interval, QTCan1      QDC      96.43    96.43    92.86
RR interval, QTCan2      LDC      97.32    95.71    88.71
RR interval, QTCan3      QDC      98.21    96.43    92.86
RR interval, QTCan4      10-ANN   99.11    96.43    92.86
RR interval, QTCan5      QDC      96.43    97.02    92.86
RR interval, QTCan6      KNN      99.35    94.64    92.86
RR interval, QTCan7      QDC      97.32    95.83    92.86
RR interval, QTCan8      QDC      96.43    96.43    92.86
RR interval, QTCan9      QDC      95.54    97.62    92.86
RR interval, QTCan10     QDC      97.32    96.00    85.71
RR interval, QTCan11     QDC      98.17    98.40    92.68
RR interval, QTCan12     QDC      98.56    98.21    92.68
RR interval, QTCan13     QDC      98.27    98.69    85.71
RR interval, QTCan14     QDC      99.46   100.00   100.00
RR interval, QTCan15     QDC      98.56    98.81    92.86
RR interval, QTCan16     QDC      98.42    98.10    85.71

Figure 5.3: Receiver operating characteristic curve for the combined features QTCan14 and RR interval.

As expected, the combined features improve the sensitivity, specificity and predictability of AF detection significantly in comparison to a single feature. The best result using only RR interval features is sensitivity: 97.90%, specificity: 93.33% and predictability: 85.71%. The performance of KNN with multiple input features is not as superior as it was with a single input feature. The QDC classifier performs better, yielding sensitivity: 99.46%, specificity: 100% and predictability: 100% using the combination of the RR interval and QTCan14. Figure 5.3 depicts the ROC curve for this instance, illustrating the change in sensitivity and false positivity (expressed as percentages in the figure) as a function of the decision variable, marked by diagonal crosses. False positivity is defined as one minus specificity. The decision threshold is here the number of detected AF beats ($N_{AF}$ = 2 ... 29) in a sliding window of 30 beats. If the number of detected AF beats in the window exceeds $N_{AF}$, the segment is classified as AF. The cut-point 28, connected with (0,1) by a red line, is the optimal trade-off between sensitivity and specificity. The fact that only chronic AF and NSR data were evaluated, without paroxysmal AF, has a big influence on the result of 100%.

5.2 Results on MIT Database

A comparative study using the MIT Atrial Fibrillation database was performed. This database includes twenty-three long-term ECG recordings of paroxysmal AF patients. Among them, eighteen recordings were analyzed, which contain AF as well as NSR episodes.
The other five recordings (ID: 05091, 07162, 07859, 08405 and 08455) were excluded from the further analysis because their AF episodes were too short or absent. Table 5.6 lists the duration of the annotated segments in minutes as well as the number of heart beats for each record. In total, eighteen records with a duration of 7937 minutes were selected to validate the algorithms, including 246600 heart beats of AF episodes and 435600 heart beats of NSR episodes.

Table 5.6: The duration of the annotated segments in minutes and the number of heart beats for each record of the MIT AF database. ID: patient ID, Pos. HB: the number of heart beats in AF episodes, Neg. HB: the number of heart beats in NSR episodes, HB: the total number of heart beats in each record.

ID       AF(min)   NSR(min)   AF+NSR    Pos. HB   Neg. HB   HB
04015       3.85     93.41     97.26       600      7680     8280
04043     130.04    466.75    596.79     16170     45420    61590
04048       5.73    506.39    512.12       960     34170    35130
04126      22.59    531.24    553.83      3150     36750    39900
04746     206.46    173.90    380.36     30900     10170    41070
04908      37.62    535.14    572.76      5910     54330    60240
04936     439.55    114.52    554.07     40350      6960    47310
05121     385.11    141.12    526.23     33240     10800    44040
05261       7.91    485.77    493.68      1110     36210    37320
06426     384.10     27.80    411.90     35940      2340    38280
06453       4.51    492.66    497.17       510     31140    31650
06995      93.86    159.02    252.88      8730     27570    36300
07879     370       118.98    488.98     40080      9210    49290
07910      33.51    137.05    170.56      2490     29970    32460
08215      95.32    118.3     213.62      7170     10170    17340
08219     131.76    273.29    405.05     15060     26820    41880
08378      24.32    371.03    395.35      1980     30270    32250
08343      16.39    150.06    166.45      2190     10350    12540
Total    2394.45   5542.20   7936.65    246600    435600   682200

The results on the belt data have already given a sense of how well the features of QRST cancellation differentiate between AF and NSR. The features extracted from the MIT data were analyzed using the same methods: visualization, calculation of the mean value and distribution histograms. These features perform similarly to those extracted from the belt data. Generally, most of the energy in the remainder of AF data is contained within 4 – 10 Hz, whereas the remainder of NSR has a lower frequency content in this band. Therefore the features from AF data are greater than those from NSR data, except QTCan13 and QTCan16. More distinct features were obtained using the beat window II (QTCan4 – QTCan6, QTCan9 – QTCan10 and QTCan14 – QTCan16) than using the beat window I (QTCan1 – QTCan3, QTCan7 – QTCan8 and QTCan11 – QTCan13). The reason is explained in section 5.4. QTCan1 and QTCan2, QTCan4 and QTCan5, QTCan7 and QTCan8, QTCan9 and QTCan10, QTCan11 and QTCan12, and QTCan14 and QTCan15 are strongly correlated. For example, Fig. 5.4 plots the features QTCan4, QTCan6, QTCan10, QTCan15, QTCan2 and QTCan11 of an AF episode (red solid line) and an NSR episode (blue dashed line) from record 04746. QTCan2 and QTCan11 evidently cannot distinguish AF from non-AF well.

Figure 5.4: Features (a) QTCan4, (b) QTCan6, (c) QTCan10, (d) QTCan15, (e) QTCan2 and (f) QTCan11 of an AF episode (red solid line) and an NSR episode (blue dashed line) from record 04746.

Effective features were selected manually by statistical methods: visualization, mean values and histograms. The performance of combined features was also analyzed by varying the number of features input to the classifier. Because of computational constraints, only the distinct features of RR interval and QRST cancellation, rather than all features, were given to the LDC, QDC and 10-ANN classifiers. 3-KNN was not applied because it is computationally expensive for large amounts of data. Table 5.7 summarizes the best results of these three classifiers using the RR interval feature, the QRST cancellation features and the combination of both.
These results reveal that significant ventricular irregularity is a crucial criterion for distinguishing AF from NSR data, giving the best result of any single feature: sensitivity 89.85%, specificity 89.14% and predictability 76.50%. The performance of the QTCan features on the MIT data is not as good as on the belt data. The MIT data are much longer and noisier than the belt data and contain diverse ECG morphologies, thus the analysis of the MIT data is much more complicated and difficult. Therefore the sensitivity, specificity and predictability obtained using the MIT data are lower than those obtained using the belt data, although the same QRST cancellation algorithm was used.

Table 5.7: AF detection using the RR interval feature, QRST Cancellation features and the combination of both on the MIT database. The measures are expressed in %. Legend: Met.: classification method, Se.: sensitivity, Sp.: specificity, Pr.: predictability.

Features                 Met.      Se.      Sp.      Pr.
RR interval              QDC      89.85    89.14    76.50
QTCan4                   10-ANN   76.83    77.45    60.89
QTCan5                   10-ANN   77.03    78.65    61.23
QTCan6                   QDC      79.11    77.25    56.73
QTCan9                   10-ANN   81.06    75.29    56.91
QTCan10                  LDC      78.39    79.50    60.26
QTCan14                  LDC      72.01    60.32    51.74
QTCan15                  LDC      73.72    75.31    61.51
RR interval, QTCan4      QDC      93.07    90.38    78.90
RR interval, QTCan5      QDC      93.83    90.12    79.15
RR interval, QTCan6      ANN      91.11    91.96    79.17
RR interval, QTCan9      10-ANN   91.74    91.53    79.56
RR interval, QTCan10     10-ANN   91.80    91.62    80.00
RR interval, QTCan14     10-ANN   91.33    90.04    78.30
RR interval, QTCan15     10-ANN   91.14    92.01    82.08

Combining the QTCan features with the RR interval features enhances the sensitivity, specificity and predictability of AF detection. Similar sensitivity, specificity and predictability are achieved using QTCan4, QTCan5, QTCan9, QTCan10, QTCan14 and QTCan15. The combination of the features RR interval and QTCan15 performs slightly better, with sensitivity: 91.14%, specificity: 92.01% and predictability: 82.08%.
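The sensitivity and specificity figures reported in Tables 5.4, 5.5 and 5.7, and the choice of the cut-point on an ROC curve such as Fig. 5.3, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the window rule follows the description in section 5.1, and the optimal cut-point is assumed to be the ROC point closest to (0, 1). Predictability is omitted because its definition is not restated in this chapter.

```python
import math


def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) in percent for boolean AF labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # AF detected as AF
    fn = sum(1 for t, p in pairs if t and not p)      # AF missed
    tn = sum(1 for t, p in pairs if not t and not p)  # non-AF kept as non-AF
    fp = sum(1 for t, p in pairs if not t and p)      # non-AF flagged as AF
    se = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    sp = 100.0 * tn / (tn + fp) if tn + fp else 0.0
    return se, sp


def segment_decisions(beat_is_af, n_af, window=30):
    """Label each sliding window as AF when it holds more than n_af detected AF beats."""
    return [
        sum(1 for b in beat_is_af[start:start + window] if b) > n_af
        for start in range(len(beat_is_af) - window + 1)
    ]


def best_cut_point(roc_points):
    """Pick the threshold whose ROC point is closest to the ideal corner (0, 1).

    roc_points holds tuples (threshold, false_positive_rate, sensitivity),
    with both rates given as fractions between 0 and 1.
    """
    return min(roc_points, key=lambda p: math.hypot(p[1], 1.0 - p[2]))[0]
```

Sweeping `n_af` from 2 to 29, evaluating `sensitivity_specificity` on the resulting window decisions and passing the collected points to `best_cut_point` reproduces the kind of trade-off analysis shown in Fig. 5.3, with false positivity computed as one minus specificity.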
5.3 Discussion

5.3.1 Limitation of QRST Cancellation

It is well known that AF is associated with chaotic and rapid atrial activity. Atrial activity in the surface ECG is estimated by spectral analysis after separating the ventricular activity. Subtraction of the averaged QRST complex from each individual beat is the standard method for producing a signal which mainly contains atrial activity, the residual ECG signal. Ideally, the dominant energy lies in the frequency band extending from 4 to 10 Hz, since this band is identified as the atrial frequency in fibrillatory rhythm. Unfortunately, this is not always the case in reality. The reasons for the shifted peak frequency were investigated in order to enable a precise detection of fibrillatory waves in the future.

1. Invisible fibrillatory waves: Fibrillatory waves are often of very low amplitude or not apparent in the ECG data. For example, few fibrillatory waves are present in the ECG of Fig. 5.5. However, the heart beat variability of this episode is clearly higher than in the NSR data, thus it is labelled as AF.

Figure 5.5: The fibrillatory waves in this AF example (record 04015, MIT AF database) are not apparent.

2. Annotation: Sometimes chaotic atrial activity is observed in ECG data labelled as NSR (see Fig. 5.6). These chaotic waves result in a peak or high spectral energy in the frequency band 4 – 10 Hz. However, the ventricular activity is relatively regular, thus the episode is labelled as NSR.

3. Noise: Noise in the ECG signal causes considerable errors, e.g., false detection of QRS onset and offset, missed R peaks, and large distances calculated for pairs of QRS complexes, which may then be denoted as abnormal heart beats, as well as false QRS clustering, which depends on R peak detection and on QRS onset and offset. This noise produces more energy in the relevant frequency range than the fibrillatory activity, whose amplitude is much lower.
Filters cannot handle this problem satisfactorily. The noise lies inside the physiological bandwidth (0.5-50 Hz) and therefore cannot be eliminated by a low-pass filter with a cutoff frequency of 50 Hz. Fig. 5.7 shows an example of a noisy signal that still contains noise below 50 Hz after filtering, so that an accurate delineation of the QRS complexes is impossible.

Figure 5.6: NSR example (the 6th NSR record of the belt database) with chaotic atrial activity.

Figure 5.7: The ECG signal from record 06426 (MIT database) is polluted by noise, which causes considerable error in QRST cancellation.

4. Insufficient subtraction: The algorithm is based on a combined QRS and RR interval template. However, a variable P wave, QT interval and T wave may require separate templates, which are not considered in this study. In some data the P wave, QT interval and T wave change dynamically and frequently, which affects the precision of the technique. Uniform templates of the P wave, QT interval and T wave cannot match the signal well and lead to inefficient subtraction of the P wave and T wave. The residues of the QRS complex, T wave and P wave in the remainder increase the power spectrum in the relevant frequency band.

Figure 5.8: (a) One ECG example from NSR(1) of the belt data (see Table 5.2), shown as a blue solid line superimposed onto the template obtained with beat window I (red dashed line), together with the remainder and its power spectrum. (b) The same ECG example superimposed onto the template obtained with beat window II, together with the remainder and its power spectrum.

5.4 Beat window I and beat window II

The results on the belt database and the MIT database revealed that the features extracted with the template based on beat window II distinguish AF and NSR data better than those based on beat window I. Beat window I is defined as the interval from the current QRS onset to the subsequent QRS onset (from Qon(i) to Qon(i + 1)), see Fig. 4.15 (a).
In the second case (beat window II), the heart beat is counted from Pon(i) to Pon(i + 1), see Fig. 4.15 (a). Beat window I places the P wave at the end of the beat window, whereas beat window II places it at the beginning. With beat window I, the location of the P wave varies over a large range as a result of QT dispersion. Consequently, subtraction using beat window I cannot remove the P wave efficiently, because P waves at different locations are averaged. The P wave has a high-frequency component in the 4-10 Hz band (shown later in Fig. 5.10), which affects the specificity of the QRST cancellation algorithm. In contrast, the PR interval is relatively constant, so the template based on beat window II represents the P wave accurately in most heart beats.

Fig. 5.8 (a) shows one ECG example from the first NSR record of the belt data (see NSR(1) in Table 5.2) as a blue solid line superimposed onto the template based on beat window I (red dashed line), together with the remainder and its power spectrum. Fig. 5.8 (b) shows the results for beat window II. The P waves present in the remainder of Fig. 5.8 (a) produce a peak at 5.8 Hz in the power spectrum and thus a false detection of AF. In comparison, no P waves are observed in the remainder of Fig. 5.8 (b), and the peak location in Fig. 5.8 (b) is therefore outside the relevant frequency band.

5.5 QRS clustering Methods

Numerous clustering algorithms are available in the literature, and different QRS clustering methods were studied. Choosing the correct number of clusters is a challenge. Too few clusters result in QRS complexes with very different morphologies being grouped together, and thus in inaccurate templates. Too many clusters leave too few QRS complexes to train each cluster, which results in over-fitting.
One possible approach to unsupervised QRS clustering is to form a set of different templates by placing the first QRS complex in a first cluster and comparing the second QRS complex with the first template, which is derived by averaging the QRS complexes in the first cluster. If the second QRS complex conforms to the first template within a degree set by a threshold, it is placed in the first cluster; otherwise it forms a second cluster. In each iteration a template is updated when new beats are allocated to it. Each subsequent QRS complex is compared with the existing templates and is either grouped into the cluster with the closest distance or forms an entirely new cluster. Clusters containing only one or two heart beats are labelled as anomalous heart beats and excluded from further analysis. If too many heart beats fall into abnormal groups, the steps for establishing the cluster structure are repeated with an increased threshold. On short ECG blocks this algorithm yielded results similar to those of the algorithm described in Section 4.2.2. However, it is time-consuming and not suitable for a large database, so it is not incorporated in the AF detection framework.

Figure 5.9: (a) Ensemble of QRS complexes in the example of Fig. 4.9. (b)-(d) Three clusters created by K-means clustering. The number in the title of each panel is the cluster size.

A conventional clustering algorithm, K-means clustering, was also tested. It is one of the simplest unsupervised learning algorithms. K-means treats each observation in the data as an object with a location in space. It finds a partition in which the objects within each cluster are as close to each other as possible and as far from the objects in other clusters as possible. Each cluster in the partition is defined by its centroid. The centroid of a cluster is the point for which the sum of distances from all objects in that cluster is minimized.
K-means uses an iterative algorithm that minimizes the sum of distances from each object to its cluster centroid, over all clusters. The algorithm moves objects between clusters until this sum cannot be decreased further. It is composed of the following steps [47]:

1. Place K points into the space represented by the objects that are being clustered. These points represent the initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K centroids.
4. Repeat steps 2 and 3 until the centroids no longer move, which means that the sum of distances from each object to its cluster centroid is minimized.

Fig. 5.9 depicts the partitioning of the QRS complexes in the example (see Fig. 4.9) into three clusters created by K-means clustering. Note that the anomalous heart beats are not separated and fall into the loosely packed cluster 2. Furthermore, the objects in cluster 1 are not distinct from those in cluster 2. A further problem is particularly troublesome: K must be specified in advance, but we do not know into how many clusters the heart beats of an ECG block should be partitioned. Therefore, K-means clustering is not incorporated in the AF detection framework either.

5.6 Survey of QRST cancellation using appropriate templates

The study [33] proposed obtaining the fibrillatory frequency using a QRST cancellation algorithm with appropriate templates. The appropriate template is chosen by comparing each QRST interval with the mean duration over all heart beats. Each QRST interval is taken to extend from its own onset to the onset of the subsequent QRST interval. A heart beat is considered ectopic if its QRST duration exceeds 1.2 times the mean; otherwise it is considered normal. Based on this comparison, each QRST complex is added to and averaged into either a conducted (normal) template or an ectopic template.
Based on the result of this step, the signal processor selects the appropriate template and subtracts the normal or ectopic template from the corresponding heart beats. This technique does not take into account the variation in QRS morphology, which occurs very frequently. More than 10% of the ECG recordings could not be analyzed with this technique.

In [48], Hnatkova described an approach that computes separate templates for the QRS complex and the T wave and subtracts them separately, based on self-similarities of the ECG signal. Self-similarities between corresponding QRS complexes and between corresponding T waves were assessed using correlation coefficients. The procedure was performed in two steps: the first step established the template, and the second step estimated the similarity using correlation coefficients. The first QRS sample was identified with a predefined 200 ms window around the QRS fiducial point. The cross-correlation coefficient was computed within a segment of the ECG that followed or preceded the fiducial point by the same distance. The point and the value of the correlation coefficient were saved for each QRS sample. The pair of ECG samples giving the maximum correlation, provided it exceeded the predefined cut-off, was then selected as the basic template. If no basic template was found at the upper required correlation coefficient, the next ECG sample was tested. The second step of the algorithm correlated the basic template with the rest of the ECG signal, again using the method described above. The remaining unmatched ECG samples were tested repeatedly, with the required cross-correlation decreased in steps of 0.01 until the lower limit was reached. The actual content of each QRS sample was again used to update the template. In this way, a template was assigned to most of the ECG samples. The same method was used to compute the template for the T wave.
The T wave template was generated using a 450 ms window that started 100 ms after the QRS fiducial point. The study population consisted of 23 patients, but QRS and T wave subtraction was feasible in only 20 of them. Experience shows that recognition of the T wave requires a clean signal. In atrial fibrillation the signal is less clean because of the continuous atrial fibrillatory waves. For instance, it is difficult to delineate the T wave in Fig. ??, where the fibrillatory waves are superimposed on the T waves. A variable QT interval requires separate templates. This algorithm generated only the single most common template, which is not suitable under some circumstances, e.g., the AF example in Fig. 4.13.

Figure 5.10: Frequency spectrum of the entire P wave presented in 10-Hz frequency bands in patients and controls. PAF: paroxysmal atrial fibrillation. Adapted from [49].

At the beginning of this thesis work, only the templates of the QRS complex and the T wave were subtracted from the ECG signal, instead of the mean heart beat. As a result, some remainders of the NSR data exhibited high spectral energy in the 4-10 Hz frequency band, which did not allow AF to be differentiated from NSR. Fig. 5.10 depicts the high-frequency content from 4 to 10 Hz in the power spectrum of the entire P wave for PAF and NSR (area marked in yellow). Fig. 5.8 (a) shows the frequency spectrum of a remainder of NSR data in which P waves are present.

5.7 Results summary

Feature selection was performed both automatically, using a decision tree structure, and manually, by inspecting different scatter plots and statistical parameters such as the correlation matrix. Only two features were selected as input for the classifier.
We found that the automatic analysis delivered one feature from the R-R interval analysis (the fifth element of the Moody matrix, RR5 [18]) and one feature from the group of P wave template matching (the number of P waves found in a window of 30 beats, PtemMatch). In the manual selection we concentrated on parameters from the R-R group only, in order to obtain a computationally less expensive system (the first and sixth elements of the Moody matrix [18], RR1 and RR6). The results are summarized in Table 5.8.

Table 5.8: AF detection using the wearable belt system incorporated into an intelligent textile. The measures are expressed in %. Legend: Met.-classification method, V.Er.-validation error, Se.-Sensitivity, Sp.-Specificity, Pr.-Predictability.

Feature              Met.      V.Er.   Se.     Sp.     Pr.
RR1,PtemMat,QTCan    LDC       2.65    99.32   95.71   88.71
                     QDC       2.67    99.46   100     100
                     3-KNN     0.71    99.35   94.64   92.86
                     10-ANN    0.69    99.11   96.43   92.86
RR5,PtemMat          LDC       4.50    98.13   96.79   78.57
                     QDC       3.68    98.27   98.10   85.71
                     3-KNN     3.43    98.17   95.12   85.71
                     10-ANN    2.24    99.85   96.31   85.71
RR1,RR6              LDC       4.63    97.90   93.33   85.71
                     QDC       3.65    95.24   94.64   92.86
                     3-KNN     7.52    92.25   91.79   85.71
                     10-ANN    2.96    95.10   94.64   92.86

Figure 5.11: Comparison of decision curves, demonstrated on the R-R interval parameters.

In terms of the error on the validation data (beat-to-beat based detection), QDC yields the best results in both cases. After post-processing of the classifier output (see Fig. ??), the results are summarized using the sensitivity, specificity and predictability measures. The decision surface curves for the R-R features are depicted in Fig. 5.11.

Figure 5.12: Performance comparison using the ROC approach. The decision threshold NAF is the number of detected AF beats (NAF = 2 ... 29) in a sliding window of 30 beats. If the number of detected AF beats exceeds NAF, the segment is labelled as AF. OTO is the optimal trade-off between sensitivity and specificity. Legend: dashed: 3-KNN, solid: LDC, dotted: QDC, dash-dotted: 10-ANN.
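The post-processing rule given in the caption of Fig. 5.12 can be sketched in Python as follows. The sketch assumes that per-beat classifier outputs are available as 0/1 values and that the window advances one beat at a time; both are assumptions, as are all function names and the example operating points. The optimal trade-off is taken as the ROC point closest to (0, 1), which is how this section defines it.

```python
import math


def label_windows(beat_is_af, n_af, window=30):
    """Label each sliding window as AF (True) when it holds more than n_af AF beats.

    beat_is_af: sequence of 0/1 per-beat classifier decisions.
    n_af:       decision threshold on the number of AF beats per window.
    The one-beat step between windows is an assumption of this sketch.
    """
    labels = []
    for start in range(len(beat_is_af) - window + 1):
        af_beats = sum(beat_is_af[start:start + window])
        labels.append(af_beats > n_af)
    return labels


def optimal_trade_off(operating_points):
    """Pick the (n_af, sensitivity, specificity) entry closest to the ROC corner (0, 1).

    Sensitivity and specificity are fractions in [0, 1]; the corresponding
    ROC point is (1 - specificity, sensitivity).
    """
    return min(
        operating_points,
        key=lambda point: math.hypot(1.0 - point[2], 1.0 - point[1]),
    )


# Hypothetical operating points, for illustration only.
points = [(2, 0.99, 0.60), (15, 0.95, 0.94), (29, 0.55, 0.99)]
best_n_af, best_se, best_sp = optimal_trade_off(points)
```

In practice each operating point would come from running `label_windows` with one value of the threshold and comparing the resulting labels with the reference annotation.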
Since the classifier results are further post-processed in a sliding window, the design parameter that determines the final classification of a segment can be used to obtain a receiver operating characteristic (ROC) curve. This threshold determines whether the segment is AF or not (see Fig. 3.10). By tuning the threshold parameter in the fixed window of 30 beats, the optimal trade-off between sensitivity and specificity can be found (see Fig. 5.12). The optimal trade-off is defined as the minimum distance between the point (0,1) and the points on the ROC curve. The best results in terms of sensitivity are achieved by combining the three features RR5, PtemMat and QTCan (PR ratio in the 5-10 Hz frequency band) with the ANN classifier. Finally, Fig. 5.13 and Fig. 5.14 show the density estimation of the underlying data. The two densities overlap slightly, resulting in a final error of 3.6% on the validation data set.

Figure 5.13: Bayes density estimation of the underlying data. The covariance matrices are not equal, which results in a quadratic discriminant function.

Figure 5.14: Density estimation in three-dimensional space.

For the MIT database, the results are given in Table 5.9. The numbers are not as high as for the belt data because of the greater presence of noise and artifacts in the Holter recordings that were used to create the MIT AF Database.

Table 5.9: AF detection using the MIT AF Database. The measures are expressed in %. Legend: Met.-classification method, V.Er.-validation error, Se.-Sensitivity, Sp.-Specificity, Pr.-Predictability.

Feature              Met.      V.Er.   Se.     Sp.
RR5,PtemMat,QTCan    LDC       4.32    90.00   84.68
                     QDC       3.61    93.83   90.12
                     3-KNN     NA      NA      NA
                     10-ANN    2.12    91.45   92.01
RR5,PtemMat          LDC       4.50    89.23   86.12
                     QDC       3.68    91.15   87.96
                     3-KNN     NA      NA      NA
                     10-ANN    2.24    91.65   87.47

The same comparison procedure was carried out as for the belt database. In Fig. 5.15 the four classifiers are compared. The best separation of both classes is again achieved by the QDC classifier.
To better visualize its statistical characteristics, the density function is shown in two dimensions in the form of contours in Fig. 5.16 and in its three-dimensional counterpart in Fig. 5.17.

Figure 5.15: Comparison of decision curves, demonstrated on the R-R interval and P wave template matching parameters.

Figure 5.16: Bayes density estimation of the underlying data. The covariance matrices are not equal, which results in a quadratic discriminant function (MIT AF Database).

Figure 5.17: Density estimation in three-dimensional space (MIT AF Database).

6 Summary and Perspective

AF is a common arrhythmia arising from a disturbance of the initiation and propagation of the excitation pattern in atrial tissue. In the ECG, AF is indicated by an irregular ventricular rhythm, the absence of consistent P waves and the presence of random fibrillatory waves. An automatic method for the detection of atrial fibrillation has been developed that differentiates ECG signals with AF from those with NSR. The QRST cancellation algorithm was evaluated on two databases: the belt database, collected from 8 chronic AF patients and 6 healthy subjects at rest using a chest belt with dry electrodes yielding a 1-lead ECG; and 23 long-term ECGs of paroxysmal AF patients, including AF episodes and NSR episodes, from the MIT-BIH atrial fibrillation database.

Starting from the straightforward averaging algorithm, the QRST cancellation algorithm was improved to generate appropriate templates. The algorithm combines QRS clustering and RR interval classification, so that the template represents the individual heart beat more accurately. It comprises four major steps.

• First, the ECG signals were preprocessed, and only the region between 0.5 and 50 Hz was kept for analysis to avoid contributions from high-frequency noise, baseline wander and power-line interference. Fiducial points were detected using our existing processing system.
• In a second step, all beats were divided into classes of heart beats with similar QRS morphology and ventricular cycle length using an unsupervised approach. Abnormal heart beats were excluded from further analysis. For each class of normal heart beats, an averaged heart beat was calculated as the appropriate template. Two definitions of the heart beat window were tested.

• The third step was the subtraction of the corresponding averaged heart beats from the original sequence, after alignment with each QRS complex, in order to remove the regular ventricular signal parts.

• Finally, the residual signals were subjected to a Fast Fourier Transform to analyze the power spectrum.

Following the survey on the frequency spectrum of AF, the features that reflect the electrophysiological changes manifest in the power spectrum of the AF remainder were extracted in a sliding window of 30 beats. The results revealed that the energy and the spectral peak of the remainders are concentrated in the 4-10 Hz or 5-10 Hz frequency band for chronic AF patients, whereas the remainder of NSR data has a lower-frequency component.

The features of power percentage and peak location were passed to various classifiers for pattern recognition, which classify the ECG into two classes: AF present and AF absent. The outputs of the classifiers are post-processed to give an optimal trade-off between sensitivity and specificity. Finally, the accuracy of the classification decision was measured by statistical parameters. The best results on the belt data were sensitivity 99.46%, specificity 100.00% and predictability 100.00%. The best result on the MIT data is sensitivity 91.14%, specificity 92.01% and predictability 82.08%.

In this study a QRST cancellation algorithm that accounts for rapid variations in QRS morphology and RR interval was developed and validated on two databases.
The results show that this method performs considerably better than the straightforward averaging method. Applying the QRST cancellation features improved the differentiation of AF from other non-AF irregular rhythms compared with using the RR interval series alone. A method for automatic AF detection using simple features together with LDC, QDC, KNN and ANN classifiers has been presented. The system works with high sensitivity, specificity and predictability.

There is some room to improve the QRST cancellation algorithm. Supervised clustering methods, e.g., an artificial neural network, can be tested. The variation in the P wave, T wave and QT interval should be taken into consideration in further work. The residues of the P wave and T wave contribute high spectral power in the relevant frequency band, which affects the sensitivity and specificity of the algorithm. For the classifiers, a hierarchical structure could be established in which the features are combined in a weighted fashion. The RR interval, which plays an important role in differentiating between AF and other irregular rhythms, should be considered at the first level. In the future, this approach would lead to an AF detection system with personalized feature extraction and personalized classifiers for individual recordings.

A Abbreviations and Acronyms

AF  = atrial fibrillation
AP  = action potential
APD = action potential duration
APG = right appendage
AV  = atrioventricular
AVR = atrioventricular ring
bpm = beats per minute
CHF = congestive heart failure
CS  = coronary sinus
CT  = crista terminalis
ERP = effective refractory period
JSR = junctional sarcoplasmic reticulum
NSR = network sarcoplasmic reticulum
MDP = maximum diastolic potential
SR  = sinus rhythm
WM  = working myocardium

B COOKING BOOK FOR AF DETECTION TOOLBOX

I. Configuration
1. Open Config AFDetection.m and AFConfig.m and edit the paths where your data and your source code are saved.
2. Call Config AFDetection.m to install the toolbox.

II. Segmentation (R and P wave peak detection)
Both databases
1. Open SegmentationSignal.m
   a) Choose the database, e.g. Database='MIT';
   b) Uncomment the section for the MIT or OWN database.
   c) The files will be saved to D:\AtrialFibrillationData\ , e.g. Segmentationnor_a_050725_1.mat

III. Noise Level Detection
For the OWN database, an algorithm for determining noisy sections was developed. Just start DetectNoise.m.

IV. Feature Extraction
MIT Data - Small Training & Test set = validation set
1. P wave template selection
   a) Suitable P waves were selected for each signal in SelectPWaveforMITDat.m
   b) First, the segment in which P waves appear is defined for each signal using the variables
        NPwave_Segment_lengthTEMP=2;
        StartNormalTemp=60+Sampling_Rate*60*1;
        EndNormalTemp=StartNormalTemp+Sampling_Rate*60*NPwave_Segment_lengthTEMP;
   c) Then a longer segment is defined for evaluating the selection using ROC analysis
        NPwave_Segment_length=20;
        StartNormal=EndNormalTemp;
        EndNormal=StartNormal+Sampling_Rate*60*NPwave_Segment_length;
   d) E.g., the following file will be generated: PwaveTemMat04015_1.mat
2. Open AFProcessingBatch.m
3. Choose the channel Channel=1; and the functionality FUNCTION_BATCH='ExtractFeaturesSignal';
4. In ExtractFeaturesSignal.m, the representative segments of NSR (normal sinus rhythm) and AF data are defined for each signal. For example, the following file will be produced: FeatExtract04936_1.mat.
5. If you want to change the training set, you must redefine the segments in ExtractFeaturesSignal.m, e.g.
        case '04015';
        %Very few data of AF - no possibility of merging training and validating set
        MERGE=0;
        %Train-Normal-BAD example:StartNormalTrain=30
        NPwave_Segment_lengthNormal=3;
        StartNormalTrain=166857+Sampling_Rate*60*21; %AnnotECG.Beg_Seg_N_All(4)
        EndNormalTrain=StartNormalTrain+Sampling_Rate*60*NPwave_Segment_lengthNormal;
        %Train-AF
        NPwave_Segment_lengthAF=2;
        StartAFTrain=133348; %AnnotECG.StartAF(3);
        EndAFTrain=StartAFTrain+Sampling_Rate*60*NPwave_Segment_lengthAF;
        %Validate-Normal
        NPwave_Segment_lengthNormal=3;
        StartNormalTest=166857+Sampling_Rate*60*25; %AnnotECG.Beg_Seg_N_All(4)
        EndNormalTest=StartNormalTest+Sampling_Rate*60*NPwave_Segment_lengthNormal;
        %Validate-AF
        NPwave_Segment_lengthAF=2;
        StartAFTest=133348; %AnnotECG.StartAF(3);
        EndAFTest=StartAFTest+Sampling_Rate*60*NPwave_Segment_lengthAF;
   Note (MH): NPwave_Segment_lengthNormal actually means Segment_lengthNormal, and has nothing to do with the normal P-wave.
6. The validation set must be created by merging the already extracted feature files, such as FeatExtract04936_1.mat. This is done by calling ComposeTrainingSet.m. You can specify which records should contribute to the validation set by filling the variable DataID, e.g.,
        DataID={'04015';'04746';'04908';'04936';'06995';'07879';'08219';'08378'};
   The following file will be generated: FeaturesTrainingValidationAll1.mat

MIT Data - BIG Testing set
1. Open AFProcessingBatch.m
2. Choose the channel Channel=1; and the functionality FUNCTION_BATCH='ExtractFeaturesWhole';
3. The features will be extracted for the whole signal. Note that this is a time-consuming task; it normally takes about one day.

OWN Data - The whole data set
1. P wave template selection
   a) No patients suffering from paroxysmal AF are present. Therefore the P wave template was selected by trial and error for all the NSR records using SelectPwaveOwn.m
   b) The following file will be generated: PwaveTemMatOwn.mat
2. Open AFProcessingBatchOwn.m
3. Create the annotation using the functionality FUNCTION_BATCH='CreateAnnotationOwn';
4. Choose the functionality FUNCTION_BATCH='ExtractFeaturesSignal';
5. In ExtractFeaturesSignalOwn.m, the representative segments of NSR and AF data are defined for each signal. For example, the following file will be produced: FeatExtractOwnFullnor_a_050725resampleTrainNormal.mat
6. If you want to change the training set, you must redefine the segments in ExtractFeaturesSignalOwn.m, e.g.
        case 'ned_r_070440resample';
        WhichLabel='Normal';
        WhichSet='Train';
        Start=84397;
        End=179931; %6.4m
7. The OWN database is quite small, which means that the training set and the testing set are equal to the sets of extracted features for the whole database. To compute them, open AFProcessingBatchOwn.m
8. Choose the functionality FUNCTION_BATCH='CreateTrainingSet';
9. The following files will be generated: ValidationSetOwnFullNormal.mat and ValidationSetOwnFullNormal.mat

V. Classifier Training
For both databases
1. The training is done using PrToolsTrain.m. Note that the pattern recognition toolbox is used for training.
   a) Choose the appropriate database, e.g. DatabaseAF='Own';
   b) Choose the features to be used, e.g. FS=[1 6];
   c) Choose the classifiers, e.g. W1 = qdc(C);
   d) You can also enable scatter plots with SCATTERPLOT=1;
   e) Or automatic feature selection with FEATURESELECTION=1;

VI. Classifier Evaluation
This is quite easy because a GUI was developed. Just start SignalEvaluation.m

MH Notes:
Most of the training and testing optimisation is carried out on a small subset of the MIT AF database (called 'Signal'; the whole database is called 'Whole'). This small subset is divided into two parts, called training and test. The p-wave template determination is carried out on this small subset of the MIT AF database. 10 p-waves are selected randomly from the training part of this small subset and tested on a fixed set of p-waves from the test part of this subset. This process is repeated until the template matches the test p-waves well.

1. AFProcessingBatch.m carries out all sorts of batch functions, selected by FUNCTION_BATCH.
2. FUNCTION_BATCH='ExtractFeaturesSignal' generates the features (using the p-wave templates just generated, among other things) for the 'Signal' part of the database.
3. ExtractFeaturesSignal.m extracts the features for the hand-found start and end points of the NSR (normal sinus rhythm) segments, which are hard coded in the .m file.
4. NPwave_Segment_lengthAF actually means Segment_lengthAF, and has nothing to do with the normal P-wave.
5. The start and end times of the normal and AF segments were found by hand (using ReadOwn.m, which uses eegplot.m).
        StartNormalTrain=166857+Sampling_Rate*60*21;
   The 166857 is from the annotation file (the starting point of a normal segment). It was found that this start was noisy, and the first 21 minutes were skipped.
6. If you have path problems, you can call Config AFDetection.m. This updates the Matlab path variable.
7. Various paths are stored in AFConfig.m. This defines various DataPath variables.
   SelectPWaveOwn.m uses the p-waves from certain patients to generate a p-wave template. It was found that it was better to use only one patient to generate this template rather than all the patients. For many of the patients, the p-wave boundaries were poorly determined, resulting in a poor template. This could be improved by improving (or tuning) the p-wave detection algorithm. The best p-wave boundary detection occurred with patient ned_r_070440resample, and this patient was used to generate the p-wave template.
8. Need to add the annotation directory for our data.
9. To create the annotation, it was necessary to change if 0, if 1 at various places in the code (line 200 in ExtractFeaturesSignalOwn.m). This has now been changed to CreateAnnotationOwn.m.
10. a) For our data set, the full feature matrix was created using FUNCTION_BATCH=ExtractFeaturesSignal. From this matrix, the training and test set feature matrix was extracted in CreateTrainingSet.m.
    b) For the MIT DB, creating the full feature matrix for the whole data set is time-consuming (it can take 1-2 days). For this reason, a validation set was created (a small training and test set) for which the features can be computed quickly. When the features work well on the validation set, the full feature matrix for the whole data set can be computed. In theory, the validation set matrix can be extracted from the whole data set matrix.
    The validation set feature matrix was created using FUNCTION_BATCH=ExtractFeaturesSignal. The whole feature matrix was created using FUNCTION_BATCH=ExtractFeaturesWhole.
11. In PrToolsTrain, A is a structure containing everything necessary for the pattern recognition toolbox. A can be examined using struct(A).
12. In PrToolsTrain.m, the validation training set is divided into a sub training set (30%, chosen randomly using the PR Toolbox command [C,D] = gendat(A,0.3);) and a sub test set. The sub test set is used to evaluate the classifier trained using the sub training set. This process can be repeated several times, giving different classifiers each time (trained from different sub training sets, with different initial conditions) and evaluated on different sub test sets. This can be repeated until a satisfactory classifier is obtained.
13. TestWholeSignalDiffClassifier.m tests the newly generated classifier on the test part of the validation set.

C The M-file structure of AF detection using MIT [1] and OWN database

*NUBU - Abbreviation of the NotUsedButUseful directory. All files that are not currently used were placed under those directories.

• AFProcessingBatch.m - Segmentation, feature extraction and testing for the whole AF database
• AFProcessingBatchOwn.m - Feature extraction and testing for the whole OWN database

DatabaseTools
• SegmentationSignal.m - Performs signal segmentation.
  – SegmentationPeace.m - Pan-Tompkins algorithm implementation [50], pp. 187.
    . rdsign212.dll - Reads data in 212 format; the source code is in rdsign212.c
    . readheader.m - Reads the MIT header (ReadMITData.m - reads data in 212 format)
• ReadOwn.m - Reads OWN Belt data and allows manual selection of segments without noise

FeatureExtraction
• SelectPWaveforMITDat.m - Selects a suitable representative P wave, which is a necessary step. It evaluates the selection using template matching with the correlation coefficient and dynamic time warping as measures.
  This function can also be viewed as a training procedure for suitable P wave selection when ONLY the template matching is used for AF detection.
  – ReadAnnot.m - Reads the annotations of the AF database (the channel is also selected here!)
  – SelectPwaveTemplate_Method.m - Performs template selection
    . ReadSegData.m - Reads AF data into the first n blocks and performs segmentation
    . TemplateComputation.m - Selects the P wave template
    . CorrCoefCalc.m - Computes the correlation coefficient [50], pp. 95.
    . Dtw.m - Computes the dynamic time warping measure
• RRIntervalTraining.m - Computes the Moody matrix [18]. However, during testing we use the matrix provided in the Moody article.
• SelectPWaveOwn.m - Selects a COMMON P wave TEMPLATE from the normal sinus ECG signals of our OWN database.
• ExtractFeaturesSignal.m - Extracts features from the signal. It is repeated several times by calling ExtractFeatures.m to obtain the training and testing set (validation set) for the classifier.
  – ComputeMatrix.m - RR interval matrix [18].
  – entropy.m - Computes the Shannon entropy
  – sdann.m - Computes the RR time parameters SDANN and SDNNind (index)
  – hrtach.m - Constructs evenly sampled RR intervals by interpolating the RR intervals for further frequency analysis
  – hrpowsfp.m - Computes the FFT spectrum in different frequency bands
  – QTCancellationInt.m - QT calculation and feature calculation [21].
  – P wave matching
    . time_normalization.m - Normalization of beats to the same length.
    . TraceSegmentation.m - The normalization method is polygonal approximation
    . CorrCoefCalcTestInt.m - Template matching using the correlation coefficient or DTW.

ClassifierTraining
• ComposeTrainingSet.m - Creates the training set for the classifier.
• PrToolsTrain.m - Training and testing using the Pattern Recognition toolbox from Delft University.

ClassifierTesting
• TestWholeSignalDiffClassifier.m - Classifier TESTING for the MIT AF & OWN databases
  – ComputeClassifierROCMITAF.m - ROC analysis over the whole AF and OWN Belt database. Classifiers can be compared by loading the ROC curve from the ClassName ROC.mat file.
  – EvaluationGraph.m - Results evaluation
    . Plotgui.m - Prints out a graph containing the RR intervals and the detected AF
• PrintClassifierAFDetectionResultsROC.m - Prints the results of classifier testing for a particular record.
• NUBU\NeuralNetworkTrain.m - ANN training [51]. It uses the Neural Network Matlab toolbox.
• NUBU\NeuralNetworkTestValidationData.m - Because testing one signal takes several hours, we first test the ANN performance on a smaller set of data. Using this set, the best feature combination and ANN topology are determined, and the whole signals are then tested using NeuralNetworkBatch.m.
  – NUBU\NeuralNetworkBatch.m - ANN testing
  – NUBU\NeuralNetworkTestWholeSignalPCA.m - All features selected by PCA analysis
  – NUBU\NeuralNetworkTestWholeSignalRRP.m - Only the RR matrix and P wave template matching
  – NUBU\NeuralNetworkTestWholeSignalRR.m - Only the RR matrix as feature

Personalization - so far only for the NN Mathworks toolbox
The main goal is to develop an algorithm adapted to each patient. Most m-files are the same for the common approach (training over the whole database) and the personalized approach. P wave matching is already a kind of personalization. The main idea is to have an adapted NN for each patient.
Called by AFProcessingBatch.m:

• NUBU\NeuralNetworkTrainPersonalize.m - Training
• NUBU\NeuralNetworkTestWholeSignalRRPersonalize.m - Testing for RR feature
• NUBU\NeuralNetworkTestWholeSignalRRPPersonalize.m - Testing for RR+P feature
• NUBU\NeuralNetworkTestWholeSignalPCAPersonalize.m - Testing for RR+P+QT feature
• NUBU\PrintMITAFDetectionResultsROCPersonalize.m - Results evaluation using ROC curve

File structure of AF detection - AtrialFibrillation directory

• AFDetectionDemos - Demos for the AF/OWN Belt database
• ClassifierTesting - Classifier evaluation with the help of the ROC curve
• ClassifierTraining - Classifier training using the PRTools toolbox
• DatabaseTools - Reading and segmentation of database records
• FeatureExtraction - RR interval, P wave template matching, QT cancellation
  – FeatureExtraction\FeatExtUtils - Support files for extraction

File structure of MIT-AF [1] database detection

• D:\AtrialFibrillation - The above-mentioned m-files are placed in this directory
• D:\AtrialFibrillation\Doc - Contains Excel tables of results
• D:\AtrialFibrillation\help - Contains help for the AF Detection Toolbox
• AtrialFibrillationData\!!AfExtract\FeaturesTrainingbyHong - Feature extraction of both channels for training, computed by ExtractFeaturesSignal.m (45 features in a window of 30 beats)
• AtrialFibrillationData\!!Af\ExtractWholeSignalbyHong - Feature extraction for the whole signals, computed by AFProcessingBatch.m (FUNCTION_BATCH=’ExtractFeaturesWhole’)
• AtrialFibrillationData\!!Af\ValidationSet - Validation set for classifier training, created by ComposeTrainingSet.m
• AtrialFibrillationData\!!Af\WEKA - Conversion of the validation set for the WEKA system
• AtrialFibrillationData\!!Af\Results - Results of AF detection, computed by TestWholeSignalDiffClassifier.m
• AtrialFibrillationData\!!Af\Results\ANNxxx - Results of AF detection for both channels and the personalized approach, computed by NeuralNetworkBatch.m
• AtrialFibrillationData\!!Af\PwaveTemMat - P wave templates for both channels, computed by SelectPWaveforMITDat.m
• AtrialFibrillationData\!!Af\Segmentation - Segmentation for both channels, computed by SegmentationSignal.m

Bibliography

[1] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, “PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals,” Circulation, vol. 101, no. 23, pp. e215–e220, 2000.
[2] D. S. R. S. D. Bialy, M. Lehmann and M. Meissner, “Hospitalization for arrhythmias in the United States: importance of atrial fibrillation,” J Am Coll Cardiol, vol. 19 (Suppl. A), no. 41 A, pp. 612–627, 1992.
[3] J. Robbins and G. W. Dorn, “Listening for hoof beats in heart beats,” Nature Medicine, vol. 6, pp. 968–970, 2000.
[4] F. B. Sachse, “Modeling of the mammalian heart,” 2002. Habilitationsschrift at Institut für Biomedizinische Technik, Universität Karlsruhe (TH).
[5] S. Nattel, “New ideas about atrial fibrillation 50 years on,” Nature, vol. 415, pp. 219–226, 2002.
[6] J. Waktare, “Cardiology patient page. Atrial fibrillation,” Circulation, vol. 106(1), pp. 14–16, Jul. 2002.
[7] A. S. Levy, J. Camm, and S. Saksena, “International consensus on nomenclature and classification of atrial fibrillation: A collaborative of the Working Group on Arrhythmias and the Work Group of Cardiac Pacing of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology,” Cardiovasc. Electrophysiol., vol. 14, pp. 443–445, 2003.
[8] J. Malmivuo and R. Plonsey, Bioelectromagnetism, ch. 15: 12-lead ECG system. New York: Oxford University Press, 1995.
[9] “Activity b13: EKG demo (ecg sensor).” http://www.clarion.edu/eduhumn/science education/biologylabs/B13%20EKG%20Demo.doc.
[10] D. Novak, “Processing of ECG signal using wavelets,” 2000.
Master's thesis, Faculty of Electrical Engineering, Department of Cybernetics at Czech Technical University in Prague.
[11] “The EKG waveform.” http://sprojects.mmi.mcgill.ca/cardiophysio/EKGPRinterval.htm.
[12] PhysioBank, “The MIT-BIH atrial arrhythmia database.” http://www.physionet.org/physiobank/database/mitdb/.
[13] PhysioBank, “The MIT-BIH atrial fibrillation database.” http://www.physionet.org/physiobank/database/afdb/.
[14] J. Muhlsteff, O. Such, and R. Schmidt, “Wearable approach for continuous ECG- and activity patient-monitoring,” in 26th Annual International Conference of the IEEE EMBS, vol. 1, pp. 2184–2187, 2004.
[15] D. Donoho, “De-noising by soft-thresholding,” IEEE Trans Inform Theory, vol. 41, no. 3, pp. 612–627, 1995.
[16] J. Pan and W. Tompkins, “A real-time QRS detection algorithm,” IEEE Transactions on Biomedical Engineering, vol. 32, pp. 230–236, 1985.
[17] R. M. Rangayyan, ch. 4: Event Detection, pp. 466–471. IEEE Press Series on Biomedical Engineering, 2002.
[18] G. B. Moody and R. G. Mark, “A new method for detecting atrial fibrillation using R-R intervals,” in Computers in Cardiology, vol. 10, pp. 227–230, 1983.
[19] F. Beckers, D. Ramaekers, and A. A.E, “Approximate entropy of heart rate variability: Validation of methods and application in heart failure,” Cardiovascular Engineering: An International Journal, vol. 1, no. 4, pp. 177–182, 2001.
[20] C. Peng, S. Havlin, H. Stanley, and G. A.L., “Quantification of scaling exponents and crossover phenomena in nonstationary heartbeat time series,” Chaos, vol. 5, pp. 82–87, 1995.
[21] J. Slocum, A. Sahakian, and S. Swiryn, “Diagnosis of atrial fibrillation from surface electrocardiograms based on computer-detected atrial activity,” Journal of Electrocardiology, vol. 25(1), pp. 1–8, January 1992.
[22] J. Pan and W. Tompkins, “A real-time QRS detection algorithm,” IEEE Transactions on Biomedical Engineering, vol. 32, pp. 230–236, 1985.
[23] I. H. Witten and E.
Frank, Data Mining. Morgan Kaufmann, 1999.
[24] “Lecture 12: Classification.” http://research.cs.tamu.edu/prism/lectures/iss/iss l12.pdf.
[25] R. O. Duda, P. E. Hart, and D. G. Stork, ch. Bayes Decision Rule, pp. 20–83. New York: Wiley-Interscience, 2001.
[26] A. Maren and C. Harston, Handbook of Neural Computing Application, ch. 1: Introduction to Neural Network. 24-28 Oval Road, London: Academic Press, 1990.
[27] H. Demuth and M. Beale, “Chapter 2: Neuron model and network architectures,” in Documentation of neural network toolbox: For user with MATLAB, vol. 4, pp. 2-1–2-32, 2002.
[28] D. McAuley, “The backpropagation network: Learning by example,” 1997. http://www2.psy.uq.edu.au/~brainwav/Manual/BackProp.html.
[29] H. Demuth and M. Beale, “Chapter 5: Backpropagation,” in Documentation of neural network toolbox: For user with MATLAB, vol. 4, pp. 5-1–5-74, 2002.
[30] S. Aksoy, “Non-bayesian classifiers part I: k-nearest neighbor classifier and distance functions.” http://www.cs.bilkent.edu.tr/~saksoy/courses/cs551/slides/cs551 nonbayesian1.pdf.
[31] R. M. Rangayyan, ch. 9: Pattern Classification and Diagnostic Decision, pp. 466–471. IEEE Press Series on Biomedical Engineering, 2002.
[32] S. Shkurovich, A. V. Sahakian, and S. Swiryn, “Detection of atrial activity from high-voltage leads of implantable ventricular defibrillators using a cancellation technique,” IEEE Transactions on Biomedical Engineering, vol. 45(2), pp. 229–234, Feb. 1998.
[33] A. Bollmann, N. K. Kanuru, and K. K. McTeague, “Frequency analysis of human atrial fibrillation using the surface electrocardiogram and its response to ibutilide,” The American Journal of Cardiology, vol. 81, pp. 1439–1445, June 15, 1998.
[34] P. A. Lynn, “Online digital filters for biological signals: some fast designs for a small computer,” Med. & Biol. Eng. & Comput., vol. 15, pp. 534–540, 1977.
[35] E. C. Ifeachor and B. W.
Jervis, Digital Signal Processing: A Practical Approach, ch. 6: Finite impulse response (FIR) filter design. Edinburgh Gate, Harlow, England: Addison-Wesley, 1993.
[36] A. Bollmann, K. Sonne, and H. Esperer, “Non-invasive monitoring of spontaneous and antiarrhythmic drug induced changes in fibrillatory frequency in human atrial fibrillation,” Cardiovasc Res., vol. 44, pp. 60–66, 1999.
[37] A. Bollmann, K. Sonne, and H. Esperer, “Non-invasive assessment of fibrillatory activity in patients with paroxysmal and persistent atrial fibrillation using the Holter ECG,” IEEE Computers in Cardiology, vol. 26, pp. 695–698, 1999.
[38] M. Holm, S. Pehrson, and M. Ingemansson, “Non-invasive assessment of the atrial cycle length during atrial fibrillation in man: introducing, validating and illustrating a new ECG method,” Cardiovasc Res., vol. 38, pp. 69–81, 1998.
[39] S. Pehrson, M. Holm, and C. Meurling, “Non-invasive assessment of magnitude and dispersion of atrial cycle length during chronic atrial fibrillation in man,” European Heart Journal, vol. 19, pp. 1836–1844, 1998.
[40] J. P, H. M, and S. DC, “A focal source of atrial fibrillation treated by discrete radiofrequency ablation,” Circulation, vol. 95, pp. 527–526, 1997.
[41] D. Raine, P. Langley, and A. Murray, “Surface atrial frequency analysis in patients with atrial fibrillation,” J Cardiovasc Electrophysiol., vol. 16(8), pp. 838–844, 2004.
[42] Y. Asono, J. Saito, and K. Matsumoto, “On the mechanism of termination and perpetuation of atrial fibrillation,” Am J Cardiol., vol. 69, pp. 1033–1038, 1992.
[43] “Cleveland clinic heart center.” http://www.clevelandclinic.org/heartcenter/pub/guide/disease/e
[44] A. Bollmann, D. Husser, and M. Stridh, “Frequency measures obtained from the surface electrocardiogram in atrial fibrillation research and clinical decision-making,” J Cardiovasc Electrophysiol., vol. 14(10 Suppl), pp. 154–161, 2003.
[45] U. Kiencke and H. Jaekel, Signale und Systeme, ch.
5: Zeitdiskrete Signale. München: Oldenbourg Verlag, 2002.
[46] E. C. Ifeachor and B. W. Jervis, Digital Signal Processing: A Practical Approach, ch. 2: Discrete Transform. Edinburgh Gate, Harlow, England: Addison-Wesley, 1993.
[47] “A tutorial on clustering algorithm.” http://www.elet.polimi.it/upload/matteucc/Clustering/tuto
[48] K. Hnatkova, J. Waktare, and C. Meurling, “A computer package generating non-invasive atrial electrograms: Detection and subtraction of QRS and T wave,” Computers in Cardiology, vol. 25, pp. 533–536, 1998.
[49] R. P. Stafford, P. Denbigh, “Frequency analysis of the P wave: comparative techniques,” Pacing Clin Electrophysiol., vol. 18(2), pp. 261–270, Feb. 1995.
[50] R. M., Biomedical Signal Analysis. John Wiley & Sons, 2000.
[51] S. Artis, G. Moody, and R. Mark, “Detection of atrial fibrillation using artificial neural networks,” in Computers in Cardiology, vol. 14, pp. 173–176, 1991.