PAF Prediction Challenge (300 records)

This database of two-channel ECG recordings was created for the Computers in Cardiology Challenge 2001, an open competition to develop automated methods for predicting paroxysmal atrial fibrillation (PAF). See the challenge announcement for information about the competition, and see Predicting Onset of Atrial Fibrillation for a brief overview of the clinical problem, its significance, and suggestions for further reading.

The database is divided into a learning set (records with names of the form n* and p*) and a test set (records with names of the form t*).


The learning set consists of 50 record sets. Each record set contains two 30-minute records with consecutive record names (e.g., p15 and p16), and two 5-minute "continuation" records with names ending in c (e.g., p15c and p16c). All four records in each record set are excerpts of longer continuous ECG recordings of a single subject; the 50 record sets come from 48 different subjects.
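The naming scheme above can be sketched in Python as follows. This is an illustrative sketch, not part of the database tools: the function names parse_record_name and learning_record_set are hypothetical, two-digit zero padding is assumed from the example names, and the odd-then-even ordering within a pair (stated in this description only for the p records) is assumed to hold for the n records as well.

```python
import re
from typing import List, NamedTuple

# Assumed pattern: one letter (n, p, or t), a record number, optional trailing "c".
_NAME_RE = re.compile(r"^([npt])(\d+)(c?)$")


class RecordName(NamedTuple):
    group: str             # "n", "p", or "t"
    number: int            # numeric part of the record name
    is_continuation: bool  # True for names ending in "c"


def parse_record_name(name: str) -> RecordName:
    """Split a record name such as "p15c" into its parts."""
    m = _NAME_RE.match(name)
    if m is None or int(m.group(2)) < 1:
        raise ValueError(f"unrecognized record name: {name!r}")
    return RecordName(m.group(1), int(m.group(2)), m.group(3) == "c")


def learning_record_set(name: str) -> List[str]:
    """Return the four record names in the learning-set record set containing `name`."""
    rec = parse_record_name(name)
    if rec.group not in ("n", "p"):
        raise ValueError("only learning-set records (n*, p*) form four-record sets")
    # Assumption: each pair starts on an odd number (e.g., p15 and p16).
    first = rec.number if rec.number % 2 == 1 else rec.number - 1
    # Assumption: record numbers are zero-padded to two digits.
    pair = [f"{rec.group}{n:02d}" for n in (first, first + 1)]
    return pair + [p + "c" for p in pair]
```

For example, learning_record_set("p16c") would give the two 30-minute records followed by their continuation records.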

The records with names beginning with p come from subjects who have PAF. The second (even-numbered) record in each pair of 30-minute records contains the ECG immediately preceding an episode of PAF, which can be verified by examining the like-numbered continuation record. Thus, for example, record p16 immediately precedes the episode of PAF in record p16c. The first (odd-numbered) record of the set (for example, record p15) contains 30 minutes of the ECG during a period that is distant from any episode of PAF: there is no PAF during the 45 minutes before the beginning or after the end of the 30-minute record. The corresponding 5-minute continuation record (e.g., record p15c) shows that at least the minutes immediately following the "PAF-distant" record do not contain PAF. Note: a few of the 30-minute records in this group may contain very short bursts of PAF that escaped notice while the learning set was being compiled.
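As a rough illustration of the odd/even convention, the following Python sketch assigns a label to a 30-minute learning-set record from its name alone. The function name paf_label and the label strings are hypothetical, chosen here for illustration. Because a few records may contain unnoticed short bursts of PAF, labels derived this way should be treated as nominal.

```python
def paf_label(name: str) -> str:
    """Label a 30-minute learning-set record (e.g., "p16") from its name.

    Continuation records (names ending in "c") and test-set records
    are rejected, since this convention does not label them.
    """
    group, digits = name[:1], name[1:]
    if group not in ("n", "p") or not digits.isdecimal():
        raise ValueError(f"not a 30-minute learning-set record: {name!r}")
    if group == "n":
        # Subjects without documented atrial fibrillation.
        return "no documented AF"
    # p records: even-numbered records immediately precede a PAF episode,
    # odd-numbered records are distant from any episode.
    return "pre-PAF" if int(digits) % 2 == 0 else "PAF-distant"
```

Under these assumptions, paf_label("p16") gives "pre-PAF" and paf_label("p15") gives "PAF-distant".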

The records with names beginning with n come from subjects who do not have documented atrial fibrillation, either during the period from which the records were excerpted or at any other time. The subjects include healthy controls, patients referred for long-term ambulatory ECG monitoring, and patients in intensive care units.

The test set is constructed similarly, from 50 record sets (from 50 different subjects); unlike the learning set, it has no continuation records. The test set records are named t01, t02, ..., t100. As in the learning set, pairs of consecutively numbered records come from the same long-term ECG recording of a single subject. Approximately half of the record sets in the test set come from subjects with PAF; part 1 of the challenge is to identify these record sets, and part 2 is to identify which record in each pair immediately precedes PAF.
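The test-set pairing and the overall record count can be checked with a short Python sketch. The function name challenge_test_pairs is hypothetical; the counts come from the description above (50 learning record sets of four records each, and 50 test record sets of two records each).

```python
from typing import List, Tuple


def challenge_test_pairs(n_subjects: int = 50) -> List[Tuple[str, str]]:
    """Return the test-set record names grouped into per-subject pairs."""
    # Names run t01, t02, ..., t100 (two-digit padding, widening at 100).
    names = [f"t{i:02d}" for i in range(1, 2 * n_subjects + 1)]
    # Consecutively numbered records (odd, even) belong to the same subject.
    return list(zip(names[0::2], names[1::2]))


pairs = challenge_test_pairs()

# Learning set: 50 record sets, each with two 30-minute records
# and two 5-minute continuation records.
learning_records = 50 * 4
# Test set: two 30-minute records per record set, no continuation records.
test_records = 2 * len(pairs)
total_records = learning_records + test_records
```

With the default of 50 subjects, the total matches the 300 records stated for the database.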

For more information:

  1. http://www.physionet.org/physiobank/database/afpdb/
  2. Moody GB, Goldberger AL, McClennen S, Swiryn SP. Predicting the Onset of Paroxysmal Atrial Fibrillation: The Computers in Cardiology Challenge 2001. Computers in Cardiology 28:113-116 (2001).
  3. Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 101(23):e215-e220 [Circulation Electronic Pages; http://circ.ahajournals.org/cgi/content/full/101/23/e215]; 2000 (June 13).
