
Full text of "Principles and practice of public health surveillance"










National Technical Information Service 



Principles and Practice of Public Health Surveillance


Steven M. Teutsch 

R. Elliott Churchill 




Public Health Service 

Epidemiology Program Office 
Centers for Disease Control 

August 1992 


Use of trade names is for identification only
and does not constitute endorsement by the
Public Health Service or the Centers for Disease Control.




PB9 3-10 1129 



August 1992 



Principles and Practice of Surveillance 


Teutsch, Steven M. and
Churchill, R. Elliott, Editors 


Epidemiology Program Office 
Centers for Disease Control 
Mail stop C08 
Atlanta, GA 30333 










Oxford University Press has expressed interest in obtaining this material when it 
has been placed in the public domain, with a view to publishing it. 



13. ABSTRACT (Maximum 200 words) 

Public health surveillance is the systematic, ongoing assessment of the health of a 
community including the timely collection, analysis, interpretation, dissemination 
and subsequent use of data. The book presents an organized approach to planning, 
developing, implementing, and evaluating public health surveillance systems. Chapters 
include: planning; data sources; system management and data quality control; analyzing 
surveillance data; special statistical issues; communication; evaluation; ethical 
issues; legal issues; use of computers; state and local issues; and surveillance in 
developing countries. The book is intended to serve as a desk reference for public 
health practitioners and as a text for students in public health. 




Public Health; Public Health Surveillance; Epidemiology; Disease 
Control; epidemics; Communicable Disease; Infectious Disease; 

























Contributors to this book: 


Willard Cates, Jr., M.D., M.P.H. 
Director, Division of Training 
Epidemiology Program Office 

R. Elliott Churchill, M.A. 
Technical Publications Writer-Editor 
Office of the Director 
Epidemiology Program Office 

Andrew G. Dean, M.D., M.P.H.
Chief, System Development and Support Branch
Division of Surveillance and Epidemiology 
Epidemiology Program Office 

Robert F. Fagan, B.S. 

Systems Analyst 

Division of Surveillance and Epidemiology 

Epidemiology Program Office 

Norma P. Gibbs, B.S. 

Chief, Systems Operation and Information 

Division of Surveillance and Epidemiology 
Epidemiology Program Office 

Richard A. Goodman, M.D., M.P.H. 
Assistant Director 
Epidemiology Program Office 

Robert A. Hahn, Ph.D., M.P.H. 

Medical Epidemiologist 

Division of Surveillance and Epidemiology 

Epidemiology Program Office 

Robert J. Howard, B.A.
Public Affairs Officer 
Office of Public Affairs 
Office of the CDC Director 

Douglas N. Klaucke, M.D., M.P.H. 
Chief, International Branch 
Division of Field Epidemiology 
Epidemiology Program Office 

Carol M. Knowles, B.S. 

Programmer Analyst 

Division of Surveillance and Epidemiology 

Epidemiology Program Office 

Gene W. Matthews, J.D. 
Legal Advisor to CDC 
Office of the CDC Director 

Mac W. Otten, Jr., M.D., M.P.H. 

Medical Epidemiologist 

Division of Immunization 

National Center for Prevention Services 

Barbara J. Panter-Connah 
Epidemiology Program Specialist 
Division of Surveillance and Epidemiology 
Epidemiology Program Office 

Nancy E. Stroup, Ph.D. 


Division of Surveillance and Epidemiology 

Epidemiology Program Office 

Donna F. Stroup, Ph.D. 

Director, Division of Surveillance and 

Epidemiology Program Office 

Steven M. Teutsch, M.D., M.P.H. 
Special Assistant to the Director 
Epidemiology Program Office 

Stephen B. Thacker, M.D., M.Sc. 


Epidemiology Program Office 

Melinda Wharton, M.D., M.Sc. 

Medical Epidemiologist 

Division of Immunization 

National Center for Prevention Services 

G. David Williamson, Ph.D. 

Chief, Statistics and Analytic Methods Branch 
Division of Surveillance and Epidemiology 
Epidemiology Program Office 

Matthew M. Zack, M.D. 

Medical Epidemiologist 

Division of Chronic Disease Control and 

Community Intervention 
National Center for Chronic Disease 

Prevention and Health Promotion 


Patrick L. Remington, M.D., M.P.H. 
Chronic Disease Epidemiologist 
Wisconsin Department of Health and 
Social Services (Madison) 

Richard L. Vogt, M.D. 
State Epidemiologist 
Hawaii Department of Health 


Kevin M. Sullivan, Ph.D., M.P.H., M.H.A.
Assistant Professor
Division of Epidemiology
Emory University (Atlanta)


Since public health surveillance undergirds public health practice, it is unfortunate that no single 
resource has been available to provide a guide to the underlying principles and practice of 
surveillance. In recent years, a small number of courses on surveillance at schools of public 
health have been developed in recognition of the importance of surveillance, but no definitive 
textbook has appeared. Principles and Practice of Public Health Surveillance is intended to serve 
as a desk reference for those actively engaged in public health practice and as a text for students 
of public health. 

The book is organized around the science of surveillance, i.e., the basic approaches to planning, 
organizing, analyzing, interpreting, and communicating surveillance information in the context of 
contemporary society and public health practice. Surveillance provides the information base for 
public health decision making. It must continually respond to the need for new information, such as 
about chronic diseases, occupational and environmental health, injuries, risk factors, and emerging 
health problems. It must also accommodate changing priorities. Issues such as long latency,
migration, low frequencies, and the need for local data, must be addressed. New analytic methods 
and rapidly evolving technologies present new opportunities and create new demands. This book 
addresses many of these issues. Although many examples of surveillance systems are included, this 
is not intended to be a manual for establishing surveillance for any particular condition. We 
believe that this approach will provide the reader with ideas and concepts that can be adapted to 
her or his particular needs. 

This book grew out of a recognition by the Surveillance Coordination Group at the Centers for 
Disease Control of the need to capture the art as well as the science of surveillance. Most of the 
authors are current or former staff in the Epidemiology Program Office at the Centers for Disease 
Control. These friends and colleagues have drawn on their own experience in surveillance in states,
in a diversity of federal programs, and in international health, and have interwoven the
experience of others. We felt that the risks of being parochial were outweighed
by the desirability of producing consistent and systematic coverage of the subject. Although most
examples are drawn from the United States, they illustrate basic principles and approaches that can
be applied in a wide variety of settings around the world.

We would like to acknowledge Douglas Klaucke, who pulled together many of the initial thoughts on 
organizing the book, and Stephen Thacker, the Director of the Epidemiology Program Office (EPO), and
Donna Stroup, Director of the Division of Surveillance and Analysis, for their continued support and 
encouragement. We also acknowledge with gratitude the creative guidance and constructive criticism 
provided by EPO's Assistant Director for Science, Edwin Kilbourne. Finally, and most importantly of 
all, we gratefully recognize the expertise, the dedication, and the commitment of all the authors in 
assuring that this book became a reality. 

SMT Atlanta, Georgia 

REC August 1992 

Chapter I 


Stephen B. Thacker 

"If you don't know where you're going, any road will get you there." 

Lewis Carroll 

Public health surveillance is the ongoing systematic collection, analysis, and 
interpretation of outcome-specific data for use in the planning, implementation, and 
evaluation of public health practice (2). A surveillance system includes the
functional capacity for data collection and analysis, as well as the timely 
dissemination of these data to persons who can undertake effective prevention and 
control activities. While the core of any surveillance system is the collection,
analysis, and dissemination of data, the process can be understood only in the context
of specific health outcomes.


The idea of observing, recording, and collecting facts, analyzing them, and considering
reasonable courses of action stems from Hippocrates (2). The first real public health
action that can be related to surveillance probably occurred during the period of
bubonic plague, when public health authorities boarded ships in the port near the
Republic of Venice to prevent persons ill with plague-like illness from disembarking
(3). Before a large-scale organized system of surveillance could be developed,
however, certain prerequisites needed to be fulfilled. First, there had to be some
semblance of an organized health-care system in a stable government; in the Western
world, this was not achieved until the time of the Roman Empire. Second, a
classification system for disease and illness had to be established and accepted,
which only began to be functional in the 17th century with the work of Sydenham.
Finally, adequate measurement methods were not developed until that time.

Current concepts of public health surveillance evolve from public health activities 
developed to control and prevent disease in the community. In the late Middle Ages, 
governments in Western Europe assumed responsibilities for both health protection and 
health care of the population of their towns and cities (4). A rudimentary system of
monitoring illness led to regulations against polluting streets and public water, 
construction for burial and food handling, and the provision of some types of care 
(5). In 1766, Johann Peter Frank advocated a more comprehensive form of public health 
surveillance with the system of police medicine in Germany. It covered school health, 
injury prevention, maternal and child health, and public water and sewage (4). In 
addition, he delineated governmental measures to protect the public's health. 

The roots of analysis of surveillance data can also be traced to the 17th century. In 
the 1680s, von Leibnitz called for the establishment of a health council and the 
application of a numerical analysis in mortality statistics to health planning (2).
About the same time in London, John Graunt published a book, Natural and Political 
Observations Made Upon the Bills of Mortality, in which he attempted to define the 
basic laws of natality and mortality. In his work, Graunt developed some fundamental 
principles of public health surveillance, including disease-specific death counts, 
death rates, and the concept of disease patterns. In the next century, Achenwall 
introduced the term "statistics," and over the next several decades vital statistics
became more widespread in Europe. Nearly a century later, in 1845, Thurnam published 
the first extensive report of mental health statistics in London. 

Two prominent names in the development of the concepts of public health surveillance 
activities are Lemuel Shattuck and William Farr. Shattuck's 1850 report of the 
Massachusetts Sanitary Commission was a landmark publication that related death, 
infant and maternal mortality, and communicable diseases to living conditions. 
Shattuck recommended a decennial census, standardization of nomenclature of causes of 
disease and death, and a collection of health data by age, gender, occupation, 
socioeconomic level, and locality. He applied these concepts to program activities in 
immunization, school health, smoking, and alcohol abuse, and introduced these concepts 
into the teaching of preventive medicine. 

William Farr (1807-1883) is recognized as one of the founders of modern concepts of 
surveillance (6). As superintendent of the statistical department of the Registrar 
General's office of England and Wales from 1839 to 1879, Farr concentrated his efforts 
on collecting vital statistics, on assembling and evaluating those data, and on 
reporting both to responsible health authorities and to the general public. 

In the United States, public health surveillance has focused historically on 
infectious disease. Basic elements of surveillance were found in Rhode Island in 
1741, when the colony passed an act requiring tavern keepers to report contagious 
disease among their patrons. Two years later, the colony passed a broader law 
requiring the reporting of smallpox, yellow fever, and cholera (7). 

National disease monitoring activities did not begin in the United States until 1850 
when mortality statistics based on death registration and the decennial census were 
first published by the Federal Government for the entire United States (8). 
Systematic reporting of disease in the United States began in 1874 when the
Massachusetts State Board of Health instituted a voluntary plan for weekly reporting
by physicians on prevalent diseases, using a standard postcard-reporting
format (9,10). In 1878, Congress authorized the forerunner of the Public Health
Service (PHS) to collect morbidity data for use in quarantine measures against such
pestilential diseases as cholera, smallpox, plague, and yellow fever (11).

In Europe, compulsory reporting of infectious diseases began in Italy in 1881 and 
Great Britain in 1890. In 1893, Michigan became the first U.S. jurisdiction to 
require the reporting of specific infectious diseases. Also in 1893, a law was 
enacted to provide for the collection of information each week from state and 
municipal authorities throughout the United States (12). By 1901, all state and 
municipal laws required notification (i.e., reporting) to local authorities of 
selected communicable diseases such as smallpox, tuberculosis, and cholera. In 1914, 
PHS personnel were appointed as collaborating epidemiologists to serve in state health 
departments to telegraph weekly disease reports to the PHS. 

In the United States, it was not until 1925, however, following markedly increased
reporting associated with the severe poliomyelitis epidemic in 1916 and the influenza
pandemic in 1918-1919, that all states had begun participating in national morbidity
reporting (13). A national health survey of U.S. citizens was first conducted in
1935. After a 1948 PHS study led to the revision of morbidity reporting procedures,
the National Office of Vital Statistics assumed the responsibility for morbidity
reporting. In 1949, weekly statistics that had appeared for several years in Public
Health Reports began being published by the National Office of Vital Statistics. In
1952, mortality data were added to the publication that was the forerunner of the
Morbidity and Mortality Weekly Report (MMWR). As of 1961, the responsibility for this
publication and its content was transferred to the Communicable Disease Center (now,
Centers for Disease Control [CDC]).

In the United States, the authority to require notification of cases of disease 
resides in the respective state legislatures. In some states, authority is enumerated 
in statutory provisions; in other states, authority to require reporting has been 
given to state boards of health; still other states require reports both under 
statutes and health department regulations. States also vary in the
conditions and diseases to be reported, time frames for reporting, agencies to receive
reports, persons required to report, and conditions under which reports are required
(14).

The Conference (now Council) of State and Territorial Epidemiologists (CSTE) was
authorized in 1951 by its parent body, the Association of State and Territorial Health
Officials, to determine what diseases should be reported by states to the Public Health
Service and to develop reporting procedures. CSTE meets annually and, in
collaboration with CDC, recommends to its constituent members appropriate changes in
morbidity reporting and surveillance, including what diseases should be reported to
CDC and published in the MMWR.


Until 1950, the term "surveillance" was restricted in public health practice to 
monitoring contacts of persons with serious communicable diseases such as smallpox, to 
detect early symptoms so that prompt isolation could be instituted (15). The critical
demonstration in the United States of the importance of a broader, population-based 
view of surveillance was made following the Francis Field Trial of poliomyelitis 
vaccine in 1955 (16,17). Within 2 weeks of the announcement of the results of the
field trial and initiation of a nationwide vaccination program, six cases of paralytic
poliomyelitis were reported through the notifiable-disease reporting system to state
and local health departments; this surveillance led to an epidemiologic
investigation, which revealed that these children had received vaccine produced by a
single manufacturer. Intensive surveillance and appropriate epidemiologic
investigations by federal, state, and local health departments found 141
vaccine-associated cases of paralytic disease, 80 of which represented family contacts of
vaccinees. Daily surveillance reports were distributed by CDC to all persons involved 
in these investigations. This national common-source epidemic was ultimately related 
to a particular brand of vaccine that had been contaminated with live poliovirus. The 
Surgeon General requested that the manufacturer recall all outstanding lots of vaccine 
and directed that a national poliomyelitis program be established at CDC. Had the 
surveillance program not been in existence, many and perhaps all vaccine manufacturers 
would have ceased production. 

In 1963, Langmuir limited use of the term "surveillance" to the collection, analysis, 
and dissemination of data (18). This construct did not encompass direct
responsibility for control activities. In 1965, the Director General of the World 
Health Organization (WHO) established the epidemiological surveillance unit in the 
Division of Communicable Diseases of WHO (19). The Division Director, Karel Raska,
defined surveillance much more broadly than Langmuir, including "the epidemiological 
study of disease as a dynamic process." In the case of malaria, he saw epidemiologic 
surveillance as encompassing control and prevention activities. Indeed, the WHO 
definition of malaria surveillance included not only case detection, but also 
obtaining blood films, drug treatment, epidemiologic investigation, and follow-up
(20).

In 1968, the 21st World Health Assembly focused on national and global surveillance of 
communicable diseases, applying the term to the diseases themselves rather than to the 
monitoring of individuals with communicable disease (21). Following an invitation
from the Director General of WHO and with consultation from Raska, Langmuir developed 
a working paper and in the year prior to the Assembly obtained comments from 
throughout the world on the concepts and practices advocated in the paper. At the 
Assembly, with delegates from over 100 countries, the working paper was endorsed, and
discussions on the national and global surveillance of communicable disease identified
three main features of surveillance that Langmuir had described in 1963: a) the
systematic collection of pertinent data, b) the orderly consolidation and evaluation
of these data, and c) the prompt dissemination of results to those who need to know,
particularly those in position to take action.

The 1968 World Health Assembly discussions reflected the broadened concepts of
"epidemiologic surveillance" and addressed the application of the concept to public
health problems other than communicable disease (20). In addition, epidemiologic
surveillance was said to imply "...the responsibility of following up to see that
effective action has been taken."

Since that time, a wide variety of health events, such as childhood lead poisoning,
leukemia, congenital malformations, abortions, injuries, and behavioral risk factors
have been placed under surveillance. In 1976, recognition of the breadth of
surveillance activities throughout the world was made evident by the fact that a
special issue of the International Journal of Epidemiology was devoted to surveillance
(22).



The primary function of the application of the term "epidemiologic" to surveillance, 
which first appeared in the 1960s associated with the new WHO unit of that name, was 
to distinguish this activity from other forms of surveillance (e.g., military 
intelligence) and to reflect its broader applications. The use of the term 
"epidemiologic," however, engenders both confusion and controversy. In 1971, Langmuir 
noted that some epidemiologists tended to equate surveillance with epidemiology in its 
broadest sense, including epidemiologic investigations and research (25). He found
this "both epidemiologically and administratively unwise," favoring a description of
surveillance as "epidemiological intelligence." 

What are the boundaries of surveillance practice? Is "epidemiologic" an appropriate 
modifier of surveillance in the context of public health practice? To address these 
questions, we must first examine the structure of public health practice. One can 
divide public health practice into surveillance; epidemiologic, behavioral, and 
laboratory research; service (including program evaluation); and training. 

Surveillance data should be used to identify research and service needs, which, in 
turn, help to define training needs. Unless data are provided to those who set policy 
and implement programs, their use is limited to archives and academic pursuits, and 
the material is therefore appropriately considered to be health information rather 
than surveillance data. However, surveillance does not encompass epidemiologic 
research or service, which are related but independent public health activities that 
may or may not be based on surveillance. Thus, the boundary of surveillance practice 
excludes actual research and implementation of delivery programs. 

Because of this separation, "epidemiologic" cannot accurately be used to modify
surveillance (1). The term "public health surveillance" describes the scope
(surveillance) and indicates the context in which it occurs (public health). It also
obviates the need to accompany any use of the term "epidemiologic surveillance" with a
list of all the examples this term does not cover. Surveillance is correctly, and
necessarily, a component of public health practice, and should continue to be
recognized as such.



Public health surveillance data are used to assess public health status, define public 
health priorities, evaluate programs, and conduct research. Surveillance data tell 
the health officer where the problems are, whom they affect, and where programmatic 
and prevention activities should be directed. Such data can also be used to help
define public health priorities in a quantitative manner and to evaluate
the effectiveness of programmatic activities. Results of analysis of public health
surveillance data also enable researchers to identify areas of interest for further
investigation (23).

The analysis of surveillance data is, in principle, quite simple. Data are examined 
by measures of time, place, and person. The routine collection of information about 
reported cases of congenital syphilis in the United States, for example, reflects not 
only numbers of cases (Figure 1.1), geographic distribution, and populations affected, 
but also indicates the effects of crack cocaine use and changing sexual practices over 
the past 10 years. The examination of routinely collected data can also show rates of
salmonellosis by county in New Hampshire and in three contiguous states. Mapping
these data illustrates the pattern of the spread of disease across state boundaries 
(Figure 1.2). The examination of death certificates for data on homicide identifies 
high-risk groups and shows that the problem has reached epidemic proportions among 
young adult men (Figure 1.3). 
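
The examination of data by time, place, and person described above can be sketched in a few lines of Python. This is only an illustrative sketch: the record layout (CaseReport), the county names, the counts, and the population figure are all hypothetical and are not taken from any surveillance system discussed in this chapter.

```python
from collections import Counter
from typing import NamedTuple


class CaseReport(NamedTuple):
    """One reported case (hypothetical record layout, for illustration only)."""
    year: int        # time
    county: str      # place
    age_group: str   # person


# Invented example records; these are not real surveillance data.
reports = [
    CaseReport(1990, "County A", "0-4"),
    CaseReport(1990, "County A", "20-29"),
    CaseReport(1990, "County B", "0-4"),
    CaseReport(1991, "County A", "0-4"),
    CaseReport(1991, "County B", "20-29"),
    CaseReport(1991, "County B", "20-29"),
]


def tabulate(records, field):
    """Count case reports by one descriptive variable (time, place, or person)."""
    return Counter(getattr(record, field) for record in records)


def rate_per_100k(cases, population):
    """Convert a case count to a rate per 100,000 population."""
    return 100_000 * cases / population


by_time = tabulate(reports, "year")
by_place = tabulate(reports, "county")
by_person = tabulate(reports, "age_group")

# Assumed population of 150,000 for County A, chosen only to show the arithmetic.
county_a_rate = rate_per_100k(by_place["County A"], 150_000)
```

Counts alone describe where cases occur; dividing by the population at risk, as in rate_per_100k, allows areas or groups of different sizes to be compared.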


The uses of surveillance are shown in Table 1.1. Portrayal of the natural history of 
disease can be illustrated by the surveillance of malaria rates in the United States 
since 1930 (Figure 1.4). In the 1940s, malaria was still an endemic health problem in 
the southeastern United States to the degree that persons with febrile illness were 
often treated for malaria until further tests were available. After the Malaria 
Control in the War Areas Program led to the virtual elimination of endemic malaria 
from the United States, rates of malaria decreased until the early 1950s, when 
military personnel involved in the conflict in Korea returned to the United States 
with malaria. The general downward trend in reported cases of malaria continued into 
the 1960s until, once again, numbers of cases of malaria rose, this time among 
veterans returning from the war in Vietnam. Since that time, we have continued to see 
increases in numbers of reported cases of malaria involving immigrant populations, as 
well as among U.S. citizens traveling abroad. 

Surveillance data can be used also to detect epidemics. For example, during the swine 
influenza immunization program in 1976, a surveillance system was established to 
detect adverse sequelae related to the program (24). Working with state and local
health departments, CDC was able to detect an epidemic of Guillain-Barré syndrome,
which rapidly led to the termination of a program in which 40,000,000 U.S. citizens 
had been vaccinated. However, most epidemics are not detected by such analysis of 
routinely collected data but are identified through the astuteness and alertness of 
clinicians and public health officials of the community. From a pragmatic point of 
view, the key point is that when someone does note an unusual occurrence in the health 
picture of a community, the existence of organized surveillance efforts in the health 
department provides the infrastructure for conveying information to facilitate a 
timely and appropriate response. 

The distribution and spread of disease can be documented from surveillance data, as 
seen in the county-specific data on salmonellosis (Figure 1.2). U.S. cancer mortality 
statistics have also been mapped at the county level to identify a variety of 
geographic patterns that suggest hypotheses on etiology and risk (25). Recognition of
such clusters can lead to further epidemiologic or laboratory research, sometimes
using individuals identified in surveillance as subjects in epidemiologic studies.
The association between the periconceptional use of multivitamins by women and the
development of neural tube defects by their children was documented using children
identified in a surveillance system for congenital malformations (26).

Surveillance data can also be used to test hypotheses. For example, in 1978 the U.S. 
Public Health Service announced a measles elimination program that included an active 
effort to vaccinate school-age children. Because of this program and the state laws 
that excluded from school students who had not been vaccinated, CDC anticipated a 
change in the age pattern of persons reported to have measles. Before the initiation 
of the program, the highest reported rates of measles were for children 10-14 years of 
age. As predicted, almost immediately after the school exclusion policy was 
implemented, there was not only a general decrease in the number of cases but also a 
shift in peak occurrence from school-age to preschool-age children (Figure 1.5). By 
1979, there were even lower levels of measles incidence and altered age-specific 

Surveillance data can be applied in evaluating control and prevention measures. With
routinely collected data, one can examine, without special studies, the effect of a
health policy. For example, the introduction of inactivated poliovirus vaccine in the
United States in the 1950s was followed by a dramatic decrease in the number of
reported cases of paralytic poliomyelitis, and the subsequent introduction
in the 1960s of oral poliovirus vaccine was followed by an even greater decline
(Figure 1.6).

Efforts to monitor changes in infectious agents have been facilitated by the use of 
surveillance data. In the late 1970s, antibiotic-resistant gonorrhea was introduced 
into the United States from Asia. Laboratory- and clinical-practice-based
surveillance for cases of gonorrhea enabled public health officials to monitor the
rapid diffusion of various strains of this bacterium nationally and facilitated
prevention activities, including notifying clinicians of proper treatment procedures
(Figure 1.7). Similarly, the National Nosocomial Infections Surveillance System, a
voluntary, hospital-based surveillance system of hospital-acquired infections, has
been used to monitor changes in antibiotic-resistance patterns of infectious agents 
associated with hospitalized patients. 

As noted earlier, the first use of surveillance was to monitor persons with a view to
imposing quarantine as necessary. Although this use of surveillance is rare in the
modern-day United States, in 1975, with the introduction of a suspected case of Lassa
fever, over 500 potential contacts of the patient were monitored daily for 2 weeks to
assure that secondary spread of this serious infectious agent did not occur (27).

Surveillance data can also be used to good effect for detecting changes in health 
practice. The increasing use of various technologies in health care has come to be an 
issue of growing concern over the past decade; surveillance data can provide useful 
information in this area (28). For example, in the United States since 1965, the rate
of cesarean delivery has increased from less than 5% to nearly 25% of all
deliveries (Figure 1.8). Data such as these are useful both in planning research to 
learn the causes of these changes and in monitoring the impact of such changes in 
practice and procedure on outcomes and costs associated with health care. 

Finally, surveillance data are useful for planning. With knowledge about changes in 
the population structure or in the nature of conditions that might affect a 
population, officials can, with more confidence, plan for optimizing available 
resources. For example, data on refugees entering the United States from Southeast 
Asia in the early 1980s were broadly applicable; they told where people settled, 
described the age and gender structure of the population, and identified health 
problems that might be expected in that population. With this information, health 
officials were able to plan more effectively the appropriate health services and 
preventive activities for this new population. 


As we approach the year 2000, several activities are expected to contribute to the 
evolution of public health surveillance. First, use of the computer--particularly the 
microcomputer--has revolutionized the practice of public health surveillance. In the 
United States, the National Electronic Telecommunications System for Surveillance 
(NETSS) links all state health departments by computer for the routine collection, 
analysis, and dissemination of data on notifiable health conditions (29). Over the 
next several years, the growth will be within states, with state health departments 
being linked to county departments, and possibly even to health-care providers' 
offices for routine surveillance. The Minitel system currently in use in France has 
already demonstrated the essential utility of office-based surveillance of various 
conditions of public health importance (30). 

The second area of renewed activity associated with surveillance is that of 
epidemiologic and statistical analysis. A by-product of the use of computers is the 
ability to make more effective use of sophisticated tools to detect changes in 
patterns of occurrence of health problems. In the 1980s, applications and methods of 
time series analysis and other techniques have enabled us to provide more meaningful 
interpretation of data collected in surveillance efforts (31). More sophisticated 
techniques will doubtless continue to be applied in the area of public health as they 
are developed. 
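
The idea of detecting a change in the pattern of occurrence can be illustrated with a very simple rule: compare the current count with counts for the same reporting period in earlier years. The sketch below is an illustration only, not the method evaluated in reference 31; the function name, the threshold of 2 standard deviations, and the counts are all assumptions made for the example.

```python
from statistics import mean, stdev


def flag_aberration(baseline_counts, current_count, z_threshold=2.0):
    """Return True if current_count exceeds the historical mean
    by more than z_threshold sample standard deviations.

    baseline_counts: counts for comparable periods in prior years
    (hypothetical input; a real system would choose these carefully).
    """
    if len(baseline_counts) < 2:
        raise ValueError("need at least two baseline counts")
    center = mean(baseline_counts)
    spread = stdev(baseline_counts)
    return current_count > center + z_threshold * spread


# Hypothetical counts for the same 4-week period in five prior years.
baseline = [12, 15, 11, 14, 13]
print(flag_aberration(baseline, 14))  # within the historical range
print(flag_aberration(baseline, 25))  # well above the historical range
```

A rule of this kind ignores seasonality, reporting delays, and changes in case definition, all of which would need attention before it could be relied on in practice.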

Until recently, surveillance data were traditionally disseminated as written documents 
published periodically by government agencies. While paper reports will continue to 
be produced, and public health officials will continue to refine the use of print 
media, they are also beginning to use electronic media for the dissemination of 
surveillance data. More effective use of the electronic media, and all the other 
tools of communications, should facilitate the use of surveillance data for public 
health practice. At the same time, ready access to detailed information on 
individuals will continue to provide ethical and legal concerns that may constrain 
access to data of potential public health importance. 

The 1990s will see surveillance concepts applied to new areas of public health 
practice such as chronic disease, environmental and occupational health, and injury 
control. The evolution and development of methods for these programmatic areas will 
continue to be a major challenge in public health. 


A more fundamental principle that will underlie the ongoing development of 
surveillance is the increasing ability of people to look at public health surveillance 
as a scientific endeavor (32). A growing appreciation of the need for rigor in 
surveillance practice will no doubt improve the quality of surveillance programs and 
will therefore facilitate the analysis and use of surveillance data. An important 
result of this more vigorous approach to surveillance practice will be the increased 
frequency and quality of the evaluation of the practice of surveillance (33). 

Finally, and probably most important, is the observation that surveillance needs to be 
used more consistently and thoughtfully by policymakers. Epidemiologists not only 
need to improve the quality of their analysis, interpretation, and display of data for 
public health use, they also need to listen to persons empowered to set policy in 
order to understand what stimulates the policymakers' interest and action. This 
assessment allows surveillance information to be crafted so that it is presented in 
its most useful form to the appropriate audience and in the necessary time frame. In 
turn, as we maximize the utility of data for decision making and better understand 
what is essential to that process, we will raise the area of public health 
surveillance to a new and higher level of importance. 

The critical challenge in public health surveillance today, however, continues to be 
the assurance of its usefulness. In this effort, we must have rigorous evaluation of 
public health surveillance systems. Even more basic is the need to regard 
surveillance as a scientific endeavor. To do this properly, one must fully understand 
the principles of surveillance and its role in guiding epidemiologic research and 
influencing other aspects of the overall mission of public health. Epidemiologic 
methods based on public health surveillance must be developed; computer technology for 
efficient data collection, analysis, and graphic display must be applied; ethical and 
legal concerns must be addressed effectively; the use of surveillance systems must be 
reassessed on a routine basis; and surveillance principles must be applied to emerging 
areas of public health practice. 


1. Thacker SB, Berkelman RL. Public health surveillance in the United States. 
Epidemiol Rev 1988;10:164-90. 

2. Eylenbosch WJ, Noah ND. Historical aspects. In: Surveillance in health and 
disease. Oxford, England: Oxford University Press, 1988:3-8. 

3. Moro ML, McCormick A. Surveillance for communicable disease. In: Eylenbosch 
WJ, Noah ND (eds.). Surveillance in health and disease. Oxford, England: 
Oxford University Press, 1988:166-82. 

4. Hartgerink MJ. Health surveillance and planning for health care in the 
Netherlands. Int J Epidemiol 1976;5:87-91. 

5. Anonymous (Editorial). Surveillance. Int J Epidemiol 1976;5:3-6. 

6. Langmuir AD. William Farr: founder of modern concepts of surveillance. Int J 
Epidemiol 1976;5:13-8. 

7. Hinman AR. Surveillance of communicable diseases. Presented at the 100th 
annual meeting of the American Public Health Association, Atlantic City, New 
Jersey, November 15, 1972. 

8. Vital statistics of the United States, 1958. Washington, DC: National Office of 
Vital Statistics, 1959. 

9. Trask JW. Vital statistics: a discussion of what they are and their uses in 
public health administration. Public Health Rep 1915;Suppl 12. 

10. Bowditch HI, Webster DL, Hoadley JC et al. Letter from the Massachusetts State 
Board of Health to physicians. Public Health Rep 1915;Suppl 12:31. 


11. Centers for Disease Control. Manual of procedures for national morbidity 
reporting and public health surveillance activities. Atlanta, Georgia: Public 
Health Service, 1985. 

12. Chapin CV. State health organization. JAMA 1916;66:699-703. 

13. National Office of Vital Statistics. Reported incidence selected notifiable 
disease: United States, each division and state, 1920-50. Vital Statistics 
Special Reports (National Summaries). 1953;37:1180-1. 

14. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. The reportable 
diseases. I. Mandatory reporting of infectious diseases by clinicians. JAMA 

15. Langmuir AD. Evolution of the concept of surveillance in the United States. 
Proc R Soc Med 1971;64:681-9. 

16. Langmuir AD, Nathanson N, Hall WJ. Surveillance of poliomyelitis in the United 
States in 1955. Am J Public Health 1956;46:75-88. 

17. Nathanson N, Langmuir AD. The Cutter incident: poliomyelitis following 
formaldehyde-inactivated poliovirus vaccination in the United States during the 
Spring of 1955. I. Background. Am J Hyg 1963;78:29-81. 

18. Langmuir AD. The surveillance of communicable diseases of national importance. 
N Engl J Med 1963;268:182-92. 

19. Raska K. National and international surveillance of communicable diseases. WHO 
Chron 1966;20:315-21. 

20. Report for drafting committee. Terminology of malaria and of malaria 
eradication. Geneva, Switzerland: World Health Organization, 1963. 


21. National and global surveillance of communicable disease. Report of the 
technical discussions at the Twenty-First World Health Assembly. A21/Technical 
Discussions/5. Geneva, Switzerland: World Health Organization, May 1968. 

22. Int J Epidemiol. 1976;5:3-91. 

23. Thacker SB. Les principes et la pratique de la surveillance en santé publique: 
l'utilisation des données en santé publique. Santé Publique 1992;4:43-9. 

24. Retailliau HF, Curtis AC, Starr G et al. Illness after influenza vaccination 
reported through a nationwide surveillance system, 1976-1977. Am J Epidemiol 

25. Mason TJ, Fraumeni JF, Hoover R, Blot WJ. An atlas of mortality from selected 
diseases. Washington, D.C.: U.S. Department of Health and Human Services. NIH 
Publication No. 81-2397, May 1981. 

26. Mulinare J, Cordero JF, Erickson D, Berry RJ. Periconceptional use of 
multivitamins and the occurrence of neural tube defects. JAMA 1988;260:3141-5. 

27. Zweighaft RM, Fraser DW, Hattwick MAW et al. Lassa fever: response to an 
imported case. N Engl J Med 1977;297:803-7. 

28. Thacker SB, Berkelman RL. Surveillance of medical technologies. J Pub Health 
Pol 1986;7:363-77. 

29. Centers for Disease Control. National electronic communications systems for 
surveillance--United States, 1990-1991. MMWR 1991;40:502-3. 

30. Valleron AJ, Bouvet E, Garnerin et al. A computer network for the surveillance 
of communicable diseases: the French experiment. Am J Public Health 


31. Stroup DF, Wharton M, Kafadar K, Dean AG. An evaluation of a method for 
detecting aberrations in public health surveillance data. Am J Epidemiol (In 
press). 

32. Thacker SB, Berkelman RL, Stroup DF. The science of public health surveillance. 
J Public Health Pol 1989;10:187-203. 

33. Centers for Disease Control. Guidelines for evaluating surveillance systems. 
MMWR 1988;37(Suppl No. S-5):1-20. 


Chapter II 

Planning a Surveillance System 

Steven Teutsch 

"Natural laws govern the occurrence of a disease, that these laws can be discovered by 
epidemiologic inquiry and that, when discovered, the causes of epidemics admit to a 
great extent of remedy." 

William Farr 

As described earlier, public health surveillance is the systematic and ongoing 
assessment of the health of a community, including the timely collection, analysis, 
interpretation, dissemination, and subsequent use of data. Surveillance provides 
information for action, information with a purpose. Surveillance systems evolve in 
response to ever-changing needs of society in general and of the public health 
community in particular. In order to understand and meet those needs, an organized 
approach to planning, developing, implementing, and maintaining surveillance systems 
is imperative. In the sections below, approaches to the planning and evaluation 
processes to be presented in more detail elsewhere in this book are discussed. The 
steps in planning a system are shown in Table II.1. 



Planning a surveillance system begins with a clear understanding of the purpose of 
surveillance, i.e., the answer to the question: "What do you want to know?" In the 
context of public health, surveillance may be established to meet a variety of 
objectives, including assessment of public health status, establishment of public 
health priorities, evaluation of programs, and conduct of research. Surveillance data 
can be used in all of the following ways: 

to estimate the magnitude of a health problem in the 
population at risk 

to understand the natural history of a disease or injury 

to detect outbreaks or epidemics 

to document the distribution and spread of a health event 

to test hypotheses about etiology 

to evaluate control strategies 

to monitor changes in infectious agents 

to monitor isolation activities 

to detect changes in health practice 

to identify research needs and facilitate epidemiologic 
and laboratory research 

to facilitate planning 

Surveillance is inherently outcome oriented and focused on various outcomes associated 
with health-related events or their immediate antecedents. These include the 
frequency of an illness or injury, usually measured in terms of numbers of cases, 
incidence, or prevalence; the severity of the condition, measured as a case-fatality 
ratio, hospitalization rate, mortality rate, or disability; and the impact of the 
condition, measured in terms of cost. Where risk factors or specific procedures are 
incontrovertibly linked to health outcomes, it is often useful to measure the former, 
because risk factors and procedures are often more frequent (and hence more precisely 
ascertainable for small populations) and may be more closely linked to public health 
interventions. 
For example, mammography with suitable follow-up is the major prevention strategy for 
reducing mortality associated with breast cancer. The level of utilization of 
mammography by women can be regularly monitored and should be a more 
timely indicator of the impact of public health prevention programs than measurement 
of mortality from breast cancer. Surveillance data should also provide basic 
information on the utilization of mammography services by age and race/ethnicity of 
recipient, allowing better targeting of prevention efforts on the population sectors 
with the lowest utilization. In addition, over-utilization by some parts of the 
population (e.g., women <35 years of age who do not have other risk factors) might 
stimulate efforts to reduce unnecessary procedures. 
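
The outcome measures named above reduce to simple ratios. The following sketch shows one way to compute them, together with utilization by population group as in the mammography example. The function names, the multiplier of 100,000, the age groups, and all counts are hypothetical and chosen only for illustration.

```python
def incidence_rate(new_cases, population_at_risk, per=100_000):
    """New cases in a period per `per` persons at risk."""
    return new_cases * per / population_at_risk


def prevalence(existing_cases, population, per=100_000):
    """Existing cases at a point in time per `per` persons."""
    return existing_cases * per / population


def case_fatality_ratio(deaths, cases):
    """Proportion of identified cases that died of the condition."""
    return deaths / cases


def utilization_by_group(screened, eligible):
    """Percent of eligible persons screened, for each group."""
    return {group: 100 * screened[group] / eligible[group] for group in eligible}


# Hypothetical counts for one community and one year.
print(incidence_rate(50, 200_000))   # new cases per 100,000 at risk
print(case_fatality_ratio(5, 50))    # proportion of cases that were fatal

# Hypothetical mammography utilization by age group.
rates = utilization_by_group(
    screened={"40-49": 300, "50-64": 450},
    eligible={"40-49": 1000, "50-64": 900},
)
lowest = min(rates, key=rates.get)   # group with the lowest utilization
print(rates, lowest)
```

Identifying the group with the lowest utilization is the step that supports targeting of prevention efforts; the same table would also show groups in which utilization is unexpectedly high.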

High-priority health events should clearly be under surveillance. However, 
determining which should be considered high-priority events can be a daunting task. 
Both quantitative and qualitative approaches can be used in a selection process. Some 
quantitative factors are shown in Table II.2. In addition, criteria based on a 
consensus process to identify high-priority problems may identify emerging issues or 
problems that might otherwise not be considered. The consensus process leading to the 
Year 2000 Health Promotion and Disease Prevention Objectives in the United States is 
an example of a mechanism for identifying high-priority conditions, types of behavior, 
and interventions that require ongoing monitoring (2). 

Because public health surveillance in the United States is driven by the public health 
need to be cognizant of diseases and injuries in the community and to respond 
appropriately, surveillance is inherently an applied science. Therefore, as 
surveillance has evolved, it is generally undertaken only when there is reasonable 
expectation that control measures will be taken as appropriate. For many conditions 
the link between surveillance and action is obvious (e.g., meningococcal meningitis 
prophylaxis for contacts of patients diagnosed as having meningitis). For emerging 
conditions, such as eosinophilia-myalgia syndrome, there is a compelling public health 
need to identify cases (delineate the magnitude of the problem), identify the mode of 
spread, and take appropriate action. 

Surveillance data are usually augmented by additional studies to determine more 
precisely the causes, natural history, predisposing factors, and modes of transmission 
associated with the health problem. Yet, undertaking surveillance exclusively for 
research purposes is rarely warranted. Research needs are often better served by 
other, more precise (and often more costly) methods of case identification (e.g., 
registries), which facilitate more detailed data collection and tracking of cases. 
For example, registries of type I diabetes may have value for surveillance, but are 
justified primarily because they fill research needs. The ongoing public health 
application of these data is more limited. Scarce public health resources and the 
efforts of health-care providers to report cases need to be focused on problems for 
which the public health importance and the need for public health action can be 
readily recognized. 

A primary role of surveillance is the assessment of the overall health status of a 
community. One approach to this issue is the development and identification of a set 
of indicators that measure major components of health status. Such a set has been 
developed in the United States to be used at the national, state, and local levels 
(2). Another approach is to examine the most frequent, severe, costly, and 
preventable conditions in the community by examining most frequent causes of death, 
hospitalization, injury, disability, infection, work-site-associated illness and 
injury, and major risk factors for all the preceding items. This information can be 
obtained in most communities in terms of age, race/ethnicity, gender, and temporal 
trends. Regular assessments of the information can form the basis for educating the 
community about its major health problems and for identifying specific conditions that 
merit more intensive surveillance and intervention. 

The specific objective and purpose of the surveillance system should be specified and 
general agreement obtained. 


Once the purpose of and need for a surveillance system have been identified, methods 
for obtaining, analyzing, disseminating, and using the information should be 
determined and implemented (see Chapters V, VI, and VII). 

Because surveillance systems are ongoing and require the cooperation of many 
individuals, careful consideration must be given to the attributes discussed in 
Chapter VIII in the discussion on evaluation. The system adopted must be feasible and 
acceptable to those who will contribute to its success; it must be sensitive enough to 
provide the information required to do the job at hand, while having a high 
predictive-value positive to minimize the expenditure of resources on following up 
false-positive cases. A surveillance system should be flexible enough to meet the 
continually evolving needs of the community and to accommodate changes in patterns of 
disease and injury. It must provide information that is timely enough to be acted 
upon. All of these considerations must be carefully balanced in order to design a 
system that can successfully meet identified needs without becoming excessively costly 
or burdensome. 
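
The balance between sensitivity and predictive-value positive can be made concrete with a small calculation. The sketch below assumes that reported cases have been compared against some independent standard, so that true positives, false negatives, and false positives can be counted; the function names and counts are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of all true cases that the system detected."""
    return true_pos / (true_pos + false_neg)


def predictive_value_positive(true_pos, false_pos):
    """Proportion of reported cases that were true cases."""
    return true_pos / (true_pos + false_pos)


# Hypothetical evaluation: 100 true cases occurred, 80 were reported,
# and 40 additional reports turned out not to be cases.
print(sensitivity(true_pos=80, false_neg=20))
print(predictive_value_positive(true_pos=80, false_pos=40))
```

In this example the system finds most true cases, but one report in three leads to follow-up of a false-positive case, which is the expenditure of resources that a high predictive-value positive is meant to limit.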

Case Definitions 

Practical epidemiology is heavily dependent on clear case definitions that include 
criteria for person, place, and time and that are potentially categorized by the 
degree of certainty regarding diagnosis as "suspected" or "confirmed" cases (3). 

While high sensitivity and specificity are both desirable, generally one comes at the 
expense of the other. A balance must be struck between the desire for high 
sensitivity and level of effort required to track down false-positive cases. In 
addition, case definitions evolve over time. During periods of outbreaks, cases 
epidemiologically linked to the outbreak cases may be accepted as cases, whereas in 
non-epidemic periods, serologic or other more specific information may be required. 
Similarly, when active surveillance is used, such as in measles control programs, 
numbers of cases identified tend to rise. 

As our understanding of a disease and its associated laboratory testing improves, 
alterations in case definitions often lead to changes in sensitivity and specificity. 
As new systems complement old ones (e.g., as a morbidity system supplements a 
mortality system for injury surveillance), the reported frequency and patterns of 
conditions change. These changes must be taken into account in analysis and 
interpretation of secular trends in the frequency of reporting. It is all too easy to 
define cases of various conditions with such different criteria that it is difficult 
to compare the essential descriptors of person, place, or time. For example, in 
surveillance of diabetes, one could determine the frequency of diabetes from surveys 
(self reports of diabetes), surveys using glucose determination (laboratory-confirmed), 
or reviews of ambulatory or hospital records (physician diagnoses). 
Each method provides a different perspective on the problem. Self reports are subject 
to vagaries of recall and variation in interpretation (a patient may be under treatment, 
may have "a touch of diabetes" or prediabetes, or may have a history of gestational 
diabetes). Glucose determinations allow detection of previously undiagnosed diabetes. 
Medical records identify only patients currently receiving medical care. 

Case definitions should be specified including criteria for person, place, time, 
clinical or laboratory diagnosis, and epidemiologic features. 
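
A case definition of this kind can be written down as an explicit rule, which makes it easier to apply consistently and to see how the classification shifts between epidemic and non-epidemic periods. The sketch below is a hypothetical definition for illustration only; the field names and the rules are assumptions, not a published case definition.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """One reported illness (all fields are hypothetical)."""
    clinically_compatible: bool   # meets the clinical criteria
    lab_confirmed: bool           # serologic or other laboratory evidence
    epi_linked: bool              # linked to another case in an outbreak


def classify(report, outbreak_period=False):
    """Classify a report as 'confirmed', 'suspected', or 'not a case'."""
    if report.lab_confirmed:
        return "confirmed"
    # During an outbreak, an epidemiologic link may be accepted
    # in place of laboratory evidence.
    if outbreak_period and report.clinically_compatible and report.epi_linked:
        return "confirmed"
    if report.clinically_compatible:
        return "suspected"
    return "not a case"


linked = Report(clinically_compatible=True, lab_confirmed=False, epi_linked=True)
print(classify(linked, outbreak_period=False))
print(classify(linked, outbreak_period=True))
```

The same report is classified differently in the two periods, which is one reason that changes in case definition must be taken into account when secular trends are interpreted.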

Data Collection 

Information on diseases, injuries, and risk factors can be obtained in many ways. 
Each mechanism has characteristics that must be balanced against the purpose of the 
system (see Chapter III). Timeliness is of the essence for frequently fatal 
conditions such as plague, rabies, or meningococcal meningitis. Notifiable-disease 
systems are most appropriate for such potentially catastrophic conditions with high 
and urgent preventability constraints. Conversely, detailed information on influenza 
strains or Salmonella serotypes must come from laboratory-based systems. Long-term 
mortality patterns are available through vital records systems. 

Often, existing data sets can provide surveillance data. Such sets include vital 
records, administrative systems, and risk-factor or health-interview surveys. Among 
administrative systems, hospital-discharge data, medical-management-information and 
billing systems, police records for violence, and school records for disabilities or 
injuries among children can all provide needed data. In addition, with some 
modification, an existing system might provide needed data more economically or 
efficiently than a newly initiated system. 

Existing registries or surveys may collect information on defined populations. To the 
extent that the condition of interest is uniformly distributed, the population under 
study is reasonably representative, and the information collected is available on a 
timely basis, such systems can be valuable data sources. Although many registries are 
established for research purposes, they often provide valuable data for surveillance 
purposes. In particular, cancer registries have been widely used (4). 


Sentinel providers can also constitute a network for collecting data on common 
conditions, such as influenza; more specialized providers can provide data on less 
common conditions, e.g., ophthalmologists who provide information on treatment of 
patients for diabetic retinopathy. 


Data-collection instruments should use generally recognized and, where suitable, 
computerized formats for each data element to facilitate analysis and comparison with 
data collected in other systems, e.g., census and other surveillance data. Careful 
consideration should be given to using identifiers. Although additional assurances of 
confidentiality and privacy considerations will be required, the ability to link data 
to other systems, such as through the National Death Index, may enhance the value of 
the system. 

Active and passive systems 

Primary surveillance-data-collection systems have traditionally been classified as 
passive or active. For example, most routine notifiable-disease surveillance relies 
on passive reporting. On the basis of a published list of conditions, health-care 
providers report notifiable diseases on a case-by-case basis to the local health 
department. This passive system has the advantage of being simple and not burdensome 
to the health department, but it is limited by variability and incompleteness in 
reporting. Although the completeness of reporting may be augmented by efforts to 
publicize the importance of reporting and by continued feedback to communications 
media representatives, passive reporting systems may still not be representative and 
they may fail to identify outbreaks. To obviate these problems, more active systems 
are often used for conditions of particular importance. These systems involve regular 
outreach to potential reporters to stimulate the reporting of specific diseases or 
injuries. Active systems can validate the representativeness of passive reports, 
assure more complete reporting of conditions, or be used in conjunction with specific 
epidemiologic investigations. Since resources are often limited, active systems are 
often used for brief periods for discrete purposes such as during the measles 
elimination efforts. 

Limited surveillance systems 


Some surveillance efforts may not require ongoing systems. Surveillance to deal with 
specific problems may be needed to address problems for which all cases must be 
identified in order to assess the level of risk. Such programs can be conducted to 
resolve specific problems and then be terminated (5). Similarly, for logistic and 
economic reasons, it may not be feasible to mount a surveillance system across large 
geographic areas, and representative populations may need to be selected. Sentinel 
providers can also provide information on common conditions or conditions of 
particular interest to them. 

Field testing 

The careful development and field testing of surveillance systems and procedures is 
important to facilitate the implementation of feasible systems and to avoid making 
changes as systems are implemented on a broad scale. The frustration engendered by a 
new and poorly executed system may undermine efforts to improve or use existing 
systems for the same or other conditions. As new surveillance systems or new 
instruments and procedures are developed, field tests of their feasibility and 
acceptability are appropriate. These field-test projects can demonstrate how readily 
the information can be obtained and can detect difficulties in data-collection 
procedures or in the content of specific questions. Analyses of this test information 
may also identify problems with the information collected. Model surveillance systems 
may facilitate the examination and comparison of a variety of approaches that would 
not be feasible on too large a scale and may identify methods suitable for other 
conditions or other settings. 

The data to be collected by a surveillance system, the data sources and collection 
methods, and the procedures for handling the information should be developed and 

Data Analysis 

A determination of the appropriate analytic approach to data should be an integral 
part of the planning of any surveillance system. The data needed to address the 
salient questions must be assessed to assure that the data source or collection 
process is adequate. Analyses may prove to be as simple as an ongoing review of all 
cases of rare but potentially devastating illnesses, such as plague. For most 


conditions, however, an assessment of the crude number of cases and rates is followed 
by a description of the population in which the condition occurs (person), where the 
condition occurs (place), and the period over which the condition occurs (time). 
These basic analyses require decisions as to the kind of information that needs to be 
collected. The level of detail required varies substantially from condition to 
condition. For instance, one may need more detailed information regarding the 
population that is not receiving prenatal care than on the one that is exposed to 
meningococcal disease, because the nature of the intervention for the former is likely 
to be more complex and require an understanding of socioeconomic factors. Similarly, 
how one will collect data on geographic areas may depend on whether the data will be 
examined at the county, state, or census-tract level. 
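
The basic description by person, place, and time amounts to counting cases along each of those dimensions. The sketch below shows one possible tabulation. The record layout, the age groupings, the county labels, and the use of ISO week numbers for the time dimension are all assumptions made for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical case records; a real system would read these from its data base.
cases = [
    {"age": 34, "county": "A", "onset": date(1992, 1, 6)},
    {"age": 7, "county": "A", "onset": date(1992, 1, 8)},
    {"age": 70, "county": "B", "onset": date(1992, 1, 15)},
]


def age_group(age):
    """Assign an age in years to a broad group (arbitrary cut points)."""
    if age < 18:
        return "<18"
    if age < 65:
        return "18-64"
    return "65+"


def tabulate(records):
    """Count cases by person (age group), place (county), and time (ISO week)."""
    by_person = Counter(age_group(r["age"]) for r in records)
    by_place = Counter(r["county"] for r in records)
    by_time = Counter(r["onset"].isocalendar()[1] for r in records)
    return by_person, by_place, by_time


by_person, by_place, by_time = tabulate(cases)
print(dict(by_person))
print(dict(by_place))
print(dict(by_time))
```

The choice of groupings is itself a planning decision: counts by county cannot later be examined at the census-tract level unless the finer geographic detail was collected in the first place.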

Most contemporary surveillance systems are maintained electronically. The types of 
analyses to be performed and the size of the data bases should suggest the type of 
hardware and software needed (see Chapter XI). As personal computers become more 
powerful, the capacity of data-storage devices continues to grow, and data-sharing 
systems such as local- and wide-area networks become more widely available, more 
surveillance systems can be operated on personal computers. Software to meet most 
basic analytic needs for surveillance, including mapping and graphing, is now widely 
available. The analytic approach often suggests a basic set of analyses that are 
performed on a regular basis. These analyses can be designed early in the development 
of the system and incorporated into an automated system, which can then be run by 
support personnel. 

The adequacy of the data system and processing mechanisms should be assured. 

Interpretation and Dissemination 

Data must be analyzed and presented in a compelling manner so that decision makers at 
all levels can readily see and understand the implications of the information. 
Knowledge of the characteristics of the audiences for the information and how they 
might use it may dictate any of a variety of communications systems. Routine, public 
access to the data--consistent with privacy constraints--should be planned for and 
provided. This access can be facilitated with various electronic media, ranging from 
systems with structured-analysis features suitable for general users to files of raw 
data for persons who can do special or more detailed analyses themselves. 

The primary users of surveillance information, however, are public health 
professionals and health-care providers. Information directed primarily to those 
individuals should include the analyses and interpretation of surveillance results, 
along with recommendations that stem from the surveillance data. Graphs and maps 
should be used liberally to facilitate rapid review and comprehension of the data. 
Communications media represent a valuable secondary audience that can be used to 
amplify the messages from surveillance information. The media play an important role 
in presenting and reinforcing health messages. Innovative methods for presenting 
information capitalizing on current audiovisual technology should be explored (see 
Chapter VII). 


Planning, like surveillance itself, is an iterative process requiring the regular 
reassessment of objectives and methods (see Chapter VIII). The fundamental question 
to be answered in evaluation is whether the purposes of the surveillance system have 
been met. Did the system generate needed answers to problems? Was the information 
timely? Was it useful for planners, researchers, health-care providers, and public 
health professionals? How was the information used? Was it indeed worth the effort? 
Would those who participated in the system wish to (be willing to) continue to do it? 
What could be done to enhance the attributes of the system (timeliness, simplicity, 
flexibility, acceptability, sensitivity, predictive-value positive, and 
representativeness)? 

Answers to these questions will direct subsequent efforts to revise the system. 
Changes might be minor (e.g., the addition of data elements to existing forms), or 
major (e.g., the need to obtain information from entirely different data sources). 
For example, a system to determine utilization of mammography might be based on 
administrative billing systems. Yet, problems with reports of multiple mammography 
examinations for the same individual might require the addition of unique patient 
identifiers or the addition of questions on mammography use from self reports on 
health-interview surveys. If access emerges as a critical factor in mammography 
utilization, then ongoing monitoring of the quantity and location of mammography 
facilities or monitoring for appropriate insurance coverage for mammography might be 

Periodic rigorous evaluation assures that surveillance systems remain vibrant. 
Systems that assess problems whose only interest is historical should be discontinued 
or simplified to reduce the reporting burden. Contemporary systems should take 
advantage of the emergence of new technology for information collection, analysis, and 
dissemination. They should capitalize on new information systems. For example, 
sentinel surveillance systems have become more flexible to allow the inclusion of an 
array of topics. Electronic medical records and standardized clinical data bases all 
provide opportunities to obtain data that have been burdensome or difficult to secure 
(6). These information sources may also provide data in a more timely fashion and 
may allow individuals to be tracked, an option that would be virtually impossible 
without such electronic systems. 


Virtually all surveillance systems involve networks of organizations and individuals. 
Surveillance of notifiable disease relies on health-care providers including 
clinicians, hospitals, and laboratories to report to local health departments, who 
have the initial responsibility for responding to reports and amassing data. In many 
states, epidemiologists in the state health departments are responsible for 
surveillance and control of notifiable diseases in their states. In larger states, 
other organizational units--such as those dealing with sexually transmitted disease, 
immunization, or tuberculosis control--often have primary responsibility for 
surveillance and control of specific diseases or injuries. The state epidemiologist 
is responsible for the ongoing quality control, collection, analysis, interpretation, 
dissemination, and use of notifiable-disease data within that state. Data are 
subsequently forwarded each week to the national level where they are again analyzed, 
interpreted, and disseminated. 

Programs for injuries and chronic and environmental diseases also may have complex 
organizational structures and may involve a wide array of external professional and 
voluntary interest groups whose needs must be addressed. Some basic surveillance
information can be gleaned from such ongoing information systems as vital records,
hospitalization programs, and registries. Although some of these conditions are part 
of state notifiable-disease lists, many require surveillance systems to be established 
in unique places (e.g., rehabilitation units and emergency medical services for 
spinal-cord injuries or radiology centers for mammography). The support and interest
of these groups of constituents are valuable in establishing the systems; these groups 
can provide key input regarding purposes of systems and users of systems, as well as 
assistance in developing the systems themselves. 

The complex relationships among these organizational units and their constituents
require open communication to establish priorities and methods consistent with the
needs and resources of each group. The conflicting desire for more detailed 
information must be balanced against the associated burden and cost, as well as 
against the utility of collecting extensive amounts of data. For example, electronic 
systems that may facilitate higher quality, more complete, and more timely data also 
involve the commitment of equipment, training, and changes in day-to-day activities 
that may permeate all levels of the system. One must understand the needs of each 
recipient group for the information and assess and assure their commitment to the 
system. It is also critical to be attentive to how components of the system can best 
be integrated into the overall system in terms of day-to-day operations. 

The Council of State and Territorial Epidemiologists, an affiliate of the Association 
of State and Territorial Health Officials, has the authority in the United States to 
recommend which health conditions should be notifiable. After this list has been 
agreed upon, it is then up to each state to determine whether and how the conditions 
should be made reportable. Although most states report all those conditions 
considered to be nationally notifiable, a wide range of conditions are reportable in 
only a few states (3). States may exercise their authority through regulations,
boards of health, or legislative procedures. The diversity of these methods is 
described more fully in Chapter XII. Each of these mechanisms entails the involvement 
of groups with an array of medical, administrative, public health, and policy 

The success of surveillance depends heavily on the quality of the information entered 
into the system and on the value of the information to its intended users. A clear
understanding of how policy makers, voluntary and professional groups, researchers,
and others might use surveillance data is valuable in garnering the support of these 
audiences for the surveillance system. 


1. Healthy People 2000. National health promotion and disease prevention 
objectives. DHHS Pub. No. (PHS) 91-50212. Washington, D.C.: U.S. Department 
of Health and Human Services, Public Health Service, 1991. 

2. Centers for Disease Control. Consensus set of health status indicators for the 
general assessment of community health status--United States. MMWR 1991;40:449-

3. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. Mandatory reporting of
infectious diseases by clinicians. JAMA 1989;262:3018-26.

4. American Cancer Society. Cancer facts and figures--1991. Atlanta, Georgia:
American Cancer Society, 1991. 

5. Teutsch SM, Herman WH, Dwyer DM, Lane JM. Mortality among diabetic patients 
using continuous subcutaneous insulin infusion pumps. N Engl J Med 

6. Ellwood PM. Outcomes management. A technology of patient experience. N Engl J 
Med 1988;318:1549-56. 



Notifiable disease and related reporting mechanisms 

Vital statistics 

Sentinel surveillance 


Administrative data-collection systems 


Appendix III 


Chapter III 

Sources of Routinely Collected Data for Surveillance


Nancy E. Stroup 
Matthew M. Zack 
Melinda Wharton 

"The real voyage of discovery consists not in seeing new landscapes but in having new
eyes."

Marcel Proust 


This chapter reviews sources of routinely collected data that can be used for public 
health surveillance. In many instances, these sources will provide sufficient 
information so that active case-finding for the health event of interest may not be
necessary. In other instances, analysis of routinely collected data, in conjunction
with active case-finding, will provide the basis for a comprehensive assessment of the
public health impact of a particular health event. 

For infectious diseases, surveillance activities have traditionally relied on 
"notifiable disease" reporting systems based on legally mandated reporting of cases to
health officials. Depending on characteristics of the reporting system and of the 
specific health event, these systems can provide timely information that is 
particularly useful for monitoring short-term trends and for detecting outbreaks or 
epidemics of disease. While prevention and control of infectious diseases remain a
mainstay of public health practice, there is increasing emphasis on monitoring the 
public health impact of non-infectious or chronic diseases and injuries, as well as 
risk factors for these conditions, including behavioral risk factors, demographic 
characteristics, and potential exposure to toxic agents. With the expansion in the 
number and type of health events under surveillance, the use of existing data sources, 
such as vital statistics and more recently hospital discharge data, has expanded; and 
new data sources, such as behavioral risk factor surveys, have been developed. 

This chapter describes characteristics of six types of health information systems in 
which data are collected routinely and are generally available for analysis. The six 
are notifiable disease and related reporting systems, vital statistics, sentinel 
surveillance, registries, health surveys, and administrative data collection systems. 
As more sources of health information become available, effective surveillance for a 
specific health event, whether infectious or non-infectious, will rely on analysis and
synthesis of information from a variety of sources, each of which has different 
strengths and limitations. In many instances, these sources will provide sufficient 
information so that active case-finding or other surveillance-related activities may 
not be necessary. In other instances, analysis of routinely collected data, in 
conjunction with other activities, will provide the basis for a comprehensive 
assessment of the public health impact of a particular health event. For cervical 
cancer, for instance, surveillance activities could include the following: 
comprehensive assessment of cancer incidence data and cancer mortality data; reports 
of cervical cytology and genital infections by laboratories; reports of Pap smear
histories, smoking patterns, genital infections and safe sex practices from health
surveys; review of hospital-discharge data to monitor surgical treatment for advanced
disease; and information from a variety of sources on attitudes, payment strategies,
and other barriers or inducements that could influence the prevention, early
detection, and treatment of cervical cancer. The selection and appropriate use of data 
from these sources would depend primarily on the nature and scope of activities to be 
monitored as part of a cervical cancer control program. 

Depending on the health event of interest, special short-term or demonstration 
projects can also provide information that is very useful for surveillance or other 
prevention-related activities. This chapter, however, focuses on sources of data in
which information on a wide range of health events is collected on a routine, ongoing 
basis and is generally available for analysis. 

The examples provided in this chapter are meant to be illustrative rather than 
exhaustive. Many examples are research- rather than surveillance-related, but they do 
highlight potential uses of these data sources for surveillance and related 
activities. The background information provided on the methods used to collect 
different types of data serves, however, as a starting point for a more detailed 
assessment of the strengths and limitations of these data systems for surveillance of 
a particular health event. The sources of data mentioned in this chapter are listed 
separately in Appendix A. 

Information on the availability of routinely collected health and population data is
available from a variety of sources. Federal agencies that provide data in the United 
States include the following organizations: 

• the Centers for Disease Control (CDC), including the National Center for
  Health Statistics (NCHS);
• the National Institutes of Health (NIH), including the National Cancer
  Institute (NCI), the National Heart, Lung, and Blood Institute (NHLBI),
  the National Institute on Drug Abuse (NIDA), the National Institute on
  Alcohol Abuse and Alcoholism (NIAAA), and the National Institute of
  Mental Health (NIMH);
• the Food and Drug Administration (FDA);
• the Agency for Health Care Policy and Research (AHCPR);
• the Indian Health Service (IHS);
• the Health Care Financing Administration (HCFA);
• the National Highway Traffic Safety Administration (NHTSA);
• the Consumer Product Safety Commission (CPSC); and
• the Bureau of the Census.

State health departments also routinely collect health information, some of which is 
not available from federal sources; and private organizations (e.g., the Public Health 
Foundation and the National Association of Health Data Organizations) either have 
health information or maintain inventories of information that can be obtained from 
other sources. 

Information is available in other countries from similar national or local agencies 
(1-4). The United Nations and the World Health Organization (WHO) routinely publish
population estimates and summary information on mortality and natality in member
countries (5-6). Health and demographic information is also available from regional
offices such as WHO/Europe (7). 



Reporting on notifiable diseases at the national level originated in the United States 
in 1878, when Congress authorized the United States Public Health Service (PHS) to 
collect reports on morbidity from cholera, smallpox, plague, and yellow fever, each of 
which was controlled through quarantine measures (8,9). Although initially focused on 
foreign ports, authority for weekly reporting was expanded in 1893 to include states 
and municipal authorities (9). To increase uniformity, the Surgeon General was 
authorized in 1902 to provide forms for the collection, completion, and publication of 
reports at the national level. Weekly telegraphic reporting was recommended for a few
diseases in 1903, and by 1928, all states, the District of Columbia, Hawaii, and 
Puerto Rico were participating in national reporting of specified conditions (8). 
Compulsory notification for selected infectious diseases was also instituted in many 
other countries in the late 1800s, including Japan (1880), Scotland (1887), Italy 
(1888), England and Wales (1889), and Northern Ireland (1899) (2,3,10). 

The list of diseases for which notification is recommended has changed over time, and, 
although there is overlap, the lists vary from jurisdiction to jurisdiction. In the
United States, for instance, 47 infectious diseases were considered notifiable at the
national level in 1989 and were reported to CDC through the National Notifiable
Disease Surveillance System (NNDSS) (11). In at least one state, however, reporting
was required for over 160 infectious diseases or related conditions, 90 occupational
diseases, 23 other environmental diseases, 29 congenital or related conditions, and
six diseases of unknown cause. With the addition of Lyme disease and Haemophilus
influenzae in 1991, 49 infectious diseases are currently notifiable at the national
level in the United States (12). In recent years, lists of notifiable diseases in
other countries included 66 diseases in Italy (19 with rapid reporting procedures), 32
in Scotland and in Japan, 29 in England and Wales, and 26 in Northern Ireland
(2,3,10). Procedures for modifying the list of notifiable diseases also vary from
country to country. In the United States, reporting for notifiable diseases is
mandated at the state level, and the Council of State and Territorial Epidemiologists
(CSTE), a consortium of epidemiologists from all state and territorial health
departments, recommends a list of conditions to be reported each week to CDC (12).
National reporting is required for three quarantinable diseases--plague, cholera, and 
yellow fever. Cases of these three diseases are also reported to the WHO by member 

In the United States, occupational diseases or occupation-related conditions are 
considered notifiable in some states, but at present, occupation-related conditions 
are not reported nationally (13,14). In 1988, at least one occupation-related 
condition was considered reportable in 34 states or other jurisdictions. Lead 
poisoning, pesticide poisoning, and occupation-related lung diseases are among the 
occupation-related conditions that are reportable in many states.

In recent years, notifiable-disease-reporting mechanisms have been used in some 
localities to collect information on conditions that are not infectious, occupation- 
related, or vaccine-related. In the United States, spinal-cord injuries, elevated 
blood lead levels for children and for occupationally exposed workers, and Alzheimer's
disease are among the conditions for which reporting is required in some localities, 
although national reporting is not recommended by CSTE (15-17). 

Reporting in the United States for adverse events following vaccination or in 
association with the administration of drugs differs from other notifiable-disease
reporting procedures in that the former types of events are reported nationally rather
than to state health departments. Since 1988, all health-care providers and vaccine
manufacturers have been required to report certain suspected adverse events following
specific vaccinations (18). The Vaccine Adverse Event Reporting System (VAERS), in
which all reports of suspected adverse events following any vaccination are accepted,
became operational in 1990.

Adverse drug reactions are reported in the United States to the FDA (19,20). Drug
manufacturers are required to submit post-approval reports of adverse drug reactions 
as well as reports from ongoing clinical trials and selected reports from foreign 
sources. Reports submitted to manufacturers by providers are sent to the FDA, or 
providers and patients can submit reports directly. Nearly 60,000 reports were 
submitted in 1989. Many other countries have similar adverse-drug-reaction reporting 
systems, and about 23 of these report data to the WHO Collaborating Center for 
International Drug Monitoring (21). In England, active surveillance for adverse
effects of specific drugs is conducted through the Prescription Event
Monitoring System, which is funded through both public and private sources (21,22).

Data Collection, Transmission, and Dissemination 

Although information on notifiable diseases is collated and published nationally, its 
primary purpose is to direct local prevention and control programs. In the United 
States, information is generally reported by clinicians to local or state health 
departments. State regulations governing notifiable disease reporting are often quite 
specific regarding timeliness of reporting. For conditions in which an immediate 
public health response is needed, notification by telephone is usually mandated, 
either immediately or within 24 hours of a suspected case. Other conditions are 
generally reported on a weekly basis after the diagnosis has been confirmed. 

For conditions that are reported nationally in the United States through the NNDSS, a 
subset of information — including the age, gender, race, and date of occurrence (or 
report) — is sent weekly to CDC by state health departments or other jurisdictions in a 
standard format, either as individual case reports or aggregate reports. Personal 
identifiers are not included in the NNDSS. Since 1990, all reporting states and
localities have transmitted information electronically to CDC through the National
Electronic Telecommunications System for Surveillance (NETSS) (23). National case
counts for most notifiable diseases are published in the Morbidity and Mortality
Weekly Report (MMWR) the week after they are reported to CDC.
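
The weekly forwarding step described above can be illustrated with a minimal sketch in Python. It is not the NNDSS or NETSS record format: the field names, the CaseReport class, and both helper functions are hypothetical, and ISO calendar weeks are used as a stand-in for the reporting week, which is an assumption made only to keep the example short.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


# Hypothetical record layout; field names are illustrative only and are not
# taken from any NNDSS or NETSS specification.
@dataclass(frozen=True)
class CaseReport:
    patient_name: str   # personal identifier, kept only at the state level
    condition: str
    age: int
    gender: str
    race: str
    event_date: date    # date of occurrence (or of report)


def deidentify(report):
    """Return only the subset of fields that would be forwarded nationally."""
    return {
        "condition": report.condition,
        "age": report.age,
        "gender": report.gender,
        "race": report.race,
        "event_date": report.event_date,
    }


def weekly_counts(reports):
    """Aggregate reports into counts per (year, week, condition).

    ISO weeks are an assumption of this sketch, not the official
    reporting-week convention.
    """
    counts = Counter()
    for report in reports:
        iso_year, iso_week, _ = report.event_date.isocalendar()
        counts[(iso_year, iso_week, report.condition)] += 1
    return counts
```

A state could forward either the de-identified individual records or the aggregate counts; the sketch shows both forms because the text allows either.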

Most state health departments also disseminate surveillance data and other public 
health information to health-care providers through weekly or monthly newsletters. For 
some conditions, including measles, hepatitis, syphilis, and acquired immunodeficiency 
syndrome (AIDS), more detailed information on risk factors and other information
needed for disease-control programs is also collected by state and local health 
departments and, in some instances, is sent to CDC. Information is also sent to CDC 
through NETSS for conditions such as spinal cord injuries, giardia infection, and Reye 
syndrome, that are not nationally notifiable but for which information is useful at 
the national level. Although their use in the United States is limited primarily to 
influenza surveillance, networks of sentinel health-care providers in many European 
countries report supplemental information on notifiable diseases to local and national 
health officials (see below).

Surveillance for zoonotic diseases also involves monitoring animal hosts that either 
transmit the disease directly to humans or are also susceptible to the disease. For 
various types of encephalitis, for instance, detection of elevated virus titers in 
mosquitoes, wild birds, sentinel flocks of chickens, or horses can signal that an 
outbreak of human disease may occur so that mosquito-control activities can be 
initiated (24). Similarly, the potential for human cases of rabies is assessed through
monitoring wild skunks, raccoons, bats, and other animal vectors (25); the potential
for human plague is assessed by monitoring rodents in endemic areas (26); and Rocky
Mountain spotted fever and Lyme disease are monitored through testing of ticks
(27,28).

Although most cases of notifiable conditions are reported by clinicians, the role 
laboratories play in reporting notifiable conditions is becoming increasingly 
important. In the United States, many states have developed reporting requirements for 
laboratories and hospitals for conditions that need laboratory confirmation for 
diagnosis (11,29,30). In New York City, for instance, laboratories are required to
report elevated blood-lead levels in children, and at least five states rely on
laboratory reporting to identify workers with elevated levels of lead or other heavy
metals (15). Comprehensive, nationwide reporting by laboratories is not yet available
in the United States, but in England, Wales, and Northern Ireland, nearly all
microbiology laboratories voluntarily report positive identifications of selected
conditions to the national Public Health Laboratory Service (PHLS) (10).

Strengths and Limitations 

Although many diseases or conditions are considered notifiable, compliance is poor in 
most countries and sanctions are rarely enforced. As Sherman and Langmuir noted in
1952, "Our system of notification of individual case reports is a haphazard complex of
interdependence, cooperation, and goodwill among physicians, nurses, and county and
state health officers, school teachers, sanitarians, laboratory technicians,
secretaries, and clerks. It is a rambling system with variations as numerous as the
individual diseases for which reports are requested, and as numerous as the interests
and individual traits of the administrative health officers, epidemiologists, and
statisticians in [all] the ... States and the several federal agencies concerned with
the data" (31). Indeed, it is remarkable--given the jerry-rigged nature of the
system--that the information collected is at all useful.

Under-reporting is a consistent and well-characterized problem of notifiable-disease- 
reporting systems (see Chapter 12). In the United States, estimates of completeness of 
reporting range from 6% to 90% for many of the common notifiable diseases (32).
Reporting is generally more complete for conditions such as plague and rabies that 
cause severe clinical illness with serious consequences. Among the many factors that 
contribute to incomplete reporting of notifiable conditions are lack of medical 
consultation for mild illnesses; concealment by patients or health-care providers of 
conditions that might cause social stigma; lack of awareness of reporting 
requirements; lack of interest by the medical community; incomplete etiologic 
definition of notifiable conditions; inadequate case definitions for surveillance 
purposes; variation in clinical expertise in diagnosing conditions in different areas; 
changes in procedures for verifying reports from providers; variation in the use of 
laboratory confirmation; variation in laboratory procedures; the effectiveness of 
control measures in effect; and priorities of health officials at local, state, and 
national levels (9,30,33). Conversely, increased concern can result in an increase in
reported cases. Public health officials may actively solicit information if an
outbreak is suspected, and case reports may increase in response to reports by the
media.

The extent of under-reporting can vary by risk group. An evaluation of reporting for 
AIDS in Philadelphia found, for instance, that under-reporting was more prevalent for 
those who were employed in white-collar occupations and who had private health 
insurance (34). Similarly, a review of hospital-discharge data in South Carolina
indicated that AIDS diagnoses were less likely to be reported for whites over 40 years
of age (35).
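
As a rough illustration of how completeness of reporting might be compared across risk groups, the following Python sketch divides the number of cases that reached the notifiable-disease system by the number identified through an independent source such as hospital-discharge records. The function names and the counts in the usage note are hypothetical and are not drawn from the studies cited above; the sketch also assumes that the independent source captures every true case, which is rarely so in practice.

```python
def completeness(reported, ascertained):
    """Proportion of independently ascertained cases that were also reported."""
    if ascertained <= 0:
        raise ValueError("ascertained case count must be positive")
    if not 0 <= reported <= ascertained:
        raise ValueError("reported cases must lie between 0 and ascertained")
    return reported / ascertained


def completeness_by_group(counts):
    """Map each group to its completeness.

    `counts` maps a group label to a (reported, ascertained) pair.
    """
    return {
        group: completeness(reported, ascertained)
        for group, (reported, ascertained) in counts.items()
    }
```

With made-up counts such as {"private insurance": (30, 50), "other coverage": (45, 50)}, the function returns 0.6 and 0.9, the kind of gap between groups that an evaluation of under-reporting would look for.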

Changes in case definitions and the extent to which laboratory confirmation is 
required for reporting can also affect reporting for notifiable conditions. In the 
United States, a 1984 survey of state epidemiologists found substantial variation in 
definitions used for communicable disease surveillance by state health departments. 
Since then, surveillance case definitions have been developed for many communicable 
diseases and occupational conditions, as well as for spinal-cord injuries (14,17). The 
degree to which standardized case definitions for notifiable-disease reporting have
been adopted varies, but recent experience suggests that further important changes in
reported trends will occur as the definitions are more widely used. The 1987 revision
of the surveillance case definition for AIDS resulted in an increase in the number of
reported cases among heterosexual drug abusers (36). Changes in the surveillance case
definition for congenital syphilis resulted in a 5-fold increase in cases in some
reporting areas (37,38). Adoption of a uniform case definition for Lyme disease is
probably reflected in the decrease in reported cases in the United States in 1990 (39).

The extent to which clinical reports are confirmed with laboratory findings can have a 
substantial impact on reporting rates. For instance, malaria was endemic in the 
southeastern United States in the 1930s. Epidemiologic studies in 1947 indicated that 
routine reporting of aggregate case counts based on clinical findings alone was not 
providing an accurate picture of current disease activity. When reporting of 
individual cases with laboratory confirmation was required, it became clear that 
endemic malaria had disappeared between 1935 and 1945, before malaria control programs 
based on drainage and indoor residential spraying of DDT were initiated (40,41). In 
recent years, the role of laboratories has been particularly important for 
surveillance of the numerous subtypes of Salmonella, legionellosis, and nosocomial
infections, and for detecting elevated blood-lead levels (15,30,42). Without laboratory-
based surveillance, for instance, a large outbreak of drug-resistant Salmonella
newport that originated from animals fed antimicrobials might not have been detected
(43).

In spite of their limitations, surveillance systems based on reporting of notifiable 
conditions are a mainstay of public health surveillance. Unlike most other sources of 
routinely collected data, information from notifiable-disease systems is available 
quickly and from all jurisdictions. Knowledge of the specific characteristics of 
reporting for a particular condition is helpful in interpreting the findings. While 
long-term trends may be difficult to interpret without supplemental information, 
notifiable-disease systems can often detect outbreaks or other rapid changes in 
disease incidence in a timely manner so that control activities can be initiated. As 
appropriate, initial observations can be evaluated further with additional studies. 
Notifiable-disease systems can also detect changes in patterns of disease by 
demographic characteristics or risk groups. In the United States, for instance, human
immunodeficiency virus (HIV) and AIDS surveillance systems have identified new risk
groups including intravenous drug abusers and their mates and have highlighted the
emerging problem of children who are born HIV-infected. Evaluation of surveillance
information has also led to changes in disease prevention and control strategies. On
the basis of reports of measles among elementary school-, high school-, and college-
age students, recommendations for measles vaccination in the United States were
recently changed to include a two-dose schedule (44). Similarly, because strategies
based on vaccination of high-risk groups have not been as effective as originally
anticipated, recommendations for hepatitis B vaccination have recently been modified
(45).

In the United States, reports of adverse drug reactions often result in labeling 
changes for new drugs (19). Drug withdrawals are infrequent, although two drugs (an 
antidepressant and a non-steroidal anti-inflammatory agent) have been withdrawn in
recent years. Vaccine adverse-event-reporting systems are important for detecting
potential problems following administration of vaccine, such as an increase in
paralytic poliomyelitis among recently vaccinated children in the 1950s and the
increase in Guillain-Barre syndrome following vaccination for swine influenza
(18,46,47).


Notifiable-disease-reporting mechanisms have also been important for identifying 
unusual conditions that appear to be increasing and for obtaining a preliminary 
assessment of their public health impact. Among the more recent examples in the United 
States are AIDS, toxic-shock syndrome, legionellosis, Reye syndrome, and eosinophilia- 
myalgia syndrome (EMS). Following the initial report from a state health department, 
nationwide surveillance for EMS using a standard case definition was instituted within 
a few days, and, through additional studies, the putative agent was identified (48).

In the future, reporting of notifiable conditions may rely, in part, on computerized 
data bases developed for billing and other purposes. However, the utility of these 
systems is limited at present: first, because International Classification of Diseases
(ICD) codes are often not used to identify infectious agents on billing records and,
second, because information in these large data bases is not available immediately
(49). In the near term, improvements in notifiable-disease reporting in most areas
are likely to be related to increased reliance on laboratory-based reporting and on 
the use of sentinel health-care providers or sentinel sites. 

Vital Statistics 

The systematic registration of vital events had its origins in the parish registers of 
15th century Western Europe (1). One of these registers, the Bills of Mortality--a
weekly tally begun in 1532 of the number of persons who died in London from plague and
other causes--was used to study patterns of mortality by John Graunt, one of the first
to use numerical methods to study disease (50).

Parish registers were superseded in the 19th century by civil registers kept for legal 
purposes. Registration of vital events usually remains the responsibility of local 
authorities, but the use of standard procedures for collecting, coding, and reporting 
vital events--first used systematically by William Farr in Great Britain in the 1830s--
allows information from different jurisdictions to be aggregated, summarized, and
compared. Farr, the first medical statistician in the Office of the General
Registrar of England and Wales, recognized the importance of determining death rates
for different segments of the population using information collected systematically at
the time of birth or death. In the first annual report to the Registrar General in
1839, Farr discussed the principles that should govern a statistical classification of
disease and urged the adoption of a uniform system (2,51). Nomenclature and
statistical classification systems initially developed by Farr and by Marc d'Espine 
form the basis of the international disease classification system used today. 

Information collected at the time of birth and death is one of the cornerstones of 
surveillance in both developed and developing countries. Today, about 80 countries or
areas report statistics on vital events to WHO; these statistics are coded and
tabulated according to the ninth revision of the International Classification of
Diseases (ICD-9) and represent about 35% of the deaths that occur each year worldwide (52).

Vital statistics are an important source of information for surveillance because they 
are the only health-related data available in many countries in a standard format 
(52). Also, they are often the only source of health information available for the
entire population and the only source available for estimating rates for small 
geographic areas. Vital statistics have been used to: 

• monitor long-term trends (53-55);
• identify differences in health status within racial or other subgroups of
  the population (55,57);
• assess differences by geographic area (58-62) or occupation (50,63);
• monitor deaths that are generally considered preventable (64-67);
• generate hypotheses regarding possible causes or correlates of disease
  (68,69);
• conduct health-planning activities (70,72); and
• monitor progress toward achieving improved health of the population
  (7,72,73).

The usefulness of vital statistics for surveillance of a particular health event 
depends on the characteristics of that health event, as well as on the procedures used 
to collect, code, and summarize relevant information. In general, vital statistics 
will be more useful for conditions that can be ascertained easily at the time of birth 
or death. Likewise, mortality rates derived from death-certificate data will more 
closely approximate true incidence for conditions with a short clinical course that 
are easy to diagnose, are easily identified as initiating a chain of events leading to 
death, and are usually fatal (52,74-75). Although birth and death certificates are
filed shortly after the event occurs, the process of producing final vital statistics
at a national level from these data can take several years. Background information on 
the process of producing vital statistics, outlined here for the United States, is 
intended to highlight some of the strengths and limitations of vital statistics for 
public health surveillance. 

Birth and Death Certification 

In the United States, responsibility for the registration of birth, death, and fetal 
death is vested in the individual states and certain independent registration areas 
(e.g., New York City) (77). States are encouraged to adopt standard certificates 
similar to the "model" certificate developed by NCHS in collaboration with other 
groups, although some states modify the "model" certificate to comply with state laws
or regulations or to meet their own information needs (78). Certificates are usually 
filed with a registrar within 24 hours in the jurisdiction in which the event 
occurred. For birth certificates, the physician or attendant certifies the date, 
time, and place of birth, and other hospital personnel usually obtain information on
the remaining items (79). The 1989 model birth certificate includes additional
information on perinatal risk factors, such as maternal illnesses and complications of
labor and delivery, that will help to improve surveillance for perinatal events
(77,80,81).

For death certificates, the funeral director is usually responsible for including all 
personal information about the decedent and for assuring that medical information is 
provided by the physician who certifies the death (82). Information provided by the
physician includes the cause of death (immediate, "as a consequence of," and
underlying causes), the interval between onset of the condition and death, other
important medical conditions, the manner of death (e.g., accident, homicide, or
suicide), whether an autopsy was performed, and whether the medical examiner or
coroner was notified of the death (78). In most cases, information from autopsies and
reports from medical examiners or coroners are not available at the time the death 
certificate is filed, although the certificate can be amended when this information 
becomes available. Local registrars assure that all vital events that occurred in the 
jurisdiction are registered and that required information is provided on certificates 
before they are sent to the state registrar. Both state and local registrars can ask
physicians or funeral directors for additional information if the certificate is
considered incomplete. State registrars are usually responsible for numbering, 
indexing, and binding certificates for permanent safekeeping. Also, state registrars 
usually forward certificates for deaths of non-residents to their states of residence. 

Coding, Classification, and Calculation of Rates 

To calculate national death rates, the number of live births is used as the denominator
for infant and maternal mortality rates, and estimates of the population, usually
derived from censuses, are used as the denominators for other death rates (51,83).
Conditions are classified and rates are calculated according to the ninth revision of
the ICD (ICD-9), developed through WHO and in use since 1979. The ICD-9 includes a
tabular list of categories and conditions with code numbers, definitions of key terms 
(e.g., underlying cause of death, low birth weight), rules for selecting the 
underlying cause of death, and lists of conditions for statistical summaries. 
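
The choice of denominator described above can be illustrated with a minimal sketch.
The counts below are hypothetical and the function name is an assumption made for this
illustration; it is not taken from any NCHS or WHO procedure.

```python
def rate(events, denominator, per):
    """Return the number of events per `per` units of the denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return events / denominator * per


# Hypothetical counts for one area in one calendar year.
infant_deaths = 380
live_births = 40_000
all_deaths = 8_600
midyear_population = 1_000_000

# Infant mortality uses live births as the denominator.
infant_mortality_rate = rate(infant_deaths, live_births, 1_000)

# Other death rates use a population estimate as the denominator.
crude_death_rate = rate(all_deaths, midyear_population, 100_000)

print(f"Infant mortality: {infant_mortality_rate:.1f} per 1,000 live births")
print(f"Crude death rate: {crude_death_rate:.1f} per 100,000 population")
```

The same function serves both cases; only the denominator and the multiplier change.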

Age-standardized rates are usually calculated when summary rates are compared in order 
to control for the effects of differences in age structure between compared 
populations (see Chapter V). In the United States, the age distribution of the U.S.
population in 1940 is usually used as the standard for vital statistics (84,85).
Other age distributions--such as the world standard population and the European
standard population--are often used for international comparisons (see Chapter V) (5).
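
As a rough sketch of why age standardization matters, the example below applies direct
standardization to two hypothetical populations that have identical age-specific death
rates but different age structures. The two age groups, the counts, and the standard
population are all invented for illustration; they are not the 1940 U.S. standard.

```python
def crude_rate(deaths, population, per=100_000):
    """Crude death rate: total deaths over total population."""
    return sum(deaths) / sum(population) * per


def directly_standardized_rate(deaths, population, standard, per=100_000):
    """Weight each age-specific rate by the standard population's age group."""
    expected = sum(d / p * s for d, p, s in zip(deaths, population, standard))
    return expected / sum(standard) * per


# Two age groups (younger, older); all figures are hypothetical.
standard = [6_000, 4_000]

pop_a, deaths_a = [8_000, 2_000], [8, 20]   # mostly younger people
pop_b, deaths_b = [2_000, 8_000], [2, 80]   # mostly older people

crude_a = crude_rate(deaths_a, pop_a)
crude_b = crude_rate(deaths_b, pop_b)
std_a = directly_standardized_rate(deaths_a, pop_a, standard)
std_b = directly_standardized_rate(deaths_b, pop_b, standard)

print(f"Crude rates:        A={crude_a:.0f}  B={crude_b:.0f}")
print(f"Standardized rates: A={std_a:.0f}  B={std_b:.0f}")
```

The crude rates differ widely because population B is older, while the standardized
rates coincide because the underlying age-specific rates are the same.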

In the United States, about half the states submit both medical and demographic data 
from certificates to NCHS in computerized form (84,85). Final national mortality and
natality data are generally not available from NCHS for at least 20 months after the 
close of the calendar year, although a written report based on a 10% sample of deaths 
is available within a few months. Final data are often available more quickly from 
individual states. Similarly, final mortality and natality data are generally 
available, with indices of quality and completeness, within 2-3 years for countries 
that routinely report data to WHO (5).

Comparability and Quality Control 

The quality of vital-statistics information depends on various factors, including the
completeness of registration; the relevance of the categories used for diseases,
injuries, and other conditions; the accuracy of demographic and medical data provided
on certificates; and the translation of this information into computerized data
(including its categorization and coding). When rates are calculated, estimates are
also affected by the accuracy of the population estimates or other estimates used for 
denominators. Differences in access to medical care, diagnostic practices, and 
interpretation of coding rules will also affect comparability. 

Registration and medical certification of deaths are virtually complete in most
developed countries (86). Population estimates used to calculate rates in developed
countries are usually derived from censuses conducted at regular intervals (usually 
every 10 years), in which the total population is enumerated (6). Inter-censal 
estimates are derived by adjusting census figures for birth, death, and migration 
patterns in the intervening years. In some countries, population estimates are 
derived from surveys or from continuous population registers. Through the United 
Nations, population estimates, including indices of the quality and completeness of 
these estimates, are available for about 220 countries or areas of the world. 

Population under-counts can have a measurable impact on mortality rates; rates will be 
inflated, for instance, if population estimates used for the denominator are too 
small. In the United States, the 1980 age-adjusted death rate (1940 age
standard) from all causes would decrease by 1.1% if the population estimate from the
1980 census were adjusted for under-counts (85). Effects are even greater for
subgroups of the population. For homicides and deaths resulting from legal 
intervention in the United States in 1980, adjustment for census under-count would 
change the ratio of death rates for black to white men ages 35-39 years from 7.3 to 
6.2--a decrease of nearly 18%. 
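
The mechanism can be sketched with invented numbers. The example below assumes that
the under-count is expressed as the fraction of the true population missed by the
census, and that one subgroup is under-counted more than the other. None of the figures
are the 1980 U.S. values.

```python
def adjusted_rate(deaths, counted_population, undercount_fraction, per=100_000):
    """Death rate after correcting the denominator for census under-count.

    `undercount_fraction` is the assumed share of the true population that
    the census missed, so true population = counted / (1 - fraction).
    """
    if not 0 <= undercount_fraction < 1:
        raise ValueError("undercount_fraction must be in [0, 1)")
    true_population = counted_population / (1 - undercount_fraction)
    return deaths / true_population * per


# Hypothetical subgroups A and B with equal counted populations.
rate_a = adjusted_rate(70, 100_000, 0.0)       # no adjustment
rate_b = adjusted_rate(10, 100_000, 0.0)
rate_a_adj = adjusted_rate(70, 100_000, 0.10)  # A under-counted by 10%
rate_b_adj = adjusted_rate(10, 100_000, 0.02)  # B under-counted by 2%

ratio_before = rate_a / rate_b
ratio_after = rate_a_adj / rate_b_adj

print(f"Rate ratio before adjustment: {ratio_before:.2f}")
print(f"Rate ratio after adjustment:  {ratio_after:.2f}")
```

Because the subgroup with the higher rate is also the one with the larger under-count,
correcting the denominators narrows the ratio between the two rates.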

When cause-specific rates are compared, both the extent to which information on birth 
and death certificates is reported completely and accurately and the precision of 
population estimates will affect the magnitude and the comparability of rates. The 
impact of these factors is likely to be of less importance for aggregated cause-of- 
death categories. Nonetheless, comparisons between different geographic areas or 
different population subgroups should be interpreted cautiously. 


Mortality from "signs, symptoms, and ill-defined conditions" (ICD-9 780-799) is often
used as an indicator of the care and consideration given by medical certifiers to
completing certificates. In recent years, the proportion of deaths for which "signs,
symptoms, and ill-defined conditions" were coded as the underlying cause ranged from
less than 1% in Australia, Czechoslovakia, Finland, Hungary, New Zealand, Sweden, and
the United Kingdom to 5%-10% in Belgium, France, Greece, Israel, Poland, Portugal,
and Yugoslavia (86). In the United States, 1.4% of deaths in 1988 were coded as
"signs, symptoms, and ill-defined conditions," with a range among the states of 0.4%
to 4.1% (85).
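
A minimal sketch of how such an indicator might be computed from coded records follows.
It assumes each death record carries its underlying cause as an ICD-9 code string such
as "799.9"; the record layout and function names are hypothetical.

```python
def is_ill_defined(icd9_code):
    """True if the code falls in ICD-9 780-799 (signs, symptoms, ill-defined)."""
    head = icd9_code.strip().split(".")[0]
    # Supplementary codes such as E812.0 or V codes are not numeric and are excluded.
    return head.isdigit() and 780 <= int(head) <= 799


def percent_ill_defined(underlying_causes):
    """Percentage of deaths whose underlying cause is coded 780-799."""
    if not underlying_causes:
        raise ValueError("no records supplied")
    flagged = sum(1 for code in underlying_causes if is_ill_defined(code))
    return 100 * flagged / len(underlying_causes)


# Hypothetical underlying-cause codes for five deaths.
codes = ["410.9", "799.9", "162.9", "780.6", "E812.0"]
print(f"{percent_ill_defined(codes):.1f}% coded to ill-defined conditions")
```

Computing this percentage by state or by certifier group gives a simple, comparable
measure of certification quality.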

The impact of these factors on international comparisons has been assessed for cancer 
and for respiratory disease (76,76). Within the United States, differences in 
completeness and accuracy of certificates have also been noted within racial and 
ethnic subgroups (87).

A variety of approaches will facilitate improvement in the quality of information on 
birth and death certificates. These include providing physicians and funeral 
directors clearer instructions for completing the certificates and more effective 
training regarding the importance of vital statistics and the importance of following 
recommended procedures for completing both the medical and demographic sections of 
certificates (77,88,89). State and local registrars can increase the extent to which
they contact physicians and funeral directors when information provided on 
certificates is not considered complete and can facilitate amendment of certificates 
when additional information is available from autopsies or other sources. 

In spite of limitations, birth and death certificates are an important source of 
information for cost-efficient surveillance of a wide range of health events at local, 
national, and international levels. Although differences in rates may not always 
reflect actual differences in disease and injury burden, routine analysis of 
information obtained at the time of birth and death can highlight areas in which 
further investigation of a health event is warranted. 

Examples of Surveillance Systems Based on Vital Statistics and 
Related Data 

Weekly reports. As part of the national influenza surveillance effort in the United 
States, vital registrars in 121 U.S. cities report to CDC each week the number of 
deaths that have occurred in those jurisdictions (90). This 121-City Surveillance
System has been operational since 1952. The total number of deaths and the number 
attributed to pneumonia and influenza by age group are reported, and the total number 
of deaths by age, city, and region are published within a week of receipt in the MMWR. 
About one-third of the deaths that occur in the United States are reported through the 
121-City Surveillance System, and most are reported to CDC within 2-3 weeks of 
occurrence. Mortality rates based on the 121-City system cannot be directly compared 
with rates derived from final mortality data. However, the 121-City system does 
detect short-term increases in deaths from influenza and pneumonia in a timely manner 
as needed for public health intervention. Increases in mortality from other causes--
including mortality during heat waves and increased deaths from pneumonia and
influenza among young men (later linked to AIDS)--have also been detected using the
121-City system. 

Monthly or quarterly reports. In the United States, final mortality data are 
generally not available for nearly 2 years, although provisional estimates are 
published by NCHS within 3-4 months in the Monthly Vital Statistics Report (MVSR). The
Current Mortality Sample, a 10% systematic sample of certificates, is sent to NCHS 
each month by state registrars. On the basis of this sample, provisional estimates of 
total monthly mortality by age, race (white, black, other), gender, state, and region 
are published about 3 months later, and provisional rates for 72 selected causes are
published the following month. Provisional rates are published by place of occurrence,
while final rates are published by place of residence. For the Mortality Surveillance
System (MSS), time-series regression models are fitted using monthly data, and charts 
displaying monthly estimates and the fitted model for specific conditions are 
published each month in the MVSR. 
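
The form of the regression models used for the MSS is not described here. As a
deliberately simplified sketch, the example below assumes the simplest possible case,
a straight-line trend fitted to monthly rates by ordinary least squares, and omits the
seasonal terms that a working model would probably need. The data are synthetic.

```python
def fit_linear_trend(values):
    """Ordinary least-squares line through (0, v0), (1, v1), ...

    Returns (intercept, slope), with time measured in months from the start.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two monthly values")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic series: 24 months of a rate rising by 0.5 per month from 50.0.
monthly_rates = [50.0 + 0.5 * month for month in range(24)]
intercept, slope = fit_linear_trend(monthly_rates)
fitted = [intercept + slope * month for month in range(24)]

print(f"Estimated change per month: {slope:.2f}")
```

Plotting the monthly estimates against `fitted` would give a chart of the kind
described above, with the observed values scattered around the fitted line.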

The Current Mortality Sample and the MSS are very useful for monitoring overall trends 
in total mortality and for monitoring trends in relatively common causes of death that 
are increasing or decreasing over time (e.g., heart disease, homicide, lung cancer, 
HIV/AIDS). Although estimates are adjusted for under-reporting, monthly changes in
mortality for conditions for which supplemental information is often needed should be 
interpreted with caution. 

Infant mortality and other adverse reproductive outcomes. Linking information 
from death certificates for infants with information on maternal characteristics and 
other information from birth certificates is useful for assessing potentially 
preventable mortality by geographic area and within subgroups of the population. In 
England and Wales, birth and death records were linked for infants born in
1949-1950 and again for infants who died from April 1954 to March 1965 (2). All
births and deaths of infants have been linked routinely in England and Wales since
1975. In the United States, birth and death certificates have been linked for infants
born from 1983 to 1986 (91). Approximately 40,000 infants die each year in the United
States, and at least 98% of the death certificates for infants have been linked to 
birth certificates in these years. This information is also useful for health 
planning and for targeting services, since U.S. infant mortality rates vary 
considerably by geographic area and within demographic subgroups. 
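
The linkage step can be sketched as follows, under the simplifying assumption that each
infant death record already carries the number of the matching birth certificate. The
field names are hypothetical, and real linkage may instead rely on names, dates, and
other identifying items.

```python
def link_infant_deaths(births, deaths):
    """Attach birth-certificate items to each infant death record.

    Returns (linked, unlinked). A death is unlinked when no birth record
    has the same `birth_cert_no`.
    """
    births_by_number = {b["birth_cert_no"]: b for b in births}
    linked, unlinked = [], []
    for death in deaths:
        birth = births_by_number.get(death.get("birth_cert_no"))
        if birth is None:
            unlinked.append(death)
        else:
            linked.append({**birth, **death})
    return linked, unlinked


# Hypothetical records.
births = [
    {"birth_cert_no": "B001", "maternal_age": 19, "birth_weight_g": 2100},
    {"birth_cert_no": "B002", "maternal_age": 31, "birth_weight_g": 3400},
    {"birth_cert_no": "B003", "maternal_age": 24, "birth_weight_g": 1500},
]
deaths = [
    {"birth_cert_no": "B003", "age_at_death_days": 4, "cause": "769"},
    {"birth_cert_no": "B999", "age_at_death_days": 90, "cause": "798.0"},
]

linked, unlinked = link_infant_deaths(births, deaths)
percent_linked = 100 * len(linked) / len(deaths)
print(f"{percent_linked:.0f}% of infant deaths linked to a birth record")
```

Each linked record combines maternal and birth characteristics with the cause and
timing of death, which is what makes analysis by subgroup possible.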

Information on birth certificates has also been used to identify high-risk mothers who 
need supportive services for infant care. In Michigan, for instance, information on 
birth certificates is transmitted electronically from hospitals to the state health 
department (91). Key information is then sent to county health departments so that
public health nurses can be assigned to areas with the greatest need. 

Occupational mortality. William Farr was the first to evaluate systematically the 
associations between occupation and cause of death (50). The Decennial Supplement on
Occupational Mortality for England and Wales has been published approximately every 10
years since 1855 (1,2). Cause-specific rates and ratios by occupation, adjusted for
social class, are estimated using information derived from death certificates and from
the decennial census (63). Although estimates are affected by sources of error in
both data sets, occupation-specific mortality rates are useful for identifying
occupations for which more detailed studies may be warranted (92).

In the United States, usual occupation (even if retired) and industry are included on 
the standard death certificate (85). The states are not required to report this
information to NCHS, but if it is submitted, it has been included since 1985 in the
computerized final mortality files using the Standard Occupational Classification and
Standard Industrial Classification systems. In 1987, 14 states reported information on
occupation and industry to NCHS. In 1989, occupation and industry during the last
year for both mother and father were added to the standard certificate for deaths of
fetuses (77). Through the National Traumatic Occupational Fatalities (NTOF)
surveillance system, CDC's National Institute for Occupational Safety and Health
(NIOSH) obtains additional information for work-related traumatic deaths that is
included on death certificates but that is not coded and computerized routinely in all
states (93). State- and industry-specific rates are derived using estimates of the
employed population from the Bureau of Labor Statistics. Analyses from the NTOF 
suggest that traumatic occupational fatality rates decreased in the United States 
between 1980 and 1985, although, in some instances, large differences were found in 
fatality rates by gender and by state within the same industry. 

Supplemental information from other sources. Other sources of information may be 
available on the circumstances leading to death. In the United States, medical 
examiners and coroners are responsible for investigating sudden and unexpected deaths--
homicides, suicides, deaths from unintentional injuries, and unanticipated deaths
from natural causes--which account for about 20% of all deaths each year. Reports
from medical examiners and coroners include detailed information on the circumstances 
surrounding death, results of laboratory analyses for alcohol and drugs, and other 
relevant information. These reports have been used, for instance, to investigate 
deaths associated with horseback riding, drug abuse, hurricanes, earthquakes, and heat 
waves (94-98). In 1990, through the Medical Examiner/Coroner Information Sharing
Program, data from investigations of death were reported to CDC's National Center for 
Environmental Health (NCEH) in a computerized format from nine state and eight county 
medical-examiners' offices (R.G. Parrish, personal communication). 

Additional information on fatalities is often available from other sources. In the 
United States, for instance, the Fatal Accident Reporting System (FARS) from the NHTSA
has been used to investigate the association between use of child restraints and
motor-vehicle-related crashes (99) and the association between premature mortality and
alcohol-related traffic crashes (100). The relationship between homicide and the
prevalence of hand-gun ownership in the United States and Canada has been investigated
using data from uniform crime-reporting registries of all homicides and aggravated
assaults maintained by the Federal Bureau of Investigation in the United States and
the Centre for Justice Statistics in Canada (102). Other sources--such as police,
ambulance, and fire reports--may also include information that is useful for
surveillance of particular health events.



The term "sentinel surveillance" encompasses a wide range of activities focused on the 
monitoring of key health indicators in the general population or in special 
populations. Characteristics of these activities vary considerably, but, in general, 
their primary intent is to obtain timely information needed for public health or 
medical action in a relatively inexpensive manner rather than to derive precise 
estimates of prevalence or incidence in the general population. The term "sentinel" 
has been applied to key health events that may serve as an early warning or represent 
the tip of the iceberg; to clinics or other sites where health events are monitored; 
or to networks of health-care providers who agree to report information on one or more 
health events. A sentinel health event, according to Rutstein, is a "preventable 
disease, disability, or untimely death whose occurrence serves as a warning signal 
that the quality of preventative and/or therapeutic medical care may need to be 
improved" (102). Sentinel surveillance, according to Woodhall, represents "an attempt 
to find a system that would provide a measure of disease incidence in a country in the 
absence of good nation-wide institution-based surveillance without having to resort to 
large expensive surveys" (103). Sentinel surveillance systems are not limited to
developing countries. In Europe, routine morbidity surveillance is often conducted by 
networks of primary care providers who routinely report information on conditions that 
are relatively common in general practice (104,105). 

Sentinel Health Events 

Sentinel health events are monitored for many different public health programs. In 
the United States, sentinel surveillance for maternal mortality, first used in New 
York City in the 1930s, was associated with a rapid decline in mortality associated
with childbirth. For each case, medical panels reviewed pertinent records to identify 
missed opportunities that might have prevented a presumably unnecessary death. 
Similar methods have been used to monitor deaths of infants. In Massachusetts, review 
of records indicated that, in 1967-1968, about one-third of the deaths of infants
could have been prevented by medical intervention (102). Monitoring preventable
conditions can also highlight more general problems. For instance, a review of deaths 
among infants from Rh hemolytic disease, about 90% of which are considered 
preventable, indicated that mothers of many affected infants did not have medical 
insurance coverage (106). Quality of care has also been evaluated using conditions
for which death or disability could have been prevented, including evaluation of
hospital-based mortality rates after adjustment for certain patient characteristics
(107-109).

Sentinel surveillance activities have been particularly useful for identifying health 
events that may be related to occupational exposures. Lists of occupation-related 
health events have been developed, some of which (e.g., mesothelioma and angiosarcoma 
of the liver) are specifically tied to environmental or occupational exposure, and some
of which (e.g., lung cancer and bladder cancer) have other risk factors as well (102). 
Mesothelioma, for instance, is a rare form of cancer specifically associated with 
exposure to asbestos that may identify the "tip of the iceberg" of asbestos-related 
disease in an industry in which workers develop more common conditions, such as lung 
cancer and chronic obstructive pulmonary disease. 

In the United States, NIOSH has developed the Sentinel Event Notification System for 
Occupational Risks (SENSOR) program, which focuses on surveillance of specific 
occupational conditions by networks of sentinel providers (210). Target conditions
monitored by at least one of the 10 states initially included in the program include
silicosis, occupational asthma, pesticide poisoning, lead poisoning, and carpal-tunnel
syndrome. When cases identified by sentinel providers (usually physicians who
practice occupational medicine) are found to be occupation-related, intervention
activities are undertaken by state health departments in order to prevent additional 
cases. Although primarily used for case identification and follow-up, information 
derived from SENSOR projects may augment other sources of information on trends for 
occupation-related disorders. 

Health indicators that are monitored in many different countries could also be 
considered sentinel health events. Infant-mortality rates, for instance, are used in 
both developing and developed countries as an indicator of the availability and the 
quality of medical care. In Europe and the United States, additional health
indicators are monitored routinely to assess the general health of the population. In
Europe, 22 key health indicators have been monitored routinely since 1986 through
WHO's Health for All activity in order to compare progress toward reducing preventable
morbidity and mortality in participating countries (7). In the United States, 
specific goals and objectives for improving the nation's health are monitored using 
key health indicators. Goals and objectives initially developed for 1990 have been 
revised and expanded for the Year 2000 so that progress toward attainment of specific 
objectives can be monitored quantitatively (73). A total of 226 goals and objectives
for the Year 2000 have been proposed for use in monitoring health status at the
national level, and a subset of 18 indicators has been selected for monitoring by all
levels of government (112). Most of these 18 community-health-status indicators are
based on vital statistics and data from the NNDSS. 

Sentinel Sites 

Sentinel hospitals, clinics, and counties can often provide timely information on a
wide range of health conditions that is not available from other sources. Although 
information is generally not available for the entire population, sentinel systems in 
both developing and developed countries can provide sufficient information for making 
public health decisions and for detecting long-term trends. In developing countries, 
the WHO Expanded Programme on Immunization uses sentinel hospitals and clinics in 25
target cities to monitor the impact of vaccination on the incidence of neonatal 
tetanus, poliomyelitis, diphtheria, measles, pertussis, and tuberculosis (203). After 
initial contact with many hospitals and clinics, officials choose sentinel sites that 
serve populations as similar as possible to the general population. In developed 
countries, sentinel providers, hospitals, and clinics are used to monitor conditions 
for which information is not otherwise available. Sentinel primary-care providers 
report information on conditions seen in ambulatory settings, while sentinel sites-- 
such as drug, sexually transmitted disease, and maternal and child health clinics-- 
monitor conditions in subgroups that may be more vulnerable than the general
population.

Sentinel hospitals, clinics, and counties can also provide public health information 
that is not readily available from other sources. In the United States, for instance, 
viral hepatitis is a notifiable disease, but non-A non-B hepatitis (most of which is
hepatitis C) is under-reported, and not all of the detailed information on serology,
demographics, and routes of transmission needed for monitoring is routinely available. 
To obtain such information, patients with hepatitis reported to four county health 
departments are interviewed, are tested serologically at regular intervals after the 
onset of illness, and are followed prospectively to determine whether they have 
acquired hepatitis B or hepatitis C-related chronic liver disease (112,123). Taken 
together, these sentinel counties are intended to be representative of the incidence 
and epidemiologic characteristics of hepatitis B in the United States. Findings from 
these sentinel counties have highlighted the increasing importance of parenteral drug 
use in the transmission of both hepatitis B and C. 

Sentinel sites are also used in the United States for surveillance of
HIV infection (114). Since the epidemic of HIV comprises multiple sub-epidemics in
different population groups and different geographic areas, progression of the
epidemic can be monitored by directing surveillance efforts at groups who are
at increased risk of HIV infection. The use of standardized survey methods and
serologic testing procedures facilitates comparison of findings from the different 
groups. Included in the HIV family of surveys are studies of groups that receive care 
through publicly funded clinics--including those for tuberculosis, drug treatment,
sexually transmitted disease, family planning, and prenatal care. Other sentinel
groups in which HIV prevalence is monitored include hospital patients with diagnoses
that are not likely to be associated with HIV infection, women at the time of
childbirth, blood donors, military recruits, Job Corps applicants, university
students, prisoners, migrant farm workers, and homeless persons. Findings from HIV
sentinel surveillance systems have been used to monitor progression of the epidemic in 
vulnerable populations and to estimate prevalence in the community at large. 

Sentinel Providers 

Networks of sentinel general or family practitioners and other primary care providers 
are active in many European countries and in the United States, Canada, Israel, 
Australia, New Zealand, and other countries (115-117). Providers in some of these
networks conduct independent research projects, but many of them--particularly in
Europe and Australia--report surveillance data that are used by national health
agencies. Primary-care practitioners can provide timely information for surveillance
because they generally provide the first professional judgment for medical problems
that are seen in early stages. In most networks, primary-care physicians report a 
minimum amount of information, usually at weekly intervals, on a select group of 
health events that are relatively common in general practice. A wide range of health
events is reported by these networks, including the following: infectious diseases
that are and are not notifiable in that country; conditions such as dementia, gastric
ulcers, multiple sclerosis, acute pesticide poisoning, and drug abuse; and requests
for services, such as mammography, cervical smears, and testing for HIV (104).
Although most systems are based on reports by primary-care practitioners, the extent 
to which rates can be calculated that reflect morbidity in the general population is 
related in large part to the manner in which medicine is organized and practiced in 
that country. For instance, morbidity reporting by sentinel general practitioners 
would more closely approximate morbidity in the general population in countries with 
universal health-care coverage in which patients are assigned to the same provider or 
group of providers, in which specialists are seen only by referral, and in which 
sentinel providers are selected that serve populations that are demographically 
similar to the general population. None of the existing networks meet all of these 
criteria, and the most enduring networks are usually characterized by highly motivated 
volunteer providers who report information consistently over time. When the
population from which patients are drawn cannot be characterized, the number of cases
relative to the total number of patients seen or the number of reporting physicians is
usually monitored. Regardless of the strengths and limitations of each network, most
are able to provide preliminary descriptive information in a timely manner for health
events seen in ambulatory-care settings for which information is not otherwise
available.
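
The choice among these denominators can be sketched as a simple fallback rule. The
function below is an illustration only; its name, the order of preference, and the
multipliers are assumptions rather than the practice of any particular network.

```python
def sentinel_measure(cases, list_population=None, patients_seen=None, physicians=None):
    """Return (label, value) using the best denominator that is available.

    Preference order (an assumption for this sketch): the population on the
    practices' patient lists, then patients seen, then reporting physicians.
    """
    if list_population:
        return "cases per 100,000 population", cases / list_population * 100_000
    if patients_seen:
        return "cases per 1,000 patients seen", cases / patients_seen * 1_000
    if physicians:
        return "cases per reporting physician", cases / physicians
    raise ValueError("no usable denominator supplied")


# Hypothetical weekly reports from three networks.
print(sentinel_measure(30, list_population=150_000))
print(sentinel_measure(12, patients_seen=4_000))
print(sentinel_measure(12, physicians=8))
```

Only the first measure approximates a population rate; the other two are best read as
indices for following trends within the same network over time.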

A recent survey by Eurosentinel, a newly formed consortium funded by the European
Economic Community to coordinate activities of sentinel general-practitioner networks,
found that, as of March 1990, there were at least 39 active networks in Europe (104).
Among the more established networks are those in Great Britain, the Netherlands,
Belgium, and France. Ten of these participated in joint data-collection efforts,
including weekly reporting of mumps, measles, and influenza-like illness, and studies
of the use of selected laboratory tests in general practice and of requests for HIV
testing (105,118).

The oldest sentinel -provider network in Europe, the Weekly Returns Service, was 
organized by the Royal College of General Practitioners in Great Britain and has been 
in continuous operation since 1962 (104) . In 1990, 242 volunteer general 
practitioners from 66 practices in Great Britain reported weekly incidence data for 44 
conditions selected collaboratively by participating practitioners, epidemiologists, 
and health-service providers (119) . These sentinel providers report conditions for 
about 1% of the population, and rates per 100,000 population can be calculated using 
information from patient lists. Reported conditions range from those with official 
notification procedures in Great Britain (e.g., measles and whooping cough) to 
conditions (e.g., multiple sclerosis, rheumatoid arthritis, thyrotoxicosis, and 
attempted suicide) for which less information is routinely available from outpatient 
settings (104,119,120). Information from the Weekly Returns Service has been 
particularly useful for monitoring trends in influenza and related illnesses in Great 
Britain. 
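
The rate calculation mentioned above can be illustrated with a minimal sketch, assuming that the combined patient lists of the reporting practices serve as the denominator. The function name and the example figures are hypothetical and are not taken from the Weekly Returns Service.

```python
def rate_per_100k(cases: int, list_population: int) -> float:
    """Return cases per 100,000 persons on the practices' patient lists."""
    if list_population <= 0:
        raise ValueError("list_population must be positive")
    # Multiply before dividing so that simple inputs give exact results.
    return cases * 100_000 / list_population


# Hypothetical week: 57 new cases among 480,000 listed patients.
weekly_rate = rate_per_100k(57, 480_000)
print(f"{weekly_rate:.1f} per 100,000")
```

A rate of this kind is only as good as the patient lists behind it; where lists are not available, a different denominator has to be used.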

The Surrey University Morbidity Network, also covering about 1% of the population of 
Great Britain, has been operational since 1974 (104). In 1990, 42 infectious and 
non-infectious conditions were monitored by 120 practices. One of the purposes of this 
network is to examine seasonal and other environmental influences on morbidity. Data 
have been collected and transmitted electronically since 1985, and participating 
physicians receive reports regularly. 

A network of sentinel general practitioners has reported to the Netherlands Institute 
of Primary Health Care (NIVEL) since 1970 (104,121,122). The primary purpose of this 
network, which covers about 1% of the population, is to gather reliable epidemiologic 
data on health problems, as well as on actions taken by providers to address these 
problems. In 1990, 45 practices involving 63 general practitioners participated in 
the network. Information on 16 topics was reported weekly in 1988-1989, including 
requests for sterilization, referrals for speech therapy and echocardiography, and 
newly diagnosed cases of dementia. Reasonable estimates of morbidity are possible 
because access to medical specialists is available only by referral, because a 
relatively well-defined population is served by each practice, and because 
practitioners, although volunteers, are chosen so that the distribution of their 
patients is as representative of the Dutch population as possible (121). Many 
descriptive studies have been published using information provided by the Dutch 
network (121-123). 


The Belgian Sentinel Practice Network has been operated by the National Health 
Department since 1979 (124-126). Each year, about 1,500 general practitioners are 
contacted, about 10% of them usually agree to participate, and a final group is 
selected so that their patients are representative of the age and sex distribution of 
the general population. An estimated 1.3% of the population in Belgium were seen by 
sentinel practitioners (104). In 1990, measles, acute respiratory infections, new 
cases of cancer, suicide attempts, and requests for HIV tests were reported by the 
network, in addition to five officially notifiable diseases (gonorrhea, infectious 
hepatitis, meningitis, syphilis, and urethritis). Dissemination of the information is 
one of the strengths of the Belgian network. Bimonthly and annual reports are sent to 
participating practitioners, to the Ministry of Public Health, to medical and public 
health schools, to professional organizations, and to the press. 

In France, networks of sentinel primary-care providers transmit and receive 
information on selected conditions using computer terminals and modems available 
nationally at low cost (127). Interactive electronic systems are used by the national 
French Communicable Diseases Computer Network (FDCN), as well as by local and regional 
networks in the cities of Toulouse and St. Etienne, and in the regions of Aquitaine, 
France-Sud, and Lyon (104). The largest network, the FDCN, has been operated by the 
National Health Department and the National Institute of Health since 1984. In 1990, 
about 550 volunteer sentinel general practitioners, about 1% of the number throughout 
France, reported new cases of influenza, viral hepatitis, urethritis, measles, and 
mumps each week, none of which were officially notifiable (104,128). Since the 
underlying population seen by reporting physicians is not known, trends are usually 
expressed as the average number of cases per reporting physician per week. 
Information is also transmitted directly by national, hospital, and other 
laboratories; and local, regional, and national health agencies are also included in 
the network (127). Electronic mail and bulletin boards are used to disseminate 
information, and reporting physicians can contact researchers and obtain literature 
searches through the network. 
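
When the underlying population is unknown, the measure described above (average cases per reporting physician per week) can be sketched as follows. This is an illustration only; the function name, physician identifiers, and counts are hypothetical.

```python
def cases_per_physician(reports: dict[str, int]) -> float:
    """Average weekly case count per physician who reported that week.

    `reports` maps a physician identifier to the number of cases that
    physician reported; physicians who did not report are simply absent.
    """
    if not reports:
        raise ValueError("no physicians reported this week")
    return sum(reports.values()) / len(reports)


# Hypothetical week with four reporting physicians.
week_reports = {"gp_a": 3, "gp_b": 0, "gp_c": 5, "gp_d": 2}
print(cases_per_physician(week_reports))
```

Dividing by the number of physicians who actually reported, rather than by the number enrolled, keeps the measure comparable across weeks in which participation varies.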

Tracking the spread of influenza-like illness using the FDCN has been particularly 
effective. Epidemic thresholds can be calculated on the basis of data from previous 
years and the extent of regional spread can be tracked each week (128,129). Unlike 
mortality-based surveillance systems, the FDCN was able to show that the 1988-1989 
influenza epidemic occurred earlier, was of shorter duration, and affected primarily 
young age groups relative to epidemics in previous years (130). In addition to 
routine surveillance activities, the FDCN has been used to conduct surveys on 
physician attitudes regarding vaccination for measles; the use of measles, mumps, and 
rubella trivalent vaccine; HIV testing; and biologic testing for diarrhea (104). 
Surveys conducted before and after a nationwide AIDS campaign found that the number of 
tests given to women and to heterosexual men increased following the campaign, which 
emphasized risks associated with heterosexual activity (131). Studies of diarrheal 
disease have been conducted by the Aquitaine network (132). Findings from the 
Aquitaine studies, coupled with findings on measles from the FDCN, highlight that 
localized outbreaks of disease for which public health action is warranted can be 
missed by sentinel networks that typically monitor conditions in about 1% of the 
population. 
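
The text above notes that epidemic thresholds can be calculated from data for previous years but does not say how the FDCN did so. The sketch below shows one simple possibility, assumed here purely for illustration: the threshold for a given week is the mean of the values observed in the same week of earlier years plus a multiple of their standard deviation. The function names, the multiplier, and the data are hypothetical.

```python
from statistics import mean, stdev


def epidemic_threshold(prior_values: list[float], z: float = 1.96) -> float:
    """Mean of prior-year values for the same week plus z standard deviations.

    This is an assumed, simplified rule, not the method used by the FDCN.
    """
    if len(prior_values) < 2:
        raise ValueError("need at least two prior years")
    return mean(prior_values) + z * stdev(prior_values)


def exceeds_threshold(current: float, prior_values: list[float], z: float = 1.96) -> bool:
    """True when the current week's value is above the threshold."""
    return current > epidemic_threshold(prior_values, z)


# Hypothetical cases per physician for the same calendar week in five prior years.
prior = [2.0, 3.0, 2.5, 3.5, 3.0]
print(round(epidemic_threshold(prior), 2))
print(exceeds_threshold(5.0, prior))
```

A rule of this kind is sensitive to whether earlier epidemic weeks are left in the baseline; more elaborate approaches fit a seasonal curve to non-epidemic weeks only.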

In the United States, a network of 139 sentinel physicians reports cases of 
influenza-like illness each week to CDC (47,133). Nasopharyngeal specimens are sent by 70 
physicians to a central laboratory, which then reports findings to reporting 
physicians and to CDC. Physicians also report the total number of office visits per 
week so that the percentage of visits by patients with influenza-like illnesses can be 
estimated. In 1991, sentinel physicians from the Middle Atlantic and West South 
Central regions of the United States reported increased visits for influenza-like 
illness by late November, although numbers of such visits had not yet increased in 
other areas of the country. 
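
The percentage of visits attributable to influenza-like illness can be sketched as follows, assuming each physician reports a pair of counts (visits for influenza-like illness, total office visits) and that the counts are pooled across physicians. The function name and the figures are hypothetical.

```python
def percent_ili(reports: list[tuple[int, int]]) -> float:
    """Pooled percentage of office visits that were for influenza-like illness.

    Each item in `reports` is (ili_visits, total_visits) for one physician.
    """
    ili = sum(i for i, _ in reports)
    total = sum(t for _, t in reports)
    if total == 0:
        raise ValueError("no office visits reported")
    return 100 * ili / total


# Hypothetical week for three sentinel physicians.
print(percent_ili([(4, 120), (9, 150), (2, 130)]))
```

Pooling the counts weights each physician by practice volume; averaging the physician-level percentages instead would give every practice equal weight.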

Networks of family practitioners and other primary-care providers have been formed in 
the United States and Canada, primarily to conduct collaborative research projects, 
but have the potential to conduct surveillance. The descriptive and analytic studies 
performed by these networks have been very useful for identifying patterns of illness 
in outpatient settings. Unlike most networks in Europe, however, they have generally 
not had formal reporting relationships with state or local health agencies that are 
responsible for timely public health activities. The Ambulatory Sentinel Practice 
Network (ASPN), formed in 1981, includes 334 volunteer clinicians from 71 practices in 
the United States and Canada, most of whom are family practitioners and many of whom 
practice in rural areas (115,134). Many studies conducted by ASPN--including studies 
of pelvic inflammatory disease, spontaneous abortion, chest pain, carpal tunnel 
syndrome, and HIV prevalence--have increased knowledge regarding the distribution of 
conditions with public health impact among patients seen in private ambulatory-care 
settings (135-138). 

The Pediatric Research in Office Settings (PROS) network, formed in 1985 and sponsored 
by the American Academy of Pediatrics, currently includes about 740 practitioners in 
224 practices (139). The PROS network has completed a study of vision screening of 
young children and a pilot study of febrile illness among infants. Regional 
primary-care networks include the Dartmouth COOP project in northern New Hampshire and 
Vermont, the Upper Peninsula Research Network in Michigan, and the Wisconsin Research 
Network. Studies with public health impact conducted by regional networks include 
studies of cholesterol-, alcohol-, and cancer-screening activities; development of 
methods to identify functional deficits; and development of health-maintenance 
protocols for use in private practice. 

Many of the established networks of primary-care providers participate in 
international collaborative organizations, such as the International Primary Care 
Network (IPCN), the European Electronic Adverse Drug Reaction Network (EEADRN), and 
Eurosentinel (104,140). A recent IPCN study of 3,360 children from nine countries 
showed that the proportion of children with otitis treated with antibiotics varied 
widely between countries and that antibiotic treatment did not improve the rate of 
recovery (117). In association with the British pharmaceutical industry, the EEADRN 
monitors adverse drug reactions in the United Kingdom, Ireland, the Netherlands, 
Belgium, and Switzerland (204). Approximately 2,350 physicians participate in the 
network using hand-held computers to transfer information to the coordinator. 

Establishment of a computerized European sentinel-practice network is a long-term goal 
of Eurosentinel, although preliminary findings indicate that the existing networks 
are quite heterogeneous. Nonetheless, Eurosentinel can serve as a clearinghouse for a 
wide range of activities that highlight similarities and differences between 
countries — both in patterns of disease and in the practice of medicine and public 
health. Eurosentinel could also serve as a model for a broad-based international 
consortium of sentinel practice networks. 



The use of registries for surveillance and other medical or public health activities 
has increased in recent years, largely because information from other sources, 
including notifiable disease reporting mechanisms and vital statistics, is often not 
adequate for monitoring the public health impact of non-acute diseases (142). 
Registries differ from other sources of surveillance data in that information from 
multiple sources is linked for each individual over time. Information is collected 
systematically from diverse sources, including hospital-discharge abstracts, treatment 
records, pathology reports, and death certificates. Information from these sources is 
then consolidated for each individual so that each new case is identified and cases 
are not counted more than once. Case series and hospital-based registries in which the 
population at risk is not known can be useful for a variety of activities, including 
descriptive analyses and assessment of treatment effectiveness. However, population- 
based registries from which incidence rates can be calculated are generally more 
useful. Information from registries is used primarily for research purposes, but in 
many instances, registries have been useful for surveillance and related activities. 
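
The consolidation step described above, in which reports from several sources are linked so that each case is counted once, can be sketched as follows. The sketch assumes, for simplicity, that every report carries a reliable person identifier; actual registries often have to match on name, date of birth, and similar fields. All names and records are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    person_id: str   # assumed unique identifier for the individual
    source: str      # e.g., "hospital discharge", "pathology", "death certificate"
    condition: str
    date: str        # ISO format (YYYY-MM-DD), so string order is date order


def consolidate(reports: list[Report]) -> dict[tuple[str, str], dict]:
    """Link reports by (person, condition) so that each case appears once."""
    cases: dict[tuple[str, str], dict] = {}
    for r in sorted(reports, key=lambda rep: rep.date):
        key = (r.person_id, r.condition)
        # The earliest report sets the date of first notification.
        case = cases.setdefault(key, {"first_date": r.date, "sources": set()})
        case["sources"].add(r.source)
    return cases


reports = [
    Report("p1", "pathology", "colon cancer", "1990-03-02"),
    Report("p1", "hospital discharge", "colon cancer", "1990-02-20"),
    Report("p2", "death certificate", "colon cancer", "1990-05-11"),
]
cases = consolidate(reports)
print(len(cases))  # number of distinct cases, not number of reports
```

Keeping the set of sources for each case also shows which cases would have been missed had any single source been used alone.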

The most successful registries are those where purposes are explicit and realistic, 
the data collected are accurate and are limited to essential information, and the 
registry meets needs that cannot be accommodated using simpler, less expensive methods 
(142,143). Even when data collection appears to be straightforward, the time and 
resources required to develop a functional registry are often underestimated. Because 
high-quality registries are resource intensive for long periods, they are generally 
not available for all geographic areas or exposed groups. Also, the complexity of the 
data-collection process limits the extent to which data can be made available rapidly. 

Registries have been used to monitor a wide range of health events and have identified 
opportunities for public health prevention and control activities. For instance, 
analysis of data from one of the earliest registries--of blind persons in Great 
Britain--found that blindness among a substantial proportion of the elderly was due to 
treatable cataracts, a finding that had not been previously recognized (142). Other 
health events that have been monitored using registries include rheumatic fever, 
mental illness, Alzheimer's disease and dementia, renal disease, diabetes, heart 
disease, head and spinal cord injuries, child abuse, early childhood impairments, and 
occupation-related diseases such as berylliosis (16,144-149). 

Registries are also used to monitor health events in groups with increased exposure to 
hazardous agents, including radiation and hazardous chemicals found in the work place 
and the environment (150-154). Cancer, however, is by far the most common condition 
for which registry information is used for surveillance. 

Case Series and Hospital-Based Registries 

Case series and hospital-based registries have been useful for surveillance-related 
activities even though population-based rates usually cannot be estimated. Changes in 
the descriptive epidemiology of berylliosis have been monitored using a registry, for 
instance (148,155). Cases of berylliosis increased sharply in the United States in 
1939 to 1941 following an increase in the use of beryllium in large-scale manufacture 
of fluorescent lamps and in war industries. The number of cases, among both workers 
and those who lived near production facilities, declined rapidly following changes in 
the manufacturing process and adoption of an exposure standard. Case registries have 
also been used to study relatively rare conditions such as mesothelioma among those 
exposed to asbestos and adenocarcinoma of the vagina among women exposed prenatally to 
diethylstilbestrol (156). 

For most case registries, however, the primary goal is to provide information that can 
be used to improve patient care. Registers of cancer patients are maintained by many 
hospitals, and, more recently, some hospitals have established registries of persons 
who have been treated for traumatic events. In the United States, hospital-based 
cancer registries have been promoted by the American College of Surgeons since 1931 
and have been required as part of their cancer program since 1953 (156). Standardized 
software was made available to hospitals beginning in the 1980s, and development of an 
electronic data-transfer standard allowed information to be transmitted centrally from 
nearly 2,000 hospitals, beginning in 1990 (157). The newly formed National Cancer Data 
Base of the American College of Surgeons includes basic information on about 20% of 
all cases of cancer diagnosed each year in the United States. By highlighting the 
importance of histologic confirmation prior to treatment, hospital-based cancer 
registries have been particularly useful in improving the overall quality of treatment 
for cancer. 

More recently, development of regional and state systems for trauma care has prompted 
the development of hospital-based trauma registries. The first computerized trauma 
registry in the United States was developed in 1969 at Cook County Hospital in Chicago 
and was expanded to a statewide registry in 1971 that included information from 50 
hospitals designated as trauma centers in the state (141,143,157). National surveys in 
1987 identified 105 hospitals in 35 states with hospital-based trauma registries and 
10 states with central trauma registries (158). The registries differed considerably, 
however, in the criteria used for inclusion of cases, the type of data collected, 
coding conventions, and the manner in which data were used. In an effort to make 
information in hospital-based trauma registries more comparable, standardized case 
criteria and a core set of recommended data items, along with supporting computer 
software, were developed by CDC and others in 1988 (159). Although data from most 
existing trauma registries are not population-based, they have been used to support 
primary prevention activities. For instance, findings from the Virginia Statewide 
Trauma Registry and other sources were used to support legislation regulating the use 
of all-terrain vehicles (158). 

Population-Based Registries 

Population-based registries are particularly useful for surveillance because, using 
incidence rates, the occurrence of a health event can be estimated over time in 
different geographic areas and subgroups of the population. For most registries, the 
population from which cases are identified is the general population of a specified 
area. Most cancer and birth defects registries, for instance, estimate rates for the 
general population. The population from which cases are identified can also arise from 
a group defined by a specific exposure that is thought to increase the risk of 
illness. 

Descriptive analysis of incidence rates based on registry information can be used for 
health planning purposes and can suggest etiologic hypotheses that can be evaluated 
further with additional studies (50,159-162). For some conditions, comparisons between 
incidence and mortality rates can be used to estimate the effectiveness of primary 
prevention, early detection, or treatment programs. Findings from studies based on 
registry information can also encourage physicians to abandon less-than-effective 
individual therapies, thus improving the standard of medical care. 

Exposure Registries 

Examples of exposure-based registries include registries of the survivors of the 
atomic bombings of Hiroshima and Nagasaki during World War II and their offspring and 
of other groups of persons exposed to radiation (152,163-167). Because workers are 
often exposed to higher levels of physical, chemical, and biologic agents for longer 
periods than is the general public, follow-up studies of cohorts of workers have been 
used for many years to 
identify illnesses associated with these agents and to assess how these illnesses can 
be prevented. 

Registries have also been used to assess the risk of illness for general 
population groups exposed to specific agents. For instance, about 4,600 individuals 
exposed to polybrominated biphenyls through contamination of dairy cattle-food 
supplements in Michigan were followed to assess acute, subacute, and chronic 
conditions that might have been associated with this exposure (168). More recently, 
the United States Congress has mandated that the Agency for Toxic Substances and 
Disease Registry (ATSDR) address potential public health problems associated with 
environmental exposures to hazardous waste sites and chemical spills, partly through 
the creation of registries (150). ATSDR has described the rationale for a national 
exposure registry and methods to be used in its establishment and maintenance. 

Cancer Registries 

Cancer registries are used in many different countries to estimate cancer incidence 
and mortality rates over time. The Connecticut Tumor Registry, the oldest population- 
based cancer registry in the United States, has monitored cancer incidence rates for 
nearly 50 years (156). Like hospital-based registries, the Connecticut registry was 
developed initially to support the goals of service-oriented hospital-based cancer 
registries throughout the state. Through the Surveillance, Epidemiology, and End 
Results (SEER) program, the NCI has collected information from specific population- 
based cancer registries since 1973. Participant registries were selected to include a 
variety of population groups rather than a representative sample of the United States, 
although nationwide rates can be estimated using SEER data. The four major goals of 
the SEER program are: 

• to estimate cancer-related incidence and mortality in the United States; 

• to identify unusual changes in the incidence of specific types of cancer 
over time in designated areas or demographic subgroups; 

• to describe changes in the extent of disease at diagnosis and to estimate 
patient survival; and 

• to foster studies of cancer risk factors, screening, and prognostic 
factors to allow intervention. 

The SEER registry is probably the largest population-based registry in the Western 
world (156). Between 1973 and 1988, the program registered about 1.5 million incident 
cases of cancer. At present, about 10% of the United States population lives in one of 
the nine areas that include a SEER registry, and approximately 120,000 new cases of 
cancer are registered from these areas each year (169). For all types of cancer 
(except certain types of skin cancer), information on selected patient demographics is 
recorded in addition to information on primary site, morphology, confirmation of 
diagnosis, extent of disease, and first course of treatment. The registries also 
actively follow all living patients to ascertain vital status (except those with in 
situ cervical cancer). Incidence rates for cancer based on SEER registry information 
are published regularly, and descriptive analyses of cancer incidence rates by age, 
race, gender, and geographic area are routinely performed. Although not part of the 
SEER system, many states--including New York, California, and New Jersey--maintain 
active, high-quality cancer registries that are used for both public health and 
hospital-directed activities. In 1989, there were 42 cancer registries in the United 
States, including 28 state-based registries that cover part or all of a state's 
residents (170). 

In Europe, the first cancer registry was founded in Denmark in 1942, and there has 
been steady growth in the number of registries and the size of included populations 
since then (171). At present, Denmark, Belgium, England and Wales, and Scotland have 
nationwide registries, and most European countries have registries in certain regions. 
Information from cancer-incidence registries around the world is collected by the 
International Agency for Research on Cancer (IARC), which is part of WHO. As of 1989, 
IARC had identified 238 population-based registries in 53 countries that collected 
information on cancer incidence, and rates were available for selected years from 106 
of these registries (170). 

Registries provide important information for a wide range of public health activities, 
but their usefulness for identifying new hazards has, in practice, been limited. 
Initial observations by astute clinicians rather than routine analysis of surveillance 
data have led to more extensive studies to investigate associations between 
angiosarcoma and vinyl chloride, mesothelioma and asbestos, and diethylstilbestrol and 
adenocarcinoma of the vagina (171). Cancer registries were essential, however, for 
identifying cases that were evaluated in more extensive epidemiologic investigations. 
Today, cancer incidence rates from population-based registries are used extensively in 
cancer-cluster investigations to assess whether the number of observed cases differs 
substantially from an expected number derived from baseline cancer incidence rates. 
With increased emphasis on screening activities to detect asymptomatic cancer cases at 
an early, more treatable stage and on behavioral-risk-factor control and possibly 
chemo-prevention, the public health importance of high-quality, population-based 
cancer registries should increase. 
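
The comparison of observed with expected cases in a cluster investigation can be sketched as follows. The sketch assumes that baseline incidence rates per 100,000 person-years are available by age group and that case counts follow a Poisson distribution; the source does not prescribe a particular test. All names, rates, and counts are hypothetical.

```python
import math


def expected_cases(rates_per_100k: dict[str, float], person_years: dict[str, float]) -> float:
    """Expected cases: baseline rate times person-years, summed over age groups."""
    return sum(rates_per_100k[g] * person_years[g] / 100_000 for g in person_years)


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for a Poisson variable with mean `expected`."""
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


# Hypothetical community: registry baseline rates and local person-years at risk.
rates = {"0-49": 50.0, "50+": 400.0}
years = {"0-49": 8000.0, "50+": 2000.0}
observed = 20

expected = expected_cases(rates, years)
print(f"expected={expected:.1f}, ratio={observed / expected:.2f}, "
      f"p={poisson_upper_tail(observed, expected):.3f}")
```

Stratifying the expected count by age matters because a community with an older population would be expected to have more cases even in the absence of any unusual exposure.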

Birth-Defects Registries 

Recognition of an epidemic of limb reduction defects among children exposed prenatally 
to thalidomide stimulated interest in developing population-based birth-defects 
registries in many countries. Some birth-defects surveillance systems (e.g., the 
Birth Defects Monitoring Program (BDMP) in the United States) use available sources 
of information, including vital statistics and hospital-discharge data, to monitor 
trends in the birth prevalence of various birth defects (172). This type of passive 
monitoring system is discussed further in the section on administrative data in this 
chapter. 

Like most cancer-incidence registries, however, birth-defects registries characterized 
by active case finding obtain information on individual cases from multiple sources. 
In the United States, the Metropolitan Atlanta Congenital Defects Program (MACDP) has 
been operated by CDC's National Center for Environmental Health (NCEH) (172-174). 
All births are monitored in the five-county metropolitan Atlanta area--about 35,000 
births per year. Included in the MACDP are all live-born and stillborn infants 
diagnosed as having at least one major birth defect within their first year of life, 
with diagnoses ascertained within their first 5 years of life. Birth-defect rates and 
trends are monitored by quarterly reviews and analysis of data and are published 
regularly by CDC. Numerous investigations have been performed using MACDP data, 
including studies of Vietnam veterans' risk for fathering children with birth defects, 
the risk of bearing children with specific birth defects for women with 
insulin-dependent diabetes, and an apparent protective effect of peri-conceptual 
vitamin use on the risk of neural tube defects (175-177). In addition, the MACDP has 
served as a prototype for other birth-defects registries characterized by active 
case-finding (172). 

Use of equivalent case definitions, more specific coding schemes, and a uniform set of 
variables has facilitated collaborative efforts between the eight birth-defects 
registries in the United States characterized by active case-finding (172). For 
instance, surveillance for specific birth defects associated with first-trimester 
exposure to isotretinoin relies on collaborative efforts by CDC and state 
birth-defects registries. 

In Europe, population-based birth-defects registries are coordinated through EUROCAT, 
which is funded through the Economic Community (178). In 1983, birth-defects among 
250,000 births were monitored by 17 birth-defects registries in 10 countries. Both 
active and passive birth-defects registries participate in the International 
Clearinghouse for Birth Defects Monitoring Systems (ICBDMS), founded in 1974 by WHO as 
a means of disseminating birth-defects data from surveillance systems around the 
world. Information is available each year on birth defects among more than 4.5 million 
births in 30 countries. Although methods used by various registries differ 
considerably, the ICBDMS provides a forum for rapid dissemination of information on 
teratogens. Reports from France linking valproic acid, an anti-epileptic drug, with an 
increase in spina bifida were disseminated rapidly through this international network 
(179,180). 

More recently, registries are being developed in some local communities to 
monitor preschool children for whom early intervention programs are needed. These 
programs can identify children with conditions such as fetal alcohol syndrome, 
cerebral palsy, mental retardation, and behavioral or learning disabilities that are 
often detected shortly after birth. These registries will be useful for estimating the 
prevalence of these conditions, as well as for monitoring the effectiveness of 
services provided to children with special needs. 


Health surveys, particularly those that are conducted on a continual or a periodic 
basis, can provide useful information for assessing the prevalence of health 
conditions and potential risk factors and for monitoring changes in prevalence over 
time. More recently, health surveys have also been used to assess knowledge, 
attitudes, and health practices in relation to certain conditions such as HIV/AIDS. A 
survey differs from a registry in that persons surveyed are usually only queried once 
and are not monitored individually after that one contact. Information on respondents 
can be obtained through questionnaires, in-person or telephone interviews, or through 
record reviews. Attempts are made to assure that the survey sample is as 
representative of the source population as possible in order to increase the validity 
and reliability of estimates extrapolated to that population. Surveys can be 
valuable for public health surveillance if similar information is collected over time 
and if findings are applied to public health activities. 

In the United States, surveys such as NCHS's National Health Interview Survey (NHIS) 
are important sources of information for monitoring nationwide trends in the 
prevalence of target conditions and risk factors for which national health objectives 
for the year 2000 have been established (73,181). Nationwide surveys are costly, 
however, and due to their complex sample designs, specialized statistical techniques 
are often needed for analysis. Since information is usually not available at a local 
level, the usefulness of national surveys for local surveillance activities is 
limited. 

Health Interview Surveys 

In the United States, the NHIS, conducted annually since 1957, provides information on 
self-reported illnesses, chronic conditions, injuries, impairments, the use of health 
services, and other health-related topics for the civilian, non-institutionalized 
population (182,183). Households are identified through a complex sample design 
involving both clustering and stratification. Households selected for interview each 
week are a probability sample from a primary sampling unit such as a county or 
metropolitan area. Respondents are interviewed in their homes with an adult family 
member providing information for other members of the household. Each year, 
information is collected on about 122,000 people from about 48,500 households (2). The 
interviews, which average about 80 minutes, include a core set of health and 
socio-demographic questions that are repeated each year and a supplemental section in 
which detailed information is collected on specific health topics. In 1987, for instance, 
supplemental information was collected on risk factors for cancer and on knowledge and 
attitudes regarding AIDS. NHIS questions will be modified in the future so that 
progress toward meeting the year 2000 health objectives for the nation can be 
monitored closely. 
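
Because respondents in a stratified, clustered sample do not all represent the same number of persons, prevalence is estimated with sampling weights rather than by a simple proportion. The sketch below assumes that each respondent record already carries such a weight; it is not the NHIS estimation procedure, and the function name and figures are hypothetical.

```python
def weighted_prevalence(responses: list[tuple[bool, float]]) -> float:
    """Weighted proportion of respondents reporting the condition.

    Each item is (has_condition, weight), where `weight` is the assumed
    number of persons in the population that the respondent represents.
    """
    total_weight = sum(w for _, w in responses)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(w for has_condition, w in responses if has_condition) / total_weight


# Hypothetical respondents from strata sampled at different rates.
sample = [(True, 1000.0), (False, 3000.0), (True, 500.0), (False, 500.0)]
print(weighted_prevalence(sample))                    # weighted estimate
print(sum(1 for has, _ in sample if has) / len(sample))  # unweighted, for contrast
```

The point estimate is the simple part; standard errors for clustered designs require methods that account for the design, which is why specialized statistical techniques are needed for analysis.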

In England, Scotland, and Wales, the General Household Survey (GHS), in which 
information on housing, employment, education, health, and use of social services is 
obtained using structured personal interviews, has been in operation since 1971 (2). An 
analogous Continuous Household Survey is conducted in Northern Ireland. Electoral 
wards form the primary sampling units, and about 85% of households--a total of about 
12,000 per year--agree to participate in the GHS. Over time, the health section of the 
survey has included questions on limitations in activities because of acute or chronic 
illnesses, smoking and drinking patterns, and contacts with health-care providers and 
other health-related topics. The ability to compare health-related information with 
extensive socio-demographic information is one of the major strengths of these 
surveys. 

In the United States, CDC's National Center for Chronic Disease Prevention and Health 
Promotion (NCCDPHP) has worked with state health departments since 1981 to conduct 
telephone surveys about adult health behavior and use of prevention services. The 
primary purpose of these surveys is to support state prevention initiatives. 
Questionnaires used by the Behavioral Risk Factor Surveillance System (BRFSS) include 
a core set of questions, and, depending on a state's interest, supplemental questions 
developed by CDC and questions that meet state-specific needs (184). The 1988 BRFSS 
included questions on height, weight, physical activity, smoking, alcohol use, 
seat-belt use, and use of prevention services, such as cholesterol screening and 
mammography. By 1990, 45 states and the District of Columbia were conducting these 
surveys. Some states have used BRFSS procedures to conduct more detailed studies. In 
Missouri, for instance, cholesterol awareness was compared in urban and rural areas, 
and in California, cigarette smoking was compared among Chinese, 
Vietnamese, and Hispanics in three communities (185,186). Information from the BRFSS 
is timely and can reflect the particular interests of a state or local community. Use 
of telephones for interviewing is economical, although persons without telephones, 
who are not included in these surveys, are generally more likely to be in need of 
public health services than many of the respondents. 

Since 1988, NCCDPHP has developed and implemented a Youth Risk Behavior Survey (YRBS) 
to focus the efforts of local, state, and federal agencies that monitor the behavior 
of young people (187). In 1990, the national survey used a three-stage sample design 
to obtain a probability sample of 11,631 students in grades 9 through 12 in 50 states, 
the District of Columbia, Puerto Rico, and the Virgin Islands. From the 1990 survey, 
estimates are available for the prevalence of tobacco use, alcohol and drug use, 
exercise, diet, types of behavior that affect the risk of intentional and 
unintentional injuries, and sexual activity (188-194). The YRBS was designed to 
monitor changes in these types of behaviors biennially so that progress toward meeting 
year 2000 objectives can be tracked. 

Provider-Based Surveys 

In the United States, information on the use of health-care services is not available 
routinely. In order to estimate the use of these services nationally, NCHS has 
developed two complementary surveys, the National Hospital Discharge Survey (NHDS) and 
the National Ambulatory Medical Care Survey (NAMCS), in which characteristics of 
health encounters are monitored (181,195,196). Through the NHDS, information has been 
collected since 1965 on discharges from non-federal, short-stay hospitals, including 
characteristics of patients, length of stay, diagnoses, surgical procedures, and 
hospital size and type of ownership. Beginning in 1987, computerized information for 
some discharges was purchased from commercial abstracting services, but, otherwise, 
discharges are sampled randomly from hospitals included in the survey. In 1987, 
information was collected on about 181,000 discharges from about 400 hospitals--about 
81% of the hospitals that were asked to participate. Although hospital-discharge 
information is available in many states, it is not available nationally, so that state 
estimates are often derived by extrapolation from the NHDS. Data from the NHDS as well 
as other sources have been used, for instance, to assess the public health burden of 
nine major chronic diseases (197). 

The NAMCS was conducted annually from 1973 to 1981 and in 1985, and it has been 
conducted annually since 1989. The target population for the NAMCS is office visits 
within the continental United States to non-federal physicians who are in office-based 
practice and engaged in direct patient care (9,181,196). About 70% of all ambulatory visits occur in 
physicians' offices, and about 70% of selected physicians agreed to participate in the 
survey in 1990. Beginning in 1989, about 2,500 physicians were included in the sample, 
with each physician completing a short form for about 30 office visits. Information on 
visits to hospital out-patient departments and emergency rooms may be added to the 
NAMCS in the future. In addition to information on diagnoses, medications, and reason 
for visit, the 1990 NAMCS included information on diagnostic and screening services; 
counseling for drug, alcohol, and smoking cessation; and other counseling services 
(198). Estimates are published at the national level, and for some events, at the 
regional level. Unlike hospital-discharge data, ambulatory-care data are rarely 
available for routine use at the state or local level in the United States. To obtain 
information that could be used in its programs, however, Wisconsin conducted an 
ambulatory medical care survey in 1986-1987 based on the NAMCS questionnaire and study 
design (199). Proprietary data bases, such as the National Disease and Therapeutic 
Index (NDTI), provide ongoing data on conditions seen in ambulatory care settings. 
Although used primarily by the pharmaceutical industry, the NDTI has been used to monitor 
the public health impact of recommendations to limit the use of aspirin in children 
with fevers (200). 

Other Surveys 

Other NCHS surveys, including the National Survey of Family Growth (NSFG) and the 
National Health and Nutrition Examination Survey (NHANES), also contain information 
that is useful for public health activities. The NSFG has provided national data on 
demographic and social factors associated with childbearing, adoption, and maternal 
and child health based on household interviews of women of childbearing age. The 
survey has been conducted four times--in 1973, 1976, 1982, and 1988 (201-203). 


The NHANES has provided extensive information on the prevalence of chronic conditions, 
distribution of physiologic and anthropometric measures, and nutritional status for 
representative samples of the U.S. population (204,205). The first two NHANES cycles 
were conducted in 1971 through 1974 and 1976 through 1980, and data collection is 
currently under way for the third cycle. A Hispanic Health and Nutrition Examination 
Survey was conducted in 1982 through 1984 in order to compare health and nutritional 
measures among U.S. residents of Mexican, Puerto Rican, and Cuban origin (206). Also, 
almost 4,000 persons aged 55 to 74 years who had been interviewed in NHANES I 
and were living in 1984 were enrolled in the NHANES I Follow-up Study to assess 
whether their characteristics in the 1970s predicted subsequent health outcomes (207). 
The NHANES studies are rich sources of information that are used primarily for 
epidemiologic and related analyses. They have been used, however, to provide point 
estimates to monitor changes over time in health outcomes, such as changes in blood- 
lead levels (208). In general, sources of information that are available for more of 
the population over longer periods are more useful for routine 
surveillance activities. 


Through the use of standard procedures and classification schemes, vital statistics 
are derived from birth and death certificates, completed primarily for legal reasons. 
Likewise, information on conditions not evident at the time of birth or death can be 
derived from administrative information routinely available on episodes of care 
(including hospitalizations, visits to emergency rooms, and visits to health-care 
providers in the community). In most instances, routinely collected administrative 
data have been computerized for billing purposes, but since diagnoses are often 
included, these data sets can provide useful information for public health 
surveillance. As computerized administrative data become increasingly available, 
their importance for monitoring a wide range of health outcomes is increasing. 

Availability and usefulness of administrative data for surveillance depend on a number 
of factors including: 

• the type of information that is computerized; 

• the extent to which uniform classification schemes are used to categorize 
diagnoses, signs, symptoms, procedures, and reasons for seeking health 
care; 

• the availability of sufficient computer capacity and user-friendly 
software programs to process large amounts of data; 

• the extent to which supplementary information can be obtained; and 

• the extent to which information for individuals from different 
administrative sources or time periods can be linked using a unique 
personal identifier. 

Data that include personal identifiers are particularly useful both because statistics 
can be calculated on the basis of persons rather than on episodes of care and because 
additional information can often be obtained through linkage with other data sets. 
Special precautions are needed, however, to protect the confidentiality of individuals 
when personal identifiers are included in computerized administrative data bases. 
Even when personal identifiers are not included, however, administrative data can be 
very useful for assessing the public health burden of various conditions based on 
the number of health-care visits and their costs. 

Integrated health-information systems based on administrative data are available in a 
few countries, but in most, information may be available only for certain types of 
health care (e.g., hospitalizations) or for certain segments of the population (e.g., 
those who receive care through the public sector). Although administrative data are 
usually incomplete, their analysis has proved useful for public health surveillance and 
program planning. 

Integrated Health Information Systems 

Integrated health-information systems, in which data on individuals are consolidated 
from a variety of sources, are available in Sweden and Canada and, for limited groups, in 
the United States. In Sweden, for instance, use of a unique personal identifier 
assigned at birth allows the linkage of computerized information on individuals from a 
variety of sources, including birth and death certificates, the cancer registry, 
hospital discharge summaries, and prescription records (209). In addition to 
etiologic studies, linked Swedish data bases have been used for a variety of 
surveillance-related analyses. Examples include estimating the incidence of acute 
myocardial infarction; comparing methods of ascertaining myocardial infarction using 
community registers, hospital discharge data, and mortality data; and assessing 
temporal trends in the incidence of hip fracture (144,146,210). 

In Canada, the Saskatchewan Health Plan maintains population-based billing information 
including diagnoses from inpatient, outpatient, and prescription records for 
approximately 1 million residents beginning in 1979 (211,212). This information, 
which has been used in studies of associations between nonsteroidal anti-inflammatory 
drugs and fatal gastrointestinal bleeding and of associations between valproic acid 
use and congenital malformations, could also be used for ongoing surveillance 
activities (213,214). 

In the United States, integrated health-information systems have been developed for 
some health-maintenance organizations such as the Kaiser Permanente system or for 
geographic areas served by one major health care provider--such as Rochester, 
Minnesota. Although used frequently for research, the few integrated 
health-information systems in the United States are of limited use for general public health 
surveillance because the populations included in them are relatively small and not 
representative of the U.S. population. These systems are useful, however, for 
providing information on incidence and prevalence for conditions difficult to monitor 
nationally--such as the trends in incidence for specific types of primary intracranial 
neoplasms (215) and the prevalence of osteoarthritis of the knee with and without 
corroborative radiographic findings (216). 

Hospital-Discharge Data Systems 

The importance of collecting information on morbidity from hospital records was noted 
by Florence Nightingale among others, although attempts to collect and analyze this 
information systematically were not initiated until the 1940s in Scotland (2,217). 
Today, information from hospital discharge summaries--including 
demographic information and discharge diagnoses--is routinely collected and 
computerized using standard data-set formats such as the 1981 Recommended Minimum 
Basic Data Set (RMBDS) for the European Community and the Uniform Hospital Discharge 
Data Set (UHDDS) or the Medicare Uniform Bill-82 (UB-82) formats in the United States 
(218,219). Both the UHDDS and the UB-82 formats are currently being revised in 

In Scotland, for example, a standard morbidity record form is completed for each 
admission to a general, psychiatric, or maternity hospital and is sent to a central 
agency for processing and statistical analysis (217). Initiated in parts of Scotland 
in 1951, the system covered the entire country by 1961. Although records 
include a unique personal identifier, they are not linked routinely except in one area 
of the country. With the advent of the National Health Service in 1948, a similar 
system based on 10% of hospital admissions was initiated in England and Wales; it 
covered all areas by 1958. 

To monitor the quality of care provided in U.S. hospitals, each acute-care hospital is 
required by the Joint Commission on Accreditation of Healthcare Organizations to 
report information on diagnoses, length of stay, and inpatient services. Hospitals 
often contract with private companies to abstract and computerize pertinent data from 
medical records, but in recent years, many hospitals have begun computerizing this 
information themselves or abstracting it from computerized treatment records. 
In the early 1980s, individual states began to require submission of 
hospital-discharge data for utilization, financial, and other health-planning studies 
(219). Thus, hospital discharge summary data are computerized for most discharges 
from acute-care hospitals in the United States, but data are not available nationally 
for all segments of the population from any one source. 

Private-sector systems 

In the private sector, the Commission on Professional and Hospital Activities (CPHA) 
has abstracted information from medical records of U.S. hospitals for over 30 years 
(219,220). Today, CPHA's Professional Activities Study (PAS) data base includes over 
200 million records with diagnoses coded according to the clinical modification of the 
ICD-9 (ICD-9-CM) in the UHDDS format; 6 million more records are being added each year 
(219,221). The PAS includes information from clinical rather than billing records, 
since staff from cooperating hospitals review medical charts, prepare case abstracts, 
and send information to CPHA. Hospital-discharge data from CPHA and more recently 
from the McDonnell Douglas Hospital Information System (MDHIS) have been used for the 
surveillance of birth defects and related conditions (222). Today, the Birth Defects 
Monitoring Program (BDMP), initiated in 1974, includes information from newborn 
discharge summaries for about 1 million newborns per year--about 25% of the births in 
the United States. Prevalence rates are calculated using the number of live births as 
the denominator, and trends in rates for targeted conditions are published routinely 
(223). Information for the BDMP is abstracted from hospital discharge summaries and 
is not routinely verified. Although personal identifiers are not included in BDMP 
data sets, participating hospitals have agreed to provide hospital records for special 
studies using their own patient numbers to identify records (224,225). More recently, 
additional information on possible maternal exposures (e.g., infections, use of 
prescription or illicit drugs, or the use of alcohol) linked to birth defects or other 
adverse outcomes noted at birth is available for a subset of infants in the BDMP. 
Probabilistic matching procedures are used to link summary data without personal 
identifiers from newborn and maternal hospital discharge records (222). Validation 
studies indicate that about 95% of the records linked using the matching algorithm are 
true matches. Linked maternal and infant hospital-discharge records are particularly 
useful for investigating problems associated with maternal exposures. Information on 
birth defects surveillance systems characterized by active case-finding and 
integration of information from multiple sources appears in the registry section of 
this chapter. 
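
The matching procedure used for the BDMP is not described in this chapter. Purely as an illustration of how probabilistic matching without personal identifiers can work, the sketch below scores candidate newborn-maternal record pairs with summed agreement weights, in the general style of Fellegi-Sunter record linkage. The field names, the m- and u-probabilities, and the threshold are all hypothetical and are not taken from the BDMP.

```python
import math

# Hypothetical comparison fields present on both newborn and maternal
# discharge summaries. For each field:
#   m = P(field agrees | the two records truly belong together)
#   u = P(field agrees | the two records do not belong together)
# These values are illustrative assumptions, not BDMP estimates.
FIELD_PROBS = {
    "hospital_id": (0.99, 0.05),
    "delivery_date": (0.95, 0.01),
    "zip_code": (0.90, 0.02),
    "payer": (0.85, 0.30),
}


def match_weight(newborn, mother):
    """Sum log2 agreement or disagreement weights over all fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if newborn.get(field) == mother.get(field):
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


def link(newborns, mothers, threshold=8.0):
    """Pair each newborn with its best-scoring maternal record,
    keeping the pair only if the score reaches the threshold."""
    links = []
    for nb in newborns:
        best = max(mothers, key=lambda mo: match_weight(nb, mo), default=None)
        if best is not None and match_weight(nb, best) >= threshold:
            links.append((nb["record"], best["record"]))
    return links
```

In practice the threshold would be tuned against a validation sample, which is how a figure such as the proportion of linked records that are true matches would be obtained.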

In the United States, use of hospital-discharge data from CPHA, MDHIS, or other 
private-sector sources is more limited for surveillance of conditions other than those 
identified at birth. For conditions identified at birth, birth-prevalence rates can be 
calculated using the number of live births in the reporting hospital as the population 
at risk, even if the geographic areas to which these rates apply are not known. 
Calculation of incidence or prevalence rates for other conditions is limited by two 
factors: first, the lack of complete coverage for a geographic area limits the use of 
census data to estimate the population at risk; and, second, initial hospitalizations for 
conditions cannot usually be distinguished from subsequent hospitalizations. 
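
As a small worked example of the birth-prevalence calculation described above, the sketch below pools case counts and live births across reporting hospitals. The function name, the record layout, and the choice of a rate per 10,000 live births are assumptions made for illustration.

```python
def birth_prevalence_per_10000(hospital_reports):
    """Pooled birth prevalence per 10,000 live births.

    hospital_reports is a list of (cases, live_births) pairs, one pair
    per reporting hospital (a hypothetical layout). Live births in the
    reporting hospitals serve as the population at risk.
    """
    cases = sum(c for c, _ in hospital_reports)
    births = sum(b for _, b in hospital_reports)
    if births <= 0:
        raise ValueError("at least one live birth is required")
    return 10000 * cases / births


# Example: 12 cases among 40,000 live births in three hospitals.
rate = birth_prevalence_per_10000([(5, 15000), (4, 15000), (3, 10000)])
print(rate)
```

The denominator is defined by the hospitals themselves, not by a census area, which is why the rate can be computed even when the catchment population is unknown.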

In 1988, 29 states maintained hospital-discharge-data systems for acute-care 
hospitals: 17 in the UB-82 format, eight in the UHDDS format, and four in unique data 
formats (219). Although not currently required on the UHDDS or the UB-82, external 
cause-of-injury codes ("E codes") are required in eight states (226). In most states, 
unique personal identifiers are not computerized, and the extent to which these data 
can be accessed and used for surveillance varies from state to state. When hospital 
discharge information is available, however, estimates of the public health burden of 
inpatient care--based on the number, the duration, and the cost of hospitalizations-- 
have been useful for setting priorities for prevention or treatment efforts or for 
targeting interventions to specific subgroups in the community. 

In California, for instance, hospital-discharge data coupled with estimates of the 
proportion of specific diseases attributable to smoking were used to estimate the cost 
of treating smoking-related diseases paid with public funds. To recoup some of these 
costs, California instituted a 25-cent sales tax on tobacco products in 1989 (227). 
State-based hospital discharge data systems have also been used effectively to assess 
the public health impact of injuries in states that require "E codes" (226). For 
instance, the effect of mandatory seat-belt laws and more stringent drunk-driving laws 
on motor-vehicle-related injuries has been demonstrated using hospital-discharge data 
that include "E codes." 
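
The cost estimate described for California combines two ingredients: discharge records with charges and payer, and a disease-specific fraction attributable to smoking. A minimal sketch of that arithmetic follows; the record fields, the diagnosis labels, and the fractions are hypothetical and are not the values used in the California analysis.

```python
def attributable_public_cost(discharges, attributable_fraction):
    """Sum charges paid with public funds, scaled by the fraction of
    each diagnosis assumed attributable to smoking.

    discharges: list of dicts with "diagnosis", "payer", and "charges"
    (a hypothetical layout). Diagnoses with no listed fraction
    contribute nothing.
    """
    total = 0.0
    for d in discharges:
        if d["payer"] != "public":
            continue  # only publicly funded care is counted
        fraction = attributable_fraction.get(d["diagnosis"], 0.0)
        total += d["charges"] * fraction
    return total


# Illustrative fractions only.
fractions = {"lung_cancer": 0.5, "coronary_heart_disease": 0.25}
records = [
    {"diagnosis": "lung_cancer", "payer": "public", "charges": 10000},
    {"diagnosis": "coronary_heart_disease", "payer": "public", "charges": 4000},
    {"diagnosis": "lung_cancer", "payer": "private", "charges": 9000},
    {"diagnosis": "appendicitis", "payer": "public", "charges": 3000},
]
print(attributable_public_cost(records, fractions))
```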

Federal data-collection systems 

In the United States, health care is provided using public funds for about one-quarter 
of the non-institutionalized population--including the elderly (13%), the poor (9%), 
and the military and their dependents (4%) (228). In 1965, two federal 
health-insurance programs--a hospital insurance plan and a supplementary insurance plan--were 
established for persons ≥ 65 years of age. Both of these Medicare health-insurance programs 
are administered by HCFA. All eligible recipients are enrolled in the first plan 
(Part A), which provides coverage for inpatient hospitalizations, stays in skilled 
nursing facilities, and home health services. The second plan (Part B), for which 
beneficiaries pay a small premium, covers physician services, outpatient hospital 
services, and other medical services. About 96% of the population ≥ 65 years is 
enrolled in at least the Part A program (229). Medicare programs were extended in 
1972 to cover persons with end-stage renal disease that required dialysis or 
transplantation and persons < 65 years with disabilities (230). In Fiscal Year 
1988, Medicare program payments for 31 million beneficiaries ≥ 65 years and an 
additional 3 million persons with disabilities accounted for about 18% of all personal 
health-care spending in the United States. 


For Part A claims, computerized bills in the UB-82 format are submitted to fiscal 
intermediaries and then are consolidated nationally. Diagnoses included on each bill 
affect payment to hospitals because, since 1983, most short-stay hospitals have been 
paid for each case on the basis of prospectively established rates for some 475 
diagnosis-related groups (DRGs) (228). To monitor the quality of care provided 
through Medicare programs, HCFA created the Medicare Provider Analysis and Review 
(MEDPAR) file by linking information on individuals such as age, gender, race, and 
residence from the eligibility files; information on diagnoses and treatment from Part 
A and Part B claims files; and information on health-care providers from a facilities 
file. A unique health-insurance number--usually the social security number--is used 
to link information on individuals. HCFA has created a public-use file for Part A 
data from the MEDPAR file and plans to add Part B files, which will include 
diagnostic data, in 1992 (231). 

Although most studies using MEDPAR files have focused on quality of care and medical 
effectiveness, these files have also been used to assess the public health impact of 
various conditions such as end-stage renal disease and hip fracture among the elderly 
(107,230-234). Point prevalence can be estimated because nearly all members of the 
general population ≥ 65 years are enrolled in Medicare. Incidence can also be 
estimated for some conditions because the first hospitalization can be identified in 
records for an individual linked by using the unique personal identifier. These 
estimated incidence rates would approximate true incidence rates more closely, 
however, for acute events such as hip fracture than for long-standing conditions such 
as Type II diabetes. Since many conditions are common among the elderly, rates can 
often be estimated for small geographic areas such as cities or counties (235). 
Recent studies indicate, for instance, that hip fracture is more common in southern 
states, even though weather conditions are more adverse in the north (236,237). 
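
The incidence estimate described above rests on finding each person's first hospitalization for a condition in records linked by a unique identifier. The following is a minimal sketch of that logic; the claim layout, function names, and rate per 1,000 enrollees are assumptions for illustration and do not reflect the actual MEDPAR file structure.

```python
from datetime import date


def first_admissions(claims, condition):
    """Map each person identifier to the earliest admission date
    recorded for the given condition.

    claims: iterable of (person_id, diagnosis, admit_date) tuples,
    a hypothetical layout.
    """
    first = {}
    for person_id, diagnosis, admit_date in claims:
        if diagnosis != condition:
            continue
        if person_id not in first or admit_date < first[person_id]:
            first[person_id] = admit_date
    return first


def incidence_per_1000(claims, condition, year, enrolled):
    """Persons whose first recorded admission falls in `year`,
    per 1,000 enrollees."""
    first = first_admissions(claims, condition)
    new_cases = sum(1 for d in first.values() if d.year == year)
    return 1000 * new_cases / enrolled


claims = [
    ("A", "hip_fracture", date(1987, 3, 1)),
    ("A", "hip_fracture", date(1988, 1, 5)),   # readmission, not a new case
    ("B", "hip_fracture", date(1988, 6, 1)),
    ("C", "diabetes", date(1988, 2, 2)),
]
print(incidence_per_1000(claims, "hip_fracture", 1988, enrolled=2000))
```

A hospitalization that occurred before the start of the file cannot be seen, so "first" here means first within the available records. That limitation matters more for long-standing conditions than for acute events.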

Even more useful public health surveillance information about Medicare recipients 
should become available in the near future. A National Claims History File is being 
created for elderly Medicare recipients with information from all claims linked for 
individuals (219). To obtain additional clinical information, medical records for a 
random sample of beneficiaries will be abstracted using standard procedures to create 
a Uniform Clinical Data Set. Self-administered questionnaires will be sent to a 
sample of the elderly at regular intervals to obtain additional information on health 
status prior to entering the Medicare program, on health-related behaviors, and on 
functional status. Information from all these sources will be linked in the Medicare 
Beneficiary Health Status Registry. Information from other sources, such as the SEER 
registry and other cancer-incidence registries, will be linked with Medicare files when 
possible (238). An end-stage renal disease registry has been developed by linking 
health-claims information (239). As they become available, these enhanced data sets 
should prove useful for monitoring trends, for public health planning, and for 
evaluating the effectiveness of medical and preventive health services such as 
mammography and vaccination. 

Medicaid, HCFA's other major public health-care program, provides health-care funds 
for the poor and medically needy through a federal-state cost-sharing program. 
Medicaid data have been used for surveillance and program planning at state and 
local levels, particularly in the maternal and child health area. Further information 
on uses of Medicaid claims data for surveillance is provided in the ambulatory care 
and related data section of this chapter. 

Hospital-discharge records from IHS hospitals have been particularly useful for 
developing community-specific injury profiles and targeting local public health 
interventions (226). "E codes" have been included in discharge summaries from IHS 
hospitals for over 20 years, and regional injury prevention coordinators are notified 
electronically of injury-related hospitalizations. Identification of hazardous areas 
through analysis of local data has led to brighter and more effective 
lighting and to installation of pedestrian walkways along hazardous stretches of road. 

Data-Collection Systems in Emergency Rooms and Other Units 

Administrative data from hospital emergency rooms have been used for surveillance of a 
variety of acute health events including non-fatal injuries, illicit drug use, 
poisonings, and adverse reactions to prescription drugs. Unlike inpatient 
hospital-discharge data, however, emergency-room data are not routinely computerized and 
reported from all hospitals in a standard format. Because the type of information 
recorded and the filing systems used to retrieve health information differ, special 
surveillance systems focused on specific outcomes such as injuries or illicit drug use 
have been developed using information obtained from cooperating hospitals. 


Information from these special surveillance systems is usually not linked with other 
data sources. Although the scope of these systems is limited, they have provided 
useful information for the surveillance of acute, non-fatal health events for which 
admission to a hospital is not warranted. 

In England and Wales, information has been provided by the Home Accident Surveillance 
System (HASS) since 1976 (240). Information is collected by trained clerks from 20 
randomly sampled major emergency departments. Each hospital remains in the system for 
4 years, and five hospitals are replaced each year from the pool of 270 hospitals with 
large emergency departments. A similar system, the European Home and Leisure 
Accident Surveillance System (EHLASS), is being implemented in all European Economic 
Community countries. 

In the United States, information on injuries associated with the use of consumer 
products (other than automobiles) is available through CPSC's National Electronic 
Injury Surveillance System (NEISS). Since 1972, information on consumer-product-related 
injuries, poisonings, and burns has been abstracted from emergency-room 
records of a representative sample of hospitals (9). Information is sent 
electronically each day to CPSC, and more in-depth information can be obtained on 
conditions of special interest. Information on occupation-related injuries has been 
collected since 1982, although the number of hospitals included in NEISS was reduced 
from the original 73 to 62 in 1987 (241,242). 

National estimates for a variety of conditions are derived by weighting data from 
reporting hospitals. NEISS has provided estimates of various consumer-product- and 
occupation-related injuries, including estimates of the number of work-related 
injuries in the United States, bicycle-related injuries, and poisonings among children 
(241-243). NEISS provides the only national estimates of injuries seen in emergency 
rooms, although the number of hospital emergency rooms on which this information is 
based is relatively small. NEISS data have also been used to assess the public health 
impact of injuries at the local level. From NEISS data from one hospital, a cluster 
of injuries that occurred among young girls and were related to playground 
merry-go-rounds was identified (244). Pediatric injury surveillance systems using emergency 
room and hospital discharge data have also been established in other areas (245,246). 
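
To illustrate how counts from a sample of hospitals can be weighted up to a national estimate, the sketch below assigns each sampled hospital a weight equal to the number of hospitals in its stratum divided by the number sampled from that stratum. This is a generic stratified-sample estimator; the strata, counts, and function names are hypothetical and do not describe the actual NEISS sample design.

```python
def stratum_weights(population_sizes, sample_sizes):
    """Weight for each stratum: hospitals in the stratum nationally
    divided by hospitals sampled from it."""
    return {s: population_sizes[s] / sample_sizes[s] for s in sample_sizes}


def weighted_total(reports, weights):
    """Sum of reported counts, each multiplied by its stratum weight.

    reports: list of (stratum, injury_count) pairs, one per sampled
    hospital (a hypothetical layout).
    """
    return sum(count * weights[stratum] for stratum, count in reports)


# Hypothetical strata defined by emergency-room size.
weights = stratum_weights(
    population_sizes={"small": 3000, "large": 500},
    sample_sizes={"small": 30, "large": 25},
)
reports = [("small", 4), ("small", 6), ("large", 50)]
print(weighted_total(reports, weights))
```

Because each sampled hospital stands in for many others, an estimate built on a small number of emergency rooms can shift noticeably when a single hospital's count changes.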


In the United States, NIDA's Drug Abuse Warning Network (DAWN) relies on reports from 
about 700 hospital emergency rooms and 85 medical-examiners' or coroners' offices to 
detect emerging trends in the nature and severity of drug-abuse problems 
(9,247). Facilities have reported voluntarily to DAWN since 1972; about 453 
emergency rooms in 21 U.S. cities reported data consistently by 1991 (248). 
Cocaine-related deaths increased rapidly between 1985 and 1988, although recent reports 
suggest that cocaine-related medical emergencies began to decrease in the first half 
of 1989. In the same metropolitan areas, about twice as many deaths were identified 
through DAWN as through the vital statistics system, although time trends were similar 
in both types of data. The DAWN system provides timely information on medical 
emergencies related to drug abuse, although estimates are not population-based and are 
based on voluntary participation from medical facilities. 

In some areas, information may be available from poison-control centers, burn units, 
or trauma registries. In Great Britain, poison-control centers--particularly the 
National Poison Information Service in London--have provided information for a variety 
of studies of trends in abuse of solvents and poisonings of children (249). In the 
United States, poison-control centers--covering 430 defined geographic areas--reported 
over 121,000 instances of exposure to suspected poisons to FDA (243). For instance, 
reports of childhood poisonings to FDA have declined since the introduction of 
child-resistant caps for medication containers, and among children < 5 years of age, 
flavored chewable vitamins are now the most common pharmaceutical product associated 
with poisoning. Information from poison-control centers has also been used to monitor 
acute occupation-related health events such as exposure to agricultural chemicals and 
corrosive chemicals (250). In some centers, requests for information on treatment for 
suspected poisonings may be collected and computerized in a standard form, although a 
standard format for a minimum data set has not been adopted. Exchange of information 
by national and international organizations--such as the American and the European 
Associations of Poison Control Centers and the World Federation of Poison Control 
Centers--facilitates identification and treatment of persons for acute conditions 
related to exposure to toxic substances (249). 

Unlike hospital-discharge data, information from emergency rooms, poison-control 
centers, and related facilities is usually not available routinely in a standard 
format. Efforts are under way, however, to create standard minimum data sets and 
reporting formats to aggregate and compare data. With the increase in surgical and 
other procedures performed on an outpatient basis, the importance of collecting core 
information from outpatient settings will increase. 

Ambulatory Care and Related Data 

With the exception of countries such as Sweden and Canada that have integrated health- 
information systems, ambulatory-care data are not generally available from 
administrative sources for all segments of the population. Information on the 
prevalence of signs, symptoms, and conditions not usually requiring hospitalization is 
usually obtained through periodic surveys of the general population or through 
sentinel-surveillance systems characterized by voluntary reporting of specific 
conditions by health-care providers. In the United States, a Uniform Ambulatory Care 
Data Set (UACDS), first developed in 1974 and revised in 1990, offers the possibility 
for standardization of ambulatory-care data (219), although it is not widely used at 
present. Diagnostic information is often not required, and when it is 
included, actual diagnoses are often difficult to distinguish from presumptive 
diagnoses that are being "ruled out." Inpatient procedures are usually coded using 
the ICD-9-CM, but a universally accepted classification system is not used in 
outpatient settings. The Current Procedural Terminology, fourth edition (CPT-4) and 
the HCFA Common Procedure Coding System (HCPCS) are both used, although CPT-4 codes are 
not equivalent to ICD-9-CM codes used for the same procedures in inpatient settings. 
With rapid changes in medical care, it is difficult to maintain an up-to-date 
procedure-classification system. 

In spite of these limitations, the use of claims and related data from public programs 
for surveillance and program planning is increasing in the United States. While data 
from public programs cover only a segment of the population, it is the segment to 
which public health interventions are most often targeted. Information from the 
Medicaid program, in particular, has been used by state and local health departments. 
About 23 million individuals were enrolled in Medicaid programs in Fiscal Year 1988, 
accounting for about 10% of personal health-care expenditures in the United States 
(123). The eligible population, however, changes substantially over time. 


Because the states have broad discretion in administering the program under federal 
guidelines, benefits vary from state to state, as do the health-information systems 
used to track health claims. The states report aggregate expenditure and utilization 
data to HCFA, although about half the states voluntarily report patient-level 
information (107,228). Data from the five states that report using uniform 
enrollment, provider, and claims-file formats can be aggregated, but otherwise, 
differences in eligibility, covered services, and file structure make it difficult to 
aggregate data across states. Within states, however, health departments are 
attempting to link public health data from various sources to monitor the 
effectiveness of their programs, particularly in the maternal and child health area 
(203). Many states now link birth- and death-certificate data for deaths that occur 
within the first year of life. Some states are able to link Medicaid data with vital- 
record data, and a few are also able to add data from various public health programs 
to linked Medicaid/vital-record data sets. 
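
The linkage of birth and death certificates described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each death record carries the number of the matching birth certificate; the field names and sample records are hypothetical and do not reflect any state's actual file layout. Real linkage projects usually must also match on names, dates, and other identifiers when no shared key exists.

```python
from datetime import date

# Hypothetical records; field names are illustrative assumptions only.
births = [
    {"birth_cert_no": "B001", "date_of_birth": "1990-01-15", "birth_weight_g": 3200},
    {"birth_cert_no": "B002", "date_of_birth": "1990-02-03", "birth_weight_g": 1450},
    {"birth_cert_no": "B003", "date_of_birth": "1990-03-20", "birth_weight_g": 2900},
]
deaths = [
    {"death_cert_no": "D010", "birth_cert_no": "B002", "date_of_death": "1990-02-10"},
    {"death_cert_no": "D011", "birth_cert_no": "B999", "date_of_death": "1990-05-01"},
]


def link_infant_deaths(births, deaths):
    """Link death records to birth records by a shared certificate number.

    Returns (linked, unlinked): linked holds merged records for deaths
    within the first year of life; unlinked holds death records with no
    matching birth record. Linked deaths at one year or older are dropped.
    """
    births_by_id = {b["birth_cert_no"]: b for b in births}
    linked, unlinked = [], []
    for death in deaths:
        birth = births_by_id.get(death["birth_cert_no"])
        if birth is None:
            unlinked.append(death)
            continue
        born = date.fromisoformat(birth["date_of_birth"])
        died = date.fromisoformat(death["date_of_death"])
        age_days = (died - born).days
        if 0 <= age_days < 365:
            linked.append({**birth, **death, "age_at_death_days": age_days})
    return linked, unlinked


linked, unlinked = link_infant_deaths(births, deaths)
```

Keeping the unlinked death records, rather than discarding them, allows the completeness of the linkage to be reviewed before the linked file is used for analysis.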

Public health program data are derived from various sources: maternal- and infant-care 
clinics; vaccination clinics; neonatal screening programs for inborn errors of 
metabolism, maternal drug use, and HIV seroprevalence; lead-screening programs for 
schoolchildren; clinics for children with special needs; families enrolled in the 
Women, Infants, and Children (WIC) nutrition supplement programs; hospital discharge 
data; data from the Pregnancy Risk Assessment Monitoring System (PRAMS); school 
vaccination records; and data from Head Start programs (203,252). State and local 
health departments have met with varying levels of success in linking data sets, but 
the most successful have been able to target and evaluate public health interventions 
and to monitor outcomes. In Tennessee, for instance, adverse sequelae following 
vaccination were monitored using linked vaccination-clinic records, Medicaid-claims 
data, and vital records (252). Also in Tennessee, birth-certificate and WIC data were 
linked to assess the extent to which high-risk infants were enrolled in county WIC 
programs (253). Massachusetts and Colorado are among the states that are redesigning 
data bases for public health programs so that the data can be linked more easily 
(203,251). 
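
As a rough illustration of how linked birth-certificate and WIC data might be used to assess program enrollment among high-risk infants, the sketch below computes the proportion of high-risk births that appear in a WIC enrollment file. The definition of "high risk" used here (birth weight under 2,500 g), the field names, and the sample records are assumptions made for the example; they are not taken from the Tennessee study.

```python
LOW_BIRTH_WEIGHT_G = 2500  # assumed cutoff for "high risk" in this sketch


def is_high_risk(birth):
    """Classify a birth record as high risk (illustrative definition only)."""
    return birth["birth_weight_g"] < LOW_BIRTH_WEIGHT_G


def wic_coverage(births, wic_enrolled_ids):
    """Return the proportion of high-risk infants found in the WIC file.

    Returns None when there are no high-risk infants, so that a
    denominator of zero is not reported as zero coverage.
    """
    high_risk_ids = {b["birth_cert_no"] for b in births if is_high_risk(b)}
    if not high_risk_ids:
        return None
    enrolled = high_risk_ids & set(wic_enrolled_ids)
    return len(enrolled) / len(high_risk_ids)


# Hypothetical linked data: four births, two of them low birth weight.
births = [
    {"birth_cert_no": "B001", "birth_weight_g": 3300},
    {"birth_cert_no": "B002", "birth_weight_g": 2100},
    {"birth_cert_no": "B003", "birth_weight_g": 1800},
    {"birth_cert_no": "B004", "birth_weight_g": 2900},
]
wic_ids = {"B002", "B004"}

coverage = wic_coverage(births, wic_ids)  # one of two high-risk infants enrolled
```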

Some information derived from state and local public health programs is available 
nationally in the United States. CDC's Pediatric Nutrition Surveillance System and 
Pregnancy Nutrition Surveillance System have been operational since 1973 and 1980, 
respectively (203,251). In both systems, key indicators of nutrition status are 
monitored continuously in participating states using information derived from 
publicly funded health, nutrition, and food-assistance programs. Information is 
available from 40 states for the pediatric-nutrition system and from 16 states for the 
pregnancy-nutrition system. These data sets have been used to assess the prevalence 
of malnutrition in children < 2 years; to assess the prevalence of anemia during 
pregnancy among low-income women; and to monitor the decline in the prevalence of 
anemia among low-income children in the United States (254-256). 

Few countries have integrated health-information systems at present, but such systems 
may become more common in the future. Although the patchwork of administrative systems 
now available is not integrated and does not include most of the population, data from 
these systems have been used successfully for public health surveillance and program 
planning. In the United States, computerized hospital discharge data are relatively 
standardized, but access is limited in some states. Because data-reporting formats 
are less standardized for outpatient settings, it is difficult to aggregate such data. 
Efforts by state health departments to create integrated data bases for public 
programs will help states to monitor their programs more effectively. Although 
eligibility may vary among states, standardization and reporting of data for at least 
some core variables could enhance information available nationwide on problems of 
public health importance. 
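
The idea of reporting a small set of core variables in a standard form can be sketched as follows. The two state file layouts, the field names, and the code sets are hypothetical and were invented for this example; the point is only that a per-state mapping to shared field names and shared codes allows records to be pooled even when the source files differ.

```python
# Hypothetical core-variable layout; names and codes are illustrative only.
CORE_FIELDS = ("state", "patient_id", "sex", "birth_year", "service_date")

# Assumed per-state mapping from local field names to core field names.
FIELD_MAPS = {
    "A": {"id": "patient_id", "sex": "sex", "yob": "birth_year",
          "svc_dt": "service_date"},
    "B": {"recipient": "patient_id", "gender": "sex", "birth_yr": "birth_year",
          "date": "service_date"},
}

# Assumed per-state recoding of sex to a shared code set.
SEX_CODES = {
    "A": {"M": "male", "F": "female"},
    "B": {"1": "male", "2": "female"},
}


def to_core(state, record):
    """Convert one state-specific record to the shared core layout."""
    core = {"state": state}
    for local_name, core_name in FIELD_MAPS[state].items():
        core[core_name] = record.get(local_name)
    # Unrecognized or missing codes become "unknown" rather than being guessed.
    core["sex"] = SEX_CODES[state].get(str(core["sex"]), "unknown")
    if core["birth_year"] is not None:
        core["birth_year"] = int(core["birth_year"])
    return core


def pool(records_by_state):
    """Pool records from several states into one list in the core layout."""
    return [to_core(state, rec)
            for state, recs in records_by_state.items()
            for rec in recs]


state_a = [{"id": "A-1", "sex": "F", "yob": "1989", "svc_dt": "1990-03-01"}]
state_b = [{"recipient": "B-7", "gender": 1, "birth_yr": 1990, "date": "1990-04-12"}]
pooled = pool({"A": state_a, "B": state_b})
```

Differences in eligibility and covered services are not addressed by a mapping of this kind; it standardizes only the form of the variables, not the populations they describe.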


Sources of data available for public health surveillance vary considerably from 
country to country. Developed countries and many developing countries are able to monitor 
reproductive outcomes and mortality through vital statistics systems, and many 
countries have notifiable disease-reporting systems for at least some infectious 
diseases. Otherwise, the extent of information available through administrative data 
systems, surveys, registries, and sentinel surveillance systems varies extensively 
from country to country. Although the quality and the completeness of these data 
sources may be limited, they often provide low-cost information that is useful for 
public health surveillance and related activities. Even if new data-collection 
efforts are needed to address specific problems, routinely collected data can provide 
background information that will be useful for designing these studies. 


The increasing computerization of health information, the availability of powerful but 
relatively inexpensive computers, and the development of user-friendly software should 
facilitate the timely use of information from a wide range of sources. Although 
integrated health-information systems and computerized medical records may be on the 
horizon in some countries, limited information that is available quickly from 
notifiable-disease and sentinel-surveillance systems is often the most useful for 
conditions in which timely public health action is needed. Since no one source of 
data is usually adequate, good public health decision-making invariably requires the 
synthesis of data of varying quality from a wide range of sources as well as critical 
interpretation of findings. 

Appendix III.A. Surveillance or Health Information Systems Mentioned in Chapter III 

I. Notifiable diseases and related reporting mechanisms 

NNDSS    National Notifiable Diseases Surveillance System, United States (CDC and state health departments) 

VAERS    Vaccine Adverse Event Reporting System, United States (FDA) 

II. Vital statistics 

121-City Surveillance System, United States 

MSS    Mortality Surveillance System, United States 

NTOF    National Traumatic Occupational Fatality surveillance system, United States 

Medical Examiner/Coroner Information Sharing System, United States (NCEH/CDC) 

FARS    Fatal Accident Reporting System, United States (NHTSA) 

III. Sentinel surveillance 

SENSOR    Sentinel Event Notification System for Occupational Risks, United States (NIOSH/CDC) 

EEARDN    European Electronic Adverse Drug Reaction Network, Europe 

IV. Registries 

Connecticut Tumor Registry, United States 

SEER    Surveillance, Epidemiology, and End Results Program, United States (NCI) 

Metropolitan Atlanta Congenital Defects Program, United States (NCEH/CDC) 

V. Surveys 

General Household Survey, United Kingdom 

Continuous Household Survey, Ireland 

National Health Interview Survey, United States (NCHS/CDC) 

Behavioral Risk Factor Surveillance System, United States (NCCDPHP/CDC and state health departments) 

Youth Risk Behavior Surveillance System, United States (NCCDPHP/CDC and state health departments) 

National Hospital Discharge Survey, United States (NCHS/CDC) 

National Ambulatory Medical Care Survey, United States (NCHS/CDC) 

National Disease and Therapeutic Index, United States (private sources) 

National Survey of Family Growth, United States (NCHS/CDC) 

National Health and Nutrition Survey, United States (NCHS/CDC) 

Hispanic Health and Nutrition Survey, United States (NCHS/CDC) 

HANES I Follow-up Study, United States 

VI. Administrative data-collection systems 

Professional Activity Studies, United States 

McDonnell Douglas Hospital Information System, United States 

Birth Defects Monitoring Program, United States (NCEH/CDC) 

Medicare Provider Analysis and Review, United States (HCFA) 

Home Accident Surveillance System, United 

European Home and Leisure Accident Surveillance System, Europe 

National Electronic Injury Surveillance System, United States (CPSC) 

Drug Abuse Warning Network, United States 

Pregnancy Risk Assessment Monitoring System, United States (NCCDPHP/CDC and state health departments) 

1. Alderson M. International mortality statistics. New York: Facts on File, 
Inc., 1981. 

2. Ashley JSA, Cole SK, Kilbane MPJ. Health information resources: United Kingdom--health 
and social factors. In: Holland WW, Detels R, Knox G (eds.). Oxford textbook of public 
health. 2nd edition. Vol. 2: methods of public health surveillance. Oxford, 
England: Oxford University Press, 1991:29-54. 

3. Yanagawa H, Nagai M. Health information resources: Japan--health and social 
factors. In: Holland WW, Detels R, Knox G (eds.). Oxford textbook of public 
health. 2nd edition. Vol. 2: methods of public health surveillance. Oxford, 
England: Oxford University Press, 1991:55-66. 

4. National Center for Health Statistics. International Health Data Reference 
Guide, 1991. DHHS Publication no. (PHS) 92-1007. Hyattsville, Md.: National 
Center for Health Statistics, 1992. 

5. World Health Organization. World health statistics annual. Geneva, 
Switzerland: WHO, 1988. 

6. United Nations. 1989 demographic yearbook. 41st issue. New York: Department of 
International and Social Affairs, Statistical Office, 1991. 

7. World Health Organization. Targets for health for all: targets in support of 
the European regional strategy for health for all. Copenhagen, Denmark: WHO 
Regional Office for Europe, 1985. 

8. National Office of Vital Statistics. Reported incidence of selected notifiable 
diseases. United States, each division and state, 1920-50. Vital Statistics 
Special Reports. Washington, D.C.: U.S. Government Printing Office, 1953;37:180. 

9. Pearce N. Health information resources: United States--health and social 
factors. In: Holland WW, Detels R, Knox G (eds.). Oxford textbook of public 
health. 2nd edition. Vol. 2: methods of public health surveillance. Oxford, 
England: Oxford University Press, 1991:11-28. 

10. Moro ML, McCormick A. Surveillance for communicable disease. In: Eylenbosch WJ 
and Noah ND (eds.). Surveillance in health and disease. Oxford, England: Oxford 
University Press, 1988:166-82. 

11. Chorba TL, Berkelman RL, Safford SK, Gibbs N, Hull HF. Mandatory reporting of 
infectious diseases by clinicians. MMWR 1990;39(RR-9):1-17. 

12. Centers for Disease Control. Summary of notifiable diseases, United States, 
1991. MMWR 1991;40. 

13. Freund E, Seligman PJ, Chorba TL, Safford SK, Drachman JG, Hull HF. Mandatory 
reporting of occupational diseases by clinicians. MMWR 1990;39(RR-9):19-28. 

14. Seligman PJ, Matte TD. Case definitions in public health. Am J Public Health 

15. Centers for Disease Control. Childhood lead poisoning, New York City, 1988. MMWR 
1990;39(SS-4):1-7. 

16. Green LA, Lutz LJ. Notions about networks: primary care practices in pursuit of 
improved primary care. In: Mayfield J and Grady ML (eds.). Primary care 
research: agenda for the 90s. Rockville, Md.: Agency for Health Care Policy and 
Research, 1990:13-22. 

17. Centers for Disease Control. Case definitions for public health surveillance. 
MMWR 1990;39(No. RR-13):1-43. 

18. Centers for Disease Control. Vaccine adverse event reporting system--United 
States. MMWR 1990;39:730-3. 


19. Faich GA. National adverse drug reaction reporting, 1984-1989. Arch Intern Med 

20. Rossi AC, Bosco L, Faich GA, Tanner A, Temple R. The importance of adverse 
reaction reporting by physicians: suprofen and the flank pain syndrome. JAMA 

21. Kimmel K. Surveillance for adverse reactions to drugs. In: Eylenbosch WJ and 
Noah ND (eds.). Surveillance in health and disease. Oxford, England: Oxford 
University Press, 1988:244-54. 

22. Inman WHW. Hazards of drug therapy. In: Holland WW, Detels R, Knox G (eds.). 
Oxford textbook of public health. 2nd edition. Vol. 2: methods of public health 
surveillance. Oxford, England: Oxford University Press, 1991:481-500. 

23. Centers for Disease Control. National electronic telecommunications system for 
surveillance--United States, 1990-1991. MMWR 1991;40:502. 

24. Tsai TF. Arboviral infections in the United States. Infect Dis Clin North Am 
1991;5(1):73-102. 

25. Fishbein D. Rabies. Infect Dis Clin North Am 1991;5(1):53-74. 

26. Craven RB, Barnes AM. Plague and tularemia. Infect Dis Clin North Am 
1991;5(1):165-76. 

27. Buchstein SR, Gardner P. Lyme disease. Infect Dis Clin North Am 1991;5(1):103- 

28. Weber DJ, Walker DH. Rocky Mountain spotted fever. Infect Dis Clin North Am 
1991;5(1):19-36. 

29. Sacks JJ. Utilization of case definitions and laboratory reporting in the 
surveillance of notifiable communicable diseases in the United States. Am J 
Public Health 1985;75:1420-2. 

30. Berkelman RL, Buehler JW. Surveillance. In: Holland WW, Detels R, Knox G (eds.). 
Oxford textbook of public health. 2nd edition. Vol. 2: methods of public health 
surveillance. Oxford, England: Oxford University Press, 1991:161-76. 

31. Sherman IL, Langmuir AD. Usefulness of communicable disease reports. Public 
Health Rep 1952;67:1249-57. 

32. Thacker SB, Berkelman RL. Public health surveillance in the United States. 
Epidemiol Rev 1988;10:164-90. 

33. Serfling RE, Sherman IL. Problems in improving reported morbidity data as a 
tool for epidemiological research. CDC bulletin. Atlanta, Ga.: Public Health 
Service, October 1951:24-7. 

34. Fife D, McAnaney, Rahman MA. Changes in AIDS case reporting after hospital site 
visits. Am J Public Health 1991;81:1648-50. 

35. Conway GA, Colley-Niemeyer B, Pursley C et al. Underreporting of AIDS cases in 
South Carolina, 1986 and 1987. JAMA 1989;262:2859-63. 

36. Selik RM, Buehler JW, Karon JM, Chamberland ME, Berkelman RL. Impact of the 
1987 revision of the case definition of acquired immune deficiency in the United 
States. J Acquir Immune Defic Syndr 1990;3:73-82. 

37. Cohen DA, Boyd D, Prabhudas I, Mascola L. The effects of case definition in 
maternal screening and reporting criteria on rates of congenital syphilis. Am J 
Public Health 1990;80:316-7. 

38. Centers for Disease Control. Congenital syphilis--New York City, 1986-1988. 
MMWR 1989;38:825-9. 

39. Centers for Disease Control. Lyme disease surveillance--United States, 
1989-1990. MMWR 1991;40:417-21. 


40. Andrews JM, Quinby GE, Langmuir AD. Malaria eradication in the United States. 
Am J Public Health 1950;40:1405-11. 

41. Langmuir AD. The surveillance of communicable diseases of national importance. 
N Engl J Med 1963;268:182-92. 

42. Noah N. Transmissible agents. In: Holland WW, Detels R, Knox G (eds.). Oxford 
textbook of public health. 2nd edition. Vol. 2: methods of public health 
surveillance. Oxford, England: Oxford University Press, 1991:417-35. 

43. Holmberg SD, Osterholm MT, Senger KA, Cohen ML. Drug-resistant Salmonella from 
animals fed antimicrobials. N Engl J Med 1984;311:617-22. 

44. Centers for Disease Control. Measles prevention: recommendations of the 
Immunization Practices Advisory Committee (ACIP). MMWR 1989;38(No. S-9):1-18. 

45. Centers for Disease Control. Hepatitis B virus: a comprehensive strategy for 
eliminating transmission in the United States through universal childhood 
vaccination; recommendations of the Immunization Practices Advisory Committee 
(ACIP). MMWR 1991;40(No. RR-13). 

46. Nathanson N, Langmuir AD. The Cutter incident: poliomyelitis following 
formaldehyde-inactivated poliovirus vaccination in the United States during the 
spring of 1955. I. Background. American Journal of Hygiene 1963;78:16-28. 

47. Centers for Disease Control. Summary of 1990-1991 influenza season, United 
States. Influenza Branch, Division of Viral and Rickettsial Diseases. Atlanta, 
Ga.: 1991. 

48. Centers for Disease Control. Eosinophilia-myalgia syndrome--New Mexico. MMWR 

49. Watkins M, Lapham S, Hoy W. Use of a medical center's computerized health care 
data base for notifiable disease surveillance. Am J Public Health 1991;81:637- 

50. Monson RR. Occupational epidemiology. 2nd edition. Boca Raton, Fla.: CRC 
Press, Inc., 1990. 

51. World Health Organization. Manual of international statistical classification 
of diseases, injuries, and causes of death. Ninth Revision. Geneva, 
Switzerland: WHO, 1977. 

52. Ruzicka LT, Lopez AD. The use of cause-of-death statistics for health situation 
assessment: national and international experiences. World Health Stat Q 

53. Brzezinski ZJ. Mortality indicators and health-for-all strategies in the WHO 
European region. World Health Stat Q 1986;39:365-78. 

54. Kleinman JC. The slowdown in the infant mortality decline. Paediatr Perinat 
Epidemiol 1990;4:373-81. 

55. Cooper R, Sempos C, Hsieh SC, Kovar MG. Slowdown in the decline of stroke 
mortality in the United States, 1978-1986. Stroke 1990;21:1274-9. 

56. Schwartz E, Kofie VY, Rivo M, Tuckson R. Black/white comparisons of deaths 
preventable by medical intervention: United States and the District of Columbia 
1980-1986. Int J Epidemiol 1990;19:591-8. 

57. McCord C, Freeman HP. Excess mortality in Harlem. N Engl J Med 1990;322:173-7. 

58. Kleinman JC, Kiely JL. Postneonatal mortality in the United States: an 
international perspective. Pediatrics 1990;86:1091-97. 

59. Li J-Y. Cancer mapping as an epidemiologic research resource in China. Recent 
Results Cancer Res 1989;114:115-36. 


60. Walter SD, Birnie SE. Mapping mortality and morbidity patterns: an 
international comparison. Int J Epidemiol 1991;20:678-89. 

61. Fingerhut LA, Kleinman JC. International and interstate comparisons of homicide 
among young males. JAMA 1990;263:3292-5. 

62. Powell-Griner E, Woolright A. Trends in infant deaths from congenital 
anomalies; results from England and Wales, Scotland, Sweden, and the United 
States. Int J Epidemiol 1990;19:391-8. 

63. Office of Population Censuses and Surveys. The Registrar General's Decennial 
Supplement 1970-72, England and Wales. Occupational Mortality. Series DS No. 
1. London: Her Majesty's Stationery Office, 1978. 

64. New estimates of maternal mortality. Wkly Epidemiol Rec 1991;66:345-52. 

65. Lew JF, Glass RI, Gangarosa RE, Cohen IP, Bern C, Moe CL. Diarrheal deaths in 
the United States, 1979 through 1987. JAMA 1991;265:3280-4. 

66. Weiss KB, Wagener DK. Asthma surveillance in the United States: a review of 
current trends and knowledge gaps. Chest 1990;98:179S-184S. 

67. Sutter RW, Cochi SL, Brink EW, Sirotkin BI. Assessment of vital statistics and 
surveillance data for monitoring tetanus mortality. United States, 1979-1984. 
Am J Epidemiol 1990;131:132-42. 

68. Hardy RJ, Schroder GD, Cooper SP, Buffler PA, Prichard HM, Crane A. 
Surveillance system for assessing health effects from hazardous exposures. Am J 
Epidemiol 1990;132:S32-S42. 

69. Pickle LW, Mason TJ, Howard N, Hoover R, Fraumeni JF. Atlas of U.S. cancer 
mortality among whites: 1950-1980. DHHS Publ No (NIH) 87-2900. Washington, 
D.C.: U.S. Government Printing Office, 1987. 

70. Brownson RC, Smith CA, Jorge NE et al. The role of data-driven planning and 
coalition development in preventing cardiovascular disease. Public Health Rep 

71. Boss LP, Suarez L. Use of data to plan cancer prevention and control programs. 
Public Health Rep 1990;105:354-60. 

72. Chan LS, Portnoy B, Black BL. California's use of health statistics in child 
health planning. Am J Prev Med 1985;1:24-30. 

73. Department of Health and Human Services. Healthy people 2000: national health 
promotion and disease prevention objectives. U.S. Department of Health and 
Human Services. Washington, D.C.:U.S. Government Printing Office, DHHS 
Publication No (PHS) 91-50212, 1991. 

74. Percy C, Staneck E, Gloeckler L. Accuracy of cancer death certificates and its 
effect on cancer mortality statistics. Am J Public Health 1981;71:242-50. 

75. Percy C, Muir C. The international comparability of cancer mortality data: 
results of an international death certificate study. Am J Epidemiol 

76. Kelson MC, Heller RF. The effect of death certification and coding practices on 
observed differences in respiratory disease mortality in 8 EEC countries. Rev 
Epidemiol Sante Publique 1983;31:423-2. 

77. Vital and health statistics: the 1989 revision of the U.S. standard certificates 
and reports. Series 4: documents and committee reports. No. 28. DHHS Publ no 
(PHS) 91-1465. Hyattsville, Md.: National Center for Health Statistics, 1991. 

78. National Center for Health Statistics. Physicians' handbook on medical 
certification of death. Hyattsville, Md.: National Center for Health Statistics, 


79. National Center for Health Statistics. Hospitals' and physicians' handbook on 
birth registration and fetal death reporting. Hyattsville, Md.: National Center 
for Health Statistics, 1987. 

80. Taffel SM, Ventura SJ, Gay GA. Revised U.S. certificate of birth--new 
opportunities for research on birth outcome. Birth 1989;16:188-93. 

81. Freedman MA, Gay GA, Brockert JE, Potrzebowski PW, Rothwell CJ. The 1989 
revision of the U.S. standard certificates of live birth and death and the U.S. 
standard report of fetal death. Am J Public Health 1988;78:168-72. 

82. Funeral director's handbook on death registration and fetal death reporting. 
Hyattsville, Md.: National Center for Health Statistics, 1987. 

83. Kleinman JC, Kiely JL. Infant mortality. Healthy People 2000 statistical notes. 
Hyattsville, Md.: National Center for Health Statistics, 1991:1(no 2). 

84. Vital statistics of the United States, 1988. Vol I - Natality. DHHS Publ No 
(PHS) 90-1100. Hyattsville, Md.: National Center for Health Statistics, 1990. 

85. Vital statistics of the United States, 1988. Vol II - Mortality, Part A. DHHS 
Publ No (PHS) 90-1100. Hyattsville, Md.: National Center for Health Statistics, 

86. Lopez AD. Causes of death: an assessment of global patterns of mortality around 
1985. World Health Stat Q 1990;43:91-104. 

87. Becker TM, Wiggins CL, Key CR, Samet JM. Signs, symptoms and ill-defined 
conditions: a leading cause of death among minorities. Am J Epidemiol 

88. Report of a workshop on improving cause-of-death statistics. In: National 
Committee on Vital and Health Statistics, 1990. DHHS Publication no. (PHS) 
91-1205. Hyattsville, Md.: National Center for Health Statistics, 1991:53-77. 

89. Report of the second workshop on improving cause-of-death statistics. In: 
National Committee on Vital and Health Statistics, 1991. DHHS Publication no. 
(PHS) 92-1205. Hyattsville, Md.: National Center for Health Statistics, 

90. Baron RC, Dicker RC, Bussell KE, Herndon JL. Assessing trends in mortality in 
121 U.S. Cities, 1970-79, from all causes and from pneumonia and influenza. 
Public Health Rep 1988;103:120-8. 

91. Interagency Committee on Infant Mortality. Data and surveillance systems related 
to programs to reduce infant mortality: a directory of federal efforts. 
Atlanta, Ga.: Public Health Service, January 1992. 

92. Enterline PE. Extrapolation from occupational studies: a substitute for 
environmental epidemiology. Environ Health Perspect 1981;42:39-44. 

93. National Institute for Occupational Safety and Health. National traumatic 
occupational fatalities, 1980-1985. Atlanta, Ga.: Public Health Service, March 

94. Centers for Disease Control. Injuries associated with horseback riding. MMWR 

95. Centers for Disease Control. Dilaudid-related deaths--District of Columbia, 
1987. MMWR 1988;37:425-7. 

96. Centers for Disease Control. Medical examiner/coroner reports of deaths 
associated with Hurricane Hugo--Puerto Rico, 1989. MMWR 1989;38:680-2. 

97. Centers for Disease Control. Medical examiner summer mortality surveillance 
system. MMWR 1982;31:336-43. 

98. Centers for Disease Control. Earthquake-associated deaths--California, 1989. 
MMWR 1989;38:767-70. 


99. Centers for Disease Control. Child passenger restraint use and motor-vehicle- 
related fatalities. MMWR 1991;40:600-2. 

100. Centers for Disease Control. Premature mortality due to alcohol-related motor 
vehicle traffic fatalities--United States, 1987. MMWR 1988;37:753-5. 

101. Centerwall BS. Homicide and the prevalence of handguns: Canada and the United 
States, 1976 to 1980. Am J Epidemiol 1991;11:1245-60. 

102. Rutstein DD, Mullan RJ, Frazier JM, Halperin WE, Melius JM, Sestito JP. Sentinel 
health events (occupational): a basis for physician recognition and public 
health surveillance. Am J Public Health 1983;73:1054-62. 

103. Woodall JP. Epidemiological approaches to health planning, management, and 
evaluation. World Health Stat Q 1988;41:2-10. 

104. Van Casteren V. Inventory of sentinel health information systems with GPs in 
European countries. Eurosentinel. Brussels, Belgium: Institute of Hygiene and 
Epidemiology, January 1991. 

105. Van Casteren V, Leurquin P. Eurosentinel: Concerted Action on Sentinel Health 
Information Systems with General Practitioners. Final Report. Brussels. 
Institute of Hygiene and Epidemiology. August 1991. 

106. Chavez GF, Mulinare J, Cordero JF. Leading major congenital malformations among 
minority groups in the United States, 1981-1986. JAMA 1989;261:205-8. 

107. Roper WL, Winkenwerder W, Hackbarth GM, Krakauer H. Effectiveness in health 
care: an initiative to evaluate and improve medical practice. N Engl J Med 

108. Hartz AJ, Krakauer H, Kuhn EM et al. Hospital characteristics and mortality 
rates. N Engl J Med 1989;321:1720-5. 

109. Rutstein DD, Berenberg W, Chalmers TC, Child CG, Fishman AP, Perrin EB. 
Measuring the quality of medical care. N Engl J Med 1976;294:582-8. 

110. Baker EL. Sentinel event notification system for occupational risks (SENSOR): 
the concept. Am J Public Health 1989;79S:18-20. 

111. Centers for Disease Control. Consensus set of health status indicators for 
general assessment of community health status--United States. MMWR 1991;40:449- 

112. Alter MJ, Hadler SC, Margolis HS et al. The changing epidemiology of hepatitis 
B in the United States: need for alternative vaccination strategies. JAMA 

113. Alter MJ, Hadler SC, Judson FN, et al. Risk factors for acute non-A non-B 
hepatitis in the United States and association with hepatitis C virus infection. 
JAMA 1990;264:2231-5. 

114. Pappaioanou M, Dondero TJ, Petersen LR, Onorato IM, Sanchez CD, Curran JW. The 
family of HIV seroprevalence surveys: objectives, methods, and use of sentinel 
surveillance for HIV in the United States. Public Health Rep 1990;105:113-9. 

115. Green LA, Wood M, Becker L et al. The Ambulatory Sentinel Practice Network: 
purposes, methods, and policies. J Fam Pract 1984;18:275-80. 

116. Hughes JP, van Belle G, Kukull W, Larson EB, Teri L. On the uses of registries 
for Alzheimer disease. Alzheimer Dis Assoc Disord 1989;3:205-17. 

117. Froom J, Culpepper J, Grob P et al. Diagnosis and antibiotic treatment of acute 
otitis media: report from International Primary Care Network. British Medical 
Journal 1990;300:582-6. 

118. Van Casteren V, Leurquin P, Declercq E et al. Study of the use of some selected 
groups of laboratory tests in general practice. Summary report. Eurosentinel. 
Brussels, Belgium: Institute of Hygiene and Epidemiology. June 1991. 


119. Fleming DM, Crombie DL. Weekly returns service report for 1990. Birmingham, 
England: Birmingham Research Unit of the Royal College of General Practitioners, 
June 1991. 

120. Fleming D, Ayres JG. Diagnosis and patterns of incidence of influenza, 
influenza-like illness and the common cold in general practice. Journal of the 
Royal College of General Practitioners 1988;38:159-62. 

121. Netherlands Institute of Primary Care. Continuous morbidity registration 
sentinel stations in the Netherlands. Utrecht: NIVEL Netherlands Institute of 
Primary Health Care, September 1991. 

122. Bartelds AIM, Fracheboud J, van der Zee J. The Dutch Sentinel Practice Network: 
relevance for public health policy. Utrecht: Netherlands Institute for Primary 
Care (NIVEL), 1989. 

123. Sprenger MJW, Mulder PGH, Beyer WEP, Masurel N. Influenza: relation of 
mortality to morbidity parameters--Netherlands, 1970-1989. Int J Epidemiol 

124. Lobet MP, Stroobant A, Mertens R et al. Tool for validation of the network of 
sentinel general practitioners in the Belgian health care system. Int J 
Epidemiol 1987;16:612-8. 

125. Stroobant A, Van Casteren V, Thiers G. Surveillance systems from primary-care 
data: surveillance through a network of sentinel general practitioners. In: 
Eylenbosch WJ, Noah ND (eds.). Surveillance in health and disease. Oxford, 
England: Oxford University Press, 1988:62-74. 

126. Stroobant A, Lamotte J, Van Casteren V. Epidemiological surveillance of measles 
through a network of sentinel general practitioners in Belgium. Int J Epidemiol 

127. Valleron A-J, Bouvet E, Garnerin P et al. A computer network for the 
surveillance of communicable diseases: the French experiment. Am J Public 
Health 1986;76:1289-92. 

128. Costagliola D, Flahault A, Galinec D, Garnerin P, Menares J, Valleron A-J. A 
routine tool for detection and assessment of epidemics of influenza-like 
syndromes in France. Am J Public Health 1991;81:97-9. 

129. Garnerin P. Regional distribution of influenza-like syndrome during weeks 49-51 
of 1989 (number of cases/physician/week). British Medical Journal 1990;300:701. 

130. Surveillance of influenza-like diseases through a National Computer 
Network--France, 1984-1989. MMWR 1989;38:585-7. 

131. Massari V, Brunet JB, Bouvet E, Valleron AJ. Attitudes towards HIV-antibody 
testing among general practitioners and their patients. Eur J Epidemiol 

132. Maurice S, Megraud F, Vivares C et al. Telematics: a new tool for epidemiologic 
surveillance of diarrhoeal diseases in the Aquitaine sentinel network. British 
Medical Journal 1990;300:514-670. 

133. Centers for Disease Control. Influenza activity--United States, 1991-92. MMWR 

134. Ambulatory Sentinel Practice Network. 1991 Convocation. Chicago, September 

135. Freeman WL, Green LA, Becker LA. Pelvic inflammatory disease in primary care: a 
report from ASPN. Fam Med 1988;20:192-6. 

136. Green LA, Becker LA, Freeman WL, Iverson DC, Reed FM. Spontaneous abortion in 
primary care: a report from ASPN. J Am Board Fam Pract 1988;1:15-23. 

137. Ambulatory Sentinel Practice Network. An exploratory report of chest pain in 
primary care. A report from ASPN. J Am Board Fam Pract 1990;3:143-50. 


138. Petersen LR, Calonge NB, Chamberland ME, Engel R, Herring NC. Methods of 
surveillance for HIV infection in primary care outpatients in the United States. 
Public Health Rep 1990;105:158-62. 

139. Hickner J. Practice-based primary care research networks. In: Hibbard H, 
Nutting PA, and Grady (eds.). Primary care research: theory and methods. 
Rockville, Md.:Agency for Health Care Policy and Research, 1991:13-22. 

140. Culpepper L, Froom J. The International Primary Care Network: purpose, methods, 
and policies. Fam Med 1988;20:197-201. 

141. Goldberg J, Gelfand HM, Levy PS. Registry evaluation methods: a review and 
case study. Epidemiol Rev 1980;2:210-20. 

142. Weddell JM. Registers and registries: a review. Int J Epidemiol 1973;2:221-8. 

143. Pollack DA, McClain PW. Trauma registries: current status and future prospects. 
JAMA 1989;262:2280-3. 

144. Ahlbom A. Acute myocardial infarction in Stockholm--a medical information 
system as an epidemiologic tool. Int J Epidemiol 1978:271-6. 

145. Whiting L. The central registry for child abuse cases: rethinking basic 
assumptions. Child Welfare 1977;56:761-7. 

146. Hammar N, Nerbrand C, Ahlmark G et al. Identification of cases of myocardial 
infarction: hospital discharge data and mortality compared to myocardial 
infarction community registers. Int J Epidemiol 1991:114-20. 

147. Brown LJ, Scott RS. A population-based register-development and applications. 
Community Health Stud 1988;12:437-43. 

148. Eisenbud M, Lisson J. Epidemiological aspects of beryllium-induced nonmalignant 
lung disease: a 30-year update. J Occup Med 1983;25:196-202. 

149. Johnson A, King R. A regional register of early childhood impairments: a 
discussion paper. Community Medicine 1989;11:352-63. 

150. Agency for Toxic Substances and Disease Registry. Policies and procedures for 
establishing a national registry of persons exposed to hazardous substances. 
Atlanta, Ga.: Agency for Toxic Substances and Disease Registry, 1988. 

151. Zack M. The pros and cons of exposure registries. In: Proceedings of the 
National Conference on Hazardous Wastes and Environmental Emergencies. 

152. Goldhaber MK, Tokuhata GK, Digon E et al. The Three Mile Island population 
registry. Public Health Rep 1983;98:603-9. 

153. The EUROCAT Working Group. Preliminary evaluation of the impact of the 
Chernobyl radiological contamination on the frequency of central nervous system 
malformations in 18 regions of Europe. Paediatr Perinat Epidemiol 

154. Anderson RE, Key CR, Yamamoto T, Thorslund T. Aging in Hiroshima and Nagasaki 
atomic bomb survivors. Speculations based upon the age-specific mortality of 
persons with malignant neoplasms. Am J Pathol 1974;75:1-11. 

155. Sprince NL, Kazemi H. Beryllium disease. In: Rom WN (ed.). Environmental and 
occupational medicine. Boston, Massachusetts: Little, Brown, and Co., 

156. Austin DF. Cancer registries: a tool in epidemiology. In: Lilienfeld AM (ed.). 
Reviews in cancer epidemiology. Vol 2. New York: Elsevier, 1983:119-39. 

157. Menck HR, Garfinkel L, Dodd GD. Preliminary report of the National Cancer Data 
Base. Cancer 1991;41:7-8. 


158. Centers for Disease Control. National survey of trauma registries--United 
States, 1987. MMWR 1989;38:857-9. 

159. Centers for Disease Control. Report from the trauma registry workshop, 
including recommendations for hospital-based trauma registries. J Trauma 

160. Blot WJ, Fraumeni JF Jr, Mason TJ, Hoover RN. Developing clues to environmental 
cancer: a stepwise approach with the use of cancer mortality data. Environ 
Health Perspect 1979;32:53-8. 

161. MacLennan R, Muir C, Steinitz R, Winkler A (eds.). Cancer registration and its 
techniques. IARC scientific publications, no. 21. Lyon, France: International 
Agency for Research on Cancer, 1978. 

162. Clemmesen J. Uses of cancer registration in the study of carcinogenesis. J 
Natl Cancer Inst 1981;67:5-13. 

163. Shimizu Y, Schull WJ, Kato H. Cancer risk among atomic bomb survivors. The 
RERF Life Span Study. Radiation Effects Research Foundation. JAMA 

164. Tsyb AF, Dedenkov AN, Ivanov VK, Stepanenko VF, Pozhidaev w. The development 
of an all-union registry of persons exposed to radiation resulting from the 
accident at the Chernobyl atomic power station. Medical Radiology 1989;34:3-6. 

165. Axelson O. Occupational and environmental exposures to radon: cancer risks. 
Annu Rev Public Health 1991;12:235-55. 

166. Rowland RE, Stehney AF, Lucas HF Jr. Dose-response relationships for female 
radium dial workers. Radiat Res 1978;76:368-83. 

167. Archer VE. Lung cancer risks of underground miners: cohort and case-control 
studies. Yale J Biol Med 1988;61(3):183-93. 

168. Landrigan PJ, Wilcox KR Jr, Silva J Jr, Humphrey HE, Kauffman C, Heath CW Jr. 
Cohort study of Michigan residents exposed to polybrominated biphenyls: 
epidemiologic and immunologic findings. Ann N Y Acad Sci 1979;320:284-94. 

169. Ries LAG, Hankey BF, Miller BA, Hartman AM, Edwards BK (eds.). Cancer 
statistics review 1973-88. NIH publication no. (NIH) 91-2789. Bethesda, Md.: 
National Cancer Institute, 1991. 

170. Muir C, Waterhouse J, Mack T, Powell J, Whelan S (eds.). Cancer incidence in 
five continents. Volume V. Lyon, France: International Agency for Research on 
Cancer, 1987. 

171. Parkin D. Surveillance of cancer. In: Eylenbosch WJ, Noah ND (eds.). 
Surveillance in health and disease. New York: Oxford University Press. 1988 pp 

172. Edmonds LD, Layde PM, James LM, Flynt JW, Erickson JD, Oakley GP Jr. Congenital 
malformations surveillance: two American systems. Int J Epidemiol 

173. Lynberg MC, Edmonds LD. Surveillance of birth defects. In: Halperin WE, Baker 
EL, Monson RR (eds.). Public health surveillance. New York:Van Nostrand 
Reinhold, 1992:157-77. 

174. Holtzman NA, Khoury MJ. Monitoring for congenital malformations. Annu Rev 
Public Health 1986;7:237-66. 

175. Erickson JD, Mulinare J, McClain PW et al. Vietnam veterans' risks for 
fathering babies with birth defects. JAMA 1984;252:903-12. 

176. Becerra JE, Khoury MJ, Cordero JF, Erickson JD. Insulin-dependent diabetes 
mellitus in pregnancy and risk for specific birth defects. Pediatr Res 


177. Mulinare J, Cordero JD, Erickson JD, Berry RJ. Periconceptional use of 
multivitamins and the occurrence of neural tube defects. JAMA 1988;260:3141-5. 

178. Weatherall JA, de Wals P, Lechat MF. Evaluation of information systems for the 
surveillance of congenital malformations. Int J Epidemiol 1984;13:193-6. 

179. Lechat MF. EUROCAT report: surveillance of congenital anomalies, years 1980- 
1986. Brussels: Catholic University of Louvain, 1989. 

180. Lammer EJ, Sever LE, Oakley GP, Jr. Teratogen update: valproic acid. 
Teratology 1987;35(3):465-73. 

181. Kovar MG. Data systems of the National Center for Health Statistics. Vital & 
Health Statistics. Hyattsville, Maryland: National Center for Health 
Statistics. DHHS publication no. (PHS) 89-1325, (Vital & Health Statistics; 
series 1; no. 23), 1989. 

182. Massey JT. Overview of the National Health Interview Survey and its sample 
design. Hyattsville, Maryland: National Center for Health Statistics. DHHS 
publication no. (PHS) 89-1384 (Vital & Health Statistics; series 2; no. 110), 

183. Moore TF, Tadros W. The 1985-94 NHIS sample design. Hyattsville, Maryland: 
National Center for Health Statistics. DHHS publication no. (PHS) 89-1384 
(Vital & Health Statistics; series 2; no. 110), 1989:18-27. 

184. Centers for Disease Control. Behavior risk factor surveillance, 1988. MMWR 
1990;39(Suppl 2):1-6. 

185. Centers for Disease Control. Increased awareness in urban and rural areas-- 
Missouri, 1988-91. MMWR 1992;41:323-5. 

186. Centers for Disease Control. Cigarette smoking among Chinese, Vietnamese, and 
Hispanics, 1989-91. MMWR 1992;41:362-7. 

187. Kolbe LJ. An epidemiological surveillance system to monitor the prevalence of 
youth behaviors that most affect health. Health Education 1990;21(6):44-4. 

188. Centers for Disease Control. Participation of high school students in school 
physical education--United States, 1990. MMWR 1991;40(35):607,613-5. 

189. Centers for Disease Control. Tobacco use among high school students--United 
States, 1990. MMWR 1991;40(36):617-9. 

190. Centers for Disease Control. Attempted suicide among high school 
students--United States, 1990. MMWR 1991;40(37):633-5. 

191. Centers for Disease Control. Weapon-carrying among high school students--United 
States, 1990. MMWR 1991;40(40):681-4. 

192. Centers for Disease Control. Body-weight perception and selected 
weight-management goals and practices of high school students--United States, 
1990. MMWR 1991;40(43):741,747-50. 

193. Centers for Disease Control. Sexual behavior among high school students--United 
States, 1990. MMWR 1992;40(51-52):885-8. 

194. Centers for Disease Control. Current tobacco, alcohol, marijuana, and cocaine 
use among high school students--United States, 1990. MMWR 1991;40(38):659-63. 

195. Graves EJ. National Hospital Discharge Survey. Hyattsville, Maryland: 
National Center for Health Statistics. DHHS publication no. (PHS) 89-1760 
(Vital & Health Statistics; series 13; no. 99), 1989. 

196. Nelson C, McLemore T. The National Ambulatory Medical Care Survey: 1975-81 and 
1985. Hyattsville, Maryland: National Center for Health Statistics. DHHS 
publication no. (PHS) 88-1754 (Vital & Health Statistics; series 13; no. 93), 


197. Hahn RA, Teutsch SM, Rothenberg RB, Marks JS. Excessive deaths from nine 
chronic diseases in the United States, 1986. JAMA 1990;264:2654-9. 

198. DeLozier JE, Gagnon RO. National Ambulatory Medical Care Survey, 1989 summary. 
Advance data from vital and health statistics of NCHS. Hyattsville, Md.: 
National Center for Health Statistics, 1991;203:1-11. 

199. Schilling S, Wilson D. Wisconsin Ambulatory Medical Care Survey, 1986-1987. 
Madison: Wisconsin Department of Health and Social Services, 1987. 

200. Arrowsmith JB, Kennedy DL, Kuritsky JN, Faich GA. National patterns of aspirin 
use and Reye syndrome reporting. United States, 1980 to 1985. Pediatrics 

201. Mathiowetz N, Northrup D, Sperry S, Waksberg J. Linking the National Survey of 
Family Growth with the National Health Interview Survey. Hyattsville, Maryland: 
National Center for Health Statistics. DHHS publication no. (PHS) 87-1377 
(Vital & Health Statistics; series 2; no. 103), 1987. 

202. Dawson DA. Family structure and children's health: United States, 1988. 
Hyattsville, Maryland: National Center for Health Statistics. DHHS publication 
no. (PHS) 91-1506, (Vital & Health Statistics; series 10; no. 178), 1991. 

203. Centers for Disease Control and Health Resources and Services Administration. 
Health Department Profiles. Maternal Infant and Child Health Programs Data 
Analysis and Tracking Approaches Conference. Atlanta, Ga.:Public Health 
Service, January 1992. 

204. McDowell A, Engel A, Massey JT, Maurer K. Plan and operation of the Second 
National Health and Nutrition Examination Survey, 1976-1980. Hyattsville, 
Maryland: National Center for Health Statistics. DHHS publication no. 
(PHS) 81-1317 (Vital & Health Statistics; series 1; no. 15), 1981. 

205. Miller H. Plan and operation of the health and nutrition examination survey: 
United States--1971-1973. Hyattsville, Maryland: National Center for Health 
Statistics. DHEW publication no. (HRA) 76-1310 (Vital & Health Statistics; 
series 1; no. 10a), 1973. 

206. Maurer KR. Plan and operation of the Hispanic health and nutrition examination 
survey 1982-84. Hyattsville, Maryland: National Center for Health Statistics. 
DHHS publication no. (PHS) 85-1321 (Vital & Health Statistics; series 1; 
no. 19), 1985. 

207. Finucane FF, Freid VM, Madans JH et al. Plan and operation of the NHANES I 
epidemiologic followup study, 1986. Hyattsville, Maryland: National Center for 
Health Statistics. DHHS publication no. (PHS) 90-1307 (Vital & Health 
Statistics; series 1; no. 25), 1990. 

208. Annest JL, Pirkle JL, Makuc D, Neese JW, Bayse DD, Kovar MG. Chronological 
trend in blood lead levels between 1976 and 1980. N Engl J Med 
1983;308(23):1373-7. 

209. Lunde AS, Lundeborg S, Lettenstrom GS, Thygesen L, Huebner J. The person-number 
systems of Sweden, Norway, Denmark, and Israel. Hyattsville, Md.: National 
Center for Health Statistics. DHHS publication no. (PHS) 80-1358 (Vital and 
Health Statistics; series 2, no. 84), 1980. 

210. Naessen T, Parker R, Persson I, Zack M, Adami H-O. Time trends in incidence 
rates of first hip fracture in the Uppsala health care region, Sweden, 1965- 
1983. Am J Epidemiol 1989;130:289-99. 

211. Strom BL, Carson JL. Use of automated data bases for pharmacoepidemiology 
research. Epidemiol Rev 1990;12:87-107. 

212. West R. Saskatchewan health data bases: a developing resource. Am J Prev Med 
1988:4 Supplement. 


213. Guess HA, West R, Strand LM et al. Fatal upper gastrointestinal hemorrhage or 
perforation among users and nonusers of nonsteroidal anti-inflammatory drugs in 
Saskatchewan, Canada, 1983. J Clin Epidemiol 1988;41:35-45. 

214. West R, Sherman GJ, Downey W. A record linkage study of valproate and 
malformations in Saskatchewan. Can J Public Health 1986;76:226-8. 

215. Kurland LT, Schoenberg BS, Annegers JF et al. The incidence of primary 
intracranial neoplasms in Rochester, Minnesota, 1950-1977. Ann N Y Acad Sci 

216. Wilson MG, Michet CJ, Ilstrup DM, Melton LJ. Idiopathic symptomatic 
osteoarthritis of the hip and knee; a population-based incidence study. Mayo 
Clin Proc 1990;65:1214-21. 

217. Paterson JG. Surveillance systems from hospital data. In: Eylenbosch WJ, Noah 
ND (eds.). Surveillance in health and disease. Oxford, England: Oxford 
University Press, 1988:49-61. 

218. Roger FH. The minimum basic data set for hospital statistics in the EEC. 
Luxembourg: Office for Official Publications of the European Communities, 1981. 

219. Agency for Health Care Policy and Research. Report to Congress: the feasibility 
of linking research-related data bases to federal and non-federal medical 
administrative data bases. AHCPR No. 91-0003. April 1991. 

220. Jick H. The Commission on Professional and Hospital Activities--Professional 
Activity Study. A national resource for the study of rare illnesses. Am J 
Epidemiol 1979;109:625-7. 

221. Public Health Service-Health Care Financing Administration. The international 
classification of diseases clinical modification, 9th revision. DHHS 
Publication no (PHS) 80-1260, Washington, D.C.: U.S. Government Printing Office, 


222. Martin ML, Edmonds LD. Use of birth defects monitoring programs for assessing 
the effects of maternal substance abuse on pregnancy outcomes. In: 
Methodological issues in controlled studies on effects of prenatal exposure to 
drug abuse. National Institute on Drug Abuse Research Monograph Series. 
Washington D.C. :U.S. Government Printing Office, 1991:66-38. 

223. Centers for Disease Control. Temporal trends in the prevalence of congenital 
malformations at birth based on the Birth Defects Monitoring Program, United 
States, 1979-1987. MMWR 1990;39(SS-4):19-23. 

224. Stroup NE, Edmonds L, O'Brien TR. Renal agenesis and dysgenesis: are they 
increasing? Teratology 1990;42:383-95. 

225. Chavez GF, Mulinare J, Edmonds LD. Epidemiology of Rh hemolytic disease of the 
newborn in the United States. JAMA 1991;265:3270-4. 

226. Report on the need to collect external cause-of-injury codes in hospital 
discharge data systems. In: National Committee on Vital and Health Statistics, 
1991. DHHS Publication no. (PHS) 92-1205. Hyattsville, Md.: National Center 
for Health Statistics, 1992. 

227. Bal DG, Kizer KW, Felten PG, Mozar HN, Niemeyer D. Reducing tobacco consumption 
in California: development of a statewide anti-tobacco use campaign. JAMA 

228. Helbing C, Schieber G. Use of Medicare data in international comparisons. 
Health Policy 1990;15:45-66. 

229. Fisher ES, Baron JA, Malenka DJ, Barret J, Bubolz TA. Overcoming potential 
pitfalls in the use of Medicare data for epidemiologic research. Am J Public 
Health 1990;80:1487-90. 


230. Eggers PW, Connerton R, McMullan M. The Medicare experience with end-stage 
renal disease: trends in incidence, prevalence, and survival. Health Care 
Financing Review 1984;5:69-88. 

231. Health Care Financing Administration. Medicare/Medicaid decision support 
systems: Office of Statistics and Data Management and Strategy. Baltimore, 
Md.:Health Care Financing Administration Publication (HCFA) 03-272, 1988. 

232. Chassin MR et al. Variations in the use of medical and surgical services by the 
Medicare population. N Engl J Med 1986;314:285-90. 

233. Centers for Disease Control. End-stage renal disease associated with diabetes-- 
United States, 1988. MMWR 1989;38:546-8. 

234. Kellie SE, Brody JA. Sex-specific and race-specific hip fracture rates. Am J 
Public Health 1990;80:326-8. 

235. Wennberg KE, Freeman JL, Shelton RM, Bubolz TA. Hospital use and mortality 
among Medicare beneficiaries in Boston and New Haven. N Engl J Med 

236. Jacobson SJ, Goldberg J, Miles TP, Brody JA, Stiers W, Rimm AA. Regional 
variation in the incidence of hip fracture: U.S. white women aged 65 years and 
older. JAMA 1990;264:500-2. 

237. Stroup NE, Freni-Titulaer LWJ, Schwartz JJ. Unexpected geographic variation in 
rates of hospitalization for patients who have fracture of the hip. J Bone 
Joint Surg 1990;72:1294-8. 

238. Whittle J, Steinberg EP, Anderson GF, Herbert MS. Accuracy of Medicare claims 
data for estimation of cancer incidence and resection rates among elderly 
Americans. Med Care 1991;29:1226-36. 

239. National Institute of Diabetes, Digestive, and Kidney Diseases. U.S. Renal Data 
System. USRDS 1991 annual data report. Bethesda, Md.: National Institutes of 
Health, 1991. 

240. France G, Barrow M. Home accident surveillance system. In: Eylenbosch WJ, and 
Noah ND (eds.). Surveillance in health and disease. Oxford, England: Oxford 
University Press, 1988:202-7. 

241. Centers for Disease Control. Leading work-related injuries--United States. 
MMWR 1984;33:213-5. 

242. Centers for Disease Control. Bicycle-related injuries: data from the National 
Electronic Injury Surveillance System. MMWR 1987;36:269-71. 

243. Centers for Disease Control. Poisoning among children--United States. MMWR 

244. Hopkins RS. Consumer product-related injuries in Athens, Ohio, 1980-85: 
assessment of emergency room-based surveillance. Am J Prev Med 1989;5:104-12. 

245. Gallagher SS, Finison K, Guyer B, Goodenough S. Incidence of injuries among 
87,000 Massachusetts children and adolescents: results of the 1980-81 statewide 
childhood injury prevention program surveillance system. Am J Public Health 

246. Pollack DA, Holmgreen P, Lui K-J, Kirk ML. Discrepancies in the reported 
frequency of cocaine-related deaths, United States, 1983 through 1988. JAMA 

247. King WD. Pediatric injury surveillance: use of a hospital discharge data base. 
South Med J 1991;84:342-8. 

248. Colliver JD, Kopstein AN. Trends in cocaine abuse reflected in emergency room 
episodes reported to DAWN. Public Health Rep 1991;106:59-68. 


249. Volans GN, Wiseman HM. Surveillance of poisoning--the role of poison control 
centers. In: Eylenbosch WJ and Noah ND (eds.). Surveillance in health and 
disease. Oxford, England: Oxford University Press, 1988:255-72. 

250. Blanc PD, Rempel D, Maizlish N, Hiatt P, Olson KR. Occupational illness: case 
detection by poison control surveillance. Ann Intern Med 1989;111:238-44. 

251. Adams MM, Shulman HB, Bruce C, Hogue C, Brogan D, the PRAMS Working Group. The 
pregnancy risk assessment monitoring system: design, questionnaire, data 
collection and response rates. Paediatric and Perinatal Epidemiology 

252. Griffin MR, Ray WA, Livengood JR et al. Risk of sudden infant death syndrome 
after immunization with the diphtheria-tetanus-pertussis vaccine. N Engl J Med 

253. Yip R, Fleshood L, Spillman TC, Binkin NJ, Wong FL, Trowbridge FW. Using linked 
program and birth records to evaluate coverage and targeting in Tennessee's WIC 
program. Public Health Rep 1991;106:176-80. 

254. Centers for Disease Control. Anemia during pregnancy in low-income women. MMWR 


255. Yip R, Binkin NJ, Fleshood L, Trowbridge FL. Declining prevalence of anemia 
among low income children in the United States. JAMA 1987;258:1619-23. 

256. Gayle HD, Dibley MJ, Marks JS, Trowbridge FL. Malnutrition in the first two 
years of life. Am J Dis Child 1987;141:531-4. 


1. U.S. Department of Health and Human Services, Public Health Service. Healthy People 
2000. National Health Promotion and Disease Prevention Objectives. 1991. DHHS Pub. No. 
(PHS) 91-50212. 

2. Centers for Disease Control. Consensus set of health status indicators for the general 
assessment of community health status - United States. MMWR 1991;40:449-451. 

3. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. Mandatory reporting of 
infectious diseases by clinicians. JAMA 1989;262:3018-3026. 

4. American Cancer Society. Cancer Facts and Figures - 1991. American Cancer Society. 

5. Teutsch SM, Herman WH, Dwyer DM, Lane JM. Mortality among diabetic patients using 
continuous subcutaneous insulin infusion pumps. N Engl J Med 1984;310:361-368. 

6. Ellwood PM. Outcomes management. A technology of patient experience. N Engl J Med 


Chapter IV 

Management of the Surveillance System 
and Quality Control of Data 

Kevin M. Sullivan 

Norma P. Gibbs 
Carol M. Knowles 

"It is possible to fail in many ways... while to succeed is possible only in one way 
(for which reason also one is easy and the other difficult--to miss the mark easy, to 
hit it difficult)." 



This chapter describes the practical management and quality control of a 
reporting system for notifiable diseases and injuries at the stage at which reports 
are gathered, as in a city/county health department, a state health department, or 
the federal government. It is important to note that in most health jurisdictions 
there are laws that specify which diseases and injuries are reportable, who is 
responsible for reporting, and what method and timing of reporting are to be used 
(e.g., by telephone within 24 hours of diagnosis or by mail within 1 week of 
diagnosis) (1). Because these reporting laws differ by geographic locale and 
municipal unit, the material in this chapter is restricted to a general overview of a 
disease-surveillance system, recognizing that some aspects may not be applicable to 
all areas and that issues specific to individual jurisdictions are not covered 
completely. The term "state" is used in this discussion; although "state" is a 
geographic designation in the United States, analogous geographic units have similar 
functions in other countries. 

Types of Reports and Surveillance Systems 

There are three categories of notifiable disease reports: a) those in which 
information is collected on each individual with the disease or injury; b) conditions 
for which only the total number of patients seen is reported; and c) conditions for 
which the total number of cases is reported if, and only if, there is judged to be an 
epidemic. Each category generally requires specific forms. Once a report has been 
received, for many conditions a nurse or other disease investigator may request that 
the reporting unit provide information for additional disease/injury-investigation 

A traditional way of classifying a surveillance system is as passive or active (2). A 
passive surveillance system is one in which the health jurisdiction 
receives disease/injury reports from physicians or other individuals or institutions 
as mandated by state law. In contrast, an active surveillance system is established 
when the health department regularly contacts reporting sources (e.g., once per week) 
to elicit reports, including negative reports (no cases). An active surveillance 
system is likely to provide more complete reporting but is much more labor intensive 
and is therefore more costly to operate than a passive system. 
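
The bookkeeping behind active surveillance can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `sources_to_contact`, the record layout, and the source names are hypothetical and do not come from any particular health department's system. A report with zero cases is treated as a negative report, so only sources that sent nothing at all remain to be contacted.

```python
def sources_to_contact(all_sources, reports_this_week):
    """Return sources with neither a case report nor a negative report this week."""
    heard_from = {r["source"] for r in reports_this_week}
    return sorted(set(all_sources) - heard_from)

# Made-up reporting sources and reports for one week
all_sources = ["Clinic A", "Hospital B", "Lab C"]
reports_this_week = [
    {"source": "Clinic A", "cases": 2},
    {"source": "Lab C", "cases": 0},  # negative report: no cases seen
]

print(sources_to_contact(all_sources, reports_this_week))
```

In this made-up week only Hospital B has not been heard from, so it is the one source the health department would still need to contact.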

In most surveillance systems, any health worker who has knowledge of an individual 
with a reportable condition may be required to report that case to the health 
department. In a sentinel surveillance system, only selected physicians or 
institutions report disease or injury. Proponents of sentinel systems maintain that 
it is preferable to receive disease/injury reports of high quality from a few sources 
than to receive data of unknown quality from (in theory) all potential reporting 
sources in a population. This, of course, presupposes that the reporters in a 
sentinel system will, in fact, provide high-quality information on a reliable basis. 
It should also be noted that sentinel systems are inadequate when every case of a 
particular condition needs to be identified. 

Most states have comprehensive, passive disease surveillance systems. For example, 
"as required by law in all 50 U.S. states," any health worker having knowledge of a 
person with a reportable condition is obligated to report that case to the local/state 
health department (1). Regular contact initiated by the health department and 
directed to all possible reporting sources is not feasible or required. 

Collection of Data 

Laws for reporting disease and injury at the state and local levels not only specify 
who is responsible for reporting, but to whom the reports are to be directed. In the 
least complicated reporting situation, a physician diagnoses a reportable condition 
and sends the appropriate report form to the local health department, where the data 
on that case are added to the appropriate disease/injury-surveillance system. 
Summaries of reports are reviewed regularly and analyzed by staff at the local health 
department to identify any conditions that are being reported more frequently than 
expected on the basis of past experience. After disease/injury reports have been 
processed at the local level, the information is forwarded to the state health 
department to be consolidated with reports from other local health departments, and 
the composite data are examined for trends. Each state health department then 
voluntarily reports these cases to the Centers for Disease Control (CDC) on a weekly 
basis (3). 
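
The local review for conditions "reported more frequently than expected on the basis of past experience" can be sketched as follows. This is a minimal illustration, not the method of any particular health department: the threshold (mean of past counts plus two standard deviations), the function name `flag_excess`, and the counts are all assumptions made for the example. Chapter V discusses analytic methods.

```python
from statistics import mean, stdev

def flag_excess(history, current, z=2.0):
    """Return conditions whose current count exceeds mean + z * SD of past counts.

    history: dict mapping condition -> counts for comparable past periods
    current: dict mapping condition -> count for the period under review
    z: hypothetical threshold multiplier (an assumption, not a standard)
    """
    flagged = {}
    for condition, past in history.items():
        if len(past) < 2:
            continue  # too little past experience to estimate variability
        threshold = mean(past) + z * stdev(past)
        count = current.get(condition, 0)
        if count > threshold:
            flagged[condition] = (count, round(threshold, 1))
    return flagged

# Made-up counts for the same reporting week in five previous years
history = {
    "hepatitis A": [4, 6, 5, 3, 7],
    "salmonellosis": [12, 15, 11, 14, 13],
}
current = {"hepatitis A": 14, "salmonellosis": 13}

print(flag_excess(history, current))
```

With these made-up numbers only hepatitis A is flagged; the salmonellosis count falls within the range seen in earlier years.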

This reporting scheme can be reasonably effective, but problems can arise. For 
example, how does one notify health-care professionals about the requirements and 
procedures for reporting to the health department? Who is responsible for such 
notification? How are new practitioners in the jurisdiction identified and notified 
of their responsibility to report? Who provides quality assurance for the process? 
How? At what frequency? Other issues include reporting of suspected cases while 
laboratory results are pending, the desired routing of reports, the mechanism for 
updating/completing reports as additional information is received, reporting of 
disease/injury among transients (e.g., military personnel or migrant workers), and 
defining appropriate time frames for reporting a case of a specific disease/injury 
(Table IV.1). 

There may not be one correct answer to each of the questions formulated in Table IV.1 
that applies in all situations; the answers are often situation dependent. However, a 
disease- or injury-surveillance system should document how to respond to each of the 
above questions so that disease reporting is performed in a consistent manner for each 

Entry of Data into the Surveillance System 

With the availability of microcomputers, many health departments enter disease/injury 
reports into computerized data bases. It is essential that one person be responsible 
for management of the surveillance data base (i.e., be designated and act as the 
data-base manager [DBM]) (4). A primary responsibility of the DBM is maintaining the 
integrity and completeness of the data base. Concerns of the DBM are summarized in 
Table IV.2. 

Checklist for Data-Base Manager 

With any surveillance system for disease/injury, there is a need to establish 
procedures for maintenance and retention of paper disease-report forms (called "source 
documents"). In general, the individual disease reports are filed by year of report 
(or onset), by disease, and in alphabetical order by the patient's last name. If not 
already specified by disease-reporting laws, retention periods should be designated 
for maintaining these files for reference purposes. Electronic reporting may obviate 
the need for redundant paper records. (See Chapter XI for more information on 
computerized surveillance systems.) 
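
Where an index of source documents is kept electronically, the filing order described above (year of report, then disease, then patient's last name) corresponds to a three-part sort key. The sketch below is illustrative only; the tuple layout, the function name `filing_order`, and the entries are made up for the example.

```python
# Made-up index entries: (year of report, disease, patient's last name)
reports = [
    (1991, "measles", "Young"),
    (1990, "pertussis", "Adams"),
    (1991, "measles", "Baker"),
    (1991, "hepatitis A", "Cole"),
]

def filing_order(entries):
    """Sort by year, then disease, then last name (ignoring letter case)."""
    return sorted(entries, key=lambda e: (e[0], e[1].lower(), e[2].lower()))

for entry in filing_order(reports):
    print(entry)
```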

Documentation and Training 

Documentation is a critical step in the development of a computerized system--but one 
that is often neglected. A users' manual is needed and should provide both general 
and detailed descriptions of the system, including the following topics (4): 

• General description of the entire system 

• Detailed procedures for installing the system 

• Detailed procedures for operating the system 

• Detailed procedures for maintaining the system 

The DBM should maintain contact with the programmer for the system so that 
modifications to record formats and programs can be documented by the manager; the 
programmer should also maintain a file of all such changes. Thorough, clear 
documentation facilitates the addition of new programs and modifications in equipment 
or operations (4). 


A formal training program should be established for persons involved in the daily 
operation of the surveillance system. These staff members must feel that they can 
participate in shaping the system, and their ideas and comments should be elicited as 
part of the training process (4). The DBM should schedule a series of training 
classes that include hands-on experience with the data-base software. Written 
operational procedures, including guidelines for interpreting information contained in 
the disease/injury report forms, should be distributed and explained at this time. 
Software tutorial packages and videotapes (interactive or presentational) can also be 
useful tools for training. 

Management of the organization responsible for the surveillance system should also be 
oriented to the system in one or several briefing sessions. 

Analysis and Standard Reports 

An effective surveillance system must be designed to cover all the following areas in 
its reporting process: 

• Determining whether a condition is being reported more frequently than 
expected (see Chapter V) 

• Responding appropriately to reports of individual cases 

• Detecting clusters of cases 

• Notifying public health practitioners of the presence of specific 
conditions in their areas 

• Reinforcing the importance of reporting through facilitating effective 
control/prevention activities 

The completeness and timeliness of case reports in the surveillance system should be 
assessed regularly. This assessment should include both the proportion of reports 
in which each variable (such as age of patient or date of onset of the condition) is 
completed and the time between onset of the condition and receipt of the report. At 
the local health department, this information can be analyzed by reporting source 
(e.g., clinicians or hospital or diagnostic laboratory staff) or, at the state level, 
by health jurisdiction. These analyses should identify groups or institutions in need 
of additional information or training on disease reporting. 
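
An assessment of this kind can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the record layout, the field names (`source`, `age`, `onset`, `received`), and the data are all made up, and the median is used as the summary of reporting delay only as one reasonable choice.

```python
from datetime import date
from statistics import median

# Made-up case reports; None marks a field left blank on the report form.
reports = [
    {"source": "Clinic A", "age": 34, "onset": date(1992, 3, 2), "received": date(1992, 3, 6)},
    {"source": "Clinic A", "age": None, "onset": date(1992, 3, 10), "received": date(1992, 3, 24)},
    {"source": "Lab B", "age": 5, "onset": None, "received": date(1992, 3, 12)},
    {"source": "Lab B", "age": 61, "onset": date(1992, 3, 1), "received": date(1992, 3, 3)},
]

def completeness(records, variables):
    """Proportion of records in which each variable is filled in."""
    total = len(records)
    return {v: sum(r[v] is not None for r in records) / total for v in variables}

def median_delay_days(records):
    """Median days from onset to receipt, using records that have both dates."""
    delays = [(r["received"] - r["onset"]).days
              for r in records if r["onset"] and r["received"]]
    return median(delays) if delays else None

def by_source(records):
    """Group records by reporting source and summarize each group."""
    groups = {}
    for r in records:
        groups.setdefault(r["source"], []).append(r)
    return {
        source: {
            "completeness": completeness(group, ["age", "onset"]),
            "median_delay_days": median_delay_days(group),
        }
        for source, group in groups.items()
    }

summary = by_source(reports)
for source, stats in summary.items():
    print(source, stats)
```

At the state level the same grouping could be done by health jurisdiction rather than by reporting source; a source with low completeness or long delays is a candidate for additional information or training.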

Most surveillance systems for infectious disease rely primarily on receipt of case 
reports from physicians and other health-care providers. To encourage reporting by 
these health professionals, many local health departments and most state health 
departments publish newsletters containing data and other information of interest to 
the contributors to the data base (1). Such newsletters may include standard tabular 
reports of the occurrence of a reportable condition by week or month, with a 
year-to-date summary. They may also include narrative reports about conditions of 
interest or about other topics relevant to public health. Such feedback is important 
to demonstrate to those involved with the system that the data are being used, as well 
as to accomplish communications goals (see Chapter VII). 
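
A standard tabular report of this type, with counts for the current week and a year-to-date total, might be produced along the following lines. The sketch is hypothetical: the record layout (condition, week number), the function name `weekly_table`, and the counts are assumptions made for illustration.

```python
from collections import Counter

# Made-up reports: (condition, week number of report within the year)
reports = [
    ("measles", 1), ("measles", 3), ("measles", 3),
    ("pertussis", 2), ("pertussis", 3),
]

def weekly_table(records, current_week):
    """Return rows of (condition, count this week, cumulative year-to-date count)."""
    this_week = Counter(c for c, w in records if w == current_week)
    to_date = Counter(c for c, w in records if w <= current_week)
    return [(c, this_week[c], to_date[c]) for c in sorted(to_date)]

rows = weekly_table(reports, current_week=3)
print(f"{'Condition':<12}{'Week 3':>8}{'YTD':>6}")
for condition, week_count, ytd in rows:
    print(f"{condition:<12}{week_count:>8}{ytd:>6}")
```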

The information needs of management and operations personnel should be considered as 
programs are developed for standard reports from the data base. Standard reports 
should include information on time, place, and person, and should be produced in a 
form that can be easily interpreted by epidemiologists and management. The purpose of 
each report should dictate the appearance of the output, e.g., a table, map, or graph. 
Most types of reports should be produced on a regular basis and according to a set 
schedule, but others may be created only on an as-needed basis. 

Data Sharing 

In some situations, disease and injury reports may be shared by various local or state
health departments, particularly for conditions that require additional investigation
or follow-up. For example, when a resident of one county/state is examined and given
a particular diagnosis at a hospital in a neighboring county/state, health authorities
need to be able to track the condition back to its source in order to respond
appropriately.

Occasionally, disease and injury reports are sent directly to the state health
department, bypassing the local health department. If that happens, the state needs
to notify the appropriate local health department so that the reports can be added to 
the disease/injury reporting system at the local level. Additional data that the 
state may collect should also be shared with the local health department. 

The DBM should be aware of other sources of information that may need to be accessed 
and compared with or added to the data collected in his or her own system — e.g., 
laboratory results, epidemiologic information for specific conditions, population 
estimates, and mortality records. Through careful planning and coordination on the 
part of managers of reporting systems, standard coding schemes can be adopted as data 
systems evolve. These actions facilitate the sharing and use of data. 

System Maintenance and Security 

Maintenance of a system should be directed first toward reducing errors introduced 
through flaws in design and through content changes (e.g., changes in the list of 
notifiable conditions) and second toward improving the system's scope and services. 
Related activities can be categorized as routine maintenance, emergency maintenance, 
requests for special reports, and system improvements. Maintenance should not be 
performed on an informal or first-come, first-served basis. An effective maintenance 
program includes the following steps (4): 

• Back up data and system files according to an established schedule, and 
maintain records in a secure environment. 

• Require that requests for emergency maintenance be made in writing and 
entered into a log. 

• Assign priorities to special requests on the basis of urgency of need and 
time and resources required. 

• Institutionalize routine maintenance, such as procedures associated with 
changing to a new reporting year. 

• Document maintenance as it is conducted. 


In order to maintain the integrity of a computer system, only one person should have 
the authority to access the system and assign and change passwords. The DBM should be 
the only staff member with authority to install or modify production software. This 
same rule should apply to access to the physical computer files. Authority to add or 
delete files from subdirectories or environments of computers should be delegated to 
only one individual who is then held accountable for all modifications. A second 
computer should be available for testing changes to the system so that the computer 
used for the surveillance system can be reserved for production only. The second 
computer could also serve as a back-up computer should the primary machine fail. 

The numerous risks to the security of a data base include mechanical failure, human 
carelessness, malicious damage, crime, and invasion of privacy. Therefore, back-up 
copies of the data base should be kept off-site to ensure that the system cannot be 
deliberately or unintentionally destroyed. Updating of the off-site copies should be 
done on a routine basis, and new diskettes should be used to make back-up copies at 
least once each year. 

A monthly, total system back-up is recommended, if a valid copy of the current system 
is available. Data files that are changed during the day should be backed up at the 
end of the day. 
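
One way an end-of-day back-up of changed files might be automated is sketched below in Python. The function name, the directory arguments, and the use of file-modification times to decide what has changed are assumptions made for illustration, not a procedure prescribed here.

```python
import shutil
from pathlib import Path


def back_up_changed_files(data_dir, backup_dir, since_epoch):
    """Copy files in data_dir modified at or after since_epoch into backup_dir.

    since_epoch is a time in seconds since the epoch (for example, the start
    of the working day). Returns the names of the files that were copied.
    """
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(data_dir.iterdir()):
        if path.is_file() and path.stat().st_mtime >= since_epoch:
            shutil.copy2(path, backup_dir / path.name)  # copy2 also keeps timestamps
            copied.append(path.name)
    return copied
```

A site using something like this would still need a separate schedule for the total system back-up and for moving copies off-site.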

Computer viruses have become a threat to data-base and computer-system security. 
These programs can be highly sophisticated and are capable of attaching themselves to 
software or data being loaded on the computer or data being sent from one computer to 
another. Software is available to scan entire systems or diskettes for virus 
infections; such software should be updated periodically because of the addition of 
new viruses. Data received via telecommunications channels or on diskettes from other 
sources should always be scanned before data files and programs are copied to the 
computer's disk. Software retrieved from electronic bulletin boards should be 
carefully examined before being incorporated into a system. 

In the event of extended mechanical failure, a contingency plan should be in place for 
shifting the base of operations to another computer. 

Surveillance data on disease/injury are generally received by a local health
department, forwarded through a regional health center, and eventually directed to the
state health department. The complete reporting form, which includes confidential 
information on patients, is usually shared by local and state health departments for 
purposes of follow-up (if necessary) and for identifying and deleting any redundant 
(duplicate) reports. 
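
Identifying redundant reports can be illustrated with a short Python sketch. The fields used to decide that two reports describe the same case, and the field names themselves, are hypothetical; an actual system would follow the matching rules of its own health department.

```python
def remove_duplicate_reports(
    reports, key_fields=("name", "birth_date", "condition", "onset_date")
):
    """Keep the first report seen for each combination of the key fields."""
    seen = set()
    unique = []
    for report in reports:
        # Normalize case and surrounding spaces so trivial differences
        # in data entry do not hide a match.
        key = tuple(str(report[field]).strip().lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(report)
    return unique


# Hypothetical reports; the second repeats the first with different capitalization.
sample_reports = [
    {"name": "Doe, Jane", "birth_date": "1960-02-01",
     "condition": "measles", "onset_date": "1992-03-04"},
    {"name": "DOE, JANE ", "birth_date": "1960-02-01",
     "condition": "Measles", "onset_date": "1992-03-04"},
    {"name": "Roe, Richard", "birth_date": "1955-07-19",
     "condition": "measles", "onset_date": "1992-03-06"},
]
unique_reports = remove_duplicate_reports(sample_reports)
```

Exact matching of this kind misses duplicates that differ by a misspelling or a transposed date, so suspected duplicates would normally still be reviewed by staff.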

Persons who report disease/injury should be familiar with the types of activities that 
may follow the receipt of a report. For example, for purposes of prevention or 
treatment, all cases of syphilis may be investigated to determine the source of the 
infection and potential spread of the infection to others. Disease-reporting laws
may specify who has access to the confidential portions of a disease/injury report, 
and it is important to assure that the confidentiality of the report is maintained. 
Failure to keep the reports confidential is likely to lead to an unwillingness to 
report on the part of physicians and other health-care providers. Reports and files 
that do not require personal identifiers should not contain them. In the United 
States, notifiable-disease reports received from states by CDC do not include personal 
identifiers (such as name, address, and telephone number).
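
Removing personal identifiers before a report is forwarded might look like the following Python sketch. The field names are hypothetical and stand in for whatever identifiers appear on a given reporting form.

```python
# Hypothetical field names for the personal identifiers on a report form.
PERSONAL_IDENTIFIERS = frozenset({"name", "address", "telephone"})


def strip_identifiers(report, identifiers=PERSONAL_IDENTIFIERS):
    """Return a copy of the report without the personal-identifier fields."""
    return {
        field: value for field, value in report.items() if field not in identifiers
    }


full_report = {
    "name": "Doe, Jane",
    "address": "1 Example Street",
    "telephone": "555-0100",
    "condition": "measles",
    "county": "Example",
    "onset_date": "1992-03-04",
}
forwarded = strip_identifiers(full_report)
```

The function returns a new record and leaves the original unchanged, so the complete form can remain with the health departments that need it for follow-up.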

Modification of Reporting Systems 

The basic steps shown below are intended to ensure that a computer-based surveillance 
system will meet current and future needs. A systems analyst, an epidemiologist, and 
the final users of information from the system should work together to produce a 
system that is user-friendly and functional (5).

1. Review current methods of processing disease/injury information. Obtain copies 
of paper forms or computer-screen forms or reports. Determine whether suggested
report forms or screens are available from state or national agencies. Often, 
ready-to-use surveillance software is available. Use of such systems 
facilitates standardization, quality control, and comparability of data. 

2. Review with management and users any problems with the current method for 
processing data and any desired future enhancements. 

3. Document the current system and proposed future system. Allow concerned parties
to review and comment on their understanding of objectives for the system.

4. Limit access to the confidential portion of a disease/injury report as much as 
possible. Store the original report forms containing confidential data in 
locked cabinets or a locked room. Secure electronic data bases by limiting 
access to the computer, and obtain additional security through the required use 
of passwords (pre-approved for access to the protected portion of the data 
base).

5. Document developmental specifications to meet the objectives above. In 
addition, document proposed testing schedules and methodology for implementing 
the system when it is completed. 

6. Develop prototypic screens and reports for management and end users to review, 
so that misunderstandings and problems can be identified and resolved during 
development.

7. Once all parties are in agreement, establish self-contained modules of 
development that can be completed, and proceed to the testing stage while other 
modules are being developed. 

8. Begin development in a test environment separate from any current computer-based 
production system. Document any changes to developmental specifications that 
become necessary during actual development. 

9. Produce processing manuals for users (to include not only the operation of the 
computerized system but also proper handling of paper forms, storage of 
electronic and paper data, and distribution of final reports). This
documentation should be as thoroughly tested as the actual computer system. 

10. Establish training sessions or develop tutorial manuals for users. If such 
manuals are to be effective, a development/test system for users must be in 
place during their training stage. 

11. Finalize specification documents to include all current stages of the system, as
well as all expected future enhancements. This documentation should include a
schedule and methodology for maintaining and troubleshooting the system.

12. Establish and document proper back-up and data-recovery techniques. This step
includes selecting a data-base manager. 


A surveillance system of high quality and integrity can only be developed through 
careful planning, documentation, implementation, training, and long-term support. 
Because of the changing nature of disease/injury reporting (e.g., new conditions being 
added or case definitions being modified), useful surveillance systems must be
flexible enough to allow for such changes with a minimal amount of disruption. 

Also important is the coordination of disease- and injury-reporting activities among
local health departments, from local health departments to their appropriate state 
health departments, and among state health departments. The Council of State and 
Territorial Epidemiologists has played an important role in the state-to-state 
coordination of disease and injury reporting, as well as in reporting practices from 
states to CDC. 

While there are many complicated aspects of disease/injury-surveillance systems, it is 
important to remember that the overall purposes of such systems are to provide 
information on preventing disease and injury and to improve the quality of the public 


1. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. Mandatory reporting of
infectious diseases by clinicians. MMWR 1990;39(RR-9):1-17.

2. Mausner JS, Kramer S. Epidemiology: an introductory text. Philadelphia, Pa.:
W.B. Saunders Co., 1985.

3. Wharton M, Chorba TL, Vogt RL, Morse DL, Buehler JW. Case definitions for
public health surveillance. MMWR 1990;39(RR-13):1-43.

4. Murdick RG. MIS concepts and design. Englewood Cliffs, N.J.: Prentice-Hall,

5. Klaucke DN, Buehler JW, Thacker SB, Parrish RG, Trowbridge FL, Berkelman RL, et
al. Guidelines for evaluating surveillance systems. MMWR 1988;37(S-5):1-18.



Chapter V 

Analyzing and Interpreting Surveillance 


Willard Cates, Jr. 
G. David Williamson

"Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost
in information?"

T.S. Eliot
["Where is the information we have lost in data?"]




Historically, the core processes of public health surveillance have involved using
appropriate methods to aggregate the units of data being collected--namely, analysis--
and also creative approaches to assess the emerging data patterns--namely,
interpretation (1).

For these reasons, the ability to analyze and interpret surveillance data determines 
the mettle of the epidemiologist. Viewed as basic to observational studies (2), 
surveillance is at the forefront of the spectrum of descriptive epidemiology. 
Surveillance has a myriad of uses (3,4), each of which requires careful analysis and 
interpretation. Whether surveillance is used to detect epidemics, suggest hypotheses, 
characterize trends in disease or injury, evaluate prevention programs, or project 
future public health needs, data from a surveillance system must be analyzed carefully 
and interpreted prudently. In this chapter we address practical and methodologic 
approaches to surveillance analysis; the presentation of surveillance data by time, 
place, and person; the concept of rates and standardization of rates; approaches to 
exploratory data analysis; the use of graphics and maps; and, finally, the systematic 
interpretation of surveillance data. 


Practical Approach 

The fundamental approach to analyzing surveillance data is relatively straightforward. 
Because of their descriptive nature, surveillance data cannot be used for formal 
hypothesis testing (5). Rather, the regular scrutiny of systematically collected 
information allows epidemiologists to describe patterns of disease and injury in human 
populations, organized by a variety of sub-measures. Moreover, the analysis (and 
subsequent interpretation) proceeds from the specific elements of the data themselves. 
Thus, surveillance analysis represents an inductive reasoning process in which the 
assembly of individual units eventually produces a more general picture of health- 
related problems in a population. 

Frequently, the time-consuming problems of collecting, managing, and storing
surveillance data leave little energy for the analysis itself. Nonetheless, analyzing
surveillance data must be afforded a high priority by those in charge of surveillance 
systems (3). Approaches to analyzing surveillance data include the following steps: 

1. Know the inherent idiosyncrasies of the surveillance data set. It is
tempting to begin immediately to examine trends over time. However, 
intimate knowledge of the day-to-day strengths and weaknesses of the 
data-collection methods and the reporting process can provide a "real 
world" sense of the trends that emerge. 

2. Proceed from the simplest to the most complex. Examine each condition 
separately, both by numbers and crude trends. How many cases were 
reported each year? How many cases were reported in each age group each 
year? What are the variable-specific rates? Only after looking 
separately at each variable should one examine the relationships among 
these variables. 

3. Realize when inaccuracies in the data preclude more sophisticated
analyses. Erratically collected or incomplete data cannot be corrected
by complex analytic techniques. Differential reporting (see
representativeness, Chapter VIII) by different regions or by different
health facilities renders the resulting surveillance data set liable to
misinterpretation.

Methodologic Considerations

Analysis of surveillance information depends on the accuracy of that information 
(Chapter VIII). Attempts to analyze data that are haphazardly collected or have 
varying case definitions waste valuable time and resources. The two key concepts 
which determine the accuracy of surveillance data are reliability and validity (5).
Reliability refers to whether a particular condition is reported consistently by 
different observers, whereas validity refers to whether the condition as reported 
reflects the true condition as it occurs. Ideally, both reliability and validity can 
be achieved, but in practice, reliability (e.g., reproducibility) is easier than
validity to assess. In situations in which biologic measures complement clinical case
definitions, such as laboratory testing for infectious diseases, the accuracy of the
data can be more completely assured. However, in the context of
more subjective behavioral aspects, such as those associated with lifestyles, accuracy 
is more difficult to confirm. 

The application of standard statistical techniques to the analysis of surveillance
data is dictated by the limitations of the data themselves and the flexibility of the
epidemiologist/statistician (5). In a sense, because the essentials of sampling
theory have not been satisfied, no statistical testing is possible with the often
incomplete surveillance data set. However, if the information is viewed as samples
over time, apparent clusters of health events can be evaluated for their statistical
"significance." Applying 95% confidence limits or other standard statistical tests to
these "samples over time" can allow a determination of whether any differences are
unlikely to have occurred by chance alone.
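
As one illustration of treating reports as samples over time, the Python sketch below compares a current count with limits set at the mean plus or minus 1.96 standard deviations of the counts for the same period in earlier years. The counts are hypothetical, and this normal-approximation rule is an assumption of the example, only one of several possible choices.

```python
from statistics import mean, stdev


def limits_95(historical_counts):
    """Approximate 95% limits: mean +/- 1.96 sample standard deviations."""
    center = mean(historical_counts)
    spread = stdev(historical_counts)
    return center - 1.96 * spread, center + 1.96 * spread


# Hypothetical counts for the same reporting period in five earlier years.
historical = [12, 15, 11, 14, 13]
lower, upper = limits_95(historical)

current = 22  # hypothetical count for the current period
exceeds_upper_limit = current > upper
```

With so few historical values the limits are themselves imprecise, so a count that exceeds them is better read as a prompt for closer review than as proof of an epidemic.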

Surveillance analyses are often ecologic, since they describe trends in groups of
individuals. Thus, the use of surveillance data may be especially prone to the
problem of the "ecological fallacy" (6,7). In brief, this type of bias may occur when
health officials interpreting observations about groups (e.g., aggregated surveillance
data) make causal inferences about individual phenomena (8). These population-level
analyses may suffer from two separate problems (7): a) aggregation bias, due to
loss of information when individuals are grouped, and b) specification bias, due to
the definition of the "group" itself (8). The chances of the ecological fallacy can
be reduced by analyzing subsets of surveillance data to reveal trends in the
individual characteristics. However, when describing bodies of surveillance data,
public health officials usually synthesize the population trends, thus opening the
possibility for fallacious interpretation.

Time, Place, and Person 

Surveillance data allow public health officials to describe health problems in terms
of the basic epidemiologic parameters of time, place, and person. In addition,
surveillance data permit comparisons among these different parameters (e.g., what are
the patterns of disease/injury at one time compared with another, in one place
compared with another, or among one population compared with another). Use of
appropriate census data as denominators allows calculation of rates, which then
facilitates comparison of the risks of disease or injury in terms of the parameters of
time, place, and person. Moreover, use of these fundamental variables permits
epidemics to be detected, long-term trends to be monitored, seasonal patterns to be
assessed, and future occurrence of disease/injury to be projected, thus possibly
facilitating a more timely public health response.


Analysis of surveillance data by time can reveal trends in disease/injury. For all
health conditions, a measurable delay occurs between the exposure and the problem. In 
the case of disease, an interval exists between exposure and expression of symptoms, 
as well as an interval between a) onset of symptoms and diagnosis of the problem, and 
b) eventual reporting of the illness to public health authorities so that it can be 
included in the surveillance data set. For an infectious disease, this last interval 
may represent days or weeks, whereas for chronic disease it may be measured in years. 
Thus, choosing the appropriate interval for analysis must involve a consideration of 
the health condition being assessed. 

Analysis of surveillance data by time can be conducted in several different ways to
detect changes in incidence of disease/injury. The easiest analysis is usually a
comparison of the number of case reports received during a particular interval (e.g.,
weeks or months) (see Figure 1.1). Such data can be organized into a table or graph
to assess whether an abrupt increase has occurred, whether the trends are stable, or
whether a gradual rise or fall in the numbers occurs. Another simple method of
analysis compares the number of cases for a current time period (e.g., a given month)
with the number reported during the same interval for the past several years.
Similarly, the cumulative number of cases reported in the period representing the
year-to-date can be compared with the appropriate cumulative number for previous
years.
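
These simple comparisons by time can be sketched in Python as follows. The monthly counts are hypothetical, and the function names are introduced only for this example.

```python
# Hypothetical monthly case counts (January through June) for four years.
monthly_counts = {
    1989: [4, 6, 5, 7, 9, 8],
    1990: [5, 5, 6, 8, 10, 9],
    1991: [3, 6, 4, 7, 8, 10],
    1992: [5, 7, 6, 9, 18, 12],
}


def same_month_history(counts, year, month):
    """Counts for the given month (1 = January) in every year before `year`."""
    return {y: c[month - 1] for y, c in counts.items() if y < year}


def year_to_date(counts, year, month):
    """Cumulative count from January through the given month of `year`."""
    return sum(counts[year][:month])


# Compare May 1992 with May of earlier years, and the year-to-date totals.
may_1992 = monthly_counts[1992][4]
may_history = same_month_history(monthly_counts, 1992, 5)
ytd_through_may = {y: year_to_date(monthly_counts, y, 5) for y in monthly_counts}
```

In these made-up data, the 18 cases for May 1992 stand out against 8 to 10 cases in earlier years, and the year-to-date total is also the highest of the four years.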

Analyzing long-term (secular) trends is facilitated by graphing surveillance data over
time. The watershed events that influence secular trends--such as changes in the case
definition used for surveillance, new diagnostic criteria, changes in reporting
requirements or practices, publicity about a particular condition, or new intervention
programs--can be indicated on the graph. Changes in the surveillance system itself
also influence long-term trends, particularly when the intensity of active case
detection increases (e.g., screening programs in particular communities).

Finally, additional epidemiologic measures enhance the analysis of surveillance data 
by time. Using denominators to calculate rates becomes especially important if 
changes occur in the community, such as the immigration of a new population. As the 
size of a population changes over time, so will the expected number of cases of 
diseases and injuries. In addition, analysis by date of onset rather than date of 
report more clearly defines the condition. Because of delays between diagnosis and 
reporting, using date of onset when practical and possible provides a better 
representation of actual disease incidence. The longer the interval between the 
occurrence of symptoms, the seeking of health care, and the reporting of events, the 
greater the need for a surveillance system based on date of onset. 


Analysis by the place where the condition occurred is the next step (see Figure
1.2). The location from which the condition was reported (such as a hospital) may not
be the place where the exposure actually occurred (in the community). Similarly, for
medical procedures, the place an operation took place may not be the place of 
residence of the patient. For example, the District of Columbia has the highest rate 
of legal abortions in the United States, but more than 50% of this figure reflects 
women who reside outside the District (9). 

Locating the geographic area with the highest rates can facilitate efforts to identify
cause(s) and allow appropriate interventions to be applied. John Snow's removing the
Broad Street pump handle remains the classic example of intervention by location (10).
Even in situations in which the numbers of a particular problem are decreasing, focal 
areas with high levels of the condition may remain, and the identification of these 
areas allows prevention resources to be targeted effectively. Finally, the size of 
the unit for geographic analysis is determined by the type of condition involved. For 
some rare conditions, large areas such as states may be appropriate, whereas for 
events that occur at relatively high frequency or for outbreak situations, areas 
defined by postal codes or other geographic boundaries may be the most desirable size 
of the measure. 


The availability of computers, as well as software for spatial mapping, allows more
sophisticated analysis of surveillance data by place. Public health officials are now
able to use surveillance data to follow the geographic course of a particular
condition, thus assisting in their efforts to plan intervention strategies (see "Maps"
below).


Analyzing surveillance data by the characteristics of persons who have the condition 
provides further specification. The demographic variables most frequently used are 
age, gender, and race/ethnicity. Other variables such as marital status, occupation, 
and levels of income and education may also be helpful, even though most surveillance 
systems do not routinely collect such information. 

Analysis of trends in disease/injury by age depends on the specific health condition 
of interest. For childhood diseases, relatively narrow age categories (e.g., by
single years) can identify the age group associated with the peak incidence of a
particular health condition. Conversely, for conditions that primarily affect older 
populations, broader 10-year age intervals are frequently used. In general, the 
typical age distribution associated with the health condition provides the best guide 
to deciding which age categories to use, with several narrower categories for the ages 
associated with peak incidence and broader categories covering the remainder of the 
age spectrum. 

Surveillance systems have also been used to analyze behavioral characteristics of 
populations. Such systems generally depend on self-reported behavior and may be based 
on repeated surveys of representative groups, trends in markers for specific types of 
behavior (e.g., sales of a particular product), or active surveillance of a particular 
behavioral characteristic or indicator in a defined group (e.g., testing urine for
drugs in school or work settings).

If possible, the characteristics of persons included in any surveillance system should 
be related to denominators. While assessing the number of cases alone can be 
sufficient, variable-specific rates are more helpful in allowing comparisons of the 
risk involved. Thus, even if the number of cases of a particular condition is higher 
in one part of a population, the rate may be lower if that group represents a large
proportion of the population. In this way, comparing the rates within surveillance
data of certain populations is analogous to calculating relative risks within 
observational cohort studies. 
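
A short Python sketch shows how counts and rates can point in different directions. The two groups and all of their numbers are hypothetical.

```python
def rate_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases * 100_000 / population


# Hypothetical groups: group A reports more cases but is four times larger.
groups = {
    "group A": {"cases": 120, "population": 400_000},
    "group B": {"cases": 60, "population": 100_000},
}

rates = {
    name: rate_per_100k(g["cases"], g["population"]) for name, g in groups.items()
}

# Ratio of the two rates, analogous to a relative risk in a cohort study.
rate_ratio = rates["group B"] / rates["group A"]
```

Here group A has twice as many cases as group B, yet its rate is half as high, so the rate ratio of group B to group A is 2.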

Interactions among Time, Place, and Person 

By proceeding from the simple (e.g., crude rates) to the more complex (e.g.,
variable-specific rates), meaningful trends may be revealed. This is because
interactions among the time-place-person parameters of surveillance data can obscure
important patterns of disease/injury in specific populations. For example, in the
United States in the 1980s, the overall number of syphilis cases fell during the first
two-thirds of the decade but rose beginning in 1987 (Figure V.1, Panel A). When
analyzed by gender (Figure V.1, Panel B), the decline in syphilis occurred primarily
among men; cases among women were low for the first 5 years, increased slightly in
1986, and rose more rapidly for the rest of the decade. Finally, when stratified by
both gender and race (Figure V.1, Panel C), the decrease in numbers of cases of
syphilis was seen only among white males--presumably among men who have sex with other
men and who had changed their sexual practices in response to human immunodeficiency
virus (HIV) prevention activities (12). Conversely, the increase in syphilis occurred
among black men and women, with both trends beginning in 1986 and linked to unsafe
sexual behavior associated with use of crack cocaine (13). If more specific analysis
by person had not occurred, the offsetting trends in the mid-1980s of declines among
white males might have delayed recognition by public health officials of the syphilis
epidemic among minorities.
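
The way offsetting trends can hide within a total is easy to demonstrate with a small Python sketch. The strata and counts below are invented for illustration and are not the syphilis data described above.

```python
# Hypothetical case counts by stratum and year (not actual surveillance data).
cases = {
    "stratum 1": {1984: 900, 1985: 800, 1986: 700},
    "stratum 2": {1984: 300, 1985: 390, 1986: 500},
}
years = [1984, 1985, 1986]

# Overall counts, as they would appear in an unstratified analysis.
totals = {year: sum(stratum[year] for stratum in cases.values()) for year in years}

# Change from the first to the last year within each stratum.
changes = {
    name: stratum[years[-1]] - stratum[years[0]] for name, stratum in cases.items()
}
```

The totals are nearly flat (1,200, then 1,190, then 1,200), while one stratum falls by 200 cases and the other rises by 200; only the stratified view shows the increase.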



A rate measures the frequency of an event. It comprises a numerator (i.e., the upper
portion of a fraction denoting the number of occurrences of an event during a
specified time) and a denominator (i.e., the lower portion of a fraction denoting the
size of the population in which the events occur). A crucial aspect of a rate is the
specification of the time period under consideration. An optional component is a
multiplier, a power of 10 that is used to convert awkward fractions to more workable
numbers (14). The general form of a rate is shown below:


$$
\text{rate} = \frac{\text{number of occurrences of event in specified time}}{\text{average or mid-interval population}} \times 10^{n},
$$

where the denominator represents the size of the population during the specified 
period in which the events occur and the exponent n usually ranges from 2 to 6 (i.e.,
the number at risk varies between 100 and 1,000,000). The selection of n depends on 
the incidence or prevalence of the event. 
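
The general form of a rate translates directly into a few lines of Python. The function name and the example figures are hypothetical.

```python
def rate(occurrences, population, n=5):
    """Occurrences per 10**n population for the specified time period.

    `population` is the average or mid-interval population, and `n` is the
    exponent of the multiplier (usually between 2 and 6).
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return occurrences * 10 ** n / population


# Hypothetical example: 250 cases in one year, mid-year population 2,000,000.
cases_per_100k = rate(250, 2_000_000, n=5)
```

With these figures the result is 12.5 cases per 100,000 population per year; choosing n = 2 for the same data would give the much less readable value of 0.0125 per 100.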

Although surveillance often provides numerator data only, the use of raw numbers such 
as cases of a disease or injury has limitations. Raw numbers quantify occurrences of 
an event during a specified time without regard to population size and dynamics, or 
other demographic characteristics such as distribution by race and gender. Rates 
enable one to make more appropriate, informative comparisons of occurrences in a 
population over time, among different sub-populations, or among different populations 
at the same or different times, since the size of the population and the period of 
time specified are accounted for in the calculation of rates. 

A wide variety of "rates" are employed in standard public health practice (Table V.1).
These measures are calculated in numerous ways and may have different connotations.
Special distinction should be made among the terms "rate," "ratio," and "proportion."
A ratio is any quotient obtained by dividing one quantity by another. The numerator 
and denominator are generally distinct quantities, neither of which is a subset of the 
other. No restrictions exist on the value or dimension of a ratio. A proportion is a 
special type of ratio for which the numerator is a subset of the denominator 
population, thus requiring the resulting quotient to be dimensionless, positive, and 
less than one, or less than 100 if expressed as a percentage. Although all rates are 
ratios, in epidemiology a rate may be a proportion (e.g., prevalence rate) or may be 
limited in scope by further restrictions such as representing the number of 
occurrences of a health event in a specified time and population per unit time (e.g., 
hazard or incidence rate). This latter definition is most restrictive and is the
definition generally used for rates in chemistry and physics. 

Use of Rates in Epidemiology 

Calculation and analysis of rates is critical in epidemiologic investigations, not
only for formulating and testing hypotheses about cause(s), but also for identifying
risk factors for disease and injury. Rates also allow valid comparisons within or
among populations for specific times. To determine rates, one must have reliable
numerator and denominator data, the latter being generally more difficult to obtain in
most epidemiologic investigations, particularly if the data to be analyzed (i.e., the
number of occurrences of an event) have been collected from public health surveillance
systems.

Crude, Specific, and Standardized Rates 

Crude and specific rates 

Rates can be calculated either for the entire population or for certain subpopulations 
within the larger group. Rates describing a complete population are termed "crude." 
The computation of crude rates is performed as the initial step in analysis since they 
are important in obtaining information about and contrasting entire populations. 

Within a population, the rate at which a particular health event occurs may not be 
constant throughout the entire population. To examine the differences, the population 
is partitioned into relevant "specific" subpopulations, and a "specific rate" is
calculated for each subset. For example, if one calculates death rates by age group 
(because death rate is not constant for all age categories) , the resulting rates are 
termed "age-specific death rates." 

Variation of rates among population subgroups results from several factors: natural 
history of the health problem, differential distribution of susceptibility or 
cause(s), or genetic differences among subpopulations. For example, mortality rates
are higher among men than women and blacks than whites (15). The distribution of
subgroups within the population may also be so disparate that a summary rate may not 
convey useful information. Therefore, the magnitude of a crude rate depends on the 
magnitude of the rates of the subpopulations as well as on the demographics of the 
entire population (16). These variations in rates across a population would remain 
unknown if only crude rates were calculated. 
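
The dependence of a crude rate on both the specific rates and the make-up of the population can be checked with a small Python sketch: the crude rate equals the average of the specific rates weighted by each subgroup's share of the population. The age groups and numbers are hypothetical.

```python
# Hypothetical deaths and population sizes for two age groups.
strata = {
    "<45": {"deaths": 200, "population": 100_000},
    "45+": {"deaths": 1_500, "population": 50_000},
}


def rate_per_1000(events, population):
    """Events per 1,000 population."""
    return events * 1000 / population


# Age-specific death rates.
specific = {
    age: rate_per_1000(s["deaths"], s["population"]) for age, s in strata.items()
}

# Crude rate computed directly from the totals.
total_deaths = sum(s["deaths"] for s in strata.values())
total_population = sum(s["population"] for s in strata.values())
crude = rate_per_1000(total_deaths, total_population)

# The same crude rate, rebuilt as a population-weighted average of specific rates.
weighted = sum(
    specific[age] * strata[age]["population"] / total_population for age in strata
)
```

The specific rates here are 2 and 30 per 1,000, and the crude rate of about 11.3 per 1,000 would shift toward either value if that age group made up a larger share of the population.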

Standardized rates 

When rates are compared across different populations or for the same population over
time, crude rates are appropriate only if the populations are similar with respect to
factors that are associated with the health event being investigated (17). Such
factors could include age, race, gender, socioeconomic status, or risk factors (e.g.,
number of cigarettes smoked). If the populations are dissimilar, variable-specific
rates should be computed and compared. Alternatively, the rates can be adjusted for 
the effect of a confounding variable in order to obtain an undistorted view of the 
effect that other variables have on risk. This adjustment of rates when comparing 
populations is called standardization and yields "standardized" or "adjusted" rates. 
The two techniques of standardization are direct and indirect. 

Direct standardization 

A directly standardized rate is obtained for a study population by averaging the 
specific rates for the population, using the distribution of a selected standard 
population as the averaging weights. This adjusted rate represents "what the crude 
rate would have been in the study population if that population had the same 
distribution as the standard population with respect to the variable(s) for which the
adjustment or standardization was carried out" (14). The rate is termed "directly 
standardized" because specific rates are used directly in the calculation. If data 
for the same standard population are used to calculate directly standardized rates for 
two or more study populations, those standardized rates can be appropriately 
compared. Any difference among the standardized rates cannot be attributed to 
differential population distributions of the standardized variable because the 
calculations have been adjusted for that variable (18). The following data must be
available in order to use direct adjustment: 

• Specific rates for the study population and 

• Distribution for the selected standard population across the same strata 
as those used in determining the specific rates. 
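A minimal sketch of direct standardization follows, assuming the two inputs listed
above are available as dictionaries keyed by stratum. The rates and the standard
population are hypothetical, and the function name is illustrative.

```python
def directly_standardized_rate(specific_rates, standard_population):
    """Average the study population's specific rates, weighting each stratum
    by its share of the standard population."""
    if set(specific_rates) != set(standard_population):
        raise ValueError("rates and standard population must use the same strata")
    total = sum(standard_population.values())
    return sum(
        specific_rates[stratum] * standard_population[stratum] / total
        for stratum in specific_rates
    )


# Hypothetical age-specific death rates per 1,000 for a study population.
study_rates = {"0-44": 1.0, "45-64": 5.0, "65+": 40.0}
# Hypothetical standard population counts in the same strata.
standard = {"0-44": 700, "45-64": 200, "65+": 100}

print(directly_standardized_rate(study_rates, standard))
```

With these inputs the weights are 0.7, 0.2, and 0.1, so the adjusted rate is
0.7(1.0) + 0.2(5.0) + 0.1(40.0) = 5.7 per 1,000.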

Indirect standardization

An indirectly standardized rate is calculated for a study population by averaging the 
specific rates for a select standard population, using the distribution of the study 
population as weights. One should use indirect adjustment when any of the specific 
rates in the study population are unavailable or when such small numbers exist in the 
categories of strata that the data are unreliable (i.e., the resulting rates are
unstable). This commonly occurs in studies of occupational mortality or of small
geographic areas. For these reasons, indirect standardization is used more often than direct
standardization. Indirectly standardized rates for two or more populations of 
interest can be appropriately compared if the same standard population is used in the 
computations. The following data are required to make an indirect adjustment:

• Specific rates for the selected standard population, 

• Distribution for the study population across the same strata as those 
used in calculating the specific rates, 

• Crude rate for the study population, and 

• Crude rate for the standard population. 

A special application of the indirect standardized rate, when the health event of 
interest is death, is the standardized mortality ratio (SMR). It is the number of
deaths occurring in a study population or subpopulation, expressed as a percentage of 
the number of deaths expected to occur if the given population and the selected 
standard population had the same specific rates (19). Explicitly, the SMR is an 
indirect, age-adjusted ratio calculated as the indirect standardized mortality rate 
for the study population, divided by the crude mortality rate for the standard 
population. Additional information is available on the use of the SMR, as well as on 
computation of variance and confidence intervals for direct and indirect 
measures (18).
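The indirect calculation can be sketched as follows. The sketch assumes rates are
expressed per 1,000 and uses the adjusting-factor formulation given in the practical
example later in this chapter (standard crude rate divided by expected rate, multiplied
by the study crude rate). All numbers and names are hypothetical.

```python
def indirect_adjustment(standard_rates, study_population, study_deaths,
                        standard_crude_rate, per=1_000):
    """Apply standard specific rates to the study population's strata.

    standard_rates      -- specific rates of the standard population (per `per`)
    study_population    -- study population counts in the same strata
    study_deaths        -- total observed deaths in the study population
    standard_crude_rate -- crude rate of the standard population (per `per`)
    """
    total_population = sum(study_population.values())
    # Deaths expected if the study population experienced the standard rates.
    expected_deaths = sum(
        standard_rates[s] * study_population[s] / per for s in study_population
    )
    expected_rate = per * expected_deaths / total_population
    crude_rate = per * study_deaths / total_population
    adjusting_factor = standard_crude_rate / expected_rate
    return {
        "expected_deaths": expected_deaths,
        "smr_percent": 100 * study_deaths / expected_deaths,
        "adjusted_rate": adjusting_factor * crude_rate,
    }


# Hypothetical inputs.
standard_rates = {"0-44": 2.0, "45-64": 10.0, "65+": 50.0}
study_population = {"0-44": 5_000, "45-64": 3_000, "65+": 2_000}

result = indirect_adjustment(standard_rates, study_population,
                             study_deaths=126, standard_crude_rate=9.0)
print(result)
```

Here 140 deaths are expected and 126 are observed, so the SMR is 90%. The indirectly
adjusted rate, 8.1 per 1,000, equals the SMR (as a fraction) multiplied by the crude rate
of the standard population.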

Choice of Standard Population 

If crude rates are to be adjusted, an appropriate standard population needs to be
chosen. In extreme cases, the choice of different standard populations can lead
to different results. For example, use of one standard population may yield an
adjusted rate higher for population A than for population B, while choice of another
standard population may yield a higher rate for population B (18).

Two factors should be considered when choosing a standard population: 

• Select a population that is representative of the study populations being 
compared and 

• Understand how choice of a standard population affects directly
standardized rates (e.g., if the age-specific rates for population A are
greater than for population B at young ages and the opposite is true at
older ages, a standard population with distribution skewed to younger
ages will yield a higher directly standardized rate for population A than
for population B).
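The reversal described in the second point can be reproduced with a small sketch. The
two sets of age-specific rates and the two standard populations below are hypothetical,
chosen so that the rates cross between the younger and older strata.

```python
def direct_rate(rates, weights):
    """Directly standardized rate: specific rates weighted by a standard population."""
    total = sum(weights.values())
    return sum(rates[stratum] * weights[stratum] for stratum in rates) / total


# Hypothetical rates per 1,000: A is higher at young ages, B is higher at older ages.
rates_a = {"young": 3.0, "old": 20.0}
rates_b = {"young": 2.0, "old": 25.0}

# Two hypothetical standard populations with different age distributions.
young_standard = {"young": 900, "old": 100}
old_standard = {"young": 500, "old": 500}

for label, standard in [("younger standard", young_standard),
                        ("older standard", old_standard)]:
    print(label, direct_rate(rates_a, standard), direct_rate(rates_b, standard))
```

With the younger standard, population A has the higher adjusted rate (4.7 versus 4.3);
with the older standard, the ordering reverses (11.5 versus 13.5).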

Generally the choice of standard population makes little difference in comparing 
adjusted rates. Although magnitudes of the adjusted rates depend upon choice of 
standard population, no meaning is attached to those magnitudes; only relative 
differences in the adjusted rates can be assessed. 

Various choices are available for a standard population. Customary selections include 
the combined or pooled population of the overall population to be studied, the 
population of one of the study groups, a large population (such as the 1940 or 1980 
United States population), or a hypothetical population. Calculating standardized
rates using different standard populations allows comparisons of different
distributions (20).

To Standardize or Not To Standardize. The decision to standardize is not always 
straightforward. Several factors, most of which are data-driven, must be considered 
in the decision process. Reasons to present standardized rates include the following:

• Standardization adjusts for confounding variables to yield a more 
realistic view of the effect of other variables on risk, 

• A summary measure for a population is easier to compare with similar 
summary measures than are sets of specific rates, 

• A standardized rate has a smaller standard error than any of the specific 
rates (this is important when comparing sub-populations or geographic
areas),

• Specific rates may be imprecise or unstable because of sparse data in the 
strata, and 

• Specific rates may be unavailable for certain groups of interest (e.g., 
small populations or those designated by specific geographic areas).


The major disadvantage of standardization is evident when the specific rates vary
differently across strata, such as when they move in different directions or by
different magnitudes in individual age groups. In this case the trend in the
standardized rate is a weighted average of the trends in the specific rates, where the 
weights depend on the standard population selected. When this occurs, the 
standardized rate tends to mask the differences, and no single summary measure will 
reveal these differences. 

Another unfavorable characteristic of standardized rates is that their magnitude is 
arbitrary and depends entirely on the standard population. Although generally not the 
case, relative rankings of summary measures from different study populations may 
change if a different standard population is selected. 

Regardless of the decision made regarding standardization, it is crucial to evaluate 
the specific rates to characterize accurately and to understand more fully the 
variation among study populations. Standardized rates should never be used as a 
substitute for specific rates, nor should they be the basis of inferences when 
specific rates can be computed. A compromise between using a summary measure and using a
set of specific measures is to use the specific rates but to eliminate or combine
categories to minimize the number of rates required for comparison. Additional
discussion is available on advantages and disadvantages of standardization and on 
analyzing crude and specific rates (21). 

Rate standardization: practical example 

To demonstrate how crude, specific, and standardized rates are obtained, we compare 
death rates in two Florida counties. This example shows how standardized rates can be 
misleading if they are not properly scrutinized. 

We will use population and death totals for Pinellas and Dade Counties in Florida for 
1980 (Table V.2). The crude death rate for Pinellas County is about 60% higher than 
that for Dade County. When the age distributions of each county are used, the 
resulting age-specific death rates are generally slightly higher in Dade County (Table 
V.3), even though the crude death rate is substantially higher for Pinellas County. 
This seeming anomaly in the data results from the different age distributions of each 
county. Specifically, the population in Pinellas is older. 


Directly standardizing the Pinellas and Dade County rates to the United States 1980 
population corrects for the differences in population (Table V.4). Once differences 
in age-related distributions in the two counties have been taken into account, the 
adjusted death rate for Pinellas County is lower than that for Dade County (7.7 and 
7.9, respectively). 

The indirect method of adjustment increases the relative difference between death 
rates for the two counties (Table V.5). The adjusting factor is computed as the 1980 
death rate for the total U.S. population divided by the expected death rate. The
adjusted death rate is then calculated as the adjusting factor multiplied by the crude
death rate. In this example, indirect adjustment reinforces and accentuates the 
results of direct adjustment by yielding rates of 7.5 and 7.8 deaths per 1,000 
population for Pinellas and Dade Counties, respectively. 

This example illustrates the importance of being thoroughly familiar with the data. 
Comparison of crude death rates alone can be misleading. However, calculating age- 
specific and adjusted rates permits an accurate understanding of death rates in these 
counties and shows that the high crude rate in Pinellas County reflects its older 
population. The example also illustrates how the magnitude of adjusted rates depends 
on the choice of standard population. 

Analysis of Rates 

When numerator and denominator data are available, analysis of rates should always 
begin with calculation of crude rates and proceed to subsequent computation of 
relevant specific rates. If appropriate, a standard population can be chosen to 
determine standardized rates. Tables and especially maps are important means of 
presenting rates at different times and/or locations. (See "Tables," "Graphs," and
"Maps" below.)

Several statistical procedures are available to analyze data. Inference on a single
proportion is performed using a z test, and assessing the difference between two
proportions can be accomplished with a z or χ² test (17).* Use of Poisson parameters
is helpful in comparing two rates (22). A series of χ² tests can be used to compare
proportions from several independent samples (16), and Poisson regression is
frequently used for comparing several rates (23). Other modeling procedures that can
be used to analyze rates include smoothing, Box-Jenkins, and Kalman filter approaches,
all of which are time-series methods discussed in Chapter VI. Space-time cluster
techniques and small-area estimation methods are also discussed in Chapter VI.

*Note that Fleiss does not distinguish between rates and proportions or the analysis
of them.
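As one illustration of the tests mentioned above, the following sketch computes a z
test for the difference between two proportions. It assumes the usual large-sample
version with a pooled standard error and a two-sided normal p-value; the counts are
hypothetical, and the cited references should be consulted for the exact procedures
the authors intend.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Large-sample z test for the difference between two proportions.

    x1, x2 -- number of events in each sample
    n1, n2 -- sample sizes
    Returns (z, two-sided p-value) using the pooled proportion for the standard error.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical data: 60 events among 100 persons versus 40 events among 100 persons.
z, p_value = two_proportion_z_test(60, 100, 40, 100)
print(z, p_value)
```

For a two-by-two table, the square of this z statistic equals the χ² statistic computed
without a continuity correction, which is why either test can be used for two
proportions.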


Exploratory data analysis (EDA) is enumerative, numeric, or graphic detective
work (24). It is the application of a set of techniques to a body of data to make the
data more understandable. EDA is a philosophy that minimizes assumptions, allows the
data to motivate the analysis, and combines ease of description with quantitative
knowledge. EDA leads the analyst to uncover characteristics often hidden within the
data.


Practice of EDA involves four fundamental steps (24-25):

1. Using visual displays to convey the structure of the data and analyses, 

2. Transforming the data mathematically to simplify their distribution and 
to clarify their analysis, 

3. Investigating the influence that unusual observations (outliers) have on 
the results of analysis, and 

4. Examining the residuals (the difference between the observed data and a 
fitted model) to provide additional insight into the data. 

EDA is the initial step in any analysis. It allows the investigator to become 
familiar with the data and forms the foundation for further analysis. Although most 
public health surveillance systems are established for specific topics, proper EDA of 
the data can provide insight into demographic, temporal, and spatial patterns 
otherwise overlooked in the collection of numbers. EDA may additionally contribute to 
more timely detection of unusual observations, which may, in turn, facilitate a 
quicker public health response to factors that cause increased morbidity and/or 
mortality.

Data Displays 

A first step in any analysis of data is a visual examination of the data. A few of
the techniques that should be used initially are described below for application to a
single set of numbers, for exploration of relationships between two factors, and for
comparisons among several populations.

Dot plots 

A dot plot is a one-dimensional plot (Figure V.2) of the individual values of a set of 
numbers. The x-axis represents one or more categories of a non-continuous variable, 
and the y-axis represents the range of values displayed by the observations. 
Observations with identical values are plotted side by side on the same horizontal
line.

Stem-and-Leaf Displays 

A stem-and-leaf display is a graphic (Figure V.3) that allows the digits of the 
observation values to sort the numbers into numerical order for display. This is a 
variation of the conventional histogram. The basic principle used in constructing a 
stem-and-leaf display is the splitting of each data value between a suitable pair of 
adjacent digits to form a set of leading digits and a set of trailing digits. The set 
of leading digits forms the stems, and the set of the first trailing digit from the 
data forms the leaves. Remaining trailing digits are ignored for the purpose of the 
graphic. Variations to the stem-and-leaf display are possible (24).

Many investigators begin an evaluation of data with a histogram (see below), but the
stem-and-leaf display has several advantages over the histogram. Because every
observation is plotted in the stem-and-leaf display, it contains more detail than the
histogram and allows computation of percentage points. Moreover, transformations can
be applied directly to stem-and-leaf data. 
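The construction rule described above can be sketched as follows. The sketch assumes
non-negative values and at least one observation; the `leaf_unit` parameter (the place
value of the leaf digit) is an illustrative convenience, and the data are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Split each value into a stem (leading digits) and a leaf (first trailing digit).

    Digits below `leaf_unit` are dropped, as described in the text.
    Assumes a non-empty list of non-negative values.
    """
    rows = defaultdict(list)
    for value in sorted(values):
        scaled = int(value // leaf_unit)   # discard digits smaller than the leaf unit
        stem, leaf = divmod(scaled, 10)
        rows[stem].append(leaf)
    # Include empty stems so that gaps in the data remain visible.
    return {stem: rows.get(stem, []) for stem in range(min(rows), max(rows) + 1)}


data = [12, 15, 21, 27, 27, 34, 48]  # hypothetical observations
for stem, leaves in stem_and_leaf(data).items():
    print(f"{stem:>3} | {''.join(str(leaf) for leaf in leaves)}")
```

Because the leaves are stored in sorted order, order statistics such as the median can
be read directly from the display by counting leaves.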

Scatter plots 

The scatter plot or scatter diagram is a plot (Figure V.4) that reveals the 
relationship between two variables. Each observation comprises a pair of values, one 
for each variable. The observation is plotted by measuring the value of one variable 
on the horizontal axis and the value of the other on the vertical axis. 

Data summaries 

One can summarize a data set by calculating a few numbers which are relatively easy to 
interpret. For example, measures of central tendency and variability are frequently 
used to describe data. In particular, two types of summary displays have proven 
useful in characterizing data, i.e., the five-number summary and the box plot. 

Five-number summaries 

The five-number summary of a data set is a simple display (Table V.6) involving the 
median, hinges, and extreme values. The median is a measure of the central tendency of
the data that splits an ordered data set in half. The hinges are a measure of the 
variability of the data and are the values in the middle of each half. Therefore, the 
hinges are the data values that are approximately 1/4 and 3/4 from the beginning of 
the ordered data set. They are determined by formulas (25) and are similar to
quartiles that are defined so that 1/4 of the observations lie below the lower 
quartile and 1/4 lie above the upper quartile. The extremes also reflect the 
variability of the data and are the smallest and largest values in the data. 
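A sketch of the five-number summary follows. It assumes the commonly used depth rule
for hinges (median depth = (n + 1)/2; hinge depth = (floor of median depth + 1)/2),
which may differ in detail from the formulas in the cited reference. The function names
are illustrative.

```python
import math


def value_at_depth(ordered, depth):
    """Value at a 1-based depth; a half-integer depth averages the two neighbours."""
    lower = ordered[math.floor(depth) - 1]
    upper = ordered[math.ceil(depth) - 1]
    return (lower + upper) / 2


def five_number_summary(values):
    """Extremes, hinges, and median of a non-empty data set."""
    ordered = sorted(values)
    n = len(ordered)
    median_depth = (n + 1) / 2
    hinge_depth = (math.floor(median_depth) + 1) / 2  # assumed depth rule
    from_top = ordered[::-1]                           # count the same depth from the top
    return {
        "lower_extreme": ordered[0],
        "lower_hinge": value_at_depth(ordered, hinge_depth),
        "median": value_at_depth(ordered, median_depth),
        "upper_hinge": value_at_depth(from_top, hinge_depth),
        "upper_extreme": ordered[-1],
    }


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

For the nine values shown, the median is 5 and the hinges are 3 and 7, the middle values
of the lower and upper halves when the median is counted in both halves.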

Box plots 

The box plot is a graphic representation (Figure V.5) of the five-number summary with 
the two ends of the box representing the hinges and the line through the box 
representing the median. A line runs from each end of the box (i.e., from each hinge) 
to the corresponding lower and upper extreme values. This plot allows the reader to 
see quickly the median level, the variability, and the symmetry of the data. 
Variations of the box plot, including identification of outlier values, are possible 
(25).


Transformation or re-expression of data is a powerful tool that facilitates 
understanding their implications. If numbers are collected in a manner that renders 
them hard to grasp, the data analyst should use a transformation method, while 
preserving as much of the original information as can be used. When used 
appropriately, transformed data can be readily analyzed and interpreted. 

Raw data are transformed for a number of reasons: to achieve symmetry, to produce a
straight-line relationship, to allow use of an additive model,
to reduce variability, and to attain normally distributed data. Symmetry is highly
desirable when analyzing a single data set, since it ensures that a "typical" value 
(such as the mean or median) more nearly summarizes the data. When analyzing pairs of 
data, a straight-line relationship is important because linear associations are 
simple, both in form and in interpretation. One or both variables can be transformed 
to achieve linearity. Additive models have the desirable feature that data in multi- 
way tables can be typically decomposed into additive effects and analyzed accordingly. 
Reduced variability of the data is crucial when comparing several data sets. If the 
data spread varies with the data set, then "typical" values are obtained more 
accurately in the data with smaller spread. Finally, normally distributed data are 
needed so that normal theory statistics can be applied to test hypotheses and draw
inferences.

Not all data sets benefit from transformation. The ratio of the largest to smallest value in
the original data set is a simple indicator of whether a group of numbers will be 
affected substantially by transforming. If the ratio is near 1, a transformation will 
not severely alter the appearance of the data. Since transformations affect larger 
values and smaller values differently, the further the ratio is from 1, the greater 
the need is for transformation to display and understand the data most simply. 

Transformations are generally accomplished by raising each value of the data set to 
some power p. Different values of p yield different effects on a data set, but those 
effects are ordered if the values of p are ordered. Some transformations are 
especially effective in certain instances (Table V.7). For example, the square root
transformation is particularly effective at reducing variability in count data.
Guidelines are available to assist in selecting appropriate transformations (24,25).
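The effect of a power transformation, and the largest-to-smallest ratio used to judge
whether one is worthwhile, can be sketched as follows. Treating p = 0 as the logarithm
is a common convention assumed here, not something stated in the text, and the counts
are hypothetical.

```python
import math


def power_transform(values, p):
    """Raise each positive value to the power p.

    By a common convention (an assumption in this sketch), p = 0 denotes the
    base-10 logarithm, which fits between small positive and negative powers.
    """
    if p == 0:
        return [math.log10(v) for v in values]
    return [v ** p for v in values]


def max_min_ratio(values):
    """Ratio of largest to smallest value; a ratio near 1 suggests little benefit."""
    return max(values) / min(values)


counts = [1, 4, 9, 100]  # hypothetical count data
print(max_min_ratio(counts))                          # spread before transformation
print(power_transform(counts, 0.5))                   # square-root transformation
print(max_min_ratio(power_transform(counts, 0.5)))    # spread after transformation
```

In this example the ratio of largest to smallest value falls from 100 to 10 after the
square-root transformation, showing how the transformation pulls in the larger values
more strongly than the smaller ones.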


Smoothing refers to EDA techniques that summarize consecutive, overlapping segments of 
a series of data to produce a smoother curve. Its goal is to represent patterns in 
the data more clearly without becoming encumbered with any detailed peaks and valleys. 
Variations in the data set caused by irregular components are smoothed so that the 
overall trend can be determined more readily. Thus, smoothing allows investigators to
search for patterns in data that may otherwise be masked.

Smoothing is used on data series to explore the relationship between two variables. 
The values along the x-axis should be equally spaced. The y values are called a time 
series if they are collected over successive time intervals, although these values 
need not be defined by time (e.g., in a data sequence of birth rates by mother's age).
As long as the x-axis defines an order and the order is not too irregular, the y 
sequence can be called a time series, and smoothing techniques can be applied. In 
time-series analysis, models are frequently developed on the smoothed data because 
these data are generally easier to model.

Numerous smoothing approaches exist, each having its own assets and liabilities. The 
simplest example of smoothing is a moving average of three intervals in which
observation $y_i$ in the data sequence is replaced with the mean of $y_{i-1}$, $y_i$, and $y_{i+1}$.
Discussions of smoothing functions, including suggestions on how to overcome the
problem of obtaining end points for the smoothed series, appear elsewhere (25-26).
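A minimal sketch of the three-interval moving average follows. Leaving the two end
points unchanged is an assumption made here for simplicity; the cited references
describe better ways to handle them. The series is hypothetical.

```python
def moving_average_3(y):
    """Replace each interior value with the mean of itself and its two neighbours.

    The first and last values are copied unchanged (a simplifying assumption).
    """
    if len(y) < 3:
        return list(y)
    smoothed = list(y)
    for i in range(1, len(y) - 1):
        smoothed[i] = (y[i - 1] + y[i] + y[i + 1]) / 3
    return smoothed


weekly_cases = [3, 6, 9, 3, 0]  # hypothetical series
print(moving_average_3(weekly_cases))
```

The isolated peak of 9 is reduced to 6 in the smoothed series, which illustrates how
irregular components are damped so that the overall pattern is easier to see.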



Visual tools play a critical role in public health surveillance. Data graphics 
visually display measured quantities using points, lines, a coordinate system, 
numbers, symbols, words, shading, and color (27). Graphics allow researchers to mesh 
presentation and analysis. Data graphics are essential to organizing, summarizing, 
and displaying information clearly and effectively. The design and quality of such 
graphics largely determine how effectively scientists can present their information. 

Many visual tools are available to assist in analysis and presentation of results. 
The data to be presented and the purpose for the presentation are the key factors in 
deciding which visual tools should be used (Table V.8). Further discussion and
guidance in producing effective, high-quality data graphics are available from several
sources (27-32).



Tables

A table arranges data in rows and columns and is used to demonstrate data patterns and
relationships among variables and to serve as a source of information for other types 
of data graphics (28). Table entries can be counts, means, rates, or other analytic
measures.

A table should be simple; two or three small tables are simpler to understand than one 
large one. A table should be self-explanatory so that if taken out of context readers 
can still understand the data. The guidelines below should be used to increase 
effectiveness of a table and ensure that it is self-explanatory (29):

• Describe what, when, and where in a clear, concise table title.

• Label each row and column clearly and concisely.

• Provide units of measure for the data.

• Provide row and column totals.

• Define abbreviations and symbols.

• Note data exclusions.

• If the data are not original, reference the source.

One-variable tables

One of the most basic tables is a frequency distribution by category for a single 
variable. For example, the first column of the table contains the categories of the 
factor of interest, and the second column lists the number of persons or events that 
appear in each category and gives the total count. Often a third column contains
percentages of total events in each category (Table V.9). 
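A one-variable frequency table of this kind can be sketched as follows. The categories
and observations are hypothetical, and the function name is illustrative.

```python
from collections import Counter


def frequency_table(observations):
    """Return rows of (category, count, percent of total), followed by a total row."""
    counts = Counter(observations)
    total = sum(counts.values())
    rows = [(category, n, 100 * n / total) for category, n in sorted(counts.items())]
    rows.append(("Total", total, 100.0))
    return rows


# Hypothetical case reports classified by a single variable (sex).
reports = ["F", "M", "F", "F"]
for category, count, percent in frequency_table(reports):
    print(f"{category:<6}{count:>6}{percent:>8.1f}")
```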

Multi-variable tables 

Most phenomena monitored by public health surveillance systems are complex and require 
analysis of the interrelationships of several factors. When data are available on 
more than one variable, multi-variable cross-classified tables can elucidate 
associations. These tables are also called contingency tables when all the primary 
table entries (e.g., frequencies, persons, or events) are classified by each of the 
variables in the table (Table V.10). 


The most frequently used type of table in epidemiologic analysis is the two-by-two 
contingency table, which is appropriate when two variables, each having two 
categories, are studied. This special case is particularly suited for analyzing case- 
control and cohort studies for which the categories of the variables are case and 
control (or ill and well) and exposed and unexposed. 


Graphs

A graph is a visual display of quantitative information involving a system of
coordinates. Two-dimensional graphs are generally depicted along an x-axis 
(horizontal orientation) and y-axis (vertical orientation) coordinate system. Graphs 
are primary analytic tools used to assist the reader to visualize patterns, trends, 
aberrations, similarities, and differences in data. 

Simplicity is key to designing graphs. Simple, uncluttered graphs are more likely 
than complicated presentations to convey information effectively. Several specific 
principles should be observed when constructing graphs (29):

• Ensure that a graph is self-explanatory by clear, concise labeling of
title, source, axes, scales, and legends.

• Clearly differentiate variables by legends or keys.

• Minimize the number of coordinate lines.

• Portray frequency on the vertical scale, starting at zero, and the method
of classification on the horizontal scale.

• Ensure that scales for each axis are appropriate for the data.

• Clearly indicate scale division, any scale breaks, and units of measure.

• Define abbreviations and symbols.

• Note data exclusions.

• If the data are not original, reference the source.

Several commonly used graphs are described below. The scatter plot, an extremely 
helpful graph for detecting the relationship between two variables, has already been 
described (see "Data Displays"). 

Arithmetic-scale line graphs 


An arithmetic-scale line graph is one in which equal distances along the x and/or y 
axes represent equal quantities along that axis. This type of graph is typically used 
to demonstrate an overall trend over time rather than focusing on particular 
observation values. It is most helpful for examining long series of data or for 
comparing several data sets (see Figure 1.1). 

The scale of the x-axis is usually presented in the same increments as the data are 
collected (e.g., weekly or monthly). Several factors should be considered when 
selecting a scale for the y-axis (28):

• Choose a length for the y-axis that is suitably proportional to that of 
the x-axis. (A common recommendation is a 5:3 ratio of x-axis to y-axis length.)

• Identify the maximum y-axis value and round the value up slightly. 

• Select an interval size that provides enough detail for the purpose of 
the graph. 

Scale breaks can be used for either or both axes if the range of the data is 
excessive. However, care should be taken to avoid misrepresentation and 
misinterpretation of the data when scale breaks are used. 

Semi-logarithmic-scale line graphs

A semi-logarithmic-scale line graph, or semi-log graph, is characterized by one axis being
measured on an arithmetic scale (usually the x-axis) and the other being measured on a
logarithmic scale. A logarithm is the exponent expressing the power to which a base
number is raised (e.g., log 100 = log 10^2 = 2 for base 10). The axis portraying the
logarithmic scale on semi-log graph paper is divided into several cycles, with each 
cycle representing an order of magnitude and values 10 times greater than the 
preceding cycle (e.g., a 3-cycle semi-log graph could represent 1 to 10 in the first 
cycle, 10 to 100 in the second cycle, and 100 to 1,000 in the third cycle). 

A semi-logarithmic-scale line graph is particularly valuable when examining the rate
of change in surveillance data, because a straight line represents a constant rate of
change. For absolute changes, an arithmetic-scale line graph would be more
appropriate. The semi-log scale is also useful when large differences in magnitude or
outliers occur because this type of graph allows the plotting of wide ranges of values
(see Figure 1.6). With semi-log graphs, the slope of the line indicates the rate of
increase or decrease; thus a horizontal line indicates no change in rate. Also,
parallel lines for two conditions demonstrate identical rates of change (29).
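The claim that a constant rate of change appears as a straight line on a semi-log graph
can be checked numerically. In the hypothetical series below, the case count doubles
each period: the arithmetic steps grow, but the steps in the base-10 logarithm are all
equal.

```python
import math


def successive_differences(values):
    """Difference between each value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


cases = [10, 20, 40, 80, 160]  # hypothetical counts that double each period

arithmetic_steps = successive_differences(cases)
log_steps = successive_differences([math.log10(c) for c in cases])

print(arithmetic_steps)  # unequal steps: the line curves upward on an arithmetic scale
print(log_steps)         # equal steps: the line is straight on a logarithmic scale
```

Each logarithmic step equals log 2 (about 0.301), so equally spaced points on the x-axis
rise by the same vertical distance, which is what makes the plotted line straight.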


A histogram is a graph in which a frequency distribution is represented by adjoining 
vertical bars. The area represented by each bar is proportional to the frequency for 
that interval (i.e., the height multiplied by the width of each bar yields the number 
of events for that interval). Thus, scale breaks should never be used in histograms
because they misrepresent the data. 

Histograms can be constructed with equal or unequal class intervals. With equal class
intervals, the height of each bar is proportional to the frequency of the
events in that interval. We do not recommend using histograms with unequal class
intervals because they are difficult to construct and interpret correctly. 

The epidemic curve is a special type of a histogram in which time is the variable 
plotted on the x-axis. The epidemic curve represents the occurrence of cases of a 
health problem by date of onset during an epidemic (e.g., an outbreak of paralytic
poliomyelitis in Oman [see Figure V.6]). Usually the class intervals on the x-axis
should be less than one-fourth of the incubation period of the disease, and the
intervals should begin before the first reported case during the epidemic in order to 
portray any identified background cases of the condition being graphed. 
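The counts behind an epidemic curve can be sketched as follows. The sketch assumes
onset dates have already been converted to day numbers relative to a reference date;
the incubation period, interval width, and onset days are hypothetical.

```python
from collections import Counter


def epidemic_curve_counts(onset_days, interval_days, start_day):
    """Count cases in consecutive class intervals of equal width.

    onset_days    -- day number of onset for each case (relative to a reference date)
    interval_days -- width of each class interval, in days
    start_day     -- first day of the first interval; should precede the first case
    """
    if min(onset_days) < start_day:
        raise ValueError("start_day must not be later than the earliest onset")
    bins = Counter((day - start_day) // interval_days for day in onset_days)
    return [bins.get(i, 0) for i in range(max(bins) + 1)]


# Hypothetical disease with a 12-day incubation period: intervals should be
# shorter than 12 / 4 = 3 days, so 2-day intervals are used here.
onsets = [3, 4, 4, 5, 6, 6, 6, 9]
print(epidemic_curve_counts(onsets, interval_days=2, start_day=0))
```

Starting the first interval at day 0, before the first onset on day 3, leaves an empty
leading interval in which any background cases would appear.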

Cumulative frequency and survival curves 

A cumulative frequency curve is used for both continuous and categorical data. It 
plots the cumulative frequency on the y-axis and the value of the variable on the x- 
axis. Cumulative frequencies can be expressed either as the number of cases or as a 
percentage of total cases. For categorical data, the cumulative frequency is plotted 
at the right-most end of each class interval (rather than at the midpoint) to depict
more realistically the number or percentage of cases above and below the x-axis value
(Figure V.7). When percentages are graphed, the cumulative frequency curve allows
easy identification of medians, quartiles, and other percentiles of interest. 
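The points plotted on a cumulative frequency curve can be computed as sketched below.
Each cumulative percentage is paired with the upper (right-most) bound of its class
interval, as described above; the intervals and counts are hypothetical.

```python
from itertools import accumulate


def cumulative_percentages(counts):
    """Running total of the counts, expressed as a percentage of all cases."""
    total = sum(counts)
    return [100 * running / total for running in accumulate(counts)]


def first_bound_reaching(upper_bounds, percentages, target):
    """Upper bound of the first class interval whose cumulative percentage
    reaches the target (e.g., 50 for the interval containing the median)."""
    for bound, percent in zip(upper_bounds, percentages):
        if percent >= target:
            return bound
    return None


upper_bounds = [10, 20, 30, 40]   # hypothetical right-most ends of class intervals
counts = [5, 15, 20, 10]          # hypothetical cases in each interval

percentages = cumulative_percentages(counts)
print(list(zip(upper_bounds, percentages)))
print(first_bound_reaching(upper_bounds, percentages, 50))
```

Here the cumulative percentage passes 50% in the interval ending at 30, so the median
lies in that interval; quartiles can be located the same way with targets of 25 and 75.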

A survival curve (Figure V.8) is useful in a follow-up study for graphing the
percentage of subjects remaining until an event occurs in the study. The x-axis
represents time, and the y-axis is percentage surviving. A difference in orientation 
exists between cumulative frequency and survival curves (Figures V.7, V.8). 

Frequency polygons 

A frequency polygon is constructed from a histogram by connecting the midpoints of the 
class intervals with a straight line. A frequency polygon is useful for comparing 
frequency distributions from different data sets (Figure V.9). Detailed instructions 
for constructing frequency polygons are presented elsewhere (28,29).


Charts are useful graphics for illustrating statistical information. Many types of 
charts can be used (28-30). They are best suited to comparing magnitudes
of events in categories of a variable. In the paragraphs below, we describe several 
of the most frequently used types of charts. 

Bar charts 

Bar charts are one of the simplest and most effective ways to present comparative 
data. A bar chart uses bars of the same width to represent different categories of a 
factor. Comparison of the categories is based on linear values since the length of a 
bar is proportional to the frequency of the event in that category. Therefore, scale 
breaks could cause the data to be misinterpreted and should not be used in bar charts. 
Bars from different categories are separated by spaces (unlike the bars in a 
histogram). Although most bars are vertical, they may be depicted horizontally. They 
are usually arranged in ascending or descending length, or in some other systematic 
order.

Several variations of the bar chart are commonly used. The grouped or multiple-unit 
bar chart compares units within categories (Figure V.10). Generally the number of 
units within a category is limited to three for effective presentation and 
understanding.

A stacked bar chart is also used to compare different groups within each category of a 
variable. However, it differs from the grouped bar chart in that the different groups
are differentiated not with separate bars, but with different segments within a single
bar for each category. The distinct segments are illustrated by different types of
shading, hatching, or coloring, which are defined in a legend (Figure V.11).

The deviation bar chart illustrates differences in either direction from a baseline. 
This type of chart is especially useful for demonstrating positive-negative and 
profit-loss data or comparisons of data at different times (Figure V.12). The 
incorporation of a confidence interval-like portion in the bars provides additional 
useful information. 

Pie charts 

A pie chart represents the different percentages of categories of a variable by 
proportionally sized pieces of pie (Figure V.13). The pieces are usually denoted with
different colors or shading, and the percentages are written inside or outside the 
pieces to allow the reader to make accurate comparisons. 


Maps are the graphic representation of data using location and geographic coordinates 
(33). A map generally provides a clear, quick method for grasping data and is
particularly effective for readers who are familiar with the physical area being 
portrayed. A few popular types of maps that depict incidence or distribution of 
health conditions are described below. 

Spot maps 

A spot map is produced by placing a dot or other symbol on the map where the health 
condition occurred or exists (Figure V.14). Different symbols can be used for 
multiple events at a single location. Although a spot map is beneficial for 
displaying geographic distribution of an event, it does not provide a measure of risk 
since population size is not taken into account. 
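
The difference between a count of events and a measure of risk can be illustrated with a minimal Python sketch. The area names, case counts, and populations below are invented for illustration only and do not come from any surveillance system; the point is simply that the area with the most spots on a map need not be the area with the highest rate.

```python
# Hypothetical case counts and populations for three areas (illustrative only).
areas = {
    "Area A": {"cases": 40, "population": 200_000},
    "Area B": {"cases": 25, "population": 50_000},
    "Area C": {"cases": 5, "population": 5_000},
}


def rate_per_100k(cases, population):
    """Crude rate per 100,000 population."""
    return cases / population * 100_000


# Rate for each area, which is what a spot map alone cannot show.
rates = {name: rate_per_100k(a["cases"], a["population"]) for name, a in areas.items()}

# The area with the most spots versus the area with the highest rate.
most_cases = max(areas, key=lambda name: areas[name]["cases"])
highest_rate = max(rates, key=rates.get)

print("Most cases:", most_cases)
print("Highest rate:", highest_rate)
```

In this made-up example the area with the most cases has the lowest rate, which is the situation a spot map by itself would conceal.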

Choropleth maps

A choropleth map is a frequently used statistical map involving different types of
shading, hatching, or coloring to portray range-graded values (Figure V.15). It is
also called a shaded or area map. Choropleth maps are useful for depicting rates of
a health condition in specific areas.

Care must be taken in interpreting choropleth maps because each area is shaded
uniformly regardless of any demographic differences within an area. For example, most
of a county may be relatively sparsely populated by low-income persons, whereas a
small portion of that county may be densely inhabited by persons with higher incomes;
and the rate at which a particular health condition occurs may falsely appear to be
evenly distributed by location and by socioeconomic status throughout the county.
Choropleth maps can also give the false impression of abrupt change in number or rate
of a condition across area boundaries when, in fact, a gradual change may have
occurred from one area to the next.

Density-equalizing maps 

A density-equalizing or rubber map (Figure V.16) transforms actual geographic 
coordinates to produce an artificial figure in which area or population density is 
equal throughout the map (34). Density-equalizing maps correct for the confounding
effect of population density and thus are particularly useful in analyzing geographic 
clusters of public health events. 

Several algorithms exist to transform coordinates of maps. Any transformation routine 
should define a continuous transformation over the map domain, solve for the unique 
solution that minimizes map distortion, accept optional constraints, and avoid 
overlapping of transformed areas (35).


The real art of conducting surveillance lies in interpreting what the data say. Data 
need to be interpreted in the context of our understanding of the etiology, 
epidemiology, and natural history of the disease or injury. The interpretation should 
focus on aspects which might lead to improved control of the condition. By proceeding 
from the simple to the complex, investigators can use surveillance as a basis for 
taking appropriate public health action. Epidemics can be recognized, preventive
strategies applied, and the effect of such actions assessed. The key to
interpretation lies in knowing the limitations of the data and being meticulous in
describing them. One axiom to be kept in mind always is that, because of the
descriptive nature of surveillance data, correlation does not equal causation.

Limitations in Data 

No surveillance system is perfect; however, most can be useful. Several problems 
inherent in data obtained through surveillance must be recognized if the data are to 
be interpreted correctly. 

Underreporting

Because most surveillance systems are based on conditions reported by health-care 
providers, underreporting is inevitable. Depending on the condition, 5%-80% of cases 
that actually occur will be reported (36-39). However, the need for completeness of
reporting--particularly for common health problems--may be exaggerated. Disease
trends by time, place, and person can frequently be detected even with incomplete 
data. So long as the underreporting is relatively consistent, incomplete data can 
still be applied to derive useful inferences. For problems that occur infrequently, 
the need for completeness becomes more important. 
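
The role of consistency can be illustrated with a minimal Python sketch. The true counts and reporting fractions below are invented for illustration; the sketch only shows that a constant reporting fraction preserves period-to-period relative change, whereas a drifting fraction distorts it.

```python
# Hypothetical true numbers of cases in four successive periods (illustrative only).
true_cases = [100, 120, 150, 200]


def reported(true_counts, fractions):
    """Expected reported counts given the fraction of cases reported in each period."""
    return [t * f for t, f in zip(true_counts, fractions)]


def relative_change(series):
    """Ratio of each period's count to the count in the period before it."""
    return [later / earlier for earlier, later in zip(series, series[1:])]


# Consistent underreporting: 20% of cases reported in every period.
consistent = reported(true_cases, [0.2, 0.2, 0.2, 0.2])

# Inconsistent underreporting: completeness improves from 20% to 50%.
drifting = reported(true_cases, [0.2, 0.3, 0.4, 0.5])

print("True trend:      ", relative_change(true_cases))
print("Consistent trend:", relative_change(consistent))
print("Drifting trend:  ", relative_change(drifting))
```

With a constant fraction the reported trend matches the true trend even though four of every five cases are missed; when completeness changes over time, part of the apparent increase reflects reporting rather than disease.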

Unrepresentativeness of reported cases 

Health conditions are not reported randomly. For example, illnesses dealt with in a 
public health facility are reported disproportionately more frequently than those 
diagnosed by private practitioners. A health problem that leads to hospitalization is 
more likely to be reported than problems dealt with on an outpatient basis. Thus, 
reporting biases can distort interpretation. When it is possible, adjusting for 
skewed reporting will allow investigators to obtain a more accurate picture of the 
occurrence of a health problem. Collecting data from multiple sources may help 
provide ways to improve the representativeness of the information. 

Inconsistent case definitions 

Different practitioners frequently use different case definitions for health problems. 
The more complex the diagnostic syndrome, the greater the difficulty in reaching 
consensus on a case definition. Moreover, with newly emerging problems, as 
understanding of their natural history progresses, we frequently adjust the case 
definition to allow greater accuracy of diagnosis. Persons who interpret surveillance 
data must be aware of any changes in case definitions and must adjust their
interpretations accordingly.

Approach to Interpretation 

Creative interpretation of surveillance data requires more common sense than 
sophisticated reasoning. The data can speak for themselves. Brainstorm and test, if 
possible, all potential explanations for an observed pattern. Has the nature of 
reporting changed? Have providers or new geographic areas entered the surveillance 
system? Has the case definition changed? Has a new intervention, such as screening 
or therapy, been introduced? 

Consistency among different surveillance systems is probably the most crucial factor 
affecting interpretation. If different surveillance data sets from different 
locations show similar trends, the likelihood that the effect is real increases. 
Examine trends in different age groups. Finally, choose the surveillance system you 
think represents the highest quality local information. If the trends of the health 
problem are evident there, you can be more confident about your interpretations. 

To facilitate interpretation of surveillance data, formats can be designed to 
determine whether the number of reported cases of a health problem for a specified 
reporting period differs from that of a previous period. An example of such a "user- 
friendly" format has been published in CDC's Morbidity and Mortality Weekly Report 
(MMWR) since 1990 (40,41). Known simply as "Figure 1," the graph uses horizontal bars
to indicate the ratio of the current level of disease to the previous 5-year average
(Figure V.12). Striping in the bars shows whether the number of reported cases during
the most recent 4-week interval is higher or lower than expected on the basis of the
mean and two standard deviations of the 4-week totals. A change in the occurrence of
disease identified by this approach indicates the need for more detailed examination 
of the data--and may indicate an epidemic. Other diverse statistical techniques can 
be used to detect aberrations in surveillance data (42; see Chapter VI). 


Identifying Epidemics 

An important use of surveillance data is in determining whether increases in numbers
of cases of a health condition at the local or national level represent outbreak
(i.e., epidemic) situations that require immediate investigation and intervention. 
Thus, a surveillance system can function as an early warning signal for public health 
officials. For example, increases in numbers of cases of hepatitis B among military 
recruits provided the stimulus to intervene with drug-prevention programs (43). CDC's 
Birth Defects Monitoring System identified increases in renal agenesis (44) during the 
1970s and 1980s, which prompted an investigation. Monitoring of regional trends in 
rubella and congenital rubella identified outbreaks among the Amish in 1989-1990 (45).
A national registry of anti-abortion-associated violence clearly documented an
"epidemic" of attacks in the mid-1980s, which decreased after vigorous prosecution was
initiated by the Federal Bureau of Investigation (46).

The utility of surveillance data in detecting epidemics is highest in situations in 
which cases of the health condition occur over a wide geographic area or gradually 
over time. In such situations, the time-place-person links among cases probably would 
not be recognized by individual practitioners (3). Typical examples occur with
infectious diseases, when laboratory monitoring of unusual serotypes or antibiotic-
resistance patterns identifies outbreaks of specific microorganisms that might otherwise
have gone unnoticed. Nationwide epidemics of Salmonella newport (47), S. enteritidis
(48), and Shigella sonnei (49) have been detected through surveillance.

Identifying New Syndromes 

The most dramatic use of surveillance data occurs when a "new" syndrome emerges from 
an ongoing monitoring system. Legionnaires' disease was detected and subsequently
characterized as the result of an outbreak of non-influenza pneumonia within a
specific place and population (50). Acquired immunodeficiency syndrome (AIDS) was
recognized both because of rapid increases in requests for CDC's pentamidine supply
and because it occurred in a special time (early 1981), place (California, New York),
and person (men having sex with men) setting (51). Finally, the national scope of the
epidemic of eosinophilia myalgia syndrome (EMS) was noticed because its unique
features were like those of toxic oil syndrome (52).

Monitoring Trends 

Even if specific outbreaks or new syndromes cannot be identified by tracking
surveillance data, the baseline level of the health condition being monitored reflects
any variation in its occurrence over time. This purpose is especially relevant to
assessing events associated with reproductive health (e.g., ectopic pregnancy or
neonatal mortality), chronic disease, or infections with a long latency. The
progressive decline--until recently--of tuberculosis in the 20th century and the
constant increase in numbers of cases of AIDS throughout the 1980s reflect this
monitoring function (53,54).

Evaluating Public Policy 

Surveillance data can assess the health impact--pro or con--of specific interventions 
or of public policy. The rapid fall in numbers of cases of poliomyelitis and measles 
after national vaccination campaigns were instituted is a classic example of the 
usefulness of surveillance data (55,56). Creative interpretation of surveillance data
has also been applied to noninfectious conditions; the impact, in such situations, is
somewhat more difficult to assess. For example, in Washington, D.C., the adoption of 
a gun-licensing law coincided with an abrupt decline in firearm-related homicides and 
suicides (57). No similar reductions occurred in the number of homicides or suicides
committed by other means, nor did states adjacent to the District experience any 
reductions in their rates of firearm-related homicides or suicides. Also, 
surveillance of legal abortions and of deaths associated with illegal abortion has 
helped trace the public health impact of this controversial health problem (8,58,59).
After legal abortion became widely available, deaths from illegal abortion decreased 
markedly; however, restriction of federal funds for abortion had a negligible effect 
on health parameters (60).

Though it is tempting to use trends in disease and injury to monitor the impact of 
community interventions, such evaluation becomes increasingly suspect when several 
factors contribute to the occurrence of the disease or health condition being monitored.
In addition, if only a portion of the population accepts an intervention, analysis and 
interpretation of surveillance data are made even more difficult. Frequently, 
surveillance of process measures or other health problems can act as proxies for the 
intended outcome. Moreover, finding comparability in data from several populations 
that have attempted similar public health programs strengthens evidence that the 
interpretation is correct. For example, to evaluate the effectiveness of allowing
people to exchange used hypodermic needles for new ones as a means of preventing AIDS,
epidemiologists could simultaneously examine trends in numbers of needles distributed,
surveys of needle use, and incidence of higher-prevalence infections such as hepatitis

Projecting Future Needs 

Mathematical models based on surveillance data can be used to project future trends. 
This tool helps health officials determine the eventual need for preventive and 
curative services. Recently such modelling assisted in estimating the impact of AIDS
on the United States health-care system in the 1990s (61). Such projections addressed
not only the demand for AZT by HIV-infected persons with low CD4 lymphocyte counts,
but also the requirements for hospital care for persons with life-threatening
superinfections later in the course of HIV-related disease. In addition,
models based on surveillance data can predict the decline of morbidity and/or 
mortality when there are changes in risk factors among the population at risk. 
Examples of this application include projecting the decline in cardiovascular disease 
on the basis of decreased smoking of cigarettes (62), the decline in cirrhosis-related 
mortality in the presence of lower levels of alcohol use (63), and decreased rates of 
mortality from cervical cancer associated with an increase in the prevalence of 
hysterectomy (64). 


1. Thacker SB, Berkelman RL. Public health surveillance in the United States. 
Epidemiol Rev 1988;10:164-90. 

2. Doll R. Surveillance and monitoring. Int J Epidemiol 1974;3:305-13. 

3. Berkelman RL, Buehler JW. Surveillance. In: Holland WW, Detels R, Knox G, 
eds. Oxford textbook of public health, second edition. Vol 2: Methods of 
public health. Oxford: Oxford University Press, 1991:161-76. 

4. Hinman AR. Analysis, interpretation, use and dissemination of surveillance 
information. PAHO Bull 1977;11:338-43. 

5. Thacker SB, Berkelman RL, Stroup DF. The science of public health surveillance. 
J Pub Health Pol 1989;10:187-203. 

6. Morgenstern H. Uses of ecologic analysis in epidemiologic research. 
Am J Public Health 1982;72:1336-44. 

7. Piantadosi S, Byar DP, Green SB. The ecological fallacy. Am J Epidemiol 

8. Robinson WS. Ecological correlations and the behavior of individuals. 
Am Sociol Rev 1950;15:351-7. 

9. Koonin LM, Kochanek KD, Smith JC, Ramick M. Abortion surveillance, United 
States, 1988. In: CDC surveillance summaries, July 1991. MMWR 1991;40(No. SS-2):15-42.

10. Snow J. Snow on cholera. New York: Hafner Press, 1965.

11. Firebaugh G. A rule for inferring individual relationships from aggregate data. 
Am Sociol Rev 1978;43:557-72. 


12. Rolfs RT, Nakashima AK. Epidemiology of primary and secondary syphilis in the 
United States, 1981-1989. JAMA 1990;264:1432-7.

13. Marx R, Aral SO, Rolfs RT, Sterk CE, Kahn JG. Crack, sex, and STD. Sex Transm 
Dis 1991;18:92-101. 

14. Last JM, ed. A dictionary of epidemiology. 2nd ed. New York: Oxford University 
Press, 1988:141. 

15. Health United States 1990. DHHS publication no. (PHS) 91-1232. Hyattsville, 
Maryland: Centers for Disease Control, 1991. 

16. Ahlbom A, Norell S. Introduction to modern epidemiology. Chestnut Hill, 
Massachusetts: Epidemiology Resources Inc, 1984:97. 

17. Fleiss JL. Statistical methods for rates and proportions. 2nd ed. New York: 
John Wiley & Sons, Inc, 1981:321. 

18. Kahn HA, Sempos CT. Statistical methods in epidemiology. In: MacMahon B, ed. 
Monographs in epidemiology and biostatistics. Vol 12. New York: Oxford 
University Press, 1989:292. 

19. Lilienfeld AM, Lilienfeld DE. Foundations of epidemiology. 2nd ed. New York: 
Oxford University Press, 1980:375. 

20. Peavy JV. Adjusted rates. DHHS publication No. (PHS) 00-1833. Atlanta: 
Centers for Disease Control, 1988. 

21. Mausner JS, Bahn AK. Epidemiology: an introductory text. Philadelphia: 
WB Saunders, 1974:377. 

22. Haight F. Handbook of the Poisson distribution. New York: John Wiley & Sons, 
Inc, 1967. 

23. Kleinbaum DG, Kupper LL, Muller KE. Applied regression analysis and other
multivariable methods. 2nd ed. Boston: PWS-Kent Publishing Co, 1988:718.

24. Tukey JW. Exploratory data analysis. Reading, Massachusetts: Addison-Wesley 
Publishing Company, 1977:688. 

25. Velleman PF, Hoaglin DC. Applications, basics, and computing of exploratory 
data analysis. Boston: Duxbury Press, 1981:354. 

26. McNeil DR. Interactive data analysis. New York: John Wiley & Sons, Inc, 

27. Tufte ER. The visual display of quantitative information. Cheshire, 
Connecticut: Graphics Press, 1987:197. 

28. Principles of epidemiology. 2nd ed (field test version 11/91). Atlanta: Centers 
for Disease Control, 1991. 

29. Peavy JV, Dyal WW, Eddins DL. Descriptive statistics: tables, graphs, & 
charts. DHHS publication no. (PHS) 00-1834. Atlanta: Centers for Disease 
Control, 1986. 

30. Schmid CF. Statistical graphics design principles and practices. New York:
John Wiley & Sons, Inc, 1983:212. 

31. Chambers JM, Cleveland WS, Kleiner B, Tukey PA. Graphical methods for data 
analysis. Boston: Duxbury Press, 1983:395. 

32. Tufte ER. Envisioning information. Cheshire, Connecticut: Graphics Press, 

33. Haggett P, Cliff AD, Frey A. Locational analysis in human geography. 2nd ed. 
Bristol: JW Arrowsmith Ltd, 1977:605. 

34. Gillihan AF. Population maps. Am J Public Health 1927;17:316-9. 


35. Merrill DW, Selvin S, Mohr MS. Analyzing geographic clustered response. In: 
American Statistical Association 1991 Proceedings of the Section on Statistics 
and the Environment. Alexandria, Virginia: American Statistical Association 
(in press) . 

36. Eylenbosch WJ, Noah ND. Surveillance in health and disease. Oxford: Oxford 
University Press, 1988:15-23,32-5. 

37. Vogt RL, LaRue D, Klaucke DN, Jillson DA. Comparison of active and passive 
surveillance systems of primary care providers for hepatitis, measles, rubella 
and salmonellosis in Vermont. Am J Public Health 1983;73:795-7.

38. Levy BS, Mature J, Washburn JW. Intensive hepatitis surveillance in Minnesota: 
methods and results. Am J Epidemiol 1977;105:127-34. 

39. Marier R. The reporting of communicable diseases. Am J Epidemiol 1977;105:587-

40. Centers for Disease Control. Proposed changes in format for presentation of 
notifiable disease report data. MMWR 1989;38:805-9.

41. Centers for Disease Control. Changes in format for presentation of notifiable 
disease report data. MMWR 1990;39:234-5.

42. Stroup DF, Williamson GD, Herndon JL, Karon JM. Detection of aberrations in the 
occurrence of notifiable diseases surveillance data. Stat Med 1989;8:323-9. 

43. Cowan DN, Prier RE. Changes in hepatitis morbidity in the United States Army, 
Europe. Milit Med 1984;149:260-5. 

44. Edmonds LD, James LM. Temporal trends in the prevalence of congenital 
malformations at birth based on the Birth Defects Monitoring Program, United 
States, 1979-1987. In: CDC surveillance summaries, December 1990. MMWR 
1990;39(No. SS-4):19-23. 


45. Centers for Disease Control. Outbreak of rubella among the Amish--United
States, 1991. MMWR 1991;40:264. 

46. Grimes DA, Forrest JD, Kirkman AL, Radford B. An epidemic of anti-abortion 
violence in the United States. Am J Obstet Gynecol 1991;165:1263-8.

47. Holmberg SD, Osterholm MT, Senger KA, Cohen ML. Drug-resistant Salmonella from 
animals fed antimicrobials. N Engl J Med 1984;311:617-22. 

48. St Louis ME, Morse DL, Potter ME, et al. The emergence of grade A eggs as a
major source of Salmonella enteritidis infections: new implications for the 
control of salmonellosis. JAMA 1988;259:2103-7. 

49. Centers for Disease Control. Nationwide dissemination of multiply resistant 
Shigella sonnei following a common-source outbreak. MMWR 1987;36:633-4. 

50. Fraser DW, Tsai TR, Orenstein W, et al. Legionnaires' disease: description of
an epidemic of pneumonia. N Engl J Med 1977;297:1189-97.

51. Centers for Disease Control. Pneumocystis pneumonia--Los Angeles. MMWR

52. Swygert LA, Maes EF, Sewell LE, Miller L, Falk H, Kilbourne EM. Eosinophilia- 
myalgia syndrome: results of national surveillance. JAMA 1990;264:1698-703. 

53. Reider HL, Cauthen GM, Kelly GD, et al. Tuberculosis in the United States.
JAMA 1989;262:385-90. 

54. Centers for Disease Control. Update: acquired immunodeficiency syndrome--
United States, 1981-1990. MMWR 1991;40:358-69. 

55. Centers for Disease Control. Measles prevention: recommendations of the 
Immunization Practices Advisory Committee (ACIP). MMWR 1989;38(No. S-9):1-18.

56. Centers for Disease Control. Progress toward eradicating poliomyelitis from the
Americas. MMWR 1989;38:532-5.

57. Loftin C, McDowall D, Wiersema B, Cottey TJ. Effects of restrictive licensing 
of handguns on homicide and suicide in the District of Columbia. N Engl J Med 

58. Cates W Jr, Rochat RW, Grimes DA, Tyler CW Jr. Legalized abortion: effect on 
national trends of maternal and abortion-related mortality (1940-1976).
Am J Obstet Gynecol 1978;132:211-4.

59. Cates W Jr. Legal abortion: the public health record. Science 1982;215:1586-

60. Cates W Jr. The Hyde amendment in action: how did the restriction of federal 
funds for abortion affect low-income women. JAMA 1981;246:1109-12. 

61. Centers for Disease Control. HIV prevalence estimates and AIDS case projections 
for the United States: report based upon a workshop. MMWR 1990;39(No. RR-16).

62. Kullback S, Cornfield J. An information theoretic contingency table analysis of 
the Dorn study of smoking and mortality. Comput Biomed Res 1976;9:409-37. 

63. Skog O. The risk function for liver cirrhosis from lifetime alcohol
consumption. J Stud Alcohol 1984;45:199-208.

64. Centers for Disease Control. Hysterectomy prevalence and death rates for 
cervical cancer--United States, 1965-1988. MMWR 1992;41:17-20. 


Chapter VI 

Special Analytic Issues 

Donna F. Stroup

"There is only one good, that is knowledge. There is only one evil, that is
ignorance."



Data obtained in a public health surveillance system have several characteristics that 
affect analyses. Most fundamentally, data from most surveillance systems are not 
generated from a designed study or randomized trial. Although this departure has been
addressed in the context of epidemiologic studies and field investigations (1), the
effect in the surveillance setting has specific consequences.

First, for a surveillance system, data are reported regularly, and may be updated 
after the initial report. Since the lag time between first report and subsequent 
updating may vary by health event or reporting location, methods developed for early 
detection of aberrations in the data should be applied as soon as provisional data are 
available. If the analyses are implemented as part of a routine surveillance program, 
results can be monitored as data are updated. 

Second, surveillance data are generated by a spatial as well as a temporal process. 
For example, at a given point in time, cases of a disease for a given area may not 
appear excessive; however, when compared with other times or other areas at a given 
time, an excess may become apparent (2).

Third, when only aggregated data are available (e.g., from regions, counties, or 
states), the distribution of cases in the underlying population cannot be assessed
directly. This problem is compounded because the areas of aggregation are usually 
arbitrarily defined and case definitions are not consistent within areas. As a 
result, statistical inferences concerning the properties of individuals are confounded 
by the properties of the aggregated system. 

Finally, the surveillance process is generally a multivariate one (3). Multiple
health events under surveillance may be related for a given point in time for the same 
area, or the relationship may be delayed in time for the same or nearby areas if 
diagnosis is uncertain or confirmation is delayed. The multivariate nature of this 
process should be used to improve the ability of any method to detect aberrations from 
a baseline. 


One foundation of the science of epidemiology is the study of the departure of the
observed patterns of the occurrence of disease from the expected pattern of occurrence
(4). Variations in the usual incidence of health events in different geographic areas
or different time periods may provide important clues to specific risk factors or even
to the etiology of the problem. The expected numbers of reported health events are
generated by a process involving human behavior and transmission of disease, and
patterns of occurrence within human populations may lead to hypotheses about the
determinants of the health problem (5).

The public health community continues to struggle with nomenclature for such 
variations. The term "cluster" can be defined as "a set of events occurring unusually 
close together to each other in time or space, in both time and space, or within the 
limits of demographic characteristics (e.g. persons in the same occupation)." 
"Cluster" is usually used to describe uncommon events (e.g., leukemia, suicide) and
tends to evoke emotional response from members of the public or from the media. 

A related term is "epidemic," historically used to describe aggregation of infectious
diseases: "an outbreak of a disease spreading rapidly from person to person" (6).
More recently, the concept has broadened to the following: "the occurrence in a 
community or region of cases of an illness, specific health-related behavior, or other 
health-related events clearly in excess of normal expectancy .... The number of 
cases indicating the presence of an epidemic will vary according to agent, size and 
type of population exposed, previous experience or lack of exposure to the disease and 
time and place of occurrence; thus, epidemicity is relative to the usual frequency of 
the disease in the same area, among the specified population, at the same season of 
the year" (7). It is prudent to be conscious of the fact that the term "epidemic"
evokes responses beyond these definitions. In late 1988, the British Public Health 
Laboratory Service used "epidemic" to describe an increase in reported numbers of 
cases of Salmonella enteritidis associated with contaminated chicken and eggs. The 
country's Chief Medical Officer, Sir Donald Acheson, advised caution in "using the
word epidemic when addressing the public because of its connotations with terrifying
diseases such as cholera and smallpox" (8). The term "outbreak" has less evocative
connotations. With all such definitions, a critical concept is the comparison of an 
observed number with what is usual or normal. The distinction made here is that 
"aberration" will be used to denote changes in the occurrence of health events that
are statistically significant when compared with usual or normal history. The 
definition of an epidemic may require the existence of an aberration; e.g., the 
Centers for Disease Control (CDC) declares that an epidemic of a specific strain of 
influenza is occurring only if the number of reported deaths exceeds a 95% confidence
limit in the forecast for two or more consecutive periods. In general, application of
the term "epidemic" may require epidemiologic conditions beyond the statistical ones,
e.g., laboratory isolates or resistance to vaccine.
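
The statistical part of such a rule can be sketched in a few lines of Python. This is only an illustration of the "two or more consecutive periods" logic: the observed counts and upper limits below are invented, the function name is hypothetical, and the forecasting model that would produce the 95% limits is assumed to exist elsewhere and is not reproduced here.

```python
def flag_consecutive_exceedances(observed, upper_limits, run_length=2):
    """Return the indices of periods that complete a run of at least
    `run_length` consecutive observations above their upper limit."""
    flagged = []
    run = 0  # number of consecutive exceedances ending at the current period
    for index, (count, limit) in enumerate(zip(observed, upper_limits)):
        run = run + 1 if count > limit else 0
        if run >= run_length:
            flagged.append(index)
    return flagged


# Hypothetical weekly deaths and hypothetical upper 95% forecast limits.
observed = [50, 62, 58, 70, 75, 61]
upper_limits = [60, 60, 60, 65, 65, 65]

flagged = flag_consecutive_exceedances(observed, upper_limits)
print(flagged)
```

In this made-up series the second period exceeds its limit but stands alone, so it is not flagged; only the fifth period (index 4), which follows another exceedance, meets the two-period criterion.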

In this chapter, "aberration" is used to describe statistical departures from a usual
distribution. It is important to understand that such departures do not necessarily 
signal the "onset of an epidemic" or the "presence of a cluster." Conversely, one can 
have an epidemic even in the absence of a statistical increase, such as when infant 
mortality is "low" but still higher than expected. The methods developed here are 
intended for routine use by the public health analyst, in conjunction with 
epidemiologic investigation and close communication with the source of the 
surveillance reports. 


Since the definition of surveillance implies ongoing data collection, perhaps the most 
fundamental question suggested by the analysis of a surveillance system is the 
following: When does the value of reported events signal a change in the process from 
past patterns? Although fundamental, the analysis required to address this question 
suggests additional questions. How are "past patterns" defined? If an outbreak 
occurred in the past, should this affect the definition of a change? Other than the 
disease or injury process itself, what other factors could cause a change? 

In the paragraphs below, we use the term "baseline" to denote historical data and
"current report" to denote the recent data on which the assessment is based.

Graph of Current and Past Experience 

State health departments report the numbers of cases of about 50 notifiable diseases 
each week to CDC's National Notifiable Diseases Surveillance System (NNDSS). The list
of health events is determined collaboratively by the Council of State and Territorial 
Epidemiologists and CDC (9,10). Each week provisional reports are published in the
Morbidity and Mortality Weekly Report (MMWR) and are made available to
epidemiologists, clinicians, and other public health professionals in a timely manner. 
Although the tables of the MMWR continue to provide important information, the volume 
of data and the need for ease of interpretation encouraged the development of a
graphic display to highlight unusually high or low numbers of reported cases.

A new analytic and graphical method was adopted for this system to achieve the 
following objectives: a) to portray in a single comprehensible figure the weekly 
reports of data for approximately 20 diseases and to compare those data with past
results, and b) to highlight for further analysis the results most likely to reflect either
long-term trends or epidemics. These objectives were formulated to reflect most 
recent behavior in as short a time period as possible for weekly publication, but a 
long enough period to assure stable results. To facilitate comprehension, the same 
method is used for all diseases portrayed. 

The analytic method currently used for constructing Figure 1 in the MMWR (see
Figure VI.12), called the "CDC MMWR Current/Past Experience Graph (CPEG)," compares
the number of reported cases in the current 4-week period for a given health event 
with historical data on the same condition from the preceding 5 years (11,12). 
Numbers of cases in the current month are listed to facilitate interpretation of 
instability caused by small numbers. 

The choice of 4 weeks as the "current period" was based on evidence that weekly 
fluctuation in data from disease reports usually reflects irregular reporting 
practices rather than actual incidence of disease. The use of 5 years of history 
achieves the objective of using the same model for all conditions portrayed, since 
some health events were made notifiable only recently (e.g., acquired immunodeficiency 
syndrome (AIDS) and legionellosis).

Also, modelling of reported influenza incidence has shown that more accurate forecasts 
are based on more recent data (13). To increase the historical sample size and to 
account for any seasonal effect, the baseline is taken to be the average of the 
reported number of cases for the preceding 4-week period, the corresponding 4-week 
period, and the following 4-week period, for the previous 5 years. This yields 15 
correlated observations, referred to as the historical observations, or "baseline" 
(Figure VI.1). 

The deviation from unity of the ratio of the current 4-week total to the historical 
average is indicative of a departure from past patterns. We plot this ratio on a 
logarithmic scale so that an n-fold increase projects to the right the same distance 
as an n-fold decrease projects to the left, and no change from past patterns (1:1) 
produces a bar of zero length (14). To distinguish the conditions that may require 
further investigation, the hatching on the bars begins at a point based on the mean 
and standard deviation of the historical observations.* 
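
The calculation behind one bar of the graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name cpeg and the baseline counts are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed because the text does not say which form is used. The limits follow the footnoted rule of 1 plus or minus 2 times the standard deviation divided by the mean.

```python
import math
import statistics

def cpeg(current_total, baseline_totals):
    """Return (ratio, lower_limit, upper_limit, flagged) for one condition.

    current_total   -- cases reported in the current 4-week period
    baseline_totals -- the 15 historical 4-week totals (3 periods x 5 years)
    """
    mean = statistics.mean(baseline_totals)
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = statistics.stdev(baseline_totals)
    ratio = current_total / mean
    lower = 1 - 2 * sd / mean
    upper = 1 + 2 * sd / mean
    flagged = ratio < lower or ratio > upper
    return ratio, lower, upper, flagged

# Hypothetical baseline: 15 four-week totals with mean 10 and sd 2.
baseline = [8, 12] * 7 + [10]
ratio, lower, upper, flagged = cpeg(15, baseline)
bar_length = math.log(ratio)  # signed bar length on the logarithmic scale
print(ratio, lower, upper, flagged, bar_length)
```

With these made-up counts the ratio is 1.5 against limits of 0.6 and 1.4, so the bar would be hatched beyond the upper limit.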

An evaluation of this method shows that it has good statistical robustness to patterns 
in the data and high sensitivity and predictive value positive for epidemiologically 
confirmed outbreaks (15). An outbreak of rubella detected by this method proved to be 
of substantial public health importance (16). Recent increases beyond historical 
limits in reporting of aseptic meningitis reflected increased disease activity 
primarily in the northeastern United States (17). 


The method used by CDC to estimate excess mortality associated with influenza was 
developed from a 1932 study that defined the expected number of weekly deaths from 
pneumonia and influenza, or from all causes, as the median number of deaths for a 
given week during non-epidemic years (18). "Excess deaths," then, was defined as the 
difference between the observed and the conditional expected numbers, a one-period- 
ahead forecast. Later, a regression model was fitted to weekly pneumonia and 
influenza data from U.S. cities to calculate an expected number of deaths (19). In 
1979, CDC proposed a new method to estimate expected deaths using a body of methods 
called time-series (20). More recently, a method forecasting separate expected 
numbers by age group has been investigated (13). 
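
The earliest of these definitions, the median for a given week over non-epidemic years, is simple enough to sketch directly. The function names and all death counts below are hypothetical, and the choice of which years count as epidemic years is left to the analyst; the later regression and time-series methods are not shown.

```python
import statistics

def expected_deaths(deaths_by_year, epidemic_years, week):
    """Median deaths for a given week (0-based index) over non-epidemic years."""
    values = [weeks[week] for year, weeks in deaths_by_year.items()
              if year not in epidemic_years]
    return statistics.median(values)

def excess_deaths(observed, expected):
    """Observed minus expected number of deaths."""
    return observed - expected

# Hypothetical counts: three weeks of deaths for each of five years.
history = {
    1: [100, 110, 105],
    2: [98, 112, 101],
    3: [150, 190, 170],  # treated here as an epidemic year
    4: [102, 108, 107],
    5: [99, 111, 103],
}
expected = expected_deaths(history, epidemic_years={3}, week=1)
print(expected, excess_deaths(140, expected))
```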

The methodology of time series is appropriate for data available sequentially over 
time. A time-series model generally comprises components estimating the effect of 
secular trend, cycles, or year-to-year seasonal patterns. The process of model 
fitting consists of identification, estimation, and diagnostic validation. One then 
evaluates competing models on the basis of the fit of the models to the observed data 
and of the accuracy of the forecasts. 

*Historical limits of the ratio of current reports to the historical mean are calculated 
as 1 plus or minus 2 times the standard deviation divided by the mean, where the mean and 
the standard deviation are calculated from the 15 historical 4-week periods. 


Most common methods of time-series analysis, such as the Auto Regressive Integrated 
Moving Average (ARIMA) models (21), are appropriate for relatively long series of data 
that exhibit certain regular properties over the entire series. Differencing, or 
forming a new series by subtracting adjacent observations, is generally used to create 
a series with a stationary mean, that is without trend. An additional property, 
stationarity of the variance, is generally required, so that the process does not 
become more or less variable over time. An autoregressive model includes terms that 
model the data at one point in time as a function of previous data. A moving-average 
term creates a series from averages of adjacent observations and is used to model 
cycles in the data. 
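
Differencing, as described above, can be illustrated with a short Python sketch. The function name and the weekly counts are hypothetical, and the lag parameter is an added convenience for differencing at a seasonal period; fitting an ARIMA model itself is not shown.

```python
def difference(series, lag=1):
    """Return the series of differences x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical weekly counts with a steady upward trend of 3 per week.
counts = [20, 23, 26, 29, 32, 35]
print(difference(counts))         # first differences remove the linear trend
print(difference(counts, lag=2))  # a longer lag, as for a seasonal period
```

For a series with a linear trend, the first differences are constant, which is the sense in which the differenced series has a stationary mean.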

The advantage of time-series models for surveillance over other modeling methods, such 
as regression, is that the estimation process accounts for period-to-period 
correlations and seasonality, as well as long-term secular trends. A more detailed 
description of the concepts used in time series is available elsewhere (21). 

Scan Statistic 

Consider this surveillance question: Is the number of cases reported for a certain 
time period excessive? While ARIMA time-series methods provide one approach to the 
answer, often the mechanics of this analysis are complex. The scan statistic (22) 
offers a relatively simple alternative in this situation. The scan statistic is the 
maximum number of reported cases (i.e., events) in an interval of predetermined length 
over the time frame of interest. It is used to test the null hypothesis of uniformity 
of reporting against an alternative of temporal clustering. Consider the following 
setting. Surveillance data are reported over a time period T, containing k intervals 
of equal length: 

    n_1     n_2     ...     n_k
|-------|-------|--...--|-------|
    t_1     t_2     ...     t_k

where the intervals $t_i$, $i = 1, 2, \ldots, k$, are of equal length $t$ 
and $T = t_1 + t_2 + \cdots + t_k$. 

The total number of events reported in the entire time period is called N and is the 
sum of the numbers of events in each of the intervals, $N = n_1 + n_2 + \cdots + n_k$. Let 
$n = \max\{n_i\}$, $i = 1, 2, \ldots, k$, be the largest report in any of the intervals. Then compute 
$L = T/t$, the number of intervals in the entire time period. 

The statistical question addressed by the scan statistic is: What is the probability 
that the maximum number of cases in any interval of length t is equal to or exceeds n? 

For example, suppose the frequencies of trisomies among karyotyped spontaneous abortions for a 
defined geographic area, by calendar month of last menstrual period in 1992, are as 
follows: 

[Table not recoverable from the source: number of cases by calendar month, 1992] 

What is the probability of 10 or more trisomies in December given there were a total 
of 40 in 1992? Using the notation defined above, N = 40; T = 12; L = 12/1 = 12; n = 
10; and t = 1. Then from tabulated values (23) the probability of 10 or more trisomies 
in December, given 40 for the year, is 0.083. 
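
Where tabulated values are not at hand, the probability can be approximated by simulation. The sketch below is an illustration only and is not the method behind the tables: it assumes that event times are continuous and uniformly distributed over the period and that the window slides continuously, whereas the example above uses monthly counts. The function names, the number of trials, and the seed are arbitrary choices.

```python
import bisect
import random

def scan_statistic(times, window):
    """Largest number of events in any interval of length `window`."""
    times = sorted(times)
    best = 0
    for i, start in enumerate(times):
        # Count events with start <= time < start + window.
        end = bisect.bisect_left(times, start + window)
        best = max(best, end - i)
    return best

def scan_p_value(n, total, intervals, trials=4000, seed=0):
    """Monte Carlo estimate of P(scan statistic >= n) when `total` events
    fall uniformly over a period scanned by a window of 1/intervals."""
    rng = random.Random(seed)
    window = 1.0 / intervals
    hits = 0
    for _ in range(trials):
        times = [rng.random() for _ in range(total)]
        if scan_statistic(times, window) >= n:
            hits += 1
    return hits / trials

# N = 40 events, L = 12 intervals, n = 10 as in the example above.
print(scan_p_value(n=10, total=40, intervals=12))
```

The estimate varies with the seed and the number of trials; it can be compared with the tabulated value of 0.083 as a rough check.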





      L = 8              L = 12           L = (illegible)
   n      p           n      p           n      p
  14    0.002        11    0.007        10    0.007
  13    0.040        10    0.083         9    0.082
  14    0.012        11    0.024        10    0.021
  15    0.003        12    0.006        11    0.005
  14    0.042        11    0.064        10    0.053

If the results of the scan statistic are to be useful, the lengths of the entire time 
frame and the scanning interval must be determined a priori. The lack of extensive 
tabulated values and the computer-intensive calculations for large sample sizes limit 
the usefulness of the method. Approximations to the exact distribution are described 
elsewhere (23-25). 


Given cases of a health event reported from a defined geographic area over a defined 
time period, can we say that the cases occur unusually close together in both space 
and time? That is, do they form a spatial-temporal cluster? Traditional approaches 
to the analysis of health-event aggregation in geographic areas have been based on 
randomization arguments (26,27). A representative discussion follows. 

One proposed method divides the study area into subareas (e.g., counties or census 
tracts) and the study time period into intervals of constant length (e.g., month or 
year) (28). The cases of the health event in each time-space "cell" are then 
counted. The maximum count within any time interval is summed across all subareas 
to obtain a test statistic. This method assumes equal population density across all 
area cells and has limitations (29). 

In Knox's method, all possible pairs of cases are examined, and each pair is 
classified according to whether the case-patients in the pair lived "close" together 
and had onset of the health problem (or report) "close" in time, resulting in the 
2-by-2 table: 

                          Reports close in time?
                             Yes       No
Reports close       Yes       a         b
in space?           No        c         d

Under the hypothesis of no clustering, the expected number may be calculated in the 
usual way, with an adjustment in the significance test, since the statistic is based 
on pairs of cases (30). A brief example follows. 

Consider cases of a disease with the following spatial and temporal relationships: 

                        Close in space?
                      Yes     No     All
Close in time?  Yes     1      4       5
                No      5     18      23
                All     6     22      28

The test statistic to be computed is X = number of pairs close in space and time, 1 in 
this example. We use row and column marginal totals to compute an expected value for 
this cell: (6 x 5) / 28 = 1.07. Now use the Poisson distribution to compute the 
probability of seeing one (or more) pairs of cases close in space and time, given that we 
expect 1.07; this value is at least 0.63. Therefore, we conclude that these data 
provide no evidence for space/time clustering. 
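
The pair counting and the Poisson calculation can be sketched as follows. This is a minimal illustration: the function names are hypothetical, cases are assumed to be given as (x, y, t) coordinates with Euclidean distance, and the critical distances are supplied by the analyst. The adjustment to the significance test mentioned above (30) is not included.

```python
import math
from itertools import combinations

def knox_table(cases, space_limit, time_limit):
    """Classify every pair of cases; cases are (x, y, t) tuples.

    Returns (a, b, c, d): pairs close in space and time, close in space only,
    close in time only, and close in neither.
    """
    a = b = c = d = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(cases, 2):
        near = math.hypot(x1 - x2, y1 - y2) <= space_limit
        soon = abs(t1 - t2) <= time_limit
        if near and soon:
            a += 1
        elif near:
            b += 1
        elif soon:
            c += 1
        else:
            d += 1
    return a, b, c, d

def knox_test(a, b, c, d):
    """Expected count for cell a and the Poisson probability of a or more."""
    pairs = a + b + c + d
    expected = (a + b) * (a + c) / pairs
    below = sum(math.exp(-expected) * expected ** k / math.factorial(k)
                for k in range(a))
    return expected, 1 - below

# Worked example: 1 close pair, margins of 6 (space) and 5 (time), 28 pairs.
# The remaining cells (5, 4, 18) follow from those marginal totals.
expected, p = knox_test(1, 5, 4, 18)
print(round(expected, 2), round(p, 2))  # prints 1.07 0.66
```

The probability of about 0.66 is consistent with the statement above that the value is at least 0.63.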

A criticism of Knox's method is that the choice of the critical time and space 
distances is arbitrary. This problem was addressed for the question of spatial 
clustering (31), and the method does not require spatial boundaries or assessment of 
the entire population base. An alternative approach is demonstrated by Williams (32), 
with a sensitivity analysis of the time and space critical values. 

A second criticism of Knox's method is that it makes no allowance for edge effects 
which arise either from natural geographic boundaries (e.g., coastlines) or because 
there are unrecorded cases outside the designated study region. A new method (33) 
addresses this by altering the interpretation of expected pairs of close cases and 
replacing the simple count of close pairs by a weighted sum. Recently, this new 
method has been applied to test the hypothesis that many non-outbreak cases of 
Legionnaires' disease in Scotland are not sporadic and to attempt to pinpoint cases 
clustering in space and time (34). 

It is important to emphasize that because of the diverse and complicated nature of 
clusters, there is no single test to assess them. The statistical sources suggested 
here are intended only to augment other epidemiologic methods in a systematic, 
integrated approach (35), coupled with flexibility in methods of analysis and 
interpretation of significance levels. 


Statistical methods are the basis of many aspects of evaluating a public health 
surveillance system (36). For example, the question of completeness of a surveillance 
system is fundamental to the system's usefulness. One approach to the assessment of 
completeness involves a capture-mark-recapture technique, developed for the 
enumeration of wildlife populations (37) and used by the U.S. Census Bureau (38). 
The method requires two parallel surveillance systems, or a surveillance system and a 
survey, measuring the incidence of a single health event, and provides an estimate of 
the true total number of cases of that health event and the completeness of coverage of 
the two systems. 

The Chandra Sekar-Deming (CSD) and Lincoln-Peterson Capture-Recapture (LPCR) Methods 
suggest the following structure for the analysis. Suppose two surveillance systems 
for the same health event report R and S totals respectively for some time period. In 
addition, suppose it is possible to match the cases so that we know which C of the 
cases are reported to both surveillance systems. This structure suggests the 
following 2-by-2 table: 

                                Surveillance system 1
Surveillance system 2     Cases reported   Cases not reported   All cases
Cases reported                  C                 N_2               S
Cases not reported              N_1
All cases                       R

The CSD and LPCR methods estimate N, the total number of cases from the combined 
information, and provide a confidence interval for that estimate. Using the notation 
suggested in the table above, 

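$N = \frac{(R+1)(S+1)}{C+1} - 1$

$\mathrm{Var}(N) = \frac{(R+1)(S+1)\, N_1 N_2}{(C+1)^2 (C+2)}$

$95\%\ \mathrm{CI}(N) = N \pm 1.96 \sqrt{\mathrm{Var}(N)}$,

where $N_1 = R - C$ and $N_2 = S - C$ are the numbers of cases reported to only one of the two systems. 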

Thus the completeness of each surveillance system can be calculated as follows: 

Completeness of #1 = R / N 
Completeness of #2 = S / N. 

Consider the following example. There exist two independent surveillance systems for 
hepatitis A for a location with stable population. Suppose that the events identified 
in either of the two systems are true events, that the matching procedure identifies 
all true matches, and only true matches are identified. 

                                Surveillance system 1
Surveillance system 2     Cases reported   Cases not reported   All cases
Cases reported                 790                60               850
Cases not reported              50                 X
All cases                      840                                   N

The estimated number of cases missed by both systems is 

X = (50 x 60) / 790 = 3.8, or approximately 4. 
So, the estimated number of cases in the population under surveillance is: 

N = 790 + 50 + 60 + 4 = 904. 

The formulas above yield a 95% confidence interval for N of 904 ± 4. The completeness 
of surveillance system #1 is 840/904 or 0.93, and that of surveillance system #2 is 
850/904 or 0.94. 
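
The arithmetic of this example can be checked with a short Python sketch. The function name is hypothetical, and the variance assumes that the two off-diagonal terms in the formula are the cases reported to only one system (R - C and S - C), which reproduces the interval of about plus or minus 4 quoted above.

```python
import math

def capture_recapture(r, s, c):
    """Estimate total cases from two systems reporting r and s cases,
    of which c are reported to both."""
    n_hat = (r + 1) * (s + 1) / (c + 1) - 1
    # Assumption: N_1 = r - c and N_2 = s - c (cases seen by one system only).
    var = ((r + 1) * (s + 1) * (r - c) * (s - c)) / ((c + 1) ** 2 * (c + 2))
    half_width = 1.96 * math.sqrt(var)
    return {
        "N": n_hat,
        "ci": (n_hat - half_width, n_hat + half_width),
        "completeness_1": r / n_hat,
        "completeness_2": s / n_hat,
    }

# Hepatitis A example: R = 840, S = 850, C = 790.
result = capture_recapture(r=840, s=850, c=790)
print(result)
```

Calling the function again with c=820 gives the completeness figures obtained when the matching criteria are relaxed.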

The usefulness of results from this capture-recapture calculation is based on four 
assumptions: 

• Surveillance is done for a closed population. 

• The matching procedure successfully identifies all true matches and, conversely, 
only true matches are identified. 

• All events identified in either of the two systems are true events. 

• The two systems are independent. 

Clearly, these are seldom if ever satisfied for public health surveillance 
systems; however, this should not preclude the method as an investigative tool. 
For example, at the national level, the lack of personal identifiers precludes 
exact matching of cases between surveillance systems. However, other information 
(age, gender, county, date of onset) may allow probability matching or estimates 
of the overlap. Application of the LPCR method with more stringent or relaxed 
matching criteria will yield bounds on the completeness of coverage still useful 
for surveillance evaluation. For example, if we relax the matching criteria in 
the table above so that 820 cases are reported to both systems, analogous 
calculations show that the completeness of system #1 is 0.96, and that of system 
#2 is 0.98. 


No single method can be used to detect all epidemics or all types of aberrations. 
Several questions provide a framework for choosing an analytic method. 

What is the purpose of the surveillance system? The data used for the CPEG 
analyses are reported weekly by state health departments. Although each state 
analyzes its own data, patterns may be apparent from the aggregated national picture 
that may facilitate prevention and intervention efforts. Additionally, the data are 
maintained historically for the archival purposes of measuring trends and assessing 
the effects of interventions. 

What is the purpose of the analytic method? Since a single method cannot be 
expected to distinguish between a change in historical trend and a one-time outbreak 
with unsustained increases, the analyst must identify the purpose of the analysis 
before choosing an analytic method. If the nature of the data is determined and the 
questions are well-defined, the results of the analytic method can be used to augment 
other sources of information. 

The purpose of CPEG is to facilitate the routine analysis of surveillance data and to 
supplement other sources of information. The method is not useful for conditions with 
long-term historical trends. When the data have complex patterns, it may be helpful 
to remove (simplify) some of this pattern by modeling. The classical methods of time- 
series analysis are appropriate for this situation, but these may not be accessible to 
the practicing public health official. 

Which conditions should be monitored? Routine analysis should be reserved and 
adapted for conditions for which there are public health interventions. The CPEG 
methodology is most appropriate for conditions with historical trends that do not 
exhibit frequent changes in trend or level and that occur often enough so that a 
single case or two does not constitute a significant flag. If the raw data are not 
already analyzed for trend and period effects, and the variance of the numerator 
(present cases) cannot be assumed to have the same variance as the observations in the 
denominator (historical data) , and if the series exhibits considerable correlation for 
first-order (adjacent) observations and beyond, the CPEG method may be less powerful. 


For rare conditions, the instability caused by small numbers of reported cases may 
make the results unsuitable for repeated use. 

What is the (person, place, or time) unit of analysis? We chose national data 
for presentation of CPEG. The objective was to use as short and recent a time period 
as possible for weekly publication, thus making the results useful for timely 
intervention. However, variability in weekly reports reflecting factors other than 
the disease process (e.g., delayed reports due to outbreaks) made the results 
unstable. We then chose a 4-week window. 

Because of the interest in analytic techniques for the analysis of aberrations in 
surveillance data at the state level, six state health departments evaluated the 
usefulness of the "CPEG" (39). During the 4-month period of study, a total of 210 
episodes were observed, of which 27 episodes were flagged as exceeding historical 
limits; one state had no episodes of unusual reporting. Overall, 14 episodes (52%) 
represented epidemiologically confirmed outbreaks. Many were small, and none were 
detected when aggregated with other state data for the national analyses. Each 
disease exceeded historical limits at least twice during the study period, and for all 
but meningococcal disease, at least one incident represented an outbreak. Although 
the numbers are clearly small, the proportion of episodes that represented outbreaks 
varies. This is expected for conditions with different epidemiology. 

The five outbreaks that the health departments knew about but that were not detected by 
the CPEG method highlight some of its limitations. In three outbreaks, cases were not 
reported nationally as current reports; thus, they were not included with the data 
used for the calculation. The other two outbreaks were not detected because of 
concurrent increases in the corresponding baseline. 

What provision is there for updating or correcting the data using later 
reports? In the NNDSS, cases are reported as early as possible and then later 
confirmed or modified. The methodology of CPEG is applied to the provisional 
(earliest reported) data. In our study of six states, two of the five outbreaks that 
were not detected reflected late reports not included in the current reporting period. 


How is the baseline determined? The choice of 5 years as a baseline period was 
based on a consideration of appropriate sample size balanced by a desire to use the 
same method for all conditions. Although a longer baseline might be used for some 
conditions with a long reporting history, epidemics or changes in trend in the 
baseline will increase the variance of the baseline and thus offset any benefit of 
additional data. An additional source of variation may be increases in reporting due 
to intensive investigation. In these cases, the analyst may choose to omit or adjust 
the increased baseline data. 

How are outbreaks in the baseline handled? CPEG as presented here does not adjust 
for epidemics in the baseline. The result of this is a progressive decline in 
sensitivity when an outbreak moves in and then out of the baseline window. To 
address this point, one could use a median of the baseline reports (rather than a 
mean). Unfortunately, this replacement invalidates the technique used to compute the 
point for signalling aberrations, and the alternative methods for calculating this are 
not as accessible to the practicing epidemiologist as the CPEG methodology. 

What are the sensitivity and predictive value positive of the method? 
Applied by states, CPEG detected 14 of 19 outbreaks (74%), and 14 of the 27 
episodes that exceeded historical limits (52%) were actually outbreaks. The sensitivity (74%) and 
predictive value positive (52%) of CPEG in states are therefore quite high. Partly 
because of the use of provisional data, we use the mean of the historical baseline in 
the calculation. We investigated the predictive value positive of the CPEG from six 
state health departments by asking each department to follow up on aberrations 
detected by this system. In addition, we asked that outbreaks that came to their 
attention through other sources but had not been identified by CPEG be noted. 
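
The two proportions can be reproduced from the counts reported for the six states. The sketch below is only a restatement of that arithmetic; the function names are hypothetical.

```python
def sensitivity(detected_outbreaks, all_outbreaks):
    """Proportion of known outbreaks that the method flagged."""
    return detected_outbreaks / all_outbreaks

def predictive_value_positive(detected_outbreaks, flagged_episodes):
    """Proportion of flagged episodes that were confirmed outbreaks."""
    return detected_outbreaks / flagged_episodes

print(round(100 * sensitivity(14, 19)))                # 74
print(round(100 * predictive_value_positive(14, 27)))  # 52
```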

What are the mechanics of operation? For any analytic method to be useful, it 
must be easily implemented in the routine work of the practicing epidemiologist. In 
evaluating the use of CPEG at the national level, an epidemiologist routinely 
evaluated each aberration, analyzed state distributions, and conveyed results to each 
CDC program responsible for the control of the condition. Additional information was 
provided by epidemiologists in state health departments. Investigation was based on 
this evidence in addition to that obtained through other analysis. Eventually, state 
health departments will have the software to generate CPEG locally. 

Emergent methods provide opportunities for the future of surveillance analysis. Many 
methods of pattern recognition are based on Bayesian concepts, in which a different 
approach is taken to the process that generates the data--in this context, reports of 
a health event. 

Classical statistical theory regards the data as arising from a process with unknown 
but constant parameters. The objective of classical methods, then, is to use the 
observed data to estimate or make inferences about the unknown values. Bayesian 
methods regard the parameters as having prior distributions, independent of the data, 
and the data are used to update or refine our idea of this distribution. "The gain in 
introducing the prior [distribution] is partly that it provides a way of injecting 
additional information into the analysis and partly that there is a gain in logical 
clarity" (40) . 

In the application to data generated over time and space as public health surveillance 
reports, the Bayesian approach recognizes the value of information beyond the mere 
data history (e.g., a change in the definition of a reportable case of AIDS) (41). In 
such circumstances, no statistical model can be expected to predict such occurrences 
using historical data only. "There is a tendency to overfit [sic] a particular past 
realization at the expense of the unrealized future" (42) . It is necessary to have a 
system in which people can convey their information to the method and have the method 
convey this uncertainty in a way that is useful for intervention and control. 

One important application of Bayesian methodology is to increase the stability of 
observed rates of health events on the basis of data for small populations. For 
example, county-level mapping may provide the resolution necessary to identify regions 
with potentially elevated risk, but the high variability of observed rates in counties 
with small populations may mask any underlying patterns. A two-stage empirical Bayes 
procedure (43) addresses this problem by augmenting information for one county with 
that of all other counties. Devine (44) applied this method to mapping of injury- 
related mortality rates for the United States from 1979 through 1987. This work 
represents an important step towards producing meaningful maps for small areas. 
However, sensitivity to model assumptions and consideration of spatial dependence 
remain areas for investigation. 
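
The idea of borrowing strength from other counties can be illustrated with a simple shrinkage estimator. The sketch below is not the two-stage procedure cited above (43,44); it swaps in a basic method-of-moments empirical Bayes estimator in which each county's rate is pulled toward the overall rate, more strongly when the county's population is small. The function name and all counts are hypothetical.

```python
def eb_smooth(deaths, populations):
    """Shrink each area's raw rate toward the overall rate."""
    total_d = sum(deaths)
    total_n = sum(populations)
    overall = total_d / total_n
    raw = [d / n for d, n in zip(deaths, populations)]
    mean_n = total_n / len(populations)
    # Moment estimate of the between-area variance of the true rates.
    between = sum(n * (r - overall) ** 2
                  for n, r in zip(populations, raw)) / total_n
    between = max(between - overall / mean_n, 0.0)
    smoothed = []
    for n, r in zip(populations, raw):
        # Weight on the area's own rate; it approaches 1 as n grows.
        weight = between / (between + overall / n) if between > 0 else 0.0
        smoothed.append(overall + weight * (r - overall))
    return smoothed

# Hypothetical counties: deaths and populations.
deaths = [5, 50, 300]
populations = [1000, 50000, 100000]
for raw, est in zip((d / n for d, n in zip(deaths, populations)),
                    eb_smooth(deaths, populations)):
    print(f"raw {raw:.5f}  smoothed {est:.5f}")
```

In this made-up example the smallest county's rate moves most of the way toward the overall rate, whereas the largest county's rate is nearly unchanged.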



1. Goodman RA, Buehler JW, Koplan JP. The epidemiologic field investigation: 
science and judgment in public health practice. Am J Epidemiol 1990;132:9-16. 

2. Openshaw S, Taylor PJ. The modifiable areal unit problem. In: Wrigley N, 
Bennett RJ, eds. Quantitative geography: a British view. London: Routledge 
and Kegan Paul, 1981. 

3. Thacker SB, Berkelman RL, Stroup DF. The science of public health surveillance. 
J Publ Hlth Pol 1989;10:187-203. 

4. Lilienfeld AE, Lilienfeld DE. Foundations of epidemiology. 2nd edition. 
Oxford, England: Oxford University Press, 1980. 

5. MacMahon B, Pugh TF. Epidemiology: principles and methods. Boston, Ma.: Little, 
Brown and Co., 1970. 

6. Baker AD, Margerison FM. New medical dictionary. London, England: Northcliff, 

7. Last JM. A dictionary of epidemiology. 2nd edition. Oxford, England: Oxford 
University Press, 1988. 

8. London Times, January 11, 1989. 

9. Thacker SB. The surveillance of infectious diseases. JAMA 1983;249:1181-5. 

10. Centers for Disease Control. Summary of notifiable diseases, United States, 1990. 
MMWR 1990;39(53). 

11. Stroup DF, Williamson GD, Herndon JL, Karon JM. Detection of aberrations in the 
occurrence of notifiable diseases surveillance data. Stat Med 1989;8:323-32. 

12. Centers for Disease Control. Proposed changes in format for presentation of 
notifiable disease report data. MMWR 1989;38(47):805-9. 

13. Stroup DF, Thacker SB, Herndon JL. Application of multiple time series analysis 
to the estimation of pneumonia and influenza mortality, by age, 1962-1983. Stat 
Med 1989;7:1045-59. 

14. Morgenstern H, Greenland S. Graphing ratio measures of effect. J Clin 
Epidemiol 1990;43:539-42. 

15. Stroup DF, Wharton M, Kafadar K, Dean AG. An evaluation of a method for 
detecting aberrations in public health surveillance data. In press: Stat Med. 

16. Centers for Disease Control. Increase in rubella and congenital rubella 
syndrome--United States, 1988-1990. MMWR 1991;40:93-9. 

17. Centers for Disease Control. Aseptic meningitis--New York State and United 
States, weeks 1-36, 1991. MMWR 1991;40(45):773-5. 

18. Collins SD. Excess mortality from causes other than influenza and pneumonia 
during influenza epidemics. Publ Health Rep 1932;47:2159-80. 

19. Serfling RE. Methods for current statistical analysis of excess pneumonia- 
influenza deaths. Public Health Rep 1963;78:494-505. 

20. Choi K, Thacker SB. An evaluation of influenza mortality surveillance, 1962- 
1979. I. Time series forecasts of expected pneumonia and influenza deaths. 
Amer J Epidemiol 1981;113:215-26. 

21. Box GEP, Jenkins G. Time series analysis: forecasting and control. San Francisco, 
Ca.: Holden-Day, 1976. 

22. Wallenstein S. A test for detection of clustering over time. Am J epidemiol 

175// 7* 

23. Naus JI . Approximations for distributions of scan statistics. J Amer Stat Assn 

24. Wallenstein S, Neff N. An approximation for the distribution of the scan 
statistic. Stat Med 1987;6:197-207. 

25. Glaz J. Approximations and bounds for the distribution of the scan statistic. 
J Amer Stat Assn 1989;84:560-6. 

26. Mantel N. The detection of disease clustering and a generalized regression 
approach. Cancer Res 1967;27:209-20. 

27. Aldrich TE, Wilson CC, Warner SS, Easterly CE. Studying case clusters: a primer 
for disease surveillance. Am J Epidemiol 1989;120:223-30. 

28. Ederer F, Myers MH, Mantel N. A statistical problem in space and time: do 
leukaemia cases come in clusters? Biometrics 1964;20:626-39. 

29. Knox EG. The detection of space-time interaction. Appl Statist 1964;13:25-9. 

30. David FN, Barton DE. Two space time interaction tests for epidemicity. Brit J 
Prev Soc Med 1966;20:44-8. 

31. Cuzick J, Edwards R. Spatial clustering for inhomogeneous populations. 
J R Statist Soc 1990;B52:73-104. 

32. Williams EH, Smith PG, Day NE, et al. Space-time clustering of Burkitt's 
lymphoma in the West Nile district of Uganda: 1961-1975. Brit J Cancer 

33. Diggle PJ, Chetwynd AG, Haggkvist R. Second order analysis of space-time 
clustering. Lancaster, Pa.: Lancaster University, 1991. (Department of 
Mathematics technical report) . 

34. Bhopal RS, Diggle PJ, Rowlingson B. Pinpointing clusters of apparently sporadic 
cases of legionnaire's disease. BMJ 1992;304:1022-7. 

35. Centers for Disease Control. Guidelines for investigating clusters of health 
events. MMWR 1990; 39 (No. RR-11). 

36. Centers for Disease Control. Guidelines for the evaluation of surveillance 
systems. MMWR 1988;37(S-5). 

37. Eberhardt LL. Appraising variability in wildlife populations. J Wildlife 
Management 1978;42:207-38. 

38. Wolter KM. Accounting for America's uncounted and miscounted. Science 

39. Wharton M, Price W, Hoesly F et al. Evaluation of a method for outbreak 
detection in six states. Am J Prev Med (in press). 

40. Cox DR, Hinkley DV. Theoretical Statistics. London, England: Chapman and Hall, 

41. Selik RM, Buehler JW, Karon JM et al. Impact of the 1987 revision of the case 
definition of acquired immune deficiency syndrome in the United States. J 
Acquired Immune Deficiency Syndromes 1990;3:73-82. 

42. Harrison PJ, Stevens CF. Bayesian forecasting (with discussion). J Royal Stat 
Soc 1976;38:205-47B. 

43. Tsutakawa RK. Mixed model for analyzing geographic variability in mortality 
rates. J Am Stat Assoc 1988;83:37-42. 

44. Devine OJ. A modified empirical Bayes approach for stabilizing mortality rates 
in areas with small populations. Proceedings of the National Meeting of the 
American Statistical Association, Atlanta, Ga., August, 1991. 


Chapter VII 


Richard A. Goodman 

Patrick L. Remington 

Robert J. Howard 

"All I know is just what I read in the papers. 

Will Rogers 


Standard definitions for public health surveillance specify the requirement for the 
timely dissemination of findings to those who have contributed and others who need to 
know (1-3). In the United States, surveillance findings have been disseminated 
through the Morbidity and Mortality Weekly Report (MMWR) series of publications, 
public health bulletins in states, and special reports in peer-reviewed journals. 
However, even though new technologies and epidemiologic methodologies have 
dramatically improved the collection and analysis of surveillance data, public health 
programs have lagged in developing effective approaches to the dissemination of 
surveillance findings--and to the ultimate successful communication of those findings. 

As recently as the 1970s, public health surveillance in the United States focused 
almost exclusively on the detection and monitoring of cases of specific communicable 
diseases, and surveillance data were disseminated primarily in a basic tabular format. 
However, surveillance efforts have expanded rapidly and now include chronic diseases, 
injuries, occupationally acquired conditions, and other problems. In addition, 
surveillance encompasses problems as diverse as personal behavior (e.g., cigarette 


smoking and seat-belt use); environmental insults (e.g., hazardous materials 
incidents); and preventive practices (e.g., Pap smears and mammographic screening). 

Because of the fundamental changes in public health programs and priorities, programs 
at all levels require innovative approaches to convey surveillance findings to new and 
more diverse constituencies. This chapter provides a practical framework for 
optimizing dissemination and communication of information developed through public 
health surveillance efforts. 


Surveillance has been characterized as a process that provides "information for 
action." This concept is inherently consistent with one definition that described 
communications as "...a process, which is a series of actions or operations, always in 
motion, directed toward a particular goal" (5). On the basis of this definition, 
then, public health programs must ensure more than the mere transmission or 
dissemination of surveillance results to others; rather, surveillance data should be 
presented in a manner that facilitates their consequent use for public health actions. 
One fundamental concept is that the terms "dissemination" and "communication" cannot 
be used interchangeably. Dissemination is a one-way process through which information 
is conveyed from one point to another. In comparison, communications is a loop 
involving at least a sender and a recipient and is a collaborative process. The 
communicator's job is completed when the targeted recipient of the information 
acknowledges receipt and comprehension of that information. 

A basic framework for disseminating the results of public health surveillance with the 
intent of communicating can be adapted from fundamental models for communications. 
One such model--which emphasizes the effect of communications--includes the sender, the
message, the receiver, the channel, and the impact (3). The sender is the person 
responsible for surveillance of each health condition being monitored. For 
applications in public health practice, this model can be modified (see Table VII.1).

Each of these steps is discussed in greater detail in the paragraphs below. They
should all be read with the understanding that one should never disseminate more
information than s/he can evaluate and revise, as needed, during the communications
process.

Establish Message 

The primary message or communications objective for the findings of any public health 
surveillance effort should reflect the basic purposes of the surveillance system. In 
this textbook, the purposes of surveillance systems have been described (Chapters I 
and II). For each of these categories, the findings and interpretation of
surveillance data may necessitate a different type of public health response. In 
addition to disseminating data to those who may have contributed, the communications 
objectives should also dictate the delivery of the information to the relevant target 
groups and the stimulation of appropriate public health action, as illustrated below. 

To detect and control outbreaks 

When the purpose of a surveillance system is to detect outbreaks or other occurrences 
of disease in excess of predicted levels, the primary communications objective should 
be to inform two groups: a) the population at risk of exposure or disease, and b) 
persons and organizations responsible for immediate control measures and other 
interventions. For example, when surveillance efforts detect influenza activity in a 
specific locality, public health agencies can promptly disseminate this information to 
health-care providers who may, in turn, intensify efforts to vaccinate or provide 
amantadine chemoprophylaxis to persons at high risk of complications from influenza. 
The release and timing of such messages should be carefully considered and coordinated 
with appropriate agencies. 

In the context of this example, the impact of releasing a message recommending the use 
of amantadine or influenza vaccine may be enhanced if the release has been coordinated 
with public health units, local pharmaceutical suppliers, and medical organizations. 
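
The idea of disease occurring "in excess of predicted levels" can be illustrated with a minimal sketch. The Python below flags a current count that exceeds a baseline computed from earlier periods. The function name, the sample counts, and the rule of mean plus two standard deviations are assumptions made for illustration only; they are not a method prescribed in this chapter.

```python
from statistics import mean, stdev


def exceeds_predicted(current_count, baseline_counts, n_sd=2.0):
    """Return True when the current count is above baseline mean + n_sd * SD.

    The mean-plus-two-standard-deviations rule is an assumed threshold chosen
    for illustration; it is not prescribed by this chapter.
    """
    threshold = mean(baseline_counts) + n_sd * stdev(baseline_counts)
    return current_count > threshold


# Hypothetical weekly counts of influenza-like illness for the same week
# in five previous years, followed by two hypothetical current counts.
baseline = [12, 15, 11, 14, 13]
print(exceeds_predicted(21, baseline))  # True: 21 is above roughly 16.2
print(exceeds_predicted(15, baseline))  # False: 15 is within the expected range
```

A flag of this kind would only prompt the communications steps described here; the decision to release a message would still rest on epidemiologic judgment and coordination with the appropriate agencies.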

To determine etiology and natural history of disease 

Public health surveillance for newly recognized or detected problems may be initiated 
to assist in determining the epidemiology, etiology, and natural history of such 
conditions. In such circumstances, the communications objective may simply be to
provide information which is sufficient to initiate surveillance.

For example, when eosinophilia-myalgia syndrome (EMS) was recognized in the United 
States in October 1989, a case definition was developed and disseminated to the public 
health community to enable the immediate implementation of national surveillance for 
EMS (4). Surveillance efforts were critical in characterizing the epidemiology and
natural history of EMS, as well as in assisting in the development of hypotheses 
regarding its cause. 

Evaluate control measures 

For many public health conditions, surveillance is the principal means for assessing 
the impact of control measures. Epidemiologic trends and patterns that are based on 
surveillance findings must be conveyed to persons involved in control efforts in order 
to refine control activities and guide the allocation of resources in support of those
activities.

Following a period of relative quiescence, as of the mid-1980s the incidence of 
measles in the United States surged. When surveillance indicated that vaccination 
coverage had declined substantially in some groups (e.g., children residing in inner-city
locations), key findings were conveyed to and used by public health programs and
primary care providers in targeting measles vaccination efforts. 

To detect changes in disease agents 

In addition to monitoring trends in the occurrence of public health problems, 
surveillance systems may be fundamental to the process of detecting changes in disease 
agents and the impact of these changes on public health. For example, in the late 
1980s in the United States, surveillance documented an increase in the incidence of 
tuberculosis--an increase substantially in excess of predicted levels. In addition to
this overall trend, transmission of multi-drug-resistant tuberculosis (MDR-TB) was
detected in health-care and prison settings (5). The public health implications of
these findings are similar to the basic considerations outlined above for detecting 
and controlling outbreaks: specifically, there is need for timely and effective 
notification of populations at risk and of organizations responsible for 
control/prevention measures. Therefore, in the case of MDR-TB, the communications 
objectives would include immediate notification of the public health community about
the problem with the intent of facilitating implementation of proper diagnostic,
therapeutic, and preventive measures. 

To detect changes in health practices 

Some surveillance systems monitor changes in health practices and behaviors in the 
population rather than changes in patterns of disease (6). This "life-style"
information is particularly important for problems such as chronic disease, for which 
trends in risk behavior often precede changes in health outcome by years or even 
decades. The communications objective in this context is often to increase awareness 
regarding the role of behavior in causing disease or injury. In addition, this 
information may be used to identify high risk groups in the population. 

For example, surveillance data regarding trends in cigarette smoking indicate that 
smoking rates have not declined among persons with lower educational attainment. 
Accordingly, surveillance data which characterize risk factors (such as smoking), 
outcomes, health services, and other related factors may guide public health programs 
and decision makers in the implementation of targeted communitywide or statewide 
intervention strategies (7). 

Facilitate planning of health policies 

For some conditions, the most appropriate control measure is promulgation of a public 
health policy. In this context, surveillance information about the public health 
impact of different conditions and problems must be effectively communicated to 
legislators and public health policy makers. 

For example, in California, surveillance information about smoking-attributable 
mortality, morbidity, and economic costs helped in enacting Proposition 99. This 
legislation provided for a 25-cent increase in the state cigarette tax which, in turn, 
funded statewide initiatives to prevent and control the use of tobacco. Subsequently, 
surveillance data regarding trends in the prevalence of smoking and the impact of this 
initiative assisted in ensuring the application of state funds to control tobacco use. 
Similarly, data for the United States have confirmed that increases in cigarette taxes 
have helped in reducing cigarette smoking (8). 

Define the Audience 

Identification of target groups is an essential part of the process of developing 
strategies for communicating surveillance results. Typically, public health 
surveillance information and reports have been disseminated in a standard format with 
only limited consideration of the target audiences and, more importantly, the 
techniques to communicate effectively to these groups. In general, key target groups 
may include public health practitioners, health care providers, professional and 
voluntary organizations, policy makers (e.g., from the executive and legislative 
branches of government), the press, or the public. 

In some instances, surveillance information should be disseminated widely, in which 
case communication strategies should be tailored to subgroups of greater interest. 
For example, information regarding trends in injecting drug use (IDU)-related risks
for HIV is often communicated to the general public through the newspapers; however,
this strategy may be suboptimal for reaching the groups at highest risk, who use
alternative media such as radio and television (9).

Select the Channel 

Specification of the messages and audiences for surveillance results enables selection
of the most suitable channels of communication for this information. Traditionally,
surveillance information has been disseminated through published surveillance reports.
However, in addition to conventional means for communicating with traditional
audiences, the advent of new methods and technologies has made possible improved
communications with both old and new audiences. This spectrum of communications 
options includes professional and trade publications, electronic channels, broadcast 
media, print media, and public forums: 

• Publications: government public health bulletins and surveillance reports, 
peer-reviewed public health and biomedical journals, newsletters. 

• Electronic: telecommunications systems (e.g., National Electronic 
Telecommunications Surveillance System [see Chapter IV], Public Health
Net), fax and batch fax, audioconferences, videoconferences. 


• Media: news releases, news conferences, fact sheets, video releases. 

• Public forums: briefings, hearings and testimony, conferences and other 
planned meetings. 

Market the Information 

Once the message has been defined and the target audience and channel selected, it is 
critical to assure that the information is communicated and marketed--not merely
disseminated--to those who need to know. In the decade of the 1990s, enormous
quantities of information concerning public health are communicated through 
professional channels, as well as the print and electronic media. Because of the 
volume of essential information, as well as time constraints, surveillance information 
must be carefully tailored for presentation to each targeted audience, including 
public health and health care professionals, policy makers, and the public. 

To ensure that surveillance information is readily communicated to target audiences, 
public health agencies should use those techniques that are most effective for 
marketing information. First, as a general principle, graphic formats and other
visual displays are likely to be more effective in conveying information than 
conventional tabular presentations. Such formats include maps, bar graphs, 
histograms, diagrams, or other ways of visually depicting data which may not be 
readily comprehended through tabular presentation. For example, in December 1989, the 
Centers for Disease Control introduced a graphic format for displaying national 
notifiable disease surveillance data in the Morbidity and Mortality Weekly Report 
(10). This bar graph (Figure V.12), which replaced a standard table, was designed 
both to facilitate interpretation of routine notifiable disease data and to enable 
timely public health responses to changes in disease patterns. 

Second, the principal components of the message can be focused by selecting the most 
important point, then stating that point as a simple declarative sentence. This 
message, termed the "single overriding communication objective" (SOCO), should
consider three questions: 

• What is new? 

• Who is affected? 

• What works best? 

For example, chronic disease surveillance data indicate that compared with
younger women, older women are less likely to have received a Pap test in the past, 
are more likely to have cervical cancer diagnosed at a late stage, and have higher 
mortality rates due to cervical cancer. Traditionally, this information might be 
disseminated to health care and public health providers through vital statistics 
reports and other published accounts about cervical cancer. However, if these 
findings are to be used as a basis for action, they first must be synthesized, then 
effectively communicated. Thus, in addition to being presented in detailed
reports, these findings also may be expressed through a single message, the SOCO: "Older women
need to get regular Pap tests." 

Third, techniques must be used which present (or "package") the surveillance 
information in a manner which captures an audience's interest and focuses attention on 
a specific issue. Examples of these techniques are the use of introductory terms such 
as: "A new study . . ."; "Recent findings . . ."; and "Information recently released
. . . ." These terms are likely to appeal more to a target audience than a presentation
which begins with a conventional preface, such as "Based on recent surveillance 
findings, . . . ." 

Fourth, the method and forum of release of surveillance information may be
critical--particularly when a timely release is required, or when the target audiences include
the media, the public, or policy makers. Under such circumstances, news conferences 
or other news releases may be considered, and should be held when they are likely to 
be attended. Foremost, the presenter should involve reporters in the public health 
surveillance process by "walking them through it", and should recognize opportunities 
to articulate the SOCO on camera or in print. Important adjuncts for presenting the 
information include readily available handouts and effective, but simple, visuals. 

Evaluate the Effect 

Because public health surveillance is, by definition, oriented toward action, 
evaluation efforts should address two considerations: first, whether surveillance 
information has been communicated to those who need to know; and second, whether the
information has had a beneficial effect upon the public health problem/condition of
interest.

Assessment of whether surveillance information has been communicated to those who 
need to know may be accomplished through a process evaluation, such as monitoring
the distribution of the information or conducting a user survey. In particular, the
effectiveness of communication through newspapers can be evaluated by using clipping 
services which determine the number of published reports, the geographic distribution 
of the reports, and the proportion of the total audience to which the reports have 
been circulated. In addition, process evaluation efforts should include a review of 
the content of articles to assess both the accuracy and appropriateness of the 
communicated message. 

The second consideration--the impact of the communications effort on the public health
problem--requires an evaluation of outcomes (e.g., knowledge or practices) within
specific target audiences. 

Under ideal circumstances, this type of evaluation requires surveys of the target 
audiences both before and after the surveillance information has been communicated to 
detect changes in levels of outcomes. The potential for such evaluation is 
constrained, however, by technical and methodologic challenges, as well as substantial 
resource requirements. 
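
The before-and-after comparison described above can be sketched in a few lines of Python. The function name, the survey counts, and the use of a normal-approximation confidence interval for two independent samples are illustrative assumptions; an actual evaluation would need a design that accounts for sampling methods and for changes unrelated to the communications effort.

```python
from math import sqrt


def change_in_proportion(yes_before, n_before, yes_after, n_after, z=1.96):
    """Difference in proportions (after minus before) with an approximate 95% CI.

    Uses the normal approximation for two independent samples, which is an
    assumption that holds only for reasonably large surveys.
    """
    p1 = yes_before / n_before
    p2 = yes_after / n_after
    diff = p2 - p1
    se = sqrt(p1 * (1 - p1) / n_before + p2 * (1 - p2) / n_after)
    return diff, diff - z * se, diff + z * se


# Hypothetical surveys: respondents reporting a recent Pap test before and
# after a communications effort (400 respondents in each survey).
diff, low, high = change_in_proportion(120, 400, 168, 400)
print(f"Change: {diff:+.1%} (95% CI {low:+.1%} to {high:+.1%})")
```

An interval that excludes zero suggests a change in the measured outcome, but it does not by itself show that the communications effort caused the change.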


Effective communication of public health surveillance results represents the critical 
link in the translation of science into action. Recognition of the key
components in this process--including the medium, the message, the audience, the
response, and the evaluation of the process--is the first step in completing the 
communications loop. 


1. Langmuir AD. The surveillance of communicable diseases of national importance.
N Engl J Med 1963;268:182-92.

2. Thacker SB, Berkelman RL. Public health surveillance in the United States. 
Epidemiologic Rev 1988;10:164-90. 

3. Hiebert RE, Ungurait DE, Bohn TW. The process of communication. In: Mass
media: An introduction to modern communication III. Longman Inc., New York, 
1982, pp 15-29. 

4. Centers for Disease Control. Eosinophilia-myalgia syndrome--New Mexico. MMWR 

5. Centers for Disease Control. Nosocomial transmission of multidrug-resistant 
tuberculosis among HIV-infected persons--Florida and New York, 1988-1991. MMWR

6. Remington PL, Smith MY, Williamson DF, Anda RF, Gentry EM, Hogelin GC. Design,
characteristics, and usefulness of state-based risk factor surveillance
1981-1986. Public Health Rep 1988 July-August;103(4):366-75.

7. Boss LP, Suarez L. Uses of data to plan cancer prevention and control programs. 
Public Health Rep 1990;105:354-60. 

8. Peterson DE, Zeger SL, Remington PL, Anderson HA. The effect of state cigarette 
tax increases on cigarette sales, 1955 to 1988. Am J Public Health 1992;82:94- 

9. Centers for Disease Control. HIV-prevention messages for injecting drug users: 
sources of information and use of mass media--Baltimore, 1989. MMWR


10. Centers for Disease Control. Proposed changes in format for presentation of 
notifiable disease report data. MMWR 1989;38:805-9. 


Chapter VIII 

Evaluating Public Health Surveillance 

Douglas N. Klaucke 

"The best way to escape from a problem is to solve it." 

Brandon Francis 


The overall purpose of evaluating public health surveillance is to promote the most 
effective use of health resources. The highest-priority public health events should 
be under surveillance, and surveillance systems should meet their objectives as 
efficiently as possible. Meeting each of these objectives involves evaluating 
surveillance from two different perspectives; in turn, each perspective has a slightly 
different emphasis in the application of the elements of surveillance evaluation. 



The first level of evaluation answers the question, "Should this health event be under 
surveillance?" This question should be answered from a perspective external to the 
surveillance system itself. It is the first question that should be asked when 
deciding whether to start a new system or before conducting a detailed evaluation of 
an existing one. This "external" evaluation is primarily an assessment of the public 
health importance of a health event and how its importance compares with that of other 
health events. Once a health event is identified as being of high priority, it is 
important to consider both the feasibility and cost of conducting surveillance for 
that event. If this first-level evaluation leads to a decision to discontinue a 
surveillance system, a detailed evaluation of that system is superfluous. 

The second level evaluates an operating surveillance system for a high-priority health 
event to increase the system's utility and efficiency. This type of evaluation may 
also compare two or more systems involving the same health event. This type of
evaluation will determine whether the system is meeting its objectives, serving a 
useful public health function, and operating as efficiently as possible. It should 
include at least the following steps: 

• An explicit statement of the purposes and objectives of the system 

• A description of its operation 

• Documentation of how the surveillance system has been useful 

• An assessment of the different quantitative and qualitative attributes, 

• Estimates of the cost of the system. 

The goal is to maximize the system's usefulness and to achieve the simplest, least 
expensive system that meets its objectives. 


Although all systems should be assessed for their purpose and usefulness, specific 
attributes described below that are critical to one system may be less important to 
another. Efforts to improve certain attributes--such as the ability of a system to 
detect a health event--may detract from other attributes--such as simplicity or
timeliness. Thus, the success of an individual surveillance system depends on the
proper balance of characteristics, and the strength of an evaluation depends on the
ability of the evaluator to assess these characteristics with respect to the system's
objectives. Any approach to evaluation must therefore be flexible. 

Determining the most efficient approach to surveillance for a given health event is an 
art. There is room for creativity and opportunity to combine scientific rigor with 
practical realities. The methods discussed in this chapter should be used as a guide 
to the types of questions that need to be answered about the system. Each evaluation 
should be individually tailored. Few evaluations address fully all of the methods 
outlined in this chapter, and many profitably focus on only one or two major 
attributes, such as sensitivity and timeliness (1-3). Some of these elements may also
be useful for evaluating other health-information systems or evaluating the value of 
secondary data sources for surveillance. 

Each of the listed aspects of a surveillance evaluation will be discussed in the 
sections that follow: public health importance, objectives and usefulness, operation 
of the system and qualitative attributes (simplicity, flexibility, and acceptability), 
quantitative attributes (sensitivity, predictive value positive, representativeness, 
and timeliness), and cost. This chapter continues the process through which methods 
for evaluating public health surveillance systems evolve (4,5). 


The public health importance of a health event and the need for surveillance of that 
health event can be described in a variety of ways. Health events that affect many 
people or require large expenditures of resources are clearly important in a public 
health context. However, health events that affect relatively few persons may also be 
important, especially if the events cluster in time and place--e.g., a limited 
outbreak of a severe disease. At other times, public concerns may focus attention on 
a particular health event, creating or heightening the sense of importance associated 
with it. Health problems that are now rare because of successful control measures may 
be perceived as "unimportant," but their level of importance should be assessed on the
basis of their potential to reemerge. Finally, the public health importance of a 
health event is influenced by its preventability and the ability of public health 
action to influence it. 


Some measures of the importance of a health event, and, therefore, the surveillance 
system that monitors it, include the following: 

• Magnitude of the problem: Total number of cases, incidence, and
prevalence.

• Severity: Mortality rate and case-fatality ratio.

• Morbidity: Physician visits, hospital days.

• Premature mortality: Years of potential life lost (YPLL).

• Economic cost: Costs of medical care, lost productivity.

• Preventability: Prevented fraction.
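
As a minimal sketch of how two of these measures might be computed, the Python below calculates a case-fatality ratio and years of potential life lost. The function names, the sample figures, and the reference age of 65 years are illustrative assumptions rather than values taken from this chapter.

```python
def case_fatality_ratio(deaths, cases):
    """Proportion of identified cases that ended in death."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return deaths / cases


def years_of_potential_life_lost(ages_at_death, reference_age=65):
    """Sum of years lost before the reference age (assumed to be 65 here).

    Deaths at or after the reference age contribute zero years.
    """
    return sum(max(reference_age - age, 0) for age in ages_at_death)


# Hypothetical data: 200 reported cases, with 4 deaths at the ages listed.
ages = [2, 30, 64, 80]
cfr = case_fatality_ratio(deaths=len(ages), cases=200)
ypll = years_of_potential_life_lost(ages)
print(f"Case-fatality ratio: {cfr:.1%}")
print(f"YPLL before age 65: {ypll}")
```

The choice of reference age changes the result substantially, so it should be stated whenever YPLL figures for different health events are compared.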

Measures of importance used should take into account the effect of existing control 
measures. For example, the number of cases of vaccine-preventable illness has 
declined following the implementation of school immunization laws, and the public 
health importance of diseases in this category is underestimated by case counts 
alone. In such instances, it may be possible to estimate the number of cases that 
would be expected in the absence of control programs (6).
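
One common way to express this comparison is the prevented fraction, taken here as (expected - observed) / expected. The sketch below assumes that an estimate of expected cases in the absence of control programs is already available; the function name and the counts are hypothetical.

```python
def prevented_fraction(observed_cases, expected_cases):
    """Share of expected cases that did not occur: (expected - observed) / expected."""
    if expected_cases <= 0:
        raise ValueError("expected_cases must be positive")
    return (expected_cases - observed_cases) / expected_cases


# Hypothetical figures: 1,500 cases reported under an immunization program,
# against an estimated 50,000 cases had no program existed.
observed = 1_500
expected = 50_000
pf = prevented_fraction(observed, expected)
print(f"Prevented fraction: {pf:.0%}")
```

Here the observed count alone would suggest a minor problem, while the prevented fraction indicates how much disease would be likely to return if control measures lapsed.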

Preventability can be defined at several levels--from preventing the occurrence of 
disease (primary prevention), through early detection and treatment (secondary
prevention), to minimizing the effects of the health problem among those already ill
(tertiary prevention). From the perspective of surveillance, preventability reflects
the potential for effective public health interventions at any of these levels. 

The need for surveillance may also be affected by factors other than those mentioned 
above. Political and public pressure may affect whether surveillance is undertaken--
or, at the other extreme, forbidden--for a specific health event. Regulations, laws,
and public health programs may be implemented on the basis of considerations other 
than those listed above. However, it is still important to make the scientific 
criteria as clear and explicit as possible.

Even when using quantitative measures, judgment is necessary to decide which criteria 
are most relevant for each condition. It is important to make these judgments as 
explicit--and as early--as possible. 


Attempts have been made to quantify the public health importance of health 
conditions. Dean described such an approach that involved using a score that 
accommodated for age-specific mortality and morbidity rates and health-care costs (7). 
The Canadian Laboratory Centre for Disease Control has used explicit criteria in 
setting national surveillance priorities for communicable diseases. Their criteria 
include the parameters listed above, plus several others such as interest on the part 
of the World Health Organization, or the Department of Agriculture (Canada), potential
for outbreaks, public perception of risk, and necessity for immediate public health 
response. Their ratings for 60 communicable diseases can be useful in setting 
priorities for initiating a surveillance system (8).
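
The general idea of scoring health events against explicit criteria can be sketched as follows. This is not the method of Dean or of the Canadian Laboratory Centre for Disease Control; the criteria names, the 0-5 rating scale, the equal weights, and the conditions are all hypothetical and serve only to show how explicit criteria make a ranking reproducible.

```python
# Hypothetical criteria with equal weights; a real exercise would choose
# and justify its own criteria and weights.
WEIGHTS = {
    "incidence": 1.0,
    "severity": 1.0,
    "outbreak_potential": 1.0,
    "preventability": 1.0,
}


def priority_score(ratings, weights=WEIGHTS):
    """Weighted sum of a condition's ratings (each assumed to be on a 0-5 scale)."""
    return sum(weights[criterion] * ratings[criterion] for criterion in weights)


# Hypothetical ratings for three unnamed conditions.
conditions = {
    "Condition A": {"incidence": 4, "severity": 2, "outbreak_potential": 5, "preventability": 4},
    "Condition B": {"incidence": 1, "severity": 5, "outbreak_potential": 1, "preventability": 2},
    "Condition C": {"incidence": 3, "severity": 3, "outbreak_potential": 2, "preventability": 5},
}

ranked = sorted(conditions, key=lambda name: priority_score(conditions[name]), reverse=True)
for name in ranked:
    print(name, priority_score(conditions[name]))
```

Because the weights are stated openly, reviewers can see how a change in any one judgment would alter the ranking.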


The most important steps in evaluating a surveillance system are a) describing the 
health event(s) under surveillance, b) stating explicitly the objectives of the
system, and c) describing how the system has actually been used to help prevent and/or 
control disease or injury. These three steps alone often sufficiently indicate how 
the system can be improved. 

Case definition(s) should be specified, which include symptoms, signs, laboratory
results, and epidemiologic information; a scale of severity; and the different levels
of confidence in the diagnosis for each case, such as "suspected," "probable," and
"confirmed." Case definitions for nationally notifiable diseases have been published
for Canada and the United States (9,10). Table VIII.1 outlines a case definition
developed by the Centers for Disease Control (CDC) and the U.S. Council of State and 
Territorial Epidemiologists. 
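
The levels of diagnostic confidence in a case definition can be written as explicit rules. The sketch below uses invented criteria to assign a report to "suspected," "probable," or "confirmed"; the function name and the rules are hypothetical and do not reproduce any published case definition, including the one in Table VIII.1.

```python
def classify_case(clinically_compatible, epi_linked, lab_confirmed):
    """Assign a confidence level using hypothetical, illustrative rules.

    - not clinically compatible             -> "not a case"
    - compatible and laboratory confirmed   -> "confirmed"
    - compatible and epidemiologically linked to another case -> "probable"
    - compatible only                       -> "suspected"
    """
    if not clinically_compatible:
        return "not a case"
    if lab_confirmed:
        return "confirmed"
    if epi_linked:
        return "probable"
    return "suspected"


# Hypothetical reports: (clinically compatible, epi-linked, lab confirmed).
reports = [(True, False, False), (True, True, False), (True, False, True), (False, True, True)]
levels = [classify_case(*report) for report in reports]
print(levels)
```

Writing the rules out in this way makes it easier to check that every reporter applies the same definition to the same evidence.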

The possible objectives of surveillance systems and the uses of surveillance 
information are very similar and have been reviewed in Chapter I. 

A surveillance system might also meet a statutory requirement based on political 
necessity or public pressure or might identify cases for additional studies. There 
may also be objectives, such as meeting the reporting requirements of the World Health 
Organization, that might not be of immediate or direct benefit to the agency operating 
the surveillance system. 


The usefulness of a system should be described specifically, including the actions 
that have been taken as a result of the data and analysis from the surveillance 
system, and who used the data to make decisions and take actions. Other anticipated 
uses of the data should be noted and their feasibility determined. 

A surveillance system should contribute to the control and prevention of adverse 
health events. This process may include an improved understanding of the public 
health consequences of the events. A surveillance system can also be useful if it 
determines that an adverse health event previously thought to have public health 
importance actually does not. 

An assessment of the usefulness of a surveillance system begins with a review of the 
objectives of the system and should consider the dependence of policy decisions and 
control measures on the surveillance system. Depending on the objectives of a 
particular surveillance system, the system may be considered useful if it 
satisfactorily addresses one or more of the following questions. Does the system, 

• detect trends signaling changes in the occurrence of the health problem in 

• detect epidemics? 

• provide estimates of the magnitude of morbidity and mortality related to 
the health problem being monitored? 

• stimulate epidemiologic research likely to lead to control or prevention? 

• identify risk factors involved in the occurrence of the health problem? 

• permit assessment of the effects of control measures? 

• lead to improved clinical practice by the health-care providers who are 
the constituents of the surveillance system? 

Usefulness may be affected by all the attributes of surveillance described below. 
Increased sensitivity may afford a greater opportunity for identifying epidemics and 
understanding the natural course of an adverse health event in a community. More 
rapid reporting allows more timely control and prevention activities. Increased 
specificity enables public health officials to focus on productive activities. A
representative surveillance system will characterize more accurately the epidemiologic
features of a health event in the population. 


To evaluate a surveillance system, one must know how it operates (see Chapter IV).
The system description should include the following: 

The people and organizations involved, 

The flow of information (up and down),

Mechanisms of information transfer, 

Frequency of reporting and feedback, and 

Quality control. 

The evaluation should address the following questions. What is the population being 
monitored? Who is responsible for reporting a case (and to which public health 
agency)? What information is collected on each case, and who is responsible for 
collecting it? If there are multiple administrative levels represented in the system, 
how are the data transferred from one level to another? How is information stored? 
Who analyzes the data? How are they analyzed, and how often? Are there preliminary 
and final tabulations, analyses, and reports? How often are reports disseminated? To 
whom? By what mechanisms/media are the reports distributed? Are there any
"automatic" responses to case reports (e.g., follow-up of individual cases of rabies,
botulism, or poliomyelitis)? 

A diagram is often useful to summarize the relationship between the various components 
of a system (Figure VIII.1).


Each surveillance system has characteristics or attributes that contribute directly to 
its ability to meet its specific objectives. The combination of these attributes 
determines the strengths and weaknesses of the system. The attributes must be 
balanced against each other (e.g., high sensitivity may only be possible with a
complex reporting system from a wide array of providers).


Simplicity and Flexibility 

In describing a surveillance system, three desirable qualitative attributes should be 
addressed: simplicity, flexibility, and acceptability. 

Simplicity of a surveillance system refers both to its structure and to its ease of 
operation. Surveillance systems should be as simple as possible, while still meeting 
their objectives. It may be useful to think of the simplicity of a surveillance 
system from two perspectives: the design of the system and the size of the system. 
The following measures might be considered in evaluating the simplicity of a system: 

Amount and type of information necessary to establish a diagnosis,

Number and type of reporting sources,

Method(s) of transmitting case information/data,

Staff training requirements,

Type and extent of data analysis, 

Amount of computerization, 

Methods of distributing reports, and 

Amount of time spent operating the system. 

The cost estimates for a system are also an indirect indicator of simplicity. Simple 
systems usually cost less than complex ones. Another consideration is the ability of
the system to adapt to changing needs such as the addition of new conditions or
data-collection elements. This characteristic is termed "flexibility."


Acceptability reflects the willingness of individuals and organizations to participate 
in the surveillance system. This attribute refers to the acceptability of the system 
to health department staff and, at least equally important, to persons outside the
sponsoring agency (e.g., doctors or laboratory staff) who are asked to report cases
of certain kinds of health problems. To assess acceptability, one must consider the 
points of interaction between the system and its participants, including subjects 
(persons identified as having cases) and reporters. Indicators of acceptability 


include the following: a) subject or agency participation rates; b) interview 

completion rates and question refusal rates, if the system involves case interviews; 

c) completeness of report forms; d) physician, laboratory, or hospital/facility 
reporting rates; and e) timeliness of reporting. 


The four quantitative attributes of a surveillance system are sensitivity, 
predictive value positive, representativeness, and timeliness. These are often 
difficult to measure precisely, but even indirect estimates can be useful in helping 
to improve the efficiency of a system and in comparing it with other systems. 


The sensitivity of a surveillance system can be considered on two levels. First, the 
completeness of case reporting (i.e., the proportion of cases of a disease or health 
condition that are detected by the surveillance system; see Table VIII.2) can be 
evaluated. Second, the system can be evaluated for its ability to detect epidemics 
(11) (see Chapters V and VI).

The sensitivity of a surveillance system is affected by the likelihood that 

• persons with certain health conditions seek medical care; 

• the condition is correctly diagnosed, which reflects the skill of care 
providers and the accuracy of diagnostic tests; and 

• the case is reported to the system, once it has been diagnosed. 

These factors also apply to surveillance systems that do not fit the traditional 
disease/care-provider model. For example, the sensitivity of a telephone-based 
surveillance system of morbidity or risk factors would be affected by 

• the number of people who have telephones, who are at home when the 
surveyor calls, and who agree to participate; 

• the ability of persons to understand and correctly answer the questions; and 

• the willingness of respondents to report their status. 


The extent to which these questions are explored depends on the system and on the 
resources available for the evaluation. The measurement of sensitivity in a 
surveillance system requires the validation of information collected through the 
system, so as to distinguish accurate from inaccurate case reports, and the collection 
of information external to the system, so as to determine the frequency of the 
condition in a community (i.e., a "gold standard") (22). From a practical 
standpoint, the primary emphasis in assessing sensitivity, assuming that most reported 
cases are correctly classified, is on estimating what proportion of the total number of 
cases in the community is being detected by the system. If this proportion is 
estimated using methods that compare two or more surveillance systems, none of which 
is a "gold standard," then this proportion should be called an estimate of 
"completeness of coverage" rather than of sensitivity. (See also Chapter VI on 
capture-recapture.)
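
As a rough illustration of comparing two surveillance systems when neither is a "gold standard," the sketch below applies a simple two-source capture-recapture estimate. The function name and the counts are hypothetical, and the calculation assumes that the two systems identify cases independently and that records are matched without error; neither assumption is stated in the text above.

```python
def completeness_of_coverage(n_a, n_b, n_both):
    """Two-source capture-recapture estimate (illustrative sketch).

    n_a    -- number of cases identified by system A
    n_b    -- number of cases identified by system B
    n_both -- number of cases identified by both systems (matched records)
    """
    if n_both <= 0:
        raise ValueError("at least one matched case is needed")
    # Estimated total number of cases in the community.
    estimated_total = n_a * n_b / n_both
    return {
        "estimated_total": estimated_total,
        "coverage_a": n_a / estimated_total,  # completeness of coverage, system A
        "coverage_b": n_b / estimated_total,  # completeness of coverage, system B
    }


# Hypothetical counts, for illustration only.
result = completeness_of_coverage(n_a=60, n_b=40, n_both=30)
print(result)
```

With these invented counts, the estimated total is 80 cases, so system A would be described as having 75% completeness of coverage and system B 50%.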

A surveillance system that does not have high sensitivity can still be useful in 
monitoring trends, as long as the sensitivity and predictive value positive remain 
reasonably constant. Questions concerning sensitivity in surveillance systems most 
commonly arise when changes in patterns of occurrence of the health problem are 
noted. Changes in sensitivity can be precipitated by heightened awareness of a health 
problem, introduction of new diagnostic tests, or changes in the method of conducting 
surveillance. A search for such surveillance "artifacts" is often an initial 
step in investigating an outbreak. 

Several evaluations have looked at the sensitivity or completeness of coverage of 
surveillance systems (13-15).

Predictive value positive 

Predictive value positive (PVP) is defined as the proportion of persons identified as 
case-patients who actually have the condition being monitored (11). In Table VIII.2 
above, this is represented by A/(A+B).
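
The two proportions discussed so far can be computed directly from the cells of a two-by-two table. The sketch below assumes the usual layout, in which A is the number of true cases detected by the system, B the number of reported "cases" that are not true cases, and C the number of true cases the system missed. Only A/(A+B) is given in the text; the use of A/(A+C) for sensitivity, the function names, and the counts are assumptions made for illustration.

```python
def predictive_value_positive(a, b):
    """PVP = A / (A + B): share of reported cases that are true cases."""
    return a / (a + b)


def sensitivity(a, c):
    """Sensitivity = A / (A + C): share of true cases that were reported.

    Assumes C is the count of true cases not detected by the system.
    """
    return a / (a + c)


# Hypothetical counts, for illustration only.
A, B, C = 80, 20, 120
print("PVP:", predictive_value_positive(A, B))
print("Sensitivity:", sensitivity(A, C))
```

In this invented example the system has a high PVP (0.8) but detects well under half of the true cases (0.4), which shows that the two attributes must be assessed separately.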

In assessing PVP, primary emphasis is placed on the confirmation of cases reported 
through the surveillance system. The effect of PVP on the use of public health 
resources can be considered on two levels. At the level of an individual case, PVP 
affects the amount of resources required for investigation of cases. For example, 
where every reported case of hepatitis A is promptly investigated by a public health 
nurse, and family members at risk are referred for a prophylactic immune globulin 
injection, each reported case generates a requirement for follow-up. A surveillance 
system with low PVP, and therefore frequent "false-positive" case reports, would lead 
to resources being wasted on cases that do not, in fact, exist. 

The other level is that of detection of epidemics. A high rate of erroneous case 
reports over the short term might trigger an inappropriate outbreak investigation, and 
conversely, a constant high level of "false-positive" reports might mask a true 
outbreak. In assessing this attribute, we want to know what proportion of epidemics 
identified by the surveillance system are "true epidemics." 

Calculating the PVP requires confirmation of all cases. Interventions initiated on 
the basis of information obtained from the surveillance system should be documented 
and kept on file. Personnel activity reports, travel records, and telephone logbooks 
may all be useful in estimating the impact of the PVP on the detection of epidemics. 

A low PVP means that a) non-cases are being investigated, and b) there may be mistaken 
reports of epidemics. "False-positive" reports to surveillance systems lead to 
unnecessary interventions, and falsely detected "epidemics" lead to costly 
investigations. A surveillance system with high PVP will lead to less 
unnecessary and inappropriate expenditure of resources (16).

The PVP for a health event may be enhanced by clear and specific case definitions. 
Good communication between the persons who report cases and staff operating the 
surveillance system can also improve PVP. The sensitivity and specificity of the case 
definition, as well as the prevalence of the condition in the population, contribute to 
the PVP (Table VIII.2); the PVP increases with increasing specificity and prevalence.
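
The dependence of PVP on specificity and prevalence can be made concrete with a short calculation. The sketch below uses the standard Bayes'-rule relationship among sensitivity, specificity, prevalence, and PVP; the function name and all numeric values are hypothetical and are not taken from the text or from Table VIII.2.

```python
def pvp_from_case_definition(sens, spec, prevalence):
    """PVP implied by a case definition's sensitivity and specificity
    when applied in a population with the given prevalence."""
    true_pos = sens * prevalence
    false_pos = (1.0 - spec) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive reports are expected with these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical values: the same case definition applied at two prevalences.
low_prev = pvp_from_case_definition(sens=0.9, spec=0.95, prevalence=0.01)
high_prev = pvp_from_case_definition(sens=0.9, spec=0.95, prevalence=0.10)

# The same prevalence with a more specific case definition.
more_specific = pvp_from_case_definition(sens=0.9, spec=0.99, prevalence=0.01)

print(round(low_prev, 3), round(high_prev, 3), round(more_specific, 3))
```

Under these assumed values, PVP rises both when the condition is more common and when the case definition is made more specific, which is the pattern described above.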

Sensitivity and predictive value positive are inversely related. The balance between 
assuring that all (or almost all) cases are identified (high sensitivity) and few 
false positives are identified (high PVP) must be based on the level of importance 
accorded to identifying all cases (e.g., for rabies or meningococcal meningitis) and 
the ability to use an indicator of the disease in the community (e.g., use of 
Salmonella laboratory isolates).


A truly representative surveillance system accurately describes the occurrence of a 
health event over time and its distribution in the population by place and person. 

Representativeness is assessed by comparing the characteristics of reported events 
with those of all such events that occurred. Although this information is not 
generally available in specific detail, some judgment of the representativeness of 
surveillance data is possible, on the basis of knowledge of the following factors: 

• characteristics of the population--e.g., age, socioeconomic status, and 
geographic location (17); 

• natural history of the condition--e.g., latency period, fatal outcome; 

• prevailing medical practices--e.g., sites performing diagnostic tests and 
physician-referral patterns (18,19); and 

• multiple sources of data--e.g., mortality rates for comparison with data 
on incidence, laboratory reports for comparison with physician reports. 

Representativeness can also be examined through special studies of a representative 
sample of the population (16).

The points at which bias can enter a surveillance system and decrease 
representativeness are illustrated in Figure VIII.2.

Case ascertainment bias (Representativeness) 

This might also be called "sampling bias" and is the differential identification 
and/or reporting of cases from different populations or over time. 

In order to generalize findings from surveillance data to the population at large, the 
data from a surveillance system should reflect the population characteristics that are 
important to the goals and objectives of that system. These characteristics generally 
relate to time, place, and person. An important result of evaluating the 
representativeness of a surveillance system is the identification of subgroups in the 
population that may be systematically excluded from the reporting system. This will 
enable appropriate modification of data-collection practices and more accurate 
projections of incidence of the health event in the target population. 

Changes in reporting practices over time can introduce bias into the system and make 
it difficult to follow long-term trends or establish baseline rates to be used for the 
recognition of outbreaks. For example, switching from a passive to an active system 
or changing reporting sources may change the sensitivity of the system. Publicity can 
also increase rates of reporting in passive systems (20). While more complete 
reporting is desirable in principle, it is difficult to predict how a change in 
reporting practices or in publicity associated with the reportable condition will 
change the proportion of cases reported. 

Differences in reporting practices by geographic location can bias the 
representativeness of the system. For example, the National Notifiable Diseases 
Surveillance System (NNDSS) aggregates data collected independently by the 50 states, 
Washington, D.C., and several territories. For some infectious diseases, some states 
collect data only from laboratories, whereas other states also accept cases reported 
by health practitioners (21). Also, despite efforts to achieve consistency, case 
definitions are not standardized across state and territorial boundaries (10).

Differential reporting rates of cases may occur in association with different 
characteristics of the person, so that cases among certain subpopulations may be less 
likely to be reported than those among other groups. For example, an evaluation of 
reporting on viral hepatitis in a county in Washington State suggested that cases of 
hepatitis B were underreported among homosexual men and that cases of non-A, non-B 
hepatitis were underreported among persons exposed to blood transfusions. The importance 
of these risk factors as contributors to the occurrence of these diseases was 
apparently underestimated, as indicated by the selective underreporting of certain 
hepatitis cases (22).

Bias in descriptive information about a reported case 

Given that a case of a reportable health condition has been identified and reported, 
there may be errors in the collection and recording of descriptive information about 
the case, or "information bias." 


Most surveillance systems collect more than simple case counts. Information commonly 
collected includes the demographic characteristics of affected persons, details about 
the health event, and the presence or absence of defined potential risk factors. The 
quality, usefulness, and representativeness of this information depend on its 
completeness and validity. 

Quality of data is influenced by the clarity of the information forms, the training 
and supervision of persons who complete surveillance forms, and the care exercised in 
management of data. A review of these facets of a surveillance system provides an 
indirect measure of quality of data. An examination of the percentage of "unknown" or 
"blank" responses to items on surveillance forms or questionnaires is 
straightforward. Assessing the validity of responses requires special studies, such 
as chart reviews or re-interviews of respondents. 
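
The examination of "unknown" or "blank" responses described above amounts to a simple tabulation. The sketch below shows one way it might be done; the field names, the records, and the set of values treated as missing are all hypothetical.

```python
# Values treated as missing in this sketch (an assumption, not a standard).
MISSING = {"", "unknown", "blank"}


def is_missing(value):
    """Return True if a form item is absent, blank, or marked unknown."""
    return value is None or str(value).strip().lower() in MISSING


def percent_missing(records, fields):
    """Percentage of records with a missing response, per form item."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to examine")
    return {
        field: 100.0 * sum(is_missing(r.get(field)) for r in records) / total
        for field in fields
    }


# Hypothetical case reports, for illustration only.
reports = [
    {"age": 34, "race": "Unknown", "onset_date": "1991-05-02"},
    {"age": None, "race": "White", "onset_date": "1991-05-09"},
    {"age": 7, "race": "", "onset_date": "1991-05-11"},
    {"age": 52, "race": "Black", "onset_date": "1991-05-20"},
]
summary = percent_missing(reports, ["age", "race", "onset_date"])
print(summary)
```

A tabulation of this kind measures only completeness; as noted above, the validity of the recorded responses has to be checked separately.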

Errors and bias can make their way into a surveillance system at any stage in the 
reporting and assessment process. Because surveillance data are used to identify 
high-risk groups, to target interventions, and to evaluate interventions, it is 
important to be aware of the strengths and limitations of the information in the 
system. 

So far, the discussion of attributes has been aimed at the information collected for 
cases, but many surveillance systems also involve calculating morbidity and mortality 
rates. The denominators for these rate calculations are often obtained from a 
separate data system maintained by another agency, such as the Bureau of the Census or 
the National Center for Health Statistics of CDC. Although these data are regularly 
evaluated, thought should be given to the comparability of categories (e.g., race, 
age, or residence) used in the numerator and denominator of rate calculations. 

Several studies have looked at quality-assurance problems associated with surveillance 
data. A sample of National Electronic Injury Surveillance System (NEISS) records was 
compared with emergency-room records to assess the quality of data recorded in the 
surveillance system (23). A study of the quality of national malaria surveillance 
reports was carried out in the United Kingdom (24). The quality of Behavioral Risk 
Factor Surveillance System (BRFSS) data, which are obtained through monthly telephone 
surveys, for behavioral risks associated with cardiovascular problems has been 
examined in California (25). CDC has also examined the completeness of race-ethnicity 
reporting in the NNDSS (26).


Timeliness reflects the delay between any two (or more) steps in a surveillance 
system. The timeliness of the system can best be assessed by the ability of the 
system to take appropriate action based on the urgency of the problem and the nature 
of the public health response. Four points of time in the surveillance process are 
most often considered when measuring timeliness: a) time of onset of disease or 
occurrence of an injury, b) time of diagnosis, c) time the report of the case is 
received by the public health agency responsible for control activities, and d) time 
of implementation of control activities. Usually one of the first two points of time 
(a or b) is used as the starting point, and each of the other two points (c, d) is 
used as an end point.
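
Measuring timeliness between a chosen starting point and end point can be sketched as follows. The field names and dates are hypothetical; the median is used here only because it is a common summary of reporting delays.

```python
from datetime import date
from statistics import median


def median_delay_days(cases, start="onset", end="report_received"):
    """Median number of days between two steps in the surveillance process.

    Cases missing either date are left out of the calculation.
    """
    delays = [
        (case[end] - case[start]).days
        for case in cases
        if case.get(start) and case.get(end)
    ]
    if not delays:
        raise ValueError("no cases have both dates recorded")
    return median(delays)


# Hypothetical case records, for illustration only.
cases = [
    {"onset": date(1992, 3, 1), "report_received": date(1992, 3, 10)},
    {"onset": date(1992, 3, 2), "report_received": date(1992, 3, 15)},
    {"onset": date(1992, 3, 5), "report_received": date(1992, 3, 16)},
    {"onset": date(1992, 3, 7), "report_received": None},  # not yet reported
]
delay = median_delay_days(cases)
print(delay, "days")
```

The same function can be applied to other pairs of time points (for example, diagnosis to implementation of control activities) by passing different field names.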

Timeliness is usually measured in days or weeks, but in hospital settings it might be 
measured in hours; for diseases that do not necessitate an immediate response, it 
might be measured in months or even years.

Evaluations of the timeliness with which shigellosis is reported in two different 
surveillance systems in the United States found median delays of 11 and 12.5 days from 
time of onset of illness to receipt of report by the public health agency responsible 
for control measures. This delay did not allow public health officials to intervene 
in a timely manner to prevent the occurrence of secondary or tertiary cases. However, 
such a time frame might still allow for effective intervention in settings, such as 
day-care facilities, in which outbreaks may persist for weeks or months (27). Another 
study of timeliness in the reporting of salmonellosis, shigellosis, hepatitis A, and 
bacterial meningitis looked at the reporting delay between date of onset and date of 
report to the CDC (3). Median reporting delays ranged from 20 days for bacterial 
meningitis to 33 days for hepatitis A. Wide variations in reporting delays were found 
between states as well. A study in Australia showed that reports of infectious 
diseases from laboratories were received by the Medical Officer of Health in a 
substantially shorter time than those received from medical practitioners (13).


In contrast, if there is a long latency between exposure and appearance of disease, 
the rapid identification of cases of illness may not be as important as the rapid 
availability of data to interrupt and prevent exposures that lead to disease. 

The need for rapid reporting to a surveillance system depends on the nature of the 
public health problem under surveillance and the objectives of the system. Recently, 
computer technology has been integrated into surveillance systems and may promote 
timeliness of reporting (28,29).


The final descriptive element is an estimation of the resources used to operate the 
system. The estimates generally are limited to direct costs and include the costs of 
personnel and resources required for collecting, processing, and analyzing 
surveillance data, as well as for the dissemination of information resulting from the 
system. 

Personnel costs may be determined from an estimate of the time it takes to operate the 
system for different personnel. While this can be expressed as person-time expended 
per year of operation, it is preferable to convert the estimate to dollar costs by 
multiplying the person-time by appropriate salary and benefit figures. 
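
The conversion from person-time to dollar costs can be sketched as a simple sum over staff categories. All job titles, hours, salary figures, and the benefit rate below are invented for illustration.

```python
def personnel_cost(staff):
    """Annual personnel cost: person-time multiplied by salary and benefits."""
    return sum(
        s["hours_per_year"] * s["hourly_salary"] * (1.0 + s["benefit_rate"])
        for s in staff
    )


# Hypothetical staffing figures, for illustration only.
staff = [
    {"role": "surveillance clerk", "hours_per_year": 200,
     "hourly_salary": 20.0, "benefit_rate": 0.25},
    {"role": "epidemiologist", "hours_per_year": 100,
     "hourly_salary": 30.0, "benefit_rate": 0.25},
]
annual_cost = personnel_cost(staff)
print(annual_cost)
```

Expressing each staff category separately makes it easier to see which activities account for most of the cost of operating the system.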

Other costs may include those associated with travel, training, supplies, equipment, 
and services such as mail, telephone, rent, and computer time. 

The resources required at all relevant levels of the public health system--from the 
local health-care provider to municipal, county, state, and federal health agencies--
should be included. 

The approach to resources described here includes only those personnel and material 
resources required for the direct operation of surveillance. A more comprehensive 
evaluation of costs should examine consequential or indirect costs, such as follow-up 
laboratory testing or treatment, case investigations or outbreak control resulting 
from surveillance, costs of secondary data sources (e.g., vital statistics or survey 
data), and costs averted (benefits) by surveillance. 

Costs are judged relative to benefits, but few evaluations of surveillance systems 
have included a formal cost-benefit analysis, and such analyses are beyond the scope 
of this chapter. Estimating benefits, such as savings resulting from morbidity 
prevented through surveillance, may be possible in some instances, although this 
approach does not take into account the less tangible benefits that may result from 
surveillance systems. More realistically and in most instances, costs should be 
judged with respect to the objectives and usefulness of a surveillance system. 

Alternative data-collection methods may be compared on the basis of their costs and 
the number of cases identified (see also Chapter XII). For example, in Vermont, two 
methods of collecting surveillance data were compared. The "passive" system was 
already in place and comprised unsolicited reports of notifiable diseases to the 
district offices or the state health department. The "active" system was implemented 
in a probability sample of physicians' practices. Each week a health department 
employee called these practices to solicit reports of selected notifiable diseases. In 
comparing the two systems, an attempt was made to estimate associated costs. The 
resource estimates directly applied to the surveillance systems are shown in Table 
VIII.3. The active system identified an additional 23 cases at an average cost of 
$861 per case. 
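
A comparison of this kind can be sketched as a cost per additional case identified. The dollar amounts and case counts below are invented placeholders, not the Vermont figures, and the incremental definition used here (extra cost divided by extra cases) is only one assumed way of defining the measure.

```python
def cost_per_additional_case(cost_base, cases_base, cost_alt, cases_alt):
    """Extra cost of the alternative system per extra case it identifies."""
    extra_cases = cases_alt - cases_base
    if extra_cases <= 0:
        raise ValueError("the alternative system identified no additional cases")
    return (cost_alt - cost_base) / extra_cases


# Hypothetical figures, for illustration only.
incremental = cost_per_additional_case(
    cost_base=3000.0, cases_base=10,   # e.g., a passive system
    cost_alt=12000.0, cases_alt=40,    # e.g., an active system
)
print(incremental)
```

Whether a given cost per additional case is acceptable depends on the objectives of the system and on what is done with the additional reports.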


On the basis of the evaluation, an assessment of how well the surveillance system is 
meeting its current objectives should be made (Table VIII.4). Modifications to the 
system to enhance its usefulness and improve its attributes should be considered. A 
regular review of each surveillance system should assure that systems remain 
responsive to contemporary public health needs. 


1. Harness JR, Gildon BA, Archer PW, Istre GR. Is passive surveillance always 
insensitive? An evaluation of shigellosis surveillance in Oklahoma. Am J 
Epidemiol 1988;128:878-81. 

2. Modesitt SK, Hulman S, Fleming D. Evaluation of active versus passive AIDS 
surveillance in Oregon. Am J Public Health 1990;80:463-4. 

3. Birkhead G, Chorba TL, Root S, Klaucke DN, Gibbs NJ. Timeliness of national 
reporting of communicable diseases: the experience of the National Electronic 
Telecommunications System for Surveillance. Am J Public Health 1991;81:1313-5. 

4. Klaucke DN, Buehler JW, Thacker SB, et al. Guidelines for evaluating 
surveillance systems. MMWR 1988;37(SS-5):1-18. 

5. Thacker SB, Parrish RG, Trowbridge FL. A method for evaluating systems of 
epidemiological surveillance. World Health Statistics Quarterly 1988;41:11-18. 

6. Hinman AR, Koplan JP. Pertussis and pertussis vaccine: reanalysis of benefits, 
risks, and costs. JAMA 1984;251:3109-13. 

7. Dean AG, West DJ, Weir WM. Measuring loss of life, health, and income due to 
disease and injury. Public Health Rep 1982;97:38-47. 

8. Laboratory Centre for Disease Control. Establishing goals, techniques and 
priorities for national communicable disease surveillance. Canada Diseases 
Weekly Report 1991;17:79-84. 

9. Laboratory Centre for Disease Control. Canadian communicable disease 
surveillance system. Disease-specific case definitions and surveillance 
methods. Canada Diseases Weekly Report 1991;17(Suppl 3):1-35. 

10. Wharton M, Chorba TL, Vogt RL, et al. Case definitions for public health 
surveillance. MMWR 1990;39(RR-13):1-43. 


11. Weinstein MC, Fineberg HV. Clinical decision analysis. Philadelphia: W.B. 
Saunders Co., 1980. 

12. Chandra Sekar C, Deming WE. On a method of estimating birth and death rates and 
the extent of registration. J Am Stat Assoc 1949;44:101-15. 

13. Murphy DJ, Seltzer BL, Yesalis CE. Comparison of two methodologies to measure 
agricultural occupational fatalities. Am J Public Health 1990;80:198-200. 

14. Rosenman KD, Trimbath L, Stanbury M. Surveillance of occupational lung disease: 
comparison of hospital discharge data to physician reporting. Am J Public 
Health 1990;80:1257-8. 

15. Rushworth RL, Bell SM, Rubin GL, et al. Improving surveillance of infectious 
diseases in New South Wales. Med J Australia 1991;154(12):828-31. 

16. Barker WH, Feldt KS, Feibel J, et al. Assessment of hospital admission 
surveillance of stroke in a metropolitan community. Am J Chron Dis 1984;37:609- 

17. Kimball AM, Thacker SB, Levy ME. Shigella surveillance in a large metropolitan 
area: assessment of a passive reporting system. Am J Public Health 1980;70:164- 


18. Vogt RL, Larue D, Klaucke DN, Jillson DA. Comparison of active and passive 
surveillance systems of primary care providers for hepatitis, measles, rubella 
and salmonellosis in Vermont. Am J Public Health 1983;73:795-7. 

19. Thacker SB, Redmond S, Rothenberg R, et al. A controlled trial of disease 
surveillance strategies. Am J Prev Med 1986;2:345-50. 

20. Davis JP, Vergeront JM. The effect of publicity on the reporting of toxic-shock 
syndrome in Wisconsin. J Infect Dis 1982;145:449-57. 

21. Sacks JJ. Utilization of case definitions and laboratory reporting in the 
surveillance of notifiable communicable diseases in the United States. Am J 
Public Health 1985;75:1420-2. 

22. Alter MJ, Mares A, Hadler SC, et al. The effect of underreporting on the 
apparent incidence and epidemiology of acute viral hepatitis. Am J Epidemiol 

23. Hopkins RS. Consumer product-related injuries in Athens, Ohio, 1980-85: 
assessment of emergency room-based surveillance. Am J Prev Med 1989;5(2):104- 

24. Phillips-Howard PA, Mitchell J, Bradley DJ. Validation of malaria surveillance 
case reports: implications for studies of malaria risk. Journal of Epidemiology 
and Community Health-London 1990;44(2):155-61. 

25. Jackson C, Jatulis DE, Fortmann SP. The behavioral risk factor survey and the 
Stanford Five-city project survey: a comparison of cardiovascular risk behavior 
estimates. Am J Public Health 1991;82:412-6. 

26. Buehler JW, Stroup DF, Klaucke DN, Berkelman RL. The reporting of race and 
ethnicity in the national notifiable disease surveillance system. Pub Health 
Rep 1989;104:457-65. 

27. Rosenberg ML. Shigella surveillance in the United States, 1975. J Infect Dis 

28. Marks JS, Hogelin GC, Gentry EM, et al. The behavioral risk factor surveys: I. 
State-specific prevalence estimates of behavioral risk factors. Am J Prev Med 

29. Graitcer FL, Burton AH. The epidemiologic surveillance project: a 
computer-based system for disease surveillance. Am J Prev Med 1987;3:123-7. 


Chapter IX 

Ethical Issues 

Robert A. Hahn 

"Epidemiologists [and surveillance investigators] should be cognizant that many competing 
values may have moral weight equal to or greater than the freedom of scientific 
inquiry.... there are many clearly appropriate social restraints on epidemiologic research 
[and surveillance]." 



Webster defines ethics as "the discipline dealing with what is good and bad or right and 
wrong or with moral duty and obligation." A professional code of ethics provides a guide 
to right and wrong behavior. An ethical code is not a description of what practitioners 
(and others) actually do, but rather a prescription for what they should do. Ethical 
obligations derive principally from moral values--such as the "Golden Rule," presumably 
shared by the broader society--rather than from scientific principles, such as "formulate 
a hypothesis and a method before collecting data." However, ethical decisions require 
an understanding of the objectives, current issues, and methods of the scientific 
disciplines to which they refer. 


Over the past several decades, much ethical discussion in health--i.e., "bioethics"--has 
focused on clinical medicine and medical research, and thus on physicians and their 
patients and on researchers and research subjects. Because public health is concerned 
with the public, specific principles of bioethics may not apply directly to public health, 
although underlying moral values may be shared. Ethical principles associated with 
surveillance are perhaps closer to those of the social sciences than to those of clinical 
medicine or medical research (1). 

Indeed, public health ethics may conflict with the ethics of clinical medicine insofar 
as clinical ethics--represented by such issues as patient confidentiality--compromise 
public health (e.g., when the patient's condition threatens the health of others); or when 
the demands of public health compromise the rights of individuals (e.g., in quarantine); 
or when mass vaccination is required for public health despite the personal objections 
of individual patients (2). The practice of public health generally assumes that 
individual rights may be ethically superseded in the pursuit of public well-being and a 
greater public good (2). Epidemiologists and ethicists have recently collaborated in the 
formulation of ethical principles for epidemiology (3). 

Although certain characteristics may distinguish surveillance-related ethical issues from ethical 
issues in other areas of epidemiology and public health, many of the ethical issues 
confronting public health surveillance are similar to those of epidemiology. 
Consequently, much of the discussion in this chapter draws heavily on experience in 
epidemiologic research, where these issues have been more fully discussed. Public health 
surveillance may affect the public in several ways. Surveillance is the principal means 
by which the health status of the population is assessed; it can be used to identify 
problems, indicate solutions, plan interventions, and monitor change. As such, public 
health surveillance commonly requires widespread and repeated contact with the public it 
serves regarding basic and often personal matters of health and exposures to risk factors. 
In addition, surveillance systems may be linked with other systems, requiring compatible 
identifiers of individual records; and systems may be shared among researchers or public 
health officials, thus increasing chances of public disclosure. Many facets of 
surveillance may infringe on individual privacy and therefore may increase the risk of 
breaches of confidentiality. 

Several theories have been proposed to account for the basic principles underlying sound 
ethical decisions. Such theories are relevant in public health decisions about resource 
allocation, intervention, surveillance, and other issues, but are only briefly mentioned 
here. 

Some ethicists dispute the possibility of formulating general ethical principles, because 
they believe that correct ethics are specific to each situation (i.e., "situation ethics") 
(4). In contrast, most ethicists assume that ethical principles apply to different 
situations; these ethicists commonly adopt one of two positions about the nature of 
ethical rules. Utilitarians believe that ethical actions are those that most effectively 
distribute valued goods within the population; this position is sometimes equated with 
the epithet, "the end justifies the means." In contrast, deontologists believe that 
certain principles, such as honesty, are fundamental, and that ends, such as the 
distribution of goods in a population, do not justify the violation of fundamental 
principles. Public health intervention programs commonly combine utilitarian and 
deontological approaches. They attempt to maximize the distribution of health benefits, 
while maintaining a satisfactory level of morality in the means of distribution. 


Ethicists have formulated several basic moral principles that they believe underlie 
clinical medicine and research (5). Some of these basic principles apply to public health 

Respect for autonomy asserts that "autonomous actions and choices should not be 
constrained by others" (5). Basic to the notion of autonomy are self-determination 
and voluntary action. 

Beneficence is the principle that one should act to enhance the welfare of others. 
Although non-maleficence, or avoiding acts that might harm others, is sometimes 
viewed as a principle separate from beneficence, it may also be regarded as the 
first tenet of beneficence. That is, in order to benefit others, one must at least 
avoid doing them harm. 

Paternalism is the active pursuit of another person's well-being (as perceived by 
the pursuer), independent of--and sometimes contrary to--that person's express 
wishes. Paternalism may be regarded as a form of beneficence. While paternalism 
is generally thought of as protection of a person against harm to himself/herself, 
the notion may be broadened to include threatened harm to others. Paternalism 
commonly conflicts with respect for autonomy and, perhaps for this reason, is not 
a popular concept in the United States. It becomes useful when a person's capacity 
for autonomy is compromised (as may occur in sickness) or when personal autonomy 
may seriously compromise the well-being of others. 

Justice is the principle promoting the equitable distribution of burdens and 
benefits in society. Unfortunately, there is no agreed-upon definition of equity; 
the range includes an equal share for each person, each according to need, each 
according to effort, each according to societal contribution, or each according to 
presumed merit (5). 

Other ethical principles are regarded by some ethicists as independent and by others as 
derivative from more basic principles (5): 

Veracity is the duty of full disclosure of relevant information. Veracity is often 
considered a duty of clinicians or researchers but may also be a duty of patients 
or subjects. 

Privacy is the duty to respect a person's right "...of determining, ordinarily, to 
what extent his thoughts, sentiments, and emotions shall be communicated to others" 
(6). Privacy includes protection from unwanted intrusions and from the divulgence 
of personal information to others. The right to privacy may derive from respect 
for autonomy. 

Confidentiality is the duty not to disclose information about individuals without 
their consent. Confidentiality may be seen as a principle following from privacy. 

Fidelity, commonly applied to the relationship between physician and patient, is 
the duty to keep promises and maintain contracts. 


While conflicts among ethical principles are common--e.g., paternalism versus respect for 
autonomy--there is no simple prescription for resolving such conflicts. Utilitarians 
might choose one alternative and deontologists, another. Attempts to prescribe principles 
of conflict resolution emphasize that decisions should be accompanied by justification 
of the choice (7). 

In contrast to medical institutions, institutions of public health and epidemiology do
not license practitioners and do not maintain official sanctions against violations of
professional ethical standards (even insofar as such standards exist and are codified).
Public health practitioners are not sued for malpractice. Informal sanctions (e.g., the
avoidance of unscrupulous colleagues or loss of one's job) occur, but have not been
systematically described. Some epidemiologists have recently proposed an ethical duty
to monitor and address the unethical practices of their colleagues (7). In contrast to
the absence of collegial sanctions in public health, some aspects of epidemiology and
surveillance are governed by law (e.g., violations of confidentiality by surveillance
personnel) (8).

Varying degrees of contact are involved in different forms of surveillance. Environmental
surveillance (e.g., of environmental lead or rates of Lyme disease infection of ticks)
may involve contact with animals or the physical environment rather than with humans;
surveillance using hospital records or death certificates involves indirect human contact;
surveillance by household interviews and/or physical examinations requires face-to-face
and/or physical contact. Ethical principles may vary from situation to situation and are
likely to be more stringent as more human contact is involved.

This chapter focuses on surveillance involving face-to-face human contact. Also
considered are surveys such as the Health Interview Survey, the National Health and
Nutrition Examination Survey, and the Vital Statistics System of the Centers for Disease
Control's National Center for Health Statistics. These surveys or statistical systems
may not meet the stringent objectives of public health surveillance, but because they
entail collecting personal information on individuals and are widely used for
surveillance, they provide examples of the issues surrounding data collection. The U.S.
Census is also considered, because census information plays an essential role in
providing denominators for surveillance data.

The collection of public health information may involve the participation of many 
individuals and institutions. Potential participants include not only the investigator 
and subjects of surveillance but persons in the immediate social environment of study 
subjects, the investigator's colleagues, the broader public health community, clinicians, 
and society at large. Explicit and implicit relations among these parties delineate their 
ethical obligations to one another (Table IX.1). Ethical issues are reviewed below by
focusing on several of these relationships. 


Surveillance practitioners and society at large. The practice of public health may
be regarded as one means by which a society addresses issues of well-being in the
population. Public health practitioners retain an essential connection with society at
large; ultimately, they are supported by and act at the behest of their public
constituency. The assumption is that, as they pursue and achieve public interests, they
should be supported by society in their work.

As agents of public welfare, public health practitioners have several ethical 
responsibilities as outlined below: 

Choice of surveillance topics. In pursuit of beneficence, as well as in upholding 
public fidelity, practitioners should conduct surveillance on priority issues with 
potential public health benefit (7). "As a parallel in a research study, it would be 
unethical to ask anyone to participate that has little likelihood of producing meaningful 
results or furthering scientific knowledge for the good of society" (5). Insofar as 
surveillance findings are basic indicators of health inequities and trends (e.g., in risk
or exposure, health-care access, morbidity, or mortality), the pursuit of justice is also
a primary moral rationale for surveillance. 

Judgments of priority and potential benefit should be based on explicit criteria, such 
as the criteria for the strength of scientific evidence used by the Preventive Services 
Task Force (10). Perhaps paradoxically, surveillance results themselves facilitate the
determination of priority issues (e.g., the magnitude and location of health problems
in the population).

Avoidance of conflicts of interest. As with other epidemiologic activities,
surveillance may be prone to conflict of interest. "Virtually all epidemiologic research
is sponsored, and few if any research sponsors, public or private, are disinterested in
the outcome of their epidemiologic research" (12). In their commitment to public
well-being, practitioners of surveillance must ensure that data are collected to answer
scientific or public health questions effectively, rather than to serve the interests of
financial and institutional sponsors or to "prove" personal preconceptions. For example,
practitioners must ensure that populations surveyed and questions asked are appropriate
to assess the issues considered and not to find "results" desired by a sponsor.
Epidemiologists have presented guidelines for avoiding conflicts of interest (12); the
guidelines apply to surveillance activities as well.


• The investigator's independence from the sponsor must be
maintained in the design, conduct, and reporting of
epidemiologic (and surveillance) results. Written agreement
between investigator and sponsor may increase the likelihood
of independence.

• Investigations should not be conducted in secrecy, and results 
should be published in a timely fashion. 

• Decisions on release and publication of results should not be 
influenced by the interests of sponsors. 

• All sponsorship should be acknowledged. 

• Decisions regarding the dissemination and publication of results 
should be made by the investigator rather than the sponsor. 

Bond (13) has suggested that certain private industries may have an ethical obligation
to monitor the effects of their activities--for instance, the exposures and health of their
employees. Rothman (11) has argued that it is unethical to judge the results of
investigations simply on the basis of sponsorship (e.g., by private industry). Rather,
investigations should be judged by the quality of the work involved.

Methodologic and analytic scrutiny. The principle of beneficence requires that one 
choose the best feasible method of investigation and that one appropriately analyze 
results — thus requiring knowledge of scientific methods (7). 

Interpretation and recommendation. The principle of beneficence also requires (as 
does the concept of surveillance itself) that surveillance data be interpreted and used 
to assess and address public health problems. 

Report of findings. Finally, the principle of beneficence requires that surveillance 
results be reported understandably, sensitively, and responsibly, in a timely fashion, 
with scientific objectivity and caution, appropriate confidence, and appropriate doubt. 
"Epidemiologists should carefully avoid being placed in a situation in which their results 
might be suppressed or inappropriately edited by either internal or external influences" 
(7). Some (14) have argued that epidemiologists should be advocates for the positions
firmly supported by their data. Others (15) have asserted that epidemiologists are
legitimate expert witnesses. Practitioners of surveillance must also be free of internal
or external constraints and must be able to present the results of their work objectively.



Surveillance subjects do not usually benefit directly from surveillance, though some 
benefit to them may accrue as a side-effect (e.g., when surveillance subjects are given 
physical examinations or when a discovery made by surveillance serves a health need of 
a surveillance subject). When an adverse health condition is determined in the course
of surveillance, it is the responsibility of the investigator to provide the surveillance
subject with timely information about the discovered condition; if the condition is
complex or sensitive, such information may be best conveyed by the subject's physician,
trained counselors, or local public health officials (9).

Non-Maleficence

A more common ethical issue in surveillance is non-maleficence. Surveillance subjects must
not be harmed in the course of the surveillance program. When invasive procedures are
deemed necessary to the surveillance system--including psychologically as well as
physically invasive procedures--care must be taken that subjects do not suffer undue
reactions (9).

Epidemiologists have recognized a need to be culturally sensitive to the populations they 
are studying. Cultural sensitivity may be a component of beneficence, non-maleficence, 
and autonomy, and may also enhance the effectiveness of the investigation. Cultural 
sensitivity is important not only during the course of surveillance but also in the 
appropriate reporting of results. 

Non-maleficence may also require that survey participants be compensated for their
participation. Compensation should at least cover the costs of participation--e.g.,
transportation, lost work time, and child care. While altruism and the personal
contribution to potential public health benefits may motivate some prospective
participants in a data collection system, additional compensation may increase the
participation of others--a pragmatic rather than an ethical justification for payment.

Protection of Privacy 

Non-maleficence may also underlie respect for privacy. Protection of privacy requires 
not only restraint in intrusion and in the disturbance of persons in their private lives 
but assurance that once information (or a specimen) has been collected, it will not be 
distributed to others in a form that identifies the surveillance subject (see
Chapter X) (16).

Beauchamp et al. propose three situations in which the invasion of privacy by 
epidemiologists (and surveillance investigators) is justified (7): 

• The invasion of privacy is a necessary aspect of the 

• There is no reason to suspect that subjects of the 
investigation will be placed at substantial risk (e.g., of 
being fired or divorced).

• The research must have potential social benefit. 

In Public Law 93-579 (17), the Congress states the following: 

"(2) the privacy of an individual is directly affected by
the collection, maintenance, use, and dissemination of
personal information by Federal agencies;...

(4) the right to privacy is a personal and fundamental right
protected by the Constitution of the United States; and

(5) in order to protect the privacy of individuals identified
in information systems maintained by Federal agencies, it
is necessary and proper for the Congress to regulate the
collection, maintenance, use, and dissemination of information
by such agencies."

In the United States, public health surveillance activities conducted under the auspices 
of the Executive Branch (thus including the Department of Health and Human Services and 
the Bureau of the Census) are regulated by the Public Health Service Act and by the 
Privacy Act of 1974 (17). Both acts regulate contractors of federal agencies as well as
the agencies themselves. Regulations apply to "establishments"--i.e., institutions--as
well as to individuals surveyed. They address "systems of records" "...from which
information is retrieved by the name of the individual or by some identifying number,
symbol or other identifying particular assigned to the individual" (17). Thus, records
without identifiers are exempt from these regulations.

While the Privacy Act focuses on the disclosure and dissemination of information already 
collected, the act also restricts surveillance information that may be collected by 
stipulating that records may contain only "such information about an individual as is 
relevant and necessary to accomplish a purpose of the agency...." This enforces the 
ethical obligation to conduct surveillance on issues with potential public health benefit. 
In addition, the Privacy Act prohibits use of surveillance (or other information) "for 
any purpose other than the purpose for which it was supplied unless such establishment 
or person has consented...to its use for such other purpose...." (18).

The Privacy Act gives individuals the right to obtain their own records, to correct errors 
in the record, and to receive an accounting of how the record has been disseminated. 
Exemptions to individual access include the use of records maintained for statistical 
purposes only (rather than for administrative use). Census information, for example, is
exempt. Exemptions must meet specific criteria and must be published in the Federal
Register.

The Privacy Act requires that federal agencies train and regulate personnel with access 
to record systems and that agencies maintain physical means of protecting records from 
unwarranted access. Agencies are also required to describe their record systems and to 
report procedures used to comply with requirements in the Federal Register. Criminal 
penalties and fines may be imposed on persons who violate the stipulations of the act. 

Informed Consent 

The Privacy Act regulates not only the collection and maintenance of record systems, but
also the informed consent procedures by which they are collected and matters of
confidentiality involved in the dissemination of records that have been collected.
Informed consent is a requirement based on respect for autonomy. Informed consent must be
obtained primarily in the context of surveys and studies. Administrative, medical-care,
and legally mandated information-collection systems should also consider obtaining
informed consent. The Privacy Act requires that potential participants in record systems
be informed of a) the authority under which the data are collected, b) the purposes of
the information, c) routine uses of the information, and d) the consequences of not
participating. Informed consent is required for "establishments" (through their
representatives) as well as for individuals.

Epidemiologists and philosophers have proposed several elements to be included in 
comprehensive informed consent: 

• Reasonable disclosure of the goals and uses of the study
(or surveillance activity).

• Evidence of comprehension on the part of prospective
participants. The response of potential respondents to
surveys following appropriate information is sometimes
regarded as evidence of consent, despite the lack of
evidence of respondent comprehension (19).

• Voluntariness on the part of prospective participants.
"All forms of duress or undue influence are to be
scrupulously avoided" (7).

• Competence on the part of prospective participants. 

• Consent of prospective participants. 

Possible harm from the surveillance--e.g., from some physical test--should also be explained
to prospective participants. To guarantee autonomy, comprehensive informed consent should 
also be receptive to informed dissent and non-participation or to withdrawal at any point 
in the research or surveillance activity. 

Feinleib (9) argues that "the first responsibility of the epidemiologist to the subject
is to be clear about the objectives of the study." He also allows that, when the goals 
of epidemiologic investigations (or surveillance) are complex or when full disclosure 
might bias responses, comprehensive disclosure may not be required, so long as the 
respondent is "...not deliberately misled into participating in a study that the 
investigator knows is against the respondent's interests" (9). This paternalistic 
principle may compromise the participant's autonomy. 

Disclosure, Dissemination, and Confidentiality 

The Privacy Act forbids the disclosure of information in which individual identity is 
ascertainable, unless the subject has agreed to disclosure. This principle thus protects 
the confidentiality of individuals and affects the dissemination of surveillance findings 
(see Chapter X).

Records protected by the Privacy Act are exempt from Freedom of Information Act (FOIA) 
requests. FOIA specifically exempts "personal and medical files and similar files the 
disclosure of which would constitute a clearly unwarranted invasion of personal privacy" 
and matters "specifically exempted from disclosure by statute" (19). Federal surveillance 
data are also commonly exempt from subpoena and may be explicitly exempted by 
authorization of the Secretary of Health and Human Services (18). Census data, too, are
exempt from FOIA access.

There are several dimensions of disclosure (19):

• Exact disclosure, which indicates a precise (numerical)
value of some characteristic (e.g., precise income or age)
associated with an individual, versus approximate disclosure,
which indicates a range of values associated with an individual.

• Probability-based disclosure indicates the likelihood (<100%)
that some characteristic is associated with an individual,
while certainty disclosure indicates (with 100% likelihood) that
the characteristic is associated with the individual.

• Internal disclosure associates an individual with a characteristic 
on the basis of evidence found within one particular study or 
survey, while external disclosure associates individuals and 
characteristics by linking studies or surveys. 
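
The difference between exact and approximate disclosure can be illustrated with a short Python sketch. It is illustrative only: the function name to_range and the band widths are assumptions chosen for this example and are not taken from the sources cited in this chapter. The function replaces a precise value, such as an age or an income, with the range that contains it, so that a released record supports only approximate disclosure.

```python
def to_range(value: int, width: int) -> str:
    """Replace an exact non-negative value with the band that contains it.

    Hypothetical helper: the name and the band widths used below are
    assumptions for illustration, not an established standard.
    """
    if value < 0 or width <= 0:
        raise ValueError("value must be non-negative and width must be positive")
    lower = (value // width) * width   # lower bound of the band
    upper = lower + width - 1          # inclusive upper bound of the band
    return f"{lower}-{upper}"


# An exact age of 37 is released only as a 10-year band, and an
# exact income of 52,300 only as a 10,000-wide band.
print(to_range(37, 10))        # 30-39
print(to_range(52300, 10000))  # 50000-59999
```

Wider bands reveal less about any one individual but also support less detailed analysis; the appropriate width depends on the surveillance purpose.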

Since absolute protection against disclosure might make the use of surveillance information
impossible and would severely hamper programs of disease control and prevention,
non-disclosure requirements have been interpreted as protecting individuals from harm while
allowing appropriate use of surveillance information. For example, publication of
analyses or tables with small numbers of conditions such as fetal or infant deaths or
deaths from rabies in a county--allowing the identification of individuals--is said to
be reasonable because these exceptions "...have been accepted traditionally and because
they rarely, if ever, reveal any information about individuals that is not known socially"
(20). Also exempt is publication of small numbers if the identifying characteristics are
judged not to be "sensitive."

Two kinds of breaches of confidentiality should be differentiated. In the first, 
information collected in confidence by a clinician or public health practitioner should 
be divulged if the information substantially threatens the welfare of another person 
(21,22). Divulging information need not reveal the identity of the first individual, but 
such revelation may be unavoidable. This is a common occurrence associated with "contact 
tracing" for sexually transmitted diseases. The public health responsibilities of 
clinicians and public health practitioners may override duties of confidentiality to 
individual patients and surveillance subjects, even though their actions abrogate privacy, 
autonomy, and even beneficence. In the second kind of breach of confidentiality, 
revelation of information and the identity of an individual serves no public health 
purpose and is therefore unethical. 

Several techniques may mitigate the likelihood of disclosure and may legitimate the 
publication of otherwise protected data: a) small samples (e.g., <10% of the data) hamper 
efforts to identify which individual in the population a sampled individual represents, 
b) the deliberate creation of errors or imputations of missing data allows that any given 
datum may be an error or an imputation rather than a true observation, c) incompleteness 
of reporting allows that an individual may not have been included in the survey, and d) 
lack of sensitivity of the information in question (because of prior publication or 
historical time frame), so that publication reveals no harmful information. 

In the United States, individual states use surveillance information for their own
disease-control programs. As major surveillance agencies, the states have been critically
concerned with issues of confidentiality (23). While all states have provisions for
complying with freedom of information requests and maintaining confidentiality of
information, they vary in specific regulations and their enforcement. Twenty-five states
have general confidentiality requirements with little specific definition; seven states
require written consent for release of information; five states exclude surveillance
information from subpoena; and 10 states have penalties for unlawful disclosure of
information on some or all reported infectious diseases (23). The states are concerned
with the protection of the confidentiality of data released for federal surveillance
systems and, in collaboration with CDC, have established confidentiality guidelines (23).

Several procedures are commonly used to protect the confidentiality of records in 
surveillance investigation settings, disseminated data sets, and published tabulations 
and analyses: 

a. Names or other personal identifiers are necessary in public health 
surveillance for two principal, related purposes: to follow up individuals for the 
determination of subsequent health events and to link data systems for additional 
information on individuals. Surveillance functions which require neither follow-up nor 
linkage may avoid problems of confidentiality by not using names or other identifiers. 
It should be noted, however, that the absence of identifiers, as in "blinded" studies, 
may preclude informing surveillance subjects of adverse surveillance findings. 

b. When names or other identifiers are justified, problems of disclosure may 
be minimized with use of protected or "scrambled" identifiers, which make association 
between records and individuals difficult. The use of identifiers in record systems and 
separate files relating identifiers and individuals maintained in separate, secure areas 
is a common means of minimizing disclosure. 

c. Identifying information can be destroyed once it has served its designated 
follow-up or linkage function. 

d. The collection of data that will not be used and that might serve
to identify individuals should be avoided.

e. Precise data--e.g., dates of birth or death or income in exact dollar 
amounts, residence by block or street or address--are rarely essential; data-range 
specifications are most often adequate for surveillance purposes. Since precise data 
facilitate identification of individuals, the use of data ranges is preferable if 
surveillance goals can be achieved with such information. 

f. In some surveillance investigations, linkage with other surveillance sources
is necessary to determine additional information. In this case, the Privacy Act requires 
that federal agencies and personnel involved be trained in and comply with common 
regulations of privacy and confidentiality. 

g. Suppression of analyses or tables with cells with small numbers in
publications (19):

i) no table should include a row or column in which all
cases are found in one cell,

ii) the marginal total of any row or column should not be
fewer than three,

iii) no estimate should be based on fewer than three cases,

iv) no estimates should be published if one case contributes
more than 60% to that estimate,

v & vi) no characteristics of individuals should be
identifiable by calculation from other tabulated data in
the same or other data sets.

Solutions to the problem of small numbers may be the aggregation of rows or columns or
the suppression of data in cells and marginal totals.
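
The small-number rules listed above lend themselves to an automated check before a table is released. The following Python sketch is a hedged illustration under stated assumptions: the function names (check_line, check_table, estimate_publishable) are hypothetical, counts and contributions are assumed to be non-negative, and a marginal total of zero is treated as "fewer than three" under a literal reading of rule ii. Rules v and vi, which concern identification by calculation across tables, are not covered.

```python
from typing import List

MIN_CASES = 3      # rules ii and iii: "fewer than three"
MAX_SHARE = 0.60   # rule iv: one case contributing "more than 60%"


def check_line(label: str, cells: List[int]) -> List[str]:
    """Apply rules i and ii to one row or column of case counts."""
    problems = []
    total = sum(cells)
    # Rule ii: the marginal total should not be fewer than three.
    if total < MIN_CASES:
        problems.append(f"{label}: marginal total {total} is fewer than {MIN_CASES}")
    # Rule i: all cases should not be found in a single cell.
    if total > 0 and len(cells) > 1 and max(cells) == total:
        problems.append(f"{label}: all cases fall in one cell")
    return problems


def check_table(counts: List[List[int]]) -> List[str]:
    """Return every rule i or rule ii problem found in a table of counts."""
    problems = []
    for i, row in enumerate(counts):
        problems += check_line(f"row {i}", list(row))
    for j, column in enumerate(zip(*counts)):
        problems += check_line(f"column {j}", list(column))
    return problems


def estimate_publishable(contributions: List[float]) -> bool:
    """Apply rules iii and iv to the per-case contributions behind one estimate."""
    total = sum(contributions)
    if len(contributions) < MIN_CASES or total <= 0:
        return False  # rule iii, or nothing to estimate
    return max(contributions) / total <= MAX_SHARE  # rule iv


print(check_table([[5, 4], [6, 7]]))       # [] (no problems found)
print(estimate_publishable([70, 20, 10]))  # False: one case contributes 70%
print(estimate_publishable([40, 30, 30]))  # True
```

A table that fails the check would then be handled as the text describes, by aggregating rows or columns or by suppressing cells and marginal totals; the sketch only flags problems and does not choose among those remedies.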


In the ethics of public health surveillance, the principle of veracity is usually 
considered in the disclosure by investigators of the goals and uses of surveillance 
information. However, veracity may also be an ethical duty of surveillance subjects (to 
the investigator as well as to society) once they participate. Deception by subjects may 
contribute to erroneous results and public health harm. 

Investigators and Persons in Subjects' Social Environments 

During the course of surveillance, it may be discovered that some condition of the 
surveillance subject (e.g., an infectious disease or violent intentions) might severely 
affect or might have affected the well-being of other persons in the subject's social 
environment. In this case, it may be the ethical duty of investigators to inform 
appropriate authorities (e.g., public health officials or law enforcement agents) of these 
circumstances (9). Paternalistic social beneficence might justify the breach of
confidentiality.

Surveillance and the Public Health Community 

Public health surveillance practitioners have the duty of having their work reviewed by 
colleagues for ethical as well as scientific integrity; they also have the responsibility 
of reviewing the work of others. The review process requires the sharing of methods and
findings. Ethical--as well as scientific--critiques must be balanced. "Epidemiologists
and many research scientists often search in detective-like fashion for flaws in the
studies of those they review, even though the studies may contain substantial merit" (7).


While some agencies have policies to protect researchers' primary use and control of the 
data they collect (24), others have favored broader access (25). Ethical principles
justifying broad access are detailed below. 

• Enhancing the quality of science by allowing reanalysis
and confirmatory studies--thus potentially contributing
to public welfare

• Expanding knowledge by facilitating additional analyses-- 
thus also potentially contributing to public welfare 

• Reducing the burden of surveillance on subjects 

• Reducing the burden of surveillance on practitioners 

Epidemiologists and ethicists have also argued that practitioners have the obligation to 
promote ethical behavior in the public health community and to confront ethically 
unacceptable behavior of colleagues (7).


Physicians, laboratorians, and other health-care practitioners play a critical role in 
reporting infectious diseases to local and state health departments. Reporting traumatic 
events (e.g., gunshot wounds and child abuse) is also required in some states (26).
Fulfilling these duties may prevent further infection or trauma. While reporting selected
diseases and injuries is mandatory for physicians and others in all states, completeness
of reporting is said to range from 6% to 90% for many notifiable diseases (27); reporting
laws are seldom enforced.

Investigators and Clinicians 

Investigators have a duty to report findings to clinicians. Findings may concern the 
welfare of a clinician's patients who have been surveillance subjects. Findings from 
surveillance investigations may also have implications for patients in general or patients 
with certain conditions. 

The scale and significance of public health surveillance demand scrupulous and ongoing
attention to ethics as well as to science (Table IX.2). Ethics should not be regarded
as an afterthought, or worse, an obstacle, to professional practice, but as an element
vital to its foundation and goals.


1. Cassel J. Ethical principles for conducting fieldwork. Am Anthropologist

2. Lappe M. Ethics and public health. Maxcy-Rosenau Public Health and Preventive
Medicine 12th ed. Norwalk, Connecticut: Appleton-Century-Crofts, 1986:1867-77.

3. Soskolne CL. Ethical decision-making in epidemiology: The case study approach.
J Clin Epidemiol 1991;44(Suppl 1):125S-30S.

4. Fletcher J. Morals and medicine. Boston: Beacon Press, 1960.

5. Beauchamp TL, Childress JF. Principles of Biomedical Ethics. 3rd ed. New York:
Oxford University Press, 1989.

6. Warren SD, Brandeis LD. The right to privacy. Harvard Law Review 1890;4:193-220.

7. Beauchamp TL, Cook RR, Fayerweather WE, Raabe GK, Thar WE, Cowles SR, Spivey GH.
Ethical guidelines for epidemiologists. J Clin Epidemiol 1991;44(Suppl 1):151S-

8. Lako CJ. Privacy protection and population-based health research. Soc Sci Med

9. Feinleib M. The epidemiologist's responsibilities to study participants. J Clin
Epidemiol 1991;44(Suppl 1):73S-9S.

10. U.S. Preventive Services Task Force. Guide to clinical preventive services: An
assessment of the effectiveness of 169 interventions. Baltimore: Williams &
Wilkins, 1989.

11. Rothman KJ. The ethics of research sponsorship. J Clin Epidemiol 1991;44(Suppl
1):25S-8S.

12. Stolley PD. Ethical issues involving conflicts of interest for epidemiologic
investigators. A report of the committee on ethical guidelines of the society for
epidemiologic research. J Clin Epidemiol 1991;44(Suppl 1).

13. Bond GG. Ethical issues relating to the conduct and interpretation of
epidemiologic research in private industry. J Clin Epidemiol 1991;44(Suppl 1):29S-

14. Last JM. Obligations and responsibilities of epidemiologists to research subjects.
J Clin Epidemiol 1991;44(Suppl 1):95S-101S.

15. Cole P. The epidemiologist as an expert witness. J Clin Epidemiol 1991;44(Suppl
1):35S-9S.

16. Greenawalt K. Privacy, in Reich, WT., ed. Encyclopedia of bioethics. New York:
Free Press, pp. 1356-63.

17. The Privacy Act, Washington, D.C.: U.S. Government Printing Office, 1974.

18. Public Health Service Act, Washington, D.C.: U.S. Government Printing Office, 1944,
as amended.

19. Centers for Disease Control (CDC). Staff manual on confidentiality. 1984.

20. National Center for Health Statistics. NCHS staff manual on confidentiality.

21. Vernon TM. Confidential reporting by physicians. Am J Public Hlth 1991;81:931-2.

22. Teutsch S, Berkelman RL, Toomey KE, Vogt RL. Reporting for disease control
activities. Am J Public Hlth 1991;81.

23. Vogt RL. Confidentiality: Perspectives from state epidemiologist, in Challenge
for Public Health Statistics in the 1990's (sic). Proceedings of the 1989 Public
Health Conference on Records and Statistics, National Center for Health Statistics,
July 17-19, Washington, D.C.

24. Sharing Research Data. Washington, D.C.: National Academy Press, 1985.

25. Hogue CJR. Ethical issues in sharing epidemiologic data. J Clin Epidemiol
1991;44(Suppl 1):103S-7S.

26. Smith GR. Health care information confidentiality and privacy: A review and
analysis of state and federal law. Emory University School of Law, unpublished ms,

27. Thacker SB, Berkelman RL. Public health surveillance in the United States.
Epidemiol Rev 1988;10:164-90.



Public Health Surveillance and the Law 

Gene W. Matthews 
R. Elliott Churchill 

"The people's good is the highest law." 

Marcus Tullius Cicero 


Public health surveillance and the law are joined by so many interconnecting links that 
virtually every aspect of a surveillance program is associated with one or more legal 
issues. In the United States, and throughout the world, many surveillance efforts 
have been effected through mandates enforced by statutes or regulations. By the same 
token, reports derived from the interpretation and application of data from 
surveillance programs have been used to drive legislation relating to public health. 

Public health surveillance involves the collection, analysis, interpretation, and 
dissemination of data. It may be useful to have a working definition of the law to 
meld with this description of surveillance. In essence, as Wing observes, the law
is "the sum or set or conglomerate of all of the laws in all of the jurisdictions:
the constitutions, the statutes and the regulations that interpret them, the
traditional principles known as common law, and the judicial opinions that apply and
interpret all these legal rules and principles" (1). However, that is by no means
all. The law is also the legal profession, and, in order to understand the law, we 
must try to understand the lawyers--how they think, how they speak, and what roles 
they play in the legal process. In addition, from a very practical point of view, 
the law is also the legal process—legislatures and their politics, as well as the 
time, efforts, and costs associated with changes in legislation. Finally, the law 
is what it is interpreted to be. This takes us back to the lawyers, as well as to 
the judges in the legal system. 

We cannot avoid what Wing describes as "the traditional barrier" between the legal
profession and the rest of the world. He continues with the observation that "the
legal profession has for centuries done many things to surround the practice of law
with a quasi-mystical aura. Much as the medical profession would have us believe
that there is something almost sacred about medical judgment and that only a
physician can understand it, lawyers have perpetuated the only partially justified
myth that there is something called legal judgment that only someone with the proper
mix of formal education, practical experience, and appropriate vocabulary can make"
(1).

"The basic function of the law is to establish legal rights, and the basic purpose
of the legal system is to define and enforce those rights. . . . Legal rights" are
the "relationships that establish privileges and responsibilities among those
governed by the legal system" (1). This concept of "legal rights" does not purport
to cover freedoms or interests given unconditional, global protection, but rather it
covers the protection of carefully specified interests against the effects of other
carefully specified interests. Finally, some rights are protected, not by statute
or regulation, but by an understanding and application of the prevailing ethics in
an area. In general, ethics are regulated through whatever sanctions are imposed
against censured behavior by peers or colleagues (see Chapter IX).


This orientation is pivotal in our discussion of legal issues associated with 
surveillance because the reader must continue to be alert to the fact that everything 
in this chapter is subject, first of all, to different interpretations in different 
legal settings, and, second, to amendment of both statute and practice. 

The task of surveillance as an applied science could be simplified considerably by 
avoiding any discussion of legal issues. Although this observation is probably 
valid, we have already pointed out that surveillance very often takes place under 
statute. Beyond this fact, the relevance of the definition of the police powers of 
a state must be acknowledged, i.e., "powers inherent in the state to prescribe, 
within the limits of state and federal constitutions, reasonable laws necessary to 
preserve the public order, health, safety, welfare, and morals" (2). That describes 
a sweeping scope of authority and certainly covers anything that would be dealt with 
under the heading of "public health surveillance."

In other words, one cannot look at surveillance and claim to have created an accurate 
picture without considering the legal constraints and processes that accompany it-- 
particularly since, for public health surveillance, we have added the component of 
"timely dissemination of the findings" to our definition of surveillance. How
information is collected, from and about whom it is collected, how it is interpreted, 
and how and to whom the results are disseminated all must be scrutinized under the 
umbrella of "accepted practice" and "the law." The sections that follow contain 
information specific to the United States, but for an international orientation, the 
issues and concerns remain basically constant, while the written body of the law and 
the process through which the law is enacted and enforced vary widely. 

If the reporting component of public health surveillance is treated as a requirement, 
one can assert that such surveillance began in the United States in 1874 in 
Massachusetts, when the State Board of Health instituted the first statewide 
voluntary plan for weekly reporting of prevalent diseases by physicians. By the turn 
of the century, the forerunner of the Public Health Service had been established, and 
laws in all states required that certain communicable diseases be reported to local 
authorities (3).



With the development and growth of surveillance in the United States in the early 1900s 
came the inevitable conflicts created when the interests of one human being conflict with 
those of another individual or political unit. Much of the debate took place because of 
the problem the United States was experiencing with sexually transmitted diseases — which 
became even more acute with the participation of American troops in World War I. The 
issues were basically 

• the moral dilemma created by not reaching consensus on the purpose of 
information obtained through surveillance (i.e., whether to direct control 
efforts toward sexual behavior of the individual or toward the disease 
agents),

• the debate surrounding the duty of the physician to his/her patient and to 
society, and 

• the disagreement about whether government provision of health services 
comprised unfair competition to the private practitioner. 

Since these concerns still have not been completely resolved in the United States as of 
the 1990s, they are examined in more detail. 

Social Hygiene Versus the Scientific Approach 

By the early 1900s, the epidemiology of syphilis was reasonably well-documented. This
understanding did not constitute an unmixed blessing. As William Osler told his students
at the Johns Hopkins Medical School in 1909, "In one direction our knowledge was widened
greatly. It added terror to an already terrible disorder" (4). Aside from the scope of
the destructive powers of syphilis, physicians were just beginning to appreciate the fact
that many "innocent victims" were contracting this disease. The prevailing wisdom of
earlier years of "reaping what one sowed," as well as other statements of poetic and moral
justice, was no longer adequate when women of "good family" and unblemished reputation 
were known to have contracted syphilis from their spouses and when children suffered 
severe effects from congenital syphilis. 

What the medical and public health officials apparently had the most difficulty 
reconciling was how to direct their efforts to deal with the growing problem of syphilis. 


Both surveillance and treatment efforts could be directed toward a) people, a focus on 
behavior modification through education as a control strategy or b) the disease agent,
a focus on the organism that caused the disease and how to eradicate it from individuals 
and society at large. Neither approach to syphilis control was ever agreed to be the 
ideal, and, in fact, the two in combination have still not proved totally effective. The 
tensions represented by the "moralistic" and the "scientific" approaches are, moreover, 
still quite evident in public health practice and surveillance in the 1990s. 

One only has to review the popular press for the past several years to see how the "moral 
versus scientific" dilemma relates to public health in the context of such currently 
serious problems as human immunodeficiency virus/acquired immunodeficiency syndrome 
(HIV/AIDS) and the reemergence of multidrug-resistant strains of tuberculosis. 

Duty of Physicians 

The concept of the confidential nature of communication between patient and physician is 
clearly stated in the Hippocratic Oath and has continued to be emphasized in legal and 
social settings. In the context of the syphilis epidemic in the United States in the 
early years of the 20th century, this concept became a crucial point of debate in efforts 
to control the spread of the disease. Physicians did not wish to breach the confidence 
relied on by their patients by reporting cases of syphilis to the authorities; by the same 
token, if they did not report the occurrence of syphilis--if not to the authorities at 
least to the patients' spouses--they were tacitly participating in the continued 
transmission of the disease to "innocent victims." The entire issue boils down to primary 
responsibility to an individual or to society. It clearly has not been resolved but 
constitutes an important component of the success or failure of present-day surveillance.

Economic Competition 

Also as yet unresolved is the problem created for public health officials and for 
practicing physicians in the early 1900s by the need, on the one hand, to have physicians 
report all cases of sexually transmitted disease and to establish public health clinics 
to provide prompt treatment and education to patients and, on the other hand, the need 
for public health officials to protect the financial interests of physicians by not 
infringing on their turf and removing paying customers to free or financially subsidized
facilities. At the same time, it did not seem reasonable to expect the physicians to make
such reports and refer such patients for treatment elsewhere when it would mean, in 
essence, taking money out of their own pockets. For surveillance efforts, this dilemma 
guaranteed underreporting of cases, with the selective reporting of cases representing 
patients who could not pay and the withholding of reports of cases representing patients 
who could pay.

Of concern to the 1990s surveillance effort, and again in the context of HIV/AIDS, 
physicians might choose not to report cases of HIV positivity for fear their patients 
might be discriminated against in a work or social setting. Problems with insurance 
coverage might also lead to such underreporting. 


During the period of the 1940s-1970s, states added many diseases to their mandatory 
reporting lists. Even in states that did not enact legislation to require additional 
reporting, surveillance/reporting efforts were broadened during this period through state 
regulation or directive from the state health commissioners (5).

In contrast, surveillance and reporting to agencies in the federal government were--and
continue to be--voluntary. The resulting discrepancy in data obtained on a particular
disease at the state and federal levels leads to problems in analysis and interpretation.
However, several professional organizations, including the Association of State and
Territorial Health Officials (ASTHO) and the Council of State and Territorial
Epidemiologists (CSTE), have been instrumental in setting up a patchwork system to
coordinate and improve the quality and completeness of surveillance data. 

A major factor in the development of surveillance planning and implementation during this 
period is represented by the institution in 1976 of the Federal Protection for Human 
Subjects Regulations. One of the most well-known of the regulations states the 
requirement that "informed consent" be obtained from any person who is asked to 
participate in a medical research project. In addition, the regulation covers 
compensation for persons injured during the course of the project and confirmation of the
ethics of the research being conducted.

CURRENT LEGAL ISSUES (1980 to the Present) 

There is little dispute that biomedical research and surveillance activities of the 1980s 
were greatly affected by concerns and reactions associated with the HIV/AIDS epidemic. 
All the old issues from early in the 20th century reemerged at critical levels: Do we 
want to treat persons for the disease, or do we want to modify their behavior in 
control/prevention efforts? Is the physician's primary duty to protecting a patient's 
privacy or to the greater good of society? Is the public health machine treading on the 
physician's turf by advertising and providing medical treatment more inexpensively than 
the physician can? 

Although these questions still need to be answered fully, public health action cannot wait 
until consensus is reached before constructing and applying interventions. The sections 
below examine four key legal issues that relate to these questions and have a major impact 
on surveillance in the 1990s. 

Personal Privacy 

The right of an individual to have his/her privacy protected under the law is a vast gray 
area. The U.S. Constitution does not specify a right to privacy, although particulars 
relating to the protection of privacy under particular circumstances are included in the 
Bill of Rights (protection from "search and seizure," etc.). As noted earlier in this 
chapter, the issue of right to privacy and the physician's role in protecting that privacy 
through the concept of privileged communication emerged as a hotly debated issue during 
the war on sexually transmitted diseases in the United States in the early years of the 
20th century. The concept of the so-called "medical secret" (6) involved the dilemma that 
faced a physician whose male patient had a sexually transmitted disease (for which there 
was no sure cure), whose reputation the physician wished to spare, but whose spouse or
future spouse was at risk of having the disease if the physician did not step forward and 
report it. Many physicians opted to remain within the accepted double standard of 
behavior of the day and, according to Prince Morrow, became "accomplices" in the further 
transmission of infection (7). The medical secret was described by one physician as a 
"blind policy of protecting the guilty at the expense of the innocent," and a New York 
attorney ventured the opinion that "a physician who knows that an infected patient is
about to carry his contagion to a pure person, and perhaps to persons unborn, is justified
both in law and in morals, in preventing the proposed wrong by disclosing his knowledge 
if no other way is open" (7). 

Unfortunately, the right to privacy issue was no more resolved in the early 20th century 
United States than was the public health problem created by the nationwide problem of 
sexually transmitted diseases. Public health officials continue to struggle with 
questions associated with privacy and the rights of the individual versus the good of 
society to this day.

The landmark case relating to the right of an individual to privacy was Griswold v.
Connecticut, 381 U.S. 479 (1965), which resulted from the arrest of the director of the 
Planned Parenthood League of Connecticut (Griswold) on the grounds that she had provided 
information, instruction, and medical advice about contraception to married people. In 
Connecticut at the time, the law stated that the use of contraceptives was punishable by 
law. Subsequently, the U.S. Supreme Court declared the Connecticut law to be 
unconstitutional and reversed the criminal convictions in the case. In the majority 
opinion written for the Court by Justice William Douglas, there are references to the
so-called "penumbras" or auras of privacy that radiate out from the specific rights to
privacy stated in the Bill of Rights. He observed that "various guarantees create zones
of privacy" (8). He went on to say that the Connecticut law exceeded its bounds by
seeking to regulate the use of contraceptive devices rather than their manufacture and/or
sale. The only means he could postulate for enforcing the law as written involved the
invasion of the clearly defined zone of privacy represented by marriage. Lest anyone
misunderstand his meaning, he observed: "Would we allow the police to search the sacred
precincts of marital bedrooms for telltale signs of the use of contraceptives? The very
idea is repulsive to the notions of privacy surrounding the marriage relationship" (8).

Later courts would refer to this constitutionally recognized right of the individual to 
privacy in certain contexts as a "fundamental interest." In the precedent-setting
abortion case of Roe v. Wade, 410 U.S. 113 (1973), a single woman challenged the
constitutionality of a Texas law forbidding abortion (except when the pregnant woman's
life was in jeopardy). She claimed that this law denied her constitutional right to
privacy and cited the earlier opinions of the Supreme Court relating to birth control.
Justice Blackmun observed that "the state does have an important and legitimate interest
in preserving and protecting the health of the pregnant woman . . . [and] it has still another
important and legitimate interest in protecting the potentiality of human life. These 
interests are separate and distinct. Each grows in substantiality as the woman approaches 
term and, at a point during pregnancy, each becomes 'compelling'" (9). 

The link between the right to privacy and surveillance is also related to The Freedom of 
Information Act (amended 1986). In essence, the latter act spells out the situations and
conditions pertaining to the right of the U.S. taxpayer to obtain information s/he has 
paid for from agencies within the Federal Government. Clearly, there is the potential 
for conflicting interests in such situations, if information about taxpayer A is released 
to taxpayer B. The act takes this point into consideration in its statement that "to the 
extent required to prevent a clearly unwarranted invasion of personal privacy, an agency 
may delete identifying details when it makes available or publishes an opinion, statement 
of policy, interpretation, or staff manual or instruction" (10).

An essential aspect in designing a surveillance program is the assurance to the persons 
(agencies) who report and those being reported upon that the privacy rights of the persons 
whose health information is of interest will not be violated. The conflict created by 
the "right to privacy" and the "need to know" represents an area that must be monitored 
by the managers of a surveillance program as diligently as they monitor the health 
conditions to be reported. To illustrate: One of the most important court decisions the 
Centers for Disease Control (CDC) has obtained in recent years related to litigation 
arising out of the epidemic of toxic-shock syndrome of the late 1970s and early 1980s. 
The attorneys representing the manufacturer of the tampon that had been strongly 
statistically associated with the occurrence of toxic-shock syndrome wanted to obtain not
only data about women who had had toxic-shock syndrome and from whom CDC had collected
information but the names of the women as well. The agency argued (through district court 
and up to the Federal Court of Appeals) that participation in federal surveillance is 
voluntary and that participants in such programs have a reasonable expectation that their 
confidentiality will be protected by the Federal Government. The Appeals Court ruled in 
CDC's favor, but this position will continue to be challenged on a "need to know" basis, 
and persons who are designing and operating surveillance systems should always keep in 
mind the specter of the forced divulgence of information they have assured participants 
would be confidential. This is particularly likely in situations involving litigation, 
because of the courts' strong bias to make available the same information to legal
representatives for both plaintiffs and defendants.

The final observation in this section is that the manager of a surveillance program, at 
least within a federal agency, is always in danger of being accused by the popular media 
or the legal community of hiding something deliberately--not to protect the privacy of 
individuals, but for sinister reasons that are usually hinted at but not stated. This 
sort of accusation may have no basis in fact, but must be taken seriously and generally 
requires, at a minimum, an undesirable outlay of energy and worry on the part of the 
surveillance program manager. 

Right of Access 

If the taxpayers support the gathering of information, they have a right to that 
information (12). This statement forms one basis for the "right to access" position. 
Both the Privacy Act and the Freedom of Information Act reflect the post-Watergate era, 
with its focused concern on the potential for the government to keep secret files 
containing information on individuals. Beyond that is the "reasonable man" position, 
which maintains that a person has a right to any information that is about him/her. 
Unfortunately, giving information to an individual about himself/herself can sometimes
have the effect of providing information that assigns liability to another person (or 
organization) in the data set. So even the process of providing personal information to 
the person in question is not without its hazards. 

In addition to the individuals who wish to obtain information about themselves, there are 
the so-called "third-party" inquirers. These individuals call for information on a
need-to-know basis and may range from members of the U.S. Congress through attorneys and
special-interest groups (e.g., "right to life" or "pro-choice" groups) to representatives 
of the news media. 

A major point for the surveillance program manager to ponder is when to make a public-use 
data set. Although there is no legal precedent to be followed here, once the first paper 
has been published about a data set, it is prudent to place that data set in the public 
domain if there is a reasonable expectation of its further use. Although this creates 
the risk of extra work and having others preempt publication, it obviates accusations 
about willful withholding of information or the danger that forced release of data before
they are properly prepared for public use will allow some subjects to be identified.

Product Liability 

This heading could be "Research Institution Discovers Corporate America--and Vice Versa."
The issue has been around for many years but seemed to rise to prominence in the United 
States with the emergence of toxic-shock syndrome in the late 1970s and early 1980s. It 
is not unusual for investigations to show that a product is contaminated, that someone 
used a machine incorrectly, or even that someone deliberately tampered with a medication 
or device and caused illness or death. What was not familiar was that a "good" product, 
one that meets all its quality-control specifications and does what it is advertised to 
do, can also have effects that are less than desirable. Thus, no one was ready to deal 
with the situation in which an efficiently designed tampon apparently led to a
life-threatening illness. The scientists had to accept the findings because scientists deal
in fact (probability), and the media had grist for their mills, but the manufacturer of
the tampon (and its employees and stockholders and legal representatives) did not have
an easy time coping with "the facts." In fact, they underwent a classic grief reaction--
which the staff at CDC and other health science agencies have since learned to anticipate
and to recognize--involving the stages of denial, anger, depression, acceptance, and
resolution. Human nature was applied with a vengeance, and the first three stages were 
immediate, intense, and enduring. The last two stages took some time and extensive effort 
to induce. 

Ideally, one should assure that surveillance programs are flawless and that all the 
information reported is unassailable. In the world of public health practice, such 
utopian standards can rarely be met. And public health practitioners must continue to
be prepared to deal with issues on a mixture of levels--including public health, legal,
ethical, socio-cultural, and emotional components. 

Litigation Demands 

Under litigation demands, the issue is to what extent an agency is responsible for 
providing its staff to testify in litigation relating to findings it obtained through 
surveillance or research. Of course, there is no simple answer, just as there have not 
been any simple answers to the other questions posed in this chapter. Clearly, it is not 
responsible to refuse to provide expert testimony in any instance in which it is
solicited. In some cases, agency scientists may be the only ones who have worked in the
area in question and have facts to cite. By the same token, in situations in which there 
are massive numbers of suits being conducted over a period of several years (as with 
toxic-shock syndrome or transfusion-associated HIV infection), all of the scientific 
resources of an agency could be expended on time in court and, therefore, none of them 
on the science that is their primary business. Somewhere, there is a correct answer for 
each agency and each health issue, and this problem may need to be faced when planning 
surveillance activities. 


For those who set up and run surveillance programs, it is important to note the following 
summary comments. Public health surveillance systems operate in the massive goldfish bowl 
that encompasses both public health practice and the law. 

• Plan and design surveillance systems so that they are most likely to provide 
all the information and only the information actually needed. 

• Include as few personal identifiers as feasible. 

• Analyze and publish data in a responsible and timely fashion. 

• Be prepared to stand behind the results (and hope your agency will stand 
behind you).

• Be prepared to place each data set in the public domain as soon as the first 
results are published. 

• If the findings are revolutionary, be prepared for a hostile reaction 
rather than a medal.

• Finally, remember that the individual has rights (to privacy, to access 
information, to participate or not to participate in surveillance programs, 
and the like). The public health practitioner, at least in the role of
public health practitioner, has no rights--only responsibilities. 

Public health surveillance constitutes one of the bridges between what we think is happening and
what is actually happening. As such, it is one of the most valuable tools of the public 
health practitioner. With surveillance data as the light bulb and the law as a rheostat 
that stimulates change and regulates behavior, the two areas can work in concert to 
improve the quality of the public's health. 


1. Wing KR. The law and the public's health. Ann Arbor, Michigan: Health 
Administration Press, 1990:1-50. 

2. Friedman LM. A history of American law. New York, New York: Norton Press, 1986:1- 

3. Thacker SB, Berkelman RL. Public health surveillance in the United States. 
Epidemiologic Reviews 1988;10:165.

4. Osler W. Internal medicine as a vocation. In: Aequanimitas: with other addresses
to medical students, nurses and practitioners of medicine. 3rd edition. 
Philadelphia, Pennsylvania: W.B. Saunders, 1932:131-46. 

5. Hogue LL. Public health and the law: issues and trends. Rockville, Maryland: 
Aspen Systems Corporation, 1980:10. 

6. Brandt AM. No magic bullet: a social history of venereal disease in the United 
States since 1880, 1985:157-8. 

7. Parran T. The next great plague to go. Survey Graphic 1936:405-11. 

8. 381 U.S. at 484-86. 

9. 410 U.S. at 162-64. 

10. The Freedom of Information Act (As Amended). Washington, D.C.: Government Printing
Office, 1986. 

11. Abraham HJ. Freedom and the court: civil rights and liberties in the United 
States. New York, New York: Oxford University Press, 1988:23.


Chapter XI 

Computerizing Public Health 
Surveillance Systems 

Andrew G. Dean
Robert F. Fagan
Barbara Panter-Connah

"We only conquer what we wholly assimilate."

André Gide

In this chapter on informatics or computerization of surveillance systems, we will
first explore what is technically possible in computerization of surveillance, finding
an enormous gap between this and the best of today's actual systems. The barriers to
optimal use of computers in surveillance--mostly social, organizational, and legal--
are explored. The remainder of the chapter explores some of the problems that must be
confronted in thinking about microcomputer-based surveillance, leaning heavily on
examples from the notifiable disease system in the United States.

An Ideal Surveillance System 

Ideally the epidemiologist of the future will have a computer and communications 
system capable of providing management information on all these phases and also 
capable of being connected to individual households and medical facilities to obtain 
additional information. 

Suppose that the epidemiologist of the future has a computer with automatic input from 
all inpatient and outpatient medical facilities, with standard records for each office 
or clinic visit and each hospital admission. S/he chooses to compare today or this 
week with a desired period, perhaps the past 5 years, and the computer displays or 
prints a series of maps for all conditions with unusual patterns. One of the maps 
seems interesting, and the epidemiologist may point to a particular area and request 
more information. A more detailed map of the area appears, showing the data sources 
that might provide the desired information, with estimates of the cost of obtaining 
the items desired. A few clicks of the mouse button select the sources, types of 
data, and format for a display, and the computer spends a few minutes interacting with 
computers in the medical facilities involved--extracting information and paying the
necessary charges from the epidemiology division's budget. Soon the more detailed 
information is displayed on the epidemiologist's computer screen. 
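
The comparison of the current week with a historical baseline described above can be sketched in a few lines. This is a minimal illustration only, assuming a simple rule that flags a condition when the current count exceeds the baseline mean by more than two standard deviations. The function name, the threshold, and all counts are hypothetical and are not drawn from any system described in this chapter.

```python
from statistics import mean, stdev


def flag_unusual(current, baseline, z_threshold=2.0):
    """Return True if `current` exceeds the baseline mean by more than
    `z_threshold` sample standard deviations (an assumed, simplified rule)."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline periods")
    mu = mean(baseline)
    sd = stdev(baseline)
    if sd == 0:
        # No historical variation: any increase over the mean is flagged.
        return current > mu
    return (current - mu) / sd > z_threshold


# Hypothetical counts for the same week in each of the past 5 years.
baseline_counts = {
    "asthma": [12, 15, 11, 14, 13],
    "conjunctivitis": [30, 28, 33, 31, 29],
}
# Hypothetical counts for the current week.
current_counts = {"asthma": 27, "conjunctivitis": 31}

unusual = [
    condition
    for condition, count in current_counts.items()
    if flag_unusual(count, baseline_counts[condition])
]
print(unusual)
```

In practice the rule would be applied separately to each geographic area so that the flagged conditions could be mapped. A working system would also need to account for seasonality and reporting delays, which this sketch ignores.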

The pattern of hospitalizations and outpatient visits for asthma stands out, and the 
epidemiologist requests a random sample of specified size of persons who have ever had 
asthma in the same area, matched by age and gender, to serve as controls for a
case-control study. The video-cable addresses of these "controls" and of the case-patients
are quickly produced through queries to appropriate local medical-information sources. 
The epidemiologist formulates several questions about recent experiences, types of air 
conditioning, visits to various public facilities, and the like, adapts these to a
previously tested video questionnaire format, and requests that video interviews be
performed for case-patients and controls. Each household is contacted or left a
FAX-like request to tune to a particular channel and answer a 5-minute query from the
state health department on a matter of importance to public health. Eighty-five 
percent of the subjects respond to the first query, and the computer automatically 
follows up with the rest, bringing the response to 92%, with half of the remainder 
reported to be absent from their homes for at least 2 days. 

The odds ratio for persons with recent hospitalizations for asthma who work in or 
visit in a particular neighborhood is considerably higher than 1.0, and the 
epidemiologist connects by local-area network to the state occupational surveillance 
system and requests a display of all factories in the relevant area. Selecting those 
that deal with possibly allergenic materials, s/he issues a request for more detailed 
investigation of activities at the plants in a selected time interval. The
epidemiologist also requests information from the weather bureau on wind direction and 
velocity, temperature, and rainfall. 
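
The odds ratio mentioned above is computed from a 2x2 table of exposure (working in or visiting the neighborhood) by case status. The sketch below is illustrative only: the counts are hypothetical, the function name is an assumption, and the confidence interval uses the standard Woolf (logit) approximation for an unmatched analysis. Because the scenario describes controls matched by age and gender, a real analysis would more properly use a matched or stratified method.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% confidence interval (Woolf method).

    2x2 table layout:
                  exposed   unexposed
        cases        a          b
        controls     c          d
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    estimate = (a * d) / (b * c)
    # Standard error of the natural log of the odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: 40 of 60 case-patients and 20 of 60 controls
# reported working in or visiting the neighborhood.
estimate, lower, upper = odds_ratio(40, 20, 20, 40)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

With these hypothetical counts the odds ratio is (40 x 40) / (20 x 20) = 4.0. An interval whose lower limit exceeds 1.0 is what would justify the follow-up request to the occupational surveillance system.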

Within a few hours, a plant is identified that is in the process of moving a large 
pile of by-products with a bulldozer. A request is issued that the by-product be 
sprayed with water to prevent its particles from becoming airborne, and the plant 
manager readily agrees when shown the maps that depict hospitalization rates for 
asthma downwind from the plant. To monitor progress and widen the investigation, the 
epidemiologist asks the computer to do similar studies for conjunctivitis and for 
coryza or hay fever over the previous and next 2 weeks. Selecting several maps and 
tables to include in the report, s/he asks the computer to write a description of the 
studies performed and the findings, and then dictates a brief summary of the problem 
and several follow-up notes to the voice port of the computer. At the end of 2 weeks, 
the number of cases of asthma has fallen to normal for the area, and the computer 
calculates on the basis of the number of medical visits during the outbreak that 
$55,000 has been saved at a total cost of a few hours of the epidemiologist's effort, 
a site visit to the plant, and charges of $9,500 for the data and the communication 
facilities used to perform the interviews. 

Barriers to the Ideal Surveillance System 


Obviously, we are a long way from implementing the system described above. It may be 
helpful in thinking about the future to explore what barriers must be surmounted 
before this scenario can be enacted. Strangely enough, few of them are technical; all 
of the necessary systems could be built today with fairly conventional equipment and 
software, with the exception of the two-way interactive video connection with each 
household. This hook-up with the individual household is more likely to be available 
within the next 10 years than is the connection between the physician's record files 
and the health department. In fact, the two-way interactive video link between the 
household and the outside world is simply awaiting the government's or the 
marketplace's decision on what format will be used and on the realization of the 
benefits of such a connection on the part of the entrepreneurs and the public. 

However, there are some difficult problems to be solved before the "ideal system" can
be implemented. They include the following: 

a) The rapid availability of standardized, computerized medical
records. Several issues need to be addressed before such a system is
possible. In the United States, for example, a profusion of computerized
medical-record systems for inpatient and outpatient records as well as
insurance and other purposes has been developed. These systems contain a
plethora of different variables and use many different formats. Until a
simple core public health record of age, gender, geographic location,
diagnosis, and a few other items is created for each outpatient visit and
each hospitalization--and is available in a standard format without
delay--the responsive interactive system above remains an unrealistic
pipe dream. An additional problem is that most medical records are still
no more than partially computerized.

The barriers to establishing standardized public health output from 
computerized medical records are primarily political and administrative; 
most large retail organizations create records of similar size for each 
item sold, and the items carry, on average, a much lower price than the
cost of a visit for medical care. Once there is the will to establish a
national computerized medical record system, the technical hurdles will
be readily overcome. The needs include standard but suitably flexible
record formats, solutions to problems associated with confidentiality,
incentives to create the records (including the assurance of appropriate
and cost-effective use of the records), and voice output.

b) Another problem is the lack of recognition that information about 
patients, except for legally designated "reportable diseases," is useful
in public health and should be available to public health agencies. The 
level of awareness could be heightened if technical solutions to problems 
of confidentiality were publicized and understood by the public and their 
legislative representatives. Such solutions as one-way encoding
algorithms could provide partial solutions to matching and follow-up
problems if properly used, without turning public health agencies into
carbon copies of the dreaded "big brother."

c) A pervasive feeling among those in charge of data that their data base 
must be "clean" before anyone else can use it. Months or even years are 
consumed while corrections and updates are made to make the data as 
accurate as possible. Although from one perspective this quality control 
is necessary and important, the concept of "surveillance" includes rapid 
turnaround, a realization on the part of everyone concerned (even the 
media and the public) that the data are preliminary, and the 
understanding that in order to look at today's data today, one must be 
willing to accept today's imperfections. This mental shift, as well as 
corresponding technical developments, will be necessary before a 
computerized system can be used to examine automatically a "time slice" 
of disease and injury records that originate in clinics and hospitals. 
Imperfections will be everywhere, and methods must be found to cope with 
reality--even if it includes warts--on an immediate basis. 
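
The one-way encoding mentioned in item b) can be sketched as follows. The text does not name an algorithm; a keyed hash (HMAC-SHA256) is assumed here purely for illustration, and the function name, the key, and the sample identifiers are all hypothetical. Two agencies holding the same key produce the same code for the same person, so records can be matched, but the name cannot be read back from the code.

```python
import hashlib
import hmac

def encode_identifier(name, birth_date, secret_key):
    """Return a one-way code for a person; the name cannot be read back from it."""
    # Normalize so that trivial differences in typing do not prevent a match.
    normalized = f"{name.strip().upper()}|{birth_date}"
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

key = b"hypothetical-shared-secret"  # assumption: held only by authorized agencies
clinic_code = encode_identifier("Doe, Jane ", "1950-03-07", key)
hospital_code = encode_identifier("doe, jane", "1950-03-07", key)
other_code = encode_identifier("Roe, Jane", "1950-03-07", key)

print(clinic_code == hospital_code)  # same person, same code
print(clinic_code == other_code)     # different person, different code
```

Such a scheme is only a partial solution, as the text notes: misspelled names will not match, and anyone holding the key can test guesses against the codes.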

The Technology of the Future 

As stated above, today's technology, given enough social and organizational 
development, is adequate to allow the creation of miracles in public health 
information and communication. Nevertheless, it seems likely that development in
technology will continue to be more of a driving force in public health computing
than progress in political and social organization.

Technologic developments over the next decade will probably include the areas shown
below.

High capacity storage devices 

CD ROMs (compact disk read only memory) similar to those used for music make it
possible to have access to large bibliographic data bases anywhere there is
electricity. The MEDLARS data base of the U.S. National Library of Medicine can be
searched from a clinic in Africa; once there are lower prices for books on CD ROM and
they include needed illustrations, it will be possible to take a medical library
anywhere in a briefcase. Past data bases from the United States and elsewhere will 
become available on CD ROM, although the process of cleaning them up for this purpose 
often reveals gaps and inconsistencies that reflect changing definitions and diminish 
their value as consistent anchors for comparison. 


A local area network (LAN) is a system linking microcomputers, terminals, and workstations
with each other and/or a mainframe computer to facilitate sharing of equipment (e.g.,
printers), programs, data, or other information. LANs are transforming the way many
agencies do business. The most noticeable effect is the transmission of written
memoranda that could not or would not have been typed, packaged, and sent through a paper
system. The cost of installing and supporting a LAN is not small, particularly in 
terms of support personnel. Uses for surveillance include entering data at multiple 
computers connected by a LAN. This requires special software to protect against 
errors. Special precautions to protect confidentiality are necessary in a network, if 
several people enter data in the same file at the same time. 

New user interfaces 

The parts of programs that interact with users have become easier to understand, and 
more attractive, with pull-down menus, windows, and pointing devices such as the 
"mouse." This elegance has its cost in terms of requirements for faster computers,
for more memory, and particularly for greater skill to produce such programs. Some
new programs cause unexpected problems when run with older programs or on older
computers. All in all, the trend is toward a standard set of screen "controls," like
those in modern cars, but the path in that direction is replete with experiment and 
minor failures. 

New programming tools 

It is widely recognized that software production is the narrow point in the 
implementation of new ideas in computing. Useful software still requires hundreds of 
thousands of lines of hand-written and highly personal "coding." Many new trends such 
as "fourth-generation data bases," computer-assisted software design (CASE) tools, and 
"object-oriented design" have made programming more productive, but this area of new
tools is one in which major advances would create revolutionary changes. 

Higher-capacity processors and more memory 

The almost miraculous advances in computer speed and memory capacity in the last 
decade have removed many of the limits that required use of mainframe computers or 
minicomputers rather than microcomputers. Now almost any project can be done on a 
microcomputer or several microcomputers connected by a LAN if there is sufficient 

Video and computer integration

Photographs and fully functional video will soon be appearing on our computer screens. 
Although this may have the greatest impact in pathology, radiology, and education, it
also opens opportunities to use color and three-dimensional dynamic displays for
epidemiologic data. The possibilities for computer interaction via ordinary 
television sets are exciting, because every epidemiologist (and market researcher) can 
savor the possibility of interviewing citizens via cable television with the results 
captured immediately in computerized form. The medium offers new challenges in 
identifying responses that result from the various stages of humor, exasperation, or 
intoxication that citizens may undergo in the privacy of their homes. 

Voice and pen input 

Systems are available now that identify thousands of spoken words (for tens of
thousands of dollars) and allow for a crude interaction between voice and computer.
Computers that recognize handwritten text of reasonably structured type are being sold
currently. Presumably the rather elementary state of computerization of medical
records will undergo a quantum leap once such systems allow medical staff to dictate 
to the computer without typing and preferably without being near a computer. When 
medical handwriting is replaced by voice dictation into a lapel microphone, real 
progress may occur in the use of computers in both clinical medicine and public health 
settings. As stated above, however, realizing real public-health benefit from such 
technology will require dramatic social and legal changes. 


Since 1985, Centers for Disease Control (CDC) staff have installed and maintained 
customized disease-surveillance software in 36 state health departments and a number 
of county, district, and territorial departments. The software has been based on Epi
Info, a public-domain word-processing, database, and statistics package for IBM-compatible
microcomputers that is a joint product of CDC and the Global Programme on
AIDS, World Health Organization (1,2). These systems have made possible the
participation of all 50 states in the National Electronic Telecommunications 
Surveillance System (3,4). Benefits cited in a recent evaluation include improved 
access to data and improvement in both quality of data and access associated with 
decentralized entry of data (5).

Although reportable-disease systems are a specific kind of surveillance system and Epi 
Info is only one type of data-base/statistics program around which a system can be 
built, many of the principles of computerization apply to other systems. To avoid 
empty generalization, much of the rest of this chapter is based on CDC's experience 
with reportable-disease surveillance using Epi Info. The information is directed to 
those considering computerization of a disease-surveillance or similar system of 
records, whether they wish to do their own system design or will be working with a 
professional computer-systems designer. Computerizing a surveillance system for 
disease is not easy. Since the success of computerization depends as much on the 
administrative and epidemiologic environment as on the software, it is vital that 
public health practitioners understand the details of a new system and participate in 
its design. The most important step in developing a computerized surveillance system
is identifying the public health objective for the system. In some cases, the
objective(s) will have been clear for decades in a manual system ("Identify and treat
or isolate cases of X and evaluate results," or "Assess results of immunization
programs and identify new cases for special control efforts"). Computerization can
then be directed toward accomplishing the same task more efficiently or in greater
volume or detail.

The most successful computer systems, however, are those that change methods by which 
an agency operates rather than those that merely automate a manual task (6). In
establishing a new surveillance system or reexamining an existing system, it may be 
useful to address the following question: "What key pieces of information do I want 
to see on my desk (or computer screen) every day, week, month, or year that will make 
my work easier or more effective?" The same question can be asked at several levels
of management--from epidemiologic technician to epidemiologist to director of a public
health agency.

Given a surveillance system that has a public health goal and to some extent achieves 
the goal, why computerize? Sometimes the answer is obvious--"because the annual report
takes a herd of clerks 2 years to process," or "we like the graphs health department A
turns out so easily with their computer." Potential benefits relate to quality of 
data or of reports, quantity of data that can be processed, and speed of processing. 
Dissemination (copying) of surveillance records to another site is one reason disease 
reports in all 50 U.S. states are computerized. 

We were unable to find systematic studies on the benefits of computerizing public 
health surveillance systems, although numerous articles describe individual systems 
that have been computerized (7-10), and Gaynes et al. (21) describe methods for
evaluating a computerized surveillance system. In literature about the commercial 
world, benefits of computerization have been examined from the viewpoint of financial 
savings. Savings by automating a manual information process may amount to 20% or so, 
but the real benefits are achieved if computerization transforms the entire process 
concerned, giving a competitive advantage in the commercial world--which would
correspond to a new order of service in the public health world (6). So far, most
public health applications have automated manual systems, although some--such as the
spreadsheet calculation of the impact of smoking on populations--verge on establishing
new and previously unknown styles of doing business (12).

One problem, also cited in other "vertical markets" (industries with specialized
practitioners) such as the construction, meat-packing, and real estate industries, is
the small size of the market. With only 7,000 epidemiologists in the United States,
relatively few commercial developers feel that it is financially worthwhile to develop
software for this market alone, since applications such as spreadsheets, languages, and
word processors may sell millions of copies to the general public (13).

Basic Needs 

The first requisite for computerization is a paper system or operational design that 
works reasonably well or would do so if the process were speedier and more accurate. 
Chaos computerized is not necessarily an improvement over what is already in place, 
although the process of computerization offers a chance to rethink some of the 
features of a system and to make improvements. If the surveillance system is a new 
one, it may be desirable to evolve the computer facilities in small stages with 
minimal investment until the system proves to be useful and well-conceived. This 
requires a careful plan (including provision for changing the plan if necessary) but 
will minimize the expense of adaptation as the epidemiologic design of the system 
undergoes the inevitable adaptation to external reality. After the "bare bones"
system has proven its worth and the probability of expensive changes is lower, the
"bells and whistles" can be added.

Personnel to do the collection of data, data entry, analysis, and system maintenance 
are important contributors to the system. Many of the tasks can be learned by current 
employees, particularly if they find this challenge welcome. If possible, those 
chosen should be long-term employees to assure stability of the system, although they 
may be aided by students and other temporary employees. The epidemiologist who will 
use the results should participate in the planning of the system and should understand 
how it is constructed. A staff member with some programming skills and/or aptitude 
for microcomputing should be involved in designing and setting up the system, even if 
an outside consultant does the actual programming. 

If several computers are to interact and share data, a set of standards is necessary
(just as humans carrying on a conversation need a common language). In the
United States, the states and CDC chose a standard record format so that computers of 
different types could reformat data to a set of standard records and send these to the 
central agency. This standard, first devised in 1984 and revised in 1991, has served 
the purpose well, without placing unnecessary restrictions on the type of hardware or 
the format of records kept within each state. One state maintains 20 times more 
information for local use than do other states, but all export the same standard 
record formats to the national level. The new standard record format allows for 
standard demographic and diagnostic information, attachment of variable-length
detailed reports for selected diseases, mixture of summary with individual records, 
and automatic comparison of state and national data bases with each transmission. 

Most government settings have an organization in charge of computer programming, 
approval of new systems, and purchasing of computers and software. It is important to 
maintain liaison with this organization and to arrange its assistance ahead of time 
with difficult areas such as purchasing computers. In some organizations, purchases 
are limited to particular types of computers--occasionally with unique
characteristics--or to centrally administered systems. We recently encountered a 
network of "diskless" workstations that presented numerous problems in trying to load 
or run software or back-up files from a particular station without a removable storage 
device. If such problems are present, it is prudent to discover and, if possible, to 
surmount them at an early stage through patient negotiation and collaboration or other 
methods if necessary. The technical difficulties that arise in setting up a computer 
system are usually the easy problems; the difficulties that lead to months and years 
of delay and unhappiness usually reflect misunderstanding and miscommunication among 
individuals or organizational entities. 

Some Key Concepts: Files, Records, and Fields

Computerized records are stored in files. A file is a collection of records, usually 
one record per case, that has a name (e.g., GEPI.REC, for General EPIdemiology) and 
can be manipulated as a unit. Files, like books, can be opened, closed, read, written 
to, or discarded. They are stored on nonvolatile media such as hard or floppy disks 
or magnetic tape. 


Records correspond to one copy of a completed questionnaire or form, such as a
disease-report card. Usually, one disease report or questionnaire is stored in a file
as a single record. Records can be displayed on the screen, searched for by name or
some other characteristic, saved (written) to a disk, or marked as deleted. Many
records can be stored in each file.

A field is one item of information within a record. NAME, AGE, and DATEONSET might be 
fields within a disease-report record. Records in a particular file all have the same 
fields. Each field has a name, a type (text, upper-case text, numeric, date, etc.), 
and a length, such as 22 characters for NAME or 3 for AGE. During analysis, fields 
may be called variables, and commands such as "TABLES DISEASE COUNTY" are used to 
instruct the system to process a particular file and construct the desired table by 
tabulating the fields or variables called DISEASE and COUNTY. In this case, the 
result in Epi Info would be a table that lists DISEASE down the left side and COUNTY 
across the top, with numbers of reports by county indicated in the cells of the table. 
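
For readers who think more easily in a general-purpose language, the file, record, and field concepts and the TABLES command can be imitated in a few lines. This is only an analogy, not how Epi Info works internally; the function names and the sample records are invented for illustration.

```python
from collections import Counter

# A "file" is a list of records; each record is a dict of named fields.
# The records below are invented for illustration.
gepi = [
    {"DISEASE": "HEPATITIS A", "COUNTY": "ADAMS", "AGE": 34},
    {"DISEASE": "HEPATITIS A", "COUNTY": "BAKER", "AGE": 7},
    {"DISEASE": "MEASLES", "COUNTY": "ADAMS", "AGE": 5},
    {"DISEASE": "HEPATITIS A", "COUNTY": "ADAMS", "AGE": 61},
]

def tables(records, row_field, column_field):
    """Count records for every combination of two fields."""
    return Counter((r[row_field], r[column_field]) for r in records)

def print_table(counts):
    """Print the row variable down the side and the column variable across the top."""
    rows = sorted({row for row, _ in counts})
    columns = sorted({col for _, col in counts})
    print(f"{'':<14}" + "".join(f"{c:>8}" for c in columns))
    for row in rows:
        print(f"{row:<14}" + "".join(f"{counts[(row, c)]:>8}" for c in columns))

counts = tables(gepi, "DISEASE", "COUNTY")
print_table(counts)
```

Combinations that never occur (here, measles in the second county) appear as zero cells, since a Counter returns 0 for keys it has not seen.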

Hardware: What Size Computer is Appropriate? 

With microcomputers being available for much less than $5000, it is possible to 
process more than 100,000 records in reasonable time periods. Processing time tends 
to reflect the record length as well as the number of records, however, and the size 
of each record should be kept short if large numbers will be processed. Since the 
total number of disease reports for the United States is several hundred thousand per 
year, states and counties should find it possible to build most systems on a 
microcomputer if desired. 

Minicomputers and mainframes can serve as the basis for surveillance systems if 
available at reasonable cost and if programming and support staff are available to 
work creatively with staff of the surveillance system. The greater technical skill 
required to run and program such computers often resides in an organization other than 
the one running the surveillance system, and close coordination becomes much more 
important than in the do-it-yourself situation with a microcomputer. 

Systems that seem to require processing of millions of records, such as hospital
discharge or Medicare records for a state, can be reduced by sampling to a manageable
size for the microcomputer. The mainframe can be used to select a sample of records
(e.g., particular age groups, diseases, every tenth record, or persons born in decade 
years). Files are then exported for processing on a microcomputer that is more 
responsive to the epidemiologist's wishes. Epidemiologists are usually acutely 
conscious of sample size when performing interviews but sometimes fail to recognize 
how unnecessary it is to process 6 million records to estimate a simple proportion. 
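
The point about sample size can be illustrated with a small sketch. The data are simulated, the function names are our own, and the interval uses the ordinary normal approximation; none of this is taken from a particular surveillance system.

```python
import math
import random

def systematic_sample(records, step=10):
    """Keep every step-th record, as a mainframe extract might."""
    return records[::step]

def proportion_with_ci(flags):
    """Estimate a proportion and an approximate 95% CI (normal approximation)."""
    n = len(flags)
    p = sum(flags) / n
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# Simulated file: 100,000 discharge records, about 1 in 8 flagged as asthma.
rng = random.Random(0)
records = [rng.random() < 0.125 for _ in range(100_000)]

sample = systematic_sample(records, step=10)
p, low, high = proportion_with_ci(sample)
print(len(sample), round(p, 3), round(low, 3), round(high, 3))
```

With 10,000 sampled records the interval is already well under one percentage point on either side, which is ample for most surveillance questions. Taking every tenth record assumes that the order of the file is unrelated to the characteristic being estimated.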


The type of software used to perform the computerization is often less crucial than 
the skills of those who will program and run it. Usually, there are several types of 
data-base or statistical packages that will do a given task well if properly 
programmed. Beware of the "indispensable programmer" syndrome, in which a single
expert programmer writes a system in his or her favorite language and then departs for 
greener pastures, leaving the users without resources for further maintenance. 

Data-base packages such as dBase, Paradox, Foxbase, and Clipper are designed to allow
data input, storage, retrieval, and editing. Most will count records but do not
easily do such statistics as odds ratios. They require a skilled programmer to
produce a customized system.

Statistics packages, such as Statistical Analysis System (SAS) and Statistical Package 
for the Social Sciences (SPSS), focus on producing statistical reports, usually from 
single files of data. They are less convenient for data entry. Both SAS and SPSS now 
have mainframe and microcomputer versions. They contain many routines rarely used by 
epidemiologists and occupy large amounts of disk space (tens of megabytes for SAS).

Epi Info provides a combination of data-base and statistical functions, allowing 
relational linking of several files during data entry or analysis. Questionnaires or 
forms may be up to 500 lines, with hundreds of numeric or text fields, and the number 
of records is limited only by disk storage space. Frequencies, cross tabulations, 
customized reports, and graphs can be produced through commands contained in a program 
file or interactively from the keyboard. Commonly used epidemiologic statistics are 
part of the statistical output. Although it takes little experience to use Epi Info
for investigating outbreaks, producing a complete surveillance system from the
beginning takes both skill and time. It may, however, be much simpler to modify
software supplied with the program.

It is important to realize the limitations of software packages before they are used. 
Both statistical and data-base packages typically cost at least several hundred 
dollars and therefore are not likely to be feasible for classes of students or large 
numbers of remote computers. 

Some data-base packages limit the number of fields in a record or the number of 
records in a file, and few will do statistics without advanced programming or purchase 
of a supplementary package. Statistics packages, on the other hand, may have 
limitations in handling textual ("alpha") data, and most allow processing of only one 
file at a time. A complete surveillance system may require the functions of both 
data-base and statistical programs. 

The current version of Epi Info, however, limits the number of records that can be
sorted or linked at one time (tens of thousands), and since text fields are
limited to 80 characters, Epi Info would not be a good choice if large amounts of text
are to be stored, as in a complete clinical system containing dictated notes.

Designing Entry Forms 

In a surveillance system, data items are usually entered in a standard format (e.g., a
questionnaire or report form). The information is stored in files containing one
record per individual. In Epi Info, the format of the data-base file is specified by
typing a questionnaire or form in the word processor. The result resembles a paper
form, with entry blanks indicated by special symbols (e.g., underlined characters for
text fields and number signs for numeric fields). The computer reads the form and
constructs a file in the proper format. 

In designing a form, it is useful to include a unique case identifier as a number or
combination of letters and digits. This may include meaningful information, such as
the year, but should not include any item that may need to be changed, such as a 
disease code. It must be designed so that a new and unique number will always be 
available for each record. 
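
One way to meet these requirements is a two-digit year followed by a running number. The sketch below is an assumption about how such identifiers might be issued, not a description of any existing system; the class name and the identifier layout are hypothetical.

```python
import itertools

class CaseIdGenerator:
    """Issue identifiers such as 92-000001: report year plus a running number.

    The identifier deliberately omits items that may change later,
    such as the disease code.
    """

    def __init__(self, year, last_issued=0):
        self.prefix = f"{year % 100:02d}"
        # Resume after the last number already used, so no identifier repeats.
        self._counter = itertools.count(last_issued + 1)

    def next_id(self):
        return f"{self.prefix}-{next(self._counter):06d}"

ids = CaseIdGenerator(1992)
first, second = ids.next_id(), ids.next_id()
print(first, second)
```

In a real system the last number issued would have to be stored with the file so that the sequence survives a restart; that detail is left out here.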


The amount of data entry and computer storage required may be minimized by 
computerizing only information that will actually be used. If follow-up information 
such as name, address, and telephone number can be used from the paper form, there may 
be no need to enter it into the computer. If contact tracing is recorded, the 
computer record may summarize the number of contacts named and the number found or 
treated, with the details on each and progress of the follow-up efforts relegated to 
the paper forms used by field investigators. When including an item on the input 
form, it is helpful to ask, "how will this be analyzed?" and "how would the result
look after processing?" Computers around the world are full of data items that
someone entered "just in case we need it." Most are never needed. 

Textual material can be printed from a computer file, but it is usually difficult or 
impossible to process such entries as "Pen, Strep, and Ampicillin," to produce 
meaningful tabulations. For serious analysis a more usable format would be 

    Penicillin    <Y>
    Streptomycin  <Y>
    Ampicillin    <Y>

in which "<Y>" represents a blank for a "Y" or "N" response.

A common problem in designing entry forms is that several data items may be similar. 
Suppose you want to record name and treatment (RX) status for up to 12 contacts of 
each case-patient. One possible approach is to create fields called NAME1 through 
NAME12 and RX1 through RX12. This approach allows the data to be entered, although it
creates a very large data-entry record (say 12 x 22 characters for NAMEs and 12 x 1
characters for RX = 276 characters, even if no information about contacts is entered).
However, analyzing the information becomes a programming nightmare, as determining the 
number of contacts or their treatment status requires examining at least 12 different 
fields in each record to see whether they have been filled in and keeping a running 
tally of the results. In computer data-base jargon, the record is not "normalized." 
These repeating groups of fields should be placed in separate records--one for each
contact--linked to the main file as described below in the section on linking
special-purpose records. Then a case-patient with one contact has one record in the case file
and one record in the contact file rather than the equivalent of these plus 11 empty
records in a single file.


This problem is resolved by rethinking what is really the best unit around which to 
build an individual record. The simple answer is that if you intend to tabulate 
cases, build a case record; if you will tabulate contacts or follow-up visits, then 
you need a contact or follow-up record. If both are necessary and the system is large 
or permanent, records should be placed in separate files and linked using relational 
data-base features as described below. 
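
The difference that separate, linked files make to analysis can be shown with a short sketch. The field names follow the example in the text (NAME, RX) plus a hypothetical CASEID linking field; the records themselves are invented.

```python
from collections import Counter

# Case file: one record per case-patient (invented data).
cases = [
    {"CASEID": "92-000001", "DISEASE": "SYPHILIS"},
    {"CASEID": "92-000002", "DISEASE": "SYPHILIS"},
]

# Contact file: one record per contact, linked to its case by CASEID.
contacts = [
    {"CASEID": "92-000001", "NAME": "CONTACT A", "RX": "Y"},
    {"CASEID": "92-000001", "NAME": "CONTACT B", "RX": "N"},
    {"CASEID": "92-000002", "NAME": "CONTACT C", "RX": "Y"},
]

def contacts_per_case(case_records, contact_records):
    """Count contacts named and contacts treated for each case."""
    named = Counter(c["CASEID"] for c in contact_records)
    treated = Counter(c["CASEID"] for c in contact_records if c["RX"] == "Y")
    return {
        case["CASEID"]: {"named": named[case["CASEID"]], "treated": treated[case["CASEID"]]}
        for case in case_records
    }

summary = contacts_per_case(cases, contacts)
print(summary)
```

Counting contacts becomes a matter of counting records, with no need to inspect 12 pairs of mostly empty fields in every case record.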

Data Entry 

The details of data entry should be determined and documented, including who will 
prepare the paper records (if needed) for entry, who will enter them, and at what 
intervals. The status of the report as "suspected" or "confirmed" may determine 
whether it is entered, and this must be determined at the outset. Most disease 
reports are entered in batches--once a week, for example--and in many states not more 
than an hour or two is needed to enter the data for a week, although the quantity of 
records varies sixfold in size in different states and correspondingly in time 
required to enter data. 

Records linked to more extensive specialized forms can be sent as partial submissions 
and revised later to avoid delays in reporting caused by the slower progress of data 
collection for the more detailed forms. This issue needs to be considered and 
resolved in advance.

Cleaning and Editing the Data 

Errors or duplications inevitably occur during data entry, and additional information 
may arrive that requires changes or additions. The data can be "cleaned" during data 
entry or with the help of analytic programs that display "outliers," and data can be
checked visually by browsing through records in the ENTER program or by scanning a 
list printed by the ENTER or ANALYSIS programs. Records can be viewed and corrected 
in a spreadsheet format in ANALYSIS. Finally, a program called VALIDATE can be used 
to compare files entered in duplicate by different operators. Records showing 
different entries are printed out for reconciliation. 
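
The idea behind duplicate-entry comparison can be sketched as follows. This is not the VALIDATE program itself, whose internals are not described here; the function name, the CASEID key, and the sample records are assumptions made for illustration.

```python
def compare_duplicate_entry(first_pass, second_pass, key="CASEID"):
    """List (key, field, value1, value2) for every disagreement between two entries."""
    second_by_key = {record[key]: record for record in second_pass}
    first_keys = {record[key] for record in first_pass}
    differences = []
    for record in first_pass:
        other = second_by_key.get(record[key])
        if other is None:
            differences.append((record[key], "<record>", "present", "missing"))
            continue
        for field, value in record.items():
            if other.get(field) != value:
                differences.append((record[key], field, value, other.get(field)))
    # Records that only the second operator entered.
    for record in second_pass:
        if record[key] not in first_keys:
            differences.append((record[key], "<record>", "missing", "present"))
    return differences

operator_a = [
    {"CASEID": "92-000001", "AGE": 34, "COUNTY": "ADAMS"},
    {"CASEID": "92-000002", "AGE": 7, "COUNTY": "BAKER"},
]
operator_b = [
    {"CASEID": "92-000001", "AGE": 34, "COUNTY": "ADAMS"},
    {"CASEID": "92-000002", "AGE": 70, "COUNTY": "BAKER"},
]
differences = compare_duplicate_entry(operator_a, operator_b)
print(differences)
```

Each difference still has to be reconciled by a person looking at the paper form; the comparison only shows where to look.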

Epi Info allows extensive programming of error checks on data entry. Each field can
be set to accept only specified codes, and, if necessary, multiple fields can be
checked for inconsistencies such as gynecologic conditions recorded for males.
Unfortunately, many errors cannot be caught by such systems, and one can still enter 
the wrong code for a less gender-specific disease. 
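
The two kinds of check described above, legal codes for a single field and consistency between fields, might look like this in a general-purpose language. The code lists, the age limits, and the function name are hypothetical and are not Epi Info's own check syntax.

```python
LEGAL_SEX_CODES = {"M", "F", "U"}
FEMALE_ONLY_DISEASES = {"PELVIC INFLAMMATORY DISEASE"}  # hypothetical list

def check_record(record):
    """Return a list of problems found in one record; empty means it passed."""
    problems = []
    # Single-field checks: only specified codes or ranges are accepted.
    if record.get("SEX") not in LEGAL_SEX_CODES:
        problems.append("SEX must be M, F, or U")
    age = record.get("AGE")
    if not isinstance(age, int) or not 0 <= age <= 120:
        problems.append("AGE must be a whole number from 0 to 120")
    # Cross-field check: a female-only condition recorded for a male.
    if record.get("SEX") == "M" and record.get("DISEASE") in FEMALE_ONLY_DISEASES:
        problems.append("DISEASE is inconsistent with SEX")
    return problems

inconsistent = check_record({"SEX": "M", "AGE": 30, "DISEASE": "PELVIC INFLAMMATORY DISEASE"})
wrong_but_legal = check_record({"SEX": "F", "AGE": 30, "DISEASE": "MEASLES"})
print(inconsistent)
print(wrong_but_legal)
```

The second record passes every check even if the patient actually had mumps, which is exactly the kind of error such rules cannot catch.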

Regardless of the method used, errors should be caught and corrected near the time of 
data entry if possible, since they can create much larger problems if left for the end 
of the year. The choice of method depends largely on the orientation and number of personnel
available and perhaps on their preferences after trying different methods.

Analysis of Data 

The type of output desired should be planned in advance, since the inputs and outputs 
usually specify fairly precisely what kind of processing is needed to achieve the 
result. Dummy tables and graphs should be sketched on paper. Epi Info and many other 
data-base programs can be programmed to print a table or mixture of text and tables in 
almost any format, using a feature called the "report generator." 

It is not necessary to design reports to cover all possible needs, since ad hoc 
queries are an important part of any system, and additional reports can be added later 
if they are deemed useful. In Epi Info, an epidemiologist can learn to do simple 
queries (READ GEPI; TABLES RACE COUNTY) in a short time and to limit these to 
particular time periods (SELECT REPORTWK = 34) almost as easily. 

Sometimes a simple report such as a listing of this week's reports, sorted by disease,
may be as useful as a number of tables with very small numbers in each cell. The 
number of records available should be considered in designing reports and in 
determining how often they will be produced. 

Distributed Data Base 

So far, we have described a surveillance system housed in a single microcomputer. As 
more community health departments obtain computers, however, the trend is toward 
networks of computers within a state, connected by modem in ways analogous to those 
used in the National Electronic Telecommunications Surveillance System (NETSS), with 
its 50+ state and territorial participants. Each participating site enters data and 
sends them periodically to a computer at the next level up. 


This process would be simple to do if all data were entered at the local level and 
sent to the state level, and if no changes were made later. However, in practice, not 
only are changes made, but in some states records are entered at both state and local 
levels, and some method must be in place to see that both levels of staff eventually 
have the same records. 

Ideally, only one copy of the records would be considered the "master" copy, and each 
user would know its location and provide updates only at the designated time. The 
best way to accomplish this objective is still being worked out, and experiments of 
several types are likely. Designating only one of the sources as the "owner" and 
rightful editor of the data is one possibility. At present, we favor indicating on 
each record the site at which it was created and allowing only that site to make 
changes that are transmitted weekly to the other sites to update their copies of the 
records.

State health departments use the latest software to transmit year-to-date summary
information on the state data base to the national level each week. These data are
compared automatically with the contents of the national data base, and any
discrepancies are reported.
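The weekly comparison can be pictured as follows. The sketch assumes that both the state summary and the national data base can be reduced to year-to-date counts by disease; the function name and the sample figures are illustrative only and are not taken from NETSS.

```python
def ytd_discrepancies(state_summary, national_counts):
    """Compare year-to-date counts by disease and return those that differ.

    Returns {disease: (state count, national count)}; a disease missing
    from one side is treated as a count of 0 (an assumption of this sketch).
    """
    diseases = sorted(set(state_summary) | set(national_counts))
    return {
        d: (state_summary.get(d, 0), national_counts.get(d, 0))
        for d in diseases
        if state_summary.get(d, 0) != national_counts.get(d, 0)
    }


state = {"hepatitis A": 120, "measles": 4, "shigellosis": 30}
national = {"hepatitis A": 120, "measles": 3}
report = ytd_discrepancies(state, national)
print(report)
```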

Transmitting Data 

In NETSS, most states transmit reports each week through a commercial 
telecommunications network. The 50+ reports stay in the network computer until they 
are picked up on Tuesday morning by CDC staff, stripped of comments and address 
material, and joined together in a single file for processing on the CDC mainframe. 
Error checking is done to test for invalid codes and other problems, and error notices 
are sent back to the states. 

Another method that eliminates errors caused by telephone noise involves transmission 
directly from computer to computer by means of modems and software that retransmits if 
errors are caused by noise. Several states are using this method to connect with CDC 
microcomputers that, in turn, send the files to the CDC mainframe. 

A third, less elegant but often practical, solution is physical transfer of floppy
diskettes by mail or messenger at intervals. This allows large files to be
transferred with minimal inconvenience, and may be appropriate if the additional
trouble of setting up modems and software is not yet warranted or in developing
countries where telephones are unreliable or unavailable.

In any case, the result is that a copy of a file of records from the peripheral site 
arrives at the central site. The records must then be merged into the main data base. 
If all are new records, this task is straightforward. If the incoming records contain 
updates for records previously transmitted, the process is more complex. 

Correcting and Updating Records from Another Site 

In NETSS, only state participants are allowed to update records; CDC staff do not do 
so, although they may enter temporary telephone reports. Updates are sent as records 
with the same identification number as that for the original record. If a new record 
has the same identification number as a record in the data base, the existing record 
is updated so that all non-blank fields of the new record prevail. To change an age, 
for example, a state would send a record containing the case identification number and 
the new age. To delete a record, the state, year, and identification numbers are sent 
in a special 'Delete' record. When errors are found at CDC, the information is 
transmitted to the state staff, who then correct the errors and transmit update
records the following week. 
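The update rules just described (new records are added, non-blank fields of an update prevail, and a special record deletes) can be sketched as below. The record layout, the RECTYPE field used to mark a delete record, and the function name are assumptions made for illustration; the actual NETSS record format is not reproduced here.

```python
def apply_transmission(database, incoming):
    """Merge a transmitted file into the main data base.

    database: dict keyed by (state, year, case identification number).
    incoming: list of record dicts from the peripheral site.
    """
    for rec in incoming:
        key = (rec["STATE"], rec["YEAR"], rec["ID"])
        if rec.get("RECTYPE") == "DELETE":  # hypothetical marker for a delete record
            database.pop(key, None)
        elif key in database:
            # Update: only non-blank fields of the new record prevail.
            for field, value in rec.items():
                if value not in ("", None):
                    database[key][field] = value
        else:
            # No record with this identification number yet: store as new.
            database[key] = dict(rec)
    return database


db = {("GA", 1992, 17): {"STATE": "GA", "YEAR": 1992, "ID": 17, "AGE": 34, "COUNTY": "Fulton"}}
# Change the age only; the blank COUNTY leaves the stored value untouched.
apply_transmission(db, [{"STATE": "GA", "YEAR": 1992, "ID": 17, "AGE": 43, "COUNTY": ""}])
print(db[("GA", 1992, 17)])
```

Because blank fields never overwrite stored values, a field cannot be blanked out by an ordinary update under this rule; the sketch makes no attempt to handle that case.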

Individual and Summary Records 

Many systems function with a record for each individual case report. In some, 
however, there is a need for summary records, each of which represents a number of 
case reports. This is helpful if large numbers of similar records (e.g., cases of 
gonorrhea in a big city) are processed, or if only summary numbers are available. It 
also allows records from entire years to be summarized in condensed format, so that a 
5-year trend can be calculated without reading and processing each record for the 
previous 5 years. 

A summary record is similar to a case record, but it contains an additional field 
called 'COUNT,' which contains a number. The number indicates how many records with 
the same information are represented by the summary record. Epi Info contains
commands called SUMTABLES and SUMFREQ to process summary records. These commands sum
the contents of the count field rather than counting individual records. Since a 
record with COUNT equal to 1 is an individual case record, files that are mixtures of 
summary and individual records can be processed as a single unit. 
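The effect of summing the count field, rather than counting records, can be shown with a short sketch. This is not the SUMFREQ command itself; the function name and sample records are hypothetical, and treating a record with no COUNT field as a single case is an assumption of the sketch.

```python
from collections import Counter


def sum_frequency(records, field):
    """Frequency of `field`, weighting each record by its COUNT value."""
    totals = Counter()
    for rec in records:
        # Assumption: a record lacking COUNT is an individual case (COUNT = 1).
        totals[rec[field]] += rec.get("COUNT", 1)
    return totals


records = [
    {"DISEASE": "gonorrhea", "COUNT": 250},   # summary record for a big city
    {"DISEASE": "gonorrhea", "COUNT": 1},     # individual case record
    {"DISEASE": "hepatitis A", "COUNT": 1},   # individual case record
]
totals = sum_frequency(records, "DISEASE")
print(totals["gonorrhea"], totals["hepatitis A"])
```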

Linking Special-Purpose Records to the Main Data Base 

As mentioned above, sometimes it is necessary to link related records in different 
files together in order to allow easy processing of, for example, case-patients and
contacts who are related to case-patients. This requires that a common case 
identification number be included in each record. Epi Info and other data-base 
programs, such as dBASE, allow automatic linking of records through such a common 
identifier. On data entry, answering "Y" to the question "Contacts (Y/N)?" might
cause another form, representing the contact file, to appear on the screen. The 
operator can then enter one or many contact forms for this case, pressing a function 
key (F10) to return to the main form. A separate record is created for each contact. 

In Epi Info's ANALYSIS program, the CONTACT file is READ, and the CASE file is linked 
("related") to it. Each contact record then contains information about the case- 
patient as well as about the contact, and questions such as "how many contacts of 
female case-patients were treated?" can be answered easily. The CASE file can also be 
processed alone to answer questions such as "how many cases of syphilis were there?" 
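A minimal sketch of this kind of linkage follows. It imitates the idea of relating the CASE file to the CONTACT file through a common identifier; it is not Epi Info syntax, and the field names (CASEID, SEX, TREATED) and the CASE_ prefix are invented for the example.

```python
def relate(contacts, cases, key="CASEID"):
    """Attach the matching case-patient's fields to every contact record."""
    cases_by_id = {case[key]: case for case in cases}
    linked = []
    for contact in contacts:
        case = cases_by_id.get(contact[key], {})
        # Prefix case fields so they cannot collide with contact fields.
        merged = {"CASE_" + name: value for name, value in case.items() if name != key}
        merged.update(contact)
        linked.append(merged)
    return linked


cases = [
    {"CASEID": 1, "SEX": "F", "DISEASE": "syphilis"},
    {"CASEID": 2, "SEX": "M", "DISEASE": "syphilis"},
]
contacts = [
    {"CASEID": 1, "NAME": "contact A", "TREATED": "Y"},
    {"CASEID": 1, "NAME": "contact B", "TREATED": "N"},
    {"CASEID": 2, "NAME": "contact C", "TREATED": "Y"},
]
linked = relate(contacts, cases)

# "How many contacts of female case-patients were treated?"
treated_contacts_of_females = sum(
    1 for rec in linked if rec.get("CASE_SEX") == "F" and rec["TREATED"] == "Y"
)
# "How many cases of syphilis were there?" uses the case file alone.
syphilis_cases = sum(1 for case in cases if case["DISEASE"] == "syphilis")
print(treated_contacts_of_females, syphilis_cases)
```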

We also link disease-specific forms to the main data base of reports. Hepatitis, for 
example, requires a full page of extra information used to define further the 
epidemiology of a report. By linking a hepatitis file to the main case file, records 
are created only if the disease is hepatitis, thus saving a great deal of storage 
space over the single-file method, in which all the questions on hepatitis would be
left blank in a nonhepatitis record. Current systems, including the one distributed 
as an example on the Epi Info disks, contain related files for hepatitis, meningitis, 
and enteric disease, each of which only appears if a relevant disease code is entered. 

Dissemination of Data 

Dissemination of results is an important element of the surveillance cycle. 
Computerization can assist by making new methods of analysis or presentation
practical. Use of tabular or graphics software in conjunction with desk-top
publishing technology can make the preparation of results not only faster but more 
accurate and meaningful. A graphic method for comparison of current results with 
those for the past 5 years has been introduced to the Morbidity and Mortality Weekly 
Report in the United States (Figure V.12) (14). This method would have been too 
cumbersome for manual processing. 

Computer software greatly simplifies and improves the production of maps and graphs. 
Epi Map, a public domain companion to Epi Info to be released in 1993, will make
mapping available to anyone with an IBM-compatible microcomputer. 

Tables, maps, graphs, text, and data files may be made available either on-line via 
modem connections or by distributing floppy or CD-ROM disks. The latter are 
particularly useful in remote areas or for larger volumes of data than can be easily
sent over low-speed modems. 

Data Disasters 

Destruction of or damage to data on hard disks should be expected and planned for.
During the first 4 years of NETSS (and during the 3-year tenure of its predecessor,
the Epidemiologic Surveillance Project), a number of hard disks have "crashed." In 
most cases, back-up files on floppy diskettes had been properly prepared and stored, 
and they were used to restore the data once the disk had been replaced. 

Recently, some state programs began to reuse case-identification numbers from several 
years ago, not realizing that the new records would overwrite the old records in the 
national data base. It is important to be clear about the time period for which 
updates will be accepted. 

Upgrading either hardware or software is a frequent cause of problems, when the new 
items have unexpected features, occupy more memory space, or require that protocols 
for functions, such as communications, be changed. 

Computer viruses are an increasing cause of problems. They can cause a variety of
difficulties ranging from erratic behavior of software to complete loss of files.
They may be introduced from networks, by accessing other computer bulletin boards, or
by loading copied software from unknown sources.

Programs to detect and eradicate computer viruses are available commercially. It is 
essential to install one of these and to be sure that any disk from an external source 
is scanned for viruses before it is copied or used as a source of new programs. 

Backup Methods 

Methods for disaster prevention center around regular backup of data files onto floppy 
diskettes (or tape if available, but beware of tape backups with only one compatible 
tape drive in the same institution). The backup copies should be rotated so that
several circulate in turn and so that the one overwritten has at least two more recent 
relatives. To protect against fire, water damage, and damage by panic-stricken 
personnel, it is wise to keep at least one backup in a site remote from the computer. 
Setting the write-protection feature on the diskettes after making the backup is an 
additional protection. 
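The rotation rule can be stated compactly: with at least three backup sets in circulation, always overwrite the oldest, so that two more recent copies survive. The sketch below is one possible way to choose the next set; the slot labels, the date format, and the function name are assumptions made for illustration.

```python
def next_backup_slot(backup_dates):
    """Choose which backup set to overwrite next.

    backup_dates maps a set label to the ISO date (YYYY-MM-DD) of its
    last backup, or None if the set has never been used.
    """
    if len(backup_dates) < 3:
        # With fewer than three sets, the overwritten copy would not
        # have two more recent relatives.
        raise ValueError("need at least three backup sets in rotation")
    for slot, date in backup_dates.items():
        if date is None:
            return slot  # use an empty set before overwriting anything
    # ISO dates sort correctly as strings, so min() finds the oldest set.
    return min(backup_dates, key=backup_dates.get)


sets = {"A": "1992-08-03", "B": "1992-08-10", "C": "1992-08-17"}
print(next_backup_slot(sets))
```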

Upgrading hardware or software should be done at a time when use of the system is 
least critical, and care should be taken to allow for replacing the old system exactly 
as it was if problems occur with the new one. Thus, before installing a new version 
of software, the old one should be thoroughly backed up or preferably left in place in 
another directory so that it can be used if necessary. 

Training of Staff and Transition Techniques 

We have found that the most effective staff training occurs by having potential 
operators participate in the design of the system and receive short demonstrations and 
hands-on lessons at the time the system is installed. Usually installation of a 
system takes two or three days for planning and decision making, two or three days for 
programming, and a similar period for staff training, trial runs, and revisions. 

National meetings and training sessions for operators of state surveillance systems 
have been helpful in providing extra training and motivation and in surfacing problems 
that need to be addressed and new ideas for software improvements. 

During the transition from a paper to a computerized system, both systems are run in
parallel for a period until the results are satisfactory and staff feel comfortable
with the new system.


The old image of the computer expert in an expensive suit handing the client the keys 
to the new "turn-key" system perfectly adapted to his or her needs was probably always 
a fantasy, but with modest budgets, small data bases, and a desire for "hands-on" 
access to data, it certainly has little relevance to public health needs. Although in 
some ways centralized computers and instant interactivity for updating records would 
present fewer problems than the distributed systems we have described, public health 
workers usually do not require and cannot financially afford the instant updates 
needed for law enforcement, banking, or airline reservations. Microcomputers and 
local data bases can maintain the data and analytic results closer to the 
professionals primarily responsible for prevention and control. 

We are convinced that participation of all 50 state health departments in the national 
computerized system would have been impossible without a) software for states that 
allowed customization for use of local forms and procedures, b) participation of each 
state epidemiologist's staff in designing a system unique to the state, and c) a 
standardized record format. Each state has a different input form, although the 
records sent to CDC are restructured and variable values are recoded by Epi Info 
programs so that they are in the uniform national format. 

As systems become more complex, however, it is important to standardize as many 
features as possible from state to state so that a thoroughly debugged core system can 
be used by all. We are gradually achieving this with a new Epi-Info based system that 
has a series of standard modules, accompanied by other modules that are highly 

As pointed out in this chapter, there is an enormous gap between what is 
technologically possible with the use of computers in public health and what is 
actually going on at the grass-roots level of public health practice. Until the 
keeping of medical records in clinical practice is computerized to a much greater 
extent, it would be difficult to imagine that our scenario of the future will actually
move closer to reality.

Other key issues remaining to be resolved include a) the balance between 
confidentiality and free access to clinical records for public health purposes, b) the 
cost of data access and of programming and processing, and c) the ability of both 
professionals and the public to deal with "dirty" and preliminary data. 

Many of these issues have both technical and social solutions. A great deal of work 
in both realms remains to be done before computerized public health surveillance can 
be said to have achieved its full potential. 


1. Dean AD, Dean JA, Burton AH, Dicker RC. Epi Info, version 5: a word
processing, database, and statistics program for epidemiology on microcomputers.
Atlanta, GA: Centers for Disease Control, 1990.

2. Dean AD, Dean JA, Burton AH, Dicker RC. Epi Info: a general-purpose
microcomputer program for public health information systems. Am J Prev Med

3. Graitcer PL, Burton AH. The epidemiologic surveillance project: a computer-
based system for disease surveillance. Am J Prev Med 1987;3:123-7.

4. Centers for Disease Control. National Electronic Telecommunications System for
Surveillance--United States, 1990-1991. MMWR 1991;40(29):502-3.

5. Odell-Butler ME, Ellis B, Hersey JC. Final report for task 8, an evaluation of
the National Electronic Telecommunications System for Surveillance (NETSS).
Arlington, VA: Battelle, June 1991:49-50.

6. The big pay-off (benefits of computerizing a business) (node supplement). IBM
System User March 1990:S20.

7. Mary M, Garnerin P, Roure C, et al. Six years of public health surveillance of
measles in France. Int J Epidemiol 1992;21:163-8.

8. Centers for Disease Control. Surveillance of influenza-like diseases through a
national computer network--France, 1984-1989. MMWR 1989;38(49):855-7.

9. Watkins M, Lapham S, Hoy W. Use of a medical center's computerized health care
database for notifiable disease surveillance. Am J Public Health
1991;81(5):637-9.

10. Bernard KW, Graitcer PL, van der Vlugt T, Moran JS, Pulley KM. Epidemiological
surveillance in Peace Corps volunteers: a model for monitoring health in
temporary residents of developing countries. Int J Epidemiol 1989;18(1):220-6.

11. Gaynes R, Friedman C, Copeland TA, Thiele GH. Methodology to evaluate a
computer-based system for surveillance of hospital-acquired infections. Am J
Infec Control 1990;18:40-6.

12. Shultz JM, Novotny TE, Rice DP. Quantifying the disease impact of cigarette
smoking with SAMMEC II software. Public Health Rep 1991;106(3):326-33.

13. Call B. The ones that got away: why some industries have not yet computerized.
PC Week June 24, 1986;3(25).

14. Centers for Disease Control. Proposed changes in format for presentation of
notifiable disease report data. MMWR 1988;38(47):805-9.


Chapter XII 

State and Local Issues in Surveillance 

Melinda Wharton 
Richard L. Vogt 

"The government is very keen on amassing statistics. They collect them, add them, 
raise them to the nth power, take the cube root and prepare wonderful diagrams. But 
you must never forget that every one of these figures comes in the first instance from 
the village watchman, who just puts down what he damn well pleases."

Josiah Stamp 


In a recent report, the Institute of Medicine defined assessment as a core function of
public health agencies at the state and local level. "An understanding of the
determinants of health and the nature and extent of community need is a fundamental
prerequisite to sound decision-making about health. Accurate information serves the
interests both of justice and the efficient use of available resources. Assessment is
therefore a core governmental obligation in public health." State responsibilities
include "assessment of health needs within the state based on statewide data
collection" as well as "establishment of statewide health objectives, delegating power
to localities and holding them accountable." Responsibilities of local public health
units include "assessment, monitoring, and surveillance of local health problems and
needs and resources for dealing with them" (2).


Although much of this book focuses on surveillance at the national level, the legal
and regulatory authority for public health surveillance activities in the United
States derives from state and local law (see Chapter X). Both the vital records and
morbidity reporting systems were developed initially at the state level, and only
later were national systems developed, with the participation of all states being
voluntary. Indeed, in the United States, state and local governments have both the
authority and the responsibility for almost all public health actions. This
decentralization of power is outlined in the Constitution of the United States.
Therefore, although most of the issues discussed in this chapter are relevant to other
countries, some are unique to the practice of surveillance in the United States.

Although the objectives of surveillance at the state and local level do not differ
substantially from those at the national level, the link to action--whether it be
outbreak control, vector-control activities, legislation requiring use of
child-restraint devices, or community mobilization--is most explicit at the state and
local level. The objectives of state as well as national surveillance must be
considered as systems are developed or redesigned, to assure that the information
needed for public health action is obtained in the most efficient and cost-effective
manner. The focus of the objectives may vary somewhat by condition (see Chapters I
and II).



Only two data sources--vital records and notifiable-disease reports--are available at 
the local level in all states in the United States. Although other data sources 
discussed in Chapter III may be available at the state and local levels in some areas, 
alternate data sources may be needed in some states or localities to assess the impact 
of specific public health problems. Innovative solutions to particular data-related 
problems have been developed in many communities; some issues related to data sources 
at the state and local level are summarized below. For more information regarding 
other data sources, see Chapter III. 

Notifiable Diseases 

All 50 states require that physicians report cases of specified notifiable diseases to 
the appropriate state or local health department. The legal authority for the 
collection of this information rests with state statutes that are promulgated in state 
regulation; the diseases that are reportable vary by state (2,3). The notifiable- 
diseases reporting system was initially developed for reporting epidemic diseases such 
as smallpox and yellow fever, and this mechanism is still most commonly used for 
surveillance of infectious diseases. For noninfectious conditions, reporting by 
physicians is less uniformly required. In many states, however, reporting of specific 
occupational or chronic diseases is required by statute. 

Sentinel Systems 

State and local health departments may supplement information available through the 
notifiable-disease reporting system by creating sentinel reporting systems. State- 
based sentinel systems in Maine and Rhode Island relied on reporting by physicians, 
who were recruited by the state health department and were paid small amounts of money 
for participation. Both systems were subsequently discontinued because of budgetary 
cutbacks (4,5).

More recently, a sentinel active surveillance system developed in Missouri has been
organized to ensure representation of the six public health districts in the state.
Over 500 sites were recruited for participation, including schools, hospitals,
day-care centers, preschools, and nursing homes; fewer than 30% of the participating
individuals or institutions were physicians or clinics. Each participating site is
telephoned weekly by local health departments to solicit reports (6). A similar
system, including universities, has been operated by the Los Angeles County Department
of Health Services since 1981. In addition to providing timely information about
reportable diseases, the system also has provided data on a variety of nonreportable
conditions (7).

Such sentinel systems may be particularly useful for following trends in common
conditions--e.g., varicella or influenza--when precise counts of cases are not needed
and when a public health response is not necessary for individual case reports.
However, if the reporting units selected for the sentinel system are unrepresentative
of the overall reporting population, findings may not be generalizable to the wider
population. Sentinel surveillance systems may be used to facilitate collection of
additional risk-factor and other information on a subset of case reports, thus
limiting the overall burden of data collection (8).

Hospital-Based Surveillance

Hospital-based surveillance systems, drawing on emergency room visits or hospital-
discharge data, have most commonly been developed at the state and local level for
surveillance of injuries (9-15). Other uses have included assessment of unmet health
needs by identification of preventable disease (sentinel health events) (16). Aside
from nosocomial infections, such systems are likely to have limited usefulness for
surveillance for communicable disease (17).

In areas in which hospital-discharge diagnoses are coded using external cause of
injury and poisoning codes (E-codes), hospital-discharge data are useful for
surveillance of injuries. Currently 28 states have uniform hospital-discharge
reporting systems, and addition of E-coding is a high priority for state and local
injury-surveillance programs (18). The recent experience of New York State
demonstrated the feasibility of such an addition, particularly when care was taken to
develop a constituency to support the proposed change. Review of clinical records
demonstrated that 93% of charts contained information necessary to allow proper
coding. Since E-coding has begun, 95% of records of injured persons contain a valid
E-code (19).

Other hospital-based data sources may be useful for surveillance at the state and
local level. For example, trauma registries are a potential source of data for injury
surveillance (20), despite the lack of representativeness of patients referred to
trauma centers for care (21).

School-Based Surveillance

School-based surveillance systems have been developed in some states to monitor
disease trends among children of school age. This approach has been used for
surveillance of influenza and varicella (22,23). Absenteeism is an excellent marker
for influenza and is almost always available for administrative reasons. In Michigan,
schools provide reports of cases of notifiable diseases among their students--along
with counts of number of cases of influenza-like illness and varicella--to local
health departments on a weekly basis. In many states, notifiable-disease regulations
mandate reporting of specified diseases by school authorities.

Surveys at the State and Local Level 

Information on certain issues, such as seat-belt use or nonutilization of health-care 
services, cannot be obtained readily without the use of surveys. Although national 
surveys may provide national estimates, data at the state or even local level are 
needed for health planning or to support legislative initiatives. Since 1981, state 
health departments have collaborated with the Centers for Disease Control (CDC) to 
conduct telephone surveys of adults to obtain information on health practices and 
behavior. In 1990, 45 states and the District of Columbia participated in the 
Behavioral Risk Factor Surveillance System (BRFSS). The BRFSS allows estimation of
age- and gender-specific prevalence of various risk factors by state (24,25). 
Likewise, behavioral risk factors among young people are periodically measured through 
state and local school-based surveys in the Youth Risk Behavior Surveillance System 
(26). County or community surveys may be particularly useful in areas with small 
populations, in instances in which morbidity or mortality data may be of limited 
usefulness to monitor the impact of interventions (27).

National Mortality Registration System 

State law requires filing a death certificate for every death that occurs in the
state, and death registration is virtually complete in the United States. At the
state level, mortality data are available before national data are compiled and
released. Although the underlying cause of death is determined using standard
computerized algorithms in all states, not all states use E-coding.

Such data are useful at the local level to identify preventable mortality and to set 
health priorities in the community. These efforts may be particularly important in 
developing community-based prevention programs for chronic disease (28).

Other Data Sources 

Surveillance responsibilities of state and local health departments extend into many
other areas, and in some jurisdictions may include monitoring of environmental
quality, illnesses of domestic and wild animals, and vector populations. Although
outside the scope of this book, these types of surveillance provide important
information at the state and local level. For example, management of persons exposed
to possibly rabid animals is influenced by the epidemiology of rabies in the area of
exposure (29).

Arbovirus surveillance includes monitoring of vectors, vertebrate hosts, human cases, 
weather, and other factors in order to detect or predict changes in the transmission 
dynamics of arboviral infections. Guidelines for arbovirus surveillance programs in 
the United States have recently been developed (30).

Provider-Based Reporting: Special Issues 

Mandatory reporting of communicable diseases by physicians has a long history in the
United States, and there is an equally long history of failure on the part of
physicians to comply. During the yellow fever epidemic of 1795, the New York City
Health Committee quarantined patients with yellow fever at Bellevue Hospital. Many
physicians refused to report cases, and the New York Medical Society went on record
opposing the Committee's action, on grounds that the disease was not contagious (31).
Physicians fought early efforts to make tuberculosis reportable, arguing that
compulsory reporting constituted an invasion of the doctor-patient relationship and a
violation of confidentiality (32). By 1913, five states had enacted regulations
requiring reporting of venereal disease. Dr. Herman Biggs, director of the New York
City Board of Health, stated that "the ten year long opposition to the reporting of
tuberculosis will doubtless appear a mild breeze compared with the stormy protest
against the sanitary surveillance of the venereal diseases" (33).

The completeness of reporting of communicable diseases is variable, but for most
diseases in most locations, it is thought to range from low to very low (34,35). Of
course, factors other than the failure of physicians to report cases contribute to the
low level of reporting of incident cases. Persons with asymptomatic infections or
mild disease are unlikely to seek medical care. Of those persons who do seek care,
not all will receive a specific diagnosis. Nationally, only 5% of cases of varicella
are reported in the United States (36), and estimates of completeness of reporting are
similar for shigellosis (37). Studies of outpatient-based or hospital-based reporting
in some areas suggest somewhat higher levels of reporting of diagnosed cases of
notifiable diseases, with substantial variation by disease (38-40). Reporting rates
are higher for inpatients than outpatients (17).

Given the historic reluctance of physicians to participate in reporting disease, it is
fortunate that reports of disease are available to most state health departments from
other sources. Almost all states mandate reporting by clinical laboratories of at
least some notifiable diseases (41). Laboratory reporting is often more readily
available and reliable than reports from physicians. In Vermont, 71% of initial
reports of confirmed cases of notifiable diseases in the period 1986-1987 originated
from clinical laboratories; only 10% originated from physicians' offices (42). In
Oklahoma, approximately 85% of cases of shigellosis are reported, but laboratories
account for almost all of the reports received. Laboratories reported 77% of all
reported cases, compared with only 6% for physicians (43).

Although laboratory-based reporting may be a valuable adjunct to physician-based
reporting, it cannot replace reporting by physicians for all diseases. Some
reportable diseases are clinical syndromes, requiring clinical judgment, and no
specific laboratory diagnostic procedures exist (44). In other situations, laboratory
diagnosis may play an important role, but may not be routinely available in a timely
enough manner to replace reporting by physicians. Finally, physicians may have
additional information that is epidemiologically important but is not known to the
laboratory; a timely report by a physician may allow early institution of control
measures, without waiting for the health department to follow up on laboratory reports.


A number of studies have attempted to identify reasons for physicians' failure to
report notifiable diseases (42,45-47). In recent years, physicians have cited many of
the same objections that have been raised historically, as noted above, although it is
at least reassuring that the noncontagiousness of diseases that are actually
communicable is no longer invoked. Commonly cited reasons, in approximate order of
importance, are summarized in Table XII.1.

In an effort to improve reporting of notifiable diseases by physicians, local and 
state health departments have tried a number of different strategies. Although many 
of them have not been formally evaluated, enough information is available to reach 
some conclusions about possible successful approaches. 

Projects aimed at improving reporting by physicians have included many interventions 
(e.g., revised reporting procedures, improved dissemination of findings and feedback 
to participants, and informational campaigns regarding the importance of reporting and 
outlining procedures for reporting). Even relatively intensive efforts may not
produce major increases in reporting, although they may be effective in increasing 
awareness of reporting procedures among physicians (7,48). 

Efforts to increase reporting through specific projects provide some clues on the most 
effective approaches. Active surveillance projects, in which health department 
personnel contact physicians' offices on a regular basis, have demonstrated 2- to 5- 
fold increases in the reporting of specified diseases, as well as increases in 
reporting of other conditions not subject to active surveillance (49-51). The
consistency of these findings demonstrates that under some circumstances physicians 
are willing to report cases of notifiable disease. In these studies, reporting was a 
simple matter, and that may be important; equally important may be the message 
conveyed by the substantial investment by the health department in active
surveillance--that disease reporting is an important activity.

The need for surveillance data on notifiable disease and the usefulness of such data 
are so obvious to workers in state and local health departments that we often believe 
that all physicians would report if they only understood the importance of reporting. 
Efforts to educate physicians have included a) lectures to medical students, house
officers, and local medical groups on the importance of reporting; b) health department
newsletters; c) educational mailings; and conjunction with licensure. Although
all of these may be useful, and lectures and newsletters are important forms of
feedback to the medical community, single presentations to clinical groups,
newsletters, and mailings have not been found, in isolation, to increase
reporting. Intensive efforts to market the concept of reporting may be more useful
but will be accompanied by an obvious increase in cost (52).

If sending an occasional speaker to the local medical society and mass mailings are 
not effective, what is? The active surveillance projects and other studies of 
interventions demonstrate the usefulness of telephone contact (49-51, 53). In fact,
the efforts that work all target individual physicians--rather than groups of
physicians--and make limited use of mailings and more use of personal visits and telephone
contact. Some approaches that appear to be successful include a) providing physicians 
with feedback on the health department's disposition of individual cases (54); b) 
matching laboratory reports with physicians' reports, and for those cases reported 
only by laboratories, notifying physicians that a specific case should have been 
reported to the health department; and c) conducting in-person site visits to review 
reporting procedures (55). The last of these interventions may be quite effective in
enhancing laboratory- and hospital-based reporting, especially if accompanied by a review of
medical records. The relevant factors may be less the mode of contact than the need 
to remind physicians on a regular basis that there is a health department that wants 
the information and that the health department actually does something with the data 
that are provided. 
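
Approach b) above, matching laboratory reports against physicians' reports, can be sketched in a few lines. This is only an illustration under stated assumptions: the record fields (patient_id, disease, physician) and the function names are hypothetical, and a real system would need a more careful matching rule than an exact identifier match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    patient_id: str  # hypothetical identifier shared by both report streams
    disease: str
    physician: str   # physician who ordered the test or filed the report


def laboratory_only_cases(lab_reports, physician_reports):
    """Return laboratory-reported cases that have no matching physician report."""
    reported = {(r.patient_id, r.disease) for r in physician_reports}
    return [r for r in lab_reports if (r.patient_id, r.disease) not in reported]


def reminders_by_physician(lab_only):
    """Group unmatched cases by physician so each physician is contacted once."""
    reminders = {}
    for r in lab_only:
        reminders.setdefault(r.physician, []).append((r.patient_id, r.disease))
    return reminders


# Made-up example data: three laboratory reports, one physician report.
lab = [
    Report("A1", "shigellosis", "Dr. Lee"),
    Report("B2", "hepatitis A", "Dr. Cruz"),
    Report("C3", "shigellosis", "Dr. Lee"),
]
phys = [Report("A1", "shigellosis", "Dr. Lee")]

lab_only = laboratory_only_cases(lab, phys)
followup = reminders_by_physician(lab_only)
for physician, cases in followup.items():
    print(physician, cases)
```

Grouping the unmatched cases by physician reflects the point made above: the contact is directed at an individual physician about a specific case, not at physicians as a group.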

Exhortation and pleading for reports are no substitute for a state or local health
department that responds promptly to reported public health problems, provides useful 
responses to inquiries from physicians and the public, and gives feedback on its 
activities and on the health status of the community to the medical community and the 
public. Nonetheless, a few specific steps that state and local health departments can 
take to improve reporting of notifiable diseases can be identified (Table XII.2).

Active surveillance works, but it is generally too costly to maintain as a routine 
health department activity. Less costly alternatives include sentinel active surveillance,
in which certain physicians and institutions are identified and are targeted
for active surveillance. Although this approach has been successful in some areas, it
is also costly and may detract from collection of surveillance data from non-sentinel
sites. Another approach is what has been called "stimulated passive surveillance," in 
which the health department uses any contact with the medical community to solicit 
reports and provide feedback on community health status and health department activi- 
ties. It may not be feasible to contact every physician, or even a systematic sample 
of physicians, every week, but every week physicians are contacted, for a variety of 
purposes, and those contacts can be used to exchange information. 

Administrative barriers to reporting should be identified and eliminated. Physicians 
should be provided readable and up-to-date copies of lists of notifiable diseases, 
reporting forms, and telephone and facsimile numbers for local and state health 
departments. Reporting procedures should be as simple as possible. Some health 
departments have used toll-free numbers for telephone reporting (46,56). Answering
machines can answer telephones at night, but people can answer questions and
provide--and solicit--additional information. Reporting forms should be simple, clear, and
printed in colors that allow photocopying or transmission by facsimile machine. Self- 
addressed, postage-paid cards or envelopes may be helpful. Although these tools may 
make reporting easier, without the other components of effective surveillance they are 
unlikely to have substantial impact on reporting behavior of physicians. 

State licensing boards may penalize physicians for failing to report, although such 
actions are rarely taken. In California, a physician who failed to report on a 
patient with hepatitis A who subsequently transmitted infection to others had his 
license suspended for a year, and was placed on probation for 5 years (57). The
medicolegal implications of failure to report are well-established in law, where the
physician's obligation has been found to extend beyond the patient under his/her care
(58). Although no single approach--be it improved communications, improved procedures,
education, or fear--is necessarily successful in improving reporting by
physicians, effective presentations have been developed using case studies that
include the medicolegal implications of failure to report (Hendricks K, personal
communication).


Although the mechanisms vary, it is important that lists of notifiable diseases
undergo periodic revision. Public health priorities, epidemiology of specific
conditions, and available public health interventions all change over time, with the 
result that last year's list of notifiable diseases no longer meets this year's needs. 
Additions and deletions must be made on an as-needed basis in order to maintain the 
usefulness of a notifiable-disease system. In particular, care must be exercised to 
assure that data on all notifiable conditions are actually needed and are used for 
public health purposes. "Diseases are often made reportable but the information 
gathered is put to no practical use, and with no feed-back to those who provided the 
data. This leads to deterioration in the general level of reporting, even for 
diseases of much importance. Better case reporting results when official reporting is 
restricted to those diseases for which control services are provided or potential 
control procedures are under evaluation, or epidemiologic information is needed for a 
definite purpose" (59). 

In Canada, specific criteria have been developed for determining which diseases or 
conditions should be reported at the national level (Table XII.3) (60). In practice,
these criteria have not resulted in the removal of any diseases from the list of 
nationally notifiable diseases, but they have at least provided a systematic basis for 
deciding among diseases proposed for addition. 


Most of the analytic issues relevant at the state and local level have been addressed 
elsewhere in this book (chapters V and VI), but some problems encountered in analyses 
at the state and local level are rarely faced at the national level. 

Comparison of rates in different geographic areas poses particular and difficult 
problems when the number of events is small and/or the population of the areas is 
small. When analyzing data drawn from a small population, particularly for an 
uncommon event or from a subset of the population (e.g., when calculating age- or 
race-specific rates), calculated rates may be difficult to interpret. Unfortunately, 
it is difficult to say with certainty what population size, or number of events, is 
"too small" for meaningful analysis. 

Issues involved in assessing the stability of rates and changes in rates when numbers
are small have been well summarized for the nonstatistician (61). For example,
confidence intervals for rates can be calculated as shown in Table XII.4. In general,
rates calculated based on <20 events will have a 95% confidence interval approximately 
as wide as the rate itself. 
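
To illustrate the point about small numbers, the sketch below computes an approximate 95% confidence interval for a rate. Table XII.4 is not reproduced here, so the formula is an assumption: the event count is treated as a Poisson variable and a normal approximation is applied, which is itself rough when counts are very small. The function names and the per-100,000 scaling are illustrative only.

```python
import math


def rate_confidence_interval(events, population, per=100_000, z=1.96):
    """Approximate CI for a rate, assuming a Poisson event count.

    Uses the normal approximation: standard error of the rate = rate / sqrt(events).
    Returns (rate, lower, upper), all expressed per `per` population.
    """
    if events <= 0 or population <= 0:
        raise ValueError("events and population must be positive")
    rate = events / population * per
    half_width = z * rate / math.sqrt(events)
    # The approximation can give a negative lower limit for tiny counts; clamp at 0.
    return rate, max(0.0, rate - half_width), rate + half_width


def relative_width(events, z=1.96):
    """Full width of the interval expressed as a fraction of the rate itself."""
    return 2 * z / math.sqrt(events)


# Made-up example: 16 events in a population of 40,000.
rate, lower, upper = rate_confidence_interval(16, 40_000)
print(f"rate {rate:.1f} per 100,000, 95% CI {lower:.1f} to {upper:.1f}")

for n in (5, 10, 20, 50, 100):
    print(n, "events: interval width is", round(relative_width(n), 2), "times the rate")
```

Under this approximation the relative width depends only on the number of events, not on the size of the population, which is why a small count makes a rate unstable even in a large area.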

Two methods for comparing independent rates (that is, rates from different, non- 
overlapping geographic areas or from a single area at two different nonoverlapping 
time intervals) have been suggested. The 95% confidence interval for the ratio of two 
independent rates can be calculated using the formula shown in Table XII.5. The two
rates differ significantly at the 5% level if the 95% confidence interval for the ratio
of the two rates does not include 1. This method produces valid results if the rate
in the denominator is calculated from more than 100 events. The 95% confidence
interval for the difference between two independent rates can be calculated using the
formula shown in Table XII.6. The rates differ significantly at the 5% level if the
95% confidence interval of the difference between the two rates does not include zero. 
Sometimes the two methods provide contradictory results; if that occurs, one should 
conclude that the rates being compared are not significantly different (61). 
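
The two comparisons and the rule for resolving disagreement between them can be sketched as follows. Tables XII.5 and XII.6 are not reproduced here, so the formulas are assumptions: a log-scale interval for the ratio and a normal-approximation interval for the difference, both treating event counts as Poisson. The function names are hypothetical.

```python
import math


def rate_ratio_ci(events1, pop1, events2, pop2, z=1.96):
    """Ratio of rate 1 to rate 2 with an approximate CI computed on the log scale."""
    if min(events1, events2) <= 0 or min(pop1, pop2) <= 0:
        raise ValueError("counts and populations must be positive")
    ratio = (events1 / pop1) / (events2 / pop2)
    se_log = math.sqrt(1 / events1 + 1 / events2)
    return ratio, ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)


def rate_difference_ci(events1, pop1, events2, pop2, z=1.96):
    """Difference (rate 1 minus rate 2) with a normal-approximation CI."""
    if min(events1, events2) <= 0 or min(pop1, pop2) <= 0:
        raise ValueError("counts and populations must be positive")
    diff = events1 / pop1 - events2 / pop2
    se = math.sqrt(events1 / pop1**2 + events2 / pop2**2)
    return diff, diff - z * se, diff + z * se


def differ_significantly(events1, pop1, events2, pop2):
    """Apply both methods; if they disagree, conclude 'not significantly different'."""
    _, r_low, r_high = rate_ratio_ci(events1, pop1, events2, pop2)
    _, d_low, d_high = rate_difference_ci(events1, pop1, events2, pop2)
    ratio_excludes_one = not (r_low <= 1 <= r_high)
    difference_excludes_zero = not (d_low <= 0 <= d_high)
    return ratio_excludes_one and difference_excludes_zero


# Made-up example: two areas of 100,000 population each.
print(differ_significantly(150, 100_000, 100, 100_000))
print(differ_significantly(110, 100_000, 100, 100_000))
```

Requiring both intervals to exclude their null value is one way to encode the conservative rule stated above for contradictory results.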

In another report, four age-adjusted mortality indexes were compared, using 1969-1971 
U.S. mortality data by county, for counties with populations of >5,000. On the basis 
of coefficients of variation, the standardized mortality ratio produced stable
results for mortality data from all counties studied, while unacceptable instability
was found when the relative mortality index was applied to data from counties with
populations of <50,000. Calculation of years of life lost from all causes produced
stable results when applied to data from counties with populations of ≥25,000 (62).
The stability of rates for specific causes of death remains a problem for small 
geographic areas. Methods for stabilization of rates have been developed, specifical- 
ly for mapping of uncommon events such as suicide or specific types of cancer by 
county (63,64). 

As an initial step, before a more complicated method for stabilization of rates is 
applied, aggregated rates should be compared with disaggregated rates (i.e., multiple 
years versus a single year; state-wide versus county-wide; and entire population 
versus age-, gender-, or race-specific rates). High rates in geographic areas with 
small populations--or in subsets of the population--may be due to chance, particularly
if the elevated rate is based on a small number of observed cases. Alternatively, if
increases are consistent over time--or across some population subgroups--it is more 
likely that they represent important differences rather than chance occurrences. 

Other events deserve attention, even if only a single case occurs; the occurrence of a 
sentinel health event represents a failure somewhere in the system of public health or 
of health-care delivery and warrants careful attention. Such sentinel events include 
maternal and infant deaths and a wide variety of infectious and noninfectious
conditions (65).

Intercensal population estimates for small areas are available from a variety of 
sources. Because age-, gender-, and race-specific estimates for small areas from
the U.S. Bureau of the Census are of limited availability, state governments
have often developed their own estimates (66). Methods for interpolating census data for
estimation of small area populations have been developed (67).

Methods have also been developed for defining hospital service areas in metropolitan 
areas (68). Although these methods have most commonly been used in studies of health- 
services utilization in different geographic areas, they are potentially of value in 
analyses of data generated by hospital-based surveillance at the state or local level. 
Small-area analyses in health-services research have recently been reviewed (69). The 
statistical issues raised by these studies are also relevant to analyses of
surveillance data (70).

Although more elaborate techniques have been described, most analyses of surveillance 
data are quite simple--frequencies, proportions, and rates--which may be conveniently
presented as tables, graphs, or maps. Indeed, the simplest analyses--the
number of births to teenagers by census tract, or crude death rates by county--may be
the most useful for documenting the need for services. Simple analyses should be done 
and their results thoughtfully considered before more complicated procedures are 
undertaken. By far the most common error made in analysis of surveillance data is 
failure to look at the data. 




Most of the issues relevant to the dissemination of surveillance data at the state and 
local level have been addressed in Chapter VII. The role of newsletters, annual 
reports, and press releases has already been addressed, as has the importance of clear 
presentation and use of graphics. Mapping is a powerful technique for presenting 
data. Electronic mail systems have been developed in some states to facilitate the 
dissemination of information between state and local health departments. 


No model system for surveillance at the state or local level exists. There is great 
variation in organizational structure of state and local health departments, and 
surveillance activities are usually closely linked to disease-control programs. 
Although this linkage helps assure that the data collected will indeed be used, it 
complicates efforts to document the resources, personnel and other, needed for 
surveillance; surveillance cannot be readily separated from other related activities. 

There are only a few published reports that address the cost of routine surveillance 
systems for communicable disease in state health departments. The cost of a newly 
established active surveillance system that surveyed half the primary-care physicians 
in Vermont was estimated to be $20,000 annually, compared with $3,000 for passive 
surveillance (50). A study of the sentinel active surveillance system in Los Angeles
County estimated that the additional cost of weekly contacts made with selected 
hospitals, physicians, schools, day-care centers, and university health centers was 
approximately $7,000 per year, compared with an estimated $10,000 per year for passive 
surveillance. The California costs reflected student instead of professional staff 
time and did not include time expended in recording reports at the health department 
(7). In 1985, the Kentucky Department for Health conducted active surveillance for 
hepatitis A infections among one-half of primary-care practitioners in 45 of 120 
counties in the state. The 22-week active surveillance program was estimated to cost 
$5,616. Although the system was cost-effective overall, because the administration of 
immune globulin to contacts averted an estimated $14,021 in direct medical and 
indirect costs of potential subsequent cases, the health department itself, of course,
incurred increased cost. The system was not continued after the study was completed
(71) . 
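
The Kentucky figures can be worked through directly. The sketch below uses only the numbers quoted above; the variable names are illustrative, and the benefit-cost ratio is a simple quotient, not necessarily the measure used in the original study.

```python
# Figures quoted above for the 22-week Kentucky hepatitis A program.
program_cost = 5_616    # estimated cost of active surveillance, in dollars
averted_cost = 14_021   # estimated direct medical and indirect costs averted
weeks = 22

benefit_cost_ratio = averted_cost / program_cost  # dollars averted per dollar spent
net_benefit = averted_cost - program_cost         # net saving overall
cost_per_week = program_cost / weeks              # outlay borne by the health department

print(f"benefit-cost ratio: {benefit_cost_ratio:.2f}")
print(f"net benefit: ${net_benefit:,}")
print(f"cost to the health department per week: ${cost_per_week:,.2f}")
```

The calculation makes the tension in the paragraph explicit: the net benefit accrues to patients and payers in general, while the weekly cost falls on the health department's own budget.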

Higher quality data on cost are available for some more recently developed surveil- 
lance systems at the state level. A survey of 24 state and metropolitan health 
departments that conducted surveillance for nutrition in 1981 found that an average of 
16.6 hours of work by a nutritionist was required each month for the surveillance 
system. Eight and one-half hours of clerical time were needed, along with support 
from statisticians, computer technicians, and others (72).

Data collection, coding, and entry for 2,000 persons with injuries seen at a single 
hospital participating in the National Electronic Injury Surveillance System cost
approximately $7,000 in 1989 (12). 

Costs of the BRFSS are shared by CDC and participating state health departments 
through cooperative agreements. In 1987, the cost per state was approximately 
$50,000, or approximately $25-$30 per completed telephone interview (24). 

Part of the Statewide Childhood Injury Prevention Project (SCIPP) in Massachusetts 
involved conducting a random-digit telephone survey. Information on injuries in the 
previous 2 months was obtained; because of the relative infrequency of these events, a 
large sample size was needed. Twelve hundred households were contacted at a cost of 
$25,000, yielding reports of only 80 injuries, most of which were falls (73). 
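
The effect of a low event yield on survey cost can be made concrete with the SCIPP figures. This is a rough sketch: the function names are hypothetical, and the projection assumes that the yield and the cost per household stay constant as the sample grows, which the source does not state.

```python
def survey_unit_costs(total_cost, households, events):
    """Cost per household contacted and per event identified."""
    return {
        "per_household": total_cost / households,
        "per_event": total_cost / events,
        "events_per_100_households": 100 * events / households,
    }


def households_needed(target_events, households, events):
    """Households to contact for a target number of events, at the observed yield.

    Integer ceiling division avoids floating-point rounding surprises.
    """
    return (target_events * households + events - 1) // events


# Figures quoted above: 1,200 households, $25,000, 80 injuries reported.
costs = survey_unit_costs(25_000, 1_200, 80)
print(costs)

# Hypothetical target of 400 injury reports (not a figure from the source).
needed = households_needed(400, 1_200, 80)
print(needed, "households, about $", round(needed * costs["per_household"]))
```

At the observed yield, each injury report cost far more than each household contact, which is the practical meaning of the statement that a large sample is needed for relatively infrequent events.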

More complete and accurate documentation of the costs of surveillance--including data 
analysis and dissemination--may facilitate funding, particularly in the current era of 
tight constraints on state budgets. Explicit discussion of costs and benefits may 
help, both in terms of protecting (if not increasing) funding levels and assuring that 
existing surveillance systems are necessary and make the best possible use of person- 
nel time. 


Public health surveillance--the systematic and ongoing collection of data pertinent to
public health, and the subsequent analysis and dissemination of these data--is the
first step toward action in public health, but it is only the first step. A number of
approaches to translation of data into action have been developed, with emphasis on
the local level. The Assessment Protocol for Excellence in Public Health (APEXPH),
developed in collaboration with the National Association of County Health Officials,
guides local health department officials through identification of health problems 
that require priority attention and through building of community coalitions for 
action (74). Such an approach provides a good foundation for adopting community 
health objectives (75). These methods have been very successful in communities that
have undertaken them, and they provide useful outlines for translating information 
into action at the community level. For example, in Tucson, Arizona, a community 
coalition targeted for action the high rate of infant mortality, with the result that 
a new program to provide prenatal care was established. 

Other examples, at the state level, are readily available. National studies that 
found that residents of Delaware died at high rates of preventable chronic disease 
resulted in a statewide cancer control plan, including a mobile mammography unit for 
inner-city neighborhoods. Widespread measles outbreaks occurred in New York State in 
1989 among high school and college students who had been previously vaccinated. 
Surveillance data led New York officials to reconsider the state's vaccination 
strategy, with the result that in April 1989 New York became the first state in the 
United States to adopt a two-dose schedule for routine measles vaccination (76). 
Similarly, surveillance data in Tennessee led to the adoption of a statewide vaccina- 
tion requirement for children who attend school in the state (Figure XII.1).

The competition for limited dollars and for the attention of policy makers and the
public is intense. The challenge is to identify problems, set priorities, and work
with communities to develop solutions. More than ever, it is important to use data to
decide among competing priorities and allocate limited resources--the most important
of which are the time and energy of the public health practitioner and the best
interests of the public.


1. Institute of Medicine. The future of public health. Washington, D.C.:
National Academy Press, 1988. 

2. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. Mandatory reporting of 
infectious diseases by clinicians. JAMA 1989;262:3018-26.

3. Freund E, Seligman JP, Chorba TL, Safford SK, Drachman JB, Hull HF. Mandatory 
reporting of occupational diseases by clinicians. JAMA 1989;262:3041-4. 

4. Feagin OT. Maine's sentinel physician system. Journal of the Maine Medical 
Association 1971;62:187,201.

5. Schaffner W, Scott HD, Rosenstein BJ, Byrne EB. Innovative communicable disease 
reporting: the Rhode Island experiment. HSMHA Health Rep 1971;86:431-6. 

6. Dodson DR, Bright MF. Sentinel active surveillance system. Missouri Epidemiol- 
ogist July 1989:1-2. 

7. Weiss BP, Strassburg MA, Fannin SL. Improving disease reporting in Los Angeles 
County: trial and results. Pub Health Rep 1988;103:415-21. 

8. Laboratory Centre for Disease Control. Canadian communicable disease surveillance
system: disease-specific case definitions and surveillance methods. Can
Dis Wkly Rep 1991;17(Suppl 3):1-35.

9. Gallagher SS, Guyer B, Motelchuck M, Bass J, Lovejoy FH, McLoughlin E, Mehta K. 
A strategy for the reduction of childhood injuries in Massachusetts: SCIPP. N 
Engl J Med 1982;307:1015-8. 

10. Runyan CW, Kotch JB, Margolis LH, Buescher PA. Childhood injuries in North 
Carolina: a statewide analysis of hospitalizations and deaths. Am J Public 
Health 1985;75:1429-32. 

11. Hopkins RS. Consumer product-related injuries in Athens, Ohio, 1980-85: 
assessment of emergency room-based surveillance. Am J Prev Med 1989;2:104-12. 

12. Grisso JA, Wishner AR, Schwarz DF, Weene BA, Holmes JH, Sutton RL. A popula- 
tion-based study of injuries in inner-city women. Am J Epidemiol 1991;134:59- 

13. King WD. Pediatric injury surveillance: use of a hospital discharge data base. 
South Med J 1991;84:342-8. 

14. Goebert DA, Ng MY, Varney JM, Sheetz DA. Traumatic spinal cord injury in 
Hawaii. Hawaii Medical Journal 1991;50(2):44,47-48,50.


15. Smith GS, Langlois JA, Buechner JS. Methodological issues in using hospital 
discharge data to determine the incidence of hospitalized injuries. Am J 
Epidemiol 1991;134:1146-58. 

16. Carr W, Szapiro N, Heisler T, Krasner MI. Sentinel health events as indicators 
of unmet needs. Soc Sci Med 1989;29:705-11. 

17. Watkins M, Lapham S, Hoy W. Use of a medical center's computerized health care 
database for notifiable disease surveillance. Am J Public Health 1991;81:637-9. 

18. Graitcer PL. The development of state and local injury surveillance systems. J 
Safety Res 1987;18:191-8. 

19. Feck G, Relethford JH. The addition of E-codes to the hospital discharge 
reporting system in New York. Abstracts of the 119th Annual Meeting of the 
American Public Health Association, November 10-14, 1991, Atlanta, Ga., 1991:1- 

20. Lloyd LE, Graitcer PL. The potential for using a trauma registry for injury 
surveillance and prevention. Am J Prev Med 1989;1:34-7. 

21. Patetta M, Cole T, Bowling JM, Watkins S. Evaluation of the representativeness 
of the North Carolina Trauma Registry. Abstracts of the 119th Annual Meeting of 
the American Public Health Association, November 10-14, 1991, Atlanta, Ga., 

22. Peterson D, Andrews JS, Levy BS, Mitchell B. An effective school-based influen- 
za surveillance system. Public Health Rep 1979;94:88-92. 

23. Finger R, Stapleton M, Pelletier A. Reportable diseases in Kentucky: a five- 
year surveillance summary 1986-1990. Kentucky Cabinet for Human Resources, 
Department for Health Services, Division of Epidemiology. 

24. Remington PL, Smith MY, Williamson DF, Anda RF, Gentry EM, Hogelin GC. Design, 
characteristics, and usefulness of state-based behavioral risk factor surveil- 
lance: 1981-87. Public Health Rep 1988;103:266-375. 

25. Anda RF, Waller MN, Wooten KG, et al. Behavioral risk factor surveillance, 
1988. In: CDC Surveillance Summaries, June 1990. MMWR 1990;39(no. SS-2):1-21. 

26. Kolbe LJ. An epidemiological surveillance system to monitor the prevalence of 
youth behavior. Health Education 1990;21:44-8. 

27. Aday LAA, Sellers C, Andersen RM. Potentials of local health surveys: a state- 
of-the-art summary. Am J Public Health 1981;71:835-40. 

28. Remington PL, Anderson DE, Manering MC, Peterson EA, Anderson H. The PRECEDES 
Project: background and materials. Wisconsin Medical Journal 1990;89:695-6. 


29. Fishbein DB. Rabies. Infect Dis Clin North Am 1991;5:53-71. 

30. Moore CG, McLean RG, Mitchell CJ, et al. Guidelines for arbovirus surveillance
programs in the United States. Atlanta, Ga.: Public Health Service (in press). 

31. Duffy J. The sanitarians: a history of American public health. Urbana, 
Illinois: University of Illinois Press, 1990. 

32. Starr P. The social transformation of American medicine. New York: Basic
Books, 1982:187. 

33. Brandt AM. No magic bullet: a social history of venereal disease in the United 
States since 1880. New York: Oxford University Press, 1987:42-3. 

34. Haward RA. Scale of undernotification of infectious diseases by general
practitioners. Lancet 1973;1:873-4. 

35. Thacker SB, Choi K, Brachman PS. The surveillance of infectious diseases. JAMA 

36. Wharton M, Fehrs L, Stroup N, Cochi SL. Health impact of varicella in the 
1980s. Abstracts of the 30th Interscience Conference on Antimicrobial Agents 
and Chemotherapy, October 21-24, 1990, Atlanta, Ga., 1990:276. 

37. Rosenberg ML, Gangarosa EJ, Pollard RA, et al. Shigella surveillance in the 
United States, 1975. J Infect Dis 1977;136:458-60. 

38. Marier R. The reporting of communicable diseases. Am J Epidemiol 1977;105:587-

39. Vogt RL, Clark SW, Kappel S. Evaluation of the state surveillance system using 
hospital discharge diagnoses, 1982-1983. Am J Epidemiol 1986;123:197-8. 

40. Campos-Outcalt D, England R, Porter B. Reporting of communicable diseases by 
university physicians. Pub Health Rep 1991;106:579-83. 

41. Sacks JJ. Utilization of case definitions and laboratory reporting in the 
surveillance of notifiable communicable diseases in the United States. Am J Pub 
Health 1985;75:1420-22. 

42. Shramm M, Vogt RL, Mamolen M. Disease surveillance in Vermont: who reports? 
Pub Health Rep 1991;106:95-7.

43. Harkess JR, Gildon BA, Archer PW, Istre GR. Is passive surveillance always 
insensitive? An evaluation of shigellosis surveillance in Oklahoma. Am J Pub 
Health 1988;128:878-81. 

44. Centers for Disease Control. Case definitions for public health surveillance. 
MMWR 1990;39(No. RR-13):1-43.

45. Konowitz PM, Petrossian GA, Rose DN. The underreporting of disease and
physician's knowledge of reporting requirements. Pub Health Rep 1984;99:31-5.

46. Do physicians report diseases? Louisiana Morbidity Report 1990;1(4):1-2.

47. Jones JL, Meyer P, Garrison C, et al. Physician and infection control practi- 
tioner HIV/AIDS reporting characteristics. Am J Pub Health 1992;82:889-91. 

48. Seixas NS, Rosenman KD. Voluntary reporting system for occupational disease: 
pilot project, evaluation. Public Health Rep 1986;101:278-82. 

49. Brachott D, Mosley JW. Viral hepatitis in Israel: the effect of canvassing 
physicians on notifications and the apparent epidemiological pattern. Bull Wld 
Health Org 1972;46:457-64. 

50. Vogt RL, LaRue D, Klaucke DN, Jillson DA. Comparison of an active and passive 
surveillance system of primary care providers for hepatitis, measles, rubella 
and salmonellosis in Vermont. Am J Pub Health 1983;73:795-7. 

51. Thacker SB, Redmond S, Rothenberg RB, et al. A controlled trial of disease 
surveillance strategies. Am J Prev Med 1986;2:345-50. 

52. Scott HD, Thacher-Renshaw A, Rosenbaum SE, et al. Physician reporting of
adverse drug reactions: results of the Rhode Island Adverse Drug Reaction 
Reporting Project. JAMA 1990;263:1785-8. 

53. Rothenberg R, Bross DC, Vernon TM. Reporting of gonorrhea by private physi- 
cians: a behavioral study. Am J Public Health 1980;70:983-6. 

54. Spencer L, Wren GR. New reporting system aids epidemiologists. Hospitals 

55. Fife D, McAnaney, Rahman MA. Changes in AIDS case reporting after hospital site 
visits. Am J Public Health 1991;81:1648-50. 

56. Tizes R, Pravda D. Proposed toll-free telephone reporting of notifiable 
diseases. Health Serv Rep 1972;87:633-7. 

57. Disease reporting--a health professional's responsibility. Public Health Letter 
(Los Angeles County Department of Health Services) 2[10]. September 1980.

58. Isaacman SH. Significance of disease reporting requirements. Infectious 
Disease News 1990;3(10):23.

59. Benenson AS (ed). Control of communicable diseases in man. Fifteenth Edition.
Washington, D.C.: American Public Health Association 1990:xxvi. 

60. Laboratory Centre for Disease Control. Establishing goals, techniques, and 
priorities for national communicable disease surveillance. Can Dis Wkly Rep 

61. Kleinman JC, Kiely JL. Infant mortality. NCHS Statistical Notes 1991;1:7-10. 


62. Kleinman JC. Age-adjusted mortality indexes for small areas: applications to 
health planning. Am J Public Health 1977;67:834-40. 

63. Manton KG, Woodbury MA, Stallard E, et al. Empirical Bayes procedures for 
stabilizing maps of U.S. cancer mortality rates. J Am Stat Assn 1989;84:637-50. 

64. Lui KJ, Martinez B, Mercy J. An application of the empirical Bayes approach to 
directly adjusted rates: a note on suicide mapping in California. Suicide and 
Life-Threatening Behavior 1990;20:240-53. 

65. Rutstein DD, Berenberg W, Chalmers TC, et al. Measuring the quality of medical 
care: a clinical method. N Engl J Med 1976;294:582-8. 

66. Balachandran M, Balachandran S (eds.). State and local statistics sources.
Detroit, Michigan: Gale Research Inc., 1990. 

67. Aickin M, Dunn CN, Flood TJ. Estimation of population denominators for public 
health studies at the tract, gender, and age-specific level. Am J Public Health 

68. Thomas JW, Griffith JR, Durance P. Defining hospital clusters and associated 
service communities in metropolitan areas. Socio-Economic Plan Sci 1981;15:45-

69. Paul-Shaheen P, Clark JD, Williams D. Small area analysis: a review and 
analysis of the North American literature. J Health Politics Policy Law 

70. Diehr P. Small area statistics: large statistical problems. Am J Public Health 

71. Hinds MW, Skaggs JW, Gershon KB. Benefit-cost analysis of active surveillance 
of primary care physicians for hepatitis A. Am J Public Health 1985;75:176-7. 

72. Scheer JC, Sims LS. Status of nutritional surveillance activities in 24 state 
and metropolitan health departments. Public Health Rep 1983;98:349-55. 

73. Guyer B. The application of morbidity data in the Massachusetts Statewide 
Childhood Injury Prevention Program. Can J Public Health 1989;80:432-4. 

74. APEXPH: Assessment protocol for excellence in public health. Washington, D.C.: 
National Association of County Health Officials, 1991. 

75. Healthy communities 2000: model standards. Washington, D.C.: American Public 
Health Association, 1991. 

76. Birkhead GS, Morse DL, Mills IJ, Novick LF. New York State's two-dose schedule 
for measles immunization. Public Health Rep 1991;106:338-44. 


Chapter XIII 

Important Surveillance Issues in 
Developing Countries 

Mac Otten 

"The health of the people is really the foundation upon which all their happiness and all 
their powers as a state depend."

Benjamin Disraeli 


Previous chapters in this book have discussed surveillance largely from the 
perspective of developed countries. Although the issues they address are relevant to 
all nations, developing countries have unique needs and opportunities. The health 
conditions typically associated with the developing world--diarrhea, malaria,
pneumonia, and malnutrition--occur in settings with only rudimentary health care.
This chapter highlights a number of surveillance issues relevant to developing 
countries, including resource constraints. 

Although conducting surveillance in developing countries is complex, it also presents 
unique opportunities. Because the formal health-care system is often an integral part 
of organized government services, there are fewer impediments to implementing 
surveillance systems. The limited number of health-care providers and diagnostic 
laboratories reduces the number of data sources, which can facilitate quality 
assurance. Moreover, acute diseases and injuries still represent major causes of 
morbidity and mortality in many of these countries; these are conditions for which 
surveillance techniques are well-developed. Finally, communities often have well- 
defined health systems that can be used for surveillance purposes. These 
opportunities should be taken when feasible--despite such obstacles as rudimentary
record-keeping systems and limited resources, numbers of diagnostic laboratories, 
demographic and vital information, and infrastructure. 

Four issues relating to surveillance are covered in this chapter: a) planning, b) data
sources (e.g., vital statistics, surveys, and sentinel surveillance), c) surveillance
at the local level, and d) development of integrated surveillance systems. In this
chapter, the term "local" refers to the health station (which we assume to be the
lowest level of the formal health system), where health assistants work. In addition,
"population-based" is used to describe information for all persons in a certain 
geographic unit as opposed to facility-based information, which may represent only 
persons from the catchment area of a given health facility. 


Identifying Health Objectives and Linkage to Surveillance 

Identifying measurable health objectives, assigning them priority, and then
linking surveillance to those objectives is a high-priority activity both for the
surveillance system and for health-system development in general (1-3). Linking
surveillance to these ordered health objectives avoids the pitfall of thinking
of surveillance as just the reporting of disease rather than as a system that uses
information from multiple sources (such as sentinel sites, exit interviews, and
regular surveys). It also helps planners think creatively about how to build a
surveillance system that measures all priority health objectives. Table XIII.1
lists data sources that could be used in building a surveillance system in a
developing country.

Throughout the world, health objectives should be based on health impact,
feasibility of intervention, and cost-effectiveness of the intervention. In
developing countries, measurable health objectives often cannot be identified
because high-quality, population-based mortality data are often missing. As a
result, estimates of mortality and health outcome from such international
organizations as the United Nations International Children's Emergency Fund (UNICEF)
and the World Health Organization (WHO), international conferences, and population
laboratories (e.g., International Center for Diarrheal Disease Research,
Bangladesh) are used. Although health problems are similar in most developing
countries (Table XIII.2), relying on data from other countries can create major
problems, especially for conditions for which impact is not clearly known (e.g.,
hepatitis B, iodine deficiency, or malaria) or for emerging health problems (e.g.,
human immunodeficiency virus [HIV] infection, tobacco use, and motor-vehicle
injuries).

The need for country-specific data is illustrated by the finding of World Bank 
analysts that oral-rehydration therapy (ORT) in low-mortality environments is much 
less cost-effective than passive case detection and short-course chemotherapy for 
tuberculosis, whereas ORT in high-mortality environments is very cost-effective 
(1). The cost-effectiveness varies by a factor of 2 to 10, depending on the local 

Health objectives should focus both on current health status and on anticipated 
health needs. It may be more cost-effective to address preventive strategies 
(e.g., early bottle feeding, cessation of tobacco use, use of seat belts, and
sanitation) now rather than when the impact of adverse events becomes more
apparent.

For each health objective, the surveillance method for evaluating that objective
and its sub-objectives should be listed (Table XIII.3). Once such a list is made,
a surveillance grid can be constructed to show which component of the surveillance
system will measure which objective (Table XIII.4). Completing a surveillance
grid helps one visualize the overall structure and function of the surveillance
system.

The process of defining objectives, linking objectives to surveillance components, 
and constructing surveillance grids will highlight surveillance needs. The 
process provides a basis for strengthening existing components, for identifying 
existing information that could measure objectives, and for developing innovative 
new surveillance system components. For example, in many countries, the process 
of linking surveillance to objectives highlights the need for mortality data and 
the absence of vital statistics. 
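
The grid-building process described in the preceding paragraphs can be sketched as a small data structure. The following is only an illustration, not a reproduction of Table XIII.4: the objective names, the component names, and the helper functions uncovered_objectives and objectives_by_component are hypothetical and were chosen to show how a grid exposes objectives that no surveillance component measures.

```python
# Hypothetical surveillance grid: each objective maps to the set of
# surveillance components expected to measure it. All entries are
# illustrative assumptions, not values taken from Table XIII.4.
SURVEILLANCE_GRID = {
    "reduce measles mortality": {"sentinel vital-event registration"},
    "reduce diarrhea mortality": {"sentinel vital-event registration"},
    "increase use of ORT": {"national survey", "exit interviews"},
    "reduce HIV seroprevalence": {"sentinel laboratory testing"},
    "reduce tobacco use": set(),  # no component assigned yet
}


def uncovered_objectives(grid):
    """Return objectives that no surveillance component measures."""
    return sorted(obj for obj, comps in grid.items() if not comps)


def objectives_by_component(grid):
    """Invert the grid: component -> sorted list of objectives it measures."""
    inverted = {}
    for objective, components in grid.items():
        for component in components:
            inverted.setdefault(component, []).append(objective)
    return {comp: sorted(objs) for comp, objs in inverted.items()}


if __name__ == "__main__":
    print("Objectives lacking a component:", uncovered_objectives(SURVEILLANCE_GRID))
    for comp, objs in sorted(objectives_by_component(SURVEILLANCE_GRID).items()):
        print(f"{comp}: {', '.join(objs)}")
```

Listing the grid by component as well as by objective makes it easier to see where one data source (for example, a sentinel registration area) already serves several objectives and where a new component may be needed.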


Often, the most important objectives--the reductions in mortality associated with
diarrhea and measles--are measured in sentinel areas, since in many countries
vital events are not registered for the entire country (Table XIII.4). Risk
factors, health-related behavior, and health interventions--such as ORT and use of
fluids at home, feeding during diarrhea, use of contraception, use of condoms, use
of chloroquine, and missed opportunities for vaccinations--can be measured nationally
with regularly scheduled surveys. Risk factors and interventions can also be
identified through exit interviews at the district, health-center, health-station,
or village level.

Using a surveillance grid developed for a hypothetical country, one sees that
surveillance for HIV is not as straightforward as for measles and diarrhea (Table
XIII.4). The primary health-status outcome chosen by this country's ministry of
health was not HIV-related mortality or acquired immunodeficiency syndrome (AIDS),
but HIV seroprevalence in selected areas and selected populations. Therefore,
sentinel vital-event registration areas will not be used to measure the HIV-related
objectives. In addition, the objectives for HIV-related risk factors and
health interventions are targeted at certain areas (areas in which HIV
seroprevalence of patients with sexually transmitted diseases [STDs] is >10%).
Since national surveys provide estimates only for the country as a whole, national
surveys will not be the primary method for measuring progress of objectives
related to HIV risk factors, behavior, and health interventions at a state or
local level.

Examining the surveillance system as a whole is important for assigning resources. 
For diseases such as measles, diarrhea, pneumonia, and pertussis, surveillance 
traditionally includes measurement of mortality in vital registration and 
measurement of risk factors and health interventions nationally with surveys and 
locally with exit interviews (4). However, conditions such as HIV, malaria, 
malnutrition, tuberculosis (TB), vitamin A deficiency, and hepatitis B can be
difficult to measure. 

Use of a surveillance grid facilitates the integration of some aspects of
surveillance and may increase cost-efficiency. For example, a laboratory team may
go to 12 sentinel sites in a year and test blood for HIV from pregnant women and 
patients with sexually transmitted diseases (STDs), blood for syphilis serology 
from 20- to 24-year-old pregnant women, sputum from 50 patients with cough for at 
least 1 month, and blood smears from 50 children with fever. Efficiency can be 
gained by constructing surveys--cluster surveys or exit interviews--that integrate 
questions about priority topics such as diarrhea, measles, HIV, tobacco use, and 
birth spacing. 

Surveillance of Measures of "Outcome" Versus "Process" 

Currently, at national and global levels, much emphasis is being placed on 
measurement of processes (e.g., coverage with vaccinations) versus the measurement 
of health outcomes (e.g., cases of measles) as the primary focus (5). Emphasis is
placed on process measures, in part, because systems for efficient measurement of 
population-based health outcomes do not exist. 

There are two major problems with process measures. First, process measures do
not directly measure primary events of interest--death and disease--or the
effectiveness of the processes (interventions). In contrast, the health outcome
is the measure of interest, and what is measured is the effectiveness (i.e., the
combined effect of the coverage and the efficacy of the intervention).
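
As a rough numerical sketch of this distinction, the code below treats effectiveness as the simple product of coverage and efficacy. That multiplicative form is a simplifying assumption made here for illustration; the function name population_effectiveness and the example values are hypothetical.

```python
def population_effectiveness(coverage, efficacy):
    """Approximate fraction of the target population protected.

    Assumes (for illustration only) that effectiveness is simply the
    product of coverage and efficacy, both given as fractions in [0, 1].
    """
    if not (0.0 <= coverage <= 1.0 and 0.0 <= efficacy <= 1.0):
        raise ValueError("coverage and efficacy must be between 0 and 1")
    return coverage * efficacy


if __name__ == "__main__":
    # Same reported coverage (a process measure), different field efficacy.
    for efficacy in (0.95, 0.70):
        protected = population_effectiveness(0.80, efficacy)
        print(f"coverage 80%, efficacy {efficacy:.0%}: {protected:.0%} protected")
```

Two programs reporting identical coverage can therefore protect quite different shares of the population, which is why a process measure alone does not reveal the health outcome.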

The usefulness of a process measure for surveillance depends on the true and 
consistent effectiveness of the intervention being measured. Focusing on the 
measurement of processes is most suitable when the intervention is documented to 
have consistent, high effectiveness. For example, measles vaccine administered to 
a 9-month-old infant is thought to be 90% effective in preventing subsequent 
measles (6). Therefore, if a child receives measles vaccine before being exposed
to measles virus, the probability that s/he will have clinical measles is very
low.

The difficulty with process measurements, however, exists even with an
intervention as highly effective as measles vaccine (e.g., children infected with
measles virus before vaccination are not protected by vaccine). The effectiveness
of most interventions is often less than that of measles vaccine, and the
effectiveness of the delivery of such interventions varies substantially from
setting to setting. For example, on the basis of the industrialized-country
experience, three doses of oral poliovirus vaccine (OPV) were thought to have an
effectiveness of at least 95% in all settings (7,8). Yet, recent evaluations of
field vaccine efficacy, reviews of serologic efficacy, and outbreaks in countries
with high coverage with OPV have shown that the effectiveness of OPV in developing
countries is not as high as in industrialized countries, and that process measures
of OPV coverage can lead to a false sense of security (9-12).

In programs in which an intervention has high and consistent effectiveness, the 
magnitude of the problem of using process measures also depends on the stage of 
development of a program. If an intervention is reliably 70%-90% effective, as 
are measles vaccine and OPV, one can be relatively confident that health outcomes 
will be positively affected if coverage increases from 20% to 80%. However, one 
cannot be at all confident of any change in health outcome if coverage increases 
from 80% to 90% or 95%. In fact, statistically significant changes in coverage 
from 80% to 90% or 90% to 95% cannot be detected by current methods of 
measurement.
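
The difficulty of detecting small changes at high coverage can be illustrated with approximate confidence intervals for a cluster survey. The sketch below assumes a survey of 210 children (30 clusters of 7), a design effect of 2, and a normal approximation; these parameters and the function names are assumptions chosen for illustration, and comparing intervals for overlap is only a crude stand-in for a formal significance test.

```python
import math


def coverage_ci(p, n=210, design_effect=2.0, z=1.96):
    """Approximate 95% confidence interval for a surveyed coverage p.

    n=210 (30 clusters of 7) and design_effect=2.0 are assumed values
    for illustration; the normal approximation is a simplification.
    """
    se = math.sqrt(design_effect * p * (1.0 - p) / n)
    return max(0.0, p - z * se), min(1.0, p + z * se)


def intervals_overlap(a, b):
    """True if two (low, high) intervals share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    for before, after in ((0.20, 0.80), (0.80, 0.90), (0.90, 0.95)):
        ci_before, ci_after = coverage_ci(before), coverage_ci(after)
        print(f"{before:.0%} -> {after:.0%}: "
              f"intervals overlap = {intervals_overlap(ci_before, ci_after)}")
```

Under these assumptions the intervals for 20% and 80% are clearly separated, whereas those for 80% and 90%, and for 90% and 95%, overlap.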

A second major problem with process measures is measurement accuracy. 
Intervention activities are often measured by administrative methods and 
population-based surveys. An example of the administrative method of estimating 
the percentage coverage of an intervention is counting the number of vaccinations 
administered and then dividing by some denominator, such as the population in the 
catchment area <1 year of age. 
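
A minimal sketch of the administrative method follows. The function name administrative_coverage and the example counts are hypothetical; the function returns None when either count is unavailable, because the estimate cannot be computed without both.

```python
def administrative_coverage(doses_given, target_population):
    """Coverage estimate = doses administered / target population.

    Returns None when either count is missing or the denominator is not
    positive, since the estimate cannot be computed in that case.
    """
    if doses_given is None or not target_population or target_population <= 0:
        return None
    return doses_given / target_population


if __name__ == "__main__":
    # Hypothetical catchment area: 450 doses given, 600 children <1 year of age.
    print(administrative_coverage(450, 600))
    # Denominator not known locally: no estimate is possible.
    print(administrative_coverage(450, None))
```

Note that the ratio can exceed 100% when the denominator underestimates the true target population, which is one reason administrative estimates need to be interpreted with care.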

The administrative method is relatively easy and cheap to perform and is available 
locally. On the other hand, both the numerator and the denominator are often 
unavailable. For example, to estimate the percentage of persons who have received
a complete series of OPV, one must know the number of third doses of OPV
administered; this number is often not recorded. 

To overcome the limitations of administrative data, population-based surveys are 
used to provide process measures (e.g., the percentage of persons who received ORT 
during the most recent episode of diarrhea and the percentage of reproductive-age
women who use modern methods of family planning), especially at the national
level. Yet, there are increased costs associated with surveys and numerous
potential inaccuracies from current survey tools (see section on surveys below).

Using Outcome To Measure Process 

In any international setting, surveillance for both outcomes and processes is 
desirable, but the focus of surveillance should be on outcome measures. Outcome-based
programs have been extremely successful in advancing global efforts to eradicate
smallpox, guinea worm, and poliomyelitis. The smallpox program, which started out
as a process-based (coverage-driven) program, switched to an outcome-based
program, which led to improved program effectiveness (13). An outcome-based
program in the Americas has decreased the number of cases of poliomyelitis from
nearly 3,000 in 1980 to a handful by 1990 (14). See Appendix XIII.A for a more
detailed discussion.


Population-based surveillance is especially important in many developing countries 
because of the disparities of access to health facilities and health status in 
urban centers versus rural areas. A single hospital in the capital city often 
consumes 25%-50% of the health budget for an entire country. Since surveillance 
from sentinel sites and health facilities is often concentrated in urban areas, 
public health needs in rural areas may not be well-represented by policy makers at
the national level unless population-based surveillance systems are used.

Vital-Event Registration

The measurement of vital events is the most important single addition that
developing countries can make to their existing surveillance system (see Chapter
III). Death and birth rates--along with cause-specific, age-specific, and
gender-specific rates--are very useful. In the United States, for example, 13 of the 18
status indicators chosen to measure the health status of the population as part of
the health objectives for the nation will be measured using vital records (15).

Why so little emphasis has been placed by developing countries on establishing 
vital-event registration is not clear. Registration could begin in small sentinel 
areas, could be evaluated for problems, and then could be expanded. The
vital-registration system in the United States started in 1900 in 10 sentinel states,
and it took 23 years for all states to be admitted into the system (16).
Obviously, in the early stages of setting up a registry, some births and deaths 
would be missed. As late as 1974-1977, 21% of neonatal deaths were not registered 
in Georgia (17); despite this underregistration, vital data have been extremely 

In areas in which routine mortality data are not available, the verbal autopsy, in
which trained or untrained workers take histories from family members to classify
deaths by cause, is a useful technique (18). In 1978, WHO published a monograph
called Lay Reporting of Health Information (19). It contained a detailed list of
approximately 150 causes of death and a minimal list of 30 causes that could be 
used by non-physicians to classify deaths by cause. 

In establishing vital-event systems, consideration should be given to including 
the registration of pregnancy. This is especially needed to measure the number of 
neonatal deaths, which in turn is needed to allow accurate infant-mortality rates
to be calculated. Registration of pregnancies would allow measurement of prenatal
care, fetal death associated with syphilis, family planning, and other important 
health concerns. 

Regular, Periodic Surveys 

Regular, periodic surveys can be an important component of a surveillance system. 
In particular, cluster surveys--multi-stage surveys with primary sampling units--are
important surveillance tools in many developing countries because they are the
only feasible method of collecting population-based information (20).

Cluster surveys have not been thought of as an essential and regularly performed 
surveillance activity. Surveys have generally been single-purpose and have been 
conducted intermittently on an as-needed basis, often at the request of 
international organizations. However, because the survey is the only method of 
gathering population-based information in many countries and surveys can be used 
to collect information on a variety of health topics, regularly scheduled surveys 
can constitute an excellent surveillance tool (see Behavioral Risk Factor 
Surveillance in Chapter III). 

To assure the development of a useful national surveillance system in a developing 
country, a survey unit or survey person should be assigned the task of 
coordinating all national health surveys. The coordinator first works with 
program staff to develop surveillance questions in high-priority areas (e.g., 
diarrhea, vaccinations, HIV/AIDS, family planning, child survival, malaria, and 
tuberculosis). Two to five questions are often adequate for some conditions. The 
questions should be assigned priority so that the survey coordinator has some 
flexibility to shorten the overall questionnaire if needed. 

Previously conducted surveys can serve as models for adaptation to local 
situations. For example, for vaccination-related questions, the Expanded 
Programme on Immunization (EPI) at WHO has a useful module. WHO also has useful
questionnaires for diarrhea; acute respiratory-tract infections; and knowledge,
attitude, and behavior associated with HIV infection. The Centers for Disease 
Control (CDC) has questionnaires on child mortality, health-station practices, 
nutrition, HIV risk behavior among youths, and others. 

Once questionnaire modules have been developed, each module should be field tested 
for readiness for implementation. Advance preparation and testing are very 
important; it is both difficult and time-consuming to develop an effective 

A small set (10 or so) of core questions measuring the highest-priority objectives 
should be included in every survey. Some space should be reserved for last-minute 
questions on information desired by high-level policy makers. Not only will this 
demonstrate the timeliness of this surveillance component, but it might facilitate 
political and financial support for its continuation. Finally, when the time 
comes for a survey, the survey coordinator puts together the core questions, the 
last-minute questions from the policy makers, and the appropriate survey modules. 
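
The assembly step described above can be sketched as follows. This is one possible arrangement under stated assumptions, not a prescribed procedure: the function name assemble_questionnaire, the numeric priority scheme (1 = highest), and the rule that only module questions are dropped when the questionnaire must be shortened are all hypothetical.

```python
def assemble_questionnaire(core, last_minute, modules, max_questions):
    """Build a survey: core questions, then last-minute questions, then
    module questions in priority order (1 = highest) until the limit.

    `modules` is a list of (priority, question) tuples. Core and
    last-minute questions are always kept; only module questions are
    dropped to shorten the questionnaire.
    """
    questionnaire = list(core) + list(last_minute)
    room = max(0, max_questions - len(questionnaire))
    ranked = sorted(modules, key=lambda item: item[0])  # stable sort
    questionnaire.extend(question for _, question in ranked[:room])
    return questionnaire


if __name__ == "__main__":
    # Hypothetical question labels, for illustration only.
    core = ["core: child vaccinated against measles?"]
    last_minute = ["policy: distance to nearest health station?"]
    modules = [
        (2, "malaria: chloroquine given for last fever?"),
        (1, "diarrhea: ORT used in last episode?"),
        (3, "tobacco: any household member smokes?"),
    ]
    for question in assemble_questionnaire(core, last_minute, modules, 4):
        print(question)
```

Assigning priorities in advance, as the text recommends, is what allows the coordinator to shorten the questionnaire mechanically rather than renegotiating its content before each survey.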

Data collection desired by international organizations can be integrated into the 
ministry of health's schedule of surveys. The survey coordinator can provide the 
international organization that wishes to have a survey conducted with the 
schedule and proposed modules to be used. The two groups can then collaborate to 
determine how the needs of both groups could be met. The international group can 
help train survey-unit staff and can help maintain a training manual on designing 
and conducting a survey, including interviewing techniques. This method is a 
cost-effective way to build local capacity and facilitate sustainability. See
Appendix XIII.B for a discussion of some statistical issues in cluster surveys.

Sentinel Surveillance

Sentinel surveillance at health facilities can play a critical role in
surveillance in developing countries. Sentinel sites are used to a) collect
important information not collected at all sites and b) pilot collection of new 
information in order to be able to assess the usefulness of the data and the 
method of collection. Since routinely reported information from all sites must be 
restricted to high-priority items and must be easy to collect, much important 
information is unlikely to be collected from all health facilities. 

At sentinel sites, more resources and more experienced and dedicated personnel can 
often be used to collect information on more diseases, more detailed information 
about each case, and more difficult-to-collect information such as sexual
behavior. Also, sentinel sites can often serve as sources of information about 
new conditions and can be used to determine the most effective methods for 
inserting newly required data into the routine collection system. 

There are several potential problems in interpreting data from sentinel sites. 
Sentinel sites are often hospitals or other sophisticated facilities and tend to 
serve urban patients. Such data will not reflect rural, small, non-urban health 
stations where the majority of the population may live. Consequently, rural and 
small health stations should be in the sentinel-site system. 

Nevertheless, for several reasons, hospitals as sentinel sites and hospitals in
urban areas can yield important information in a timely manner at a relatively low
cost. First, cause-of-death data are available, permitting timely data collection
and analysis. Second, because the number of visits and deaths is large, they
yield more precise estimates and allow subgroup analysis by age, gender, or
other important variables. Also, data are currently available, whereas systems of
vital events and regular, periodic surveys are not generally established. For
example, in Kinshasa, Zaire, the Ministry of Health used a hospital-based sentinel
surveillance system to establish that measles remained an important cause of death
for children <9 months old. The spread of clinically important resistance to
chloroquine was detected because of increasing mortality from malaria in sentinel
hospitals in numerous African countries (21).

Surveillance at the Local Level 

Integrated, well-thought-out surveillance at the health-station and health-center
level warrants more focused attention, especially data collection, analysis, and
dissemination of results as a basis for public health action. Surveillance
responsibilities should be specified in employee work plans, and completion of
surveillance duties should be used to assess health-worker performance.

WHO has surveillance and evaluation training modules for vertical programs such as
EPI and Control of Diarrheal Diseases (CDD) (20,22,23), but there are no general
surveillance training modules for district or health-station levels. Local
surveillance is critical because major health problems in developing countries
require innovative public health action at the local level. Local surveillance
and public health action based on surveillance may be less urgent for programs
with high effectiveness and ease of administration (e.g., vaccinations) or for
programs that depend solely on the formal health-care system (e.g., acute
respiratory infections or tuberculosis). However, local surveillance and linked
public health action will be essential for most of the priority diseases (e.g.,
diarrhea, malaria, and HIV) and related prevention activities (oral-rehydration
solutions, chloroquine for all cases of fever, and condoms). In general, these
interventions require extensive behavior change on the part of clients and also 
require local problem-solving, surveillance of objectives, strategy reformulation, 
and creative intervention by health workers to be successful. 

Collection, Display, and Analysis of Local Surveillance Data 

Analysis of surveillance data and action based on that surveillance information at 
the local level have several benefits. If collected data are prominently 
displayed as tables and graphs in the local health office, public health personnel 
(and patients) can see the results of data-collection efforts. Through the 
analysis and interpretation of the displayed surveillance data, local staff can be
involved in the process of devising strategies to solve health problems and, at the
same time, can help attain national and local health objectives. Such involvement
gives health staff a sense of participation and professionalism. 

The process of designing a surveillance system for a district or a health-station 
is the same as for the national level. First, health priorities are determined on 
the basis of the impact of the health problem and the feasibility and cost- 
effectiveness of intervention. Second, objectives are determined and assigned 
priority. Third, surveillance components to measure high-priority objectives are 
identified and implemented.

Four differences between national and local surveillance sometimes emerge. First, 
many health stations will not have mortality surveillance based on vital-event 
registration, whereas national surveillance systems may include at least a 
sentinel-registration component. However, health stations can begin sentinel 
population-based mortality surveillance by starting vital-event registration in 
one or two villages. 

Second, 30-cluster surveys conducted regularly every 1-3 years are not feasible 
for district and health-station surveillance of risk factors and health 
interventions.

Third, resource constraints at the local level limit the number of sentinel sites. 
However, both health stations and districts can conduct a form of sentinel 
surveillance by limiting data collection on some health problems to a small sample 
of sites at infrequent intervals. For example, although children have their 
growth monitored throughout the year, the percentage with weight-for-age of <80%
of standard might be calculated only once every 3 months on a consecutive sample 
of 30 children. 
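
The quarterly calculation in this example can be sketched as below. The function name percent_below_threshold and the sample weights are hypothetical, and the standard (reference) weight for each child's age is assumed to come from whatever growth reference the program already uses.

```python
def percent_below_threshold(children, threshold=0.80):
    """Percentage of children whose weight is below `threshold` times
    the reference (standard) weight for their age.

    `children` is a list of (weight_kg, standard_weight_kg) pairs; the
    standard weights must come from whatever growth reference the
    program uses (not specified here).
    """
    if not children:
        raise ValueError("sample is empty")
    below = sum(1 for weight, standard in children if weight < threshold * standard)
    return 100.0 * below / len(children)


if __name__ == "__main__":
    # Hypothetical consecutive sample of 30 children seen at a health station.
    sample = [(7.0, 10.0)] * 6 + [(9.5, 10.0)] * 24
    print(f"{percent_below_threshold(sample):.1f}% below 80% of standard")
```

Because the sample is small, the resulting percentage is best read as a rough local indicator to be tracked over successive quarters rather than as a precise estimate.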

Fourth, limited resources require integration of surveillance and non-surveillance 
health information by local health workers. 


Data collected routinely by health stations should be limited to high-priority
conditions. For example, mandatory reporting could be limited to 10 selected 
diseases on the basis of established priorities or reporting laws. In addition, 
the health station should meet certain standards before reporting requirements are 
expanded: the health station staff should be a) reporting regularly, b) displaying 
information collected, c) thinking about the meaning of the data, d) using the 
data to solve health problems, and e) using the data to evaluate programs targeted 
at certain health problems. If these are all being done, the staff is likely to 
become enthusiastic about the public health aspect of the station's job and 
initiate the idea of collecting more information. For example, information for 
each case-patient (e.g., age and date of onset of disease) can be collected for 
selected health problems instead of just reporting the number of cases of disease 
(i.e., summary-count data). Additional diseases can be added on the basis of 
priority setting (e.g., AIDS or moderate and severe malnutrition). The practice 
of collecting data intermittently for special purposes can be expanded, and data 
items found to be useful at sentinel sites can be added to reportable conditions 
from all health stations or at least can be expanded to a larger number of 
sentinel sites. 

Display and interpretation of surveillance data and planned action based on the 
interpretation can be integrated into assigned duties of health workers and into 
the duties of their supervisors. Each health worker should have a detailed task 
analysis or job description, with the task analysis linked to national and local 
health objectives. 

Employee and project work plans, based on supervisory visits and on input from 
members of the community, should also reflect health objectives and ongoing 
analysis and interpretation of surveillance data. For example, if one of the 
high-priority health objectives is the reduction of measles cases by 50% as of 
1995 (compared with the 1989-1991 baseline) and the graphs of measles cases by 
year and measles cases by month in 1993 show no decline, the work plan for the 
next 6 months might include conducting exit interviews, collecting additional 
information on cases, and convening focus groups. 
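
The check implied by this example can be sketched as follows. The case counts are invented for illustration, and the function names target_from_baseline and shows_decline are hypothetical; the baseline is taken here as the mean of the annual counts for the baseline years, which is one reasonable reading of "compared with the 1989-1991 baseline."

```python
def target_from_baseline(baseline_counts, reduction=0.50):
    """Target annual case count: mean of baseline years reduced by `reduction`."""
    baseline_mean = sum(baseline_counts) / len(baseline_counts)
    return baseline_mean * (1.0 - reduction)


def shows_decline(baseline_counts, current_count):
    """True if the current year's count is below the baseline mean."""
    return current_count < sum(baseline_counts) / len(baseline_counts)


if __name__ == "__main__":
    # Hypothetical measles case counts for one health station.
    baseline_1989_1991 = [120, 100, 110]
    cases_1993 = 112
    print("Target for 1995:", target_from_baseline(baseline_1989_1991))
    if not shows_decline(baseline_1989_1991, cases_1993):
        print("No decline: add exit interviews, case details, and focus groups "
              "to the next 6-month work plan")
```

In practice the same comparison could be read directly from the displayed graphs; the code only makes the decision rule explicit.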


Through focus groups, health workers can determine from groups of mothers why 
children are not being vaccinated and what might be done to solve this problem. 
Exit interviews can be used to determine measles coverage. Additional information 
about the ages of persons with measles can be recorded for the next 6 months, and 
then the health worker and supervisor can determine whether measles is a disease 
primarily among infants or among older persons as well. Using the vaccination 
status of persons with measles, health workers can estimate measles coverage. The 
effectiveness of a work plan should then be evaluated both through continued 
surveillance of measles cases and through exit interviews. 
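
One way to estimate coverage from the vaccination status of cases is to rearrange the relation used in the so-called screening method, which links vaccine efficacy (VE), the proportion of cases vaccinated (PCV), and the proportion of the population vaccinated (PPV). The text does not specify a method, so the sketch below is an assumption: it requires an externally supplied value for vaccine efficacy and assumes that reported cases are representative of all cases. The function name coverage_from_cases is hypothetical.

```python
def coverage_from_cases(prop_cases_vaccinated, vaccine_efficacy):
    """Estimate population coverage from the share of cases vaccinated.

    Rearranges the screening-method relation
        VE = 1 - [PCV / (1 - PCV)] * [(1 - PPV) / PPV]
    to solve for PPV (proportion of the population vaccinated).
    """
    pcv, ve = prop_cases_vaccinated, vaccine_efficacy
    if not (0.0 <= pcv <= 1.0 and 0.0 <= ve < 1.0):
        raise ValueError("need 0 <= PCV <= 1 and 0 <= VE < 1")
    return pcv / (pcv + (1.0 - ve) * (1.0 - pcv))


if __name__ == "__main__":
    # Hypothetical: 30% of recorded measles cases were vaccinated,
    # and vaccine efficacy is assumed to be 90%.
    print(f"Estimated coverage: {coverage_from_cases(0.30, 0.90):.0%}")
```

The estimate is sensitive to the assumed efficacy and to errors in recorded vaccination status, so it is better suited to flagging large gaps in coverage than to precise measurement.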

In addition, the 6-month work plan could include teaching mothers about 
appropriate preparation and use of oral-rehydration fluids at home. During a 
supervisory visit, the supervisor can do exit interviews of 30 consecutive women 
seen at the health station and record whether and what they have been taught about 
using fluids at home, possibly asking for demonstration of what they have been 
taught. At the same exit interviews, receipt of measles vaccine can be recorded 
as a measure of coverage. This will integrate surveillance for measles coverage 
with direct health-worker-performance assessment of a diarrhea-related task. 

Exit Interviews and Focus Groups 

Interviews of patients who have finished their visits at health facilities, which 
can be called "exit interviews," can be a flexible, easy, and cost-effective 
method of collecting information. Exit interviews are ideal for measuring 
progress toward local health objectives. They can be used to collect data for 
emergent problems or for routine surveillance, as well as to evaluate the 
performance of health workers. For surveillance purposes, exit interviews can be 
used to collect information about "process" health objectives, health risks, 
health behavior, and health interventions. Unlike surveys, exit interviews can 
be conducted frequently. Supervisory visits provide an excellent opportunity to 
involve the supervisor in the conduct of exit interviews. 


Focus groups can make important contributions to the design of a surveillance 
system. As complex issues such as changes in behavior are assigned higher health 
priorities (e.g., HIV-related behavior, diet, home fluids, treatment practices, 
and reasons for not being vaccinated), focus groups are often used to gain new

Focus groups often provide an appropriate first step in generating ideas about why 
events and behavior occur. After ideas or hypotheses are available, surveys, exit 
interviews, and special studies (case-control studies) can be used to identify 
specific factors that should be incorporated into surveillance systems. Health- 
station staff can use focus groups, along with exit interviews, to measure health 
objectives of local importance. 


Over the last 15 years, the sophistication of public health in developing 
countries has increased greatly. EPI provided one model for surveillance. 
Surveillance for measles, however, was relatively easy--the intervention was 
consistently and highly effective, and almost all infections caused a distinct, 
noticeable condition. The EPI surveillance model was less successful 
for problems such as diarrhea, pneumonia, family planning, and malaria, where the 
interventions were less effective or less consistently effective and where the 
outcome of interest was more difficult to measure. 

Then, HIV appeared. Reporting of cases of AIDS was inadequate for immediate 
prevention because of the lengthy incubation period for this condition. Accurate 
surveillance for HIV had to rely on expensive laboratory testing. 

Of the top 10 priority diseases in developing countries, only tuberculosis and 
malaria require any laboratory testing (at least sentinel testing) for 
surveillance, and the diagnostic tests for malaria and tuberculosis (though not 
the tests for antimicrobial resistance) are relatively simple and inexpensive. In 
addition, the appearance of HIV put new emphasis on the need for surveillance of 
types of health behavior, the main prevention focus for HIV. Previously, 
surveillance had been considered to be adequate in developing countries if it 
covered disease reporting and vaccination coverage. 

Now, surveillance data are expected to be available on risk factors and health 
behavior (e.g., age at marriage and age at first sexual intercourse for family- 
planning purposes), as well as on such newly important health problems as hepatitis B, 
genital ulcer disease, urethritis, use of tobacco, and injuries associated with 
motor vehicles. 

As public health programs become more sophisticated and public health workers need 
access to more information on more and more conditions, the complexity of the 
structure of surveillance systems will increase. The integration of surveillance 
and evaluation for vertical programs such as EPI, diarrhea, acute respiratory 
infections, HIV/STD, and family planning into a coherent, rational surveillance 
system will depend on the actions taken by ministries of health. 

Integration has several advantages: surveillance information can be gathered 
more cost-effectively, requirements for health-station staff will be simplified, 
and staff training will be less duplicative. 

Although international organizations, often supporting vertical programs, control 
a substantial proportion of the resources being spent on public health in 
developing countries, these organizations are likely to respond favorably to the 
implementation of logical, well-crafted, integrated surveillance systems that are 
linked to written national health priorities. 

Surveillance systems must continually focus on outcomes (cases of the health 
problem) in order to adjust strategies and interventions for control and 
prevention. Many countries are trying to reach low levels of vaccine-preventable 
diseases by the year 1995 (measles and neonatal tetanus) or eradication by the 
year 2000 (poliomyelitis) (24). The poliomyelitis eradication initiative attempted 
to demonstrate that outcome-based surveillance intimately linked to intervention 
can be the "leading wedge" in disease reduction. 

The sophistication of the tools available in developing countries to analyze 
surveillance data has also increased. Surveillance data have been analyzed with 
computers at the national level for the past several years. As the prices of 
computer hardware have continued to decrease, computers have been moved to zonal, 
state, and provincial levels. Epi Info, an inexpensive and freely copyable 
epidemiology computer program, is now available in English, French, Spanish, and 
Arabic (25); also, manuals are available in Czech and Italian. Mapping of 
surveillance data has been underutilized because inexpensive mapping programs that 
can display maps by district, health station, and village and can be linked to 
surveillance data bases have not been available. However, a mapping program 
called Epi Map is compatible with Epi Info and can create maps of surveillance 
data automatically. 


The vision for surveillance systems in developing countries as described above 
involves systems that are linked to health objectives, ordered by priority, 
limited in scope, and not burdensome at the health-station level. These systems 
should also contain an extensive sentinel network and have strong elements of 
population-based data gathering from surveys and vital event registration. 
Surveillance data need to be collected routinely. Sentinel sites will provide the 
information required to monitor health objectives, but such surveillance should 
also be flexible enough to collect new data needed for emerging problems, and for 
changing priorities. 

Health objectives provide national politicians and health leaders a plan to 
ensure the public's health. With a surveillance system that is linked to these 
objectives, leaders will be able to monitor progress made toward meeting national 
objectives. With analysis and action at the district and 
health-station level, local health staff can take rapid and appropriate action. 
Population-based vital statistics can show whether enough emphasis is being placed 
on health in rural and remote areas of a country. Health surveys can be conducted 
as a regular part of the surveillance system. Expertise and funding provided by 
international organizations can help train and maintain a survey coordinator and 
surveyors. 

In implementing surveillance and health systems, developing countries can avoid the 
mistakes that industrialized countries have already made--poorly planned and 
fragmented surveillance systems, surveillance systems not linked to objectives, 
health objectives that are not explicit and often politicized, large divisions 
between curative and preventive medicine, and differences in health care in rural 
versus urban areas. 

As noted at the beginning of this chapter, surveillance in developing countries is 
accompanied by numerous logistic problems but also presents unique opportunities. 
The careful setting of health priorities and the meticulous allocation of limited 
resources to the interests of the public's health can be the results of 
surveillance in such settings. 

Appendix XIII.A. Using Outcome To Measure Process 

This appendix describes a method to estimate process measures from outcome 
measures. Some process measures such as percentage coverage of an intervention 
(e.g., percentage using chloroquine, percentage having received vaccine, 
percentage using ORT) may be cost-effectively assessed by outcome data (e.g., 
number of cases of malaria, cases of measles, deaths from diarrhea). There is a 
relationship among the proportion of persons with a disease who have "received" 
an intervention, the effectiveness of the intervention, and the "coverage" of the 
intervention in the population. The relationship is as follows: 

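$$\mathrm{PCI} = \frac{\mathrm{PPI} - (\mathrm{PPI} \times \mathrm{Eff})}{1 - (\mathrm{PPI} \times \mathrm{Eff})}$$

where PCI is the percentage of the cases of disease exposed to the intervention, 
where PPI is the percentage of the population exposed to the intervention, and 
where Eff is the efficacy of the intervention (each expressed as a proportion 
between 0 and 1 when used in the formula). 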

This formula is derived from the formula for program (vaccine) efficacy, where 
efficacy equals the attack rate among persons not exposed to the program or 
intervention minus the attack rate among persons exposed, divided by the attack 
rate among those unexposed (26), i.e., for vaccine efficacy, Eff = VE or vaccine 
efficacy; PCI = PCV or percentage of case-patients who are vaccinated; and PPI = 
PPV or percentage of the population vaccinated. 
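
The relationship can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the standard form of the relationship given in the vaccine-efficacy literature (26), PCI = PPI(1 - Eff) / (1 - PPI x Eff), with every quantity expressed as a proportion between 0 and 1. The function names are hypothetical and do not come from the source.

```python
def pct_cases_exposed(ppi: float, eff: float) -> float:
    """Proportion of cases exposed to the intervention (PCI).

    ppi: proportion of the population exposed to the intervention (0..1)
    eff: efficacy of the intervention (0..1)
    """
    if not (0.0 <= ppi <= 1.0 and 0.0 <= eff <= 1.0):
        raise ValueError("ppi and eff must be proportions between 0 and 1")
    denominator = 1.0 - ppi * eff
    if denominator == 0.0:
        # Everyone is covered by a perfectly effective intervention: no cases.
        raise ValueError("PCI is undefined when ppi = 1 and eff = 1")
    return (ppi - ppi * eff) / denominator


def implied_efficacy(pci: float, ppi: float) -> float:
    """Rearrangement of the same relationship: Eff = (PPI - PCI) / (PPI * (1 - PCI))."""
    if not (0.0 < ppi <= 1.0 and 0.0 <= pci < 1.0):
        raise ValueError("need 0 < ppi <= 1 and 0 <= pci < 1")
    return (ppi - pci) / (ppi * (1.0 - pci))


if __name__ == "__main__":
    # Trace one vaccine-efficacy curve (Eff = 0.90) across coverage levels.
    for coverage in (0.2, 0.5, 0.8, 0.95):
        pci = pct_cases_exposed(coverage, 0.90)
        print(f"PPI = {coverage:.2f}  ->  PCI = {pci:.3f}")
```

Evaluating `pct_cases_exposed` over a range of coverage values for several fixed efficacies traces the family of curves described in the next paragraph.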

The graphic representation of this formula is known in immunization programs as 
the vaccine-efficacy curve (Figure XIII.A.1) (26). As an example, if the 
percentage of case-patients with disease who have been exposed to the 
intervention (PCI) is <20%, the coverage of the intervention in the population 
(PPI) is poor (assuming that the efficacy of the intervention is 90% or less). If the 
proportion of case-patients who have received the intervention is >50%, either the 
percentage coverage is high or the efficacy of the intervention is low. To 
estimate from surveillance the coverage of cases, one needs to determine whether 
persons with the disease were or were not exposed to a particular intervention 
(e.g., whether case-patients used condoms, whether case-patients received 
appropriate home fluids, or whether case-patients received vaccine). 

To use the formula or the curve, the exposure to the intervention must be 
dichotomized into a "yes/no" format. For example, for poliomyelitis, exposure is 
categorized into "fully vaccinated" with ≥3 doses of vaccine and "not fully 
vaccinated" with <3 doses of vaccine. This method has several advantages. It 
allows estimates of coverage at the health-station level, which allows local 
action to solve local health problems. It is much simpler and cheaper than 
conducting surveys, it provides information about effectiveness as well as 
coverage, and it is more difficult to falsify than estimates from coverage surveys 
and administrative methods. However, this method provides only a crude 
estimate and should be used with other sources of data. For example, if the 
survey or administrative estimate of OPV3 coverage is 95%, and only 20% of 
confirmed poliomyelitis case-patients received 3 doses of OPV, then the survey or 
administrative estimates should be questioned. 
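
The arithmetic behind this example can be checked with a short Python sketch. It assumes the relationship PCI = PPI(1 - Eff) / (1 - PPI x Eff) (26), solved for coverage, with all quantities as proportions. The efficacy values tried below are assumptions chosen for illustration, not figures from the source, and the function name is hypothetical.

```python
def implied_coverage(pci: float, eff: float) -> float:
    """Population coverage (PPI) implied by the proportion of cases exposed (PCI).

    Solving PCI = PPI(1 - Eff) / (1 - PPI * Eff) for PPI gives
    PPI = PCI / (1 - Eff + PCI * Eff).
    """
    if not (0.0 <= pci <= 1.0 and 0.0 <= eff <= 1.0):
        raise ValueError("pci and eff must be proportions between 0 and 1")
    denominator = 1.0 - eff + pci * eff
    if denominator == 0.0:
        raise ValueError("coverage is undefined when eff = 1 and pci = 0")
    return pci / denominator


if __name__ == "__main__":
    reported_coverage = 0.95   # survey or administrative estimate (from the text)
    pci_observed = 0.20        # case-patients with 3 doses of OPV (from the text)
    for assumed_eff in (0.80, 0.90, 0.95):   # assumed efficacies, illustration only
        ppi = implied_coverage(pci_observed, assumed_eff)
        gap = reported_coverage - ppi
        print(f"Eff = {assumed_eff:.2f}: implied coverage = {ppi:.2f} "
              f"(reported estimate is higher by {gap:.2f})")
```

Under each assumed efficacy the implied coverage falls well below the reported 95%, which is the kind of discrepancy that should prompt a closer look at the survey or administrative estimate.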

Appendix XIII.B. 30-Cluster EPI Survey Design 

In the absence of an internationally funded survey to which modules or questions 
desired by a ministry of health can be attached, a 30-cluster EPI survey can be 
performed (20). The EPI survey was designed to provide a crude estimate of 
vaccination coverage (±10%) (27); it provided information about whether 
vaccination coverage was low (20%-40%) or relatively high (70%-90%). Other 
programs have adapted the design for other purposes (e.g., mortality from neonatal 
tetanus, mortality and practices associated with diarrhea, and changes in 
vaccination coverage over time) (22,23). 

However, results have often been misleading because appropriate confidence 
intervals were not calculated. Many health professionals did not realize that the 
confidence interval for each survey was not fixed at ±10% but varied depending on 
the results (inter-cluster correlation and the point estimate) of each survey. 
Often confidence intervals were not calculated and appropriate analyses of 
subgroups (males, females) were not done because easy-to-use computer programs 
were not available. Fortunately, computer programs (COSAS; a Lotus 
spreadsheet for diarrhea cluster surveys; and CLUSTER, which runs within Epi Info) 
are now available to calculate appropriate confidence intervals. However, if an 
analysis by age, by gender, or by some other specific characteristic is desired, a 
more complicated program (e.g., SUDAAN or CARP) still must be used to obtain valid 
point estimates and valid confidence intervals (28). For example, one cannot get 
a valid estimate of coverage for males and females in a typical EPI coverage 
survey without the use of SUDAAN. 
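
Why the interval is not fixed at ±10% can be seen in a small Python sketch. It uses a textbook ratio-estimator variance that treats each cluster as the sampling unit, with a normal-approximation 95% interval. This is an assumption made for illustration, not the algorithm used by COSAS, CLUSTER, or SUDAAN, and the cluster counts are hypothetical.

```python
import math


def cluster_proportion_ci(vaccinated, sampled, z=1.96):
    """Coverage estimate and approximate 95% CI from a one-stage cluster sample.

    vaccinated[i]: number of vaccinated children found in cluster i
    sampled[i]:    number of children examined in cluster i
    Returns (estimate, lower, upper, design_effect).
    """
    k = len(sampled)
    if k < 2 or k != len(vaccinated):
        raise ValueError("need matching counts for at least two clusters")
    n = sum(sampled)
    p = sum(vaccinated) / n
    # Between-cluster residuals drive the variance of the ratio estimate.
    ss = sum((y - p * m) ** 2 for y, m in zip(vaccinated, sampled))
    variance = (k / (k - 1)) * ss / n ** 2
    se = math.sqrt(variance)
    # Design effect: variance relative to a simple random sample of the same size.
    srs_variance = p * (1 - p) / n
    deff = variance / srs_variance if srs_variance > 0 else float("nan")
    return p, max(0.0, p - z * se), min(1.0, p + z * se), deff


if __name__ == "__main__":
    # Hypothetical 30 clusters of 7 children each.
    sampled = [7] * 30
    similar_clusters = [4, 3] * 15       # clusters resemble one another
    dissimilar_clusters = [7, 0] * 15    # coverage concentrated in half the clusters
    for label, counts in (("similar", similar_clusters),
                          ("dissimilar", dissimilar_clusters)):
        p, lo, hi, deff = cluster_proportion_ci(counts, sampled)
        print(f"{label}: estimate {p:.2f}, 95% CI {lo:.2f}-{hi:.2f}, "
              f"design effect {deff:.1f}")
```

Both hypothetical surveys give the same point estimate, but the interval is far wider when coverage differs sharply between clusters.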

As the use of the cluster survey becomes more sophisticated and as greater 
accuracy and precision are desired, use of the EPI cluster-survey design is 
complicated by the potential for bias in both selection of the first house and 
subsequent selection of additional houses (29). Despite being designed and 
analyzed as a survey with equal probability of selection, selection of the 
starting house from a randomly selected direction yields a higher probability of 
selection for houses near the middle of the cluster. If occupants near the middle 
of the cluster have some characteristic associated with the outcome (e.g., have 
higher incomes), a biased estimate will result. 

An alternative method of selecting the first and additional houses in a cluster is 
by segmenting and subsegmenting the cluster until a small number of houses can be 
mapped (e.g., 30 houses). Then, the first and additional houses can be chosen at 
random. If one assumes that the number of target-group persons per household is 
similar in all clusters, valid point estimates and approximate confidence 
intervals can be calculated using less-complicated programs (CLUSTER and COSAS). 
The use of subsegmenting in the absence of being able to select the first house 
randomly has also been described. 

An easy-to-use program that appropriately analyzes cluster surveys (including 
appropriate analysis of subgroups and comparison of two independent surveys done 
at two different times) operating within Epi Info is being prepared. 



1. Jamison DT, Mosley WH. Disease control priorities in developing countries: 
health policy responses to epidemiological change. Am J Public Health 

2. Jamison DT, Mosley WH (eds.). Disease control priorities in developing 
countries. Oxford and New York: Oxford University Press. In press. 

3. Walsh JA, Warren KS. Selective primary health care: an interim strategy 
for disease control in developing countries. N Engl J Med 1979;301:967-74. 

4. Frerichs RR. Epidemiologic surveillance in developing countries. Ann Rev 
Pub Health 1991;12:80-257. 

5. Lemeshow S, Robinson D. Surveys to measure programme coverage and impact: 
a review of the methodology used by the Expanded Programme on Immunization. 
World Health Stat Q 1985;38:65-75. 

6. Markowitz LE, Sepulveda J, Diaz-Ortega JL et al. Immunization of 6-month- 
old infants with different doses of Edmonston-Zagreb and Schwarz measles 
vaccines. N Engl J Med 1990;322:580-7. 

7. Hardy GE, Hopkins CC, Linnemann CC et al. Trivalent oral polio vaccine: a 
comparison of two infant immunization schedules. Pediatrics 1970;45:444-8. 

8. McBean AM, Thoms ML, Albrecht P et al. Serologic response to oral polio 
vaccine and enhanced-potency inactivated poliovaccines. Am J Epidemiol 

9. Deming MS, Jaiteh KO, Otten MW et al. Epidemic poliomyelitis in The Gambia 
following the control of poliomyelitis as an endemic disease. II. 
Clinical efficacy of trivalent oral polio vaccine. Am J Epidemiol 

10. Patriarca PA, Wright PF, John JT. Factors affecting the immunogenicity of 
oral poliovirus vaccine in developing countries: review. Review of 
Infectious Diseases 1991;13:926-39. 

11. Sutter RW, Patriarca PA, Brogan S et al. Outbreak of paralytic 
poliomyelitis in Oman: evidence for widespread transmission among fully 
vaccinated children. Lancet 1991;338:715-20. 

12. Otten MW, Deming MS, Jaiteh KO et al. Epidemic poliomyelitis in The Gambia 
following the control of poliomyelitis as an endemic disease. I. 
Descriptive findings. Am J Epidemiol 1992;135:381-92. 

13. Fenner F, Henderson DA, Arita I et al. Smallpox and its eradication. 
Geneva, Switzerland: World Health Organization, 1988:475-6. 

14. Pan American Health Organization. Health Information System for EPI. 
Washington, D.C.: Pan American Health Organization, 1992. 

15. Centers for Disease Control. National Center for Health Statistics. 
Health status indicators for the year 2000. Statistical Notes 1991;1:1-4. 

16. U.S. Bureau of the Census. Historical statistics of the United States, 
Colonial times to 1970, Bicentennial edition, Part 1. Washington, D.C.: 
Government Printing Office, 1975. 

17. McCarthy BJ, Terry J, Rochat R, Quave S, Tyler CW. The underregistration 
of neonatal deaths: Georgia 1974-1977. Am J Public Health 1980:977-82. 

18. Kielmann AA, Taylor CE, DeSweemer C et al. Child and maternal health 
services in rural India: the Narangwal experiment. Baltimore: Johns 
Hopkins University Press, 1983. 

19. World Health Organization. Lay reporting of health information. Geneva, 
Switzerland: World Health Organization, 1978. 

20. Expanded Programme on Immunization. The EPI coverage survey. Training for 
mid-level managers. Geneva, Switzerland: World Health Organization, 1988. 

21. U.S. Agency for International Development, Centers for Disease Control. 
African child survival initiative, 1989-90. Bilingual annual report. 
Washington, D.C.: Government Printing Office, 1990. 

22. Galazka A, Stroh G. Guidelines on the community-based survey of neonatal 
tetanus mortality. Geneva, Switzerland: World Health Organization, 
WHO/EPI/GEN/86/8, 1986. 

23. Programme for Control of Diarrhoeal Diseases. Household survey manual: 
diarrhea case management, morbidity, and mortality. Geneva, Switzerland: 
World Health Organization, CDD/SER/86.2/Rev.1, 1989. 

24. World Health Organization. Global Advisory Group. Part II. Expanded 
Programme on Immunization. Weekly Epidemiological Record 1992;67:17-19. 

25. Dean AG, Dean JA, Burton AH et al. EPI INFO, Version 5: a word-processing, 
data-base, and statistics program for epidemiology on 
microcomputers. Atlanta, Ga.: Centers for Disease Control, 1990. 

26. Orenstein WA, Bernier RH, Dondero TJ et al. Field evaluation of vaccine 
efficacy. Bull World Health Organization 1985;63:1055-68. 

27. Henderson RH, Sundaresan T. Cluster sampling to assess immunization 
coverage: a review of experience with simplified sampling methods. Bull 
World Health Organization 1982;60:253-60. 

28. Shah B, Barnwell BG, Hunt P et al. SUDAAN user's manual. Release 5.50. 
Raleigh, N.C.: Research Triangle Institute, 1989. 

29. Lemeshow S, Stroh G. Sampling techniques for evaluating health parameters 
in developing countries. Washington, D.C.: National Academy Press, 1988. 

Reference Material for Principles and Practice of Public Health 



Chapter I: 

Table I.1. The uses of surveillance 

Chapter II: 

Table II.1. Steps in planning a surveillance system 
Table II.2. Criteria for identifying high-priority health 
events for surveillance 

Chapter III: 

No tables 

Chapter IV: 

Table IV.1. Essential questions for the practice of effective 
disease/injury reporting 
Table IV.2. Concerns of the data-base manager 

Chapter V: 

Table V.1. Rates and quantities involving rates commonly used in 
epidemiology 
Table V.2. Crude death rates--Dade and Pinellas counties, 
Florida, 1980 
Table V.3. Age-specific death rates--Dade and Pinellas counties, 
Florida, 1980 
Table V.4. Directly standardized death rates--Dade and Pinellas 
counties, Florida, 1980 
Table V.5. Indirectly standardized death rates--Dade and Pinellas 
counties, Florida, 1980 
Table V.6. Five-number summary of 39 4-week totals of reported 
cases of meningococcal infections--United States, 

Table V.7. Common power transformations (y — y ) 

Table V.8. Guide for selecting data graphics 

Table V.9. Primary and secondary morbidity from syphilis, by age 
category--United States, 1989 
Table V.10. Primary and secondary morbidity from syphilis, by age 
category, race, and gender--United States, 1989 

Chapter VI: 

No tables 

Chapter VII: 

Table VII.1. Controlling and directing information dissemination 

Chapter VIII: 

Table VIII.1. Sample case definition developed by the Centers 
for Disease Control and the U.S. Council of State 
and Territorial Epidemiologists 
Table VIII.2. The detection of health conditions with a 
surveillance system 
Table VIII.3. Comparison of estimated costs for active and passive 
surveillance systems in a health department, Vermont, 
June 1, 1980, to May 31, 1981 
Table VIII.4. Outline of sample surveillance evaluation report 

Chapter IX: 

Table IX.1. Ethical responsibilities in surveillance--participants 
and duties 
Table IX.2. An ethical checklist for public health surveillance 

Chapter X: 

No tables 

Chapter XI: 

No tables 

Chapter XII: 

Table XII.1. Reasons cited by physicians for failure to report 
notifiable diseases (42, 45-47) 
Table XII.2. What local and state health departments can do to 
improve reporting by physicians 
Table XII.3. Criteria used to set priorities for national 
disease surveillance, Canada (60) 
Table XII.4. Confidence intervals for rates (61) 
Table XII.5. Formula for calculating 95% confidence intervals 
for the ratio of two independent rates (61) 
Table XII.6. Formula for calculating 95% confidence intervals 
for the difference between two independent rates 

Chapter XIII: 

Table XIII.1. Examples of data sources for surveillance in 
developing countries 
Table XIII.2. Health problems ranked according to preventability 
and treatability, Thailand, 1987 
Table XIII.3. Examples of objectives linked to surveillance 
components that will measure objectives 
Table XIII.4. Grid to identify which surveillance component 
will measure a health objective in a hypothetical 
developing country 


Chapter I: 

Figure I.1. Reported cases of congenital syphilis among infants 
<1 year of age and rates of primary and secondary (P&S) 
syphilis among women--United States, 1970-1991 
Figure I.2. Salmonella rates in New Hampshire and contiguous states, 
by county 
Figure I.3. Homicide rate, by age and gender of victim, United 
States, 1986 
Figure I.4. Malaria rates, by year--United States, 1930-1988 
Figure I.5. Reported cases of measles, by age group, United 
States, 1980-1982 
Figure I.6. Semi-logarithmic-scale line graph of reported cases 
of paralytic poliomyelitis--United States, 1951-1989 
Figure I.7. Percentage of reported cases of gonorrhea caused by 
antibiotic-resistant strains--United States, 1980- 
Figure I.8. Cesarean deliveries as a percentage of all deliveries 
in U.S. hospitals, by year, 1970-1990 

Chapter II: 

No figures 

Chapter III: 

No figures 

Chapter IV: 

No figures 

Chapter V: 

Figure V.1. Crude, gender-specific and gender-race-specific cases 
of primary and secondary syphilis--United States, 1981-1990, 
comparison of differential trends 

Figure V.2. Dot plot of results of swine influenza virus (SIV) 
hemagglutination-inhibition (HI) antibody testing 
among exposed and unexposed swine exhibitors--Wisconsin, 

Figure V.3. Ordered data series and stem-and-leaf display of 39 4-week 
totals of reported cases of meningococcal 
infections--United States, 1987-1989 

Figure V.4. Scatter plot of 39 4-week totals of reported cases of 
meningococcal infections--United States, 1987-1989 

Figure V.5. Box plot of 39 4-week totals of reported cases of 
meningococcal infections--United States, 1987-1989 

Figure V.6. Histogram (epidemic curve) of reported cases of 
paralytic poliomyelitis--Oman, January 1988-March 1989 

Figure V.7. Sample cumulative attack rate, by grade in school 
and time of onset--North Carolina, 1985 

Figure V.8. Survival curves over time, based on serum testosterone 
level. Eastern Cooperative Oncology Group 

Figure V.9. Frequency polygon of reported cases of encephalitis--United 
States, 1965 

Figure V.10. Group bar chart of case-fatality rates from ectopic 
pregnancy, by age group and race--United States, 

Figure V.11. Stacked bar chart of underlying causes of infant 
mortality, by racial/ethnic group and age at death--United 
States, 1983 

Figure V.12. Deviation bar chart of notifiable disease reports, 
comparison of 4-week totals ending May 23, 1992, with 
historical data--United States 

Figure V.13. Pie charts of poliomyelitis vaccination status of 
children ages 1-4 years in cities with populations 
equal to or greater than 250,000, by financial status--United 
States, 1969 

Figure V.14. Spot map of deaths from smallpox--California, 

Figure V.15. Choropleth map of confirmed and presumptive cases 
of St. Louis encephalitis, by county--Florida, 1990 

Figure V.16. Density-equalizing map of California (based upon 
population density), depicting deaths from smallpox, 
Chapter VI: 

Figure VI.1. Example: Data used for report published during week 
20 (May 23, 1992) 

Chapter VII: 

No figures 

Chapter VIII: 

Figure VIII.1. National Notifiable Diseases Surveillance System 
Figure VIII.2. Biases in surveillance 

Chapter IX: 

No figures 

Chapter X: 

No figures 

Chapter XI: 

No figures 

Chapter XII: 

Figure XII.1. Cartoon depicting mumps as a public health 
problem, Tennessee 

Chapter XIII: 

Figure XIII.A.1. Percentage of case-patients vaccinated 
(PCV) per percentage of population 
vaccinated (PPV) for seven values of 
vaccine efficacy (VE) 

Table I.1. The uses of surveillance (23) 

Quantitative estimates of the magnitude of a health problem. 

Portrayal of the natural history of disease. 

Detection of epidemics. 

Documentation of the distribution and spread of a health event. 

Facilitating epidemiologic and laboratory research. 

Testing of hypotheses. 

Evaluation of control and prevention measures. 

Monitoring of changes in infectious agents. 

Monitoring of isolation activities. 

Detection of changes in health practice. 

and planning 

TABLE II. 1. Steps in planning a surveillance system 

1. Establish objectives. 

2. Develop case definitions. 

3. Determine data source or data-collection mechanism (type of system) 

4. Develop data-collection instruments. 

5. Field test methods. 

6. Develop and test analytic approach. 

7. Develop dissemination mechanism. 

8. Assure use of analysis and interpretation. 

TABLE II. 2. Criteria for identifying high-priority health events for surveillance 

• Frequency : 




Years of potential life lost 

• Severity : 

Case-fatality ratio 
Hospitalization rate 

• Cost 

Direct and indirect costs 

• Preventability 

• Communicability 

• Public interest 

TABLE IV.1. Essential questions for the practice of effective disease/injury reporting 

Initiation/sources of reports 

* How and by whom are health-care practitioners (existing and newly practicing) 
entered into the reporting network? 

* By what agency are conditions reported for such temporary residents as college 
students, military personnel, and migrant workers? 

Routing/timing of reports 

* How should "suspected case, laboratory results pending" be handled? 

* Should the local or the state health department update a case report when 
additional information is received? 

* Should case reports arise from the health jurisdiction in which the patient 
resides? In which the patient became infected (injured)? In which the patient 
became ill (and/or received treatment)? 

* Should a diagnostic laboratory send data on reportable conditions to the requester, 
or should it be responsible for reporting to appropriate local/state health 
departments? (If "yes" to the latter, in what order?) 

* If a case occurs one calendar year, but is not reported until early in the next 
calendar year, what is the year of report? What is the cut-off date for reports 
from the previous year? How are reports treated that are for the previous year but 
are received after the established deadline? 

* Is there a mechanism for reporting disease/injury across state lines, as 

Policy issues in reporting disease/injury 

* What items on the reporting form must be completed before a report can be 

* If a reportable condition has a specific case definition (such as measles and 
AIDS), should the case be reported before confirmation by a disease investigator? 

* What mechanism will be (has been) established to deal with situations in which 
cases must be reported in batches rather than individually because the number of 
reports is overwhelmingly large? 

* If case reports are held pending laboratory confirmation, should the "date of 
report" reflect the original date of report or the date laboratory confirmation was 
received or some other date associated with this health event? 

* Are reports generated to identify records with incomplete/unconfirmed data so that 
follow-up can be initiated? 

* How does one avoid duplicate reports of the same case? 

* How are discrepancies in the information on duplicate reports resolved? 

TABLE IV. 2. Concerns of the data-base manager 

1. Who will enter the data? What credentials must this person have? Who is this 
person's back-up? Who will update records? Back-up the computer file? 

2. Will data be entered on an as-received basis or according to an established 

3. Does the data-entry screen replicate the paper form from which data are to be 

4. Does the data-entry program allow for certain data items to be entered 
automatically on subsequent screens until the data recorder makes a change? (For 
example, the county initially entered will appear on each subsequent screen until 
the recorder types in a different county. This allows the recorder to batch 
records for more efficient entry.) 

5. Does the data-entry program effectively validate the data being entered for 
completeness by use of "must-enter" fields and "look-up" files? 

6. Does the data-entry program have the ability to do range checking on values 
entered? If so, does the system allow for acceptable ranges to change, reflecting 
values entered in the data base over time? Is there a logic audit procedure in 
the system to locate such errors as misspelled names or addresses, incorrectly 
coded race, gender, or code for disease/injury? 

7. At what level (state or local) will records be changed or deleted? Who owns the 
data records? 

8. If the data base is distributed to other users as an electronic file or on floppy 
diskette, are there safeguards to prevent overwriting another user's data? 
Safeguards against computer viruses? 

9. Are the data-entry programs flexible enough to allow variables to be modified as 
prescribed by changes in state regulations and national recommendations? 

10. Are production reports automatically generated for quality assurance of data entry? 

11. How and with what frequency are data copied and stored for back-up purposes? Are 
paper/film copies maintained (in the event of computer failure)? 

12. Are double-entry systems used for quality assurance? 

TABLE V.l. Rates and quantities involving rates commonly used in epidemiology 




Expressed per 
number at risk 

Measures of morbidity: 


Attack rate 

attack rate 


Number of new cases 
of specified 
condition/given time 

Number of new cases 
of specified 

Number of new cases 
of specified 
condition among 
contacts of known 

Number of current 
cases of specified 
condition at given 

Population at start 
of time interval 

Population at start 
of epidemic 

Size of contact 
population at risk 

population at 
same point in time 

10" where 
x = 2,3,4,5,6 

10 1 where 
x = 2,3,4,5,6 

10" where 

x = 2,3,4,5,6 

10 1 where 

x = 2,3,4,5,6 


Number of old cases 
plus new cases of 
specified condition 
identified in given 
time interval 

Estimated mid-interval 

10 1 where 
x = 2,3,4,5,6 

Measures of mortality: 

death rate 

Total number of deaths 
reported in given 
time interval 

Estimated mid-interval 

1,000 or 

death rate 

Number of deaths from 
specific cause in 
given time interval 

Estimated mid-interval 



Number of deaths from 
specific cause in 
given time interval 

Total number of deaths 
from all causes in 
same interval 

100 or 



Measures of mortality: (continued) 

case ratio 
rate, case- 
fatality ratio) 









Number of deaths from 
specific condition 
in given time 

Number of deaths 
(<28 days of age) in 
given time interval 

Number of deaths 
(<1 year of age) in 
given time interval 

Number of deaths from 
pregnancy related causes 
in given time 

Measures of natality: 


Number of new cases 
of that condition 
in same time 

Number of live births 
in same time 

Number of live births 
reported in same 
time interval 

Number of live births 
in same time 

birth rate 

Number of live births 
reported in given 
time interval 

Estimated total 



fertility rate 

Number of live births 
reported in given 
time interval 

Estimated number of 
women ages 15-44 
years at mid-interval 

Crude rate 
of natural 

Number of live births 
minus number of deaths 
in given time interval 

Estimated total 



Low birth 
weight ratio 

Number of live births 
(<2,500 grams) in 
given time interval 

Number of live births 
reported in same 
time interval 

Expressed per 
number at risk 
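
The measures tabulated above share one arithmetic form: a numerator count divided by a denominator count, expressed per some number at risk (a power of 10). A minimal Python sketch of that form follows. The function name `frequency_measure` and every count are hypothetical and are not taken from the text.

```python
def frequency_measure(numerator, denominator, per=1_000):
    """Return numerator / denominator, expressed per `per` units at risk."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator * per


# Hypothetical counts, for illustration only.
# Attack rate: new cases / population at start of epidemic, per 100.
attack_rate = frequency_measure(30, 200, per=100)
# Low birth weight ratio: live births <2,500 grams / live births, per 100.
low_birth_weight_ratio = frequency_measure(70, 1_000, per=100)
# Crude death rate: deaths / estimated mid-interval population, per 1,000.
crude_death_rate = frequency_measure(850, 100_000, per=1_000)

print(attack_rate, low_birth_weight_ratio, crude_death_rate)
```

Only the choice of numerator, denominator, and multiplier distinguishes one measure from another, so a single helper is enough for the whole table.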









TABLE V.2. Crude death rates-Dade and Pinellas counties, Florida, 1980 



Crude death rate 

(per 1,000 


Dade County 




Pinellas County 




Sources: Bureau of the Census, 1983. 

National Center for Health Statistics, Centers for Disease Control. 

TABLE V.3. Age-specific death rates-Dade and Pinellas counties, Florida, 1980

Age group 


Dade County 

Pinellas County 



Rate (per 
1,000 pop.) 



Rate (per 
1,000 pop.) 







































































Sources: Bureau of the Census, National Center for Health Statistics, Centers for Disease Control.

*Deaths >75 include six persons of unknown age for Dade and one of unknown age for Pinellas counties.

TABLE V.4. Directly standardized death rates-Dade and Pinellas counties, Florida, 1980* 

Age group 


1980 U.S. population 

(percentage distribution) 


Age-specific death rates 

(per 1,000 pop.) 

Expected deaths in 1980 

U.S. population using 
county age-specific ratesf 

Dade County 

Pinellas County 

Dade County 

Pinellas County 































































rates (per 

1,000 pop.)§ 

7.9 7.7 

*United States population, 1980, used as standard.

†$C_{ij} = A_i \times B_{ij}$, where $i = 1,\ldots,9$ age groups and $j = 1,2$ counties.

§$\sum_i C_{ij}/100$.
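
The footnotes to Table V.4 describe direct standardization: multiply each age group's share of the standard population (a percentage) by the county's age-specific rate, sum the products, and divide by 100. A minimal Python sketch of that calculation follows. The three age groups and all numbers are hypothetical; the table itself uses nine age groups and the 1980 U.S. population.

```python
def directly_standardized_rate(standard_pct, age_specific_rates):
    """Directly standardized rate per 1,000 population.

    standard_pct: percentage distribution of the standard population (A_i).
    age_specific_rates: one county's rates per 1,000, same age order (B_ij).
    """
    if len(standard_pct) != len(age_specific_rates):
        raise ValueError("inputs must cover the same age groups")
    # C_ij = A_i * B_ij for each age group i.
    products = [a * b for a, b in zip(standard_pct, age_specific_rates)]
    return sum(products) / 100


# Hypothetical example with three age groups.
standard_pct = [30.0, 50.0, 20.0]   # sums to 100
county_rates = [1.0, 4.0, 40.0]     # deaths per 1,000 in each age group
rate = directly_standardized_rate(standard_pct, county_rates)
print(round(rate, 1))
```

Because both counties are weighted by the same standard distribution, differences between their standardized rates no longer reflect differences in age structure.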

TABLE V.5. Indirectly standardized death rates- 

Dade and Pinellas counties, Florida, 1980* 


Age group 


Death rates 

(per 1,000 pop.) 

U.S. 1980 

1980 population 


Expected number of deaths in 

county based on U.S. -specific 




































































rates (per

1,000 pop.)¶








rates (per 

1,000 pop.) 






rates (per 

1,000 pop.)tt 



*United States age-specific death rates, 1980, used as standard.

†$C_{ij} = A_i \times B_{ij}$, where $i = 1,\ldots,9$ age groups and $j = 1,2$ counties.

§Deaths >75 include S68 of unknown age for United States.

¶$\sum_i C_{ij} / \sum_i B_{ij}$ for $j = 1,2$.

**U.S. total death rate/expected death rate.

††Crude death rate × adjusting factor.
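
The footnotes to Table V.5 can be read as a three-step recipe for indirect standardization: apply the standard (U.S.) age-specific rates to the county's population to obtain an expected death rate, divide the U.S. total death rate by that expected rate to obtain an adjusting factor, and multiply the county's crude death rate by the factor. The Python sketch below follows that reading. The function name and all numbers are hypothetical, and the reading itself is an interpretation of partly illegible footnotes.

```python
def indirectly_standardized_rate(standard_rates, county_pop,
                                 county_deaths, standard_total_rate):
    """Indirectly standardized death rate per 1,000 population.

    standard_rates: standard age-specific rates per 1,000 (A_i).
    county_pop: county population in each age group (B_ij).
    county_deaths: total observed deaths in the county.
    standard_total_rate: crude death rate of the standard population per 1,000.
    """
    total_pop = sum(county_pop)
    # Expected deaths if the county experienced the standard rates.
    expected_deaths = sum(r * p / 1_000
                          for r, p in zip(standard_rates, county_pop))
    expected_rate = expected_deaths / total_pop * 1_000
    crude_rate = county_deaths / total_pop * 1_000
    adjusting_factor = standard_total_rate / expected_rate
    return crude_rate * adjusting_factor


# Hypothetical example with two age groups.
adjusted = indirectly_standardized_rate(
    standard_rates=[1.0, 10.0],
    county_pop=[1_000, 1_000],
    county_deaths=22,
    standard_total_rate=8.0,
)
print(round(adjusted, 1))
```

This approach needs only the county's total deaths and age distribution, which makes it usable when age-specific death counts for the county are unavailable or unstable.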

TABLE V.6. Five-number summary of 39 4-week totals of reported cases of meningococcal infections- 
United States, 1987-1989 

Median 190 

Hinges 151 237 

Extremes 102 350 
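
The five-number summary in Table V.6 can be reproduced from the 39 four-week totals listed with Figure V.3. The sketch below uses Tukey's depth rule for the hinges (hinge depth = (floor of median depth + 1) / 2), which is an assumption about the method; the text does not state it. The garbled value "1 11" in the 1988 series is read as 111.

```python
import math

# 39 four-week totals from Figure V.3 ("1 11" in the scan read as 111).
cases = [
    226, 307, 350, 236, 222, 258, 197, 167, 138, 108, 191, 190, 201,  # 1987
    216, 238, 331, 270, 265, 156, 164, 142, 112, 111, 153, 138, 159,  # 1988
    145, 306, 314, 264, 222, 195, 155, 149, 102, 117, 174, 158, 159,  # 1989
]


def five_number_summary(values):
    """Return (minimum, lower hinge, median, upper hinge, maximum)."""
    xs = sorted(values)
    n = len(xs)

    def value_at_depth(depth):
        # Depth is 1-based from the low end; a half depth averages neighbors.
        low, high = math.floor(depth), math.ceil(depth)
        return (xs[low - 1] + xs[high - 1]) / 2

    median_depth = (n + 1) / 2
    hinge_depth = (math.floor(median_depth) + 1) / 2
    return (
        xs[0],
        value_at_depth(hinge_depth),
        value_at_depth(median_depth),
        value_at_depth(n + 1 - hinge_depth),
        xs[-1],
    )


summary = five_number_summary(cases)
print(summary)
```

With these data the sketch returns extremes of 102 and 350, hinges of 151 and 237, and a median of 190, matching the table.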

TABLE V.7. Common power transformations ($y \to y^p$)









$-1/y^2$



Higher powers 

Square root 

No transformation 
Appropriate for count data


Generally logarithm to 
base 10, widely used 

Reciprocal root 

Reciprocal square 

Minus sign preserves 

Lower powers 
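
The ladder of transformations in Table V.7 can be sketched as one function of the power p. The conventions assumed here are that p = 0 denotes the base-10 logarithm and that negative powers carry a minus sign so that the ordering of the data is preserved; both are inferred from the surviving table fragments. The function name `power_transform` is hypothetical.

```python
import math


def power_transform(y, p):
    """Transform a positive value y by power p (ladder of powers)."""
    if y <= 0:
        raise ValueError("y must be positive")
    if p == 0:
        return math.log10(y)   # logarithm takes the place of y**0
    if p < 0:
        return -(y ** p)       # minus sign keeps larger y mapped to larger values
    return y ** p


print(power_transform(4, 0.5))    # square root
print(power_transform(100, 0))    # log base 10
print(power_transform(4, -2))     # reciprocal square, with minus sign
```

Moving down the ladder (toward lower powers) pulls in a long upper tail; moving up (toward higher powers) stretches it.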

TABLE V.8. Guide for selecting data graphics 

Type of graph or chart              When to use

Arithmetic-scale line graph         Trends in numbers or rates over time

Semilogarithmic-scale line graph    1. Emphasize rate of change over time
                                    2. Display values ranging >2 orders of magnitude

Histogram                           1. Frequency distribution of continuous variable
                                    2. Number of cases during epidemic (i.e., epidemic curve) or over time

Frequency polygon                   Frequency distribution of continuous variable, especially to show

Cumulative frequency                Cumulative frequency

Scatter diagram                     Plot association between two variables

Simple bar chart                    Compare size or frequency of different categories of single variable

Grouped bar chart                   Compare size or frequency of different categories of 2-4 series of data

Stacked bar chart                   Compare totals and illustrate component parts of the total among
                                    different groups

Deviation bar chart                 Illustrate differences, both positive and negative, from baseline

Pie chart                           Show components of a whole

Spot map                            Show location of cases or events

Chloropleth map                     Display events or rates geographically

Box plot                            Visualize statistical characteristics (e.g., median, range, skewness) of

TABLE V.9. Primary and secondary morbidity from syphilis, by age category-United States,


Age group 































♦Percentages do not add to 100.0 due to rounding. 

































































































































































































" - 































































TABLE VII.1. Controlling and directing information dissemination

                            Questions to be Answered

Establish communications    What should be said?

Define audience             To whom should it be said?

Select the channel          Through what communication

Market the message          How should the message be

Evaluate the impact         What effect did the message create?

TABLE VIII.1. Sample case definition developed by the Centers for Disease Control and
the U.S. Council of State and Territorial Epidemiologists 


Clinical case definition 

An illness characterized by all of the following clinical features: 

• A generalized rash lasting ≥3 days

• A temperature ≥38.3 C (101 F)

• Cough or coryza or conjunctivitis 

Laboratory criteria for diagnosis 

• Isolation of measles virus from a clinical specimen 


• Significant rise in measles antibody level by any standard serologic assay 


• Positive serologic test for IgM antibody (to measles) 

Case classification 

Suspected: any rash illness with fever. 

Probable: meets the clinical case definition, has no or noncontributory serologic 

or virologic testing, and is not epidemiologically linked to a probable or 

confirmed case. 

Confirmed: a case that is laboratory confirmed or that meets the clinical case 

definition and is epidemiologically linked to a confirmed or probable case. A 

laboratory-confirmed case does not need to meet the clinical case definition. 


Two probable cases that are epidemiologically linked would be considered confirmed, 
even in the absence of laboratory confirmation. 
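
The classification rules in Table VIII.1 can be expressed as a short decision procedure. The Python sketch below is illustrative only: the record layout, field names, and function names are hypothetical, and it treats the absence of laboratory confirmation as "no or noncontributory" testing, which simplifies the table's wording.

```python
from dataclasses import dataclass


@dataclass
class RashIllnessReport:
    """Hypothetical report record; field names are illustrative."""
    any_rash: bool = False
    generalized_rash_days: int = 0   # days of generalized rash (0 = none)
    fever: bool = False              # any reported fever
    max_temp_c: float = 0.0          # highest recorded temperature, degrees C
    cough: bool = False
    coryza: bool = False
    conjunctivitis: bool = False
    lab_confirmed: bool = False      # any one laboratory criterion met
    epi_linked: bool = False         # linked to a probable or confirmed case


def meets_clinical_definition(r):
    """All three clinical features of the table must be present."""
    return (r.generalized_rash_days >= 3
            and r.max_temp_c >= 38.3
            and (r.cough or r.coryza or r.conjunctivitis))


def classify(r):
    """Return 'confirmed', 'probable', 'suspected', or None."""
    clinical = meets_clinical_definition(r)
    if r.lab_confirmed or (clinical and r.epi_linked):
        return "confirmed"
    if clinical:
        return "probable"   # clinical case, no lab confirmation, no epi link
    has_rash = r.any_rash or r.generalized_rash_days > 0
    has_fever = r.fever or r.max_temp_c >= 38.3
    if has_rash and has_fever:
        return "suspected"
    return None


clinical_case = RashIllnessReport(generalized_rash_days=4, max_temp_c=39.0, cough=True)
print(classify(clinical_case))
```

Checking the confirmed category first mirrors the table's statement that a laboratory-confirmed case does not need to meet the clinical case definition.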

TABLE VIII. 2. The detection of health conditions with a surveillance system. 

"Condition" present 

                            Yes               No

                    Yes     True positive     False positive    A+B
                            A                 B
Detected by
surveillance        No      False negative    True negative     C+D
                            C                 D


♦Sensitivity = A/ (A+C). 
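
The sensitivity formula in the footnote can be computed directly from the cells of Table VIII.2. The sketch below also computes predictive value positive as A/(A+B); that is the standard definition and uses the row total shown in the table, but the formula is not printed in this table. All counts are hypothetical.

```python
def sensitivity(a, c):
    """Proportion of true cases detected by surveillance: A / (A + C)."""
    return a / (a + c)


def predictive_value_positive(a, b):
    """Proportion of detected cases that are true cases: A / (A + B)."""
    return a / (a + b)


# Hypothetical cell counts, for illustration only.
A, B, C, D = 80, 20, 40, 860

sens = sensitivity(A, C)
pvp = predictive_value_positive(A, B)
print(round(sens, 3), round(pvp, 3))
```

Note that cell C (cases missed by surveillance) is usually unknown in practice, so sensitivity has to be estimated from an independent source of case ascertainment.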

TABLE VIII. 3. Comparison of estimated costs for active and passive surveillance systems 
in a health department, Vermont, June 1, 1980, to May 31, 1981 

Type of surveillance system 



Public health nurses 


*Active = Weekly calls from health department to request reports.
†Passive = Provider-initiated reporting.



$ 114 



$ 80 







TABLE VIII. 4. Outline of sample surveillance evaluation report 

1. Public Health Importance

Describe the public health importance of the health event. The three most
important categories to consider are the following:

• Total number of cases, incidence, and prevalence.

• Indices of severity such as the mortality rate and the case-fatality ratio.

• Preventability.

2. Objectives and Usefulness

Explicitly state the objectives of the system and the health event(s) being
monitored (case definitions). Describe the actions that have been taken as a
result of the data from the surveillance system. Describe who has used the data
to make decisions and take actions. List other anticipated uses of the data.

3. System Operation

Describe the following: the population under surveillance, the period of time of
the data collection, the information that is collected, who provides the
information, how the information is transferred and how often, how the data are
analyzed (by whom and how often), how often reports are disseminated, and how
reports are distributed (to whom and in what media). Include an assessment of
the simplicity, flexibility, and acceptability of the system.

4. Quantitative Attributes: Include assessments of the sensitivity, predictive
value positive, representativeness, and timeliness of the system.

5. Cost of Operating the Surveillance System: Estimate direct costs and, if
possible, assess cost-benefit issues.

6. Conclusions and Recommendations

These should state whether the system is meeting its objectives and should
address the issue of whether to continue and/or modify the surveillance system.




TABLE IX. 2. An ethical checklist for public health surveillance 

1. Justify the surveillance system in terms of maximizing potential public health 
benefits and minimizing public harm. 

2. Justify use of identifiers and the maintenance of records with identifiers. 

3. Have surveillance protocols and analytic research reviewed by colleagues, and 
share data and findings with colleagues and the public health community at large. 

4. Elicit informed consent from potential surveillance subjects. 

5. Assure the protection of confidentiality of subjects. 

6. Inform health-care providers of conditions germane to their patients. 

7. Inform the public, the public health community, and clinicians of findings of 

TABLE XII. 1. Reasons cited by physicians for failure to report notifiable diseases 

1. Assumed that the case would be reported by someone else. 

2. Unaware that disease reporting was required. 

3. Do not have notifiable disease reporting form/telephone number. 

4. Do not know how to report notifiable diseases. 

5. Do not have copy of list of notifiable diseases. 

6. Concerned about confidentiality. 

7. Concerned about violation of doctor-patient relationship. 

8. Reporting is too time-consuming. 

9. Absence of incentives to report. 

TABLE XII. 2. What local and state health departments can do to improve reporting by 

Local health departments 

• Express an interest in disease reporting to those responsible for report- 

• Maximize contact with the local medical community. 

Telephone contact 
Mass media 

• Use the data. 

State health departments

• Express an interest in disease reporting to those responsible for report- 

• Maintain a reasonable list of reportable conditions. 

• Maximize contact with the state medical community. 
- - Presentations 

Telephone contact 
Mass media 

• Use the data. 

TABLE XII. 3. Criteria used to set priorities for national disease surveillance, 
Canada (60) 

1. Surveillance by the World Health Organization

2. Importance to agriculture in Canada

3. Disease incidence

4. Morbidity (hospital days and short-term disability)

5. Mortality

6. Case-fatality ratio

7. Communicability

8. Potential for outbreaks

9. Socioeconomic impact

10. Public perception of risk

11. Vaccine preventability

12. Necessity for an immediate public health response

TABLE XII. 4. Confidence intervals for rates (61). 
Let $r$ = rate per 1,000

$n$ = denominator upon which rate is based

The limits of the 95-percent confidence interval are:
upper limit: $r + 61.981\sqrt{r/n}$

lower limit: $r - 61.981\sqrt{r/n}$
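
The limits in Table XII.4 translate into a few lines of Python. The constant 61.981 is consistent with 1.96 × √1,000, which is what a Poisson assumption gives for a rate expressed per 1,000; that derivation is an inference and is not stated in the table. The function name and the example numbers are hypothetical.

```python
import math


def rate_ci_per_1000(r, n):
    """95% confidence limits for a rate r (per 1,000) based on denominator n."""
    half_width = 61.981 * math.sqrt(r / n)
    return r - half_width, r + half_width


# Hypothetical example: 10 events per 1,000 in a population of 100,000.
lower, upper = rate_ci_per_1000(10.0, 100_000)
print(round(lower, 2), round(upper, 2))

# Consistency check on the constant (an inference, see above).
print(round(1.96 * math.sqrt(1_000), 3))
```

The interval narrows with the square root of the denominator, so quadrupling the population halves the width.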

TABLE XII. 5. Formula for calculating the 95% confidence interval for the ratio of two 
independent rates (61) 

Let $r_1$ = rate for period 1 (or area 1)

$d_1$ = number of events for period 1 (or area 1)

$r_2$ = rate for period 2 (or area 2)

$d_2$ = number of events for period 2 (or area 2)

$R = r_1/r_2$

The limits of the 95% confidence interval are:
upper limit: $R + 1.96R\sqrt{1/d_1 + 1/d_2}$

lower limit: $R - 1.96R\sqrt{1/d_1 + 1/d_2}$
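
The formula in Table XII.5 can be sketched the same way. The function name and the example values are hypothetical.

```python
import math


def rate_ratio_ci(r1, d1, r2, d2):
    """Ratio of two independent rates with its 95% confidence limits.

    r1, r2: rates for period (or area) 1 and 2.
    d1, d2: numbers of events on which each rate is based.
    """
    ratio = r1 / r2
    half_width = 1.96 * ratio * math.sqrt(1 / d1 + 1 / d2)
    return ratio, ratio - half_width, ratio + half_width


# Hypothetical example: rate doubles, each rate based on 100 events.
ratio, ratio_lower, ratio_upper = rate_ratio_ci(20.0, 100, 10.0, 100)
print(round(ratio, 2), round(ratio_lower, 2), round(ratio_upper, 2))
```

An interval that excludes 1 suggests that the two rates differ by more than chance alone would explain.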

TABLE XII. 6. Formula for calculating the 95% confidence interval for the difference 
between two independent rates 

Let $r_1$ = rate for period 1 (or area 1)

$n_1$ = denominator upon which $r_1$ is based

$r_2$ = rate for period 2 (or area 2)

$n_2$ = denominator upon which $r_2$ is based

$D = r_1 - r_2$

The limits of the 95% confidence interval are:

upper limit: $D + 61.981\sqrt{r_1/n_1 + r_2/n_2}$

lower limit: $D - 61.981\sqrt{r_1/n_1 + r_2/n_2}$
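
A sketch of the Table XII.6 calculation follows. It assumes that both rates are expressed per 1,000, as in Table XII.4, since the same constant 61.981 appears; the table does not say so explicitly. The function name and the example values are hypothetical.

```python
import math


def rate_difference_ci(r1, n1, r2, n2):
    """Difference of two independent rates (per 1,000) with 95% limits.

    n1, n2: denominators on which r1 and r2 are based.
    """
    diff = r1 - r2
    half_width = 61.981 * math.sqrt(r1 / n1 + r2 / n2)
    return diff, diff - half_width, diff + half_width


# Hypothetical example: 12 vs. 10 per 1,000, each based on 100,000 persons.
diff, diff_lower, diff_upper = rate_difference_ci(12.0, 100_000, 10.0, 100_000)
print(round(diff, 2), round(diff_lower, 2), round(diff_upper, 2))
```

An interval that excludes 0 suggests a difference between the two rates beyond chance variation.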

Table XIII. 1. Examples of data sources for surveillance in 
developing countries 

I. Case reports 

a. from health stations or hospitals 

b. from sentinel sites 

II. Births and deaths 

a. from hospitals 

b. from sentinel sites 

c. complete ascertainment 

III. Laboratory reports (usually from hospitals) 

IV. Sample surveys (particularly cluster surveys) 

Table XIII. 2. Health problems ranked according to preventability and 
treatability, Thailand, 1987 






(H-M-L) ** 





























Acute diarrhea 







































Peptic Ulcer 






Source: "Review of the Health Situation in Thailand: Priority Ranking of
Diseases."

* Rated on a scale of 4 (low) to 16 (high) 
**H=high, M=medium, L=low 

Table XIII. 3. Examples of objectives linked to surveillance components that 
will measure objectives 

Surveillance- linked objectives 

Surveillance component 
that measures objective 

Priority area #1--Diarrhea 

Health status--Reduce diarrhea mortality by 
25% by 1995 

• Risk factor—Increase female literacy of 
10- to 14-year-olds to 80% by 1995 

• Health activity—Increase to 90% the 
proportion of 0- to 4-year-olds given 
appropriate home fluids by 1995 

Vital-event registration 
in five sentinel areas 

Regularly conducted 

Regularly conducted 
health survey 
Local—exit interviews 

Priority area #2 --Measles 

Health status— Reduce measles mortality by 
25% by 1995 

• Health status—Reduce number of reported 
measles cases by 50% by 1995 compared with 

• Health activity--Increase percentage of 12-
to 23-month-olds with one dose of measles
vaccine to 90% nationwide

• Health activity--Increase to 80% the
percentage of districts with one-dose
measles vaccination coverage of 12- to 23-
month-olds of 90%

Vital-event registration 
in five sentinel areas 

National disease- 
reporting system 

• Regularly conducted 
health survey 

Exit interviews of 
mothers of 50 12- to 23- 
month-olds at all health 
facilities in district 
twice a year 

Priority area #5--HIV/AIDS 

Health status—Stabilize at 10% the 
proportion of 20- to 25-year-old women who 
have babies at the capital city hospital 
and who are HIV-positive by 1993 

Sentinel HIV testing of
20- to 25-year-old women
who have babies in
capital city

Health status--No increase in the 2% HIV 
seroprevalence of rural women who have 
babies that are HIV-positive by 1993 

Sentinel HIV testing of 
women having babies in 
capital city 

Risk factor—Reduction of HIV-risk taking 
behavior by 50% in 1994 in areas with HIV 
seroprevalence of STD patients >10% (an 
indicator of entrance of HIV into 

Reporting of clinical 
chancroid through the 
national disease- 
reporting system 

Laboratory- -Syphilis 
serology testing of 20- 
to 25-year-old women 
having babies in 
affected areas 

Exit interviews in 
affected areas 

Health activity--Increase to 75% the 
percentage of sexual contacts whose 
partners are not spouses who also use 
condoms by 1995 in areas with HIV 
seroprevalence of STD patients >10% 

Nationwide only-- 
health survey 

Exit interviews in 
affected areas 

Nationwide only-- 
health survey 













FIGURE 1.1. Reported cases of congenital syphilis among infants <1 year of age and rates 
of primary and secondary (P&S) syphilis among women — United States, 1970-1991 




[Graph not reproduced. X-axis: year, 1970-1991.]


Note: The surveillance case definition for congenital syphilis changed in 1989. 
Source: Centers for Disease Control. 






FIGURE 1.2. Salmonella rates in New Hampshire and contiguous states, by county 

Cases per 100,000 

.01 - .80* 


Unshaded counties=no cases reported 

FIGURE 1.3. Homicide rate, by age and gender of victim, United States, 1986 



















FIGURE 1.4. Malaria rates, by year— United States, 1930-1988 

[Graph not reproduced. X-axis: year, 1930-1988. Annotations: Relapses of imported malaria; Relapses from Korean veterans; Returning Vietnam veterans.]


FIGURE 1.5. Reported cases of measles, by age group, United States, 1980-1982* 














* Rates estimated by extrapolating age, from reported case-patients with known age. 

FIGURE 1.6. Semi-logarithmic-scale line graph of reported cases of paralytic 
poliomyelitis— United States, 1951-1989 
















FIGURE 1.7. Percentage of reported cases of gonorrhea caused 
by antibiotic-resistant strains— United States, 1980-1990 



FIGURE 1.8. Cesarean deliveries as a percentage of all deliveries in U.S. hospitals, 
by year, 1970-1990 

[Graph not reproduced. X-axis: year, 1970-1990.]


FIGURE V. 1 . Crude, gender-specific and gender-race-specific 
cases of primary and secondary syphilis — United 
States, 1981-1990, comparison of differential trends 

[Graphs not reproduced. X-axis: year, 1981-1990. Series: Black male, Black female, White male, White female.]


FIGURE V.2. Dot plot of results of swine influenza virus (SIV) 
hemagglutination-inhibition (HI) antibody testing among 
exposed and unexposed swine exhibitors — Wisconsin, 1988 












[Dot plot not reproduced. X-axis: swine exhibitors (unexposed, exposed).]

FIGURE V.3. Ordered data series and stem-and-leaf display of 39 4-week totals of reported cases 
of meningococcal infections-United States, 1987-1989 

1987: 226, 307, 350, 236, 222, 258, 197, 167, 138, 108, 191, 190, 201 

1988: 216, 238, 331, 270, 265, 156, 164, 142, 112, 111, 153, 138, 159

1989: 145, 306, 314, 264, 222, 195, 155, 149, 102, 117, 174, 158, 159 



























In this example the first two digits of each datum serve as the stem and the third digit serves as a 
leaf, e.g., for the numbers 264 and 265, the stem and leaves appear as 26 (stem) and 45 (leaves). 
Since further division of the stems would result in an attenuated distributional shape, each stem 
represents a range of 20 numbers, e.g., the stem 26 represents any number from 260 to 279 so that 
for the number 270, the stem and leaf appear as 26 (stem) and 0 (leaf).
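
The stem-and-leaf rule described above (two-digit stems, each covering a range of 20, with the final digit as the leaf) can be sketched in Python. The function name and the `stem_width` parameter are hypothetical, and the order of leaves within a stem is a choice made here (ascending data order); the printed display may arrange them differently. The value "1 11" in the 1988 series is read as 111.

```python
from collections import defaultdict

# 39 four-week totals from Figure V.3 ("1 11" in the scan read as 111).
cases = [
    226, 307, 350, 236, 222, 258, 197, 167, 138, 108, 191, 190, 201,  # 1987
    216, 238, 331, 270, 265, 156, 164, 142, 112, 111, 153, 138, 159,  # 1988
    145, 306, 314, 264, 222, 195, 155, 149, 102, 117, 174, 158, 159,  # 1989
]


def stem_and_leaf(values, stem_width=20):
    """Group values by stem; each stem covers `stem_width` numbers."""
    display = defaultdict(list)
    for v in sorted(values):
        # 264 -> 260 -> stem 26; 270 also falls in 260-279 -> stem 26.
        stem = (v // stem_width) * stem_width // 10
        leaf = v % 10
        display[stem].append(leaf)
    return dict(display)


display = stem_and_leaf(cases)
for stem in sorted(display):
    print(f"{stem:>2} | {''.join(str(leaf) for leaf in display[stem])}")
```

For 264, 265, and 270 the sketch yields stem 26 with leaves 4, 5, and 0, as in the worked example in the text.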

FIGURE V.4. Scatter plot of 39 4-week totals of reported cases of 
meningococcal infections — United States, 1987-1989 



[Scatter plot not reproduced. X-axis: 1987-1989 (4-week periods 1-39).]

FIGURE V.5. Box plot of 39 4-week totals of reported cases of 

meningococcal infections-United States, 1987-1989 

[Box plot not reproduced.]


FIGURE V.6. Histogram (epidemic curve) of reported cases of 
paralytic poliomyelitis — Oman, January 1988-March 1989 





[Histogram not reproduced. Each square = 1 case. X-axis: date of onset.]





FIGURE V.7. Sample cumulative attack rate, by grade in 
school and time of onset — North Carolina, 1985 

[Graph not reproduced. Series: eighth grade, seventh grade, sixth grade, fifth grade, fourth grade. X-axis: time period.]







FIGURE V.8. Survival curves over time, based on serum 
testosterone level, Eastern Cooperative Oncology Group 



[Survival curves not reproduced. Prognostic group: Best, Worst, Other. X-axis: weeks from randomization.]

FIGURE V.9. Frequency polygon of reported 
cases of encephalitis — United States, 1965 








FIGURE V.10. Group bar chart of case-fatality rates from ectopic 
pregnancy, by age group and race — United States, 1970-1987 


















[Bar chart not reproduced. Series: White; Black and other. X-axis: age group (years): 15-19, 20-24, 25-29, 30-34, 35-39, 40-44.]

FIGURE V. 1 1 . Stacked bar chart of underlying causes of infant mortality, 
by racial/ethnic group and age at death — United States, 1983 

Birth defects

Low birth weight/prematurity/
respiratory distress syndrome

Sudden infant death syndrome


Black American Hispanic Asian White Total 


FIGURE V.12. Deviation bar chart of notifiable disease reports, comparison 

of 4-week totals ending May 23, 1992, with historical data — United States 

Cases current 
Disease Decrease Increase 4 weeks 

Aseptic meningitis • 
Encephalitis (primary) ■ 
Hepatitis A 
Hepatitis B 
Hepatitis, non-A, non-B ■ 
Hepatitis (unspecified) 
Legionellosis ■ 
Malaria ■ 
Measles (total) 
Meningococcal infections ■ 
Mumps ■ 
Pertussis ■ 
Rabies (animal) ■ 
Rubella ■ 

0.125 .25 .5 1 

Ratio (log scale)* 

Beyond historical limits. 

* Ratio of current 4-week total to the mean of 15 4-week totals (from previous, comparable, and subsequent 
4-week periods for the past 5 years). The point where the hatched area begins is based on the mean and 
two standard deviations of these 4-week totals. 
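
The footnote describes the comparison behind Figure V.12: the current 4-week total is divided by the mean of 15 historical 4-week totals, and the limits are set at the mean plus or minus two standard deviations of those totals. A minimal Python sketch follows. The use of the sample standard deviation is an assumption (the footnote does not say which estimator is used), and the function name and all counts are hypothetical.

```python
import statistics


def historical_limits(current_total, historical_totals):
    """Compare a current 4-week total with its historical baseline.

    Returns (ratio, lower limit, upper limit, beyond_limits), with the
    limits expressed on the same ratio scale as the current total.
    """
    mean = statistics.mean(historical_totals)
    sd = statistics.stdev(historical_totals)   # sample SD: an assumption
    ratio = current_total / mean
    lower = (mean - 2 * sd) / mean
    upper = (mean + 2 * sd) / mean
    beyond_limits = ratio < lower or ratio > upper
    return ratio, lower, upper, beyond_limits


# Hypothetical baseline: 15 four-week totals (3 periods x 5 years).
history = [90, 110] * 7 + [100]
ratio, lower, upper, beyond = historical_limits(130, history)
print(round(ratio, 2), round(lower, 2), round(upper, 2), beyond)
```

A ratio outside the limits corresponds to a bar that extends into the hatched area of the figure.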
















FIGURE V.13. Pie charts of poliomyelitis vaccination status of children ages 1-4 years 
in cities with populations ^250,000, by financial status — United States, 1969 




Adequately vaccinated: 3+ doses inactivated poliovirus vaccine (IPV) and/or 
3 doses oral poliovirus vaccine (OPV). 

Inadequately vaccinated: Some poliovirus vaccine, but < 3 doses of IPV 
and/or < 3 doses of OPV. 

Not vaccinated: No vaccine given. 

FIGURE V.14. Spot map of deaths from smallpox— California, 1915-1924 


FIGURE V. 1 5. Chloropleth map of confirmed and presumptive cases of 
St. Louis encephalitis, by county — Florida, 1990* 

No cases
1-5 cases
6-10 cases
>10 cases

* As of October 17, 1990. 

Indian River 


t f& 

FIGURE V.16. Density-equalizing map of California (based upon 
population density), depicting deaths from smallpox, 1915-1924 

FIGURE VI.l. Example: Data used for report published during week 20 (May 23, 1992) 




















"Current" 4 weeks 

* For example, $X_0$ is the total of cases reported for weeks 16-19, 1992.

FIGURE VIII.1. National Notifiable Diseases Surveillance System

report - 

State-authorized sources for case
reporting, e.g., physicians, laboratories,
infection-control practitioners, school
nurses et al. telephone case reports to
local health department.

Follow-up information is collected, written case 
reports are completed, and reports are sent to 
state health agency. 

Prescribed case data entered into computer. 

File of line listing transmitted 
electronically to CDC. 






l I 

Retrieved data file stored in mainframe 
computer files, from which output 
is generated. 






maps to 







FIGURE VIII.2. Biases in surveillance

Case ascertainment bias 

information bias 
(Data about the case) 

Population under surveillance 
— I 


- , — i — , 

Reported Not reported 
(true positive) (false negative) 


, — i — , 

Reported Not reported 
(false positive) (true negative) 

I 1 1 

Present Present Absent 
(correct) (incorrect) 

I 1 1 

Present Present Absent 
(correct) (incorrect) 

FIGURE XII. 1. Cartoon depicting mumps as a public health problem, Tennessee 


FIGURE XIII.A.l. Percentage of case-patients vaccinated (PCV) per percentage 
of population vaccinated (PPV) for seven values of vaccine efficacy (VE) 
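
Curves of this kind are commonly generated from the screening relation PCV = (PPV − PPV × VE) / (1 − PPV × VE), with all three quantities expressed as proportions. That relation is supplied here as the standard formula and is not printed with the caption; the efficacy values in the sketch are illustrative and are not claimed to be the seven values plotted in the figure.

```python
def pcv(ppv, ve):
    """Proportion of case-patients vaccinated.

    ppv: proportion of the population vaccinated (0 to 1).
    ve: vaccine efficacy (0 to 1); ppv and ve must not both equal 1.
    """
    return (ppv - ppv * ve) / (1 - ppv * ve)


# Illustrative efficacy values only.
for ve in (0.5, 0.8, 0.9, 0.95):
    row = [round(pcv(ppv, ve) * 100, 1) for ppv in (0.2, 0.5, 0.9)]
    print(f"VE {ve:.0%}: PCV at PPV 20%, 50%, 90% = {row}")
```

Even with a highly effective vaccine, a large share of case-patients will have been vaccinated once coverage is high, which is why PCV has to be read against PPV rather than alone.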




Reproduced by NTIS 

National Technical Information Service 
U.S. Department of Commerce 
Springfield, VA 22161 



WA 105 P9355 1992 

Principles and practice of 
public health surveillance 

\u fc n 

rv i ) 

\' ■ jjn 

Parklawn Health Library 
U.S. Public Health Service 
Parklawn Bldg. - Rm.13-12 
5600 Fishers Lane 
Rockville, Maryland 20857 
