PB93-101129

1994

PRINCIPLES AND PRACTICE OF PUBLIC
HEALTH SURVEILLANCE

CENTERS FOR DISEASE CONTROL
ATLANTA, GA

AUG 92

U.S. DEPARTMENT OF COMMERCE
National Technical Information Service

Principles and Practice

of Public Health

Surveillance

Steven M. Teutsch

R. Elliott Churchill

Editors

U.S. DEPARTMENT OF HEALTH & HUMAN SERVICES

Public Health Service

Epidemiology Program Office
Centers for Disease Control

August 1992

Use of trade names is for identification only

and does not constitute endorsement by the

Public Health Service or the Centers for Disease Control.

REPORT DOCUMENTATION PAGE

Form Approved
OMB No. 0704-0188

Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources,
gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this
collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson
Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503.

PB93-101129

4. TITLE AND SUBTITLE

2. REPORT DATE

August 1992

3. REPORT TYPE AND DATES COVERED

Final

Principles and Practice of Surveillance

6. AUTHOR(S)

Teutsch, Steven M. and
Churchill, R. Elliott, Editors

7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

Epidemiology Program Office
Centers for Disease Control
Mail stop C08
Atlanta, GA 30333

9. SPONSORING /MONITORING AGENCY NAME(S) AND ADDRESS(ES)

Epidemiology Program Office
Centers for Disease Control
Mailstop C08
Atlanta, GA 30333

5. FUNDING NUMBERS

None

8. PERFORMING ORGANIZATION
REPORT NUMBER

None

10. SPONSORING /MONITORING
AGENCY REPORT NUMBER

None

11. SUPPLEMENTARY NOTES

Oxford University Press has expressed interest in obtaining this material when it
has been placed in the public domain, with a view to publishing it.

12a. DISTRIBUTION /AVAILABILITY STATEMENT

12b. DISTRIBUTION CODE

13. ABSTRACT (Maximum 200 words)

Public health surveillance is the systematic, ongoing assessment of the health of a
community including the timely collection, analysis, interpretation, dissemination
and subsequent use of data. The book presents an organized approach to planning,
developing, implementing, and evaluating public health surveillance systems. Chapters
include: planning; data sources; system management and data quality control; analyzing
surveillance data; special statistical issues; communication; evaluation; ethical
issues; legal issues; use of computers; state and local issues; and surveillance in
developing countries. The book is intended to serve as a desk reference for public
health practitioners and as a text for students in public health.

REPRODUCED BY

US DEPARTMENT OF COMMERCE
NATIONAL TECHNICAL INFORMATION SERVICE
SPRINGFIELD, VA 22161

14. SUBJECT TERMS

Public Health; Public Health Surveillance; Epidemiology; Disease
Control; epidemics; Communicable Disease; Infectious Disease;
Surveillance

17. SECURITY CLASSIFICATION
OF REPORT

unclassified

NSN 7540-01-280-5500

18. SECURITY CLASSIFICATION
OF THIS PAGE

unclassified

19. SECURITY CLASSIFICATION
OF ABSTRACT

15. NUMBER OF PAGES

16. PRICE CODE

unclassified

20. LIMITATION OF ABSTRACT

Standard Form 298 (Rev 2-89)
Prescribed by ANSI Std Z39-18
298-102

TABLE OF CONTENTS

LIST OF CONTRIBUTORS
PREFACE

CHAPTER I: INTRODUCTION

CHAPTER II: PLANNING A SURVEILLANCE SYSTEM

CHAPTER III: SOURCES OF ROUTINELY COLLECTED DATA FOR SURVEILLANCE

CHAPTER IV: MANAGEMENT OF THE SURVEILLANCE SYSTEM AND QUALITY CONTROL OF DATA

CHAPTER V: ANALYZING AND INTERPRETING SURVEILLANCE DATA

CHAPTER VI: SPECIAL ANALYTIC ISSUES

CHAPTER VII: COMMUNICATING INFORMATION FOR ACTION

CHAPTER VIII: EVALUATING PUBLIC HEALTH SURVEILLANCE

CHAPTER IX: ETHICAL ISSUES

CHAPTER X: PUBLIC HEALTH SURVEILLANCE AND THE LAW

CHAPTER XI: COMPUTERIZING PUBLIC HEALTH SURVEILLANCE SYSTEMS

CHAPTER XII: STATE AND LOCAL ISSUES IN SURVEILLANCE

CHAPTER XIII: IMPORTANT SURVEILLANCE ISSUES IN DEVELOPING COUNTRIES

TABLES AND FIGURES

Contributors to this book:

STAFF AT THE CENTERS FOR DISEASE CONTROL:

Willard Cates, Jr., M.D., M.P.H.
Director, Division of Training
Epidemiology Program Office

R. Elliott Churchill, M.A.
Technical Publications Writer-Editor
Office of the Director
Epidemiology Program Office

Andrew G. Dean, M.D., M.P.H.
Chief, System Development and Support Branch
Division of Surveillance and Epidemiology
Epidemiology Program Office

Robert F. Fagan, B.S.
Systems Analyst
Division of Surveillance and Epidemiology
Epidemiology Program Office

Norma P. Gibbs, B.S.
Chief, Systems Operation and Information Branch
Division of Surveillance and Epidemiology
Epidemiology Program Office

Richard A. Goodman, M.D., M.P.H.
Assistant Director
Epidemiology Program Office

Robert A. Hahn, Ph.D., M.P.H.
Medical Epidemiologist
Division of Surveillance and Epidemiology
Epidemiology Program Office

Robert J. Howard, B.A.
Public Affairs Officer
Office of Public Affairs
Office of the CDC Director

Douglas N. Klaucke, M.D., M.P.H.
Chief, International Branch
Division of Field Epidemiology
Epidemiology Program Office

Carol M. Knowles, B.S.
Programmer Analyst
Division of Surveillance and Epidemiology
Epidemiology Program Office

Gene W. Matthews, J.D.
Legal Advisor to CDC
Office of the CDC Director

Mac W. Otten, Jr., M.D., M.P.H.
Medical Epidemiologist
Division of Immunization
National Center for Prevention Services

Barbara J. Panter-Connah
Epidemiology Program Specialist
Division of Surveillance and Epidemiology
Epidemiology Program Office

Nancy E. Stroup, Ph.D.
Epidemiologist
Division of Surveillance and Epidemiology
Epidemiology Program Office

Donna F. Stroup, Ph.D.
Director, Division of Surveillance and Epidemiology
Epidemiology Program Office

Steven M. Teutsch, M.D., M.P.H.
Special Assistant to the Director
Epidemiology Program Office

Stephen B. Thacker, M.D., M.Sc.
Director
Epidemiology Program Office

Melinda Wharton, M.D., M.Sc.
Medical Epidemiologist
Division of Immunization
National Center for Prevention Services

G. David Williamson, Ph.D.
Chief, Statistics and Analytic Methods Branch
Division of Surveillance and Epidemiology
Epidemiology Program Office

Matthew M. Zack, M.D.
Medical Epidemiologist
Division of Chronic Disease Control and Community Intervention
National Center for Chronic Disease Prevention and Health Promotion

STAFF IN OTHER AGENCIES:

Patrick L. Remington, M.D., M.P.H.
Chronic Disease Epidemiologist
Wisconsin Department of Health and Social Services (Madison)

Richard L. Vogt, M.D.
State Epidemiologist
Hawaii Department of Health (Honolulu)

Kevin M. Sullivan, Ph.D., M.P.H., M.H.A.
Assistant Professor
Division of Epidemiology
Emory University (Atlanta)

PREFACE

Since public health surveillance undergirds public health practice, it is unfortunate that no single
resource has been available to provide a guide to the underlying principles and practice of
surveillance. In recent years, a small number of courses on surveillance at schools of public
health have been developed in recognition of the importance of surveillance, but no definitive
textbook has appeared. Principles and Practice of Public Health Surveillance is intended to serve
as a desk reference for those actively engaged in public health practice and as a text for students
of public health.

The book is organized around the science of surveillance, i.e., the basic approaches to planning,
organizing, analyzing, interpreting, and communicating surveillance information in the context of
contemporary society and public health practice. Surveillance provides the information base for
public health decision making. It must continually respond to the need for new information, such as
about chronic diseases, occupational and environmental health, injuries, risk factors, and emerging
health problems. It must also accommodate changing priorities. Issues such as long latency,
migration, low frequencies, and the need for local data must be addressed. New analytic methods
and rapidly evolving technologies present new opportunities and create new demands. This book
addresses many of these issues. Although many examples of surveillance systems are included, this
is not intended to be a manual for establishing surveillance for any particular condition. We
believe that this approach will provide the reader with ideas and concepts that can be adapted to
her or his particular needs.

This book grew out of a recognition by the Surveillance Coordination Group at the Centers for
Disease Control of the need to capture the art as well as the science of surveillance. Most of the
authors are current or former staff in the Epidemiology Program Office at the Centers for Disease
Control. These friends and colleagues have drawn on their own experience in surveillance in states,
a diversity of federal programs, and in international health, as well as having provided an
interweaving of the experience of others. We felt that the risks of being parochial were outweighed
by the desirability of producing a consistent and systematic coverage of the subject. Although most
examples are drawn from the United States, they illustrate basic principles and approaches that can
be applied in a wide variety of settings around the world.

We would like to acknowledge Douglas Klaucke, who pulled together many of the initial thoughts on
organizing the book, and Stephen Thacker, the Director of the Epidemiology Program Office (EPO), and
Donna Stroup, Director of the Division of Surveillance and Epidemiology, for their continued support and
encouragement. We also acknowledge with gratitude the creative guidance and constructive criticism
provided by EPO's Assistant Director for Science, Edwin Kilbourne. Finally, and most importantly of
all, we gratefully recognize the expertise, the dedication, and the commitment of all the authors in
assuring that this book became a reality.

SMT Atlanta, Georgia

REC August 1992

Chapter I

Introduction

Stephen B. Thacker

"If you don't know where you're going, any road will get you there."

Lewis Carroll

Public health surveillance is the ongoing systematic collection, analysis, and
interpretation of outcome-specific data for use in the planning, implementation, and
evaluation of public health practice (2). A surveillance system includes the
functional capacity for data collection and analysis, as well as the timely
dissemination of these data to persons who can undertake effective prevention and
control activities. While the core of any surveillance system is the collection,
analysis, and dissemination of data, the process can be understood only in the context
of specific health outcomes.

BACKGROUND

The idea of observing, recording, and collecting facts, analyzing them and considering
reasonable courses of action stems from Hippocrates (2). The first real public health
action that can be related to surveillance probably occurred during the period of
bubonic plague, when public health authorities boarded ships in the port near the
Republic of Venice to prevent persons ill with plague-like illness from disembarking
(3). Before a large-scale organized system of surveillance could be developed,
however, certain prerequisites needed to be fulfilled. First, there had to be some
semblance of an organized health-care system in a stable government; in the Western
world, this was not achieved until the time of the Roman Empire. Second, a
classification system for disease and illness had to be established and accepted,
which only began to be functional in the 17th century with the work of Sydenham.
Finally, adequate measurement methods were not developed until that time.

Current concepts of public health surveillance evolve from public health activities
developed to control and prevent disease in the community. In the late Middle Ages,
governments in Western Europe assumed responsibilities for both health protection and
health care of the population of their towns and cities (4). A rudimentary system of
monitoring illness led to regulations against polluting streets and public water,
instructions for burial and food handling, and the provision of some types of care
(5). In 1766, Johann Peter Frank advocated a more comprehensive form of public health
surveillance with the system of police medicine in Germany. It covered school health,
injury prevention, maternal and child health, and public water and sewage (4). In
addition, he delineated governmental measures to protect the public's health.

The roots of analysis of surveillance data can also be traced to the 17th century. In
the 1680s, von Leibnitz called for the establishment of a health council and the
application of a numerical analysis in mortality statistics to health planning (2).
About the same time in London, John Graunt published a book, Natural and Political
Observations Made Upon the Bills of Mortality, in which he attempted to define the
basic laws of natality and mortality. In his work, Graunt developed some fundamental
principles of public health surveillance, including disease-specific death counts,
death rates, and the concept of disease patterns. In the next century, Achenwall
introduced the term "statistics," and over the next several decades vital statistics
became more widespread in Europe. Nearly a century later, in 1845, Thurnam published
the first extensive report of mental health statistics in London.

Two prominent names in the development of the concepts of public health surveillance
activities are Lemuel Shattuck and William Farr. Shattuck's 1850 report of the
Massachusetts Sanitary Commission was a landmark publication that related death,
infant and maternal mortality, and communicable diseases to living conditions.
Shattuck recommended a decennial census, standardization of nomenclature of causes of
disease and death, and a collection of health data by age, gender, occupation,
socioeconomic level, and locality. He applied these concepts to program activities in
immunization, school health, smoking, and alcohol abuse, and introduced these concepts
into the teaching of preventive medicine.

William Farr (1807-1883) is recognized as one of the founders of modern concepts of
surveillance (6). As superintendent of the statistical department of the Registrar
General's office of England and Wales from 1839 to 1879, Farr concentrated his efforts
on collecting vital statistics, on assembling and evaluating those data, and on
reporting both to responsible health authorities and to the general public.

In the United States, public health surveillance has focused historically on
infectious disease. Basic elements of surveillance were found in Rhode Island in
1741, when the colony passed an act requiring tavern keepers to report contagious
disease among their patrons. Two years later, the colony passed a broader law
requiring the reporting of smallpox, yellow fever, and cholera (7).

National disease monitoring activities did not begin in the United States until 1850
when mortality statistics based on death registration and the decennial census were
first published by the Federal Government for the entire United States (8).
Systematic reporting of disease in the United States began in 1874 when the
Massachusetts State Board of Health instituted a voluntary plan for weekly reporting
by physicians on prevalent diseases, using a standard postcard-reporting
format (9,10). In 1878, Congress authorized the forerunner of the Public Health
Service (PHS) to collect morbidity data for use in quarantine measures against such
pestilential diseases as cholera, smallpox, plague, and yellow fever (11).

In Europe, compulsory reporting of infectious diseases began in Italy in 1881 and
Great Britain in 1890. In 1893, Michigan became the first U.S. jurisdiction to
require the reporting of specific infectious diseases. Also in 1893, a law was
enacted to provide for the collection of information each week from state and
municipal authorities throughout the United States (12). By 1901, all state and
municipal laws required notification (i.e., reporting) to local authorities of
selected communicable diseases such as smallpox, tuberculosis, and cholera. In 1914,
PHS personnel were appointed as collaborating epidemiologists to serve in state health
departments to telegraph weekly disease reports to the PHS.

In the United States, it was not until 1925, however, following markedly increased
reporting associated with the severe poliomyelitis epidemic in 1916 and the influenza
pandemic in 1918-1919, that all states had begun participating in national morbidity
reporting (13). A national health survey of U.S. citizens was first conducted in
1935. After a 1948 PHS study led to the revision of morbidity reporting procedures,
the National Office of Vital Statistics assumed the responsibility for morbidity
reporting. In 1949, weekly statistics that had appeared for several years in Public
Health Reports began being published by the National Office of Vital Statistics. In
1952, mortality data were added to the publication that was the forerunner of the
Morbidity and Mortality Weekly Report (MMWR). As of 1961, the responsibility for this
publication and its content was transferred to the Communicable Disease Center (now,
Centers for Disease Control [CDC]).

In the United States, the authority to require notification of cases of disease
resides in the respective state legislatures. In some states, authority is enumerated
in statutory provisions; in other states, authority to require reporting has been
given to state boards of health; still other states require reports both under
statutes and health department regulations. Variation among states also exists among
conditions and diseases to be reported, time frames for reporting, agencies to receive
reports, persons required to report, and conditions under which reports are required
(14).

The Conference (now Council) of State and Territorial Epidemiologists (CSTE) was
authorized in 1951 by its parent body, the Association of State and Territorial Health
Officials, to determine what diseases should be reported by states to the Public Health
Service and to develop reporting procedures. CSTE meets annually and, in
collaboration with CDC, recommends to its constituent members appropriate changes in
morbidity reporting and surveillance, including what diseases should be reported to
CDC and published in the MMWR.

DEVELOPMENT OF THE CONCEPT OF SURVEILLANCE

Until 1950, the term "surveillance" was restricted in public health practice to
monitoring contacts of persons with serious communicable diseases such as smallpox, to
detect early symptoms so that prompt isolation could be instituted (15). The critical
demonstration in the United States of the importance of a broader, population-based
view of surveillance was made following the Francis Field Trial of poliomyelitis
vaccine in 1955 (16,17). Within 2 weeks of the announcement of the results of the
field trial and initiation of a nationwide vaccination program, six cases of paralytic
poliomyelitis were reported through the notifiable-disease reporting system to state
and local health departments; this surveillance led to an epidemiologic
investigation, which revealed that these children had received vaccine produced by a
single manufacturer. Intensive surveillance and appropriate epidemiologic
investigations by federal, state, and local health departments found 141
vaccine-associated cases of paralytic disease, 80 of which represented family contacts of
vaccinees. Daily surveillance reports were distributed by CDC to all persons involved
in these investigations. This national common-source epidemic was ultimately related
to a particular brand of vaccine that had been contaminated with live poliovirus. The
Surgeon General requested that the manufacturer recall all outstanding lots of vaccine
and directed that a national poliomyelitis program be established at CDC. Had the
surveillance program not been in existence, many and perhaps all vaccine manufacturers
would have ceased production.

In 1963, Langmuir limited use of the term "surveillance" to the collection, analysis,
and dissemination of data (18). This construct did not encompass direct
responsibility for control activities. In 1965, the Director General of the World
Health Organization (WHO) established the epidemiological surveillance unit in the
Division of Communicable Diseases of WHO (19). The Division Director, Karel Raska,
defined surveillance much more broadly than Langmuir, including "the epidemiological
study of disease as a dynamic process." In the case of malaria, he saw epidemiologic
surveillance as encompassing control and prevention activities. Indeed, the WHO
definition of malaria surveillance included not only case detection, but also
obtaining blood films, drug treatment, epidemiologic investigation, and follow-up
(20).

In 1968, the 21st World Health Assembly focused on national and global surveillance of
communicable diseases, applying the term to the diseases themselves rather than to the
monitoring of individuals with communicable disease (21). Following an invitation
from the Director General of WHO and with consultation from Raska, Langmuir developed
a working paper and in the year prior to the Assembly obtained comments from
throughout the world on the concepts and practices advocated in the paper. At the
Assembly, with delegates from over 100 countries, the working paper was endorsed, and
discussions on the national and global surveillance of communicable disease identified
three main features of surveillance that Langmuir had described in 1963: a) the
systematic collection of pertinent data, b) the orderly consolidation and evaluation
of these data, and c) the prompt dissemination of results to those who need to know--
particularly those in position to take action.

The 1968 World Health Assembly discussions reflected the broadened concepts of
"epidemiologic surveillance" and addressed the application of the concept to public
health problems other than communicable disease (20). In addition, epidemiologic
surveillance was said to imply "...the responsibility of following up to see that
effective action has been taken."

Since that time, a wide variety of health events, such as childhood lead poisoning,
leukemia, congenital malformations, abortions, injuries, and behavioral risk factors
have been placed under surveillance. In 1976, recognition of the breadth of
surveillance activities throughout the world was made evident by the fact that a
special issue of the International Journal of Epidemiology was devoted to surveillance
(22).

SURVEILLANCE IN PUBLIC HEALTH PRACTICE

The primary function of the application of the term "epidemiologic" to surveillance,
which first appeared in the 1960s associated with the new WHO unit of that name, was
to distinguish this activity from other forms of surveillance (e.g., military
intelligence) and to reflect its broader applications. The use of the term
"epidemiologic," however, engenders both confusion and controversy. In 1971, Langmuir
noted that some epidemiologists tended to equate surveillance with epidemiology in its
broadest sense, including epidemiologic investigations and research (25). He found
this "both epidemiologically and administratively unwise," favoring a description of
surveillance as "epidemiological intelligence."

What are the boundaries of surveillance practice? Is "epidemiologic" an appropriate
modifier of surveillance in the context of public health practice? To address these
questions, we must first examine the structure of public health practice. One can
divide public health practice into surveillance; epidemiologic, behavioral, and
laboratory research; service (including program evaluation); and training.

Surveillance data should be used to identify research and service needs, which, in
turn, help to define training needs. Unless data are provided to those who set policy
and implement programs, their use is limited to archives and academic pursuits, and
the material is therefore appropriately considered to be health information rather
than surveillance data. However, surveillance does not encompass epidemiologic
research or service, which are related but independent public health activities that
may or may not be based on surveillance. Thus, the boundary of surveillance practice
excludes actual research and implementation of delivery programs.

Because of this separation, "epidemiologic" cannot accurately be used to modify
surveillance (1). The term "public health surveillance" describes the scope
(surveillance) and indicates the context in which it occurs (public health). It also
obviates the need to accompany any use of the term "epidemiologic surveillance" with a
list of all the examples this term does not cover. Surveillance is correctly--and
necessarily--a component of public health practice, and should continue to be
recognized as such.

PURPOSES AND USES OF PUBLIC HEALTH SURVEILLANCE DATA

Purposes

Public health surveillance data are used to assess public health status, define public
health priorities, evaluate programs, and conduct research. Surveillance data tell
the health officer where the problems are, whom they affect, and where programmatic
and prevention activities should be directed. Such data can also be used to help
define public health priorities in a quantitative manner and to evaluate
the effectiveness of programmatic activities. Results of analysis of public health
surveillance data also enable researchers to identify areas of interest for further
investigation (23).

The analysis of surveillance data is, in principle, quite simple. Data are examined
by measures of time, place, and person. The routine collection of information about
reported cases of congenital syphilis in the United States, for example, reflects not
only numbers of cases (Figure 1.1), geographic distribution, and populations affected,
but also indicates the effects of crack cocaine use and changing sexual practices over
the past 10 years. Routinely collected data can likewise be examined for rates of
salmonellosis by county in New Hampshire and in three contiguous states. Mapping
these data illustrates the pattern of the spread of disease across state boundaries
(Figure 1.2). The examination of death certificates for data on homicide identifies
high-risk groups and shows that the problem has reached epidemic proportions among
young adult men (Figure 1.3).
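
The tabulation by time, place, and person described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the case records, the county names, and the age-group cut points are invented for the example and are not drawn from any surveillance system discussed in this chapter.

```python
from collections import Counter
from datetime import date

# Hypothetical line list of case reports: (report date, county, age in years).
# These records are invented for illustration only.
CASES = [
    (date(1990, 1, 2), "County A", 4),
    (date(1990, 1, 3), "County A", 31),
    (date(1990, 1, 9), "County B", 27),
    (date(1990, 1, 10), "County A", 62),
    (date(1990, 1, 11), "County B", 3),
]


def age_group(age):
    """Assign an age in years to a broad, illustrative age group."""
    if age < 5:
        return "<5"
    if age < 20:
        return "5-19"
    if age < 65:
        return "20-64"
    return "65+"


def tabulate(cases):
    """Count cases by time (ISO week), place (county), and person (age group)."""
    by_time = Counter(d.isocalendar()[1] for d, _, _ in cases)
    by_place = Counter(county for _, county, _ in cases)
    by_person = Counter(age_group(age) for _, _, age in cases)
    return by_time, by_place, by_person


by_time, by_place, by_person = tabulate(CASES)
```

Each of the three counters answers one of the basic descriptive questions: when cases were reported, where they occurred, and whom they affected. Converting such counts to rates would additionally require population denominators, which this sketch does not include.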

Uses

The uses of surveillance are shown in Table 1.1. Portrayal of the natural history of
disease can be illustrated by the surveillance of malaria rates in the United States
since 1930 (Figure 1.4). In the 1940s, malaria was still an endemic health problem in
the southeastern United States to the degree that persons with febrile illness were
often treated for malaria until further tests were available. After the Malaria
Control in the War Areas Program led to the virtual elimination of endemic malaria
from the United States, rates of malaria decreased until the early 1950s, when
military personnel involved in the conflict in Korea returned to the United States
with malaria. The general downward trend in reported cases of malaria continued into
the 1960s until, once again, numbers of cases of malaria rose, this time among
veterans returning from the war in Vietnam. Since that time, we have continued to see
increases in numbers of reported cases of malaria involving immigrant populations, as
well as among U.S. citizens traveling abroad.

Surveillance data can be used also to detect epidemics. For example, during the swine
influenza immunization program in 1976, a surveillance system was established to
detect adverse sequelae related to the program (24). Working with state and local
health departments, CDC was able to detect an epidemic of Guillain-Barré syndrome,
which rapidly led to the termination of a program in which 40,000,000 U.S. citizens
had been vaccinated. However, most epidemics are not detected by such analysis of
routinely collected data but are identified through the astuteness and alertness of
clinicians and public health officials of the community. From a pragmatic point of
view, the key point is that when someone does note an unusual occurrence in the health
picture of a community, the existence of organized surveillance efforts in the health
department provides the infrastructure for conveying information to facilitate a
timely and appropriate response.
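
One simple way to screen routinely collected counts for an unusual occurrence is to compare the current count with a historical baseline. The Python sketch below is an assumption made for illustration, not the method used in the 1976 program or by any system described here: the function name, the threshold of two standard deviations, and the baseline counts are all hypothetical.

```python
from statistics import mean, stdev


def exceeds_baseline(current, baseline, z=2.0):
    """Return True if the current count exceeds mean + z * SD of the baseline counts."""
    threshold = mean(baseline) + z * stdev(baseline)
    return current > threshold


# Hypothetical counts for the same reporting week in five previous years (invented).
BASELINE = [4, 6, 5, 7, 3]

flag_high = exceeds_baseline(12, BASELINE)  # well above the historical range
flag_usual = exceeds_baseline(8, BASELINE)  # within the expected variation
```

A rule of this kind only raises a flag for follow-up. As the text notes, most epidemics are first recognized by alert clinicians and health officials, and any flagged excess still requires investigation before it can be called an epidemic.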

The distribution and spread of disease can be documented from surveillance data, as
seen in the county-specific data on salmonellosis (Figure 1.2). U.S. cancer mortality
statistics have also been mapped at the county level to identify a variety of
geographic patterns that suggest hypotheses on etiology and risk (25). Recognition of
such clusters can lead to further epidemiologic or laboratory research, sometimes
using individuals identified in surveillance as subjects in epidemiologic studies.
The association between the periconceptional use of multivitamins by women and the
development of neural tube defects by their children was documented using children
identified in a surveillance system for congenital malformations (26).

Surveillance data can also be used to test hypotheses. For example, in 1978 the U.S.
Public Health Service announced a measles elimination program that included an active
effort to vaccinate school-age children. Because of this program and the state laws
that excluded from school students who had not been vaccinated, CDC anticipated a
change in the age pattern of persons reported to have measles. Before the initiation
of the program, the highest reported rates of measles were for children 10-14 years of
age. As predicted, almost immediately after the school exclusion policy was
implemented, there was not only a general decrease in the number of cases but also a
shift in peak occurrence from school-age to preschool-age children (Figure 1.5). By
1979, there were even lower levels of measles incidence and altered age-specific
patterns.

Surveillance data can be applied in evaluating control and prevention measures. With
routinely collected data, one can examine--without special studies--the effect of a
health policy. For example, the introduction of inactivated poliovirus vaccine in the
United States in the 1950s was followed by a dramatic decrease in the number of
reported cases of paralytic poliomyelitis, and the subsequent introduction
in the 1960s of oral poliovirus vaccine was followed by an even greater decline
(Figure 1.6).

Efforts to monitor changes in infectious agents have been facilitated by the use of
surveillance data. In the late 1970s, antibiotic-resistant gonorrhea was introduced
into the United States from Asia. Laboratory- and clinical-practice-based
surveillance for cases of gonorrhea enabled public health officials to monitor the
rapid diffusion of various strains of this bacterium nationally and facilitated
prevention activities, including notifying clinicians of proper treatment procedures
(Figure 1.7). Similarly, the National Nosocomial Infections Surveillance System, a
voluntary, hospital-based surveillance system of hospital-acquired infections, has
been used to monitor changes in antibiotic-resistance patterns of infectious agents
associated with hospitalized patients.

As noted earlier, the first use of surveillance was to monitor persons with a view to
imposing quarantine as necessary. Although this use of surveillance is rare in the
modern-day United States, in 1975--with the introduction of a suspected case of Lassa
fever--over 500 potential contacts of the patient were monitored daily for 2 weeks to
assure that secondary spread of this serious infectious agent did not occur (27).

Surveillance data can also be used to good effect for detecting changes in health
practice. The increasing use of various technologies in health care has come to be an
issue of growing concern over the past decade; surveillance data can provide useful
information in this area (28). For example, in the United States since 1965, the rate
of cesarean delivery has increased from <5% to nearly 25% of all
deliveries (Figure 1.8). Data such as these are useful both in planning research to
learn the causes of these changes and in monitoring the impact of such changes in
practice and procedure on outcomes and costs associated with health care.

Finally, surveillance data are useful for planning. With knowledge about changes in
the population structure or in the nature of conditions that might affect a
population, officials can, with more confidence, plan for optimizing available
resources. For example, data on refugees entering the United States from Southeast
Asia in the early 1980s were broadly applicable; they told where people settled,
described the age and gender structure of the population, and identified health
problems that might be expected in that population. With this information, health
officials were able to plan more effectively the appropriate health services and
preventive activities for this new population.

THE FUTURE OF PUBLIC HEALTH SURVEILLANCE

As we approach the year 2000, several activities are expected to contribute to the
evolution of public health surveillance. First, use of the computer--particularly the
microcomputer--has revolutionized the practice of public health surveillance. In the
United States, the National Electronic Telecommunications System for Surveillance
(NETSS) links all state health departments by computer for the routine collection,
analysis, and dissemination of data on notifiable health conditions (29). Over the
next several years, the growth will be within states, with state health departments
being linked to county departments, and possibly even to health-care providers'
offices for routine surveillance. The Minitel system currently in use in France has
already demonstrated the essential utility of office-based surveillance of various
conditions of public health importance (30).

The second area of renewed activity associated with surveillance is that of
epidemiologic and statistical analysis. A by-product of the use of computers is the
ability to make more effective use of sophisticated tools to detect changes in
patterns of occurrence of health problems. In the 1980s, applications and methods of
time series analysis and other techniques have enabled us to provide more meaningful
interpretation of data collected in surveillance efforts (31). More sophisticated
techniques will doubtless continue to be applied in the area of public health as they
are developed.
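
As a rough illustration of how such a tool might flag a change in the pattern of
occurrence, the sketch below compares a current count with the mean and standard
deviation of counts from comparable past periods. The function name, the
two-standard-deviation threshold, and the counts are all assumptions made for
illustration; this is not the method evaluated in reference 31.

```python
from statistics import mean, stdev


def flag_aberration(baseline_counts, current_count, z_threshold=2.0):
    """Compare a current count with a historical baseline.

    baseline_counts: case counts from comparable past periods (hypothetical).
    z_threshold: standard deviations above the baseline mean that are treated
                 as unusual (an assumed cut-off, not a recommended one).
    Returns (is_flagged, z_score).
    """
    if len(baseline_counts) < 2:
        raise ValueError("need at least two baseline periods")
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)  # sample standard deviation
    if sigma == 0:
        # No historical variation: any increase over the baseline is flagged.
        is_flagged = current_count > mu
        return is_flagged, (float("inf") if is_flagged else 0.0)
    z = (current_count - mu) / sigma
    return z > z_threshold, z


# Hypothetical counts for the same reporting period in five earlier years.
baseline = [12, 15, 11, 14, 13]
flagged, z_score = flag_aberration(baseline, 25)
print(flagged, round(z_score, 2))
```

A flagged count is only a prompt for closer review; changes in reporting practice
or case definitions can produce the same signal as a true increase.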

Until recently, surveillance data were traditionally disseminated as written documents
published periodically by government agencies. While paper reports will continue to
be produced, and public health officials will continue to refine the use of print
media, they are also beginning to use electronic media for the dissemination of
surveillance data. More effective use of the electronic media, and all the other
tools of communications, should facilitate the use of surveillance data for public
health practice. At the same time, ready access to detailed information on
individuals will continue to provide ethical and legal concerns that may constrain
access to data of potential public health importance.

The 1990s will see surveillance concepts applied to new areas of public health
practice such as chronic disease, environmental and occupational health, and injury
control. The evolution and development of methods for these programmatic areas will
continue to be a major challenge in public health.

A more fundamental principle that will underlie the ongoing development of
surveillance is the increasing ability of people to look at public health surveillance
as a scientific endeavor (32). A growing appreciation of the need for rigor in
surveillance practice will no doubt improve the quality of surveillance programs and
will therefore facilitate the analysis and use of surveillance data. An important
result of this more vigorous approach to surveillance practice will be the increased
frequency and quality of the evaluation of the practice of surveillance (33).

Finally, and probably most important, is the observation that surveillance needs to be
used more consistently and thoughtfully by policymakers. Epidemiologists not only
need to improve the quality of their analysis, interpretation, and display of data for
public health use, they also need to listen to persons empowered to set policy in
order to understand what stimulates the policymakers' interest and action. This
assessment allows surveillance information to be crafted so that it is presented in
its most useful form to the appropriate audience and in the necessary time frame. In
turn, as we maximize the utility of data for decision making and better understand
what is essential to that process, we will raise the area of public health
surveillance to a new and higher level of importance.

The critical challenge in public health surveillance today, however, continues to be
the assurance of its usefulness. In this effort, we must have rigorous evaluation of
public health surveillance systems. Even more basic is the need to regard
surveillance as a scientific endeavor. To do this properly, one must fully understand
the principles of surveillance and its role in guiding epidemiologic research and
influencing other aspects of the overall mission of public health. Epidemiologic
methods based on public health surveillance must be developed; computer technology for
efficient data collection, analysis, and graphic display must be applied; ethical and
legal concerns must be addressed effectively; the use of surveillance systems must be
reassessed on a routine basis; and surveillance principles must be applied to emerging
areas of public health practice.

REFERENCES

1. Thacker SB, Berkelman RL. Public health surveillance in the United States.
Epidemiol Rev 1988;10:164-90.

2. Eylenbosch WJ, Noah ND. Historical aspects. In: Surveillance in health and
disease. Oxford, England: Oxford University Press, 1988:3-8.

3. Moro ML, McCormick A. Surveillance for communicable disease. In: Eylenbosch
WJ, Noah ND (eds.). Surveillance in health and disease. Oxford, England:
Oxford University Press, 1988:166-82.

4. Hartgerink MJ. Health surveillance and planning for health care in the
Netherlands. Int J Epidemiol 1976;5:87-91.

5. Anonymous (Editorial). Surveillance. Int J Epidemiol 1976;5:3-6.

6. Langmuir AD. William Farr: founder of modern concepts of surveillance. Int J
Epidemiol 1976;5:13-8.

7. Hinman AR. Surveillance of communicable diseases. Presented at the 100th
annual meeting of the American Public Health Association, Atlantic City, New
Jersey, November 15, 1972.

8. Vital statistics of the United States, 1958. Washington, DC: National Office of
Vital Statistics, 1959.

9. Trask JW. Vital statistics: a discussion of what they are and their uses in
public health administration. Public Health Rep 1915;Suppl 12.

10. Bowditch HI, Webster DL, Hoadley JC et al. Letter from the Massachusetts State
Board of Health to physicians. Public Health Rep 1915;Suppl 12:31.

11. Centers for Disease Control. Manual of procedures for national morbidity
reporting and public health surveillance activities. Atlanta, Georgia: Public
Health Service, 1985.

12. Chapin CV. State health organization. JAMA 1916;66:699-703.

13. National Office of Vital Statistics. Reported incidence selected notifiable
disease: United States, each division and state, 1920-50. Vital Statistics
Special Reports (National Summaries). 1953;37:1180-1.

14. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. The reportable
diseases. I. Mandatory reporting of infectious diseases by clinicians. JAMA
1989;262:3018-26.

15. Langmuir AD. Evolution of the concept of surveillance in the United States.
Proc R Soc Med 1971;64:681-9.

16. Langmuir AD, Nathanson N, Hall WJ. Surveillance of poliomyelitis in the United
States in 1955. Am J Public Health 1956;46:75-88.

17. Nathanson N, Langmuir AD. The Cutter incident: poliomyelitis following
formaldehyde-inactivated poliovirus vaccination in the United States during the
Spring of 1955. I. Background. Am J Hyg 1963;78:29-81.

18. Langmuir AD. The surveillance of communicable diseases of national importance.
N Engl J Med 1963;268:182-92.

19. Raska K. National and international surveillance of communicable diseases. WHO
Chron 1966;20:315-21.

20. Report for drafting committee. Terminology of malaria and of malaria
eradication. Geneva, Switzerland: World Health Organization, 1963.

21. National and global surveillance of communicable disease. Report of the
technical discussions at the Twenty-First World Health Assembly. A21/Technical
Discussions/5. Geneva, Switzerland: World Health Organization, May 1968.

22. Int J Epidemiol. 1976;5:3-91.

23. Thacker SB. Les principes et la pratique de la surveillance en santé publique:
l'utilisation des données en santé publique. Santé Publique 1992;4:43-9.

24. Retailliau HF, Curtis AC, Starr G et al. Illness after influenza vaccination
reported through a nationwide surveillance system, 1976-1977. Am J Epidemiol
1980;111:270-8.

25. Mason TJ, Fraumeni JF, Hoover R, Blot WJ. An atlas of mortality from selected
diseases. Washington, D.C.: U.S. Department of Health and Human Services. NIH
Publication No. 81-2397, May 1981.

26. Mulinare J, Cordero JF, Erickson D, Berry RJ. Periconceptional use of
multivitamins and the occurrence of neural tube defects. JAMA 1988;260:3141-5.

27. Zweighaft RM, Fraser DW, Hattwick MAW et al. Lassa fever: response to an
imported case. N Engl J Med 1977;297:803-7.

28. Thacker SB, Berkelman RL. Surveillance of medical technologies. J Pub Health
Pol 1986;7:363-77.

29. Centers for Disease Control. National electronic communications systems for
surveillance--United States, 1990-1991. MMWR 1991;40:502-3.

30. Valleron AJ, Bouvet E, Garnerin et al. A computer network for the surveillance
of communicable diseases: the French experiment. Am J Public Health
1986;76:1289-92.

31. Stroup DF, Wharton M, Kafadar K, Dean AG. An evaluation of a method for
detecting aberrations in public health surveillance data. Am J Epidemiol (In
press).

32. Thacker SB, Berkelman RL, Stroup DF. The science of public health surveillance.
J Public Health Pol 1989;10:187-203.

33. Centers for Disease Control. Guidelines for evaluating surveillance systems.
MMWR 1988;37(Suppl No. S-5):1-20.

Chapter II

Planning a Surveillance System

Steven Teutsch

"Natural laws govern the occurrence of a disease, that these laws can be discovered by
epidemiologic inquiry and that, when discovered, the causes of epidemics admit to a
great extent of remedy."

William Farr

As described earlier, public health surveillance is the systematic and ongoing
assessment of the health of a community, including the timely collection, analysis,
interpretation, dissemination, and subsequent use of data. Surveillance provides
information for action, information with a purpose. Surveillance systems evolve in
response to ever-changing needs of society in general and of the public health
community in particular. In order to understand and meet those needs, an organized
approach to planning, developing, implementing, and maintaining surveillance systems
is imperative. The sections below discuss approaches to planning and evaluation,
which are presented in more detail elsewhere in this book. The
steps in planning a system are shown in Table II.1.

OBJECTIVES OF A SURVEILLANCE SYSTEM

Planning a surveillance system begins with a clear understanding of the purpose of
surveillance, i.e., the answer to the question: "What do you want to know?" In the
context of public health, surveillance may be established to meet a variety of
objectives, including assessment of public health status, establishment of public
health priorities, evaluation of programs, and conduct of research. Surveillance data
can be used in all of the following ways:

to estimate the magnitude of a health problem in the
population at risk

to understand the natural history of a disease or injury

to detect outbreaks or epidemics

to document the distribution and spread of a health event

to test hypotheses about etiology

to evaluate control strategies

to monitor changes in infectious agents

to monitor isolation activities

to detect changes in health practice

to identify research needs and facilitate epidemiologic
and laboratory research

to facilitate planning

Surveillance is inherently outcome oriented and focused on various outcomes associated
with health-related events or their immediate antecedents. These include the
frequency of an illness or injury, usually measured in terms of numbers of cases,
incidence, or prevalence; the severity of the condition, measured as a case-fatality
ratio, hospitalization rate, mortality rate, or disability; and the impact of the
condition, measured in terms of cost. Where risk factors or specific procedures are
incontrovertibly linked to health outcomes, it is often useful to measure the former
because they are often more frequent than the health outcomes (and hence more
precisely ascertainable for small populations) and may be more closely linked to
public health interventions. For example, mammography with suitable follow-up is the
major prevention strategy for reducing mortality associated with breast cancer. The
level of
utilization of mammography by women can be regularly monitored and should be a more
timely indicator of the impact of public health prevention programs than measurement
of mortality from breast cancer. Surveillance data should also provide basic
information on the utilization of mammography services by age and race/ethnicity of
recipient, allowing better targeting of prevention efforts on the population sectors
with the lowest utilization. In addition, over-utilization by some parts of the
population (e.g., women <35 years of age who do not have other risk factors) might
stimulate efforts to reduce unnecessary procedures.
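
The frequency and severity measures named above reduce to simple ratios. The
sketch below is a minimal illustration; the function names, the per-100,000
scaling, and all counts are assumptions chosen for the example rather than values
from any surveillance system.

```python
def incidence_rate(new_cases, population_at_risk, per=100_000):
    """New cases in a period divided by the population at risk, scaled."""
    return new_cases / population_at_risk * per


def prevalence(existing_cases, population, per=100_000):
    """Existing cases at a point in time divided by the population, scaled."""
    return existing_cases / population * per


def case_fatality_ratio(deaths, cases):
    """Proportion of identified cases that died of the condition."""
    return deaths / cases


# Hypothetical community of 150,000 persons.
print(round(incidence_rate(30, 150_000), 1))   # new cases per 100,000
print(round(prevalence(450, 150_000), 1))      # existing cases per 100,000
print(round(case_fatality_ratio(3, 30), 2))    # deaths among cases
```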

High-priority health events should clearly be under surveillance. However,
determining which should be considered high-priority events can be a daunting task.
Both quantitative and qualitative approaches can be used in a selection process. Some
quantitative factors are shown in Table II.2. In addition, criteria based on a
consensus process to identify high-priority problems may identify emerging issues or
problems that might otherwise not be considered. The consensus process leading to the
Year 2000 Health Promotion and Disease Prevention Objectives in the United States is
an example of a mechanism for identifying high-priority conditions, types of behavior,
and interventions that require ongoing monitoring (2).

Because public health surveillance in the United States is driven by the public health
need to be cognizant of diseases and injuries in the community and to respond
appropriately, surveillance is inherently an applied science. Therefore, as
surveillance has evolved, it is generally undertaken only when there is reasonable
expectation that control measures will be taken as appropriate. For many conditions
the link between surveillance and action is obvious (e.g., meningococcal meningitis
prophylaxis for contacts of patients diagnosed as having meningitis). For emerging
conditions, such as eosinophilia-myalgia syndrome, there is a compelling public health
need to identify cases (delineate the magnitude of the problem), identify the mode of
spread, and take appropriate action.

Surveillance data are usually augmented by additional studies to determine more
precisely the causes, natural history, predisposing factors, and modes of transmission
associated with the health problem. Yet, undertaking surveillance exclusively for
research purposes is rarely warranted. Research needs are often better served by
other, more precise (and often more costly) methods of case identification (e.g.,
registries), which facilitate more detailed data collection and tracking of cases.
For example, registries of type I diabetes may have value for surveillance, but are
justified primarily because they fill research needs. The ongoing public health
application of these data is more limited. Scarce public health resources and the
efforts of health-care providers to report cases need to be focused on problems for
which the public health importance and the need for public health action can be
readily recognized.

A primary role of surveillance is the assessment of the overall health status of a
community. One approach to this issue is the development and identification of a set
of indicators that measure major components of health status. Such a set has been
developed in the United States to be used at the national, state, and local levels
(2). Another approach is to examine the most frequent, severe, costly, and
preventable conditions in the community by examining most frequent causes of death,
hospitalization, injury, disability, infection, work-site-associated illness and
injury, and major risk factors for all the preceding items. This information can be
obtained in most communities in terms of age, race/ethnicity, gender, and temporal
trends. Regular assessments of the information can form the basis for educating the
community about its major health problems and for identifying specific conditions that
merit more intensive surveillance and intervention.

The specific objective and purpose of the surveillance system should be specified and
general agreement obtained.

METHODS

Once the purpose of and need for a surveillance system have been identified, methods
for obtaining, analyzing, disseminating, and using the information should be
determined and implemented (see Chapters V, VI, and VII).

Because surveillance systems are ongoing and require the cooperation of many
individuals, careful consideration must be given to the attributes discussed in
Chapter VIII in the discussion on evaluation. The system adopted must be feasible and
acceptable to those who will contribute to its success; it must be sensitive enough to
provide the information required to do the job at hand, while having a high
predictive-value positive to minimize the expenditure of resources on following up
false-positive cases. A surveillance system should be flexible enough to meet the
continually evolving needs of the community and to accommodate changes in patterns of
disease and injury. It must provide information that is timely enough to be acted
upon. All of these considerations must be carefully balanced in order to design a
system that can successfully meet identified needs without becoming excessively costly
or burdensome.

Case Definitions

Practical epidemiology is heavily dependent on clear case definitions that include
criteria for person, place, and time and that are potentially categorized by the
degree of certainty regarding diagnosis as "suspected" or "confirmed" cases (3).

While high sensitivity and specificity are both desirable, generally one comes at the
expense of the other. A balance must be struck between the desire for high
sensitivity and the level of effort required to track down false-positive cases. In
addition, case definitions evolve over time. During periods of outbreaks, cases
epidemiologically linked to the outbreak cases may be accepted as cases, whereas in
non-epidemic periods, serologic or other more specific information may be required.
Similarly, when active surveillance is used, such as in measles control programs,
numbers of cases identified tend to rise.
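
The trade-off described above can be made concrete with two ratios computed from
a validation study. The sketch below is illustrative only: the two case
definitions ("broad" and "strict") and every count are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of true cases that the case definition detects."""
    return true_pos / (true_pos + false_neg)


def predictive_value_positive(true_pos, false_pos):
    """Proportion of reported cases that are true cases."""
    return true_pos / (true_pos + false_pos)


# Hypothetical validation counts for 100 true cases under two definitions.
definitions = {
    "broad (clinical only)": {"tp": 90, "fn": 10, "fp": 60},
    "strict (laboratory-confirmed)": {"tp": 70, "fn": 30, "fp": 5},
}

for name, c in definitions.items():
    se = sensitivity(c["tp"], c["fn"])
    pvp = predictive_value_positive(c["tp"], c["fp"])
    print(f"{name}: sensitivity={se:.2f}, PVP={pvp:.2f}")
```

In this made-up example the broad definition finds more of the true cases but
sends investigators after many more false-positive reports.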

As our understanding of a disease and its associated laboratory testing improves,
alterations in case definitions often lead to changes in sensitivity and specificity.
As new systems complement old ones (e.g., as a morbidity system supplements a
mortality system for injury surveillance), the reported frequency and patterns of
conditions change. These changes must be taken into account in analysis and
interpretation of secular trends in the frequency of reporting. It is all too easy to
define cases of various conditions with such different criteria that it is difficult
to compare the essential descriptors of person, place, or time. For example, in
surveillance of diabetes, one could determine the frequency of diabetes from surveys
(self reports of diabetes), surveys using glucose determination
(laboratory-confirmed), or from reviews of ambulatory or hospital records
(physician-diagnosed). Each method provides a different perspective on the problem.
Self reports are subject
to vagaries of recall and variation in interpretation (patient may be under treatment,
may have "a touch of diabetes" or prediabetes, or may have a history of gestational
diabetes). Glucose determinations allow detection of previously undiagnosed diabetes.
Medical records identify only patients currently receiving medical care.

Case definitions should be specified including criteria for person, place, time,
clinical or laboratory diagnosis, and epidemiologic features.

Data Collection

Information on diseases, injuries, and risk factors can be obtained in many ways.
Each mechanism has characteristics that must be balanced against the purpose of the
system (see Chapter III). Timeliness is of the essence for frequently fatal
conditions such as plague, rabies, or meningococcal meningitis. Notifiable-disease
systems are most appropriate for such potentially catastrophic conditions with high
and urgent preventability constraints. Conversely, detailed information on influenza
strains or Salmonella serotypes must come from laboratory-based systems. Long-term
mortality patterns are available through vital records systems.

Often, existing data sets can provide surveillance data. Such sets include vital
records, administrative systems, and risk-factor or health-interview surveys. Among
administrative systems, hospital-discharge data, medical-management-information and
billing systems, police records for violence, and school records for disabilities or
injuries among children can all provide needed data. In addition, with some
modification, an existing system might provide needed data more economically or
efficiently than a newly initiated system.

Existing registries or surveys may collect information on defined populations. To the
extent that the condition of interest is uniformly distributed, the population under
study is reasonably representative, and the information collected is available on a
timely basis, such systems can be valuable data sources. Although many registries are
established for research purposes, they often provide valuable data for surveillance
purposes. In particular, cancer registries have been widely used (4).

Sentinel providers can also constitute a network for collecting data on common
conditions, such as influenza; more specialized providers can provide data on less
common conditions, e.g., ophthalmologists who provide information on treatment of
patients for diabetic retinopathy.

Standardization

Data-collection instruments should use generally recognized and, where suitable,
computerized formats for each data element to facilitate analysis and comparison with
data collected in other systems, e.g., census and other surveillance data. Careful
consideration should be given to using identifiers. Although additional assurances of
confidentiality and privacy considerations will be required, the ability to link data
to other systems, such as through the National Death Index, may enhance the value of
the system.

Active and passive systems

Primary surveillance-data-collection systems have traditionally been classified as
passive or active. For example, most routine notifiable-disease surveillance relies
on passive reporting. On the basis of a published list of conditions, health-care
providers report notifiable diseases on a case-by-case basis to the local health
department. This passive system has the advantage of being simple and not burdensome
to the health department, but it is limited by variability and incompleteness in
reporting. Although the completeness of reporting may be augmented by efforts to
publicize the importance of reporting and by continued feedback to communications
media representatives, passive reporting systems may still not be representative and
they may fail to identify outbreaks. To obviate these problems, more active systems
are often used for conditions of particular importance. These systems involve regular
outreach to potential reporters to stimulate the reporting of specific diseases or
injuries. Active systems can validate the representativeness of passive reports,
assure more complete reporting of conditions, or be used in conjunction with specific
epidemiologic investigations. Since resources are often limited, active systems are
often used for brief periods for discrete purposes such as during the measles
elimination efforts.

Limited surveillance systems

Some surveillance efforts may not require ongoing systems. Time-limited surveillance
may be needed to address specific problems for which all cases must be
identified in order to assess the level of risk. Such programs can be conducted to
resolve specific problems and then be terminated (5). Similarly, for logistic and
economic reasons, it may not be feasible to mount a surveillance system across large
geographic areas, and representative populations may need to be selected. Sentinel
providers can also provide information on common conditions or conditions of
particular interest to them.

Field testing

The careful development and field testing of surveillance systems and procedures is
important to facilitate the implementation of feasible systems and to avoid making
changes as systems are implemented on a broad scale. The frustration engendered by a
new and poorly executed system may undermine efforts to improve or use existing
systems for the same or other conditions. As new surveillance systems or new
instruments and procedures are developed, field tests of their feasibility and
acceptability are appropriate. These field-test projects can demonstrate how readily
the information can be obtained and can detect difficulties in data-collection
procedures or in the content of specific questions. Analyses of this test information
may also identify problems with the information collected. Model surveillance systems
may facilitate the examination and comparison of a variety of approaches that would
not be feasible on too large a scale and may identify methods suitable for other
conditions or other settings.

The data to be collected by a surveillance system, the data sources and collection
methods, and the procedures for handling the information should be developed and
tested.

Data Analysis

A determination of the appropriate analytic approach to data should be an integral
part of the planning of any surveillance system. The data needed to address the
salient questions must be assessed to assure that the data source or collection
process is adequate. Analyses may prove to be as simple as an ongoing review of all
cases of rare but potentially devastating illnesses, such as plague. For most
conditions, however, an assessment of the crude number of cases and rates is followed
by a description of the population in which the condition occurs (person), where the
condition occurs (place), and the period over which the condition occurs (time).
These basic analyses require decisions as to the kind of information that needs to be
collected. The level of detail required varies substantially from condition to
condition. For instance, one may need more detailed information regarding the
population that is not receiving prenatal care than on the one that is exposed to
meningococcal disease, because the nature of the intervention for the former is likely
to be more complex and require an understanding of socioeconomic factors. Similarly,
how one will collect data on geographic areas may depend on whether the data will be
examined at the county, state, or census-tract level.
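
A basic person, place, and time description amounts to counting case reports by
each descriptor and converting the counts to rates where a population denominator
is available. The sketch below assumes a hypothetical record layout (`age_group`,
`county`, `month`) and invented county populations; it is meant only to show the
shape of such an analysis.

```python
from collections import Counter

# Hypothetical case reports; field names and values are illustrative only.
case_reports = [
    {"age_group": "0-4", "county": "A", "month": "1992-01"},
    {"age_group": "5-14", "county": "A", "month": "1992-01"},
    {"age_group": "0-4", "county": "B", "month": "1992-02"},
    {"age_group": "0-4", "county": "A", "month": "1992-02"},
    {"age_group": "15+", "county": "B", "month": "1992-02"},
]
population_by_county = {"A": 50_000, "B": 20_000}  # assumed denominators


def count_by(reports, field):
    """Crude number of cases for each value of one descriptor."""
    return Counter(report[field] for report in reports)


def rates_per_100k(counts, populations):
    """Cases per 100,000 population for each area with a known denominator."""
    return {area: counts.get(area, 0) / pop * 100_000
            for area, pop in populations.items()}


by_person = count_by(case_reports, "age_group")
by_place = count_by(case_reports, "county")
by_time = count_by(case_reports, "month")
county_rates = rates_per_100k(by_place, population_by_county)

print(dict(by_person), dict(by_place), dict(by_time))
print({area: round(rate, 1) for area, rate in county_rates.items()})
```

Deciding in advance which descriptors will be tabulated, and at what geographic
level, determines which fields the reporting form has to carry.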

Most contemporary surveillance systems are maintained electronically. The types of
analyses to be performed and the size of the data bases should suggest the type of
hardware and software needed (see Chapter XI). As personal computers become more
powerful, the capacity of data-storage devices continues to grow, and data-sharing
systems such as local- and wide-area networks become more widely available, more
surveillance systems can be operated on personal computers. Software to meet most
basic analytic needs for surveillance, including mapping and graphing, is now widely
available. The analytic approach often suggests a basic set of analyses that are
performed on a regular basis. These analyses can be designed early in the development
of the system and incorporated into an automated system, which can then be run by
support personnel.

The adequacy of the data system and processing mechanisms should be assured.

Interpretation and Dissemination

Data must be analyzed and presented in a compelling manner so that decision makers at
all levels can readily see and understand the implications of the information.
Knowledge of the characteristics of the audiences for the information and how they
might use it may dictate any of a variety of communications systems. Routine, public
access to the data--consistent with privacy constraints--should be planned for and
provided. This access can be facilitated with various electronic media, ranging from
systems with structured-analysis features suitable for general users to files of raw
data for persons who can do special or more detailed analyses themselves.

The primary users of surveillance information, however, are public health
professionals and health-care providers. Information directed primarily to those
individuals should include the analyses and interpretation of surveillance results,
along with recommendations that stem from the surveillance data. Graphs and maps
should be used liberally to facilitate rapid review and comprehension of the data.
Communications media represent a valuable secondary audience that can be used to
amplify the messages from surveillance information. The media play an important role
in presenting and reinforcing health messages. Innovative methods for presenting
information capitalizing on current audiovisual technology should be explored (see
Chapter VII).

Evaluation

Planning, like surveillance itself, is an iterative process requiring the regular
reassessment of objectives and methods (see Chapter VIII). The fundamental question
to be answered in evaluation is whether the purposes of the surveillance system have
been met. Did the system generate needed answers to problems? Was the information
timely? Was it useful for planners, researchers, health-care providers, and public
health professionals? How was the information used? Was it indeed worth the effort?
Would those who participated in the system wish to (be willing to) continue to do it?
What could be done to enhance the attributes of the system (timeliness, simplicity,
flexibility, acceptability, sensitivity, predictive-value positive, and
representativeness)?

Answers to these questions will direct subsequent efforts to revise the system.
Changes might be minor (e.g., the addition of data elements to existing forms), or
major (e.g., the need to obtain information from entirely different data sources).
For example, a system to determine utilization of mammography might be based on
administrative billing systems. Yet, problems with reports of multiple mammography
examinations for the same individual might require the addition of unique patient
identifiers or the addition of questions on mammography use from self reports on
health-interview surveys. If access emerges as a critical factor in mammography
utilization, then ongoing monitoring of the quantity and location of mammography
facilities or monitoring for appropriate insurance coverage for mammography might be
indicated.
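
The mammography example can be made concrete with a short sketch. It assumes a hypothetical billing extract in which each claim carries a unique patient identifier and an examination date; the field layout, identifiers, and dates below are invented for illustration and do not describe any actual billing system. With an identifier available, claims can be separated into distinct examinations and distinct women examined.

```python
from collections import defaultdict

# Hypothetical billing claims: (patient_id, exam_date). Illustrative data only.
billing_records = [
    ("P001", "1991-03-04"),
    ("P001", "1991-03-18"),  # second examination for the same woman
    ("P002", "1991-05-22"),
    ("P003", "1991-07-09"),
    ("P003", "1991-07-09"),  # duplicate claim for a single examination
]

def summarize_utilization(records):
    """Return (claims, distinct examinations, distinct patients)."""
    exams_by_patient = defaultdict(set)
    for patient_id, exam_date in records:
        # A set drops repeated claims for the same patient and date.
        exams_by_patient[patient_id].add(exam_date)
    n_claims = len(records)
    n_exams = sum(len(dates) for dates in exams_by_patient.values())
    n_patients = len(exams_by_patient)
    return n_claims, n_exams, n_patients

claims, exams, patients = summarize_utilization(billing_records)
print(claims, exams, patients)
```

Without the identifier, only the raw claim count would be available, and it would overstate the number of women screened.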

Periodic rigorous evaluation assures that surveillance systems remain vibrant.
Systems that assess problems whose only interest is historical should be discontinued
or simplified to reduce the reporting burden. Contemporary systems should take
advantage of the emergence of new technology for information collection, analysis, and
dissemination. They should capitalize on new information systems. For example,
sentinel surveillance systems have become more flexible to allow the inclusion of an
array of topics. Electronic medical records and standardized clinical data bases all
provide opportunities to obtain data that have been burdensome or difficult to secure
(6). These information sources may also provide data in a more timely fashion and
may allow individuals to be tracked, an option that would be virtually impossible
without such electronic systems.

INVOLVEMENT OF INTERESTED PARTIES IN SURVEILLANCE

Virtually all surveillance systems involve networks of organizations and individuals.
Surveillance of notifiable disease relies on health-care providers including
clinicians, hospitals, and laboratories to report to local health departments, who
have the initial responsibility for responding to reports and amassing data. In many
states, epidemiologists in the state health departments are responsible for
surveillance and control of notifiable diseases in their states. In larger states,
other organizational units--such as those dealing with sexually transmitted disease,
immunization, or tuberculosis control--often have primary responsibility for
surveillance and control of specific diseases or injuries. The state epidemiologist
is responsible for the ongoing quality control, collection, analysis, interpretation,
dissemination, and use of notifiable-disease data within that state. Data are
subsequently forwarded each week to the national level where they are again analyzed,
interpreted, and disseminated.

Programs for injuries and chronic and environmental diseases also may have complex
organizational structures and may involve a wide array of external professional and
voluntary interest groups whose needs must be addressed. Some basic surveillance
information can be gleaned from such ongoing information systems as vital records,
hospitalization programs, and registries. Although some of these conditions are part
of state notifiable-disease lists, many require surveillance systems to be established
in unique places (e.g., rehabilitation units and emergency medical services for
spinal-cord injuries or radiology centers for mammography). The support and interest
of these groups of constituents are valuable in establishing the systems; these groups
can provide key input regarding purposes of systems and users of systems, as well as
assistance in developing the systems themselves.

The complex relationships among these organizational units and their constituents
require open communication to establish priorities and methods consistent with the
needs and resources of each group. The conflicting desire for more detailed
information must be balanced against the associated burden and cost, as well as
against the utility of collecting extensive amounts of data. For example, electronic
systems that may facilitate higher quality, more complete, and more timely data also
involve the commitment of equipment, training, and changes in day-to-day activities
that may permeate all levels of the system. One must understand the needs of each
recipient group for the information and assess and assure their commitment to the
system. It is also critical to be attentive to how components of the system can best
be integrated into the overall system in terms of day-to-day operations.

The Council of State and Territorial Epidemiologists, an affiliate of the Association
of State and Territorial Health Officials, has the authority in the United States to
recommend which health conditions should be notifiable. After this list has been
agreed upon, it is then up to each state to determine whether and how the conditions
should be made reportable. Although most states report all those conditions
considered to be nationally notifiable, a wide range of conditions are reportable in
only a few states (3). States may exercise their authority through regulations,
boards of health, or legislative procedures. The diversity of these methods is
described more fully in Chapter XII. Each of these mechanisms entails the involvement
of groups with an array of medical, administrative, public health, and policy
interests.

The success of surveillance depends heavily on the quality of the information entered
into the system and on the value of the information to its intended users. A clear
understanding of how policy makers, voluntary and professional groups, researchers,
and others might use surveillance data is valuable in garnering the support of these
audiences for the surveillance system.

REFERENCES

1. Healthy People 2000. National health promotion and disease prevention
objectives. DHHS Pub. No. (PHS) 91-50212. Washington, D.C.: U.S. Department
of Health and Human Services, Public Health Service, 1991.

2. Centers for Disease Control. Consensus set of health status indicators for the
general assessment of community health status--United States. MMWR 1991;40:449-51.

3. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. Mandatory reporting of
infectious diseases by clinicians. JAMA 1989;262:3018-26.

4. American Cancer Society. Cancer facts and figures--1991. Atlanta, Georgia:
American Cancer Society, 1991.

5. Teutsch SM, Herman WH, Dwyer DM, Lane JM. Mortality among diabetic patients
using continuous subcutaneous insulin infusion pumps. N Engl J Med
1984;310:361-8.

6. Ellwood PM. Outcomes management. A technology of patient experience. N Engl J
Med 1988;318:1549-56.

TABLE OF CONTENTS

Introduction

Notifiable disease and related reporting mechanisms

Vital statistics

Sentinel surveillance

Registries

Administrative data-collection systems

Summary

Appendix III

Chapter III

Sources of Routinely Collected Data for

Surveillance

Nancy E. Stroup
Matthew M. Zack
Melinda Wharton

"The real voyage of discovery consists not in seeing new landscapes but in having new
eyes."

Marcel Proust

INTRODUCTION

This chapter reviews sources of routinely collected data that can be used for public
health surveillance. In many instances, these sources will provide sufficient
information so that active case-finding for the health event of interest may not be
necessary. In other instances, analysis of routinely collected data, in conjunction
with active case-finding, will provide the basis for a comprehensive assessment of the
public health impact of a particular health event.

For infectious diseases, surveillance activities have traditionally relied on
"notifiable" disease reporting systems based on legally mandated reporting of cases to
health officials. Depending on characteristics of the reporting system and of the
specific health event, these systems can provide timely information that is
particularly useful for monitoring short-term trends and for detecting outbreaks or
epidemics of disease. While prevention and control of infectious diseases remains a
mainstay of public health practice, there is increasing emphasis on monitoring the
public health impact of non-infectious or chronic diseases and injuries, as well as
risk factors for these conditions, including behavioral risk factors, demographic
characteristics, and potential exposure to toxic agents. With the expansion in the
number and type of health events under surveillance, the use of existing data sources,
such as vital statistics and more recently hospital discharge data, has expanded; and
new data sources, such as behavioral risk factor surveys, have been developed.

This chapter describes characteristics of six types of health information systems in
which data are collected routinely and are generally available for analysis. The six
are notifiable disease and related reporting systems, vital statistics, sentinel
surveillance, registries, health surveys, and administrative data collection systems.
As more sources of health information become available, effective surveillance for a
specific health event, whether infectious or non-infectious, will rely on analysis and
synthesis of information from a variety of sources, each of which has different
strengths and limitations. In many instances, these sources will provide sufficient
information so that active case-finding or other surveillance-related activities may
not be necessary. In other instances, analysis of routinely collected data, in
conjunction with other activities, will provide the basis for a comprehensive
assessment of the public health impact of a particular health event. For cervical
cancer, for instance, surveillance activities could include the following:
comprehensive assessment of cancer incidence data and cancer mortality data; reports
of cervical cytology and genital infections by laboratories; reports of pap smear
histories, smoking patterns, genital infections and safe sex practices from health
surveys; review of hospital-discharge data to monitor surgical treatment for advanced
disease; and information from a variety of sources on attitudes, payment strategies,
and other barriers or inducements that could influence the prevention, early
detection, and treatment of cervical cancer. The selection and appropriate use of data
from these sources would depend primarily on the nature and scope of activities to be
monitored as part of a cervical cancer control program.

Depending on the health event of interest, special short-term or demonstration
projects can also provide information that is very useful for surveillance or other
prevention-related activities. This chapter, however, focuses on sources of data in
which information on a wide range of health events is collected on a routine, ongoing
basis and is generally available for analysis.

The examples provided in this chapter are meant to be illustrative rather than
exhaustive. Many examples are research- rather than surveillance-related, but they do
highlight potential uses of these data sources for surveillance and related
activities. The background information provided on the methods used to collect
different types of data serves, however, as a starting point for a more detailed
assessment of the strengths and limitations of these data systems for surveillance of
a particular health event. The sources of data mentioned in this chapter are listed
separately in Appendix A.

Information on the availability of routinely collected health and population data is
available from a variety of sources. Federal agencies that provide data in the United
States include the following organizations:

• the Centers for Disease Control (CDC), including the National Center for
  Health Statistics (NCHS);
• the National Institutes of Health (NIH), including the National Cancer
  Institute (NCI), the National Heart, Lung, and Blood Institute (NHLBI),
  the National Institute on Drug Abuse (NIDA), the National Institute on
  Alcohol Abuse and Alcoholism (NIAAA), and the National Institute of
  Mental Health (NIMH);
• the Food and Drug Administration (FDA);
• the Agency for Health Care Policy and Research (AHCPR);
• the Indian Health Service (IHS);
• the Health Care Financing Administration (HCFA);
• the National Highway Traffic Safety Administration (NHTSA);
• the Consumer Product Safety Commission (CPSC); and
• the Bureau of the Census.

State health departments also routinely collect health information, some of which is
not available from federal sources; and private organizations (e.g., the Public Health
Foundation and the National Association of Health Data Organizations) either have
health information or maintain inventories of information that can be obtained from
other sources.

Information is available in other countries from similar national or local agencies
(1-4). The United Nations and the World Health Organization (WHO) routinely publish
population estimates and summary information on mortality and natality in member
countries (5-6). Health and demographic information is also available from regional
offices such as WHO/Europe (7).

NOTIFIABLE DISEASE AND RELATED REPORTING MECHANISMS

Overview

Reporting on notifiable diseases at the national level originated in the United States
in 1878, when Congress authorized the United States Public Health Service (PHS) to
collect reports on morbidity from cholera, smallpox, plague, and yellow fever, each of
which was controlled through quarantine measures (8,9). Although initially focused on
foreign ports, authority for weekly reporting was expanded in 1893 to include states
and municipal authorities (9). To increase uniformity, the Surgeon General was
authorized in 1902 to provide forms for the collection, completion, and publication of
reports at the national level. Weekly telegraphic reporting was recommended for a few
diseases in 1903, and by 1928, all states, the District of Columbia, Hawaii, and
Puerto Rico were participating in national reporting of specified conditions (8).
Compulsory notification for selected infectious diseases was also instituted in many
other countries in the late 1800s, including Japan (1880), Scotland (1887), Italy
(1888), England and Wales (1889), and Northern Ireland (1899) (2,3,10).

The list of diseases for which notification is recommended has changed over time, and,
although there is overlap, the lists vary from jurisdiction to jurisdiction. In the
United States, for instance, 47 infectious diseases were considered notifiable at the
national level in 1989 and were reported to CDC through the National Notifiable
Disease Surveillance System (NNDSS) (11). In at least one state, however, reporting
was required for over 160 infectious diseases or related conditions, 90 occupational
diseases, 23 other environmental diseases, 29 congenital or related conditions, and
six diseases of unknown cause. With the addition of Lyme disease and Haemophilus
influenzae in 1991, 49 infectious diseases are currently notifiable at the national
level in the United States (12). In recent years, lists of notifiable diseases in
other countries included 66 diseases in Italy (19 with rapid reporting procedures), 32
in Scotland and in Japan, 29 in England and Wales, and 26 in Northern Ireland
(2,3,10). Procedures for modifying the list of notifiable diseases also vary from
country to country. In the United States, reporting for notifiable diseases is
mandated at the state level, and the Council of State and Territorial Epidemiologists
(CSTE), a consortium of epidemiologists from all state and territorial health
departments, recommends a list of conditions to be reported each week to CDC (12).
National reporting is required for three quarantinable diseases--plague, cholera, and
yellow fever. Cases of these three diseases are also reported to the WHO by member
countries.

In the United States, occupational diseases or occupation-related conditions are
considered notifiable in some states, but at present, occupation-related conditions
are not reported nationally (13,14). In 1988, at least one occupation-related
condition was considered reportable in 34 states or other jurisdictions. Lead
poisoning, pesticide poisoning, and occupation-related lung diseases are among the
occupation-related conditions that are reportable in many states.

In recent years, notifiable-disease-reporting mechanisms have been used in some
localities to collect information on conditions that are not infectious, occupation-
related, or vaccine-related. In the United States, spinal-cord injuries, elevated
blood lead levels for children and for occupationally exposed workers, and Alzheimer's
disease are among the conditions for which reporting is required in some localities,
although national reporting is not recommended by CSTE (15-17).

Reporting in the United States for adverse events following vaccination or in
association with the administration of drugs differs from other notifiable-disease
reporting procedures in that the former types of events are reported nationally rather
than to state health departments. Since 1988, all health-care providers and vaccine
manufacturers have been required to report certain suspected adverse events following
specific vaccinations (18). The Vaccine Adverse Event Reporting System (VAERS), in
which all reports of suspected adverse events following any vaccination are accepted,
became operational in 1990.

Adverse drug reactions are reported in the United States to the FDA (19,20). Drug
manufacturers are required to submit post-approval reports of adverse drug reactions
as well as reports from ongoing clinical trials and selected reports from foreign
sources. Reports submitted to manufacturers by providers are sent to the FDA, or
providers and patients can submit reports directly. Nearly 60,000 reports were
submitted in 1989. Many other countries have similar adverse-drug-reaction reporting
systems, and about 23 of these report data to the WHO Collaborating Center for
International Drug Monitoring (21). In England, active surveillance for adverse
effects of specific drugs is conducted through the Prescription Event
Monitoring System, which is funded through both public and private sources (21,22).

Data Collection, Transmission, and Dissemination

Although information on notifiable diseases is collated and published nationally, its
primary purpose is to direct local prevention and control programs. In the United
States, information is generally reported by clinicians to local or state health
departments. State regulations governing notifiable disease reporting are often quite
specific regarding timeliness of reporting. For conditions in which an immediate
public health response is needed, notification by telephone is usually mandated,
either immediately or within 24 hours of a suspected case. Other conditions are
generally reported on a weekly basis after the diagnosis has been confirmed.

For conditions that are reported nationally in the United States through the NNDSS, a
subset of information--including the age, gender, race, and date of occurrence (or
report)--is sent weekly to CDC by state health departments or other jurisdictions in a
standard format, either as individual case reports or aggregate reports. Personal
identifiers are not included in the NNDSS. Since 1990, all reporting states and
localities have transmitted information electronically to CDC through the National
Electronic Telecommunications System for Surveillance (NETSS) (23). National case
counts for most notifiable diseases are published the week after they are reported to
CDC in the Morbidity and Mortality Weekly Report (MMWR).
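
As a rough illustration of the difference between individual case reports and aggregate reports, the sketch below defines a de-identified case record holding only the kinds of fields named above and tallies records into weekly counts. The class name, field names, and sample cases are hypothetical and do not reproduce the NNDSS or NETSS record format. Weeks are grouped by ISO calendar week purely for convenience; that grouping is an assumption and may not match the reporting-week convention used in national tables.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class CaseReport:
    """De-identified case record; note that no personal identifiers are stored."""
    condition: str
    age: int
    gender: str
    race: str
    event_date: date  # date of occurrence, or date of report if occurrence is unknown

def weekly_counts(reports):
    """Aggregate individual reports into counts per (condition, year, week)."""
    counts = Counter()
    for report in reports:
        iso = report.event_date.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(report.condition, iso[0], iso[1])] += 1
    return counts

# Hypothetical reports, for illustration only.
reports = [
    CaseReport("measles", 7, "F", "white", date(1991, 1, 7)),
    CaseReport("measles", 19, "M", "black", date(1991, 1, 9)),
    CaseReport("measles", 12, "M", "white", date(1991, 1, 15)),
    CaseReport("hepatitis A", 34, "F", "white", date(1991, 1, 9)),
]

for key, n in sorted(weekly_counts(reports).items()):
    print(key, n)
```

An aggregate report corresponds to the output of `weekly_counts`; an individual case report corresponds to a single `CaseReport`.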

Most state health departments also disseminate surveillance data and other public
health information to health-care providers through weekly or monthly newsletters. For
some conditions, including measles, hepatitis, syphilis, and acquired immunodeficiency
syndrome (AIDS), more detailed information on risk factors and other information
needed for disease-control programs is also collected by state and local health
departments and, in some instances, is sent to CDC. Information is also sent to CDC
through NETSS for conditions such as spinal cord injuries, giardia infection, and Reye
syndrome, that are not nationally notifiable but for which information is useful at
the national level. Although their use in the United States is limited primarily to
influenza surveillance, networks of sentinel health-care providers in many European
countries report supplemental information on notifiable diseases to local and national
health officials (see below).

Surveillance for zoonotic diseases also involves monitoring animal hosts that either
transmit the disease directly to humans or are also susceptible to the disease. For
various types of encephalitis, for instance, detection of elevated virus titers in
mosquitoes, wild birds, sentinel flocks of chickens, or horses can signal that an
outbreak of human disease may occur so that mosquito-control activities can be
initiated (24). Similarly, the potential for human cases of rabies is assessed through
monitoring wild skunks, raccoons, bats, and other animal vectors (25); the potential
for human plague is assessed by monitoring rodents in endemic areas (26); and Rocky
Mountain spotted fever and Lyme disease are monitored through testing of ticks
(27,28).

Although most cases of notifiable conditions are reported by clinicians, the role
laboratories play in reporting notifiable conditions is becoming increasingly
important. In the United States, many states have developed reporting requirements for
laboratories and hospitals for conditions that need laboratory confirmation for
diagnosis (11,29,30). In New York City, for instance, laboratories are required to
report elevated blood-lead levels in children, and at least five states rely on
laboratory reporting to identify workers with elevated levels of lead or other heavy
metals (15). Comprehensive, nationwide reporting by laboratories is not yet available
in the United States, but in England, Wales, and Northern Ireland, nearly all
microbiology laboratories voluntarily report positive identifications of selected
conditions to the national Public Health Laboratory Service (PHLS) (10).

Strengths and Limitations

Although many diseases or conditions are considered notifiable, compliance is poor in
most countries and sanctions are rarely enforced. As Sherman and Langmuir noted in
1952, "Our system of notification of individual case reports is a haphazard complex of
interdependence, cooperation, and goodwill among physicians, nurses, and county and
state health officers, school teachers, sanitarians, laboratory technicians,
secretaries, and clerks. It is a rambling system with variations as numerous as the
individual diseases for which reports are requested, and as numerous as the interests
and individual traits of the administrative health officers, epidemiologists, and
statisticians in [all] the . . . States and the several federal agencies concerned with
the data" (31). Indeed, it is remarkable--given the jerry-rigged nature of the system--
that the information collected is at all useful.

Under-reporting is a consistent and well-characterized problem of notifiable-disease-
reporting systems (see Chapter XII). In the United States, estimates of completeness of
reporting range from 6% to 90% for many of the common notifiable diseases (32).
Reporting is generally more complete for conditions such as plague and rabies that
cause severe clinical illness with serious consequences. Among the many factors that
contribute to incomplete reporting of notifiable conditions are lack of medical
consultation for mild illnesses; concealment by patients or health-care providers of
conditions that might cause social stigma; lack of awareness of reporting
requirements; lack of interest by the medical community; incomplete etiologic
definition of notifiable conditions; inadequate case definitions for surveillance
purposes; variation in clinical expertise in diagnosing conditions in different areas;
changes in procedures for verifying reports from providers; variation in the use of
laboratory confirmation; variation in laboratory procedures; the effectiveness of
control measures in effect; and priorities of health officials at local, state, and
national levels (9,30,33). Similarly, increased concern can result in an increase in
reported cases. Public health officials may actively solicit information if an
outbreak is suspected, and case reports may increase in response to reports by the
media.
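
The practical effect of the 6% to 90% range in completeness cited above can be shown with simple arithmetic. The sketch below treats completeness as the fraction of true cases that are reported and divides a reported count by it. The function name and the count of 45 cases are hypothetical, and the calculation assumes that completeness is known and constant, which the factors listed above suggest is rarely true in practice.

```python
def estimated_true_cases(reported_cases, completeness):
    """Scale a reported count by an assumed completeness of reporting.

    completeness is the assumed fraction of true cases that are reported
    (greater than 0, at most 1).
    """
    if not 0 < completeness <= 1:
        raise ValueError("completeness must be greater than 0 and at most 1")
    return reported_cases / completeness

reported = 45  # hypothetical number of reported cases
for completeness in (0.06, 0.90):
    estimate = estimated_true_cases(reported, completeness)
    print(f"completeness {completeness:.0%}: about {estimate:.0f} cases")
```

Under these assumptions the same 45 reports are compatible with roughly 50 true cases at 90% completeness and roughly 750 at 6%, which is why knowledge of reporting characteristics matters when counts are interpreted.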

The extent of under-reporting can vary by risk group. An evaluation of reporting for
AIDS in Philadelphia found, for instance, that under-reporting was more prevalent for
those who were employed in white-collar occupations and who had private health
insurance (34). Similarly, a review of hospital-discharge data in South Carolina
indicated that AIDS diagnoses were less likely to be reported for whites over 40 years
of age (35).

Changes in case definitions and the extent to which laboratory confirmation is
required for reporting can also affect reporting for notifiable conditions. In the
United States, a 1984 survey of state epidemiologists found substantial variation in
definitions used for communicable disease surveillance by state health departments.
Since then, surveillance case definitions have been developed for many communicable
diseases and occupational conditions, as well as for spinal-cord injuries (14,17). The
degree to which standardized case definitions for notifiable-disease reporting have
been adopted varies, but recent experience suggests that there will be more important
changes in trends as they are more widely used. The 1987 revision of the surveillance
case definition for AIDS resulted in an increase in the number of reported cases among
heterosexual drug abusers (36). Changes in the surveillance case definition for
congenital syphilis resulted in a 5-fold increase in cases in some reporting areas
(37,38). Adoption of a uniform case definition for Lyme disease is probably reflected
in the decrease in reported cases in the United States in 1990 (39).

The extent to which clinical reports are confirmed with laboratory findings can have a
substantial impact on reporting rates. For instance, malaria was endemic in the
southeastern United States in the 1930s. Epidemiologic studies in 1947 indicated that
routine reporting of aggregate case counts based on clinical findings alone was not
providing an accurate picture of current disease activity. When reporting of
individual cases with laboratory confirmation was required, it became clear that
endemic malaria had disappeared between 1935 and 1945, before malaria control programs
based on drainage and indoor residential spraying of DDT were initiated (40,41). In
recent years, the role of laboratories has been particularly important for
surveillance of the numerous subtypes of Salmonella, legionellosis, nosocomial
infections, and for detecting elevated blood-lead levels (15,30,42). Without laboratory-
based surveillance, for instance, a large outbreak of drug-resistant Salmonella
newport that originated from animals fed antimicrobials might not have been detected
(43).

In spite of their limitations, surveillance systems based on reporting of notifiable
conditions are a mainstay of public health surveillance. Unlike most other sources of
routinely collected data, information from notifiable-disease systems is available
quickly and from all jurisdictions. Knowledge of the specific characteristics of
reporting for a particular condition is helpful in interpreting the findings. While
long-term trends may be difficult to interpret without supplemental information,
notifiable-disease systems can often detect outbreaks or other rapid changes in
disease incidence in a timely manner so that control activities can be initiated. As
appropriate, initial observations can be evaluated further with additional studies.
Notifiable-disease systems can also detect changes in patterns of disease by
demographic characteristics or risk groups. In the United States, for instance, human
immunodeficiency virus (HIV) and AIDS surveillance systems have identified new risk
groups including intravenous drug abusers and their mates and have highlighted the
emerging problem of children who are born HIV-infected. Evaluation of surveillance
information has also led to changes in disease prevention and control strategies. On
the basis of reports of measles among elementary school-, high school-, and college-
age students, recommendations for measles vaccination in the United States were
recently changed to include a two-dose schedule (44). Similarly, because strategies
based on vaccination of high-risk groups have not been as effective as originally
anticipated, recommendations for hepatitis B vaccination have recently been modified
(45).
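
The earlier statement that notifiable-disease systems can detect outbreaks or other rapid changes in incidence can be illustrated with a deliberately simple rule: flag the current week when its count exceeds the historical mean by more than two standard deviations. This rule, the function name, and the counts are assumptions chosen for illustration; the chapter does not prescribe a detection method, and operational systems may use quite different approaches.

```python
from statistics import mean, stdev

def exceeds_baseline(current_count, baseline_counts, n_sd=2.0):
    """Return True if current_count is above mean + n_sd * SD of the baseline.

    baseline_counts are counts for comparable past weeks (at least two values).
    """
    threshold = mean(baseline_counts) + n_sd * stdev(baseline_counts)
    return current_count > threshold

# Hypothetical weekly counts for the same reporting week in earlier years.
baseline = [4, 6, 5, 7, 3, 5]

print(exceeds_baseline(8, baseline))
print(exceeds_baseline(7, baseline))
```

A flag of this kind is only a prompt for follow-up. As noted above, initial observations need to be evaluated with additional studies, since changes in reporting practices can also shift the counts.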

In the United States, reports of adverse drug reactions often result in labeling
changes for new drugs (19). Drug withdrawals are infrequent, although two drugs (an
antidepressant and a non-steroidal anti-inflammatory agent) have been withdrawn in
recent years. Vaccine adverse-event-reporting systems are important for detecting
potential problems following administration of vaccine, such as an increase in
paralytic poliomyelitis among recently vaccinated children in the 1950s and the
increase in Guillain-Barre syndrome following vaccination for swine influenza
(18,46,47).

Notifiable-disease-reporting mechanisms have also been important for identifying
unusual conditions that appear to be increasing and for obtaining a preliminary
assessment of their public health impact. Among the more recent examples in the United
States are AIDS, toxic-shock syndrome, legionellosis, Reye syndrome, and eosinophilia-
myalgia syndrome (EMS). Following the initial report from a state health department,
nationwide surveillance for EMS using a standard case definition was instituted within
a few days, and, through additional studies, the putative agent was identified (48).

In the future, reporting of notifiable conditions may rely, in part, on computerized
data bases developed for billing and other purposes. However, the utility of these
systems is limited at present: first, because International Classification of Diseases
(ICD) codes are often not used to identify infectious agents on billing records and,
second, because information in these large data bases is not available immediately
(49). In the near term, improvements in notifiable-disease reporting in most areas
are likely to be related to increased reliance on laboratory-based reporting and on
the use of sentinel health-care providers or sentinel sites.

VITAL STATISTICS

Overview

The systematic registration of vital events had its origins in the parish registers of
15th century Western Europe (1). One of these registers, the Bills of Mortality--a
weekly tally begun in 1532 of the number of persons who died in London from plague and
other causes--was used to study patterns of mortality by John Graunt, one of the first
to use numerical methods to study disease (50).

Parish registers were superseded in the 19th century by civil registers kept for legal
purposes. Registration of vital events usually remains the responsibility of local
authorities, but the use of standard procedures for collecting, coding, and reporting
vital events--first used systematically by William Farr in Great Britain in the 1830s--
allows information from different jurisdictions to be aggregated, summarized, and
compared. Farr, the first medical statistician in the Office of the General
Registrar of England and Wales, recognized the importance of determining death rates
for different segments of the population using information collected systematically at
the time of birth or death. In the first annual report to the Registrar General in
1839, Farr discussed the principles that should govern a statistical classification of
disease and urged the adoption of a uniform system (2,51). Nomenclature and
statistical classification systems initially developed by Farr and by Marc d'Espine
form the basis of the international disease classification system used today.

Information collected at the time of birth and death is one of the cornerstones of
surveillance in both developed and developing countries. Today, about 80 countries or
areas report statistics on vital events to WHO; these statistics are coded and
tabulated according to the ninth revision of the International Classification of
Diseases (ICD-9) and represent about 35% of the deaths that occur each year worldwide
(52).

Vital statistics are an important source of information for surveillance because they
are the only health-related data available in many countries in a standard format
(52). Also, they are often the only source of health information available for the
entire population and the only source available for estimating rates for small
geographic areas. Vital statistics have been used to:

monitor long-term trends (53-55);

identify differences in health status within racial or other subgroups of
the population (55,57);

assess differences by geographic area (58-62) or occupation (50,63);

monitor deaths that are generally considered preventable (64-67);

generate hypotheses regarding possible causes or correlates of disease
(68,69);

conduct health-planning activities (70,72); and

monitor progress toward achieving improved health of the population
(7,72,73).

The usefulness of vital statistics for surveillance of a particular health event
depends on the characteristics of that health event, as well as on the procedures used
to collect, code, and summarize relevant information. In general, vital statistics
will be more useful for conditions that can be ascertained easily at the time of birth
or death. Likewise, mortality rates derived from death-certificate data will more
closely approximate true incidence for conditions with a short clinical course that
are easy to diagnose, are easily identified as initiating a chain of events leading to
death, and are usually fatal (52,74-75). Although birth and death certificates are
filed shortly after the event occurs, the process of producing final vital statistics
at a national level from these data can take several years. Background information on
the process of producing vital statistics, outlined here for the United States, is
intended to highlight some of the strengths and limitations of vital statistics for
public health surveillance.

Birth and Death Certification

In the United States, responsibility for the registration of birth, death, and fetal
death is vested in the individual states and certain independent registration areas
(e.g., New York City) (77). States are encouraged to adopt standard certificates
similar to the "model" certificate developed by NCHS in collaboration with other
groups, although some states modify the "model" certificate to comply with state laws
or regulations or to meet their own information needs (78). Certificates are usually
filed with a registrar within 24 hours in the jurisdiction in which the event
occurred. For birth certificates, the physician or attendant certifies the date,
time, and place of birth, and other hospital personnel usually obtain information on
the remaining items (79). The 1989 model birth certificate includes additional
information on perinatal risk factors, such as maternal illnesses and complications of
labor and delivery, that will help to improve surveillance for perinatal events
(77,80,81).

For death certificates, the funeral director is usually responsible for including all
personal information about the decedent and for assuring that medical information is
provided by the physician who certifies the death (82). Information provided by the
physician includes the cause of death (immediate, "as a consequence of," and
underlying causes), the interval between onset of the condition and death, other
important medical conditions, the manner of death (e.g., accident, homicide, or
suicide), whether an autopsy was performed, and whether the medical examiner or
coroner was notified of the death (78). In most cases, information from autopsies and
reports from medical examiners or coroners are not available at the time the death
certificate is filed, although the certificate can be amended when this information
becomes available. Local registrars assure that all vital events that occurred in the
jurisdiction are registered and that required information is provided on certificates
before they are sent to the state registrar. Both state and local registrars can ask
physicians or funeral directors for additional information if the certificate is
considered incomplete. State registrars are usually responsible for numbering,
indexing, and binding certificates for permanent safekeeping. Also, state registrars
usually forward certificates for deaths of non-residents to their states of residence.

Coding, Classification, and Calculation of Rates

To calculate national death rates, the number of live births is used as the denominator
for infant and maternal mortality rates, and estimates of the population, usually
derived from censuses, are used as the denominators for other death rates (51,83).
Conditions are classified and rates are calculated according to the ninth revision of
the ICD (ICD-9), developed through WHO and in use since 1979. The ICD-9 includes a
tabular list of categories and conditions with code numbers, definitions of key terms
(e.g., underlying cause of death, low birth weight), rules for selecting the
underlying cause of death, and lists of conditions for statistical summaries.
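
The choice of denominators described above can be illustrated with a minimal sketch.
The counts are hypothetical, the helper name `rate` is introduced only for
illustration, and the multipliers (per 1,000 live births, per 100,000 population) are
common reporting conventions assumed here rather than requirements stated in the text.

```python
def rate(events, denominator, per):
    """Return the number of events per `per` units of the denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return events / denominator * per


# Hypothetical counts for one registration area and one calendar year.
infant_deaths = 380
live_births = 40_000             # denominator for infant mortality
deaths_all_ages = 21_500
midyear_population = 2_500_000   # census-based estimate, denominator for other rates

infant_mortality_rate = rate(infant_deaths, live_births, 1_000)
crude_death_rate = rate(deaths_all_ages, midyear_population, 100_000)

print(f"Infant mortality: {infant_mortality_rate:.1f} per 1,000 live births")
print(f"Crude death rate: {crude_death_rate:.1f} per 100,000 population")
```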

Age-standardized rates are usually calculated when summary rates are compared, in order
to control for the effects of differences in age structure between the populations
being compared (see Chapter V). In the United States, the age distribution of the U.S.
population in 1940 is usually used as the standard for vital statistics (84,85).
Other age distributions--such as the world standard population and the European
standard population--are often used for international comparisons (5).
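
As a rough illustration of why standardization matters, the sketch below applies
direct age standardization to two hypothetical areas that have identical age-specific
death rates but different age structures. The age groups, counts, and standard
weights are invented for the example (they are not the 1940 U.S. standard), and the
function names are assumptions.

```python
def crude_rate(deaths, population, per=100_000):
    """Total deaths divided by total population."""
    return per * sum(deaths.values()) / sum(population.values())


def direct_standardized_rate(deaths, population, standard, per=100_000):
    """Weight each age-specific rate by that age group's share of the standard."""
    total_standard = sum(standard.values())
    adjusted = 0.0
    for age_group, standard_count in standard.items():
        age_specific_rate = deaths[age_group] / population[age_group]
        adjusted += age_specific_rate * (standard_count / total_standard)
    return per * adjusted


# Hypothetical standard population (relative sizes only).
standard = {"0-39": 600, "40-64": 300, "65+": 100}

# Area A is younger than area B; age-specific rates are the same in both.
pop_a = {"0-39": 60_000, "40-64": 30_000, "65+": 10_000}
deaths_a = {"0-39": 60, "40-64": 150, "65+": 500}
pop_b = {"0-39": 30_000, "40-64": 30_000, "65+": 40_000}
deaths_b = {"0-39": 30, "40-64": 150, "65+": 2_000}

crude_a = crude_rate(deaths_a, pop_a)
crude_b = crude_rate(deaths_b, pop_b)
adjusted_a = direct_standardized_rate(deaths_a, pop_a, standard)
adjusted_b = direct_standardized_rate(deaths_b, pop_b, standard)

print(f"Crude rates:    A = {crude_a:.0f}, B = {crude_b:.0f} per 100,000")
print(f"Adjusted rates: A = {adjusted_a:.0f}, B = {adjusted_b:.0f} per 100,000")
```

In this example the crude rates differ only because area B is older; once both areas
are weighted to the same standard, the adjusted rates coincide.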

In the United States, about half the states submit both medical and demographic data
from certificates to NCHS in computerized form (84,85). Final national mortality and
natality data are generally not available from NCHS for at least 20 months after the
close of the calendar year, although a written report based on a 10% sample of deaths
is available within a few months. Final data are often available more quickly from
individual states. Similarly, final mortality and natality data are generally
available, with indices of quality and completeness, within 2-3 years for countries
that routinely report data to WHO (5).

Comparability and Quality Control

The quality of vital-statistics information depends on various factors, including the
completeness of registration; the relevance of the categories used for diseases,
injuries, and other conditions; the accuracy of demographic and medical data provided
on certificates; and the translation of this information into computerized data
(including its categorization and coding). When rates are calculated, estimates are
also affected by the accuracy of the population estimates or other estimates used for
denominators. Differences in access to medical care, diagnostic practices, and
interpretation of coding rules will also affect comparability.

Registration and medical certification of deaths are virtually complete in most
developed countries (86). Population estimates used to calculate rates in developed
countries are usually derived from censuses conducted at regular intervals (usually
every 10 years), in which the total population is enumerated (6). Inter-censal
estimates are derived by adjusting census figures for birth, death, and migration
patterns in the intervening years. In some countries, population estimates are
derived from surveys or from continuous population registers. Through the United
Nations, population estimates, including indices of the quality and completeness of
these estimates, are available for about 220 countries or areas of the world.

Population under-counts can have a measurable impact on mortality rates; rates will be
inflated, for instance, if population estimates used for the denominator are too
small. In the United States, for instance, the 1980 age-adjusted death rate (1940 age
standard) from all causes would decrease by 1.1% if the population estimate from the
1980 census were adjusted for under-counts (85). Effects are even greater for
subgroups of the population. For homicides and deaths resulting from legal
intervention in the United States in 1980, adjustment for census under-count would
change the ratio of death rates for black to white men ages 35-39 years from 7.3 to
6.2--that is, the unadjusted ratio is nearly 18% higher than the adjusted ratio.
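
The mechanism can be sketched with hypothetical numbers. In the example below, the
under-count fractions, death counts, and function names are all assumptions chosen
for easy arithmetic; they are not census figures.

```python
def rate_per_100k(deaths, population):
    return deaths / population * 100_000


def adjust_population(counted, undercount_fraction):
    """Estimate the true population when a fraction of it was missed by the census."""
    if not 0 <= undercount_fraction < 1:
        raise ValueError("undercount_fraction must be in [0, 1)")
    return counted / (1 - undercount_fraction)


# Hypothetical subgroups: group A is under-counted by 20%, group B not at all.
deaths_a, counted_a, undercount_a = 120, 80_000, 0.20
deaths_b, counted_b, undercount_b = 50, 100_000, 0.0

unadjusted_a = rate_per_100k(deaths_a, counted_a)
unadjusted_b = rate_per_100k(deaths_b, counted_b)
adjusted_a = rate_per_100k(deaths_a, adjust_population(counted_a, undercount_a))
adjusted_b = rate_per_100k(deaths_b, adjust_population(counted_b, undercount_b))

unadjusted_ratio = unadjusted_a / unadjusted_b
adjusted_ratio = adjusted_a / adjusted_b

print(f"Group A rate: {unadjusted_a:.0f} unadjusted, {adjusted_a:.0f} adjusted")
print(f"Rate ratio A:B: {unadjusted_ratio:.1f} unadjusted, {adjusted_ratio:.1f} adjusted")
```

Because the deaths are counted from certificates while the denominator comes from the
census, a differential under-count inflates the rate only in the under-counted group
and therefore exaggerates the rate ratio.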

When cause-specific rates are compared, both the extent to which information on birth
and death certificates is reported completely and accurately and the precision of
population estimates will affect the magnitude and the comparability of rates. The
impact of these factors is likely to be of less importance for aggregated cause-of-
death categories. Nonetheless, comparisons between different geographic areas or
different population subgroups should be interpreted cautiously.

Mortality from "signs, symptoms, and ill-defined conditions" (ICD-9 780-799) is often
used as an indicator of the care and consideration given by medical certifiers to
completing certificates. In recent years, the proportion of deaths for which "signs,
symptoms, and ill-defined conditions" were coded as the underlying cause ranged from
less than 1% in Australia, Czechoslovakia, Finland, Hungary, New Zealand, Sweden, and
the United Kingdom to 5%-10% in Belgium, France, Greece, Israel, Poland, Portugal,
and Yugoslavia (86). In the United States, 1.4% of deaths in 1988 were coded as
"signs, symptoms, and ill-defined conditions," with a range among the states of 0.4%
to 4.1% (85).

The impact of these factors on international comparisons has been assessed for cancer
and for respiratory disease (76,76). Within the United States, differences in
completeness and accuracy of certificates have also been noted within racial and
ethnic subgroups (87).

A variety of approaches will facilitate improvement in the quality of information on
birth and death certificates. These include providing physicians and funeral
directors clearer instructions for completing the certificates and more effective
training regarding the importance of vital statistics and the importance of following
recommended procedures for completing both the medical and demographic sections of
certificates (77,88,89). State and local registrars can increase the extent to which
they contact physicians and funeral directors when information provided on
certificates is not considered complete and can facilitate amendment of certificates
when additional information is available from autopsies or other sources.

In spite of limitations, birth and death certificates are an important source of
information for cost-efficient surveillance of a wide range of health events at local,
national, and international levels. Although differences in rates may not always
reflect actual differences in disease and injury burden, routine analysis of
information obtained at the time of birth and death can highlight areas in which
further investigation of a health event is warranted.

Examples of Surveillance Systems Based on Vital Statistics and
Related Data

Weekly reports. As part of the national influenza surveillance effort in the United
States, vital registrars in 121 U.S. cities report to CDC each week the number of
deaths that have occurred in those jurisdictions (90). This 121-City Surveillance
System has been operational since 1952. The total number of deaths and the number
attributed to pneumonia and influenza by age group are reported, and the total number
of deaths by age, city, and region are published within a week of receipt in the MMWR.
About one-third of the deaths that occur in the United States are reported through the
121-City Surveillance System, and most are reported to CDC within 2-3 weeks of
occurrence. Mortality rates based on the 121-City system cannot be directly compared
with rates derived from final mortality data. However, the 121-City system does
detect short-term increases in deaths from influenza and pneumonia in a timely manner
as needed for public health intervention. Increases in mortality from other causes--
including mortality during heat waves and increased deaths from pneumonia and
influenza among young men (later linked to AIDS)--have also been detected using the
121-City system.
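
One simple way to flag a short-term increase in weekly counts is to compare the
current week with a threshold derived from recent baseline weeks. The sketch below is
an illustration under stated assumptions, not the method used by the 121-City system:
the baseline values are hypothetical percentages of deaths attributed to pneumonia and
influenza, and the threshold of 1.645 standard deviations above the baseline mean is
an assumed choice.

```python
from statistics import mean, stdev


def flag_excess(baseline, current, z=1.645):
    """Return (is_excess, threshold) for the current value against baseline weeks."""
    threshold = mean(baseline) + z * stdev(baseline)
    return current > threshold, threshold


# Hypothetical percentage of all deaths attributed to pneumonia and influenza
# in comparable non-epidemic weeks.
baseline_weeks = [6.0, 6.2, 5.8, 6.1, 5.9, 6.0]

quiet_flag, threshold = flag_excess(baseline_weeks, 6.1)
epidemic_flag, _ = flag_excess(baseline_weeks, 7.1)

print(f"Threshold: {threshold:.2f}%")
print(f"Week at 6.1% flagged: {quiet_flag}; week at 7.1% flagged: {epidemic_flag}")
```

A fixed baseline like this ignores seasonality; a production system would more
plausibly model the expected seasonal pattern before setting a threshold.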

Monthly or quarterly reports. In the United States, final mortality data are
generally not available for nearly 2 years, although provisional estimates are
published by NCHS within 3-4 months in the Monthly Vital Statistics Report (MVSR). The
Current Mortality Sample, a 10% systematic sample of certificates, is sent to NCHS
each month by state registrars. On the basis of this sample, provisional estimates of
total monthly mortality by age, race (white, black, other), gender, state, and region
are published about 3 months later, and provisional rates for 72 selected causes are
published the following month. Provisional rates are published by place of occurrence
while final rates are published by place of residence. For the Mortality Surveillance
System (MSS), time-series regression models are fitted using monthly data, and charts
displaying monthly estimates and the fitted model for specific conditions are
published each month in the MVSR.
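
A provisional national count can be approximated from a 10% sample by dividing the
sampled count by the sampling fraction. The sketch below is a minimal illustration
and not a description of NCHS procedures: it treats the systematic sample as if it
were a simple random sample and assumes the cause of interest accounts for a small
share of all deaths, so the standard error shown is only approximate. The counts and
function names are hypothetical.

```python
import math


def estimate_total(sample_count, sampling_fraction):
    """Scale a count observed in the sample up to the full set of certificates."""
    return sample_count / sampling_fraction


def approximate_standard_error(sample_count, sampling_fraction):
    """Approximate standard error of the scaled-up total (rare-cause assumption)."""
    return math.sqrt(sample_count * (1 - sampling_fraction)) / sampling_fraction


sampled_deaths = 400        # hypothetical deaths from one cause in the monthly sample
sampling_fraction = 0.10    # the 10% Current Mortality Sample

estimated_deaths = estimate_total(sampled_deaths, sampling_fraction)
standard_error = approximate_standard_error(sampled_deaths, sampling_fraction)

print(f"Estimated deaths: {estimated_deaths:.0f} (approximate SE {standard_error:.0f})")
```

The relative standard error grows as the sampled count shrinks, which is one reason
provisional estimates are more dependable for common causes of death than for rare ones.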

The Current Mortality Sample and the MSS are very useful for monitoring overall trends
in total mortality and for monitoring trends in relatively common causes of death that
are increasing or decreasing over time (e.g., heart disease, homicide, lung cancer,
HIV/AIDS). Although estimates are adjusted for under-reporting, monthly changes in
mortality for conditions for which supplemental information is often needed should be
interpreted with caution.

Infant mortality and other adverse reproductive outcomes. Linking information
from death certificates for infants with information on maternal characteristics and
other information from birth certificates is useful for assessing potentially
preventable mortality by geographic area and within subgroups of the population. In
England and Wales, birth and death records for infants were linked for infants born in
1949-1950 and again for infants who died from April 1954 to March 1965 (2). All
births and deaths of infants have been linked routinely in England and Wales since
1975. In the United States, birth and death certificates have been linked for infants
born from 1983 to 1986 (PI). Approximately 40,000 infants die each year in the United
States, and at least 98% of the death certificates for infants have been linked to
birth certificates in these years. This information is also useful for health
planning and for targeting services, since U.S. infant mortality rates vary
considerably by geographic area and within demographic subgroups.
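
The linkage step can be sketched as a join between death records and birth records.
The example below assumes, purely for illustration, that each infant death record
carries the number of the matching birth certificate; real linkage projects may
instead have to match on names, dates, and other identifying items. All field names
and records are hypothetical.

```python
def link_deaths_to_births(deaths, births):
    """Attach birth-certificate items to each infant death record when a match exists."""
    births_by_id = {b["birth_id"]: b for b in births}
    linked, unlinked = [], []
    for death in deaths:
        birth = births_by_id.get(death["birth_id"])
        if birth is None:
            unlinked.append(death)
        else:
            linked.append({**death, **birth})
    return linked, unlinked


# Hypothetical records.
births = [
    {"birth_id": "B001", "birth_weight_g": 2_400, "county": "North"},
    {"birth_id": "B002", "birth_weight_g": 3_300, "county": "North"},
    {"birth_id": "B003", "birth_weight_g": 1_900, "county": "South"},
    {"birth_id": "B004", "birth_weight_g": 3_600, "county": "South"},
]
infant_deaths = [
    {"death_id": "D01", "birth_id": "B001"},
    {"death_id": "D02", "birth_id": "B003"},
    {"death_id": "D03", "birth_id": "B999"},   # no matching birth certificate
]

linked, unlinked = link_deaths_to_births(infant_deaths, births)
percent_linked = 100 * len(linked) / len(infant_deaths)
low_birth_weight_deaths = sum(1 for r in linked if r["birth_weight_g"] < 2_500)

print(f"Linked {len(linked)} of {len(infant_deaths)} deaths ({percent_linked:.0f}%)")
print(f"Linked deaths with birth weight under 2,500 g: {low_birth_weight_deaths}")
```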

Information on birth certificates has also been used to identify high-risk mothers who
need supportive services for infant care. In Michigan, for instance, information on
birth certificates is transmitted electronically from hospitals to the state health
department (91). Key information is then sent to county health departments so that
public health nurses can be assigned to areas with the greatest need.

Occupational mortality. William Farr was the first to evaluate systematically the
associations between occupation and cause of death (50). The Decennial Supplement on
Occupational Mortality for England and Wales has been published approximately every 10
years since 1855 (1,2). Cause-specific rates and ratios by occupation, adjusted for
social class, are estimated using information derived from death certificates and from
the decennial census (63). Although estimates are affected by sources of error in
both data sets, occupation-specific mortality rates are useful for identifying
occupations for which more detailed studies may be warranted (92).

In the United States, usual occupation (even if retired) and industry are included on
the standard death certificate (85). The states are not required to report this
information to NCHS, but if it is submitted, it has been included since 1985 in the
computerized final mortality files using the Standard Occupational Classification and
Standard Industrial Classification systems. In 1987, 14 states reported information on
occupation and industry to NCHS, and in 1989, occupation and industry during the last
year for both mother and father were added to the standard certificate for deaths of
fetuses (77). Through the National Traumatic Occupational Fatalities (NTOF)
surveillance system, CDC's National Institute for Occupational Safety and Health
(NIOSH) obtains additional information for work-related traumatic deaths that is
included on death certificates but that is not coded and computerized routinely in all
states (93). State- and industry-specific rates are derived using estimates of the
employed population from the Bureau of Labor Statistics. Analyses from the NTOF
suggest that traumatic occupational fatality rates decreased in the United States
between 1980 and 1985, although, in some instances, large differences were found in
fatality rates by gender and by state within the same industry.

Supplemental information from other sources. Other sources of information may be
available on the circumstances leading to death. In the United States, medical
examiners and coroners are responsible for investigating sudden and unexpected deaths--
homicides, suicides, deaths from unintentional injuries, and unanticipated deaths
from natural causes--which account for about 20% of all deaths each year. Reports
from medical examiners and coroners include detailed information on the circumstances
surrounding death, results of laboratory analyses for alcohol and drugs, and other
relevant information. These reports have been used, for instance, to investigate
deaths associated with horseback riding, drug abuse, hurricanes, earthquakes, and heat
waves (94-98). In 1990, through the Medical Examiner/Coroner Information Sharing
Program, data from investigations of death were reported to CDC's National Center for
Environmental Health (NCEH) in a computerized format from nine state and eight county
medical-examiners' offices (R.G. Parrish, personal communication).

Additional information on fatalities is often available from other sources. In the
United States, for instance, the Fatal Accident Reporting System (FARS) of the
National Highway Traffic Safety Administration (NHTSA) has been used to investigate
the association between use of child restraints and motor-vehicle-related crashes (99)
and the association between premature mortality and alcohol-related traffic crashes
(100). The relationship between homicide and the prevalence of hand-gun ownership in
the United States and Canada has been investigated using data from uniform
crime-reporting registries of all homicides and aggravated assaults maintained by the
Federal Bureau of Investigation in the United States and the Centre for Justice
Statistics in Canada (102). Other sources--such as police, ambulance, and fire
reports--may also include information that is useful for surveillance of particular
health events.

SENTINEL SURVEILLANCE

Overview

The term "sentinel surveillance" encompasses a wide range of activities focused on the
monitoring of key health indicators in the general population or in special
populations. Characteristics of these activities vary considerably, but, in general,
their primary intent is to obtain timely information needed for public health or
medical action in a relatively inexpensive manner rather than to derive precise
estimates of prevalence or incidence in the general population. The term "sentinel"
has been applied to key health events that may serve as an early warning or represent
the tip of the iceberg; to clinics or other sites where health events are monitored;
or to networks of health-care providers who agree to report information on one or more
health events. A sentinel health event, according to Rutstein, is a "preventable
disease, disability, or untimely death whose occurrence serves as a warning signal
that the quality of preventative and/or therapeutic medical care may need to be
improved" (102). Sentinel surveillance, according to Woodhall, represents "an attempt
to find a system that would provide a measure of disease incidence in a country in the
absence of good nation-wide institution-based surveillance without having to resort to
large expensive surveys" (103). Sentinel surveillance systems are not limited to
developing countries. In Europe, routine morbidity surveillance is often conducted by
networks of primary care providers who routinely report information on conditions that
are relatively common in general practice (104,105).

Sentinel Health Events

Sentinel health events are monitored for many different public health programs. In
the United States, sentinel surveillance for maternal mortality, first used in New
York City in the 1930s, was associated with a rapid decline in mortality associated
with childbirth. For each case, medical panels reviewed pertinent records to identify
missed opportunities that might have prevented a presumably unnecessary death.
Similar methods have been used to monitor deaths of infants. In Massachusetts, review
of records indicated that, in 1967-1968, about one-third of the deaths of infants
could have been prevented by medical intervention (102). Monitoring preventable
conditions can also highlight more general problems. For instance, a review of deaths
among infants from Rh hemolytic disease, about 90% of which are considered
preventable, indicated that mothers of many affected infants did not have medical
insurance coverage (106). Quality of care has also been evaluated using conditions
for which death or disability could have been prevented, including evaluation of
hospital-based mortality rates after adjustment for certain patient characteristics
(107-109).

Sentinel surveillance activities have been particularly useful for identifying health
events that may be related to occupational exposures. Lists of occupation-related
health events have been developed, some of which (e.g., mesothelioma and angiosarcoma
of the liver) are specifically tied to environmental or occupational exposure, and some
of which (e.g., lung cancer and bladder cancer) have other risk factors as well (102).
Mesothelioma, for instance, is a rare form of cancer specifically associated with
exposure to asbestos that may identify the "tip of the iceberg" of asbestos-related
disease in an industry in which workers develop more common conditions, such as lung
cancer and chronic obstructive pulmonary disease.

In the United States, NIOSH has developed the Sentinel Event Notification System for
Occupational Risks (SENSOR) program, which focuses on surveillance of specific
occupational conditions by networks of sentinel providers (210). Target conditions
monitored by at least one of the 10 states initially included in the program include
silicosis, occupational asthma, pesticide poisoning, lead poisoning, and carpal-tunnel
syndrome. When cases identified by sentinel providers (usually physicians who
practice occupational medicine) are found to be occupation-related, intervention
activities are undertaken by state health departments in order to prevent additional
cases. Although primarily used for case identification and follow-up, information
derived from SENSOR projects may augment other sources of information on trends for
occupation-related disorders.

Health indicators that are monitored in many different countries could also be
considered sentinel health events. Infant-mortality rates, for instance, are used in
both developing and developed countries as an indicator of the availability and the
quality of medical care. In Europe and the United States, additional health
indicators are monitored routinely to assess the general health of the population. In
Europe, 22 key health indicators have been monitored routinely since 1986 through
WHO's Health for All activity in order to compare progress toward reducing preventable
morbidity and mortality in participating countries (7). In the United States,
specific goals and objectives for improving the nation's health are monitored using
key health indicators. Goals and objectives initially developed for 1990 have been
revised and expanded for the Year 2000 so that progress toward attainment of specific
objectives can be monitored quantitatively (73). A total of 226 goals and objectives
for the Year 2000 has been proposed for use in monitoring health status at the
national level and a subset of 18 indicators has been selected for monitoring by all
levels of government (112). Most of these 18 community-health-status indicators are
based on vital statistics and data from the NNDSS.

Sentinel Sites

Sentinel hospitals, clinics, and counties can often provide timely information on a
wide range of health conditions that is not available from other sources. Although
information is generally not available for the entire population, sentinel systems in
both developing and developed countries can provide sufficient information for making
public health decisions and for detecting long-term trends. In developing countries,
the WHO Expanded Project on Immunization uses sentinel hospitals and clinics in 25
target cities to monitor the impact of vaccination on the incidence of neonatal
tetanus, poliomyelitis, diphtheria, measles, pertussis, and tuberculosis (203). After
initial contact with many hospitals and clinics, officials choose sentinel sites that
serve populations as similar as possible to the general population. In developed
countries, sentinel providers, hospitals, and clinics are used to monitor conditions
for which information is not otherwise available. Sentinel primary-care providers
report information on conditions seen in ambulatory settings, while sentinel sites--
such as drug, sexually transmitted disease, and maternal and child health clinics--
monitor conditions in subgroups that may be more vulnerable than the general
population.

Sentinel hospitals, clinics, and counties can also provide public health information
that is not readily available from other sources. In the United States, for instance,
viral hepatitis is a notifiable disease, but non-A non-B hepatitis (most of which is
hepatitis C) is under-reported, and not all of the detailed information on serology,
demographics, and routes of transmission needed for monitoring is routinely available.
To obtain such information, patients with hepatitis reported to four county health
departments are interviewed, are tested serologically at regular intervals after the
onset of illness, and are followed prospectively to determine whether they have
acquired hepatitis B or hepatitis C-related chronic liver disease (112,123). Taken
together, these sentinel counties are intended to be representative of the incidence
and epidemiologic characteristics of hepatitis B in the United States. Findings from
these sentinel counties have highlighted the increasing importance of parenteral drug
use in the transmission of both hepatitis B and C.

Sentinel sites are also used in the United States for surveillance of
HIV infection (114). Since the epidemic of HIV comprises multiple sub-epidemics in
different population groups and different geographic areas, progression of the
epidemic can be monitored by directing surveillance efforts at groups who are
at increased risk of HIV infection. The use of standardized survey methods and
serologic testing procedures facilitates comparison of findings from the different
groups. Included in the HIV family of surveys are studies of groups that receive care
through publicly funded clinics--including those for tuberculosis, drug treatment,
sexually transmitted disease, family planning, and prenatal care. Other sentinel
groups in which HIV prevalence is monitored include hospital patients with diagnoses
that are not likely to be associated with HIV infection, women at the time of
childbirth, blood donors, military recruits, Job Corps applicants, university
students, prisoners, migrant farm workers, and homeless persons. Findings from HIV
sentinel surveillance systems have been used to monitor progression of the epidemic in
vulnerable populations and to estimate prevalence in the community at large.

Sentinel Providers

Networks of sentinel general or family practitioners and other primary care providers
are active in many European countries and in the United States, Canada, Israel,
Australia, New Zealand, and other countries (115-117). Providers in some of these
networks conduct independent research projects, but many of them--particularly in
Europe and Australia--report surveillance data that are used by national health
agencies. Primary-care practitioners can provide timely information for surveillance
because they generally provide the first professional judgment for medical problems
that are seen in early stages. In most networks, primary-care physicians report a
minimum amount of information, usually at weekly intervals, on a select group of
health events that are relatively common in general practice. A wide range of health
events is reported by these networks, including the following: infectious diseases
that are and are not notifiable in that country; conditions such as dementia, gastric
ulcers, multiple sclerosis, acute pesticide poisoning, and drug abuse; and requests
for services, such as mammography, cervical smears, and testing for HIV (104).
Although most systems are based on reports by primary-care practitioners, the extent
to which rates can be calculated that reflect morbidity in the general population is
related in large part to the manner in which medicine is organized and practiced in
that country. For instance, morbidity reporting by sentinel general practitioners
would more closely approximate morbidity in the general population in countries with
universal health-care coverage in which patients are assigned to the same provider or
group of providers, in which specialists are seen only by referral, and in which
sentinel providers are selected that serve populations that are demographically
similar to the general population. None of the existing networks meet all of these
criteria, and the most enduring networks are usually characterized by highly motivated
volunteer providers who report information consistently over time. When the
population from which patients are drawn cannot be characterized, the number of cases
relative to the total number of patients seen or the number of reporting physicians is
usually monitored. Regardless of the strengths and limitations of each network, most
are able to provide preliminary descriptive information in a timely manner for health
events seen in ambulatory-care settings for which information is not otherwise
available.
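
The choice among these denominators can be sketched as follows. The function name,
its parameters, and the weekly figures are hypothetical; the sketch simply prefers a
population-based rate when the registered population is known and otherwise falls
back to cases per 1,000 patients seen or cases per reporting physician.

```python
def weekly_indicator(cases, registered_patients=None, patients_seen=None,
                     reporting_physicians=None):
    """Return (value, label) using the best denominator that is available."""
    if registered_patients:
        return cases / registered_patients * 100_000, "per 100,000 registered patients"
    if patients_seen:
        return cases / patients_seen * 1_000, "per 1,000 patients seen"
    if reporting_physicians:
        return cases / reporting_physicians, "per reporting physician"
    raise ValueError("at least one denominator is required")


# Hypothetical weekly report of influenza-like illness from one network.
cases = 36

population_value, population_label = weekly_indicator(cases, registered_patients=120_000)
visits_value, visits_label = weekly_indicator(cases, patients_seen=4_500)
physician_value, physician_label = weekly_indicator(cases, reporting_physicians=60)

for value, label in [(population_value, population_label),
                     (visits_value, visits_label),
                     (physician_value, physician_label)]:
    print(f"{value:.1f} {label}")
```

Only the first indicator approximates a rate in the general population; the other two
are useful mainly for following trends within the same network over time.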

A recent survey by Eurosentinel, a newly- formed consortium funded by the European
Economic Community to coordinate activities of sentinel general -practitioner networks,
found that, as of March 1990, there were at least 39 active networks in Europe (104) .
Among the more established networks are those in Great Britain, the Netherlands,
Belgium, and France. Ten of these participated in joint data-collection efforts
including weekly reporting of mumps, measles, and influenza-like illness, and studies
of the use of selected laboratory tests in general practice and of requests for
HIV testing (105,118).

The oldest sentinel-provider network in Europe, the Weekly Returns Service, was
organized by the Royal College of General Practitioners in Great Britain and has been
in continuous operation since 1962 (104). In 1990, 242 volunteer general
practitioners from 66 practices in Great Britain reported weekly incidence data for 44
conditions selected collaboratively by participating practitioners, epidemiologists,
and health-service providers (119). These sentinel providers report conditions for
about 1% of the population, and rates per 100,000 population can be calculated using
information from patient lists. Reported conditions range from those with official
notification procedures in Great Britain (e.g., measles and whooping cough) to
conditions (e.g., multiple sclerosis, rheumatoid arthritis, thyrotoxicosis, and
attempted suicide) for which less information is routinely available from outpatient
settings (104,119,120). Information from the Weekly Returns Service has been
particularly useful for monitoring trends in influenza and related illnesses in Great
Britain.
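
As a minimal sketch of the rate calculation described above, the following Python function converts a case count and a patient-list denominator into a rate per 100,000 population. The function name and the figures in the example are illustrative assumptions, not data from the Weekly Returns Service.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Return the number of cases per 100,000 persons in the denominator."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Hypothetical week: 38 reported cases among 550,000 patients on practice lists.
weekly_rate = rate_per_100k(38, 550_000)
print(round(weekly_rate, 1))
```

The calculation is only meaningful when the denominator is known, which is why patient lists matter for this kind of network.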

The Surrey University Morbidity Network, also covering about 1% of the population of
Great Britain, has been operational since 1974 (104). In 1990, 42 infectious and
non-infectious conditions were monitored by 120 practices. One of the purposes of this
network is to examine seasonal and other environmental influences on morbidity. Data
have been collected and transmitted electronically since 1985, and participating
physicians receive reports regularly.

A network of sentinel general practitioners has reported to the Netherlands Institute
of Primary Health Care (NIVEL) since 1970 (104,121,122). The primary purpose of this
network, which covers about 1% of the population, is to gather reliable epidemiologic
data on health problems, as well as on actions taken by providers to address these
problems. In 1990, 45 practices involving 63 general practitioners participated in
the network. Information on 16 topics was reported weekly in 1988-1989, including
requests for sterilization, referrals for speech therapy and echocardiography, and
newly diagnosed cases of dementia. Reasonable estimates of morbidity are possible
because access to medical specialists is available only by referral, a relatively
well-defined population is served by each practice, and practitioners,
although volunteers, are chosen so that the distribution of their patients is as
representative of the Dutch population as possible (121). Many descriptive studies
have been published using information provided by the Dutch network (121-123).

The Belgian Sentinel Practice Network has been operated by the National Health
Department since 1979 (124-126). Each year, about 1,500 general practitioners are
contacted, about 10% of them usually agree to participate, and a final group is
selected so that their patients are representative of the age and sex distribution of
the general population. An estimated 1.3% of the population in Belgium were seen by
sentinel practitioners (104). In 1990, measles, acute respiratory infections, new
cases of cancer, suicide attempts, and requests for HIV tests were reported by the
network, in addition to five officially notifiable diseases (gonorrhea, infectious
hepatitis, meningitis, syphilis, and urethritis). Dissemination of the information is
one of the strengths of the Belgian network. Bimonthly and annual reports are sent to
participating practitioners, to the Ministry of Public Health, to medical and public
health schools, to professional organizations, and to the press.

In France, networks of sentinel primary-care providers transmit and receive
information on selected conditions using computer terminals and modems available
nationally at low cost (127). Interactive electronic systems are used by the national
French Communicable Diseases Computer Network (FDCN), as well as by local and regional
networks in the cities of Toulouse and St. Etienne, and in the regions of Aquitaine,
France-Sud, and Lyon (104). The largest network, the FDCN, has been operated by the
National Health Department and the National Institute of Health since 1984. In 1990,
about 550 volunteer sentinel general practitioners, about 1% of the number throughout
France, reported new cases of influenza, viral hepatitis, urethritis, measles, and
mumps each week, none of which were officially notifiable (104,128). Since the
underlying population seen by reporting physicians is not known, trends are usually
expressed as the average number of cases per reporting physician per week.
Information is also transmitted directly by national, hospital, and other
laboratories; and local, regional, and national health agencies are also included in
the network (127). Electronic mail and bulletin boards are used to disseminate
information, and reporting physicians can contact researchers and obtain literature
searches through the network.
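
When the underlying population is unknown, the trend measure described above is simply cases divided by the number of physicians who reported that week. The sketch below illustrates this; the function name and the weekly figures are hypothetical and are not FDCN data.

```python
def cases_per_physician(weekly_cases, weekly_reporters):
    """Average number of cases per reporting physician for each week.

    Weeks in which no physician reported yield None rather than a rate.
    """
    averages = []
    for cases, reporters in zip(weekly_cases, weekly_reporters):
        averages.append(cases / reporters if reporters else None)
    return averages


# Hypothetical three-week series of influenza-like illness reports.
cases = [120, 300, 450]
reporters = [400, 500, 450]
print(cases_per_physician(cases, reporters))
```

Dividing by the number of physicians who actually reported, rather than by the number enrolled, keeps the measure comparable across weeks with uneven participation.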

Tracking the spread of influenza-like illness using the FDCN has been particularly
effective. Epidemic thresholds can be calculated on the basis of data from previous
years and the extent of regional spread can be tracked each week (128,129). Unlike
mortality-based surveillance systems, the FDCN was able to show that the 1988-1989
influenza epidemic occurred earlier, was of shorter duration, and affected primarily
young age groups relative to epidemics in previous years (130). In addition to
routine surveillance activities, the FDCN has been used to conduct surveys on
physician attitudes regarding vaccination for measles; the use of measles, mumps, and
rubella trivalent vaccine; HIV testing; and biologic testing for diarrhea (104).
Surveys conducted before and after a nationwide AIDS campaign found that the number of
tests given to women and to heterosexual men increased following the campaign, which
emphasized risks associated with heterosexual activity (131). Studies of diarrheal
disease have been conducted by the Aquitaine network (132). Findings from the
Aquitaine studies, coupled with findings on measles from the FDCN, highlight that
localized outbreaks of disease for which public health action is warranted can be
missed by sentinel networks that typically monitor conditions in about 1% of the
population.
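
The text does not say how the FDCN derives its epidemic thresholds from earlier years. As a stand-in only, the sketch below flags a week as epidemic when the current value exceeds the mean of the same week in previous years plus 1.96 standard deviations. The rule, the function name, and the numbers are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def epidemic_threshold(previous_years, z=1.96):
    """Threshold = historical mean + z * sample standard deviation.

    previous_years holds the value observed for the same calendar week
    in each earlier year (for example, cases per physician).
    """
    return mean(previous_years) + z * stdev(previous_years)


# Hypothetical history for one calendar week over five earlier seasons.
history = [1.0, 1.2, 0.8, 1.1, 0.9]
threshold = epidemic_threshold(history)
current_week = 2.4
print(current_week > threshold)
```

A production system would more likely model seasonality explicitly; this version only shows the idea of comparing a current value with a baseline built from past data.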

In the United States, a network of 139 sentinel physicians reports cases of influenza-
like illness each week to CDC (47,133). Nasopharyngeal specimens are sent by 70
physicians to a central laboratory, which then reports findings to reporting
physicians and to CDC. Physicians also report the total number of office visits per
week so that the percentage of visits by patients with influenza-like illnesses can be
estimated. In 1991, sentinel physicians from the Middle Atlantic and West South
Central regions of the United States reported increased visits for influenza-like
illness by late November, although numbers of such visits had not yet increased in
other areas of the country.

Networks of family practitioners and other primary-care providers have been formed in
the United States and Canada, primarily to conduct collaborative research projects,
but have the potential to conduct surveillance. The descriptive and analytic studies
performed by these networks have been very useful for identifying patterns of illness
in outpatient settings. Unlike most networks in Europe, however, they have generally
not had formal reporting relationships with state or local health agencies that are
responsible for timely public health activities. The Ambulatory Sentinel Practice
Network (ASPN), formed in 1981, includes 334 volunteer clinicians from 71 practices in
the United States and Canada, most of whom are family practitioners and many of whom
practice in rural areas (115,134). Many studies conducted by ASPN--including studies
of pelvic inflammatory disease, spontaneous abortion, chest pain, carpal tunnel
syndrome, and HIV prevalence--have increased knowledge regarding the distribution of
conditions with public health impact among patients seen in private ambulatory-care
settings (135-138).

The Pediatric Research in Office Settings (PROS) network, formed in 1985 and sponsored
by the American Academy of Pediatrics, currently includes about 740 practitioners in
224 practices (139). The PROS network has completed a study of vision screening of
young children and a pilot study of febrile illness among infants. Regional primary-
care networks include the Dartmouth COOP project in northern New Hampshire and
Vermont, the Upper Peninsula Research Network in Michigan, and the Wisconsin Research
Network. Studies with public health impact conducted by regional networks include
studies of cholesterol-, alcohol-, and cancer-screening activities; development of
methods to identify functional deficits; and development of health-maintenance
protocols for use in private practice.

Many of the established networks of primary-care providers participate in
international collaborative organizations, such as the International Primary Care
Network (IPCN), the European Electronic Adverse Drug Reaction Network (EEADRN), and
Eurosentinel (104,140). A recent IPCN study of 3,360 children from nine countries
showed that the proportion of children with otitis treated with antibiotics varied
widely between countries and that antibiotic treatment did not improve the rate of
recovery (117). In association with the British pharmaceutical industry, the EEADRN
monitors adverse drug reactions in the United Kingdom, Ireland, the Netherlands,
Belgium, and Switzerland (204). Approximately 2,350 physicians participate in the
network using hand-held computers to transfer information to the coordinator.

Establishment of a computerized European sentinel-practice network is a long-term goal
of Eurosentinel, although preliminary findings indicate that the existing networks
are quite heterogeneous. Nonetheless, Eurosentinel can serve as a clearinghouse for a
wide range of activities that highlight similarities and differences between
countries, both in patterns of disease and in the practice of medicine and public
health. Eurosentinel could also serve as a model for a broad-based international
consortium of sentinel practice networks.

REGISTRIES

Overview

The use of registries for surveillance and other medical or public health activities
has increased in recent years, largely because information from other sources,
including notifiable disease reporting mechanisms and vital statistics, is often not
adequate for monitoring the public health impact of non-acute diseases (142).
Registries differ from other sources of surveillance data in that information from
multiple sources is linked for each individual over time. Information is collected
systematically from diverse sources, including hospital-discharge abstracts, treatment
records, pathology reports, and death certificates. Information from these sources is
then consolidated for each individual so that each new case is identified and cases
are not counted more than once. Case series and hospital-based registries in which the
population at risk is not known can be useful for a variety of activities, including
descriptive analyses and assessment of treatment effectiveness. However, population-
based registries from which incidence rates can be calculated are generally more
useful. Information from registries is used primarily for research purposes, but in
many instances, registries have been useful for surveillance and related activities.
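
The consolidation step described above, in which reports from several sources are linked so that each case is counted once, can be sketched as follows. The record layout and the `person_id` field are hypothetical; real registries typically have to match on names, dates of birth, and similar identifiers rather than on a ready-made key.

```python
from collections import defaultdict


def consolidate(reports):
    """Merge source reports into one entry per (person, condition).

    Each report is a dict with keys: person_id, condition, source, date
    (date as an ISO 'YYYY-MM-DD' string, so string comparison orders it).
    """
    cases = defaultdict(lambda: {"sources": set(), "first_seen": None})
    for report in reports:
        key = (report["person_id"], report["condition"])
        case = cases[key]
        case["sources"].add(report["source"])
        if case["first_seen"] is None or report["date"] < case["first_seen"]:
            case["first_seen"] = report["date"]
    return dict(cases)


# Hypothetical reports: person A1 appears in two sources for the same cancer.
reports = [
    {"person_id": "A1", "condition": "colon cancer",
     "source": "pathology report", "date": "1990-03-02"},
    {"person_id": "A1", "condition": "colon cancer",
     "source": "hospital discharge", "date": "1990-03-15"},
    {"person_id": "B7", "condition": "colon cancer",
     "source": "death certificate", "date": "1990-06-30"},
]
registry = consolidate(reports)
print(len(registry))
```

Keeping the list of contributing sources for each case also makes it possible to see which sources add cases that the others miss.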

The most successful registries are those where purposes are explicit and realistic,
the data collected are accurate and are limited to essential information, and the
registry meets needs that cannot be accommodated using simpler, less expensive methods
(142,143). Even when data collection appears to be straightforward, the time and
resources required to develop a functional registry are often underestimated. Because
high-quality registries are resource intensive for long periods, they are generally
not available for all geographic areas or exposed groups. Also, the complexity of the
data-collection process limits the extent to which data can be made available rapidly.

Registries have been used to monitor a wide range of health events and have identified
opportunities for public health prevention and control activities. For instance,
analysis of data from one of the earliest registries--of blind persons in Great
Britain--found that blindness among a substantial proportion of the elderly was due to
treatable cataracts, a finding that had not been previously recognized (142). Other
health events that have been monitored using registries include rheumatic fever,
mental illness, Alzheimer's disease and dementia, renal disease, diabetes, heart
disease, head and spinal cord injuries, child abuse, early childhood impairments, and
occupation-related diseases such as berylliosis (16,144-149).

Registries are also used to monitor health events in groups with increased exposure to
hazardous agents, including radiation and hazardous chemicals found in the work place
and the environment (150-154). Cancer, however, is by far the most common condition
for which registry information is used for surveillance.

Case Series and Hospital-Based Registries

Case series and hospital-based registries have been useful for surveillance-related
activities even though population-based rates usually cannot be estimated. Changes in
the descriptive epidemiology of berylliosis have been monitored using a registry, for
instance (148,155). Cases of berylliosis increased sharply in the United States in
1939 to 1941 following an increase in the use of beryllium in large-scale manufacture
of fluorescent lamps and in war industries. The number of cases, among both workers
and those who lived near production facilities, declined rapidly following changes in
the manufacturing process and adoption of an exposure standard. Case registries have
also been used to study relatively rare conditions such as mesothelioma among those
exposed to asbestos and adenocarcinoma of the vagina among women exposed prenatally to
diethylstilbestrol (156).

For most case registries, however, the primary goal is to provide information that can
be used to improve patient care. Registers of cancer patients are maintained by many
hospitals, and, more recently, some hospitals have established registries of persons
who have been treated for traumatic events. In the United States, hospital-based
cancer registries have been promoted by the American College of Surgeons since 1931
and have been required as part of their cancer program since 1953 (156). Standardized
software was made available to hospitals beginning in the 1980s, and development of an
electronic data-transfer standard allowed information to be transmitted centrally from
nearly 2,000 hospitals, beginning in 1990 (157). The newly formed National Cancer Data
Base of the American College of Surgeons includes basic information on about 20% of
all cases of cancer diagnosed each year in the United States. By highlighting the
importance of histologic confirmation prior to treatment, hospital-based cancer
registries have been particularly useful in improving the overall quality of treatment
for cancer.

More recently, development of regional and state systems for trauma care has prompted
the development of hospital-based trauma registries. The first computerized trauma
registry in the United States was developed in 1969 at Cook County Hospital in Chicago
and was expanded to a statewide registry in 1971 that included information from 50
hospitals designated as trauma centers in the state (141,143,157). National surveys in
1987 identified 105 hospitals in 35 states with hospital-based trauma registries and
10 states with central trauma registries (158). The registries differed considerably,
however, in the criteria used for inclusion of cases, the type of data collected,
coding conventions, and the manner in which data were used. In an effort to make
information in hospital-based trauma registries more comparable, standardized case
criteria and a core set of recommended data items, along with supporting computer
software, were developed by CDC and others in 1988 (159). Although data from most
existing trauma registries are not population-based, they have been used to support
primary prevention activities. For instance, findings from the Virginia Statewide
Trauma Registry and other sources were used to support legislation regulating the use
of all-terrain vehicles (158).

Population-Based Registries

Population-based registries are particularly useful for surveillance because, using
incidence rates, the occurrence of a health event can be estimated over time in
different geographic areas and subgroups of the population. For most registries, the
population from which cases are identified is the general population of a specified
area. Most cancer and birth defects registries, for instance, estimate rates for the
general population. The population from which cases are identified can also arise from
a group defined by a specific exposure that is thought to increase the risk of
illness.

Descriptive analysis of incidence rates based on registry information can be used for
health planning purposes and can suggest etiologic hypotheses that can be evaluated
further with additional studies (50,159-162). For some conditions, comparisons between
incidence and mortality rates can be used to estimate the effectiveness of primary
prevention, early detection, or treatment programs. Findings from studies based on
registry information can also encourage physicians to abandon less-than-effective
individual therapies, thus improving the standard of medical care.

Exposure Registries

Examples of exposure-based registries include those of the survivors of the atomic
bombing of Hiroshima and Nagasaki during World War II and their offspring, and of other
groups of persons exposed to radiation (152,163-167). Because workers are often exposed
to higher levels of physical, chemical, and biologic agents for longer periods than is
the general public, follow-up of cohorts of workers has been used for many years to
identify illnesses associated with these agents and to assess how these illnesses can
be prevented.

Registries have also been used to assess the risk of illness for general
population groups exposed to specific agents. For instance, about 4,600 individuals
exposed to polybrominated biphenyls through contamination of dairy cattle-food
supplements in Michigan were followed to assess acute, subacute, and chronic
conditions that might have been associated with this exposure (168). More recently,
the United States Congress has mandated that the Agency for Toxic Substances and
Disease Registry (ATSDR) address potential public health problems associated with
environmental exposures to hazardous waste sites and chemical spills, partly through
the creation of registries (150). ATSDR has described the rationale for a national
exposure registry and methods to be used in its establishment and maintenance.

Cancer Registries

Cancer registries are used in many different countries to estimate cancer incidence
and mortality rates over time. The Connecticut Tumor Registry, the oldest population-
based cancer registry in the United States, has monitored cancer incidence rates for
nearly 50 years (156). Like hospital-based registries, the Connecticut registry was
developed initially to support the goals of service-oriented hospital-based cancer
registries throughout the state. Through the Surveillance, Epidemiology, and End
Results (SEER) program, the NCI has collected information from specific population-
based cancer registries since 1973. Participant registries were selected to include a
variety of population groups rather than a representative sample of the United States,
although nationwide rates can be estimated using SEER data. The four major goals of
the SEER program are:

• to estimate cancer-related incidence and mortality in the United States;

• to identify unusual changes in the incidence of specific types of cancer
over time in designated areas or demographic subgroups;

• to describe changes in the extent of disease at diagnosis and to estimate
patient survival; and

• to foster studies of cancer risk factors, screening, and prognostic
factors to allow intervention.

The SEER registry is probably the largest population-based registry in the Western
world (156). Between 1973 and 1988, the program registered about 1.5 million incident
cases of cancer. At present, about 10% of the United States population lives in one of
the nine areas that include a SEER registry, and approximately 120,000 new cases of
cancer are registered from these areas each year (169). For all types of cancer
(except certain types of skin cancer), information on selected patient demographics is
recorded in addition to information on primary site, morphology, confirmation of
diagnosis, extent of disease, and first course of treatment. The registries also
actively follow all living patients to ascertain vital status (except those with in
situ cervical cancer). Incidence rates for cancer based on SEER registry information
are published regularly, and descriptive analyses of cancer incidence rates by age,
race, gender, and geographic area are routinely performed. Although not part of the
SEER system, many states--including New York, California, and New Jersey--maintain
active, high-quality cancer registries that are used for both public health and
hospital-directed activities. In 1989, there were 42 cancer registries in the United
States, including 28 state-based registries that cover part or all of a state's
residents (170).

In Europe, the first cancer registry was founded in Denmark in 1942, and there has
been steady growth in the number of registries and the size of included populations
since then (171). At present, Denmark, Belgium, England and Wales, and Scotland have
nationwide registries, and most European countries have registries in certain regions.
Information from cancer-incidence registries around the world is collected by the
International Agency for Research on Cancer (IARC), which is part of WHO. As of 1989,
IARC had identified 238 population-based registries in 53 countries that collected
information on cancer incidence, and rates were available for selected years from 106
of these registries (170).

Registries provide important information for a wide range of public health activities,
but their usefulness for identifying new hazards has, in practice, been limited.
Initial observations by astute clinicians rather than routine analysis of surveillance
data have led to more extensive studies to investigate associations between
angiosarcoma and vinyl chloride, mesothelioma and asbestos, and diethylstilbestrol and
adenocarcinoma of the vagina (171). Cancer registries were essential, however, for
identifying cases that were evaluated in more extensive epidemiologic investigations.
Today, cancer incidence rates from population-based registries are used extensively in
cancer-cluster investigations to assess whether the number of observed cases differs
substantially from an expected number derived from baseline cancer incidence rates.
With increased emphasis on screening activities to detect asymptomatic cancer cases at
an early, more treatable stage and on behavioral-risk-factor control and possibly
chemo-prevention, the public health importance of high-quality, population-based
cancer registries should increase.
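
The comparison of observed and expected cases used in cluster investigations can be sketched in a few lines. The example below derives an expected count from baseline age-specific rates and person-years, then computes the observed-to-expected ratio and the Poisson probability of seeing at least the observed number of cases. The age groups, rates, and counts are hypothetical, and the Poisson assumption is a common simplification rather than a method taken from the text.

```python
from math import exp, factorial


def expected_cases(rates_per_100k, person_years):
    """Sum of baseline rate x person-years over age groups."""
    return sum(rates_per_100k[g] * person_years[g] / 100_000
               for g in rates_per_100k)


def poisson_upper_tail(observed, expected):
    """P(X >= observed) when X follows a Poisson(expected) distribution."""
    below = sum(exp(-expected) * expected ** k / factorial(k)
                for k in range(observed))
    return 1.0 - below


# Hypothetical community: baseline registry rates and local person-years.
baseline_rates = {"0-49": 2.0, "50+": 20.0}      # cases per 100,000 per year
local_person_years = {"0-49": 60_000, "50+": 40_000}

expected = expected_cases(baseline_rates, local_person_years)
observed = 15
ratio = observed / expected                       # observed-to-expected ratio
p_value = poisson_upper_tail(observed, expected)
print(round(expected, 1), round(ratio, 2))
```

A small tail probability suggests that the observed count is unlikely under baseline rates, but it does not by itself establish a cause.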

Birth-Defects Registries

Recognition of an epidemic of limb reduction defects among children exposed prenatally
to thalidomide stimulated interest in developing population-based birth-defects
registries in many countries. Some birth-defects surveillance systems (e.g., the
Birth Defects Monitoring Program (BDMP) in the United States) use available sources
of information including vital statistics and hospital-discharge data to monitor
trends in the birth prevalence of various birth defects (172). This type of passive
monitoring system is discussed further in the section on administrative data in this
chapter.

Like most cancer-incidence registries, however, birth-defects registries characterized
by active case finding obtain information on individual cases from multiple sources.
In the United States, the Metropolitan Atlanta Congenital Defects Program (MACDP) has
been operated by CDC's National Center for Environmental Health (NCEH) (172-174).
All births are monitored in the five-county metropolitan Atlanta area--about 35,000
births per year. Included in the MACDP are all live-born and stillborn infants
diagnosed as having at least one major birth defect within their first year of life,
with diagnoses ascertained within their first 5 years of life. Birth-defect rates and
trends are monitored by quarterly reviews and analysis of data and are published
regularly by CDC. Numerous investigations have been performed using MACDP data,
including studies of Vietnam veterans' risk for fathering children with birth defects,
the risk of bearing children with specific birth defects for women with insulin-
dependent diabetes, and an apparent protective effect of periconceptual vitamin use
on the risk of neural tube defects (175-177). In addition, the MACDP has served as a
prototype for other birth-defects registries characterized by active case-finding
(172).

Use of equivalent case definitions, more specific coding schemes, and a uniform set of
variables has facilitated collaborative efforts between the eight birth-defects
registries in the United States characterized by active case-finding (172). For
instance, surveillance for specific birth-defects associated with first trimester
exposure to isotretinoin relies on collaborative efforts by CDC and state birth-
defects registries.

In Europe, population-based birth-defects registries are coordinated through EUROCAT,
which is funded through the European Economic Community (178). In 1983, birth defects among
250,000 births were monitored by 17 birth-defects registries in 10 countries. Both
active and passive birth-defects registries participate in the International
Clearinghouse for Birth Defects Monitoring Systems (ICBDMS), founded in 1974 by WHO as
a means of disseminating birth-defects data from surveillance systems around the
world. Information is available each year on birth defects among more than 4.5 million
births in 30 countries. Although methods used by various registries differ
considerably, the ICBDMS provides a forum for rapid dissemination of information on
teratogens. Reports from France linking valproic acid, an anti-epileptic drug, with an
increase in spina bifida were disseminated rapidly through this international network
(179,180).

More recently, registries are being developed in some local communities to
monitor preschool children for whom early intervention programs are needed. These
programs can identify children with conditions such as fetal alcohol syndrome,
cerebral palsy, mental retardation, and behavioral or learning disabilities that are
often detected shortly after birth. These registries will be useful for estimating the
prevalence of these conditions, as well as for monitoring the effectiveness of
services provided to children with special needs.

SURVEYS
Overview

Health surveys, particularly those that are conducted on a continual or a periodic
basis, can provide useful information for assessing the prevalence of health
conditions and potential risk factors and for monitoring changes in prevalence over
time. More recently, health surveys have also been used to assess knowledge,
attitudes, and health practices in relation to certain conditions such as HIV/AIDS. A
survey differs from a registry in that persons surveyed are usually only queried once
and are not monitored individually after that one contact. Information on respondents
can be obtained through questionnaires, in-person or telephone interviews, or through
record reviews. Attempts are made to assure that the survey sample is as
representative of the source population as possible in order to increase the validity
and reliability of estimates extrapolated to that population. Surveys can be
valuable for public health surveillance if similar information is collected over time
and if findings are applied to public health activities.

In the United States, surveys such as NCHS's National Health Interview Survey (NHIS)
are important sources of information for monitoring nationwide trends in the
prevalence of target conditions and risk factors for which national health objectives
for the year 2000 have been established (73,181). Nationwide surveys are costly,
however, and due to their complex sample designs, specialized statistical techniques
are often needed for analysis. Since information is usually not available at a local
level, the usefulness of national surveys for local surveillance activities is
limited.

Health Interview Surveys

In the United States, the NHIS, conducted annually since 1957, provides information on
self-reported illnesses, chronic conditions, injuries, impairments, the use of health
services, and other health-related topics for the civilian, non-institutionalized
population (182,183). Households are identified through a complex sample design
involving both clustering and stratification. Households selected for interview each
week are a probability sample from a primary sampling unit such as a county or
metropolitan area. Respondents are interviewed in their homes with an adult family
member providing information for other members of the household. Each year,
information is collected on about 122,000 people from about 48,500 households (2). The
interviews, which average about 80 minutes, include a core set of health and socio-
demographic questions that are repeated each year and a supplemental section in which
detailed information is collected on specific health topics. In 1987, for instance,
supplemental information was collected on risk factors for cancer and on knowledge and
attitudes regarding AIDS. NHIS questions will be modified in the future so that
progress toward meeting the year 2000 health objectives for the nation can be
monitored closely.
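
Because households are selected with unequal probabilities under a clustered, stratified design, prevalence estimates are normally weighted. The sketch below shows only the point estimate, using hypothetical respondents and weights; it is not the NHIS estimation procedure, and it does not address variance estimation, which requires methods that account for the clustering.

```python
def weighted_prevalence(responses):
    """Weighted proportion of respondents reporting a condition.

    responses is a list of (has_condition, weight) pairs, where weight is
    the number of people in the population that the respondent represents.
    """
    total_weight = sum(weight for _, weight in responses)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    with_condition = sum(weight for has, weight in responses if has)
    return with_condition / total_weight


# Hypothetical sample of four respondents with unequal sampling weights.
sample = [(True, 1000.0), (False, 3000.0), (True, 500.0), (False, 500.0)]
print(weighted_prevalence(sample))                   # weighted estimate
print(sum(has for has, _ in sample) / len(sample))   # unweighted, for contrast
```

In this made-up sample the unweighted proportion is 0.5 while the weighted estimate is 0.3, which shows how ignoring the design can distort a prevalence figure.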

In England, Scotland, and Wales, the General Household Survey (GHS), in which
information on housing, employment, education, health, and use of social services is
obtained using structured personal interviews, has been in operation since 1971 (2). An
analogous Continuous Household Survey is conducted in Northern Ireland. Electoral
wards form the primary sampling units, and about 85% of households--a total of about
12,000 per year--agree to participate in the GHS. Over time, the health section of the
survey has included questions on limitations in activities because of acute or chronic
illnesses, smoking and drinking patterns, and contacts with health-care providers and
other health-related topics. The ability to compare health-related information with
extensive socio-demographic information is one of the major strengths of these
surveys.

In the United States, CDC's National Center for Chronic Disease Prevention and Health
Promotion (NCCDPHP) has worked with state health departments since 1981 to conduct
telephone surveys about adult health behavior and use of prevention services. The
primary purpose of these surveys is to support state prevention initiatives.
Questionnaires used by the Behavioral Risk Factor Surveillance System (BRFSS) include
a core set of questions, and, depending on a state's interest, supplemental questions
developed by CDC and questions that meet state-specific needs (184). The 1988 BRFSS
included questions on height, weight, physical activity, smoking, alcohol use, seat-
belt use, and use of prevention services, such as cholesterol screening and
mammography. By 1990, 45 states and the District of Columbia were conducting these
surveys. Some states have used BRFSS procedures to conduct more detailed studies. In
Missouri, for instance, cholesterol awareness was compared in urban and rural areas,
and in California, cigarette smoking was compared among Chinese,
Vietnamese, and Hispanics in three communities (185,186). Information from the BRFSS
is timely and can reflect the particular interests of a state or local community. Use
of telephones for interviewing is economical, although persons without telephones,
who are not included in these surveys, are generally more likely to be in need of
public health services than are many of the respondents.

Since 1988, NCCDPHP has developed and implemented a Youth Risk Behavior Survey (YRBS)
to focus the efforts of local, state, and federal agencies that monitor the behavior
of young people (187). In 1990, the national survey used a three-stage sample design
to obtain a probability sample of 11,631 students in grades 9 through 12 in 50 states,
the District of Columbia, Puerto Rico, and the Virgin Islands. From the 1990 survey,
estimates are available for the prevalence of tobacco use, alcohol and drug use,
exercise, diet, types of behavior that affect the risk of intentional and
unintentional injuries, and sexual activity (188-194). The YRBS was designed to
measure changes in these types of behaviors biennially so that progress toward meeting
year 2000 objectives can be monitored.

Provider-Based Surveys

In the United States, information on the use of health-care services is not available
routinely. In order to estimate the use of these services nationally, NCHS has
developed two complementary surveys, the National Hospital Discharge Survey (NHDS) and
the National Ambulatory Medical Care Survey (NAMCS), in which characteristics of
health encounters are monitored (181,195,196). Through the NHDS, information has been
collected since 1965 on discharges from non-federal, short-stay hospitals, including
characteristics of patients, length of stay, diagnoses, surgical procedures, and
hospital size and type of ownership. Beginning in 1987, computerized information for
some discharges was purchased from commercial abstracting services, but, otherwise,
discharges are sampled randomly from hospitals included in the survey. In 1987,
information was collected on about 181,000 discharges from about 400 hospitals--about
81% of the hospitals that were asked to participate. Although hospital-discharge
information is available in many states, it is not available nationally, so that state
estimates are often derived by extrapolation from the NHDS. Data from the NHDS as well
as other sources have been used, for instance, to assess the public health burden of
nine major chronic diseases (197).

The NAMCS was conducted annually from 1973 to 1981 and in 1985, and it has been conducted
annually since 1989. The target population for the NAMCS is office visits within the
continental United States to non-federal physicians who are in office-based practice and
engaged in direct patient care (9,181,196). About 70% of all ambulatory visits occur in
physicians' offices, and about 70% of selected physicians agreed to participate in the
survey in 1990. Beginning in 1989, about 2,500 physicians were included in the sample,
with each physician completing a short form for about 30 office visits. Information on
visits to hospital out-patient departments and emergency rooms may be added to the
NAMCS in the future. In addition to information on diagnoses, medications, and reason
for visit, the 1990 NAMCS included information on diagnostic and screening services;
counseling for drug, alcohol, and smoking cessation; and other counseling services
(198). Estimates are published at the national level, and for some events, at the
regional level. Unlike hospital-discharge data, ambulatory-care data are rarely
available for routine use at the state or local level in the United States. To obtain
information that could be used in its programs, however, Wisconsin conducted an
ambulatory medical care survey in 1986-1987 based on the NAMCS questionnaire and study
design (199). Proprietary data bases, such as the National Disease and Therapeutic
Index (NDTI), provide ongoing data on conditions seen in ambulatory care settings.
Although used primarily by the pharmaceutical industry, the NDTI has been used to monitor
the public health impact of recommendations to limit the use of aspirin in children
with fevers (200).

Other Surveys

Other NCHS surveys, including the National Survey of Family Growth (NSFG) and the
National Health and Nutrition Examination Survey (NHANES), also contain information
that is useful for public health activities. The NSFG has provided national data on
demographic and social factors associated with childbearing, adoption, and maternal
and child health based on household interviews of women of childbearing age. The
survey has been conducted four times--in 1973, 1976, 1982, and 1988 (201-203).

The NHANES has provided extensive information on the prevalence of chronic conditions,
distribution of physiologic and anthropometric measures, and nutritional status for
representative samples of the U.S. population (204,205). The first two NHANES cycles
were conducted in 1971 through 1974 and 1976 through 1980, and data collection is
currently under way for the third cycle. A Hispanic Health and Nutrition Examination
Survey was conducted in 1982 through 1984 in order to compare health and nutritional
measures among U.S. residents of Mexican, Puerto Rican, and Cuban origin (206). Also,
almost 4,000 persons aged 55 to 74 years who had been interviewed in NHANES I
and were living in 1984 were enrolled in the NHANES I Follow-up Study to assess
whether their characteristics in the 1970s predicted subsequent health outcomes (207).
The NHANES studies are rich sources of information that are used primarily for
epidemiologic and related analyses. They have been used, however, to provide point
estimates to monitor changes over time in health outcomes, such as changes in
blood-lead levels (208). In general, sources of information that are available for more
of the population over longer periods are more useful for routine
surveillance activities.

ADMINISTRATIVE DATA-COLLECTION SYSTEMS
Overview

Through the use of standard procedures and classification schemes, vital statistics
are derived from birth and death certificates, completed primarily for legal reasons.
Likewise, information on conditions not evident at the time of birth or death can be
derived from administrative information routinely available on episodes of care
(including hospitalizations, visits to emergency rooms, and visits to health-care
providers in the community). In most instances, routinely collected administrative
data have been computerized for billing purposes, but since diagnoses are often
included, these data sets can provide useful information for public health
surveillance. As computerized administrative data become increasingly available,
their importance for monitoring a wide range of health outcomes is increasing.

Availability and usefulness of administrative data for surveillance depend on a number
of factors including:

• the type of information that is computerized;

• the extent to which uniform classification schemes are used to categorize
diagnoses, signs, symptoms, procedures, and reasons for seeking health
care;

• the availability of sufficient computer capacity and user-friendly
software programs to process large amounts of data;

• the extent to which supplementary information can be obtained; and

• the extent to which information for individuals from different
administrative sources or time periods can be linked using a unique
personal identifier.

Data that include personal identifiers are particularly useful both because statistics
can be calculated on the basis of persons rather than on episodes of care and because
additional information can often be obtained through linkage with other data sets.
Special precautions are needed, however, to protect the confidentiality of individuals
when personal identifiers are included in computerized administrative data bases.
Even when personal identifiers are not included, administrative data can be very
useful for assessing the public health burden of various conditions based on
the number of health-care visits and their costs.

Integrated health-information systems based on administrative data are available in a
few countries, but in most, information may be available only for certain types of
health care (e.g., hospitalizations) or for certain segments of the population (e.g.,
those who receive care through the public sector). Although administrative data are
usually incomplete, their analysis has proved useful for public health surveillance and
program planning.

Integrated Health Information Systems

Integrated health-information systems, in which data on individuals are consolidated
from a variety of sources, are available in Sweden, in Canada, and for limited groups in
the United States. In Sweden, for instance, use of a unique personal identifier
assigned at birth allows the linkage of computerized information on individuals from a
variety of sources, including birth and death certificates, the cancer registry,
hospital discharge summaries, and prescription records (209). In addition to
etiologic studies, linked Swedish data bases have been used for a variety of
surveillance-related analyses. Examples include estimating the incidence of acute
myocardial infarction; comparing methods of ascertaining myocardial infarction using
community registers, hospital discharge data, and mortality data; and assessing
temporal trends in the incidence of hip fracture (144,146,210).

In Canada, the Saskatchewan Health Plan maintains population-based billing information
including diagnoses from inpatient, outpatient, and prescription records for
approximately 1 million residents beginning in 1979 (211,212). This information,
which has been used in studies of associations between nonsteroidal anti-inflammatory
drugs and fatal gastrointestinal bleeding and of associations between valproic acid
use and congenital malformations, could also be used for ongoing surveillance
activities (213,214).

In the United States, integrated health-information systems have been developed for
some health-maintenance organizations, such as the Kaiser Permanente system, or for
geographic areas served by one major health care provider--such as Rochester,
Minnesota. Although used frequently for research, the few integrated
health-information systems in the United States are of limited use for general public
health surveillance because the populations included in them are relatively small and not
representative of the U.S. population. These systems are useful, however, for
providing information on incidence and prevalence for conditions difficult to monitor
nationally--such as the trends in incidence for specific types of primary intracranial
neoplasms (215) and the prevalence of osteoarthritis of the knee with and without
corroborative radiographic findings (216).

Hospital-Discharge Data Systems
Overview

The importance of collecting information on morbidity from hospital records was noted
by Florence Nightingale among others, although attempts to collect and analyze this
information systematically were not initiated until the 1940s in Scotland (2,217).
Today, information from hospital discharge summaries--including demographic
information and discharge diagnoses--is routinely collected and computerized using
standard data-set formats such as the 1981 Recommended Minimum Basic Data Set (RMBDS)
for the European Community and the Uniform Hospital Discharge Data Set (UHDDS) or the
Medicare Uniform Bill-82 (UB-82) formats in the United States (218,219). Both the UHDDS
and the UB-82 formats are currently being revised in tandem.

In Scotland, for example, a standard morbidity record form is completed for each
admission to a general, psychiatric, or maternity hospital and is sent to a central
agency for processing and statistical analysis (217). Initiated in parts of Scotland
in 1951, the system covered the entire country by 1961. Although records
include a unique personal identifier, they are not linked routinely except in one area
of the country. With the advent of the National Health Service in 1948, a similar
system based on 10% of hospital admissions was initiated in England and Wales that
covered all areas by 1958.

To monitor the quality of care provided in U.S. hospitals, each acute-care hospital is
required by the Joint Commission on Accreditation of Healthcare Organizations to
report information on diagnoses, length of stay, and inpatient services. Hospitals
often contract with private companies to abstract and computerize pertinent data from
medical records, but in recent years, many hospitals have begun computerizing this
information themselves or abstracting it from computerized treatment records.
In the early 1980s, individual states began to require submission of
hospital-discharge data for utilization, financial, and other health-planning studies
(219). Thus, hospital discharge summary data are computerized for most discharges
from acute-care hospitals in the United States, but data are not available nationally
for all segments of the population from any one source.

Private-sector systems

In the private sector, the Commission on Professional and Hospital Activities (CPHA)
has abstracted information from medical records of U.S. hospitals for over 30 years
(219,220). Today, CPHA's Professional Activities Study (PAS) data base includes over
200 million records with diagnoses coded according to the clinical modification of the
ICD-9 (ICD-9-CM) in the UHDDS format; 6 million more records are being added each year
(219,221). The PAS includes information from clinical rather than billing records,
since staff from cooperating hospitals review medical charts, prepare case abstracts,
and send information to CPHA. Hospital-discharge data from CPHA and more recently
from the McDonnell Douglas Hospital Information System (MDHIS) have been used for the
surveillance of birth defects and related conditions (222). Today, the Birth Defects
Monitoring Program (BDMP), initiated in 1974, includes information from newborn
discharge summaries for about 1 million newborns per year--about 25% of the births in
the United States. Prevalence rates are calculated using the number of live births as
the denominator, and trends in rates for targeted conditions are published routinely
(223). Information for the BDMP is abstracted from hospital discharge summaries and
is not routinely verified. Although personal identifiers are not included in BDMP
data sets, participating hospitals have agreed to provide hospital records for special
studies using their own patient numbers to identify records (224,225). More recently,
additional information on possible maternal exposures (e.g., infections, use of
prescription or illicit drugs, or the use of alcohol) linked to birth defects or other
adverse outcomes noted at birth is available for a subset of infants in the BDMP.
Probabilistic matching procedures are used to link summary data without personal
identifiers from newborn and maternal hospital discharge records (222). Validation
studies indicate that about 95% of the records linked using the matching algorithm are
true matches. Linked maternal and infant hospital-discharge records are particularly
useful for investigating problems associated with maternal exposures. Information on
birth defects surveillance systems characterized by active case-finding and
integration of information from multiple sources appears in the registry section of
this chapter.
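
The text does not describe the BDMP matching algorithm itself. As a rough illustration of how records lacking personal identifiers might be linked probabilistically, the sketch below scores each candidate newborn-mother pair by summing agreement weights, in the general style of the Fellegi-Sunter model. The field names, the m and u probabilities, and the threshold are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical m/u probabilities for each comparison field (assumptions):
#   m = P(field agrees | records are a true mother-infant pair)
#   u = P(field agrees | records are not a true pair)
FIELD_PROBS = {
    "hospital_id": (0.99, 0.01),
    "delivery_date": (0.97, 0.003),
    "zip_code": (0.95, 0.02),
    "payer": (0.90, 0.25),
}


def match_weight(newborn, mother):
    """Sum log2 agreement or disagreement weights over the comparison fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if newborn.get(field) == mother.get(field):
            total += math.log2(m / u)              # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def link_records(newborns, mothers, threshold=10.0):
    """Pair each newborn with the best-scoring maternal record above threshold."""
    links = []
    for nb in newborns:
        best = max(mothers, key=lambda mo: match_weight(nb, mo), default=None)
        if best is not None and match_weight(nb, best) >= threshold:
            links.append((nb["record_id"], best["record_id"]))
    return links
```

In practice, the threshold would be tuned against a validation sample of known pairs, which is how a figure such as the proportion of true matches among linked records could be estimated.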

In the United States, use of hospital-discharge data from CPHA, MDHIS, or other
private-sector sources is more limited for surveillance of conditions other than those
identified at birth. For the latter, birth-prevalence rates can be calculated using
the number of live births in that hospital as the population at risk, even if the
geographic areas to which these rates apply are not known. Calculation of incidence
or prevalence rates for other conditions is limited by two factors: first, the
lack of complete coverage for a geographic area limits the use of census data to
estimate the population at risk; and, second, initial hospitalizations for
conditions cannot usually be distinguished from subsequent hospitalizations.
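
The birth-prevalence calculation described above can be sketched as follows. The counts are hypothetical, and the choice of 10,000 live births as the reporting base is an assumption made for illustration rather than something stated in the text.

```python
def birth_prevalence(cases, live_births, per=10_000):
    """Cases identified at birth per `per` live births in the same hospitals."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return cases / live_births * per


# Hypothetical counts from one year of newborn discharge summaries.
rate = birth_prevalence(cases=42, live_births=120_000)
print(f"{rate:.1f} per 10,000 live births")
```

Because the denominator is the number of live births in the reporting hospitals, no census estimate of the population at risk is needed.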

In 1988, 29 states maintained hospital-discharge-data systems for acute-care
hospitals: 17 in the UB-82 format, eight in the UHDDS format, and four in unique data
formats (219). Although not currently required on the UHDDS or the UB-82, external
cause-of-injury codes ("E codes") are required in eight states (226). In most states,
unique personal identifiers are not computerized, and the extent to which these data
can be accessed and used for surveillance varies from state to state. When hospital
discharge information is available, however, estimates of the public health burden of
inpatient care--based on the number, the duration, and the cost of hospitalizations--
have been useful for setting priorities for prevention or treatment efforts or for
targeting interventions to specific subgroups in the community.

In California, for instance, hospital-discharge data coupled with estimates of the
proportion of specific diseases attributable to smoking were used to estimate the cost
of treating smoking-related diseases paid with public funds. To recoup some of these
costs, California instituted a 25-cent sales tax on tobacco products in 1989 (227).
State-based hospital discharge data systems have also been used effectively to assess
the public health impact of injuries in states that require "E codes" (226). For
instance, the effect of mandatory seat-belt laws and more stringent drunk-driving laws
on motor-vehicle-related injuries has been demonstrated using hospital-discharge data
that include "E codes."
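
The text does not say how the California estimate was computed. One common way to combine hospital charges with a smoking-attributable proportion is sketched below, using Levin's population attributable fraction, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the prevalence of smoking and RR is the relative risk of the disease among smokers. The disease list, relative risks, prevalence, and charges are hypothetical.

```python
def attributable_fraction(prevalence, relative_risk):
    """Population attributable fraction (Levin's formula)."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def smoking_attributable_public_cost(diseases, smoking_prevalence):
    """Sum publicly paid hospital charges weighted by each disease's PAF."""
    total = 0.0
    for d in diseases:
        paf = attributable_fraction(smoking_prevalence, d["relative_risk"])
        total += d["public_charges"] * paf
    return total


# Hypothetical inputs, for illustration only.
diseases = [
    {"name": "lung cancer", "relative_risk": 10.0, "public_charges": 50_000_000},
    {"name": "coronary heart disease", "relative_risk": 2.0, "public_charges": 200_000_000},
]
cost = smoking_attributable_public_cost(diseases, smoking_prevalence=0.25)
```

The publicly paid charges for each disease would come from the discharge data themselves, for instance by summing charges on records whose expected payer is a public program.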

Federal data-collection systems

In the United States, health care is provided using public funds for about one-quarter
of the non-institutionalized population--including the elderly (13%), the poor (9%),
and the military and their dependents (4%) (228). In 1965, two federal
health-insurance programs--a hospital insurance plan and a supplementary insurance
plan--were established for persons ≥65 years of age. Both of these Medicare
health-insurance programs are administered by HCFA. All eligible recipients are enrolled
in the first plan (Part A), which provides coverage for inpatient hospitalizations, stays
in skilled nursing facilities, and home health services. The second plan (Part B), for
which beneficiaries pay a small premium, covers physician services, outpatient hospital
services, and other medical services. About 96% of the population ≥65 years is
enrolled in at least the Part A program (229). Medicare programs were extended in
1972 to cover persons with end-stage renal disease that required dialysis or
transplantation and persons <65 years with disabilities (230). In Fiscal Year
1988, Medicare program payments for 31 million beneficiaries ≥65 years and an
additional 3 million persons with disabilities accounted for about 18% of all personal
health-care spending in the United States.

For Part A claims, computerized bills in the UB-82 format are submitted to fiscal
intermediaries and then are consolidated nationally. Diagnoses included on each bill
affect payment to hospitals because, since 1983, most short-stay hospitals have been
paid for each case on the basis of prospectively established rates for some 475
diagnosis-related groups (DRGs) (228). To monitor the quality of care provided
through Medicare programs, HCFA created the Medicare Provider Analysis and Review
(MEDPAR) file by linking information on individuals such as age, gender, race, and
residence from the eligibility files; information on diagnoses and treatment from Part
A and Part B claims files; and information on health-care providers from a facilities
file. A unique health-insurance number--usually the social security number--is used
to link information on individuals. HCFA has created a public-use file for Part A
data from the MEDPAR file and plans to add Part B files, which will include
diagnostic data in 1992 (231).

Although most studies using MEDPAR files have focused on quality of care and medical
effectiveness, these files have also been used to assess the public health impact of
various conditions such as end-stage renal disease and hip fracture among the elderly
(107,230-234). Point prevalence can be estimated because nearly all members of the
general population ≥65 years are enrolled in Medicare. Incidence can also be
estimated for some conditions because the first hospitalization can be identified in
records for an individual linked by using the unique personal identifier. These
estimated incidence rates would approximate true incidence rates more closely,
however, for acute events such as hip fracture than for long-standing conditions such
as Type II diabetes. Since many conditions are common among the elderly, rates can
often be estimated for small geographic areas such as cities or counties (235).
Recent studies indicate, for instance, that hip fracture is more common in southern
states, even though weather conditions are more adverse in the north (236,237).
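
The approach to incidence described above, in which the first hospitalization is identified among an individual's linked records, can be sketched as follows. The record layout (`beneficiary_id`, `diagnosis`, `admitted`) is hypothetical and does not reflect the actual MEDPAR file structure, and the denominator is simplified to a single enrollment count rather than person-time.

```python
from datetime import date


def first_hospitalizations(records, condition):
    """Return the earliest admission date per beneficiary for one condition."""
    first = {}
    for rec in records:
        if rec["diagnosis"] != condition:
            continue
        pid = rec["beneficiary_id"]
        if pid not in first or rec["admitted"] < first[pid]:
            first[pid] = rec["admitted"]
    return first


def incidence_rate(records, condition, year, enrolled, per=1_000):
    """First hospitalizations occurring in `year` per `per` enrolled persons."""
    first = first_hospitalizations(records, condition)
    new_cases = sum(1 for d in first.values() if d.year == year)
    return new_cases / enrolled * per


# Hypothetical linked claims: person A is readmitted, so only B is new in 1988.
records = [
    {"beneficiary_id": "A", "diagnosis": "hip fracture", "admitted": date(1987, 11, 3)},
    {"beneficiary_id": "A", "diagnosis": "hip fracture", "admitted": date(1988, 2, 14)},
    {"beneficiary_id": "B", "diagnosis": "hip fracture", "admitted": date(1988, 6, 1)},
    {"beneficiary_id": "C", "diagnosis": "pneumonia", "admitted": date(1988, 1, 9)},
]
rate_1988 = incidence_rate(records, "hip fracture", 1988, enrolled=2_000)
```

Counting episodes instead of persons would have tallied two hip-fracture admissions in 1988, which is the overcount that the personal identifier avoids.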

Even more useful public health surveillance information about Medicare recipients
should become available in the near future. A National Claims History File is being
created for elderly Medicare recipients with information from all claims linked for
individuals (219). To obtain additional clinical information, medical records for a
random sample of beneficiaries will be abstracted using standard procedures to create
a Uniform Clinical Data Set. Self-administered questionnaires will be sent to a
sample of the elderly at regular intervals to obtain additional information on health
status prior to entering the Medicare program, on health-related behaviors, and on
functional status. Information from all these sources will be linked in the Medicare
Beneficiary Health Status Registry. Information from other sources, such as the SEER
registry and other cancer-incidence registries, will be linked with Medicare files when
possible (238). An end-stage renal disease registry has been developed by linking
health-claims information (239). As they become available, these enhanced data sets
should prove useful for monitoring trends, for public health planning, and for
evaluating the effectiveness of medical and preventive health services such as
mammography and vaccination.

Medicaid, HCFA's other major public health-care program, provides health-care funds
for the poor and medically needy through a federal-state cost-sharing program.
Medicaid data have been used for surveillance and program planning at state and
local levels, particularly in the maternal and child health area. Further information
on uses of Medicaid claims data for surveillance is provided in the ambulatory care
and related data section of this chapter.

Hospital-discharge records from IHS hospitals have been particularly useful for
developing community-specific injury profiles and targeting local public health
interventions (226). "E codes" have been included in discharge summaries from IHS
hospitals for over 20 years, and regional injury prevention coordinators are notified
electronically of injury-related hospitalizations. Identification of hazardous areas
through analysis of local data has led to brighter and more effective
lighting and to installation of pedestrian walkways along hazardous stretches of road.

Data-Collection Systems in Emergency Rooms and Other Units

Administrative data from hospital emergency rooms have been used for surveillance of a
variety of acute health events including non-fatal injuries, illicit drug use,
poisonings, and adverse reactions to prescription drugs. Unlike inpatient
hospital-discharge data, however, emergency-room data are not routinely computerized and
reported from all hospitals in a standard format. Because the type of information
recorded and the filing systems used to retrieve health information differ, special
surveillance systems focused on specific outcomes such as injuries or illicit drug use
have been developed using information obtained from cooperating hospitals.

Information from these special surveillance systems is usually not linked with other
data sources. Although the scope of these systems is limited, they have provided
useful information for the surveillance of acute, non-fatal health events for which
admission to a hospital is not warranted.

In England and Wales, information has been provided by the Home Accident Surveillance
System (HASS) since 1976 (240). Information is collected by trained clerks from 20
randomly sampled major emergency departments. Each hospital remains in the system for
4 years, and five hospitals are replaced each year from the pool of 270 hospitals with
large emergency departments. A similar system, the European Home and Leisure
Accident Surveillance System (EHLASS), is being implemented in all European Economic
Community countries.

In the United States, information on injuries associated with the use of consumer
products (other than automobiles) is available through CPSC's National Electronic
Injury Surveillance System (NEISS). Since 1972, information on
consumer-product-related injuries, poisonings, and burns has been abstracted from
emergency-room records of a representative sample of hospitals (9). Information is sent
electronically each day to CPSC, and more in-depth information can be obtained on
conditions of special interest. Information on occupation-related injuries has been
collected since 1982, although the number of hospitals included in NEISS was reduced
from the original 73 to 62 in 1987 (241,242).

National estimates for a variety of conditions are derived by weighting data from
reporting hospitals. NEISS has provided estimates of various consumer-product- and
occupation-related injuries, including estimates of the number of work-related
injuries in the United States, bicycle-related injuries, and poisonings among children
(241-243). NEISS provides the only national estimates of injuries seen in emergency
rooms, although the number of hospital emergency rooms on which this information is
based is relatively small. NEISS data have also been used to assess the public health
impact of injuries at the local level. From NEISS data from one hospital, a cluster
of injuries that occurred among young girls and were related to playground
merry-go-rounds was identified (244). Pediatric injury surveillance systems using
emergency room and hospital discharge data have also been established in other areas
(245,246).
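
The text does not give the weights NEISS actually uses. As a minimal sketch of the general idea of weighting sample data to produce a national estimate, the example below multiplies each sampled hospital's count by the inverse of its stratum's sampling fraction. The strata, hospital counts, and injury counts are hypothetical.

```python
def national_estimate(sample_counts, strata):
    """Weight each sampled hospital's count by the inverse sampling fraction."""
    total = 0.0
    for hospital in sample_counts:
        s = strata[hospital["stratum"]]
        # Each sampled hospital stands in for this many hospitals nationally.
        weight = s["hospitals_in_universe"] / s["hospitals_sampled"]
        total += hospital["injuries"] * weight
    return total


# Hypothetical strata defined by emergency-room size.
strata = {
    "small": {"hospitals_in_universe": 100, "hospitals_sampled": 2},
    "large": {"hospitals_in_universe": 20, "hospitals_sampled": 2},
}
sample_counts = [
    {"stratum": "small", "injuries": 10},
    {"stratum": "small", "injuries": 20},
    {"stratum": "large", "injuries": 100},
    {"stratum": "large", "injuries": 120},
]
estimate = national_estimate(sample_counts, strata)
```

With so few sampled hospitals per stratum, each count carries a large weight, which illustrates why estimates based on a small number of emergency rooms can be imprecise.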

In the United States, NIDA's Drug Abuse Warning Network (DAWN) relies on reports from
about 700 hospital emergency rooms and 85 medical-examiners' or coroners' offices to
detect emerging trends in the nature and severity of drug-abuse problems in the United
States (9,247). Facilities have reported voluntarily to DAWN since 1972; about 453
emergency rooms in 21 U.S. cities reported data consistently by 1991 (248).
Cocaine-related deaths increased rapidly between 1985 and 1988, although recent reports
suggest that cocaine-related medical emergencies began to decrease in the first half
of 1989. In the same metropolitan areas, about twice as many deaths were identified
through DAWN as through the vital statistics system, although time trends were similar
in both types of data. The DAWN system provides timely information on medical
emergencies related to drug abuse, although estimates are not population-based and are
based on voluntary participation from medical facilities.

In some areas, information may be available from poison-control centers, burn units,
or trauma registries. In Great Britain, poison-control centers--particularly the
National Poison Information Service in London--have provided information for a variety
of studies of trends in abuse of solvents and poisonings of children (249). In the
United States, poison-control centers--covering 430 defined geographic areas--reported
over 121,000 instances of exposure to suspected poisons to FDA (243). Reports, for
instance, of childhood poisonings to FDA have declined since the introduction of
child-resistant caps for medication containers, and among children < 5 years of age,
flavored chewable vitamins are now the most common pharmaceutical product associated
with poisoning. Information from poison-control centers has also been used to monitor
acute occupation-related health events such as exposure to agricultural chemicals and
corrosive chemicals (250). In some centers, requests for information on treatment for
suspected poisonings may be collected and computerized in a standard form, although a
standard format for a minimum data set has not been adopted. Exchange of information
by national and international organizations--such as the American and the European
Associations of Poison Control Centers and the World Federation of Poison Control
Centers--facilitates identification and treatment of persons for acute conditions
related to exposure to toxic substances (249).

Unlike hospital-discharge data, information from emergency rooms, poison-control
centers, and related facilities is usually not available routinely in a standard
format. Efforts are under way, however, to create standard minimum data sets and
reporting formats to aggregate and compare data. With the increase in surgical and
other procedures performed on an outpatient basis, the importance of collecting core
information from outpatient settings will increase.

Ambulatory Care and Related Data

With the exception of countries such as Sweden and Canada that have integrated health-
information systems, ambulatory-care data are not generally available from
administrative sources for all segments of the population. Information on the
prevalence of signs, symptoms, and conditions not usually requiring hospitalization is
usually obtained through periodic surveys of the general population or through
sentinel-surveillance systems characterized by voluntary reporting of specific
conditions by health-care providers. In the United States, a Uniform Ambulatory Care
Data Set (UACDS), first developed in 1974 and revised in 1990, offers the possibility
for standardization of ambulatory-care data (219), although it is not widely used at
present. Diagnostic information is often not required, and when it is
included, it is often difficult to distinguish actual diagnoses from presumptive
diagnoses that are being "ruled out." Inpatient procedures are usually coded using
the ICD-9-CM, but a universally accepted classification system is not used in
outpatient settings. The Current Procedural Terminology, fourth revision (CPT-4) and
the HCFA Common Procedure Coding System (HCPCS) are both used, although CPT-4 codes are
not equivalent to ICD-9-CM codes used for the same procedures in inpatient settings.
With rapid changes in medical care, it is difficult to maintain an up-to-date
procedure-classification system.

In spite of these limitations, the use of claims and related data from public programs
for surveillance and program planning is increasing in the United States. While data
from public programs cover only a segment of the population, it is the segment to
which public health interventions are most often targeted. Information from the
Medicaid program, in particular, has been used by state and local health departments.
About 23 million individuals were enrolled in Medicaid programs in Fiscal Year 1988,
accounting for about 10% of personal health-care expenditures in the United States
(123). The eligible population, however, changes substantially over time.

Because the states have broad discretion in administering the program under federal
guidelines, benefits vary from state to state, as do the health-information systems
used to track health claims. The states report aggregate expenditure and utilization
data to HCFA, although about half the states voluntarily report patient-level
information (107,228). Data from five states that report data using uniform
enrollment, provider, and claims-file formats can be aggregated, but otherwise,
differences in eligibility, covered services, and file structure make it difficult to
aggregate data across states. Within states, however, health departments are
attempting to link public health data from various sources to monitor the
effectiveness of their programs, particularly in the maternal and child health area
(203). Many states now link birth- and death-certificate data for deaths that occur
within the first year of life. Some states are able to link Medicaid data with vital-
record data, and a few are also able to add data from various public health programs
to linked Medicaid/vital-record data sets.
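The linkage of birth and death certificates described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that both files carry a shared certificate number; the field names `birth_cert_no`, `birth_date`, and `death_date` and the records themselves are hypothetical. Actual state files differ and often require matching on names and dates instead.

```python
from datetime import date

def link_infant_deaths(births, deaths):
    """Link death records to birth records by a shared certificate number.

    Returns (birth, death) pairs for deaths within the first year of life.
    The 365-day cutoff is an approximation that ignores leap years.
    """
    births_by_id = {b["birth_cert_no"]: b for b in births}
    linked = []
    for d in deaths:
        b = births_by_id.get(d["birth_cert_no"])
        if b is None:
            continue  # no matching birth certificate in this file
        age_days = (d["death_date"] - b["birth_date"]).days
        if 0 <= age_days < 365:
            linked.append((b, d))
    return linked

# Made-up records, for illustration only.
births = [
    {"birth_cert_no": "B001", "birth_date": date(1990, 1, 10)},
    {"birth_cert_no": "B002", "birth_date": date(1990, 3, 5)},
    {"birth_cert_no": "B003", "birth_date": date(1990, 6, 1)},
]
deaths = [
    {"birth_cert_no": "B002", "death_date": date(1990, 4, 1)},   # infant death
    {"birth_cert_no": "B003", "death_date": date(1992, 1, 1)},   # after first year
    {"birth_cert_no": "B999", "death_date": date(1990, 2, 2)},   # unmatched
]

linked = link_infant_deaths(births, deaths)
infant_deaths_per_1000_births = 1000 * len(linked) / len(births)
```

Once records are linked in this way, characteristics recorded only on the birth certificate (birth weight, for example) can be related to the cause and timing of death recorded on the death certificate.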

Public health program data are derived from various sources: maternal- and infant-care
clinics; vaccination clinics; neonatal screening programs for inborn errors of
metabolism, maternal drug use, and HIV seroprevalence; lead-screening programs for
schoolchildren; clinics for children with special needs; families enrolled in the
Women, Infants, and Children (WIC) nutrition supplement programs; hospital discharge
data; data from the Pregnancy Risk Assessment Monitoring System (PRAMS); school
vaccination records; and data from Head Start programs (203,252). State and local
health departments have met with varying levels of success in linking data sets, but
the most successful have been able to target and evaluate public health interventions
and to monitor outcomes. In Tennessee, for instance, adverse sequelae following
vaccination were monitored using linked vaccination-clinic records, Medicaid-claims
data, and vital records (252). Also in Tennessee, birth certificate and WIC data were
linked to assess the extent to which high-risk infants were enrolled in county WIC
programs (253). Massachusetts and Colorado are among the states that are redesigning
data bases for public health programs so that the data can be linked more easily
(203,251).
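As a rough sketch of the kind of assessment described for Tennessee, the Python fragment below estimates, by county, the share of high-risk infants who appear in a WIC enrollment file. The text does not define "high-risk"; birth weight under 2,500 grams is used here only as a stand-in assumption, and the field names (`infant_id`, `county`, `birth_weight_g`) and data are hypothetical.

```python
def wic_coverage_by_county(birth_records, wic_ids, low_birth_weight_g=2500):
    """Return {county: fraction of high-risk infants enrolled in WIC}.

    "High-risk" is approximated as birth weight below low_birth_weight_g,
    which is an illustrative assumption, not a definition from the text.
    """
    counts = {}  # county -> [number high-risk, number high-risk and enrolled]
    for rec in birth_records:
        if rec["birth_weight_g"] >= low_birth_weight_g:
            continue  # not high-risk under the assumed definition
        tally = counts.setdefault(rec["county"], [0, 0])
        tally[0] += 1
        if rec["infant_id"] in wic_ids:
            tally[1] += 1
    return {
        county: enrolled / high_risk
        for county, (high_risk, enrolled) in counts.items()
    }

# Made-up birth-certificate extract and WIC enrollment identifiers.
birth_records = [
    {"infant_id": 1, "county": "A", "birth_weight_g": 2000},
    {"infant_id": 2, "county": "A", "birth_weight_g": 2300},
    {"infant_id": 3, "county": "A", "birth_weight_g": 3400},
    {"infant_id": 4, "county": "B", "birth_weight_g": 1800},
]
wic_ids = {1, 3, 4}

coverage = wic_coverage_by_county(birth_records, wic_ids)
```

Counties with low coverage under whatever risk definition a program adopts are natural candidates for targeted outreach.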

Some information derived from state and local public health programs is available
nationally in the United States. CDC's Pediatric Nutrition Surveillance System and
Pregnancy Nutrition Surveillance System have been operational since 1973 and 1980,
respectively (203,251). In both systems, key indicators of nutrition status are
monitored continuously in participating states using information derived from
publicly-funded health, nutrition, and food-assistance programs. Information is
available from 40 states for the pediatric-nutrition system and from 16 states for the
pregnancy-nutrition system. These data sets have been used to assess the prevalence
of malnutrition in children < 2 years; to assess the prevalence of anemia during
pregnancy among low-income women; and to monitor the decline in the prevalence of
anemia among low-income children in the United States (254-256).
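Monitoring a decline in prevalence amounts to computing, for each period, the proportion of screened individuals who meet the case definition. The Python sketch below shows that calculation under simplifying assumptions: each screening record already carries a year and an anemia flag, and both the field names and the records are invented for illustration rather than drawn from the surveillance systems themselves.

```python
from collections import defaultdict

def anemia_prevalence_by_year(records):
    """Return {year: percentage of screened children flagged as anemic}."""
    screened = defaultdict(int)
    anemic = defaultdict(int)
    for rec in records:
        screened[rec["year"]] += 1
        if rec["anemic"]:
            anemic[rec["year"]] += 1
    return {yr: 100.0 * anemic[yr] / screened[yr] for yr in sorted(screened)}

# Made-up screening records.
records = [
    {"year": 1980, "anemic": True},
    {"year": 1980, "anemic": False},
    {"year": 1980, "anemic": False},
    {"year": 1980, "anemic": True},
    {"year": 1985, "anemic": True},
    {"year": 1985, "anemic": False},
    {"year": 1985, "anemic": False},
    {"year": 1985, "anemic": False},
]

prevalence = anemia_prevalence_by_year(records)
```

Because participating clinics and eligible populations change over time, trends computed this way describe the screened program population rather than all children.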

Although few countries have integrated health-information systems at present, such
systems may become more common in the future. Although they are neither integrated nor
inclusive of most of the population, data from the patchwork of administrative systems available at
present have been used successfully for public health surveillance and program
planning. In the United States, computerized hospital discharge data are relatively
standardized, but access is limited in some states. Because data-reporting formats
are less standardized for outpatient settings, it is difficult to aggregate such data.
Efforts by state health departments to create integrated data bases for public
programs will help states to monitor their programs more effectively. Although
eligibility may vary among states, standardization and reporting of data for at least
some core variables could enhance information available nationwide on problems of
public health importance.

SUMMARY

Sources of data available for public health surveillance vary considerably from
country to country. Developed countries and many developing countries are able to monitor
reproductive outcomes and mortality through vital statistics systems, and many
countries have notifiable disease-reporting systems for at least some infectious
diseases. Otherwise, the extent of information available through administrative data
systems, surveys, registries, and sentinel surveillance systems varies extensively
from country to country. Although the quality and the completeness of these data
sources may be limited, they often provide low-cost information that is useful for
public health surveillance and related activities. Even if new data-collection
efforts are needed to address specific problems, routinely collected data can provide
background information that will be useful for designing these studies.

The increasing computerization of health information, the availability of powerful but
relatively inexpensive computers, and the development of user-friendly software should
facilitate the timely use of information from a wide range of sources. Although
integrated health-information systems and computerized medical records may be on the
horizon in some countries, limited information that is available quickly from
notifiable-disease and sentinel-surveillance systems is often the most useful for
conditions in which timely public health action is needed. Since no one source of
data is usually adequate, good public health decision-making invariably requires the
synthesis of data of varying quality from a wide range of sources as well as critical
interpretation of findings.

Appendix III.A. Surveillance or Health Information Systems
Mentioned in Chapter III

I. Notifiable diseases and related reporting mechanisms

NNDSS     National Notifiable Diseases Surveillance System, United States
          (CDC and state health departments)

VAERS     Vaccine Adverse Event Reporting System, United States (FDA)

II. Vital Statistics

          121-City Surveillance System, United States (CDC)

MSS       Mortality Surveillance System, United States (NCHS/CDC)

NTOF      National Traumatic Occupational Fatality surveillance system,
          United States (NIOSH/CDC)

          Medical Examiner/Coroner Information Sharing System, United States
          (NCEH/CDC)

FARS      Fatal Accident Reporting System, United States (NHTSA)

III. Sentinel surveillance

SENSOR    Sentinel Event Notification System for Occupational Risks,
          United States (NIOSH/CDC)

EEARDN    European Electronic Adverse Drug Reaction Network, Europe

IV. Registries

          Connecticut Tumor Registry, United States

SEER      Surveillance, Epidemiology, and End Results Program,
          United States (NCI)

MACDP     Metropolitan Atlanta Congenital Defects Program, United States
          (NCEH/CDC)

V. Surveys

GHS       General Household Survey, United Kingdom

          Continuous Household Survey, Ireland

NHIS      National Health Interview Survey, United States (NCHS/CDC)

BRFSS     Behavioral Risk Factor Surveillance System, United States
          (NCCDPHP/CDC and state health departments)

YRBS      Youth Risk Behavior Surveillance System, United States
          (NCCDPHP/CDC and state health departments)

NHDS      National Hospital Discharge Survey, United States (NCHS/CDC)

NAMCS     National Ambulatory Medical Care Survey, United States (NCHS/CDC)

NDTI      National Disease and Therapeutic Index, United States
          (private sources)

NSFG      National Survey of Family Growth, United States (NCHS/CDC)

NHANES    National Health and Nutrition Survey, United States (NCHS/CDC)

          Hispanic Health and Nutrition Survey, United States (NCHS/CDC)

          HANES I Follow-up Study, United States (NCHS/CDC)

VI. Administrative data-collection systems

PAS       Professional Activity Studies, United States (CPHA)

MDHIS     McDonnell Douglas Hospital Information System, United States

BDMP      Birth Defects Monitoring Program, United States (NCEH/CDC)

MEDPAR    Medicare Provider Analysis and Review, United States (HCFA)

HASS      Home Accident Surveillance System, United Kingdom

EHLASS    European Home and Leisure Accident Surveillance System, Europe

NEISS     National Electronic Injury Surveillance System, United States (CPSC)

DAWN      Drug Abuse Warning Network, United States (NIDA)

PRAMS     Pregnancy Risk Assessment Monitoring System, United States
          (NCCDPHP/CDC and state health departments)

REFERENCES

1. Alderson M. International mortality statistics. New York: Facts on File,
Inc., 1981.

2. Ashley JSA, Cole SK, Kilbane MPJ. Health information resources: United Kingdom--
health and social factors. In: Holland WW, Detels R, Knox G (eds.). Oxford
textbook of public health. 2nd edition. Vol. 2: methods of public health
surveillance. Oxford, England: Oxford University Press, 1991:29-54.

3. Yanagawa H, Nagai M. Health information resources: Japan--health and social
factors. In: Holland WW, Detels R, Knox G (eds.). Oxford textbook of public
health. 2nd edition. Vol. 2: methods of public health surveillance. Oxford,
England: Oxford University Press, 1991:55-66.

4. National Center for Health Statistics. International Health Data Reference
Guide, 1991. DHHS Publication no. (PHS) 92-1007. Hyattsville, Md.: National
Center for Health Statistics, 1992.

5. World Health Organization. World health statistics annual. Geneva,
Switzerland: WHO, 1988.

6. United Nations. 1989 demographic yearbook. 41st issue. New York: Department of
International and Social Affairs, Statistical Office, 1991.

7. World Health Organization. Targets for health for all: targets in support of
the European regional strategy for health for all. Copenhagen, Denmark: WHO
Regional Office for Europe, 1985.

8. National Office of Vital Statistics. Reported incidence of selected notifiable
diseases. United States, each division and state, 1920-50. Vital Statistics
Special Reports. Washington, D.C.: U.S. Government Printing Office, 1953;37:180.

9. Pearce N. Health information resources: United States--health and social
factors. In: Holland WW, Detels R, Knox G (eds.). Oxford textbook of public
health. 2nd edition. Vol. 2: methods of public health surveillance. Oxford,
England: Oxford University Press, 1991:11-28.

10. Moro ML, McCormick A. Surveillance for communicable disease. In: Eylenbosch WJ
and Noah ND (eds.). Surveillance in health and disease. Oxford, England: Oxford
University Press, 1988:166-82.

11. Chorba TL, Berkelman RL, Safford SK, Gibbs N, Hull HF. Mandatory reporting of
infectious diseases by clinicians. MMWR 1990;39(RR-9):1-17.

12. Centers for Disease Control. Summary of notifiable diseases, United States,
1991. MMWR 1991;40.

13. Freund E, Seligman PJ, Chorba TL, Safford SK, Drachman JG, Hull HF. Mandatory
reporting of occupational diseases by clinicians. MMWR 1990;39(RR-9):19-28.

14. Seligman PJ, Matte TD. Case definitions in public health. Am J Public Health
1991;81:161-2.

15. Centers for Disease Control. Childhood lead poisoning, New York City, 1988. MMWR
1990;39(SS-4):1-7.

16. Green LA, Lutz LJ. Notions about networks; primary care practices in pursuit of
improved primary care. In: Mayfield J and Grady ML (eds.). Primary care
research: agenda for the 90s. Rockville, Md.: Agency for Health Care Policy and
Research, 1990:13-22.

17. Centers for Disease Control. Case definitions for public health surveillance.
MMWR 1990;39(No. RR-13):1-43.

18. Centers for Disease Control. Vaccine adverse event reporting system--United
States. MMWR 1990;39:730-3.

19. Faich GA. National adverse drug reaction reporting, 1984-1989. Arch Intern Med
1991;151:1645-7.

20. Rossi AC, Bosco L, Faich GA, Tanner A, Temple R. The importance of adverse
reaction reporting by physicians: suprofen and the flank pain syndrome. JAMA
1988;259:1203-4.

21. Kimmel K. Surveillance for adverse reactions to drugs. In: Eylenbosch WJ and
Noah ND (eds.). Surveillance in health and disease. Oxford, England: Oxford
University Press, 1988:244-54.

22. Inman WHW. Hazards of drug therapy. In: Holland WW, Detels R, Knox G (eds.).
Oxford textbook of public health. 2nd edition. Vol. 2: methods of public health
surveillance. Oxford, England: Oxford University Press, 1991:481-500.

23. Centers for Disease Control. National electronic telecommunications system for
surveillance--United States, 1990-1991. MMWR 1991;40:502.

24. Tsai TF. Arboviral infections in the United States. Infect Dis Clin North Am
1991;5(1):73-102.

25. Fishbein D. Rabies. Infect Dis Clin North Am 1991;5(1):53-74.

26. Craven RB, Barnes AM. Plague and tularemia. Infect Dis Clin North Am
1991;5(1):165-76.

27. Buchstein SR, Gardner P. Lyme Disease. Infect Dis Clin North Am
1991;5(1):103-16.

28. Weber DJ, Walker DH. Rocky Mountain spotted fever. Infect Dis Clin North Am
1991;5(1):19-36.

29. Sacks JJ. Utilization of case definitions and laboratory reporting in the
surveillance of notifiable communicable diseases in the United States. Am J
Public Health 1985;75:1420-2.

30. Berkelman RL, Buehler JW. Surveillance. In: Holland WW, Detels R, Knox G (eds.).
Oxford textbook of public health. 2nd edition. Vol. 2: methods of public health
surveillance. Oxford, England: Oxford University Press, 1991:161-76.

31. Sherman IL, Langmuir AD. Usefulness of communicable disease reports. Public
Health Rep 1952;67:1249-57.

32. Thacker SB, Berkelman RL. Public health surveillance in the United States.
Epidemiol Rev 1988;10:164-90.

33. Serfling RE, Sherman IL. Problems in improving reported morbidity data as a
tool for epidemiological research. CDC bulletin. Atlanta, Ga.:Public Health
Service, October 1951:24-7.

34. Fife D, McAnaney, Rahman MA. Changes in AIDS case reporting after hospital site
visits. Am J Public Health 1991;81:1648-50.

35. Conway GA, Colley-Niemeyer B, Pursley C et al. Underreporting of AIDS cases in
South Carolina, 1986 and 1987. JAMA 1989;262:2859-63.

36. Selik RM, Buehler JW, Karon JM, Chamberland ME, Berkelman RL. Impact of the
1987 revision of the case definition of acquired immune deficiency in the United
States. J Acquir Immune Defic Syndr 1990;3:73-82.

37. Cohen DA, Boyd D, Prabhudas I, Mascola L. The effects of case definition in
maternal screening and reporting criteria on rates of congenital syphilis. Am J
Public Health 1990;80:316-7.

38. Centers for Disease Control. Congenital syphilis--New York City, 1986-1988.
MMWR 1989;38:825-9.

39. Centers for Disease Control. Lyme disease surveillance--United States,
1989-1990. MMWR 1991;40:417-21.

40. Andrews JM, Quinby GE, Langmuir AD. Malaria eradication in the United States.
Am J Public Health 1950;40:1405-11.

41. Langmuir AD. The surveillance of communicable diseases of national importance.
N Engl J Med 1963;268:182-92.

42. Noah N. Transmissible agents. In: Holland WW, Detels R, Knox G (eds.). Oxford
textbook of public health. 2nd edition. Vol. 2: methods of public health
surveillance. Oxford, England: Oxford University Press, 1991:417-35.

43. Holmberg SD, Osterholm MT, Senger KA, Cohen ML. Drug-resistant Salmonella from
animals fed antimicrobials. N Engl J Med 1984;311:617-22.

44. Centers for Disease Control. Measles prevention: recommendations of the
Immunization Practices Advisory Committee (ACIP). MMWR 1989;38(No. S-9):1-18.

45. Centers for Disease Control. Hepatitis B virus: a comprehensive strategy for
eliminating transmission in the United States through universal childhood
vaccination; recommendations of the Immunization Practices Advisory Committee
(ACIP). MMWR 1991;40(No. RR-13).

46. Nathanson N, Langmuir AD. The Cutter incident: poliomyelitis following
formaldehyde-inactivated poliovirus vaccination in the United States during the
spring of 1955. I. Background. American Journal of Hygiene 1963;78:16-28.

47. Centers for Disease Control. Summary of 1990-1991 influenza season, United
States. Influenza Branch, Division of Viral and Rickettsial Diseases. Atlanta,
Ga.: 1991.

48. Centers for Disease Control. Eosinophilia-myalgia syndrome--New Mexico. MMWR
1989;38:765-7.

49. Watkins M, Lapham S, Hoy W. Use of a medical center's computerized health care
data base for notifiable disease surveillance. Am J Public Health
1991;81:637-39.

50. Monson RR. Occupational epidemiology. 2nd edition. Boca Raton, Fla.: CRC
Press, Inc., 1990.

51. World Health Organization. Manual of international statistical classification
of diseases, injuries, and causes of death. Ninth Revision. Geneva,
Switzerland: WHO, 1977.

52. Ruzicka LT, Lopez AD. The use of cause-of-death statistics for health situation
assessment: national and international experiences. World Health Stat Q
1990;43:249-62.

53. Brzezinski ZJ. Mortality indicators and health-for-all strategies in the WHO
European region. World Health Stat Q 1986;39:365-78.

54. Kleinman JC. The slowdown in the infant mortality decline. Paediatr Perinat
Epidemiol 1990;4:373-81.

55. Cooper R, Sempos C, Hsieh SC, Kovar MG. Slowdown in the decline of stroke
mortality in the United States, 1978-1986. Stroke 1990;21:1274-9.

56. Schwartz E, Kofie VY, Rivo M, Tuckson R. Black/white comparisons of deaths
preventable by medical intervention: United States and the District of Columbia
1980-1986. Int J Epidemiol 1990;19:591-8.

57. McCord C, Freeman HP. Excess mortality in Harlem. N Engl J Med 1990;322:173-7.

58. Kleinman JC, Kiely JL. Postneonatal mortality in the United States: an
international perspective. Pediatrics 1990;86:1091-97.

59. Li J-Y. Cancer mapping as an epidemiologic research resource in China. Recent
Results Cancer Res 1989;114:115-36.

60. Walter SD, Birnie SE. Mapping mortality and morbidity patterns: an
international comparison. Int J Epidemiol 1991;20:678-89.

61. Fingerhut LA, Kleinman JC. International and interstate comparisons of homicide
among young males. JAMA 1990;263:3292-5.

62. Powell-Griner E, Woolright A. Trends in infant deaths from congenital
anomalies; results from England and Wales, Scotland, Sweden, and the United
States. Int J Epidemiol 1990;19:391-8.

63. Office of Population Censuses and Surveys. The Registrar General's Decennial
Supplement 1970-72, England and Wales. Occupational Mortality. Series DS No.
1. London:Her Majesty's Stationery Office, 1978.

64. New estimates of maternal mortality. Wkly Epidemiol Rec 1991;66:345-52.

65. Lew JF, Glass RI, Gangarosa RE, Cohen IP, Bern C, Moe CL. Diarrheal deaths in
the United States, 1979 through 1987. JAMA 1991;265:3280-4.

66. Weiss KB, Wagener DK. Asthma surveillance in the United States: a review of
current trends and knowledge gaps. Chest 1990;98:179S-184S.

67. Sutter RW, Cochi SL, Brink EW, Sirotkin BI. Assessment of vital statistics and
surveillance data for monitoring tetanus mortality. United States, 1979-1984.
Am J Epidemiol 1990;131:132-42.

68. Hardy RJ, Schroder GD, Cooper SP, Buffler PA, Prichard HM, Crane A.
Surveillance system for assessing health effects from hazardous exposures. Am J
Epidemiol 1990;132:S32-S42.

69. Pickle LW, Mason TJ, Howard N, Hoover R, Fraumeni JF. Atlas of U.S. cancer
mortality among whites: 1950-1980. DHHS Publ No (NIH) 87-2900. Washington,
D.C.: U.S. Government Printing Office, 1987.

70. Brownson RC, Smith CA, Jorge NE et al. The role of data-driven planning and
coalition development in preventing cardiovascular disease. Public Health Rep
1992;107:32-7.

71. Boss LP, Suarez L. Use of data to plan cancer prevention and control programs.
Public Health Rep 1990;105:354-60.

72. Chan LS, Portnoy B, Black BL. California's use of health statistics in child
health planning. Am J Prev Med 1985;1:24-30.

73. Department of Health and Human Services. Healthy people 2000: national health
promotion and disease prevention objectives. U.S. Department of Health and
Human Services. Washington, D.C.:U.S. Government Printing Office, DHHS
Publication No (PHS) 91-50212, 1991.

74. Percy C, Staneck E, Gloeckler L. Accuracy of cancer death certificates and its
effect on cancer mortality statistics. Am J Public Health 1981;71:242-50.

75. Percy C, Muir C. The international comparability of cancer mortality data:
results of an international death certificate study. Am J Epidemiol
1989;129:934-46.

76. Kelson MC, Heller RF. The effect of death certification and coding practices on
observed differences in respiratory disease mortality in 8 EEC countries. Rev
Epidemiol Sante Publique 1983;31:423-2.

77. Vital and health statistics: the 1989 revision of the U.S. standard certificates
and reports. Series 4: documents and committee reports. No. 28. DHHS Publ no
(PHS) 91-1465. Hyattsville, Md.: National Center for Health Statistics, 1991.

78. National Center for Health Statistics. Physicians' handbook on medical
certification of death. Hyattsville, Md.: National Center for Health Statistics,
1987.

79. National Center for Health Statistics. Hospitals' and physicians' handbook on
birth registration and fetal death reporting. Hyattsville, Md.: National Center
for Health Statistics, 1987.

80. Taffel SM, Ventura SJ, Gay GA. Revised U.S. certificate of birth--new
opportunities for research on birth outcome. Birth 1989;16:188-93.

81. Freedman MA, Gay GA, Brockert JE, Potrzebowski PW, Rothwell CJ. The 1989
revision of the U.S. standard certificates of live birth and death and the U.S.
standard report of fetal death. Am J Public Health 1988;78:168-72.

82. Funeral director's handbook on death registration and fetal death reporting.
Hyattsville, Md.: National Center for Health Statistics, 1987.

83. Kleinman JC, Kiely JL. Infant mortality. Healthy People 2000 statistical notes.
Hyattsville, Md.: National Center for Health Statistics, 1991;1(no 2).

84. Vital statistics of the United States, 1988. Vol I - Natality. DHHS Publ No
(PHS) 90-1100. Hyattsville, Md.: National Center for Health Statistics, 1990.

85. Vital statistics of the United States, 1988. Vol II - Mortality, Part A. DHHS
Publ No (PHS) 90-1100. Hyattsville, Md.: National Center for Health Statistics,
1990.

86. Lopez AD. Causes of death: an assessment of global patterns of mortality around
1985. World Health Stat Q 1990;43:91-104.

87. Becker TM, Wiggins CL, Key CR, Samet JM. Signs, symptoms and ill-defined
conditions: a leading cause of death among minorities. Am J Epidemiol
1990;131:664-8.

88. Report of a workshop on improving cause-of-death statistics. In: National
Committee on Vital and Health Statistics, 1990. DHHS Publication no. (PHS)
91-1205. Hyattsville, Md.: National Center for Health Statistics, 1991:53-77.

89. Report of the second workshop on improving cause-of-death statistics. In:
National Committee on Vital and Health Statistics, 1991. DHHS Publication no.
(PHS) 92-1205. Hyattsville, Md.: National Center for Health Statistics,
1992:52-83.

90. Baron RC, Dicker RC, Bussell KE, Herndon JL. Assessing trends in mortality in
121 U.S. Cities, 1970-79, from all causes and from pneumonia and influenza.
Public Health Rep 1988;103:120-8.

91. Interagency Committee on Infant Mortality. Data and surveillance systems related
to programs to reduce infant mortality: a directory of federal efforts.
Atlanta, Ga.: Public Health Service, January 1992.

92. Enterline PE. Extrapolation from occupational studies: a substitute for
environmental epidemiology. Environ Health Perspect 1981;42:39-44.

93. National Institute for Occupational Safety and Health. National traumatic
occupational fatalities, 1980-1985. Atlanta, Ga.:Public Health Service, March
1989.

94. Centers for Disease Control. Injuries associated with horseback riding. MMWR
1990;39:329-32.

95. Centers for Disease Control. Dilaudid-related deaths--District of Columbia,
1987. MMWR 1988;37:425-7.

96. Centers for Disease Control. Medical examiner/coroner reports of hurricanes
associated with Hurricane Hugo--Puerto Rico, 1989. MMWR 1989;38:680-2.

97. Centers for Disease Control. Medical examiner summer mortality surveillance
system. MMWR 1982;31:336-43.

98. Centers for Disease Control. Earthquake-associated deaths--California, 1989.
MMWR 1989;38:767-70.

99. Centers for Disease Control. Child passenger restraint use and motor-vehicle-
related fatalities. MMWR 1991;40:600-2.

100. Centers for Disease Control. Premature mortality due to alcohol-related motor
vehicle traffic fatalities--United States, 1987. MMWR 1988;37:753-5.

101. Centerwall BS. Homicide and the prevalence of handguns: Canada and the United
States, 1976 to 1980. Am J Epidemiol 1991;11:1245-60.

102. Rutstein DD, Mullan RJ, Frazier JM, Halperin WE, Melius JM, Sestito JP. Sentinel
health events (occupational); a basis for physician recognition and public
health surveillance. Am J Public Health 1983;73:1054-62.

103. Woodall JP. Epidemiological approaches to health planning, management, and
evaluation. World Health Stat Q 1988;41:2-10.

104. Van Casteren V. Inventory of sentinel health information systems with GPs in
European countries. Eurosentinel. Brussels, Belgium: Institute of Hygiene and
Epidemiology, January 1991.

105. Van Casteren V, Leurquin P. Eurosentinel: Concerted Action on Sentinel Health
Information Systems with General Practitioners. Final Report. Brussels,
Belgium: Institute of Hygiene and Epidemiology, August 1991.

106. Chavez GF, Mulinare J, Cordero JF. Leading major congenital malformations among
minority groups in the United States, 1981-1986. JAMA 1989;261:205-8.

107. Roper WL, Winkenwerder W, Hackbarth GM, Krakauer H. Effectiveness in health
care: an initiative to evaluate and improve medical practice. N Engl J Med
1988;1197-202.

108. Hartz AJ, Krakauer H, Kuhn EM et al. Hospital characteristics and mortality
rates. N Engl J Med 1989;321:1720-5.

109. Rutstein DD, Berenberg W, Chalmers TC, Child CG, Fishman AP, Perrin EB.
Measuring the quality of medical care. N Engl J Med 1976;294:582-8.

110. Baker EL. Sentinel event notification system for occupational risks (SENSOR):
the concept. Am J Public Health 1989;79S:18-20.

111. Centers for Disease Control. Consensus set of health status indicators for
general assessment of community health status--United States. MMWR
1991;40:449-59.

112. Alter MJ, Hadler SC, Margolis HS et al. The changing epidemiology of hepatitis
B in the United States: need for alternative vaccination strategies. JAMA
1990;262:1201-5.

113. Alter MJ, Hadler SC, Judson FN, et al. Risk factors for acute non-A non-B
hepatitis in the United States and association with hepatitis C virus infection.
JAMA 1990;264:2231-5.

114. Pappaioanou M, Dondero TJ, Petersen LR, Onorato IM, Sanchez CD, Curran JW. The
family of HIV seroprevalence surveys: objectives, methods, and use of sentinel
surveillance for HIV in the United States. Public Health Rep 1990;105:113-9.

115. Green LA, Wood M, Becker L et al. The Ambulatory Sentinel Practice Network:
purposes, methods, and policies. J Fam Pract 1984;18:275-80.

116. Hughes JP, van Belle G, Kukull W, Larson EB, Teri L. On the uses of registries
for Alzheimer disease. Alzheimer Dis Assoc Disord 1989;3:205-17.

117. Froom J, Culpepper J, Grob P et al. Diagnosis and antibiotic treatment of acute
otitis media: report from International Primary Care Network. British Medical
Journal 1990;300:582-6.

118. Van Casteren V, Leurquin P, Declercq E et al. Study of the use of some selected
groups of laboratory tests in general practice. Summary report. Eurosentinel.
Brussels, Belgium: Institute of Hygiene and Epidemiology. June 1991.

119. Fleming DM, Crombie DL. Weekly returns service report for 1990. Birmingham,
England: Birmingham Research Unit of the Royal College of General Practitioners,
June 1991.

120. Fleming D, Ayres JG. Diagnosis and patterns of incidence of influenza,
influenza-like illness and the common cold in general practice. Journal of the
Royal College of General Practitioners 1988;38:159-62.

121. Netherlands Institute of Primary Care. Continuous morbidity registration
sentinel stations in the Netherlands. Utrecht: NIVEL Netherlands Institute of
Primary Health Care, September 1991.

122. Bartelds AIM, Fracheboud J, van der Zee J. The Dutch Sentinel Practice Network:
relevance for public health policy. Utrecht: Netherlands Institute for Primary
Care (NIVEL), 1989.

123. Sprenger MJW, Mulder PGH, Beyer WEP, Masurel N. Influenza: relation of
mortality to morbidity parameters--Netherlands, 1970-1989. Int J Epidemiol
1991;20:1118-24.

124. Lobet MP, Stroobant A, Mertens R et al. Tool for validation of the network of
sentinel general practitioners in the Belgian health care system. Int J
Epidemiol 1987;16:612-8.

125. Stroobant A, Van Casteren V, Thiers G. Surveillance systems from primary-care
data: surveillance through a network of sentinel general practitioners. In:
Eylenbosch WJ, Noah ND (eds.). Surveillance in health and disease. Oxford,
England: Oxford University Press, 1988:62-74.

126. Stroobant A, Lamotte J, Van Casteren V. Epidemiological surveillance of measles
through a network of sentinel general practitioners in Belgium. Int J Epidemiol
1986;15:386-91.

127. Valleron A-J, Bouvet E, Garnerin P et al. A computer network for the
surveillance of communicable diseases: the French experiment. Am J Public
Health 1986;76:1289-92.

128. Costagliola D, Flahault A, Galinec D, Garnerin P, Menares J, Valleron A-J. A
routine tool for detection and assessment of epidemics of influenza-like
syndromes in France. Am J Public Health 1991;81:97-9.

129. Garnerin P. Regional distribution of influenza-like syndrome during weeks 49-51
of 1989 (number of cases/physician/week). British Medical Journal 1990;300:701.

130. Surveillance of influenza-like diseases through a National Computer Network--
France, 1984-1989. MMWR 1989;38:585-7.

131. Massari V, Brunet JB, Bouvet E, Valleron AJ. Attitudes towards HIV-antibody
testing among general practitioners and their patients. Eur J Epidemiol
1988;4:435-8.

132. Maurice S, Megraud F, Vivares C et al. Telematics: a new tool for epidemiologic
surveillance of diarrhoeal diseases in the Aquitaine sentinel network. British
Medical Journal 1990;300:514-670.

133. Centers for Disease Control. Influenza activity--United States, 1991-92. MMWR
1991;40:809-10.

134. Ambulatory Sentinel Practice Network. 1991 Convocation. Chicago, September
1991.

135. Freeman WL, Green LA, Becker LA. Pelvic inflammatory disease in primary care: a
report from ASPN. Fam Med 1988;20:192-6.

136. Green LA, Becker LA, Freeman WL, Iverson DC, Reed FM. Spontaneous abortion in
primary care; a report from ASPN. J Am Board Fam Pract 1988;1:15-23.

137. Ambulatory Sentinel Practice Network. An exploratory report of chest pain in
primary care. A report from ASPN. J Am Board Fam Pract 1990;3:143-50.

138. Petersen LR, Calonge NB, Chamberland ME, Engel R, Herring NC. Methods of
surveillance for HIV infection in primary care outpatients in the United States.
Public Health Rep 1990;105:158-62.

139. Hickner J. Practice-based primary care research networks. In: Hibbard H,
Nutting PA, and Grady (eds.). Primary care research: theory and methods.
Rockville, Md.:Agency for Health Care Policy and Research, 1991:13-22.

140. Culpepper L, Froom J. The International Primary Care Network: purpose, methods,
and policies. Fam Med 1988;20:197-201.

141. Goldberg J, Gelfand HM, Levy PS. Registry evaluation methods: a review and
case study. Epidemiol Rev 1980;2:210-20.

142. Weddell JM. Registers and registries: a review. Int J Epidemiol 1973;2:221-8.

143. Pollack DA, McClain PW. Trauma registries: current status and future prospects.
JAMA 1989;262:2280-3.

144. Ahlbom A. Acute myocardial infarction in Stockholm--a medical information
system as an epidemiologic tool. Int J Epidemiol 1978:271-6.

145. Whiting L. The central registry for child abuse cases: rethinking basic
assumptions. Child Welfare 1977;56:761-7.

146. Hammar N, Nervrand C, Ahlmark G et al. Identification of cases of myocardial
infarction: hospital discharge data and mortality compared to myocardial
infarction community registers. Int J Epidemiol 1991:114-20.

147. Brown LJ, Scott RS. A population-based register-development and applications.
Community Health Stud 1988;12:437-43.

148. Eisenbud M, Lisson J. Epidemiological aspects of beryllium-induced nonmalignant
lung disease: a 30-year update. J Occup Med 1983;25:196-202.

149. Johnson A, King R. A regional register of early childhood impairments: a
discussion paper. Community Medicine 1989;11:352-63.

150. Agency for Toxic Substances and Disease Registry. Policies and procedures for
establishing a national registry of persons exposed to hazardous substances.
Atlanta, Ga.: Agency for Toxic Substances and Disease Registry, 1988.

151. Zack M. The pros and cons of exposure registries. In: Proceedings of the
National Conference on Hazardous Wastes and Environmental Emergencies.
1985:161-64.

152. Goldhaber MK, Tokuhata GK, Digon E et al. The Three Mile Island population
registry. Public Health Rep 1983;98:603-9.

153. The EUROCAT Working Group. Preliminary evaluation of the impact of the
Chernobyl radiological contamination on the frequency of central nervous system
malformations in 18 regions of Europe. Paediatr Perinat Epidemiol
1988;2:253-64.

154. Anderson RE, Key CR, Yamamoto T, Thorslund T. Aging in Hiroshima and Nagasaki
atomic bomb survivors. Speculations based upon the age-specific mortality of
persons with malignant neoplasms. Am J Pathol 1974;75:1-11.

155. Sprince NL, Kazemi H. Beryllium disease. In: Rom WN (ed.). Environmental and
occupational medicine. Boston, Massachusetts: Little, Brown, and Co.,
1983:481-90.

156. Austin DF. Cancer registries: a tool in epidemiology. In: Lilienfeld AM (ed.).
Reviews in cancer epidemiology. Vol 2. New York: Elsevier, 1983:119-39.

157. Menck HR, Garfinkel L, Dodd GD. Preliminary report of the National Cancer Data
Base. Cancer 1991;41:7-8.

158. Centers for Disease Control. National survey of trauma registries--United
States, 1987. MMWR 1989;38:857-9.

159. Centers for Disease Control. Report from the trauma registry workshop,
including recommendations for hospital-based trauma registries. J Trauma
1989;29:827-34.

160. Blot WJ, Fraumeni JF Jr, Mason TJ, Hoover RN. Developing clues to environmental
cancer: a stepwise approach with the use of cancer mortality data. Environ
Health Perspect 1979;32:53-8.

161. MacLennan R, Muir C, Steinitz R, Winkler A (eds.). Cancer registration and its
techniques. IARC scientific publications, no. 21. Lyon, France: International
Agency for Research on Cancer, 1978.

162. Clemmesen J. Uses of cancer registration in the study of carcinogenesis. J
Natl Cancer Inst 1981;67:5-13.

163. Shimizu Y, Schull WJ, Kato H. Cancer risk among atomic bomb survivors. The
RERF Life Span Study. Radiation Effects Research Foundation. JAMA
1990;264:601.

164. Tsyb AF, Dedenkov AN, Ivanov VK, Stepanenko VF, Pozhidaev w. The development
of an all-union registry of persons exposed to radiation resulting from the
accident at the Chernobyl atomic power station. Medical Radiology 1989;34:3-6.

165. Axelson O. Occupational and environmental exposures to radon: cancer risks.
Annu Rev Public Health 1991;12:235-55.

166. Rowland RE, Stehney AF, Lucas HF Jr. Dose-response relationships for female
radium dial workers. Radiat Res 1978;76:368-83.

167. Archer VE. Lung cancer risks of underground miners: cohort and case-control
studies. Yale J Biol Med 1988;61(3):183-93.

168. Landrigan PJ, Wilcox KR Jr, Silva J Jr, Humphrey HE, Kauffman C, Heath CW Jr.
Cohort study of Michigan residents exposed to polybrominated biphenyls:
epidemiologic and immunologic findings. Ann N Y Acad Sci 1979;320:284-94.

169. Ries LAG, Hankey BF, Miller BA, Hartman AM, Edwards BK (eds.). Cancer
statistics review 1973-88. NIH publication no. (NIH) 91-2789. Bethesda, Md. :
National Cancer Institute, 1991.

170. Muir C, Waterhouse J, Mack T, Powell J, Whelan S (eds.). Cancer incidence in
five continents. Volume V. Lyon, France: International Agency for Research on
Cancer, 1987.

171. Parkin D. Surveillance of cancer. In: Eylenbosch WJ, Noah ND (eds.).
Surveillance in health and disease. New York: Oxford University Press. 1988 pp
125-42.

172. Edmonds LD, Layde PM, James LM, Flynt JW, Erickson JD, Oakley GP Jr. Congenital
malformations surveillance: two American systems. Int J Epidemiol
1981;10:247-52.

173. Lynberg MC, Edmonds LD. Surveillance of birth defects. In: Halperin WE, Baker
EL, Monson RR (eds.). Public health surveillance. New York:Van Nostrand
Reinhold, 1992:157-77.

174. Holtzman NA, Khoury MJ. Monitoring for congenital malformations. Annu Rev
Public Health 1986;7:237-66.

175. Erickson JD, Mulinare J, McClain PW et al. Vietnam veterans' risks for
fathering babies with birth defects. JAMA 1984;252:903-12.

176. Becerra JE, Khoury MJ, Cordero JF, Erickson JD. Insulin-dependent diabetes
mellitus in pregnancy and risk for specific birth defects. Pediatr Res
1988;23:266A.

177. Mulinare J, Cordero JF, Erickson JD, Berry RJ. Periconceptional use of
multivitamins and the occurrence of neural tube defects. JAMA 1988;260:3141-5.

178. Weatherall JA, de Wals P, Lechat MF. Evaluation of information systems for the
surveillance of congenital malformations. Int J Epidemiol 1984;13:193-6.

179. Lechat MF. EUROCAT report: surveillance of congenital anomalies, years 1980-
1986. Brussels: Catholic University of Louvain, 1989.

180. Lammer EJ, Sever LE, Oakley GP, Jr. Teratogen update: valproic acid.
Teratology 1987;35(3):465-73.

181. Kovar MG. Data systems of the National Center for Health Statistics. Vital &
Health Statistics. Hyattsville, Maryland: National Center for Health
Statistics. DHHS publication no. (PHS) 89-1325, (Vital & Health Statistics;
series 1; no. 23), 1989.

182. Massey JT. Overview of the National Health Interview Survey and its sample
design. Hyattsville, Maryland: National Center for Health Statistics. DHHS
publication no. (PHS) 89-1384 (Vital & Health Statistics; series 2; no. 110),
1987:1-5.

183. Moore TF, Tadros W. The 1985-94 NHIS sample design. Hyattsville, Maryland:
National Center for Health Statistics. DHHS publication no. (PHS) 89-1384
(Vital & Health Statistics; series 2; no. 110), 1989:18-27.

184. Centers for Disease Control. Behavior risk factor surveillance, 1988. MMWR
1990;39(Suppl 2):1-6.

185. Centers for Disease Control. Increased awareness in urban and rural areas--
Missouri, 1988-91. MMWR 1992;41:323-5.

186. Centers for Disease Control. Cigarette smoking among Chinese, Vietnamese, and
Hispanics, 1989-91. MMWR 1992;41:362-7.

187. Kolbe LJ. An epidemiological surveillance system to monitor the prevalence of
youth behaviors that most affect health. Health Education 1990;21(6):44-4.

188. Centers for Disease Control. Participation of high school students in school
physical education--United States, 1990. MMWR 1991;40(35):607, 613-5.

189. Centers for Disease Control. Tobacco use among high school students--United
States, 1990. MMWR 1991;40(36):617-9.

190. Centers for Disease Control. Attempted suicide among high school
students--United States, 1990. MMWR 1991;40(37):633-5.

191. Centers for Disease Control. Weapon-carrying among high school students--United
States, 1990. MMWR 1991;40(40):681-4.

192. Centers for Disease Control. Body-weight perception and selected
weight-management goals and practices of high school students--United States,
1990. MMWR 1991;40(43):741, 747-50.

193. Centers for Disease Control. Sexual behavior among high school students--United
States, 1990. MMWR 1992;40(51-52):885-8.

194. Centers for Disease Control. Current tobacco, alcohol, marijuana, and cocaine
use among high school students--United States, 1990. MMWR 1991;40(38):659-63.

195. Graves EJ. National Hospital Discharge Survey. Hyattsville, Maryland:
National Center for Health Statistics. DHHS publication no. (PHS) 89-1760
(Vital & Health Statistics; series 13; no. 99), 1989.

196. Nelson C, McLemore T. The National Ambulatory Medical Care Survey: 1975-81 and
1985. Hyattsville, Maryland: National Center for Health Statistics. DHHS
publication no. (PHS) 88-1754 (Vital & Health Statistics; series 13; no. 93),
1988.

197. Hahn RA, Teutsch SM, Rothenberg RB, Marks JS. Excessive deaths from nine
chronic diseases in the United States, 1986. JAMA 1990;264:2654-9.

198. DeLozier JE, Gagnon RO. National Ambulatory Medical Care Survey, 1989 summary.
Advance data from vital and health statistics of NCHS. Hyattsville, Md. :
National Center for Health Statistics, 1991;203:1-11.

199. Schilling S, Wilson D. Wisconsin Ambulatory Medical Care Survey, 1986-1987.
Madison: Wisconsin Department of Health and Social Services, 1987.

200. Arrowsmith JB, Kennedy DL, Kuritsky JN, Faich GA. National patterns of aspirin
use and Reye syndrome reporting. United States, 1980 to 1985. Pediatrics
1987;79:858-63.

201. Mathiowetz N, Northrup D, Sperry S, Waksberg J. Linking the National Survey of
Family Growth with the National Health Interview Survey. Hyattsville, Maryland:
National Center for Health Statistics. DHHS publication no. (PHS) 87-1377
(Vital & Health Statistics; series 2; no. 103), 1987.

202. Dawson DA. Family structure and children's health: United States, 1988.
Hyattsville, Maryland: National Center for Health Statistics. DHHS publication
no. (PHS) 91-1506, (Vital & Health Statistics; series 10; no. 178), 1991.

203. Centers for Disease Control and Health Resources and Services Administration.
Health Department Profiles. Maternal Infant and Child Health Programs Data
Analysis and Tracking Approaches Conference. Atlanta, Ga.:Public Health
Service, January 1992.

204. McDowell A, Engel A, Massey JT, Maurer K. Plan and operation of the Second
National Health and Nutrition Examination Survey, 1976-1980. Hyattsville,
Maryland: National Center for Health Statistics. DHHS publication no.
(PHS) 81-1317 (Vital & Health Statistics; series 1; no. 15), 1981.

205. Miller H. Plan and operation of the health and nutrition examination survey:
United States--1971-1973. Hyattsville, Maryland: National Center for Health
Statistics. DHEW publication no. (HRA) 76-1310 (Vital & Health Statistics;
series 1; no. 10a), 1973.

206. Maurer KR. Plan and operation of the Hispanic health and nutrition examination
survey 1982-84. Hyattsville, Maryland: National Center for Health Statistics.
DHHS publication no. (PHS) 85-1321 (Vital & Health Statistics; series 1;
no. 19), 1985.

207. Finucane FF, Freid VM, Madans JH et al. Plan and operation of the NHANES I
epidemiologic followup study, 1986. Hyattsville, Maryland: National Center for
Health Statistics. DHHS publication no. (PHS) 90-1307 (Vital & Health
Statistics; series 1; no. 25), 1990.

208. Annest JL, Pirkle JL, Makuc D, Neese JW, Bayse DD, Kovar MG. Chronological
trend in blood lead levels between 1976 and 1980. N Engl J Med
1983;308(23):1373-7.

209. Lunde AS, Lundeborg S, Lettenstrom GS, Thygesen L, Huebner J. The person-number
systems of Sweden, Norway, Denmark, and Israel. Hyattsville, Md. : National
Center for Health Statistics. DHHS publication no. (PHS) 80-1358 (Vital and
Health Statistics; series 2, no. 84), 1980.

210. Naessen T, Parker R, Persson I, Zack M, Adami H-O. Time trends in incidence
rates of first hip fracture in the Uppsala health care region, Sweden, 1965-
1983. Am J Epidemiol 1989;130:289-99.

211. Strom BL, Carson JL. Use of automated data bases for pharmacoepidemiology
research. Epidemiol Rev 1990;12:87-107.

212. West R. Saskatchewan health data bases: a developing resource. Am J Prev Med
1988:4 Supplement.

213. Guess HA, West R, Strand LM et al. Fatal upper gastrointestinal hemorrhage or
perforation among users and nonusers of nonsteroidal anti-inflammatory drugs in
Saskatchewan, Canada, 1983. J Clin Epidemiol 1988;41:35-45.

214. West R, Sherman GJ, Downey W. A record linkage study of valproate and
malformations in Saskatchewan. Can J Public Health 1986;76:226-8.

215. Kurland LT, Schoenberg BS, Annegers JF et al. The incidence of primary
intracranial neoplasms in Rochester, Minnesota, 1950-1977. Ann N Y Acad Sci
1982;381:6-16.

216. Wilson MG, Michet CJ, Ilstrup DM, Melton LJ. Idiopathic symptomatic
osteoarthritis of the hip and knee; a population-based incidence study. Mayo
Clin Proc 1990;65:1214-21.

217. Paterson JG. Surveillance systems from hospital data. In: Eylenbosch WJ, Noah
ND (eds.). Surveillance in health and disease. Oxford, England: Oxford
University Press, 1988:49-61.

218. Roger FH. The minimum basic data set for hospital statistics in the EEC.
Luxembourg: Office for Official Publications of the European Communities, 1981.

219. Agency for Health Care Policy and Research. Report to Congress: the feasibility
of linking research-related data bases to federal and non-federal medical
administrative data bases. AHCPR No. 91-0003. April 1991.

220. Jick H. The Commission on Professional and Hospital Activities--Professional
Activity Study: a national resource for the study of rare illnesses. Am J
Epidemiol 1979;109:625-7.

221. Public Health Service-Health Care Financing Administration. The international
classification of diseases clinical modification, 9th revision. DHHS
Publication no. (PHS) 80-1260, Washington, D.C.: U.S. Government Printing Office,
1980.

222. Martin ML, Edmonds LD. Use of birth defects monitoring programs for assessing
the effects of maternal substance abuse on pregnancy outcomes. In:
Methodological issues in controlled studies on effects of prenatal exposure to
drug abuse. National Institute on Drug Abuse Research Monograph Series.
Washington, D.C.: U.S. Government Printing Office, 1991:66-38.

223. Centers for Disease Control. Temporal trends in the prevalence of congenital
malformations at birth based on the Birth Defects Monitoring Program, United
States, 1979-1987. MMWR 1990;39(SS-4):19-23.

224. Stroup NE, Edmonds L, O'Brien TR. Renal agenesis and dysgenesis: are they
increasing? Teratology 1990;42:383-95.

225. Chavez GF, Mulinare J, Edmonds LD. Epidemiology of Rh hemolytic disease of the
newborn in the United States. JAMA 1991;265:3270-4.

226. Report on the need to collect external cause-of-injury codes in hospital
discharge data systems. In: National Committee on Vital and Health Statistics,
1991. DHHS Publication no. (PHS) 92-1205. Hyattsville, Md. : National Center
for Health Statistics, 1992.

227. Bal DG, Kizer KW, Felten PG, Mozar HN, Niemeyer D. Reducing tobacco consumption
in California: development of a statewide anti-tobacco use campaign. JAMA
1990;264:1570-4.

228. Helbing C, Schieber G. Use of Medicare data in international comparisons.
Health Policy 1990;15:45-66.

229. Fisher ES, Baron JA, Malenka DJ, Barret J, Bubolz TA. Overcoming potential
pitfalls in the use of Medicare data for epidemiologic research. Am J Public
Health 1990;80:1487-90.

230. Eggers PW, Connerton R, McMullan M. The Medicare experience with end-stage
renal disease: trends in incidence, prevalence, and survival. Health Care
Financing Review 1984;5:69-88.

231. Health Care Financing Administration. Medicare/Medicaid decision support
systems: Office of Statistics and Data Management and Strategy. Baltimore,
Md.:Health Care Financing Administration Publication (HCFA) 03-272, 1988.

232. Chassin MR et al. Variations in the use of medical and surgical services by the
Medicare population. N Engl J Med 1986;314:285-90.

233. Centers for Disease Control. End-stage renal disease associated with diabetes--
United States, 1988. MMWR 1989;38:546-8.

234. Kellie SE, Brody JA. Sex-specific and race-specific hip fracture rates. Am J
Public Health 1990;80:326-8.

235. Wennberg JE, Freeman JL, Shelton RM, Bubolz TA. Hospital use and mortality
among Medicare beneficiaries in Boston and New Haven. N Engl J Med
1989;321:1168-73.

236. Jacobson SJ, Goldberg J, Miles TP, Brody JA, Stiers W, Rimm AA. Regional
variation in the incidence of hip fracture: U.S. white women aged 65 years and
older. JAMA 1990;264:500-2.

237. Stroup NE, Freni-Titulaer LWJ, Schwartz JJ. Unexpected geographic variation in
rates of hospitalization for patients who have fracture of the hip. J Bone
Joint Surg 1990;72:1294-8.

238. Whittle J, Steinberg EP, Anderson GF, Herbert MS. Accuracy of Medicare claims
data for estimation of cancer incidence and resection rates among elderly
Americans. Med Care 1991;29:1226-36.

239. National Institute of Diabetes, Digestive, and Kidney Diseases. U.S. Renal Data
System. USRDS 1991 annual data report. Bethesda, Md. : National Institutes of
Health, 1991.

240. France G, Barrow M. Home accident surveillance system. In: Eylenbosch WJ, and
Noah ND (eds.). Surveillance in health and disease. Oxford, England: Oxford
University Press, 1988:202-7.

241. Centers for Disease Control. Leading work-related injuries--United States.
MMWR 1984;33:213-5.

242. Centers for Disease Control. Bicycle-related injuries: data from the National
Electronic Injury Surveillance System. MMWR 1987;36:269-71.

243. Centers for Disease Control. Poisoning among children--United States. MMWR
1986;35:149-52.

244. Hopkins RS. Consumer product-related injuries in Athens, Ohio, 1980-85:
assessment of emergency room-based surveillance. Am J Prev Med 1989;5:104-12.

245. Gallagher SS, Finison K, Guyer B, Goodenough S. Incidence of injuries among
87,000 Massachusetts children and adolescents: results of the 1980-81 statewide
childhood injury prevention program surveillance system. Am J Public Health
1984;74:1340-6.

246. Pollack DA, Holmgreen P, Lui K-J, Kirk ML. Discrepancies in the reported
frequency of cocaine-related deaths, United States, 1983 through 1988. JAMA
1991;266:2233-7.

247. King WD. Pediatric injury surveillance: use of a hospital discharge data base.
South Med J 1991;84:342-8.

248. Colliver JD, Kopstein AN. Trends in cocaine abuse reflected in emergency room
episodes reported to DAWN. Public Health Rep 1991;106:59-68.

249. Volans GN, Wiseman HM. Surveillance of poisoning -- the role of poison control
centers. In: Eylenbosch WJ and Noah ND (eds.). Surveillance in health and
disease. Oxford, England: Oxford University Press, 1988:255-72.

250. Blanc PD, Rempel D, Maizlish N, Hiatt P, Olson KR. Occupational illness: case
detection by poison control surveillance. Ann Intern Med 1989;111:238-44.

251. Adams MM, Shulman HB, Bruce C, Hogue C, Brogan D, the PRAMS Working Group. The
pregnancy risk assessment monitoring system: design, questionnaire, data
collection and response rates. Paediatric and Perinatal Epidemiology
1991;5:333-46.

252. Griffin MR, Ray WA, Livengood JR et al. Risk of sudden infant death syndrome
after immunization with the diphtheria-tetanus-pertussis vaccine. N Engl J Med
1988;319:618-23.

253. Yip R, Fleshood L, Spillman TC, Binkin NJ, Wong FL, Trowbridge FL. Using linked
program and birth records to evaluate coverage and targeting in Tennessee's WIC
program. Public Health Rep 1991;106:176-80.

254. Centers for Disease Control. Anemia during pregnancy in low-income women. MMWR
1990;39:73-6.

255. Yip R, Binkin NJ, Fleshood L, Trowbridge FL. Declining prevalence of anemia
among low income children in the United States. JAMA 1987;258:1619-23.

256. Gayle HD, Dibley MJ, Marks JS, Trowbridge FL. Malnutrition in the first two
years of life. Am J Dis Child 1987;141:531-4.

1. U.S. Department of Health and Human Services, Public Health Service. Healthy People
2000. National Health Promotion and Disease Prevention Objectives. 1991. DHHS Pub. No.
(PHS) 91-50212.

2. Centers for Disease Control. Consensus set of health status indicators for the general
assessment of community health status - United States. MMWR 1991;40:449-451.

3. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. Mandatory reporting of
infectious diseases by clinicians. JAMA 1989;262:3018-3026.

4. American Cancer Society. Cancer Facts and Figures - 1991. American Cancer Society.
1991.

5. Teutsch SM, Herman WH, Dwyer DM, Lane JM. Mortality among diabetic patients using
continuous subcutaneous insulin infusion pumps. N Engl J Med 1984;310:361-368.

6. Ellwood PM. Outcomes management. A technology of patient experience. N Engl J Med
1988;318:1549-1556.

Chapter IV

Management of the Surveillance System
and Quality Control of Data

Kevin M. Sullivan

Norma P. Gibbs
Carol M. Knowles

"It is possible to fail in many ways... while to succeed is possible only in one way
(for which reason also one is easy and the other difficult--to miss the mark easy, to
hit it difficult)."

Aristotle

INTRODUCTION

This chapter describes the practical management and quality control of a
disease-reporting system for notifiable diseases at the disease- and injury-report-
gathering stage--as in a city/county health department, a state health department, or
the federal government. It is important to note that in most health jurisdictions there are laws
that specify which diseases and injuries are reportable, who is responsible for
reporting, and what method and timing of reporting are to be used (e.g., by telephone
within 24 hours of diagnosis or by mail within 1 week of diagnosis) (1). Because
these reporting laws differ by geographic locale and municipal unit, the material in
this chapter is restricted to a general overview of a disease-surveillance system,
recognizing that aspects may not be applicable to all areas and that issues specific
to jurisdictions are not covered completely. The term "state" is used in this
discussion; although "state" is a geographic designation in the United States,
analogous geographic units have similar functions in other countries.

Types of Reports and Surveillance Systems

There are three categories of notifiable disease reports: a) those in which
information is collected on each individual with the disease or injury; b) conditions
for which only the total number of patients seen is reported; and c) conditions for
which the total number of cases is reported if, and only if, there is judged to be an
epidemic. Each category generally requires specific forms. Once a report has been
received, for many conditions a nurse or other disease investigator may request that
the reporting unit provide information for additional disease/injury-investigation
forms.

A traditional way of classifying a surveillance system is as passive or active (2). A
passive surveillance system can be described as one in which the health jurisdiction
receives disease/injury reports from physicians or other individuals or institutions
as mandated by state law. In contrast, an active surveillance system is established
when the health department regularly contacts reporting sources (e.g., once per week)
to elicit reports, including negative reports (no cases). An active surveillance
system is likely to provide more complete reporting but is much more labor intensive
and is therefore more costly to operate than a passive system.

In most surveillance systems, any health worker who has knowledge of an individual
with a reportable condition may be required to report that case to the health
department. In a sentinel surveillance system, only selected physicians or
institutions report disease or injury. Proponents of sentinel systems maintain that
it is preferable to receive disease/injury reports of high quality from a few sources
than to receive data of unknown quality from (in theory) all potential reporting
sources in a population. This, of course, presupposes that the reporters in a
sentinel system will, in fact, provide high-quality information on a reliable basis.
It should also be noted that sentinel systems are inadequate when every case of a
particular condition needs to be identified.

Most states have comprehensive, passive disease surveillance systems. For example,
"as required by law in all 50 U.S. states," any health worker having knowledge of a
person with a reportable condition is obligated to report that case to the local/state
health department (1). Regular contact initiated by the health department and
directed to all possible reporting sources is not feasible or required.

Collection of Data

Laws for reporting disease and injury at the state and local levels not only specify
who is responsible for reporting, but to whom the reports are to be directed. In the
least complicated reporting situation, a physician diagnoses a reportable condition
and sends the appropriate report form to the local health department, where the data
on that case are added to the appropriate disease/injury-surveillance system.
Summaries of reports are reviewed regularly and analyzed by staff at the local health
department to identify any conditions that are being reported more frequently than
expected on the basis of past experience. After disease/injury reports have been
processed at the local level, the information is forwarded to the state health
department to be consolidated with reports from other local health departments, and
the composite data are examined for trends. Each state health department then
voluntarily reports these cases to the Centers for Disease Control (CDC) on a weekly
basis (3).
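
The review for conditions "reported more frequently than expected on the basis of past experience" can be sketched in a few lines. The example below is only an illustration, not the method used by any particular health department: it assumes, hypothetically, that the expectation is the mean of counts for the same period in earlier years and that a count above the mean plus two standard deviations is worth a closer look. The function name, the threshold, and all counts are invented for the example; Chapter V discusses analytic approaches in detail.

```python
from statistics import mean, stdev


def flag_excess(current_counts, history, z=2.0):
    """Return conditions whose current count exceeds mean + z * SD of past counts.

    The mean + z * SD rule is an assumed, illustrative threshold.
    """
    flagged = {}
    for condition, count in current_counts.items():
        past = history.get(condition, [])
        if len(past) < 2:
            continue  # too little past experience to form an expectation
        threshold = mean(past) + z * stdev(past)
        if count > threshold:
            flagged[condition] = (count, round(threshold, 1))
    return flagged


# Hypothetical counts for the same reporting period in five earlier years
history = {
    "hepatitis A": [4, 6, 5, 3, 7],
    "salmonellosis": [12, 15, 11, 14, 13],
}
current_counts = {"hepatitis A": 14, "salmonellosis": 13}

flagged = flag_excess(current_counts, history)
for condition, (count, threshold) in flagged.items():
    print(f"{condition}: {count} reports (threshold {threshold})")
```

A flag of this kind is only a prompt for review by staff; changes in reporting practice or case definitions can produce the same signal as a true increase.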

This reporting scheme can be reasonably effective, but problems can arise. For
example, how does one notify health-care professionals about the requirements and
procedures for reporting to the health department? Who is responsible for such
notification? How are new practitioners in the jurisdiction identified and notified
of their responsibility to report? Who provides quality assurance for the process?
How? At what frequency? Other issues include reporting of suspected cases while
laboratory results are pending, the desired routing of reports, the mechanism for
updating/completing reports as additional information is received, reporting of
disease/injury among transients (e.g., military personnel or migrant workers), and
defining appropriate time frames for reporting a case of a specific disease/injury
(Table IV.1).

There may not be one correct answer to each of the questions formulated in Table IV.1
that applies in all situations; the answers are often situation dependent. However, a
disease- or injury-surveillance system should document how to respond to each of the
above questions so that disease reporting is performed in a consistent manner for each
disease.

Entry of Data into the Surveillance System

With the availability of microcomputers, many health departments enter disease/injury
reports into computerized data bases. It is essential that one person be responsible
for management of the surveillance data base (i.e., be designated and act as the
data-base manager [DBM]) (4). A primary responsibility of the DBM is maintaining the
integrity and completeness of the data base. Concerns of the DBM are summarized in
Table IV.2.

Checklist for Data-Base Manager

With any surveillance system for disease/injury, there is a need to establish
procedures for maintenance and retention of paper disease-report forms (called "source
documents"). In general, the individual disease reports are filed by year of report
(or onset), by disease, and in alphabetical order by the patient's last name. If not
already specified by disease-reporting laws, retention periods should be designated
for maintaining these files for reference purposes. Electronic reporting may obviate
the need for redundant paper records. (See Chapter XI for more information on
computerized surveillance systems.)
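
Where the report forms are also indexed electronically, the filing order described above (year of report, then disease, then patient's last name) corresponds to a simple multi-key sort. The sketch below assumes a hypothetical record layout with the fields "year", "disease", and "last_name"; the records themselves are invented.

```python
# Hypothetical index entries for paper source documents
reports = [
    {"year": 1992, "disease": "measles", "last_name": "Nguyen"},
    {"year": 1991, "disease": "pertussis", "last_name": "Smith"},
    {"year": 1992, "disease": "measles", "last_name": "Adams"},
    {"year": 1991, "disease": "measles", "last_name": "Lopez"},
]


def filing_key(report):
    """Sort key: year of report, then disease, then last name (case-insensitive)."""
    return (report["year"], report["disease"].lower(), report["last_name"].lower())


filed = sorted(reports, key=filing_key)
for r in filed:
    print(r["year"], r["disease"], r["last_name"])
```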

Documentation and Training

Documentation is a critical step in the development of a computerized system--but one
that is often neglected. A users' manual is needed and should provide both general
and detailed descriptions of the system, including the following topics (4):

• General description of the entire system

• Detailed procedures for installing the system

• Detailed procedures for operating the system

• Detailed procedures for maintaining the system

The DBM should maintain contact with the programmer for the system so that
modifications to record formats and programs can be documented by the manager; the
programmer should also maintain a file of all such changes. Thorough, clear
documentation facilitates the addition of new programs and modifications in equipment
or operations (4).

A formal training program should be established for persons involved in the daily
operation of the surveillance system. These staff members must feel that they can
participate in shaping the system, and their ideas and comments should be elicited as
part of the training process (4). The DBM should schedule a series of training
classes that include hands-on experience with the data-base software. Written
operational procedures--including guidelines for interpreting information contained in
the disease/injury report forms--should be distributed and explained at this time.
Software tutorial packages and videotapes (interactive or presentational) can also be
useful tools for training.

Management of the organization responsible for the surveillance system should also be
oriented to the system in one or several briefing sessions.

Analysis and Standard Reports

An effective surveillance system must be designed to cover all the following areas in
its reporting process:

• Determining whether a condition is being reported more frequently than
expected (see Chapter V)

• Responding appropriately to reports of individual cases

• Detecting clusters of cases

• Notifying public health practitioners of the presence of specific
conditions in their areas

• Reinforcing the importance of reporting through facilitating effective
control/prevention activities

The completeness and timeliness of case reports in the surveillance system should be
assessed regularly. This assessment should include both the proportion of reports in
which each variable (such as age of patient or date of onset of the condition) is
completed and the time between onset of the condition and receipt of the report. At the local
health department, this information can be analyzed by reporting source (e.g.,
clinicians or hospital or diagnostic laboratory staff) or, at the state level, by
health jurisdiction. These analyses should identify groups or institutions in need of
additional information or training on disease reporting.
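
As a rough illustration of such an assessment, the sketch below computes, for a handful of invented reports, the proportion of reports in which each variable is completed and the median number of days between onset and receipt, grouped by reporting source. The field names ("source", "age", "onset", "received") and the use of the median as the summary of timeliness are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def completeness(reports, variables):
    """Proportion of reports in which each variable is filled in."""
    total = len(reports)
    if total == 0:
        return {v: 0.0 for v in variables}
    return {
        v: sum(1 for r in reports if r.get(v) not in (None, "")) / total
        for v in variables
    }


def median_delay_by_source(reports):
    """Median days from onset to receipt of report, grouped by reporting source."""
    delays = defaultdict(list)
    for r in reports:
        if r.get("onset") and r.get("received"):
            delays[r["source"]].append((r["received"] - r["onset"]).days)
    return {source: median(days) for source, days in delays.items()}


# Hypothetical case reports
reports = [
    {"source": "clinician", "age": 34, "onset": date(1992, 3, 2), "received": date(1992, 3, 9)},
    {"source": "clinician", "age": None, "onset": date(1992, 3, 5), "received": date(1992, 3, 8)},
    {"source": "laboratory", "age": 61, "onset": None, "received": date(1992, 3, 10)},
    {"source": "laboratory", "age": 8, "onset": date(1992, 3, 1), "received": date(1992, 3, 15)},
]

comp = completeness(reports, ["age", "onset"])
delays = median_delay_by_source(reports)
print(comp)
print(delays)
```

The same functions could be run by health jurisdiction at the state level simply by grouping on a jurisdiction field instead of the reporting source.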

Most surveillance systems for infectious disease rely primarily on receipt of case
reports from physicians and other health-care providers. To encourage reporting by
these health professionals, many local health departments and most state health
departments publish newsletters containing data and other information of interest to
the contributors to the data base (1). Such newsletters may include standard tabular
reports of the occurrence of a reportable condition by week or month, with a year-to-
date summary. They may also include narrative reports about conditions of interest or
about other topics relevant to public health. Such feedback is important to
demonstrate to those involved with the system that the data are being used, as well as
to accomplish communications goals (see Chapter VII).
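
A standard tabular report of this kind, with counts by week and a year-to-date summary, might be produced along the following lines. This is a sketch under stated assumptions: it uses ISO week numbers from Python's standard library, which may not match the reporting weeks a given jurisdiction uses, it lists only weeks in which at least one case was reported, and the dates are invented.

```python
from collections import Counter
from datetime import date


def weekly_table(report_dates, year):
    """Return rows of (ISO week, cases that week, cumulative year-to-date cases)."""
    weekly = Counter(
        d.isocalendar()[1] for d in report_dates if d.isocalendar()[0] == year
    )
    rows, cumulative = [], 0
    for week in sorted(weekly):
        cumulative += weekly[week]
        rows.append((week, weekly[week], cumulative))
    return rows


# Hypothetical report dates for one condition
report_dates = [
    date(1992, 1, 6),
    date(1992, 1, 8),
    date(1992, 1, 14),
    date(1992, 1, 28),
]

rows = weekly_table(report_dates, 1992)
print("Week  Cases  Year-to-date")
for week, cases, ytd in rows:
    print(f"{week:>4}  {cases:>5}  {ytd:>12}")
```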

The information needs of management and operations personnel should be considered as
programs are developed for standard reports from the data base. Standard reports
should include information on time, place, and person, and should be produced in a
form that can be easily interpreted by epidemiologists and management. The purpose of
each report should dictate the appearance of the output, e.g., a table, map, or graph.
Most types of reports should be produced on a regular basis and according to a set
schedule, but others may be created only on an as-needed basis.

Data Sharing

In some situations disease and injury reports may be shared by various local or state
health departments, particularly with conditions that require additional investigation
or follow-up. For example, when a resident of one county/state is examined and given
a particular diagnosis at a hospital in a neighboring county/state, health authorities
need to be able to track the condition back to its source in order to respond
appropriately.

Occasionally, disease and injury reports are sent directly to the state health
department, bypassing the local health department. If that happens, the state needs
to notify the appropriate local health department so that the reports can be added to
the disease/injury reporting system at the local level. Additional data that the
state may collect should also be shared with the local health department.

The DBM should be aware of other sources of information that may need to be accessed
and compared with or added to the data collected in his or her own system — e.g.,
laboratory results, epidemiologic information for specific conditions, population
estimates, and mortality records. Through careful planning and coordination on the
part of managers of reporting systems, standard coding schemes can be adopted as data
systems evolve. These actions facilitate the sharing and use of data.

System Maintenance and Security

Maintenance of a system should be directed first toward reducing errors introduced
through flaws in design and through content changes (e.g., changes in the list of
notifiable conditions) and second toward improving the system's scope and services.
Related activities can be categorized as routine maintenance, emergency maintenance,
requests for special reports, and system improvements. Maintenance should not be
performed on an informal or first-come, first-served basis. An effective maintenance
program includes the following steps (4):

• Back up data and system files according to an established schedule, and
maintain records in a secure environment.

• Require that requests for emergency maintenance be made in writing and
entered into a log.

• Assign priorities to special requests on the basis of urgency of need and
time and resources required.

• Institutionalize routine maintenance, such as procedures associated with
changing to a new reporting year.

• Document maintenance as it is conducted.

In order to maintain the integrity of a computer system, only one person should have
the authority to access the system and assign and change passwords. The DBM should be
the only staff member with authority to install or modify production software. This
same rule should apply to access to the physical computer files. Authority to add or
delete files from subdirectories or environments of computers should be delegated to
only one individual who is then held accountable for all modifications. A second
computer should be available for testing changes to the system so that the computer
used for the surveillance system can be reserved for production only. The second
computer could also serve as a back-up computer should the primary machine fail.

The numerous risks to the security of a data base include mechanical failure, human
carelessness, malicious damage, crime, and invasion of privacy. Therefore, back-up
copies of the data base should be kept off-site to ensure that the system cannot be
deliberately or unintentionally destroyed. Updating of the off-site copies should be
done on a routine basis, and new diskettes should be used to make back-up copies at
least once each year.

A monthly, total system back-up is recommended, if a valid copy of the current system
is available. Data files that are changed during the day should be backed up at the
end of the day.

Computer viruses have become a threat to data-base and computer-system security.
These programs can be highly sophisticated and are capable of attaching themselves to
software or data being loaded on the computer or data being sent from one computer to
another. Software is available to scan entire systems or diskettes for virus
infections; such software should be updated periodically because of the addition of
new viruses. Data received via telecommunications channels or on diskettes from other
sources should always be scanned before data files and programs are copied to the
computer's disk. Software retrieved from electronic bulletin boards should be
carefully examined before being incorporated into a system.

In the event of extended mechanical failure, a contingency plan should be in place for
shifting the base of operations to another computer.

Surveillance data on disease/injury are generally received by a local health
department, forwarded through a regional health center, and eventually directed to the
state health department. The complete reporting form, which includes confidential
information on patients, is usually shared by local and state health departments for
purposes of follow-up (if necessary) and for identifying and deleting any redundant
(duplicate) reports.

Persons who report disease/injury should be familiar with the types of activities that
may follow the receipt of a report. For example, for purposes of prevention or
treatment, all cases of syphilis may be investigated to determine the source of the
infection and potential spread of the infection to others. Disease-reporting laws
may specify who has access to the confidential portions of a disease/injury report,
and it is important to assure that the confidentiality of the report is maintained.
Failure to keep the reports confidential is likely to lead to an unwillingness to
report on the part of physicians and other health-care providers. Reports and files
that do not require personal identifiers should not contain them. In the United
States, notifiable-disease reports received from states by CDC do not include personal
identifiers (such as name, address, and telephone number).
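The principle that reports should not carry identifiers they do not need can be illustrated with a short sketch. The field names and the sample record below are hypothetical; the set of identifying fields simply follows the examples named in the text (name, address, telephone number), and a real system would define its own list.

```python
# Fields treated as personal identifiers in this sketch (an assumption based on
# the examples in the text); a real system would maintain its own list.
PERSONAL_IDENTIFIERS = {"name", "address", "telephone"}

def strip_identifiers(report, identifiers=PERSONAL_IDENTIFIERS):
    """Return a copy of `report` without the identifying fields."""
    return {key: value for key, value in report.items() if key not in identifiers}

# Hypothetical report as held at the local or state level.
report = {
    "name": "Jane Doe",
    "address": "12 Example St.",
    "telephone": "555-0100",
    "condition": "syphilis",
    "county": "Example County",
    "age": 29,
}

# Version suitable for forwarding when identifiers are not required.
forwarded = strip_identifiers(report)
print(forwarded)
```

The original record is left unchanged, so the confidential copy can stay with the agencies authorized to use it for follow-up.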

Modification of Reporting Systems

The basic steps shown below are intended to ensure that a computer-based surveillance
system will meet current and future needs. A systems analyst, an epidemiologist, and
the final users of information from the system should work together to produce a
system that is user-friendly and functional (5).

1. Review current methods of processing disease/injury information. Obtain copies
of paper forms or computer-screen forms or reports. Determine whether suggested
report forms or screens are available from state or national agencies. Often,
ready-to-use surveillance software is available. Use of such systems
facilitates standardization, quality control, and comparability of data.

2. Review with management and users any problems with the current method for
processing data and any desired future enhancements.

3. Document the current system and proposed future system. Allow concerned parties
to review and comment on their understanding of objectives for the system.

4. Limit access to the confidential portion of a disease/injury report as much as
possible. Store the original report forms containing confidential data in
locked cabinets or a locked room. Secure electronic data bases by limiting
access to the computer, and obtain additional security through the required use
of passwords (pre-approved for access to the protected portion of the data
base).

5. Document developmental specifications to meet the objectives above. In
addition, document proposed testing schedules and methodology for implementing
the system when it is completed.

6. Develop prototypic screens and reports for management and end users to review,
so that misunderstandings and problems can be identified and resolved during
development.

7. Once all parties are in agreement, establish self-contained modules of
development that can be completed, and proceed to the testing stage while other
modules are being developed.

8. Begin development in a test environment separate from any current computer-based
production system. Document any changes to developmental specifications that
become necessary during actual development.

9. Produce processing manuals for users (to include not only the operation of the
computerized system but also proper handling of paper forms, storage of
electronic and paper data, and distribution of final reports). This
documentation should be as thoroughly tested as the actual computer system.

10. Establish training sessions or develop tutorial manuals for users. If such
manuals are to be effective, a development/test system for users must be in
place during their training stage.

11. Finalize specification documents to include all current stages of the system, as
well as all expected future enhancements. This documentation should include a
schedule and methodology for maintaining and troubleshooting the system.

12. Establish and document proper back-up and data-recovery techniques. This step
includes selecting a data-base manager.

SUMMARY

A surveillance system of high quality and integrity can only be developed through
careful planning, documentation, implementation, training, and long-term support.
Because of the changing nature of disease/injury reporting (e.g., new conditions being
added or case definitions being modified), useful surveillance systems must be
flexible enough to allow for such changes with a minimal amount of disruption.

Also important is the coordination of disease- and injury-reporting activities among
local health departments, from local health departments to their appropriate state
health departments, and among state health departments. The Council of State and
Territorial Epidemiologists has played an important role in the state-to-state
coordination of disease and injury reporting, as well as in reporting practices from
states to CDC.

While there are many complicated aspects of disease/injury-surveillance systems, it is
important to remember that the overall purposes of such systems are to provide
information on preventing disease and injury and to improve the quality of the public
health.

REFERENCES

1. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. Mandatory reporting of
infectious diseases by clinicians. MMWR 1990;39(RR-9):1-17.

2. Mausner JS, Kramer S. Epidemiology: an introductory text. Philadelphia, Pa.:
W.B. Saunders Co., 1985.

3. Wharton M, Chorba TL, Vogt RL, Morse DL, Buehler JW. Case definitions for
public health surveillance. MMWR 1990;39(RR-13):1-43.

4. Murdick RG. MIS concepts and design. Englewood Cliffs, N.J.: Prentice-Hall,
1980.

5. Klaucke DN, Buehler JW, Thacker SB, Parrish RG, Trowbridge FL, Berkelman RL, et
al. Guidelines for evaluating surveillance systems. MMWR 1988;37(S-5):1-18.

Chapter V

Analyzing and Interpreting Surveillance Data

Willard Cates, Jr.
G. David Williamson

"Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost
in information?"

T.S. Eliot
["Where is the information we have lost in data?"

Editors]

INTRODUCTION

Historically, the core processes of public health surveillance have involved using
appropriate methods to aggregate the units of data being collected (namely, analysis)
and creative approaches to assess the emerging data patterns (namely,
interpretation) (1).

For these reasons, the ability to analyze and interpret surveillance data determines
the mettle of the epidemiologist. Viewed as basic to observational studies (2),
surveillance is at the forefront of the spectrum of descriptive epidemiology.
Surveillance has a myriad of uses (3,4), each of which requires careful analysis and
interpretation. Whether surveillance is used to detect epidemics, suggest hypotheses,
characterize trends in disease or injury, evaluate prevention programs, or project
future public health needs, data from a surveillance system must be analyzed carefully
and interpreted prudently. In this chapter we address practical and methodologic
approaches to surveillance analysis; the presentation of surveillance data by time,
place, and person; the concept of rates and standardization of rates; approaches to
exploratory data analysis; the use of graphics and maps; and, finally, the systematic
interpretation of surveillance data.

APPROACH TO SURVEILLANCE ANALYSIS

Practical Approach

The fundamental approach to analyzing surveillance data is relatively straightforward.
Because of their descriptive nature, surveillance data cannot be used for formal
hypothesis testing (5). Rather, the regular scrutiny of systematically collected
information allows epidemiologists to describe patterns of disease and injury in human
populations, organized by a variety of sub-measures. Moreover, the analysis (and
subsequent interpretation) proceeds from the specific elements of the data themselves.
Thus, surveillance analysis represents an inductive reasoning process in which the
assembly of individual units eventually produces a more general picture of health-
related problems in a population.

Frequently, the time-consuming problems of collecting, managing, and storing
surveillance data leave little energy for the analysis itself. Nonetheless, analyzing
surveillance data must be afforded a high priority by those in charge of surveillance
systems (3). Approaches to analyzing surveillance data include the following steps:

1. Know the inherent idiosyncrasies of the surveillance data set. It is
tempting to begin immediately to examine trends over time. However,
intimate knowledge of the day-to-day strengths and weaknesses of the
data-collection methods and the reporting process can provide a "real
world" sense of the trends that emerge.

2. Proceed from the simplest to the most complex. Examine each condition
separately, both by numbers and crude trends. How many cases were
reported each year? How many cases were reported in each age group each
year? What are the variable-specific rates? Only after looking
separately at each variable should one examine the relationships among
these variables.

3. Realize when inaccuracies in the data preclude more sophisticated
analyses. Erratically collected or incomplete data cannot be corrected
by complex analytic techniques. Differential reporting (see
representativeness, Chapter VIII) by different regions or by different
health facilities renders the resulting surveillance data set liable to
misinterpretation.

Methodologic Considerations

Analysis of surveillance information depends on the accuracy of that information
(Chapter VIII). Attempts to analyze data that are haphazardly collected or have
varying case definitions waste valuable time and resources. The two key concepts
that determine the accuracy of surveillance data are reliability and validity (5).
Reliability refers to whether a particular condition is reported consistently by
different observers, whereas validity refers to whether the condition as reported
reflects the true condition as it occurs. Ideally, both reliability and validity can
be achieved, but in practice, reliability (i.e., reproducibility) is easier than
validity to assess. When biologic measures complement clinical case definitions,
as with laboratory testing for infectious diseases, the accuracy of the data can be
more completely assured. However, in the context of
more subjective behavioral aspects, such as those associated with lifestyles, accuracy
is more difficult to confirm.

The application of standard statistical techniques to the analysis of surveillance
data is dictated by the limitations of the data themselves and the flexibility of the
epidemiologist/statistician (5). In a sense, because the essentials of sampling
theory have not been satisfied, no statistical testing is possible with the often
incomplete surveillance data set. However, if the information is viewed as samples
over time, apparent clusters of health events can be evaluated for their statistical
"significance." Applying 95% confidence limits or other standard statistical tests to
these "samples over time" can allow a determination of whether any differences are
unlikely to have occurred by chance alone.
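One simple way to read the "samples over time" idea is to treat the counts reported for the same period in earlier years as a sample and to ask whether the current count falls above an upper limit computed from them. The sketch below uses the historical mean plus 1.96 standard deviations as an approximate upper 95% limit. The normal approximation, the function names, and the counts are all assumptions made for illustration, not a method prescribed in the text.

```python
from statistics import mean, stdev

def upper_95_limit(historical_counts):
    """Approximate upper 95% limit: mean + 1.96 * sample standard deviation."""
    return mean(historical_counts) + 1.96 * stdev(historical_counts)

def exceeds_expected(current_count, historical_counts):
    """True if the current count lies above the approximate upper limit."""
    return current_count > upper_95_limit(historical_counts)

# Hypothetical counts for the same 4-week period in each of the past 5 years.
history = [12, 15, 11, 14, 13]

limit = upper_95_limit(history)
print(round(limit, 1))
print(exceeds_expected(22, history))  # an apparent cluster
print(exceeds_expected(15, history))  # within historical variation
```

A count above the limit only flags a difference that is unlikely to be due to chance alone; whether it reflects a true increase or a reporting artifact still has to be judged from knowledge of the surveillance system.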

Surveillance analyses are often ecologic, since they describe trends in groups of
individuals. Thus, the use of surveillance data may be especially prone to the
problem of the "ecological fallacy" (6,7). In brief, this type of bias may occur when
health officials interpreting observations about groups (e.g., aggregated surveillance
data) make causal inferences about individual phenomena (8). These population-level
analyses may suffer from two separate problems (7): a) aggregation bias, due to
loss of information when individuals are grouped, and b) specification bias, due to
the definition of the "group" itself (8). The chances of the ecological fallacy can
be reduced by analyzing subsets of surveillance data to reveal trends in the
individual characteristics. However, when describing bodies of surveillance data,
public health officials usually synthesize the population trends, thus opening the
possibility for fallacious interpretation.

Time, Place, and Person

Surveillance data allow public health officials to describe health problems in terms
of the basic epidemiologic parameters of time, place, and person. In addition,
surveillance data permit comparisons among these different parameters (e.g., what are
the patterns of disease/injury at one time compared with another, in one place
compared with another, or among one population compared with another). Use of
appropriate census data as denominators allows calculation of rates, which then
facilitates comparison of the risks of disease or injury in terms of the parameters of
time, place, and person. Moreover, use of these fundamental variables permits
epidemics to be detected, long-term trends to be monitored, seasonal patterns to be
assessed, and future occurrence of disease/injury to be projected, thus possibly
facilitating a more timely public health response.

Time

Analysis of surveillance data by time can reveal trends in disease/injury. For all
health conditions, a measurable delay occurs between the exposure and the problem. In
the case of disease, an interval exists between exposure and expression of symptoms,
as well as intervals between a) onset of symptoms and diagnosis of the problem and
b) diagnosis and eventual reporting of the illness to public health authorities so that
it can be included in the surveillance data set. For an infectious disease, this last interval
may represent days or weeks, whereas for chronic disease it may be measured in years.
Thus, choosing the appropriate interval for analysis must involve a consideration of
the health condition being assessed.

Analysis of surveillance data by time can be conducted in several different ways to
detect changes in incidence of disease/injury. The easiest analysis is usually a
comparison of the number of case reports received during a particular interval (e.g.,
weeks or months) (see Figure 1.1). Such data can be organized into a table or graph
to assess whether an abrupt increase has occurred, whether the trends are stable, or
whether a gradual rise or fall in the numbers occurs. Another simple method of
analysis compares the number of cases for a current time period (e.g., a given month)
with the number reported during the same interval for the past several years.
Similarly, the cumulative number of cases reported in the period representing the
year-to-date can be compared with the appropriate cumulative number for previous
years.
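The two comparisons just described (current period versus the same period in past years, and year-to-date totals) can be sketched as follows. The monthly counts and function names are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical monthly case counts (January first); 1991 has reports through June only.
monthly_counts = {
    1989: [4, 6, 5, 7, 9, 12, 15, 14, 10, 8, 6, 5],
    1990: [5, 5, 6, 8, 10, 13, 14, 15, 11, 7, 6, 4],
    1991: [6, 7, 9, 12, 16, 20],
}

def same_month_comparison(counts, year, month, prior_years):
    """Return the current month's count and the mean for that month in prior years."""
    current = counts[year][month - 1]
    baseline = sum(counts[y][month - 1] for y in prior_years) / len(prior_years)
    return current, baseline

def year_to_date(counts, year, through_month):
    """Cumulative number of cases from January through `through_month`."""
    return sum(counts[year][:through_month])

current, baseline = same_month_comparison(monthly_counts, 1991, 6, [1989, 1990])
print(current, baseline)

ytd = {year: year_to_date(monthly_counts, year, 6) for year in monthly_counts}
print(ytd)
```

In this made-up series, both the June count and the cumulative total for 1991 are well above the corresponding figures for the two earlier years, which is the kind of pattern a table or graph of these numbers would make visible.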

Analyzing long-term (secular) trends is facilitated by graphing surveillance data over
time. The watershed events that influence secular trends--such as changes in the case
definition used for surveillance, new diagnostic criteria, changes in reporting
requirements or practices, publicity about a particular condition, or new intervention
programs--can be indicated on the graph. Changes in the surveillance system itself
also influence long-term trends, particularly when the intensity of active case
detection increases (e.g., screening programs in particular communities).

Finally, additional epidemiologic measures enhance the analysis of surveillance data
by time. Using denominators to calculate rates becomes especially important if
changes occur in the community, such as the immigration of a new population. As the
size of a population changes over time, so will the expected number of cases of
diseases and injuries. In addition, analysis by date of onset rather than date of
report more clearly defines the condition. Because of delays between diagnosis and
reporting, using date of onset when practical and possible provides a better
representation of actual disease incidence. The longer the interval between the
occurrence of symptoms, the seeking of health care, and the reporting of events, the
greater the need for a surveillance system based on date of onset.

Place

Analysis by the place where the condition occurred is the next step (see Figure
1.2). The location from which the condition was reported (such as a hospital) may not
be the place where the exposure actually occurred (in the community). Similarly, for
medical procedures, the place where an operation was performed may not be the place of
residence of the patient. For example, the District of Columbia has the highest rate
of legal abortions in the United States, but more than 50% of this figure reflects
women who reside outside the District (9).

Locating the geographic area with the highest rates can facilitate efforts to identify
cause(s) and allow appropriate interventions to be applied. John Snow's removing the
Broad Street pump handle remains the classic example of intervention by location (10).
Even in situations in which the numbers of a particular problem are decreasing, focal
areas with high levels of the condition may remain, and the identification of these
areas allows prevention resources to be targeted effectively. Finally, the size of
the unit for geographic analysis is determined by the type of condition involved. For
some rare conditions, large areas such as states may be appropriate, whereas for
events that occur at relatively high frequency or for outbreak situations, areas
defined by postal codes or other geographic boundaries may be the most desirable size
of the measure.

The availability of computers, as well as software for spatial mapping, allows more
sophisticated analysis of surveillance data by place. Public health officials are now
able to use surveillance data to follow the geographic course of a particular
condition, thus assisting in their efforts to plan intervention strategies (see "Maps"
below).

Person

Analyzing surveillance data by the characteristics of persons who have the condition
provides further specification. The demographic variables most frequently used are
age, gender, and race/ethnicity. Other variables such as marital status, occupation,
and levels of income and education may also be helpful, even though most surveillance
systems do not routinely collect such information.

Analysis of trends in disease/injury by age depends on the specific health condition
of interest. For childhood diseases, relatively narrow age categories (e.g., by
single years) can identify the age group associated with the peak incidence of a
particular health condition. Conversely, for conditions that primarily affect older
populations, broader 10-year age intervals are frequently used. In general, the
typical age distribution associated with the health condition provides the best guide
to deciding which age categories to use, with several narrower categories for the ages
associated with peak incidence and broader categories covering the remainder of the
age spectrum.
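The idea of narrow categories near the peak and broader ones elsewhere can be sketched with a list of cut points. The cut points below are hypothetical, chosen to resemble a childhood disease with single-year groups up to age 4; the ages are made up.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical lower bounds of age groups: single years through age 4, broader after.
CUT_POINTS = [1, 2, 3, 4, 5, 10, 15, 20, 40, 65]

def age_group(age, cuts=CUT_POINTS):
    """Return a label such as '3', '5-9', or '65+' for an age in completed years."""
    i = bisect_right(cuts, age)
    lower = 0 if i == 0 else cuts[i - 1]
    if i == len(cuts):
        return f"{lower}+"
    upper = cuts[i] - 1
    return str(lower) if lower == upper else f"{lower}-{upper}"

# Hypothetical ages of reported cases.
ages = [0, 1, 1, 2, 3, 3, 3, 4, 7, 12, 33, 70]
cases_by_group = Counter(age_group(a) for a in ages)
print(cases_by_group)
```

For a condition that mainly affects older populations, the same function could be given evenly spaced 10-year cut points instead.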

Surveillance systems have also been used to analyze behavioral characteristics of
populations. Such systems generally depend on self-reported behavior and may be based
on repeated surveys of representative groups, trends in markers for specific types of
behavior (e.g., sales of a particular product), or active surveillance of a particular
behavioral characteristic or indicator in a defined group (e.g., testing urine for
drugs in school or work settings).

If possible, the characteristics of persons included in any surveillance system should
be related to denominators. While assessing the number of cases alone can be
sufficient, variable-specific rates are more helpful in allowing comparisons of the
risk involved. Thus, even if the number of cases of a particular condition is higher
in one part of a population, the rate may be lower if that group represents a large
proportion of the population. In this way, comparing the rates within surveillance
data of certain populations is analogous to calculating relative risks within
observational cohort studies.
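A small numeric sketch shows how a group with more cases can still have the lower rate. The two groups, their case counts, and their population sizes are hypothetical.

```python
def rate_per_100k(cases, population):
    """Cases per 100,000 population for the period of interest."""
    return cases * 100_000 / population

# Hypothetical groups: A has twice as many cases as B but four times the population.
groups = {
    "A": {"cases": 80, "population": 400_000},
    "B": {"cases": 40, "population": 100_000},
}

rates = {name: rate_per_100k(g["cases"], g["population"]) for name, g in groups.items()}
rate_ratio = rates["B"] / rates["A"]  # analogous to a relative risk

print(rates)
print(rate_ratio)
```

Group A accounts for most of the cases, yet the rate in group B is twice as high, which is the comparison that matters when assessing risk.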

Interactions among Time, Place, and Person

By proceeding from the simple (e.g., crude rates) to the more complex (e.g., variable-
specific rates), meaningful trends may be revealed. This is because interactions
among the time-place-person parameters of surveillance data can obscure important
patterns of disease/injury in specific populations. For example, in the United States
in the 1980s, the overall number of syphilis cases fell during the first two-thirds of
the decade but rose beginning in 1987 (Figure V.1, Panel A). When analyzed by gender
(Figure V.1, Panel B), the decline in syphilis occurred primarily among men; cases
among women were low for the first 5 years, increased slightly in 1986, and rose more
rapidly for the rest of the decade. Finally, when stratified by both gender and race
(Figure V.1, Panel C), the decrease in numbers of cases of syphilis was seen only
among white males--presumably among men who have sex with other men and who had
changed their sexual practices in response to human immunodeficiency virus (HIV)
prevention activities (12). Conversely, the increase in syphilis occurred among black
men and women, with both trends beginning in 1986 and being linked to unsafe sexual
behavior associated with use of crack cocaine (13). If more specific analysis by
person had not occurred, the offsetting trends in the mid-1980s of declines among
white males might have delayed recognition by public health officials of the syphilis
epidemic among minorities.

RATES AND RATE STANDARDIZATION

Overview

A rate measures the frequency of an event. It comprises a numerator (i.e., the upper
portion of a fraction denoting the number of occurrences of an event during a
specified time) and a denominator (i.e., the lower portion of a fraction denoting the
size of the population in which the events occur). A crucial aspect of a rate is the
specification of the time period under consideration. An optional component is a
multiplier, a power of 10 that is used to convert awkward fractions to more workable
numbers (14). The general form of a rate is shown below:

\[
\text{rate} = \frac{\text{number of occurrences of event in specified time}}{\text{average or mid-interval population}} \times 10^{n},
\]

where the denominator represents the size of the population during the specified
period in which the events occur and the exponent n usually ranges from 2 to 6 (i.e.,
the rate is expressed per 100 to per 1,000,000 population). The selection of n depends on
the incidence or prevalence of the event.
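The general form above translates directly into code. The sketch below also includes one possible convention for selecting n, namely the smallest value from 2 to 6 that gives a rate of at least 1; that convention, the function names, and the numbers are assumptions for illustration, since the text says only that n depends on how common the event is.

```python
def rate(events, population, n):
    """Events per 10**n population during the specified period."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * 10**n / population

def choose_n(events, population, n_min=2, n_max=6):
    """Smallest n in [n_min, n_max] giving a rate of at least 1 (assumed convention)."""
    for n in range(n_min, n_max + 1):
        if rate(events, population, n) >= 1:
            return n
    return n_max

# Hypothetical rare event: 150 cases in a mid-interval population of 2,500,000.
n_rare = choose_n(150, 2_500_000)
print(n_rare, rate(150, 2_500_000, n_rare))

# Hypothetical common event in the same population: 30,000 cases.
n_common = choose_n(30_000, 2_500_000)
print(n_common, rate(30_000, 2_500_000, n_common))
```

Whatever multiplier is chosen, it should be stated with the rate and kept the same across the groups or periods being compared.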

Although surveillance often provides numerator data only, the use of raw numbers such
as cases of a disease or injury has limitations. Raw numbers quantify occurrences of
an event during a specified time without regard to population size and dynamics, or
other demographic characteristics such as distribution by race and gender. Rates
enable one to make more appropriate, informative comparisons of occurrences in a
population over time, among different sub-populations, or among different populations
at the same or different times, since the size of the population and the period of
time specified are accounted for in the calculation of rates.

A wide variety of "rates" are employed in standard public health practice (Table V.l).
These measures are calculated in numerous ways and may have different connotations.
Special distinction should be made among the terms "rate," "ratio," and "proportion."
A ratio is any quotient obtained by dividing one quantity by another. The numerator
and denominator are generally distinct quantities, neither of which is a subset of the
other. No restrictions exist on the value or dimension of a ratio. A proportion is a
special type of ratio for which the numerator is a subset of the denominator
population, thus requiring the resulting quotient to be dimensionless, positive, and
less than one, or less than 100 if expressed as a percentage. Although all rates are
ratios, in epidemiology a rate may be a proportion (e.g., prevalence rate) or may be
limited in scope by further restrictions such as representing the number of
occurrences of a health event in a specified time and population per unit time (e.g.,
hazard or incidence rate). This latter definition is most restrictive and is the
definition generally used for rates in chemistry and physics.

Use of Rates in Epidemiology

Calculation and analysis of rates are critical in epidemiologic investigations, not
only for formulating and testing hypotheses about cause(s), but also for identifying
risk factors for disease and injury. Rates also allow valid comparisons within or
among populations for specific times. To determine rates, one must have reliable
numerator and denominator data, the latter being generally more difficult to obtain in
most epidemiologic investigations, particularly if the data to be analyzed (i.e., the
number of occurrences of an event) have been collected from public health surveillance
systems.

Crude, Specific, and Standardized Rates

Crude and specific rates

Rates can be calculated either for the entire population or for certain subpopulations
within the larger group. Rates describing a complete population are termed "crude."
The computation of crude rates is performed as the initial step in analysis since they
are important in obtaining information about and contrasting entire populations.

Within a population, the rate at which a particular health event occurs may not be
constant throughout the entire population. To examine the differences, the population
is partitioned into relevant "specific" subpopulations, and a "specific" rate is
calculated for each subset. For example, if one calculates death rates by age group
(because death rate is not constant for all age categories), the resulting rates are
termed "age-specific death rates."

Variation of rates among population subgroups results from several factors: natural
history of the health problem, differential distribution of susceptibility or
cause(s), or genetic differences among subpopulations. For example, mortality rates
are higher among men than women and among blacks than whites (15). The distribution of
subgroups within the population may also be so disparate that a summary rate may not
convey useful information. Therefore, the magnitude of a crude rate depends on the
magnitude of the rates of the subpopulations as well as on the demographics of the
entire population (16). These variations in rates across a population would remain
unknown if only crude rates were calculated.
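The dependence of a crude rate on both the subgroup rates and the makeup of the population can be checked numerically: the crude rate equals the average of the specific rates weighted by each subgroup's share of the population. The two age strata and their counts below are hypothetical.

```python
# Hypothetical age-specific deaths and populations for one year.
strata = {
    "under 65": {"deaths": 50, "population": 50_000},
    "65 and over": {"deaths": 400, "population": 10_000},
}

def per_100k(events, population):
    """Events per 100,000 population."""
    return events * 100_000 / population

# Age-specific death rates.
specific_rates = {name: per_100k(s["deaths"], s["population"]) for name, s in strata.items()}

# Crude rate for the whole population.
total_population = sum(s["population"] for s in strata.values())
total_deaths = sum(s["deaths"] for s in strata.values())
crude_rate = per_100k(total_deaths, total_population)

# The same figure obtained as a population-weighted average of the specific rates.
weighted_average = sum(
    specific_rates[name] * s["population"] / total_population for name, s in strata.items()
)

print(specific_rates)
print(crude_rate, weighted_average)
```

Here the crude rate of 750 per 100,000 describes neither stratum well: the specific rates are 100 and 4,000 per 100,000, a 40-fold difference that the summary figure conceals.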

Standardized rates

When rates are compared across different populations or for the same population over
time, crude rates are appropriate only if the populations are similar with respect to
factors that are associated with the health event being investigated (17). Such
factors could include age, race, gender, socioeconomic status, or risk factors (e.g.,
number of cigarettes smoked). If the populations are dissimilar, variable-specific
rates should be computed and compared. Alternatively, the rates can be adjusted for
the effect of a confounding variable in order to obtain an undistorted view of the
effect that other variables have on risk. This adjustment of rates when comparing
populations is called standardization and yields "standardized" or "adjusted" rates.
The two techniques of standardization are direct and indirect.

Direct standardization

A directly standardized rate is obtained for a study population by averaging the
specific rates for the population, using the distribution of a selected standard
population as the averaging weights. This adjusted rate represents "what the crude
rate would have been in the study population if that population had the same
distribution as the standard population with respect to the variable(s) for which the
adjustment or standardization was carried out" (14). The rate is termed "directly
standardized" because specific rates are used directly in the calculation. If data
for the same standard population are used to calculate directly standardized rates for
two or more study populations, those standardized rates can be appropriately
compared. Any difference among the standardized rates cannot be attributed to
differential population distributions of the standardized variable because the
calculations have been adjusted for that variable (18). The following data must be
available in order to use direct adjustment:

• Specific rates for the study population and

• Distribution for the selected standard population across the same strata
as those used in determining the specific rates.
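
As a minimal sketch of direct standardization, the Python below averages the specific rates of each study population using a standard population's distribution as the weights. The function name and all numbers are hypothetical and serve only to illustrate the calculation described above.

```python
def direct_standardized_rate(study_specific_rates, standard_population):
    """Weighted average of the study population's specific rates.

    The weights are the standard population's share in each stratum, so
    both dictionaries must use the same strata as keys.
    """
    total = sum(standard_population.values())
    return sum(
        study_specific_rates[stratum] * standard_population[stratum] / total
        for stratum in study_specific_rates
    )


# Hypothetical age-specific death rates per 1,000 for two study populations.
rates_a = {"0-24": 1.0, "25-64": 5.0, "65+": 60.0}
rates_b = {"0-24": 1.2, "25-64": 5.5, "65+": 62.0}

# Hypothetical standard population (only its proportions matter).
standard = {"0-24": 400, "25-64": 500, "65+": 100}

adjusted_a = direct_standardized_rate(rates_a, standard)
adjusted_b = direct_standardized_rate(rates_b, standard)
```

Because the same standard population supplies the weights for both calculations, any difference between `adjusted_a` and `adjusted_b` reflects the specific rates rather than the age structure of either study population.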

Indirect Standardization

An indirectly standardized rate is calculated for a study population by averaging the
specific rates for a select standard population, using the distribution of the study
population as weights. One should use indirect adjustment when any of the specific
rates in the study population are unavailable or when such small numbers exist in the
categories of strata that the data are unreliable (i.e., the resulting rates are
unstable). This commonly occurs in occupational mortality or in small geographic
areas. For these reasons, indirect standardization is used more often than direct
standardization. Indirectly standardized rates for two or more populations of
interest can be appropriately compared if the same standard population is used in the
computations. The following data are required to make an indirect adjustment to a
rate:

• Specific rates for the selected standard population,

• Distribution for the study population across the same strata as those
used in calculating the specific rates,

• Crude rate for the study population, and

• Crude rate for the standard population.

A special application of the indirect standardized rate, when the health event of
interest is death, is the standardized mortality ratio (SMR). It is the number of
deaths occurring in a study population or subpopulation, expressed as a percentage of
the number of deaths expected to occur if the given population and the selected
standard population had the same specific rates (19). Explicitly, the SMR is an
indirect, age-adjusted ratio calculated as the indirect standardized mortality rate
for the study population, divided by the crude mortality rate for the standard
population. Additional information is available on the use of the SMR, as well as on
computation of variance and confidence intervals for direct and indirect
measures (18).
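
The indirect method and the SMR can be sketched as follows. This is an illustrative outline under stated assumptions, not a definitive implementation: the function name and every number are hypothetical, and rates are expressed per person rather than per 1,000.

```python
def indirect_standardization(observed_events, study_population,
                             standard_specific_rates, standard_crude_rate):
    """Return (SMR, indirectly standardized rate) for a study population.

    expected = events the study population would have if it experienced
               the standard population's specific rates
    SMR      = observed / expected
    adjusted = SMR * crude rate of the standard population
    """
    expected = sum(
        standard_specific_rates[stratum] * study_population[stratum]
        for stratum in study_population
    )
    smr = observed_events / expected
    return smr, smr * standard_crude_rate


# Hypothetical inputs (rates are deaths per person per year).
study_population = {"0-24": 2_000, "25-64": 5_000, "65+": 3_000}
standard_specific_rates = {"0-24": 0.001, "25-64": 0.005, "65+": 0.050}
standard_crude_rate = 0.009
observed_deaths = 200

smr, adjusted_rate = indirect_standardization(
    observed_deaths, study_population,
    standard_specific_rates, standard_crude_rate,
)
smr_percent = 100 * smr  # the SMR expressed as a percentage
```

Note that only the total number of observed events is needed for the study population, which is why the indirect method remains usable when stratum-specific counts are sparse or unavailable.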

Choice of Standard Population

If crude rates are to be adjusted, an appropriate standard population needs to be
chosen. In extreme cases, the choice of different standard populations can lead
to different results. For example, use of one standard population may yield an
adjusted rate higher for population A than for population B, while choice of another
standard population may yield a higher rate for population B (18).

Two factors should be considered when choosing a standard population:

• Select a population that is representative of the study populations being
compared and

• Understand how choice of a standard population affects directly
standardized rates (e.g., if the age-specific rates for population A are
greater than for population B at young ages and the opposite is true at
older ages, a standard population with distribution skewed to younger
ages will yield a higher directly standardized rate for population A than
for population B).

Generally the choice of standard population makes little difference in comparing
adjusted rates. Although magnitudes of the adjusted rates depend upon choice of
standard population, no meaning is attached to those magnitudes; only relative
differences in the adjusted rates can be assessed.
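
The extreme case described in the second factor above, in which the ranking of two populations depends on the standard chosen, can be reproduced with a small hypothetical example. All rates and weights below are invented so that the age-specific rates cross.

```python
def direct_rate(specific_rates, weights):
    """Directly standardized rate: specific rates averaged with the given weights."""
    total = sum(weights.values())
    return sum(specific_rates[k] * weights[k] / total for k in specific_rates)


# Hypothetical rates per 1,000: A is higher at young ages, B at older ages.
rates_a = {"young": 2.0, "old": 50.0}
rates_b = {"young": 1.0, "old": 60.0}

# Two hypothetical standard populations with different age structures.
young_skewed = {"young": 95, "old": 5}
evenly_split = {"young": 50, "old": 50}

a_young, b_young = direct_rate(rates_a, young_skewed), direct_rate(rates_b, young_skewed)
a_even, b_even = direct_rate(rates_a, evenly_split), direct_rate(rates_b, evenly_split)

# With the young-skewed standard A ranks higher; with the even split B does.
ranking_flips = (a_young > b_young) and (a_even < b_even)
```

The reversal arises only because the specific rates cross; when one population has higher rates in every stratum, no choice of standard population can change the ranking.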

Various choices are available for a standard population. Customary selections include
the combined or pooled population of the overall population to be studied, the
population of one of the study groups, a large population (such as the 1940 or 1980
United States population), or a hypothetical population. Calculating standardized
rates using different standard populations allows comparisons of different
distributions (20).

To Standardize or Not To Standardize. The decision to standardize is not always
straightforward. Several factors, most of which are data-driven, must be considered
in the decision process. Reasons to present standardized rates include the following
(17):

• Standardization adjusts for confounding variables to yield a more
realistic view of the effect of other variables on risk,

• A summary measure for a population is easier to compare with similar
summary measures than are sets of specific rates,

• A standardized rate has a smaller standard error than any of the specific
rates (this is important when comparing sub-populations or geographic
areas),

• Specific rates may be imprecise or unstable because of sparse data in the
strata, and

• Specific rates may be unavailable for certain groups of interest (e.g.,
small populations or those designated by specific geographic areas).

The major disadvantage of standardization is evident when the specific rates vary
differently across strata, such as when they move in different directions or at
different magnitudes in individual age groups. In this case the trend in the
standardized rate is a weighted average of the trends in the specific rates, where the
weights depend on the standard population selected. When this occurs, the
standardized rate tends to mask the differences, and no single summary measure will
reveal these differences.

Another unfavorable characteristic of standardized rates is that their magnitude is
arbitrary and depends entirely on the standard population. Although generally not the
case, relative rankings of summary measures from different study populations may
change if a different standard population is selected.

Regardless of the decision made regarding standardization, it is crucial to evaluate
the specific rates to characterize accurately and to understand more fully the
variation among study populations. Standardized rates should never be used as a
substitute for specific rates, nor should they be the basis of inferences when
specific rates can be computed. A compromise to the use of a summary measure versus a
set of specific measures is to use the specific rates but to eliminate or combine
categories to minimize the number of rates required for comparison. Additional
discussion is available on advantages and disadvantages of standardization and on
analyzing crude and specific rates (21).

Rate standardization: practical example

To demonstrate how crude, specific, and standardized rates are obtained, we compare
death rates in two Florida counties. This example shows how standardized rates can be
misleading if they are not properly scrutinized.

We will use population and death totals for Pinellas and Dade Counties in Florida for
1980 (Table V.2). The crude death rate for Pinellas County is about 60% higher than
that for Dade County. When the age distributions of each county are used, the
resulting age-specific death rates are generally slightly higher in Dade County (Table
V.3), even though the crude death rate is substantially higher for Pinellas County.
This seeming anomaly in the data results from the different age distributions of each
county. Specifically, the population in Pinellas is older.

Directly standardizing the Pinellas and Dade County rates to the United States 1980
population corrects for the differences in population (Table V.4). Once differences
in age-related distributions in the two counties have been taken into account, the
adjusted death rate for Pinellas County is lower than that for Dade County (7.7 and
7.9, respectively).

The indirect method of adjustment increases the relative difference between death
rates for the two counties (Table V.5). The adjusting factor is computed as the 1980
death rate for the total U.S. population divided by the expected death rate. Then,
adjusted death rate is calculated as the adjusting factor multiplied by the crude
death rate. In this example, indirect adjustment reinforces and accentuates the
results of direct adjustment by yielding rates of 7.5 and 7.8 deaths per 1,000
population for Pinellas and Dade Counties, respectively.

This example illustrates the importance of being thoroughly familiar with the data.
Comparison of crude death rates alone can be misleading. However, calculating age-
specific and adjusted rates permits an accurate understanding of death rates in these
counties and shows that the high crude rate in Pinellas County reflects its older
population. The example also illustrates how the magnitude of adjusted rates depends
on the choice of standard population.

Analysis of Rates

When numerator and denominator data are available, analysis of rates should always
begin with calculation of crude rates and proceed to subsequent computation of
relevant specific rates. If appropriate, a standard population can be chosen to
determine standardized rates. Tables and especially maps are important means of
presenting rates at different times and/or locations (see "Tables," "Graphs," and
"Maps" below).

Several statistical procedures are available to analyze data. Inference on a single
proportion is performed using a z test, and assessing the difference between two
proportions can be accomplished with a z or χ² test (17).* Use of Poisson parameters
is helpful in comparing two rates (22). A series of χ² tests can be used to compare
proportions from several independent samples (16), and Poisson regression is
frequently used for comparing several rates (23). Other modeling procedures that can
be used to analyze rates include smoothing, Box-Jenkins, and Kalman filter approaches,
all of which are time-series methods discussed in Chapter VI. Space-time cluster
techniques and small-area estimation methods are also discussed in Chapter VI.

*Note that Fleiss does not distinguish between rates and proportions or the analysis
of them.
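
As one illustration of the procedures named above, the z test for the difference between two proportions can be sketched with the Python standard library alone. The pooled-variance form used here is one common textbook version and is an assumption on our part; the counts are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test for the difference between two independent proportions.

    x1, x2 are event counts; n1, n2 are sample sizes. The standard error
    uses the pooled proportion, as is usual under the null hypothesis of
    equal proportions.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical data: 60 cases among 1,000 persons versus 30 among 1,000.
z, p_value = two_proportion_z_test(60, 1_000, 30, 1_000)
```

For a two-by-two comparison such as this one, the square of `z` equals the uncorrected χ² statistic, which is why either test can be used for the same question.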

EXPLORATORY DATA ANALYSIS
Overview

Exploratory data analysis (EDA) is enumerative, numeric, or graphic detective
work (24). It is the application of a set of techniques to a body of data to make the
data more understandable. EDA is a philosophy that minimizes assumptions, allows the
data to motivate the analysis, and combines ease of description with quantitative
knowledge. EDA leads the analyst to uncover characteristics often hidden within the
data.

Practice of EDA involves four fundamental steps (24-25):

1. Using visual displays to convey the structure of the data and analyses,

2. Transforming the data mathematically to simplify their distribution and
to clarify their analysis,

3. Investigating the influence that unusual observations (outliers) have on
the results of analysis, and

4. Examining the residuals (the difference between the observed data and a
fitted model) to provide additional insight into the data.

EDA is the initial step in any analysis. It allows the investigator to become
familiar with the data and forms the foundation for further analysis. Although most
public health surveillance systems are established for specific topics, proper EDA of
the data can provide insight into demographic, temporal, and spatial patterns
otherwise overlooked in the collection of numbers. EDA may additionally contribute to
more timely detection of unusual observations, which may, in turn, facilitate a
quicker public health response to factors that cause increased morbidity and/or
mortality.

Data Displays

A first step in any analysis of data is a visual examination of the data. A few of
the techniques that should be used initially are described below for application to a
single set of numbers, for exploration of relationships between two factors, and for
comparisons among several populations.

Dot plots

A dot plot is a one-dimensional plot (Figure V.2) of the individual values of a set of
numbers. The x-axis represents one or more categories of a non-continuous variable,
and the y-axis represents the range of values displayed by the observations.
Observations with identical values are plotted side by side on the same horizontal
plane.

Stem-and-Leaf Displays

A stem-and-leaf display is a graphic (Figure V.3) that allows the digits of the
observation values to sort the numbers into numerical order for display. This is a
variation of the conventional histogram. The basic principle used in constructing a
stem-and-leaf display is the splitting of each data value between a suitable pair of
adjacent digits to form a set of leading digits and a set of trailing digits. The set
of leading digits forms the stems, and the set of the first trailing digit from the
data forms the leaves. Remaining trailing digits are ignored for the purpose of the
graphic. Variations of the stem-and-leaf display are possible (24).

Many investigators begin an evaluation of data with a histogram (see below), but the
stem-and-leaf display has several advantages over the histogram. Because every
observation is plotted in the stem-and-leaf display, it contains more detail than the
histogram and allows computation of percentage points. Moreover, transformations can
be applied directly to stem-and-leaf data.
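
The construction of a basic stem-and-leaf display can be sketched as follows. The function and its `leaf_unit` parameter are hypothetical names introduced for illustration, the data are invented, and the sketch assumes non-negative values.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Return the rows of a basic stem-and-leaf display as strings.

    Each value is split between two adjacent digits: the leaf is the digit
    in the `leaf_unit` position, the stem is everything to its left, and
    any digits to the right of the leaf are dropped.
    """
    stems = defaultdict(list)
    for v in sorted(values):
        scaled = int(v // leaf_unit)          # drop digits beyond the leaf
        stems[scaled // 10].append(scaled % 10)

    rows = []
    # Include empty stems so that gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(d) for d in stems.get(stem, []))
        rows.append(f"{stem:>3} | {leaves}")
    return rows


ages = [12, 15, 21, 27, 27, 43]  # hypothetical observations
display = stem_and_leaf(ages)
print("\n".join(display))
```

Because every observation appears as a leaf, the sorted data can be read back from the display, which is what allows medians and other percentage points to be counted off directly.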

Scatter plots

The scatter plot or scatter diagram is a plot (Figure V.4) that reveals the
relationship between two variables. Each observation comprises a pair of values, one
for each variable. The observation is plotted by measuring the value of one variable
on the horizontal axis and the value of the other on the vertical axis.

Data summaries

One can summarize a data set by calculating a few numbers which are relatively easy to
interpret. For example, measures of central tendency and variability are frequently
used to describe data. In particular, two types of summary displays have proven
useful in characterizing data, i.e., the five-number summary and the box plot.

Five-number summaries

The five-number summary of a data set is a simple display (Table V.6) involving the
median, hinges, and extreme values. The median is a measure of the central tendency of
the data that splits an ordered data set in half. The hinges are a measure of the
variability of the data and are the values in the middle of each half. Therefore, the
hinges are the data values that are approximately 1/4 and 3/4 from the beginning of
the ordered data set. They are determined by formulas (25) and are similar to
quartiles that are defined so that 1/4 of the observations lie below the lower
quartile and 1/4 lie above the upper quartile. The extremes also reflect the
variability of the data and are the smallest and largest values in the data.
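
A five-number summary can be computed with a short function. The hinge rule used here (the median of each half of the ordered data, with the overall median counted in both halves when the number of observations is odd) is one common convention and is an assumption on our part; the formulas in the cited reference may differ in detail.

```python
from statistics import median


def five_number_summary(values):
    """Return (lower extreme, lower hinge, median, upper hinge, upper extreme)."""
    xs = sorted(values)
    n = len(xs)
    half = (n + 1) // 2  # size of each half; the median is shared when n is odd
    lower_half, upper_half = xs[:half], xs[n - half:]
    return xs[0], median(lower_half), median(xs), median(upper_half), xs[-1]


# Hypothetical data sets with an odd and an even number of observations.
odd_summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
even_summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8])
```

The same five values are what a basic box plot draws: the hinges form the ends of the box, the median is the line through it, and the extremes mark the ends of the lines extending from the box.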

Box plots

The box plot is a graphic representation (Figure V.5) of the five-number summary with
the two ends of the box representing the hinges and the line through the box
representing the median. A line runs from each end of the box (i.e., from each hinge)
to the corresponding lower and upper extreme values. This plot allows the reader to
see quickly the median level, the variability, and the symmetry of the data.
Variations of the box plot, including identification of outlier values, are possible
(25).

Transformations

Transformation or re-expression of data is a powerful tool that facilitates
understanding their implications. If numbers are collected in a manner that renders
them hard to grasp, the data analyst should use a transformation method, while
preserving as much of the original information as can be used. When used
appropriately, transformed data can be readily analyzed and interpreted.

Raw data are transformed for a number of reasons: to achieve symmetry, to produce a
straight-line relationship, to allow use of an additive model, to reduce variability,
and to attain normally distributed data. Symmetry is highly
desirable when analyzing a single data set, since it ensures that a "typical" value
(such as the mean or median) more nearly summarizes the data. When analyzing pairs of
data, a straight-line relationship is important because linear associations are
simple, both in form and in interpretation. One or both variables can be transformed
to achieve linearity. Additive models have the desirable feature that data in
multi-way tables can typically be decomposed into additive effects and analyzed accordingly.
Reduced variability of the data is crucial when comparing several data sets. If the
data spread varies with the data set, then "typical" values are obtained more
accurately in the data with smaller spread. Finally, normally distributed data are
needed so that normal theory statistics can be applied to test hypotheses and draw
inferences.

Not all data sets benefit from transformation. The ratio of the largest to smallest value in
the original data set is a simple indicator of whether a group of numbers will be
affected substantially by transforming. If the ratio is near 1, a transformation will
not severely alter the appearance of the data. Since transformations affect larger
values and smaller values differently, the further the ratio is from 1, the greater
the need is for transformation to display and understand the data most simply.

Transformations are generally accomplished by raising each value of the data set to
some power p. Different values of p yield different effects on a data set, but those
effects are ordered if the values of p are ordered. Some transformations are
especially effective in certain instances (Table V.7). For example, the square root
transformation is particularly capable of reducing variability in count data.
Guidelines are available to assist in selecting appropriate transformations (24,25).
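
The largest-to-smallest ratio and a power transformation can be sketched as below. The function names are hypothetical, the counts are invented, and treating p = 0 as the logarithm is a convention borrowed from the usual "ladder of powers" rather than something stated in this chapter. The sketch assumes strictly positive data.

```python
import math


def max_min_ratio(values):
    """Ratio of the largest to the smallest value (values must be positive)."""
    return max(values) / min(values)


def power_transform(values, p):
    """Raise each value to the power p; p == 0 is taken to mean log10 (assumption)."""
    if p == 0:
        return [math.log10(x) for x in values]
    return [x ** p for x in values]


counts = [1, 4, 9, 100]               # hypothetical count data
roots = power_transform(counts, 0.5)  # square root transformation

ratio_before = max_min_ratio(counts)
ratio_after = max_min_ratio(roots)
```

In this example the square root pulls the largest value in far more than the smaller ones, so the spread of the transformed data is much narrower than that of the raw counts.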

Smoothing

Smoothing refers to EDA techniques that summarize consecutive, overlapping segments of
a series of data to produce a smoother curve. Its goal is to represent patterns in
the data more clearly without becoming encumbered with any detailed peaks and valleys.
Variations in the data set caused by irregular components are smoothed so that the
overall trend can be determined more readily. Thus, smoothing allows investigators to
search for patterns in data that may otherwise be masked.

Smoothing is used on data series to explore the relationship between two variables.
The values along the x-axis should be equally spaced. The y values are called a time
series if they are collected over successive time intervals, although these values
need not be defined by time (e.g., in a data sequence of birth rates by mother's age).
As long as the x-axis defines an order and the order is not too irregular, the y
sequence can be called a time series, and smoothing techniques can be applied. In
time-series analysis, models are frequently developed on the smoothed data because
these data are generally easier to model.

Numerous smoothing approaches exist, each having its own assets and liabilities. The
simplest example of smoothing is a moving average of three intervals in which
observation y_i in the data sequence is replaced with the mean of y_{i-1}, y_i, and y_{i+1}.
Discussions of smoothing functions, including suggestions on how to overcome the
problem of obtaining end points for the smoothed series, appear elsewhere (25-26).
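
The three-interval moving average described above can be sketched in a few lines. Leaving the two end points unchanged is only one simple choice, made here for brevity; the cited references discuss better ways of handling them. The weekly counts are hypothetical.

```python
def moving_average_3(y):
    """Replace each interior value with the mean of itself and its two neighbors.

    The first and last values are copied unchanged (a simplifying assumption).
    """
    smoothed = list(y)
    for i in range(1, len(y) - 1):
        smoothed[i] = (y[i - 1] + y[i] + y[i + 1]) / 3
    return smoothed


weekly_cases = [3, 6, 3, 9, 6]  # hypothetical, equally spaced observations
smoothed_cases = moving_average_3(weekly_cases)
```

The smoothed series rises more steadily than the raw counts, which is the sense in which smoothing removes irregular peaks and valleys so that the overall trend is easier to see.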

DATA GRAPHICS

Overview

Visual tools play a critical role in public health surveillance. Data graphics
visually display measured quantities using points, lines, a coordinate system,
numbers, symbols, words, shading, and color (27). Graphics allow researchers to mesh
presentation and analysis. Data graphics are essential to organizing, summarizing,
and displaying information clearly and effectively. The design and quality of such
graphics largely determine how effectively scientists can present their information.

Many visual tools are available to assist in analysis and presentation of results.
The data to be presented and the purpose for the presentation are the key factors in
deciding which visual tools should be used (Table V.8). Further discussion and
guidance in producing effective, high-quality data graphics are available from several
sources (27-32).

Tables

A table arranges data in rows and columns and is used to demonstrate data patterns and
relationships among variables and to serve as a source of information for other types
of data graphics (28). Table entries can be counts, means, rates, or other analytic
measures.

A table should be simple; two or three small tables are simpler to understand than one
large one. A table should be self-explanatory so that if taken out of context readers
can still understand the data. The guidelines below should be used to increase
effectiveness of a table and ensure that it is self-explanatory (29).

• Describe what, when, and where in a clear, concise table title.

• Label each row and column clearly and concisely.

• Provide units of measure for the data.

• Provide row and column totals.

• Define abbreviations and symbols.

• Note data exclusions.

• If the data are not original, reference the source.

One-variable tables

One of the most basic tables is a frequency distribution by category for a single
variable. For example, the first column of the table contains the categories of the
factor of interest, and the second column lists the number of persons or events that
appear in each category and gives the total count. Often a third column contains
percentages of total events in each category (Table V.9).

Multi-variable tables

Most phenomena monitored by public health surveillance systems are complex and require
analysis of the interrelationships of several factors. When data are available on
more than one variable, multi-variable cross-classified tables can elucidate
associations. These tables are also called contingency tables when all the primary
table entries (e.g., frequencies, persons, or events) are classified by each of the
variables in the table (Table V.10).

The most frequently used type of table in epidemiologic analysis is the two-by-two
contingency table, which is appropriate when two variables, each having two
categories, are studied. This special case is particularly suited for analyzing case-
control and cohort studies for which the categories of the variables are case and
control (or ill and well) and exposed and unexposed.

Graphs

A graph is a visual display of quantitative information involving a system of
coordinates. Two-dimensional graphs are generally depicted along an x-axis
(horizontal orientation) and y-axis (vertical orientation) coordinate system. Graphs
are primary analytic tools used to assist the reader to visualize patterns, trends,
aberrations, similarities, and differences in data.

Simplicity is key to designing graphs. Simple, uncluttered graphs are more likely
than complicated presentations to convey information effectively. Several specific
principles should be observed when constructing graphs (29).

• Ensure that a graph is self-explanatory by clear, concise labeling of
title, source, axes, scales, and legends.

• Clearly differentiate variables by legends or keys.

• Minimize the number of coordinate lines.

• Portray frequency on the vertical scale, starting at zero, and the method
of classification on the horizontal scale.

• Assure that scales for each axis are appropriate for the data.

• Clearly indicate scale division, any scale breaks, and units of measure.

• Define abbreviations and symbols.

• Note data exclusions.

• If the data are not original, reference the source.

Several commonly used graphs are described below. The scatter plot, an extremely
helpful graph for detecting the relationship between two variables, has already been
described (see "Data Displays").

Arithmetic-scale line graphs

An arithmetic-scale line graph is one in which equal distances along the x and/or y
axes represent equal quantities along that axis. This type of graph is typically used
to demonstrate an overall trend over time rather than focusing on particular
observation values. It is most helpful for examining long series of data or for
comparing several data sets (see Figure 1.1).

The scale of the x-axis is usually presented in the same increments as the data are
collected (e.g., weekly or monthly). Several factors should be considered when
selecting a scale for the y-axis (28).

• Choose a length for the y-axis that is suitably proportional to that of
the x-axis. (A common recommendation is a 5:3 ratio of x-axis to y-axis length.)

• Identify the maximum y-axis value and round the value up slightly.

• Select an interval size that provides enough detail for the purpose of
the graph.

Scale breaks can be used for either or both axes if the range of the data is
excessive. However, care should be taken to avoid misrepresentation and
misinterpretation of the data when scale breaks are used.

Semi-logarithmic-scale line graphs

A semi-logarithmic-scale line graph, or semi-log graph, is characterized by one axis being
measured on an arithmetic scale (usually the x-axis) and the other being measured on a
logarithmic scale. A logarithm is the exponent expressing the power to which a base
number is raised (e.g., log 100 = log 10^2 = 2 for base 10). The axis portraying the
logarithmic scale on semi-log graph paper is divided into several cycles, with each
cycle representing an order of magnitude and values 10 times greater than the
preceding cycle (e.g., a 3-cycle semi-log graph could represent 1 to 10 in the first
cycle, 10 to 100 in the second cycle, and 100 to 1,000 in the third cycle).

A semi-logarithmic-scale line graph is particularly valuable when examining the rate
of change in surveillance data, because a straight line represents a constant rate of
change. For absolute changes, an arithmetic-scale line graph would be more
appropriate. The semi-log scale is also useful when large differences in magnitude or
outliers occur because this type of graph allows the plotting of wide ranges of values
(see Figure 1.6). With semi-log graphs, the slope of the line indicates the rate of
increase or decrease; thus a horizontal line indicates no change in rate. Also,
parallel lines for two conditions demonstrate identical rates of change (29).
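
Why a constant rate of change appears as a straight line on a semi-log graph can be checked numerically. The series below is hypothetical: it starts at 100 cases and grows by 50% per interval.

```python
import math

growth = 1.5  # hypothetical constant growth factor per time interval
cases = [100 * growth ** t for t in range(6)]

# On an arithmetic scale, the step between consecutive points keeps growing.
absolute_steps = [later - earlier for earlier, later in zip(cases, cases[1:])]

# On a logarithmic scale, every step has the same height, log10(growth),
# so the plotted points fall on a straight line.
log_steps = [
    math.log10(later) - math.log10(earlier)
    for earlier, later in zip(cases, cases[1:])
]
```

Two series with the same growth factor but different starting levels would have identical `log_steps`, which corresponds to the parallel lines mentioned above.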

Histograms

A histogram is a graph in which a frequency distribution is represented by adjoining
vertical bars. The area represented by each bar is proportional to the frequency for
that interval (i.e., the height multiplied by the width of each bar yields the number
of events for that interval). Thus, scale breaks should never be used in histograms
because they misrepresent the data.

Histograms can be constructed with equal or unequal class intervals. With equal class
intervals, the height of each bar is proportional to the frequency of the
events in that interval. We do not recommend using histograms with unequal class
intervals because they are difficult to construct and interpret correctly.
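
The area rule explains why unequal class intervals are awkward. In the hypothetical sketch below, each bar's height must be the count divided by the interval width, so that height multiplied by width recovers the count.

```python
def bar_heights(class_intervals, counts):
    """Height of each histogram bar so that area (height x width) equals the count."""
    return [
        count / (upper - lower)
        for (lower, upper), count in zip(class_intervals, counts)
    ]


# Hypothetical unequal class intervals: the last one is four times as wide.
intervals = [(0, 5), (5, 10), (10, 30)]
counts = [10, 10, 20]

heights = bar_heights(intervals, counts)
areas = [h * (upper - lower) for h, (lower, upper) in zip(heights, intervals)]
```

Here the widest interval holds the most events yet has the shortest bar, a result that readers who look only at bar heights can easily misread.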

The epidemic curve is a special type of a histogram in which time is the variable
plotted on the x-axis. The epidemic curve represents the occurrence of cases of a
health problem by date of onset during an epidemic (e.g., an outbreak of paralytic
poliomyelitis in Oman [see Figure V.6]). Usually the class intervals on the x-axis
should be less than one-fourth of the incubation period of the disease, and the
intervals should begin before the first reported case during the epidemic in order to
portray any identified background cases of the condition being graphed.

Cumulative frequency and survival curves

A cumulative frequency curve is used for both continuous and categorical data. It
plots the cumulative frequency on the y-axis and the value of the variable on the
x-axis. Cumulative frequencies can be expressed either as the number of cases or as a
percentage of total cases. For categorical data, the cumulative frequency is plotted
at the right-most end of each class interval (rather than at the midpoint) to depict
more realistically the number or percentage of cases above and below the x-axis value
(Figure V.7). When percentages are graphed, the cumulative frequency curve allows
easy identification of medians, quartiles, and other percentiles of interest.
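
The values plotted in a cumulative frequency curve can be computed as in the sketch below. The class intervals and counts are hypothetical; each cumulative percentage is paired with the upper (right-most) bound of its interval, as recommended above.

```python
from itertools import accumulate


def cumulative_percentages(counts):
    """Running total of counts, expressed as a percentage of all cases."""
    total = sum(counts)
    return [100 * running / total for running in accumulate(counts)]


# Hypothetical class intervals 0-10, 10-20, 20-30, 30-40, given by upper bound.
upper_bounds = [10, 20, 30, 40]
counts = [5, 15, 20, 10]

cum_pct = cumulative_percentages(counts)
points = list(zip(upper_bounds, cum_pct))  # (x, y) pairs to plot

# The median falls in the first interval whose cumulative percentage reaches 50.
median_upper_bound = next(x for x, pct in points if pct >= 50)
```

Other percentiles can be located the same way by replacing 50 with the percentage of interest.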

A survival curve (Figure V.8) is useful in a follow-up study for graphing the
percentage of subjects remaining until an event occurs in the study. The x-axis
represents time, and the y-axis is percentage surviving. A difference in orientation
exists between cumulative frequency and survival curves (Figures V.7, V.8).

Frequency polygons

A frequency polygon is constructed from a histogram by connecting the midpoints of the
class intervals with straight lines. A frequency polygon is useful for comparing
frequency distributions from different data sets (Figure V.9). Detailed instructions
for constructing frequency polygons are presented elsewhere (28,29).

Charts

Charts are useful graphics for illustrating statistical information. Many types of
charts can be used (28-30). They are most suited and helpful for comparing magnitudes
of events in categories of a variable. In the paragraphs below, we describe several
of the most frequently used types of charts.

Bar charts

Bar charts are one of the simplest and most effective ways to present comparative
data. A bar chart uses bars of the same width to represent different categories of a
factor. Comparison of the categories is based on linear values since the length of a
bar is proportional to the frequency of the event in that category. Therefore, scale
breaks could cause the data to be misinterpreted and should not be used in bar charts.
Bars from different categories are separated by spaces (unlike the bars in a
histogram). Although most bars are vertical, they may be depicted horizontally. They
are usually arranged in ascending or descending length, or in some other systematic
order.

Several variations of the bar chart are commonly used. The grouped or multiple-unit
bar chart compares units within categories (Figure V.10). Generally the number of
units within a category is limited to three for effective presentation and
understanding.

A stacked bar chart is also used to compare different groups within each category of a
variable. However, it differs from the grouped bar chart in that the different groups
are differentiated not with separate bars, but with different segments within a single
bar for each category. The distinct segments are illustrated by different types of
shading, hatching, or coloring, which are defined in a legend (Figure V.11).

The deviation bar chart illustrates differences in either direction from a baseline.
This type of chart is especially useful for demonstrating positive-negative and
profit-loss data or comparisons of data at different times (Figure V.12). The
incorporation of a confidence interval-like portion in the bars provides additional
useful information.

Pie charts

A pie chart represents the different percentages of categories of a variable by
proportionally sized pieces of pie (Figure V.13). The pieces are usually denoted with
different colors or shading, and the percentages are written inside or outside the
pieces to allow the reader to make accurate comparisons.

Maps

Maps are the graphic representation of data using location and geographic coordinates
(33). A map generally provides a clear, quick method for grasping data and is
particularly effective for readers who are familiar with the physical area being
portrayed. A few popular types of maps that depict incidence or distribution of
health conditions are described below.

Spot maps

A spot map is produced by placing a dot or other symbol on the map where the health
condition occurred or exists (Figure V.14). Different symbols can be used for
multiple events at a single location. Although a spot map is beneficial for
displaying geographic distribution of an event, it does not provide a measure of risk
since population size is not taken into account.
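
The difference between a count and a measure of risk can be shown with a short
calculation. The following is a minimal sketch in Python; the two areas, their case
counts, and their populations are invented for illustration and do not come from any
surveillance system. The area that would carry more dots on a spot map turns out to
have the lower rate.

```python
# Hypothetical areas: number of cases (dots on a spot map) and resident population.
areas = {
    "Area A": {"cases": 40, "population": 200_000},
    "Area B": {"cases": 10, "population": 20_000},
}


def rate_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases / population * 100_000


rates = {name: rate_per_100k(a["cases"], a["population"])
         for name, a in areas.items()}

for name, a in areas.items():
    print(f"{name}: {a['cases']} cases, {rates[name]:.1f} per 100,000")
```

In this example Area A has four times as many cases as Area B but less than half the
rate, which is why a spot map alone cannot be read as a map of risk.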

Choropleth maps

A choropleth map is a frequently used statistical map involving different types of
shading, hatching, or coloring to portray range-graded values (Figure V.15). It is
also called a shaded or area map. Choropleth maps are useful for depicting rates of
a health condition in specific areas.

Care must be taken in interpreting choropleth maps because each area is shaded
uniformly regardless of any demographic differences within an area. For example, most
of a county may be relatively sparsely populated by low-income persons, whereas a
small portion of that county may be densely inhabited by persons with higher incomes;
and the rate at which a particular health condition occurs may falsely appear to be
evenly distributed by location and by socioeconomic status throughout the county.
Choropleth maps can also give the false impression of abrupt change in number or rate
of a condition across area boundaries when, in fact, a gradual change may have
occurred from one area to the next.
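
The way uniform shading can hide differences within an area can be sketched with a
small example. The class boundaries, shade names, and the two subareas of a single
county below are all hypothetical; the sketch only illustrates how range-graded
classes are assigned and how a single county-wide rate can mask very different rates
in its parts.

```python
import bisect

# Hypothetical class boundaries (cases per 100,000). A rate equal to a
# boundary falls in the higher class.
BREAKS = [10.0, 20.0, 40.0]
SHADES = ["none", "light", "medium", "dark"]


def shade_for(rate):
    """Return the shading class for a rate, using range-graded classes."""
    return SHADES[bisect.bisect_right(BREAKS, rate)]


def rate_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases / population * 100_000


# One county made up of two subareas with very different rates.
subareas = [
    {"name": "sparsely populated part", "cases": 5, "population": 100_000},
    {"name": "densely populated part", "cases": 45, "population": 50_000},
]

county_cases = sum(s["cases"] for s in subareas)
county_population = sum(s["population"] for s in subareas)
county_rate = rate_per_100k(county_cases, county_population)

print("whole county:", round(county_rate, 1), shade_for(county_rate))
for s in subareas:
    r = rate_per_100k(s["cases"], s["population"])
    print(s["name"] + ":", round(r, 1), shade_for(r))
```

On the map the whole county would receive one intermediate shade, although one part
falls in the lowest class and the other in the highest.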

Density-equalizing maps

A density-equalizing or rubber map (Figure V.16) transforms actual geographic
coordinates to produce an artificial figure in which area or population density is
equal throughout the map (34). Density-equalizing maps correct for the confounding
effect of population density and thus are particularly useful in analyzing geographic
clusters of public health events.

Several algorithms exist to transform coordinates of maps. Any transformation routine
should define a continuous transformation over the map domain, solve for the unique
solution that minimizes map distortion, accept optional constraints, and avoid
overlapping of transformed areas (35).

INTERPRETATION OF SURVEILLANCE DATA

The real art of conducting surveillance lies in interpreting what the data say. Data
need to be interpreted in the context of our understanding of the etiology,
epidemiology, and natural history of the disease or injury. The interpretation should
focus on aspects which might lead to improved control of the condition. By proceeding
from the simple to the complex, investigators can use surveillance as a basis for
taking appropriate public health action. Epidemics can be recognized, preventive
strategies applied, and the effect of such actions assessed. The key to
interpretation lies in knowing the limitations of the data and being meticulous in
describing them. One axiom to be kept in mind always is that, because of the
descriptive nature of surveillance data, correlation does not equal causation.

Limitations in Data

No surveillance system is perfect; however, most can be useful. Several problems
inherent in data obtained through surveillance must be recognized if the data are to
be interpreted correctly.

Underreporting

Because most surveillance systems are based on conditions reported by health-care
providers, underreporting is inevitable. Depending on the condition, 5%-80% of cases
that actually occur will be reported (36-39). However, the need for completeness of
reporting--particularly for common health problems--may be exaggerated. Disease
trends by time, place, and person can frequently be detected even with incomplete
data. So long as the underreporting is relatively consistent, incomplete data can
still be applied to derive useful inferences. For problems that occur infrequently,
the need for completeness becomes more important.
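
The point that consistent underreporting preserves trends can be illustrated with a
brief sketch. The yearly counts and the 20% reporting fraction below are invented for
the example; the only assumption that matters is that the fraction reported stays the
same from year to year.

```python
# Hypothetical true case counts for five consecutive years.
true_cases = [1000, 1200, 1500, 1300, 900]

# Assumed constant reporting completeness: 20% of cases reach the system.
REPORTING_FRACTION = 0.20

reported = [round(n * REPORTING_FRACTION) for n in true_cases]


def relative_to_first(series):
    """Express each value as a ratio to the first value in the series."""
    return [value / series[0] for value in series]


print("reported counts:      ", reported)
print("true trend (ratio):   ", relative_to_first(true_cases))
print("reported trend (ratio)", relative_to_first(reported))
```

The reported counts are only one fifth of the true counts, yet the year-to-year
pattern is the same. If the reporting fraction itself changed over time, the two
trends would diverge, which is why changes in reporting practice must be ruled out
before a trend is interpreted.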

Unrepresentativeness of reported cases

Health conditions are not reported randomly. For example, illnesses dealt with in a
public health facility are reported disproportionately more frequently than those
diagnosed by private practitioners. A health problem that leads to hospitalization is
more likely to be reported than problems dealt with on an outpatient basis. Thus,
reporting biases can distort interpretation. When it is possible, adjusting for
skewed reporting will allow investigators to obtain a more accurate picture of the
occurrence of a health problem. Collecting data from multiple sources may help
provide ways to improve the representativeness of the information.
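
One simple way to think about adjusting for skewed reporting is to divide the number of
reported cases from each source by an estimate of that source's reporting
completeness. The sketch below is illustrative only: the two sources, their counts,
and their completeness values are hypothetical, and in practice completeness would
have to be estimated from some independent source of information.

```python
# Hypothetical reported counts and assumed reporting completeness by source.
sources = {
    "public clinics": {"reported": 120, "completeness": 0.60},
    "private practitioners": {"reported": 30, "completeness": 0.15},
}


def adjusted_count(reported, completeness):
    """Estimate the cases that occurred, given the fraction that was reported."""
    if not 0 < completeness <= 1:
        raise ValueError("completeness must be greater than 0 and at most 1")
    return reported / completeness


estimates = {name: adjusted_count(s["reported"], s["completeness"])
             for name, s in sources.items()}

reported_total = sum(s["reported"] for s in sources.values())
estimated_total = sum(estimates.values())

for name, s in sources.items():
    print(f"{name}: {s['reported'] / reported_total:.0%} of reported cases, "
          f"{estimates[name] / estimated_total:.0%} of estimated cases")
```

In this example public clinics account for most of the reported cases, but after
adjustment the two sources contribute equally to the estimated total.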

Inconsistent case definitions

Different practitioners frequently use different case definitions for health problems.
The more complex the diagnostic syndrome, the greater the difficulty in reaching
consensus on a case definition. Moreover, with newly emerging problems, as
understanding of their natural history progresses, we frequently adjust the case
definition to allow greater accuracy of diagnosis. Persons who interpret surveillance
data must be aware of any changes in case definitions and must adjust their
interpretations accordingly.

Approach to Interpretation

Creative interpretation of surveillance data requires more common sense than
sophisticated reasoning. The data can speak for themselves. Brainstorm and test, if
possible, all potential explanations for an observed pattern. Has the nature of
reporting changed? Have providers or new geographic areas entered the surveillance
system? Has the case definition changed? Has a new intervention, such as screening
or therapy, been introduced?

Consistency among different surveillance systems is probably the most crucial factor
affecting interpretation. If different surveillance data sets from different
locations show similar trends, the likelihood that the effect is real increases.
Examine trends in different age groups. Finally, choose the surveillance system you
think represents the highest quality local information. If the trends of the health
problem are evident there, you can be more confident about your interpretations.

To facilitate interpretation of surveillance data, formats can be designed to
determine whether the number of reported cases of a health problem for a specified
reporting period differs from that of a previous period. An example of such a "user-
friendly" format has been published in CDC's Morbidity and Mortality Weekly Report
(MMWR) since 1990 (40,41). Known simply as "Figure 1," the graph uses horizontal bars
to indicate the ratio of the current level of disease to the previous 5-year average
(Figure V.12). Striping in the bars shows whether the number of reported cases during
the most recent 4-week interval is higher or lower than expected on the basis of the
mean and two standard deviations of the 4-week totals. A change in the occurrence of
disease identified by this approach indicates the need for more detailed examination
of the data--and may indicate an epidemic. Other diverse statistical techniques can
be used to detect aberrations in surveillance data (42; see Chapter VI).
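
The comparison described above can be sketched as follows. This is an illustration
only, not the published MMWR computation: the function name, the use of the sample
standard deviation, and the historical totals are assumptions made for the example.

```python
from statistics import mean, stdev


def compare_with_history(current_total, historical_totals, n_sd=2.0):
    """Compare a current 4-week total with historical 4-week totals.

    Returns the ratio of the current total to the historical mean and a flag:
    "high" or "low" if the current total lies outside the mean plus or minus
    n_sd standard deviations, otherwise "within".
    """
    avg = mean(historical_totals)
    sd = stdev(historical_totals)  # sample standard deviation
    ratio = current_total / avg
    if current_total > avg + n_sd * sd:
        flag = "high"
    elif current_total < avg - n_sd * sd:
        flag = "low"
    else:
        flag = "within"
    return ratio, flag


# Hypothetical historical 4-week totals and a current 4-week total.
history = [48, 52, 50, 47, 53, 49, 51, 50, 46, 54, 50, 52, 48, 51, 49]
ratio, flag = compare_with_history(75, history)
print(round(ratio, 2), flag)
```

A "high" or "low" flag is a prompt to examine the data in more detail, not a
conclusion that an epidemic is occurring.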

INTERPRETIVE USES FOR SURVEILLANCE DATA

Identifying Epidemics

An important use of surveillance data is in determining whether increases in numbers
of cases of a health condition at the local or national level represent outbreak
(i.e., epidemic) situations that require immediate investigation and intervention.
Thus, a surveillance system can function as an early warning signal for public health
officials. For example, increases in numbers of cases of hepatitis B among military
recruits provided the stimulus to intervene with drug-prevention programs (43). CDC's
Birth Defects Monitoring System identified increases in renal agenesis (44) during the
1970s and 1980s, which prompted an investigation. Monitoring of regional trends in
rubella and congenital rubella identified outbreaks among the Amish in 1989-1990 (45).
A national registry of anti-abortion-associated violence clearly documented an
"epidemic" of attacks in the mid-1980s, which decreased after vigorous prosecution was
initiated by the Federal Bureau of Investigation (46).

The utility of surveillance data in detecting epidemics is highest in situations in
which cases of the health condition occur over a wide geographic area or gradually
over time. In such situations, the time-place-person links among cases probably would
not be recognized by individual practitioners (3). Typical examples occur with
infectious diseases, when laboratory monitoring of unusual serotypes or
antibiotic-resistance patterns identifies outbreaks of specific microorganisms that
might otherwise have gone unnoticed. Nationwide epidemics of Salmonella newport (47), S. enteritidis
(48), and Shigella sonnei (49) have been detected through surveillance.

Identifying New Syndromes

The most dramatic use of surveillance data occurs when a "new" syndrome emerges from
an ongoing monitoring system. Legionnaires' disease was detected and subsequently
characterized as the result of an outbreak of non-influenza pneumonia within a
specific place and population (50). Acquired immunodeficiency syndrome (AIDS) was
recognized both because of rapid increases in requests for CDC's pentamidine supply
and because it occurred in a special time (early 1981), place (California, New York),
and person (men having sex with men) setting (51). Finally, the national scope of the
epidemic of eosinophilia-myalgia syndrome (EMS) was noticed because its unique
features were like those of toxic oil syndrome (52).

Monitoring Trends

Even if specific outbreaks or new syndromes cannot be identified by tracking
surveillance data, the baseline level of the health condition being monitored reflects
any variation in its occurrence over time. This purpose is especially relevant to
assessing events associated with reproductive health (e.g., ectopic pregnancy or
neonatal mortality), chronic disease, or infections with a long latency. The
progressive decline--until recently--of tuberculosis in the 20th century and the
constant increase in numbers of cases of AIDS throughout the 1980s reflect this
monitoring function (53,54).

Evaluating Public Policy

Surveillance data can assess the health impact--pro or con--of specific interventions
or of public policy. The rapid fall in numbers of cases of poliomyelitis and measles
after national vaccination campaigns were instituted is a classic example of the
usefulness of surveillance data (55,56). Creative interpretation of surveillance data
has also been applied to noninfectious conditions; the impact, in such situations, is
somewhat more difficult to assess. For example, in Washington, D.C., the adoption of
a gun-licensing law coincided with an abrupt decline in firearm-related homicides and
suicides (57). No similar reductions occurred in the number of homicides or suicides
committed by other means, nor did states adjacent to the District experience any
reductions in their rates of firearm-related homicides or suicides. Also,
surveillance of legal abortions and of deaths associated with illegal abortion has
helped trace the public health impact of this controversial health problem (8,58,59).
After legal abortion became widely available, deaths from illegal abortion decreased
markedly; however, restriction of federal funds for abortion had a negligible effect
on health parameters (60).

Though it is tempting to use trends in disease and injury to monitor the impact of
community interventions, such evaluation becomes increasingly suspect when several
factors contribute to the occurrence of the disease or health condition being monitored.
In addition, if only a portion of the population accepts an intervention, analysis and
interpretation of surveillance data are made even more difficult. Frequently,
surveillance of process measures or other health problems can act as proxies for the
intended outcome. Moreover, finding comparability in data from several populations
that have attempted similar public health programs strengthens evidence that the
interpretation is correct. For example, to evaluate the effectiveness of allowing
people to exchange used hypodermic needles for new ones as a means of preventing AIDS,
epidemiologists could simultaneously examine trends in numbers of needles distributed,
surveys of needle use, and incidence of higher-prevalence infections such as hepatitis
B.

Projecting Future Needs

Mathematical models based on surveillance data can be used to project future trends.
This tool helps health officials determine the eventual need for preventive and
curative services. Recently such modelling assisted in estimating the impact of AIDS
on the United States health-care system in the 1990s (61). Such projections addressed
not only the demand for AZT by HIV-infected persons with low CD4 lymphocyte counts,
but also the requirements for hospital care for persons with life-threatening
superinfections later in the course of HIV-related disease. In addition,
models based on surveillance data can predict the decline of morbidity and/or
mortality when there are changes in risk factors among the population at risk.
Examples of this application include projecting the decline in cardiovascular disease
on the basis of decreased smoking of cigarettes (62), the decline in cirrhosis-related
mortality in the presence of lower levels of alcohol use (63), and decreased rates of
mortality from cervical cancer associated with an increase in the prevalence of
hysterectomy (64).

REFERENCES

1. Thacker SB, Berkelman RL. Public health surveillance in the United States.
Epidemiol Rev 1988;10:164-90.

2. Doll R. Surveillance and monitoring. Int J Epidemiol 1974;3:305-13.

3. Berkelman RL, Buehler JW. Surveillance. In: Holland WW, Detels R, Knox G,
eds. Oxford textbook of public health, second edition. Vol 2: Methods of
public health. Oxford: Oxford University Press, 1991:161-76.

4. Hinman AR. Analysis, interpretation, use and dissemination of surveillance
information. PAHO Bull 1977;11:338-43.

5. Thacker SB, Berkelman RL, Stroup DF. The science of public health surveillance.
J Pub Health Pol 1989;10:187-203.

6. Morgenstern H. Uses of ecologic analysis in epidemiologic research.
Am J Public Health 1982;72:1336-44.

7. Piantadosi S, Byar DP, Green SB. The ecological fallacy. Am J Epidemiol
1988;127:893-904.

8. Robinson WS. Ecological correlations and the behavior of individuals.
Am Sociol Rev 1950;15:351-7.

9. Koonin LM, Kochanek KD, Smith JC, Ramick M. Abortion surveillance, United
States, 1988. In: CDC surveillance summaries, July 1991. MMWR 1991;40(No. SS-2):15-42.

10. Snow J. Snow on cholera. New York: Hafner Press, 1965.

11. Firebaugh G. A rule for inferring individual relationships from aggregate data.
Am Sociol Rev 1978;43:557-72.

12. Rolfs RT, Nakashima AK. Epidemiology of primary and secondary syphilis in the
United States, 1981-1989. JAMA 1990;254:1432-7.

13. Marx R, Aral SO, Rolfs RT, Sterk CE, Kahn JG. Crack, sex, and STD. Sex Transm
Dis 1991;18:92-101.

14. Last JM, ed. A dictionary of epidemiology. 2nd ed. New York: Oxford University
Press, 1988:141.

15. Health United States 1990. DHHS publication no. (PHS) 91-1232. Hyattsville,
Maryland: Centers for Disease Control, 1991.

16. Ahlbom A, Norell S. Introduction to modern epidemiology. Chestnut Hill,
Massachusetts: Epidemiology Resources Inc, 1984:97.

17. Fleiss JL. Statistical methods for rates and proportions. 2nd ed. New York:
John Wiley & Sons, Inc, 1981:321.

18. Kahn HA, Sempos CT. Statistical methods in epidemiology. In: MacMahon B, ed.
Monographs in epidemiology and biostatistics. Vol 12. New York: Oxford
University Press, 1989:292.

19. Lilienfeld AM, Lilienfeld DE. Foundations of epidemiology. 2nd ed. New York:
Oxford University Press, 1980:375.

20. Peavy JV. Adjusted rates. DHHS publication No. (PHS) 00-1833. Atlanta:
Centers for Disease Control, 1988.

21. Mausner JS, Bahn AK. Epidemiology: an introductory text. Philadelphia:
WB Saunders, 1974:377.

22. Haight F. Handbook of the Poisson distribution. New York: John Wiley & Sons,
Inc, 1967.

23. Kleinbaum DG, Kupper LL, Muller KE. Applied regression analysis and other
multivariable methods. 2nd ed. Boston: PWS-Kent Publishing Co, 1988:718.

24. Tukey JW. Exploratory data analysis. Reading, Massachusetts: Addison-Wesley
Publishing Company, 1977:688.

25. Velleman PF, Hoaglin DC. Applications, basics, and computing of exploratory
data analysis. Boston: Duxbury Press, 1981:354.

26. McNeil DR. Interactive data analysis. New York: John Wiley & Sons, Inc,
1977:186.

27. Tufte ER. The visual display of quantitative information. Cheshire,
Connecticut: Graphics Press, 1987:197.

28. Principles of epidemiology. 2nd ed (field test version 11/91). Atlanta: Centers
for Disease Control, 1991.

29. Peavy JV, Dyal WW, Eddins DL. Descriptive statistics: tables, graphs, &
charts. DHHS publication no. (PHS) 00-1834. Atlanta: Centers for Disease
Control, 1986.

30. Schmid CF. Statistical graphics design principles and practices. New York:
John Wiley & Sons, Inc, 1983:212.

31. Chambers JM, Cleveland WS, Kleiner B, Tukey PA. Graphical methods for data
analysis. Boston: Duxbury Press, 1983:395.

32. Tufte ER. Envisioning information. Cheshire, Connecticut: Graphics Press,
1990:126.

33. Haggett P, Cliff AD, Frey A. Locational analysis in human geography. 2nd ed.
Bristol: JW Arrowsmith Ltd, 1977:605.

34. Gillihan AF. Population maps. Am J Public Health 1927;17:316-9.

35. Merrill DW, Selvin S, Mohr MS. Analyzing geographic clustered response. In:
American Statistical Association 1991 Proceedings of the Section on Statistics
and the Environment. Alexandria, Virginia: American Statistical Association
(in press).

36. Eylenbosch WJ, Noah ND. Surveillance in health and disease. Oxford: Oxford
University Press, 1988:15-23,32-5.

37. Vogt RL, LaRue D, Klaucke DN, Jillson DA. Comparison of active and passive
surveillance systems of primary care providers for hepatitis, measles, rubella
and salmonellosis in Vermont. Am J Public Health 1983;73:795-7.

38. Levy BS, Mature J, Washburn JW. Intensive hepatitis surveillance in Minnesota:
methods and results. Am J Epidemiol 1977;105:127-34.

39. Marier R. The reporting of communicable diseases. Am J Epidemiol 1977;105:587-90.

40. Centers for Disease Control. Proposed changes in format for presentation of
notifiable disease report data. MMWR 1989;38:805-9.

41. Centers for Disease Control. Changes in format for presentation of notifiable
disease report data. MMWR 1990;39:234-5.

42. Stroup DF, Williamson GD, Herndon JL, Karon JM. Detection of aberrations in the
occurrence of notifiable diseases surveillance data. Stat Med 1989;8:323-9.

43. Cowan DN, Prier RE. Changes in hepatitis morbidity in the United States Army,
Europe. Milit Med 1984;149:260-5.

44. Edmonds LD, James LM. Temporal trends in the prevalence of congenital
malformations at birth based on the Birth Defects Monitoring Program, United
States, 1979-1987. In: CDC surveillance summaries, December 1990. MMWR
1990;39(No. SS-4):19-23.

45. Centers for Disease Control. Outbreak of rubella among the Amish--United
States, 1991. MMWR 1991;40:264.

46. Grimes DA, Forrest JD, Kirkman AL, Radford B. An epidemic of anti-abortion
violence in the United States. Am J Obstet Gynecol 1991;165:1263-8.

47. Holmberg SD, Osterholm MT, Senger KA, Cohen ML. Drug-resistant Salmonella from
animals fed antimicrobials. N Engl J Med 1984;311:617-22.

48. St Louis ME, Morse DL, Potter ME, et al. The emergence of grade A eggs as a
major source of Salmonella enteritidis infections: new implications for the
control of salmonellosis. JAMA 1988;259:2103-7.

49. Centers for Disease Control. Nationwide dissemination of multiply resistant
Shigella sonnei following a common-source outbreak. MMWR 1987;36:633-4.

50. Fraser DW, Tsai TR, Orenstein W, et al. Legionnaires' disease: description of
an epidemic of pneumonia. N Engl J Med 1977;297:1189-97.

51. Centers for Disease Control. Pneumocystic pneumonia--Los Angeles. MMWR
1981;30:250-2.

52. Swygert LA, Maes EF, Sewell LE, Miller L, Falk H, Kilbourne EM. Eosinophilia-
myalgia syndrome: results of national surveillance. JAMA 1990;264:1698-703.

53. Reider HL, Cauthen GM, Kelly GD, et al. Tuberculosis in the United States.
JAMA 1989;262:385-90.

54. Centers for Disease Control. Update: acquired immunodeficiency syndrome--
United States, 1981-1990. MMWR 1991;40:358-69.

55. Centers for Disease Control. Measles prevention: recommendations of the
Immunization Practices Advisory Committee (ACIP). MMWR 1989;38(No. S-9):1-18.

56. Centers for Disease Control. Progress toward eradicating poliomyelitis from the
Americas. MMWR 1989;38:532-5.

57. Loftin C, McDowall D, Wiersema B, Cottey TJ. Effects of restrictive licensing
of handguns on homicide and suicide in the District of Columbia. N Engl J Med
1991;325:1615-20.

58. Cates W Jr, Rochat RW, Grimes DA, Tyler CW Jr. Legalized abortion: effect on
national trends of maternal and abortion-related mortality (1940-1976).
Am J Obstet Gynecol 1978;132:211-4.

59. Cates W Jr. Legal abortion: the public health record. Science 1982;215:1586-90.

60. Cates W Jr. The Hyde amendment in action: how did the restriction of federal
funds for abortion affect low-income women. JAMA 1981;246:1109-12.

61. Centers for Disease Control. HIV prevalence estimates and AIDS case projections
for the United States: report based upon a workshop. MMWR 1990;39(No. RR-16).

62. Kullback S, Cornfield J. An information theoretic contingency table analysis of
the Dorn study of smoking and mortality. Comput Biomed Res 1976;9:409-37.

63. Skog O. The risk function for liver cirrhosis from lifetime alcohol
consumption. J Stud Alcohol 1984;45:199-208.

64. Centers for Disease Control. Hysterectomy prevalence and death rates for
cervical cancer--United States, 1965-1988. MMWR 1992;41:17-20.

Chapter VI

Special Analytic Issues

Donna F. Stroup

"There is only one good, that is knowledge. There is only one evil, that is
ignorance."

Socrates

NATURE OF PUBLIC HEALTH SURVEILLANCE DATA

Data obtained in a public health surveillance system have several characteristics that
affect analyses. Most fundamentally, data from most surveillance systems are not
generated from a designed study or randomized trial. Although this departure has been
addressed in the context of epidemiologic studies and field investigations (1), the
effect in the surveillance setting has specific consequences.

First, for a surveillance system, data are reported regularly, and may be updated
after the initial report. Since the lag time between first report and subsequent
updating may vary by health event or reporting location, methods developed for early
detection of aberrations in the data should be applied as soon as provisional data are
available. If the analyses are implemented as part of a routine surveillance program,
results can be monitored as data are updated.

Second, surveillance data are generated by a spatial as well as a temporal process.
For example, at a given point in time, cases of a disease for a given area may not
appear excessive; however, when compared with other times or other areas at a given
time, an excess may become apparent (2).

Third, when only aggregated data are available (e.g., from regions, counties, or
states), the distribution of cases in the underlying population cannot be assessed
directly. This problem is compounded because the areas of aggregation are usually
arbitrarily defined and case definitions are not consistent within areas. As a
result, statistical inferences concerning the properties of individuals are confounded
by the properties of the aggregated system.

Finally, the surveillance process is generally a multivariate one (3). Multiple
health events under surveillance may be related for a given point in time for the same
area, or the relationship may be delayed in time for the same or nearby areas if
diagnosis is uncertain or confirmation is delayed. The multivariate nature of this
process should be used to improve the ability of any method to detect aberrations from
a baseline.

CLUSTERING OF HEALTH EVENTS

One foundation of the science of epidemiology is the study of the departure of the
observed patterns of the occurrence of disease from the expected pattern of occurrence
(4). Variations in the usual incidence of health events in different geographic areas
or different time periods may provide important clues to specific risk factors or even
to the etiology of the problem. The expected numbers of reported health events are
generated by a process involving human behavior and transmission of disease, and
patterns of occurrence within human populations may lead to hypotheses about the
determinants of the health problem (5).

The public health community continues to struggle with nomenclature for such
variations. The term "cluster" can be defined as "a set of events occurring unusually
close together to each other in time or space, in both time and space, or within the
limits of demographic characteristics (e.g., persons in the same occupation)."
"Cluster" is usually used to describe uncommon events (e.g., leukemia, suicide) and
tends to evoke emotional response from members of the public or from the media.

A related term is "epidemic," historically used to describe aggregation of infectious
diseases: "an outbreak of a disease spreading rapidly from person to person" (6).
More recently, the concept has broadened to the following: "the occurrence in a
community or region of cases of an illness, specific health-related behavior, or other
health-related events clearly in excess of normal expectancy .... The number of
cases indicating the presence of an epidemic will vary according to agent, size and
type of population exposed, previous experience or lack of exposure to the disease and
time and place of occurrence; thus, epidemicity is relative to the usual frequency of
the disease in the same area, among the specified population, at the same season of
the year" (7). It is prudent to be conscious of the fact that the term "epidemic"
evokes responses beyond these definitions. In late 1988, the British Public Health
Laboratory Service used "epidemic" to describe an increase in reported numbers of
cases of Salmonella enteritidis associated with contaminated chicken and eggs. The
country's Chief Medical Officer, Sir Donald Acheson, advised caution "...in using the
word epidemic when addressing the public because of its connotations with terrifying
diseases such as cholera and smallpox" (8). The term "outbreak" has less evocative
connotations. With all such definitions, a critical concept is the comparison of an
observed number with what is usual or normal. The distinction made here is that
"aberration" will be used to denote changes in the occurrence of health events that
are statistically significant when compared with usual or normal history. The
definition of an epidemic may require the existence of an aberration; e.g., the
Centers for Disease Control (CDC) declares that an epidemic of a specific strain of
influenza is occurring only if the number of reported deaths exceeds a 95% confidence
limit in the forecast for two or more consecutive periods. In general, application of
the term "epidemic" may require epidemiologic conditions beyond the statistical ones,
e.g., laboratory isolates or resistance to vaccine.
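
The statistical part of such a rule, that is, exceeding a limit for two or more
consecutive periods, can be sketched as below. The function name and the weekly
figures are hypothetical, and the 95% limits are taken as given; how the forecast and
its limits are produced is not shown.

```python
def exceeds_for_consecutive_periods(observed, upper_limits, min_run=2):
    """Return True if observed values exceed their upper limits in at
    least `min_run` consecutive periods."""
    run = 0
    for obs, limit in zip(observed, upper_limits):
        run = run + 1 if obs > limit else 0
        if run >= min_run:
            return True
    return False


# Hypothetical weekly deaths and upper 95% limits of the forecast.
observed_deaths = [510, 540, 565, 590, 560]
upper_limit = [550, 550, 555, 560, 565]

print(exceeds_for_consecutive_periods(observed_deaths, upper_limit))
```

Requiring consecutive exceedances guards against declaring an epidemic on the basis
of a single unusually high report.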

In this chapter, "aberration" is used to describe statistical departures from a usual
distribution. It is important to understand that such departures do not necessarily
signal the "onset of an epidemic" or the "presence of a cluster." Conversely, one can
have an epidemic even in the absence of a statistical increase, such as when infant
mortality is "low" but still higher than expected. The methods developed here are
intended for routine use by the public health analyst, in conjunction with
epidemiologic investigation and close communication with the source of the
surveillance reports.

ABERRATIONS IN TIME

Since the definition of surveillance implies ongoing data collection, perhaps the most
fundamental question suggested by the analysis of a surveillance system is the
following: When does the value of reported events signal a change in the process from
past patterns? Although fundamental, the analysis required to address this question
suggests additional questions. How are "past patterns" defined? If an outbreak
occurred in the past, should this affect the definition of a change? Other than the
disease or injury process itself, what other factors could cause a change?

In the paragraphs below, we use the term "baseline" to denote historical data and
"current report" to denote the recent data on which the assessment is based.

Graph of Current and Past Experience

State health departments report the numbers of cases of about 50 notifiable diseases
each week to CDC's National Notifiable Diseases Surveillance System (NNDSS). The list
of health events is determined collaboratively by the Council of State and Territorial
Epidemiologists and CDC (9,10). Each week provisional reports are published in the
Morbidity and Mortality Weekly Report (MMWR) and are made available to
epidemiologists, clinicians, and other public health professionals in a timely manner.
Although the tables of the MMWR continue to provide important information, the volume
of data and the need for ease of interpretation encouraged the development of a
graphic display to highlight unusually high or low numbers of reported cases.

A new analytic and graphical method was adopted for this system to achieve the
following objectives: a) to portray in a single comprehensible figure the weekly
reports of data for approximately 20 diseases and to compare those data with past
results, and b) to highlight for further analysis the results most likely to reflect either
long-term trends or epidemics. These objectives were formulated to reflect most
recent behavior in as short a time period as possible for weekly publication, but a
long enough period to assure stable results. To facilitate comprehension, the same
method is used for all diseases portrayed.

The analytic method currently used for constructing Figure I in the MMWR (see
Figure VI.12), called the "CDC MMWR Current/Past Experience Graph (CPEG)," compares
the number of reported cases in the current 4-week period for a given health event
with historical data on the same condition from the preceding 5 years (11,12).
Numbers of cases in the current month are listed to facilitate interpretation of
instability caused by small numbers.

The choice of 4 weeks as the "current period" was based on evidence that weekly
fluctuation in data from disease reports usually reflects irregular reporting
practices rather than actual incidence of disease. The use of 5 years of history
achieves the objective of using the same model for all conditions portrayed, since
some health events were made notifiable only recently (e.g., acquired immunodeficiency
syndrome [AIDS] and legionellosis).

Also, modelling of reported influenza incidence has shown that more accurate forecasts
are based on more recent data (13). To increase the historical sample size and to
account for any seasonal effect, the baseline is taken to be the average of the
reported number of cases for the preceding 4-week period, the corresponding 4-week
period, and the following 4-week period, for the previous 5 years. This yields 15
correlated observations, referred to as the historical observations, or "baseline"
(Figure VI. 1) .

The deviation from unity of the ratio of the current 4-week total to the historical
average is indicative of a departure from past patterns. We plot this ratio on a
logarithmic scale so that an n-fold increase projects to the right the same distance
as an n-fold decrease projects to the left, and no change from past patterns (1:1)
produces a bar of zero length (14). To distinguish the conditions that may require
further investigation, the hatching on the bars begins at a point based on the mean
and standard deviation of the historical observations.*

An evaluation of this method shows that it has good statistical robustness to patterns
in the data and high sensitivity and predictive value positive for epidemiologically
confirmed outbreaks (15). An outbreak of rubella detected by this method proved to be
of substantial public health importance (16). Recent increases beyond historical
limits in reporting of aseptic meningitis reflected increased disease activity
primarily in the northeastern United States (17).
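
The CPEG calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used for the MMWR figure: the function name `cpeg_ratio`, the use of the sample standard deviation, and the 15 baseline values are all assumptions made for the example. The limits follow the rule of 1 plus or minus 2 times the standard deviation divided by the mean of the 15 historical 4-week totals.

```python
import math
from statistics import mean, stdev


def cpeg_ratio(current_total, historical_totals):
    """Compare a current 4-week total with historical 4-week totals.

    Returns the ratio to the historical mean, its base-10 logarithm
    (the quantity plotted on the logarithmic axis), the historical
    limits, and whether the ratio falls outside those limits.
    """
    baseline_mean = mean(historical_totals)
    # Sample standard deviation; the source does not say which form is used.
    baseline_sd = stdev(historical_totals)
    ratio = current_total / baseline_mean
    half_width = 2 * baseline_sd / baseline_mean
    lower, upper = 1 - half_width, 1 + half_width
    return {
        "ratio": ratio,
        "log_ratio": math.log10(ratio) if ratio > 0 else float("-inf"),
        "lower": lower,
        "upper": upper,
        "flagged": ratio < lower or ratio > upper,
    }


# Hypothetical baseline: 3 four-week periods in each of 5 previous years.
historical = [8, 12] * 7 + [10]
result = cpeg_ratio(20, historical)
print(result)
```

A ratio of 1 gives a log ratio of 0, which corresponds to the bar of zero length mentioned above.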

TIME-SERIES METHODS

The method used by CDC to estimate excess mortality associated with influenza was
developed from a 1932 study that defined the expected number of weekly deaths from
pneumonia and influenza, or from all causes, as the median number of deaths for a
given week during non-epidemic years (18). "Excess deaths," then, was defined as the
difference between the observed and the conditional expected numbers, a one-period-
ahead forecast. Later, a regression model was fitted to weekly pneumonia and
influenza data from U.S. cities to calculate an expected number of deaths (19). In
1979, CDC proposed a new method to estimate expected deaths using a body of methods
called time-series (20). More recently, a method forecasting separate expected
numbers by age group has been investigated (13).
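
As a small illustration of the earliest definition above, the expected number of deaths for a week can be taken as the median of the counts for that week in non-epidemic years, and the excess as observed minus expected. The function names and the weekly counts below are hypothetical and serve only to show the arithmetic.

```python
from statistics import median


def expected_deaths(non_epidemic_counts):
    """Median number of deaths for a given week across non-epidemic years."""
    return median(non_epidemic_counts)


def excess_deaths(observed, non_epidemic_counts):
    """Observed deaths minus the median-based expected number."""
    return observed - expected_deaths(non_epidemic_counts)


# Hypothetical counts for the same calendar week in five non-epidemic years.
same_week_prior_years = [410, 395, 430, 402, 418]
print(excess_deaths(520, same_week_prior_years))
```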

The methodology of time series is appropriate for data available sequentially over
time. A time-series model generally comprises components estimating the effect of
secular trend, cycles, or year-to-year seasonal patterns. The process of model
fitting consists of identification, estimation, and diagnostic validation. One then
evaluates competing models on the basis of the fit of the models to the observed data
and of the accuracy of the forecasts.

*Historical limits of the ratio of current reports to the historical mean are calculated
as 1 plus or minus 2 times the standard deviation divided by the mean, where the mean and
the standard deviation are calculated from the 15 historical 4-week periods.

Most common methods of time-series analysis, such as the Autoregressive Integrated
Moving Average (ARIMA) models (21), are appropriate for relatively long series of data
that exhibit certain regular properties over the entire series. Differencing, or
forming a new series by subtracting adjacent observations, is generally used to create
a series with a stationary mean, that is without trend. An additional property,
stationarity of the variance, is generally required, so that the process does not
become more or less variable over time. An autoregressive model includes terms that
model the data at one point in time as a function of previous data. A moving-average
term creates a series from averages of adjacent observations and is used to model
cycles in the data.
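
Differencing, as described above, is simple to show in code. The sketch below is illustrative only: the helper name `difference` and the `lag` parameter are assumptions, with `lag=1` giving the subtraction of adjacent observations and a larger lag shown as one possible way to difference across a seasonal period.

```python
def difference(series, lag=1):
    """Form a new series by subtracting the observation `lag` steps earlier."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# A series with a steady upward trend becomes constant after differencing.
trend = [3, 5, 7, 9, 11]
print(difference(trend))

# A hypothetical series that alternates with period 2 on top of a trend.
seasonal = [10, 20, 12, 22, 14, 24]
print(difference(seasonal, lag=2))
```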

The advantage of time-series models for surveillance over other modeling methods, such
as regression, is that the estimation process accounts for period-to-period
correlations and seasonality, as well as long-term secular trends. A more detailed
description of the concepts used in time series is available elsewhere (21).

Scan Statistic

Consider this surveillance question: Is the number of cases reported for a certain
time period excessive? While ARIMA time-series methods provide one approach to the
answer, often the mechanics of this analysis are complex. The scan statistic (22)
offers a relatively simple alternative in this situation. The scan statistic is the
maximum number of reported cases (i.e., events) in an interval of predetermined length
over the time frame of interest. It is used to test the null hypothesis of uniformity
of reporting against an alternative of temporal clustering. Consider the following
setting. Surveillance data are reported over a time period T, containing k intervals
of equal length:

   n_1   n_2   ...   n_k
|-----|-----| ... |-----|
   t_1   t_2   ...   t_k
|<--------- T --------->|

where the intervals t_i, i = 1, 2, ..., k, are of equal length t
and T = t_1 + t_2 + ... + t_k.

The total number of events reported in the entire time period is called N and is the
sum of the numbers of events in each of the intervals, n_1 + n_2 + ... + n_k. Let n = max
{n_i}, i = 1, 2, ..., k, or the largest report in any of the intervals. Then compute L =
T/t, or the number of intervals in the entire time period.

The statistical question addressed by the scan statistic is: What is the probability
that the maximum number of cases in any interval of length t is equal to or exceeds n?

For example, suppose the frequencies of trisomies among karyotyped spontaneous abortions
for a defined geographic area, by calendar month of last menstrual period in 1992, are as
follows:

Month        Number of cases     Month        Number of cases

January                          July
February                         August
March                            September
April                            October
May                              November
June                             December

What is the probability of 10 or more trisomies in December given there were a total
of 40 in 1992? Using the notation defined above, N = 40; T = 12; t = 1; L = 12/1 = 12; and
n = 10. Then from tabulated values (23) the probability of 10 or more trisomies
in December, given 40 for the year, is 0.083.

40
40
40

L = 8
n     p
14    0.002
13    0.040
14    0.012
15    0.003
14    0.042

L = 12
n     p
11    0.007
10    0.083
11    0.024
12    0.006
11    0.064

n     p
10    0.007
 9    0.082
10    0.021
11    0.005
10    0.053

If the results of the scan statistic are to be useful, the lengths of the entire time
frame and the scanning interval must be determined a priori. The lack of extensive
tabulated values and the computer-intensive calculations for large sample sizes limit
the usefulness of the method. Approximations to the exact distribution are described
elsewhere (23-25).
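
Where tabulated values are not at hand, the probability can be approximated by simulation. The sketch below is one possible approach under stated assumptions and is not the method of the cited references: it treats event times as continuous, scatters N events uniformly over the period T (the null hypothesis of uniform reporting), and records how often the largest count in any sliding window of length t reaches n. The function names, the number of simulations, and the random seed are all choices made for this illustration.

```python
import bisect
import random


def scan_statistic(times, window):
    """Largest number of events in any interval of length `window`."""
    xs = sorted(times)
    best = 0
    for i, start in enumerate(xs):
        # Count events in [start, start + window]; a maximal window can
        # always be slid so that it begins at an event.
        j = bisect.bisect_right(xs, start + window)
        best = max(best, j - i)
    return best


def scan_p_value(n_obs, total, period, window, sims=4000, seed=0):
    """Monte Carlo estimate of P(scan statistic >= n_obs) under uniformity."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        times = [rng.uniform(0, period) for _ in range(total)]
        if scan_statistic(times, window) >= n_obs:
            hits += 1
    return hits / sims


# Values from the trisomy example: N = 40, T = 12 months, t = 1 month, n = 10.
p = scan_p_value(n_obs=10, total=40, period=12, window=1)
print(p)
```

With enough simulations, the estimate for the trisomy example can be compared with the tabulated value of 0.083; some simulation error is to be expected.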

ABERRATIONS IN SPACE AND TIME

Given cases of a health event reported from a defined geographic area over a defined
time period, can we say that the cases occur unusually close together in both space
and time? That is, do they form a spatial-temporal cluster? Traditional approaches
to the analysis of health-event aggregation in geographic areas have been based on
randomization arguments (26,27). A representative discussion follows.

One proposed method divides the study area into subareas (e.g., counties or census
tracts) and the study time period into intervals of constant length (e.g., month or
year) (28). The cases of the health event for each time-space "cell" are then
calculated. The maximum count within any time interval is summed across all subareas
to obtain a test statistic. This method assumes equal population density across all
area cells and has limitations (29).

In Knox's method, all possible pairs of cases are examined, and each pair is
classified according to whether the case-patients in the pair lived "close" together
and had onset of the health problem (or report) "close" in time, resulting in the
2-by-2 table:

                         Reports close in time?
                         Yes     No
Reports close    Yes      a       b
in space?        No       c       d

Under the hypothesis of no clustering, the expected number may be calculated in the
usual way, with an adjustment in the significance test, since the statistic is based
on pairs of cases (30). A brief example follows.

Consider cases of a disease with the following spatial and temporal relationships:

                         Close in space?
                         Yes     No     All
Close in time?   Yes      1       4       5
                 No       5      18      23
                 All      6      22      28

The test statistic to be computed is X = number of pairs close in space and time, 1 in
this example. We use row and column marginal totals to compute an expected value for
this cell: (6 x 5) / 28 = 1.07. Now use the Poisson distribution to compute the
probability of seeing one (or more) pairs close in space and time, given that we
expect 1.07; this value is at least 0.63. Therefore, we conclude that these data
provide no evidence for space/time clustering.
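
The arithmetic of this example can be reproduced with a short sketch. The function names, the case tuples of the form (x, y, day), and the distance and time thresholds are hypothetical; only the marginal totals (6, 5, and 28) and the observed count of 1 come from the example above, and the Poisson tail is computed directly from its definition.

```python
import math
from itertools import combinations


def knox_counts(cases, max_dist, max_days):
    """Classify every pair of cases; return (both, space, time, total)."""
    both = space = time = total = 0
    for (x1, y1, d1), (x2, y2, d2) in combinations(cases, 2):
        near = math.hypot(x1 - x2, y1 - y2) <= max_dist
        soon = abs(d1 - d2) <= max_days
        total += 1
        space += near
        time += soon
        both += near and soon
    return both, space, time, total


def knox_p_value(both, space, time, total):
    """Expected close-close pairs and Poisson P(X >= both)."""
    expected = space * time / total
    below = sum(math.exp(-expected) * expected**k / math.factorial(k)
                for k in range(both))
    return expected, 1 - below


# Marginal totals from the example: 6 pairs close in space, 5 close in time,
# 28 pairs in all, and 1 pair close in both.
expected, p = knox_p_value(both=1, space=6, time=5, total=28)
print(round(expected, 2), round(p, 2))
```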

A criticism of Knox's method is that the choice of the critical time and space
distances is arbitrary. This problem was addressed for the question of spatial
clustering (31), and the method does not require spatial boundaries or assessment of
the entire population base. An alternative approach is demonstrated by Williams (32),
with a sensitivity analysis of the time and space critical values.

A second criticism of Knox's method is that it makes no allowance for edge effects
which arise either from natural geographic boundaries (e.g., coastlines) or because
there are unrecorded cases outside the designated study region. A new method (33)
addresses this, by altering the interpretation of expected pairs of close cases and
replacing the simple count of close pairs by a weighted sum. Recently, this new
method has been applied to test the hypothesis that many non-outbreak cases of
Legionnaires' disease in Scotland are not sporadic and to attempt to pinpoint cases
clustering in space and time (34).

It is important to emphasize that because of the diverse and complicated nature of
clusters, there is no single test to assess them. The statistical sources suggested
here are intended only to augment other epidemiologic methods in a systematic,
integrated approach (35), coupled with flexibility in methods of analysis and
interpretation of significance levels.

COMPLETENESS OF COVERAGE

Statistical methods are the basis of many aspects of evaluating a public health
surveillance system (36). For example, the question of completeness of a surveillance
system is fundamental to the system's usefulness. One approach to the assessment of
completeness involves a capture-mark-recapture technique, developed for the
enumeration of wildlife populations (37) and used by the U.S. Census Bureau (38).
The method requires two parallel surveillance systems, or a surveillance system and a
survey, measuring the incidence of a single health event, and provides an estimate of
the true total number of cases of that health event and the completeness of coverage of
the two systems.

The Chandra Sekar-Deming (CSD) and Lincoln-Peterson Capture-Recapture (LPCR) Methods
suggest the following structure for the analysis. Suppose two surveillance systems
for the same health event report R and S totals respectively for some time period. In
addition, suppose it is possible to match the cases so that we know which C of the
cases are reported to both surveillance systems. This structure suggests the
following 2-by-2 table:

                              Surveillance system 1
Surveillance             Cases        Cases not     All
system 2                 reported     reported      cases

Cases reported           C            N_2           S
Cases not reported       N_1
All cases                R                          N

The CSD and LPCR methods estimate N, the total number of cases from the combined
information, and provide a confidence interval for that estimate. Using the notation
suggested in the table above,

$$N = \frac{(R+1)(S+1)}{C+1} - 1$$

$$\mathrm{Var}(N) = \frac{(R+1)(S+1)\,N_1 N_2}{(C+1)^2\,(C+2)}$$

$$95\%\ \mathrm{CI}(N) = N \pm 1.96\sqrt{\mathrm{Var}(N)}$$

Thus the completeness of each surveillance system can be calculated as follows:

Completeness of #1 = R / N
Completeness of #2 = S / N.

Consider the following example. There exist two independent surveillance systems for
hepatitis A for a location with stable population. Suppose that the events identified
in either of the two systems are true events, that the matching procedure identifies
all true matches, and only true matches are identified.

                              Surveillance system 1
Surveillance             Cases        Cases not     All
system 2                 reported     reported      cases

Cases reported           790          60            850
Cases not reported       50           X
All cases                840                        N

The estimated number of cases missed by both systems is

X = (50 x 60) / 790 = 3.8, or approximately 4.

So, the estimated number of cases in the population under surveillance is:

N = 790 + 50 + 60 + 4 = 904.

The formulas above yield a 95% confidence interval for N of 904 ± 4. The completeness
of surveillance system #1 is 840/904 or 0.93, and that of surveillance system #2 is
850/904 or 0.94.

The usefulness of results from this capture-recapture calculation is based on four
assumptions:

• Surveillance is done for a closed population.

• The matching procedure successfully identifies all true matches and, conversely,
only true matches are identified.

• All events identified in either of the two systems are true events.

• The two systems are independent.

Clearly, these are seldom if ever satisfied for public health surveillance
systems; however, this should not preclude the method as an investigative tool.
For example, at the national level, the lack of personal identifiers precludes
exact matching of cases between surveillance systems. However, other information
(age, gender, county, date of onset) may allow probability matching or estimates
of the overlap. Application of the LPCR method with more stringent or relaxed
matching criteria will yield bounds on the completeness of coverage still useful
for surveillance evaluation. For example, if we relax the matching criteria in
the table above so that 820 cases are reported to both systems, analogous
calculations show that the completeness of system #1 is 0.96, and that of system
#2 is 0.98.
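
The capture-recapture calculations in this section can be checked with a brief sketch. The function name `capture_recapture` and its return structure are assumptions made for illustration; the formulas are those given above, with the two discordant counts taken as R - C and S - C. For the relaxed-matching case, the sketch assumes that the totals R and S stay at 840 and 850 while C rises to 820.

```python
import math


def capture_recapture(r, s, c):
    """Estimate total cases from two systems reporting r and s cases,
    of which c are matched to both systems."""
    n_hat = (r + 1) * (s + 1) / (c + 1) - 1
    only_1, only_2 = r - c, s - c  # cases reported to one system only
    variance = (r + 1) * (s + 1) * only_1 * only_2 / ((c + 1) ** 2 * (c + 2))
    half_width = 1.96 * math.sqrt(variance)
    return {
        "n_hat": n_hat,
        "ci": (n_hat - half_width, n_hat + half_width),
        "half_width": half_width,
        "completeness_1": r / n_hat,
        "completeness_2": s / n_hat,
    }


# Hepatitis A example: R = 840, S = 850, C = 790.
strict = capture_recapture(840, 850, 790)
# Relaxed matching criteria: 820 cases counted as reported to both systems.
relaxed = capture_recapture(840, 850, 820)

for label, est in (("strict", strict), ("relaxed", relaxed)):
    print(label, round(est["n_hat"]), round(est["half_width"], 1),
          round(est["completeness_1"], 2), round(est["completeness_2"], 2))
```

Running the two cases side by side shows how the choice of matching criteria bounds the completeness estimates.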

SELECTION OF ANALYTIC METHODS

No single method can be used to detect all epidemics or all types of aberrations.
Several questions provide a framework for choosing an analytic method.

What is the purpose of the surveillance system? The data used for the CPEG
analyses are reported weekly by state health departments. Although each state
analyzes its own data, patterns may be apparent from the aggregated national picture
that may facilitate prevention and intervention efforts. Additionally, the data are
maintained historically for the archival purposes of measuring trends and assessing
the effects of interventions.

What is the purpose of the analytic method? Since a single method cannot be
expected to distinguish between a change in historical trend and a one-time outbreak
with unsustained increases, the analyst must identify the purpose of the analysis
before choosing an analytic method. If the nature of the data is determined and the
questions are well-defined, the results of the analytic method can be used to augment
other sources of information.

The purpose of CPEG is to facilitate the routine analysis of surveillance data and to
supplement other sources of information. The method is not useful for conditions with
long-term historical trends. When the data have complex patterns, it may be helpful
to remove (simplify) some of this pattern by modeling. The classical methods of time-
series analysis are appropriate for this situation, but these may not be accessible to
the practicing public health official.

Which conditions should be monitored? Routine analysis should be reserved and
adapted for conditions for which there are public health interventions. The CPEG
methodology is most appropriate for conditions with historical trends that do not
exhibit frequent changes in trend or level and that occur often enough so that a
single case or two does not constitute a significant flag. If the raw data are not
already analyzed for trend and period effects, and the numerator (present cases)
cannot be assumed to have the same variance as the observations in the
denominator (historical data), and if the series exhibits considerable correlation for
first-order (adjacent) observations and beyond, the CPEG method may be less powerful.
For rare conditions, the instability caused by small numbers of reported cases may
make the results unsuitable for repeated use.

What is the (person, place, or time) unit of analysis? We chose national data
for presentation of CPEG. The objective was to use as short and recent a time period
as possible for weekly publication, thus making the results useful for timely
intervention. However, variability in weekly reports reflecting factors other than
the disease process--e.g., delayed reports due to outbreaks--made the results
unstable. We then chose a 4-week window.

Because of the interest in analytic techniques for the analysis of aberrations in
surveillance data at the state level, six state health departments evaluated the
usefulness of the "CPEG" (39). During the 4-month period of study, a total of 210
episodes were observed, of which 27 episodes were flagged as exceeding historical
limits; one state had no episodes of unusual reporting. Overall, 14 episodes (52%)
represented epidemiologically confirmed outbreaks. Many were small, and none were
detected when aggregated with other state data for the national analyses. Each
disease exceeded historical limits at least twice during the study period, and for all
but meningococcal disease, at least one incident represented an outbreak. Although
the numbers are clearly small, the proportion of episodes that represented outbreaks
varies. This is expected for conditions with different epidemiology.

The five outbreaks that the health department knew about but that were not detected by
the CPEG method highlight some of its limitations. In three outbreaks, cases were not
reported nationally as current reports; thus, they were not included with the data
used for the calculation. The other two outbreaks were not detected because of
concurrent increases in the corresponding baseline.

What provision is there for updating or correcting the data using later
reports? In the NNDSS, cases are reported as early as possible and then later
confirmed or modified. The methodology of CPEG is applied to the provisional
(earliest reported) data. In our study of six states, two of the five outbreaks that
were not detected reflected late reports not included in the current reporting period.

How is the baseline determined? The choice of 5 years as a baseline period was
based on a consideration of appropriate sample size balanced by a desire to use the
same method for all conditions. Although a longer baseline might be used for some
conditions with a long reporting history, epidemics or changes in trend in the
baseline will increase the variance of the baseline and thus offset any benefit of
additional data. An additional source of variation may be increases in reporting due
to intensive investigation. In these cases, the analyst may choose to omit or adjust
the increased baseline data.

How are outbreaks in the baseline handled? CPEG as presented here does not adjust
for epidemics in the baseline. The result is a progressive decline in
sensitivity as an outbreak moves into and then out of the baseline window. To
address this point, one could use a median of the baseline reports (rather than a
mean). Unfortunately, this replacement invalidates the technique used to compute the
point for signalling aberrations, and the alternative methods for calculating this are
not as accessible to the practicing epidemiologist as the CPEG methodology.

What are the sensitivity and predictive value positive of the method?
Applied by the states, CPEG detected 14 of 19 outbreaks (74%), and 14 of the 27
episodes (52%) that exceeded historical limits were actually outbreaks. The
sensitivity (74%) and predictive value positive (52%) of CPEG in the states are
therefore quite high. Partly because of the use of provisional data, we use the
mean of the historical baseline in the calculation. We investigated the predictive
value positive of the CPEG from six state health departments by asking each
department to follow up on aberrations detected by this system. In addition, we
asked that outbreaks that came to their attention through other sources but had
not been identified by CPEG be noted.
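
The two measures quoted above follow directly from the counts in the six-state evaluation: 14 outbreaks detected, 5 known outbreaks missed, and 27 flagged episodes in all. The sketch below simply restates that arithmetic; the function names are illustrative.

```python
def sensitivity(detected_outbreaks, missed_outbreaks):
    """Proportion of all known outbreaks that the method flagged."""
    return detected_outbreaks / (detected_outbreaks + missed_outbreaks)


def predictive_value_positive(detected_outbreaks, flagged_episodes):
    """Proportion of flagged episodes that were confirmed outbreaks."""
    return detected_outbreaks / flagged_episodes


# Counts from the six-state evaluation described in the text.
sens = sensitivity(detected_outbreaks=14, missed_outbreaks=5)
pvp = predictive_value_positive(detected_outbreaks=14, flagged_episodes=27)
print(f"sensitivity = {sens:.0%}, predictive value positive = {pvp:.0%}")
```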

What are the mechanics of operation? For any analytic method to be useful, it
must be easily implemented in the routine work of the practicing epidemiologist. In
evaluating the use of CPEG at the national level, an epidemiologist routinely
evaluated each aberration, analyzed state distributions, and conveyed results to each
CDC program responsible for the control of the condition. Additional information was
provided by epidemiologists in state health departments. Investigation was based on
this evidence in addition to that obtained through other analysis. Eventually, state
health departments will have the software to generate CPEG locally.

Emergent methods provide opportunities for the future of surveillance analysis. Many
methods of pattern recognition are based on Bayesian concepts, in which a different
approach is taken to the process that generates the data--in this context, reports of
a health event.

Classical statistical theory regards the data as arising from a process with unknown
but constant parameters. The objective of classical methods, then, is to use the
observed data to estimate or make inferences about the unknown values. Bayesian
methods regard the parameters as having prior distributions, independent of the data,
and the data are used to update or refine our idea of this distribution. "The gain in
introducing the prior [distribution] is partly that it provides a way of injecting
additional information into the analysis and partly that there is a gain in logical
clarity" (40).

In the application to data generated over time and space as public health surveillance
reports, the Bayesian approach recognizes the value of information beyond the mere
data history (e.g., a change in the definition of a reportable case of AIDS) (41). In
such circumstances, no statistical model can be expected to predict such occurrences
using historical data only. "There is a tendency to overfit [sic] a particular past
realization at the expense of the unrealized future" (42). It is necessary to have a
system in which people can convey their information to the method and have the method
convey this uncertainty in a way that is useful for intervention and control.

One important application of Bayesian methodology is to increase the stability of
observed rates of health events on the basis of data for small populations. For
example, county-level mapping may provide the resolution necessary to identify regions
with potentially elevated risk, but the high variability of observed rates in counties
with small populations may mask any underlying patterns. A two-stage empirical Bayes
procedure (43) addresses this problem by augmenting information for one county with
that of all other counties. Devine (44) applied this method to mapping of injury-
related mortality rates for the United States from 1979 through 1987. This work
represents an important step towards producing meaningful maps for small areas.
However, sensitivity to model assumptions and consideration of spatial dependence
remain areas for investigation.

REFERENCES

1. Goodman RA, Buehler JW, Koplan JP. The epidemiologic field investigation:
science and judgment in public health practice. Am J Epidemiol 1990;132:9-16.

2. Openshaw S, Taylor PJ. The modifiable areal unit problem. In: Wrigley N,
Bennett RJ, eds. Quantitative geography: a British view. London: Routledge
and Kegan Paul, 1981.

3. Thacker SB, Berkelman RL, Stroup DF. The science of public health surveillance.
J Publ Hlth Pol 1989;10:187-203.

4. Lilienfeld AE, Lilienfeld DE. Foundations of epidemiology. 2nd edition.
Oxford, England: Oxford University Press, 1980.

5. MacMahon B, Pugh TF. Epidemiology: principles and methods. Boston, Ma.: Little
Brown and Co., 1970.

6. Baker AD, Margerison FM. New medical dictionary. London, England: Northcliff,
1935.

7. Last JM. A dictionary of epidemiology. 2nd edition. Oxford, England: Oxford
University Press, 1988.

8. London Times, January 11, 1989.

9. Thacker SB. The surveillance of infectious diseases. JAMA 1983;249:1181-5.

10. Centers for Disease Control. Summary of notifiable diseases, United States, 1990.
MMWR 1990;39(53).

11. Stroup DF, Williamson GD, Herndon JL, Karon JM. Detection of aberrations in the
occurrence of notifiable diseases surveillance data. Stat Med 1989;8:323-32.

12. Centers for Disease Control. Proposed changes in format for presentation of
notifiable disease report data. MMWR 1989;38(47):805-9.

13. Stroup DF, Thacker SB, Herndon JL. Application of multiple time series analysis
to the estimation of pneumonia and influenza mortality, by age, 1962-1983. Stat
Med 1989;7:1045-59.

14. Morgenstern H, Greenland S. Graphing ratio measures of effect. J Clin
Epidemiol 1990;43:539-42.

15. Stroup DF, Wharton M, Kafadar K, Dean AG. An evaluation of a method for
detecting aberrations in public health surveillance data. In press: Stat Med.

16. Centers for Disease Control. Increase in rubella and congenital rubella
syndrome--United States, 1988-1990. MMWR 1991;40:93-9.

17. Centers for Disease Control. Aseptic meningitis--New York State and United
States, weeks 1-36, 1991. MMWR 1991;40(45):773-5.

18. Collins SD. Excess mortality from causes other than influenza and pneumonia
during influenza epidemics. Publ Health Rep 1932;47:2159-80.

19. Serfling RE. Methods for current statistical analysis of excess pneumonia-
influenza deaths. Public Health Rep 1963;78:494-505.

20. Choi K, Thacker SB. An evaluation of influenza mortality surveillance, 1962-
1979. I. Time series forecasts of expected pneumonia and influenza deaths.
Amer J Epidemiol 1981;113:215-26.

21. Box GEP, Jenkins G. Time series analysis: forecasting and control. San Francisco,
Ca.: Holden-Day, 1976.

22. Wallenstein S. A test for detection of clustering over time. Am J Epidemiol
1980;111:367-72.

23. Naus JI. Approximations for distributions of scan statistics. J Amer Stat Assn
1982;77:177-83.

24. Wallenstein S, Neff N. An approximation for the distribution of the scan
statistic. Stat Med 1987;6:197-207.

25. Glaz J. Approximations and bounds for the distribution of the scan statistic.
J Amer Stat Assn 1989;84:560-6.

26. Mantel N. The detection of disease clustering and a generalized regression
approach. Cancer Res 1967;27:209-20.

27. Aldrich TE, Wilson CC, Warner SS, Easterly CE. Studying case clusters: a primer
for disease surveillance. Am J Epidemiol 1989;120:223-30.

28. Ederer F, Myers MH, Mantel N. A statistical problem in space and time: do
leukaemia cases come in clusters? Biometrics 1964;20:626-39.

29. Knox EG. The detection of space-time interaction. Appl Statist 1964;13:25-9.

30. David FN, Barton DE. Two space time interaction tests for epidemicity. Brit J
Prev Soc Med 1966;20:44-8.

31. Cuzick J, Edwards R. Spatial clustering for inhomogeneous populations.
J R Statist Soc B 1990;52:73-104.

32. Williams EH, Smith PG, Day NE, et al. Space-time clustering of Burkitt's
lymphoma in the West Nile district of Uganda: 1961-1975. Brit J Cancer
1978;37:109-22.

33. Diggle PJ, Chetwynd AG, Haggkvist R. Second order analysis of space-time
clustering. Lancaster, Pa.: Lancaster University, 1991. (Department of
Mathematics technical report) .

34. Bhopal RS, Diggle PJ, Rowlingson B. Pinpointing clusters of apparently sporadic
cases of legionnaire's disease. BMJ 1992;304:1022-7.

35. Centers for Disease Control. Guidelines for investigating clusters of health
events. MMWR 1990;39(No. RR-11).

36. Centers for Disease Control. Guidelines for the evaluation of surveillance
systems. MMWR 1988;37(S-5).

37. Eberhardt LL. Appraising variability in wildlife populations. J Wildlife
Management 1978;42:207-38.

38. Wolter KM. Accounting for America's uncounted and miscounted. Science
1991;253:12-5.

39. Wharton M, Price W, Hoesly F et al. Evaluation of a method for outbreak
detection in six states. Am J Prev Med (in press).

40. Cox DR, Hinkley DV. Theoretical Statistics. London, England: Chapman and Hall,
1974.

41. Selik RM, Buehler JW, Karon JM et al. Impact of the 1987 revision of the case
definition of acquired immune deficiency syndrome in the United States. J
Acquired Immune Deficiency Syndromes 1990;3:73-82.

42. Harrison PJ, Stevens CF. Bayesian forecasting (with discussion). J Royal Stat
Soc 1976;38:205-47B.

43. Tsutakawa RK. Mixed model for analyzing geographic variability in mortality
rates. J Am Stat Assoc 1988;83:37-42.

44. Devine OJ. A modified empirical Bayes approach for stabilizing mortality rates
in areas with small populations. Proceedings of the National Meeting of the
American Statistical Association, Atlanta, Ga., August, 1991.

Chapter VII

COMMUNICATING INFORMATION FOR ACTION

Richard A. Goodman

Patrick L. Remington

Robert J. Howard

"All I know is just what I read in the papers."

Will Rogers

DEFINITION OF THE PROBLEM: COMMUNICATING
SURVEILLANCE DATA

Standard definitions for public health surveillance specify the requirement for the
timely dissemination of findings to those who have contributed and others who need to
know (1-3). In the United States, surveillance findings have been disseminated
through the Morbidity and Mortality Weekly Report (MMWR) series of publications,
public health bulletins in states, and special reports in peer-reviewed journals.
However, even though new technologies and epidemiologic methodologies have
dramatically improved the collection and analysis of surveillance data, public health
programs have lagged in developing effective approaches to the dissemination of
surveillance findings--and to the ultimate successful communication of those findings.

As recently as the 1970s, public health surveillance in the United States focused
almost exclusively on the detection and monitoring of cases of specific communicable
diseases, and surveillance data were disseminated primarily in a basic tabular format.
However, surveillance efforts have expanded rapidly and now include chronic diseases,
injuries, occupationally acquired conditions, and other problems. In addition,
surveillance encompasses problems as diverse as personal behavior (e.g., cigarette
smoking and seat-belt use); environmental insults (e.g., hazardous materials
incidents); and preventive practices (e.g., Pap smears and mammographic screening).

Because of the fundamental changes in public health programs and priorities, programs
at all levels require innovative approaches to convey surveillance findings to new and
more diverse constituencies. This chapter provides a practical framework for
optimizing dissemination and communication of information developed through public
health surveillance efforts.

BASIC CONCEPTS FOR DISSEMINATING AND COMMUNICATING
SURVEILLANCE INFORMATION

Surveillance has been characterized as a process that provides "information for
action." This concept is inherently consistent with one definition that described
communications as "...a process, which is a series of actions or operations, always in
motion, directed toward a particular goal" (5). On the basis of this definition,
then, public health programs must ensure more than the mere transmission or
dissemination of surveillance results to others; rather, surveillance data should be
presented in a manner that facilitates their consequent use for public health actions.
One fundamental concept is that the terms "dissemination" and "communication" cannot
be used interchangeably. Dissemination is a one-way process through which information
is conveyed from one point to another. In comparison, communication is a loop
involving at least a sender and a recipient and is a collaborative process. The
communicator's job is completed when the targeted recipient of the information
acknowledges receipt and comprehension of that information.

A basic framework for disseminating the results of public health surveillance with the
intent of communicating can be adapted from fundamental models for communications.
One such model--which emphasizes the effect of communications--includes the sender, the
message, the receiver, the channel, and the impact (3). The sender is the person
responsible for surveillance of each health condition being monitored. For
applications in public health practice, this model can be modified (see Table VII.1).

Each of these steps is discussed in greater detail in the paragraphs below. They
should all be read with the understanding that one should never disseminate more
information than s/he can evaluate and revise, as needed, during the communications
process.

Establish Message

The primary message or communications objective for the findings of any public health
surveillance effort should reflect the basic purposes of the surveillance system. In
this textbook, the purposes of surveillance systems have been described (Chapters I
and II). For each of these categories, the findings and interpretation of
surveillance data may necessitate a different type of public health response. In
addition to disseminating data to those who may have contributed, the communications
objectives should also dictate the delivery of the information to the relevant target
groups and the stimulation of appropriate public health action, as illustrated below.

To detect and control outbreaks

When the purpose of a surveillance system is to detect outbreaks or other occurrences
of disease in excess of predicted levels, the primary communications objective should
be to inform two groups: a) the population at risk of exposure or disease, and b)
persons and organizations responsible for immediate control measures and other
interventions. For example, when surveillance efforts detect influenza activity in a
specific locality, public health agencies can promptly disseminate this information to
health-care providers who may, in turn, intensify efforts to vaccinate or provide
amantadine chemoprophylaxis to persons at high risk of complications from influenza.
The release and timing of such messages should be carefully considered and coordinated
with appropriate agencies.

In the context of this example, the impact of releasing a message recommending the use
of amantadine or influenza vaccine may be enhanced if the release has been coordinated
with public health units, local pharmaceutical suppliers, and medical organizations.

To determine etiology and natural history of disease

Public health surveillance for newly recognized or detected problems may be initiated
to assist in determining the epidemiology, etiology, and natural history of such
conditions. In such circumstances, the communications objective may simply be to
provide information which is sufficient to initiate surveillance.

For example, when eosinophilia-myalgia syndrome (EMS) was recognized in the United
States in October 1989, a case definition was developed and disseminated to the public
health community to enable the immediate implementation of national surveillance for
EMS (4). Surveillance efforts were critical in characterizing the epidemiology and
natural history of EMS, as well as in assisting in the development of hypotheses
regarding its cause.

To evaluate control measures

For many public health conditions, surveillance is the principal means for assessing
the impact of control measures. Epidemiologic trends and patterns that are based on
surveillance findings must be conveyed to persons involved in control efforts in order
to refine control activities and guide the allocation of resources in support of those
activities.

Following a period of relative quiescence, the incidence of measles in the United
States surged beginning in the mid-1980s. When surveillance indicated that vaccination
coverage had declined substantially in some groups (e.g., children residing in
inner-city locations), key findings were conveyed to and used by public health programs
and primary care providers in targeting measles vaccination efforts.

To detect changes in disease agents

In addition to monitoring trends in the occurrence of public health problems,
surveillance systems may be fundamental to the process of detecting changes in disease
agents and the impact of these changes on public health. For example, in the late
1980s in the United States, surveillance documented an increase in the incidence of
tuberculosis--an increase substantially in excess of predicted levels. In addition to
this overall trend, transmission of multi-drug-resistant tuberculosis (MDR-TB) was
detected in health-care and prison settings (5). The public health implications of
these findings are similar to the basic considerations outlined above for detecting
and controlling outbreaks: specifically, there is need for timely and effective
notification of populations at risk and of organizations responsible for
control/prevention measures. Therefore, in the case of MDR-TB, the communications
objectives would include immediate notification of the public health community about
the problem with the intent of facilitating implementation of proper diagnostic,
therapeutic, and preventive measures.

To detect changes in health practices

Some surveillance systems monitor changes in health practices and behaviors in the
population rather than changes in patterns of disease (6). This "life-style"
information is particularly important for problems such as chronic disease, for which
trends in risk behavior often precede changes in health outcome by years or even
decades. The communications objective in this context is often to increase awareness
regarding the role of behavior in causing disease or injury. In addition, this
information may be used to identify high risk groups in the population.

For example, surveillance data regarding trends in cigarette smoking indicate that
smoking rates have not declined among persons with lower educational attainment.
Accordingly, surveillance data which characterize risk factors (such as smoking),
outcomes, health services, and other related factors may guide public health programs
and decision makers in the implementation of targeted communitywide or statewide
intervention strategies (7).

To facilitate planning of health policies

For some conditions, the most appropriate control measure is promulgation of a public
health policy. In this context, surveillance information about the public health
impact of different conditions and problems must be effectively communicated to
legislators and public health policy makers.

For example, in California, surveillance information about smoking-attributable
mortality, morbidity, and economic costs helped in enacting Proposition 99. This
legislation provided for a 25-cent increase in the state cigarette tax which, in turn,
funded statewide initiatives to prevent and control the use of tobacco. Subsequently,
surveillance data regarding trends in the prevalence of smoking and the impact of this
initiative assisted in ensuring the application of state funds to control tobacco use.
Similarly, data for the United States have confirmed that increases in cigarette taxes
have helped in reducing cigarette smoking (8).

Define the Audience

Identification of target groups is an essential part of the process of developing
strategies for communicating surveillance results. Typically, public health
surveillance information and reports have been disseminated in a standard format with
only limited consideration of the target audiences and, more importantly, the
techniques to communicate effectively to these groups. In general, key target groups
may include public health practitioners, health care providers, professional and
voluntary organizations, policy makers (e.g., from the executive and legislative
branches of government), the press, or the public.

In some instances, surveillance information should be disseminated widely, in which
case communication strategies should be tailored to subgroups of greater interest.
For example, information regarding trends in injecting drug use (IDU)-related risks
for HIV is often communicated to the general public through the newspapers; however,
this strategy may be suboptimal for reaching the groups at highest risk, who use
alternative media such as radio and television (9).

Select the Channel

Specification of the messages and audiences for surveillance results enables selection
of the most suitable channels of communication for this information. Traditionally,
surveillance information has been disseminated through published surveillance reports.
However, in addition to conventional means for communicating with traditional
audiences, the advent of new methods and technologies has made possible improved
communications with both old and new audiences. This spectrum of communications
options includes professional and trade publications, electronic channels, broadcast
media, print media, and public forums:

• Publications: government public health bulletins and surveillance reports,
peer-reviewed public health and biomedical journals, newsletters.

• Electronic: telecommunications systems (e.g., National Electronic
Telecommunications Surveillance System [see Chapter IV], Public Health
Net), fax and batch fax, audioconferences, videoconferences.

• Media: news releases, news conferences, fact sheets, video releases.

• Public forums: briefings, hearings and testimony, conferences and other
planned meetings.

Market the Information

Once the message has been defined and the target audience and channel selected, it is
critical to assure that the information is communicated and marketed--not merely
disseminated--to those who need to know. In the decade of the 1990s, enormous
quantities of information concerning public health are communicated through
professional channels, as well as the print and electronic media. Because of the
volume of essential information, as well as time constraints, surveillance information
must be carefully tailored for presentation to each targeted audience, including
public health and health care professionals, policy makers, and the public.

To ensure that surveillance information is readily communicated to target audiences,
public health agencies should use those techniques that are most effective for
marketing information. First, as a general principle, graphic formats and other
visual displays are likely to be more effective in conveying information than
conventional tabular presentations. Such formats include maps, bar graphs,
histograms, diagrams, or other ways of visually depicting data which may not be
readily comprehended through tabular presentation. For example, in December 1989, the
Centers for Disease Control introduced a graphic format for displaying national
notifiable disease surveillance data in the Morbidity and Mortality Weekly Report
(10). This bar graph (Figure V.12), which replaced a standard table, was designed
both to facilitate interpretation of routine notifiable disease data and to enable
timely public health responses to changes in disease patterns.

Second, the principal components of the message can be focused by selecting the most
important point, then stating that point as a simple declarative sentence. This
message, termed the "single overriding communication objective" (SOCO), should
consider three questions:

• What is new?

• Who is affected?

• What works best?

For example, chronic disease surveillance data indicate that compared with
younger women, older women are less likely to have received a Pap test in the past,
are more likely to have cervical cancer diagnosed at a late stage, and have higher
mortality rates due to cervical cancer. Traditionally, this information might be
disseminated to health care and public health providers through vital statistics
reports and other published accounts about cervical cancer. However, if these
findings are to be used as a basis for action, they first must be synthesized, then
effectively communicated. Thus, in addition to presenting these findings in detailed
reports, they also may be expressed through a single message, the SOCO: "Older women
need to get regular Pap tests."

Third, techniques must be used which present (or "package") the surveillance
information in a manner which captures an audience's interest and focuses attention on
a specific issue. Examples of these techniques are the use of introductory terms such
as: "A new study . . ."; "Recent findings . . ."; and "Information recently released
. . . ." These terms are likely to appeal more to a target audience than a presentation
which begins with a conventional preface, such as "Based on recent surveillance
findings, . . . ."

Fourth, the method and forum of release of surveillance information may be critical--
particularly when a timely release is required, or when the target audiences include
the media, the public, or policy makers. Under such circumstances, news conferences
or other news releases may be considered, and should be held when they are likely to
be attended. Foremost, the presenter should involve reporters in the public health
surveillance process by "walking them through it," and should recognize opportunities
to articulate the SOCO on camera or in print. Important adjuncts for presenting the
information include readily available handouts and effective, but simple, visuals.

Evaluate the Effect

Because public health surveillance is, by definition, oriented toward action,
evaluation efforts should address two considerations: first, whether surveillance
information has been communicated to those who need to know; and second, whether the
information has had a beneficial effect upon the public health problem/condition of
interest.

Assessment of whether surveillance information has been communicated to those who
need to know may be accomplished through a process evaluation, such as by monitoring
the distribution of the information or a user survey. In particular, the
effectiveness of communication through newspapers can be evaluated by using clipping
services which determine the number of published reports, the geographic distribution
of the reports, and the proportion of the total audience to which the reports have
been circulated. In addition, process evaluation efforts should include a review of
the content of articles to assess both the accuracy and appropriateness of the
communicated message.

The second consideration--the impact of the communications effort on the public health
problem--requires an evaluation of outcomes (e.g., knowledge or practices) within
specific target audiences.

Under ideal circumstances, this type of evaluation requires surveys of the target
audiences both before and after the surveillance information has been communicated to
detect changes in levels of outcomes. The potential for such evaluation is
constrained, however, by technical and methodologic challenges, as well as substantial
resource requirements.
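
The before-and-after comparison described above can be sketched as a simple
two-proportion test. This is a minimal illustration only, assuming two independent
survey samples and a yes/no outcome (e.g., awareness of a message); the function name
and the example counts are hypothetical and do not come from any survey cited in this
chapter.

```python
import math


def compare_before_after(before_yes, before_n, after_yes, after_n):
    """Compare the proportion with an outcome in two independent survey samples.

    Assumes simple random samples and a binary outcome; both are simplifying
    assumptions made for this sketch.
    """
    p_before = before_yes / before_n
    p_after = after_yes / after_n
    difference = p_after - p_before

    # Pooled proportion under the null hypothesis of no change.
    pooled = (before_yes + after_yes) / (before_n + after_n)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / before_n + 1 / after_n))
    z = difference / std_error

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {
        "p_before": p_before,
        "p_after": p_after,
        "difference": difference,
        "z": z,
        "p_value": p_value,
    }


# Hypothetical counts: 120 of 400 respondents aware before, 160 of 400 after.
result = compare_before_after(120, 400, 160, 400)
```

A change detected this way shows only that the outcome differs between the two
surveys; attributing it to the communications effort still requires attention to the
methodologic challenges noted above.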

SUMMARY

Effective communication of public health surveillance results represents the critical
link in the translation of science into action. Recognition of the key
components in this process--including the medium, the message, the audience, the
response, and the evaluation of the process--is the first step in completing the
communications loop.

REFERENCES

1. Langmuir AD. The surveillance of communicable diseases of national importance.
N Engl J Med 1963;268:182-92.

2. Thacker SB, Berkelman RL. Public health surveillance in the United States.
Epidemiologic Rev 1988;10:164-90.

3. Hiebert RE, Ungurait DE, Bohn TW. The process of communication. In: Mass
media: An introduction to modern communication III. Longman Inc., New York,
1982, pp 15-29.

4. Centers for Disease Control. Eosinophilia-myalgia syndrome--New Mexico. MMWR
1989;38:765-7.

5. Centers for Disease Control. Nosocomial transmission of multidrug-resistant
tuberculosis among HIV-infected persons--Florida and New York, 1988-1991. MMWR
1991;40:585-91.

6. Remington PL, Smith MY, Williamson DF, Anda RF, Gentry EM, Hogelin GC. Design,
characteristics, and usefulness of state-based risk factor surveillance 1981-1986.
Public Health Rep 1988;103(4):366-75.

7. Boss LP, Suarez L. Uses of data to plan cancer prevention and control programs.
Public Health Rep 1990;105:354-60.

8. Peterson DE, Zeger SL, Remington PL, Anderson HA. The effect of state cigarette
tax increases on cigarette sales, 1955 to 1988. Am J Public Health 1992;82:94-6.

9. Centers for Disease Control. HIV-prevention messages for injecting drug users:
sources of information and use of mass media--Baltimore, 1989. MMWR
1991;40:465-9.

10. Centers for Disease Control. Proposed changes in format for presentation of
notifiable disease report data. MMWR 1989;38:805-9.

Chapter VIII

Evaluating Public Health Surveillance

Douglas N. Klaucke

"The best way to escape from a problem is to solve it."

Brandon Francis

OVERVIEW

The overall purpose of evaluating public health surveillance is to promote the most
effective use of health resources. The highest-priority public health events should
be under surveillance, and surveillance systems should meet their objectives as
efficiently as possible. Meeting each of these objectives involves evaluating
surveillance from two different perspectives; in turn, each perspective has a slightly
different emphasis in the application of the elements of surveillance evaluation.

TYPES OF EVALUATION

The first level of evaluation answers the question, "Should this health event be under
surveillance?" This question should be answered from a perspective external to the
surveillance system itself. It is the first question that should be asked when
deciding whether to start a new system or before conducting a detailed evaluation of
an existing one. This "external" evaluation is primarily an assessment of the public
health importance of a health event and how its importance compares with that of other
health events. Once a health event is identified as being of high priority, it is
important to consider both the feasibility and cost of conducting surveillance for
that event. If this first-level evaluation leads to a decision to discontinue a
surveillance system, a detailed evaluation of that system is superfluous.

The second level evaluates an operating surveillance system for a high-priority health
event to increase the system's utility and efficiency. This type of evaluation may
also compare two or more systems involving the same health event. This type of
evaluation will determine whether the system is meeting its objectives, serving a
useful public health function, and operating as efficiently as possible. It should
include at least the following steps:

• An explicit statement of the purposes and objectives of the system

• A description of its operation

• Documentation of how the surveillance system has been useful

• An assessment of the different quantitative and qualitative attributes,
and

• Estimates of the cost of the system.

The goal is to maximize the system's usefulness and to achieve the simplest, least
expensive system that meets its objectives.

ADAPTING THE EVALUATION

Although all systems should be assessed for their purpose and usefulness, specific
attributes described below that are critical to one system may be less important to
another. Efforts to improve certain attributes--such as the ability of a system to
detect a health event--may detract from other attributes--such as simplicity or
timeliness. Thus, the success of an individual surveillance system depends on the
proper balance of characteristics, and the strength of an evaluation depends on the
ability of the evaluator to assess these characteristics with respect to the system's
objectives. Any approach to evaluation must therefore be flexible.

Determining the most efficient approach to surveillance for a given health event is an
art. There is room for creativity and opportunity to combine scientific rigor with
practical realities. The methods discussed in this chapter should be used as a guide
to the types of questions that need to be answered about the system. Each evaluation
should be individually tailored. Few evaluations address fully all of the methods
outlined in this chapter, and many profitably focus on only one or two major
attributes, such as sensitivity and timeliness (1-3). Some of these elements may also
be useful for evaluating other health-information systems or evaluating the value of
secondary data sources for surveillance.

Each of the listed aspects of a surveillance evaluation will be discussed in the
sections that follow: public health importance, objectives and usefulness, operation
of the system and qualitative attributes (simplicity, flexibility, and acceptability),
quantitative attributes (sensitivity, predictive value positive, representativeness,
and timeliness), and cost. This chapter continues the process through which methods
for evaluating public health surveillance systems evolve (4,5).

PUBLIC HEALTH IMPORTANCE

The public health importance of a health event and the need for surveillance of that
health event can be described in a variety of ways. Health events that affect many
people or require large expenditures of resources are clearly important in a public
health context. However, health events that affect relatively few persons may also be
important, especially if the events cluster in time and place--e.g., a limited
outbreak of a severe disease. At other times, public concerns may focus attention on
a particular health event, creating or heightening the sense of importance associated
with it. Health problems that are now rare because of successful control measures may
be perceived as "unimportant," but their level of importance should be assessed on the
basis of their potential to reemerge. Finally, the public health importance of a
health event is influenced by its preventability and the ability of public health
action to influence it.

Some measures of the importance of a health event, and, therefore, the surveillance
system that monitors it, include the following:

• Magnitude of the problem: Total number of cases, incidence, and
prevalence.

• Severity: Mortality rate and case-fatality ratio.

• Morbidity: Physician visits, hospital days.

• Premature mortality: Years of potential life lost (YPLL).

• Economic cost: Costs of medical care, lost productivity.

• Preventability: Prevented fraction.
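
Several of the measures listed above reduce to short calculations. The sketch below is
illustrative only: the function names are hypothetical, the reference age of 65 years
for YPLL is an assumption (other cutoffs are also used), and the prevented fraction is
written in one common form, as the share of expected occurrence that did not occur.

```python
def incidence_rate(new_cases, population, per=100_000):
    """New cases in a period per `per` persons in the population at risk."""
    return new_cases / population * per


def case_fatality_ratio(deaths, cases):
    """Proportion of cases of the condition that end in death."""
    return deaths / cases


def years_of_potential_life_lost(ages_at_death, reference_age=65):
    """Sum of years lost before the reference age (assumed here to be 65).

    Deaths at or after the reference age contribute zero years.
    """
    return sum(max(reference_age - age, 0) for age in ages_at_death)


def prevented_fraction(rate_without_control, rate_with_control):
    """Share of the expected rate that was averted by the control measure."""
    return (rate_without_control - rate_with_control) / rate_without_control


# Hypothetical figures, for illustration only.
rate = incidence_rate(new_cases=50, population=200_000)
cfr = case_fatality_ratio(deaths=5, cases=50)
ypll = years_of_potential_life_lost([30, 60, 70])
pf = prevented_fraction(rate_without_control=100.0, rate_with_control=20.0)
```

No single measure ranks health events on its own; a rare condition can have a low
incidence rate and still carry a high case-fatality ratio or a large YPLL total.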

Measures of importance used should take into account the effect of existing control
measures. For example, the number of cases of vaccine-preventable illness has
declined following the implementation of school immunization laws, and the public
health importance of diseases in this category is underestimated by case counts
alone. In such instances, it may be possible to estimate the number of cases that
would be expected in the absence of control programs (6).

Preventability can be defined at several levels--from preventing the occurrence of
disease (primary prevention), through early detection and treatment (secondary
prevention), to minimizing the effects of the health problem among those already ill
(tertiary prevention). From the perspective of surveillance, preventability reflects
the potential for effective public health interventions at any of these levels.

The need for surveillance may also be affected by factors other than those mentioned
above. Political and public pressure may affect whether surveillance is undertaken--
or, at the other extreme, forbidden--for a specific health event. Regulations, laws,
and public health programs may be implemented on the basis of considerations other
than those listed above. However, it is still important to make the scientific
criteria as clear and explicit as possible.

Even when using quantitative measures, judgment is necessary to decide which criteria
are most relevant for each condition. It is important to make these judgments as
explicit--and as early--as possible.

Attempts have been made to quantify the public health importance of health
conditions. Dean described such an approach that involved using a score that
accounted for age-specific mortality and morbidity rates and health-care costs (7).
The Canadian Laboratory Centre for Disease Control has used explicit criteria in
setting national surveillance priorities for communicable diseases. Their criteria
include the parameters listed above, plus several others such as interest on the part
of the World Health Organization or the Department of Agriculture (Canada), potential
for outbreaks, public perception of risk, and necessity for immediate public health
response. Their ratings for 60 communicable diseases can be useful in setting
priorities for initiating a surveillance system (8).
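
A scoring approach of this general kind can be sketched as a weighted sum of ratings.
The example below is hypothetical throughout: the criterion names, the 0-5 rating
scale, the weights, and the condition labels are assumptions chosen for illustration
and do not reproduce the scoring systems described in references 7 or 8.

```python
# Hypothetical criteria, loosely modeled on the measures named in this section.
CRITERIA = (
    "magnitude",
    "severity",
    "economic_cost",
    "preventability",
    "outbreak_potential",
)


def priority_score(ratings, weights=None):
    """Return the weighted sum of 0-5 ratings across all criteria."""
    if weights is None:
        weights = {name: 1.0 for name in CRITERIA}
    missing = [name for name in CRITERIA if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return sum(weights[name] * ratings[name] for name in CRITERIA)


def rank_conditions(ratings_by_condition, weights=None):
    """Return condition names ordered from highest to lowest score."""
    return sorted(
        ratings_by_condition,
        key=lambda name: priority_score(ratings_by_condition[name], weights),
        reverse=True,
    )


# Illustrative ratings for two unnamed conditions.
example = {
    "condition_a": {"magnitude": 4, "severity": 2, "economic_cost": 3,
                    "preventability": 5, "outbreak_potential": 1},
    "condition_b": {"magnitude": 1, "severity": 5, "economic_cost": 2,
                    "preventability": 2, "outbreak_potential": 4},
}
ranking = rank_conditions(example)
```

Writing the weights down makes the judgments behind a ranking explicit: changing the
weight given to outbreak potential, for instance, can reverse the order of the two
conditions above.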

SYSTEM OBJECTIVES AND USEFULNESS

The most important steps in evaluating a surveillance system are a) describing the
health event(s) under surveillance, b) stating explicitly the objectives of the
system, and c) describing how the system has actually been used to help prevent and/or
control disease or injury. These three steps alone often sufficiently indicate how
the system can be improved.

Case definition(s) should be specified, which include symptoms, signs, laboratory
results, and epidemiologic information; a scale of severity; and the different levels
of confidence in the diagnosis for each case, such as "suspected," "probable," and
"confirmed." Case definitions for nationally notifiable diseases have been published
for Canada and the United States (9,10). Table VIII.1 outlines a case definition
developed by the Centers for Disease Control (CDC) and the U.S. Council of State and
Territorial Epidemiologists.

The possible objectives of surveillance systems and the uses of surveillance
information are very similar and have been reviewed in Chapter I.

A surveillance system might also meet a statutory requirement based on political
necessity or public pressure or might identify cases for additional studies. There
may also be objectives, such as meeting the reporting requirements of the World Health
Organization, that might not be of immediate or direct benefit to the agency operating
the surveillance system.

The usefulness of a system should be described specifically, including the actions
that have been taken as a result of the data and analysis from the surveillance
system, and who used the data to make decisions and take actions. Other anticipated
uses of the data should be noted and their feasibility determined.

A surveillance system should contribute to the control and prevention of adverse
health events. This process may include an improved understanding of the public
health consequences of the events. A surveillance system can also be useful if it
determines that an adverse health event previously thought to have public health
importance actually does not.

An assessment of the usefulness of a surveillance system begins with a review of the
objectives of the system and should consider the dependence of policy decisions and
control measures on the surveillance system. Depending on the objectives of a
particular surveillance system, the system may be considered useful if it
satisfactorily addresses one or more of the following questions. Does the system,
e.g.,

• detect trends signaling changes in the occurrence of the health problem in
question?

• detect epidemics?

• provide estimates of the magnitude of morbidity and mortality related to
the health problem being monitored?

• stimulate epidemiologic research likely to lead to control or prevention?

• identify risk factors involved in the occurrence of the health problem?

• permit assessment of the effects of control measures?

• lead to improved clinical practice by the health-care providers who are
the constituents of the surveillance system?

Usefulness may be affected by all the attributes of surveillance described below.
Increased sensitivity may afford a greater opportunity for identifying epidemics and
understanding the natural course of an adverse health event in a community. More
rapid reporting allows more timely control and prevention activities. Increased
specificity enables public health officials to focus on productive activities. A
representative surveillance system will characterize more accurately the epidemiologic
features of a health event in the population.

OPERATION OF THE SYSTEM

To evaluate a surveillance system, one must know how it operates (see Chapter IV).
The system description should include the following:

• The people and organizations involved,

• The flow of information (up and down),

• Mechanisms of information transfer,

• Frequency of reporting and feedback, and

• Quality control.

The evaluation should address the following questions. What is the population being
monitored? Who is responsible for reporting a case (and to which public health
agency)? What information is collected on each case, and who is responsible for
collecting it? If there are multiple administrative levels represented in the system,
how are the data transferred from one level to another? How is information stored?
Who analyzes the data? How are they analyzed, and how often? Are there preliminary
and final tabulations, analyses, and reports? How often are reports disseminated? To
whom? By what mechanisms/media are the reports distributed? Are there any
"automatic" responses to case reports (e.g., follow-up of individual cases of rabies,
botulism, or poliomyelitis)?

A diagram is often useful to summarize the relationship between the various components
of a system (Figure VIII.1).

ATTRIBUTES OF THE SYSTEM

Each surveillance system has characteristics or attributes that contribute directly to
its ability to meet its specific objectives. The combination of these attributes
determines the strengths and weaknesses of the system. The attributes must be
balanced against each other (e.g., high sensitivity may only be possible with a
complex reporting system from a wide array of providers).

QUALITATIVE ATTRIBUTES:

Simplicity and Flexibility

In describing a surveillance system, three desirable qualitative attributes should be
addressed: simplicity, flexibility, and acceptability.

Simplicity of a surveillance system refers both to its structure and to its ease of
operation. Surveillance systems should be as simple as possible, while still meeting
their objectives. It may be useful to think of the simplicity of a surveillance
system from two perspectives: the design of the system and the size of the system.
The following measures might be considered in evaluating the simplicity of a system:

• Amount and type of information necessary to establish a diagnosis,

• Number and type of reporting sources,

• Method(s) of transmitting case information/data,

• Staff training requirements,

• Type and extent of data analysis,

• Amount of computerization,

• Methods of distributing reports, and

• Amount of time spent operating the system.

The cost estimates for a system are also an indirect indicator of simplicity. Simple
systems usually cost less than complex ones. Another consideration is the ability of
the system to adapt to changing needs such as the addition of new conditions or
data-collection elements. This characteristic is termed "flexibility."

Acceptability

Acceptability reflects the willingness of individuals and organizations to participate
in the surveillance system. This attribute refers to the acceptability of the system
to health department staff and, at least equally importantly, to persons outside the
sponsoring agency (e.g., doctors or laboratory staff) who are asked to report cases
of certain kinds of health problems. To assess acceptability, one must consider the
points of interaction between the system and its participants, including subjects
(persons identified as having cases) and reporters. Indicators of acceptability
include the following: a) subject or agency participation rates; b) interview
completion rates and question refusal rates, if the system involves case interviews;
c) completeness of report forms; d) physician, laboratory, or hospital/facility
reporting rates; and e) timeliness of reporting.

QUANTITATIVE ATTRIBUTES

The four quantitative attributes of a surveillance system are sensitivity,
predictive value positive, representativeness, and timeliness. These are often
difficult to measure precisely, but even indirect estimates can be useful in helping
to improve the efficiency of a system and in comparing it with other systems.

Sensitivity

The sensitivity of a surveillance system can be considered on two levels. First, the
completeness of case reporting, i.e., the proportion of cases of a disease or health
condition that are detected by the surveillance system (Table VIII.2), can be
evaluated. Second, the system can be evaluated for its ability to detect epidemics
(11) (see Chapters V and VI).

The sensitivity of a surveillance system is affected by the likelihood that

• persons with certain health conditions seek medical care;

• the condition is correctly diagnosed, which reflects the skill of care
providers and the accuracy of diagnostic tests; and

• the case is reported to the system, once it has been diagnosed.

These factors also apply to surveillance systems that do not fit the traditional
disease/care-provider model. For example, the sensitivity of a telephone-based
surveillance system of morbidity or risk factors would be affected by

• the number of people who have telephones, who are at home when the
surveyor calls, and who agree to participate;

• the ability of persons to understand and correctly answer the questions;
and

• the willingness of respondents to report their status.

The extent to which these questions are explored depends on the system and on the
resources available for the evaluation. The measurement of sensitivity in a
surveillance system requires the validation of information collected through the
system, so as to distinguish accurate from inaccurate case reports, and the collection
of information external to the system, so as to determine the frequency of the
condition in a community (i.e., a "gold standard") (22). From a practical
standpoint, the primary emphasis in assessing sensitivity, assuming that most reported
cases are correctly classified, is estimating what proportion of the total number of
cases in the community is being detected by the system. If this proportion is
estimated using methods that compare two or more surveillance systems, none of which
is a "gold standard," then this proportion should be called an estimate of
"completeness of coverage" rather than of sensitivity. (See also Chapter VI on
capture-recapture.)
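
As a minimal sketch of how two systems might be compared when neither is a "gold
standard," the two-source capture-recapture estimate can be written as follows. The
function name and the counts are hypothetical, and the calculation assumes that the two
sources report independently and that cases are matched accurately between them;
neither assumption comes from the text above.

```python
def two_source_completeness(both, only_a, only_b):
    """Estimate total cases and each system's completeness of coverage.

    both   -- cases found by both systems
    only_a -- cases found only by system A
    only_b -- cases found only by system B
    """
    if both <= 0:
        raise ValueError("at least one matched case is needed")
    in_a = both + only_a          # all cases reported to system A
    in_b = both + only_b          # all cases reported to system B
    # Two-source estimate of the true number of cases in the community.
    estimated_total = in_a * in_b / both
    return {
        "estimated_total": estimated_total,
        "completeness_a": in_a / estimated_total,
        "completeness_b": in_b / estimated_total,
    }


# Hypothetical counts, for illustration only.
result = two_source_completeness(both=40, only_a=20, only_b=10)
```

If the sources are positively dependent (cases known to one are more likely to be known
to the other), an estimate of this kind tends to understate the total and overstate
completeness.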

A surveillance system that does not have high sensitivity can still be useful in
monitoring trends, as long as the sensitivity and predictive value positive remain
reasonably constant. Questions concerning sensitivity in surveillance systems most
commonly arise when changes in patterns of occurrence of the health problem are
noted. Changes in sensitivity can be precipitated by heightened awareness of a health
problem, introduction of new diagnostic tests, or changes in the method of conducting
surveillance. A search for such surveillance "artifacts" is often an initial
step in investigating an outbreak.

Several evaluations have looked at the sensitivity or completeness of coverage of
surveillance systems (13-15).

Predictive value positive

Predictive value positive (PVP) is defined as the proportion of persons identified as
case-patients who actually have the condition being monitored (11). In Table VIII.2
above, this is represented by A/(A+B).
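
The two proportions can be sketched from the cells of a two-by-two table. The PVP
expression A/(A+B) is the one given above; the use of cell C for true cases missed by
the system, and therefore sensitivity as A/(A+C), follows the conventional layout of
such tables and is an assumption here, as are the function names and counts.

```python
def sensitivity(a, c):
    """Proportion of true cases detected by the system: A / (A + C)."""
    return a / (a + c)


def predictive_value_positive(a, b):
    """Proportion of reported cases that are true cases: A / (A + B)."""
    return a / (a + b)


# Hypothetical counts, for illustration only.
a, b, c = 80, 20, 120   # true cases reported, non-cases reported, true cases missed
sens = sensitivity(a, c)               # 80 / 200
pvp = predictive_value_positive(a, b)  # 80 / 100
```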

In assessing PVP, primary emphasis is placed on the confirmation of cases reported
through the surveillance system. Its effect on the use of public health resources can
be considered on two levels. At the level of an individual case, PVP affects the
amount of resources required for investigation of cases. For example, where every
reported case of hepatitis A is promptly investigated by a public health nurse, and
family members at risk are referred for a prophylactic immune globulin injection, each
reported case generates a requirement for follow-up. A surveillance system with low
PVP and therefore frequent "false-positive" case reports would lead to resources being
wasted on cases that do not, in fact, exist.

The other level is that of detection of epidemics. A high rate of erroneous case
reports over the short term might trigger an inappropriate outbreak investigation, and
conversely, a constant high level of "false-positive" reports might mask a true
outbreak. In assessing this attribute, we want to know what proportion of epidemics
identified by the surveillance system are "true epidemics."

Calculating the PVP requires confirmation of all cases. Interventions initiated on
the basis of information obtained from the surveillance system should be documented
and kept on file. Personnel activity reports, travel records, and telephone logbooks
may all be useful in estimating the impact of the PVP on the detection of epidemics.

A low PVP means that a) non-cases are being investigated, and b) there may be mistaken
reports of epidemics. "False-positive" reports to surveillance systems lead to
unnecessary interventions, and falsely detected "epidemics" lead to costly
investigations. A surveillance system with high PVP will lead to less
unnecessary and inappropriate expenditure of resources (16).

The PVP for a health event may be enhanced by clear and specific case definitions.
Good communication between the persons who report cases and staff operating the
surveillance system can also improve PVP. The sensitivity and specificity of the case
definition, as well as the prevalence of the condition in the population, contribute to
the PVP (Table VIII.2); the PVP increases with increasing specificity and prevalence.
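
The dependence of PVP on specificity and prevalence can be illustrated with the
standard relationship between these quantities (Bayes' rule). This is a sketch only:
the function name and the numerical values are hypothetical and are not drawn from
Table VIII.2.

```python
def pvp_from_case_definition(sens, spec, prevalence):
    """PVP implied by a case definition applied to a population.

    sens, spec, and prevalence are proportions between 0 and 1.
    """
    true_positive = sens * prevalence
    false_positive = (1.0 - spec) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical values, for illustration only.
baseline = pvp_from_case_definition(0.90, 0.95, 0.01)
more_specific = pvp_from_case_definition(0.90, 0.99, 0.01)   # higher specificity
more_prevalent = pvp_from_case_definition(0.90, 0.95, 0.10)  # higher prevalence
```

With sensitivity held fixed, both the more specific case definition and the more
prevalent condition yield a higher PVP than the baseline.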

Sensitivity and predictive value positive are inversely related. The balance between
assuring that all (or almost all) cases are identified (high sensitivity) and few
false positives are identified (high PVP) must be based on the level of importance
accorded to identifying all cases (e.g., for rabies or meningococcal meningitis) and
the ability to use an indicator of the disease in the community (e.g., use of
Salmonella laboratory isolates).

Representativeness

A truly representative surveillance system accurately describes the occurrence of a
health event over time and its distribution in the population by place and person.

Representativeness is assessed by comparing the characteristics of reported events
with those of all such events that occurred. Although this information is not
generally available in specific detail, some judgment of the representativeness of
surveillance data is possible, on the basis of knowledge of the following factors:

• characteristics of the population, e.g., age, socioeconomic status, and
geographic location (17);

• natural history of the condition, e.g., latency period, fatal outcome;

• prevailing medical practices, e.g., sites performing diagnostic tests, and
physician-referral patterns (18,19);

• multiple sources of data, e.g., mortality rates for comparison with data
on incidence, laboratory reports for comparison with physician reports.

Representativeness can also be examined through special studies of a representative
sample of the population (16).

The points at which bias can enter a surveillance system and decrease
representativeness are illustrated in Figure VIII.2.

Case ascertainment bias (Representativeness)

This might also be called "sampling bias" and is the differential identification
and/or reporting of cases from different populations or over time.

In order to generalize findings from surveillance data to the population at large, the
data from a surveillance system should reflect the population characteristics that are
important to the goals and objectives of that system. These characteristics generally
relate to time, place, and person. An important result of evaluating the
representativeness of a surveillance system is the identification of subgroups in the
population that may be systematically excluded from the reporting system. This will
enable appropriate modification of data-collection practices and more accurate
projections of incidence of the health event in the target population.

Changes in reporting practices over time can introduce bias into the system and make
it difficult to follow long-term trends or establish baseline rates to be used for the
recognition of outbreaks. For example, switching from a passive to an active system
or changing reporting sources may change the sensitivity of the system. Publicity can
also increase rates of reporting in passive systems (20). While more complete
reporting is desirable in principle, it is difficult to predict how a change in
reporting practices or in publicity associated with the reportable condition will
change the proportion of cases reported.

Differences in reporting practices by geographic location can bias the
representativeness of the system. For example, the National Notifiable Diseases
Surveillance System (NNDSS) aggregates data collected independently by the 50 states,
Washington, D.C., and several territories. For some infectious diseases, some states
collect data only from laboratories, whereas other states also accept cases reported
by health practitioners (21). Also, despite efforts to achieve consistency, case
definitions are not standardized across state and territorial boundaries (10).

Differential reporting rates of cases may occur in association with different
characteristics of the person, so that cases among certain subpopulations may be less
likely to be reported than those among other groups. For example, an evaluation of
reporting on viral hepatitis in a county in Washington State suggested that cases of
hepatitis B were underreported among homosexual men and that cases of non-A, non-B
hepatitis were underreported among persons exposed to blood transfusions. The importance
of these risk factors as contributors to the occurrence of these diseases was
apparently underestimated, as indicated by the selective underreporting of certain
hepatitis cases (22).

Bias in descriptive information about a reported case

Given that a case of a reportable health condition has been identified and reported,
there may be errors in the collection and recording of descriptive information about
the case, or "information bias."

Most surveillance systems collect more than simple case counts. Information commonly
collected includes the demographic characteristics of affected persons, details about
the health event, and the presence or absence of defined potential risk factors. The
quality, usefulness, and representativeness of this information depend on its
completeness and validity.

Quality of data is influenced by the clarity of the information forms, the training
and supervision of persons who complete surveillance forms, and the care exercised in
management of data. A review of these facets of a surveillance system provides an
indirect measure of quality of data. An examination of the percentage of "unknown" or
"blank" responses to items on surveillance forms or questionnaires is
straightforward. Assessing the validity of responses requires special studies, such
as chart reviews or re-interviews of respondents.
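
The percentage of "unknown" or "blank" responses for an item can be tallied as in the
following sketch. The record layout, the field name, and the values treated as
missing are all hypothetical; a real system would use whatever codes its own forms
define.

```python
def percent_missing(records, field):
    """Percentage of records in which `field` is blank, absent, or 'unknown'."""
    if not records:
        raise ValueError("no records to examine")
    missing = 0
    for record in records:
        value = record.get(field)
        if value is None or str(value).strip().lower() in ("", "unknown"):
            missing += 1
    return 100.0 * missing / len(records)


# Hypothetical report forms, for illustration only.
forms = [
    {"age": 34, "race": "White"},
    {"age": 51, "race": ""},
    {"age": 8, "race": "Unknown"},
    {"age": 27},                      # item not recorded at all
]
race_missing = percent_missing(forms, "race")
age_missing = percent_missing(forms, "age")
```

A count of this kind measures completeness only; it says nothing about whether the
recorded values are valid.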

Errors and bias can make their way into a surveillance system at any stage in the
reporting and assessment process. Because surveillance data are used to identify
high-risk groups, to target interventions, and to evaluate interventions, it is
important to be aware of the strengths and limitations of the information in the
system.

So far, the discussion of attributes has been aimed at the information collected for
cases, but many surveillance systems also involve calculating morbidity and mortality
rates. The denominators for these rate calculations are often obtained from a
separate data system maintained by another agency, such as the Bureau of the Census or
the National Center for Health Statistics of CDC. Although these data are regularly
evaluated, thought should be given to the comparability of categories (e.g., race,
age, or residence) used in the numerator and denominator of rate calculations.

Several studies have looked at quality-assurance problems associated with surveillance
data. A sample of National Electronic Injury Surveillance System (NEISS) records was
compared with emergency-room records to assess the quality of data recorded in the
surveillance system (23). A study of the quality of national malaria surveillance reports
was carried out in the United Kingdom (24). The quality of Behavioral Risk Factor
Surveillance System (BRFSS) data, which are obtained through monthly telephone
surveys, for behavioral risks associated with cardiovascular problems has been
examined in California (25). CDC has also examined the completeness of race-ethnicity
reporting in the NNDSS (26).

Timeliness

Timeliness reflects the delay between any two (or more) steps in a surveillance
system. The timeliness of the system can best be assessed by the ability of the
system to take appropriate action based on the urgency of the problem and the nature
of the public health response. Four points of time in the surveillance process are
most often considered when measuring timeliness: a) time of onset of disease or
occurrence of an injury, b) time of diagnosis, c) time the report of the case is received by
the public health agency responsible for control activities, and d) time of implementation
of control activities. Usually one of the first two points of time (a or b) is used
as the starting point, and each of the other two points (c, d) is used as an end
point.

Timeliness is usually measured in days or weeks, but in hospital settings it might be
measured in hours; for diseases that do not necessitate an immediate response, it
might be measured in months or even years.
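
A reporting delay between any two of the time points can be summarized as in this
sketch, which computes the median number of days from onset to receipt of the report.
The field names and dates are hypothetical and serve only to show the arithmetic.

```python
from datetime import date
from statistics import median


def median_delay_days(cases, start="onset", end="received"):
    """Median delay, in days, between two dated steps for each case."""
    delays = [(case[end] - case[start]).days for case in cases]
    return median(delays)


# Hypothetical case reports, for illustration only.
cases = [
    {"onset": date(1992, 3, 1), "received": date(1992, 3, 10)},  # 9 days
    {"onset": date(1992, 3, 3), "received": date(1992, 3, 15)},  # 12 days
    {"onset": date(1992, 3, 5), "received": date(1992, 3, 25)},  # 20 days
]
delay = median_delay_days(cases)
```

The median is used here because reporting delays are typically skewed by a few very
late reports; the same function can be applied to other pairs of steps by changing
the `start` and `end` arguments.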

Evaluations of the timeliness with which shigellosis is reported in two different
surveillance systems in the United States found median delays of 11 and 12.5 days from
time of onset of illness to receipt of report by the public health agency responsible
for control measures. This delay did not allow public health officials to intervene
in a timely manner to prevent the occurrence of secondary or tertiary cases. However,
such a time frame might still allow for effective intervention in settings, such as
day-care facilities, in which outbreaks may persist for weeks or months (27). Another
study of timeliness in the reporting of salmonellosis, shigellosis, hepatitis A, and
bacterial meningitis looked at the reporting delay between date of onset and date of
report to the CDC (3). Median reporting delays ranged from 20 days for bacterial
meningitis to 33 days for hepatitis A. Wide variations in reporting delays were found
between states as well. A study in Australia showed that reports of infectious
diseases from laboratories were received by the Medical Officer of Health in a
substantially shorter time than those received from medical practitioners (13).

In contrast, if there is a long latency between exposure and appearance of disease,
the rapid identification of cases of illness may not be as important as the rapid
availability of data to interrupt and prevent exposures that lead to disease.

The need for rapid reporting to a surveillance system depends on the nature of the
public health problem under surveillance and the objectives of the system. Recently,
computer technology has been integrated into surveillance systems and may promote
timeliness of reporting (28,29).

COST

The final descriptive element is an estimation of the resources used to operate the
system. The estimates generally are limited to direct costs and include the costs of
personnel and resources required for collecting, processing, and analyzing
surveillance data, as well as for the dissemination of information resulting from the
system.

Personnel costs may be determined from an estimate of the time that different personnel
spend operating the system. While this can be expressed as person-time expended
per year of operation, it is preferable to convert the estimate to dollar costs by
multiplying the person-time by appropriate salary and benefit figures.
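
The conversion from person-time to dollar costs might be sketched as follows. The
staff categories, hours, salaries, and benefit rates are hypothetical, as is the
choice to express benefits as a fraction of salary.

```python
def personnel_cost(staff):
    """Annual personnel cost: person-time multiplied by salary plus benefits."""
    total = 0.0
    for person in staff:
        hourly_cost = person["hourly_salary"] * (1.0 + person["benefit_rate"])
        total += person["hours_per_year"] * hourly_cost
    return total


# Hypothetical staffing figures, for illustration only.
staff = [
    {"role": "public health nurse", "hours_per_year": 200,
     "hourly_salary": 20.0, "benefit_rate": 0.25},
    {"role": "data clerk", "hours_per_year": 100,
     "hourly_salary": 10.0, "benefit_rate": 0.20},
]
annual_cost = personnel_cost(staff)
```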

Other costs may include those associated with travel, training, supplies, equipment,
and services such as mail, telephone, rent, and computer time.

The resources required at all relevant levels of the public health system (from the
local health-care provider to municipal, county, state, and federal health agencies)
should be included.

The approach to resources described here includes only those personnel and material
resources required for the direct operation of surveillance. A more comprehensive
evaluation of costs should examine consequential or indirect costs, such as follow-up
laboratory testing or treatment, case investigations or outbreak control resulting
from surveillance, costs of secondary data sources (e.g., vital statistics or survey
data), and costs averted (benefits) by surveillance.

Costs are judged relative to benefits, but few evaluations of surveillance systems
have included a formal cost-benefit analysis, and such analyses are beyond the scope
of this chapter. Estimating benefits, such as savings resulting from morbidity
prevented through surveillance, may be possible in some instances, although this
approach does not take into account the less tangible benefits that may result from
surveillance systems. More realistically and in most instances, costs should be
judged with respect to the objectives and usefulness of a surveillance system.

Alternative data collections may be compared based on their costs and number of cases
identified (see also Chapter XII). For example, in Vermont, two methods of collecting
surveillance data were compared. The "passive" system was already in place and
comprised unsolicited reports of notifiable diseases to the district offices or the
state health department. The "active" system was implemented in a
probability sample of physicians' practices. Each week a health department employee
called these practices to solicit reports of selected notifiable diseases. In
comparing the two systems, an attempt was made to estimate associated costs. The
resource estimates directly applied to the surveillance systems are shown in Table
VIII.3. The active system identified an additional 23 cases at an average cost of
$861 per case.
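
One way to compare two data-collection methods on cost and yield is to divide the
difference in cost by the difference in cases identified. The sketch below shows that
arithmetic under stated assumptions: the function name is hypothetical, the figures
are invented for illustration and are not those of Table VIII.3, and the sketch does
not claim to reproduce how the Vermont figure was derived.

```python
def cost_per_additional_case(cost_new, cost_existing, cases_new, cases_existing):
    """Extra cost of the new method per extra case it identifies."""
    additional_cases = cases_new - cases_existing
    if additional_cases <= 0:
        raise ValueError("the new method identified no additional cases")
    return (cost_new - cost_existing) / additional_cases


# Hypothetical figures, for illustration only.
example = cost_per_additional_case(
    cost_new=30000.0, cost_existing=10000.0,
    cases_new=75, cases_existing=50,
)
```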

RECOMMENDATIONS

On the basis of the evaluation, an assessment of how well the surveillance system is
meeting its current objectives should be made (Table VIII.4). Modifications to the
system to enhance its usefulness and improve its attributes should be considered. A
regular review of each surveillance system should assure that systems remain
responsive to contemporary public health needs.

REFERENCES

1. Harkess JR, Gildon BA, Archer PW, Istre GR. Is passive surveillance always
insensitive? An evaluation of shigellosis surveillance in Oklahoma. Am J
Epidemiol 1988;128:878-81.

2. Modesitt SK, Hulman S, Fleming D. Evaluation of active versus passive AIDS
surveillance in Oregon. Am J Public Health 1990;80:463-4.

3. Birkhead G, Chorba TL, Root S, Klaucke DN, Gibbs NJ. Timeliness of national
reporting of communicable diseases: the experience of the National Electronic
Telecommunications System for Surveillance. Am J Public Health 1991;81:1313-5.

4. Klaucke DN, Buehler JW, Thacker SB, et al. Guidelines for evaluating
surveillance systems. MMWR 1988;37(SS-5):1-18.

5. Thacker SB, Parrish RG, Trowbridge FL. A method for evaluating systems of
epidemiological surveillance. World Health Statistics Quarterly 1988;41:11-18.

6. Hinman AR, Koplan JP. Pertussis and pertussis vaccine: reanalysis of benefits,
risks, and costs. JAMA 1984;251:3109-13.

7. Dean AG, West DJ, Weir WM. Measuring loss of life, health, and income due to
disease and injury. Public Health Rep 1982;97:38-47.

8. Laboratory Centre for Disease Control. Establishing goals, techniques and
priorities for national communicable disease surveillance. Canada Diseases
Weekly Report 1991;17:79-84.

9. Laboratory Centre for Disease Control. Canadian communicable disease
surveillance system. Disease-specific case definitions and surveillance
methods. Canada Diseases Weekly Report 1991;17(Suppl 3):1-35.

10. Wharton M, Chorba TL, Vogt RL, et al. Case definitions for public health
surveillance. MMWR 1990;39(RR-13):1-43.

11. Weinstein MC, Fineberg HV. Clinical decision analysis. Philadelphia: W.B.
Saunders Co., 1980.

12. Chandra Sekar C, Deming WE. On a method of estimating birth and death rates and
the extent of registration. J Am Stat Assoc 1949;44:101-15.

13. Murphy DJ, Seltzer BL, Yesalis CE. Comparison of two methodologies to measure
agricultural occupational fatalities. Am J Public Health 1990;80:198-200.

14. Rosenman KD, Trimbath L, Stanbury M. Surveillance of occupational lung disease:
comparison of hospital discharge data to physician reporting. Am J Public
Health 1990;80:1257-8.

15. Rushworth RL, Bell SM, Rubin GL, et al. Improving surveillance of infectious
diseases in New South Wales. Med J Australia 1991;154(12):828-l.

16. Barker WH, Feldt KS, Feibel J, et al. Assessment of hospital admission
surveillance of stroke in a metropolitan community. Am J Chron Dis 1984;37:609-15.

17. Kimball AM, Thacker SB, Levy ME. Shigella surveillance in a large metropolitan
area: assessment of a passive reporting system. Am J Public Health 1980;70:164-

18. Vogt RL, Larue D, Klaucke DN, Jillson DA. Comparison of active and passive
surveillance systems of primary care providers for hepatitis, measles, rubella
and salmonellosis in Vermont. Am J Public Health 1983;73:795-7.

19. Thacker SB, Redmond S, Rothenberg R, et al. A controlled trial of disease
surveillance strategies. Am J Prev Med 1986;2:345-50.

20. Davis JP, Vergeront JM. The effect of publicity on the reporting of toxic-shock
syndrome in Wisconsin. J Infect Dis 1982;145:449-57.

21. Sacks JJ. Utilization of case definitions and laboratory reporting in the
surveillance of notifiable communicable diseases in the United States. Am J
Public Health 1985;75:1420-2.

22. Alter MJ, Mares A, Hadler SC, et al. The effect of underreporting on the
apparent incidence and epidemiology of acute viral hepatitis. Am J Epidemiol
1987;125:133-9.

23. Hopkins RS. Consumer product-related injuries in Athens, Ohio, 1980-85:
assessment of emergency room-based surveillance. Am J Prev Med 1989;5(2):104-12.

24. Phillips-Howard PA, Mitchell J, Bradley DJ. Validation of malaria surveillance
case reports: implications for studies of malaria risk. Journal of Epidemiology
and Community Health-London 1990;44(2):155-61.

25. Jackson C, Jatulis DE, Fortmann SP. The behavioral risk factor survey and the
Stanford Five-city project survey: a comparison of cardiovascular risk behavior
estimates. Am J Public Health 1991;82:412-6.

26. Buehler JW, Stroup DF, Klaucke DN, Berkelman RL. The reporting of race and
ethnicity in the national notifiable disease surveillance system. Pub Health
Rep 1989;104:457-65.

27. Rosenberg ML. Shigella surveillance in the United States, 1975. J Infect Dis
1977;136:458-9.

28. Marks JS, Hogelin GC, Gentry EM, et al. The behavioral risk factor surveys: I.
State-specific prevalence estimates of behavioral risk factors. Am J Prev Med
1985;1:1-8.

29. Graitcer PL, Burton AH. The epidemiologic surveillance project: a
computer-based system for disease surveillance. Am J Prev Med 1987;3:123-7.

Chapter IX

Ethical Issues

Robert A. Hahn

"Epidemiologists [and surveillance investigators] should be cognizant that many competing
values may have moral weight equal to or greater than the freedom of scientific
inquiry. . . .there are many clearly appropriate social restraints on epidemiologic research
[and surveillance]."

Beauchamp

INTRODUCTION

Webster defines ethics as "the discipline dealing with what is good and bad or right and
wrong or with moral duty and obligation." A professional code of ethics provides a guide
to right and wrong behavior. An ethical code is not a description of what practitioners
(and others) actually do, but rather a prescription for what they should do. Ethical
obligations derive principally from moral values (such as the "Golden Rule," presumably
shared by the broader society) rather than from scientific principles, such as "formulate
a hypothesis and a method before collecting data." However, ethical decisions require
an understanding of the objectives, current issues, and methods of the scientific
disciplines to which they refer.

OVERVIEW

Over the past several decades, much ethical discussion in health (i.e., "bioethics") has
focused on clinical medicine and medical research, and thus on physicians and their
patients and on researchers and research subjects. Because public health is concerned
with the public, specific principles of bioethics may not apply directly to public health,
although underlying moral values may be shared. Ethical principles associated with
surveillance are perhaps closer to those of the social sciences than to those of clinical
medicine or medical research (1).

Indeed, public health ethics may conflict with the ethics of clinical medicine insofar
as clinical ethics (represented by such issues as patient confidentiality) compromise
public health (e.g., when the patient's condition threatens the health of others); or when
the demands of public health compromise the rights of individuals (e.g., in quarantine);
or when mass vaccination is required for public health despite the personal objections
of individual patients (2). The practice of public health generally assumes that
individual rights may be ethically superseded in the pursuit of public well-being and a
greater public good (2). Epidemiologists and ethicists have recently collaborated in the
formulation of ethical principles for epidemiology (3).

Although certain characteristics may distinguish surveillance-related ethical issues from ethical
issues in other areas of epidemiology and public health, many of the ethical issues
confronting public health surveillance are similar to those of epidemiology.
Consequently, much of the discussion in this chapter draws heavily on experience in
epidemiologic research, where these issues have been more fully discussed. Public health
surveillance may affect the public in several ways. Surveillance is the principal means
by which the health status of the population is assessed; it can be used to identify
problems, indicate solutions, plan interventions, and monitor change. As such, public
health surveillance commonly requires widespread and repeated contact with the public it
serves regarding basic and often personal matters of health and exposures to risk factors.
In addition, surveillance systems may be linked with other systems, requiring compatible
identifiers of individual records; and systems may be shared among researchers or public
health officials, thus increasing chances of public disclosure. Many facets of
surveillance may infringe on individual privacy and therefore may increase the risk of
breaches of confidentiality.

Several theories have been proposed to account for the basic principles underlying sound
ethical decisions. Such theories are relevant in public health decisions about resource
allocation, intervention, surveillance, and other issues, but are only briefly mentioned
here.

Some ethicists dispute the possibility of formulating general ethical principles, because
they believe that correct ethics are specific to each situation (i.e., "situation ethics")
(4). In contrast, most ethicists assume that ethical principles apply to different
situations; these ethicists commonly adopt one of two positions about the nature of
ethical rules. Utilitarians believe that ethical actions are those that most effectively
distribute valued goods within the population; this position is sometimes equated with
the epithet, "the end justifies the means." In contrast, deontologists believe that
certain principles, such as honesty, are fundamental, and that ends, such as the
distribution of goods in a population, do not justify the violation of fundamental
principles. Public health intervention programs commonly combine utilitarian and
deontological approaches. They attempt to maximize the distribution of health benefits,
while maintaining a satisfactory level of morality in the means of distribution.

MORAL PRINCIPLES IN CLINICAL MEDICINE AND RESEARCH

Ethicists have formulated several basic moral principles that they believe underlie
clinical medicine and research (5). Some of these basic principles apply to public health
surveillance:

Respect for autonomy asserts that "autonomous actions and choices should not be
constrained by others" (5). Basic to the notion of autonomy is self-determination
and voluntary action.

Beneficence is the principle that one should act to enhance the welfare of others.
Although non-maleficence, or avoiding acts that might harm others, is sometimes
viewed as a principle separate from beneficence, it may also be regarded as the
first tenet of beneficence. That is, in order to benefit others, one must at least
avoid doing them harm.

Paternalism is the active pursuit of another person's well-being (as perceived by
the pursuer), independent of, and sometimes contrary to, that person's express
wishes. Paternalism may be regarded as a form of beneficence. While paternalism
is generally thought of as protection of a person against harm to himself/herself,
the notion may be broadened to include threatened harm to others. Paternalism
commonly conflicts with respect for autonomy and, perhaps for this reason, is not
a popular concept in the United States. It becomes useful when a person's capacity
for autonomy is compromised (as may occur in sickness) or when personal autonomy
may seriously compromise the well-being of others.

Justice is the principle promoting the equitable distribution of burdens and
benefits in society. Unfortunately, there is no agreed-upon definition of equity;
the range includes an equal share for each person, each according to need, each
according to effort, each according to societal contribution, or each according to
presumed merit (5).

Other ethical principles are regarded by some ethicists as independent and by others as
derivative from more basic principles (5):

Veracity is the duty of full disclosure of relevant information. Veracity is often
considered a duty of clinicians or researchers but may also be a duty of patients
or subjects.

Privacy is the duty to respect a person's right "...of determining, ordinarily, to
what extent his thoughts, sentiments, and emotions shall be communicated to others"
(6). Privacy includes protection from unwanted intrusions, and from the divulgence
of personal information to others. The right to privacy may derive from respect
for autonomy.

Confidentiality is the duty not to disclose information about individuals without
their consent. Confidentiality may be seen as a principle following from privacy.

Fidelity, commonly applied to the relationship between physician and patient, is
the duty to keep promises and maintain contracts.

CONFLICTS AND SANCTIONS

While conflicts among ethical principles are common (e.g., paternalism versus respect for
autonomy), there is no simple prescription for resolving such conflicts. Utilitarians
might choose one alternative and deontologists, another. Attempts to prescribe principles
of conflict resolution emphasize that decisions should be accompanied by justification
of the choice (7).

In contrast to medical institutions, institutions of public health and epidemiology do
not license practitioners and do not maintain official sanctions against violations of
professional ethical standards (even insofar as such standards exist and are codified).
Public health practitioners are not sued for malpractice. Informal sanctions (e.g., the
avoidance of unscrupulous colleagues or loss of one's job) occur, but have not been
systematically described. Some epidemiologists have recently proposed an ethical duty
to monitor and address the unethical practices of their colleagues (7). In contrast to
the absence of collegial sanctions in public health, some aspects of epidemiology and
surveillance are governed by law (e.g., violations of confidentiality by surveillance
personnel) (8).

Varying degrees of contact are involved in different forms of surveillance. Environmental
surveillance (e.g., of environmental lead or rates of Lyme disease infection of ticks)
may involve contact with animals or the physical environment rather than with humans;
surveillance using hospital records or death certificates involves indirect human contact;
surveillance by household interviews and/or physical examinations requires face-to-face
and/or physical contact. Ethical principles may vary from situation to situation and are
likely to be more stringent as more human contact is involved.

This chapter focuses on surveillance involving face-to-face human contact. Also
considered are surveys such as the Health Interview Survey, the National Health and
Nutrition Examination Survey, and the Vital Statistics System of the Centers for Disease
Control's National Center for Health Statistics. These surveys or statistical systems
may not meet the stringent objectives of public health surveillance, but because they
entail collecting personal information on individuals and are widely used for
surveillance, they provide examples of the issues surrounding data collection. The U.S. Census is also
considered, because census information plays an essential role in providing denominators
for surveillance data.

The collection of public health information may involve the participation of many
individuals and institutions. Potential participants include not only the investigator
and subjects of surveillance but persons in the immediate social environment of study
subjects, the investigator's colleagues, the broader public health community, clinicians,
and society at large. Explicit and implicit relations among these parties delineate their
ethical obligations to one another (Table IX.1). Ethical issues are reviewed below by
focusing on several of these relationships.

RELATIONSHIPS IN SURVEILLANCE AND THEIR ASSOCIATED
ETHICAL OBLIGATIONS

Surveillance practitioners and society at large. The practice of public health may
be regarded as one means by which a society addresses issues of well-being in the
population. Public health practitioners retain an essential connection with society at
large; ultimately, they are supported by and act at the behest of their public
constituency. The assumption is that, as they pursue and achieve public interests, they
should be supported by society in their work.

As agents of public welfare, public health practitioners have several ethical
responsibilities as outlined below:

Choice of surveillance topics. In pursuit of beneficence, as well as in upholding
public fidelity, practitioners should conduct surveillance on priority issues with
potential public health benefit (7). "As a parallel in a research study, it would be
unethical to ask anyone to participate [in a study] that has little likelihood of producing meaningful
results or furthering scientific knowledge for the good of society" (5). Insofar as
surveillance findings are basic indicators of health inequities and trends (e.g., in risk
or exposure, health-care access, morbidity, or mortality), the pursuit of justice is also
a primary moral rationale for surveillance.

Judgments of priority and potential benefit should be based on explicit criteria, such
as the criteria for the strength of scientific evidence used by the Preventive Services
Task Force (10). Perhaps paradoxically, surveillance results themselves facilitate the
determination of priority issues (e.g., the magnitude and location of health problems
in the population).

Avoidance of conflicts of interest. As with other epidemiologic activities,
surveillance may be prone to conflict of interest. "Virtually all epidemiologic research
is sponsored, and few if any research sponsors, public or private, are disinterested in
the outcome of their epidemiologic research" (12). In their commitment to public
well-being, practitioners of surveillance must assure that data are collected to answer
scientific or public health questions effectively, rather than to serve the interests of
financial and institutional sponsors or to "prove" personal preconceptions. For example,
practitioners must assure that populations surveyed and questions asked are appropriate
to assess the issues considered and not to find "results" desired by a sponsor.
Epidemiologists have presented guidelines for avoiding conflicts of interest (12); the
guidelines apply to surveillance activities as well.

• The investigator's independence from the sponsor must be
maintained in the design, conduct, and reporting of
epidemiologic (and surveillance) results. Written agreement
between investigator and sponsor may increase the likelihood
of independence.

• Investigations should not be conducted in secrecy, and results
should be published in a timely fashion.

• Decisions on release and publication of results should not be
influenced by the interests of sponsors.

• All sponsorship should be acknowledged.

• Decisions regarding the dissemination and publication of results
should be made by the investigator rather than the sponsor.

Bond (13) has suggested that certain private industries may have an ethical obligation
to monitor the effects of their activities, for instance, the exposures and health of their
employees. Rothman (11) has argued that it is unethical to judge the results of
investigations simply on the basis of sponsorship, e.g., private industry. Rather,
investigations should be judged by the quality of the work involved.

Methodologic and analytic scrutiny. The principle of beneficence requires that one
choose the best feasible method of investigation and that one appropriately analyze
results--thus requiring knowledge of scientific methods (7).

Interpretation and recommendation. The principle of beneficence also requires (as
does the concept of surveillance itself) that surveillance data be interpreted and used
to assess and address public health problems.

Report of findings. Finally, the principle of beneficence requires that surveillance
results be reported understandably, sensitively, and responsibly, in a timely fashion,
with scientific objectivity and caution, appropriate confidence, and appropriate doubt.
"Epidemiologists should carefully avoid being placed in a situation in which their results
might be suppressed or inappropriately edited by either internal or external influences"
(7). Some (14) have argued that epidemiologists should be advocates for the positions
firmly supported by their data. Others (15) have asserted that epidemiologists are
legitimate expert witnesses. Practitioners of surveillance must also be free of internal
or external constraints and must be able to present the results of their work objectively.

INVESTIGATORS AND SUBJECTS

Beneficence

Surveillance subjects do not usually benefit directly from surveillance, though some
benefit to them may accrue as a side-effect (e.g., when surveillance subjects are given
physical examinations or when a discovery made by surveillance serves a health need of
a surveillance subject). When an adverse health condition is determined in the course
of surveillance, it is the responsibility of the investigator to provide the surveillance
subject with timely information about the discovered condition; if the condition is
complex or sensitive, such information may be best conveyed by the subject's physician,
trained counselors, or local public health officials (9).

Non-Maleficence

A more common ethical issue in surveillance is non-maleficence. Surveillance subjects must
not be harmed in the course of the surveillance program. When invasive procedures are
deemed necessary to the surveillance system--including psychologically as well as
physically invasive procedures--care must be taken that subjects do not suffer undue
reactions (9).

Epidemiologists have recognized a need to be culturally sensitive to the populations they
are studying. Cultural sensitivity may be a component of beneficence, non-maleficence,
and autonomy, and may also enhance the effectiveness of the investigation. Cultural
sensitivity is important not only during the course of surveillance but also in the
appropriate reporting of results.

Non-maleficence may also require that survey participants be compensated for their
participation. Compensation should at least cover the costs of participation--e.g.,
transportation, lost work time, and child care. While altruism and the personal
contribution to potential public health benefits may motivate some prospective
participants in a data collection system, additional compensation may increase the
participation of others--a pragmatic rather than an ethical justification for payment.

Protection of Privacy

Non-maleficence may also underlie respect for privacy. Protection of privacy requires
not only restraint in intrusion and in the disturbance of persons in their private lives
but assurance that once information (or a specimen) has been collected, it will not be
distributed to others in a form that identifies the surveillance subject (see Chapter X) (16).

Beauchamp et al. propose three situations in which the invasion of privacy by
epidemiologists (and surveillance investigators) is justified (7):

• The invasion of privacy is a necessary aspect of the
investigation.

• There is no reason to suspect that subjects of the
investigation will be placed at substantial risk (e.g., of
being fired or divorced).

• The research must have potential social benefit.

In Public Law 93-579 (17), the Congress states the following:

"(2) the privacy of an individual is directly affected by
the collection, maintenance, use, and dissemination of
personal information by Federal agencies;...

(4) the right to privacy is a personal and fundamental right
protected by the Constitution of the United States; and

(5) in order to protect the privacy of individuals identified
in information systems maintained by Federal agencies, it
is necessary and proper for the Congress to regulate the
collection, maintenance, use, and dissemination of information
by such agencies."

In the United States, public health surveillance activities conducted under the auspices
of the Executive Branch (thus including the Department of Health and Human Services and
the Bureau of the Census) are regulated by the Public Health Service Act and by the
Privacy Act of 1974 (17). Both acts regulate contractors of federal agencies as well as
the agencies themselves. Regulations apply to "establishments"--i.e., institutions--as
well as to individuals surveyed. They address "systems of records" "...from which
information is retrieved by the name of the individual or by some identifying number,
symbol or other identifying particular assigned to the individual" (17). Thus, records
without identifiers are exempt from these regulations.

While the Privacy Act focuses on the disclosure and dissemination of information already
collected, the act also restricts surveillance information that may be collected by
stipulating that records may contain only "such information about an individual as is
relevant and necessary to accomplish a purpose of the agency...." This enforces the
ethical obligation to conduct surveillance on issues with potential public health benefit.
In addition, the Privacy Act prohibits use of surveillance (or other) information "for
any purpose other than the purpose for which it was supplied unless such establishment
or person has consented...to its use for such other purpose...." (18).

The Privacy Act gives individuals the right to obtain their own records, to correct errors
in the record, and to receive an accounting of how the record has been disseminated.
Exemptions to individual access include the use of records maintained for statistical
purposes only (rather than for administrative use). Census information, for example, is
exempt. Exemptions must meet specific criteria and must be published in the Federal
Register.

The Privacy Act requires that federal agencies train and regulate personnel with access
to record systems and that agencies maintain physical means of protecting records from
unwarranted access. Agencies are also required to describe their record systems and to
report procedures used to comply with requirements in the Federal Register. Criminal
penalties and fines may be imposed on persons who violate the stipulations of the act.

Informed Consent

The Privacy Act regulates not only the collection and maintenance of record systems, but
the informed consent procedures by which they are collected and matters of confidentiality
involved in the dissemination of records that have been collected. Informed consent is
a requirement based on respect for autonomy. Informed consent must be obtained primarily
in the context of surveys and studies. Administrative, medical-care, and legally mandated
information-collection systems should also consider obtaining informed consent. The
Privacy Act requires that potential participants in record systems be informed of a) the
authority under which the data are collected, b) the purposes of the
information, c) routine uses of the information, and d) the
consequences of not participating. Informed consent is required for "establishments"
(through their representatives) as well as for individuals.

Epidemiologists and philosophers have proposed several elements to be included in
comprehensive informed consent:

• Reasonable disclosure of the goals and uses of the study
(or surveillance activity).

• Evidence of comprehension on the part of prospective
participants. The response of potential respondents to
surveys following appropriate information is sometimes
regarded as evidence of consent, despite the lack of
evidence of respondent comprehension (19).

• Voluntariness on the part of prospective participants.
"All forms of duress or undue influence are to be
scrupulously avoided" (7).

• Competence on the part of prospective participants.

• Consent of prospective participants.

Possible harm of the surveillance--e.g., from some physical test--should also be explained
to prospective participants. To guarantee autonomy, comprehensive informed consent should
also be receptive to informed dissent and non-participation or to withdrawal at any point
in the research or surveillance activity.

Feinleib (9) argues that "the first responsibility of the epidemiologist to the subject
is to be clear about the objectives of the study." He also allows that, when the goals
of epidemiologic investigations (or surveillance) are complex or when full disclosure
might bias responses, comprehensive disclosure may not be required, so long as the
respondent is "...not deliberately misled into participating in a study that the
investigator knows is against the respondent's interests" (9). This paternalistic
principle may compromise the participant's autonomy.

Disclosure, Dissemination, and Confidentiality

The Privacy Act forbids the disclosure of information in which individual identity is
ascertainable, unless the subject has agreed to disclosure. This principle thus protects
the confidentiality of individuals and affects the dissemination of surveillance findings
(see Chapter X).

Records protected by the Privacy Act are exempt from Freedom of Information Act (FOIA)
requests. FOIA specifically exempts "personal and medical files and similar files the
disclosure of which would constitute a clearly unwarranted invasion of personal privacy"
and matters "specifically exempted from disclosure by statute" (19). Federal surveillance
data are also commonly exempt from subpoena and may be explicitly exempted by
authorization of the Secretary of Health and Human Services (18). Census data, too, are
exempt from FOIA access.

There are several dimensions of disclosure (19):

• Exact disclosure, which indicates a precise (numerical)
value of some characteristic (e.g., precise income or age
associated with an individual), versus approximate disclosure,
which indicates a range of values associated with an individual.

• Probability-based disclosure indicates the likelihood (<100%)
that some characteristic is associated with an individual,
while certainty disclosure indicates (with 100% likelihood) that
the characteristic is associated with the individual.

• Internal disclosure associates an individual with a characteristic
on the basis of evidence found within one particular study or
survey, while external disclosure associates individuals and
characteristics by linking studies or surveys.
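
The distinction between exact and approximate disclosure can be illustrated with a short sketch. The Python example below is an illustration only and is not taken from the cited manual; the function name `to_range`, the band widths, and the sample record are all assumptions made for the example.

```python
def to_range(value, width):
    """Return the half-open band [low, high) of the given width that contains value."""
    low = (value // width) * width
    return (low, low + width)

# Hypothetical record; the field names and values are invented for illustration.
record = {"age": 37, "income": 41250}

# Exact disclosure would publish 37 and 41250 as they stand.
# Approximate disclosure publishes only the band into which each value falls.
approximate = {
    "age": to_range(record["age"], 10),
    "income": to_range(record["income"], 10000),
}
print(approximate)
```

Wider bands reveal less about any one individual but also carry less analytic detail, so the band width would in practice be chosen to suit the surveillance purpose.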

Since the absolute protection of disclosure might make the use of surveillance information
impossible and would severely hamper programs of disease control and prevention, non-
disclosure requirements have been interpreted as protecting individuals from harm while
allowing appropriate use of surveillance information. For example, publication of
analyses or tables with small numbers of conditions such as fetal or infant deaths or
deaths from rabies in a county--allowing the identification of individuals--is said to
be reasonable because these exceptions "...have been accepted traditionally and because
they rarely, if ever, reveal any information about individuals that is not known socially"
(20). Also exempt is publication of small numbers if the identifying characteristics are
judged not to be "sensitive."

Two kinds of breaches of confidentiality should be differentiated. In the first,
information collected in confidence by a clinician or public health practitioner should
be divulged if the information substantially threatens the welfare of another person
(21,22). Divulging information need not reveal the identity of the first individual, but
such revelation may be unavoidable. This is a common occurrence associated with "contact
tracing" for sexually transmitted diseases. The public health responsibilities of
clinicians and public health practitioners may override duties of confidentiality to
individual patients and surveillance subjects, even though their actions abrogate privacy,
autonomy, and even beneficence. In the second kind of breach of confidentiality,
revelation of information and the identity of an individual serves no public health
purpose and is therefore unethical.

Several techniques may mitigate the likelihood of disclosure and may legitimate the
publication of otherwise protected data: a) small samples (e.g., <10% of the data) hamper
efforts to identify which individual in the population a sampled individual represents,
b) the deliberate creation of errors or imputations of missing data allows that any given
datum may be an error or an imputation rather than a true observation, c) incompleteness
of reporting allows that an individual may not have been included in the survey, and d)
lack of sensitivity of the information in question (because of prior publication or
historical time frame), so that publication reveals no harmful information.

In the United States, individual states use surveillance information for their own
disease-control programs. As major surveillance agencies, the states have been critically
concerned with issues of confidentiality (23). While all states have provisions for
complying with freedom of information requests and maintaining confidentiality of
information, they vary in specific regulations and their enforcement. Twenty-five states
have general confidentiality requirements with little specific definition; seven states
require written consent for release of information; five states exclude surveillance
information from subpoena; and 10 states have penalties for unlawful disclosure of
information on some or all reported infectious diseases (23). The states are concerned
with the protection of the confidentiality of data released for federal surveillance
systems and, in collaboration with CDC, have established confidentiality guidelines (23).

Several procedures are commonly used to protect the confidentiality of records in
surveillance investigation settings, disseminated data sets, and published tabulations
and analyses:

a. Names or other personal identifiers are necessary in public health
surveillance for two principal, related purposes: to follow up individuals for the
determination of subsequent health events and to link data systems for additional
information on individuals. Surveillance functions which require neither follow-up nor
linkage may avoid problems of confidentiality by not using names or other identifiers.
It should be noted, however, that the absence of identifiers, as in "blinded" studies,
may preclude informing surveillance subjects of adverse surveillance findings.

b. When names or other identifiers are justified, problems of disclosure may
be minimized with use of protected or "scrambled" identifiers, which make association
between records and individuals difficult. The use of identifiers in record systems and
separate files relating identifiers and individuals maintained in separate, secure areas
is a common means of minimizing disclosure.

c. Identifying information can be destroyed once it has served its designated
follow-up or linkage function.

d. The collection of data that will not be used and that might serve
to identify individuals should be avoided.

e. Precise data--e.g., dates of birth or death or income in exact dollar
amounts, residence by block or street or address--are rarely essential; data-range
specifications are most often adequate for surveillance purposes. Since precise data
facilitate identification of individuals, the use of data ranges is preferable if
surveillance goals can be achieved with such information.

f. In some surveillance investigations, linkage with other surveillance sources
is necessary to determine additional information. In this case, the Privacy Act requires
that federal agencies and personnel involved be trained in and comply with common
regulations of privacy and confidentiality.

g. Suppression of analyses or tables with cells with small numbers in
publications (19):

i) no table should include a row or column in which all
cases are found in one cell,
ii) the marginal total of any row or column should not be
fewer than three,
iii) no estimate should be based on fewer than three cases,
iv) no estimates should be published if one case contributes
more than 60% to that estimate,
v & vi) no characteristics of individuals should be
identifiable by calculation from other tabulated data in
the same or other data sets.

Solutions to the problem of small numbers may be the aggregation of rows or columns or
the suppression of data in cells and marginal totals.
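
Rules i and ii of the small-numbers guidance can be checked mechanically before a table is published. The sketch below is an illustration under stated assumptions, not the procedure of the cited manual: the function name `check_table`, the representation of a table as a list of rows of counts, and the sample counts are hypothetical. Rule ii is applied literally, so an empty row or column (total of zero) is also flagged.

```python
MIN_MARGINAL = 3  # threshold taken from rule ii in the text

def check_table(table):
    """Return (rule, axis, index) tuples for rows or columns that violate rule i or ii.

    `table` is a list of equal-length rows of non-negative integer counts.
    """
    problems = []
    columns = [list(col) for col in zip(*table)]
    for axis, lines in (("row", table), ("column", columns)):
        for index, cells in enumerate(lines):
            total = sum(cells)
            # Rule i: every case in this row or column sits in a single cell.
            if total > 0 and max(cells) == total:
                problems.append(("i", axis, index))
            # Rule ii: the marginal total is fewer than three.
            if total < MIN_MARGINAL:
                problems.append(("ii", axis, index))
    return problems

# Hypothetical counts (rows: two counties; columns: three age bands).
counts = [
    [5, 4, 6],
    [0, 2, 0],
]
print(check_table(counts))
```

For these sample counts the second row is flagged under both rules, and the first and third columns are flagged under rule i. A table that is flagged would then be handled by one of the solutions named above, such as aggregating rows or columns or suppressing cells and marginal totals.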

Veracity

In the ethics of public health surveillance, the principle of veracity is usually
considered in the disclosure by investigators of the goals and uses of surveillance
information. However, veracity may also be an ethical duty of surveillance subjects (to
the investigator as well as to society) once they participate. Deception by subjects may
contribute to erroneous results and public health harm.

Investigators and Persons in Subjects' Social Environments

During the course of surveillance, it may be discovered that some condition of the
surveillance subject (e.g., an infectious disease or violent intentions) might severely
affect or might have affected the well-being of other persons in the subject's social
environment. In this case, it may be the ethical duty of investigators to inform
appropriate authorities (e.g., public health officials or law enforcement agents) of these
circumstances (9). Paternalistic social beneficence might justify the breach of
confidentiality.

Surveillance and the Public Health Community

Public health surveillance practitioners have the duty of having their work reviewed by
colleagues for ethical as well as scientific integrity; they also have the responsibility
of reviewing the work of others. The review process requires the sharing of methods and
findings. Ethical--as well as scientific--critiques must be balanced. "Epidemiologists
and many research scientists often search in detective-like fashion for flaws in the
studies of those they review, even though the studies may contain substantial merit" (7).


While some agencies have policies to protect researchers' primary use and control of the
data they collect (24), others have favored broader access (25). Ethical principles
justifying broad access are detailed below.

• Enhancing the quality of science by allowing reanalysis
and confirmatory studies--thus potentially contributing
to public welfare

• Expanding knowledge by facilitating additional analyses--
thus also potentially contributing to public welfare

• Reducing the burden of surveillance on subjects

• Reducing the burden of surveillance on practitioners

Epidemiologists and ethicists have also argued that practitioners have the obligation to
promote ethical behavior in the public health community and to confront ethically
unacceptable behavior of colleagues (7).

CLINICIANS AND THE PUBLIC HEALTH COMMUNITY

Physicians, laboratorians, and other health-care practitioners play a critical role in
reporting infectious diseases to local and state health departments. Reporting traumatic
events (e.g., gunshot wounds and child abuse) is also required in some states (26).
Fulfilling these duties may prevent further infection or trauma. While reporting selected
diseases and injuries is mandatory for physicians and others in all states, completeness
of reporting is said to range from 6% to 90% for many notifiable diseases (27); reporting
laws are seldom enforced.

Investigators and Clinicians

Investigators have a duty to report findings to clinicians. Findings may concern the
welfare of a clinician's patients who have been surveillance subjects. Findings from
surveillance investigations may also have implications for patients in general or patients
with certain conditions.

The scale and significance of public health surveillance demand scrupulous and ongoing
attention to ethics as well as to science (Table IX.2). Ethics should not be regarded
as an afterthought, or worse, an obstacle, to professional practice, but as an element
vital to its foundation and goals.

REFERENCES

1. Cassel J. Ethical principles for conducting fieldwork. Am Anthropologist
1980;82:28-41.

2. Lappe M. Ethics and public health. Maxcy-Rosenau Public Health and Preventive
Medicine 12th ed. Norwalk, Connecticut: Appleton-Century-Crofts, 1986:1867-77.

3. Soskolne CL. Ethical decision-making in epidemiology: The case study approach.
J Clin Epidemiol 1991;44(Suppl 1):125S-30S.

4. Fletcher J. Morals and medicine. Boston: Beacon Press, 1960.

5. Beauchamp TL, Childress JF. Principles of Biomedical Ethics. 3rd ed. New York:
Oxford University Press, 1989.

6. Warren SD, Brandeis LD. The right to privacy. Harvard Law Review 1890;4:193-220.

7. Beauchamp TL, Cook RR, Fayerweather WE, Raabe GK, Thar WE, Cowles SR, Spivey GH.
Ethical guidelines for epidemiologists. J Clin Epidemiol 1991;44(Suppl 1):151S-69S.

8. Lako CJ. Privacy protection and population-based health research. Soc Sci Med
1986;23:293-5.

9. Feinleib M. The epidemiologist's responsibilities to study participants. J Clin
Epidemiol 1991;44(Suppl 1):73S-9S.

10. U.S. Preventive Services Task Force. Guide to clinical preventive services: An
assessment of the effectiveness of 169 interventions. Baltimore: Williams &
Wilkins, 1989.

11. Rothman KJ. The ethics of research sponsorship. J Clin Epidemiol 1991;44(Suppl
1):25S-8S.

12. Stolley PD. Ethical issues involving conflicts of interest for epidemiologic
investigators. A report of the committee on ethical guidelines of the society for
epidemiologic research. J Clin Epidemiol 1991;44(Suppl 1).

13. Bond GG. Ethical issues relating to the conduct and interpretation of
epidemiologic research in private industry. J Clin Epidemiol 1991;44(Suppl 1):29S-34S.

14. Last JM. Obligations and responsibilities of epidemiologists to research subjects.
J Clin Epidemiol 1991;44(Suppl 1):95S-101S.

15. Cole P. The epidemiologist as an expert witness. J Clin Epidemiol 1991;44(Suppl
1):35S-9S.

16. Greenawalt K. Privacy. In: Reich WT, ed. Encyclopedia of bioethics. New York:
Free Press, pp. 1356-63.

17. The Privacy Act, Washington, D.C.: U.S. Government Printing Office, 1974.

18. Public Health Service Act, Washington, D.C.: U.S. Government Printing Office, 1944,
as amended.

19. Centers for Disease Control (CDC). Staff manual on confidentiality. 1984.

20. National Center for Health Statistics. NCHS staff manual on confidentiality.
1984.

21. Vernon TM. Confidential reporting by physicians. Am J Public Hlth 1991;81:931-2.

22. Teutsch S, Berkelman RL, Toomey KE, Vogt RL. Reporting for disease control
activities. Am J Public Hlth 1991;81.

23. Vogt RL. Confidentiality: Perspectives from state epidemiologist, in Challenge
for Public Health Statistics in the 1990's (sic). Proceedings of the 1989 Public
Health Conference on Records and Statistics, National Center for Health Statistics,
July 17-19, Washington, D.C.

24. Sharing Research Data. Washington, D.C: National Academy Press, 1985.

25. Hogue CJR. Ethical issues in sharing epidemiologic data. J Clin Epidemiol
1991;44(Suppl 1):103S-7S.

26. Smith GR. Health care information confidentiality and privacy: A review and
analysis of state and federal law. Emory University School of Law, unpublished ms,
1987.

27. Thacker SB, Berkelman RL. Public health surveillance in the United States.
Epidemiol Rev 1988;10:164-90.


CHAPTER X

Public Health Surveillance and the Law

Gene W. Matthews
R. Elliott Churchill

"The people's good is the highest law."

Marcus Tullius Cicero

INTRODUCTION

Public health surveillance and the law are joined by so many interconnecting links that
virtually every aspect of a surveillance program is associated with one or more legal
issues. In the United States, and throughout the world, many surveillance efforts
have been effected through mandates enforced by statutes or regulations. By the same
token, reports derived from the interpretation and application of data from
surveillance programs have been used to drive legislation relating to public health.

Public health surveillance involves the collection, analysis, interpretation, and
dissemination of data. It may be useful to have a working definition of the law to
meld with this description of surveillance. In essence, as Wing observes, the law
is "the sum or set or conglomerate of all of the laws in all of the jurisdictions:
the constitutions, the statutes and the regulations that interpret them, the
traditional principles known as common law, and the judicial opinions that apply and
interpret all these legal rules and principles" (1). However, that is by no means
all. The law is also the legal profession, and, in order to understand the law, we
must try to understand the lawyers--how they think, how they speak, and what roles
they play in the legal process. In addition, from a very practical point of view,
the law is also the legal process--legislatures and their politics, as well as the
time, efforts, and costs associated with changes in legislation. Finally, the law
is what it is interpreted to be. This takes us back to the lawyers, as well as to
the judges in the legal system.

We cannot avoid what Wing describes as "the traditional barrier" between the legal
profession and the rest of the world. He continues with the observation that "the
legal profession has for centuries done many things to surround the practice of law
with a quasi-mystical aura. Much as the medical profession would have us believe
that there is something almost sacred about medical judgment and that only a
physician can understand it, lawyers have perpetuated the only partially justified
myth that there is something called legal judgment that only someone with the proper
mix of formal education, practical experience, and appropriate vocabulary can make"
(1).

"The basic function of the law is to establish legal rights, and the basic purpose
of the legal system is to define and enforce those rights .... Legal rights" are
the "relationships that establish privileges and responsibilities among those
governed by the legal system" (1). This concept of "legal rights" does not purport
to cover freedoms or interests given unconditional, global protection, but rather it
covers the protection of carefully specified interests against the effects of other
carefully specified interests. Finally, some rights are protected, not by statute
or regulation, but by an understanding and application of the prevailing ethics in
an area. In general, ethics are regulated through whatever sanctions are imposed
against censured behavior by peers or colleagues (see Chapter IX).

This orientation is pivotal in our discussion of legal issues associated with
surveillance because the reader must continue to be alert to the fact that everything
in this chapter is subject, first of all, to different interpretations in different
legal settings, and, second, to amendment of both statute and practice.

The task of surveillance as an applied science could be simplified considerably by
avoiding any discussion of legal issues. Although this observation is probably
valid, we have already pointed out that surveillance very often takes place under
statute. Beyond this fact, the relevance of the definition of the police powers of
a state must be acknowledged, i.e., "powers inherent in the state to prescribe,
within the limits of state and federal constitutions, reasonable laws necessary to
preserve the public order, health, safety, welfare, and morals" (2). That describes
a sweeping scope of authority and certainly covers anything that would be dealt with
under the heading of "public health surveillance."

In other words, one cannot look at surveillance and claim to have created an accurate
picture without considering the legal constraints and processes that accompany it--
particularly since, for public health surveillance, we have added the component of
"timely dissemination of the findings" to our definition of surveillance. How
information is collected, from and about whom it is collected, how it is interpreted,
and how and to whom the results are disseminated all must be scrutinized under the
umbrella of "accepted practice" and "the law." The sections that follow contain
information specific to the United States, but for an international orientation, the
issues and concerns remain basically constant, while the written body of the law and
the process through which the law is enacted and enforced vary widely.

If the reporting component of public health surveillance is treated as a requirement,
one can assert that such surveillance began in the United States in 1874 in
Massachusetts, when the State Board of Health instituted the first statewide
voluntary plan for weekly reporting of prevalent diseases by physicians. By the turn
of the century, the forerunner of the Public Health Service had been established, and
laws in all states required that certain communicable diseases be reported to local
authorities (3).

SURVEILLANCE IN THE EARLY YEARS (1900-1930)

With the development and growth of surveillance in the United States in the early 1900s
came the inevitable conflicts created when the interests of one human being conflict with
those of another individual or political unit. Much of the debate took place because of
the problem the United States was experiencing with sexually transmitted diseases — which
became even more acute with the participation of American troops in World War I. The
issues were basically

• the moral dilemma created by not reaching consensus on the purpose of
information obtained through surveillance (i.e., whether to direct control
efforts toward sexual behavior of the individual or toward the disease
agents),

• the debate surrounding the duty of the physician to his/her patient and to
society, and

• the disagreement about whether government provision of health services
comprised unfair competition to the private practitioner.

Since these concerns still have not been completely resolved in the United States as of
the 1990s, they are examined in more detail.

Social Hygiene Versus the Scientific Approach

By the early 1900s, the epidemiology of syphilis was reasonably well-documented. This
understanding did not constitute an unmixed blessing. As William Osler told his students
at the Johns Hopkins Medical School in 1909, "In one direction our knowledge was widened
greatly. It added terror to an already terrible disorder" (4). Aside from the scope of
the destructive powers of syphilis, physicians were just beginning to appreciate the fact
that many "innocent victims" were contracting this disease. The prevailing wisdom of
earlier years of "reaping what one sowed," as well as other statements of poetic and moral
justice, was no longer adequate when women of "good family" and unblemished reputation
were known to have contracted syphilis from their spouses and when children suffered
severe effects from congenital syphilis.

What the medical and public health officials apparently had the most difficulty
reconciling was how to direct their efforts to deal with the growing problem of syphilis.

Both surveillance and treatment efforts could be directed toward a) people, a focus on
behavior modification through education as a control strategy or b) the disease vector,
a focus on the organism that caused the disease and how to eradicate it from individuals
and society at large. Neither approach to syphilis control was ever agreed to be the
ideal, and, in fact, the two in combination have still not proved totally effective. The
tensions represented by the "moralistic" and the "scientific" approaches are, moreover,
still quite evident in public health practice and surveillance in the 1990s.

One only has to review the popular press for the past several years to see how the "moral
versus scientific" dilemma relates to public health in the context of such currently
serious problems as human immunodeficiency virus/acquired immunodeficiency syndrome
(HIV/AIDS) and the reemergence of multidrug-resistant strains of tuberculosis.

Duty of Physicians

The concept of the confidential nature of communication between patient and physician is
clearly stated in the Hippocratic Oath and has continued to be emphasized in legal and
social settings. In the context of the syphilis epidemic in the United States in the
early years of the 20th century, this concept became a crucial point of debate in efforts
to control the spread of the disease. Physicians did not wish to breach the confidence
relied on by their patients by reporting cases of syphilis to the authorities; by the same
token, if they did not report the occurrence of syphilis--if not to the authorities at
least to the patients' spouses--they were tacitly participating in the continued
transmission of the disease to "innocent victims." The entire issue boils down to primary
responsibility to an individual or to society. It clearly has not been resolved but
constitutes an important component of the success or failure of present-day surveillance
efforts.

Economic Competition

Also as yet unresolved is the problem created for public health officials and for
practicing physicians in the early 1900s by the need, on the one hand, to have physicians
report all cases of sexually transmitted disease and to establish public health clinics
to provide prompt treatment and education to patients and, on the other hand, the need
for public health officials to protect the financial interests of physicians by not
infringing on their turf and removing paying customers to free or financially subsidized
facilities. At the same time, it did not seem reasonable to expect the physicians to make
such reports and refer such patients for treatment elsewhere when it would mean, in
essence, taking money out of their own pockets. For surveillance efforts, this dilemma
guaranteed underreporting of cases, with the selective reporting of cases representing
patients who could not pay and the withholding of reports of cases representing patients
who could pay.

Of concern to the 1990s surveillance effort, and again in the context of HIV/AIDS,
physicians might choose not to report cases of HIV positivity for fear their patients
might be discriminated against in a work or social setting. Problems with insurance
coverage might also lead to such underreporting.

ERA OF GRADUAL GROWTH IN MANDATED SURVEILLANCE (1940s-1970s)

During the period of the 1940s-1970s, states added many diseases to their mandatory
reporting lists. Even in states that did not enact legislation to require additional
reporting, surveillance/reporting efforts were broadened during this period through state
regulation or directive from the state health commissioners (5).

In contrast, surveillance and reporting to agencies in the federal government were--and
continue to be--voluntary. The resulting discrepancy in data obtained on a particular
disease at the state and federal levels leads to problems in analysis and interpretation.
However, several professional organizations, including the Association of State and
Territorial Health Officers (ASTHO) and the Council of State and Territorial
Epidemiologists (CSTE), have been instrumental in setting up a patchwork system to
coordinate and improve the quality and completeness of surveillance data.

A major factor in the development of surveillance planning and implementation during this
period is represented by the institution in 1976 of the Federal Protection for Human
Subjects Regulations. One of the most well-known of the regulations states the
requirement that "informed consent" be obtained from any person who is asked to
participate in a medical research project. In addition, the regulation covers
compensation for persons injured during the course of the project and confirmation of the
ethics of the research being conducted.

CURRENT LEGAL ISSUES (1980 to the Present)

There is little dispute that biomedical research and surveillance activities of the 1980s
were greatly affected by concerns and reactions associated with the HIV/AIDS epidemic.
All the old issues from early in the 20th century reemerged at critical levels: Do we
want to treat persons for the disease, or do we want to modify their behavior in
control/prevention efforts? Is the physician's primary duty to protecting a patient's
privacy or to the greater good of society? Is the public health machine treading on the
physician's turf by advertising and providing medical treatment more inexpensively than
the physician can?

Although these questions still need to be answered fully, public health action cannot wait
until consensus is reached before constructing and applying interventions. The sections
below examine four key legal issues that relate to these questions and have a major impact
on surveillance in the 1990s.

Personal Privacy

The right of an individual to have his/her privacy protected under the law is a vast gray
area. The U.S. Constitution does not specify a right to privacy, although particulars
relating to the protection of privacy under particular circumstances are included in the
Bill of Rights (protection from "search and seizure," etc.). As noted earlier in this
chapter, the issue of right to privacy and the physician's role in protecting that privacy
through the concept of privileged communication emerged as a hotly debated issue during
the war on sexually transmitted diseases in the United States in the early years of the
20th century. The concept of the so-called "medical secret" (6) involved the dilemma that
faced a physician whose male patient had a sexually transmitted disease (for which there
was no sure cure), whose reputation the physician wished to spare, but whose spouse or
future spouse was at risk of having the disease if the physician did not step forward and
report it. Many physicians opted to remain within the accepted double standard of
behavior of the day and, according to Prince Morrow, became "accomplices" in the further
transmission of infection (7). The medical secret was described by one physician as a
"blind policy of protecting the guilty at the expense of the innocent," and a New York
attorney ventured the opinion that "a physician who knows that an infected patient is
about to carry his contagion to a pure person, and perhaps to persons unborn, is justified
both in law and in morals, in preventing the proposed wrong by disclosing his knowledge
if no other way is open" (7).

Unfortunately, the right to privacy issue was no more resolved in the early 20th century
United States than was the public health problem created by the nationwide problem of
sexually transmitted diseases. Public health officials continue to struggle with
questions associated with privacy and the rights of the individual versus the good of
society to this day.

The landmark case relating to the right of an individual to privacy was Griswold v.
Connecticut, 381 U.S. 479 (1965), which resulted from the arrest of the director of the
Planned Parenthood League of Connecticut (Griswold) on the grounds that she had provided
information, instruction, and medical advice about contraception to married people. In
Connecticut at the time, state law made the use of contraceptives a punishable
offense. Subsequently, the U.S. Supreme Court declared the Connecticut law to be
unconstitutional and reversed the criminal convictions in the case. In the majority
opinion written for the Court by Justice William Douglas, there are references to the
so-called "penumbras" or auras of privacy that radiate out from the specific rights to
privacy stated in the Bill of Rights. He observed that "various guarantees create zones
of privacy" (8). He went on to say that the Connecticut law exceeded its bounds by
seeking to regulate the use of contraceptive devices rather than their manufacture and/or
sale. The only means he could postulate for enforcing the law as written involved the
invasion of the clearly defined zone of privacy represented by marriage. Lest anyone
misunderstand his meaning, he observed: "Would we allow the police to search the sacred
precincts of marital bedrooms for telltale signs of the use of contraceptives? The very
idea is repulsive to the notions of privacy surrounding the marriage relationship" (8).

Later courts would refer to this constitutionally recognized right of the individual to
privacy in certain contexts as a "fundamental interest." In the precedent-setting
abortion case of Roe v. Wade, 410 U.S. 113 (1973), a single woman challenged the
constitutionality of a Texas law forbidding abortion (except when the pregnant woman's
life was in jeopardy). She claimed that this law denied her constitutional right to
privacy and cited the earlier opinions of the Supreme Court relating to birth control.
Justice Blackmun observed that "the state does have an important and legitimate interest
in preserving and protecting the health of the pregnant woman . . . [and] it has still another
important and legitimate interest in protecting the potentiality of human life. These
interests are separate and distinct. Each grows in substantiality as the woman approaches
term and, at a point during pregnancy, each becomes 'compelling'" (9).

The link between the right to privacy and surveillance is also related to the Freedom of
Information Act (amended 1986). In essence, the latter act spells out the situations and
conditions pertaining to the right of the U.S. taxpayer to obtain information s/he has
paid for from agencies within the Federal Government. Clearly, there is the potential
for conflicting interests in such situations, if information about taxpayer A is released
to taxpayer B. The act takes this point into consideration in its statement that "to the
extent required to prevent a clearly unwarranted invasion of personal privacy, an agency
may delete identifying details when it makes available or publishes an opinion, statement
of policy, interpretation, or staff manual or instruction" (10).

An essential aspect in designing a surveillance program is the assurance to the persons
(agencies) who report and those being reported upon that the privacy rights of the persons
whose health information is of interest will not be violated. The conflict created by
the "right to privacy" and the "need to know" represents an area that must be monitored
by the managers of a surveillance program as diligently as they monitor the health
conditions to be reported. To illustrate: One of the most important court decisions the
Centers for Disease Control (CDC) has obtained in recent years related to litigation
arising out of the epidemic of toxic-shock syndrome of the late 1970s and early 1980s.
The attorneys representing the manufacturer of the tampon that had been strongly
statistically associated with the occurrence of toxic shock syndrome wanted to obtain not
only data about women who had had toxic shock syndrome and from whom CDC had collected
information but the names of the women as well. The agency argued (through district court
and up to the Federal Court of Appeals) that participation in federal surveillance is
voluntary and that participants in such programs have a reasonable expectation that their
confidentiality will be protected by the Federal Government. The Appeals Court ruled in
CDC's favor, but this position will continue to be challenged on a "need to know" basis,
and persons who are designing and operating surveillance systems should always keep in
mind the specter of the forced divulgence of information they have assured participants
would be confidential. This is particularly likely in situations involving litigation,
because of the courts' strong bias to make available the same information to legal
representatives for both plaintiffs and defendants.

The final observation in this section is that the manager of a surveillance program, at
least within a federal agency, is always in danger of being accused by the popular media
or the legal community of hiding something deliberately--not to protect the privacy of
individuals, but for sinister reasons that are usually hinted at but not stated. This
sort of accusation may have no basis in fact, but must be taken seriously and generally
requires, at a minimum, an undesirable outlay of energy and worry on the part of the
surveillance program manager.

Right of Access

If the taxpayers support the gathering of information, they have a right to that
information (12). This statement forms one basis for the "right to access" position.
Both the Privacy Act and the Freedom of Information Act reflect the post-Watergate era,
with its focused concern on the potential for the government to keep secret files
containing information on individuals. Beyond that is the "reasonable man" position,
which maintains that a person has a right to any information that is about him/her.
Unfortunately, giving information to an individual about himself/herself can sometimes
have the effect of providing information that assigns liability to another person (or
organization) in the data set. So even the process of providing personal information to
the person in question is not without its hazards.

In addition to the individuals who wish to obtain information about themselves, there are
the so-called "third-party" inquirers. These individuals call for information on a need-
to-know basis and may range from members of the U.S. Congress through attorneys and
special-interest groups (e.g., "right to life" or "pro-choice" groups) to representatives
of the news media.

A major point for the surveillance program manager to ponder is when to make a public-use
data set. Although there is no legal precedent to be followed here, once the first paper
has been published about a data set, it is prudent to place that data set in the public
domain if there is a reasonable expectation of its further use. Although this creates
the risk of extra work and having others preempt publication, it obviates accusations
about willful withholding of information or the danger that forced release of data before
they are properly prepared for public use will allow some subjects to be identified.

Product Liability

This heading could be "Research Institution Discovers Corporate America--and Vice Versa."
The issue has been around for many years but seemed to rise to prominence in the United
States with the emergence of toxic-shock syndrome in the late 1970s and early 1980s. It
is not unusual for investigations to show that a product is contaminated, that someone
used a machine incorrectly, or even that someone deliberately tampered with a medication
or device and caused illness or death. What was not familiar was that a "good" product,
one that meets all its quality-control specifications and does what it is advertised to
do, can also have effects that are less than desirable. Thus, no one was ready to deal
with the situation in which an efficiently designed tampon apparently led to a
life-threatening illness. The scientists had to accept the findings because scientists deal
in fact (probability), and the media had grist for their mills, but the manufacturer of
the tampon (and its employees and stockholders and legal representatives) did not have
an easy time coping with "the facts." In fact, they underwent a classic grief reaction--
which the staff at CDC and other health science agencies have since learned to anticipate
and to recognize--involving the stages of denial, anger, depression, acceptance, and
resolution. Human nature was applied with a vengeance, and the first three stages were
immediate, intense, and enduring. The last two stages took some time and extensive effort
to induce.

Ideally, one should assure that surveillance programs are flawless and that all the
information reported is unassailable. In the world of public health practice, such
Utopian standards can rarely be met. And public health practitioners must continue to
be prepared to deal with issues on a mixture of levels--including public health, legal,
ethical, socio-cultural, and emotional components.

Litigation Demands

Under litigation demands, the issue is to what extent an agency is responsible for
providing its staff to testify in litigation relating to findings it obtained through
surveillance or research. Of course, there is no simple answer, just as there have not
been any simple answers to the other questions posed in this chapter. Clearly, it is not
responsible to refuse to provide expert testimony in any instance in which it is
solicited. In some cases, agency scientists may be the only ones who have worked in the
area in question and have facts to cite. By the same token, in situations in which there
are massive numbers of suits being conducted over a period of several years (as with
toxic-shock syndrome or transfusion-associated HIV infection), all of the scientific
resources of an agency could be expended on time in court and, therefore, none of them
on the science that is their primary business. Somewhere, there is a correct answer for
each agency and each health issue, and this problem may need to be faced when planning
surveillance activities.

CONCLUSION

For those who set up and run surveillance programs, it is important to note the following
summary comments. Public health surveillance systems operate in the massive goldfish bowl
that encompasses both public health practice and the law.

• Plan and design surveillance systems so that they are most likely to provide
all the information and only the information actually needed.

• Include as few personal identifiers as feasible.

• Analyze and publish data in a responsible and timely fashion.

• Be prepared to stand behind the results (and hope your agency will stand
behind you).

• Be prepared to place each data set in the public domain as soon as the first
results are published.

• If the findings are revolutionary, be prepared for a hostile reaction
rather than a medal.

• Finally, remember that the individual has rights (to privacy, to access
information, to participate or not to participate in surveillance programs,
and the like). The public health practitioner, at least in the role of
public health practitioner, has no rights--only responsibilities.

Public surveillance constitutes one of the bridges between what we think is happening and
what is actually happening. As such, it is one of the most valuable tools of the public
health practitioner. With surveillance data as the light bulb and the law as a rheostat
that stimulates change and regulates behavior, the two areas can work in concert to
improve the quality of the public's health.

REFERENCES

1. Wing KR. The law and the public's health. Ann Arbor, Michigan: Health
Administration Press, 1990:1-50.

2. Friedman LM. A history of American law. New York, New York: Norton Press, 1986:1-8.

3. Thacker SB, Berkelman RL. Public health surveillance in the United States.
Epidemiologic Reviews 1988;10:165.

4. Osler W. Internal medicine as a vocation. In: Aequanimitas: with other addresses
to medical students, nurses and practitioners of medicine. 3rd edition.
Philadelphia, Pennsylvania: W.B. Saunders, 1932:131-46.

5. Hogue LL. Public health and the law: issues and trends. Rockville, Maryland:
Aspen Systems Corporation, 1980:10.

6. Brandt AM. No magic bullet: a social history of venereal disease in the United
States since 1880, 1985:157-8.

7. Parran T. The next great plague to go. Survey Graphic 1936:405-11.

8. 381 U.S. at 484-86.

9. 410 U.S. at 162-64.

10. The Freedom of Information Act (As Amended). Washington, D.C.: Government Printing
Office, 1986.

11. Abraham HJ. Freedom and the court: civil rights and liberties in the United
States. New York, New York: Oxford University Press, 1988:23.

CHAPTER XI

Computerizing Public Health
Surveillance Systems

Andrew G. Dean
Robert F. Fagan
Barbara Panter-Connah

"We only conquer what we wholly assimilate."

André Gide

In this chapter on informatics or computerization of surveillance systems, we will
first explore what is technically possible in computerization of surveillance, finding
an enormous gap between this and the best of today's actual systems. The barriers to
optimal use of computers in surveillance--mostly social, organizational, and legal--
are explored. The remainder of the chapter explores some of the problems that must be
confronted in thinking about microcomputer-based surveillance, leaning heavily on
examples from the notifiable disease system in the United States.

OVERVIEW OF A SURVEILLANCE SYSTEM IN THE FUTURE

An Ideal Surveillance System

Ideally the epidemiologist of the future will have a computer and communications
system capable of providing management information on all these phases and also
capable of being connected to individual households and medical facilities to obtain
additional information.

Suppose that the epidemiologist of the future has a computer with automatic input from
all inpatient and outpatient medical facilities, with standard records for each office
or clinic visit and each hospital admission. S/he chooses to compare today or this
week with a desired period, perhaps the past 5 years, and the computer displays or
prints a series of maps for all conditions with unusual patterns. One of the maps
seems interesting, and the epidemiologist may point to a particular area and request
more information. A more detailed map of the area appears, showing the data sources
that might provide the desired information, with estimates of the cost of obtaining
the items desired. A few clicks of the mouse button select the sources, types of
data, and format for a display, and the computer spends a few minutes interacting with
computers in the medical facilities involved--extracting information and paying the
necessary charges from the epidemiology division's budget. Soon the more detailed
information is displayed on the epidemiologist's computer screen.
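The comparison of the current week with a historical baseline can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the function name, the weekly counts, and the rule of flagging a count more than two standard deviations above the baseline mean are all hypothetical and are not taken from any existing surveillance system.

```python
from statistics import mean, stdev


def flag_unusual(current, baseline, z_threshold=2.0):
    """Return True when the current count exceeds the baseline mean
    by more than z_threshold sample standard deviations.

    The threshold rule is an assumption chosen for illustration.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline counts")
    limit = mean(baseline) + z_threshold * stdev(baseline)
    return current > limit


# Hypothetical counts of asthma visits for the same week
# in each of the past 5 years (invented for this example).
baseline_counts = [12, 15, 11, 14, 13]

print(flag_unusual(31, baseline_counts))  # True
print(flag_unusual(14, baseline_counts))  # False
```

A working system would also need to account for seasonality, reporting delays, and small counts, none of which this sketch attempts.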

The pattern of hospitalizations and outpatient visits for asthma stands out, and the
epidemiologist requests a random sample of specified size of persons who have ever had
asthma in the same area, matched by age and gender, to serve as controls for a case-
control study. The video-cable addresses of these "controls" and of the case-patients
are quickly produced through queries to appropriate local medical-information sources.
The epidemiologist formulates several questions about recent experiences, types of air
conditioning, visits to various public facilities, and the like, adapts these to a
previously tested video questionnaire format, and requests that video interviews be
performed for case-patients and controls. Each household is contacted or left a FAX-
like request to tune to a particular channel and answer a 5-minute query from the
state health department on a matter of importance to public health. Eighty-five
percent of the subjects respond to the first query, and the computer automatically
follows up with the rest, bringing the response to 92%, with half of the remainder
reported to be absent from their homes for at least 2 days.

The odds ratio for persons with recent hospitalizations for asthma who work in or
visit in a particular neighborhood is considerably higher than 1.0, and the
epidemiologist connects by local-area network to the state occupational surveillance
system and requests a display of all factories in the relevant area. Selecting those
that deal with possibly allergenic materials, s/he issues a request for more detailed
investigation of activities at the plants in a selected time interval. The
epidemiologist also requests information from the weather bureau on wind direction and
velocity, temperature, and rainfall.
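As a rough illustration of the odds ratio mentioned above, the sketch below computes it from a 2x2 table together with an approximate 95% confidence interval (the log-based, or Woolf, method). The counts are invented for the example. Note also that the scenario describes matched controls, for which a matched-pair analysis would ordinarily be preferred; the unmatched estimate is used here only to keep the example short.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Unmatched odds ratio with an approximate confidence interval.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf method).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: exposure = works in or visits the neighborhood.
estimate, lower, upper = odds_ratio(a=40, b=20, c=25, d=55)
print(estimate)  # 4.4
print(lower > 1.0)  # True: the interval excludes 1.0 for these counts
```

An interval that excludes 1.0, as with these invented counts, is the kind of result that would justify the follow-up requests described in the scenario.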

Within a few hours, a plant is identified that is in the process of moving a large
pile of by-products with a bulldozer. A request is issued that the by-product be
sprayed with water to prevent its particles from becoming airborne, and the plant
manager readily agrees when shown the maps that depict hospitalization rates for
asthma downwind from the plant. To monitor progress and widen the investigation, the
epidemiologist asks the computer to do similar studies for conjunctivitis and for
coryza or hay fever over the previous and next 2 weeks. Selecting several maps and
tables to include in the report, s/he asks the computer to write a description of the
studies performed and the findings, and then dictates a brief summary of the problem
and several follow-up notes to the voice port of the computer. At the end of 2 weeks,
the number of cases of asthma has fallen to normal for the area, and the computer
calculates on the basis of the number of medical visits during the outbreak that
$55,000 has been saved at a total cost of a few hours of the epidemiologist's effort,
a site visit to the plant, and charges of $9,500 for the data and the communication
facilities used to perform the interviews.

Barriers to the Ideal Surveillance System

Obviously, we are a long way from implementing the system described above. It may be
helpful in thinking about the future to explore what barriers must be surmounted
before this scenario can be enacted. Strangely enough, few of them are technical; all
of the necessary systems could be built today with fairly conventional equipment and
software, with the exception of the two-way interactive video connection with each
household. This hook-up with the individual household is more likely to be available
within the next 10 years than is the connection between the physician's record files
and the health department. In fact, the two-way interactive video link between the
household and the outside world is simply awaiting the government's or the
marketplace's decision on what format will be used and on the realization of the
benefits of such a connection on the part of the entrepreneurs and the public.

However, there are some difficult problems to be solved before the "ideal system" can
be implemented. They include the following:

a) The rapid availability of standardized, computerized medical
records. Several issues need to be addressed before such a system is
possible. In the United States, for example, a profusion of computerized
medical-record systems for inpatient and outpatient records as well as
insurance and other purposes has been developed. These systems contain a
plethora of different variables and use many different formats. Until a
simple core public health record of age, gender, geographic location,
diagnosis, and a few other items is created for each outpatient visit and
each hospitalization--and is available in a standard format without
delay--the responsive interactive system above remains an unrealistic
pipe dream. An additional problem is that most medical records are still
only partially computerized.

The barriers to establishing standardized public health output from
computerized medical records are primarily political and administrative;
most large retail organizations create records of similar size for each
item sold, and the items carry, on average, a much lower price than the
cost of a visit for medical care. Once there is the will to establish a
national computerized medical record system, the technical hurdles will
be readily overcome. The needs include standard but suitably flexible
record formats, solutions to problems associated with confidentiality,
incentives to create the records (including the assurance of appropriate
and cost-effective use of the records), and voice output.

b) Another problem is the lack of recognition that information about
patients, except for legally designated "reportable diseases," is useful
in public health and should be available to public health agencies. The
level of awareness could be heightened if technical solutions to problems
of confidentiality were publicized and understood by the public and their
legislative representatives. Such solutions as one-way encoding
algorithms could provide partial solutions to matching and follow-up
problems, if properly used without turning public health agencies into
carbon copies of dreaded "big brother."

c) A pervasive feeling among those in charge of data that their data base
must be "clean" before anyone else can use it. Months or even years are
consumed while corrections and updates are made to make the data as
accurate as possible. Although from one perspective this quality control
is necessary and important, the concept of "surveillance" includes rapid
turnaround, a realization on the part of everyone concerned (even the
media and the public) that the data are preliminary, and the
understanding that in order to look at today's data today, one must be
willing to accept today's imperfections. This mental shift, as well as
corresponding technical developments, will be necessary before a
computerized system can be used to examine automatically a "time slice"
of disease and injury records that originate in clinics and hospitals.
Imperfections will be everywhere, and methods must be found to cope with
reality--even if it includes warts--on an immediate basis.
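
The one-way encoding mentioned in item b) can be illustrated briefly. The
Python fragment below is a sketch only and is not a method described in this
chapter: it assumes a keyed hash (HMAC with SHA-256) as the one-way function,
and the key, the choice of name and birth date as inputs, and the function
name encode_identifier are all hypothetical. Two agencies that hold the same
key derive the same code for the same person, so records can be matched
without the name itself being exchanged.

```python
import hashlib
import hmac

def encode_identifier(name: str, birth_date: str, key: bytes) -> str:
    """Return a one-way code for a person; the name cannot be read back from it."""
    # Normalize so that differences in case or spacing do not block a match.
    normalized = " ".join(name.upper().split()) + "|" + birth_date.strip()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

SHARED_KEY = b"example-key-held-by-both-agencies"  # hypothetical secret

clinic_code = encode_identifier("Jane  Doe", "1950-03-14", SHARED_KEY)
hospital_code = encode_identifier("jane doe", "1950-03-14 ", SHARED_KEY)
other_code = encode_identifier("John Roe", "1948-11-02", SHARED_KEY)

# Same person at two sources gives the same code; different people do not.
records_match = clinic_code == hospital_code
```

Such a scheme is only a partial solution, as the text notes: misspelled names
produce different codes, and the key must be guarded as carefully as the names.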

The Technology of the Future

As stated above, today's technology, given enough social and organizational
development, is adequate to allow the creation of miracles in public health
information and communication. Nevertheless, it seems likely that development in
technology will continue to reflect more of a driving force in public health computing
than progress in political and social organization.

Technologic developments over the next decade will probably include the areas shown
below:

High capacity storage devices

CD ROMs (compact disk read-only memory), similar to the disks used for music, make it
possible to have access to large bibliographic data bases anywhere there is
electricity. The MEDLARS data base of the U.S. National Library of Medicine can be
searched from a clinic in Africa; once books on CD ROM are cheaper and include the
needed illustrations, it will be possible to take a medical library
anywhere in a briefcase. Past data bases from the United States and elsewhere will
become available on CD ROM, although the process of cleaning them up for this purpose
often reveals gaps and inconsistencies that reflect changing definitions and diminish
their value as consistent anchors for comparison.

Networks

A local area network (LAN) is a system linking microcomputers, terminals, and workstations
with each other and/or a mainframe computer to facilitate sharing of equipment (e.g.,
printers), programs, data, or other information. LANs are transforming the way many
agencies do business. The most noticeable effect is the transmission of written
memoranda that could not or would not have been typed, packaged, and sent through a paper
system. The cost of installing and supporting a LAN is not small, particularly in
terms of support personnel. Uses for surveillance include entering data at multiple
computers connected by a LAN. This requires special software to protect against
errors. Special precautions to protect confidentiality are necessary in a network, if
several people enter data in the same file at the same time.

New user interfaces

The parts of programs that interact with users have become easier to understand, and
more attractive, with pull-down menus, windows, and pointing devices such as the
"mouse." This elegance has its cost in terms of requirements for faster computers,
for more memory, and particularly for greater skill to produce such programs. Some
new programs cause unexpected problems when run with older programs or on older
computers. All in all, the trend is toward a standard set of screen "controls," like
those in modern cars, but the path in that direction is replete with experiment and
minor failures.

New programming tools

It is widely recognized that software production is the narrow point in the
implementation of new ideas in computing. Useful software still requires hundreds of
thousands of lines of hand-written and highly personal "coding." Many new trends such
as "fourth-generation data bases," computer-aided software engineering (CASE) tools, and
"object-oriented design" have made programming more productive, but this area of new
tools is one in which major advances would create revolutionary changes.

Higher-capacity processors and more memory

The almost miraculous advances in computer speed and memory capacity in the last
decade have removed many of the limits that required use of mainframe computers or
minicomputers rather than microcomputers. Now almost any project can be done on a
microcomputer or several microcomputers connected by a LAN if there is sufficient
motivation.

Video and computer integration

Photographs and fully functional video will soon be appearing on our computer screens.
Although this may have the greatest impact in pathology, radiology, and education, it
also offers opportunities to use color and three-dimensional dynamic displays for
epidemiologic data. The possibilities for computer interaction via ordinary
television sets are exciting, because every epidemiologist (and market researcher) can
savor the possibility of interviewing citizens via cable television with the results
captured immediately in computerized form. The medium offers new challenges in
identifying responses that result from the various stages of humor, exasperation, or
intoxication that citizens may undergo in the privacy of their homes.

Voice and pen input

Systems are available now that identify thousands of spoken words (for tens of
thousands of dollars) and allow for a crude interaction between voice and computer.
Computers that recognize handwritten text of reasonably structured type are being sold
currently. Presumably the rather elementary state of computerization of medical
records will undergo a quantum leap once such systems allow medical staff to dictate
to the computer without typing and preferably without being near a computer. When
medical handwriting is replaced by voice dictation into a lapel microphone, real
progress may occur in the use of computers in both clinical medicine and public health
settings. As stated above, however, realizing real public-health benefit from such
technology will require dramatic social and legal changes.

BACK TO THE PRESENT: COMPUTERIZED PUBLIC HEALTH
SURVEILLANCE IN 1992

Since 1985, Centers for Disease Control (CDC) staff have installed and maintained
customized disease-surveillance software in 36 state health departments and a number
of county, district, and territorial departments. The software has been based on Epi
Info, a public-domain word-processing, database, and statistics package for IBM-
compatible microcomputers that is a joint product of CDC and the Global Programme on
AIDS, World Health Organization (1,2). These systems have made possible the
participation of all 50 states in the National Electronic Telecommunications
Surveillance System (3,4). Benefits cited in a recent evaluation include improved
access to data and improvement in both quality of data and access associated with
decentralized entry of data (5).

Although reportable-disease systems are a specific kind of surveillance system and Epi
Info is only one type of data-base/statistics program around which a system can be
built, many of the principles of computerization apply to other systems. To avoid
empty generalization, much of the rest of this chapter is based on CDC's experience
with reportable-disease surveillance using Epi Info. The information is directed to
those considering computerization of a disease-surveillance or similar system of
records, whether they wish to do their own system design or will be working with a
professional computer-systems designer. Computerizing a surveillance system for
disease is not easy. Since the success of computerization depends as much on the
administrative and epidemiologic environment as on the software, it is vital that
public health practitioners understand the details of a new system and participate in
its design. The most important step in developing a computerized surveillance system
is identifying the public health objective for the system. In some cases, the
objective(s) will have been clear for decades in a manual system ("Identify and treat
or isolate cases of X and evaluate results," or "Assess results of immunization
programs and identify new cases for special control efforts"). Computerization can
then be directed toward accomplishing the same task more efficiently or in greater
volume or detail.

The most successful computer systems, however, are those that change methods by which
an agency operates rather than those that merely automate a manual task (6). In
establishing a new surveillance system or reexamining an existing system, it may be
useful to address the following question: "What key pieces of information do I want
to see on my desk (or computer screen) every day, week, month, or year that will make
my work easier or more effective?" The same question can be asked at several levels
of management--from epidemiologic technician to epidemiologist to director of a public
health agency.

Given a surveillance system that has a public health goal and to some extent achieves
the goal, why computerize? Sometimes the answer is obvious--"because the annual report
takes a herd of clerks 2 years to process," or "we like the graphs health department A
turns out so easily with their computer." Potential benefits relate to quality of
data or of reports, quantity of data that can be processed, and speed of processing.
Dissemination (copying) of surveillance records to another site is one reason disease
reports in all 50 U.S. states are computerized.

We were unable to find systematic studies on the benefits of computerizing public
health surveillance systems, although numerous articles describe individual systems
that have been computerized (7-10), and Gaynes et al. (21) describe methods for
evaluating a computerized surveillance system. In literature about the commercial
world, benefits of computerization have been examined from the viewpoint of financial
savings. Savings by automating a manual information process may amount to 20% or so,
but the real benefits are achieved if computerization transforms the entire process
concerned, giving a competitive advantage in the commercial world--which would
correspond to a new order of service in the public health world (6). So far, most
public health applications have automated manual systems, although some--such as the
spreadsheet calculation of the impact of smoking on populations--verge on establishing
new and previously unknown styles of doing business (12).

One problem is shared with other "vertical markets" (industries with specialized
practitioners) such as the construction, meat-packing, and real estate industries:
the market for specialized software is small. With only 7,000 epidemiologists in the
United States, relatively few commercial developers feel that it is financially
worthwhile to develop software for this market alone, since applications such as
spreadsheets, languages, and word processors may sell millions of copies to the
general public (13).

Basic Needs

The first requisite for computerization is a paper system or operational design that
works reasonably well or would do so if the process were speedier and more accurate.
Chaos computerized is not necessarily an improvement over what is already in place,
although the process of computerization offers a chance to rethink some of the
features of a system and to make improvements. If the surveillance system is a new
one, it may be desirable to evolve the computer facilities in small stages with
minimal investment until the system proves to be useful and well-conceived. This
requires a careful plan (including provision for changing the plan if necessary) but
will minimize the expense of adaptation as the epidemiologic design of the system
undergoes the inevitable adaptation to external reality. After the "bare bones"
system has proven its worth and the probability of expensive changes is lower, the
"bells and whistles" can be added later.

Personnel to do the collection of data, data entry, analysis, and system maintenance
are important contributors to the system. Many of the tasks can be learned by current
employees, particularly if they find this challenge welcome. If possible, those
chosen should be long-term employees to assure stability of the system, although they
may be aided by students and other temporary employees. The epidemiologist who will
use the results should participate in the planning of the system and should understand
how it is constructed. A staff member with some programming skills and/or aptitude
for microcomputing should be involved in designing and setting up the system, even if
an outside consultant does the actual programming.

If several computers are to interact and share data, a set of standards is necessary
(e.g., just as humans carrying on a conversation need a common language). In the
United States, the states and CDC chose a standard record format so that computers of
different types could reformat data to a set of standard records and send these to the
central agency. This standard, first devised in 1984 and revised in 1991, has served
the purpose well, without placing unnecessary restrictions on the type of hardware or
the format of records kept within each state. One state maintains 20 times more
information for local use than do other states, but all export the same standard
record formats to the national level. The new standard record format allows for
standard demographic and diagnostic information, attachment of variable-length
detailed reports for selected diseases, mixture of summary with individual records,
and automatic comparison of state and national data bases with each transmission.
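
The idea of reformatting local records into one agreed export format can be
sketched in a few lines of Python. This is an illustration under stated
assumptions, not the actual national record format: the field names, the
widths, and the disease code 10110 are hypothetical. The state keeps whatever
extra fields it wishes, and only the standard fields are exported.

```python
# Hypothetical fixed-width layout: (field name, width). Not the real standard.
STANDARD_LAYOUT = [("STATE", 2), ("YEAR", 4), ("WEEK", 2),
                   ("DISEASE", 5), ("AGE", 3), ("SEX", 1)]

def to_standard_record(local_record: dict) -> str:
    """Keep only the standard fields, padded or truncated to fixed widths."""
    parts = []
    for name, width in STANDARD_LAYOUT:
        value = str(local_record.get(name, ""))
        parts.append(value[:width].ljust(width))
    return "".join(parts)

# A local record with extra fields that are kept for local use only.
local = {"STATE": "GA", "YEAR": 1992, "WEEK": 34, "DISEASE": "10110",
         "AGE": 7, "SEX": "F", "SCHOOL": "Elm Street", "INVESTIGATOR": "RJ"}

line = to_standard_record(local)
print(repr(line))
```

Because every exported line has the same length and field positions, the
receiving computer needs no knowledge of how each state stores its own files.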

Most government settings have an organization in charge of computer programming,
approval of new systems, and purchasing of computers and software. It is important to
maintain liaison with this organization and to arrange its assistance ahead of time
with difficult areas such as purchasing computers. In some organizations, purchases
are limited to particular types of computers--occasionally with unique
characteristics--or to centrally administered systems. We recently encountered a
network of "diskless" workstations that presented numerous problems in trying to load
or run software or back up files from a particular station without a removable storage
device. If such problems are present, it is prudent to discover and, if possible, to
surmount them at an early stage through patient negotiation and collaboration or other
methods if necessary. The technical difficulties that arise in setting up a computer
system are usually the easy problems; the difficulties that lead to months and years
of delay and unhappiness usually reflect misunderstanding and miscommunication among
individuals or organizational entities.

Some Key Concepts: Files, Records, and Fields

Computerized records are stored in files. A file is a collection of records, usually
one record per case, that has a name (e.g., GEPI.REC, for General EPIdemiology) and
can be manipulated as a unit. Files, like books, can be opened, closed, read, written
to, or discarded. They are stored on nonvolatile media such as hard or floppy disks
or magnetic tape.

Records correspond to one copy of a completed questionnaire or form, such as a
disease-report card. Usually, one disease report or questionnaire is stored in a file
as a single record. Records can be displayed on the screen, searched for by name or
some other characteristic, saved (written) to a disk, or marked as deleted. Many
records can be stored in each file.

A field is one item of information within a record. NAME, AGE, and DATEONSET might be
fields within a disease-report record. Records in a particular file all have the same
fields. Each field has a name, a type (text, upper-case text, numeric, date, etc.),
and a length, such as 22 characters for NAME or 3 for AGE. During analysis, fields
may be called variables, and commands such as "TABLES DISEASE COUNTY" are used to
instruct the system to process a particular file and construct the desired table by
tabulating the fields or variables called DISEASE and COUNTY. In this case, the
result in Epi Info would be a table that lists DISEASE down the left side and COUNTY
across the top, with numbers of reports by county indicated in the cells of the table.
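
The relationship among files, records, and fields, and the kind of table
produced by a command such as TABLES DISEASE COUNTY, can be sketched in
Python. This is not Epi Info code; the function name tables and the four
records are invented for illustration. The file is represented as a list of
records, and each record maps field names to values.

```python
from collections import Counter

# A "file" as a list of records; the data are invented for illustration.
gepi = [
    {"NAME": "A", "AGE": 34, "DISEASE": "Measles", "COUNTY": "Adams"},
    {"NAME": "B", "AGE": 5, "DISEASE": "Measles", "COUNTY": "Baker"},
    {"NAME": "C", "AGE": 61, "DISEASE": "Hepatitis A", "COUNTY": "Adams"},
    {"NAME": "D", "AGE": 8, "DISEASE": "Measles", "COUNTY": "Adams"},
]

def tables(records, row_field, col_field):
    """Count records for each (row value, column value) pair."""
    return Counter((r[row_field], r[col_field]) for r in records)

counts = tables(gepi, "DISEASE", "COUNTY")
rows = sorted({row for row, _ in counts})
cols = sorted({col for _, col in counts})

# Print the row variable down the left side and the column variable across the top.
print("DISEASE".ljust(12) + "".join(c.rjust(8) for c in cols))
for row in rows:
    print(row.ljust(12) + "".join(str(counts[(row, c)]).rjust(8) for c in cols))
```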

Hardware: What Size Computer is Appropriate?

With microcomputers being available for much less than $5000, it is possible to
process more than 100,000 records in reasonable time periods. Processing time tends
to reflect the record length as well as the number of records, however, and the size
of each record should be kept short if large numbers will be processed. Since the
total number of disease reports for the United States is several hundred thousand per
year, states and counties should find it possible to build most systems on a
microcomputer if desired.

Minicomputers and mainframes can serve as the basis for surveillance systems if
available at reasonable cost and if programming and support staff are available to
work creatively with staff of the surveillance system. The greater technical skill
required to run and program such computers often resides in an organization other than
the one running the surveillance system, and close coordination becomes much more
important than in the do-it-yourself situation with a microcomputer.

Systems that seem to require processing of millions of records, such as hospital
discharge or Medicare records for a state, can be reduced by sampling to a manageable
size for the microcomputer. The mainframe can be used to select a sample of records
(e.g., particular age groups, diseases, every tenth record, or persons born in decade
years). Files are then exported for processing on a microcomputer that is more
responsive to the epidemiologist's wishes. Epidemiologists are usually acutely
conscious of sample size when performing interviews but sometimes fail to recognize
how unnecessary it is to process 6 million records to estimate a simple proportion.
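
The point about sample size can be checked with a small simulation. The
Python sketch below rests on assumptions that are not in the text: an
invented file of 600,000 records (smaller than the 6 million mentioned, to
keep the run short), a true proportion of about 4%, and the usual
simple-random-sample formula for the standard error, which is only an
approximation for a systematic sample.

```python
import math
import random

random.seed(1)  # fixed seed so the illustration is repeatable

# Invented population: True marks a record with the condition of interest.
population = [random.random() < 0.04 for _ in range(600_000)]

sample = population[::10]            # systematic sample: every tenth record
n = len(sample)
p_hat = sum(sample) / n              # estimated proportion
std_error = math.sqrt(p_hat * (1 - p_hat) / n)
ci_low = p_hat - 1.96 * std_error    # approximate 95% confidence limits
ci_high = p_hat + 1.96 * std_error

true_p = sum(population) / len(population)
print(f"sample of {n}: {p_hat:.4f} ({ci_low:.4f}-{ci_high:.4f}); full file: {true_p:.4f}")
```

With 60,000 sampled records the standard error is below one tenth of a
percentage point, which is adequate for most surveillance purposes.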

Software

The type of software used to perform the computerization is often less crucial than
the skills of those who will program and run it. Usually, there are several types of
data-base or statistical packages that will do a given task well if properly
programmed. Beware of the "indispensable programmer" syndrome, in which a single
expert programmer writes a system in his or her favorite language and then departs for
greener pastures, leaving the users without resources for further maintenance.

Data-base packages such as dBase, Paradox, Foxbase, and Clipper are designed to allow
data input, storage, retrieval, and editing. Most will count records but do not
easily do such statistics as odds ratios. They require a skilled programmer to
produce a customized system.
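
For readers unfamiliar with the statistic, an odds ratio from a two-by-two
table takes only a few lines once the four counts are available. The Python
sketch below is illustrative and is not taken from any of the packages named;
the confidence limits use Woolf's logit approximation, which is one of
several methods in common use, and the counts are invented.

```python
import math

def odds_ratio(a: int, b: int, c: int, d: int):
    """Odds ratio and approximate 95% limits (Woolf's method) for a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls. All counts must be nonzero.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(estimate) - 1.96 * se_log)
    high = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, low, high

# Invented counts: 30 exposed cases, 10 exposed controls,
# 20 unexposed cases, 40 unexposed controls.
estimate, low, high = odds_ratio(30, 10, 20, 40)
print(f"odds ratio {estimate:.1f}, approximate 95% limits {low:.2f}-{high:.2f}")
```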

Statistics packages, such as Statistical Analysis System (SAS) and Statistical Package
for the Social Sciences (SPSS), focus on producing statistical reports, usually from
single files of data. They are less convenient for data entry. Both SAS and SPSS now
have mainframe and microcomputer versions. They contain many routines rarely used by
epidemiologists and occupy large amounts of disk space (tens of megabytes for SAS).

Epi Info provides a combination of data-base and statistical functions, allowing
relational linking of several files during data entry or analysis. Questionnaires or
forms may be up to 500 lines, with hundreds of numeric or text fields, and the number
of records is limited only by disk storage space. Frequencies, cross tabulations,
customized reports, and graphs can be produced through commands contained in a program
file or interactively from the keyboard. Commonly used epidemiologic statistics are
part of the statistical output. Although it takes little experience to use Epi Info
for investigating outbreaks, producing a complete surveillance system from the
beginning takes both skill and time. It may, however, be much simpler to modify
software supplied with the program.

It is important to realize the limitations of software packages before they are used.
Both statistical and data-base packages typically cost at least several hundred
dollars and therefore are not likely to be feasible for classes of students or large
numbers of remote computers.

Some data-base packages limit the number of fields in a record or the number of
records in a file, and few will do statistics without advanced programming or purchase
of a supplementary package. Statistics packages, on the other hand, may have
limitations in handling textual ("alpha") data, and most allow processing of only one
file at a time. A complete surveillance system may require the functions of both
data-base and statistical programs.

The current version of Epi Info, however, has limitations on the number of records that
can be sorted or linked at one time (tens of thousands), and since text fields are
limited to 80 characters, Epi Info would not be a good choice if large amounts of text
are to be stored, as in a complete clinical system containing dictated notes.

Designing Entry Forms

In a surveillance system, data items are usually entered in a standard format (e.g., a
questionnaire or report form). The information is stored in files containing one
record per individual. In Epi Info, the format of the data-base file is specified by
typing a questionnaire or form in the word processor. The result resembles a paper
form, with entry blanks indicated by special symbols (e.g., underlined characters for
text fields and number signs for numeric fields). The computer reads the form and
constructs a file in the proper format.
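
How a program might derive field definitions from a typed form can be
sketched as follows. This Python fragment is a simplified illustration and
does not reproduce Epi Info's actual rules: it assumes one field per line, a
one-word label used as the field name, underscores for text fields, and
number signs for numeric fields.

```python
import re

# A three-line form built with explicit blank lengths (22, 3, and 12).
FORM = "\n".join([
    "Name    " + "_" * 22,
    "Age     " + "#" * 3,
    "County  " + "_" * 12,
])

# A label followed by a run of underscores (text) or number signs (numeric).
FIELD_PATTERN = re.compile(r"^\s*([A-Za-z]\w*)\s+(_+|#+)\s*$")

def parse_form(form_text: str):
    """Return (name, type, length) for each entry blank found in the form."""
    fields = []
    for line in form_text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            label, blank = match.groups()
            kind = "text" if blank[0] == "_" else "numeric"
            fields.append((label.upper(), kind, len(blank)))
    return fields

fields = parse_form(FORM)
for name, kind, length in fields:
    print(name, kind, length)
```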

In designing a form, it is useful to include a unique case identifier as a number or
combination of letters and digits. This may include meaningful information, such as
the year, but should not include any item that may need to be changed, such as a
disease code. It must be designed so that a new and unique number will always be
available for each record.

The amount of data entry and computer storage required may be minimized by
computerizing only information that will actually be used. If follow-up information
such as name, address, and telephone number can be used from the paper form, there may
be no need to enter it into the computer. If contact tracing is recorded, the
computer record may summarize the number of contacts named and the number found or
treated, with the details on each and progress of the follow-up efforts relegated to
the paper forms used by field investigators. When including an item on the input
form, it is helpful to ask, "how will this be analyzed?" and "how would the result
look after processing?" Computers around the world are full of data items that
someone entered "just in case we need it." Most are never needed.

Textual material can be printed from a computer file, but it is usually difficult or
impossible to process such entries as "Pen, Strep, and Ampicillin," to produce
meaningful tabulations. For serious analysis a more usable format would be

Penicillin <Y>
Streptomycin <Y>
Ampicillin <Y>

in which "<Y>" represents a blank for a "Y" or "N" response.

A common problem in designing entry forms is that several data items may be similar.
Suppose you want to record name and treatment (RX) status for up to 12 contacts of
each case-patient. One possible approach is to create fields called NAME1 through
NAME12 and RX1 through RX12. This approach allows the data to be entered, although it
creates a very large data-entry record (say 12 x 22 characters for NAMEs and 12 x 1
characters for RX = 276 characters, even if no information about contacts is entered).
However, analyzing the information becomes a programming nightmare, as determining the
number of contacts or their treatment status requires examining at least 12 different
fields in each record to see whether they have been filled in and keeping a running
tally of the results. In computer data-base jargon, the record is not "normalized."
These repeating groups of fields should be placed in separate records--one for each
contact--linked to the main file as described below in the section on linking special-
purpose records. Then a case-patient with one contact has one record in the case file
and one record in the contact file rather than the equivalent of these plus 11 empty
records in a single file.

This problem is resolved by rethinking what is really the best unit around which to
build an individual record. The simple answer is that if you intend to tabulate
cases, build a case record; if you will tabulate contacts or follow-up visits, then
you need a contact or follow-up record. If both are necessary and the system is large
or permanent, records should be placed in separate files and linked using relational
data-base features as described below.
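
The advantage of separate, linked case and contact files can be seen in a
short Python sketch. It is an illustration only: the linking field CASEID,
the identifiers, and the data are hypothetical, and in-memory lists stand in
for the two files. Counting contacts and treated contacts per case becomes a
simple loop rather than an examination of 12 pairs of fields in every record.

```python
from collections import defaultdict

# Invented data. CASEID is the hypothetical field shared by both files.
cases = [
    {"CASEID": "92-0001", "DISEASE": "X"},
    {"CASEID": "92-0002", "DISEASE": "X"},
]
contacts = [
    {"CASEID": "92-0001", "NAME": "Contact A", "RX": "Y"},
    {"CASEID": "92-0001", "NAME": "Contact B", "RX": "N"},
    {"CASEID": "92-0002", "NAME": "Contact C", "RX": "Y"},
]

# Link the contact file to the case file through the shared identifier.
contacts_by_case = defaultdict(list)
for contact in contacts:
    contacts_by_case[contact["CASEID"]].append(contact)

# For each case: (number of contacts named, number treated).
summary = {}
for case in cases:
    linked = contacts_by_case.get(case["CASEID"], [])
    treated = sum(1 for c in linked if c["RX"] == "Y")
    summary[case["CASEID"]] = (len(linked), treated)

print(summary)
```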

Data Entry

The details of data entry should be determined and documented, including who will
prepare the paper records (if needed) for entry, who will enter them, and at what
intervals. The status of the report as "suspected" or "confirmed" may determine
whether it is entered, and this must be determined at the outset. Most disease
reports are entered in batches--once a week, for example--and in many states not more
than an hour or two is needed to enter the data for a week, although the number of
records varies sixfold among states, and the time required to enter data varies
correspondingly.

Records linked to more extensive specialized forms can be sent as partial submissions
and revised later to avoid delays in reporting caused by the slower progress of data
collection for the more detailed forms. This issue needs to be considered and
resolved in advance.

Cleaning and Editing the Data

Errors or duplications inevitably occur during data entry, and additional information
may arrive that requires changes or additions. The data can be "cleaned" during data
entry or with the help of analytic programs that display "outliers," and data can be
checked visually by browsing through records in the ENTER program or by scanning a
list printed by the ENTER or ANALYSIS programs. Records can be viewed and corrected
in a spreadsheet format in ANALYSIS. Finally, a program called VALIDATE can be used
to compare files entered in duplicate by different operators. Records showing
different entries are printed out for reconciliation.
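
The principle of duplicate entry can be sketched in Python. The fragment
below is not the VALIDATE program and makes no claim about how that program
works; it assumes that records from the two operators are matched on a
hypothetical CASEID field, and it checks only the records present in the
first operator's file.

```python
def compare_entries(first, second, key="CASEID"):
    """List (key value, field, first value, second value) for each disagreement."""
    second_by_key = {rec[key]: rec for rec in second}
    differences = []
    for rec in first:
        other = second_by_key.get(rec[key])
        if other is None:
            # The second operator never entered this record.
            differences.append((rec[key], "<record>", "present", "missing"))
            continue
        for field, value in rec.items():
            if other.get(field) != value:
                differences.append((rec[key], field, value, other.get(field)))
    return differences

# Invented data: the second operator transposed the digits of one age.
operator_1 = [{"CASEID": "92-0001", "AGE": 34, "COUNTY": "Adams"},
              {"CASEID": "92-0002", "AGE": 5, "COUNTY": "Baker"}]
operator_2 = [{"CASEID": "92-0001", "AGE": 43, "COUNTY": "Adams"},
              {"CASEID": "92-0002", "AGE": 5, "COUNTY": "Baker"}]

differences = compare_entries(operator_1, operator_2)
for item in differences:
    print(item)
```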

Epi Info allows extensive programming of error checks on data entry. Each field can
be set to accept only specified codes, and, if necessary, multiple fields can be
checked for inconsistencies such as gynecologic conditions recorded for males.
Unfortunately, many errors cannot be caught by such systems, and one can still enter
the wrong code for a less gender-specific disease.
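
Both kinds of check, legal codes for single fields and consistency between
fields, can be sketched in Python. This is an illustration and not Epi Info's
check-file language; the code lists, the field names, and the single
cross-field rule are hypothetical.

```python
# Hypothetical code lists and rule; a real system defines its own.
LEGAL_CODES = {"SEX": {"M", "F", "U"}, "STATUS": {"SUSPECTED", "CONFIRMED"}}
FEMALE_ONLY_DISEASES = {"PELVIC INFLAMMATORY DISEASE"}

def check_record(record: dict) -> list:
    """Return error messages; an empty list means no error was detected."""
    errors = []
    # Single-field checks: each coded field must hold a legal code.
    for field, legal in LEGAL_CODES.items():
        if record.get(field) not in legal:
            errors.append(f"{field}: illegal code {record.get(field)!r}")
    # Cross-field check: disease must be consistent with sex.
    if record.get("SEX") == "M" and record.get("DISEASE") in FEMALE_ONLY_DISEASES:
        errors.append("DISEASE inconsistent with SEX")
    return errors

good = check_record({"SEX": "F", "STATUS": "CONFIRMED", "DISEASE": "MEASLES"})
bad = check_record({"SEX": "M", "STATUS": "MAYBE",
                    "DISEASE": "PELVIC INFLAMMATORY DISEASE"})
print(good)
print(bad)
```

A record that passes is not necessarily correct; a legal but wrong code is
accepted without complaint.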

Regardless of the method used, errors should be caught and corrected near the time of
data entry if possible, since they can create much larger problems if left for the end
of the year. The choice depends largely on orientation and number of personnel
available and perhaps on their preferences after trying different methods.

Analysis of Data

The type of output desired should be planned in advance, since the inputs and outputs
usually specify fairly precisely what kind of processing is needed to achieve the
result. Dummy tables and graphs should be sketched on paper. Epi Info and many other
data-base programs can be programmed to print a table or mixture of text and tables in
almost any format, using a feature called the "report generator."

It is not necessary to design reports to cover all possible needs, since ad hoc
queries are an important part of any system, and additional reports can be added later
if they are deemed useful. In Epi Info, an epidemiologist can learn to do simple
queries (READ GEPI; TABLES RACE COUNTY) in a short time and to limit these to
particular time periods (SELECT REPORTWK = 34) almost as easily.

Sometimes a simple report such as a listing of this week's reports, sorted by disease,
may be as useful as a number of tables with very small numbers in each cell. The
number of records available should be considered in designing reports and in
determining how often they will be produced.

Distributed Data Base

So far, we have described a surveillance system housed in a single microcomputer. As
more community health departments obtain computers, however, the trend is toward
networks of computers within a state, connected by modem in ways analogous to those
used in the National Electronic Telecommunications Surveillance System (NETSS), with
its 50+ state and territorial participants. Each participating site enters data and
sends them periodically to a computer at the next level up.

This process would be simple to do if all data were entered at the local level and
sent to the state level, and if no changes were made later. However, in practice, not
only are changes made, but in some states records are entered at both state and local
levels, and some method must be in place to see that both levels of staff eventually
have the same records.

Ideally, only one copy of the records would be considered the "master" copy, and each
user would know its location and provide updates only at the designated time. The
best way to accomplish this objective is still being worked out, and experiments of
several types are likely. Designating only one of the sources as the "owner" and
rightful editor of the data is one possibility. At present, we favor indicating on
each record the site at which it was created and allowing only that site to make
changes that are transmitted weekly to the other sites to update their copies of the
records.
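The ownership rule favored here can be sketched briefly in Python. It is an illustration only, not the NETSS implementation; the field name SITE, the record layout, and the function name apply_change are assumptions introduced for the example.

```python
def apply_change(database, change, sending_site):
    """Apply a field change only if it comes from the site that created the record.

    Returns True if the change was accepted, False if it was refused.
    """
    record = database[change["ID"]]
    if record["SITE"] != sending_site:
        return False  # only the originating site may edit its own records
    record.update(change["fields"])
    return True


# One record, created at a hypothetical local site called COUNTY_A.
database = {"0001": {"SITE": "COUNTY_A", "AGE": 34}}

refused = apply_change(database, {"ID": "0001", "fields": {"AGE": 43}}, "STATE")
accepted = apply_change(database, {"ID": "0001", "fields": {"AGE": 43}}, "COUNTY_A")
```

Under this rule, staff at another site who notice an error must ask the originating site to send the correction, which keeps a single line of authority for each record.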

State health departments use the latest software to transmit year-to-date summary
information on the state data base to the national level each week. These data are
compared automatically with the contents of the national data base, and any
discrepancies are reported.
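The comparison step might look something like the following Python sketch. The disease names, the counts, and the function name find_discrepancies are assumptions for illustration; the actual NETSS comparison is not described in detail here.

```python
def find_discrepancies(state_totals, national_totals):
    """Return {disease: (state_count, national_count)} wherever the totals differ."""
    diseases = sorted(set(state_totals) | set(national_totals))
    return {
        d: (state_totals.get(d, 0), national_totals.get(d, 0))
        for d in diseases
        if state_totals.get(d, 0) != national_totals.get(d, 0)
    }


# Invented year-to-date totals for one state.
state = {"MEASLES": 12, "MUMPS": 3}
national = {"MEASLES": 11, "MUMPS": 3, "RUBELLA": 1}
report = find_discrepancies(state, national)
```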

Transmitting Data

In NETSS, most states transmit reports each week through a commercial
telecommunications network. The 50+ reports stay in the network computer until they
are picked up on Tuesday morning by CDC staff, stripped of comments and address
material, and joined together in a single file for processing on the CDC mainframe.
Error checking is done to test for invalid codes and other problems, and error notices
are sent back to the states.

Another method that eliminates errors caused by telephone noise involves transmission
directly from computer to computer by means of modems and software that retransmits if
errors are caused by noise. Several states are using this method to connect with CDC
microcomputers that, in turn, send the files to the CDC mainframe.
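The principle behind such error-correcting transfer can be sketched in Python: each block is sent with a checksum, and the block is sent again if the checksum fails on arrival. This is a simplified illustration, not the protocol used by any particular modem software; the framing, the simulated noisy channel, and all function names are assumptions.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unframe(data: bytes):
    """Return the payload if its checksum matches, otherwise None."""
    payload, checksum = data[:-4], data[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        return None
    return payload


def send_with_retry(payload, channel, max_attempts=5):
    """Resend the framed payload until it arrives intact; return (payload, attempts)."""
    for attempt in range(1, max_attempts + 1):
        received = unframe(channel(frame(payload)))
        if received is not None:
            return received, attempt
    raise IOError("too many corrupted transmissions")


def make_noisy_channel(corrupt_first_n):
    """Simulated telephone line that damages the first n transmissions."""
    state = {"calls": 0}

    def channel(data):
        state["calls"] += 1
        if state["calls"] <= corrupt_first_n:
            return bytes([data[0] ^ 0xFF]) + data[1:]  # flip bits in the first byte
        return data

    return channel


received, attempts = send_with_retry(b"34,GA,MEASLES,2", make_noisy_channel(2))
```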

A third, less elegant but often practical solution is physical transfer of floppy
diskettes by mail or messenger at intervals. This allows large files to be
transferred with minimal inconvenience, and may be appropriate if the additional
trouble of setting up modems and software is not yet warranted or in developing
countries where telephones are unreliable or unavailable.

In any case, the result is that a copy of a file of records from the peripheral site
arrives at the central site. The records must then be merged into the main data base.
If all are new records, this task is straightforward. If the incoming records contain
updates for records previously transmitted, the process is more complex.

Correcting and Updating Records from Another Site

In NETSS, only state participants are allowed to update records; CDC staff do not do
so, although they may enter temporary telephone reports. Updates are sent as records
with the same identification number as that for the original record. If a new record
has the same identification number as a record in the data base, the existing record
is updated so that all non-blank fields of the new record prevail. To change an age,
for example, a state would send a record containing the case identification number and
the new age. To delete a record, the state, year, and identification numbers are sent
in a special 'Delete' record. When errors are found at CDC, the information is
transmitted to the state staff, who then correct the errors and transmit update
records the following week.
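The update rule just described (non-blank fields of the incoming record prevail, and a special record deletes) can be sketched in Python. The sketch makes several assumptions that are not in the text: records are keyed on state, year, and identification number together, a field called RECTYPE marks the 'Delete' record, and blank fields are empty strings.

```python
def merge_incoming(database, incoming):
    """Merge a transmitted file into the main data base, in place."""
    for rec in incoming:
        key = (rec["STATE"], rec["YEAR"], rec["ID"])
        if rec.get("RECTYPE") == "DELETE":
            database.pop(key, None)  # special 'Delete' record
        elif key in database:
            # Update: only non-blank fields of the new record prevail.
            database[key].update(
                {field: value for field, value in rec.items() if value not in ("", None)}
            )
        else:
            database[key] = dict(rec)  # new record
    return database


db = {
    ("GA", 1992, "000123"): {
        "STATE": "GA", "YEAR": 1992, "ID": "000123",
        "AGE": 34, "SEX": "F", "EVENT": "MEASLES",
    }
}
merge_incoming(db, [
    # Correction of an age: only the identifiers and the new age are filled in.
    {"STATE": "GA", "YEAR": 1992, "ID": "000123", "AGE": 43, "SEX": "", "EVENT": ""},
    # A new record.
    {"STATE": "GA", "YEAR": 1992, "ID": "000124", "AGE": 5, "SEX": "M", "EVENT": "MUMPS"},
])
after_update = dict(db[("GA", 1992, "000123")])
merge_incoming(db, [{"STATE": "GA", "YEAR": 1992, "ID": "000124", "RECTYPE": "DELETE"}])
```

One consequence of this design is that a field cannot be blanked out by an update, since a blank in the incoming record is read as "no change."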

Individual and Summary Records

Many systems function with a record for each individual case report. In some,
however, there is a need for summary records, each of which represents a number of
case reports. This is helpful if large numbers of similar records (e.g., cases of
gonorrhea in a big city) are processed, or if only summary numbers are available. It
also allows records from entire years to be summarized in condensed format, so that a
5-year trend can be calculated without reading and processing each record for the
previous 5 years.

A summary record is similar to a case record, but it contains an additional field
called 'COUNT,' which contains a number. The number indicates how many records with
the same information are represented by the summary record. Epi Info contains
commands called SUMTABLES and SUMFREQ to process summary records. These commands sum
the contents of the count field rather than counting individual records. Since a
record with COUNT equal to 1 is an individual case record, files that are mixtures of
summary and individual records can be processed as a single unit.
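The idea behind summing the count field can be shown in a short Python sketch. It is not the Epi Info SUMFREQ command itself; the function below merely borrows the name, the sample records are invented, and treating a record with no COUNT field as a single case is an added assumption.

```python
from collections import Counter


def sumfreq(rows, field):
    """Frequency of `field`, summing COUNT rather than counting records."""
    totals = Counter()
    for row in rows:
        totals[row[field]] += row.get("COUNT", 1)  # assumption: missing COUNT means 1
    return totals


rows = [
    {"EVENT": "GONORRHEA", "COUNTY": "FULTON", "COUNT": 250},  # summary record
    {"EVENT": "GONORRHEA", "COUNTY": "FULTON", "COUNT": 1},    # individual case
    {"EVENT": "MEASLES", "COUNTY": "FULTON", "COUNT": 1},      # individual case
]
by_event = sumfreq(rows, "EVENT")
```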

Linking Special-Purpose Records to the Main Data Base

As mentioned above, sometimes it is necessary to link related records in different
files together in order to allow easy processing of, for example, case-patients and
contacts who are related to case-patients. This requires that a common case
identification number be included in each record. Epi Info and other data-base
programs, such as dBASE, allow automatic linking of records through such a common
identifier. On data entry, answering "Y" to the question "Contacts (Y/N)?" might
cause another form, representing the contact file, to appear on the screen. The
operator can then enter one or many contact forms for this case, pressing a function
key (F10) to return to the main form. A separate record is created for each contact.

In Epi Info's ANALYSIS program, the CONTACT file is READ, and the CASE file is linked
("related") to it. Each contact record then contains information about the case-
patient as well as about the contact, and questions such as "how many contacts of
female case-patients were treated?" can be answered easily. The CASE file can also be
processed alone to answer questions such as "how many cases of syphilis were there?"
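The linking of contact records to case records through a common identifier can be sketched in Python as follows. The field names (CASEID, TREATED, and so on), the sample data, and the function name relate are assumptions for the example and do not reproduce Epi Info's own syntax.

```python
# CASE file, keyed on the common case identification number.
cases = {
    101: {"CASEID": 101, "SEX": "F", "DISEASE": "SYPHILIS"},
    102: {"CASEID": 102, "SEX": "M", "DISEASE": "SYPHILIS"},
}
# CONTACT file: one record per contact, each carrying the case identifier.
contacts = [
    {"CASEID": 101, "CONTACT": "A", "TREATED": "Y"},
    {"CASEID": 101, "CONTACT": "B", "TREATED": "N"},
    {"CASEID": 102, "CONTACT": "C", "TREATED": "Y"},
]


def relate(contact_rows, case_table):
    """Attach the matching case-patient's fields to every contact record."""
    return [{**case_table[c["CASEID"]], **c} for c in contact_rows]


linked = relate(contacts, cases)

# "How many contacts of female case-patients were treated?"
treated_contacts_of_females = sum(
    1 for r in linked if r["SEX"] == "F" and r["TREATED"] == "Y"
)
# The CASE file alone answers "how many cases of syphilis were there?"
syphilis_cases = sum(1 for c in cases.values() if c["DISEASE"] == "SYPHILIS")
```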

We also link disease-specific forms to the main data base of reports. Hepatitis, for
example, requires a full page of extra information used to define further the
epidemiology of a report. By linking a hepatitis file to the main case file, records
are created only if the disease is hepatitis, thus saving a great deal of storage
space over the single-file method, in which all the questions on hepatitis would be
left blank in a nonhepatitis record. Current systems, including the one distributed
as an example on the Epi Info disks, contain related files for hepatitis, meningitis,
and enteric disease, each of which only appears if a relevant disease code is entered.

Dissemination of Data

Dissemination of results is an important element of the surveillance cycle.
Computerization can assist by making new methods of analysis or presentation
practical. Use of tabular or graphics software in conjunction with desk-top
publishing technology can make the preparation of results not only faster but more
accurate and meaningful. A graphic method for comparison of current results with
those for the past 5 years has been introduced to the Morbidity and Mortality Weekly
Report in the United States (Figure V.12) (14). This method would have been too
cumbersome for manual processing.

Computer software greatly simplifies and improves the production of maps and graphs.
Epi Map, a public domain companion to Epi Info to be released in 1993, will make
mapping available to anyone with an IBM-compatible microcomputer.

Tables, maps, graphs, text, and data files may be made available either on-line via
modem connections or by distributing floppy or CD-ROM disks. The latter are
particularly useful in remote areas or for larger volumes of data than can be easily
sent over low-speed modems.

Data Disasters

Destruction of or damage to data on hard disks should be expected and planned for.
During the first 4 years of NETSS (and during the 3-year tenure of its predecessor,
the Epidemiologic Surveillance Project), a number of hard disks have "crashed." In
most cases, back-up files on floppy diskettes had been properly prepared and stored,
and they were used to restore the data once the disk had been replaced.

Recently, some state programs began to reuse case-identification numbers from several
years ago, not realizing that the new records would overwrite the old records in the
national data base. It is important to be clear about the time period for which
updates will be accepted.

Upgrading either hardware or software is a frequent cause of problems, when the new
items have unexpected features, occupy more memory space, or require that protocols
for functions, such as communications, be changed.

Computer viruses are an increasing cause of problems. They can cause a variety of
difficulties ranging from erratic behavior of software to complete loss of files.
They may be introduced from networks, by accessing other computer bulletin boards, or
by loading copied software from unknown sources.

Programs to detect and eradicate computer viruses are available commercially. It is
essential to install one of these and to be sure that any disk from an external source
is scanned for viruses before it is copied or used as a source of new programs.

Backup Methods

Methods for disaster prevention center around regular backup of data files onto floppy
diskettes (or tape if available, but beware of tape backups with only one compatible
tape drive in the same institution). The backup copies should be rotated so that
several circulate in turn and so that the one overwritten has at least two more recent
relatives. To protect against fire, water damage, and damage by panic-stricken
personnel, it is wise to keep at least one backup in a site remote from the computer.
Setting the write-protection feature on the diskettes after making the backup is an
additional protection.
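The rotation rule can be made concrete with a small Python sketch. It assumes that each backup set carries a label and the date it was last written (in year-month-day form, so that dates sort correctly as text); the labels, dates, and function name are invented for the example.

```python
def slot_to_overwrite(last_written):
    """Pick the backup set to reuse: the one written longest ago.

    With three or more sets in rotation, the set that is overwritten always
    has at least two more recent relatives.
    """
    if len(last_written) < 3:
        raise ValueError("need at least three backup sets in rotation")
    return min(last_written, key=lambda label: last_written[label])


sets = {"A": "1992-08-03", "B": "1992-08-10", "C": "1992-08-17"}
target = slot_to_overwrite(sets)   # oldest set is reused first
sets[target] = "1992-08-24"        # record this week's backup
next_target = slot_to_overwrite(sets)
```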

Upgrading hardware or software should be done at a time when use of the system is
least critical, and care should be taken to allow for replacing the old system exactly
as it was if problems occur with the new one. Thus, before installing a new version
of software, the old one should be thoroughly backed up or preferably left in place in
another directory so that it can be used if necessary.

Training of Staff and Transition Techniques

We have found that the most effective staff training occurs by having potential
operators participate in the design of the system and receive short demonstrations and
hands-on lessons at the time the system is installed. Usually installation of a
system takes two or three days for planning and decision making, two or three days for
programming, and a similar period for staff training, trial runs, and revisions.

National meetings and training sessions for operators of state surveillance systems
have been helpful in providing extra training and motivation and in surfacing problems
that need to be addressed and new ideas for software improvements.

During the transition from a paper to a computerized system, both systems are run in
parallel for a period until the results are satisfactory and staff feel comfortable
with the new system.

DISCUSSION

The old image of the computer expert in an expensive suit handing the client the keys
to the new "turn-key" system perfectly adapted to his or her needs was probably always
a fantasy, but with modest budgets, small data bases, and a desire for "hands-on"
access to data, it certainly has little relevance to public health needs. Although in
some ways centralized computers and instant interactivity for updating records would
present fewer problems than the distributed systems we have described, public health
workers usually do not require and cannot financially afford the instant updates
needed for law enforcement, banking, or airline reservations. Microcomputers and
local data bases can maintain the data and analytic results closer to the
professionals primarily responsible for prevention and control.

We are convinced that participation of all 50 state health departments in the national
computerized system would have been impossible without a) software for states that
allowed customization for use of local forms and procedures, b) participation of each
state epidemiologist's staff in designing a system unique to the state, and c) a
standardized record format. Each state has a different input form, although the
records sent to CDC are restructured and variable values are recoded by Epi Info
programs so that they are in the uniform national format.
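The restructuring and recoding step can be illustrated with a Python sketch. The state field names, the national field names, and the code values below are all hypothetical; in practice this work is done by Epi Info programs tailored to each state's form.

```python
# Hypothetical mapping from one state's field names to the national names.
FIELD_MAP = {"PT_SEX": "SEX", "DX": "EVENT", "WK": "REPORTWK"}
# Hypothetical recoding of values, by national field name.
VALUE_MAP = {"SEX": {"MALE": "M", "FEMALE": "F"}}


def to_national(state_record):
    """Rename fields and recode values; fields with no national equivalent are dropped."""
    national = {}
    for state_field, value in state_record.items():
        field = FIELD_MAP.get(state_field)
        if field is None:
            continue  # local-only field, not transmitted
        national[field] = VALUE_MAP.get(field, {}).get(value, value)
    return national


converted = to_national(
    {"PT_SEX": "FEMALE", "DX": "MEASLES", "WK": 34, "CLINIC": "EAST"}
)
```

Keeping the mappings in tables rather than in program logic is one way to let each state change its local form without touching the conversion routine.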

As systems become more complex, however, it is important to standardize as many
features as possible from state to state so that a thoroughly debugged core system can
be used by all. We are gradually achieving this with a new Epi Info-based system that
has a series of standard modules, accompanied by other modules that are highly
customizable.

As pointed out in this chapter, there is an enormous gap between what is
technologically possible with the use of computers in public health and what is
actually going on at the grass-roots level of public health practice. Until the
keeping of medical records in clinical practice is computerized to a much greater
extent, it would be difficult to imagine that our scenario of the future will actually
move closer to reality.

Other key issues remaining to be resolved include a) the balance between
confidentiality and free access to clinical records for public health purposes, b) the
cost of data access and of programming and processing, and c) the ability of both
professionals and the public to deal with "dirty" and preliminary data.

Many of these issues have both technical and social solutions. A great deal of work
in both realms remains to be done before computerized public health surveillance can
be said to have achieved its full potential.

REFERENCES

1. Dean AD, Dean JA, Burton AH, Dicker RC. Epi Info, version 5: a word
processing, database, and statistics program for epidemiology on microcomputers.
Atlanta, GA: Centers for Disease Control, 1990.

2. Dean AD, Dean JA, Burton AH, Dicker RC. Epi Info: a general-purpose
microcomputer program for public health information systems. Am J Prev Med
1991;7:178-82.

3. Graitcer PL, Burton AH. The epidemiologic surveillance project: a computer-
based system for disease surveillance. Am J Prev Med 1987;3:123-7.

4. Centers for Disease Control. National Electronic Telecommunications System for
Surveillance--United States, 1990-1991. MMWR 1991;40(29):502-3.

5. Odell-Butler ME, Ellis B, Hersey JC. Final report for task 8, an evaluation of
the National Electronic Telecommunications System for Surveillance (NETSS).
Arlington, Va.: Battelle, June 1991:49-50.

6. The big pay-off (benefits of computerizing a business) (node supplement). IBM
System User March 1990:S20.

7. Mary M, Garnerin P, Roure C, et al. Six years of public health surveillance of
measles in France. Int J Epidemiol 1992;21:163-8.

8. Centers for Disease Control. Surveillance of influenza-like diseases through a
national computer network--France, 1984-1989. MMWR 1989;38(49):855-7.

9. Watkins M, Lapham S, Hoy W. Use of a medical center's computerized health care
database for notifiable disease surveillance. Am J Public Health
1991;81(5):637-9.

10. Bernard KW, Graitcer PL, van der Vlugt T, Moran JS, Pulley KM. Epidemiological
surveillance in Peace Corps volunteers: a model for monitoring health in
temporary residents of developing countries. Int J Epidemiol 1989;18(1):220-6.

11. Gaynes R, Friedman C, Copeland TA, Thiele GH. Methodology to evaluate a
computer-based system for surveillance of hospital-acquired infections. Am J
Infect Control 1990;18:40-6.

12. Shultz JM, Novotny TE, Rice DP. Quantifying the disease impact of cigarette
smoking with SAMMEC II software. Public Health Rep 1991;106(3):326-33.

13. Call B. The ones that got away: why some industries have not yet computerized.
PC Week June 24, 1986;3(25).

14. Centers for Disease Control. Proposed changes in format for presentation of
notifiable disease report data. MMWR 1988;38(47):805-9.

Chapter XII

State and Local Issues in Surveillance

Melinda Wharton
Richard L. Vogt

"The government is very keen on amassing statistics. They collect them, add them,
raise them to the nth power, take the cube root and prepare wonderful diagrams. But
you must never forget that every one of these figures comes in the first instance from
the village watchman, who just puts down what he damn well pleases."

Josiah Stamp

INTRODUCTION

In a recent report, the Institute of Medicine defined assessment as a core function of
public health agencies at the state and local level. "An understanding of the
determinants of health and the nature and extent of community need is a fundamental
prerequisite to sound decision-making about health. Accurate information serves the
interests both of justice and the efficient use of available resources. Assessment is
therefore a core governmental obligation in public health." State responsibilities
include "assessment of health needs within the state based on statewide data
collection" as well as "establishment of statewide health objectives, delegating power to
localities and holding them accountable." Responsibilities of local public health
units include "assessment, monitoring, and surveillance of local health problems and
needs and resources for dealing with them" (2).

AUTHORITY FOR REPORTING SURVEILLANCE DATA

Although much of this book focuses on surveillance at the national level, the legal
and regulatory authority for public health surveillance activities in the United
States derives from state and local law (see Chapter X). Both the vital records and
morbidity reporting systems were developed initially at the state level, and only
later were national systems developed, with the participation of all states being
voluntary. Indeed, in the United States, state and local governments have both the
authority and the responsibility for almost all public health actions. This
decentralization of power is outlined in the Constitution of the United States. Therefore,
although most of the issues discussed in this chapter are relevant to other countries,
some are unique to the practice of surveillance in the United States.

Although the objectives of surveillance at the state and local level do not differ
substantially from those at the national level, the link to action--whether it be
outbreak control, vector-control activities, legislation requiring use of child-
restraint devices, or community mobilization--is most explicit at the state and local
level. The objectives of state as well as national surveillance must be considered as
systems are developed or redesigned, to assure that the information needed for public
health action is obtained in the most efficient and cost-effective manner. The focus
of the objectives may vary somewhat by condition (see Chapters I and II).

SOURCES OF SURVEILLANCE DATA

Only two data sources--vital records and notifiable-disease reports--are available at
the local level in all states in the United States. Although other data sources
discussed in Chapter III may be available at the state and local levels in some areas,
alternate data sources may be needed in some states or localities to assess the impact
of specific public health problems. Innovative solutions to particular data-related
problems have been developed in many communities; some issues related to data sources
at the state and local level are summarized below. For more information regarding
other data sources, see Chapter III.

Notifiable Diseases

All 50 states require that physicians report cases of specified notifiable diseases to
the appropriate state or local health department. The legal authority for the
collection of this information rests with state statutes that are promulgated in state
regulation; the diseases that are reportable vary by state (2,3). The notifiable-
diseases reporting system was initially developed for reporting epidemic diseases such
as smallpox and yellow fever, and this mechanism is still most commonly used for
surveillance of infectious diseases. For noninfectious conditions, reporting by
physicians is less uniformly required. In many states, however, reporting of specific
occupational or chronic diseases is required by statute.

Sentinel Systems

State and local health departments may supplement information available through the
notifiable-disease reporting system by creating sentinel reporting systems.
State-based sentinel systems in Maine and Rhode Island relied on reporting by physicians,
who were recruited by the state health department and were paid small amounts of money
for participation. Both systems were subsequently discontinued because of budgetary
cutbacks (4,5).

More recently, a sentinel active surveillance system developed in Missouri has been
organized to ensure representation of the six public health districts in the state.
Over 500 sites were recruited for participation, including schools, hospitals,
day-care centers, preschools, and nursing homes; fewer than 30% of the participating
individuals or institutions were physicians or clinics. Each participating site is
telephoned weekly by local health departments to solicit reports (6). A similar
system, including universities, has been operated by the Los Angeles County Department
of Health Services since 1981. In addition to providing timely information about
reportable diseases, the system also has provided data on a variety of nonreportable
conditions (7).

Such sentinel systems may be particularly useful for following trends in common
conditions--e.g., varicella or influenza--when precise counts of cases are not needed and
when a public health response is not necessary for individual case reports. However,
if the reporting units selected for the sentinel system are unrepresentative of the
overall reporting population, findings may not be generalizable to the wider
population. Sentinel surveillance systems may be used to facilitate collection of
additional risk-factor and other information on a subset of case reports, thus limiting the
overall burden of data collection (8).

Hospital-Based Surveillance

Hospital-based surveillance systems, drawing on emergency room visits or
hospital-discharge data, have most commonly been developed at the state and local level for
surveillance of injuries (9-15). Other uses have included assessment of unmet health
needs by identification of preventable disease (sentinel health events) (16). Aside
from nosocomial infections, such systems are likely to have limited usefulness for
surveillance for communicable disease (17).

In areas in which hospital-discharge diagnoses are coded using external cause of
injury and poisoning codes (E-codes), hospital-discharge data are useful for
surveillance of injuries. Currently 28 states have uniform hospital-discharge reporting
systems, and addition of E-coding is a high priority for state and local
injury-surveillance programs (18). The recent experience of New York State demonstrated the
feasibility of such an addition, particularly when care was taken to develop a
constituency to support the proposed change. Review of clinical records demonstrated
that 93% of charts contained information necessary to allow proper coding. Since
E-coding has begun, 95% of records of injured persons contain a valid E-code (19).

Other hospital-based data sources may be useful for surveillance at the state and
local level. For example, trauma registries are a potential source of data for injury
surveillance (20), despite the lack of representativeness of patients referred to
trauma centers for care (21).

School-Based Surveillance

School-based surveillance systems have been developed in some states to monitor
disease trends among children of school age. This approach has been used for
surveillance of influenza and varicella (22,23). Absenteeism is an excellent marker for
influenza and is almost always available for administrative reasons. In Michigan,
schools provide reports of cases of notifiable diseases among their students--along
with counts of number of cases of influenza-like illness and varicella--to local
health departments on a weekly basis. In many states, notifiable-disease regulations
mandate reporting of specified diseases by school authorities.

Surveys at the State and Local Level

Information on certain issues, such as seat-belt use or nonutilization of health-care
services, cannot be obtained readily without the use of surveys. Although national
surveys may provide national estimates, data at the state or even local level are
needed for health planning or to support legislative initiatives. Since 1981, state
health departments have collaborated with the Centers for Disease Control (CDC) to
conduct telephone surveys of adults to obtain information on health practices and
behavior. In 1990, 45 states and the District of Columbia participated in the
Behavioral Risk Factor Surveillance System (BRFSS). The BRFSS allows estimation of
age- and gender-specific prevalence of various risk factors by state (24,25).
Likewise, behavioral risk factors among young people are periodically measured through
state and local school-based surveys in the Youth Risk Behavior Surveillance System
(26). County or community surveys may be particularly useful in areas with small
populations, in instances in which morbidity or mortality data may be of limited
usefulness to monitor the impact of interventions (27).

National Mortality Registration System

State law requires filing a death certificate for every death that occurs in the
state, and death registration is virtually complete in the United States. At the
state level, mortality data are available before national data are compiled and
released. Although the underlying cause of death is determined using standard
computerized algorithms in all states, not all states use E-coding.

Such data are useful at the local level to identify preventable mortality and to set
health priorities in the community. These efforts may be particularly important in
developing community-based prevention programs for chronic disease (28).

Other Data Sources

Surveillance responsibilities of state and local health departments extend into many
other areas, and in some jurisdictions may include monitoring of environmental
quality, illnesses of domestic and wild animals, and vector populations. Although
outside the scope of this book, these types of surveillance provide important
information at the state and local level. For example, management of persons exposed to
possibly rabid animals is influenced by the epidemiology of rabies in the area of
exposure (29).

Arbovirus surveillance includes monitoring of vectors, vertebrate hosts, human cases,
weather, and other factors in order to detect or predict changes in the transmission
dynamics of arboviral infections. Guidelines for arbovirus surveillance programs in
the United States have recently been developed (30).

Provider-Based Reporting: Special Issues

Mandatory reporting of communicable diseases by physicians has a long history in the
United States, and there is an equally long history of failure on the part of
physicians to comply. During the yellow fever epidemic of 1795, the New York City Health
Committee quarantined patients with yellow fever at Bellevue Hospital. Many
physicians refused to report cases, and the New York Medical Society went on record
opposing the Committee's action, on grounds that the disease was not contagious (31).
Physicians fought early efforts to make tuberculosis reportable, arguing that
compulsory reporting constituted an invasion of the doctor-patient relationship and a
violation of confidentiality (32). By 1913, five states had enacted regulations
requiring reporting of venereal disease. Dr. Herman Biggs, director of the New York
City Board of Health, stated that "the ten year long opposition to the reporting of
tuberculosis will doubtless appear a mild breeze compared with the stormy protest
against the sanitary surveillance of the venereal diseases" (33).

The completeness of reporting of communicable diseases is variable, but for most
diseases in most locations, it is thought to range from low to very low (34,35). Of
course, factors other than the failure of physicians to report cases contribute to the
low level of reporting of incident cases. Persons with asymptomatic infections or
mild disease are unlikely to seek medical care. Of those persons who do seek care,
not all will receive a specific diagnosis. Nationally, only 5% of cases of varicella
are reported in the United States (36), and estimates of completeness of reporting are
similar for shigellosis (37). Studies of outpatient-based or hospital-based reporting
in some areas suggest somewhat higher levels of reporting of diagnosed cases of
notifiable diseases, with substantial variation by disease (38-40). Reporting rates
are higher for inpatients than outpatients (17).

Given the historic reluctance of physicians to participate in reporting disease, it is
fortunate that reports of disease are available to most state health departments from
other sources. Almost all states mandate reporting by clinical laboratories of at
least some notifiable diseases (41). Laboratory reporting is often more readily
available and reliable than reports from physicians. In Vermont, 71% of initial
reports of confirmed cases of notifiable diseases in the period 1986-1987 originated
from clinical laboratories; only 10% originated from physicians' offices (42). In
Oklahoma, approximately 85% of cases of shigellosis are reported, but laboratories
account for almost all of the reports received. Laboratories reported 77% of all
reported cases, compared with only 6% for physicians (43).

Although laboratory-based reporting may be a valuable adjunct to physician-based
reporting, it cannot replace reporting by physicians for all diseases. Some
reportable diseases are clinical syndromes, requiring clinical judgment, and no specific
laboratory diagnostic procedures exist (44). In other situations, laboratory
diagnosis may play an important role, but may not be routinely available in a timely enough
manner to replace reporting by physicians. Finally, physicians may have additional
information that is epidemiologically important but is not known to the laboratory; a
timely report by a physician may allow early institution of control measures, without
waiting for the health department to follow up on laboratory reports.

A number of studies have attempted to identify reasons for physicians' failure to
report notifiable diseases (42,45-47). In recent years, physicians have cited many of
the same objections that have been raised historically, as noted above, although it is
at least reassuring that the noncontagiousness of diseases that are actually
communicable is no longer invoked. Commonly cited reasons, in approximate order of
importance, are summarized in Table XII.1.

In an effort to improve reporting of notifiable diseases by physicians, local and
state health departments have tried a number of different strategies. Although many
of them have not been formally evaluated, enough information is available to reach
some conclusions about possible successful approaches.

Projects aimed at improving reporting by physicians have included many interventions
(e.g., revised reporting procedures, improved dissemination of findings and feedback
to participants, and informational campaigns regarding the importance of reporting and
outlining procedures for reporting). Even relatively intensive efforts may not
produce major increases in reporting, although they may be effective in increasing
awareness of reporting procedures among physicians (7,48).

Efforts to increase reporting through specific projects provide some clues on the most
effective approaches. Active surveillance projects, in which health department
personnel contact physicians' offices on a regular basis, have demonstrated 2- to
5-fold increases in the reporting of specified diseases, as well as increases in
reporting of other conditions not subject to active surveillance (49-51). The
consistency of these findings demonstrates that under some circumstances physicians
are willing to report cases of notifiable disease. In these studies, reporting was a
simple matter, and that may be important; equally important may be the message
conveyed by the substantial investment by the health department in active
surveillance--that disease reporting is an important activity.

The need for surveillance data on notifiable disease and the usefulness of such data
are so obvious to workers in state and local health departments that we often believe
that all physicians would report if they only understood the importance of reporting.
Efforts to educate physicians have included a) lectures to medical students, house
officers, and local medical groups on the importance of reporting; b) health department
newsletters; c) educational mailings; and conjunction with licensure. Although
all of these may be useful, and lectures and newsletters are important forms of
feedback to the medical community, single presentations to clinical groups,
newsletters, and mailings have not been found, in isolation, to increase
reporting. Intensive efforts to market the concept of reporting may be more useful
but will be accompanied by an obvious increase in cost (52).

If sending an occasional speaker to the local medical society and mass mailings are
not effective, what is? The active surveillance projects and other studies of
interventions demonstrate the usefulness of telephone contact (49-51, 53). In fact,
the efforts that work all target individual physicians--rather than groups of
physicians--and make limited use of mailings and more use of personal visits and telephone
contact. Some approaches that appear to be successful include a) providing physicians
with feedback on the health department's disposition of individual cases (54); b)
matching laboratory reports with physicians' reports, and for those cases reported
only by laboratories, notifying physicians that a specific case should have been
reported to the health department; and c) conducting in-person site visits to review
reporting procedures (55). The last intervention may be quite effective in enhancing
laboratory- and hospital-based reporting, especially if accompanied by a review of
medical records. The relevant factor may be less the mode of contact than the need
to remind physicians on a regular basis that there is a health department that wants
the information and that the health department actually does something with the data
that are provided.

Exhortation and pleading for reports are no substitute for a state or local health
department that responds promptly to reported public health problems, provides useful
responses to inquiries from physicians and the public, and gives feedback on its
activities and on the health status of the community to the medical community and the
public. Nonetheless, a few specific steps that state and local health departments can
take to improve reporting of notifiable diseases can be identified (Table XII.2).

Active surveillance works, but it is generally too costly to maintain as a routine
health department activity. Less costly alternatives include sentinel active
surveillance, in which certain physicians and institutions are identified and are targeted
for active surveillance. Although this approach has been successful in some areas, it
is also costly and may detract from collection of surveillance data from non-sentinel
sites. Another approach is what has been called "stimulated passive surveillance," in
which the health department uses any contact with the medical community to solicit
reports and provide feedback on community health status and health department
activities. It may not be feasible to contact every physician, or even a systematic sample
of physicians, every week, but every week physicians are contacted, for a variety of
purposes, and those contacts can be used to exchange information.

Administrative barriers to reporting should be identified and eliminated. Physicians
should be provided readable and up-to-date copies of lists of notifiable diseases,
reporting forms, and telephone and facsimile numbers for local and state health
departments. Reporting procedures should be as simple as possible. Some health
departments have used toll-free numbers for telephone reporting (46,56). Answering
machines can answer telephones at night, but people can answer questions and
provide--and solicit--additional information. Reporting forms should be simple, clear, and
printed in colors that allow photocopying or transmission by facsimile machine.
Self-addressed, postage-paid cards or envelopes may be helpful. Although these tools may
make reporting easier, without the other components of effective surveillance they are
unlikely to have substantial impact on reporting behavior of physicians.

State licensing boards may penalize physicians for failing to report, although such
actions are rarely taken. In California, a physician who failed to report on a
patient with hepatitis A who subsequently transmitted infection to others had his
license suspended for a year and was placed on probation for 5 years (57). The
medicolegal implications of failure to report are well established in law, where the
physician's obligation has been found to extend beyond the patient under his/her care
(58). Although no single approach--be it improved communications, improved
procedures, education, or fear--is necessarily successful in improving reporting by
physicians, effective presentations have been developed using case studies that
include the medicolegal implications of failure to report (Hendricks K, personal
communication).

MAINTENANCE OF A LIST OF NOTIFIABLE DISEASES

Although the mechanisms vary, it is important that lists of notifiable diseases
undergo periodic revision. Public health priorities, epidemiology of specific
conditions, and available public health interventions all change over time, with the
result that last year's list of notifiable diseases no longer meets this year's needs.
Additions and deletions must be made on an as-needed basis in order to maintain the
usefulness of a notifiable-disease system. In particular, care must be exercised to
assure that data on all notifiable conditions are actually needed and are used for
public health purposes. "Diseases are often made reportable but the information
gathered is put to no practical use, and with no feed-back to those who provided the
data. This leads to deterioration in the general level of reporting, even for
diseases of much importance. Better case reporting results when official reporting is
restricted to those diseases for which control services are provided or potential
control procedures are under evaluation, or epidemiologic information is needed for a
definite purpose" (59).

In Canada, specific criteria have been developed for determining which diseases or
conditions should be reported at the national level (Table XII.3) (60). In practice,
these criteria have not resulted in the removal of any diseases from the list of
nationally notifiable diseases, but they have at least provided a systematic basis for
deciding among diseases proposed for addition.

ANALYSIS OF DATA

Most of the analytic issues relevant at the state and local level have been addressed
elsewhere in this book (chapters V and VI), but some problems encountered in analyses
at the state and local level are rarely faced at the national level.

Comparison of rates in different geographic areas poses particular and difficult
problems when the number of events is small and/or the population of the areas is
small. When analyzing data drawn from a small population, particularly for an
uncommon event or from a subset of the population (e.g., when calculating age- or
race-specific rates), calculated rates may be difficult to interpret. Unfortunately,
it is difficult to say with certainty what population size, or number of events, is
"too small" for meaningful analysis.

Issues involved in assessing the stability of rates and changes in rates when numbers
are small have been well summarized for the nonstatistician (61). For example,
confidence intervals for rates can be calculated as shown in Table XII.4. In general,
rates calculated based on <20 events will have a 95% confidence interval approximately
as wide as the rate itself.
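
The rule of thumb above can be checked with a short calculation. The following is only a sketch under an assumed method, not the formula of Table XII.4: it treats the number of events as a Poisson count and uses the normal approximation, so that the standard error of a rate is the rate divided by the square root of the number of events. The function name and the county figures are hypothetical. Under that assumption, the full width of the 95% interval relative to the rate is 2(1.96)/sqrt(events), which is roughly 0.88 at 20 events and grows as the count falls.

```python
import math

def rate_ci(events, population, per=100_000, z=1.96):
    """Approximate 95% CI for a rate, treating the event count as Poisson.

    Assumption (not taken from Table XII.4): SE(rate) = rate / sqrt(events).
    """
    if events <= 0 or population <= 0:
        raise ValueError("events and population must be positive")
    rate = events / population * per
    se = rate / math.sqrt(events)
    # The lower limit is clipped at zero because a rate cannot be negative.
    return rate, max(0.0, rate - z * se), rate + z * se

# Hypothetical county: 20 events in a population of 50,000.
rate, low, high = rate_ci(20, 50_000)
relative_width = (high - low) / rate
print(f"rate={rate:.1f} per 100,000, 95% CI=({low:.1f}, {high:.1f})")
print(f"interval width / rate = {relative_width:.2f}")
```

For very small counts an exact Poisson interval would be preferable to this approximation; the sketch is meant only to show why intervals widen so quickly as events become scarce.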

Two methods for comparing independent rates (that is, rates from different,
nonoverlapping geographic areas or from a single area at two different nonoverlapping
time intervals) have been suggested. The 95% confidence interval for the ratio of two
independent rates can be calculated using the formula shown in Table XII.5. The two
rates differ significantly at the 5% level if the 95% confidence interval for the ratio
of the two rates does not include 1. This method produces valid results if the rate
in the denominator is calculated from more than 100 events. The 95% confidence
interval for the difference between two independent rates can be calculated using the
formula shown in Table XII.6. The rates differ significantly at the 5% level if the
95% confidence interval of the difference between the two rates does not include zero.
Sometimes the two methods provide contradictory results; if that occurs, one should
conclude that the rates being compared are not significantly different (61).
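
The two comparisons and the conservative decision rule can be sketched as follows. The formulas of Tables XII.5 and XII.6 are not reproduced in this section, so the code assumes commonly used Poisson approximations instead: a log-scale interval for the ratio with standard error sqrt(1/d1 + 1/d2), and a normal interval for the difference with standard error sqrt(r1^2/d1 + r2^2/d2), where d is the number of events and r the rate. All function names and counts are hypothetical.

```python
import math

def ratio_ci(d1, n1, d2, n2, z=1.96):
    """95% CI for rate1/rate2 on the log scale; assumes Poisson counts."""
    ratio = (d1 / n1) / (d2 / n2)
    se_log = math.sqrt(1 / d1 + 1 / d2)
    return ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)

def difference_ci(d1, n1, d2, n2, z=1.96):
    """95% CI for rate1 - rate2; assumes SE(rate) = rate / sqrt(events)."""
    r1, r2 = d1 / n1, d2 / n2
    se = math.sqrt(r1 ** 2 / d1 + r2 ** 2 / d2)
    diff = r1 - r2
    return diff - z * se, diff + z * se

def rates_differ(d1, n1, d2, n2):
    """Conservative rule from the text: call the rates different only if
    the ratio interval excludes 1 AND the difference interval excludes 0."""
    lo_r, hi_r = ratio_ci(d1, n1, d2, n2)
    lo_d, hi_d = difference_ci(d1, n1, d2, n2)
    ratio_says = not (lo_r <= 1 <= hi_r)
    diff_says = not (lo_d <= 0 <= hi_d)
    return ratio_says and diff_says

# Hypothetical counts: 30 events among 40,000 persons compared with
# 150 events among 400,000 (the denominator rate rests on >100 events).
print(rates_differ(30, 40_000, 150, 400_000))
```

Requiring both intervals to agree mirrors the advice above that contradictory results should be read as no significant difference.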

In another report, four age-adjusted mortality indexes were compared, using 1969-1971
U.S. mortality data by county, for counties with populations of >5,000. On the basis
of coefficients of variation, the standardized mortality ratio produced stable
results for mortality data from all counties studied, while unacceptable instability
was found when the relative mortality index was applied to data from counties with
populations of <50,000. Calculation of years of life lost from all causes produced
stable results when applied to data from counties with populations of ≥25,000 (62).
The stability of rates for specific causes of death remains a problem for small
geographic areas. Methods for stabilization of rates have been developed, specifically
for mapping of uncommon events such as suicide or specific types of cancer by
county (63,64).

As an initial step, before a more complicated method for stabilization of rates is
applied, aggregated rates should be compared with disaggregated rates (i.e., multiple
years versus a single year; state-wide versus county-wide; and entire population
versus age-, gender-, or race-specific rates). High rates in geographic areas with
small populations--or in subsets of the population--may be due to chance, particularly
if the elevated rate is based on a small number of observed cases. Alternatively, if
increases are consistent over time--or across some population subgroups--it is more
likely that they represent important differences rather than chance occurrences.
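
A comparison of single-year and multiple-year rates might look like the following sketch. The county, its population, and the annual counts are entirely hypothetical, and the population is assumed constant over the period. The relative half-width of an approximate 95% interval is taken as 1.96/sqrt(events), which again assumes Poisson counts.

```python
import math

# Hypothetical small county: annual event counts and a constant population.
yearly_events = {1986: 2, 1987: 7, 1988: 3, 1989: 5, 1990: 3}
population = 12_000
PER = 100_000

def rate_per(events, person_years):
    """Rate per 100,000 person-years."""
    return events / person_years * PER

# Disaggregated: one rate per year. Aggregated: all years pooled.
single_year = {yr: rate_per(d, population) for yr, d in yearly_events.items()}
total_events = sum(yearly_events.values())
pooled = rate_per(total_events, population * len(yearly_events))

for yr, r in single_year.items():
    print(f"{yr}: {r:.1f} per 100,000 ({yearly_events[yr]} events)")
print(f"pooled: {pooled:.1f} per 100,000 ({total_events} events)")

# Precision improves as events accumulate.
fewest = min(yearly_events.values())
print(f"relative half-width, sparsest year: {1.96 / math.sqrt(fewest):.2f}")
print(f"relative half-width, pooled years:  {1.96 / math.sqrt(total_events):.2f}")
```

In this illustration the single-year rates swing widely around the pooled rate even though nothing about the underlying risk has been assumed to change, which is the pattern the paragraph above warns against overinterpreting.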

Other events deserve attention, even if only a single case occurs; the occurrence of a
sentinel health event represents a failure somewhere in the system of public health or
of health-care delivery and warrants careful attention. Such sentinel events include
maternal and infant deaths and a wide variety of infectious and noninfectious
conditions (65).

Intercensal population estimates for small areas are available from a variety of
sources. Because age-, gender-, and race-specific estimates for small areas from the
U.S. Bureau of the Census are of limited availability, state governments have often
developed their own estimates (66). Methods for interpolating census data for
estimation of small area populations have been developed (67).

Methods have also been developed for defining hospital service areas in metropolitan
areas (68). Although these methods have most commonly been used in studies of health-
services utilization in different geographic areas, they are potentially of value in
analyses of data generated by hospital-based surveillance at the state or local level.
Small-area analyses in health-services research have recently been reviewed (69). The
statistical issues raised by these studies are also relevant to analyses of
surveillance data (70).

Although more elaborate techniques have been described, most analyses of surveillance
data are quite simple--frequencies, proportions, and rates--which may be conveniently
presented as tables, graphs, or maps. Indeed, the simplest analyses--the
number of births to teenagers by census tract, or crude death rates by county--may be
the most useful for documenting the need for services. Simple analyses should be done
and their results thoughtfully considered before more complicated procedures are
undertaken. By far the most common error made in analysis of surveillance data is
failure to look at the data.

DISSEMINATION OF SURVEILLANCE: STATE AND LOCAL
PERSPECTIVES

Most of the issues relevant to the dissemination of surveillance data at the state and
local level have been addressed in Chapter VII. The role of newsletters, annual
reports, and press releases has already been addressed, as has the importance of clear
presentation and use of graphics. Mapping is a powerful technique for presenting
data. Electronic mail systems have been developed in some states to facilitate the
dissemination of information between state and local health departments.

RESOURCES FOR SURVEILLANCE AT THE STATE AND LOCAL
LEVEL

No model system for surveillance at the state or local level exists. There is great
variation in organizational structure of state and local health departments, and
surveillance activities are usually closely linked to disease-control programs.
Although this linkage helps assure that the data collected will indeed be used, it
complicates efforts to document the resources, personnel and other, needed for
surveillance; surveillance cannot be readily separated from other related activities.

There are only a few published reports that address the cost of routine surveillance
systems for communicable disease in state health departments. The cost of a newly
established active surveillance system that surveyed half the primary-care physicians
in Vermont was estimated to be $20,000 annually, compared with $3,000 for passive
surveillance (50). A study of the sentinel active surveillance system in Los Angeles
County estimated that the additional cost of weekly contacts made with selected
hospitals, physicians, schools, day-care centers, and university health centers was
approximately $7,000 per year, compared with an estimated $10,000 per year for passive
surveillance. The California costs reflected student instead of professional staff
time and did not include time expended in recording reports at the health department
(7). In 1985, the Kentucky Department for Health conducted active surveillance for
hepatitis A infections among one-half of primary-care practitioners in 45 of 120
counties in the state. The 22-week active surveillance program was estimated to cost
$5,616. Although the system was cost-effective overall, because the administration of
immune globulin to contacts averted an estimated $14,021 in direct medical and
indirect costs of potential subsequent cases, the health department itself, of course,
incurred increased cost. The system was not continued after the study was completed
(71).
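
The arithmetic behind the Kentucky example can be laid out explicitly. The two dollar figures and the 22-week duration come from the text; the ratio, net benefit, and weekly cost are derived here for illustration and are not quoted from the cited study. The variable names are hypothetical.

```python
program_cost = 5_616     # cost of the 22-week active surveillance program (from the text)
costs_averted = 14_021   # direct medical and indirect costs averted (from the text)
weeks = 22

benefit_cost_ratio = costs_averted / program_cost
net_benefit = costs_averted - program_cost
cost_per_week = program_cost / weeks

print(f"benefit-cost ratio: {benefit_cost_ratio:.2f}")
print(f"net benefit: ${net_benefit:,}")
print(f"program cost per week: ${cost_per_week:,.2f}")
```

The calculation also shows why the program could be cost-effective overall and still be discontinued: the averted costs accrue largely outside the health department, while the program cost falls entirely on it.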

Higher quality data on cost are available for some more recently developed
surveillance systems at the state level. A survey of 24 state and metropolitan health
departments that conducted surveillance for nutrition in 1981 found that an average of
16.6 hours of work by a nutritionist was required each month for the surveillance
system. Eight and one-half hours of clerical time were needed, along with support
from statisticians, computer technicians, and others (72).

Data collection, coding, and entry for 2,000 persons with injuries seen at a single
hospital participating in the National Electronic Injury Surveillance System cost
approximately $7,000 in 1989 (12).

Costs of the BRFSS are shared by CDC and participating state health departments
through cooperative agreements. In 1987, the cost per state was approximately
$50,000, or approximately $25-$30 per completed telephone interview (24).

Part of the Statewide Childhood Injury Prevention Project (SCIPP) in Massachusetts
involved conducting a random-digit telephone survey. Information on injuries in the
previous 2 months was obtained; because of the relative infrequency of these events, a
large sample size was needed. Twelve hundred households were contacted at a cost of
$25,000, yielding reports of only 80 injuries, most of which were falls (73).
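
The yield of the telephone survey can be expressed as unit costs. The three input figures come from the text; the derived quantities and variable names are illustrative only.

```python
# Figures reported in the text for the SCIPP telephone survey.
survey_cost = 25_000
households_contacted = 1_200
injuries_reported = 80

cost_per_household = survey_cost / households_contacted
cost_per_injury = survey_cost / injuries_reported
injuries_per_100_households = injuries_reported / households_contacted * 100

print(f"cost per household contacted: ${cost_per_household:.2f}")
print(f"cost per injury reported: ${cost_per_injury:.2f}")
print(f"injuries reported per 100 households: {injuries_per_100_households:.1f}")
```

Because so few households report an event in a 2-month recall period, the cost per injury identified is many times the cost per household contacted.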

More complete and accurate documentation of the costs of surveillance--including data
analysis and dissemination--may facilitate funding, particularly in the current era of
tight constraints on state budgets. Explicit discussion of costs and benefits may
help, both in terms of protecting (if not increasing) funding levels and assuring that
existing surveillance systems are necessary and make the best possible use of
personnel time.

SUMMARY

Public health surveillance--the systematic and ongoing collection of data pertinent to
public health, and the subsequent analysis and dissemination of these data--is the
first step toward action in public health, but it is only the first step. A number of
approaches to translation of data into action have been developed, with emphasis on
the local level. The Assessment Protocol for Excellence in Public Health (APEXPH),
developed in collaboration with the National Association of County Health Officials,
guides local health department officials through identification of health problems
that require priority attention and through building of community coalitions for
action (74). Such an approach provides a good foundation for adopting community
health objectives (75). These methods have been very successful in communities that
have undertaken them, and they provide useful outlines for translating information
into action at the community level. For example, in Tucson, Arizona, a community
coalition targeted for action the high rate of infant mortality, with the result that
a new program to provide prenatal care was established.

Other examples, at the state level, are readily available. National studies that
found that residents of Delaware died at high rates of preventable chronic disease
resulted in a statewide cancer control plan, including a mobile mammography unit for
inner-city neighborhoods. Widespread measles outbreaks occurred in New York State in
1989 among high school and college students who had been previously vaccinated.
Surveillance data led New York officials to reconsider the state's vaccination
strategy, with the result that in April 1989 New York became the first state in the
United States to adopt a two-dose schedule for routine measles vaccination (76).
Similarly, surveillance data in Tennessee led to the adoption of a statewide
vaccination requirement for children who attend school in the state (Figure XII.1).

The competition for limited dollars and for the attention of policy makers and the
public is intense. The challenge is to identify problems, set priorities, and work
with communities to develop solutions. More than ever, it is important to use data to
decide among competing priorities and allocate limited resources--the most important
of which are the time and energy of the public health practitioner and the best
interests of the public.

REFERENCES

1. Institute of Medicine. The future of public health. Washington, D.C.:
National Academy Press, 1988.

2. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. Mandatory reporting of
infectious diseases by clinicians. JAMA 1989;262:3018-26.

3. Freund E, Seligman JP, Chorba TL, Safford SK, Drachman JB, Hull HF. Mandatory
reporting of occupational diseases by clinicians. JAMA 1989;262:3041-4.

4. Feagin OT. Maine's sentinel physician system. Journal of the Maine Medical
Association 1971;62:187,201.

5. Schaffner W, Scott HD, Rosenstein BJ, Byrne EB. Innovative communicable disease
reporting: the Rhode Island experiment. HSMHA Health Rep 1971;86:431-6.

6. Dodson DR, Bright MF. Sentinel active surveillance system. Missouri Epidemiol-
ogist July 1989:1-2.

7. Weiss BP, Strassburg MA, Fannin SL. Improving disease reporting in Los Angeles
County: trial and results. Pub Health Rep 1988;103:415-21.

8. Laboratory Centre for Disease Control. Canadian communicable disease
surveillance system: disease-specific case definitions and surveillance methods. Can
Dis Wkly Rep 1991;17(Suppl 3):1-35.

9. Gallagher SS, Guyer B, Motelchuck M, Bass J, Lovejoy FH, McLoughlin E, Mehta K.
A strategy for the reduction of childhood injuries in Massachusetts: SCIPP. N
Engl J Med 1982;307:1015-8.

10. Runyan CW, Kotch JB, Margolis LH, Buescher PA. Childhood injuries in North
Carolina: a statewide analysis of hospitalizations and deaths. Am J Public
Health 1985;75:1429-32.

11. Hopkins RS. Consumer product-related injuries in Athens, Ohio, 1980-85:
assessment of emergency room-based surveillance. Am J Prev Med 1989;2:104-12.

12. Grisso JA, Wishner AR, Schwarz DF, Weene BA, Holmes JH, Sutton RL. A popula-
tion-based study of injuries in inner-city women. Am J Epidemiol 1991;134:59-
68.

13. King WD. Pediatric injury surveillance: use of a hospital discharge data base.
South Med J 1991;84:342-8.

14. Goebert DA, Ng MY, Varney JM, Sheetz DA. Traumatic spinal cord injury in
Hawaii. Hawaii Medical Journal 1991;50(2):44,47-48,50.

15. Smith GS, Langlois JA, Buechner JS. Methodological issues in using hospital
discharge data to determine the incidence of hospitalized injuries. Am J
Epidemiol 1991;134:1146-58.

16. Carr W, Szapiro N, Heisler T, Krasner MI. Sentinel health events as indicators
of unmet needs. Soc Sci Med 1989;29:705-11.

17. Watkins M, Lapham S, Hoy W. Use of a medical center's computerized health care
database for notifiable disease surveillance. Am J Public Health 1991;81:637-9.

18. Graitcer PL. The development of state and local injury surveillance systems. J
Safety Res 1987;18:191-8.

19. Feck G, Relethford JH. The addition of E-codes to the hospital discharge
reporting system in New York. Abstracts of the 119th Annual Meeting of the
American Public Health Association, November 10-14, 1991, Atlanta, Ga., 1991:1-
38.

20. Lloyd LE, Graitcer PL. The potential for using a trauma registry for injury
surveillance and prevention. Am J Prev Med 1989;1:34-7.

21. Patetta M, Cole T, Bowling JM, Watkins S. Evaluation of the representativeness
of the North Carolina Trauma Registry. Abstracts of the 119th Annual Meeting of
the American Public Health Association, November 10-14, 1991, Atlanta, Ga.,
1991:139.

22. Peterson D, Andrews JS, Levy BS, Mitchell B. An effective school-based influen-
za surveillance system. Public Health Rep 1979;94:88-92.

23. Finger R, Stapleton M, Pelletier A. Reportable diseases in Kentucky: a five-
year surveillance summary 1986-1990. Kentucky Cabinet for Human Resources,
Department for Health Services, Division of Epidemiology.

24. Remington PL, Smith MY, Williamson DF, Anda RF, Gentry EM, Hogelin GC. Design,
characteristics, and usefulness of state-based behavioral risk factor surveil-
lance: 1981-87. Public Health Rep 1988;103:266-375.

25. Anda RF, Waller MN, Wooten KG, et al. Behavioral risk factor surveillance,
1988. In: CDC Surveillance Summaries, June 1990. MMWR 1990;39(no. SS-2):1-21.

26. Kolbe LJ. An epidemiological surveillance system to monitor the prevalence of
youth behavior. Health Education 1990;21:44-8.

27. Aday LAA, Sellers C, Andersen RM. Potentials of local health surveys: a state-
of-the-art summary. Am J Public Health 1981;71:835-40.

28. Remington PL, Anderson DE, Manering MC, Peterson EA, Anderson H. The PRECEDES
Project: background and materials. Wisconsin Medical Journal 1990;89:695-6.

29. Fishbein DB. Rabies. Infect Dis Clin North Am 1991;5:53-71.

30. Moore CG, McLean RG, Mitchell CJ, et al. Guidelines for arbovirus surveillance
programs in the United States. Atlanta, Ga.: Public Health Service (in press).

31. Duffy J. The sanitarians: a history of American public health. Urbana,
Illinois: University of Illinois Press, 1990.

32. Starr R. The social transformation of American medicine. New York: Basic
Books, 1982:187.

33. Brandt AM. No magic bullet: a social history of venereal disease in the United
States since 1880. New York: Oxford University Press, 1987:42-3.

34. Haward RA. Scale of undernotification of infectious diseases by general
practitioners. Lancet 1973;1:873-4.

35. Thacker SB, Choi K, Brachman PS. The surveillance of infectious diseases. JAMA
1983;249:1181-5.

36. Wharton M, Fehrs L, Stroup N, Cochi SL. Health impact of varicella in the
1980s. Abstracts of the 30th Interscience Conference on Antimicrobial Agents
and Chemotherapy, October 21-24, 1990, Atlanta, Ga., 1990:276.

37. Rosenberg ML, Gangarosa EJ, Pollard RA, et al. Shigella surveillance in the
United States, 1975. J Infect Dis 1977;136:458-60.

38. Marier R. The reporting of communicable diseases. Am J Epidemiol 1977;105:587-90.

39. Vogt RL, Clark SW, Kappel S. Evaluation of the state surveillance system using
hospital discharge diagnoses, 1982-1983. Am J Epidemiol 1986;123:197-8.

40. Campos-Outcalt D, England R, Porter B. Reporting of communicable diseases by
university physicians. Pub Health Rep 1991;106:579-83.

41. Sacks JJ. Utilization of case definitions and laboratory reporting in the
surveillance of notifiable communicable diseases in the United States. Am J Pub
Health 1985;75:1420-22.

42. Shramm M, Vogt RL, Mamolen M. Disease surveillance in Vermont: who reports?
Pub Health Rep 1991;106:95-7.

43. Harkess JR, Gildon BA, Archer PW, Istre GR. Is passive surveillance always
insensitive? An evaluation of shigellosis surveillance in Oklahoma. Am J Pub
Health 1988;128:878-81.

44. Centers for Disease Control. Case definitions for public health surveillance.
MMWR 1990;39(No. RR-13):1-43.

45. Konowitz PM, Petrossian GA, Rose DN. The underreporting of disease and
physician's knowledge of reporting requirements. Pub Health Rep 1984;99:31-5.

46. Do physicians report diseases? Louisiana Morbidity Report 1990;1(4):1-2.

47. Jones JL, Meyer P, Garrison C, et al. Physician and infection control practi-
tioner HIV/AIDS reporting characteristics. Am J Pub Health 1992;82:889-91.

48. Seixas NS, Rosenman KD. Voluntary reporting system for occupational disease:
pilot project, evaluation. Public Health Rep 1986;101:278-82.

49. Brachott D, Mosley JW. Viral hepatitis in Israel: the effect of canvassing
physicians on notifications and the apparent epidemiological pattern. Bull Wld
Health Org 1972;46:457-64.

50. Vogt RL, LaRue D, Klaucke DN, Jillson DA. Comparison of an active and passive
surveillance system of primary care providers for hepatitis, measles, rubella
and salmonellosis in Vermont. Am J Pub Health 1983;73:795-7.

51. Thacker SB, Redmond S, Rothenberg RB, et al. A controlled trial of disease
surveillance strategies. Am J Prev Med 1986;2:345-50.

52. Scott HD, Thacher-Renshaw A, Rosenbaum SE, et al. Physician reporting of
adverse drug reactions: results of the Rhode Island Adverse Drug Reaction
Reporting Project. JAMA 1990;263:1785-8.

53. Rothenberg R, Bross DC, Vernon TM. Reporting of gonorrhea by private physi-
cians: a behavioral study. Am J Public Health 1980;70:983-6.

54. Spencer L, Wren GR. New reporting system aids epidemiologists. Hospitals
1979;53:105-6.

55. Fife D, McAnaney, Rahman MA. Changes in AIDS case reporting after hospital site
visits. Am J Public Health 1991;81:1648-50.

56. Tizes R, Pravda D. Proposed toll-free telephone reporting of notifiable
diseases. Health Serv Rep 1972;87:633-7.

57. Disease reporting--a health professional's responsibility. Public Health Letter
(Los Angeles County Department of Health Services) 2[10]. September 1980.

58. Isaacman SH. Significance of disease reporting requirements. Infectious
Disease News 1990;3(10):23.

59. Benenson AS (ed.). Control of communicable diseases in man. Fifteenth Edition.
Washington, D.C.: American Public Health Association 1990:xxvi.

60. Laboratory Centre for Disease Control. Establishing goals, techniques, and
priorities for national communicable disease surveillance. Can Dis Wkly Rep
1991;17:79-84.

61. Kleinman JC, Kiely JL. Infant mortality. NCHS Statistical Notes 1991;1:7-10.

62. Kleinman JC. Age-adjusted mortality indexes for small areas: applications to
health planning. Am J Public Health 1977;67:834-40.

63. Manton KG, Woodbury MA, Stallard E, et al. Empirical Bayes procedures for
stabilizing maps of U.S. cancer mortality rates. J Am Stat Assn 1989;84:637-50.

64. Lui KJ, Martinez B, Mercy J. An application of the empirical Bayes approach to
directly adjusted rates: a note on suicide mapping in California. Suicide and
Life-Threatening Behavior 1990;20:240-53.

65. Rutstein DD, Berenberg W, Chalmers TC, et al. Measuring the quality of medical
care: a clinical method. N Engl J Med 1976;294:582-8.

66. Balachandran M, Balachandran S (eds.). State and local statistics sources.
Detroit, Michigan: Gale Research Inc., 1990.

67. Aickin M, Dunn CN, Flood TJ. Estimation of population denominators for public
health studies at the tract, gender, and age-specific level. Am J Public Health
1991;81:918-20.

68. Thomas JW, Griffith JR, Durance P. Defining hospital clusters and associated
service communities in metropolitan areas. Socio-Economic Plan Sci 1981;15:45-51.

69. Paul-Shaheen P, Clark JD, Williams D. Small area analysis: a review and
analysis of the North American literature. J Health Politics Policy Law
1987;12:741-809.

70. Diehr P. Small area statistics: large statistical problems. Am J Public Health
1984;74:313-4.

71. Hinds MW, Skaggs JW, Gershon KB. Benefit-cost analysis of active surveillance
of primary care physicians for hepatitis A. Am J Public Health 1985;75:176-7.

72. Scheer JC, Sims LS. Status of nutritional surveillance activities in 24 state
and metropolitan health departments. Public Health Rep 1983;98:349-55.

73. Guyer B. The application of morbidity data in the Massachusetts Statewide
Childhood Injury Prevention Program. Can J Public Health 1989;80:432-4.

74. APEXPH: Assessment protocol for excellence in public health. Washington, D.C.:
National Association of County Health Officials, 1991.

75. Healthy communities 2000: model standards. Washington, D.C.: American Public
Health Association, 1991.

76. Birkhead GS, Morse DL, Mills IJ, Novick LF. New York State's two-dose schedule
for measles immunization. Public Health Rep 1991;106:338-44.

Chapter XIII

Important Surveillance Issues in
Developing Countries

Mac Otten

"The health of the people is really the foundation upon which all their happiness and all
their powers as a state depend."

Benjamin Disraeli

INTRODUCTION

Previous chapters in this book have discussed surveillance largely from the
perspective of developed countries. Although the issues they address are relevant to
all nations, developing countries have unique needs and opportunities. The health
conditions typically associated with the developing world--diarrhea, malaria,
pneumonia, and malnutrition--occur in settings with only rudimentary health care.
This chapter highlights a number of surveillance issues relevant to developing
countries, including resource constraints.

Although conducting surveillance in developing countries is complex, it also presents
unique opportunities. Because the formal health-care system is often an integral part
of organized government services, there are fewer impediments to implementing
surveillance systems. The limited number of health-care providers and diagnostic
laboratories reduces the number of data sources, which can facilitate quality
assurance. Moreover, acute diseases and injuries still represent major causes of
morbidity and mortality in many of these countries; these are conditions for which
surveillance techniques are well-developed. Finally, communities often have well-
defined health systems that can be used for surveillance purposes. These
opportunities should be taken when feasible--despite such obstacles as rudimentary
record-keeping systems and limited resources, numbers of diagnostic laboratories,
demographic and vital information, and infrastructure.

Four issues relating to surveillance are covered in this chapter: a) planning, b) data
sources (e.g., vital statistics, surveys, and sentinel surveillance), c) surveillance
at the local level, and d) development of integrated surveillance systems. In this
chapter, the term "local" refers to the health station (which we assume to be the
lowest level of the formal health system), where health assistants work. In addition,
"population-based" is used to describe information for all persons in a certain
geographic unit as opposed to facility-based information, which may represent only
persons from the catchment area of a given health facility.

PLANNING

Identifying Health Objectives and Linkage to Surveillance

Identifying measurable health objectives, assigning them priority, and then
linking surveillance to those objectives is a high-priority activity both for the
surveillance system and for health-system development in general (1-3). Linking
surveillance to these ordered health objectives avoids the pitfall of thinking
of surveillance as just the reporting of disease rather than as a system that uses
information from multiple sources (such as sentinel sites, exit interviews, and
regular surveys). Linking surveillance to objectives will also help planners think
creatively as they build a surveillance system that measures all priority health
objectives. Table XIII.1 lists data sources that could be used in building a
surveillance system in a developing country.

Throughout the world, health objectives should be based on health impact,
feasibility of intervention, and cost-effectiveness of the intervention. In
developing countries, measurable health objectives often cannot be identified
because high-quality, population-based mortality data are often missing. As a
result, estimates of mortality and health outcome from such international
organizations as the United Nations International Children's Emergency Fund (UNICEF)
and the World Health Organization (WHO), international conferences, and population
laboratories (e.g., International Center for Diarrheal Disease Research,
Bangladesh) are used. Although health problems are similar in most developing
countries (Table XIII.2), relying on data from other countries can create major
problems, especially for conditions for which impact is not clearly known (e.g.,
hepatitis B, iodine deficiency, or malaria) or for emerging health problems (e.g.,
human immunodeficiency virus [HIV] infection, tobacco use, and motor-vehicle
injuries).

The need for country-specific data is illustrated by the finding of World Bank
analysts that oral-rehydration therapy (ORT) in low-mortality environments is much
less cost-effective than passive case detection and short-course chemotherapy for
tuberculosis, whereas ORT in high-mortality environments is very cost-effective
(1). The cost-effectiveness varies by a factor of 2 to 10, depending on the local
situation.

Health objectives should focus both on current health status and on anticipated
health needs. It may be more cost-effective to address preventive strategies
(e.g., early bottle feeding, cessation of tobacco use, use of seat belts, and
sanitation) now rather than when the impact of adverse events becomes more
apparent.

For each health objective, the surveillance method for evaluating that objective
and its sub-objectives should be listed (Table XIII.3). Once such a list is made,
a surveillance grid can be constructed to show which component of the surveillance
system will measure which objective (Table XIII.4). Completing a surveillance
grid helps one visualize the overall structure and function of the surveillance
system.

The process of defining objectives, linking objectives to surveillance components,
and constructing surveillance grids will highlight surveillance needs. The
process provides a basis for strengthening existing components, for identifying
existing information that could measure objectives, and for developing innovative
new surveillance system components. For example, in many countries, the process
of linking surveillance to objectives highlights the need for mortality data and
the absence of vital statistics.

Often, the most important objectives--the reductions in mortality associated with
diarrhea and measles--are measured in sentinel areas, since in many countries
vital events are not registered for the entire country (Table XIII.4). Risk
factors, health-related behavior, and health interventions--such as ORT and use of
fluids at home, feeding during diarrhea, use of contraception, use of condoms, use
of chloroquine, and missed opportunities for vaccinations--can be measured
nationally with regularly scheduled surveys. Risk factors and interventions can
also be identified through exit interviews at the district, health-center,
health-station, or village level.

Using a surveillance grid developed for a hypothetical country, one sees that
surveillance for HIV is not as straightforward as for measles and diarrhea (Table
XIII.4). The primary health-status outcome chosen by this country's ministry of
health was not HIV-related mortality or acquired immunodeficiency syndrome (AIDS),
but HIV seroprevalence in selected areas and selected populations. Therefore,
sentinel vital-event registration areas will not be used to measure the HIV-
related objectives. In addition, the objectives for HIV-related risk factors and
health interventions are targeted at certain areas (areas in which HIV
seroprevalence of patients with sexually transmitted diseases [STDs] is >10%).
Since national surveys provide estimates only for the country as a whole, national
surveys will not be the primary method for measuring progress of objectives
related to HIV risk factors, behavior, and health interventions at a state or
local level.
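
The surveillance grid described above can be sketched as a simple lookup
structure. The following Python fragment is illustrative only: the objectives
and component names are loosely drawn from the examples in the text and do not
reproduce Table XIII.4, and all identifiers are hypothetical.

```python
# Illustrative surveillance grid: each objective maps to the surveillance
# components that measure it. Entries are hypothetical examples based on the
# text, not a reproduction of Table XIII.4.
GRID = {
    "measles mortality": {"sentinel vital-event registration"},
    "diarrhea mortality": {"sentinel vital-event registration"},
    "ORT use during diarrhea": {"national survey", "exit interviews"},
    "measles vaccination coverage": {"national survey", "exit interviews"},
    "HIV seroprevalence in selected areas": {"sentinel laboratory testing"},
    "condom use in high-seroprevalence areas": set(),  # no component assigned yet
}


def objectives_by_component(grid):
    """Invert the grid: list the objectives that each component measures."""
    inverted = {}
    for objective, components in grid.items():
        for component in components:
            inverted.setdefault(component, []).append(objective)
    return inverted


def unmeasured_objectives(grid):
    """Return objectives that no surveillance component currently measures."""
    return [objective for objective, components in grid.items() if not components]


if __name__ == "__main__":
    for component, objectives in sorted(objectives_by_component(GRID).items()):
        print(f"{component}: {', '.join(objectives)}")
    print("Gaps:", unmeasured_objectives(GRID))
```

Listing the gaps in this way mirrors the use of the grid to highlight
objectives for which a new surveillance component may be needed.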

Examining the surveillance system as a whole is important for assigning resources.
For diseases such as measles, diarrhea, pneumonia, and pertussis, surveillance
traditionally includes measurement of mortality in vital registration and
measurement of risk factors and health interventions nationally with surveys and
locally with exit interviews (4). However, conditions such as HIV, malaria,
malnutrition, tuberculosis (TB), vitamin A deficiency, and hepatitis B can be
difficult to measure.

Use of a surveillance grid facilitates the integration of some aspects of
surveillance and may increase cost-efficiency. For example, a laboratory team may
go to 12 sentinel sites in a year and test blood for HIV from pregnant women and
patients with sexually transmitted diseases (STDs), blood for syphilis serology
from 20- to 24-year-old pregnant women, sputum from 50 patients with cough for at
least 1 month, and blood smears from 50 children with fever. Efficiency can be
gained by constructing surveys--cluster surveys or exit interviews--that integrate
questions about priority topics such as diarrhea, measles, HIV, tobacco use, and
birth spacing.

Surveillance of Measures of "Outcome" Versus "Process"

Currently, at national and global levels, much emphasis is being placed on
measurement of processes (e.g., coverage with vaccinations) rather than on the
measurement of health outcomes (e.g., cases of measles) as the primary focus (5).
Emphasis is placed on process measures, in part, because systems for efficient
measurement of population-based health outcomes do not exist.

There are two major problems with process measures. First, process measures do
not directly measure primary events of interest--death and disease--or the
effectiveness of the processes (interventions). In contrast, the health outcome
is the measure of interest, and what is measured is the effectiveness (i.e., the
combined effect of the coverage and the efficacy of the intervention).
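
The combined effect of coverage and efficacy can be illustrated with a short
calculation. The sketch below assumes that effectiveness can be approximated by
the product of coverage and efficacy, and it uses an assumed efficacy of 90%
purely for illustration; the function name is hypothetical.

```python
def proportion_protected(coverage, efficacy):
    """Approximate population effectiveness as coverage multiplied by efficacy.

    Both arguments are proportions between 0 and 1. This simplification
    assumes the intervention is received before exposure and that efficacy
    is the same for everyone reached.
    """
    if not (0.0 <= coverage <= 1.0 and 0.0 <= efficacy <= 1.0):
        raise ValueError("coverage and efficacy must be proportions in [0, 1]")
    return coverage * efficacy


if __name__ == "__main__":
    # Assumed efficacy of 0.90; coverage values chosen only for illustration.
    for coverage in (0.20, 0.80, 0.90, 0.95):
        protected = proportion_protected(coverage, efficacy=0.90)
        print(f"coverage {coverage:.0%} -> protected {protected:.1%}")
```

Under these assumptions, even complete coverage leaves a share of the
population unprotected, which is one reason a coverage figure alone cannot
stand in for the health outcome.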

The usefulness of a process measure for surveillance depends on the true and
consistent effectiveness of the intervention being measured. Focusing on the
measurement of processes is most suitable when the intervention is documented to
have consistent, high effectiveness. For example, measles vaccine administered to
a 9-month-old infant is thought to be 90% effective in preventing subsequent
measles (6). Therefore, if a child receives measles vaccine before being exposed
to measles virus, the probability that s/he will have clinical measles is very
low.

The difficulty with process measurements, however, exists even with an
intervention as highly effective as measles vaccine (e.g., children infected with
measles virus before vaccination are not protected by vaccine). The effectiveness
of most interventions is often less than that of measles vaccine, and the
effectiveness of the delivery of such interventions varies substantially from
setting to setting. For example, on the basis of the industrialized-country
experience, three doses of oral poliovirus vaccine (OPV) were thought to have an
effectiveness of at least 95% in all settings (7,8). Yet, recent evaluations of
field vaccine efficacy, reviews of serologic efficacy, and outbreaks in countries
with high coverage with OPV have shown that the effectiveness of OPV in developing
countries is not as high as in industrialized countries, and that process measures
of OPV coverage can lead to a false sense of security (9-12).

In programs in which an intervention has high and consistent effectiveness, the
magnitude of the problem of using process measures also depends on the stage of
development of a program. If an intervention is reliably 70%-90% effective, as
are measles vaccine and OPV, one can be relatively confident that health outcomes
will be positively affected if coverage increases from 20% to 80%. However, one
cannot be at all confident of any change in health outcome if coverage increases
from 80% to 90% or 95%. In fact, statistically significant changes in coverage
from 80% to 90% or 90% to 95% cannot be detected by current methods of
measurement.
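
The difficulty of detecting small changes in coverage can be illustrated with an
approximate confidence interval. The sketch below is not taken from the text: it
assumes a cluster survey of 30 clusters of 7 children (n = 210) and a design
effect of 2, and it uses the normal approximation. Function names are
hypothetical, and overlap of intervals is only a rough screen, not a formal
significance test.

```python
import math


def coverage_confidence_interval(p, n, design_effect=2.0, z=1.96):
    """Approximate 95% confidence interval for a surveyed coverage proportion.

    Uses the normal approximation, with the simple-random-sample variance
    inflated by an assumed design effect to allow for cluster sampling.
    """
    if n <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("n must be positive and p must be in [0, 1]")
    half_width = z * math.sqrt(design_effect * p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def intervals_overlap(a, b):
    """Return True when two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    n = 30 * 7  # assumed survey size: 30 clusters of 7 children
    for before, after in ((0.20, 0.80), (0.80, 0.90), (0.90, 0.95)):
        ci_before = coverage_confidence_interval(before, n)
        ci_after = coverage_confidence_interval(after, n)
        verdict = "overlap" if intervals_overlap(ci_before, ci_after) else "are distinct"
        print(f"{before:.0%} vs {after:.0%}: intervals {verdict}")
```

Under these assumptions, the intervals around 80% and 90% overlap, whereas
those around 20% and 80% do not.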

A second major problem with process measures is measurement accuracy.
Intervention activities are often measured by administrative methods and
population-based surveys. An example of the administrative method of estimating
the percentage coverage of an intervention is counting the number of vaccinations
administered and then dividing by some denominator, such as the population in the
catchment area <1 year of age.

The administrative method is relatively easy and cheap to perform and is available
locally. On the other hand, both the numerator and the denominator are often
unavailable. For example, to estimate the percentage of persons who have received
a complete series of OPV, one must know the number of third doses of OPV
administered; this number is often not recorded.
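
The administrative method amounts to a single division, as in the minimal sketch
below. The function name and the figures in the usage example are hypothetical.
A missing numerator or denominator is represented by None so that the gap is
reported rather than hidden.

```python
def administrative_coverage(doses_given, target_population):
    """Estimate coverage (%) as doses administered divided by target population.

    Returns None when either count is unavailable, which the text notes is a
    common limitation of the administrative method.
    """
    if doses_given is None or target_population is None:
        return None
    if target_population <= 0:
        raise ValueError("target population must be positive")
    return 100.0 * doses_given / target_population


if __name__ == "__main__":
    # Hypothetical health station: 450 third doses of OPV recorded,
    # 600 children <1 year of age estimated in the catchment area.
    print(administrative_coverage(450, 600))
    # Third doses not recorded separately: coverage cannot be estimated.
    print(administrative_coverage(None, 600))
```

Because the denominator is itself an estimate, the result can exceed 100% when
the catchment population is undercounted; such values signal a data problem
rather than complete coverage.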

To overcome the limitations of administrative data, population-based surveys are
used to provide process measures (e.g., the percentage of persons who received ORT
during the most recent episode of diarrhea and the percentage of reproductive-age
women who use modern methods of family planning), especially at the national
level. Yet, there are increased costs associated with surveys and numerous
potential inaccuracies from current survey tools (see section on surveys below).

Using Outcome To Measure Process

In any international setting, surveillance for both outcomes and processes is
desirable, but the focus of surveillance should be on outcome measures.
Outcome-based programs have been extremely successful in global efforts to
eradicate smallpox, guinea worm, and poliomyelitis. The smallpox program, which
started out as a process-based (coverage-driven) program, switched to an
outcome-based program, which led to improved program effectiveness (13). An
outcome-based program in the Americas has decreased the number of cases of
poliomyelitis from nearly 3,000 in 1980 to a handful by 1990 (14). See Appendix
XIII.A for a more detailed discussion.

POPULATION-BASED SURVEILLANCE

Population-based surveillance is especially important in many developing countries
because of the disparities in access to health facilities and in health status
between urban centers and rural areas. A single hospital in the capital city often
consumes 25%-50% of the health budget for an entire country. Since surveillance
from sentinel sites and health facilities is often concentrated in urban areas,
public health needs in rural areas may not be well-represented by policy makers at
the national level unless population-based surveillance systems are used.

Vital-Event Registration

The measurement of vital events is the most important single addition that
developing countries can make to their existing surveillance system (see Chapter
III). Death and birth rates--along with cause-specific, age-specific, and
gender-specific rates--are very useful. In the United States, for example, 13 of
the 18 status indicators chosen to measure the health status of the population as
part of the health objectives for the nation will be measured using vital records
(15).

Why so little emphasis has been placed by developing countries on establishing
vital-event registration is not clear. Registration could begin in small sentinel
areas, could be evaluated for problems, and then could be expanded. The
vital-registration system in the United States started in 1900 in 10 sentinel
states, and it took 23 years for all states to be admitted into the system (16).
Obviously, in the early stages of setting up a registry, some births and deaths
would be missed. As late as 1974-1977, 21% of neonatal deaths were not registered
in Georgia (17); despite this underregistration, vital data have been extremely
useful.

In areas in which routine mortality data are not available, the verbal autopsy, in
which trained or untrained workers take histories from family members to classify
deaths by cause, is a useful technique (18). In 1978, WHO published a monograph
called Lay Reporting of Health Information (19). It contained a detailed list of
approximately 150 causes of death and a minimal list of 30 causes that could be
used by non-physicians to classify deaths by cause.

In establishing vital-event systems, consideration should be given to including
the registration of pregnancy. This is especially needed to measure the number of
neonatal deaths, which in turn is needed to allow accurate infant-mortality rates
to be calculated. Registration of pregnancies would allow measurement of prenatal
care, fetal death associated with syphilis, family planning, and other important
health concerns.
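
The dependence of the infant-mortality rate on complete counts of neonatal
deaths can be shown with a short calculation. The sketch below uses hypothetical
counts; the 21% under-registration figure is borrowed from the Georgia example
cited earlier in this section and is applied only for illustration. Function
names are hypothetical.

```python
def infant_mortality_rate(neonatal_deaths, postneonatal_deaths, live_births):
    """Infant deaths (neonatal plus postneonatal) per 1,000 live births."""
    if live_births <= 0:
        raise ValueError("live births must be positive")
    return 1000.0 * (neonatal_deaths + postneonatal_deaths) / live_births


def corrected_count(registered, fraction_missed):
    """Scale a registered count up for an assumed fraction of unregistered events."""
    if not 0.0 <= fraction_missed < 1.0:
        raise ValueError("fraction_missed must be in [0, 1)")
    return registered / (1.0 - fraction_missed)


if __name__ == "__main__":
    # Hypothetical sentinel area: 2,000 live births, 79 registered neonatal
    # deaths, 60 postneonatal deaths.
    registered_rate = infant_mortality_rate(79, 60, 2000)
    # Assume 21% of neonatal deaths were never registered.
    neonatal_estimate = corrected_count(79, 0.21)
    corrected_rate = infant_mortality_rate(neonatal_estimate, 60, 2000)
    print(f"rate from registered deaths: {registered_rate:.1f} per 1,000")
    print(f"rate after correction:       {corrected_rate:.1f} per 1,000")
```

The correction shown here is only as good as the assumed under-registration
fraction, which is the quantity that registration of pregnancies would help to
measure directly.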

Regular, Periodic Surveys

Regular, periodic surveys can be an important component of a surveillance system.
In particular, cluster surveys--multi-stage surveys with primary sampling units--
are important surveillance tools in many developing countries because they are the
only feasible method of collecting population-based information (20).

Cluster surveys have not been thought of as an essential and regularly performed
surveillance activity. Surveys have generally been single-purpose and have been
conducted intermittently on an as-needed basis, often at the request of
international organizations. However, because the survey is the only method of
gathering population-based information in many countries and surveys can be used
to collect information on a variety of health topics, regularly scheduled surveys
can constitute an excellent surveillance tool (see Behavioral Risk Factor
Surveillance in Chapter III).

To assure the development of a useful national surveillance system in a developing
country, a survey unit or survey person should be assigned the task of
coordinating all national health surveys. The coordinator first works with
program staff to develop surveillance questions in high-priority areas (e.g.,
diarrhea, vaccinations, HIV/AIDS, family planning, child survival, malaria, and
tuberculosis). Two to five questions are often adequate for some conditions. The
questions should be assigned priority so that the survey coordinator has some
flexibility to shorten the overall questionnaire if needed.

Previously conducted surveys can serve as models for adaptation to local
situations. For example, for vaccination-related questions, the Expanded
Programme on Immunization (EPI) at WHO has a useful module. WHO also has useful
questionnaires for diarrhea; acute respiratory-tract infections; and knowledge,
attitude, and behavior associated with HIV infection. The Centers for Disease
Control (CDC) has questionnaires on child mortality, health-station practices,
nutrition, HIV risk behavior among youths, and others.

Once questionnaire modules have been developed, each module should be field tested
for readiness for implementation. Advance preparation and testing are very
important; it is both difficult and time-consuming to develop an effective
questionnaire.

A small set (10 or so) of core questions measuring the highest-priority objectives
should be included in every survey. Some space should be reserved for last-minute
questions on information desired by high-level policy makers. Not only will this
demonstrate the timeliness of this surveillance component, but it might facilitate
political and financial support for its continuation. Finally, when the time
comes for a survey, the survey coordinator puts together the core questions, the
last-minute questions from the policy makers, and the appropriate survey modules.
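
The assembly step described above can be sketched in a few lines of Python. This
is an illustration under stated assumptions rather than a prescribed procedure:
the Question class, the priority scale, and the length limit are all
hypothetical. Core and last-minute questions are always kept, and module
questions are added in priority order until the limit is reached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    priority: int  # 1 = highest priority; larger numbers are dropped first


def assemble_questionnaire(core, last_minute, modules, max_questions):
    """Combine core, last-minute, and module questions within a length limit.

    Core and last-minute questions are always included. Module questions are
    then added in ascending priority order until max_questions is reached.
    """
    fixed = list(core) + list(last_minute)
    if len(fixed) > max_questions:
        raise ValueError("core and last-minute questions exceed the limit")
    optional = sorted(
        (question for module in modules for question in module),
        key=lambda question: question.priority,
    )
    return fixed + optional[: max_questions - len(fixed)]


if __name__ == "__main__":
    core = [Question("Has the child received measles vaccine?", 1)]
    last_minute = [Question("Question requested by policy makers", 1)]
    diarrhea_module = [
        Question("Did the child receive ORT during the last episode?", 1),
        Question("Was feeding continued during diarrhea?", 2),
    ]
    survey = assemble_questionnaire(core, last_minute, [diarrhea_module], 3)
    for question in survey:
        print(question.text)
```

Assigning priorities in advance is what gives the survey coordinator the
flexibility to shorten the questionnaire without renegotiating its content.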

Data collection desired by international organizations can be integrated into the
ministry of health's schedule of surveys. The survey coordinator can provide the
international organization that wishes to have a survey conducted with the
schedule and proposed modules to be used. The two groups can then collaborate to
determine how the needs of both groups could be met. The international group can
help train survey-unit staff and can help maintain a training manual on designing
and conducting a survey, including interviewing techniques. This method is a
cost-effective way to build local capacity and facilitate sustainability. See
Appendix XIII.B for a discussion of some statistical issues in cluster surveys.

Sentinel Surveillance

Sentinel surveillance at health facilities can play a critical role in
surveillance in developing countries. Sentinel sites are used to a) collect
important information not collected at all sites and b) pilot collection of new
information in order to be able to assess the usefulness of the data and the
method of collection. Since routinely reported information from all sites must be
restricted to high-priority items and must be easy to collect, much important
information is unlikely to be collected from all health facilities.

At sentinel sites, more resources and more experienced and dedicated personnel can
often be used to collect information on more diseases, more detailed information
about each case, and more difficult-to-collect information such as sexual
behavior. Also, sentinel sites can often serve as sources of information about
new conditions and can be used to determine the most effective methods for
inserting newly required data into the routine collection system.

There are several potential problems in interpreting data from sentinel sites.
Sentinel sites are often hospitals or other sophisticated facilities and tend to
serve urban patients. Such data will not reflect rural, small, non-urban health
stations where the majority of the population may live. Consequently, rural and
small health stations should be in the sentinel-site system.

Nevertheless, for several reasons, hospitals as sentinel sites and hospitals in
urban areas can yield important information in a timely manner at a relatively low
cost. First, cause-of-death data are available, permitting timely data collection
and analysis. Second, because the number of visits and deaths is large, hospitals
yield more precise estimates and allow subgroup analysis by age, gender, or
other important variables. Also, data are currently available, whereas systems of
vital events and regular, periodic surveys are not generally established. For
example, in Kinshasa, Zaire, the Ministry of Health used a hospital-based sentinel
surveillance system to establish that measles remained an important cause of death
for children <9 months old. The spread of clinically important resistance to
chloroquine was detected because of increasing mortality from malaria in sentinel
hospitals in numerous African countries (21).

Surveillance at the Local Level

Integrated, well-thought-out surveillance at the health-station and health-center
level warrants more focused attention, especially data collection, analysis, and
dissemination of results as a basis for public health action. Surveillance
responsibilities should be specified in employee work plans, and completion of
surveillance duties should be used to assess health-worker performance.

WHO has surveillance and evaluation training modules for vertical programs such as
EPI and Control of Diarrheal Diseases (CDD) (20,22,23), but there are no general
surveillance training modules for district or health-station levels. Local
surveillance is critical because major health problems in developing countries
require innovative public health action at the local level. Local surveillance
and public health action based on surveillance may be less urgent for programs
with high effectiveness and ease of administration (e.g., vaccinations), or for
programs that depend solely on the formal health-care system (e.g., acute
respiratory infections or tuberculosis). However, local surveillance and linked
public health action will be essential for most of the priority diseases (e.g.,
diarrhea, malaria, and HIV) and related prevention activities (oral-rehydration
solutions, chloroquine for all cases of fever, and condoms). In general, these
interventions require extensive behavior change on the part of clients and also
require local problem-solving, surveillance of objectives, strategy reformulation,
and creative intervention by health workers to be successful.

Collection, Display, and Analysis of Local Surveillance Data

Analysis of surveillance data and action based on that surveillance information at
the local level have several benefits. If collected data are prominently
displayed as tables and graphs in the local health office, public health personnel
(and patients) can see the results of data-collection efforts. Through the
analysis and interpretation of the displayed surveillance data, local staff can be
involved in the process of devising strategies to solve health problems and, at the
same time, can help attain national and local health objectives. Such involvement
gives health staff a sense of participation and professionalism.

The process of designing a surveillance system for a district or a health-station
is the same as for the national level. First, health priorities are determined on
the basis of the impact of the health problem and the feasibility and cost-
effectiveness of intervention. Second, objectives are determined and assigned
priority. Third, surveillance components to measure high-priority objectives are
identified and implemented.

Four differences between national and local surveillance sometimes emerge. First,
many health stations will not have mortality surveillance based on vital-event
registration, whereas national surveillance systems may include at least a
sentinel-registration component. However, health stations can begin sentinel
population-based mortality surveillance by starting vital-event registration in
one or two villages.

Second, 30-cluster surveys conducted regularly every 1-3 years are not feasible
for district and health-station surveillance of risk factors and health
interventions.

Third, resource constraints at the local level limit the number of sentinel sites.
However, both health stations and districts can conduct a form of sentinel
surveillance by limiting data collection on some health problems to a small sample
of sites at infrequent intervals. For example, although children have their
growth monitored throughout the year, the percentage with weight-for-age of <80%
of standard might be calculated only once every 3 months on a consecutive sample
of 30 children.
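
The quarterly calculation in this example can be sketched as follows. The
weights and standard weights shown are hypothetical; in practice the standard
weight for each child's age would come from whatever growth reference the
program has adopted, and the function name is an assumption made for
illustration.

```python
def percent_below_cutoff(records, cutoff=0.80):
    """Percentage of children whose weight is below cutoff x standard weight for age.

    records is an iterable of (weight_kg, standard_weight_kg) pairs, one per
    child in the consecutive sample.
    """
    records = list(records)
    if not records:
        raise ValueError("at least one record is required")
    below = sum(1 for weight, standard in records if weight / standard < cutoff)
    return 100.0 * below / len(records)


if __name__ == "__main__":
    # Hypothetical consecutive sample of 30 children: 6 are at 70% of the
    # standard weight for age and 24 are at 95%.
    sample = [(7.0, 10.0)] * 6 + [(9.5, 10.0)] * 24
    print(f"{percent_below_cutoff(sample):.0f}% below 80% of standard")
```

With a sample of only 30 children, each child changes the result by more than
3 percentage points, so quarter-to-quarter differences should be read
cautiously.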

Fourth, limited resources require integration of surveillance and non-surveillance
health information by local health workers.

Data collected routinely by health stations should be limited to high-priority
conditions. For example, mandatory reporting could be limited to 10 selected
diseases on the basis of established priorities or reporting laws. In addition,
the health station should meet certain standards before reporting requirements are
expanded: the health station staff should be a) reporting regularly, b) displaying
information collected, c) thinking about the meaning of the data, d) using the
data to solve health problems, and e) using the data to evaluate programs targeted
at certain health problems. If these are all being done, the staff is likely to
become enthusiastic about the public health aspect of the station's job and
initiate the idea of collecting more information. For example, information for
each case-patient (e.g., age and date of onset of disease) can be collected for
selected health problems instead of just reporting the number of cases of disease
(i.e., summary-count data). Additional diseases can be added on the basis of
priority setting (e.g., AIDS or moderate and severe malnutrition). The practice
of collecting data intermittently for special purposes can be expanded, and data
items found to be useful at sentinel sites can be added to reportable conditions
from all health stations or at least can be expanded to a larger number of
sentinel sites.

Display and interpretation of surveillance data and planned action based on the
interpretation can be integrated into assigned duties of health workers and into
the duties of their supervisors. Each health worker should have a detailed task
analysis or job description, with the task analysis linked to national and local
health objectives.

Employee and project work plans, based on supervisory visits and on input from
members of the community, should also reflect health objectives and ongoing
analysis and interpretation of surveillance data. For example, if one of the
high-priority health objectives is the reduction of measles cases by 50% as of
1995 (compared with the 1989-1991 baseline) and the graphs of measles cases by
year and measles cases by month in 1993 show no decline, the work plan for the
next 6 months might include conducting exit interviews, collecting additional
information on cases, and convening focus groups.
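
Progress toward an objective of this kind can be checked with a simple
comparison against the baseline. The annual counts below are hypothetical, and
the function names are assumptions made for illustration; only the structure of
the objective (a 50% reduction from the 1989-1991 baseline) follows the example
in the text.

```python
def percent_change_from_baseline(baseline_counts, current_count):
    """Percentage change of the current count relative to the mean baseline count."""
    if not baseline_counts:
        raise ValueError("at least one baseline count is required")
    baseline = sum(baseline_counts) / len(baseline_counts)
    if baseline <= 0:
        raise ValueError("baseline mean must be positive")
    return 100.0 * (current_count - baseline) / baseline


def target_met(baseline_counts, current_count, target_reduction_pct):
    """True when the reduction from baseline reaches the target percentage."""
    change = percent_change_from_baseline(baseline_counts, current_count)
    return change <= -target_reduction_pct


if __name__ == "__main__":
    baseline_1989_1991 = [120, 100, 110]  # hypothetical annual measles cases
    cases_1993 = 105                      # hypothetical
    change = percent_change_from_baseline(baseline_1989_1991, cases_1993)
    print(f"change from baseline: {change:.1f}%")
    print("50% reduction reached:", target_met(baseline_1989_1991, cases_1993, 50.0))
```

A result close to zero, as in this hypothetical station, is the kind of finding
that would prompt the additional activities listed in the work plan.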

Through focus groups, health workers can determine from groups of mothers why
children are not being vaccinated and what might be done to solve this problem.
Exit interviews can be used to determine measles vaccination coverage. Additional
information about the ages of persons with measles can be recorded for the next 6
months, and then the health worker and supervisor can determine whether measles is
a disease primarily among infants or among older persons as well. Using the
vaccination status of persons with measles, health workers can estimate measles
vaccination coverage. The effectiveness of a work plan should then be evaluated
both through continued surveillance of measles cases and through exit interviews.
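
One way to estimate coverage from the vaccination status of cases is the
so-called screening method, which relates the proportion of cases vaccinated
(PCV), the proportion of the population vaccinated (PPV), and vaccine efficacy
(VE) through PCV = PPV(1 - VE) / (1 - PPV x VE). The text does not name this
method, so the sketch below is offered as one possible approach. It assumes
that vaccine efficacy is known (90% is used here as an assumed value) and that
reported cases are representative of all cases; the function name and counts
are hypothetical.

```python
def estimate_coverage_from_cases(cases_vaccinated, total_cases, vaccine_efficacy):
    """Estimate the proportion of the population vaccinated (PPV).

    Rearranges the screening-method relation
        PCV = PPV * (1 - VE) / (1 - PPV * VE)
    to give
        PPV = PCV / (1 - VE + PCV * VE),
    where PCV is the proportion of cases that were vaccinated.
    """
    if total_cases <= 0 or not 0 <= cases_vaccinated <= total_cases:
        raise ValueError("case counts are inconsistent")
    if not 0.0 <= vaccine_efficacy <= 1.0:
        raise ValueError("vaccine efficacy must be in [0, 1]")
    pcv = cases_vaccinated / total_cases
    denominator = 1.0 - vaccine_efficacy + pcv * vaccine_efficacy
    if denominator == 0.0:
        raise ValueError("coverage cannot be estimated from these inputs")
    return pcv / denominator


if __name__ == "__main__":
    # Hypothetical: 30 of 100 measles cases had been vaccinated; VE assumed 0.90.
    coverage = estimate_coverage_from_cases(30, 100, vaccine_efficacy=0.90)
    print(f"estimated coverage: {coverage:.0%}")
```

Because the estimate is sensitive to the assumed efficacy and to how completely
vaccination status is recorded for cases, it is best treated as a rough
cross-check on coverage measured through exit interviews.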

In addition, the 6-month work plan could include teaching mothers about
appropriate preparation and use of oral-rehydration fluids at home. During a
supervisory visit, the supervisor can do exit interviews of 30 consecutive women
seen at the health station and record whether and what they have been taught about
using fluids at home, possibly asking for demonstration of what they have been
taught. At the same exit interviews, receipt of measles vaccine can be recorded
as a measure of coverage. This will integrate surveillance for measles coverage
with direct health-worker-performance assessment of a diarrhea-related task.

Exit Interviews and Focus Groups

Interviews of patients who have finished their visits at health facilities, which
can be called "exit interviews," can be a flexible, easy, and cost-effective
method of collecting information. Exit interviews are ideal for measuring
progress toward local health objectives. They can be used to collect data for
emergent problems or for routine surveillance, as well as to evaluate the
performance of health workers. For surveillance purposes, exit interviews can be
used to collect information about "process" health objectives, health risks,
health behavior, and health interventions. Unlike surveys, exit interviews can
be conducted frequently. Supervisory visits provide an excellent opportunity to
involve the supervisor in the conduct of exit interviews.

Focus groups can make important contributions to the design of a surveillance
system. As complex issues such as changes in behavior are assigned higher health
priorities (e.g., HIV-related behavior, diet, home fluids, treatment practices,
and reasons for not being vaccinated), focus groups are often used to gain new
information.

Focus groups often provide an appropriate first step in generating ideas about why
events and behavior occur. After ideas or hypotheses are available, surveys, exit
interviews, and special studies (case-control studies) can be used to identify
specific factors that should be incorporated into surveillance systems. Health-
station staff can use focus groups, along with exit interviews, to measure health
objectives of local importance.

BUILDING INTEGRATED SURVEILLANCE SYSTEMS

Over the last 15 years, the sophistication of public health in developing
countries has increased greatly. EPI provided one model for surveillance.
Surveillance for measles was relatively easy--the intervention was
consistently and highly effective, and almost all infections caused a distinct,
noticeable condition. However, the EPI surveillance model was not as successful
for problems such as diarrhea, pneumonia, family planning, and malaria, where the
interventions were less effective or less consistently effective and where the
outcome of interest was more difficult to measure.

Then, HIV appeared. Reporting of cases of AIDS was inadequate for immediate
prevention because of the lengthy incubation period for this condition. Accurate
surveillance for HIV had to rely on expensive laboratory testing.

Of the top 10 priority diseases in developing countries, only tuberculosis and
malaria require any laboratory testing (at least sentinel testing) for
surveillance, and the diagnostic tests for malaria and tuberculosis (though not
the tests for antimicrobial resistance) are relatively simple and inexpensive. In
addition, the appearance of HIV put new emphasis on the need for surveillance of
types of health behavior, the main prevention focus for HIV. Previously,
surveillance had been considered to be adequate in developing countries if it
covered disease reporting and vaccination coverage.

Now, surveillance data are expected to be available on risk factors and health
behavior (e.g., age at marriage and age at first sexual intercourse for
family-planning purposes), as well as on such newly important health problems as
hepatitis B, genital ulcer disease, urethritis, use of tobacco, and injuries
associated with motor vehicles.

As public health programs become more sophisticated and public health workers need
access to more information on more and more conditions, the complexity of the
structure of surveillance systems will increase. The integration of surveillance
and evaluation for vertical programs such as EPI, diarrhea, acute respiratory
infections, HIV/STD, and family planning into a coherent, rational surveillance
system will depend on the actions taken by ministries of health.

There are several advantages to integration:

* surveillance information can be gathered with greater cost-efficacy, and
* requirements for health-station staff will be simplified and their
  training will be less duplicative.

Although international organizations, often supporting vertical programs, control
a substantial proportion of the resources being spent on public health in
developing countries, these organizations are likely to respond favorably to the
implementation of logical, well-crafted, integrated surveillance systems that are
linked to written national health priorities.

Surveillance systems must continually focus on outcomes (cases of the health
problem) in order to adjust strategies and interventions for control and
prevention. Many countries are trying to reach low levels of vaccine-preventable
diseases by the year 1995 (measles and neonatal tetanus) or eradication by the
year 2000 (poliomyelitis) (24). The poliomyelitis eradication initiative attempted
to demonstrate that outcome-based surveillance intimately linked to intervention
can be the "leading wedge" in disease reduction.

The sophistication of the tools available in developing countries to analyze
surveillance data has also increased. Surveillance data have been analyzed with
computers at the national level for the past several years. As the prices of
computer hardware have continued to decrease, computers have been moved to zonal,
state, and provincial levels. Epi Info, an inexpensive and freely copyable
epidemiology computer program, is now available in English, French, Spanish, and
Arabic (25); also, manuals are available in Czech and Italian. Mapping of
surveillance data has been underutilized because inexpensive mapping programs that
can display maps by district, health station, and village and can be linked to
surveillance data bases have not been available. However, a mapping program
called Epi Map is compatible with Epi Info and can create maps of surveillance
data automatically.

SUMMARY

The vision for surveillance systems in developing countries as described above
involves systems that are linked to health objectives, ordered by priority,
limited in scope, and not burdensome at the health-station level. These systems
should also contain an extensive sentinel network and have strong elements of
population-based data gathering from surveys and vital event registration.
Surveillance data need to be collected routinely. Sentinel sites will provide the
information required to monitor health objectives, but such surveillance should
also be flexible enough to collect new data needed for emerging problems, and for
changing priorities.

Health objectives provide national politicians and health leaders a plan to
ensure the public's health. With a surveillance system that is linked to these
objectives, leaders will be able to monitor progress made toward meeting national
objectives. With analysis and action at the district and health-station level,
local health staff can take rapid and appropriate action.
Population-based vital statistics can show whether enough emphasis is being placed
on health in rural and remote areas of a country. Health surveys can be conducted
as a regular part of the surveillance system. Expertise and funding provided by
international organizations can help train and maintain a survey coordinator and
surveyors.

In implementing surveillance and health systems, developing countries can avoid the
mistakes that industrialized countries have already made: poorly planned and
fragmented surveillance systems, surveillance systems not linked to objectives,
health objectives that are not explicit and often politicized, large divisions
between curative and preventive medicine, and differences in health care in rural
versus urban areas.

As noted at the beginning of this chapter, surveillance in developing countries is
accompanied by numerous logistic problems but also presents unique opportunities.
The careful setting of health priorities and the meticulous allocation of limited
resources to the interests of the public's health can be the results of
surveillance in such settings.

Appendix XIII.A. Using Outcome To Measure Process

This appendix describes a method to estimate process measures from outcome
measures. Some process measures such as percentage coverage of an intervention
(e.g., percentage using chloroquine, percentage having received vaccine,
percentage using ORT) may be cost-effectively assessed by outcome data (e.g.,
number of cases of malaria, cases of measles, deaths from diarrhea). There is a
relationship among the proportion of persons with a disease who have "received"
an intervention, the effectiveness of the intervention, and the "coverage" of the
intervention in the population. The relationship is as follows:

$$\mathrm{PCI} = \frac{\mathrm{PPI} - (\mathrm{PPI} \times \mathrm{Eff})}{1 - (\mathrm{PPI} \times \mathrm{Eff})}$$

where PCI is the percentage of the cases of disease exposed to the intervention,
PPI is the percentage of the population exposed to the intervention, and
Eff is the efficacy of the intervention (each entered in the formula as a
proportion between 0 and 1).

This formula is derived from the formula for program (vaccine) efficacy, where
efficacy equals the attack rate among persons not exposed to the program or
intervention minus the attack rate among persons exposed, divided by the attack
rate among those unexposed (26), i.e., for vaccine efficacy, Eff = VE or vaccine
efficacy; PCI = PCV or percentage of case-patients who are vaccinated; and PPI =
PPV or percentage of the population vaccinated.
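
The step from the efficacy definition to the relationship above can be sketched as follows. The symbols ARU and ARE (attack rates among persons unexposed and exposed to the intervention) are introduced here only for illustration, and the sketch assumes that exposed and unexposed persons would otherwise have the same risk of disease.

```latex
\begin{align*}
\mathrm{Eff} &= \frac{\mathrm{ARU} - \mathrm{ARE}}{\mathrm{ARU}}
  \quad\Longrightarrow\quad \mathrm{ARE} = \mathrm{ARU}\,(1 - \mathrm{Eff}) \\[4pt]
\mathrm{PCI} &= \frac{\mathrm{PPI}\cdot\mathrm{ARE}}
                     {\mathrm{PPI}\cdot\mathrm{ARE} + (1 - \mathrm{PPI})\cdot\mathrm{ARU}}
  = \frac{\mathrm{PPI}\,(1 - \mathrm{Eff})}{\mathrm{PPI}\,(1 - \mathrm{Eff}) + 1 - \mathrm{PPI}}
  = \frac{\mathrm{PPI} - (\mathrm{PPI} \times \mathrm{Eff})}{1 - (\mathrm{PPI} \times \mathrm{Eff})}
\end{align*}
```

The numerator counts cases arising among exposed persons and the denominator counts all cases; ARU cancels, so only coverage and efficacy remain.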

The graphic representation of this formula is known in immunization programs as
the vaccine-efficacy curve (Figure XIII.A.1) (26). As an example, if the
percentage of case-patients with disease who have been exposed to the
intervention (PCI) is <20%, the coverage of the intervention in the population
(PPI) is poor (assuming that the efficacy of the intervention is 90% or less). If the
proportion of case-patients who have received the intervention is >50%, either the
percentage coverage is high or the efficacy of the intervention is low. To
estimate from surveillance the coverage of cases, one needs to determine whether
persons with the disease were or were not exposed to a particular intervention
(e.g., whether case-patients used condoms, whether case-patients received
appropriate home fluids, or whether case-patients received vaccine).
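
As a rough illustration of how the relationship might be used in both directions, the following sketch computes PCI from coverage and efficacy and, by rearranging the same formula, the coverage implied by an observed PCI. The function names are hypothetical, all quantities are proportions between 0 and 1, and the efficacy passed in is an assumption the user must supply.

```python
def _check_proportion(name: str, value: float) -> None:
    """Reject values that are not proportions between 0 and 1."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be a proportion between 0 and 1")


def pci_from_coverage(ppi: float, eff: float) -> float:
    """PCI = (PPI - PPI*Eff) / (1 - PPI*Eff)."""
    _check_proportion("ppi", ppi)
    _check_proportion("eff", eff)
    denominator = 1.0 - ppi * eff
    if denominator == 0.0:
        raise ValueError("PCI is undefined when PPI and Eff are both 1")
    return (ppi - ppi * eff) / denominator


def coverage_from_pci(pci: float, eff: float) -> float:
    """Same relationship solved for PPI: PPI = PCI / (1 - Eff + PCI*Eff)."""
    _check_proportion("pci", pci)
    _check_proportion("eff", eff)
    denominator = 1.0 - eff + pci * eff
    if denominator == 0.0:
        raise ValueError("PPI is undefined when Eff is 1 and PCI is 0")
    return pci / denominator


# 20% of case-patients exposed, assumed efficacy of 90%
print(round(coverage_from_pci(0.20, 0.90), 3))  # 0.714
# 50% of case-patients exposed, assumed efficacy of 50%
print(round(coverage_from_pci(0.50, 0.50), 3))  # 0.667
```

With an assumed efficacy of 90%, a PCI of 20% corresponds to coverage of about 71%; lower assumed efficacy gives lower implied coverage for the same PCI.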

To use the formula or the curve, the exposure to the intervention must be
dichotomized into a "yes/no" format. For example, for poliomyelitis, exposure is
categorized into "fully vaccinated" with ≥3 doses of vaccine and "not fully
vaccinated" with <3 doses of vaccine. This method has several advantages. It
allows estimates of coverage at the health-station level, which allows local
action to solve local health problems. It is much simpler and cheaper than
conducting surveys, it provides information about effectiveness as well as
coverage, and it is more difficult to falsify than coverage-survey and
administrative-method estimates. However, this method provides only a crude
estimate and should be used with other sources of data. For example, if the
survey or administrative estimate of OPV3 coverage is 95%, and only 20% of
confirmed poliomyelitis case-patients received 3 doses of OPV, then the survey or
administrative estimates should be questioned.
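
One way to see why such a pair of figures invites scrutiny is to rearrange the relationship for efficacy and ask what efficacy the two numbers would jointly imply. The sketch below does this; the function name is hypothetical, and whether the resulting value is implausible depends on the efficacy expected for three doses of OPV in the setting concerned, which is not stated here.

```python
def implied_efficacy(ppi: float, pci: float) -> float:
    """Efficacy implied by coverage (PPI) and the proportion of cases exposed (PCI).

    Rearranged from PCI = (PPI - PPI*Eff) / (1 - PPI*Eff):
        Eff = (PPI - PCI) / (PPI * (1 - PCI))
    Both arguments are proportions, not percentages.
    """
    if not (0.0 < ppi <= 1.0) or not (0.0 <= pci < 1.0):
        raise ValueError("need 0 < PPI <= 1 and 0 <= PCI < 1")
    return (ppi - pci) / (ppi * (1.0 - pci))


# Reported OPV3 coverage of 95% with 20% of confirmed case-patients fully vaccinated
print(f"{implied_efficacy(0.95, 0.20):.3f}")  # 0.987
```

If an efficacy of nearly 99% is higher than can reasonably be expected, the more likely explanation is that true coverage is lower than the survey or administrative estimate.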

Appendix XIII.B. 30-Cluster EPI Survey Design

In the absence of an internationally funded survey to attach modules or questions
desired by a ministry of health, a 30-cluster EPI survey can be performed (20).
The EPI survey was designed to provide a crude estimate of vaccination coverage
(±10%) (27); it provided information about whether vaccination coverage was low
(20%-40%) or relatively high (70%-90%). Other programs have adapted the design
for other purposes (e.g., mortality from neonatal tetanus, mortality and practices
associated with diarrhea, and changes in vaccination coverage over time) (22,23).

However, results have often been misleading because appropriate confidence
intervals were not calculated. Many health professionals did not realize that the
confidence interval for each survey was not fixed at ±10% but varied depending on
the results (intracluster correlation and the point estimate) of each survey.
Often confidence intervals were not calculated and appropriate analyses of
subgroups (males, females) were not done because easy-to-use computer programs
were not available. Fortunately, such computer programs as COSAS, a Lotus
spreadsheet for diarrhea cluster surveys, and CLUSTER (which runs within Epi Info)
are now available to calculate appropriate confidence intervals. However, if an
analysis by age, by gender, or by some other specific characteristic is desired, a
more complicated program (e.g., SUDAAN or CARP) still must be used to obtain valid
point estimates and valid confidence intervals (28). For example, one cannot get
a valid estimate of coverage for males and females in a typical EPI coverage
survey without the use of such a program.
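
To illustrate why the interval is not fixed at ±10%, the following sketch computes a confidence interval for overall coverage from per-cluster counts, using a commonly used ratio-estimator variance with the cluster as the sampling unit. It is not a description of how COSAS or CLUSTER compute their results; the function name, the normal-approximation multiplier of 1.96, and the example data are all assumptions, and the sketch does not address subgroup analysis.

```python
import math
from typing import Sequence, Tuple


def cluster_proportion_ci(
    positives: Sequence[int], sizes: Sequence[int], z: float = 1.96
) -> Tuple[float, float, Tuple[float, float], float]:
    """Return (estimate, standard error, (lower, upper), design effect)."""
    k = len(sizes)
    if k < 2 or k != len(positives):
        raise ValueError("need matching counts for at least two clusters")
    n = sum(sizes)
    p = sum(positives) / n
    # Variability of cluster totals around what the overall proportion predicts.
    ss = sum((y - p * m) ** 2 for y, m in zip(positives, sizes))
    variance = (k / (k - 1)) * ss / n ** 2
    se = math.sqrt(variance)
    # Design effect: variance relative to a simple random sample of the same size.
    srs_variance = p * (1.0 - p) / n
    deff = variance / srs_variance if srs_variance > 0 else float("nan")
    interval = (max(0.0, p - z * se), min(1.0, p + z * se))
    return p, se, interval, deff


# Hypothetical survey: 30 clusters of 7 children, number vaccinated per cluster
vaccinated = [7, 6, 5, 7, 4, 6, 7, 3, 5, 6, 7, 7, 2, 6, 5,
              4, 7, 6, 6, 5, 7, 3, 6, 7, 5, 4, 6, 7, 6, 5]
estimate, se, (low, high), deff = cluster_proportion_ci(vaccinated, [7] * 30)
print(f"coverage {estimate:.1%}, 95% CI {low:.1%} to {high:.1%}, design effect {deff:.2f}")
```

The width of the interval depends on how much coverage differs from cluster to cluster and on the point estimate itself, so two surveys of identical design can have quite different precision.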

As the use of the cluster survey becomes more sophisticated and as greater
accuracy and precision are desired, use of the EPI cluster-survey design is
complicated by the potential for bias in both selection of the first house and
subsequent selection of additional houses (29). Despite being designed and
analyzed as a survey with equal probability of selection, selection of the
starting house from a randomly selected direction yields a higher probability of
selection for houses near the middle of the cluster. If occupants near the middle
of the cluster have some characteristic associated with the outcome (e.g., have
higher incomes), a biased estimate will result.

An alternative method of selecting the first and additional houses in a cluster is
by segmenting and subsegmenting the cluster until a small number of houses can be
mapped (e.g., 30 houses). Then, the first and additional houses can be chosen at
random. If one assumes that the number of target-group persons per household is
similar in all clusters, valid point estimates and approximate confidence
intervals can be calculated using less-complicated programs (CLUSTER and COSAS).
The use of subsegmenting in the absence of being able to select the first house
randomly has also been described.

An easy-to-use program that appropriately analyzes cluster surveys (including
appropriate analysis of subgroups and comparison of two independent surveys done
at two different times) operating within Epi Info is being prepared.


REFERENCES

1. Jamison DT, Mosley WH. Disease control priorities in developing countries:
health policy responses to epidemiological change. Am J Public Health
1991;81:15-22.

2. Jamison DT, Mosley WH (eds.). Disease control priorities in developing
countries. Oxford and New York: Oxford University Press. In press.

3. Walsh JA, Warren KS. Selective primary health care: an interim strategy
for disease control in developing countries. N Engl J Med 1979;301:967-74.

4. Frerichs RR. Epidemiologic surveillance in developing countries. Ann Rev
Pub Health 1991;12:80-257.

5. Lemeshow S, Robinson D. Surveys to measure programme coverage and impact:
a review of the methodology used by the Expanded Programme on Immunization.
World Health Stat Q 1985;38:65-75.

6. Markowitz LE, Sepulveda J, Diaz-Ortega JL et al. Immunization of
6-month-old infants with different doses of Edmonston-Zagreb and Schwarz measles
vaccines. N Engl J Med 1990;322:580-7.

7. Hardy GE, Hopkins CC, Linnemann CC et al. Trivalent oral polio vaccine: a
comparison of two infant immunization schedules. Pediatrics 1970;45:444-8.

8. McBean AM, Thoms ML, Albrecht P et al. Serologic response to oral polio
vaccine and enhanced-potency inactivated poliovaccines. Am J Epidemiol
1988;128:615-28.

9. Deming MS, Jaiteh KO, Otten MW et al. Epidemic poliomyelitis in The Gambia
following the control of poliomyelitis as an endemic disease. II.
Clinical efficacy of trivalent oral polio vaccine. Am J Epidemiol
1992;135:393-408.

10. Patriarca PA, Wright PF, John JT. Factors affecting the immunogenicity of
oral poliovirus vaccine in developing countries: review. Review of
Infectious Diseases 1991;13:926-39.

11. Sutter RW, Patriarca PA, Brogan S et al. Outbreak of paralytic
poliomyelitis in Oman: evidence for widespread transmission among fully
vaccinated children. Lancet 1991;338:715-20.

12. Otten MW, Deming MS, Jaiteh KO et al. Epidemic poliomyelitis in The Gambia
following the control of poliomyelitis as an endemic disease. I.
Descriptive findings. Am J Epidemiol 1992;135:381-92.

13. Fenner F, Henderson DA, Arita I et al. Smallpox and its eradication.
Geneva, Switzerland: World Health Organization, 1988:475-6.

14. Pan American Health Organization. Health Information System for EPI.
Washington, D.C.: Pan American Health Organization, 1992.

15. Centers for Disease Control. National Center for Health Statistics.
Health status indicators for the year 2000. Statistical Notes 1991;1:1-4.

16. U.S. Bureau of the Census. Historical statistics of the United States,
Colonial times to 1970, Bicentennial edition, Part 1. Washington, D.C.:
Government Printing Office, 1975.

17. McCarthy BJ, Terry J, Rochat R, Quave S, Tyler CW. The underregistration
of neonatal deaths: Georgia 1974-1977. Am J Public Health 1980:977-82.

18. Kielmann AA, Taylor CE, DeSweemer C et al. Child and maternal health
services in rural India: the Narangwal experiment. Baltimore: Johns
Hopkins University Press, 1983.

19. World Health Organization. Lay reporting of health information. Geneva,
Switzerland: World Health Organization, 1978.

20. Expanded Programme on Immunization. The EPI coverage survey. Training for
mid-level managers. Geneva, Switzerland: World Health Organization, 1988.

21. U.S. Agency for International Development, Centers for Disease Control.
African child survival initiative, 1989-90. Bilingual annual report.
Washington, D.C.: Government Printing Office, 1990.

22. Galazka A, Stroh G. Guidelines on the community-based survey of neonatal
tetanus mortality. Geneva, Switzerland: World Health Organization,
WHO/EPI/GEN/86/8, 1986.

23. Programme for Control of Diarrhoeal Diseases. Household survey manual:
diarrhea case management, morbidity, and mortality. Geneva, Switzerland:
World Health Organization, CDD/SER/86.2/Rev.1, 1989.

24. World Health Organization. Global Advisory Group. Part II. Expanded
Programme on Immunization. Weekly Epidemiological Record 1992;67:17-19.

25. Dean AG, Dean JA, Burton AH et al. EPI INFO. Version 5. A
word-processing, data-base, and statistics program for epidemiology on
microcomputers. Atlanta, Ga.: Centers for Disease Control, 1990.

26. Orenstein WA, Bernier RH, Dondero TJ et al. Field evaluation of vaccine
efficacy. Bull World Health Organization 1985;63:1055-68.

27. Henderson RH, Sundaresan T. Cluster sampling to assess immunization
coverage: a review of experience with simplified sampling methods. Bull
World Health Organization 1982;60:253-60.

28. Shah B, Barnwell BG, Hunt P et al. SUDAAN user's manual. Release 5.50.
Raleigh, N.C.: Research Triangle Institute, 1989.

29. Lemeshow S, Stroh G. Sampling techniques for evaluating health parameters
in developing countries. Washington, D.C.: National Academy Press, 1988.

Reference Material for Principles and Practice of Public Health Surveillance

TABLES

Chapter I:

Table I.1. The uses of surveillance

Chapter II:

Table II.1. Steps in planning a surveillance system
Table II.2. Criteria for identifying high-priority health
events for surveillance

Chapter III:

No tables

Chapter IV:

Table IV.1. Essential questions for the practice of effective
disease/injury reporting
Table IV.2. Concerns of the data-base manager

Chapter V:

Table V.1. Rates and quantities involving rates commonly used in
epidemiology
Table V.2. Crude death rates--Dade and Pinellas counties,
Florida, 1980
Table V.3. Age-specific death rates--Dade and Pinellas counties,
Florida, 1980
Table V.4. Directly standardized death rates--Dade and Pinellas
counties, Florida, 1980
Table V.5. Indirectly standardized death rates--Dade and Pinellas
counties, Florida, 1980
Table V.6. Five-number summary of 39 4-week totals of reported
cases of meningococcal infections--United States,
1987-1989

Table V.7. Common power transformations (y — y )

Table V.8. Guide for selecting data graphics

Table V.9. Primary and secondary morbidity from syphilis, by age
category--United States, 1989
Table V.10. Primary and secondary morbidity from syphilis, by age
category, race, and gender--United States, 1989

Chapter VI:

No tables

Chapter VII:

Table VII.1. Controlling and directing information dissemination

Chapter VIII:

Table VIII.1. Sample case definition developed by the Centers
for Disease Control and the U.S. Council of State
and Territorial Epidemiologists
Table VIII.2. The detection of health conditions with a
surveillance system
Table VIII.3. Comparison of estimated costs for active and passive
surveillance systems in a health department, Vermont,
June 1, 1980, to May 31, 1981
Table VIII.4. Outline of sample surveillance evaluation report

Chapter IX:

Table IX.1. Ethical responsibilities in surveillance--
participants and duties
Table IX.2. An ethical checklist for public health surveillance

Chapter X:

No tables

Chapter XI:

No tables

Chapter XII:

Table XII.1. Reasons cited by physicians for failure to report
notifiable diseases (42, 45-47)
Table XII.2. What local and state health departments can do to
improve reporting by physicians
Table XII.3. Criteria used to set priorities for national
disease surveillance, Canada (60)
Table XII.4. Confidence intervals for rates (61)
Table XII.5. Formula for calculating 95% confidence intervals
for the ratio of two independent rates (61)
Table XII.6. Formula for calculating 95% confidence intervals
for the difference between two independent rates

Chapter XIII:

Table XIII.1. Examples of data sources for surveillance in
developing countries
Table XIII.2. Health problems ranked according to preventability
and treatability, Thailand, 1987
Table XIII.3. Examples of objectives linked to surveillance
components that will measure objectives
Table XIII.4. Grid to identify which surveillance component
will measure a health objective in a hypothetical
developing country

FIGURES

Chapter I:

Figure I.1. Reported cases of congenital syphilis among infants
<1 year of age and rates of primary and secondary (P&S)
syphilis among women--United States, 1970-1991
Figure I.2. Salmonella rates in New Hampshire and contiguous states,
by county
Figure I.3. Homicide rate, by age and gender of victim, United
States, 1986
Figure I.4. Malaria rates, by year--United States, 1930-1988
Figure I.5. Reported cases of measles, by age group, United
States, 1980-1982
Figure I.6. Semi-logarithmic-scale line graph of reported cases
of paralytic poliomyelitis--United States, 1951-1989
Figure I.7. Percentage of reported cases of gonorrhea caused by
antibiotic-resistant strains--United States, 1980-1990
Figure I.8. Cesarean deliveries as a percentage of all deliveries
in U.S. hospitals, by year, 1970-1990

Chapter II:

No figures

Chapter III:

No figures

Chapter IV:

No figures

Chapter V:

Figure V.1. Crude, gender-specific and gender-race-specific cases
of primary and secondary syphilis--United States, 1981-
1990, comparison of differential trends

Figure V.2. Dot plot of results of swine influenza virus (SIV)
hemagglutination-inhibition (HI) antibody testing
among exposed and unexposed swine exhibitors--Wisconsin,
1988

Figure V.3. Ordered data series and stem-and-leaf display of 39
4-week totals of reported cases of meningococcal
infections--United States, 1987-1989

Figure V.4. Scatter plot of 39 4-week totals of reported cases of
meningococcal infections--United States, 1987-1989

Figure V.5. Box plot of 39 4-week totals of reported cases of
meningococcal infections--United States, 1987-1989

Figure V.6. Histogram (epidemic curve) of reported cases of
paralytic poliomyelitis--Oman, January 1988-
March 1989

Figure V.7. Sample cumulative attack rate, by grade in school
and time of onset--North Carolina, 1985

Figure V.8. Survival curves over time, based on serum
testosterone level. Eastern Cooperative Oncology Group

Figure V.9. Frequency polygon of reported cases of encephalitis--
United States, 1965

Figure V.10. Group bar chart of case-fatality rates from ectopic
pregnancy, by age group and race--United States,
1970-1987

Figure V.11. Stacked bar chart of underlying causes of infant
mortality, by racial/ethnic group and age at death--
United States, 1983

Figure V.12. Deviation bar chart of notifiable disease reports,
comparison of 4-week totals ending May 23, 1992, with
historical data--United States

Figure V.13. Pie charts of poliomyelitis vaccination status of
children ages 1-4 years in cities with populations
equal to or greater than 250,000, by financial status--
United States, 1969

Figure V.14. Spot map of deaths from smallpox--California,
1915-1924

Figure V.15. Choropleth map of confirmed and presumptive cases
of St. Louis encephalitis, by county--Florida, 1990

Figure V.16. Density-equalizing map of California (based upon
population density), depicting deaths from smallpox,
1915-1924

Chapter VI:

Figure VI.1. Example: Data used for report published during week
20 (May 23, 1992)

Chapter VII:

No figures

Chapter VIII:

Figure VIII.1. National Notifiable Diseases Surveillance System
Figure VIII.2. Biases in surveillance

Chapter IX:

No figures

Chapter X:

No figures

Chapter XI:

No figures

Chapter XII:

Figure XII.1. Cartoon depicting mumps as a public health
problem, Tennessee

Chapter XIII:

Figure XIII.A.1. Percentage of case-patients vaccinated
(PCV) per percentage of population
vaccinated (PPV) for seven values of
vaccine efficacy (VE)

TABLE I.1. The uses of surveillance (23)

Quantitative estimates of the magnitude of a health problem.

Portrayal of the natural history of disease.

Detection of epidemics.

Documentation of the distribution and spread of a health event.

Facilitating epidemiologic and laboratory research.

Testing of hypotheses.

Evaluation of control and prevention measures.

Monitoring of changes in infectious agents.

Monitoring of isolation activities.

Detection of changes in health practice.

and planning

TABLE II.1. Steps in planning a surveillance system

1. Establish objectives.

2. Develop case definitions.

3. Determine data source or data-collection mechanism (type of system)

4. Develop data-collection instruments.

5. Field test methods.

6. Develop and test analytic approach.

7. Develop dissemination mechanism.

8. Assure use of analysis and interpretation.

TABLE II.2. Criteria for identifying high-priority health events for surveillance

• Frequency:
    Incidence
    Prevalence
    Mortality
    Years of potential life lost

• Severity:
    Case-fatality ratio
    Hospitalization rate
    Disability

• Cost:
    Direct and indirect costs

• Preventability

• Communicability

• Public interest

TABLE IV.1. Essential questions for the practice of effective disease/injury reporting

Initiation/sources of reports

* How and by whom are health-care practitioners (existing and newly practicing)
entered into the reporting network?

* By what agency are conditions reported for such temporary residents as college
students, military personnel, and migrant workers?

Routing/timing of reports

* How should "suspected case, laboratory results pending" be handled?

* Should the local or the state health department update a case report when
additional information is received?

* Should case reports arise from the health jurisdiction in which the patient
resides? In which the patient became infected (injured)? In which the patient
became ill (and/or received treatment)?

* Should a diagnostic laboratory send data on reportable conditions to the requester,
or should it be responsible for reporting to appropriate local/state health
departments? (If "yes" to the latter, in what order?)

* If a case occurs one calendar year, but is not reported until early in the next
calendar year, what is the year of report? What is the cut-off date for reports
from the previous year? How are reports treated that are for the previous year but
are received after the established deadline?

* Is there a mechanism for reporting disease/injury across state lines, as
appropriate?

Policy issues in reporting disease/injury

* What items on the reporting form must be completed before a report can be
forwarded?

* If a reportable condition has a specific case definition (such as measles and
AIDS), should the case be reported before confirmation by a disease investigator?
(3)

* What mechanism will be (has been) established to deal with situations in which
cases must be reported in batches rather than individually because the number of
reports is overwhelmingly large?

* If case reports are held pending laboratory confirmation, should the "date of
report" reflect the original date of report or the date laboratory confirmation was
received or some other date associated with this health event?

* Are reports generated to identify records with incomplete/unconfirmed data so that
follow-up can be initiated?

* How does one avoid duplicate reports of the same case?

* How are discrepancies in the information on duplicate reports resolved?

TABLE IV.2. Concerns of the data-base manager

1. Who will enter the data? What credentials must this person have? Who is this
person's back-up? Who will update records? Back-up the computer file?

2. Will data be entered on an as-received basis or according to an established
schedule?

3. Does the data-entry screen replicate the paper form from which data are to be
entered?

4. Does the data-entry program allow for certain data items to be entered
automatically on subsequent screens until the data recorder makes a change? (For
example, the county initially entered will appear on each subsequent screen until
the recorder types in a different county. This allows the recorder to batch
records for more efficient entry.)

5. Does the data-entry program effectively validate the data being entered for
completeness by use of "must-enter" fields and "look-up" files?

6. Does the data-entry program have the ability to do range checking on values
entered? If so, does the system allow for acceptable ranges to change, reflecting
values entered in the data base over time? Is there a logic audit procedure in
the system to locate such errors as misspelled names or addresses, incorrectly
coded race, gender, or code for disease/injury?

7. At what level (state or local) will records be changed or deleted? Who owns the
data records?

8. If the data base is distributed to other users as an electronic file or on floppy
diskette, are there safeguards to prevent overwriting another user's data?
Safeguards against computer viruses?

9. Are the data-entry programs flexible enough to allow variables to be modified as
prescribed by changes in state regulations and national recommendations?

10. Are production reports automatically generated for quality assurance of data entry?

11. How and with what frequency are data copied and stored for back-up purposes? Are
paper/film copies maintained (in the event of computer failure)?

12. Are double-entry systems used for quality assurance?

TABLE V.1. Rates and quantities involving rates commonly used in epidemiology

Measure | Numerator | Denominator | Expressed per number at risk

Measures of morbidity:

Incidence rate | Number of new cases of specified condition/given time | Population at start of time interval | variable: 10^x where x = 2,3,4,5,6
Attack rate | Number of new cases of specified condition/epidemic period | Population at start of epidemic period | variable: 10^x where x = 2,3,4,5,6
Secondary attack rate | Number of new cases of specified condition among contacts of known patients | Size of contact population at risk | variable: 10^x where x = 2,3,4,5,6
Point prevalence | Number of current cases of specified condition at given time | Estimated population at same point in time | variable: 10^x where x = 2,3,4,5,6
Period prevalence | Number of old cases plus new cases of specified condition identified in given time interval | Estimated mid-interval population | variable: 10^x where x = 2,3,4,5,6

Measures of mortality:

Crude death rate | Total number of deaths reported in given time interval | Estimated mid-interval population | 1,000 or 100,000
Cause-specific death rate | Number of deaths from specific cause in given time interval | Estimated mid-interval population | 100,000
Proportionate mortality | Number of deaths from specific cause in given time interval | Total number of deaths from all causes in same interval | 100 or 1,000
Death-to-case ratio (case-fatality rate, case-fatality ratio) | Number of deaths from specific condition in given time interval | Number of new cases of that condition in same time interval | 100
Neonatal mortality rate | Number of deaths (<28 days of age) in given time interval | Number of live births in same time interval | 1,000
Infant mortality rate | Number of deaths (<1 year of age) in given time interval | Number of live births reported in same time interval | 1,000
Maternal mortality rate | Number of deaths from pregnancy-related causes in given time interval | Number of live births in same time interval | 100,000

Measures of natality:

Crude birth rate | Number of live births reported in given time interval | Estimated total mid-interval population | 1,000
Crude fertility rate | Number of live births reported in given time interval | Estimated number of women ages 15-44 years at mid-interval | 1,000
Crude rate of natural increase | Number of live births minus number of deaths in given time interval | Estimated total mid-interval population | 1,000
Low birth weight ratio | Number of live births (<2,500 grams) in given time interval | Number of live births reported in same time interval | 100

TABLE V.2. Crude death rates--Dade and Pinellas counties, Florida, 1980

County | Population | Deaths | Crude death rate (per 1,000 population)
Dade County | 1,706,097 | 16,859 | 9.9
Pinellas County | 732,685 | 11,531 | 15.7

Sources: Bureau of the Census, 1983.
National Center for Health Statistics, Centers for Disease Control.
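
The crude rates in Table V.2 follow directly from the definition in Table V.1 (deaths divided by population, expressed per 1,000). A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def crude_rate(events: int, population: int, per: int = 1000) -> float:
    """Events divided by population, expressed per `per` persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * per


# Population and deaths as given in Table V.2
print(round(crude_rate(16859, 1706097), 1))  # Dade County: 9.9
print(round(crude_rate(11531, 732685), 1))   # Pinellas County: 15.7
```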

TABLE V.3. Age-specific death rates--Dade and Pinellas counties, Florida, 1980

Age group (years) | Dade County: Population | Deaths | Rate (per 1,000 pop.) | Pinellas County: Population | Deaths | Rate (per 1,000 pop.)
0-4 | 97,870 | 383 | 3.9 | 31,005 | 101 | 3.3
5-14 | 221,452 | | 0.3 | 77,991 | | 0.3
15-24 | 284,956 | 440 | 1.5 | 95,456 | | 0.8
25-34 | 265,885 | 529 | 2.0 | 90,435 | 129 | 1.4
35-44 | 207,564 | 538 | 2.6 | 65,519 | 168 | 2.6
45-54 | 193,505 | 1,107 | 5.7 | 69,572 | 460 | 6.6
55-64 | 175,579 | 2,164 | 12.3 | 98,132 | 1,198 | 12.2
65-74 | 152,172 | 3,789 | 24.9 | 114,686 | 2,746 | 23.9
≥75 | 107,114 | 7,834* | 73.1 | 89,889 | 6,629* | 73.7
Total | 1,706,097 | 16,859 | 9.9 | 732,685 | 11,531 | 15.7

Sources: Bureau of the Census, 1983.
National Center for Health Statistics, Centers for Disease Control.

*Deaths ≥75 include six persons of unknown age for Dade and one of unknown age for Pinellas counties.

TABLE V.4. Directly standardized death rates -- Dade and Pinellas counties, Florida, 1980*

             (A)                  (B)                        (C)
             1980 U.S.            Age-specific death rates   U.S. population using
Age group    population           (per 1,000 pop.)           county age-specific rates†
(years)      (percentage
             distribution)        Dade      Pinellas         Dade          Pinellas
0-4               7.2              3.9        3.3            [illegible]   [illegible]
5-14             15.3              0.3        0.3            [illegible]   [illegible]
15-24            18.7              1.5        0.8            [illegible]   [illegible]
25-34            16.5              2.0        1.4            [illegible]   [illegible]
35-44            11.4              2.6        2.6            [illegible]   [illegible]
45-54            10.0              5.7        6.6            [illegible]   [illegible]
55-64             9.6             12.3       12.2            118           117
65-74             6.9             24.9       23.9            172           165
≥75               4.4             73.1       73.7            322           324
Totals          100.0              9.9       15.7            793           769

Directly adjusted death rates (per 1,000 pop.)§              7.9           7.7

*United States population, 1980, used as standard.
†C_ij = A_i × B_ij, where i = 1,...,9 age groups and j = 1,2 counties.
§ΣC_ij/100.
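The arithmetic behind Table V.4 can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original text: the function and variable names are hypothetical, and the Pinellas rates for ages 5-14 and 35-44 are taken from Table V.3 because they did not survive in this copy of Table V.4.

```python
# Column A: 1980 U.S. population, percentage distribution by age group.
US_1980_PCT = [7.2, 15.3, 18.7, 16.5, 11.4, 10.0, 9.6, 6.9, 4.4]

# Column B: county age-specific death rates per 1,000 population.
DADE_RATES = [3.9, 0.3, 1.5, 2.0, 2.6, 5.7, 12.3, 24.9, 73.1]
PINELLAS_RATES = [3.3, 0.3, 0.8, 1.4, 2.6, 6.6, 12.2, 23.9, 73.7]


def directly_adjusted_rate(standard_pct, county_rates):
    """Return sum(A_i * B_ij) / 100, the directly adjusted rate per 1,000."""
    products = [a * b for a, b in zip(standard_pct, county_rates)]  # column C
    return sum(products) / 100.0


dade_adjusted = directly_adjusted_rate(US_1980_PCT, DADE_RATES)
pinellas_adjusted = directly_adjusted_rate(US_1980_PCT, PINELLAS_RATES)
print(round(dade_adjusted, 1), round(pinellas_adjusted, 1))
```

Because both counties are weighted by the same standard age distribution, the large gap between the crude rates (9.9 versus 15.7) nearly disappears after adjustment.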

TABLE V.5. Indirectly standardized death rates -- Dade and Pinellas counties, Florida, 1980*

             (A)                 (B)                        (C)
             Death rates         1980 population            Expected number of deaths in
Age group    (per 1,000 pop.)                               county based on U.S.
(years)      U.S. 1980                                      age-specific rates†
                                 Dade         Pinellas      Dade          Pinellas
0-4               3.3              97,870       31,005         323           102
5-14              0.3             221,452       77,991      [illegible]   [illegible]
15-24             1.2             284,956       95,456         342           115
25-34             1.3             265,885       90,435         346           118
35-44             2.3             207,564       65,519         477           151
45-54             5.9             193,505       69,572       1,142           410
55-64            13.4             175,579       98,132       2,353         1,315
65-74            29.8             152,172      114,686       4,535         3,418
≥75              87.2§            107,114       89,889       9,340         7,838
Totals            8.8           1,706,097      732,685      18,924        13,490

                                                            Dade          Pinellas
Expected death rates (per 1,000 pop.)¶                      11.1          18.4
Adjusting factors**                                          0.79          0.48
Crude death rates (per 1,000 pop.)                           9.9          15.7
Indirectly adjusted death rates (per 1,000 pop.)††           7.8           7.5

*United States age-specific death rates, 1980, used as standard.
†C_ij = A_i × B_ij, where i = 1,...,9 age groups and j = 1,2 counties.
§Deaths ≥75 include [5]68 of unknown age for United States.
¶ΣC_ij/ΣB_ij for j = 1,2.
**U.S. total death rate/expected death rate.
††Crude death rate × adjusting factor.
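The steps in Table V.5 (expected deaths, expected rate, adjusting factor, adjusted rate) can be sketched as follows. The function and variable names are hypothetical. Note that the table multiplies by the adjusting factor after rounding it to two decimals, so an unrounded calculation can differ from the printed value in the last digit.

```python
# Column A: U.S. 1980 age-specific death rates per 1,000 population.
US_RATES = [3.3, 0.3, 1.2, 1.3, 2.3, 5.9, 13.4, 29.8, 87.2]
US_CRUDE_RATE = 8.8  # U.S. total death rate per 1,000

# Column B: 1980 county populations by age group.
DADE_POP = [97870, 221452, 284956, 265885, 207564,
            193505, 175579, 152172, 107114]
PINELLAS_POP = [31005, 77991, 95456, 90435, 65519,
                69572, 98132, 114686, 89889]


def indirect_adjustment(standard_rates, populations, crude_rate, standard_crude):
    """Return (expected rate, adjusting factor, indirectly adjusted rate)."""
    # Column C: deaths expected if the county had the standard's rates.
    expected_deaths = sum(r * p / 1000.0
                          for r, p in zip(standard_rates, populations))
    expected_rate = 1000.0 * expected_deaths / sum(populations)
    adjusting_factor = standard_crude / expected_rate
    return expected_rate, adjusting_factor, crude_rate * adjusting_factor


dade = indirect_adjustment(US_RATES, DADE_POP, 9.9, US_CRUDE_RATE)
pinellas = indirect_adjustment(US_RATES, PINELLAS_POP, 15.7, US_CRUDE_RATE)
print([round(x, 2) for x in dade])
print([round(x, 2) for x in pinellas])
```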

TABLE V.6. Five-number summary of 39 4-week totals of reported cases of meningococcal infections -- United States, 1987-1989

Median      190
Hinges      151    237
Extremes    102    350
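The five-number summary above can be reproduced from the 39 4-week totals listed in Figure V.3. The sketch below assumes the usual depth-based definition of hinges from exploratory data analysis (hinge depth = (floor of median depth + 1)/2); the function names are hypothetical.

```python
import math

# 39 4-week totals of reported meningococcal infections, 1987-1989 (Figure V.3).
CASES = [
    226, 307, 350, 236, 222, 258, 197, 167, 138, 108, 191, 190, 201,  # 1987
    216, 238, 331, 270, 265, 156, 164, 142, 112, 111, 153, 138, 159,  # 1988
    145, 306, 314, 264, 222, 195, 155, 149, 102, 117, 174, 158, 159,  # 1989
]


def value_at_depth(ordered, depth):
    """Value at a 1-based depth from the low end; x.5 depths average two neighbours."""
    lower = ordered[math.floor(depth) - 1]
    upper = ordered[math.ceil(depth) - 1]
    return (lower + upper) / 2


def five_number_summary(values):
    ordered = sorted(values)
    n = len(ordered)
    median_depth = (n + 1) / 2
    hinge_depth = (math.floor(median_depth) + 1) / 2
    return {
        "median": value_at_depth(ordered, median_depth),
        "hinges": (value_at_depth(ordered, hinge_depth),
                   value_at_depth(ordered, n + 1 - hinge_depth)),
        "extremes": (ordered[0], ordered[-1]),
    }


summary = five_number_summary(CASES)
print(summary)
```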

TABLE V.7. Common power transformations (y -> y^p)

   p      Transformation    Name                 Notes
                            (Higher powers)
   2      y^2               Square
   1      y                 Raw                  No transformation
  1/2     √y                Square root          Appropriate for count data
   0      log(y)            Logarithm            Generally logarithm to base 10, widely used
 -1/2     -1/√y             Reciprocal root
  -1      -1/y              Reciprocal           Minus sign preserves order
  -2      -1/y^2            Reciprocal square
                            (Lower powers)
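A minimal sketch of the ladder of transformations in Table V.7, assuming positive data values and base-10 logarithms at p = 0. The function name is hypothetical. The sign flip for negative powers is what the table's note means by "minus sign preserves order".

```python
import math


def power_transform(y, p):
    """Apply the power transformation y -> y^p from the ladder of powers.

    p = 0 is taken as log10(y); for negative p the result is negated so that
    larger y still gives a larger transformed value.
    """
    if y <= 0:
        raise ValueError("this sketch assumes strictly positive values")
    if p == 0:
        return math.log10(y)
    value = y ** p
    return value if p > 0 else -value


counts = [4, 25, 100]
print([power_transform(y, 0.5) for y in counts])   # square root
print([power_transform(y, 0) for y in counts])     # logarithm
print([power_transform(y, -1) for y in counts])    # negative reciprocal
```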

TABLE V.8. Guide for selecting data graphics

Type of graph or chart              When to use

Arithmetic-scale line graph         Trends in numbers or rates over time

Semilogarithmic-scale line graph    1. Emphasize rate of change over time
                                    2. Display values ranging >2 orders of magnitude

Histogram                           1. Frequency distribution of continuous variable
                                    2. Number of cases during epidemic (i.e., epidemic curve) or over time

Frequency polygon                   Frequency distribution of continuous variable, especially to show components

Cumulative frequency                Cumulative frequency

Scatter diagram                     Plot association between two variables

Simple bar chart                    Compare size or frequency of different categories of single variable

Grouped bar chart                   Compare size or frequency of different categories of 2-4 series of data

Stacked bar chart                   Compare totals and illustrate component parts of the total among different groups

Deviation bar chart                 Illustrate differences, both positive and negative, from baseline

Pie chart                           Show components of a whole

Spot map                            Show location of cases or events

Choropleth map                      Display events or rates geographically

Box plot                            Visualize statistical characteristics (e.g., median, range, skewness) of variable

TABLE V.9. Primary and secondary morbidity from syphilis, by age category -- United States, 1989

Age group              Cases
(years)        Number      Percentage*
≤14               230          0.5
15-19           4,378         10.0
20-24          10,405         23.6
25-29           9,610         21.8
30-34           8,648         19.6
35-44           6,901         15.7
45-54           2,631          6.0
≥55             1,278          2.9
Total          44,081        100.0

*Percentages do not add to 100.0 due to rounding.


TABLE VII.1. Controlling and directing information dissemination

Steps                                 Questions to be Answered

Establish communications message      What should be said?

Define audience                       To whom should it be said?

Select the channel                    Through what communication medium?

Market the message                    How should the message be stated?

Evaluate the impact                   What effect did the message create?

TABLE VIII.1. Sample case definition developed by the Centers for Disease Control and
the U.S. Council of State and Territorial Epidemiologists

Measles

Clinical case definition

An illness characterized by all of the following clinical features:

• A generalized rash lasting ≥3 days

• A temperature ≥38.3 C (101 F)

• Cough or coryza or conjunctivitis

Laboratory criteria for diagnosis

• Isolation of measles virus from a clinical specimen

• Significant rise in measles antibody level by any standard serologic assay

• Positive serologic test for IgM antibody (to measles)

Case classification

Suspected: any rash illness with fever.

Probable: meets the clinical case definition, has no or noncontributory serologic
or virologic testing, and is not epidemiologically linked to a probable or
confirmed case.

Confirmed: a case that is laboratory confirmed or that meets the clinical case
definition and is epidemiologically linked to a confirmed or probable case. A
laboratory-confirmed case does not need to meet the clinical case definition.

Comment

Two probable cases that are epidemiologically linked would be considered confirmed,
even in the absence of laboratory confirmation.

TABLE VIII.2. The detection of health conditions with a surveillance system*

                               "Condition" present
                               Yes                   No

Detected by      Yes           True positive         False positive        A+B
surveillance                   A                     B

                 No            False negative        True negative         C+D
                               C                     D

                               A+C                   B+D                   TOTAL

*Sensitivity = A/(A+C).
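The footnote's sensitivity formula can be written directly from the cell labels of Table VIII.2. The sketch below also computes predictive value positive using the standard definition A/(A+B), which is not printed in the table footnote. The function names and the counts in the example are hypothetical.

```python
def sensitivity(a, c):
    """Proportion of true cases detected by surveillance: A / (A + C)."""
    return a / (a + c)


def predictive_value_positive(a, b):
    """Proportion of detected 'cases' that are true cases: A / (A + B).

    Standard definition; assumed here, not taken from the table footnote.
    """
    return a / (a + b)


# Hypothetical counts for the four cells of the table.
A, B, C, D = 80, 20, 20, 880
print(sensitivity(A, C), predictive_value_positive(A, B))
```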

TABLE VIII.3. Comparison of estimated costs for active and passive surveillance systems
in a health department, Vermont, June 1, 1980, to May 31, 1981

                              Type of surveillance system
                              Active*         Passive†
Paper                         $   114         $    80
Mailing                           185         [illegible]
Telephone                       1,947             175
Personnel
    Secretary                   3,000           2,000
    Public health nurses       14,025               0
TOTAL                         $19,271         $ 2,203

*Active = Weekly calls from health department to request reports.
†Passive = Provider-initiated reporting.

TABLE VIII.4. Outline of sample surveillance evaluation report

1. Public Health Importance

Describe the public health importance of the health event. The three most
important categories to consider are the following:

• Total number of cases, incidence, and prevalence.

• Indices of severity such as the mortality rate and the case-fatality ratio.

• Preventability.

2. Objectives and Usefulness

Explicitly state the objectives of the system and the health event(s) being
monitored (case definitions). Describe the actions that have been taken as a
result of the data from the surveillance system. Describe who has used the data
to make decisions and take actions. List other anticipated uses of the data.

3. System Operation

Describe the following: the population under surveillance, the period of time of
the data collection, the information that is collected, who provides the
information, how the information is transferred and how often, how the data are
analyzed (by whom and how often), how often reports are disseminated, and how
reports are distributed (to whom and in what media). Include an assessment of
the simplicity, flexibility, and acceptability of the system.

4. Quantitative Attributes

Include assessments of the sensitivity, predictive value positive,
representativeness, and timeliness of the system.

5. Cost of Operating the Surveillance System

Estimate direct costs and, if possible, assess cost-benefit issues.

6. Conclusions and Recommendations

These should state whether the system is meeting its objectives and should
address the issue of whether to continue and/or modify the surveillance system.


TABLE IX. 2. An ethical checklist for public health surveillance

1. Justify the surveillance system in terms of maximizing potential public health
benefits and minimizing public harm.

2. Justify use of identifiers and the maintenance of records with identifiers.

3. Have surveillance protocols and analytic research reviewed by colleagues, and
share data and findings with colleagues and the public health community at large.

4. Elicit informed consent from potential surveillance subjects.

5. Assure the protection of confidentiality of subjects.

6. Inform health-care providers of conditions germane to their patients.

7. Inform the public, the public health community, and clinicians of findings of
surveillance.

TABLE XII. 1. Reasons cited by physicians for failure to report notifiable diseases
(42,45-47)

1. Assumed that the case would be reported by someone else.

2. Unaware that disease reporting was required.

3. Do not have notifiable disease reporting form/telephone number.

4. Do not know how to report notifiable diseases.

5. Do not have copy of list of notifiable diseases.

6. Concerned about confidentiality.

7. Concerned about violation of doctor-patient relationship.

8. Reporting is too time-consuming.

9. Absence of incentives to report.

TABLE XII.2. What local and state health departments can do to improve reporting by
physicians

Local health departments

• Express an interest in disease reporting to those responsible for reporting.

• Maximize contact with the local medical community.

    Presentations
    Mailings
    Newsletters
    Telephone contact
    Mass media

• Use the data.

State health departments

• Express an interest in disease reporting to those responsible for reporting.

• Maintain a reasonable list of reportable conditions.

• Maximize contact with the state medical community.

    Presentations
    Mailings
    Newsletters
    Telephone contact
    Mass media

• Use the data.

TABLE XII. 3. Criteria used to set priorities for national disease surveillance,
Canada (60)

1. Surveillance by the World Health Organization

2. Importance to agriculture in Canada

3. Disease incidence

4. Morbidity (hospital days and short-term disability)

5. Mortality

6. Case-fatality ratio

7. Communicability

8. Potential for outbreaks

9. Socioeconomic impact

10. Public perception of risk

11. Vaccine preventability

12. Necessity for an immediate public health response

TABLE XII.4. Confidence intervals for rates (61)

Let r = rate per 1,000
    n = denominator upon which rate is based

The limits of the 95-percent confidence interval are:

    upper limit: r + 61.981 √(r/n)
    lower limit: r - 61.981 √(r/n)
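As an illustration, the limits can be computed as below. The constant 61.981 equals 1.96 × √1,000, which is what the formula gives for a rate per 1,000 when the number of events is treated as a Poisson count (that reading of the constant is an interpretation, not stated in the table). The function name is hypothetical; the example uses the Dade County crude death rate from Table V.2.

```python
import math


def rate_confidence_interval(rate_per_1000, denominator):
    """95% confidence limits for a rate per 1,000: r -/+ 61.981 * sqrt(r / n)."""
    half_width = 61.981 * math.sqrt(rate_per_1000 / denominator)
    return rate_per_1000 - half_width, rate_per_1000 + half_width


# Dade County, 1980: crude death rate 9.9 per 1,000, population 1,706,097.
lower, upper = rate_confidence_interval(9.9, 1706097)
print(round(lower, 2), round(upper, 2))
```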

TABLE XII.5. Formula for calculating the 95% confidence interval for the ratio of two
independent rates (61)

Let r1 = rate for period 1 (or area 1)
    d1 = number of events for period 1 (or area 1)
    r2 = rate for period 2 (or area 2)
    d2 = number of events for period 2 (or area 2)
    R  = r1/r2

The limits of the 95% confidence interval are:

    upper limit: R + 1.96R √(1/d1 + 1/d2)
    lower limit: R - 1.96R √(1/d1 + 1/d2)
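A short sketch of the ratio formula, with a hypothetical function name. The example compares the crude death rates of Pinellas and Dade counties from Table V.2, using the numbers of deaths as the event counts.

```python
import math


def rate_ratio_confidence_interval(r1, d1, r2, d2):
    """Return (R, lower, upper) where R = r1/r2 and the limits are
    R -/+ 1.96 * R * sqrt(1/d1 + 1/d2)."""
    ratio = r1 / r2
    half_width = 1.96 * ratio * math.sqrt(1.0 / d1 + 1.0 / d2)
    return ratio, ratio - half_width, ratio + half_width


# Pinellas (15.7 per 1,000; 11,531 deaths) versus Dade (9.9 per 1,000; 16,859 deaths).
ratio, ratio_lower, ratio_upper = rate_ratio_confidence_interval(15.7, 11531, 9.9, 16859)
print(round(ratio, 2), round(ratio_lower, 2), round(ratio_upper, 2))
```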

TABLE XII.6. Formula for calculating the 95% confidence interval for the difference
between two independent rates

Let r1 = rate for period 1 (or area 1)
    n1 = denominator upon which r1 is based
    r2 = rate for period 2 (or area 2)
    n2 = denominator upon which r2 is based
    D  = r1 - r2

The limits of the 95% confidence interval are:

    upper limit: D + 61.981 √(r1/n1 + r2/n2)
    lower limit: D - 61.981 √(r1/n1 + r2/n2)
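The difference formula can be sketched the same way, assuming both rates are expressed per 1,000 as in Table XII.4. The function name is hypothetical; the example again uses the Pinellas and Dade crude death rates and populations from Table V.2.

```python
import math


def rate_difference_confidence_interval(r1, n1, r2, n2):
    """Return (D, lower, upper) where D = r1 - r2 (rates per 1,000) and the
    limits are D -/+ 61.981 * sqrt(r1/n1 + r2/n2)."""
    difference = r1 - r2
    half_width = 61.981 * math.sqrt(r1 / n1 + r2 / n2)
    return difference, difference - half_width, difference + half_width


# Pinellas (15.7 per 1,000; pop. 732,685) minus Dade (9.9 per 1,000; pop. 1,706,097).
diff, diff_lower, diff_upper = rate_difference_confidence_interval(
    15.7, 732685, 9.9, 1706097)
print(round(diff, 1), round(diff_lower, 2), round(diff_upper, 2))
```

An interval that excludes zero, as here, indicates that the two crude rates differ by more than chance variation alone would explain; it says nothing about whether the difference is due to age structure, which is what standardization addresses.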

Table XIII. 1. Examples of data sources for surveillance in
developing countries

I. Case reports

a. from health stations or hospitals

b. from sentinel sites

II. Births and deaths

a. from hospitals

b. from sentinel sites

c. complete ascertainment

III. Laboratory reports (usually from hospitals)

IV. Sample surveys (particularly cluster surveys)

Table XIII.2. Health problems ranked according to preventability and
treatability, Thailand, 1987

[The columns for total score (4-16)* and for preventability and treatability rating (H-M-L)** are not legible in this copy.]

         Ranked by preventability       Ranked by treatability
Rank     Disease                        Disease
1.       Tetanus                        Malaria
2.       Poliomyelitis                  Pneumonia
3.       Measles                        Dengue (hemorrhagic)
4.       Diphtheria                     Acute diarrhea
5.       Rabies                         Tuberculosis
6.       Rubella                        Venereal disease
7.       Traffic injury                 Dysentery
8.       Stroke                         Conjunctivitis
9.       Malaria                        Influenza
10.      Peptic ulcer                   Measles

Source: "Review of the Health Situation in Thailand: Priority Ranking of Diseases."

* Rated on a scale of 4 (low) to 16 (high)
** H=high, M=medium, L=low

Table XIII.3. Examples of objectives linked to surveillance components that
will measure objectives

Surveillance-linked objectives

Priority area #1 -- Diarrhea

• Objective: Health status -- Reduce diarrhea mortality by 25% by 1995
  Surveillance component that measures objective: Vital-event registration in five sentinel areas

• Objective: Risk factor -- Increase female literacy of 10- to 14-year-olds to 80% by 1995
  Surveillance component that measures objective: Regularly conducted survey

• Objective: Health activity -- Increase to 90% the proportion of 0- to 4-year-olds given appropriate home fluids by 1995
  Surveillance component that measures objective: Regularly conducted health survey; Local -- exit interviews

Priority area #2 -- Measles

• Objective: Health status -- Reduce measles mortality by 25% by 1995
  Surveillance component that measures objective: Vital-event registration in five sentinel areas

• Objective: Health status -- Reduce number of reported measles cases by 50% by 1995 compared with 1990
  Surveillance component that measures objective: National disease-reporting system

• Objective: Health activity -- Increase percentage of 12- to 23-month-olds with one dose of measles vaccine to 90% nationwide
  Surveillance component that measures objective: Regularly conducted health survey

• Objective: Health activity -- Increase to 80% the percentage of districts with one-dose measles vaccination coverage of 12- to 23-month-olds of 90%
  Surveillance component that measures objective: Exit interviews of mothers of 50 12- to 23-month-olds at all health facilities in district twice a year

Priority area #5 -- HIV/AIDS

• Objective: Health status -- Stabilize at 10% the proportion of 20- to 25-year-old women who have babies at the capital city hospital and who are HIV-positive by 1993
  Surveillance component that measures objective: Sentinel HIV testing of 20- to 25-year-old women who have babies in capital city

• Objective: Health status -- No increase in the 2% HIV seroprevalence of rural women who have babies that are HIV-positive by 1993
  Surveillance component that measures objective: Sentinel HIV testing of women having babies in capital city

• Objective: Risk factor -- Reduction of HIV-risk taking behavior by 50% in 1994 in areas with HIV seroprevalence of STD patients >10% (an indicator of entrance of HIV into community)
  Surveillance components that measure objective: Reporting of clinical chancroid through the national disease-reporting system; Laboratory -- Syphilis serology testing of 20- to 25-year-old women having babies in affected areas; Exit interviews in affected areas

• Objective: Health activity -- Increase to 75% the percentage of sexual contacts whose partners are not spouses who also use condoms by 1995 in areas with HIV seroprevalence of STD patients >10%
  Surveillance components that measure objective: Nationwide only -- Regularly conducted health survey; Exit interviews in affected areas; Nationwide only -- Regularly conducted health survey


FIGURE 1.1. Reported cases of congenital syphilis among infants <1 year of age and rates
of primary and secondary (P&S) syphilis among women — United States, 1970-1991

[Line graph. X-axis: Year, 1970-1991.]

Note: The surveillance case definition for congenital syphilis changed in 1989.
Source: Centers for Disease Control.

FIGURE 1.2. Salmonella rates in New Hampshire and contiguous states, by county

[County map. Legend, cases per 100,000: .01-.80*; .81-1.60; 1.61-3.20; >3.20. Unshaded counties = no cases reported.]

FIGURE 1.3. Homicide rate, by age and gender of victim, United States, 1986

[Graph. X-axis: Age.]

FIGURE 1.4. Malaria rates, by year— United States, 1930-1988

[Semilogarithmic-scale line graph. Y-axis: logarithmic scale, 0.01 to 1,000. X-axis: Year, 1930-1988. Annotations on the curve: Relapses of imported malaria; Relapses from Korean veterans; Returning Vietnam veterans; Immigrants.]

FIGURE 1.5. Reported cases of measles, by age group, United States, 1980-1982*

[Bar chart. X-axis: Age (0-4, 5-9, 10-14, 15-19, ≥20 years). Series: 1980, 1981, 1982.]

* Rates estimated by extrapolating age, from reported case-patients with known age.

FIGURE 1.6. Semi-logarithmic-scale line graph of reported cases of paralytic
poliomyelitis— United States, 1951-1989

[Semilogarithmic-scale line graph. Y-axis: logarithmic scale, 0.001 to 100. X-axis: Year, 1950-1990. Annotations: Inactivated vaccine; Oral vaccine.]

FIGURE 1.7. Percentage of reported cases of gonorrhea caused
by antibiotic-resistant strains— United States, 1980-1990

Year

FIGURE 1.8. Cesarean deliveries as a percentage of all deliveries in U.S. hospitals,
by year, 1970-1990

[Graph. Y-axis: percentage of deliveries. X-axis: Year, 1970-1990.]

FIGURE V.1. Crude, gender-specific and gender-race-specific cases of primary and secondary syphilis -- United States, 1981-1990, comparison of differential trends

[Three line-graph panels. X-axis of each panel: Year, 1981-1990. Series in the gender-race-specific panel: Black male; Black female; White male; White female.]

FIGURE V.2. Dot plot of results of swine influenza virus (SIV)
hemagglutination-inhibition (HI) antibody testing among
exposed and unexposed swine exhibitors — Wisconsin, 1988

[Dot plot. Y-axis: SIV HI antibody titer. X-axis: Swine exhibitors (Unexposed, Exposed).]

FIGURE V.3. Ordered data series and stem-and-leaf display of 39 4-week totals of reported cases of meningococcal infections -- United States, 1987-1989

1987: 226, 307, 350, 236, 222, 258, 197, 167, 138, 108, 191, 190, 201

1988: 216, 238, 331, 270, 265, 156, 164, 142, 112, 111, 153, 138, 159

1989: 145, 306, 314, 264, 222, 195, 155, 149, 102, 117, 174, 158, 159

Stem    Leaf
34      0
32      1
30      674
28
26      450
24      8
22      22668
20      16
18      0157
16      474
14      259356899
12      88
10      28127

In this example the first two digits of each datum serve as the stem and the third digit serves as a
leaf, e.g., for the numbers 264 and 265, the stem and leaves appear as 26 (stem) and 45 (leaves).
Since further division of the stems would result in an attenuated distributional shape, each stem
represents a range of 20 numbers, e.g., the stem 26 represents any number from 260 to 279 so that
for the number 270, the stem and leaf appear as 26 (stem) and 0 (leaf).
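The stem-and-leaf rule described above (stems covering a range of 20, the last digit as the leaf) can be sketched as follows. The function name and the choice to return a dictionary are illustrative assumptions.

```python
# 39 4-week totals of reported meningococcal infections, 1987-1989.
CASES = [
    226, 307, 350, 236, 222, 258, 197, 167, 138, 108, 191, 190, 201,  # 1987
    216, 238, 331, 270, 265, 156, 164, 142, 112, 111, 153, 138, 159,  # 1988
    145, 306, 314, 264, 222, 195, 155, 149, 102, 117, 174, 158, 159,  # 1989
]


def stem_and_leaf(values, width=20):
    """Map each stem (first two digits of the range start) to its string of leaves.

    Stems are listed from highest to lowest, and empty stems are kept so the
    shape of the distribution is not distorted.
    """
    lowest = min(values) // width * width
    highest = max(values) // width * width
    rows = {start // 10: "" for start in range(highest, lowest - 1, -width)}
    for value in sorted(values):
        stem = value // width * width // 10
        rows[stem] += str(value % 10)  # the last digit is the leaf
    return rows


display = stem_and_leaf(CASES)
for stem, leaves in display.items():
    print(f"{stem:>3}  {leaves}")
```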

FIGURE V.4. Scatter plot of 39 4-week totals of reported cases of
meningococcal infections — United States, 1987-1989

[Scatter plot. Y-axis: Cases, 50 to 400. X-axis: 1987-1989 (4-week periods 1-39).]

FIGURE V.5. Box plot of 39 4-week totals of reported cases of meningococcal infections -- United States, 1987-1989

[Box plot. Vertical scale marked at 110, 230, and 350.]

FIGURE V.6. Histogram (epidemic curve) of reported cases of
paralytic poliomyelitis — Oman, January 1988-March 1989

[Histogram. Each square = 1 case. X-axis: Date of onset. Annotation: Oral poliovirus vaccination campaigns.]

FIGURE V.7. Sample cumulative attack rate, by grade in
school and time of onset — North Carolina, 1985

[Cumulative line graph. Series: Fourth grade; Fifth grade; Sixth grade; Seventh grade; Eighth grade. X-axis: Time period (6 to 11 a.m.; 11 a.m. to 1 p.m.; 1 to 3 p.m.; 3 to 5 p.m.; 5 to 9 p.m.).]

FIGURE V.8. Survival curves over time, based on serum
testosterone level, Eastern Cooperative Oncology Group

[Survival curves. Y-axis: 0.0 to 1.1. X-axis: Weeks from randomization. Prognostic group: Best; Worst; Other. Group sizes legible in this copy: N=50; N=168.]

FIGURE V.9. Frequency polygon of reported
cases of encephalitis — United States, 1965

[Frequency polygon. Y-axis: Cases. X-axis: Month (J, F, M, A, M, J, J, A, S, O, N, D). Series: Post-infectious encephalitis; Etiology unknown; Arthropod-borne encephalitis.]

FIGURE V.10. Grouped bar chart of case-fatality rates from ectopic pregnancy, by age group and race -- United States, 1970-1987

[Grouped bar chart. X-axis: Age group (years): 15-19, 20-24, 25-29, 30-34, 35-39, 40-44. Series: White; Black and other.]

FIGURE V.11. Stacked bar chart of underlying causes of infant mortality, by racial/ethnic group and age at death -- United States, 1983

[Stacked bar chart. X-axis: Race/ethnicity (Black; American Indian; Hispanic; Asian; White; Total). Components: Birth defects; Low birth weight/prematurity/respiratory distress syndrome; Sudden infant death syndrome; Others.]

FIGURE V.12. Deviation bar chart of notifiable disease reports, comparison of 4-week totals ending May 23, 1992, with historical data -- United States

[Deviation bar chart. Columns: Disease; Decrease/Increase bar; Cases current 4 weeks. X-axis: Ratio (log scale)*. Hatched bars: beyond historical limits.]

Diseases shown: Aseptic meningitis; Encephalitis (primary); Hepatitis A; Hepatitis B; Hepatitis, non-A, non-B; Hepatitis (unspecified); Legionellosis; Malaria; Measles (total); Meningococcal infections; Mumps; Pertussis; Rabies (animal); Rubella.

Cases current 4 weeks (values legible in this copy; row assignment not recoverable): 357; 1,287; 925; 433; 260; 206; 196; 138; 651.

* Ratio of current 4-week total to the mean of 15 4-week totals (from previous, comparable, and subsequent
4-week periods for the past 5 years). The point where the hatched area begins is based on the mean and
two standard deviations of these 4-week totals.
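The comparison described in the footnote to Figure V.12 can be sketched as follows. This is an assumption-laden illustration: the footnote says only that the limits are "based on the mean and two standard deviations", so expressing them on the ratio scale as 1 ± 2 × (standard deviation/mean), and using the sample standard deviation, are choices made here. The function name and all counts are hypothetical.

```python
from statistics import mean, stdev


def compare_with_history(current_total, historical_totals):
    """Return (ratio, (lower_limit, upper_limit), beyond_limits).

    ratio is the current 4-week total divided by the mean of the historical
    4-week totals; the limits are 1 -/+ 2 * sd / mean on the same scale.
    """
    baseline = mean(historical_totals)
    spread = stdev(historical_totals)  # sample standard deviation (assumed)
    ratio = current_total / baseline
    limits = (1 - 2 * spread / baseline, 1 + 2 * spread / baseline)
    beyond = not (limits[0] <= ratio <= limits[1])
    return ratio, limits, beyond


# Hypothetical history: 15 4-week totals (three periods in each of 5 years).
history = [90, 110] * 7 + [100]
high = compare_with_history(130, history)
typical = compare_with_history(105, history)
print(high)
print(typical)
```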

FIGURE V.13. Pie charts of poliomyelitis vaccination status of children ages 1-4 years
in cities with populations ≥250,000, by financial status -- United States, 1969

[Two pie charts: Poverty; Nonpoverty.]

Adequately vaccinated: 3+ doses inactivated poliovirus vaccine (IPV) and/or
3 doses oral poliovirus vaccine (OPV).

Inadequately vaccinated: Some poliovirus vaccine, but < 3 doses of IPV
and/or < 3 doses of OPV.

Not vaccinated: No vaccine given.

FIGURE V.14. Spot map of deaths from smallpox— California, 1915-1924


FIGURE V.15. Choropleth map of confirmed and presumptive cases of
St. Louis encephalitis, by county -- Florida, 1990*

[County map. Legend: No cases; 1-5 cases; 6-10 cases; >10 cases. Counties labeled: Indian River; St. Lucie.]

* As of October 17, 1990.

FIGURE V.16. Density-equalizing map of California (based upon
population density), depicting deaths from smallpox, 1915-1924

FIGURE VI.1. Example: Data used for report published during week 20 (May 23, 1992)

[Grid of 4-week periods. Rows: years 1987 through 1992. Columns: weeks 12-15, 16-19, and 20-23. The 1992 cell for weeks 16-19 is the "current" 4 weeks (X0*); the cells for 1987-1991 are the 15 historical 4-week totals (X1 through X15).]

* For example, X0 is total of cases reported for weeks 16-19, 1992.

FIGURE VIII.1. National Notifiable Diseases Surveillance System

[Flow chart. Boxes, in order:]

Case-report source: State-authorized sources for case reporting, e.g., physicians, laboratories, infection-control practitioners, school nurses, et al., telephone case reports to local health department.

Follow-up information is collected, written case reports are completed, and reports are sent to state health agency.

Prescribed case data entered into computer.

File of line listing transmitted electronically to CDC.

Disease-control bulletin

Centers for Disease Control (CDC): Retrieved data file stored in mainframe computer files, from which output is generated.

Outputs: Weekly summary transmitted; County maps to states; MMWR Annual Summary; MMWR Tables I-III.

FIGURE VIII.2. Biases in surveillance

Case ascertainment bias

Population under surveillance
    Case-patients
        Reported (true positive)
        Not reported (false negative)
    Non-cases
        Reported (false positive)
        Not reported (true negative)

Information bias (data about the case)

    Present (correct)
    Present (incorrect)
    Absent

[The three information-bias outcomes appear twice in the figure, under two of the branches above.]

FIGURE XII.1. Cartoon depicting mumps as a public health problem, Tennessee

[Cartoon caption:] ...AND ONCE AGAIN TENNESSEE'S EXCELLENCE IN EDUCATION
EFFORT GRINDS TO A HALT THROUGH A BASIC NEGLECT
OF THE FUNDAMENTALS...

FIGURE XIII.A.1. Percentage of case-patients vaccinated (PCV) per percentage
of population vaccinated (PPV) for seven values of vaccine efficacy (VE)

          PPV - (PPV × VE)
PCV  =  --------------------
           1 - (PPV × VE)

[Line graph. Y-axis: PCV, 0 to 100. X-axis: PPV, 0 to 100. One curve for each of seven values of vaccine efficacy.]
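The relation plotted in Figure XIII.A.1 can be evaluated directly. In the sketch below, PPV and VE are entered as proportions between 0 and 1 (an assumption about units; the figure's axes are in percentages), and the function name is hypothetical.

```python
def proportion_cases_vaccinated(ppv, ve):
    """PCV = (PPV - PPV * VE) / (1 - PPV * VE), with all terms as proportions."""
    return (ppv - ppv * ve) / (1 - ppv * ve)


# With 90% of the population vaccinated and a 90%-effective vaccine, nearly
# half of all case-patients are still expected to have been vaccinated.
pcv = proportion_cases_vaccinated(0.90, 0.90)
print(round(100 * pcv, 1))
```

This is why a high percentage of vaccinated case-patients does not by itself indicate vaccine failure when coverage is high.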

Reproduced by NTIS

National Technical Information Service
U.S. Department of Commerce
Springfield, VA 22161


PARKLAWN HEALTH LIBRARY

WA 105 P9355 1992

Principles and practice of
public health surveillance


Parklawn Health Library
U.S. Public Health Service
Parklawn Bldg. - Rm.13-12
5600 Fishers Lane
Rockville, Maryland 20857
