BIODIVERSITY DATA MANAGEMENT (Document 1)

DATA FLOW MODEL

in the context of the Convention on Biological Diversity

WORLD CONSERVATION MONITORING CENTRE

The mission of the World Conservation Monitoring Centre is to provide information on the status, security and management of the Earth’s biological diversity.

United Nations Environment Programme

March 1995

ACKNOWLEDGEMENTS

This document is one of a series of four researched and compiled by the World Conservation Monitoring Centre, Cambridge UK with 80% funding from the Global Environment Facility (GEF) through the United Nations Environment Programme (UNEP), Project GF/0301-94-40 (GF/0301-94-06). The need for the development of a package of tools and materials to support national information management for the Convention was identified and the project promulgated by Mark Collins (Director, WCMC) and Robin Pellew (former Director of WCMC).

Principal authors were Ian Crain and Gwynneth Martin, incorporating a preliminary data flow model by Claire Appleby. Many WCMC staff and consultants have contributed and critically reviewed this complex document including Ian Barnes, Mark Collins, Helen Corrigan, Harriet Gillett, Don Gordon, Jeremy Harrison, Martin Jenkins, Gareth Lloyd, Richard Luxmore, Chris Magin, Jim Paine, and Jake Reynolds. The document has benefited, as well, from review and comment from NGOs, UNEP, and experts in a number of countries including those who participated in a consultation meeting hosted by UNEP in Nairobi in October, 1994. Graphical concepts were developed by Gwynneth Martin and Gareth Lloyd and executed by Ian Kime of "Constructive Solutions". Document organisation, integration and input was by Laura Battlebury. Ian Crain was the project manager and responsible for overall design and editing.

Digitized by the Internet Archive in 2010 with funding from UNEP-WCMC, Cambridge

http://www.archive.org/details/dataflowmodelinc95wcmc

TABLE OF CONTENTS

1 INTRODUCTION
  1.1 Background
  1.2 Information Requirements of the CBD
  1.3 Approach to the Data Flow Model
  1.4 Conceptual Overview of National Activities for the CBD
    1.4.1 Activities
    1.4.2 Participants
  1.5 Methodology and Symbolism
    1.5.1 Data Flow Diagrams
    1.5.2 Data Models

2 FIRST LEVEL PROCESSES
  2.1 Conduct Country Study
  2.2 Set Priorities and Prepare Action Plans
  2.3 Implement Action Plans
  2.4 Evaluate Results
  2.5 Report to CBD
  2.6 Biodiversity Databases

3 SECOND LEVEL PROCESSES
  3.1 Conduct Country Study
    3.1.1 Conduct Institutional Survey
    3.1.2 Identify Biological Diversity
    3.1.3 Identify Adverse Processes
    3.1.4 Determine Economic Implications
  3.2 Set Priorities and Prepare Action Plans
    3.2.1 Establish Strategic Objectives
    3.2.2 Select Indicators and Establish Targets
    3.2.3 Develop Action Plans
  3.3 Implement Action Plans
    3.3.1 Verify Information
    3.3.2 Collect Information and Fill Gaps
    3.3.3 Monitor Change
    3.3.4 Enact Legislation
    3.3.5 Perform Other Actions
  3.4 Evaluate Results
    3.4.1 Estimate Indicators
    3.4.2 Assess Current Status vs Targets
  3.5 Report to CBD

4 DATASTORES
  4.1 Overview of Datastores
  4.2 Indicator Values
  4.3 Institutions
  4.4 Catalog of Data Holdings
  4.5 Sectoral Information - Core Biodiversity Data
    4.5.1 Overview
    4.5.2 Habitats
    4.5.3 Protected Areas
    4.5.4 Species
    4.5.5 Threats
    4.5.6 Integrating Core Biodiversity Data

REFERENCES

LIST OF ANNEXES
  ANALYSIS OF THE INFORMATION NEEDS OF THE CBD
  LIST OF ACRONYMS & ABBREVIATIONS

1 INTRODUCTION

1.1 Background

The Convention on Biological Diversity (CBD) was signed at the United Nations Conference on Environment and Development in Rio de Janeiro in June 1992 by 154 nations and subsequently came into force in November 1993. Article 7 of the Convention is concerned with identification and monitoring activities to support Articles 8 to 10 (in-situ conservation, ex-situ conservation and sustainable use of components of biological diversity). Contracting parties are required to identify components of biological diversity important for its conservation and sustainable use (Article 7a); to identify activities likely to have adverse impacts (Article 7c); and to monitor the status of both components and threats (Articles 7b and 7c). Specifically, Article 7d identifies the requirement to:

"Maintain and organise, by any mechanism, data derived from identification and monitoring activities".

Having recognised this clearly identified need for management of data in support of national planning related to biodiversity, the United Nations Environment Programme (UNEP), in collaboration with the World Conservation Monitoring Centre (WCMC), designed and submitted to the Global Environment Facility (GEF) a project proposal entitled Biodiversity Data Management Capacitation in Developing Countries and Networking Biodiversity Information (BDM). This proposal was endorsed and subsequently a sub-project was established between UNEP and WCMC for Development of Supporting Materials for Biodiversity Data Management and Exchange.

The sub-project has produced an interlinked package of resource materials to assist in national capacity building. There are four principal components of this package:

Document 1. Data Flow Model (This Document)

Document 2. Guidelines for a National Institutional Survey - to provide guidance to countries in conducting a survey and assessment of the capacity of existing national institutions to support biodiversity information management.

Document 3. Guidelines for Information Management - to facilitate the development of capacity for information management and exchange as required by the CBD.

Document 4. Resource Inventory - the core output of the project; a collection of reference directories, guidelines, and standards relating to biodiversity information management.

The Data Flow Model is intended to identify in a formal structure the relationships between components of biodiversity data, from acquisition through to use in national strategy development, planning, and monitoring for implementation of the CBD.


1.2 Information Requirements of the CBD

The CBD has three main purposes:

• the conservation of biodiversity
• encouraging the sustainable use of biodiversity
• the equitable sharing of the benefits of the use of biodiversity.

The information required by a country to meet these objectives is wide ranging, going beyond the normal boundaries of "conservation" or "environmental" information. A number of Articles of the CBD require or imply the need for facilities for the management and exchange of biodiversity information. Articles 7d, 12c, 13b, 15(7) and 17 each identify information management and exchange requirements, and Article 16 explicitly indicates that "access to and transfer of technology among Contracting Parties are essential elements for the attainment of the objectives of this Convention". A clause-by-clause analysis of the implied information requirements is given in Annex 1, and summarised below.

The CBD identifies three main categories of biodiversity information:

• ecosystems and habitats
• species and communities
• described genomes and genes of social, scientific or economic importance.

To this basic list one must add:

• the scientific and technical information required to measure, assess and take decisions on appropriate action
• bio-technology, its value and risks
• local knowledge of traditional uses and values of biological resources
• interrelationships between biodiversity, human actions, laws and conditions, economics and development.

The Report of the Open-ended Intergovernmental Meeting of Scientific Experts on Biological Diversity - Second Session (UNEP/CBD/IC/2/11) provides in its annexes further indication of the scope of the information and technology of biodiversity. For instance, Annex II lists six major categories of technology "relevant to the identification, characterisation and monitoring of ecosystems, species and genetic resources":

• classification technologies for terrestrial, marine and other aquatic ecosystems
• ecosystem evaluation technologies
• biogeographical mapping technologies
• isolation, characterisation and classification technologies (for terrestrial, marine and other aquatic organisms, for plants, animals, microbes and genes, and for indigenous and non-indigenous organisms)
• technologies to determine species and genetic resource status
• key enabling technologies (including information technology, advanced biochemical and molecular technologies, risk assessment, etc).

These main headings were further subdivided into a large number of classes from "biogeography", "ecosystem function" and "traditional knowledge" through to "abundance, distribution and range" (of species) and "biotechnology".

The following paragraphs, adapted from the United Nations Environment Programme (UNEP) Guidelines for Country Studies on Biological Diversity (1993), identify some of the types of information which might be important, particularly for the initial assessment and strategy development.

Biological

This is the primary focus of biodiversity conservation - the core data which includes the requirements for species, ecosystems and genetic resources, covering issues ranging from status and distribution of resources to functional relationships and the development of tools to support the science.

Physical

Information on physical factors such as climate, topography and hydrology allows biological data to be placed within a physical context, and also allows for the development of predictive models (as the distribution of many species and vegetation types can be predicted by a combination of physical characteristics). Physical factors can also have a significant effect on potential use of resources, and on management options.

Socio-economic

The use and abuse of biological resources is essentially a function of socio-economic factors. Important data might range from monitoring of forestry or fisheries practice, to the impact of farming methods, or the distribution of population centres and transport routes. Perhaps as significant is accessibility to natural resources, and the uses that local peoples make of these resources. The latter often form an essential, but perhaps invisible part of the local economy.

Costs and Benefits

In order for management of biodiversity to be efficient, it is necessary to know the true value of biodiversity and the costs and benefits of management options. This needs to cover questions such as the costs of managing protected area systems, the level of income derived from tourism, and the value of indirect benefits such as watershed protection. Methods for assessing some of these values are only just being developed, and further dissemination of information on the methodologies will be required.


Pressures and Threats

Identifying and monitoring both potential and actual threats to biological diversity is an important component of any information management programme aimed at improved management of biological resources. However, such a programme may need to look beyond immediate physical causes and effects, to the underlying impact of human activities (which links threats to socio-economic factors).

Sustainable Management

Conservation of biodiversity is about effective and sustainable resource management. To assess that management, information will not only be required on the biodiversity itself, its status and distribution, but also on current and past management activities, especially on the use of biological resources. For example, information is likely to be required on a range of factors concerning protected areas, and on effective management regimes and technologies in a range of protected and unprotected habitats.

Sources and Contacts

Information is also required on information models, standards, and technologies, and on appropriate agencies and experts who can be contacted. This may include bibliographic information on who has published what and where, basic information on names and addresses of appropriately qualified experts, sources of information on reliable and appropriate models, and metadatabases.

Interrelationships

The above paragraphs begin to illustrate the extent of the interrelationships between the information that might be required in order to study and manage biodiversity more effectively. It is essential that these interrelationships are kept clearly in mind when planning information management strategies. Comprehensive forecasting of the effects of these interrelationships is also necessary for efficient information sharing.

Another method of sub-dividing the information requirements of the CBD is the following eight-point classification, which reflects the way in which national and international agencies are organised to manage biodiversity information:

Conservation

Encompassing information on species, habitats, protected areas, biodiversity indicators, wildlife, etc.

Genetic Resources

Encompassing agriculture, agricultural research, gene banks, use of genetic resources for benefit of mankind, traditional use, genetic threats, etc.

Technology

Encompassing information on the technology of biodiversity monitoring and assessment, such as data collection technology, computer systems and telecommunications, remote sensing, geographic information systems, database techniques and standards.


Bio-technology

Encompassing a forum for interchange of information on research and application of bio-technology.

Environmental Statistics/Economics

Encompassing resource utilisation, value of biodiversity, land use, industrial outputs, equitable sharing of benefits, natural resource utilisation, trade, economics, etc.

Policy

Encompassing policy development, modelling, decision support systems and technology, empowerment and public consultation techniques, etc.

Human Factors

Encompassing population, human health, social conditions, indigenous knowledge, and their relationships to biodiversity.

Environmental Law

Encompassing environmental legislation, conventions, protocols, regulation, standards, etc.

It should also be noted that Article 18(3) requires the establishment of "a clearing house mechanism to promote and facilitate technical and scientific cooperation". This clearing house mechanism is for exchange between countries, but it is clear that a proper response to the CBD requires information exchange, integration and assimilation within each country, as well as building of capacity, to effectively utilise the clearing house mechanism. This document is intended to facilitate the development of such a national biodiversity information system.

The more detailed levels of the data flow model presented in this document give emphasis to the traditional areas of biodiversity information; that is conservation, genetic resources, and environmental statistics/economics. However, later editions of the model are planned to cover the other equally important areas.

1.3 Approach to the Data Flow Model

A distinction is often drawn between "data" and "information". Data normally refer to facts which result from measurements or observations (such as wildlife counts or the chemistry of a soil sample), whereas "information" is produced by analysing and interpreting data, usually with the intent to communicate ideas and facilitate decision making. The transformation of data into information may include processing the data using statistical techniques, analysis through models, selection and abstraction, and often expert human interpretation, for instance of the significance of patterns and trends.

An "information system" is essentially a structured set of processes (and associated people and equipment) for converting data into information, and for presenting it in forms which are useful for communication and decision making. Often the modern information system will utilise computers in some of the processes, and for storage, but this is by no means necessary. The principles of information management remain the same whether or not computers are used: the need for data to flow from process to process, the need for well defined processes (of analysis and integration), the need to store and maintain data (and information), and the need to present or output the information in useful forms. Some or all of the processes may be manual, requiring specialised knowledge and interpretation. In considering a Data Flow Model for the CBD, no presumption is made on the extent and nature of the use of computer technology. The Data Flow Model is intended to provide an outline of the processes for information management under the Convention which is independent of the extent to which computer systems are employed or which particular hardware and software are adopted.

Strategy development and decision making in response to the CBD requires information which is integrated and summarised to a very high level, the result being many stages of processing removed from the original raw observations of the field scientist. A national information system supporting the CBD will be characterised by a series of summarising and integrating steps. At the lower levels of the process, the outputs will require analysis and interpretation by qualified specialists in sectoral institutions. However, as the information becomes more refined, policy analysts and strategists will be required.
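The series of summarising and integrating steps can be illustrated with a minimal Python sketch. The regions, sites, species and counts are hypothetical, and the choice of a mean per site followed by a sum per region is an assumption made only for illustration; the model itself does not prescribe any particular calculation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw field observations: (region, site, species, count).
observations = [
    ("North", "Site A", "grey heron", 12),
    ("North", "Site A", "grey heron", 8),
    ("North", "Site B", "grey heron", 5),
    ("South", "Site C", "grey heron", 20),
]

def summarise_by_site(rows):
    """First step: reduce repeated counts to a mean count per site."""
    counts = defaultdict(list)
    for region, site, _species, count in rows:
        counts[(region, site)].append(count)
    return {key: mean(values) for key, values in counts.items()}

def summarise_by_region(site_means):
    """Second step: integrate site means into a total per region."""
    totals = defaultdict(float)
    for (region, _site), value in site_means.items():
        totals[region] += value
    return dict(totals)

def national_total(region_totals):
    """Final step: a single national figure for decision makers."""
    return sum(region_totals.values())

site_means = summarise_by_site(observations)
region_totals = summarise_by_region(site_means)
national = national_total(region_totals)
```

Each function stands for one stage of processing; in practice the early stages would be carried out by sectoral specialists and the later ones by policy analysts, whether or not a computer is involved.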

As noted and implied in the CBD and related documents (eg Country Study Guidelines, and the reports of Expert Panels established to follow up the CBD), the range of potential information types is very wide, because many types of information beyond biodiversity data itself are necessary for the management and understanding of biodiversity. The information required includes numeric, categoric, spatial and textual data occurring in a variety of forms and using a mixture of different media. This illustrates the broad scope and complexity of the data to be collected, exchanged and analysed, and the potentially complex processes required to use it effectively.

It is also clear that the data derived from national monitoring activities depend upon the specific threats identified in the country concerned. This means that no universally applicable data requirements can be determined.

Consideration of these factors (breadth of information type, and country-to-country variability) resulted in the decision to produce what is termed a "generic" data flow model. This model has been developed through analysis of the processes that contracting parties are expected to undertake to implement the CBD, and of the broad categories of data required by these processes. An overview of the processes is given in Section 2 of this document, and a further level of detail is described in Section 3. The various types of data used in these processes are discussed in Section 4 and data models are also suggested, again starting with an overview and then expanding detail in selected areas. Applications of the models are also illustrated in this section. The intent of this document is to provide, at the national level, a sound basis for information management system designs that will:

• facilitate the presentation of biodiversity information to decision makers
• have an underlying common structure
• serve the goals of the CBD.


1.4 Conceptual Overview of National Activities for the CBD

1.4.1 Activities

The overall process for the implementation of the CBD within a country is depicted in Figure 1.1.

[Figure 1.1 boxes: Information from Country Study; Set Strategic Objectives; Define Action Plans & Targets; Implement Action Plans; Evaluate Results]

Figure 1.1: Overview of National Activities in Support of the CBD

Each component of the figure is described below.

Information from Country Study

The purpose of the Country Study (UNEP, 1993), recommended as the first stage in implementation of the CBD, is to provide information of various kinds to be used in the formulation of national strategies and plans for the conservation of biodiversity. The data will also provide a baseline for monitoring and assessment of the effect of measures taken (see CBD Articles 6,7).

Existing Biodiversity Information

Much information which can be used in the formulation of national strategies and action plans for a country is undoubtedly already in existence within the country, in neighbouring countries and/or with international agencies. The organisation and availability of the information will vary considerably from country to country and this will influence how easily and effectively it can be used (see CBD Articles 6,7 and Document 2).

Set Strategic Objectives

The setting of strategic objectives must start with the identification of the components of biodiversity of importance to conservation in the country, then proceed with identification of existing and potential threats, and evaluation of the economic implications of any conservation measures. The objectives will be established at several levels of detail and jurisdiction, and they will be integrated as far as possible into relevant policies and sectors (see CBD Article 6a).

Define Action Plans and Targets

Based on the strategic objectives, priority areas for action will be defined and specific targets set. The action plans should include an indication of how progress towards the targets should be measured (see Measure Effects below and CBD Article 6a).

Implement Action Plans

The defined action plans should be funded and implemented. This will involve multiple organisations reflecting the different levels of strategic objectives noted above (see CBD Articles 6-11).

Measure Effects

The success or failure of actions to meet defined targets should be measured. This may require action after a specific time period, a monitoring programme on a more continuous basis, or a combination of the two. It is important that the measurement process and targets be discussed fully in the action plan (see CBD Article 7b).

Evaluate Results

The measured effects should be compared with defined targets, allowing an evaluation of the actions to be made. As a result, it may be necessary to update the action plan, or go back a step further and re-examine the strategic objectives. For example, further data may need to be collected to supplement existing biodiversity information. There should be an iterative cycle of planning, implementation, measurement, and evaluation (see CBD Articles 7b, 7d).

Report to CBD

The exact reporting requirements of the CBD have not yet been defined, but are soon to be addressed by the Secretariat. Once the requirement is fully defined, its place in the overall process will be apparent, permitting additional detail to be added to Figure 1.1 (see CBD Article 26).

Some rectangles in Figure 1.1 represent information management "processes", and others represent data collections or "datastores"; it is not necessarily clear which is which. In the terminology of Section 1.3, there would seem to be six processes and two datastores, although data are clearly also associated with "Evaluate Results" and "Measure Effects". The two datastores may serve as information inputs, outputs, or both. The arrows joining the four boxes imply data flow in some cases and some sort of action or sequence of events in others. This type of diagram may give a good conceptual overview, but it is not consistent in meaning. To evolve a more useful data flow model, it is valuable to separate the "process" elements (what is done with the information) from the information itself. This provides consistency of meaning within the diagrams and independence from implementation methods (eg manual, computerised, or mixed). The method chosen and symbolism used are defined in Section 1.5 below.

1.4.2 Participants

Because of the broad scope of information required to develop effective strategies, a wide range of institutions and agencies are obliged to interact and participate in the activities depicted in Figure 1.1. The participating institutions will include those concerned with both research and policy in economic and social issues, as well as the environment and natural resources. National and sub-national agencies might include those responsible for statistics, health, education, economic development and planning, social development, science and technology, tourism, industry, land tenure and management, and law, as well as renewable and non-renewable resources, environment, museums, herbaria, national parks, heritage, and wildlife.

Participants will also include national and international NGOs, educational institutions, the corporate sector, multilateral and bi-lateral development agencies, and scientific and social councils. Ideally, these all work in partnership in an atmosphere of sharing (similar to the concept of the international Clearing House Mechanism) to provide the necessary flow of information required to fulfil the objectives of the CBD. A further parallel between the CBD Clearing House Mechanism (WCMC, 1994) and the national biodiversity information management process is the concept of a linked series of specialised institutions (which may themselves be networks or "clearing houses") connected via a hub as depicted in Figure 1.2.

1.5 Methodology and Symbolism

1.5.1 Data Flow Diagrams

Following the methodology developed by Yourdon (1979), data flow diagrams are used to illustrate the flow of data between processes. These are commonly used to analyse an operation or system (eg the operation of a business) into elemental processes which are clearly understood, and to define the data required in those processes. The conventions adopted in this document are illustrated in Figure 1.3.

Operations may be expressed as a number of "processes" shown in rounded rectangles, labelled with a single digit at the first level. Each process may be broken down into (sub) processes which in turn may be split further, and so on, an extra decimal label being added at each level. Data used in the overall operation may be in one of two types of "datastore": external, meaning generated outside the overall operation (depicted as a plain rectangle), or internal, implying that the data are generated by one of the defined processes (depicted by an open-ended rectangle).

The directional lines between processes and datastores indicate the direction of data flow. For clarity, it is conventional that each data flow diagram should contain a limited number of process boxes (4-6) and datastores.
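The conventions above (decimal process labels, external and internal datastores, directed data flows) can be captured in a small data structure. The following Python sketch is illustrative only: the class names, the process label "1" and the particular flows are assumptions, while the process and datastore names are taken from this document.

```python
from dataclasses import dataclass, field

@dataclass
class Process:
    label: str          # e.g. "1" at the first level, "1.1" at the second
    name: str
    children: list = field(default_factory=list)

    def add_subprocess(self, name):
        """Create a sub-process whose label adds one decimal level."""
        child = Process(f"{self.label}.{len(self.children) + 1}", name)
        self.children.append(child)
        return child

@dataclass(frozen=True)
class Datastore:
    name: str
    external: bool      # True: generated outside the overall operation

@dataclass(frozen=True)
class DataFlow:
    source: str         # label of a process or name of a datastore
    target: str

# Names come from this document; labels and wiring are assumed.
country_study = Process("1", "Conduct Country Study")
survey = country_study.add_subprocess("Conduct Institutional Survey")
existing = Datastore("Existing Information", external=True)
catalog = Datastore("Catalog of Data Holdings", external=False)
flows = [
    DataFlow(existing.name, survey.label),   # external data feed the survey
    DataFlow(survey.label, catalog.name),    # the survey fills the catalog
]
```

Keeping processes, datastores and flows as separate types mirrors the separation of "what is done" from "the information itself" that the diagrams are meant to enforce.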

Figure 1.2: Conceptual View of a Cooperative Clearing House Network

Note that datastore is a generic term, and refers to any logically related collection of data. The data or information held in a datastore may not be all physically in the same institution, and may be in hard copy and/or electronic forms.

In this document, the overall operation is the implementation of the CBD within a country. The content of the datastores depicted in the data flow diagrams in Sections 2 and 3 outlines the basic categories of information required. The detailed specification of the databases required to store and process these data must be determined by individual countries on the basis of their own particular needs and priorities, and the information management capabilities in place. However, given the underlying nature of the data, plus the kind of analyses and outputs frequently required, it is suggested that where computer systems are employed, the most effective solution is a relational database management system (RDBMS) linked to a Geographic Information System (GIS). The former allows extensive manipulation and reporting of non-spatial data, and the GIS extends these functions into the spatial domain, allowing data sources to be integrated and analysed to provide outputs in a variety of forms, including graphs, tables and maps.

In the text, processes are identified in italics, and datastores in bold italics.

[Figure 1.3 legend: Process; Datastore (internal), example: Catalog of Data Holdings; Datastore (external), example: Existing Information; Data Flow (arrow)]

Figure 1.3: Symbolism Used in Data Flow Diagrams (after Yourdon 1979)

1.5.2 Data Models

The content of the datastores is elaborated in Section 4. The discussion takes place within the context of RDBMS/GIS technologies, and methodologies associated with these are used in modelling the data. The basis is an "entity-relationship" approach (Chen, 1976), extended to encompass spatial elements.

[Figure 1.4 legend: Entity (non-spatial); Entity (spatial); Relationships: one-to-one, many-to-many]

Figure 1.4: Symbolism Used in Entity-Relationship Diagrams (after Ashworth & Goodland, 1990)

An "entity" is an item of interest whose attributes (properties) are being measured or recorded. For instance, an institution might be an entity with attributes of location, name, year established, mission, etc. The notation used is illustrated in Figure 1.4 and follows that of Ashworth and Goodland (1990). Thus rectangles represent non-spatial entities, lozenges represent spatial entities (points, vectors or polygons), and connecting lines show relationships between entities. The latter comprise three types as illustrated.

In the text, entities are identified in bold.
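As a hedged illustration of how entities and relationships map onto a relational database, the sketch below uses Python's standard sqlite3 module. The institution attributes are those named above; the site entity, the linking table, and all names and values are hypothetical, and a point stored as two columns merely stands in for the geometry a GIS would hold.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Non-spatial entity with the attributes named in the text.
    CREATE TABLE institution (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        location TEXT,
        year_established INTEGER,
        mission TEXT
    );
    -- Spatial entity reduced to a point (hypothetical example).
    CREATE TABLE site (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        longitude REAL,
        latitude REAL
    );
    -- Many-to-many relationship: institutions hold data on sites.
    CREATE TABLE institution_site (
        institution_id INTEGER REFERENCES institution(id),
        site_id INTEGER REFERENCES site(id),
        PRIMARY KEY (institution_id, site_id)
    );
""")
conn.executemany(
    "INSERT INTO institution VALUES (?, ?, ?, ?, ?)",
    [(1, "Example Herbarium", "Capital", 1950, "Plant records"),
     (2, "Example Parks Agency", "Capital", 1972, "Protected areas")],
)
conn.executemany(
    "INSERT INTO site VALUES (?, ?, ?, ?)",
    [(1, "Wetland X", 30.5, -1.2), (2, "Forest Y", 31.0, -0.8)],
)
conn.executemany(
    "INSERT INTO institution_site VALUES (?, ?)",
    [(1, 1), (2, 1), (2, 2)],
)

# Resolve the many-to-many relationship into readable pairs.
rows = conn.execute("""
    SELECT i.name, s.name
    FROM institution AS i
    JOIN institution_site AS link ON link.institution_id = i.id
    JOIN site AS s ON s.id = link.site_id
    ORDER BY i.name, s.name
""").fetchall()
```

A many-to-many relationship needs its own linking table, whereas a one-to-one relationship could be held as a column on either entity.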

It should be noted that data models are to some extent subjective. Thus two individuals may produce distinct and valid models of the same data, reflecting the different objectives of their applications. The models presented in this document are generic, since the intention is to provide a framework which can be modified to meet specific situations.


2 FIRST LEVEL PROCESSES

The first level data flow diagram is depicted in Figure 2.1. This level of analysis identifies 5 basic processes and one very general datastore. Each of these is elaborated below.

[Diagram: column headings Information (Datastores) and Processes; legible labels: Evaluate Results, CBD]

Figure 2.1: CBD Data Flow Diagram, Level 1

2.1 Conduct Country Study

This process is well defined in the Guidelines for Country Studies on Biological Diversity (UNEP, 1993), where it is indicated that the goal of the Country Study is to initiate a process of improved biodiversity planning that will stimulate the action necessary at the national level to implement the CBD. Specific objectives include the provision of an information base for biodiversity planning and management through gathering and assessment of data required for decision making. This includes information on population, economics, environment, and so on, as well as biological datasets per se. Thus the Biodiversity Databases shown in the figure depict layers of information from several sectors to be taken into account in biodiversity management.


2.2 Set Priorities and Prepare Action Plans

This is a combination of the setting of strategic objectives and defining action plans described in Section 1.3. As noted the objectives are to be integrated with policies in relevant sectors. For example, the general objective "increase the area of natural habitat under protection" might contain "increase the area of protected forests" and/or "increase the area of protected wetlands". The action plans specify targets which will be, as far as possible, quantifiable results over specific time periods, for example "increase protected forest areas by 200 sq.km over the next two years".
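A quantifiable target of this kind can be recorded as a baseline, a required change and a time period, so that progress can be checked later. The following is a minimal Python sketch only: the class `Target`, its field names and the baseline of 1,000 sq.km are assumptions made for illustration and do not come from the CBD or the Guidelines.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """A quantifiable target attached to a strategic objective (hypothetical structure)."""
    objective: str     # general objective the target supports
    indicator: str     # what is measured
    baseline: float    # value at the start of the period (illustrative figure)
    increase: float    # required change over the period
    start_year: int
    end_year: int

    def target_value(self) -> float:
        """Value the indicator should reach by the end of the period."""
        return self.baseline + self.increase

    def progress(self, current: float) -> float:
        """Fraction of the required increase achieved so far (may exceed 1.0)."""
        return (current - self.baseline) / self.increase


forest_target = Target(
    objective="increase the area of protected forests",
    indicator="protected forest area (sq.km)",
    baseline=1000.0,   # assumed baseline, for illustration only
    increase=200.0,    # "200 sq.km over the next two years"
    start_year=1995,
    end_year=1997,
)

print(forest_target.target_value())
print(forest_target.progress(1050.0))
```

Keeping the baseline and the required change as separate fields makes it possible to report both the absolute target and the proportion achieved.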

2.3 Implement Action Plans

This is as described in Section 1.3, and represents the totality of all the actions taken by the institutions involved in implementing the CBD in a particular country, including measurement and additional data collection.

2.4 Evaluate Results

This is as described in Section 1.3. Note that this process includes the measurement of effects which may lead to revision of data collection plans.

2.5 Report to CBD

This is as described in Section 1.3.

2.6 Biodiversity Databases

It is unlikely that all the information required for biodiversity planning will be integrated into a single database at one site. For example, a plant species database may be maintained by the national herbarium, whereas data on protected areas may be managed by the country’s national parks agency. In addition, it is clear that other information sources are required, such as baseline data for the country (eg infrastructure), physical environment data (eg soils, hydrology, geology and climate), socio-economic data (eg demographics, health, local use of resources, and land-ownership), all of which will be maintained by the agency with the relevant mandate (the custodian). The Biodiversity Databases shown in Figure 2.1 therefore represent the total collection of data required for all of the processes involved in the implementation of the CBD. All of the top level processes shown use the data; the processes of conducting the country study, setting priorities, implementing plans and evaluating results will add data to the overall store. In the next section, the processes are broken down into components and the very general datastore, Biodiversity Databases, is further sub-divided.


3 SECOND LEVEL PROCESSES

This section contains the level 2 data flow diagrams for each of the five processes outlined above (using the methodology and symbolism described in Section 1.5). In each case, the data flow diagram is followed by an expansion of the second level processes including a description of related datastores.

3.1 Conduct Country Study

The first level 2 data flow diagram is Conduct Country Study (see Figure 3.1). Each of its processes is described in subsequent paragraphs.

[Diagram: processes Conduct Institutional Survey, Identify Biological Diversity, Identify Adverse Processes, Determine Economic Implications; datastores 1 Institutions, 2 Preliminary Catalog of Data Holdings, 3 Catalog of Data Holdings, 4 Human Activity & Impacts, 5 Economic Values; external datastore Existing Sectoral Information]

Figure 3.1: Conduct Country Study (DFD - Level 2)

3.1.1 Conduct Institutional Survey

The Guidelines for Country Studies on Biological Diversity (UNEP, 1993) explicitly identify the need for an initial assessment of the country’s capacity for conservation and sustainable use of biodiversity. The Guidelines suggest that the information required for this includes institutional capacities, human resource capabilities, available technological facilities and information resources in place, and that an institutional survey would be undertaken to acquire such knowledge. These suggestions are expanded in considerable detail in Guidelines for a National Institutional Survey (Document 2), one of the other components of this project (see 1.1). The latter includes suggested methods of carrying out such a survey and details of the type of data to be collected.


The two outputs of the Conduct Institutional Survey process are related metadatabases. The first contains institutional information (see Section 4.3) such as staff skills and technological facilities (institutional capacity). The second catalogues the datasets (information resources) held by the institutions (see Section 4.4). This metadatabase is labelled "preliminary", since the institutional survey process involves only a cursory review of the datasets.

3.1.2 Identify Biological Diversity

This process ties directly to Article 7(a) of the CBD, ie the target is identification of components of biodiversity important for its conservation and sustainable use. The Convention does not include any definitive lists in this regard and items are to be defined as appropriate for the individual country. An indicative list in Annex I of the Convention (see Section 4.5) provides some insight into the nature of the required data. Clearly this process will involve examination of existing data, including those held by international agencies. The Preliminary Catalog of Data Holdings will provide pointers to relevant datasets in national institutions. As the content of the datasets is used in subsequent planning processes, catalog entries can be confirmed and detail added where necessary, producing the Catalog of Data Holdings. This datastore may be used to identify significant gaps in data resources.
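The progression from preliminary to confirmed catalog entries, and the use of the catalog to reveal gaps, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the entry fields (`theme`, `institution`, `status`, `detail`), the list of required themes and the helper names `confirm` and `find_gaps` are all hypothetical.

```python
# Hypothetical catalog entries: theme, holding institution, review status.
catalog = [
    {"theme": "protected areas", "institution": "National Parks Agency", "status": "preliminary"},
    {"theme": "plant species", "institution": "National Herbarium", "status": "preliminary"},
]

# Themes assumed to be needed for planning (illustrative list only).
required_themes = {"protected areas", "plant species", "habitats", "threats"}


def confirm(entry, detail):
    """Return a confirmed copy of a preliminary entry with added detail."""
    return {**entry, "status": "confirmed", "detail": detail}


def find_gaps(entries, required):
    """Themes that no catalogued dataset covers."""
    covered = {entry["theme"] for entry in entries}
    return sorted(required - covered)


# An entry is confirmed once its dataset has actually been examined and used.
catalog[0] = confirm(catalog[0], "boundaries checked against gazette notices")
gaps = find_gaps(catalog, required_themes)
print(gaps)
```

Because confirmation only adds fields to an existing entry, the preliminary and final catalogs can share one structure.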

3.1.3 Identify Adverse Processes

Again this process ties directly to an article of the Convention, namely Article 7(c). Direct threats to biodiversity (ie adverse processes) include deforestation, drainage of wetlands, emission of pollutants, urbanisation and the spread of invasive introduced species. Indirect threats are less well known, but should nevertheless be considered. However, as in Section 3.1.2, these may differ greatly from one country to another. The identification of adverse processes may involve integrating and interpreting data from a wide range of institutions. The data may already suggest specific threats, or may indicate reductions in biological resources for unconfirmed reasons.

3.1.4 Determine Economic Implications

The process of determining economic implications is documented in Section C of the Technical Annex to the Country Study Guidelines (UNEP, 1993). This process includes estimating the economic value of the benefits resulting from the sustainable use of biodiversity, and quantification of the costs of current and proposed conservation actions.

3.2 Set Priorities and Prepare Action Plans

The level 2 data flow diagram for this process is illustrated in Figure 3.2.

3.2.1 Establish Strategic Objectives

The process of establishing strategic objectives in support of the CBD involves consultation amongst key institutions to identify the principal objectives in the context of primary threats, economic values and the capacity of institutions to support actions for biodiversity. It should make use of the four datastores resulting from the Conduct Country Study process, and involve analysis and interpretation of a range of existing sectoral data sources. Priorities should also be identified in the process, in terms of human activities and impacts, and economic values. The output datastore will contain narrative descriptions of the strategic objectives. Attention is drawn to the paper National Biodiversity Strategies (UNEP/WRI, 1994).


[Diagram: datastores 1 Institutions, 3 Catalog of Data Holdings, 4 Human Activity & Impacts, 5 Economic Values, Existing Sectoral Information; processes Establish Strategic Objectives, Select Indicators & Establish Targets, Develop Action Plans]

Figure 3.2: Set Priorities and Prepare Action Plans (DFD - Level 2)

3.2.2 Select Indicators and Establish Targets

In order to measure the results of actions, appropriate indicators should be chosen along with target values and critical thresholds. Indicators should be quantitative where possible, such as "to have protected areas totalling 5% of each ecosystem in the country by the year 2010". Consideration should be given to the paper Biodiversity Indicators for Policy Makers (WRI/IUCN, 1993). The process makes use of defined Objectives, as well as other inputs such as the Catalog of Data Holdings.

The output datastore Targets includes the definitions of selected indicators, methodologies for estimating them, and specific target levels. It may comprise a mixture of textual and numeric data.

3.2.3 Develop Action Plans

The planning process must include the estimation of the costs of proposed actions, and define the institutions responsible for each task. This process therefore attempts to reconcile the Targets identified in Process 2.2 with available institutional capacity (see Institutions). The output Actions lists tasks, responsible institutions, required legislation and regulation, data collection and monitoring plans, and associated costs and timetables. Although the output is mainly textual in nature, it might benefit from organisation under an automated project management system.

3.3 Implement Action Plans

The level 2 data flow diagram for this process is illustrated in Figure 3.3.

[Diagram: datastores Sectoral Information, Catalog of Data Holdings, 10 Laws/Regulations, 8 Actions]

Figure 3.3: Implement Action Plans (DFD - Level 2)

3.3.1 Verify Information

The process to Implement Action Plans is assigned to a range of institutions, and commences with a process of review and verification of existing data sources. This contributes towards a distributed collection of selected and verified Sectoral Information. The selection should reflect the information needs of the CBD as defined by the selection of indicators and targets.

3.3.2 Collect Information and Fill Gaps

Depending on the extent of the information gaps identified in the Catalog of Data Holdings, this process may be a dominant or minor component of the overall implementation process. The additional data also contribute to building up the national collection of up-to-date sectoral information needed for estimating key indicators. The Sectoral Information datastores represent the main biodiversity information resource of the country. These datastores may be extensive and held under the custodianship of a number of separate institutions. They may occur in several forms including quantitative, textual, and spatial.

3.3.3 Monitor Change

The objective of implementing action plans is to achieve positive change. Thus new information should be collected regularly, as defined in Actions, to keep Sectoral Information datastores up to date. This may involve long-term site monitoring programmes to record plant and wildlife populations, or regular habitat monitoring schemes using aerial photography or remote sensing.

3.3.4 Enact Legislation

One of the primary tools for implementing action plans is the enactment or amendment of legislation, regulations and policies (eg to create protected areas, to encourage or restrict certain human activities, to limit industrial wastes). This results in a datastore of laws, regulations and policies which is largely textual in nature, except for quantitative tables or, for example, standards reflecting regulatory limits.

3.3.5 Perform Other Actions

Although legislation and regulation are important actions, a range of other activities are desirable including:

institutional strengthening

human resource development and training

monitoring and enforcement of regulation

biodiversity research

operational actions to reduce threats.

Many national institutions may be involved in this process, which may be broken down into a large number of sub-processes depending on specific national strategies and priorities. Such sub-division is beyond the scope of this document.

3.4 Evaluate Results

The level 2 data flow diagram for this process is illustrated in Figure 3.4.

3.4.1 Estimate Indicators

Indicators are estimated (or calculated where possible) on the basis of data held in the Sectoral Information datastore. This results in a further datastore of key numeric Indicator Values.

3.4.2 Assess Current Status vs Targets

In this process, the indicators and other results are compared to their original targets. This may involve simple numeric comparison, but more commonly, analytical assessments of progress towards targets. In addition, the effectiveness of legislation and regulation should also be assessed. The result of the analysis is a datastore of Assessments containing both quantitative and textual (explanatory) material.
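The two steps of this section, estimating an indicator from sectoral data and assessing it against a target, can be sketched as follows. This is a hedged Python illustration: the record layout, the area figures and the function names `estimate_indicator` and `assess` are assumptions; only the 5% figure echoes the example target quoted in Section 3.2.2.

```python
# Hypothetical sectoral records: ecosystem, total area and protected area in sq.km.
sectoral_information = [
    {"ecosystem": "lowland rain forest", "total_km2": 4000.0, "protected_km2": 300.0},
    {"ecosystem": "mangrove", "total_km2": 500.0, "protected_km2": 10.0},
]

TARGET_PERCENT = 5.0  # "protected areas totalling 5% of each ecosystem"


def estimate_indicator(record):
    """Percentage of an ecosystem's area that is protected."""
    return 100.0 * record["protected_km2"] / record["total_km2"]


def assess(record, target_percent):
    """Compare an indicator value with its target; return numbers plus a short narrative."""
    value = estimate_indicator(record)
    met = value >= target_percent
    if met:
        note = "target met"
    else:
        note = f"shortfall of {target_percent - value:.1f} percentage points"
    return {"ecosystem": record["ecosystem"], "value": value, "met": met, "note": note}


assessments = [assess(record, TARGET_PERCENT) for record in sectoral_information]
for item in assessments:
    print(item["ecosystem"], round(item["value"], 1), item["note"])
```

Each assessment record carries both a numeric value and an explanatory note, mirroring the mix of quantitative and textual material described for the Assessments datastore.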

[Diagram: datastores Sectoral Information, Indicator Values, 12 Assessments; process Assess Current Status vs. Targets]

Figure 3.4: Evaluate Results (DFD - Level 2)

3.5 Report to CBD

The reporting requirements of the CBD (see Article 26) have not yet been fully defined. For this reason a detailed breakdown of the Report to CBD process cannot be given. However, a suggested outline of the process is shown in Figure 3.5.

[Diagram: datastores Sectoral Information, Indicator Values, 12 Assessments, CBD Reporting Elements; process Extract Reporting Elements; other legible labels: CBD Report Requirements, Report to CBD]

Figure 3.5: Report to CBD (DFD - Level 2)


4 DATASTORES

4.1 Overview of Datastores

The analysis of the CBD process depicted in Sections 2 and 3 indicates the presence of fourteen "datastores". As previously indicated, these datastores are conceptual, representing logical groupings of information required by or produced by the processes; no assumption is made on how and where the data may be kept. Datastores do not equate to physical datasets, data holdings or institutions; the information required for a datastore may reside in a number of institutions and derive from a range of disciplines. Note especially that datastore 9, Sectoral Information, represents the major national repository of scientific data relevant to biodiversity, and for this reason is depicted as multiple datastores. This section examines datastores from the perspective of data structure and content. The fourteen datastores in total represent the "Biodiversity Databases" identified in the level 1 data flow diagram (Figure 2.1). Each is briefly outlined, following which selected datastores are expanded in more detail in Sections 4.2 to 4.5. This initial pass at a data flow model focuses on core scientific biodiversity data. Subsequent documents are planned to incorporate other relevant domains.

1 Institutions

This datastore keeps the information about the institutional strengths and biodiversity information analysis and management capacity of the country. The custodian of this information would normally be a lead agency in the implementation of the CBD, and the datastore would commonly be implemented as a metadatabase. Wide distribution or ease of access by all other agencies is important. The process of compiling this datastore is the subject of Document 2 of this series. An important function of this datastore is to connect to the reservoir of biodiversity technology and the associated enabling technology (survey and monitoring techniques such as remote sensing), and to sources of expertise for institutional strengthening. Datastores on technology are currently beyond the scope of this model, but could be linked in to this key datastore, the structure of which is elaborated in Section 4.3.

2 Preliminary Catalog of Data Holdings

3 Catalog of Data Holdings

These two datastores could be implemented as a single evolving metadatabase. This would identify who has what data of relevance to the CBD, a key element in finding the required information for analysis as well as identifying information gaps. The structures of these datastores are elaborated in Section 4.4.

4 Human Activity and Impacts

This datastore is multi-sectoral and covers the driving forces which influence, positively and negatively, the conservation and sustainable use of biodiversity. It should include databanks on industrial and agricultural activities and outputs, population and social factors, land tenure, landuse change, etc. This encompasses much of what is often referred to as "environmental statistics". The scope of this document does not allow for the elaboration of the structure of this large and complex datastore. However, suitable data structures for environmental statistics are well established, with standard frameworks available from organisations such as the UN Statistical Office and the OECD (see Resource Inventory, Document 4, Section 5.9).


5 Economic Values

An assessment of the economic value of biological resources is extremely important in fostering their sustainable use and ensuring equitable sharing of benefits. Some models for organising this information have been proposed, but none are universally accepted. Implementation of such a datastore is likely to depend on national accounting and statistical systems, and will vary between countries as a result. Some guidance on assessing economic values and structuring the information is provided in the UNEP Guidelines for Country Studies (UNEP, 1993).

6 Objectives

This datastore contains national biodiversity objectives at the strategic level. These would normally be framed in general terms, eg "to sustainably harvest forest products while maintaining biodiversity". This datastore would normally be implemented in narrative form (electronic or manual) with a simple structure, such as a sectoral sub-division on the lines of the governmental program delivery structure.

7 Targets

This datastore identifies measurable results sought in particular time frames. Although targets might be framed in narrative terms, ideally they would be connected to quantitative indicators. Implementation should therefore be integrated with the Indicator Values datastore, which is elaborated in Section 4.2.

8 Actions

This datastore comprises the information on proposed and ongoing actions designed to achieve specific targets. This includes information on field projects, biodiversity research activities, biotechnology acquisition plans and projects, and planned and implemented policies and programmes (including those aimed at equitable sharing of benefits, sustainable use and conservation). This is not a scientific datastore, but rather an information communication and referral tool. Implementation in the form of documented "Action Plans" and bibliographic referral is most appropriate. The structure might logically parallel that of datastores 6 and 7. Guidance for organising the information of datastores 6 through 8 can be found in National Biodiversity Strategies (UNEP/WRI, 1994).

9 Sectoral Databases

As outlined in Section 1.2, the range of information which is relevant to the CBD is vast. These sectoral databases refer to the observational scientific data, traditionally gathered, collected and managed in sectors such as marine science, soil science, agriculture, wildlife, botany, zoology, forestry, genetic resources, and so on. The way in which these sectors are divided varies from country to country, depending on the scientific and administrative structure. Two main classes occur within this group:

• Core Biodiversity Data: Those data which relate directly to plants, animals, their habitats and ecosystems, and related genetic resources.

• Natural Resource Base Data: Those data which define the resource base for biodiversity, including soil, geology, land capability and use, climate, physiography, hydrology, and aquatic resources.

Potential data structures for core biodiversity data are elaborated in Section 4.5. The management of natural resource base data is relatively mature (compared to biological/ecological information). Thus conventions, classification systems, and standard approaches have been defined in many areas such as soil science, geology, climate, and oceanography. It is beyond the scope of this document to provide further detail of these data management conventions. However, reference is made to the work of the International Council of Scientific Unions (ICSU) CODATA program, to the practices and standards of the various sectoral scientific unions (such as the International Society for Soil Science), and to Document 4, Sections 4 and 5.

10 Laws/Regulations

This datastore maintains information on the laws, regulations, policies, etc, which govern the use and conservation of biological diversity, including related areas which affect commercialisation, economics and benefit sharing. The structure of such a datastore is best implemented as a simple index or metadatabase providing assistance in locating relevant documents. This would be similar to any bibliographic or document management system, such as the one employed by the IUCN Environmental Law Centre.

11 Indicator Values

Indicators improve communication by quantifying and simplifying information. They can provide policy and decision makers with essential information on the status and trends in biodiversity conservation, and can help evaluate effectiveness of conservation efforts in relation to explicit management objectives or targets. They can be applied at scales ranging from community level (to guide resource managers) up to national and international levels, and can provide a framework for the collection and reporting of information at all these scales. There is a continuous need for comparison of indicators, and where possible a common approach to their selection, measurement and reporting should be introduced. This also applies to the setting of baselines and targets. The Indicator Values datastore is closely tied to that of Targets, and is elaborated in Section 4.2.

12 Assessments

Assessments are the results of analytic comparison of existing conditions (derived from monitoring) with identified targets. These normally take the form of narrative reports with quantitative tables of calculated indicator values and other summary data. This datastore could be implemented as a document management system similar to that advocated for datastores 5, 6, 7 and 11. Many countries may choose to integrate the Targets and Assessments datastores to implement the desired feedback loop of the CBD process. Quantitative tables included in the assessments could be integrated with national "State-of-the-Environment" reporting systems where these exist, using the data structures recommended by UNEP or the Organisation for Economic Cooperation and Development (OECD).


13 CBD Reporting Elements

14 CBD Report

The CBD reporting elements consist largely of information extracted from other datastores, especially from the Sectoral Information, Actions, and quantitative component of the Assessments datastore. As the reporting requirements for the CBD are not yet fully defined, it is not appropriate to explore a data structure for these datastores. However, the quantitative component of the Reporting Elements datastore is likely to be maintained as a set of relational database tables linked to narrative assessments in the form of text-based files (for guidance on this approach see Document 3, Section 3.6).
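One way to picture relational tables linked to narrative text files is sketched below using Python's built-in `sqlite3` module. The table names, column names, file path and figures are all hypothetical; since the CBD reporting requirements are not yet defined, this shows only the linking technique, not a proposed reporting structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assessment (
        assessment_id  INTEGER PRIMARY KEY,
        title          TEXT NOT NULL,
        narrative_file TEXT NOT NULL      -- path to a text-based file (hypothetical)
    );
    CREATE TABLE reporting_value (
        assessment_id INTEGER NOT NULL REFERENCES assessment(assessment_id),
        indicator     TEXT NOT NULL,
        year          INTEGER NOT NULL,
        value         REAL NOT NULL
    );
""")
conn.execute(
    "INSERT INTO assessment VALUES (1, 'Forest protection review', 'reports/forest_1995.txt')"
)
conn.executemany(
    "INSERT INTO reporting_value VALUES (?, ?, ?, ?)",
    [(1, "protected forest area (sq.km)", 1994, 1000.0),
     (1, "protected forest area (sq.km)", 1995, 1050.0)],
)

# Each quantitative row is returned together with the narrative file that explains it.
rows = conn.execute("""
    SELECT v.year, v.value, a.narrative_file
    FROM reporting_value AS v
    JOIN assessment AS a ON a.assessment_id = v.assessment_id
    ORDER BY v.year
""").fetchall()
for row in rows:
    print(row)
```

Storing only the file reference in the table keeps the narrative in its original text form while still allowing it to be retrieved alongside the numbers.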

4.2 Indicator Values

Indicators are determined by the specific issue under consideration, the target users of the indicator, its spatial and temporal scope, data availability and the framework available for analysis. They should be relevant to policy, should be well founded in technical and scientific terms, and should be measurable. They may have a range of components and may draw on data contained in several datastores such as economic values, human impacts and sectoral databases (eg protected areas, habitats, physical features). Indicators are "information" derived by analysing primary data, even though they may be presented in tabular, map, or graphical form. Their value derives from their ability to place data in the context of agreed baselines and targets.

Many groups have developed indicators for environmental, social and economic monitoring, and new indicators are continually being developed for application at sectoral, national and even global scales. The World Bank, Organisation for Economic Cooperation and Development (OECD), and the World Resources Institute (WRI) are but a few of these. WCMC is currently testing a number of indicators of forest condition in a series of case study sites in tropical regions.

[Diagram: entities Indicator Definition, Estimation/Calculation Method, Indicator Value]

Figure 4.1: Indicator Values Data Model


However, there continue to be problems in the development of consistent measurable biodiversity indicators. These mainly stem from a lack of appropriate primary data, but also from debate over definitions and from inadequate comparability of baselines and goals. As a result there is no universally accepted data structure typifying a datastore of Indicator Values.

Figure 4.1 suggests a simplified structure for such a datastore, with three entities: the indicator value (specific instances of the measurement or estimation of an indicator at a particular time and place); the indicator definition (which may be both descriptive and quantitative and carry attributes such as the desired level of the indicator, or its critical threshold); and estimation/calculation method (there may be more than one acceptable estimation method for a given indicator).
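The three entities described above can be expressed as simple record types. The Python sketch below is illustrative only: the class and field names, the indicator code `PA_PCT`, the method descriptions and all values are assumptions, chosen to show one definition with two acceptable estimation methods and several dated values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndicatorDefinition:
    code: str
    description: str
    desired_level: Optional[float] = None
    critical_threshold: Optional[float] = None


@dataclass(frozen=True)
class EstimationMethod:
    code: str
    indicator_code: str   # one definition may have several acceptable methods
    description: str


@dataclass(frozen=True)
class IndicatorValue:
    indicator_code: str
    method_code: str
    year: int             # time of the measurement or estimate
    place: str            # place to which the value applies
    value: float


definition = IndicatorDefinition(
    "PA_PCT", "Percentage of ecosystem area protected",
    desired_level=5.0, critical_threshold=1.0,
)
methods = [
    EstimationMethod("GIS", "PA_PCT", "Overlay of protected-area and ecosystem polygons"),
    EstimationMethod("TAB", "PA_PCT", "Ratio of gazetted area to tabulated ecosystem area"),
]
values = [
    IndicatorValue("PA_PCT", "GIS", 1994, "mangrove", 0.8),
    IndicatorValue("PA_PCT", "GIS", 1995, "mangrove", 2.0),
]

# Years in which the indicator fell below its critical threshold.
below = [
    v.year for v in values
    if definition.critical_threshold is not None and v.value < definition.critical_threshold
]
print(below)
```

Holding the desired level and critical threshold on the definition, rather than on each value, means a revised target applies to every recorded instance at once.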

4.3 Institutions

An Institution is defined as a recognisable organisation that maintains or uses information of relevance to the CBD. The resulting metadatabase contains information on the strengths, capacities and data holdings of each institution in the country and, if relevant, region. An example of an Institution metadata entry is given below:

Name: World Conservation Monitoring Centre
Acronym: WCMC
Type: Non-governmental
Theme: Information Services
Keywords: biodiversity; conservation; information
Postal_Address: 219 Huntingdon Road
Postal_Code: CB3 0DL
City: Cambridge
Country: United Kingdom
Contact_Person: Jo Taylor
Contact_Status: Information Officer
Telephone: 44-1223-277314
Fax: 44-1223-277136
Email: Internet: info@wcmc.org.uk
Update_Date: 1994-09-01
Mission: To provide research, information and technical services so that decisions affecting the conservation and sustainable use of biological resources may be based on the best available scientific information.

A few sample field definitions are given below to give a flavour of the institutional metadata (the full Metadata Data Dictionary is defined in Document 2, Annex 6).

Name:
    Definition  Official name of the institution.
    Format      Maximum 50 characters.
    Status      Mandatory.
    Example     World Conservation Monitoring Centre.

Type:
    Definition  The organisation type, selected from one of the following: governmental; non-governmental; commercial; academic; inter-governmental; United Nations.
    Format      Maximum 20 characters.
    Status      Mandatory.
    Example     Non-governmental.

Theme:
    Definition  The primary function of the organisation, selected from one of the following: research; consultancy; information services; campaigning. The selection of a primary function keyword is not intended to wholly define the scope of the organisation. Detailed description of the function of the organisation can be expanded on in the "Mission" section.
    Format      Maximum 30 characters.
    Status      Mandatory.
    Example     Information Services.
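Field definitions of this kind lend themselves to automatic checking of metadata entries. The Python sketch below encodes the three sample definitions above as rules and validates a record against them. The rule layout, the function name `validate` and the choice to compare controlled terms case-insensitively are assumptions, not part of the Metadata Data Dictionary.

```python
# Field rules transcribed from the three sample definitions above.
FIELD_RULES = {
    "Name": {"max_length": 50, "mandatory": True, "allowed": None},
    "Type": {"max_length": 20, "mandatory": True,
             "allowed": {"governmental", "non-governmental", "commercial",
                         "academic", "inter-governmental", "united nations"}},
    "Theme": {"max_length": 30, "mandatory": True,
              "allowed": {"research", "consultancy", "information services", "campaigning"}},
}


def validate(record):
    """Return a list of problems found in one institution metadata record."""
    errors = []
    for field_name, rule in FIELD_RULES.items():
        value = record.get(field_name, "").strip()
        if not value:
            if rule["mandatory"]:
                errors.append(f"{field_name}: mandatory field is missing")
            continue
        if len(value) > rule["max_length"]:
            errors.append(f"{field_name}: longer than {rule['max_length']} characters")
        # Assumption: controlled terms are matched without regard to case.
        if rule["allowed"] is not None and value.lower() not in rule["allowed"]:
            errors.append(f"{field_name}: '{value}' is not in the controlled list")
    return errors


good = {"Name": "World Conservation Monitoring Centre",
        "Type": "Non-governmental", "Theme": "Information Services"}
bad = {"Name": "X", "Type": "Club"}

print(validate(good))
print(validate(bad))
```

Returning a list of messages rather than stopping at the first problem lets a survey coordinator see every issue in an entry in one pass.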

A data model for this metadatabase is shown in Figure 4.2.

[Diagram: legible entity labels: Linked Institutions, Technical Resources, Human Resources]

Figure 4.2: Institutions Data Model

4.4 Catalog of Data Holdings

The Dataset is defined as a collection of data and accompanying documentation maintained at an Institution. A collection of data refers to one or a series of Data Members which relate to a specific theme or geographic region. Sample definitions of Dataset metadata items are:

Name:
    Definition  The name given to the dataset or activity being described. The title should be descriptive enough to allow the reader to make a reasonable decision as to whether the data may be of interest.
    Format      Maximum 50 characters.
    Status      Mandatory.
    Example     African Protected Areas GIS.

Theme:
    Definition  This is the theme or parameter being measured by the dataset. The keyword entered is the most general, and should, if possible, be taken from the standard terminology lists.
    Format      Use INFOTERRA terminology list. Maximum 31 characters.
    Status      Mandatory.
    Example     Terrestrial ecosystems.

A data model for this metadatabase is shown in Figure 4.3.

[Diagram: legible entity label: Dataset]

Figure 4.3: Catalog of Data Holdings Data Model
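The relationships stated in the text (a Dataset is maintained at an Institution and consists of one or more Data Members) can be sketched in Python as below. This is an assumption-based illustration, not a transcription of Figure 4.3: the class layout, the data member names and the lookup function `holders_of` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataMember:
    name: str


@dataclass
class Dataset:
    name: str          # mandatory, maximum 50 characters
    theme: str         # mandatory, taken from a standard terminology list
    institution: str   # name of the institution maintaining the dataset
    members: List[DataMember] = field(default_factory=list)


catalog = [
    Dataset(
        "African Protected Areas GIS",
        "Terrestrial ecosystems",
        "World Conservation Monitoring Centre",
        [DataMember("boundaries"), DataMember("site attributes")],  # invented member names
    ),
]


def holders_of(theme, datasets):
    """Institutions holding at least one dataset on the given theme."""
    return sorted({d.institution for d in datasets if d.theme.lower() == theme.lower()})


print(holders_of("terrestrial ecosystems", catalog))
```

A lookup by theme is the simplest form of the "who has what data" question that the catalog is intended to answer.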

4.5 Sectoral Information - Core Biodiversity Data

4.5.1 Overview

Annex I of the CBD contains the following indicative list of categories for identification and monitoring:

"1. Ecosystems and habitats: containing high diversity, large numbers of endemic or threatened species, or wilderness; required by migratory species; of social, economic, cultural or scientific importance, or, which are representative, unique or associated with key evolutionary or other biological processes.

2. Species and communities which are: threatened; wild relatives of domesticated or cultivated species; of medicinal, agricultural or other economic value; or social, scientific or cultural importance; or importance for research into the conservation and sustainable use of biological diversity, such as indicator species.

3. Described genomes and genes of social, scientific or economic importance."

Based on this list (particularly items 1 and 2) and given the ways in which biodiversity data are currently organised by the key agencies, the core data may be considered as relating to four primary entities: habitats, protected areas, species and threats. These four entities represent the first level entity-relationship (E-R) model for biodiversity data in the datastore Sectoral Information. The relationships between these core biodiversity entities are shown in Figure 4.4.

[Diagram: entities Habitats, Protected Areas, Species, Threats]

Figure 4.4: Core Biodiversity Data Entities

The figure shows many-to-many relationships between all entities, ie:

a species may be subject to many threats

a threat may apply to many species

a species may be in several protected areas

a protected area may harbour many species.
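In a relational setting, many-to-many relationships such as these are usually held in separate link tables, with one row per pairing. The following Python sketch mimics that arrangement with sets of pairs. The species, threat and protected area names are invented, and the helper `related` is hypothetical.

```python
# Link tables: each pair records one occurrence of the relationship.
species_threat = {
    ("species A", "deforestation"),
    ("species A", "invasive species"),
    ("species B", "deforestation"),
}
species_protected_area = {
    ("species A", "park 1"),
    ("species A", "park 2"),
    ("species B", "park 1"),
}


def related(links, key, position=0):
    """Values linked to `key`; position 0 matches the first column, 1 the second."""
    other = 1 - position
    return sorted(pair[other] for pair in links if pair[position] == key)


print(related(species_threat, "species A"))           # threats to one species
print(related(species_threat, "deforestation", 1))    # species subject to one threat
print(related(species_protected_area, "species A"))   # protected areas holding one species
print(related(species_protected_area, "park 1", 1))   # species in one protected area
```

The same link table answers the question in both directions, which is why neither entity needs to store a list of the other.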

The entities shown are at a high level (Level 1) and each can be further broken down (Level 2). These are elaborated in turn in E-R diagrams in subsequent sections. Appropriate models for genetic resources (item 3 above) have not been developed in this document and are planned for future research. Some work has been done on establishing effective ways of managing agricultural germplasm information at the International Plant Genetic Resources Institute (IPGRI) in Rome, and there are close links to the way in which species information in general is organised (see Section 4.5.3).

4.5.2 Habitats

Figure 4.5 shows a structure which could be used for handling data relating to habitats. There are three entities:

• area is a spatial entity defining the geographic location of the habitat

• habitats has the basic attributes of the polygon, eg identifier, type, etc.

• habitat type gives more detail of the meaning of "type", eg type, description, etc.

Figure 4.5: Habitats Data Model (entities shown: Area, Habitats, Habitat Type)

There is a one-to-one relationship between area and habitats, and a one-to-many relationship between habitat type and habitats.

This structure is simple and can be implemented using elementary geographic information systems and data management tools. These allow production of maps showing the current geographical distribution of the various habitat types, with tables giving the area and, for example, percentage coverage of the total area of the country by a specific habitat type. This type of output is useful in summarising data for decision makers for planning purposes.
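As an illustration of the tabular output described above, the following Python sketch totals habitat area by type and expresses it as a percentage of the country. It is a simplified example under stated assumptions: polygon areas are taken as already measured by the GIS, and all class names, field names and figures are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class HabitatType:
    code: str
    description: str

@dataclass
class Habitat:
    identifier: str
    type_code: str   # links to HabitatType.code (one type, many habitats)
    area_km2: float  # stands in for the one-to-one spatial "area" entity

def coverage_by_type(habitats, country_area_km2):
    """Return {type_code: (total_area_km2, percent_of_country)}."""
    totals = {}
    for h in habitats:
        totals[h.type_code] = totals.get(h.type_code, 0.0) + h.area_km2
    return {
        code: (area, 100.0 * area / country_area_km2)
        for code, area in totals.items()
    }

# Hypothetical sample data.
habitat_types = [
    HabitatType("LRF", "lowland rain forest"),
    HabitatType("MANG", "mangrove"),
]
labels = {t.code: t.description for t in habitat_types}
habitats = [
    Habitat("H1", "LRF", 300.0),
    Habitat("H2", "LRF", 200.0),
    Habitat("H3", "MANG", 50.0),
]
summary = coverage_by_type(habitats, country_area_km2=1000.0)
for code, (area, percent) in summary.items():
    print(f"{labels[code]}: {area} km2, {percent}% of country")
```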

The structure is easily extended to cope with habitat monitoring, by maintaining a sequence of date-stamped editions describing the situation with respect to habitat at different points in time. By comparing these, maps and tables showing change (either decrease or increase) in the various habitat classes can be produced.
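The comparison of date-stamped editions might be sketched as follows. This Python example is illustrative only: it assumes each edition has already been summarised as total area per habitat type, and the dates and figures are invented.

```python
def habitat_change(earlier, later):
    """Compare two editions given as {habitat_type: area_km2}.

    Returns {habitat_type: later_area - earlier_area}; a negative
    value indicates a decrease in that habitat class.
    """
    types = set(earlier) | set(later)
    return {t: later.get(t, 0.0) - earlier.get(t, 0.0) for t in sorted(types)}

# Hypothetical editions keyed by date stamp.
editions = {
    "1985-01-01": {"lowland rain forest": 500.0, "mangrove": 50.0},
    "1995-01-01": {"lowland rain forest": 420.0, "mangrove": 55.0},
}
dates = sorted(editions)  # ISO date strings sort chronologically
change = habitat_change(editions[dates[0]], editions[dates[-1]])
```

A habitat type absent from one edition is treated here as having zero area, which is one possible convention among several.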

The principal problem in dealing with habitat data is not one of complexity in data structure at this level, but the absence of an internationally accepted habitat or ecosystem classification at an appropriate scale for national biodiversity planning. Again, the varying requirements of different countries mean that a widely applicable classification is difficult to conceive. However, the data management structure suggested here is independent of this problem.


This basic structure is used in the Tropical Forests Database developed by WCMC. The source data have been derived from a variety of sources including satellite imagery, existing databases, maps, survey data, and so on, and harmonised into standardised broad forest categories. For example, the four major categories of forest are lowland rain forest, montane rain forest, inland swamp forest and mangrove. The forest polygons are held in an Arc/INFO coverage with linked database tables, echoing the structure illustrated in Figure 4.5.

4.5.3 Protected Areas

Figure 4.6 shows a structure which could be used for handling data relating to protected areas.

Figure 4.6: Protected Areas Data Model (entities shown: Protected Areas; Area; Biotic & Abiotic Components; Ownership; Management Objectives for P.A.; National Management Objectives & Legislation; Budgets / Staffing; Protection Measures / Effectiveness)


The entities and relationships are:

• protected area is the basic entity containing primary attributes such as name, year established, size, designation, description, etc.

• area is a spatial entity defining the geographic location of the protected area

• socio-economic values (such as tourism) may be associated with a protected area

• the land (of the protected area) may be owned (ownership entity) by one or more agencies (or individuals)

• management objectives will be set for the protected area (and these may relate in turn to national management objectives)

• the protected area will be assigned budget/staffing for its operation

• protection measures are established for a protected area and the effectiveness recorded.

The spatial element in this remains simple, but there are more entities than in the case of habitats and there is variation in the nature of the entities. For example, items in budget and ownership are fairly apparent, but the attributes to be included in economic values may be less so. The form of the entities could imply that more capability is required in the data management tools.

With a protected areas database of this form, maps and reports could be produced:

• showing the various areas under protection

• summarising the current management objectives

• giving budgetary roll-ups of expenditures on protected areas

• totalling costs and benefits of tourism (an economic value)

• highlighting interactions with surrounding land use.
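One of the reports listed above, the budgetary roll-up, can be sketched in Python as below. This is an illustrative simplification rather than a description of any existing system: only the protected area and budget entities are represented, and all names, field names and amounts are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class BudgetLine:
    year: int
    amount: float  # in a single, unspecified currency unit

@dataclass
class ProtectedArea:
    name: str
    designation: str
    year_established: int
    size_km2: float
    budgets: list = field(default_factory=list)  # one area, many budget lines

def budget_rollup(areas, year):
    """Total expenditure per designation for the given year."""
    totals = {}
    for pa in areas:
        spent = sum(b.amount for b in pa.budgets if b.year == year)
        totals[pa.designation] = totals.get(pa.designation, 0.0) + spent
    return totals

# Hypothetical sample data.
areas = [
    ProtectedArea("Park X", "national park", 1970, 1200.0,
                  [BudgetLine(1994, 100.0), BudgetLine(1995, 120.0)]),
    ProtectedArea("Park Z", "national park", 1982, 300.0,
                  [BudgetLine(1995, 30.0)]),
    ProtectedArea("Reserve Y", "nature reserve", 1990, 80.0,
                  [BudgetLine(1995, 10.0)]),
]
rollup_1995 = budget_rollup(areas, 1995)
```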

There is a well established database of Protected Areas in use at WCMC. Although it includes the elements mentioned above (among others), its purpose is to provide a source of information on the world's protected areas, rather than a mechanism to assist in managing protected areas, which is left to experts in the countries concerned. This inevitably leads to a difference in perspective. However, some outputs similar to those above can be produced from the WCMC system, which is implemented using the FoxPro relational database management system (RDBMS) and the Atlas-GIS mapping package (see Document 4, Section 3.2).

4.5.4 Species

Figure 4.7 shows a structure which could be used for handling data relating to plant and animal species.


Figure 4.7: Species Data Model (entities shown: Alternate Names; Economic Value; Taxonomic Hierarchy; Protection Measures / Effectiveness; Geographic Distribution; Legislation / Regulation)

The following points should be noted:

• the central entity species/taxa contains the basic attributes of the plant or animal, eg name, and this may be at the species or taxa level

• any species/taxa may have alternate names, both common and scientific

• more than one collection may relate to any species/taxa

• the taxonomic hierarchy, shown as one entity for the sake of simplifying the figure, is composed of multiple entities

• any species/taxa may have protection measures applied to it

• the protection measures in turn may relate to legislation/regulation

• all species/taxa are located geographically (geographic distribution)

• the geographic distribution may be located as one or more specific areas (or points)

• any species/taxa may have an economic value attached to it (which may be expressed as several entities).

Again, the spatial element remains simple and the complexity lies in the non-spatial attributes, especially as several of the entities are likely to expand into multiple entities. Correspondingly more complex data management tools may be required to implement such a database. In fact, the form of the model is likely to be influenced by the facilities available in the computer system used.

The type of analyses which could be generated include:

• lists of species, grouped for example by family, with maps showing their distribution

• lists of endemic species and associated distribution maps

• lists of economically significant species, identifying the type of value

• cross-referencing of legislation with the species covered and the nature of their protection

• summaries of total species, number endemic to the country, species populations.
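Two of the analyses listed above, grouping by family and summary counts, are sketched below in Python. This is a minimal example under stated assumptions: each species is reduced to a flat record, and the names, families and values are invented placeholders rather than real taxa.

```python
from collections import defaultdict

# Hypothetical records:
# (scientific name, family, endemic to the country?, type of economic value or None)
species = [
    ("Genus1 alpha", "Family1", True, "medicinal"),
    ("Genus1 beta", "Family1", False, None),
    ("Genus2 gamma", "Family2", True, None),
]

def group_by_family(records):
    """List species names under each family."""
    groups = defaultdict(list)
    for name, family, _endemic, _value in records:
        groups[family].append(name)
    return dict(groups)

def summarise(records):
    """Count total, endemic and economically significant species."""
    return {
        "total": len(records),
        "endemic": sum(1 for r in records if r[2]),
        "economically_significant": sum(1 for r in records if r[3] is not None),
    }

by_family = group_by_family(species)
totals = summarise(species)
```

In a fuller implementation the family would be resolved through the taxonomic hierarchy entities rather than stored on each record.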

Again, comparison of the results of similar analyses conducted over a period of time permits the monitoring of species distribution and numbers.

This is a very simplified view, and additional linkages to threats, protected areas, trade in species, etc will be needed. Figure 4.8 shows, for purposes of example, the main files in the database used to manage plant information at WCMC and other establishments. This structure echoes the data model given above, for example:

• the names table is the central species/taxa entity

• the genera, families, orders, subclasses tables give the taxonomic hierarchy

• the distributions table links to WCMC areas (Biodiversity Reporting Units) providing an index to geographic location

• the distributions table also contains a conservation status item which may link to laws (cf. protection measures and legislation).

Note that the E-R diagram in Figure 4.8 showing the general relationships between the principal data files in the database is diagrammatic, but not strictly correct.


Figure 4.8: Main Elements of WCMC Plant Database ("BG-BASE") (tables shown include: Data Source Locations, Data Sources, IUCN Status Categories)

Note also that the conceptual E-R diagram of Figure 4.7 contains no parallel to the "data sources" and "data source locations" tables. These are linked to many other tables and provide a mechanism for documenting the source of the various items of data. This is valuable information when dealing with biodiversity data of all kinds, and arises not so much from the structure of the information itself, but from the requirements for managing it.
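The idea of documenting sources can be illustrated by having every data record carry a reference to a source record. The Python sketch below is an assumption about how such a link might look in general; it does not reproduce the actual BG-BASE tables, and all identifiers, citations and locations are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataSource:
    source_id: str
    citation: str
    location: str  # where the source document or dataset is held

@dataclass
class DistributionRecord:
    taxon: str
    area_code: str
    source_id: str  # every record points back to its source

# Hypothetical sample data.
sources = {s.source_id: s for s in [
    DataSource("S1", "Hypothetical field survey, 1993", "national herbarium"),
    DataSource("S2", "Hypothetical published flora", "university library"),
]}
records = [
    DistributionRecord("Genus1 alpha", "AREA-01", "S1"),
    DistributionRecord("Genus1 alpha", "AREA-02", "S2"),
]

def source_of(record):
    """Look up the source that documents a record."""
    return sources[record.source_id]

def undocumented(recs):
    """Records whose source_id does not resolve to a known source."""
    return [r for r in recs if r.source_id not in sources]
```

A check such as `undocumented` is one simple way to find data items whose origin cannot be traced.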

This particular implementation uses a system called BG-BASE, which has been developed using Advanced Revelation, a relational database management system (RDBMS) which allows for variable field lengths and multi-value fields.

4.5.5 Threats

The data which are needed to describe and analyse threats, and the structures needed to manage them effectively, depend heavily on the nature of the threat. For example, if the threat to a species originates from trade, then details of the traffic and trade in related commodities would be relevant; if the threat is due to loss of habitat, then rates of land use change and related spatial information must be recorded. Different data structures are required in each case. Threats deriving from widely distributed phenomena, such as climate change or long-range transport of pollutants, present different challenges again for data organisation.


The Country Studies Guidelines (UNEP, 1993) outline three major classes of threat: External Socio-Economic Factors; Direct Threats: Local Impact; and Direct Threats: Global Impact. In designing a national program of information management on threats, however, emphasis should be placed mainly on the proximate (or "direct") threats of local origin. Socio-economic factors should be considered as causal factors, rather than as threats per se.

This distinction is not always clear. There is considerable interrelation between causal factors, threats to species, threats to habitats, human activities, mitigation measures, etc. "Threats" to one species may result from measures aimed at conserving another. It is a complex situation without as yet a great deal of standardisation of concepts and approaches. Listed in UNEP (1993) are seven major categories of human-induced threats with very many sub-classes, and a number of other ways of categorising threats are available.

From the perspective of threats to species, a primary breakdown would distinguish between threats to the habitat of the species (such as loss, fragmentation and degradation of habitat quality), and threats to the species itself (such as harvesting, hunting, introduction of competitive species). The E-R diagram which follows (Figure 4.9) applies to the organisation of information on threats which are internal to the country. Different structures will be required to deal with external threats such as climate change.

Figure 4.9: Threats Data Model (entities shown: Causal Factors; Threat Description; Activities; Impact Assessment; Remedial Actions)

Referring to Figure 4.9:

• The central entity threat description contains data such as threat category (preferably following the standard IUCN classification), intensity and duration, a narrative description of the threat, and pointers to relevant references.

• The activities entity records the quantitative information related to the threat, for example how much, how many, where, etc. The exact structure of this will be highly dependent on the nature of the threat. For instance if the threat was "road building", activities might include the length, nature and position of roads, associated support facilities, construction timetables, and so on. If "hunting" was the threat, then records of annual take and the nature and number of hunting parties might be relevant. Many activities are likely to be associated with each threat and vice versa.

• The causal factors entity identifies and describes the primary driving forces which generate threats. For example, in the case of "road building", causal factors might be mining and tourism. Socio-economic and other human factors (see suggested list in UNEP, 1993) would also be listed here. Many causal factors may relate to each threat and vice versa. Similarly, causal factors may generate many specific activities which need monitoring.

• The remedial actions entity includes data on the feasibility of actions to reduce the threat, as well as the costs and benefits involved.

• The impact assessment entity contains estimates of the likely effects of the threats, both ecological and economic, and suggests their potential for reversal.
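The entities described above might be represented in code along the following lines. This Python sketch is a simplified illustration, not an operational design: activities are held as free-form records because their structure depends on the nature of the threat, and all field names and values are hypothetical (the link between hunting and tourism, in particular, is invented for the example).

```python
from dataclasses import dataclass, field

@dataclass
class Threat:
    category: str
    description: str
    causal_factors: set = field(default_factory=set)    # many-to-many with threats
    activities: list = field(default_factory=list)      # quantitative records
    remedial_actions: list = field(default_factory=list)
    impact_assessment: str = ""

# "Road building" with the causal factors suggested in the text.
road = Threat("road building", "New road through forest")
road.causal_factors.update({"mining", "tourism"})
road.activities.append({"measure": "road length (km)", "value": 42.0})

# "Hunting" with an invented causal factor and activity record.
hunting = Threat("hunting", "Subsistence and commercial hunting")
hunting.causal_factors.add("tourism")
hunting.activities.append({"measure": "annual take", "value": 310})

def threats_driven_by(threats, factor):
    """Return the categories of all threats linked to one causal factor."""
    return sorted(t.category for t in threats if factor in t.causal_factors)

driven_by_tourism = threats_driven_by([road, hunting], "tourism")
```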

This data model breaks down the Threats entity of the first level model of Figure 4.4, but is still aimed at a general and generic level, and is thus provided as an example. Expanding the model to include threats to habitats will require at the very least the addition of a spatial entity to reflect the geographic extent of the threat. Other additions which should be considered are linkages to information on institutions and their capabilities and roles in remedial actions, traditional uses of the threatened habitats, economic benefits of resource utilisation, international conventions and treaties, and so on.

Expanding or modifying the model to deal effectively with non-proximate threats, and more general causal factors (such as marine pollution driven by industrial development, and ultimately by population pressure) requires the introduction of additional entities such as "ultimate driving forces", and "regional (or collective) threats" which would hold information which is not specific to a species or habitat.

The consideration and classification of threats is at an early stage (see Document 4, Section 5.9), so no example can be given of an operational database based on this model. At the present time, threats are normally described in narrative form in species or protected areas information systems.

4.5.6 Integrating Core Biodiversity Data

Existing (computerised) databases tend not to attempt to hold all biodiversity information in a single structure. The focus in any one implementation tends to concern a particular entity (or small group of entities), with investment in detailed data about that entity. This may be due to the sectoral mandate of the agency implementing the database, lack of data, or because the user requirements are successfully achieved with that limited information. This is not to say that relationships to other entities are necessarily ignored. For instance, both the examples of species and protected areas databases given above contain data on threats. Arguably this is minimal within the databases, but users could establish links to other (perhaps non-computerised) databases containing further details if desired.

In the context of the CBD, some countries may wish to integrate all their biodiversity information into a single information system at one site, such as a national environmental information centre. For the reasons mentioned above, others may prefer to leave the responsibility of data custodianship to several agencies, and implement an effective coordinating mechanism. The latter could be the foundation of a phased approach leading to integration in the longer term. Regardless of the mechanism, the work to produce standards for classification schemes, agreed taxonomies, data transfer mechanisms, and high level dataflow models, is needed even where custodianship is maintained on a sectoral basis, in order to have a conceptually integrated biodiversity information management system for the country. This integrated view will then permit the development of sound national strategies and actions in response to the CBD.


5 REFERENCES

Ashworth, C. and Goodland, M. 1990. SSADM: A Practical Approach. McGraw Hill.

Chen, P. 1976. The Entity-Relationship Model - Towards a Unified View of Data. ACM Trans. on Database Systems. 1:9-36.

Yourdon, E. 1979. Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design. Prentice Hall.

UNEP 1993. Guidelines for Country Studies on Biological Diversity. United Nations Environment Programme, Nairobi, Kenya.

UNEP/WRI 1994. National Biodiversity Strategies - Guidelines for Biodiversity Planning and Profiles from Early Country Experience. World Resources Institute, Washington DC, in prep.

WCMC 1994. The Biodiversity Clearing House - Concept and Challenges. WCMC Biodiversity Series No 2, World Conservation Press, Cambridge, UK, pp.34.

WRI/IUCN 1993. Biodiversity Indicators for Policy Makers. World Resources Institute, Washington, DC, pp.42.


ANNEX 1: ANALYSIS OF THE INFORMATION NEEDS OF THE CBD

Article | Information Requirements | Type*
1. Objectives | |
2. Use of Terms | None |
3. Principle | How "actions within the jurisdiction" are affecting the environment of other jurisdictions. | Informative
 | Global and regional state-of-the-environment information. | Scientific
 | Interrelationships and effects of outputs (pollution, wastes etc). | Scientific
 | How trading and other economic activities affect the environment of other nations. | Economic
 | National institutional strengths and capabilities. | Metadata
4. Jurisdictional Scope | |
5. Cooperation | Needs of Contracting Parties. | Informative
 | Strengths and capabilities of "competent international organisations". | Metadata
6. General Measures for Conservation and Sustainable Use | Current sectoral and cross-sectoral plans and strategies which may affect conservation and sustainable use of biodiversity. | Informative and metadata


7. Identification and Monitoring | [As identified in Annex I to the CBD]: "1. Ecosystems and habitats: containing high diversity, large numbers of endemic or threatened species, or wilderness; required by migratory species; of social, economic, cultural or scientific importance; or, which are representative, unique or associated with key evolutionary or other biological processes. 2. Species and communities which are: threatened; wild relatives of domesticated or cultivated species; of medicinal, agricultural or other economic value; or social, scientific or cultural importance; or importance for research into the conservation and sustainable use of biological diversity, such as indicator species; and 3. Described genomes and genes of social, scientific or economic importance." | Scientific; Economic, social; Scientific
 | A broad program of systematic data collection on species, protected areas, critical habitats, and ecosystems (these are referred to as the "core biodiversity data" and elaborated considerably in the body of this document). |
 | Human activities, including industrial activities, agricultural practices, land use etc. |
 | Monitoring data on the state of the environment, measured according to standard procedures in continuous time series. |


8. 8a toh Scientific In-Situ Conservation As in Article 7, but with more specific reference to protected areas and to the status of ecosystems and species populations. Informative 8g and scientific The effects of the introduction of "living modified organisms" in other Scientific jurisdictions, and monitoring data on local effects. Informative 8h Information on alien species (presence, sources). Informative

Eradication and control measures for alien | Informative species.

8i Economic Present uses of biodiversity.

8j Innovations and practices of indigenous and local communities.

Accrued benefits (especially economic) of use of biodiversity and relative contributions of local communities (to enable fair sharing of benefits).

8k Existing legislation and regulation on protection of endangered species.

Effects of legislation and regulation in other jurisdictions.

8l,m Financial requirements of conservation measures.


9. Ex-Situ Conservation | Capabilities and facilities of institutions for research and ex-situ conservation.
 | Research results on effective methods of re-introduction of species.
10. Sustainable Use of Components of Biological Diversity | 10a,b As per Article 6.
 | 10c,d Customary use and traditional cultural practices and how these can be used for remedial action.
 | 10e Strengths and capabilities of private sector organizations for the development of methodologies.
11. Incentive Measures | Incentive measures found to be effective in other jurisdictions.
 | Cost/effectiveness of incentive measures employed.
12. Research and Training | Training and education needs and priorities.
 | Available sources of training and education.
 | Biodiversity research activities world-wide.
13. Public Education and Awareness | Available materials suitable for public awareness.
 | Successful awareness tools and activities.
 | Bibliographies, technology of networking and information exchange tools.


Metadata

Informative

Informative

Informative

Metadata

Informative

Economic

Informative Metadata

Metadata

Metadata

Informative

Metadata, informative

14. Impact Assessment and Minimising Adverse Impacts | Major projects which may have impact on biodiversity.
 | Impact assessment methodologies.
 | Resources and population at risk in-country and in neighbouring regions.
 | Nature, availability and location of emergency response facilities.
 | Emergency response contingency plans and strategies.
15. Access to Genetic Resources | Pts 1-6 Systematic record of available genetic resources (germplasm, plant and animal genetic research results).
 | Environmentally sound uses of genetic resources.
 | Pt 7 Benefits, commercial and otherwise, of genetic research and resulting genetic resources.
16. Access to and Transfer of Technology | As in Article 15 but related to technology innovation rather than genetic resources.
17. Exchange of Information | Bibliographies, directories, metadatabases on research, technology, and available data (world wide).
18. Technical and Scientific Cooperation | National institutional strengths and capabilities.
 | Technical and scientific advances and research programmes of Contracting Parties.

Metadata

Informative

Scientific

Informative

Informative

Scientific

Scientific and informative

Metadata

Metadata


19. Handling of Biotechnology and Distribution of its Benefits | [From Article 19] "any available information about the use and safety regulations required by that Contracting Party in handling such organisms, as well as any available information on the potential adverse impact of the specific organisms concerned to the Contracting Party into which those organisms are to be introduced." |
20. Financial Resources | Financial resources available to support activities under the CBD. | Economic
 | Economic and social conditions within the developing countries. | Social
 | Environmental conditions within the developing countries. | Scientific
21. Financial Mechanism | As per Article 20. | Economic
22. Relationship with Other International Conventions | Terms and conditions of other relevant international conventions. |
23. Conference of the Parties | None |
24. Secretariat | None |
25. Subsidiary Body on Scientific, Technical and Technological Advice | Integrated information from all other Articles. | Scientific, informative
26. Reports | Yet to be determined. |

Procedural and administrative Articles with little information management requirement.


*Types of Information:

Informative. Descriptive information about the issue. Usually in narrative form.

Legal. Regulations, legislation and other legal instruments.

Scientific. Measured or scientifically observed data, often in numeric or categoric form in databases.

Economic. Information related to costs, expenditures, and other financial information, usually in numeric form.

Social. Information on population, health, and other social measures, usually in numeric form.


ANNEX 2: LIST OF ACRONYMS & ABBREVIATIONS

BDM Biodiversity Data Management

CBD Convention on Biological Diversity

DFD Data Flow Diagram

ERD Entity Relationship Diagram

GEF Global Environment Facility

GIS Geographic Information Systems

ICSU International Council of Scientific Unions

IUCN World Conservation Union

OECD Organisation for Economic Cooperation & Development

UNEP United Nations Environment Programme

WCMC World Conservation Monitoring Centre

WRI World Resources Institute

NB See also the index of acronyms and abbreviations in the Resource Inventory (Document 4).


WORLD CONSERVATION MONITORING CENTRE

World Conservation Monitoring Centre
219 Huntingdon Road
Cambridge CB3 0DL
United Kingdom
Telephone +44 223 277314
Fax +44 223 277136

The World Conservation Monitoring Centre is a joint-venture between the three partners who developed the World Conservation Strategy and its successor Caring for the Earth: IUCN-The World Conservation Union, UNEP-United Nations Environment Programme, and WWF-World Wide Fund for Nature.