FRAMEWORK FOR INFORMATION MANAGEMENT in the Context of the Convention on Biological Diversity

Biodiversity Data Management Project

United Nations Environment Programme in association with the World Conservation Monitoring Centre

UNEP - United Nations Environment Programme is a secretariat within the United Nations which has been charged with the responsibility of working with governments to promote environmentally sound forms of development, and to co-ordinate global action for development without destruction of the environment.

The World Conservation Monitoring Centre, based in Cambridge, UK, is a joint venture between three partners in the World Conservation Strategy and its successor Caring for the Earth: IUCN - The World Conservation Union, UNEP - United Nations Environment Programme, and WWF - World Wide Fund for Nature. The Centre provides information services on the conservation and sustainable use of species and ecosystems and supports others in the development of their own information systems.

Copyright © 1996 United Nations Environment Programme

For additional copies of this document or further information please contact:
Task Manager, Biodiversity Data Management (BDM) Project
United Nations Environment Programme
PO Box 30552
Nairobi
KENYA

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
PREFACE
1 INFORMATION FOR DECISION SUPPORT
  1.1 Introduction
  1.2 Information Needs
  1.3 The Power of Information
  1.4 Information as a Tool
2 INFORMATION SYSTEMS - THE FRAMEWORK
  2.1 Introduction
  2.2 Design Concepts
  2.3 Network Co-ordination
  2.4 Organisational Structure
3 INFORMATION SYSTEMS DEVELOPMENT
  3.1 Introduction
  3.2 Historical Context
  3.3 User Needs Assessment
  3.4 Approach 1: Structured Development Life Cycle
  3.5 Approach 2: Prototyping
  3.6 Examples
4 DATABASE DEVELOPMENT
6 INFORMATION PRODUCTION
  6.1 Introduction
  6.2 Information Products
3 Information Systems Development

3.1 INTRODUCTION

The concept of custodianship implies that data should be collected and managed by the agency which is most appropriate and best equipped to do so. Since biodiversity is a very wide-ranging topic, spanning many agencies and disciplines, the application of custodianship results in a distributed information system consisting of loosely linked datasets managed in separate locations (see Figure 2). To operate effectively, three fundamental activities are necessary in such a system:

• Regular collection (monitoring)
Many agencies are good at data collection. However, the time dimension may be lacking from their data - the paradigm to use is monitoring, not collection - and the techniques used may not be consistent over time or with other agencies. To reveal environmental trends, data should be collected in standard formats, via standard techniques, and over long periods of time. One-off studies may be interesting for many reasons, but consistency of results is almost certainly more useful in the long term.

• Management and accessibility
For information to flow between agencies, and from agencies to other audiences, data should be managed in ways which promote accessibility. Various principles and techniques are necessary to achieve this, including the design and development of local information systems (this chapter) and, more specifically, computer databases (see Chapter 4). It is the task of all individual stakeholders in the network and, in particular, its hub to ensure that sufficient resources and expertise are mobilised to develop the capacity of agencies to manage data effectively.

• Summary into information
Although the architecture of a multi-agency information system is distributed, some co-ordinated activity is necessary to firstly summarise the data collected by custodians and secondly build collaborative information products (see Section 2.3). This activity is facilitated by the network hub, which maintains contact with all the custodians and monitors the status of their data (see Figure 3).

Within the context of the overall information system, partner agencies have two goals: to develop their own data for improved corporate productivity; and to integrate their data with other agencies to achieve results beyond individual capacities. Thus improvements in data management capacity are immediately beneficial to the agency concerned, as well as the wider network.

Depending on the profile given to information within an agency, and the resources which are available, projects may already be underway to increase information usage, boost data management effectiveness, and even implement localised information systems. The potential for collaborating with or building on existing work should be investigated by all agencies before embarking on new projects, since the experiences gained may be extremely valuable. However, in many situations custodians will wish to initiate new projects to manage their data, and in such cases assistance may be required from the network hub. In order to address the needs of different custodians, a variety of approaches to information system development may be required, and these should always be debated in an open, consultative manner (see Section 3.3).
For example, a system set up to manage a plant genetic resources dataset in a ministry of agriculture may differ greatly from a system set up to manage sustainability indicators in a forest department. It would not be efficient to merge the two datasets together, nor would it be acceptable to the agencies concerned.

Although every information system project will have its own objectives, generic methods of project management are useful in structuring the design and development process. The remainder of this chapter considers such techniques - referred to as system development methodologies - whilst Chapter 4 focuses in greater depth on the specific issue of database design.

3.2 HISTORICAL CONTEXT

As the use of computers in information management has expanded, methodologies for the development of information systems have steadily matured. These originally arose to address the problem of excessive cost (resources and time), which often exceeded original estimates. In the late 1960s and early '70s, a standard project life cycle became accepted as a means of structuring information system projects. Given the constraints of the technology at that time (for example mainframe architectures, punch card processing, and languages such as FORTRAN and COBOL), projects tended to be managed by computer specialists. When eventually delivered, the systems were subjected to criticisms such as 'not what I wanted', 'incomplete' and 'unworkable'. Two factors contributed to this:

• long development periods during which users altered their requirements
• difficulties in phrasing user needs in complete and unambiguous ways.

In the early 1970s the first of these challenges was partly addressed via concepts such as structured programming and structured analysis. The tools which were developed to support these techniques prepared the ground for many of the development tools we use today - tools which relieve much of the burden of programming. The driving force was productivity, since the cost of human resources was a key consideration in the overall project.

A class of system development methodologies grew up around the structured programming paradigm. Collectively, these are referred to as the Structured Development Life Cycle approach, in which development is carried out in a series of structured phases. Different variants of the life cycle are advocated by different countries and organisations, some of which are accepted as 'standards' in industry and government.

Computer performance has increased markedly since the 1970s. Project development is now centred around the 'desktop', with powerful, sometimes graphical languages being introduced for accelerated system design. With this revolution came the introduction of 'prototyping' tools capable of modelling the finished product quickly to generate feedback from prospective users. This led to the introduction of new development methodologies in which prototype systems are modified on the basis of feedback from prospective users.

3.3 USER NEEDS ASSESSMENT

3.3.1 Overview

When building information systems, the earlier that problems are identified the easier (and therefore cheaper) it is to correct them.
Correcting an error during the design stage is normally a simple, paperwork task; correcting an error after a design has been translated into a working system is more costly (this may require equipment modifications); correcting an error after users have begun to employ the system for their work is more costly still, and may involve retraining of staff in addition to equipment modifications (see Figure 5).

Figure 5: Relative cost of change during information system development

It is important to assess user needs rigorously before embarking on an information system development, and to refer to these needs often as work progresses. Without proper attention to user needs assessment, time and money can be wasted on systems which are not cost-effective (eg fail to deliver the required products), leading to dissatisfaction and eventually loss of confidence in the project by stakeholders. The key challenges in user needs assessment are therefore:

• to reduce unnecessary costs and delays during system development
• to promote ownership of the development process by stakeholders.

Clearly, the solution to these challenges will be different in each project. However, in most cases the principal objectives of a user needs assessment can be summarised as follows:

• to define the users of the information system, especially the audience for its information products
• to determine the priority information needs of these users
• to set objectives for information system performance, including defined products and services
• to establish participative, collaborative approaches to information production and use.

The main product of the assessment is a document known as the 'functional specification' of the information system. This describes the background to the information system project including the justification, cost-benefit analysis, description of key stakeholders (including their capabilities and needs), and definition of products and services. The specification also comprises technical details such as an inventory of essential datasets, diagrams illustrating information flow between system processes, and definitions of the major database structures. This is done at a formal, conceptual level since the specification is quite independent of equipment issues (eg hardware or software); indeed, it should be free from any kind of implementation details. The size and formality of the specification will vary according to the complexity of the system proposed. For instance, a complex project involving several sectors and agencies might be broken down into a series of sub-projects, each with their own functional specification.

An indication of the importance of the user needs assessment is provided by Richardson (1994), who claims that this step "took 80% of the time of the start-up phase" of the Environmental Resources Information Network (ERIN) information system in Australia, and that "great self-control was needed not to be 'busy' purchasing hardware, software, and data until these matters were settled".

Most standard text books on information systems development devote a chapter to user needs assessment, as do more specific books on GIS implementation. Two examples are Powers and Cheney (1990) and Aronoff (1989). A useful guide to establishing needs for GIS can also be found in Wiggins and French (1992), and guidelines for the requirements phase for general information systems development in the Model Software Development Standard.
To reduce unnecessary costs and delays during system development, emphasis should be placed on the early stages of the development process, particularly user needs assessment.

3.3.2 Initial Steps

Active consultation is essential during a user needs assessment to promote participation in the development process and reveal needs which cannot be 'guessed' reliably by developers. Conversely, consultation allows developers to explain the potential applications and limitations of information technology to different users.

Assessment projects often begin with a workshop attended by representatives of the major stakeholders and technical experts who will contribute to the information system. This workshop should attempt to reach agreement on:

• which environmental issues are the highest priority
• what information is required to support decisions on these issues (content)
• what long-term information and monitoring programmes are required, and who is responsible for implementing them
• which audiences require information most urgently, and how best to reach them (delivery)
• how and when information should be presented (format and timeliness)
• which data collection and management standards will be followed
• what mechanisms are required for data exchange and cost recovery (eg 'Memoranda of Understanding')
• what are the main capacity building needs (eg technical and human resources).

More detailed consultations between stakeholders and members of the Development Team will be necessary as the assessment progresses. These usually take the form of questionnaires, interviews, brainstorming sessions and working groups (see Section 3.3.5), during which stakeholders are invited to outline institutional strengths and capacity building needs, and suggest specific collaborative objectives. In response, representatives of the Development Team may probe the operational procedures of the user's organisation to judge how best to implement requests. Multiple consultations may be required to deal with operational issues such as data availability, quality assurance, operating procedures and data security.

In large projects, formal techniques such as data modelling (which results in entity-relationship diagrams), process modelling and prototyping are used to structure the assessment results. An example of a formal specification (for BirdLife International) can be found in Van Dijkhuizen (1994), and a less formal example (for the UNEP Office of Harmonization of Environmental Information) in Crain (1992).

To determine the overall issues, objectives and challenges of the user needs assessment, it is constructive to hold an initial workshop attended by major stakeholders and experts.

3.3.3 Data Needs

Having decided the key environmental issues and information needs, the task of the Development Team is to determine which datasets are required to support them. For instance, the need "to be able to decide on enhancements to a national parks system" may require data on the current extent and status of protected areas. Similarly, the need "to decide whether to permit bio-exploration" in a certain region may require information on the ecology, biodiversity, traditional uses, and cultural values of the region.

Data modelling is commonly used to facilitate the transformation of information needs into data requirements (see Section 3.3.5). In this technique, primary data 'entities' are depicted graphically, and their relationships to one another made explicit.
This is useful for communicating the nature and structure of perceived data requirements back to users for verification, and also serves to consolidate ideas. At this stage data modelling should be restricted to high levels of generality, and make no reference to where or how the data will be obtained or managed. More detailed data modelling takes place during information system design (see Section 4.5).

Currently available datasets should be catalogued on paper or electronically in a metadatabase (see Section 5.3). This enables gaps to be determined by comparing existing datasets with those which are required. Data needs can then be expressed in terms of existing datasets (which may need enhancement) and new datasets which are required to cover gaps. During this process it should be remembered that gaps should only be filled if there is sufficient justification for doing so. Data collection should always be linked to the development of priority information products, rather than being treated as an end in itself.

The assessment of data needs should lead to the following outputs:

• table of required datasets, indicating content, current custodianship, access method, data type (eg tabular, text, spatial, graphics), and quality estimate
• generalised data model
• preliminary data dictionary.

3.3.4 Processing Needs

Various processing tasks are necessary to transform data into information products. These should be documented to enable appropriate facilities to be built into the information system. Typical processing tasks include data integration, analysis, validity checking, updating, and reporting. It is convenient to divide data processing needs into three categories: management, analysis and production.

• Management processes ensure that data are maintained securely and made available for widespread use. Typical processes include dataset documentation, quality assurance (error detection, update, backup), application of standards, database development, and negotiation of data exchange agreements. Associated processing needs are those which facilitate the use of data within an organisation, such as file conversion and exchange, procurement, messaging (eg electronic mail and other on-line services), and technical support and training.

• Analysis processes are applied to one or more actively managed datasets to yield specific results useful for building information products. These include data integration, summary ('aggregation'), statistical analysis (including spatial analysis) and other interpretative techniques such as modelling and forecasting.

• Production processes combine the results of analysis with other sources of information, such as the history and context of the issue concerned, and supporting details like acknowledgements and method of follow-up. Production also involves packaging and communicating information products, which may require specific processes of its own, such as publishing and marketing (see Chapter 6).

The first step towards determining processing needs is to identify and describe the current processes and data flow. Formal 'process modelling' tools are available (see Section 3.3.5) to illustrate the flow of data and information between processes and to describe the processing which takes place at each step. Process modelling is frequently used by management consultants during quality improvement and re-engineering exercises.
The objective is to analyse current business processes and suggest alternatives which enable the organisation to meet its output needs more effectively. Process modelling may be applied at all levels. Thus whole agencies or departments may be treated as processes in a high-level process model, and the resulting flow as evidence of partnership or linkage. High-level process models are sometimes referred to as institutional linkage diagrams. They are a useful means of determining the co-ordination needs of multi-agency information systems.

Assuming that the objectives of the information system have been set (by an initial workshop or steering committee), it should be possible to study the current process model and decide what functions ('capacities') are missing. The capabilities of the agency (or agencies) concerned can then be examined and potential solutions proposed. One of three outcomes is likely:

• Current processes are adequate. Priorities must be set and resources allocated.
• Some processes are weak. Capacity enhancement is required, leading to the restructuring of weak processes (eg concentration of resources), provision of training or equipment, recruitment of new staff, or application of quality assurance procedures.
• Many processes are weak or poorly co-ordinated. Major capacity building is required to renew the agencies/processes concerned. Some processes may be replaced, enhanced or discarded if the opportunity to 're-engineer' the overall process is taken. Guidance may be required from international agencies and consulting companies.

The assessment of processing needs should lead to the following outputs:

• annotated process model (formal data flow diagram)
• institutional linkage diagram (as above but treating agencies as processes)
• description of the data management capacities of partner agencies
• description of the analysis techniques and related tools (eg software) employed by the above
• description of the desired outputs, including services and information products, of the information system
• assessment of the strengths and weaknesses of current processes, including suggestions for alternative process models and capacity building requirements.

3.3.5 Tools and Methods

There are useful tools and methods for determining and documenting user needs. Any particular assessment may require only a subset of these, the most appropriate methods depending on the depth of the study, the nature of the scientific endeavours, the organisational culture, and previous experience of the participants.

• Questionnaires
Questionnaires are a highly structured method of data collection in which respondents are requested to 'fill in the blanks' on a form. This can be a valuable data collection tool in itself, or serve as a guide to facilitate data gathering, eg in interviews. A properly designed questionnaire promotes the systematic collection, cataloguing and evaluation of data. This eases the summarisation of general basic facts and trends. Data collection by this method is inexpensive and efficient. Questionnaires are best for collecting specific information or opinions on narrow options. Their principal value is as a preliminary screening method to help determine which institutions or functions should be studied in more depth. In addition, questionnaires can be helpful as a checklist or aide-memoire for conducting structured interviews.
Questionnaires have limitations for open-ended or general assessment of user requirements, and past experience has shown that very low response rates are obtained from 'blind' distribution - that is, mailings without advance warning or explanatory material. Response rates can be improved by including a supporting brochure providing a summary explanation of the purpose of the study and questionnaire, together with a sample questionnaire completed as an illustration. However, even with this assistance, respondents may have difficulty answering some of the questions, may leave some fields blank, misinterpret questions, or bias answers based on incorrect assumptions.

• Structured Interviews
The structured interview uses an independent person to obtain views through direct questioning and discussion. The interview is 'structured' in the sense that there are particular topics and/or questions which are asked in all cases, and standard explanatory information is provided in advance. Interviews may be conducted individually or as a group. Individual interviews can be conducted formally (questions are asked and responses recorded on tape or written down) or informally (a questionnaire is used as a guide to discussing key topics). Information can either be recorded at the time or summarised following the interview. The interviewing approach should be sensitive to the cultural norms of the institution and individual concerned.

Group interviews are useful where discussion and consultation are the preferred way to establish answers. A questionnaire or checklist is used as a guide to solicit and record information. Information from the group is then summarised. In this approach it is useful to have one person to lead the discussion and another to record important information. Group interviews often benefit from a short presentation on the topic before opening up the discussion more fully.

• Working Groups
Working groups are small teams of individuals formed to address specific topics and return their results in a specified time frame. Working groups differ from committees in having a time-limited mandate - no on-going role after the assigned task is completed. Working groups are usually composed of experts in particular fields rather than representatives of organisations. Working groups are a particularly useful way to refine information on a certain topic (eg a working group on indicators, or GIS) or to resolve serious problems or uncertainties.

• Workshops
Workshops are similar to working groups in having the objective of addressing a particular topic. A workshop brings together relevant expertise for a short period (typically a few days) with the aim of producing agreement, better mutual understanding of issues, and a plan for future actions. Workshops often incorporate elements of training and, where a wide spectrum of institutions is involved, facilitate sharing of knowledge and expertise. Workshops always arrive at decisions by consensus.

• Brainstorming
Brainstorming is a particular type of discussion technique in which the goal is to accumulate ideas on a subject in a short space of time. A facilitator is needed to initiate and steer the session, as well as create an atmosphere which stimulates creative thought. In a brainstorming session, all individuals are free to speak, and there is particular encouragement to put forward unusual and new approaches. All inputs are recorded. The ideas are then sorted and used where applicable in the context of the project.
Brainstorming is most useful when defining the initial scope of a project, when a change in strategy is required, or simply for an infusion of new ideas and inspiration. For example, brainstorming may be useful in trying to identify key datasets in an institution, or new forms of information products to influence decision making.

• Data Modelling
Data models illustrate the relationships between data entities, which may be defined as items of interest whose attributes (properties) are being recorded (see Section 4.5). The technique was first described by Peter Chen (1976). For example, an entity representing 'institutions' might be described by the following attributes: name, address, date established, mission, and annual turnover. The relationships between entities are depicted in 'entity-relationship' (E-R) diagrams, which use formal, consistent conventions to indicate different kinds of relationship. For example, a one-to-many relationship exists between an institution and its staff; and a many-to-many relationship exists between staff and the projects on which they work (assuming more than one person works on each project).

Data models can be subjective. Thus two individuals may produce distinct but equally valid models of the same data, based on different sets of objectives for their applications. The kinds of data model most useful for user needs assessment are relatively high level (ie generalised), with more detailed modelling left until later stages of information system development (see Section 4.5).

• Process Modelling
Following the methodology developed by Yourdon (1979), process models (or data flow diagrams) can be used to illustrate the flow of data between processes in a business or information system operation. A consistent diagrammatic convention is often used, with lines between processes and datasets indicating the direction of data flow. For clarity, it is conventional that each diagram should contain only a limited number of processes (4-6) and themes. The process model is useful in providing a clear overview of the existing operations of an information system.

3.4 APPROACH 1: STRUCTURED DEVELOPMENT LIFE CYCLE

3.4.1 Overview

A well established class of methodologies uses the Structured Development Life Cycle approach, in which the development is carried out in a series of structured, incremental phases. Although different variants of the life cycle are advocated in different locations, all share the following basic features:

• there are distinct phases moving from conceptual issues to operation
• specific defined products result from each phase
• the phases are carried out in sequence, building on the products established in previous phases
• a decision as to whether to proceed is taken after the completion of each phase
• looping may be required to revise or refine products from the previous, but not earlier, phases.

Figure 6 shows an example structure for the life cycle. Diagrams such as these have led to the 'waterfall' label being applied to this methodology.

The overall aim of system development is to create working databases in the agencies which are partners to the information system. Ideally, this is achieved using Development Teams drawn from the agencies concerned, rather than bringing in external consultants. However, the information system hub and, in particular, its Steering Committee may be required to facilitate system developments in other ways (eg training).
The structured development life cycle approach follows a series of logical steps from project initiation to operation.

Figure 6: Structured Development Life Cycle

3.4.2 System Design

3.4.2.1 Purpose

In the system design phase, the functional specification prepared in the user needs assessment is translated into a logical, and then physical, design (see Section 4.4). This should result in a design based on the datasets and processes outlined in the functional specification which, once implemented, will deliver the outputs which users desire. The relationships between the different components of the information system (eg databases in different agencies) are made explicit in the design phase, and appropriate data exchange procedures are suggested.

Decisions are made on the overall system architecture during this phase, including the type of hardware and software to be used (see Section 4.6). The organisational environment has a large impact on how this is handled. The system may be implemented on existing hardware and software; but if no suitable equipment exists procurement may have to be initiated. The architecture of the system should, in most cases, enhance rather than replace existing mechanisms for data exchange amongst different groups of users.

3.4.2.2 Activities

The Development Team puts together the system design in terms of the required data storage, access, and processing capabilities, and these are verified by selected users to ensure that they concur with user needs. Verification can be achieved by means of informal discussions, interviews and workshops (see Section 3.3.5), or by means of prototyping techniques (see Section 3.5). Specific designs for sub-components, such as database applications, may also be undertaken at this stage (see Chapter 4).

The design phase provides an opportunity to begin training users in the system development strategy. If hardware and software are to be installed, effort is also needed to verify functionality against vendor claims, and to develop tight specifications for additional equipment and technical support.

3.4.2.3 Products

The product of this phase is a design specification which defines and prioritises the development tasks to be undertaken in the next phase, including details of any equipment required. Estimation of costs can be rigorous here, since these can be calculated by totalling the proposed development time and resources required (procurement costs can be confirmed by vendors). The design specification should provide sufficient cost-benefit analysis to enable project managers to decide whether or not to request design modifications in order to satisfy time-scale or budgetary commitments.

3.4.3 Development

3.4.3.1 Purpose

In accordance with the design specification, database structures are established physically (see Section 4.7) and populated with test data to verify operation (see Section 4.8.1).

3.4.3.2 Activities

The major involvement is with the developers, who are coding, testing and documenting the information system. However, user involvement should be maintained through demonstrations of functionality as it is developed.
Continuing user involvement serves a number of purposes:

• assists with verifying, testing and debugging the system
• ensures that the system correctly addresses user needs (ie reflects the content of the functional specification)
• prepares users for delivery of the system in the next phase.

3.4.3.3 Products

The chief product of this phase is a functioning system which conforms to the design specification; the decision to proceed depends on this having been achieved. Assuming all is well, an implementation plan should be prepared for the following phase.

3.4.4 Implementation

3.4.4.1 Purpose

The purposes of the implementation phase are:

• to check the functionality of the system against user needs as laid out in the functional specification
• to establish and document effective operating procedures, including appropriate user manuals, data security policies, and data exchange guidelines for the system (see Section 5.4.2)
• to ensure that staff are familiar with these procedures by providing appropriate training.

The implementation plan produced in the development phase should guide how this is achieved; for instance, it should set out techniques for exercising the full range of system capabilities and administrative duties. Functionality may be incorrect or missing, in which case details should be recorded for correction, and the affected parts of the system should be re-tested at a later stage.

The Development Team is often expected to absorb and implement a series of modifications during system testing. This should not be taken as an opportunity for users to demand fundamental changes in system characteristics, merely to check that their original needs are satisfied.

3.4.4.2 Activities

Both users and developers are involved heavily in this phase. The former organise and carry out system testing, and the developers correct, modify and fine-tune system performance. The results of this process should be recorded in the form of operating manuals, policies and guidelines.

3.4.4.3 Products

A functioning information system ready for operation, including the appropriate documentation, operating procedures and training provision.

3.4.5 Operation

3.4.5.1 Purpose

The operational phase is where the system should remain for its lifetime, becoming a regular feature of the agency or groups of agencies for which it was built. During operation, users may detect errors in the system or conceive of improvements which could be made. It is important that a mechanism be put in place to accommodate feedback from users of this kind, in order to constantly improve system performance. One solution is the retention of a small technical support team (possibly some of the same individuals responsible for system development) who can respond to user problems and make changes 'on the fly' or during periods when the system is not actively in use. The undertaking of major revisions or the correction of serious operational problems is best handled by the user community as a whole, via such mechanisms as a user support group or other forum.

3.4.5.2 Activities

Users review the performance of the system, including the documentation and suggested operating procedures, taking care to establish mechanisms for technical support.

3.4.5.3 Products

The outputs of this phase are those which are derived directly from using the system - ie the benefits of improved information management which were originally sought when the project was initiated.
3.5 APPROACH 2: PROTOTYPING

3.5.1 Overview

The structured development life cycle methodology described above has some disadvantages. The methodology requires a great deal of interaction with users in the early phases to define system requirements, followed by a (potentially long) period where the developers implement the specification. After this, the users once again become involved to test the final product. However, gaps in participation at any stage of system development can erode confidence in the Development Team. Furthermore, user needs tend to evolve throughout the development period, making it essential to maintain dialogue on a regular basis.

With many industrial and administrative information systems it is relatively easy to specify the data requirements and the processes which are required to create the desired information. However, with biodiversity information systems (and many other scientific applications) the 'process' part of the specification is more difficult. For example, it may be troublesome to determine what types of analyses should be applied to the data, and how to summarise information in ways that are suitable to policy and decision-makers. This increases the risk that decisions made during the user needs assessment may need major revision.

These concerns have led to alternative, more interactive approaches to information system development which apply the concept of 'prototyping'. The principles of this approach are:

• to create a common ground between users and developers
• to have all parties understand the complexity of the processes being automated
• to build small versions of the system quickly (and inexpensively) so that user needs can be discussed in the light of a real example
• to allow changes to be incorporated easily during the development process
• to provide continuous interaction between users and developers throughout the development process.

The principal advantages are that the developers can quickly verify that their interpretation of user needs is correct, allowing problems to be identified and corrected early in the process.

Figure 7: The 'Throw-away' Prototype
Figure 8: The Evolutionary Prototype

Prototyping methodologies develop 'mock-up' or partial systems within a short space of time, allowing potential users to provide feedback before proceeding.

Within this general framework, prototyping methods can be categorised into two types as described below.

3.5.2 The 'Throw-away' Prototype

With this approach a simple mock-up or demonstration of the system or one of its parts is built, demonstrating to users how it would perform in practice (for example, how the data entry screens would look, or how reports would be formatted). The demonstrations do not necessarily use real data; nor are real analyses usually tackled at this stage. The prototype is rather like an artist's sketch of a new building (see Figure 7): it can be modified, perhaps several times, until users are completely satisfied, following which it is discarded and a real system (production version) is built.

3.5.3 The Evolutionary Prototype

The evolutionary prototype starts building a small part of the overall system (eg one process) all the way from design to
Feedback from users is then incorporated into the design piece by piece, increasing the core capabilities of the prototype until it evolves into the production system. The result of the evolutionary approach is a system which can be adapted easily to future changes (see Figure 8). 3.5.4 Summary of Methodologies The features of structured and prototyping approaches may be combined for maximum effectiveness. For instance, prototyping may be added as an additional phase in the structured life cycle, or applied during the design phase of the structured life cycle (see Figure 9). With combined approaches, adaptation to change is integral to the development methodology. In practice, the traditional ‘waterfall’ approach works best for complex projects which are precisely defined in advance (ie high certainty of user requirements) and tightly controlled during development. Conversely, prototyping works best with simpler, less easily defined projects, which may evolve as user needs are refined. A combination is recommended where the project is both complex and uncertain. Information Systems Development The choice of methodology and related tools is usually made by the team responsible for building or upgrading the system. Nevertheless, all users should be aware of the options in order to participate effectively in the project. 3.6 EXAMPLES Two biodiversity information systems are profiled below. Further information on these, plus a range of other biodiversity application software, is provided in UNEP/WCMC (1995). e BG-BASE An illustrative example of a computerised biodiversity information system is BG- BASE (see Figure 12), which was implemented following a request from IUCN to create a microcomputer-based application for botanical gardens, both large and small, based on the International Transfer Format (ITF) for plant data (see UNEP/WCMC 1995). A full account of the implementation process is given in Walter (1989), an excerpt of which is included below: “From the beginning the design of BG- BASE has been a group effort; it has now involved more than 100 people from over 35 institutions.... For approximately two years, a group of five to eight of us (specialists) met over lunch nearly every week to plan and to discuss the design, and eventually to test and criticise the implementation. Ideas for new data fields, new files, and new reports were _ presented regularly for general discussion, resulting in some fairly heated debates. The heart of the system was always understood to be based on_ the International Transfer Format, but since this format specified only 36 fields, we had a great deal of fleshing out to do. As it currently stands, BG- BASE comprises 564 fields spread over 12 major files. In addition to these major files, there are another ten index files that allow the user to look up information in a wide variety of ways” The heart of the system was based, as requested by IUCN, on the International Transfer Format (ITF) for Botanic Figure 9: Prototyping in design phase System Design Development Implementation Information Systems Development Gardens, a _ protocol created for exchanging information (see UNEP/WCMC 1995). The value of using the ITF and the need to keep the application generic have become evident over time. BG-BASE has now been adopted by over 50 institutions world- wide to manage living collections, conservation information, herbarium specimens, and as a teaching tool. 
These institutions comprise botanic gardens, arboreta, horticultural societies, museums, universities and conservation monitoring centres. The use of BG-BASE to manage plant conservation data at WCMC illustrates the importance of a flexible design. Although originally designed as a specimen-based system managing botanic gardens' living collections, BG-BASE has proved suitable for use in other contexts.

• Biodiversity Data Bank
The Biodiversity Data Bank (BDB) was established at the Institute of Environment and Natural Resources, Makerere University, in early 1993, although the task of collating Uganda's biodiversity data began long before this using manual techniques (a full account is given in MUIENR/WCMC 1995). The specification of BDB was conceived by a small Development Team with extensive knowledge of the information requirements of the biodiversity sector in Uganda. Many key organisations were consulted, including the Botany and Zoology Departments at Makerere University, the University Herbarium and Zoology Museum, Uganda Wildlife Authority, Forest Department, and several NGOs, such as IUCN and WWF.

The scope of the system is such that it can handle a wide variety of biodiversity data. This was considered important by users, who requested a single system to manage their data, rather than a series of separate databases. The major data holdings include taxonomic names, species distribution records, protected area profiles, details of administrative units, a gazetteer, bibliography, and directory of contacts.

BDB was originally conceived as a means of organising the large amount of data relating to Ugandan biodiversity located inside and outside of the country. From the outset an aim of the system was species mapping, and thus facilities were built into the system to download species distribution data in a form suitable for desktop mapping programs. However, due to a requirement to provide information on the country's protected areas system, pre-defined reports were also developed to list species, and in some cases estimate diversity, within protected areas. More sophisticated analyses were also developed to predict species distribution on the basis of observed habitat suitability.

4 Database Development

4.1 INTRODUCTION

Effective data management is central to the success of a distributed information system. The goal of this chapter is to promote techniques which inherently facilitate integration and exchange of data - thereby widening the range of applications for which the data can be used and simplifying the process of information production (see Chapter 6). Custodian agencies should follow consistent, long-term methodologies for data collection and management, in accordance with the following principles:

• data should be collected and managed in their primary form, not classified, aggregated, or otherwise interpreted, allowing them to be used for multiple purposes (see the sketch following this list)
• data should be collected and managed following accepted standards (conventions) to reduce transaction costs and expedite interpretation by others
• databases should be developed and implemented using generic methodologies which facilitate adaptation to future needs
• databases should be implemented using widely available computer hardware and software to expedite access by others.
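As a minimal illustration of the first principle, the sketch below keeps hypothetical species observations in their primary form (geo-reference, date, observer) and recomputes a summary only when it is needed. The dataset, species names and field names are invented for illustration and are not taken from this document.

```python
# A minimal sketch (invented example): store primary records, derive summaries on demand.
from dataclasses import dataclass
from collections import Counter

@dataclass
class Observation:
    species: str      # identification as recorded; may be revised after a taxonomic change
    latitude: float   # primary geo-reference of the recording
    longitude: float
    date: str         # date of the recording
    observer: str

# Primary data: individual recordings, not classified or aggregated.
observations = [
    Observation("Loxodonta africana", -0.36, 36.08, "1995-07-14", "J. Example"),
    Observation("Loxodonta africana", -0.41, 36.12, "1995-08-02", "J. Example"),
    Observation("Panthera leo",       -0.29, 36.05, "1995-08-03", "A. Example"),
]

def records_per_species(records):
    """Derived (secondary) data: a summary that can always be recomputed
    from the primary records, even after the underlying concepts change."""
    return Counter(obs.species for obs in records)

print(records_per_species(observations))
```

Because the summary is derived rather than stored, a later revision of the underlying concepts only requires re-running the derivation, not re-collecting the data.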
In addition to these core principles (which are elaborated later), a number of quality management principles should also be noted:

• datasets should be fully documented to facilitate use by others (see Section 5.3)
• procedures for operational and data security should be established (see Section 5.4)
• datasets should be maintained and used by groups, not individuals, to increase operational security (see Section 5.5).

Data should be managed using standard, sustainable methodologies, which widen the range of applications of the data.

4.2 PRIMARY DATA

Environmental data record objects and phenomena in the physical environment. Some of these recordings are factual, for example the geo-reference of the location where a recording was made, the date of the recording, the dimensions of a tree, the weight of a log, the mean annual precipitation at a site, or the water retention capability of a soil profile. These are all primary data, based on facts which can be measured against a stable, widely accepted standard (Busby 1994).

Secondary, or derived, data are those developed from primary data by a process of interpretation or classification, either at the time, or later. Examples include: species name, vegetation type, canopy extent, and climatic zone. Derived data should not be stored in a database unless the primary data from which they were derived are also available. Why is this? Because, as concepts and paradigms shift, derived data are degraded in value and ultimately become useless. For example, if the only representation of a species distribution is an outline drawn on a map, this information becomes redundant if the species is split or otherwise disaggregated following a taxonomic revision. The correct approach would be to store the co-ordinates of the species observations (and supplementary identification notes) to enable new outlines to be derived.

The principle of storing primary data needs to be applied intelligently. No one, for example, would refuse to store the names of species or vegetation types, even though they are susceptible to change. The process of deciding which data to store is therefore one of risk assessment. Given the high costs of collecting data, the benefits of using a particular dataset should be balanced against the risk that it will become obsolete. As a rule, we should not be obliged to use data which are known to be deficient but which are too costly to replace or enhance.

The costs of data collection are particularly high in the case of large, national-level datasets, and strict priorities are therefore required for dataset production and maintenance. In general, it is wiser to develop nationally consistent datasets at low resolution, and progressively fine-tune them, than to piece together more accurate, but inconsistent, local-scale datasets. This does not imply that local-scale data have no role in national information systems, only that a priority-setting framework should first be established to regulate their contribution (see Section 2.5).

4.3 DATA STANDARDS

Standards are the means by which people communicate information and are thus vital in any information system. Standards embrace the selection of attributes representing environmental phenomena, the nature and allowable values of those attributes, and how they can be used to greatest effect by stakeholders (Busby 1994). The purpose of standards is to lower the transaction costs of using data.
Thus priorities for establishing standards should take into account the expected uses of the data, for instance in creating collaborative information products.

The development of standards requires a real commitment of resources, largely intellectual in nature. They cannot be overlooked, taken for granted, or left to specialists who are not actively participating in the information systems project. They require concrete and determined attention by management; developing standards will not be easy.

Recognising that progress towards formally accepted national (and international) standards can be very slow, national information system projects will inevitably develop their own, interim, standards. In such cases it is vital to build on previous experiences, perhaps at the international or regional level, which may be available via international organisations and networks. Interim standards are commonplace across many of the major themes, often having arisen to suit particular data collection and management objectives. Such de facto standards are propagated and adapted in local database implementations. The development of a multi-agency information system provides a good opportunity to reconcile and revise existing standards, taking into account a wide range of stakeholders' needs.

4.4 DATABASE DEVELOPMENT

Database development involves designing and building the structures necessary to manage one or more related datasets. Generic methods are available to develop databases, and the ideas presented in the following sections attempt to simplify and summarise these. The terminology for the following processes follows Daniels and Tate (1984).

A user needs assessment (see Section 3.3) is assumed to have taken place before database development is attempted. The assessment, which is written up in the form of a functional specification, is intended to provide all the details necessary to design the database in accordance with users' needs.

Database design is partitioned into two phases: the logical design phase, which is independent of the equipment used for implementation; and the physical design phase, which determines how the logical design will work using the equipment selected. Linking these two phases is the analysis of the equipment required to implement the database, which in most cases involves the selection of appropriate hardware and software. Figure 10 illustrates how these processes give rise to the final, physical database.

Figure 10: Database development

4.5 LOGICAL DESIGN

Logical database design involves identifying key datasets and studying how these need to be accessed and analysed to achieve the desired objectives. The logical design is independent of both hardware and software, and does not assume any particular method of physical data organisation (in practice the hardware and software platforms available - perhaps constrained by budgetary limitations - may affect the final logical design).
The advantages of producing a logical design are:

• it provides a stable base from which to set standards and co-ordinate the development of the database
• it provides a conceptual model which is completely free of implementation considerations, and which can be used as a point of reference when adding to or modifying the functionality of the database, or changing the equipment on which it is based
• it provides a specification which can be used in the evaluation of alternative data management software
• it provides a baseline from which an optimum physical data organisation can be produced.

It is important for users to achieve a common understanding of the datasets managed by an agency - ie those required to meet its 'mission-critical' needs as identified in the user needs assessment. The process of describing the structure and inter-relationships between a group of datasets is referred to as data modelling, and various language and diagramming aids exist to standardise this. The process of data modelling is facilitated by dialogue with domain experts who are familiar with the dependencies and inter-relationships between the major themes.

The first step in the development of a data model is to study the functional specification resulting from the user needs assessment. Consideration of this document, together with discussions with both users and experts, permits determination of the basic 'items of interest' and hence the initial entities of the data model. The next step is to determine what relationships exist between the entities that have been identified. It is important at this stage to concentrate on the 'natural' relationships which exist, rather than just those which it is thought may be computerised.

Data models are often represented in a formal manner. The most popular representation is the entity-relationship (E-R) model, first described by Peter Chen in 1976. This model provides a very clear diagrammatic representation of the top-level objects to be modelled in a domain. In the original paper, Chen set out the foundation of the model; it has since been extended and modified by Chen and many others. In addition, the E-R model has been made part of a number of Computer Aided Software Engineering (CASE) tools (see UNEP/WCMC 1995). Today, there is no single E-R model, although most share the features outlined below.

• Entities. Items of interest (concrete or abstract) whose attributes are being measured. Entities are represented as tables in a physical database.
• Attributes. Properties of an entity which are measured to produce data (eg 'designation' is an attribute of the 'Protected Areas' entity). Attributes are represented as columns or 'fields' in database tables, such that all instances of a given entity are structured similarly.
• Relationships. Descriptions of how two entities relate to one another (eg 'species' may be related to 'genera' by a 'belongs to' or 'many-to-one' relationship).

[Figure 11: E-R model, showing entities such as Species, Countries and Alternate Names.]

Figure 11 illustrates this. Note that alternative symbols may be used to construct entity-relationship diagrams. The notation adopted in this document follows that of Ashworth and Goodland (1990). Connecting lines between entities are single or forked depending on their relationship, forked lines indicating the 'many' side of a one-to-many or many-to-many relationship (see Kroenke 1992 or UNEP/WCMC 1995).
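To make the E-R concepts above concrete, the sketch below expresses a 'belongs to' (many-to-one) relationship between species and genera as two Python classes. This is an illustrative sketch only, not the Framework's own notation or any particular project's schema; the class and attribute names are hypothetical.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Genus:
        # One genus may contain many species (the 'one' side of the relationship).
        name: str
        species: List["Species"] = field(default_factory=list)

    @dataclass
    class Species:
        # Each species 'belongs to' exactly one genus (the 'many' side).
        name: str
        genus: Genus

    acer = Genus("Acer")
    palmatum = Species("Acer palmatum", genus=acer)
    acer.species.append(palmatum)

    # In a physical database each class becomes a table, each attribute a column,
    # and the many-to-one link a foreign key from the species table to the genera table.
    print(palmatum.name, "belongs to the genus", palmatum.genus.name)

In E-R notation the same relationship would be drawn with a single line at the Genus end and a forked line at the Species end.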
The advantages of producing a data model are:

• improved dialogue between users, and consequent development of better data structures
• identification of redundant data
• improved capacity to identify data validation criteria
• a formal, possibly automated, method for implementing the physical database.

Prepare Entity-Relationship diagrams to explore data relationships and record the data model.

4.6 EQUIPMENT REQUIREMENT

4.6.1 Overview

Following the data modelling phase, the next step is to study how the data will be used in practice. This involves analysing what kind of integration, analysis and communication processes will be applied to the data, with the intention of deciding what kind of data management equipment is required.

Before embarking on the potentially costly process of selecting computer hardware and software, it is worth deciding whether or not such equipment is actually justified. Some advantages of computerised data management include the ability to:

• enforce consistency and structure in data storage, which contributes to data quality
• automate validation during data entry
• analyse large volumes of data
• produce multiple and varied reports from the same data.

Developers considering whether to invest in data management software should ask the following questions:

• do the data contain relationships too complex for the capabilities of a manual filing system or word processor?
• will the quantity of data be too much for manual methods or word processing to handle efficiently?
• will it be necessary to integrate data from several sources into a combined output?
• will there be a need for the data to be shared amongst more than one user in a single institution, or with other institutions?
• do the data require extensive searching, sorting, or updating?
• will frequent reporting of the data be required?

If the answer to some of these questions is yes, then the use of specialist data management software should be considered. If the answer is yes to many of them, then such software is certainly required.

Evaluate whether a special-purpose computer system is required before proceeding.

4.6.2 The Selection Process

Assuming that computer hardware and software are required, the following questions about the database should be answered in order to specify the need:

• How big is the database? How many individual entities will be included? How many cases (instances) of each entity are there?
• Are any special data types needed, such as spatial data, large volumes of text, images, sounds, or video? Will document storage and searching be necessary?
• How many people need access to the database? Will they be sharing a single computer or using a network? Are they all in the same institution or physical location?
• What are the long-term plans for the database? Will the scope or the number of users grow?
• How much computer experience does the implementing agency have (eg for technical support and maintenance)? How much time is there to learn new software?
• How much money is available to spend on hardware and software?

4.6.3 Software

The most commonly used form of data management software on the market today is the relational database management system (RDBMS). These offer good flexibility and performance at modest cost, although they do not deal easily with large-scale textual sources (see Section 6.3.4).
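As a brief illustration of the relational approach, the sketch below uses Python's built-in sqlite3 module as a stand-in for whichever RDBMS a project eventually selects. The tables, fields and records are hypothetical examples, not a prescribed structure.

    import sqlite3

    # Two related tables: a lookup table of standard designations, and a table of
    # protected areas that references it, so 'designation' can only take agreed values.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE designations (code TEXT PRIMARY KEY, label TEXT NOT NULL);
        CREATE TABLE protected_areas (
            area_id          INTEGER PRIMARY KEY,
            name             TEXT NOT NULL,
            designation_code TEXT NOT NULL REFERENCES designations(code)
        );
    """)
    con.executemany("INSERT INTO designations VALUES (?, ?)",
                    [("NP", "National Park"), ("NR", "Nature Reserve")])
    con.executemany("INSERT INTO protected_areas VALUES (?, ?, ?)",
                    [(1, "Example Park", "NP"), (2, "Example Reserve", "NR")])

    # A join reassembles the related tables into a single report.
    for name, label in con.execute("""
            SELECT pa.name, d.label
            FROM protected_areas pa
            JOIN designations d ON pa.designation_code = d.code
            ORDER BY pa.name"""):
        print(name, "-", label)

The same structure scales to many tables, users and reports, which is where an RDBMS earns its keep compared with flat files or a word processor.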
Many evaluation criteria can be used to select a suitable data management package, some key examples of which are given below:

• is it powerful enough to manage the expected volume of data?
• will it meet user expectations in terms of look and feel?
• does it contain good facilities for applications development? (the amount of money spent on applications development usually exceeds the initial cost of the software, so short development periods can result in significant savings)
• is it a popular product which will continue to be supported and enhanced? (it can be beneficial to forsake the latest technology for the stability and support of a well-established product).

The above criteria should be evaluated against the requirements of the physical database design. However, counting the number of check marks in each case is a poor way to compare products, since key features like speed and reliability overshadow lesser capabilities. Ideally, the software should be tested under realistic local conditions. Published software 'benchmarks' are often optimistic and may not reflect the demands of the intended database. Many important software characteristics are subjective. These include ease of use, consistency of the user interface, and the expressiveness of a programming language. Selecting a software package purely from a list of features is unlikely to be satisfactory; nothing can substitute for examining a live installation.

Reputable computer magazines often contain advertisements and wide-ranging reviews of software packages, although these too can be biased (software reviewers sometimes have connections with vendors whose products are under review). If you rely on published reviews, temper the prejudices of any one reviewer by using several sources.

Computer bulletin boards are another source of outside expert advice. The vendors of very popular software packages usually maintain bulletin boards which may be accessed via services such as Internet newsgroups and CompuServe Forums. Bulletin boards not only store objective assessments of software, but can also provide solutions to technical problems via a network of remotely connected users. Knowledge can often be gained simply by observing the debates and comments of other users.

When selecting a software package consider the criteria that are of most importance to the project; prioritise these and then assess how well different products perform.

4.6.4 Hardware

Depending on the capability of existing hardware to support the desired design, and the availability of resources to acquire further equipment, new computer hardware may be commissioned to implement the design. Common architectures for this include:

• stand-alone computers
• locally networked computers with database software residing on a file server machine (LAN)
• client-server architecture
• a fully distributed database consisting of a series of remotely networked computers communicating via permanent or dial-up communication lines (WAN).

The third option, client-server, is becoming an increasingly popular solution to the data processing needs of medium to large-sized organisations. This architecture is a hybrid of the stand-alone and the traditional network options. It integrates the best characteristics of personal computers (friendly software and quick response) with the best traits of powerful centralised servers (high storage capacity, data exchange, strong security).
The client-server architecture divides tasks between the user's computer running 'client' software, and a central computer running 'server' software. Typically, critical datasets are stored on the server, where they are managed very securely and can be processed at great speed. The client software (running on the user's computer) sends requests to the server software when data processing is required. The processing then takes place on the server and the results are sent back to the client. Many clients can communicate with the server at once, allowing flexible, yet highly secure, data processing.

Key issues to bear in mind when selecting a suitable platform are:

• Scaleability. As the number of users, records, or features grows, an application that once performed perfectly well on a low-cost architecture can drop off in performance quickly. Typically, stand-alone or small network computer architectures are most likely to suffer from this problem, which explains the rise of more sophisticated architectures such as client-server.
• Connectivity. To enable rapid exchange of data between individuals and agencies, electronic connectivity is very desirable. This could take the form of a group of locally networked computers sharing a common storage area, or more sophisticated dial-up communication lines to external services such as the Internet and private networks. The capacity to connect computers together is becoming increasingly recognised as the key to rapid dispersal and exchange of data.
• Compatibility. The issue of hardware and software compatibility is now diminishing in importance as manufacturers evolve a range of 'standard' specifications for their products. However, the so-called standards are still too varied and numerous to discount the problem entirely. As far as hardware is concerned, the major decision on compatibility is whether to adopt IBM-PC compatible computers, Macintosh computers, or (usually larger) workstations running the UNIX operating system. Within this broad classification, issues such as operating system choice, emulation software availability, network operating system (eg Novell, Vines, Lantastic), and connectivity protocols between databases (eg ODBC) tend to dominate. At all stages, the best solution is to adopt technology which has been proven to be reliable and useful in circumstances similar to those anticipated, working on the principle that in such cases compatibility issues are unlikely to cause serious disruption.

When selecting computer hardware, attention should be paid to its scaleability, connectivity and compatibility with existing equipment.

4.7 PHYSICAL DESIGN

Physical database design involves adapting the logical design to the requirements of the equipment used for implementation. Transformation of the logical design into the physical design is usually straightforward: entities in the data model become tables in the physical model, and attributes become table fields. The way in which relationships between the entities are dealt with depends on which data management software is used (see Section 4.6.3). If the chosen package does not support some types of relationship, then this has to be resolved by altering the logical design. Each field in the database should be documented in terms of its purpose, data type, size, and order in its corresponding table.
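One convenient way to capture this field-level documentation is as a simple, machine-readable list, which can then be pooled into the data dictionary described below. The sketch is illustrative only; the table, field names, types and sizes are hypothetical.

    # Field-level documentation for one (hypothetical) table, recorded as
    # (table, field, purpose, data type, size, order) entries.
    FIELD_DEFINITIONS = [
        ("protected_areas", "area_id",     "unique identifier of the site",  "integer", None, 1),
        ("protected_areas", "name",        "official site name",             "text",     120,  2),
        ("protected_areas", "designation", "national designation category",  "text",      40,  3),
        ("protected_areas", "iucn_cat",    "IUCN management category",       "text",       3,  4),
    ]

    def describe(table):
        """Print the documented definition of every field in the given table."""
        for tbl, fld, purpose, dtype, size, order in FIELD_DEFINITIONS:
            if tbl == table:
                width = "({})".format(size) if size else ""
                print("{}. {}: {}{} - {}".format(order, fld, dtype, width, purpose))

    describe("protected_areas")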
When pooled across all the tables of the database, these definitions are known as the data dictionary of the database, and provide a complete description of its structure, format, and use. The business world is highly heterogeneous, and a database for one company is unlikely to use the same data dictionary as that of another. In contrast, it is likely that countries and organisations managing biodiversity data will be recording and tracking many of the same parameters. Thus, in the interests of data exchange and co-operation with external partners, notice should be taken of existing standards and common practices (see Section 4.3).

There are currently several international projects to assemble environmental thesauri (see UNEP/WCMC 1995). These are being developed in multi-lingual versions (primarily European languages at this stage). The most mature of these thesauri is the INFOTERRA Thesaurus of Environmental Terms (UNEP 1990), which currently contains around 1,600 terms. This number is not sufficient to cover many local terms, and must therefore be augmented in such situations.

During the transformation of large databases from logical to physical design, CASE (Computer Aided Software Engineering) tools can prove useful. These allow E-R diagrams to be drawn up, and used to validate and maintain the logical database design. Some CASE tools are also able to output the E-R diagrams directly into a Data Definition Language (DDL) that prescribes the physical database design.

Compile a data dictionary for the database using standard terminology and thesauri where possible.

4.8 IMPLEMENTATION ISSUES

4.8.1 Data Entry

Following completion of the physical design, the latter is transferred to the selected hardware and software (see Section 4.6.2) by creating the appropriate database tables. The next step is to populate these tables with the required data. Ideally, all the necessary data have been computerised previously and are available in electronic format for importation into the database. However, data are frequently in the wrong format or available only in hard copy form. In such cases they must be converted into an appropriate form for importation, or entered manually into the database via the keyboard or other input tool (eg a scanner or digitising tablet in the case of maps).

Custom programs can be designed to regulate and validate data entry in many database and spreadsheet packages. This idea can be extended to automate other processes such as querying and reporting data, and 'downloading' data for exchange. A database which is accompanied by automated data entry or other procedures is often referred to as a database 'application'.

Where data are entered via the keyboard, validation checks should begin with rigorous examination of the raw, normally hard-copy, data sources. This can be a labour-intensive and tedious task, but is very important for maintenance of data quality (see Section 5.3). Where data are not entered directly, but are imported from another electronic source, validation checks should be performed on all the imported data. As an illustration of how errors can be introduced into a database by manual typing, suppose that a data entry screen has 10 fields, and that each field takes on average 6 characters to fill. If the success rate of the typist is 99% per character, then the chance of the whole screen being entered correctly is (0.99)^60, which, surprisingly, is only about 55%.
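The 55% figure follows directly from compounding the per-character success rate over all 60 characters on the screen; a quick, purely illustrative check:

    # Probability that a screen of 10 fields x 6 characters is typed with no errors,
    # given a 99% per-character success rate.
    fields = 10
    chars_per_field = 6
    per_char_accuracy = 0.99

    p_screen_correct = per_char_accuracy ** (fields * chars_per_field)
    print("{:.1%}".format(p_screen_correct))   # prints 54.7%, ie roughly 55%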
An example of the types of validation check applied to species distribution records prior to entering a database is presented below (Richardson 1994):

• records are checked to see that all required data fields are present
• scientific names are checked for validity
• grid references of terrestrial species are checked to confirm that they fall over land, not water
• the presence of a species in a certain location is tested against a prediction based on bioclimatic factors, and outliers are flagged for further validation.

For large applications, it is a good idea to write special-purpose validation routines, or to take advantage of the automated procedures offered by most data management software. Such routines perform 'reasonableness' checks on field values, such as ranges for numeric fields, or string length for character fields. It may also be possible to enforce consistency checks such as capitalisation and hyphenation. Finally, many packages permit the user to select values from a set of predefined choices. This eliminates the possibility of typographic errors, and can speed up data entry considerably.

Data validation procedures should be established to reduce errors during database population.

4.8.2 Synonyms and Equivalent Terms

In a typical data management package, data are retrieved by means of structured requests or 'queries'. Thus, if the user looks for information on protected areas by providing the search string 'protected area', the search will fail to retrieve records marked 'park', 'reserve' or 'sanctuary', despite the semantic similarity. The problem of synonyms and equivalent terms is particularly prevalent in the environmental domain due to its heterogeneous make-up.

This difficulty can be overcome by developing custom search routines using the facilities of the software, and offering them to the user as menu or push-button options. An on-line thesaurus can also assist the user by providing a series of alternative search terms. This can be done in a passive mode, by suggesting the terms to the user on request, or in an active mode, where the thesaurus is automatically consulted during the search process to identify synonyms and semantic matches.

4.8.3 Hierarchical Data

Hierarchical structures are required to manage many forms of biodiversity data, including species names (order, family, genus, species), geographic relationships (region x is located in country y, in continent z), and other multi-level classification systems used for the description of land use, vegetation, and other ecological units.

In a recent study from Australia, Richardson (1994) highlighted the problems encountered when establishing a taxonomic database structure, and the need for these to be tackled during the early stages of the system development process. The same kind of problem, which arises whenever an attempt is made to manage data that are not formalised, complete, or even agreed, also occurs with habitat and ecosystem classification categories. Firstly, systems had to be designed to integrate differing standards between disciplines (eg botany and zoology) and between institutions. This is especially common at the generic level, where different practices can result in the 'splitting' or 'lumping' of genera. Secondly, taxonomic standards change with time, as knowledge of the phylogenetic relationships between species improves.
Thus data supplied by different sources may use differing names for the same species, and the database structure must be able to integrate these synonyms. This situation may also arise when it is discovered that taxa previously thought of as one species consist of two or more and, as a result, part of the data for a species is included under the wrong name. Richardson suggested that taxonomic database structures should take into account the following:

• Formal Categories. The family, genus, species, sub-species, other infra-specific categories, and corresponding authorities of the taxa (family name is included because the same name may be used for genera of plants and animals).
• Applied Categories. Users may need to associate other names with the formal categories, such as synonyms and common names. Applied categories should be fully referenced in terms of authority, date and source.

For example, in the BG-BASE database used at WCMC to store data on threatened plants and plant collections, plant names are stored in a five-tier hierarchy comprising the Names, Genera, Families, Orders, and Subclasses tables (see Figure 12). Note that a sixth table containing synonyms (not shown) is linked to the Names table. The hierarchy described stores plant names with minimum storage overhead and, with properly structured reports, can be used to respond to queries such as 'list all distribution records of species belonging to the same family as Acer palmatum'.

[Figure 12: E-R diagram showing the relationships between tables in BG-BASE, including the Subclasses, Families, Names, Distribution, Locations, Sources and Protected Areas tables.]

7 SUPPORTING MATERIALS

7.1 REFERENCES

Archer, H. and Croswell, P.L. 1989. Public access to geographic information systems: an emerging legal issue. Photogrammetric Engineering and Remote Sensing 55:1575-1581.

Aronoff, S. 1989. Geographic Information Systems - A Management Perspective. WDL Publications, Ottawa, Canada.

Ashworth, C. and Goodland, M. 1990. SSADM: A Practical Approach. McGraw Hill.

Ayers, L.F. and Kottman, C.A. 1994. A call for GIS certification. GIS World 7(12):48-52.

British Standards Institute 1994. Draft BS ISO 14001: Environmental Management Systems - Specification with guidance for use. Technical Committee ESS/1 - Environmental Management Systems, British Standards Institute, London, UK.
Burnett, J., Copp, C. and Harding, P. 1995. Biological Recording in the United Kingdom: Present practice and future development. Summary report. Department of the Environment, London, UK.

Busby, J.R. and Walton, D.W. 1994. A National Biological Survey for the United States? Comparable Australian Activities at the National Level. In: Biodiversity - Broadening the Debate. Longmore, R. (Editor). Australian Nature Conservation Agency, Canberra.

Burley, C. 1994. CIESIN Metadata Entry Form Instructions. CIESIN.

Chen, P.P. 1976. The entity-relationship model - toward a unified view of data. ACM Trans. on Database Systems 1(1):9-36.

Clark, G.L. 1981. Law, the state and the spatial integration of the United States. Environment and Planning A 13(10):1197-1232.

Codd, E.F. 1970. A relational model of data for large shared data banks. Comm. ACM 13(6):377-387.

Codd, E.F. 1979. Extending the database relational model to capture more meaning. ACM Trans. Database Systems 4(4):397-434.

Connell, J.L. and Shafer, L.B. 1989. Structured Rapid Prototyping: An Evolutionary Approach to Software Development. Prentice-Hall, Englewood Cliffs, NJ.

Cooley, G.P., et al. 1992. Collections and Research Information System Master Plan. MTR 92W00083. Mitre, McLean, Virginia, USA.

Cooley, G.P., Harrington, M.B., and Lawrence, L.M. 1993. Analysis and Recommendations for Scientific Computing and Collections Information Management of Free-Standing Museums of Natural History and Botanical Gardens. MTR-93W0000109V1, Vol. 1. Mitre, McLean, Virginia, USA.

Crain, I.K. 1992. User requirements for the Harmonization of Environmental Measurement Information System (HEMIS). UNEP-HEM, Munich, Germany. 86pp.

Crain, I.K. (Editor) 1994. An Introduction to HEM and the HEMDisk. UNEP-HEM, Munich, Germany.

Cutts, G. 1991. Structured Systems Analysis and Design Methodology. 2nd edition. Blackwell Scientific, Oxford, UK. 481pp.

Date, C.J. 1983. An Introduction to Database Systems. Vol. II. Addison-Wesley, Reading, Mass. 383pp.

Date, C.J. 1990. An Introduction to Database Systems. 5th edition, Vol. I. Addison-Wesley, Reading, Mass.

DeMarco, T. 1979. Structured Analysis and System Specification. Prentice-Hall, Englewood Cliffs, NJ. 352pp.

Edwards, P. 1983. Systems Analysis and Design. Mitchell McGraw Hill.

Epstein, E.F. 1990. Access to Information: Legal Issues. Proceedings of the XIX Congress of the International Federation of Surveyors (FIG), Vol. 3. pp. 92-99.

Fidel, R. 1987. Database Design for Information Retrieval. John Wiley, New York. 232pp.

Fitzgerald, G., Stokes, N., and Wood, J.R.G. 1985. Feature analysis of contemporary information system methodologies. Computer Journal 28(3):223-230.

Flaaten, P. 1989. Foundations of Business Systems. Dryden Press.

Gane, C. 1990. Computer Aided Software Engineering: The Methodologies, the Products and the Future. Prentice-Hall, Englewood Cliffs, NJ. 220pp.

Gause, D.C. and Weinberg, G.M. 1989. Exploring Requirements: Quality Before Design. Dorset House Publishing Company.

Hammond, A., et al. 1995. Environmental Indicators: A Systematic Approach to Measuring and Reporting on Environmental Policy Performance in the Context of Sustainable Development. World Resources Institute, Washington.

Howe, D.R. 1983. Data Analysis for Data Base Design. Arnold, London. 307pp.

Jordan, E. and Machesky, J. 1990. Systems Development. PWS-Kent.

Kroenke, D.M. 1992. Database Processing. Macmillan.

Maddison, R.N., et al. 1983. Information System Methodologies. Wiley Heyden, Chichester, UK. 128pp.
Malamud, C. 1989. INGRES: Tools for Building an Information Architecture. Van Nostrand Reinhold.

Margules, C.R. and Redhead, T.D. 1995. Guidelines for Using the BioRap Methodology and Tools. CSIRO, Australia.

McLean, I. 1989. Democracy and New Technology. Polity Press, Cambridge. 204pp.

MUIENR/WCMC 1995. Biodiversity Data Bank: Software Overview. Makerere University Institute of Environment and Natural Resources, Kampala, Uganda.

Obermeyer, N.J. and Pinto, J.K. 1994. Managing Geographic Information Systems. The Guilford Press. 226pp.

Olle, T.W., Sol, H.G., and Verrijn Stuart, A.A. (Editors) 1982. Information Systems Design Methodologies - A Comparative Review. North-Holland, Amsterdam.

Olle, T.W., Sol, H.G., and Tully, C.J. (Editors) 1983. Information Systems Design Methodologies - A Feature Analysis. North-Holland, Amsterdam.

Onsrud, H.J. 1989. Legal and Liability Issues in Publicly Accessible Land Information Systems. Proc. GIS/LIS, Vol. 1. pp. 295-300.

Oxborrow, E. 1989. Databases and Database Systems. Chartwell-Bratt, Bromley, UK. 254pp.

Pinborg, U. 1992. Catalogue of Data Sources (CDS) for the Environment: Analysis and Suggestions for a Meta-data System and Service. European Environment Agency.

Powers, M.J. and Cheney, P.H. 1990. Structured Systems Development. Boyd and Fraser Publishing.

Rhoads, A.F. 1990. A modern, computer-accessed Flora of Pennsylvania: a tool for resource managers. In: Ecosystem Management. New York State Museum, New York.

Richardson, B.J. 1994. The industrialisation of scientific information. In: Forey, P.L., Humphries, C.J., and Vane-Wright, R.I. (Editors) Systematics and Conservation Evaluation. Systematics Association Special Volume 50:123-131. Clarendon Press, Oxford, UK.

Robinson, H. 1981. Database Analysis and Design. Chartwell-Bratt, England. 375pp.

Rock-Evans, R. 1981. Data Analysis. IPC Business Press.

Smith, J.M. and Smith, D.C.P. 1977. Database Abstractions: Aggregation and Generalization. ACM Trans. Database Systems 2(2):105-133.

Stein, B.A. 1994. Strengthening National Capacities for Biodiversity Information Management. The Nature Conservancy, USA.

Townsend, J.T. 1992. Introduction to Databases. Que.

Ullman, J.D. 1982. Principles of Database Systems. 2nd edition. Computer Science Press, Rockville, Maryland. 484pp.

INFOTERRA 1990. Thesaurus of Environmental Terms. 3rd edition. UNEP, Nairobi.

UNEP 1992. The Grid Meta-Database (MDb) Entity-Attribute Definitions. UNEP-GRID, Geneva.

UNEP 1993. Guidelines for Country Studies on Biological Diversity. United Nations Environment Programme, Nairobi, Kenya.

UNEP/WCMC 1995. Electronic Resource Inventory: A searchable resource for biodiversity data management. WCMC, Cambridge, UK. [Windows 3.1 or higher].

Verheijen, G.M.A. and Bekkum, J. van 1982. NIAM: An Information Analysis Method. In: Olle, T.W., et al. (Editors), Information Systems Design Methodologies - A Comparative Review. North-Holland, Amsterdam.

Vinden, R.J. 1982. Data Dictionaries for Database Administrators. TAB Books, Blue Ridge Summit, PA. 176pp.

Van Dijkhuizen, H. 1994. World Bird Database: User Requirement Specification and System Design Specification. BirdLife International, Cambridge, UK.

Walter, K.S. 1989. Designing a computer-software application to meet the plant-record needs of the Arnold Arboretum. Arnoldia 49(1):42-53.

WCMC 1993. Availability of Biodiversity Information in East Africa. WCMC, Cambridge, UK.
WCMC 1994. The Biodiversity Clearing House - Concept and Challenges. WCMC Biodiversity Series No. 2, World Conservation Press, Cambridge, UK. 34pp.

WDC 1991. Directory Interchange Format Manual (version 4.0). World Data Center, NASA, USA.

Wiggins, L.L. and French, S.P. 1992. Geographic Information Systems: Assessing Your Needs and Choosing a System. Planning Advisory Service Report. American Planning Association, Chicago.

WRI/IUCN 1993. Biodiversity Indicators for Policy Makers. World Resources Institute, Washington. 42pp.

WRI/UNDP 1990. World Resources 1990-92. Oxford University Press, New York. pp. 13-19.

WRI/UNEP/IUCN 1995. National Biodiversity Planning - Guidelines Based on Early Experiences Around the World. World Resources Institute, Washington.

WWF-India 1994. Indira Gandhi Conservation Monitoring Centre: A Profile. WWF-India, New Delhi.

Yourdon, E. 1975. Techniques of Program Structure and Design. Prentice-Hall, Englewood Cliffs, NJ.

Yourdon, E. 1979. Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design. Prentice Hall.

Yourdon, E. Undated. The CASE Report. Nastec Corporation, Southfield, MI, USA.

7.2 GLOSSARY

7.2.1 Biodiversity Terms

Accession. A sample of a crop variety collected at a specific location and time; may be of any size.

Alien species. A species occurring in an area outside of its historically known natural range as a result of intentional or accidental dispersal by human activities (also known as an exotic or introduced species).

Artificial insemination. A breeding technique, commonly used in domestic animals, in which semen is introduced into the female reproductive tract by artificial means.

Assemblage. See 'Community'.

Biochemical analysis. The analysis of proteins or DNA using various techniques, including electrophoretic testing and restriction fragment length polymorphism analysis. These techniques are useful methods for assessing plant diversity and have also been used to identify many strains of micro-organisms.

Biodiversity. See 'Biological diversity'.

Biogeography. A branch of geography that deals with the geographical distribution of animals and plants.

Biological diversity. Means the variability among living organisms from all sources including, inter alia, terrestrial, marine and other aquatic ecosystems and the ecological complexes of which they are part; this includes diversity within species, between species and of ecosystems.

Biological Oxygen Demand (BOD). The amount of dissolved oxygen consumed by micro-organisms as they decompose organic material in polluted water. Measurement of the rate of oxygen take-up is used as a standard test to detect the polluting capacity of effluent; the greater the BOD value (and hence the greater the presence of oxygen-consuming micro-organisms), the greater the volume of pollutant present.

Biological resources. Includes genetic resources, organisms or parts thereof, populations, or any other biotic component of ecosystems with actual or potential use or value for humanity.

Biologically unique species. A species that is the only representative of an entire genus or family.

Biome. A major portion of the living environment of a particular region (such as a fir forest or grassland), characterised by its distinctive vegetation and maintained by local climatic conditions.

Bioregion (bioregional planning). A territory defined by a combination of biological, social, and geographic criteria, rather than geopolitical considerations; generally, a system of related, interconnected ecosystems.
Biosphere reserve. Established under UNESCO’s Man in the Biosphere (MAB) Program, biosphere reserves are a series of protected areas intended to demonstrate the relationship between conservation and development. Biota. The living organisms of a region. Biotechnology. Techniques that use living organisms or substances from organisms to make or modify a product. The most recent advances in biotechnology involve the use of recombinant DNA techniques and other sophisticated tools to harness and manipulate genetic materials. Biotic. Pertaining to any aspect of life, especially to characteristics of entire populations or ecosystems. Breed. A group of animals or plants related by descent from common ancestors and visibly similar in most characteristics. Taxonomically, a species can have numerous breeds. 748 Supporting Materials Breeding line. Genetic lines of particular significance to plant or animal breeders that provide the basis for modern varieties. Buffer zone. The region near the border of a protected area; a transition zone between areas managed for different objectives. Captive breeding. The propagation or preservation of animals outside their natural habitat, involving control by humans of the animals chosen to constitute a population and of mating choices within that population. Carrying capacity. The maximum number of people, or individuals of a particular species, that a given part of the environment can maintain indefinitely. Chromatography. A _ chemical analysis technique whereby an extract of compounds is separated by allowing it to migrate over or through an adsorbent (such as clay or paper) so that the compounds are distinguished as separate layers. Climax community. The end of a sequence of successions; a community that has reached stability under a _ particular set of environmental conditions. Clonal propagation. The multiplication of an organism by asexual means such that all progeny are genetically identical. In plants, it is achieved through use of cuttings or in vitro culture. For animals, embryo splitting is a method of clonal propagation. Co-management. The sharing of authority, responsibility, and benefits between government and local communities in the management of natural resources. Common property resource management. The management of a specific resource (such as a forest or pasture) by a well-defined group of resource users with the authority to regulate its use by members and outsiders. Community. A group of ecologically related populations of various species of organisms occurring in a particular place and time. Comparative advantage. Relative superiority with which a region or state may produce a good or service. Complementarity. The concept of achieving conservation efficiently by ensuring that a set of areas is assembled with due regard to the additional species that each brings into the network. This is the basis of a critical faunas analysis. Conservation. The management of human interactions with genes, species, and ecosystems so as to provide the maximum benefit to the present generation while maintaining their potential to meet the needs and aspirations of future generations; encompasses elements of saving, studying, and using biodiversity. Country of origin of genetic resources. Means the country which possesses those genetic resources in in-situ conditions. Country providing genetic resources. 
Means the country supplying genetic resources collected from in-situ sources, including populations of both wild and domesticated species, or taken from ex-situ sources, which may or may not have originated in that country. Critical faunas analysis. Is a methodology to identify the minimum set of areas which would contain at least one viable population of every species in a given animal or plant group. Critical habitat. A technical classification of areas in the United States that refers to habitats essential for the conservation of endangered or threatened species. The term may be used to designate portions of habitat areas, the entire area, or even areas outside the current range of the species. Cryogenic storage. The preservation of seeds, semen, embryos, or micro-organisms at very low temperatures, below -130°C . At these temperatures, water is absent, molecular kinetic energy is low, diffusion is Supporting Materials 75 virtually nil, and storage potential is expected to be extremely long. Cryopreservation. See ‘Cryogenic storage’. Cultivar. A cultivated variety (genetic strain) of a domesticated crop plant (derived from ‘cultivated variety’). Cultural diversity. Variety or multiformity of human social structures, belief systems, and strategies for adapting to situations in different parts of the world. Cutting. Plant piece (stem, leaf, or root) removed from a parent plant that is capable of developing into a new plant. Cycad. Any of an order of gymnosperms of the family cycadaceae. Cycads are tropical plants that resemble palms but reproduce by means of spermatozoids. DNA. Deoxyribonucleic acid. The nucleic acid in chromosomes that codes for genetic information. Domesticated or cultivated species. Means species in which the evolutionary process has been influenced by humans to meet their needs. Domestication. The adaptation of an animal or plant to life in intimate association with and to the advantage of man. Ecology. A branch of science concerned with the interrelationship of organisms and their environment. Ecosystem. A dynamic complex of plant, animal, fungal, and micro-organism communities and their associated non-living environment interacting as an ecological unit. Ecosystem diversity. The variety of ecosystems that occurs within a larger landscape, ranging from biome (the largest ecological unit) to micro-habitat. Ecotourism. Travel undertaken to witness sites or regions of unique natural or ecological quality, or the provision of services to facilitate such travel. Electrophoresis. Application of an electric field to a mixture of charged particles in a solution for the purpose of separating (eg mixture of proteins) as they migrate through a porous supporting medium of filter paper, cellulose acetate, or gel. Embryo transfer. An animal breeding technique in which viable and _ healthy embryos are artificially transferred to recipient animals for normal gestation and delivery. Endangered species. A technical definition used for classification in the United States referring to a species that is in danger of extinction throughout all or a significant portion of its range. The International Union for the Conservation of Nature and Natural Resources (IUCN) definition, used outside the United States, defines species as endangered if the factors causing their vul- nerability or decline continue to operate. Endemic. Restricted to a specified region or locality. Endemic Bird Area (EBA). 
Is a term used by BirdLife International to describe areas with two or more restricted-range bird species entirely confined to them. Endemism. The occurrence of a species in a particular locality or region. Environmental Impact Assessment (EIA). A method of analysis which attempts to predict the repercussions of a proposed developments (usually industrial) upon the social and physical environment of the surrounding area. Equilibrium theory. A theory of island biogeography maintaining that greater numbers of species are found on larger islands because the populations on smaller islands are more vulnerable to extinction. This theory can also be applied to terrestrial analogues such as forest patches in agricul- Supporting Materials tural or suburban areas or nature reserves where it has become known as ‘insular ecology.’ Exotic species. An organism that exists in the free state in an area but is not native to that area. Also refers to animals from outside the country in which they are held in captive or free-ranging populations. Ex-situ. Pertaining to study or maintenance of an organism or groups of organisms away from the place where they naturally occur. Commonly associated with collections of plants and animals in storage facilities, botanic gardens or zoos Ex-situ conservation. The conservation of components of biological diversity outside their natural habitats. Extant. Species are those whose members are living at the present time. Extinct. As defined by the IUCN, extinct taxa are species or other taxa that are no longer known to exist in the wild after repeated search of their type of locality and other locations where they were known or likely to have occurred. Extinction. Disappearance of a taxonomic group of organisms from existence in all regions. Fauna. Organisms of the animal kingdom. Feral. A domesticated species that has adapted to existence in the wild state but remains distinct from other wild species. Examples are the wild horses and burros of the West and the wild goats and pigs of Hawaii. Flora. Organisms of the plant kingdom Forest Resource Accounting (FRA). A methodology for forest management based on the use of information for improved conservation and sustainable utilisation. Gamete. The sperm or unfertilised egg of animals that transmit the parental genetic information to offspring. In _ plants, functionally equivalent structures are found in pollen and ovules. Gene. A chemical unit of hereditary information that can be passed from one generation to another. Gene bank. A facility established for the ex situ. conservation of individuals (seeds), tissues, or reproductive cells of plants or animals. General Circulation Model (GCM). Global-scale computer model that simulates physical and chemical processes in the atmosphere, both at the present time and in the future under conditions of elevated concentrations of radiatively active gases (enhanced greenhouse effect). In some instances integrated with comparable processes occurring at the surface and within oceans and at the land surface. Genetic diversity. The variety of genes within a particular species, variety, or breed. Genetic drift. A cumulative process involving the chance loss of some genes and the disproportion ate replication of others over successive generations in a small population, so that the frequencies of genes in the population is altered. The process can lead to a population that differs genetically and in appearance from the original population. Genetic material. 
Means any material of plant, animal, microbial or other origin containing functional units of heredity. Gene-pool. The collection of genes in an interbreeding population. Genetic resources. Means genetic material of actual or potential value. Genotype. The genetic constitution of an organism as distinguished from its physical appearance. Genus. A_ category of biological classification ranking between the family and Supporting Materials B77 the species, comprising structurally or phylogenetically related species or an isolated species exhibiting unusual differentiation. Germplasm. The genetic material, especially its specific molecular and chemical constitution, that compromises the inherited qualities of an organism. Grassroots (organisations or movements). People or society at a local level, rather than at the centre of major political activity. Grow-out (growing-out). The process of growing a plant for the purpose of producing fresh viable seed to evaluate its varietal characteristics. Habitat. Is the environment in which an animal or plant lives, generally defined in terms of vegetation and physical features. Hotspot. Is an area on earth with an unusual concentration of species, many of which are often endemic to the area. Hybrid. An offspring of a cross between two genetically unlike individuals. Hybridisation. Crossing of individuals from genetically different strains, populations, or species. Important Bird Area (IBA). Sites of importance to birds, identified by BirdLife International and Wetlands International. The sites are identified for four groups of birds: regularly occurring migratory species which concentrate at and are dependent on particular sites either when breeding, or migration, or during the winter; globally threatened species (ie species at risk of total extinction); species and sub-species threatened throughout all or parts of their range but not globally; species that have relatively small total world ranges with important populations in specific areas. In-situ. Maintenance or study of organisms within an organism’s native environment. In-situ conservation. The conservation of biodiversity within the evolutionary dynamic ecosystems of the original habitat or natural environment. Inbreeding. Mating of close _ relatives resulting in increased genetic uniformity in the offspring. Indicator species. A species whose status provides information on the overall condition of the ecosystem and of other species in that ecosystem. Indigenous peoples. People whose ancestors once inhabited a place or country, and continue to live in conformity with their own social, economic, and cultural customs and traditions (also: ‘native peoples’ or ‘tribal peoples’) Intellectual Property Rights (IPR). Rights intended to protect knowledge from being exploited without consent. Inter-species. Between different species. Intrinsic value. The value of creatures and plants independent of human recognition and estimation of their worth. Introduced species. See ‘Alien species’. Inventory. On-site collection of data on natural resources and their properties. In vitro. (Literally ‘in glass’). The growing of cells, tissues, or organs in plastic vessels under sterile conditions on an artificially prepared medium. Island biogeography. The study of the relationship between island area and species number. This idea has also been applied to isolated areas of habitat in continental areas which are effectively islands for many species. 
The extent to which habitat fragmentation may lead to extinction of species can be predicted from _ the relationship between number of species and island area. 78 Supporting Materials Isoenzyme (Isozyne). The protein product of an individual gene and one of a group of such products with differing chemical structures but similar enzymatic function. Keystone species. A species whose loss from an ecosystem would cause a greater than average change in other species populations or ecosystem processes. Landrace. Primitive or antique variety usually associated with traditional agriculture. Often highly adapted to local conditions. Land Mapping Unit (LMU). The smallest are of land that can be delineated on a map of a particular scale. Used in land evaluation as the basis of spatial variation. Land Quality (LQ). A complex attribute of land, which acts in a manner distinct from the actions of other land qualities in its influence on the suitability of land for a specified kind of use. Land Use Requirements (LUR). The requirements are related to growth and yield of crops and trees, animal husbandry, land management and conservation. The expression of the conditions for successful implementation are described for each LUT, eg growth requirements of certain tree species. Land Utilisation Type (LUT). Described in terms of necessary inputs and expected results, based on a number of key attributes obtained from land use data; produce, capital input, labour input, farm size, land tenure, technical know-how, level of mechanism etc. LUTs relate to the physical social and economic conditions of the area and according to the development of objectives; description of the key attributes, reflecting biological, socio-economic and_ technical aspects of the production environment and which are relevant to the productive capacity of a LMU. Supporting Materials Living collections. A management system involving the use of off-site methods such as zoological parks, botanic gardens, arboretums, and captive breeding programs to protect and maintain biological diversity in plants, animals, and micro-organisms. Marine Protected Area (MPA). An area of sea (or coast) especially dedicated to the protection and maintenance of biological diversity, and of natural and associated cultural resources, and managed through legal or other effective means. Megadiversity countries. Are the small number of countries, located largely in the tropics, which account for a high percentage of the world’s biodiversity by virtue of containing very large numbers of species. Micro-organisms. In practice, a diverse classification of all those organisms not classed as plants or animals, usually minute microscopic or submicroscopic and found in nearly all environments. Examples are bacteria, cyanobacteria (blue-green algae), mycoplasma, protozoa, fungi (including yeasts), and viruses. Minimum Viable Population (MVP). The smallest isolated population having a good chance of surviving for a given number of years despite the foreseeable effects of demographic, environmental, and genetic events and natural catastrophes. Minor breed. A _ livestock breed not generally found in commercial production. Modelling. The use of mathematical and computer based simulations as a planning technique. Morphology. A branch of biology that deals with form and structure of organisms. Multiple use. 
An on-site management strategy that encourages an optimum mix of several uses on a parcel of land or water or by creating a mosaic of land or water parcels, each with a designated use within a larger geographic area. 79 Mycorrhizal fungi. A fungus living in a mutualistic association with plants and facilitating nutrient and water uptake. National income accounts. System of record by which the vigour of a nation’s economy is measured, (results are often listed as Gross National Product, or Gross Domestic Product). Native. Indigenous to a particular locality or region. Nitrogen fixation. A process whereby nitrogen fixing bacteria living in mutualistic associations with plants convert atmospheric nitrogen to nitrogen compounds that plants can utilise directly. Non-Governmental Organisation (NGO). A non-profit group or association organised outside of governmental structures to realise particular objectives (such as environmental protection) or serve particular constituencies (such as indigenous peoples). NGO activities range from research, information distribution, training, local organisation, and community service to legal advocacy, lobbying for legislative change, and civil disobedience. NGOs range in size from small groups within a particular community to huge membership groups with a national or international scope. Off-site. Propagation and preservation of plant, animal, and micro-organism species outside their natural habitat. On-site. Preservation of species in their natural environment. Open-pollinated. Plants that are pollinated by physical or biological agents (eg wind, insects) and without human intervention or control Orthodox seeds. Seeds that are able to withstand the reductions in moisture and temperature necessary for long-term storage and remain viable. Parataxonomists. Field-trained biodiversity collection and inventory specialists recruited from local areas. Participatory Rural Appraisal (PRA). Also known as Rapid Rural Appraisal, PRA is a relatively new and different approach for conducting action-oriented research in developing countries. PRAs are used to help involve villagers and local officials leaders in all stages of development work, from the identification of needs and decision making to the assessment of completed projects. The term can be used to describe any new methodology which makes use of a multidisciplinary team. Patent. A government grant of temporary monopoly rights on innovative processes or products. Pathogen. A_ disease-causing micro- organism, bacterium or virus. Phenotype. The observable appearance of an organism, as determined by environmental and genetic influences (in contrast to genotype). Phytochemical. Chemicals found naturally in plants. Phylogenetic. Pertaining to the evolutionary history of a particular group of organisms. Phylum. In taxonomy, a high-level category just beneath the kingdom and above the class; a group of related, similar classes. Population. A group of individuals with common ancestry that are much more likely to breed with one another than with individuals from another such group. Population and _ Habitat Viability Assessment (PHVA). The _ theoretical modelling of minimum areas, habitat types and population sizes, to sustain any one or more species. Population size will be determined by the carrying capacity of the habitat. 80 & Supporting Materials Population Viability Analysis (PVA). 
The theoretical determination of the minimum viable (in terms of genetic make-up) breeding population for any one species to survive in a given range. Predator. An animal that obtains its food primarily by killing and consuming other animals. Primary (or natural) forest. A forest largely undisturbed by human activities. Primary productivity. The transformation of chemical or solar energy to biomass. Most primary production occurs through photosynthesis, whereby green plants convert solar energy, carbon dioxide, and water to glucose and eventually to plant tissue. In addition, some bacteria in the deep sea can convert chemical energy to biomass through chemosynthesis. Protected Area (PA). An area of land and/or sea especially dedicated to the protection and maintenance of biological diversity, and of natural and associated cultural resources, and managed through legal or other effective means. Provinciality effect. Increased diversity of species because of geographical isolation. Recalcitrant seeds. Seeds that cannot survive the reductions in moisture content or lowering of temperature necessary for long- term storage. Recombinant DNA technology. Techniques involving modifications of an organism by incorporation of DNA fragments from other organisms using molecular biology techniques. Rehabilitation. The recovery of specific ecosystem services in a degraded ecosystem or habitat. Restoration. The return of an ecosystem or habitat to its original community structure, natural complement of species, and natural functions. Riparian. Related to, living, or located on the bank of a natural watercourse, usually a river, sometimes a lake or tidewater. Seedbank. A facility designed for the ex situ conservation of individual plant varieties through seed preservation and storage. Selection. Natural selection is _ the differential contribution of offspring to the next generation by genetic types belonging to the same populations. Artificial selection is the intentional manipulation by man of the fitness of individuals in a population to produce a desired evolutionary response. Serological testing. Immunologic testing of blood serum for the presence of infectious foreign disease agents. Somaclonal variations. Structural, physiological, or biochemical changes in a tissue, organ, or plant that arise during the process of in vitro culture. Species. A group of organisms capable of interbreeding freely with each other but not with members of other species. Species diversity. The number and variety of species found in a given area in a region. Species richness. Is the number of species within a specified region or locality. Spectroscopy. Any of several methods of chemical analysis that identify or classify compounds based on examination of their spectral properties. Stochastic. Models, processes, or procedures that are based on elements of chance or probability. Subspecies. A distinct form or race of a species. Succession. The more or less predictable changes in the composition of communities following a natural or human disturbance. Sustainable development. Development that meets the needs and aspirations of the Supporting Materials current generation without compromising the ability to meet those of future generations. Sustainable use. The use of components of biological diversity in a way and at a rate that does not lead to the long-term decline of biological diversity, thereby maintaining its potential to meet the needs and aspirations of present and future generations. Systematics. 
Systematics. The study of the historical evolutionary and genetic relationships among organisms, and of their phenotypic similarities and differences.
Taxon (pl. taxa). The named classification unit (eg Homo sapiens, Hominidae, or Mammalia) to which individuals, or sets of species, are assigned. Higher taxa are those above the species level.
Taxonomy. The classification of animals and plants based upon natural relationships.
Threatened species. A United States technical classification referring to a species that is likely to become endangered within the foreseeable future, throughout all or a significant portion of its range.
Tissue culture. A technique in which portions of a plant or animal are grown on an artificial culture medium in an organised (eg as plantlets) or unorganised (eg as callus) state.
Trophic level. Position in the food chain, determined by the number of energy-transfer steps to that level.
Variety. See 'Cultivar'.
Wild relative. Plant species that are taxonomically related to crop species and serve as potential sources of genes for breeding new varieties of those crops.
Wild species. Organisms, captive or living in the wild, that have not been bred to alter them from their native state.
Wildlife. Living, non-domesticated animals.
7.2.2 Information Management Terms
Application. Any special-purpose software fulfilling a specific function on the desktop. Applications can be general-purpose (eg a word processor) or custom-built to meet a specific requirement.
(Database) Application. A collection of tools (eg data entry screens, reports) which facilitate the operation of a database.
American Standard Code for Information Interchange (ASCII). A standard character set that assigns a numeric code to each letter, number, and selected control characters.
Attribute. Properties of an entity which are measured to produce data (eg 'designation' is an attribute of the 'Protected Areas' entity).
Benchmark. A numerical value that gives a measure of the performance of a computer product in a specific test.
Best Practice Technology (BPT). The compromise whereby industrial premises are allowed to emit higher than normally acceptable pollution levels due to exceptional circumstances; where these circumstances include the use of equipment which is not itself life-expired, the premises are in effect using the best practicable means available to them.
Bulletin board. Also known as a newsgroup, an 'area' on a WAN where text messages can be posted by an author, so that they are available to be read by anyone accessing the bulletin board.
CD-ROM (Compact Disc-Read Only Memory). A relatively new technology that uses laser-read discs with high data compression to store very large amounts of data. Data can only be read from the disc; it cannot be altered or re-written.
Central Processing Unit (CPU). The microchip that is the 'computer within the computer'; it logically co-ordinates the operations of all the other components of the computer.
Client-server. A computer architecture that is a hybrid of the traditional stand-alone and network options, with computing tasks shared between the server and the users' workstations.
Computer Aided Design (CAD). Software used for design in general; it facilitates geometrical drawing on the computer.
Computer Aided Software Engineering (CASE). Software used for designing and developing information systems and databases.
Data. Facts that result from measurements or observations.
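By way of illustration of the Attribute, Data and ASCII entries above, the following minimal Python sketch records the attributes of one entity and shows the numeric codes that ASCII assigns to individual characters. The protected area, its values and the character string shown are purely hypothetical examples, not drawn from any real dataset.

```python
# A hypothetical 'entity' (a protected area), its 'attributes' (name,
# designation, area) and the 'data' recorded against those attributes.
protected_area = {
    "name": "Example Forest Reserve",   # attribute: name
    "designation": "National Park",     # attribute: designation
    "area_km2": 125.0,                  # attribute: area in square kilometres
}

# Written out as a plain ASCII line, the record can be read on any system.
line = "{name},{designation},{area_km2}".format(**protected_area)
print(line)

# Each character is ultimately stored as a numeric ASCII code.
print([ord(character) for character in "CBD"])   # eg 'C' is 67, 'B' is 66, 'D' is 68
```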
Database. A logically structured and consistent set of data that can be used for analysis.
Database Management System (DBMS). Software which stores, maintains and retrieves data. May also offer a wide range of additional features for data analysis and management.
Data Definition (or Description) Language (DDL). A programming language used to describe the structure and content of data files and the relationships between them (often referred to as schemas). A data description language is included as one component of many database management systems.
Data dictionary. A repository of information about the definition, structure and use of data.
Data flow model. A representational tool showing how information flows in an organisation or process. Special symbols depict different kinds of flow.
Data model. A representational tool showing the structure and inter-relationships between data entities.
Dataset. A collection of data and accompanying documentation which relate to a specific theme (usually consisting of one or more computer-readable files on the same system).
DBF format. The data file format originally used by the dBASE product and now the most common PC DBMS format.
Digitising table. A device for inputting map features into a computer, for instance into a GIS.
Directory Interchange Format (DIF). A data structure originally defined by NASA, used to exchange directory-level information about datasets among information systems.
Dynamic Data Exchange (DDE). A mechanism of 'live link' which enables items of information in separate application programs to be inter-connected.
Electronic mail (email). A network (including Internet) resource allowing messages and files to be sent and received between computers.
Entity. Items of interest (concrete or abstract) whose attributes (properties) are being measured.
Entity-Relationship (E-R) diagram. A representational tool showing the relationships between entities in an information system.
Field. In the context of databases, a field is a vertical column in a database table.
File Transfer Protocol (FTP). An Internet resource allowing the exchange of files between remote computers.
Flatfile. A matrix of columns (fields) of data, where each row represents one record. Equivalent to the term 'Table' or 'Relation' in a relational database.
Flat-file database. The simplest type of database, which allows the user to work with only one table of data ('flat-file') at a time.
Geographic Information System (GIS). An information system that stores and manipulates data which is referenced to locations on the Earth's surface, such as digital maps and sample locations.
Geo-referenced data. Data which is connected to a specific location on the Earth's surface.
Global Positioning System (GPS). A data capture tool allowing mobile receivers to determine their position anywhere on the Earth's surface, in latitude and longitude coordinates, to an accuracy of fractions of a second of arc (1 second of arc of latitude is approximately 30 metres).
Graphical User Interface (GUI). Computer software that is controlled by the user through the selection of options and symbols from a pictorial presentation on the computer screen (Microsoft's Windows is the most frequently seen example). The contrasting approach is a 'command line' interface.
Hard copy. Data or information that has been printed out from a computer onto paper.
Hardware. The physical components of a computer system, such as the computers, disk drives and the screen.
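As a rough check of the figure quoted in the GPS entry above, the short Python calculation below, assuming an approximate value of 40,008 km for the Earth's meridional circumference, confirms that one second of arc of latitude corresponds to roughly 30 metres.

```python
# Approximate pole-to-pole-to-pole (meridional) circumference of the Earth, in km.
polar_circumference_km = 40008

metres_per_degree = polar_circumference_km * 1000 / 360      # metres in one degree of latitude
metres_per_second_of_arc = metres_per_degree / 3600          # metres in one second of arc

print(round(metres_per_degree))          # about 111,000 m per degree
print(round(metres_per_second_of_arc))   # about 31 m per second of arc
```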
Hyperlink. Hyperlinks are connections that have been programmed into a 'hypertext' document. A reader browsing a hypertext document can select a hyperlink symbol to be presented with additional text on the subject of interest.
IBM compatible. Describes equipment, ranging from personal computers to large mainframes, that can run operating or applications software written for equivalent IBM computers without alteration.
Index. A direct access method to data in a database. An index has a key value and a pointer to the row of the table that contains data with that key.
Information. Data which have been interpreted to facilitate understanding.
Information system. A structured set of people, processes, data and tools for converting data into information.
Interface. The way that users communicate with a computer system.
Internet. The most widely used international communications computer network.
Listserver. An Internet facility similar in concept to a bulletin board. The main difference is that each time a message is posted by an author to a listserver, it is sent out by electronic mail to all the subscribers of that listserver.
Local Area Network (LAN). A computer network operating within a site or institution.
Logical database design. The (conceptual) design of a database which is independent of implementation issues.
Mainframe. A multi-user computer designed to meet the needs of a large organisation; a mainframe has a greater capacity than that of a minicomputer or a microcomputer.
Menu. A list of options graphically presented for selection to the software application user.
Metadata. Data about data, for instance its location, source, content, or other specifics. Also known as co-data.
Metadatabase. A database which is designed to manage metadata.
Modem. A piece of equipment used to link digital devices such as computers to an analogue telephone line. The term is a contraction of modulator-demodulator.
Multimedia. The integration of many forms of data in an application, including text, sound, graphics and video.
Multitasking. A computing environment that allows several software packages to be run concurrently.
Network. A collection of computers that can communicate with each other.
Normalisation. In the context of databases, the process of organising data into a structure of one or more tables, where each column has a specific, unambiguous meaning. Normalisation is necessary to achieve the optimum structure for a relational database.
Object Linking and Embedding (OLE). A feature to transfer and share information between different software applications. For example, whilst within a word-processing document, a spreadsheet table can be directly worked upon using OLE.
Object Oriented (OO). A way of looking at processing problems and their solutions in terms of 'objects'. An object has a recognisable identity which includes information on its 'behaviour' and function. In contrast with conventional software, where program and data are separated, the object includes both the data and the procedures and functions that operate on it. Objects cooperate by sending messages to one another.
On-line database. An information retrieval service that can be accessed from computers dialling up over public networks.
Operating system. Software that controls access to all the resources of the computer and supervises the running of other programs. Examples of operating systems are MS-DOS, Windows and UNIX.
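To illustrate the Normalisation entry above, the minimal Python sketch below splits a flat table containing repeated country details into two related tables linked by a common field. The species, country and region values are purely hypothetical examples.

```python
# Before normalisation: the region is repeated in every row that mentions the country.
flat_table = [
    {"species": "Panthera leo",       "country": "Kenya", "region": "Africa"},
    {"species": "Loxodonta africana", "country": "Kenya", "region": "Africa"},
]

# After normalisation: each fact is stored once, in the table that describes it.
countries = {"Kenya": {"region": "Africa"}}            # one entry per country
occurrences = [                                        # one entry per species record
    {"species": "Panthera leo",       "country": "Kenya"},
    {"species": "Loxodonta africana", "country": "Kenya"},
]

# The region of any occurrence can still be recovered through the relationship.
first = occurrences[0]
print(first["species"], countries[first["country"]]["region"])   # Panthera leo Africa
```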
Optical Character Recognition (OCR). A technique for the rapid capture of text into a computer. First the text is scanned, then the image of each character in the text is analysed and converted into computer code. Characters that cannot be matched may be displayed on a screen for an operator to enter manually.
Personal Computer (PC). Otherwise known as a microcomputer, a single-user computer with a central processing unit based on a microprocessor chip.
Physical database. The actual physical structure of a database as implemented for a particular hardware or software configuration and database system.
Pixel. Abbreviation for picture element, meaning the smallest discrete elements that are used to create an image on a visual display unit.
Polygon Attribute Table (PAT). The database table associated with a spatial dataset, holding details (attributes) of the geographic objects.
Process. An activity, function or procedure applied to a resource (eg an arithmetic procedure applied to data, or a critical step in a business operation).
Process model. A representational tool consisting of language and diagramming standards representing the inter-relationships between a group of related processes.
Prototyping. A system development methodology which quickly develops a partial or preliminary version of a system to determine its feasibility and obtain user evaluation. Prototypes can then be refined into delivered applications.
Public domain. Intellectual property available to people without paying a fee. Most computer software developed at universities is in the public domain.
Query. A request to a database to select and extract data.
Random Access Memory (RAM). Dynamic memory provided by the computer's RAM microchips, sometimes known as central memory or core.
Raster graphics. The definition of an image to be produced on a computer screen is stored on a 'pixel-by-pixel' basis.
Record. A collection of data about a specific case or subject. In the context of databases, a record is a horizontal row in a database table.
Relational database. A database consisting of two or more tables related via common fields.
Relational Database Management System (RDBMS). Advanced DBMS software which allows the storage of multiple, related files.
Relationship. Describes how two entities are related to one another (eg 'species' may be related to 'genera' by a 'belongs to' relationship).
Server. Any program or computer that provides a service to other programs or users. A network server, for example, provides dedicated hardware and software for the purpose of giving terminals or computers access to a network.
Software. The programs that are run on a computer.
Spatial data. Data which contains a reference to a location (which may be a specific location on the Earth's surface, or relative to an arbitrary point).
Spreadsheet. A software program that allows users to establish relationships between rows and columns of data in a tabular format.
Structured design. A methodology for the design of information systems that breaks a program down into a series of modules with carefully specified interfaces between the modules.
Structured Query Language (SQL). The ANSI-standard data manipulation language used in most relational database systems.
Table. A physical entity in a relational database, in which data are laid out in rows and columns.
Theme. A broad data area which may be subdivided into datasets.
Vector graphics. The definition of an object's image to be produced on a computer screen is stored by defining its geometry as a series of connected points; to be contrasted with raster graphics.
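Several of the database terms above (relational database, field, record, relationship, query, SQL) can be illustrated together in a minimal sketch using Python's built-in sqlite3 module; the table and column names below are hypothetical examples, not part of any system described in this document.

```python
import sqlite3

connection = sqlite3.connect(":memory:")    # a temporary, in-memory relational database
cursor = connection.cursor()

# Data definition: two tables (relations) linked by the common field 'country_id'.
cursor.execute("CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT)")
cursor.execute("CREATE TABLE protected_area (name TEXT, country_id INTEGER)")

# Each INSERT adds one record (a horizontal row) to a table.
cursor.execute("INSERT INTO country VALUES (1, 'Kenya')")
cursor.execute("INSERT INTO protected_area VALUES ('Example National Park', 1)")

# A query expressed in SQL, joining the two tables through their relationship.
cursor.execute(
    "SELECT protected_area.name, country.name "
    "FROM protected_area JOIN country ON protected_area.country_id = country.id"
)
print(cursor.fetchall())    # [('Example National Park', 'Kenya')]
connection.close()
```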
Wide Area Information Server (WAIS). A system designed for retrieving information from networks. It is a searching facility based on matching a specific request against indexed sources of information.
Wide Area Network (WAN). A computer network where the constituent systems may be widely dispersed geographically and links are formed by the use of telephones, radio, satellite, etc.
Workstation. A powerful desktop computer equipped with a high-resolution display and designed for technical applications. Groups of workstations are normally linked to a shared computer which holds common information.
World Wide Web (WWW). A popular Internet resource based on the exchange of information via a graphical, hypertext interface.
Universal Resource Locator (URL). An 'address' describing the location of information sources on the Internet global communications network.
xBASE. Data management software which traces its origins to the dBASE package.
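To illustrate what a Universal Resource Locator contains, the short Python sketch below breaks an invented, purely illustrative address into its component parts using the standard urllib module.

```python
from urllib.parse import urlparse

# A hypothetical address; any URL could be substituted.
url = "http://www.example.org/biodiversity/protected_areas.html"

parts = urlparse(url)
print(parts.scheme)   # the access protocol, eg 'http'
print(parts.netloc)   # the name of the server on the network
print(parts.path)     # the location of the resource on that server
```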