Biodiversity informatics: a brief introduction
The current biodiversity crisis, with its accelerated rates of extinction and habitat loss, brings the importance of documenting global biodiversity sharply into focus. With the vast majority of species still undescribed and many becoming extinct before they are discovered, the huge task of describing, documenting and storing information about the diversity of life is one of the greatest challenges facing modern biology. Taxonomists, ecologists and conservationists all agree on the paramount importance of this endeavor, and over the past 20 years efforts to overcome some of the limitations imposed by the ‘taxonomic impediment’ have led to much greater use of information technology at all stages of biodiversity research. This trend reflects concurrent advances in information science which, together with the urgent need for powerful web-based tools to facilitate the compilation and storage of biological data, led to the development of a new multidisciplinary effort with the goal of providing the informatics infrastructure needed to support modern biodiversity research initiatives. The result was biodiversity informatics: a new discipline built on a foundation of taxonomic, biogeographic and ecological data stored in digital format, which uses modern computing techniques to facilitate the collection, organization and use of these data in basic research.
Biodiversity informatics is defined as the application of informatics techniques to the management, presentation and dissemination of information concerning biological diversity. Moreover, the widespread use of taxonomic databases in recent years has fueled the development of novel informatics-based tools for the discovery, exploration and analysis of biodiversity data. Consequently, biodiversity informatics initiatives are now much more than data management engines: they have become powerful analytical tools in their own right, essential both for adequately exploring and documenting biological diversity and for making the resulting data useful. Today, biodiversity informatics is a dynamic and fast-growing field incorporating aspects of taxonomy, systematics, ecology and genomics within the broader frameworks of conservation biology and information science. This interface between the biological and computer sciences heralds a new phase in the development of biodiversity research, providing not only wider access to relevant data, but also new ways in which to use it. Thus, biodiversity informatics represents an invaluable set of resources not only for taxonomists, ecologists and conservationists, but also for government policy makers, non-governmental organizations and the public.
Taxonomic databases: the backbone of biodiversity informatics
At the heart of biodiversity informatics are taxonomic databases – digital repositories developed for the storage of detailed information about biological taxa and designed to maximize efficiency in terms of data management and information retrieval. Such repositories contain information organized by taxonomic name and are routinely used to produce biological checklists for formal publication and in the management of biological collections. With the advent of biodiversity informatics, taxonomic databases took on a central role in the effort to consolidate our knowledge of biological diversity. Such databases constitute the essential infrastructure underpinning the operation of web-based species information systems, frequently providing both the data itself and a means by which to manage it. This fundamental function of taxonomic databases makes their development, maintenance and cross-system integration a top priority within the biodiversity informatics community.
The ultimate goal of a taxonomic database should be to accurately model all relevant data concerning the organisms of interest, within the overall scope of the system and its intended usage. Modeling taxonomic hierarchies in such databases is relatively intuitive given the relational model employed by almost all database systems: each taxon can be stored as a record that refers to its parent taxon. Moreover, taxonomic databases can be designed to incorporate the rules of nomenclature as laid out in the relevant International Codes and are extremely flexible in terms of the types of data they can handle. For example, in addition to encoding taxon identifiers (e.g. valid scientific name, author, date of publication, etc.), a taxonomic database frequently incorporates other valuable information, such as specimen data, synonyms, literature citations and taxonomic notes. Furthermore, a wide range of biological attributes can also be documented, such as geographic distribution, ecology and conservation status (e.g. threatened, vulnerable, etc.), along with various digital media including images, sound recordings and videos. The great flexibility offered by taxonomic databases makes the construction of data-rich repositories straightforward and rapid, thereby facilitating the collection, organization and long-term management of biodiversity data, which are the essential first steps of any biodiversity informatics initiative.
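To make the idea of modeling a taxonomic hierarchy in a relational database more concrete, the following is a minimal sketch in Python using the standard-library sqlite3 module. It is not drawn from any particular taxonomic database: the table names (taxon, synonym), the column names, and the helper functions build_example_db, lineage and resolve_name are all hypothetical and chosen purely for illustration. Each taxon record stores a valid name, author and year, and points to its parent taxon; synonyms are stored separately and linked to the valid name.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not those of any real taxonomic database.
SCHEMA = """
CREATE TABLE taxon (
    id         INTEGER PRIMARY KEY,
    parent_id  INTEGER REFERENCES taxon(id),  -- NULL for the top of the tree
    taxon_rank TEXT NOT NULL,                 -- e.g. family, genus, species
    name       TEXT NOT NULL,                 -- valid scientific name
    author     TEXT,
    year       INTEGER
);
CREATE TABLE synonym (
    id       INTEGER PRIMARY KEY,
    taxon_id INTEGER NOT NULL REFERENCES taxon(id),  -- the valid taxon
    name     TEXT NOT NULL
);
"""


def build_example_db():
    """Create an in-memory database holding a three-level example hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO taxon (id, parent_id, taxon_rank, name, author, year) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, None, "family", "Felidae", "Fischer von Waldheim", 1817),
            (2, 1, "genus", "Panthera", "Oken", 1816),
            (3, 2, "species", "Panthera leo", "Linnaeus", 1758),
        ],
    )
    conn.execute(
        "INSERT INTO synonym (taxon_id, name) VALUES (?, ?)", (3, "Felis leo")
    )
    conn.commit()
    return conn


def lineage(conn, taxon_id):
    """Return (rank, name) pairs from the highest ancestor down to the taxon."""
    return conn.execute(
        """
        WITH RECURSIVE ancestors(id, parent_id, taxon_rank, name, depth) AS (
            SELECT id, parent_id, taxon_rank, name, 0
            FROM taxon WHERE id = ?
            UNION ALL
            SELECT t.id, t.parent_id, t.taxon_rank, t.name, a.depth + 1
            FROM taxon AS t JOIN ancestors AS a ON t.id = a.parent_id
        )
        SELECT taxon_rank, name FROM ancestors ORDER BY depth DESC
        """,
        (taxon_id,),
    ).fetchall()


def resolve_name(conn, name):
    """Map a valid name or a synonym to the valid name, or None if unknown."""
    row = conn.execute(
        """
        SELECT name FROM taxon WHERE name = ?
        UNION
        SELECT t.name FROM synonym AS s
        JOIN taxon AS t ON t.id = s.taxon_id
        WHERE s.name = ?
        """,
        (name, name),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_example_db()
    print(lineage(conn, 3))
    print(resolve_name(conn, "Felis leo"))
```

In this sketch the self-referencing parent_id column is what encodes the hierarchy, and a recursive query walks from a species up to its family. Further attributes mentioned above, such as distribution, conservation status or media files, could be attached in the same way, as additional tables keyed on the taxon identifier.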