Experience of design and development of a new open access interactive multifunctional database management system for special lexis of biology

system consists of multiple modules (input, statistics, export, etc.). The data input module was successfully designed and developed and has been used effectively by researchers for entering and collecting special lexis units


Introduction
In 2011, researchers asked the question 'How many species are there on earth and in the ocean?' (Mora et al., 2011). The authors of the study made mathematical calculations and indicated that around 2010 there were more than 1.2 million documented species, but, according to predictions, this could be only 14% of the real number of species. However, more than 1.2 million documented species does not mean that the same number of scientific names of organisms exists in science. In fact, far more scientific names of organisms have been introduced so far. This is because science names not only species, but also taxa of other taxonomic levels, so-called ranks. There are also many synonyms, as well as scientific names published without regard to the taxonomic principles governed by the particular International Codes.
Currently, there have not been any calculations that would allow us to determine how many scientific names for the classification of organisms have actually been introduced by now since the Swedish naturalist Carl Linnaeus published his first work 'Species plantarum..' in 1753 (Linné, 1753), in which he introduced modern traditions in the denomination of organisms. However, it is likely that this number could be closer to 2 million different scientific names for different organism taxa. Despite the large number of synonyms for the scientific names of organisms, these names are a fairly stable group of lexemes that are important in cross-linguistic communication. Scientists from different countries communicate with each other using mainly scientific names. However, in publications, the names of organisms sometimes appear without their scientific names, so the name of the organism in the local language can be the key term in a particular publication. Considering that scientific names of organisms are of international use, and they are being used together with local names in publications in many languages, it is possible to obtain equivalents in any language. This is crucial and important not only in studying linguistic aspects, but also in terminology and translation.
There are databases that aim to compile the scientific names of the world's flora, for example http://www.worldfloraonline.org/ as 'an open access web-based compendium of the world's 400,000 species of vascular plants and mosses' (World Flora Online, 2022), but not fauna. Such an electronic database would also be needed with the local names of organisms, which would be useful for terminology research, translators, and experts in the field. Some local names of various plants and animals in Latvian have been published in the first religious publications in Latvian and other publications since the beginning of the 1600s (e.g., Šulcs, 2020). However, because of different spellings in several older works, it is not possible to know what these specific names correspond to. Since the classification of organisms using scientific names has been introduced, it can be assumed that these scientific names allow more correct and precise matches to be obtained in a certain language. Therefore, in most cases, the matches are understandable through the attributed scientific names. Publications which mention both scientific and local names have been used to gather information in previous studies (Stalažs, 2015; Šulcs, 2020). The first publication in which both scientific names and local Latvian names are used was published in 1777 (Hupel, 1777). Since then, in paper format alone, several thousand works containing the names of organisms in Latvian have been published, ranging from special scientific literature and articles in encyclopaedias to popular publications and the press. The amount of information and data is impressive, and it cannot be processed manually; therefore, modern information technology solutions are necessary.
Although many electronic publications, as well as individual species/taxa lists of specific organisms, have become available online over the last 25 years, they do not cover the diversity of lexemes introduced earlier or those required in terminological and linguistic work. There are several online databases where it is possible to search for scientific names and names of organisms in Latvian and major contact languages, e.g., German, English, Russian. For example, the Latvian National Terminology Portal (https://termini.gov.lv/) collects the names of organisms and their scientific name equivalents. There are organism names not only in Latvian but also in other languages, and data are linked to information source references. Unfortunately, this database covers only a selected number of information sources and omits a considerable number of organisms, as well as the names of the diseases caused by particular organism species. During the creation of this database, the classification of organisms and the scientific names used in older publications were not checked for compliance with scientific requirements. Therefore, the database reflects only the equivalents of the terms that were published at a certain historical time. Considering that a lot has changed in biological science, part of the data in the information sources used in this database is outdated or incorrect.
General online dictionaries also include the names of organisms, for example, those available on Letonika.lv (https://www.letonika.lv/), a resource of bilingual translation dictionaries, and on the consolidated dictionary portal Tēzaurs.lv (https://tezaurs.lv/). The information available on Tēzaurs.lv is not intended for research of historical language changes, as it does not provide information on exact data sources of specific entry units and time periods used to compile an entry. Like the Latvian National Terminology Portal database, there is also no verification of organism classification information here. Although Tēzaurs.lv offers the possibility to use the Latvian Language Historical Dictionary, which covers historical lexeme sources from the 16th-17th centuries, including the source references, there is no connection with the scientific names of organisms, since modern scientific classification was only introduced in the 18th century.
The largest electronic encyclopaedia of species that provides information on scientific names of plants and animals with names in local languages is Latvijasdaba.lv (https://www.latvijasdaba.lv/). This is the main electronic encyclopaedia, which covers a large number of species found in Latvia but not in other regions of the world. This encyclopaedia also does not examine changes in the classification systems of organisms. In some organism groups, a number of species do not have their names in Latvian. Although the range of species is relatively wide, it does not cover all groups of organisms, especially names of diseases caused by particular organisms. However, it should be noted that Latvijasdaba.lv is the most comprehensive electronic publication covering the organisms found in Latvia.
A special database dedicated to vascular plants was created at the Institute of Biology of the University of Latvia. Unfortunately, this database is not public and serves only the needs of specific researchers. The database covers the period of the 17th century and includes data on information sources (Šulcs, 2011, 2020). Since this database is not available to the public and does not cover all groups of organisms, but only a part of the plants (vascular plants only), it cannot be used by translators, gardeners, and the general public.
From a terminological point of view, the problem is that most information resources cover only a limited data or publication-specific group of organisms, rather than all taxonomic groups. Traditionally, researchers specialise in studying narrow groups of organisms, and authors, through their publications, convey to the public the names of organisms related to their interests. The practical experience of the authors of this article shows that currently there are no electronic databases dedicated for research purposes, e.g., for analysing the use of organism names over time, frequency of name usage, etc.
Other resources such as Periodika.lv (http://periodika.lv/) or language corpora can also be used for translation and research, but in practice it has been found that the performance of the available corpus of modern Latvian texts is inadequate for translating and studying texts in the field of biology (see Jasmonts et al., 2022). Translators render texts on diverse subjects, so they often need different names of organisms, not only those of the organism groups that have attracted interest in recent decades and whose names are covered by internet resources. However, most publications were printed books: botanical dictionaries [e.g., Botaniskā vārdnīca by Galenieks (1950), which is not digitised], textbooks (e.g., first botany textbooks), plant indexes, and the monograph Latviešu valodas augu nosaukumi (Ēdelmane & Ozola, 2003), which collects a wide range of plant names, including regional plant names, and is not completely digitised. These printed books contain the largest range of taxon names of organisms. Since internet resources cover only a short period of time, generally 30 years only, they alone are no longer sufficient for linguistic research and terminological needs. There are two aspects to consider. If all the historical material is not digitised, it is impossible to trace the changes in language traditions over time. To enable such studies in the future, the IMDS is developed considering the gaps described above. Focussing on terminological issues, it has previously been concluded that it is important to follow the historical usage of organism names when making decisions about changing particular names (Stalažs, 2015). To ensure this, extensive information resources are needed: (1) including as many taxa of organisms as possible; (2) covering the longest possible time period from the past to the present; and (3) linking the use of local names of organisms in the past with the modern taxonomic changes of organisms.
Modern technical capabilities make it possible to obtain new information that changes the perception of the classification of organisms. These changes affect specific groups of organisms that have been studied in the past or groups whose earlier names do not correlate at all with actual herbarium materials. Changes in organism classification have a significant impact on terminological and linguistic aspects as well, since the named realities change.
There are some specific problems. For example, it is important for both linguists and terminologists to specify whether the scientific names Arvicola amphibius and Arvicola terrestris refer to one species or two species. An answer would enable us to determine how many local names are needed to denote realia. To develop effective new terminology dictionaries that recommend appropriate terms to translators, it is essential to keep up with advancements in the biological sciences and methodically account for terminology in the relevant field. Any taxonomic changes must consider whether new local names need to be created to prevent multiple names from being given to a single species or other taxon, when in reality there is only one species or other taxon with several scientific names assigned.
These issues are essential not only for terminology, but also for technical solutions during the development of the database when comparing publications issued at different time periods. There are taxa of organisms that were named with different scientific names or alternate spellings of names (e.g., Pinus sylvestris and Pinus silvestris) in different historical periods. If terminologists are not aware of the variants, a faulty decision could be made. Therefore, it is important to provide digital tools to keep track of historical changes in the classification of organisms, representing these changes according to international codes of taxonomy and the latest research results in biology, while ensuring that it is technically possible to automatically link future changes to historical usage of local names. The same applies to technical solutions that would allow distinguishing homonymous organism names between different groups of organisms (e.g., Pieris as plants and Pieris as animals), as well as homonymous local names (e.g., Latvian kaulenes as plants and kaulenes as animals), etc. This is a significant aspect, which would provide further assistance to translators, for example, if they have to translate a text in the subject domain where such information cannot be determined without context.
To collect data related to the biology field, multiple approaches can be used, for example, machine learning (ML) algorithms for optical character recognition (OCR), integrations with existing information systems, and manual data entry in a file, database or system.
In recent years, machine learning algorithms have been developed in various fields, including linguistics (Kaur & Garg, 2021). It should be noted that ML algorithms cannot provide 100% precision and manual testing is necessary. Also, a large data set is necessary for ML algorithms to learn and it is not always possible to collect enough data for learning purposes, especially in languages with a relatively small number of users, such as Latvian (Jasmonts et al., 2022). Integration with other systems is possible, but this specific set of biology data may not be sufficient for neural network training (Darģis et al., 2020). In addition, not all information systems permit access via application programming interface to collect data dynamically and more effectively. And the third option, manual entering, is time consuming. However, it can be reduced by developing an information system that is intuitive and offers multiple autocompletion possibilities.
In this research, the authors aim to meet linguistic and terminological needs with new information technology solutions, especially for data collection and visualisation. First, technical solutions that would allow the creation of a dynamic open access database system dedicated to the names of organisms will be examined. Such a system would also ensure the following options: (1) the collection of data on as many taxa of organisms as possible; (2) tracking the use of local names of organisms during different times; and (3) linking the organism names used in specific time periods with taxonomic changes by considering the latest discoveries in biology.

Data Coverage
Lexemes (names of organisms) from different publications entered into the IMDS created during this study were used to build and test the database system and for further analysis. Also, this data set was used to create technical solutions when: (1) homonymous names of organisms are used in the publications, as well as in the classification of organisms, and (2) different spelling variants of the organism names are used. In addition, technical solutions are implemented for automated linking of recurring data, so that data linking does not have to be done manually. This is a preliminary report that examines the results as of 16.12.2022.
To create the database system, various types of publications, both paper and electronic, were used for data collection. Excerpts are gathered from these publications by entering into the database system the local names of organisms that are linked to particular scientific names. All publications are divided into six groups according to their types (Table 1). During the development of the IMDS, publications in Latvian are primarily used, but for testing purposes, selected publications in other languages are used as well. Regardless of the base language of the publication, the local names of all languages that are clearly linked to a specific scientific name are entered into the IMDS. If synonyms are used in the publications, each name is entered separately to allow statistical data processing. If there are spelling variants in the publications, such lexemes are also considered. If the names of organisms are used in inflected forms in the original publications, they are entered in the IMDS in the nominative form. All lexemes are linked to publication data and scientific names in their original spelling. Our own experience, the problems identified at the beginning of this article, and the problems faced during the implementation of the project are all considered in order to identify and implement the most effective technical IT solutions in the creation of the IMDS. The data presented in Table 1 are only mid-project data, which are subject to change by the end of the project (December 2023). The authors intend to produce a separate scientific publication dedicated to the publications used for collecting the data in the IMDS.

New Solutions and Innovations to be Offered
Within the framework of this project, it is planned to cover as many publications as possible, published starting from the beginning of the 20th century, in order to create the first and the largest database system of the Latvian special biology lexicon.

Description of the Technical Solutions of the IMDS
Before the development stage of the information system, requirements were defined. Thus, the system should: (1) be open access with an impressive amount of linguistic and terminological heritage of many languages; (2) facilitate comparative studies in linguistics and terminology; (3) be available online without additional software installation; (4) be scalable, with the possibility to effectively add new IMDS modules. A team of linguists, terminologists, and programmers is working to design and develop the IMDS with data storage and a wide range of statistical and search options designed especially for language research purposes and comparative multilingual studies in linguistics and terminology. At the end of the project, more than 4000 referenced information sources are planned to be included in the IMDS, and at least 600,000 linkages of local names of different organisms in different languages are planned to be collected.
The developed IMDS consists of multiple modules which are mutually connected (see Fig. 1):

Fig. 1. Modules of the IMDS
1 User module. All data entry is stored with information about the user who entered the data. This module consists of user registration, authentication, and authorisation.
2 Bibliography module. All lexeme information is linked to the bibliographic information. This module provides the ability to store all publication data: monographs, journals/proceedings, and papers with linked additional information, for example, publisher, ISBN, alternative title, place, Digital Object Identifier (DOI), author/authors, etc. In this case, it was important: (1) to link the bibliographic source correctly with the specific excerpted unit; (2) to organise the collection of such publication data, which will allow further provision of an appropriate reference to the source of information in the database. This reference is formatted as a Harvard-style bibliographic citation.
3 Module of linked organism names. In this module, three types of organism names are entered: (1) scientific name, (2) local name, and (3) names of diseases caused by organisms (since the names of organisms can also be the names of diseases). In this module, the linkage between the organism name and bibliography unit is stored with additional information, for example, language mark, page number on the source bibliography, system user data, date and time, user comments. In addition, the linkage between the root element and the names of other organisms in the same linkage group is carried out in this module.
4 Module of unlinked terms and special lexis units. In this module, the linkage between a specific organism name (not linked to other languages) and bibliography unit is stored with additional information, for example, language mark, page number on the source bibliography, system user data, date and time, user comments. There are no linkages between other names of organisms.
5 Dictionary module. It is for lexemes of different languages that are not related to the scientific name of organisms and where entries are linked to specific bibliography units with additional information, for example, part of speech, gender, number, language mark, page number on the bibliographic source, system user data, date and time of the entry, user comments. Dictionary entries are linked if there is a terminological or linguistic connection between them. This module also provides links to synonyms of the same language, if the scientific name of the organisms is not used in the original publication used for data collection in the database.
6 Module of terms and definitions. In this module, terms with their corresponding definitions are linked to specific bibliography units, for example, identified subdomains for term and for definition, related scientific term (e.g., scientific designation in Greek or Latin) and umbrella term, language mark, page number on the source bibliography, system user data, date and time of the entry, user comments, etc. The entered terms are stored with linkage to term equivalents in other languages and their corresponding definitions.
7 Module of names of plant cultivars. In this module, cultivar names are linked to bibliography units with additional information, such as: country of origin, cultivar breeder's rights owner, breeder's rights protection date, system user data, language, related organism taxon name, time and date of entry, user comments, cultivar's description, species epithet, group affiliation, etc.
8 Module of hierarchical linkages. This module consists of organism names that are linked in a hierarchical tree with specified root element (higher taxa), for example, Plantae, Animalia, etc. In this hierarchical tree, information about scientific names of organisms, their taxon level, and linkages to parent name and children names are described. This module also includes information about the linked local organism names accepted in the IMDS. This module is integrated with Module 3 and data entering is managed by using autocomplete options because organism names are mostly entered by using Module 3 whereas this module provides the creation of hierarchical linkage between the names of organisms. This module will provide hierarchical linkage with the correct scientific names of organisms according to modern classification of organisms.
9 Module of data linkage. In this module, the link of scientific names entered in Module 8 with the excerpt from different publications will be ensured and controlled. This will ensure that, regardless of spelling differences, scientific names will be linked to the appropriate and correctly spelt names.
10 Data visualisation. This is a module for information filtering and retrieving from the databases of the developed IMDS. Results are reproduced in multiple ways, for example, in graphs, plots of time series, etc.
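The relationships between Modules 2, 3 and 8 described above can be pictured with a small data model. The following Python sketch is only an illustration under stated assumptions: the class names, field names and the simplified Harvard-style formatting are hypothetical and are not taken from the IMDS implementation, and the example record is illustrative rather than an actual IMDS entry.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BibliographyUnit:
    """A publication record (cf. Module 2); field names are hypothetical."""
    authors: list[str]
    year: int
    title: str
    place: str = ""
    publisher: str = ""

    def harvard_citation(self) -> str:
        # Simplified Harvard-style reference: Authors (Year) Title. Place: Publisher.
        ref = f"{', '.join(self.authors)} ({self.year}) {self.title}."
        if self.place and self.publisher:
            ref += f" {self.place}: {self.publisher}."
        return ref


@dataclass
class NameLinkage:
    """One excerpted pairing of a scientific and a local name (cf. Module 3)."""
    scientific_name: str      # as spelt in the source, authors included
    local_name: str           # entered in the nominative form
    language: str             # language mark, e.g. "lv"
    source: BibliographyUnit
    page: Optional[str] = None
    comment: str = ""


@dataclass(eq=False)          # identity comparison avoids recursing through parent/children
class TaxonNode:
    """A node of the hierarchical tree (cf. Module 8)."""
    name: str
    rank: str
    parent: Optional["TaxonNode"] = None
    children: list["TaxonNode"] = field(default_factory=list)

    def add_child(self, name: str, rank: str) -> "TaxonNode":
        child = TaxonNode(name, rank, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> list[str]:
        # Walk from this node up to the root element (e.g. Plantae).
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))


# Illustrative values only (place, publisher and page are deliberately left empty).
source = BibliographyUnit(["Galenieks"], 1950, "Botaniskā vārdnīca")
record = NameLinkage("Pinus silvestris L.", "parastā priede", "lv", source)
root = TaxonNode("Plantae", "kingdom")
species = root.add_child("Pinus", "genus").add_child("Pinus sylvestris", "species")
```

Keeping the source spelling in `scientific_name` and the accepted spelling in the tree mirrors the separation between excerpted data and the modern classification; how the two are actually joined in the IMDS (Module 9) is not specified here.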

Data Coverage
On 16.12.2022, the IMDS databases contained 4406 bibliographical references and 283,618 linkages between records and bibliographical data. The IMDS database contains 50,846 scientific names (including spelling variants as used in original publications), as well as 60,585 local names in Latvian and in several other languages, and 1427 names of organism-caused diseases (Fig. 2). At this stage of the project, the main focus is on Module 3 (linked organism names), while the other modules are under development or already exist as finished prototypes. For testing purposes, the following data have been entered in these modules: Module 4 with 1712 scientific names of organisms and 204 other terms not linked to any language; Module 5 with 2942 terms linked among different languages, including Latvian, German, English, Russian, Latin, French, Spanish (Castilian), Danish, Swedish, Polish, Greek, Norwegian, and Portuguese; Module 6 with 313 terms with a total of 329 definitions.

Problems and Solutions of Automated Data Linkage
The many synonyms of scientific names of organisms that have been used in different time periods present one of the main problems. However, even greater problems were found to be caused by different spelling variants, which sometimes exceed the number of synonyms. These problems are essential when trying to develop IT solutions for linking the scientific names used in the publications as easily as possible with the modern classification of organisms in accordance with the International Codes of Nomenclature. The more spelling variants there are, the more difficult it is to ensure that an automated system works, and the more manual control is required to prevent possible errors.
Firstly, all names are linked to their scientific name. Considering that these names are entered in the IMDS as they are in the original publications and with the authors of the names (if any), technically it is possible to filter the parts of the names (ignoring the authors of the names). However, if the original names are misspelt, each variant creates technical problems. Further problems arise from homonymous names that can be distinguished either by the author of the scientific name or by the local name used. This requires manual monitoring of name linkages. Sometimes the data is misspelt at the time of data entry and the data entry specialist cannot add a comment that would help to understand the applicability of particular names.
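Filtering the name part from the author part, as mentioned above, can be sketched as follows. This is a minimal illustration in Python, not the IMDS implementation: the function names are hypothetical, and the rule used (keep the first word plus every following word that starts with a lowercase letter) is an assumption that fails for author names beginning with a lowercase particle such as "de" or "van".

```python
from collections import defaultdict


def strip_authors(full_name: str) -> str:
    """Return the name without the author citation (naive heuristic).

    Keeps the first token (genus or higher taxon) and all following tokens
    that start with a lowercase letter (epithets, rank markers such as
    'subsp.' or 'var.'); stops at the first token that does not.
    """
    tokens = full_name.split()
    if not tokens:
        return ""
    kept = [tokens[0]]
    for token in tokens[1:]:
        if token[0].islower():
            kept.append(token)
        else:
            break  # author citation, year or bracketed author starts here
    return " ".join(kept)


def group_by_bare_name(full_names):
    """Group the names as entered (with authors) under their author-free form."""
    groups = defaultdict(set)
    for full in full_names:
        groups[strip_authors(full)].add(full)
    return dict(groups)


entered = [
    "Pinus sylvestris L.",
    "Pinus sylvestris",
    "Pieris D.Don",          # plant genus
    "Pieris Schrank, 1801",  # animal (butterfly) genus
]
groups = group_by_bare_name(entered)
```

In this example both `Pieris` entries fall into the same group once the authors are ignored, which is exactly the homonymy case where the author citation (or the local name) has to be kept and checked manually.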

Results and Discussion
Several categories can be used to classify typical problematic cases: (1) scientific names that are originally written identically (excluding authors of the names); (2) the variety of spelling variants of the same scientific name; (3) scientific names, the first word of which is abbreviated; (4) significant differences in the spelling of scientific names, as well as synonyms; and (5) homonyms of scientific names, distinguished by designation of the author of the scientific name or local name. Homonymy is also observed in local names, but if there are no problems with the recognition of the scientific name, then homonymy of local names does not cause problems in the automated data linkage.
1 To avoid introducing errors, primary name linkage is done manually, but IMDS automatically suggests identical scientific names. This can be done easily if the scientific names (excluding the designation of the author of the names) are identical or differ by one letter. Table 2 shows examples in which the initial part of the scientific name is identical or differs only by one character. Since the name of the authors is part of the scientific names entered in the IMDS, the different spellings of the following name of the authors are also indicated in the examples (Tables 2-5). This is important because it is also necessary to provide technical possibilities for distinguishing homonyms separated by the names of the authors and to ensure that IMDS automatically distinguishes the
2 In the original publications, a number of spelling variants of certain scientific names can be observed, which can significantly complicate the automatic finding of the names in the system. In this case, the local name equivalent is required. However, the spelling of equivalents of local organism names may also vary (Table 3). If the spelling of local names does not match, it can be problematic to automatically find connections between the names corresponding to the same taxon (usually to the same species). In such cases, the greatest possible number of bibliographic sources is necessary.
3 In some publications, the generic epithet of scientific names is neither written in full nor explained elsewhere in the text, which can make automatic recognition of names difficult. There is an option to automatically select potential equivalents in cases where the given local name equivalent matches the equivalents of fully written scientific names (Table 4). However, even these situations may require a one-time manual check to avoid potential mistakes. In cases where local name equivalents do not match, there may be problems in correctly deciphering the attribution of names. To be able to find binding names, it is necessary to collect a large amount of terminological data.
Homonymy on the local name side can cause true problems, and in these cases linking should be done manually (Table 5). It is enough to do it once, though, as the next identical records will link automatically.
The examples shown in Table 3 involve a mixed variation in spelling, both on the side of scientific and local names.
4 If the spelling of scientific names varies significantly, which also applies to synonyms, then only the local language name should be used for automatic name linking (Table 6). In this situation, manual control and verification is mandatory, as local names can also be homonyms. Local names can also help to find correct equivalents for records where significant typographical errors have been made in the scientific name. These errors can be so significant that they prevent automatic finding of equivalents by scientific name alone.
Table 3. Examples of data entry with spelling variations on the side of the scientific and Latvian name of one species. The correct spelling of the scientific name of the species is Juglans mandshurica
5 Homonymous scientific names can be found within a group of organisms or between different groups of organisms (Table 7). These names are indistinguishable when standing alone without additional information. The author of the scientific name or the local name can be used to distinguish the names. If such names are within the same group of organisms, one of these names eventually becomes a synonym. However, homonyms may have been used in different periods of time in the literature, so a manual check of the names is required to ensure that there are no erroneous linkages of the IMDS database records.
Once manual linking has been done, the homonymy of local names (Table 8) is no longer a problem if new records repeat the same scientific and local name combinations as previously registered. However, if a scientific name with a different spelling is added, the equivalence of the new name combination must be checked manually, considering the new combination only.
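Two of the cases above, names that differ by one character and names with an abbreviated first word, lend themselves to a short sketch. The Python code below is a hedged illustration, not the IMDS algorithm: the function names are hypothetical, treating a single inserted or deleted letter as a "one-letter difference" is an assumption, and requiring both the epithet and the local name to match for abbreviated names is likewise an assumption. The suggestions are meant as candidates for manual confirmation only.

```python
def within_one_edit(a: str, b: str) -> bool:
    """True if a and b are identical or differ by one substituted,
    inserted or deleted character."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equally long) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution
    return a[i:] == b[i + 1:]          # one extra character in b


def suggest_matches(new_name: str, known_names: list[str]) -> list[str]:
    """Known author-free scientific names at most one edit away from new_name."""
    return [k for k in known_names if within_one_edit(new_name.lower(), k.lower())]


def expand_abbreviated(abbrev: str, local_name: str,
                       known_pairs: list[tuple[str, str]]) -> list[str]:
    """Candidates for a name like 'P. sylvestris', using the local name as evidence."""
    initial, _, epithet = abbrev.partition(". ")
    hits = []
    for scientific, local in known_pairs:
        genus, _, rest = scientific.partition(" ")
        if genus.startswith(initial) and rest == epithet and local == local_name:
            hits.append(scientific)
    return hits


# Illustrative data only, not taken from the IMDS tables.
known = ["Pinus sylvestris", "Juglans mandshurica", "Picea abies"]
pairs = [("Pinus sylvestris", "parastā priede"), ("Picea abies", "parastā egle")]
spelling_hits = suggest_matches("Pinus silvestris", known)
abbrev_hits = expand_abbreviated("P. sylvestris", "parastā priede", pairs)
```

A larger edit distance would catch more variants but also more false matches between genuinely different names, which is why the text above keeps the first linkage manual.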

Practical Significance and Novelties of the IMDS
Before the launch of the IMDS database project, the need for such a resource was expressed and supported by experts in the field and by students and translators specialising in translation of texts related to the field of biology. On the other hand, while preparing to implement the project, we received feedback from reviewers on the usefulness of the project and on the expected difference of the IMDS from existing solutions. In the Theoretical Background chapter, we already mentioned some online resources, listing the shortcomings of these resources and the fact that some of them are not open access resources or do not include special lexis units in Latvian.
Within the framework of this project, we focus on online resources that include both the scientific names of organisms and their Latvian equivalents. After briefly describing the resources that are already available, we found that they have shortcomings, which are listed here, and we offer new solutions:
1 Limited terminological and data source coverage: when developing the IMDS, we anticipated that the terminological coverage should be broad and should cover: (a) all groups of organisms, not just a few specific ones; (b) as many sources of information as possible during the project implementation period; (c) the widest possible time period for collection with methodology developed within this project. One of the main focuses is also the referencing of data sources, which also provides statistical possibilities in studies of terminological culture across various time periods.
2 Wide range of possibilities, including open access and personalised use of the database system: the creation of an open access database system that would allow users to participate in adding data themselves, as well as provide opportunities for wider use of data, was defined as a mandatory requirement. The IMDS is designed to make terminological information available to a diverse range of international interest networks, not only to those interested in Latvia. All of these options are intended to make the IMDS a useful source of information for specialists in various fields who use the names of organisms on a daily basis, e.g., biologists, farmers, foresters, doctors, translators, linguists, terminologists, etc.
3 Broad cross-linguistic coverage and linkages: one of the important disadvantages of many sources of information is the limited variety of the languages covered. Many sources offer equivalents in a single language or in a limited number of languages. Consequently, many organism taxa remain unincorporated if only the literature published in Latvian is considered. The IMDS is built to link terms from different languages in one place, supporting an unlimited number of languages and reference source accessions.
Here is an example from the data collected in the IMDS in the first half of the project, with scientific names related to the plant genus Pinus selected as examples. On 16 December 2022, the IMDS already covered 322 scientific names related to this plant genus, including different spellings of the names. These names were associated with more than 300 local names in seven languages (Fig. 3).
4 Dynamic features of the IMDS, as well as association with the most modern classifications of organisms according to the International Codes of Nomenclature: since discoveries in science bring new knowledge that makes it possible to clarify the classification systems of organisms, terminological work must be dynamic and keep up with such discoveries. This is a key condition set during the creation of the IMDS. Therefore, the IMDS provides a technical option to link outdated historical naming systems with modern classifications in accordance with the International Codes of Nomenclature. Technical solutions for the problems listed above will be described in future scientific publications by the authors.
5 Preserving linguistic heritage to track changes in the development of the terminological culture of different languages: the oldest terms and organism name variants can serve as a basis for the formation of new terms and names of organisms. Therefore, the historical material is also of practical significance.
6 Timeline options for statistics and historical usage of the language unit: to keep terminology and language science up to date, it is necessary to use all the technical possibilities of information technology to provide statistics and related automated research options. These features are currently under development and will be described in separate publications by the authors. Below is an example from the records already included in the IMDS database, which refer to the Latvian names of the plant species Betula pendula, āra bērzs and kārpainais bērzs (Fig. 4).

Brief Conclusions and Prospects for the Future
Designing the IMDS is a challenging task, because effective performance must be ensured in all IMDS data collection, processing, and retrieval operations, especially given the data linkages in multiple directions: for example, the linkages of organism names to bibliography units and to other organism names, hierarchical linkages according to the taxonomy structure, etc. During the development of the IMDS, various technical problems have been identified that need to be solved in order to ensure automated linking of the organism names recorded in the IMDS database. However, it is evident that partial manual control and monitoring will always be necessary to reduce the probability of errors. The first solutions have already been developed, and they will be fully completed by the end of the project in December 2023. The current solutions provide wide language coverage according to ISO 639-1 and an unlimited number of source references. The IMDS is an attempt to create a large multilingual linguistic heritage database dedicated to organism names and intended especially for research. The IMDS is intended as an international database system; therefore, we invite colleagues from other countries to join us in the pursuit of solutions and new opportunities in the future.
Acknowledgements. This research has been funded by the Latvian Council of Science, project 'Smart complex of information systems of specialized biology lexis for the research and preservation of linguistic diversity', No. lzp-2020/1-0179. The authors thank Aiga Bādere and Aiga Veckalne for improving the English language of the manuscript.

This article is an Open Access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 (CC BY 4.0) License (http://creativecommons.org/licenses/by/4.0/)