On the Problem of Parts of Speech Identification in the English Language: A Historical Overview

The paper focuses on the problem of identifying parts of speech from a historical perspective, covering the period 1700–2019. It provides an insight into classical Greek and Latin approaches to parts of speech, which laid the foundation for the English tradition of parts of speech identification. More than 400 genuine grammar books, comprising varied approaches to parts of speech classification that were formerly in use or are currently adopted in the English language, were analysed. The research suggests that the classical approaches, Greek and Latin in particular, had a profound impact on establishing the original English tradition of parts of speech identification. Since the period of standardisation (the 18th century) in the English grammar tradition, over 30 different classifications have been in use, either becoming popular and applicable in the English language or falling into disuse. In the present paper, all classifications are analysed in detail and arranged into 5 groups and 13 subgroups.

Parts of speech (henceforth PoS) identification is predominantly a subject matter of descriptivism and descriptive grammar books, as it is believed to be a rather conspicuous, widely recognised, and classical concept in syntax and, thus, is ignored by prescriptivists, "whose works tend to be highly selective, dealing only with points on which people make mistakes (or what are commonly thought to be mistakes)" (Huddleston, 2002, p. 6). However, due to its well-known nature, PoS identification seems to be simply taken for granted, and very few grammarians try to introduce a new vision of the problem, being limited either by the apparent simplicity of the issue, which seems to require no additional explanation, refinement or reanalysis, or by an unwillingness to reshape what has been formed over centuries. Nevertheless, every language is constantly evolving, and this evolution produces ongoing functional changes, which reflect the current state of affairs in speech and in language and therefore need to be formalised.

L I N G U I S T I C S / K A L B O T Y R A
The aim of the paper is to provide an in-depth historical study of PoS identification in English grammar books over the period 1700–2019 and to describe and formalise the approaches existing in grammar. We hypothesise that, despite the generally accepted approach to "keep everything in view – form, function, meaning" (Jespersen, 1924, p. 60) when identifying PoS, the parts of speech system is much more sophisticated and diversified than the traditionally applied eight- or nine-component classifications. The following objectives are specified to achieve the aim and to verify the hypothesis: 1) to analyse classical Latin and Greek approaches to PoS, as they established the foundations of the English PoS system; 2) to provide an insight into the theories that have been developing since the so-called "age of standardization or prescription stage" (Hogg, 2006, p. 284); 3) to study classifications elaborated on the basis of classical approaches; 4) to research alternative classifications which deviate from traditional ones; and 5) to describe the approaches predominating in the 18th, 19th, and 20th–21st centuries.
The data analysis in the present paper is based on more than 400 grammar books published over the period 1700–2019 and starts with the period which was "a great age for dictionaries and grammars in England (the late 18th century)" (Romaine, 2007, p. 8).

Classical Traditions in Parts of Speech Identification
To a great extent, modern English PoS are hostages to their Latin definitions, which were thoroughly assimilated into Old English, fully incorporated in Middle English and consequently formalised in the age of normalisation in all English grammar books.
The issue of PoS identification in all grammar books belongs to the section called "etymology", a vital problem treated with the utmost seriousness by the Greek classical philosophers, first of all by Plato, who expressed the idea of combinatorial analysis, stating that language was analysable, i.e., could be divided into chunks whose value and combinatorial properties are amenable to systematic description and explanation (Ashdowne, 2008, p. 4). As a result, Plato considered sentences as units which could be divided into smaller ones, viz. nouns and verbs; to the latter belonged those units which could be expressed in more than one word (noun phrases and verb phrases), so that it is more reasonable to speak of a subject-predicate dichotomy (Plato, Sophist, lines 261-265 in Jowett's translation, 1892). The grammatical system of the language was later developed by Aristotle, who singled out the following parts of the language: letter, syllable, connecting word, noun, verb, inflexion or case, sentence or phrase (Aristotle, Ch. XX in Butcher's translation, 1895), which depended not only on the relationship demonstrated between facts or expressed between statements, but also on the relationship between the sounds and inflections of words. Analysing Aristotle's classification, we can conclude that, in fact, it represents only 3 modern PoS: noun, verb, connecting word (inflected article, inflected pronoun, uninflected conjunction). At the same time, we must admit that articles do not directly correspond to what are now known as determiners (Ashdowne, 2008), and that conjunctions cover anything used to link verbs and nouns into propositions (Vinokurova, 2005).
Plato and Aristotle paid more attention to the way the proposition was formed and to the elements of the speech structure, whereas their successors, who belonged to the Stoic school of grammar, elaborated Aristotle's system, treating case not as a part of speech but as a crucial factor in distinguishing PoS. Namely, using the notion of case, they substantiated and formalised the difference between verbs and nouns as individual PoS, differentiated between inflected pronouns and articles, and singled out the categories of invariant prepositions and conjunctions. The notion of case also assisted in shifting adjectives from the category of verb to the category of noun; however, adjectives were not identified as an individual part of speech. Another achievement was splitting off the class of adverbs from the class of common nouns. Therefore, the Stoic grammarians started a new period in the history of PoS identification, which to a great extent resembles the modern one, comprising nouns, verbs, articles, pronouns (all 4 PoS were inflected) and adverbs, prepositions, conjunctions (which were uninflected). Nevertheless, it is possible to presume that the ambiguity characterising PoS identification arose because the principles underlying the PoS division were not uniform and combined semantic aspects (differentiation of proper and common nouns, etc.), syntactic aspects (distinguishing adverbs as units syntactically associated with verbs, but from a morphological point of view belonging to nouns), and morphological aspects (which were the basic approach to conjunctions and prepositions).
The following stage was commenced by the Alexandrian school, Dionysius Thrax in particular, who continued the research started by the Stoic school and developed their results. Applying basically the same foundations as the Stoics did, viz. semantic, syntactic and morphological principles, Dionysius Thrax improved the system, which conceptually became the basis for the modern one. The scholar divided PoS into 8 classes with comprehensive definitions: noun, verb, participle, article, pronoun, preposition, adverb, conjunction (Thrax in Davidson's translation, 1874). In the analysis of the above-mentioned classification, it is necessary to emphasise two main changes: the introduction of the participle and the differentiation of adverbs, the latter of which, we believe, would become significant for further PoS identification. Apollonius Dyscolus made use of the thesis representing the category of adverb as a grammatical category with its syntactic functions and changed the whole idea of the PoS system (Apollonius Dyscolus, pp. 69-70 in Egger's translation, 1854). Moreover, he defined it as "an indeclinable word which determines by a general or particular attribute the modes of the verbs, and which, without the verbs, cannot complete a thought" (Apollonius Dyscolus, p. 189 in Egger's translation, 1854).
Much more influential, if not determinative, were the PoS theories developed by Roman philosophers and grammarians. Rather distinctive was the system elaborated by Varro, who applied an exclusively morphological principle of contrasting inflected and uninflected classes: the first, having cases but not tenses, is nouns; the second, having tenses but not cases, is verbs; the third, having both, is participles; the fourth, having neither, is adverbs (Varro, pp. 543-551 in Kent's translation, 1938). However, this system did not become generally accepted (Law, 2003), as it did not take into account aspects other than the morphological one, and other Roman/Latin grammarians decided to remodel the classical Greek systems to the needs of the Latin language, in which the category of article did not exist. But for the sake of equality and tradition, the general number of PoS (8) was retained. Priscian, analysing linguistic categories, distinguished the "interjection", making it the 8th PoS. Interjections were separated from adverbs as "they did not fulfil the foremost task of an adverb, namely that of closely adhering to the verb syntactically and semantically" (Sluiter, 1990, p. 212). Thus, the PoS system included noun, verb, participle, pronoun, preposition, adverb, conjunction, interjection.
In general, the issue of PoS identification is also described from the typological perspective. According to Comrie (1998, p. 79), "two most readily distinguishable word classes in Proto-Indo-European are the verb and the noun (and the pronoun distinguished from it)", cf. Goldstucker (1860, p. 220), who states that in Sanskrit speech was divided into 4 parts: nouns, verbs, preverbs/prepositions and particles, and the main focus was on the verbal nature of all PoS, as "certain nouns or the majority of nouns are derived from verbs". "The distinguishability of a separate class of adjectives was questionable for Proto-Indo-European" (Comrie, 1998, p. 79), cf. Bammesberger (1992, p. 52), who notes "that 'adjectival' form of Indo-European probably lacked special morphological characteristics which would have set it off from a noun". "Many adverbs are etymologically, and sometimes even synchronically, case forms of nominal" (Comrie, 1998, p. 80), cf. Bhat (2000, p. 47), who identifies the adverb in the Indo-European languages as a modification of the verb. As Comrie (1998, p. 80) states, "there is a set of primarily short adverbs for which at least no substantival origin is clear and which also show up as prepositions and verbal prefixes", cf. Viti (2014, p. 290), who treats them as prepositions which were originally adverbs. "Numerals are not to be treated as a single word class" (Comrie, 1998, p. 80); moreover, their reconstruction either in the Indo-European languages or in the Germanic languages is rather complicated or impossible (Gukhman, 1963). Finally, Proto-Indo-European had a number of invariable particles and conjunctions (Comrie, 1998), cf. Whitney (1889), who distinguishes them as well.
Nevertheless, all classical categories in the Indo-European language family tended towards a comprehensive description aimed at combining morphological, semantic and syntactic criteria. This, of course, made it possible to introduce 8 PoS, cf. Varro, who, applying only a morphological principle, distinguished only 4 PoS, or Plato, who, thinking of the proposition as the semantic kernel of a sentence, named just 2 PoS, etc. Another issue that deserves discussion is the literal translation of the PoS names, predominantly copied from Greek grammar into Latin without paying enough attention to the differences in the languages' structures. For instance, Latin nouns and adjectives are often separated by a preposition (which, in fact, was copied into English and gave rise to one of the strictest syntactic rules in defining PoS in English: a preposition must precede a substantive), whereas this structure was quite rare in Ancient Greek, and the literal name for the class of prepositions was "standing before" (Viti, 2014, p. 289), not specifying what parts of speech go after prepositions.
We argue that, when dealing with English, it is necessary not to follow the classical Greek and Latin traditions, but to describe Old English lexical units and develop a system which would mirror English grammar.
To sum up classical traditions of the PoS identification, it is necessary to state that despite being indirect predecessors (belonging to different language groups) of the English grammar tradition, Latin and Greek theories played a crucial role in its formation. In Table 1, we sum up the above discussed approaches towards PoS identification to make further comparison with English theories more obvious.
Therefore, speaking of classical traditions in linguistics, it is possible to draw the following conclusions:
• at different periods of time, morphological, syntactic and semantic criteria were applied in totality or individually to explain PoS identification, and their combination and elaboration promoted the origins of the PoS systems which have become fundamental in the Western linguistic tradition;
• it can be argued that the original PoS classifications were predominantly based on the semantic criterion (meaning in the sentence); later on, this was supplemented by the syntactic element (relationships between PoS, position in the sentence), and finally the morphological criterion came to the fore and remained principal for the following millennium;
• the exclusive difference between the Greek and Latin typologies lies in the presence of the interjection and the absence of articles in the latter;
• adjectives were not distinguished as an individual PoS, but just as a type/representative of the noun, which clearly supports the hypothesis that adverbs predominantly descended from nouns and were identified earlier than adjectives.
In this section, we focus in detail on the approaches which were and are still in use in English grammar, reviewing all attempts at PoS identification that we could trace, and describing and grouping the classifications. For the sake of brevity, and where it is enough to name just one grammarian to represent the ideas of a whole group of scholars, we provide joint lists of linguists who shared the same views. The designations of the classifications are suggested by the author and may be subject to discussion.

The Archetypal Latin Classification
Due to historical events and the influence of the Latin language on English, it would be reasonable to suppose that the former left some impact on the latter. The predominant majority of the classifications listed below are modelled on the classical Latin approach; however, the original Latin classification was sometimes applied unchanged as the corresponding English one, especially before the late 18th century, when the adjective was identified and formalised as an individual class of words; see Turner (1710). This approach did not become established, as it did not take into account the system of the English language (e.g., the presence of articles), and scholars admitted the necessity of identifying adjectives. Nevertheless, the archetypal Latin classification at least provided the names of the PoS in English and established the number of PoS in accordance with one of the most widespread classifications.

The Eight-component Classification
In discussing any classification in the English language, it is worth mentioning that the names of the PoS were usually under consideration but far from unified. Another peculiarity of the PoS classifications is the number of components and lexical representatives, which can differ significantly. One of the most common approaches is based on the archetypal Latin classification and is defined in this paper as the modified eight-component Latin classification.

The Modified Eight-component Latin Classification
This modification is grounded on the archetypal Latin classification; the only difference between them lies in replacing the Latin participle with the adjective that developed in English.
In our opinion, the absence of the article as a part of speech testifies that the "modified Latin classification" can rather be called morphological, as the article would belong to the indeclinable PoS; whereas treating this classification as syntactic, functional or semantic would require the presence of articles, as they can vary within these frames.
[...] classes (the noun, verb and particle), the linguist writes that particles include the adverb, preposition and conjunction; the verb stands by itself; whereas the noun contains the appellatives divided into the substantive and pronoun, and the adjectives are divided into the adjective and participle. In other words, the author partially applies the Stoics' system that differentiated between the proper and appellative nouns (Harris, 1751) and includes adjectives in the category of the noun, but neglects the category of the article (see Table 1).

The Modified Greek Classification
Despite the unique nature of the article in the English language and its presence in the archetypal Greek classification, this approach did not evolve in English grammar. However, there have been some attempts either to modify the archetypal Greek classification or to use it as the basis for more elaborate classifications. In this section, we investigate the eight-component modification only.
The main difference between the modified Latin and Greek classifications is the presence of the interjection in the former and the presence of the article in the latter; whereas the divergence between the archetypal and modified Greek classifications is, of course, the development of the adjective as an individual PoS.
The modified Greek classification includes noun, pronoun, verb, adjective, adverb, preposition, conjunction and article (Thring, 1868) or, according to Eastwood (2002), noun, pronoun, verb, adjective, adverb, preposition, conjunction and determiner. Eastwood notes that the class of determiners comprises not only articles, but also possessives, quantifiers and demonstratives. Nevertheless, we claim that Eastwood's approach represents the modified Greek classification, as all the other subclasses, except articles, are the results of functional transposition and originally belong to other PoS.
Notwithstanding the presence of the article, the modified Greek classification did not become widespread, supposedly due to the predominant influence of the Latin language and reasonable claims to include the interjection in the PoS. Correspondingly, this has led to amplification and elaboration of the PoS system and introduction of the nine-component classifications.

The Mixed Latin-Greek Classification
Another widespread classification, according to the data retrieved in the course of the research, is a nine-component classification. We define it as the mixed Latin-Greek classification, for it merely unites the Greek and Latin archetypal classifications into one and replaces the participle with a newly defined adjective. Therefore, the nine-component classification appears as follows: noun, pronoun, verb, adjective, adverb, preposition, conjunction, interjection, article.

The Alternative Nine-component Classifications
a) The first alternative approach or, more precisely, the first nine-component classification appeared even earlier than the mixed Latin-Greek classification; see Johnson (1756), Ward (1767). It is characterised by the absence of the adjective and the use of the participle. It includes: noun, pronoun, verb, participle, adverb, preposition, conjunction, interjection, article (Johnson, 1756). Its emergence is explained by the fact that in the first half of the 18th century, the adjective was not defined as a separate part of speech, but rather as a part of the noun class (see Table 1). In fact, it combines the Greek and Latin classifications; nevertheless, we do not treat it as a conventional one, due to the participle-adjective shift and the introduction of the adjective, which has been predominant since the end of the 18th century. In the course of time, scholars made attempts to return to that approach; see Brown (1802).

The Low-component Classifications
As the most widespread and frequently used approach is the eight-component modified Latin classification, followed by the nine-component mixed Latin-Greek classification, we define all other concepts which include fewer than 8 PoS as low-component classifications. To this list we assign original/individual approaches which substantially deviate from the classical ones and, for logical reasons, did not come into wide use:
a) Four-component classification (see Farro, 1754; Brightland, 1759) includes "names, words which express things themselves, qualities, words which express manners, properties, affections of things, affirmations, which express the actions, and particles or little words showing the manner or quality of actions, as also their relations, regards, connections" (Farro, 1754, p. 27). The modern understanding of the first three PoS is rather evident, whereas particles combine at large all prepositions, adverbs and conjunctions, which were later standardised as independent PoS.
b) Six-component classification is offered by Brown (1899) and, in fact, is quite a unique approach based on the theory of ideas, which can be of 6 types: object-ideas, attribute-ideas, attributes of action, connecting ideas, conjunction-ideas, copula-ideas. According to the author, "ideas are mental pictures or notions we form of things (external objects that we can see, hear, feel, smell or taste)" (Brown, 1899, p. 12). Of course, it is easy to draw a parallel between Brown's classification and a traditional one in order to arrive at a more conventional understanding.
[...] (1852) identifies content words/predicates (noun, verb, adjective, adverb) and grammatical elements (article, preposition, conjunction), at the same time replacing the pronoun with the article.
At first sight, it is possible to speak of some other four-, five-, or six-component classifications; on closer examination, however, it becomes clear that many scholars single out the category of particle, which consists of three or four other PoS (Philips, 1731; Webster, 1790; Bailey, 1855; Rushton, 1869, etc.). Hence, in the present paper, we do not take such classifications into account, but make a remark as to their subdivision.

The Multi-component Classifications
According to the paradigm stated above, any classification consisting of more than nine components is treated as a multi-component classification. As the research shows, the multi-component approaches were introduced almost at the same time as the traditional ones but did not gain popularity:
[...] Fleay (1852) also speaks of 11 classes of words distinguished by syntactic or logical analysis, namely, interjectional, verbal, substantival, adjectival, adverbial, preposition, conjunction, relative, copula, article, symbolic. This classification stands out for defining interjectional words as the earliest words in the primitive language and for distinguishing copula verbs and symbolic words (just three units) as independent PoS.
c) A twelve-component classification is represented by Becker (1845, 1855), who divides PoS into notional and relational words. According to him, nouns, verbs, adjectives, and adverbs belong to the first group, and the second group, apart from the usual prepositions, conjunctions, pronouns, articles and interjections, comprises auxiliary verbs and relational adverbs (the distinction of which became popular at the end of the 20th century) and numerals (for the first time defined as an individual class in the English language).
d) Fourteen-component classifications refer to a group of quite fractionised approaches, in which the authors try to identify various subclasses within the traditional parts of speech system.
One of the classical PoS identification approaches is suggested by Sweet (1892, 1900) in his later works, as in the early papers (1887) the linguist differentiates only 9 PoS. Sweet (1892) supports a traditional division into declinable and indeclinable words, or particles, stating that such a division is not entirely dependent on the presence or absence of inflection, but really goes deeper, corresponding, to some extent, to the distinction between head-word and adjunct word. However, rather elaborate is his subsequent subdivision into 14 subclasses or PoS: among the declinable, Sweet (1892) names noun words: noun, noun-pronoun, noun-numeral, infinitive, gerund; adjective words: adjective, adjective-pronoun, adjective-numeral, participle; verb: finite verb, verbals (infinitive, gerund, participles); and he conventionally refers to adverb, preposition, conjunction, and interjection as indeclinable PoS. This classification evidently shows the ambiguity that exists within the PoS, for instance nouns, adjectives, verbs.
Quirk et al. (1985) presented a fourteen-component classification in which there are 10 main PoS, 2 secondary PoS and 2 lexical units which do not fall into any of the previous groups. Thus, the former consists of the open word class (noun, adjective, adverb, full verb) and the closed word class (pronoun, preposition, conjunction, determiner, modal verb, primary verb); the additional classes cover numeral and interjection; while the particle "not" and the infinitive marker "to" stand apart. The most significant differentiation is observed within the class of verbs, which, as the linguists state, comprises full, primary and modal verbs, and in the particle "not" and the infinitive marker "to", which do not fit under any classification.
However, these lexical units are nothing more than genuine particles, the representatives of which are usually distributed among other PoS, creating ambiguity.
e) A nineteen-component classification is the most complicated approach according to our research. Fries (1952) states that "the words that occupy the same sets of [positions] in English sentences must belong to the same class of words" (pp. 118-119). In his theory, the linguist, on the basis of the syntactic-distributional approach and by means of the so-called substitution [...]
Another nineteen-component approach is introduced by Biber (1999), who offers the following division: the lexical words (noun, verb, adjective, adverb), singled out on the basis of the conventional approach; the class of functional words, which does not differ much from those described by other scholars but comprises many already proposed PoS (determiner, pronoun, numeral, preposition, primary auxiliary, modal auxiliary, adverbial particle, coordinator, subordinator, wh-words, particle "not", existential "there", infinitive marker "to"), all of which indicate relations between lexical words; and inserts, i.e., interjections and various markers. Such a division is evidently based on the main functions of the items and their grammatical behaviour in the sentence/discourse.

Conclusion
The research on the history of PoS identification in the English language shows an extreme diversity in the approaches applied. The diachronic development of the classical approaches explains the Greek and Latin classifications and, in particular, their profound impact on English. The concepts of PoS identification changed from semantic to syntactic and then morphological. The Greek and Latin classifications were not identical, differing in the presence of the interjection and the absence of articles in the latter. A peculiarity common to both was the absence of adjectives as an individual PoS. Furthermore, these classifications were not multiple in synchronic perspective; however, they changed in diachrony, due to the advance of philosophy and linguistics.
Despite the fact that the classical Greek and Latin classifications provided the English language with the theoretical grounds for PoS identification, English grammarians and linguists managed to present numerous alternative approaches (see Fig. 1, which enumerates more than 30 different classifications of PoS since 1700, arranged into 14 general subgroups and 5 groups). Such multiplicity refers not only to the period of standardisation (the 18th century), but to all other periods as well, both in synchrony and diachrony. Further research in the field is of critical importance, as there is a necessity to scrutinise currently accepted subdivisions of PoS and newly elaborated approaches to PoS identification. Another issue is to address Middle English classifications, which might also lay the foundation for new plausible principles of PoS division. Additionally, it is necessary to explore the etymological correlation between the paradigms of PoS and the lexical units representing them in the Old Germanic languages, as well as in comparison with other branches of the Indo-European language family. This will help to explore lexical units which overlap in form, function or meaning in the framework of comparative linguistics and language typology.