Abstract (translated from the Persian):
Scholars such as Bauer (2004) and Baayen (1992) regard productivity as a continuum. The productivity of an affix is the degree to which it can be used to form new words. By measuring productivity, the potential compounds and words of a language can be predicted. Studies to date have examined only a small number of affixes. The present study was carried out to examine more affixes and to reach more comprehensive conclusions. Its aim was to measure the productivity of the derivational prefixes and postfixes of Persian and to present a number of Persian affixoids. To this end, 166 derivational affixes, comprising 28 prefixes, 72 postfixes, and 66 suffixoids, were collected, and productivity was calculated separately for prefixes and postfixes. The affixes were searched for in part of the Monitor corpus, using a script written in Python. Productivity was assessed with Baayen's (2009) measure (P*). The findings showed that the most productive prefix is «بی» and the least productive prefix is «پیرا». Likewise, the most productive postfix is «ی» (the abstract-noun-forming one) and the least productive postfix is «وش».
Abstract: Productivity is one of the most important aspects of word formation and is treated in morphology as a continuum (Bauer, 1996; Baayen, 2009). Booij (2007) states that derivation is a frequent word-formation process. To the best of the researchers' knowledge, previous studies have investigated only a small number of Persian derivational affixes. This corpus-based study therefore examines a larger set of Persian affixes in order to reach more comprehensive results. It addresses two questions: 1. What are the most and the least frequent affixes in Persian? 2. Is there any relation between productivity and the predictability of word meaning? To this end, 166 affixes, consisting of 28 prefixes, 72 postfixes, and 66 affixoids, were collected and then searched for in the Monitor corpus using Python code. Baayen's measure (P*) was used to measure productivity. The results showed that Bauer's (2004) claim about productivity and the predictability of word meaning holds in most cases, with a few exceptions.
Keywords: Productivity, Affixes, Derivational, Prefix, Postfix, Affixoids, Baayen measurement.
Introduction: Productivity is the extent to which new words are produced with a particular affix at a considerable frequency (Aronoff, 1976; Lieber, 1981; Al and Booij, 1981; Bauer, 2004; Booij, 2007). Shaghaghi (2010) treats productivity as a process in which new words are produced by a frequent word-formation rule, and Bauer (2004) claims that the meaning of words created by productive affixes is more predictable. Consequently, studying and measuring productivity makes it possible to distinguish the potential words of a language from its actual words. Productivity can be studied in both morphology and syntax. In syntax, syntactic rules allow an unlimited number of sentences to be produced.
In morphology, by contrast, an unlimited number of words cannot be produced, because word-formation rules are subject to syntactic, phonetic, morphological, and semantic constraints. Studying productivity therefore requires a classification of affixes. In general, affixes are bound morphemes that create new words or other forms of existing words (Tabatabaei, 2016). According to Samie et al. (2008), there are two types of affixes: derivational and inflectional. Persian also has other kinds of affixes. Affixoids are morphemes that are no longer used as free morphemes and can be used as affixes, such as /pæɹæst/ in /vætænpæɹæst/. That said, /pæɹæst/ can also be analysed as a free morpheme functioning as a stem. Such affixes differ from other derivational affixes in that they can themselves take bound morphemes as affixes and create new words. Over time, some words acquire a secondary meaning alongside their original one and function as affixes that give the resulting word a different meaning, for example /sah/ functioning as an affix in /sahɹah/ (Kalbasi, 2008). Complex affixes are formed by combining affixes, for example /an + h = anɛh/. The affixes whose productivity is measured in this study include all of the above types.
Materials and Methods: The population of the study consists of 166 Persian affixes, comprising 28 prefixes, 72 postfixes, and 66 affixoids, collected from sources such as Farshidvard (2013), the Moein Dictionary, and Kalbasi (2008). In addition, a large portion of the affixes was selected from Viravirast[1], a dataset of Persian affixes extracted from the Persika[2] corpus and from the web using a crawler algorithm (viravirast.com). Furthermore, a section of the Monitor corpus[3] was selected according to the Morgan rule in order to find sample words for each affix.
Because Persian has four different affixes that share a single form but differ in meaning, context was needed to tell them apart. For example: (ی /i/, derivational, noun-forming), (ی /i/, derivational, functioning like -ness in English), (ی /i/, inflectional, third person singular), (ی /i/, inflectional, functioning like the determiner "a" in English). The Lancsbox[4] tool was therefore also used to study this postfix in context. The research method consisted of five stages. First, the affixes were collected and divided into two groups, prefixes and postfixes. Second, the corpus was preprocessed. Third, the affixes were searched for in the corpus with Python code using the NLTK and HAZM libraries. Fourth, the sample words for each affix were listed in an Excel sheet. Fifth, nonwords were deleted, leaving the relevant words. Productivity was measured with Baayen's general productivity measure (P*): P* = VN(1, c) / h. In this formula, VN(1, c) is the number of hapax legomena for a given affix and h is the number of hapax legomena for all affixes. Hapax legomena are words that occur only once. According to Baayen (2009), words that occur only once are probably new words. Therefore, the more once-occurring words a process yields, the more productive it is.
Discussion of Results and Conclusion: According to Baayen's general productivity measure, "بی" /bi/ is the most productive Persian prefix, and "پیرا" /piɹa/ and "ور" /væɹ/ are the least productive. Among the Persian postfixes, "ی" /i/ is the most productive. According to Bauer (2004), the more frequent the affixes of words are, the more predictable their meaning will be. The results showed that this claim holds in most cases, although some examples do not fit it.
For example, "دار" /daɹ/ is one of the productive affixes and means "have", but the word /mæɹdomdaɹ/ does not mean "having people". Rather, it means humanitarian: someone who behaves well towards people.
[1] Viravirast is an automatic system for writing and editing Persian text.
[2] A Persian Corpus for Multipurpose Text Mining and Natural Language Processing.
[3] A Persian monitor corpus with different subject categories (2020-8-8).
[4] Lancsbox is a corpus analysis tool.
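The productivity measure described under Materials and Methods can be illustrated with a small sketch. It assumes that each affix already has a table of word frequencies from the corpus, counts the words that occur exactly once, and divides by the total number of such words across all affixes. The function names, affix labels, and frequencies below are invented for illustration and are not taken from the study.

```python
def hapax_count(freqs):
    """Number of word types that occur exactly once in a frequency table."""
    return sum(1 for n in freqs.values() if n == 1)


def p_star(affix_freqs):
    """Map each affix to (its hapax count) / (hapax count over all affixes)."""
    hapaxes = {affix: hapax_count(f) for affix, f in affix_freqs.items()}
    h = sum(hapaxes.values())
    if h == 0:
        # No hapax legomena at all: every affix gets a score of zero.
        return {affix: 0.0 for affix in affix_freqs}
    return {affix: count / h for affix, count in hapaxes.items()}


# Toy, invented frequency tables: word -> number of occurrences.
toy = {
    "bi-": {"w1": 1, "w2": 1, "w3": 1, "w4": 12},
    "pira-": {"w5": 7},
    "-dar": {"w6": 1, "w7": 3},
}

scores = p_star(toy)
for affix, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(affix, round(score, 2))
```

Because the denominator is shared by all affixes, the scores sum to 1 and can be compared directly, which is what allows a ranking from most to least productive.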
Machine-generated summary:
What is the relationship between productivity and the predictability of word meaning? The concept of productivity arises in both morphology and syntax, with the difference that in syntax an unlimited number of sentences can be produced by applying syntactic rules, whereas in morphology the morphological rules and the constraints governing them do not allow an infinite number of words to be produced.
If the productivity of an affix meant that it combines with all categories, could such an affix be identified? Can it be said that, in general, an affix can combine with all categories? To answer these questions, the constraints governing word-formation processes must be examined.
The affixes examined in the present study are shown in the tables below in two groups, prefixes and postfixes: Table 1- Prefixes (see the page image). In addition, a number of Persian affixoids were presented in the following table: Table 3- Affixoids (see the page image). The search was carried out with a script written in Python, version 3, calling the NLTK and HAZM libraries.
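The search step can be sketched roughly as follows. The study's own script is not reproduced in the text, so this is only a minimal illustration under stated assumptions: it uses the Python standard library instead of NLTK and HAZM, a simple regular expression stands in for a proper Persian tokenizer, and the function names and toy sentence are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    # Runs of letters only; a stand-in for a real Persian tokenizer.
    return re.findall(r"[^\W\d_]+", text)


def candidates(tokens, affix, kind):
    """Frequency table of tokens that begin (prefix) or end (suffix) with affix."""
    if kind == "prefix":
        hits = (t for t in tokens if t.startswith(affix) and len(t) > len(affix))
    elif kind == "suffix":
        hits = (t for t in tokens if t.endswith(affix) and len(t) > len(affix))
    else:
        raise ValueError("kind must be 'prefix' or 'suffix'")
    return Counter(hits)


# Toy, invented text; the affixes are chosen only for illustration.
text = "نادان نادان ناامید مردمدار پولدار پولدار"
tokens = tokenize(text)

na_words = candidates(tokens, "نا", "prefix")
dar_words = candidates(tokens, "دار", "suffix")

print(na_words)
print(dar_words)
```

String matching of this kind only yields candidates. Forms that merely begin or end with the same letters still have to be filtered out by hand, which corresponds to the stage in which nonwords were deleted from the word lists.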
Morphological and Syntactic Constraints of Productivity in Persian Derivation.
A Corpus-based Study of Productivity of Derivational Prefixes in the Written Variety of Contemporary Persian.
Semantic analysis of pish prefix in Persian Language: Cognitive linguistics approach.