Selection of a morphological analysis dictionary and processing of text data to analyze martial arts poems texts
https://doi.org/10.15034/0002002413
| Item type | Bunkyo University Academic Repository registration item type (1) |
|---|---|
| Release date | 2025-05-30 |
| Title | 武道歌の計量テキスト分析における 形態素解析辞書の選択とテキストデータの加工 |
| Title (English) | Selection of a morphological analysis dictionary and processing of text data to analyze martial arts poems texts |
| Creators | 小林, 勝法; 佐藤, 晧也 |
| Description type | Abstract |
| Description | Martial arts poems are 31-syllable, fixed-form poems that describe the principles, techniques, and spirit of martial arts training. Text mining of these poems requires an appropriate morphological analysis dictionary, and this study investigates how to select one. Unlike prose, the poems contain no punctuation marks and use rhetorical methods characteristic of tanka, so we also examined whether processing the text could improve the accuracy of the analysis. For the analysis, we used the martial arts poem anthology “Tsukahara Bokuden Ikunsho,” which consists of 96 poems composed by Tsukahara Bokuden at the end of the Muromachi period. Morphological analysis was performed with “Web Chamame,” analysis software developed by the National Institute for Japanese Language and Linguistics. The dictionaries found to be suitable for this collection were the “Middle Japanese Literary UniDic,” “Middle Japanese Colloquial UniDic,” and “Early Modern Japanese Literary UniDic,” all developed by the same institute. Morphological analysis with these dictionaries achieved high accuracy, approximately 97%. This indicates that quantitative text analysis of martial arts poems requires a morphological analysis dictionary appropriate to the period and literary style of the poems. The most suitable method is to analyze the text with software such as “Web Chamame” and choose the most appropriate dictionary from multiple options. The study further concludes that an analysis accuracy of 95% is an appropriate guideline for dictionary selection, and suggests that a sample used to compare analysis accuracy should contain a minimum of 60 poems or over 1,000 morphemes. More studies are needed to determine whether these results are generally applicable. Lastly, we found no decrease in analytical accuracy due to the combination of compound sentences and sentence-ending particles, a rhetorical device unique to tanka. We therefore conclude that it is not necessary to process the text data, for example by adding punctuation or dividing sentences. |
| Publisher | 文教大学大学院言語文化研究科付属言語文化研究所 |
| Publisher (English) | Bunkyo University Graduate School's Institute of Language and Culture |
| Language | jpn |
| Resource type | departmental bulletin paper |
| Publication type | VoR |
| Identifier registration | 10.15034/0002002413 |
| Identifier registration type | JaLC |
| Source identifier type | PISSN |
| Source identifier | 09147977 |
| Bibliographic information | 言語と文化 (Language and Culture), vol. 37, pp. 95-110, 16 pages, issued 2025-03-16 |
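
The selection guideline stated in the abstract (accept a dictionary at 95% analysis accuracy or higher, measured on a sample of at least 60 poems or more than 1,000 morphemes) can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary names and every count are hypothetical placeholders, not figures from the paper, and accuracy is assumed here to mean correctly analyzed morphemes divided by total morphemes.

```python
from dataclasses import dataclass

# Guideline values taken from the abstract.
ACCURACY_GUIDELINE = 0.95   # minimum analysis accuracy
MIN_POEMS = 60              # minimum sample size in poems ...
MIN_MORPHEMES = 1000        # ... or more than this many morphemes


@dataclass(frozen=True)
class SampleResult:
    """Manually checked analysis result for one dictionary (hypothetical data)."""
    dictionary: str
    poems: int
    morphemes: int
    correct: int

    @property
    def accuracy(self) -> float:
        # Assumed definition: share of morphemes analyzed correctly.
        return self.correct / self.morphemes

    @property
    def sample_is_adequate(self) -> bool:
        # "A minimum of 60 poems or over 1,000 morphemes".
        return self.poems >= MIN_POEMS or self.morphemes > MIN_MORPHEMES


def suitable_dictionaries(results):
    """Return names of dictionaries meeting the guideline, best accuracy first."""
    ranked = sorted(results, key=lambda r: r.accuracy, reverse=True)
    return [
        r.dictionary
        for r in ranked
        if r.sample_is_adequate and r.accuracy >= ACCURACY_GUIDELINE
    ]


# Placeholder inputs, invented purely for illustration.
results = [
    SampleResult("dictionary_a", poems=96, morphemes=1500, correct=1455),
    SampleResult("dictionary_b", poems=96, morphemes=1500, correct=1350),
    SampleResult("dictionary_c", poems=20, morphemes=300, correct=294),
]

print(suitable_dictionaries(results))
```

With these placeholder inputs only `dictionary_a` passes: `dictionary_b` falls below the accuracy guideline, and `dictionary_c` is excluded because its sample is too small to compare, even though its raw accuracy is higher.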