This study investigated the differences between North and South Korean loanwords from the perspective of natural language processing and discussed effective ways to build a North Korean corpus. Chapter 2 reviewed existing research on North and South Korean loanwords and pointed out that the information processing of these loanwords has received insufficient discussion. Chapter 3 examined, by type, the differences in loanwords, an area in which the North and South Korean languages diverge considerably, using the headwords of two large dictionaries, Urimalsaem and Joseonmaldaesajeon. Three types were distinguished and explained with actual examples: first, differences in notation; second, differences in etymology; and third, differences in vocabulary arising from socio-cultural factors. Chapter 4 discussed how to handle North Korean loanwords when constructing a North Korean corpus, exploring corpus construction and loanword processing methods in terms of lexical, etymological, phonological, and notational aspects, as well as socio-cultural vocabulary. The study is meaningful in that it laid a preliminary foundation for research on North Korean corpus processing from a natural language processing perspective.