Understanding the North Korean (NK) educational system can advance efforts toward reunification. This paper constructs a corpus by digitizing the obtainable English textbooks used in NK, for use in further analysis. The corpus includes six English textbooks published in 1995, covering the first through sixth grades of middle school, for synchronic analysis, and three more textbooks published in 2008 for diachronic analysis. Construction of the corpus followed three principles: all English text is digitized; different components of the textbooks are digitized into separate files to preserve their form and content; and consistent names and conventions are assigned to the files. The corpus construction resulted in a tag set for the NK English textbooks, markup conventions, abbreviation conventions, and dialogue interaction markups applied in the NK English textbook corpus. The paper found important differences in the NK English textbooks: their use of compound nouns and their forms-focused, as opposed to meaning-focused, orientation. The corpus will provide basic data for further exploration of vocabulary, grammar, and text.