Computerization of 15th-Century Documentary Materials - Researcher-Centered Corpus Construction and Use -

(주)학지사

First registered: 2017.02.01
Last authored: 2014.04; 26 pages; Adobe PDF; price 5,500 won

Bibliographic information

ㆍPublisher: 우리말학회 ㆍJournal: 우리말연구, vol. 37
ㆍAuthor: 남경란

Korean abstract

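When 15th-century documents are built into a raw corpus, the text must first be collated against the original edition and corrected, several times over. To process the corpus for the topic or domain a researcher is interested in and then use it in research, it is advisable always to process the book title (書名) and the publication date, that is, the temporal information, together. In addition, a database for the diachronic study of change in Korean vocabulary should be based on at least three pieces of information: vocabulary information that includes the period of the book title in which the word first appears, similar-vocabulary information that records the variant forms of that first word, and information on the meaning in current use.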
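
The collation step described above can be sketched roughly as follows. The abstract only says that the corpus is compared with the original and corrected; this sketch assumes, purely for illustration, that a second, already checked transcription exists in digital form, and the function name `collation_report` is hypothetical.

```python
import difflib
import unicodedata


def collation_report(draft_lines, checked_lines):
    """List line-level differences between a draft and a checked transcription.

    Both arguments are lists of strings, one string per line of text.
    Returns tuples of (operation, 1-based draft line, draft text, checked text).
    """
    # Normalise to NFC so composed and decomposed forms of the same
    # character compare as equal.
    draft = [unicodedata.normalize("NFC", s) for s in draft_lines]
    checked = [unicodedata.normalize("NFC", s) for s in checked_lines]

    matcher = difflib.SequenceMatcher(a=draft, b=checked, autojunk=False)
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Keep only the spans an editor needs to re-check by hand.
            report.append((tag, i1 + 1, draft[i1:i2], checked[j1:j2]))
    return report
```

A report of this kind only points to places where the two transcriptions disagree; deciding which reading is right still requires going back to the original document.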

English abstract

This study introduces a way to computerize original 15th-century Korean documents so that they can be used in a variety of studies, with the materials selected and sorted according to what each researcher needs in terms of vocabulary, phrasing, and sentence structure.
When 15th-century Korean documents are computerized, an error-free corpus is clearly the ideal. However, because corpus construction is time-consuming, an error-free corpus is extremely difficult to produce. Therefore, when 15th-century Korean documents are converted into a raw corpus, the text must first be compared with the original documents and corrected. A corpus processed for the topic or domain a researcher is interested in can then be put to use as follows. First, for Korean language materials, it is advisable to process the ‘book title’ and the ‘publication period’ together, because they carry the temporal information. Second, a database for the diachronic study of change in Korean vocabulary should be based on at least three pieces of information: vocabulary information that includes the period of the book title in which the word first appears, similar-vocabulary information that records variants of that first form, and information on the present meaning.
In sum, if the similar vocabulary (variants) of a search term is extracted together with time information such as ‘book title’ and ‘publication period,’ and a corpus organized by period and by document is built, the result will be useful in many ways, not only for 15th-century Korean documents but also for other research materials that researchers of Korean want to use.
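
The three pieces of information and the ordering by period and document could be modelled along the following lines. This is a minimal sketch, not the author's database design: the class names, field names, and helper functions are all hypothetical, and it assumes the attestations passed to `build_entry` have already been judged to belong to the same word.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attestation:
    """One occurrence of a word form in a dated source (hypothetical schema)."""
    form: str    # spelling as it appears in the source
    title: str   # book title
    year: int    # publication year, i.e. the temporal information


@dataclass
class LexicalEntry:
    """Holds the three pieces of information named in the abstract."""
    headword: str                 # form of the earliest attestation
    first_attested: Attestation   # first book title together with its period
    variants: list = field(default_factory=list)  # similar (variant) forms
    present_meaning: str = ""     # meaning in current use


def build_entry(attestations, present_meaning):
    """Build one entry from attestations of a single word."""
    # sorted() is stable, so ties keep their input order.
    ordered = sorted(attestations, key=lambda a: (a.year, a.title))
    first = ordered[0]
    variants = []
    for a in ordered:
        if a.form != first.form and a.form not in variants:
            variants.append(a.form)
    return LexicalEntry(first.form, first, variants, present_meaning)


def by_period_and_title(attestations):
    """Group forms by (year, title), with groups in chronological order."""
    groups = defaultdict(list)
    for a in attestations:
        groups[(a.year, a.title)].append(a.form)
    return dict(sorted(groups.items()))
```

Keeping the first attestation as its own field makes the "first book title plus period" information directly queryable, while the chronological grouping corresponds to a corpus listed by time and by document. Any real data would come from the collated corpus; no sample values are given here because the abstract does not supply any.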