Automatic Assessment and Human Judgement Analysis of the Quality of Korean Neural Machine Translation of Chinese Statutory Provisions

고수진

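Document type: Thesis

Author: 고수진 (Hankuk University of Foreign Studies, Graduate School of Interpretation and Translation)

Advisor: 김진아

Year of publication: 2023

Copyright: Theses of Hankuk University of Foreign Studies are protected by copyright.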

Abstract · Keywords

This study evaluates the quality of Korean neural machine translation of Chinese statutory provisions using automatic assessment and human judgement, and compares and analyzes the results from the perspective of Chesterman’s expectancy norms within descriptive translation studies.
Its purpose is to examine whether neural machine translation of Chinese legal texts meets the standards that readers expect, and to explore how human evaluators can use automatic assessment efficiently.
The study covers a total of 648 sentences from 219 articles of the Constitution of the People’s Republic of China, the Chinese Nationality Act, the Chinese State Reimbursement Act, and the Anti-foreign Sanctions Act, as provided by the World Laws Information Center under the Office of Legislation. The expected standards for legal translation were set as accuracy, readability, and consistency, following the legal translation guidelines of the Office of Legislation and the Korea Legislation Research Institute, the public institutions in charge of legal translation in Korea. Machine translations of the legal texts were produced with Naver Papago (N2MT) and Google Translate (GNMT), and were evaluated with an automatic assessment model and a human judgement model. The machine translation output was generated on June 30, 2022, and three human evaluators took part in the human judgement.
In the data analysis, all automatic assessment and human judgement scores except the BLEU score were 0.8 or higher. The correlation among the automatic assessment metrics was 0.60 to 0.85, a medium to high level, and the correlation among the human evaluators was 0.45 to 0.55, a weak to medium level. The correlation between automatic assessment and human judgement was the lowest: 0.22 to 0.26 for the BLEU score, 0.27 to 0.41 for the BERT score, and 0.00 to 0.18 for the LASER score.
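The correlation figures above can be read against a small worked example. The sketch below computes a Pearson correlation coefficient between one automatic metric and one human evaluator over the same sentences. It is an illustration only: the abstract does not state which correlation coefficient the thesis used, so Pearson is an assumption, and the function name `pearson` and all scores are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sentence scores, invented for illustration only;
# they are not data from the thesis.
metric_scores = [0.31, 0.45, 0.52, 0.60, 0.74]  # e.g. an automatic metric
human_scores = [0.80, 0.85, 0.80, 0.95, 0.90]   # e.g. one human evaluator

r = pearson(metric_scores, human_scores)
print(f"r = {r:.2f}")
```

A value near 1 would mean the metric ranks sentences much as the evaluator does, while values in the 0.0 to 0.4 range reported above would mean the two largely disagree on which sentences are better.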
With respect to accuracy, automatic assessment has an advantage because its metrics were developed around accuracy, but in some sentences it produced errors because it could not fully take context into account. In human judgement, the evaluators differed at the word, phrase, and text levels, but not at the sentence level.
In terms of readability, automatic assessment and human judgement showed different patterns in judging context. For some sentences to which all human evaluators gave high scores, automatic assessment tended to treat changes in vocabulary or logical relationships as errors.
As for consistency, automatic assessment showed that the same word was rendered with different vocabulary even within a single sentence, and that the titles of laws were not translated uniformly. In human judgement, overall consistency and coherence were rated at broadly similar levels.
In summary, of the three expected standards for legal texts (‘accuracy’, ‘readability’, and ‘consistency’), automatic assessment had an advantage in evaluating ‘accuracy’, while it could evaluate ‘readability’ and ‘consistency’ only partially. Human judgement had the advantage of being able to evaluate all three, but it was affected by evaluator subjectivity. From the perspective of the expected standards, evaluating the machine translation of Chinese legal provisions efficiently requires considering items for ‘consistency’ and a ‘weight factor’.
The results indicate that Korean neural machine translation still cannot fully meet the expected standards for Chinese legal texts and needs technological improvement, such as higher similarity. For human judgement, using neural machine translation in human translation quality evaluation of Chinese legal texts calls for methodological measures such as using statute books, analyzing and synthesizing the weak points of neural machine translation, and developing evaluation criteria specialized for Chinese legal texts.

#ChineseStatutoryProvisions #TranslationQualityAssessment #TQA #NeuralMachineTranslation #LegalTranslation

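1. Introduction
1.1. Research Background and Purpose
1.2. Research Questions and Methods
2. Theoretical Background and Previous Research
2.1. Descriptive Translation Studies (DTS)
2.1.1. Chesterman's Expectancy Norms
2.1.2. Translation Quality Assessment (TQA) and TQA Indicators
2.1.3. TQA Models
2.2. Neural Machine Translation (NMT)
2.2.1. The Concept of Neural Machine Translation
2.2.2. Quality Evaluation Metrics for Neural Machine Translation
2.2.3. Neural Machine Translation Models
2.3. Legal Translation
2.3.1. The Korean Legal System and the Structure of Statutory Provisions
2.3.2. The Chinese Legal System and the Structure of Statutory Provisions
2.3.3. Legal Translation and Translation Strategies
3. Research Methods
3.1. Research Questions and Research Model
3.2. Research Procedure
3.2.1. Data Collection
3.2.2. Data Analysis
3.2.2.1. Automatic Assessment Model
3.2.2.1.1. BLEU score
3.2.2.1.2. BERT score
3.2.2.1.3. LASER score
3.2.2.2. Human Judgement Model
3.2.2.2.1. Korean-Chinese Neural Machine Translation Evaluation Model
3.2.2.3. Correlation Coefficient and p-value
4. Data Analysis Results
4.1. Automatic Assessment Analysis
4.1.1. Analysis of the Automatic Assessment Models
4.1.1.1. BLEU score
4.1.1.2. BERT score
4.1.1.3. LASER score
4.1.2. Analysis by Statute
4.1.2.1. Constitution
4.1.2.2. Nationality Act
4.1.2.3. State Reimbursement Act
4.1.2.4. Anti-foreign Sanctions Act
4.1.3. Correlation Coefficient and p-value
4.1.4. Summary
4.2. Human Judgement Analysis
4.2.1. Analysis of the Human Judgement Model
4.2.1.1. Word Level
4.2.1.2. Phrase Level
4.2.1.3. Sentence Level
4.2.1.4. Text Level
4.2.2. Analysis by Statute
4.2.2.1. Constitution
4.2.2.2. Nationality Act
4.2.2.3. State Reimbursement Act
4.2.2.4. Anti-foreign Sanctions Act
4.2.3. Correlation Coefficient and p-value
4.2.4. Summary
4.3. Analysis of Expectancy Norms through Automatic Assessment and Human Judgement
4.3.1. Accuracy
4.3.2. Readability
4.3.3. Consistency
4.3.4. Summary
5. Conclusion
5.1. Summary of the Study
5.2. Limitations and Significance of the Study
References
Appendix
English Abstract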
