Machine Translation : new task
Seq2Seq : new neural architecture
(1) Machine Translation : New task
SMT ( Statistical Machine Translation )
- Bayes Rule
- Learning alignment
needs a large amount of parallel data!
↓
Alignment : correspondence between particular words in the source and target sentences
(Prob) alignments can be one-to-many, many-to-many
Alignment a = latent variable
-> not explicitly specified in the data!
- Decoding
(Q) How to compute this argmax?
(A) Impose strong independence assumptions == Decoding
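For reference, the standard SMT equations behind the bullets above (the usual Bayes-rule formulation; notation mine, not copied from the slides):

```latex
% SMT objective: decompose P(y|x) via Bayes' rule
\hat{y} = \operatorname*{argmax}_y P(y \mid x)
        = \operatorname*{argmax}_y P(x \mid y)\,P(y)
% P(x|y) : translation model, learned from parallel data
% P(y)   : language model, learned from monolingual data
% the alignment a is a latent variable inside the translation model:
P(x \mid y) = \sum_a P(x, a \mid y)
```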
(2) Seq2Seq : New neural architecture
NMT (Neural Machine Translation)
- Seq2Seq
a single end-to-end neural network = Seq2Seq (involves two models : encoder & decoder)
encoder : encode source sentence
decoder : generate target sentence
-> Summarization / dialogue / parsing / code generation
P(y|x)
NMT directly calculates the conditional probability
| of the target language sentence y
| given the source language sentence x
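Spelled out, this is the standard chain-rule factorization over target words (stated here for reference; y_1 .. y_T are the target words):

```latex
P(y \mid x) = \prod_{t=1}^{T} P(y_t \mid y_1, \dots, y_{t-1}, x)
```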
(Q) How to train?
(A) train on a big parallel corpus -> loss J compares predicted words vs. actual words
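One common way to write this objective as per-step cross-entropy (notation mine; y_t^* denotes the actual next word at step t):

```latex
J(\theta) = \frac{1}{T} \sum_{t=1}^{T} J_t
          = -\frac{1}{T} \sum_{t=1}^{T} \log P\big(y_t^* \mid y_1^*, \dots, y_{t-1}^*, x\big)
```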
- Multi-layer RNNs <- compute more complex representations
Multi-layer deep encoder-decoder machine translation net
2-4 layers -> best for the encoder RNN
4 layers -> best for the decoder RNN
(improvement : 1 to 2 layers >> 2 to 3 layers)
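A minimal sketch of such a multi-layer encoder-decoder in PyTorch, using a 2-layer encoder and 4-layer decoder to match the note above; the class, names, and hyperparameters are my own illustration, not the lecture's code:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hidden=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # 2-layer encoder RNN / 4-layer decoder RNN (within the 2-4 / 4 ranges)
        self.encoder = nn.LSTM(emb_dim, hidden, num_layers=2, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden, num_layers=4, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)  # hidden state -> vocab logits

    def forward(self, src, tgt):
        _, (h, c) = self.encoder(self.src_emb(src))  # encode source sentence
        # The 2-layer encoder yields 2 (h, c) states but the 4-layer decoder
        # expects 4; repeating them is a simple bridging assumption made here
        # (real systems use a learned bridge and attention).
        dec_out, _ = self.decoder(self.tgt_emb(tgt),
                                  (h.repeat(2, 1, 1), c.repeat(2, 1, 1)))
        return self.out(dec_out)                     # logits for every step

model = Seq2Seq(src_vocab=8000, tgt_vocab=8000)
logits = model(torch.randint(0, 8000, (3, 10)),      # toy batch: 3 source sents
               torch.randint(0, 8000, (3, 12)))      # 3 target sents
print(logits.shape)                                  # torch.Size([3, 12, 8000])
```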
- Decoding
Greedy Decoding : take the single most probable word at each step
(Prob) ex. He hit a ___ ..? -> once a wrong word is picked, no way to go back
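A toy greedy decoder, assuming a trained model like the Seq2Seq sketch above (`sos_id`/`eos_id` are hypothetical special-token ids, not from the lecture):

```python
import torch

@torch.no_grad()
def greedy_decode(model, src, sos_id, eos_id, max_len=50):
    ys = [sos_id]
    for _ in range(max_len):
        logits = model(src, torch.tensor([ys]))  # re-score the current prefix
        next_id = logits[0, -1].argmax().item()  # single most probable word
        ys.append(next_id)                       # committed: no way to go back
        if next_id == eos_id:
            break
    return ys
```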
Exhaustive search decoding
(Prob) computing all possible y -> far too EXPENSIVE (O(V^T) possible sequences)
Beam Search Decoding **
On each step -> keep the k most probable partial translations (hypotheses)
k : beam size
(ex) what are the two most likely continuations? -> k=2
compare scores for each hypothesis -> keep the top k
(Prob) longer hypotheses get lower scores (each step adds a negative log-probability)
(fix) Normalize by length
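A hedged beam-search sketch including the length-normalization fix, reusing the hypothetical model setup above (k=2 as in the example; again an illustration, not the lecture's code):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def beam_decode(model, src, sos_id, eos_id, k=2, max_len=50):
    beams = [([sos_id], 0.0)]            # (hypothesis, summed log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for ys, score in beams:
            logits = model(src, torch.tensor([ys]))
            log_p = F.log_softmax(logits[0, -1], dim=-1)
            top_p, top_ids = log_p.topk(k)               # k best continuations
            for p, i in zip(top_p.tolist(), top_ids.tolist()):
                candidates.append((ys + [i], score + p))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for ys, score in candidates[:k]:                 # keep k best hypotheses
            (finished if ys[-1] == eos_id else beams).append((ys, score))
        if not beams:
            break
    pool = finished or beams
    # normalize by length so longer hypotheses aren't unfairly penalized
    return max(pool, key=lambda c: c[1] / len(c[0]))[0]
```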
- Disadvantages : less interpretable (hard to debug), difficult to control
- Evaluation
BLEU : compares a machine-written translation to human-written reference translation(s)
| useful but imperfect
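For reference, the standard BLEU definition (a geometric mean of modified n-gram precisions p_n times a brevity penalty BP; the common formula, not reproduced from the slides):

```latex
\mathrm{BLEU} = \mathrm{BP} \cdot \exp\!\Big(\sum_{n=1}^{4} \tfrac{1}{4}\log p_n\Big),
\qquad
\mathrm{BP} = \min\!\big(1,\; e^{\,1 - r/c}\big)
% r = reference length, c = candidate (machine) translation length
```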
- Is Machine Translation solved? NOPE!
+ common sense failures, ex. 'paper jam'
+ biases picked up from training data, ex. gender bias