4. Phrase-Based Machine Translation

The last two chapters (Chapter 2 on learning bilingual mappings and Chapter 3 on the IBM models) showed how to establish word alignments given parallel corpora. The basic methodology was expectation maximization (EM), which disentangled a circular dependency: the alignment j ←→ i depends on the translations f_j ←→ e_i, and the translations f_j ←→ e_i depend on the alignment j ←→ i. In IBM Model 1, all alignments were assumed to be equally likely. To obtain translation probabilities t(f_j|e_i) that maximized the probability of the observed data, EM was run from randomly initialized values of t(f_j|e_i). Each iteration computed expected counts c(f_j|e_i; f^(s), e^(s)), s = 1, …, S, of the event f_j ←→ e_i over the whole corpus; these expected counts then revised the translation probabilities, and the cycle repeated until convergence.
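To make that loop concrete, the following is a minimal Python sketch of EM training for IBM Model 1. It is not the book's own code: the corpus format (token-list pairs), the NULL word, and the uniform initialization are assumptions made for illustration. (The text mentions random initialization; Model 1's objective converges to the same optimum from any positive start, so uniform values are used here for simplicity.)

```python
# A minimal sketch of IBM Model 1 EM training (illustrative, not the book's code).
from collections import defaultdict

def train_ibm_model1(corpus, iterations=10):
    """Estimate translation probabilities t(f|e) with EM.

    corpus: list of (f_sentence, e_sentence) pairs, each a list of tokens.
    """
    f_vocab = {f for fs, _ in corpus for f in fs}
    # Initialize t(f|e) uniformly over the foreign vocabulary.
    t = defaultdict(lambda: 1.0 / len(f_vocab))

    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f|e) over the corpus
        total = defaultdict(float)   # normalizer per English word e
        for fs, es in corpus:
            es = ["NULL"] + es       # allow alignment to the empty word
            for f in fs:
                # E-step: P(f aligns to each e), normalized over the sentence;
                # Model 1 treats all alignment positions as equally likely a priori.
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: revise t(f|e) from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

if __name__ == "__main__":
    # Toy German-English parallel corpus, purely for illustration.
    corpus = [
        ("das haus".split(), "the house".split()),
        ("das buch".split(), "the book".split()),
        ("ein buch".split(), "a book".split()),
    ]
    t = train_ibm_model1(corpus)
    print(f"t(haus|house) = {t[('haus', 'house')]:.3f}")
```

After a few iterations on the toy corpus, mass concentrates on the correct pairs (e.g., t(haus|house) approaches 1), mirroring how the expected counts progressively sharpen the translation probabilities.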
