Pre-translating long segments with fragment assembly in OmegaT

I discovered this very nice feature in OmegaT 3.5_1. My client sends XLIFF files with no segmentation, but I apply sentence-based segmentation to make the translation task easier and to get better leverage from the project TM. Here is a look at the file as my client prepares it:

The multi-sentence segment in the file sent by my client.

And this is how I work on it:

Segmented paragraph in OmegaT

However, my client has also asked me to include the project TM with my translation in my delivery. Here’s the conflict: my TM is segmented by sentence, whereas the XLIFF files my client prepares are not segmented, so my client won’t get full leverage from my TM…

Fortunately, I can deliver a TM with no segmentation very easily. OmegaT (as of 3.5_1) has a very nice hidden feature that could be called “fragment assembly”. Let’s say you have to translate a new segment “ABC” and the TM contains fuzzy matches “A”, “B” and “C”. OmegaT identifies them as parts of the new segment and merges them into one longer 100% match for the whole paragraph-long segment.
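To make the idea concrete, here is a minimal sketch of fragment assembly in Python. It is only an illustration of the concept, not OmegaT's actual algorithm: the function name `assemble`, the dictionary-based TM, and the greedy longest-prefix strategy are all my own assumptions, and it only handles exact sentence matches separated by whitespace.

```python
from typing import Dict, Optional


def assemble(paragraph: str, tm: Dict[str, str]) -> Optional[str]:
    """Build a paragraph translation from sentence-level TM entries.

    Hypothetical helper: returns the assembled target text, or None if
    some part of the paragraph is not covered by the TM.
    """
    # Try longer source fragments first so a long entry wins over a
    # shorter one that happens to be its prefix. Empty keys are skipped.
    sources = sorted((s for s in tm if s), key=len, reverse=True)

    pieces = []
    rest = paragraph.strip()
    while rest:
        for src in sources:
            if rest.startswith(src):
                pieces.append(tm[src])
                # Drop the matched fragment and the whitespace after it.
                rest = rest[len(src):].lstrip()
                break
        else:
            # No TM entry matches the start of the remaining text.
            return None
    return " ".join(pieces)


# Sentence-level TM, as produced when working with segmentation enabled.
tm = {
    "The cat sleeps.": "El gato duerme.",
    "The dog barks.": "El perro ladra.",
    "It rains.": "Llueve.",
}

# Unsegmented, paragraph-long segment as it appears in the client's file.
paragraph = "The cat sleeps. The dog barks. It rains."
print(assemble(paragraph, tm))
```

Because the sketch is greedy and never backtracks, it can give up on a paragraph that a more careful search would cover; it is meant only to show why sentences “A”, “B” and “C” in the TM are enough to pre-translate the unsegmented segment “ABC”.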

All I need to do is disable the sentence-level segmentation in OmegaT:

Disable logical sentence-level segmentation

The project will reload, showing a one-paragraph segment, and I will get an assembled 100% match automatically.

All those matches can be inserted automatically if OmegaT is configured for that (in the Editing Behaviour options). I just need to go through the file quickly and all paragraph-long segments will be pre-translated with auto-populated assembled matches from the TM.

Finally, I just need to create the translated documents (Ctrl+D) and include the level2.tmx file in my delivery.
