Wednesday, February 22, 2006

Trados and OmegaT - compatibility issues

OmegaT now has sentence segmentation based on SRX and is to be considered the leader in that specific field. TMX compatibility is now in place as well. What remains is the possibility to translate any kind of file with it ... and even that is coming nearer and nearer.
Even without filters for Trados files, it is now possible to work on Trados presegmented files. I wanted to use a file created with RTF Styler by MaxPrograms.com, but I could not get the segmentation to show up ... don't ask me why ... I opened the document in Microsoft Word after running it through RTF Styler and it looked just like before. Well, you can also create Trados presegmented files with the trial version of Trados, or you can ask your customer to send you the file already segmented ... of course I had no customer sending me files, but some colleagues did (thank you!!!).
Now what to do with such a file ...

  • open the .doc file with OpenOffice.org and save it as .odt (open document) format
  • within OmegaT go to the segmentation rules and add some rules
  • in each new rule you create you must check the checkbox, since we are creating artificial line breaks that will "pull out" the text to be translated
  1. add a new rule and put \{> in the before column, leaving the after column empty
  2. add a new rule and put <\} in the after column, leaving the before column empty
  3. add a new rule and put \{0> in the before column, leaving the after column empty
  4. add a new rule and put <0\} in the after column, leaving the before column empty
  • now, after reloading the project, you will have each sentence twice, plus segments that contain only the Trados brackets
  • the only thing you must be careful with is to translate only every second segment, that is, the second copy of each sentence
  • once the translation is ready you create the target text, open it in OpenOffice.org and save it as .doc or .rtf (according to your original file).
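
To see what the four rules above do to a presegmented line, here is a small Python sketch. It only simulates the break rules with Python regular expressions; it is not OmegaT's own code, and the sample line and the function name split_segments are made up for illustration. It assumes that a break is placed wherever a rule's "before" pattern ends and its "after" pattern begins.

```python
import re

# (before, after) pattern pairs, copied from the four rules listed above.
BREAK_RULES = [
    (r"\{>", ""),
    ("", r"<\}"),
    (r"\{0>", ""),
    ("", r"<0\}"),
]

def split_segments(text):
    """Split text where a rule's 'before' pattern ends and its 'after' pattern starts."""
    breaks = set()
    for before, after in BREAK_RULES:
        # An empty column places no constraint on that side of the break.
        pattern = before + (f"(?={after})" if after else "")
        for match in re.finditer(pattern, text):
            breaks.add(match.end())
    pieces, start = [], 0
    for position in sorted(breaks):
        if start < position:
            pieces.append(text[start:position])
        start = position
    if start < len(text):
        pieces.append(text[start:])
    return pieces

# Hypothetical presegmented line: source text, then the still untranslated copy.
sample = "{0>Hello world.<}0{>Hello world.<0}"
segments = split_segments(sample)
for number, segment in enumerate(segments, start=1):
    print(number, repr(segment))
```

In this sample the sentence shows up twice, surrounded by three segments that hold only brackets; the second copy of the sentence is the one to translate.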
Please understand that this is a workaround. It should work for approximately 90% of Trados files; I am still missing one answer, but the short files I tried all seem to work as they should. To make sure OmegaT can handle your file, create a project, make some changes in a few segments and then create your target file.
Check that the target file opens correctly and save it as a .doc/.rtf file.

It would be a huge advantage if OmegaT had an "ignore hidden text" function, since this would allow for all kinds of workarounds using OpenOffice.org macros :-) With such a function we would not need any workaround for Trados presegmented files at all, since the source segment and the Trados brackets are already marked as hidden text.

Well I don't know how difficult it is to implement that in OmegaT, but having that possibility would be really great.

Thanks for any feedback on your experiences.

2 comments:

SabineWanner said...

First of all, let me copy a mail by JC from the OmegaT user list here:

(please substitute the ( brackets around the segment tags with < brackets ... the blog does not allow me to post the tags as they should be).

****

Sabine,

Let me paste parts of your blog item here so as to reply more easily.

> OmegaT now has sentence segmentation based on SRX and is to be
> considered the leader in that specific field.

Don't forget that SRX is already implemented by Trados, SDLX, Heartsome.

> add a new rule and add \{> in the before-column and leave the after
> column empty
> add a new rule and add <\} in the after-column and leave the before
> column empty
> add a new rule and add \{0> in the before-column and leave the after
> column empty
> add a new rule and add <0\} in the after-column and leave the before
> column empty
> now, reloading the project you will have each sentence twice +
> segments with only the Trados brackets
> the only thing you must be careful with is to translate only every
> second segment
> once the translation is ready you create the target text, open it
> in OpenOffice.org and save it as .doc or .rtf (according to your
> original file).

The process is very similar to what I proposed here last October to
have rudimentary LaTeX file support.

Basically, use segmentation rules to transform a formatted file like:

[marker]blablabla[endofmarker]

to:

(segment01)[marker](/end segment)
(segment02)blablabla(/end segment)
(segment03)[endofmarker](/end segment)

The result is that the pseudo segments (segment01) and (segment03) will
be in the translation memory, but since OmegaT does not keep duplicates
there will only be one instance of each. And the item to translate
will be clearly identifiable because it contains no specific marker.

Since segmentation rules are meant to segment sentences and not
markers it may be a little confusing to mix the 2 to create pseudo
"file filters" but as long as it works, why not use our imagination.

I suppose though that at one point in time we'll have to clearly add
a section in the segmentation rules for "pseudo filters" so that
language segmentation and format segmentation are neatly separated.

> It would be a huge advantage if OmegaT had a "ignore hidden text"-
> function since this would allow for all kinds of particular
> workarounds using OpenOffice.org macros :-) and if it had the
> ignore hidden text thingie we would not need any workaround for
> Trados presegmented files since the source segment and the
> particular brackets are already marked as hidden text.

Well, if I am not wrong, that is what ITS is about, so if we manage
to have OmegaT react to styles and not only to structural markers we
will be able to have what you ask.

Thank you for your comments. And we are all waiting for you and
Samuel to find the final solution :)

Jean-Christophe

SabineWanner said...

And now one more comment from my side: SRX is still being implemented and is not fully supported right now, so you cannot simply copy an .srx file into a folder and expect it to work ... well, the whole thingie is quite long to describe ... another day :-)
