How do you organise translation memories?
Thread poster: Dan Lucas
Giles Watson
Giles Watson  Identity Verified
Italy
Local time: 23:13
Italian to English
In memoriam
TM not TB Sep 26, 2014

Michael Beijer wrote:

The general idea is to move from relying on TMs to using only TBs, as they offer much more fine-grained control when it comes to auto-assembly.



Auto-assembly was the first thing I turned off in DV when I used to translate with it.

The problem is that in my sectors and language combination (IT-EN), and I imagine many others, the key to effective translation lies not so much in substituting discrete words or phrases as in reformulating the thought of the source text in an appropriate manner for the target language.

In my experience, concordance searches of translation memories are much more likely to throw up useful suggestions than the chunk-for-chunk approach of termbases.


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 22:13
Member (2009)
Dutch to English
+ ...
@Giles: Sep 26, 2014

Giles Watson wrote:

Michael Beijer wrote:

The general idea is to move from relying on TMs to using only TBs, as they offer much more fine-grained control when it comes to auto-assembly.



Auto-assembly was the first thing I turned off in DV when I used to translate with it.

The problem is that in my sectors and language combination (IT-EN), and I imagine many others, the key to effective translation lies not so much in substituting discrete words or phrases as in reformulating the thought of the source text in an appropriate manner for the target language.


Yes, the best results will be achieved with technical, repetitive content. I should probably have mentioned that the colleague of mine who has had the best luck with this approach translates primarily technical manuals, which contain the same thing, over and over again, in infinite variations. I myself translate many kinds of things (some not very well suited to this chunk-for-chunk approach), but I intend to test this method on standard contracts (terms & conditions!) and technical manuals.


In my experience, concordance searches of translation memories are much more likely to throw up useful suggestions than the chunk-for-chunk approach of termbases.


It depends on what you are looking for: a specific legal term (in which case a TB will most likely be of more use) or an idiomatic expression (in which case running a concordance search across millions of TUs will probably be more useful). For the latter, I have yet to find anything that can hold a candle to the amazing (and free) TMLookup: http://www.farkastranslations.com/tmlookup.php

I have tried running concordance searches on my massive TM database (which currently contains around fifty million TUs!) in memoQ 2014, DVX2/3, SDL Studio 2014, CafeTran, and a few others I can't remember now, and TMLookup runs circles around them all. This is not surprising, as TMLookup was designed to do just one thing, and to do it well: search through masses of TMXs. CAT tools are doing all kinds of other things in the background, which really slows them down.
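
For readers who want to try this kind of search outside a CAT tool, here is a minimal Python sketch of a concordance search across several TMX files. It is not how TMLookup works internally (the thread does not say), and the function names `iter_units` and `concordance` are invented for illustration. It assumes well-formed TMX files with `tu`, `tuv` and `seg` elements, and it streams each file rather than loading it whole.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# TMX 1.4 stores the language in xml:lang; some older files use a plain "lang" attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def iter_units(tmx_path):
    """Yield one {language code: segment text} dict per translation unit."""
    for _event, elem in ET.iterparse(tmx_path, events=("end",)):
        if elem.tag != "tu":
            continue
        unit = {}
        for tuv in elem.iter("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() drops inline tags and keeps the readable text
                unit[lang] = "".join(seg.itertext())
        yield unit
        elem.clear()  # discard the unit's children once it has been handed over


def concordance(tmx_paths, phrase, lang):
    """Yield (file name, unit) for every unit whose `lang` text contains `phrase`."""
    needle, lang = phrase.lower(), lang.lower()
    for path in tmx_paths:
        for unit in iter_units(path):
            # "en" also matches regional codes such as "en-gb"
            text = next((t for code, t in unit.items() if code.startswith(lang)), "")
            if needle in text.lower():
                yield Path(path).name, unit
```

Called as `concordance(["legal.tmx", "manuals.tmx"], "without prejudice", "en")` (hypothetical file names), it yields every matching unit together with the file it came from. A plain substring scan like this is linear in the size of the memories, so a collection of tens of millions of TUs would need an index to stay responsive.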

Michael


 
Dan Lucas
Dan Lucas  Identity Verified
United Kingdom
Local time: 22:13
Member (2014)
Japanese to English
TOPIC STARTER
I can imagine situations in which both approaches work Sep 26, 2014

Giles Watson wrote:
In my experience, concordance searches of translation memories are much more likely to throw up useful suggestions than the chunk-for-chunk approach of termbases.

I think in some cases, such as documents densely packed with technical vocabulary, an approach biased towards termbases would be both effective and maybe efficient.

Dan


[Edited at 2014-09-26 22:06 GMT]


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 22:13
Member (2009)
Dutch to English
+ ...
Indeed! Sep 26, 2014

Dan Lucas wrote:

Giles Watson wrote:
In my experience, concordance searches of translation memories are much more likely to throw up useful suggestions than the chunk-for-chunk approach of termbases.

I think in some cases, such as documents densely packed with technical vocabulary, an approach biased towards termbases would be both effective and maybe efficient.

Dan

[Edited at 2014-09-26 22:06 GMT]

Hi Dan,

‘Documents densely packed with technical vocabulary’ sounds like pretty much every text I am sent these days.

Michael


 
Meta Arkadia
Meta Arkadia
Local time: 05:13
English to Indonesian
+ ...
Auto-assembly Sep 27, 2014

Giles Watson wrote:
Auto-assembly was the first thing I turned off in DV when I used to translate with it.

I think AA is very useful, in fact it's the core of a decent CAT tool. AA comes up with suggestions based on the CAT tool's algorithms and your (priority) settings of the connected TMs and TBs. If you get the wrong "hits," you can change those settings, and/or add the correct term in the TM/TB with the highest priority, so next time, it'll show up correctly.

However, I can imagine automatic insertion of those results can be counterproductive, especially if the word order of the target language differs from the one in the source language. That doesn't make the actual results less useful, though.

In CafeTran, the AA results will either show up in a pop-up screen or be inserted in the target language panel. I use the insertion option, because it lets me look at the results before deciding whether to delete them with a single keystroke. If I do delete them because of the wrong word order, auto-completion will help me type the terms, because it is based on the same terms that are present in the resources, including the ones from the current project.



To (finally) return to the original question: using AA in a meaningful way requires organising your resources in a meaningful way (to avoid the word "philosophy", which some people do not appreciate).

I use
- A general memory for segments ("Big Mama"). All of my translations have gone into it since I started using the BM last century. Two problems: it gets too big for AA to process in the normal way (although it can be searched at any time), so I have to use a pretranslation process; and both DejaVu and CafeTran (and probably other tools) use an algorithm that prefers longer strings to shorter ones. This makes sense in most cases, but the larger the TM gets, the more useful hits you get that way and, unfortunately, the more false positives too. Excluding the Big Mama from consistency checks when doing the QA at the end of the project is essential. (settings: keep all segments, either pretranslate or search only)
- Any subject specific memories for segments, like the DGT for EU jobs (read-only, either pretranslate or search only)
- A project specific memory for segments to deal with the problems of the BM above and for auto-completion. It also preserves any tags. (highest priority, save tags, newest segments only, QA)
- A general memory for err, general terms ("Big Papa"). (lowest priority, keep all phrases)
- A subject specific memory for terms (rather than a client specific one). (medium or high priority, keep all phrases). Could be more than one TMX file, e.g. plus IATE for EU jobs (read-only, medium priority)
- Any end-client specific memory/ies for terms (high priority, keep newest segments)
- Any memories provided by the client. (read-only, high priority, QA, dock at a spot where it's always visible)
- Plus MT, and web resources (reference only)
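
The interplay of resource priorities and the "longer strings first" preference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm used by CafeTran or DejaVu: the `Resource` class, the numeric priorities and the function names are all invented here.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    priority: int                                  # higher wins; values are illustrative
    entries: dict = field(default_factory=dict)    # source phrase -> target phrase
    read_only: bool = False


def best_hit(phrase, resources):
    """Return (target, resource name) from the highest-priority resource that has the phrase."""
    for res in sorted(resources, key=lambda r: r.priority, reverse=True):
        if phrase in res.entries:
            return res.entries[phrase], res.name
    return None


def assemble(words, resources):
    """Greedy fragment assembly over a list of source words, longest span first."""
    out, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):         # try the longest remaining span first
            hit = best_hit(" ".join(words[i:j]), resources)
            if hit:
                out.append(hit[0])
                i = j
                break
        else:
            out.append(words[i])                   # no hit at all: keep the source word
            i += 1
    return out
```

With a low-priority general term memory containing `"valve"` and a high-priority client memory containing `"safety valve"`, `assemble("check the safety valve".split(), resources)` prefers the two-word client entry over the single-word general one. The same greedy preference is what produces false positives once a very large memory is attached, which is one reason to keep the big memory out of the normal AA pass.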


Cheers,

Hans

[Edited at 2014-09-27 02:33 GMT]


 
Giles Watson
Giles Watson  Identity Verified
Italy
Local time: 23:13
Italian to English
In memoriam
Scattergun vs target pistol Sep 27, 2014

In that case I agree with you, Michael.

Churning through vast quantities of tmx guff is obviously going to be useful if genre, register, context and form are largely predetermined - as they are in technical manuals - and your main problem is vocabulary.

But when le mot juste won't come and you can feel it on the tip of your tongue, a concordance search of a targeted TM is more likely to do the business, particularly if you've compiled the TM yourself


 
Thomas Rebotier
Thomas Rebotier  Identity Verified
Local time: 15:13
English to French
[per job +] per client + big fat Sep 27, 2014

In general, a per client plus the big fat TM, with priorities for the per client. Concordance searches from the big fat TM often return questionable results -- they can be useful to remind you of rare expressions that you nailed well in some contexts.
Occasionally I have to send back a project-only TM, like everybody, and then I add it as top priority, while still feeding into the other two as well.


 
Tomás Cano Binder, BA, CT
Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 23:13
Member (2005)
English to Spanish
+ ...
By end customer Sep 27, 2014

Mostly for privacy issues, but also in order to ensure the exact preferred wording and terminology of each customer is preserved, maintained, and reused, I have one memory per end customer. This means that right now I have a total of 670 memories created over a span of 5 years. Some of these memories are used a lot and some very rarely or never since they were created for a particular project. Sounds messy indeed, but makes sense for the kind of work I do (technical translation).

Edited to add this: Of course, any terminology I research along the way in my main two language pairs gets stored in a general termbase which is used for all work. This saves me lots of time when I work for more than one customer in a particular field. On the other hand, for specific customers I also have their own termbase to ensure their preferred terminology is used instead of the general one.
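
The layering described here, a customer termbase that overrides a general one, maps neatly onto Python's `collections.ChainMap`. The following is a minimal sketch with invented example entries; the function name `termbase_for` is hypothetical and real termbases would of course live in the CAT tool's own format.

```python
from collections import ChainMap


def termbase_for(customer_terms, general_terms):
    """Combine two termbases so that customer entries shadow general ones."""
    # ChainMap searches its mappings left to right and returns the first match.
    return ChainMap(customer_terms, general_terms)


# Invented example data (English to Spanish)
general_tb = {"circuit breaker": "interruptor automático", "fuse": "fusible"}
customer_tb = {"circuit breaker": "disyuntor"}

terms = termbase_for(customer_tb, general_tb)
print(terms["circuit breaker"])  # the customer's preferred term
print(terms["fuse"])             # falls back to the general termbase
```

Because the two dictionaries stay separate, terminology researched for one customer never leaks into another customer's termbase, while the general termbase remains available to all jobs.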

[Edited at 2014-09-27 06:37 GMT]


 
Łukasz Gos-Furmankiewicz
Łukasz Gos-Furmankiewicz  Identity Verified
Poland
Local time: 23:13
English to Polish
+ ...
... Sep 27, 2014

Dan Lucas wrote:

I'm considering various ways of slicing and dicing my translation memories. One approach would be to have just one TM for everything, a sort of MyBigFatTM.sdltm. At the other end of the scale one might have a number of finely divided TMs, such as Electronics.sdltm, PatentsElectronic.sdltm, ManualsElectronics.sdltm and so on.

Currently I'm leaning towards having one large TM. Are there any disadvantages to this approach? How do other translators go about this?

For what it's worth, I'm using SDL Trados Studio 2014, but this is not a question specific to any particular CAT package.

Thanks
Dan



I avoid reliance on CAT tools in so far as at all reasonably possible. So yeah, I'll use a CAT tool for ease of use when translating some text contained in an editable file (I translate text, not files), I'll have my sweet table and black 16 font, which is a very comfortable thing and a great proofreading and editing aid, but translation memories...

... Well, I'll keep client-specific memories if more of the same subject is supposed to follow in foreseeable future, but that's it.

Definitely no EverythingIEverDid.sdltm because sources tend to be someone else's intellectual property that I was supposed to translate, not to benefit from — especially in a commercial way.

I like starting afresh with every new text and client, I only keep a sort of dictionary with the choicest problem solutions I've come across, but otherwise I choose not to rely on a TM to manage my increasing experience as I translate more and more.

Plus, it's kinda like the difference between artisanal beer and mass-produced beer. Today being Saturday and all.


 
Kay Denney
Kay Denney  Identity Verified
France
Local time: 23:13
French to English
per field and per client Sep 27, 2014

I file my work under a set of folders for the various fields I work in. Fashion and Textiles, Art and Architecture, Music, Dance and Theatre, Sustainable Development, Corporate bla-bla and Miscellaneous. Each of these folders has a bigfatmama TM, then within each folder there's a sub-folder per client and a TM for each client. So when I use a CAT tool I hook the project up to both the client-specific and field-specific TMs. That way I know at once whether the term is the right one for that client but I can borrow from other clients in the same field when the client-specific folder doesn't produce anything worthwhile.

I'm by no means a CAT tool expert, I'm pretty sure that there are other, simpler or smarter ways of doing it, but it works for me!
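
A folder layout like this one is easy to automate. The sketch below assumes one folder per field, one sub-folder per client, and TMX files named after the folder they sit in; those naming conventions and the function name `tms_for_project` are assumptions made for the example, not details given in the post.

```python
from pathlib import Path


def tms_for_project(root, field, client):
    """Return the TM paths to attach: client-specific first, then field-wide."""
    field_dir = Path(root) / field
    return [
        field_dir / client / f"{client}.tmx",   # checked first for client wording
        field_dir / f"{field}.tmx",             # field-wide fallback
    ]
```

For example, `tms_for_project("TMs", "Music", "AcmeRecords")` (a made-up client) returns the client memory followed by the field memory, in the order in which they should be consulted.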


 
Eileen Cartoon
Eileen Cartoon  Identity Verified
Local time: 23:13
Italian to English
I work it more or less like Tomás Sep 27, 2014

My work is tech only and so I have TMs for each client. I find this better because I may have more than one client in the same area and they use different terms (sometimes even using the English, which I promptly add to my termbase for that customer). In fact I have customer-specific termbases, but I also have general TBs for a general area such as "alternative energy", "pharmaceuticals" or "furniture".

One thing I find very useful: when they use an English abbreviation (I hate abbreviations) and at some point define it, I always save the extended version in the TB as a reminder for the future. Helps a lot.


 
Dominique Pivard
Dominique Pivard  Identity Verified
Local time: 00:13
Finnish to French
Auto-assembly/fragment assembly/etc. = useless for me too Sep 28, 2014

Giles Watson wrote:
Auto-assembly was the first thing I turned off in DV when I used to translate with it.

The problem is that in my sectors and language combination (IT-EN), and I imagine many others, the key to effective translation lies not so much in substituting discrete words or phrases as in reformulating the thought of the source text in an appropriate manner for the target language.

In my experience, concordance searches of translation memories are much more likely to throw up useful suggestions than the chunk-for-chunk approach of termbases.

+1: I've never understood all the fuss about auto-assembly/fragment assembly, no matter how it is further "improved" by "repairing" the assembled stuff with MT and other gimmicks.

Likewise, I've never understood people who have termbases or glossaries (really lists of chunks) with 400,000 entries (or even more). They are usually the same people who rave about AA.

Well, to each their own. This is why it's great there are so many tools that cater for all tastes.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 23:13
Member (2006)
English to Afrikaans
+ ...
Autocomplete Sep 28, 2014

Dominique Pivard wrote:
Well, to each their own. This is why it's great there are so many tools that cater for all tastes.


Agreed. To me, autocomplete at segment level is utterly useless, but at a word level or two-word level it would have been extremely helpful. A combination of Swiftkey and TM, maybe.
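
Word-level and two-word completion fed by a TM could look roughly like the Python sketch below. It is a toy illustration only: it does not describe how Swiftkey or any CAT tool implements prediction, and the function names `build_vocabulary` and `complete` are invented.

```python
import re
from collections import Counter


def build_vocabulary(target_segments):
    """Count the single words and two-word sequences seen in TM target segments."""
    counts = Counter()
    for seg in target_segments:
        words = re.findall(r"\w+", seg.lower())
        counts.update(words)
        counts.update(" ".join(pair) for pair in zip(words, words[1:]))
    return counts


def complete(prefix, counts, limit=3):
    """Suggest the most frequent entries that start with the typed prefix."""
    prefix = prefix.lower()
    matches = [(n, text) for text, n in counts.items()
               if text.startswith(prefix) and text != prefix]
    matches.sort(key=lambda m: (-m[0], m[1]))   # most frequent first, then alphabetical
    return [text for _n, text in matches[:limit]]
```

Given the target sides of a TM, `complete("trans", build_vocabulary(segments))` proposes words and two-word phrases beginning with "trans", ranked by how often the translator has used them before.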


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 22:13
Member (2009)
Dutch to English
+ ...
CT's auto-complete might be just what you are looking for Sep 28, 2014

Samuel Murray wrote:

Dominique Pivard wrote:
Well, to each their own. This is why it's great there are so many tools that cater for all tastes.


Agreed. To me, autocomplete at segment level is utterly useless, but at a word level or two-word level it would have been extremely helpful. A combination of Swiftkey and TM, maybe.



See: http://cafetran.wikidot.com/using-auto-completion


 
Meta Arkadia
Meta Arkadia
Local time: 05:13
English to Indonesian
+ ...
Disagree Sep 28, 2014

Dominique Pivard wrote:
+1: I've never understood all the fuss about auto-assembly/fragment assembly

Just look upon it as a way to make the CAT tool provide you with the best suggestions from your resources, based on the tool's algorithms and your settings. A regular search will show you all hits; AA will show you only the most relevant hit.

English "bank" can be
- an accumulation of sand, snow, clouds, and things
- the side of a river
- a financial institution
- a money box
- the "bank" of a casino
- a device consisting of rows (bank of keys, ... of cylinders, ... of oars)
- and a few more things

Searching your resources will show you all those results, AA - if fine-tuned - only the relevant result in your target language. The result you want. The only result you want.
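
The difference between a search and a tuned AA lookup can be shown with a small Python sketch. The resource names, priorities and German targets below are invented example data, and the two functions are hypothetical stand-ins rather than any tool's actual behaviour.

```python
def search(term, resources):
    """Return every hit as (priority, resource name, target), highest priority first."""
    hits = [(priority, name, target)
            for name, (priority, entries) in resources.items()
            for target in entries.get(term, [])]
    return sorted(hits, key=lambda h: -h[0])    # stable sort keeps insertion order on ties


def auto_assembly_hit(term, resources):
    """Return only the single top-ranked target, or None if nothing matches."""
    hits = search(term, resources)
    return hits[0][2] if hits else None


# Invented example: English "bank" into German, for a finance client
resources = {
    "general terms": (1, {"bank": ["Ufer", "Bank", "Sandbank"]}),
    "client terms (finance)": (3, {"bank": ["Kreditinstitut"]}),
}
print(search("bank", resources))             # all four candidates
print(auto_assembly_hit("bank", resources))  # only the client's preferred term
```

The search lists every sense found in any resource, while the AA-style lookup relies on the priority settings to surface one candidate, which is why the settings matter so much.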

Cheers,

Hans


 