Free CAT tool
Thread poster: Tom in London
Tom in London
United Kingdom
Local time: 01:24
Member (2008)
Italian to English
Dec 6, 2022

Has anyone tried using MateCat? It's a web-based CAT tool that uses a giant TM shared by all users (you have to sign in to use this feature). I haven't explored it much, but it looks interesting.

https://www.matecat.com



[Edited at 2022-12-06 18:34 GMT]


Inna Borymova  Identity Verified
Kyrgyzstan
Member (2013)
English to Russian
+ ...
I did and I am not a big fan of it Dec 6, 2022

One of my clients uses it for their projects (but they use a project TM, not the shared one). I've had just a few jobs from them, but I did not like the tool. It is workable, but when you have tags in the translation it can turn into a nightmare... At least this is my personal impression. If I had no choice, I would rather use MateCat than nothing. Probably one just needs more practice with it.


[Edited at 2022-12-06 19:48 GMT]


Jean Dimitriadis  Identity Verified
English to French
+ ...
Major flaw Dec 7, 2022

It is a good tool with an Open Source core plus free MyMemory MT (which essentially provides Google Translate/Microsoft translate suggestions for free) and MyMemory public TM integrations. Geared towards postediting, it also plays well with a paid ModernMT subscription, which is developed by the same company.

It has a few flaws, though, one being major, almost a deal breaker:

1. Annoying: When you create a project, it offers to outsource it for peanuts to Translated, the company/agency behind the tool.
2. Annoying: When analyzing a project, on top of a standard word count, they offer their own weighted word count, with discounts on repetitions, MyMemory public TM matches (which can be garbage) AND a discount for MT on new words/words with no matches. Entirely optional, but cringy.
3. Not good: Although you can download the bilingual XLF file, you cannot import it back. You can only upload it to a separate page to generate the target (translated) file. This means you cannot easily update a Matecat project if you have worked in an external tool. You can of course upload the finalized TMX, but still, this hurts backward interoperability.
4. Inexcusable: By default, MateCat stores your translated segments in the public MyMemory TM. This means that if you just upload a file and start translating, the content you translate is saved on the open Web! How this can be a sensible default setting is just baffling, especially when it is well known that most translators work on private or highly confidential documents... Beyond the pale.

To make sure this does not happen unintentionally, create a private TM resource: On the Project creation page, click on Settings (alternatively, in the TM and glossary field, expand the drop-down menu and select Create resource). Click on the + New resource button in the dialog that opens. Give the TM a name (optional). Hit Confirm. You will see that the “MyMemory: Collaborative translation memory” resource is Enabled for Lookup, but no longer set to be Updated. That way, translated segments will only be stored in your private resources.

Then, and only then, can Matecat be used for professional work. You have been warned.

[Edited at 2022-12-07 10:38 GMT]


Hans Lenting
Netherlands
Member (2006)
German to Dutch
. Dec 7, 2022

.

[Edited at 2022-12-07 10:39 GMT]


Jean Dimitriadis  Identity Verified
English to French
+ ...
Public Data vs Private Data Dec 7, 2022

When you create a private resource, MyMemory (collaborative translation memory shared with all Matecat users, and publicly available through MyMemory service) is automatically set to Lookup, but not Update. Without a private resource, MyMemory is set to Lookup AND Update. Update means that your translations are published on MyMemory.
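
To make the Lookup/Update distinction concrete, here is a minimal Python sketch of the defaults described above. It is only an illustrative model, not Matecat's code or API: `TMResource`, `default_resources` and `published_publicly` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TMResource:
    """Hypothetical model of a TM resource and its two switches."""
    name: str
    lookup: bool  # the resource is searched for matches
    update: bool  # confirmed segments are written to the resource


def default_resources(private_tm_name: Optional[str] = None) -> List[TMResource]:
    """Return the default settings as described in this thread (assumed)."""
    if private_tm_name is None:
        # No private resource: MyMemory is both read and written.
        return [TMResource("MyMemory", lookup=True, update=True)]
    # A private resource exists: MyMemory drops to Lookup only.
    return [
        TMResource(private_tm_name, lookup=True, update=True),
        TMResource("MyMemory", lookup=True, update=False),
    ]


def published_publicly(resources: List[TMResource]) -> bool:
    """True if confirmed segments would be written to public MyMemory."""
    return any(r.name == "MyMemory" and r.update for r in resources)
```

Under these assumptions, `published_publicly(default_resources())` is true, while `published_publicly(default_resources("My private TM"))` is false, which is the whole point of creating the private resource first.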



More info here:
Create a private translation memory - https://guides.matecat.com/importing/exporting
Manage Your Language Resources - https://site.matecat.com/support/managing-language-resources/manage-language-resources/

For privacy matters, I would refer to the terms of use:

Philosophy and Terms of service - https://site.matecat.com/terms

There you can find the following definitions (plus one for "Personal Data"):
We will refer as “Public Data” (or “Public Contributions”) to all the segments sent to Matecat and not explicitly marked as private while being contributed to MyMemory.

We will refer as “Private Data” (or “Private Contributions”) to all the segments sent to Matecat and explicitly marked as private (for example, by providing a TM key during project creation) while being contributed to MyMemory.


Then, in another section:

All segments, whether they are “Public Data” or “Private Data”, are collected, processed and used by Matecat to create internal statistics.

Additionally, “Public Data” are used to provide translation matches to other users of the Matecat software.


Then:

Matecat uses external partners to outsource some developments and provide some functionalities: this involves data sharing. External Machine Translation providers is the most obvious example.

Based on the above, as I understand it, data is accessed for TM lookup and internal statistics purposes, and shared with third parties as needed to provide functionality (for instance, to use one of the available configurable MT services). One can reasonably expect that their translations will not be used to feed internal MT training or internal/public TM.

If private resources turned up on the public MyMemory, it would be quickly discovered and would expose the company to a huge backlash and repercussions. Not likely.
About internal use, apart from the reasonable expectations above, it's anyone's guess. I'll only mention that the company has a track record of being committed to Open Source. Furthermore, Matecat and ModernMT have been initially developed jointly with the public sector (EU and universities).

Default settings can have a huge impact. By enabling the contribution of translations to the MyMemory public TM by default, they ensure the continuous growth of this resource. But by doing so, they expose professional translators to potential breaches of privacy and NDAs, and to private content leaking into the open.

While the commitment to Open Source is commendable, I cannot say the same for the decision to offer this default setting, with no safeguards whatsoever.

Also: How secure is translating with Matecat? - https://guides.matecat.com/pri

[Edited at 2022-12-07 10:24 GMT]


Tom in London
United Kingdom
Local time: 01:24
Member (2008)
Italian to English
TOPIC STARTER
Interesting Dec 7, 2022

Interesting reflections.

The MyMemory website does include some fairly honest and comprehensive statements about privacy (as quoted above).

Can we be sure that other MTs and APIs don't handle inputs in the same way as MyMemory?

My understanding is that DeepL, for instance, is able to achieve its sometimes remarkable accuracy by sharing all the data that translators contribute to it.

If the translator happens to be working on a confidential document, I can't see how it would be possible to prevent DeepL (or GT) from memorising and sharing everything the translator inputs into it.

[Edited at 2022-12-07 10:35 GMT]


 
Jean Dimitriadis  Identity Verified
English to French
+ ...
Free vs Pro MT versions Dec 7, 2022

Last time I checked, Google and Microsoft MT engines offer privacy for their API key versions, which require registration and are a professional offering.

Their Web versions do not fall under the same terms of use, and anything you enter there can be used for training purposes. Google even encourages users to fix incorrect occurrences.

For DeepL, the DeepL Pro subscription ensures data privacy and GDPR compliance, including the deletion of your texts immediately after the translation. This includes both the API version and the Web version while you are connected to your Pro account.

Again, you do not enjoy the same protection under the free Web version of DeepL, and any data can be (and is) used for further training and improvements.

For details, it is best to check the terms of use; this is just a high-level summary!

[Edited at 2022-12-07 10:49 GMT]


 

