From Ms Word table to TMX file
Thread poster: Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Automatic detection of source and target language Aug 23, 2022

Sub Macro1()
Dim strSL As String
Dim strTL As String

'Read the language ID of each column (stored as text, e.g. "1031")
Selection.Tables(1).Columns(1).Select
strSL = Selection.LanguageID
Selection.Tables(1).Columns(2).Select
strTL = Selection.LanguageID

'Map the source language ID to a language code
Select Case strSL
Case 1031
strSL = "de-DE"
Case 1033
strSL = "en-US"
Case 1043
strSL = "nl-NL"
End Select

'Map the target language ID to a language code
Select Case strTL
Case 1031
strTL = "de-DE"
Case 1033
strTL = "en-US"
Case 1043
strTL = "nl-NL"
End Select

'Show the detected languages
MsgBox strSL & " " & strTL

End Sub
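
The two Select Case blocks do the same mapping twice, so the mapping could be moved into one helper. Below is a minimal sketch of that idea, not part of the macro above: the names LcidToLangCode and ShowTableLanguages are hypothetical, and the empty string as a marker for "not mapped" is my own assumption. As far as I know, Word reports wdUndefined when a column mixes languages, which also ends up in Case Else.

```vba
' Hypothetical helper: map a Word language ID (Microsoft locale ID)
' to a language code. Returns "" when the ID is not in the list.
Function LcidToLangCode(ByVal lngLcid As Long) As String
    Select Case lngLcid
        Case 1031
            LcidToLangCode = "de-DE"
        Case 1033
            LcidToLangCode = "en-US"
        Case 1043
            LcidToLangCode = "nl-NL"
        Case Else
            ' Unknown ID, or a column with mixed languages
            LcidToLangCode = ""
    End Select
End Function

' Same detection as Macro1, but with the mapping done by the helper.
Sub ShowTableLanguages()
    Dim strSL As String
    Dim strTL As String

    Selection.Tables(1).Columns(1).Select
    strSL = LcidToLangCode(Selection.LanguageID)
    Selection.Tables(1).Columns(2).Select
    strTL = LcidToLangCode(Selection.LanguageID)

    If strSL = "" Then strSL = "(unknown)"
    If strTL = "" Then strTL = "(unknown)"
    MsgBox strSL & " " & strTL
End Sub
```

Adding a language then means adding one Case in one place instead of two.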


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Complete macro Aug 23, 2022

Here is the complete macro. It will try to detect the source and target language and if it fails, it will ask you for the language codes.

The macro assumes that the table that you want to convert to TMX has two columns only: the first column contains the source language, the second one the target language.

How to install this macro? There are many instructions on the internet. Here is one: https://wintersediting.com/install-run-macros/

Copy the code below and replace all « with < and all » with >.

Adapt/add language codes where necessary.

Adjust the storage path (this one is for my Desktop on my Mac).


Sub TableToTMX()

Dim rngTemp As Range
Dim tableTemp As Table
Dim strSL As String
Dim strTL As String

'Determine the source and target language
Selection.Tables(1).Columns(1).Select
strSL = Selection.LanguageID
Selection.Tables(1).Columns(2).Select
strTL = Selection.LanguageID

Select Case strSL
Case 1031
strSL = "de-DE"
Case 1033
strSL = "en-US"
Case 1043
strSL = "nl-NL"
Case Else
strSL = InputBox("Type the code for the source language", "Source language selection", "en-GB")
End Select

Select Case strTL
Case 1031
strTL = "de-DE"
Case 1033
strTL = "en-US"
Case 1043
strTL = "nl-NL"
Case Else
strTL = InputBox("Type the code for the target language", "Target language selection", "nl-NL")
End Select

Options.AutoFormatReplaceQuotes = False
Selection.Tables(1).Select
Selection.Copy
Documents.Add
Selection.Paste


Set tableTemp = ActiveDocument.Tables(1)
Set rngTemp = _
tableTemp.ConvertToText(Separator:=wdSeparateByTabs)
Selection.Delete

'Convert ampersand, less than and greater than to markup
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "&"
.Replacement.Text = "&amp;"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
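With Selection.Find
.Text = "«"
.Replacement.Text = "&lt;"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "»"
.Replacement.Text = "&gt;"
End With
Selection.Find.Execute Replace:=wdReplaceAll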

Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^p"
.Replacement.Text = _
"«/seg»«/tuv»«/tu»^p«tu»«tuv xml:lang=""xx-XX""»«seg»"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^t"
.Replacement.Text = "«/seg»«/tuv»«tuv xml:lang=""yy-YY""»«seg»"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll


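Selection.TypeText Text:="«/seg»«/tuv»«/tu»«/body»«/tmx»"
Selection.HomeKey Unit:=wdStory
Selection.TypeText Text:="«?xml version=""1.0"" encoding=""utf-8""?»«tmx version=""1.4""»«header»«/header»«body»«tu»«tuv xml:lang=""xx-XX""»«seg»"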

'Replace placeholder language codes with correct codes
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "xx-XX"
.Replacement.Text = strSL
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "yy-YY"
.Replacement.Text = strTL
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

'Adjust path below
ActiveDocument.SaveAs2 FileName:="/Users/hl/Desktop/memory.tmx", FileFormat:= _
wdFormatText, Encoding:=65001, LineEnding:=wdLFOnly
End Sub


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Demo 2 Aug 23, 2022

Second demo, with automatic language detection:



 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Apostrophe and straight double quote Sep 11, 2022

Added correct encoding of apostrophes and straight double quotes.

Will add handling of bold, italics, underlined, superscript and subscript soon. See here if you want to go ahead.

BTW: For a complete list of Microsoft's locale IDs see:
https://docs.microsoft.com/en-us/openspecs/office_standards/ms-oe376/6c085406-a698-4e12-9d4d-c3b0ee3dbc4a


Sub TableToTMX()

Dim rngTemp As Range
Dim tableTemp As Table
Dim strSL As String
Dim strTL As String

'Determine the source and target language
Selection.Tables(1).Columns(1).Select
strSL = Selection.LanguageID
Selection.Tables(1).Columns(2).Select
strTL = Selection.LanguageID

Select Case strSL
Case 1031
strSL = "de-DE"
Case 1033
strSL = "en-US"
Case 1043
strSL = "nl-NL"
Case Else
strSL = InputBox("Type the code for the source language", "Source language selection", "en-GB")
End Select

Select Case strTL
Case 1031
strTL = "de-DE"
Case 1033
strTL = "en-US"
Case 1043
strTL = "nl-NL"
Case Else
strTL = InputBox("Type the code for the target language", "Target language selection", "nl-NL")
End Select

Options.AutoFormatReplaceQuotes = False
Selection.Tables(1).Select
Selection.Copy
Documents.Add
Selection.Paste


Set tableTemp = ActiveDocument.Tables(1)
Set rngTemp = _
tableTemp.ConvertToText(Separator:=wdSeparateByTabs)
Selection.Delete

'Convert ampersand, less than and greater than to markup
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "&"
.Replacement.Text = "&amp;"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "«"
.Replacement.Text = "&lt;"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "»"
.Replacement.Text = "&gt;"
End With
Selection.Find.Execute Replace:=wdReplaceAll
'Handle apostrophe and straight double quote
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "'"
.Replacement.Text = "&apos;"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = """"
.Replacement.Text = "&quot;"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^p"
.Replacement.Text = _
"«/seg»«/tuv»«/tu»^p«tu»«tuv xml:lang=""xx-XX""»«seg»"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^t"
.Replacement.Text = "«/seg»«/tuv»«tuv xml:lang=""yy-YY""»«seg»"
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll


Selection.TypeText Text:="«/seg»«/tuv»«/tu»«/body»«/tmx»"
Selection.HomeKey Unit:=wdStory
Selection.TypeText Text:="«?xml version=""1.0"" encoding=""utf-8""?»«tmx version=""1.4""»«header»«/header»«body»«tu»«tuv xml:lang=""xx-XX""»«seg»"

'Replace placeholder language codes with correct codes
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "xx-XX"
.Replacement.Text = strSL
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "yy-YY"
.Replacement.Text = strTL
.Forward = False
.Wrap = wdFindAsk
.Format = False
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

'Adjust path below
ActiveDocument.SaveAs2 FileName:="/Users/hl/Desktop/memory.tmx", FileFormat:= _
wdFormatText, Encoding:=65001, LineEnding:=wdLFOnly
End Sub

[Edited at 2022-09-11 07:05 GMT]
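
A note on the order of the replacements: the ampersand has to be converted first, otherwise the & of the entities inserted afterwards would be converted again. Below is a minimal sketch of the same five replacements as a plain string function. It is not part of the macro above; the name XmlEscape is hypothetical, and it uses VBA's built-in Replace function instead of Word's Find.

```vba
' Hypothetical helper: escape the five characters that XML reserves.
Function XmlEscape(ByVal strText As String) As String
    ' Ampersand first, so the entities added below are not escaped again
    strText = Replace(strText, "&", "&amp;")
    strText = Replace(strText, "<", "&lt;")
    strText = Replace(strText, ">", "&gt;")
    ' Apostrophe and straight double quote
    strText = Replace(strText, "'", "&apos;")
    strText = Replace(strText, """", "&quot;")
    XmlEscape = strText
End Function
```

For example, XmlEscape("R&D <new>") returns R&amp;D &lt;new&gt;. Since this code does not pass through the forum's tag filter in the same way, it uses the real < and > characters, so no « » replacement is needed here.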


 
Michael Beijer
United Kingdom
Member (2009)
Dutch to English
wow, pretty cool Sep 14, 2022

I usually do this with the old Heartsome TMX editor (as someone pointed out earlier), but just noticed that the GoldPan TMX editor can also do it: Batch tools > Convert to TMX (it takes .xlsx files).

And also just found this: https://translatum.gr/cgi-bin/excel-to-tmx.pl ("Convert Excel files (xls, xlsx) and tab delimited txt to TMX")

And the now ancient Olifant TMX editor: https://pangeanic.com/diary-inhouse-translator/6-steps-to-create-tmx-file-from-excel-or-other-formats/


 
Michael Beijer
United Kingdom
Member (2009)
Dutch to English
or try Logrus's new Memose (cloud-based TM storage. Searcher/editor/etc) Sep 14, 2022

Logrus also has a very interesting new thing called "Memose":

https://cloud.logrusglobal.com/#memose

They call it a "Cloud-based Translation Memory with multi-proposal TM + Multi-engine triple NMT noise cancellation and quadruple quality boost Editor." !!!!!!!!!!!!

… which, among many other things, can also convert various formats into TMX.

see: https://memose.logrusglobal.com/editor

it can do much more than that though, see e.g.:

https://memose.logrusglobal.com/index




[Edited at 2022-09-14 22:43 GMT]


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Clever Sep 15, 2022

Michael Beijer wrote:

Logrus also has a very interesting new thing called "Memose":


Looks to me like a clever way to harvest TMs, e.g. to train an MT system ...

BTW: My macro is meant for easily creating a TM while reading or browsing through an Ms Word document that contains bilingual tables, for example tables with legends of illustrations. I still have to come up with a fancy name, like Logrussia does.


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Install macro Sep 20, 2022

The updated macro will follow soon.

In the meantime, here is how to install a macro:

https://www.proz.com/forum/apple_mac_operating_systems/359051-how_to_install_a_word_macro_from_the_internet_mac_version.html


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Final (?) version Sep 22, 2022

You can download an Ms Word document containing the final macro here.

 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Framework Sep 23, 2022

I would like to point out that the techniques used in this macro can be used as a framework for other macro projects, such as:

  • Create a TM from all bilingual tables in an Ms Word document.
  • [Your idea here]
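
The first idea could be sketched roughly as follows. This is only a sketch under assumptions, not the downloadable macro: all names (EscapeXml, BuildTu, CellText, AllTablesToTMX) are hypothetical, the language codes are fixed constants instead of being detected, and tables that are not plain two-column tables are simply skipped. It builds the TMX text as a string instead of using Find and Replace, so the « » trick is not needed.

```vba
' Escape the characters that would break the markup; ampersand first.
Function EscapeXml(ByVal strText As String) As String
    strText = Replace(strText, "&", "&amp;")
    strText = Replace(strText, "<", "&lt;")
    EscapeXml = Replace(strText, ">", "&gt;")
End Function

' Build one translation unit from a source/target pair.
Function BuildTu(ByVal strSrc As String, ByVal strTgt As String, _
                 ByVal strSL As String, ByVal strTL As String) As String
    BuildTu = "<tu><tuv xml:lang=""" & strSL & """><seg>" & EscapeXml(strSrc) & _
              "</seg></tuv><tuv xml:lang=""" & strTL & """><seg>" & _
              EscapeXml(strTgt) & "</seg></tuv></tu>"
End Function

' Return a cell's text without the two-character end-of-cell marker.
Function CellText(ByVal cel As Cell) As String
    Dim strRaw As String
    strRaw = cel.Range.Text
    CellText = Left$(strRaw, Len(strRaw) - 2)
End Function

Sub AllTablesToTMX()
    Const SRC_LANG As String = "de-DE"   ' assumption: adjust, or detect as in TableToTMX
    Const TGT_LANG As String = "nl-NL"   ' assumption
    Dim docSource As Document
    Dim docOut As Document
    Dim tbl As Table
    Dim rw As Row
    Dim strBody As String

    Set docSource = ActiveDocument
    For Each tbl In docSource.Tables
        ' Only handle regular tables with exactly two columns
        If tbl.Uniform And tbl.Columns.Count = 2 Then
            For Each rw In tbl.Rows
                strBody = strBody & BuildTu(CellText(rw.Cells(1)), _
                    CellText(rw.Cells(2)), SRC_LANG, TGT_LANG) & vbCr
            Next rw
        End If
    Next tbl

    ' Put the result in a new document; save it as UTF-8 text as in TableToTMX
    Set docOut = Documents.Add
    docOut.Range.Text = "<?xml version=""1.0"" encoding=""utf-8""?>" & vbCr & _
        "<tmx version=""1.4""><header></header><body>" & vbCr & _
        strBody & "</body></tmx>"
End Sub
```

The empty header element is copied from the macro earlier in this thread; some tools may expect header attributes, so check the result in your CAT tool.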


 