Using regex to bulk find and replace text with superscript
Thread poster: Rodrigo Rosales Sosa
Rodrigo Rosales Sosa
Mexico
Local time: 20:25
English to Spanish
Jul 3, 2022

Hello there, colleagues:

I have a significant number of segments where I need to replace pairs of digits with the same digits in superscript. In this particular case they are numbers in scientific notation, like 7.00E+02 or 7.00E-02 (meaning 7 × 10² or 7 × 10⁻², with the 2 or −2 in superscript). The job is English into Spanish.

Since it's impractical to do it case by case, I'm using the following regex for the bulk find and replace: (\d)(\.)(\d{1,2})E(\+|-)(\d{2}), where (\d{2}) needs to be set to a superscript font. (I won't be exporting the document, just sending the bilingual file, so I won't be doing it in Word.) I can, of course, settle for the caret (^) as a power sign and be done with it, but I'd like to know whether this is possible through regex.
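
For readers who want to see what each capture group in that pattern picks up, here is a minimal sketch in Python. It assumes Python's re module treats this simple pattern the same way the CAT tool's engine does; the sample sentence is made up for illustration.

```python
import re

# The pattern from the post, unchanged: five capture groups.
PATTERN = re.compile(r"(\d)(\.)(\d{1,2})E(\+|-)(\d{2})")

# Hypothetical sample segment, not taken from the thread.
match = PATTERN.search("Value: 7.00E-02 mg/L")
if match:
    integer_part, dot, decimals, sign, exponent = match.groups()
    # Group 5 (the exponent digits) is the part that needs superscript.
    print(integer_part, dot, decimals, sign, exponent)  # 7 . 00 - 02
```

A regex replacement can only change characters, not font attributes, which is why the question comes down to either tags or dedicated superscript characters.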

Kind regards, and I appreciate your help.


 
James Plastow  Identity Verified
United Kingdom
Local time: 03:25
Member (2020)
Japanese to English
notepad++ Jul 3, 2022

I once had a job like this and did it by batch replacing the tags in the sdlxliff file directly in Notepad++. If you are careful it is easy enough, but be sure to make a backup first, as any mistake will corrupt the file.

 
Rodrigo Rosales Sosa
Mexico
Local time: 20:25
English to Spanish
TOPIC STARTER
No tags in this case, I'm afraid Jul 3, 2022

Hello, James:

I would try that option, but there are no tags in this case. I'm wondering if I could create superscript tags. Is that possible?


 
Dan Lucas  Identity Verified
United Kingdom
Local time: 03:25
Member (2014)
Japanese to English
Unicode points? Jul 3, 2022

Rodrigo Rosales Sosa wrote:
Since it's impractical to do it case by case, I'm using the following regex for the bulk find and replace: (\d)(\.)(\d{1,2})E(\+|-)(\d{2}), where (\d{2})

Superscript minus, superscript two, and many similar symbols exist as Unicode code points, so you might be able to get this kind of thing: ⁻². I would use this site to look up the code points (just type in "superscript"); then it's just a case of working out how memoQ handles Unicode in its regex engine. If it uses the .NET flavour of regexes, it would probably be something like \u207B for superscript minus.
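
As a rough illustration of the code points involved, the Python sketch below lists the superscript digits and signs with their official Unicode names. The dictionary name SUPERSCRIPTS is just a label chosen for this example. Note that ¹, ² and ³ sit in the Latin-1 range (U+00B9, U+00B2, U+00B3), while the rest are in the U+2070 block.

```python
import unicodedata

# Map ASCII digits and signs to their Unicode superscript counterparts.
SUPERSCRIPTS = {
    "0": "\u2070", "1": "\u00B9", "2": "\u00B2", "3": "\u00B3",
    "4": "\u2074", "5": "\u2075", "6": "\u2076", "7": "\u2077",
    "8": "\u2078", "9": "\u2079", "+": "\u207A", "-": "\u207B",
}

# Print each character with its code point and official name,
# e.g. the entry for "-" is U+207B SUPERSCRIPT MINUS.
for plain, sup in SUPERSCRIPTS.items():
    print(plain, sup, f"U+{ord(sup):04X}", unicodedata.name(sup))
```

Whether a given tool accepts the \uXXXX escape in its replace field or needs the character pasted in directly is something to check in that tool.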

Dan


 
James Plastow  Identity Verified
United Kingdom
Local time: 03:25
Member (2020)
Japanese to English
tags Jul 3, 2022

Rodrigo Rosales Sosa wrote:

Hello, James:

I would try that option, but there are no tags in this case. I'm wondering if I could create superscript tags. Is that possible?



Hi Rodrigo,

Are you working in Trados? If so, try opening the xliff in Notepad++ to see what is there (it helps to install an XML plugin so you can read the text more clearly). There should be tags wherever there is a superscript. You can then batch find and replace to apply those tags to the other elements you want in superscript.

Dan's solution sounds quicker though.

[Edited at 2022-07-03 19:51 GMT]


 
Stepan Konev  Identity Verified
Russian Federation
Local time: 05:25
English to Russian
A series of replacements Jul 3, 2022

You have to run a number of replacements.
Begin with E-:
1. Replace E-(\d+) with ×10@@-$1
2. Replace @@-01 with ⁻¹
3. Replace @@-02 with ⁻²
4. Replace @@-03 with ⁻³
5. Replace @@-04 with ⁻⁴
etc.

Then, for E+:
1. Replace E\+(\d+) with ×10@@$1
2. Replace @@01 with an empty field (10¹ is just 10)
3. Replace @@02 with ²
4. Replace @@03 with ³
5. Replace @@04 with ⁴
etc.
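
Outside the CAT tool, the same result can be sketched in a single pass rather than as a series of replacements. The Python example below is a different technique from the steps above: it uses one regex with a callback and a translation table. The function name to_superscript_notation is hypothetical, and the lookbehind (?<=\d), which requires a digit before the E, is an added assumption to avoid touching ordinary words that end in E.

```python
import re

# Superscript forms of the digits and the minus sign.
TO_SUPERSCRIPT = str.maketrans("0123456789-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁻")

# E preceded by a digit, then a sign and the exponent digits.
EXPONENT = re.compile(r"(?<=\d)E([+-])(\d+)")

def to_superscript_notation(text):
    """Rewrite exponents such as E-02 as ×10⁻² (hypothetical helper)."""
    def replace(match):
        sign, digits = match.groups()
        value = int(digits)  # drops the leading zero: "02" becomes 2
        power = f"-{value}" if sign == "-" and value != 0 else str(value)
        if power == "1":
            return "×10"  # mirrors replacing @@01 with an empty field
        return "×10" + power.translate(TO_SUPERSCRIPT)
    return EXPONENT.sub(replace, text)

print(to_superscript_notation("7.00E-02 and 7.00E+02"))
# 7.00×10⁻² and 7.00×10²
```

This is only practical when the text can be processed outside the translation environment; inside it, the step-by-step replacements remain the straightforward route.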

[Edited at 2022-07-03 21:20 GMT]


 
Rodrigo Rosales Sosa
Mexico
Local time: 20:25
English to Spanish
TOPIC STARTER
I'll try it out and report back. Jul 4, 2022

Why didn't I think of copying the numbers already in superscript? I'll report back later. Thank you.


 
Rodrigo Rosales Sosa
Mexico
Local time: 20:25
English to Spanish
TOPIC STARTER
I'll check it out Jul 4, 2022

Thank you, Dan. I'll check this option out and report back later.


 
Rodrigo Rosales Sosa
Mexico
Local time: 20:25
English to Spanish
TOPIC STARTER
Solved it Jul 5, 2022

Hello there:

I managed to solve the issue by finding the Unicode characters for each superscript digit and for the minus sign (−), then running a series of replacements starting from 1, and voilà (link to screenshot for future reference: https://imgur.com/a/hVVlbPU).

Thank you for your suggestions and help.


 
Rodrigo Rosales Sosa
Mexico
Local time: 20:25
English to Spanish
TOPIC STARTER
Sorry, I should've mentioned Jul 5, 2022

I should've mentioned it earlier: I'm working in memoQ. I did try Dan's solution and it worked. Thank you.

