Who knows a regular expression for this "garbage"? Thread poster: Hans Lenting
Something must have gone wrong during the creation of a project that a client has sent me. The project contains segments like this one (see below). What would be a valid regular expression to filter segments like this? (I assume that this is the content of an SVG graphic in a Schema ST4 file...) https://www.dropbox.com/s/np6mlcbhxuloie0/schema_st4_garbage.txt.zip?dl=1
[Edited at 2022-09-30 09:30 GMT]

Elena Feriani | Italy | Member | French to Italian | Filter by character length? | Sep 30, 2022
Hi Hans, That's a pretty long segment. If you are using Trados, you can use the Advanced Display Filter to filter by character length. EDIT: I meant segment length
[Edited at 2022-09-30 10:49 GMT]

Hans Lenting | Netherlands | Member (2006) | German to Dutch | TOPIC STARTER | Why didn't I think of that? (Because I was overwhelmed?) | Sep 30, 2022
Elena Feriani wrote: Hi Hans, That's a pretty long segment. If you are using Trados, you can use the Advanced Display Filter to filter by character length.

Hi Elena, Yes, it is a Trados project, but I prefer to translate it with CafeTran Espresso on my Mac. Your suggestion is good, I can filter on segment length in CafeTran too. Thanks! H

Hans Lenting | Netherlands | Member (2006) | German to Dutch | TOPIC STARTER | That was easy (if you know how to do it :))! | Sep 30, 2022
Like this: Of course the regular expression isn't completely valid, since the segments contain non-letter characters too, but it is fine enough.
[Edited at 2022-09-30 09:55 GMT]

Andrzej Mierzejewski | What about the original text? | Sep 30, 2022
Hans Lenting wrote: (I assume that this is the content of an SVG graphic in a Schema ST4 file...)

Did you receive the original text in PDF format for reference? If yes, you can see there what this content is. If not, I suggest requesting such a file from your client ASAP. HTH

Hans Lenting | Netherlands | Member (2006) | German to Dutch | TOPIC STARTER | Yes, I received a PDF | Sep 30, 2022
Andrzej Mierzejewski wrote: Hans Lenting wrote: (I assume that this is the content of an SVG graphic in a Schema ST4 file...) Did you receive the original text in PDF format for reference? If yes, you can see there what this content is. If not, I suggest requesting such a file from your client ASAP. HTH

Yes, I received a PDF. And I'm pretty sure that these long segments are SVG images, since each following segment contains a long path ending in .svg. Thanks for your help!

If so, then... | Sep 30, 2022
Such long character chains are nothing other than images. I'd simply delete them during the translation work. No need for a special procedure or macro. Afterwards, in the formatting stage, I'd copy and paste the illustrations from the source PDF into the target DOC file (or whatever your final format is). That should satisfy the client unless special requirements have been given. Amendment: to make the work easier for myself, I'd replace each such segment with a short note, e.g.: Page so-and-so, Figure so-and-so. And I'd do that in both columns: Source and Target. HTH
[Edited at 2022-09-30 12:10 GMT]
[Edited at 2022-09-30 12:22 GMT]

Stepan Konev | Russian Federation | English to Russian
Hans Lenting wrote: Of course the regular expression isn't completely valid, since the segments contain non-letter characters too, but it is fine enough.

You can use \S{30,} instead. \S = anything except white spaces; also I believe 300 is too much, 30 should suffice.
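
For anyone who wants to try the \S{30,} suggestion outside a CAT tool, here is a minimal Python sketch. It assumes the unwanted segments contain at least one unbroken run of 30 or more non-whitespace characters (for example encoded image data or a long file path). The sample segments and the helper name is_probable_garbage are made up for illustration and are not taken from the actual project.

```python
import re

# Pattern suggested in the thread: 30 or more consecutive non-whitespace characters.
LONG_RUN = re.compile(r"\S{30,}")


def is_probable_garbage(segment: str) -> bool:
    """Return True if the segment contains an unbroken run of 30+ non-whitespace characters."""
    return LONG_RUN.search(segment) is not None


# Hypothetical sample segments (invented for this sketch, not from the real project).
segments = [
    "Ziehen Sie den Netzstecker, bevor Sie das Gerät öffnen.",
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk",
    "images/chapter_03/assembly_overview_detail_view_0001.svg",
]

# Keep only the segments that the pattern flags.
flagged = [s for s in segments if is_probable_garbage(s)]

for s in flagged:
    print(s)
```

Note that a threshold of 30 will also catch ordinary long URLs and may catch very long compound words, so it is worth skimming the flagged segments before locking or deleting them. Raising the number in the quantifier makes the filter stricter.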