I have some code from a '.do' file:
** 2) PATSTAT - this uses some non-standard set so I have come up with a mapping as follows.
** Again number in brackets refer to decimal representation, decode in above reference.
** However, some are control/disused characters not in reference, they are: 156, 132
** The codes have been discovered using "hexdump <filename>, tab"
if "${_3}" == "EPO" {
* a grave
replace standard_name = subinstr( standard_name, char(195)+char(160), "a", 30)
* a acute
replace standard_name = subinstr( standard_name, char(195)+char(161), "a", 30)
* A acute
replace standard_name = subinstr( standard_name, char(195)+char(128), "A", 30)
* Some sort of o (Italian)
replace standard_name = subinstr( standard_name, char(195)+char(178), "o", 30)
* a circumflex
...
* U umlaut
replace standard_name = subinstr( standard_name, char(195)+char(156), "UE", 30)
* N tilde
replace standard_name = subinstr( standard_name, char(195)+char(145), "N", 30)
* n tilde
replace standard_name = subinstr( standard_name, char(195)+char(177), "n", 30)
/* SOME UNKNOWN ONES - VERY RARE
* ? italian "r??ta"
replace standard_name = subinstr( standard_name, char(195)+char(180), "?", 30)
* ? belgian VERY RARE
* £
replace standard_name = subinstr( standard_name, char(195)+char(163), "?", 30)
* little raised o
replace standard_name = subinstr( standard_name, char(195)+char(186), "?", 30)
* >>
replace standard_name = subinstr( standard_name, char(195)+char(187), "?", 30)
* Dutch Industriële - UNKNOWN and rare
replace standard_name = subinstr( standard_name, char(195)+char(171), "?", 30)
*/
}
The accompanying explanation is: 'Accented characters are widely used in many European countries. PATSTAT and Amadeus use slightly different character sets and so accented characters are replaced with non-accented equivalents, for example u umlaut becomes “ue”.'
I don't know which software should be used to run it. Also, could you please explain what the 'subinstr()' function does and how it works in this file?
Thanks in advance.
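
For reference, here is a minimal sketch of what one of the `replace` lines does. It assumes the file is a Stata do-file, which the '.do' extension and the `replace ... = subinstr(...)` syntax suggest, and that Stata 14 or later is used. The one-row dataset and the name in it are made up for illustration and are not taken from PATSTAT.

```stata
* Hypothetical one-observation dataset, for illustration only
clear
set obs 1

* Build "MÜLLER GMBH" byte by byte: char(195)+char(156) is the
* two-byte sequence that the do-file above labels "U umlaut"
generate str40 standard_name = "M" + char(195) + char(156) + "LLER GMBH"

* subinstr(s1, s2, s3, n) returns s1 with the first n occurrences
* of s2 replaced by s3; here at most 30 occurrences per name
replace standard_name = subinstr(standard_name, char(195) + char(156), "UE", 30)

* Check the result: the accented character is now plain "UE"
assert standard_name == "MUELLER GMBH"
list standard_name
```

Each line in the quoted block follows this pattern: `char(195)+char(...)` builds the two bytes of one accented character, and `subinstr()` swaps that byte pair for a plain replacement in every row of `standard_name`. The last argument, 30, caps the number of replacements per string; passing `.` (missing) instead replaces all occurrences.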