#PSTip Replacing special characters
Once upon a time I answered Stack Overflow question about easy way to replace ‘special’ characters with something ‘web safe’. I answered question with the following code:
$Replacer = @{
Å = 'aa'
é = 'e'
}
$string_to_fix = 'æøåéüÅ'
$pattern = "[$(-join $Replacer.Keys)]"
[regex]::Replace(
$string_to_fix,
$pattern,
{
$Replacer[$args[0].value]
}
)
My answer was accepted by the person asking question, but I was not fully satisfied with it. It maybe worked fine for him, but when I tried to apply the same pattern to text with Polish national characters I couldn’t get what I wanted. My problem was the fact that hash tables in PowerShell are case-insensitive. To cover both upper and lower case characters I would have to do it in two steps:
$lower = @{
ą = 'a'
ć = 'c'
ę = 'e'
ł = 'l'
ń = 'n'
ó = 'o'
ś = 's'
ż = 'z'
ź = 'z'
}
$patternLower = "[$(-join $lower.Keys)]"
$lowerReplaced = [regex]::Replace(
'Zażółć GĘŚLĄ jaźń',
$patternLower,
{
$lower[$args[0].value]
}
)
$patternUpper = $patternLower.ToUpper()
[regex]::Replace(
$lowerReplaced,
$patternUpper,
{
($lower[$args[0].value]).ToUpper()
}
)
Zazolc GESLA jazn
In the first step I use $lower hash table and match any lower-case Polish character. I save result to $lowerReplace variable and use it in the next Replace() call. In the second step I match upper-case Polish characters. Hash table is case-insensitive so it will return lower-case replacement. All I need to do is to convert it ToUpper().
I would prefer to do it in one step, with hash table (or any other dictionary) that is case-sensitive. I didn’t have time to investigate it further back than. But few weeks ago I came across solution for my problem. Even better – it was used in a tip that had exactly same purpose! It was in PowerShell.com tip about converting special characters..
With information and code provided in that tip I was able to build case-sensitive hash table. But instead of listing all letters one by one I decided to take it step further and build hash table slightly different:
$Replacer = New-Object hashtable
foreach ($letter in
Write-Output ą a Ą A ć c Ć C ę e Ę E ł l Ł L ń n Ń N ó o Ó O ś s Ś S ż z Ż Z ź z Ź Z) {
$foreach.MoveNext() | Out-Null
$Replacer.$letter = $foreach.Current
}
First, I create simple array by using the Write-Output cmdlet. Each item that I want to replace is followed by a string that should be used as a replacement string. This array is processed by foreach() loop. Inside this loop I use $foreach automatic variable. With $foreach.MoveNext() method and $foreach.Current property I can access two elements in a single cycle. Note that $letter is not updated when MoveNext() method is used.
Once our case-sensitive hash table is defined we can use it in Replace() method:
$pattern = "[$(-join $Replacer.Keys)]"
[regex]::Replace(
'Zażółć GĘŚLĄ jaźń',
$pattern,
{
$Replacer[$args[0].value]
}
)
Zazolc GESLA jazn
Share on: