Matthew Lindfield Seager

Matthew Lindfield Seager

Today I learned SQL Server’s default encoding automatically equates similar characters to each other for sorting/searching. Normally that’s a good thing (e.g. so c and รง sort the same way) but today I wanted to find a specific unicode character…

I needed to find (and replace) all “Narrow Non-Break Space” characters but searching for LIKE '%' + NCHAR(8239) + '%' was also finding ordinary spaces.

The answer was to tell it to use a binary comparison when evaluating that field with COLLATE Latin1_General_BIN:

SELECT          *
FROM            Comments
WHERE           Comment COLLATE Latin1_General_BIN LIKE '%' + NCHAR(8239) + '%'

to find them and:

UPDATE          Comments
SET             Comment = REPLACE(
                    Comment COLLATE Latin1_General_BIN,
                    NCHAR(8239) COLLATE Latin1_General_BIN,
                    ' '
                )

to replace them with regular spaces.