The quality of data varies vastly between systems, and the good old SOUNDEX algorithm on SQL Server is a complete waste of time for matching addresses. The main issues are:
- House numbers generate invalid keys: SOUNDEX('7 my street') returns 0000.
- Spaces or punctuation stop SOUNDEX from working: select SOUNDEX('my house '), SOUNDEX('myhouse') returns M000 and M200.
Exact matching has the same problem, e.g. "my-house", "my house" and "myhouse" are all treated as different values.
One option to get round some of these problems is simply to remove any non-alphanumeric characters. This can be as simple as a nested replace statement that removes the unwanted characters, e.g. replace(replace(replace(field,' ',''),'-',''),'"',''), extended for any other unexpected characters. Alternatively, use a regular expression or a CLR routine; there are a lot of published CLR routines and regular-expression CLRs to use as examples.
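If you would rather not chain replace calls or deploy a CLR, the cleansing step can also be sketched in plain T-SQL. The function name dbo.StripNonAlphanumeric is made up for illustration, and the sketch assumes SQL Server 2008 or later and a collation where the range A-Z covers only the characters you expect (accented letters may be treated as inside the range under some collations).

```sql
-- Hypothetical helper (name is an assumption, not a built-in):
-- removes every character that is not A-Z, a-z or 0-9.
CREATE FUNCTION dbo.StripNonAlphanumeric (@input NVARCHAR(4000))
RETURNS NVARCHAR(4000)
AS
BEGIN
    -- Position of the first unwanted character, or 0 if there is none
    DECLARE @pos INT = PATINDEX('%[^A-Za-z0-9]%', @input);

    WHILE @pos > 0
    BEGIN
        -- Delete the single character at @pos, then look for the next one
        SET @input = STUFF(@input, @pos, 1, '');
        SET @pos = PATINDEX('%[^A-Za-z0-9]%', @input);
    END;

    RETURN @input;
END;
GO

-- All three variants cleanse to the same value, so an exact match now works
SELECT dbo.StripNonAlphanumeric('my-house') AS a,
       dbo.StripNonAlphanumeric('my house') AS b,
       dbo.StripNonAlphanumeric('myhouse')  AS c;
```

Note that digits are kept, so a cleansed '7 my street' still starts with a number and SOUNDEX will still not produce a useful key for it.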
Other options include running a string-similarity algorithm, e.g. Jaro-Winkler or Levenshtein, on the "cleansed" addresses to measure how different they are.
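As a rough illustration of the Levenshtein option, here is a minimal T-SQL sketch that fills the standard edit-distance matrix in a table variable. The function name dbo.LevenshteinDistance and the 100-character limit are assumptions for the example. It is written for clarity rather than speed, so for any real volume a CLR implementation is likely to be the better choice. Character comparison follows the database collation (often case-insensitive), and LEN ignores trailing spaces.

```sql
-- Hypothetical helper (name and length limit are assumptions):
-- number of single-character inserts, deletes or substitutions
-- needed to turn @s into @t.
CREATE FUNCTION dbo.LevenshteinDistance (@s NVARCHAR(100), @t NVARCHAR(100))
RETURNS INT
AS
BEGIN
    IF @s IS NULL OR @t IS NULL RETURN NULL;

    DECLARE @n INT = LEN(@s), @m INT = LEN(@t);
    IF @n = 0 RETURN @m;
    IF @m = 0 RETURN @n;

    -- d(i, j) = distance between the first i chars of @s and the first j chars of @t
    DECLARE @d TABLE (i INT NOT NULL, j INT NOT NULL, v INT NOT NULL,
                      PRIMARY KEY (i, j));
    DECLARE @i INT = 0, @j INT = 1, @cost INT, @v INT;

    -- Border cells: turning a prefix into the empty string costs its length
    WHILE @i <= @n BEGIN INSERT @d VALUES (@i, 0, @i); SET @i += 1; END;
    WHILE @j <= @m BEGIN INSERT @d VALUES (0, @j, @j); SET @j += 1; END;

    SET @i = 1;
    WHILE @i <= @n
    BEGIN
        SET @j = 1;
        WHILE @j <= @m
        BEGIN
            SET @cost = CASE WHEN SUBSTRING(@s, @i, 1) = SUBSTRING(@t, @j, 1)
                             THEN 0 ELSE 1 END;

            -- Cheapest of: delete, insert, substitute (or keep when @cost = 0)
            SELECT @v = MIN(x)
            FROM (SELECT v + 1 AS x FROM @d WHERE i = @i - 1 AND j = @j
                  UNION ALL
                  SELECT v + 1      FROM @d WHERE i = @i     AND j = @j - 1
                  UNION ALL
                  SELECT v + @cost  FROM @d WHERE i = @i - 1 AND j = @j - 1) AS c;

            INSERT @d VALUES (@i, @j, @v);
            SET @j += 1;
        END;
        SET @i += 1;
    END;

    RETURN (SELECT v FROM @d WHERE i = @n AND j = @m);
END;
GO

-- One missing letter between the two cleansed values gives a distance of 1
SELECT dbo.LevenshteinDistance('myhouse', 'myhose') AS distance;
```

A small distance relative to the length of the strings suggests a likely match; where to set the threshold depends on your data.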