Approximate String Matching
Doug Easterbrook
doug at artsman.com
Tue Sep 21 01:26:21 UTC 2021
You said native database, so that kind of kills it for you.
There is an algorithm we used at one time with Omnis native databases called Soundex (https://www.archives.gov/research/census/soundex). It was used to index US census records going back to the 1880 census, with all sorts of weird European names. It does not work well for non-European or more modern names.
If you implemented the algorithm, you could store the Soundex code in its own field and search on that.
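For reference, here is a minimal Python sketch of the usual American Soundex rules (keep the first letter, code the remaining consonants as digits, collapse adjacent duplicates, pad to four characters). It is an illustration only, not the code we used in Omnis, and the function name `soundex` is just a label chosen here; the same logic would have to be ported to an Omnis method to fill the calculated field.

```python
# Consonant groups and the digit each group maps to.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Return the 4-character Soundex code for a name, or '' if it has no letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (result + "000")[:4]

print(soundex("Smith"), soundex("Smyth"))
```

Names that code to the same value (Smith and Smyth both give S530) would then be found by an exact match on the stored code.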
I had a look at the Omnis online documentation, and Omnis SQL implements the LIKE clause, which might allow the idea below. I can't say how fast it might be, but it appears that you can use it.
For example, in Postgres, you could do something like
select * from people where
name ilike '%M%K%SM%TH%'
All I did was search for mike smith, but I removed all the vowels and the space and replaced them with % (the wildcard).
this would find
Mike Smith
Mike Smyth
mike Smithson
Mike Smythson
mark smothers (% matches any number of characters, so M%K finds MARK)
etc
It is a small trick, and it only works if Omnis supports the % wildcard in Omnis SQL.
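To try the vowel-to-wildcard idea without a Postgres server, here is a minimal Python sketch. The helper names `wildcard_pattern` and `find_people` are made up for illustration, only A, E, I, O and U are treated as vowels (an assumption), and SQLite's LIKE, which is case-insensitive for ASCII letters by default, stands in for Postgres ILIKE. Whether Omnis SQL behaves the same way would still need testing.

```python
import re
import sqlite3

def wildcard_pattern(search_text):
    """Turn 'mike smith' into '%M%K%SM%TH%'."""
    # Collapse every run of vowels, spaces and punctuation into one % wildcard.
    core = re.sub(r"[AEIOU\W_]+", "%", search_text.upper()).strip("%")
    return "%" + core + "%" if core else "%"

def find_people(names, search_text):
    """Return the names matching the wildcard pattern, in insertion order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table people (name text)")
    conn.executemany("insert into people (name) values (?)",
                     [(n,) for n in names])
    rows = conn.execute(
        "select name from people where name like ? order by rowid",
        (wildcard_pattern(search_text),),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

# Sample rows from the list above, plus one hypothetical non-match.
people = ["Mike Smith", "Mike Smyth", "mike Smithson", "Mike Smythson",
          "mark smothers", "Jane Doe"]
print(wildcard_pattern("mike smith"))
print(find_people(people, "mike smith"))
```

Note that the pattern is order-sensitive: typing "smith mike" produces '%SM%TH%M%K%', which matches none of these rows.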
To make this trick work even better, we implemented something long ago.
We would calculate a 'long name' field as follows
calc LongName as upp(con(firstname,' ',lastname,' ',firstname,' ',company))
Note that I've made a hidden field in which the order of first name and last name is independent (first name appears on both sides of last name), plus company.
why do this?
Using SQL
select * from people where longname ilike '%M%K%SM%TH%' (this searches for MIKE SMITH)
or
select * from people where longname ilike '%SM%TH%M%K%' (this allows a search for SMITH MIKE)
meaning people can put the name in either order, which makes searching just a little more forgiving.
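A small Python sketch of why the repeated first name matters. The function `long_name` mirrors the calc line above; `ilike` is a deliberately minimal stand-in for SQL ILIKE that understands only the % wildcard, and the sample company "Acme Widgets" is hypothetical.

```python
import re

def long_name(first, last, company=""):
    # Mirrors upp(con(firstname,' ',lastname,' ',firstname,' ',company))
    return " ".join([first, last, first, company]).upper()

def ilike(text, pattern):
    # Minimal stand-in for SQL ILIKE: only % is supported (no _ or escapes).
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, text, flags=re.IGNORECASE | re.DOTALL) is not None

row = long_name("Mike", "Smith", "Acme Widgets")
print(row)                          # first name sits on both sides of last name
print(ilike(row, "%M%K%SM%TH%"))    # user typed: mike smith
print(ilike(row, "%SM%TH%M%K%"))    # user typed: smith mike

# Without the repeated first name, the reversed order no longer matches.
print(ilike("MIKE SMITH ACME WIDGETS", "%SM%TH%M%K%"))
```

Because LIKE patterns match left to right, the second copy of the first name is what lets "smith mike" find the same row.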
Anyway, you can try it and see if Omnis SQL allows that. As I say, we use it all the time in Postgres, for searching almost everything. I've given up on exact match searches these days.
The other option, again in Postgres, would be to use the Postgres full text search. This lets you find words in any order.
Doug Easterbrook
Arts Management Systems Ltd.
mailto:doug at artsman.com
http://www.artsman.com
Phone (403) 650-1978
> On September 20, 2021, at 3:50 PM, Jeanne Reyes <bornfree11 at gmail.com> wrote:
>
> Has anyone had the need to create an approximate string matching algorithm
> to search for data in studio? I have to make a search on a native database
> based on the user input and bring matches. I need an approximate match of
> around 75%. So if the input is 'Mike Smith', I can bring Matches that
> include Mike Smyth, Mike Peterson-Smith, M. Smithson, etc. so we have room
> for misspelling, transpositions and omissions (like searching for Smth
> still brings Smith), etc.
>
> If you have something already done that I can adapt please contact me.
>
> Thanks,
>
> Jeanne