
At Health On the Net, our search engine uses MeSH terms for query expansion. MeSH is a medical ontology that works pretty well to provide some sensible synonyms for the health domain. Query expansion, though, runs into three problems: multi-word synonyms won’t work as phrase queries; the IDF of rare synonyms will be boosted, causing unintuitive results; and multi-word synonyms won’t be matched in queries. This is kind of complicated, so it’s worth stepping through each of these problems in turn.
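To make the IDF problem a bit more concrete, here is a toy sketch in Python. It is not how Solr scores documents: it assumes an idf of the form 1 + ln(N / (df + 1)), which is the shape used by Lucene's classic TF-IDF similarity, and it ignores term frequency and length normalization. The corpus, the `idf` and `score` helpers, and the expanded query are all made up for illustration.

```python
import math
from typing import Dict, List


def idf(doc_freq: int, num_docs: int) -> float:
    """Assumed idf formula: rarer terms get a larger weight."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def score(doc_tokens: List[str], query_terms: List[str],
          doc_freqs: Dict[str, int], num_docs: int) -> float:
    """Sum the idf of every query term that occurs in the document."""
    return sum(idf(doc_freqs.get(term, 0), num_docs)
               for term in query_terms if term in doc_tokens)


# Hypothetical corpus: "dog" is common, its synonym "pooch" is rare.
corpus = [["dog"]] * 8 + [["pooch"], ["cat"]]

doc_freqs: Dict[str, int] = {}
for tokens in corpus:
    for term in set(tokens):
        doc_freqs[term] = doc_freqs.get(term, 0) + 1

# The user typed "dog"; query-time expansion added "pooch".
expanded_query = ["dog", "pooch"]
dog_score = score(["dog"], expanded_query, doc_freqs, len(corpus))
pooch_score = score(["pooch"], expanded_query, doc_freqs, len(corpus))

print(f"doc containing 'dog':   {dog_score:.3f}")
print(f"doc containing 'pooch': {pooch_score:.3f}")
```

Under these assumptions the document that only contains the rare synonym outscores the documents that contain the word the user actually typed, which is the unintuitive result mentioned above.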

Index-time vs. query-time

Where it gets complicated is when you have to decide where to fit the SynonymFilterFactory: into the query analyzer or the index analyzer? The graphic below summarizes the basic differences between index-time and query-time expansion. Our problem is specific to Solr, but the choice between these two approaches can apply to any information retrieval system. Your first, intuitive choice might be to put the SynonymFilterFactory in the query analyzer. In theory, this should have several advantages. For one, your synonyms can be swapped out at any time, without having to update the index.
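The difference between the two approaches can be sketched with a toy inverted index in Python. This is only an illustration of the general idea, not Solr's analyzer chain: `SYNONYMS`, `build_index`, and `search` are hypothetical names, tokens are split on whitespace, and a query matches a document if any of its terms does.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set

SynonymMap = Dict[str, List[str]]

# Hypothetical reciprocal synonym group.
GROUP = ["dog", "hound", "pooch"]
SYNONYMS: SynonymMap = {term: GROUP for term in GROUP}


def expand(tokens: List[str], synonyms: SynonymMap) -> List[str]:
    """Replace each token by its synonym group, if it has one."""
    out: List[str] = []
    for token in tokens:
        out.extend(synonyms.get(token, [token]))
    return out


def build_index(docs: Dict[int, str],
                synonyms: Optional[SynonymMap] = None) -> Dict[str, Set[int]]:
    """Build term -> doc ids; pass synonyms for index-time expansion."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        if synonyms:
            tokens = expand(tokens, synonyms)
        for token in tokens:
            index[token].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str,
           synonyms: Optional[SynonymMap] = None) -> Set[int]:
    """OR-match query terms; pass synonyms for query-time expansion."""
    tokens = query.lower().split()
    if synonyms:
        tokens = expand(tokens, synonyms)
    hits: Set[int] = set()
    for token in tokens:
        hits |= index.get(token, set())
    return hits


docs = {1: "my dog barks", 2: "a lazy hound", 3: "the cat sleeps"}

index_time = build_index(docs, SYNONYMS)   # synonyms baked into the index
plain = build_index(docs)                  # index left untouched

print(search(index_time, "pooch"))         # index-time expansion
print(search(plain, "pooch", SYNONYMS))    # query-time expansion
print(len(index_time), len(plain))         # index-time index has more terms

# Swapping synonyms at query time needs no reindexing.
new_synonyms: SynonymMap = {"kitty": ["kitty", "cat"]}
print(search(plain, "kitty", new_synonyms))
```

In this sketch both strategies find the same documents for "pooch", but only the query-time version lets the synonym map change without rebuilding the index, and only the index-time version makes the index larger.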

Solr provides a cool-sounding SynonymFilterFactory, which can be fed a simple text file containing comma-separated synonyms. You can even choose whether to expand your synonyms reciprocally or to specify a particular directionality. For instance, you can make “dog,” “hound,” and “pooch” all expand to “dog | hound | pooch,” or you can specify that “dog” maps to “hound” but not vice-versa, or you can make them all collapse to “dog.” This part of the synonym handling is very flexible and works quite well. Still, there are lots of good ways to shoot yourself in the foot.
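The three behaviors described above (reciprocal expansion, one-way mapping, and collapsing to a single term) can be mimicked in a few lines of Python. This is a rough sketch, not Solr's parser: it assumes comma-separated groups for equivalent terms and `=>` for directional rules, handles single-word terms only, and the `parse_synonym_rules` and `apply_synonyms` helpers are hypothetical.

```python
from typing import Dict, List


def parse_synonym_rules(text: str, expand: bool = True) -> Dict[str, List[str]]:
    """Build a token -> replacement-tokens map from synonym rules."""
    mapping: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Directional rule: left-hand terms are replaced by right-hand terms.
            lhs, rhs = line.split("=>", 1)
            targets = [t.strip() for t in rhs.split(",") if t.strip()]
            for source in (s.strip() for s in lhs.split(",")):
                if source:
                    mapping[source] = targets
        else:
            # Equivalence rule: every listed term means the same thing.
            terms = [t.strip() for t in line.split(",") if t.strip()]
            # expand=True: each term expands to the whole group.
            # expand=False: each term collapses to the first one.
            targets = terms if expand else terms[:1]
            for term in terms:
                mapping[term] = targets
    return mapping


def apply_synonyms(tokens: List[str], mapping: Dict[str, List[str]]) -> List[str]:
    """Replace each token by its synonyms, or keep it if no rule applies."""
    out: List[str] = []
    for token in tokens:
        out.extend(mapping.get(token, [token]))
    return out


reciprocal = parse_synonym_rules("dog, hound, pooch")
collapsed = parse_synonym_rules("dog, hound, pooch", expand=False)
one_way = parse_synonym_rules("dog => hound")

print(apply_synonyms(["my", "pooch"], reciprocal))  # ['my', 'dog', 'hound', 'pooch']
print(apply_synonyms(["my", "pooch"], collapsed))   # ['my', 'dog']
print(apply_synonyms(["dog"], one_way))             # ['hound']
print(apply_synonyms(["hound"], one_way))           # ['hound']
```

Multi-word synonyms are deliberately left out here, since they need position-aware handling that a simple token-to-tokens map cannot express.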

A Rover by any other name would taste just as sweet. As it turns out, though, Solr doesn’t make synonym expansion as easy as you might like.
