Let's compare!
Welcome to our comparison section!
Here you can find evaluation results for different approaches to improving search relevance and related use cases.
For our first comparison, we looked at stemming algorithms, a standard approach that has been used in search engines for decades. We used the stemmer token filters available in Elasticsearch and compared them to the default search setup without any stemming. You can read more about what stemming is and how useful it is for search in our stemming comparison.
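To give a rough idea of what such a setup can look like, here is a minimal sketch of index settings with and without a `stemmer` token filter. The helper `build_index_settings`, the filter name `my_stemmer`, and the analyzer name `comparison_analyzer` are made up for illustration and are not necessarily what we used in the comparison itself.

```python
import json


def build_index_settings(language=None):
    """Build Elasticsearch index settings as a plain dict.

    With language=None the analyzer only tokenizes and lowercases, which
    approximates a baseline without stemming. Otherwise a `stemmer` token
    filter for the given language is appended to the filter chain.
    All names below are hypothetical and chosen for this sketch only.
    """
    filter_chain = ["lowercase"]
    filter_definitions = {}
    if language is not None:
        filter_definitions["my_stemmer"] = {"type": "stemmer", "language": language}
        filter_chain.append("my_stemmer")
    return {
        "settings": {
            "analysis": {
                "filter": filter_definitions,
                "analyzer": {
                    "comparison_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": filter_chain,
                    }
                },
            }
        }
    }


baseline_settings = build_index_settings()
stemmed_settings = build_index_settings("english")
print(json.dumps(stemmed_settings, indent=2))
```

The resulting dict could be passed as the request body when creating an index, so that the same documents can be indexed once with and once without stemming.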
The topic of our second comparison was decomposition. Because decomposition can improve search results considerably, especially for German, we compared different Elasticsearch built-ins and analyzed in detail where decomposition works well and where it doesn't. Read about it in our decomposition comparison.
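As a toy illustration of the idea behind dictionary-based decomposition, the sketch below looks up every substring of a token in a small word list and emits the matches next to the original token. The function `decompound` and the word list are assumptions made for this example; this is not the Elasticsearch implementation.

```python
def decompound(token, word_list, min_subword_size=2):
    """Return the lowercased token followed by all dictionary subwords found in it.

    A brute-force sketch: every substring of at least `min_subword_size`
    characters is checked against the word list. The original token is kept
    so that exact matches still work.
    """
    token_lower = token.lower()
    dictionary = {word.lower() for word in word_list}
    subwords = []
    for start in range(len(token_lower)):
        for end in range(start + min_subword_size, len(token_lower) + 1):
            piece = token_lower[start:end]
            if piece in dictionary and piece != token_lower:
                subwords.append(piece)
    return [token_lower] + subwords


parts = decompound("Donaudampfschiff", ["donau", "dampf", "schiff"])
print(parts)  # ['donaudampfschiff', 'donau', 'dampf', 'schiff']
```

With the subwords indexed alongside the compound, a query for "Schiff" can match a document that only contains "Donaudampfschiff". The quality of the result depends heavily on the word list.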
Our third comparison will focus on embeddings and the emerging vector search capabilities in search engines. Embeddings are currently a rather hot topic and part of several state-of-the-art deep learning language models which have achieved record benchmark scores in Natural Language Processing (NLP) tasks. You can read more about embeddings and the current state-of-the-art in language modeling in our Guides Section. Alternatively, you can check out how a simple approach to using embeddings in search compares to the default search in Elasticsearch in our Embeddings Comparison.
The evaluations are based on several publicly available datasets collected by the University of Glasgow. We also provide code that makes it easy to use these datasets for your own experiments. This table gives a short overview of all data sets we have examined to date:
Corpus | Documents | Queries | Relevant Use-Cases |
---|---|---|---|
ADI | 83 | 35 | Q&A, (Search, too small) |
CACM | 3,204 | 64 | Search |
CISI | 1,460 | 112 | Q&A, Search, (includes ADI queries) |
Cranfield | 1,400 | 225 | Q&A, Search |
LISA | 6,004 | 35 | Search |
MS MARCO | 3,213,835 | 5,193 | Search, Q&A |
Medline | 1,033 | 30 | Search |
NPL | 11,429 | 93 | Search |
Time | 423 | 83 | Search |
Which data set is best for evaluation depends strongly on the use case. To get a better understanding of what the data sets look like, check out our Data Sets Guide.
Feel free to browse through our comparisons:
Stay tuned for upcoming evaluations and subscribe to our newsletter so you don't miss any updates.