Full-text search over files on the local file system.
The search runs on an inverted index, which is serialized with the FlatBuffers library. The implementation is based on a Habr article.
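As a rough illustration of what an inverted index stores, here is a minimal Python sketch. It is not the project's code: the tokenizer, the `build_index` helper, and the sample file names are assumptions made for the example, and the FlatBuffers serialization step is left out.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word-like tokens (assumed rule)."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def build_index(docs):
    """Map each term to {doc_id: term frequency}, i.e. a posting list."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index

# Hypothetical documents standing in for files read from disk.
docs = {
    "a.cpp": "for (auto x : vector) list.push_back(x);",
    "b.cpp": "while (true) vector.clear();",
}
index = build_index(docs)
print(sorted(index["vector"]))  # ['a.cpp', 'b.cpp']
```

Looking up a term is then a single dictionary access, which is what makes the structure suitable for search over many files.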
The ranking problem (ordering the documents that match a query) is also solved. BM25 is used as the ranking function, with TF-IDF statistics (term frequency and inverse document frequency) as its input.
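For reference, the textbook Okapi BM25 score can be sketched in Python as below. This is an illustration under assumptions, not the project's implementation: the function name `bm25_score`, the defaults `k1=1.2` and `b=0.75`, and the non-negative IDF variant are common choices that the source does not confirm.

```python
import math

def bm25_score(query_terms, doc_tf, doc_len, avg_len, df, n_docs, k1=1.2, b=0.75):
    """Score one document for a query.

    doc_tf:  {term: frequency of the term in this document}
    df:      {term: number of documents containing the term}
    k1, b:   assumed tuning parameters (common defaults)
    """
    score = 0.0
    for term in query_terms:
        tf = doc_tf.get(term, 0)
        if tf == 0:
            continue  # a term absent from the document contributes nothing
        # Inverse document frequency: rarer terms weigh more.
        idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
        # Term frequency, saturated by k1 and normalized by document length.
        norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
        score += idf * norm
    return score

# Toy statistics: 3 documents in total, "vector" occurs in 2 of them.
df = {"vector": 2}
high = bm25_score(["vector"], {"vector": 3}, 10, 10, df, 3)
low = bm25_score(["vector"], {"vector": 1}, 10, 10, df, 3)
none = bm25_score(["vector"], {"list": 4}, 10, 10, df, 3)
print(high > low > none)  # True
```

Documents are then sorted by descending score, so a file that mentions the query terms more often, relative to its length, ranks higher.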
Two applications were therefore implemented: an indexer, which builds the search index, and a searcher, which runs queries against the built index.
The UI is implemented with the GLFW and Dear ImGui libraries.
Queries support parentheses and the AND and OR operators, so a query is a logical expression. Words are separated by one or more spaces.
The following queries are considered valid:
- "for"
- "vector OR list"
- "vector AND list"
- "(for)"
- "(vector OR list)"
- "(vector AND list)"
- "(while OR for) AND vector"
- "for AND and"
The following queries are considered invalid:
- "for AND"
- "vector list"
- "for AND OR list"
- "vector Or list"
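
The rules implied by these examples can be sketched as a small recursive-descent validator. This is a Python illustration under assumptions, not the project's parser: it treats only uppercase `AND` and `OR` as operators (as `for AND and` and `vector Or list` suggest), treats every other token as a word, and does not model operator precedence, which the examples do not specify. The names `tokenize` and `is_valid` are hypothetical.

```python
import re

OPERATORS = {"AND", "OR"}  # assumed to be case-sensitive

def tokenize(query):
    """Split on whitespace, keeping each parenthesis as its own token."""
    return re.findall(r"[()]|[^\s()]+", query)

def is_valid(query):
    """Check the grammar: expr := term ((AND | OR) term)*, term := WORD | ( expr )."""
    tokens = tokenize(query)
    pos = 0

    def parse_expr():
        nonlocal pos
        if not parse_term():
            return False
        while pos < len(tokens) and tokens[pos] in OPERATORS:
            pos += 1  # consume the operator, then require an operand
            if not parse_term():
                return False
        return True

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            return False
        tok = tokens[pos]
        if tok == "(":
            pos += 1
            if not parse_expr():
                return False
            if pos >= len(tokens) or tokens[pos] != ")":
                return False
            pos += 1
            return True
        if tok == ")" or tok in OPERATORS:
            return False  # an operand was expected here
        pos += 1  # plain word
        return True

    # The whole input must be consumed; leftovers mean adjacent operands.
    return parse_expr() and pos == len(tokens)

print(is_valid("(vector OR list)"))  # True
print(is_valid("vector list"))       # False
```

Two adjacent words fail because the parser stops after the first operand with input left over, which is also why `vector Or list` is rejected under the case-sensitivity assumption.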