About: Components of an information retrieval system, implemented for course credit
- Vector Space Retrieval (vsr)
- Relevance Feedback
- PageRank
- Text Categorization
Perform retrieval by ranking documents by their cosine similarity to the query.
Invoke with:
java ir.vsr.InvertedIndex -html [path to document corpus]
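For readers unfamiliar with cosine similarity, here is a minimal sketch of the measure on sparse term-weight vectors. It is not taken from ir.vsr.InvertedIndex; the class name CosineSketch, the method names, and the Map-based vector representation are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

class CosineSketch {
    // Cosine similarity between two sparse term-weight vectors.
    // Returns 0.0 when either vector is empty or has zero length.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double w = b.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        Map<String, Double> query = new HashMap<>();
        query.put("retrieval", 1.0);
        query.put("index", 1.0);

        Map<String, Double> doc = new HashMap<>();
        doc.put("retrieval", 2.0);
        doc.put("ranking", 1.0);

        System.out.println(cosine(query, doc));
    }
}
```

Documents that share no terms with the query score 0, and a document whose vector points in the same direction as the query scores 1, regardless of document length.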
Re-rank retrieved results with relevance feedback and pseudo-feedback
Invoke with:
java ir.vsr.InvertedIndex -html -pseudofeedback 8 [path to document corpus]
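As a rough illustration of how feedback can revise a query, the sketch below uses the standard Rocchio-style update q' = alpha*q + beta*mean(relevant) - gamma*mean(nonRelevant). This is an assumption about the general technique, not a description of this project's code: the class FeedbackSketch, its methods, and the alpha, beta, gamma, and m parameters are all hypothetical. In particular, treating the number passed to -pseudofeedback as a count of top-ranked documents assumed relevant is a guess.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FeedbackSketch {
    // Adds scale * src to dest, term by term.
    static void addScaled(Map<String, Double> dest, Map<String, Double> src, double scale) {
        for (Map.Entry<String, Double> e : src.entrySet()) {
            dest.merge(e.getKey(), scale * e.getValue(), Double::sum);
        }
    }

    // Rocchio-style query update (hypothetical parameter names):
    // q' = alpha*q + beta*mean(relevant) - gamma*mean(nonRelevant)
    static Map<String, Double> reformulate(Map<String, Double> query,
                                           List<Map<String, Double>> relevant,
                                           List<Map<String, Double>> nonRelevant,
                                           double alpha, double beta, double gamma) {
        Map<String, Double> revised = new HashMap<>();
        addScaled(revised, query, alpha);
        for (Map<String, Double> d : relevant) {
            addScaled(revised, d, beta / relevant.size());
        }
        for (Map<String, Double> d : nonRelevant) {
            addScaled(revised, d, -gamma / nonRelevant.size());
        }
        // Terms whose weight is zero or negative are dropped from the query.
        revised.values().removeIf(w -> w <= 0.0);
        return revised;
    }

    // Pseudo-feedback: with no user judgments, assume the top m ranked
    // documents are relevant and use no negative examples.
    static Map<String, Double> pseudoFeedback(Map<String, Double> query,
                                              List<Map<String, Double>> ranked,
                                              int m, double alpha, double beta) {
        List<Map<String, Double>> top = ranked.subList(0, Math.min(m, ranked.size()));
        return reformulate(query, top, new ArrayList<>(), alpha, beta, 0.0);
    }
}
```

The revised query would then be run against the index again to produce the re-ranked results.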
Spider a corpus of documents and perform link analysis with PageRank.
Invoke with:
java ir.vsr.PageRankInvertedIndex -html -weight 0 crawledPages
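The link-analysis step can be sketched as a power iteration over the link graph. This is a generic PageRank sketch, not the code behind ir.vsr.PageRankInvertedIndex: the class PageRankSketch, the adjacency-array input, the damping value 0.85, and the uniform handling of pages with no outgoing links are all assumptions for illustration.

```java
import java.util.Arrays;

class PageRankSketch {
    // links[i] lists the indices of the pages that page i links to.
    // Returns one rank per page; the ranks sum to 1.
    static double[] pageRank(int[][] links, double damping, int iterations) {
        int n = links.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);

        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            double dangling = 0.0;

            for (int i = 0; i < n; i++) {
                if (links[i].length == 0) {
                    // A page with no out-links spreads its rank over all pages.
                    dangling += rank[i];
                } else {
                    double share = rank[i] / links[i].length;
                    for (int j : links[i]) {
                        next[j] += share;
                    }
                }
            }

            for (int j = 0; j < n; j++) {
                next[j] = (1.0 - damping) / n + damping * (next[j] + dangling / n);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Page 0 links to 1 and 2, page 1 links to 2, page 2 links to 0.
        int[][] links = { {1, 2}, {2}, {0} };
        System.out.println(Arrays.toString(pageRank(links, 0.85, 50)));
    }
}
```

In the three-page example, page 2 receives links from both other pages and ends up with the highest rank, while page 1, linked only from page 0, ends up with the lowest.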
Categorize a set of sample documents with the Rocchio and KNN algorithms.
Invoke with:
java ir.classifiers.TestKNN [-K K]
java ir.classifiers.TestRocchio [-neg]
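The two categorizers can be sketched as follows, assuming documents are sparse term-weight vectors compared by cosine similarity. Nothing here is taken from ir.classifiers: the class CategorizerSketch, the Example type, and the method names are hypothetical, the k parameter is only assumed to correspond to the K option above, and the -neg option is not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CategorizerSketch {
    // A labeled training example: a sparse term-weight vector and its category.
    static class Example {
        final Map<String, Double> vector;
        final String category;

        Example(Map<String, Double> vector, String category) {
            this.vector = vector;
            this.category = category;
        }
    }

    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, na = 0.0, nb = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            na += e.getValue() * e.getValue();
            Double w = b.get(e.getKey());
            if (w != null) dot += e.getValue() * w;
        }
        for (double w : b.values()) nb += w * w;
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / Math.sqrt(na * nb);
    }

    // KNN: majority vote among the k training examples most similar to doc.
    // On a tied vote, the category that reached the count first (nearer neighbors) wins.
    static String knn(List<Example> train, Map<String, Double> doc, int k) {
        List<Example> sorted = new ArrayList<>(train);
        sorted.sort((x, y) -> Double.compare(cosine(y.vector, doc), cosine(x.vector, doc)));
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        for (Example e : sorted.subList(0, Math.min(k, sorted.size()))) {
            int v = votes.merge(e.category, 1, Integer::sum);
            if (best == null || v > votes.get(best)) best = e.category;
        }
        return best;
    }

    // Rocchio: sum the training vectors of each category into a prototype,
    // then assign doc to the category with the most similar prototype.
    static String rocchio(List<Example> train, Map<String, Double> doc) {
        Map<String, Map<String, Double>> prototypes = new HashMap<>();
        for (Example e : train) {
            Map<String, Double> p = prototypes.computeIfAbsent(e.category, c -> new HashMap<>());
            for (Map.Entry<String, Double> t : e.vector.entrySet()) {
                p.merge(t.getKey(), t.getValue(), Double::sum);
            }
        }
        String best = null;
        double bestSim = -1.0;
        for (Map.Entry<String, Map<String, Double>> p : prototypes.entrySet()) {
            double sim = cosine(p.getValue(), doc);
            if (sim > bestSim) {
                bestSim = sim;
                best = p.getKey();
            }
        }
        return best;
    }
}
```

KNN keeps every training example and defers all work to classification time, whereas Rocchio compresses each category into a single prototype vector, which makes classification cheaper but less sensitive to categories made up of several distinct clusters.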