uSearch
We can use natural language processing techniques to mine information from the Union College course catalogue. Looking for a course that matches your interests? We can index the descriptions of all courses available at Union and use vector space methods to match a user query against those descriptions.
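As a rough illustration of vector space matching, the sketch below weights terms with TF-IDF and ranks courses by cosine similarity to the query. It is not the uSearch implementation: the course IDs, the descriptions, and the function names (`tokenize`, `build_index`, `weight`, `cosine`, `search`) are all made up for the example, and the smoothed IDF formula is only one of several common choices.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def weight(tokens, idf):
    """Turn a token list into a sparse TF-IDF vector (dict term -> weight).

    Terms that never occurred in the indexed documents are ignored.
    """
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}


def build_index(docs):
    """docs maps a course ID to its description.

    Returns (vectors, idf): one TF-IDF vector per course, plus the IDF table.
    """
    tokenized = {cid: tokenize(text) for cid, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))  # count each term once per document
    # Assumed smoothing: log(N / df) + 1, so very common terms keep a small weight.
    idf = {t: math.log(n_docs / df) + 1.0 for t, df in doc_freq.items()}
    vectors = {cid: weight(tokens, idf) for cid, tokens in tokenized.items()}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, vectors, idf, top_k=3):
    """Return up to top_k (course_id, score) pairs, best match first."""
    q = weight(tokenize(query), idf)
    scored = [(cosine(q, vec), cid) for cid, vec in vectors.items()]
    scored = [(score, cid) for score, cid in scored if score > 0.0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(cid, score) for score, cid in scored[:top_k]]


# Hypothetical catalogue entries, invented for this example.
COURSES = {
    "CSC-101": "Introduction to computer science and programming in Python",
    "BIO-110": "Cells, genetics and evolution for biology majors",
    "CSC-329": "Natural language processing: parsing, semantics and machine translation",
}
VECTORS, IDF = build_index(COURSES)
```

With this toy catalogue, `search("natural language", VECTORS, IDF)` ranks the hypothetical `CSC-329` first, because it is the only description containing either query term. A query whose terms appear in no description returns an empty list.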
An initial prototype, uSearch, exists (it can be tested on campus only), but there are many extensions that could be implemented, including:
- smart methods for refining queries
- expanding queries using lexical semantic resources such as WordNet
- speed increases
- performance improvements through better handling of stop words and punctuation
- expansion to include phrases
- improvements in result presentation
- incorporating filters, such as Gen Ed. designations
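A few of these extensions can be sketched in a handful of lines. The example below is a hedged illustration only, not part of the existing prototype: the stop-word list is a tiny hand-picked sample, phrases are approximated by adjacent-word bigrams, and the designation codes `GEN-A` and `GEN-B` are placeholders rather than real Gen Ed. designations. All function names are hypothetical.

```python
import re

# A small illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "for", "on", "with"}
)


def preprocess(text, stop_words=STOP_WORDS):
    """Lowercase, drop punctuation, and remove stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stop_words]


def with_phrases(tokens):
    """Append two-word phrases (bigrams) to the single-word tokens.

    Bigrams are built from whatever token list is passed in, so if stop
    words were removed first, a bigram may join words that were not
    strictly adjacent in the original text.
    """
    bigrams = [f"{first}_{second}" for first, second in zip(tokens, tokens[1:])]
    return tokens + bigrams


def filter_by_designation(courses, designation):
    """Keep only the courses that carry the requested designation."""
    return [c for c in courses if designation in c["designations"]]


# Hypothetical course records with placeholder designation codes.
SAMPLE_COURSES = [
    {"id": "CSC-101", "designations": {"GEN-A"}},
    {"id": "BIO-110", "designations": {"GEN-B"}},
    {"id": "CSC-329", "designations": set()},
]
```

For instance, `preprocess("An Introduction to the History of Art!")` yields `["introduction", "history", "art"]`, and `filter_by_designation(SAMPLE_COURSES, "GEN-A")` keeps only the `CSC-101` record. Applying the designation filter before scoring would also shrink the candidate set, which is one simple route to the speed increases mentioned above.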
References
U. Kruschwitz, A. De Roeck, P. Scott, S. Steel, R. Turner, N. Webb (1999). Extracting Semi-Structured Data - Lessons Learnt. AAAI Fall Symposium on Using Layout for the Generation, Understanding or Retrieval of Documents.