uSearch

We can use natural language processing techniques to mine information from the Union College course catalogue. Interested in finding a course that matches your interests? We can index the descriptions of all the courses available at Union and use vector space methods to match a user query against those descriptions.
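As a rough illustration of vector space matching, here is a minimal sketch in Python. It assumes TF-IDF term weighting and cosine similarity, which are one common choice and not necessarily what the uSearch prototype uses. The course codes and descriptions in `COURSES` are invented for the example, and every function name is hypothetical.

```python
import math
from collections import Counter

# Invented sample data; not taken from the real Union College catalogue.
COURSES = {
    "CSC-101": "introduction to programming with python",
    "CSC-320": "natural language processing and text mining",
    "BIO-110": "introduction to cell biology and genetics",
}

def tokenize(text):
    # Simplest possible tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def weight(tokens, idf):
    # Term frequency times inverse document frequency; unknown terms are dropped.
    tf = Counter(tokens)
    return {t: count * idf[t] for t, count in tf.items() if t in idf}

def build_index(docs):
    # Returns (one sparse TF-IDF vector per course, the idf table).
    tokenized = {cid: tokenize(text) for cid, text in docs.items()}
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))
    n = len(docs)
    idf = {t: math.log(n / count) + 1.0 for t, count in df.items()}
    vectors = {cid: weight(tokens, idf) for cid, tokens in tokenized.items()}
    return vectors, idf

def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, vectors, idf, top_k=3):
    # Rank courses by similarity to the query; courses scoring zero are omitted.
    q = weight(tokenize(query), idf)
    scored = [(cosine(q, vec), cid) for cid, vec in vectors.items()]
    scored = [(score, cid) for score, cid in scored if score > 0.0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cid for _, cid in scored[:top_k]]

VECTORS, IDF = build_index(COURSES)
```

With this sample data, `search("language processing", VECTORS, IDF)` returns only `CSC-320`, since no other description shares a term with the query. A query whose words appear in no description returns an empty list.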

An initial prototype of uSearch exists (it can be tested on campus only), but there are many extensions that could be implemented, including:

  • smart methods for refining queries
  • expanding queries using lexical semantic resources such as WordNet
  • speed increases
  • performance improvements through stop-word removal and better handling of punctuation
  • expansion to include phrases
  • improvements in result presentation
  • incorporating filters, such as Gen Ed. designations
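A few of these extensions can be sketched in Python under stated assumptions. The stop-word list is a tiny illustrative one, the synonym table is hand-written and stands in for a real lexical resource such as WordNet, and the course records and designation codes (`"QR"`, `"HUM"`) are invented. None of the function names come from the uSearch prototype.

```python
import re

# Tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "for", "with"})

def tokenize(text, stop_words=STOP_WORDS):
    # Lowercase, keep only alphanumeric runs (this drops punctuation),
    # then remove stop words.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in stop_words]

def with_phrases(tokens):
    # Add two-word phrases (bigrams) as extra index terms.
    # Note: built after stop-word removal, so a bigram may join words
    # that were separated by a stop word in the original text.
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams

def expand_query(tokens, synonyms):
    # Append synonyms of each query term, skipping duplicates.
    # `synonyms` is a plain dict here; it stands in for a WordNet lookup.
    expanded = list(tokens)
    for token in tokens:
        for synonym in synonyms.get(token, ()):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded

def filter_by_designation(courses, designation):
    # Keep only the courses that carry the given (hypothetical) Gen Ed code.
    return [c for c in courses if designation in c["designations"]]

# Invented sample records for the filter.
SAMPLE = [
    {"id": "MTH-100", "designations": {"QR"}},
    {"id": "ENG-200", "designations": {"HUM"}},
]
```

For example, `tokenize("An Introduction to Machine Learning, and the Web!")` yields `["introduction", "machine", "learning", "web"]`, and passing that through `with_phrases` would add terms such as `machine_learning`. Query expansion tends to raise recall at the cost of precision, so how many synonyms to add is a design choice worth evaluating.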

References

U. Kruschwitz, A. De Roeck, P. Scott, S. Steel, R. Turner, N. Webb (1999). Extracting Semi-Structured Data - Lessons Learnt. AAAI Fall Symposium on Using Layout for the Generation, Understanding or Retrieval of Documents.

Nick Webb
Associate Professor of Computer Science / Director of Data Analytics

My research interests include Natural Language Processing, Social Robotics and Data Analytics.
