Information Storage and Retrieval

CSCE 561
Monday and Wednesday - 2:30 pm to 3:45 pm
OLVR 118
Fall 2017


Instructor: Dr. Vijay Raghavan
Office: OLVR 305
Office Hours: Tue: 3:00 pm - 4:30 pm
Wed: 4:00 pm - 5:30 pm
Phone: (337) 482-6603
Email: raghavan@louisiana.edu
Grader: Sarika Kondra
Office: LINC Lab
Office Hours: Appointment Only
Phone: (337) 806-6587
Email: sarika.vm35@gmail.com
C00219805@louisiana.edu



Roster

Click here to check the class roster

Please check the roster and let me (the TA) know if your name is not on it!


Prerequisites

CMPS 460 or consent of the instructor.

Some background knowledge on WWW protocols for database access from web browsers is assumed.


Outline

Modern retrieval systems that operate on text databases can provide interactive, user-customizable techniques for retrieval. In contrast, many tools available for accessing text databases on the Web use techniques that are quite primitive. Thus, there is a need to make state-of-the-art search algorithms available over the Internet. In this context, we will explore intelligent information retrieval techniques and protocols associated with the implementation of Web-browser based interfaces to document database servers. It is also important to extend the search algorithms to heterogeneous (multimedia) data. This aspect requires the development of appropriate indexing schemes in order that the search algorithms applicable to text databases can be extended to other types (e.g., pictures, video, sound) of data. The course will consider research issues in this context. We will also look at some aspects of database and text mining.


Policies on Cheating

Cheating: Cheating of any sort will NOT be tolerated. All work you submit must be entirely your own. If a student is found cheating on an assignment (programming or non-programming), he/she will be given a 0 for that assignment. This applies to both the person showing their work and the person copying it. If a student is found cheating on a test, he/she will be given a grade of 'C' or 'F', and in some cases the matter will also be brought to the attention of the Dean (again, this applies to both the person showing their work and the person copying it).


References

Note: The books in bold are available for overnight use from the Reserve Section of the Dupré Library.

Note: Many relevant materials are available on the Internet. Also, visit my webpage and click on "Some URLs of Interest to my Students" and any other links that interest you. [Link]

 


Grading Policy

*Typically, a term project involves the design and implementation of search and indexing algorithms, interface requirements, or other information retrieval system components.


Class Notes

Index Lecture Link Link
1 Introduction   [pdf]
2 IR Models (part 1) - Boolean Retrieval   [pdf] [pdf]
3 IR Models (part 2)   [pdf]
4 Retrieval System Evaluation (part 1)   [pdf]
5 Retrieval System Evaluation (part 2)   [pdf]
6 RUBRIC (part 1)   [pdf]
7 RUBRIC (part 2) [ppt]  
8 Vector Space Model   [pdf]
9 GVSM   [pdf]
10 Learning (Part 1)   [pdf]
11 Learning (Part 2)   [pdf]
12 Notes for Probabilistic Retrieval Model   [pdf]

Assignments

  1. Assignment 1: Due 10/02/2017
  2. Assignment 2: Due 10/27/2017
  3. Assignment 3: Due 11/29/2017

Note:

  1. All non-programming assignments should be written legibly (Please check Policies on Cheating).
  2. Before submission, make a photocopy of the assignment for your reference.
  3. Only the original should be submitted.
  4. Retain the photocopy. DO NOT submit it.
  5. Please staple the question paper on top of the answer sheet.
  6. Answer sheets that are not stapled properly will not be graded.
  7. All assignments should be done individually unless otherwise stated.
  8. Academic dishonesty will be prosecuted in accordance with the rules and regulations specified by the university.
  9. All answer sheets should be numbered.
  10. Please begin the answer to each question on a separate page.
  11. Please provide an index, stating each question number and the corresponding page number where its answer can be found.

Final Project

Class project proposal and the final report should have the following details

Useful Links

  1. Information Retrieval (IR) Resources [Link]
  2. Chapter 1 from Salton's Book [Scanned_PDF]
  3. What do you say after you say, 'I work in IR'? [PDF]
  4. Information Retrieval on the World-wide Web [PDF]
  5. A General Mathematical Model For Information Retrieval Systems [PDF]
  6. Boolean Retrieval Model [Link]
  7. Fuzzy Set Theory to Document Retrieval [Scanned_PDF]
  8. A Critical Analysis of Vector Space Model for Information Retrieval [Scanned_PDF]
  9. On Modeling of Information Retrieval Concept in Vector Spaces [PDF]
  10. RUBRIC: A Rule System for Information Retrieval [PDF]
  11. A Critical Investigation of Recall and Precision as Measures of Retrieval System Performance [PDF]
  12. Linear Structure for Information Retrieval [Scanned_PDF]
  13. The Shape of the Web and Its Implications for Searching the Web [PDF]
  14. Meta Search (1) [Link]
  15. Meta Search (2) [Link]
  16. Enhancing Internet Search Engines to Achieve Concept-based Retrieval [PPT]
  17. Content and Link Structure Analysis for Searching the Web [PDF]
  18. Crawling the hidden Web [PDF]
  19. Personalized Search [PDF]
  20. Pattern Recognition: Statistical, Structural, and Neural Approaches [Scanned_PDF]
  21. Text Retrieval Quality: A Primer [Link]
  22. Concept Map for Beginners [Link]

Popular Information Retrieval Systems


Sample Exam Papers

These are the previous midterm and final exam question papers for your reference.


Last updated: August 25, 2017