Unit I
Information Retrieval and Web Search - Basic Concepts of Information Retrieval - Information Retrieval Models - Relevance Feedback - Evaluation Measures - Text and Web Page Pre-Processing - Inverted Index and Its Compression - Latent Semantic Indexing - Meta-Search - Web Spamming.
Unit II
Link Analysis - Social Network Analysis - Co-Citation and Bibliographic Coupling - PageRank - PageRank Algorithm - HITS - Community Discovery.
Unit III
Web Crawling - A Basic Crawler Algorithm - Implementation Issues - Universal Crawlers - Focused Crawlers - Evaluation - Crawler Ethics and Conflicts - Some New Developments.
Unit IV
Structured Data Extraction: Wrapper Generation - Preliminaries - Wrapper Induction - Instance-Based Wrapper Learning - Automatic Wrapper Generation Problems - String Matching and Tree Matching - Multiple Alignment - Center Star Method - Partial Tree Alignment - Building DOM Trees - Extraction Based on a Single List Page: Flat Data Records - Extraction Based on a Single List Page - Extraction Based on Multiple Pages.
Unit V
Opinion Mining - Sentiment Classification - Feature-Based Opinion Mining and Summarization - Comparative Sentence and Relation Mining - Opinion Search - Opinion Spam - Web Usage Mining - Data Modeling for Web Usage Mining - Discovery and Analysis of Web Usage Patterns.
Text Book:
Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Bing Liu, Springer, December 2006.
Reference Book:
Handbook of Text and Data Mining, Min Song, Yi-Fang Wu, IGI Global, 2008.

