M.Tech. (CS) - Statistical Natural Language Processing

Unit I
Introduction - Rationalist and Empiricist Approaches to Language - Scientific Content - The Ambiguity of Language: Why NLP Is Difficult - Mathematical Foundations - Elementary Probability Theory - Essential Information Theory - Linguistics Essentials - Parts of Speech and Morphology - Phrase Structure - Semantics and Pragmatics.

Unit II
Collocations - Frequency - Mean and Variance - Hypothesis Testing - Mutual Information - The Notion of Collocation - Statistical Inference: n-gram Models over Sparse Data - Bins: Forming Equivalence Classes - Statistical Estimators - Combining Estimators - Lexical Acquisition - Evaluation Measures.

Unit III
Markov Models - Markov Models - Hidden Markov Models - The Three Fundamental Questions for HMMs - HMMs: Implementation, Properties, and Variants - Part-of-Speech Tagging - The Information Sources in Tagging - Markov Model Taggers - Hidden Markov Model Taggers.

Unit IV
Statistical Alignment and Machine Translation - Text Alignment - Word Alignment - Statistical Machine Translation - Clustering - Hierarchical Clustering - Non-Hierarchical Clustering.

Unit V
Topics in Information Retrieval - Some Background on Information Retrieval - The Vector Space Models - Term Distribution Models - Latent Semantic Indexing - Discourse Segmentation - Text Categorization - Decision Trees.

Text Book:
"Foundations of Statistical Natural Language Processing" - Christopher D. Manning and Hinrich Schütze - MIT Press - 1999.

Reference Books:
"Speech and Language Processing" - Daniel Jurafsky and James Martin - Pearson Education - 2008.