Lign 165: Computational Linguistics

Prof. Andrew Kehler
UCSD Department of Linguistics
kehler@ling.ucsd.edu
(858) 534-6239

Fall, 2003
MWF, 12:00-12:50, York 4050A
Office Hours: Wed 2-3:30 or by appt. (McGill 5237)

TA: Henry Beecher, hbeecher@ling.ucsd.edu
Office Hours: Mon 3-4 and Thurs 1-2, or by appt. (McGill 2137; 822-4904)

Overview

This course provides an introduction to the fundamental concepts of computational linguistics, including computational approaches to language modelling, part-of-speech tagging, syntactic analysis, semantic interpretation, and discourse processing.

Prerequisites

Lign 101 is strongly recommended. No programming experience is assumed. Formal reasoning skills, an enthusiasm for natural language, and an ability to become computer savvy are all required.

Textbooks

Daniel Jurafsky and James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Prentice Hall, 2000, ISBN: 0130950696.

Fernando Pereira and Stuart Shieber, Prolog and Natural-Language Analysis, CSLI Publications, 1987, ISBN: 0937073180. An on-line version is available at http://www.mtome.com/pnla-digital.html

Administrivia

There will be five homework assignments, distributed approximately every four lectures, each worth 9% of your grade. They will include a mixture of paper-and-pencil and computer exercises. Assignments are due in class. Late assignments will be penalized 5% if turned in by 5pm on the due date and 10% per day after that, and will not be accepted at all once graded assignments and answer keys have been distributed.

There will be two exams: a midterm and a final, worth 20% and 35% of your grade respectively.
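
As a rough illustration of how these weights combine, here is a minimal Python sketch. It assumes every score is expressed as a percentage from 0 to 100, which this syllabus does not specify. The function name course_grade and the sample scores are hypothetical. Late penalties are not modelled.

```python
# Weights as stated above: five homeworks at 9% each,
# a midterm at 20%, and a final at 35%.
HOMEWORK_WEIGHT = 0.09
MIDTERM_WEIGHT = 0.20
FINAL_WEIGHT = 0.35
NUM_HOMEWORKS = 5


def course_grade(homeworks, midterm, final):
    """Return the weighted course percentage (0-100).

    Assumption: each score is already a percentage from 0 to 100.
    Late penalties are deliberately left out of this sketch.
    """
    if len(homeworks) != NUM_HOMEWORKS:
        raise ValueError("expected exactly five homework scores")
    homework_part = HOMEWORK_WEIGHT * sum(homeworks)
    return homework_part + MIDTERM_WEIGHT * midterm + FINAL_WEIGHT * final


# The weights cover the whole grade: 5 * 9% + 20% + 35% = 100%.
total_weight = NUM_HOMEWORKS * HOMEWORK_WEIGHT + MIDTERM_WEIGHT + FINAL_WEIGHT

# Hypothetical scores, for illustration only.
example = course_grade([80, 90, 70, 100, 60], midterm=75, final=85)
```

With these made-up scores, the homework contributes 36 points, the midterm 15, and the final 29.75, so the final exam alone outweighs any three homeworks combined.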

Students are permitted to consult with each other and/or work together in learning the concepts necessary for completing the homework, as long as each student: (1) writes up his or her own homework alone, using no notes resulting from the collaboration, and (2) lists the names of all other students involved in the collaboration prominently on the assignment. Collaborative efforts not meeting these restrictions are strictly forbidden.

Needless to say, please turn off your cell phones before entering the classroom.



Schedule

I. Introduction: How to Get Rich in Silicon Valley (Friday, September 26)

II. Language Modelling, Part-of-Speech Tagging, Transformation-Based Learning, Empirical Evaluation (Monday, September 29 - Monday, October 6)

Reading: J&M Sections 5.1-5.5 (optional), Sections 6.1-6.3 (through pg. 210), Chapter 8

Friday, October 3: Assignment One Distributed (Due Monday, October 13)

III. Syntax and Parsing (Wednesday, October 8 - Friday, October 17)

Reading: J&M Chapter 9, Sections 10.1-10.3

Monday, October 13: Assignment One Collected, Assignment Two Distributed (Due Wednesday, October 22)

IV. Introduction to Prolog (Monday, October 20 - Monday, October 27)

Reading: P&S Sections 2.1, 2.2, 2.5, and 2.7

Wednesday, October 22: Assignment Two Collected, Assignment Three Distributed (Due Friday, October 31)

V. Parsing with Prolog, Unification (Wednesday, October 29 - Friday, November 14)

Reading: P&S Sections 3.7 and 4.2.1-4.2.5; excerpts from J&M Chapter 11

Friday, October 31: Assignment Three Collected

Wednesday, November 5: Midterm Review

Friday, November 7: Midterm Exam

Monday, November 10: Assignment Four Distributed (Due Friday, November 21)

VI. Semantic Interpretation (Monday, November 17 - Wednesday, November 26)

Reading: P&S Section 4.1; excerpts from J&M Chapters 14 and 15

Friday, November 21: Assignment Four Collected, Assignment Five Distributed (Due Monday, December 1)

Friday, November 28: Thanksgiving Break

VII. Discourse Processing (Monday, December 1 - Wednesday, December 3)

Monday, December 1: Assignment Five Collected

VIII. Summary and Review (Friday, December 5)


Andy Kehler 2003-09-12