Natural Language Processing, CS322, SP19

Description:

Natural languages (e.g., Chinese, English) enable humans to communicate, but, for the better part of history, computers have been left out of the conversation. Enabling machines to understand language is the goal of Natural Language Processing (NLP), and achieving this goal (or coming close) offers immense promise in fields like human-computer interaction, computational social science, and medicine (among others). However, language understanding is quite difficult, as spoken/written language often encodes complex factors beyond literal meaning; in fact, NLP is so hard that it is sometimes called "AI Complete," i.e., if you could build a machine that truly understands language, you could build a machine with intellectual capacity equal to a human's.

This course will cover several topics in Natural Language Processing, with a particular focus on statistical methods that learn patterns automatically from corpora (in contrast to methods that rely on hand-designed features and rules). Topics will include language modeling, supervised learning with bag-of-words inputs, lexical/vector semantics, feed-forward neural networks (for language modeling), and recurrent neural networks (for sequence tagging).

Calendar

Each class meeting is listed as: In class | Reading (for next lecture) | Assignments out | Assignments due (empty fields are omitted).

Week 1 (April 1)
  In class: Welcome; Why NLP?; slides | Reading: "I'm sorry Dave, I'm afraid I can't do that…" Lee 2004; J+M 2.2, 2.3, 2.4 intro, 2.4.2-2.4.6
  In class: Statistical/Rule-based Discussion; Tokenization | Reading: J+M 3.intro, 3.1, 3.2 | Out: HW1: N-Gram Language Models, plus the data you need for HW1 | Due: Group preferences; Getting to know you
  In class: Tokenization, Stemming, Lemmatizing; Probability Refresher; Demo; Bayes Rule Video | Reading: J+M 3.3, 3.4

Week 2 (April 8)
  In class: Language models | Reading: J+M 3.5, 3.6 | Out: Groups for project assigned
  In class: Language models | Reading: None! Optionally, just for fun, you can listen to this podcast
  In class: Language models + Whirlwind tour of project topics | Reading: J+M 4.intro, 4.1, 4.2, 4.3, 4.4

Week 3 (April 15)
  In class: Language Models Wrapup + Text Classification | Reading: J+M 4.5, 4.6, 4.7 | Out: HW2: Sentiment Classification, plus the data you need for HW2 | Due: HW1; Project Topic Preferences
  In class: Text Classification | Reading: J+M 4.8, 4.9; read over/follow along in a terminal with Justin Johnson's python/numpy/scipy intro
  In class: Logistic Regression | Reading: J+M 5.intro, 5.1

Week 4 (April 22)
  In class: Logistic Regression | Reading: J+M 5.2, 5.3, 5.4 | Due: Projects: Proposals
  In class: Logistic Regression | Reading: J+M 5.5, 5.6, 5.7
  In class: Logistic Regression sklearn demo | Reading: J+M 6.intro, 6.1, 6.2, 6.3, 6.4

Week 5 (April 29)
  In class: Neural Networks intro; Keras demo | Reading: Tensorflow/keras introduction | Due: HW2
  In class: Classification Wrapup; Lexical and Vector Semantics | Reading: J+M 6.5, 6.6, 6.7; linear algebra cheatsheet; optionally, the linear SVM section of Wikipedia | Out: HW3: Vector Semantics, plus the data you need for HW3
  In class: Lexical and Vector Semantics; Midterm Evaluations | Reading: J+M 6.8-6.12; Kirk Baker's Truncated SVD; optional linear algebra review

Week 6 (May 6)
  ** Midterm Break **
  In class: Lexical and Vector Semantics | Reading: J+M 7.intro-7.3
  In class: Word embeddings from SVD demo; Neural Networks for Language Modeling; Exam Topics | Reading: J+M 7.4-7.6 | Due: Projects: Progress Report

Week 7 (May 13)
  ** Midterm Exam ** | Reading: No reading! :)
  In class: Neural Networks for Language Modeling | Reading: Review or catch up on J+M ch. 7
  In class: Neural Networks for Language Modeling; word2vec in keras demo | Reading: None! | Due: HW3

Week 8 (May 20)
  In class: Project Updates from Teams (Dependency Parse, Duplicate Questions); Recurrent Nets | Reading: J+M 8.1-8.3; J+M 9.intro, 9.1 | Out: HW4: Neural Language Models, plus the data for HW4
  In class: Project Updates from Teams (Paraphrase, Summarization, Constituency, Named Entity); Recurrent Nets | Reading: J+M 9.2, 9.3 (and review 9.intro and 9.1)
  In class: Project Updates from Teams (Question Answering, Coreference, Inference); Recurrent Nets

Week 9 (May 27)
  In class: Ethics: "can" versus "should" (in Machine Learning)
  In class: Extra topic day: Computer Vision (+ Language)
  In class: Extra topic day: Computer Vision (+ Language) | Due: HW4

Week 10 (June 3)
  In class: Group Presentations
  In class: Guest Lecture from Zachary Levonian | Due: Projects: Final Writeups

Projects

The final project for this class is a research/review project about an NLP task of your choosing. Aside from researching the task (its history, its practical importance, etc.), you will: 1) download and explore a real dataset that researchers use to build and evaluate models for your task; 2) implement two baselines for the task and measure their performance; 3) measure the performance of a third-party API on your task; and 4) read and summarize a real NLP research paper.
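A common first baseline for classification-style tasks is the majority-class baseline: always predict the most frequent label in the training data. The sketch below shows one way to implement and score it, using made-up sentiment labels purely for illustration (your task, labels, and evaluation metric will differ):

```python
from collections import Counter

def majority_baseline(train_labels, test_labels):
    """Predict the most frequent training label for every test example,
    and return accuracy on the test labels."""
    majority = Counter(train_labels).most_common(1)[0][0]
    predictions = [majority] * len(test_labels)
    correct = sum(p == y for p, y in zip(predictions, test_labels))
    return correct / len(test_labels)

# Toy example with hypothetical sentiment labels:
train = ["pos", "pos", "neg", "pos"]
test = ["neg", "pos", "pos"]
print(majority_baseline(train, test))
```

If your slightly-better-than-baseline method can't beat this, that usually signals either a very hard task or a bug, which is exactly the kind of sanity check the project asks for.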

Proposal

Your proposal should:

Code (will be turned in with the final writeup)

While different groups will have different code formats, your code should:

Off-the-shelf tools for each task

Note: If you find a tool that you’d rather use elsewhere online, you are free to use it — just make sure to check with me first.

Note 2: I expect (and understand) that these resources vary somewhat in how easy they are to run (e.g., some have a relatively simple API, while others require messing around with code on GitHub).

Progress Report

The goal of the progress report is to ensure that both you and I have a realistic expectation of how much work you will be able to do for the rest of the term. Your progress report should be typeset in LaTeX, using the Association for Computational Linguistics template. The easiest way to access this template is to use Overleaf and share the project link among group members. Your progress report should contain the following sections, which purposefully mirror the sections of the final writeup:

  1. An introduction: what is your problem, why is it interesting?
  2. A related work section: each member in your group individually will read a research paper and write a 3-paragraph summary of that paper (what question does it tackle?; at a high level, what methods does it use?; what are the results and conclusions?). The paper should tackle your group’s task (though it doesn’t need to be on exactly the same dataset). I expect this to be difficult, and that’s okay (in fact, it’s part of the point)! At the time of the progress report, you should have at least decided who will read which paper.
  3. A dataset section: describe the dataset you selected and report statistics of the corpus.
  4. An evaluation section: what evaluation metrics are used for your task? How will you implement them, if you haven’t yet?
  5. An experiment section for your baseline: what simple baseline did you select? Have you run it on your dataset yet?
  6. An experiment section for your slightly-better-than-baseline: what method did you choose that improves performance over the baseline? Have you run it on your dataset yet?
  7. An experiment section for the external API’s performance: what external API have you chosen? Will it be easy to run this 3rd party code on your dataset? What roadblocks have you encountered?

I will be providing in-person feedback on your progress reports in the form of 15-minute individual group meetings: your entire group is required to attend this meeting. More information will be provided closer to the due date of the progress report.

Update Presentation

Towards the end of the term (weeks 7-8), one group will present a 5-minute update presentation at the beginning of class each day. The goal of these presentations is to explain your task to your classmates, highlight the progress you’ve made thus far, and describe some of the difficulties you’ve encountered.

Final Presentation

The final two days of the course are reserved for 12-to-15-minute group presentations. In addition to re-introducing your task to the class and explaining why it is cool/interesting/useful, you will present your final results, including your evaluations of your simple baseline, your slightly-smarter baseline, and the off-the-shelf tool.

Writeup

Your final writeups should be typeset in LaTeX, using the Association for Computational Linguistics template. It is okay to have textual overlap with your progress report; in fact, that is the intention of making the sections (mostly) mirror each other. Your writeup should include:

  1. An introduction: what is your problem, why is it interesting?
  2. A related work section: each member in your group should read and write a summary of a research paper that tackles your group’s task.
  3. A dataset section: describe the dataset you selected and report statistics of the corpus.
  4. An evaluation section: what evaluation metrics are used for your task? How did you implement them?
  5. An experiment section for your baseline: what simple baseline did you select, and how does it perform according to the metrics described in the evaluation section?
  6. An experiment section for your slightly-better-than-baseline: what method did you choose that improves performance over the baseline? How does it perform?
  7. An experiment section for the external API’s performance: how well does the off-the-shelf tool perform for this task?
  8. Shortcomings and Future Work: What did you aim to accomplish that you did not? What would you like to do in the future with this task?
  9. Conclusion: A summary of your main findings.
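For the evaluation section, many NLP tasks report precision, recall, and F1 rather than plain accuracy. A minimal sketch for a binary task, with hypothetical "pos"/"neg" labels (the right metric for your task may differ, e.g., BLEU for generation or exact match for QA):

```python
def precision_recall_f1(gold, predicted, positive="pos"):
    """Precision, recall, and F1 for one target class of a binary task."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Tiny worked example:
gold = ["pos", "neg", "pos", "pos"]
pred = ["pos", "pos", "neg", "pos"]
print(precision_recall_f1(gold, pred))
```

Implementing the metric yourself (rather than only calling a library) makes it much easier to explain in the writeup exactly what your numbers mean.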

Additional Resources