christanner [at] seas [dot] harvard [dot] edu

PRESENT

Hi, I’m Chris Tanner.

I am a lecturer at Harvard’s Institute for Applied Computational Science (IACS), which centers on two Master’s programs: Data Science, and Computational Science and Engineering. I research and teach data science, machine learning, and natural language processing (NLP). I teach:

Deep Learning for Natural Language Processing (40 students): Coming Fall 2021! I’m excited to make a new course from scratch! Graduate course concerning language models, machine translation, transformers, many NLP tasks, and a significant research project.
Introduction to Data Science (390 students): EDA, visualization, regression, boosting, PCA, trees, neural networks.
Advanced Data Science (230 students): GLMs, deep learning, CNNs, LSTMs, VAEs, GANs, Reinforcement Learning, Transformers.
Capstone course (40 students): Working with partner organizations, I craft real-world machine learning and data science research projects, then advise and manage students as they work in teams to research, develop, and effectively communicate solutions.

If you’re interested in applying to Harvard IACS, please read my brief description of grad programs.

RESEARCH

My research lies within natural language processing (NLP), specifically discourse, semantics, and understanding. The persistent theme in my work is trying to better understand, within any body of text, what is being said, what exactly is happening, and who is who? Toward these goals, my current projects involve entity linking, knowledge graphs, American Sign Language (ASL) translation, and coreference resolution.

CURRENT PROJECTS

Sensors to Sign Language Classification (Ali Hindy, Thomas Fouts, Julia Kreutzer, and Chris Tanner). In Submission.

We built sensors, attached them to one’s arms, signed a corpus of ASL words, and developed a model that leverages a video corpus for out-of-vocabulary classification.

American Sign Language Corpus (Thomas Fouts, Ali Hindy, and Chris Tanner). Aiming for EMNLP.

We built sensors, attached them to one’s arms, and signed 1,000 unique ASL words while capturing both the sensor values (five muscle sensors, one 6-axis gyroscope, and one accelerometer) and the video feed.

Toward Featureless Event Coreference Resolution via Conjoined Convolutional Neural Networks (Chris Tanner and Eugene Charniak). Aiming for EMNLP.

We achieved state-of-the-art (SOTA) results for event coreference on the ECB+ corpus, while using almost no features.

Symbiotic Coreference Resolution for Entities and Events (Ning Hua and Chris Tanner). Aiming for EMNLP.

We demonstrate a new approach for jointly performing entity and event coreference.

HUMBLE: An Annotation Suite for Lexical Grounding (Joe Brucker, Eduardo Peynetti, Shivas Jayaram, and Chris Tanner). Aiming for EMNLP.

In pursuit of building the biggest, best event coreference dataset to date.

End-to-end Entity Linking (Mingyue Wei and Chris Tanner). Aiming for EMNLP.

For her Master’s Thesis at Harvard, Mingyue is researching end-to-end entity linking.

Unsupervised Coreference Resolution (Alessandro Stolfo, Mrinmaya Sachan, Vikram Gupta, and Chris Tanner)

For his Master’s Thesis at ETH Zurich, Alessandro is researching unsupervised coreference resolution.

Bringing BERT to the field: Transformer models for gene expression prediction in maize (Benjamin Levy, Zihao Xu, Liyang Zhao, Shuying Ni, Phoebe Wong, Ross Karl Kremling, Ross Altman, and Chris Tanner). Preparing for Nature Genetics Submission.

For my Capstone course, students partnered with Inari to predict gene expression. They produced great results, so we’re extending this work for a publication. [Blog Overview] [Slides] [Poster] [Poster Video]

Toward a Revamped Real Estate Index (Will Fried, Jessica Wijaya, Shucheng Yan, Yixuan Di, Zona Kostic, Andy Terrel, and Chris Tanner). Aiming for IEEE Transactions on Knowledge and Data Engineering.

For my Capstone course, students partnered with REX Real Estate to predict future housing market conditions. They produced great results, so we’re extending this work for a publication. [Blog Overview] [Slides] [Poster] [Poster Video]

My dissertation concerned entity and event coreference resolution. I was fortunate to be Dr. Eugene Charniak’s final PhD student.

CURRENT STUDENTS

Anita Mahinpei (Master’s Thesis)
Xiaohan Yang (Master’s Thesis)
Xin Zeng (Master’s Thesis)
Jack Scudder (Master’s Thesis)
Mingyue Wei (Master’s Thesis)
Xavier Evans (Harvard ’23)
Sun Jie (Master’s in Health Data Science)
Alessandro Stolfo (ETH-Zurich Master’s Thesis, co-advising with Mrinmaya Sachan)
Thomas Fouts (Brunswick School -> Michigan ’24)
Ali Hindy (Brunswick School -> Stanford ’24)
Ning Hua (Smith ’21 -> Harvard ’23)

PAST STUDENTS

Brendan Falk (Harvard ’20 -> CEO @ Fig)

INVITED TALKS

2021

May 14 — Deep Learning with Attention @ Keystone Strategy AI/ML Speaker Series
April 30 — Hard NLP Tasks: Determining who is who and what is what @ Harvard IACS Seminar Series [vid][slides]
January 20 — Language Models to Transformers @ Harvard ComputeFest

2020

November 20 — Research Talk @ Florida Institute of Tech.
October 15 — Career Advice @ Florida Institute of Tech.
May 19 — Open Data Science Conference (ODSC)
January 23 — Sequential Data @ Harvard ComputeFest

2019

October 27 — RDMeetsIT Panel @ MIT Media Lab + Mercedes-Benz
September 27 — PhD Alumni Panel @ Brown
April 1 — MIT
March 15 — University of Washington
March 11 — Coreference Resolution @ Invitae
March 6 — CMU
February 21 — Brown
February 15 — Harvard

EXPERIENCE

During my career within academia, industry, and the government, my work has concerned:

entity linking
coreference resolution
natural language understanding (NLU)
citation prediction
face recognition
topic modelling
machine translation
streaming algorithms for NLP
anomaly detection
adaptive web personalization
speech recognition via active learning
error-correcting codes
social network analysis
2D pattern recognition
animats-based learning (swarm intelligence)

Research (old)