My thesis focused on structured prediction methods for models including recurrent neural networks, with the unifying theme of integrating structured learning with inexact search. As one contribution, I developed a reinforcement-learning-style training algorithm, with applicability beyond parsing (as shown in Edunov et al., 2018), along with the first neural network parsing model optimized for the final evaluation metric via a structure-level loss. This was also the first work to use beam search to enable the learning of non-locally-normalized RNN models that condition on the full input, and the first to do so with an expected loss. It demonstrates that label bias persists even in models that exploit unbounded lookahead, and that global normalization is a strategy for mitigating its negative effects. As beneficial side effects, the model also reduces exposure bias and loss-evaluation mismatch.
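The expected-loss idea above can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: it assumes we already have unnormalized model scores for the hypotheses on a beam and a task loss for each hypothesis, and it normalizes globally over the beam (a softmax restricted to the beam) to compute the expected loss that training would minimize.

```python
import math

def expected_risk(beam_scores, beam_losses):
    """Expected task loss over a beam: a sketch of risk training with
    normalization restricted to the beam.

    beam_scores: unnormalized model scores s(y) for each beam hypothesis
    beam_losses: task loss L(y, y*) for each hypothesis (e.g. 1 - F1)
    Returns sum_y p(y) * L(y, y*), where p(y) is a softmax over the beam.
    """
    m = max(beam_scores)                        # subtract max for stability
    exps = [math.exp(s - m) for s in beam_scores]
    z = sum(exps)                               # beam-local partition function
    probs = [e / z for e in exps]
    return sum(p * l for p, l in zip(probs, beam_losses))
```

Because the expected loss is a differentiable function of the scores, gradients can flow through the softmax into the scoring model, which is what makes this objective usable for training non-locally-normalized RNNs.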

As another contribution, I solved a long-standing problem in CCG parsing by developing the first dependency model for a shift-reduce parser. Its key components are a dependency oracle and a learning algorithm that integrates that oracle with the violation-fixing structured perceptron and beam search. The dependency oracle is itself a general hypergraph search algorithm with other potential applications.
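The interaction between beam search and violation-fixing perceptron training can be sketched as follows. This is a toy illustration under assumed helpers, not the thesis parser: `candidates(prefix)` (hypothetical) enumerates legal next actions, `score(prefix)` (hypothetical) scores an action prefix under the current model, and the gold derivation is given as an action sequence. The sketch returns the gold/predicted prefix pair at the step of maximum violation, which a perceptron update would then use.

```python
def beam_search_max_violation(gold_actions, candidates, score, beam_size=4):
    """Sketch of max-violation training with beam search.

    candidates(prefix) -> iterable of legal next actions (assumed helper)
    score(prefix)      -> model score of an action prefix (assumed helper)
    Returns the (gold_prefix, predicted_prefix) pair at the step where the
    violation score(pred) - score(gold) is largest.
    """
    beam = [()]
    best_violation, best_pair = float("-inf"), None
    for t in range(1, len(gold_actions) + 1):
        # expand every prefix on the beam and prune to the top-k by score
        expanded = [p + (a,) for p in beam for a in candidates(p)]
        beam = sorted(expanded, key=score, reverse=True)[:beam_size]
        gold_prefix = tuple(gold_actions[:t])
        # track the step with the largest gap between the best beam item
        # and the gold prefix (the "maximum violation")
        violation = score(beam[0]) - score(gold_prefix)
        if violation > best_violation:
            best_violation, best_pair = violation, (gold_prefix, beam[0])
    return best_pair
```

A perceptron update would then add the feature vector of the returned gold prefix and subtract that of the predicted prefix, so that updates are made at points where inexact (beam) search is provably wrong under the current weights.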


Man-made techniques do have a habit of becoming obsolete, whereas basic discoveries about how nature works should last forever. But truly fundamental insights such as those of Darwin or Watson & Crick are rare and often subject to intense competition, whereas development of successful techniques to address important problems allows lesser mortals to exert a widespread beneficial impact for at least a few years. Moreover, the same engineering approach is what creates new therapeutic strategies to alleviate disease, not just tools for our fellow researchers.

-- Roger Y. Tsien