AllenNLP grows eyes: Combining natural language with vision
In May, AllenNLP shipped its 1.0 version, marking a stable point after years of development. What’s next?
In this talk, I’ll start by highlighting the most important aspects of AllenNLP 1.0. This is not a “getting started” workshop (those are online), but I will lay out some of the basic design decisions we made, talk about the principles we follow during development, and show how the AllenNLP platform speeds up research at AI2. After the 1.0 release, the AllenNLP team took some time to determine the next challenge for NLP in a post-BERT world. Seeing the excitement in the community and in our own research team, we chose grounded language understanding as a likely next frontier.
Together with our colleagues from AI2’s Computer Vision group, we developed a plan to make AllenNLP a natural choice for this research. Even as we speak, the team is hard at work building reusable components that can load images, detect regions of interest, embed them, and combine them with natural language.
To prove out these components, we’re developing models to perform tasks such as visual question answering, visual entailment, grounded common sense reasoning, and more. Along the way, we’re making improvements to caching and performance, and we have started work on a flexible toolkit for multi-modal, multi-task transformers.
Dirk Groeneveld
Sr. Software Engineer at Allen Institute for Artificial Intelligence (AI2) & AllenNLP Committer
An engineer with an academic bent, “Mechanical Dirk” started his career working on search relevance at Microsoft’s SharePoint Search. From there, he moved to Bing’s “Document Understanding” team, before switching to Amazon to work on automatically processing product descriptions. After an experiment as a founder of a startup, he joined the Allen Institute in 2014.
As one of the early members, he has worked on a variety of projects, ranging from the purely academic (Aristo) to the mostly-applied (Semantic Scholar), and is now a senior engineer on the AllenNLP platform team.