Challenges in Financial Document Understanding – NLP and Beyond
We have made significant progress in NLP on web and social media documents. Applying NLP to business documents is more challenging, as they come in all shapes and sizes: financial reports, invoices, project plans, RFPs, legal agreements. Interpreting these documents with NLP tasks such as information extraction and summarization is difficult because of their complex formats and structure, variable OCR quality, and domain-specific needs.
This is especially difficult in the financial domain, where documents contain sensitive information, which limits their availability. They are lengthy, glossy, and vary in format, and a single document can present information in several forms, such as tables and charts, which adds noise. In this talk we present these challenges and how we are trying to address them.
Neelesh Shukla
Research Scientist & Manager AI at State Street Corporation
Neelesh is a Research Scientist with State Street. He heads the AI Group’s Innovation and Research efforts. He leads a team focused on solving complex problems in the finance domain and pushing current solutions further. His primary area of work is NLP, and his current focus is on understanding and processing financial documents and exploring FinNLP and VisualNLP. He has 14+ years of experience in both industry and academia, building solutions and conducting research.