Spark NLP 5.5: Breaking Barriers in LLM Inference Scalability

Spark NLP 5.5 is a major step forward for large language model (LLM) inference. This release introduces native integration with Llama.cpp, opening up the tens of thousands of GGUF models available on Hugging Face and making them deployable at scale. With Spark NLP 5.5 you can load any quantized model in GGUF format through the Llama.cpp library. This works across computing environments, from local machines to single-node and multi-node setups, and integrates with managed clusters on platforms such as Databricks, AWS EMR, Azure, and Google Cloud Platform.

A standout feature of this release is its hardware-agnostic approach, offering optimized performance across Intel processors, NVIDIA CUDA GPUs, and Apple Silicon chips. This versatility lets organizations use their existing infrastructure while scaling their LLM and NLP capabilities.

As we celebrate reaching the milestone of 120 million downloads of Spark NLP, the 5.5 release puts tens of thousands of new models at your fingertips, ready to power the next generation of AI applications. Join us to explore how this release is set to redefine the boundaries of LLM inference – making it more scalable, efficient, and accessible than ever before.

About the speaker
Maziyar Panahi

Principal AI Engineer & Team Lead at John Snow Labs

Maziyar Panahi is a Senior Data Scientist and Spark NLP Lead at John Snow Labs with over a decade of experience in public research. He is a senior Big Data engineer and cloud architect with extensive experience in computer networks and software engineering, and has been developing software and planning networks for the last 15 years. Earlier in his career, after completing his Microsoft and Cisco training (MCSE, MCSA, and CCNA), he worked as a network engineer in high-level places.

For the past decade he has designed and implemented large-scale databases and real-time web services in public and private clouds such as AWS, Azure, and OpenStack. He is one of the early adopters and main maintainers of the Spark NLP library. He is currently employed by the French National Centre for Scientific Research (CNRS) as a Big Data engineer and System/Network Administrator at the Institute of Complex Systems of Paris (ISCPIF).

When

Online Event: September 24, 2024

Contact

nlpsummit@johnsnowlabs.com

Presented by

John Snow Labs