8 great Python libraries for natural language processing

Natural language processing, or NLP for short, is best described as “AI for speech and text.” It is the magic behind voice commands, speech and text translation, sentiment analysis, text summarization, and many other linguistic applications and analyses, and it has improved dramatically through deep learning.

The Python language provides a convenient front end to all varieties of machine learning, including NLP. In fact, there is an embarrassment of NLP riches to choose from in the Python ecosystem. In this article, we’ll explore eight of the NLP libraries available for Python: their use cases, their strengths, their weaknesses, and their general level of popularity.

Note that some of these libraries provide higher-level versions of the same functionality exposed by others, making that functionality easier to use at the cost of some precision or performance. You’ll want to choose a library well-suited both to your level of expertise and to the nature of the project.

CoreNLP

The CoreNLP library — a product of Stanford University — was built to be a production-ready natural language processing solution, capable of delivering NLP predictions and analyses at scale. CoreNLP is written in Java, but multiple Python packages and APIs are available for it, including a native Python NLP library called Stanza.

CoreNLP includes a broad range of language tools, including part-of-speech tagging, named entity recognition, parsing, sentiment analysis, and plenty more. It was designed to be agnostic about which human language it processes, and currently supports Arabic, Chinese, French, German, and Spanish in addition to English (with Russian, Swedish, and Danish support available from third parties). CoreNLP also includes a web API server, a convenient way to serve predictions without too much additional work.
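To give a feel for the web API server, here is a minimal sketch that talks to it using only the Python standard library. It assumes a CoreNLP server has already been started locally on port 9000; the helper names (build_request, annotate, tagged_tokens) and the choice of annotators are illustrative, not part of CoreNLP itself.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a CoreNLP server is already running at this address.
DEFAULT_URL = "http://localhost:9000"


def build_request(text, annotators="tokenize,ssplit,pos,ner", base_url=DEFAULT_URL):
    """Build (but do not send) a POST request for the CoreNLP server.

    The annotators and output format travel in the "properties" query
    parameter as JSON; the text to analyze is the request body.
    """
    properties = {"annotators": annotators, "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return urllib.request.Request(
        f"{base_url}/?{query}",
        data=text.encode("utf-8"),
        # Plain-text content type so the body is not treated as form data.
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def annotate(text, **kwargs):
    """Send the text to the server and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(text, **kwargs), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def tagged_tokens(annotation):
    """Flatten a JSON response into (word, part-of-speech, entity) triples."""
    return [
        (tok["word"], tok.get("pos"), tok.get("ner"))
        for sentence in annotation.get("sentences", [])
        for tok in sentence.get("tokens", [])
    ]

# Example (hypothetical hostname, requires a running server):
#   result = annotate("Stanford University is in California.")
#   print(tagged_tokens(result))
```

Keeping request construction separate from the network call makes it easy to inspect exactly what would be sent before pointing the code at a real server.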

The easiest place to start with CoreNLP’s Python wrappers is Stanza, the reference implementation created by the Stanford NLP Group. In addition to being well-documented, Stanza is also maintained regularly; many of the other Python libraries for CoreNLP have not been updated for some time.
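As a rough illustration of getting started with Stanza, the sketch below loads a default English pipeline and lists each word with its universal part-of-speech tag. It assumes the stanza package is installed (pip install stanza) and that the models can be downloaded on first use; load_pipeline and pos_pairs are hypothetical helper names chosen for this example.

```python
def load_pipeline(lang="en"):
    """Download the default models for a language and build a pipeline.

    Assumes the third-party stanza package is installed; the import is
    kept local so the helper below can be used without it.
    """
    import stanza

    stanza.download(lang)  # fetches models if they are not already cached
    return stanza.Pipeline(lang)


def pos_pairs(doc):
    """Return (word, universal POS tag) pairs from an annotated document."""
    return [
        (word.text, word.upos)
        for sentence in doc.sentences
        for word in sentence.words
    ]

# Example (downloads models, so it needs network access the first time):
#   nlp = load_pipeline("en")
#   doc = nlp("Stanza is maintained by the Stanford NLP Group.")
#   print(pos_pairs(doc))
```

The default pipeline loads every processor available for the language, which is convenient for exploration but heavier than a production setup would likely need.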

Copyright © 2021 IDG Communications, Inc.
