Machine Learning with PySpark MLlib

Aruna Singh


In the last article, you learned about PySpark SQL and how to interact with it using the DataFrame API and SQL queries. In this article, you’ll learn about PySpark MLlib, a built-in library for scalable machine learning.

The first step is to import the PySpark MLlib libraries. PySpark itself was discussed thoroughly in the last article.



Aruna Singh

As a BIE at Amazon, I explore why we call data the new oil by interpreting it and generating meaningful insights.