Linear algebra in Spark
From Spark In Action
Linear algebra is the branch of mathematics that studies vector spaces and the linear operations and mappings between them, expressed mainly with matrices.
Linear algebra is essential for understanding the math behind most machine-learning algorithms, so if you don’t know much about vectors and matrices, it is worth getting comfortable with them first.
Matrices and vectors (MLlib data types) in Spark can be manipulated locally (in the driver or executor processes) or in a distributed manner. Spark's distributed matrix implementations let you perform linear algebra operations on huge amounts of data spanning numerous machines. For local linear algebra operations, Spark uses the fast Breeze and jblas libraries (and NumPy in Python); for distributed operations, it has its own implementations.
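To make the local versus distributed distinction concrete, here is a minimal conceptual sketch in plain Python. It does not use Spark's API and is not how Spark implements distributed matrices. The helper names (`split_rows`, `distributed_matvec`) are hypothetical, and the "partitions" are ordinary lists standing in for blocks of rows that would live on different machines. The idea it illustrates is that a matrix-vector product can be computed one block of rows at a time, provided the vector is small enough to be available to every block.

```python
# Conceptual sketch only: plain Python, not Spark's API or implementation.

def dot(u, v):
    """Dot product of two equal-length local vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

def local_matvec(matrix, vector):
    """Local case: the whole matrix (a list of rows) sits in one process."""
    return [dot(row, vector) for row in matrix]

def split_rows(matrix, num_partitions):
    """Hypothetical partitioner: split rows into contiguous chunks,
    standing in for blocks of rows stored on different machines."""
    chunk = max(1, -(-len(matrix) // num_partitions))  # ceiling division
    return [matrix[i:i + chunk] for i in range(0, len(matrix), chunk)]

def distributed_matvec(partitions, vector):
    """Distributed case, simulated: each partition multiplies only its own
    rows by the (small, local) vector; partial results are concatenated."""
    result = []
    for part in partitions:
        result.extend(local_matvec(part, vector))
    return result

matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
vector = [1.0, -1.0]
partitions = split_rows(matrix, 2)  # two simulated "machines"

print(local_matvec(matrix, vector))            # [-1.0, -1.0, -1.0]
print(distributed_matvec(partitions, vector))  # [-1.0, -1.0, -1.0]
```

Both calls give the same result because each output entry depends on a single row of the matrix. That independence is what makes splitting the rows across machines possible. In a real cluster the per-partition work would run in parallel on executors rather than in a sequential loop.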