Writing LLMs in Rust: Looking for an Efficient Matrix Multiplication | by Stefano Bosisio | Nov, 2024

14 Nov

Starting from Karpathy's `llm.c`, I asked myself, “Could I write this in Rust?” Here are the lessons I learned and how I am writing `llm.rust`. In this first article, let’s tackle the matrix multiplication problem.

Matrix multiplication may be the most important operation in machine learning. I still remember one of my first linear algebra lessons as an engineering student, when the teacher started explaining matrices, eigenvectors, bases, and orthonormal bases. I was very confused, and it took me a while to understand why we cared so much about matrices and basis sets, and what a good basis implies for our world. Since then, I have always found linear algebra fascinating and, from a pure computer science point of view, I am amazed by all the algorithms that try to handle matrices more and more efficiently.

In particular, the matrix-vector product is pretty simple, but things get more and more complicated with matrix-matrix or tensor-tensor products. Because of this, many methods have been developed to optimize matrix multiplication. For example, a long time ago I posted about DeepMind…
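To make the contrast between the two products concrete, here is a minimal baseline sketch in Rust. It assumes matrices stored as flat, row-major `f32` slices; the function names `matvec` and `matmul` and their signatures are illustrative choices, not code taken from `llm.c` or `llm.rust`. The matrix-matrix version is the plain triple loop, with no optimization at all.

```rust
/// Matrix-vector product y = A x.
/// Assumption: `a` is a flat, row-major slice of shape rows x cols.
fn matvec(a: &[f32], x: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(a.len(), rows * cols);
    assert_eq!(x.len(), cols);
    (0..rows)
        .map(|i| {
            // Dot product of row i of A with x.
            a[i * cols..(i + 1) * cols]
                .iter()
                .zip(x)
                .map(|(aij, xj)| aij * xj)
                .sum::<f32>()
        })
        .collect()
}

/// Naive matrix-matrix product C = A B.
/// Assumption: `a` is m x k and `b` is k x n, both flat and row-major.
fn matmul(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    let mut c = vec![0.0f32; m * n];
    for i in 0..m {
        for j in 0..n {
            // Accumulate the dot product of row i of A and column j of B.
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    c
}
```

The matrix-vector product costs one pass over `A`, while the triple loop performs `m * k * n` multiply-adds and walks `B` column by column, which is unfriendly to the cache in a row-major layout. That gap is the kind of cost that optimized matrix multiplication methods try to reduce.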
