I'm enjoying using the library! However, I can't see any speedup compared with regular Python, so I may be missing something. Actually, using the matrix_multiply function from the user manual, I see ...