Linear Algebra Foundations #6 - An Equation involving Matrices

  • + 0 comments

    Everyone is wondering how to code this.

    Conceptually, this is what is called an overdetermined system. What that means is that there are more equations than unknown variables. In most cases, overdetermined systems do not have a solution, but this one does.


    Here's the way to go about it:

    1. Subtract A^2 from both sides to obtain the equivilent system xA + yI = -A^2

    2. Flatten the matrices A and I into column vectors a and i in row-major order.

    3. Let the matrix B be defined as the matrix whose columns are the column vectors a and i from step 1.

    4. let the vector c be the column vector made by flattening -A^2 (negative because we subtracted it from both sides).

    5. Now, our system can be re-written as Bz = c. Verify this yourself on paper.

    6. This system is overdetermined as mentioned, previously but can be solved using the least squares method.


    The Python library Numpy can do this for you, but here is the gist of what happens behind the curtain:

    Rearranging our new equation into the form Bz -c = 0, we search for the vector z = that minimizes the squared Euclidean norm ||Bz - c||^2. For those who might not remember what the Euclidean norm means, it's just the distance between two points!

    You can see how finding a z that minimizes this norm will get us "as close as possible" to a solution, even if one doesn't exist.


    Here is some python code that utilizes Numpy:

    # Find the solution [x, y] for the matrix equation xA + yI = -A^2, where x and y are scalars
    A = np.array([[1,1,0],
                  [0,1,0],
                  [0,0,1]])
    
    # The identity matrix
    I = np.identity(3) 
    
    # Convert system to the form Bz = c:
    # B is a two column matrix whose columns are flattened A and I.
    B = np.array([A.flatten(), I.flatten()]).transpose()  # 9x2 matrix
    
    # c is a column vector that is flattened A^2.
    c = -1*np.linalg.matrix_power(A, 2).flatten().transpose() # 9x1 matrix
    print(c)
    
    # Solve Bz = -c 
    z = np.linalg.lstsq(B, c, rcond=-1)[0]
    z = np.array(z)
    print(z)
    

    Then, solving the equation Bz=−c will give a solution, when using numpy's least squares solver. What I don't underst