
Mathematics for Artificial Intelligence : Linear Algebra

A simplified guide on how to brush up on Mathematics for Artificial Intelligence, Machine Learning and Data Science: Linear Algebra (Important Pointers only)

Module - I : Linear Algebra

I. Vectors and their Properties.

A mathematical entity with magnitude and direction, denoted as \vec{v} or \mathbf{v}.

In component form, represented as:

\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} for 2D vectors, \mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} for 3D vectors

In Cartesian coordinates represented as :

\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k} in 3D space, where \mathbf{i}, \mathbf{j} and \mathbf{k} are the unit vectors along the x, y and z axes.

Types of Vectors: Zero Vector (magnitude 0, no direction), Unit Vector (magnitude 1, directional), Position Vector (the position of a point relative to the origin), Equal Vectors (same magnitude and same direction), Opposite Vectors (same magnitude but opposite direction).

Vector operations:

1. Addition and Subtraction

  • Vector Addition: \mathbf{u} + \mathbf{v} = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \end{pmatrix}
  • Vector Subtraction: \mathbf{u} - \mathbf{v} = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} - \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} u_1 - v_1 \\ u_2 - v_2 \end{pmatrix}

2. Scalar Multiplication

  • Multiplying a vector by a scalar k: k\mathbf{v} = k \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} k v_1 \\ k v_2 \end{pmatrix}

3. Dot Product (Scalar Product)

  • The dot product of two vectors is a scalar: \mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n
  • The dot product can also be expressed in terms of the magnitudes of the vectors and the angle between them: \mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta

4. Cross Product (Vector Product)

The cross product is only defined in three-dimensional space: \mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}
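These operations map directly onto NumPy (assuming Python with NumPy installed; the vectors below are arbitrary examples). A minimal sketch:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

print(u + v)           # vector addition       -> [5. 7. 9.]
print(u - v)           # vector subtraction    -> [-3. -3. -3.]
print(2 * u)           # scalar multiplication -> [2. 4. 6.]
print(np.dot(u, v))    # dot product u.v       -> 32.0
print(np.cross(u, v))  # cross product (3D)    -> [-3.  6. -3.]

# recover the angle between u and v from u.v = |u||v|cos(theta)
cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.degrees(np.arccos(cos_theta)))  # ~12.93 degrees
```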

 

Vector Properties:

  • Commutativity of Addition: \mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}
  • Associativity of Addition: \mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}
  • Distributivity: k(\mathbf{u} + \mathbf{v}) = k\mathbf{u} + k\mathbf{v}
  • Zero Vector: \mathbf{v} + \mathbf{0} = \mathbf{v}
  • Negative Vector: \mathbf{v} + (-\mathbf{v}) = \mathbf{0}

 

II. Vector Spaces and Subspaces.

 A collection of vectors that can be added together and multiplied by scalars (real or complex numbers) and still remain within the set.

Denoted as V over a field F (usually \mathbb{R} or \mathbb{C}).

  • Vector Addition: For vectors \mathbf{u}, \mathbf{v} \in V, the sum \mathbf{u} + \mathbf{v} = \mathbf{w} is again an element \mathbf{w} \in V.
  • Scalar Multiplication: For a \in F and a vector \mathbf{v} \in V, the product a\mathbf{v} = \mathbf{w} is again an element \mathbf{w} \in V.
  • Important Axioms: 

  • Associativity of Addition: (\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})
  • Commutativity of Addition: \mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}
  • Identity Element of Addition: There exists an element \mathbf{0} \in V such that \mathbf{v} + \mathbf{0} = \mathbf{v} for all \mathbf{v} \in V.
  • Inverse Elements of Addition: For each \mathbf{v} \in V, there exists an element -\mathbf{v} \in V such that \mathbf{v} + (-\mathbf{v}) = \mathbf{0}.
  • Compatibility of Scalar Multiplication with Field Multiplication: a(b\mathbf{v}) = (ab)\mathbf{v} for all a, b \in F and \mathbf{v} \in V.
  • Identity Element of Scalar Multiplication: 1\mathbf{v} = \mathbf{v} for all \mathbf{v} \in V.
  • Distributivity of Scalar Multiplication with Respect to Vector Addition: a(\mathbf{u} + \mathbf{v}) = a\mathbf{u} + a\mathbf{v} for all a \in F and \mathbf{u}, \mathbf{v} \in V.
  • Distributivity of Scalar Multiplication with Respect to Field Addition: (a + b)\mathbf{v} = a\mathbf{v} + b\mathbf{v} for all a, b \in F and \mathbf{v} \in V.
  • Subspaces:

    A subset W of a vector space V is called a subspace of V if:

    • Zero Vector: \mathbf{0} \in W.
    • Closed under Addition: For all \mathbf{u}, \mathbf{v} \in W, \mathbf{u} + \mathbf{v} \in W.
    • Closed under Scalar Multiplication: For all a \in F and \mathbf{v} \in W, a\mathbf{v} \in W.

    If W satisfies these conditions, W is a subspace of V.

    Eg: The zero subspace \{\mathbf{0}\} is a subspace of any vector space.

    Properties of Subspaces:

    • Intersection: The intersection of two subspaces of V is also a subspace of V.
    • Sum: The sum of two subspaces W_1 and W_2, defined as W_1 + W_2 = \{ \mathbf{u} + \mathbf{v} \mid \mathbf{u} \in W_1, \mathbf{v} \in W_2 \}, is also a subspace of V.
    • Span: The span of a set of vectors in V is the smallest subspace of V that contains all those vectors.

    Spanning Set:

    A set of vectors S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\} in V is said to span V if every vector in V can be written as a linear combination of the vectors in S.

    Basis:

    A basis of a vector space V is a set of vectors in V that is linearly independent and spans V.

    Dimension:

    The dimension of a vector space V is the number of vectors in a basis of V. It is a measure of the "size" or "degrees of freedom" of the vector space.
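In practice, linear independence, span, and dimension are often checked numerically via the rank of a matrix whose columns are the candidate vectors. A small sketch (the vectors are illustrative):

```python
import numpy as np

# Three candidate basis vectors for R^3, stacked as columns.
V = np.column_stack([[1, 0, 0],
                     [1, 1, 0],
                     [1, 1, 1]])

rank = np.linalg.matrix_rank(V)
print(rank)                # 3: the columns are linearly independent
print(rank == V.shape[1])  # True: three independent vectors span R^3, so they form a basis
```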

     

    III. Matrices.  

    A rectangular array of numbers, symbols, or expressions arranged in rows and columns, denoted by A. If a matrix has m rows and n columns, it is called an m \times n matrix.

    For example, a 2 \times 3 matrix A:

    A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}

    Types of Matrices:

    • Square Matrix: A matrix with the same number of rows and columns (m=n).
    • Row Matrix: A matrix with a single row (1×n).
    • Column Matrix: A matrix with a single column (m×1).
    • Zero Matrix: A matrix in which all elements are zero.
    • Identity Matrix: A square matrix with ones on the main diagonal and zeros elsewhere.
    • Diagonal Matrix: A square matrix in which all off-diagonal elements are zero.
    • Symmetric Matrix: A square matrix that is equal to its transpose (A = A^T).
    • Skew-Symmetric Matrix: A square matrix that is equal to the negative of its transpose (A = -A^T).
    • Upper Triangular Matrix: A square matrix in which all elements below the main diagonal are zero.
    • Lower Triangular Matrix: A square matrix in which all elements above the main diagonal are zero.

     

    Matrix Operations :

    1. Addition:

    A + B = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{pmatrix}

    2. Subtraction

    A - B = \begin{pmatrix} a_{11} - b_{11} & a_{12} - b_{12} \\ a_{21} - b_{21} & a_{22} - b_{22} \end{pmatrix}

    3. Scalar Multiplication

    cA = \begin{pmatrix} c \cdot a_{11} & c \cdot a_{12} \\ c \cdot a_{21} & c \cdot a_{22} \end{pmatrix}

    4. Matrix Multiplication

    Two matrices A (m \times n) and B (n \times p) can be multiplied to form an m \times p matrix C:

    C = AB

    where each element c_{ij} is calculated as:

    c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}

    5. Transpose

    The transpose A^T is formed by swapping rows and columns:

    A^T = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix}

    6. Determinant

    The determinant is a scalar value. For a 2 \times 2 matrix:

    \det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{12} a_{21}
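All six operations are one-liners in NumPy (the matrices below are arbitrary 2 x 2 examples):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(A + B)             # element-wise addition
print(A - B)             # element-wise subtraction
print(3 * A)             # scalar multiplication
print(A @ B)             # matrix multiplication: c_ij = sum_k a_ik * b_kj
print(A.T)               # transpose
print(np.linalg.det(A))  # determinant: 1*4 - 2*3 = -2
```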

     

    IV. Matrix Inversion.

    The inverse of a square matrix A, denoted A^{-1}, is the matrix such that AA^{-1} = A^{-1}A = I, where I is the identity matrix. The inverse exists only if \det(A) \neq 0.

    For a 2×2 matrix:

    A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}

    Conditions for Inversion:

    Not all matrices have inverses. A matrix A is invertible (or non-singular) if and only if:

    1. A is a square matrix.
    2. The determinant of A is non-zero, i.e., \det(A) \neq 0.

    If these conditions are not met, the matrix is said to be singular or non-invertible.

    Properties of Inverse Matrix:

    • Uniqueness: If A is invertible, its inverse A^{-1} is unique.
    • Product of Inverses: If A and B are invertible matrices of the same dimension, then the product AB is invertible and
      (AB)^{-1} = B^{-1} A^{-1}
    • Inverse of Transpose: If A is invertible, then
      (A^T)^{-1} = (A^{-1})^T
    • Inverse of a Scalar Multiple: If A is invertible and c is a non-zero scalar, then
      (cA)^{-1} = \frac{1}{c} A^{-1}

    Methods for Finding the Inverse:

    1. Gaussian Elimination

    To find the inverse of a matrix A using Gaussian elimination:

    1. Form the augmented matrix [A \mid I], where I is the identity matrix of the same dimension as A.
    2. Use row operations to transform [A \mid I] into [I \mid A^{-1}].
    3. If this is possible, the matrix A is invertible and the right half of the augmented matrix will be A^{-1}.
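A minimal sketch of this procedure (a hand-rolled Gauss-Jordan reduction with partial pivoting; in practice you would simply call np.linalg.inv):

```python
import numpy as np

def inverse_gauss_jordan(A):
    """Invert A by row-reducing the augmented matrix [A | I] to [I | A^-1]."""
    n = A.shape[0]
    aug = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        # partial pivoting: move the largest remaining entry into the pivot row
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError("matrix is singular")
        aug[[col, pivot]] = aug[[pivot, col]]
        aug[col] /= aug[col, col]          # scale the pivot row so the pivot is 1
        for row in range(n):
            if row != col:                 # eliminate the column from every other row
                aug[row] -= aug[row, col] * aug[col]
    return aug[:, n:]                      # right half now holds A^-1

A = np.array([[2.0, 3.0], [1.0, 4.0]])
print(inverse_gauss_jordan(A))  # [[ 0.8 -0.6] [-0.2  0.4]]
```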

    2. Adjugate Method

    For a square matrix A, the inverse can also be found using the adjugate (or adjoint) and the determinant:

    A^{-1} = \frac{1}{\det(A)} \text{adj}(A)

    where \text{adj}(A) is the adjugate of A. The steps are:

    1. Compute the determinant \det(A).
    2. Find the matrix of cofactors.
    3. Transpose the matrix of cofactors to get the adjugate.
    4. Divide each entry of the adjugate by \det(A).
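The same four steps, sketched in NumPy for small matrices (cofactor expansion grows factorially, so this is illustrative rather than production code):

```python
import numpy as np

def inverse_adjugate(A):
    """Invert A via A^-1 = adj(A) / det(A)."""
    n = A.shape[0]
    det_A = np.linalg.det(A)                       # step 1: determinant
    if np.isclose(det_A, 0.0):
        raise ValueError("matrix is singular")
    cof = np.empty((n, n))
    for i in range(n):
        for j in range(n):                         # step 2: matrix of cofactors
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T / det_A                           # steps 3-4: transpose, divide by det

A = np.array([[2.0, 3.0], [1.0, 4.0]])
print(inverse_adjugate(A))  # [[ 0.8 -0.6] [-0.2  0.4]]
```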

    3. Using Elementary Matrices

    An elementary matrix is obtained by performing a single elementary row operation on an identity matrix. The inverse of A can be found by expressing A as a product of elementary matrices:

    A = E_1 E_2 \cdots E_k

    Then,

    A^{-1} = E_k^{-1} \cdots E_2^{-1} E_1^{-1}

    Eg:

    Let's find the inverse of the following 2 \times 2 matrix A:

    A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}

    The inverse is given by:

    A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}

    where \det(A) = ad - bc.

    For example, if:

    A = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}

    Then:

    \det(A) = (2)(4) - (3)(1) = 8 - 3 = 5

    A^{-1} = \frac{1}{5} \begin{pmatrix} 4 & -3 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 0.8 & -0.6 \\ -0.2 & 0.4 \end{pmatrix}

    Applications

    • Solving Linear Systems: Given A\mathbf{x} = \mathbf{b}, if A is invertible, the solution is \mathbf{x} = A^{-1}\mathbf{b} (see the sketch after this list).
    • Computer Graphics: Inverse matrices are used for transforming coordinates and manipulating images.
    • Control Theory: Inverse matrices are essential in system design and stability analysis.
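For the linear-systems application, a quick sketch (the system here is made up; np.linalg.solve is preferred over forming A^{-1} explicitly because it is faster and numerically safer):

```python
import numpy as np

A = np.array([[2.0, 3.0], [1.0, 4.0]])
b = np.array([8.0, 9.0])

x = np.linalg.solve(A, b)    # solves A x = b directly
print(x)                     # [1. 2.]

print(np.linalg.inv(A) @ b)  # same answer via x = A^-1 b
```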

     

    V. Properties of Determinants. 

    1. Determinant of the Identity Matrix

      The determinant of an identity matrix I of any size is 1:

      \det(I) = 1
    2. Determinant of a Diagonal Matrix

      The determinant of a diagonal matrix (a square matrix in which all off-diagonal elements are zero) D with elements d_{11}, d_{22}, \ldots, d_{nn} is:

      \det(D) = d_{11} \cdot d_{22} \cdots d_{nn}
    3. Determinant of a Triangular Matrix

      Similar to diagonal matrices, the determinant of a triangular matrix (either upper or lower triangular) is the product of its diagonal elements.

    4. Determinant of the Transpose

      The determinant of a matrix is equal to the determinant of its transpose:

      \det(A) = \det(A^T)
    5. Multiplicative Property

      The determinant of the product of two matrices is the product of their determinants:

      \det(AB) = \det(A) \cdot \det(B)
    6. Determinant of an Inverse

      If A is an invertible matrix, the determinant of its inverse is the reciprocal of the determinant of A:

      \det(A^{-1}) = \frac{1}{\det(A)}
    7. Determinant of a Scalar Multiple

      If A is an n \times n matrix and c is a scalar, the determinant of the scalar multiple of A is:

      \det(cA) = c^n \det(A)
    8. Row and Column Operations

      • Row Interchange: Swapping two rows (or two columns) of a matrix changes the sign of the determinant:

        \det(B) = -\det(A)

        if B is obtained by interchanging two rows (or columns) of A.

      • Row Scaling: Multiplying a row (or column) by a scalar multiplies the determinant by the same scalar:

        \det(B) = k \cdot \det(A)

        if B is obtained by multiplying a row (or column) of A by k.

      • Row Addition: Adding a multiple of one row (or column) to another row (or column) does not change the determinant:

        \det(B) = \det(A)

        if B is obtained by adding a multiple of one row (or column) to another row (or column) of A.

    9. Determinant of a Block Matrix

      For a block diagonal matrix:

      A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}

      The determinant is the product of the determinants of the blocks:

      \det(A) = \det(A_1) \cdot \det(A_2)
    10. Singular Matrix

      A matrix is singular (non-invertible) if and only if its determinant is zero:

      \det(A) = 0 \iff A \text{ is singular}
    11. Linearity in Rows and Columns

      The determinant is a linear function in each row and each column. 
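Several of these properties can be spot-checked numerically (A and B below are arbitrary non-singular examples):

```python
import numpy as np

A = np.array([[2.0, 1.0], [3.0, 4.0]])
B = np.array([[1.0, 2.0], [0.0, 5.0]])
n = A.shape[0]
det = np.linalg.det

print(np.isclose(det(A @ B), det(A) * det(B)))        # multiplicative property
print(np.isclose(det(A.T), det(A)))                   # transpose property
print(np.isclose(det(3 * A), 3**n * det(A)))          # scalar multiple: c^n det(A)
print(np.isclose(det(np.linalg.inv(A)), 1 / det(A)))  # inverse property
print(np.isclose(det(A[::-1]), -det(A)))              # row interchange flips the sign
```

A[::-1] reverses the rows, which for a 2 x 2 matrix is exactly one row swap.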

       

    VI. Eigenvalues and Eigenvectors.

    Given a square matrix A of dimension n×n:

    • Eigenvalue: A scalar \lambda is called an eigenvalue of A if there exists a non-zero vector \mathbf{v} (called an eigenvector) such that:

      A\mathbf{v} = \lambda\mathbf{v}
    • Eigenvector: A non-zero vector \mathbf{v} is called an eigenvector of A corresponding to the eigenvalue \lambda if it satisfies the above equation.

     Finding Eigenvalues and Eigenvectors

    To find the eigenvalues and eigenvectors of a matrix A:

    1. Eigenvalues: Solve the characteristic equation:

      \det(A - \lambda I) = 0

      This equation is derived from A\mathbf{v} = \lambda\mathbf{v} by rearranging it to:

      (A - \lambda I)\mathbf{v} = \mathbf{0}

      For non-trivial solutions (non-zero \mathbf{v}), the matrix A - \lambda I must be singular, which means its determinant must be zero.

    2. Eigenvectors: For each eigenvalue \lambda, solve the linear system:

      (A - \lambda I)\mathbf{v} = \mathbf{0}

      to find the corresponding eigenvector(s) \mathbf{v}.
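In NumPy both steps are handled by np.linalg.eig (the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)  # eigenvectors are the *columns* of eigvecs
print(eigvals)                       # [5. 2.]  (roots of (4-l)(3-l) - 2 = 0)

# verify A v = lambda v for the first eigenpair
lam, v = eigvals[0], eigvecs[:, 0]
print(np.allclose(A @ v, lam * v))   # True

# properties 1 and 2 below: sum = trace, product = determinant
print(np.isclose(eigvals.sum(), np.trace(A)))        # True: 5 + 2 = 7
print(np.isclose(eigvals.prod(), np.linalg.det(A)))  # True: 5 * 2 = 10
```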

    Properties of Eigenvalues and Eigenvectors

    1. Sum of Eigenvalues: The sum of the eigenvalues of a matrix A is equal to the trace of A (the sum of the diagonal elements).

      \sum_{i=1}^{n} \lambda_i = \text{tr}(A)
    2. Product of Eigenvalues: The product of the eigenvalues of a matrix A is equal to the determinant of A.

      \prod_{i=1}^{n} \lambda_i = \det(A)
    3. Eigenvectors of Different Eigenvalues: Eigenvectors corresponding to distinct eigenvalues are linearly independent.

    4. Diagonalizability: A matrix A is diagonalizable if it has n linearly independent eigenvectors. In such cases, A can be written as:

      A = PDP^{-1}

      where P is the matrix of eigenvectors and D is the diagonal matrix of eigenvalues.

    5. Similarity Transformation: If A and B are similar matrices, they have the same eigenvalues.

    6. Power of a Matrix: If A is diagonalizable, then A^k can be expressed as:

      A^k = P D^k P^{-1}

      where D is the diagonal matrix with the eigenvalues of A on the diagonal, and D^k is obtained by simply raising each diagonal entry to the k-th power.

    Applications:

    1. Differential Equations: Eigenvalues and eigenvectors are used to solve systems of linear differential equations.
    2. Stability Analysis: In control theory, the stability of a system can be analyzed using the eigenvalues of the system matrix.
    3. Principal Component Analysis (PCA): In statistics, PCA uses eigenvalues and eigenvectors of the covariance matrix to reduce the dimensionality of data.
    4. Quantum Mechanics: Eigenvalues and eigenvectors are used to solve the Schrödinger equation.
    5. Vibration Analysis: In mechanical engineering, the natural frequencies (eigenvalues) and mode shapes (eigenvectors) of a system are analyzed.

     

    VII. Diagonalization of Matrices.  

    Diagonalization of matrices is a process by which a given square matrix A is decomposed into a product of three matrices: A = PDP^{-1}, where P is an invertible matrix whose columns are the eigenvectors of A, D is a diagonal matrix whose entries are the eigenvalues of A, and P^{-1} is the inverse of P.

    Here’s a step-by-step guide to diagonalizing a matrix:

    1. Find the Eigenvalues of A:

      • Solve the characteristic equation \det(A - \lambda I) = 0 for \lambda. The solutions \lambda_1, \lambda_2, \ldots, \lambda_n are the eigenvalues of A.
    2. Find the Eigenvectors of A:

      • For each eigenvalue \lambda, solve the equation (A - \lambda I)\mathbf{v} = \mathbf{0} for the eigenvector \mathbf{v}.
    3. Form the Matrix P:

      • Construct the matrix P using the eigenvectors as columns.
    4. Form the Diagonal Matrix D:

      • Construct the matrix D by placing the eigenvalues λi on the diagonal. The order of the eigenvalues in D should correspond to the order of the eigenvectors in P.
    5. Verify the Diagonalization:

      • Verify that A = PDP^{-1} by calculating both sides of the equation.
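A compact end-to-end check of these five steps (same illustrative matrix as in the eigenvalue sketch above):

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])

eigvals, P = np.linalg.eig(A)   # steps 1-3: eigenvalues, eigenvectors as columns of P
D = np.diag(eigvals)            # step 4: eigenvalues on the diagonal, same order as P

# step 5: verify A = P D P^-1
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True

# bonus: the power property A^k = P D^k P^-1
print(np.allclose(np.linalg.matrix_power(A, 3),
                  P @ np.diag(eigvals**3) @ np.linalg.inv(P)))  # True
```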
       

    VIII. Singular Value Decomposition (SVD).

    For a given m \times n matrix A, the SVD is given by:

    A = U \Sigma V^T

    where:

    • U is an m \times m orthogonal matrix whose columns are called the left singular vectors of A.
    • \Sigma is an m \times n diagonal matrix with non-negative real numbers on the diagonal, known as the singular values of A.
    • V is an n \times n orthogonal matrix whose columns are called the right singular vectors of A.

    Steps to Compute the SVD

    1. Compute A^T A and A A^T:

      • These are both symmetric matrices.
    2. Find the eigenvalues and eigenvectors:

      • Compute the eigenvalues and eigenvectors of A^T A. The eigenvalues are the squares of the singular values of A, and the eigenvectors form the columns of V.
      • Compute the eigenvalues and eigenvectors of A A^T. The eigenvectors form the columns of U.
    3. Construct \Sigma:

      • The singular values \sigma_i are the square roots of the eigenvalues of A^T A.
    4. Form the matrices U and V:

      • The columns of U are the eigenvectors of A A^T.
      • The columns of V are the eigenvectors of A^T A.
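np.linalg.svd performs the whole decomposition; the sketch below also confirms the relationship between the singular values and the eigenvalues of A A^T (the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])   # a 2 x 3 matrix, so m = 2, n = 3

U, s, Vt = np.linalg.svd(A)        # s holds the singular values, descending
Sigma = np.zeros_like(A)
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(A, U @ Sigma @ Vt))  # True: A = U Sigma V^T

# sigma_i^2 equals the (nonzero) eigenvalues of A A^T
eig_AAt = np.linalg.eigvalsh(A @ A.T)[::-1]  # eigvalsh returns ascending; reverse it
print(np.allclose(s**2, eig_AAt))            # True
```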

    Applications of SVD in Machine Learning:

    1. Reducing Dimensions in Data (in PCA - Principal Component Analysis)
    2. Compressing Images (see the truncated-SVD sketch after this list)
    3. Removing Noise
    4. Making Recommendation systems (like Netflix or Amazon)
    5. Understanding Text in Latent Semantic Analysis (LSA)
    6. Designing Control Systems like robots or aircraft
    7. Face Recognition
    8. Analyzing Genetic Datasets
    9. Compressing Data for Communication
    10. Preventing Overfitting in ML models.
    11. Quantum Computing
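For applications 1-2 in particular, the workhorse is the truncated SVD: keeping only the top-k singular values gives the best rank-k approximation of the data (Eckart-Young theorem). A minimal sketch, with a random matrix standing in for an image or data matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 80))   # stand-in for an image / data matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 10                                     # keep the 10 largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]   # best rank-k approximation of A

print(A_k.shape)                   # (100, 80): same shape, but...
print(np.linalg.matrix_rank(A_k))  # 10: far less information to store
```

Storing U[:, :k], s[:k] and Vt[:k] takes k(m + n + 1) numbers instead of mn, which is where the compression comes from.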
     

