A simplified guide to preparing in Mathematics for Artificial Intelligence, Machine Learning and Data Science: Numerical Methods (Important Pointers only)
Module VII : Numerical Methods
I. Bisection Method.
A numerical technique for solving equations of the form f(x) = 0. It is a type of root-finding method that repeatedly narrows down an interval where a root of the function exists.
Steps:
Choose the initial interval: Select two points a and b such that f(a) and f(b) have opposite signs. For a continuous function, this indicates that there is at least one root in the interval [a, b].
Compute the midpoint: Calculate the midpoint of the interval: c = (a + b) / 2.
Evaluate the function at the midpoint: Compute f(c).
Determine the subinterval:
- If f(a) · f(c) < 0, the root lies in the interval [a, c]. Set b = c.
- If f(c) · f(b) < 0, the root lies in the interval [c, b]. Set a = c.
- If f(c) = 0, then c is the root, and the method stops.
Check for convergence: Repeat steps 2-4 until the interval is sufficiently small (i.e., |b - a| < ε for some tolerance ε), or until the value of f(c) is close enough to zero.
Output the result: The midpoint c is an approximation of the root.
Eg: To find a root of the function f(x) = x^2 - 4 on the interval [1, 3].
Choose the initial interval: a = 1, b = 3.
- Since f(1) = -3 and f(3) = 5 have opposite signs, there is a root in the interval [1, 3].
Compute the midpoint: c = (1 + 3) / 2 = 2.
Evaluate the function at the midpoint: f(2) = 2^2 - 4 = 0.
Since f(2) = 0, the root is exactly at x = 2.
For functions where the root is not exactly at the midpoint, the method would continue iterating until the desired precision is achieved.
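The steps above can be sketched in Python (a minimal illustration; the function name and parameters are my own, and f(x) = x^2 - 4 is used only as a test function):

```python
def bisect(f, a, b, tol=1e-8, max_iter=100):
    """Bisection: repeatedly halve [a, b], keeping the half with the sign change."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2          # midpoint of the current interval
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:
            return c             # converged, or hit the root exactly
        if fa * fc < 0:          # root lies in [a, c]
            b, fb = c, fc
        else:                    # root lies in [c, b]
            a, fa = c, fc
    return (a + b) / 2

root = bisect(lambda x: x**2 - 4, 1, 3)   # returns 2.0 on the first midpoint
```

Because the interval halves every iteration, the error after k iterations is at most (b - a) / 2^k, so convergence is guaranteed but slow.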
II. Trapezoidal Rule.
A numerical method used to approximate the definite integral of a function.
The trapezoidal rule approximates the integral of a function f(x) over an interval [a, b] using the formula:
∫[a, b] f(x) dx ≈ (b - a) / 2 · [f(a) + f(b)]
For better accuracy, the interval can be divided into smaller subintervals of equal width. If the interval is divided into n subintervals, each of width h = (b - a) / n, the composite trapezoidal rule is used:
∫[a, b] f(x) dx ≈ h / 2 · [f(x_0) + 2f(x_1) + 2f(x_2) + ... + 2f(x_{n-1}) + f(x_n)]
where x_0 = a, x_n = b, and x_i = a + i·h for i = 1, 2, ..., n - 1.
Steps
Divide the interval: Divide the interval [a, b] into n equal subintervals. The width of each subinterval is h = (b - a) / n.
Calculate the endpoints: Compute the function values at the endpoints of each subinterval. These points are x_i = a + i·h, where i = 0, 1, ..., n.
Apply the trapezoidal rule formula: Sum the function values, multiplying the endpoints f(x_0) and f(x_n) by 1 and the interior points by 2.
Calculate the approximation: Multiply the result by h / 2 to get the final approximation of the integral.
Eg: Approximate the integral of f(x) = x^2 over the interval [0, 1] using the trapezoidal rule with n = 4 subintervals.
Divide the interval: h = (1 - 0) / 4 = 0.25.
Calculate the endpoints: x_0 = 0, x_1 = 0.25, x_2 = 0.5, x_3 = 0.75, x_4 = 1.
Evaluate the function at the points: f(x_0) = 0, f(x_1) = 0.0625, f(x_2) = 0.25, f(x_3) = 0.5625, f(x_4) = 1.
Apply the trapezoidal rule formula:
∫[0, 1] x^2 dx ≈ (0.25 / 2) · [0 + 2(0.0625) + 2(0.25) + 2(0.5625) + 1]
Calculate the sum:
= 0.125 · [0 + 0.125 + 0.5 + 1.125 + 1] = 0.125 · 2.75 = 0.34375
So, the approximate value of the integral using the trapezoidal rule with 4 subintervals is 0.34375.
The exact value of the integral is 1/3 ≈ 0.3333. The trapezoidal rule gives a reasonably close approximation, which can be improved by increasing the number of subintervals n.
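The composite rule translates directly into Python (a minimal sketch; the function name and parameters are my own, with f(x) = x^2 on [0, 1] as a test case):

```python
def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals of width h."""
    h = (b - a) / n
    total = f(a) + f(b)              # endpoint values get weight 1
    for i in range(1, n):
        total += 2 * f(a + i * h)    # interior values get weight 2
    return h / 2 * total

approx = trapezoid(lambda x: x**2, 0, 1, 4)   # 0.34375, vs. exact 1/3
```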
III. Secant Method.
The secant method is an iterative numerical technique used to find roots of a function . Unlike the bisection method, which requires the function values to have opposite signs at the endpoints of an interval, the secant method uses two initial approximations to generate a sequence of improving approximations to the root.
The secant method approximates the root by using the following iterative formula:
x_{n+1} = x_n - f(x_n) · (x_n - x_{n-1}) / (f(x_n) - f(x_{n-1}))
where:
- x_n and x_{n-1} are the current and previous approximations, respectively.
- f(x_n) and f(x_{n-1}) are the function values at these approximations.
Steps
Choose initial approximations: Select two initial guesses x_0 and x_1 close to the root.
Iterate using the secant formula: Use the secant formula to compute the next approximation x_{n+1}.
Check for convergence: Repeat the iteration until the difference between successive approximations is smaller than a predetermined tolerance or until the function value is close to zero.
Output the result: The final approximation is taken as the root.
Eg: Find a root of the function f(x) = x^2 - 2 using the secant method.
Choose initial approximations: x_0 = 1 and x_1 = 2.
First iteration:
x_2 = x_1 - f(x_1) · (x_1 - x_0) / (f(x_1) - f(x_0)) = 2 - 2 · (2 - 1) / (2 - (-1)) = 2 - 2/3 ≈ 1.3333
Second iteration:
x_3 = x_2 - f(x_2) · (x_2 - x_1) / (f(x_2) - f(x_1)) = 1.3333 - (-0.2222) · (1.3333 - 2) / (-0.2222 - 2) ≈ 1.4000
Further iterations: Repeat the above steps until the difference between successive approximations is less than the tolerance (e.g., ε = 10^-6); the iterates converge to √2 ≈ 1.4142.
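The iteration can be sketched in Python (names and parameters are my own; f(x) = x^2 - 2 is just an illustrative test function):

```python
def secant(f, x0, x1, tol=1e-6, max_iter=100):
    """Secant method: iterate from two initial guesses until successive
    approximations agree to within tol."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)   # secant update formula
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    return x1

root = secant(lambda x: x**2 - 2, 1.0, 2.0)   # ≈ 1.4142136
```

Unlike bisection, no derivative and no sign change are required, but convergence is not guaranteed if the two function values ever coincide.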
IV. Newton-Raphson Method.
An iterative numerical technique used to find approximations to the roots of a real-valued function f(x). It is known for its fast convergence properties, especially when the initial guess is close to the actual root.
The Newton-Raphson iteration formula is given by:
x_{n+1} = x_n - f(x_n) / f'(x_n)
where:
- x_n is the current approximation.
- f(x_n) is the value of the function at x_n.
- f'(x_n) is the value of the derivative of the function at x_n.
Steps
Choose an initial approximation: Select an initial guess x_0 close to the root.
Iterate using the Newton-Raphson formula: Use the formula to compute the next approximation x_{n+1}.
Check for convergence: Repeat the iteration until the difference between successive approximations is smaller than a predetermined tolerance or until the function value is close to zero.
Output the result: The final approximation is taken as the root.
Eg: Find a root of the function f(x) = x^2 - 2 (with derivative f'(x) = 2x) using the Newton-Raphson method.
Choose an initial approximation: x_0 = 1.
First iteration:
x_1 = x_0 - f(x_0) / f'(x_0) = 1 - (-1) / 2 = 1.5
Second iteration:
x_2 = 1.5 - (0.25) / 3 ≈ 1.4167
Third iteration:
x_3 = 1.4167 - (0.0069) / 2.8333 ≈ 1.4142
Further iterations: Continue iterating until the difference between successive approximations is less than the tolerance (e.g., ε = 10^-6).
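In Python, the method is a short loop (a minimal sketch; names are my own, and both f and its derivative are supplied by the caller):

```python
def newton(f, df, x0, tol=1e-6, max_iter=50):
    """Newton-Raphson: follow the tangent line from each approximation."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / df(x)     # x_{n+1} = x_n - f(x_n) / f'(x_n)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

root = newton(lambda x: x**2 - 2, lambda x: 2 * x, 1.0)   # ≈ 1.4142136
```

The quadratic convergence is visible in practice: each iteration roughly doubles the number of correct digits, provided f'(x) stays away from zero.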
V. Numerical Stability and Error Analysis.
1. Numerical Stability
Numerical stability refers to how errors are propagated by an algorithm. An algorithm is numerically stable if small changes in the input or intermediate calculations lead to small changes in the output. Conversely, if small changes in the input result in large changes in the output, the algorithm is numerically unstable.
Types of Stability:
Forward Stability: An algorithm is forward stable if the computed solution is close to the exact solution of the given problem. This implies that the errors in the output are proportional to the errors in the input.
Backward Stability: An algorithm is backward stable if the computed solution is the exact solution to a slightly perturbed version of the original problem. This means the algorithm produces results that are accurate for some nearby problem.
Mixed Stability: Combines aspects of both forward and backward stability, considering both input and output errors.
2. Error Analysis
Error analysis is the study of the types, sources and propagation of errors in numerical computations. It helps in understanding the accuracy and precision of numerical solutions.
Types of Errors:
Round-off Error: Errors that occur because of the finite precision with which computers represent real numbers. For example, floating-point arithmetic can introduce small errors in calculations due to truncation or rounding.
Truncation Error: Errors that result from approximating a mathematical procedure. For example, truncating an infinite series to a finite number of terms introduces a truncation error.
Approximation Error: Errors that arise when a mathematical model or method approximates a physical process or another mathematical model. This includes discretization errors in methods like finite differences or finite elements.
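Both round-off and truncation error are easy to observe in Python (a small illustration; the truncated series for e is my own example):

```python
import math

# Round-off error: 0.1 has no exact binary representation, so ten
# additions of 0.1 do not sum to exactly 1.0.
total = sum(0.1 for _ in range(10))
print(total == 1.0)   # False

# Truncation error: approximating e = sum of 1/k! using only 5 terms
# of the infinite series.
partial = sum(1 / math.factorial(k) for k in range(5))
trunc_err = math.e - partial   # error from cutting off the series
```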
3. Error Propagation:
Error propagation studies how errors in input data or intermediate steps affect the final result. It is crucial for understanding and mitigating the impact of errors in numerical algorithms.
4. Condition Number:
The condition number of a problem measures its sensitivity to changes in input. A problem with a high condition number is ill-conditioned, meaning small changes in input can cause large changes in output. Conversely, a problem with a low condition number is well-conditioned.
For example, the condition number of a matrix A in solving the linear system Ax = b is given by κ(A) = ‖A‖ · ‖A⁻¹‖. If κ(A) is large, the system is ill-conditioned.
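The effect of a large condition number can be seen even in a 2x2 system. The sketch below (pure Python with illustrative values of my own) solves Ax = b by Cramer's rule for a matrix whose rows are nearly parallel:

```python
def solve2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system A x = b by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det,
            (a11 * b2 - a21 * b1) / det)

# The rows of A are nearly parallel, so the system is ill-conditioned.
x      = solve2x2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0001)  # ≈ (1.0, 1.0)
x_pert = solve2x2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0002)  # ≈ (0.0, 2.0)
# Perturbing b2 by only 1e-4 moved the solution from (1, 1) to (0, 2).
```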
5. Mitigating Errors
- Improving Precision: Use higher precision arithmetic (e.g., double precision instead of single precision).
- Algorithm Choice: Choose stable algorithms (e.g., using LU decomposition with pivoting instead of Gaussian elimination without pivoting).
- Conditioning: Precondition the problem (e.g., scaling the input data to improve the condition number).
- Error Estimation: Use a posteriori error estimates to assess the accuracy of the computed solution.
Eg: Error Propagation in Polynomial Evaluation
Evaluating a polynomial P(x) = a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0 using Horner's method:
P(x) = (...((a_n x + a_{n-1}) x + a_{n-2}) x + ...) x + a_0
Horner's method is more stable than the naive approach because it minimizes the number of operations and thus the potential round-off errors.
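A Python sketch of Horner's method (illustrative; coefficients are listed from the highest degree down, and the polynomial is my own example):

```python
def horner(coeffs, x):
    """Evaluate a_n*x^n + ... + a_1*x + a_0 given coeffs [a_n, ..., a_1, a_0]."""
    result = 0
    for c in coeffs:
        result = result * x + c   # one multiply and one add per coefficient
    return result

# p(x) = 2x^3 - 6x^2 + 2x - 1 evaluated at x = 3
value = horner([2, -6, 2, -1], 3)   # 5
```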
VI. Euler's Method for ODEs.
Euler's method is a simple and widely used numerical technique for solving ordinary differential equations (ODEs).
Consider an initial value problem of the form:
dy/dx = f(x, y), y(x_0) = y_0
where f(x, y) is a given function, y(x_0) = y_0 is the initial condition, and we seek the value of y at later points x.
Euler's method uses the following iterative formula to approximate the solution:
y_{n+1} = y_n + h · f(x_n, y_n)
where:
- y_n is the approximation of y at x_n.
- y_{n+1} is the next value of y, at x_{n+1} = x_n + h.
- h is the step size.
Steps of Euler's Method
- Initialization: Set the initial values x_0 and y_0, and choose the step size h.
- Iteration: Use the Euler formula to compute y_{n+1} from y_n for n = 0, 1, 2, ... until the desired interval is covered.
- Output: The values y_1, y_2, ... give the approximate solution to the ODE at the discrete points x_1, x_2, ...
Eg : Use Euler's method to solve the initial value problem:
dy/dx = y, y(0) = 1
on the interval [0, 0.5] with a step size of h = 0.1.
Initialization: x_0 = 0, y_0 = 1.
Iteration:
Step 1: y_1 = y_0 + h · f(x_0, y_0) = 1 + 0.1(1) = 1.1
Step 2: y_2 = 1.1 + 0.1(1.1) = 1.21
Step 3: y_3 = 1.21 + 0.1(1.21) = 1.331
Step 4: y_4 = 1.331 + 0.1(1.331) = 1.4641
Step 5: y_5 = 1.4641 + 0.1(1.4641) = 1.61051
Output:
The approximate solution at x = 0.5 is y ≈ 1.61051 (the exact value is e^0.5 ≈ 1.6487).
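The iteration is a few lines of Python (a minimal sketch; names are my own), shown here on the test problem y' = y, y(0) = 1:

```python
def euler(f, x0, y0, h, n_steps):
    """Advance y' = f(x, y) from (x0, y0) using n_steps Euler steps of size h."""
    x, y = x0, y0
    for _ in range(n_steps):
        y = y + h * f(x, y)   # y_{n+1} = y_n + h * f(x_n, y_n)
        x = x + h
    return y

# y' = y, y(0) = 1, five steps of h = 0.1: approximates e^0.5 ≈ 1.6487
y_approx = euler(lambda x, y: y, 0.0, 1.0, 0.1, 5)   # ≈ 1.61051
```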
Error and Stability:
Euler's method is simple and easy to implement, but it has limitations:
- Global Truncation Error: The error accumulates over steps and is proportional to the step size h. The global error is O(h) (with a local error per step of O(h^2)), making it less accurate for large h.
- Stability: For stiff ODEs, Euler's method can be unstable unless the step size h is very small.
VII. Runge-Kutta Methods.
Runge-Kutta methods are a family of iterative techniques for solving ODEs that achieve higher accuracy than Euler's method by combining several slope evaluations within each step. The most commonly used Runge-Kutta method is the fourth-order method, often referred to simply as the Runge-Kutta method.
The general form of an s-stage Runge-Kutta method for solving the initial value problem
dy/dx = f(x, y), y(x_0) = y_0
is given by:
y_{n+1} = y_n + h · Σ_{i=1}^{s} b_i k_i
where:
k_i = f(x_n + c_i h, y_n + h · Σ_{j=1}^{s} a_{ij} k_j)
for i = 1, 2, ..., s. The coefficients a_{ij}, b_i, and c_i define a specific Runge-Kutta method and are typically arranged in a Butcher tableau.
Fourth-Order Runge-Kutta Method (RK4)
The fourth-order Runge-Kutta method is the most popular Runge-Kutta method due to its accuracy and simplicity. It is given by the following formulas:
Compute the intermediate slopes:
k_1 = f(x_n, y_n)
k_2 = f(x_n + h/2, y_n + h·k_1/2)
k_3 = f(x_n + h/2, y_n + h·k_2/2)
k_4 = f(x_n + h, y_n + h·k_3)
Update the solution:
y_{n+1} = y_n + (h/6) · (k_1 + 2k_2 + 2k_3 + k_4)
Eg : Use the fourth-order Runge-Kutta method to solve the initial value problem:
dy/dx = y, y(0) = 1
on the interval [0, 1] with a step size of h = 0.1.
Initialization: x_0 = 0, y_0 = 1.
First iteration:
Compute the intermediate slopes:
k_1 = f(0, 1) = 1
k_2 = f(0.05, 1 + 0.05(1)) = 1.05
k_3 = f(0.05, 1 + 0.05(1.05)) = 1.0525
k_4 = f(0.1, 1 + 0.1(1.0525)) = 1.10525
Update the solution:
y_1 = 1 + (0.1/6) · (1 + 2(1.05) + 2(1.0525) + 1.10525) ≈ 1.10517
(the exact value is e^0.1 ≈ 1.10517, so a single RK4 step is already accurate to five decimal places).
Continue this process for more iterations to find the solution at desired points.
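A single RK4 step in Python (a minimal sketch; names are my own), applied to the test problem y' = y:

```python
def rk4_step(f, x, y, h):
    """One step of the classical fourth-order Runge-Kutta (RK4) method."""
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h / 2, y + h * k2 / 2)
    k4 = f(x + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# y' = y, y(0) = 1, one step of h = 0.1; exact answer is e^0.1 ≈ 1.1051709
y1 = rk4_step(lambda x, y: y, 0.0, 1.0, 0.1)
```

With a local error of O(h^5) and a global error of O(h^4), RK4 is far more accurate per step than Euler's method at the cost of four function evaluations.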
VIII. Simpson's Rule.
Simpson's rule is a numerical method for approximating the definite integral of a function. Simpson's rule uses parabolic arcs instead of straight lines to approximate the area under a curve.
For a function f(x) defined on the interval [a, b], Simpson's rule approximates the integral as follows:
Simpson's 1/3 Rule:
∫[a, b] f(x) dx ≈ (h/3) · [f(x_0) + 4f(x_1) + f(x_2)], where h = (b - a)/2
This basic rule uses two subintervals; in general, Simpson's rule requires that the interval is divided into an even number of subintervals of equal width h.
Composite Simpson's 1/3 Rule: For better accuracy, especially over larger intervals, the interval is divided into n equal subintervals (where n is even), each of width h = (b - a)/n. The composite Simpson's rule is then:
∫[a, b] f(x) dx ≈ (h/3) · [f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + ... + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)]
where x_i = a + i·h for i = 0, 1, ..., n.
Eg: Approximate the integral ∫[0, π] sin(x) dx using Simpson's rule with n = 4 subintervals.
Define the function: f(x) = sin(x).
Set the interval and subinterval width: a = 0, b = π, h = (π - 0)/4 = π/4.
Compute the function values at the required points:
f(x_0) = sin(0) = 0, f(x_1) = sin(π/4) ≈ 0.7071, f(x_2) = sin(π/2) = 1, f(x_3) = sin(3π/4) ≈ 0.7071, f(x_4) = sin(π) = 0.
Apply the composite Simpson's rule formula:
∫[0, π] sin(x) dx ≈ (π/12) · [0 + 4(0.7071) + 2(1) + 4(0.7071) + 0] = (π/12) · 7.6569 ≈ 2.0046
The exact value of the integral is 2, so the approximation using Simpson's rule is quite accurate.
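The composite rule in Python (a minimal sketch; names are my own), checked against ∫ sin(x) dx over [0, π]:

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's 1/3 rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)                    # endpoint values get weight 1
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2    # odd-index points 4, even-index 2
        total += weight * f(a + i * h)
    return h / 3 * total

approx = simpson(math.sin, 0.0, math.pi, 4)   # ≈ 2.0046 (exact value: 2)
```

The error of the composite rule is O(h^4), which is why doubling n improves the result much faster than it does for the trapezoidal rule.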