MRL: Now supported by CCO-10/11

  • The videos here discuss Artificial Intelligence.
  • Most of the videos do not have corresponding MATLAB code. However, recent videos have a corresponding MATLAB implementation. All future videos are expected to have a corresponding MATLAB or Python implementation.
  • Old videos (1-8) with bad audio or improper content will be replaced with newer versions.
  • It is assumed that viewers have a basic background in linear algebra.
  • Convex Optimization is not a prerequisite. The convex analysis videos are delivered in such a way that viewers without a background in convex optimization can skip those parts without affecting their ability to understand the algorithm and its implementation.

Artificial Intelligence: Machine Learning & Convex Analysis

  • Machine Learning: Linear Regression Download - PDF; Download MATLAB Code; Python Code (Gaussian Kernel; Polynomial and Linear Basis Functions)
    Reference Reading:

    Short Note: Regression is a method of determining the relationship between the data and its outcome, or target value. Linear Regression is a method of fitting a line to a data set. When the data follows a linear pattern, a line defined by a set of weight parameters is fit to the data set. This enables the determination of the target value of a new instance/example: the inner product of the weight parameters (defining the line) and the feature vector of the new instance gives the target value of that instance.

    Implementation: Code
    The MATLAB implementation allows you to minimize not only the L_2 error but also the L_1 error, along with L_2 regularization controlled by the trade-off parameter beta (β). If beta is zero, simple linear regression with the L_2 or L_1 error is performed. minregL1 reformulates the optimization problem and then solves it as a quadratic program in MATLAB.

    function w2 = minregL2(X,y,beta) takes the Nxn data matrix X, the Nx1 vector of target values y and a scalar beta as its input parameters and returns the weight vector w2 as its output.
    function w1 = minregL1(X,y,beta) takes the Nxn data matrix X, the Nx1 vector of target values y and a scalar beta as its input parameters and returns the weight vector w1 as its output.

    This PDF file explains the mathematical formulations used in the above files.
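    Illustrative Sketch: The L_2 case can be sketched in Python as follows. This is not the author's MATLAB code. The name minreg_l2 is hypothetical, and the objective ||Xw - y||^2 + beta*||w||^2 (with the intercept regularized like every other weight) is an assumption about the formulation.

```python
import numpy as np

def minreg_l2(X, y, beta=0.0):
    """Hypothetical analogue of minregL2.

    Assumed objective: ||X w - y||^2 + beta * ||w||^2.
    X is N x n, y has length N, beta is a non-negative scalar.
    """
    n = X.shape[1]
    # Normal equations with a ridge term: (X^T X + beta I) w = X^T y
    return np.linalg.solve(X.T @ X + beta * np.eye(n), X.T @ y)

# Toy data lying exactly on the line y = 1 + 2x.
# A column of ones is added so the first weight acts as the intercept.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

w2 = minreg_l2(X, y, beta=0.0)

# Target value of a new instance: inner product of weights and features.
x_new = np.array([1.0, 4.0])
y_new = w2 @ x_new
```

    With beta = 0 this reduces to ordinary least squares. A positive beta shrinks the weights toward zero.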

  • Machine Learning: Locally Weighted Regression

    Short Note: Locally weighted regression, as the name suggests, is a method of determining the relation between the data/instance/input and the outcome/target value/output based on the local neighborhood of that instance. This is useful in cases where the data and its outcome follow a particular pattern, but the pattern depends on the location/neighborhood of the data. A weight function assigns weights to the neighboring points/data/instances, and these weights decrease as the distance between the neighboring data and the input instance increases. The regression is then carried out with these weights to determine the relation between the data and the outcomes in that neighborhood.
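    Illustrative Sketch: A minimal Python sketch of the idea, assuming a Gaussian weight function with bandwidth tau and a weighted least-squares fit at each query point. The function name lwr_predict, the weight function and the value of tau are all assumptions chosen for illustration.

```python
import numpy as np

def lwr_predict(X, y, x_query, tau=0.3):
    """Locally weighted linear regression at a single query point.

    X is N x n (including a column of ones), y has length N.
    tau is the (assumed) Gaussian bandwidth: smaller means more local.
    """
    # Weights decrease as the distance to the query point increases.
    sq_dist = np.sum((X - x_query) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * tau ** 2))
    # Weighted normal equations: (X^T W X) theta = X^T W y
    XtW = X.T * weights
    theta = np.linalg.solve(XtW @ X, XtW @ y)
    return x_query @ theta

# Data following a pattern that changes with location: y = sin(x).
x = np.linspace(0.0, 2.0 * np.pi, 200)
X = np.column_stack([np.ones_like(x), x])
y = np.sin(x)

# Predict near the peak of the sine curve.
x_query = np.array([1.0, np.pi / 2.0])
y_hat = lwr_predict(X, y, x_query)
```

    A single global line fits this data poorly, while the local fit tracks the curve. The price is that a new weighted regression is solved for every query point.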

  • Machine Learning: Non-Linear Regression
    Reference Reading: C.M. Bishop - Pattern Recognition & Machine Learning.

    Short Note: Nonlinear regression is an algorithm for finding the non-linear relationship between the input variable and its target value. The difference between linear and nonlinear regression is that linear regression uses a linear set of basis functions, whereas the latter uses nonlinear basis functions. Both have the same analytical solution in terms of the basis functions.
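    Illustrative Sketch: The shared analytical solution can be sketched in Python. Only the design matrix Phi changes between the linear and nonlinear cases, while the least-squares formula stays the same. The polynomial basis and the helper names poly_basis and fit are assumptions for illustration.

```python
import numpy as np

def poly_basis(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def fit(Phi, y):
    """Least-squares weights: solve (Phi^T Phi) w = Phi^T y."""
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# Toy data generated from a quadratic: y = 1 - 2x + 3x^2.
x = np.linspace(-1.0, 1.0, 9)
y = 1.0 - 2.0 * x + 3.0 * x ** 2

# Linear basis (degree 1) and nonlinear basis (degree 2), same solver.
w_linear = fit(poly_basis(x, 1), y)
w_quadratic = fit(poly_basis(x, 2), y)

# Residuals show that only the quadratic basis captures the data.
err_linear = np.linalg.norm(poly_basis(x, 1) @ w_linear - y)
err_quadratic = np.linalg.norm(poly_basis(x, 2) @ w_quadratic - y)
```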

  • Machine Learning: Perceptron, Voted Perceptron and Kernel Perceptron Download PDF; Download MATLAB Code
    Reference Reading: Manfred Georg: Mercer Kernels; B. Scholkopf, R. Herbrich, A.J. Smola and R. Williamson: A Generalized Representer Theorem

    Short Note: The Perceptron learning algorithm is one of the simplest algorithms to apply, yet it is quite efficient at solving classification problems. The simple perceptron learning algorithm (for separable data) adjusts the decision boundary for each error it makes on the training set, terminates when it makes no errors, and returns the learned parameters. The voted perceptron, on the other hand, uses a voting scheme to assign a vote to each of the modified weight vectors that it encounters during the learning process, and finally outputs the voted sum of the weight vectors. In general, it is hard to determine what kind of basis functions will perform well in the task at hand. Sometimes we may find that a polynomial function performs well, but what should the degree of the polynomial be? Using kernels circumvents this problem. The cardinal advantage of using kernels is that it obviates the need to construct the basis functions explicitly. Kernel Perceptron learning utilizes this kernel trick to solve the classification problem.
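    Illustrative Sketch: A minimal kernel perceptron in Python, not the downloadable MATLAB code. It assumes labels in {-1, +1}, a Gaussian (RBF) kernel with gamma = 1, and no bias term. All function names are hypothetical. The XOR data is not linearly separable in the input space, yet no basis functions are constructed explicitly.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def decision_value(X, y, alpha, kernel, x_new):
    """Kernel expansion: sum_j alpha_j * y_j * k(x_j, x_new)."""
    return sum(a * yj * kernel(xj, x_new) for a, xj, yj in zip(alpha, X, y))

def train_kernel_perceptron(X, y, kernel, max_epochs=100):
    """alpha[i] counts the mistakes made on training instance i."""
    alpha = [0] * len(X)
    for _ in range(max_epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision_value(X, y, alpha, kernel, xi) <= 0:
                alpha[i] += 1  # update only on an error
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: stop
            break
    return alpha

def predict(X, y, alpha, kernel, x_new):
    return 1 if decision_value(X, y, alpha, kernel, x_new) > 0 else -1

# XOR-style data with labels in {-1, +1}.
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [-1, -1, 1, 1]

alpha = train_kernel_perceptron(X, y, rbf_kernel)
predictions = [predict(X, y, alpha, rbf_kernel, xi) for xi in X]
```

    The learned model is stored as mistake counts on the training instances rather than as an explicit weight vector, which is what lets the kernel replace the basis functions.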

  • Machine Learning: Support Vector Machine & In-depth Convex Analysis Download PDF; MATLAB Code; Instruction SVM_Matlab_Inscruction.pdf
    Reference Reading: Pattern Recognition & Machine Learning - C.M. Bishop - Ch07; Convex Optimization - Stephen Boyd & L. Vandenberghe - Ch05.

    Short Note: The Support Vector Machine (SVM) can be used both as a Hard-Margin and as a Soft-Margin classifier, in the primal form as well as the dual form. The primal form of the SVM is useful in cases where we do not need to construct basis functions or already have prior knowledge about the basis functions. The dual form, however, is quite powerful, as it implicitly involves the kernel trick and thereby allows an implicit feature mapping. It thus obviates the need to construct the basis functions explicitly. The SVM problem has a convex objective and affine constraints, so the modified Slater's condition is satisfied. Slater's constraint qualification therefore holds and so does strong duality, i.e. the optimal duality gap is zero. Utilizing the fact that the optimal primal objective value is equal to the optimal dual objective value, the SVM solves the dual problem. Utilizing complementary slackness, the SVM identifies the support vectors and discards all other instances in the data set when making further decisions. Hence the name, Sparse Kernel Machine. This lecture provides the complete convex analysis of the SVM.
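    Illustrative Sketch: The steps above can be written out for the hard-margin case. This is a standard textbook derivation (along the lines of Bishop, Ch. 7), given as a sketch and not as a transcript of the lecture. The notation (feature map \phi, multipliers \alpha_i, kernel k, N training instances with labels y_i in {-1, +1}) is assumed here.

```latex
% Primal problem: convex objective, affine constraints
\begin{align}
\min_{w,\,b}\quad & \tfrac{1}{2}\lVert w\rVert^{2}
  \quad\text{s.t.}\quad y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ge 1,
  \qquad i=1,\dots,N \\
% Lagrangian with multipliers \alpha_i \ge 0
L(w,b,\alpha) &= \tfrac{1}{2}\lVert w\rVert^{2}
  -\sum_{i=1}^{N}\alpha_i\Bigl[y_i\bigl(w^{\top}\phi(x_i)+b\bigr)-1\Bigr] \\
% Stationarity in w and b
w &= \sum_{i=1}^{N}\alpha_i y_i\,\phi(x_i),
  \qquad \sum_{i=1}^{N}\alpha_i y_i = 0 \\
% Dual problem: the data enter only through k(x_i,x_j)=\phi(x_i)^{\top}\phi(x_j)
\max_{\alpha}\quad & \sum_{i=1}^{N}\alpha_i
  -\tfrac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}
   \alpha_i\alpha_j y_i y_j\,k(x_i,x_j)
  \quad\text{s.t.}\quad \alpha_i\ge 0,\;\; \sum_{i=1}^{N}\alpha_i y_i = 0 \\
% Complementary slackness: \alpha_i > 0 only for points on the margin
0 &= \alpha_i\Bigl[y_i\bigl(w^{\top}\phi(x_i)+b\bigr)-1\Bigr],
  \qquad i=1,\dots,N \\
% Decision function uses only the support vectors S = \{i : \alpha_i > 0\}
f(x) &= \sum_{i\in S}\alpha_i y_i\,k(x_i,x) + b
\end{align}
```

    Instances with \alpha_i = 0 drop out of the decision function, which is the sparsity referred to in the note.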

  • Artificial Intelligence: Exploring Node Dependencies in a Bayesian Network Graph  Download - PDF; MATLAB Code -
    Reference Reading: Probabilistic Graphical Models, Koller and Friedman, Chapter 3

    Short Note: Bayesian Network graphs exploit the dependencies between variables/nodes for inference. A Bayesian Network graph, G, represents the underlying independencies in a given probability distribution that factorizes over G. It is often difficult to analyze the graph by hand, and therefore we need some tools to read the dependencies from the given Bayesian Network graph. This presentation discusses the concepts that will allow you to read the independencies in a given Bayesian Network (BN) graph. The presentation also provides MATLAB and Python 3.x implementations of some of the algorithms discussed.
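    Illustrative Sketch: One way to read an independence off a BN graph is a d-separation test. The Python 3 sketch below uses the moralized ancestral graph criterion, which may differ from the algorithms in the presentation. The graph encoding (a dict mapping each node to its list of parents) and the function names are assumptions for illustration.

```python
from collections import deque
from itertools import combinations

def ancestral_set(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen

def d_separated(parents, x, y, given=()):
    """True if x and y are d-separated by the set `given` (x, y not in it)."""
    given = set(given)
    # 1. Restrict the graph to x, y, the given nodes and their ancestors.
    keep = ancestral_set(parents, {x, y} | given)
    # 2. Moralize: connect each child to its parents, marry co-parents,
    #    and drop edge directions.
    adj = {node: set() for node in keep}
    for child in keep:
        for p in parents[child]:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(parents[child], 2):
            adj[p].add(q)
            adj[q].add(p)
    # 3. Search from x without passing through the given nodes.
    visited = {x}
    frontier = deque([x])
    while frontier:
        node = frontier.popleft()
        for nb in adj[node]:
            if nb not in visited and nb not in given:
                visited.add(nb)
                frontier.append(nb)
    # x and y are d-separated exactly when y was not reached.
    return y not in visited

# Example graph: A -> C <- B (a v-structure) and C -> D.
parents = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}

a_b_marginal = d_separated(parents, "A", "B")          # nothing observed
a_b_given_c = d_separated(parents, "A", "B", {"C"})    # collider observed
a_d_given_c = d_separated(parents, "A", "D", {"C"})    # chain blocked by C
```

    In this example, A and B are independent when nothing is observed, become dependent once their common child C (or its descendant D) is observed, and A is independent of D given C.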