• ECTS

    5 credits

  • Training structure

    Faculty of Science

Description

The size of statistical data continues to grow, particularly in the number of variables describing each statistical unit. However, classical linear statistical modeling breaks down in high dimensions, i.e., when the number of variables exceeds the number of statistical units. This course presents the most common techniques used to regularize linear models in high dimensions.

Objectives

Training in high-dimensional univariate and multivariate linear modeling, i.e., various regularization techniques for classical linear modeling.

Teaching hours

  • Multivariate Analysis - Lectures (CM): 21 h

Mandatory prerequisites

Multidimensional data analysis course (PCA & CA). Course on Euclidean geometry, normed vector spaces, and reduction of endomorphisms (diagonalization).

Recommended prerequisites: courses in univariate and bivariate descriptive statistics; good command of matrix algebra.

Syllabus

Introduction

Large-scale data. Dimensional reduction and regularization.

I - Regularized linear modeling of a continuous variable.

  1. The classic linear model.

    a) Brief review.

    b) Failures due to collinearities.

  2. Principal component regression.

    a) The method.

    b) Qualities and defects.

  3. PLS regression.

    a) Rank 1 criterion and program.

    b) Criteria and program for subsequent ranks.

    c) Why PLS regularizes.

    d) Choosing the number of components for prediction.

    e) Metric of the continuum between OLS and PLS.

  4. Penalized linear regressions.

    a) Ridge regression.

    b) LASSO.

    c) Elastic net.
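
The penalized regressions of item I.4 can be sketched numerically. Below is a minimal pure-Python illustration (not course material) of ridge regression via the closed-form normal equations, beta = (X'X + lambda*I)^(-1) X'y, assuming centered data and only two predictors so the 2x2 inverse can be written out by hand; the data are invented for illustration.

```python
# Ridge regression via the closed-form normal equations
# beta = (X'X + lam*I)^{-1} X'y, in pure Python for two centered
# predictors so the 2x2 inverse is explicit. Data are made up.

def ridge_2d(X, y, lam):
    """Ridge coefficients for an n x 2 (centered) design matrix X."""
    # Entries of X'X + lam*I (symmetric 2x2) and of X'y.
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    g0 = sum(r[0] * yi for r, yi in zip(X, y))
    g1 = sum(r[1] * yi for r, yi in zip(X, y))
    det = a * d - b * b
    # Explicit 2x2 inverse applied to X'y.
    return ((d * g0 - b * g1) / det, (a * g1 - b * g0) / det)

# Two nearly collinear predictors (x2 close to x1): the setting where
# OLS is unstable and ridge shrinks and stabilizes the coefficients.
X = [(-1.0, -1.1), (-0.5, -0.4), (0.0, 0.1), (0.5, 0.4), (1.0, 1.0)]
y = [-2.0, -1.0, 0.0, 1.0, 2.0]

for lam in (0.0, 1.0, 10.0):
    b1, b2 = ridge_2d(X, y, lam)
    print(f"lambda={lam:5.1f}  beta=({b1:+.3f}, {b2:+.3f})")
```

As lambda grows, the coefficient vector shrinks toward zero and the weight is shared between the two collinear predictors, which is exactly the regularizing effect the course develops.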

II - Regularized linear modeling of a group of continuous variables.

  1. The multivariate Gaussian linear model.

    a) The classic model.

    b) The penalized model.

    c) The MANOVA model.

  2. Multivariate PLS regression.

    a) Rank 1 criterion and program with arbitrary metrics.

    b) Special cases: canonical analysis, PCA on instrumental variables, PLS2 regression.

    c) Criteria and program for subsequent ranks.

    d) Prediction: choosing the optimal number of components.

    e) Metrics of the continuum between canonical analysis, PCA on instrumental variables, and PLS.
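
The rank-1 criterion and program above can be illustrated in their simplest special case: a single response (PLS1) with the identity metric. There, the weight vector maximizing cov(Xw, y) under ||w|| = 1 is w = X'y / ||X'y||, the first component is t = Xw, and y is regressed on t. A minimal pure-Python sketch under those assumptions, with invented data:

```python
from math import sqrt

# Rank-1 PLS step for a single centered response (PLS1, identity metric):
# w proportional to X'y, first component t = Xw, then y regressed on t.
# Data are made up for illustration.

def pls_rank1(X, y):
    n, p = len(X), len(X[0])
    # Weight vector w = X'y / ||X'y||.
    w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(wj * wj for wj in w))
    w = [wj / norm for wj in w]
    # First latent component t = Xw.
    t = [sum(xij * wj for xij, wj in zip(row, w)) for row in X]
    # Regression coefficient of y on t.
    c = sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)
    return w, t, c

X = [(-1.0, -1.1), (-0.5, -0.4), (0.0, 0.1), (0.5, 0.4), (1.0, 1.0)]
y = [-2.0, -1.0, 0.0, 1.0, 2.0]
w, t, c = pls_rank1(X, y)
print("w =", w, " c =", c)
```

Subsequent ranks would repeat this step on deflated data; the multivariate case replaces X'y by X'Y and extracts a dominant direction, as developed in II.2.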

III - Linear modeling of a nominal variable: linear discriminant analyses.

  1. Discriminant factor analysis

    a) Criteria and program.

    b) Components and discriminating powers.

  2. PLS discriminant analysis.

    a) Criteria and program.

    b) Components and discriminating powers.

    c) Centroidal discriminant analysis.

  3. Decision-making aspects.

    a) Decision (classification), losses, decision rules (allocation), risks.

    b) Choosing the right number of components for the decision.
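
The allocation rules of item III.3a can be illustrated by the simplest one: assigning an observation to the class whose centroid is nearest, in the spirit of the centroidal discriminant analysis of III.2c. A minimal sketch assuming a Euclidean metric and invented class data:

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Nearest-centroid allocation rule: classify an observation into the
# class whose centroid is closest in Euclidean distance. Data are
# made up for illustration.

def centroid(points):
    n = len(points)
    return tuple(sum(p[j] for p in points) / n for j in range(len(points[0])))

def allocate(x, centroids):
    # Decision rule: class label minimizing distance to its centroid.
    return min(centroids, key=lambda label: dist(x, centroids[label]))

classes = {
    "A": [(0.0, 0.2), (0.3, -0.1), (-0.2, 0.0)],
    "B": [(2.0, 2.1), (1.8, 1.9), (2.2, 2.0)],
}
centroids = {label: centroid(pts) for label, pts in classes.items()}
print(allocate((0.1, 0.1), centroids))   # near class A's centroid
print(allocate((2.0, 2.0), centroids))   # near class B's centroid
```

The decision-theoretic framework of III.3a generalizes this by weighting distances with losses and prior risks; the metric itself can be the within-class (Mahalanobis) metric rather than the Euclidean one used here.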

Additional information

Hourly volumes:

            CM (lectures): 21

            TD (tutorials):

            TP (practicals):

            Fieldwork:
