Sciences

Statistics and Data Science (SSD)

  • Training structure

    Faculty of Science

Presentation

The SSD program is a course in applied mathematics that aims to provide high-level skills in statistics, random modeling and data science.
It is designed to provide solid knowledge and professional skills to enable students to join multidisciplinary teams in a wide range of sectors: health, biology, ecology, environment, genomics, energy, agronomy, economics, banking, insurance, marketing, research, higher education, etc.

Organization

Available as a work-study program

This program can be taken on a work-study (sandwich) basis.

Program

M1 - Statistics and Data Science (SSD)

Fueled by increasingly powerful means of collection, data is growing exponentially, and few fields escape measurement that becomes more extensive every day. But collecting data is one thing; analyzing it is another. Analysis is made difficult by two main factors: the size of the data and the complexity of the phenomena being measured. Contemporary statistics seeks to address both. It is therefore evolving very rapidly, preserving the best of earlier tools, which it adapts to massive, large-scale data, while proposing ever more refined modeling methods that respect the complexity of the phenomena. Classical statistics has thus evolved toward a more computational "data science" that integrates machine learning and diagnostic techniques halfway between statistics and artificial intelligence.

The Statistics and Data Science program provides training in all contemporary statistical analysis and modeling methodologies. While it leads to the profession of data scientist, it covers both methodological design, through mastery of the underlying mathematics and its computer implementation, and the rigorous application of methods and models to data of various types and domains.

In the second year, the program splits into two more specialized tracks that still share part of their teaching. The first, Biostatistics, focuses on the analysis and modeling of life-science data. The second, Information and Decision Management (MIND), specializes in the analysis and modeling of economic data and in the management of decisions and their associated risks.

 

  • Stochastic processes

  • Information system and databases

    4 credits
  • Analysis of multi-dimensional data

    5 credits
  • Optimization

    5 credits
  • Software development

    4 credits
  • Inferential statistics

  • Information and decision theories

    2 credits
  • Stochastic control

    2 credits
  • Time series

    4 credits
  • Estimation and non-parametric tests

    4 credits
  • Linear model

    5 credits
  • Project

    5 credits
  • English

    2 credits
  • Elective 2

    2 credits
    • Your choice: 1 of 4

      • Epidemiology tools

        2 credits
      • Microeconomics

        2 credits
      • Bioinformatics Learning Lab

        2 credits
      • Biological information

        2 credits
  • Elective 1

    4 credits
    • Your choice: 1 of 2

      • Alignment and Phylogeny

        4 credits
      • General economics

        4 credits
  • R programming

    2 credits

M2 - Statistics and Data Science (SSD) - BIOSTATS

This M2 is intended for students with an M1 in Statistics and Data Science (SSD), or any other M1 in mathematics or equivalent with a strong specialization in probability and statistics.

This M2 is divided into two more specialized sub-courses, whose teaching remains partially shared.

- The first of these specializations is Biostatistics, which focuses on the analysis and modeling of life-science data.

- The second is Information and Decision Management (MIND), which specializes in the analysis and modeling of economic data as well as the management of decisions and associated risks.

  • The SSD-Biostat track is aimed at M1 SSD students who are drawn to modeling data from the life sciences or the environment. The statistical topics covered range from the modeling of living systems to the most theoretical problems in statistics and stochastic modeling. Computational work features heavily in this program and requires a strong taste for computer programming.

The SSD-Biostat track is demanding because it focuses on concepts rather than techniques. In the data field in the broadest sense, digital technologies are evolving rapidly with the advent of artificial intelligence, and becoming outdated even faster. Future statistical engineers and researchers who work with data will be able to train themselves in new technologies throughout their careers, all the more easily if they have received solid conceptual training at the outset. The added value of the program is precisely that it provides a theoretical understanding of the statistical concepts underlying automated algorithms. Graduates must also be able to keep up with technological developments effectively.

The SSD-Biostat track still shares part of its second-year teaching with the Information and Decision Management track (SSD-MIND). However, SSD-Biostat has its own specialization courses in life and environmental data science, and is more oriented toward an introduction to research (two courses per semester in M2).

  • Non-parametric estimation

    5 credits
  • Generalized linear models

    5 credits
  • English

    2 credits
  • Work-study project or defense

    3 credits
  • Bayesian Statistics

    5 credits
  • Multivariate analysis

    5 credits
  • Statistical learning

    5 credits
  • Lifetime analysis

    4 credits
  • Supplement 2

    4 credits
  • Supplement 1

    4 credits
  • Internship

    14 credits
  • Latent variable models

    4 credits

M2 - Statistics and Data Science (SSD) - MIND

This M2 is intended for students with an M1 in Statistics and Data Science (SSD), or any other M1 in mathematics or equivalent with a strong specialization in probability and statistics.

This M2 is divided into two more specialized sub-courses, whose teaching remains partially shared.

- The first of these specializations is Biostatistics, which focuses on the analysis and modeling of life-science data.

- The second is Information and Decision Management (MIND), which specializes in the analysis and modeling of economic data as well as the management of decisions and associated risks.

  • The SSD-MIND track is aimed at M1 SSD students who are drawn to applying data science in business. Given the great diversity of companies and their problems, this M2 trains students in general-purpose data science applicable across all fields. It also provides training specific to the business context and its economic and managerial concerns (economic information, financial risk management, customers, corporate strategy, etc.).

The SSD-MIND track is a mathematical-engineering program that emphasizes methodology and thorough mastery of statistical concepts and models. Graduates will be able to handle all types of data and problems, and to design a complete and often original methodology for the problem at hand: managing and organizing the data, exploring and selectively reducing them, modeling the phenomena of interest, and finally synthesizing the extracted information to support decision-making. They will also need to know how to convey to the company the knowledge drawn from the data. Each new data set, and each question asked of it, often poses a new problem for which applying a standard method is inadequate. Instead, one must write a mathematical model suited to the data (in the sense that it reflects their complexity satisfactorily) and either bring it within reach of a standard estimation method or design and program a more specific one. The program's emphasis on conceptual and mathematical mastery of the tools gives graduates the strong capacity for adaptation and self-training that the rapid evolution of data science requires.

The SSD-MIND track shares part of its second-year teaching with the Biostatistics track (SSD-Biostat), which is more specialized in the analysis and modeling of data from the living world or the environment. SSD-MIND is a dual program run in partnership with the IAE, which provides the economics and management courses, and leads to a double degree.

  • Generalized linear models

    5 credits
  • English

    2 credits
  • Work-study project or defense

    3 credits
  • Risk management

    10 credits, 84 h
  • Multivariate analysis

    5 credits
  • Statistical learning

    5 credits
  • Lifetime analysis

    4 credits
  • Internship

    14 credits
  • Strategy and project management

    4 credits
  • Latent variable models

    4 credits
  • Data mining and missing data

    4 credits

Admission

Conditions of access

The Master's degree in Maths - SSD is open to holders of a Bachelor's degree in Mathematics (fundamental or applied).

How to register

Applications are made on the following platforms: 

French & European students:

International students from outside the EU: follow the "Studies in France" procedure: https://pastel.diplomatie.gouv.fr/etudesenfrance/dyn/public/authentification/login.html

After the program

Further studies

The Master's degree in Maths - SSD can also lead to a doctoral thesis, in academia or in industry, training future teacher-researchers or research engineers.

Professional integration

Statistician, data scientist, data manager, market researcher, customer relations manager, risk manager, biostatistician, or researcher in public research institutions or corporate R&D teams.
