Initiative In Data Science And Statistics: A Collaborative Effort With Two Distinct Components

The Initiative in Data Science and Statistics, recently launched by New York University and spearheaded by Gérard Ben Arous, Director of the Courant Institute of Mathematical Sciences, originated last year in a working group of leaders from across the university. The group was led by Yann LeCun, Courant Institute Silver Professor of Computer Science, Neural Science, and Electrical and Computer Engineering, who will serve as the Director of the Center for Data Science, part of the new initiative.

Seeking to propel NYU to the forefront of the rapidly developing field of data science, the Initiative in Data Science and Statistics will build on the university’s strength in many fields of knowledge at schools across its academic spectrum, including the Leonard N. Stern School of Business; the Polytechnic Institute of NYU; the Center for the Promotion of Research Involving Innovative Statistical Methodology (PRIISM) at the Steinhardt School of Culture, Education, and Human Development; the Center for Health Informatics and Bioinformatics at NYU Langone Medical Center; the College of Arts and Sciences; and the newly created Center for Urban Science and Progress (CUSP).

The initiative includes two separate but complementary components: education (the Master of Science in Data Science program) and research (the Center for Data Science). Both are essential: as massive data sets are constantly being generated, extracting knowledge from them has become increasingly important, and doing so requires both teaching and applying advanced analytic methods.

The difference between big data and data science

Big data and data science may seem similar, even identical, to some, but there is an important distinction between the two. Gérard Ben Arous emphasizes: “We are not doing big data. This is crucial. The difference between the two is the word science. I am, we are, scientists.”

He states, “Big data is more concerned with the engineering components of data and in answering the following questions: how do you store it, how do you manipulate it, how do you do parallelized computations on it, how do you access it, how do you mine it? That is more of what CUSP will be interested in and we will collaborate. But we will do more science,” he says, “looking at the algorithmic and mathematical aspects of extracting knowledge from data.”


-By ML Ball