My main research theme is Machine Learning and Data Science for large-scale connected Intelligent Systems.
The notion of connectedness in this context does not reduce to mere edges of graphs. Connectedness in modern systems is more adequately represented via a concept of (possibly dynamic or one-sided) non-independence.
Specific cases can include causal relations, hypergraph structures, spatial dependence, influence from common factors or adversaries, and various forms of stochastic dependence going beyond analysis of correlations.
In addition, modern learning systems constantly collect, produce, and analyze vast amounts of high-dimensional data of novel types.
Functions, texts, images and videos, complex physical spectra, time series and stochastic processes, manifold-valued and graph-valued data recorded by modern sensors, as well as complicated records describing dependencies within the system itself, all have to be taken into account in order to ensure reliable decisions.
Examples of application areas include the Internet of Things for Industry 4.0, SmartGrid and smart environments, automated analysis of social, telecom, and political networks, systems of autonomous robots, and optimal configurations of neurons for Deep Learning applications.
My work has received many academic honors, and applications of my results have made an impact and improved standards in a number of industries.
MAIN CURRENT DIRECTION
Within the area of large-scale autonomous connected learning systems, I am particularly interested in machine learning for Advanced Manufacturing and the Internet of Things for Industry 4.0.
This is one of the national strategic focus areas for Switzerland, where I lead a long-term national project “Nano-Assembly”, collaborating with major Swiss and international manufacturers.
I attempt to develop machine learning methodology and practice guidelines for Advanced Manufacturing and Materials Science, setting new standards for machine learning work in this field, and helping companies to improve the quality of materials and goods in dozens of industries, including manufacturing, agriculture, security and energy.
This is an urgent and necessary task, as currently the vast majority of research in machine learning is designed to handle virtual data from the Internet or other digital environments. In its current state, machine learning needs new tools to be effectively applied to production of physical goods at scale.
Studying numerous particular cases suggests that many connected autonomous learning systems, whether in industry or in society, can be better predicted, controlled, and directed when the scale of the system grows to be extremely large.
For example, Deep Learning has shown us that networks with a large number of neurons are easier to train, while at the same time there exist simple tricks to avoid overfitting in large neural networks.
Additionally, some nontrivial useful properties, such as self-organization or emergence, seem to be more likely to appear in very large systems than in moderately sized systems.
I aim to understand the nature of this phenomenon with the help of asymptotic methods of mathematics and statistics, and to learn to create such effects in intelligent systems in practice.
Methodologically, my research is at the intersection of machine learning, statistics, applied mathematics, and computer science, with the general focus being on inventing novel methods and algorithms for automated, unsupervised analysis of massive datasets encountered in practice nowadays.
This includes analysis of novel types of complex data, such as networks, images, manifolds and infinite-dimensional objects, or analysis of dynamic data, such as time series and stochastic processes, as well as real-time data analysis.
Given the breadth of my interaction with businesses worldwide, including at a regulatory government level in Germany and the European Union, and as a scientific expert in statistical data analysis and machine learning, I am very well placed to lead both fundamental research and its translation to business applications.
I believe my industrial experience has a unique edge: I have not been limited to the purely technical (narrow) goals of one company, but have instead had exposure to numerous top-level technology-consuming and technology-producing companies by virtue of being at Germany’s National Institute of Standards and Technology and the country’s highest technical authority.
This experience has given me a richer perspective on how industry functions, how it is regulated, and what it needs, which I believe is a key advantage that I bring. This will help me to define well-grounded research directions with a long-term perspective and high relevance to the world of international business.
My previous research experience laid the foundation for my current and future work, and is organically integrated into the toolbox that will be used in my upcoming projects. Many of these topics and ideas periodically recur in my research, depending on the current application area.
I have developed a number of novel data analysis approaches throughout my career; a relatively detailed description of them can be found below.
LEARNING FOR GRAPH-VALUED DATA AND ON NETWORKS
Statistics of networks and structures, statistical and probabilistic image analysis, spatial statistics.
I developed a novel unsupervised nonparametric method for detection of signals and anomalies and reconstruction of images and clusters in the presence of random noise of possibly extreme characteristics. We are able to detect and estimate not only regular signals in the images or networks, but also weak objects of unknown shape, as well as fine structures such as curves (even those that are not visible to the human eye), or small clusters in networks.
The new algorithms have linear and sub-linear computational complexity and exponential accuracy and are therefore appropriate for real-time systems. Our analysis is mathematically rigorous. Each of the algorithms has a built-in data-driven stopping rule, so there is no need for human assistance to stop the algorithm at an appropriate step.
Application fields include nanotechnology, image analysis, robot vision, mathematical biology and network analysis. Important contributions to this development have been made by Professors Michael Habeck, Olaf Wittich, Laurie Davies and Bernhard Schölkopf.
Statistics, machine learning and optimization for giant networks, statistics of mobile and wireless networks and of the Internet
A wide variety of applications in machine learning, data mining and related areas involve large-scale graphs. A useful step in analyzing such graphs is to obtain certain summary statistics about the graph, as these statistics provide insight into the structure of a graph. Estimating them helps predict properties of the entire graph without having to actually look at the whole graph. This is generally very practical, and can be the only option in those cases where the whole network is not observable in principle. This is the case for the Internet, to name just one example.
Motivated by this, Suvrit Sra and I proposed a novel approach to studying statistical properties of structured subgraphs (of a given graph) and, in particular, to estimating the expected objective function value of a combinatorial optimization problem over these subgraphs.
We have shown that, even for regular graphs, very surprising phenomena occur when a property of the whole network is studied by considering random subgraphs of this network. In particular, statistical estimators exhibit nontrivial behavior, and their consistency depends on a number of unexpected conditions.
These estimators can be consistent even when they are based on a “small” graph; this provides theoretical grounds for replacing processing of giant networks by processing suitably constructed, smaller networks.
I applied these results to analysis of statistical and dynamical properties of a structured, dynamic, spatially distributed mobile phone network of Orange in Ivory Coast, and to analysis and modeling of both Smart Grids and transportation networks.
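The general idea of estimating a summary statistic of a large graph from small random subgraphs can be illustrated with a minimal sketch. This is only a toy illustration under stated assumptions, not the method described above: the statistic (edge density), the uniform node sampling, the random toy graph, and all function names are hypothetical choices made for the example. It does not cover structured subgraphs, combinatorial optimization objectives, or the consistency conditions mentioned in the text.

```python
import random
from itertools import combinations


def make_random_graph(n, p, rng):
    """Toy Erdos-Renyi graph as a dict of adjacency sets (illustrative data only)."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def edge_density(adj, nodes=None):
    """Fraction of node pairs joined by an edge in the subgraph induced by `nodes`."""
    nodes = list(adj) if nodes is None else list(nodes)
    k = len(nodes)
    if k < 2:
        raise ValueError("need at least two nodes to define edge density")
    node_set = set(nodes)
    # Each edge inside the induced subgraph is counted once from each endpoint.
    edges = sum(len(adj[v] & node_set) for v in nodes) // 2
    return edges / (k * (k - 1) / 2)


def estimate_density(adj, sample_size, n_samples, rng):
    """Average the induced-subgraph density over uniformly sampled node subsets."""
    nodes = list(adj)
    estimates = [
        edge_density(adj, rng.sample(nodes, sample_size))
        for _ in range(n_samples)
    ]
    return sum(estimates) / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    graph = make_random_graph(200, 0.1, rng)
    print("full-graph density:      ", edge_density(graph))
    print("subgraph-based estimate: ", estimate_density(graph, 40, 200, rng))
```

In this simple setting every node pair is equally likely to fall into a sampled subgraph, so the averaged subgraph density is an unbiased estimate of the full-graph density. For more structured subgraphs or more complex statistics the behavior can be far less straightforward, which is the point made in the paragraphs above.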
LEARNING FOR MANIFOLD-VALUED DATA
Active learning of manifolds and geometric statistics for computer vision, design of experiments, customer perception prediction
Since 2013, I have been leading statistics and machine learning research, as well as data analysis activities, within the international Industrial Joint Research Program xDReflect of the European Association of National Metrology Institutes.
STATISTICAL METHODS AND MACHINE LEARNING
Machine learning theory and support vector machines
In my development of new methodologies for machine learning theory and support vector machines (SVMs), I combine techniques from both machine learning and mathematical statistics.
For research in this area, I was awarded the Max Planck Society Grant “Statistical learning theory for autonomous systems”. The grant started in October 2011, with me as principal investigator, and lasted until my departure for a permanent position at Bell Labs.
Support Vector Machines (SVMs) are one of the most popular and successful classes of learning algorithms used in Artificial Intelligence (AI) systems. SVMs are often used by AI systems to perform automated classification of objects. Typically, it is assumed that the number of possible classes is fixed in advance. However, modern autonomous and intelligent systems have to perform in an uncertain environment where the types of objects and necessary number of classes are unknown.
I designed a new generation of SVMs, called ISVMs, which adapt to these uncertainties, and proved that ISVMs solve the classification problem with an unknown number of classes. In a joint effort with Bernhard Schölkopf, universal consistency of ISVMs was proved.
Another important line of my research is to develop methodology for unsupervised learning by combining recent advances in both learning theory and nonparametric statistics. This approach has already led to new results in unsupervised learning for image and network analysis.
Statistical inverse problems, inference for stochastic processes, nonparametric testing and machine learning
LEARNING FOR DYNAMIC DATA AND NETWORKS
Time series analysis and prediction, Smart Grid and energy research
Research Experience (keywords)
– Statistics of networks and of the Internet
– Big Data analytics and scalable algorithms
– Machine learning theory, unsupervised learning
– Support vector machines
– Mobile and wireless networks analysis
– Statistical modeling and traffic analysis
– Structured, dynamic, spatially distributed data
– Computational advertising
– Games research
– Spatial statistics and statistical image analysis
– Time series analysis for Smart Grid
– Econometrics and financial statistics
– Randomized algorithms
– Statistical inverse problems and their applications
– Statistics for stochastic processes, nonparametric statistics
– Stochastic analysis and quantitative finance
– Heavy-tailed distributions, strong dependencies and extremes
– Nonstandard statistics: singular or discrete models, unusual limit distributions
– Statistical quality control and sampling strategies
– Geometric statistics and learning on manifolds for computer vision
– Statistical design of experiments, active learning, online learning
– Customer perception analysis and prediction, A/B testing