Selected research interests of Susanne Still

Physics of Computation, and Optimal Information Processing

Selected Machine Learning Applications

The purpose of this text is to give interested readers a synopsis of some of my work. Obviously, there is a large body of relevant work done by others in these areas, but since this text is not intended as a tutorial, I limit references to the work of others to the most essential. Please find further references in my papers.

Physics of Computation, and Optimal Information Processing

- energy efficiency.

- efficient predictive inference: the ability to produce a model of environmental variables that has predictive power at smallest possible model complexity.
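
In the Information Bottleneck formulation discussed below (standard notation; see [4]), this trade-off is made explicit: observations X are compressed into a summary Z that retains information about a predictively relevant quantity Y, by solving

```latex
\min_{p(z|x)} \; I(X;Z) \;-\; \beta \, I(Z;Y)
```

where the Lagrange multiplier beta sets the price of model complexity, I(X;Z), relative to predictive power, I(Z;Y).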

It is not inconceivable that von Neumann, Wiener and Shannon had these ideas in the back of their minds when they developed measures of information. However, the analysis we use here hinges upon the notion of nonequilibrium (or generalized) free energy, which emerged much later, and which is becoming a common tool in the study of systems operating far from thermodynamic equilibrium (such as living systems). Since inference is an activity of the human mind, which is obviously not in thermodynamic equilibrium, it comes as no great surprise that the concept of generalized free energy is helping to understand the thermodynamics of inference and communication.
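
As a pointer for readers unfamiliar with the concept (a standard textbook definition, not specific to the papers below): for a distribution p over microstates with energies E, held at temperature T, the generalized free energy is

```latex
F[p] \;=\; \langle E \rangle_p \;-\; T\,S[p],
\qquad
S[p] \;=\; -k_B \sum_i p_i \ln p_i .
```

It reduces to the equilibrium free energy when p is the Boltzmann distribution, and otherwise exceeds it by kT times the Kullback-Leibler divergence of p from equilibrium.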

The Information Bottleneck framework provides not only a constructive method for predictive inference, from which learning algorithms can be derived, but also a general information-theoretic framework for data processing that is well grounded in physics, as I have argued [4]. The framework can be generalized to dynamical learning, yielding a recursive algorithm [4], and further to interactive learning [8] (see also next paragraph). The generalized Information Bottleneck framework then provides not only a way to better understand known models of dynamical systems [4, 7], but also a way to learn them from data [4, 7], and to extend them to the situation with feedback [8].
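
As a concrete illustration, the self-consistent Information Bottleneck equations for discrete variables can be iterated directly. This is my own minimal sketch, not code from the papers cited here; the function name, the toy joint distribution, and all parameter choices are illustrative.

```python
import numpy as np

def information_bottleneck(p_xy, n_z, beta, n_iter=200, seed=0):
    """Iterate the self-consistent IB equations for a discrete joint p(x, y).

    Returns the soft encoder p(z|x). Minimal sketch; no annealing or
    convergence checks, which a production implementation would need.
    """
    rng = np.random.default_rng(seed)
    n_x, n_y = p_xy.shape
    p_x = p_xy.sum(axis=1)                       # marginal p(x)
    p_y_given_x = p_xy / p_x[:, None]            # conditional p(y|x)

    # Random soft initialization of the encoder p(z|x).
    p_z_given_x = rng.random((n_x, n_z))
    p_z_given_x /= p_z_given_x.sum(axis=1, keepdims=True)

    eps = 1e-12
    for _ in range(n_iter):
        p_z = p_x @ p_z_given_x                  # p(z) = sum_x p(x) p(z|x)
        # Decoder: p(y|z) = sum_x p(z|x) p(x) p(y|x) / p(z)
        p_y_given_z = (p_z_given_x * p_x[:, None]).T @ p_y_given_x
        p_y_given_z /= p_z[:, None]
        # KL divergence D[ p(y|x) || p(y|z) ] for every (x, z) pair.
        log_ratio = (np.log(p_y_given_x[:, None, :] + eps)
                     - np.log(p_y_given_z[None, :, :] + eps))
        kl = (p_y_given_x[:, None, :] * log_ratio).sum(axis=2)
        # Self-consistent encoder update: p(z|x) proportional to p(z) exp(-beta KL).
        p_z_given_x = p_z[None, :] * np.exp(-beta * kl)
        p_z_given_x /= p_z_given_x.sum(axis=1, keepdims=True)
    return p_z_given_x

# Toy joint: x in {0,1} is predictive of y=0, x in {2,3} of y=1.
p_xy = np.array([[0.2, 0.05], [0.2, 0.05], [0.05, 0.2], [0.05, 0.2]])
encoder = information_bottleneck(p_xy, n_z=2, beta=10.0)
print(encoder.round(2))
```

At large beta the encoder hardens into two clusters that group the x-values with matching conditionals p(y|x), which is the expected compressed, predictive summary.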

Recently, we generalized the Information Bottleneck framework to quantum information processing [1]. This work enables a quantitative assessment of the advantages of using a quantum memory over a classical memory. All systems ultimately have to obey quantum mechanics. With the advent of quantum computers, and with mounting evidence for the importance of quantum effects in certain biological systems, understanding efficient use of quantum information has become increasingly important. In this context, I joined a collaboration studying light harvesting complexes. We found indications for possible adaptation mechanisms in a model of nonphotochemical quenching [2], and are now trying to understand how the thermodynamics of information processing comes to bear on the subject. This project is relevant with regard to bio-fuel production. I believe that the future of humanity hinges upon efficient use of regenerative energy sources.

Living systems learn by interacting with their environment; in a sense, they "ask questions and do experiments", not only by actively filtering the data, but also by perturbing, and, to some degree, controlling the environment that they are learning about. Ultimately, one would like to understand the emergence of complex behaviors from simple first principles. To ask about simple characteristics of policies that would allow an agent to optimally capture predictive information, I extended the Information Bottleneck approach to the situation with feedback from the learner, and showed that optimal encoding in the presence of feedback requires action strategies to balance exploration with control [8]. Both aspects, exploration and control, emerge in this treatment as necessary ingredients for behaviors with maximal predictive power. This study resulted in a novel algorithm for computing optimal models and policies from data, which my student Lisa Miller has applied to selected problems in robotics (work in progress). In the context of reinforcement learning, this approach allowed us to study [6] how exploration emerges as an optimal strategy, driven by the need to gather information, rather than being put in by hand as action policy randomization.

- [1] A. L. Grimsmo and S. Still (2016) Quantum Predictive Filtering. Phys. Rev. A 94, 012338
- [2] G. P. Berman, A. I. Nesterov, R. T. Sayre and S. Still (2016) On improving the performance of nonphotochemical quenching in CP29 light-harvesting antenna complex. Physics Letters A, 380 (13), pp. 1279–1283 (preprint)

- [3] S. Still (2014) Lossy is lazy. Proc. Seventh Workshop on Information Theoretic Methods in Science and Engineering (WITMSE-2014), eds. J. Rissanen, P. Myllymäki, T. Roos, and N. P. Santhanam

- [4] S. Still (2014) Information Bottleneck Approach to Predictive Inference. Entropy 16(2): 968-989

- [5] S. Still, D. A. Sivak, A. J. Bell, and G. E. Crooks (2012) Thermodynamics of Prediction. Phys. Rev. Lett. 109, 120604 (Paper was reported on in Nature News)
- [6] S. Still and D. Precup (2012) An information-theoretic approach to curiosity-driven reinforcement learning. Theory in Biosciences, 131 (3), pp. 139-148

- [7] S. Still, J. P. Crutchfield and C. J. Ellison (2010) Optimally Predictive Causal Inference. CHAOS 20, 037111

- [8] S. Still (2009) Information theoretic approach to interactive learning. EPL 85, 28005
- [9] S. Still and W. Bialek (2004) How many clusters? An information theoretic perspective. Neural Computation 16(12): 2483-2506

- [10] G. E. Crooks and S. Still. Marginal and Conditional Second Law of Thermodynamics for Strongly Coupled Systems.

- [11] S. Still, Thermodynamics of inference and optimal information processing.

- 11/18/2016 (planned) Statistical Physics, Information Processing and Biology, Santa Fe Institute, Santa Fe, NM
- 09/25/2016 Information, Control, and Learning--The Ingredients of Intelligent Behavior, Hebrew University, Jerusalem, Israel (remote talk).

- 08/20/2016 Foundational Questions Institute, 5th International Conference, Banff, Canada.

- 04/25/2016 Spring College in the Physics of Complex Systems, International Center for Theoretical Physics (ICTP), Trieste, Italy.

- 7/14-17/2015 Conference on Sensing, Information and Decision at the Cellular Level, ICTP

- 5/4-6/2015 Workshop "Nature as Computation", Beyond Center for Fundamental Concepts in Science.

- 4/8-10/2015 Workshop on Entropy and Information in Biological Systems National Institute for Mathematical and Biological Synthesis (NIMBioS).
- 10/26-31/2014 Biological and Bio-Inspired Information Theory, Banff, Canada.

- 7/5-8/2014 Seventh Workshop on Information Theoretic Methods in Science and Engineering
- 5/8-10/2014 Statistical Mechanics Foundations of Complexity–Where do we stand? Santa Fe Institute.

- 1/14-16/2014 The Foundational Questions Institute Fourth International Conference, Vieques Island, PR.

- 6/26-28/2013 Modeling Neural Activity (MONA) Kauai, HI.
- 01/2011 Workshop on measures of complexity, Santa Fe Institute, Santa Fe, NM

- 01/2011 - Berkeley Mini Stat. Mech. Meeting.

- 11/2016 (planned) Condensed Matter Seminar, UC Santa Cruz.
- 08/2016 - Biophysics Seminar, Simon Fraser University, Vancouver, Canada.
- 06/2013 - Max Planck Institute for Dynamics and Self-organization, Göttingen, Germany.
- 04/2013 - Scuola Internazionale Superiore di Studi Avanzati (SISSA) Trieste, Italy.
- 03/2013 - Physics Department, The University of Auckland, Auckland, NZ.
- 03/2013 - Physics Department, The University of the South Pacific, Suva, Fiji.
- 11/2012 - Center for Mind, Brain and Computation, Stanford University.

- 09/2012 - Physics Colloquium, University of Hawaii at Manoa.

- 10/2011 - Redwood Center for Neuroscience, University of California at Berkeley.
- 08/2011 - Institute for Neuroinformatics, ETH/UNI Zürich, Switzerland.
- 11/2011 - Symposium in honor of W. Bialek’s 50th Birthday, Princeton University, Princeton, NJ.
- 11/2011 - Applied Math Seminar, City College New York, NY.

- 03/18/2013 - APS March meeting; Session: Fluctuations in Non-Equilibrium Systems; Chair: Chris Jarzynski.

- 2/19/2015 Nostalgia Just Became a Law of Nature (by S. DeDeo)
- 10/9/2014 Life's Quantum Crystal Ball (by C. Piekema)
- 10/4/2012 Proteins remember the past to predict the future (by P. Ball) Nature News.

Selected Machine Learning Applications

Textbook portfolio optimization methods used in quantitative finance produce solutions that are not stable under sample fluctuations when used in practice. This effect was discovered by a team of physicists, led by Imre Kondor, and characterized using methods from statistical physics. The instability poses a fundamental problem, because solutions that are not stable under sample fluctuations may look optimal for a given sample, but are, in effect, very far from optimal with respect to the average risk. In the bigger picture, instabilities of this type show up in many places in finance, in the economy at large, and also in other complex systems. Understanding systemic risk has become a priority since the recent financial crisis, partly because this understanding could help to determine the right regulation.

The instability was discovered in the regime in which the number of assets is large and comparable to the number of data points, as is typically the case in large institutions, such as banks and insurance companies. I realized that the instability is related to over-fitting, and pointed out that portfolio optimization needs to be regularized to fix the problem. The main insight is that large portfolios are selected by minimization of an empirical risk measure, in a regime in which there is not enough data to guarantee small actual risk, i.e. there is not enough data to ensure that empirical averages converge to expectation values. This is the case because the practical situation for selecting large institutional portfolios dictates that the amount of historical data is more or less comparable to the number of assets. The problem can be addressed by known regularization methods. Interestingly, when one uses the fashionable "expected shortfall" risk measure, the regularized portfolio problem results in an algorithm that is closely related to support vector regression. Support vector algorithms have met with considerable success in machine learning, and it is highly desirable to be able to exploit them also for portfolio selection. We gave a detailed derivation of the algorithm [16], which differs slightly from a previously known SVM algorithm due to the nature of the portfolio selection problem. We also showed that the proposed regularization corresponds to a diversification "pressure". This means that diversification, besides counteracting downward fluctuations in some assets by upward fluctuations in others, is also crucial for improving the stability of the solution. The approach we provide allows for the simultaneous treatment of optimization and diversification in one framework, which lets the investor trade off between the two, depending on the size of the available data set.
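
The over-fitting effect is easy to reproduce numerically. The papers discussed here use the expected-shortfall risk measure and a support-vector formulation; as a minimal sketch of the same regularization idea (my own illustration, with hypothetical function and parameter names), the example below uses ridge-regularized minimum-variance portfolio selection, in the regime where the number of assets is comparable to the number of data points:

```python
import numpy as np

def min_var_weights(returns, ridge=0.0):
    """Minimum-variance weights under the budget constraint sum(w) = 1.

    Solves min_w  w' (C + ridge * I) w  s.t.  1'w = 1  in closed form,
    where C is the sample covariance. Minimal sketch, not the
    expected-shortfall / support-vector algorithm of [16].
    """
    n = returns.shape[1]
    cov = np.cov(returns, rowvar=False) + ridge * np.eye(n)
    ones = np.ones(n)
    w = np.linalg.solve(cov, ones)
    return w / (ones @ w)

rng = np.random.default_rng(1)
n_assets, t_train, t_test = 50, 60, 10_000   # N comparable to T: over-fitting regime
# i.i.d. unit-variance assets: the true optimum is equal weights.
train = rng.standard_normal((t_train, n_assets))
test = rng.standard_normal((t_test, n_assets))

for ridge in (0.0, 1.0):
    w = min_var_weights(train, ridge)
    print(f"ridge={ridge}: out-of-sample variance {np.var(test @ w):.3f}")
```

The unregularized solution looks optimal in-sample but carries a much larger out-of-sample variance than the regularized one, which stays close to the truly optimal equal-weight portfolio; the ridge penalty acts as the diversification pressure described above.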

In two follow-up papers [12, 14] we have characterized the typical behavior of the optimal liquidation strategies, in the limit of large portfolio sizes, by means of a replica calculation, showing how regularization can remove the instability. We furthermore showed how regularization naturally emerges when the market impact of portfolio liquidation is taken into account. The idea is that an investor should care about the risk of the cashflow that could be generated by the portfolio if it were liquidated. But the liquidation of large positions will influence prices, and that has to be taken into account when computing the risk of the cash that could be generated from the portfolio. We showed which market impact functions correspond to different regularizers, and systematically analyzed their effects on performance [12]. Importantly, we found that the instability is cured (meaning that the divergence goes away) for all Lp norms with p > 1. However, for the fashionable L1 norm, things are more complicated. There is a way of implementing it that does cure the instability, but the most naive implementation may not: it may only shift the divergence.

- [12] F. Caccioli, I. Kondor, M. Marsili and S. Still (2016) Liquidity Risk And Instabilities In Portfolio Optimization. Int. J. Theor. Appl. Finan. 19, 1650035 (earlier version on arxiv:1404.4040)
- [13] L. J. Miller, R. Gazan and S. Still (2014) Unsupervised Document Classification and Visualization of Unstructured Text for the Support of Interdisciplinary Collaboration. Proc. 17th ACM Conf. Computer Supported Cooperative Work and Social Computing (CSCW-2014).
- [14] F. Caccioli, I. Kondor, M. Marsili and S. Still (2013) Optimal liquidation strategies regularize portfolio selection. The European Journal of Finance, 19 (6), 554-571 (preprint on arxiv:1004.4169)
- [15] Hamilton CW, C Beggan, S Still, M Beuthe, R Lopes, D Williams, J Radebaugh, and W Wright (2013) Spatial distribution of volcanoes on Io: implications for tidal heating and magma ascent. Earth and Planetary Science Letters 361, pp. 272–286 (pdf) (Paper was reported on by several news agencies, including NBC and LA Times)

- [16] S. Still and I. Kondor (2010) Regularizing Portfolio Optimization. New Journal of Physics 12, 075034 (Special Issue on Statistical Physics Modeling in Economics and Finance)

*This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.*