LOGIC AND LEARNING
The workshop was held on January 11th and 12th.
Logic has proved, over the last decades, a powerful tool for understanding complex systems. It is instrumental in the development of formal methods: mathematically based techniques that focus on providing hard guarantees. Learning is a pervasive paradigm which has seen tremendous success recently. The use of statistical approaches yields practical solutions to problems which until recently seemed out of reach. These two mindsets should not be kept apart, and many recent efforts combine the formal reasoning offered by logic with the power of learning.
The goal of the workshop was to bring together expertise from various areas in order to understand the opportunities offered by combining logic and learning. There were 12 invited speakers.
PROGRAMME
Thursday 11th morning (chair: David Pym)
9:30–10:30
9:30 - 9:45 Nathanaël Fijalkow Introduction
9:45 - 10:30 Martin Grohe (RWTH Aachen University) Learning Logically Defined Hypotheses
Abstract:
I will introduce a declarative framework for machine learning where hypotheses are defined by formulas of a logic over some background structure. Within this framework, I will discuss positive as well as negative learnability results (in the "probably approximately correct" learning sense) for hypothesis classes defined in first-order logic and monadic second-order logic over strings, trees, and graphs of bounded degree. While purely theoretical at this point, the hope is that our framework may serve as a foundation for declarative approaches to ML in logic-affine areas such as database systems or automated verification.
(Joint work with Christof Löding and Martin Ritzert.)
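To give a flavour of learning with logically defined hypotheses, here is a minimal invented sketch (not from the talk): hypotheses over strings are given by a simple parameterised formula ("some position labelled a is followed by a position labelled b within k steps"), and a learner performs empirical risk minimisation over this finite class. The formulas, sample and parameter range are all made up for illustration.

```python
# Minimal sketch (illustration only): empirical risk minimisation over a small
# class of logically defined hypotheses on strings. Each hypothesis phi_k says
# "there exist positions i < j with word[i] == 'a', word[j] == 'b' and j - i <= k".
# The hypothesis class, the sample and the parameter range are invented here.

def phi(k, word):
    """Evaluate the hypothesis phi_k on a string."""
    return any(word[i] == 'a' and word[j] == 'b'
               for i in range(len(word))
               for j in range(i + 1, min(len(word), i + k + 1)))

def erm(sample, ks=range(1, 6)):
    """Return the parameter k whose hypothesis minimises empirical error."""
    def error(k):
        return sum(phi(k, w) != label for w, label in sample)
    return min(ks, key=error)

# Labelled sample: strings labelled by whether 'a' is followed by 'b' within 2 steps.
sample = [("aab", True), ("acb", True), ("accb", False),
          ("ba", False), ("abc", True), ("accccb", False)]

k_hat = erm(sample)
print("learned hypothesis: phi_%d, empirical errors = %d"
      % (k_hat, sum(phi(k_hat, w) != y for w, y in sample)))
```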
Coffee break
11:00–12:30
11:00 - 11:45 Borja Balle (Amazon Research Cambridge) Learning Automata with Hankel Matrices
Abstract:
The Hankel matrix is a fundamental tool in the theory of weighted automata. In this talk we will describe a general framework for learning automata with Hankel matrices. Our framework provides a unified view of many classical and recent algorithms for learning automata under different learning paradigms, including query learning algorithms, spectral learning algorithms, and Hankel matrix completion algorithms.
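As an illustration of the spectral flavour of Hankel-matrix learning (a sketch, not code from the talk), the snippet below builds finite Hankel blocks of a function computed by a known two-state weighted automaton, factorises the basis block with an SVD, and recovers an automaton that agrees with the target on test strings. The target automaton and the prefix/suffix basis are arbitrary choices made for the example.

```python
# Minimal sketch of spectral learning of a weighted automaton (WFA) from Hankel
# blocks, for illustration only. The target WFA and the prefix/suffix basis are
# arbitrary choices; real algorithms pick the basis and handle noise carefully.
import numpy as np
from itertools import product

alphabet = "ab"
# Target WFA: f(x) = alpha0^T A_{x1} ... A_{xn} alpha_inf
alpha0 = np.array([1.0, 0.0])
alpha_inf = np.array([0.2, 1.0])
A = {"a": np.array([[0.5, 0.2], [0.0, 0.3]]),
     "b": np.array([[0.1, 0.4], [0.3, 0.2]])}

def f(word):
    v = alpha0
    for c in word:
        v = v @ A[c]
    return float(v @ alpha_inf)

# Hankel blocks over a small prefix/suffix basis (includes the empty word "").
prefixes = [""] + list(alphabet) + ["".join(p) for p in product(alphabet, repeat=2)]
suffixes = list(prefixes)
H = np.array([[f(p + s) for s in suffixes] for p in prefixes])
H_sigma = {c: np.array([[f(p + c + s) for s in suffixes] for p in prefixes])
           for c in alphabet}

# Rank factorisation of H via a truncated SVD, then recover a WFA.
rank = 2
U, S, Vt = np.linalg.svd(H)
F = U[:, :rank] * S[:rank]                    # forward factor  (prefixes x rank)
B = Vt[:rank, :]                              # backward factor (rank x suffixes)
beta0 = np.linalg.pinv(B.T) @ H[0, :]         # row of the empty prefix
beta_inf = np.linalg.pinv(F) @ H[:, 0]        # column of the empty suffix
B_sigma = {c: np.linalg.pinv(F) @ H_sigma[c] @ np.linalg.pinv(B)
           for c in alphabet}

def f_learned(word):
    v = beta0
    for c in word:
        v = v @ B_sigma[c]
    return float(v @ beta_inf)

for w in ["", "a", "ab", "bba", "abab"]:
    print(w or "(empty)", round(f(w), 6), round(f_learned(w), 6))
```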
11:45 - 12:30 Edward Grefenstette (DeepMind) Recurrent Neural Networks and Models of Computation
Abstract:
This talk presents an analysis of various recurrent neural network architectures in terms of traditional models of computation. It makes the case for simpler recurrent architectures being closer to finite state automata, and argues that memory-enhanced architectures support better algorithmic efficiency, even in problems which are describable as regular languages.
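To illustrate the "simple recurrent architectures behave like finite state automata" intuition, here is a small hand-constructed sketch (invented for this page, not from the talk): a one-hot recurrent state updated by a per-symbol transition matrix exactly simulates the two-state DFA for "even number of a's".

```python
# Illustration only: a hand-crafted "RNN" whose hidden state is a one-hot
# encoding of a DFA state. Each input symbol selects a permutation matrix, so
# the recurrence h_{t+1} = W_x h_t exactly simulates the automaton for the
# regular language "even number of a's". Trained RNNs approximate this behaviour.
import numpy as np

W = {"a": np.array([[0, 1],      # reading 'a' swaps the two DFA states
                    [1, 0]]),
     "b": np.eye(2, dtype=int)}  # reading 'b' leaves the state unchanged

def accepts(word):
    h = np.array([1, 0])         # start in state 0 ("even number of a's so far")
    for c in word:
        h = W[c] @ h             # recurrent update
    return bool(h[0])            # accept iff we are back in state 0

for w in ["", "a", "ab", "aba", "aabb", "ababa"]:
    print(repr(w), accepts(w))
```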
Lunch
Thursday 11th afternoon (chair: Andrew Blake)
14:00–15:30
14:00 - 14:45 Luc De Raedt (Leuven) Probabilistic logic programming and its applications
Abstract:
Probabilistic programs combine the power of programming languages with that of probabilistic graphical models. There has been a lot of progress in this paradigm over the past twenty years. This talk will introduce probabilistic logic programming languages, which are based on Sato's distribution semantics and which extend probabilistic databases. The key idea is that facts or tuples can be annotated with probabilities that indicate their degree of belief. Together with the rules that encode domain knowledge, they induce a set of possible worlds. After an introduction to probabilistic programs, which will cover semantics, inference, and learning, the talk will sketch some emerging applications in knowledge-based systems, in cognitive robotics and in answering probability questions.
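As a tiny illustration of the distribution semantics (written in plain Python rather than in an actual probabilistic logic programming language such as ProbLog), the sketch below enumerates the possible worlds induced by two probabilistic facts and two rules, and sums the probabilities of the worlds in which a query holds. The facts, probabilities and rules are invented for the example.

```python
# Illustration of the distribution semantics in plain Python (a real system such
# as ProbLog has its own syntax and much smarter inference). Probabilistic facts:
#   0.1 :: burglary.      0.2 :: earthquake.
# Rules:   alarm :- burglary.      alarm :- earthquake.
# The probability of a query is the total probability of the possible worlds
# (truth assignments to the probabilistic facts) in which the query is derivable.
from itertools import product

facts = {"burglary": 0.1, "earthquake": 0.2}

def derives_alarm(world):
    return world["burglary"] or world["earthquake"]

def query_probability(query_holds):
    total = 0.0
    names = list(facts)
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        p = 1.0
        for name, value in world.items():
            p *= facts[name] if value else 1.0 - facts[name]
        if query_holds(world):
            total += p
    return total

print("P(alarm) =", query_probability(derives_alarm))   # 1 - 0.9 * 0.8 = 0.28
```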
14:45 - 15:30 Aditya Nori (Microsoft Research, Cambridge) Fairness and robustness in machine learning – a formal methods perspective
Abstract:
With the range and sensitivity of algorithmic decisions expanding at a break-neck speed, it is imperative that we aggressively investigate fairness and bias in decision-making programs. First, we show that a number of recently proposed formal definitions of fairness can be encoded as probabilistic program properties. Second, with the goal of enabling rigorous reasoning about fairness, we design a novel technique for verifying probabilistic properties that admits a wide class of decision-making programs. Third, we present FairSquare, the first verification tool for automatically certifying that a program meets a given fairness property. We evaluate FairSquare on a range of decision-making programs. Our evaluation demonstrates FairSquare's ability to verify fairness for a range of different programs, which we show are out-of-reach for state-of-the-art program analysis techniques.
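For intuition about the kind of property involved, here is a small Monte Carlo sketch (not FairSquare itself, which verifies such properties exactly and symbolically): a toy population model, a toy decision program, and a sampling-based estimate of a demographic-parity style ratio. All distributions, thresholds and the 0.8 bound are invented for the example.

```python
# Illustration only: estimating a demographic-parity style fairness ratio for a
# toy decision program by sampling from a toy population model. FairSquare-style
# tools verify such properties exactly; this is a Monte Carlo sketch with
# invented distributions and thresholds.
import random

random.seed(0)

def population_model():
    """Sample an individual: a sensitive attribute and a correlated score."""
    minority = random.random() < 0.3
    score = random.gauss(60 if minority else 65, 10)
    return minority, score

def decision_program(score):
    """The program under analysis: hire iff the score exceeds a threshold."""
    return score > 70

def fairness_ratio(num_samples=200_000):
    hired = {True: 0, False: 0}
    seen = {True: 0, False: 0}
    for _ in range(num_samples):
        minority, score = population_model()
        seen[minority] += 1
        hired[minority] += decision_program(score)
    rate = {g: hired[g] / seen[g] for g in (True, False)}
    return rate[True] / rate[False]

ratio = fairness_ratio()
print("estimated P(hire | minority) / P(hire | majority) =", round(ratio, 3))
print("satisfies 80%-rule style property:", ratio > 0.8)
```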
Coffee break
16:00–16:45
16:00 - 16:45 Jane Hillston (University of Edinburgh) Integrating Inference with Stochastic Process Algebra Models
Abstract:
ProPPA is a probabilistic programming language for continuous-time dynamical systems, developed as an extension of the stochastic process algebra Bio-PEPA. It offers a high-level syntax for describing systems of interacting components with stochastic behaviours where some of the parameters are unknown. Such systems occur in many and diverse fields, including biology, ecology and urban transport, and while their modelling and analysis are important, existing methodologies are often not accessible to non-experts, in addition to being tailor-made rather than generally applicable. In particular, parameter learning can be of crucial significance but is a difficult problem due to the complexity of the underlying probabilistic model --- namely, the continuous time setting and the fast-growing number of states.
The purpose of the ProPPA framework is to facilitate both the description of these systems and the process of inference, by automating the application of appropriate algorithms. The language is equipped with different parameter inference methods, including a novel MCMC scheme which employs a random truncation strategy to obtain unbiased likelihood estimates. This method is particularly suited to systems with infinite state-spaces, which were previously not manageable without imposing an ad-hoc truncation. Other methods include a naive Approximate Bayesian Computation algorithm and a sampler based on a continuous approximation of the state-space.
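As a rough illustration of the naive Approximate Bayesian Computation idea mentioned above (not ProPPA itself, which expresses such models in a stochastic process algebra), the sketch below simulates a simple continuous-time birth process with Gillespie's algorithm and accepts parameter proposals whose simulated population at a fixed time is close to an observation. The model, prior, observed value and tolerance are all invented for the example.

```python
# Illustration only: naive Approximate Bayesian Computation for the birth rate of
# a simple continuous-time birth process, simulated with Gillespie's algorithm.
# Everything here (model, prior, observed value, tolerance) is invented.
import random

random.seed(1)

def simulate_birth_process(rate, t_end=2.0, x0=5):
    """Gillespie simulation of X -> X + 1 with propensity rate * X."""
    t, x = 0.0, x0
    while True:
        t += random.expovariate(rate * x)
        if t > t_end:
            return x
        x += 1

def naive_abc(observed, prior=lambda: random.uniform(0.05, 2.0),
              tolerance=3, proposals=20_000):
    """Accept a proposed rate if the simulated outcome is close to the observation."""
    accepted = []
    for _ in range(proposals):
        rate = prior()
        if abs(simulate_birth_process(rate) - observed) <= tolerance:
            accepted.append(rate)
    return accepted

posterior = naive_abc(observed=20)
print("accepted %d / 20000 proposals" % len(posterior))
if posterior:
    print("posterior mean of the birth rate ~", round(sum(posterior) / len(posterior), 3))
```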
Friday 12th morning (chair: Anuj Dawar)
9:00–10:30
9:00 - 9:45 Jan Křetínský (Technical University of Munich) Fast learning of small strategies
Abstract:
In verification, precise analysis is required, but the algorithms usually suffer from scalability issues. In machine learning, scalability is achieved, but with only very weak guarantees. We show how to merge the two philosophies and profit from both. In this talk, we focus on analysing Markov decision processes. We show how to learn ε-optimal strategies fast and how to represent them concisely so that some understanding of the behaviour and debugging information can be extracted.
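To make the objects concrete, here is a small sketch (invented example, not the algorithms from the talk) that runs value iteration on a toy Markov decision process with a reachability objective, stops when successive value vectors change by at most a small eps (a common heuristic stopping rule), and extracts a memoryless strategy as a plain state-to-action table.

```python
# Illustration only: value iteration for maximal reachability probabilities on a
# tiny MDP, stopped when successive value vectors change by at most eps, followed
# by extraction of a memoryless strategy. The MDP and eps are invented.
# mdp[state][action] = list of (probability, successor)
mdp = {
    "s0": {"safe":  [(1.0, "s1")],
           "risky": [(0.6, "goal"), (0.4, "sink")]},
    "s1": {"go":    [(0.9, "goal"), (0.1, "s0")]},
    "goal": {}, "sink": {},
}

def value_iteration(target="goal", eps=1e-6):
    value = {s: 1.0 if s == target else 0.0 for s in mdp}
    while True:
        new = {}
        for s, actions in mdp.items():
            if not actions or s == target:
                new[s] = value[s]
            else:
                new[s] = max(sum(p * value[t] for p, t in succ)
                             for succ in actions.values())
        if max(abs(new[s] - value[s]) for s in mdp) <= eps:
            return new
        value = new

def extract_strategy(value):
    """Greedy memoryless strategy with respect to the computed values."""
    return {s: max(actions, key=lambda a: sum(p * value[t] for p, t in actions[a]))
            for s, actions in mdp.items() if actions}

value = value_iteration()
print("values:", {s: round(v, 4) for s, v in value.items()})
print("strategy:", extract_strategy(value))
```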
9:45 - 10:30 Alessandro Abate (University of Oxford) Formal verification and learning of complex systems
Abstract:
Two known shortcomings of standard techniques in formal verification are the limited capability to provide system-level assertions, and the lack of scalability to large-scale, complex models, such as those needed in Cyber-Physical Systems (CPS) applications. This talk covers research which addresses these shortcomings by bringing model-based and data-driven methods together, which can help push the envelope of existing algorithms and tools in formal verification.
In the first part of the talk, I will discuss a new, formal, measurement-driven and model-based automated technique for the quantitative verification of systems with partly unknown dynamics. I will formulate this setup as a data-driven Bayesian inference problem, formally embedded within a quantitative, model-based verification procedure. I will argue that the approach can be applied to complex physical systems (e.g., with spatially continuous variables), driven by external inputs and accessed under noisy measurements.
In the latter part of the talk, I will concentrate on learning actions in a model whilst verifying logical constraints, such as safety requirements. I will introduce a new paradigm called 'Logically Constrained Reinforcement Learning' to this end, and explain its virtues and its application to complex systems.
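As a rough sketch of the logically constrained reinforcement learning idea (an invented toy, not the construction from the talk): a small line-world MDP is composed with a three-state monitor automaton for "reach the goal while never entering the unsafe cell", reward is given only when the monitor accepts, and tabular Q-learning is run on the product state space.

```python
# Illustration only: Q-learning on the product of a toy line-world MDP and a
# small monitor automaton tracking "avoid the unsafe cell until the goal is
# reached". Reward is given only when the monitor accepts, so the learned policy
# respects the logical constraint. Everything here is an invented toy example.
import random

random.seed(0)

CELLS, START, UNSAFE, GOAL = 6, 2, 0, 5   # line world: cells 0..5
ACTIONS = (-1, +1)                        # step left / step right

def monitor(q, cell):
    """3-state automaton: 0 = running, 1 = accepted, 2 = rejected (traps)."""
    if q != 0:
        return q
    if cell == UNSAFE:
        return 2
    if cell == GOAL:
        return 1
    return 0

def step(cell, action):
    """Slippery moves: with probability 0.1 the action has no effect."""
    if random.random() < 0.1:
        return cell
    return min(max(cell + action, 0), CELLS - 1)

Q = {((c, q), a): 0.0 for c in range(CELLS) for q in range(3) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.95, 0.2

for episode in range(5000):
    cell, q = START, 0
    for _ in range(50):
        state = (cell, q)
        action = (random.choice(ACTIONS) if random.random() < epsilon
                  else max(ACTIONS, key=lambda a: Q[(state, a)]))
        next_cell = step(cell, action)
        next_q = monitor(q, next_cell)
        reward = 1.0 if (q == 0 and next_q == 1) else 0.0
        best_next = max(Q[((next_cell, next_q), a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        cell, q = next_cell, next_q
        if q != 0:
            break

# Greedy actions for the reachable cells while the monitor is still running.
policy = {c: max(ACTIONS, key=lambda a: Q[((c, 0), a)]) for c in range(1, CELLS - 1)}
print("greedy actions while the monitor is running:", policy)
```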
Coffee break
11:00–12:30
11:00 - 11:45 Joost-Pieter Katoen (RWTH Aachen University) Bayesian Inference by Program Verification
Abstract:
In this talk, I will give a perspective on inference in Bayesian networks (BNs) using program verification. I will argue how weakest precondition reasoning à la Dijkstra can be used for exact inference (and more). As exact inference is NP-complete, inference is typically done by means of simulation. I will show how, by means of wp-reasoning, exact expected sampling times of BNs can be obtained in a fully automated fashion. An experimental evaluation on BN benchmarks demonstrates that very large expected sampling times (on the order of millions of years) can be inferred in less than a second. This provides a means to decide whether sampling-based methods are appropriate for a given BN.
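As a back-of-the-envelope illustration of why expected sampling times matter (a toy calculation, not the wp-based technique from the talk): for rejection sampling, the expected number of attempts until a sample is consistent with the evidence is 1/P(evidence), so rare evidence makes sampling very slow. The tiny network below and its probabilities are invented.

```python
# Toy illustration (not the wp-based method from the talk): for rejection
# sampling from a Bayesian network, the expected number of attempts per accepted
# sample is 1 / P(evidence). We compute it exactly by enumeration for a tiny
# invented network  A -> B  with  P(A)=0.001,  P(B|A)=0.9,  P(B|~A)=0.0001,
# conditioning on the evidence B = true.
from itertools import product

p_a = 0.001
p_b_given = {True: 0.9, False: 0.0001}

p_evidence = 0.0
for a, b in product([True, False], repeat=2):
    p = (p_a if a else 1 - p_a) * (p_b_given[a] if b else 1 - p_b_given[a])
    if b:                       # evidence: B is observed to be true
        p_evidence += p

expected_attempts = 1 / p_evidence
print("P(evidence)       =", p_evidence)
print("expected attempts =", round(expected_attempts, 1))
print("attempts for 1000 accepted samples ~", round(1000 * expected_attempts))
```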
11:45 - 12:30 Sam Staton (Oxford) Semantic models for higher-order Bayesian inference
Abstract:
In this talk I will discuss probabilistic programming as a method of Bayesian modelling and inference, with a focus on fully featured probabilistic programming languages with higher order functions, soft constraints, and continuous distributions. These languages are pushing the limits of expressivity even for "non-parametric" models. I will describe a new foundation, "quasi-Borel spaces", for higher order measure theory, and I'll present a new modular inference library whose correctness is derived from this foundation.
This is based on papers at ESOP 2017, LICS 2017 and POPL 2018, joint work with Y Cai, Z Ghahramani, C Heunen, O Kammar, SK Moss, K Ostermann, A Ścibior, M Vákár, H. Yang.
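To fix intuitions about the language features mentioned above (higher-order functions, soft constraints, continuous distributions), here is a small Python sketch of likelihood-weighted inference in which the model is an ordinary higher-order function calling `sample` and `score` primitives. It is only a toy approximation of what a fully featured probabilistic programming language provides; all names and the example model are invented.

```python
# Toy sketch of a higher-order probabilistic program with soft constraints,
# interpreted by likelihood weighting (importance sampling). This only gestures
# at the language features discussed in the talk; it is not a real PPL and all
# names here are invented for the illustration.
import math
import random

random.seed(0)

def run(model, num_particles=50_000):
    """Interpret `model`, a function of (sample, score), by likelihood weighting."""
    results, weights = [], []
    for _ in range(num_particles):
        log_weight = [0.0]
        def sample(dist):            # draw from a prior distribution (a thunk)
            return dist()
        def score(log_likelihood):   # soft constraint: multiply in a likelihood
            log_weight[0] += log_likelihood
        results.append(model(sample, score))
        weights.append(math.exp(log_weight[0]))
    total = sum(weights)
    return sum(r * w for r, w in zip(results, weights)) / total

def gaussian_mean_model(sample, score):
    """Infer the mean of a Gaussian from two noisy observations."""
    mu = sample(lambda: random.gauss(0.0, 5.0))        # prior on the mean
    for observation in (2.1, 1.9):                      # soft constraints
        score(-0.5 * (observation - mu) ** 2)           # log N(obs | mu, 1), up to a constant
    return mu

print("posterior mean estimate ~", round(run(gaussian_mean_model), 2))
```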
Lunch
Friday 12th afternoon (chair: Stephen H. Muggleton)
14:00–15:30
14:00 - 14:45 Richard Evans (DeepMind) Learning Explanatory Rules from Noisy Data
Abstract:
Artificial Neural Networks are powerful function approximators capable of modelling solutions to a wide variety of problems, both supervised and unsupervised. As their size and expressivity increases, so too does the variance of the model, yielding a nearly ubiquitous overfitting problem. Although mitigated by a variety of model regularisation methods, the common cure is to seek large amounts of training data—which is not necessarily easily obtained—that sufficiently approximates the data distribution of the domain we wish to test on. In contrast, logic programming methods such as Inductive Logic Programming offer an extremely data-efficient process by which models can be trained to reason on symbolic domains. However, these methods are unable to deal with the variety of domains neural networks can be applied to: they are not robust to noise in or mislabelling of inputs, and perhaps more importantly, cannot be applied to non-symbolic domains where the data is ambiguous, such as operating on raw pixels. In this paper, we propose a Differentiable Inductive Logic framework (∂ILP), which can not only solve tasks which traditional ILP systems are suited for, but shows a robustness to noise and error in the training data which ILP cannot cope with. Furthermore, as it is trained by back-propagation against a likelihood objective, it can be hybridised by connecting it with neural networks over ambiguous data in order to be applied to domains which ILP cannot address, while providing data efficiency and generalisation beyond what neural networks on their own can achieve.
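To convey the flavour of differentiable rule induction (a deliberately crude toy, not the ∂ILP architecture): each candidate clause is summarised only by the set of atoms it would derive on a tiny dataset, a sigmoid weight per clause is learned by gradient descent on a cross-entropy loss against noisy labels, and the correct clause survives the noise. The clauses, data and hyperparameters are invented.

```python
# Deliberately crude toy (not the dILP architecture): learn which candidate
# clause explains noisy labels. Each clause is summarised by the 0/1 vector of
# atoms it derives on a small dataset; a per-clause sigmoid weight is trained by
# gradient descent (numerical gradients for brevity) on a cross-entropy loss.
import math

# Atoms 0..9; the "true" rule derives the even ones. Candidate clause extensions:
candidates = {
    "even(X)":   [1 if i % 2 == 0 else 0 for i in range(10)],
    "small(X)":  [1 if i < 5 else 0 for i in range(10)],
    "always(X)": [1] * 10,
}
# Noisy labels: the even atoms, with one mislabelled example.
labels = [1 if i % 2 == 0 else 0 for i in range(10)]
labels[4] = 0

names = list(candidates)
weights = {name: 0.0 for name in names}

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(ws):
    """Soft disjunction of the weighted clauses, evaluated per atom."""
    out = []
    for i in range(10):
        p = 1.0
        for name in names:
            p *= 1.0 - sigmoid(ws[name]) * candidates[name][i]
        out.append(1.0 - p)
    return out

def loss(ws):
    eps = 1e-6
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(labels, predict(ws)))

for step in range(2000):            # gradient descent with forward-difference gradients
    for name in names:
        bumped = dict(weights)
        bumped[name] += 1e-4
        grad = (loss(bumped) - loss(weights)) / 1e-4
        weights[name] -= 0.05 * grad

print({name: round(sigmoid(w), 2) for name, w in weights.items()})
```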
14:45 - 15:30 Tim Rocktäschel (University of Oxford) End-to-End Differentiable Proving
Abstract:
We introduce neural networks for end-to-end differentiable proving of queries to knowledge bases by operating on dense vector representations of symbols. These neural networks are constructed recursively by taking inspiration from the backward chaining algorithm as used in Prolog. Specifically, we replace symbolic unification with a differentiable computation on vector representations of symbols using a radial basis function kernel, thereby combining symbolic reasoning with learning subsymbolic vector representations. By using gradient descent, the resulting neural network can be trained to infer facts from a given incomplete knowledge base. It learns to (i) place representations of similar symbols in close proximity in a vector space, (ii) make use of such similarities to prove queries, (iii) induce logical rules, and (iv) use provided and induced logical rules for multi-hop reasoning. We demonstrate that this architecture outperforms ComplEx, a state-of-the-art neural link prediction model, on three out of four benchmark knowledge bases while at the same time inducing interpretable function-free first-order logic rules.
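As a tiny illustration of soft unification with a radial basis function kernel (a toy, not the full recursive prover): symbols get vector representations, the unification score of two symbols is exp(-||a-b||²/(2μ²)), and a query is answered by the best-scoring one-step match against a small knowledge base. The embeddings, facts and kernel width are invented; in the real model the representations are learned by backpropagation through proofs.

```python
# Toy illustration of soft unification via an RBF kernel (not the full recursive
# neural theorem prover): symbols are dense vectors, the unification score of two
# symbols is exp(-||a - b||^2 / (2 mu^2)), and a query is answered by the
# best-scoring fact. The embeddings, facts and kernel width are invented.
import numpy as np

mu = 1.0
embedding = {                      # invented, pretend-learned vectors
    "grandpaOf":     np.array([0.9, 0.1]),
    "grandfatherOf": np.array([1.0, 0.0]),
    "parentOf":      np.array([-0.8, 0.6]),
    "ABE": np.array([0.3, 0.3]), "HOMER": np.array([-0.2, 0.9]),
    "BART": np.array([0.7, -0.4]),
}

def unify(symbol_a, symbol_b):
    """Soft unification score of two symbols (1.0 means a perfect match)."""
    d = embedding[symbol_a] - embedding[symbol_b]
    return float(np.exp(-d @ d / (2 * mu ** 2)))

facts = [("grandfatherOf", "ABE", "BART"), ("parentOf", "HOMER", "BART")]

def prove(query):
    """Score of the best one-step proof: softly unify the query with each fact."""
    rel, arg1, arg2 = query
    return max(min(unify(rel, frel), unify(arg1, fa1), unify(arg2, fa2))
               for frel, fa1, fa2 in facts)

print("grandpaOf(ABE, BART)  ", round(prove(("grandpaOf", "ABE", "BART")), 3))
print("parentOf(ABE, BART)   ", round(prove(("parentOf", "ABE", "BART")), 3))
```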