publications
- tl;dr: KGE models such as CP, RESCAL, TuckER, ComplEx can be re-interpreted as circuits to unlock their generative capabilities, scaling up inference and learning, and guaranteeing the satisfaction of logical constraints by design. [ paper | bibtex | abstract ]
Some of the most successful knowledge graph embedding (KGE) models for link prediction -- CP, RESCAL, TuckER, ComplEx -- can be interpreted as energy-based models. Under this perspective, however, they are not amenable to exact maximum-likelihood estimation (MLE) or sampling, and they struggle to integrate logical constraints. This work re-interprets the score functions of these KGEs as circuits -- constrained computational graphs allowing efficient marginalisation. Then, we design two recipes to obtain efficient generative circuit models by either restricting their activations to be non-negative or squaring their outputs. Our interpretation comes with little or no loss of performance for link prediction, while the circuits framework unlocks exact learning by MLE, efficient sampling of new triples, and the guarantee that logical constraints are satisfied by design. Furthermore, our models scale more gracefully than the original KGEs on graphs with millions of entities.
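To give a flavour of the non-negative recipe, here is a minimal sketch (with assumed shapes and variable names, not the paper's code): if the CP factors are restricted to be non-negative, the score becomes an unnormalised mass over triples, and the partition function factorises along the rank dimension, so it never enumerates all subject-relation-object combinations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, rank = 1000, 50, 32

# Non-negative CP factors (e.g. obtained by exponentiating real embeddings).
E_subj = np.exp(rng.normal(size=(n_entities, rank)))
R = np.exp(rng.normal(size=(n_relations, rank)))
E_obj = np.exp(rng.normal(size=(n_entities, rank)))

def score(s, r, o):
    # CP score of a triple: sum_k E_subj[s,k] * R[r,k] * E_obj[o,k] >= 0.
    return np.sum(E_subj[s] * R[r] * E_obj[o])

# Partition function: the sum over all triples factorises along the rank,
# costing O((n_entities + n_relations) * rank) rather than
# O(n_entities^2 * n_relations).
Z = np.sum(E_subj.sum(axis=0) * R.sum(axis=0) * E_obj.sum(axis=0))

def prob(s, r, o):
    # Exactly normalised probability of a triple under the generative model.
    return score(s, r, o) / Z

print(prob(0, 0, 1))
```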
@article{loconte2023gekcs,
title={How to Turn Your Knowledge Graph Embeddings into Generative Models via Probabilistic Circuits},
author={Loconte, Lorenzo and Di Mauro, Nicola and Peharz, Robert and Vergari, Antonio},
journal={arXiv preprint arXiv:2305.15944},
year={2023} }
- tl;dr: We analyze causes and mitigation of reasoning shortcuts in neuro-symbolic models, which let them make the right predictions but for the wrong reasons. [ paper | bibtex | abstract ]
Neuro-Symbolic (NeSy) predictive models hold the promise of improved compliance with given constraints, systematic generalization, and interpretability, as they allow inferring labels that are consistent with some prior knowledge by reasoning over high-level concepts extracted from sub-symbolic inputs. It was recently shown that NeSy predictors are affected by reasoning shortcuts: they can attain high accuracy, but do so by leveraging concepts with unintended semantics, thus falling short of their promised advantages. Yet, a systematic characterization of reasoning shortcuts and of potential mitigation strategies is missing. This work fills this gap by characterizing them as unintended optima of the learning objective and identifying four key conditions behind their occurrence. Based on this, we derive several natural mitigation strategies, and analyze their efficacy both theoretically and empirically. Our analysis shows that reasoning shortcuts are difficult to deal with, casting doubt on the trustworthiness and interpretability of existing NeSy solutions.
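As a toy illustration of a reasoning shortcut (a hypothetical example, not the paper's formal setup): with prior knowledge y = c1 XOR c2, a concept extractor that flips both concepts predicts every label correctly while getting every concept wrong, so a label-only objective cannot tell it apart from the intended one.

```python
from itertools import product

# Prior knowledge: the label is the XOR of two binary concepts.
label = lambda c1, c2: c1 ^ c2

# Intended concept extractor vs. a shortcut that flips both concepts.
intended = lambda c1, c2: (c1, c2)
shortcut = lambda c1, c2: (1 - c1, 1 - c2)  # every concept prediction is wrong

for name, g in [("intended", intended), ("shortcut", shortcut)]:
    ok = all(label(*g(c1, c2)) == label(c1, c2)
             for c1, c2 in product([0, 1], repeat=2))
    print(name, "attains perfect label accuracy:", ok)  # both print True

# Both extractors are optima of a label-only objective, but the shortcut
# attaches unintended semantics to the concepts.
```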
@article{marconato2023shortcuts,
title={Not All Neuro-Symbolic Concepts Are Created Equal: Analysis and Mitigation of Reasoning Shortcuts},
author={Marconato, Emanuele and Teso, Stefano and Vergari, Antonio and Passerini, Andrea},
journal={arXiv preprint},
year={2023}}
- tl;dr: We disentangle the few structural choices that can characterize the performance of modern circuit architectures in terms of expressiveness and computational complexity. [ paper | bibtex | abstract ]
Tensorizing probabilistic circuits (PCs) -- structured computational graphs capable of efficiently and accurately performing various probabilistic reasoning tasks -- is the go-to way to represent and learn these models. This paper systematically explores the architectural options employed in modern overparameterized PCs, namely RAT-SPNs, EiNets, and HCLTs, and unifies them into a single algorithmic framework. By trying to compress the existing overparameterized layers via low-rank decompositions, we discover alternative parameterizations that possess the same expressive power but are computationally more efficient. This emphasizes the possibility of “mixing & matching” different design choices to create new PCs and helps to disentangle the few that really matter.
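A minimal sketch of the low-rank idea (illustrative shapes and names, not the paper's code): a dense sum layer mixing K_in inner units into K_out outputs via a matrix W can be replaced by a factorisation W ≈ AB with inner rank r, trading one large matrix multiplication for two small ones while computing the same family of functions.

```python
import numpy as np

rng = np.random.default_rng(0)
K_in, K_out, r, batch = 256, 256, 16, 8

x = rng.random((batch, K_in))   # sum-layer inputs, in linear space

# Dense (overparameterized) sum layer: K_out * K_in parameters.
W = rng.random((K_out, K_in))
y_dense = x @ W.T

# Low-rank parameterization: W ~= A @ B, with (K_out + K_in) * r parameters.
A = rng.random((K_out, r))
B = rng.random((r, K_in))
y_lowrank = (x @ B.T) @ A.T     # two cheap matmuls instead of one big one

print(np.allclose(y_lowrank, x @ (A @ B).T))  # True: same layer function
print(W.size, A.size + B.size)                # 65536 vs 8192 parameters
```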
@inproceedings{mari2023lorapc,
title={Unifying and Understanding Overparameterized Circuit Representations via Low-Rank Tensor Decompositions},
author={Mari, Antonio and Vessio, Gennaro and Vergari, Antonio},
booktitle={The 6th Workshop on Tractable Probabilistic Modeling},
year={2023}}
- tl;dr: We propose to build (hierarchical) negative mixture models by squaring circuits. We theoretically prove their expressiveness by deriving an exponential lower bound on the size of circuits with positive parameters only. [ paper | bibtex | abstract ]
Negative mixture models (NMMs) can potentially be more expressive than classical non-negative ones by allowing negative coefficients, thus greatly reducing the number of components and parameters to fit. However, modeling NMMs poses a number of challenges, from ensuring that negative combinations still encode valid densities or masses, to effectively learning them from data. In this paper, we investigate how to model both shallow and hierarchical NMMs in a generic framework, via squaring. We do so by representing NMMs as probabilistic circuits (PCs) -- structured computational graphs that ensure tractability. Then, we show when and how these squared NMMs can be represented efficiently as tensorized computational graphs, while theoretically proving that, for certain function classes, allowing negative parameters can exponentially reduce the model size.
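A minimal sketch of a shallow squared NMM over Gaussians (assumed components and coefficients, for illustration only): the model is q(x) = (Σ_k w_k N(x; μ_k, σ_k²))² / Z with possibly negative w_k, and Z is available in closed form because the overlap integral of two Gaussian densities is itself a Gaussian evaluation, ∫ N(x; μ_i, σ_i²) N(x; μ_j, σ_j²) dx = N(μ_i; μ_j, σ_i² + σ_j²).

```python
import numpy as np
from scipy.stats import norm

# Components and (possibly negative) coefficients of a squared mixture.
w = np.array([1.0, -0.6])
mu = np.array([0.0, 0.0])
sigma = np.array([1.0, 0.5])

# Z = sum_{i,j} w_i w_j * integral of N(x; mu_i, s_i^2) N(x; mu_j, s_j^2) dx,
# where each cross-term is N(mu_i; mu_j, s_i^2 + s_j^2) in closed form.
cross = norm.pdf(mu[:, None], loc=mu[None, :],
                 scale=np.sqrt(sigma[:, None] ** 2 + sigma[None, :] ** 2))
Z = w @ cross @ w

def density(x):
    # Squared (possibly negative) combination, exactly renormalised.
    f = np.sum(w * norm.pdf(x, loc=mu, scale=sigma))
    return f ** 2 / Z

# Sanity check: the density numerically integrates to ~1.
xs = np.linspace(-8.0, 8.0, 16001)
print(sum(density(x) for x in xs) * (xs[1] - xs[0]))
```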
@inproceedings{loconte2023nmm,
title={Negative Mixture Models via Squaring: Representation and Learning},
author={Lorenzo Loconte and Stefan Mengel and Nicolas Gillis and Antonio Vergari},
booktitle={The 6th Workshop on Tractable Probabilistic Modeling},
year={2023}}
- tl;dr: We design a differentiable layer that can be plugged into any neural network to guarantee that predictions are always consistent with a set of predefined symbolic constraints and can be trained end-to-end. [ paper | supplemental | code | bibtex | abstract ]
We design a predictive layer for structured-output prediction (SOP) that can be plugged into any neural network guaranteeing its predictions are consistent with a set of predefined symbolic constraints. Our Semantic Probabilistic Layer (SPL) can model intricate correlations, and hard constraints, over a structured output space all while being amenable to end-to-end learning via maximum likelihood. SPLs combine exact probabilistic inference with logical reasoning in a clean and modular way, learning complex distributions and restricting their support to solutions of the constraint. As such, they can faithfully, and efficiently, model complex SOP tasks beyond the reach of alternative neuro-symbolic approaches. We empirically demonstrate that SPLs outperform these competitors in terms of accuracy on challenging SOP tasks such as hierarchical multi-label classification, pathfinding and preference learning, while retaining perfect constraint satisfaction.
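A rough sketch of the underlying idea on an enumerable output space (a toy stand-in, not the paper's implementation, which uses circuit pairs to avoid enumeration): multiply an unconstrained distribution over labelings by a constraint indicator and renormalise, so invalid outputs get exactly zero probability.

```python
import numpy as np
from itertools import product

# Toy constraint over 3 binary labels: exactly one label may be on.
def constraint(y):
    return sum(y) == 1

ys = list(product([0, 1], repeat=3))

# Stand-in for the neural net's unnormalised label scores q(y | x).
rng = np.random.default_rng(0)
q = np.exp(rng.normal(size=len(ys)))

# SPL-style output: restrict the support to constraint-satisfying outputs
# and renormalise over them.
mask = np.array([constraint(y) for y in ys], dtype=float)
p = q * mask
p /= p.sum()

for y, pi in zip(ys, p):
    print(y, round(pi, 3))  # invalid outputs get exactly zero probability
```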
@inproceedings{ahmed2022spl,
author = {Ahmed, Kareem and Teso, Stefano and Chang, Kai-Wei and Van den Broeck, Guy and Vergari, Antonio},
booktitle = {Advances in Neural Information Processing Systems},
editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},
pages = {29944--29959},
publisher = {Curran Associates, Inc.},
title = {Semantic Probabilistic Layers for Neuro-Symbolic Learning},
url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/c182ec594f38926b7fcb827635b9a8f4-Paper-Conference.pdf},
volume = {35},
year = {2022} }
- tl;dr: We cast chemical reaction prediction as algebraic reasoning to evaluate the reasoning capabilities of Transformers and provide a challenging benchmark for it. [ paper | bibtex | abstract ]
While showing impressive performance on various kinds of learning tasks, it is yet unclear whether deep learning models have the ability to robustly tackle reasoning tasks. Measuring the robustness of reasoning in machine learning models is challenging as one needs to provide a task that cannot be easily shortcut by exploiting spurious statistical correlations in the data, while operating on complex objects and constraints. To address this issue, we propose ChemAlgebra, a benchmark for measuring the reasoning capabilities of deep learning models through the prediction of stoichiometrically-balanced chemical reactions. ChemAlgebra requires manipulating sets of complex discrete objects -- molecules represented as formulas or graphs -- under algebraic constraints such as the mass preservation principle. We believe that ChemAlgebra can serve as a useful test bed for the next generation of machine reasoning models and as a promoter of their development.
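To make the mass-preservation constraint concrete, here is a simplified sketch of a balance check (illustrative helper names, no nested parenthesised groups, not the benchmark's code): parse each formula into atom counts and verify that both sides of the reaction agree.

```python
import re
from collections import Counter

def atom_counts(formula):
    # Parse e.g. "2H2O" -> {'H': 4, 'O': 2}; no nested groups for simplicity.
    mult_match = re.match(r"^(\d+)", formula)
    mult = int(mult_match.group(1)) if mult_match else 1
    body = formula[mult_match.end():] if mult_match else formula
    counts = Counter()
    for elem, num in re.findall(r"([A-Z][a-z]?)(\d*)", body):
        counts[elem] += mult * (int(num) if num else 1)
    return counts

def is_balanced(reaction):
    # A reaction preserves mass iff atom counts match on both sides.
    lhs, rhs = reaction.split("->")
    total = lambda side: sum((atom_counts(t.strip())
                              for t in side.split("+")), Counter())
    return total(lhs) == total(rhs)

print(is_balanced("2H2 + O2 -> 2H2O"))  # True
print(is_balanced("H2 + O2 -> H2O"))    # False
```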
@article{valenti2022chemalgebra,
title={ChemAlgebra: Algebraic Reasoning on Chemical Reactions},
author={Valenti, Andrea and Bacciu, Davide and Vergari, Antonio},
journal={arXiv preprint arXiv:2210.02095},
year={2022} }
- tl;dr: A systematic framework in which tractable inference routines can be broken down into smaller and composable primitives operating on circuit representations. [ paper | supplemental | code | bibtex | abstract ]
Circuit representations are becoming the lingua franca to express and reason about tractable generative and discriminative models. In this paper, we show how complex inference scenarios for these models that commonly arise in machine learning---from computing the expectations of decision tree ensembles to information-theoretic divergences of sum-product networks---can be represented in terms of tractable modular operations over circuits. Specifically, we characterize the tractability of simple transformations---sums, products, quotients, powers, logarithms, and exponentials---in terms of sufficient structural constraints of the circuits they operate on, and present novel hardness results for the cases in which these properties are not satisfied. Building on these operations, we derive a unified framework for reasoning about tractable models that generalizes several results in the literature and opens up novel tractable inference scenarios.
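A minimal sketch of the compositional flavour in the simplest case (fully factorised distributions, i.e. trivially compatible circuits; illustrative names, not the paper's code): the expectation E_p[q], which is a product of two circuits followed by marginalisation, decomposes into one small per-variable sum. The paper's structural conditions generalise this beyond full factorisation.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n_vars, n_states = 5, 3

# Two fully factorised distributions over the same discrete variables.
p = rng.dirichlet(np.ones(n_states), size=n_vars)
q = rng.dirichlet(np.ones(n_states), size=n_vars)

# E_p[q] = sum_x p(x) q(x): a circuit product followed by marginalisation.
# For factorised circuits it decomposes into one small sum per variable.
expectation = np.prod([(p[i] * q[i]).sum() for i in range(n_vars)])

# Brute-force check over all n_states ** n_vars assignments.
brute = sum(
    np.prod([p[i, x[i]] for i in range(n_vars)]) *
    np.prod([q[i, x[i]] for i in range(n_vars)])
    for x in product(range(n_states), repeat=n_vars)
)
print(np.isclose(expectation, brute))  # True
```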
@inproceedings{vergari2021atlas,
author = {Vergari, Antonio and Choi, YooJung and Liu, Anji and Teso, Stefano and Van den Broeck, Guy},
booktitle = {Advances in Neural Information Processing Systems},
editor = {M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan},
pages = {13189--13201},
publisher = {Curran Associates, Inc.},
title = {A Compositional Atlas of Tractable Circuit Operations for Probabilistic Inference},
url = {https://proceedings.neurips.cc/paper_files/paper/2021/file/6e01383fd96a17ae51cc3e15447e7533-Paper.pdf},
volume = {34},
year = {2021} }