People
Latest Research Publications:
When training neural networks as classifiers, it is common to observe an increase in average test loss while still maintaining or improving the overall classification accuracy on the same dataset. In spite of the ubiquity of this phenomenon, it has not been well studied and is often dismissively attributed to an increase in borderline correct classifications. We present an empirical investigation that shows how this phenomenon is actually a result of the differential manner by which test samples are processed. In essence: test loss does not increase overall, but only for a small minority of samples. Large representational capacities allow losses to decrease for the vast majority of test samples at the cost of extreme increases for others. This effect seems to be mainly caused by increased parameter values relating to the correctly processed sample features. Our findings contribute to the practical understanding of a common behaviour of deep neural networks. We also discuss the implications of this work for network optimisation and generalisation.
@article{484, author = {Arthur Venter, Marthinus Theunissen, Marelie Davel}, title = {Pre-interpolation loss behaviour in neural networks}, abstract = {When training neural networks as classifiers, it is common to observe an increase in average test loss while still maintaining or improving the overall classification accuracy on the same dataset. In spite of the ubiquity of this phenomenon, it has not been well studied and is often dismissively attributed to an increase in borderline correct classifications. We present an empirical investigation that shows how this phenomenon is actually a result of the differential manner by which test samples are processed. In essence: test loss does not increase overall, but only for a small minority of samples. Large representational capacities allow losses to decrease for the vast majority of test samples at the cost of extreme increases for others. This effect seems to be mainly caused by increased parameter values relating to the correctly processed sample features. Our findings contribute to the practical understanding of a common behaviour of deep neural networks. We also discuss the implications of this work for network optimisation and generalisation.}, year = {2020}, journal = {Communications in Computer and Information Science}, volume = {1342}, pages = {296-309}, publisher = {Southern African Conference for Artificial Intelligence Research}, address = {South Africa}, isbn = {978-3-030-66151-9}, doi = {https://doi.org/10.1007/978-3-030-66151-9_19}, }
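As an illustration of the averaging effect described in this abstract, the following minimal numpy sketch uses invented per-sample probabilities (not the paper's data) to show how mean test loss can rise while accuracy is unchanged: most samples' loss shrinks slightly while a small minority's loss grows extremely.

```python
# Toy illustration only: invented per-sample probabilities, not the paper's data.
import numpy as np

n = 100
# Probability assigned to the true class for each test sample at an early and
# a later stage of training.
p_early = np.full(n, 0.70)
p_early[:3] = 0.40            # three samples are already misclassified
p_late = np.full(n, 0.80)     # 97 samples: more confident, lower loss
p_late[:3] = 0.001            # the same three samples: extreme loss increase

def mean_ce(p):               # average cross-entropy on the true class
    return float(np.mean(-np.log(p)))

def accuracy(p):              # correctness simplified as p_true > 0.5
    return float(np.mean(p > 0.5))

print(mean_ce(p_early), accuracy(p_early))  # ~0.37 mean loss, 0.97 accuracy
print(mean_ce(p_late), accuracy(p_late))    # ~0.42 mean loss, 0.97 accuracy
```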
The understanding of generalisation in machine learning is in a state of flux, in part due to the ability of deep learning models to interpolate noisy training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about the bias-variance trade-off in learning. We expand upon relevant existing work by discussing local attributes of neural network training within the context of a relatively simple framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the deep learning model to generalise in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterised multilayer perceptrons and controlled training data noise. The main insights are that deep learning models are optimised for training data modularly, with different regions in the function space dedicated to fitting distinct types of sample information. Additionally, we show that models tend to fit uncorrupted samples first. Based on this finding, we propose a conjecture to explain an observed instance of the epoch-wise double-descent phenomenon. Our findings suggest that the notion of model capacity needs to be modified to consider the distributed way training data is fitted across sub-units.
@article{394, author = {Marthinus Theunissen, Marelie Davel, Etienne Barnard}, title = {Benign interpolation of noise in deep learning}, abstract = {The understanding of generalisation in machine learning is in a state of flux, in part due to the ability of deep learning models to interpolate noisy training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about the bias-variance trade-off in learning. We expand upon relevant existing work by discussing local attributes of neural network training within the context of a relatively simple framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the deep learning model to generalise in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterised multilayer perceptrons and controlled training data noise. The main insights are that deep learning models are optimised for training data modularly, with different regions in the function space dedicated to fitting distinct types of sample information. Additionally, we show that models tend to fit uncorrupted samples first. Based on this finding, we propose a conjecture to explain an observed instance of the epoch-wise double-descent phenomenon. Our findings suggest that the notion of model capacity needs to be modified to consider the distributed way training data is fitted across sub-units.}, year = {2020}, journal = {South African Computer Journal}, volume = {32}, pages = {80-101}, issue = {2}, publisher = {South African Institute of Computer Scientists and Information Technologists}, isbn = {ISSN: 1015-7999; E:2313-7835}, doi = {https://doi.org/10.18489/sacj.v32i2.833}, }
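A small runnable sketch of the kind of controlled label-noise experiment the abstract describes: a fraction of training labels is corrupted and training accuracy is tracked separately on the clean and corrupted subsets. The dataset, model size and hyperparameters below are illustrative choices, not the paper's setup; the expectation, in line with the abstract, is that the clean samples are fitted first.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = load_digits(return_X_y=True)

# Corrupt 20% of the training labels with random classes.
y_noisy = y.copy()
noisy_idx = rng.choice(len(y), size=len(y) // 5, replace=False)
y_noisy[noisy_idx] = rng.integers(0, 10, size=len(noisy_idx))
clean_idx = np.setdiff1d(np.arange(len(y)), noisy_idx)

clf = MLPClassifier(hidden_layer_sizes=(256,), random_state=0)
classes = np.unique(y)
for epoch in range(40):
    clf.partial_fit(X, y_noisy, classes=classes)   # one pass per call
    pred = clf.predict(X)
    clean_acc = np.mean(pred[clean_idx] == y_noisy[clean_idx])
    noisy_acc = np.mean(pred[noisy_idx] == y_noisy[noisy_idx])
    print(f"epoch {epoch:2d}  clean acc {clean_acc:.3f}  corrupted acc {noisy_acc:.3f}")
```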
A robust theoretical framework that can describe and predict the generalization ability of deep neural networks (DNNs) in general circumstances remains elusive. Classical attempts have produced complexity metrics that rely heavily on global measures of compactness and capacity with little investigation into the effects of sub-component collaboration. We demonstrate intriguing regularities in the activation patterns of the hidden nodes within fully-connected feedforward networks. By tracing the origin of these patterns, we show how such networks can be viewed as the combination of two information processing systems: one continuous and one discrete. We describe how these two systems arise naturally from the gradient-based optimization process, and demonstrate the classification ability of the two systems, individually and in collaboration. This perspective on DNN classification offers a novel way to think about generalization, in which different subsets of the training data are used to train distinct classifiers; those classifiers are then combined to perform the classification task, and their consistency is crucial for accurate classification.
@{236, author = {Marelie Davel, Marthinus Theunissen, Arnold Pretorius, Etienne Barnard}, title = {DNNs as layers of cooperating classifiers}, abstract = {A robust theoretical framework that can describe and predict the generalization ability of deep neural networks (DNNs) in general circumstances remains elusive. Classical attempts have produced complexity metrics that rely heavily on global measures of compactness and capacity with little investigation into the effects of sub-component collaboration. We demonstrate intriguing regularities in the activation patterns of the hidden nodes within fully-connected feedforward networks. By tracing the origin of these patterns, we show how such networks can be viewed as the combination of two information processing systems: one continuous and one discrete. We describe how these two systems arise naturally from the gradient-based optimization process, and demonstrate the classification ability of the two systems, individually and in collaboration. This perspective on DNN classification offers a novel way to think about generalization, in which different subsets of the training data are used to train distinct classifiers; those classifiers are then combined to perform the classification task, and their consistency is crucial for accurate classification.}, year = {2020}, journal = {The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)}, pages = {3725 - 3732}, month = {07/02-12/02/2020}, address = {New York}, }
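The "discrete system" idea can be sketched as follows: train a small ReLU MLP, read off each sample's binary pattern of active hidden nodes, and classify test samples by the nearest training pattern in Hamming distance. The dataset, network size and nearest-pattern rule are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
clf.fit(Xtr, ytr)

def hidden_pattern(data):
    # ReLU node is "on" where its pre-activation is positive.
    pre = data @ clf.coefs_[0] + clf.intercepts_[0]
    return pre > 0

Ptr, Pte = hidden_pattern(Xtr), hidden_pattern(Xte)

# Nearest training pattern by Hamming distance acts as a purely discrete classifier.
dists = (Pte[:, None, :] != Ptr[None, :, :]).sum(axis=2)
discrete_pred = ytr[np.argmin(dists, axis=1)]

print("continuous MLP accuracy:    ", clf.score(Xte, yte))
print("discrete-pattern accuracy:  ", np.mean(discrete_pred == yte))
```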
The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.
@{284, author = {Marthinus Theunissen, Marelie Davel, Etienne Barnard}, title = {Insights regarding overfitting on noise in deep learning}, abstract = {The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.}, year = {2019}, journal = {South African Forum for Artificial Intelligence Research (FAIR)}, pages = {49-63}, address = {Cape Town, South Africa}, }
Latest Research Publications:

TALKS:
1) 'Toward a Coherent Account of Moral Agency' (FAIR 2019);
2) 'Functional Moral Agency' (4IR: Philosophical, Ethical and Legal Perspectives 2019).
PUBLICATIONS:
1) Tollon, Fabio. 2020. The Artificial View: toward a non-anthropocentric account of moral patiency. Ethics and Information Technology. https://doi.org/10.1007/s10676-020-09540-4;
2) Tollon, Fabio. 2019. Moral Agents or Mindless Machines? A critical appraisal of agency in artificial systems. Hungarian Philosophical Review 63(4), pp. 9-23;
3) Tollon, Fabio. 2019. Toward a Coherent Account of Moral Agency. Proceedings of the South African Forum for Artificial Intelligence Research. Vol 2540. http://ceur-ws.org/Vol-2540/.
Latest Research Publications:
Up to 70% of all watch time on YouTube is due to the suggested content of its recommender system. This system has been found, by virtue of its design, to be promoting conspiratorial content. In this paper, the author firstly critiques the value neutrality thesis regarding technology, showing it to be philosophically untenable. This means that technological artefacts can influence what people come to value (or perhaps even embody values themselves) and change the moral evaluation of an action. Secondly, he introduces the concept of an affordance, borrowed from the literature on ecological psychology. This concept allows him to make salient how technologies come to solicit certain kinds of actions from users, making such actions more or less likely, and in this way influencing the kinds of things one comes to value. Thirdly, he critically assesses the results of a study by Alfano et al. He makes use of the literature on affordances, introduced earlier, to shed light on how these technological systems come to mediate our perception of the world and influence action.
@article{415, author = {Fabio Tollon}, title = {Designed to Seduce: Epistemically Retrograde Ideation and YouTube's Recommender System}, abstract = {Up to 70% of all watch time on YouTube is due to the suggested content of its recommender system. This system has been found, by virtue of its design, to be promoting conspiratorial content. In this paper, the author firstly critiques the value neutrality thesis regarding technology, showing it to be philosophically untenable. This means that technological artefacts can influence what people come to value (or perhaps even embody values themselves) and change the moral evaluation of an action. Secondly, he introduces the concept of an affordance, borrowed from the literature on ecological psychology. This concept allows him to make salient how technologies come to solicit certain kinds of actions from users, making such actions more or less likely, and in this way influencing the kinds of things one comes to value. Thirdly, he critically assesses the results of a study by Alfano et al. He makes use of the literature on affordances, introduced earlier, to shed light on how these technological systems come to mediate our perception of the world and influence action.}, year = {2021}, journal = {International Journal of Technoethics (IJT)}, volume = {12}, issue = {2}, publisher = {IGI Global}, isbn = {9781799861492}, url = {https://www.igi-global.com/gateway/article/281077}, doi = {10.4018/IJT.2021070105}, }
In this paper I critically evaluate the value neutrality thesis regarding technology, and find it wanting. I then introduce the various ways in which artifacts can come to influence moral value, and our evaluation of moral situations and actions. Here, following van de Poel and Kroes, I introduce the idea of value sensitive design. Specifically, I show how by virtue of their designed properties, artifacts may come to embody values. Such accounts, however, have several shortcomings. In agreement with Michael Klenk, I raise epistemic and metaphysical issues with respect to designed properties embodying value. The concept of an affordance, borrowed from ecological psychology, provides a more philosophically fruitful grounding to the potential way(s) in which artifacts might embody values. This is due to the way in which it incorporates key insights from perception more generally, and how we go about determining possibilities for action in our environment specifically. The affordance account as it is presented by Klenk, however, is insufficient. I therefore argue that we understand affordances based on whether they are meaningful, and, secondly, that we grade them based on their force.
@article{386, author = {Fabio Tollon}, title = {Artifacts and affordances: from designed properties to possibilities for action}, abstract = {In this paper I critically evaluate the value neutrality thesis regarding technology, and find it wanting. I then introduce the various ways in which artifacts can come to influence moral value, and our evaluation of moral situations and actions. Here, following van de Poel and Kroes, I introduce the idea of value sensitive design. Specifically, I show how by virtue of their designed properties, artifacts may come to embody values. Such accounts, however, have several shortcomings. In agreement with Michael Klenk, I raise epistemic and metaphysical issues with respect to designed properties embodying value. The concept of an affordance, borrowed from ecological psychology, provides a more philosophically fruitful grounding to the potential way(s) in which artifacts might embody values. This is due to the way in which it incorporates key insights from perception more generally, and how we go about determining possibilities for action in our environment specifically. The affordance account as it is presented by Klenk, however, is insufficient. I therefore argue that we understand affordances based on whether they are meaningful, and, secondly, that we grade them based on their force.}, year = {2021}, journal = {AI & SOCIETY Journal of Knowledge, Culture and Communication}, volume = {36}, issue = {1}, publisher = {Springer}, url = {https://link.springer.com/article/10.1007%2Fs00146-021-01155-7}, doi = {https://doi.org/10.1007/s00146-021-01155-7}, }
In this paper I provide an exposition and critique of the Organic View of Ethical Status, as outlined by Torrance (2008). A key presupposition of this view is that only moral patients can be moral agents. It is claimed that because artificial agents lack sentience, they cannot be proper subjects of moral concern (i.e. moral patients). This account of moral standing in principle excludes machines from participating in our moral universe. I will argue that the Organic View operationalises anthropocentric intuitions regarding sentience ascription, and by extension how we identify moral patients. The main difference between the argument I provide here and traditional arguments surrounding moral attributability is that I do not necessarily defend the view that internal states ground our ascriptions of moral patiency. This is in contrast to views such as those defended by Singer (1975, 2011) and Torrance (2008), where concepts such as sentience play starring roles. I will raise both conceptual and epistemic issues with regards to this sense of sentience. While this does not preclude the usage of sentience outright, it suggests that we should be more careful in our usage of internal mental states to ground our moral ascriptions. Following from this I suggest other avenues for further exploration into machine moral patiency which may not have the same shortcomings as the Organic View.
@article{387, author = {Fabio Tollon}, title = {The artificial view: toward a non-anthropocentric account of moral patiency}, abstract = {In this paper I provide an exposition and critique of the Organic View of Ethical Status, as outlined by Torrance (2008). A key presupposition of this view is that only moral patients can be moral agents. It is claimed that because artificial agents lack sentience, they cannot be proper subjects of moral concern (i.e. moral patients). This account of moral standing in principle excludes machines from participating in our moral universe. I will argue that the Organic View operationalises anthropocentric intuitions regarding sentience ascription, and by extension how we identify moral patients. The main difference between the argument I provide here and traditional arguments surrounding moral attributability is that I do not necessarily defend the view that internal states ground our ascriptions of moral patiency. This is in contrast to views such as those defended by Singer (1975, 2011) and Torrance (2008), where concepts such as sentience play starring roles. I will raise both conceptual and epistemic issues with regards to this sense of sentience. While this does not preclude the usage of sentience outright, it suggests that we should be more careful in our usage of internal mental states to ground our moral ascriptions. Following from this I suggest other avenues for further exploration into machine moral patiency which may not have the same shortcomings as the Organic View.}, year = {2020}, journal = {Ethics and Information Technology}, volume = {22}, issue = {4}, publisher = {Springer}, url = {https://link.springer.com/article/10.1007%2Fs10676-020-09540-4}, doi = {https://doi.org/10.1007/s10676-020-09540-4}, }
Latest Research Publications:
The output size problem, for a string-to-tree transducer, is to determine the asymptotic behavior of the function describing the maximum size of output trees, with respect to the length of input strings. We show that the problem to determine, for a given regular expression, the worst-case matching time of a backtracking regular expression matcher, can be reduced to the output size problem. The latter can, in turn, be solved by determining the degree of ambiguity of a non-deterministic finite automaton.
Keywords: string-to-tree transducers, output size, backtracking regular expression matchers, NFA ambiguity
@article{201, author = {Martin Berglund, F. Drewes, Brink van der Merwe}, title = {The Output Size Problem for String-to-Tree Transducers}, abstract = {The output size problem, for a string-to-tree transducer, is to determine the asymptotic behavior of the function describing the maximum size of output trees, with respect to the length of input strings. We show that the problem to determine, for a given regular expression, the worst-case matching time of a backtracking regular expression matcher, can be reduced to the output size problem. The latter can, in turn, be solved by determining the degree of ambiguity of a non-deterministic finite automaton. Keywords: string-to-tree transducers, output size, backtracking regular expression matchers, NFA ambiguity}, year = {2018}, journal = {Journal of Automata, Languages and Combinatorics}, volume = {23}, pages = {19-38}, issue = {1}, publisher = {Institut für Informatik, Justus-Liebig-Universität Giessen}, address = {Germany}, isbn = {2567-3785}, url = {https://www.jalc.de/issues/2018/issue_23_1-3/jalc-2018-019-038.php}, }
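The worst-case matching time referred to above is easy to observe with any backtracking matcher. As a quick, hedged illustration (not from the paper), Python's built-in re engine backtracks, and the classic pattern (a+)+b applied to a run of a's with no b exhibits matching time that roughly doubles with each extra input character:

```python
import re
import time

pattern = re.compile(r'(a+)+b')
for n in range(16, 25, 2):
    start = time.perf_counter()
    pattern.match('a' * n)          # never matches; the engine backtracks
    print(n, f"{time.perf_counter() - start:.3f}s")
```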
Modern regular expression matching software features many extensions, some general while some are very narrowly specified. Here we consider the generalization of adding a class of operators which can be described by, e.g. finite-state transducers. Combined with backreferences they enable new classes of languages to be matched. The addition of finite-state transducers is shown to make membership testing undecidable. Following this result, we study the complexity of membership testing for various restricted cases of the model.
@{199, author = {Martin Berglund, F. Drewes, Brink van der Merwe}, title = {On Regular Expressions with Backreferences and Transducers}, abstract = {Modern regular expression matching software features many extensions, some general while some are very narrowly specified. Here we consider the generalization of adding a class of operators which can be described by, e.g. finite-state transducers. Combined with backreferences they enable new classes of languages to be matched. The addition of finite-state transducers is shown to make membership testing undecidable. Following this result, we study the complexity of membership testing for various restricted cases of the model.}, year = {2018}, journal = {10th Workshop on Non-Classical Models of Automata and Applications (NCMA 2018)}, pages = {1-19}, month = {21/08-22/08}, }
Whereas Perl-compatible regular expression matchers typically exhibit some variation of leftmost-greedy semantics, those conforming to the posix standard are prescribed leftmost-longest semantics. However, the posix standard leaves some room for interpretation, and Fowler and Kuklewicz have done experimental work to confirm differences between various posix matchers. The Boost library has an interesting take on the posix standard, where it maximises the leftmost match not with respect to subexpressions of the regular expression pattern, but rather, with respect to capturing groups. In our work, we provide the first formalisation of Boost semantics, and we analyse the complexity of regular expression matching when using Boost semantics.
@{196, author = {Brink van der Merwe, Martin Berglund, Willem Bester}, title = {Formalising Boost POSIX Regular Expression Matching}, abstract = {Whereas Perl-compatible regular expression matchers typically exhibit some variation of leftmost-greedy semantics, those conforming to the posix standard are prescribed leftmost-longest semantics. However, the posix standard leaves some room for interpretation, and Fowler and Kuklewicz have done experimental work to confirm differences between various posix matchers. The Boost library has an interesting take on the posix standard, where it maximises the leftmost match not with respect to subexpressions of the regular expression pattern, but rather, with respect to capturing groups. In our work, we provide the first formalisation of Boost semantics, and we analyse the complexity of regular expression matching when using Boost semantics.}, year = {2018}, journal = {International Colloquium on Theoretical Aspects of Computing}, pages = {99-115}, month = {17/02}, publisher = {Springer}, isbn = {978-3-030-02508-3}, url = {https://link.springer.com/chapter/10.1007/978-3-030-02508-3_6}, }
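The semantic difference at issue can be seen with a one-line example: a Perl-style (leftmost-greedy) matcher such as Python's re prefers the first alternative that succeeds, whereas a POSIX (leftmost-longest) matcher must return the longest match at the leftmost starting position.

```python
import re

m = re.match(r'a|ab', 'ab')
print(m.group(0))   # 'a' -- leftmost-greedy: the first alternative wins
# A POSIX leftmost-longest matcher would report 'ab' for the same pattern and
# input; Boost's POSIX mode, as formalised in the paper, additionally maximises
# with respect to capturing groups rather than subexpressions.
```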
No Abstract
@{178, author = {Brink van der Merwe, N. Weideman, Martin Berglund}, title = {Turning evil regexes harmless}, abstract = {No Abstract}, year = {2017}, journal = {Conference of South African Institute of Computer Scientists and Information Technologists (SAICSIT'17)}, month = {26/09-28/09}, publisher = {ACM}, url = {https://dl.acm.org/citation.cfm?id=3129416}, }
Most modern regular expression matching libraries (one of the rare exceptions being Google’s RE2) allow backreferences, operations which bind a substring to a variable allowing it to be matched again verbatim. However, different implementations not only vary in the syntax permitted when using backreferences, but both implementations and definitions in the literature offer up a number of different variants on how backreferences match. Our aim is to compare the various flavors by considering the formal languages that each can describe, resulting in the establishment of a hierarchy of language classes. Beyond the hierarchy itself, some complexity results are given, and as part of the effort on comparing language classes new pumping lemmas are established, and old ones extended to new classes.
@{176, author = {Martin Berglund, Brink van der Merwe}, title = {Regular Expressions with Backreferences Re-examined}, abstract = {Most modern regular expression matching libraries (one of the rare exceptions being Google’s RE2) allow backreferences, operations which bind a substring to a variable allowing it to be matched again verbatim. However, different implementations not only vary in the syntax permitted when using backreferences, but both implementations and definitions in the literature offer up a number of different variants on how backreferences match. Our aim is to compare the various flavors by considering the formal languages that each can describe, resulting in the establishment of a hierarchy of language classes. Beyond the hierarchy itself, some complexity results are given, and as part of the effort on comparing language classes new pumping lemmas are established, and old ones extended to new classes.}, year = {2017}, journal = {The Prague Stringology Conference (PSC 2017)}, pages = {30-41}, month = {28/08-30/08}, address = {Czech Technical University in Prague,}, isbn = {ISBN 978-80-01-06193-0}, }
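A short demonstration (not from the paper) of why backreferences change the language classes involved: they let a pattern describe non-regular languages such as { aⁿbaⁿ } and the copy language { ww }.

```python
import re

print(bool(re.fullmatch(r'(a*)b\1', 'aaabaaa')))   # True:  a^3 b a^3
print(bool(re.fullmatch(r'(a*)b\1', 'aaabaa')))    # False: unequal runs of a's
print(bool(re.fullmatch(r'(.+)\1', 'abcabc')))     # True:  w = 'abc', input = ww
print(bool(re.fullmatch(r'(.+)\1', 'abcabd')))     # False: not of the form ww
```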
Latest Research Publications:
We investigate the use of silhouette coefficients in cluster analysis for speaker diarisation, with the dual purpose of unsupervised fine-tuning during domain adaptation and determining the number of speakers in an audio file. Our main contribution is to demonstrate the use of silhouette coefficients to perform per-file domain adaptation, which we show to deliver an improvement over per-corpus domain adaptation. Secondly, we show that this method of silhouette-based cluster analysis can be used to accurately determine more than one hyperparameter at the same time. Finally, we propose a novel method for calculating the silhouette coefficient of clusters using a PLDA score matrix as input.
@{482, author = {Lucas Van Wyk, Marelie Davel, Charl Van Heerden}, title = {Unsupervised fine-tuning of speaker diarisation pipelines using silhouette coefficients}, abstract = {We investigate the use of silhouette coefficients in cluster analysis for speaker diarisation, with the dual purpose of unsupervised fine-tuning during domain adaptation and determining the number of speakers in an audio file. Our main contribution is to demonstrate the use of silhouette coefficients to perform per-file domain adaptation, which we show to deliver an improvement over per-corpus domain adaptation. Secondly, we show that this method of silhouette-based cluster analysis can be used to accurately determine more than one hyperparameter at the same time. Finally, we propose a novel method for calculating the silhouette coefficient of clusters using a PLDA score matrix as input.}, year = {2021}, journal = {Southern African Conference for Artificial Intelligence Research}, pages = {202 - 216}, month = {06/12/2021 - 10/12/2021}, address = {South Africa}, isbn = {978-0-620-94410-6}, url = {https://2021.sacair.org.za/proceedings/}, }
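A hedged sketch of silhouette-based hyperparameter selection using scikit-learn on generic embedding vectors: the number of clusters (speakers) is chosen to maximise the silhouette coefficient. The synthetic data stands in for per-segment speaker embeddings; the paper's PLDA-based variant would instead pass a precomputed distance matrix via silhouette_score(..., metric='precomputed').

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Stand-in for per-segment speaker embeddings; 4 true speakers.
X, _ = make_blobs(n_samples=200, centers=4, n_features=16, random_state=0)

scores = {}
for k in range(2, 9):
    labels = AgglomerativeClustering(n_clusters=k).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
    print(k, round(scores[k], 3))

print("estimated number of speakers:", max(scores, key=scores.get))
```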

Ivan’s main research interest area is logic-based knowledge representation and reasoning in artificial intelligence, with a focus on modal and description logics and their applications in non-monotonic reasoning, reasoning about actions and change, and the semantic web.
Latest Research Publications:
We extend the expressivity of classical conditional reasoning by introducing context as a new parameter. The enriched conditional logic generalises the defeasible conditional setting in the style of Kraus, Lehmann, and Magidor, and allows for a refined semantics that is able to distinguish, for example, between expectations and counterfactuals. In this paper we introduce the language for the enriched logic and define an appropriate semantic framework for it. We analyse which properties generally associated with conditional reasoning are still satisfied by the new semantic framework, provide a suitable representation result, and define an entailment relation based on Lehmann and Magidor’s generally-accepted notion of Rational Closure.
@{430, author = {Giovanni Casini, Tommie Meyer, Ivan Varzinczak}, title = {Contextual Conditional Reasoning}, abstract = {We extend the expressivity of classical conditional reasoning by introducing context as a new parameter. The enriched conditional logic generalises the defeasible conditional setting in the style of Kraus, Lehmann, and Magidor, and allows for a refined semantics that is able to distinguish, for example, between expectations and counterfactuals. In this paper we introduce the language for the enriched logic and define an appropriate semantic framework for it. We analyse which properties generally associated with conditional reasoning are still satisfied by the new semantic framework, provide a suitable representation result, and define an entailment relation based on Lehmann and Magidor’s generally-accepted notion of Rational Closure.}, year = {2021}, journal = {35th AAAI Conference on Artificial Intelligence}, pages = {6254-6261}, month = {02/02/2021-09/02/2021}, publisher = {AAAI Press}, address = {Online}, }
The past 25 years have seen many attempts to introduce defeasible-reasoning capabilities into a description logic setting. Many, if not most, of these attempts are based on preferential extensions of description logics, with a significant number of these, in turn, following the so-called KLM approach to defeasible reasoning initially advocated for propositional logic by Kraus, Lehmann, and Magidor. Each of these attempts has its own aim of investigating particular constructions and variants of the (KLM-style) preferential approach. Here our aim is to provide a comprehensive study of the formal foundations of preferential defeasible reasoning for description logics in the KLM tradition. We start by investigating a notion of defeasible subsumption in the spirit of defeasible conditionals as studied by Kraus, Lehmann, and Magidor in the propositional case. In particular, we consider a natural and intuitive semantics for defeasible subsumption, and we investigate KLM-style syntactic properties for both preferential and rational subsumption. Our contribution includes two representation results linking our semantic constructions to the set of preferential and rational properties considered. Besides showing that our semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. Indeed, we also analyse the problem of non-monotonic reasoning in description logics at the level of entailment and present an algorithm for the computation of rational closure of a defeasible knowledge base. Importantly, our algorithm relies completely on classical entailment and shows that the computational complexity of reasoning over defeasible knowledge bases is no worse than that of reasoning in the underlying classical DL ALC.
@article{433, author = {Katarina Britz, Giovanni Casini, Tommie Meyer, Kody Moodley, Uli Sattler, Ivan Varzinczak}, title = {Principles of KLM-style Defeasible Description Logics}, abstract = {The past 25 years have seen many attempts to introduce defeasible-reasoning capabilities into a description logic setting. Many, if not most, of these attempts are based on preferential extensions of description logics, with a significant number of these, in turn, following the so-called KLM approach to defeasible reasoning initially advocated for propositional logic by Kraus, Lehmann, and Magidor. Each of these attempts has its own aim of investigating particular constructions and variants of the (KLM-style) preferential approach. Here our aim is to provide a comprehensive study of the formal foundations of preferential defeasible reasoning for description logics in the KLM tradition. We start by investigating a notion of defeasible subsumption in the spirit of defeasible conditionals as studied by Kraus, Lehmann, and Magidor in the propositional case. In particular, we consider a natural and intuitive semantics for defeasible subsumption, and we investigate KLM-style syntactic properties for both preferential and rational subsumption. Our contribution includes two representation results linking our semantic constructions to the set of preferential and rational properties considered. Besides showing that our semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. Indeed, we also analyse the problem of non-monotonic reasoning in description logics at the level of entailment and present an algorithm for the computation of rational closure of a defeasible knowledge base. Importantly, our algorithm relies completely on classical entailment and shows that the computational complexity of reasoning over defeasible knowledge bases is no worse than that of reasoning in the underlying classical DL ALC.}, year = {2020}, journal = {Transactions on Computational Logic}, volume = {22 (1)}, pages = {1-46}, publisher = {ACM}, url = {https://dl-acm-org.ezproxy.uct.ac.za/doi/abs/10.1145/3420258}, doi = {10.1145/3420258}, }
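The rational closure construction mentioned in the abstract can be sketched in the propositional case using only classical entailment (here decided by SAT via sympy). The knowledge base below (birds fly, penguins are birds, penguins don't fly) and the helper names are purely illustrative, not the paper's DL algorithm.

```python
from sympy import symbols
from sympy.logic.boolalg import Implies, Not, And, true
from sympy.logic.inference import satisfiable

b, p, f = symbols('bird penguin flies')

# Defeasible conditionals "antecedent |~ consequent".
D = [(b, f),        # birds typically fly
     (p, b),        # penguins are typically birds
     (p, Not(f))]   # penguins typically do not fly

def entails(kb, phi):
    """Classical entailment: kb |= phi iff kb & ~phi is unsatisfiable."""
    return not satisfiable(And(kb, Not(phi)))

def materialise(conds):
    """Replace each conditional by its material implication."""
    return And(true, *[Implies(a, c) for a, c in conds])

def rank_conditionals(conds):
    """Partition conditionals into ranks by iterated exceptionality."""
    ranks, current = [], list(conds)
    while current:
        kb = materialise(current)
        exceptional = [(a, c) for a, c in current if entails(kb, Not(a))]
        if len(exceptional) == len(current):    # remainder has infinite rank
            ranks.append(current)
            return ranks
        ranks.append([r for r in current if r not in exceptional])
        current = exceptional
    return ranks

def rc_entails(conds, antecedent, consequent):
    """Does 'antecedent |~ consequent' hold under rational closure?"""
    ranks = rank_conditionals(conds)
    remaining = list(conds)
    for level in ranks:
        if satisfiable(And(materialise(remaining), antecedent)):
            break                               # lowest compatible rank found
        remaining = [r for r in remaining if r not in level]
    return entails(And(materialise(remaining), antecedent), consequent)

print(rc_entails(D, b, f))        # True:  birds fly
print(rc_entails(D, p, Not(f)))   # True:  penguins do not fly
print(rc_entails(D, p, f))        # False: flying is not inherited by penguins
```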
We present a formal framework for modelling belief change within a non-monotonic reasoning system. Belief change and non-monotonic reasoning are two areas that are formally closely related, with recent attention being paid towards the analysis of belief change within a non-monotonic environment. In this paper we consider the classical AGM belief change operators, contraction and revision, applied to a defeasible setting in the style of Kraus, Lehmann, and Magidor. The investigation leads us to the formal characterisation of a number of classes of defeasible belief change operators. For the most interesting classes we need to consider the problem of iterated belief change, generalising the classical work of Darwiche and Pearl in the process. Our work involves belief change operators aimed at ensuring logical consistency, as well as the characterisation of analogous operators aimed at obtaining coherence—an important notion within the field of logic-based ontologies.
@{382, author = {Giovanni Casini, Tommie Meyer, Ivan Varzinczak}, title = {Rational Defeasible Belief Change}, abstract = {We present a formal framework for modelling belief change within a non-monotonic reasoning system. Belief change and non-monotonic reasoning are two areas that are formally closely related, with recent attention being paid towards the analysis of belief change within a non-monotonic environment. In this paper we consider the classical AGM belief change operators, contraction and revision, applied to a defeasible setting in the style of Kraus, Lehmann, and Magidor. The investigation leads us to the formal characterisation of a number of classes of defeasible belief change operators. For the most interesting classes we need to consider the problem of iterated belief change, generalising the classical work of Darwiche and Pearl in the process. Our work involves belief change operators aimed at ensuring logical consistency, as well as the characterisation of analogous operators aimed at obtaining coherence—an important notion within the field of logic-based ontologies}, year = {2020}, journal = {17th International Conference on Principles of Knowledge Representation and Reasoning (KR 2020)}, pages = {213-222}, month = {12/09/2020}, publisher = {IJCAI}, address = {Virtual}, url = {https://library.confdna.com/kr/2020/}, doi = {10.24963/kr.2020/22}, }
In recent work, we addressed an important limitation in previous extensions of description logics to represent defeasible knowledge, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects of the domain. Syntactically, this limitation translates to a context-agnostic notion of defeasible subsumption, which is quite restrictive when it comes to modelling different nuances of defeasibility. Our point of departure in our recent proposal allows for different orderings on the interpretation of roles. This yields a notion of contextual defeasible subsumption, where the context is informed by a role. In the present paper, we extend this work to also provide a proof-theoretic counterpart and associated results. We define a (naïve) tableau-based algorithm for checking preferential consistency of contextual defeasible knowledge bases, a central piece in the definition of other forms of contextual defeasible reasoning over ontologies, notably contextual rational closure.
@{247, author = {Katarina Britz, Ivan Varzinczak}, title = {Preferential tableaux for contextual defeasible ALC}, abstract = {In recent work, we addressed an important limitation in previous extensions of description logics to represent defeasible knowledge, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects of the domain. Syntactically, this limitation translates to a context-agnostic notion of defeasible subsumption, which is quite restrictive when it comes to modelling different nuances of defeasibility. Our point of departure in our recent proposal allows for different orderings on the interpretation of roles. This yields a notion of contextual defeasible subsumption, where the context is informed by a role. In the present paper, we extend this work to also provide a proof-theoretic counterpart and associated results. We define a (naïve) tableau-based algorithm for checking preferential consistency of contextual defeasible knowledge bases, a central piece in the definition of other forms of contextual defeasible reasoning over ontologies, notably contextual rational closure.}, year = {2019}, journal = {28th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX)}, pages = {39-57}, month = {03/09-05/09}, publisher = {Springer LNAI no. 11714}, isbn = {ISBN 978-3-030-29026-9}, url = {https://www.springer.com/gp/book/9783030290252}, }
Description logics have been extended in a number of ways to support defeasible reasoning in the KLM tradition. Such features include preferential or rational defeasible concept inclusion, and defeasible roles in complex concept descriptions. Semantically, defeasible subsumption is obtained by means of a preference order on objects, while defeasible roles are obtained by adding a preference order to role interpretations. In this paper, we address an important limitation in defeasible extensions of description logics, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects. We do this by inducing a modular preference order on objects from each modular preference order on roles, and using these to relativise defeasible subsumption. This yields a notion of contextualised rational defeasible subsumption, with contexts described by roles. We also provide a semantic construction for rational closure and a method for its computation, and present a correspondence result between the two.
@article{246, author = {Katarina Britz, Ivan Varzinczak}, title = {Contextual rational closure for defeasible ALC}, abstract = {Description logics have been extended in a number of ways to support defeasible reasoning in the KLM tradition. Such features include preferential or rational defeasible concept inclusion, and defeasible roles in complex concept descriptions. Semantically, defeasible subsumption is obtained by means of a preference order on objects, while defeasible roles are obtained by adding a preference order to role interpretations. In this paper, we address an important limitation in defeasible extensions of description logics, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects. We do this by inducing a modular preference order on objects from each modular preference order on roles, and using these to relativise defeasible subsumption. This yields a notion of contextualised rational defeasible subsumption, with contexts described by roles. We also provide a semantic construction for rational closure and a method for its computation, and present a correspondence result between the two.}, year = {2019}, journal = {Annals of Mathematics and Artificial Intelligence}, volume = {87}, pages = {83-108}, issue = {1-2}, isbn = {ISSN: 1012-2443}, url = {https://link.springer.com/article/10.1007/s10472-019-09658-2}, doi = {10.1007/s10472-019-09658-2}, }