People

I employ empirical methods drawn from data science and cognitive science to evaluate the theory of belief revision with human reasoners.
I have been a part of the Knowledge Representation and Reasoning research group since 2019.
Latest Research Publications:
Belief revision and belief update are approaches to represent and reason with knowledge in artificial intelligence. Previous empirical studies have shown that human reasoning is consistent with non-monotonic logic and with the postulates of defeasible reasoning, belief revision and belief update. We extended previous work, which tested natural-language translations of the postulates of defeasible reasoning, belief revision and belief update with human reasoners via surveys, in three respects. Firstly, we tested only postulates of belief revision and belief update, taking the position that belief change aligns more closely with human reasoning than non-monotonic defeasible reasoning does. Secondly, we decomposed the postulates of revision and update into material implication statements of the form "If x is the case, then y is the case", each containing a premise and a conclusion, and then translated the premises and conclusions into natural language. Thirdly, we asked human participants to judge each component of each postulate for plausibility. In our analysis, we measured the strength of the association between the premises and the conclusion of each postulate, and we used possibility theory to determine whether the postulates hold for our participants in general. Our results showed that our participants' reasoning is consistent with the postulates of belief revision and belief update when the premises and conclusion of a postulate are judged separately.
Clayton Baker and Tommie Meyer. "Belief Change in Human Reasoning: An Empirical Investigation on MTurk". In Proceedings of the Second Southern African Conference for AI Research (SACAIR 2021), held online, 6-10 December 2021, pp. 218-234. SACAIR 2021 Organising Committee, 2022. ISBN 978-0-620-94410-6. https://2021.sacair.org.za/proceedings/
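As an illustration of the decomposition described above (our example, not one taken from the paper), consider the standard AGM vacuity postulate, which already has the required conditional form. Here K is a belief set, φ is the new information, * is revision and + is expansion:

    % Illustrative only: the AGM vacuity postulate in implication form.
    \[
      \underbrace{\neg\varphi \notin K}_{\text{premise}}
      \;\Rightarrow\;
      \underbrace{K \ast \varphi = K + \varphi}_{\text{conclusion}}
    \]

A natural-language translation of the kind used in such surveys might read: "If the new information does not contradict what is currently believed, then revising by it amounts to simply adding it." The paper's own translations may differ.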
Classical logic forms the basis of knowledge representation and reasoning in AI. In the real world, however, classical logic alone is insufficient to describe the reasoning behaviour of human beings. It lacks the flexibility so characteristically required of reasoning under uncertainty, reasoning under incomplete information and reasoning with new information, as humans must. In response, non-classical extensions to propositional logic have been formulated, to provide non-monotonicity. It has been shown in previous studies that human reasoning exhibits non-monotonicity. This work is the product of merging three independent studies, each one focusing on a different formalism for non-monotonic reasoning: KLM defeasible reasoning, AGM belief revision and KM belief update. We investigate, for each of the postulates propounded to characterise these logic forms, the extent to which they have correspondence with human reasoners. We do this via three respective experiments and present each of the postulates in concrete and abstract form. We discuss related work, our experiment design, testing and evaluation, and report on the results from our experiments. We find evidence to believe that 1 out of 5 KLM defeasible reasoning postulates, 3 out of 8 AGM belief revision postulates and 4 out of 8 KM belief update postulates conform in both the concrete and abstract case. For each experiment, we performed an additional investigation. In the experiments of KLM defeasible reasoning and AGM belief revision, we analyse the explanations given by participants to determine whether the postulates have a normative or descriptive relationship with human reasoning. We find evidence that suggests, overall, KLM defeasible reasoning has a normative relationship with human reasoning while AGM belief revision has a descriptive relationship with human reasoning. In the experiment of KM belief update, we discuss counter-examples to the KM postulates.
Clayton Baker, Claire Denny, Paul Freund and Tommie Meyer. "Cognitive Defeasible Reasoning: the Extent to which Forms of Defeasible Reasoning Correspond with Human Reasoning". In Proceedings of the First Southern African Conference for AI Research (SACAIR 2020), Muldersdrift, South Africa, 22-26 February 2021, pp. 199-219. Springer, 2020. ISBN 978-3-030-66151-9. DOI 10.1007/978-3-030-66151-9_13. https://link.springer.com/book/10.1007/978-3-030-66151-9
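For orientation, one representative postulate from each of the three formalisms, in standard formulations (the paper tests fuller postulate sets, and its phrasings may differ):

    % Illustrative, standard formulations only.
    \begin{align*}
      \text{KLM cautious monotonicity:}\quad &
        \alpha \mathrel{|\!\sim} \beta \text{ and } \alpha \mathrel{|\!\sim} \gamma
        \;\Rightarrow\; \alpha \wedge \beta \mathrel{|\!\sim} \gamma\\
      \text{AGM success:}\quad & \varphi \in K \ast \varphi\\
      \text{KM update success (U1):}\quad & \varphi \in K \diamond \varphi
    \end{align*}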

Latest Research Publications:
A robust theoretical framework that can describe and predict the generalization ability of deep neural networks (DNNs) in general circumstances remains elusive. Classical attempts have produced complexity metrics that rely heavily on global measures of compactness and capacity with little investigation into the effects of sub-component collaboration. We demonstrate intriguing regularities in the activation patterns of the hidden nodes within fully-connected feedforward networks. By tracing the origin of these patterns, we show how such networks can be viewed as the combination of two information processing systems: one continuous and one discrete. We describe how these two systems arise naturally from the gradient-based optimization process, and demonstrate the classification ability of the two systems, individually and in collaboration. This perspective on DNN classification offers a novel way to think about generalization, in which different subsets of the training data are used to train distinct classifiers; those classifiers are then combined to perform the classification task, and their consistency is crucial for accurate classification.
Marelie Davel, Marthinus Theunissen, Arnold Pretorius and Etienne Barnard. "DNNs as layers of cooperating classifiers". In The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), New York, 7-12 February 2020, pp. 3725-3732.
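A minimal sketch of the discrete-system idea described above, assuming a scikit-learn MLP (our illustration, not the authors' code; hidden_patterns is a helper defined here): each hidden layer's binary ReLU on/off pattern is read off as a code, and test points are classified by the label of the training point with the nearest code in Hamming distance.

    # Illustrative sketch only: view a trained MLP's binary ReLU on/off
    # patterns as a discrete code, alongside the usual continuous output.
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

    mlp = MLPClassifier(hidden_layer_sizes=(64, 64), activation='relu',
                        max_iter=500, random_state=0).fit(Xtr, ytr)

    def hidden_patterns(mlp, X):
        """Concatenate each hidden layer's binary (node on/off) pattern."""
        a, codes = X, []
        for W, b in zip(mlp.coefs_[:-1], mlp.intercepts_[:-1]):
            a = np.maximum(a @ W + b, 0.0)   # ReLU on the hidden layers
            codes.append(a > 0)              # discrete on/off pattern
        return np.hstack(codes)

    Ptr, Pte = hidden_patterns(mlp, Xtr), hidden_patterns(mlp, Xte)
    # "Discrete system": label of the training sample whose activation
    # pattern is closest in Hamming distance.
    nearest = np.argmin((Pte[:, None, :] != Ptr[None, :, :]).sum(-1), axis=1)
    print("continuous system:", mlp.score(Xte, yte))
    print("discrete system:  ", (ytr[nearest] == yte).mean())

Comparing the two printed scores gives a feel for how much of the network's classification behaviour the discrete activation patterns alone capture.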
The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented because the region of the function space used for a prediction depends on the similarity between the new input and the training data that region was optimized for.
Marthinus Theunissen, Marelie Davel and Etienne Barnard. "Insights regarding overfitting on noise in deep learning". In South African Forum for Artificial Intelligence Research (FAIR), Cape Town, South Africa, 2019, pp. 49-63.
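A small experiment in the spirit of the above (our sketch, not the paper's setup): corrupt a fraction of the training labels, let an overparameterized MLP fit them essentially perfectly, and observe that accuracy on clean test data remains far above chance.

    # Illustrative sketch: memorization of corrupted labels need not
    # destroy out-of-sample performance.
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

    rng = np.random.default_rng(0)
    noisy = ytr.copy()
    flip = rng.random(len(noisy)) < 0.2            # corrupt ~20% of labels
    noisy[flip] = rng.integers(0, 10, flip.sum())  # random replacement labels

    # Large network, no regularization, trained long enough to interpolate.
    mlp = MLPClassifier(hidden_layer_sizes=(512, 512), alpha=0.0,
                        max_iter=2000, random_state=0).fit(Xtr, noisy)
    print("fit to noisy train labels:", mlp.score(Xtr, noisy))  # typically near 1.0
    print("accuracy on clean test:   ", mlp.score(Xte, yte))    # typically still high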
The generalization capabilities of deep neural networks are not well understood, and in particular, the influence of activation functions on generalization has received little theoretical attention. Phenomena such as vanishing gradients, node saturation and network sparsity have been identified as possible factors when comparing different activation functions [1]. We investigate these factors using fully connected feedforward networks on two standard benchmark problems, and find that the most salient differences between networks with sigmoidal and ReLU activations relate to the way that class-distinctive information is propagated through a network.
Arnold Pretorius, Etienne Barnard and Marelie Davel. "ReLU and sigmoidal activation functions". In South African Forum for Artificial Intelligence Research (FAIR), Cape Town, South Africa, 4-7 December 2019, pp. 37-48. CEUR Workshop Proceedings.
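A rough way to observe the phenomena mentioned above (node sparsity for ReLU, saturation for sigmoid) with a scikit-learn MLP; hidden_activations is our own helper, and the 0.05/0.95 saturation thresholds are arbitrary choices for this sketch:

    # Illustrative sketch: compare hidden-node sparsity (ReLU) and
    # saturation (sigmoid) in two otherwise identical MLPs.
    import numpy as np
    from scipy.special import expit
    from sklearn.datasets import load_digits
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)

    def hidden_activations(mlp, X, act):
        """Recompute each hidden layer's activations for inputs X."""
        a, layers = X, []
        for W, b in zip(mlp.coefs_[:-1], mlp.intercepts_[:-1]):
            z = a @ W + b
            a = np.maximum(z, 0) if act == 'relu' else expit(z)
            layers.append(a)
        return layers

    for act in ('relu', 'logistic'):
        mlp = MLPClassifier(hidden_layer_sizes=(64, 64), activation=act,
                            max_iter=500, random_state=0).fit(X, y)
        for i, a in enumerate(hidden_activations(mlp, X, act)):
            if act == 'relu':
                print(act, f"layer {i}: {np.mean(a == 0):.0%} inactive nodes")
            else:
                sat = np.mean((a < 0.05) | (a > 0.95))
                print(act, f"layer {i}: {sat:.0%} saturated nodes")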
Estimating the joint probability density function of a dataset is a central task in many machine learning applications. In this work we address the fundamental problem of kernel bandwidth estimation for variable kernel density estimation in high-dimensional feature spaces. We derive a variable kernel bandwidth estimator by minimizing the leave-one-out entropy objective function and show that this estimator is capable of performing estimation in high-dimensional feature spaces with great success. We compare the performance of this estimator to state-of-the-art maximum likelihood estimators on a number of representative high-dimensional machine learning tasks and show that the newly introduced minimum leave-one-out entropy estimator performs optimally on a number of the high-dimensional datasets considered.
Christiaan Van der Walt and Etienne Barnard. "Variable Kernel Density Estimation in High-dimensional Feature Spaces". In AAAI Conference on Artificial Intelligence (AAAI-17), 2017, pp. 2674-2680.
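For orientation, the leave-one-out objective involved is presumably of the following standard form (a hedged reconstruction; the paper's exact estimator may differ), where K_{h_j} is a kernel with per-sample bandwidth h_j and \hat{p}_{-i} is the density estimate with sample i held out:

    % Standard leave-one-out form, for orientation only.
    \[
      H_{\mathrm{LOO}}(h_1,\dots,h_N)
        = -\frac{1}{N}\sum_{i=1}^{N} \log \hat{p}_{-i}(x_i),
      \qquad
      \hat{p}_{-i}(x_i) = \frac{1}{N-1}\sum_{j \neq i} K_{h_j}(x_i - x_j),
    \]

minimized over the variable (per-sample) bandwidths h_j. Holding sample i out prevents the degenerate solution in which each bandwidth shrinks to zero around its own data point.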
Automatic speech recognition (ASR) technology has matured over the past few decades and has made significant impacts in a variety of fields, from assistive technologies to commercial products. However, ASR system development is a resource-intensive activity and requires language resources in the form of text-annotated audio recordings and pronunciation dictionaries. Unfortunately, many languages found in the developing world fall into the resource-scarce category, and this resource scarcity severely inhibits the deployment of ASR systems in the developing world. One approach to assist with resource-scarce ASR system development is to select 'useful' training samples, which could reduce the resources needed to collect new corpora. In this work, we propose a new data selection framework which can be used to design a speech recognition corpus. We show that, for limited data sets, independent of language and bandwidth, the most effective strategy for data selection is frequency-matched selection, and that the widely used maximum entropy methods generally produced the least promising results. In our model, the frequency-matched selection method corresponds to a logarithmic relationship between accuracy and corpus size; we also investigated other model relationships, and found that a hyperbolic relationship (as suggested from simple asymptotic arguments in learning theory) may lead to somewhat better performance under certain conditions.
Neil Kleynhans and Etienne Barnard. "Efficient data selection for ASR". Language Resources and Evaluation, 49(2):327-353, 2015. Springer Science+Business Media, Dordrecht. DOI 10.1007/s10579-014-9285-0.
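To make "frequency-matched selection" concrete, here is a greedy toy version (our illustration only; freq_matched_select and the Poisson toy counts are inventions for this sketch, and the paper's actual procedure may differ): at each step, add the utterance whose inclusion brings the selected subset's unit-frequency distribution closest, in L1 distance, to the target corpus distribution.

    # Illustrative greedy frequency-matched selection over toy phone counts.
    import numpy as np

    def freq_matched_select(counts, target, k):
        """counts: (n_utts, n_units) unit counts per utterance.
        target: desired unit distribution. Returns indices of k utterances."""
        chosen, total = [], np.zeros(counts.shape[1])
        remaining = set(range(len(counts)))
        for _ in range(k):
            best, best_d = None, np.inf
            for i in remaining:
                t = total + counts[i]
                d = np.abs(t / t.sum() - target).sum()  # L1 gap to target
                if d < best_d:
                    best, best_d = i, d
            chosen.append(best)
            total += counts[best]
            remaining.remove(best)
        return chosen

    rng = np.random.default_rng(0)
    counts = rng.poisson(2.0, size=(200, 40))   # toy per-utterance phone counts
    target = counts.sum(0) / counts.sum()       # full-corpus distribution
    subset = freq_matched_select(counts, target, k=20)
    print(subset[:10])

The greedy loop is quadratic in the number of utterances, which is fine for a sketch; a production version would need a more efficient update scheme.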
DEGREES LINKED TO THIS RESEARCH GROUP:
1) 2018-current PhD (Philosophy): 'Interfaces between Knowledge Representation and Reasoning and Political Philosophy: The Symbiotic Relationship Between Morality and Justice'.
TALKS:
1) 'Philosophy in/as Translation' (PSSA 2019);
2) 'Decolonizing Knowledge' (PPA 2019);
3) 'AI and the Social Good' (4th CAIR/UP Symposium 2019);
4) 'Decolonization and Alterity: Intersecting Theories and Praxis' (PPA 2018).