Research Publications

2020

Davel M, Theunissen T, Pretorius A, Barnard E. DNNs as layers of cooperating classifiers. The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20). 2020.

A robust theoretical framework that can describe and predict the generalization ability of deep neural networks (DNNs) in general circumstances remains elusive. Classical attempts have produced complexity metrics that rely heavily on global measures of compactness and capacity with little investigation into the effects of sub-component collaboration. We demonstrate intriguing regularities in the activation patterns of the hidden nodes within fully-connected feedforward networks. By tracing the origin of these patterns, we show how such networks can be viewed as the combination of two information processing systems: one continuous and one discrete. We describe how these two systems arise naturally from the gradient-based optimization process, and demonstrate the classification ability of the two systems, individually and in collaboration. This perspective on DNN classification offers a novel way to think about generalization, in which different subsets of the training data are used to train distinct classifiers; those classifiers are then combined to perform the classification task, and their consistency is crucial for accurate classification.
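The discrete system the abstract describes can be made concrete with a toy sketch (all weights, sizes, and data below are illustrative, not from the paper): recording which ReLU nodes fire for each input partitions the inputs into a small number of on/off activation patterns, and every input sharing a pattern is processed by the same linear (continuous) map.

```python
import random

random.seed(0)

def relu(x):
    return [max(0.0, v) for v in x]

def layer(x, W, b):
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

# A tiny 2-input, 4-hidden-node feedforward layer with random weights.
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(4)]
b1 = [0.0] * 4

def activation_pattern(x):
    """Discrete system: which hidden nodes fire (1) or stay off (0)."""
    h = relu(layer(x, W1, b1))
    return tuple(int(v > 0) for v in h)

samples = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]
patterns = {}
for x in samples:
    patterns.setdefault(activation_pattern(x), []).append(x)

# Far fewer distinct on/off patterns than samples: inputs fall into discrete
# cells, and within each cell the network acts as one fixed linear map.
print(len(samples), len(patterns))
```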

@proceedings{236,
  author = {Marelie Davel and Tian Theunissen and Arnold Pretorius and Etienne Barnard},
  title = {DNNs as layers of cooperating classifiers},
  abstract = {A robust theoretical framework that can describe and predict the generalization ability of deep neural networks (DNNs) in general circumstances remains elusive. Classical attempts have produced complexity metrics that rely heavily on global measures of compactness and capacity with little investigation into the effects of sub-component collaboration. We demonstrate intriguing regularities in the activation patterns of the hidden nodes within fully-connected feedforward networks. By tracing the origin of these patterns, we show how such networks can be viewed as the combination of two information processing systems: one continuous and one discrete. We describe how these two systems arise naturally from the gradient-based optimization process, and demonstrate the classification ability of the two systems, individually and in collaboration. This perspective on DNN classification offers a novel way to think about generalization, in which different subsets of the training data are used to train distinct classifiers; those classifiers are then combined to perform the classification task, and their consistency is crucial for accurate classification.},
  year = {2020},
  journal = {The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)},
  month = {07/02-12/02/2020},
  note = {arXiv:2001.06178v1},

}

2019

Thomas A, Gerber A, van der Merwe A. A Conceptual Framework of Research on Visual Language Specification Languages. International Conference on Advances in Big Data, Computing and Data Communication Systems (icABCD). 2019. doi:10.1109/ICABCD.2019.8851003.

Visual languages make use of spatial arrangements of graphical and textual elements to represent information. Domain specific diagrams, including flowcharts and music sheets, are examples of visual languages. An established area of research is the study of languages which can be used to create declarative specifications of visual languages. In this paper, the result of a review of research on visual language specification languages is presented. Specifically, a structured literature review is conducted to establish research themes by analysing what has been studied in the context of specification languages. The result of the literature review is used to develop a conceptual framework that consists of six research themes with related topics. Additionally, discussions on how the conceptual framework can be used as a basis to guide research in the field of specification languages, to perform feature based characterisations and to create lists of criteria to evaluate and compare specification languages are included in this paper.

@proceedings{255,
  author = {Anitta Thomas and Aurona Gerber and Alta van der Merwe},
  title = {A Conceptual Framework of Research on Visual Language Specification Languages},
  abstract = {Visual languages make use of spatial arrangements of graphical and textual elements to represent information. Domain specific diagrams, including flowcharts and music sheets, are examples of visual languages. An established area of research is the study of languages which can be used to create declarative specifications of visual languages. In this paper, the result of a review of research on visual language specification languages is presented. Specifically, a structured literature review is conducted to establish research themes by analysing what has been studied in the context of specification languages. The result of the literature review is used to develop a conceptual framework that consists of six research themes with related topics. Additionally, discussions on how the conceptual framework can be used as a basis to guide research in the field of specification languages, to perform feature based characterisations and to create lists of criteria to evaluate and compare specification languages are included in this paper.},
  year = {2019},
  journal = {International Conference on Advances in Big Data, Computing and Data Communication Systems (icABCD)},
  month = {05/09 - 06/09},
  publisher = {IEEE},
  address = {Winterton, South Africa},
  isbn = {978-1-5386-9236-3},
  url = {https://ieeexplore.ieee.org/document/8851003},
  doi = {10.1109/ICABCD.2019.8851003},
}
Wolpe Z, de Waal A. Autoencoding variational Bayes for latent Dirichlet allocation. Proceedings of the South African Forum for Artificial Intelligence Research. 2019. http://ceur-ws.org/Vol-2540/FAIR2019_paper_33.pdf.

Many posterior distributions take intractable forms and thus require variational inference where analytical solutions cannot be found. Variational Inference (VI) and Markov Chain Monte Carlo (MCMC) are established mechanisms for approximating these intractable values. An alternative approach to sampling and optimisation for approximation is a direct mapping between the data and the posterior distribution, made possible by recent advances in deep learning methods. Latent Dirichlet Allocation (LDA) is a model which offers an intractable posterior of this nature. In LDA, latent topics are learnt over unlabelled documents to soft-cluster the documents. This paper assesses the viability of learning latent topics by leveraging an autoencoder (in the form of Autoencoding Variational Bayes, AEVB) and compares the mimicked posterior distributions to those achieved by VI. After conducting various experiments, the proposed AEVB delivers inadequate performance: comparable conclusions are achieved only under utopian conditions that are generally unattainable. Further, model specification becomes increasingly complex and deeply circumstantially dependent - which is not in itself a deterrent but does warrant consideration. In a recent study, these concerns were highlighted and discussed theoretically. We confirm the argument empirically by dissecting the autoencoder's iterative process. In investigating the autoencoder, we see performance degrade as models grow in dimensionality. Visualization of the autoencoder reveals a bias towards the initial randomised topics.
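A minimal sketch of the reparameterisation step at the heart of AEVB for LDA (the encoder outputs and topic count below are hypothetical placeholders, not from the paper): a sample from an encoder-parameterised Gaussian is pushed through a softmax to approximate Dirichlet-distributed topic proportions, so gradients can flow through the sampling step.

```python
import math
import random

random.seed(1)

K = 5  # number of latent topics (illustrative)

# In AEVB for LDA, an encoder network maps a document's word counts to the
# parameters (mu, log variance) of a Gaussian. The values below stand in for
# that encoder's output.
mu = [random.gauss(0, 1) for _ in range(K)]
log_var = [random.gauss(0, 0.1) for _ in range(K)]

# Reparameterisation trick: sample eps ~ N(0, 1) so that z is a
# differentiable function of mu and log_var.
eps = [random.gauss(0, 1) for _ in range(K)]
z = [m + e * math.exp(0.5 * lv) for m, e, lv in zip(mu, eps, log_var)]

# Softmax maps the Gaussian sample onto the probability simplex,
# approximating Dirichlet-distributed topic proportions.
exp_z = [math.exp(v) for v in z]
theta = [v / sum(exp_z) for v in exp_z]
print(theta)
```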

@proceedings{254,
  author = {Zach Wolpe and Alta de Waal},
  title = {Autoencoding variational Bayes for latent Dirichlet allocation},
  abstract = {Many posterior distributions take intractable forms and thus require variational inference where analytical solutions cannot be found. Variational Inference (VI) and Markov Chain Monte Carlo (MCMC) are established mechanisms for approximating these intractable values. An alternative approach to sampling and optimisation for approximation is a direct mapping between the data and the posterior distribution, made possible by recent advances in deep learning methods. Latent Dirichlet Allocation (LDA) is a model which offers an intractable posterior of this nature. In LDA, latent topics are learnt over unlabelled documents to soft-cluster the documents. This paper assesses the viability of learning latent topics by leveraging an autoencoder (in the form of Autoencoding Variational Bayes, AEVB) and compares the mimicked posterior distributions to those achieved by VI. After conducting various experiments, the proposed AEVB delivers inadequate performance: comparable conclusions are achieved only under utopian conditions that are generally unattainable. Further, model specification becomes increasingly complex and deeply circumstantially dependent - which is not in itself a deterrent but does warrant consideration. In a recent study, these concerns were highlighted and discussed theoretically. We confirm the argument empirically by dissecting the autoencoder's iterative process. In investigating the autoencoder, we see performance degrade as models grow in dimensionality. Visualization of the autoencoder reveals a bias towards the initial randomised topics.},
  year = {2019},
  journal = {Proceedings of the South African Forum for Artificial Intelligence Research},
  pages = {25-36},
  month = {12/09},
  publisher = {CEUR Workshop Proceedings},
  issn = {1613-0073},
  url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_33.pdf},
}
Weyer VD, de Waal A, Lechner AM, et al. Quantifying rehabilitation risks for surface‐strip coal mines using a soil compaction Bayesian network in South Africa and Australia: To demonstrate the R2AIN Framework. Integrated Environmental Assessment and Management. 2019;15(2). doi:10.1002/ieam.4128.

Environmental information is acquired and assessed during the environmental impact assessment process for surface‐strip coal mine approval. However, integrating these data and quantifying rehabilitation risk using a holistic multidisciplinary approach is seldom undertaken. We present a rehabilitation risk assessment integrated network (R2AIN™) framework that can be applied using Bayesian networks (BNs) to integrate and quantify such rehabilitation risks. Our framework has 7 steps, including key integration of rehabilitation risk sources and the quantification of undesired rehabilitation risk events to the final application of mitigation. We demonstrate the framework using a soil compaction BN case study in the Witbank Coalfield, South Africa and the Bowen Basin, Australia. Our approach allows for a probabilistic assessment of rehabilitation risk associated with multidisciplines to be integrated and quantified. Using this method, a site's rehabilitation risk profile can be determined before mining activities commence and the effects of manipulating management actions during later mine phases to reduce risk can be gauged, to aid decision making.

@article{253,
  author = {Vanessa Weyer and Alta de Waal and Alex Lechner and Corinne Unger and Tim O'Connor and Thomas Baumgartl and Roland Schulze and Wayne Truter},
  title = {Quantifying rehabilitation risks for surface‐strip coal mines using a soil compaction Bayesian network in South Africa and Australia: To demonstrate the R2AIN Framework},
  abstract = {Environmental information is acquired and assessed during the environmental impact assessment process for surface‐strip coal mine approval. However, integrating these data and quantifying rehabilitation risk using a holistic multidisciplinary approach is seldom undertaken. We present a rehabilitation risk assessment integrated network (R2AIN™) framework that can be applied using Bayesian networks (BNs) to integrate and quantify such rehabilitation risks. Our framework has 7 steps, including key integration of rehabilitation risk sources and the quantification of undesired rehabilitation risk events to the final application of mitigation. We demonstrate the framework using a soil compaction BN case study in the Witbank Coalfield, South Africa and the Bowen Basin, Australia. Our approach allows for a probabilistic assessment of rehabilitation risk associated with multidisciplines to be integrated and quantified. Using this method, a site's rehabilitation risk profile can be determined before mining activities commence and the effects of manipulating management actions during later mine phases to reduce risk can be gauged, to aid decision making.},
  year = {2019},
  journal = {Integrated Environmental Assessment and Management},
  volume = {15},
  pages = {190-208},
  issue = {2},
  publisher = {Wiley Online},
  doi = {10.1002/ieam.4128},
}
Toussaint W, Moodley D. Comparison of clustering techniques for residential load profiles in South Africa. Forum for Artificial Intelligence Research. 2019. http://ceur-ws.org/Vol-2540/FAIR2019_paper_55.pdf.

This work compares techniques for clustering metered residential energy consumption data to construct representative daily load profiles in South Africa. The input data captures a population with high variability across temporal, geographic, social and economic dimensions. Different algorithms, normalisation and pre-binning techniques are evaluated to determine their effect on producing a good clustering structure. A Combined Index is developed as a relative score to ease the comparison of experiments across different metrics. The study shows that normalisation, specifically unit norm and the zero-one scaler, produces the best clusters. Pre-binning appears to improve clustering structures as a whole, but its effect on individual experiments remains unclear. Like several previous studies, the k-means algorithm produces the best results. To our knowledge this is the first work that rigorously compares state-of-the-art cluster analysis techniques in the residential energy domain in a developing country context.
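The pipeline the paper evaluates can be sketched on toy data: unit-norm normalisation followed by k-means. The profiles, peak shapes, and cluster count below are invented for illustration; the study's own data, algorithms, and evaluation metrics are far richer.

```python
import math
import random

random.seed(2)

def unit_norm(p):
    """Unit-norm normalisation: cluster on profile *shape*, not magnitude."""
    n = math.sqrt(sum(v * v for v in p))
    return [v / n for v in p]

def kmeans(points, k, iters=20):
    # Deterministic initialisation keeps this sketch reproducible.
    centroids = [points[0], points[-1]][:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        centroids = [[sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters

# Toy 24-hour profiles: morning-peak vs evening-peak households at three
# different consumption magnitudes (invented data).
def profile(peak_hour, scale):
    return [scale * math.exp(-((h - peak_hour) ** 2) / 8) + random.uniform(0, 0.05)
            for h in range(24)]

raw = [profile(7, s) for s in (1, 2, 3)] + [profile(19, s) for s in (1, 2, 3)]
clusters = kmeans([unit_norm(p) for p in raw], k=2)

# Unit-norm puts all three magnitudes of each shape into the same cluster.
print(sorted(len(c) for c in clusters))
```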

@proceedings{249,
  author = {Wiebke Toussaint and Deshen Moodley},
  title = {Comparison of clustering techniques for residential load profiles in South Africa},
  abstract = {This work compares techniques for clustering metered residential energy consumption data to construct representative daily load profiles in South Africa. The input data captures a population with high variability across temporal, geographic, social and economic dimensions. Different algorithms, normalisation and pre-binning techniques are evaluated to determine their effect on producing a good clustering structure. A Combined Index is developed as a relative score to ease the comparison of experiments across different metrics. The study shows that normalisation, specifically unit norm and the zero-one scaler, produces the best clusters. Pre-binning appears to improve clustering structures as a whole, but its effect on individual experiments remains unclear. Like several previous studies, the k-means algorithm produces the best results. To our knowledge this is the first work that rigorously compares state-of-the-art cluster analysis techniques in the residential energy domain in a developing country context.},
  year = {2019},
  journal = {Forum for Artificial Intelligence Research},
  pages = {117-132},
  month = {03/12 - 06/12},
  publisher = {CEUR},
  issn = {1613-0073},
  url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_55.pdf},
}
Britz K, Varzinczak I. Preferential tableaux for contextual defeasible ALC. 28th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX). 2019. https://www.springer.com/gp/book/9783030290252.

In recent work, we addressed an important limitation in previous extensions of description logics to represent defeasible knowledge, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects of the domain. Syntactically, this limitation translates to a context-agnostic notion of defeasible subsumption, which is quite restrictive when it comes to modelling different nuances of defeasibility. Our point of departure in our recent proposal allows for different orderings on the interpretation of roles. This yields a notion of contextual defeasible subsumption, where the context is informed by a role. In the present paper, we extend this work to also provide a proof-theoretic counterpart and associated results. We define a (naïve) tableau-based algorithm for checking preferential consistency of contextual defeasible knowledge bases, a central piece in the definition of other forms of contextual defeasible reasoning over ontologies, notably contextual rational closure.

@proceedings{247,
  author = {Katarina Britz and Ivan Varzinczak},
  title = {Preferential tableaux for contextual defeasible ALC},
  abstract = {In recent work, we addressed an important limitation in previous extensions of description logics to represent defeasible knowledge, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects of the domain. Syntactically, this limitation translates to a context-agnostic notion of defeasible subsumption, which is quite restrictive when it comes to modelling different nuances of defeasibility. Our point of departure in our recent proposal allows for different orderings on the interpretation of roles. This yields a notion of contextual defeasible subsumption, where the context is informed by a role. In the present paper, we extend this work to also provide a proof-theoretic counterpart and associated results. We define a (naïve) tableau-based algorithm for checking preferential consistency of contextual defeasible knowledge bases, a central piece in the definition of other forms of contextual defeasible reasoning over ontologies, notably contextual rational closure.},
  year = {2019},
  journal = {28th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX)},
  pages = {39-57},
  month = {03/09-05/09},
  publisher = {Springer LNAI no. 11714},
  isbn = {978-3-030-29026-9},
  url = {https://www.springer.com/gp/book/9783030290252},
}
Britz K, Varzinczak I. Contextual rational closure for defeasible ALC. Annals of Mathematics and Artificial Intelligence. 2019;87(1-2). doi:10.1007/s10472-019-09658-2.

Description logics have been extended in a number of ways to support defeasible reasoning in the KLM tradition. Such features include preferential or rational defeasible concept inclusion, and defeasible roles in complex concept descriptions. Semantically, defeasible subsumption is obtained by means of a preference order on objects, while defeasible roles are obtained by adding a preference order to role interpretations. In this paper, we address an important limitation in defeasible extensions of description logics, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects. We do this by inducing a modular preference order on objects from each modular preference order on roles, and using these to relativise defeasible subsumption. This yields a notion of contextualised rational defeasible subsumption, with contexts described by roles. We also provide a semantic construction for rational closure and a method for its computation, and present a correspondence result between the two.

@article{246,
  author = {Katarina Britz and Ivan Varzinczak},
  title = {Contextual rational closure for defeasible ALC},
  abstract = {Description logics have been extended in a number of ways to support defeasible reasoning in the KLM tradition. Such features include preferential or rational defeasible concept inclusion, and defeasible roles in complex concept descriptions. Semantically, defeasible subsumption is obtained by means of a preference order on objects, while defeasible roles are obtained by adding a preference order to role interpretations. In this paper, we address an important limitation in defeasible extensions of description logics, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects. We do this by inducing a modular preference order on objects from each modular preference order on roles, and using these to relativise defeasible subsumption. This yields a notion of contextualised rational defeasible subsumption, with contexts described by roles. We also provide a semantic construction for rational closure and a method for its computation, and present a correspondence result between the two.},
  year = {2019},
  journal = {Annals of Mathematics and Artificial Intelligence},
  volume = {87},
  pages = {83-108},
  issue = {1-2},
  issn = {1012-2443},
  url = {https://link.springer.com/article/10.1007/s10472-019-09658-2},
  doi = {10.1007/s10472-019-09658-2},
}
Price CS, Moodley D, Pillay A. Modelling uncertain adaptive decisions: Application to KwaZulu-Natal sugarcane growers. Forum for Artificial Intelligence Research (FAIR2019). 2019. http://ceur-ws.org/Vol-2540/FAIR2019_paper_53.pdf.

A dynamic Bayesian decision network was developed to model the preharvest burning decision-making processes of sugarcane growers in a KwaZulu-Natal sugarcane supply chain and extends previous work by Price et al. (2018). This model was created using an iterative development approach. This paper recounts the development and validation process of the third version of the model. The model was validated using Pitchforth and Mengersen (2013)’s framework for validating expert elicited Bayesian networks. During this process, growers and cane supply members assessed the model in a focus group by executing the model, and reviewing the results of a prerun scenario. The participants were generally positive about how the model represented their decision-making processes. However, they identified some issues that could be addressed in the next iteration. Dynamic Bayesian decision networks offer a promising approach to modelling adaptive decisions in uncertain conditions. This model can be used to simulate the cognitive mechanism for a grower agent in a simulation of a sugarcane supply chain.

@proceedings{244,
  author = {C. Sue Price and Deshen Moodley and Anban Pillay},
  title = {Modelling uncertain adaptive decisions: Application to KwaZulu-Natal sugarcane growers},
  abstract = {A dynamic Bayesian decision network was developed to model the preharvest burning decision-making processes of sugarcane growers in a KwaZulu-Natal sugarcane supply chain and extends previous work by Price et al. (2018). This model was created using an iterative development approach. This paper recounts the development and validation process of the third version of the model. The model was validated using Pitchforth and Mengersen (2013)’s framework for validating expert elicited Bayesian networks. During this process, growers and cane supply members assessed the model in a focus group by executing the model, and reviewing the results of a prerun scenario. The participants were generally positive about how the model represented their decision-making processes. However, they identified some issues that could be addressed in the next iteration. Dynamic Bayesian decision networks offer a promising approach to modelling adaptive decisions in uncertain conditions. This model can be used to simulate the cognitive mechanism for a grower agent in a simulation of a sugarcane supply chain.},
  year = {2019},
  journal = {Forum for Artificial Intelligence Research (FAIR2019)},
  pages = {145-160},
  month = {4/12-6/12},
  publisher = {CEUR},
  address = {Cape Town},
  url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_53.pdf},
}
Ikram MS, Pillay A, Jembere E. Using social networks to enhance a deep learning approach to solve the cold-start problem in recommender systems. Forum for Artificial Intelligence Research (FAIR2019). 2019. http://ceur-ws.org/Vol-2540/FAIR2019_paper_51.pdf.

The Cold-Start problem refers to the initial sparsity of data available to Recommender Systems that leads to poor recommendations to users. This research compares a Deep Learning Approach, a Deep Learning Approach that makes use of social information, and Matrix Factorization. The social information was used to form communities of users. The intuition behind this approach is that users within a given community are likely to have similar interests. A community detection algorithm was used to group users. Thereafter a deep learning model was trained on each community. The comparative models were evaluated on the Yelp Round 9 Academic Dataset. The dataset was pruned to consist only of users with at least 1 social link. The evaluation metrics used were Mean Squared Error (MSE) and Mean Absolute Error (MAE). The evaluation was carried out using 5-fold cross-validation. The results showed that the use of social information improved on the results achieved from the Deep Learning Approach, and grouping users into communities was advantageous. However, the Deep Learning Approach that made use of social information did not outperform SVD++, a state-of-the-art approach for recommender systems. Nevertheless, the new approach shows promise for improving Deep Learning models.
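The community grouping step can be illustrated with a minimal stand-in. The links, ratings, and the use of connected components in place of a full community detection algorithm are all illustrative assumptions; the paper trains a deep learning model per community rather than taking a community mean.

```python
from collections import defaultdict

# Hypothetical social links and ratings -- invented data for illustration.
links = [("a", "b"), ("b", "c"), ("d", "e")]
ratings = {"a": 4.0, "b": 5.0, "c": 4.5, "d": 2.0, "e": 1.0}

def communities(links):
    """Connected components via union-find: a minimal stand-in for a
    community detection algorithm, grouping socially linked users."""
    parent = {}
    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u
    for u, v in links:
        parent[find(u)] = find(v)
    groups = defaultdict(list)
    for u in list(parent):
        groups[find(u)].append(u)
    return list(groups.values())

# One model per community; the community mean rating below stands in for
# the per-community deep learning model trained in the paper.
for group in communities(links):
    mean = sum(ratings[u] for u in group) / len(group)
    print(sorted(group), round(mean, 2))
```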

@proceedings{243,
  author = {Muhammad Ikram and Anban Pillay and Edgar Jembere},
  title = {Using social networks to enhance a deep learning approach to solve the cold-start problem in recommender systems},
  abstract = {The Cold-Start problem refers to the initial sparsity of data available to Recommender Systems that leads to poor recommendations to users. This research compares a Deep Learning Approach, a Deep Learning Approach that makes use of social information, and Matrix Factorization. The social information was used to form communities of users. The intuition behind this approach is that users within a given community are likely to have similar interests. A community detection algorithm was used to group users. Thereafter a deep learning model was trained on each community. The comparative models were evaluated on the Yelp Round 9 Academic Dataset. The dataset was pruned to consist only of users with at least 1 social link. The evaluation metrics used were Mean Squared Error (MSE) and Mean Absolute Error (MAE). The evaluation was carried out using 5-fold cross-validation. The results showed that the use of social information improved on the results achieved from the Deep Learning Approach, and grouping users into communities was advantageous. However, the Deep Learning Approach that made use of social information did not outperform SVD++, a state-of-the-art approach for recommender systems. Nevertheless, the new approach shows promise for improving Deep Learning models.},
  year = {2019},
  journal = {Forum for Artificial Intelligence Research (FAIR2019)},
  pages = {173-184},
  month = {4/12-6/12},
  publisher = {CEUR},
  address = {Cape Town},
  url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_51.pdf},
}
Jeewa A, Pillay A, Jembere E. Directed curiosity-driven exploration in hard exploration, sparse reward environments. Forum for Artificial Intelligence Research (FAIR). 2019. http://ceur-ws.org/Vol-2540/FAIR2019_paper_42.pdf.

Training agents in hard exploration, sparse reward environments is a difficult task since the reward feedback is insufficient for meaningful learning. In this work, we propose a new technique, called Directed Curiosity, that is a hybrid of Curiosity-Driven Exploration and distance-based reward shaping. The technique is evaluated in a custom navigation task where an agent tries to learn the shortest path to a distant target, in environments of varying difficulty. The technique is compared to agents trained with only a shaped reward signal, a curiosity signal as well as a sparse reward signal. It is shown that directed curiosity is the most successful in hard exploration environments, with the benefits of the approach being highlighted in environments with numerous obstacles and decision points. The limitations of the shaped reward function are also discussed.
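The hybrid reward described above can be sketched as follows. The visit-count curiosity stand-in, the weights, and the potential function are illustrative assumptions, not the paper's exact formulation (curiosity-driven exploration typically uses a prediction-error bonus rather than visit counts).

```python
import math

visit_counts = {}

def curiosity_bonus(state):
    """Simple stand-in for a curiosity signal: novelty decays with visits."""
    visit_counts[state] = visit_counts.get(state, 0) + 1
    return 1.0 / math.sqrt(visit_counts[state])

def distance_shaping(prev_state, state, target, gamma=0.99):
    """Potential-based shaping: gamma * phi(s') - phi(s), with phi = -distance,
    which rewards steps that move the agent closer to the target."""
    def phi(s):
        return -math.dist(s, target)
    return gamma * phi(state) - phi(prev_state)

def directed_curiosity_reward(prev_state, state, target, sparse_reward,
                              w_curiosity=0.1, w_shaping=1.0):
    # Hybrid signal: sparse task reward + curiosity bonus + distance shaping.
    return (sparse_reward
            + w_curiosity * curiosity_bonus(state)
            + w_shaping * distance_shaping(prev_state, state, target))

# Moving toward the target yields a positive shaped component even while the
# sparse task reward is still zero.
r = directed_curiosity_reward((0.0, 0.0), (1.0, 0.0), (5.0, 0.0),
                              sparse_reward=0.0)
print(r)
```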

@proceedings{242,
  author = {Asad Jeewa and Anban Pillay and Edgar Jembere},
  title = {Directed curiosity-driven exploration in hard exploration, sparse reward environments},
  abstract = {Training agents in hard exploration, sparse reward environments is a difficult task since the reward feedback is insufficient for meaningful learning. In this work, we propose a new technique, called Directed Curiosity, that is a hybrid of Curiosity-Driven Exploration and distance-based reward shaping. The technique is evaluated in a custom navigation task where an agent tries to learn the shortest path to a distant target, in environments of varying difficulty. The technique is compared to agents trained with only a shaped reward signal, a curiosity signal as well as a sparse reward signal. It is shown that directed curiosity is the most successful in hard exploration environments, with the benefits of the approach being highlighted in environments with numerous obstacles and decision points. The limitations of the shaped reward function are also discussed.},
  year = {2019},
  journal = {Forum for Artificial Intelligence Research (FAIR)},
  pages = {12-24},
  month = {4/12-6/12},
  publisher = {CEUR},
  address = {Cape Town},
  url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_42.pdf},
}
Britz K, Casini G, Meyer T, Varzinczak I. A KLM Perspective on Defeasible Reasoning for Description Logics. In: Description Logic, Theory Combination, and All That. Switzerland: Springer; 2019. doi:10.1007/978-3-030-22102-7_7.

In this paper we present an approach to defeasible reasoning for the description logic ALC. The results discussed here are based on work done by Kraus, Lehmann and Magidor (KLM) on defeasible conditionals in the propositional case. We consider versions of a preferential semantics for two forms of defeasible subsumption, and link these semantic constructions formally to KLM-style syntactic properties via representation results. In addition to showing that the semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. With the semantics of the defeasible version of ALC in place, we turn to the investigation of an appropriate form of defeasible entailment for this enriched version of ALC. This investigation includes an algorithm for the computation of a form of defeasible entailment known as rational closure in the propositional case. Importantly, the algorithm relies completely on classical entailment checks and shows that the computational complexity of reasoning over defeasible ontologies is no worse than that of the underlying classical ALC. Before concluding, we take a brief tour of some existing work on defeasible extensions of ALC that go beyond defeasible subsumption.
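The chapter's central computational point, that rational closure needs only classical entailment checks, can be sketched in the propositional case. The knowledge base below is the classic penguin example, and encoding formulas as Python predicates over truth assignments is an illustrative choice, not the chapter's notation.

```python
from itertools import product

ATOMS = ["bird", "penguin", "flies"]

def entails(premises, conclusion):
    """Classical propositional entailment by truth-table enumeration."""
    for values in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

# Knowledge base: penguins are birds (classical); birds typically fly,
# penguins typically don't (defeasible, as (antecedent, consequent) pairs).
classical = [lambda v: (not v["penguin"]) or v["bird"]]
defeasible = [
    (lambda v: v["bird"], lambda v: v["flies"]),         # bird |~ flies
    (lambda v: v["penguin"], lambda v: not v["flies"]),  # penguin |~ not flies
]

def materialise(pair):
    ant, con = pair
    return lambda v: (not ant(v)) or con(v)

# Rational closure ranking: a conditional moves up a rank when the current
# materialisations classically entail the negation of its antecedent.
# Note that only classical entailment checks are performed.
ranks, remaining = [], list(defeasible)
while remaining:
    mats = classical + [materialise(p) for p in remaining]
    exceptional = [p for p in remaining
                   if entails(mats, lambda v, a=p[0]: not a(v))]
    if len(exceptional) == len(remaining):
        break  # all remaining conditionals share the top rank
    ranks.append([p for p in remaining if p not in exceptional])
    remaining = exceptional

# bird |~ flies lands at rank 0; penguin |~ not flies at rank 1.
print(len(ranks))
```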

@inbook{240,
  author = {Katarina Britz and Giovanni Casini and Thomas Meyer and Ivan Varzinczak},
  title = {A KLM Perspective on Defeasible Reasoning for Description Logics},
  abstract = {In this paper we present an approach to defeasible reasoning for the description logic ALC. The results discussed here are based on work done by Kraus, Lehmann and Magidor (KLM) on defeasible conditionals in the propositional case. We consider versions of a preferential semantics for two forms of defeasible subsumption, and link these semantic constructions formally to KLM-style syntactic properties via representation results. In addition to showing that the semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. With the semantics of the defeasible version of ALC in place, we turn to the investigation of an appropriate form of defeasible entailment for this enriched version of ALC. This investigation includes an algorithm for the computation of a form of defeasible entailment known as rational closure in the propositional case. Importantly, the algorithm relies completely on classical entailment checks and shows that the computational complexity of reasoning over defeasible ontologies is no worse than that of the underlying classical ALC. Before concluding, we take a brief tour of some existing work on defeasible extensions of ALC that go beyond defeasible subsumption.},
  year = {2019},
  journal = {Description Logic, Theory Combination, and All That},
  pages = {147-173},
  publisher = {Springer},
  address = {Switzerland},
  isbn = {978-3-030-22101-0},
  url = {https://link.springer.com/book/10.1007%2F978-3-030-22102-7},
  doi = {https://doi.org/10.1007/978-3-030-22102-7_7},
}
Casini G, Meyer T, Varzinczak I. Taking Defeasible Entailment Beyond Rational Closure. European Conference on Logics in Artificial Intelligence. 2019. doi:https://doi.org/10.1007/978-3-030-19570-0_12.

We present a systematic approach for extending the KLM framework for defeasible entailment. We first present a class of basic defeasible entailment relations, characterise it in three distinct ways and provide a high-level algorithm for computing it. This framework is then refined, with the refined version being characterised in a similar manner. We show that the two well-known forms of defeasible entailment, rational closure and lexicographic closure, fall within our refined framework, that rational closure is the most conservative of the defeasible entailment relations within the framework (with respect to subset inclusion), but that there are forms of defeasible entailment within our framework that are more “adventurous” than lexicographic closure.
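The paper characterises its entailment relations abstractly; as a rough illustration only (not the authors' algorithm or code), the following toy propositional sketch computes the standard BaseRank partition and answers rational-closure queries by brute-force model checking over a tiny, invented bird/penguin vocabulary:

```python
from itertools import product

ATOMS = ["bird", "penguin", "flies", "wings"]

def models(formulas):
    """All truth assignments over ATOMS satisfying every formula."""
    for vals in product([False, True], repeat=len(ATOMS)):
        w = dict(zip(ATOMS, vals))
        if all(f(w) for f in formulas):
            yield w

def entails(formulas, query):
    """Classical entailment: query holds in every model of formulas."""
    return all(query(w) for w in models(formulas))

# Defeasible rules "antecedent |~ consequent" as predicate pairs.
rules = [
    (lambda w: w["bird"],    lambda w: w["flies"]),         # birds typically fly
    (lambda w: w["bird"],    lambda w: w["wings"]),         # birds typically have wings
    (lambda w: w["penguin"], lambda w: w["bird"]),          # penguins are typically birds
    (lambda w: w["penguin"], lambda w: not w["flies"]),     # penguins typically don't fly
]

def materialise(rs):
    """Turn each rule A |~ B into the material implication A -> B."""
    return [lambda w, a=a, c=c: (not a(w)) or c(w) for a, c in rs]

def base_rank(rs):
    """Partition rules by exceptionality: rank i+1 holds the rules whose
    antecedent is classically falsified by the materialisation of rank >= i."""
    ranks, current = [], list(rs)
    while current:
        mat = materialise(current)
        exceptional = [r for r in current
                       if entails(mat, lambda w, r=r: not r[0](w))]
        if len(exceptional) == len(current):  # no progress: infinite rank
            ranks.append(current)
            break
        ranks.append([r for r in current if r not in exceptional])
        current = exceptional
    return ranks

def rc_entails(ranks, ant, cons):
    """Rational closure: drop the lowest ranks until the antecedent is
    consistent, then check the query classically."""
    for i in range(len(ranks)):
        mat = materialise([r for level in ranks[i:] for r in level])
        if not entails(mat, lambda w: not ant(w)):  # antecedent satisfiable
            return entails(mat, lambda w: (not ant(w)) or cons(w))
    return True  # antecedent inconsistent at every level

ranks = base_rank(rules)
```

With these rules, `rc_entails` concludes that penguins don't fly but, true to rational closure's conservatism (the "drowning" effect the more adventurous closures in the paper address), refuses to conclude that penguins have wings.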

@proceedings{238,
  author = {Giovanni Casini and Thomas Meyer and Ivan Varzinczak},
  title = {Taking Defeasible Entailment Beyond Rational Closure},
  abstract = {We present a systematic approach for extending the KLM framework for defeasible entailment. We first present a class of basic defeasible entailment relations, characterise it in three distinct ways and provide a high-level algorithm for computing it. This framework is then refined, with the refined version being characterised in a similar manner. We show that the two well-known forms of defeasible entailment, rational closure and lexicographic closure, fall within our refined framework, that rational closure is the most conservative of the defeasible entailment relations within the framework (with respect to subset inclusion), but that there are forms of defeasible entailment within our framework that are more “adventurous” than lexicographic closure.},
  year = {2019},
  journal = {European Conference on Logics in Artificial Intelligence},
  pages = {182-197},
  month = {07/05 - 11/05},
  publisher = {Springer},
  address = {Switzerland},
  isbn = {978-3-030-19569-4},
  url = {https://link.springer.com/chapter/10.1007%2F978-3-030-19570-0_12},
  doi = {https://doi.org/10.1007/978-3-030-19570-0_12},
}
Botha L, Meyer T, Peñaloza R. A Bayesian Extension of the Description Logic ALC. European Conference on Logics in Artificial Intelligence. 2019. doi:https://doi.org/10.1007/978-3-030-19570-0_22.

Description logics (DLs) are well-known knowledge representation formalisms focused on the representation of terminological knowledge. A probabilistic extension of a light-weight DL was recently proposed for dealing with certain knowledge occurring in uncertain contexts. In this paper, we continue that line of research by introducing the Bayesian extension BALC of the DL ALC. We present a tableau based procedure for deciding consistency, and adapt it to solve other probabilistic, contextual, and general inferences in this logic. We also show that all these problems remain ExpTime-complete, the same as reasoning in the underlying classical ALC.

@proceedings{237,
  author = {Leonard Botha and Thomas Meyer and Rafael Peñaloza},
  title = {A Bayesian Extension of the Description Logic ALC},
  abstract = {Description logics (DLs) are well-known knowledge representation formalisms focused on the representation of terminological knowledge. A probabilistic extension of a light-weight DL was recently proposed for dealing with certain knowledge occurring in uncertain contexts. In this paper, we continue that line of research by introducing the Bayesian extension BALC of the DL ALC. We present a tableau based procedure for deciding consistency, and adapt it to solve other probabilistic, contextual, and general inferences in this logic. We also show that all these problems remain ExpTime-complete, the same as reasoning in the underlying classical ALC.},
  year = {2019},
  journal = {European Conference on Logics in Artificial Intelligence},
  pages = {339-354},
  month = {07/05 - 11/05},
  publisher = {Springer},
  address = {Switzerland},
  isbn = {978-3-030-19569-4},
  url = {https://link.springer.com/chapter/10.1007%2F978-3-030-19570-0_22},
  doi = {https://doi.org/10.1007/978-3-030-19570-0_22},
}
Thirion JWF, van Heerden C, Giwa O, Davel M. The South African directory enquiries (SADE) name corpus. In: Language Resources & Evaluation. Springer; 2019.

We present the design and development of a South African directory enquiries corpus. It contains audio and orthographic transcriptions of a wide range of South African names produced by first-language speakers of four languages, namely Afrikaans, English, isiZulu and Sesotho. Useful as a resource to understand the effect of name language and speaker language on pronunciation, this is the first corpus to also aim to identify the “intended language”: an implicit assumption with regard to word origin made by the speaker of the name. We describe the design, collection, annotation, and verification of the corpus. This includes an analysis of the algorithms used to tag the corpus with meta information that may be beneficial to pronunciation modelling tasks.

@inbook{235,
  author = {Jan Thirion and Charl van Heerden and Oluwapelumi Giwa and Marelie Davel},
  title = {The South African directory enquiries (SADE) name corpus},
  abstract = {We present the design and development of a South African directory enquiries corpus. It contains audio and orthographic transcriptions of a wide range of South African names produced by first-language speakers of four languages, namely Afrikaans, English, isiZulu and Sesotho. Useful as a resource to understand the effect of name language and speaker language on pronunciation, this is the first corpus to also aim to identify the “intended language”: an implicit assumption with regard to word origin made by the speaker of the name. We describe the design, collection, annotation, and verification of the corpus. This includes an analysis of the algorithms used to tag the corpus with meta information that may be beneficial to pronunciation modelling tasks.},
  year = {2019},
  journal = {Language Resources & Evaluation},
  publisher = {Springer},
  isbn = {1574-020X},
}
Theunissen T, Davel M, Barnard E. Insights regarding overfitting on noise in deep learning. In: South African Forum for Artificial Intelligence Research. CEUR workshop proceedings; 2019.

The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.

@inbook{234,
  author = {Tian Theunissen and Marelie Davel and Etienne Barnard},
  title = {Insights regarding overfitting on noise in deep learning},
  abstract = {The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.},
  year = {2019},
  journal = {South African Forum for Artificial Intelligence Research},
  pages = {49-63},
  publisher = {CEUR workshop proceedings},
  isbn = {1613-0073},
}
Pretorius A, Barnard E, Davel M. ReLU and sigmoidal activation functions. In: South African Forum for Artificial Intelligence Research. CEUR workshop proceedings; 2019.

The generalization capabilities of deep neural networks are not well understood, and in particular, the influence of activation functions on generalization has received little theoretical attention. Phenomena such as vanishing gradients, node saturation and network sparsity have been identified as possible factors when comparing different activation functions [1]. We investigate these factors using fully connected feedforward networks on two standard benchmark problems, and find that the most salient differences between networks with sigmoidal and ReLU activations relate to the way that class-distinctive information is propagated through a network.
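One of the factors the abstract names, vanishing gradients, has a simple numerical illustration (this is background to the paper, not the authors' experiment): the sigmoid derivative is bounded by 0.25, so the gradient factor accumulated across many layers shrinks geometrically, whereas the ReLU derivative is exactly 1 on the active side.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def d_sigmoid(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # bounded above by 0.25

def d_relu(z):
    return (np.asarray(z) > 0).astype(float)  # exactly 1 where active

# Gradient factor contributed by the activation alone, accumulated
# through 10 layers, at an illustrative pre-activation value z = 1.0:
depth = 10
sig_factor = d_sigmoid(1.0) ** depth     # ~1e-7: vanishing
relu_factor = float(d_relu(1.0) ** depth)  # 1.0: preserved
```

The weight matrices also scale the backpropagated signal, so this isolates only the activation's contribution; it nonetheless shows why saturation is a plausible factor in the sigmoid/ReLU comparison.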

@inbook{233,
  author = {Arnold Pretorius and Etienne Barnard and Marelie Davel},
  title = {ReLU and sigmoidal activation functions},
  abstract = {The generalization capabilities of deep neural networks are not well understood, and in particular, the influence of activation functions on generalization has received little theoretical attention. Phenomena such as vanishing gradients, node saturation and network sparsity have been identified as possible factors when comparing different activation functions [1]. We investigate these factors using fully connected feedforward networks on two standard benchmark problems, and find that the most salient differences between networks with sigmoidal and ReLU activations relate to the way that class-distinctive information is propagated through a network.},
  year = {2019},
  journal = {South African Forum for Artificial Intelligence Research},
  pages = {37-48},
  publisher = {CEUR workshop proceedings},
  isbn = {1613-0073},
}
Lotz S, Beukes J, Davel M. Input parameter ranking for neural networks in a space weather regression problem. In: South African Forum for Artificial Intelligence Research. CEUR workshop proceedings; 2019.

Geomagnetic storms are multi-day events characterised by significant perturbations to the magnetic field of the Earth, driven by solar activity. Numerous efforts have been undertaken to utilise in-situ measurements of the solar wind plasma to predict perturbations to the geomagnetic field measured on the ground. Typically, solar wind measurements are used as input parameters to a regression problem tasked with predicting a perturbation index such as the 1-minute cadence symmetric-H (Sym-H) index. We re-visit this problem, with two important twists: (i) An adapted feedforward neural network topology is designed to enable the pairwise analysis of input parameter weights. This enables the ranking of input parameters in terms of importance to output accuracy, without the need to train numerous models. (ii) Geomagnetic storm phase information is incorporated as model inputs and shown to increase performance. This is motivated by the fact that different physical phenomena are at play during different phases of a geomagnetic storm.

@inbook{232,
  author = {Stefan Lotz and Jacques Beukes and Marelie Davel},
  title = {Input parameter ranking for neural networks in a space weather regression problem},
  abstract = {Geomagnetic storms are multi-day events characterised by significant perturbations to the magnetic field of the Earth, driven by solar activity. Numerous efforts have been undertaken to utilise in-situ measurements of the solar wind plasma to predict perturbations to the geomagnetic field measured on the ground. Typically, solar wind measurements are used as input parameters to a regression problem tasked with predicting a perturbation index such as the 1-minute cadence symmetric-H (Sym-H) index. We re-visit this problem, with two important twists: (i) An adapted feedforward neural network topology is designed to enable the pairwise analysis of input parameter weights. This enables the ranking of input parameters in terms of importance to output accuracy, without the need to train numerous models. (ii) Geomagnetic storm phase information is incorporated as model inputs and shown to increase performance. This is motivated by the fact that different physical phenomena are at play during different phases of a geomagnetic storm.},
  year = {2019},
  journal = {South African Forum for Artificial Intelligence Research},
  pages = {133-144},
  publisher = {CEUR workshop proceedings},
  isbn = {1613-0073},
}
Krynauw D, Davel M, Lotz S. Solar flare prediction with temporal convolutional networks (Work in progress). In: South African Forum for Artificial Intelligence Research. CEUR workshop proceedings; 2019.

Sequences are typically modelled with recurrent architectures, but growing research is finding convolutional architectures to also work well for sequence modelling [1]. We explore the performance of Temporal Convolutional Networks (TCNs) when applied to an important sequence modelling task: solar flare prediction. We take this approach, as our future goal is to apply techniques developed for probing and interpreting general convolutional neural networks (CNNs) to solar flare prediction.
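The defining ingredients of a TCN are causal convolutions (no output depends on future inputs) with exponentially growing dilation. A bare-bones NumPy sketch of that mechanism, with invented kernel sizes and no residual connections (so an illustration of the architecture family, not the paper's model):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """y[t] = sum_j w[j] * x[t - j*dilation]; left zero-padding keeps the
    output the same length as the input and strictly causal."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[pad + t - j * dilation] for j in range(k))
                     for t in range(len(x))])

def tcn_forward(x, layers):
    """Stack of causal convolutions with per-layer dilation and a ReLU
    between layers: the core of a TCN body (residuals/normalisation omitted)."""
    h = x
    for dilation, w in layers:
        h = np.maximum(causal_dilated_conv(h, w, dilation), 0.0)
    return h

rng = np.random.default_rng(0)
x = rng.normal(size=30)
# Kernel size 2, dilations 1, 2, 4: receptive field 1 + (2-1)*(1+2+4) = 8 steps.
layers = [(d, rng.normal(size=2)) for d in (1, 2, 4)]
y = tcn_forward(x, layers)
```

Causality is what makes this a fair sequence model for forecasting tasks like flare prediction: perturbing the input at time t can only change outputs at times >= t.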

@inbook{231,
  author = {Dewald Krynauw and Marelie Davel and Stefan Lotz},
  title = {Solar flare prediction with temporal convolutional networks (Work in progress)},
  abstract = {Sequences are typically modelled with recurrent architectures, but growing research is finding convolutional architectures to also work well for sequence modelling [1]. We explore the performance of Temporal Convolutional Networks (TCNs) when applied to an important sequence modelling task: solar flare prediction. We take this approach, as our future goal is to apply techniques developed for probing and interpreting general convolutional neural networks (CNNs) to solar flare prediction.},
  year = {2019},
  journal = {South African Forum for Artificial Intelligence Research},
  pages = {Work in progress},
  publisher = {CEUR workshop proceedings},
  isbn = {1613-0073},
}
Davel M. Activation gap generators in neural networks. South African Forum for Artificial Intelligence Research. 2019.

No framework exists that can explain and predict the generalisation ability of DNNs in general circumstances. In fact, this question has not been addressed for some of the least complicated of neural network architectures: fully-connected feedforward networks with ReLU activations and a limited number of hidden layers. Building on recent work [2] that demonstrates the ability of individual nodes in a hidden layer to draw class-specific activation distributions apart, we show how a simplified network architecture can be analysed in terms of these activation distributions, and more specifically, the sample distances or activation gaps each node produces. We provide a theoretical perspective on the utility of viewing nodes as activation gap generators, and define the gap conditions that are guaranteed to result in perfect classification of a set of samples. We support these conclusions with empirical results.

@proceedings{230,
  author = {Marelie Davel},
  title = {Activation gap generators in neural networks},
  abstract = {No framework exists that can explain and predict the generalisation ability of DNNs in general circumstances. In fact, this question has not been addressed for some of the least complicated of neural network architectures: fully-connected feedforward networks with ReLU activations and a limited number of hidden layers. Building on recent work [2] that demonstrates the ability of individual nodes in a hidden layer to draw class-specific activation distributions apart, we show how a simplified network architecture can be analysed in terms of these activation distributions, and more specifically, the sample distances or activation gaps each node produces. We provide a theoretical perspective on the utility of viewing nodes as activation gap generators, and define the gap conditions that are guaranteed to result in perfect classification of a set of samples. We support these conclusions with empirical results.},
  year = {2019},
  journal = {South African Forum for Artificial Intelligence Research},
  pages = {64-76},
  month = {04/12-06/12/2019},
  publisher = {CEUR workshop proceedings},
  isbn = {1613-0073},
}
Du Toit T, Berndt J, Britz K, Fischer B. ConceptCloud 2.0: Visualisation and exploration of geolocation-rich semi-structured data sets. ICFCA 2019 Conference and Workshops. 2019. http://ceur-ws.org/Vol-2378/.

ConceptCloud is a flexible interactive tool for exploring, visualising, and analysing semi-structured data sets. It uses a combination of an intuitive tag cloud visualisation with an underlying concept lattice to provide a formal structure for navigation through a data set. ConceptCloud 2.0 extends the tool with an integrated map view to exploit the geolocation aspect of data. The tool’s implementation of exploratory search does not require prior knowledge of the structure of the data or compromise on scalability, and provides seamless navigation through the tag cloud and the map viewer.

@misc{227,
  author = {Tiaan Du Toit and Joshua Berndt and Katarina Britz and Bernd Fischer},
  title = {ConceptCloud 2.0: Visualisation and exploration of geolocation-rich semi-structured data sets},
  abstract = {ConceptCloud is a flexible interactive tool for exploring, visualising, and analysing semi-structured data sets. It uses a combination of an intuitive tag cloud visualisation with an underlying concept lattice to provide a formal structure for navigation through a data set. ConceptCloud 2.0 extends the tool with an integrated map view to exploit the geolocation aspect of data. The tool’s implementation of exploratory search does not require prior knowledge of the structure of the data or compromise on scalability, and provides seamless navigation through the tag cloud and the map viewer.},
  year = {2019},
  journal = {ICFCA 2019 Conference and Workshops},
  month = {06/2019},
  publisher = {CEUR-WS},
  isbn = {1613-0073},
  url = {http://ceur-ws.org/Vol-2378/},
}
Casini G, Meyer T, Varzinczak I. Simple Conditionals with Constrained Right Weakening. International Joint Conference on Artificial Intelligence. 2019. doi:10.24963/ijcai.2019/226.

In this paper we introduce and investigate a very basic semantics for conditionals that can be used to define a broad class of conditional reasoning systems. We show that it encompasses the most popular kinds of conditional reasoning developed in logic-based KR. It turns out that the semantics we propose is appropriate for a structural analysis of those conditionals that do not satisfy the property of Right Weakening. We show that it can be used for the further development of an analysis of the notion of relevance in conditional reasoning.

@proceedings{226,
  author = {Giovanni Casini and Thomas Meyer and Ivan Varzinczak},
  title = {Simple Conditionals with Constrained Right Weakening},
  abstract = {In this paper we introduce and investigate a very basic semantics for conditionals that can be used to define a broad class of conditional reasoning systems. We show that it encompasses the most popular kinds of conditional reasoning developed in logic-based KR. It turns out that the semantics we propose is appropriate for a structural analysis of those conditionals that do not satisfy the property of Right Weakening. We show that it can be used for the further development of an analysis of the notion of relevance in conditional reasoning.},
  year = {2019},
  journal = {International Joint Conference on Artificial Intelligence},
  pages = {1632-1638},
  month = {10/08-16/08},
  publisher = {International Joint Conferences on Artificial Intelligence},
  isbn = {978-0-9992411-4-1},
  url = {https://www.ijcai.org/Proceedings/2019/0226.pdf},
  doi = {10.24963/ijcai.2019/226},
}
Morris M, Ross T, Meyer T. Defeasible disjunctive datalog. Forum for Artificial Intelligence Research. 2019. http://ceur-ws.org/Vol-2540/FAIR2019_paper_38.pdf.

Datalog is a declarative logic programming language that uses classical logical reasoning as its basic form of reasoning. Defeasible reasoning is a form of non-classical reasoning that is able to deal with exceptions to general assertions in a formal manner. The KLM approach to defeasible reasoning is an axiomatic approach based on the concept of plausible inference. Since Datalog uses classical reasoning, it is currently not able to handle defeasible implications and exceptions. We aim to extend the expressivity of Datalog by incorporating KLM-style defeasible reasoning into classical Datalog. We present a systematic approach to extending the KLM properties and a well-known form of defeasible entailment: Rational Closure. We conclude by exploring Datalog extensions of less conservative forms of defeasible entailment: Relevant and Lexicographic Closure.

@proceedings{225,
  author = {Matthew Morris and Tala Ross and Thomas Meyer},
  title = {Defeasible disjunctive datalog},
  abstract = {Datalog is a declarative logic programming language that uses classical logical reasoning as its basic form of reasoning. Defeasible reasoning is a form of non-classical reasoning that is able to deal with exceptions to general assertions in a formal manner. The KLM approach to defeasible reasoning is an axiomatic approach based on the concept of plausible inference. Since Datalog uses classical reasoning, it is currently not able to handle defeasible implications and exceptions. We aim to extend the expressivity of Datalog by incorporating KLM-style defeasible reasoning into classical Datalog. We present a systematic approach to extending the KLM properties and a well-known form of defeasible entailment: Rational Closure. We conclude by exploring Datalog extensions of less conservative forms of defeasible entailment: Relevant and Lexicographic Closure.},
  year = {2019},
  journal = {Forum for Artificial Intelligence Research},
  pages = {208-219},
  month = {03/12-06/12},
  publisher = {CEUR},
  isbn = {1613-0073},
  url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_38.pdf},
}
Harrison M, Meyer T. Rational preferential reasoning for datalog. Forum for Artificial Intelligence Research. 2019. http://ceur-ws.org/Vol-2540/FAIR2019_paper_67.pdf.

Datalog is a powerful language that can be used to represent explicit knowledge and compute inferences in knowledge bases. Datalog cannot represent or reason about contradictory rules, though. This is a limitation as contradictions are often present in domains that contain exceptions. In this paper, we extend datalog to represent contradictory and defeasible information. We define an approach to efficiently reason about contradictory information in datalog and show that it satisfies the KLM requirements for a rational consequence relation. Finally, we introduce an implementation of this approach in the form of a defeasible datalog reasoning tool and evaluate the performance of this tool.

@proceedings{224,
  author = {Michael Harrison and Thomas Meyer},
  title = {Rational preferential reasoning for datalog},
  abstract = {Datalog is a powerful language that can be used to represent explicit knowledge and compute inferences in knowledge bases. Datalog cannot represent or reason about contradictory rules, though. This is a limitation as contradictions are often present in domains that contain exceptions. In this paper, we extend datalog to represent contradictory and defeasible information. We define an approach to efficiently reason about contradictory information in datalog and show that it satisfies the KLM requirements for a rational consequence relation. Finally, we introduce an implementation of this approach in the form of a defeasible datalog reasoning tool and evaluate the performance of this tool.},
  year = {2019},
  journal = {Forum for Artificial Intelligence Research},
  pages = {232-243},
  month = {03/12-06/12},
  publisher = {CEUR},
  isbn = {1613-0073},
  url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_67.pdf},
}
Chingoma J, Meyer T. Forrester’s paradox using typicality. Forum for Artificial Intelligence Research. 2019. http://ceur-ws.org/Vol-2540/FAIR2019_paper_54.pdf.

Deontic logic is a logic often used to formalise scenarios in the legal domain. Within the legal domain there are many exceptions and conflicting obligations. This motivates the enrichment of deontic logic with a notion of typicality which is based on defeasibility, with defeasibility allowing for reasoning about exceptions. Propositional Typicality Logic (PTL) is a logic that employs typicality. Deontic paradoxes are often used to examine logic systems as they provide undesirable results even if the scenarios seem intuitive. Forrester’s paradox is one of the most famous of these paradoxes. This paper shows that PTL can be used to represent and reason with Forrester’s paradox in such a way as to block undesirable conclusions without sacrificing desirable deontic properties.

@proceedings{223,
  author = {Julian Chingoma and Thomas Meyer},
  title = {Forrester’s paradox using typicality},
  abstract = {Deontic logic is a logic often used to formalise scenarios in the legal domain. Within the legal domain there are many exceptions and conflicting obligations. This motivates the enrichment of deontic logic with a notion of typicality which is based on defeasibility, with defeasibility allowing for reasoning about exceptions. Propositional Typicality Logic (PTL) is a logic that employs typicality. Deontic paradoxes are often used to examine logic systems as they provide undesirable results even if the scenarios seem intuitive. Forrester’s paradox is one of the most famous of these paradoxes. This paper shows that PTL can be used to represent and reason with Forrester’s paradox in such a way as to block undesirable conclusions without sacrificing desirable deontic properties.},
  year = {2019},
  journal = {Forum for Artificial Intelligence Research},
  pages = {220-231},
  month = {03/12-06/12},
  publisher = {CEUR},
  isbn = {1613-0073},
  url = {http://ceur-ws.org/Vol-2540/FAIR2019_paper_54.pdf},
}