People
Latest Research Publications:
@article{433,
author = {Katarina Britz and Giovanni Casini and Tommie Meyer and Kody Moodley and Uli Sattler and Ivan Varzinczak},
title = {Principles of KLM-style Defeasible Description Logics},
abstract = {The past 25 years have seen many attempts to introduce defeasible-reasoning capabilities into a description logic setting. Many, if not most, of these attempts are based on preferential extensions of description logics, with a significant number of these, in turn, following the so-called KLM approach to defeasible reasoning initially advocated for propositional logic by Kraus, Lehmann, and Magidor. Each of these attempts has its own aim of investigating particular constructions and variants of the (KLM-style) preferential approach. Here our aim is to provide a comprehensive study of the formal foundations of preferential defeasible reasoning for description logics in the KLM tradition. We start by investigating a notion of defeasible subsumption in the spirit of defeasible conditionals as studied by Kraus, Lehmann, and Magidor in the propositional case. In particular, we consider a natural and intuitive semantics for defeasible subsumption, and we investigate KLM-style syntactic properties for both preferential and rational subsumption. Our contribution includes two representation results linking our semantic
constructions to the set of preferential and rational properties considered. Besides showing that our semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. Indeed, we also analyse the problem of non-monotonic reasoning in description logics at the level of entailment and present an algorithm for the computation of rational closure of a defeasible knowledge base. Importantly, our algorithm relies completely on classical entailment and shows that the computational complexity of reasoning over defeasible knowledge bases is no worse than that of reasoning in the underlying classical DL ALC.},
year = {2020},
journal = {ACM Transactions on Computational Logic},
volume = {22},
number = {1},
pages = {1-46},
publisher = {ACM},
url = {https://dl.acm.org/doi/10.1145/3420258},
doi = {10.1145/3420258},
}
@inproceedings{247,
author = {Katarina Britz, Ivan Varzinczak},
title = {Preferential tableaux for contextual defeasible ALC},
abstract = {In recent work, we addressed an important limitation in previous extensions of description logics to represent defeasible knowledge, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects of the domain. Syntactically, this limitation translates to a context-agnostic notion of defeasible subsumption, which is quite restrictive when it comes to modelling different nuances of defeasibility. Our point of departure in our recent proposal allows for different orderings on the interpretation of roles. This yields a notion of contextual defeasible subsumption, where the context is informed by a role. In the present paper, we extend this work to also provide a proof-theoretic counterpart and associated results. We define a (naïve) tableau-based algorithm for checking preferential consistency of contextual defeasible knowledge bases, a central piece in the definition of other forms of contextual defeasible reasoning over ontologies, notably contextual rational closure.},
year = {2019},
journal = {28th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX)},
pages = {39-57},
month = {03/09-05/09},
publisher = {Springer},
series = {LNAI 11714},
isbn = {978-3-030-29026-9},
url = {https://www.springer.com/gp/book/9783030290252},
}
@article{246,
author = {Katarina Britz, Ivan Varzinczak},
title = {Contextual rational closure for defeasible ALC},
abstract = {Description logics have been extended in a number of ways to support defeasible reasoning in the KLM tradition. Such features include preferential or rational defeasible concept inclusion, and defeasible roles in complex concept descriptions. Semantically, defeasible subsumption is obtained by means of a preference order on objects, while defeasible roles are obtained by adding a preference order to role interpretations. In this paper, we address an important limitation in defeasible extensions of description logics, namely the restriction in the semantics of defeasible concept inclusion to a single preference order on objects. We do this by inducing a modular preference order on objects from each modular preference order on roles, and using these to relativise defeasible subsumption. This yields a notion of contextualised rational defeasible subsumption, with contexts described by roles. We also provide a semantic construction for rational closure and a method for its computation, and present a correspondence result between the two.},
year = {2019},
journal = {Annals of Mathematics and Artificial Intelligence},
volume = {87},
pages = {83-108},
issue = {1-2},
issn = {1012-2443},
url = {https://link.springer.com/article/10.1007/s10472-019-09658-2},
doi = {10.1007/s10472-019-09658-2},
}
@inbook{240,
author = {Katarina Britz and Giovanni Casini and Tommie Meyer and Ivan Varzinczak},
title = {A KLM Perspective on Defeasible Reasoning for Description Logics},
abstract = {In this paper we present an approach to defeasible reasoning for the description logic ALC. The results discussed here are based on work done by Kraus, Lehmann and Magidor (KLM) on defeasible conditionals in the propositional case. We consider versions of a preferential semantics for two forms of defeasible subsumption, and link these semantic constructions formally to KLM-style syntactic properties via representation results. In addition to showing that the semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in description logics. With the semantics of the defeasible version of ALC in place, we turn to the investigation of an appropriate form of defeasible entailment for this enriched version of ALC. This investigation includes an algorithm for the computation of a form of defeasible entailment known as rational closure in the propositional case. Importantly, the algorithm relies completely on classical entailment checks and shows that the computational complexity of reasoning over defeasible ontologies is no worse than that of the underlying classical ALC. Before concluding, we take a brief tour of some existing work on defeasible extensions of ALC that go beyond defeasible subsumption.},
year = {2019},
booktitle = {Description Logic, Theory Combination, and All That},
pages = {147–173},
publisher = {Springer},
address = {Switzerland},
isbn = {978-3-030-22101-0},
url = {https://link.springer.com/book/10.1007%2F978-3-030-22102-7},
doi = {10.1007/978-3-030-22102-7_7},
}
@inproceedings{227,
author = {Tiaan Du Toit and Joshua Berndt and Katarina Britz and Bernd Fischer},
title = {ConceptCloud 2.0: Visualisation and exploration of geolocation-rich semi-structured data sets},
abstract = {ConceptCloud is a flexible interactive tool for exploring, visualising, and analysing semi-structured data sets. It uses a combination of an intuitive tag cloud visualisation with an underlying concept lattice to provide a formal structure for navigation through a data set. ConceptCloud 2.0 extends the tool with an integrated map view to exploit the geolocation aspect of data. The tool’s implementation of exploratory search does not require prior knowledge of the structure of the data or compromise on scalability, and provides seamless navigation through the tag cloud and the map viewer.},
year = {2019},
journal = {ICFCA 2019 Conference and Workshops},
month = {06/2019},
publisher = {CEUR-WS},
issn = {1613-0073},
url = {http://ceur-ws.org/Vol-2378/},
}
@inproceedings{430,
author = {Giovanni Casini and Tommie Meyer and Ivan Varzinczak},
title = {Contextual Conditional Reasoning},
abstract = {We extend the expressivity of classical conditional reasoning by introducing context as a new parameter. The enriched
conditional logic generalises the defeasible conditional setting in the style of Kraus, Lehmann, and Magidor, and allows for a refined semantics that is able to distinguish, for example, between expectations and counterfactuals. In this paper we introduce the language for the enriched logic and define an appropriate semantic framework for it. We analyse which properties generally associated with conditional reasoning are still satisfied by the new semantic framework, provide a suitable representation result, and define an entailment relation based on Lehmann and Magidor’s generally-accepted notion of Rational Closure.},
year = {2021},
journal = {35th AAAI Conference on Artificial Intelligence},
pages = {6254-6261},
month = {02/02/2021-09/02/2021},
publisher = {AAAI Press},
address = {Online},
}
@inproceedings{429,
author = {Giovanni Casini and Tommie Meyer and Guy Paterson-Jones},
title = {KLM-Style Defeasibility for Restricted First-Order Logic},
abstract = {We extend the KLM approach to defeasible reasoning to be applicable to a restricted version of first-order logic. We describe defeasibility for this logic using a set of rationality postulates, provide an appropriate semantics for it, and present a representation result that characterises the semantic description of defeasibility in terms of the rationality postulates. Based on this theoretical core, we then propose a version of defeasible entailment that is inspired by Rational Closure as it is defined for defeasible propositional logic and defeasible description logics. We show that this form of defeasible entailment is rational in the sense that it adheres to our rationality postulates. The work in this paper is the first step towards our ultimate goal of introducing KLM-style defeasible reasoning into the family of Datalog+/- ontology languages.},
year = {2021},
journal = {19th International Workshop on Non-Monotonic Reasoning},
pages = {184-193},
month = {03/11/2021-05/11/2021},
address = {Online},
url = {https://drive.google.com/open?id=1WSIl3TOrXBhaWhckWN4NLXoD9AVFKp5R},
}
@inproceedings{383,
author = {Guy Paterson-Jones and Giovanni Casini and Tommie Meyer},
title = {BKLM - An expressive logic for defeasible reasoning},
abstract = {Propositional KLM-style defeasible reasoning involves a core propositional logic capable of expressing defeasible (or conditional) implications. The semantics for this logic is based on Kripke-like structures known as ranked interpretations. KLM-style defeasible entailment is referred to as rational whenever the defeasible entailment relation under consideration generates a set of defeasible implications all satisfying a set of rationality postulates known as the KLM postulates. In a recent paper Booth et al. proposed PTL, a logic that is more expressive than the core KLM logic. They proved an impossibility result, showing that defeasible entailment for PTL fails to satisfy a set of rationality postulates similar in spirit to the KLM postulates. Their interpretation of the impossibility result is that defeasible entailment for PTL need not be unique.
In this paper we continue the line of research in which the expressivity of the core KLM logic is extended. We present the logic Boolean KLM (BKLM) in which we allow for disjunctions, conjunctions, and negations, but not nesting, of defeasible implications. Our contribution is twofold. Firstly, we show (perhaps surprisingly) that BKLM is more expressive than PTL. Our proof is based on the fact that BKLM can characterise all single ranked interpretations, whereas PTL cannot. Secondly, given that the PTL impossibility result also applies to BKLM, we adapt the different forms of PTL entailment proposed by Booth et al. to apply to BKLM.},
year = {2020},
journal = {18th International Workshop on Non-Monotonic Reasoning},
month = {12/09/2020-24/09/2020},
}
@inproceedings{382,
author = {Giovanni Casini and Tommie Meyer and Ivan Varzinczak},
title = {Rational Defeasible Belief Change},
abstract = {We present a formal framework for modelling belief change within a non-monotonic reasoning system. Belief change and non-monotonic reasoning are two areas that are formally closely related, with recent attention being paid towards the analysis of belief change within a non-monotonic environment. In this paper we consider the classical AGM belief change operators, contraction and revision, applied to a defeasible setting in the style of Kraus, Lehmann, and Magidor. The investigation leads us to the formal characterisation of a number of classes of defeasible belief change operators. For the most interesting classes we need to consider the problem of iterated belief change, generalising the classical work of Darwiche and Pearl in the process. Our work involves belief change operators aimed at ensuring logical consistency, as well as the characterisation of analogous operators aimed at obtaining coherence, an important notion within the field of logic-based ontologies.},
year = {2020},
journal = {17th International Conference on Principles of Knowledge Representation and Reasoning (KR 2020)},
pages = {213-222},
month = {12/09/2020},
publisher = {IJCAI},
address = {Virtual},
url = {https://library.confdna.com/kr/2020/},
doi = {10.24963/kr.2020/22},
}
Marelie obtained her undergraduate degrees (Computer Science & Mathematics) from Stellenbosch University, receiving the Dean’s medal as the best student in the Faculty of Science at the end of her Honours degree. Prior to joining NWU, Marelie was a principal researcher and research group leader at the South African CSIR, involved in technology-oriented research and development. Her research group focussed on speech technology development in under-resourced environments; in 2005, she received her PhD from the University of Pretoria (UP), with a thesis on bootstrapping pronunciation models, at the time one of the core ‘missing’ components in developing speech recognition for South African languages.
In 2011, Marelie joined NWU, becoming the Director of MuST in 2014. MuST is a focussed research environment with an emphasis on postgraduate training and delivering on externally-focussed projects. Recent projects include the development of an automatic speech transcription platform for the South African government, development of a new multilingual text-to-speech corpus in collaboration with Internet giant Google, and being part of the winning consortium of the BABEL project: a 5-year internationally collaborative challenge aimed at solving the spoken term detection task for under-resourced languages.
Over the past few years, Marelie has supervised 23 postgraduate students, all producing research related to the theory and applications of machine learning. She frequently serves on scientific committees both nationally and internationally (AAAI, IJCAI, Interspeech, SLT, MediaEval, ICASSP, SLTU), is the NWU group representative at the national Centre for Artificial Intelligence Research (CAIR), and is an NRF-rated researcher. Since 2003, she has published 100 peer-reviewed papers related to machine learning; she has an h-index of 21 and an i10-index of 37.
Latest Research Publications:
@article{521,
author = {Aldrin Ngorima and Albert Helberg and Marelie Davel},
title = {Simplified Temporal Convolutional-Based Channel Estimation for a WiFi Vehicular Communication Channel},
abstract = {},
year = {2025},
journal = {IEEE 3rd Wireless Africa Conference (WAC)},
pages = {1-5},
month = {02/2025},
publisher = {IEEE},
address = {Pretoria, South Africa},
isbn = {979-8-3315-1758-8},
doi = {10.1109/WAC63911.2025.10992609},
}
@article{520,
author = {William Brooks and Marelie Davel and Coenraad Mouton},
title = {Does Simple Trump Complex? Comparing Strategies for Adversarial Robustness in DNNs},
abstract = {},
year = {2024},
journal = {Artificial Intelligence Research. SACAIR 2024. Communications in Computer and Information Science},
volume = {2326},
pages = {253-269},
month = {12/2024},
publisher = {Springer Nature Switzerland},
address = {Cham},
doi = {10.1007/978-3-031-78255-8_15},
}
@article{518,
author = {Harmen Potgieter and Coenraad Mouton and Marelie Davel},
title = {Impact of Batch Normalization on Convolutional Network Representations},
abstract = {Batch normalization (BatchNorm) is a popular layer normalization technique used when training deep neural networks. It has been shown to enhance the training speed and accuracy of deep learning models. However, the mechanics by which BatchNorm achieves these benefits is an active area of research, and different perspectives have been proposed. In this paper, we investigate the effect of BatchNorm on the resulting hidden representations, that is, the vectors of activation values formed as samples are processed at each hidden layer. Specifically, we consider the sparsity of these representations, as well as their implicit clustering – the creation of groups of representations that are similar to some extent. We contrast image classification models trained with and without batch normalization and highlight consistent differences observed. These findings highlight that BatchNorm’s effect on representational sparsity is not a significant factor affecting generalization, while the representations of models trained with BatchNorm tend to show more advantageous clustering characteristics.},
year = {2024},
journal = {Artificial Intelligence Research. SACAIR 2024. Communications in Computer and Information Science},
volume = {2326},
pages = {235--252},
month = {December},
publisher = {Springer Nature Switzerland},
address = {Cham},
doi = {10.1007/978-3-031-78255-8_14},
}
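The entry above compares the sparsity of hidden representations in models trained with and without batch normalization. A minimal NumPy sketch of the two quantities being contrasted (the shift value, layer width, and threshold are illustrative assumptions, not the authors' setup):

```python
import numpy as np

def batchnorm(acts, eps=1e-5):
    """Normalize each feature over the batch dimension (inference-style,
    without learned scale/shift parameters)."""
    mu = acts.mean(axis=0)
    var = acts.var(axis=0)
    return (acts - mu) / np.sqrt(var + eps)

def relu_sparsity(acts):
    """Fraction of activation values that ReLU zeroes out."""
    return float(np.mean(np.maximum(acts, 0.0) == 0.0))

rng = np.random.default_rng(0)
# Hidden pre-activations with a positive shift, so plain ReLU keeps most units active.
pre = rng.normal(loc=1.0, scale=1.0, size=(256, 64))

plain = relu_sparsity(pre)              # sparsity without normalization
normed = relu_sparsity(batchnorm(pre))  # BatchNorm re-centres each feature,
                                        # so roughly half the values are zeroed
```

Because BatchNorm re-centres every feature around zero, it pushes post-ReLU sparsity towards one half regardless of the pre-activation distribution, which is why the paper can study sparsity and clustering as separate effects.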
@article{517,
author = {Simon Ramalepe and Thipe Modipa and Marelie Davel},
title = {Pre-training a Transformer-Based Generative Model Using a Small Sepedi Dataset},
abstract = {Due to the scarcity of data in low-resourced languages, the development of language models for these languages has been very slow. Currently, pre-trained language models have gained popularity in natural language processing, especially in developing domain-specific models for low-resourced languages. In this study, we experiment with the impact of using occlusion-based techniques when training a language model for a text generation task. We curate two new datasets, the Sepedi monolingual (SepMono) dataset from several South African resources and the Sepedi radio news (SepNews) dataset from the radio news domain. We use the SepMono dataset to pre-train transformer-based models using the occlusion and non-occlusion pre-training techniques and compare performance. The SepNews dataset is specifically used for fine-tuning. Our results show that the non-occlusion models perform better compared to the occlusion-based models when measuring validation loss and perplexity. However, analysis of the generated text using the BLEU score metric, which measures the quality of the generated text, shows a slightly higher BLEU score for the occlusion-based models compared to the non-occlusion models.},
year = {2024},
journal = {Artificial Intelligence Research. SACAIR 2024. Communications in Computer and Information Science},
volume = {2326},
pages = {319--333},
month = {December},
publisher = {Springer Nature Switzerland},
address = {Cham},
doi = {10.1007/978-3-031-78255-8_19},
}
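The occlusion-based pre-training compared in the entry above amounts to hiding parts of the input sequence and training the model to recover them. A toy sketch of that masking step (the mask symbol, mask rate, and example phrase are illustrative assumptions, not the paper's configuration):

```python
import random

MASK = "<mask>"

def occlude(tokens, rate=0.15, rng=None):
    """Replace a random subset of tokens with a mask symbol, returning the
    occluded sequence and the (position, original token) pairs to predict."""
    rng = rng or random.Random(0)
    occluded, targets = list(tokens), []
    for i, tok in enumerate(tokens):
        if rng.random() < rate:
            occluded[i] = MASK
            targets.append((i, tok))
    return occluded, targets

sent = "ke a leboga kudu".split()  # a short Sepedi phrase
occluded, targets = occlude(sent, rate=0.5)
```

During pre-training the model would be scored only on the masked positions, while the non-occlusion baseline sees the full sequence and predicts the next token directly.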
@article{516,
author = {Aldrin Ngorima and Albert Helberg and Marelie Davel},
title = {Neural Network-Based Vehicular Channel Estimation Performance: Effect of Noise in the Training Set},
abstract = {Vehicular communication systems face significant challenges due to high mobility and rapidly changing environments, which affect the channel over which the signals travel. To address these challenges, neural network (NN)-based channel estimation methods have been suggested. These methods are primarily trained on high signal-to-noise ratio (SNR) data, with the assumption that training a NN in less noisy conditions can result in good generalisation. This study examines the effectiveness of training NN-based channel estimators on mixed SNR datasets compared to training solely on high SNR datasets, as seen in several related works. Estimators evaluated in this work include an architecture that uses convolutional layers and self-attention mechanisms; a method that employs temporal convolutional networks and data pilot-aided estimation; two methods that combine classical methods with multilayer perceptrons; and the current state-of-the-art model that combines Long Short-Term Memory networks with data pilot-aided and temporal averaging methods as post-processing. Our results indicate that using only high SNR data for training is not always optimal, and the SNR range in the training dataset should be treated as a hyperparameter that can be adjusted for better performance. This is illustrated by the better performance of some models in low SNR conditions when trained on the mixed SNR dataset, as opposed to when trained exclusively on high SNR data.},
year = {2024},
journal = {Artificial Intelligence Research. SACAIR 2024. Communications in Computer and Information Science},
volume = {2326},
pages = {192--206},
month = {December},
publisher = {Springer Nature Switzerland},
address = {Cham},
isbn = {978-3-031-78255-8},
doi = {10.1007/978-3-031-78255-8_12},
}
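The entry above treats the SNR range of the training set as a hyperparameter. A minimal sketch of how the two training regimes being compared could be generated, using additive white Gaussian noise over unit-power symbols (the symbol alphabet, sequence length, and SNR ranges are illustrative assumptions, not the paper's simulation setup):

```python
import numpy as np

def awgn(signal, snr_db, rng):
    """Add white Gaussian noise at a given SNR (in dB) to a unit-power signal."""
    noise_power = 10 ** (-snr_db / 10)
    return signal + rng.normal(scale=np.sqrt(noise_power), size=signal.shape)

def make_training_set(n, snr_range_db, rng):
    """Training pairs (noisy, clean); the SNR range acts as a hyperparameter."""
    clean = rng.choice([-1.0, 1.0], size=(n, 32))   # BPSK-like unit-power symbols
    snrs = rng.uniform(*snr_range_db, size=n)       # draw a per-sample SNR
    noisy = np.stack([awgn(x, s, rng) for x, s in zip(clean, snrs)])
    return noisy, clean

rng = np.random.default_rng(1)
high_only = make_training_set(100, (30.0, 40.0), rng)  # high-SNR-only regime
mixed = make_training_set(100, (0.0, 40.0), rng)       # mixed-SNR regime
```

An estimator trained on `mixed` sees noise levels spanning the deployment conditions, which is the mechanism the paper credits for the better low-SNR performance of some models.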
• Complex time series data often encountered in scientific and engineering domains.
• Deep learning (DL) is particularly successful here:
– large data sets, multivariate input and/or output,
– highly complex sequences of interactions.
• Model interpretability:
– Ability to understand a model’s decisions in a given context [1].
– Techniques typically not originally developed for time series data.
– Time series interpretations themselves become uninterpretable.
• Knowledge Discovery:
– DL has potential to reveal interesting patterns in large data sets.
– Potential to produce novel insights about the task itself [2, 3].
• ‘know-it’: Collaborative project that studies knowledge discovery in time series data.
@article{507,
author = {Marelie Davel and Stefan Lotz and Marthinus Theunissen and Almaro De Villiers and Chara Grant and Randle Rabe and Cleo Conacher},
title = {Knowledge Discovery in Time Series Data},
abstract = {• Complex time series data often encountered in scientific and engineering domains.
• Deep learning (DL) is particularly successful here:
– large data sets, multivariate input and/or output,
– highly complex sequences of interactions.
• Model interpretability:
– Ability to understand a model’s decisions in a given context [1].
– Techniques typically not originally developed for time series data.
– Time series interpretations themselves become uninterpretable.
• Knowledge Discovery:
– DL has potential to reveal interesting patterns in large data sets.
– Potential to produce novel insights about the task itself [2, 3].
• ‘know-it’: Collaborative project that studies knowledge discovery in time series data.},
year = {2023},
journal = {Deep Learning Indaba 2023},
month = {September},
}


