Research Publications

2021

Tollon F. Designed to Seduce: Epistemically Retrograde Ideation and YouTube's Recommender System. International Journal of Technoethics (IJT). 2021;12(2). doi:10.4018/IJT.2021070105.

Up to 70% of all watch time on YouTube is due to the suggested content of its recommender system. This system has been found, by virtue of its design, to be promoting conspiratorial content. In this paper, the author firstly critiques the value neutrality thesis regarding technology, showing it to be philosophically untenable. This means that technological artefacts can influence what people come to value (or perhaps even embody values themselves) and change the moral evaluation of an action. Secondly, he introduces the concept of an affordance, borrowed from the literature on ecological psychology. This concept allows him to make salient how technologies come to solicit certain kinds of actions from users, making such actions more or less likely, and in this way influencing the kinds of things one comes to value. Thirdly, he critically assesses the results of a study by Alfano et al. He makes use of the literature on affordances, introduced earlier, to shed light on how these technological systems come to mediate our perception of the world and influence action.

@article{415,
  author = {Fabio Tollon},
  title = {Designed to Seduce: Epistemically Retrograde Ideation and YouTube's Recommender System},
  abstract = {Up to 70% of all watch time on YouTube is due to the suggested content of its recommender system. This system has been found, by virtue of its design, to be promoting conspiratorial content. In this paper, the author firstly critiques the value neutrality thesis regarding technology, showing it to be philosophically untenable. This means that technological artefacts can influence what people come to value (or perhaps even embody values themselves) and change the moral evaluation of an action. Secondly, he introduces the concept of an affordance, borrowed from the literature on ecological psychology. This concept allows him to make salient how technologies come to solicit certain kinds of actions from users, making such actions more or less likely, and in this way influencing the kinds of things one comes to value. Thirdly, he critically assesses the results of a study by Alfano et al. He makes use of the literature on affordances, introduced earlier, to shed light on how these technological systems come to mediate our perception of the world and influence action.},
  year = {2021},
  journal = {International Journal of Technoethics (IJT)},
  volume = {12},
  number = {2},
  publisher = {IGI Global},
  isbn = {9781799861492},
  url = {https://www.igi-global.com/gateway/article/281077},
  doi = {10.4018/IJT.2021070105},
}
Tollon F. Artifacts and affordances: from designed properties to possibilities for action. AI & SOCIETY Journal of Knowledge, Culture and Communication. 2021;36(1). doi:10.1007/s00146-021-01155-7.

In this paper I critically evaluate the value neutrality thesis regarding technology, and find it wanting. I then introduce the various ways in which artifacts can come to influence moral value, and our evaluation of moral situations and actions. Here, following van de Poel and Kroes, I introduce the idea of value sensitive design. Specifically, I show how by virtue of their designed properties, artifacts may come to embody values. Such accounts, however, have several shortcomings. In agreement with Michael Klenk, I raise epistemic and metaphysical issues with respect to designed properties embodying value. The concept of an affordance, borrowed from ecological psychology, provides a more philosophically fruitful grounding to the potential way(s) in which artifacts might embody values. This is due to the way in which it incorporates key insights from perception more generally, and how we go about determining possibilities for action in our environment specifically. The affordance account as it is presented by Klenk, however, is insufficient. I therefore argue that we understand affordances based on whether they are meaningful, and, secondly, that we grade them based on their force.

@article{386,
  author = {Fabio Tollon},
  title = {Artifacts and affordances: from designed properties to possibilities for action},
  abstract = {In this paper I critically evaluate the value neutrality thesis regarding technology, and find it wanting. I then introduce the various ways in which artifacts can come to influence moral value, and our evaluation of moral situations and actions. Here, following van de Poel and Kroes, I introduce the idea of value sensitive design. Specifically, I show how by virtue of their designed properties, artifacts may come to embody values. Such accounts, however, have several shortcomings. In agreement with Michael Klenk, I raise epistemic and metaphysical issues with respect to designed properties embodying value. The concept of an affordance, borrowed from ecological psychology, provides a more philosophically fruitful grounding to the potential way(s) in which artifacts might embody values. This is due to the way in which it incorporates key insights from perception more generally, and how we go about determining possibilities for action in our environment specifically. The affordance account as it is presented by Klenk, however, is insufficient. I therefore argue that we understand affordances based on whether they are meaningful, and, secondly, that we grade them based on their force.},
  year = {2021},
  journal = {AI \& SOCIETY Journal of Knowledge, Culture and Communication},
  volume = {36},
  number = {1},
  publisher = {Springer},
  url = {https://link.springer.com/article/10.1007%2Fs00146-021-01155-7},
  doi = {10.1007/s00146-021-01155-7},
}

2020

Kaliski A, Meyer T. Quo Vadis KLM-style Defeasible Reasoning? In: First Southern African Conference for Artificial Intelligence Research. Virtual: SACAIR2020; 2020. https://2020.sacair.org.za/wp-content/uploads/2021/02/SACAIR_Proceedings-MainBook_Finv4_compressed.pdf?_ga=2.116601743.849395099.1621802506-572599210.1621419278.

The field of defeasible reasoning has a variety of frameworks, all of which are constructed with the view of codifying the patterns of common-sense reasoning inherent to human reasoning. One of these frameworks was first described by Kraus, Lehmann and Magidor, and is accordingly referred to as the KLM framework. Initially defined in propositional logic, it has since been imported into description and modal logics, and implemented into many defeasible reasoning engines. However, there are many ways in which this framework may be advanced theoretically, and many opportunities for it to be applied. This paper covers some of the most prominent areas of future work and possible applications of this framework, with the intention that anyone who has recently familiarized themselves with this approach may then have an understanding of the kind of work in which they could engage.

@inproceedings{414,
  author = {Adam Kaliski and Thomas Meyer},
  title = {Quo Vadis KLM-style Defeasible Reasoning?},
  abstract = {The field of defeasible reasoning has a variety of frameworks, all of which are constructed with the view of codifying the patterns of common-sense reasoning inherent to human reasoning. One of these frameworks was first described by Kraus, Lehmann and Magidor, and is accordingly referred to as the KLM framework. Initially defined in propositional logic, it has since been imported into description and modal logics, and implemented into many defeasible reasoning engines. However, there are many ways in which this framework may be advanced theoretically, and many opportunities for it to be applied. This paper covers some of the most prominent areas of future work and possible applications of this framework, with the intention that anyone who has recently familiarized themselves with this approach may then have an understanding of the kind of work in which they could engage.},
  year = {2020},
  booktitle = {First Southern African Conference for Artificial Intelligence Research},
  pages = {231-246},
  month = {22/02/2021},
  publisher = {SACAIR2020},
  address = {Virtual},
  isbn = {978-0-620-89373-2},
  url = {https://2020.sacair.org.za/wp-content/uploads/2021/02/SACAIR_Proceedings-MainBook_Finv4_compressed.pdf?_ga=2.116601743.849395099.1621802506-572599210.1621419278},
}
Paterson-Jones G, Meyer T. A Boolean Extension of KLM-style Conditional Reasoning. In: First Southern African Conference for AI Research (SACAIR 2020). Virtual: Springer; 2020. doi:10.1007/978-3-030-66151-9_15.

Propositional KLM-style defeasible reasoning involves extending propositional logic with a new logical connective that can express defeasible (or conditional) implications, with semantics given by ordered structures known as ranked interpretations. KLM-style defeasible entailment is referred to as rational whenever the defeasible entailment relation under consideration generates a set of defeasible implications all satisfying a set of rationality postulates known as the KLM postulates. In a recent paper Booth et al. proposed PTL, a logic that is more expressive than the core KLM logic. They proved an impossibility result, showing that defeasible entailment for PTL fails to satisfy a set of rationality postulates similar in spirit to the KLM postulates. Their interpretation of the impossibility result is that defeasible entailment for PTL need not be unique. In this paper we continue the line of research in which the expressivity of the core KLM logic is extended. We present the logic Boolean KLM (BKLM) in which we allow for disjunctions, conjunctions, and negations, but not nesting, of defeasible implications. Our contribution is twofold. Firstly, we show (perhaps surprisingly) that BKLM is more expressive than PTL. Our proof is based on the fact that BKLM can characterise all single ranked interpretations, whereas PTL cannot. Secondly, given that the PTL impossibility result also applies to BKLM, we adapt the different forms of PTL entailment proposed by Booth et al. to apply to BKLM.

@inproceedings{413,
  author = {Guy Paterson-Jones and Thomas Meyer},
  title = {A Boolean Extension of KLM-style Conditional Reasoning},
  abstract = {Propositional KLM-style defeasible reasoning involves extending propositional logic with a new logical connective that can express defeasible (or conditional) implications, with semantics given by ordered structures known as ranked interpretations. KLM-style defeasible entailment is referred to as rational whenever the defeasible entailment relation under consideration generates a set of defeasible implications all satisfying a set of rationality postulates known as the KLM postulates. In a recent paper Booth et al. proposed PTL, a logic that is more expressive than the core KLM logic. They proved an impossibility result, showing that defeasible entailment for PTL fails to satisfy a set of rationality postulates similar in spirit to the KLM postulates. Their interpretation of the impossibility result is that defeasible entailment for PTL need not be unique. In this paper we continue the line of research in which the expressivity of the core KLM logic is extended. We present the logic Boolean KLM (BKLM) in which we allow for disjunctions, conjunctions, and negations, but not nesting, of defeasible implications. Our contribution is twofold. Firstly, we show (perhaps surprisingly) that BKLM is more expressive than PTL. Our proof is based on the fact that BKLM can characterise all single ranked interpretations, whereas PTL cannot. Secondly, given that the PTL impossibility result also applies to BKLM, we adapt the different forms of PTL entailment proposed by Booth et al. to apply to BKLM.},
  year = {2020},
  booktitle = {First Southern African Conference for AI Research (SACAIR 2020)},
  pages = {236-252},
  month = {22/02/2021},
  publisher = {Springer},
  address = {Virtual},
  isbn = {978-3-030-66151-9},
  url = {https://link.springer.com/book/10.1007/978-3-030-66151-9},
  doi = {10.1007/978-3-030-66151-9_15},
}
Baker C, Denny C, Frend P, Meyer T. Cognitive Defeasible Reasoning: the Extent to which Forms of Defeasible Reasoning Correspond with Human Reasoning. In: First Southern African Conference for AI Research (SACAIR 2020). Virtual: Springer; 2020. doi:10.1007/978-3-030-66151-9_13.

Classical logic forms the basis of knowledge representation and reasoning in AI. In the real world, however, classical logic alone is insufficient to describe the reasoning behaviour of human beings. It lacks the flexibility so characteristically required of reasoning under uncertainty, reasoning under incomplete information and reasoning with new information, as humans must. In response, non-classical extensions to propositional logic have been formulated, to provide non-monotonicity. It has been shown in previous studies that human reasoning exhibits non-monotonicity. This work is the product of merging three independent studies, each one focusing on a different formalism for non-monotonic reasoning: KLM defeasible reasoning, AGM belief revision and KM belief update. We investigate, for each of the postulates propounded to characterise these logic forms, the extent to which they have correspondence with human reasoners. We do this via three respective experiments and present each of the postulates in concrete and abstract form. We discuss related work, our experiment design, testing and evaluation, and report on the results from our experiments. We find evidence to believe that 1 out of 5 KLM defeasible reasoning postulates, 3 out of 8 AGM belief revision postulates and 4 out of 8 KM belief update postulates conform in both the concrete and abstract case. For each experiment, we performed an additional investigation. In the experiments of KLM defeasible reasoning and AGM belief revision, we analyse the explanations given by participants to determine whether the postulates have a normative or descriptive relationship with human reasoning. We find evidence that suggests, overall, KLM defeasible reasoning has a normative relationship with human reasoning while AGM belief revision has a descriptive relationship with human reasoning. In the experiment of KM belief update, we discuss counter-examples to the KM postulates.

@inproceedings{412,
  author = {Clayton Baker and Claire Denny and Paul Frend and Thomas Meyer},
  title = {Cognitive Defeasible Reasoning: the Extent to which Forms of Defeasible Reasoning Correspond with Human Reasoning},
  abstract = {Classical logic forms the basis of knowledge representation and reasoning in AI. In the real world, however, classical logic alone is insufficient to describe the reasoning behaviour of human beings. It lacks the flexibility so characteristically required of reasoning under uncertainty, reasoning under incomplete information and reasoning with new information, as humans must. In response, non-classical extensions to propositional logic have been formulated, to provide non-monotonicity. It has been shown in previous studies that human reasoning exhibits non-monotonicity. This work is the product of merging three independent studies, each one focusing on a different formalism for non-monotonic reasoning: KLM defeasible reasoning, AGM belief revision and KM belief update. We investigate, for each of the postulates propounded to characterise these logic forms, the extent to which they have correspondence with human reasoners. We do this via three respective experiments and present each of the postulates in concrete and abstract form. We discuss related work, our experiment design, testing and evaluation, and report on the results from our experiments. We find evidence to believe that 1 out of 5 KLM defeasible reasoning postulates, 3 out of 8 AGM belief revision postulates and 4 out of 8 KM belief update postulates conform in both the concrete and abstract case. For each experiment, we performed an additional investigation. In the experiments of KLM defeasible reasoning and AGM belief revision, we analyse the explanations given by participants to determine whether the postulates have a normative or descriptive relationship with human reasoning. We find evidence that suggests, overall, KLM defeasible reasoning has a normative relationship with human reasoning while AGM belief revision has a descriptive relationship with human reasoning. In the experiment of KM belief update, we discuss counter-examples to the KM postulates.},
  year = {2020},
  booktitle = {First Southern African Conference for AI Research (SACAIR 2020)},
  pages = {199-219},
  month = {22/02/2021},
  publisher = {Springer},
  address = {Virtual},
  isbn = {978-3-030-66151-9},
  url = {https://link.springer.com/book/10.1007/978-3-030-66151-9},
  doi = {10.1007/978-3-030-66151-9_13},
}
Harrison M, Meyer T. DDLV: A System for rational preferential reasoning for datalog. South African Computer Journal. 2020;32(2). doi:10.18489/sacj.v32i2.850.

Datalog is a powerful language that can be used to represent explicit knowledge and compute inferences in knowledge bases. Datalog cannot, however, represent or reason about contradictory rules. This is a limitation as contradictions are often present in domains that contain exceptions. In this paper, we extend Datalog to represent contradictory and defeasible information. We define an approach to efficiently reason about contradictory information in Datalog and show that it satisfies the KLM requirements for a rational consequence relation. We introduce DDLV, a defeasible Datalog reasoning system that implements this approach. Finally, we evaluate the performance of DDLV.

@article{411,
  author = {Michael Harrison and Thomas Meyer},
  title = {DDLV: A System for rational preferential reasoning for datalog},
  abstract = {Datalog is a powerful language that can be used to represent explicit knowledge and compute inferences in knowledge bases. Datalog cannot, however, represent or reason about contradictory rules. This is a limitation as contradictions are often present in domains that contain exceptions. In this paper, we extend Datalog to represent contradictory and defeasible information. We define an approach to efficiently reason about contradictory information in Datalog and show that it satisfies the KLM requirements for a rational consequence relation. We introduce DDLV, a defeasible Datalog reasoning system that implements this approach. Finally, we evaluate the performance of DDLV.},
  year = {2020},
  journal = {South African Computer Journal},
  volume = {32},
  pages = {184-217},
  number = {2},
  publisher = {SACJ},
  address = {Online},
  issn = {2313-7835},
  doi = {10.18489/sacj.v32i2.850},
}
Chingoma J, Meyer T. Defeasibility applied to Forrester’s paradox. South African Computer Journal. 2020;32(2). doi:10.18489/sacj.v32i2.848.

Deontic logic is a logic often used to formalise scenarios in the legal domain. Within the legal domain there are many exceptions and conflicting obligations. This motivates the enrichment of deontic logic with not only the notion of defeasibility, which allows for reasoning about exceptions, but a stronger notion of typicality that is based on defeasibility. KLM-style defeasible reasoning is a logic system that employs defeasibility while Propositional Typicality Logic (PTL) is a logic that does the same for the notion of typicality. Deontic paradoxes are often used to examine logic systems as the paradoxes provide undesirable results even if the scenarios seem intuitive. Forrester’s paradox is one of the most famous of these paradoxes. This paper shows that KLM-style defeasible reasoning and PTL can be used to represent and reason with Forrester’s paradox in such a way as to block undesirable conclusions without completely sacrificing desirable deontic properties.

@article{410,
  author = {Julian Chingoma and Thomas Meyer},
  title = {Defeasibility applied to Forrester’s paradox},
  abstract = {Deontic logic is a logic often used to formalise scenarios in the legal domain. Within the legal domain there are many exceptions and conflicting obligations. This motivates the enrichment of deontic logic with not only the notion of defeasibility, which allows for reasoning about exceptions, but a stronger notion of typicality that is based on defeasibility. KLM-style defeasible reasoning is a logic system that employs defeasibility while Propositional Typicality Logic (PTL) is a logic that does the same for the notion of typicality. Deontic paradoxes are often used to examine logic systems as the paradoxes provide undesirable results even if the scenarios seem intuitive. Forrester’s paradox is one of the most famous of these paradoxes. This paper shows that KLM-style defeasible reasoning and PTL can be used to represent and reason with Forrester’s paradox in such a way as to block undesirable conclusions without completely sacrificing desirable deontic properties.},
  year = {2020},
  journal = {South African Computer Journal},
  volume = {32},
  pages = {161-183},
  number = {2},
  publisher = {SACJ},
  address = {Online},
  issn = {2313-7835},
  doi = {10.18489/sacj.v32i2.848},
}
Morris M, Ross T, Meyer T. Algorithmic definitions for KLM-style defeasible disjunctive Datalog. South African Computer Journal. 2020;32(2). doi:10.18489/sacj.v32i2.846.

Datalog is a declarative logic programming language that uses classical logical reasoning as its basic form of reasoning. Defeasible reasoning is a form of non-classical reasoning that is able to deal with exceptions to general assertions in a formal manner. The KLM approach to defeasible reasoning is an axiomatic approach based on the concept of plausible inference. Since Datalog uses classical reasoning, it is currently not able to handle defeasible implications and exceptions. We aim to extend the expressivity of Datalog by incorporating KLM-style defeasible reasoning into classical Datalog. We present a systematic approach for extending the KLM properties and a well-known form of defeasible entailment: Rational Closure. We conclude by exploring Datalog extensions of less conservative forms of defeasible entailment: Relevant and Lexicographic Closure. We provide algorithmic definitions for these forms of defeasible entailment and prove that the definitions are LM-rational.

@article{409,
  author = {Matthew Morris and Tala Ross and Thomas Meyer},
  title = {Algorithmic definitions for KLM-style defeasible disjunctive Datalog},
  abstract = {Datalog is a declarative logic programming language that uses classical logical reasoning as its basic form of reasoning. Defeasible reasoning is a form of non-classical reasoning that is able to deal with exceptions to general assertions in a formal manner. The KLM approach to defeasible reasoning is an axiomatic approach based on the concept of plausible inference. Since Datalog uses classical reasoning, it is currently not able to handle defeasible implications and exceptions. We aim to extend the expressivity of Datalog by incorporating KLM-style defeasible reasoning into classical Datalog. We present a systematic approach for extending the KLM properties and a well-known form of defeasible entailment: Rational Closure. We conclude by exploring Datalog extensions of less conservative forms of defeasible entailment: Relevant and Lexicographic Closure. We provide algorithmic definitions for these forms of defeasible entailment and prove that the definitions are LM-rational.},
  year = {2020},
  journal = {South African Computer Journal},
  volume = {32},
  pages = {141-160},
  number = {2},
  publisher = {SACJ},
  address = {Online},
  issn = {2313-7835},
  doi = {10.18489/sacj.v32i2.846},
}
Toussaint W, Moodley D. Clustering Residential Electricity Consumption Data to Create Archetypes that Capture Household Behaviour in South Africa. South African Computer Journal. 2020;32(2). doi:10.18489/sacj.v32i2.845.

Clustering is frequently used in the energy domain to identify dominant electricity consumption patterns of households, which can be used to construct customer archetypes for long term energy planning. Selecting a useful set of clusters however requires extensive experimentation and domain knowledge. While internal clustering validation measures are well established in the electricity domain, they are limited for selecting useful clusters. Based on an application case study in South Africa, we present an approach for formalising implicit expert knowledge as external evaluation measures to create customer archetypes that capture variability in residential electricity consumption behaviour. By combining internal and external validation measures in a structured manner, we were able to evaluate clustering structures based on the utility they present for our application. We validate the selected clusters in a use case where we successfully reconstruct customer archetypes previously developed by experts. Our approach shows promise for transparent and repeatable cluster ranking and selection by data scientists, even if they have limited domain knowledge.

@article{408,
  author = {Wiebke Toussaint and Deshen Moodley},
  title = {Clustering Residential Electricity Consumption Data to Create Archetypes that Capture Household Behaviour in South Africa},
  abstract = {Clustering is frequently used in the energy domain to identify dominant electricity consumption patterns of households, which can be used to construct customer archetypes for long term energy planning. Selecting a useful set of clusters however requires extensive experimentation and domain knowledge. While internal clustering validation measures are well established in the electricity domain, they are limited for selecting useful clusters. Based on an application case study in South Africa, we present an approach for formalising implicit expert knowledge as external evaluation measures to create customer archetypes that capture variability in residential electricity consumption behaviour. By combining internal and external validation measures in a structured manner, we were able to evaluate clustering structures based on the utility they present for our application. We validate the selected clusters in a use case where we successfully reconstruct customer archetypes previously developed by experts. Our approach shows promise for transparent and repeatable cluster ranking and selection by data scientists, even if they have limited domain knowledge.},
  year = {2020},
  journal = {South African Computer Journal},
  volume = {32},
  pages = {1-34},
  number = {2},
  publisher = {SACJ},
  address = {Online},
  issn = {2313-7835},
  url = {http://www.scielo.org.za/scielo.php?pid=S2313-78352020000200003&script=sci_arttext&tlng=en},
  doi = {10.18489/sacj.v32i2.845},
}
Kouassi K, Moodley D. An Analysis of Deep Neural Networks for Predicting Trends in Time Series Data. In: First Southern African Conference for AI Research (SACAIR 2020). Virtual: Springer; 2020. doi:10.1007/978-3-030-66151-9_8.

Recently, a hybrid Deep Neural Network (DNN) algorithm, TreNet was proposed for predicting trends in time series data. While TreNet was shown to have superior performance for trend prediction to other DNN and traditional ML approaches, the validation method used did not take into account the sequential nature of time series datasets and did not deal with model update. In this research we replicated the TreNet experiments on the same datasets using a walk-forward validation method and tested our best model over multiple independent runs to evaluate model stability. We compared the performance of the hybrid TreNet algorithm, on four datasets to vanilla DNN algorithms that take in point data, and also to traditional ML algorithms. We found that in general TreNet still performs better than the vanilla DNN models, but not on all datasets as reported in the original TreNet study. This study highlights the importance of using an appropriate validation method and evaluating model stability for evaluating and developing machine learning models for trend prediction in time series data.

@inproceedings{407,
  author = {Kouame Kouassi and Deshen Moodley},
  title = {An Analysis of Deep Neural Networks for Predicting Trends in Time Series Data},
  abstract = {Recently, a hybrid Deep Neural Network (DNN) algorithm, TreNet was proposed for predicting trends in time series data. While TreNet was shown to have superior performance for trend prediction to other DNN and traditional ML approaches, the validation method used did not take into account the sequential nature of time series datasets and did not deal with model update. In this research we replicated the TreNet experiments on the same datasets using a walk-forward validation method and tested our best model over multiple independent runs to evaluate model stability. We compared the performance of the hybrid TreNet algorithm, on four datasets to vanilla DNN algorithms that take in point data, and also to traditional ML algorithms. We found that in general TreNet still performs better than the vanilla DNN models, but not on all datasets as reported in the original TreNet study. This study highlights the importance of using an appropriate validation method and evaluating model stability for evaluating and developing machine learning models for trend prediction in time series data.},
  year = {2020},
  booktitle = {First Southern African Conference for AI Research (SACAIR 2020)},
  pages = {119-140},
  month = {22/02/2021},
  publisher = {Springer},
  address = {Virtual},
  isbn = {978-3-030-66151-9},
  url = {https://link.springer.com/book/10.1007/978-3-030-66151-9},
  doi = {10.1007/978-3-030-66151-9_8},
}
Wanyana T, Moodley D, Meyer T. An Ontology for Supporting Knowledge Discovery and Evolution. In: First Southern African Conference for Artificial Intelligence Research. Virtual: SACAIR2020; 2020. https://2020.sacair.org.za/wp-content/uploads/2021/02/SACAIR_Proceedings-MainBook_Finv4_compressed.pdf?_ga=2.116601743.849395099.1621802506-572599210.1621419278.

Knowledge Discovery and Evolution (KDE) is of interest to a broad array of researchers from both Philosophy of Science (PoS) and Artificial Intelligence (AI), in particular, Knowledge Representation and Reasoning (KR), Machine Learning and Data Mining (ML-DM) and the Agent Based Systems (ABS) communities. In PoS, Haig recently proposed a so-called broad theory of scientific method that uses abduction for generating theories to explain phenomena. He refers to this method of scientific inquiry as the Abductive Theory of Method (ATOM). In this paper, we analyse ATOM, align it with KR and ML-DM perspectives and propose an algorithm and an ontology for supporting agent based knowledge discovery and evolution based on ATOM. We illustrate the use of the algorithm and the ontology on a use case application for electricity consumption behaviour in residential households.

@inproceedings{405,
  author = {Tezira Wanyana and Deshen Moodley and Thomas Meyer},
  title = {An Ontology for Supporting Knowledge Discovery and Evolution},
  abstract = {Knowledge Discovery and Evolution (KDE) is of interest to a broad array of researchers from both Philosophy of Science (PoS) and Artificial Intelligence (AI), in particular, Knowledge Representation and Reasoning (KR), Machine Learning and Data Mining (ML-DM) and the Agent Based Systems (ABS) communities. In PoS, Haig recently proposed a so-called broad theory of scientific method that uses abduction for generating theories to explain phenomena. He refers to this method of scientific inquiry as the Abductive Theory of Method (ATOM). In this paper, we analyse ATOM, align it with KR and ML-DM perspectives and propose an algorithm and an ontology for supporting agent based knowledge discovery and evolution based on ATOM. We illustrate the use of the algorithm and the ontology on a use case application for electricity consumption behaviour in residential households.},
  year = {2020},
  journal = {First Southern African Conference for Artificial Intelligence Research},
  pages = {206-221},
  month = {22/02/2021},
  publisher = {SACAIR2020},
  address = {Virtual},
  isbn = {978-0-620-89373-2},
  url = {https://2020.sacair.org.za/wp-content/uploads/2021/02/SACAIR_Proceedings-MainBook_Finv4_compressed.pdf?_ga=2.116601743.849395099.1621802506-572599210.1621419278},
}
Lamprecht DB, Barnard E. Using a meta-model to compensate for training-evaluation mismatches. In: Southern African Conference for Artificial Intelligence Research. South Africa; 2020. https://sacair.org.za/proceedings/.

One of the fundamental assumptions of machine learning is that learnt models are applied to data that is identically distributed to the training data. This assumption is often not realistic: for example, data collected from a single source at different times may not be distributed identically, due to sampling bias or changes in the environment. We propose a new architecture called a meta-model which predicts performance for unseen models. This approach is applicable when several ‘proxy’ datasets are available to train a model to be deployed on a ‘target’ test set; the architecture is used to identify which regression algorithms should be used as well as which datasets are most useful to train for a given target dataset. Finally, we demonstrate the strengths and weaknesses of the proposed meta-model by making use of artificially generated datasets using a variation of the Friedman method 3 used to generate artificial regression datasets, and discuss real-world applications of our approach.

@inproceedings{404,
  author = {Dylan Lamprecht and Etienne Barnard},
  title = {Using a meta-model to compensate for training-evaluation mismatches},
  abstract = {One of the fundamental assumptions of machine learning is that learnt models are applied to data that is identically distributed to the training data. This assumption is often not realistic: for example, data collected from a single source at different times may not be distributed identically, due to sampling bias or changes in the environment. We propose a new architecture called a meta-model which predicts performance for unseen models. This approach is applicable when several ‘proxy’ datasets are available to train a model to be deployed on a ‘target’ test set; the architecture is used to identify which regression algorithms should be used as well as which datasets are most useful to train for a given target dataset. Finally, we demonstrate the strengths and weaknesses of the proposed meta-model by making use of artificially generated datasets using a variation of the Friedman method 3 used to generate artificial regression datasets, and discuss real-world applications of our approach.},
  year = {2020},
  journal = {Southern African Conference for Artificial Intelligence Research},
  pages = {321-334},
  month = {22/02/2021 - 26/02/2021},
  address = {South Africa},
  isbn = {978-0-620-89373-2},
  url = {https://sacair.org.za/proceedings/},
}
Heyns N, Barnard E. Optimising word embeddings for recognised multilingual speech. In: Southern African Conference for Artificial Intelligence Research. South Africa; 2020. https://sacair.org.za/proceedings/.

Word embeddings are widely used in natural language processing (NLP) tasks. Most work on word embeddings focuses on monolingual languages with large available datasets. For embeddings to be useful in a multilingual environment, as in South Africa, the training techniques have to be adjusted to cater for a) multiple languages, b) smaller datasets and c) the occurrence of code-switching. One of the biggest roadblocks is to obtain datasets that include examples of natural code-switching, since code switching is generally avoided in written material. A solution to this problem is to use speech recognised data. Embedding packages like Word2Vec and GloVe have default hyper-parameter settings that are usually optimised for training on large datasets and evaluation on analogy tasks. When using embeddings for problems such as text classification in our multilingual environment, the hyper-parameters have to be optimised for the specific data and task. We investigate the importance of optimising relevant hyper-parameters for training word embeddings with speech recognised data, where code-switching occurs, and evaluate against the real-world problem of classifying radio and television recordings with code switching. We compare these models with a bag of words baseline model as well as a pre-trained GloVe model.

@inproceedings{403,
  author = {Nuette Heyns and Etienne Barnard},
  title = {Optimising word embeddings for recognised multilingual speech},
  abstract = {Word embeddings are widely used in natural language processing (NLP) tasks. Most work on word embeddings focuses on monolingual languages with large available datasets. For embeddings to be useful in a multilingual environment, as in South Africa, the training techniques have to be adjusted to cater for a) multiple languages, b) smaller datasets and c) the occurrence of code-switching. One of the biggest roadblocks is to obtain datasets that include examples of natural code-switching, since code switching is generally avoided in written material. A solution to this problem is to use speech recognised data. Embedding packages like Word2Vec and GloVe have default hyper-parameter settings that are usually optimised for training on large datasets and evaluation on analogy tasks. When using embeddings for problems such as text classification in our multilingual environment, the hyper-parameters have to be optimised for the specific data and task. We investigate the importance of optimising relevant hyper-parameters for training word embeddings with speech recognised data, where code-switching occurs, and evaluate against the real-world problem of classifying radio and television recordings with code switching. We compare these models with a bag of words baseline model as well as a pre-trained GloVe model.},
  year = {2020},
  journal = {Southern African Conference for Artificial Intelligence Research},
  pages = {102-116},
  month = {22/02/2021 - 26/02/2021},
  address = {South Africa},
  isbn = {978-0-620-89373-2},
  url = {https://sacair.org.za/proceedings/},
}
Haasbroek DG, Davel MH. Exploring neural network training dynamics through binary node activations. In: Southern African Conference for Artificial Intelligence Research. South Africa; 2020. https://sacair.org.za/proceedings/.

Each node in a neural network is trained to activate for a specific region in the input domain. Any training samples that fall within this domain are therefore implicitly clustered together. Recent work has highlighted the importance of these clusters during the training process but has not yet investigated their evolution during training. Towards this goal, we train several ReLU-activated MLPs on a simple classification task (MNIST) and show that a consistent training process emerges: (1) sample clusters initially increase in size and then decrease as training progresses, (2) the size of sample clusters in the first layer decreases more rapidly than in deeper layers, (3) binary node activations, especially of nodes in deeper layers, become more sensitive to class membership as training progresses, (4) individual nodes remain poor predictors of class membership, even if accurate when applied as a group. We report on the detail of these findings and interpret them from the perspective of a high-dimensional clustering process.

@inproceedings{402,
  author = {Daniël Haasbroek and Marelie Davel},
  title = {Exploring neural network training dynamics through binary node activations},
  abstract = {Each node in a neural network is trained to activate for a specific region in the input domain. Any training samples that fall within this domain are therefore implicitly clustered together. Recent work has highlighted the importance of these clusters during the training process but has not yet investigated their evolution during training. Towards this goal, we train several ReLU-activated MLPs on a simple classification task (MNIST) and show that a consistent training process emerges: (1) sample clusters initially increase in size and then decrease as training progresses, (2) the size of sample clusters in the first layer decreases more rapidly than in deeper layers, (3) binary node activations, especially of nodes in deeper layers, become more sensitive to class membership as training progresses, (4) individual nodes remain poor predictors of class membership, even if accurate when applied as a group. We report on the detail of these findings and interpret them from the perspective of a high-dimensional clustering process.},
  year = {2020},
  journal = {Southern African Conference for Artificial Intelligence Research},
  pages = {304-320},
  month = {22/02/2021 - 26/02/2021},
  address = {South Africa},
  isbn = {978-0-620-89373-2},
  url = {https://sacair.org.za/proceedings/},
}
Venter AEW, Theunissen MW, Davel MH. Pre-interpolation loss behavior in neural networks. In: Southern African Conference for Artificial Intelligence Research. South Africa: Springer; 2020. doi:https://doi.org/10.1007/978-3-030-66151-9_19.

When training neural networks as classifiers, it is common to observe an increase in average test loss while still maintaining or improving the overall classification accuracy on the same dataset. In spite of the ubiquity of this phenomenon, it has not been well studied and is often dismissively attributed to an increase in borderline correct classifications. We present an empirical investigation that shows how this phenomenon is actually a result of the differential manner by which test samples are processed. In essence: test loss does not increase overall, but only for a small minority of samples. Large representational capacities allow losses to decrease for the vast majority of test samples at the cost of extreme increases for others. This effect seems to be mainly caused by increased parameter values relating to the correctly processed sample features. Our findings contribute to the practical understanding of a common behaviour of deep neural networks. We also discuss the implications of this work for network optimisation and generalisation.

@inproceedings{401,
  author = {Arthur Venter and Marthinus Theunissen and Marelie Davel},
  title = {Pre-interpolation loss behavior in neural networks},
  abstract = {When training neural networks as classifiers, it is common to observe an increase in average test loss while still maintaining or improving the overall classification accuracy on the same dataset. In spite of the ubiquity of this phenomenon, it has not been well studied and is often dismissively attributed to an increase in borderline correct classifications. We present an empirical investigation that shows how this phenomenon is actually a result of the differential manner by which test samples are processed. In essence: test loss does not increase overall, but only for a small minority of samples. Large representational capacities allow losses to decrease for the vast majority of test samples at the cost of extreme increases for others. This effect seems to be mainly caused by increased parameter values relating to the correctly processed sample features. Our findings contribute to the practical understanding of a common behaviour of deep neural networks. We also discuss the implications of this work for network optimisation and generalisation.},
  year = {2020},
  journal = {Southern African Conference for Artificial Intelligence Research},
  pages = {296-309},
  month = {22/02/2021 - 26/02/2021},
  publisher = {Springer},
  address = {South Africa},
  isbn = {978-3-030-66151-9},
  doi = {https://doi.org/10.1007/978-3-030-66151-9_19},
}
Myburgh JC, Mouton C, Davel MH. Tracking translation invariance in CNNs. In: Southern African Conference for Artificial Intelligence Research. South Africa: Springer; 2020. doi:https://doi.org/10.1007/978-3-030-66151-9_18.

Although Convolutional Neural Networks (CNNs) are widely used, their translation invariance (ability to deal with translated inputs) is still subject to some controversy. We explore this question using translation-sensitivity maps to quantify how sensitive a standard CNN is to a translated input. We propose the use of cosine similarity as sensitivity metric over Euclidean distance, and discuss the importance of restricting the dimensionality of either of these metrics when comparing architectures. Our main focus is to investigate the effect of different architectural components of a standard CNN on that network’s sensitivity to translation. By varying convolutional kernel sizes and amounts of zero padding, we control the size of the feature maps produced, allowing us to quantify the extent to which these elements influence translation invariance. We also measure translation invariance at different locations within the CNN to determine the extent to which convolutional and fully connected layers, respectively, contribute to the translation invariance of a CNN as a whole. Our analysis indicates that both convolutional kernel size and feature map size have a systematic influence on translation invariance. We also see that convolutional layers contribute less than expected to translation invariance, when not specifically forced to do so.

@inproceedings{400,
  author = {Johannes Myburgh and Coenraad Mouton and Marelie Davel},
  title = {Tracking translation invariance in CNNs},
  abstract = {Although Convolutional Neural Networks (CNNs) are widely used, their translation invariance (ability to deal with translated inputs) is still subject to some controversy. We explore this question using translation-sensitivity maps to quantify how sensitive a standard CNN is to a translated input. We propose the use of cosine similarity as sensitivity metric over Euclidean distance, and discuss the importance of restricting the dimensionality of either of these metrics when comparing architectures. Our main focus is to investigate the effect of different architectural components of a standard CNN on that network’s sensitivity to translation. By varying convolutional kernel sizes and amounts of zero padding, we control the size of the feature maps produced, allowing us to quantify the extent to which these elements influence translation invariance. We also measure translation invariance at different locations within the CNN to determine the extent to which convolutional and fully connected layers, respectively, contribute to the translation invariance of a CNN as a whole. Our analysis indicates that both convolutional kernel size and feature map size have a systematic influence on translation invariance. We also see that convolutional layers contribute less than expected to translation invariance, when not specifically forced to do so.},
  year = {2020},
  journal = {Southern African Conference for Artificial Intelligence Research},
  pages = {282-295},
  month = {22/02/2021 - 26/02/2021},
  publisher = {Springer},
  address = {South Africa},
  isbn = {978-3-030-66151-9},
  doi = {https://doi.org/10.1007/978-3-030-66151-9_18},
}
Mouton C, Myburgh JC, Davel MH. Stride and translation invariance in CNNs. In: Southern African Conference for Artificial Intelligence Research. South Africa: Springer; 2020. doi:https://doi.org/10.1007/978-3-030-66151-9_17.

Convolutional Neural Networks have become the standard for image classification tasks, however, these architectures are not invariant to translations of the input image. This lack of invariance is attributed to the use of stride which subsamples the input, resulting in a loss of information, and fully connected layers which lack spatial reasoning. We show that stride can greatly benefit translation invariance given that it is combined with sufficient similarity between neighbouring pixels, a characteristic which we refer to as local homogeneity. We also observe that this characteristic is dataset-specific and dictates the relationship between pooling kernel size and stride required for translation invariance. Furthermore we find that a trade-off exists between generalization and translation invariance in the case of pooling kernel size, as larger kernel sizes lead to better invariance but poorer generalization. Finally we explore the efficacy of other solutions proposed, namely global average pooling, anti-aliasing, and data augmentation, both empirically and through the lens of local homogeneity.

@inproceedings{399,
  author = {Coenraad Mouton and Johannes Myburgh and Marelie Davel},
  title = {Stride and translation invariance in CNNs},
  abstract = {Convolutional Neural Networks have become the standard for image classification tasks, however, these architectures are not invariant to translations of the input image. This lack of invariance is attributed to the use of stride which subsamples the input, resulting in a loss of information, and fully connected layers which lack spatial reasoning. We show that stride can greatly benefit translation invariance given that it is combined with sufficient similarity between neighbouring pixels, a characteristic which we refer to as local homogeneity. We also observe that this characteristic is dataset-specific and dictates the relationship between pooling kernel size and stride required for translation invariance. Furthermore we find that a trade-off exists between generalization and translation invariance in the case of pooling kernel size, as larger kernel sizes lead to better invariance but poorer generalization. Finally we explore the efficacy of other solutions proposed, namely global average pooling, anti-aliasing, and data augmentation, both empirically and through the lens of local homogeneity.},
  year = {2020},
  journal = {Southern African Conference for Artificial Intelligence Research},
  pages = {267-281},
  month = {22/02/2021 - 26/02/2021},
  publisher = {Springer},
  address = {South Africa},
  isbn = {978-3-030-66151-9},
  doi = {https://doi.org/10.1007/978-3-030-66151-9_17},
}
Strydom RA, Barnard E. Classifying recognised speech with deep neural networks. In: Southern African Conference for Artificial Intelligence Research. South Africa: Southern African Conference for Artificial Intelligence Research; 2020.

We investigate whether word embeddings using deep neural networks can assist in the analysis of text produced by a speech-recognition system. In particular, we develop algorithms to identify which words are incorrectly detected by a speech-recognition system in broadcast news. The multilingual corpus used in this investigation contains speech from the eleven official South African languages, as well as Hindi. Popular word embedding algorithms such as Word2Vec and fastText are investigated and compared with context-specific embedding representations such as Doc2Vec and non-context specific statistical sentence embedding methods such as term frequency-inverse document frequency (TFIDF), which is used as our baseline method. These various embedding methods are then used as fixed length input representations for a logistic regression and feed forward neural network classifier. The output is used as an additional categorical input feature to a CatBoost classifier to determine whether the words were correctly recognised. Other methods are also investigated, including a method that uses the word embedding itself and cosine similarity between specific keywords to identify whether a specific keyword was correctly detected. When relying only on the speech-text data, the best result was obtained using the TFIDF document embeddings as input features to a feed forward neural network. Adding the output from the feed forward neural network as an additional feature to the CatBoost classifier did not enhance the classifier’s performance compared to using the non-textual information provided, although adding the output from a weaker classifier was somewhat beneficial.

@inproceedings{398,
  author = {Rhyno Strydom and Etienne Barnard},
  title = {Classifying recognised speech with deep neural networks},
  abstract = {We investigate whether word embeddings using deep neural networks can assist in the analysis of text produced by a speech-recognition system. In particular, we develop algorithms to identify which words are incorrectly detected by a speech-recognition system in broadcast news. The multilingual corpus used in this investigation contains speech from the eleven official South African languages, as well as Hindi. Popular word embedding algorithms such as Word2Vec and fastText are investigated and compared with context-specific embedding representations such as Doc2Vec and non-context specific statistical sentence embedding methods such as term frequency-inverse document frequency (TFIDF), which is used as our baseline method. These various embedding methods are then used as fixed length input representations for a logistic regression and feed forward neural network classifier. The output is used as an additional categorical input feature to a CatBoost classifier to determine whether the words were correctly recognised. Other methods are also investigated, including a method that uses the word embedding itself and cosine similarity between specific keywords to identify whether a specific keyword was correctly detected. When relying only on the speech-text data, the best result was obtained using the TFIDF document embeddings as input features to a feed forward neural network. Adding the output from the feed forward neural network as an additional feature to the CatBoost classifier did not enhance the classifier’s performance compared to using the non-textual information provided, although adding the output from a weaker classifier was somewhat beneficial.},
  year = {2020},
  journal = {Southern African Conference for Artificial Intelligence Research},
  pages = {191-205},
  month = {22/02/2021 - 26/02/2021},
  publisher = {Southern African Conference for Artificial Intelligence Research},
  address = {South Africa},
  isbn = {978-0-620-89373-2},
}
Theunissen MW, Davel MH, Barnard E. Benign interpolation of noise in deep learning. South African Computer Journal. 2020;32(2). doi:https://doi.org/10.18489/sacj.v32i2.833.

The understanding of generalisation in machine learning is in a state of flux, in part due to the ability of deep learning models to interpolate noisy training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about the bias-variance trade off in learning. We expand upon relevant existing work by discussing local attributes of neural network training within the context of a relatively simple framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the deep learning model to generalise in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterised multilayer perceptrons and controlled training data noise. The main insights are that deep learning models are optimised for training data modularly, with different regions in the function space dedicated to fitting distinct types of sample information. Additionally, we show that models tend to fit uncorrupted samples first. Based on this finding, we propose a conjecture to explain an observed instance of the epoch-wise double-descent phenomenon. Our findings suggest that the notion of model capacity needs to be modified to consider the distributed way training data is fitted across sub-units.

@article{394,
  author = {Marthinus Theunissen and Marelie Davel and Etienne Barnard},
  title = {Benign interpolation of noise in deep learning},
  abstract = {The understanding of generalisation in machine learning is in a state of flux, in part due to the ability of deep learning models to interpolate noisy training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about the bias-variance trade off in learning. We expand upon relevant existing work by discussing local attributes of neural network training within the context of a relatively simple framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the deep learning model to generalise in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterised multilayer perceptrons and controlled training data noise. The main insights are that deep learning models are optimised for training data modularly, with different regions in the function space dedicated to fitting distinct types of sample information. Additionally, we show that models tend to fit uncorrupted samples first. Based on this finding, we propose a conjecture to explain an observed instance of the epoch-wise double-descent phenomenon. Our findings suggest that the notion of model capacity needs to be modified to consider the distributed way training data is fitted across sub-units.},
  year = {2020},
  journal = {South African Computer Journal},
  volume = {32},
  pages = {80-101},
  issue = {2},
  publisher = {South African Institute of Computer Scientists and Information Technologists},
  isbn = {ISSN: 1015-7999; E:2313-7835},
  doi = {https://doi.org/10.18489/sacj.v32i2.833},
}
Beukes JP, Davel MH, Lotz S. Pairwise networks for feature ranking of a geomagnetic storm model. South African Computer Journal. 2020;32(2). doi:https://doi.org/10.18489/sacj.v32i2.860.

Feedforward neural networks provide the basis for complex regression models that produce accurate predictions in a variety of applications. However, they generally do not explicitly provide any information about the utility of each of the input parameters in terms of their contribution to model accuracy. With this in mind, we develop the pairwise network, an adaptation to the fully connected feedforward network that allows the ranking of input parameters according to their contribution to the model output. The application is demonstrated in the context of a space physics problem. Geomagnetic storms are multi-day events characterised by significant perturbations to the magnetic field of the Earth, driven by solar activity. Previous storm forecasting efforts typically use solar wind measurements as input parameters to a regression problem tasked with predicting a perturbation index such as the 1-minute cadence symmetric-H (Sym-H) index. We re-visit the task of predicting Sym-H from solar wind parameters, with two 'twists': (i) Geomagnetic storm phase information is incorporated as model inputs and shown to increase prediction performance. (ii) We describe the pairwise network structure and training process - first validating ranking ability on synthetic data, before using the network to analyse the Sym-H problem.

@article{392,
  author = {Jacques Beukes and Marelie Davel and Stefan Lotz},
  title = {Pairwise networks for feature ranking of a geomagnetic storm model},
  abstract = {Feedforward neural networks provide the basis for complex regression models that produce accurate predictions in a variety of applications. However, they generally do not explicitly provide any information about the utility of each of the input parameters in terms of their contribution to model accuracy. With this in mind, we develop the pairwise network, an adaptation to the fully connected feedforward network that allows the ranking of input parameters according to their contribution to the model output. The application is demonstrated in the context of a space physics problem. Geomagnetic storms are multi-day events characterised by significant perturbations to the magnetic field of the Earth, driven by solar activity. Previous storm forecasting efforts typically use solar wind measurements as input parameters to a regression problem tasked with predicting a perturbation index such as the 1-minute cadence symmetric-H (Sym-H) index. We re-visit the task of predicting Sym-H from solar wind parameters, with two 'twists': (i) Geomagnetic storm phase information is incorporated as model inputs and shown to increase prediction performance. (ii) We describe the pairwise network structure and training process - first validating ranking ability on synthetic data, before using the network to analyse the Sym-H problem.},
  year = {2020},
  journal = {South African Computer Journal},
  volume = {32},
  pages = {35-55},
  issue = {2},
  publisher = {South African Institute of Computer Scientists and Information Technologists},
  isbn = {ISSN: 1015-7999; E:2313-7835},
  doi = {https://doi.org/10.18489/sacj.v32i2.860},
}
Davel MH. Using summary layers to probe neural network behaviour. South African Computer Journal. 2020;32(2). doi:https://doi.org/10.18489/sacj.v32i2.861.

No framework exists that can explain and predict the generalisation ability of deep neural networks in general circumstances. In fact, this question has not been answered for some of the least complicated of neural network architectures: fully-connected feedforward networks with rectified linear activations and a limited number of hidden layers. For such an architecture, we show how adding a summary layer to the network makes it more amenable to analysis, and allows us to define the conditions that are required to guarantee that a set of samples will all be classified correctly. This process does not describe the generalisation behaviour of these networks, but produces a number of metrics that are useful for probing their learning and generalisation behaviour. We support the analytical conclusions with empirical results, both to confirm that the mathematical guarantees hold in practice, and to demonstrate the use of the analysis process.

@article{391,
  author = {Marelie Davel},
  title = {Using summary layers to probe neural network behaviour},
  abstract = {No framework exists that can explain and predict the generalisation ability of deep neural networks in general circumstances. In fact, this question has not been answered for some of the least complicated of neural network architectures: fully-connected feedforward networks with rectified linear activations and a limited number of hidden layers. For such an architecture, we show how adding a summary layer to the network makes it more amenable to analysis, and allows us to define the conditions that are required to guarantee that a set of samples will all be classified correctly. This process does not describe the generalisation behaviour of these networks, but produces a number of metrics that are useful for probing their learning and generalisation behaviour. We support the analytical conclusions with empirical results, both to confirm that the mathematical guarantees hold in practice, and to demonstrate the use of the analysis process.},
  year = {2020},
  journal = {South African Computer Journal},
  volume = {32},
  pages = {102-123},
  issue = {2},
  publisher = {South African Institute of Computer Scientists and Information Technologists},
  isbn = {ISSN: 1015-7999; E:2313-7835},
  url = {http://hdl.handle.net/10394/36916},
  doi = {https://doi.org/10.18489/sacj.v32i2.861},
}
Tollon F. The artificial view: toward a non‑anthropocentric account of moral patiency. Ethics and Information Technology. 2020;22(4). doi:https://doi.org/10.1007/s10676-020-09540-4.

In this paper I provide an exposition and critique of the Organic View of Ethical Status, as outlined by Torrance (2008). A key presupposition of this view is that only moral patients can be moral agents. It is claimed that because artificial agents lack sentience, they cannot be proper subjects of moral concern (i.e. moral patients). This account of moral standing in principle excludes machines from participating in our moral universe. I will argue that the Organic View operationalises anthropocentric intuitions regarding sentience ascription, and by extension how we identify moral patients. The main difference between the argument I provide here and traditional arguments surrounding moral attributability is that I do not necessarily defend the view that internal states ground our ascriptions of moral patiency. This is in contrast to views such as those defended by Singer (1975, 2011) and Torrance (2008), where concepts such as sentience play starring roles. I will raise both conceptual and epistemic issues with regards to this sense of sentience. While this does not preclude the usage of sentience outright, it suggests that we should be more careful in our usage of internal mental states to ground our moral ascriptions. Following from this I suggest other avenues for further exploration into machine moral patiency which may not have the same shortcomings as the Organic View.

@article{387,
  author = {Fabio Tollon},
  title = {The artificial view: toward a non‑anthropocentric account of moral patiency},
  abstract = {In this paper I provide an exposition and critique of the Organic View of Ethical Status, as outlined by Torrance (2008). A key presupposition of this view is that only moral patients can be moral agents. It is claimed that because artificial agents lack sentience, they cannot be proper subjects of moral concern (i.e. moral patients). This account of moral standing in principle excludes machines from participating in our moral universe. I will argue that the Organic View operationalises anthropocentric intuitions regarding sentience ascription, and by extension how we identify moral patients. The main difference between the argument I provide here and traditional arguments surrounding moral attributability is that I do not necessarily defend the view that internal states ground our ascriptions of moral patiency. This is in contrast to views such as those defended by Singer (1975, 2011) and Torrance (2008), where concepts such as sentience play starring roles. I will raise both conceptual and epistemic issues with regards to this sense of sentience. While this does not preclude the usage of sentience outright, it suggests that we should be more careful in our usage of internal mental states to ground our moral ascriptions. Following from this I suggest other avenues for further exploration into machine moral patiency which may not have the same shortcomings as the Organic View.},
  year = {2020},
  journal = {Ethics and Information Technology},
  volume = {22},
  issue = {4},
  publisher = {Springer},
  url = {https://link.springer.com/article/10.1007%2Fs10676-020-09540-4},
  doi = {https://doi.org/10.1007/s10676-020-09540-4},
}
Friedman C. Human-Robot Moral Relations: Human Interactants as Moral Patients of Their Own Agential Moral Actions Towards Robots. Communications in Computer and Information Science. 2020;1342. doi:https://doi.org/10.1007/978-3-030-66151-9_1.

This paper contributes to the debate in the ethics of social robots on how or whether to treat social robots morally by way of considering a novel perspective on the moral relations between human interactants and social robots. This perspective is significant as it allows us to circumnavigate debates about the (im)possibility of robot consciousness and moral patiency (debates which often slow down discussion on the ethics of HRI), thus allowing us to address actual and urgent current ethical issues in relation to human-robot interaction. The paper considers the different ways in which human interactants may be moral patients in the context of interaction with social robots: robots as conduits of human moral action towards human moral patients; humans as moral patients to the actions of robots; and human interactants as moral patients of their own agential moral actions towards social robots. This third perspective is the focal point of the paper. The argument is that due to perceived robot consciousness, and the possibility that the immoral treatment of social robots may morally harm human interactants, there is a unique moral relation between humans and social robots wherein human interactants are both the moral agents of their actions towards robots, as well as the actual moral patients of those agential moral actions towards robots. Robots, however, are no more than perceived moral patients. This discussion further adds to debates in the context of robot moral status, and the consideration of the moral treatment of robots in the context of human-robot interaction.

@article{385,
  author = {Cindy Friedman},
  title = {Human-Robot Moral Relations: Human Interactants as Moral Patients of Their Own Agential Moral Actions Towards Robots},
  abstract = {This paper contributes to the debate in the ethics of social robots on how or whether to treat social robots morally by way of considering a novel perspective on the moral relations between human interactants and social robots. This perspective is significant as it allows us to circumnavigate debates about the (im)possibility of robot consciousness and moral patiency (debates which often slow down discussion on the ethics of HRI), thus allowing us to address actual and urgent current ethical issues in relation to human-robot interaction. The paper considers the different ways in which human interactants may be moral patients in the context of interaction with social robots: robots as conduits of human moral action towards human moral patients; humans as moral patients to the actions of robots; and human interactants as moral patients of their own agential moral actions towards social robots. This third perspective is the focal point of the paper. The argument is that due to perceived robot consciousness, and the possibility that the immoral treatment of social robots may morally harm human interactants, there is a unique moral relation between humans and social robots wherein human interactants are both the moral agents of their actions towards robots, as well as the actual moral patients of those agential moral actions towards robots. Robots, however, are no more than perceived moral patients. This discussion further adds to debates in the context of robot moral status, and the consideration of the moral treatment of robots in the context of human-robot interaction.},
  year = {2020},
  journal = {Communications in Computer and Information Science},
  volume = {1342},
  pages = {3-20},
  publisher = {Springer},
  isbn = {978-3-030-66151-9},
  url = {https://link.springer.com/chapter/10.1007/978-3-030-66151-9_1},
  doi = {https://doi.org/10.1007/978-3-030-66151-9_1},
}
Ruttkamp-Bloem E. The Quest for Actionable AI Ethics. Communications in Computer and Information Science. 2020;1342. doi:https://doi.org/10.1007/978-3-030-66151-9_3.

In the face of the fact that AI ethics guidelines currently, on the whole, seem to have no significant impact on AI practices, the quest of AI ethics to ensure trustworthy AI is in danger of becoming nothing more than a nice ideal. Serious work is to be done to ensure AI ethics guidelines are actionable. To this end, in this paper, I argue that AI ethics should be approached 1) in a multi-disciplinary manner focused on concrete research in the discipline of the ethics of AI and 2) as a dynamic system on the basis of virtue ethics in order to work towards enabling all AI actors to take responsibility for their own actions and to hold others accountable for theirs. In conclusion, the paper emphasises the importance of understanding AI ethics as playing out on a continuum of interconnected interests across academia, civil society, public policy-making and the private sector, and a novel notion of ‘AI ethics capital’ is put on the table as outcome of actionable AI ethics and essential ingredient for sustainable trustworthy AI.

@article{384,
  author = {Emma Ruttkamp-Bloem},
  title = {The Quest for Actionable AI Ethics},
  abstract = {In the face of the fact that AI ethics guidelines currently, on the whole, seem to have no significant impact on AI practices, the quest of AI ethics to ensure trustworthy AI is in danger of becoming nothing more than a nice ideal. Serious work is to be done to ensure AI ethics guidelines are actionable. To this end, in this paper, I argue that AI ethics should be approached 1) in a multi-disciplinary manner focused on concrete research in the discipline of the ethics of AI and 2) as a dynamic system on the basis of virtue ethics in order to work towards enabling all AI actors to take responsibility for their own actions and to hold others accountable for theirs. In conclusion, the paper emphasises the importance of understanding AI ethics as playing out on a continuum of interconnected interests across academia, civil society, public policy-making and the private sector, and a novel notion of ‘AI ethics capital’ is put on the table as outcome of actionable AI ethics and essential ingredient for sustainable trustworthy AI.},
  year = {2020},
  journal = {Communications in Computer and Information Science},
  volume = {1342},
  pages = {34-52},
  publisher = {Springer},
  isbn = {978-3-030-66151-9},
  url = {https://link.springer.com/chapter/10.1007/978-3-030-66151-9_3},
  doi = {https://doi.org/10.1007/978-3-030-66151-9_3},
}