arXiv daily

Artificial Intelligence (cs.AI)

Tue, 29 Aug 2023

Other arXiv digests in this category:Thu, 14 Sep 2023; Wed, 13 Sep 2023; Tue, 12 Sep 2023; Mon, 11 Sep 2023; Fri, 08 Sep 2023; Tue, 05 Sep 2023; Fri, 01 Sep 2023; Thu, 31 Aug 2023; Wed, 30 Aug 2023; Mon, 28 Aug 2023; Fri, 25 Aug 2023; Thu, 24 Aug 2023; Wed, 23 Aug 2023; Tue, 22 Aug 2023; Mon, 21 Aug 2023; Fri, 18 Aug 2023; Thu, 17 Aug 2023; Wed, 16 Aug 2023; Tue, 15 Aug 2023; Mon, 14 Aug 2023; Fri, 11 Aug 2023; Thu, 10 Aug 2023; Wed, 09 Aug 2023; Tue, 08 Aug 2023; Mon, 07 Aug 2023; Fri, 04 Aug 2023; Thu, 03 Aug 2023; Wed, 02 Aug 2023; Tue, 01 Aug 2023; Mon, 31 Jul 2023; Fri, 28 Jul 2023; Thu, 27 Jul 2023; Wed, 26 Jul 2023; Tue, 25 Jul 2023; Mon, 24 Jul 2023; Fri, 21 Jul 2023; Thu, 20 Jul 2023; Wed, 19 Jul 2023; Tue, 18 Jul 2023; Mon, 17 Jul 2023; Thu, 13 Jul 2023; Wed, 12 Jul 2023; Tue, 11 Jul 2023; Mon, 10 Jul 2023; Fri, 07 Jul 2023; Thu, 06 Jul 2023; Wed, 05 Jul 2023; Tue, 04 Jul 2023; Mon, 03 Jul 2023; Fri, 30 Jun 2023; Thu, 29 Jun 2023; Wed, 28 Jun 2023; Tue, 27 Jun 2023; Mon, 26 Jun 2023; Fri, 23 Jun 2023; Thu, 22 Jun 2023; Tue, 20 Jun 2023; Fri, 16 Jun 2023; Thu, 15 Jun 2023; Tue, 13 Jun 2023; Mon, 12 Jun 2023; Fri, 09 Jun 2023; Thu, 08 Jun 2023; Wed, 07 Jun 2023; Tue, 06 Jun 2023; Mon, 05 Jun 2023; Fri, 02 Jun 2023; Thu, 01 Jun 2023; Wed, 31 May 2023; Tue, 30 May 2023; Mon, 29 May 2023; Fri, 26 May 2023; Thu, 25 May 2023; Wed, 24 May 2023; Tue, 23 May 2023; Mon, 22 May 2023; Fri, 19 May 2023; Thu, 18 May 2023; Wed, 17 May 2023; Tue, 16 May 2023; Mon, 15 May 2023; Fri, 12 May 2023; Thu, 11 May 2023; Wed, 10 May 2023; Tue, 09 May 2023; Mon, 08 May 2023; Fri, 05 May 2023; Thu, 04 May 2023; Wed, 03 May 2023; Tue, 02 May 2023; Mon, 01 May 2023; Fri, 28 Apr 2023; Thu, 27 Apr 2023; Wed, 26 Apr 2023; Tue, 25 Apr 2023; Mon, 24 Apr 2023; Fri, 21 Apr 2023; Thu, 20 Apr 2023; Wed, 19 Apr 2023; Tue, 18 Apr 2023; Mon, 17 Apr 2023; Fri, 14 Apr 2023; Thu, 13 Apr 2023; Wed, 12 Apr 2023; Tue, 11 Apr 2023; Mon, 10 Apr 2023; Thu, 06 Apr 2023; Wed, 05 Apr 2023; Tue, 04 Apr 2023
1.Serving MoE Models on Resource-constrained Edge Devices via Dynamic Expert Swapping

Authors:Rui Kong, Yuanchun Li, Qingtian Feng, Weijun Wang, Linghe Kong, Yunxin Liu

Abstract: Mixture of experts (MoE) is a popular technique in deep learning that improves model capacity with conditionally-activated parallel neural network modules (experts). However, serving MoE models in resource-constrained latency-critical edge scenarios is challenging due to the significantly increased model size and complexity. In this paper, we first analyze the behavior pattern of MoE models in continuous inference scenarios, which leads to three key observations about the expert activations, including temporal locality, exchangeability, and skippable computation. Based on these observations, we introduce PC-MoE, an inference framework for resource-constrained continuous MoE model serving. The core of PC-MoE is a new data structure, Parameter Committee, that intelligently maintains a subset of important experts in use to reduce resource consumption. The optimal configuration of Parameter Committee is found offline by a profiling-guided committee planner, and expert swapping and request handling at runtime are managed by an adaptive committee scheduler. To evaluate the effectiveness of PC-MoE, we conduct experiments using state-of-the-art MoE models on common computer vision and natural language processing tasks. The results demonstrate optimal trade-offs between resource consumption and model accuracy achieved by PC-MoE. For instance, on object detection tasks with the Swin-MoE model, our approach can reduce memory usage and latency by 42.34% and 18.63% with only 0.10% accuracy degradation.

2.A Comprehensive Augmentation Framework for Anomaly Detection

Authors:Jiang Lin, Yaping Yan

Abstract: Data augmentation methods are commonly integrated into the training of anomaly detection models. Previous approaches have primarily focused on replicating real-world anomalies or enhancing diversity, without considering that the standard of anomaly varies across different classes, potentially leading to a biased training distribution.This paper analyzes crucial traits of simulated anomalies that contribute to the training of reconstructive networks and condenses them into several methods, thus creating a comprehensive framework by selectively utilizing appropriate combinations.Furthermore, we integrate this framework with a reconstruction-based approach and concurrently propose a split training strategy that alleviates the issue of overfitting while avoiding introducing interference to the reconstruction process. The evaluations conducted on the MVTec anomaly detection dataset demonstrate that our method outperforms the previous state-of-the-art approach, particularly in terms of object classes.To evaluate generalizability, we generate a simulated dataset comprising anomalies with diverse characteristics since the original test samples only include specific types of anomalies and may lead to biased evaluations. Experimental results demonstrate that our approach exhibits promising potential for generalizing effectively to various unforeseen anomalies encountered in real-world scenarios.

3.LAMBO: Large Language Model Empowered Edge Intelligence

Authors:Li Dong, Feibo Jiang, Yubo Peng, Kezhi Wang, Kun Yang, Cunhua Pan, Robert Schober

Abstract: Next-generation edge intelligence is anticipated to bring huge benefits to various applications, e.g., offloading systems. However, traditional deep offloading architectures face several issues, including heterogeneous constraints, partial perception, uncertain generalization, and lack of tractability. In this context, the integration of offloading with large language models (LLMs) presents numerous advantages. Therefore, we propose an LLM-Based Offloading (LAMBO) framework for mobile edge computing (MEC), which comprises four components: (i) Input embedding (IE), which is used to represent the information of the offloading system with constraints and prompts through learnable vectors with high quality; (ii) Asymmetric encoderdecoder (AED) model, which is a decision-making module with a deep encoder and a shallow decoder. It can achieve high performance based on multi-head self-attention schemes; (iii) Actor-critic reinforcement learning (ACRL) module, which is employed to pre-train the whole AED for different optimization tasks under corresponding prompts; and (iv) Active learning from expert feedback (ALEF), which can be used to finetune the decoder part of the AED while adapting to dynamic environmental changes. Our simulation results corroborate the advantages of the proposed LAMBO framework.

4.Sequential annotations for naturally-occurring HRI: first insights

Authors:Lucien Tisserand ICAR, Frédéric Armetta SyCoSMA, LIRIS, Heike Baldauf-Quilliatre ICAR, Antoine Bouquin SyCoSMA, LIRIS, Salima Hassas SyCoSMA, LIRIS, Mathieu Lefort LIRIS, SyCoSMA

Abstract: We explain the methodology we developed for improving the interactions accomplished by an embedded conversational agent, drawing from Conversation Analytic sequential and multimodal analysis. The use case is a Pepper robot that is expected to inform and orient users in a library. In order to propose and learn better interactive schema, we are creating a corpus of naturally-occurring interactions that will be made available to the community. To do so, we propose an annotation practice based on some theoretical underpinnings about the use of language and multimodal resources in human-robot interaction. CCS CONCEPTS $\bullet$ Computing methodologies $\rightarrow$ Discourse, dialogue and pragmatics; $\bullet$ Human-centered computing $\rightarrow$ Text input; HCI theory, concepts and models; Field studies.

5.Probabilistic Dataset Reconstruction from Interpretable Models

Authors:Julien Ferry LAAS-ROC, Ulrich Aïvodji ETS, Sébastien Gambs UQAM, Marie-José Huguet LAAS-ROC, Mohamed Siala LAAS-ROC

Abstract: Interpretability is often pointed out as a key requirement for trustworthy machine learning. However, learning and releasing models that are inherently interpretable leaks information regarding the underlying training data. As such disclosure may directly conflict with privacy, a precise quantification of the privacy impact of such breach is a fundamental problem. For instance, previous work have shown that the structure of a decision tree can be leveraged to build a probabilistic reconstruction of its training dataset, with the uncertainty of the reconstruction being a relevant metric for the information leak. In this paper, we propose of a novel framework generalizing these probabilistic reconstructions in the sense that it can handle other forms of interpretable models and more generic types of knowledge. In addition, we demonstrate that under realistic assumptions regarding the interpretable models' structure, the uncertainty of the reconstruction can be computed efficiently. Finally, we illustrate the applicability of our approach on both decision trees and rule lists, by comparing the theoretical information leak associated to either exact or heuristic learning algorithms. Our results suggest that optimal interpretable models are often more compact and leak less information regarding their training data than greedily-built ones, for a given accuracy level.

6.AI-Based Facial Emotion Recognition Solutions for Education: A Study of Teacher-User and Other Categories

Authors:R. Yamamoto Ravenor

Abstract: Existing information on AI-based facial emotion recognition (FER) is not easily comprehensible by those outside the field of computer science, requiring cross-disciplinary effort to determine a categorisation framework that promotes the understanding of this technology, and its impact on users. Most proponents classify FER in terms of methodology, implementation and analysis; relatively few by its application in education; and none by its users. This paper is concerned primarily with (potential) teacher-users of FER tools for education. It proposes a three-part classification of these teachers, by orientation, condition and preference, based on a classical taxonomy of affective educational objectives, and related theories. It also compiles and organises the types of FER solutions found in or inferred from the literature into "technology" and "applications" categories, as a prerequisite for structuring the proposed "teacher-user" category. This work has implications for proponents', critics', and users' understanding of the relationship between teachers and FER.

7.Ontologies in Digital Twins: A Systematic Literature Review

Authors:Erkan Karabulut, Salvatore F. Pileggi, Paul Groth, Victoria Degeler

Abstract: Digital Twins (DT) facilitate monitoring and reasoning processes in cyber-physical systems. They have progressively gained popularity over the past years because of intense research activity and industrial advancements. Cognitive Twins is a novel concept, recently coined to refer to the involvement of Semantic Web technology in DTs. Recent studies address the relevance of ontologies and knowledge graphs in the context of DTs, in terms of knowledge representation, interoperability and automatic reasoning. However, there is no comprehensive analysis of how semantic technologies, and specifically ontologies, are utilized within DTs. This Systematic Literature Review (SLR) is based on the analysis of 82 research articles, that either propose or benefit from ontologies with respect to DT. The paper uses different analysis perspectives, including a structural analysis based on a reference DT architecture, and an application-specific analysis to specifically address the different domains, such as Manufacturing and Infrastructure. The review also identifies open issues and possible research directions on the usage of ontologies and knowledge graphs in DTs.

8.Symbolic LTLf Best-Effort Synthesis

Authors:Giuseppe De Giacomo, Gianmarco Parretti, Shufang Zhu

Abstract: We consider an agent acting to fulfil tasks in a nondeterministic environment. When a strategy that fulfills the task regardless of how the environment acts does not exist, the agent should at least avoid adopting strategies that prevent from fulfilling its task. Best-effort synthesis captures this intuition. In this paper, we devise and compare various symbolic approaches for best-effort synthesis in Linear Temporal Logic on finite traces (LTLf). These approaches are based on the same basic components, however they change in how these components are combined, and this has a significant impact on the performance of the approaches as confirmed by our empirical evaluations.

9.LTLf Best-Effort Synthesis in Nondeterministic Planning Domains

Authors:Giuseppe De Giacomo, Gianmarco Parretti, Shufang Zhu

Abstract: We study best-effort strategies (aka plans) in fully observable nondeterministic domains (FOND) for goals expressed in Linear Temporal Logic on Finite Traces (LTLf). The notion of best-effort strategy has been introduced to also deal with the scenario when no agent strategy exists that fulfills the goal against every possible nondeterministic environment reaction. Such strategies fulfill the goal if possible, and do their best to do so otherwise. We present a game-theoretic technique for synthesizing best-effort strategies that exploit the specificity of nondeterministic planning domains. We formally show its correctness and demonstrate its effectiveness experimentally, exhibiting a much greater scalability with respect to a direct best-effort synthesis approach based on re-expressing the planning domain as generic environment specifications.

10.Enhancing Psychological Counseling with Large Language Model: A Multifaceted Decision-Support System for Non-Professionals

Authors:Guanghui Fu, Qing Zhao, Jianqiang Li, Dan Luo, Changwei Song, Wei Zhai, Shuo Liu, Fan Wang, Yan Wang, Lijuan Cheng, Juan Zhang, Bing Xiang Yang

Abstract: In the contemporary landscape of social media, an alarming number of users express negative emotions, some of which manifest as strong suicidal intentions. This situation underscores a profound need for trained psychological counselors who can enact effective mental interventions. However, the development of these professionals is often an imperative but time-consuming task. Consequently, the mobilization of non-professionals or volunteers in this capacity emerges as a pressing concern. Leveraging the capabilities of artificial intelligence, and in particular, the recent advances in large language models, offers a viable solution to this challenge. This paper introduces a novel model constructed on the foundation of large language models to fully assist non-professionals in providing psychological interventions on online user discourses. This framework makes it plausible to harness the power of non-professional counselors in a meaningful way. A comprehensive study was conducted involving ten professional psychological counselors of varying expertise, evaluating the system across five critical dimensions. The findings affirm that our system is capable of analyzing patients' issues with relative accuracy and proffering professional-level strategies recommendations, thereby enhancing support for non-professionals. This research serves as a compelling validation of the application of large language models in the field of psychology and lays the groundwork for a new paradigm of community-based mental health support.

11.Ensemble of Counterfactual Explainers

Authors:Riccardo Guidotti, Salvatore Ruggieri

Abstract: In eXplainable Artificial Intelligence (XAI), several counterfactual explainers have been proposed, each focusing on some desirable properties of counterfactual instances: minimality, actionability, stability, diversity, plausibility, discriminative power. We propose an ensemble of counterfactual explainers that boosts weak explainers, which provide only a subset of such properties, to a powerful method covering all of them. The ensemble runs weak explainers on a sample of instances and of features, and it combines their results by exploiting a diversity-driven selection function. The method is model-agnostic and, through a wrapping approach based on autoencoders, it is also data-agnostic.

12.Where Would I Go Next? Large Language Models as Human Mobility Predictors

Authors:Xinglei Wang, Meng Fang, Zichao Zeng, Tao Cheng

Abstract: Accurate human mobility prediction underpins many important applications across a variety of domains, including epidemic modelling, transport planning, and emergency responses. Due to the sparsity of mobility data and the stochastic nature of people's daily activities, achieving precise predictions of people's locations remains a challenge. While recently developed large language models (LLMs) have demonstrated superior performance across numerous language-related tasks, their applicability to human mobility studies remains unexplored. Addressing this gap, this article delves into the potential of LLMs for human mobility prediction tasks. We introduce a novel method, LLM-Mob, which leverages the language understanding and reasoning capabilities of LLMs for analysing human mobility data. We present concepts of historical stays and context stays to capture both long-term and short-term dependencies in human movement and enable time-aware prediction by using time information of the prediction target. Additionally, we design context-inclusive prompts that enable LLMs to generate more accurate predictions. Comprehensive evaluations of our method reveal that LLM-Mob excels in providing accurate and interpretable predictions, highlighting the untapped potential of LLMs in advancing human mobility prediction techniques. We posit that our research marks a significant paradigm shift in human mobility modelling, transitioning from building complex domain-specific models to harnessing general-purpose LLMs that yield accurate predictions through language instructions. The code for this work is available at https://github.com/xlwang233/LLM-Mob.

13.Natural language to SQL in low-code platforms

Authors:Sofia Aparicio, Samuel Arcadinho, João Nadkarni, David Aparício, João Lages, Mariana Lourenço, Bartłomiej Matejczyk, Filipe Assunção

Abstract: One of the developers' biggest challenges in low-code platforms is retrieving data from a database using SQL queries. Here, we propose a pipeline allowing developers to write natural language (NL) to retrieve data. In this study, we collect, label, and validate data covering the SQL queries most often performed by OutSystems users. We use that data to train a NL model that generates SQL. Alongside this, we describe the entire pipeline, which comprises a feedback loop that allows us to quickly collect production data and use it to retrain our SQL generation model. Using crowd-sourcing, we collect 26k NL and SQL pairs and obtain an additional 1k pairs from production data. Finally, we develop a UI that allows developers to input a NL query in a prompt and receive a user-friendly representation of the resulting SQL query. We use A/B testing to compare four different models in production and observe a 240% improvement in terms of adoption of the feature, 220% in terms of engagement rate, and a 90% decrease in failure rate when compared against the first model that we put into production, showcasing the effectiveness of our pipeline in continuously improving our feature.

14.Empowering LLM to use Smartphone for Intelligent Task Automation

Authors:Hao Wen, Yuanchun Li, Guohong Liu, Shanhui Zhao, Tao Yu, Toby Jia-Jun Li, Shiqi Jiang, Yunhao Liu, Yaqin Zhang, Yunxin Liu

Abstract: Mobile task automation is an attractive technique that aims to enable voice-based hands-free user interaction with smartphones. However, existing approaches suffer from poor scalability due to the limited language understanding ability and the non-trivial manual efforts required from developers or end-users. The recent advance of large language models (LLMs) in language understanding and reasoning inspires us to rethink the problem from a model-centric perspective, where task preparation, comprehension, and execution are handled by a unified language model. In this work, we introduce AutoDroid, a mobile task automation system that can handle arbitrary tasks on any Android application without manual efforts. The key insight is to combine the commonsense knowledge of LLMs and domain-specific knowledge of apps through automated dynamic analysis. The main components include a functionality-aware UI representation method that bridges the UI with the LLM, exploration-based memory injection techniques that augment the app-specific domain knowledge of LLM, and a multi-granularity query optimization module that reduces the cost of model inference. We integrate AutoDroid with off-the-shelf LLMs including online GPT-4/GPT-3.5 and on-device Vicuna, and evaluate its performance on a new benchmark for memory-augmented Android task automation with 158 common tasks. The results demonstrated that AutoDroid is able to precisely generate actions with an accuracy of 90.9%, and complete tasks with a success rate of 71.3%, outperforming the GPT-4-powered baselines by 36.4% and 39.7%. The demo, benchmark suites, and source code of AutoDroid will be released at https://autodroid-sys.github.io/.

15.FedLogic: Interpretable Federated Multi-Domain Chain-of-Thought Prompt Selection for Large Language Models

Authors:Pengwei Xing, Songtao Lu, Han Yu

Abstract: Leveraging ``chain-of-thought (CoT)'' reasoning to elicit rapid and precise responses from large language models (LLMs) is rapidly attracting research interest. A notable challenge here is how to design or select optimal prompts. The process of prompt selection relies on trial and error, involving continuous adjustments and combinations of input prompts by users based on the corresponding new responses generated from LLMs. Furthermore, minimal research has been conducted to explore how LLMs employ the mathematical problem-solving capabilities learned from user interactions to address issues in narrative writing. To improve interpretability and explore the balance principle between generality and personalization under a multi-domain CoT prompt selection scenario, we propose the Federated Logic rule learning approach (FedLogic). We introduce a theoretical formalization and interactive emulation of the multi-domain CoT prompt selection dilemma in the context of federated LLMs. We cast the problem of joint probability modeling as a bilevel program, where the CoT prompt selection intricacy can be likened to a fuzzy score-based rule selection with the LLMs function as rule generators. FedLogic solves this problem through variational expectation maximization (V-EM). In addition, we incorporate two KL-divergence constraints within this probabilistic modeling framework to surmount the intricacies of managing extensive search spaces and accomplishing cross-domain personalization of CoTs. To the best of our knowledge, FedLogic is the first interpretable and principled federated multi-domain CoT prompt selection approach for LLMs.

16.AI Framework for Early Diagnosis of Coronary Artery Disease: An Integration of Borderline SMOTE, Autoencoders and Convolutional Neural Networks Approach

Authors:Elham Nasarian, Danial Sharifrazi, Saman Mohsenirad, Kwok Tsui, Roohallah Alizadehsani

Abstract: The accuracy of coronary artery disease (CAD) diagnosis is dependent on a variety of factors, including demographic, symptom, and medical examination, ECG, and echocardiography data, among others. In this context, artificial intelligence (AI) can help clinicians identify high-risk patients early in the diagnostic process, by synthesizing information from multiple factors. To this aim, Machine Learning algorithms are used to classify patients based on their CAD disease risk. In this study, we contribute to this research filed by developing a methodology for balancing and augmenting data for more accurate prediction when the data is imbalanced and the sample size is small. The methodology can be used in a variety of other situations, particularly when data collection is expensive and the sample size is small. The experimental results revealed that the average accuracy of our proposed method for CAD prediction was 95.36, and was higher than random forest (RF), decision tree (DT), support vector machine (SVM), logistic regression (LR), and artificial neural network (ANN).

17.Bayesian Integration of Information Using Top-Down Modulated WTA Networks

Authors:Otto van der Himst, Leila Bagheriye, Johan Kwisthout

Abstract: Winner Take All (WTA) circuits a type of Spiking Neural Networks (SNN) have been suggested as facilitating the brain's ability to process information in a Bayesian manner. Research has shown that WTA circuits are capable of approximating hierarchical Bayesian models via Expectation Maximization (EM). So far, research in this direction has focused on bottom up processes. This is contrary to neuroscientific evidence that shows that, besides bottom up processes, top down processes too play a key role in information processing by the human brain. Several functions ascribed to top down processes include direction of attention, adjusting for expectations, facilitation of encoding and recall of learned information, and imagery. This paper explores whether WTA circuits are suitable for further integrating information represented in separate WTA networks. Furthermore, it explores whether, and under what circumstances, top down processes can improve WTA network performance with respect to inference and learning. The results show that WTA circuits are capable of integrating the probabilistic information represented by other WTA networks, and that top down processes can improve a WTA network's inference and learning performance. Notably, it is able to do this according to key neuromorphic principles, making it ideal for low-latency and energy efficient implementation on neuromorphic hardware.

18.Decentralized Multi-agent Reinforcement Learning based State-of-Charge Balancing Strategy for Distributed Energy Storage System

Authors:Zheng Xiong, Biao Luo, Bing-Chuan Wang, Xiaodong Xu, Xiaodong Liu, Tingwen Huang

Abstract: This paper develops a Decentralized Multi-Agent Reinforcement Learning (Dec-MARL) method to solve the SoC balancing problem in the distributed energy storage system (DESS). First, the SoC balancing problem is formulated into a finite Markov decision process with action constraints derived from demand balance, which can be solved by Dec-MARL. Specifically, the first-order average consensus algorithm is utilized to expand the observations of the DESS state in a fully-decentralized way, and the initial actions (i.e., output power) are decided by the agents (i.e., energy storage units) according to these observations. In order to get the final actions in the allowable range, a counterfactual demand balance algorithm is proposed to balance the total demand and the initial actions. Next, the agents execute the final actions and get local rewards from the environment, and the DESS steps into the next state. Finally, through the first-order average consensus algorithm, the agents get the average reward and the expended observation of the next state for later training. By the above procedure, Dec-MARL reveals outstanding performance in a fully-decentralized system without any expert experience or constructing any complicated model. Besides, it is flexible and can be extended to other decentralized multi-agent systems straightforwardly. Extensive simulations have validated the effectiveness and efficiency of Dec-MARL.