
A graph-theoretic approach toward autonomous skill acquisition in reinforcement learning

Article, Evolving Systems; Volume 9, Issue 3, 2018, Pages 227-244; 18686478 (ISSN). Kazemitabar, S. J.; Taghizadeh, N.; Beigy, H.; Sharif University of Technology

Springer Verlag 2018

Abstract

Hierarchical reinforcement learning facilitates learning in large and complex domains by exploiting subtasks and building hierarchical structures from them. Subtasks are usually defined by finding subgoals of the problem. Providing mechanisms for autonomous subgoal discovery and skill acquisition is a challenging issue in reinforcement learning. Among the proposed algorithms, few are successful in both performance and running-time efficiency. In this paper, we study four methods for subgoal discovery which are based on graph partitioning. The idea behind the methods proposed in this paper is that if we partition the transition...

Using strongly connected components as a basis for autonomous skill acquisition in reinforcement learning

Article, 6th International Symposium on Neural Networks, ISNN 2009, Wuhan, 26 May 2009 through 29 May 2009; Volume 5551 LNCS, Issue PART 1, 2009, Pages 794-803; 03029743 (ISSN); 3642015069 (ISBN); 9783642015069 (ISBN). Kazemitabar, J.; Beigy, H.; Sharif University of Technology

2009

Abstract

Hierarchical reinforcement learning (HRL) has had a vast range of applications in recent years. Providing mechanisms for autonomous acquisition of skills has been a main topic of research in this area. While different methods have been proposed to achieve this goal, few have been shown to be successful in both performance and time complexity. In this paper, a linear-time algorithm is proposed to find subgoal states of the environment in early episodes of learning. Having subgoals available in the early phases of a learning task makes it possible to build skills that dramatically increase the convergence rate of the learning process. © 2009 Springer...
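As a rough illustration of why a decomposition into strongly connected components can run in linear time, the sketch below applies Kosaraju's algorithm to a small, made-up state-transition graph. It then marks states that have a transition leaving their component. The abstract does not give the paper's actual procedure, so treating such border states as subgoal candidates is an assumption made here for illustration only. The graph `transitions` and the helper `border_states` are hypothetical names.

```python
from collections import defaultdict


def strongly_connected_components(graph):
    """Kosaraju's algorithm, O(V + E).

    graph maps each state to an iterable of successor states.
    Returns a dict mapping each state to a representative of its component.
    """
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    # Pass 1: iterative depth-first search that records finishing order.
    visited, order = set(), []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph.get(start, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)
                stack.pop()

    # Build the reversed graph.
    reverse = defaultdict(list)
    for u, successors in graph.items():
        for v in successors:
            reverse[v].append(u)

    # Pass 2: explore the reversed graph in decreasing finishing order.
    component = {}
    for root in reversed(order):
        if root in component:
            continue
        component[root] = root
        stack = [root]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in component:
                    component[prev] = root
                    stack.append(prev)
    return component


def border_states(graph, component):
    """Hypothetical helper: states with a transition into another component."""
    return {
        u
        for u, successors in graph.items()
        for v in successors
        if component[u] != component[v]
    }


# Made-up example: two "rooms" joined by a one-way door from c to d.
transitions = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["e"],
    "e": ["d"],
}
component = strongly_connected_components(transitions)
borders = border_states(transitions, component)
print(sorted(borders))
```

In this toy graph the states a, b, c form one component and d, e another, so the only border state is c, the state in front of the one-way door.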

Cyber-social systems: modeling, inference, and optimal design

Article, IEEE Systems Journal; Volume 14, Issue 1, 2020, Pages 73-83. Doostmohammadian, M.; Rabiee, H. R.; Khan, U. A.; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2020

Abstract

This paper models the cyber-social system as a cyber-network of agents monitoring states of individuals in a social network. The state of each individual is represented by a social node, and the interactions among individuals are represented by a social link. In the cyber-network, each node represents an agent, and the links represent information sharing among agents. The agents make an observation of social states and perform distributed inference. In this direction, the contribution of this paper is threefold: First, a novel distributed inference protocol is proposed that makes no assumption on the rank of the underlying social system. This is significant as most protocols in the...

Critical graphs in index coding

Article, IEEE International Symposium on Information Theory - Proceedings; 2014, Pages 281-285. Tahmasbi, M.; Shahrasbi, A.; Gohari, A.; Sharif University of Technology

Abstract

In this paper, we define critical graphs as minimal graphs that support a given set of rates for the index coding problem, and study them for both the one-shot and asymptotic setups. For the case of equal rates, we find the critical graph with the minimum number of edges for both the one-shot and asymptotic cases. For the general case of possibly distinct rates, we show that for one-shot and asymptotic linear index coding, as well as asymptotic non-linear index coding, each critical graph is a union of disjoint strongly connected subgraphs (USCS). On the other hand, we identify a non-USCS critical graph for a one-shot non-linear index coding problem. In addition, we show that the capacity region of...

Distributed estimation recovery under sensor failure

Article, IEEE Signal Processing Letters; Volume 24, Issue 10, 2017, Pages 1532-1536; 10709908 (ISSN). Doostmohammadian, M.; Rabiee, H. R.; Zarrabi, H.; Khan, U. A.; Sharif University of Technology

Abstract

Single-time-scale distributed estimation of dynamic systems via a network of sensors/estimators is addressed in this letter. In single-time-scale distributed estimation, the two fusion steps, consensus and measurement exchange, are implemented only once, in contrast to, e.g., a large number of consensus iterations at every step of the system dynamics. We particularly discuss the problem of failure in the sensor/estimator network and how to recover for distributed estimation by adding new sensor measurements from equivalent states. We separately discuss the recovery for two types of sensors, namely α and β sensors. We propose polynomial-order algorithms to find equivalent state nodes in graph...

Automatic discovery of subgoals in reinforcement learning using strongly connected components

Article, 15th International Conference on Neuro-Information Processing, ICONIP 2008, Auckland, 25 November 2008 through 28 November 2008; Volume 5506 LNCS, Issue PART 1, 2009, Pages 829-834; 03029743 (ISSN); 3642024890 (ISBN); 9783642024894 (ISBN). Kazemitabar, J.; Beigy, H.; Asia Pacific Neural Network Assembly (APNNA); International Neural Network Society (INNS); IEEE Computational Intelligence Society; Japanese Neural Network Society (JNNS); European Neural Network Society (ENNS); Sharif University of Technology

2009

Abstract

The hierarchical structure of real-world problems has resulted in a focus on hierarchical frameworks in the reinforcement learning paradigm. Work on mechanisms for automatic discovery of macro-actions has mainly concentrated on subgoal discovery methods. Among the proposed algorithms, those based on graph partitioning have achieved precise results. However, few methods have been shown to be successful in both performance and time complexity. In this paper, we present an SCC-based subgoal discovery algorithm: a graph-theoretic approach for automatic detection of subgoals in linear time. Meanwhile, a parameter tuning method is proposed to find the...

Improved K2 algorithm for Bayesian network structure learning

Article, Engineering Applications of Artificial Intelligence; Volume 91, 2020. Behjati, S.; Beigy, H.; Sharif University of Technology

Elsevier Ltd 2020

Abstract

In this paper, we study the problem of learning the structure of Bayesian networks from data, which takes a dataset and outputs a directed acyclic graph. This problem is known to be NP-hard. Most existing algorithms for structure learning can be classified into three categories: constraint-based, score-based, and hybrid methods. The K2 algorithm, a score-based algorithm, takes a random order of variables as input, and its efficiency is strongly dependent on this ordering. An incorrect order of variables can lead to learning an incorrect structure. Therefore, the main challenge of this algorithm is the strong dependence of output quality on the initial order of variables. The main...