Search for: proposed-architectures
Total 65 records

    Value-Aware low-power register file architecture

    , Article CADS 2012 - 16th CSI International Symposium on Computer Architecture and Digital Systems ; 2012 , Pages 44-49 ; 9781467314824 (ISBN) Ahmadian, S. N ; Fazeli, M ; Ghalaty, N. F ; Miremadi, S. G ; Sharif University of Technology
    2012
    Abstract
    In this paper, we propose a low-power register file architecture for embedded processors. The proposed architecture, "Value-Aware Partitioned Register File (VAP-RF)", employs a partitioning technique that divides the register file into two partitions such that the most frequently accessed registers are stored in the smaller register partition. In our partitioning algorithm, we introduce an aggressive clock-gating scheme based on narrow-value registers to further reduce power. Experimental results on an ARM processor for selected MiBench workloads show that the proposed architecture achieves an average power saving of 70% over a generic register file structure.

    A survey on deep learning based approaches for action and gesture recognition in image sequences

    , Article 12th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2017, 30 May 2017 through 3 June 2017 ; 2017 , Pages 476-483 ; 9781509040230 (ISBN) Asadi Aghbolaghi, M ; Clapes, A ; Bellantonio, M ; Escalante, H. J ; Ponce Lopez, V ; Baro, X ; Guyon, I ; Kasaei, S ; Escalera, S ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2017
    Abstract
    Interest in action and gesture recognition has grown considerably in recent years. In this paper, we present a survey on current deep learning methodologies for action and gesture recognition in image sequences. We introduce a taxonomy that summarizes important aspects of deep learning for approaching both tasks. We review the details of the proposed architectures, fusion strategies, main datasets, and competitions. We summarize and discuss the main works proposed so far with particular interest in how they treat the temporal dimension of data, discussing their main features and identifying opportunities and challenges for future research. © 2017 IEEE

    Design and implementation issues for mobfish

    , Article 12th International Conference on Distributed Multimedia Systems, DMS 2006, 30 August 2006 through 1 September 2006 ; 2006 , Pages 17-22 ; 1891706195 (ISBN) Sheikhattar, H ; Rahiman, V ; Jaberipur, G ; Noroozi, N ; Sharif University of Technology
    Knowledge Systems Institute Graduate School  2006
    Abstract
    New advances in mobile-computer technology, together with the advent of wireless networking, introduce new requirements, capabilities and concerns to the computer science and industry. In this paper, we discuss the special requirements of file sharing in mobile environments. Then our proposed architecture for mobile file sharing (MobFish) is presented. Finally, we present an implementation of the proposed architecture based on its special design tactics. © 2006 by Knowledge Systems Institute Graduate School. All rights reserved  

    An efficient high-throughput LSI architecture for a synchronization block applied to real-time optical OFDM systems

    , Article Proceedings - IEEE International Symposium on Circuits and Systems ; 1- 5 June , 2014 , pp. 1752-1755 ; ISSN: 02714310 Ghanaatian, R ; Shabany, M ; Sharifkhani, M ; Sharif University of Technology
    Abstract
    An efficient low-complexity VLSI architecture for timing synchronization of a real-time intensity modulation direct detection optical OFDM (IMDD-OOFDM) system is proposed, which results in a significant area reduction. This architecture calculates the correlation among cyclic prefix (CP) regions to estimate the beginning of the OFDM symbol. The proposed architecture utilizes only one functional unit for this purpose, while the throughput is devised for high data-rate optical OFDM systems. Synthesis results of this architecture prove an area saving of 31% compared to the previous work. Moreover, the performance of the correlation method is significantly improved due to a modification applied... 
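The cyclic-prefix correlation mentioned in this abstract can be illustrated in software. The following is a minimal sketch of the general CP-correlation idea only, not the paper's single-functional-unit hardware design or its modified method. The function names, the window sizes, and the toy unit-amplitude samples standing in for a real OFDM waveform are all assumptions made for illustration.

```python
import cmath
import random


def cp_timing_metric(rx, n_fft, cp_len):
    """For each candidate start d, correlate a cp_len window with the
    window n_fft samples later and return the correlation magnitudes."""
    last = len(rx) - n_fft - cp_len
    metric = []
    for d in range(last + 1):
        acc = 0j
        for k in range(cp_len):
            acc += rx[d + k] * rx[d + k + n_fft].conjugate()
        metric.append(abs(acc))
    return metric


def estimate_symbol_start(rx, n_fft, cp_len):
    """Return the offset whose metric is largest (estimated start of the CP)."""
    metric = cp_timing_metric(rx, n_fft, cp_len)
    return max(range(len(metric)), key=metric.__getitem__)


def random_phasor(rng):
    """Unit-amplitude sample with a uniformly random phase (toy data)."""
    return cmath.exp(1j * rng.uniform(0.0, 2.0 * cmath.pi))


def make_test_signal(n_fft, cp_len, offset, tail, seed=1):
    """Hypothetical noiseless test signal: random samples, then one symbol
    whose last cp_len samples are copied to its front, then more samples."""
    rng = random.Random(seed)
    body = [random_phasor(rng) for _ in range(n_fft)]
    symbol = body[-cp_len:] + body  # cyclic prefix followed by the body
    lead = [random_phasor(rng) for _ in range(offset)]
    trail = [random_phasor(rng) for _ in range(tail)]
    return lead + symbol + trail


if __name__ == "__main__":
    rx = make_test_signal(n_fft=64, cp_len=16, offset=23, tail=40)
    print(estimate_symbol_start(rx, n_fft=64, cp_len=16))
```

Because the prefix repeats the end of the symbol, the products in the window line up in phase only at the true offset, so the metric peaks there. A real receiver would also have to cope with noise and channel effects, which this sketch ignores.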

    Energy efficient all-optical arbitration in optical network-on-chip

    , Article 2012 Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference, OFC/NFOEC 2012 ; 2012 ; 9781467302623 (ISBN) Koohi, S ; Yin, Y ; Hessabi, S ; Yoo, S. J. B ; Sharif University of Technology
    2012
    Abstract
    We propose an all-optical arbitration architecture to resolve end-point contention in the optical networks-on-chip. The proposed architecture reduces on-chip optical power and energy losses by 37% and 21%, respectively, compared to Corona's token-based control plane  

    High-throughput low-complexity systolic Montgomery multiplication over GF(2^m) Based on Trinomials

    , Article IEEE Transactions on Circuits and Systems II: Express Briefs ; Volume 62, Issue 4 , January , 2015 , Pages 377-381 ; 15497747 (ISSN) Bayat Sarmadi, S ; Farmani, M ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2015
    Abstract
    Cryptographic computation exploits finite field arithmetic and, in particular, multiplication. Lightweight and fast implementations of such arithmetic are necessary for many sensitive applications. This brief proposes a low-complexity systolic Montgomery multiplication over GF(2^m). Our complexity analysis shows that the area complexity of the proposed architecture is reduced compared with the previous work. This has also been confirmed through our application-specific integrated circuit area and time equivalent estimations and implementations. Hence, the proposed architecture appears to be very well suited for high-throughput low-complexity cryptographic applications.
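As background for this entry, Montgomery multiplication over GF(2^m) computes a·b·x^(-m) mod f(x). The sketch below shows the textbook bit-serial form of that operation, not the systolic architecture proposed in the brief. The small trinomial x^7 + x + 1, the integer bit-vector encoding of polynomials, and the function names are assumptions chosen for illustration.

```python
M = 7
F = (1 << 7) | (1 << 1) | 1  # assumed example trinomial f(x) = x^7 + x + 1


def montgomery_mul(a, b, f=F, m=M):
    """Return a * b * x^(-m) mod f, with polynomials over GF(2) stored as
    integers (bit i is the coefficient of x^i)."""
    c = 0
    for i in range(m):
        if (a >> i) & 1:
            c ^= b      # add a_i * b (addition in GF(2) is XOR)
        if c & 1:
            c ^= f      # f has constant term 1, so this clears bit 0
        c >>= 1         # exact division by x
    return c


def plain_mul(a, b, f=F, m=M):
    """Reference shift-and-add multiplication a * b mod f."""
    c = 0
    for i in reversed(range(m)):
        c <<= 1                 # multiply the running result by x
        if (c >> m) & 1:
            c ^= f              # reduce when the degree reaches m
        if (a >> i) & 1:
            c ^= b
    return c


R = F ^ (1 << M)  # x^m mod f, which is x + 1 for this trinomial

if __name__ == "__main__":
    a, b = 0b1011001, 0b0110101
    # Multiplying the Montgomery product by x^m recovers the plain product.
    print(plain_mul(montgomery_mul(a, b), R) == plain_mul(a, b))
```

The division by x in each iteration is what makes the method attractive in hardware: reduction becomes a conditional XOR and a shift instead of a comparison against the modulus.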

    Architecture to improve the accuracy of automatic image annotation systems

    , Article IET Computer Vision ; Volume 14, Issue 5 , August , 2020 , Pages 214-223 Khatchatoorian, A. G ; Jamzad, M ; Sharif University of Technology
    Institution of Engineering and Technology  2020
    Abstract
    Automatic image annotation (AIA) is an image retrieval mechanism to extract relative semantic tags from visual content. So far, accuracy improvements in newly developed methods have been about 1 or 2% in the F1-score, and the architectures seem to have room for improvement. Therefore, the authors designed a more detailed architecture for AIA and suggested new algorithms for its main parts. The proposed architecture has three main parts: feature extraction, learning, and annotation. They designed a novel learning method using machine learning and probability bases. In the annotation part, they suggest a novel method that gains the maximum benefit from the learning part. The... 

    A two-stage pipelined passive charge-sharing SAR ADC

    , Article APCCAS 2008 - 2008 IEEE Asia Pacific Conference on Circuits and Systems, Macao, 30 November 2008 through 3 December 2008 ; January , 2008 , Pages 141-144 ; 9781424423422 (ISBN) Imani, A ; Bakhtiar, M. S ; Sharif University of Technology
    2008
    Abstract
    This paper presents a new ADC based on a passive charge-sharing SAR ADC in a 2-stage pipeline architecture. The charge-domain operation of the passive charge-sharing ADC poses an inherent limitation on its resolution. The proposed architecture increases the achievable resolution with a low power overhead. Designed and simulated in a 0.18 μm CMOS process, the 12-bit, 40 MS/s ADC core consumes 7 mW from a 1.8 V supply.

    A time modulated array with polarization diversity capability

    , Article IEEE Access ; Volume 9 , 2021 , Pages 53735-53744 ; 21693536 (ISSN) Mazaheri, M. H ; Fakharzadeh, M ; Akbari, M ; Safavi Naeini, S ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2021
    Abstract
    The Time Modulated Array (TMA) provides multiple radiation beams at different frequencies, by turning the radiating elements on and off. In this paper, we propose to add polarization diversity to a wireless communication link by leveraging the capabilities of the TMA. The proposed architecture switches between two orthogonal polarizations, instead of turning the antenna off, providing polarization diversity for the receivers. We show that the beams of each polarization at the sidebands are exactly the same. However, the switching sequence should be optimized for the fundamental frequency to provide the same radiation level at each sideband. Two different switching sequences are proposed,... 

    PSP-Cache: A low-cost fault-tolerant cache memory architecture

    , Article Proceedings -Design, Automation and Test in Europe, DATE ; 2014 ; ISSN: 15301591 ; ISBN: 9783981537024 Farbeh, H ; Miremadi, S. G ; Sharif University of Technology
    Abstract
    Cache memories constitute a large fraction of processor chip area and are highly vulnerable to soft errors caused by energetic particles. To protect these memories, most of the modern processors employ Error Detection Codes (EDCs) or Error Correction Codes (ECCs). EDCs/ECCs impose significant overheads in terms of area and energy; these overheads increase as a function of interleaving EDCs/ECCs to detect/correct multiple errors. This paper proposes a new cache architecture to minimize the area and energy overheads of EDCs/ECCs in set-associative L1-caches. Simulation results for a 4-way set-associative cache show that the proposed architecture reduces both the area and static power overheads... 

    Towards dark silicon era in FPGAs using complementary hard logic design

    , Article Conference Digest - 24th International Conference on Field Programmable Logic and Applications, FPL 2014 ; Sept , 2014 , pp. 1 - 6 ; ISBN: 9783000446450 Ahari, A ; Khaleghi, B ; Ebrahimi, Z ; Asadi, H ; Tahoori, M. B ; Sharif University of Technology
    Abstract
    While the transistor density continues to grow exponentially in Field-Programmable Gate Arrays (FPGAs), the increased leakage current of CMOS transistors acts as a power wall for the aggressive integration of transistors in a single die. One recent trend to alleviate the power wall in FPGAs is to turn off inactive regions of the silicon die, referred to as dark silicon. This paper presents a reconfigurable architecture to enable effective fine-grained power gating of unused Logic Blocks (LBs) in FPGAs. In the proposed architecture, the traditional soft logic is replaced with Mega Cells (MCs), each consisting of a set of complementary Generic Reconfigurable Hard Logic (GRHL) and a conventional... 

    A power-efficient reconfigurable architecture using PCM configuration technology

    , Article Proceedings -Design, Automation and Test in Europe, DATE ; 2014 Ahari, A ; Asadi, H ; Khaleghi, B ; Tahoori, M. B ; Sharif University of Technology
    Abstract
    Promising advantages offered by resistive NonVolatile Memories (NVMs) have brought great attention to replace existing volatile memory technologies. While NVMs were primarily studied to be used in the memory hierarchy, they can also provide benefits in Field-Programmable Gate Arrays (FPGAs). One major limitation of employing NVMs in FPGAs is significant power and area overheads imposed by the Peripheral Circuitry (PC) of NVM configuration bits. In this paper, we investigate the applicability of different NVM technologies for configuration bits of FPGAs and propose a power-efficient reconfigurable architecture based on Phase Change Memory (PCM). The proposed PCM-based architecture has been... 

    An efficient VLSI architecture of QPP interleaver/deinterleaver for LTE turbo coding

    , Article Proceedings - IEEE International Symposium on Circuits and Systems ; 2013 , Pages 797-800 ; 02714310 (ISSN) ; 9781467357609 (ISBN) Ardakani, A ; Mahdavi, M ; Shabany, M ; Sharif University of Technology
    2013
    Abstract
    Long Term Evolution (LTE) supports peak data rates in excess of 300 Mb/s. A good approach to achieve such rates is by parallelizing the required processing in turbo decoders. An interleaver is an important part of a turbo decoder. LTE uses the Quadratic Permutation Polynomial (QPP) interleaver, which makes it suitable for parallel decoding. In this paper, we propose an efficient architecture for the QPP interleaver, called the Add-Compare-Select (ACS) permuting network. A unique feature of the proposed architecture is that it can be used both as the interleaver and deinterleaver leading to a high-speed low-complexity hardware interleaver/deinterleaver for turbo decoding. The proposed design... 
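For context, the QPP interleaver named in this abstract maps index i to (f1·i + f2·i²) mod K. The sketch below illustrates only that index mapping and its inverse in software; it does not model the ACS permuting network proposed in the paper. The parameters K = 40, f1 = 3, f2 = 10 are taken here as the smallest LTE block size, and the function names are assumptions made for illustration.

```python
def qpp_permutation(k, f1, f2):
    """Return the QPP index map pi(i) = (f1*i + f2*i^2) mod k for i in 0..k-1."""
    return [(f1 * i + f2 * i * i) % k for i in range(k)]


def interleave(data, perm):
    """Output position i takes the input at position perm[i]."""
    return [data[p] for p in perm]


def deinterleave(data, perm):
    """Invert interleave(): put each received element back at perm[i]."""
    out = [None] * len(data)
    for i, p in enumerate(perm):
        out[p] = data[i]
    return out


if __name__ == "__main__":
    K, F1, F2 = 40, 3, 10  # assumed example parameters (smallest LTE block)
    perm = qpp_permutation(K, F1, F2)
    block = list(range(100, 100 + K))
    shuffled = interleave(block, perm)
    print(deinterleave(shuffled, perm) == block)
```

The same index map serves both directions, which loosely mirrors the abstract's point that one structure can act as interleaver and deinterleaver, although the hardware mechanism in the paper is not reproduced here.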

    HAFTA: Highly available fault-tolerant architecture to protect SRAM-based reconfigurable devices against multiple bit upsets

    , Article IEEE Transactions on Device and Materials Reliability ; Volume 13, Issue 1 , November , 2013 , Pages 203-212 ; 15304388 (ISSN) Ghaderi, Z ; Miremadi, S. G ; Asadi, H ; Fazeli, M ; Sharif University of Technology
    2013
    Abstract
    Despite widespread use of SRAM-based reconfigurable devices (SRDs) in mainstream applications, their usage has been very limited in enterprise and safety-critical applications due to SRAM susceptibility to soft errors. Previous mitigation techniques to protect SRDs impose significant area and power overheads. Additionally, they suffer from susceptibility of configuration bits to multiple bit upsets (MBUs). In this paper, we present a highly available fault-tolerant architecture to protect SRD-based designs against MBUs in both configuration and user bits. In the proposed architecture, the entire design is duplicated with respect to the relative locations of logic blocks within the SRD and... 

    Maestro: A high performance AES encryption/decryption system

    , Article Proceedings - 17th CSI International Symposium on Computer Architecture and Digital Systems, CADS 2013 ; October , 2013 , Pages 145-148 ; 9781479905621 (ISBN) Biglari, M ; Qasemi, E ; Pourmohseni, B ; Computer Society of Iran; IPM ; Sharif University of Technology
    IEEE Computer Society  2013
    Abstract
    High-throughput AES encryption/decryption is a necessity for many modern embedded systems. This article presents a high-performance yet cost-efficient AES system. Maestro can be used in a wide range of embedded applications with various requirements and limitations. Maestro is about one million times faster than the pure software implementation. The Maestro architecture is composed of two major components: the soft processor aimed at system initialization and control, and the hardware AES engine for high-performance AES encryption/decryption. A ten-stage implicit pipelined architecture is considered for the AES engine. Two novel techniques are proposed in design of AES engine which enable... 

    A distributed locality-aware neighbor selection algorithm for P2P video streaming over wireless mesh networks

    , Article 2012 6th International Symposium on Telecommunications, IST 2012 ; 2012 , Pages 639-643 ; 9781467320733 (ISBN) Moayeri, F ; Akbari, B ; Khansari, M ; Ahmadifar, B ; Sharif University of Technology
    2012
    Abstract
    Nowadays, deployment of peer-to-peer video streaming systems over wireless mesh networks has attained rising popularity among a large number of users around the world. In this paper, we present an efficient peer-to-peer live video streaming architecture over multi-hop wireless mesh networks. In our proposed architecture, we take the physical topology of the network into account and, based on a distributed locality-aware neighbor selection algorithm in the overlay construction phase, we generate an efficient mesh-based overlay on top of wireless mesh networks. In the locality-aware neighbor selection algorithm, instead of choosing randomly, peers find their best neighbors based on their... 

    ONC3: All-optical NoC based on cube-connected cycles with quasi-DOR algorithm

    , Article Proceedings - 15th Euromicro Conference on Digital System Design, DSD 2012 ; 2012 , Pages 296-303 ; 9780769547985 (ISBN) Abdollahi, M ; Tavana, M. K ; Koohi, S ; Hessabi, S ; Sharif University of Technology
    2012
    Abstract
    This paper proposes a nanophotonic Network-on-Chip architecture based on the traditional Cube-Connected Cycles (CCC) topology, which is named ONC3. We also suggest a contention-free quasi-Dimension-Order-Routing algorithm for the proposed structure. Compared to the previous 2D layouts, our novel scheme lessens the crosstalk parameter of the insertion loss and, consequently, the power consumption. Besides, the router structure is area-efficient. On the other hand, optical destination checking supersedes electrical resource reservation by utilizing a passive wavelength routing method and a Wavelength Division Multiplexing scheme simultaneously. The efficiency of the proposed architecture, in... 

    An efficient architecture for Sequential Monte Carlo receivers in wireless flat-fading channels

    , Article Journal of Signal Processing Systems ; Volume 68, Issue 3 , 2012 , Pages 303-315 ; 19398018 (ISSN) Shabany, M ; Sharif University of Technology
    Springer New York LLC  2012
    Abstract
    A pipelined architecture is developed for a Sequential Monte Carlo (SMC) receiver that performs joint channel estimation and data detection. The promising feature of the proposed SMC receiver is achieving the near-bound performance in fading channels without using any decision feedback, training or pilot symbols. The proposed architecture exploits the parallelism intrinsic to the algorithm and consists of three blocks, i.e., the SMC core, weight calculator, and resampler. Hardware-efficient/parallel architectures for each functional block, including the resampling block, are developed. The novel feature of the proposed architecture is that it makes the execution time of the resampling independent... 

    Ultra high-throughput architectures for hard-output MIMO detectors in the complex domain

    , Article Midwest Symposium on Circuits and Systems, 7 August 2011 through 10 August 2011 ; August , 2011 ; 15483746 (ISSN) ; 9781612848570 (ISBN) Mahdavi, M ; Shabany, M ; Sharif University of Technology
    2011
    Abstract
    In this paper, a novel hard-output detection algorithm for the complex multiple-input multiple-output (MIMO) detectors is proposed, which results in a significant throughput enhancement, a near-ML performance, and an SNR-independent fixed throughput. Moreover, a high-throughput VLSI implementation is proposed, which is based on a novel method of the node generation and sorting scheme. The proposed design achieves a throughput of 10 Gbps in a 0.13 μm CMOS process, which is the highest throughput reported in the literature for both the real and the complex domains. Synthesis results in 90 nm CMOS also show that the proposed scheme can achieve a throughput of up to 15 Gbps. Moreover, the FPGA... 

    ScTMR: A scan chain-based error recovery technique for TMR systems in safety-critical applications

    , Article Proceedings -Design, Automation and Test in Europe, DATE, 14 March 2011 through 18 March 2011 ; March , 2011 , Pages 289-292 ; 15301591 (ISSN) ; 9783981080179 (ISBN) Ebrahimi, M ; Miremadi, S. G ; Asadi, H ; Sharif University of Technology
    2011
    Abstract
    We propose a roll-forward error recovery technique based on multiple scan chains for TMR systems, called Scan chained TMR (ScTMR). ScTMR reuses the scan chain flip-flops employed for testability purposes to restore the correct state of a TMR system in the presence of transient or permanent errors. In the proposed ScTMR technique, we present a voter circuitry to locate the faulty module and a controller circuitry to restore the system to the fault-free state. As a case study, we have implemented the proposed ScTMR technique on an embedded processor, suited for safety-critical applications. Exhaustive fault injection experiments reveal that the proposed architecture has the error detection and...