Computer Design: The 2013 WorldComp International Conference Proceedings

New research by international contributors on Computer Design

List price: $29.95

Your Price: $23.96

You Save: 20%

15 Chapters

Session - Processor and Integrated Circuit Design + Low Power Computing + Testing

 

Allocation of NBTI Aging Sensors for Circuit Failure Prediction

Int'l Conf. Computer Design | CDES'13 |

Samir Mahaboob Khan Kagadkar and Hussain Al-Asaad

Department of Electrical and Computer Engineering,

University of California, Davis

Abstract— Negative bias temperature instability (NBTI) is a critical device reliability concern in nanometer-scale CMOS processes. We review the degradation effects of this phenomenon and present techniques to measure and combat NBTI aging. Such techniques involve the insertion of specialized aging sensors and their use in self-correcting dynamic reliability management systems. We propose a novel approach to optimize the allocation of such aging sensors to minimize overhead.

Keywords: Integrated Circuit (IC) reliability, Circuit failure prediction, Negative Bias Temperature Instability (NBTI)

1. Introduction

Engineers working on the design of modern integrated circuits (ICs) fabricated in nanometer-scale technologies are in the unenviable position of having to face a plethora of issues that were benign in the past. Mounting parametric variability, radiation-induced soft errors and time-dependent device degradation make transistors increasingly unreliable components. A generation of engineers is realizing that hardware failures from these unreliable components are a distinctly realistic possibility.

 

Implementation of a Fast Fourier Transform Processor in NULL Convention Logic

Zhen Song and Scott C. Smith

Department of Electrical Engineering, University of Arkansas, Fayetteville, AR, U.S.A. szuark@gmail.com and smithsco@uark.edu

Abstract – The Fast Fourier Transform (FFT) is a critical component of communication systems because it greatly reduces the computational requirements of signal processing. This paper presents the design of an FFT processor using NULL Convention Logic (NCL), which has been shown to have power consumption advantages over its synchronous counterpart. Performance metrics for the NCL FFT processor are obtained from Cadence simulation and compared to an equivalent synchronous implementation.

1. INTRODUCTION

Hardware implementations of the FFT fall into two categories: fixed-point and floating-point. Although floating-point numbers inherently have a large dynamic range, a floating-point hardware implementation is larger, slower, and more power-consuming than its fixed-point counterpart, because arithmetic operations on both the mantissa and the exponent must be handled in hardware [1]. Therefore, to obtain a low-power, high-speed design, a synchronous FFT processor is usually built using the Q15 fixed-point format [1, 2].
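For readers unfamiliar with Q15, the following is a minimal software sketch of the format: a 16-bit signed value with 15 fractional bits, so a stored integer q represents q / 32768. The helper names (`to_q15`, `from_q15`, `q15_mul`, `saturate`) are illustrative assumptions and are not taken from the paper; the rounding and saturation choices shown are one common convention, not necessarily the one used in the FFT processor described.

```python
# Q15 fixed point: 1 sign bit, 15 fractional bits, stored in a 16-bit signed integer.
Q15_ONE = 1 << 15        # scale factor (32768)
Q15_MAX = Q15_ONE - 1    # largest representable value, just below +1.0
Q15_MIN = -Q15_ONE       # smallest representable value, exactly -1.0


def saturate(x: int) -> int:
    """Clamp an integer to the 16-bit signed range used by Q15."""
    return max(Q15_MIN, min(Q15_MAX, x))


def to_q15(x: float) -> int:
    """Convert a real number in [-1, 1) to Q15, saturating out-of-range inputs."""
    return saturate(round(x * Q15_ONE))


def from_q15(q: int) -> float:
    """Convert a Q15 integer back to a real number."""
    return q / Q15_ONE


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values.

    The raw product has 30 fractional bits, so half an LSB is added for
    rounding and the result is shifted right by 15 to return to Q15.
    """
    return saturate((a * b + (1 << 14)) >> 15)


half = to_q15(0.5)
print(from_q15(q15_mul(half, half)))  # 0.5 * 0.5 in Q15
```

Only integer multiplies, adds and shifts are needed, which is why a fixed-point datapath avoids the separate mantissa and exponent handling mentioned above.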

 

Network-Based System for Face Recognition on Mobile Wireless Devices

Keita Imaizumi and Vasily G. Moshnyaga

Graduate School of Engineering, Faculty of Engineering

Fukuoka University

Fukuoka, 814-0180 Japan

Abstract - This paper describes a new Internet-based face recognition system for portable devices. In contrast to existing systems, which run computationally intensive face-recognition tasks on the mobile terminal and thereby shorten its battery lifetime, the proposed system uses the mobile device only for image capture and the user interface. All complex image processing tasks are performed by a remote high-powered network server to achieve robust, real-time face recognition. The system is implemented in software and tested on an Android-based Sony Tablet-S wireless terminal. According to measurements, it performs face recognition on images of 240x320 pixels at a rate of 10 frames/sec with very high accuracy. The paper discusses the proposed client-server architecture and the results of its experimental evaluation.

 

A Fault Injection Environment for the Evaluation of a Soft Error Detection Technique based on Time Redundancy

Luis Bustamante and Hussain Al-Asaad

Department of Electrical and Computer Engineering

University of California, Davis, CA, USA

Abstract − This paper presents a Verilog test simulation environment designed to inject random transient faults into a 32-bit microprocessor. The purpose of the test environment is to study a hardware-assisted soft error detection technique based on time redundancy. The soft error detection method compares the states of the microprocessor across two independent executions of the same program. The simulation environment takes advantage of the redundant execution and divides the process into two phases. The first phase operates during the first execution of the program to determine the number of cycles required to complete the task when no faults are present. The second phase operates during the second execution of the program to inject a soft error into the microprocessor. The hardware assistance saves the microprocessor states as the program executes the first time and detects soft errors as they occur during the second execution.
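The two-phase flow described in the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions: the paper's environment is written in Verilog and targets a real 32-bit microprocessor, whereas here the "processor" is a hypothetical two-register state machine (`step`), and the names `golden_run` and `faulty_run` are invented for illustration. The fault is assumed to be a single bit flip in the accumulator register.

```python
import random

MASK32 = 0xFFFFFFFF


def step(state):
    """Hypothetical stand-in for one processor cycle on a (counter, acc) state."""
    counter, acc = state
    return counter - 1, (acc + counter * counter) & MASK32


def golden_run(initial_state):
    """Phase 1: fault-free execution. Save the state after every cycle;
    the length of the returned trace is the cycle count of the program."""
    trace = []
    state = initial_state
    while state[0] != 0:  # the toy program halts when the counter reaches zero
        state = step(state)
        trace.append(state)
    return trace


def faulty_run(initial_state, trace, fault_cycle, fault_bit):
    """Phase 2: re-execute, flip one accumulator bit at fault_cycle, and compare
    each cycle's state with the saved one. Returns the cycle at which a
    mismatch is detected, or None if the two executions agree."""
    state = initial_state
    for cycle, expected in enumerate(trace):
        state = step(state)
        if cycle == fault_cycle:
            counter, acc = state
            state = (counter, acc ^ (1 << fault_bit))  # injected soft error
        if state != expected:
            return cycle
    return None


start = (10, 0)
trace = golden_run(start)              # phase 1 gives the cycle count
rng = random.Random(0)
cycle = rng.randrange(len(trace))      # random injection point within the run
bit = rng.randrange(32)
print(len(trace), cycle, faulty_run(start, trace, cycle, bit))
```

Because the states are compared every cycle, the mismatch in this toy model is reported in the same cycle in which the bit flip is injected.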

 

A Comparative Analysis of Parallel Prefix Adders

Megha Talsania and Eugene John

Department of Electrical and Computer Engineering

University of Texas at San Antonio

San Antonio, TX 78249 megha.talsania@gmail.com, eugene.john@utsa.edu

Abstract- All modern processors, including general-purpose microprocessors, digital signal processors and GPUs, contain an Arithmetic Logic Unit (ALU). The computing efficiency of a modern processor depends mainly on the efficiency of its ALU. The adder is the basic building block of an ALU, which performs arithmetic as well as logic operations. This paper investigates the performance of six different parallel prefix adders implemented using four different TSMC technology nodes. The parallel prefix adders investigated in this paper are the Kogge-Stone, Brent-Kung, Han-Carlson, Sklansky, Ladner-Fischer, and Knowles adders. The performance metrics considered for the analysis of the adders are power, delay and area. Simulation studies are carried out for 16-, 32- and 64-bit input data widths.
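As background on what a parallel prefix adder computes, the sketch below models the Kogge-Stone scheme in software using bitwise operations on whole words. It illustrates the general technique, not the circuits evaluated in the paper; the function name `kogge_stone_add` and its default width are assumptions made for this example. Each loop iteration corresponds to one prefix level, combining every bit position with the position `dist` bits below it, so a 16-bit addition needs four levels.

```python
def kogge_stone_add(a: int, b: int, width: int = 16):
    """Add two unsigned integers with a Kogge-Stone prefix network.

    Returns (sum modulo 2**width, carry_out).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    g = a & b        # generate: this bit produces a carry by itself
    p = a ^ b        # propagate: this bit passes an incoming carry on
    half_sum = p     # sum bits before carries are applied

    dist = 1
    while dist < width:
        # Prefix operator: (g, p) o (g', p') = (g | (p & g'), p & p'),
        # applied to all bit positions at once via shifts.
        g = (g | (p & (g << dist))) & mask
        p = (p & (p << dist)) & mask
        dist <<= 1

    # After the last level, bit i of g is the carry out of bits 0..i,
    # so the carry into bit i+1 is g shifted left by one.
    carries = (g << 1) & mask
    carry_out = (g >> (width - 1)) & 1
    return half_sum ^ carries, carry_out


print(kogge_stone_add(0x1234, 0x0FF0))
```

The other adders named in the abstract compute the same prefix carries with different trade-offs in logic depth, fan-out and wiring, which is what makes a power, delay and area comparison meaningful.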

 

A Short Survey on User-aware Power Management

Dongwon Min1, Sangjun Lee2 and Sung Woo Chung3

1 School of Business Administration, Dankook University, Yongin-si, Gyonggi-do, 448-701, Korea

2 School of Computer Science and Engineering, Soongsil University, Seoul 156-743, Korea

3 Division of Computer and Communication Engineering, Korea University, Seoul 136-713, Korea

Abstract – This paper briefly surveys user-aware power management. In the past, most power-saving techniques focused on power and performance. Recently, however, some power management techniques have concentrated on user satisfaction rather than performance itself. After reading this paper, power management researchers are expected to collaborate with consumer researchers as well as HCI (Human Computer Interface) researchers.

Keywords: User Aware Power Management, Human Computer Interface, Consumer research

1 Introduction

In green computing, where users run computers for longer with less power consumption, low power is as crucial as performance. Low-power techniques are especially important for mobile devices such as smartphones, tablets, and notebooks. As time goes on, users consume more power because they want to run more powerful (and therefore more power-consuming) applications. This reduces battery usage time, which eventually makes users uncomfortable. Thus, various low-power techniques have been developed for mobile devices, from clock gating and power gating at the circuit level to power-aware scheduling at the operating system level.

 

Session - Performance Issues + HPC and Multi-Processor Systems + FPGA + File Servers

 

Performance Tradeoff Spectrum of Integer and Floating Point Applications Kernels on Various GPUs

M.G.B. Johnson, D. P. Playne and K.A. Hawick

Computer Science, Massey University, North Shore 102-904, Auckland, New Zealand email: {m.johnson, d.p.playne, k.a.hawick}@massey.ac.nz

Tel: +64 9 414 0800 Fax: +64 9 441 8181

April 2013

ABSTRACT

Floating point precision and performance and the ratio of floating point units to integer processing elements on a graphics processing unit accelerator all continue to present complex tradeoffs for optimising core utilisation on modern devices. We investigate various hybrid CPU and GPU combinations using a range of different GPU models occupying different points in this tradeoff space. We analyse some performance data for a range of numerical simulation kernels and discuss their use as benchmark problems for characterising such devices.

KEY WORDS

MIPS vs FLOPS; computational performance; accelerator; benchmark; GPU.

1 Introduction

 

Performance Measures of an Implementation of a Parallel Compiler

Deepa Komathukattil1, Roger Eggen1, Sanjay P. Ahuja1 and Behrooz Seyed Abbassi1

1 Computing and Information Sciences, University of North Florida, Jacksonville, FL, USA

Abstract - Parallel programming is prevalent in every field mainly to speed up computation. Advancements in multiprocessor technology fuel this trend toward parallel programming. However, modern compilers are still largely single threaded and do not take advantage of the machine resources available to them. A good deal of research has been reported on compilers that add parallel constructs to the programs they are compiling, enabling programs to exploit parallelism at run time. Auto parallelization of loops by a compiler is one such example. Parallelizing the compilation process itself has received less attention.

Parallelization brings along with it issues like synchronization and communication overhead. In the semantic analysis phase of a compiler, these issues are of particular relevance during the construction of the symbol table. This paper presents an approach to parallelizing program compilation during the semantic analysis phase. The parallel compiler developed here augments the work done formerly on a concurrent compiler developed at the

 

SDD: Selective De-Duplication with Index by File Size for Primary File Servers

Hitoshi Kamei1, Tomonori Esaka1, Satoru Kishimoto1, Takayuki Fukatani2, Takaki Nakamura3, and Norihisa Komoda4

1 Hitachi, Ltd., Yokohama, Kanagawa, Japan

2 Hitachi Europe Ltd., Bracknell, Berkshire, United Kingdom

3 Tohoku University, Sendai, Miyagi, Japan

4 Osaka University, Suita, Osaka, Japan

Abstract – We propose a method, called SDD, for improving the performance of file-level de-duplication on primary file servers. The processing time of de-duplication keeps increasing as more and more files are stored on the servers, so the de-duplication process cannot finish within its assigned time. According to previous studies, large files stored on the servers are dominant in terms of storage space, while rather small files are dominant in terms of file count. SDD sets a file size threshold to narrow down the target files. We develop and evaluate a prototype system using SDD, which increases the throughput of the de-duplication process.
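The idea of combining a size threshold with a size index can be sketched as follows. This is an illustrative in-memory model rather than the authors' implementation: the function name `find_duplicates`, the use of SHA-256, and the assumption that files at or above the threshold are the ones targeted are all choices made for this example. Files are first grouped by size, since files of different sizes cannot be identical, and only groups with more than one member are hashed.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files, size_threshold):
    """Return groups of names whose contents are identical.

    files maps a file name to its contents as bytes. Only files of at least
    size_threshold bytes are considered (an assumption of this sketch).
    """
    # Index candidate files by size; small files are skipped entirely.
    by_size = defaultdict(list)
    for name, data in files.items():
        if len(data) >= size_threshold:
            by_size[len(data)].append(name)

    groups = []
    for names in by_size.values():
        if len(names) < 2:
            continue  # a unique size cannot have a duplicate, so no hashing
        by_hash = defaultdict(list)
        for name in names:
            digest = hashlib.sha256(files[name]).hexdigest()
            by_hash[digest].append(name)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)


example = {
    "a.bin": b"x" * 100,
    "b.bin": b"x" * 100,
    "c.bin": b"y" * 100,
    "d.txt": b"z",
    "e.txt": b"z",
}
print(find_duplicates(example, size_threshold=10))
```

Raising the threshold shrinks the number of files that must be indexed and hashed, at the cost of missing duplicates among the small files that contribute little storage space.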

 

Iterative Synthesis Techniques for Multiple-Valued Logic Functions -- A Review and Comparison

Mostafa Abd-El-Barr

Department of Information Science, Kuwait University, Kuwait.

Abstract - A number of heuristics for near-optimal functional synthesis of Multi-Valued Logic (MVL) have been reported in the literature. Among the well-known heuristics is the Direct Cover Algorithm (DCA). We have introduced a number of improved versions of the DCA. These include the Weighted Direct Cover (WDC), the Ordered Direct Cover (ODC), and the Fuzzy Direct Cover (FDC). In this paper, we review and compare the performance of these iterative heuristic techniques using two sets of benchmarks. The first consists of 50000 randomly generated 2-variable 4-valued functions and the second consists of 50000 2-variable 5-valued functions. The average number of product terms required to synthesize a given MVL function is used as the criterion for comparison. The results show that the modified iterative synthesis heuristics outperform the DCA and that, among the modified techniques, the FDC produces the best results.

 

FPGA-based Hexapod Robot Spider

Yuhua Li

Huimin Ma

Dept. of Computer Science and Technology

Xi’an Jiaotong University, City College

Xi’an, China yhli@mail.xjtu.edu.cn

Abstract—This paper describes an FPGA-based hexapod robot spider, which is used for student education purposes. The paper introduces the mechanical design, kinematic analysis, electro-mechanical devices and FPGA system. Some key points about the gait mode, the non-stop PWM signal, the filter for reflected ultrasonic waves and Bluetooth control are given in detail. At the end of the paper, a discussion section gives some technical opinions about the FPGA-based system, the gait mode, the sonar echo filter and remote control.

Index Terms—FPGA, robot spider, mini-sever, PWM, Bluetooth Control

I. INTRODUCTION

Robot spiders are widely built and researched for a variety of purposes, including space exploration, mine clearing on battlefields and rescue work in disasters. Robot spiders suit such applications because of their special leg-walking mode. Compared with a wheeled robot, a robot spider is no more efficient, but it can easily cross over obstacles.

 

3D Lattice Monte Carlo Simulations on FPGAs

A. Gilman1, A. Leist2 and K.A. Hawick1

1 Institute of Natural and Mathematical Sciences,

2 School of Engineering and Advanced Technology,

Massey University, North Shore 102-904, Auckland, New Zealand email: { a.gilman, a.leist, k.a.hawick }@massey.ac.nz

Tel: +64 9 414 0800 Fax: +64 9 441 8181

Abstract— Field Programmable Gate Arrays (FPGAs) offer significant performance advantages over general purpose compute architectures for certain scientific problems, including lattice-based Monte Carlo simulations of complex systems models. We report on a custom logic design for the 3D-lattice Ising model that keeps the entire system state in on-chip memory to achieve very high throughput rates. The pipelined architecture, which is implemented in Verilog, is able to process an entire row of cells per clock cycle. When processing a system of 256^3 spins on a Xilinx Virtex-7 device, about 3000 full system sweeps can be performed per second. We discuss implementation issues and solutions that apply in similar ways to a variety of nearest neighbour, lattice-based Monte

 

Redundancy + Reconfigurability = Recoverability

Simon Monkman1 and Igor Schagaev2

1 ITACS Ltd, 157 Shephall View, Stevenage, SG1 1RR, England

2 Faculty of Computing, London Metropolitan University, 166-220 Holloway Road, London, N7 8DB, England

Abstract - An approach to consider computers and connected computer systems using structural, time and information redundancies is proposed. An application of redundancy for reconfigurability and recoverability of computer and connected computer systems is discussed, gaining performance, reliability and power-saving in operation. A paradigm of recoverability is introduced and, if followed, shifts connected computer systems toward real-time applications. Use of redundancy for connected computers is analysed in terms of recoverability, where two supportive algorithms of forward and backward tracing are proposed and explained. As an example, growth of mission reliability is formulated.

Keywords: redundancy; reconfigurability; recoverability; performance-reliability-energy-wise systems

 



Details


Format name: PDF
Encrypted: No
SKU: B000000030921
ISBN: 9781601322364
File size: 5.19 MB
Printing: Allowed
Copying: Allowed
Read aloud: Allowed