2009 20th IEEE International Conference on Application-specific Systems, Architectures and Processors

An FPGA-based Parallel Hardware Architecture for Real-time Face Detection using a Face Certainty Map

Seunghun Jin, Dongkyun Kim, Thuy Tuong Nguyen
Dept. of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, South Korea
{coredev, bluster, ntthuy}@ece.skku.ac.kr

Bongjin Jun, Daijin Kim
Dept. of Computer Science and Engineering, Postech, Pohang, South Korea
{simple21, dkim}@postech.ac.kr

Jae Wook Jeon
School of Information and Computer Engineering, Sungkyunkwan University, Suwon, South Korea
[email protected]

Abstract— This paper presents an FPGA-based parallel hardware architecture for real-time face detection. An image pyramid with twenty depth levels is generated from the input image. For these scaled-down images, a local binary pattern transform and feature evaluation are performed in parallel using the proposed block RAM-based window processing architecture. By sharing the feature look-up tables between two corresponding scaled-down images, we reduce the use of routing resources by half. For prototyping and evaluation purposes, the hardware architecture was integrated into a Virtex-5 FPGA. The experimental results show a processing speed of around 300 frames per second for standard VGA (640×480, 8-bit) images. In addition, the throughput of the implementation can be adjusted in proportion to the frame rate of the camera by synchronizing each individual module with the pixel sampling clock.

Keywords: face detection; hardware architecture; FPGA; image processing

I. INTRODUCTION

Face detection is widely used in various computer vision applications, such as face recognition, content-based image retrieval, video surveillance, and human-computer interaction. However, detecting faces in real-time is challenging, since the human face is a dynamic object with a huge variety of instances. Many different approaches to face detection have been explored over the past decades [1-5]. Generally, statistical face detection methods, such as SVM and adaboost, are widely used because of their robustness and computational efficiency [6-9]. However, practical use of purely software-based face detection remains difficult, because conventional face detection algorithms repeatedly downscale the image and search all possible face candidates until the downscaled image is smaller than the processing window [10]. This feature computation consumes almost all of the computing power of conventional processors and requires frequent memory access [11, 12]. As a result, deployment and real-time processing of face detection in an embedded environment become difficult. For this reason, several face detection hardware implementations have been developed using FPGAs and reconfigurable multi-processor platforms [13-19]. In particular, FPGA-based systems show significant improvements compared to previous systems, as a result of the rapid development of programmable devices in recent years. C. Gao and S. Lu [20] proposed an FPGA-based Haar classifier face detection accelerator implemented on a Xilinx Virtex-5 LX110T FPGA; it processes 256×192 images at 37 fps (frames per second) with 1 classifier and at 98 fps with 16 classifiers. H. Ngo et al. [12] developed a cost-effective face detection system based on a low-cost FPGA prototype board from Altera (DE2 board); they proposed an area-efficient modular architecture for the Viola-Jones face detector that handles 320×240 video streams at a minimum processing rate of 30 frames/sec. H. Lai et al. [21] designed an FPGA hardware architecture for high frame rate face detection using feature cascade classifiers; the architecture was verified on the Xilinx Virtex-2 Pro 30 platform, achieving 143 fps for 640×480 images using a single scan window when running at 126 MHz. K. Irick et al. [22] proposed a unified streaming architecture for real-time face detection and gender classification, implemented on the Xilinx Virtex-4 FX12 FPGA; the system processes 320×240 images and runs at about 52 and 175 frames/sec in the worst and typical cases, respectively. Although considerable progress has been made, face detection systems developed so far only partially satisfy the requirements for accuracy, speed, and area efficiency. Existing systems have a relatively high FAR (false acceptance rate) caused by a small number of classifiers [21], employ an unrealistically high pixel offset of 10 [22], or process images of relatively small size [12, 20]. In this paper, we propose a dedicated hardware architecture for the real-time face detection algorithm proposed by Jun and Kim [10]. By employing the FCM in the post-processing step, this method reduces the FAR to around one-tenth that of the existing cascade adaboost detector while maintaining a comparable detection rate. We implemented the entire face detection procedure in a single FPGA, including pyramid scaling, the LBP (local binary pattern) transform, the multi-stage classifier cascade, and FCM-based post-processing. By eliminating the asymmetry between image capturing and face detection through fully parallelized processing of pyramid images, the sequential bottleneck caused by repetitive feature evaluation is removed.
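The software bottleneck described above can be made concrete with a sketch of the conventional detector loop. The window size, scale factor, step, and the `classify` callback are illustrative assumptions, not the paper's parameters:

```python
def pyramid_scan(image, classify, win=20, scale=1.2, step=1):
    """Naive software face detection: repeatedly downscale the image and
    exhaustively scan every window position (illustrative sketch).
    classify(image, x, y, win) -> bool is a hypothetical stand-in for the
    cascade evaluation that dominates the running time."""
    detections = []
    h, w = len(image), len(image[0])
    s = 1.0                                      # cumulative scale factor
    while min(h, w) >= win:
        for y in range(0, h - win + 1, step):    # every candidate position
            for x in range(0, w - win + 1, step):
                if classify(image, x, y, win):
                    # map the hit back to original-image coordinates
                    detections.append((int(x * s), int(y * s), int(win * s)))
        h, w = int(h / scale), int(w / scale)
        if min(h, w) < win:
            break
        # downscale by the pyramid factor (nearest neighbour for brevity)
        image = [[image[int(j * scale)][int(i * scale)] for i in range(w)]
                 for j in range(h)]
        s *= scale
    return detections
```

Even for a tiny 30×30 input this loop already evaluates the classifier at every position of every pyramid level, which is the repeated feature computation the proposed hardware parallelizes away.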

1063-6862/09 $25.00 © 2009 IEEE DOI 10.1109/ASAP.2009.36


Authorized licensed use limited to: Sungkyunkwan University. Downloaded on August 20,2009 at 01:55:17 EDT from IEEE Xplore. Restrictions apply.

In addition, the entire system is synchronized with a pixel sampling clock signal to optimize scalability and throughput. As a result, the frame rate of the proposed system can be flexibly increased or decreased in proportion to the frame rate of the camera. The remainder of the paper is organized as follows. Section II briefly reviews the face detection algorithm using the FCM. Section III gives an overview of the proposed hardware architecture and describes the detailed design of the sub-modules. Section IV presents the FPGA-based implementation of the proposed hardware architecture along with synthesis and experimental results. Finally, Section V concludes this paper and discusses future work.

II. FACE DETECTION USING FCM

The FCM-based face detection algorithm consists of four separate stages, as shown in Figure 1. In the first stage, the captured image is successively downscaled to detect faces of different sizes, since the algorithm employs a fixed processing window size for classifying face candidates. Then, each scaled-down image is transformed into an LBP image. Since the LBP is invariant to monotonic grayscale transformations, it is less sensitive to illumination changes than other techniques. At a specific pixel position (x_c, y_c), the LBP is defined as an ordered set of binary comparisons of pixel intensities between the center pixel and its eight surrounding pixels. The decimal form of the resultant 8-bit LBP code can be expressed as:

LBP(x_c, y_c) = Σ_{i=0}^{7} 2^i f(v_i − v_c)    (1)

where v_c is the gray value of the center pixel (x_c, y_c), v_i is the gray value of the i-th surrounding pixel, and f(x) is a function defined as:

f(x) = 1 if x ≥ 0, and 0 if x < 0    (2)

In order to detect faces, a conventional adaboost face detection algorithm scans all possible analysis windows. The face detector analyzes image patches of a pre-defined size to identify candidate face regions. A set of weak classifiers trained on LBP-transformed images is used to build a strong classifier. A strong classifier is a linear combination of weak classifiers and classifies each window as either a face or a non-face. If the confidence value of a window is greater than a specified threshold, the next cascade stage executes; otherwise, the candidate is identified as a non-face region. For each scanning window centered at (x, y), the confidence value is calculated as:

H_i = Σ_{p∈S_i} h(LBP(p))    (3)

where i represents the i-th cascade, p represents the p-th feature location, and S_i is the set of feature locations. When the scanning window passes all cascades, the window is selected as a face. To find faces of various sizes, this procedure is repeatedly applied to all scaled-down images. With the selected faces, FCM-based post-processing is applied to remove overlapped or falsely accepted windows. The FCM is constructed from four parameters: S_max(x, y), W_max(x, y), H_max(x, y), and C(x, y). S_max(x, y) is the maximum confidence value among the windows of the pyramid images with the same center location (x, y). W_max(x, y) and H_max(x, y) are the width and height, respectively, of the detected face window which contains the maximum confidence value, and C(x, y) is the cumulative confidence value over all pyramid images. Since the size of a face is consistent between adjacent down-scaled images, we can reduce the FAR by observing both S_max(x, y) and C(x, y). For each value above the threshold T_S in S_max(x, y), the location (x, y) is determined to be the center of a face when C(x, y) is above the threshold T_C. If one of the conditions is not satisfied, the candidate is not classified as a face region [10].
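The LBP transform and the per-stage confidence of Eqs. (1)-(3) can be sketched directly in software. The neighbour ordering (counter-clockwise starting from the right) and the function names are illustrative choices; the paper does not fix the starting neighbour:

```python
def lbp(img, xc, yc):
    """8-neighbour LBP code of Eq. (1): compare the centre pixel with its
    eight neighbours and pack the comparison results into 8 bits."""
    # neighbour offsets, counter-clockwise starting from the right (assumed order)
    offs = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]
    vc = img[yc][xc]
    code = 0
    for i, (dx, dy) in enumerate(offs):
        if img[yc + dy][xc + dx] >= vc:     # f(v_i - v_c) of Eq. (2)
            code |= 1 << i                  # 2^i term of Eq. (1)
    return code

def cascade_confidence(lbp_img, S_i, h):
    """Stage confidence H_i of Eq. (3): sum the LBP-indexed confidence
    h(LBP(p)) over the stage's feature locations S_i."""
    return sum(h[lbp_img[y][x]] for (x, y) in S_i)
```

In the hardware, `h` corresponds to the per-stage feature look-up tables, so each term of the sum in Eq. (3) is a single LUT read.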

After the image is transformed into an LBP image, each pixel is replaced by the specific pattern that describes its surrounding pixels.

III. PROPOSED HARDWARE ARCHITECTURE

Figure 2 shows the overall hardware architecture of the proposed system, which consists of four sub-modules: the pyramid scaling module, LBP transform module, parallel classifier evaluation module, and FCM-based post-processing module. The pyramid scaling module receives the incoming image from the camera and generates pyramid images, with control signals corresponding to the scale level, to handle faces larger than the size of the processing window. After pyramid scaling, each pixel in the scaled images is transformed to an LBP as previously described. Each LBP image is transferred to the parallel classifier module for face detection. For each cascade, the sum of the confidence values is calculated and compared to a specific threshold in order to determine whether the current window is a face or not. If the current window passes all cascade stages, the final decisions for the face candidates are made by the post-processing unit, which uses the sum of confidence values generated by the parallel classifier evaluation module. The advantages and details of the proposed architecture are described below.

Figure 1. Block diagram of the FCM-based face detection algorithm.

62

Authorized licensed use limited to: Sungkyunkwan University. Downloaded on August 20,2009 at 01:55:17 EDT from IEEE Xplore. Restrictions apply.

Figure 2. Overall hardware architecture of the proposed system.

A. Window-based Pyramid Scaling Module

The size of faces in the captured image varies with the distance from the camera to the face. Prior to face detection, scaling-down of the input image, called pyramid scaling, is required to detect the different sizes of faces in the scene [1]. Equation (4) shows the pyramid scaling equation, where (x', y') are the coordinates of the output pixel and (S_x, S_y) are the scaling factors. Since the scaling factors are not integers, the value of each output pixel is calculated in the reverse direction:

[ x ]   [ S_x   0  ] [ x' ]
[ y ] = [  0   S_y ] [ y' ]    (4)

Since an arbitrary pixel of the captured image can be fetched during reverse mapping, conventional systems store the entire image in a video buffer for random access. Pyramid scaling is then performed by reading the image back from the buffer. Because the amount of data to be read is not the same as the amount written, the read clock is generally faster than the write clock in pyramid scaling. This temporal asymmetry increases the input/output latency and limits the maximum frequency of the system. As a result, the overall throughput of the system is limited. Since we use a total of twenty successively downscaled images, we divide the entire image into a 20×20 grid of small patches. Table 1 shows a pre-defined mapping table for pyramid scaling. Each value in the table indicates the pixel location within a 2×2 pixel window at a given coordinate and pyramid level. Valid flags along the x- and y-axes are generated based on the mapping table, as in Figure 3. As shown in Table 1 and Figure 3, the valid flags are generated symmetrically to share the feature LUTs between scaled-down images. The design advantage of window-based pyramid scaling is that all the scaled-down images and their corresponding pixel/line valid flags are derived simultaneously with low latency. As a result, the temporal asymmetry between image acquisition and pyramid scaling can be eliminated. Moreover, the buffer read clock signal can be synchronized with the write clock signal. Consequently, the maximum throughput of the system is not limited by the pyramid scaling clock frequency.
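The reverse mapping of Eq. (4) can be modelled in a few lines of software. Nearest-neighbour sampling is a simplifying assumption here; the hardware instead resolves each output pixel within a 2×2 window via the mapping table:

```python
def downscale(img, sx, sy):
    """Reverse-mapping downscale of Eq. (4): each output pixel (x', y')
    fetches the source pixel at (sx * x', sy * y').
    Nearest-neighbour sketch, not the 2x2-window hardware scheme."""
    h, w = len(img), len(img[0])
    out_h, out_w = int(h / sy), int(w / sx)
    return [[img[int(y * sy)][int(x * sx)] for x in range(out_w)]
            for y in range(out_h)]
```

Iterating over output coordinates and reading backwards is what forces conventional designs to buffer the whole frame for random access, which is exactly the asymmetry the window-based scheme removes.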


To access window pixels in parallel, we employed a DPRAM (dual-port RAM) based window processing module which consists of scan-line buffers and window buffers. The scan-line buffer contains a scan-line of the input image. The window buffer is a set of shift registers which can contain the pixels of a certain area. Since the window buffer consists of registers, it guarantees instant access to its elements. Assume the window size is M×N in the horizontal and vertical dimensions, respectively. To build the appropriate scan-line buffer, a total of (N−1) line buffers, each the length of the image width, are required. We use DPRAMs as line buffers instead of sets of shift registers to reduce the configurable logic cost. For simultaneous access to window pixels, only a small number of shift registers are used as the window buffer. The scan-line buffer caches a specific number of pixels and generates a pixel vector. The generated pixel vector is transferred to the right-most column of the window buffer and is shifted from right to left at each pixel clock cycle. As a result, all M×N pixels in the window buffer can be accessed simultaneously.

B. LBP Transform Module

To achieve robustness to illumination changes while detecting faces, LBP transforms are applied to each scaled-down image as described in Section II. Similar to pyramid scaling, LBP transforms are performed based on the current pixel and its neighbors. Therefore, a 3×3 window processing module which consists of window and line buffers is used again. After a latency of two lines and three pixels, the first 3×3 pixels of the image are assigned to the shift registers in the window. Then, the intensity of the center pixel is compared with its neighboring pixels, and eight bits of binary code are generated as a result of the comparison. By concatenating the eight binary values in counter-clockwise order, the LBP value corresponding to the respective 3×3 neighborhood is obtained. The resulting LBP is then delivered to the parallel classifier module to evaluate the current window.

C. Parallel Classifier Module

As described in Section II, face detection is carried out by scanning all possible candidate windows. If a window passes all stages of the cascade, the candidate window is confirmed to be a face region. The refined versions of the weak and strong classifiers constructed by [10] are used to classify face and non-face patterns, as shown in Figure 4. Since each classifier consists of a set of feature locations and the confidence values corresponding to the LBP, we designed the classifiers using LUTs. The training face data constructed by [10] is based on the Postech database [23]. The database contains face images with a large number of variations in illumination, pose, and facial expression. The training data has a resolution of 20×20, which is the processing window size. At each stage, the feature positions for classification, confidence values, and threshold values were chosen as described in Section II. Five cascade stages are used, and each stage has an optimized number of allowed positions: 8, 36, 80, 150, and 250, respectively. In the software implementation, the classifiers are cascaded to increase detection speed. In particular, the early stages of the cascade have a smaller number of features, while the late stages have a larger number. Because all classification is performed sequentially through the entire pyramid, unnecessary computations can be avoided by excluding non-face windows in the early stages of the cascade. Even more computational advantage can be expected considering the number of classifications caused by pyramid scaling. In the proposed hardware implementation, however, all the cascade stages are pipelined and operated in parallel.

Figure 3. Generated pixel/line valid flags.
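The DPRAM scan-line buffers and the M×N register window of the window processing module can be modelled in software. The class below is an illustrative sketch of the mechanism (one `push` per pixel clock), not the VHDL implementation; fixed-size deques stand in for the DPRAM line buffers:

```python
from collections import deque

class WindowBuffer:
    """Software model of (N-1) scan-line buffers feeding an MxN register
    window: each pushed pixel shifts one new column into the window, so
    all M*N window pixels are readable simultaneously."""
    def __init__(self, width, m, n):
        self.m, self.n = m, n
        # (N-1) line buffers, each one image-line long, model the DPRAMs
        self.lines = [deque([0] * width, maxlen=width) for _ in range(n - 1)]
        # the register window itself: N rows of M pixels,
        # row 0 holds the current (newest) image line
        self.window = [deque([0] * m, maxlen=m) for _ in range(n)]

    def push(self, pixel):
        """Feed one pixel in raster order; returns the current MxN window."""
        col = [pixel]
        for buf in self.lines:
            col.append(buf[0])      # tap the pixel delayed by one more line
            buf.append(col[-2])     # cascade the line buffers
        for row, v in zip(self.window, col):
            row.append(v)           # shift the new column into the window
        return [list(r) for r in self.window]
```

After feeding three full 4-pixel lines, the window holds one column from each of the three most recent lines, mirroring how the hardware exposes the whole neighbourhood at every pixel clock.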

TABLE I. MAPPING REFERENCE TABLE FOR PYRAMID SCALING (col.: x/y coordinate in 20×20 patches, row: pyramid level, value: resulting pixel location in 2×2 window)



Figure 5. FPGA-based realization of the proposed face detection system.

TABLE II. DEVICE UTILIZATION / TIMING SUMMARY

Slice Logic Utilization            Used      Avail.
Number of Slice Registers          75,766    207,360
Number of Slice LUTs               135,041   207,360
Slice Logic Distribution
Number with an unused Flip Flop    37,980    172,827
Number with an unused LUT          37,786    172,827
Specific Feature Utilization
Number of BRAM/FIFO                285       288
Number of BUFG                     2         32

Min. period: 6.341 ns (Max. frequency: 157.704 MHz)
Minimum input arrival time before clock: 4.741 ns
Maximum output required time after clock: 3.259 ns

Figure 4. Parallel classifier cascade and Feature Evaluation.

The feature evaluation result is generated at each pixel clock after a fixed pipeline latency. If the candidate window passes all the stages of the cascade, it is classified as a valid face region.

D. Post-processing with FCM

In the post-processing stage, the overlapped and falsely accepted faces are removed using the overlapping characteristic of faces in adaboost. As shown in the previous sections, different scales of faces can be detected at the same center location for real face regions. Faces with slightly different locations and the same scale can be detected as well. Therefore, we can determine the regions where multiple detected face windows overlap to be face regions. Regions with no overlapping face windows are identified as falsely accepted regions. Because the width and height of the maximum-confidence window for each pyramid image are calculated using the result of the classifier cascade and the corresponding scaling factor, the FCM can be easily constructed. The detected face window with the highest confidence value is selected as the resulting detected face.
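The FCM decision rule of Section II (threshold T_S on the maximum confidence S_max and T_C on the cumulative confidence C) can be sketched as follows. The detection-tuple layout and the exact-centre grouping are illustrative assumptions; the hardware accumulates these values per pixel location and would also merge nearby centres:

```python
def fcm_postprocess(detections, TS, TC):
    """FCM-style post-processing sketch: per window centre, track the
    maximum stage confidence (S_max) and the cumulative confidence over
    all pyramid levels (C); keep a centre only if both exceed their
    thresholds.  detections: (cx, cy, w, h, confidence) tuples."""
    smax, cum, size = {}, {}, {}
    for (cx, cy, w, h, conf) in detections:
        key = (cx, cy)
        cum[key] = cum.get(key, 0.0) + conf          # C(x, y)
        if conf > smax.get(key, float("-inf")):
            smax[key] = conf                         # S_max(x, y)
            size[key] = (w, h)                       # W_max, H_max
    return [(x, y, *size[(x, y)]) for (x, y) in smax
            if smax[(x, y)] > TS and cum[(x, y)] > TC]
```

An isolated single-level detection accumulates little cumulative confidence and is rejected, which is how the overlap characteristic suppresses false acceptances.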


IV. IMPLEMENTATION / EXPERIMENTAL RESULTS

The proposed real-time face detection system is designed using the VHSIC hardware description language (VHDL) and implemented on a Virtex-5 LX330 FPGA from Xilinx. Figure 5 and Table 2 show the implemented face detection system and the device utilization / timing summary reported by the Xilinx synthesis tool, respectively. The number of used slices and the maximum allowed frequency of the proposed system are 135,954 (about 64% of the device) and 123.396 MHz, as shown in Table 2. To verify the performance of the proposed system, two kinds of cameras with different frame rates were interfaced. The VCC-8350CL Camera Link camera captures standard VGA (640×480×3) images at 60 fps with a 24.576 MHz pixel clock. The MT9M112 CMOS camera captures standard VGA images at 30 fps with a 12.288 MHz pixel clock. In our experiments, both cameras with their designated pixel clocks perform face detection without any problem. Because the frame rate increases linearly with the pixel clock, we can expect around 300 fps processing of standard VGA images at the reported maximum frequency of the system. Even more improvement can be expected considering the underestimating character of the synthesis tool [24]. Since the maximum frame rate of the cameras available for this research is limited to 60 fps, the performance of the proposed system is verified via timing simulation. The Mentor Graphics ModelSim 6.1f simulation environment, with test vectors describing the actual behavior of the camera, is used for the simulation. The corresponding software implementation was evaluated for performance comparison. With 640×480 images, face detection ran at 30 fps on a conventional 2.4 GHz Intel Core 2 Duo system with 2.0 GB of working memory. Figure 6 shows the face detection results of the proposed system. The results were processed and obtained in real-time using the implemented system with the 60 fps VCC-8350CL camera. The experimental results show 93% detection accuracy, the same as the result reported in [10].
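The claimed scaling of frame rate with pixel clock can be sanity-checked. The total frame timing of 800×512 clocks per frame (active pixels plus blanking) is an assumption, chosen because it exactly reproduces the camera's 60 fps at a 24.576 MHz pixel clock:

```python
def fps(pixel_clock_hz, clocks_per_frame=800 * 512):
    """Frame rate of a fully pipelined, one-pixel-per-clock design.
    clocks_per_frame (active pixels plus blanking) is an assumed figure
    matching the VCC-8350CL numbers quoted in the text."""
    return pixel_clock_hz / clocks_per_frame

print(fps(24.576e6))    # camera pixel clock -> 60.0 fps
print(fps(123.396e6))   # reported maximum frequency -> about 301 fps
```

Running the design at its reported 123.396 MHz maximum under this timing assumption yields roughly 301 fps, consistent with the "around 300 fps" figure above.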


Figure 6.a. Face detection results of the proposed system in various scenes (Indoor / Google / News).

(a) “Dancing” at frame t. (b) “Dancing” at frame t+Δ. (c) “Dancing” at frame t+2Δ. (d) “Dancing” at frame t+3Δ.

Figure 6.b. Face detection on a video clip. The proposed system detects multiple faces in real-time, even when the scene changes very rapidly.

V. CONCLUSIONS

In this paper, we proposed a dedicated hardware architecture for an FPGA-based face detection system using a face certainty map. All the procedures required for detecting faces are integrated within a single FPGA, including pyramid scaling, the LBP transform, the multi-stage classifier cascade, and FCM-based post-processing. The real-time performance of the proposed system was evaluated to assess its applicability. As shown in the experiments, the frame rate of the proposed system can be flexibly increased or decreased in proportion to the frame rate of the camera by using the pixel clock as the system clock. As future work, we plan to extend the applicability of the proposed system to detect more facial features, such as eyes and lips. Our implementation can also serve as an intelligent sensor for higher-level vision applications such as intelligent robots, surveillance, automotive systems, and human-computer interfaces. Additional applications for the proposed face detection system will be evaluated and explored.


REFERENCES

[1] Y. Dai and Y. Nakano, "Face-texture Model based on SGLD and its Application in Face Detection in a Color Scene," Pattern Recognition, vol. 29, pp. 1007-1017, 1996.
[2] H. Rowley, S. Baluja, and T. Kanade, "Neural Network-based Face Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 23-38, 1998.
[3] H. Schneiderman and T. Kanade, "A Statistical Method for 3D Object Detection Applied to Faces and Cars," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 746-751, 2000.
[4] K. Sung and T. Poggio, "Example-based Learning for View-based Human Face Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 39-51, 1998.
[5] J. Yang and A. Waibel, "A Real-time Face Tracker," in Proceedings of the IEEE Workshop on Applications of Computer Vision (WACV), pp. 142-147, 1996.
[6] B. Froba and A. Ernst, "Face Detection with the Modified Census Transform," in Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004.
[7] A. Mohan, C. Papageorgiou, and T. Poggio, "Example-Based Object Detection in Images by Components," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 349-361, 2001.
[8] E. Osuna, R. Freund, and F. Girosi, Support Vector Machines: Training and Applications, A.I. Memo, MIT A.I. Lab., 1997.
[9] P. Viola and M. Jones, "Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade," Advances in Neural Information Processing Systems, vol. 2, pp. 1311-1318, 2002.
[10] B. Jun and D. Kim, "Robust Real-Time Face Detection Using Face Certainty Map," Lecture Notes in Computer Science, vol. 4642, 2007.
[11] Y. Wei, X. Bing, and C. Chareonsak, "FPGA Implementation of Adaboost Algorithm for Detection of Face Biometrics," in IEEE International Workshop on Biomedical Circuits and Systems, 2004.
[12] H. Ngo, R. Tompkins, J. Foytik, and V. Asari, "An Area Efficient Modular Architecture for Real-time Detection of Multiple Faces in Video Stream," in 6th International Conference on Information, Communications & Signal Processing, pp. 1-5, 2007.
[13] T. Theocharides, G. Link, N. Vijaykrishnan, M. Irwin, and W. Wolf, "Embedded Hardware Face Detection," in Proceedings of the 17th International Conference on VLSI Design, 2004.
[14] M. Yang, Y. Wu, J. Crenshaw, B. Augustine, and R. Mareachen, "Face Detection for Automatic Exposure Control in Handheld Camera," in IEEE International Conference on Computer Vision Systems, 2006.
[15] D. Nguyen, D. Halupka, P. Aarabi, and A. Sheikholeslami, "Real-Time Face Detection and Lip Feature Extraction Using Field-Programmable Gate Arrays," IEEE Transactions on Systems, Man and Cybernetics, Part B, vol. 36, p. 902, 2006.
[16] E. Painkras and C. Charoensak, "A Framework for the Design and Implementation of a Dynamic Face Tracking System," in IEEE Region 10 Conference (TENCON), pp. 1-6, 2005.
[17] T. Theocharides, G. Link, N. Vijaykrishnan, M. Irwin, and W. Wolf, "Embedded Hardware Face Detection," in Proceedings of the 17th International Conference on VLSI Design, 2004.
[18] T. Ramdas, L. Ang, and G. Egan, "FPGA Implementation of an Integer MIPS Processor in Handel-C and its Application to Human Face Detection," in IEEE Region 10 Conference (TENCON), 2004.
[19] Y. Hori, K. Shimizu, Y. Nakamura, and T. Kuroda, "A Real-time Multi Face Detection Technique using Positive-Negative Lines-of-Face Template," in Proceedings of the 17th International Conference on Pattern Recognition, 2004.
[20] C. Gao and S. Lu, "Novel FPGA based Haar Classifier Face Detection Algorithm Acceleration," in International Conference on Field Programmable Logic and Applications, pp. 373-378, 2008.
[21] H. Lai, M. Savvides, and T. Chen, "Proposed FPGA Hardware Architecture for High Frame Rate (> 100 fps) Face Detection Using Feature Cascade Classifiers," in Proceedings of the IEEE International Conference on Biometrics: Theory, Applications, and Systems, pp. 1-6, 2007.
[22] K. Irick, M. DeBole, V. Narayanan, R. Sharma, H. Moon, and S. Mummareddy, "A Unified Streaming Architecture for Real Time Face Detection and Gender Classification," in International Conference on Field Programmable Logic and Applications, pp. 267-272, 2007.
[23] H. Kim, J. Sung, H. Je, S. Kim, B. Jun, D. Kim, and S. Bang, Asian Face Image Database PF01, Intelligent Multimedia Lab, Department of CSE, Postech, 2001.
[24] J. Diaz, E. Ros, F. Pelayo, E. M. Ortigosa, and S. Mota, "FPGA-based Real-time Optical-flow System," IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, pp. 274-279, Feb 2006.
