1.
Name of Course/Module
High Performance Bioinformatics Computing
2.
Course Code
HPB3029
3.
Status of Subject
Major for B. Sc Bioinformatics
4.
MQF Level/Stage
Bachelor Degree – MQF Level 6
5.
Version (state the date of the last Senate approval)
June 2012
6.
Requirement for Registration
ECP2056 OR TCE2311 – Data Communications and Networks
7.
Name(s) of academic/teaching staff
G. Mohamed Hanifa, H L Seldon, Ali Afzalian Mand
8.
Semester and Year offered
Trimester 2 (Delta level)
9.
Objective of the course/module in the programme :
1. To introduce the students to the implementation, management and maintenance of HPC (High Performance Computing) clusters.
2. To introduce the students to the concepts of parallelizing and distributing compute-intensive algorithms.
3. To enable the students to run distributed algorithms on HPC clusters.
4. To introduce the theories of grid computing and the basic infrastructure and infostructure for grid computing.
5. To introduce the theory and practice of grid computing in bioinformatics.
10.
Learning Outcomes : At the completion of the subject, students should be able to:
LO1: Define the basic concept of high performance computing in bioinformatics. (Cognitive, Level 1)
LO2: Design an algorithm for distributed systems on local and remote HPC clusters. (Cognitive, Level 5)
LO3: Adapt the distributed algorithms on local and remote HPC clusters. (Psychomotor, Level 6)
LO4: Apply grid computing architecture and service infrastructure. (Cognitive, Level 3)
11.
Synopsis: Introduction; Computing clusters; Design and setting up of a Beowulf cluster; Supercomputers and computer clusters; Genetic algorithms and distributed algorithms; Complexity of biological data; Metagenomics and metagenome analysis using HPC; BLAST and distributed BLAST; Evolutionary bioinformatics and phylogenetic studies with HPC; Large-scale proteomics: protein structure prediction, structural similarity modelling and distributed computing; Grid computing architecture; Open grid services architecture; Open grid services infrastructure; Grid computing toolkits for bioinformatics applications; Grid computing for bioinformatics.
12.
Mapping of Subject to Programme Outcomes :
Programme Outcomes (% of Contribution)
PO1: Apply soft skills in work and career related activities (20.00)
PO7: Demonstrate knowledge and understanding of essential facts, concepts, principles, and theories relating to bioinformatics (30.00)
PO8: Apply principles and knowledge of bioinformatics in relevant areas (30.00)
PO9: Demonstrate the ability in analysing, modelling, designing, developing and evaluating computing solutions (20.00)
13.
Assessment Methods and Types :
Method and Type (Percentage)
Test (20.00%)
Tutorial/Laboratory (10.00%)
Assignment Report & Presentation (30.00%)
Final Exam (40.00%)
14. Details of Subject Topics
Mode of Delivery (hours): Lecture, Tutorial
1. Introduction: HPC introduction; bioinformatics overview; role of HPC in the bioinformatics field; HPC concepts; parallelization techniques.
Lecture: 2
2. Parallel Computing: definitions of parallel and serial computing; need for parallel computing; seeking concurrency; data clustering; computational speed.
Lecture: 2, Tutorial: 2
3. Parallel computers and architectures: types of parallel computers (SISD, SIMD, MISD and MIMD); multiprocessors; centralized multiprocessors; cache coherence problem in centralized multiprocessors; distributed multiprocessors; cache coherence problem in distributed multiprocessors; directory-based protocol; multi-computers; asymmetric multi-computers; symmetrical multi-computers; practical parallel computers; interconnection networks.
Lecture: 4, Tutorial: 2
4. Parallel (MIMD) algorithm design: task/channel model of algorithm design; Foster's design methodology (partitioning, communication, agglomeration, mapping); case studies.
Lecture: 4
5. Introduction to Message Passing Interface: parallel programming – matrix-vector multiplication; parallel programming – matrix-matrix multiplication; case study: parallelizing Smith-Waterman with MPI; performance analysis.
Lecture: 4
6. Grid Computing Overview: history and evolution of grid computing; the applications and environment; the requirements for a grid computing environment.
Lecture: 2, Tutorial: 2
7. Grid Computing Architecture: grid service architecture (components and organization); Service Oriented Architecture (SOA) and its respective implementations; web service architecture with XML, SOAP and WSDL; Global XML Architecture (GXA) and emerging grid computing standards.
Lecture: 2, Tutorial: 2
8. Open Grid Services Architecture: introduction to Open Grid Services Architecture; Global Grid Forum initiative; case study on industry applications.
Lecture: 2
9. Grid Computing Toolkits for Bioinformatics Applications: introduction to available grid computing toolkits; design of bioinformatics applications; development of grid computing modules for bioinformatics applications.
Lecture: 2
Total
28
2
12
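Topic 5 above lists matrix-vector multiplication as a parallel programming exercise. The following is a minimal sketch of the row-wise block decomposition commonly used for it, not the course's own material. It assumes Python threads as a stand-in for MPI ranks, and the function names (`split_rows`, `local_matvec`, `parallel_matvec`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(n_rows, n_workers):
    """Return (start, stop) row ranges, one per worker, as even as possible."""
    base, extra = divmod(n_rows, n_workers)
    ranges, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def local_matvec(rows, x):
    """Multiply one block of matrix rows by the full vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def parallel_matvec(A, x, n_workers=2):
    """Row-block matrix-vector product; threads stand in for MPI ranks."""
    # Partitioning: each worker receives a contiguous block of rows.
    blocks = [A[s:e] for s, e in split_rows(len(A), n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(local_matvec, blocks, [x] * len(blocks)))
    # Concatenating the partial results plays the role of an MPI gather.
    return [y for part in partials for y in part]


if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 6]]
    x = [1, 1]
    print(parallel_matvec(A, x))  # [3, 7, 11]
```

In an MPI program the same three steps would map onto a scatter of row blocks, a local product on each rank, and a gather of the partial results.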
Tutorials
Introduction to HPC
Distributing and rendering the Smith-Waterman algorithm on a remote HPC cluster
Distributing and rendering ClustalW on a remote HPC cluster
Distributing and rendering ProtML (Protein based Maximum Likelihood) on a remote HPC cluster
Distributing and rendering ProtDist (Protein based Distance Matrix) on a remote HPC cluster
Distributing and rendering a genome sequence based PHYLIP package I
Distributing and rendering a genome sequence based PHYLIP package II
Distributing and rendering a Greedy algorithm for protein secondary structure prediction I
Distributing and rendering a Hill climbing algorithm for protein secondary structure prediction II
Introduction: how to set up a grid/cluster
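Several tutorials distribute the Smith-Waterman algorithm across a cluster. As an illustration only, the sketch below scores one query against a small set of sequences, with each comparison handled as an independent task. The scoring parameters (`match=2`, `mismatch=-1`, linear `gap=-1`), the function names, and the use of threads instead of cluster nodes are all assumptions, not taken from the course.

```python
from concurrent.futures import ThreadPoolExecutor


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def score_database(query, database, n_workers=2):
    """Score the query against every database sequence, one task per sequence."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lambda seq: smith_waterman_score(query, seq), database))


if __name__ == "__main__":
    print(score_database("ACGT", ["ACGT", "CG", "TTAC"]))
```

Because each query-versus-sequence comparison is independent, this database-level split is the simplest way to spread the work; parallelizing inside a single alignment matrix is a separate and harder design.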
Total Student Learning Time (SLT)
Face to Face (Hour)
Lecture: 28
Tutorials: 12
Laboratory/Practical: 10
Presentation: -
Assignment: -
Mid Term Test: 1
Final Exam: 2
Quizzes: -
Sub Total: 53
Total Guided and Independent Learning (Hour)
28, 12, 5, 10, 5, 20
Sub Total: 80
Total SLT: 133
Credit Value: 133/40 = 3.325; 3
Reading Materials :
Textbook
1. Parallel Computing for Bioinformatics and Computational Biology: Models, Enabling Technologies, and Case Studies (Wiley Series on Parallel and Distributed Computing). Albert Y. Zomaya (Editor). John Wiley. 2006.
2. Grid Computing. Joshy Joseph and Craig Fellenstein. Prentice Hall. 2004.
3. Grid Computing for Bioinformatics and Computational Biology. Albert Y. Zomaya, El-Ghazali Talbi. ISBN 9780471784098. Wiley-Interscience. 2007.
Reference Materials
1. A Networking Approach to Grid Computing. Daniel Minoli. Wiley. 2004.
2. Grid Computing: A Practical Guide to Technology and Applications (Programming Series). Ahmar Abbas. Charles River Media. 2004.
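The student learning time figures above can be checked with a few lines of arithmetic. This sketch assumes the divisor of 40 hours per credit implied by the stated "133/40", and that the credit value is rounded down to a whole number. The variable names are illustrative.

```python
# Face-to-face hours as listed in the SLT table.
face_to_face = {
    "Lecture": 28,
    "Tutorials": 12,
    "Laboratory/Practical": 10,
    "Mid Term Test": 1,
    "Final Exam": 2,
}
# Guided and independent learning hours, in the order given.
guided_independent = [28, 12, 5, 10, 5, 20]

HOURS_PER_CREDIT = 40  # assumption taken from the "133/40" in the table

face_sub_total = sum(face_to_face.values())    # 53
guided_sub_total = sum(guided_independent)     # 80
total_slt = face_sub_total + guided_sub_total  # 133
credit_exact = total_slt / HOURS_PER_CREDIT    # 3.325
credit_value = int(credit_exact)               # 3

print(face_sub_total, guided_sub_total, total_slt, credit_exact, credit_value)
```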
Appendix (to be compiled when submitting the complete syllabus for the programme) : 1. Mission and Vision of the University and Faculty 2. Mapping of Programme Objectives to Vision and Mission of Faculty and University 3. Mapping of Programme Outcome to Programme Objectives 4. Programme Objective and Outcomes (Measurement and Descriptions)