Motivation
The size of data sets has increased tremendously in recent years, as faster
processing and communication have greatly improved the capability for data
generation and collection in areas such as scientific experimentation,
business and government transactions, and the Internet. Because of the size
and high dimensionality of the available data, databases on the order of
gigabytes or terabytes are now common. At this scale, sequential processing
either cannot run in-core or takes a prohibitive amount of time. Parallel
computing is therefore essential to speed up the computation.
Many parallel computing platforms have been developed in the past couple of
decades, including shared memory parallel machines, distributed parallel
machines, and clusters. They offer more computation capability, more memory
and storage than single-processor systems. The challenge is to develop
parallel programs for applications to achieve efficiency and performance
goals.
A GPU (Graphics Processing Unit) is a dedicated graphics rendering device for
a personal computer, workstation, or game console. Modern GPUs are very
efficient at manipulating and displaying computer graphics, and their highly
parallel structure makes them more effective than general-purpose CPUs for a
range of complex algorithms. Many computationally intensive applications
have been shown to run dramatically faster on GPUs than on CPUs. The massive
parallelism and low cost of GPUs make this a revolution in parallel
processing.
NVIDIA's Compute Unified Device Architecture (CUDA) is a general-purpose
scalable parallel programming model for writing highly parallel applications
on GPUs. It provides several key abstractions – a hierarchy of thread
blocks, shared memory, and barrier synchronization. This model has proven
quite successful at programming multi-threaded many-core GPUs and scales
transparently to hundreds of cores. CUDA is steadily winning customers in
scientific and engineering fields.
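The three abstractions named above can be seen together in one small kernel. The following is a minimal sketch, not course material: the names `block_sum`, `gpu_sum`, `CUDA_CHECK`, and the block size of 256 are assumptions chosen for illustration. Each thread block loads a slice of the input into shared memory, waits at a barrier, and then reduces the slice to a single partial sum.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK 256  // illustrative block size (assumption); must be a power of two

// Hypothetical helper macro: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                    \
    do {                                                                    \
        cudaError_t err_ = (call);                                          \
        if (err_ != cudaSuccess) {                                          \
            std::fprintf(stderr, "CUDA error: %s\n",                        \
                         cudaGetErrorString(err_));                         \
            std::exit(EXIT_FAILURE);                                        \
        }                                                                   \
    } while (0)

// Each thread block sums its own slice of `in` and writes one partial sum.
// Must be launched with exactly THREADS_PER_BLOCK threads per block.
__global__ void block_sum(const float *in, float *partial, int n)
{
    __shared__ float tile[THREADS_PER_BLOCK];   // shared by the threads of one block

    int tid = threadIdx.x;                      // index within the block
    int gid = blockIdx.x * blockDim.x + tid;    // index within the whole grid

    tile[tid] = (gid < n) ? in[gid] : 0.0f;     // stage input in shared memory
    __syncthreads();                            // barrier: tile is fully loaded

    // Tree reduction: halve the number of active threads at each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            tile[tid] += tile[tid + stride];
        __syncthreads();                        // barrier between reduction steps
    }

    if (tid == 0)
        partial[blockIdx.x] = tile[0];          // one result per block
}

// Hypothetical host wrapper: sums n floats, finishing the per-block partial sums on the CPU.
float gpu_sum(const float *host_in, int n)
{
    if (n <= 0) return 0.0f;
    int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;

    float *d_in = nullptr, *d_partial = nullptr;
    CUDA_CHECK(cudaMalloc((void **)&d_in, n * sizeof(float)));
    CUDA_CHECK(cudaMalloc((void **)&d_partial, blocks * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d_in, host_in, n * sizeof(float),
                          cudaMemcpyHostToDevice));

    block_sum<<<blocks, THREADS_PER_BLOCK>>>(d_in, d_partial, n);
    CUDA_CHECK(cudaGetLastError());             // catch launch errors

    std::vector<float> partial(blocks);
    CUDA_CHECK(cudaMemcpy(partial.data(), d_partial, blocks * sizeof(float),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_partial));

    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}
```

The sketch defines no `main`; calling `gpu_sum` from a host program and compiling with `nvcc` is enough to try it. Summing the per-block results on the CPU keeps the example short, since a barrier such as `__syncthreads()` only synchronizes threads within a single block.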
Course Objectives
This course presents an introduction to an emerging paradigm: GPU computing
with CUDA. Its objective is to give students knowledge of, and hands-on
experience in, developing multi-threaded code for GPUs using CUDA. We present
parallel programming principles, parallelism models, communication models,
synchronization mechanisms, and toolkits, as well as the resource limitations
of GPUs. Existing examples and application areas are also presented.
Spring 2009
Syllabus
| Lecture | Material |
| Lecture 1 - Introduction | Slides (pdf) |
| Lecture 2 - Parallel Computing | Slides (pdf) |
| Lecture 3 - CUDA Programming Model | Slides (pdf) |
| Lecture 4 - CUDA Memory | Slides (pdf) |
| Lecture 5 - CUDA Threads | Slides (pdf) |
| Lecture 6 - Performance Optimization | Slides (pdf) |
| Lecture 7 - Case Study: Typical Examples | Slides (pdf) |
| Lecture 8 - Case Study: Association Rules Mining | Slides (pdf) |
| Lecture 9 - Case Study: Clustering in Cosmological Simulation | Slides (pdf) |
| Lecture 10 - Final Project Presentation & Conclusion | Slides (pdf) |
Prerequisites
• C programming
• Operating Systems
• Algorithms and Data Structures
References
• Matt Pharr (ed.), GPU Gems 2: Programming Techniques for High-Performance
Graphics and General-Purpose Computation, Addison Wesley.
• http://www.nvidia.com/object/cuda_home.html
• Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar, Introduction to
Parallel Computing (2nd Edition), Addison Wesley, 2003.
• Barry Wilkinson, Michael Allen, Parallel Programming (2nd Edition), Pearson
Education (Prentice Hall).
• Michael J. Quinn, Parallel Programming in C with MPI and OpenMP,
McGraw-Hill Science, 2004.
• http://www.mpi-forum.org/
• http://www.openmp.org/
Contact Information
Instructor: Ying Liu, yingliu@gucas.ac.cn
Teaching Assistants: Liheng Jian, Peng Zhang, Shenshen Liang
Lecture Hours
Lecture: Thu 7:00-9:30 PM
Room: S104
Lab and Assignments
We will use NVIDIA processors and the CUDA™ programming tools in the lab
section of this course. The programming assignments will require
progressively more sophisticated programming skills. The topic of the final
project is open, but it must involve a computationally intensive application,
in an area such as mathematics, image processing, or data mining, followed by
some form of display of the results.
Grading Policy
Lab assignments: 60%
Final Project: 40%
Fall 2009
Syllabus
| Lecture | Material |
| Lecture 1 - Introduction | Slides (pdf) |
| Lecture 2 - Parallel Computing | |
| Lecture 3 - CUDA Programming Model | |
| Lecture 4 - CUDA Memory | |
| Lecture 5 - CUDA Threads | |
| Lecture 6 - Performance Optimization | |
| Lecture 7 - Case Study: Typical Examples | |
| Lecture 8 - Case Study: Association Rules Mining | |
| Lecture 9 - Case Study: Clustering in Cosmological Simulation | |
| Lecture 10 - Final Project Presentation & Conclusion | |
Prerequisites
• C programming
• Operating Systems
• Algorithms and Data Structures
References
• Matt Pharr (ed.), GPU Gems 2: Programming Techniques for High-Performance
Graphics and General-Purpose Computation, Addison Wesley.
• http://www.nvidia.com/object/cuda_home.html
• Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar, Introduction to
Parallel Computing (2nd Edition), Addison Wesley, 2003.
• Barry Wilkinson, Michael Allen, Parallel Programming (2nd Edition), Pearson
Education (Prentice Hall).
• Michael J. Quinn, Parallel Programming in C with MPI and OpenMP,
McGraw-Hill Science, 2004.
• http://www.mpi-forum.org/
• http://www.openmp.org/
Contact Information
Instructor: Ying Liu, yingliu@gucas.ac.cn
Teaching Assistants: Sheng Xiao, Shenshen Liang
Lecture Hours
Lecture: Mon 7:00-9:30 PM
Room: S304
Lab and Assignments
We will use NVIDIA processors and the CUDA™ programming tools in the lab
section of this course. The programming assignments will require
progressively more sophisticated programming skills. The topic of the final
project is open, but it must involve a computationally intensive application,
in an area such as mathematics, image processing, or data mining, followed by
some form of display of the results.
Grading Policy
Lab assignments: 60%
Final Project: 40%