This book provides a hands-on, class-tested introduction to CUDA and GPU programming. It begins by introducing CPU programming and the concepts of Pthreads, thread programming, multi-tasking, and parallelism, and then interweaves those concepts into an introduction to GPU programming. Working with Nvidia's new platform, based on Amazon EC2 and Web GPU, the book adopts a standardized architecture while also exploring other architectures and their differences. It also covers GPU multi-threading and global memory, CUDA atomics, and the use of libraries on GPUs. Example applications in image processing, face detection, and tumor detection are included.