History
=========

.. note:: Historical context for key developments in scientific computing and materials science. These notes track the evolution of tools and techniques that enable modern research.
GPU revolution
---------------

NVIDIA is led by co-founder and CEO Jensen Huang, but many individuals made pioneering contributions to the development of GPU computing and its application to scientific research and AI. Here is my attempt to organize the sources and to understand where tomorrow's GPUs are headed and what impact they will have.

GPU software and hardware evolution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Who were the pioneers of the GPU? The idea that graphics hardware could perform general computation emerged in the early 2000s.

- Hanrahan (PI, 2019 Turing Award) inspired the concept of stream processing on graphics hardware and mentored the first GPU computing researchers at Stanford in the early 2000s.
- Buck created *Brook for GPUs* :cite:`buck2004brook` and proved that programmable graphics pipelines could execute general algorithms.
- Nickolls co-designed the CUDA programming model (introduced in 2006) and led NVIDIA's GPU Computing group, which defined grids, blocks, and threads.
- Kirk (Chief Scientist, 1997–2009) guided NVIDIA's transition from fixed-function graphics to programmable architectures through the GeForce FX and G80 generations (2002–2006).
- Harris demonstrated practical GPU scientific computing and evangelized GPGPU through early publications (2004–2006).
- Houston extended Brook to scientific applications, bridging graphics and high-performance computing.

Who pioneered the dissemination of GPU computing beyond computer graphics? The formalization and mass education of GPU computing, paired with architectural redesign for precision and scalability, enabled broader adoption.

- Hwu built the academic and educational ecosystem for CUDA after 2007 :cite:`hwu2008cuda`, training thousands of scientists worldwide in GPU programming.
- Owens codified GPGPU as a research field :cite:`owens2007gpgpu` and developed foundational parallel algorithm primitives in the mid-2000s.
- Dally architected scalable, efficient GPU designs and introduced tensor-friendly compute structures, shaping the Tesla and Fermi families (2007–2009).
- Lindholm engineered NVIDIA's Tesla (2007) and Fermi (2009) cores, enabling double precision and general-purpose computation for scientific workloads.

Who pioneered GPU acceleration for AI and HPC? Modern AI and scientific computing needed systems that could handle massive parallel workloads efficiently, requiring both specialized hardware and accessible software frameworks.

- Catanzaro connected CUDA with neural network research, developing early GPU deep learning frameworks and contributing to *Caffe* :cite:`jia2014caffe` and *cuDNN* :cite:`chetlur2014cudnn`.
- Diamos built GPU compiler and runtime systems that bridged academic prototypes with NVIDIA's production software stack around the Kepler and Maxwell generations (2012–2015).
- Volkov optimized linear algebra kernels, proving GPUs could surpass CPUs in scientific performance, notably through early cuBLAS and MAGMA optimizations (2008–2012).
- Nickolls evolved CUDA's parallel model for large-scale data center GPUs, shaping multi-GPU communication standards that culminated in NVLink (2016).
- Dally architected the Tensor Core designs that unified HPC and AI computing through efficient parallel processing pipelines, debuting in the Volta architecture (2017).
- Seibert led developer tools and performance-analysis frameworks such as Nsight and the CUDA profiler (2009–2020), making large-scale GPU programming practical for scientists and engineers.
- Micikevicius advanced mixed-precision training and Tensor Core utilization (2017–2023), powering the modern deep learning acceleration era across the Volta, Ampere, and Hopper generations.

References
^^^^^^^^^^

.. bibliography::
   :filter: docname in docnames