GRAPHICS PROCESSING UNIT (GPU)

Context

Recently, during the IndiaAI Impact Summit in New Delhi, the Government of India announced plans to triple the country’s sovereign GPU capacity to 100,000 units by the end of the year. This initiative, part of the ₹10,372-crore IndiaAI Mission, aims to provide subsidized high-performance computing to startups and researchers, reducing India’s dependence on global technology giants like Nvidia while fostering a domestic ecosystem for Large Language Models (LLMs) and deep learning.

1. Architectural Philosophy: Serial vs. Parallel

  • Central Processing Unit (CPU): A “General Purpose” processor that excels at Sequential (Serial) Processing. It contains a few powerful cores (typically 4 to 64) optimized for low-latency execution, complex logical branching, and system management.
  • Graphics Processing Unit (GPU): A “Specialized” processor designed for Parallel Processing. It houses thousands of smaller, simpler cores that execute many similar tasks simultaneously (see the sketch after this list).
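A minimal sketch of the contrast (illustrative, not from the article): the same vector-addition task written serially for one CPU core and in parallel as a CUDA kernel, where each of thousands of threads handles one element. Function and variable names here are hypothetical.

    #include <cuda_runtime.h>

    // CPU version: a single core walks through the elements one after another.
    void add_serial(const float* a, const float* b, float* c, int n) {
        for (int i = 0; i < n; ++i)
            c[i] = a[i] + b[i];
    }

    // GPU version: every thread computes exactly one element, so the whole
    // array is (conceptually) processed at the same time.
    __global__ void add_parallel(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's element
        if (i < n)
            c[i] = a[i] + b[i];
    }

    // Example launch: 1 << 20 elements split across 4,096 blocks of 256 threads:
    // add_parallel<<<4096, 256>>>(d_a, d_b, d_c, 1 << 20);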

2. How a GPU Works: The Technical Mechanism

  • SIMD Architecture: GPUs operate on the Single Instruction, Multiple Data (SIMD) principle, where a single command is executed across thousands of data points (pixels or parameters) at once.
  • The Rendering Pipeline: For visual tasks, GPUs use a four-step process:
    • Vertex Processing: Calculating 3D positions using matrix mathematics.
    • Rasterization: Converting geometric shapes into a grid of pixels.
    • Shading: Determining color, light, and texture for each pixel.
    • Output: Writing the final frame to the Video RAM (VRAM).
  • AI Transformation: In AI training, the GPU skips the visual steps and uses the same cores for Matrix Multiplication, the mathematical operation at the foundation of neural networks (a minimal sketch follows this list).
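Both ideas above can be seen in one hedged CUDA sketch: a naive matrix-multiplication kernel in which a single instruction stream (the loop below) is executed by thousands of threads, each producing one element of the output matrix. This is illustrative only; production libraries such as cuBLAS use tiled, Tensor-Core-accelerated versions, but the parallel structure is the same.

    #include <cuda_runtime.h>

    // Computes C = A * B for n x n matrices (naive version, no tiling).
    // Every thread runs the same code on a different (row, col) pair: the
    // SIMD/SIMT principle described above.
    __global__ void matmul(const float* A, const float* B, float* C, int n) {
        int row = blockIdx.y * blockDim.y + threadIdx.y;
        int col = blockIdx.x * blockDim.x + threadIdx.x;
        if (row < n && col < n) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k)
                sum += A[row * n + k] * B[k * n + col];  // row-by-column dot product
            C[row * n + col] = sum;
        }
    }

    // Example launch: a 2D grid of 16x16-thread blocks covering the output.
    // dim3 block(16, 16), grid((n + 15) / 16, (n + 15) / 16);
    // matmul<<<grid, block>>>(d_A, d_B, d_C, n);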

3. Key Internal Components

  • Cores: Standard units like CUDA Cores (Nvidia) or Stream Processors (AMD) handle general math. Specialized Tensor Cores are designed specifically for the matrix (“tensor”) arithmetic that dominates deep learning.
  • VRAM (Video RAM): Unlike system RAM, VRAM (e.g., GDDR6X or HBM3) offers massive bandwidth, allowing it to feed huge amounts of data to the thousands of cores without creating a bottleneck (a back-of-envelope sketch follows this list).
  • Thermal Design: High-end GPUs in 2026 consume over 1000W of power, necessitating advanced liquid cooling systems in modern data centers.
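Why bandwidth matters can be shown with a back-of-envelope calculation. The compute and bandwidth figures below are assumed round numbers for illustration, not the specifications of any particular card:

    #include <cstdio>

    int main() {
        double flops = 100e12;  // assumed: 100 TFLOP/s of FP32 compute
        double bw    = 3e12;    // assumed: 3 TB/s of HBM3-class bandwidth
        // "Machine balance": FLOPs the GPU can perform per byte it fetches.
        double balance = flops / bw;  // ~33 FLOP/byte with these numbers
        // A vector addition does 1 FLOP per 12 bytes moved (read two floats,
        // write one), far below the balance point, so it is memory-bound:
        double vec_add = 1.0 / 12.0;
        printf("machine balance: %.1f FLOP/byte\n", balance);
        printf("vector add:      %.3f FLOP/byte -> limited by VRAM, not cores\n",
               vec_add);
        return 0;
    }

A kernel keeps the cores busy only when its arithmetic intensity exceeds this balance point, which is why dense matrix multiplication (heavy reuse of each fetched byte) suits GPUs so well.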

4. Strategic Modern Applications

  • Artificial Intelligence: Training Large Language Models (LLMs) and running real-time “inference” for chatbots and autonomous vehicles.
  • Cryptocurrency: Performing “Proof of Work” (PoW) hashing at high speeds, though some blockchains, notably Ethereum, have already replaced PoW with Proof of Stake (a toy sketch follows this list).
  • Scientific Simulation: Modeling climate change, molecular dynamics for drug discovery, and genomic sequencing.
  • Digital Twins: Creating real-time virtual replicas of factories or cities for industrial optimization.
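As an aside on how PoW maps onto a GPU, here is a deliberately toy CUDA sketch: each thread tests one candidate nonce against a difficulty target. The hash below is a simple 32-bit integer mixer used purely for illustration, not SHA-256 or any real blockchain’s hash function.

    #include <cstdint>
    #include <cuda_runtime.h>

    // Toy mixer standing in for a real cryptographic hash (illustration only).
    __device__ uint32_t toy_hash(uint32_t x) {
        x ^= x >> 16;  x *= 0x7feb352dU;
        x ^= x >> 15;  x *= 0x846ca68bU;
        x ^= x >> 16;
        return x;
    }

    // Each thread tries a different nonce; a hash below `target` "wins".
    // *winner must be initialised to 0xFFFFFFFFu before launch.
    __global__ void search(uint32_t header, uint32_t target, uint32_t* winner) {
        uint32_t nonce = blockIdx.x * blockDim.x + threadIdx.x;
        if (toy_hash(header ^ nonce) < target)
            atomicMin(winner, nonce);  // keep the smallest valid nonce
    }

    // Example launch testing ~1 million nonces at once:
    // search<<<4096, 256>>>(header, 0x0000FFFFu, d_winner);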
Q. With reference to Graphics Processing Units (GPUs), consider the following statements:

1. Unlike a CPU which is optimized for sequential processing, a GPU uses a parallel architecture to handle thousands of tasks simultaneously.

2. The term "General-Purpose Computing on Graphics Processing Units" (GPGPU) refers to the use of GPUs for non-graphics tasks like scientific research and AI.

3. Integrated GPUs share the system's main RAM, whereas Discrete GPUs possess their own dedicated high-bandwidth memory called VRAM.

How many of the above statements are correct?

A) Only one
B) Only two
C) All three
D) None

Solution: (C)

• STATEMENT 1 IS CORRECT: This is the core difference; CPUs handle complex logic one after another (serial), while GPUs handle many simple tasks at once (parallel).
• STATEMENT 2 IS CORRECT: GPGPU is the shift that allowed GPUs to be used for things like weather forecasting and AI instead of just video games.
• STATEMENT 3 IS CORRECT: Integrated GPUs (found in basic laptops) use the computer's shared RAM, which is slower, while Discrete (dedicated) GPUs have specialized VRAM (like GDDR6) for high performance.
