C++
===

Nov 7, 2025 - This is a learning journal for C++ questions focused on scientific computing. I used C++ and C as an undergrad but never built scientific applications. Now, with GPU programming and high-performance computing, I want to revisit C++ for scientific computing.

**Purpose of this documentation**: This page is designed as a practical reference to help you read and write C++ code faster for scientific computing. It focuses on common patterns, syntax, and concepts you'll encounter in real codebases, with emphasis on floating-point arithmetic and numerical computing. The following is sourced from various online references, Copilot, and my own experiments. I do not claim originality for any of the content here.

Quick reference for ``&`` and ``*`` symbols
-------------------------------------------

.. list-table::
   :header-rows: 1
   :widths: 20 20 30 15 15

   * - Context
     - Code Example
     - Meaning of ``&``
     - Type of Thing
     - Behavior / Effect
   * - In a type declaration
     - ``int& ref = x;``
     - Declares a reference
     - Reference
     - ``ref`` becomes another name for ``x`` (no copy). Changing ``ref`` changes ``x``.
   * - In an expression
     - ``int* ptr = &x;``
     - Address-of operator
     - Pointer
     - ``&x`` gets the memory address of ``x``, stored in ``ptr``.
   * - Using pointer dereference
     - ``*ptr = 10;``
     - ``*`` means "value at address"
     - Pointer
     - Writes ``10`` into the variable ``ptr`` points to (here, ``x``).
   * - Combining both
     - ``int& alias = *ptr;``
     - ``*ptr`` gives the value of ``x``, ``alias`` refers to it
     - Reference + Pointer
     - ``alias`` becomes another name for whatever ``ptr`` points to.
   * - In a function parameter
     - ``void foo(int& n)``
     - Passed by reference
     - Reference
     - Function uses the original variable, not a copy.
   * - In a copy constructor
     - ``Tracker(const Tracker& other)``
     - Copy parameter safely and efficiently
     - Reference
     - Avoids recursion (no extra copies).

Pointer in practice
^^^^^^^^^^^^^^^^^^^

To get the device number in CUDA, you run this code:

.. code-block:: cpp

   int device;
   cudaGetDevice(&device);

The ``&`` means "address of". We want to pass the address of ``device`` so that ``cudaGetDevice`` can write the result into that memory location.

Now, if you look at the function signature for ``cudaGetDevice`` (simplified) and sketch what its implementation conceptually does:

.. code-block:: cpp

   cudaError_t cudaGetDevice(int* device)
   {
       // Ask the driver (lower-level API) which GPU is bound to this thread
       int currentDevice = driver_get_current_device();  // e.g. returns 1

       // Copy that value into the caller's memory
       *device = currentDevice;
       return cudaSuccess;
   }

The parameter type ``int*`` is a pointer that holds the address of the caller's integer. Inside the function, ``*device`` dereferences the pointer and writes the result into that memory location.
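The same out-parameter pattern appears outside CUDA whenever a function needs to report a status and a result at the same time. Below is a minimal, hypothetical sketch (``read_sensor`` and its value are made up for illustration):

.. code-block:: cpp

   #include <iostream>

   // Hypothetical function: returns success/failure, writes the result through the pointer
   bool read_sensor(double* out) {
       double measurement = 21.5;  // pretend this came from hardware
       *out = measurement;         // write into the caller's variable
       return true;                // success
   }

   int main() {
       double value = 0.0;
       if (read_sensor(&value)) {       // pass the address of `value`
           std::cout << value << '\n';  // prints 21.5
       }
   }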
Why ``const`` is needed for member functions
--------------------------------------------

The ``const`` keyword after a member function declaration indicates that the function will not modify the object's state. This is important for const-correctness and enables the function to be called on const objects.

.. code-block:: cpp

   class Rectangle {
       int width, height;
   public:
       void setWidth(int w) { width = w; }          // modifies state (non-const)
       int area() const { return width * height; }  // doesn't modify state (const)
   };

Key points
^^^^^^^^^^

When you write ``int area() const``, the ``const`` applies to the hidden ``this`` pointer inside the method. C++ secretly treats member functions like this:

.. code-block:: cpp

   int area(Rectangle* const this) { return this->width * this->height; }

But if you mark the function as ``const``, it becomes:

.. code-block:: cpp

   int area(const Rectangle* const this) { return this->width * this->height; }

Now ``this`` points to a ``const Rectangle``, meaning the compiler won't let you modify any member variables inside that function. This ensures:

- The function can be called on const objects
- The function is guaranteed not to modify the object's state
- Better code safety and clearer intent

In contrast with Python
^^^^^^^^^^^^^^^^^^^^^^^

Python makes ``self`` explicit:

.. code-block:: python

   class Rectangle:
       def area(self):
           return self.width * self.height

C++ makes ``this`` implicit - it's there, you just don't have to say it. When you write ``width`` in a C++ member function, the compiler automatically treats it as ``this->width``.

Numerical data types for scientific computing
---------------------------------------------

.. list-table::
   :header-rows: 1
   :widths: 20 15 25 40

   * - Type
     - Size
     - Precision
     - Scientific use
   * - ``float``
     - 4 bytes (32-bit)
     - ~7 decimal digits
     - GPU computing (faster), machine learning, graphics
   * - ``double``
     - 8 bytes (64-bit)
     - ~15-16 decimal digits
     - Default for scientific computing, general numerical work
   * - ``long double``
     - 8-16 bytes (implementation-defined)
     - ~18-19 decimal digits (x86 80-bit extended precision)
     - High-precision calculations, rarely needed
   * - ``std::complex<float>``
     - 8 bytes
     - Same as ``float``
     - Quantum mechanics, signal processing (GPU)
   * - ``std::complex<double>``
     - 16 bytes
     - Same as ``double``
     - Quantum mechanics, signal processing (default)
   * - ``int``
     - 4 bytes
     - Exact integers
     - Loop counters, array indices only

**Key points for scientific computing**:

- **Always use ``double`` by default** unless you have a specific reason (GPU memory, performance)
- **Use ``float`` for GPU computing** where memory bandwidth matters
- **Use ``std::complex`` for complex numbers** in quantum mechanics, FFTs, signal processing
- **Precision matters**: ``float`` accumulates error faster in iterative algorithms

.. dropdown:: Understanding floating-point precision
   :icon: info
   :color: info

   Floating-point numbers have limited precision:

   .. code-block:: cpp

      float x = 1.0f / 3.0f;  // 0.333333... (7 digits)
      double y = 1.0 / 3.0;   // 0.333333333333333... (15 digits)

      // Accumulated error example
      float sum_f = 0.0f;
      double sum_d = 0.0;
      for (int i = 0; i < 1000000; i++) {
          sum_f += 0.1f;  // Accumulates rounding error
          sum_d += 0.1;   // Better, but still not perfect
      }
      std::cout << sum_f << std::endl;  // e.g. ~99994.5 — noticeably off from 100000 (wrong!)
      std::cout << sum_d << std::endl;  // ~100000.0 (closer)

   **Rule of thumb**: For scientific computing with many iterations or accumulations, prefer ``double``. Use ``float`` only when profiling shows performance benefits.
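If you want to check these limits on your own machine instead of trusting the table, ``std::numeric_limits`` from ``<limits>`` reports them; a quick sketch:

.. code-block:: cpp

   #include <iostream>
   #include <limits>

   int main() {
       // Decimal digits that are always representable without loss
       std::cout << "float digits10:       " << std::numeric_limits<float>::digits10 << '\n';        // 6
       std::cout << "double digits10:      " << std::numeric_limits<double>::digits10 << '\n';       // 15
       std::cout << "long double digits10: " << std::numeric_limits<long double>::digits10 << '\n';  // platform-dependent (often 18)

       // Machine epsilon: smallest eps such that 1 + eps != 1
       std::cout << "float epsilon:  " << std::numeric_limits<float>::epsilon() << '\n';   // ~1.19e-07
       std::cout << "double epsilon: " << std::numeric_limits<double>::epsilon() << '\n';  // ~2.22e-16
   }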
Complex numbers in C++
----------------------

For quantum mechanics, signal processing, and Fourier transforms:

.. code-block:: cpp

   #include <cmath>
   #include <complex>

   // Declaration and initialization
   std::complex<double> z1(3.0, 4.0);      // 3 + 4i
   std::complex<double> z2 = {1.0, -2.0};  // 1 - 2i
   std::complex<float> z3(2.0f, 1.0f);     // Use float for GPU

   // Basic operations
   auto z_sum = z1 + z2;         // (4, 2)
   auto z_prod = z1 * z2;        // (11, -2)
   auto z_conj = std::conj(z1);  // 3 - 4i (complex conjugate)

   // Access real and imaginary parts
   double real_part = z1.real();  // 3.0
   double imag_part = z1.imag();  // 4.0

   // Magnitude and phase
   double magnitude = std::abs(z1);  // 5.0 = sqrt(3²+4²)
   double phase = std::arg(z1);      // ~0.927 radians

   // Exponential form (Euler's formula: e^(iθ) = cos(θ) + i*sin(θ))
   std::complex<double> z_polar = std::polar(5.0, 0.927);  // 5*e^(i*0.927)

**Common in scientific computing**:

.. code-block:: cpp

   // Wave functions in quantum mechanics
   std::complex<double> psi(0.707, 0.707);  // normalized state

   // Fourier transform (simplified)
   std::vector<std::complex<double>> fft_result(1024);

   // Complex exponentials for signal processing
   double omega = 2.0 * M_PI * 440.0;  // 440 Hz
   double t = 0.0;                     // time in seconds (example value)
   std::complex<double> signal = std::exp(std::complex<double>(0.0, omega * t));

Array initialization
--------------------

Understanding various initialization methods is essential because different codebases prefer different styles. Some projects use classic C-style initialization, while others adopt modern C++11 brace initialization. Being familiar with all approaches helps you read, understand, and contribute to diverse codebases more effectively.

1. Classic C-style
^^^^^^^^^^^^^^^^^^

.. code-block:: cpp

   double arr[3] = {1.0, 2.5, 3.14};

This is the most common way. Specify the size and the list. If you make the list shorter than the size, the remaining elements are initialized to zero:

.. code-block:: cpp

   double arr[5] = {1.0, 2.5};  // becomes {1.0, 2.5, 0.0, 0.0, 0.0}

If you omit the size, it is deduced from the initializer list:

.. code-block:: cpp

   double arr[] = {1.0, 2.5, 3.14};  // size is 3

2. Default (zero) initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: cpp

   double arr[3] = {};    // all elements are 0.0
   double arr2[3] = {0};  // also all elements are 0.0

3. C++11 uniform (brace) initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: cpp

   double arr[3]{1.0, 2.5, 3.14};  // same as = {1.0, 2.5, 3.14}, but modern style
   double arr2[]{1.0, 2.5, 3.14};  // size deduced automatically
   double arr3[5]{};               // all zeros

4. Using memset (low-level, C-style)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: cpp

   double arr[3];
   std::memset(arr, 0, sizeof(arr));  // set all bytes to 0 (needs <cstring>)

Note: This does not support non-zero initialization directly. **Use with caution for floating-point** - it's safer to use loop initialization or brace initialization.

5. Runtime initialization (loops or algorithms)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: cpp

   double arr[3];
   for (int i = 0; i < 3; ++i)
       arr[i] = static_cast<double>(i) + 1.5;  // {1.5, 2.5, 3.5}
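The same runtime initialization can be written with a standard algorithm instead of a hand-rolled loop; a small sketch using ``std::generate`` from ``<algorithm>``:

.. code-block:: cpp

   #include <algorithm>
   #include <cmath>
   #include <iterator>

   int main() {
       double arr[8];
       int i = 0;
       // Fill arr with sin(0.0), sin(0.1), sin(0.2), ...
       std::generate(std::begin(arr), std::end(arr),
                     [&i]() { return std::sin(0.1 * i++); });
   }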
Uninitialized arrays
^^^^^^^^^^^^^^^^^^^^

.. dropdown:: What happens with ``double arr[3];`` without initialization within a function?
   :icon: info
   :color: warning

   .. code-block:: cpp

      void foo() {
          double arr[3];
          std::cout << arr[0];  // ❌ undefined behavior (garbage)
      }

   ``arr`` is a local automatic variable. It lives on **stack memory**, and the contents are uninitialized.

   **What is stack memory?** Stack memory is a region of memory used for local variables and function calls. It operates in a Last-In-First-Out (LIFO) manner. When a function is called, space is allocated on the stack for its local variables. When the function returns, that space is freed.

   **Why is it undefined behavior?** When you declare ``double arr[3];`` without initialization, the compiler allocates space on the stack but does not clear or set those memory locations. Whatever data was previously stored in those memory locations remains there. This is called "garbage" data—it could be zeros, could be leftover values from previous function calls, or any random bit pattern. Reading this uninitialized memory leads to unpredictable program behavior, which is why it's called undefined behavior.

.. dropdown:: What about ``double arr[3];`` for global or static arrays?
   :icon: info
   :color: info

   .. code-block:: cpp

      double arr[3];  // global → all elements = 0.0

      void foo() {
          static double arr2[3];  // static local → also all 0.0
      }

   Global and static arrays are automatically zero-initialized.

   **What does "static" mean here?** The ``static`` keyword in this context means the variable has **static storage duration**. Unlike normal local variables that are created on the stack each time a function is called and destroyed when the function returns, a static local variable:

   - Is created only once, the first time the function is called
   - Persists for the entire lifetime of the program
   - Retains its value between function calls
   - Is stored in a separate memory region (not on the stack), similar to global variables
   - Is automatically zero-initialized by the compiler (just like global variables)

   Example:

   .. code-block:: cpp

      void counter() {
          static int count = 0;  // initialized only once
          count++;
          std::cout << count << std::endl;
      }

      int main() {
          counter();  // prints 1
          counter();  // prints 2
          counter();  // prints 3
      }

   The ``count`` variable persists between function calls, unlike a normal local variable which would reset to 0 each time.

Array initialization summary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 25 35 40

   * - Form
     - Example
     - Notes
   * - Explicit C-style
     - ``double arr[3] = {1.0, 2.5, 3.14};``
     - Classic
   * - Implicit size
     - ``double arr[] = {1.0, 2.5, 3.14};``
     - Compiler deduces size
   * - Partial init
     - ``double arr[5] = {1.0, 2.5};``
     - Remaining elements = 0.0
   * - Zero init
     - ``double arr[3] = {};``
     - All zeros
   * - Brace init
     - ``double arr[3]{1.0, 2.5, 3.14};``
     - C++11 style
   * - Heap array
     - ``double* arr = new double[3]{1.0, 2.5, 3.14};``
     - Manual memory
   * - ``std::array``
     - ``std::array<double, 3> a{1.0, 2.5, 3.14};``
     - Safer fixed-size
   * - ``std::vector``
     - ``std::vector<double> v{1.0, 2.5, 3.14};``
     - Resizable container

.. dropdown:: Why std::array and std::vector exist
   :icon: info
   :color: info

   C++ inherited raw C arrays (``double arr[5]``), but they have fundamental safety issues. The Standard Library introduced ``std::array`` and ``std::vector`` to solve these problems:

   **The problem with C arrays**

   .. code-block:: cpp

      double arr[5] = {1.0, 2.5, 3.14, 4.2, 5.7};

      // No size information - decays to pointer
      void process(double arr[]) {  // really just double*
          int size = sizeof(arr);   // Wrong! Gives pointer size, not array size
      }

      // No bounds checking
      arr[10] = 42.0;  // Compiles but writes to random memory (undefined behavior)

      // Can't be copied or returned easily
      double arr2[5] = arr;  // Compilation error

      // Can't be resized
      // No way to dynamically change size

   **std::array: Safe fixed-size arrays**

   Created to replace C arrays with zero runtime overhead while adding safety:
   .. code-block:: cpp

      std::array<double, 5> arr = {1.0, 2.5, 3.14, 4.2, 5.7};

      // Knows its size
      void process(std::array<double, 5>& arr) {
          std::size_t size = arr.size();  // Returns 5
      }

      // Optional bounds checking
      arr.at(10) = 42.0;  // Throws std::out_of_range
      arr[10] = 42.0;     // Undefined behavior (same as C array, but you have .at())

      // Can be copied and returned
      std::array<double, 5> arr2 = arr;  // Works

      // Works with STL algorithms
      std::sort(arr.begin(), arr.end());

   **Key limitation**: Size must be known at compile time. Same memory layout as C arrays (lives on stack).

   **std::vector: Dynamic arrays with memory management**

   Created to provide dynamic sizing with automatic memory management:

   .. code-block:: cpp

      std::vector<double> vec = {1.0, 2.5, 3.14, 4.2, 5.7};

      // Size determined at runtime
      int n;
      std::cin >> n;
      std::vector<double> data(n);  // Size from user input

      // Can grow and shrink
      vec.push_back(6.8);  // Add element
      vec.pop_back();      // Remove element
      vec.resize(100);     // Change size

      // Automatic memory management (RAII)
      {
          std::vector<double> temp(1000000);  // Allocates memory from heap
          // use temp...
      }  // Destructor automatically frees memory

      // Prevents stack overflow for large data
      std::vector<double> big_data(10000000);  // Works (heap allocation)

   **Key tradeoff**: Small overhead for dynamic capability (stores size, capacity, and pointer to heap memory).

   **Summary**

   - **C arrays**: Fast but unsafe, no size info, can't be copied, fixed size at compile time
   - **std::array**: Safe wrapper for C arrays, same performance, knows its size, but still fixed at compile time
   - **std::vector**: Dynamic sizing, automatic memory management, slight overhead, lives on heap

   Modern C++ code should prefer ``std::array`` for fixed sizes and ``std::vector`` for dynamic data. Use raw C arrays only when interfacing with C APIs or for specific performance reasons (measure first!).

Practical tips for working with large datasets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When working with large arrays or datasets in scientific computing:

**Use ``std::vector`` for dynamic data**

.. code-block:: cpp

   std::vector<double> data(1000000);                      // 1 million doubles, all zeros
   std::vector<double> data2(1000000, 3.14);               // all initialized to π
   std::vector<std::complex<double>> wavefunction(1024);   // complex array

.. dropdown:: What does "automatically manage memory" mean?
   :icon: info
   :color: info

   Vectors handle allocation and deallocation behind the scenes. When you create a vector, it allocates memory from the heap. When the vector goes out of scope, it automatically frees that memory. You don't need to call ``new``/``delete`` or ``malloc``/``free`` manually.

   .. code-block:: cpp

      {
          std::vector<double> data(1000);  // memory allocated
          // use data...
      }  // vector destroyed, memory automatically freed

   This prevents memory leaks compared to manual management:

   .. code-block:: cpp

      double* data = new double[1000];  // manual allocation
      // use data...
      delete[] data;  // must remember to free! (easy to forget)

.. dropdown:: What about GPU computing and fine control?
   :icon: info
   :color: info

   For GPU computing (CUDA, OpenCL, etc.), you're right that you need fine control over memory transfers between CPU and GPU. In those cases:

   - **CPU-side preparation**: ``std::vector`` is commonly used to prepare data on the CPU side before transferring to GPU
   - **GPU memory**: You still use explicit GPU memory APIs (``cudaMalloc``, ``cudaMemcpy``, etc.) for GPU-side buffers
   - **Why still use vectors?**: Even in GPU programming, vectors are preferred on the CPU side for their safety and convenience. You get the data pointer with ``data.data()`` when you need to copy to the GPU.
   .. code-block:: cpp

      // Common pattern in CUDA programming
      std::vector<float> host_data(1000000, 1.0f);  // CPU side, safe and easy

      float* device_data;
      cudaMalloc(&device_data, host_data.size() * sizeof(float));  // GPU side
      cudaMemcpy(device_data, host_data.data(),
                 host_data.size() * sizeof(float), cudaMemcpyHostToDevice);  // transfer

      // ... GPU computation ...

      cudaFree(device_data);  // manual GPU cleanup
      // host_data automatically cleaned up when out of scope

   For high-performance computing, the pattern is: use modern C++ containers (vectors) for CPU data management, and explicit APIs for GPU/specialized memory.

**Use ``std::array`` for fixed-size data**

.. code-block:: cpp

   std::array<double, 1024> buffer{};                 // all zeros, safer than C arrays
   std::array<float, 3> position{0.0f, 0.0f, 0.0f};   // 3D coordinates

Provides bounds checking via ``.at()`` and works with STL algorithms.

.. dropdown:: What are STL algorithms?
   :icon: info
   :color: info

   STL (Standard Template Library) algorithms are pre-built functions in ``<algorithm>`` for common operations on containers. Examples:

   .. code-block:: cpp

      #include <algorithm>
      #include <array>

      std::array<double, 5> arr = {3.1, 1.4, 4.2, 1.0, 5.9};

      // Sort the array
      std::sort(arr.begin(), arr.end());

      // Find an element
      auto it = std::find(arr.begin(), arr.end(), 4.2);

      // Transform elements (square each value)
      std::transform(arr.begin(), arr.end(), arr.begin(),
                     [](double x) { return x * x; });

      // Count occurrences
      auto count = std::count(arr.begin(), arr.end(), 1.0);

   These work seamlessly with ``std::array`` and ``std::vector``, but not with raw C arrays.

**Analogy to NumPy**

``std::array`` and ``std::vector`` are analogous to NumPy's ``ndarray`` in Python:

.. code-block:: python

   # Python with NumPy
   import numpy as np
   arr = np.array([1.0, 2.5, 3.14, 4.2, 5.9])
   arr.sort()   # method on array
   arr.mean()   # built-in statistical function
   arr * 2      # vectorized operations

.. code-block:: cpp

   // C++ with std::array
   std::array<double, 5> arr = {1.0, 2.5, 3.14, 4.2, 5.9};
   std::sort(arr.begin(), arr.end());  // algorithm

   // No built-in mean, but can use STL accumulate (from <numeric>)
   double mean = std::accumulate(arr.begin(), arr.end(), 0.0) / arr.size();

   std::transform(arr.begin(), arr.end(), arr.begin(),
                  [](double x) { return x * 2.0; });  // element-wise operation

Key difference: NumPy has more built-in numerical methods (mean, std, etc.), while C++ requires STL algorithms or external libraries (like Eigen or Armadillo) for numerical computing. However, both provide contiguous memory layout for performance.

**Reserve capacity for vectors when size is known**

.. code-block:: cpp

   std::vector<double> data;
   data.reserve(1000000);  // pre-allocate memory to avoid reallocations

   for (int i = 0; i < 1000000; ++i) {
       data.push_back(static_cast<double>(i) * 0.001);  // convert to double
   }

This avoids multiple memory reallocations during growth.
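A quick way to see what ``reserve`` buys you is to watch ``capacity()`` as the vector grows; a small sketch (the exact growth pattern without ``reserve`` is implementation-defined):

.. code-block:: cpp

   #include <iostream>
   #include <vector>

   int main() {
       std::vector<double> grown;
       std::vector<double> reserved;
       reserved.reserve(1000);  // one allocation up front

       for (int i = 0; i < 1000; ++i) {
           grown.push_back(i * 0.001);     // may reallocate several times as capacity grows
           reserved.push_back(i * 0.001);  // never reallocates
       }

       std::cout << "grown.capacity():    " << grown.capacity() << '\n';     // implementation-defined, >= 1000
       std::cout << "reserved.capacity(): " << reserved.capacity() << '\n';  // at least 1000, no reallocation happened
   }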
**Avoid uninitialized arrays for large data**

.. code-block:: cpp

   // Bad: uninitialized, potentially dangerous (and ~8 MB may overflow the stack)
   double data[1000000];

   // Good: explicitly zero-initialized
   double data[1000000] = {};

   // Better: use vector
   std::vector<double> data(1000000);

For large datasets, uninitialized data can lead to hard-to-debug issues.

.. dropdown:: Why is vector better than array here?
   :icon: info
   :color: info

   **Automatic memory management**

   - Arrays like ``double data[1000000]`` allocate on the stack (limited size, typically a few MB)
   - Large arrays can cause stack overflow
   - Vectors allocate on the heap automatically, no size limit concerns

   **Safety and flexibility**

   - If you need to change the size later, vectors can resize; arrays cannot
   - Vectors know their own size (``data.size()``); arrays do not
   - Vectors automatically clean up when going out of scope

   Example of the problem with large stack arrays:

   .. code-block:: cpp

      void foo() {
          double data[10000000];  // Stack overflow (80 MB on stack)
      }

      void bar() {
          std::vector<double> data(10000000);  // Works fine (heap allocation)
      }

.. dropdown:: When should I use ``std::array`` vs ``std::vector``?
   :icon: info
   :color: info

   **Use ``std::array`` when:**

   - Size is fixed and known at compile time
   - Small, performance-critical code (stack allocation is faster)
   - You want array bounds checking with ``.at()``

   .. code-block:: cpp

      std::array<double, 256> buffer{};  // fixed size, stack allocated

   **Use ``std::vector`` when:**

   - Size is determined at runtime or might change
   - Size is large (> few KB)
   - You need to resize, add, or remove elements

   .. code-block:: cpp

      std::vector<double> data(n);  // n determined at runtime
      data.push_back(42.5);         // can grow dynamically

   **Use raw C arrays when:**

   - Working with legacy code or C APIs
   - Interfacing with GPU memory APIs
   - Extreme performance optimization (but measure first!)

   .. code-block:: cpp

      double buffer[256];  // C-style, minimal overhead

**Use memory-efficient initialization for specific patterns**

.. code-block:: cpp

   // Fill with specific value
   std::vector<double> data(1000000, 3.14);

   // Or use std::fill
   std::vector<double> data(1000000);
   std::fill(data.begin(), data.end(), 3.14);

   // Generate values with a pattern (std::iota is in <numeric>)
   std::vector<double> data(1000);
   std::iota(data.begin(), data.end(), 0.0);  // 0.0, 1.0, 2.0, ..., 999.0

**Only zero-fill when you need zero**

Zeroing a large buffer is O(n). If you will overwrite everything anyway, skip the zeroing and write the data once:

.. code-block:: cpp

   // Wasteful: zeros then overwrites
   std::vector<double> data(1000000);  // zeros all elements
   for (int i = 0; i < 1000000; ++i) {
       data[i] = compute_value(i);  // overwrites the zeros
   }

   // Better: allocate but don't initialize, then fill once
   std::vector<double> data;
   data.reserve(1000000);  // allocate space without initializing
   for (int i = 0; i < 1000000; ++i) {
       data.push_back(compute_value(i));  // write data once
   }

**Prefer standard containers over raw arrays**

Standard containers like ``std::vector`` and ``std::array`` offer significant advantages over raw C arrays:

Fixed size known at compile time:

.. code-block:: cpp

   std::array<double, 1000> data{};  // safer than double data[1000]

Large or variable size:

.. code-block:: cpp

   std::vector<double> data(1000000);  // contiguous, cache-friendly

**Why prefer standard containers?**

- **Automatic memory management**: ``std::vector`` handles allocation and deallocation, preventing memory leaks
- **Bounds checking**: ``std::array::at()`` and ``std::vector::at()`` always check bounds and throw on out-of-range access
- **Size tracking**: containers know their own size (``data.size()``), unlike raw arrays
- **STL algorithm compatibility**: works seamlessly with ``std::sort``, ``std::find``, ``std::transform``, etc.
- **Exception safety**: proper cleanup on exceptions, unlike raw arrays with manual memory management
- **Resizable**: ``std::vector`` can grow dynamically without manual reallocation
- **No pointer arithmetic errors**: iterators are safer than pointer manipulation
- **Similar to NumPy arrays**: just as NumPy arrays are preferred over raw Python lists for numerical work, ``std::vector`` is preferred for similar reasons—contiguous memory layout enables cache efficiency and SIMD optimizations

Pointers
--------

**What is a pointer?**

A pointer is a variable that stores the memory address of another object or function. Pointers provide direct memory access and are essential for dynamic memory allocation, data structures, and interfacing with C libraries. They are more primitive and potentially dangerous than references: they can be null, uninitialized, or point to arbitrary memory locations, which makes them both powerful and risky to use.

**Pointer basics**

- Pointers are stored in memory managed by the operating system
- Size: 4 bytes on 32-bit systems, 8 bytes on 64-bit systems (need to address 2^32 or 2^64 memory locations)
- Example memory address: ``0x7ffee3bff6c8`` (hexadecimal format)

**Why use pointers?**

- **Dynamic memory**: Allocate memory at runtime (essential for large datasets)
- **Efficiency**: Pass large objects without copying
- **Data structures**: Build linked lists, trees, graphs
- **Hardware access**: Interface with GPU memory, C APIs, hardware registers
- **Polymorphism**: Access derived classes through base class pointers

**Null pointers**

In C++, null refers to a pointer that does not point to any valid object. It is represented by ``nullptr`` (C++11) or ``NULL`` (from C). Why use null pointers?

- **Indicating absence**: Signify that a pointer is not currently pointing to any valid object
- **Error handling**: Functions can return null to indicate failure
- **Memory management**: Set pointers to null after deallocation to prevent dangling pointers

**Declaring pointers with concrete examples**

Pointers have types and can be null. Each pointer declaration specifies what type of data it points to. Here's what actually happens in memory:

.. code-block:: cpp

   #include <iostream>

   double x = 3.14159;
   double* p1 = &x;  // p1 stores address of x

   std::cout << "Value of x: " << x << std::endl;         // 3.14159
   std::cout << "Address of x: " << &x << std::endl;      // 0x7ffeeb5c4a18 (example)
   std::cout << "Value of p1: " << p1 << std::endl;       // 0x7ffeeb5c4a18 (same address)
   std::cout << "Dereferenced p1: " << *p1 << std::endl;  // 3.14159 (accesses value at address)

**Memory visualization**:

.. code-block:: text

   Memory Address    Variable    Value
   ──────────────    ────────    ─────────────────
   0x7ffeeb5c4a18    x           3.14159 (double, 8 bytes)
   0x7ffeeb5c4a20    p1          0x7ffeeb5c4a18 (pointer, 8 bytes)
                                 └─ points to x

**Common pointer types for scientific computing**:

.. code-block:: cpp

   double* ptr1;                // pointer to double (most common in scientific code)
   float* ptr2;                 // pointer to float (GPU computing)
   std::complex<double>* ptr3;  // pointer to complex number
   const double* ptr4;          // pointer to constant double (can't modify *ptr4)
   double* const ptr5 = &x;     // constant pointer (can't change ptr5 itself)
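Because each pointer knows the size of the type it points to, adding 1 to a pointer advances by ``sizeof(type)`` bytes, not by one byte. A small sketch:

.. code-block:: cpp

   #include <iostream>

   int main() {
       double data[3] = {1.0, 2.0, 3.0};
       double* p = data;  // points to data[0]

       std::cout << *p << '\n';        // 1.0
       std::cout << *(p + 1) << '\n';  // 2.0 -- p + 1 advances by sizeof(double) = 8 bytes
       std::cout << p[2] << '\n';      // 3.0 -- p[2] is the same as *(p + 2)
   }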
**Important**: Uninitialized pointers contain garbage values (random addresses). Always initialize:

.. code-block:: cpp

   double x = 2.718;
   double* ptr = nullptr;  // Good: explicitly null
   double* ptr2 = &x;      // Good: points to valid object (address: e.g., 0x7ffeeb5c4b00)
   double* ptr3;           // Bad: contains garbage address (e.g., 0x000deadbeef)
   *ptr3 = 5.0;            // Crash - writing to random memory

The type of a pointer determines:

- What type of data it can point to
- How pointer arithmetic works (e.g., ``ptr + 1`` moves by ``sizeof(type)`` bytes)
- Whether it can be null (all pointer types can be ``nullptr``)

**Dereferencing pointers**

Use the asterisk (*) operator to access the value a pointer points to:

.. code-block:: cpp

   double x = 42.5;
   double* ptr = &x;     // ptr stores address of x (e.g., 0x7ffeeb5c4c10)
   double value = *ptr;  // dereference: value = 42.5

   *ptr = 99.9;          // modify through pointer
   std::cout << x;       // x is now 99.9

**Practical example: passing arrays to functions**

.. code-block:: cpp

   double compute_mean(const double* data, int size) {
       double sum = 0.0;
       for (int i = 0; i < size; i++) {
           sum += data[i];  // same as *(data + i)
       }
       return sum / size;
   }

   std::vector<double> measurements = {1.2, 3.4, 5.6, 7.8};
   double mean = compute_mean(measurements.data(), measurements.size());

**Dangling pointers**

A dangling pointer does not point to a valid object. This typically occurs when the object was deleted or deallocated, but the pointer was not updated. Accessing a dangling pointer leads to undefined behavior, crashes, or data corruption.

.. code-block:: cpp

   double* ptr = new double(3.14);
   delete ptr;      // memory freed
   // ptr still holds old address (e.g., 0x600001234567)
   *ptr = 2.71;     // Crash - memory no longer valid

   ptr = nullptr;   // Good practice: mark as invalid

References
----------

**What is a reference?**

A reference in C++ is an alias for another variable, created using the ampersand (&) symbol. Once a reference is initialized, it cannot be changed to refer to another variable.

.. code-block:: cpp

   double x = 3.14;
   double& ref = x;  // ref is an alias for x
   ref = 2.71;       // modifies x directly
   std::cout << x;   // prints 2.71

**Why use references instead of pointers?**

References provide a safer alternative to pointers:

- Cannot be null or uninitialized (must always refer to a valid object)
- More intuitive syntax (no need for dereferencing)
- Prevent common pointer-related errors
- Avoid copying large objects when passing to functions

.. code-block:: cpp

   // Pass by reference: no copy, direct access
   void increment(double& value) {
       value += 1.0;  // modifies the original
   }

   double x = 5.5;
   increment(x);  // x is now 6.5

**Analogy: pointers vs references**

If a pointer is like holding someone's address written on paper, a reference is like being given a permanent portal directly into their house—you don't need to look up the address, you just walk in.

**Pointers vs References comparison**

.. list-table::
   :header-rows: 1
   :widths: 20 40 40

   * - Feature
     - Pointer
     - Reference
   * - Can be null?
     - Yes
     - No (must refer to something valid)
   * - Can be reseated?
     - Yes (``p = &y;``)
     - No (always refers to the same object)
   * - Syntax for access
     - ``*p`` (dereference)
     - Just use the name
   * - Used for
     - Dynamic memory, low-level control
     - Function parameters, operator overloads, clean aliasing

.. dropdown:: How do you use references in C++?
   :icon: info
   :color: info

   .. code-block:: cpp

      int x = 10;
      int& ref = x;    // ref refers to x
      ref = 20;        // modifies x directly
      std::cout << x;  // prints 20
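References are also the standard way to pass large containers into functions without copying them; a minimal sketch using a ``const`` reference (the function name ``sum_of`` is illustrative):

.. code-block:: cpp

   #include <iostream>
   #include <vector>

   // Pass by const reference: no copy is made, and the function cannot modify the data
   double sum_of(const std::vector<double>& data) {
       double total = 0.0;
       for (double x : data) {
           total += x;
       }
       return total;
   }

   int main() {
       std::vector<double> measurements(1'000'000, 0.5);    // ~8 MB of data
       std::cout << sum_of(measurements) << '\n';           // no 8 MB copy happens here
   }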
.. dropdown:: What problem does reference solve?
   :icon: info
   :color: info

   In C, you must use pointers to modify variables in functions:

   .. code-block:: cpp

      void increment(int* p) { (*p)++; }

      int main() {
          int x = 5;
          increment(&x);   // must pass address
          std::cout << x;  // prints 6
      }

   This can lead to issues like null pointers or pointer arithmetic errors. In C++, references provide a safer and more intuitive way:

   .. code-block:: cpp

      void increment(int& r) { r++; }

      int main() {
          int x = 5;
          increment(x);    // pass by reference
          std::cout << x;  // prints 6
      }

   References eliminate pointer-related risks while maintaining efficiency.

CUDA C++ extensions for PyTorch
-------------------------------

When writing custom CUDA kernels for PyTorch (e.g., for ptychography reconstruction, FFTs, or custom neural network layers), you divide code into **host** (CPU) and **device** (GPU) operations.

Host code responsibilities
^^^^^^^^^^^^^^^^^^^^^^^^^^

1. Input validation and tensor interfacing
"""""""""""""""""""""""""""""""""""""""""""

.. code-block:: cpp

   #include <torch/extension.h>

   torch::Tensor ptychography_forward(
       torch::Tensor probe,      // complex probe function
       torch::Tensor object,     // object transmission
       torch::Tensor positions   // scan positions
   ) {
       // Validate inputs
       TORCH_CHECK(probe.is_cuda(), "probe must be CUDA tensor");
       TORCH_CHECK(probe.dtype() == torch::kComplexFloat, "probe must be complex64");

       // Get tensor dimensions
       int64_t batch_size = probe.size(0);
       int64_t probe_size = probe.size(1);

2. Memory allocation and data preparation
"""""""""""""""""""""""""""""""""""""""""

.. code-block:: cpp

   // Allocate output tensor on GPU
   auto options = torch::TensorOptions()
       .dtype(torch::kComplexFloat)
       .device(probe.device());
   auto output = torch::zeros({batch_size, probe_size, probe_size}, options);

   // Get raw pointers for CUDA kernel
   auto probe_ptr  = probe.data_ptr<c10::complex<float>>();
   auto object_ptr = object.data_ptr<c10::complex<float>>();
   auto output_ptr = output.data_ptr<c10::complex<float>>();

3. CUDA kernel launch configuration
"""""""""""""""""""""""""""""""""""

.. code-block:: cpp

   // Calculate grid and block dimensions
   const int threads = 256;
   const int blocks = (batch_size * probe_size * probe_size + threads - 1) / threads;

   // Launch kernel
   ptychography_kernel<<<blocks, threads>>>(
       probe_ptr, object_ptr, output_ptr,
       batch_size, probe_size
   );

   // Check for kernel errors
   cudaError_t error = cudaGetLastError();
   TORCH_CHECK(error == cudaSuccess, "CUDA kernel failed: ", cudaGetErrorString(error));

4. Synchronization and output handling
""""""""""""""""""""""""""""""""""""""

.. code-block:: cpp

       // Wait for kernel completion
       cudaDeviceSynchronize();

       return output;  // Return PyTorch tensor
   }

   // Bind to Python
   PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
       m.def("forward", &ptychography_forward, "Ptychography forward pass");
   }

Device code (CUDA kernel)
^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: cuda

   __global__ void ptychography_kernel(
       const c10::complex<float>* probe,
       const c10::complex<float>* object,
       c10::complex<float>* output,
       int batch_size,
       int probe_size
   ) {
       int idx = blockIdx.x * blockDim.x + threadIdx.x;
       int total = batch_size * probe_size * probe_size;

       if (idx < total) {
           // Complex multiplication for ptychography
           output[idx] = probe[idx] * object[idx];
       }
   }

Why this division?
^^^^^^^^^^^^^^^^^^

- **Host**: Handles PyTorch tensor interface, memory management, validation, kernel orchestration
- **Device**: Pure computation (complex arithmetic, FFTs, ptychography operations)
- **PyTorch integration**: Host code bridges Python/C++/CUDA seamlessly

Common host operations for scientific computing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- Input/output tensor validation (types, shapes, device placement)
- Memory allocation (output tensors, workspace buffers)
- Data layout conversions (contiguous memory, proper strides)
- Kernel launch parameters (grid/block sizing)
- Error checking and synchronization
- Gradient computation setup (for backward pass)
- Multi-GPU coordination (if using distributed computing)

Typical workflow pattern
^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: cpp

   // 1. Validate inputs (host)
   TORCH_CHECK(input.is_cuda(), "input must be on CUDA");
   TORCH_CHECK(input.is_contiguous(), "input must be contiguous");

   // 2. Allocate outputs (host)
   auto output = torch::empty_like(input);

   // 3. Get raw pointers (host)
   auto input_ptr = input.data_ptr<float>();
   auto output_ptr = output.data_ptr<float>();

   // 4. Launch kernel (host calls device)
   const int threads = 256;
   const int blocks = (input.numel() + threads - 1) / threads;
   my_kernel<<<blocks, threads>>>(input_ptr, output_ptr, input.numel());

   // 5. Synchronize (host)
   cudaDeviceSynchronize();

   // 6. Return PyTorch tensor (host)
   return output;

Building and using the extension
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**setup.py for compilation**:

.. code-block:: python

   from setuptools import setup
   from torch.utils.cpp_extension import BuildExtension, CUDAExtension

   setup(
       name='ptychography_cuda',
       ext_modules=[
           CUDAExtension(
               'ptychography_cuda',
               ['ptychography.cpp', 'ptychography_kernel.cu'],
               extra_compile_args={
                   'cxx': ['-O3'],
                   'nvcc': ['-O3', '--use_fast_math']
               }
           )
       ],
       cmdclass={'build_ext': BuildExtension}
   )

**Using in Python**:

.. code-block:: python

   import torch
   import ptychography_cuda

   # Create complex tensors on GPU
   probe = torch.randn(32, 256, 256, dtype=torch.complex64, device='cuda')
   object_wave = torch.randn(256, 256, dtype=torch.complex64, device='cuda')
   positions = torch.randn(32, 2, device='cuda')

   # Call custom CUDA kernel
   result = ptychography_cuda.forward(probe, object_wave, positions)

Memory management
-----------------

Memory allocation in C
^^^^^^^^^^^^^^^^^^^^^^

.. dropdown:: How does memory allocation work in C?
   :icon: info
   :color: info

   Memory management in C uses ``<stdlib.h>`` (or ``<cstdlib>`` in C++):

   - ``malloc(size)``: Allocates heap memory and returns a ``void*`` pointer
   - ``free(ptr)``: Deallocates memory allocated by ``malloc``

   Example: ``int* ptr = (int*)malloc(sizeof(int) * n);``

Memory allocation in C++
^^^^^^^^^^^^^^^^^^^^^^^^

.. dropdown:: What are the new and delete operators in C++?
   :icon: info
   :color: info

   In C++, the ``new`` operator allocates memory for an object or array on the heap, while the ``delete`` operator deallocates that memory when it is no longer needed.

   - To allocate memory for a single object:

     .. code-block:: cpp

        int* ptr = new int;  // Allocates memory for a single integer

   - To allocate memory for an array of objects:

     .. code-block:: cpp

        int* arr = new int[10];  // Allocates memory for an array of 10 integers

   - To deallocate memory for a single object:

     .. code-block:: cpp

        delete ptr;  // Deallocates memory for the single integer

   - To deallocate memory for an array of objects:

     .. code-block:: cpp

        delete[] arr;  // Deallocates memory for the array of integers

..
dropdown:: What is the difference between ``malloc/free`` in C and ``new/delete`` in C++? :icon: info :color: info ``malloc/free`` deal with raw bytes, while ``new/delete`` handle typed objects and call constructors/destructors. .. code-block:: cpp class Widget { public: Widget() { std::cout << "Constructing\n"; } ~Widget() { std::cout << "Destructing\n"; } }; Widget* w = new Widget(); // allocates and constructs delete w; // destructs and deallocates The ``~Widget()`` destructor is called when ``delete`` is used, ensuring proper cleanup of resources. The ``Widget`` object is constructed when allocated with ``new``. Memory layout and organization ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ **Why do we need organized memory layout?** Memory organization is fundamental to how programs execute. Memory isn't just for storing variables and arrays—it's also for storing function call information (parameters, return addresses, local variables), program instructions (the compiled code itself), and dynamically allocated data. The operating system and CPU work together to organize memory into different regions, each optimized for specific purposes: - **Code segment**: Stores your compiled program instructions - **Static/global data**: Stores global and static variables with fixed addresses - **Stack**: Manages function calls and local variables with automatic cleanup - **Heap**: Provides flexible dynamic memory for data whose size or lifetime isn't known at compile time This division allows the CPU to optimize memory access patterns, prevents different types of data from interfering with each other, and enables efficient memory management. Without this organization, programs would be slower, less reliable, and more prone to crashes. The stack """"""""" .. dropdown:: What is the stack and why is it used for function calls? :icon: info :color: info **The stack** is a region of memory with Last-In-First-Out (LIFO) structure, used for managing function calls and local variables. **Complete memory layout visualization**:: High memory addresses ┌─────────────────────────┐ │ Stack (grows down) │ ← Stack pointer (SP) │ - Function calls │ Fast, automatic cleanup │ - Local variables │ Size limit (~1-8 MB) │ - Return addresses │ │ - Function parameters │ ├─────────────────────────┤ │ (unused space) │ ├─────────────────────────┤ │ Heap (grows up) │ ← Heap pointer │ - Dynamic allocation │ Flexible, manual management │ - new/malloc │ Size: GB+ (OS dependent) │ - std::vector data │ │ - Large objects │ ├─────────────────────────┤ │ Static/Global data │ │ - Global variables │ Fixed addresses │ - Static variables │ Lifetime: entire program │ - Constants │ ├─────────────────────────┤ │ Code/Text segment │ │ - Program instructions│ Read-only │ - Compiled functions │ Executable code └─────────────────────────┘ Low memory addresses **How the stack works during function calls**: .. code-block:: cpp void bar(double x) { double c = x + 1.0; // local variable on stack } void foo(double a) { double b = 10.0; // local variable on stack bar(a + b); // call bar } int main() { foo(5.0); // call foo return 0; } **Stack evolution**:: 1. main() called: ┌──────────────┐ │ return addr │ └──────────────┘ 2. foo(5.0) called: ┌──────────────┐ │ a = 5.0 │ │ b = 10.0 │ │ return addr │ ├──────────────┤ │ return addr │ ← from main └──────────────┘ 3. bar(15.0) called: ┌──────────────┐ │ x = 15.0 │ │ c = 16.0 │ │ return addr │ ├──────────────┤ │ a = 5.0 │ │ b = 10.0 │ │ return addr │ ├──────────────┤ │ return addr │ ← from main └──────────────┘ 4. 
bar() returns, its frame is removed:

      ┌──────────────┐
      │ x = 15.0     │  (popped)
      ├──────────────┤
      │ a = 5.0      │
      │ b = 10.0     │
      │ return addr  │
      ├──────────────┤
      │ return addr  │  ← from main
      └──────────────┘

   5. foo() returns, its frame is removed:

      ┌──────────────┐
      │ return addr  │  ← from main
      └──────────────┘

   **Why use the stack for function calls?**

   - **Speed**: Stack allocation is extremely fast—just move a pointer (no searching for free memory)
   - **Automatic cleanup**: When a function returns, its stack frame is automatically freed
   - **Predictable lifetime**: Local variables have clear scope and lifetime
   - **CPU cache-friendly**: Stack memory is accessed sequentially, which is optimal for CPU caches
   - **No fragmentation**: Memory is always allocated and freed in LIFO order
   - **Simple and efficient**: The hardware supports stack operations directly (push/pop instructions)

The heap
""""""""

.. dropdown:: What is heap memory and how does it work?
   :icon: info
   :color: info

   **Heap memory** is a region for dynamic memory allocation where the size and lifetime of data can be determined at runtime.

   **Key characteristics**:

   - Managed by the operating system's memory allocator
   - Used for runtime memory allocation (``new``/``delete`` in C++, ``malloc``/``free`` in C)
   - Much larger than stack (can be GB+, limited by available RAM)
   - Memory persists until explicitly freed
   - Slower allocation than stack (requires searching for free blocks)
   - Risk of memory leaks if not properly managed

   **Heap memory visualization**::

      Heap grows upward (toward higher addresses)

      Before allocation:
      ┌─────────────────────────────────┐
      │ Free space (unallocated)        │  Size: GB+
      └─────────────────────────────────┘

      After allocations:
      ┌─────────────────────────────────┐
      │ Free space                      │
      ├─────────────────────────────────┤
      │ Block 3: double[1000000]        │  ← ptr3 (8 MB)
      │ (vector data)                   │
      ├─────────────────────────────────┤
      │ Free (after deletion)           │  ← Fragmentation
      ├─────────────────────────────────┤
      │ Block 2: std::complex<double>   │  ← ptr2 (16 bytes)
      ├─────────────────────────────────┤
      │ Block 1: double                 │  ← ptr1 (8 bytes)
      │ Value: 3.14159                  │
      └─────────────────────────────────┘
      Heap base

   **Example with actual memory operations**:

   .. code-block:: cpp

      // Allocate single value on heap
      double* ptr1 = new double(3.14159);
      // Allocates 8 bytes on heap
      // ptr1 stores heap address (e.g., 0x6000012a4b00)

      // Allocate complex number
      std::complex<double>* ptr2 = new std::complex<double>(1.0, 2.0);
      // Allocates 16 bytes on heap
      // ptr2 stores heap address (e.g., 0x6000012a4b10)

      // Allocate large array via vector (internal heap allocation)
      std::vector<double> data(1000000);
      // Allocates 8 MB on heap for array storage
      // vector manages this pointer internally

      delete ptr1;  // Free the 8 bytes
      delete ptr2;  // Free the 16 bytes
      // data automatically freed when vector goes out of scope

   **Comparison with stack allocation**:

   .. code-block:: cpp

      void example() {
          // Stack allocation (fast, automatic)
          double x = 3.14;  // 8 bytes on stack
          double arr[100];  // 800 bytes on stack

          // Heap allocation (flexible, manual)
          double* p = new double(2.71);       // 8 bytes on heap
          double* big = new double[1000000];  // 8 MB on heap

          // Stack overflow risk
          // double huge[10000000];  // Crash - too big for stack

          delete p;
          delete[] big;
      }  // Stack memory automatically freed here

   **When to use heap vs stack**:

   - **Use stack**: Small, fixed-size local variables, function parameters
   - **Use heap**: Large data, dynamic size, data that outlives function scope, shared data
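   The "outlives the function" point deserves a tiny sketch: returning the address of a stack local is a bug, while heap-backed data (here owned by ``std::vector``) can safely outlive the function that created it:

   .. code-block:: cpp

      #include <vector>

      double* broken() {
          double local = 3.14;
          return &local;  // Bug: `local` dies when the function returns (dangling pointer)
      }

      std::vector<double> fine() {
          std::vector<double> data(1000, 3.14);  // heap-backed storage
          return data;                           // safe: ownership moves to the caller
      }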
   **Real-world scientific computing example**:

   .. code-block:: cpp

      void process_diffraction_data(int num_patterns) {
          // Stack: small, fixed data
          int pattern_size = 512;
          double threshold = 0.001;

          // Heap: large, runtime-sized data
          std::vector<double> patterns(num_patterns * pattern_size * pattern_size);
          std::vector<std::complex<double>> fft_result(pattern_size * pattern_size);

          // Process data...

          // All heap memory automatically cleaned up when vectors destroyed
      }

.. dropdown:: Why does the OS divide memory into stack and heap?
   :icon: info
   :color: info

   The OS divides memory to optimize different usage patterns:

   **Stack Memory**

   - Fast, fixed size (typically 1-8 MB)
   - LIFO (Last-In, First-Out)
   - For local variables and function calls
   - Automatic memory management
   - No fragmentation
   - Best for: small, short-lived data

   **Heap Memory**

   - Flexible, variable size (GB+)
   - For dynamic data structures
   - Manual memory management (or RAII with smart pointers)
   - Can fragment over time
   - Best for: large data, dynamic lifetime, shared ownership

   **Example showing the difference**:

   .. code-block:: cpp

      void bad_function() {
          double data[10000000];  // Stack overflow - 80 MB on limited stack
      }

      void good_function() {
          std::vector<double> data(10000000);  // Heap allocation - no problem
      }

Memory fragmentation
^^^^^^^^^^^^^^^^^^^^

.. dropdown:: What is memory fragmentation?
   :icon: info
   :color: info

   Memory fragmentation is when free memory becomes divided into small, scattered blocks, making it difficult to allocate large contiguous chunks of memory.

.. dropdown:: Why does memory fragmentation occur?
   :icon: info
   :color: info

   Fragmentation happens when:

   - Memory blocks are allocated in different sizes
   - Blocks are freed in random order
   - Gaps form between allocated blocks
   - These gaps are too small for new allocations

.. dropdown:: How can you avoid memory fragmentation?
   :icon: info
   :color: info

   Key strategies:

   - Use memory pools
   - Allocate larger blocks upfront
   - Minimize frequent small allocations
   - Use fixed-size allocations when possible
   - Monitor memory usage patterns

Memory pools
^^^^^^^^^^^^

.. dropdown:: What are memory pools?
   :icon: info
   :color: info

   Memory pools are pre-allocated blocks of memory that manage smaller allocations internally:

   - Allocate one large block upfront
   - Manage smaller allocations within that block
   - Reduce fragmentation
   - Improve allocation speed
   - Simplify memory management

.. dropdown:: Can you show an example of a memory pool in C++?
   :icon: info
   :color: info

   Here is a simple example of a memory pool in C++:

   .. code-block:: cpp

      #include <cstddef>
      #include <iostream>
      #include <new>

      class MemoryPool {
      public:
          MemoryPool(size_t size) : poolSize(size), pool(new char[size]), offset(0) {}

          ~MemoryPool() { delete[] pool; }

          void* allocate(size_t size) {
              if (offset + size > poolSize) {
                  throw std::bad_alloc();
              }
              void* ptr = pool + offset;
              offset += size;
              return ptr;
          }

          void deallocate(void* ptr, size_t size) {
              // In a simple memory pool, deallocation is often a no-op
          }

      private:
          size_t poolSize;
          char* pool;
          size_t offset;
      };

      int main() {
          MemoryPool memPool(1024);  // Create a memory pool of 1024 bytes

          int* intPtr = static_cast<int*>(memPool.allocate(sizeof(int)));
          *intPtr = 42;
          std::cout << "Allocated integer: " << *intPtr << std::endl;

          memPool.deallocate(intPtr, sizeof(int));  // Deallocation is a no-op in this simple example
          return 0;
      }

Smart pointers
^^^^^^^^^^^^^^

Memory management is error prone. C++ provides automatic memory management through smart pointers:

- ``std::unique_ptr``: Exclusive ownership, auto-deletion when out of scope
- ``std::shared_ptr``: Shared ownership with reference counting
- ``std::weak_ptr``: Non-owning reference to prevent circular references
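A minimal sketch of the first two in use (``std::make_unique``/``std::make_shared`` are the standard factory functions from ``<memory>``):

.. code-block:: cpp

   #include <memory>
   #include <vector>

   int main() {
       // Exclusive ownership: freed automatically when `grid` goes out of scope
       auto grid = std::make_unique<std::vector<double>>(1'000'000, 0.0);
       (*grid)[0] = 3.14;

       // Shared ownership: freed when the last shared_ptr is destroyed
       auto params = std::make_shared<std::vector<double>>(3, 1.0);
       auto alias  = params;  // reference count is now 2
       (*alias)[1] = 2.71;    // both names see the same data

   }  // no delete needed anywhere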
Practice examples
-----------------

Pointer declarations
^^^^^^^^^^^^^^^^^^^^

Basic pointer with actual memory addresses:

.. code-block:: cpp

   double* p;           // pointer to double (uninitialized, contains garbage)
   double x = 3.14159;
   p = &x;              // p now holds address of x (e.g., 0x7ffeeb5c4a18)

   std::cout << "Value: " << x << std::endl;           // 3.14159
   std::cout << "Address: " << &x << std::endl;        // 0x7ffeeb5c4a18
   std::cout << "Pointer: " << p << std::endl;         // 0x7ffeeb5c4a18
   std::cout << "Dereferenced: " << *p << std::endl;   // 3.14159

Multiple pointers on one line:

.. code-block:: cpp

   double *p1, *p2;  // both are pointers to double
   double* p1, p2;   // p1 is a pointer to double, p2 is a double (confusing!)

Constant pointers:

.. code-block:: cpp

   double x = 3.14, y = 2.71;
   const double* p1 = &x;        // pointer to const double (can't modify *p1)
   double* const p2 = &x;        // const pointer (can't change address)
   const double* const p3 = &x;  // both const

Generic pointer:

.. code-block:: cpp

   void* vp = &x;  // can point to any type, but you must cast to use it

Pointer to pointer:

.. code-block:: cpp

   double** pp = &p;  // points to a pointer (e.g., stores 0x7ffeeb5c4a20)

Using references
^^^^^^^^^^^^^^^^

.. dropdown:: How do you use references in C++?
   :icon: info
   :color: info

   .. code-block:: cpp

      double x = 3.14;
      double& ref = x;  // ref refers to x
      ref = 2.71;       // modifies x directly
      std::cout << x;   // prints 2.71

.. dropdown:: What problem does reference solve?
   :icon: info
   :color: info

   In C, you must use pointers to modify variables in functions:

   .. code-block:: cpp

      void increment(double* p) { (*p) += 1.0; }

      int main() {
          double x = 5.5;
          increment(&x);   // must pass address
          std::cout << x;  // prints 6.5
      }

   This can lead to issues like null pointers or pointer arithmetic errors. In C++, references provide a safer and more intuitive way:

   .. code-block:: cpp

      void increment(double& r) { r += 1.0; }

      int main() {
          double x = 5.5;
          increment(x);    // pass by reference
          std::cout << x;  // prints 6.5
      }

   References eliminate pointer-related risks while maintaining efficiency.

.. dropdown:: What is the concept of "null" in C++?
   :icon: info
   :color: info

   In C++, "null" refers to a pointer that does not point to any valid object or memory location. It is represented by the constant ``nullptr`` (introduced in C++11) or the macro ``NULL`` (from C). A null pointer is used to indicate that the pointer is not currently assigned to any object.

.. dropdown:: Why do you want to have a pointer that is "null"?
   :icon: info
   :color: info

   Having a null pointer is useful for several reasons:

   1. **Indicating absence**: A null pointer can signify that a pointer is not currently pointing to any valid object, which is useful for optional data or uninitialized pointers.
   2. **Error handling**: Functions can return a null pointer to indicate failure or that no valid object could be returned.
   3. **Memory management**: It helps in managing dynamic memory by allowing pointers to be set to null after deallocation, preventing dangling pointers.

.. dropdown:: What's a good analogy to understand pointers vs references?
   :icon: info
   :color: info

   If a pointer is like holding someone's address written on paper, a reference is like being given a permanent portal directly into their house. You don't need to look up the address — you just walk in.