.. _hardware_compute_resources:

Compute Resources
=================

Overview
--------

The Compute Resources module in daolite models computing hardware so that the performance of AO pipeline components can be estimated. This module allows users to:

1. Define custom hardware specifications
2. Use pre-defined hardware profiles
3. Compare performance across different systems
4. Model heterogeneous computing (CPU + GPU)

Hardware Model Components
-------------------------

CPU Resources
~~~~~~~~~~~~~

CPU resources are modeled with the following parameters:

- **Number of cores**: Physical CPU cores available
- **Core frequency**: Clock speed of CPU cores in Hz
- **FLOPS per cycle**: Floating point operations per clock cycle per core (vectorization capability)
- **Memory channels**: Number of memory channels
- **Memory width**: Width of each memory channel in bits
- **Memory frequency**: Memory clock frequency in Hz
- **Network speed**: Network interface speed in bits/second
- **Driver time**: Overhead for kernel/driver interactions (microseconds in the examples below)

.. code-block:: python

    # Example: Creating custom CPU resources
    from daolite.compute import ComputeResources

    # Define an AMD EPYC server
    cpu = ComputeResources(
        hardware="CPU",
        cores=64,
        core_frequency=2.45e9,    # 2.45 GHz
        flops_per_cycle=32,       # FP operations per cycle per core (SIMD width x FMA units)
        memory_channels=8,
        memory_width=64,          # 64 bits per channel
        memory_frequency=3200e6,  # 3200 MHz (effective transfer rate)
        network_speed=100e9,      # 100 Gbps
        time_in_driver=5          # 5 μs driver overhead
    )

GPU Resources
~~~~~~~~~~~~~

GPU resources are modeled with a simplified approach focusing on the key performance parameters:

- **Hardware type**: Indicator that this is a GPU resource
- **Memory bandwidth**: Peak memory bandwidth in bytes/second
- **FLOPS**: Peak floating point operations per second
- **Network speed**: PCIe or network interface speed in bits/second
- **Driver time**: GPU driver and kernel launch overhead

.. code-block:: python

    # Example: Creating custom GPU resources
    from daolite.compute import ComputeResources

    # Define an NVIDIA A100 GPU
    gpu = ComputeResources(
        hardware="GPU",
        memory_bandwidth=1.6e12,  # ~1.6 TB/s (A100 40GB)
        flops=1.95e13,            # 19.5 TFLOPS (FP32)
        network_speed=256e9,      # PCIe 4.0 x16, ~256 Gb/s raw (16 GT/s x 16 lanes)
        time_in_driver=20         # 20 μs driver overhead
    )

Pre-defined Hardware Library
----------------------------

daolite includes a library of pre-defined hardware profiles for common CPUs and GPUs:

CPU Profiles
~~~~~~~~~~~~

.. code-block:: python

    from daolite.compute import hardware

    # AMD CPUs
    cpu1 = hardware.amd_epyc_7763()    # AMD EPYC 7763 (Milan)
    cpu2 = hardware.amd_epyc_9654()    # AMD EPYC 9654 (Genoa)
    cpu3 = hardware.amd_ryzen_7950x()  # AMD Ryzen 9 7950X

    # Intel CPUs
    cpu4 = hardware.intel_xeon_8480()  # Intel Xeon Platinum 8480+
    cpu5 = hardware.intel_xeon_8462()  # Intel Xeon 8462Y+

    # Use a pre-defined CPU in a pipeline
    # (`pipeline` is an existing pipeline object; PipelineComponent is imported elsewhere)
    pipeline.add_component(PipelineComponent(
        name="Centroider",
        compute=hardware.amd_epyc_7763(),  # each profile is a factory function
        # ...other parameters...
    ))

GPU Profiles
~~~~~~~~~~~~

.. code-block:: python

    from daolite.compute import hardware

    # NVIDIA GPUs
    gpu1 = hardware.nvidia_a100_80gb()  # NVIDIA A100 80GB
    gpu2 = hardware.nvidia_h100_80gb()  # NVIDIA H100 80GB
    gpu3 = hardware.nvidia_rtx_4090()   # NVIDIA RTX 4090

    # AMD GPUs
    gpu4 = hardware.amd_mi300x()        # AMD Instinct MI300X

    # Use a pre-defined GPU in a pipeline
    # (`pipeline` is an existing pipeline object; PipelineComponent is imported elsewhere)
    pipeline.add_component(PipelineComponent(
        name="Reconstructor",
        compute=hardware.nvidia_rtx_4090(),
        # ...other parameters...
    ))

Memory Model
------------

The memory model in daolite calculates effective bandwidth based on:

1. **Theoretical peak bandwidth**: Base calculation from hardware specs

   - For CPUs: ``memory_channels * memory_width * memory_frequency / 8`` (the division by 8 converts bits to bytes)
   - For GPUs: Directly specified memory bandwidth

2. **Access pattern efficiency**: Real-world memory access patterns rarely achieve theoretical peak

   - Sequential access: ~80-95% efficiency
   - Strided access: ~40-60% efficiency
   - Random access: ~10-30% efficiency

3. **Cache effects**: Optional modeling of cache benefits

Computation Model
-----------------

The computation model estimates processing time based on:

1. **Theoretical peak FLOPS**: Maximum floating point operations per second

   - For CPUs: ``cores * core_frequency * flops_per_cycle``
   - For GPUs: Directly specified FLOPS

2. **Algorithm efficiency**: Real-world efficiency compared to theoretical peak

   - Memory-bound algorithms: Typically limited by memory bandwidth
   - Compute-bound algorithms: Limited by computational throughput
   - Each algorithm has a scaling factor to account for implementation efficiency

Multiple Resource Types
-----------------------

daolite supports assigning different resource types to different components:

.. code-block:: python

    from daolite.compute import hardware

    # Define CPU for camera readout and DM control
    cpu_resource = hardware.amd_epyc_7763()

    # Define GPU for centroiding and reconstruction
    gpu_resource = hardware.nvidia_rtx_4090()

    # Use in pipeline components
    # (`pipeline`, PipelineComponent and ComponentType are defined or imported elsewhere)
    pipeline.add_component(PipelineComponent(
        component_type=ComponentType.CAMERA,
        name="Camera",
        compute=cpu_resource,  # CPU for camera
        # ...other parameters...
    ))

    pipeline.add_component(PipelineComponent(
        component_type=ComponentType.CENTROIDER,
        name="Centroider",
        compute=gpu_resource,  # GPU for centroiding
        # ...other parameters...
    ))

Adding Custom Hardware Profiles
-------------------------------

Users can extend the hardware library with custom profiles:

.. code-block:: python

    from daolite.compute import ComputeResources
    from daolite.compute.resources import register_hardware_profile

    # Create a custom hardware profile factory function
    def my_custom_server():
        return ComputeResources(
            hardware="CPU",
            cores=128,
            core_frequency=3.2e9,
            flops_per_cycle=64,
            memory_channels=16,
            memory_width=64,
            memory_frequency=3600e6,
            network_speed=200e9,
            time_in_driver=2
        )

    # Register the profile
    register_hardware_profile("my_custom_server", my_custom_server)

    # Later use it from the library
    from daolite import my_custom_server
    resource = my_custom_server()

Implementation Details
----------------------

The Compute Resources module uses a factory pattern to create resources with consistent configurations:

- ``ComputeResources``: A single class used for both CPU and GPU resources
- Pre-defined hardware profiles are factory functions that return configured instances

This design allows for easy extension and customization while maintaining type safety and consistent interfaces.

API Reference
-------------

For complete API details, see the :ref:`api_compute_resources` section.
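
Worked Example: Peak Rates and a Time Estimate
----------------------------------------------

The peak-rate formulas in the Memory Model and Computation Model sections can be checked by hand with the values from the custom CPU example. The sketch below is standalone and does not import daolite. The helper names (``cpu_peak_bandwidth``, ``cpu_peak_flops``, ``estimate_time``) are hypothetical, and the rule of taking the slower of the compute and memory times is a simple roofline-style assumption made for illustration, not necessarily how daolite combines the two internally.

```python
def cpu_peak_bandwidth(memory_channels, memory_width, memory_frequency):
    """Peak memory bandwidth in bytes/second (memory_width is in bits, hence / 8)."""
    return memory_channels * memory_width * memory_frequency / 8


def cpu_peak_flops(cores, core_frequency, flops_per_cycle):
    """Peak floating point operations per second."""
    return cores * core_frequency * flops_per_cycle


def estimate_time(flop_count, bytes_moved, peak_flops, peak_bandwidth,
                  compute_efficiency=1.0, memory_efficiency=1.0):
    """Assumed roofline-style estimate: the slower of compute and memory time."""
    compute_time = flop_count / (peak_flops * compute_efficiency)
    memory_time = bytes_moved / (peak_bandwidth * memory_efficiency)
    return max(compute_time, memory_time)


# Values taken from the custom CPU example
bandwidth = cpu_peak_bandwidth(8, 64, 3200e6)  # 2.048e11 B/s = 204.8 GB/s
flops = cpu_peak_flops(64, 2.45e9, 32)         # 5.0176e12 FLOPS

# Illustrative workload: float32 matrix-vector multiply, 5000 x 10000 matrix
rows, cols = 5000, 10000
flop_count = 2 * rows * cols   # one multiply and one add per matrix element
bytes_moved = 4 * rows * cols  # the matrix is read once, 4 bytes per element

# Sequential access, using the low end of the ~80-95% efficiency range
t = estimate_time(flop_count, bytes_moved, flops, bandwidth,
                  memory_efficiency=0.8)
print(f"peak bandwidth: {bandwidth / 1e9:.1f} GB/s")
print(f"peak compute:   {flops / 1e12:.2f} TFLOPS")
print(f"estimated time: {t * 1e3:.2f} ms")
```

With these numbers the memory term (about 1.22 ms) is far larger than the compute term (about 0.02 ms), so this workload would be classed as memory-bound on the example CPU.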