Edge and Cloud-Based AI Inference for Big Data Processing
This chapter examines the strategic deployment of AI inference across the edge-cloud continuum, focusing on the limitations of cloud-based processing and the latency, bandwidth, and scalability advantages of edge-based processing. It describes foundational architectures (edge, fog, and hybrid systems) and assesses the suitability of each for real-time big-data settings. It also highlights open issues, namely data volume, variable data quality, and resource constraints, and discusses more advanced approaches, including model partitioning and distillation, adaptive inference, compression techniques, and federated learning. Case studies present real-world examples of improvements in response time, accuracy, and operational overhead. The chapter concludes with the remaining technical challenges and future research directions for scalable, privacy-preserving distributed AI systems.