Abstract
The present disclosure provides a method and system for cognitive optimization of a distributed engines framework. It describes a telemetry-driven optimization system designed to improve the performance and efficiency of distributed data pipelines running on Hadoop YARN. The system uses Large Language Models (LLMs) and a standardized Model Context Protocol (MCP) to identify and remediate inefficiencies in application code, resource allocation, and cluster configurations. It integrates with developer workflows, enabling automated diagnostics and corrective actions, and embeds feedback loops for continuous learning and improvement. The disclosure presents the architecture and implementation of the system (referred to as CODE), using Apache Spark as a representative use case, and demonstrates its impact on operational efficiency, developer productivity, and system throughput in large-scale data environments.
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
KUMAR, SWAGAT; RANJAN, AMITABH; and MISHRA, SOUMENDRA KUMAR, "METHOD AND SYSTEM FOR COGNITIVE OPTIMIZATION OF DISTRIBUTED ENGINES FRAMEWORK", Technical Disclosure Commons, (September 25, 2025)
https://www.tdcommons.org/dpubs_series/8640