Puneet MahajanFollow


Maintaining data integrity and reliability in distributed storage systems is a complex and important challenge. Distributed storage systems are susceptible to various issues, such as data corruption, data loss, and data inconsistencies, arising from hardware failures, network disruptions, software defects, synchronization errors, security breaches, etc. Current techniques such as redundant storage devices, use of erasure coding, data snapshots/backup, etc. are reactive and cannot address data integrity issues before they manifest or at an early stage. Costly manual intervention is necessary to mitigate data integrity issues but can cause delays or introduce errors. This disclosure describes the use of machine learning based anomaly detection techniques and of symbolic links to provide a proactive, self-healing data management system. The described techniques actively monitor for signs of data integrity issues and autonomously initiate corrective actions. Such actions can include actions related to data restoration, corruption correction, or access rerouting. This approach not only addresses data corruption after it occurs but also aims to predict and prevent potential issues before they impact the system, filling a critical gap in current data management practices.

Creative Commons License

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.