Abstract

A major challenge for metadata management schemes that sit outside the filesystem layer is indexing meaningful path information for the files they reference in an external system, such as a database or a metadata journal file. The path to a file is critical for two reasons: it allows meaningful interpretation of the locality of the file and its metadata, and it enables more efficient user-mode services that transform the file or its metadata. Path information is also essential in compliance systems, where audit logs must record what happened to a file and where it is located. However, when the data path is audited from layers such as protocols, reconstructing the full path of every file becomes harder, because the protocol layers do not integrate directly with the underlying filesystem. The protocol layers must then rely on the system cache for path data; when the cache cannot supply it, the protocol has to perform an expensive reverse path walk to reconstruct the path, which heavily degrades system performance. In this paper we discuss a mechanism that records, in the protocol layer, enough information about a file, namely its own unique ID and that of its parent, that the path can be reconstructed if and when required through a reliable reverse lookup in a database or a file-based journal system. The idea is to retain enough information to reconstruct the path at a later time and outside the system where the information originated. The paper also discusses how to keep this system consistent under all conditions.
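
The reverse lookup described above can be illustrated with a minimal sketch. It is not the paper's implementation: the in-memory dictionary stands in for the database or file-based journal, and the names `record` and `reconstruct_path`, the integer IDs, and the convention that the root has no parent and an empty name are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical journal: file ID -> (parent ID, name).
# Assumption: the root entry has parent None and an empty name.
Journal = Dict[int, Tuple[Optional[int], str]]


def record(journal: Journal, file_id: int,
           parent_id: Optional[int], name: str) -> None:
    """Store (or overwrite) the parent ID and name for a file ID."""
    journal[file_id] = (parent_id, name)


def reconstruct_path(journal: Journal, file_id: int) -> str:
    """Walk parent IDs up to the root and join the names into a path."""
    parts = []
    seen = set()
    current: Optional[int] = file_id
    while current is not None:
        if current in seen:
            # A cycle means the journal is inconsistent.
            raise ValueError(f"cycle detected at ID {current}")
        seen.add(current)
        if current not in journal:
            raise KeyError(f"no journal entry for ID {current}")
        parent_id, name = journal[current]
        parts.append(name)
        current = parent_id
    # Names were collected leaf-first; drop the empty root name.
    names = [p for p in reversed(parts) if p]
    return "/" + "/".join(names)


journal: Journal = {}
record(journal, 1, None, "")          # root
record(journal, 2, 1, "home")
record(journal, 3, 2, "report.txt")
print(reconstruct_path(journal, 3))
```

Because each entry stores only its own name and its parent's ID, renaming a directory in this sketch means updating a single record, and every descendant's reconstructed path follows from the next lookup.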
