A data loss prevention (DLP) system and method can use embeddings generated from possibly confidential data to determine whether that data in fact contains confidential data. The possibly confidential data is obtained from a software application, network firewall, cloud service, or other computing component. The system generates an embedding based on the possibly confidential data and compares it to one or more embeddings stored in a data store, each of which was generated from known confidential data. Responsive to the similarity between the generated embedding and a stored embedding being above a threshold, the system may cause a DLP action to be performed (e.g., preventing a user from accessing or exfiltrating the confidential data, notifying a security administrator user, etc.).
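
The comparison step described above can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not name a similarity measure, an embedding model, or a threshold value, so the cosine similarity, the `toy_embed` function (a character-trigram hashing stand-in for a real embedding model), the `0.8` threshold, and every function name below are assumptions chosen for illustration only.

```python
import math
import zlib
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_embed(text: str, dims: int = 64) -> List[float]:
    """Hypothetical stand-in for an embedding model: hashed character-trigram counts."""
    vec = [0.0] * dims
    normalized = text.lower()
    for i in range(len(normalized) - 2):
        trigram = normalized[i:i + 3]
        # crc32 is used instead of hash() so the result is stable across runs.
        vec[zlib.crc32(trigram.encode("utf-8")) % dims] += 1.0
    return vec


def check_for_confidential(
    candidate: str,
    stored_embeddings: Sequence[Vector],
    threshold: float = 0.8,  # assumed value; the source does not specify one
    embed: Callable[[str], Vector] = toy_embed,
) -> Tuple[bool, float]:
    """Return (flagged, best_similarity) for the candidate data.

    The candidate is flagged when its embedding is above the threshold
    similarity to at least one stored embedding of known confidential data.
    """
    candidate_embedding = embed(candidate)
    best = max(
        (cosine_similarity(candidate_embedding, s) for s in stored_embeddings),
        default=0.0,
    )
    return best > threshold, best


def dlp_action(flagged: bool) -> str:
    """Placeholder for a DLP action such as blocking access or alerting an admin."""
    return "block_and_notify" if flagged else "allow"


# Usage: the data store holds embeddings of known confidential documents.
data_store = [toy_embed("Project Falcon acquisition terms: strictly confidential")]
flagged, score = check_for_confidential(
    "Project Falcon acquisition terms: strictly confidential", data_store
)
print(dlp_action(flagged), round(score, 3))
```

In a real deployment the stored embeddings would more plausibly live in a vector index than in a Python list, and the threshold would be tuned against labeled examples; both choices are outside what the abstract specifies.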

