To protect private information stored in text embeddings, it’s essential to de-identify the text before embedding it and storing it in a vector database. In this article, we’ll demonstrate how to de-identify and chunk text using Tonic Textual, then embed the chunks and store them in a Pinecone vector database for semantic search in RAG and other LLM applications.
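
The overall flow (de-identify, chunk, embed, store, search) can be sketched in plain Python. This is only an illustration under stated assumptions, not the article's implementation: a pair of regex patterns stands in for Tonic Textual's de-identification, a toy hashed bag-of-words vector stands in for a real embedding model, and an in-memory list stands in for a Pinecone index. Every function name here (`deidentify`, `chunk`, `embed`, `build_index`, `search`) is hypothetical.

```python
import hashlib
import math
import re

# ASSUMPTION: two regex patterns stand in for a real de-identification service.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def deidentify(text):
    """Replace each matched PII span with a placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def chunk(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text, dim=64):
    """ASSUMPTION: toy hashed bag-of-words vector, L2-normalized.

    A real pipeline would call an embedding model here instead.
    """
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(document):
    """De-identify first, then chunk and embed, so no raw PII is ever embedded."""
    clean = deidentify(document)
    # Each record is (id, vector, metadata); the list stands in for a vector database.
    return [(f"chunk-{i}", embed(c), {"text": c}) for i, c in enumerate(chunk(clean))]


def search(index, query, top_k=1):
    """Rank stored chunks by dot product (cosine similarity on unit vectors)."""
    q = embed(query)
    scored = [
        (sum(a * b for a, b in zip(q, vec)), meta["text"])
        for _, vec, meta in index
    ]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]


if __name__ == "__main__":
    doc = "Contact jane.doe@example.com or call 555-123-4567 about the refund."
    index = build_index(doc)
    print(search(index, "refund"))
```

The ordering is the point of the sketch: because `deidentify` runs before `embed`, both the stored vectors and the stored chunk text are derived only from the redacted text.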
First seen on securityboulevard.com
Jump to article: securityboulevard.com/2025/01/how-to-create-de-identified-embeddings-with-tonic-textual-pinecone/

