Verifiable cross-modal searchable encryption via hierarchical spherical tree with beam search

Research output: Contribution to journal, Article, peer-review

Abstract

Large-scale multimedia data are widely outsourced to cloud services to support cross-modal retrieval, such as text-to-image search. However, the cloud server is not fully trusted, which raises privacy concerns. Cross-modal searchable encryption (CMSE) enables retrieval over encrypted data without revealing content or query information. In large-scale settings, efficient similarity search over encrypted high-dimensional embeddings remains challenging. Existing indexing approaches, such as locality-sensitive hashing and tree-based methods, face trade-offs between retrieval quality and efficiency. Moreover, a malicious server may skip computations or return incomplete results, making verifiability an important requirement. This paper proposes VCSE-HST, a verifiable cross-modal searchable encryption scheme. VCSE-HST builds a hierarchical spherical tree index for fast pruning and uses beam search to explore multiple candidate paths, achieving both high efficiency and high retrieval quality on large datasets. The scheme provides a dual verification mechanism: score correctness verification for encrypted similarity scores and execution integrity verification based on Merkle commitment. Security analysis demonstrates that VCSE-HST achieves index confidentiality and trapdoor indistinguishability. Experimental evaluation validates that VCSE-HST attains substantial efficiency gains over linear search while preserving retrieval accuracy.
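The pruning idea in the abstract can be illustrated with a minimal plaintext sketch. It is not the VCSE-HST construction: it works on unencrypted vectors, uses Euclidean distance, and omits both verification mechanisms. The names `Node`, `dist`, and `beam_search`, the hand-built tree, and the lower-bound ranking rule are all assumptions made for illustration. Each tree node is a sphere (center and radius); at every level only the `beam_width` most promising spheres are kept, so several candidate paths are explored instead of one.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    """A sphere in the tree: internal nodes hold children, leaves hold items."""
    center: tuple
    radius: float
    children: list = field(default_factory=list)
    items: list = field(default_factory=list)  # (item_id, vector) pairs

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def beam_search(root, query, beam_width, k):
    """Descend level by level, keeping only the beam_width closest spheres."""
    frontier, leaves = [root], []
    while frontier:
        next_level = []
        for node in frontier:
            if node.children:
                next_level.extend(node.children)
            else:
                leaves.append(node)
        # Rank by a lower bound on the distance from the query to any point
        # inside the sphere (assumed ranking rule, not taken from the paper).
        next_level.sort(key=lambda n: max(0.0, dist(query, n.center) - n.radius))
        frontier = next_level[:beam_width]
    # Score only the items stored in the leaves that survived pruning.
    scored = sorted((dist(query, vec), item_id)
                    for leaf in leaves for item_id, vec in leaf.items)
    return [item_id for _, item_id in scored[:k]]

# Hypothetical two-level tree with three leaf spheres.
leaf_a = Node((0.0, 0.0), 1.0, items=[("a1", (0.0, 0.5)), ("a2", (0.5, 0.0))])
leaf_b = Node((10.0, 10.0), 1.0, items=[("b1", (10.0, 10.5))])
leaf_c = Node((0.0, 3.0), 1.0, items=[("c1", (0.0, 2.5))])
root = Node((5.0, 5.0), 10.0, children=[leaf_a, leaf_b, leaf_c])

top2 = beam_search(root, (0.0, 1.0), beam_width=2, k=2)
```

A wider beam visits more leaves and so costs more distance computations, but it lowers the chance of pruning away a sphere that contains a true nearest neighbor. This is the quality-versus-efficiency trade-off the abstract refers to.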

Original language: English
Article number: 104373
Journal: Journal of Information Security and Applications
Volume: 97
State: Published - Mar 2026

Keywords

  • Beam search
  • Cloud computing
  • Cross-modal retrieval
  • Execution integrity verification
  • Hierarchical spherical tree
  • Searchable encryption
