Path-LLM: A Multi-Modal Path Representation Learning by Aligning and Fusing with Large Language Models

Yongfu Wei, Yan Lin, Hongfan Gao, Ronghui Xu, Sean Bin Yang*, Jilin Hu*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

The advancement of intelligent transportation systems has led to a growing demand for accurate path representations, which are essential for tasks such as travel time estimation, path ranking, and trajectory analysis. However, traditional path representation learning (PRL) methods often focus solely on single-modal road network data, overlooking important physical and regional factors that influence real-world traffic dynamics. To overcome this limitation, we introduce Path-LLM, a multi-modal path representation learning model that integrates large language models (LLMs) into PRL. Our approach leverages LLMs to interpret both topological and textual data, enabling robust multi-modal path representations. To effectively align and merge these modalities, we propose TPalign, a contrastive learning-based pretraining strategy that ensures alignment within the embedding space. We then present TPfusion, a multi-modal fusion module that dynamically adjusts the weight of each modality before integration. To further optimize LLM training, we introduce a Two-stage Overlapping Curriculum Learning (TOCL) approach, which progressively increases the complexity of the training data. Finally, we evaluate Path-LLM on three real-world datasets across traditional PRL downstream tasks, achieving up to a 61.84% improvement in path ranking performance on the Xi'an dataset. Additionally, Path-LLM demonstrates superior performance in both few-shot and zero-shot learning scenarios.
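The abstract describes TPalign as a contrastive pretraining strategy that pulls a path's topological and textual embeddings together while pushing apart embeddings of different paths. The paper's exact loss is not given here; the sketch below is a minimal, dependency-free illustration of a symmetric-style InfoNCE objective commonly used for such cross-modal alignment, with all function names and the temperature value chosen for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(topo_embs, text_embs, temperature=0.07):
    """Illustrative InfoNCE loss: for each path i, its topological
    embedding should match its own textual embedding (positive pair)
    and differ from the textual embeddings of other paths (negatives).
    Lower loss means better-aligned modalities."""
    n = len(topo_embs)
    loss = 0.0
    for i in range(n):
        # Similarity of path i's topological view to every textual view.
        logits = [cosine(topo_embs[i], text_embs[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of the correct (diagonal) pairing.
        loss += -(logits[i] - log_denom)
    return loss / n

# Toy check: aligned pairs should score a lower loss than shuffled pairs.
topo = [[1.0, 0.0], [0.0, 1.0]]
text_aligned = [[1.0, 0.0], [0.0, 1.0]]
text_shuffled = [[0.0, 1.0], [1.0, 0.0]]
aligned_loss = info_nce(topo, text_aligned)
shuffled_loss = info_nce(topo, text_shuffled)
```

In practice such a loss would be computed over learned, batch-sized embedding matrices with framework autograd (e.g. PyTorch); the pure-Python version above only makes the pull-together/push-apart mechanics explicit.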

Original language: English
Title of host publication: WWW 2025 - Proceedings of the ACM Web Conference
Publisher: Association for Computing Machinery, Inc
Pages: 2289-2298
Number of pages: 10
ISBN (Electronic): 9798400712746
DOIs
State: Published - 28 Apr 2025
Event: 34th ACM Web Conference, WWW 2025 - Sydney, Australia
Duration: 28 Apr 2025 - 2 May 2025

Publication series

Name: WWW 2025 - Proceedings of the ACM Web Conference

Conference

Conference: 34th ACM Web Conference, WWW 2025
Country/Territory: Australia
City: Sydney
Period: 28/04/25 - 2/05/25

Keywords

  • Contrastive learning
  • Curriculum learning
  • Large language models
  • Path representation learning
