Database Group Leipzig

within the department of computer science


Intermediate Fusion for Multimodal Product Matching


Pollack, J. ; Köpcke, H. ; Rahm, E.


Proc. GI-Workshop Grundlagen von Datenbanksystemen (GVDB)

2024 / 05


Abstract

Web-based entity resolution, particularly in the context of online marketplaces and e-commerce ecosystems, is a critical task for accurately identifying and matching similar product offers across the web. Traditional approaches to entity resolution have primarily relied on textual information, but the increasing availability of diverse data modalities has led to the adoption of a multimodal approach. This paper introduces an innovative intermediate fusion architecture for multimodal product matching, effectively combining textual information from RoBERTa embeddings and visual information from Swin-Transformer embeddings. Our approach enhances matching accuracy by leveraging the complementary nature of text and image modalities. Experimental results on the WDC Shoes and Zalando datasets show the superiority of our proposed approach compared to unimodal models and multimodal baselines. The outcomes highlight the potential for multimodal product matching to improve entity resolution in online marketplaces, thereby enhancing the user shopping experience.
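
The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea of intermediate (feature-level) fusion, not the authors' model. It assumes each product offer already comes with a text embedding and an image embedding; the short toy vectors stand in for RoBERTa and Swin-Transformer outputs, and the names `fuse`, `match_score`, `LAYER_WEIGHTS`, and `LAYER_BIAS` as well as the hand-picked weights are hypothetical. A real system would learn the fusion layer from labeled pairs.

```python
import math

# Hypothetical fixed weights for a 5 -> 2 fusion layer (3 text dims + 2 image dims).
# In a trained model these would be learned; they are hand-picked here for illustration.
LAYER_WEIGHTS = [
    [0.5, -0.2, 0.1, 0.3, 0.0],
    [-0.1, 0.4, 0.2, 0.0, 0.6],
]
LAYER_BIAS = [0.0, 0.0]


def fuse(text_emb, image_emb):
    """Intermediate fusion: concatenate both modality embeddings,
    then project them jointly through a dense layer with ReLU."""
    joint = list(text_emb) + list(image_emb)
    projected = [
        sum(w * v for w, v in zip(row, joint)) + b
        for row, b in zip(LAYER_WEIGHTS, LAYER_BIAS)
    ]
    return [max(0.0, v) for v in projected]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_score(offer_a, offer_b):
    """Score two offers, each given as (text_embedding, image_embedding),
    by comparing their fused representations."""
    return cosine(fuse(*offer_a), fuse(*offer_b))


# Toy offers: (text embedding, image embedding). Values are made up.
shoe_a = ([1.0, 0.0, 1.0], [1.0, 0.0])
shoe_a_variant = ([0.9, 0.1, 1.0], [1.0, 0.1])
other_shoe = ([0.0, 1.0, 0.0], [0.0, 1.0])

print(match_score(shoe_a, shoe_a_variant))
print(match_score(shoe_a, other_shoe))
```

The point of the sketch is where the modalities meet: the two embeddings are combined before the matching decision, so the comparison runs on a joint representation rather than on separate text and image scores that are merged afterwards.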
