Analysis of research publications
For our survey of entity resolution prototypes, we analyzed numerous research publications with respect to how they evaluate entity resolution approaches. A list of entity matching papers can also be found in our publication categorizer.
Title | Authors | Venue | Year | Domain | Datasets |
---|---|---|---|---|---|
A Distance-Based Approach to Entity Reconciliation in Heterogeneous Databases | Dey, Sarkar, De | TKDE | 2002 | persons | MIS faculty directory |
A grammar-based entity representation framework for data cleaning | Arasu, Kaushik | SIGMOD | 2009 | people, affiliations | RIDDLE (UCD), DBLP |
A Heterogeneous Field Matching Method for Record Linkage | Steven N. Minton, Claude Nanjo, Craig A. Knoblock, Martin Michalowski, Matthew Michelson | ICDM | 2005 | restaurants, cars, hotels | |
Adaptive duplicate detection using learnable string similarity measures | Bilenko, M. and Mooney, R. J | SIGKDD | 2003 | restaurants, bibliographic | Cora, Citeseer |
Adaptive Name Matching in Information Integration | Bilenko, M., Mooney, R., Cohen, W., Ravikumar, P., and Fienberg, S. | Int. Syst. | 2003 | animals, departments, park, restaurant, persons | |
Adaptive Product Normalization: Using Online Learning for Record Linkage in Comparison Shopping | Bilenko, Basu, Sahami | ICDM | 2005 | products | Froogle |
An Entity Resolution Framework for Deduplicating Proteins | Lochovsky, Topaloglou | DILS | 2008 | proteins | IPI |
An integrated, conditional model of information extraction and coreference with application to citation matching. | Wellner, McCallum, Peng, Hay | UAI | 2004 | bibliographic | Citeseer |
Automatic Training Example Selection for Scalable Unsupervised Record Linkage | Christen | PAKDD | 2008 | census, restaurants, publications | Cora |
Automatically utilizing secondary sources to align information across sources | Michalowski, M., Thakkar, S., and Knoblock, C. A. | AI Mag. | 2005 | restaurants, companies | |
Autonomous citation matching. | Lawrence, Giles, Bollacker | AGENTS | 1999 | bibliographic | |
Cleaning the Spurious Links in Data | Lee, M. L., Hsu, W., and Kothari, V | Int. Syst. | 2004 | bibliographic, movies | DBLP, KDD Cup 2003 Hep-ph data |
Collective Entity Resolution in Relational Data | Bhattacharya, Getoor | TKDD | 2007 | bibliographic | Citeseer, arXiv, BioBase |
Combining a Logical and a Numerical Method for Data Reconciliation | Saïs, Pernelle, Rousset | JoDS | 2010 | bibliographic, hotels | Cora, HOTEL |
Constraint-Based Entity Matching | Shen, Li, Doan | AAAI | 2005 | persons, movies | DBLP, IMDB |
Detecting Duplicates in Complex XML Data | Weis, Naumann | ICDE | 2006 | movies | IMDB, film-dienst.de |
DogmatiX tracks down duplicates in XML | Weis, M. and Naumann, F. | SIGMOD | 2005 | movies, CDs | IMDB, FreeDB |
D-Swoosh: A Family of Algorithms for Generic, Distributed Entity Resolution | Benjelloun, O., Garcia-Molina, H., Gong, H., Kawai, H., Larson, T. E., Menestrina, D., and Thavisomboon, S. | ICDCS | 2007 | products | Yahoo |
Duplicate record elimination in large data files. | Bitton, D. and DeWitt, D. J. | TODS | 1983 | | |
Efficient clustering of high-dimensional data sets with application to reference matching | McCallum, Nigam, Ungar | SIGKDD | 2000 | bibliographic | Cora |
Efficient Private Record Linkage | Yakout, Atallah, Elmagarmid | ICDE | 2009 | persons | BC voters list |
Eliminating fuzzy duplicates in data warehouses. | Ananthakrishna, R., Chaudhuri, S., and Ganti, V. | VLDB | 2002 | persons/customers (bibliographic) | |
Entity Identification in Database Integration | Lim, E., Srivastava, J., Prabhakar, S., and Richardson, J. | ICDE | 1993 | | |
Entity resolution with iterative blocking | Whang, Menestrina, Koutrika, Theobald, Garcia-Molina | SIGMOD | 2009 | products, hotels | Yahoo |
Example-driven design of efficient record matching queries | Chaudhuri, Chen, Ganti, Kaushik | VLDB | 2007 | companies, persons, bibliographic, restaurants, birds | RIDDLE, Cora |
Exploiting context analysis for combining multiple entity resolution systems | Chen, Kalashnikov, Mehrotra | SIGMOD | 2009 | bibliographic, persons | RealPub |
Exploiting relationships for object consolidation | Chen, Kalashnikov, Mehrotra | IQIS | 2005 | movies | Stanford download |
Exploiting secondary sources for unsupervised record linkage | M. Michalowski and S. Thakkar and C. Knoblock | IIWeb | 2004 | restaurants | Zagat, Dinesite |
Febrl - An Open Source Data Cleaning, Deduplication and Record Linkage System with a Graphical User Interface | Christen | SIGKDD | 2008 | census, restaurants, publications | |
Finding similar identities among objects from multiple web sources | Carvalho, J. C. and da Silva, A. S. | WIDM | 2003 | bibliographic, movies, restaurants | |
Framework for Evaluating Clustering Algorithms in Duplicate Detection | Hassanzadeh, Chiang, Miller, Lee | VLDB | 2009 | companies, bibliographic | DBLP |
Generic entity resolution with data confidences | Menestrina, Benjelloun, Garcia-Molina | CleanDB | 2006 | | |
Industry-scale duplicate detection | Weis, Naumann, Jehle, Lufter, Schuster | VLDB | 2009 | people | Schufa |
Interactive deduplication using active learning | Sarawagi, Bhamidipaty | SIGKDD | 2002 | bibliographic, addresses | Citeseer |
Joint deduplication of multiple record types in relational data | Culotta, McCallum | CIKM | 2005 | bibliographic | Citeseer, Cora |
Large-Scale Deduplication with Constraints Using Dedupalog | Arasu, Re, Suciu | ICDE | 2009 | bibliographic | Cora |
Learning domain-independent string transformation weights for high accuracy object identification | Tejada, Knoblock, Minton | SIGKDD | 2002 | restaurants, companies and airports | |
Learning object identification rules for information integration | Tejada, Knoblock, Minton | Information Systems | 2001 | restaurants, companies and airports | |
MOMA-A Mapping-based Object Matching System | Thor, Rahm | CIDR | 2007 | bibliographic | DBLP, ACM, GS |
Object Matching for Information Integration: A Profiler-Based Approach | Doan, Lu, Lee, Han | IIWeb | 2003 | bibliographic, movies | Citeseer, IMDB |
Profile-Based Object Matching for Information Integration | Doan, Lu, Lee, Han | Int. Syst. | 2003 | bibliographic | Citeseer |
Reference reconciliation in complex information spaces | Dong, Halevy, Madhavan | SIGMOD | 2005 | PIM, bibliographic | Cora |
Source-aware Entity Matching: A Compositional Approach | Shen, DeRose, Doan, Ramakrishnan | ICDE | 2007 | persons, movies | IMDB, Yahoo Movie Search |
Structure-based inference of xml similarity for fuzzy duplicate detection | Leitão, Calado, Weis | CIKM | 2007 | movies, CDs | |
Swoosh: A generic approach to entity resolution | Benjelloun, Garcia-Molina, Su, Widom | TechReport | 2005 | products, hotels | Yahoo |
TAILOR: A Record Linkage Tool Box | Elfeky, Elmagarmid, Verykios | ICDE | 2002 | products, persons | |
The Field Matching Problem: Algorithms and Applications | Alvaro E. Monge and Charles Elkan | SIGKDD | 1996 | departments | INSPEC |
The merge/purge problem for large databases | Hernández, M. A. and Stolfo, S. J. | SIGMOD | 1995 | | |
Training Selection for Tuning Entity Matching | Köpcke, Rahm | QDB/MUD | 2008 | bibliographic, park, restaurant | DBLP, ACM, GS, RIDDLE |
Transformation-based Framework for Record Matching | Arasu, Chaudhuri, Kaushik | ICDE | 2008 | bibliographic, addresses | Cora |
XML duplicate detection using sorted neighborhoods | Puhlmann, Weis, Naumann | EDBT | 2006 | movies, CDs | |