Recent Publications

This page is updated infrequently. The latest publications and extended material are also available from the faculty and project web pages.

2021 and preprints



  • JOSIE:  Overlap set similarity search for finding joinable tables in data lakes
    Erkang Zhu, Dong Deng, Fatemeh Nargesian, Renée J. Miller
    SIGMOD, pp. 847-864, 2019.

  • Anytime approximation in probabilistic databases via scaled dissociations
    Maarten Van den Heuvel, Peter Ivanov, Wolfgang Gatterbauer, Floris Geerts, Martin Theobald
    SIGMOD, pp. 1295-1312, 2019.
    pdf | preprint | bib

  • VISE: Vehicle Image Search Engine with traffic camera
    Hyewon Choi, Erkang Zhu, Arsala Bangash, Renée J. Miller
    PVLDB 12(12): 1842-1845 (2019).

  • Data lake management: Challenges and opportunities (tutorial)
    Fatemeh Nargesian, Erkang Zhu, Renée J. Miller, Ken Q. Pu, Patricia C. Arocena
    PVLDB 12(12): 1986-1989 (2019).

  • Bridging quantities in tables and text
    Yusra Ibrahim, Mirek Riedewald, Gerhard Weikum, Demetrios Zeinalipour-Yazti
    ICDE, pp. 1010-1021, 2019.

  • A collective, probabilistic approach to schema mapping using diverse noisy evidence
    Angelika Kimmig, Alex Memory, Renée J. Miller, Lise Getoor
    IEEE Trans. Knowl. Data Eng. 31(8): 1426-1439 (2019).

  • Abstract cost models for distributed data-intensive computations
    Rundong Li, Ningfang Mi, Mirek Riedewald, Yizhou Sun, Yi Yao
    DAPD Journal, 37(3): 411-439, 2019.
    pdf | bib

  • Algebraic approximations of the probability of Boolean functions
    Wolfgang Gatterbauer
    SUM (Invited Keynote), pp. 449-450, 2019.



  • Beta probabilistic databases: A scalable approach to belief updating and parameter learning
    Niccolò Meneghetti, Oliver Kennedy, Wolfgang Gatterbauer
    SIGMOD, pp. 573-586, 2017. (Invited to the Special Issue of TODS on “best of SIGMOD 2017”)
    pdf | preprint | bib

  • Automated template generation for question answering over knowledge graphs
    Abdalghani Abujabal, Mohamed Yahya, Mirek Riedewald, Gerhard Weikum
    WWW, pp. 1191-1200, 2017.

  • Interactive navigation of open data linkages
    Erkang Zhu, Ken Q. Pu, Fatemeh Nargesian, Renée J. Miller
    PVLDB 10(12): 1837-1840 (2017).

  • A collective, probabilistic approach to schema mapping
    Angelika Kimmig, Alex Memory, Renée J. Miller, Lise Getoor
    ICDE, pp. 921-932, 2017.
    pdf | arXiv:1702.03447 | preprint

  • The linearization of belief propagation on pairwise Markov random fields
    Wolfgang Gatterbauer
    AAAI, pp. 3747-3753, 2017.
    pdf | arXiv:1502.04956 (long) | bib

  • DeepSea: Progressive workload-aware partitioning of materialized views in scalable data analytics
    Jiang Du, Renée J. Miller, Boris Glavic, Wei Tan
    EDBT, pp. 198-209, 2017.

  • Conflict of interest declaration and detection system in heterogeneous networks
    Siyuan Wu, Leong Hou U, Sourav S Bhowmick, Wolfgang Gatterbauer
    CIKM, pp. 2383-2386, 2017 (Short paper).
    pdf | preprint | bib

  • A case for abstract cost models for distributed execution of analytics operators
    Rundong Li, Ningfang Mi, Mirek Riedewald, Yizhou Sun, Yi Yao
    Int. Conf. on Big Data Analytics and Knowledge Discovery (DaWaK), pp. 149-163, 2017.
    pdf | preprint

  • Algorithms for automatic ranking of participants and tasks in an anonymized contest
    Yang Jiao, R. Ravi, Wolfgang Gatterbauer
    WALCOM, pp. 335-346, 2017. (Invited to the Special Issue of Elsevier TCS on “best of WALCOM 2017”)
    pdf | arXiv:1612.04794 | bib

  • VIQS: Visual interactive exploration of query semantics
    Christina Christodoulakis, Eser Kandogan, Ignacio G. Terrizzano, Renée J. Miller
    ESIDA@IUI, pp. 25-32, 2017.

  • Dissociation and propagation for approximate lifted inference with standard relational database management systems
    Wolfgang Gatterbauer, Dan Suciu
    VLDBJ (Special Issue of VLDB Journal on “best of VLDB 2015”), pp. 5-30, 2016.
    pdf | arXiv:1310.6257 (long)

  • Data quality: The role of empiricism
    Shazia Wasim Sadiq, Tamraparni Dasu, Xin Luna Dong, Juliana Freire, Ihab F. Ilyas, Sebastian Link, Renée J. Miller, Felix Naumann, Xiaofang Zhou, Divesh Srivastava
    SIGMOD Record 46(4): 35-43 (2017)

  • The future of data integration (Keynote Abstract)
    Renée J. Miller
    KDD, p. 3, 2017.

  • A machine learning approach for result caching in web search engines
    Tayfun Kucukyilmaz, Berkant Barla Cambazoglu, Cevdet Aykanat, Ricardo Baeza-Yates
    Inf. Process. Manag. 53(4): 834-850 (2017)

  • Story-focused reading in online news and its potential for user engagement
    Janette Lehmann, Carlos Castillo, Mounia Lalmas, Ricardo Baeza-Yates
    J. Assoc. Inf. Sci. Technol. 68(4): 869-883 (2017)

  • Quality-efficiency trade-offs in machine learning for text processing
    Ricardo Baeza-Yates, Zeinab Liaghat
    BigData 2017: 897-904

  • FA*IR: A Fair Top-k Ranking Algorithm
    Meike Zehlike, Francesco Bonchi, Carlos Castillo, Sara Hajian, Mohamed Megahed, Ricardo Baeza-Yates
    CIKM 2017: 1569-1578

  • Detection of Trending Topic Communities: Bridging Content Creators and Distributors
    Lorena Recalde, David F. Nettleton, Ricardo Baeza-Yates, Ludovico Boratto
    HT 2017: 205-213

  • Exploring Query Auto-Completion and Click Logs for Contextual-Aware Web Search and Query Suggestion
    Liangda Li, Hongbo Deng, Anlei Dong, Yi Chang, Ricardo Baeza-Yates, Hongyuan Zha
    WWW 2017: 539-548


  • Merlin: Exploratory Analysis with Imprecise Queries
    Bahar Qarabaqi, Mirek Riedewald
    IEEE TKDE, 28(2): 342-355, 2016. (TKDE special issue on “best of ICDE 2014”)

  • Making sense of entities and quantities in web tables
    Yusra Ibrahim, Mirek Riedewald, Gerhard Weikum
    CIKM, pp. 1703-1712, 2016.
    pdf | preprint

  • Visual congruent ads for image search
    Yannis Kalantidis, Ayman Farahat, Lyndon Kennedy, Ricardo Baeza-Yates, David A. Shamma
    ICPR 2016: 1496-1505

  • Encouraging Diversity- and Representation-Awareness in Geographically Centralized Content
    Eduardo Graells-Garrido, Mounia Lalmas, Ricardo Baeza-Yates
    IUI 2016: 7-18

  • Data Portraits and Intermediary Topics: Encouraging Exploration of Politically Diverse Profiles
    Eduardo Graells-Garrido, Mounia Lalmas, Ricardo Baeza-Yates
    IUI 2016: 228-240

  • Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising
    Mihajlo Grbovic, Nemanja Djuric, Vladan Radosavljevic, Fabrizio Silvestri, Ricardo Baeza-Yates, Andrew Feng, Erik Ordentlich, Lee Yang, Gavin Owens
    SIGIR 2016: 375-384

  • Towards Mobile Query Auto-Completion: An Efficient Mobile Application-Aware Approach
    Aston Zhang, Amit Goyal, Ricardo Baeza-Yates, Yi Chang, Jiawei Han, Carl A. Gunter, Hongbo Deng
    WWW 2016: 579-590