DoDaS & FAIR Guest: Jeff M. Phillips

February 15, 2024, 4-6 p.m. (c.t.), Otto-Hahn-Str. 12, Room E.003

Prof. Jeff M. Phillips, Ph.D.

Kahlert School of Computing

University of Utah

Jeff Phillips is a joint guest of the FAIR research profile and DoDaS.

Title: Ferret: Reviewing Tabular Datasets for Manipulation

Abstract: How do we ensure the veracity of science? The act of manipulating or fabricating scientific data has led to many high-profile fraud cases and retractions. Detecting manipulated data, however, is a challenging and time-consuming endeavor. Automated detection methods are limited due to the diversity of data types and manipulation techniques. Furthermore, patterns automatically flagged as suspicious can have reasonable explanations. Instead, we propose a nuanced approach where experts analyze tabular datasets, e.g., as part of the peer-review process, using a guided, interactive visualization approach. In this paper, we present an analysis of how manipulated datasets are created and the artifacts these techniques generate. Based on these findings, we propose a suite of visualization methods to surface potential irregularities. We have implemented these methods in Ferret, a visualization tool for data forensics work. Ferret makes potential data issues salient and provides guidance on spotting signs of tampering and differentiating them from truthful data.

This is joint work with Devin Lange, Shaurya Sahai, and Alexander Lex, and appeared at EuroVis 2023 (https://vdl.sci.utah.edu/publications/2023_eurovis_ferret/).


Short Bio: Jeff Phillips (https://www.cs.utah.edu/~jeffp/) is a Professor in the Kahlert School of Computing at the University of Utah. There he runs the Data Science academic program, including a bachelor's degree in Data Science whose creation he led. He also founded and directed the Utah Center for Data Science. He regularly publishes in top venues across data science, including data mining, machine learning, data management, and statistics, as well as in computational geometry. He published an undergraduate textbook, "Mathematical Foundations for Data Analysis" (mathfordata.github.io), with Springer-Nature in 2021. He has an NSF CAREER Award and has been continuously funded by NSF since 2009. For the 2023-2024 academic year he is on sabbatical as a Senior Research Fellow at ScaDS.AI at the University of Leipzig and a visitor at the MPI for Mathematics in the Sciences.