In case you missed it, Computing Community Consortium (CCC) Council member Cynthia Dwork, distinguished scientist at Microsoft Research, and her co-authors published an article in Science, “The reusable holdout: Preserving validity in adaptive data analysis.” Her co-authors are Vitaly Feldman, research scientist at IBM’s Almaden Research Center; Moritz Hardt, research scientist at Google; Toniann Pitassi, professor in the Department of Computer Science at the University of Toronto; Omer Reingold, principal researcher at Samsung Research America; and Aaron Roth, the Raj and Neera Singh Assistant Professor in the Department of Computer and Information Science in the University of Pennsylvania’s School of Engineering and Applied Science.
In their paper, they demonstrate a new approach for addressing the challenges of adaptivity based on insights from privacy-preserving data analysis.
Large data sets offer a vast scope for testing already-formulated ideas and new ones, but researchers who attempt to do both on the same data set run the risk of making false discoveries. Based on ideas drawn from differential privacy, Dwork et al. now provide a theoretical solution: ideas are tested against aggregate information, while individual data set components remain confidential.
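To make the idea of testing against aggregate information a little more concrete, here is a minimal Python sketch loosely modeled on the Thresholdout mechanism described in the paper. It is a simplification under stated assumptions, not the authors' implementation: the function names, the default `threshold` and `sigma` values, and the input format (per-example query values for each set) are all hypothetical, and the sketch omits the paper's budget on holdout accesses and its noisy threshold refresh.

```python
import random


def laplace(scale, rng):
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01, rng=None):
    """Answer one query, given its per-example values on each data set.

    Simplified sketch: the defaults are illustrative assumptions, and the
    budget tracking described in the paper is left out.
    """
    rng = rng or random.Random()
    train_mean = sum(train_vals) / len(train_vals)
    holdout_mean = sum(holdout_vals) / len(holdout_vals)

    # Only when the training estimate disagrees with the holdout by more than
    # a noisy threshold is a (noised) holdout average revealed.
    if abs(train_mean - holdout_mean) > threshold + laplace(sigma, rng):
        return holdout_mean + laplace(sigma, rng)

    # Otherwise the training average is returned, so the answer reveals
    # essentially nothing new about the holdout set.
    return train_mean
```

An analyst would call `thresholdout` once per candidate statistic, for example passing a feature's per-example correlation contributions on the training and holdout sets, and would only ever see the returned aggregate rather than the holdout records themselves.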
For more information, please see their paper, “The reusable holdout: Preserving validity in adaptive data analysis”; Aaron Roth’s ScienceBlog.com interview; or Cynthia Dwork’s interview with Microsoft.