Section 2 of The BD2K Guide to the Fundamentals of Data Science online lecture series, titled Data Representation Overview, starts October 28th with an overview from Anita Bandrowski, UCSD. The lecture series, run by the National Institutes of Health (NIH) Big Data to Knowledge (BD2K) program, features experts from around the country presenting on a wide range of topics in data science. This course is an introductory overview that assumes no prior knowledge or understanding of data science. The series began Friday, September 9th and will continue to run all year, once per week from 12 noon to 1 pm ET.
Additionally, registration is now open for the 2016 Open Data Science Symposium: How Open Data and Open Science are Transforming Biomedical Research. The symposium will take place December 1 at the Bethesda North Marriott Conference Center in Bethesda, MD, in conjunction with the 2016 BD2K All Hands Meeting, November 29-30, and is free and open to the public. The symposium will be live cast at: https://videocast.nih.gov/. Please register here by November 18, 2016. For more information about this event, contact Elizabeth.Kittrie@nih.gov or Joe.Bonner@nih.gov.
If you would like to join the meeting, please go to the BD2K Guide web page for the most up-to-date computer or mobile logins.
This is a joint effort of the BD2K Training Coordinating Center (TCC), the BD2K Centers Coordination Center (BD2KCCC), and the NIH Office of the Associate Director of Data Science. For up-to-date information about the series and to see archived presentations, go to this website.
Tentative Schedule
10/28/16 SECTION 2: DATA REPRESENTATION OVERVIEW (Anita Bandrowski, UCSD)
11/4/16 Databases and data warehouses, Data: structures, types, integrations (Chaitan Baru, NSF)
11/11/16 No lecture (Veterans Day)
11/18/16 Social networking data (TBD)
12/2/16 Data wrangling, normalization, preprocessing (Joseph Picone, Temple)
12/9/16 Exploratory Data Analysis (Brian Caffo, Johns Hopkins)
12/16/16 Natural Language Processing (Noemie Elhadad, Columbia)
1/6/17 SECTION 3: COMPUTING OVERVIEW (Patricia Kovatch, Icahn School of Medicine at Mount Sinai)
1/13/17 Workflows/pipelines
1/20/17 Programming and software engineering; API; optimization
1/27/17 Cloud, Parallel, Distributed Computing, and HPC
2/3/17 Commons: lessons learned, current state
2/10/17 SECTION 4: DATA MODELING AND INFERENCE OVERVIEW (Dates tentative)
2/17/17 Smoothing, Unsupervised Learning/Clustering/Density Estimation
2/24/17 Supervised Learning/prediction/ML, dimensionality reduction
3/3/17 Algorithms, including optimization
3/10/17 Multiple testing, false discovery rate
3/17/17 Data issues: Bias, Confounding, and Missing data
3/24/17 Causal inference
3/31/17 Data Visualization tools and communication
4/7/17 Modeling Synthesis
SECTION 5: ADDITIONAL TOPICS
4/14/17 Open science
4/21/17 Data sharing (including social obstacles)
4/28/17 Ethical Issues
5/5/17 Extra considerations/limitations for clinical data
5/12/17 Reproducibility
5/19/17 SUMMARY and NIH context
Other Upcoming BD2K Opportunities
- BD2K FOA: RFA-CA-16-020 “BD2K Support for Meetings of Data Science Related Organizations (U13).” The purpose of this FOA is to support high quality and impactful conferences or meetings convened by community-based, data science-related organizations that help to carry out critical work related to biomedical data science and are aligned with the goals and Mission Statement of the NIH BD2K program. Applications due December 15, 2016. Permission to submit application letters is required. Applicants are urged to initiate contact well in advance of the chosen application due date and no later than 6 weeks before that date. Please visit the Frequently Asked Questions (FAQ) page for more information on this FOA.
- BD2K FOA: RFA-ES-16-010 “Big Data to Knowledge (BD2K) Community-Based Data and Metadata Standards Efforts (R24).” Applications due November 9, 2016. For additional information, contact Cindy Lawler, lawler@niehs.nih.gov. Check out the blog post on the BD2K INPUT/OUTPUT Blog: https://datascience.nih.gov/BlogFOACommunity-BasedStandards.
- BD2K FOA: RFA-LM-17-001 “Big Data to Knowledge (BD2K) Enhancing the Efficiency and Effectiveness of Digital Curation for Biomedical Big Data (U01).” Applications due December 15, 2016. For additional information, contact Valerie Florance at: florancev@mail.nih.gov.
- BD2K FOA: RFA-ES-16-011 “BD2K Research Education Curriculum Development: Data Science Overview for Biomedical Scientists (R25).” Applications due December 7, 2016. For additional information, contact the BD2K Training Team at bd2k_training@mail.nih.gov.
- BD2K FOA: RFA-MD-16-002 “NIH Big Data to Knowledge (BD2K) Enhancing Diversity in Biomedical Data Science (R25).” Applications due November 14, 2016. For additional information, contact the BD2K Training Team at bd2k_training@mail.nih.gov.
- BD2K Commons Credits Model Opportunity: NIH call for organizations to become conformant providers as part of the Commons Credits Model. For details, visit: https://www.fbo.gov/index?s=opportunity&mode=form&id=b45f1d0703f3da22dae26291ec45e5a5&tab=core&_cview=0.
- NIH call for Public Feedback for DataMed DDI Prototype: Developed through the BD2K biomedical and healthCAre Data Discovery Indexing Ecosystem project (bioCADDIE), the prototype allows users to find and access biomedical datasets from multiple sources based on key attributes. DataMed is an element of the NIH BD2K Commons, the vision for an interconnected digital ecosystem of resources around data and other research digital objects. DataMed is a work in progress and the bioCADDIE development team welcomes your feedback here. For more information, contact E.Gururaj@uth.tmc.edu or biocaddie@ucsd.edu.