On October 23, 2012, Robert Grossman and Collin Bennett from Open Data Group will give a tutorial at the O’Reilly Strata Conference in New York City on “Best Practices for Building and Deploying Predictive Models over Big Data.”
The slides and some related materials can be downloaded from tutorials.opendatagroup.com.
The 3.5-hour tutorial consists of 12 modules:
- Building Predictive Models – EDA and Building Features
- Case Study: MalStone
- Working with Multiple Models: Ensembles and Segments
- Case Study: CTA
- Deploying Predictive Models Using PMML-based Scoring Engines
- Three Ways to Build Models over Hadoop Using R
- Case Study: Building Trees over Big Data
- Improving the Impact of a Model in Operations – The SAMS Methodology
- Case Study: AdReady
- Quantifying the Lift of a Predictive Model and Improving It
- Case Study: Matsu
Open Data Group helped pioneer some of the technology behind topics 2, 3, 6, 7, 8 and 9. For example, you can follow the links to learn more about the MalStone Benchmark, the Multiple Model component of the DMG’s PMML standard, and Project Matsu, which uses MapReduce to process and analyze images.
If you are at the Strata Conference, please stop by to say hello.