Open Data Group Tutorial at O’Reilly Strata Conference in NYC

On October 23, 2012, Robert Grossman and Collin Bennett from Open Data Group will give a tutorial at the O’Reilly Strata Conference in New York City on “Best Practices for Building and Deploying Predictive Models over Big Data.”

The slides and some related materials can be downloaded from

The 3.5-hour tutorial consists of 12 modules:

  1. Introduction
  2. Building Predictive Models – EDA and Building Features
  3. Case Study: MalStone
  4. Working with Multiple Models: Ensembles and Segments
  5. Case Study: CTA
  6. Deploying Predictive Models Using PMML-based Scoring Engines
  7. Three Ways to Build Models over Hadoop Using R
  8. Case Study: Building Trees over Big Data
  9. Improving the Impact of a Model in Operations – The SAMS Methodology
  10. Case Study: AdReady
  11. Quantifying the Lift of a Predictive Model and Improving It
  12. Case Study: Matsu

Open Data Group helped pioneer some of the technology behind topics 2, 3, 6, 7, 8 and 9. For example, you can follow the links to learn more about the MalStone Benchmark, the Multiple Model component of the DMG’s PMML standard, and Project Matsu, which uses MapReduce to process and analyze images.

If you are at the Strata Conference, please stop by to say hello.
