News

Robert Grossman Joins NASA ITIC

January 2010

Robert Grossman, Managing Partner of Open Data Group, has been invited to join the Information Technology Infrastructure Committee of the NASA Advisory Council. In this role, he will contribute to the Committee’s mission to support all NASA information technology infrastructure-related programs, projects, activities and facilities, including high performance computing.

Analytics on Demand

November 2009

Open Data Group has created Amazon EC2 AMIs to deliver analytics services over the cloud. Available as a 32-bit or 64-bit image, each includes:

  • Augustus v0.4.0
  • R with XTS
  • Python 2.6 (with NumPy)
  • Octave 3.2
  • MySQL 5.0

The AMI ID, additional deployment information, and support can be found at our Google Code project, http://augustus.googlecode.com

Comprehensive Change Detection Suite

October 2009

Open Data Group has launched a change detection project on Google Code, http://code.google.com/p/change-detection/.

This is an introduction to, and demonstration of, using open source software and the Data Mining Group’s Predictive Model Markup Language (PMML) standard to perform data analytics. Specifically, we show how multiple Baseline models over segments can be used to detect anomalous behavior.
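
As a rough illustration of the segmented-baseline idea, the following Python sketch builds one baseline (mean and standard deviation) per segment and flags observations that fall far from their own segment's baseline. It does not use Augustus or PMML; the function names, the z-score test, the threshold of 3.0, and the sample data are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baselines(records):
    """Build one (mean, std) baseline per segment from (segment, value) pairs."""
    by_segment = defaultdict(list)
    for segment, value in records:
        by_segment[segment].append(value)
    return {seg: (mean(vals), pstdev(vals)) for seg, vals in by_segment.items()}


def is_anomalous(baselines, segment, value, threshold=3.0):
    """Flag a value whose z-score against its segment's baseline exceeds threshold."""
    mu, sigma = baselines[segment]
    if sigma == 0:
        # Degenerate baseline: any departure from the constant value is flagged.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical training data: two segments with very different normal levels.
training = [("east", 10), ("east", 12), ("east", 11),
            ("west", 100), ("west", 104), ("west", 102)]
baselines = build_baselines(training)
```

In this sketch a value of 100 is ordinary for the "west" segment but far outside the "east" baseline, a distinction that a single baseline over all the data would blur.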

Case studies, sample data sets, and access to an open source suite of analytic software are available.

Announcing PMML 4.0 compliant Augustus

September 2009

Augustus, an open source analytic scoring engine that works with segmented models, is compliant with the recently adopted PMML 4.0 standard. Augustus is designed for use with statistical and data mining models. The new release provides Baseline, Tree and Naive-Bayes producers and consumers. The release, along with training, documentation and support, can be found at our Google Code project, http://code.google.com/p/augustus/

Augustus is typically used to construct models and to score data with them. It includes a dedicated application for creating, or producing, predictive models rendered as PMML-compliant files. Scoring is accomplished by consuming PMML-compliant files describing an appropriate model. The typical model development and use cycle with Augustus is as follows:

  1. Identify suitable data with which to construct a new model.
  2. Provide a model schema which prescribes the requirements for the model.
  3. Run the Augustus producer to obtain a new model.
  4. Run the Augustus consumer on new data to effect scoring.
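
The produce-then-consume cycle above can be sketched in a few lines of Python. This is not the Augustus API and the XML it writes is not valid PMML: the `produce` and `consume` functions, the `BaselineSketch` element, and the sample values are hypothetical stand-ins that only show the shape of the workflow, with the schema step reduced to a hard-coded choice of a mean/standard-deviation baseline.

```python
import xml.etree.ElementTree as ET
from statistics import mean, pstdev


def produce(training_values):
    """Producer step: fit a baseline and serialize it as an XML string."""
    model = ET.Element("BaselineSketch")  # hypothetical element, not a PMML tag
    model.set("mean", repr(mean(training_values)))
    model.set("std", repr(pstdev(training_values)))
    return ET.tostring(model, encoding="unicode")


def consume(model_xml, new_values):
    """Consumer step: load the serialized model and score each new value.

    Scores are z-scores; this sketch assumes the stored std is nonzero.
    """
    model = ET.fromstring(model_xml)
    mu = float(model.get("mean"))
    sigma = float(model.get("std"))
    return [(v - mu) / sigma for v in new_values]


model_xml = produce([10.0, 12.0, 11.0, 13.0, 9.0])  # steps 1-3: data in, model out
scores = consume(model_xml, [11.0, 20.0])           # step 4: score new data
```

The point of the split is that the consumer needs only the serialized model, not the training data or the producer code, which is the role PMML files play in the real cycle.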

MalStone v0.8.2 released

MalStone, a stylized analytic computation that serves as a benchmark for clouds designed for data intensive computing, is available at the Google Code project of the Open Cloud Consortium. MalStone is designed to use records generated by MalGen, which is available from the same project.