Augustus at KDD

Collin Bennett and James Pivarski of Open Data debuted Augustus 0.6 at the KDD 2013 Big Data Camp. If you were not able to attend, a KDD 2013 trail has been added to the Interactive Documentation running on Google’s App Engine.

Dr. Pivarski also spoke about how PMML can be extended using Augustus at the PMML ½-day workshop in support of the accepted paper Augustus 0.6: Design and Implementation of a Hot-Swappable PMML Scoring Engine.

Augustus is an open source statistical toolkit maintained by Open Data. It is PMML 4.1-compliant and both builds and consumes PMML models. It is available on GitHub and Google Code.

More information about KDD 2013 can be found here.

Posted in big data, Blog, PMML, predictive analytics

Open Data Founder named to Federal 100 Award List

Robert Grossman, founding partner of Open Data, has been named by Federal Computer Week to its Federal 100 Award list. The 24th annual list recognizes government and industry leaders who have played pivotal roles in the federal government IT community. Dr. Grossman is part of an elite group of individuals who have gone above and beyond their daily responsibilities and have made a difference in the way technology has transformed or accelerated the mission of the agencies they support. He and the other winners will be honored during ceremonies on March 20, 2013, at the Grand Hyatt in Washington, DC.

Grossman is widely recognized as an expert in large data, including the design of analytic architectures deployed in cloud environments involving petabytes of data. Over his career, Dr. Grossman has served in a variety of key advisory capacities, helping government agencies meet the complex challenges of information technology, security, and oversight.

In addition to his work with Open Data’s commercial and government clients, Grossman is a faculty member at the University of Chicago, where he is the Director of Informatics at the Institute for Genomics and Systems Biology, a Senior Fellow at the Computation Institute, and a Professor of Medicine in the Section of Genetic Medicine. He also serves on the NASA Advisory Council, chairs the Open Cloud Consortium, founded the Data Mining Group, is a Visiting Professor at the Booth School of Management, and is a frequent speaker and author on big data and intensive computing.

Open Data is proud of the well-deserved recognition, and pleased that Robert Grossman’s significant contributions are being honored in this way.

More about the Fed100 Award


Posted in analytic strategy, big data, Blog, news, predictive analytics

Automation, Algorithms, Predictive Models and All That

Earlier this week, I was one of the speakers at a panel that discussed how automation, algorithms, predictive models, and related technology have changed our lives.

The event was kicked off by Christopher Steiner, author of Automate This: How Algorithms Came to Rule Our World, who talked about some of the ways that algorithms are changing our lives, ranging from high-speed trading to medical diagnoses.

In addition to myself, the panel included:

  • Rayid Ghani, Chief Scientist of the Obama for America 2012 campaign
  • George John, Founder and CEO, rocketfuel
  • Keary Philips, Allstate Insurance Company
  • Rishad Tobaccowala, Chief Strategy and Innovation Officer, VivaKi

Rayid Ghani spoke about some of the ways that predictive analytics was used to persuade people who might not otherwise have voted to register, and later to show up at the polls and vote.

I argued that, as important as algorithms are, they are sometimes best thought of in the context of three questions: i) what are the Concepts and abstractions used to model the problem; ii) what are the Algorithms used to compute with these abstractions; and iii) what are the Devices that the algorithms run on? CAD for short.

It is interesting to look at big data from the perspective of the concepts, algorithms and devices. We are better at predictive analytics today not just because we have better algorithms, but also because we have made significant progress on the concepts and abstractions that underlie predictive analytics and on the devices we use.

For example, 20 years ago with big data and predictive analytics, the focus was on building a single statistical model and looking for knowledge; we generally used regression algorithms to analyze data; and we used high-end workstations for the computations. Today, with big data, we tend to think of collections of models (ensembles, cubes of models, etc.) and focus on the actions (not the knowledge) that are possible; we would more typically use algorithms that compute trees or support vector machines; and we do computations over clusters of workstations.
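As a rough illustration of the shift from a single model to a collection of models, here is a minimal Python sketch. It is not from the panel discussion: the function names, the toy data, and the choice of bagging (averaging models fit on bootstrap resamples) are all assumptions made for the example. For brevity the ensemble members are simple least-squares lines rather than the trees or support vector machines mentioned above.

```python
import random
from statistics import mean


def fit_line(points):
    """Ordinary least squares for y = a + b*x on a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # Every x is identical, so no slope can be estimated:
        # fall back to a constant model.
        return y_bar, 0.0
    b = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return y_bar - b * x_bar, b


def predict_line(model, x):
    """Evaluate a fitted (intercept, slope) pair at x."""
    a, b = model
    return a + b * x


def fit_ensemble(points, n_models=25, seed=0):
    """Bagging: fit one line per bootstrap resample of the data."""
    rng = random.Random(seed)
    return [
        fit_line([rng.choice(points) for _ in points])
        for _ in range(n_models)
    ]


def predict_ensemble(models, x):
    """Average the predictions of every model in the collection."""
    return mean(predict_line(m, x) for m in models)


if __name__ == "__main__":
    # Hypothetical toy data: a noisy line, generated with a fixed seed.
    noise = random.Random(42)
    data = [(float(x), 2.0 * x + 1.0 + noise.gauss(0.0, 1.0)) for x in range(20)]

    single = fit_line(data)           # the "one statistical model" approach
    ensemble = fit_ensemble(data)     # the "collection of models" approach

    print("single model prediction at x=25:", predict_line(single, 25.0))
    print("ensemble prediction at x=25:   ", predict_ensemble(ensemble, 25.0))
```

On well-behaved data like this the two approaches give similar answers. Ensembles tend to help more with unstable base models such as trees, where individual fits vary a lot from one resample to the next.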

There is more about CAD in Chapter 1 of my book on the Structure of Digital Computing.

Posted in analytic models, big data, Blog