Software for Analytics

There are many software applications available for analytics. Rather than try to list all of them, we suggest that you visit the Data Mining Group’s Predictive Model Markup Language (PMML) web site. There you can find a list of the analytic applications that support PMML, the dominant standard for exchanging statistical and data mining models between applications.

  • The most widely deployed open source software application for statistical modeling is the R Project for Statistical Computing. Open Data has many years of experience developing and deploying analytic models using R. We have also developed R packages.
  • The most widely deployed open source software for working with big data is Hadoop. Open Data has been developing and deploying Hadoop-based applications since Hadoop was first released. Open Data specializes in building analytic models over Hadoop and arguably has more experience in this area than any other company.
  • Open Data Group has developed an open source Python-based application called Augustus for building and deploying statistical models that are PMML-compliant. You can download it from the Augustus Home Page. Augustus supports PMML-compliant segmented models and can easily create thousands of different segmented models.
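
To make the idea of a segmented model concrete, here is a minimal sketch in plain Python. It does not use Augustus or PMML; the function names, field names, and sample data are all hypothetical and chosen only for illustration. The sketch assumes the simplest possible per-segment model, a mean baseline, and fits one such model for each segment in the data.

```python
from collections import defaultdict
from statistics import mean


def fit_segmented_means(records, segment_key, value_key):
    """Fit one trivial model (a mean baseline) per segment.

    Hypothetical helper for illustration; not part of Augustus.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record[segment_key]].append(record[value_key])
    # One model per segment: here the "model" is just the segment mean.
    return {segment: mean(values) for segment, values in grouped.items()}


def score(models, segment, value):
    """Return how far a value deviates from its segment's baseline."""
    return value - models[segment]


# Made-up sample data with two segments.
records = [
    {"region": "east", "sales": 10.0},
    {"region": "east", "sales": 14.0},
    {"region": "west", "sales": 100.0},
    {"region": "west", "sales": 120.0},
]

models = fit_segmented_means(records, "region", "sales")
deviation = score(models, "east", 15.0)
```

With thousands of segments, the same pattern yields thousands of small models, each scored only against data from its own segment.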

Open Data’s outsourced services employ open source software whenever they can, for two important reasons. First, some of the highest-quality statistical software is open source; R and Hadoop are good examples. Second, by using open source software for its outsourced services, Open Data provides better value for its clients.