We invented Reduced Error Logistic Regression (RELR). RELR solves predictive modeling problems, including variable selection, as a single-step maximum probability optimization that yields roughly the same reliable and accurate model across independent training samples. RELR's advantage is greatest in high dimensional and/or small sample size problems, where traditional methods rely on arbitrary dimension reduction and variable selection, and where testing those arbitrary parameters can be very labor intensive. RELR's dimension reduction and variable selection are not arbitrary, so all modelers will immediately generate the same maximum probability solution given the same training sample. Compared to traditional predictive modeling methods, RELR is much less likely to show overfitting and multicollinearity error. A RELR model built with a sample size of 1,000 can be as accurate as a model that requires 50,000-100,000 observations with traditional methods. (As an example of these results, see the JSM Proceedings technical paper that can be downloaded from the Papers and Presentations page. Or, for a non-technical introduction that includes such an example, see the Executive White Paper.)
Our MyRELR software automates all aspects of explanatory regression model building. An accurate explanatory regression model with high dimensional data that may take three weeks to build with standard methods could be easily built automatically and immediately with MyRELR. More importantly, RELR models are non-arbitrary and most probable models. The only truly "arbitrary" parameter in a RELR model build is the size of the training sample, as more accurate models with larger numbers of selected variables will tend to be found with larger training sample sizes. Yet, RELR models also approach a point of maximum accuracy very quickly in terms of training sample size compared to standard methods, especially in high dimensional problems. For all of these reasons, RELR models can have much better classification and probability accuracy than existing methods with small training sample sizes and/or high dimensional problems.
Because RELR models error and subtracts it as part of the maximum probability regression in a single-step optimization process, it eliminates the substantial labor typically devoted to arbitrary parameter choices, cross validation, and error problems in high dimensional or small sample size models. Because of this reduced error, RELR does not require large sample sizes for accurate and valid models. This effectively removes large data processing and large data collection costs from predictive model building. Because MyRELR models typically retain very few variables after variable selection, scoring large datasets with previously built models can also be much more efficient with MyRELR. MyRELR translates into enormous savings in data collection, data processing, labor, and error costs.
Recent News
May 18, 2010. St. Louis, MO (USA) - We announce today that we plan to have a cloud version of RELR available sometime next year. We will call this cloud version SkyRELR™ to differentiate it from our MyRELR™ product, which we have been successfully selling to users with internal SAS installations. SkyRELR will be entirely GUI-based and thus will hopefully appeal to more than just SAS users. The move to the cloud with SkyRELR will have no effect on our continued effort to sell MyRELR to organizations with existing SAS installations, as we will continue to support and sell MyRELR. Instead, the SkyRELR product will be an attempt to broaden the appeal of RELR to a much wider audience. Because RELR gives accurate models with small sample sizes, RELR does not require large datasets to build a model. Thus, RELR is ideal for a predictive analytics cloud implementation. A main stumbling block in previous predictive analytics cloud implementations has often been the time and effort needed to put large datasets up in the cloud.
January 8, 2010. St. Louis, MO (USA) - We announce today that Rice Analytics has decided to name its flagship Reduced Error Logistic Regression (RELR) software product MyRELR. In the four preceding years of research, development, beta tests, and initial rollout, it was simply called RELR. RELR and Reduced Error Logistic Regression are terms for the statistical regression method, but they do not lend themselves to use as names for a branded software product. MyRELR is the name chosen because it fits the software product category, it incorporates the previous identity of RELR, and it can be trademarked. Because the terms for the regression method and the software product can usually be used interchangeably, we will continue to use the term RELR, but we will now use MyRELR in specific reference to the branded software product.
September 9, 2009. St. Louis, MO (USA) - Our new executive white paper, written by Dan Rice and entitled "Breiman's Quiet Scandal: Stepwise Logistic Regression and RELR", appeared in the Publications section of the online industry newsletter KDnuggets.com on August 27, 2009 (issue 09:n16). This item had the Most Clicks by Subscribers and was the 2nd Most Viewed item overall of the 41 items that were published that week. This article, written in "plain business English" for executives, reviews the major difficulties with Stepwise Logistic Regression that were pointed out by the late statistician Leo Breiman. This article also reviews evidence that our RELR method may be a solution to these problems. The complete white paper can be downloaded from the Executive White Paper page of this website.
June 15, 2009. St. Louis, MO (USA) - Dan Rice gave an invited address last week at the 2009 Classification Society Annual Conference, held June 11-13 at the Washington University Medical School. This conference brought together roughly a hundred experts from major universities and businesses in the areas of machine learning, choice modeling, and classification research. This conference was truly international in scope and had attendees from many industrialized countries. However, the relatively small size of this conference compared to JSM allowed for extended discussion among attendees over the course of several days. The title of Rice's talk was "Reduced Error Logistic Regression". This talk can be downloaded from the Papers and Presentations page of this website.
June 3, 2009. St. Louis, MO (USA) - We have now updated the Case Studies page of this website with credit scoring results from three major banks and one credit card company. The most impressive result is that one user reports that RELR lifted the KS statistic from roughly 40 to 65 relative to other methods.
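For readers unfamiliar with the KS statistic quoted above, the following is a minimal sketch of how it is conventionally computed in credit scoring: the maximum gap between the cumulative score distributions of the two outcome classes, reported on a 0-100 scale. The function name `ks_statistic` and the toy data are illustrative assumptions. This is not MyRELR code, and the case studies do not state how their KS values were calculated.

```python
from bisect import bisect_right


def ks_statistic(scores, labels):
    """Two-sample KS statistic on a 0-100 scale (hypothetical helper).

    scores: model scores, one per observation.
    labels: 1 for the target class (e.g. defaulters), 0 otherwise.
    """
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("need at least one observation in each class")

    best = 0.0
    # Evaluate both empirical CDFs at every distinct observed score.
    for t in sorted(set(scores)):
        cdf_pos = bisect_right(pos, t) / len(pos)
        cdf_neg = bisect_right(neg, t) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return 100.0 * best


# Toy data, invented for illustration only.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(ks_statistic(scores, labels))
```

On this scale, 0 means the model's scores do not separate the two classes at all and 100 means they separate them perfectly.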
February 12, 2009. St. Louis, MO (USA) - Rice Analytics, a SAS Alliance Partner and the exclusive provider of Reduced Error Logistic Regression (RELR) software, announced today that it is proud to be a sponsor of this year's Midwest SAS User Group Conference (MWSUG) in Cleveland, Ohio in October 2009. Dan Rice was a speaker at the 2008 MWSUG conference in a session on Reduced Error Logistic Regression. The MWSUG is one of the larger regional statistical conferences - approximately 300 people attended the 2008 conference in Indianapolis, Indiana.
September 2, 2008. St. Louis, MO (USA) - Rice Analytics announced today that it has invented a process to reduce the number of variables in RELR models substantially. This process is called Parsed Reduced Error Logistic Regression, or ParsedRELR™. Like Full RELR, ParsedRELR gives accurate validation sample models without problematic overfitting. Like Full RELR, ParsedRELR is suited for high dimensional datasets with hundreds of thousands of input variables and interactions. However, ParsedRELR gives extremely parsimonious solutions that often select fewer than ten variables with high dimensional and multicollinear datasets that might require hundreds of variables for an accurate Full RELR model. While other predictive modeling methods can select few variables, these same few variables are unlikely to be selected in models built from independent samples, so they have little explanatory meaning. In contrast, due to the error reduction features inherent in RELR, the same variables are likely to be selected in ParsedRELR models built from independent samples at much smaller sample sizes than is possible with other methods. ParsedRELR opens the door to highly accurate predictive models that are not just predictive, but also potentially explanatory.
August 6, 2008. Denver, CO (USA) - Dan Rice spoke today at the Data Mining and Machine Learning Session of the 2008 Joint Statistical Meetings in Denver, Colorado. This session was chaired by Bill Heavlin of Google Inc. and had good speakers from the United States Army, Medical University of China, University of Alabama, Bell Labs, and the University of California at Berkeley. This session was extremely well attended, with a standing-room-only crowd. The turnout and the lively discussions prompted Bill Heavlin to say that this session "was the best session at the conference". The title of Rice's talk was "Generalized Reduced Error Logistic Regression Machine". In this presentation, Rice provided evidence that Reduced Error Logistic Regression is able to reduce error significantly compared to Penalized Logistic Regression, Stepwise Logistic Regression, and four other standard methods. A full article coinciding with this talk and published in JSM 2008 Proceedings can be downloaded from the Papers and Presentations page of this website. The Joint Statistical Meetings is one of the largest gatherings of statisticians in the world. Approximately 5,000 people attended this conference in Denver this summer.