Predictive Modeling
Classification
- What are the advantages of different classification algorithms?
- What are the advantages of using a decision tree for classification?
- What are the disadvantages of using a decision tree for classification?
- What are the advantages of logistic regression over decision trees? Are there any cases where it’s better to use logistic regression instead of decision trees?
- What is the difference between Linear SVMs and Logistic Regression?
- How do random forests work in layman’s terms? I’m specifically interested in how RFs do regression problems.
Regression
- What is an intuitive explanation of a multivariate regression?
- How would linear regression be described and explained in layman’s terms?
- What is the difference between linear regression and least squares?
Binary classification error metrics
When are fewer predictors preferable?
Model Evaluation
- What is an intuitive explanation of cross-validation?
- What are the pitfalls of relying on cross-validation to select models?
Overfitting
- What is an intuitive explanation of overfitting, particularly with a small sample set? What are you essentially doing by overfitting? How does the over-promise of a high R² and a low standard error occur?
- How can I avoid overfitting?
Techniques
Outliers
Missing Data
Misc
Example Predictive Modeling Tasks
- Kaggle – Predicting a March Madness bracket
- Is there any summary of top models for the Netflix prize? What are the high level and intuitive ideas behind the winning models that were finally used in the ensemble learning by top teams?