# Predictive Modeling

**Classification**

- What are the advantages of different classification algorithms?
- What are the advantages of using a decision tree for classification?
- What are the disadvantages of using a decision tree for classification?
- What are the advantages of logistic regression over decision trees? Are there any cases where it’s better to use logistic regression instead of decision trees?
- What is the difference between linear SVMs and logistic regression?
- How do random forests work in layman’s terms? I’m specifically interested in how RFs handle regression problems.

**Regression**

- What is an intuitive explanation of a multivariate regression?
- How would linear regression be described and explained in layman’s terms?
- What is the difference between linear regression and least squares?
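
As a rough illustration of the last question: linear regression is the model (a straight-line relationship), while least squares is one criterion for fitting it (choose the line that minimizes the sum of squared residuals). The sketch below fits a one-predictor line with the closed-form least-squares solution. The function name `fit_least_squares` and the toy data are made up for this example.

```python
def fit_least_squares(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data (hypothetical) lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_least_squares(xs, ys)
```

The same linear model could be fitted under a different criterion (for example, least absolute deviations), which is why the two terms are not interchangeable.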

**Binary classification error metrics**

**When are fewer predictors preferable?**

**Model Evaluation**

- What is an intuitive explanation of cross-validation?
- What are the pitfalls of relying on cross-validation to select models?
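
For readers who want something concrete alongside these questions, here is a minimal sketch of k-fold cross-validation: split the data into k folds, hold each fold out in turn, fit on the rest, and average the held-out error. The helper names `k_fold_splits` and `cross_val_mse`, the mean-only "model", and the toy targets are all assumptions made for illustration. Real data would normally be shuffled before splitting.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_val_mse(ys, k):
    """Average held-out mean squared error of a predictor that always outputs the training mean."""
    fold_errors = []
    for train, test in k_fold_splits(len(ys), k):
        prediction = sum(ys[i] for i in train) / len(train)  # "fit" on training folds only
        mse = sum((ys[i] - prediction) ** 2 for i in test) / len(test)
        fold_errors.append(mse)
    return sum(fold_errors) / len(fold_errors)


# Toy targets (hypothetical).
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
splits = list(k_fold_splits(len(ys), 3))
score = cross_val_mse(ys, 3)
```

One pitfall is visible even here: if many candidate models are compared on the same folds, the best score is an optimistic estimate of that model's error on new data.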

**Overfitting**

- What is an intuitive explanation of overfitting, particularly with a small sample set? What are you essentially doing by overfitting? How does the over-promise of a high R² and low standard error occur?
- How can I avoid overfitting?
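
A small sketch can make the small-sample case tangible. With five noisy points, a degree-4 polynomial passes through every point (zero training error), yet it predicts held-out points worse than a plain least-squares line. All data values below are made up for illustration, the helper names are hypothetical, and the held-out targets are assumed to follow the noise-free relationship y = x.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the unique polynomial through all (xs, ys) points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def fit_line(xs, ys):
    """Least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical noisy training sample around y = x.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.3, 0.8, 2.4, 2.7, 4.3]
# Held-out points, assumed noise-free for simplicity.
test_x = [0.5, 1.5, 2.5, 3.5]
test_y = [0.5, 1.5, 2.5, 3.5]

slope, intercept = fit_line(train_x, train_y)

# The interpolating polynomial reproduces the training data exactly ...
train_mse_poly = mse([lagrange_predict(train_x, train_y, x) for x in train_x], train_y)
# ... but does worse than the straight line on the held-out points.
test_mse_poly = mse([lagrange_predict(train_x, train_y, x) for x in test_x], test_y)
test_mse_line = mse([slope * x + intercept for x in test_x], test_y)
```

The perfect training fit is the "over-promise": the flexible model has memorized the noise in the five points rather than the underlying trend.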

**Techniques**

**Outliers**

**Missing Data**

**Misc**

**Example Predictive Modeling Tasks**