**Introduction to Econometrics**

**Course Description**

Econometric models based on least squares and advanced regression methods are the core of predictive analytics. This three-day class provides an introduction to econometrics, balancing mathematical theory with conceptual understanding and practical application. The first day develops the least squares regression estimators, model adequacy measures, tests of hypotheses, confidence and prediction intervals, and outlier measures, using matrix algebra in the R programming language and running models in SAS. Categorical independent variables with many levels are often problematic to model; this class will show how to incorporate indicator (dummy) variables effectively in the analysis. On the second day, we explore methods to detect when the least squares assumptions fail because of heteroskedasticity, multicollinearity, non-normal distributions, imprecise independent variable levels, or correlated error terms. Preferred approaches to accommodate these situations are weighted least squares, transformations, ridge regression, generalized linear models (including logistic regression), and instrumental variables. The third day focuses on time series data, using SAS to model and forecast in the presence of trends, seasonality, and seemingly random fluctuations. Econometric models covered include smoothing, Autoregressive Integrated Moving Average (ARIMA), transfer functions, and panel data analysis. The last day will also look at lifetime and survival data analysis. This is a hands-on workshop with numerous representative examples using R, SAS, and other software.

**Course Goals/Objectives**

A participant who successfully completes this course will:

- Understand several types of econometric models and their common applications
- Know the mathematical and statistical properties of least squares regression estimators, residuals, and model adequacy measures
- Be able to use the R programming language to develop econometric models from the matrix approach
- Know how to detect and correct for violations of least squares assumptions such as non-normality, non-constant variance, correlated errors, and dependencies between independent variables
- Be able to detect outliers and use modern methods to model in the presence of unusual observations
- Know how to effectively model qualitative independent variables with indicator (dummy) variables and qualitative dependent variables with generalized linear models (e.g. logistic regression)
- Understand the difference between random and fixed effects models
- Know how to model time-dependent data and account for trend, seasonality, and other structure using ARIMA, autoregressive, and transfer function modeling
- Know how to develop econometric models, write programs using procedures from SAS and other statistical software, and interpret the resulting output

**Course Outline**

- Overview of Econometrics
- Fundamental ideas (variable types, model forms, output)
- Applications in industry
- Example

- Probability and Statistics Review
- Normal Distribution and tests for normality
- Sampling Distributions and Central Limit Theorem
- One- and Two-Sample t-Tests and ANOVA

- Ordinary Least Squares Regression
- Simple Linear Regression Formulation
- Mathematical Development of OLS Estimator
- Matrix Algebra to form models using R
- Gauss-Markov Theorem
- Estimator variance and use of the $(X'X)^{-1}$ matrix
- Coefficient Interpretation, Tests of Significance, and Confidence Intervals
- Example with SAS
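
As a brief illustration of the matrix formulation listed above, here are the standard textbook results (not taken from the course materials). The notation is an assumption: $y$ is the $n \times 1$ response vector, $X$ the $n \times p$ design matrix of full column rank, $\beta$ the coefficient vector, and $\varepsilon$ an error term with $E[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2 I$.

$$
y = X\beta + \varepsilon, \qquad
\hat{\beta} = (X'X)^{-1} X'y, \qquad
\operatorname{Var}(\hat{\beta}) = \sigma^2 (X'X)^{-1}
$$

Under these assumptions the Gauss-Markov theorem states that $\hat{\beta}$ has the smallest variance among linear unbiased estimators, which is why the $(X'X)^{-1}$ matrix appears throughout inference on the coefficients.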

- Residual Diagnostics and Model Adequacy
- Residual definition and types of residuals
- $R^{2}$, Adjusted $R^{2}$, and Predicted $R^{2}$
- Confidence and prediction intervals on observations
- Assumptions of OLS and tests
- Example in SAS
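
To make the adequacy measures above concrete, here is a minimal sketch in plain Python. It is not course material: the function names and the toy data are made up for illustration, and the course itself works in R and SAS. It fits a simple linear regression in closed form and computes $R^2$ and adjusted $R^2$ from the residual and total sums of squares.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(x, y, intercept, slope, n_predictors=1):
    """Return (R^2, adjusted R^2) for a fitted simple regression."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    n = len(y)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 penalizes the number of predictors in the model.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    return r2, adj_r2


# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)
r2, adj_r2 = r_squared(x, y, b0, b1)
print(f"intercept={b0:.3f} slope={b1:.3f} R2={r2:.4f} adjusted R2={adj_r2:.4f}")
```

Adjusted $R^2$ is never larger than $R^2$, since it discounts the fit by the degrees of freedom used.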

- Leverage Points and Outliers
- Detecting influential observations and outliers
- Robust Regression methods
- Example in SAS
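
One common way to detect leverage points is through the diagonal of the hat matrix. The sketch below is illustrative only and assumes a simple regression with an intercept, where the leverage of observation $i$ is $h_i = 1/n + (x_i - \bar{x})^2 / S_{xx}$. The cutoff $2p/n$ is a widely used rule of thumb rather than a formal test, and the data are hypothetical.

```python
def leverages(x):
    """Hat-matrix diagonal for simple regression with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


def high_leverage_indices(x, n_params=2):
    """Indices whose leverage exceeds the 2p/n rule of thumb."""
    cutoff = 2 * n_params / len(x)
    return [i for i, h in enumerate(leverages(x)) if h > cutoff]


# Hypothetical predictor values; the last point sits far from the rest.
x = [1.0, 2.0, 3.0, 4.0, 20.0]

h = leverages(x)
flagged = high_leverage_indices(x)
print("leverages:", [round(v, 3) for v in h])
print("high-leverage observations (0-based):", flagged)
```

The leverages sum to the number of model parameters (two here), which is a quick sanity check on the calculation.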

- Qualitative or Indicator (Dummy) Variables
- Construction of indicator variables
- Effects on the intercept, the slope, or both
- Example in SAS
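
Constructing indicator variables can be sketched as follows. This is an assumption-laden illustration, not the course's method: the helper `make_indicators` and the `regions` data are hypothetical. A categorical variable with $k$ levels is coded as $k - 1$ zero/one columns, with one level held out as the reference so that the columns are not collinear with the intercept.

```python
def make_indicators(values, reference):
    """Code a categorical variable as k-1 indicator columns.

    Returns the non-reference levels (sorted) and one row of 0/1
    indicators per observation, in the same order as the levels.
    """
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Hypothetical categorical predictor with three levels.
regions = ["east", "west", "south", "east", "south"]

levels, rows = make_indicators(regions, reference="east")
print("columns:", levels)
for region, row in zip(regions, rows):
    print(region, row)
```

Each coefficient on an indicator column is then read as a shift relative to the reference level ("east" in this sketch).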

- Heteroskedasticity and Multicollinearity
- Impact and detection of non-constant variance
- Weighted least squares and transformations
- Impact and detection of dependence between independent variables
- Methods to correct multicollinearity, including ridge regression
- Example in SAS
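
A standard diagnostic for dependence between independent variables is the variance inflation factor (VIF). The sketch below is illustrative only and assumes the two-predictor case, where the VIF of each predictor reduces to $1 / (1 - r^2)$ with $r$ the sample correlation between them. All names and data are hypothetical.

```python
from math import sqrt


def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictors: x2 nearly duplicates x1, x3 is uncorrelated with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
x3 = [2.0, -1.0, 0.0, -1.0, 2.0]

vif_collinear = vif_two_predictors(x1, x2)
vif_independent = vif_two_predictors(x1, x3)
print(f"VIF(x1, x2) = {vif_collinear:.1f}")
print(f"VIF(x1, x3) = {vif_independent:.1f}")
```

A VIF near 1 indicates little inflation of the coefficient variance; values above roughly 10 are often treated as a warning sign, though that threshold is a convention rather than a rule.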

- Generalized Linear Models and Random Effects
- Logistic Regression for binary responses
- Other useful GLMs
- Random versus Fixed Effects
- Example in SAS

- Time Series
- Characteristics of time series and autocorrelation
- Smoothing methods
- Autoregressive Integrated Moving Average (ARIMA) modeling
- PROC AUTOREG
- Transfer Functions
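
As one example of the smoothing methods listed above, simple exponential smoothing can be sketched in a few lines. This is an illustration under stated assumptions, not the course's SAS workflow: the smoothed series starts at the first observation, `alpha` is a smoothing weight between 0 and 1, and the data are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha*x_t + (1 - alpha)*s_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # assumption: initialize with the first observation
    for x_t in series[1:]:
        smoothed.append(alpha * x_t + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical short series, chosen only for illustration.
demand = [10.0, 12.0, 14.0]

smoothed = exponential_smoothing(demand, alpha=0.5)
print(smoothed)
```

Larger values of `alpha` track recent observations more closely, while smaller values give a smoother series that reacts more slowly.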

- Other Useful Tools
- Instrumental variables
- Panel data
- Lifetime, survival, and censored data modeling