Variable | Coefficient | Std Error | p-value | 95% CI |
---|---|---|---|---|
Intercept | 9.620 | 0.007 | <0.001 | [9.61, 9.64] |
Limited English Proficiency | -0.038 | 0.010 | <0.001 | [-0.058, -0.019] |
High School | 0.227 | 0.007 | <0.001 | [0.212, 0.241] |
Some College | 0.274 | 0.009 | <0.001 | [0.256, 0.292] |
Bachelor's Degree | 0.402 | 0.008 | <0.001 | [0.387, 0.416] |
Graduate Degree | 1.140 | 0.007 | <0.001 | [1.12, 1.15] |
Male Gender | 0.419 | 0.003 | <0.001 | [0.413, 0.426] |

Note: Reference categories are less than high school education, female gender, and English proficient.
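
As a rough consistency check on the table, the intervals can be approximated from each coefficient and its standard error. This is a minimal sketch that assumes the reported intervals are normal-approximation (Wald) intervals, which the table does not state; the function name `wald_ci` and the 1.96 multiplier are illustrative choices, not taken from the original analysis.

```python
# Normal-approximation 95% interval: estimate +/- 1.96 * standard error.
# Assumption: the table's intervals were built this way (not confirmed by the source).
Z_95 = 1.96


def wald_ci(coef, se, z=Z_95):
    """Return (lower, upper) bounds of an approximate confidence interval."""
    half_width = z * se
    return coef - half_width, coef + half_width


# (coefficient, std error) pairs copied from the table above.
rows = {
    "Limited English Proficiency": (-0.038, 0.010),
    "High School": (0.227, 0.007),
    "Male Gender": (0.419, 0.003),
}

for name, (coef, se) in rows.items():
    low, high = wald_ci(coef, se)
    print(f"{name}: [{low:.3f}, {high:.3f}]")
```

Because the table shows rounded coefficients and standard errors, the recomputed bounds can differ from the reported ones in the last displayed digit.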
Model
Data Generating Mechanism
The outcome variable in our model is the natural logarithm of income (`log_income`), and it is modeled as a linear function of English proficiency (`lep`), education level (`educ_level`), and gender (`gender`). The fitted model is:
\[ \begin{aligned} \log(\text{income})_i =\;& 9.62 \\ &- 0.0385 \cdot \text{LEP}_i \\ &+ 0.227 \cdot \text{HighSchool}_i \\ &+ 0.274 \cdot \text{SomeCollege}_i \\ &+ 0.402 \cdot \text{Bachelors}_i \\ &+ 1.14 \cdot \text{Graduate}_i \\ &+ 0.419 \cdot \text{Male}_i \\ &+ \varepsilon_i \end{aligned} \]
Where:
- LEP = 1 if individual has limited English proficiency, 0 otherwise
- HighSchool, SomeCollege, Bachelors, Graduate are dummy variables for education level
- Male = 1 if individual is male, 0 if female
- The reference category is female English-proficient individuals with less than high school education
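
The dummy coding above can be sketched as a small function that adds up the relevant coefficients for one individual. The function name `predict_log_income`, the dictionary layout, and the education labels are hypothetical and chosen for illustration; only the numeric coefficients come from the fitted equation.

```python
# Point estimates copied from the fitted equation above.
INTERCEPT = 9.62
LEP_COEF = -0.0385
MALE_COEF = 0.419

# Education dummies; the reference level contributes nothing.
# The label strings are illustrative, not the dataset's actual level names.
EDUC_COEFS = {
    "Less than HS": 0.0,
    "High School": 0.227,
    "Some College": 0.274,
    "Bachelors": 0.402,
    "Graduate": 1.14,
}


def predict_log_income(lep, educ_level, male):
    """Predicted log income for one individual (error term omitted)."""
    value = INTERCEPT
    if lep:
        value += LEP_COEF
    value += EDUC_COEFS[educ_level]
    if male:
        value += MALE_COEF
    return value


# Reference individual: female, English proficient, less than high school.
print(predict_log_income(lep=False, educ_level="Less than HS", male=False))

# A male with limited English proficiency and a bachelor's degree.
print(predict_log_income(lep=True, educ_level="Bachelors", male=True))
```

For the reference individual every dummy is zero, so the prediction reduces to the intercept.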
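To recover predicted income in dollars, we exponentiate the predicted log income:
\[ \widehat{\text{income}}_i = \exp\!\left(\widehat{\log(\text{income})}_i\right) \]
Because the model is fit on the log scale, this back-transformed value estimates the geometric mean of income (the median, if the errors are symmetric) rather than the arithmetic mean.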
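
Exponentiation also gives the coefficients a multiplicative reading: a coefficient b on a dummy variable corresponds to a proportional difference of exp(b) - 1 in income relative to the reference level, holding the other variables fixed. The sketch below applies this to a few of the point estimates; the helper names `to_dollars` and `pct_difference` are hypothetical.

```python
import math

# Point estimates copied from the fitted equation above.
INTERCEPT = 9.62
COEFS = {
    "LEP": -0.0385,
    "Graduate": 1.14,
    "Male": 0.419,
}


def to_dollars(log_income):
    """Back-transform a value on the log scale to dollars."""
    return math.exp(log_income)


def pct_difference(coef):
    """Proportional income difference implied by a dummy coefficient."""
    return math.exp(coef) - 1.0


# Predicted income for the reference individual (all dummies zero).
print(f"Reference group: ${to_dollars(INTERCEPT):,.0f}")

for name, coef in COEFS.items():
    print(f"{name}: {pct_difference(coef):+.1%} relative to reference")
```

For small coefficients such as the one on LEP, exp(b) - 1 is close to b itself, which is why -0.0385 reads as roughly a 3.8% lower income; for large coefficients such as Graduate, the two diverge sharply and the exponentiated form should be used.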