A real-time alertness monitoring solution using computer vision and the Eye Aspect Ratio (EAR) algorithm to detect driver drowsiness. The system achieves 92% detection precision by analyzing video at 30 frames per second using dlib's facial landmark detection.
https://github.com/user-attachments/assets/4a9a3a4b-d7e7-4a60-bbae-e1ad812c9860
Matplotlib & Seaborn - Performance visualization
drowsiness-detection/
│
├── README.md                                  # Project documentation
├── PROJECT_SUMMARY.md                         # Executive summary
├── requirements.txt                           # Python dependencies
│
├── src/
│   ├── driver_drowsiness_detection.py         # Main detection system
│   ├── test_and_calibrate.py                  # Testing & calibration
│   └── driver.py                              # Original implementation
│
├── models/
│   └── shape_predictor_68_face_landmarks.dat  # dlib model
│
├── audio/
│   └── alarm.wav                              # Alert sound
│
├── results/
│   ├── detection_metrics.json                 # Performance metrics
│   ├── detection_analysis.png                 # Analysis visualizations
│   ├── threshold_analysis.csv                 # Threshold testing results
│   └── frame_analysis.csv                     # Frame count impact analysis
│
└── docs/
    └── DDDS_PPT.pptx                          # Project presentation
# Verify Python version (3.8+ required)
python --version
# Upgrade pip
python -m pip install --upgrade pip
git clone <repository-url>
cd drowsiness-detection
pip install opencv-python
pip install scipy
pip install numpy
pip install pygame
pip install matplotlib
pip install seaborn
pip install pandas
pip install dlib  # may require Visual C++ build tools on Windows
# Download shape_predictor_68_face_landmarks.dat
# From: http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
# Extract and place in project root or models/ folder
Run the detection system:
python driver_drowsiness_detection.py
Run with video file:
python driver_drowsiness_detection.py --source path/to/video.mp4
Skip calibration:
python driver_drowsiness_detection.py --no-calibration
Custom threshold:
python driver_drowsiness_detection.py --threshold 0.18
Run comprehensive tests:
python test_and_calibrate.py
This will:
The EAR algorithm calculates the ratio of eye opening based on facial landmarks:
EAR = (||p2 - p6|| + ||p3 - p5||) / (2 * ||p1 - p4||)
where p1–p6 are the six facial landmarks around the eye: p2/p6 and p3/p5 form the two vertical pairs, and p1/p4 span the eye horizontally. The EAR stays roughly constant while the eye is open and drops toward zero as it closes.
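The formula above can be sketched in pure standard-library Python (using `math.dist`, available since Python 3.8). The landmark coordinates below are hypothetical, chosen only to illustrate how the ratio behaves for an open versus a nearly closed eye:

```python
from math import dist

def eye_aspect_ratio(p):
    """Compute EAR from six (x, y) eye landmarks p[0]..p[5].

    p[1]/p[5] and p[2]/p[4] are the vertical pairs; p[0]/p[3]
    span the eye horizontally (0-based indexing of p1..p6).
    """
    vertical = dist(p[1], p[5]) + dist(p[2], p[4])
    horizontal = dist(p[0], p[3])
    return vertical / (2.0 * horizontal)

# Hypothetical open-eye landmarks: EAR = (2 + 2) / (2 * 6) = 0.333
open_eye = [(0, 0), (2, -1), (4, -1), (6, 0), (4, 1), (2, 1)]

# Same eye nearly closed: vertical distances shrink, EAR drops to 0.067
closed_eye = [(0, 0), (2, -0.2), (4, -0.2), (6, 0), (4, 0.2), (2, 0.2)]
```

Note that the horizontal distance barely changes during a blink, which is what makes the ratio a robust, scale-invariant openness measure.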
Key Insights:
┌─────────────────┐
│   Video Input   │
│    (30 FPS)     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Face Detection  │
│   (dlib HOG)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    68 Facial    │
│    Landmarks    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Extract Eye   │
│   Coordinates   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Calculate EAR  │
│ (Left + Right)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   EAR < 0.16?   │
│   (Threshold)   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Consecutive   │
│  Frame Counter  │
│   (90 frames)   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     Binary      │
│ Classification  │
│  Alert/Drowsy   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Trigger Alarm  │
│   (If drowsy)   │
└─────────────────┘
The system implements binary state classification:
False Alarm Prevention:
The system performs automatic calibration:
threshold = mean_EAR - 1.5 * std_EAR
threshold = max(0.15, threshold) # Lower bound safety
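The calibration rule above can be sketched with the standard-library `statistics` module. The sample values are hypothetical open-eye EAR readings, standing in for the frames collected during the 5-second calibration window:

```python
from statistics import mean, stdev

def calibrate_threshold(ear_samples, floor=0.15):
    """Per-driver threshold: mean open-eye EAR minus 1.5 standard
    deviations, clamped to a safety floor so a noisy calibration
    run cannot push the threshold unrealistically low."""
    threshold = mean(ear_samples) - 1.5 * stdev(ear_samples)
    return max(floor, threshold)

# Hypothetical open-eye EAR samples from a calibration run
samples = [0.30, 0.32, 0.29, 0.31, 0.28, 0.30, 0.33, 0.29]
print(round(calibrate_threshold(samples), 3))
```

Subtracting 1.5 standard deviations places the threshold below almost all of the driver's normal open-eye variation, so ordinary fluctuation does not trip the drowsiness counter.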
| Metric | Value |
|---|---|
| Precision | 92% |
| Recall | 88% |
| F1-Score | 90% |
| Accuracy | 94% |
| False Alarm Rate | Reduced by 18% |
| Metric | Value |
|---|---|
| Frame Rate | 30 FPS |
| Latency | <35 ms per frame |
| Detection Window | 3 seconds (90 frames) |
| Calibration Time | 5 seconds |
Based on 5,000+ frame analysis:
Information Panel (Top):
Eye Visualization:
State Display (Center):
Alert Message (Bottom):
class Config:
    # EAR thresholds
    EAR_THRESHOLD = 0.16       # Drowsiness threshold
    EAR_CONSEC_FRAMES = 90     # 3 seconds at 30 FPS

    # Performance
    TARGET_FPS = 30            # Target frame rate
    FRAME_WIDTH = 640          # Video width
    FRAME_HEIGHT = 480         # Video height

    # Detection
    CALIBRATION_FRAMES = 150   # 5 seconds of calibration
    ALARM_COOLDOWN = 5         # Seconds between alarms

    # Paths
    PREDICTOR_PATH = "shape_predictor_68_face_landmarks.dat"
    ALARM_PATH = "alarm.wav"
For Higher Sensitivity (detect earlier):
EAR_THRESHOLD = 0.18 # Higher threshold
EAR_CONSEC_FRAMES = 60 # 2 seconds
For Lower False Alarms:
EAR_THRESHOLD = 0.14 # Lower threshold
EAR_CONSEC_FRAMES = 120 # 4 seconds
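When tuning `EAR_CONSEC_FRAMES`, it is easier to think in seconds and convert at the configured frame rate. A minimal helper (not part of the project's code, just a sketch of the arithmetic):

```python
def window_frames(seconds, fps=30):
    """Convert a detection window in seconds to the consecutive-frame
    count required at the given frame rate."""
    return int(round(seconds * fps))

print(window_frames(3))   # default 3 s window at 30 FPS
print(window_frames(2))   # higher-sensitivity 2 s window
```

Remember to redo this conversion if you lower `TARGET_FPS`: a 3-second window at 15 FPS is only 45 frames, not 90.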
Tested 50 different thresholds from 0.12 to 0.25:
| Threshold | Precision | Recall | F1-Score |
|---|---|---|---|
| 0.14 | 96% | 78% | 86% |
| 0.15 | 94% | 84% | 89% |
| 0.16 | 92% | 88% | 90% |
| 0.17 | 88% | 91% | 89% |
| 0.18 | 82% | 94% | 88% |
Impact of consecutive frame requirement:
| Window | Precision | False Alarms | Trade-off |
|---|---|---|---|
| 1.0s (30f) | 78% | High | Fast but noisy |
| 2.0s (60f) | 86% | Medium | Balanced |
| 3.0s (90f) | 92% | Low | Optimal |
| 4.0s (120f) | 95% | Very Low | Slow response |
| 5.0s (150f) | 97% | Minimal | Too slow |
Conclusion: 3-second window provides optimal balance between precision and response time.
# Optimize for lower-end hardware
config.FRAME_WIDTH = 320
config.FRAME_HEIGHT = 240
config.TARGET_FPS = 15
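The per-frame cost of HOG face detection scales roughly with pixel count, so halving each dimension as above cuts the detection workload to about a quarter (and halving the frame rate roughly halves it again). A quick sanity check of that ratio:

```python
def pixel_ratio(w1, h1, w2, h2):
    """Ratio of pixel counts between two frame resolutions."""
    return (w1 * h1) / (w2 * h2)

# Default 640x480 vs. the reduced 320x240 profile
print(pixel_ratio(640, 480, 320, 240))  # 4.0 -> a quarter of the pixels per frame
```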
# Stream detection events to server
def send_alert(driver_id, timestamp, ear_value):
    # API call to monitoring system
    pass
from scipy.spatial import distance

def calculate_ear(self, eye_points):
    # Vertical eye landmarks
    A = distance.euclidean(eye_points[1], eye_points[5])
    B = distance.euclidean(eye_points[2], eye_points[4])

    # Horizontal eye landmark
    C = distance.euclidean(eye_points[0], eye_points[3])

    # Calculate EAR
    ear = (A + B) / (2.0 * C)
    return ear
def detect_drowsiness(self, ear_value, current_time):
    if ear_value < self.config.EAR_THRESHOLD:
        self.drowsy_frame_count += 1

        # 3-second threshold (90 consecutive frames at 30 FPS)
        if self.drowsy_frame_count >= self.config.EAR_CONSEC_FRAMES:
            if self.current_state == "ALERT":
                self.current_state = "DROWSY"

                # Trigger alarm with cooldown
                if (current_time - self.last_alarm_time) > self.config.ALARM_COOLDOWN:
                    self.trigger_alarm()
                    self.last_alarm_time = current_time
            return "DROWSY"
    else:
        # False alarm prevention: any open-eye frame resets the counter
        self.drowsy_frame_count = 0
        self.current_state = "ALERT"

    return "ALERT"
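The consecutive-frame logic above can be exercised standalone by replaying a synthetic stream of per-frame EAR values. The following sketch strips out the alarm and cooldown details to show just the counter behavior (the `classify_stream` helper is illustrative, not part of the project's code):

```python
def classify_stream(ear_values, threshold=0.16, consec_frames=90):
    """Replay per-frame EAR values through the consecutive-frame
    logic and return the classified state for each frame."""
    count, states = 0, []
    for ear in ear_values:
        if ear < threshold:
            count += 1
        else:
            count = 0  # a single open-eye frame resets the counter
        states.append("DROWSY" if count >= consec_frames else "ALERT")
    return states

# 100 closed-eye frames: the state flips to DROWSY on the 90th frame
states = classify_stream([0.10] * 100)
print(states[88], states[89])  # ALERT DROWSY
```

A normal blink (a handful of low-EAR frames followed by recovery) never reaches the 90-frame count, which is exactly how the window suppresses false alarms.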
This project demonstrates:
Potential enhancements:
For questions or feedback about this project, please reach out through GitHub issues.
This project is open source and available for educational purposes.
Soukupová, T., & Čech, J. (2016). "Real-time eye blink detection using facial landmarks." In Proceedings of the 21st Computer Vision Winter Workshop.
Dewi, C., Chen, R. C., & Jiang, X. (2020). "Deep Convolutional Neural Network for Enhancing Traffic Sign Recognition Developed on Yolo V4." Multimedia Tools and Applications.
dlib Documentation: http://dlib.net/
OpenCV Documentation: https://docs.opencv.org/
Note: This system is designed for demonstration and research purposes. For production deployment in vehicles, additional safety features, redundancy, and regulatory compliance are required.