Here’s a detailed project using TensorFlow.js: Real-Time Object Detection with COCO-SSD. This project uses a pre-trained model to detect and label objects, such as people, cups, or phones, in a live webcam feed.
Project Overview
- What It Does: Detects objects from a live webcam stream.
- Model Used: COCO-SSD, a Single Shot MultiBox Detector (SSD) trained on the COCO (Common Objects in Context) dataset.
- Key Features:
  - Detects multiple objects in real time.
  - Draws bounding boxes with object labels.
Step 1: Set Up Your Project Files
Create the following files:
/ObjectDetection
│
├── index.html
├── styles.css
└── script.js
Step 2: Add HTML (index.html)
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Real-Time Object Detection</title>
  <link rel="stylesheet" href="styles.css">
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest"></script>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/coco-ssd"></script>
</head>
<body>
  <h1>Real-Time Object Detection</h1>
  <video id="webcam" autoplay playsinline></video>
  <canvas id="canvas"></canvas>
  <script src="script.js"></script>
</body>
</html>
Step 3: Style the App (styles.css)
body {
  display: flex;
  /* Keep the heading at the top so the centered video does not cover it */
  justify-content: flex-start;
  align-items: center;
  height: 100vh;
  margin: 0;
  flex-direction: column;
  background-color: #333;
  color: white;
}
/* Stack the canvas exactly on top of the video so the boxes line up */
video, canvas {
  position: absolute;
  top: 50%;
  left: 50%;
  transform: translate(-50%, -50%);
  max-width: 80%;
  height: auto;
  border-radius: 10px;
}
Step 4: Add JavaScript Logic (script.js)
let model;
const video = document.getElementById('webcam');
const canvas = document.getElementById('canvas');
const ctx = canvas.getContext('2d');

// Load COCO-SSD model
async function loadModel() {
  model = await cocoSsd.load();
  console.log('Model loaded!');
  startWebcam();
}

// Start the webcam
function startWebcam() {
  navigator.mediaDevices.getUserMedia({
    video: true
  }).then((stream) => {
    video.srcObject = stream;
    video.onloadedmetadata = () => {
      video.play();
      detectObjects();
    };
  }).catch((err) => {
    console.error('Webcam error:', err);
  });
}

// Detect objects in the video stream
async function detectObjects() {
  const predictions = await model.detect(video);
  drawPredictions(predictions);
  requestAnimationFrame(detectObjects);
}
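
// Draw bounding boxes and labels
function drawPredictions(predictions) {
  // Resize the canvas only when the video size changes. Assigning width or
  // height resets the canvas, so doing it on every frame is wasted work.
  if (canvas.width !== video.videoWidth || canvas.height !== video.videoHeight) {
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
  }
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  predictions.forEach(prediction => {
    // bbox is [x, y, width, height] in video pixel coordinates
    const [x, y, width, height] = prediction.bbox;
    ctx.strokeStyle = 'lime';
    ctx.lineWidth = 2;
    ctx.strokeRect(x, y, width, height);
    ctx.fillStyle = 'lime';
    ctx.font = '16px Arial';
    // Draw the label above the box, or inside it when the box touches the top edge
    ctx.fillText(prediction.class, x, y > 10 ? y - 5 : y + 15);
  });
}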
// Initialize the app
loadModel();
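The drawing code relies on the shape of each prediction returned by model.detect(): a bbox array of [x, y, width, height] in pixels, a class label, and a score between 0 and 1. As a rough sketch, the snippet below uses made-up sample values (not real model output) and a hypothetical helper, countByClass, to show how that array could be summarized:

```javascript
// Illustrative sample only: these values are invented, not real model output.
const samplePredictions = [
  { bbox: [12, 40, 200, 310], class: 'person', score: 0.91 },
  { bbox: [250, 180, 60, 80], class: 'cup', score: 0.67 },
  { bbox: [330, 35, 180, 300], class: 'person', score: 0.58 },
];

// Hypothetical helper: tally how many detections there are per class label.
function countByClass(predictions) {
  const counts = {};
  for (const { class: label } of predictions) {
    counts[label] = (counts[label] || 0) + 1;
  }
  return counts;
}

console.log(countByClass(samplePredictions)); // { person: 2, cup: 1 }
```

A summary like this could be shown next to the video, for example as a running count of what is currently in view.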
Step 5: Test Your Object Detection App
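- Open index.html in your browser. If the camera prompt does not appear, serve the folder from a local web server (http://localhost) instead, since getUserMedia() is only available in secure contexts.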
- Grant camera access when prompted.
- Watch as the webcam feed loads, and detected objects get highlighted with bounding boxes and labels.
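If the feed does not appear, the error logged by startWebcam() usually says why. The sketch below is one possible way to turn common getUserMedia() error names into readable hints. describeWebcamError is a hypothetical helper, and the message wording is only a suggestion:

```javascript
// Hypothetical helper: map the name of a getUserMedia() error to a hint.
function describeWebcamError(err) {
  switch (err && err.name) {
    case 'NotAllowedError':
      return 'Camera permission was denied. Allow access and reload the page.';
    case 'NotFoundError':
      return 'No camera was found on this device.';
    case 'NotReadableError':
      return 'The camera is already in use or could not be started.';
    default:
      return 'Could not start the webcam.';
  }
}

// Possible usage inside the existing catch handler in startWebcam():
// .catch((err) => console.error(describeWebcamError(err), err));
```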
Explanation
- COCO-SSD Model:
  - This pre-trained model can detect 80 common object classes, such as people, chairs, and laptops.
  - cocoSsd.load() loads the model asynchronously.
- Webcam Stream Handling:
  - The navigator.mediaDevices.getUserMedia() function requests access to the webcam.
  - The video element displays the live stream.
- Real-Time Detection:
  - We use requestAnimationFrame() to schedule the next detection pass, so objects are detected and predictions rendered continuously.
  - The bounding boxes and labels are drawn on the HTML canvas.
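Calling model.detect() on every animation frame can be more work than slower devices handle comfortably. One possible approach, sketched below, is to skip frames until a minimum interval has passed. createThrottle and the 100 ms interval are illustrative assumptions, not part of the original script or of TensorFlow.js:

```javascript
// Hypothetical helper: returns a function that reports whether enough time
// has passed since the last accepted call.
function createThrottle(minIntervalMs) {
  let lastRun = -Infinity;
  return function shouldRun(now) {
    if (now - lastRun < minIntervalMs) {
      return false; // too soon, skip this frame
    }
    lastRun = now;
    return true;
  };
}

const shouldDetect = createThrottle(100); // roughly 10 detections per second at most

// Possible usage in the detection loop. requestAnimationFrame passes a
// timestamp; the default covers the first, direct call.
// async function detectObjects(now = performance.now()) {
//   if (shouldDetect(now)) {
//     drawPredictions(await model.detect(video));
//   }
//   requestAnimationFrame(detectObjects);
// }
```

Skipped frames keep the previous boxes on screen, which is usually acceptable at short intervals.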
Possible Enhancements
- Confidence Scores: Display confidence percentages with object labels.
- Mobile Optimization: Adjust canvas dimensions for mobile screens.
- Multiple Models: Try other models for different tasks, such as a YOLO variant for detection or MobileNet for image classification.
- Event Handling: Trigger alerts when certain objects (e.g., "person") are detected.
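As a starting point for the confidence-score and event-handling ideas, here is a small sketch. formatLabel and hasObject are hypothetical helpers that rely only on the class and score fields of each prediction, and the 0.6 threshold is an arbitrary choice:

```javascript
// Hypothetical helper: build a label such as "cup (50%)".
function formatLabel(prediction) {
  const percent = Math.round(prediction.score * 100);
  return `${prediction.class} (${percent}%)`;
}

// Hypothetical helper: true if any prediction matches className
// with a score of at least minScore.
function hasObject(predictions, className, minScore = 0.6) {
  return predictions.some(
    (p) => p.class === className && p.score >= minScore
  );
}

// Possible usage inside drawPredictions():
// ctx.fillText(formatLabel(prediction), x, y > 10 ? y - 5 : y + 15);
// if (hasObject(predictions, 'person')) { console.log('Person detected'); }
```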
Conclusion
This project demonstrates how to run machine learning models in the browser using TensorFlow.js. You’ve built a real-time object detection app with COCO-SSD that detects and labels objects in a live webcam stream.