
- detection dataset: https://universe.roboflow.com/roboflow-jvuqo/basketball-player-detection-3-ycjdo/dataset/6
- numbers OCR dataset: https://universe.roboflow.com/roboflow-jvuqo/basketball-jersey-numbers-ocr/dataset/3
- blogpost: https://blog.roboflow.com/identify-basketball-players
Models I used:
- RF-DETR – a DETR-style real-time object detector. We fine-tuned it to detect players, jersey numbers, referees, the ball, and even shot types.
- SAM2 – a segmentation and tracking model. It re-identifies players after occlusions and keeps IDs stable through contact plays.
- SigLIP + UMAP + K-means – vision-language embeddings plus unsupervised clustering. This separates players into teams using uniform colors and textures, without manual labels.
- SmolVLM2 – a compact vision-language model originally trained on OCR. After fine-tuning on NBA jersey crops, it jumped from 56% to 86% accuracy.
- ResNet-32 – a classic CNN fine-tuned for jersey number classification. It reached 93% test accuracy, outperforming the fine-tuned SmolVLM2.
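To make the team-assignment step more concrete, here is a minimal sketch of the final K-means stage only. It assumes each player crop has already been embedded (by SigLIP in the post) and reduced to a few dimensions (by UMAP in the post); neither step is run here, and the 2-D points below are made-up stand-ins for those reduced embeddings. The function names and the farthest-point initialization are my own illustrative choices, not necessarily what the original pipeline does.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic init: start at the first point, then repeatedly add
    the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k=2, max_iters=50):
    """Plain Lloyd's k-means. Returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical reduced embeddings for ten player crops (NOT real data):
# one team clusters near (0, 0), the other near (5, 5).
crops = [
    (0.1, 0.2), (0.0, -0.1), (5.1, 4.9), (-0.2, 0.1), (4.8, 5.2),
    (5.0, 5.1), (0.2, 0.0), (4.9, 4.8), (0.1, -0.2), (5.2, 5.0),
]

labels, centroids = kmeans(crops, k=2)
for crop, team in zip(crops, labels):
    print(f"crop at {crop} -> team {team}")
```

With k=2 the cluster index works as a team label without any manual annotation, which is the point of the unsupervised approach. In a real pipeline you would probably swap this hand-rolled loop for a library implementation such as scikit-learn's `KMeans`.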
Posted by RandomForests92
![[OC] NBA players tracking and recognition](https://www.byteseu.com/wp-content/uploads/2025/10/f6czfcht8osf1-1024x576.gif)
12 Comments
Do commentators use something like this or are they just really good at their jobs?
It’s cool when it’s sports, but this is also how we’re being tracked in public. But hey, we have the convenience of unlocking our phones a micro-second faster.
Cool… but this is basic machine vision that’s been available for a decade now. YOLO made this available in package form 5 years ago.
Wouldn’t it be better to also track the ball? Stats like velocity, hang time, bounce count, points, etc.
But can you add real time accurately positioned moustaches to all the players?
Super cool! Thanks for the post
This is so cool! I know there have been similar projects for soccer but I would love to do it myself too. I think the biggest issue there is lack of tactical cam footage. The moment the broadcast feed cuts over to show a single player or refcam or just moves the defenders out of the frame because the ball is too far forward it becomes very messy.
Do you have any tips?
This would make announcing so much easier
Cool! I’m assuming this isn’t real time, right? Does it need to run on recorded video? How long does it take to process?
Cool. It would be neat if it could track the ball and project the trajectories too.
Wow, this is impressive!! I hear that tracking overlapping players, and clashes of three or more going for the ball, is one of the main hurdles. Is that true? I also read they still haven’t pinned this down with a high enough degree of accuracy for soccer. I’m no expert at all, but does the success rate depend on camera fidelity and how many frames per second it can handle? Or is there a number that’s considered enough, past which it’s overkill? If there is, this means investment by all teams in a league at the same time for fair-play reasons. But hey, that’s just me rambling about what I read. If anyone has good intel on this, please share!
Would be cool to have under their name their shooting % for that location on the court when they have the ball