Spark (November 2024 - Present)
Python, YOLO, Librosa, OpenCV, Matplotlib, NumPy
Speed Training Made Easy
Spark is computer vision software that automatically times sprints from video footage. Using YOLO pose estimation, it identifies athletes and tracks their movements frame by frame. With the YOLO11l-pose model, timing is accurate to within 0.03 seconds, which makes speed training sessions more efficient.
Demo Videos

This video uses the YOLO11l-pose model, the second most accurate model available. The YOLO11x-pose model is the most accurate, and its timing can be accurate to within 0.005 seconds, but at the cost of slower processing.
This demonstration does not show an actual use case of the software, but it does show how the "final time" is calculated. The program generates the audio graph shown in the image. After extracting the time of the audio peak, we compare it with the time at which the runner crosses the "finish line". Comparing the second video with the fourth video shows the accuracy of the program. Detection of the moment the runner crosses the finish line still needs improvement.
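The peak-extraction step can be illustrated with a minimal sketch. It assumes the audio track has already been loaded as a mono NumPy array with a known sample rate (for example, the pair returned by `librosa.load`). The function name `find_cue_time` and the synthetic signal are hypothetical and are not taken from Spark's code.

```python
import numpy as np

def find_cue_time(samples: np.ndarray, sample_rate: int) -> float:
    """Return the time in seconds of the loudest sample in a mono signal.

    Hypothetical helper: treats the single largest absolute amplitude
    as the audio cue, which is a simplifying assumption.
    """
    if samples.size == 0:
        raise ValueError("audio signal is empty")
    peak_index = int(np.argmax(np.abs(samples)))
    return peak_index / sample_rate

# Synthetic example: one second of quiet noise with a loud click at 0.25 s.
sample_rate = 8000
rng = np.random.default_rng(0)
samples = 0.01 * rng.standard_normal(sample_rate)
samples[2000] = 1.0  # the "GO" cue, 2000 samples into the recording

cue_time = find_cue_time(samples, sample_rate)
print(cue_time)  # 0.25
```

Taking the global maximum is the simplest possible choice. A recording with other loud sounds (a dropped bottle, a whistle nearby) would probably need a more selective rule.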
How To Use
Many features are yet to be implemented, but here is how it works for now.
- Set the camera up at the finish line, with the finish line centered in the frame.
- Start recording.
- The runner, coach, or teammate gives an audio cue, such as "GO".
- Go!
The system works by first detecting the audio cue, which appears as the peak in the audio. It then tracks the athlete through the finish line and calculates the elapsed time from start to finish.
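One way the finish and elapsed-time steps could look is sketched below. The source does not say how Spark decides that the runner has crossed the line, so everything here is an assumption: the runner moves toward increasing x, one tracked keypoint (say, a hip) gives an x position per frame, the frame rate is constant, and the crossing time is linearly interpolated between the two frames on either side of the line. The names `crossing_time` and `elapsed_time` are hypothetical.

```python
from typing import Optional, Sequence

def crossing_time(xs: Sequence[float], fps: float, finish_x: float) -> Optional[float]:
    """Return the time in seconds at which the tracked x position reaches finish_x.

    Assumes the runner moves toward increasing x and the video has a
    constant frame rate. Returns None if the line is never crossed.
    """
    for i in range(1, len(xs)):
        before, after = xs[i - 1], xs[i]
        if before < finish_x <= after:
            # Estimate where between the two frames the crossing happened.
            fraction = (finish_x - before) / (after - before)
            return (i - 1 + fraction) / fps
    return None

def elapsed_time(cue_time: float, finish_time: float) -> float:
    """Elapsed time from the audio cue to the finish-line crossing."""
    return finish_time - cue_time

# Illustrative values only: x positions in pixels for four frames at 10 fps.
xs = [100.0, 200.0, 300.0, 400.0]
finish = crossing_time(xs, fps=10.0, finish_x=250.0)  # halfway between frames 1 and 2
total = elapsed_time(cue_time=0.05, finish_time=finish)
```

Interpolating between frames is a design choice rather than a requirement. Without it, the finish time can only be resolved to the nearest frame.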
What's Next?
Currently, having more than one person in the frame leads to inaccurate results, so there needs to be a way to mark the runner being tracked. Also, if the person giving the audio cue is not right next to the phone, the speed of sound introduces a small timing error. The next major step is to deploy the software.
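To give a rough sense of the speed-of-sound error, the sketch below computes how long the cue takes to reach the phone from a given distance. It assumes a speed of sound of about 343 m/s (dry air near 20 °C). The function name and the example distances are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for dry air at about 20 °C

def audio_delay_seconds(distance_m: float) -> float:
    """Time for the audio cue to travel distance_m metres to the phone."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return distance_m / SPEED_OF_SOUND_M_PER_S

for distance in (1.0, 10.0, 40.0):
    print(f"{distance:>5.1f} m -> {audio_delay_seconds(distance):.3f} s")
# Prints:
#   1.0 m -> 0.003 s
#  10.0 m -> 0.029 s
#  40.0 m -> 0.117 s
```

At roughly 10 m the delay is already about the same size as the 0.03 second timing accuracy quoted for the YOLO11l-pose model, so a correction of this kind may be worth considering when the cue is given far from the phone.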