
How Pose Estimation AI Actually Works



You've probably seen a pose estimation AI on your Facebook feed, but how does it actually work? Pose estimation is the process of estimating the position of the human body, often as a 3D pose recovered from a 2D image or video. It's one of the hottest trends in artificial intelligence because it can be used for so many purposes. Using yoga as a running example, let's take a closer look at how this technology works and what it means to you as someone who has taken yoga classes.


Why Use AI for Pose Estimation


There are many different types of AI technology that can be applied to this kind of data analysis, from machine learning algorithms to computer vision. Machine learning models examine how well you're able to hold each pose based on what they have learned about the human body and its limitations.


AI is being used to track the poses you do in yoga through pose estimation algorithms. This is very helpful for the large number of people who prefer at-home fitness services, a number that keeps growing every day, especially after the recent boom in home workouts.


The data collected by AI can be used for a variety of purposes, including improving your practice by analyzing the mistakes you make when performing an asana and suggesting how you can improve your posture. Combined with data from fitness wearables, it can also help us learn more about our bodies.


How Pose Estimation AI Works

The algorithm works by analyzing frames from a video sequence to find correspondences between key points in different frames. For example, if a person's feet are detected in one frame and again in the next, the algorithm matches those key points across the two frames, which lets it follow each body part as it moves through the video.
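
To illustrate the idea, here is a minimal sketch in Python (the joint order and the coordinates are made up for the example) that matches key points between two frames by joint index and measures how far each joint moved:

import numpy as np

# Key points for one person in two consecutive frames, one row per joint
# (e.g., 0 = left foot, 1 = right foot, 2 = left knee); values are made up.
frame_a = np.array([[120, 440], [180, 442], [125, 360]], dtype=float)
frame_b = np.array([[122, 438], [179, 441], [140, 350]], dtype=float)

# Both frames use the same joint order, so corresponding key points are
# simply rows with the same index. The per-joint displacement shows how
# much each body part moved between the two frames.
displacement = np.linalg.norm(frame_b - frame_a, axis=1)
for joint, distance in enumerate(displacement):
    print(f"joint {joint}: moved {distance:.1f} pixels")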


Neural networks are algorithms inspired by biological neurons: they mimic the way the human brain operates and are one of the main ways of creating artificially intelligent programs. Modern pose estimation is based on convolutional neural networks (CNNs), a type of neural network used in image recognition and specifically designed to process pixel data.
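
As a concrete example, the sketch below uses the open-source MediaPipe library, whose Pose solution wraps a convolutional pose-estimation model, to detect key points in a single photo (the image file name is just a placeholder):

import cv2
import mediapipe as mp

# Load a photo of a person (file name is illustrative).
image = cv2.imread("yoga_pose.jpg")

# MediaPipe's Pose solution runs a CNN-based pose-estimation model.
with mp.solutions.pose.Pose(static_image_mode=True) as pose:
    # The model expects RGB input, while OpenCV loads images as BGR.
    results = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

# Each landmark is one detected key point with normalized coordinates.
if results.pose_landmarks:
    for idx, lm in enumerate(results.pose_landmarks.landmark):
        print(f"key point {idx}: x={lm.x:.3f}, y={lm.y:.3f}, z={lm.z:.3f}")

Each key point corresponds to a body landmark such as a shoulder, elbow, or ankle, and together they form the virtual skeleton described in the steps below.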


Here is a step-by-step process of how a CNN uses computer vision for pose estimation (a simplified code sketch follows the steps):


STEP 1: When you start the AI program, the camera captures your body movements while you exercise.

STEP 2: The captured video is split into hundreds of frames.

STEP 3: Every frame is processed by pose estimation models and key points on the body are detected.

STEP 4: A virtual skeleton is formed in either 2D or 3D.

STEP 5: The virtual skeleton is scanned and analyzed in order to detect any mistakes in the exercise technique.

STEP 6: This information is processed and sent to the user along with recommendations on how to correct their posture.
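
To make steps 5 and 6 concrete, here is a simplified Python sketch (the key-point coordinates, target angle, and tolerance are hypothetical) that turns three detected key points into a knee angle and gives feedback when it drifts from the value expected for the exercise:

import numpy as np

def joint_angle(a, b, c):
    # Angle at point b (in degrees) formed by points a-b-c, e.g. hip-knee-ankle.
    ba, bc = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cosine = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))

# Hypothetical key points from one frame: hip, knee, ankle (x, y in pixels).
hip, knee, ankle = (320, 240), (330, 340), (325, 440)

# Step 5: analyze the virtual skeleton. Suppose the exercise calls for a
# nearly straight leg, i.e. a knee angle of about 180 degrees.
angle = joint_angle(hip, knee, ankle)
TARGET, TOLERANCE = 180.0, 15.0

# Step 6: turn the analysis into feedback for the user.
if abs(angle - TARGET) > TOLERANCE:
    print(f"Knee angle is {angle:.0f} degrees - try straightening your leg.")
else:
    print(f"Knee angle is {angle:.0f} degrees - good form!")

A real system would run this kind of check on many joints in every frame, but the principle is the same: compare the angles of the virtual skeleton against the expected form and report the differences.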


Face recognition has become mainstream nowadays, and so has motion capture in the film industry. Pose estimation is very similar to both, but instead of scanning your face, it scans your entire body. Looking at the speed at which the technology is evolving, we are sure to have better and more capable AI models in the future to make our lives better.
