Health is important to all of our lives. We only have one body, so we should treat it right. I wanted to see how my fitness has tracked over time, so I downloaded my data to analyze it.
I wanted to see if I was hitting the recommended number of steps per day, since I know I walk every day. This metric could show whether I was getting enough activity to preserve my health.
Data Collection
Tracking steps and distance traveled manually would be a huge pain. Thankfully, we have smart devices that track that automatically. I always have my iPhone 12 on me, and it tracks my steps. I'm not exactly sure how these devices work, but for this project's sake we will assume that most of what they collect is accurate.
Even though my phone tracks my data, it doesn't make it easy to export. I downloaded a third-party app called QS Access that exported all of the health data aggregated in my Apple Health app.
Here's what that data looks like:
As you can see, the data has two date/time columns, empty values for calories burned and body fat percentage, and finally steps taken and distance traveled.
Once I had this data exported and loaded as a CSV, I was ready to analyze.
Analyzing the Data
I wanted to use Python for this project so I could do some quick automated analysis while also getting better at writing Python.
I used Google Colab as my IDE, which worked great since it is a cloud-based notebook. You can actually take a look at my notebook here at this link.
Or you can just check out my code below:
First I used pandas to import the CSV for analysis.
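For readers following along without the notebook, here is a minimal sketch of that import step. The column names and the three sample rows are assumptions made for illustration (they are not my real export), and the in-memory text stands in for the CSV file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the exported CSV. The column names and values
# are assumptions for illustration, not the real export.
sample_csv = io.StringIO(
    "Start,Finish,Active Calories (kcal),Body Fat Percentage (%),"
    "Distance (mi),Steps (count)\n"
    "2021-03-01 00:00,2021-03-02 00:00,,,2.1,4800\n"
    "2021-03-02 00:00,2021-03-03 00:00,,,3.0,6900\n"
    "2021-03-03 00:00,2021-03-04 00:00,,,5.4,12500\n"
)

# With a real export, pass the path to the CSV file instead of sample_csv.
# parse_dates turns the two date/time columns into datetime values.
df = pd.read_csv(sample_csv, parse_dates=["Start", "Finish"])

print(df.head())    # first rows of the table
print(df.dtypes)    # empty columns load as float64 filled with NaN
```

Parsing the dates at load time is a convenience; it makes later steps such as looking up the day of the week simpler.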
I then called df.describe() to get some basic descriptive statistics for the data table.
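As a rough sketch of what that call gives you, the snippet below runs describe() on a tiny made-up table. The column names and numbers are assumptions for illustration only; the 10,000 figure is the daily step recommendation used in this post.

```python
import pandas as pd

# Made-up values for illustration; column names are assumed.
df = pd.DataFrame({
    "Steps (count)": [4800, 6900, 12500, 3900],
    "Distance (mi)": [2.1, 3.0, 5.4, 1.7],
})

# describe() returns count, mean, std, min, quartiles, and max per numeric column.
summary = df.describe()
print(summary)

# Individual statistics can be pulled out by row label and column name.
mean_steps = summary.loc["mean", "Steps (count)"]
max_distance = summary.loc["max", "Distance (mi)"]

# Share of days that reached the 10,000-step recommendation.
share_at_goal = (df["Steps (count)"] >= 10_000).mean()

print(f"Average steps per day: {mean_steps:.0f}")
print(f"Longest day: {max_distance} miles")
print(f"Days at or above 10,000 steps: {share_at_goal:.0%}")
```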
It led to some pretty interesting insights.
Steps:
Insight: I learned that I walk 5,901 steps per day on average, which is pretty abysmal when you consider that the recommended number of steps per day for a healthy individual like myself is 10,000.
Action: Focus on walking to places I would otherwise drive to, and on doing in person the things I would otherwise do virtually. I should also do more activities that take place outdoors.
Distance Traveled:
Insight: The most I traveled in a single day was 7.52 miles, which translated to over 20,000 steps that day. Since that number is nearly 4 times my average, I wanted to see if I could work out what was going on that day. This led me to the next part of my analysis, which was Boolean masking in pandas to try to tease out these outlier days.
Using Boolean masks, I was able to identify the days on which I had traveled more than 5 miles. I noticed that those days usually fell in the middle of the week, with Thursday seeming to be the most common.
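A minimal sketch of that masking step is below. The dates, distances, and column names are assumptions made up for illustration, not my real data; only the 5-mile threshold comes from the analysis above.

```python
import pandas as pd

# Made-up days for illustration; column names are assumed.
df = pd.DataFrame({
    "Start": pd.to_datetime(
        ["2021-03-01", "2021-03-04", "2021-03-06", "2021-03-11"]
    ),
    "Distance (mi)": [2.1, 5.4, 3.0, 6.2],
})

# A Boolean mask is a True/False Series, one value per row.
mask = df["Distance (mi)"] > 5

# Indexing with the mask keeps only the rows where it is True.
long_days = df[mask]
print(long_days)

# Count how often each weekday shows up among the long-distance days.
weekday_counts = long_days["Start"].dt.day_name().value_counts()
print(weekday_counts)
```

Conditions can be combined with & and | (each wrapped in parentheses) if you want to filter on steps and distance at the same time.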
Conclusion:
This data set has been fun to analyze, and I feel like I have been able to find some good insights from it. These insights would have been nearly impossible to find manually, so it also showed me how powerful the Python library pandas is.