Gym Market Analysis

DataCamp
Completed
Author

Chris Kornaros

Published

September 30, 2024


You are a product manager for a fitness studio and are interested in understanding the current demand for digital fitness classes. You plan to conduct a market analysis in Python to gauge demand and identify potential areas for growth of digital products and services.

The Data

You are provided with several CSV files in the “Files/data” folder. They contain international and national-level Google Trends data for fitness-related search keywords.

workout.csv

Column               Description
'month'              Month when the data was measured.
'workout_worldwide'  Index representing the popularity of the keyword ‘workout’, on a scale of 0 to 100.

three_keywords.csv

Column                    Description
'month'                   Month when the data was measured.
'home_workout_worldwide'  Index representing the popularity of the keyword ‘home workout’, on a scale of 0 to 100.
'gym_workout_worldwide'   Index representing the popularity of the keyword ‘gym workout’, on a scale of 0 to 100.
'home_gym_worldwide'      Index representing the popularity of the keyword ‘home gym’, on a scale of 0 to 100.

workout_geo.csv

Column               Description
'country'            Country where the data was measured.
'workout_2018_2023'  Index representing the popularity of the keyword ‘workout’ over the five-year period from 2018 to 2023.

three_keywords_geo.csv

Column                    Description
'Country'                 Country where the data was measured.
'home_workout_2018_2023'  Index representing the popularity of the keyword ‘home workout’ over the five-year period from 2018 to 2023.
'gym_workout_2018_2023'   Index representing the popularity of the keyword ‘gym workout’ over the five-year period from 2018 to 2023.
'home_gym_2018_2023'      Index representing the popularity of the keyword ‘home gym’ over the five-year period from 2018 to 2023.
# Import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
# Load the four Google Trends datasets

workout = pd.read_csv('data/workout.csv')
three_kw = pd.read_csv('data/three_keywords.csv')
workout_geo = pd.read_csv('data/workout_geo.csv')
kw_geo = pd.read_csv('data/three_keywords_geo.csv')
workout.head()
     month  workout_worldwide
0  2018-03                 59
1  2018-04                 61
2  2018-05                 57
3  2018-06                 56
4  2018-07                 51
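# Row with the highest worldwide 'workout' index
peak = workout.loc[workout['workout_worldwide'].idxmax()]
# 'month' is a 'YYYY-MM' string, so the year is the part before the hyphen
year_str = peak['month'].split('-')[0]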
year_str
'2020'
workout.dtypes
month                object
workout_worldwide     int64
dtype: object
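Because `month` is stored as an object (string) column, any date filtering on it compares strings, which matches chronological order only because the values are zero-padded 'YYYY-MM'. As a minimal sketch, assuming made-up sample rows rather than the real file, the column could instead be parsed into timestamps so that windows and years are handled as dates:

```python
import pandas as pd

# Hypothetical sample rows standing in for workout.csv (values are illustrative only)
sample = pd.DataFrame({
    "month": ["2019-11", "2019-12", "2020-01", "2020-04", "2023-01"],
    "workout_worldwide": [50, 48, 70, 100, 60],
})

# Parse 'YYYY-MM' strings into timestamps (first day of each month)
sample["month"] = pd.to_datetime(sample["month"], format="%Y-%m")

# Select a date window using real datetime comparisons
start, end = pd.Timestamp("2020-01-01"), pd.Timestamp("2022-12-31")
covid_window = sample[(sample["month"] >= start) & (sample["month"] <= end)]

# Year of the row with the highest index value, read from the timestamp
peak_year = sample.loc[sample["workout_worldwide"].idxmax(), "month"].year
print(len(covid_window), peak_year)
```

The string-based approach works for this dataset as is; the conversion mainly helps if the format ever varies or if resampling and plotting on a time axis are needed.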
# 'YYYY-MM' strings sort chronologically, so string comparisons select the date ranges
covid = workout.loc[(workout['month'] > '2019-12') & (workout['month'] <= '2022-12')]
post_covid = workout.loc[workout['month'] > '2022-12']
# Filter three_kw by its own 'month' column instead of relying on workout's index
kw_cols = ['home_workout_worldwide', 'gym_workout_worldwide', 'home_gym_worldwide']
covid_mask = (three_kw['month'] > '2019-12') & (three_kw['month'] <= '2022-12')
peak_covid = three_kw.loc[covid_mask, kw_cols].max().idxmax()
current = three_kw.loc[three_kw['month'] > '2022-12', kw_cols].max().idxmax()
peak_covid
current
'gym_workout_worldwide'
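Both lookups above follow the same pattern: filter to a date window, take each keyword's maximum within it, then pick the column with the largest maximum. Since a notebook cell displays only its last expression, printing both results makes them visible together. A minimal sketch, assuming a hypothetical helper `top_keyword` and made-up sample values (not the real three_keywords.csv data):

```python
import pandas as pd

KEYWORD_COLS = ["home_workout_worldwide", "gym_workout_worldwide", "home_gym_worldwide"]

def top_keyword(df, start, end=None):
    """Return the keyword column with the highest peak for start < month <= end."""
    mask = df["month"] > start
    if end is not None:
        mask &= df["month"] <= end
    return df.loc[mask, KEYWORD_COLS].max().idxmax()

# Illustrative values only; they do not come from the project data
sample = pd.DataFrame({
    "month": ["2019-12", "2020-04", "2021-06", "2023-01", "2023-02"],
    "home_workout_worldwide": [12, 90, 30, 15, 14],
    "gym_workout_worldwide": [20, 10, 25, 22, 24],
    "home_gym_worldwide": [8, 40, 20, 10, 9],
})

peak_covid_demo = top_keyword(sample, "2019-12", "2022-12")
current_demo = top_keyword(sample, "2022-12")
print(peak_covid_demo, current_demo)
```

Building the mask from the same DataFrame that is being filtered avoids depending on two files sharing an identical row order.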
# Country with the highest 'workout' interest over 2018-2023
top_country = workout_geo.loc[workout_geo['workout_2018_2023'].idxmax(), 'country']
top_country
'United States'
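A single top country hides how close the runners-up are, so a short ranking may be more informative for market sizing. The sketch below uses `nlargest` on made-up rows; the country values and the presence of missing entries are assumptions for illustration, not figures from workout_geo.csv:

```python
import pandas as pd

# Hypothetical rows standing in for workout_geo.csv (values are illustrative only)
sample_geo = pd.DataFrame({
    "country": ["United States", "Canada", "Australia", "Ghana"],
    "workout_2018_2023": [100.0, 86.0, 77.0, None],
})

# Drop countries without a value, then keep the three highest scores
top3 = sample_geo.dropna(subset=["workout_2018_2023"]).nlargest(3, "workout_2018_2023")
top_names = top3["country"].tolist()
print(top_names)
```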
# Compare the two candidate expansion markets
kw_geo1 = kw_geo.loc[kw_geo['Country'].isin(['Philippines', 'Malaysia'])]
kw_geo1
        Country  home_workout_2018_2023  gym_workout_2018_2023  home_gym_2018_2023
23  Philippines                    52.0                   38.0                10.0
61     Malaysia                    47.0                   38.0                15.0
# Of the two, the country with the higher 'home workout' interest
home_workout_geo = kw_geo1.loc[kw_geo1['home_workout_2018_2023'].idxmax(), 'Country']
home_workout_geo
'Philippines'