Case Study: Bellabeat

Case study

May 2023

Garden Grove, California.

Introduction

Bellabeat was founded in 2013 by Urška Sršen and Sando Mur. It is a small company with the potential to become a market leader in its industry. Bellabeat focuses on women's health, offering smart products for health monitoring such as the Bellabeat app, Leaf, Time, and Spring. Co-founder Sršen aims to use the available user data to develop marketing strategies for further growth.

Business Task

Unlock new growth opportunities by using smart device usage data to identify trends and patterns, and turn those insights into a data-driven marketing strategy.

Key stakeholders:

Urška Sršen: Bellabeat's co-founder and Chief Creative Officer.

Sando Mur: Mathematician and Bellabeat’s co-founder; a key member of the Bellabeat executive team.

The goal is to answer these questions:

1. What are some trends in smart device usage? 

2. How could these trends apply to Bellabeat customers? 

3. How could these trends help influence Bellabeat marketing strategy? 

Prepare data

FitBit Fitness Tracker Data - the data is under CC0: Public Domain and made available through Mobius.

  • The dataset is from Kaggle and contains personal fitness tracker data from thirty Fitbit users, including minute-level output for physical activity, heart rate, and sleep monitoring. It includes information about daily activity, steps, and heart rate that can be used to explore users’ habits.

  • The dataset was generated by respondents to a survey distributed via Amazon Mechanical Turk between 03.12.2016 and 05.12.2016 (March 12 to May 12, 2016).

  • The dataset consists of a total of 18 files in .csv format organized in a long format.

In this analysis, my focus will be on the daily timeframe to detect high-level usage trends, rather than delving into the detailed performance on a minute-by-minute or hourly basis.

The data was downloaded and stored as an Excel file for preprocessing. To check its integrity, I used the Excel formula “=COUNTA(UNIQUE(A2:A941))” to count the unique IDs in the file "dailyActivity_merged." This returned 33 unique IDs, whereas the author of the data states that it was collected from 30 people. I then applied the same formula to the other daily files. The Calories, Intensities, and Steps files also contain 33 unique IDs, but the sleep file has only 24 and the weight file just 8. These differences in the number of unique IDs across files raise concerns about the data's integrity.
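The same uniqueness check can be reproduced outside Excel. The following is a minimal Python sketch rather than the workflow used in this analysis; the sample rows are made up for illustration, and the `Id` column name is an assumption about the file header.

```python
import csv
import io


def count_unique_ids(csv_text, id_column="Id"):
    """Return the number of distinct values found in id_column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return len({row[id_column] for row in reader})


# Made-up sample rows, not taken from the real dataset.
sample = """Id,ActivityDate,TotalSteps
1001,4/12/2016,13162
1001,4/13/2016,10735
1002,4/12/2016,8163
1003,4/12/2016,4414
"""

unique_ids = count_unique_ids(sample)
print(unique_ids)
```

Running the same function over each daily file would show at a glance which files cover fewer users than the others.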

Process

1. Import packages & data

RStudio Cloud is used for data processing. All 18 CSV files were uploaded to RStudio Cloud, and the six with a daily timeframe were loaded into data frames for further processing.

2. Merging data

Next, I merged the data frames to make processing more convenient. This produced three data frames for further analysis: 'all_daily', which combines Steps, Calories, Activity, and Intensities; 'Sleep', which merges those four with the Sleep data; and 'MergeALL', which consolidates all the data into a single data frame. Afterwards, to get a preliminary understanding of the data and check its integrity, I used the "glimpse()" function to take a brief look at each data frame.

The "all_daily" data frame contains 940 observations, while the "MergeALL" data frame has only 35. This large difference arises because the "sleepDay_merge" and "weightLogInfo_merge" data frames hold far fewer records than the others, so only a small number of observations have data in every source.
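The drop in row count can be illustrated with a small sketch. It assumes the merge behaves like an inner join on user ID and date, which the write-up does not state; the `inner_join` helper, the `Date` key, and the sample rows are hypothetical, and the code is Python rather than the R used in the analysis.

```python
def inner_join(left, right, keys):
    """Keep only rows whose key values appear in both tables."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined


# Made-up rows: three activity days, but only one matching sleep log.
activity = [
    {"Id": 1, "Date": "4/12/2016", "TotalSteps": 13162},
    {"Id": 1, "Date": "4/13/2016", "TotalSteps": 10735},
    {"Id": 2, "Date": "4/12/2016", "TotalSteps": 8163},
]
sleep = [
    {"Id": 1, "Date": "4/12/2016", "TotalMinutesAsleep": 327},
]

merged = inner_join(activity, sleep, keys=("Id", "Date"))
print(len(activity), len(merged))
```

Each additional sparse source removes more rows, which is consistent with the smallest data frame being the one that combines everything.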

Analyze

1. Proportion of daily activity

I analyzed which type of activity takes up the largest share of total recorded time across all individuals. The chart shows that Sedentary is by far the predominant activity type throughout the day.
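A share-of-time calculation of this kind can be sketched as follows. This is an illustrative Python example with made-up rows, not the code behind the chart; the column names are assumed to match the minute columns of the daily activity file.

```python
MINUTE_COLUMNS = (
    "SedentaryMinutes",
    "LightlyActiveMinutes",
    "FairlyActiveMinutes",
    "VeryActiveMinutes",
)


def activity_shares(rows):
    """Return each activity type's percentage of all recorded minutes."""
    totals = {col: sum(row[col] for row in rows) for col in MINUTE_COLUMNS}
    grand_total = sum(totals.values())
    return {col: 100 * totals[col] / grand_total for col in MINUTE_COLUMNS}


# Made-up daily records for illustration only.
rows = [
    {"SedentaryMinutes": 800, "LightlyActiveMinutes": 150,
     "FairlyActiveMinutes": 30, "VeryActiveMinutes": 20},
    {"SedentaryMinutes": 700, "LightlyActiveMinutes": 250,
     "FairlyActiveMinutes": 20, "VeryActiveMinutes": 30},
]

shares = activity_shares(rows)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```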

2. Activity Duration and Caloric Expenditure

Next, I analyzed the duration of each activity type to explore its relationship with caloric expenditure. Sedentary, lightly active, and fairly active minutes show only weak correlations with calories burned, while very active minutes show a moderate positive correlation.

  • Correlation coefficient (Sedentary minutes vs. calories burned) = -0.107

  • Correlation coefficient (Lightly active minutes vs. calories burned) = 0.287

  • Correlation coefficient (Fairly active minutes vs. calories burned) = 0.298

  • Correlation coefficient (Very active minutes vs. calories burned) = 0.616
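Figures of this kind are Pearson correlation coefficients (r), which range from -1 to 1. The sketch below, written in Python with made-up numbers rather than the dataset, shows how r is computed and how it relates to the coefficient of determination (r squared), which can never be negative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only.
very_active_minutes = [0, 10, 20, 30, 40]
calories = [1800, 2100, 2000, 2400, 2600]

r = pearson_r(very_active_minutes, calories)
r_squared = r ** 2  # coefficient of determination, always >= 0
print(round(r, 3), round(r_squared, 3))
```

A common rule of thumb treats an absolute r below about 0.3 as weak and values around 0.5 to 0.7 as moderate, although the cut-offs vary by field.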

2. Activity Duration and Sleep Time

We now examine the relationship between the duration of each activity type during the day and sleep duration, to see whether activity patterns are associated with how long users sleep.

The analysis compared the duration of each activity type with total sleep time. Time spent in Lightly Active, Fairly Active, and Very Active activities shows little or no correlation with total sleep time. Sedentary time, however, shows a moderate inverse correlation: the more sedentary minutes in a day, the less sleep.

The correlation coefficients between the total sleep time and each activity type are as follows:

  • cor(TotalMinutesAsleep vs SedentaryMinutes) = -0.599

  • cor(TotalMinutesAsleep vs LightlyActiveMinutes) = 0.033

  • cor(TotalMinutesAsleep vs FairlyActiveMinutes) = -0.245

  • cor(TotalMinutesAsleep vs VeryActiveMinutes) = -0.090

 

3. Activity Duration, Steps, and Calories Burned with BMI

Now, we will examine the potential correlation between the duration of each activity type and Body Mass Index (BMI), as well as between daily step count, calories burned, and BMI. For this analysis, BMI is categorized into four groups:

  • Below 18.5: Underweight

  • 18.5 - 24.9: Healthy Weight

  • 25.0 - 29.9: Overweight

  • 30.0 and above: Obesity
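The grouping above can be expressed as a small function. This is a Python sketch with a hypothetical `classify_bmi` helper; it assumes half-open intervals (for example, a BMI of 24.95 counts as Healthy Weight), a detail the listed ranges leave open.

```python
def classify_bmi(bmi):
    """Map a BMI value to one of the four categories used in this analysis."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Healthy Weight"
    if bmi < 30.0:
        return "Overweight"
    return "Obesity"


# Made-up BMI values for illustration only.
for value in (17.9, 22.4, 27.3, 31.0):
    print(value, classify_bmi(value))
```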

3.1 Activity Duration with BMI

In this analysis, individuals who spend a higher proportion of their time in Sedentary activity are less likely to have a healthy BMI. Conversely, as the share of other activity types, such as VeryActive or FairlyActive, increases, BMI tends to fall in the healthier range. Furthermore, individuals who are predominantly Sedentary with little variation are more likely to fall in the Obesity BMI category.

3.2 Steps and calories burned with BMI

Based on the graph, it appears that individuals with an overweight BMI tend to have higher step counts and calorie burn compared to those with a healthy weight. However, it is important to note that these observations could be influenced by potential data bias. The data used for analysis is collected from users who utilize health monitoring products, indicating a self-selection bias where only individuals who are interested in their health and potentially aiming to lose weight through exercise are included in the dataset. As a result, the higher step counts and calorie burn among the overweight group may not be representative of the broader population.

Conclusion

Based on the analysis of the Fitbit fitness tracker data, several trends and insights have been identified that can guide Bellabeat in developing a data-driven marketing strategy for further growth. Here are the key findings:

  1. Sedentary activity accounts for the majority of the time spent on activities.

  2. Sedentary, lightly active, and fairly active minutes show only weak correlations with calorie burn. However, a moderate correlation has been observed between very active minutes and caloric expenditure.

  3. There is an inverse correlation between sedentary time and sleep duration: individuals who spend more of the day sedentary tend to sleep less.

  4. Less active individuals generally have less healthy body mass index (BMI) values, while those who spend more time in very active or fairly active activities tend to have a healthy BMI.

Recommendation

Based on the insights derived from the analysis, several recommendations can be made to inform Bellabeat's marketing strategy:

  1. Foster an active lifestyle through product development: Bellabeat should prioritize the creation of innovative products or features that inspire users to engage in physical activity. This could involve designing personalized activity trackers, developing comprehensive workout plans, and incorporating interactive challenges to make exercise enjoyable and rewarding for users.

  2. Highlight the significance of balanced lifestyles: Bellabeat should educate its customer base about the importance of maintaining a well-rounded lifestyle that encompasses regular physical activity and adequate sleep. This can be accomplished through engaging content marketing initiatives, impactful social media campaigns, and strategic partnerships with influential figures in the wellness industry.

  3. Target specific customer segments: By recognizing the correlation between activity levels and BMI, Bellabeat can tailor its marketing efforts to specific customer segments. For instance, individuals with sedentary lifestyles and higher BMIs could be targeted with personalized recommendations, coaching services, and supportive resources to assist them in adopting healthier habits.

  4. Leverage data-driven marketing approaches: Bellabeat should leverage the wealth of user data available to personalize its marketing endeavors and provide tailored product recommendations based on individual needs. By analyzing user behavior and preferences, Bellabeat can offer targeted promotions, suggest relevant products, and deliver captivating content that resonates with its customers.

In summary, by utilizing the valuable insights derived from the Fitbit fitness tracker data, Bellabeat can elevate its product offerings, foster deeper customer engagement, and establish itself as a prominent player in the realm of women's health and wellness.
