In 2017, the Division on Addictions (DOA) and bwin Interactive Entertainment collected data on gambling activity by users who registered for the service between February 1st and 28th, 2005. This data mart summarizes the activities of these users from February 1st until September 30th, 2005. To examine only the gambling patterns of users playing with their own money, we removed all events in which users were playing with promotional money. Below are a few summary statistics on these users and an explanation of the logic that we used to summarize the gambling behavior of each user.
One statistic that we found interesting is that females make up less than one-tenth of the new gamblers observed between February 1st and September 30th, 2005. This may reflect a tendency for males to be more interested in playing high-stakes games like poker.
Another interesting statistic is that about half of the users were in their twenties. This is plausible given that people in their twenties generally have fewer responsibilities, such as caring for children or paying a mortgage. It could also reflect that they are still looking for thrills, which can be gained in the short term through adrenaline-raising activities such as gambling.
Another interesting statistic that came out of our data is that Germany is by far the best-represented country. Given that the companies doing the research were in Austria, the data might be skewed toward Germany because of its proximity to the research location.
Here are the top acquisition platforms, limited to those that brought in more than 100 users. Clearly, BETANDWIN.DE is the most successful referrer, bringing in over 22 thousand users during the period of February 1st until February 28th, 2005. The '.de' domain of the highest-performing acquisition website suggests that the website is German. Since Germany is by far the best-represented nationality, it is logical that the best-performing acquisition website is German.
Platform | Users_Acquired |
---|---|
BETANDWIN.DE | 22044 |
BETANDWIN.COM | 14502 |
BETEUROPE.COM | 2360 |
BETOTO.COM | 2144 |
BETANDWIN POKER | 483 |
BETANDWIN CASINO | 456 |
PLAYIT.COM | 327 |
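
A table like the one above could be produced with a count-and-filter step. The following is a minimal sketch on toy data: the `demo` data frame, its values, and the `Platform` column name are illustrative assumptions, not the names used in the original demographics database.

```r
library(dplyr)

# Toy stand-in for the demographics table (hypothetical column names and values).
demo <- data.frame(
  UserID   = 1:7,
  Platform = c("A.DE", "A.DE", "A.DE", "B.COM", "B.COM", "C.COM", "A.DE")
)

# The report used a threshold of 100; it is lowered here so the toy data passes the filter.
min_users <- 1

top_platforms <- demo %>%
  count(Platform, name = "Users_Acquired") %>%  # users per platform
  filter(Users_Acquired > min_users) %>%        # keep platforms above the threshold
  arrange(desc(Users_Acquired))                 # largest first

print(top_platforms)
```
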
Users who signed up between February 1st and 28th seemed to be most interested in sports betting, as the two sports book products make up most of the activity measured.
Gambling Product | Number of Users |
---|---|
Sports book fixed odds users | 40572 |
Sports book live action users | 25202 |
Poker users | 2353 |
Casino BossMedia users | 1862 |
Supertoto users | 833 |
Games VS users | 2530 |
Games bwin users | 1827 |
Casino Chartwell users | 4867 |
As we will explain below, to aggregate user activity we calculated the maximum profit and the maximum loss that each user recorded between February 1st and September 30th, 2005. Here, we averaged these maximum profit and loss variables across all users of each product to see how well each product is doing. While users make the highest average profit at Casino BossMedia, they also incur their highest average loss there. On the other hand, Supertoto users never really make a profit. It would be interesting to test whether the means of the profit and loss variables are statistically different in order to determine which product earns the highest margin.
Note: Because poker data was collected separately from the other products, we created different variables for it. The variables most similar to profit and loss were the maximum poker chip buy and sell amounts, as they also represent money flow to and from the company. Thus, poker statistics are summarized in their own table.
Gambling Product | Average Profit | Average Loss |
---|---|---|
Sports book fixed odds users | 84.7237830350981 | -63.7932823252489 |
Sports book live action users | 34.1906268986588 | -49.9792293786207 |
Casino BossMedia users | 91.6610670784103 | -148.060899785177 |
Supertoto users | -0.274633853541417 | -1.30714789915966 |
Games VS users | 10.6535182608696 | -61.4971819367589 |
Games bwin users | 10.4244630541872 | -28.9911491516147 |
Casino Chartwell users | 59.0799342510787 | -128.492677213889 |
Poker Chip Action | Average Maximum Amount Among Users |
---|---|
Maximum Poker Chip Sell Amount | 121.121201219159 |
Maximum Poker Chip Buy Amount | 157.277439402374 |
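
The per-product averages above, and the comparison of means suggested earlier, could be sketched as follows. The `product_summary` data frame and all of its values are made up for illustration; only the column names `max_profit` and `max_loss` follow the summary code in this report. Welch's t-test is one possible choice of test, not something the original analysis ran.

```r
library(dplyr)

# Hypothetical per-user summaries for two products (values are invented).
product_summary <- data.frame(
  Product    = rep(c("Product A", "Product B"), each = 4),
  max_profit = c(10, 20, 30, 40, 1, 2, 3, 4),
  max_loss   = c(-5, -15, -25, -35, -2, -4, -6, -8)
)

# Average of each user's maximum profit and maximum loss, per product.
averages <- product_summary %>%
  group_by(Product) %>%
  summarize(avg_profit = mean(max_profit), avg_loss = mean(max_loss))

# Welch two-sample t-test: do the mean maximum profits differ between the two products?
profit_test <- t.test(max_profit ~ Product, data = product_summary)

print(averages)
print(profit_test$p.value)
```

With real data, the same test would be repeated for each pair of products of interest, ideally with a correction for multiple comparisons.
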
Before beginning our analysis, we first wanted to remove all recorded activity in which the user played with promotional money, as that may not truly describe a user's gambling habits when using their own money. To do this, we filtered out all play activities recorded before the first pay date (i.e. the date the user first deposited money to play the games). We used the following code to do this:
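```r
library(dplyr)
# Attach each user's first pay date to the daily aggregates.
raw_user_daily_agg <- merge(x = raw_user_daily_agg, y = raw_demo[, c("UserID", "FirstPay")], by = "UserID", all.x = TRUE)
# Flag activity recorded before the first pay date as promotional, then drop it.
raw_user_daily_agg["Promotional"] <- raw_user_daily_agg$FirstPay > raw_user_daily_agg$Date
raw_user_daily_agg <- filter(raw_user_daily_agg, !Promotional)
```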
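
To make the effect of this filter concrete, here is a small sketch on invented data. The column names `UserID`, `Date`, `Stakes`, and `FirstPay` follow the report, the values are made up, and `left_join` is used in place of `merge` purely for brevity.

```r
library(dplyr)

# Toy daily activity and demographics (invented values).
daily <- data.frame(
  UserID = c(1, 1, 1, 2),
  Date   = as.Date(c("2005-02-01", "2005-02-05", "2005-02-10", "2005-02-03")),
  Stakes = c(5, 10, 20, 15)
)
demo <- data.frame(
  UserID   = c(1, 2),
  FirstPay = as.Date(c("2005-02-05", "2005-02-03"))
)

# Keep only activity on or after each user's first pay date.
own_money <- daily %>%
  left_join(demo, by = "UserID") %>%
  filter(Date >= FirstPay)

print(own_money)
```

User 1's activity on February 1st falls before the first pay date of February 5th, so it is treated as promotional play and dropped; the other three rows are kept.
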
Then, in order to analyze user behavior, we created various metrics. For all of the products except poker, we calculated user loyalty, consumption, addiction, and engagement. We defined loyalty as the length of the relationship in days (LOR_days_active), consumption as the total amount staked (stakes_total), addiction as the betting frequency (bet_freq), and engagement as the number of days active (Nbr_days_active). Note that engagement differs from loyalty: loyalty is the number of days between the first play and the last play, while engagement is the total number of days on which the user actually played.
We also calculated summary statistics, such as min, max and total for the number of bets, the amount of winnings and the amount of stakes. Finally, we created two variables for the largest amount lost in one day as well as the largest profit in one day.
To create these metrics, we used the following code:
```r
# `x` is assumed to hold the product description currently being summarized.
raw_user_daily_agg %>%
  filter(ProductDescription == x) %>%
  group_by(UserID) %>%
  summarize(
    first_active_date  = min(Date),
    last_active_date   = max(Date),
    # Loyalty: days between the first and the last play.
    LOR_days_active    = difftime(max(Date), min(Date), units = "days"),
    Days_of_Inactivity = difftime(as.Date("2005-09-30"), max(Date), units = "days"),
    # Engagement: number of distinct days on which the user played.
    Nbr_days_active    = n_distinct(Date),
    bets_total         = sum(Bets),
    max_bet            = max(Bets),
    min_bet            = min(Bets),
    # Addiction: bets per day of relationship (0 for single-day users).
    bet_freq           = ifelse(LOR_days_active == 0, 0, bets_total / as.double(LOR_days_active, units = "days")),
    win_total          = sum(Winnings),
    max_win            = max(Winnings),
    min_win            = min(Winnings),
    # Consumption: total amount staked.
    stakes_total       = sum(Stakes),
    max_stake          = max(Stakes),
    min_stake          = min(Stakes),
    max_loss           = min(Profit_Loss),  # most negative daily result
    max_profit         = max(Profit_Loss)   # most positive daily result
  )
```
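
As a small worked illustration of how loyalty and engagement differ, consider one hypothetical user who played on three separate days over a ten-day span. The data below are invented; the metric names and formulas mirror the summary code above.

```r
library(dplyr)

# One hypothetical user: two records on the first day, then two later days.
toy <- data.frame(
  UserID = c(1, 1, 1, 1),
  Date   = as.Date(c("2005-02-01", "2005-02-01", "2005-02-04", "2005-02-11")),
  Bets   = c(2, 3, 5, 10)
)

toy_summary <- toy %>%
  group_by(UserID) %>%
  summarize(
    # Loyalty: days between first and last play.
    LOR_days_active = as.numeric(difftime(max(Date), min(Date), units = "days")),
    # Engagement: distinct days actually played.
    Nbr_days_active = n_distinct(Date),
    bets_total      = sum(Bets),
    # Betting frequency: bets per day of relationship.
    bet_freq        = ifelse(LOR_days_active == 0, 0, bets_total / LOR_days_active)
  )

print(toy_summary)
```

Here loyalty is 10 days while engagement is only 3 days, and the 20 bets spread over the 10-day relationship give a betting frequency of 2 bets per day.
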
Because the information on the poker product came from a different database, the summary statistics calculated were slightly different but essentially followed the same logic. For poker, we split the data into buying poker chips and selling poker chips. Again, we calculated user loyalty (LOR_days_active), consumption (total_amount), addiction (avg_amount_per_day), and engagement (Nbr_days_active). The code below shows the buy side; the sell side follows the same pattern.
```r
raw_poker_chip %>%
  filter(TransType == "buy") %>%
  group_by(UserID) %>%
  summarize(
    first_active_date  = min(TransDate),
    last_active_date   = max(TransDate),
    # Loyalty: days between the first and the last transaction.
    LOR_days_active    = difftime(max(TransDate), min(TransDate), units = "days"),
    Days_of_Inactivity = difftime(as.Date("2005-09-30"), max(TransDate), units = "days"),
    # Engagement: number of distinct transaction dates.
    Nbr_days_active    = n_distinct(TransDate),
    # Consumption: total amount of chips bought.
    total_amount       = sum(TransAmount),
    max_amount         = max(TransAmount),
    min_amount         = min(TransAmount),
    avg_amount         = mean(TransAmount),
    # Addiction: amount per active day, using the columns defined above in this summary.
    avg_amount_per_day = total_amount / Nbr_days_active
  )
```
Once we created all of the variables mentioned above for all of the products, we merged them with the relevant demographic variables (Country, Language, Registration Date, Application/Acquisition, and Gender) from the demographics database. We also added the age variable from the analytics database, as we did not have access to the users' birth dates.
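
The merge step could look roughly like the sketch below. The three data frames, their values, and the suffixed column names are hypothetical stand-ins; the actual data mart joins one summary table per product.

```r
library(dplyr)

# Hypothetical per-user tables (invented names and values).
demo <- data.frame(UserID = c(1, 2, 3), Country = c("Germany", "Austria", "Germany"))
sports_summary <- data.frame(UserID = c(1, 2), stakes_total_sports = c(100, 50))
poker_summary  <- data.frame(UserID = c(2, 3), total_amount_poker_buy = c(30, 80))

# Start from demographics so every user keeps one row,
# even if they have no activity in a given product.
data_mart <- demo %>%
  left_join(sports_summary, by = "UserID") %>%
  left_join(poker_summary,  by = "UserID")

print(data_mart)
```

Users with no activity in a product end up with `NA` in that product's columns, which keeps the data mart at one row per user.
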
To conclude, the data mart created contains all of the demographics and metrics described above for each of the users in the databases.