League of Legends Champion Recommendations

Introduction

League of Legends is a popular online game by Riot Games where two teams of five players compete to destroy the opposing team's base. Each individual player chooses a champion and a role to fill on their team. There are over 130 champions in the game and, in the current meta, there are five roles to fill: top lane, middle lane, support (bottom lane), attack damage carry (called adc, also bottom lane), and jungle (between lanes). The three lanes and jungle region can be clearly seen in the map below.

League of Legends map

Almost every player in League has a favorite champion and role. Personally, I enjoy playing two somewhat niche characters in the top lane: Quinn and Urgot. Both are ranged champions with unique identities — Quinn can quickly assassinate enemies and Urgot can tank a lot of damage. These champions are not particularly well-suited for any role and, as a result, players rarely pick them (unless a recent patch has made them overpowered). This got me thinking: "If I enjoy playing Quinn in the top lane, what other champions might I enjoy?"

In this post, I'll walk through a somewhat simple approach to iteratively developing a champion recommendation system for the game League of Legends.

Data collection

The first step in building a recommendation system is to collect relevant data. Classic recommendation systems look for similarities between users based on how each user rates the same items. However, because players do not rate champions or roles, player preferences must instead be inferred from the roles and champions they choose in game. Essentially, this will have to be an implicit recommendation system.

Before collecting any game data, it is crucial to first determine which players we would want recommendations from. In League of Legends, players are ranked into different tiers based on their skill level. Using Riot's API, we can find and collect game data from players at any tier. For my algorithm, I chose to collect data from players who were ranked Gold as the Gold tier seems to offer a good balance of in-game skill (top 20% of all players) and off-meta champions (like Urgot).

Basic analysis

From the player data we collected, we need to find a value that can act as a proxy for a player's role+champion rating. One candidate is a player's in-role champion play frequency, which can be thought of as the probability that a player will choose champion X given that they are playing role A. Note that, unlike explicit ratings, these in-role champion play frequencies are inherently negatively correlated, as playing champion X more will decrease the the play frequency for all other champions in that role. As such, matrix completion methods may not be a viable route and we will instead use a more ad-hoc approach.

We can explore these play frequencies by plotting histograms for some popular champions in each role:

In-role champion play frequencies for popular champions. Play frequencies equal to zero are ignored.
Y-axis is player count.

With the exception of Adc Lucian and Ezreal, these distributions look exponential. We can use these distributions to choose criteria to guide us in determining which players we think should qualify to make a recommendation for a particular role+champ combo. If we have a lot of data, we could choose these criteria to be extremely stringent and only take recommendations from players who play a champion significantly more than the average (e.g. only recommendations from players in the upper quartile). However, given that we don't have much data, our goal here will be to build a recommendation system that considers recommendations from all players who play a champion more than average.

Round 1

One approach to calculating recommendations might be to use the sample Pearson correlation coefficient r between role+champion combinations:

where N is the number of players in the dataset, fⁱ_{X_A} is the in-role play frequency for player i on champion X in role A, and ⟨f_{X_A} ⟩ and s_{X_A} are the average and standard deviation across all players. The Y_B variables are defined simialrly. This approach would essentially calculate a recommendation from each player that scales with how frequently they play both champion X in role A and champion Y in role B. However, there are a few flaws with this approach. First, players who play both champions X and Y less than average would still contribute positively to r (and hence, they would "recommend" Y to people who play X). And second, a one-trick pony who plays champion X significantly more than the average will have a much larger contribution to r, thereby skewing the final recommendations towards their playstyle.

Instead of looking to the sample Pearson correlation coefficient, we could instead say that "if a player plays champion X in role A more than average and champion Y in role B more than average, they would 'recommend' champion Y to you." This approach avoids both of the issues discussed above and is more in line with our original goal. We'll consider recommendations from players who have: 1) played a minimum number of games in both roles, and 2) played a minimum number and fraction of games on champion X in role A. These restrictions will amplify the signal from players who main the selected role and champion.

Here are a few sample recommendations for the top, mid, and adc roles (click to enlarge):

The roles are top (), jungle (), mid (), support (), and adc (). As for N, when below a role icon, N is the number of players in the dataset who play enough games to make a recommendation for this role; when below a champion, N is represents the number of players who would recommend that champion to you. Ideally, a good set of recommendations would exhibit a sharp drop in N between groups of recommendations, indicating that certain recommendations are better than others. We can see this in Top Darius's adc recommendations, where it is clear that Jhin and Lucian are better recommendations than Jinx. However, this distinction in N is not always present. This could be an intrinsic property of the dataset or a consequence of the algorithm. Maybe we can try changing the algorithm a bit?

Round 2

What if we instead varied the strength of each recommendation from 0 to 1 according to how frequently each player plays both champions? We could use something in the spirit of the sample Pearson Correlation coefficient r, but with the following changes:

a player's contribution to r, the term inside the sum which we'll call rⁱ, is capped at a value r_max ;
a player's recommendation is equivalent to rⁱ/r_max ;
a player's recommendation is counted only if a player plays both champions more than average.

The resulting recommendations will give a stronger voice to those who main either champion while avoiding the issue of being completely skewed towards the playstyles of one-trick ponies.

Implementing this approach, we get the following results:

These recommendations look very similar to our previous recommendations. We could try to increase our minimum requirements for making a recommendation, but at this point, we might be going in circles. We'll instead try to evaluate the quality of our previous recommendations.

Evaluation

While we can get an idea of what might be a good or bad recommendation from the differences in N between champion recommendations, we need a statistical test to determine whether or not one recommendation is actually better than another. If we treat the recommendations for each champion as samples from a binomial distribution, we can use a chi-squared test to determine whether or not one champion is indeed recommended significantly more than another. From these results, we can even devise some sort of scoring system.

Let's rate our recommendations using a simple system where the rating of a recommendation starts at 3 stars and is reduced if:

the fraction of players who recommend the top champion is below some threshold,
the recommendation is not significantly different than the next best recommendations,
the previous recommendation is (likely) better.

Applying this approach to our three examples above, we get the following results:

These results seem quite reasonable. For Top Darius, all three top lane recommendations have the same rating, as we might expect given the fact that these champions had a similar number of recommendations regardless of the underlying algorithm. Additionally, we can see that the adc recommendations Jhin and Lucian are indeed significantly better than Jinx. If we look at Mid Lux, we see that the best recommendations for support are Lux and Morgana, which we might expect given that Lux and Morgana have similar skillsets.

This algorithm is currently implemented here with data from 2017.

Final thoughts

Given the fact that many of the top recommendations seem statistically significant, the resulting algorithm seems alright. Nevertheless, we could continue to expand and improve this algorithm in many ways. For example, we could:

implement a more rigorous approach to the ratings algorithm, as the frequent appearance of empty stars indicate the current algorithm may be too strict;
allow a player to provide their in-game name so we can tailor the recommendations to their playstyle by treating them as a superposition of different role+champion mains;
validate the roles in the raw data from Riot's API (is jungle Lux really a thing?);
tweak the recommendations to highlight or hide off-meta recommendations (again, jungle Lux?);
~~reimplement the algorithm for the 2017 season so the recommendations are more relevant.~~ Done.

And, what about my personal goal of understanding my fellow Top Quinn mains? Well, here are the results:

It turns out that Top Quinn mains will play Quinn in almost any role. In fact, this trend is also present in the recommendations for Mid, Jungle, and Adc Quinn. I guess birds of a feather do flock together.