The 2022 World Cup is being held in Qatar, now without the German team after its unsurprising first-round exit, and it is dividing opinion: a global sporting event is pitted against a political issue, usually an unfavorable combination. It is undisputed, however, that correctly predicting soccer outcomes on and off the pitch is a topic that occupies many people, from professional coaches to bettors. One approach to narrowing down results and performances in advance, and even to predicting an outcome, is "Deep Soccer," the Machine Learning Soccer Predictor by Dr. Miguel Gonzalez, who works in the Data Science department at Cognizant Mobility. We talked to him, and today we present, in essential brevity, what exactly "Deep Soccer" is, how you can use the tool, and why you should still not bet money on it.
What exactly is Deep Soccer, the machine learning predictor for soccer results?
Basically, Deep Soccer is an Artificial Intelligence that predicts both match results and player performances using industrial Machine Learning mechanisms. For a more detailed insight into this exciting subject, we recommend our articles on clustering and classification, without which training a model would not be possible.
While Deep Soccer started out as the project of an enthusiastic Data Science fan, namely Dr. Miguel Gonzalez, who simply wanted to combine sports, Machine Learning and result-oriented Data Science, it has grown into a concrete project that even the press is starting to take more interest in. Those who speak Spanish (or want to use their browser’s translation function) can view one of the articles at this link. It should be noted, however, that this is a private project and not a work from the Mobility Rockstars or Cognizant Mobility environment.
So what can Deep Soccer do now?
The AI-based tool predicts two things. The first is the result of a soccer match, where the league or, as in this case, the World Cup can be specifically selected. After a match is entered, the tool outputs a variety of values: the expected number of goals per half, how many fouls, corners and cards there might be, and a few more metrics.
Of course, this part is especially fun, and during the World Cup in particular it is not without charm to compare your own tips with those of the AI. Naturally, the outputs are forecasts, not guarantees: the German team, for example, lost to the passionate islanders of Japan despite a predicted probability of victory of around 50%. You still have to make something of your opportunities. The Predictor did, however, correctly predict a German victory against Costa Rica, even though, despite the win, the team once again failed in the group stage of a World Cup and will have to watch the rest of the tournament from the stands.
Deep Soccer: More than just a game
The headline anticipates it: Deep Soccer can do more than “just” predict games. An important component of the tool, which is not yet available online, is the prediction of player performance. So if you are interested in how a Cristiano Ronaldo would do in a match against FC Bayern despite his advanced age (and advanced attitude), you can also have this checked with Deep Soccer. How many goals could such a Ronaldo score, how many fouls does he commit, how often does he shoot with his left foot or his right? How many shots toward goal does he fire, how many corners does he take? These values can be predicted with some precision based on various data sets (more on this in a moment), and while they leave room for interpretation, they can certainly give coaches, for example, indications for building their strategy, which is also the declared goal of the Machine Learning Football Predictor.
If you would like to see this part of the tool in action, you can have Dr. Miguel Gonzalez explain it to you himself in this YouTube video. It’s in Spanish, but just turn on subtitles and go for it.
Deep Soccer Machine Learning Football Predictor: And where does the data come from?
When it comes to Data Science, it is, of course, all about data. Training a model for an artificial intelligence requires training data. This, however, is easy to come by, especially in the field of soccer. Platforms like Kaggle offer many kinds of datasets, many websites offer historical data about soccer games and players, FIFA’s data is public, and even games like FIFA 23 offer extensive datasets that can be used to train an AI.
The real difficulty, however, lies not in obtaining the data but in blending it into a single database. In line with the Pareto principle, this step, which plays a smaller role in the final execution, accounts for almost 80 percent of the effort, because clean results can only be predicted with clean data.
So the next step is to verify and improve this data, keyword “data quality”. Errors need to be cleaned up and the data properly prepared.
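As a rough illustration of the blending and cleaning described above, the following sketch merges two tiny, made-up player tables with pandas. The column names, the sample values and the cleaning rules are assumptions chosen for illustration; they are not the actual Deep Soccer pipeline.

```python
import pandas as pd

# Two hypothetical sources describing the same players (all values invented).
stats = pd.DataFrame({
    "player": ["A. Example ", "b. sample", "C. Placeholder", "b. sample"],
    "goals": [12, 7, None, 7],
})
ratings = pd.DataFrame({
    "player": ["a. example", "B. Sample", "C. Placeholder"],
    "rating": [88, 81, 79],
})


def normalize_names(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with trimmed, lower-cased player names as the join key."""
    out = df.copy()
    out["player"] = out["player"].str.strip().str.lower()
    return out


def blend(stats: pd.DataFrame, ratings: pd.DataFrame) -> pd.DataFrame:
    """Merge both sources into one table and drop unusable rows."""
    left = normalize_names(stats).drop_duplicates()
    right = normalize_names(ratings).drop_duplicates()
    # An inner join keeps only players that appear in both sources.
    merged = left.merge(right, on="player", how="inner")
    # Rows with missing values are simply discarded in this sketch.
    return merged.dropna().reset_index(drop=True)


clean = blend(stats, ratings)
print(clean)
```

Most of the work sits in the unglamorous steps: agreeing on a join key, removing duplicates and deciding what to do with gaps.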
The Deep Soccer tool, built with Python as its main technology, then applies algorithms to process this data and train the model. In industrial machine learning, one usually tries several algorithms to see which works best, and not every algorithm is suitable, especially for small data sets. Random Forest, an evolution of the well-known Decision Trees, lent itself to predicting player performances and can account for the multiple decisions that lead to a forecast.
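To give an idea of what a Random Forest for player performance can look like, here is a minimal sketch using scikit-learn's RandomForestRegressor. The three features, the training rows and the target are invented for illustration; the article does not describe which features or settings Deep Soccer actually uses.

```python
from sklearn.ensemble import RandomForestRegressor

# Invented training rows: [minutes played, shots per game, opponent strength 0-1]
X_train = [
    [90, 4.1, 0.3],
    [85, 3.5, 0.6],
    [60, 1.2, 0.8],
    [90, 5.0, 0.2],
    [70, 2.0, 0.7],
    [45, 0.8, 0.9],
]
# Invented target: goals the player scored in that match.
y_train = [2, 1, 0, 3, 1, 0]

# Each of the 100 trees is fitted on a bootstrap sample of the rows;
# the forest averages the predictions of the individual trees.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Forecast for a hypothetical upcoming match.
predicted_goals = model.predict([[90, 4.5, 0.4]])[0]
print(f"Predicted goals: {predicted_goals:.2f}")
```

Because the forecast is an average over many trees, it always stays within the range of the target values seen during training.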
To train the model for the match results, Neural Networks were the first approach. Deep Learning usually requires quite a lot of data, however, so Linear Regression prevailed here as a workable solution and now ensures that Deep Soccer can output many exciting results.
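The principle behind Linear Regression can be sketched in a few lines of plain Python. This is a deliberately simplified example with a single, invented feature (the rating difference between two teams) and made-up results; the real model presumably uses more inputs, which the article does not list.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Invented history: rating difference (home minus away) and goal difference.
rating_diff = [-10, -5, 0, 5, 10]
goal_diff = [-2, 0, 0, 1, 3]

slope, intercept = fit_line(rating_diff, goal_diff)

# Forecast for a hypothetical match in which the home side is rated 8 points higher.
expected_goal_diff = slope * 8 + intercept
print(f"Expected goal difference: {expected_goal_diff:.2f}")
```

With only two parameters to estimate, such a model gets by with far less data than a neural network, which matches the trade-off described above.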
Deep Soccer – Can I bet on it? And combine my own results and data with the tool?
Of course, this subheading arouses desires. Thanks to artificial intelligence, can you now predict the games and performances of teams and players and have your wallet gilded at the betting office? No, of course not. These are still calculated forecasts. They are certainly based on reliable values, and the predictions will become more precise as the project continues to develop, so that they can also serve as a basis for coaches, managers and teams, for example. At least that is the hope of Dr. Miguel Gonzalez.
The good news, of course, is that Deep Soccer was built with an API, or interface. So individual results can not only be retrieved manually on the page: in a few minutes, thousands of results can theoretically be predicted. The result is a new data set that can be reused.
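How such a batch run might look is sketched below. Since the article does not describe the endpoints or parameters of the Deep Soccer API, the function predict_match is a purely hypothetical stand-in that returns dummy numbers; in practice it would be replaced by a request to the real interface.

```python
from itertools import permutations


def predict_match(home, away):
    """Hypothetical stand-in for one call to the prediction interface.

    Returns dummy numbers derived from the name lengths so that the
    sketch runs without network access.
    """
    return {
        "home": home,
        "away": away,
        "home_goals": len(home) % 4,
        "away_goals": len(away) % 4,
    }


def predict_all(teams):
    """Collect a prediction for every ordered home/away pairing."""
    return [predict_match(home, away) for home, away in permutations(teams, 2)]


teams = ["Germany", "Japan", "Spain", "Costa Rica"]
results = predict_all(teams)
print(len(results), "predictions collected")
```

The collected list of dictionaries can then be stored or loaded into a table and reused like any other data set.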