Story Board: UEFA EURO 2016 knockout stages through the eyes of data

[Bracket graphic. Teams shown: Poland, Wales, Portugal, Belgium, France, Germany, Italy, Iceland.]

Data says FRANCE will triumph at EURO 2016.

Data and statistics we used: Euro 2008, Euro 2012, the 2014 FIFA World Cup, and all international matches played since 2014.

Features we used: Eight features, namely FIFA rank, team cost, cap sum of the team, total goals of the team, total goals of the best three players, goalkeeper rank, average age, and home/away win percentage.

Algorithms and methods: We used a Naïve Bayes classifier to calculate the probability of winning a match given all the important attributes.

Our assumptions: Belgium has not participated in the Euro since 2000, so no recent data was available for it. We made some assumptions to cover that gap.

Prediction Algorithm Example
• Let's say France and Germany are playing a semi-final at EURO 2016. We have previous statistics for both teams.
• Given these statistics tables, we can calculate the conditional probability of every attribute given the match result (win/lose/draw).
• Let's calculate the probability that a team (say Germany) has a high FIFA rank given that it won the match,
• i.e. P(FIFA rank = high | result = win).
• That comes out to 0.794.
• Similarly, we can calculate this for the other attributes, such as team cost, cap sum, etc.
• Let's calculate the overall probability of a win for Germany, i.e.
• P(win) = total wins / total matches played = 0.76, and P(lose) = 1 - P(win) = 0.24.
• Now we need to calculate the conditional probability of a particular team winning given all the attributes.
• That is, the probability of Germany beating France given that Germany's FIFA rank is higher than France's, its team cost is higher, its team goals are higher, its Best_3 score is higher, its GK rank is low, and its average age is also low.
• That comes out to 0.3330, which means Germany has the lower chance of winning according to Naïve Bayes.

Disclaimer: Please note that all predictions are based purely on statistical, scientific data. Even so, many thousands (potentially millions) of other unpredictable factors could influence the outcome of a match (accidents etc.).
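
The steps in the prediction example can be sketched in Python. This is only a minimal illustration of a Naïve Bayes calculation, not the code used for the predictions: the match records, the attribute names (fifa_rank, team_cost) and the helper functions fit and posterior are all hypothetical, and only two of the eight features are shown.

```python
from collections import Counter, defaultdict
from math import prod

# Hypothetical match records (NOT the data behind the predictions): each row
# holds discretised attributes for one team in one match, plus the result.
matches = [
    ({"fifa_rank": "high", "team_cost": "high"}, "win"),
    ({"fifa_rank": "high", "team_cost": "low"}, "win"),
    ({"fifa_rank": "low", "team_cost": "high"}, "win"),
    ({"fifa_rank": "low", "team_cost": "low"}, "lose"),
    ({"fifa_rank": "high", "team_cost": "high"}, "lose"),
]


def fit(records):
    """Estimate P(result) and P(attribute = value | result) by counting."""
    class_counts = Counter(result for _, result in records)
    value_counts = defaultdict(Counter)  # (result, attribute) -> value counts
    for attrs, result in records:
        for name, value in attrs.items():
            value_counts[(result, name)][value] += 1

    # Prior: share of matches that ended with each result.
    priors = {c: n / len(records) for c, n in class_counts.items()}

    def likelihood(name, value, result):
        # Conditional probability of one attribute value given the result.
        return value_counts[(result, name)][value] / class_counts[result]

    return priors, likelihood


def posterior(priors, likelihood, observed):
    """Combine prior and per-attribute likelihoods, then normalise."""
    scores = {
        c: p * prod(likelihood(n, v, c) for n, v in observed.items())
        for c, p in priors.items()
    }
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


priors, likelihood = fit(matches)
result = posterior(priors, likelihood, {"fifa_rank": "high", "team_cost": "high"})
print(result)
```

With these made-up records the win probability comes out at about 0.73. A real run would use all eight features and the actual match history, and would likely need smoothing so that an attribute value never seen with a given result does not force that result's probability to zero.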