Data
I used game data from Chess.com, restricted to games played by players with Elo ratings above 1900 and covering both Blitz and Classical formats. Because of computational limitations, I simplified the dataset in two ways. First, I created a categorical variable called win_rate, derived from the win probabilities associated with move patterns in the first few moves of each game. Second, I converted the opening column into a factor variable representing how likely each opening is to lead to a win. This let me retain the key strategic signals while keeping the model computationally manageable.
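To make the opening encoding concrete, here is a minimal Python sketch of one way it could be done. It is an illustration rather than the exact pipeline used for this project. The field names (white_elo, black_elo, opening, white_won), the toy records, the choice to require both players to be rated above 1900, the definition of a win as a win for White, and the 0.4 and 0.6 cut-offs are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy records; the field names are assumptions, not the real schema.
games = [
    {"white_elo": 2050, "black_elo": 1980, "opening": "Sicilian Defense", "white_won": True},
    {"white_elo": 2100, "black_elo": 2005, "opening": "Sicilian Defense", "white_won": True},
    {"white_elo": 1950, "black_elo": 1920, "opening": "Sicilian Defense", "white_won": False},
    {"white_elo": 2200, "black_elo": 2150, "opening": "Queen's Gambit", "white_won": False},
    {"white_elo": 1990, "black_elo": 2010, "opening": "Queen's Gambit", "white_won": True},
    {"white_elo": 1850, "black_elo": 2000, "opening": "London System", "white_won": True},
    {"white_elo": 2000, "black_elo": 1950, "opening": "London System", "white_won": False},
]

MIN_ELO = 1900  # rating threshold stated in the text


def filter_games(records, min_elo=MIN_ELO):
    """Keep games where both players are rated above min_elo (an assumption)."""
    return [
        g for g in records
        if g["white_elo"] > min_elo and g["black_elo"] > min_elo
    ]


def opening_win_rates(records):
    """Return the empirical share of White wins for each opening."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for g in records:
        totals[g["opening"]] += 1
        wins[g["opening"]] += int(g["white_won"])
    return {name: wins[name] / totals[name] for name in totals}


def to_level(rate, low=0.4, high=0.6):
    """Bin a win rate into a categorical level; the cut-offs are illustrative."""
    if rate < low:
        return "low"
    if rate <= high:
        return "medium"
    return "high"


def encode(records):
    """Filter by rating, then attach a categorical opening_level to each game."""
    strong = filter_games(records)
    rates = opening_win_rates(strong)
    return [{**g, "opening_level": to_level(rates[g["opening"]])} for g in strong]


encoded = encode(games)
for row in encoded:
    print(row["opening"], row["opening_level"])
```

Replacing each opening name with a small number of levels keeps the feature compact, which fits the computational constraint described above. One caveat with this kind of encoding is that computing the rates on the same games later used to fit the model can leak outcome information, so the rates would normally be estimated on a training split only.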