???
Chess has a finite number of moves/states. Football is infinite.
Football is absolutely not infinite: there’s a limited number of players in a limited space. The barrier to modelling isn’t resolution; we can accurately model the weather, after all.
We can absolutely build a predictive model using some kind of tokenisation plus traditional feature extraction on top of current video segmentation and labelling techniques. Even a relatively low-resolution 3D grid would capture most of the relevant features of the players and the ball. You could get sophisticated with pose estimation and kinematic modelling of player body positions, but a simple CNN classifier that puts each player into a few minimal pose buckets (standing, jumping, falling, etc.) would work too for player state. The time series then gives you velocity and movement.
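To make the "time series gives you velocity" part concrete, here is a minimal sketch. It assumes you already have per-frame 3D positions in metres from some tracking step; the function names, the 25 fps frame rate and the 0.5 m cell size are all made up for illustration.

```python
from math import floor

def velocity(track, fps):
    """Finite-difference velocity between consecutive frames.

    track: list of (x, y, z) positions in metres, one per frame.
    fps: frame rate, assumed constant (an assumption of this sketch).
    Returns one (vx, vy, vz) tuple in metres/second per pair of frames.
    """
    return [
        tuple((b - a) * fps for a, b in zip(p0, p1))
        for p0, p1 in zip(track, track[1:])
    ]

def to_cell(position, cell_size):
    """Snap a continuous position to an integer cell of a coarse 3D grid."""
    return tuple(floor(c / cell_size) for c in position)

# Toy data: a ball tracked over three frames at 25 fps.
ball = [(0.0, 0.0, 0.0), (0.4, 0.0, 0.1), (0.8, 0.0, 0.2)]
ball_velocity = velocity(ball, fps=25)
ball_cells = [to_cell(p, cell_size=0.5) for p in ball]
```

Real tracking data is noisy, so in practice you would probably smooth the positions before differencing. The grid cell plus a pose bucket per player would then be the "token" for each frame.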
Once you’ve done that you can model goals and produce some kind of similarity metric which allows you to assign difficulty scores.
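One simple way to turn a similarity metric into a difficulty score is nearest neighbours: look up the most similar historical shots and see how often they went in. This is only a sketch of that idea, not a claim about how any real xG model works. The two features (distance to goal, defenders in the path) and the history are invented for the example.

```python
from math import dist

def knn_xg(shot, history, k=3):
    """Estimate scoring probability as the goal rate among the k most
    similar historical shots (Euclidean distance on the feature vector)."""
    if not history:
        raise ValueError("history must not be empty")
    nearest = sorted(history, key=lambda rec: dist(shot, rec[0]))[:k]
    return sum(scored for _, scored in nearest) / len(nearest)

# Made-up history: ((distance_to_goal_m, defenders_in_path), scored)
history = [
    ((6.0, 0), 1), ((7.0, 1), 1), ((8.0, 0), 0),
    ((25.0, 2), 0), ((28.0, 3), 0), ((30.0, 1), 0),
]

close_xg = knn_xg((6.5, 0), history)   # similar to the close-range shots
far_xg = knn_xg((27.0, 2), history)    # similar to the long-range shots
difficulty = 1.0 - close_xg            # harder chance = higher score
```

The features here are unscaled, so distance in metres dominates the metric. With real data you would normalise them or learn the metric.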
But I fully expect that’s what closed source xG models are doing.
TBH I wouldn’t be surprised if you could skip all of that these days and feed the video into a video transformer model similar to ChatGPT and get a direct prediction out. Next-token prediction is what autoregressive video generation models do: train on all the football footage you can find, run a bunch of next-frame predictions from the moment of the shot, and count how many end with the ball in the goal.
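The "run a bunch of predictions and count" step is just a Monte Carlo estimate. A rough sketch is below, with a toy random sampler standing in for the trained video model. `toy_future`, `toy_in_goal` and the Gaussian spread are placeholders I made up; a real version would sample generated frames and run a ball detector on them.

```python
import random

def rollout_xg(sample_future, ball_in_goal, n_rollouts=1000):
    """Fraction of sampled futures in which the ball ends up in the goal."""
    hits = sum(ball_in_goal(sample_future()) for _ in range(n_rollouts))
    return hits / n_rollouts

# Toy stand-in for a generative model: the "future" is just the ball's
# lateral position (metres from goal centre) when it crosses the goal line.
rng = random.Random(0)
GOAL_HALF_WIDTH_M = 3.66  # a full-size goal is 7.32 m wide

def toy_future():
    return rng.gauss(0.0, 4.0)

def toy_in_goal(final_y):
    return abs(final_y) <= GOAL_HALF_WIDTH_M

estimate = rollout_xg(toy_future, toy_in_goal)
```

The estimate gets less noisy as `n_rollouts` grows, but each rollout of a real video model is expensive, so that trade-off would matter a lot in practice.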