The Rabbit Hole: Cracking the FPL Price Algorithm (Part 1 of 7)

Edit (15/2/26): People have already started looking at my data and offering help. I’ve noticed an error in the fall model: the honest test F1 is 0.60, not 0.63. The previous number had test-set selection bias baked in.
Everyone who plays FPL has watched a player rise or fall in price and wondered why. I decided to actually find out, or at least try.
What started as a weekend curiosity turned into 4 seasons of data, 720,000 rows, machine learning models, and a server that never sleeps. This is the story of what I found.
The moment it started
I transferred out a player. He rose overnight. I lost 0.1m.
If you play FPL you know the feeling. You stare at the app and think: how does this actually work? The price went up, but why him? Why not the other player with more transfers? What are the rules?
The FPL website tells you almost nothing. There’s no documentation. No explanation. Just prices that change overnight while you sleep, decided by an algorithm that nobody outside of FPL towers has ever seen.
So I did what any reasonable person would do. I opened the API.
What everyone thinks they know
There’s a lot of conventional wisdom about FPL prices. Most of it is wrong.
“It’s based on net transfers.” Sort of, but not how you think.
“The most transferred-in player always rises.” Definitely not.
“fplstatistics will tell you who’s going to rise.” Sometimes. Their accuracy is… we’ll get to that.
Here’s what the FPL website actually tells you about price changes: nothing. But the API, the public endpoint that every FPL app and tool uses, gives you three key fields per player, updated once a day:

transfers_in_event, transfers_out_event, and selected_by_percent. That’s it. That’s your starting material. Three numbers per player per day, and from those three numbers, sites like fplstatistics.co.uk try to predict which players will change price.
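If you want to see those three numbers for yourself, here’s a minimal pull in Python. The endpoint and field names are exactly as they appear in the public API response:

```python
import requests

# One unauthenticated GET returns every player in the game.
data = requests.get(
    "https://fantasy.premierleague.com/api/bootstrap-static/", timeout=30
).json()

# The three fields that matter for price changes, per player, per day.
for p in data["elements"][:5]:
    print(p["web_name"], p["transfers_in_event"],
          p["transfers_out_event"], p["selected_by_percent"])
```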
The question is whether you can do better. And the answer is yes, but it’s going to take a while.
The first discovery that hooked me
I pulled the API for the first time and plotted net transfers against “did this player rise the next day.” And immediately something didn’t add up.

Thiago got 413,851 net transfers in a single day. He didn’t rise. The same week, Keane rose on just 17,648 net transfers (1 Dec 2025), and he went on to rise twice more that month on higher numbers.
Thiago is owned by 29.2% of managers. Keane is owned by 2.3%.
That difference matters. A lot. But I didn’t know that yet. All I knew was that the simple “most transfers = price rise” model was obviously wrong, and now I needed to know what the actual model was.
This is how rabbit holes start. You notice one thing that doesn’t make sense, and instead of moving on with your life, you pull on the thread.
Haaland got 335,160 net transfers one day in September. 47.4% of all managers brought him in. He didn’t rise. If the most transferred player in the game, on one of the biggest transfer days of the season, doesn’t rise, then whatever the algorithm is doing, it’s not what people think it’s doing.
The scale of the problem
Let me give you the numbers so you understand what we’re dealing with.

720,254 player-days across 4 seasons. That’s roughly 800 players tracked every day for 4 years.
Of those 720,254 player-days, exactly 2,035 were rises. That’s 0.28%.
Falls are more common: 7,934 of them, about 1.1%. But that’s a different story for later.
So the task is this: find the pattern in 0.28% of the data. It’s like trying to spot a specific person in a crowd of 350, except the crowd changes every day and the person is wearing the same clothes as everyone else.
If you built a model that just said “nobody will rise today, ever”, it would be correct 99.7% of the time. And completely useless.
This is the fundamental problem: not finding the signal, but finding it in a sea of noise where almost everything is “nothing happened.”
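To make the “accurate but useless” point concrete, here’s the trivial baseline worked through, using nothing but the counts above:

```python
# The "nobody rises today, ever" baseline.
total_player_days = 720_254
rises = 2_035

accuracy = (total_player_days - rises) / total_player_days
print(f"accuracy: {accuracy:.2%}")    # ~99.72% of predictions correct
print("rises caught: 0 of 2,035")     # recall 0, F1 0: accurate and useless
```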
The decision to go deep
At this point a normal person would have read a Reddit thread about it and moved on. I decided I needed every daily snapshot of every FPL player for the last 4 seasons.
The Wayback Machine (that site that archives the entire internet) has snapshots of the FPL API going back years. Every bootstrap-static endpoint, every day, every player. Someone at archive.org is doing god’s work and I doubt they know that one of the beneficiaries is a bloke trying to reverse-engineer a fantasy football algorithm.
Three seasons of historical data from the Wayback Machine. One season of live data from a Supabase pipeline I built to capture the current season in real time. 122,000 records from the live collection alone.
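For anyone who wants to replicate the historical half: the Wayback Machine exposes a CDX index you can query for captures of the endpoint. The sketch below is a rough illustration of that approach, not my exact collection code; the date range and parameter choices are just one reasonable configuration.

```python
import requests

FPL_API = "https://fantasy.premierleague.com/api/bootstrap-static/"
CDX = "https://web.archive.org/cdx/search/cdx"

# Ask the CDX index for at most one successful capture per calendar day.
params = {
    "url": FPL_API,
    "from": "20220801",
    "to": "20230601",
    "output": "json",
    "filter": "statuscode:200",
    "collapse": "timestamp:8",   # first 8 chars of the timestamp = YYYYMMDD
    "fl": "timestamp,original",
}
captures = requests.get(CDX, params=params, timeout=30).json()[1:]  # row 0 is the header

for timestamp, original in captures:
    # The "id_" flag returns the archived JSON body without Wayback's HTML wrapper.
    snap = requests.get(f"https://web.archive.org/web/{timestamp}id_/{original}", timeout=60)
    players = snap.json()["elements"]
    # ...store each player's price, ownership and transfer fields for that day
```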

“I’ll just scrape a few weeks of data” became “I need everything from 2022 onwards.” The dataset landed at 720,254 rows in a single parquet file, one row per player per day, with price, ownership, transfers in, transfers out, form, status, the lot.
This file, combined_all_seasons.parquet, became the centre of everything that followed. Every experiment, every model, every discovery started with loading it.
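Loading it is as dull as it sounds. Column names here are the ones listed in the technical sidebar at the end of this post:

```python
import pandas as pd

# One row per player per day, four seasons in one file.
df = pd.read_parquet("combined_all_seasons.parquet")

print(len(df))  # 720,254
print(df[["player_id", "date", "price", "ownership_percent"]].head())
```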
The research begins
Phase 1 was boring but necessary: data quality. Timestamp consistency. Missing days. Duplicates. Leakage checks, making sure I wasn’t accidentally using tomorrow’s data to predict today. The kind of work that nobody writes blog posts about because it’s tedious and important in equal measure.
Phase 2 was more interesting: classifying each day as rise, fall, or no-change. Matching price changes to the daily snapshots. Building the target variable.
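In code, the target construction is roughly this: compare each player’s price today with their price in tomorrow’s snapshot. A sketch only; the real labels were also cross-checked against official price change records:

```python
import pandas as pd

df = pd.read_parquet("combined_all_seasons.parquet").sort_values(["player_id", "date"])

# Each row is the state *before* the overnight decision, so the label for day t
# compares the price on day t with the same player's price on day t+1.
next_price = df.groupby("player_id")["price"].shift(-1)
df["price_change_daily"] = next_price - df["price"]

df["is_rise"] = (df["price_change_daily"] > 0).astype("int8")
df["is_fall"] = (df["price_change_daily"] < 0).astype("int8")

print(df["is_rise"].mean(), df["is_fall"].mean())  # ~0.0028 and ~0.011
```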
And immediately there were puzzles. Players who rose with surprisingly low transfers. Players who didn’t rise despite massive demand. Days where nothing happened even though the transfer market was going mad.
| Season | Records | Rises | Falls | Days |
|---|---|---|---|---|
| 2022-23 | 156,011 | 453 | 2,093 | 222 |
| 2023-24 | 231,700 | 620 | 1,945 | 300 |
| 2024-25 | 202,361 | 571 | 2,006 | 280 |
| 2025-26 | 130,182 | 391 | 1,890 | 175 |
| Total | 720,254 | 2,035 | 7,934 | 977 |
Each of those 2,035 rises had a story. Each one was a decision the algorithm made based on rules I couldn’t see. But the decisions were all in the data. I just couldn’t see the pattern yet.
That realisation, that the answer was already there, sitting in 720,000 rows waiting to be found, is what turned a weekend project into a months-long obsession.
A disclaimer before we go any further

Technical Sidebar: The Data Pipeline
If you’re here for the data science, here’s what the pipeline looks like:
– Source: FPL API bootstrap-static endpoint, one snapshot per day
– Historical: 3 seasons scraped from Wayback Machine (2022-23 through 2024-25)
– Live: 2025-26 season collected via Supabase with a daily cron job (122k records in player_daily_activity.json)
– Master dataset: combined_all_seasons.parquet, 720,254 records, one row per player per day
– Fields per row: player_id, date, price, ownership_percent, transfers_in/out (daily and event), form, status, news, team, position, season, gameweek
– Labels: is_rise and is_fall, derived from price_change_daily (matched against official price change records)
– Validation scripts: p1_extract_timestamps.py, p1_missing_days.py, p1_leakage_check.py, p1_duplicates.py
– Class balance: 0.28% rises, 1.10% falls, 98.62% no change
The single biggest data quality issue was timestamp alignment, making sure each row represented the state of the player before the price change decision, not after. Getting this wrong means leaking future information into the model, which makes your results look amazing and your predictions worthless.
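For flavour, here is the kind of check those validation scripts run, in the spirit of p1_duplicates.py and p1_missing_days.py (script names from the list above; this is not their actual contents):

```python
import pandas as pd

df = pd.read_parquet("combined_all_seasons.parquet")
df["date"] = pd.to_datetime(df["date"])

# No player should appear twice in the same daily snapshot.
dupes = df.duplicated(subset=["player_id", "date"]).sum()
print(f"duplicate player-days: {dupes}")

# How many calendar days are missing between each season's first and last capture?
span = df.groupby("season")["date"].agg(first="min", last="max", captured="nunique")
span["expected"] = (span["last"] - span["first"]).dt.days + 1
span["missing"] = span["expected"] - span["captured"]
print(span[["captured", "missing"]])
```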
What’s next
I had the data. 720,000 rows across 4 seasons. Now I just needed to figure out what the algorithm was actually doing with it.
Spoiler: it took months. And the answer was both simpler and more complicated than I expected.
Next: Part 2 — “720,000 Rows of Obsession”
This is Part 1 of a 7-part series about reverse-engineering the FPL price change algorithm. The research behind this series powers fplcore.com.