On a daily basis, the automation tasks that engineers deal with are usually pretty straightforward; often something as simple as “send an email reminder after 2 days of inactivity”. These automations are very valuable. However, from time to time we’re faced with some really tough automation challenges.
Here’s one of them.
(Spoiler alert: Engineers had to work together with data scientists to solve it!)
We recently launched our own locker network. While the network is growing fast, it’s still small compared to our marketplace. We have more users wanting to use our locker network than we have capacity to serve them. That means we need to control the inflow of shipments to our network. How can we manage the flow most effectively?
On the surface, the problem looks very simple. Why don’t we just limit the number of shipping labels sold per day? The problem with this idea is that the lockers would be poorly utilised.
Fundamentally, the key challenge is that the physical world is not instant. Sellers on Vinted have 4 days to ship the parcel, and then it takes another ±2 days to deliver the parcel. So, if the locker is full today, that doesn’t mean we have to stop sales. We need to look at the backlog of existing shipments and predict what’s going to happen in the future.
There are further complications. Some lockers are more popular than others. Buyers choose the destination locker, but sellers are free to choose any locker for drop-off. Some shipments end up being cancelled or need to be returned. All these factors combine to create to a tough challenge.
The Quick Fix
To buy time while we got to work on a proper solution, we implemented a locker visibility flag, which was controlled manually. We were reviewing locker capacity daily and hiding lockers that were nearly full. Obviously this wasn’t the most efficient solution, but good enough, as there weren’t that many lockers on the ground.
The Second Step
Our second step is automation. We’ve built a script that runs every hour and hides or shows individual lockers based on the current data. To play it safe, the manual override stayed in place.
How does the script decide when to hide a locker? Well, math.
The script relies on two key metrics: occupancy rate and delivery backlog. The basic idea: When there are too many parcels waiting to be dropped off in relation to the number of locker slots available, the script hides the locker.
Our data scientists have built a mathematical model that takes these two inputs and produces a decision to hide or show the locker to our customers. Under the hood, you’ll find things like Markov chains and binomial distributions. But the upshot of all that math is that the script can forecast the probability that the locker will overflow if left visible. Once the predicted probability of overflow reaches a critical threshold, the locker is hidden.
Crisis and chaos averted.
It’s now been a few weeks since we launched the automated inflow control. It looks like the script is working fine. However, we’ll continue to gather data and see if there’s some room for improvement.
But for now, this was a beautiful example of engineers and data scientists working together to solve a key business problem.