Callbacks¶
PassengerSim includes a variety of optimized data collection processes that run automatically during a simulation, but these pre-selected data may not be sufficient for every analysis. To supplement this, users can choose to additionally collect any other data while running a simulation. This is done by writing a "callback" function. Such a function is invoked regularly while the simulation is running, and can inspect and store almost anything from the Simulation object.
import pandas as pd
import passengersim as pax
pax.versions()
passengersim 0.57.dev4+g8e386693f passengersim.core 0.57.dev0+g85ef573c.d20251014
Here, we'll run a quick demo using the "3MKT" example model. We'll give AL1 the 'P' RM system to make it interesting.
cfg = pax.Config.from_yaml(pax.demo_network("3MKT"))
cfg.simulation_controls.num_samples = 100
cfg.simulation_controls.burn_samples = 50
cfg.simulation_controls.num_trials = 1
cfg.db = None
cfg.outputs.reports.clear()
cfg.carriers.AL1.rm_system = "P"
sim = pax.Simulation(cfg)
Types of Callback Functions¶
To collect data, we can write a function that will interrogate the simulation and grab whatever info we are looking for. There are three different points where we can attach data collection callback functions:
- begin_sample, which will trigger data collection at the beginning of each sample, after the RM systems for each carrier are initialized (e.g. with forecasts) but before any customers can arrive.
- end_sample, which will trigger data collection at the end of each sample, after customers have arrived and all bookings have been finalized.
- daily, which will trigger data collection once per day during every sample, just after any DCP or daily RM system updates are run.
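As a rough illustration of when the three trigger points fire within one sample, here is a toy sketch. The `toy_sample` function is hypothetical and is not part of PassengerSim; it only records the order of events described above, counting days down to departure (days_prior == 0).

```python
def toy_sample(num_days: int) -> list[str]:
    """Return the order of callback trigger points in one toy sample."""
    # begin_sample: RM systems are initialized, no customers have arrived yet
    events = ["begin_sample"]
    # daily: fires once per day, counting down to departure
    for days_prior in range(num_days, -1, -1):
        # (any DCP or daily RM updates would run just before this point)
        events.append(f"daily(days_prior={days_prior})")
    # end_sample: all bookings have been finalized
    events.append("end_sample")
    return events


for event in toy_sample(3):
    print(event)
```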
The first two callbacks (begin and end sample) are written as a function that accepts one argument
(the Simulation object), and either returns nothing (to ignore that event)
or returns a dictionary of values to store, where the keys are all strings
naming what's being stored and the values can be whatever is of interest.
This can be a simple numeric value (i.e., a scalar), or a tuple, an array,
a nested dictionary, or any other pickle-able
Python object.
We can attach each callback to the Simulation by using a Python decorator.
Example Callback Functions¶
For example, here we create a callback to collect carrier revenue at the end of every sample. Note that we skip the burn period by returning nothing for those samples; this is not required by the callback algorithm but is good practice for analysis.
@sim.end_sample_callback
def collect_carrier_revenue(sim: pax.Simulation) -> dict | None:
    if sim.sim.sample < sim.sim.burn_samples:
        return
    return {c.name: c.revenue for c in sim.sim.carriers}
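Because a callback is an ordinary function, its logic can be checked in isolation before attaching it to a real simulation. The sketch below assumes a hypothetical `fake_sim` stand-in built from `types.SimpleNamespace` that exposes only the attributes the callback reads; the revenue numbers are made-up placeholders, not simulation results.

```python
from types import SimpleNamespace


def collect_carrier_revenue(sim):
    # Same logic as the callback above, without the decorator or type hints.
    if sim.sim.sample < sim.sim.burn_samples:
        return None
    return {c.name: c.revenue for c in sim.sim.carriers}


def fake_sim(sample, burn_samples=50):
    """Hypothetical stand-in exposing only what the callback reads."""
    carriers = [
        SimpleNamespace(name="AL1", revenue=1000.0),  # placeholder value
        SimpleNamespace(name="AL2", revenue=900.0),  # placeholder value
    ]
    core = SimpleNamespace(sample=sample, burn_samples=burn_samples, carriers=carriers)
    return SimpleNamespace(sim=core)


print(collect_carrier_revenue(fake_sim(sample=10)))  # burn period: nothing stored
print(collect_carrier_revenue(fake_sim(sample=50)))  # first sample after the burn period
```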
The daily callback operates similarly, except it accepts a second argument that gives the number of days prior to departure for the current day. You don't need to use the second argument in the callback function, but you do need to include it in the function signature (and you can use it if desired, e.g. to collect data only at DCPs instead of every day). In the example here, we collect daily carrier revenue, but only every 7th sample, which is a good way to reduce the overhead from collecting detailed data.
@sim.daily_callback
def collect_carrier_revenue_detail(sim: pax.Simulation, days_prior: int) -> dict | None:
    if sim.sim.sample < sim.sim.burn_samples:
        return
    if sim.sim.sample % 7 == 0:
        return {c.name: c.revenue for c in sim.sim.carriers}
Multiple callbacks of the same kind can be attached (i.e. there can be two end_sample callbacks). The only limitation is that the named values in the return values of each callback function must be unique, or else they will overwrite one another.
For example, suppose we also want to count for each carrier the number of passengers departing each airport on each sample day. The previous end sample callback stored revenue values in a dictionary keyed by carrier name, so if we don't want to overwrite that, we need to use a different key. One way to avoid that is to just nest the output of the callback function in another dictionary with a unique top level key.
from collections import defaultdict

@sim.end_sample_callback
def collect_passenger_counts(sim: pax.Simulation) -> dict | None:
    if sim.sim.sample < sim.sim.burn_samples:
        return
    paxcount = defaultdict(lambda: defaultdict(int))
    for leg in sim.sim.legs:
        paxcount[leg.carrier.name][leg.orig] += leg.sold
    # convert defaultdict to a regular dict, not necessary but pickles smaller
    paxcount = {carrier: dict(airports) for carrier, airports in paxcount.items()}
    return {"psgr_by_airport": paxcount}
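To see why the unique top-level key matters, here is a minimal sketch of what happens when two callbacks return the same keys. The `merge_results` helper is hypothetical, and it assumes that results are combined like a plain `dict.update`, with later values replacing earlier ones; PassengerSim's internal merging may differ in detail. The values are taken from the first end_sample record shown later on this page.

```python
def merge_results(*results):
    """Combine callback return values into one record (illustrative only)."""
    record = {}
    for result in results:
        if result is not None:  # a callback that returned nothing is skipped
            record.update(result)  # assumption: later keys overwrite earlier ones
    return record


revenue = {"AL1": 100475.0, "AL2": 103700.0}
counts = {
    "AL1": {"BOS": 200.0, "ORD": 240.0},
    "AL2": {"BOS": 200.0, "ORD": 240.0},
}

clobbered = merge_results(revenue, counts)  # both keyed by carrier name
safe = merge_results(revenue, {"psgr_by_airport": counts})  # nested under its own key

print(clobbered["AL1"])  # the revenue value has been replaced by the counts
print(safe["AL1"])  # the revenue value survives
```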
One of the nifty features of callbacks is that they can access anything available in the simulation, not just sales and revenue data from carriers. For example, we can inspect demand objects directly to see how many potential passengers have been simulated so far, and how many did not book with any airline (i.e. the "no-go" customers).
@sim.daily_callback
def count_nogo(sim: pax.Simulation, days_prior: int) -> dict | None:
    if sim.sim.sample < sim.sim.burn_samples:
        return
    if sim.sim.sample % 7 == 0:
        return
    if days_prior > 0 and days_prior not in sim.config.dcps:
        # Only count "nogo" (unsold) demand at DCPs, and at departure (days_prior == 0)
        return
    nogo_count = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for dmd in sim.sim.demands:
        nogo_count[dmd.orig][dmd.dest][dmd.segment] += dmd.unsold
    # convert defaultdict to a regular dict, not necessary but pickles smaller
    nogo_count = {orig: {dest: dict(seg) for dest, seg in dests.items()} for orig, dests in nogo_count.items()}
    return {"nogo": nogo_count}
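The two callbacks above convert their nested defaultdicts by hand, with one comprehension per nesting level. If this pattern comes up often, a small recursive helper can do the same at any depth. The `to_plain_dict` function below is a hypothetical convenience, not something PassengerSim provides, and the counts are illustrative values.

```python
from collections import defaultdict


def to_plain_dict(obj):
    """Recursively convert nested defaultdicts (or dicts) to regular dicts."""
    if isinstance(obj, dict):  # defaultdict is a subclass of dict
        return {key: to_plain_dict(value) for key, value in obj.items()}
    return obj  # leaf values are returned unchanged


counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
counts["BOS"]["ORD"]["leisure"] += 3  # illustrative values only
counts["BOS"]["LAX"]["business"] += 1

plain = to_plain_dict(counts)
print(plain)
```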
Re-using Callback Functions¶
Attaching via the decorators is a convenient way to add callbacks to a single simulation. The decorators connect the callback function to the simulation, but do not otherwise modify the function itself. This makes it easy to define callback functions in a separate module, or to re-use callback functions for multiple simulations, by calling the decorator as a regular function. For example, we can create a second simulation object, and attach the same callback functions like this:
duplicate_sim = pax.Simulation(cfg)
duplicate_sim.end_sample_callback(collect_carrier_revenue)
duplicate_sim.daily_callback(collect_carrier_revenue_detail);
In this example, the duplicate_sim is running the same config as the original, but this would work with a modified config or even a completely different network.
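The general pattern at work here, a decorator that registers a function and returns it unchanged, can be sketched in a few lines. `ToySimulation` is a hypothetical stand-in; it is not how PassengerSim is necessarily implemented, only an illustration of why the same function can be attached with `@` syntax or with a plain call.

```python
class ToySimulation:
    """Hypothetical stand-in showing a register-and-return decorator."""

    def __init__(self):
        self.end_sample_callbacks = []

    def end_sample_callback(self, func):
        self.end_sample_callbacks.append(func)  # remember the callback
        return func  # hand the function back unmodified


first = ToySimulation()


@first.end_sample_callback  # decorator form
def my_callback(sim):
    return {"value": 1}


second = ToySimulation()
second.end_sample_callback(my_callback)  # regular function-call form

print(first.end_sample_callbacks == second.end_sample_callbacks)
```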
Once we have attached all desired callbacks to the simulation we want to run, we can run it as normal.
summary = sim.run()
Task Completed after 1.31 seconds
All the usual summary data remains available for review and analysis.
summary.fig_carrier_revenues()
Callback Data¶
In addition to the usual suspects, the summary object includes the collected callback data from our callback functions.
summary.callback_data
<passengersim.callbacks.CallbackData from daily, end_sample>
Because we connected a "daily" callback, the data we collected is available under the
callback_data.daily accessor.
summary.callback_data.daily[:5]
[{'trial': 0,
  'sample': 50,
  'days_prior': 63,
  'nogo': {'BOS': {'ORD': {'business': 0, 'leisure': 0},
    'LAX': {'business': 0, 'leisure': 0}},
   'ORD': {'LAX': {'business': 0, 'leisure': 0}}}},
 {'trial': 0,
  'sample': 50,
  'days_prior': 56,
  'nogo': {'BOS': {'ORD': {'business': 0, 'leisure': 0},
    'LAX': {'business': 0, 'leisure': 0}},
   'ORD': {'LAX': {'business': 0, 'leisure': 0}}}},
 {'trial': 0,
  'sample': 50,
  'days_prior': 49,
  'nogo': {'BOS': {'ORD': {'business': 0, 'leisure': 0},
    'LAX': {'business': 0, 'leisure': 0}},
   'ORD': {'LAX': {'business': 0, 'leisure': 0}}}},
 {'trial': 0,
  'sample': 50,
  'days_prior': 42,
  'nogo': {'BOS': {'ORD': {'business': 0, 'leisure': 0},
    'LAX': {'business': 0, 'leisure': 4}},
   'ORD': {'LAX': {'business': 0, 'leisure': 3}}}},
 {'trial': 0,
  'sample': 50,
  'days_prior': 35,
  'nogo': {'BOS': {'ORD': {'business': 0, 'leisure': 0},
    'LAX': {'business': 0, 'leisure': 5}},
   'ORD': {'LAX': {'business': 0, 'leisure': 4}}}}]
As you might expect, the data from "begin_sample" and "end_sample"
callbacks is available under callback_data.begin_sample and callback_data.end_sample,
respectively.
summary.callback_data.end_sample[:3]
[{'trial': 0,
  'sample': 50,
  'AL1': 100475.0,
  'AL2': 103700.0,
  'psgr_by_airport': {'AL1': {'BOS': 200.0, 'ORD': 240.0},
   'AL2': {'BOS': 200.0, 'ORD': 240.0}}},
 {'trial': 0,
  'sample': 51,
  'AL1': 101475.0,
  'AL2': 97425.0,
  'psgr_by_airport': {'AL1': {'BOS': 190.0, 'ORD': 240.0},
   'AL2': {'BOS': 197.0, 'ORD': 240.0}}},
 {'trial': 0,
  'sample': 52,
  'AL1': 108575.0,
  'AL2': 95000.0,
  'psgr_by_airport': {'AL1': {'BOS': 191.0, 'ORD': 231.0},
   'AL2': {'BOS': 162.0, 'ORD': 231.0}}}]
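Since each record is a plain dictionary, simple summaries need nothing beyond the standard library. As a small sketch, the snippet below copies the revenue values from the three records shown above into a literal list (rather than reading them from `summary`) and computes the mean revenue per carrier with `statistics.mean`.

```python
from statistics import mean

# Revenue fields copied from the three end_sample records shown above.
end_sample = [
    {"trial": 0, "sample": 50, "AL1": 100475.0, "AL2": 103700.0},
    {"trial": 0, "sample": 51, "AL1": 101475.0, "AL2": 97425.0},
    {"trial": 0, "sample": 52, "AL1": 108575.0, "AL2": 95000.0},
]

carriers = ("AL1", "AL2")
mean_revenue = {c: mean(rec[c] for rec in end_sample) for c in carriers}

# Count the samples in which AL1 earned more than AL2.
al1_wins = sum(rec["AL1"] > rec["AL2"] for rec in end_sample)

for carrier, value in mean_revenue.items():
    print(f"{carrier}: {value:.2f}")
print("samples where AL1 out-earned AL2:", al1_wins)
```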
The callback data can include pretty much anything, so it is stored in a
very flexible (but inefficient) format: a list of dicts. If the content
of the dicts is fairly simple (numbers, tuples, lists, or nested dictionaries thereof),
it can be converted into a pandas DataFrame using the to_dataframe method
on the callback_data attribute. This may make subsequent analysis easier.
summary.callback_data.to_dataframe("daily")
| | trial | sample | days_prior | nogo.BOS.ORD.business | nogo.BOS.ORD.leisure | nogo.BOS.LAX.business | nogo.BOS.LAX.leisure | nogo.ORD.LAX.business | nogo.ORD.LAX.leisure | AL1 | AL2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 50 | 63 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | NaN |
| 1 | 0 | 50 | 56 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | NaN |
| 2 | 0 | 50 | 49 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | NaN |
| 3 | 0 | 50 | 42 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 3.0 | NaN | NaN |
| 4 | 0 | 50 | 35 | 0.0 | 0.0 | 0.0 | 5.0 | 0.0 | 4.0 | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1174 | 0 | 99 | 7 | 0.0 | 3.0 | 0.0 | 8.0 | 0.0 | 7.0 | NaN | NaN |
| 1175 | 0 | 99 | 5 | 0.0 | 5.0 | 0.0 | 8.0 | 0.0 | 10.0 | NaN | NaN |
| 1176 | 0 | 99 | 3 | 0.0 | 7.0 | 0.0 | 11.0 | 0.0 | 11.0 | NaN | NaN |
| 1177 | 0 | 99 | 1 | 0.0 | 8.0 | 6.0 | 14.0 | 3.0 | 15.0 | NaN | NaN |
| 1178 | 0 | 99 | 0 | 1.0 | 10.0 | 7.0 | 18.0 | 3.0 | 17.0 | NaN | NaN |
1179 rows × 11 columns
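The dotted column names (e.g. nogo.BOS.LAX.leisure) and the NaN entries both follow from flattening nested records that do not all share the same keys. The sketch below is a pure-Python illustration of that idea, not the actual to_dataframe implementation; `flatten` is a hypothetical helper, the first record is trimmed from the daily output above, and the AL1/AL2 values in the second record are made-up placeholders.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))  # descend into nested dicts
        else:
            flat[name] = value
    return flat


records = [
    # one daily record that only carries "nogo" data
    {"trial": 0, "sample": 50, "days_prior": 42,
     "nogo": {"BOS": {"LAX": {"business": 0, "leisure": 4}}}},
    # one daily record that only carries revenue (placeholder values)
    {"trial": 0, "sample": 56, "days_prior": 42, "AL1": 1.0, "AL2": 2.0},
]

rows = [flatten(rec) for rec in records]
# Union of all keys, in first-seen order, becomes the column list.
columns = list(dict.fromkeys(key for row in rows for key in row))
# Keys missing from a row are filled with None (NaN in a DataFrame).
table = [[row.get(col) for col in columns] for row in rows]

print(columns)
for line in table:
    print(line)
```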
summary.callback_data.to_dataframe("end_sample")
| | trial | sample | AL1 | AL2 | psgr_by_airport.AL1.BOS | psgr_by_airport.AL1.ORD | psgr_by_airport.AL2.BOS | psgr_by_airport.AL2.ORD |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 50 | 100475.0 | 103700.0 | 200.0 | 240.0 | 200.0 | 240.0 |
| 1 | 0 | 51 | 101475.0 | 97425.0 | 190.0 | 240.0 | 197.0 | 240.0 |
| 2 | 0 | 52 | 108575.0 | 95000.0 | 191.0 | 231.0 | 162.0 | 231.0 |
| 3 | 0 | 53 | 104275.0 | 98825.0 | 200.0 | 240.0 | 200.0 | 240.0 |
| 4 | 0 | 54 | 101300.0 | 97000.0 | 193.0 | 240.0 | 188.0 | 240.0 |
| 5 | 0 | 55 | 94600.0 | 101175.0 | 169.0 | 210.0 | 181.0 | 225.0 |
| 6 | 0 | 56 | 88425.0 | 87575.0 | 172.0 | 227.0 | 172.0 | 227.0 |
| 7 | 0 | 57 | 63900.0 | 58975.0 | 118.0 | 154.0 | 105.0 | 154.0 |
| 8 | 0 | 58 | 98400.0 | 96700.0 | 200.0 | 234.0 | 200.0 | 223.0 |
| 9 | 0 | 59 | 92175.0 | 92975.0 | 173.0 | 232.0 | 143.0 | 238.0 |
| 10 | 0 | 60 | 100425.0 | 92625.0 | 176.0 | 228.0 | 163.0 | 226.0 |
| 11 | 0 | 61 | 94125.0 | 86675.0 | 200.0 | 223.0 | 200.0 | 185.0 |
| 12 | 0 | 62 | 82525.0 | 83150.0 | 174.0 | 212.0 | 170.0 | 216.0 |
| 13 | 0 | 63 | 88625.0 | 76525.0 | 200.0 | 189.0 | 200.0 | 173.0 |
| 14 | 0 | 64 | 109900.0 | 103375.0 | 200.0 | 240.0 | 200.0 | 240.0 |
| 15 | 0 | 65 | 99775.0 | 108375.0 | 164.0 | 197.0 | 197.0 | 238.0 |
| 16 | 0 | 66 | 96825.0 | 100375.0 | 187.0 | 217.0 | 200.0 | 222.0 |
| 17 | 0 | 67 | 111275.0 | 105700.0 | 200.0 | 240.0 | 200.0 | 240.0 |
| 18 | 0 | 68 | 92575.0 | 86525.0 | 187.0 | 193.0 | 177.0 | 208.0 |
| 19 | 0 | 69 | 91375.0 | 97400.0 | 174.0 | 205.0 | 195.0 | 230.0 |
| 20 | 0 | 70 | 78400.0 | 75100.0 | 121.0 | 181.0 | 146.0 | 201.0 |
| 21 | 0 | 71 | 73825.0 | 69575.0 | 144.0 | 154.0 | 142.0 | 173.0 |
| 22 | 0 | 72 | 96150.0 | 93325.0 | 174.0 | 235.0 | 176.0 | 240.0 |
| 23 | 0 | 73 | 78000.0 | 85600.0 | 118.0 | 214.0 | 151.0 | 229.0 |
| 24 | 0 | 74 | 82600.0 | 81725.0 | 142.0 | 179.0 | 164.0 | 181.0 |
| 25 | 0 | 75 | 101550.0 | 93050.0 | 170.0 | 227.0 | 155.0 | 240.0 |
| 26 | 0 | 76 | 90600.0 | 94900.0 | 126.0 | 212.0 | 144.0 | 238.0 |
| 27 | 0 | 77 | 117925.0 | 104075.0 | 200.0 | 240.0 | 199.0 | 240.0 |
| 28 | 0 | 78 | 80275.0 | 73975.0 | 135.0 | 193.0 | 149.0 | 187.0 |
| 29 | 0 | 79 | 109225.0 | 96450.0 | 193.0 | 240.0 | 161.0 | 240.0 |
| 30 | 0 | 80 | 76700.0 | 86875.0 | 153.0 | 200.0 | 166.0 | 229.0 |
| 31 | 0 | 81 | 74250.0 | 82575.0 | 113.0 | 166.0 | 161.0 | 180.0 |
| 32 | 0 | 82 | 74950.0 | 67800.0 | 104.0 | 180.0 | 92.0 | 190.0 |
| 33 | 0 | 83 | 110900.0 | 98825.0 | 197.0 | 240.0 | 190.0 | 240.0 |
| 34 | 0 | 84 | 110350.0 | 97450.0 | 198.0 | 240.0 | 198.0 | 236.0 |
| 35 | 0 | 85 | 108625.0 | 100200.0 | 200.0 | 240.0 | 200.0 | 240.0 |
| 36 | 0 | 86 | 81575.0 | 82700.0 | 157.0 | 198.0 | 149.0 | 216.0 |
| 37 | 0 | 87 | 86825.0 | 93675.0 | 183.0 | 168.0 | 193.0 | 202.0 |
| 38 | 0 | 88 | 66575.0 | 70500.0 | 105.0 | 163.0 | 104.0 | 186.0 |
| 39 | 0 | 89 | 55700.0 | 59775.0 | 98.0 | 135.0 | 109.0 | 148.0 |
| 40 | 0 | 90 | 92500.0 | 100600.0 | 188.0 | 230.0 | 182.0 | 239.0 |
| 41 | 0 | 91 | 102450.0 | 97400.0 | 193.0 | 240.0 | 188.0 | 240.0 |
| 42 | 0 | 92 | 100525.0 | 108975.0 | 199.0 | 237.0 | 198.0 | 240.0 |
| 43 | 0 | 93 | 85975.0 | 87025.0 | 126.0 | 174.0 | 144.0 | 188.0 |
| 44 | 0 | 94 | 100175.0 | 97950.0 | 180.0 | 240.0 | 163.0 | 240.0 |
| 45 | 0 | 95 | 108800.0 | 104200.0 | 197.0 | 240.0 | 180.0 | 240.0 |
| 46 | 0 | 96 | 103900.0 | 98475.0 | 183.0 | 240.0 | 171.0 | 240.0 |
| 47 | 0 | 97 | 92250.0 | 80250.0 | 167.0 | 193.0 | 149.0 | 191.0 |
| 48 | 0 | 98 | 62450.0 | 73450.0 | 126.0 | 161.0 | 146.0 | 195.0 |
| 49 | 0 | 99 | 90350.0 | 90275.0 | 169.0 | 188.0 | 150.0 | 202.0 |
Users are free to process this callback data however they like with typical Python tools: analyze, visualize, interpret, etc.
# Visualize revenue difference between carriers across booking curve
import altair as alt
alt.Chart(summary.callback_data.to_dataframe("daily").eval("DIFF = AL1 - AL2")).mark_line().encode(
    x=alt.X("days_prior", scale=alt.Scale(reverse=True)),
    y="DIFF",
    color="sample:N",
)
# Visualize "nogo" passengers over time, by market and segment
nogo = (
    summary.callback_data.to_dataframe("daily")
    .set_index(["days_prior", "sample"])
    .drop(columns=["trial", "AL1", "AL2"])
)
nogo.columns = pd.MultiIndex.from_tuples(nogo.columns.str.split(".").to_list())
nogo.columns.names = ["nogo", "orig", "dest", "segment"]
nogo = nogo.stack([1, 2, 3], future_stack=True).dropna().reset_index()
mean_nogo = nogo.groupby(["days_prior", "orig", "dest", "segment"]).nogo.mean().reset_index()
mean_nogo["market"] = mean_nogo.orig + "-" + mean_nogo.dest
alt.Chart(mean_nogo).mark_line().encode(
    x=alt.X("days_prior", scale=alt.Scale(reverse=True)),
    y="nogo",
    color="segment:N",
    strokeWidth="market:N",
    strokeDash="market:N",
)