from single commodity to multiattribute models for locomotive … · 2014-07-18 · locomotives...

24
Articles in Advance, pp. 1–24 ISSN 0041-1655 (print) ISSN 1526-5447 (online) http://dx.doi.org/10.1287/trsc.2014.0536 © 2014 INFORMS From Single Commodity to Multiattribute Models for Locomotive Optimization: A Comparison of Optimal Integer Programming and Approximate Dynamic Programming Belgacem Bouzaiene-Ayari Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544, [email protected] Clark Cheng, Sourav Das, Ricardo Fiorillo Norfolk Southern Corporation, Atlanta, Georgia 30309 {[email protected], [email protected], ricardo.fi[email protected]} Warren B. Powell Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544, [email protected] W e present a general optimization framework for locomotive models that captures different levels of detail, ranging from single and multicommodity flow models that can be solved using commercial integer programming solvers, to a much more detailed multiattribute model that we solve using approximate dynamic programming (ADP). Both models have been successfully implemented at Norfolk Southern for different planning applications. We use these models, presented using a common notational framework, to demonstrate the scope of different modeling and algorithmic strategies, all of which add value to the locomotive planning problem. We demonstrate how ADP can be used for both deterministic and stochastic models that capture locomotives and trains at a very high level of detail. Keywords : approximate dynamic programming; locomotive planning History : Received: September 2012; revision received: July 2013; accepted: November 2013. Published online in Articles in Advance. 1. Introduction Locomotives represent a major investment for a rail- road that might operate over 2,000 units, representing billions of dollars in assets. Railroads face the challenge of determining the right fleet size and mix, as well as operating the fleet in an efficient way. If the railroad maintains too small of a fleet, it will face the prospect of delayed trains, as well as additional costs from leasing locomotives. A fleet that is too large is expensive to own, maintain, and operate, and introduces the issue of storing additional locomotives in yards with limited space. The locomotive planning problem is a complex and multidimensional problem with competing objectives. When assigning power to trains, planners have to take into consideration horsepower requirements (for speed), tractive effort (to get over steep grades), ownership, maintenance requirements (the locomotive may have to be routed toward a shop), and consist formation (loco- motives are connected in sets of two to six locomotives, called consists, required to move a train). Shop routing is a major complication. Locomotives are required by law to pass through maintenance shops every 92 days for routine maintenance, and from time to time, they need to be sent to the shop for mechanical repairs. If a locomotive misses a shop appointment, it has to be turned off and “dragged” to the shop, adding to the cost of moving a train and sometimes requiring dedicated movements (incurring additional crew costs). Railroads prefer to anticipate shop appointments, moving a locomotive toward a shop while maximizing its productivity. The goal is to arrive in time for a shop appointment, not too early and never late. At the same time, a particular shop may be heavily congested, and as a result the railroad has to balance the demands on different shops. An important issue, widely ignored in prior research in locomotive optimization, is the presence of significant sources of uncertainty. Perhaps the most important is stochasticity in transit times, as well as the time required to process locomotives and trains through connecting yards. Compounding the problem is uncertainty in 1

Upload: others

Post on 25-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Articles in Advance, pp. 1–24ISSN 0041-1655 (print) � ISSN 1526-5447 (online) http://dx.doi.org/10.1287/trsc.2014.0536

© 2014 INFORMS

From Single Commodity to Multiattribute Modelsfor Locomotive Optimization: A Comparison

of Optimal Integer Programming andApproximate Dynamic Programming

Belgacem Bouzaiene-AyariDepartment of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544,

[email protected]

Clark Cheng, Sourav Das, Ricardo FiorilloNorfolk Southern Corporation, Atlanta, Georgia 30309

{[email protected], [email protected], [email protected]}

Warren B. PowellDepartment of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544,

[email protected]

We present a general optimization framework for locomotive models that captures different levels of detail,ranging from single and multicommodity flow models that can be solved using commercial integer

programming solvers, to a much more detailed multiattribute model that we solve using approximate dynamicprogramming (ADP). Both models have been successfully implemented at Norfolk Southern for different planningapplications. We use these models, presented using a common notational framework, to demonstrate the scope ofdifferent modeling and algorithmic strategies, all of which add value to the locomotive planning problem. Wedemonstrate how ADP can be used for both deterministic and stochastic models that capture locomotives andtrains at a very high level of detail.

Keywords : approximate dynamic programming; locomotive planningHistory : Received: September 2012; revision received: July 2013; accepted: November 2013. Published online in

Articles in Advance.

1. IntroductionLocomotives represent a major investment for a rail-road that might operate over 2,000 units, representingbillions of dollars in assets. Railroads face the challengeof determining the right fleet size and mix, as well asoperating the fleet in an efficient way. If the railroadmaintains too small of a fleet, it will face the prospect ofdelayed trains, as well as additional costs from leasinglocomotives. A fleet that is too large is expensive toown, maintain, and operate, and introduces the issueof storing additional locomotives in yards with limitedspace.

The locomotive planning problem is a complex andmultidimensional problem with competing objectives.When assigning power to trains, planners have to takeinto consideration horsepower requirements (for speed),tractive effort (to get over steep grades), ownership,maintenance requirements (the locomotive may have tobe routed toward a shop), and consist formation (loco-motives are connected in sets of two to six locomotives,called consists, required to move a train).

Shop routing is a major complication. Locomotivesare required by law to pass through maintenanceshops every 92 days for routine maintenance, andfrom time to time, they need to be sent to the shopfor mechanical repairs. If a locomotive misses a shopappointment, it has to be turned off and “dragged”to the shop, adding to the cost of moving a train andsometimes requiring dedicated movements (incurringadditional crew costs). Railroads prefer to anticipateshop appointments, moving a locomotive toward ashop while maximizing its productivity. The goal is toarrive in time for a shop appointment, not too earlyand never late. At the same time, a particular shopmay be heavily congested, and as a result the railroadhas to balance the demands on different shops.

An important issue, widely ignored in prior researchin locomotive optimization, is the presence of significantsources of uncertainty. Perhaps the most important isstochasticity in transit times, as well as the time requiredto process locomotives and trains through connectingyards. Compounding the problem is uncertainty in

1

Page 2: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning2 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

the train schedule itself, as trains can be added toor dropped from the schedule with only a few daysnotice. Train additions are particularly common forcommodities such as grain and coal.

Reflecting both its importance and complexity, loco-motive optimization has attracted considerable atten-tion from academics, consultants, and operationsresearch groups within the railroads. Almost all ofthese prior models were deterministic, and developedduring a period when commercial integer program-ming solvers were steadily maturing. The locomotiveproblem is not just a large integer program, it is a hardinteger program that involves difficult features such asthe need to bundle locomotives into consists, schedulelocomotives into maintenance shops, manage foreignpower, and deal with the need to occasionally delaytrains.

This paper reports on a family of models that werefer to using the umbrella term PLASMA (PrincetonLocomotive and Shop MAnagement system). Theseinclude the following:

PLASMA/SC. This is a single commodity flow modelwhere all locomotives are represented as being the same.The goal of the model is to provide general guidelinesof the following form: locomotives on inbound train Ashould be assigned to outbound trains B and C. Thismodel balances flows on a network level, maintainingdaily consistency patterns as well as first-in, first-outconstraints. This model does not capture individuallocomotives, and handles shop routing and the man-agement of foreign power only at an aggregate level.Trains are not allowed to be delayed. The model pro-duces a repeatable plan, with day-to-day consistencyand matching inventories at the beginning and endingof the week.

PLASMA/MC. This is a multicommodity flow formu-lation that produces a deterministic integer program-ming model. Each locomotive is modeled individuallyso it can capture consist breakup costs, but locomotivesare represented using a simplified set of attributesthat allows them to be aggregated into four classes(commodities).

PLASMA/MA. This is a multiattribute resource allo-cation model that captures all of the attributes of alocomotive required to produce a truly realistic model.It also captures operational issues such as consistbreakups, special equipment, returning foreign powerto interchange points, and the particularly difficult taskof routing power to maintenance shops. We have twoflavors of this model: A strategic planning system usedto determine fleet size and mix, and an operationalplanning model that works from a current snapshot ofthe current fleet.

PLASMA/SC and PLASMA/MC can both be solvedusing CPLEX. PLASMA/MC and PLASMA/MA can

both be solved using approximate dynamic program-ming (ADP). This makes it possible to use PLASMA/MCas a benchmark between CPLEX and ADP. We note thatboth PLASMA/SC and PLASMA/MA are in productionat Norfolk Southern, serving as evidence that bothbring value to the planning process. In our experimentalwork, we show that CPLEX can solve PLASMA/MC,but only with several major simplifications, and evenwith these simplifications it is sharply limited in termsof the planning horizon.

PLASMA/MA can be used as a deterministic opti-mizer or as a stochastic optimization model that canhandle variability in transit times, yard delays, locomo-tive failures, and changes to the train schedule. As ofthis writing, the deterministic version PLASMA/MAat Norfolk Southern is used in strategic planning todetermine fleet size and mix. It is also currently beingtested for tactical operational planning where it is usedto plan locomotive movements over a one–five dayhorizon. This paper uses the stochastic optimizationmodel to demonstrate for the first time that an opti-mization model can capture the uncertainty in transittimes and yard delays, producing solutions that aresignificantly more robust than a deterministic model.

We faced our biggest challenge developing the strate-gic planning model that is used to recommend fleetsize and mix. This requires that the model accuratelycapture locomotive productivity. We learned over theyears that many of the standard simplifications thatare accepted as part of the modeling process producedunacceptable results. For example, we started the modeldevelopment by grouping locomotives into four majorclasses (“commodities”) before finally making the tran-sition to modeling each locomotive individually withover a dozen attributes. Consist breakups, shop routing,and the management of foreign power were all critical.We even found that arrival and departure times hadto be modeled down to the minute; standard discretetime models did not work.

It is our belief from this research that ADP is uniquelywell suited to handling high levels of detail, but is lessskilled at managing a global vision of flows aroundthe network over time. For example, ADP is not agood technology for producing repeatable solutions(as is handled with PLASMA/SC) from one day tothe next, or ensuring that ending inventories matchbeginning inventories. ADP, however, can handle ahigh level of detail for each locomotive and train,globally optimizing over the entire network but onlyapproximately optimizing over time.

The most difficult challenge we faced was calibratingthe model to reproduce history. For this project, themost important metric was train delay. For example,for years the model produced delays that were toohigh, due almost entirely to problems with the data(primarily locomotives out of position, but sometimes

Page 3: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 3

because of errors in the train schedule that made con-nections impossible). However, it was also possible toproduce delays that were less than those observed inhistory. Although it is tempting to attribute instancesof lower delays to the intelligence of the model, inpractice they were always due to simplifications inthe model. For example, early versions of the modeldid not handle shop routing and constraints on theformation of locomotive consists (such as the properhandling of foreign power and locomotive equipmenttypes). Indeed, the railroads have learned that savingsin locomotives from optimization models were duemore to simplifications made in the model than theintelligence of optimization. A major finding of thisproject was that optimization was primarily a tool tomimic the intelligence of well-trained locomotive plan-ners whose decisions, although perhaps not optimal,are better informed than those in any computer model.

This paper makes the following contributions. (1) Weshow that CPLEX has no difficulty solving the simpli-fied PLASMA/SC model that is used to produce a high-level locomotive plan, but can only solve PLASMA/MCfor limited horizons. (2) We present a multiattributemodel that captures a very high level of detail, describea solution strategy based on ADP and show, on amulticommodity formulation, that ADP produces asolution within 1% of optimalality. (3) We show thatPLASMA/MA accurately captures fleet productivitywhen solved with ADP, producing overall train delaysthat are very close to history. (4) Along with finding thebest consist with the right equipment, PLASMA/MAsimultaneously manages the difficult problem of rout-ing power to different shops while managing shopcongestion. (5) Finally, we demonstrate for the firsttime that ADP can be used to solve a stochastic versionof the locomotive assignment problem, producing arobust operating policy that performs well when testedunder stochastic simulations.

Our presentation is designed to strike a balancebetween communicating the overall modeling andalgorithmic strategy in a compact and clear formula-tion, while also representing the high level of detailrequired to capture real-world locomotive operations.We model locomotives and trains using multidimen-sional attribute vectors, which facilitates representingthem at different levels of aggregation. The multiat-tribute model PLASMA/MA represents locomotives atthree levels of aggregation within the same model.

Our use of ADP for this application required thedevelopment of new strategies for approximatingvalue functions. For example, Simão et al. (2009) usevalue functions that are linear in the resource variable(drivers), and a hierarchical scheme to strike a balancebetween estimating the value of drivers at a high levelof detail (which introduces statistical sampling prob-lems) versus estimating drivers at a more aggregate

level, which reduces statistical error but introducesbias. Topaloglu and Powell (2006) uses piecewise linear,separable value function approximations to modelfreight cars. Over the course of this project, we startedwith linear value function approximations (which weretoo unstable), progressed to piecewise linear, separableapproximations, and finally ended up with a hierarchi-cal approximation strategy that uses piecewise linear,separable value functions at two different levels ofaggregation.

A theme of this paper is identifying the boundariesof classical integer programming technology usingmodern solvers. Our models start with PLASMA/SC,which is solved with CPLEX over a weeklong horizon.PLASMA/MC can be solved with CPLEX, but onlyover horizons up to around three days. ADP alsouses CPLEX, but only to solve single-period locomo-tive assignment problems, which uses value functionapproximations to produce good solutions over time.The real value of CPLEX in PLASMA/MA was notin its ability to produce a higher quality solution,but rather in its consistency and reliability, whichdramatically simplified model calibration.

The paper is organized as follows. Following the liter-ature review in §2, §3 presents the modeling frameworkbroken down into state variables, decision variables,exogenous information (for a stochastic model), transi-tion function, and the objective function. The paperthen presents three models. Section 4 presents a deter-ministic, single commodity formulation; §5 presents adeterministic, multicommodity formulation; and finally§6 presents an ADP algorithm for the multiattributeformulation. Section 7 describes how the shop routingproblem is solved in an integrated way, illustrating howthe ADP strategy can handle this important dimen-sion of shop routing. Section 8 describes a series ofexperiments that demonstrate model validation againsthistory, the stability of the model, the performanceof the shop routing, and finally (for the first time) ademonstration that the model produces higher quality,more stable results for stochastic networks.

2. Literature ReviewThere is an extensive literature on optimization modelsfor rail operations. These models range from singlecommodity models for managing generic fleets ofcontainers (e.g., White 1972; and Herren 1977); to mul-ticommodity models for handling multiple equipmenttypes with substitution (Glickman and Sherali 1985;Dejax and Crainic 1987; Crainic and Rousseau 1988;Haghani 1989; Crainic and Laporte 1997; Cordeau, Toth,and Vigo 1998; Holmberg, Joborn, and Lundgren 1998;Joborn 2001; Lingaya et al. 2002 and Joborn et al. 2004).A separate line of research has focused on handlingthe high level of uncertainty in the demand for freight

Page 4: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning4 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

cars (Mendiratta and Turnquist 1982; Jordan and Turn-quist 1983); this research has continued under thegeneral heading of “stochastic fleet management” or“dynamic vehicle allocation” (see the reviews in Powell,Jaillet, and Odoni 1995 and Powell, Bouzaiene-Ayari,and Simão 2006). This literature has laid a criticalfoundation for the locomotive optimization problem.

A separate literature has evolved around the morecomplex problem of managing locomotives. This prob-lem has been modeled almost exclusively as a large-scale integer programming problem (see Cordeau, Toth,and Vigo 1998 for a review of the literature as of 1998;and Ahuja, Cunha, and Sahin 2005 and Nemani andAhuja 2011 for more recent reviews). The complicatingissue with locomotives is that it takes more than onelocomotive to pull a train, and the minimum num-ber of locomotives needed to pull a train is typicallyfractional (for example, a particular train may need2.2 locomotives). Such problems are much harder thanthe “one resource, one task” problems that arise incontainer management problems.

There has been significant recent interest in mod-els for locomotive optimization. Ziarati et al. (1999)describes the use of modern branch-and-cut integerprogramming algorithms for the locomotive problem,which was applied to the Canadian National Railway(Ziarati et al. 1997). Cordeau, Soumis, and Desrosiers(2000 and 2001) apply Benders’ decomposition to han-dle the simultaneous optimization of locomotives andcars. Ahuja et al. (2005) formulates the locomotiveplanning problem as a mixed-integer programmingproblem by modeling the flows of locomotives, andproduced a plan that guides operations in the field,without reacting to real-time changes (similar to ourPLASMA/SC model described below). This modelconsiders issues such as “consist busting” (breakingcosts of locomotives apart), the management of foreignpower (power owned by other locomotives), and thehandling of special equipment such as cab signals. Theauthors found, however, that execution times usingthe version of CPLEX available could exceed 10 hours.In Vaidyanathan and Ahuja (2008), a model based onflows of consists was developed that reduced consistbreakups to almost zero; this formulation was alsocompared to a hybrid model for validation. This model,developed with CSX, also incorporated details suchas communications and the routing of foreign power.Vaidyanathan, Ahuja, and Orlin (2008) formulate thelocomotive routing problem to produce implementabletours for individual locomotives that considers refu-eling and maintenance. This is an extremely largeinteger programming problem that is solved using anaggregation/disaggregation technique.

3. A Modeling FrameworkIn this section, we provide a flexible modeling frame-work for the locomotive planning problem. This model

provides the foundation for the single commodity,multicommodity, and multiattribute models. An efforthas been made to strike a balance between providingthe structure of the model at a high level, while stillcommunicating the many details that were required toproduce an accurate representation of locomotive oper-ations. Most of the details of the models are providedin the appendices.

We present a single notational system where locomo-tives are represented using a vector of attributes. Thissystem can be quickly adapted to produce single com-modity, multicommodity, and multiattribute models.This representation has proven useful in other modelsof freight transportation systems (see in particularPowell, Shapiro, and Simão 2001, 2002; and Powell2007, Chapter 12). We use the modeling frameworkof ADP, which can be quickly adapted to both deter-ministic models, where we optimize over an entirehorizon, or the simulation-based strategy of ADP. Thisframework divides the model along five dimensions:the state variables, the decision variables, exogenousinformation (needed for the stochastic model), thetransition function, and the objective function.

We present our model by assuming that decisionsare made in discrete time steps t = 0111 0 0 0 1 T . We areable to assign a locomotive to at most one train at atime, which limits the length of the time steps to fourhours, which is roughly the length of the shortest trainsegment. For example, in the noon to 4 p.m. segment,we might assign a locomotive to a train that departs at12:37 p.m. and arrives at a nearby yard at 3:15 p.m. witha planned departure at 3:45 p.m.; we will not assign alocomotive to the train departing from the second yarduntil we solve the 4 p.m. to 8 p.m. block, although wecan still model the departure as happening at 3:30 p.m.It is important to emphasize, however, that althoughwe may make decisions in four-hour blocks, all eventsare modeled in continuous time. So, we might decideat 8 a.m. that a locomotive that first becomes availableat 9:32 a.m. should be assigned to a train that wasavailable to move at 8:45 a.m., thereby implying a47 minute delay.

3.1. The State VariableFor the locomotive optimization problem, the statevariable captures the status of the locomotives andtrains at a point in time. This is described using thefollowing:

a= The vector of attributes describing a locomotive.A= The set of all possible attribute vectors.

Rta = The number of locomotives with attribute a ∈Ain the system at time t.

Rt = 4Rta5a∈A.b = The vector of attributes describing a train

departure.B= The set of all possible train attributes.

Page 5: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 5

Dtb = The number of trains of type b ∈B in the systemat time t.

Dt = 4Dtb5b∈B.St = 4Rt1Dt5.

A detailed summary of the 14 locomotive attributesa and 23 train attributes b are given in Appendix A.Examples of locomotive attributes include current (orinbound) location, time of arrival, locomotive ID (usedin consist breakup logic), owner, locomotive type, axles,horsepower, current train assignment, shop due date,maintenance date, and special equipment (cab signal,flush toilet, speed limiter). Examples of train attributesinclude origin (if en route), destination (or currentlocation), train ID (used in consist breakup logic),estimated time of arrival, train type (e.g., merchandise,coal, grain, unit train).

For our most detailed operational model, theattributes of a locomotive include the following ele-ments: the arrival time (or time of availability) of alocomotive, the next location (if moving) or currentlocation (if at a yard), the estimated time of arrival(if the locomotive is currently moving), the owningrailroad, locomotive special equipment (such as thelocomotive speed limiter (LSL), the cab signaling system(CSS), and the flush toilet system (FTS)), the number ofaxles (a measure of the tractive effort of a locomotive,which determines its ability to move from a standstillor move up a steep grade), the horsepower (whichdetermines the maximum speed of the locomotive), theID of the current or previous train that it is currentlyassigned to (this is used to determine when locomotivesare joined together in a consist), the next time it is duein a maintenance shop, and the shop repair status. Afull description of the attribute vectors a and b aregiven in Appendix A.

Standard modeling practice for deterministic modelshas been to discretize time and then assume that allarrivals and departures occur at these time steps, asillustrated in Figure 1(a). Our work found that this didnot provide an acceptable level of accuracy even forour simplest models, since one train might arrive at12:37 p.m., a second might arrive at 1:12 p.m., and wemight need locomotives from both trains to move a

t – 1 t t + 1 t – 1 t t + 1

A

B

C

Trains

Locomotives

(a)A

B

C

Train 1

Train 2

Train 4

Train 3

(b)

Figure 1 Illustration of a Classical Time-Space Network for Multicommodity Flow Problems (a), and a Task-Graph Network That Provides for ContinuousArrival and Departure Times (b)

train that was originally scheduled to leave at 12:50 p.m.,but which now has to leave after 1:42 p.m.

To handle such situations, we used a task-graphrepresentation as depicted in Figure 1(b), where everytrain movement is represented as a specific arc uniqueto the movement. This representation allows us tomodel train departure times precisely, including theneed to delay trains small amounts because of a lateincoming train. Contrast this with an arc that sim-ply moves between points in space time, common inmulticommodity flow models (see Ahuja, Magnanti,and Orlin 1993). The task-graph representation allowsa locomotive from train 2 to be assigned to train 3,requiring that the departure time of train 3 be delayed.We limit how much a train might be delayed, butthe ability to handle train delays was found to be animportant feature. In all of our models, we are able torepresent arrival and departure times for locomotivesand trains in continuous time.

The attribute vector notation simplifies the transitionfrom the streamlined single and multicommodity mod-els, to the more detailed models used for fleet sizingand operational planning. The vector b is assumed toalways describe a single train. However, the attributevector a may not uniquely identify a locomotive (partic-ularly when a reduced attribute vector is used), and asa result Rta may be greater than one. In fact, the ADPmodel represents locomotives at three different levelsof detail at different points in time within the model.By contrast, the integer programming models use thesame attribute vector throughout the planning horizon,but we can create models of varying computationalcomplexity by changing the list of attributes that wewish to consider.

From the state vector Rt , we can compute two statis-tics that are used below to compute important costs.These are the following:

F= Set of foreign railroads.RF

tf = The number of foreign locomotives owned byrailroad f ∈F at time t, for which the railroadhas to pay leasing costs. This is computed bysumming the elements of Rta where the own-ership attribute of a corresponds to a railroadf ∈F.

Page 6: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning6 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

RShpts = Number of locomotives in shop s at time t (in

the shop, or waiting in line to enter the shop).

To model consist formulation, we introduce thefollowing notation:

It = The set of train IDs at time t, where a train ID isrepresented using i ∈It .

Kti = Set of locomotives that are identified with trainID i ∈It at time t. We can think of the elementsof Kti as a set of train IDs or, equivalently, a setof attribute vectors a ∈A where Rta = 1.

If there are four locomotives in Kti for some train ID i,and these are assigned to three different trains (twolocomotives to one train, and one each to two separatetrains), then this implies that we have “broken” theconsist two times, each of which incurs a cost.

3.2. The Decision VariablesThe locomotives available at time t ∈ 801 0 0 0 1 T 9 areacted on with different types of decisions, which wemodel using the following:

DD = The set of decisions used to act on a train, con-sisting of “move” and “hold.” These decisions arealways used in the context of a unique trainidentified by its attribute vector b.

DB = The set of decisions to assign a locomotive to atrain in B, where d ∈DB corresponds to a trainbd ∈ B. Note that assigning a locomotive to atrain is distinct from the decision to move thetrain itself.

dH = The decision to hold a locomotive (the symbolrefers to the fact that we are doing nothing tothe locomotive).

dSet = Decision to set off a locomotive that is currentlyattached to a train.

dShp = Decision to move a locomotive into a shop forrepair or maintenance. This decision can only beused when a locomotive is in a yard where theshop is located.

DR = The set of all decisions that can be used to acton a locomotive,

= DB ∪ dH ∪ dSet ∪ dShp.

Each type of decision in d ∈DR acts on locomotiveswith an attribute vector in A. We represent decisionsusing the following:

xRtad = The number of locomotives (resources), with

attribute a ∈ A, that we act on with decisiond ∈DR.

xRt = 4xR

tad5a∈A1d∈DR .xD

tbd = The number of trains, with attribute b ∈B, thatwe act on with a decision d ∈DD. Since b alwaysidentifies a single train, xD

tbd ∈ 80119.xDt = 4xD

tbd5d∈DD1 b∈B.xt = 4xR

t 1xDt 5.

We represent the feasible region for xt using thefollowing:

Xt = The feasible region for xt that captures conserva-tion of flow, integrality, and nonnegativity, givenwhat we know at time t.

The equations that define Xt are given in Appendix C.There are other decision variables that are determined

directly by xRt and xD

t through various constraints,which are described in Appendix C. For this reason, allof these variables are a function of xt (we do not writethis explicitly for notational compactness). These arethe following:

wtb = The delay imposed on train b ∈B at time t due tothe specific assignment of individual locomotives.

yEtb = The number of power units added on a train

b ∈B at time t for repositioning.yGtb = The number of power units assigned to a train

b ∈B at time t to fulfill the goal requirement.ztb = The number of locomotive consist splits at time t,

for the consist identified by train with attribute b.For example, if a four-locomotive consist is bro-ken into a consist with two locomotives, alongwith two individual locomotives, this wouldproduce ztb = 2.

3.3. Exogenous InformationExogenous information processes arise only in stochas-tic models. This information arises to describe transittime and yard delays, as well as changes to the trainschedule. We model these changes using the following:

Rta = Exogenous changes to locomotives with attributea (primarily delays in transit times and yarddelays) from information arriving between t − 1and t.

Rt = 4Rta5a∈A.Dtb = Exogenous changes to a train with attribute b.Dt = 4Dtb5b∈B.Wt = The exogenous information that arrived during

time interval t.= 4Rt1 Dt5.

We treat Wt as the information that first becomes knownat time t (it is useful to think of Wt as information thatarrives continuously between t− 1 and t), and captureschanges in arrival times (relative to the schedule) andyard delays, the addition of new trains to the schedule(or random train cancellations), and equipment failures(the unexpected requirement that a locomotive bereturned to the shop for maintenance).

A delay in a train arrival is captured as a changein the estimated time of arrival (an attribute of thelocomotives pulling the train). Changes in the trainschedule are modeled as increases (schedule additions)or decreases (cancellations) to the vector Dt = 4Dtb5b∈B.

Page 7: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 7

In our deterministic model, there is no exogenousinformation about locomotives and trains. For thisreason, Rt = 0 and Dt = 0. The vector Rt is determinedpurely from the starting state R0 and the decisions wemake that act on locomotives. In a deterministic modelthat represents the entire problem over time, D0 wouldrepresent the entire train schedule. When we use ADP,we still model the schedule as if it arrives over time,even though there is no uncertainty. Thus, when weuse ADP, Dt can be thought of as the trains that weare thinking about at time t, which generally refers totrain departures over a fairly short horizon.

3.4. The transition functionThe most natural way to model the evolution of locomo-tives and trains is through the use of resource transitionfunctions that work at the level of individual locomo-tives and trains. We represent these using

at+1 = aM 4at1dt1Wt+151 ∀ 4at1dt5 ∈A×DR1 (1)

bt+1 = bM 4bt1dt1Wt+151 ∀ 4bt1dt5 ∈B×DD0 (2)

Here, at and bt are locomotive and train attributevectors at time t, dt is a type of decision acting on alocomotive or train at time t, and Wt+1 would be (in astochastic model) new information arriving between tand t + 1. The attribute vectors at+1 and bt+1 representthe future locomotive and the future train resultingfrom the implementation of decision dt , given whatwe know at time t + 1. In the case where dt is a holddecision, at+1 simply represents the same locationat the next decision period. When dt represents atrain movement, a number of rules are applied thatdetermine the precise departure time given all of thelocomotives assigned to the train. For the stochasticmodel, we can capture variations in transit timesthat were not known when we made the decision attime t.

Our modeling strategy is relatively independent ofthe rules that govern how the locomotives and trainsevolve over time. In the deterministic models, theattribute transition function performs the same rolethat the A matrix does in mathematical programming.In the ADP model (deterministic or stochastic), weimplement the dynamics of a transition just as wewould in any simulation model.

We can use the transition functions aM 4 · 5 and bM 4 · 5to create matrices ãR

t+14�5 and ãDt+14�5 that allow us to

write the transition functions for Rt and Dt . Let

�Rt+11 a′1 ad4�5=

{

1 if a′ = aM 4at1dt1Wt+14�551

0 otherwise1

�Dt+11 b′1 bd4�5=

{

1 if b′ = bM 4bt1dt1Wt+14�551

0 otherwise1

when we are following sample path � and observeWt+14�5. We then let �R

t+11 a′1 ad4�5 be the element ofãa

t+14�5 for row a′ and column 4ad5. We form ãbt+14�5

in a similar way. We drop the argument � and viewthese matrices as random at time t, where we simulatethem to compute the state of the system at time t + 1.This allows us to write our transition equations using

Rt+1 =ãRt+1xt + Rt+11 (3)

Dt+1 =ãDt+1xt + Dt+10 (4)

Written this way, the random vector Rt+1 would onlyrepresent exogenous arrivals to or departures fromthe system. An exogenous arrival might be locomo-tives coming from another railroad, or newly pur-chased/leased locomotives entering the system for thefirst time.

We are going to find it useful to introduce the conceptof a post-decision state variable for locomotives. Let

adt = aM 4at1dt1Wt1 t+151

where Wt1 t+1 is a forecast, made at time t, of theexogenous information arriving between t and t+1. Wecan interpret adt as the attribute vector that we wouldexpect the locomotive to have after being acted on usingdecision d. The most important illustration of theseideas for our work arises with travel times and theestimated time of arrival. If we send a locomotive fromyard A to yard B, adt would have yard B as the inboundlocation, and the time of availability (estimated time ofarrival) reflects the scheduled time of arrival.

Using the forecasts Wt1 t+1, we can create a forecastedindicator variable �R

t1t+11a′1ad and a forecasted incidencematrix ãR

t1 t+1. From this, we define the post-decisionresource vector Rx

t given by

Rxt =ãR

t1 t+1xt0 (5)

We could then let

Rt+1 =Rxt + Rt+11

but now Rt+1 is capturing not just exogenous arrivalsof locomotives but also updates due to random traveltimes. We note that Rt+1 may be positive or negative.

In some cases, it is useful to use a general transi-tion function St+1 = SM 4St1 xt1Wt+15, where the decisionvector xt captures actions on all of the locomotivesand trains. All of these representations are equiva-lent and are useful in different settings. Some addi-tional details of the transition function are given inAppendix B.

3.5. Objective FunctionWe maximize the net contribution, consisting of rewardsminus costs. We let contributions refer to parametersthat can be positive (the reward of moving a train or

sroberts
Cross-Out
sroberts
Inserted Text
T
sroberts
Cross-Out
sroberts
Inserted Text
F
Page 8: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning8 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

arriving at the shop on time, returning foreign powerto the owning railroad), or negative (trains departinglate, using a number of locomotives that is greater orless than the goal, arriving late to a shop appointment,holding on to foreign power). The elements of theobjective function are given by the following:

cRtad = Contribution earned from acting on a locomotivea ∈ A with a decision d ∈ DR. For a decisiond ∈DD to move a train with attribute bd, the costconsiders the desirability of using a particulartype of locomotive to the type of train.

cDtbd = Contribution earned from moving a train withan attribute vector b ∈B with a decision d ∈DD.Higher priority trains (e.g., merchandise trains)carry higher rewards than less service sensitivefreight (e.g., coal trains).

cHtb = Cost per time unit paid for holding a train b ∈B.cEtb = Cost per power unit paid for repositioning loco-

motives (“empty moves”) on a train b ∈B.cGtb = Reward earned from adding one unit of power

to fulfill the goal requirement of train b ∈B.cK = The cost of splitting a consist of locomotives into

two parts.cF = The cost of leasing one foreign locomotive per

time period.

An important dimension of the model involves therouting of power to shop locations for repair andmaintenance. For now, we define the shop contributionfunction using the following:

CShpRtet 4St1xt5= Total contribution across the system

for shop routing, capturing on-timeperformance and utility of locomotivesas they are routed to shop, incurred attime t.

CShpCongts 4R

Shpts 5= A nonlinear function capturing the

value of having RShpts locomotives at

shop s at time t.

The functions CShpRtet 4St1xt5 and C

ShpCongts 4R

Shpts 5 are de-

scribed in detail in §7.The total contribution for time period t is then given

by the following function:

Ct4St1xt5 =∑

a∈A

d∈DR

cRtadxtad +∑

d∈DD

b∈B

6cDtbdxDtb + cHtbwtb

+ cGtbyGtb + cEtby

Etb7+

k∈Kt

cKztk +∑

f∈F

cFRFtf

+CShpRtet 4St1xt5+

s∈S

CShpCongts 4R

Shpts 50 (6)

We write the dependence of the costs and contributionson the state St to reflect the fact that the contributionfunction depends on data that varies over time. Thevariables 4wtb4xt55, 4yG

tb5, 4yEtb5, 4ztk5, 4R

Shpts 5, and 4RF

tf 5 are

uniquely inferred from xt using operational constraintsthat we explain in detail in Appendix C.

If our problem is deterministic, which means that Rt

and Dt are zero for all time periods, we can write theobjective function as

maxxt∈Xt1 t=010001T

T∑

t=0

Ct4St1xt5 (7)

subject to the constraints xt ∈Xt , reflecting constraintsat a point in time t, and the transition constraints (3)and (4) that link activities across time.

If we need to capture an exogenous information pro-cess, we introduce a decision function, or policy, X�

t 4St5,which depends on the state variable St = 4Rt1Dt5. Thepolicy is designed to return a feasible vector xt =X�

t 4St5,where xt ∈Xt . The state variable evolves according tothe transition function St+1 = SM 4St1xt1Wt+15, whereWt+1 represents the exogenous information receivedduring time interval t + 1, including a sample real-ization of Rt+1 and Dt+1. The optimization problem isnow one of finding the best policy X�

t 4St5 in a set ç.The objective function for the stochastic problem isnow written

max�∈ç

Ɛ

{ T∑

t=0

C4St1X�t 4St55

}

0 (8)

Here, the decision function X�t 4St5 is designed to return

a vector xt ∈Xt , while the state variables are linkedusing St+1 = SM 4St1xt1Wt+15.

The deterministic optimization problem defined by(7) is well understood, but that does not mean theproblem is easy. As we show below, there are importantversions of the locomotive optimization problem thatcan be solved by commercial solvers. However, it isnot hard to create problems that cannot be solved as adeterministic integer program.

The stochastic version, however, is an entirely differ-ent matter. Even small stochastic optimization problemscan be computationally intractable. In §6, we describean algorithmic strategy based on ADP, which we useto solve both deterministic and stochastic versions ofthe problem.

We begin by describing the deterministic integerprogramming models. We first describe the simplifiedscheduling model known as PLASMA/SC, which pro-duces a rough operating plan; this model is used byNorfolk Southern in production to provide approxi-mate, high-level scheduling instructions over an entireweek. We present this model as an instance of a prac-tical, implementable model that can be solved opti-mally using commercial solvers. We then describePLASMA/MC, which uses a simplified attribute vec-tor, producing a deterministic integer program thatcan be solved exactly. This model is used only forbenchmarking the ADP algorithm.

Page 9: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 9

Mon Tue Wed Thu Fri Sat Sun

Shop

sL

ocat

ions

: Train nodes

: Locomotive nodes

: Out-of-shop arcs: To-shop arcs: Assignment arcs: Train movement arcs

Sunday midnight

Figure 2 Network Representation of the Single Commodity Planning Model

4. The Deterministic SingleCommodity Model

The single commodity planning model, PLASMA/SC(see Figure 2), is used within Norfolk Southern toprovide scheduling guidelines of the following form:locomotives on inbound trains A and B should be usedon outbound trains C and D. There was no expectationthat this plan would be followed all of the time, butrather is only used as a guide in the field. Exceptionsare made routinely as dispatchers have to take intoconsideration all of the issues described in the detailedmodel. The model is highly simplified, but it can besolved optimally using CPLEX in under two minutesfor a weeklong horizon. It provides value in its abilityto take a global perspective of the entire network.It is our belief that PLASMA/SC provides valuablehigh-level guidance to the railroad, and could be usedto guide more detailed operational models such asPLASMA/MA.

The full model is described in Appendix E. Thefeatures of the model include the following:

• Starting and ending locomotive inventories areconstrained to be the same, making the plan repeatablefrom one week to the next.

• There are bonuses encouraging flow patternsthat are repeatable from one day to the next, whichsimplifies operations.

• The model uses a first-in, first-out assignmentprotocol, which is easier for yards to manage.

• The model contains simplified logic to minimizethe number of times that consists have to be broken.

• It uses simplified logic for shop routing and foreignpower.

It was critical that PLASMA/SC be solvable using acommercial solver such as CPLEX that ensures thatflows are optimized globally over the network. For thisreason, several simplifications had to be introduced.Locomotives were aggregated into a single type. Also,it does not route individual locomotives through theshop as we do in the multiattribute model, but it doesapproximate the flow of locomotives through the shopat an aggregate level to avoid artificially inflating thefleet. Finally, it does not recognize ownership and as aresult it manages the flows to and from other railroads(“foreign power”) at an aggregate level.

Similar issues were captured in the model (such asrepeatability of the flows) described in Ahuja et al.(2005), which was solved using large neighborhoodheuristics. We found that CPLEX 12.1 could solve ourformulation of the problem without difficulty for theweekly train schedule. This model is currently beingused at Norfolk Southern.

We tested the model on a typical data set with2,082 trains scheduled over a week in 146 differentlocations. The network interacts with five foreignrailroads through exchange locations. To obtain asolution within a reasonable run time, only one genericlocomotive type is considered. The experiment has beenconducted on a compute node with 96 GB of RAMand two Intel Xeon X5667 processors that supports 16threads using CPLEX 12.1. For a relative gap of 0.07%,the model is able to produce a solution in 94 seconds.

We have been able to conclude from our workwith PLASMA/SC that classical integer programmingformulations can be very effective for simplified loco-motive models. The most important simplifications forthis model are the aggregation into a single class oflocomotive, and the inability to delay trains.

Page 10: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning10 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

0

1,000

2,000

3,000

4,000

5,000

6,000

0.17

(4 hrs)

1.00

1.25

1.50

1.75

2.00

2.25

2.50

2.75

3.00

3.25

3.50

3.75

4.00

Num

ber

of tr

ains

Planning horizon (days)

Number of trains(a)

0

50,000

100,000

150,000

200,000

0.17

(4 hrs)

1.00

1.25

1.50

1.75

2.00

2.25

2.50

2.75

3.00

3.25

3.50

3.75

4.00

Prob

lem

siz

e (c

ols

+ r

ows)

Planning horizon (days)

(b)Cols

Rows

Figure 3 Size of the Integer Programming Model as a Function of the Planning Horizon

5. The DeterministicMulticommodity Model

PLASMA/MC is a deterministic integer programmingmodel represented by the objective function (7), whichis solved subject to the constraints xt ∈Xt for each timet = 01 0 0 0 1 T , and the linking constraints (3) and (4).The distinguishing feature of the model is the use of arestricted set of attributes, which allows us to grouplocomotives into four classes (high and low horse-power, high and low adhesion), producing a classicalmulticommodity flow model. This is a model that canbe solved by both CPLEX and (in its deterministicform) ADP, allowing us to use CPLEX as a benchmarkto evaluate the quality of the ADP solution. This modelalso allowed us to determine the boundaries of CPLEX.

The model has been tested on a Norfolk Southerndata set. The data set has 2,337 locomotives placed in 443different yards and ready to be used at the beginning ofthe planning horizon. Our data set includes 9,295 trainsover a seven day horizon, which proved to be beyondthe capabilities of CPLEX, so we resorted to solvingproblems with shorter horizons. Figure 3(a) shows thenumber of trains as a function of the planning horizon,and Figure 3(b) shows the number of rows and columns.

0

5

10

15

20

0.17

(4 hrs)

1.00

1.25

1.50

1.75

2.00

2.25

2.50

2.75

Solu

tion

time

(hou

rs)

Planning horizon (days)

Gap = 0.3%

(1.5

sec

s)

(2 m

ins)

(1 h

r)

(15.

5 hr

s)

(a)

0

1

2

3

4

5

6

0.17

(4 hrs)

1.00

1.50

2.00

2.50

3.00

3.50

Solu

tion

time

(hou

rs)

Planning horizon (days)

Gap = 1.0%

(b)

(5.1

hrs

)

(1.6

min

s)

(1.8

min

s)

(1.5

min

s)

(2.3

hrs

)

(8.1

hrs

)

(0.1

sec

)

(0.8

min

)

(1.8

min

s)

(3.6

min

s)

(6.4

min

s)

(18.

3 m

ins)

Figure 4 CPU Time for Integer Programming Model as a Function of the Planning Horizon

This study has two purposes. First, we want tolearn about the effect of the planning horizon on thesolution time. This analysis allows us to understand thelimits of CPLEX for solving a deterministic formulationof the model. Second, we use the results from theseexperiments as a benchmark for the ADP algorithm,although this comparison is reported below.

Our experiments were run with CPLEX 12.1 runningon a compute node with 96 GB of RAM and two IntelXeon X5667 processors that support 16 threads. Weconducted two sets of experiments using as stoppingcriteria a relative gap of 0.3% in the first set and 1%in second set. The results are shown in Figures 4(a)and 4(b).

These results show the dramatic increase in CPUtimes as the horizon increases, even when we relaxthe optimality gap. We note that caution has to beused when relaxing the optimality gap. Although a1% gap may seem quite acceptable, it actually hasbeen found to produce odd anomalies when detailedlocomotive-to-train assignments are examined. Yet,even with a 1% gap, run times exceed five hours fora 3.5 day horizon, and a run submitted for a fourday planning horizon was stopped after four days ofCPU time.

Page 11: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 11

We believe that this dramatic increase in CPU timeswill arise in any search algorithm that attempts to solvethe entire problem simultaneously over these longerhorizons. The problem is that finding an improvementin an assignment at one point in time requires reopti-mizing the entire path of the locomotive at all earlierand later points in time, which in turn will generallyrequire the reoptimization of other locomotives overthe entire horizon.

6. The Multiattribute Modelwith ApproximateDynamic Programming

The dramatic growth in run times for the aggregatedversion of PLASMA/MC, shown in Figure 4, demon-strates that an optimal solution of even a deterministicmodel over longer horizons with the full set of detailsis far beyond the capability of commercial solvers.It is unlikely that this is going to change even withimprovements in these solvers or improvements incomputer hardware. This observation motivated thedevelopment of an algorithmic strategy based on ADP.

Our original interest was a strategic planning toolthat could be used for fleet sizing. Fleet sizing is aparticularly difficult problem, because it requires thatthe model accurately capture locomotive productivity.This requires knowing that if a train needs an LSLlocomotive (which limits speed), we cannot movethe train unless one of the locomotives in the con-sist has this characteristic. We also have to managerouting locomotives to shop locations, while simulta-neously choosing locations to balance shop conges-tion. In addition, sometimes we will move a trainwith three locomotives, even if it only needs two, inorder to save the cost of breaking a consist. Alterna-tively, we may need to pull additional locomotives tobalance locomotive inventories around the network.Finally, we also have to route locomotives owned byother railroads (“foreign power”) back to the owningrailroad.

One algorithmic strategy is to use heuristics such asthe metaheuristics suggested in Glover and Kochen-berger (2003) or large neighborhood search algorithmsproposed in Ahuja et al. (2005). We attempted the useof heuristics to solve the local locomotive assignmentproblem in a yard and found that it worked quitewell most of the time. However, from time to timeit would provide answers that were clearly incorrect,causing considerable problems in identifying the causeof a problem (data?, model?, algorithm?). Furthermore,heuristics appeared to be an interesting strategy fordeterministic problems, but did not provide a naturalpath to handle uncertainties in an elegant way. Forthis reason, we adopted the strategy of ADP, which

can be used for both deterministic and stochasticproblems.

Our goal with ADP is to show the following: (1) ADPallows us to capture a high level of detail, producinga model that calibrates accurately against historicalperformance using a rigorous modeling and algorithmicframework. (2) We wish to evaluate the quality of theADP solution against the optimal solution of the largestmodel we can solve using integer programming on adeterministic model. (3) Finally, we wish to show thatan ADP policy that was trained using stochastic traveltimes produces a solution that is more robust thanan ADP policy trained on deterministic travel times(which is quite close to the optimal solution obtainedusing integer programming).

6.1. The ADP AlgorithmWe develop the ADP strategy in the context of astochastic model, and then show how it can be triviallyadapted to a deterministic model. We start by writingBellman’s equation as

Vt4St5= maxxt∈Xt

4C4St1xt5+ Ɛ8Vt+14St+15 � St950

To avoid the expectation, we use the predecision andpost-decision resource states (see Equation (5)) to breakthe Bellman’s equation into two equations:

Vt4St5= maxxt∈Xt

4C4St1xt5+V xt 4S

xt 551 (9)

V xt 4S

xt 5= Ɛ8Vt+14St+15 � Sx

t 90 (10)

Since we will never know V xt 4S

xt 5 exactly, we replace it

with an approximation Vt4Sxt 5.

Also, since xt is a fairly high-dimensional integervector, we would like to draw on the power of mathprogramming algorithms, which limits the types ofapproximations we can use. Early in the project, weused approximations that were linear in the resourcevariable Rt . Recognizing that Sx

t = 4Rxt 1D

xt 5, these linear

approximations had the form

Vt4Sxt 5=

a∈A1

VtaRxta1

where Rxta can be written

Rxta =

a′∈A1

d∈DR

�t1 a′1 adxta′d0

Note that we are approximating locomotives at theaggregation level represented by A1, which captureslocation, special equipment, the type of locomotive andif it is attached to a train. Note that our approximationignores the value of trains that are being delayed. Thisapproximation appears to work well enough, probablybecause delayed trains are not that common.

We worked with linear approximations for severalyears. Although they initially produced good results

Page 12: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning12 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

(and are much simpler to work with), we ultimatelyfound them to be unstable. We then made the tran-sition to separable, piecewise linear value functionapproximations following the work of Godfrey andPowell (2001, 2002) and Topaloglu and Powell (2006).These had the general form

Vt4Sxt 5=

a∈A1

Vta4Rxta50

Piecewise linear functions are much harder to workwith, primarily because decisions have to be made overtime, and travel times range from a few hours to a dayor more. If you wish to move additional locomotivesinto a region Wednesday afternoon, you can satisfythis need by moving locomotives over long distancesstarting early on Tuesday, or over shorter distances ontrains from nearby locations that depart Wednesdaymorning. Very specialized algorithms are required tohandle the problem of multiperiod travel times usingnonlinear value function approximations (see Godfreyand Powell 2002 and Topaloglu and Powell 2006 for afull description of how this problem is solved).

Over time, however, we found that even the non-linear approximations did not work well enough tosatisfy the careful analysis of the dispatchers withinNorfolk Southern. The problem could be traced tothe use of separable approximations. We might getapproximately the right number of locomotives of eachtype, but in aggregate we found that we simply hadtoo many locomotives in some locations.

To overcome this problem, we introduced (for thefirst time in our research) two layers of value functionapproximations that work at two different levels ofaggregation. Let Ag be the attribute space of locomo-tives at aggregation level g ∈G= 81129, where A1 is

LocomotivesHorsepower

4,400

4,400

6,000

4,400

5,700

4,600

6,200

Locomotivevalue

functions

Train reward function

TrainsDownstream

yard

Aggregatevalue

functions

Figure 5 Illustration of Locomotive Assignment Policy Using Hierarchical Value Function Approximations

the more disaggregate level of aggregation (but moreaggregate than A=A0), which captures locomotivetype, special equipment, and location, while A2 ignoresthe locomotive type (but still captures location).

We create two sets of value function approxima-tions that we represent using V

gt 4R

gt 5, where R

gt is

the resource vector using attribute space Ag . We thencreated a new value function approximation given bya weighted sum of V g

t 4Rgt 5 for g ∈G, given by

Vt4Rt5=∑

g∈G

�gVgt 4R

gt 51 (11)

Vgt 4R

gt 5=

a∈Ag

Vgta4R

gta51 g ∈ 811291 (12)

g∈G

�g= 10 (13)

Our locomotive assignment policy is now given by

X�t 4St5= arg max

xt∈Xt

4C4St1xt5+ Vt4Rxt 550 (14)

Figure 5 illustrates the locomotive assignment prob-lem, showing individual locomotives to the left beingassigned to individual trains. The assignment problemincludes all locomotives and trains that are available attime t, or will become available within a four hourwindow (the time interval from t to t + 1). The valueof locomotives in the future are captured at two levelsof aggregation. The detailed algorithm, without shoprouting, is given in Figure 6.

The algorithm progresses by simulating policy X�t 4St5

over the entire simulation horizon using Vt4Rt5. Com-puting X�

t 4St5 requires solving an integer programsubject to constraints xt ∈Xt on the availability of loco-motives captured by the high-dimensional vector Rt .

Page 13: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 13

Step 0. Initialization:Step 0a. Initialize an approximation for the value function

Vg10t 4Rx

t 5 to zero for all post-decision states Rxt , time periods

t = 80111 0 0 0 1 T 9, and aggregation levels g = 81129.Step 0b. Set n= 1.Step 0c. Initialize S1

0 . If the model is being run as a strategicplanning system, all locomotives start in a common “node in thesky” location; if the model is being run as an operational planningsystem, all locomotives are initialized at a given starting position.

Step 1. Choose a sample path �n. If the model is being run in adeterministic model, the sample path is always the same.

Step 2. Do for t = 0111 0 0 0 1 T :Step 2a: Choose a locomotive assignment using

X�t 4St5= arg max

xt∈Xt

(

C4Snt 1xt5+�

g∈G

�g∑

a∈Ag

Vg1n−1ta 4Rx1n

ta 5

)

1

where Rx1nt is the post-decision resource vector given by (5). Let xn

t

be the vector of assignments of locomotives to trains, includingdecisions to hold locomotives and trains. Compute vn

ta giving themarginal value of additional locomotives of type a ∈Ag for g ∈G.

Step 2b. Update Vg1n−1t−1 with the slopes v

g1nt using the CAVE

algorithm.Step 2c. Sample W n

t+1 =Wt+14�n5 and compute the next state

Snt+1 = SM 4Sn

t 1xnt 1W

nt+15.

Step 3. Calculate the set of paths PShp = 4PShptis 5 featuring different

travel times for getting power to shops using the train departuresgiven by xn = 4xn

t 5 as described in §7.

Step 4. Increment n. If n≤N go to Step 1.

Step 5. Return the value functions 4Vg1Nt 5Tt=1.

Figure 6 ADP Algorithm for the Locomotive Assignment Problem

To compute the marginal value of an additionallocomotive of type a ∈Ag , we first define

F 4Rt5= maxxt∈Xt 4Rt 5

4C4St1xt5+ Vt4Sxt 551

where Rt is imbedded in the constraint set Xt4Rt5(where we have written the dependence explicitly here).We approximate the marginal value of an additionallocomotive with attribute a by computing the dual vari-ables v

g1nta of the linear relaxation. We tried computing

numerical derivatives of the form

vg1nta = F 4R

gt + eta5− F 4R

gt 51

where eta is a vector of zeros with an element for eacha′ ∈Ag and a one for element a. Numerical derivativesdid not work because adding one additional locomotivewould not have any effect if two locomotives wereneeded to move the train. We computed the duals forall a ∈Ag .

The value functions are built using an iterativeadaptive learning algorithm where, at iteration n, theinformation (in the form of dual variables vg1n) neededto update these functions is gathered and stored as thealgorithm steps over time. Then, this information is

used at the end of the iteration to update the functionsto be used in the next iteration as follows

Vg1nt−1 =U4V

g1 4n−15t−1 1 S

g1nt−1 1 v

g1nt 51 g = 1120 (15)

In the equation above, U4 · 5 is a generic updatingfunction, V g1n

t−1 = 4Vg1nt−11 a5a∈Ag , and v

g1nt = 4v

g1nta 5a∈Ag is the

vector of derivatives at time period t − 1 at iteration n.We use the CAVE algorithm first given in Godfrey andPowell (2001) (see also Powell 2011, Chapter 13). TheCAVE algorithm is specifically designed to maintainconcavity, and has been shown to produce high-qualitysolutions in the context of fleet management problemsin truckload trucking and the management of freightcars for railroads.

The real value of solving locomotive assignmentproblems of the form illustrated in Figure 5 is speed.Whereas it may take five hours to solve a problem overa 3.5 day horizon (using a generous 1% optimality gap),solving a single locomotive assignment problem suchas that shown in Figure 5 takes around three secondsusing CPLEX, despite being a problem with thousandsof integer variables. Shallow problems appear to beextremely easy to solve. Even more important is thatthe problem in Figure 5 captures each locomotive withthe full set of attributes, including modeling the timeof availability down to the minute.

6.2. Benchmarking Against OptimalDeterministic Solution

As an initial test of our ADP algorithm, we solveda simplified version of the deterministic locomotiveplanning problem (no train delay, no shop routing, noforeign exchange, three days horizon) as a single integerprogram. Then, we solved the same problem with theADP algorithm proposed in this section. Figure 7 showsthe ADP objective function as a percent of the optimalfrom CPLEX. After 90 iterations, ADP produced asolution within half of 1% of the solution producedby CPLEX on a data set with a 3.5 day horizon (thelongest that we could solve using CPLEX). Later, wepresent a more comprehensive set of tests.

90

92

94

96

98

100

0 10 20 30 40 50 60 70 80 90

Perc

ent o

f de

term

inis

ticop

timal

sol

utio

n

Iteration

Figure 7 The ADP Objective Function as a Percent of the OptimalDeterministic Solution Obtained Using CPLEX, for a Problemwith a 3.5 Day Planning Horizon

Page 14: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning14 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

7. Shop RoutingShop routing is one of the most difficult operationalchallenges for the freight railroads in the United States.We have to get a locomotive due at the shop at time � asclose to this time without arriving late. The locomotivehas to move primarily on scheduled trains (which aresubject to random delays). In addition, we would likethe locomotive to be as productive as possible whilemoving toward its shop. As a final twist, we have someflexibility in the choice of shop we choose to send alocomotive, and the system has to balance congestionacross the shops.

We first define the following:

S= Set of shop locations.PShp

tis = A set of paths featuring different travel timesfor getting power from yard i at time t toshop s over scheduled trains.

�Shptis 4p5= The time required to traverse path p ∈PShp

tis .

Using this notation, we define three sets of costs.

cShpTmets 4p5= Bonus/penalty from arriving early/late

for getting a locomotive to shop s attime � when using path p ∈PShp

tis , wherethe function is given in Figure 8(a).

cShpUtilts 4p5= Cost attributed to the utilization of a

locomotive getting to shop when usingpath p ∈PShp

tis for getting a locomotive toshop s. This term captures the costs forsetouts that require breaking a locomo-tive from a consist.

CShpCongts 4R

Shpts 5= A piecewise linear and concave con-

gestion function for the maintenanceand repair services at shop s ∈S whenthere are R

Shpts locomotives in shop s

at time t, where the function is givenin Figure 8(b), which depicts an ideallevel of shop utilization, with lowerrewards when there are too many ortoo few locomotives.

0

–10 –9 –8 –7 –6 –5 –4 –3 –2 –1 0 1 2 3 4 5 6 7

Con

trib

utio

n

Days late

0

Con

trib

utio

n

Locomotives in the shop

Shop congestion function(b)(a)

Shop timing function

Figure 8 (a) Contribution for Arriving to Shop as a Function of the Arrival Time Relative to the Scheduled Arrival Time, and (b) Shop Congestion Function,Showing Increasing Value Up to the Capacity of the Shop, Followed by a Downward Sloping Function If the Number of Locomotives ExceedsCapacity, Implying Queueing

The shop routing model considers the marginal con-tributions c

ShpTmets 4p5 and c

ShpUtilts 4p5 per locomotive, and

the total contribution CShpCongts 4R

Shpts 5, which reflects the

congestion when there are RShpts locomotives in shop s at

time t.We solve the shop routing problem using an imbed-

ded dynamic program that produces a set of pathsthat can get a locomotive from a particular yard i at apoint in time t to a shop s by moving on a sequenceof trains. We do this by creating a graph of move-ments resulting from a forward pass of the algorithm,illustrated in Figure 9. We then solve a shortest pathproblem (illustrated in Figure 10) to get us from ayard i at time t to shop s at a time t′ > t, where t′

is set to t + 241 t + 481 0 0 0 1 yielding a set of shortestpaths that get to the shop at different points in timein the future. These paths make up the set PShp

tis , withassociated travel times �

Shptis 4p5 for p ∈PShp

tis .When we are considering an assignment of a loco-

motive with attribute a to train d ∈DB, it implies anarrival to a yard j at a time t′ corresponding to theestimated time of arrival of train bd. If the locomotiveneeds to get to a shop, we search over all paths p ∈PShp

t′js

for all shops s ∈S and we pick the one with the largesttotal value of time and utility contributions. That is,we compute

cShptb = Highest contribution of on-time plus utility

across path shops and all paths to each shop,

= maxp∈P

Shpt′ js

1 s∈S

8cShpTmets 4p5+ c

ShpUtilts 4p590

The total shop cost for a vector xt is then given by

CShpRtet 4St1xt5=

a∈A1d∈DB

cShptbd

xtad0

This shop cost is included in the contribution functionCt4St1 xt5 in Equation (6), forcing the model to balance

Page 15: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 15

Step 0

Time

Loc

atio

ns

T1

T4

T4T4

Step 1 Step 2 Step 3 Step 4 Step 5 Step 6 Step 7

Location 1

Location 3

Location 5 T4

T4 T4

T3

T3

T3

T3

Location 2(shop A)

Location 4(shop B)

T1 T2

T2T1

T1 T2

T2

Figure 9 Train Movements in the Current Iteration

Step 0

Time

Loc

atio

ns

Step 1 Step 2 Step 3 Step 4 Step 5 Step 6 Step 7

Location 1

Location 3

Location 5

Location 2(shop A)

Location 4(shop B)

Figure 10 Shop Routing Paths Built Using the Train Movements

shop routing issues with the other dimensions of thelocomotive management problem. We note that ourshop logic depends on the results of each simulationthat determines when trains actually move, and thetotal number of locomotives waiting in line at eachlocation at each point in time. We learn from theseactivities to estimate the marginal value of assigning alocomotive to arrive at a shop at a point in time, andwhen constructing the train schedule for the purposeof solving the shortest path problem.

8. Model ApplicationsAs of this writing, the multiattribute model can be runin two modes: strategic planning to determine fleet sizeand mix, and tactical planning to forecast/optimizeactivities over a one–seven day horizon. The model

can also be adapted for real-time planning purposes,but this third version has not been implemented as ofthis time. The only difference between the two modelsis the initial inventory of locomotives.

The strategic model works as follows. At the begin-ning of the simulation horizon, all locomotives areplaced in a super source, with one node for each typeof locomotive. The first decision problem, X�

0 4S05 hasto move locomotives from the supersource to a startinglocation. This is guided by the piecewise linear valuefunctions V n−1

0 4Rx05. Other than this initial step, which

is distinguished only by the initial positions of locomo-tives in a supersource, the algorithm is no differentthan when the starting attribute of each locomotive isspecified in a snapshot file (as is done in the tacticalplanning model).

Page 16: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning16 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

The tactical planning model, on the other hand, usesan initial snapshot of all of the locomotives, whichspecifies the full set of attributes of each locomotiveincluding starting location and time, as well as itsmaintenance status.

8.1. CalibrationSeveral years of calibration were devoted to the tuningof the model, algorithm, and data before the modelwas deemed to be calibrated. At Norfolk Southern,determining the right fleet size and mix requires tradingoff the expense of a larger fleet with the train delaysthat result from having too few locomotives. TheADP model will delay trains if there are not enoughlocomotives, but good estimates of train delays requirethat the model accurately represents operations at asurprisingly high level of detail. Our biggest challengewas the accuracy of the locomotive snapshot, whereerrors in the starting location would produce delaysbecause the locomotives were not in their real position.

Considerable care was invested in a detailed analysisof individual locomotive to train assignments. Althoughhigh-level performance measures are useful, they werenot enough for user acceptance. The calibration processwas assisted by a powerful tool, Pilotview, developedin CASTLE Lab for analysis of the results of transporta-tion models. See http://www.castlelab.princeton.edu/plasma.html to view a screen snapshot of Pilotview.

The calibration process involved generating train-delay curves, where we would run the model for arange of different fleet sizes, computing aggregate traindelay for each fleet size. We would then compare histor-ical train delay given the current fleet size. Figure 11(a)shows the results of the initial calibration exercise,using a data set from October 2007. The superb fit wasachieved largely because of careful attention to data,refinement of the model, and improvements to thealgorithm (in initial runs, the fit was terrible). Also,although the model has very few tunable parameters,there was no explicit step to fit the model using theseparameters, largely because the initial problems couldnot have been fixed just by tuning a parameter.

Del

ay (

hour

s/da

y)

Fleet size

Train delay—October 2007

Actual fleet size

Actual delay

(a)

Del

ay (

hour

s/da

y)

Fleet size

Train delay—March 2008

Actual fleet size

Actual delay

(b)

Model (min, average, max)Fit

History

Figure 11 Train Delay as a Function of Fleet Size, Compared to Historical Delay for October 2007 (Used for Calibration) and March 2008

Calibrating a model so that it produces the right levelof train delay is actually quite difficult, and we wouldargue that it represents a major plateau in locomotivemodeling. It requires we strike the right balance forbreaking consists, accurate modeling of yard delays,and the proper amount of repositioning locomotives tobalance the needs between yards. It is also necessaryto model the assignment of locomotives to shops formaintenance, as well as the proper handling of foreignpower. It is standard industry practice to optimizeflows assuming all trains leave on time (as we do inPLASMA/SC), but such models cannot be used forfleet sizing.

After achieving the close fit in Figure 11(a), werepeated the exercise on other data sets without dupli-cating the calibration process. The fit shown in Fig-ure 11(b) for a data set from March 2008, is typicalof the results we would obtain. After repeating thisprocess a number of times, Norfolk Southern gainedconfidence that the model was capturing the behaviorof their network, and the model became accepted astheir fleet planning tool.

8.2. Sensitivity and Stability AnalysisAn important question concerning the model is itssensitivity and stability with respect to different param-eters that influence the behavior of the model. Weanalyzed these questions for two parameters of con-siderable importance to any railroad: train speedand consist breakup cost. To address train speed, wegenerated train delay curves (delay as a function offleet size) while increasing or decreasing train speeds5%, 10%, and 15% relative to the base speed. Theresults are shown in Figure 12(a), which indicatesthat the model responds to train speeds as we wouldexpect.

We then performed a series of runs using a fixed fleetsize, where we varied the consist breakup cost. Theseresults show that increasing the consist breakup costproduced a steady, stable reduction in the number ofconsist breakups. These analyses suggest that ADP pro-duces results that respond in a stable, predictable way.

Page 17: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 17

Tra

in d

elay

(hou

rs/d

ay)

Fleet size

–15%–10%–5%Base

+5%+10%+15%

On

time

trai

ns(%

)(b)(a)

30

40

50

60

70

80

90

100

0 50 100 150 200 250 300 350

Nor

mal

ized

con

sist

bre

aks

(%)

Consist breakup cost

Figure 12 (a) On-Time Trains and Train Delay, as a Function of Fleet Size, for a Range of Average Train Speeds. (b) Consist Breakups for DifferentConsist Breakup Costs, Normalized as a Percent of Consist Breakups When the Cost Is Zero.

0

2

4

6

8

10

12

14

16

18

20

0 5 10 15 20 25 30 35 40

Ave

rage

sho

p qu

eue

leng

th

Time step

(b)(a)

0

50

100

150

200

250

–5 –4 –3 –2 –1 On time 1 2 3 4 5

Days late

Factor = 1.00Factor = 0.66Factor = 0.33Factor = 0.13Factor = 0.07Factor = 0.00

Cost 1

Cost 2

Figure 13 Effect of the Congestion Cost on the Shop Queue (a), and the Effect of the Penalty Cost on Arrival Times to Shop (b)

8.3. Shop RoutingWe tested the ability of the model to manage congestionacross shops, while still maintaining good performancein terms of getting power to a shop location on time.Figure 13(a) shows the aggregate number of locomo-tives sitting at different shop locations. Note that thereis little the model can do in the early time periods.When a low cost is used for shop congestion, we seethe number of locomotives sitting at different shopsremains high throughout the simulation. When weincrease the shop congestion penalty, there is a steadyreduction in the number of locomotives sitting at shoplocations. This suggests that we should be able toachieve higher locomotive utilization by minimizingthe number of locomotives waiting in line for theirshop appointment.

Figure 13(b) then shows the effect of varying thebonuses for arriving early (or penalties for arrivinglate) given in Figure 8. As more weight is given to thearrival time bonuses and penalties, we see a clear effecton the ability of the model to get locomotives to theshop on time. We note that even with a bonus of zero,some locomotives arrive at the shop in time becausethey are already at the correct location in the data set.

8.4. Testing on Stochastic DataAn advantage of ADP is the ease with which we canmake the transition from solving deterministic modelsto stochastic models. The only difference occurs whenwe choose a sample path in Step 1 of the algorithmin Figure 6. If we are solving a deterministic model,the new information about locomotives and trains isalways the same. When we move to solving a stochasticmodel, we introduce the dimension of sampling fromdistributions for transit times, yard delays, equipmentfailures, and schedule changes.

Incorporating uncertainty into the simulation of apolicy is straightforward. The challenge is producinga policy that is robust, which is to say that it makesadaptations to deal with uncertainty more effectively.In our setting, this occurs in a natural and intuitiveway. Figure 14 shows actual value functions for aparticular yard at a particular point in time on theNorfolk Southern network, trained using deterministicand stochastic data. The deterministic value functionrises and then quickly levels off when it reaches thenumber of locomotives needed for the yard at thatpoint in time.

When we use stochastic training, the value func-tion keeps rising. Instead of needing approximately

Page 18: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning18 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

Stochastic training

Deterministic training

Number of locomotives

Val

ue o

f lo

com

otiv

es a

t a y

ard

Figure 14 Deterministically and Stochastically Trained Value FunctionApproximations, for a Particular Yard at a Point in Time inthe Norfolk Southern Network

three locomotives, the value function using stochastictraining suggests that we need five or six locomotives,and we still derive some value from as many as sevenlocomotives. Interestingly, the marginal value of thefirst locomotive is much higher when using stochastictraining than deterministic training. We believe thisarises because locomotives are simply more valuablewhen the network is stressed by the uncertainty rail-roads experience from variations in travel times. Thevalue of a locomotive is magnified throughout thenetwork, because of the potential value added at futurepoints in time, at future locations.

We next ran two sets of simulations. In the first, wetrained the value functions on deterministic data, andthen ran a series of simulations where we simulatedvariations in transit times and yard delays. In the sec-ond, we trained the value functions on stochastic data,and then performed the same set of simulations withvariations in transit times and yard delays. Figure 15(a)shows the resulting inventories, demonstrating thatstochastically trained value functions lead to behaviorswhere yards are more likely to hold onto power whenpossible.

5

(a) (b)

10 15 20 25 30 35 1 25 50

Train delay using deterministically trained VFAsTrain delay using stochastically trained VFAs

75

Iterations

Train delay

100 125 150

Time period within simulation

Locomotive inventories

Deterministic trainingStochastic training

Figure 15 (a) Systemwide Locomotive Inventories Using Value Functions Trained on Stochastic and Deterministic Data, and (b) Total Train Delay WhenUsing Deterministically and Stochastically Trained Value Function Approximations

Figure 15(b) then shows the resulting train delayswhen we simulate the deterministically and stochasti-cally trained value functions. The delays with stochasti-cally trained value functions are not just lower, but areconsiderably less volatile.

The stochastically trained value function approxima-tions produce the intuitive behavior of responding touncertainty by maintaining buffer stocks of locomotives.This logic would be hard to mimic in a rule-basedsimulator in part because the buffer stocks differ notjust from one yard to the next, but also by day of week.This behavior reflects the well-known phenomenon thatthe utilization of locomotives varies by day of week.Value function approximations also make it possible forthe model to make trade-offs between holding powerin one yard versus repositioning it to another yardwhere it offers higher value.

9. Concluding RemarksThis paper has explored three classes of optimizationmodels for locomotive planning: a streamlined singlecommodity model, a more detailed multicommoditymodel, and a multiattribute model. We have thenconsidered two algorithmic strategies: integer program-ming (using CPLEX) and ADP. We showed that CPLEXcould easily handle the streamlined single commodityproblem, which is a model that is used in productionat Norfolk Southern to provide high-level scheduleguidance. However, CPLEX struggled with horizonsmore than three days for the more detailed model.

ADP, which can be described as an “optimizingsimulator,” uses feedback learning through value func-tion approximations to decompose large problems intoa sequence of smaller ones. One of the unexpectedoutcomes of this transition was the surprising reductionin CPU times, even when CPLEX was used to opti-mize locomotives over the entire railroad. ADP growslinearly with the number of time periods, implyingthat we could run simulations over a month or morewithout difficulty. The tactical model can be run in

Page 19: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 19

a production setting with a one-week horizon, withupdates every two minutes.

The research showed that an ADP-based model couldhandle a wide range of issues, spanning the goals andconstraints that guide consist formation, managementof foreign power, and the complex problem of simul-taneously routing power to shops while managingcongestion delays across the shops. We also found thatit could produce robust policies that handle uncertaintyin transit times and yard delays using simple, intuitivelogic.

AcknowledgmentsThis research into the approximate dynamic programmingalgorithms was supported in part by the Air Force Office ofScientific Research (AFOSR) [Grant contract FA9550-11-1-0172]. The adaptation to the locomotive planning problemwas funded by the Norfolk Southern Corporation.

Appendix A. The Attribute VectorsEach vector of locomotive attributes a ∈ A contains thefollowing 14 elements:

a1: decision time, which gives the discrete time t thatdetermines when we make a decision about assigningthe locomotive;

a2: current or inbound location;a3: locomotive ID;a4: owner;a5: type;a6: axles per unit;a7: horsepower per unit;a8: actionable (or available) time (also called the estimated

time of arrival);a9: current consist;a10: current train;a11: maintenance date;a12: shop repair status;a13: previous train symbol;a14: special equipment type installed on the locomotive (LSL,

CSS, FTS).

The attributes a31 0 0 0 1 a71a14 are static. The remainingattributes are dynamic.

The attribute vector b for trains is given by the following:

b1: decision time (normally the departure time, roundeddown to the nearest four hour interval);

b2: origin;b3: destination;b4: train ID;b5: type (merchandise, coal, grain, unit);b6: segment sequence;b7: arrival time to current location;b8: dwell duration at current location;b9: scheduled departure time;b10: duration;b11: expiration time (the train is canceled if held beyond this

time);b12: minimum power;b13: goal power;

b14: maximum power;b15: special locomotive equipment type required;b16: number of special locomotives required;b17: tonnage;b18: miles;b19: revenue;b20: previous train;b21: next train;b22: accumulated delay;b23: train symbol.

All of the dynamic attributes of a train are modified whenacted on with an assignment decision. In the case of adecision to hold a train, the only attribute that changes isthe decision time b1, since we move the train to the nextdecision subproblem. Furthermore, the arrival time to currentlocation attribute b7 is determined by the departure time ofthe previous train, if b7 exists. Trains can be delayed up toan expiration time and cannot depart before the scheduleddeparture time.

Appendix B. The Transition FunctionAccording to our modeling framework, when a locomotivewith attribute a ∈A is acted on at time period t with a decisiond ∈ D, the locomotive will be available in a future timeperiod t′ > t with an attribute a′ = aM 4a1 d1W5 (Equation (1)).Similarly, when a train b ∈B is moved with a decision d ∈DD ,the train will be available in the future with a modifiedattribute b′ = bM 4a1 d1W5 (Equation (2)). In this appendix, wedescribe how the arguments of aM and bM are used to obtaina′ and b′.

It is important to model the amount of time required tocomplete a decision. For this we define

�td = the expected duration of a decision d ∈D made attime t;

�t+11d = a random variable at time t that gives the durationof decision d based on information available attime t + 1.

These durations can be transit times, yard delays, set-offtimes (when breaking consists), and the time required tofinish a shop appointment. If a train departs at time t = 08:00on a move that should take �td = 10 hours (based on whatwe know at time t), we may get updates at each decisionepoch. Thus, by time t + 1, corresponding to time 12:00,we may have learned that the transit time has increased to�t+11d = 1102 hours.

The process of updating the attribute vector at dependson the nature of the decision. For a hold decision, the onlychange is the decision time. When a locomotive moves a train,its location changes, along with its available time (estimatedtime of arrival) and possibly its maintenance status. Thistransition might be written as

at+11i =

max8bt11 bt99+wtb + �t+11d1 if i = 1, 8bt31 if i = 2bt41 if i = 9bt1101 if i = 8bt1211 if i = 10at+11121 if i = 12b231 if i = 13ai1 otherwise0

Page 20: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning20 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

If we are solving a stochastic model, we would have torandomly sample the transit time �t+11d , and the repair statusat+1112. The point is to simply illustrate that the evolution ofthe attributes at and bt use standard tools from simulation.

There is a similar function for the train transition function.For example, it is through the train transition function that werepresent the fact that a train may consist of three segments.Thus, the train schedule may have trains going from A to B,B to C, and C to D, but if this is really the same train, wehave to capture the fact that the train from B to C cannotmove until the train from A to B has arrived (as well asmodeling any delays that may have incurred).

Given the attribute transition functions aM 4a1d1W5 andbM 4b1d1W5, we can construct the incidence functions

�Rt+11 a′1 ad4�5=

{

1 If a′ = aM 4at1dt1Wt+14�551

0 otherwise1

�Dt+11 b′1 bd4�5=

{

1 If b′ = bM 4bt1dt1Wt+14�551

0 otherwise0

We can then use these functions to model the transition ofthe resource and train vectors Rt and Dt using

Rt+11a′ =∑

a∈A

d∈DR

�Rt+11 a′1 ad4�5xRtad1

Dt+11b′ =∑

b∈B

d∈DD

�Dt+11 b′1 bd4�5xDtbd0

We note that although this is notationally useful, in software,the transition function is handled directly by the attributetransition functions applied to each locomotive and trainonce the decisions have been made.

Appendix C. The Constraints at Time tThe constraints represented by xt ∈Xt capture flow conserva-tion, along with the more constraints governing the processof consist formation. Basic flow conservation constraintsinclude

d∈DR

xtad =Rta1∑

d∈DD

xtbd =Dtb0

In addition, xRt and xD

t must be integer. For the multicom-modity and multiattribute models, the flows must be zeroor one. For the single commodity problem, they may begreater than one, but still integer.

The more complex constraints involve the operationalconstraints associated with assigning locomotives to trains. Atrain cannot move until the minimum power requirement ismet. Even though trains can move with minimum power,planners prefer to assign the number of locomotives thatfulfills the goal requirement. The units of power addedto a train above the goal requirement are assessed therepositioning cost. The total power assigned to a train shouldnot exceed the maximum power allowed on the train.

To model these constraints, we introduce the following:

wtb = the delay imposed on train b ∈B at time t due to thespecific assignment of individual locomotives;

yEtb = the number of power units added on a train b ∈B at

time t for repositioning;yGtb = the number of power units assigned to a train b ∈B at

time t to fulfill the goal requirement;

ztb = the number of locomotive consist splits at time t, forthe consist identified by train with attribute b, forexample, if a four-locomotive consist is broken into aconsist with two locomotives, along with two separatelocomotives, this would produce ztb = 2.

We use yGtb to denote the power units dedicated to pull

train b and yEtb to denote the power units added to b for

repositioning. We also let ap denote the power units providedby one locomotive a∈A, which is equal to either a6 or toa7, depending on what type of power unit is used by therailroad (axles or horsepower). If A represents the set of deadlocomotives that need to be pulled to the shop, then theminimum and maximum power requirement constraints fortrains with attributes b ∈B can be written as follows:

a∈A\A

d∈DD �db=b

apxRtad ≥ b12x

Dtb1 (C1)

a∈A

d∈DD �db=b

apxRtad ≤ b14x

Dtb0 (C2)

The goal power and the repositioned power constraints canbe expressed using the following inequalities:

yGtb ≤ inf

([

a∈A\A

d∈DD �db=b

apxRtad

]

− b12xDtb1 b13 − b12

)

1 (C3)

yEtb ≥

[

a∈A

d∈DD �db=b

apxRtad

]

− b12 − yGtb1 (C4)

yGtb ≥ 01 (C5)

yEtb ≥ 00 (C6)

As trains move in the network, they come across areasenforcing special regulations requiring that each train must bepowered by a number of locomotives with special equipmentsuch as the locomotive speed limiter, the cab signaling system, andthe flush toilet system. Notice that the LSL equipment fulfillsthe CSS and the FTS requirements, and the CSS equipmentfulfills the FTS requirements. The following constraint ensuresthat trains of attributes b ∈B cannot move unless the specialequipment requirements are satisfied:

a∈A\A �a14=b15

d∈DD �db=b

xRtad ≥ b16xDtb1 d ∈DD

� bd = b0 (C7)

Since not all locomotives and trains are available at thebeginning of time period t, we need to quantify the train delaywtb of each train b ∈B associated with each feasible solutionxt ∈Xt factored into the objective function contribution forperiod t defined by (6); wtb is computed by solving thefollowing constraints:

4a8 − b95xRtad ≤wtb1 ∀a ∈A1 ∀d ∈DD

� bd = b1 (C8)

wtb ≥ 01 (C9)

where a1 is the adjusted available time of the locomotive a. Ifthe locomotive a is still attached to its inbound train b′ 6= b,then a8 is set to a8 plus a set-off time. Otherwise, a8 = a8.

When a number of locomotives are assigned to a train,setting these locomotives together as an entity named consistrequires time and manpower. To minimize both train delaysand operational costs, it is important to avoid breaking consists

Page 21: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 21

Train 1

Train 2

Locomotiveconsist

H

O

Setoff decisions

Hold decisions

Assignments to train 1

Assignments to train 2

Shop decisions

S

Figure C.1 Locomotive Consist Break-Up Illustration

arriving at each yard to form new ones to power the outboundtrains. To model this problem, we use Ak to denote the set oflocomotive attribute vectors in a consist k ∈Kt and Rk

ta todenote the number of locomotives of attributes a ∈Ak. Asshown by Figure C.1, each locomotive of a consist k can beused in one of following four ways: hold to the next period,enter to a shop for either maintenance or repair, and assign toa train. A consist k (with more than one locomotive) wouldbe segmented in at least two parts if the locomotives in Ak

are utilized in at least two different ways among the four justdescribed. To determine the number of consist splits, whichwe denote by ztk, we introduce the following binary variables:

zHtk = a variable that equals 1 when at least one locomotivefrom Ak is held to the next period and 0 otherwise;

zStk = a variable that equals 1 when at least one locomotivefrom Ak is set-off from k and 0 otherwise;

zMtk = a variable that equals 1 when at least one locomotivefrom Ak is entered to shop for maintenance, and 0otherwise;

zReptk = a variable that equals 1 when at least one locomotive

from Ak is entered to shop for repair and 0 otherwise;zBtkb = a variable that equals 1 when at least one locomotive

from Ak is assigned to a train b ∈B and 0 otherwise;M = an arbitrary large number greater than or equal to

a∈Ak Rkta.

We now write the consist constraints to bound ztk given afeasible solution xt ∈Xt :

a∈Ak

d∈DH

xRtad ≤MzHtk1 (C10)

a∈Ak

d∈DS

xRtad ≤MzStk1 (C11)

a∈Ak

d∈DM

xRtad ≤MzMtk 1 (C12)

a∈Ak

d∈DRep

xRtad ≤MzReptk 1 (C13)

a∈Ak

d∈DD �db=b

xRtad ≤MzBtkb1 ∀ b ∈B1 (C14)

zHtk + zStk + zMtk + zReptk +

b∈B

zBtkb − 1 ≤ ztk1 (C15)

zHtk1 zStk1 z

Mtk 1 z

Reptk 1 zBtkb ≥ 01 and binary1 (C16)

ztk ≥ 01 and integer0 (C17)

Appendix D. Foreign Interchangeand Shop RoutingThere are more than 300 freight railroads in North America.To improve productivity, these railroads share many of theirassets on a daily basis. In this paper, we model the locomotiveinterchange process to minimize the leasing cost and maintaina balanced fleet over time by controlling the number oflocomotives assigned to trains leaving the network, given thecurrent excess of foreign locomotives. For a foreign railroadf ∈F we let Af be the set of locomotives owned by f andBf be the set of trains leaving to f . The excess of locomotivesowned by f at time period t, denoted by RF

tf , is bounded bythe following constraints:

a∈Af

d∈DD �db∈Bf

xRtad −∑

a∈Af

RAta ≤ RF

tf 1 (D1)

RF4t−15f +

a∈Af

RAta −

a∈Af

d∈DD �db∈Bf

xRtad ≤RFtf 1 (D2)

RFtf 1R

Ftf ≥ 00 (D3)

By calibrating the leasing cost cF applied to variable RFtf in

the contribution function (6), the model will minimize theexcess of foreign locomotives in the network over time.

We now formulate the constraints related to shop routing.Locomotives need to be regularly inspected and repaired. Forthis, the locomotive planners always try to route locomotivesto a shop while they are still useful. This is done by assigningthe units that are either due soon for maintenance or repair ontrains that bring them closer to a shop. The Federal RailroadAdministration (FRA) mandates that each locomotive has tobe inspected within 92 days from the last maintenance date.When the maintenance due date occurs before arriving to ashop, the engine is switched off and the unit is towed to thenearest shop. Whereas maintenance events are scheduledand known in advance, the repair events occur randomly.A random failure puts the locomotive in a state whereeither it can only be towed to a shop or it can be used topull trains for a number of days before it turns dead. Anumber of factors are taken into account to decide when andwhere a locomotive will be serviced. Among these are thefollowing:

• the locomotive current state (dead or operational) andthe type of shop work required;

• the date by which the locomotive needs to be in a shop;• the shop congestion level at the maintenance and repair

locations.To formulate the shop routing problem constraints of time

period t, we first need to determine the congestion levelat each shop s ∈S. To do this, we introduce the followingadditional notation:

�s4W5= the number of time periods required to service onelocomotive at shop s ∈S;

�s = the servicing capacity at shop s ∈S.

Assume that we have a feasible solution x ∈X, we want tocalculate the number of locomotives at each shop at each

Page 22: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning22 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

period t that we denote by RShpts . We start by calculating the

number of locomotives sent to a shop s ∈S at each timeperiod t, xts , as follows:

xShpts =

a∈A

d∈DShp

xRtad0 (D4)

Now, assume that the locomotives are serviced at every shops ∈S following the first-in, first-out policy and the serviceduration is deterministic (�s4W5 = �s), then the followingmust hold at every period t:

�ts =

min4q4t−15s + xts1�s −�4t−15s + �4t−�s 5s51

if 4�4t−15s − �4t−�s 5s+ 15≤ �s ,

01 otherwise1(D5)

�ts =�4t−15s − �4t−�s 5s+ �ts1 (D6)

qts = q4t−15s + xts − �ts1 (D7)

�ts =�4t−15s − �4t−�s 5s+ q4t−15s1 (D8)

RShpts = �ts + xts 0 (D9)

The variables used in the above equations are

�ts = the number of locomotives starting service;�ts = the number of locomotives being serviced;qts = the queue length;�ts = the number of locomotives at the shop before

adding xts .

The nonlinear congestion cost cShpCongts 4R

Shpts 5 added to the prob-

lem objective function should minimize the shop congestionover time.

Appendix E. The PLASMA/SC Planning ModelIn this model, the time discretization is done at the day level(T = 7). We use the notation introduced in §3 to describe thismodel with the following minor changes and additions:

ASt : the set of locomotives with attributes a ∈A released from

shop at the beginning of day t;AB

t : the set of locomotives with attributes a ∈A becomingavailable at day t after moving a train;

At = ASt ∪AB

t ;AS = 4AS

t 5t∈8010001T 9;AB = 4AB

t 5t∈8010001T 9;A= 4At5t∈8010001T 9;

DSt : the set of decisions of sending locomotives to shops

by the end of day t; here is one decision ds ∈DMt

per shop s ∈ S (�DMt � = �S�5 for each locomotive

node;RS

tas: the number of locomotives with attributes a ∈AS

out of shop s ∈S at the beginning of day t (givenas input to the model);

RBtas: the target number of locomotives with attributes

a ∈AB the model tries to send to shop s ∈S by theend of day t (given as input to the model);

aM 4a1d5: the transition function defined by (1) in a determin-istic environment;

bM 4b1d5: the transition function defined by (2) in a determin-istic environment;

xEtb : an integer (binary) variable representing the numberof trains of attributes b ∈Bt moved by the modeland used to reposition power;

xEt : 4xEtb5b∈Bt;

xt : 4xRt 1xDt 1x

Et 5.

For each locomotive with attributes a ∈A, the attributes a3,a4, a11, and a12 are dropped (aggregated to a common andmeaningless attribute), and we set a1 = a8. We let â−4t1 �5 bethe function returning the set of backward ordered � + 1 timeperiods (0 ≤ � ≤ T − 1) starting at t. Since in this model timeis cyclic (see Figure 2), this function can be explicitly statedas follows:

â−4t1 �5=

8t − �1 0 0 0 1 t91 if t ≥ �1

8T − �1 0 0 0 1 T 1091 t = 018T + t − �1 0 0 0 1 T 101 0 0 0 1 t91 otherwise0

In this problem, we assume that all locomotives sent toshop using a decision d ∈DS

t leave the system. This can beachieved by setting the duration of d equals to more thanseven days (�d > 7 days, for all d ∈DS

t ). However, if a train ora locomotive is acted on with a decision d ∈Dt\D

St with an

ending time greater than the planning horizon, then the timeis mapped within the horizon. Assuming that the durationof each decision d ∈Dt\D

St is shorter than seven days, the

mapping of the locomotive and train time attributes is doneas follows:

a′

i =

{

a′i − T 1 if a′

i ≥ T

a′i1 otherwise1

∀ 4a′1 a5 ∈AB×AB1 ∀d ∈D\DS

� aM 4a1d5= a′1 (E1)

b′

i =

{

b′i − T 1 if b′

i ≥ T

b′i1 otherwise1

∀ 4b′1 b5 ∈B×B1 ∀d ∈DD� bM 4b1d5= b′0 (E2)

Now, we formulate the problem constraints. We start withthe flow conservation constraints involving the locomotiveswith attributes a ∈AS

t coming out of a shop at day t:∑

d∈DSt

xRtad = 01 ∀a ∈ASt 1 (E3)

d∈DDt

xRtad =RStas1 ∀a ∈AS

t 0 (E4)

To route the required number of locomotives to the shops bythe end of day t, we add the following constraints:

t′∈â−4t165

a∈ABt′

�a2=s

xRt′ads ≤RBtas1 ∀ s ∈S0 (E5)

For the locomotives with a ∈ ABt , the flow conservation

constraints are expressed by the equation:∑

t′∈â−4t165

a′∈At′

d∈DDt′

�aM 4a′1d5=a

xRt′a′d =∑

d∈Dt

xRtad1 ∀a∈ABt 0 (E6)

A locomotive with attributes a ∈At is assignable to a trainb ∈Bt′ (xBtadb is feasible) if the locomotive and the train are atthe same location (a2 = b2) and, when the locomotive is noton a pass-through train, the time separating the locomotive

Page 23: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive PlanningTransportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS 23

availability and the train departure that we denote by H4a1 b5must be within a given time interval at the assignmentlocation 6h4a251 h4a257. In this model where time cyclic, H4a1 b5can be calculated as follows:

H4a1 b5=

b1 − 4a1 +O4a2551 if a1 < b1

b1 + T − 4a1 +O4a2551 if a1 > b1

−O4a251 otherwise1

∀ 4a1 b5 ∈AB×B0 (E7)

The function O4a25 returns the set-off time of the yard a2where the assignment takes place (typically four to six hours)if a setoff is required. As in the detailed model, the trainspower requirements must be enforced:

t′∈â−4t165

a∈At′

a6xRt′ad ≥ b12x

Dtb1 ∀d ∈DD

� db = b1 (E8)

t′∈â−4t165

a∈At′

a6xRt′ad ≤ b14x

Dtb1 ∀d ∈DD

� db = b0 (E9)

In this model, the power repositioning is allowed only ontrains starting from shop locations. To achieve this, we limitthe power assigned from inbound trains to the outboundtrain goal requirement:

t′∈â−4t165

a∈ABt′

a6xRt′ad ≤ b13x

Dtb1 ∀d ∈DD

� db = b1 (E10)

t′∈â−4t165

a∈ASt′

a6xRt′ad ≤ 4b14 − b135x

Etb1 ∀ b ∈Bt 0 (E11)

Multisegment trains have to run according to a sequenc-ing order as earlier train segments have to move first. Weformulate this constraint with the following equation:

xDt′b′ ≤ xDtb1 ∀ 4b1 b′5 ∈B×B � b′= b200 (E12)

The consist breakup constraints in the single commoditymodel are similar to the other models. To formulate theseconstraints, we use notation similar to the one introduced inAppendix C:

Akt = set of possible locomotive attributes in consists k ∈K at

day t;zStk = a variable that equals 1 when at least one locomotive

from Akt is sent to shop, and 0 otherwise;

zBtkb = a variable that equals 1 when at least one locomotivefrom Ak

t is assigned to a train b ∈B and 0 otherwise.

The constraints bounding the number of breakups ztk asso-ciated with each consist k ∈K can now be formulated asfollows:

a∈Akt

d∈DS

xRtad ≤MzStk1 (E13)

a∈Akt

xRtad ≤MzBtkb1 ∀ 4b1d5 ∈B×DD� db = b1 (E14)

zStk +∑

b∈B

zBtkb − 1 ≤ ztk1 (E15)

zStk1 zBtkb ≥ 01 and binary1 (E16)

ztk ≥ 01 and integer0 (E17)

The most challenging constraints of this problem are theones related to the solution repeatability over the weekdaysat the train symbol level. Some of the train schedules areduplicated over some days of the week and it is importantto power those trains the same way (at the train symbollevel) during the week. All trains of the same repeatableschedule share the same train symbol. We use Y to denotethe set of possible train symbols. We also let By representthe set of possible train attributes with the same symboly ∈Y (∀ b ∈By1 b23 = y). For each train b ∈By , we calculatethe number of locomotives assigned to b from inbound trainswith symbol y′ ∈Y, that we denote by x

Yy′b , as follows:

t′∈â−4t165

a∈ABt′y′

d∈DD �db=b

xRt′ad = xYy′b1 ∀ b ∈By0 (E18)

In the above equation, the set ABt′y′ contains the locomotives

with attributes a ∈ ABt′ after moving a train with symbol

y′ ∈Y (a ∈ABt′ � a

′13 = y′). The goal here is to keep the vector

xYb = 4x

Yy′b5y′∈Y as stable as possible across all the trains in By .

For each pair of trains with attributes 4b1 b′5 ∈By ×By , thechange in number of locomotives obtained from trains withsymbol y′ ∈Y is obtained by

�bb′y′ = �xYy′b − x

Yy′b′ �1 ∀ 4b1 b′5 ∈By

×By1 ∀y′∈Y0

To minimize the changes across all pairs of trains, we add acost to each variable �bb′y′ , and rewrite the above constraintas follows:

xYy′b − x

Yy′b′ ≤ �bb′y′1 (E19)

xYy′b′ − x

Yy′b ≤ �bb′y′1 (E20)

�bb′y′ ≥ 00 (E21)

We use the following costs and rewards in the objectivefunction:

cDtb : the contribution earned from moving one train b ∈Bt

(cDtb is positive b if it is a regular train and negative if b isa light engine);

cEtb : the cost of using train b ∈Bt to reposition power (an“empty” movement);

cRtad: the contribution earned from acting on one locomotivea ∈At with a decision d ∈D;

cK : the consist breakup cost;cY : the cost of having one change in the solution aggregated

to the train symbol level over time.

To enforce a FIFO policy when assigning locomotives totrains, we use the following locomotive-train assignmentcontribution for each decision d ∈DD:

cRtad =

01 if h4a25≤H4a1 b5≤ h∗4a25

−��H4a1db5−h∗4a25��1 if h∗4a25≤H4a1 b5≤ h4a25

−�1 otherwise1

where H4a1b5 is as defined by (E7), 6h4a251 h4a257, and6h4a251h

∗4a257 are the feasible and the best connection timeinterval at the assignment location a2, and � and � are

Page 24: From Single Commodity to Multiattribute Models for Locomotive … · 2014-07-18 · Locomotives represent a major investment for a rail-road that might operate over 2,000 units, representing

Bouzaiene-Ayari et al.: PLASMA: ADP for Locomotive Planning24 Transportation Science, Articles in Advance, pp. 1–24, © 2014 INFORMS

positive parameters (�> 1). The single commodity planningmodel can now be formulated as follows:

arg maxx∈X

{ T∑

t=0

b∈Bt

4cDtbxDtb + cEtbx

Etb5+

T∑

t=0

a∈At

d∈D

cRtadxRtad

+ cKT∑

t=0

k∈Kt

ztk + cY∑

y∈Y

y′∈Y\8y9

b∈By

b′∈By\8b9

�bb′y′

}

1 (E22)

where X is the set of feasible solutions x = 4xt5t∈8010001T 9 asdefined by constraints (E3)–(E21).

ReferencesAhuja R, Cunha C, Sahin G (2005) Network models in railroad plan-

ning and scheduling. TutORials in Operations Research (INFORMS,Catonsville, MD).

Ahuja R, Magnanti TL, Orlin J (1993) Network Flows: Theory, Algorithms,and Applications, Vol. 91 (Prentice Hall, Englewood Cliffs, NJ).

Ahuja RK, Liu J, Orlin JB, Sharma D, Shughart LA (2005) Solvingreal-life locomotive-scheduling problems. Transportation Sci.39(4):503–517.

Cordeau J-F, Soumis F, Desrosiers J (2000) A Benders decompositionapproach for the locomotive and car assignment problem.Transportation Sci. 34(2):133–149.

Cordeau J-F, Soumis F, Desrosiers J (2001) Simultaneous assignmentof locomotives and cars to passenger trains. Oper. Res. 49:(4)531–548.

Cordeau J-F, Toth P, Vigo D (1998) A survey of optimization mod-els for train routing and scheduling. Transportation Sci. 32(4):380–404.

Crainic T, Rousseau J-M (1988) Multicommodity, multimode freighttransportation: A general modeling and algorithmic frameworkfor the service network design problem. Transportation Res. Part B20(3):290–297.

Crainic TG, Laporte G (1997) Planning models for freight transporta-tion. Eur. J. Oper. Res. 97(3):409–439.

Dejax P, Crainic T (1987) A review of empty flows and fleet man-agement models in freight transportation. Transportation Sci.21(4):227–247.

Glickman T, Sherali H (1985) Large-scale network distribution ofpooled empty freight cars over time, with limited substitutionand equitable benefits. Transportation Res. Part B 19(2):85–94.

Glover F, Kochenberger G (2003) Handbook of Metaheuristics (Springer,Norwell, MA).

Godfrey G, Powell WB (2001) An adaptive, distribution-free algorithmfor the newsvendor problem with censored demands, withapplications to inventory and distribution. Management Sci.47(8):1101–1112.

Godfrey G, Powell WB (2002) An adaptive, dynamic programmingalgorithm for stochastic resource allocation problems I: Singleperiod travel times. Transportation Sci. 36(1):21–39.

Haghani A (1989) Formulation and solution of a combined trainrouting and makeup, and empty car distribution model. Trans-portation Res. Part B 23(6):433–452.

Herren H (1977) Computer controlled empty wagon distribution onthe SSB. Rail Internat. 8(1):25–32.

Holmberg K, Joborn M, Lundgren JT (1998) Improved empty freightcar distribution. Transportation Sci. 32(2):163–173.

Joborn M (2001) Optimization of empty freight car distribution inscheduled railways. Ph.D. thesis, Department of Mathematics,Linkoping University, Linkoping, Sweden.

Joborn M, Crainic TG, Gendreau M, Holmberg K, Lundgren JT(2004) Economies of scale in empty freight car distribution inscheduled railways. Transportation Sci. 38(2):121–134.

Jordan W, Turnquist M (1983) A stochastic dynamic network modelfor railroad car distribution. Transportation Sci. 17(2):123–145.

Lingaya N, Cordeau J-F, Desaulniers G, Desrosiers J, Soumis F (2002)Operational car assignment at via Rail Canada. TransportationRes. Part B 36(9):755–778.

Mendiratta VB, Turnquist MA (1982) A model for the managementof empty freight cars. Transportation Res. Record 838:50–55.

Nemani A, Ahuja R (2011) OR models in freight railroad industry.Wiley Encyclopedia of Operations Research and Management Science(John Wiley & Sons, Hoboken, NJ).

Powell WB (2007) Approximate Dynamic Programming: Solving theCurses of Dimensionality (John Wiley & Sons, New York).

Powell WB (2011) Approximate Dynamic Programming: Solving theCurses of Dimensionality, 2nd ed. (John Wiley & Sons, Hobo-ken, NJ).

Powell WB, Bouzaiene-Ayari B, Simão H (2006) Dynamic models forfreight transportation. Laporte G, Barnhart C, eds. Handbooks inOperation Research and Management Science: Transportation, Vol. 14(North-Holland, Amsterdam), 285–366.

Powell WB, Jaillet P, Odoni A (1995) Stochastic and dynamic networksand routing. Monma C, Magnanti T, Ball M, eds. Handbook inOperations Research and Management Science, Volume on Networks(North-Holland, Amsterdam), 141–295.

Powell WB, Shapiro JA, Simão HP (2001) A representational paradigmfor dynamic resource transformation problems. Coullard RFC,Owens JH, eds. Annals of Operations Research (J.C. Baltzer AG,Basel, Switzerland), 231–279.

Powell WB, Shapiro JA, Simão HP (2002) An adaptive dynamic pro-gramming algorithm for the heterogeneous resource allocationproblem. Transportation Sci. 36(2):231–249.

Simão HP, Day J, George A, Gifford T, Powell WB, Nienow J (2009)An approximate dynamic programming algorithm for large-scale fleet management: A case application. Transportation Sci.43(2):178–197.

Topaloglu H, Powell WB (2006) Dynamic programming approxima-tions for stochastic, time-staged integer multicommodity flowproblems. INFORMS J. Comput. 18(1):31–42.

Vaidyanathan B, Ahuja R (2008) Real-life locomotive planning: Newformulations and computational results. Transportation Res. Part B42(2):147–168.

Vaidyanathan B, Ahuja R, Orlin JB (2008) The locomotive routingproblem. Transportation Sci. 42(4):492–507.

White W (1972) Dynamic transshipment networks: An algorithm andits application to the distribution of empty containers. Networks2(3):211–236.

Ziarati K, Soumis F, Desrosiers J, Solomon M (1999) A branch-first,cut-second approach for locomotive assignment. Management Sci.45(8):1156–1168.

Ziarati K, Soumis F, Desrosiers J, Gelinas S, Saintonge A (1997)Locomotive assignment with heterogeneous consists at CNNorth America. Eur. J. Oper. Res. 97(2):281–292.