c.labbÉ f.reblewski j.-m - operations research · emulation is performed by an emulator, ......

19
REVUE FRANÇAISE DAUTOMATIQUE , DINFORMATIQUE ET DE RECHERCHE OPÉRATIONNELLE .RECHERCHE OPÉRATIONNELLE C.L ABBÉ F. R EBLEWSKI J.-M. V INCENT Performance evaluation of high speed network protocols by emulation on a versatile architecture Revue française d’automatique, d’informatique et de recherche opérationnelle. Recherche opérationnelle, tome 32, n o 3 (1998), p. 253-270. <http://www.numdam.org/item?id=RO_1998__32_3_253_0> © AFCET, 1998, tous droits réservés. L’accès aux archives de la revue « Revue française d’automatique, d’infor- matique et de recherche opérationnelle. Recherche opérationnelle » implique l’accord avec les conditions générales d’utilisation (http://www.numdam.org/ legal.php). Toute utilisation commerciale ou impression systématique est constitutive d’une infraction pénale. Toute copie ou impression de ce fi- chier doit contenir la présente mention de copyright. Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques http://www.numdam.org/

Upload: dohuong

Post on 19-Apr-2018

216 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

REVUE FRANÇAISE D’AUTOMATIQUE, D’INFORMATIQUE ET DERECHERCHE OPÉRATIONNELLE. RECHERCHE OPÉRATIONNELLE

C. LABBÉ

F. REBLEWSKI

J.-M. VINCENTPerformance evaluation of high speed networkprotocols by emulation on a versatile architectureRevue française d’automatique, d’informatique et de rechercheopérationnelle. Recherche opérationnelle, tome 32, no 3 (1998),p. 253-270.<http://www.numdam.org/item?id=RO_1998__32_3_253_0>

© AFCET, 1998, tous droits réservés.

L’accès aux archives de la revue « Revue française d’automatique, d’infor-matique et de recherche opérationnelle. Recherche opérationnelle » impliquel’accord avec les conditions générales d’utilisation (http://www.numdam.org/legal.php). Toute utilisation commerciale ou impression systématique estconstitutive d’une infraction pénale. Toute copie ou impression de ce fi-chier doit contenir la présente mention de copyright.

Article numérisé dans le cadre du programmeNumérisation de documents anciens mathématiques

http://www.numdam.org/

Page 2: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

Recherche opérationnelle/Opérations Research

(vol. 32, n° 3, 1998, pp. 253-270)

PERFORMANCE EVALUATION OF HIGH SPEEDNETWORK PROTOCOLS BY EMULATION

ON A VERSATILE ARCHITECTURE (*)

by C. LABBÉ C1), F. REBLEWSKI (2) and J.-M. VINCENT (3)

Abstract. - The goal of this study is to validate a new simulation technique for performanceévaluation of discrete time queuing networks. This technique uses a versatile architecture, whichhas changea from its primary purpose : circuit émulation in the design flow of chips. It is shownthaï this technique can be used to highlight rare events, such as realistic packet loss probability inhigh-speed switched networks. © Elsevier, Paris

Keywords: Emulation on a Versatile Architecture, performance évaluation, high speed networks,queuing networks, rare events.

Résumé. - Le but de cette étude est de valider une nouvelle technique de simulation pourV évaluation de performance des réseaux défiles d'attente en temps discret. Cette technique consisteà utiliser une machine à architecture reconfigurable en la détournant de son utilisation première :Vémulation de circuit dans la chaîne de conception de systèmes numériques intégrés. On montreque cette technique peut être utilisée pour mettre en évidence des événements rares, comme des tauxde pertes réalistes dans les réseaux haut débit. © Elsevier, Paris

Mots clés : Émulation sur une architecture reconfigurable, évaluation de performance, réseauxhaut débit, réseaux de files d'attente, événements rares.

1. INTRODUCTION

Broadband Integrated Services Digital Networks (B-ISDNs) are intendedto provide a variety of different teleservices on a single "universal" network.Moreover, such services can have widely differing Quality of Service (QoS)requirements. At the packet (cell) level, this means différences in permissible

(*) This work has been supported by France Telecom (CNET) and M2000.(') M2000, CNET/DTL/ASR, rue du vieux Chêne, BP 98, 38243 Meylan Cedex, France.

[email protected](2) M2000, Bâtiment Azur, 4 rue R. Razel, 91400 Saclay, France.(3) Laboratoire LMC-IMAG, Domaine Universitaire, BP 53 X, 38041 Grenoble Cedex, France,

[email protected]

Recherche opérationnelle/Opérations Research, 0399-0559/98/03/© Elsevier, Paris

Page 3: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

2 5 4 C. LABBÉ, F. REBLEWSK1, J.-M. VINCENT

cell loss and cell transfer delays. A complete spécification of performance inpacket switching networks in volves two components: packet loss and packetdelay. End-to-end delay is an important measure of performance as toomuch delay is equivalent to loss when considering a real time context. Thismeasure of performance dépends directly on the switch architecture. Thatis why investigation on performance évaluation and dimensioning buffered-switching éléments are so important. Furthermore, these research activitieshave a practical impact on architectures and protocols.

Models used for this research are of ten discrete time queuing networks.This is especially true in the case of ATM (Asynchronous Transfer Mode),where slotted time is quite natural since all the cells have the same size. Aslot is the time needed to serve a cell, A realistic packet loss probability isaround 10~8-10~9. Such losses are rare events which are difficult to captureby direct computation. Software simulators are too limited to obtain such aprobability. For Systems with few tenth of thousand states P-simulation canbe used [11], Ho we ver it is currently possible to compute analytically realistictheoretical loss probabilities on the first stage of a broadband multiplexer andto obtain bounded results on the second stage [20]. Ho we ver the computationof realistic loss probability for subséquent stages is still an open problem.This is doe to the very large number of states of this problem.

The aim of this paper is to validate a new approach, using émulation on aversatile architecture machine for performance évaluation of finite capacitiesqueuing networks in discrete time. It is shown that this technique could beused to hightlight rare events, such as realistic packet loss probability inhigh-speed switched networks.

Emulation is a widely used technique in the design flow of chips.Programmable hardware is used to reproduce the functionalities of a circuit.Emulation is performed by an emulator, which can be seen as a hardwaresimulator, since its hardware configuration can be modified to model othercircuits ; this is an "all purpose hardware emulator" based on a versatilearchitecture [9, 7].

Here we will focus on packet loss probability and examine the trafficperturbation induced by stages of multiplexers. A four-by-four multistageATM switch [18] (see Figure 1) is used as an example throughout the article.If we assume that K is the capacity of queues and q the number of queuesunder study. The number of states of the System is (K + 2)q. That is to say,with five queues of capacity 20, the System has 5.106 states. For examplein the section 5 a model with 1012 states is treated and a loss probability of10~9 had been reached. This ATM switch is modeled by a queuing network

Recherche opérationnelle/Opérations Research

Page 4: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

PERFORMANCE EVALUATION OF HIGH SPEED NETWORK 255

which is emulated by a dedicated architecture on the versatile machine.The choice of architecture is determined by both the structure of the modeland the performance parameters under study. Expérimental protocols areconducted to collect statistics on losses and traffic characteristics,

2x2 output buffered switch

Sources i First stage Second stage

Figure 1. - Four-by-four double stage switch made of two-by-two output buffered switches.

The stucture of the paper is the following. The versatile architecture of theemulator and the software developments used in this study are presented inSection 2. The design method of a queuïng network, using this architecture,is discussed in Section 3, using the example of the four-by-four ATMswitch. Section 4 présents the hardware instrumentation required to obtaininformation on what happens in the queuing networks. This section alsoprésents the control with respect to statistical assumptions. Section 5 présentsa set of expérimental results on a four-by-four multistage ATM switch.

2. HARDWARE ARCHITECTURE AND SOFTWARE ENVIRONMENT

This section présents the hardware architecture and the softwareenvironment used to emulate queuing networks. The software is used todescribe a component modeiing the queuing network and the hardwaresimulator émulâtes this component.

2.1. Architecture

The hardware simulator is the M500 machine from Metasystems [9, 7],comprising:

• 500,000 programmable logic gates, connected to each other through aprogrammable network,

vol. 32, n° 3, 1998

Page 5: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

256 C. LABBÉ, F. REBLEWSKI, J.-M. VINCENT

• 17 Mbytes of memory, single or double port,

• adjustable clock frequency from 1 to 10 Mhz.

All this hardware can be shaped to emulate any digital and synchronouscircuit. The description of a chip is given to the Emulator by configurationfiles. Memories can be loaded or dumped using ASCII files. The values ofthe input signais of the emulated component are directly chosen by the user.The clock frequency is 10 Mhz for a very small occupation of the emulatorbut, under normal conditions, the frequency is usually close to 1 Mhz. Theemulator clock is under user control. All signais and register values areavailable on the last 7000 clock cycles, which is very useful for debugging.

2.2. Software environment

The software flow leads to the files required by the emulator to reproducethe functionalities of a circuit. These functionnalities are described in termsof concurrent processes using the VHDL language. VHDL is an efficientway of obtaining a high level description of a hardware component, whichis then translated into gates by the Synopsys synthesis tools. From thisreprésentation of the components, the Metasystems compiler produces thedata base required by the emulator.

The software flow is presented in Figure 2 and is detailed above:

• a VHDL (VHSIC Hardware Description Language) description of thechip is used to describe the System in terms of concurrent processes [13].

• Synopsys synthesis: this software, provided by Synopsys, translates theVHDL description into combinational logic and registers (logic gates),which are the basic logic éléments used in electronic Systems [14].

• The Metasystems compiler computes the links for the set of gatesavailable in the machine. This is the routing opération, which results in

VHDL Description

Compilation Synthesis Metasystems Compilation

Software simulation ofthe components

Hardware configuration

file^ Emulator configuration file

Figure 2. - Software flow used to structure the versatile architecture.

Recherche opérationnelle/Opérations Research

Page 6: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

PERFORMANCE EVALUATION OF HIGH SPEED NETWORK 257

Connecting the gates to each other through the programmable networkof the emulator.

3. MODELING DISCRETE TIME QUEUES USING THE VHDL HARDWAREDESCRIPTION

Due to this technique one has to rethink the queuing model in hardwaredesign. That is to say, in terms of counters, memories, comparatorsy statemachines... In this section, the method for modeling a queue in synthesizableVHDL is described. Different ways of building a queue are proposed,showing that the hardware représentation dépends on the adopted model. Thesimple case of a deterministic server is considered. In the first subsectionthe updating technique for the number of waiting customers is described.The second subsection présents a traffic supplier to feed a queue and, moreoften, a queuing network. This is the représentation of the arrivai process,i.e. génération of hardware sources. It is important to note that an automatedtranslation tool could be implemented. This will allowed to transform thequeuing model in a VHDL program, or even in a configuration of theemulator.

3.1. Hardware queue

A hardware représentation of a queue dépends on various parameters, suchas customer or server characteristics.

For example: if customers are ail similar, only a register and some "logicglue" are needed (see Figure 3). The register contains the number of waitingcustomers and is updated according to the number of new customer arrivais,the capacity of the queue and the number of customers served. As this is adiscrete time model, there are two options: arrivai first (AF) or departure first(DF). What is important to note is that in this case a slot is equivalent to aclock cycle. This is a poor représentation since performance parameters, suchas mean delay or delay variation are not easily measurable. It is howevercheap in terms of hardware occupation, since it requires only register andlogic glue.

In high-speed packet switched networks, there may be different levels ofpriority. In the previous model, the packets are ail identical but, in this case,a priority is assigned to each packet. To introducé this into our model, realpackets carrying information are used. This information has to be queued inthe same way as a real packet. This means that each queue of the networkis implemented using a memory corresponding in size to the capacity of

vol. 32, n° 3, 1998

Page 7: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

258 C. LABBÉ, F- REBLEWSKI, J-M. VINCENT

Arrivai First : New customers Buffer size Consumption 0

:> * > 1* >; 1Y Losses y Departure

Departure First : Consumption

Number of customer register

0 New customers Buffer size

_ : ^

l

Departure

+ j

Losses

Number of customer register

Figure 3. - Hardware représentation of a queue. This is a discrete time model (AF or DF)where arrivai and departure processes are not fixed. In each case only a register and somelogk glue are needed. Arrows give the way of propagation of integer signais.

the queue. However, in the case of multiple arrivais, multiple informationhas to be stored in one slot. If packets are in conflict, they all have tobe stored in the memory in the same time slot. As a resuit, a slot hasto be decomposed into multiple clock cycles. This reduces the simulationspeed and increases the time consumption, -since one slot requires multipleclock cycles- and the hardware size consumption, since each queue needs amemory. This represents, however a considérable extension, as packets canbe tagged to take into account priority, or to carry dates. ït is thus possibleto study the end-to-end delay and delay variation, or even to study a singlecommunication with background traffic.

The results presented in Section 5 have been obtained using the firstmodel of a queue with the AF option.

3.2. Traffic supplier

There are different ways of supplying traffic to the network, It is possibleto feed the Network with real traffic stored in the memory before beginningthe experiment. This is an interesting possibility since the characteristics ofreal traffic are very hard to reproduce [15]. However, to reach a low lossprobability, a very large sample of traffic is required, which is too large to bestored in the emulator memory. Another solution is to emulate traffic usingstatistical models. The most widely used models are:

Recherche opérationneile/Operations Research

Page 8: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

PERFORMANCE EVALUATION OF HIGH SPEED NETWORK 259

uniform traffic, geometrie process: in this model, packets arrive ateach input port according to an independent and identically distributedBernouilli process with parameter p (0 < p < 1). That is to say a cellarrives at each slot witli the probability p. Thus p represents the inputload or the arrivai rate at an input port (denoted by p), Assuming thismodel for real traffic leads to an optimistic performance évaluation. Infact, this model does not capture the bursty nature of real time traffic [ 18].

Markov Modulated Bernouilli Process (MMBP) [3]: several models havebeen proposed to describe real traffic, from which on-ojf sources havebeen widely studied [21, 10]. These sources are characterized by threeparameters, À, /x, r, where A and pi dénote respectively the inversesojourn times in the off and on state, and r dénotes the rate of packetgénération in the on state (see Figure 4). The input load p is defined asthe ratio r\/(\+p,\ and the bursi length is defined as the average amountof information generated during a single on period of the source, whichequals r/pt. Expérimental studies have however revealed auto-siinilarityproperties in the behavior of network traffic. Markovian models whichcapture these properties can be found in [15].

xFigure 4. - An on-ojf source; during the on state, packets are generated with rate r.

Despite the drawbacks of the uniform traffic model, it has been used to obtainthe results of Section 5. Hardware sources are thus built with a memory-based random generator (see appendix A). This generator gives a n bit longrandorn word which goes through a comparator with p * (2n — 1). The resultof this comparison is a bit of présence of a packet, with the probability p.

The destination of each cell is chosen uniformly from among all outputports and independently of all other packets. For this, each 2 x 2 switch hasa memory-based random generator. This random generator gives a two bitlong random word which choose the output for each input port.

This memory-based random generator can also be used to implement aMMBP. A state register has to be used to know the state of the on - of ƒ

vol. 32, n° 3, 1998

Page 9: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

2 6 0 C LABBÉ, F. REBLEWSKI, J.-M. VINCENT

source. The value given by the random generator is then compare toÀ * (2n — 1) if the source is in the off state or to fi* (2n — 1) if the source isin the on state. We are now able to build queues and customer suppliers, butinstrumentation is needed to collect results and to capture interesting events.As seen previously (in Section 3), the structure of the component stronglydépends on the model, and also on the required performance parameters. Aninstrumentation component is therefore built to collect important information.When this has been performed, the experiment has to be cleverly controlled.In particular the stop conditions are discussed in the second subsection.

3.3. Collecting results

The information required for the cell loss probability is the numberof emitted and rejected packets. There are many different packet lossprobabilities to be studied. For example, the loss probability for the wholenetwork, for a single stage, or for a single queue. In terms of hardware, thismeans one register for the number of incoming cells and another for thenumber of rejected cells. This is necessary at every point where the cell lossprobability has to be measured (see Figure 10).

To study the effect of the queuing network on traffic characteristics, atraffic analyzer is required. In fact, it may be désirable to collect statistics,such as the load on a queue, the mean burst length observed on a link, orto study the inter-arrival process. This is also performed by implementationof counters and registers.

All these aspects have to be anticipated and planned during the circuitdescription. This instrumentation is often more significant than the queuingnetwork in terms of hardware occupation. So there are two different typesof hardware implemented. The first is the implementation of the problem,and is composed of queues and sources. The second is the instrumentationdepending on the performance measures studied. This type of hardware iscomposed of counters and registers.

4. INSTRUMENTATION AND CONTROLLING EXPERIMENTATION

4.1. Emulation time duration

This problem is linked to the control of the emulator. The experiment canbe stopped when a register is equal to a given value. The register of thenumber of lost cells was chosen as a trigger mechanism for this purpose.The problem is to choose the most suitable value of this register.

Recherche opérationnelle/Opérations Research

Page 10: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

PERFORMANCE EVALUATION OF HIGH SPEED NETWORK 261

Let X{ be

f l if thei t / l cellis lost

1 ö otherwise

The performance parameter studied here is the loss rate EXu an estimatoris given by :

-^ 1!

Suppose that, for the analysis of the problem, {Xi}%eIN are independentrandom variables from a common distribution.

Network Instrumentation

output flow

input flow

losses of the

different queues

Number

oflosses

Number of emitted cellFigure 5. - Hardware instrumentation for coUecting information on thenetwork activity. Arrows give the way of propagation of integer signais.

Using the central limit theorem [17], and the standart unbiaised estimator

the confidence interval z(a) with level a is computed.

vol. 32, n° 3, 1998

Page 11: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

2 6 2 C. LABBÉ, F. REBLEWSKI, J.-M. VINCENT

In a typical experiment for a =• 0.05, n = lö ï 0 and 10,000 losses, theconfidence interval is 2%, while it is 6.3% with lT0ö0 losses.

In fact for bursty traffic, the variables X% are no longer independent. Whena queue is full, the probability of loss {X{ = 1) is higher, but bursts of lossesare still expected to be independent of one another [2]•; In practice» to produceaccurate results, it was deeided to reproduce each experiment several times.Each experiment is stopped when 1,000 packets have been lost. The valuespresented in Section 5 are, for each point, the mean of those experiments.The resuit are given for a confidence interval of 6.3%.

4.2, Cost

We can see that, like every "program", the implementation cost can bemesured in terms of "time" and "space".

Let n be the number of cells to be treated, q the number of opérationsthat have to be done on each cell (in our case this is clearly the number ofservices required by each cell) and s the number of sources.

The complexity of the problem is then Ö(nq)9 as q opérations have tobe performed on each cell. This is also the time complexity of a sequentialalgorithm. In our case, the q opérations are made in a pipeline, and the ssources émit in parallel with a rate p. This is why the number of clockcycles needed to treat n cells is:

As an example, for n — 1010 cells on the four-by-four switch (s = 4),and p = 0.8, the number of clock cycles required is 3.1 10&. As the clockfrequency is 106 hz, 50 minutes are necessary.

The place complexity dépends on the complexity of a single queue andthe size of the switch ( 4 x 4 , 8 x 8 ...), that is to say on the hardwaresurface needed for the implementation. As seen in Section 3, a queue canbe implemented in different ways. Each way has a different cost in termsof hardware occupation. Moreover, the instrumentation often represents aconsidérable cost, which is why the place complexity is hard to quantify.As an example, four 4 x 4 switches in parallel only use seven boards of theM500 machine, i.e. 30% of the total capacity.

The most important point to note is that the time needed to evaluate theloss rate on the third stage is no longer than that required to evaluate theloss rate on the second one.

Recherche opérationnelle/Opérations Research

Page 12: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

PERFORMANCE EVALUATION OF HIGH SPEED NETWORK 263

5. APPLICATION TO THE TWO STAGES FOUR-BY-FOUR SWITCH

This section is devoted to the study of a four-by-four switch (see Figure 1).The first subsection concerns the cell loss probability. This leads to the studyof the perturbation introduced by the switch on the traffic. In particular,properties of the output traffic are cornpared to those of the input trafficin the second subsection. The traffic model adopted is the geometrie onepresented in Section 3.

5.1. Loss probability function of capacities

Figure 6 shows the global packet loss probability, which is the lossprobability over all the four-by-four switch. The x axis is the second stagequeue capacity K2 varying from 10 to 50. Each curve corresponds to adifferent size of first stage buffer (Ki = 10,15,20,30).

0.001

O.QOOl r

le-05 7

le-06 -

le-07 r

le-08

Figure 6. — Loss rate versus capacities of the first (K^) and second stage (i^)* p = 0.8.For Ki = 30 and K2 > 30 too few losses have been observed to produce an accuratevalue of the loss rate.

A plateau can be seen on each curve for high values of JK"2. This canbe explained by the fact that the second stage is large enough and thusail observed losses arise from the first stage. In this case, increasing thecapacity of the second stage will be useless.

On the other hand, for low values of K2, the curves are ail together. Infact the capacity K\ of the first stage is large with respect to that of thesecond stage. As a resuit, ail losses arise from the second stage. In this case

vol. 32, n° 3, 1998

Page 13: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

264 C. LABBÉ, F. REBLEWSKI, J.-M. VINCENT

increasing the size of the first stage without increasing the capacity of thesecond will not be useful.

For example, with K\ = 20, the "optimal" configuration, in terms oflosses with respect to memory cost, is obtained for K% ~ 30. That is to say,if K\ = K2, the cell loss probability differs on each stage.

The cell loss probability per stage can be seen in Figure 7, where thecapacities of stages are equal (K — K\ — K2). It should be noted that lossesare always greater on the second stage. This could be explained by the factthat the first stage induces perturbations in the traffic. In other words thetraffic following a buffer stage is more bursty than the one at the entrance.

Loss rate

5 7 10 15 20Figure 7. - Loss rate at the first and second stage

versus capacity (K = K\ = K2 and p = 0,8).

25 K

This is the reason why a traffic analyzer has been built to characterize theseperturbations. Information gathereded by this traffic analyzer is presented inthe next subsection.

5.2. Traffic perturbations

Traffic perturbations are shown by a traffic analyzer. The traffic analyzergives the number of inter-arrival times that lasted j slots at the output ofthe ith stage. This allows eomputation of the probability al- of observing aninter-arrival time of j slots at the output of the ith stage. It also gives thenumber of bursts that lasted j slots at the output of the ith stage. This allows

Recherche opérationnelle/Opérations Research

Page 14: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

PERFORMANCE EVALUATION OF HIGH SPEED NETWORK 265

computation of the probability bl- of observing a burst that lasted j slots atthe output of the ith stage. As a convention i = 0 gives these probabilitiesat the output of the source stage.

Figure 8 présents the distribution a® and aj and Figure 9 présents thedistribution b^ and bj (0 < j < 16). As the arrivai process is a geometrieprocess of parameter p, the theoretical distribution aÈ of a°- can be computed:

A x2 t e s t C1) on the observed distribution a*- confirms the accuracy of thetraffic supplier. For b® the theoretical distribution bt® is given by:

A x2 test on the observed distribution also confirms that the traffic observedat the output of the sources is the one expected.

Probability

0.9

n a

0.7

0.6

0.5

0.4

0.3

0.2

0.1

n

i i i i i i

Distribution of a°

-

-

-

Distribution of o |

-

-

-

I r-T) r-, _ ,0 1 2 3 4 5 6 7 J

Figure 8. - Probability of an inter-arrival time lasting j slots, at the sources(a°) and at the output of the switch (aj). p = 0.8 and Kx = K2 = 10.

Distributions aj and bj show the perturbations introduced by the switchon the traffic. From Figure 8, it is clear that the mean inter-arrival time hasincreased. Figure 9 shows that the number of small bursts has increased,

> All the statistical tests are done with a degree of confidence of 95%.

vol. 32, n° 3, 1998

Page 15: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

266 C. LABBÉ, F. REBLEWSK1, J.-M. VINCENT

Probabüity

0.3

0.25

0.2

0.15

0.1

0.05

Distribution of '6j

Distribution of 6|

n~ir-rnr-n

0 2 4 6 8 10 12 14 16 3

Figure 9, - Probabüity of a burst lasting j slots, at the sources (6°)and at the output of the switch (bj). p = 0,8 and Ki = K2 = 10.

TABLE I

Distribution aj and b). (Ki s= K2 = 10 and p - 0.8j.

i

2

3

4

5

6

7 ;

8

9

10

11

12

13

14

15

16

> 16

0.1994

0.1602

0.1283

0,1024

0.0819

0.0655

0.0527 :

0.0417

0.0335

0,0267 '

0.0214

0,0171

0.0137

0,0109

0,0087

0,0069 ;

0.0282

0.2851

0.1641

0.1069

0.0756

0.0562

0.0433 :

0,0344

0.0280

0.0232

0,0194

0.0165 !

0.0141

0.0122

0,0107

0.0094

0.0083

0,0918

0.7998

0.1602

0.0319

0,0063

0.0012

0.0002 •

5.3£ - 5 ;

9.4.E: - 6

ISE - 6

5.0£ - 7

3.1E-7

0

0

0

0

0

0

* ? :•0.6196

0.2323

0,0894

0.0351

0,0139

0,0056 •

0.0022

0.0009

0.0003

0.0001

6.1E-5

2.5E - 5

L I E - 5

3.7£ -* 6

9.7E - 7

S.2E - 7

6AE-7

Recherche opérationnelle/Opérations Research

Page 16: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

PERFORMANCE EVALUATION OF HIGH SPEED NETWORK 267

but the number of large bursts has also increased as 6] > 1 6 = 0.918 andb°J>16 = 0.0282 (see Table I).

Thus the différence of losses on the different stages can be explained bythe fact that the mean burst length has been increased by the two stagesof the switch.

6. CONCLUSION AND EXTENSION

In this document, a new methodology for simulation of discrete timequeuing networks with finite capacities has been presented. This newmethodology uses a versatile architecture, configured for maximum efficiencyfor a given problem. This method has to be compared to software tools whichonly allow the computation of higher (10~5) cell loss rates. What is importantto note is the fact that this architecture is very easy to configure, and thatthis configuration is very easy to debug. This last point results from the factthat all signais of the configuration are available.

This new approach has been applied to the study of high-speed packetswitched networks. This has allowed for simulation of realistic cell lossprobabilities (lO~8,10~9) at all stages of a multistage ATM switch. Usingthis thechnology, it is thus possible to highlight rare events with a highdegree of accuracy.

A traffic analyzer has also been presented. This traffic analyzer shows theperturbation induced by a multiplexer on the traffic and shows why trafficmodeling is so important.

For further studies, the model has to be extended to real cells carryinginformation. This extension will allow studies on end-to-end delay and delayvariation (jitter). It is also possible to introducé priority levels for packets,and new buffering stratégies, such as partial sharing.

More generally, this type of machine could be used to emulate numeroustypes of network protocols, and also to solve different discrete time queuingnetwork problems.

7. ACKNOWLEDGEMENT

The authors would like to thank V. Olive and S. Martin (2) for fruitful support during the courseof this work. The authors also thanks the anonymous référées for their valuable comments.

(2) from the CNET-Grenoble.

vol. 32, n° 3, 199.8

Page 17: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

268 C. LABBÉ, F. REBLEWSKI, J.-M. VINCENT

APPENDIX

A. PSEUDO RANDOM GENERATOR

16 bits signal xor Input load pn

16x126 memory 16 bits registerFigure 10- - The memory based random generator.

1 bit

The random generator is based on the polynomial 1 + X + X127 (see[18]). A double part memory is used as a shift register. Every clock cycle a16 bits random word w is generated (0 < w < 216 - l ) . To reproduce thegeometrie process describe in section 3 this word is compared with

Where |_J is the lower integer part. For example with p = 0.8 we havepn = 52429. A cell is emitted if w < pn-

Here is a part of the VHDL code for this random generator:

- A double port memory is used. Addresses are 7 bit words.

signal AddRead, AddWrite : std-ulogiovector(6 downto 0);

- Addresses as integer.

signal NumWrite, NumRead : integer range 0 to 125;

- 1 6 bits words read or write in memory.

signal ToBeWrite, Read : std-ulogiovector(15 downto 0);

- A 16 bits register at the output

signal regXO : std-ulogic-vectorCIS downto 0);

~ Type Conversion.

AddRead <= std-ulogic-vector(Conv-unsigned(NumRead,7));

AddWrt <= std-ulogicvector(Convunsigned(NumWrite,7));

if CK'event and CK='l ' then

- xor operator.

ToBeWrite <= Read xor regXO;

- Shift

Recherche opérationnelle/Opérations Research

Page 18: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

PERFORMANCE EVALUATION OF HIGH SPEED NETWORK 2 6 9

NumRead <= (NumRead +1) Mod 2**Tad;

NumWrite <= (NumWrite+1) Mod 2**Tad;

regXO <= Read;

- Comparator.

if ( conv-integer(unsigned(regXO))< rhon ) then

res<=l;

else

res<=0;

end if;

end if;

REFERENCES

1. A. GRAVEY and G. HÉBUTERNE, Simultaneity in discrete-time single server queues withBernouilli inputs, Performance Evaluation North-Holland, 1992, 14, pp. 123—131.

2. B. SERICOLA, F. GUILLEMIN, G. RUBINO and A. SIMONIAN, Transient characteristicsof an M/M/oo system applied to statistical multiplexxing on an ATM link.Publication interne 874, IRISA, October 1994.

3. W. FISCHER and K. MEIER-HELLSTERN, The Markov-modulated Poisson process(MMPP) cookbook, Performance Evaluation North-Holland, 1993, 18, 2,pp. 149-171.

4. E. GELENBE and G. PUJOLLE, Introduction to Queueing Networks, John Wiley & SonsLimited, 1987.

5. G. HÉBUTERNE, Ecoulement du trafic dans les autocommutateurs, Mas son, InstitutNational de Télécommunications, 1985.

6. F. HÜBNER, Discrete-time analysis of the busy and idle period distributions of afinite-capacity ATM multiplexer with periodic input, Performance évaluationNorth-Holland, 1994, 21, 1-2, pp. 23-36.

7. L. BURGUN, F. REBLEWSKI, G. FENELON, J. BARBIER and O. LEPAPE, Sériai Fault Emulation.In Proceedings of the 33rd Design Automation Conference 1996 (DAC 96),Metasystems, France, 1996, pp. 801-806.

8. J. PELLAUMAIL, Majoration des retards dans les réseaux ATM, Rairo rechercheopérationnelle, 1996, 30, pp. 51-64.

9. L. BURGUN and F. REBLEWSKI, Première Génération d'Emulateurs Matériels Meta-systems. In Quatrième symposium sur les architectures nouvelles de machines,Metasystems, France, 1996.

10. G. MEEMPAT, G. RAMAMURTHY and B. SENGUPTA, A new performance measurefor statistical multiplexing : perspective of the individual user, Performanceévaluation North-Holland, 1996, 25, pp. 59-80.

11. J. PELLAUMAIL, Graphes, Simulation, L-matrices, application aux files d'attente,Hermès, 1992.

12. G. PUJOLLE, Commutateurs ATM: Classification et Architecture, Technique et ScienceInformatique, 1992, 11, 1, pp. 11-29.

13. R. AIRIAU, J.-M. BERGE and V. OLIVE, VHDL du langage à la modélisation, Pressespolytechniques et universitaires romandes, France Telecom, 1990.

14. R. AIRIAU, J.-M. BERGE and V. OLIVE, Circuit Synthesis with VHDL, Kluwer AcademiePublishers, France Telecom, 1994.

vol. 32, n° 3, 1998

Page 19: C.LABBÉ F.REBLEWSKI J.-M - Operations research · Emulation is performed by an emulator, ... 500,000 programmable logic gates, connected to each other through a programmable network,

2 7 0 C. LABBÉ, F. REBLEWSKI, J.-M. VINCENT

15. S. ROBERT and J.-Y. LE BOUDEC, Can Self-Similar Traffic Be Modeled by MarkovianProces ses?, Lecture Notes in Computer Science, 1996, 1044.

16. J. ROBERTS and F. GUILLEMIN, Jitter in ATM networks and its impact on peak rateenforeement, Performance Evaluation North-Holland, 1992, 16, 1-3, pp. 35-48.

17. S. M. Ross, A Course in Simulation, Mamillan Publishing Company, University ofcalifomia, Berkeley, 1991.

18. R. Y. AWDEH and H. T. MOUFTAH, Survey of ATM switch architectures, Lecture Notesin Computer Science, 1995, 27\ pp. 1567-1613.

19. S. TEZUKA, Uniform Random Numberd: Theory and practicer Kluwer AcademiePiblishers, IBM Japan, 1995.

20. L. TRUFFET, Méthodes de Calcul de Bornes Stochastiques sur des Modèles de Systèmeset de Réseaux, PhD thesis, Université Paris VI, 1995.

21. Y. XIONG, B. STEYAERT and H. BRUNEEL, An ATM statistical multiplexer with on /offsources and spacing: numerical and analytical performance studies, Performanceévaluation North-Holland, 1994, 21, 1-2, pp. 37-58.

Recherche opérationnelle/Opérations Research