Layered queueing is a small and elegant extension to queueing networks that models complex resources held in groups while a program executes. The service time of an upper-layer server includes the time it spends using servers in lower layers. As a result, the interpretation of results (such as bottleneck identification) is different in layered systems, whether they are solved by analytic techniques or by simulation, or even observed through measurements.

Layered queueing occurs in all real systems, although the layered effects may be small enough to be ignored, giving the usual "flat" models.
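
The way an upper-layer service time absorbs lower-layer delay can be illustrated with a minimal numerical sketch. It is not the solution method used by layered queueing solvers. All names here (LowerServer, lower_response_time, upper_holding_time) and all parameter values are hypothetical, and the lower server's response time is approximated with the M/M/1 formula purely as a simplifying assumption. The lower server is also assumed to be shared with other clients, represented by a background call rate.

```python
from dataclasses import dataclass


@dataclass
class LowerServer:
    """Hypothetical lower-layer server (for example, a database)."""
    service_time: float     # seconds of service per call
    background_rate: float  # calls/second from other clients (assumed)


def lower_response_time(server: LowerServer, call_rate: float) -> float:
    """Response time of one call, using the M/M/1 formula as a crude assumption."""
    utilization = (call_rate + server.background_rate) * server.service_time
    if utilization >= 1.0:
        raise ValueError("lower server is saturated")
    return server.service_time / (1.0 - utilization)


def upper_holding_time(own_demand: float, calls_per_request: float,
                       server: LowerServer, request_rate: float) -> float:
    """Effective service time of the upper task: its own work plus the
    time it spends blocked on synchronous calls to the lower server."""
    call_rate = request_rate * calls_per_request
    return own_demand + calls_per_request * lower_response_time(server, call_rate)


if __name__ == "__main__":
    db = LowerServer(service_time=0.04, background_rate=5.0)
    own_demand = 0.02  # seconds of the upper task's own work per request
    rate = 4.0         # requests/second arriving at the upper task
    hold = upper_holding_time(own_demand, calls_per_request=2,
                              server=db, request_rate=rate)
    # A flat model counts only the upper task's own demand;
    # the layered view also counts the time it is held waiting below.
    print(f"flat utilization of upper task:    {rate * own_demand:.2f}")
    print(f"layered utilization of upper task: {rate * hold:.2f}")
```

With these assumed numbers the upper task is held far longer than its own demand suggests, so it can look heavily loaded in measurements even though the delay originates in the layer below. This is why bottleneck identification reads differently in a layered system.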

The model concept originated in the "active server" model of Woodside (1984), which was a systematic layered application of the surrogate delays of Jacobson and Lazowska (1983). Contributions have come from a broad group of researchers. Developments include 

Other models that do not exploit layered queueing explicitly still have a layered structure, such as the processor architecture model of Sorin et al. (1998). The presence of such implicit layering was shown in detail in a study of a distributed database system by Sheikh and Woodside (1997).

The Carleton University group that publishes this page has been developing techniques to solve layered models, to build them, and to exploit their results. While our own work is fully described here, we also want to spread interest in this approach and to include other ideas, tools, and results.