MDL Reading

There is a large body of literature on the Minimum Description Length (MDL) principle in statistics, mathematics, machine learning, philosophy, and other fields. We list only a small selection of publications that we have found especially useful and important. More publications can easily be found using search engines such as Google, Google Scholar, and CiteSeerX.


A.Barron, J.Rissanen, and B.Yu, The minimum description length principle in coding and modeling. IEEE Trans. Information Theory, vol. 44 (1998), no. 6, pp. 2743-2760.

P.Grünwald, A tutorial introduction to the minimum description length principle. In: Advances in Minimum Description Length: Theory and Applications (edited by P. Grünwald, I.J. Myung, M. Pitt), MIT Press, 2005 (80 pages).

M.H.Hansen and B.Yu, Model selection and the principle of minimum description length. J. American Statistical Association, vol. 96 (2001), pp. 746-774. (available at Prof. Hansen's homepage)


P.Adriaans and P.Vitanyi, The power and perils of MDL, Proc. 2007 IEEE Intl. Symp. Information Theory (ISIT), pp. 2216-2220. (available at Prof. Vitányi's homepage)

A.Barron and T.M.Cover, Minimum complexity density estimation, IEEE Trans. Information Theory, vol. 37 (1991), no. 4, pp. 1034-1054. (available at Prof. Cover's homepage)

Q.Gao, M.Li, and P.M.B.Vitanyi, Applying MDL to learning best model granularity, Artificial Intelligence, vol. 121 (2000), no. 1-2, pp. 1-29. (available at Prof. Vitányi's homepage)

P.Grünwald, P.Kontkanen, P.Myllymäki, T.Silander, and H.Tirri, Minimum encoding approaches for predictive modeling. Proc. 14th Int. Conf. on Uncertainty in AI (UAI'98), G.Cooper and S.Moral (eds.), 1998, pp. 183-192. (available at CoSCo homepage)

A.D.Lanterman, Schwarz, Wallace, and Rissanen: Intertwining themes in theories of model selection. International Statistical Review, vol. 69 (2001), no. 2, pp. 185-212. (available at Prof. Lanterman's homepage)

I.J.Myung, V.Balasubramanian, and M.A.Pitt. Counting probability distributions: Differential geometry and model selection. Proc. National Academy of Sciences, USA, vol. 97 (2000), pp. 11170-11175. (available at Prof. Balasubramanian's homepage)

J.Rissanen, Modeling by shortest data description. Automatica, vol. 14 (1978), pp. 465-471.

J.Rissanen, A universal prior for integers and estimation by minimum description length. Annals of Statistics, vol. 11 (1983), no. 2, pp. 416-431.

J.Rissanen, Universal coding, information, prediction, and estimation, IEEE Trans. Information Theory, vol. 30 (1984), pp. 629-636.

J.Rissanen, Stochastic complexity. J. Royal Statistical Society, Series B, vol. 49 (1987), no. 3, pp. 223-239.

J.Rissanen, Stochastic complexity and modeling. Annals of Statistics, vol. 14 (1986), pp. 1080-1100.

J.Rissanen, Fisher information and stochastic complexity. IEEE Trans. Information Theory, vol. 42 (1996), pp. 40-47.

J.Rissanen, Hypothesis selection and testing by the MDL principle. The Computer Journal, vol. 42 (1999), no. 4, pp. 260-269. (available at Computer Journal)

J.Rissanen, MDL Denoising. IEEE Trans. Information Theory, vol. 46 (2000), no. 7, pp. 2537-2543.
Errata: 1. The last term in Eqs. (36) and (40) should be -ln k(n-k). 2. DJ signal in Fig. 1 incorrect.

J.Rissanen, Strong optimality of the normalized ML models as universal codes and information in data. IEEE Trans. Information Theory, vol. 47 (2001), no. 5, pp. 1712-1717.

J.Rissanen, Complexity of simple nonlogarithmic loss functions. IEEE Trans. Information Theory, vol. 49 (2003), no. 2, pp. 476-484.

N.K.Vereshchagin and P.M.B.Vitanyi, Kolmogorov's structure functions and model selection, IEEE Trans. Information Theory, vol. 50 (2004), no. 12, pp. 3265-3290. (available at Prof. Vitányi's homepage)

P.M.B.Vitanyi and M.Li, Minimum description length induction, Bayesianism, and Kolmogorov complexity. IEEE Trans. Information Theory, vol. 46 (2000), pp. 446-464. (available at Prof. Vitányi's homepage)

K.Yamanishi, A decision-theoretic extension of stochastic complexity and its applications to learning. IEEE Trans. Information Theory, vol. 44 (1998), pp. 1424-1439.


Jorma Rissanen, Optimal Estimation of Parameters, Cambridge University Press, 2012.

Peter Grünwald, Petri Myllymäki, Ioan Tabus, Marcelo Weinberger, and Bin Yu (editors), Festschrift in Honor of Jorma Rissanen on the Occasion of his 75th Birthday, Tampere International Center for Signal Processing, TICSP Series #38, 2008. (PDF, 29.0 MB)

Jorma Rissanen, Information and Complexity in Statistical Modeling, Springer, 2007. Errata

Peter Grünwald, The Minimum Description Length Principle, MIT Press, 2007. Sample chapter: Preface

Peter Grünwald, In Jae Myung, and Mark Pitt (editors), Advances in Minimum Description Length: Theory and Applications, MIT Press, 2005.

Te Sun Han and Kingo Kobayashi, Mathematics of Information and Coding, Translations of Mathematical Monographs, vol. 203, American Mathematical Society, 2001.

Jorma Rissanen, Stochastic Complexity in Statistical Inquiry, World Scientific, 1989.


Steven de Rooij, Minimum Description Length Model Selection: Problems and Extensions, University of Amsterdam, the Netherlands, 2008.

Teemu Roos, Statistical and Information-Theoretic Methods for Data Analysis, University of Helsinki, Finland, 2007.

Tim van Erven, When Data Compression and Statistics Disagree: Two Frequentist Challenges for the Minimum Description Length Principle, Leiden University, the Netherlands, 2010.

Lectures and Talks

Video lecture: Jorma Rissanen, MDL theory as a foundation for statistical modeling. MSRI Workshop on Information Theory, Mathematical Sciences Research Institute, Berkeley, February–March 2002.

Video lecture: Peter Grünwald, Universal modeling: Introduction to modern MDL. Machine Learning Summer School, Tübingen, 2003.

Slides: Peter Grünwald, Tutorial on modern MDL, NIPS 2001 Workshop on MDL: Developments in Theory and New Applications, Whistler, Canada, December 2001. (available at NIPS 2001)

Lecture notes: Jorma Rissanen, Lectures on statistical modeling theory, August 2005. (73 pages)

Slides: Jorma Rissanen, The structure function and distinguishable models of data, 4th Annual Kolmogorov Lecture, Royal Holloway, London, February 2006.

Video lectures: Teemu Roos, "MDL Principle", Lectures 9 & 10 of the Information-Theoretic Modeling course, Dept. of Computer Science, University of Helsinki, September–October 2009.

Lecture notes: Teemu Roos, Introduction to Information-Theoretic Modeling, April 2011.


IEEE Transactions on Information Theory
Annals of Statistics
Computer Journal (Special Issue on Kolmogorov Complexity)
Journal of the Royal Statistical Society: Series B