A Bayesian network is a probabilistic graphical model that combines the available mechanistic understanding of disease development (knowledge) with the corresponding quantitative information (data) in a graph that summarizes the conjectured causal relationships. It can integrate heterogeneous pieces of information and can be continuously refined as new information becomes available. Chronic obstructive pulmonary disease (COPD) is a lung disease characterized by chronic obstruction of lung airflow that interferes with normal breathing and is not fully reversible. COPD is strongly linked to exposure to noxious particles or gases, such as cigarette smoke, which trigger a variety of pathogenic processes. A large body of data is available on the biological changes associated with the development of COPD; however, the overall sequence of events is largely unknown. Given this uncertainty, a Bayesian network model is proposed to link observed events to their likely causes and to predict the risk of developing smoking-related COPD. To develop the model, a database of internal and published clinical and experimental data has been established; it contains 441 data sets, each summarized in a standard format using literature data transfer sheets. Based on a review of the available data, a COPD model has been proposed. At present, the proposed model contains a subset of variables, e.g., interleukin 8, neutrophils, metalloproteinases 8 and 9, neutrophil elastase, and FEV1 (forced expiratory volume in one second). The ultimate goal of this work is to build a more complex model that covers the major pathogenic processes of the disease and thus allows estimation of the probability of developing smoking-related COPD. Such a model will help to discover the pathways that have the highest impact on the predictions (potential disease biomarkers), to identify ‘knowledge gaps’, and to make evidence-based predictions of disease risk.
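To illustrate how a Bayesian network of this kind turns conditional probabilities into a risk estimate, the following minimal sketch builds a four-node chain and queries it by exact enumeration. Everything in it is an assumption made for illustration: the chain structure (smoking, elevated interleukin 8, elevated neutrophils, reduced FEV1), the binary coding of each variable, and all probability values are hypothetical placeholders and are not taken from the database or the proposed model described above.

```python
from itertools import product

# Hypothetical chain: smoking -> il8 -> neut -> low_fev1.
# All numbers are made-up placeholders, NOT estimates from the study.
P_SMOKING = 0.3                                   # P(smoking = True)
P_IL8_GIVEN_SMOKING = {True: 0.7, False: 0.2}     # P(il8 = True | smoking)
P_NEUT_GIVEN_IL8 = {True: 0.8, False: 0.1}        # P(neut = True | il8)
P_LOWFEV1_GIVEN_NEUT = {True: 0.6, False: 0.05}   # P(low_fev1 = True | neut)

VARIABLES = ("smoking", "il8", "neut", "low_fev1")


def bernoulli(p_true, value):
    """Probability of a binary value given P(value = True)."""
    return p_true if value else 1.0 - p_true


def joint(smoking, il8, neut, low_fev1):
    """Joint probability as the product of each node's conditional table."""
    return (
        bernoulli(P_SMOKING, smoking)
        * bernoulli(P_IL8_GIVEN_SMOKING[smoking], il8)
        * bernoulli(P_NEUT_GIVEN_IL8[il8], neut)
        * bernoulli(P_LOWFEV1_GIVEN_NEUT[neut], low_fev1)
    )


def probability(query, evidence=None):
    """P(query | evidence), computed by summing the joint over all states."""
    evidence = evidence or {}
    numerator = 0.0
    denominator = 0.0
    for values in product([True, False], repeat=len(VARIABLES)):
        assignment = dict(zip(VARIABLES, values))
        if any(assignment[name] != val for name, val in evidence.items()):
            continue  # inconsistent with the evidence
        p = joint(**assignment)
        denominator += p
        if all(assignment[name] == val for name, val in query.items()):
            numerator += p
    return numerator / denominator


# Predictive query: risk of reduced FEV1 with and without smoking.
risk_smoker = probability({"low_fev1": True}, {"smoking": True})
risk_nonsmoker = probability({"low_fev1": True}, {"smoking": False})

# Diagnostic query: reasoning from an observed effect back to a cause.
posterior_smoking = probability({"smoking": True}, {"low_fev1": True})

print(f"P(low FEV1 | smoker)     = {risk_smoker:.4f}")
print(f"P(low FEV1 | non-smoker) = {risk_nonsmoker:.4f}")
print(f"P(smoker | low FEV1)     = {posterior_smoking:.4f}")
```

The same network answers both directions of query: forward from exposure to outcome (risk prediction) and backward from an observed outcome to a likely cause. Enumeration is used here only because the toy network is tiny; a model covering the major pathogenic processes would need a dedicated inference method and probability tables estimated from data.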