Monitoring of volatile and semivolatile compounds was performed using gas chromatography (GC) coupled to high-resolution electron ionization mass spectrometry, with both headspace and liquid injection modes. A total of 560 reference compounds, including 8 odd n-alkanes, were analyzed and their experimental linear retention indices (LRI) were determined. Of these, 552 reference compounds were randomly split into training (n = 401) and test (n = 151) sets. LRI values for these 552 compounds were also predicted with computational Quantitative Structure–Property Relationship (QSPR) models, using two independent approaches: RapidMiner (coupled to Dragon) and ACD/ChromGenius software. Correlation coefficients for experimental versus predicted LRI values for the training and test sets were 0.966 and 0.949 for RapidMiner and 0.977 and 0.976 for ACD/ChromGenius, respectively. In addition, the cross-validation correlation from RapidMiner was 0.96, and the residual standard error obtained from ACD/ChromGenius was 53.635. These models were then used to predict LRI values for several thousand compounds reported to be present in tobacco and tobacco-related fractions, plus a range of specific flavor compounds. Using the mean of the LRI values predicted by RapidMiner and ACD/ChromGenius, in combination with accurate mass data, was shown to enhance the confidence level for compound identification in the analysis of complex matrixes, particularly when the two predicted LRI values for a compound were in close agreement. Application of this LRI modeling approach to matrixes of unknown composition has already enabled the confirmation of 23 postulated compounds, demonstrating its ability to facilitate compound identification in an analytical workflow. The goal is to reduce the list of putative candidates to a reasonable, relevant number of compounds that can be obtained and measured for confirmation.
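
The workflow summarized above can be illustrated with a minimal sketch. It assumes the usual linear interpolation between bracketing n-alkanes for the experimental LRI; the text does not state the exact formula used. It then averages two predicted LRI values and shortlists candidates whose mean prediction lies near the measured value. The function names, the agreement threshold `max_gap`, and the matching `window` are hypothetical choices made for illustration and are not taken from the study.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkanes):
    """Interpolate an LRI from a ladder of n-alkanes.

    alkanes: list of (carbon_number, retention_time), sorted by retention time.
    Works for any carbon spacing, e.g. odd n-alkanes only (spacing of 2).
    """
    times = [t for _, t in alkanes]
    if rt < times[0] or rt > times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")
    # Index of the alkane eluting after the analyte (clamped to the last one).
    i = min(bisect_right(times, rt), len(alkanes) - 1)
    (n_lo, t_lo), (n_hi, t_hi) = alkanes[i - 1], alkanes[i]
    return 100.0 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))


def consensus_lri(pred_a, pred_b, max_gap=50.0):
    """Return the mean of two predicted LRI values and an agreement flag.

    max_gap is an assumed threshold for "close agreement".
    """
    mean = (pred_a + pred_b) / 2.0
    return mean, abs(pred_a - pred_b) <= max_gap


def shortlist(experimental_lri, candidates, window=100.0, max_gap=50.0):
    """Keep candidates whose mean predicted LRI is within `window` units
    of the experimental LRI.

    candidates: iterable of (name, pred_a, pred_b), e.g. formula matches
    from accurate mass data. Candidates whose two predictions agree are
    listed first, then ordered by distance from the experimental LRI.
    """
    kept = []
    for name, pred_a, pred_b in candidates:
        mean, agree = consensus_lri(pred_a, pred_b, max_gap)
        if abs(experimental_lri - mean) <= window:
            kept.append((name, mean, agree))
    return sorted(kept, key=lambda c: (not c[2], abs(experimental_lri - c[1])))
```

In such a sketch, `shortlist` would be applied to the candidates that survive accurate-mass filtering, so that only a small number of reference standards need to be obtained and measured for confirmation.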