References

Abdeldayem, E, A Ibrahim, A Ahmed, E Genedi, and W Tantawy. 2015. “Positive Remodeling Index by MSCT Coronary Angiography: A Prognostic Factor for Early Detection of Plaque Rupture and Vulnerability.” The Egyptian Journal of Radiology and Nuclear Medicine 46 (1). Elsevier: 13–24.

Abdi, H, and L Williams. 2010. “Principal Component Analysis.” Wiley Interdisciplinary Reviews: Computational Statistics 2 (4): 433–59.

Agresti, A. 2012. Categorical Data Analysis. Wiley-Interscience.

Akaike, Hirotugu. 1974. “A New Look at the Statistical Model Identification.” IEEE Transactions on Automatic Control 19 (6). IEEE: 716–23.

Allison, P. 2001. Missing Data. Sage Publications.

Altman, D. 1991. “Categorising Continuous Variables.” British Journal of Cancer 64 (5): 975.

Altman, DG, and JM Bland. 1994a. “Diagnostic Tests 3: Receiver Operating Characteristic Plots.” British Medical Journal 309 (6948): 188.

———. 1994b. “Statistics Notes: Diagnostic Tests 2: Predictive Values.” British Medical Journal 309 (6947): 102.

Altman, D, B Lausen, W Sauerbrei, and M Schumacher. 1994. “Dangers of Using "Optimal" Cutpoints in the Evaluation of Prognostic Factors.” Journal of the National Cancer Institute 86 (11): 829–35.

Amati, G, and CJ Van Rijsbergen. 2002. “Probabilistic Models of Information Retrieval Based on Measuring the Divergence from Randomness.” ACM Transactions on Information Systems 20 (4): 357–89.

Ambroise, C, and G McLachlan. 2002. “Selection Bias in Gene Extraction on the Basis of Microarray Gene-Expression Data.” Proceedings of the National Academy of Sciences 99 (10): 6562–6.

Andrews, D, P Bickel, F Hampel, P Huber, W Rogers, and J Tukey. 1972. Robust Estimates of Location: Survey and Advances. Princeton, New Jersey: Princeton University Press.

Audigier, Vincent, François Husson, and Julie Josse. 2016. “A Principal Component Method to Impute Missing Values for Mixed Data.” Advances in Data Analysis and Classification 10 (1). Springer: 5–26.

Bairey, E, E Kelsic, and R Kishony. 2016. “High-Order Species Interactions Shape Ecosystem Diversity.” Nature Communications 7: 12285.

Barker, M, and W Rayens. 2003. “Partial Least Squares for Discrimination.” Journal of Chemometrics 17 (3): 166–73.

Basu, S, K Kumbier, J Brown, and B Yu. 2018. “Iterative Random Forests to Discover Predictive and Stable High-Order Interactions.” Proceedings of the National Academy of Sciences 115 (8): 1943–8.

Benavoli, A, G Corani, J Demsar, and M Zaffalon. 2016. “Time for a Change: A Tutorial for Comparing Multiple Classifiers Through Bayesian Analysis.” arXiv.org.

Benjamini, Y, and Y Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society. Series B (Methodological) 57 (1): 289–300.

Bergstra, J, and Y Bengio. 2012. “Random Search for Hyper-Parameter Optimization.” Journal of Machine Learning Research 13: 281–305.

Berry, B, J Moretto, T Matthews, J Smelko, and K Wiltberger. 2015. “Cross-Scale Predictive Modeling of CHO Cell Culture Growth and Metabolites Using Raman Spectroscopy and Multivariate Analysis.” Biotechnology Progress 31 (2): 566–77.

Berry, Kenneth J, Paul W Mielke Jr, and Janis E Johnston. 2016. Permutation Statistical Methods. Springer.

Bickle, M. 2010. “The Beautiful Cell: High-Content Screening in Drug Discovery.” Analytical and Bioanalytical Chemistry 398 (1): 219–26.

Bien, J, J Taylor, and R Tibshirani. 2013. “A Lasso for Hierarchical Interactions.” Annals of Statistics 41 (3): 1111.

Bishop, C. 2011. Pattern Recognition and Machine Learning. Springer.

Boulesteix, AL, and C Strobl. 2009. “Optimal Classifier Selection and Negative Bias in Error Rate Estimation: An Empirical Study on High-Dimensional Prediction.” BMC Medical Research Methodology 9 (1): 85.

Box, GEP, and D Cox. 1964. “An Analysis of Transformations.” Journal of the Royal Statistical Society. Series B (Methodological) 26 (2): 211–43.

Box, GEP, W Hunter, and J Hunter. 2005. Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building. Wiley.

Breiman, L. 1996. “Bagging Predictors.” Machine Learning 24 (2). Springer: 123–40.

———. 2001. “Random Forests.” Machine Learning 45 (1). Springer: 5–32.

Breiman, L, J Friedman, R Olshen, and C Stone. 1984. Classification and Regression Trees. New York: Chapman & Hall.

Caputo, B, K Sim, F Furesjo, and A Smola. 2002. “Appearance-Based Object Recognition Using SVMs: Which Kernel Should I Use?” In Proceedings of Nips Workshop on Statistical Methods for Computational Experiments in Visual Processing and Computer Vision. Vol. 2002.

Chato, Lina, and Shahram Latifi. 2017. “Machine Learning and Deep Learning Techniques to Predict Overall Survival of Brain Tumor Patients Using MRI Images.” In 2017 IEEE 17th International Conference on Bioinformatics and Bioengineering, 9–14.

Chen, SH, J Sun, L Dimitrov, A Turner, T Adams, D Meyers, BL Chang, et al. 2008. “A Support Vector Machine Approach for Detecting Gene-Gene Interaction.” Genetic Epidemiology 32 (2). Wiley Online Library: 152–67.

Chollet, F, and JJ Allaire. 2018. Deep Learning with R. Manning.

Chong, E, and S Zak. 2008. “Global Search Algorithms.” In An Introduction to Optimization. John Wiley & Sons, Inc.

Cilla, M, E Pena, MA Martinez, and DJ Kelly. 2013. “Comparison of the Vulnerability Risk for Positive Versus Negative Atheroma Plaque Morphology.” Journal of Biomechanics 46 (7). Elsevier: 1248–54.

Cleveland, W. 1979. “Robust Locally Weighted Regression and Smoothing Scatterplots.” Journal of the American Statistical Association 74 (368): 829–36.

———. 1993. Visualizing Data. Summit, New Jersey: Hobart Press.

Cover, T, and J Thomas. 2012. Elements of Information Theory. John Wiley & Sons.

Davison, A, and D Hinkley. 1997. Bootstrap Methods and Their Application. Cambridge University Press.

Del Castillo, Enrique, Douglas C Montgomery, and Daniel R McCarville. 1996. “Modified Desirability Functions for Multiple Response Optimization.” Journal of Quality Technology 28 (3). Taylor & Francis: 337–45.

Demsar, J. 2006. “Statistical Comparisons of Classifiers over Multiple Data Sets.” Journal of Machine Learning Research 7 (Jan): 1–30.

Derringer, G, and R Suich. 1980. “Simultaneous optimization of several response variables.” Journal of Quality Technology 12 (4): 214–19.

Dickhaus, T. 2014. Simultaneous Statistical Inference: With Applications in the Life Sciences. Springer.

Dillon, W, and M Goldstein. 1984. Multivariate Analysis Methods and Applications. Wiley.

Efron, B. 1983. “Estimating the error rate of a prediction rule: Improvement on cross-validation.” Journal of the American Statistical Association 78 (382): 316–31.

Efron, B, and T Hastie. 2016. Computer Age Statistical Inference. Cambridge University Press.

Efron, B, and R Tibshirani. 1997. “Improvements on cross-validation: The 632+ bootstrap method.” Journal of the American Statistical Association 92 (438): 548–60.

Eilers, P, and B Marx. 2010. “Splines, Knots, and Penalties.” Wiley Interdisciplinary Reviews: Computational Statistics 2 (6): 637–53.

Elith, J, J Leathwick, and T Hastie. 2008. “A Working Guide to Boosted Regression Trees.” Journal of Animal Ecology 77 (4). Wiley Online Library: 802–13.

Eskelson, B, H Temesgen, V Lemay, TT Barrett, N Crookston, and A Hudak. 2009. “The Roles of Nearest Neighbor Methods in Imputing Missing Data in Forest Inventory and Monitoring Databases.” Scandinavian Journal of Forest Research 24 (3). Taylor & Francis: 235–46.

Fernandez-Delgado, M, E Cernadas, S Barro, and D Amorim. 2014. “Do We Need Hundreds of Classifiers to Solve Real World Classification Problems?” Journal of Machine Learning Research 15 (1): 3133–81.

Fogel, P, D Hawkins, C Beecher, G Luta, and S Young. 2013. “A Tale of Two Matrix Factorizations.” The American Statistician 67 (4). Taylor & Francis: 207–18.

Friedman, J. 1991. “Multivariate Adaptive Regression Splines.” The Annals of Statistics 19 (1): 1–141.

———. 2001. “Greedy Function Approximation: A Gradient Boosting Machine.” Annals of Statistics 29 (5): 1189–1232.

———. 2002. “Stochastic Gradient Boosting.” Computational Statistics & Data Analysis 38 (4). Elsevier: 367–78.

Friedman, J, T Hastie, and R Tibshirani. 2010. “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software 33 (1): 1.

Friedman, J, and B Popescu. 2008. “Predictive Learning via Rule Ensembles.” The Annals of Applied Statistics 2 (3): 916–54.

Friendly, M, and D Meyer. 2015. Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data. CRC Press.

Frigge, M, D Hoaglin, and B Iglewicz. 1989. “Some Implementations of the Boxplot.” The American Statistician 43 (1). Taylor & Francis: 50–54.

García-Magariños, M, I López-de-Ullibarri, R Cao, and A Salas. 2009. “Evaluating the Ability of Tree-Based Methods and Logistic Regression for the Detection of SNP-SNP Interaction.” Annals of Human Genetics 73 (3). Wiley Online Library: 360–69.

Geman, S, E Bienenstock, and R Doursat. 1992. “Neural Networks and the Bias/Variance Dilemma.” Neural Computation 4 (1). MIT Press: 1–58.

Ghosh, AK, and P Chaudhuri. 2005. “On Data Depth and Distribution-Free Discriminant Analysis Using Separating Surfaces.” Bernoulli 11 (1): 1–27.

Gillis, N. 2017. “Introduction to Nonnegative Matrix Factorization.” arXiv Preprint arXiv:1703.00663.

Giuliano, K, R DeBiasio, T Dunlay, A Gough, J Volosky, J Zock, G Pavlakis, and L Taylor. 1997. “High-Content Screening: A New Approach to Easing Key Bottlenecks in the Drug Discovery Process.” Journal of Biomolecular Screening 2 (4): 249–59.

Golub, G, M Heath, and G Wahba. 1979. “Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter.” Technometrics 21 (2): 215–23.

Good, Phillip. 2013. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer Science & Business Media.

Goodfellow, I, Y Bengio, and A Courville. 2016. Deep Learning. MIT Press.

Gower, J. 1971. “A General Coefficient of Similarity and Some of Its Properties.” Biometrics 27 (4): 857–71.

Greenacre, M. 2010. Biplots in Practice. Fundacion BBVA.

———. 2017. Correspondence Analysis in Practice. CRC Press.

Guo, C, and F Berkhahn. 2016. “Entity embeddings of categorical variables.” arXiv.org.

Guyon, I, J Weston, S Barnhill, and V Vapnik. 2002. “Gene Selection for Cancer Classification Using Support Vector Machines.” Machine Learning 46 (1): 389–422.

Haase, R. 2011. Multivariate General Linear Models. Sage.

Hammes, G. 2005. Spectroscopy for the Biological Sciences. John Wiley & Sons.

Harrell, F. 2015. Regression Modeling Strategies. Springer.

Harrington, E. 1965. “The Desirability Function.” Industrial Quality Control 21 (10): 494–98.

Hastie, T, R Tibshirani, and M Wainwright. 2015. Statistical Learning with Sparsity. CRC Press.

Haupt, R, and S Haupt. 1998. Practical Genetic Algorithms. New York: Wiley.

Hawkins, D. 1994. “The Feasible Solution Algorithm for Least Trimmed Squares Regression.” Computational Statistics & Data Analysis 17 (2): 185–96.

Healy, K. 2018. Data Visualization: A Practical Introduction. Princeton University Press.

Hill, A, P LaPan, Y Li, and S Haney. 2007. “Impact of Image Segmentation on High-Content Screening Data Quality for SK-BR-3 Cells.” BMC Bioinformatics 8 (1): 340.

Hintze, J, and R Nelson. 1998. “Violin Plots: A Box Plot-Density Trace Synergism.” The American Statistician 52 (2). Taylor & Francis Group: 181–84.

Hoerl, A, and R Kennard. 1970. “Ridge Regression: Biased Estimation for Nonorthogonal Problems.” Technometrics 12 (1): 55–67.

Holland, John H. 1992. “Genetic Algorithms.” Scientific American 267 (1). JSTOR: 66–73.

Holmes, S, and W Huber. 2019. Modern Statistics for Modern Biology. Cambridge University Press.

Hosmer, D, and S Lemeshow. 2000. Applied Logistic Regression. New York: John Wiley & Sons.

Hothorn, T, K Hornik, and A Zeileis. 2006. “Unbiased Recursive Partitioning: A Conditional Inference Framework.” Journal of Computational and Graphical Statistics 15 (3). Taylor & Francis: 651–74.

Hothorn, T, F Leisch, A Zeileis, and K Hornik. 2005. “The design and analysis of benchmark experiments.” Journal of Computational and Graphical Statistics 14 (3): 675–99.

Hyndman, R, and G Athanasopoulos. 2013. Forecasting: Principles and Practice. OTexts.

Hyvarinen, A, and E Oja. 2000. “Independent Component Analysis: Algorithms and Applications.” Neural Networks 13 (4-5). Elsevier: 411–30.

Jahani, M, and M Mahdavi. 2016. “Comparison of Predictive Models for the Early Diagnosis of Diabetes.” Healthcare Informatics Research 22 (2): 95–100.

Jones, D, M Schonlau, and W Welch. 1998. “Efficient Global Optimization of Expensive Black-Box Functions.” Journal of Global Optimization 13 (4). Springer: 455–92.

Karthikeyan, M, R Glen, and A Bender. 2005. “General Melting Point Prediction Based on a Diverse Compound Data Set and Artificial Neural Networks.” Journal of Chemical Information and Modeling 45 (3): 581–90.

Kenny, P, and C Montanari. 2013. “Inflation of Correlation in the Pursuit of Drug-Likeness.” Journal of Computer-Aided Molecular Design 27 (1): 1–13.

Kim, A, and A Escobedo-Land. 2015. “OkCupid Data for Introductory Statistics and Data Science Courses.” Journal of Statistics Education 23 (2): 1–25.

Kirkpatrick, S, CD Gelatt, and M Vecchi. 1983. “Optimization by Simulated Annealing.” Science 220 (4598). American Association for the Advancement of Science: 671–80.

Kuhn, M. 2008. “Building Predictive Models in R Using the caret Package.” Journal of Statistical Software 28 (5): 1–26.

Kuhn, M, and K Johnson. 2013. Applied Predictive Modeling. Springer.

Kvalseth, T. 1985. “Cautionary Note About \(R^2\).” American Statistician 39 (4): 279–85.

Lambert, J, L Gong, CF Elliot, K Thompson, and A Stromberg. 2018. “rFSA: An R Package for Finding Best Subsets and Interactions.” The R Journal.

Lampa, E, L Lind, P Lind, and A Bornefalk-Hermansson. 2014. “The Identification of Complex Interactions in Epidemiology and Toxicology: A Simulation Study of Boosted Regression Trees.” Environmental Health 13 (1). BioMed Central: 57.

Lee, T-W. 1998. Independent Component Analysis. Springer.

Levinson, M, and D Rodriguez. 1998. “Endarterectomy for Preventing Stroke in Symptomatic and Asymptomatic Carotid Stenosis. Review of Clinical Trials and Recommendations for Surgical Therapy.” In The Heart Surgery Forum, 147–68.

Lewis, D, Y Yang, T Rose, and F Li. 2004. “RCV1: A New Benchmark Collection for Text Categorization Research.” Journal of Machine Learning Research 5: 361–97.

Lian, K, J White, E Bartlett, A Bharatha, R Aviv, A Fox, and S Symons. 2012. “NASCET Percent Stenosis Semi-Automated Versus Manual Measurement on CTA.” The Canadian Journal of Neurological Sciences 39 (3). Cambridge University Press: 343–46.

Lin, L I-K. 1989. “A Concordance Correlation Coefficient to Evaluate Reproducibility.” Biometrics 45 (1): 255–68.

Little, R, and D Rubin. 2014. Statistical Analysis with Missing Data. John Wiley & Sons.

Luo, G. 2016. “Automatically Explaining Machine Learning Prediction Results: A Demonstration on Type 2 Diabetes Risk Prediction.” Health Information Science and Systems 4 (1): 2.

MacKay, D. 2003. Information Theory, Inference and Learning Algorithms. Cambridge University Press.

Mallick, H, and N Yi. 2013. “Bayesian Methods for High Dimensional Linear Models.” Journal of Biometrics and Biostatistics 1: 005.

Mandal, Abhyuday, CF Jeff Wu, and Kjell Johnson. 2006. “SELC: Sequential Elimination of Level Combinations by Means of Modified Genetic Algorithms.” Technometrics 48 (2). Taylor & Francis: 273–83.

Mandal, Abhyuday, Kjell Johnson, CF Jeff Wu, and Dirk Bornemeier. 2007. “Identifying Promising Compounds in Drug Discovery: Genetic Algorithms and Some New Statistical Techniques.” Journal of Chemical Information and Modeling 47 (3). ACS Publications: 981–88.

Manning, Christopher D, Prabhakar Raghavan, and Hinrich Schütze. 2008. Introduction to Information Retrieval. Cambridge University Press.

Marr, B. 2017. “IoT and Big Data at Caterpillar: How Predictive Maintenance Saves Millions of Dollars.” https://www.forbes.com/sites/bernardmarr/2017/02/07/iot-and-big-data-at-caterpillar-how-predictive-maintenance-saves-millions-of-dollars/#109576a27240.

Massy, W. 1965. “Principal Components Regression in Exploratory Statistical Research.” Journal of the American Statistical Association 60 (309). Taylor & Francis: 234–56.

McElreath, R. 2015. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Chapman & Hall/CRC.

Meier, P, G Knapp, U Tamhane, S Chaturvedi, and H Gurm. 2010. “Short Term and Intermediate Term Comparison of Endarterectomy Versus Stenting for Carotid Artery Stenosis: Systematic Review and Meta-Analysis of Randomised Controlled Clinical Trials.” BMJ 340. British Medical Journal Publishing Group: c467.

Micci-Barreca, D. 2001. “A Preprocessing Scheme for High-Cardinality Categorical Attributes in Classification and Prediction Problems.” ACM SIGKDD Explorations Newsletter 3 (1): 27–32.

Miller, A. 1984. “Selection of Subsets of Regression Variables.” Journal of the Royal Statistical Society. Series A (General) 147 (3): 389–425.

Mitchell, M. 1998. An Introduction to Genetic Algorithms. MIT Press.

Mockus, J. 1994. “Application of Bayesian Approach to Numerical Methods of Global and Stochastic Optimization.” Journal of Global Optimization 4 (4). Springer: 347–65.

Mozharovskyi, P, K Mosler, and T Lange. 2015. “Classifying Real-World Data with the DD\(\alpha\)-Procedure.” Advances in Data Analysis and Classification 9 (3). Springer: 287–314.

Mundry, R, and C Nunn. 2009. “Stepwise Model Fitting and Statistical Inference: Turning Noise into Signal Pollution.” The American Naturalist 173 (1): 119–23.

Murrell, B, D Murrell, and H Murrell. 2016. “Discovering General Multidimensional Associations.” PLoS ONE 11 (3): e0151551.

Nair, V, and G Hinton. 2010. “Rectified Linear Units Improve Restricted Boltzmann Machines.” In Proceedings of the 27th International Conference on Machine Learning, edited by J Fürnkranz and T Joachims, 807–14. Omnipress.

Neter, J, M Kutner, C Nachtsheim, and W Wasserman. 1996. Applied Linear Statistical Models. Chicago: Irwin.

Opgen-Rhein, R, and K Strimmer. 2007. “Accurate Ranking of Differentially Expressed Genes by a Distribution-Free Shrinkage Approach.” Statistical Applications in Genetics and Molecular Biology 6 (1). De Gruyter.

Piironen, J, and A Vehtari. 2017a. “Comparison of Bayesian Predictive Methods for Model Selection.” Statistics and Computing 27 (3): 711–35.

———. 2017b. “Sparsity Information and Regularization in the Horseshoe and Other Shrinkage Priors.” Electronic Journal of Statistics 11 (2): 5018–51.

Preneel, B. 2010. “Cryptographic Hash Functions: Theory and Practice.” In ICICS, 1–3.

Qi, Y. 2012. “Random Forest for Bioinformatics.” In Ensemble Machine Learning, 307–23. Springer.

Quinlan, R. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.

Raimondi, C. 2010. “How I Won the Predict HIV Progression Data Mining Competition.” http://blog.kaggle.com/2010/08/09/how-i-won-the-hiv-progression-prediction-data-mining-competition/.

Reid, R. 2015. “A Morphometric Modeling Approach to Distinguishing Among Bobcat, Coyote and Gray Fox Scats.” Wildlife Biology 21 (5). BioOne: 254–62.

Reshef, D, Y Reshef, H Finucane, S Grossman, G McVean, P Turnbaugh, E Lander, M Mitzenmacher, and P Sabeti. 2011. “Detecting Novel Associations in Large Data Sets.” Science 334 (6062): 1518–24.

Rinnan, Åsmund, Frans Van Den Berg, and Søren Balling Engelsen. 2009. “Review of the Most Common Pre-Processing Techniques for Near-Infrared Spectra.” TrAC Trends in Analytical Chemistry 28 (10). Elsevier: 1201–22.

Roberts, S, and R Everson. 2001. Independent Component Analysis: Principles and Practice. Cambridge University Press.

Robnik-Sikonja, M, and I Kononenko. 2003. “Theoretical and Empirical Analysis of ReliefF and RReliefF.” Machine Learning 53 (1): 23–69.

Rousseeuw, P, and C Croux. 1993. “Alternatives to the Median Absolute Deviation.” Journal of the American Statistical Association 88 (424): 1273–83.

Sakar, C, G Serbes, A Gunduz, H Tunc, H Nizam, B Sakar, M Tutuncu, T Aydin, E Isenkul, and H Apaydin. 2019. “A Comparative Analysis of Speech Signal Processing Algorithms for Parkinson’s Disease Classification and the Use of the Tunable Q-Factor Wavelet Transform.” Applied Soft Computing 74: 255–63.

Sapp, S, Mark J M van der Laan, and J Canny. 2014. “Subsemble: An Ensemble Method for Combining Subset-Specific Algorithm Fits.” Journal of Applied Statistics 41 (6). Taylor & Francis: 1247–59.

Sathyanarayana, Aarti, Shafiq Joty, Luis Fernandez-Luque, Ferda Ofli, Jaideep Srivastava, Ahmed Elmagarmid, Teresa Arora, and Shahrad Taheri. 2016. “Sleep Quality Prediction from Wearable Data Using Deep Learning.” JMIR mHealth and uHealth 4 (4).

Schoepf, UJ, M van Assen, A Varga-Szemes, TM Duguay, HT Hudson, S Egorova, K Johnson, et al. n.d. “Automated Plaque Analysis for the Prognostication of Major Adverse Cardiac Events.” European Journal of Radiology.

Schofield, A, M Magnusson, and D Mimno. 2017. “Understanding Text Pre-Processing for Latent Dirichlet Allocation.” In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2:432–36.

Schofield, A, and D Mimno. 2016. “Comparing Apples to Apple: The Effects of Stemmers on Topic Models.” Transactions of the Association for Computational Linguistics 4: 287–300.

Schölkopf, B, A Smola, and KR Müller. 1998. “Nonlinear Component Analysis as a Kernel Eigenvalue Problem.” Neural Computation 10 (5). MIT Press: 1299–1319.

Serneels, S, E De Nolf, and P Van Espen. 2006. “Spatial Sign Preprocessing: A Simple Way to Impart Moderate Robustness to Multivariate Estimators.” Journal of Chemical Information and Modeling 46 (3): 1402–9.

Shaffer, J. 1995. “Multiple Hypothesis Testing.” Annual Review of Psychology 46 (1): 561–84.

Shao, J. 1993. “Linear Model Selection by Cross-Validation.” Journal of the American Statistical Association 88 (422): 486–94.

Shawe-Taylor, J, and N Cristianini. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press.

Silge, J, and D Robinson. 2017. Text Mining with R: A Tidy Approach. O’Reilly.

Spall, J. 2005. Simultaneous Perturbation Stochastic Approximation. John Wiley & Sons.

Srivastava, N, G Hinton, A Krizhevsky, I Sutskever, and R Salakhutdinov. 2014. “Dropout: A Simple Way to Prevent Neural Networks from Overfitting.” Journal of Machine Learning Research 15: 1929–58.

Stanković, J, I Marković, and M Stojanović. 2015. “Investment Strategy Optimization Using Technical Analysis and Predictive Modeling in Emerging Markets.” Procedia Economics and Finance 19: 51–62.

Stekhoven, D, and P Bühlmann. 2011. “MissForest: Non-Parametric Missing Value Imputation for Mixed-Type Data.” Bioinformatics 28 (1). Oxford University Press: 112–18.

Steyerberg, E, M Eijkemans, and D Habbema. 1999. “Stepwise Selection in Small Data Sets: A Simulation Study of Bias in Logistic Regression Analysis.” Journal of Clinical Epidemiology 52 (10): 935–42.

Stone, M, and R Brooks. 1990. “Continuum Regression: Cross-Validated Sequentially Constructed Prediction Embracing Ordinary Least Squares, Partial Least Squares and Principal Components Regression.” Journal of the Royal Statistical Society. Series B (Methodological) 52 (2): 237–69.

Strobl, C, AL Boulesteix, A Zeileis, and T Hothorn. 2007. “Bias in Random Forest Variable Importance Measures: Illustrations, Sources and a Solution.” BMC Bioinformatics 8 (1): 25.

Svetnik, V, A Liaw, C Tong, C Culberson, R Sheridan, and B Feuston. 2003. “Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling.” Journal of Chemical Information and Computer Sciences 43 (6). ACS Publications: 1947–58.

Tan, PN, M Steinbach, and V Kumar. 2006. Introduction to Data Mining. Pearson Education.

Thomson, J, K Johnson, R Chapin, D Stedman, S Kumpf, and T Ozolinš. 2011. “Not a Walk in the Park: The ECVAM Whole Embryo Culture Model Challenged with Pharmaceuticals and Attempted Improvements with Random Forest Design.” Birth Defects Research Part B: Developmental and Reproductive Toxicology 92 (2): 111–21.

Tibshirani, R. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society. Series B (Methodological) 58 (1): 267–88.

Timm, N, and J Carlson. 1975. “Analysis of Variance Through Full Rank Models.” Multivariate Behavioral Research Monographs. Society of Multivariate Experimental Psychology.

Tufte, E. 1990. Envisioning Information. Cheshire, Connecticut: Graphics Press.

Tukey, John W. 1977. Exploratory Data Analysis. Reading, Massachusetts: Addison-Wesley.

Tutz, G, and S Ramzan. 2015. “Improved Methods for the Imputation of Missing Data by Nearest Neighbor Methods.” Computational Statistics & Data Analysis 90. Elsevier: 84–99.

United States Census Bureau. 2017. “Chicago Illinois Population Estimates.” https://tinyurl.com/y8s2y4bh.

U.S. Energy Information Administration. 2017a. “Weekly Chicago All Grades All Formulations Retail Gasoline Prices.” https://tinyurl.com/ydctltn4.

———. 2017b. “What Drives Crude Oil Prices?” https://tinyurl.com/supply-opec.

Van Buuren, S. 2012. Flexible Imputation of Missing Data. Chapman & Hall/CRC.

Van Laarhoven, P, and E Aarts. 1987. “Simulated Annealing.” In Simulated Annealing: Theory and Applications, 7–15. Springer.

Wand, M, and C Jones. 1994. Kernel Smoothing. Chapman & Hall/CRC.

Weinberger, K, A Dasgupta, J Langford, A Smola, and J Attenberg. 2009. “Feature Hashing for Large Scale Multitask Learning.” In Proceedings of the 26th Annual International Conference on Machine Learning, 1113–20. ACM.

Weise, T. 2011. Global Optimization Algorithms: Theory and Application. www.it-weise.de.

West, B, K Welch, and A Galecki. 2014. Linear Mixed Models: A Practical Guide Using Statistical Software. CRC Press.

Whittingham, M, P Stephens, R Bradbury, and R Freckleton. 2006. “Why Do We Still Use Stepwise Modelling in Ecology and Behaviour?” Journal of Animal Ecology 75 (5): 1182–9.

Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.

Willett, P. 2006. “The Porter Stemming Algorithm: Then and Now.” Program 40 (3): 219–23.

Wolpert, D. 1996. “The Lack of a Priori Distinctions Between Learning Algorithms.” Neural Computation 8 (7). MIT Press: 1341–90.

Wood, S. 2006. Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC.

Wu, CF Jeff, and Michael S Hamada. 2011. Experiments: Planning, Analysis, and Optimization. John Wiley & Sons.

Yandell, B. 1993. “Smoothing Splines: A Tutorial.” The Statistician, 317–19.

Yeo, I-K, and R Johnson. 2000. “A New Family of Power Transformations to Improve Normality or Symmetry.” Biometrika 87 (4): 954–59.

Zanella, F, J Lorens, and W Link. 2010. “High Content Screening: Seeing Is Believing.” Trends in Biotechnology 28 (5): 237–45.

Zhu, Xiaojin, and Andrew B Goldberg. 2009. “Introduction to Semi-Supervised Learning.” Synthesis Lectures on Artificial Intelligence and Machine Learning 3 (1). Morgan & Claypool Publishers: 1–130.

Zou, H, and T Hastie. 2005. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2). Wiley Online Library: 301–20.

Zuber, V, and K Strimmer. 2009. “Gene Ranking and Biomarker Discovery Under Correlation.” Bioinformatics 25 (20): 2700–2707.

Zumel, N, and J Mount. 2016. “vtreat: A data.frame processor for predictive modeling.” arXiv.org.