Journal on Policy & Complex Systems Volume 3, Issue 2 | Page 124

A Novel Evolutionary Algorithm for Mining Noisy Survey Datasets with an Application Toward Combating Chagas Disease
Policy and Complex Systems - Volume 3 Number 2 - Fall 2017

John P. Hanley, Donna M. Rizzo A
Abstract
Chagas disease is a deadly, neglected tropical disease endemic to every country in Central and South America. The principal vector of Chagas disease in Central America is the insect Triatoma dimidiata. The best methods of preventing household infestation with T. dimidiata (including Ecohealth interventions) involve mining large amounts of socioeconomic and entomologic survey data (composed of nominal and ordinal data types) for numerous potential risk factors. The number of risk factors suggested by experts is too large for exhaustive search, and traditional statistics can exclude risk factors that are purely epistatic. Therefore, we apply a novel evolutionary algorithm, the conjunctive clause evolutionary algorithm (CCEA), to mine these “Big Datasets” for the most important risk factors associated with T. dimidiata infestation, using georeferenced survey data from two villages in Guatemala as examples. The CCEA identified important socioeconomic risk factors that are not significant using traditional statistics.
Keywords: Chagas disease, evolutionary algorithm, big data, data mining, Ecohealth
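
The abstract's point about purely epistatic risk factors can be illustrated with a small sketch. This is not the CCEA and uses no survey data from the study. It is a synthetic toy example, assuming two hypothetical binary risk factors (`a`, `b`) and an infestation flag that is set only when exactly one factor is present. Under that assumption, each factor examined alone shows no association with infestation, while a conjunctive clause over both factors does.

```python
from itertools import product

# Synthetic toy survey (an assumption for illustration, not study data):
# two hypothetical binary risk factors a and b, and an infestation flag.
# Infestation occurs only when exactly one factor is present (XOR-like),
# which is a purely epistatic pattern with no single-factor effect.
rows = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2) for _ in range(25)]


def infestation_rate(rows, predicate):
    """Fraction of infested households among rows matching the predicate."""
    matched = [infested for a, b, infested in rows if predicate(a, b)]
    return sum(matched) / len(matched)


# One-factor-at-a-time view: rates are identical with or without each factor.
rate_a1 = infestation_rate(rows, lambda a, b: a == 1)
rate_a0 = infestation_rate(rows, lambda a, b: a == 0)
rate_b1 = infestation_rate(rows, lambda a, b: b == 1)
rate_b0 = infestation_rate(rows, lambda a, b: b == 0)

# Conjunctive clause view: (a == 1 AND b == 0) isolates infested households.
rate_clause = infestation_rate(rows, lambda a, b: a == 1 and b == 0)

print("a present vs absent:", rate_a1, rate_a0)
print("b present vs absent:", rate_b1, rate_b0)
print("clause a=1 AND b=0:", rate_clause)
```

In this toy setting a univariate screen would discard both factors, whereas a search over conjunctions of factor values can recover them. With many candidate factors, the number of possible conjunctions grows too quickly for exhaustive enumeration, which is the motivation the abstract gives for an evolutionary search.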
A University of Vermont
doi: 10.18278/jpcs.3.2.8