Gustavo Batista

Associate Professor
Lecturer

I joined UNSW as an associate professor in 2018, after working for more than ten years at the University of Sao Paulo (USP). From 2010 to 2012, I was a visiting researcher at the University of California, Riverside (UCR), working in Prof. Eamonn Keogh's laboratory.

During my stay at UCR, I continued my work on time series analysis, particularly developing methods for classification and clustering of time-oriented data. In conjunction with Dr Keogh, I proposed the first complexity-invariant time series distance, as well as speed-up techniques for comparing massive amounts of time series data under warping.
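
To illustrate what complexity invariance means in practice, the following is a minimal Python sketch of a complexity-invariant distance in the spirit of CID (listed among the journal articles below). It assumes the commonly cited formulation, in which the Euclidean distance is multiplied by the ratio of the larger to the smaller complexity estimate, with complexity estimated as the root of the summed squared differences between consecutive points. The function names and the handling of flat (zero-complexity) series are illustrative assumptions, not taken from the published code.

```python
import math


def euclidean(q, c):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def complexity_estimate(t):
    """Root of the summed squared differences between consecutive points."""
    return math.sqrt(sum((t[i + 1] - t[i]) ** 2 for i in range(len(t) - 1)))


def cid(q, c):
    """Euclidean distance scaled by a complexity correction factor (>= 1)."""
    if len(q) != len(c):
        raise ValueError("series must have the same length")
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    if high == 0.0:
        # Assumption: two flat series have equal complexity, so no correction.
        return euclidean(q, c)
    if low == 0.0:
        # Assumption: a flat series versus a varying one gives an unbounded factor.
        return math.inf
    return euclidean(q, c) * (high / low)


# Hypothetical example series of equal length.
smooth = [0.0, 1.0, 2.0, 3.0]
jagged = [0.0, 3.0, 0.0, 3.0]
print(euclidean(smooth, jagged), cid(smooth, jagged))
```

In this example the jagged series has three times the complexity estimate of the smooth one, so the corrected distance is about three times the plain Euclidean distance.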

More recently, I have worked with data streams, particularly on classification with label latency, and have proposed efficient unsupervised methods to detect concept drift and to learn in the presence of these changes in the data distribution.

My research is motivated by applying Machine Learning in practice. My approach is to work on challenging applications that help my students and me identify gaps in the literature or assumptions in the state of the art that do not hold for our applications. This research approach often leads to contributions both to Computer Science and to the application areas.

One instance of such an approach is the challenge of incorporating classification algorithms into embedded devices. For example, I have developed lightweight models that can run in environments with severe power restrictions, such as satellites and sensors. One notable application is the development of sensors that automatically classify insects in flight, allowing the creation of surveillance systems for disease vectors, invasive species and pests. I have also developed EmbML, a Machine Learning tool that converts scikit-learn and Weka classifiers into C++ code crafted to run on low-power microcontrollers, such as those found in the Arduino family.

In recent years, I have actively worked in the area of Machine Learning Quantification, developing new algorithms to count events accurately. These recent developments have led to the proposal of a novel Data Mining task known as One-class Quantification, as well as a family of efficient quantification algorithms.
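
Quantification estimates how many instances belong to a class rather than labelling each instance. As a hedged illustration of the task (not of the algorithms proposed in the papers listed on this page), the sketch below contrasts two standard baselines from the quantification literature: Classify and Count, and Adjusted Classify and Count, which corrects the raw count using the classifier's true and false positive rates. The function names, the example numbers, and the fallback used when the rates are uninformative are assumptions made for illustration.

```python
def classify_and_count(predictions):
    """Fraction of instances predicted positive (predictions are 0/1)."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return sum(predictions) / len(predictions)


def adjusted_classify_and_count(predictions, tpr, fpr):
    """Correct the raw positive rate using tpr and fpr estimated on validation data."""
    observed = classify_and_count(predictions)
    if tpr <= fpr:
        # Assumption: with an uninformative classifier, fall back to the raw count.
        return observed
    estimate = (observed - fpr) / (tpr - fpr)
    # Clip because the correction can leave the valid [0, 1] range.
    return min(1.0, max(0.0, estimate))


# Hypothetical test batch: 40 of 100 instances predicted positive.
predictions = [1] * 40 + [0] * 60
raw = classify_and_count(predictions)
adjusted = adjusted_classify_and_count(predictions, tpr=0.8, fpr=0.2)
print(raw, adjusted)
```

With these hypothetical rates, the raw estimate of 0.4 is corrected to roughly one third, because some predicted positives are expected to be false alarms and some true positives are expected to be missed.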

The impact of my research can be measured by the number of recent papers citing my research articles. According to Google Scholar, my papers have more than 9,000 citations, with more than 1,000 citations in 2020.

Book Chapters
NADAI BLD; MOURA L; MALETZKE AG; BATISTA GEAPA; CORBI JJ, 2021, 'TECNOLOGIA NO MONITORAMENTO AMBIENTAL DE MOSQUITOS TRANSMISSORES DE DOENÇAS: QUAIS SÃO OS DESAFIOS? UMA BREVE REVISÃO', in INDICADORES BIOLÓGICOS DE QUALIDADE EM AMBIENTES AQUÁTICOS CONTINENTAIS: MÉTRICAS E RECORTES PARA ANÁLISES, RFB Editora, http://dx.doi.org/10.46898/rfb.9786558891321.8
dos Reis D; Maletzke A; Cherman E; Batista G, 2019, 'One-Class Quantification', in Machine Learning and Knowledge Discovery in Databases, pp. 273 - 289, http://dx.doi.org/10.1007/978-3-030-10925-7_17
Maletzke AG; Lee HD; Batista GEAPA; Coy CSR; Fagundes JAJ; Chung WF, 2014, 'Time series classification with motifs and characteristics', in Soft Computing for Business Intelligence, Springer, Berlin, Heidelberg, pp. 125 - 138
Journal articles
de Nadai BL; Maletzke AG; Corbi JJ; Batista GEAPA; Reiskind MH, 2021, 'The impact of body size on Aedes [Stegomyia] aegypti wingbeat frequency: implications for mosquito identification', Medical and Veterinary Entomology, http://dx.doi.org/10.1111/mve.12540
Parmezan ARS; Souza VMA; Žliobaitė I; Batista GEAPA, 2021, 'Changes in the wing-beat frequency of bees and wasps depending on environmental conditions: a study with optical sensors', Apidologie, vol. 52, pp. 731 - 748, http://dx.doi.org/10.1007/s13592-021-00860-y
Souza VMA; dos Reis DM; Maletzke AG; Batista GEAPA, 2020, 'Challenges in benchmarking stream learning algorithms with real-world data', Data Mining and Knowledge Discovery, vol. 34, pp. 1805 - 1858, http://dx.doi.org/10.1007/s10618-020-00698-5
Reis DD; de Souto M; de Sousa E; Batista G, 2020, 'Quantifying With Only Positive Training Data', arXiv preprint arXiv:2004.10356
Silva DF; Yeh C-CM; Zhu Y; Batista GEAPA; Keogh E, 2019, 'Fast similarity matrix profile for music analysis and exploration', IEEE Transactions on Multimedia, vol. 21, pp. 29 - 38, http://dx.doi.org/10.1109/TMM.2018.2849563
Parmezan ARS; Souza VMA; Batista GEAPA, 2019, 'Evaluation of statistical and machine learning models for time series prediction: Identifying the state-of-the-art and the best conditions for the use of each model', Information Sciences, vol. 484, pp. 302 - 337
Silva DF; Giusti R; Keogh E; Batista GEAPA, 2018, 'Speeding up similarity search under dynamic time warping by pruning unpromising alignments', Data Mining and Knowledge Discovery, vol. 32, pp. 988 - 1016
Maletzke AG; dos Reis DM; Batista GEAPA, 2018, 'Combining instance selection and self-training to improve data stream quantification', Journal of the Brazilian Computer Society, vol. 24, pp. 12 - 12, http://dx.doi.org/10.1186/s13173-018-0076-0
Souza V; Rossi RG; Batista GEAPA; Rezende SO, 2017, 'Unsupervised active learning techniques for labeling training sets: An experimental evaluation on sequential data', Intelligent Data Analysis, vol. 21, pp. 1061 - 1095
Prati RC; Batista GEAPA; Silva DF, 2015, 'Class imbalance revisited: a new experimental setup to assess the performance of treatment methods', Knowledge and Information Systems, vol. 45, pp. 247 - 270
Silva DF; Souza VMA; Ellis DPW; Keogh EJ; Batista GEAPA, 2015, 'Exploring Low Cost Laser Sensors to Identify Flying Insect Species: Evaluation of Machine Learning and Signal Processing Methods', Journal of Intelligent and Robotic Systems: Theory and Applications, vol. 80, pp. 313 - 330, http://dx.doi.org/10.1007/s10846-014-0168-9
Batista GEAPA; Delgado M; Bernardini F, 2015, 'ENIAC 2013 Special Issue', Journal of Intelligent and Robotic Systems: Theory and Applications, vol. 80, pp. 225 - 226, http://dx.doi.org/10.1007/s10846-015-0260-9
Batista GEAPA; Keogh EJ; Tataw OM; De Souza VMA, 2014, 'CID: an efficient complexity-invariant distance for time series', Data Mining and Knowledge Discovery, vol. 28, pp. 634 - 669
Chen Y; Why A; Batista G; Mafra-Neto A; Keogh E, 2014, 'Flying insect detection and classification with inexpensive sensors', JoVE (Journal of Visualized Experiments), pp. e52111 - e52111
Parmezan ARS; Batista GEAPA; et al., 2014, 'ICMC-USP time series prediction repository', Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos, Brasil, https://goo.gl/uzxGZJ
Del Gaudio R; Batista G; Branco A, 2014, 'Coping with highly imbalanced datasets: A case study with definition extraction in a multilingual setting', Natural Language Engineering, vol. 20, pp. 327 - 359
Chen Y; Why A; Batista G; Mafra-Neto A; Keogh E, 2014, 'Flying Insect Classification with Inexpensive Sensors', Journal of Insect Behavior, vol. 27, pp. 657 - 677, http://dx.doi.org/10.1007/s10905-014-9454-4
Rakthanmanon T; Campana B; Mueen A; Batista G; Westover B; Zhu Q; Zakaria J; Keogh E, 2013, 'Addressing Big Data Time Series: Mining Trillions of Time Series Subsequences Under Dynamic Time Warping.', ACM Trans Knowl Discov Data, vol. 7, https://www.ncbi.nlm.nih.gov/pubmed/31607834
Silva DF; de Souza VMAA; Batista GEAPA, 2013, 'A comparative study between MFCC and LSF coefficients in automatic recognition of isolated digits pronounced in Portuguese and English', Acta Scientiarum. Technology, vol. 35, pp. 621 - 628
Rakthanmanon T; Campana B; Mueen A; Batista G; Westover B; Zhu Q; Zakaria J; Keogh E, 2013, 'Addressing big data time series: Mining trillions of time series subsequences under dynamic time warping', ACM Transactions on Knowledge Discovery from Data (TKDD), vol. 7, pp. 1 - 31
Prati RC; Batista GEAPA, 2012, 'A complexity-invariant measure based on fractal dimension for time series classification', International Journal of Natural Computing Research (IJNCR), vol. 3, pp. 59 - 73
Prati RC; Batista GEAPA; Monard MC, 2011, 'A survey on graphical methods for classification predictive performance evaluation', IEEE Transactions on Knowledge and Data Engineering, vol. 23, pp. 1601 - 1618, http://dx.doi.org/10.1109/TKDE.2011.59
Milaré CR; Batista GEAPA; Carvalho ACPLF, 2011, 'A hybrid approach to learn with imbalanced classes using evolutionary algorithms', Logic Journal of IGPL, vol. 19, pp. 293 - 293
Prati R; Batista G; Monard M, 2010, 'A survey on graphical methods for classification predictive performance evaluation', Knowledge and Data Engineering, IEEE Transactions on, pp. 1 - 1
Prati RC; Batista GEDAPA; Monard MC, 2008, 'Evaluating classifiers using ROC curves', IEEE Latin America Transactions, vol. 6, pp. 215 - 222, http://dx.doi.org/10.1109/TLA.2008.4609920
Prati RC; Batista GEAPA; Monard MC, 2008, 'Curvas ROC para avaliação de classificadores', Revista IEEE América Latina, vol. 6, pp. 215 - 222
Batista GEAPA; Milaré CR; Prati RC; Monard MC, 2006, 'A Comparison of Methods for Rule Subset Selection Applied to Associative Classification.', Inteligencia artificial: Revista Iberoamericana de Inteligencia Artificial, vol. 10, pp. 29 - 35
Batista G; Prati R; Monard M, 2005, 'Balancing strategies and class overlapping', Advances in Intelligent Data Analysis VI, pp. 741 - 741
Milaré C; Batista G; de Carvalho A; Monard M, 2004, 'Applying genetic and symbolic learning algorithms to extract rules from artificial neural networks', MICAI 2004: Advances in Artificial Intelligence, pp. 833 - 843
Batista GEAPA; Prati RC; Monard MC, 2004, 'A study of the behavior of several methods for balancing machine learning training data', ACM SIGKDD Explorations Newsletter, vol. 6, pp. 20 - 29
Prati R; Batista G; Monard M, 2004, 'Class imbalances versus class overlapping: an analysis of a learning system behavior', MICAI 2004: Advances in Artificial Intelligence, pp. 312 - 321
Batista GEAPA; Monard MC, 2003, 'Descrição da arquitetura e do projeto do ambiente computacional DISCOVER LEARNING ENVIRONMENT—DLE', Relatório Técnico do ICMC/USP
Batista GEAPA; Monard MC, 2003, 'Experimental comparison of K-nearest neighbour and mean or mode imputation methods with the internal strategies used by C4.5 and CN2 to treat missing data', University of Sao Paulo
Batista GEAPA; Monard MC, 2003, 'An analysis of four missing data treatment methods for supervised learning', Applied Artificial Intelligence, vol. 17, pp. 519 - 533
Batista GEAPA; Monard MC, 2002, 'K-Nearest Neighbour as Imputation Method: Experimental Results', Technical report, ICMC-USP
Batista GEAPA; Monard MC, 2002, 'A Study of K-Nearest Neighbour as an Imputation Method.', HIS, vol. 87, pp. 48 - 48
Monard MC; Batista GEAPA, 2002, 'Learning with Skewed Class Distributions', Advances in Logic, Artificial Intelligence, and Robotics: LAPTEC 2002, vol. 85, pp. 173 - 173
Batista G; Carvalho A; Monard M, 2000, 'Applying one-sided selection to unbalanced datasets', MICAI 2000: Advances in Artificial Intelligence, pp. 315 - 325
Batista GEAPA, 1997, 'Um ambiente de avaliação de algoritmos de aprendizado de máquina utilizando exemplos', Dissertação de Mestrado, ICMC-USP
Conference Papers
Sharma A; Li J; Mishra D; Batista G; Seneviratne A, 2021, 'Passive WiFi CSI Sensing Based Machine Learning Framework for COVID-Safe Occupancy Monitoring', in 2021 IEEE International Conference on Communications Workshops, ICC Workshops 2021 - Proceedings, IEEE, presented at 2021 IEEE International Conference on Communications Workshops (ICC Workshops), 14 June 2021 - 23 June 2021, http://dx.doi.org/10.1109/ICCWorkshops50388.2021.9473673
Rebello G; Hu Y; Thilakarathna K; Batista G; Seneviratne A; Duarte OCMB, 2020, 'Melhorando a Acurácia da Detecção de Lavagem de Dinheiro na Rede Bitcoin', in Anais XXXVIII Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos (SBRC 2020), Sociedade Brasileira de Computação, presented at Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos, http://dx.doi.org/10.5753/sbrc.2020.12321
Jacintho LHM; Da Silva TP; Parmezan ARS; Batista GEDAPA, 2020, 'Brazilian Presidential Elections: Analysing Voting Patterns in Time and Space Using a Simple Data Science Pipeline', in Anais do Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2020), Sociedade Brasileira de Computação, presented at Symposium on Knowledge Discovery, Mining and Learning, http://dx.doi.org/10.5753/kdmile.2020.11979
de Sá JMC; Rossi ALD; Batista GEAPA; Garcia LPF, 2020, 'Algorithm recommendation for data streams', in Proceedings - International Conference on Pattern Recognition, IEEE, pp. 6073 - 6080, presented at 2020 25th International Conference on Pattern Recognition (ICPR), 10 January 2021 - 15 January 2021, http://dx.doi.org/10.1109/ICPR48806.2021.9411923
Maletzke A; Hassan W; dos Reis D; Batista G, 2020, 'The Importance of the Test Set Size in Quantification Assessment', in IJCAI, IJCAI, YOKOHAMA, pp. 2640 - 2646, presented at Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence Main track, YOKOHAMA, http://dx.doi.org/10.24963/ijcai.2020/366
Hassan W; Maletzke A; Batista G, 2020, 'Accurately quantifying a billion instances per second', in Proceedings - 2020 IEEE 7th International Conference on Data Science and Advanced Analytics, DSAA 2020, pp. 1 - 10, http://dx.doi.org/10.1109/DSAA49011.2020.00012
Maletzke A; dos Reis D; Cherman E; Batista G, 2019, 'DyS: a Framework for Mixture Models in Quantification', in Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)
Tsutsui Da Silva L; Souza VMA; Batista GEAPA, 2019, 'EmbML Tool: Supporting the use of supervised learning algorithms in low-cost embedded systems', in Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI, pp. 1633 - 1637, http://dx.doi.org/10.1109/ICTAI.2019.00238
Souza V; Pinho T; Batista G, 2018, 'Evaluating stream classifiers with delayed labels information', in Proceedings - 2018 Brazilian Conference on Intelligent Systems, BRACIS 2018, pp. 408 - 413, http://dx.doi.org/10.1109/BRACIS.2018.00077
dos Reis DM; Maletzke AG; Batista GEAPA, 2018, 'Unsupervised context switch for classification tasks on data streams with recurrent concepts', in Proceedings of the 33rd Annual ACM Symposium on Applied Computing, pp. 518 - 524
Silva DF; Batista GEAPA; Keogh E, 2018, 'Large-Scale Similarity-Based Time Series Mining', in Anais do Concurso de Teses e Dissertações da SBC (CTD-SBC), Sociedade Brasileira de Computação - SBC, presented at XXXI Concurso de Teses e Dissertações da SBC, http://dx.doi.org/10.5753/ctd.2018.3656
Parmezan ARS; Souza VMA; Batista GEAPA, 2018, 'Towards Hierarchical Classification of Data Streams', in 23rd Iberoamerican Congress on Pattern Recognition (CIARP), pp. 314 - 322
da Silva TP; Souza VMA; Batista GEAPA; de Arruda Camargo H, 2018, 'A Fuzzy Classifier for Data Streams with Infinitely Delayed Labels', in 23rd Iberoamerican Congress on Pattern Recognition (CIARP)
Moreira dos Reis D; Maletzke A; Silva DF; Batista GEAPA, 2018, 'Classifying and counting with recurrent contexts', in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1983 - 1992
dos Reis D; Maletzke A; Cherman E; Batista G, 2018, 'One-class quantification', in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, Dublin, Ireland, pp. 273 - 289, presented at ECML PKDD 2018, Dublin, Ireland, 10 September 2018 - 14 September 2018, http://dx.doi.org/10.1007/978-3-030-10925-7
Maletzke A; dos Reis D; Cherman E; Batista G, 2018, 'On the Need of Class Ratio Insensitive Drift Tests for Data Streams', in Second International Workshop on Learning with Imbalanced Domains: Theory and Applications, pp. 110 - 124
Silva DF; Batista GEAPA, 2018, 'Elastic time series motifs and discords', in 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, pp. 237 - 242
Maletzke AG; dos Reis DM; Batista GEAPA, 2017, 'Quantification in data streams: Initial results', in 2017 Brazilian Conference on Intelligent Systems (BRACIS), IEEE, pp. 43 - 48
Giusti R; Silva DF; Batista GEAPA, 2016, 'Improved time series classification with representation diversity and svm', in 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, pp. 1 - 6
Sousa CAR; Batista GEAPA, 2016, 'Constrained Local and Global Consistency for semi-supervised learning', in 2016 23rd International Conference on Pattern Recognition (ICPR), IEEE, pp. 1689 - 1694
Silva DF; Batista GEAPA; Keogh E, 2016, 'Prefix and Suffix Invariant Dynamic Time Warping', in 2016 IEEE 16th International Conference on Data Mining (ICDM), IEEE, presented at 2016 IEEE 16th International Conference on Data Mining (ICDM), 12 December 2016 - 15 December 2016, http://dx.doi.org/10.1109/icdm.2016.0161
Giusti R; Silva DF; Batista GEAPA, 2016, 'Improved Time Series Classification with Representation Diversity and SVM', in 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, presented at 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 18 December 2016 - 20 December 2016, http://dx.doi.org/10.1109/icmla.2016.0010
Silva DF; Batista GEAPA; Keogh E, 2016, 'Prefix and suffix invariant dynamic time warping', in 2016 IEEE 16th International Conference on Data Mining (ICDM), IEEE, pp. 1209 - 1214
Silva DF; Batista GEAPA, 2016, 'Speeding up all-pairwise dynamic time warping matrix calculation', in Proceedings of the 2016 SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, pp. 837 - 845
Silva DF; Yeh CCM; Batista GEAPA; Keogh E, 2016, 'SiMPle: Assessing music similarity using subsequences joins', in Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016, pp. 23 - 29
Silva DF; Batista GEAPA; Keogh E; et al., 2016, 'On the effect of endpoints on dynamic time warping', in SIGKDD Workshop on Mining and Learning from Time Series II, San Francisco, CA. Association for Computing Machinery-ACM
dos Reis DM; Flach P; Matwin S; Batista G, 2016, 'Fast unsupervised online drift detection using incremental kolmogorov-smirnov test', in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1545 - 1554
Parmezan ARS; Batista GEAPA, 2015, 'A study of the use of complexity measures in the similarity search process adopted by knn algorithm for time series prediction', in 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), IEEE, pp. 45 - 51
Souza VMA; Batista GEAPA; Souza-Filho NE, 2015, 'Automatic classification of drum sounds with indefinite pitch', in 2015 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 1 - 8
Silva DF; de Souza VMA; Batista GEAPA, 2015, 'Music Shapelets for Fast Cover Song Recognition.', in ISMIR, pp. 441 - 447
Souza VMA; Silva DF; Gama JA; Batista GEAPA, 2015, 'Data Stream Classification Guided by Clustering on Nonstationary Environments and Extreme Verification Latency', in SIAM International Conference on Data Mining (SDM), pp. 873 - 881
Giusti R; Silva DF; Batista GEAPA, 2015, 'Time series classification with representation ensembles', in International Symposium on Intelligent Data Analysis, Springer, Cham, pp. 108 - 119
Qi Y; Cinar GT; Souza VMA; Batista GEAPA; Wang Y; Principe JC, 2015, 'Effective insect recognition using a stacked autoencoder with maximum correntropy criterion', in 2015 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 1 - 7
de Sousa CAR; Souza VMA; Batista GEAPA, 2015, 'An experimental analysis on time series transductive classification on graphs', in 2015 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 1 - 8
de Sousa AR; Batista GEAPA, 2015, 'Robust multi-class graph transduction with higher order regularization', in 2015 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 1 - 8
Oliveira LS; Batista GEAPA, 2015, 'Igmm-cd: a gaussian mixture classification algorithm for data streams with concept drifts', in 2015 Brazilian Conference on Intelligent Systems (BRACIS), IEEE, pp. 55 - 61
Souza VMA; Silva DF; Batista GEAPA; Gama J, 2015, 'Classification of Evolving Data Streams with Infinitely Delayed Labels', in IEEE International Conference on Machine Learning & Applications (ICMLA), pp. 214 - 219
Silva DF; Rossi RG; Rezende SO; Batista GEAPA, 2014, 'Music Classification by Transductive Learning Using Bipartite Heterogeneous Networks', in International Society of Music Information Retrieval Conference (ISMIR)
Lemes CI; Silva DF; Batista GEAPA, 2014, 'Adding diversity to rank examples in anytime nearest neighbor classification', in 2014 13th International Conference on Machine Learning and Applications, IEEE, pp. 129 - 134
Souza VMA; Silva DF; Batista GEAPA, 2014, 'Extracting texture features for time series classification', in 2014 22nd International Conference on Pattern Recognition, IEEE, pp. 1425 - 1430
de Sousa CAR; Souza VMA; Batista GEAPA, 2014, 'Time series transductive classification on imbalanced data sets: an experimental study', in 2014 22nd International Conference on Pattern Recognition, IEEE, pp. 3780 - 3785
Domingues MA; Marcacini RM; Rezende SO; Batista GEAPA, 2013, 'Improving the recommendation of given names by using contextual information', in CEUR Workshop Proceedings, pp. 61 - 72
Domingues MA; Cherman EA; Nogueira BM; Conrado MS; Rossi RG; De Padua R; Marcacini RM; Souza VMA; Batista GEAPA; Rezende SO, 2013, 'A comparative study of algorithms for recommending given names', in 2013 2nd International Conference on Informatics and Applications, ICIA 2013, pp. 66 - 71, http://dx.doi.org/10.1109/ICoIA.2013.6650231
Maletzke AG; Lee HD; Batista GEAPA; Rezende SO; Machado RB; Voltolini RF; Maciel JN; Silva F, 2013, 'Time Series Classification using Motifs and Characteristics Extraction: A Case Study on ECG Databases', in Proceedings of the Fourth International Workshop on Knowledge Discovery, Knowledge Management and Decision Support, Atlantis Press, presented at Fourth International Workshop on Knowledge Discovery, Knowledge Management and Decision Support, 06 November 2013 - 08 November 2013, http://dx.doi.org/10.2991/.2013.40
Rakthanmanon T; Keogh E, 2013, 'Data mining a trillion time series subsequences under dynamic time warping', in Twenty-Third International Joint Conference on Artificial Intelligence
Chen Y; Hu B; Keogh E; Batista GEAPA, 2013, 'DTW-D: time series semi-supervised learning from a single example', in Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 383 - 391
Silva D; Papadopoulos H; Batista GEAPA; Ellis DPW, 2013, 'A video compression-based approach to measure music structural similarity', in International Society for Music Information Retrieval Conference, pp. 95 - 10
de Sousa CAR; Rezende SO; Batista GEAPA, 2013, 'Influence of graph construction on semi-supervised learning', in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, Berlin, Heidelberg, pp. 160 - 175
Giusti R; Batista GEAPA, 2013, 'An empirical comparison of dissimilarity measures for time series classification', in 2013 Brazilian Conference on Intelligent Systems, IEEE, pp. 82 - 88
Silva DF; De Souza VMA; Batista GEAPA, 2013, 'Time series classification using compression distance of recurrence plots', in 2013 IEEE 13th International Conference on Data Mining, IEEE, pp. 687 - 696
de Souza VMA; Silva DF; Batista GEAPA, 2013, 'Classification of data streams applied to insect recognition: Initial results', in 2013 Brazilian Conference on Intelligent Systems, IEEE, pp. 76 - 81
Silva DF; De Souza VMA; Batista GEAPA; Keogh E; Ellis DPW, 2013, 'Applying machine learning and audio analysis techniques to insect recognition in intelligent traps', in 2013 12th International Conference on Machine Learning and Applications, IEEE, pp. 99 - 104
Qiang Z; Rakthanmanon T; Batista G; Keogh E, 2012, 'A novel approximation to dynamic time warping allows anytime clustering of massive time series datasets', in SIAM International Conference on Data Mining, pp. 999 - 1010
Alves GEDAP; Silva DF; Prati RC; et al., 2012, 'An experimental design to evaluate class imbalance treatment methods', in 2012 11th International Conference on Machine Learning and Applications, IEEE, pp. 95 - 101
Silva DF; de Souza VMA; Batista GEAPA; Giusti R, 2012, 'Spoken digit recognition in portuguese using line spectral frequencies', in Ibero-American Conference on Artificial Intelligence, Springer, Berlin, Heidelberg, pp. 241 - 250
Rakthanmanon T; Campana B; Mueen A; Batista G; Westover B; Zhu Q; Zakaria J; Keogh E, 2012, 'Searching and mining trillions of time series subsequences under dynamic time warping', in Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 262 - 270
Batista G; Wang X; Keogh E, 2011, 'A Complexity-Invariant Distance Measure for Time Series', in SDM-2011: Proceedings of SIAM International Conference on Data Mining
Batista GEAPA; Hao Y; Keogh E; Mafra-Neto A, 2011, 'Towards automatic classification on flying insects using inexpensive sensors', in 2011 10th International Conference on Machine Learning and Applications and Workshops, IEEE, pp. 364 - 369
Batista G; Keogh E; Neto AM; Rowton E, 2011, 'SIGKDD demo: sensors and software to allow computational entomology, an emerging application of data mining', in Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp. 761 - 764
Giusti R; Batista GEAPA, 2010, 'Discovering Knowledge Rules with Multi-Objective Evolutionary Computing', in Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on, IEEE, pp. 119 - 124
Batista GEAPA; Campana B; Keogh E, 2010, 'Classification of Live Moths Combining Texture, Color and Shape Primitives', in Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on, IEEE, pp. 903 - 906
Batista GEAPA; Silva DF, 2009, 'How k-Nearest Neighbor Parameters Affect its Performance', in X Argentine Symposium on Artificial Intelligence
Prati RC; Batista GEAPA; Monard MC, 2009, 'Data mining with imbalanced class distributions: concepts and methods.', in IICAI, pp. 359 - 376
Prati RC; Batista GEAPA; Monard MC, 2008, 'A study with class imbalance and random sampling for a decision tree learning system', in IFIP International Conference on Artificial Intelligence in Theory and Practice, Springer, Boston, MA, pp. 131 - 140
Matsubara ET; Prati RC; Batista GEAPA; Monard MC, 2008, 'Missing value imputation using a semi-supervised rank aggregation approach', in Brazilian Symposium on Artificial Intelligence, Springer, Berlin, Heidelberg, pp. 217 - 226
Giusti R; Batista GEAPA; Prati RC, 2008, 'Evaluating Ranking Composition Methods for Multi-Objective Optimization of Knowledge Rules', in Hybrid Intelligent Systems, 2008. HIS’08. Eighth International Conference on, IEEE, pp. 537 - 542
Batista GEAPA; Prati RC; Monard MC, 2005, 'Balancing strategies and class overlapping', in Famili AF; Kok JN; Pena JM; Siebes A; Feelders A (eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), SPRINGER-VERLAG BERLIN, Madrid, SPAIN, pp. 24 - 35, presented at 6th International Symposium on Intelligent Data Analysis, Madrid, SPAIN, 08 September 2005 - 10 September 2005, http://dx.doi.org/10.1007/11552253_3
Matsubara ET; Monard MC; Batista GEAPA, 2005, 'Multi-view semi-supervised learning: An approach to obtain different views from text datasets', in Proceeding of the 2005 conference on Advances in Logic Based Intelligent Systems: Selected Papers of LAPTEC 2005, IOS Press, pp. 97 - 104
Prati RC; Batista GEAPA; Monard MC, 2004, 'Learning with class skews and small disjuncts', in Brazilian Symposium on Artificial Intelligence, Springer, Berlin, Heidelberg, pp. 296 - 306
Batista GEAPA; Monard MC; Bazzan ALC, 2004, 'Improving rule induction precision for automated annotation by balancing skewed data sets', in International Symposium on Knowledge Exploration in Life Science Informatics, Springer, Berlin, Heidelberg, pp. 20 - 32
Batista GEAPA; Bazan AL; Monard MC, 2003, 'Balancing training data for automated annotation of keywords: a case study', in Proceedings of the Second Brazilian Workshop on Bioinformatics, pp. 35 - 43
Lorena AC; Batista GEAPA; De Carvalho ACPLF; Monard MC, 2002, 'Splice junction recognition using machine learning techniques', in Proceedings of the First Brazilian Workshop on Bioinformatics, Citeseer, pp. 32 - 39
Lorena AC; Batista GEAPA; de Carvalho ACPLF; Monard MC, 2002, 'The influence of noisy patterns on the performance of learning methods in the splice junction recognition problem', in Neural Networks, 2002. SBRN 2002. Proceedings. VII Brazilian Symposium on, IEEE, pp. 31 - 36
Batista GEAPA; Monard MC, 2001, 'A study of K-nearest neighbour as a model-based method to treat missing data', in Argentine Symposium on Artificial Intelligence
Baranauskas JA; Monard MC; Batista GEAPA, 2000, 'A computational environment for extracting rules from databases', in Ebecken N; Brebbia CA (ed.), Management Information Systems, WIT PRESS, CAMBRIDGE UNIV, CAMBRIDGE, ENGLAND, pp. 321 - 330, presented at 2nd International Conference on Data Mining, CAMBRIDGE UNIV, CAMBRIDGE, ENGLAND, 05 July 2000 - 07 July 2000, http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000166319000032&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=891bb5ab6ba270e68a29b250adbe88d1
Other
Chen Y; Keogh E; Hu B; Begum N; Bagnall A; Queen A; Batista G, 2015, The ucr time series classification archive
Souza VMA; Silva DF; Gama J; Batista GEAPA, 2015, Nonstationary environments-archive
Theses / Dissertations
BATISTA GE, 2003, Pré-processamento de dados em aprendizado de máquinas supervisionado., Tese (Doutorado)-Instituto de Ciências Matemáticas e de Computação …

Awards and fellowships

  • 2020. Best Research Paper Award. IEEE International Conference on Data Science and Advanced Analytics (IEEE-DSAA).
  • 2017 – 2020. Research Fellow, level 2. National Council for Scientific and Technological Development, CNPq.
  • 2014 – 2017. Research Fellow, level 2. National Council for Scientific and Technological Development, CNPq.
  • 2015 – 2016. Google Research Award in Latin America. Google Inc.
  • 2012. Best Research Paper Award. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (ACM-KDD).

Grant funding as principal investigator

  • 2017 – 2019: FAPESP e-Science Research Grant. Intelligent Traps and Sensors: an Innovative Approach to Control Insect Pests and Disease Vectors. $55,000.
  • 2016 – 2019: USAID Combating Zika and Future Threats Grand Challenge. An Intelligent Trap and Mobile Application to Motivate Local Mosquito Control Activities. $500,000.
  • 2017 – 2019: CNPq Research Fellow. Novel Approaches in Machine Learning Applied to Automatic Insect Recognition. $25,000.
  • 2015 – 2016: Google LA Research Award. Controlling Dengue Fever Mosquitoes using Intelligent Sensors and Traps. $24,000.
  • 2012 – 2014: FAPESP Research Grant. Complexity-invariance for Classification, Clustering and Motif Discovery in Time Series. $30,000.
  • 2013 – 2015: FAPESP-CALDO International Cooperation Grant. Research on Geospatial Marine Biology Data Mining using Time Series, Text Mining and Visualization (with Stan Matwin co-PI for NSERC). $20,000.
  • 2013 – 2015: FAPESP-CNPq Research Grant. Intelligent Sensors for Controlling Agricultural Pests and Disease-vector Insects. $55,000.
  • 2014 – 2017: CNPq Universal Research Grant. Real-time Monitoring of Insect Pests in Agriculture and the Environment. $25,000.
  • 2014 – 2017: FAPESP New Frontiers Grant. Time Series Classification Algorithms Applied to Embedded Systems. $30,000.
  • 2007 – 2009: FAPESP Research Grant. Machine Learning and Class Imbalance. $10,000.