Tutorials | FUZZ-IEEE 2019

Visual Assessment of Cluster Tendency
- Organized by Jim Bezdek
Deep Fuzzy Models
- Organized by Joao Souza, Uzay Kaymak, and Alexander Gegov
Non-Standard Machine Learning and Data Science Problems with Fuzzy Logic: A Big Data Perspective
- Organized by Alberto Fernandez, Isaac Triguero, and Mikel Galar
Fuzzy Fusion of Decisions from Heterogeneous Deep Machine Learning Models
- Organized by Grant Scott
Support Fuzzy-Set Machines: From Kernels on Fuzzy Sets to Machine Learning Applications
- Organized by Jorge Guevara, Roberto Hirata, and Stephane Canu
Fuzzy-Rough Data Mining (Using the Weka Data Mining Suite)
- Organized by Richard Jensen and Neil Mac Parthalain
Sculpting the State Space: A New Way to Establish and Explain the Potential for Improved Performance When Using Rule-Based Fuzzy Systems
- Organized by Jerry Mendel

Schedule

	Session 1	Tags	Session 3	Tags
8:00 – 10:00	Non-standard Machine Learning and Data Science Problems with Fuzzy Logic: A Big Data Perspective (Alberto Fernandez, Isaac Triguero, Mikel Galar)	Big Data	Deep Fuzzy Models (Joao Souza, Uzay Kaymak, Alexander Gegov)	Deep+ML
10:00 – 10:30	Coffee		Coffee
10:30 – 12:30	Tutorial Cancelled	Big Data	Support Fuzzy-Set Machines: From Kernels on Fuzzy Sets to Machine Learning Applications (Jorge Guevara, Roberto Hirata, Stephane Canu)	Deep+ML
12:30 – 14:00	Lunch (on your own)		Lunch (on your own)
	Session 2		Session 4
14:00 – 16:00	Visual Assessment of Cluster Tendency (Jim Bezdek)		Fuzzy Fusion of Decisions from Heterogeneous Deep Machine Learning Models (Grant Scott)	Deep+ML
16:00 – 16:30	Coffee		Coffee
16:30 – 18:30	Fuzzy-Rough Data Mining (Using the Weka Data Mining Suite) (Richard Jensen, Neil Mac Parthalain)		Sculpting the State Space: A New Way to Establish and Explain the Potential for Improved Performance When Using Rule-Based Fuzzy Systems (Jerry Mendel)

More Details

Visual Assessment of Cluster Tendency

Organized by Jim Bezdek

Outline:
A. The three canonical problems of clustering (pre-clustering assessment of tendency, clustering algorithms, post-clustering validation). Definitions and notation, label vectors and partitions. The need for assessment and cluster validity. Why cluster validation is hard. Labeled data are NOT clustered data.

B. Visual assessment of cluster tendency. The VAT/iVAT families of visual assessment models. Relation of pre-clustering assessment to post-clustering validation. sVAT and ClusiVAT for big data. InciVAT for streaming data.

C. Scalar measures of cluster validity (cluster validity indices or CVIs). Internal and external measures. Three kinds of comparison surveys: internal, external, and mixed internal/external. Comparisons based on fake clustering vs. real cluster analysis. Corrections for bias. Normalized forms. Complexity. Approximations for Big Data clustering.

D. Selected Internal CVIs for crisp cluster validation: Davies-Bouldin (DBI), VRC, C Index, Gamma, PBMH, Silhouette index, Dunn’s index, Generalized Dunn Indices. Bias in internal indices.

E. Selected External CVIs for crisp cluster validation: Rand’s index, Adjusted Rand Index, Mutual Information, Normalized forms of Mutual Information, Joint Entropy, Variation of Information. Bias in external indices.

F. Selected Internal CVIs for fuzzy partitions: Partition Coefficient, Partition Entropy, Xie-Beni (XBI), PBMF, Fuzzy C-Indices, Soft Silhouette

G. Incremental CVIs for monitoring and control of crisp and fuzzy streaming clustering: iDBI, iXBI, incremental Modified Generalized Dunn’s Indices, iMDI43, iMDI53.
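As a rough illustration of item B, the basic VAT reordering can be sketched in a few lines. This is a minimal sketch assuming the standard formulation (start at an endpoint of the largest dissimilarity, then repeatedly append the unordered object closest to the ordered set); the function names `vat_order` and `reorder` and the toy data are hypothetical, and iVAT, sVAT, and the other variants in the outline are not covered.

```python
import math


def vat_order(R):
    """Return a VAT-style ordering of object indices for a square dissimilarity matrix R."""
    n = len(R)
    # Start from an object that attains the largest dissimilarity in R.
    first = max(range(n), key=lambda i: max(R[i]))
    order = [first]
    remaining = set(range(n)) - {first}
    # nearest[q] = smallest dissimilarity between q and any already ordered object.
    nearest = {q: R[first][q] for q in remaining}
    while remaining:
        # Append the unordered object closest to the ordered set (ties broken by index).
        j = min(remaining, key=lambda q: (nearest[q], q))
        order.append(j)
        remaining.remove(j)
        for q in remaining:
            nearest[q] = min(nearest[q], R[j][q])
    return order


def reorder(R, order):
    """Apply the same permutation to rows and columns of R."""
    return [[R[i][j] for j in order] for i in order]


# Toy data (an assumption for illustration): two well-separated groups in the plane.
points = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.2), (5.2, 4.9), (0.2, 0.1), (4.9, 5.1)]
R = [[math.dist(p, q) for q in points] for p in points]
order = vat_order(R)
R_star = reorder(R, order)
```

Displaying `R_star` as a grayscale image would be expected to show dark blocks along the diagonal, one per apparent cluster; here the two groups end up contiguous in `order`.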

Organizer bio:
James Bezdek (LF ’10) received the PhD in Applied Mathematics from Cornell University in 1973. Jim is past president of NAFIPS (North American Fuzzy Information Processing Society), IFSA (International Fuzzy Systems Association) and the IEEE CIS (Computational Intelligence Society); founding editor of the Int’l. Jo. Approximate Reasoning and the IEEE Transactions on Fuzzy Systems; Life Fellow of the IEEE and IFSA; and a recipient of the IEEE 3rd Millennium, CIS Fuzzy Systems Pioneer, and technical field award Rosenblatt medals, and the IPMU Kampé de Fériet Award. Jim retired in 2007, and will be coming to a university near you soon.

Deep Fuzzy Models

Organized by Joao Souza, Uzay Kaymak, and Alexander Gegov

Deep learning has gained significant attention within the computational intelligence community in recent years. Its success has been mainly due to the increased capability of modern computers to collect, store and process large volumes of data. This has led to a substantial increase in the effectiveness and efficiency of data management. As a result, it has become possible to achieve high accuracy for some benchmark learning tasks such as object classification and image recognition within a short time frame. The most common implementation of deep learning has been through neural networks, because their layers can perform multiple functional composition as part of a multistage learning process.

In spite of the significant recent advances in deep learning discussed above, there are still some open problems and serious limitations. In particular, effectiveness is usually adversely affected when the data is not well defined due to inherent noise, uncertainty, ambiguity, vagueness and incompleteness. This has an adverse impact on efficiency due to the necessity to define the data better by means of additional collection, analysis and cleaning. The reduced effectiveness and efficiency undermine the ability of deep learning to address real-life tasks that are safety critical or time critical. Besides this, deep learning has been used mainly in a passive manner for the purpose of observing the environment, but it has rarely been used in an active manner for the purpose of changing the environment. Finally, deep learning models often have poor transparency, which makes them difficult for non-technical users to understand and interpret.

The aim of this tutorial is to present in a brief structured format the contents of the papers accepted for publication in the forthcoming Special Issue on Deep Fuzzy Models for the IEEE Transactions on Fuzzy Systems. The focus of this special issue is on addressing the problems and limitations discussed above with the help of deep fuzzy models. The latter have been around in different forms and under different names such as hierarchical fuzzy systems and fuzzy networks. These models are well suited for performing multiple functional composition at both crisp and linguistic levels. Moreover, because they use a fuzzy approach, they have the potential to handle data that is not well defined effectively and efficiently. Also, deep fuzzy models can be used in both a passive and an active manner with regard to the environment due to their generic structure. Finally, these models have a high level of transparency due to their rule-based nature.

Organizer bio:
The tutorial presenters are Guest Editors for the forthcoming Special Issue on Deep Fuzzy Models for the IEEE Transactions on Fuzzy Systems. They are also Associate Editors for the IEEE Transactions on Fuzzy Systems.

Non-Standard Machine Learning and Data Science Problems with Fuzzy Logic: A Big Data Perspective

Organized by Alberto Fernandez, Isaac Triguero, and Mikel Galar

In the era of big data, recent advances in distributed technologies have enabled a novel scenario known as data science, whose main goal is to discover unknown patterns or hidden relations in voluminous data more quickly. Extracting knowledge from big data becomes a very interesting and challenging task where we must consider new paradigms to develop scalable algorithms. However, computational intelligence models for machine learning, including those that consider fuzzy logic, cannot be straightforwardly adapted to the new space and time requirements. Hence, existing algorithms should be redesigned or new ones developed in order to take advantage of their capabilities in the big data context. Moreover, real-world complex big data problems pose several issues besides computational complexity, and big data mining techniques should be able to deal with challenges such as dimensionality, class imbalance, and lack of annotated samples, among others.

Addressing Big Data becomes a very interesting and challenging task where we must consider new paradigms to develop scalable algorithms. The MapReduce framework, introduced by Google, allows us to carry out the processing of large amounts of information. Its open source implementation, named Hadoop, has enabled the development of scalable algorithms, becoming the de facto standard for addressing Big Data problems. Recently, new alternatives to the standard Hadoop-MapReduce framework have arisen to improve performance in this scenario, the Apache Spark project being the most relevant one. Even when working on Spark, the MapReduce framework implies that existing algorithms need to be redesigned or new ones need to be developed in order to take advantage of their capabilities in the big data context.
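To make the MapReduce idea concrete, the following is a minimal single-process sketch of the map, shuffle, and reduce phases, here used to count class labels across data partitions (a first step when diagnosing class imbalance). It does not use Hadoop or Spark; the function names and the toy partitions are assumptions for illustration only.

```python
from collections import defaultdict


def map_phase(partition):
    """Map step: emit one (class label, 1) pair per example in a partition."""
    return [(label, 1) for _features, label in partition]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine the values of each key, here by summing."""
    return {key: sum(values) for key, values in groups.items()}


# Toy partitions (hypothetical): each example is (feature vector, class label).
partitions = [
    [((0.1, 0.2), "neg"), ((0.4, 0.3), "neg"), ((0.9, 0.8), "pos")],
    [((0.2, 0.1), "neg"), ((0.3, 0.3), "neg")],
]

# In a real cluster each partition would be mapped on a different worker.
mapped = [pair for part in partitions for pair in map_phase(part)]
counts = reduce_phase(shuffle(mapped))
```

The point of the pattern is that `map_phase` only ever sees one partition, so an algorithm has to be expressed as independent per-partition work plus an associative combination step. This is why existing learning algorithms usually need to be redesigned rather than simply rerun.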

Data science is a quite recent field of study, and it is still rapidly expanding. Consequently, the open directions for novel research are particularly associated with the application of fuzzy systems to emerging work scenarios in data science. Some clear examples are data streams, imbalanced classification, and high-dimensional data, among others.

In this tutorial, we will first provide a gentle introduction to the problem of Big Data from the perspective of the development of fuzzy-based models. Then, we will dive into the field of Big Data analytics, describing several interesting case studies and real applications on the topic addressed with fuzzy approaches. Finally, we will carry out a thorough discussion aimed at defining the direction for the design of powerful algorithms based on fuzzy systems, and at showing how the information extracted with these models can be useful to experts.

Organizer bio:
Isaac Triguero received his M.Sc. and Ph.D. degrees in Computer Science from the University of Granada, Granada, Spain, in 2009 and 2014, respectively. He is currently an Assistant Professor in Data Science at the School of Computer Science of the University of Nottingham. He has published more than 30 international journal papers as well as more than 30 contributions to conferences. He is a Section Editor-in-Chief of the Machine Learning and Knowledge Extraction journal, and an associate editor of the Big Data and Cognitive Computing journal. He is also a reviewer for more than 30 international journals. He has acted as Program Co-Chair of the IEEE Conference on Smart Data (2016), the IEEE Conference on Big Data Science and Engineering (2017), and the IEEE International Congress on Big Data (2018). He has acted as guest editor for special issues in journals such as Information Sciences, Cognitive Computation, IEEE Access, and Big Data Analytics. His research interests include data mining, data reduction, biometrics, optimization, evolutionary algorithms, semi-supervised learning, bioinformatics and big data learning.

Alberto Fernández received the M.Sc. and Ph.D. degrees in computer science from the University of Granada, Granada, Spain, in 2005 and 2010, respectively. He is currently an Assistant Professor with the Department of Computer Science and Artificial Intelligence, University of Granada, Spain. He has published more than 100 papers in highly rated JCR journals and international conferences. In 2013, 2014, and 2017 Dr. Fernández received the University of Granada Prize for Scientific Excellence Works in the field of Engineering. In 2011 he was also awarded the Lotfi A. Zadeh Best Paper prize (IFSA Association). He has recently been selected as a Highly Cited Researcher http://highlycited.com (in the field of Computer Science, 2017 Clarivate Analytics). His research interests include classification in imbalanced domains, fuzzy rule learning, evolutionary algorithms, multiclassification problems with ensembles and decomposition techniques, and data science in big data applications.

Mikel Galar received the M.Sc. and Ph.D. degrees in Computer Science in 2009 and 2012, both from the Public University of Navarra, Pamplona, Spain. He is currently an assistant professor at the Department of Statistics, Computer Science and Mathematics at the Public University of Navarre. He is the author of 33 published original articles in international journals and more than 48 contributions to conferences. He is also a reviewer for more than 35 international journals. His research interests are data mining, classification, multi-classification, ensemble learning, evolutionary algorithms, fuzzy systems and big data. He is a member of the IEEE, the European Society for Fuzzy Logic and Technology (EUSFLAT) and the Spanish Association of Artificial Intelligence (AEPIA). He has received the extraordinary prize for his PhD thesis from the Public University of Navarre and the 2013 IEEE Transactions on Fuzzy Systems Outstanding Paper Award for the paper “A New Approach to Interval-Valued Choquet Integrals and the Problem of Ordering in Interval-Valued Fuzzy Set Applications” (bestowed in 2016).

Fuzzy Fusion of Decisions from Heterogeneous Deep Machine Learning Models

Organized by Grant Scott

This tutorial will teach participants three key skills that are critical to advancing research in computational intelligence:

1) use of Keras and TensorFlow;

2) deep learning models and transfer learning techniques; and

3) fuzzy machine learning model fusion.

The tutorial session will be broken into these three portions, each of which culminates in code examples that can be immediately migrated from the tutorial to the participants’ own research thrusts (theories and applications). The session will conclude with a case study demonstrating how the components have been tied together for scientific studies in remote sensing.
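The description above does not name a specific fusion operator. As one possible illustration of fuzzy decision fusion, the sketch below aggregates the confidences of several models with a discrete Choquet integral over a fuzzy measure. The choice of the Choquet integral, the function name `choquet_integral`, the model names, and all numeric values are assumptions made for this example, not the tutorial's material.

```python
def choquet_integral(scores, measure):
    """Discrete Choquet integral of per-source scores.

    scores:  dict mapping source name -> confidence in [0, 1]
    measure: dict mapping frozenset of source names -> fuzzy measure value,
             assumed monotone with measure[all sources] == 1.0
    """
    # Visit sources from the most to the least confident.
    ranked = sorted(scores, key=scores.get, reverse=True)
    total = 0.0
    for i, source in enumerate(ranked):
        next_score = scores[ranked[i + 1]] if i + 1 < len(ranked) else 0.0
        coalition = frozenset(ranked[: i + 1])
        # Weight each drop in confidence by the worth of the coalition seen so far.
        total += (scores[source] - next_score) * measure[coalition]
    return total


# Hypothetical confidences of three deep models for one class.
scores = {"cnn_a": 0.9, "cnn_b": 0.6, "cnn_c": 0.3}

# Hypothetical fuzzy measure (in practice this would be learned or set by an expert).
measure = {
    frozenset({"cnn_a"}): 0.5,
    frozenset({"cnn_b"}): 0.4,
    frozenset({"cnn_c"}): 0.3,
    frozenset({"cnn_a", "cnn_b"}): 0.8,
    frozenset({"cnn_a", "cnn_c"}): 0.7,
    frozenset({"cnn_b", "cnn_c"}): 0.6,
    frozenset({"cnn_a", "cnn_b", "cnn_c"}): 1.0,
}

fused = choquet_integral(scores, measure)

# With an additive, uniform measure the integral reduces to the plain average.
uniform = {subset: len(subset) / 3 for subset in measure}
averaged = choquet_integral(scores, uniform)
```

Unlike a weighted average, the measure assigns a worth to every coalition of models, so redundancy or complementarity between heterogeneous architectures can be expressed.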

Organizer bio:
Dr. Grant Scott is the Director of the MU Data Science and Analytics (DSA) MS Program at the University of Missouri, USA, where he also has an appointment in the EECS Department. The DSA program specializes in teaching data science as a life cycle, incorporating provenance of data and process for repeatability in scientific discoveries. Dr. Scott has been collaborating in research efforts at the MU Center for Geospatial Intelligence (CGI) on use of deep neural models for classification and object detection in high-resolution remote sensing imagery data. The CGI has numerous publications establishing it as the leader in high-resolution imagery classification and object detection for Earth electro-optical imagery, producing the state-of-the-art results on a broad spectrum of benchmark datasets — using both single architecture and heterogeneous architecture fusion.

Support Fuzzy-Set Machines: From Kernels on Fuzzy Sets to Machine Learning Applications

Organized by Jorge Guevara, Roberto Hirata, and Stephane Canu

Outline:
1. Introduction to kernels and kernel machines. This first component will provide attendees with the theoretical background and practical applications of kernel machines. We will focus on:
a. Reproducing kernel Hilbert space theory,
b. Kernel machines for solving classification, regression and anomaly detection, and
c. Practical issues and implementations

2. Kernels on fuzzy sets. This module will teach how to perform kernel engineering using tools from fuzzy mathematics. In particular, we will introduce and code:
a. The cross-product kernel on fuzzy sets
b. The intersection kernel on fuzzy sets
c. The distance-based kernel on fuzzy sets

3. Support fuzzy machines. This last part will also introduce practical aspects of how to use these machines successfully in machine learning tasks. We will see:
a. Definition of support fuzzy machines
b. Data-driven techniques for fuzzy modelling of datasets
c. Practical implementations on real datasets.

4. Further directions.

We will prepare a GitHub project for the tutorial which will include scripts, data, and practical examples.
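Until that project is available, the kernels listed in module 2 can be illustrated with a small sketch. It assumes discrete fuzzy sets stored as dictionaries (element to membership degree), an intersection kernel that sums a T-norm of the memberships over shared elements, and a cross-product kernel that combines a kernel on elements with a kernel on membership degrees. These formulations, the function names, and the data are assumptions for illustration; the definitions used in the tutorial may differ.

```python
import math


def intersection_kernel(X, Y, t_norm=lambda a, b: a * b):
    """Sum a T-norm of the two membership degrees over elements present in both sets."""
    return sum(t_norm(mu, Y[x]) for x, mu in X.items() if x in Y)


def rbf(a, b, gamma=1.0):
    """Gaussian kernel between two real-valued elements."""
    return math.exp(-gamma * (a - b) ** 2)


def cross_product_kernel(X, Y, k_elem=rbf, k_memb=lambda a, b: a * b):
    """Sum over all element pairs of (kernel on elements) * (kernel on memberships)."""
    return sum(
        k_elem(x, y) * k_memb(mu_x, mu_y)
        for x, mu_x in X.items()
        for y, mu_y in Y.items()
    )


# Two hypothetical discrete fuzzy sets on the real line: element -> membership degree.
X = {0.0: 0.2, 1.0: 1.0, 2.0: 0.3}
Y = {1.0: 0.5, 2.0: 0.9, 3.0: 0.1}

k_int = intersection_kernel(X, Y)
k_cross = cross_product_kernel(X, Y)
```

Either function can be used to fill a Gram matrix over a collection of fuzzy sets, which is what a kernel machine such as an SVM consumes.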

Organizer bio:
Jorge Guevara is a Research Scientist at IBM Research. He actively performs research in Artificial Intelligence, providing smart data-driven solutions to problems arising in oil and gas companies and applying Machine Learning and Data Science solutions in the Natural Resources area. He is very interested in the study of Artificial Intelligence, from theoretical to practical perspectives, for example, investigating how we can make AI more explainable and interpretable, and how industry and society can benefit from this. He holds a Ph.D., an MSc, and a BSc in Computer Science. His areas of interest are Machine Learning, Data Science, Deep Learning, Speech Recognition, Signal Processing and Statistical Learning Theory. In the past, he worked on Kernel Methods, Fuzzy Theory and Speech Recognition.

Roberto Hirata Jr. is an Associate Professor of the Institute of Mathematics and Statistics (IME) of the University of São Paulo (USP). He received a degree in Physics (Institute of Physics – USP) and one in Mathematics (IME-USP). He received an MSc degree in Computer Science (IME-USP-1997) for his work on morphological segmentation and fast algorithms for basic morphological operators and a PhD degree in Computer Science (IME-USP-2001) for his work on Aperture operators, a class of operators that can be automatically designed using Machine Learning (ML). As part of his PhD training, he worked under Prof. Dr. Edward Russel Dougherty at the Texas A&M University. His work on data analysis and machine learning for bioinformatics was fruitful and he also participated in two cover papers in the prestigious Cancer Research journal. His latest works on ML are in the areas of Computer Vision and Fuzzy Logic kernel methods. The main applications of the latter are anomaly detection and the analysis of imprecise and uncertain data. He has advised nine MSc and four PhD projects and he is author or co-author of more than twenty papers in journals or conferences in the last five years.

Stéphane Canu is a Professor of the LITIS research laboratory and of the information technology department, at the National institute of applied science in Rouen (INSA). He received a Ph.D. degree in System Command from Compiègne University of Technology in 1986. He joined the faculty department of Computer Science at Compiègne University of Technology in 1987. He received the French habilitation degree from Paris 6 University. In 1997, he joined the Rouen Applied Sciences National Institute (INSA) as a full professor, where he created the information engineering department. He was the dean of this department until 2002, when he was named director of the computing service and facilities unit. In 2004 he joined, for one sabbatical year, the machine learning group at ANU/NICTA (Canberra) with Alex Smola and Bob Williamson. In the last five years, he has published approximately thirty papers in refereed conference proceedings or journals in the areas of theory, algorithms and applications using kernel machine learning algorithms and other flexible regression methods. His research interests include kernel machines, regularization, machine learning applied to signal processing, pattern classification, factorization for recommender systems and learning for context-aware applications.

Fuzzy-Rough Data Mining (Using the Weka Data Mining Suite)

Organized by Richard Jensen and Neil Mac Parthalain

Outline:
The material to be covered includes:

• An introduction to the areas of knowledge discovery and data mining

• An introduction to the principal concepts of rough sets and fuzzy-rough sets for data mining

• Feature selection and fuzzy-rough feature selection, along with extensions to handle noisy data, missing values, and unsupervised data.

• Classification, including fuzzy-rough nearest neighbour (NN) and rule induction

• Data instance/object selection, prototype selection, and data sub-table selection

• An introduction to the Weka data mining suite

• A demonstration of the above-described data mining tasks using the Weka environment, including some alternative (nature-inspired) search techniques for performing feature selection and classification

• A demonstration of other useful tools for data mining in Weka, such as dealing with missing values, and how to use the Experimenter and Knowledge-Flow interfaces
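To give a flavour of the fuzzy-rough feature selection item above, the sketch below computes a fuzzy-rough dependency degree for a subset of numeric features. It is a minimal sketch in plain Python, not the Weka implementation: it assumes a range-normalised similarity per feature, the minimum T-norm to combine features, the Łukasiewicz implicator, and crisp decision classes. All function names and the toy data are hypothetical.

```python
def similarity(a, b, value_range):
    """Fuzzy similarity of two values of one feature, in [0, 1]."""
    return max(0.0, 1.0 - abs(a - b) / value_range)


def relation(x, y, features, ranges):
    """Fuzzy similarity of two objects over a feature subset (minimum T-norm)."""
    return min(similarity(x[f], y[f], ranges[f]) for f in features)


def lukasiewicz_implicator(a, b):
    return min(1.0, 1.0 - a + b)


def dependency(data, labels, features, ranges):
    """Average membership of each object to the lower approximation of its own class."""
    total = 0.0
    for i, x in enumerate(data):
        lower = min(
            lukasiewicz_implicator(
                relation(x, y, features, ranges),
                1.0 if labels[j] == labels[i] else 0.0,
            )
            for j, y in enumerate(data)
        )
        total += lower
    return total / len(data)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
data = [(0.0, 0.5), (0.1, 0.9), (0.9, 0.5), (1.0, 0.9)]
labels = [0, 0, 1, 1]
ranges = [max(col) - min(col) for col in zip(*data)]

gamma_f0 = dependency(data, labels, [0], ranges)
gamma_f1 = dependency(data, labels, [1], ranges)
```

A greedy selector would repeatedly add the feature that increases the dependency degree the most; on this toy data feature 0 scores higher than feature 1.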

Organizer bio:
Richard Jensen received the B.Sc. degree in Computer Science from Lancaster University, UK, and the M.Sc. and Ph.D. degrees in Artificial Intelligence from the University of Edinburgh, UK. He is a Lecturer with the Department of Computer Science at Aberystwyth University, working in the Advanced Reasoning Group. His research interests include rough and fuzzy set theory; pattern recognition; information retrieval; feature selection; and swarm intelligence. He has published over 70 peer-refereed articles in these areas (h-index: 32), including a best paper award winner. He authored the research monograph “Computational Intelligence and Feature Selection: Rough and Fuzzy Approaches”, published jointly by IEEE/Wiley. He is an AE for the International Journal of Approximate Reasoning and is on the editorial board of Transactions on Rough Sets amongst others, as well as on the advisory board of the International Rough Set Society. He has organised several special sessions on fuzzy rough sets for the IEEE International Conference on Fuzzy Systems and the Joint Rough Set Symposium.

Neil Mac Parthaláin is a Research Fellow with the Advanced Reasoning Group at the Department of Computer Science, Aberystwyth University, Wales, UK. His areas of research include rough set theory, fuzzy set theory, pattern recognition, feature selection, classification and medical imaging and applications. He has published over 50 peer-refereed conference papers and academic journal articles in these and related areas. He was a member of the organising committee for the 16th International Conference on Fuzzy Systems (FUZZ-IEEE 2007), London and has been involved with the organization of a number of special sessions at the IEEE series of International Conferences on Fuzzy Systems.

Sculpting the State Space: A New Way to Establish and Explain the Potential for Improved Performance When Using Rule-Based Fuzzy Systems

Organized by Jerry Mendel

Since Zadeh’s seminal 1965 paper, a very important application for fuzzy sets has been rule-based fuzzy systems. When such systems use type-1 (interval type-2 or general type-2) fuzzy sets they are called type-1 (interval type-2 or general type-2) fuzzy systems. Thousands of articles (including books) have been published about such fuzzy systems and invariably they demonstrate that better performance, as measured by an application’s performance metric(s), is achieved by: (1) a type-1 fuzzy system over a non-fuzzy system, (2) an interval type-2 fuzzy system over a type-1 fuzzy system, and (3) a general type-2 fuzzy system over an interval type-2 fuzzy system.

A crucial question is: Why does improved performance occur as one goes from crisp, to type-1, to interval type-2, to general type-2 fuzzy systems? This question is crucial because many people do not like (are prejudiced about) fuzzy systems, on general principles, and unless we are able to answer it up-front, they will ignore what we are doing. To paraphrase Martin Luther King, we must overcome this prejudice, and this tutorial will help us to achieve this.

A traditional comparative performance analysis begins with a specific application and is performed entirely within the context of that application, with each application requiring its own performance analysis. The proposer’s contention is that there should be a common component to all performance analyses, after which the rest of the performance analysis is application-dependent.

This tutorial provides new and novel answers to the above crucial question that are in the spirit of “a common component to all performance analyses,” for type-1, interval type-2 and general type-2 fuzzy systems. It treats both singleton and non-singleton fuzzifiers so that one can also explain, in new and novel ways, what the potential benefits are for using non-singleton fuzzifiers. All of this is done using the following four kinds of partitions:

(1) Uncertainty partitions that let one distinguish type-1 fuzzy sets from crisp sets, interval type-2 fuzzy sets from type-1 fuzzy sets, and general type-2 fuzzy sets from interval type-2 fuzzy sets;

(2) First-order rule partitions that are direct results of uncertainty partitions, are associated with the number of rules that fire in different regions of the state space, and lead to a coarse sculpting of the state space;

(3) Second-order rule partitions that are associated with changes in the slopes of membership functions within first-order rule partitions, and lead to a fine sculpting of the state space; and,

(4) Novelty partitions that only occur in interval or general type-2 fuzzy systems that use type-reduction.

Rule and novelty partitions sculpt the state space into hyper-rectangles within each of which resides a different nonlinear function (which is why a rule-based fuzzy system is a variable-structure system).
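As a simplified, single-variable illustration of first-order rule partitions, the sketch below splits an interval of the state space into regions in which the same set of rules fires. It assumes that each rule has one antecedent whose membership function is positive exactly on an open interval (its support); the function name and the numeric supports are hypothetical, and this is not the tutorial's procedure.

```python
def first_order_partitions(supports, lo, hi):
    """Split [lo, hi] into regions where the same antecedents have positive membership.

    supports: list of (a, b) open intervals, one per rule antecedent.
    Returns a list of (left, right, indices_of_fired_rules).
    """
    # Region boundaries are the support endpoints that fall inside the domain.
    cuts = sorted({lo, hi, *[p for a, b in supports for p in (a, b) if lo < p < hi]})
    parts = []
    for left, right in zip(cuts, cuts[1:]):
        mid = (left + right) / 2.0
        fired = tuple(i for i, (a, b) in enumerate(supports) if a < mid < b)
        parts.append((left, right, fired))
    return parts


# Three crisp sets that tile [0, 10]: exactly one rule fires everywhere.
crisp = [(-1.0, 3.0), (3.0, 7.0), (7.0, 11.0)]

# Three overlapping type-1 fuzzy sets (for example triangles or trapezoids).
type1 = [(-1.0, 4.0), (2.0, 8.0), (6.0, 11.0)]

crisp_parts = first_order_partitions(crisp, 0.0, 10.0)
type1_parts = first_order_partitions(type1, 0.0, 10.0)
```

With these supports the overlapping type-1 sets produce more regions than the crisp sets, and two rules fire in the overlap regions. This is the coarse sculpting described in item (2); second-order and novelty partitions are not modelled here.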

The goal of this tutorial is to provide its attendees with procedures for creating rule and novelty partitions that will let them explain the greater sculpting of the state space by:

• A type-1 fuzzy system that provides it with the potential to outperform a crisp system

• An interval type-2 fuzzy system that provides it with the potential to outperform a type-1 fuzzy system (and the latter can occur even when type-1 and interval type-2 fuzzy systems are described by the same number of parameters)

• The even greater sculpting of the state space by a general type-2 fuzzy system that provides it with the potential to outperform an interval type-2 fuzzy system

• A non-singleton fuzzy system that provides it with the potential to outperform a singleton fuzzy system

Organizer bio:
Jerry M. Mendel received the Ph.D. degree in electrical engineering from the Polytechnic Institute of Brooklyn, Brooklyn, NY. Currently, he is Emeritus Professor of Electrical Engineering at the University of Southern California in Los Angeles, where he has been since 1974. He is also a Tianjin 1000-Talents Foreign Experts Plan Endowed Professor, and Honorary Dean of the College of Artificial Intelligence, Tianjin Normal University, Tianjin, China. He has published over 570 technical papers and is author and/or co-author of 13 books, including Uncertain Rule-based Fuzzy Systems: Introduction and New Directions, 2nd ed. (Springer 2017), Perceptual Computing: Aiding People in Making Subjective Judgments (Wiley & IEEE Press, 2010), and Introduction to Type-2 Fuzzy Logic Control: Theory and Application (Wiley & IEEE Press, 2014). He is a Life Fellow of the IEEE, a Distinguished Member of the IEEE Control Systems Society, and a Fellow of the International Fuzzy Systems Association. He was President of the IEEE Control Systems Society in 1986, a member of the Administrative Committee of the IEEE Computational Intelligence Society for nine years, and Chairman of its Fuzzy Systems Technical Committee and the Computing With Words Task Force of that TC. Among his awards are the 1983 Best Transactions Paper Award of the IEEE Geoscience and Remote Sensing Society, the 1992 Signal Processing Society Paper Award, the 2002 and 2014 Transactions on Fuzzy Systems Outstanding Paper Awards, a 1984 IEEE Centennial Medal, an IEEE Third Millennium Medal, a Fuzzy Systems Pioneer Award (2008) from the IEEE Computational Intelligence Society for “fundamental theoretical contributions and seminal results in fuzzy systems”, and the 2015 USC Viterbi School of Engineering Senior Research Award. His present research interests (yes, he is still performing research with many colleagues around the globe) include type-2 fuzzy logic systems and computing with words.