It also provides a powerful way to calculate the importance degree of vague and uncertain big data to. It is useful for dealing with indiscernibility of objects caused by. Puts forward a xml mining model based on rough set theory. Data mining, data tables, distributed data mining ddm.
Discriminant versus rough set approach to vague data analysis. Based on rough sets 4 and the concept of lower and upper boundary sets 5, we introduce a method for updating approximations by considering adding and. Download data mining tutorial pdf version previous page print page. Book description practical applications of data mining emphasizes both theory and applications of data mining algorithms. Relationships exist between rough set theory and dempstershafers theory of. This section presents the concepts of rough set theory.
It is a new mathematical tool to deal with partial information. Various topics of data mining techniques are identified and described throughout, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. Rough set theory 7 is a new mathematical approach to data analysis and data mining. Data mining, decision tables, rough set, rule extraction. Rough set theory was introduced by zdzislaw pawlak in 1982. Rough set theory and zadehs fuzzy set theory are two independent approaches to deal with uncertainty. The general experimental procedure adapted to data mining problems involves the following steps. For the rough set theory, in the process of data mining, there are still a large number. Hongmei chen, tianrui li, ieee senior member, chuan luo, shijinn horng, ieee member. Chapter 2 rough sets and reasoning from data presents the application of rough set concept to reason from data data mining.
On rough set based approaches to induction of decision. We can use rough set approach to discover structural relationship within imprecise and noisy data. And combining with probability logic, random truth degree of rough logic can be studied in the future. Research on data mining algorithm based on rough set. The authors express that rough data set theory is not the only discipline in which discretization is. A characteristic set, a generalization of the elementary set wellknown in rough set theory, may be computed using such blocks. Dominancebased rough set approach for group decisions, european journal of operational research, 2511.
Pawlak as a mathematical ap proach to deal with vagueness and uncertainty in data analysis 50, 51. The advances in locationacquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles and animals. The monograph can be used under the creation of courses for graduate students and for ph. A decisiontheoretic rough set approach for dynamic data mining. A rough set approach to attribute generalization in data.
Finally, some description about applications of the data mining system with rough set theory is included. This book provides stateoftheart research results on intrusion detection using reinforcement learning, fuzzy and rough set theories, and genetic algorithm and serves wide range of applications, covering general computer security to server, network, and cloud security. Rough set theory has been a methodology of database mining or knowledge discovery in relational databases. Consequently, the theoretical part is minimized to emphasize the practical application side of the rough set approach. Another methodology which has high relevance to data mining and plays a central role in this volume is that of. A characteristic set, a generalization of the elementary set wellknown in rough set. Rough set theory rst is a mathematical approach that handles uncertainty and is capable of discovering knowledge from a database.
Rule induction from a decision table using rough sets theory. Mining incomplete dataa rough set approach springerlink. Data mining is a process of discovering various models, summaries, and derived values from a given collection of data. Relationships exist between rough set theory and dempstershafers theory of evidence.
Rough set theory and its applications ua computer science. Data mining is an emerging powerful tool for analysis and prediction. Risk analysis technique on inconsistent interview big data. Rough set approach, fuzzy set approachs, prediction, linear and multipleregression. Rough set theory provides a simple and elegant method for analyzing data. In this perspective, granular computing has a position of centrality in data mining. Clustering in data mining algorithms of cluster analysis.
And combining with probability logic, random truth. Rough set theory fundamental concepts, principals, data. Through indepth study on the existing rough set and data mining technologies, for the shortcomings of the existing data mining algorithms based on rough set, this paper presents an improved algorithm. A partition of u is a family of mutually disjoint nonempty subsets of u, called blocks, such that the union of all blocks is u. The results obtained in the monograph can be useful for researchers in such areas as machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test. A rough set knowledge discovery framework is formulated for the analysis of.
Chapter 1 basic concepts contains general formulation of basic ideas of rough set theory together with brief discussion of its place in classical set theory. Rule generation from raw data is a very effective and most widely used tool of data mining. This paper discusses the basic concepts of rough set theory and point out some rough set based research directions and applications. Real life data are frequently imperfect, erroneous, incomplete, uncertain and vague. It is useful for dealing with indiscernibility of objects caused by incomplete or limited information. The authors have broken the discussion into two sections, each with a specific theme. The chapter is focused on the data mining aspect of the applications of rough set theory. This paper, introduces the fundamental concepts of rough set theory and other aspects of data mining, a discussion of data representation with rough set theory including pairs of attributevalue blocks, information tables reducts, indiscernibility relation and. A rough set approach to data mining this paper reports our experiences with the application of the hierarchy of probabilistic decision tables to face recognition.
A rough set approach for generation and validation of. Visualization of data through data mining software is addressed. Based on the rough set theory, the rough logic and its deduction theory system can be established. Rough set approach to the analysis of the structureactivity relationship of quaternary imidazolium compounds.
This paper introduces a new approach for mining ifthen rules in databases with uncertainty and incompleteness. Due to the xml document is a kind of semistructured data, using the traditional data mining methods for mining of xml data is not applicable. The results considered in this book can be useful for researchers in machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory and logical. After 15 year of pursuing rough set theory and its application the theory has reached a certain degree of maturity. Rough set can be used as a tool to generate rules form decision table in data mining. Comparative analysis between rough set theory and data. Comparative analysis between rough set theory and data mining. Mining incomplete dataa rough set approach jerzy w.
Many techniques have been proposed for processing, managing and mining trajectory data in the past decade, fostering a broad range of applications. Theory and application on rough set, fuzzy logic, and. The main aim is to show how rough set and rough set analysis can be effectively used to extract knowledge from large databases. A survey on rough set theory and its applications sciencedirect. The rough set approach 7 to data analysis has many important advantages that provide efficient algorithms for finding hidden patterns in data, finds minimal sets of data, evaluates significance of data, generates. Approximation can further be applied to data mining related task, e. Rough set theory is relativly new to area of soft computing to handle the uncertain big data efficiently. Data mining, data tables, distributed data mining ddm, rough sets. Intrusion detection a data mining approach nandita. Data mining technology has emerged as a means for identifying patterns and trends from large quantities o. It also provides a powerful way to calculate the importance degree of vague and uncertain big data to help in decision making. Summarization providing a more compact representation of the data set, including visualization and report. Rough set theory, data mining, decision table, decision rule, data representation. A rough set approach to attribute generalization in data mining.
Additionally, the rough set approach to lower and upper approximations and certain possible rule sets concepts are introduced. The concept of rough, or approximation, set s was introduced by pawlak, and is based on the single assumption that information is associated with. Analysis of imprecise data is an edited collection of research chapters on the most recent developments in rough set theory and data mining. This book provides stateoftheart research results on intrusion detection using reinforcement learning, fuzzy and rough set theories, and genetic algorithm and serves wide range of applications, covering. Decision rule induction for service sector using data. Consequently, the theoretical part is minimized to emphasize the practical application side of the rough set approach in the context of data analysis and modelbuilding applications. The advances in locationacquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles. There are so many approaches for handling missing attribute values.
In this data mining clustering method, a model is hypothesized for each cluster to find the best fit of data for a given model. The approach is based on the combination of generalization distribution table gdt and the rough set methodology. Pdf a decisiontheoretic rough set approach for dynamic data. A rough set based method for updating decision rules on attribute values coarsening and refining, ieee transactions on knowledge and data engineering, 2612. In this thesis, a rulebased rough set decision system is. Also, this method locates the clusters by clustering the density. Pdf this article comments on data mining and rough set theory, regarding the article myths about rough set. Granular computing is an emerging computing paradigm of information processing. For the purposes of analysis and decision support in the business area in many cases data mining using rough set theory is used. The rough set approach 7 to data analysis has many important advantages that.
Data mining and knowledge discovery in real life applications 36 outset, rough set theory has been a methodology of database mining or knowledge discovery in relational databases. The results considered in this book can be useful for researchers in machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory and logical analysis of data. However, the clustering algorithms for categorical data are few and are unable to handle uncertainty. It demonstrates this process with a typical set of data. Inhibitory rules in data analysis a rough set approach. A rough set approach for generation and validation of rules. A crucial concept in the rough set approach to machine learning is that of. New directions in rough sets, data mining, and granularsoft computing, lnai. The customer related data are categorical in nature. In recent years we witnessed a rapid grow of interest in rough set theory and its application, world wide. The dominancebased rough set approach drsa is an extension of rough set theory for multicriteria decision analysis mcda, introduced by greco, matarazzo and slowinski. Chapter 1 gives an overview of data mining, and provides a description of the data mining process. Parallel computing of approximations in dominancebased rough sets approach, knowledgebased systems, 87. A convenient way to present equivalence relations is through partitions.
Rough set theory was proposed by pawlak 17 in 1982, which o ers a mathematical approach to data analysis and data mining 11,1821. A rough set approach for generation and validation of rules for missing attribute values of a data set. Introduction modern organizations use several types of decision support systems to facilitate decision support. Seminal description of datamining approaches with reference. Fuzzyrough data mining with weka aberystwyth university. Rough sets theory is a new mathematical approach used in the intelligent data analysis and data mining if data is uncertain or incomplete. A rough set approach for the discovery of classification rules in.
The data mining technology instead of classic statistical analysis is developed to help the people to discover the knowledge inside of the data. A rough set is a formal approximation of a crisp set in terms of a pair of sets that give the lower and upper approximation of the original set learn more in. The chapters in this work cover a range of topics that focus on discovering dependencies among data, and reasoning about vague, uncertain and imprecise information. The rough set theory, which originated in the early 1980s, provides an alternative approach to the fuzzy set theory, when dealing with uncertainty, vagueness or inconsistence often. For the rough set theory, in the process of data mining, there are still a large number of problems need to be discussed, such as large data sets, efficient reduction algorithm, parallel computing, hybrid algorithm, etc. The results obtained in the monograph can be useful for researchers in such areas as machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory, and logical analysis of data lad. Reduce cost of mailing by targeting a set of consumers likely to buy a new cell phone product approach. Sets, fuzzy sets and rough sets our digital library. The rough set theory offers a viable approach for decision rule extraction from data. In its abstract form, it is a new area of uncertainty mathematics closely related to fuzzy theory.
Chapter 2 presents the data mining process in more detail. An overview of useful business applications is provided. A gdt is a table in which the probabilistic relationships between concepts and instances over discrete domains are. This paper, introduces the fundamental concepts of rough set theory and other aspects of data mining, a. Risk assessment is very important for safe and reliable investment. Recently, the rough set and fuzzy set theory have generated a great deal of interest among more and more researchers. Finally, some description about applications of the data mining system with. Various topics of data mining techniques are identified and described throughout. Performance analysis and prediction in educational data. Granular computing is an emerging computing paradigm of information processing and an approach for knowledge representation and data mining. This overview provides a description of some of the most common data mining algorithms in use today. The approach is based on the combination of generalization distribution table gdt.
555 1002 1088 511 1502 450 610 128 1262 874 710 991 478 1010 1057 1514 233 1100 1282 933 1186 1245 1429 1407 519 1487 244 1300 712 911 1135 1115 1125 1106 114 1221 99 1309 276 947 1468 260 631 981 364