Theoretical and Natural Science


Vol. 34, 02 April 2024


Open Access | Article

FT_HTlist: A fault-tolerant frequent itemset mining algorithm based on the linear table

Xingyue Li 1 , Jun Lu * 2
1 Heilongjiang University
2 Heilongjiang University

* Author to whom correspondence should be addressed.

Theoretical and Natural Science, Vol. 34, 72-76
Published 02 April 2024. © 2023 The Author(s). Published by EWA Publishing
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Citation Xingyue Li, Jun Lu. FT_HTlist: A fault-tolerant frequent itemset mining algorithm based on the linear table. TNS (2024) Vol. 34: 72-76. DOI: 10.54254/2753-8818/34/20241168.

Abstract

This paper proposes FT_HTlist, a fault-tolerant frequent itemset mining algorithm based on the linear table, for the case where the fault tolerance is 1. The algorithm generates candidate fault-tolerant patterns, called FT_Candidate, by concatenating a 1 at the highest bit of the binary number of each known fault-tolerant frequent pattern. Mining operates on a linear-table data structure and needs no recursion, which reduces the space consumed during mining. The paper also proposes a deduplication algorithm that removes support that would otherwise be counted more than once, which gives the algorithm a strong advantage in space performance. In addition, the algorithm only needs to mine two horizontal chains of the FT_Candidate, which reduces mining time. Finally, the paper reports the time and space performance of the proposed algorithm on sparse and dense datasets. The results show that the algorithm achieves shorter mining times than the other algorithms compared, and that the horizontal chain reduces its memory occupation.
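
The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea, not the authors' FT_HTlist implementation. It assumes that itemsets and transactions are encoded as integer bitmasks (bit i set means item i is present), that a transaction supports a pattern when it misses at most one of the pattern's items (fault tolerance 1), and that candidates are formed by setting the next higher bit on each already known frequent pattern. The names `ft_support` and `mine_ft_itemsets` are hypothetical, and the horizontal chains, the deduplication step, and any per-item support constraint are deliberately left out.

```python
from typing import List, Tuple


def ft_support(pattern: int, transactions: List[int], tolerance: int = 1) -> int:
    """Count transactions that miss at most `tolerance` items of `pattern`."""
    count = 0
    for t in transactions:
        # Bits set in the pattern but absent from the transaction.
        missing = bin(pattern & ~t).count("1")
        if missing <= tolerance:
            count += 1
    return count


def mine_ft_itemsets(
    transactions: List[int],
    n_items: int,
    min_support: int,
    tolerance: int = 1,
) -> List[Tuple[int, int]]:
    """Iteratively (no recursion) collect (pattern, support) pairs in a flat list."""
    frequent: List[Tuple[int, int]] = []  # plays the role of a linear table
    for i in range(n_items):
        high = 1 << i
        # Candidates: item i alone, plus every known pattern with bit i
        # placed in front of its binary number.
        candidates = [high] + [p | high for p, _ in frequent]
        for c in candidates:
            s = ft_support(c, transactions, tolerance)
            if s >= min_support:
                frequent.append((c, s))
    return frequent


if __name__ == "__main__":
    # Items: a = bit 0, b = bit 1, c = bit 2.
    db = [0b011, 0b111, 0b100, 0b001]
    for pattern, support in mine_ft_itemsets(db, n_items=3, min_support=3):
        print(f"{pattern:03b} support={support}")
```

Because a transaction that misses at most one item of a pattern also misses at most one item of any of its subsets, this simplified support measure never increases when a pattern grows, which is why the sketch extends only patterns already found frequent.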

Keywords

The linear table, Fault-tolerance, Mining algorithm

Data Availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Volume Title: Proceedings of the 3rd International Conference on Computing Innovation and Applied Physics
ISBN (Print): 978-1-83558-369-2
ISBN (Online): 978-1-83558-370-8
Published Date: 02 April 2024
Series: Theoretical and Natural Science
ISSN (Print): 2753-8818
ISSN (Online): 2753-8826
DOI: 10.54254/2753-8818/34/20241168
Copyright: 02 April 2024