Theoretical and Natural Science
- The Open Access Proceedings Series for Conferences
Vol. 34, 02 April 2024
This paper proposes FT_HTlist, a fault-tolerant frequent itemset mining algorithm based on the linear table, for the case where the fault tolerance is 1. The algorithm generates candidate fault-tolerant patterns, called FT_Candidate, by concatenating a 1 at the highest bit of the binary representation of the known fault-tolerant frequent patterns. Because mining operates on the linear table data structure and needs no recursion, it reduces the space consumed during mining. The paper also proposes a deduplication algorithm that eliminates repeated support calculations, which gives the algorithm a strong advantage in space performance. In addition, the algorithm only needs to mine two horizontal chains of the FT_Candidate, which reduces mining time. Finally, the paper reports the time and space performance of the proposed algorithm on sparse and dense datasets. The results show that the algorithm achieves better mining time than the other algorithms compared, and that the horizontal chain reduces its memory usage.
The linear table, Fault-tolerance, Mining algorithm
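The candidate-generation idea in the abstract can be illustrated with a minimal sketch. This is not the FT_HTlist implementation: the linear table, the horizontal chains, and the deduplication step are not modeled here. The sketch assumes that each item maps to one bit of an integer, that "concatenating a 1 at the highest bit" means setting one bit above the pattern's current highest set bit, and that a transaction supports a pattern when it misses at most one of the pattern's items. The names `extend_pattern`, `ft_support`, and `mine` are hypothetical, and the per-item support constraints used in many fault-tolerant mining definitions are omitted for brevity.

```python
def extend_pattern(pattern: int, n_items: int) -> list[int]:
    """Return candidates formed by setting one bit above the highest set bit.

    Assumption: this is one reading of "concatenating 1 in the highest bit";
    it guarantees each itemset is generated from exactly one shorter prefix.
    """
    start = pattern.bit_length()  # index just above the highest set bit
    return [pattern | (1 << k) for k in range(start, n_items)]


def ft_support(pattern: int, transactions: list[int], tolerance: int = 1) -> int:
    """Count transactions that miss at most `tolerance` items of the pattern."""
    count = 0
    for t in transactions:
        missing = pattern & ~t  # pattern items absent from this transaction
        if bin(missing).count("1") <= tolerance:
            count += 1
    return count


def mine(transactions: list[int], n_items: int, min_support: int,
         tolerance: int = 1) -> dict[int, int]:
    """Iterative (non-recursive) mining loop driven by an explicit worklist."""
    frequent: dict[int, int] = {}
    worklist = [1 << k for k in range(n_items)]  # start from single items
    while worklist:
        candidate = worklist.pop()
        support = ft_support(candidate, transactions, tolerance)
        if support >= min_support:
            frequent[candidate] = support
            # Only frequent patterns are extended; supersets of an
            # infrequent pattern cannot be frequent under this definition.
            worklist.extend(extend_pattern(candidate, n_items))
    return frequent


# Items a, b, c map to bits 0, 1, 2.
transactions = [0b111, 0b011, 0b001, 0b100]
result = mine(transactions, n_items=3, min_support=3)
```

With a tolerance of 1, every single-item pattern is trivially supported by every transaction under this simplified definition. The explicit worklist stands in for recursion, which is the property the abstract credits for the lower space consumption.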
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.