Augmenting Naïve Bayes Classifiers with <i>k</i>-Tree Topology


Bibliographic Details
Main Authors: Fereshteh R. Dastjerdi, Liming Cai
Format: Article
Language: English
Published: MDPI AG, 2025-07-01
Series: Mathematics
Subjects: Bayesian networks; Naïve Bayes augmentation; classification; mutual information; KL-divergence; treewidth
Online Access: https://www.mdpi.com/2227-7390/13/13/2185
Collection: DOAJ
Description: The Bayesian network is a directed acyclic graphical model that offers a structured description of the probabilistic dependencies among random variables. As powerful tools for classification tasks, Bayesian classifiers often require computing joint probability distributions, which can be computationally intractable due to potential full dependencies among feature variables. At the other extreme, Naïve Bayes, which assumes zero dependencies among features, trades accuracy for efficiency and often underperforms. As a result, non-zero dependency structures, such as trees, are often used as more feasible probabilistic graph approximations; in particular, Tree Augmented Naïve Bayes (TAN) has been demonstrated to outperform Naïve Bayes and has become a popular choice. For applications where a variable is strongly influenced by multiple other features, TAN has been further extended to the <i>k</i>-dependency Bayesian classifier (KDB), in which one feature can depend on up to <i>k</i> other features (for a given <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>k</mi><mo>≥</mo><mn>2</mn></mrow></semantics></math></inline-formula>). In such cases, however, the selection of the <i>k</i> parent features for each variable is often made through heuristic search methods (such as sorting), which do not guarantee an optimal approximation of the network topology. In this paper, the novel notion of <i>k</i>-tree Augmented Naïve Bayes (<i>k</i>-TAN) is introduced to augment Naïve Bayesian classifiers with <i>k</i>-tree topology as an approximation of Bayesian networks. It is proved that, under the Kullback–Leibler divergence measure, a <i>k</i>-tree topology approximation of a Bayesian classifier loses the minimum amount of information when its topology is that of a maximum spanning <i>k</i>-tree, where the edge weights of the graph are the mutual information between random variables conditional upon the class label.
In addition, while finding a maximum spanning <i>k</i>-tree is in general NP-hard for fixed <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>k</mi><mo>≥</mo><mn>2</mn></mrow></semantics></math></inline-formula>, this work shows that the approximation problem can be solved in time <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>O</mi><mo>(</mo><msup><mi>n</mi><mrow><mi>k</mi><mo>+</mo><mn>1</mn></mrow></msup><mo>)</mo></mrow></semantics></math></inline-formula> if the spanning <i>k</i>-tree is also required to retain a given Hamiltonian path in the graph. This algorithm can therefore be employed to ensure efficient approximation of Bayesian networks by <i>k</i>-tree augmented Naïve Bayesian classifiers with guaranteed minimum loss of information.
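The construction the abstract builds on — weighting feature pairs by class-conditional mutual information and extracting a maximum spanning structure — can be illustrated for the basic TAN (tree, i.e. <i>k</i> = 1) case. The sketch below is illustrative only: the function names and the plug-in probability estimator are not from the paper, and the paper's <i>k</i>-TAN generalizes the spanning tree to a maximum spanning <i>k</i>-tree.

```python
import numpy as np

def conditional_mutual_information(x, y, c):
    """Plug-in estimate of I(X; Y | C) from discrete samples."""
    x, y, c = np.asarray(x), np.asarray(y), np.asarray(c)
    cmi = 0.0
    for cv in np.unique(c):
        mask = c == cv
        p_c = mask.mean()                      # P(C = cv)
        xs, ys = x[mask], y[mask]
        for xv in np.unique(xs):
            for yv in np.unique(ys):
                p_xy = np.mean((xs == xv) & (ys == yv))   # P(x, y | c)
                if p_xy == 0.0:
                    continue
                p_x = np.mean(xs == xv)                   # P(x | c)
                p_y = np.mean(ys == yv)                   # P(y | c)
                cmi += p_c * p_xy * np.log(p_xy / (p_x * p_y))
    return cmi

def max_spanning_tree(n_vars, weights):
    """Kruskal's algorithm: maximum-weight spanning tree over
    variables 0..n_vars-1, given weights as {(i, j): w}."""
    parent = list(range(n_vars))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]      # path halving
            i = parent[i]
        return i
    tree = []
    for (i, j), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:                           # joins two components
            parent[ri] = rj
            tree.append((i, j))
    return tree

def tan_edges(features, labels):
    """Undirected skeleton of a TAN structure: maximum spanning tree
    under class-conditional mutual information edge weights."""
    n = features.shape[1]
    weights = {(i, j): conditional_mutual_information(features[:, i],
                                                      features[:, j],
                                                      labels)
               for i in range(n) for j in range(i + 1, n)}
    return max_spanning_tree(n, weights)
```

Rooting the resulting tree and adding the class node as a parent of every feature yields the TAN classifier; <i>k</i>-TAN replaces the spanning tree with a maximum spanning <i>k</i>-tree under the same edge weights, computed by the Hamiltonian-path-constrained algorithm described in the abstract.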
Record ID: doaj-art-5317493ffbcb4d01a955f0378e9173cc
Institution: Kabale University
ISSN: 2227-7390
DOI: 10.3390/math13132185
Citation: Mathematics, vol. 13, no. 13, art. 2185, 2025-07-01
Author Affiliations: Fereshteh R. Dastjerdi and Liming Cai, School of Computing, University of Georgia, Athens, GA 30602, USA