Introduction
Naive Bayes classification is a very simple classification algorithm. The name reflects how simple the underlying idea is, and in particular its "naive" assumption that the features are independent of one another. The idea is this: for a given item to be classified, compute the probability of each category conditional on the item's observed features, and assign the item to the category with the largest probability. For example, if a fruit is red, round, and about 3 inches in diameter, it can be judged to be an apple. Although these features may depend on each other, or some may be determined by others, the naive Bayes classifier treats each of them as contributing independently to the probability that the fruit is an apple. For certain types of probability models, naive Bayes classifiers can be trained very effectively in a supervised learning setting. In many practical applications, the parameters of a naive Bayes model are estimated by maximum likelihood; in other words, one can work with a naive Bayes model without accepting Bayesian probability or using any Bayesian methods.
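The decision rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the training items, the feature names (color, shape, size) and the add-one smoothing are assumptions made for the example and do not come from the text.

```python
from collections import Counter, defaultdict

def train(samples):
    """Count classes and per-class feature values from (features, label) pairs."""
    class_counts = Counter(label for _, label in samples)
    # value_counts[label][feature][value] = number of training items seen
    value_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for features, label in samples:
        for name, value in features.items():
            value_counts[label][name][value] += 1
            feature_values[name].add(value)
    return class_counts, value_counts, feature_values

def posterior(model, item):
    """Return P(class | item), treating the features as independent given the class."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for name, value in item.items():
            k = len(feature_values[name])
            # Add-one smoothing (an assumption of this sketch) avoids zero probabilities.
            score *= (value_counts[label][name][value] + 1) / (n + k)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

def classify(model, item):
    """Pick the class with the largest posterior probability."""
    post = posterior(model, item)
    return max(post, key=post.get)

# Made-up toy training set, for illustration only.
samples = [
    ({"color": "red", "shape": "round", "size": "medium"}, "apple"),
    ({"color": "red", "shape": "round", "size": "medium"}, "apple"),
    ({"color": "green", "shape": "round", "size": "medium"}, "apple"),
    ({"color": "yellow", "shape": "long", "size": "medium"}, "banana"),
    ({"color": "yellow", "shape": "long", "size": "medium"}, "banana"),
    ({"color": "red", "shape": "round", "size": "small"}, "cherry"),
    ({"color": "red", "shape": "round", "size": "small"}, "cherry"),
]
model = train(samples)
item = {"color": "red", "shape": "round", "size": "medium"}
print(classify(model, item), posterior(model, item))
```

On this toy data the red, round, medium-sized item is assigned to the apple class, because that class has the largest product of prior and per-feature probabilities.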
Despite this naive design and these oversimplified assumptions, the naive Bayes classifier still achieves quite good results in many complex real-world situations. In 2004, an analysis of the Bayesian classification problem identified several theoretical reasons for the seemingly implausible effectiveness of naive Bayes classifiers. Nevertheless, a detailed comparison of classification methods published in 2006 found that some other approaches (such as boosted trees and random forests) outperform Bayesian classifiers. One advantage of the naive Bayes classifier is that it needs only a small amount of training data to estimate the necessary parameters (the means and variances of the variables). Because the variables are assumed to be independent, only the variance of each variable within each class needs to be estimated, not the entire covariance matrix.
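To make the parameter count concrete, here is a minimal sketch of a Gaussian naive Bayes model that stores only a prior, a mean and a variance per feature for each class. The Gaussian form, the function names, the small variance floor and the toy data are all assumptions chosen for illustration.

```python
import math
from statistics import fmean, pvariance

def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a prior plus one mean and one variance per feature, for each class."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        columns = list(zip(*rows))  # one tuple of values per feature
        model[label] = {
            "prior": len(rows) / len(X),
            "mean": [fmean(c) for c in columns],
            # Population variance; the floor (an assumption) guards against zero variance.
            "var": [pvariance(c) + var_floor for c in columns],
        }
    return model

def log_score(params, x):
    """log P(class) + sum of per-feature Gaussian log densities (no covariance terms)."""
    total = math.log(params["prior"])
    for value, mu, var in zip(x, params["mean"], params["var"]):
        total += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
    return total

def predict(model, x):
    """Return the class whose log score is largest."""
    return max(model, key=lambda label: log_score(model[label], x))

# Made-up data: two classes, two features.
X = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 4.1], [3.2, 3.9], [2.8, 4.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X, y)
print(predict(model, [1.1, 2.0]), predict(model, [3.1, 4.0]))
```

With d features and c classes this sketch keeps 2·c·d numbers besides the priors, whereas a full covariance matrix per class would need on the order of d² entries per class.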
Development
Naive Bayes has been studied extensively since the 1950s. In the early 1960s, it was introduced into the text information retrieval field under a different name, and it is still a popular (baseline) method for text classification, that is, the problem of deciding which category a document belongs to (such as spam or legitimate, sports or politics, etc.). With proper preprocessing, it can compete with more advanced methods in this field (including support vector machines). It also has applications in automatic medical diagnosis.
The naive Bayes classifier is highly scalable: the number of parameters it requires is linear in the number of variables (features/predictors) in the learning problem. Maximum likelihood training can be done by evaluating a closed-form expression, which takes linear time, rather than by the expensive iterative approximation used for many other types of classifiers. In the statistics and computer science literature, naive Bayes models go by various names, including simple Bayes and independence Bayes. All these names refer to the use of Bayes' theorem in the classifier's decision rule, but naive Bayes is not (necessarily) a Bayesian method. Russell and Norvig remark that naive Bayes is sometimes called a Bayesian classifier, a careless usage that has prompted true Bayesians to call it the "idiot Bayes" model.
Bayesian method
There are many methods for constructing classifiers. Common ones include the Bayesian method, decision trees, case-based learning, artificial neural networks, support vector machines, and methods based on genetic algorithms, rough sets, and fuzzy sets. Among them, the Bayesian method has become a major focus of attention because of its distinctive representation of uncertain knowledge, its rich probabilistic expressiveness, and its ability to learn incrementally while incorporating prior knowledge.
Classification is a two-step process. The first step is to build a classifier from a set of known examples. This step takes place in the training phase, also called the learning phase. The set of known instances used to construct the classifier is called the training set, and each instance in it is called a training instance. Because the class labels of the training instances are known, constructing the classifier is a supervised learning process. By contrast, in unsupervised learning, such as clustering, the class labels of the training instances are unknown, and sometimes even the number of categories to be learned is unknown.
The second step is to use the resulting classifier to classify unknown instances. This step takes place in the testing phase, also called the working phase. The unknown instances to be classified are called test instances. Generally, before a classifier is used for prediction, its classification accuracy needs to be evaluated. Only a classifier that reaches the required accuracy should be used to classify new instances.
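The two steps, and the accuracy check between them, can be sketched as follows. The helper names, the 25% hold-out fraction and the placeholder majority-class learner are assumptions made for illustration; any real learner that returns a classification function could be passed in instead.

```python
import random
from collections import Counter

def split(instances, test_fraction=0.25, seed=0):
    """Shuffle the labeled instances and hold out a fraction of them for testing."""
    shuffled = instances[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (training set, test set)

def accuracy(classify, test_instances):
    """Fraction of held-out instances whose predicted label matches the true label."""
    correct = sum(1 for features, label in test_instances if classify(features) == label)
    return correct / len(test_instances)

def train_majority(training_instances):
    """Placeholder learner (hypothetical): always predicts the most common training label."""
    most_common = Counter(label for _, label in training_instances).most_common(1)[0][0]
    return lambda features: most_common

def build_and_check(instances, train, required_accuracy):
    """Step 1: train on the training set. Then accept the classifier only if accurate enough."""
    training, test = split(instances)
    classifier = train(training)
    acc = accuracy(classifier, test)
    return (classifier if acc >= required_accuracy else None), acc

# Made-up labeled instances: (features, label) pairs.
instances = [((i,), "x") for i in range(8)]
classifier, acc = build_and_check(instances, train_majority, required_accuracy=0.9)
print(classifier is not None, acc)
```

Keeping the held-out instances out of training is the design choice that matters here: accuracy measured on the training instances themselves would overstate how well the classifier handles unknown instances.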
The Bayesian method provides a probabilistic means of reasoning. It assumes that the variables under study follow certain probability distributions, and that reasoning from these probabilities together with the observed data leads to the best decision. The Bayesian method can not only compute explicit probabilities for hypotheses but also provide a useful perspective for understanding many other methods. Its main characteristics are: it learns incrementally; prior knowledge combines with the observed examples to determine the final probability of a hypothesis; hypotheses are allowed to make uncertain (probabilistic) predictions; and new instances can be classified by combining the predictions of multiple hypotheses, weighted by their probabilities.
Maximum Likelihood Estimation
Maximum likelihood estimation is a statistical method used to find the parameters of the probability density function that describes a sample set. The method was first used by the geneticist and statistician Sir Ronald Fisher between 1912 and 1922.
The usual Chinese term for "likelihood" is a rendering close to classical Chinese; in modern Chinese it simply means "possibility". The name is therefore easier to understand if it is read as "maximum possibility estimation".
In phylogenetics, the maximum likelihood method explicitly uses a probability model, and its goal is to find the phylogenetic tree that produces the observed data with the highest probability. It is representative of a class of phylogenetic tree reconstruction methods based entirely on statistics. The method considers the probability of every nucleotide substitution at each column of a sequence alignment.
For example, a transition is approximately three times as likely to occur as a transversion. In a three-sequence alignment, if one column contains a C, a T, and a G, we have reason to believe that the sequences with C and T are likely to be more closely related, since a change between C and T is a transition. Because the common ancestral sequence of the sequences under study is unknown, calculating the probability becomes complicated; and because multiple substitutions may occur at one or more sites, and not all sites are independent of each other, the complexity of the calculation increases further. Nevertheless, objective criteria can be used to calculate the probability for each site and for each possible tree representing the relationships among the sequences. Then, by definition, the tree with the largest total probability is the one most likely to reflect the true phylogeny.
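The maximum likelihood principle of this section can also be illustrated outside the phylogenetic setting with a much smaller problem. The sketch below assumes the sample is drawn from a normal distribution (an assumption made only for this example) and uses made-up numbers; it computes the closed-form estimates and compares their log-likelihood with that of nearby parameter values.

```python
import math

def gaussian_log_likelihood(sample, mu, var):
    """Log-likelihood of the sample under a normal distribution N(mu, var)."""
    n = len(sample)
    squared_error = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_error / (2 * var)

def gaussian_mle(sample):
    """Closed-form maximum likelihood estimates of the mean and variance."""
    n = len(sample)
    mu = sum(sample) / n
    var = sum((x - mu) ** 2 for x in sample) / n  # divides by n, not n - 1
    return mu, var

# Made-up sample, for illustration only.
sample = [2.1, 1.9, 2.4, 2.0, 1.6]
mu_hat, var_hat = gaussian_mle(sample)
best = gaussian_log_likelihood(sample, mu_hat, var_hat)

# Perturbing either parameter should not increase the log-likelihood.
for shift in (-0.1, 0.1):
    for scale in (0.5, 2.0):
        other = gaussian_log_likelihood(sample, mu_hat + shift, var_hat * scale)
        print(shift, scale, other <= best)
```

The phylogenetic case follows the same logic, except that the "parameters" being compared are candidate trees rather than a mean and a variance, and no closed-form solution is available.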