In data analysis, anomaly detection (also referred to as outlier detection and sometimes as novelty detection) is generally understood to be the identification of rare items, events or observations which deviate significantly from the majority of the data and do not conform to a well-defined notion of normal behavior. Such examples may arouse suspicions of being generated by a different mechanism, or appear inconsistent with the remainder of that set of data.
Anomaly detection finds application in many domains, including cybersecurity, medicine, machine vision, statistics, neuroscience, law enforcement and financial fraud, to name only a few. Anomalies were initially sought out so that they could be rejected or omitted from the data to aid statistical analysis, for example when computing the mean or standard deviation. They were also removed to improve predictions from models such as linear regression, and more recently their removal has aided the performance of machine learning algorithms. However, in many applications the anomalies themselves are of interest and are the most sought-after observations in the entire data set, which need to be identified and separated from noise or irrelevant outliers.
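As a rough illustration of why anomalies were historically omitted before computing summary statistics, the sketch below compares the mean and standard deviation of a small sample with and without one extreme value. The readings are invented for this example, and the filtering rule (a modified z-score based on the median and the median absolute deviation, with the conventional constants 0.6745 and 3.5) is only one of many possible choices, not a method prescribed by the text above.

```python
import statistics

# Hypothetical readings (invented for illustration); the last value is an obvious outlier.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 55.0]


def summarize(values):
    """Return (mean, sample standard deviation) of a sequence of numbers."""
    return statistics.mean(values), statistics.stdev(values)


def drop_by_median_rule(values, threshold=3.5):
    """Keep values whose modified z-score is within the threshold.

    The score uses the median and the median absolute deviation (MAD),
    which are themselves barely affected by a few extreme values.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so the score is undefined; keep everything.
        return list(values)
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]


raw_mean, raw_stdev = summarize(readings)
cleaned = drop_by_median_rule(readings)
clean_mean, clean_stdev = summarize(cleaned)

print(f"raw:     mean={raw_mean:.2f}, stdev={raw_stdev:.2f}")
print(f"cleaned: mean={clean_mean:.2f}, stdev={clean_stdev:.2f}")
```

With this sample, the single extreme reading pulls the raw mean well above the bulk of the data and inflates the standard deviation, while the filtered statistics stay close to the typical values. In applications where the anomaly itself is the item of interest, the same rule would be used to report the discarded value rather than to drop it.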
Three broad categories of anomaly detection techniques exist. Supervised anomaly detection techniques require a data set that has been labeled as "normal" and "abnormal" and involve training a classifier. However, this approach is rarely used in anomaly detection due to the general unavailability of labeled data and the inherently unbalanced nature of the classes. Semi-supervised anomaly detection techniques assume that some portion of the data is labeled. This may be any combination of the normal or anomalous data, but more often than not, the techniques construct a model representing normal behavior from a given normal training data set, and then test the likelihood that a test instance was generated by the model. Unsupervised anomaly detection techniques assume the data is unlabeled and are by far the most commonly used due to their wider applicability.
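The semi-supervised approach described above can be sketched in a few lines. This is a minimal illustration under strong assumptions: the training values are invented, normal behavior is modeled as a single univariate Gaussian, and a test instance is flagged when its density under the model falls below the density three standard deviations from the mean. The names `normal_training`, `density_floor` and `is_anomalous` are hypothetical and chosen only for this example.

```python
from statistics import NormalDist

# Hypothetical training set, assumed to contain only normal behavior.
normal_training = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7]

# Model of normal behavior: one Gaussian fitted to the training data
# (an illustrative assumption, not the only possible model).
model = NormalDist.from_samples(normal_training)

# Likelihood cut-off: the model density three standard deviations from the mean.
density_floor = model.pdf(model.mean + 3 * model.stdev)


def is_anomalous(x, model, density_floor):
    """Flag x when the model assigns it a lower density than the cut-off."""
    return model.pdf(x) < density_floor


test_instances = [10.05, 9.6, 14.2]
flags = [is_anomalous(x, model, density_floor) for x in test_instances]

for x, flag in zip(test_instances, flags):
    print(f"{x}: {'anomalous' if flag else 'normal'}")
```

Here only the last test instance, which lies far outside the range seen in training, is flagged. Because a Gaussian density decreases with distance from the mean, this cut-off is equivalent to a threshold on the absolute z-score; richer models of normal behavior would need their own way of choosing the likelihood threshold.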