Measuring Concentration of Distances - An Effective and Efficient Empirical Index
S. Kumari,
Published in IEEE Computer Society
2017
Volume: 29
Issue: 2
Pages: 373 - 386
Abstract
High-dimensional data analysis gives rise to many challenges. One that has recently gained considerable attention is the concentration of distances (CoD) phenomenon: the inability of distance functions to distinguish points well in high dimensions. CoD affects almost every machine learning and data analysis algorithm in high dimensions. In this work, we present a novel, efficient, and effective empirical index that not only shows whether a distance function tends to concentrate for a given data set, but also lets us measure the rate of concentration and compare different distance functions vis-à-vis their rate of concentration. As opposed to existing empirical indices, the proposed empirical measure uses only the internal characteristics of a given data set and hence is applicable to real data sets, which was hitherto not possible. © 2016 IEEE.
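The CoD phenomenon described in the abstract can be illustrated with a small sketch. This is not the index proposed in the paper. It uses the commonly cited relative contrast, (d_max - d_min) / d_min, on synthetic uniform data, and the function name `relative_contrast`, the sample size, and the seed are all assumptions made for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points uniform random points in [0, 1]^dim.

    Illustrative measure only; NOT the empirical index from the paper.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


# Relative contrast shrinks as dimensionality grows: the nearest and the
# farthest points end up at nearly the same distance from the query.
contrasts = {dim: relative_contrast(dim) for dim in (2, 20, 200, 2000)}
for dim, value in contrasts.items():
    print(f"dim={dim:5d}  relative contrast={value:.3f}")
```

Note that this sketch relies on synthetic data whose dimensionality can be varied freely. The abstract states that the proposed measure instead uses only the internal characteristics of a given data set.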
About the journal
Journal: IEEE Transactions on Knowledge and Data Engineering
Publisher: IEEE Computer Society
ISSN: 1041-4347