assembl.nlp.optics module

min_points - number of objects in a neighborhood of the selected object (minimal number of objects considered as a cluster)

Output:
RD - vector with reachability distances (m,1)
CD - vector with core distances (m,1)
order - vector specifying the order of objects (1,m)

Example of use:
x=[randn(30,2)*.4;randn(40,2)*.5+ones(40,1)*[4 4]];
[RD,CD,order]=optics(x,4)

References:
[1] M. Ankerst, M. Breunig, H. Kriegel, J. Sander, OPTICS: Ordering Points To Identify the Clustering Structure, available from www.dbs.informatik.uni-muenchen.de/cgi-bin/papers?query=–CO
[2] M. Daszykowski, B. Walczak, D.L. Massart, Looking for natural patterns in analytical data. Part 2. Tracing local density with OPTICS, J. Chem. Inf. Comput. Sci. 42 (2002) 500-507

Written by Michal Daszykowski, Department of Chemometrics, Institute of Chemistry, The University of Silesia, December 2004
http://www.chemometria.us.edu.pl

Core algorithm ported to Python in January 2009 by Brian H. Clowers, Pacific Northwest National Laboratory (bhclowers at gmail.com).
Dependencies include numpy and scipy (formerly hcluster, now in scipy).
Extraction section written by Marc-Antoine Parent (maparent@acm.org).
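The CD and RD outputs named in the module notes can be illustrated with a small standalone sketch. It is not this module's code: the function names `core_distances` and `reachability_distance` are hypothetical, plain Euclidean distance is assumed, and the convention that a point does not count itself among its neighbors is also an assumption (implementations differ on this).

```python
import math
import random


def core_distances(points, min_points):
    """Core distance of each point: distance to its min_points-th nearest
    other point (assumed convention: the point itself is not counted)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[min_points - 1])
    return result


def reachability_distance(core_dist_o, dist_o_p):
    """Reachability of p from o: never smaller than the core distance of o."""
    return max(core_dist_o, dist_o_p)


# Two Gaussian blobs, loosely mirroring the MATLAB example in the notes above.
rng = random.Random(0)
blob_a = [(rng.gauss(0, 0.4), rng.gauss(0, 0.4)) for _ in range(30)]
blob_b = [(rng.gauss(4, 0.5), rng.gauss(4, 0.5)) for _ in range(40)]
cd = core_distances(blob_a + blob_b, min_points=4)
print(len(cd), min(cd), max(cd))
```

Points in dense regions get small core distances, which is what lets the ordering expose clusters of different densities.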

class assembl.nlp.optics.Dendrogram(cluster, parent=None)

Bases: object

Nested intervals corresponding to clusters

class assembl.nlp.optics.Interval(start, end)

Bases: object

A closed integer interval
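As a rough illustration of what a closed integer interval and interval nesting mean here, the sketch below defines a hypothetical `ClosedInterval` class. It is not the module's `Interval` implementation; the method names and the validation rule are assumptions made for the example.

```python
class ClosedInterval:
    """Hypothetical closed integer interval [start, end]; both ends included."""

    def __init__(self, start, end):
        if start > end:
            raise ValueError("start must not exceed end")
        self.start = start
        self.end = end

    def __len__(self):
        # Both endpoints belong to the interval, hence the +1.
        return self.end - self.start + 1

    def __contains__(self, n):
        return self.start <= n <= self.end

    def contains_interval(self, other):
        """True when `other` is nested inside this interval."""
        return self.start <= other.start and other.end <= self.end


outer = ClosedInterval(0, 9)
inner = ClosedInterval(2, 5)
print(len(outer), 9 in outer, outer.contains_interval(inner))
```

Nesting of this kind is what allows a set of cluster intervals over the ordering to be arranged as a tree.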

class assembl.nlp.optics.Optics(min_points=4, distMethod='cosine')

Bases: object

A calculation using the OPTICS algorithm.

assembl.nlp.optics.euclid(i, x)

euclid(i, x) -> Euclidean distance between i and x
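For comparison, the sketch below computes Euclidean distance alongside cosine distance, the metric named by the `distMethod='cosine'` default above. The helper names `euclidean_distance` and `cosine_distance` are hypothetical and the argument types of the module's own `euclid` are not documented here, so this assumes two equal-length sequences of numbers. Cosine distance is taken as one minus cosine similarity.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


print(euclidean_distance((0, 0), (3, 4)))
print(cosine_distance((1, 0), (0, 1)))
```

Cosine distance ignores vector length, so two vectors pointing the same way are at distance zero even when their Euclidean distance is large.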