SHOGUN v1.1.0
KMeans clustering partitions the data into k (a-priori specified) clusters.
It minimizes
\[ \sum_{i=1}^{k} \sum_{j \in S_i} \| x_j - \mu_i \|^2 \]
where $x_j$ are the data points, $\mu_i$ are the cluster centers, and $S_i$ are the index sets of the clusters.
Beware that this algorithm obtains only a local optimum.
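
The following is a minimal, self-contained sketch of the quantity being minimized. It does not use the Shogun API: the names `Point`, `squared_distance` and `kmeans_objective` are hypothetical, and it assumes squared Euclidean distance, whereas CKMeans takes its distance from the CDistance object it is given.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch: one data point or cluster center.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squared_distance(const Point& a, const Point& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d)
    {
        const double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// k-means objective: for every point j, add the squared distance to the
// center of the cluster it is assigned to (assignment[j] is that cluster's index).
double kmeans_objective(const std::vector<Point>& points,
                        const std::vector<Point>& centers,
                        const std::vector<std::size_t>& assignment)
{
    assert(points.size() == assignment.size());
    double total = 0.0;
    for (std::size_t j = 0; j < points.size(); ++j)
        total += squared_distance(points[j], centers[assignment[j]]);
    return total;
}
```

As an illustration of the local-optimum caveat, take the four points (0,0), (0,2), (10,0), (10,2) with k = 2. The centers (0,1) and (10,1) give an objective of 4. The centers (5,0) and (5,2) give 100, yet they are also stable: every point is already closest to its own center, and each center is already the mean of its cluster.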

Public Member Functions | |
| CKMeans () | |
| CKMeans (int32_t k, CDistance *d) | |
| virtual | ~CKMeans () |
| virtual EClassifierType | get_classifier_type () |
| virtual bool | load (FILE *srcfile) |
| virtual bool | save (FILE *dstfile) |
| void | set_k (int32_t p_k) |
| int32_t | get_k () |
| void | set_max_iter (int32_t iter) |
| float64_t | get_max_iter () |
| SGVector< float64_t > | get_radiuses () |
| SGMatrix< float64_t > | get_cluster_centers () |
| int32_t | get_dimensions () |
| virtual const char * | get_name () const |
Public Member Functions inherited from CDistanceMachine | |
| CDistanceMachine () | |
| virtual | ~CDistanceMachine () |
| void | set_distance (CDistance *d) |
| CDistance * | get_distance () |
| void | distances_lhs (float64_t *result, int32_t idx_a1, int32_t idx_a2, int32_t idx_b) |
| void | distances_rhs (float64_t *result, int32_t idx_b1, int32_t idx_b2, int32_t idx_a) |
| virtual CLabels * | apply () |
| virtual CLabels * | apply (CFeatures *data) |
| virtual float64_t | apply (int32_t num) |
Public Member Functions inherited from CMachine | |
| CMachine () | |
| virtual | ~CMachine () |
| virtual bool | train (CFeatures *data=NULL) |
| virtual void | set_labels (CLabels *lab) |
| virtual CLabels * | get_labels () |
| virtual float64_t | get_label (int32_t i) |
| void | set_max_train_time (float64_t t) |
| float64_t | get_max_train_time () |
| void | set_solver_type (ESolverType st) |
| ESolverType | get_solver_type () |
| virtual void | set_store_model_features (bool store_model) |
Public Member Functions inherited from CSGObject | |
| CSGObject () | |
| CSGObject (const CSGObject &orig) | |
| virtual | ~CSGObject () |
| virtual bool | is_generic (EPrimitiveType *generic) const |
| template<class T > | |
| void | set_generic () |
| void | unset_generic () |
| virtual void | print_serializable (const char *prefix="") |
| virtual bool | save_serializable (CSerializableFile *file, const char *prefix="") |
| virtual bool | load_serializable (CSerializableFile *file, const char *prefix="") |
| void | set_global_io (SGIO *io) |
| SGIO * | get_global_io () |
| void | set_global_parallel (Parallel *parallel) |
| Parallel * | get_global_parallel () |
| void | set_global_version (Version *version) |
| Version * | get_global_version () |
| SGVector< char * > | get_modelsel_names () |
| char * | get_modsel_param_descr (const char *param_name) |
| index_t | get_modsel_param_index (const char *param_name) |
Protected Member Functions | |
| void | clustknb (bool use_old_mus, float64_t *mus_start) |
| virtual bool | train_machine (CFeatures *data=NULL) |
| virtual void | store_model_features () |
Protected Attributes | |
| int32_t | max_iter |
| maximum number of iterations | |
| int32_t | k |
| the k parameter in KMeans | |
| int32_t | dimensions |
| number of dimensions | |
| SGVector< float64_t > | R |
| radii of the clusters (size k) | |
Protected Attributes inherited from CDistanceMachine | |
| CDistance * | distance |
Protected Attributes inherited from CMachine | |
| float64_t | max_train_time |
| CLabels * | labels |
| ESolverType | solver_type |
| bool | m_store_model_features |
Additional Inherited Members | |
Public Attributes inherited from CSGObject | |
| SGIO * | io |
| Parallel * | parallel |
| Version * | version |
| Parameter * | m_parameters |
| Parameter * | m_model_selection_parameters |
Static Protected Member Functions inherited from CDistanceMachine | |
| static void * | run_distance_thread_lhs (void *p) |
| static void * | run_distance_thread_rhs (void *p) |
| CKMeans | ( | ) |
default constructor
Definition at line 29 of file KMeans.cpp.
|
virtual |
Definition at line 43 of file KMeans.cpp.
|
protected |
clustknb
| use_old_mus | whether the old cluster centers (mus) should be used |
| mus_start | initial cluster centers (mus) to start from |
replace rhs feature vectors
set rhs to mus_start
update rhs
Definition at line 179 of file KMeans.cpp.
|
virtual |
get centers
Definition at line 115 of file KMeans.cpp.
| int32_t get_dimensions | ( | ) |
| int32_t get_k | ( | ) |
| float64_t get_max_iter | ( | ) |
get maximum number of iterations
Definition at line 105 of file KMeans.cpp.
|
virtual |
|
virtual |
load distance machine from file
| srcfile | file to load from |
Reimplemented from CMachine.
Definition at line 73 of file KMeans.cpp.
|
virtual |
save distance machine to file
| dstfile | file to save to |
Reimplemented from CMachine.
Definition at line 80 of file KMeans.cpp.
| void set_k | ( | int32_t | p_k | ) |
| void set_max_iter | ( | int32_t | iter | ) |
set maximum number of iterations
| iter | the new maximum |
Definition at line 99 of file KMeans.cpp.
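
To show what an iteration cap typically bounds, here is a hedged sketch of the standard Lloyd iteration with a `max_iter` limit. It is not Shogun's implementation: `Point`, `LloydResult` and `lloyd_kmeans` are hypothetical names, squared Euclidean distance is assumed, and the actual loop in KMeans.cpp may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper type for this sketch: one data point or cluster center.
using Point = std::vector<double>;

// Result of the sketch: final centers, cluster index per point, iterations run.
struct LloydResult
{
    std::vector<Point> centers;
    std::vector<std::size_t> assignment;
    int iterations;
};

// Alternates assignment and update steps until no assignment changes
// or max_iter iterations have been performed, whichever comes first.
LloydResult lloyd_kmeans(const std::vector<Point>& points,
                         std::vector<Point> centers,
                         int max_iter)
{
    assert(max_iter >= 0);
    const std::size_t k = centers.size();
    const std::size_t dim = points.empty() ? 0 : points[0].size();
    // Start from an invalid index (k) so the first pass always registers a change.
    std::vector<std::size_t> assignment(points.size(), k);
    if (k == 0)
        return {centers, assignment, 0};

    int iterations = 0;
    bool changed = true;
    while (changed && iterations < max_iter)
    {
        changed = false;
        ++iterations;

        // Assignment step: attach each point to its nearest center.
        for (std::size_t j = 0; j < points.size(); ++j)
        {
            std::size_t best = 0;
            double best_dist = std::numeric_limits<double>::infinity();
            for (std::size_t i = 0; i < k; ++i)
            {
                double dist = 0.0;
                for (std::size_t d = 0; d < dim; ++d)
                {
                    const double diff = points[j][d] - centers[i][d];
                    dist += diff * diff;
                }
                if (dist < best_dist)
                {
                    best_dist = dist;
                    best = i;
                }
            }
            if (assignment[j] != best)
            {
                assignment[j] = best;
                changed = true;
            }
        }

        // Update step: move each non-empty cluster's center to the mean of its points.
        std::vector<Point> sums(k, Point(dim, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t j = 0; j < points.size(); ++j)
        {
            for (std::size_t d = 0; d < dim; ++d)
                sums[assignment[j]][d] += points[j][d];
            ++counts[assignment[j]];
        }
        for (std::size_t i = 0; i < k; ++i)
        {
            if (counts[i] == 0)
                continue; // empty cluster: keep its previous center
            for (std::size_t d = 0; d < dim; ++d)
                centers[i][d] = sums[i][d] / static_cast<double>(counts[i]);
        }
    }
    return {centers, assignment, iterations};
}
```

In this sketch a small `max_iter` can stop the loop before the assignments settle, so the returned centers are then only an intermediate state. A larger cap lets the loop run until it reaches a stable (possibly only locally optimal) configuration.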
|
protected virtual |
Ensures cluster centers are in lhs of underlying distance
Reimplemented from CDistanceMachine.
Definition at line 464 of file KMeans.cpp.
|
protected virtual |
train k-means
| data | training data (may be omitted if a distance- or kernel-based classifier is used and its distance/kernel was initialized with the training data) |
Reimplemented from CMachine.
Definition at line 48 of file KMeans.cpp.