K-means clustering implementation in JAVA

Details about K-Means Clustering on images:

Before the algorithm starts, the user needs to set a number of greyvalues (bins). The resulting image will contain that number of greyvalues.
With that number of bins (called ‘k’) the algorithm clusters the greyvalues of the image into k clusters and once the algorithm is terminated, every cluster will have its own greyvalue.
With starting the algorithm, you should set:

  • How to define the ‘startingmeans’ of the clusters before the first iteration.
  • What the stopping criteria are.

The Algorithm:

In short this is what the algorithm is supposed to do:
Initialize (so set k, set ‘startingmeans’, set stopping criteria)
Loop while termination condition isn’t met (

  • For each pixel: assign the pixel to a class such that the distance from the pixel to the center (the mean) of a class is minimalized.
  • For each class: recalculate the means of the class based on the pixels belonging to that class.


My implementation:

The user can set his k (which is fairly easy).
I’ve implemented 3 ways to choose the ‘startingmeans’ this far:

  1. i mod k class: The pixel at index i is assigned to the class i modulo k
  2. Distribute mean table over color space: According to the k that’s chosen the means are chosen so that the are spreaded equally over the complete color space of the image.
  3. Random: Just as it says. Given a k, there will be chosen k random mean values.

The termination constraints are currently not visible for users and are set to:
Terminate after fewer than n pixels change classes after a recalculation of the means.
I’ve set my n to 300 which is pretty small if you are using images bigger than 512 by 512 pixels. Next to that, the algorithm will be terminated if there are more than j iterations needed to get a stable result (in the meanings of that there are not more than n pixels changing classes after a recalculation of the means). My j is currently set to 50. Most of the times the algorithm terminates because of less than 300 pixels have changed classes.

Now that we’ve seen how the parameters of the algorithm are set, let’s have a look how I’ve implemented the algorithm in terms of code and decisions I’ve made.

I’ve devided the code over 3 classes:

1 to build the JDialog which is needed to ask for the input of the user concerning the way the algorithm needs to be initiated.
One with the actual algorithm and the last class is a clusterclass.

Because the class with the JDialog is not that interesting, we’ll focus on the other two classes.

The ClusterClass is pretty simple: it only holds a mean, an upperbound and a lowerbound.
I’ve chosen for the fact that this class holds the bounds because at the initialization of the algorithm, there are k classes which are created (and put into an ArrayList). You can let each class hold it’s own pixels which are belonging to that class, but if your k grows and the image is big, the complete image will be twice in the memory: as the original image and all pixels will be part of one of the clusterclasses as well. Instead of that I’ve chosen to hold the bounds of that class so that if I’m checking pixelvalues, it can also check to which class it belongs in the same for-loop.

As mentioned earlier: a pixel belongs to a class if the distance from that pixelvalue to the mean of a class is minimized. Because my ClusterClasses hold their upper- and lowerbound, a pixelvalue has to lay between the bounds to be part of that class. The bounds are simply calculated by checking which mean is the nearest (but have a lower value for the lowerbound and a higher bound for the upperbound). The bound can simply be calculated by taking the mean of these two means.
After every pixel is assigned to a class (In my case: it can check to which class it belongs). The means of the classes can be recalculated by taking the sum of all the pixelvalues belonging to that clusterclass and divide this sum by the number of pixels in the clusterclass.
After the recalculations of the means, the upper- and lowerbounds need to be recalculated as well.
After this iteration, the termination condition has to be checked. If the condition isn’t met, another iteration follows. If the condition is met, the clusters are set and the colors of the image can be recalculated.

And now shortly in JAVA:

public KMeansAction:

while (not_terminated) do:

  • recalculateMeans
  • recalculateBounds
  • checkTermination


private void processImage:

// This works for 8-bit greyscale images
// It calculates the greyvalues that will occur in the resulting image
delta = 255 / ( k – 1)
for every pixel do:
for every class do:
if a pixel belongs to that class then

// set the greyvalue of that pixel to the index of the class in the list times delta

greyvalue = classindex * delta
// then set the rgbvalue of that pixel to the greyvalue
newImage.setRGB(pixel location, greyvalue)

NOTE (August 7, 2009): I’ve found the source code and put it in this blogpost.

Getagged , ,

80 reacties op “K-means clustering implementation in JAVA

  1. davor schreef:

    Can you show the code of the kmeans so that we can study this, or a pseudocode, thanks.

  2. Patrick schreef:

    codeodor.comDavor: You can find a readable paper with pseudo & program code for the K-Means clustering algorithm and finding K here. I was looking for my pseudocode this evening, but I wasn’t able to find it yet. It’s been a while since I last implemented this. If I find it I will post it here as well.

  3. davor schreef:

    Thanks, I read it and I like it.

  4. it would be better with other languages support, but thanks..

  5. bb schreef:

    the definitions and your explanation are definitely good, but it will be great if you provide your code too..

  6. Patrick schreef:

    bb: I was checking my computer after your reply. It seems that I’ve lost some code when I reinstalled my computer. One of the pieces I lost was the code of a project which could do image enhancing operations like the K-means clustering.
    I can’t stand the fact that it’s very likely to be gone so I’ll write it again asap and post the code. It isn’t too hard though 😉

  7. bb schreef:

    thanks, Im looking forward to it..
    i have found some implementations on the web but they are not clear enough
    ı hope yours will not be.. 🙂

  8. Anitha schreef:

    the definitions and your explanation are definitely good, but it will be great if you provide your code too..

  9. ganesh schreef:

    hi….dear frd……my proect beyond the k-means algorithm…….i dont know how the data get from DATASET
    using java………pls tell tips….urgent……..

  10. Patrick schreef:

    Ganesh: What does your dataset look like? In what kind of format is it?

  11. Abhinov schreef:

    hello,its a great implementation there….could u tell me how can i cluster the pixels …so that every cluster would represent an object….say if there is a photo of a person,how can i create boundaries (approximated) and separate out the eyes ,nose, hair of that person as individual objects(clusters)…………. Thanq

  12. Patrick schreef:

    Abhinov: K-means clustering depends heavily on your initialization and your objects. Basically K-means clustering forms K clusters based on mean values. This means that if you choose K to be high, a lot of clusters will be present and each will capture a small range of pixel values and you might miss a part of the object (especially with anti-aliasing). If you set K too low, a large range of pixel values are captured into a cluster and you might end up with having pixels in your cluster which shouldn’t be there.
    Unless your objects are well separated in terms of pixel values (colors) I think K-means clustering is not really the right method to do object recognition. And even when the pixel values are well separated you have to choose your amount of clusters wisely. It will probably end up in a trial-and-error process.

    For object recognition you could perhaps have a look at blob detection. In a programming language like Matlab this is worked out pretty well for example.

    I hope this answers your question. If not, feel free to ask more!

  13. Abhinov schreef:

    Hey,ThanQ for clearing most of the assumputions/doubts , i think i should be working on a new clustering algorithm right from the scratch…another question on the same lines : Do the face detection(object recognition) algorithms( ones used in digital cameras) detect a part or entire boundary of a face, with exact detail given to each part?


  14. Patrick schreef:

    I’m glad to be of service. A quick search resulted in some interesting literature. Perhaps this can be interesting for you to use as a reference for designing a clustering algorithm:
    Object Recognition Using Clustering, Classification, and Centroids
    Research on clustering articles

    Most camera’s use some sort of blob detection and put a bounding box around it. For a camera it’s not tremendous important to recognize a face precisely. The main concern is to recognize the area in which the face is. For that they put a bounding box around it to be sure to select the whole face. Sometimes such a camera is completely wrong and you will see bounding boxes in spots where no face is present (e.g. a part of the background).
    Precise face detection can be a hard task. Recognition software often uses some sort of marks (tip of the nose, eyes / pupils, mouth corners, chin etc.) to recognize parts of the face. For determining the boundaries of these parts, Active Shape Models can provide an interesting solution.

  15. Abhinov schreef:

    That’s a lot of info about clustering, ill run through it, discover more :)….. Thanks for clarification! i will be in touch with you very soon!

  16. Patrick schreef:

    Cool! Can’t wait 😀

  17. vritant jain schreef:

    hello sir,
    am working with abhinov on our project..
    what we plan to do is use a clustering algorithm for object recognition on images.
    i found BIRCH to be an interesting work in this field.
    did the googling part,got a paper on it by Tian Zhang, Raghu Ramakrishnan, & Miron Livny but dint get an IMPLEMENTATION OF BIRCH..
    kind of a shot in the dark,
    but any help would be great..
    thank you.

  18. Patrick schreef:

    As far as I can see the method is presented in the paper you found. On this website they’ve made a mini project of implementing it. I did some searching, but only ended up with forums where the algorithm was requested instead of given.
    I think it would be nice to implement it yourself too 🙂
    If I had some time I would gladly try to do that, as well as re-implementing my K-means clustering. (Lost my source code when formatting my PC 🙁 )

  19. Abhinov P schreef:

    Hello sir,

    We found some inputs on BIRCH algorithm from Tian Zhang, Raghu Ramakrishnan, & Miron Livny,but their implementations were far more outdated(1995 C/c++ source) and we found it to be difficult to decipher it (or may be even to think about it).We are being forced to quit on the BIRCH algorithm, is there any part or ‘weird’ idea that u could suggest us on the clustering domain itself say, mixing algorithms or modifying few algorithms which could be done in a months time or so. Ironically,we are lacking proper guidance for the same at the halfway to the deadlines of project submission.

    Thank you sir, you have been of great help so far..

  20. Patrick schreef:

    I was looking for the BIRCH algorithm and found this presentation which gives an overview on some clustering algorithms. It might contain some ideas for you. It also gives data on BIRCH and an example, so it might be worth looking into.

    I’ve learned a lot from you guys too. I’m looking into clustering more for sure. If I run into something you should have a look at, I’ll post it here!

  21. Abhinov P schreef:

    Hello Again,

    Thanks For That,After following the slides in your response, we’ve got an idea which goes like this: adding the logic of a clustering algorithm to another ,both of the same type and showing the result through an image.

    We chose K means basically,but in turn are looking out for another algorithm who’s logic would be legible to fix in another, any suggestions here.please correct me if i am skipping basics,if any.

    Best Regards

  22. Patrick schreef:

    I think that’s about it.
    A clustering algorithm is an iterative algorithm (one or more iterations) which takes as input the set of pixels and a comparison measure. In case of K-means this measure is the shortest distance to a mean. You could think of a new way of clustering pixelvalues, regions etc.

    There are methods like local clustering. In such methods you only cluster a part of an image. You could for example do something like edge detection on an image to determine the boundaries of an object and do a local clustering on that.
    There are numerous things to think of which could be handy.
    Another thing is local histogram equalization. In such an algorithm you could determine objects which have a non-optimal histogram and do histogram equalization on these objects.

  23. vritant schreef:

    hello Mr Patrick,
    after looking at the above slides,suggestions, and after seeing demo’s at thispage,we have come up with this plan…
    kindly find time to read and suggest…

  24. Patrick schreef:

    Hi guys,

    I will try to look into it tonight.
    I’ve started to work on an image processing application in my spare time. It might be interesting to have such things incorporated some time.

  25. vritant schreef:

    thanks… u never know.. it could be of use in white hat! 🙂

  26. Patrick schreef:

    Actually I’m also working on other things besides White Hat Mafia online game in my spare time. One of them is the image processing application 😉

    Looking at your document I wonder what you will take as CA1 and CA2.
    Some other remarks:
    * First you want to put the pixels in a 3D space where the location is determined by the three RGB components. This sound plausible and can work out fine if you can determine the cluster boundaries in 3D in a nice way.
    * Then you introduce a correlation factor.. I think that if you would like to use such a relation, you don’t need to plot the pixels in 3D space. Just use the three RGB components as the plotting doesn’t add extra info.
    * To determine the clusters you will probably have to determine the amount of clusters, determine a cluster size or another smart way of determining clusters. This will have to be in your correlation calculation. This correlation must be calculated between two pixels a time. How do you see this correlation between two pixels? (i.e. when are two pixels actually correlated nicely)? For this you will also need a certain threshold.
    * What do you want to do in step 5? Which algorithm? Keep in mind that if you want to use K-means clustering you cannot do this on the image itself! Only on the RGB space. The downside of K-means is that you have to specify the K and you’d rather not do this as this depends on the number of objects and gradients in the image.

    At this point I can’t really see how the pixel correlation would work. Remember that this needs to be done between two pixels and so for every pixel you will need to see how it correlates with all other pixels of the image (or better: the pixels in its neighborhood).
    However: for clustering I do agree that hierarchical clustering could be a good idea. You won’t probably have to set any value for that as you can use the largest ‘jump’ in the dendogram to determine the optimal number of clusters.

    Maybe you should have a look at Blob detection. This is basically what you want to do. At the “See also” there are some interesting references to things like boundary / edge detection.

  27. ganesh schreef:

    dear frd patrick………i need source code of K-means cluster implementation .for example..40,25,15,5,10,25,15,20,10,80 its my set of transaction……i need group into 3.using centroids concept……..pls tell me…….its very urgent……

  28. nadia schreef:

    hai,can u help me.i want to implement k-means code in my program as below..tq

    import java.awt.Graphics;
    import java.awt.Graphics2D;
    import java.awt.Image;
    import java.awt.MediaTracker;
    import java.awt.Toolkit;
    import java.awt.event.ActionEvent;
    import java.awt.event.ActionListener;

    import java.awt.image.BufferedImage;
    import java.awt.image.BufferedImageOp;
    import java.awt.image.ConvolveOp;
    import java.awt.image.Kernel;
    import java.awt.image.RescaleOp;

    import javax.swing.JFrame;
    import javax.swing.JMenu;
    import javax.swing.JMenuBar;
    import javax.swing.JMenuItem;
    import javax.swing.JPanel;

    public class SampleJFrame2 extends javax.swing.JFrame {

    /** Creates new form SampleJFrame2 */
    public SampleJFrame2() {

    /** This method is called from within the constructor to
    * initialize the form.
    * WARNING: Do NOT modify this code. The content of this method is
    * always regenerated by the Form Editor.
    private void initComponents() {

    jComboBox1 = new javax.swing.JComboBox();
    jLabel1 = new javax.swing.JLabel();
    jLabel2 = new javax.swing.JLabel();
    jButton1 = new javax.swing.JButton();
    jButton2 = new javax.swing.JButton();
    jButton3 = new javax.swing.JButton();
    panel1 = new ImageProcessingPanel();
    panel2 = new ImageProcessingPanel();
    jScrollPane1 = new javax.swing.JScrollPane();
    jTextPane1 = new javax.swing.JTextPane();
    jLabel3 = new javax.swing.JLabel();
    jButton4 = new javax.swing.JButton();
    jMenuBar1 = new javax.swing.JMenuBar();


    jComboBox1.setModel(new javax.swing.DefaultComboBoxModel(new String[] {“select”,”try.jpg”, “brain2.jpg”, “brain3.jpg”, “brain4.jpg”, “brain5.jpg” }));
    jComboBox1.addActionListener(new java.awt.event.ActionListener() {
    public void actionPerformed(java.awt.event.ActionEvent evt) {

    jLabel1.setBackground(new java.awt.Color(204, 153, 0));
    jLabel1.setFont(new java.awt.Font(“Tahoma”, 1, 12));
    jLabel1.setForeground(new java.awt.Color(0, 51, 204));
    jLabel1.setText(” ORIGINAL IMAGE”);

    jLabel2.setFont(new java.awt.Font(“Tahoma”, 1, 12));
    jLabel2.setForeground(new java.awt.Color(0, 51, 204));

    jButton1.setBackground(new java.awt.Color(204, 204, 255));
    jButton1.setFont(new java.awt.Font(“Tahoma”, 1, 11));
    jButton1.setForeground(new java.awt.Color(0, 0, 153));
    jButton1.addActionListener(new java.awt.event.ActionListener() {
    public void actionPerformed(java.awt.event.ActionEvent evt) {

    jButton2.setBackground(new java.awt.Color(204, 204, 255));
    jButton2.setFont(new java.awt.Font(“Tahoma”, 1, 11));
    jButton2.setForeground(new java.awt.Color(0, 0, 153));
    jButton2.addActionListener(new java.awt.event.ActionListener() {
    public void actionPerformed(java.awt.event.ActionEvent evt) {

    jButton3.setBackground(new java.awt.Color(204, 204, 255));
    jButton3.setFont(new java.awt.Font(“Tahoma”, 1, 11));
    jButton3.setForeground(new java.awt.Color(0, 0, 153));
    jButton3.addActionListener(new java.awt.event.ActionListener() {
    public void actionPerformed(java.awt.event.ActionEvent evt) {

    javax.swing.GroupLayout panel1Layout = new javax.swing.GroupLayout(panel1);
    .addGap(0, 239, Short.MAX_VALUE)
    .addGap(0, 306, Short.MAX_VALUE)

    javax.swing.GroupLayout panel2Layout = new javax.swing.GroupLayout(panel2);
    .addGap(0, 244, Short.MAX_VALUE)
    .addGap(0, 306, Short.MAX_VALUE)

    jTextPane1.setBackground(new java.awt.Color(204, 204, 255));
    jTextPane1.setFont(new java.awt.Font(“Tahoma”, 1, 20));
    jTextPane1.setForeground(new java.awt.Color(0, 102, 255));

    jLabel3.setFont(new java.awt.Font(“Tahoma”, 1, 11));
    jLabel3.setForeground(new java.awt.Color(0, 51, 204));
    jLabel3.setText(“Select images”);

    jButton4.setBackground(new java.awt.Color(204, 204, 255));
    jButton4.setFont(new java.awt.Font(“Tahoma”, 1, 11));
    jButton4.setForeground(new java.awt.Color(0, 0, 153));
    jButton4.addActionListener(new java.awt.event.ActionListener() {
    public void actionPerformed(java.awt.event.ActionEvent evt) {

    javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());
    .addComponent(jScrollPane1, javax.swing.GroupLayout.DEFAULT_SIZE, 874, Short.MAX_VALUE))
    .addGap(74, 74, 74)
    .addComponent(jLabel1, javax.swing.GroupLayout.PREFERRED_SIZE, 141, javax.swing.GroupLayout.PREFERRED_SIZE))
    .addGap(39, 39, 39)
    .addComponent(panel1, javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE))
    .addGap(91, 91, 91)
    .addComponent(jButton4, javax.swing.GroupLayout.PREFERRED_SIZE, 103, javax.swing.GroupLayout.PREFERRED_SIZE)))
    .addGap(10, 10, 10)
    .addComponent(jComboBox1, javax.swing.GroupLayout.PREFERRED_SIZE, 138, javax.swing.GroupLayout.PREFERRED_SIZE))
    .addGap(29, 29, 29)
    .addComponent(jLabel3, javax.swing.GroupLayout.DEFAULT_SIZE, 206, Short.MAX_VALUE)))
    .addComponent(panel2, javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
    .addGap(117, 117, 117))
    .addGroup(javax.swing.GroupLayout.Alignment.TRAILING, layout.createSequentialGroup()
    .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING, false)
    .addComponent(jButton2, javax.swing.GroupLayout.Alignment.LEADING, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE)
    .addComponent(jButton1, javax.swing.GroupLayout.Alignment.LEADING, javax.swing.GroupLayout.DEFAULT_SIZE, 97, Short.MAX_VALUE)
    .addComponent(jButton3, javax.swing.GroupLayout.Alignment.LEADING, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE))
    .addGap(191, 191, 191)))
    .addGroup(javax.swing.GroupLayout.Alignment.TRAILING, layout.createSequentialGroup()
    .addComponent(jLabel2, javax.swing.GroupLayout.PREFERRED_SIZE, 133, javax.swing.GroupLayout.PREFERRED_SIZE)
    .addGap(180, 180, 180))))
    .addGap(46, 46, 46)
    .addGap(47, 47, 47)
    .addComponent(jLabel1, javax.swing.GroupLayout.PREFERRED_SIZE, 38, javax.swing.GroupLayout.PREFERRED_SIZE)
    .addComponent(jLabel2, javax.swing.GroupLayout.PREFERRED_SIZE, 27, javax.swing.GroupLayout.PREFERRED_SIZE))
    .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING, false)
    .addComponent(panel2, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE)
    .addComponent(panel1, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE)))
    .addGap(198, 198, 198)
    .addComponent(jLabel3, javax.swing.GroupLayout.PREFERRED_SIZE, 24, javax.swing.GroupLayout.PREFERRED_SIZE)
    .addComponent(jComboBox1, javax.swing.GroupLayout.PREFERRED_SIZE, 29, javax.swing.GroupLayout.PREFERRED_SIZE)))
    .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE))
    .addComponent(jScrollPane1, javax.swing.GroupLayout.PREFERRED_SIZE, 50, javax.swing.GroupLayout.PREFERRED_SIZE)
    .addGap(396, 396, 396)))
    .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING, false)
    .addComponent(jButton4, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE)
    .addComponent(jButton1, javax.swing.GroupLayout.DEFAULT_SIZE, 35, Short.MAX_VALUE))
    .addGap(12, 12, 12)
    .addComponent(jButton2, javax.swing.GroupLayout.PREFERRED_SIZE, 35, javax.swing.GroupLayout.PREFERRED_SIZE)
    .addGap(18, 18, 18)
    .addComponent(jButton3, javax.swing.GroupLayout.PREFERRED_SIZE, 33, javax.swing.GroupLayout.PREFERRED_SIZE)
    .addGap(48, 48, 48))


    private void jComboBox1ActionPerformed(java.awt.event.ActionEvent evt) {

    String name = (String)jComboBox1.getSelectedItem();
    // TODO add your handling code here:

    private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {

    private void jButton2ActionPerformed(java.awt.event.ActionEvent evt) {
    // TODO add your handling code here:

    private void jButton3ActionPerformed(java.awt.event.ActionEvent evt) {
    panel2.brighten(); // TODO add your handling code here:

    private void jButton4ActionPerformed(java.awt.event.ActionEvent evt) {
    String name = (String)jComboBox1.getSelectedItem();
    panel2.loadImage(name);// to load the image at the panel as a reset:

    * @param args the command line arguments
    public static void main(String args[]) {
    java.awt.EventQueue.invokeLater(new Runnable() {
    public void run() {
    new SampleJFrame2().setVisible(true);

    // Variables declaration – do not modify
    private javax.swing.JButton jButton1;
    private javax.swing.JButton jButton2;
    private javax.swing.JButton jButton3;
    private javax.swing.JButton jButton4;
    private javax.swing.JComboBox jComboBox1;
    private javax.swing.JLabel jLabel1;
    private javax.swing.JLabel jLabel2;
    private javax.swing.JLabel jLabel3;
    private javax.swing.JMenu jMenu1;
    private javax.swing.JMenuBar jMenuBar1;
    private javax.swing.JScrollPane jScrollPane1;
    private javax.swing.JTextPane jTextPane1;
    private JMenuItem exitItem;
    private ImageProcessingPanel panel1;
    private ImageProcessingPanel panel2;
    // End of variables declaration

    class ImageProcessingPanel extends JPanel {
    public void paintComponent(Graphics g) {
    if (image != null)
    g.drawImage(image, 0, 0, null);

    public void loadImage(String name) {
    Image loadedImage = Toolkit.getDefaultToolkit().getImage(name);
    MediaTracker tracker = new MediaTracker(this);
    tracker.addImage(loadedImage, 0);
    try {
    } catch (InterruptedException e) {
    image = new BufferedImage(loadedImage.getWidth(null), loadedImage
    .getHeight(null), BufferedImage.TYPE_INT_RGB);
    Graphics2D g2 = image.createGraphics();
    g2.drawImage(loadedImage, 0, 0, null);


    private void filter(BufferedImageOp op) {
    BufferedImage filteredImage = new BufferedImage(image.getWidth(),
    image.getHeight(), image.getType());
    op.filter(image, filteredImage);
    image = filteredImage;
    //to divide the image in matrices
    private void convolve(float[] elements) {
    Kernel kernel = new Kernel(3, 3, elements);
    ConvolveOp op = new ConvolveOp(kernel);

    public void blur() {
    float weight = 1.8f / 9.0f;
    float[] elements = new float[9];
    for (int i = 0; i < 9; i++)
    elements[i] = weight;

    public void sharpen() {
    float[] elements = { 0.0f, -1.0f, 0.0f, -1.0f, 5.f, -1.0f, 0.0f, -1.0f,
    0.0f };

    public void brighten()
    {float a = 1.5f;//scale factor
    float b = -9.0f;//offset
    RescaleOp op = new RescaleOp(a, b, null);

    private BufferedImage image;

  29. Patrick schreef:

    Hi Nadia,

    I will try and look into the code as soon as possible. In principle K-Means is not that hard. I would add the action itself and a configuration pane for the action parameters to your existing code. If I find some time I will see if I can create some code. I see that some people are interested in it, so it’s worth doing sometime!

  30. nadia schreef:

    thank patrick..i really really need ur help.hope u can help me as soon as possible.thank u so much..

  31. nadia schreef:

    hai patrick,
    sory for disturbing u.actually,,i really really in urgent.hope that u can help me in this short time.
    i’ve tried using k-means in matlab but it’s not working.so,right now i really hope from u.please help me..:-)..hope that u can help me solve my problem for k-means existing code that i’ve sent to u in java code..tq..tq..tq..

  32. Patrick schreef:


    I’m working on the code. The most important stuff to start with is to determine your K (your amount of clusters). This can be chosen in many ways. Next to that you will need initialization parameters of the clusters.
    Other things to think of are on which type of images you would like to do this clustering. Color images are more complex than grey-value images. You will also need to have a distance measure to determine what the distance from a pixel to a cluster is.

    Have you already thought about these things?

  33. nadia schreef:

    Hai patrick,
    thanz cause u give me some idea.i’ve already done it n it’s work.

  34. […] clustering in Java code found! JAVA Add comments My blogpost on K-means clustering has the highest number of views, so people are probably interested in it. Sadly enough I lost the […]

  35. jmenubar schreef:

    […] … Mail (will not be published) (required) Website. The Jersey Jargons is proudly powered by …K-means clustering implementation in JAVA | Patrick's playgroundDetails about K-Means Clustering on images: Before the algorithm starts, the user needs to set a […]

  36. Raghotham S schreef:

    Hi Sir,

    v hav a project in college – Image segmentation using K means
    i hav the code which u hav posted
    but am not knowin how to pass an image as input
    I need an applet to take the image input and output the results
    so pls help. Its v v urgent.
    Sorry for bothering u

  37. Patrick schreef:


    You can read a file as a BufferedImage and supply this to the K Means action. You could use ImageIO for reading a file to a BufferedImage. Have a look here for that.
    Hopefully this is of use for you.

  38. n n schreef:

    hello patrick.
    I want to ask your personal opinion.
    can this algorithm used to segment brain tissue from the non-brain tissues component?
    thank you 🙂

    • Patrick schreef:

      The K-means clustering algorithm creates clusters of pixel colors. Unless the pixels representing the brain tissue can be put in one or more clusters, it will probably result in a weak separation.
      You could also have a look at image segmentation. Fluid mostly is dark and bone mostly is white in grey-scale images. You could try to set your thresholds in such a way that these are left out of the image.

  39. n n schreef:

    okay thank you for your opinion. 😀
    one more thing, what is the best criteria to set the threshold value?

    • Patrick schreef:

      That completely depends on your image (on its grey values). For fluid and bone I suppose the threshold values should be somewhere at the edges of the histogram (near completely white and completely black). These values vary per image. If you would like to use it in an algorithm, a primitive algorithm could look at the image histogram and put the thresholds at particular values according to rules.
      E.g. only bone and tissue are visible / are not black / have ‘higher’ pixel values, so look for the peak in the histogram for the black pixels and put the threshold somewhere near there. You could do the same for bone as it has white pixel values which are at the opposite side of the histogram.

  40. Prabhu Ramanujam schreef:

    Hi Patrick,

    I am using k-means for document clustering. I identify the tf-idf and store that in hash table. Could you please help me how to process the next step. Please suggest me any ideas. Thanks in advance.

  41. Patrick schreef:

    Hi Prabhu,

    Are you clustering documents or do you want to cluster the words in a document? (What exactly is the goal you’re trying to achieve)?
    With clustering you pick cluster centers and based on a distance measure you assign objects/terms/whatsoever to these clusters.
    Think well about what you want your clusters to contain and how you define distance (what your distance measure would be).

  42. Prabhu Ramanujam schreef:


    Thanks for your reply.

    I want to cluster the documents and then want to take the words from the documents which are all clusters. If i got the cluster as {doc1, doc5, doc8}, then i want to get the similar words from these three documents. I plan to use Euclidean distance. Please help me out.

  43. Patrick schreef:

    To summarize: You want to cluster the documents based on a ‘cloud’ of words. The outcome of the algorithm will be clusters of documents.
    To determine in which cluster a document belongs, the words in the document are used. That means that you use the words as a distance measure to determine the distance to the cluster centers.
    Is this correct?

    And how do you plan to determine this distance? What’s the distance between two words? And between a collection of words?

  44. Nagaswathi schreef:

    sir,iam in need of kmeans algorithm in java.could you please upload the code in to this website.I will be verymuch thankful if you could do this.
    thanking you sir,

  45. Patrick schreef:

    Hi Nagaswathi,

    I’ve put the source code in this blogpost
    The complete code in which it was active can be found on downloads page.

    Good luck 🙂

  46. Lamia schreef:

    Salut à tous d’abord je vous salut pour ces belles performances ça fait du bien de voir des gens qui s’intéressent et qui sont aussi ambitieux!
    pour ma part je voudrais de l’aide s’il vous plaît concernant l’algorithme de k-means pour former des cluster(groupes,communautés) de fans de films(système de filtrage) help me please!je suis préssée je n’ai hélas pas beaucoup de temps!:merci d’avance!!!!

  47. Patrick schreef:

    Bonjour Lamia,

    Pour mieux vous aider: Quelle est exactement votre problème?

  48. dharma schreef:

    Patrick sir…
    please help me. sir,,,
    i have ur algorithm but that gives error n..Exception thread in “Main” ..NoClassDefFoundError…what can i do ..now …reply me soon…

    coming mar 10 is my project delivery .so please help me as soon as possible..

    • Patrick schreef:


      Please reply in the appropriate topic (in which you’ve posted earlier). The error you supply is completely generic. I can’t relate it directly to the code supplied with these posts.
      Apparently some class definition is missing. Please check if all classes are defined correctly.



  49. dharma schreef:

    Dear sir .

    thanks for ur reply..
    i have saved in KMeanAction which is class name ..

    but it will give error …

    Exeception thread “main” NoClassDefFoundError..

    i think that means Class KMeanAction doesnt defined..

    please reply me soon sir,,,,

    or tell me the correct way to do that procedures..
    1. in what name i want to save …
    2.where i put the cluster class code..
    3.how to give the input…

  50. dharma schreef:

    Dear sir..
    sorry for disturbing again…
    u have mentioned in both the programs the package name as actions…then’

    wht i can do.

    can i create new class and importing the two packages which is contains KMeansAction and ClusterClass…

    help me..please..

    v vvvvvvvvvvv urgent …

  51. Patrick schreef:

    Well.. yes, it seems obvious that k-means clustering is a particular operation, so I placed in a package actions.

    You could of course just pick both classes and place them into your own code / package hierarchy. Basically you need only those two classes. Just make sure that you specify the path to these classes according to where you’ve placed them.

    Then just perform the operation on a BufferedImage like so:

    action = new KMeansAction(BufferedImage image, int bins, int[]histogram, int initway);
    result = action.getResultImage();

    in which result will contain the resulting BufferedImage.

  52. kiran schreef:

    hey patrick i m pleased with your code bt i need d for k-means clustering for data sets say havg the info of vechicles so giv some basic idea to start with plz help me out i need do it for adaptive k -means toooooooooo so plz

  53. Patrick schreef:


    Please formulate your exact question. I cannot see how far you already are with your issue.
    What kind of info do you have?

    As an example:
    Let’s say you have information on top speed of your car dataset. By setting your K to 2, you will end up with two clusters of cars.
    In a more concrete way: if the top speeds range from 120 to 240 miles per hour, with a traditional K-Means clustering and a K of 2, you will end up with the following two clusters:
    Cluster 1: top speeds from 120 to 180
    Cluster 2: top speeds from 180 to 240

    Adaptive K-Means clustering would try to find the best K for your dataset. In contrary to traditional K-Means clustering, it actually looks at, and uses the data to determine the optimal K.

  54. deepak schreef:

    can anybody guide me on how to go about implementing BIRCh clustering algo….I tried googling but wasnt able to come up with a solution….any help would be appreciated

  55. Akash schreef:

    How to compile this code?????

  56. Akash,

    This code is not runnable directly as it doesn’t contain a main method. You can use the code in the referred blogpost in your existing application.
    If you’d like to compile the whole code (containing K-Means clustering and some more operations) you can download the Ispe development source code on my downloads page.



  57. Akash schreef:


  58. ritima schreef:

    hello sir,
    i want to use this k means algorithm for coloured and large images. what changes should i inculcate in the code so as to utilise it for my purpRose. Do reply its urgent. thanks in advance

  59. Memo schreef:

    hello patrick,
    thanx for your info abt k-means clustering… i m doing a project on plant disease detection n i need to work with color images.. will u plz help me implementing k-means on color images…???
    plz… rply sn..

  60. risa schreef:

    hello patrick..
    thank for your info about K-means clustering.. but i have question and i think i really need your help..please…T__T
    How clustering histogram of some images using K-means clustering??

  61. Hi Memo,

    Did you find any information on K-Means clustering on color images? (Clustering in 3 dimensional space?)



  62. Hi Risa,

    Are you looking to cluster a image histogram? This is actually the same as performing it on images: on images you take the grey value, which is an integer value and you cluster on this. The histogram also contains (integer?) values, which makes the algorithm exactly the same.
    In code, instead of supplying an image / running through all pixels of an image, you just provide the histogram values.
    Was that of any help?



  63. risa schreef:

    ex : i have images in image database.. first.. i extract the histogram. then want that image histogram clustered with K-means clustering… how can i implement in Java codes??
    examples the histogram codes are :
    private void makeHistogram() {
    // reset all values
    for (int i = 0; i < 256; i++)
    histogram[i] = 0;

    for (int h = 0; h < picpanel.getImage().getHeight(); h++) {
    for (int w = 0; w < picpanel.getImage().getWidth(); w++) {
    Color color = new Color(picpanel.getImage().getRGB(w, h));
    int greyvalue = color.getRed();
    histogram[greyvalue] += 1;

    thanks a lot before (for your inspiration and education)^___^

  64. tintin schreef:

    Hi Patrick,
    have you ever measured the time complexity of your clustering program?

  65. Hi tintin,

    Well, as it runs for every pixel in an image, I’d say the complexity is O(n^2).



  66. tintin schreef:

    Hi Patrick,
    what kind of method that you used to measure the distance? Euclidean or Manhattan? Thank’s for reply.

  67. tintin schreef:

    Hi Patrick,
    the O(n^2) is the big O or the time complexity?
    Thank you

  68. tintin schreef:

    Hi Patrick,
    I read the formula in fuzzyHe but I confuse with the way you calculate the distance, Di = SquareRoot ( | M^2 – si^2 | ) is it euclidean?
    thank you

  69. Hi tintin,

    The O(n^2) is the algorithm complexity in big O notation.
    Seeing the formula I’d say it’s euclidean indeed.



  70. tintin schreef:

    Thank you Patrick for your reply.
    Is it okay I use your program for my assignment?
    I’m measuring the time complexity of k-mean algorithm.


  71. benzon schreef:

    Hi, Sir Patrick i have a project which is image processing using kmeans, can you teach on how to set k=1 color of the background, k=2 color of the leaf, k=3 the color of the shadow around the leaf.? awating for your reply sir.

  72. Dipak schreef:

    Hello Sir,
    i am working on BIRCH clustering algorithm. i Found Birch clustering algorithm Paper Raghu Ramakrishna, Livey…1996.
    i want to modify the BIRCH clustering algorithm Splitting Statergy. But i am unable to understande How Birch is working step by spep Procedure with example and implemetation of this algorithm.

    Sir i am stuck here since last one month. Please Help Me regarding BIRCH Step-by-Step working and implemetation Help.

    Thank You Very much In advance. My email id: dipak07ce011@gmail.com.

  73. Mikin schreef:

    do you have java source code for word clustering.
    I have one text file having articles and names etc.
    i have to cluster that file according to the alphabet.
    can you help me out
    i could not found any kind of code for that.

    Thank you

    • Hi Mikin,

      I don’t have code for clustering a text file. However, I think it might be pretty easy to adjust the K-Means clustering code: instead of clustering the image (red) values, you can cluster on characters.



  74. Dipak Dabhi schreef:

    Hello Sir,

    i need help regarding BIRCH CLustering algorithm. i want to know what is input (which type of data) and output.

    i need java code of BIRCH Clustering algorithm.

    Sir Please help Me…i stuck here since last 3 month.

  75. sbairwa schreef:

    Hello Sir,
    I am working on Mini Batch k means algorithm for my post graduation and it requires the modification in the basic algorithm. I have tried a lot but in vain. I request you to kindly help me out in the way how can I make substantial changes in the algorithm that increases its efficiency. Any help would be appreciated

Geef een reactie

Het e-mailadres wordt niet gepubliceerd. Vereiste velden zijn gemarkeerd met *