Sep 07

My blogpost on K-means clustering has the highest number of views, so people are probably interested in it. Sadly enough I lost the source code of the K-means action a while ago. Last week I needed an external harddisk to make a back-up of some files. There was already some content on the disk. I found quite some pieces of code including the K-means code. Although it is quite simple code operating on (if I remember correctly 8-bit) greyscale images, it might give some insights in how to do this.

The whole code file is presented below. For more information you can view my earlier blogpost on K-means clustering.

package actions;
 
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
 
/**
 * This KMeansAction performs a K-means clustering action on a BufferedImage
 * @author Patrick van Kouteren
 *
 */
 
public class KMeansAction {
 
		BufferedImage image_temp;
		boolean not_terminated;
		int loops, changedPixels;
		int[] histogram;
		ArrayList classes;
		int [] lowerbounds;
		public final static int MEAN_BY_MOD = 1;
		public final static int MEAN_BY_SPACE = 2;
		public final static int MEAN_AT_RANDOM = 3;
 
		/**
		 * Controls the actual work:
		 * - Initialization
		 * - Loop until termination condition is met
		 *  + for each pixel: assign pixel to a class such that the distance from the pixel to the mean of that class is minimized
		 *  + for each class: recalculate the means of the class based on pixels belonging to that class
		 * - End loop
		 * @param image
		 * @param bins (k)
		 * @param histogram
		 */
		public KMeansAction(BufferedImage image, int bins, int[]histogram, int initway) {
			this.histogram = histogram;
			lowerbounds = new int[bins];
			initialize(image, bins, initway);
			calculateBounds();
			while (not_terminated) {
				recalculateMeans();
				loops++;
				checkTermination();
				}
			processImage(image, bins);
		}
 
		/**
		 * Set the new color values for the image
		 * @param image
		 */
		private void processImage(BufferedImage image, int bins) {
			int delta = 255 / (bins-1);
			for (int h = 0; h < image.getHeight(); h++){
				for (int w = 0; w < image.getWidth(); w++){
					Color rgb = new Color(image.getRGB(w, h));
					int grey = rgb.getRed();
					for (int i = 0; i classes.get(i).lowerbound && grey < classes.get(i).upperbound) {
							int g = i*delta;
							image_temp.setRGB(w,h,(new Color(g, g, g)).getRGB());
						}
					}
				}
			}
		}
 
		/**
		 * Returns the image created by the processImage method
		 * @return the result image
		 */
		public BufferedImage getResultImage() {
			return image_temp;
		}
 
		/**
		 * Just for fun: returns the number of loops which were needed for getting a stable result
		 * @return number of loops for stable result
		 */
		public int getLoops(){
			return loops;
		}
 
		/**
		 * Initializes the algorithm. Creates k ClusterClasses and puts them into a LinkedList
		 * @param image
		 * @param bins
		 */
		@SuppressWarnings("unchecked")
		private void initialize(BufferedImage image, int bins, int initway){
			image_temp = image;
			loops = 0;
			changedPixels = 0;
			not_terminated = true;
			classes = new ArrayList();
			for (int i = 0; i < bins; i++) {
				ClusterClass cc = new ClusterClass(createMean(initway, bins, i, image));
				classes.add(cc);
			}
 
		}
 
		/**
		 * Controls the calculations of the upper- and lowerbounds of ClusterClasses and sets them
		 *
		 */
		private void calculateBounds() {
			for (int i = 0; i < classes.size(); i++){
				int lb = calculateLowerBound(classes.get(i));
				lowerbounds[i] = lb;
				classes.get(i).setBounds(lb,calculateUpperBound(classes.get(i)) );
				}
		}
 
		/**
		 * Does the actual calculation of the lowerbound
		 * @param ClusterClass
		 * @return Lowerbound
		 */
		private int calculateLowerBound(ClusterClass cc) {
			int cMean = cc.getMean();
			int currentBound = 0;
			for (int i = 0; i< classes.size(); i++) { 					if (cMean > classes.get(i).getMean()) {
						currentBound = Math.max((cMean + classes.get(i).getMean())/2, currentBound);
					}
					else {
					}
				}
			return currentBound;
			}
 
		/**
		 * Does the actual calculation of the upperbound
		 * @param ClusterClass
		 * @return Upperbound
		 */
		private int calculateUpperBound(ClusterClass cc) {
				int cMean = cc.getMean();
				int currentBound = 255;
				for (int i = 0; i< classes.size(); i++) {
						if (cMean < classes.get(i).getMean()) {
							currentBound = Math.min((cMean + classes.get(i).getMean())/2, currentBound);
						}
						else {}
					}
				return currentBound;
				}
 
		/**
		 * Takes care of the recalculation of the means of the ClusterClasses
		 *
		 */
		private void recalculateMeans() {
			for (int i = 0; i= 50) {
				not_terminated = false;
			}
			if (changedPixels <= 300) {
				not_terminated = false;
			}
		}
 
		private void calculateChangedPixels() {
			int changed = 0;
			for (int i = 0; i< lowerbounds[i]) {
					for (int j = c; j lowerbounds[i]) {
					for (int j = lowerbounds[i]; j< image.getHeight(); h++){
					for (int w = 0; w < image.getWidth(); w++){
						pixelindex+=1;
						if (pixelindex % bins == index) {
							Color rgb = new Color(image.getRGB(w, h));
							sum+= rgb.getRed();
							value+=1;
						}
					}}
				return sum/value;
 
			case MEAN_BY_SPACE:
				return (int)(255 / (bins-1) * index);
			case MEAN_AT_RANDOM:
				Double dmean = Math.random() * 255;
				return (int) Math.floor(dmean);
			default:
				return 0;
			}
		}
}

In addition to this, the custom class ClusterClass is defined as:

package actions;
 
/**
 * The ClusterClass is just a class holding the important cluster properties.
 * @author Patrick van Kouteren
 *
 */
 
public class ClusterClass {
	int mean, upperbound, lowerbound;
 
	public ClusterClass(int m) {
		mean = m;
	}
 
	public void setBounds(int lb, int ub) {
		lowerbound = lb;
		upperbound = ub;
	}
 
	public void setMean(int i) {
		mean = i;
	}
 
	public int getMean() {
		return mean;
	}
 
	public int getLowerBound() {
		return lowerbound;
	}
 
	public int getUpperBound() {
		return upperbound;
	}
 
	public void calculateMean(int [] histogram) {
		int tempMean = 0;
		int counter = 0;
		for (int i = lowerbound; i<= upperbound; i++) {
			counter += histogram[i];
			tempMean += histogram[i] * i;
		}
		mean = tempMean / counter;
	}
 
}

The source code might not be completely visible. It can be viewed in a blank screen here. As mentioned in the replies to this post, I forgot to add the ClusterClass. It can be viewed in a blank screen here.

Share and Enjoy:
  • Digg
  • del.icio.us
  • Facebook
  • Google Bookmarks
  • email
  • LinkedIn
  • Live
  • NuJIJ
  • Print
  • Slashdot
  • TwitThis
  • Yahoo! Buzz
  • eKudos
  • Hyves
  • MySpace
  • StumbleUpon
  • Technorati
  • MSN Reporter
  • Symbaloo
  • Twitter
  • Reddit

26 Responses to “K-means clustering in Java code found!”

  1. Антоха says:

    Данный пост очень информативен, спасибо Вам!

    Edit: please in English as other people might benefit from possible questions and / or solutions.
    Translated: This post is very informative, thank you!

  2. pinaka says:

    in your programm you have used ClusterClass.what is it?
    when i compile the code it’s giving error

    cannot find symbol:ClusterClass

    what should i do?

  3. Patrick says:

    Hi pinaka, thanks for your interest. I’ll look into that as soon as possible. Unfortunately I’ve left my notebook with the code at school. I’ll reply as soon as I’ve solved the issue.

    Cheers,
    Patrick

  4. Patrick says:

    I see what the issue is: it’s a custom class. I’ll add it tomorrow :)

  5. Ivan says:

    Hi Patrick,

    really appreciate your k-means clustering in Java because i am planning to implement it into j2me.

    hope you could add the custom class (ClusterClass) soon so i could try it out on the desktop first.

    Thanks.

  6. Patrick says:

    I’ve updated the post with a link to the file as well as the contents in the post itself. Note that these classes are actually part of an application which I was writing, so they might have been written in the context of this application.

    I would like to continue this application and make it public here some day, but I don’t have the time to do that just now.

    If there are any questions, sugesstions, or whatsoever I’d be glad to hear them!

  7. Prashanth says:

    Is it possible for you to give me an equivalent C# code

  8. Patrick says:

    Prashanth,

    If I would know how to do it I would. Unfortunately I haven’t had time to learn any version of C yet. However, I guess that the code itself gives a reasonable impression of how to do it in C#.

    Cheers,

    Patrick

  9. vinod says:

    hello patrick,

    Do u have CURE & CHAMELON clustering algorithm implemented in java

  10. Patrick says:

    Hi Vinod,

    I’m afraid not. I must admit that I haven’t looked into the code recently. If I’d only have some more time I would. Do you happen to have a clear data source about Cure & Chamelon clustering from which an implementation can be derived?

  11. dekompi says:

    Very nice progie.
    But where the main method sir?

    Regards

  12. Patrick says:

    dekompi: As this action is part of an application, the main method is not present in this class. It’s an action class which receives its info from the main frame. The main frame is basically a frame which shows images and adds file options.

  13. Raghotham S says:

    Hi Sir,
    I hav a project in college – Image segmentation using kmeans

    i hav the code which u hav posted
    but am not knowin how to pass an image as input
    I need to design an applet which takes image as input and outputs the result.
    pls help its v v urgent.
    Thanks in advance

  14. Raghotham S says:

    can u pl giv me a code which can segment image without that histogram input?
    Or else can u explain how to generate histogram for an image?

  15. Patrick says:

    Raghotham S:

    The segmentation is done on the histogram. The histogram of an image is basically a ‘graph’ of all pixel values of the image. In greyscale images the red value is equal to the green and the blue value. When having an image with 256 colors, your histogram contains of 256 bins. You simply count the pixels having a red (or green or blue) value of 0, 1, 2, …, 255. I will try and look up the code for the application tonight.

  16. Raghotham S says:

    Ok thanks a lot!!

  17. Patrick says:

    Raghotham S:

    Full source code can now be found here or at the download page

    Good luck!

  18. [...] there was some interest in source code in two earlier posts (post 1, post 2) I've posted the source code [...]

  19. Raghotham S says:

    Thanks a ton… i owe u :)

  20. Raghotham S says:

    one last doubt patrick,
    the ispe.java doesn’t work for most of the images whereas the ispe.jar file works well with almost all images. Why is it so?

  21. Raghotham S says:

    figured it out :D
    thanks again for the code :)

  22. Patrick says:

    Did it have to do with the classpath and the ImageIO jar file in the lib directory?
    That’s what I expected..

  23. Raghotham S says:

    it needed a slight modification for the buffered image handling part. :)

  24. Chit Su Hlaing says:

    Thank you for your open source coeds. Thank you so much! I wish you that you can have the more great logic and think for your great life.

  25. shipra says:

    hi! cud u pls tell the logic/concept dat u have used in k mean clustering…(i cud’nt get it 100%..).i tried your code ..it gives very good results..generally its said that k mean gives good result ..depending on the initial seed pixels..but howcome your code gives good result (same) for all three options..K,distributed over color space..,randomly
    …cud u pls elaborate distributed over color space and k..(the theorey part)..
    neways very good prog..
    thanks..

  26. Patrick says:

    Shipra,

    I don’t think the result is the same for all three types of initialization. That would be a coincidence. What K-means actually tries to do is to find clusters of likely ‘equal’ pixel values. It all depends on the original image histogram. Especially when the mean is distributed over the color space.
    I can’t imagine my code always gives good results for randomly chosen means. In the worst case all means chosen are close to each other and at one side of the histogram.

    Come to think of it: the means could also be chosen in a weighted way. This would perhaps give some good results in all cases.

Leave a Reply

preload preload preload