<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Patrick&#039;s playground &#187; clustering</title>
	<atom:link href="http://www.vankouteren.eu/blog/tag/clustering/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.vankouteren.eu/blog</link>
	<description>Random thoughts, problems and solutions</description>
	<lastBuildDate>Sun, 29 Jan 2012 07:53:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>K-means clustering in Java code found!</title>
		<link>http://www.vankouteren.eu/blog/2009/09/k-means-clustering-in-java-code-found/</link>
		<comments>http://www.vankouteren.eu/blog/2009/09/k-means-clustering-in-java-code-found/#comments</comments>
		<pubDate>Mon, 07 Sep 2009 13:36:45 +0000</pubDate>
		<dc:creator>Patrick van Kouteren</dc:creator>
				<category><![CDATA[JAVA]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[k-means]]></category>
		<category><![CDATA[K-means clustering]]></category>

		<guid isPermaLink="false">http://www.vankouteren.eu/blog/?p=144</guid>
		<description><![CDATA[My blogpost on K-means clustering has the highest number of views, so people are probably interested in it. Sadly enough I lost the source code of the K-means action a while ago. Last week I needed an external harddisk to make a back-up of some files. There was already some content on the disk. I [...]]]></description>
			<content:encoded><![CDATA[            <script type="text/javascript" src="http://www.vankouteren.eu/blog/wp-content/plugins/wordpress-code-snippet/scripts/shBrushJava.js"></script>
<p>My <a title="K-means clustering implementation in JAVA" href="http://www.vankouteren.eu/blog/2007/10/k-means-clustering-implementation-in-java/">blog post on K-means clustering</a> has the highest number of views, so people are apparently interested in it. Unfortunately, I lost the source code of the K-means action a while ago. Last week I needed an external hard disk to back up some files, and the disk already held some old content. Among it I found quite a few pieces of code, including the K-means code. Although it is fairly simple code that operates on greyscale images (8-bit, if I remember correctly), it might give some insight into how to do this.</p>
<p><span id="more-144"></span>The complete source file is shown below. For more information, see <a title="K-means clustering implementation in JAVA" href="http://www.vankouteren.eu/blog/2007/10/k-means-clustering-implementation-in-java/">my earlier blog post</a> on K-means clustering.</p>
<p><pre class="brush: java">package actions;

import java.awt.Color;
import java.awt.image.BufferedImage;
import java.util.ArrayList;

/**
 * This KMeansAction performs a K-means clustering action on a BufferedImage
 * @author Patrick van Kouteren
 *
 */

public class KMeansAction {

		BufferedImage image_temp;
		boolean not_terminated;
		int loops, changedPixels;
		int[] histogram;
		ArrayList&lt;ClusterClass&gt; classes;
		int [] lowerbounds;
		public final static int MEAN_BY_MOD = 1;
		public final static int MEAN_BY_SPACE = 2;
		public final static int MEAN_AT_RANDOM = 3;

		/**
		 * Controls the actual work:
		 * - Initialization
		 * - Loop until termination condition is met
		 *  + for each pixel: assign pixel to a class such that the distance from the pixel to the mean of that class is minimized
		 *  + for each class: recalculate the means of the class based on pixels belonging to that class
		 * - End loop
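		 * @param image the greyscale image to cluster
		 * @param bins the number of clusters (k)
		 * @param histogram the grey-level histogram of the image
		 * @param initway how to choose the starting means: MEAN_BY_MOD, MEAN_BY_SPACE or MEAN_AT_RANDOM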
		 */
		public KMeansAction(BufferedImage image, int bins, int[]histogram, int initway) {
			this.histogram = histogram;
			lowerbounds = new int[bins];
			initialize(image, bins, initway);
			calculateBounds();
			while (not_terminated) {
				recalculateMeans();
				loops++;
				checkTermination();
				}
			processImage(image, bins);
		}

		/**
		 * Set the new color values for the image
		 * @param image
		 */
		private void processImage(BufferedImage image, int bins) {
			int delta = 255 / (bins-1);
			for (int h = 0; h &lt; image.getHeight(); h++){
				for (int w = 0; w &lt; image.getWidth(); w++){
					Color rgb = new Color(image.getRGB(w, h));
					int grey = rgb.getRed();
					for (int i = 0; i &lt; classes.size(); i++) {
						// Inclusive bounds, so that values on a bound (including 0 and 255) are assigned too
						if (grey &gt;= classes.get(i).lowerbound &amp;&amp; grey &lt;= classes.get(i).upperbound) {
							int g = i*delta;
							image_temp.setRGB(w,h,(new Color(g, g, g)).getRGB());
						}
					}
				}
			}
		}

		/**
		 * Returns the image created by the processImage method
		 * @return the result image
		 */
		public BufferedImage getResultImage() {
			return image_temp;
		}

		/**
		 * Just for fun: returns the number of loops which were needed for getting a stable result
		 * @return number of loops for stable result
		 */
		public int getLoops(){
			return loops;
		}

		/**
		 * Initializes the algorithm. Creates k ClusterClasses and puts them into an ArrayList
		 * @param image
		 * @param bins
		 */
		@SuppressWarnings(&quot;unchecked&quot;)
		private void initialize(BufferedImage image, int bins, int initway){
			image_temp = image;
			loops = 0;
			changedPixels = 0;
			not_terminated = true;
			classes = new ArrayList&lt;ClusterClass&gt;();
			for (int i = 0; i &lt; bins; i++) {
				ClusterClass cc = new ClusterClass(createMean(initway, bins, i, image));
				classes.add(cc);
			}

		}

		/**
		 * Controls the calculations of the upper- and lowerbounds of ClusterClasses and sets them
		 *
		 */
		private void calculateBounds() {
			for (int i = 0; i &lt; classes.size(); i++){
				int lb = calculateLowerBound(classes.get(i));
				lowerbounds[i] = lb;
				classes.get(i).setBounds(lb,calculateUpperBound(classes.get(i)) );
				}
		}

		/**
		 * Does the actual calculation of the lowerbound
		 * @param ClusterClass
		 * @return Lowerbound
		 */
		private int calculateLowerBound(ClusterClass cc) {
			int cMean = cc.getMean();
			int currentBound = 0;
			for (int i = 0; i &lt; classes.size(); i++) {
				if (cMean &gt; classes.get(i).getMean()) {
					currentBound = Math.max((cMean + classes.get(i).getMean())/2, currentBound);
				}
			}
			return currentBound;
			}

		/**
		 * Does the actual calculation of the upperbound
		 * @param ClusterClass
		 * @return Upperbound
		 */
		private int calculateUpperBound(ClusterClass cc) {
				int cMean = cc.getMean();
				int currentBound = 255;
				for (int i = 0; i &lt; classes.size(); i++) {
						if (cMean &lt; classes.get(i).getMean()) {
							currentBound = Math.min((cMean + classes.get(i).getMean())/2, currentBound);
						}
					}
				return currentBound;
				}

		/**
		 * Takes care of the recalculation of the means of the ClusterClasses
		 *
		 */
		private void recalculateMeans() {
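			for (int i = 0; i &lt; classes.size(); i++) {
				classes.get(i).calculateMean(histogram);
			}
		}

		// Assumption: the start of this method was lost in the feed; the calls before the
		// two checks are rebuilt from the algorithm description and may differ from the original.
		private void checkTermination() {
			calculateChangedPixels();
			calculateBounds();
			if (loops &gt;= 50) {
				not_terminated = false;
			}
			if (changedPixels &lt;= 300) {
				not_terminated = false;
			}
		}

		// Assumption (rebuilt from the surviving fragments): counts the pixels whose grey
		// value lies between the old and the new lowerbound of each class.
		private void calculateChangedPixels() {
			int changed = 0;
			for (int i = 0; i &lt; classes.size(); i++) {
				int c = calculateLowerBound(classes.get(i));
				if (c &lt; lowerbounds[i]) {
					for (int j = c; j &lt; lowerbounds[i]; j++) {
						changed += histogram[j];
					}
				} else if (c &gt; lowerbounds[i]) {
					for (int j = lowerbounds[i]; j &lt; c; j++) {
						changed += histogram[j];
					}
				}
			}
			changedPixels = changed;
		}

		// Assumption: the signature is rebuilt from the call in initialize().
		private int createMean(int initway, int bins, int index, BufferedImage image) {
			switch (initway) {
			case MEAN_BY_MOD:
				int pixelindex = 0, sum = 0, value = 0;
				for (int h = 0; h &lt; image.getHeight(); h++){
					for (int w = 0; w &lt; image.getWidth(); w++){
						pixelindex+=1;
						if (pixelindex % bins == index) {
							Color rgb = new Color(image.getRGB(w, h));
							sum+= rgb.getRed();
							value+=1;
						}
					}}
				return value == 0 ? 0 : sum/value;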

			case MEAN_BY_SPACE:
				return (int)(255 / (bins-1) * index);
			case MEAN_AT_RANDOM:
				Double dmean = Math.random() * 255;
				return (int) Math.floor(dmean);
			default:
				return 0;
			}
		}
}</pre>
In addition to this, the custom class ClusterClass is defined as:
<pre class="brush: java">package actions;

/**
 * The ClusterClass is just a class holding the important cluster properties.
 * @author Patrick van Kouteren
 *
 */

public class ClusterClass {
	int mean, upperbound, lowerbound;

	public ClusterClass(int m) {
		mean = m;
	}

	public void setBounds(int lb, int ub) {
		lowerbound = lb;
		upperbound = ub;
	}

	public void setMean(int i) {
		mean = i;
	}

	public int getMean() {
		return mean;
	}

	public int getLowerBound() {
		return lowerbound;
	}

	public int getUpperBound() {
		return upperbound;
	}

	public void calculateMean(int [] histogram) {
		int tempMean = 0;
		int counter = 0;
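		for (int i = lowerbound; i &lt;= upperbound; i++) {
			counter += histogram[i];
			tempMean += histogram[i] * i;
		}
		// Keep the old mean when no pixel falls inside the bounds (avoids division by zero)
		if (counter &gt; 0) {
			mean = tempMean / counter;
		}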
	}

}</pre></p>
<p>The source code might not be completely visible on this page. You can view the full file on a separate page <a title="View code in blank screen" href="http://www.vankouteren.eu/downloads/KMeansAction.java">here</a>. As pointed out in the replies to this post, I originally forgot to add the ClusterClass. It can also be viewed on a separate page <a title="ClusterClass" href="http://www.vankouteren.eu/downloads/ClusterClass.java" target="_blank">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.vankouteren.eu/blog/2009/09/k-means-clustering-in-java-code-found/feed/</wfw:commentRss>
		<slash:comments>69</slash:comments>
		</item>
		<item>
		<title>K-means clustering implementation in JAVA</title>
		<link>http://www.vankouteren.eu/blog/2007/10/k-means-clustering-implementation-in-java/</link>
		<comments>http://www.vankouteren.eu/blog/2007/10/k-means-clustering-implementation-in-java/#comments</comments>
		<pubDate>Thu, 18 Oct 2007 18:29:44 +0000</pubDate>
		<dc:creator>Patrick van Kouteren</dc:creator>
				<category><![CDATA[JAVA]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[k-means]]></category>

		<guid isPermaLink="false">http://www.vankouteren.eu/blog/?p=5</guid>
		<description><![CDATA[Details about K-Means Clustering on images: Before the algorithm starts, the user needs to set a number of greyvalues (bins). The resulting image will contain that number of greyvalues. With that number of bins (called ‘k’) the algorithm clusters the greyvalues of the image into k clusters and once the algorithm is terminated, every cluster [...]]]></description>
			<content:encoded><![CDATA[            <script type="text/javascript" src="http://www.vankouteren.eu/blog/wp-content/plugins/wordpress-code-snippet/scripts/shBrushJava.js"></script>
<p class="MsoNormal"><strong><span lang="EN-GB">Details about K-Means Clustering on images:</span></strong><span lang="EN-GB"></span></p>
<p class="MsoNormal"><span lang="EN-GB">Before the algorithm starts, the user chooses the number of grey values (bins) that the resulting image will contain.<br />
With that number of bins (called ‘k’), the algorithm groups the grey values of the image into k clusters; once the algorithm has terminated, every cluster has its own grey value.<br />
Before starting the algorithm, you also have to decide:</span></p>
<ul type="disc">
<li class="MsoNormal"><span lang="EN-GB">How to choose the starting means of the clusters before the first iteration.</span></li>
<li class="MsoNormal"><span lang="EN-GB">What the stopping criteria are.</span></li>
</ul>
<p class="MsoNormal"><strong><span lang="EN-GB">The Algorithm:</span></strong><span lang="EN-GB"></span></p>
<p class="MsoNormal"><span lang="EN-GB">In short, this is what the algorithm is supposed to do:<br />
Initialize (set k, the starting means and the stopping criteria).<br />
Loop while the termination condition isn’t met (</span></p>
<ul type="disc">
<li class="MsoNormal"><span lang="EN-GB">For each pixel: assign the pixel to the class whose center (its mean) is closest to the pixel value.</span></li>
<li class="MsoNormal"><span lang="EN-GB">For each class: recalculate the mean of the class from the pixels that now belong to it.</span></li>
</ul>
<p><span lang="EN-GB">)</span></p>
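<p>A minimal sketch of this loop in Java, working on a 256-bin grey-level histogram instead of on the individual pixels. The class and method names are hypothetical. For brevity the sketch stops when the means no longer move or after a maximum number of loops, which is an assumption and simpler than the stopping criteria described further on:</p>

```java
/** Hypothetical sketch: one-dimensional k-means on a grey-level histogram. */
final class HistogramKMeansSketch {

    /** Index of the mean nearest to the grey value (ties go to the lower index). */
    static int nearestMean(int grey, int[] means) {
        int best = 0;
        for (int c = 1; c < means.length; c++) {
            if (Math.abs(grey - means[c]) < Math.abs(grey - means[best])) {
                best = c;
            }
        }
        return best;
    }

    /** Repeats the assignment and update steps until the means are stable or maxLoops is reached. */
    static int[] cluster(int[] histogram, int[] startingMeans, int maxLoops) {
        int[] means = startingMeans.clone();
        for (int loop = 0; loop < maxLoops; loop++) {
            long[] sum = new long[means.length];
            long[] count = new long[means.length];
            // Assignment step: every grey value goes to the class with the nearest mean.
            for (int grey = 0; grey < histogram.length; grey++) {
                int c = nearestMean(grey, means);
                sum[c] += (long) histogram[grey] * grey;
                count[c] += histogram[grey];
            }
            // Update step: each mean becomes the average grey value of its class.
            boolean moved = false;
            for (int c = 0; c < means.length; c++) {
                if (count[c] == 0) {
                    continue; // an empty class keeps its old mean
                }
                int updated = (int) (sum[c] / count[c]);
                if (updated != means[c]) {
                    means[c] = updated;
                    moved = true;
                }
            }
            if (!moved) {
                break;
            }
        }
        return means;
    }
}
```

<p>For example, a histogram with ten pixels at each of the grey values 10, 20, 200 and 220 and the starting means 0 and 255 ends with the means 15 and 210.</p>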
<p class="MsoNormal"><strong><span lang="EN-GB">My implementation:</span></strong><span lang="EN-GB"></span></p>
<p class="MsoNormal"><span lang="EN-GB">The user sets k directly.<br />
So far I’ve implemented three ways to choose the starting means:</span></p>
<ol type="1">
<li class="MsoNormal"><span lang="EN-GB">i mod k: the pixel at index i is assigned to class i modulo k.</span></li>
<li class="MsoNormal"><span lang="EN-GB">Spread the means over the color space: the k means are spread evenly over the complete color space of the image.</span></li>
<li class="MsoNormal"><span lang="EN-GB">Random: just as it says. Given k, the algorithm picks k random mean values.</span></li>
</ol>
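<p>The three options could look roughly like this. It is a sketch with hypothetical names, assuming the image is available as an array of 8-bit grey values; for the first option it also assumes that the starting mean of a class is the average of the pixels assigned to it:</p>

```java
import java.util.Random;

/** Hypothetical sketch of three ways to choose k starting means. */
final class StartingMeansSketch {

    /** "i mod k": pixel i starts in class i % k; each starting mean is the average of its class. */
    static int[] byModulo(int[] pixels, int k) {
        long[] sum = new long[k];
        int[] count = new int[k];
        for (int i = 0; i < pixels.length; i++) {
            sum[i % k] += pixels[i];
            count[i % k]++;
        }
        int[] means = new int[k];
        for (int c = 0; c < k; c++) {
            means[c] = count[c] == 0 ? 0 : (int) (sum[c] / count[c]);
        }
        return means;
    }

    /** Means spaced evenly over the grey values 0..255; needs k >= 2. */
    static int[] evenlySpaced(int k) {
        int[] means = new int[k];
        for (int c = 0; c < k; c++) {
            means[c] = 255 * c / (k - 1);
        }
        return means;
    }

    /** k random means between 0 and 255 (inclusive). */
    static int[] random(int k, Random rng) {
        int[] means = new int[k];
        for (int c = 0; c < k; c++) {
            means[c] = rng.nextInt(256);
        }
        return means;
    }
}
```

<p>Random starting means can give a different result on every run, so passing in a seeded Random makes a run repeatable.</p>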
<p class="MsoNormal"><span lang="EN-GB">The termination criteria are currently not exposed to users. There are two of them:<br />
Terminate when fewer than n pixels change classes after a recalculation of the means.<br />
I’ve set n to 300, which is pretty small for images bigger than 512 by 512 pixels. In addition, the algorithm terminates when it needs more than j iterations to reach a stable result (stable meaning that no more than n pixels change classes after a recalculation of the means). My j is currently set to 50. Most of the time the algorithm terminates because fewer than 300 pixels have changed classes.</span></p>
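<p>The two criteria can be sketched as a single check. The names are hypothetical, and whether the comparisons are strict is an assumption, since the description leaves that open:</p>

```java
/** Hypothetical sketch of the two stopping criteria. */
final class TerminationSketch {
    static final int MAX_LOOPS = 50;    // j in the text
    static final int PIXEL_LIMIT = 300; // n in the text

    /** Counts the pixels whose class index differs between two assignments. */
    static int changedPixels(int[] before, int[] after) {
        int changed = 0;
        for (int i = 0; i < before.length; i++) {
            if (before[i] != after[i]) {
                changed++;
            }
        }
        return changed;
    }

    /** True when the loop limit is reached or fewer than PIXEL_LIMIT pixels changed classes. */
    static boolean shouldStop(int loops, int changed) {
        return loops >= MAX_LOOPS || changed < PIXEL_LIMIT;
    }
}
```

<p>A fixed limit of 300 pixels does not scale with the image size; expressing the limit as a fraction of the pixel count would be one way around that.</p>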
<p class="MsoNormal"><span lang="EN-GB">Now that we’ve seen how the parameters of the algorithm are set, let’s look at how I’ve implemented the algorithm in code and at the decisions I’ve made.</span></p>
<p class="MsoNormal"><span lang="EN-GB">I’ve divided the code over three classes:</span></p>
<p class="MsoNormal"><span lang="EN-GB">One builds the JDialog that asks the user how the algorithm should be initialized.<br />
One contains the actual algorithm, and the last one is the ClusterClass.</span></p>
<p class="MsoNormal"><span lang="EN-GB">Because the class with the JDialog is not that interesting, we’ll focus on the other two classes.</span></p>
<p class="MsoNormal"><span lang="EN-GB">The ClusterClass is pretty simple: it only holds a mean, an upperbound and a lowerbound.<br />
I chose to store the bounds in this class for the following reason. When the algorithm is initialized, k classes are created (and put into an ArrayList). You could let each class hold the pixels that belong to it, but with a big image the complete image would then be in memory twice: once as the original image and once spread over the ClusterClasses. Instead, each class holds only its bounds, so that the same for-loop that reads the pixel values can also check which class each pixel belongs to.</span></p>
<p class="MsoNormal"><span lang="EN-GB">As mentioned earlier, a pixel belongs to the class whose mean is closest to its pixel value. Because my ClusterClasses hold an upper- and a lowerbound, this comes down to checking whether the pixel value lies between the bounds of a class. To find the lowerbound of a class, take the nearest mean with a lower value; to find the upperbound, take the nearest mean with a higher value. The bound is then simply the average of the two means.<br />
After every pixel is assigned to a class (in my case: it can be checked to which class it belongs), the mean of each class is recalculated by summing the pixel values that belong to that ClusterClass and dividing the sum by the number of pixels in the ClusterClass.<br />
After the means are recalculated, the upper- and lowerbounds need to be recalculated as well.<br />
After this iteration, the termination condition is checked. If it isn’t met, another iteration follows. If it is met, the clusters are final and the colors of the image can be recalculated.</span></p>
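<p>The bound calculation described above can be sketched as follows. The names are hypothetical; a class without a lower neighbour keeps the lowerbound 0 and a class without a higher neighbour keeps the upperbound 255, assuming 8-bit grey values:</p>

```java
/** Hypothetical sketch: class bounds as midpoints between neighbouring means. */
final class BoundsSketch {

    /** Returns {lowerbound, upperbound} for the class whose mean is means[index]. */
    static int[] bounds(int[] means, int index) {
        int lower = 0;
        int upper = 255;
        for (int other : means) {
            int midpoint = (means[index] + other) / 2;
            if (other < means[index]) {
                lower = Math.max(lower, midpoint); // nearest lower mean gives the highest midpoint
            }
            if (other > means[index]) {
                upper = Math.min(upper, midpoint); // nearest higher mean gives the lowest midpoint
            }
        }
        return new int[] {lower, upper};
    }
}
```

<p>With the means 50, 100 and 200, the middle class gets the bounds 75 and 150, the first class 0 and 75, and the last class 150 and 255.</p>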
<p class="MsoNormal"><strong><span lang="EN-GB">And now briefly in Java:</span></strong><span lang="EN-GB"></span></p>
<p class="MsoNormal"><span lang="EN-GB">public KMeansAction:</span></p>
<p class="MsoNormal"><span lang="EN-GB">initialize<br />
calculateBounds<br />
while (not_terminated) do:</span></p>
<ul type="disc">
<li class="MsoNormal"><span lang="EN-GB">recalculateMeans</span></li>
<li class="MsoNormal"><span lang="EN-GB">recalculateBounds</span></li>
<li class="MsoNormal"><span lang="EN-GB">checkTermination</span></li>
</ul>
<p class="MsoNormal"><span lang="EN-GB">processImage</span></p>
<p class="MsoNormal"><span lang="EN-GB">private void processImage:</span></p>
<p class="MsoNormal"><span lang="EN-GB">// This works for 8-bit greyscale images<br />
// It calculates the greyvalues that will occur in the resulting image<br />
delta = 255 / (k - 1)<br />
for every pixel do:<br />
for every class do:<br />
if a pixel belongs to that class then</span></p>
<p class="MsoNormal">// set the greyvalue of that pixel to the index of the class in the list times delta</p>
<p class="MsoNormal"><span lang="EN-GB"> greyvalue = classindex * delta<br />
// then set the rgbvalue of that pixel to the greyvalue<br />
newImage.setRGB(pixel location, greyvalue)</span></p>
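<p>A runnable sketch of these steps. It assumes an image of type BufferedImage.TYPE_INT_RGB in which red, green and blue are equal, and it assumes the upper bounds of the classes are passed in ascending order. The class and method names are hypothetical:</p>

```java
import java.awt.image.BufferedImage;

/** Hypothetical sketch of the processImage step for 8-bit grey values. */
final class ProcessImageSketch {

    /** Replaces every grey value by classIndex * delta, where delta = 255 / (k - 1). */
    static void quantize(BufferedImage image, int[] upperBounds) {
        int k = upperBounds.length; // needs k >= 2
        int delta = 255 / (k - 1);
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                // Read the red channel; in a greyscale image all three channels are equal.
                int grey = (image.getRGB(x, y) >> 16) & 0xFF;
                // Find the first class whose upper bound is not below the grey value.
                int classIndex = 0;
                while (classIndex < k - 1 && grey > upperBounds[classIndex]) {
                    classIndex++;
                }
                int g = classIndex * delta;
                image.setRGB(x, y, (g << 16) | (g << 8) | g);
            }
        }
    }
}
```

<p>With k = 2 the result is a black and white image: every pixel ends up with the grey value 0 or 255.</p>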
<p class="MsoNormal"><span lang="EN-GB">NOTE (August 7, 2009): I've found the source code and put it in <a title="K-means clustering source code blogpost" href="http://www.vankouteren.eu/blog/2009/09/k-means-clustering-in-java-code-found/">this blogpost.</a></span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.vankouteren.eu/blog/2007/10/k-means-clustering-implementation-in-java/feed/</wfw:commentRss>
		<slash:comments>60</slash:comments>
		</item>
	</channel>
</rss>

