Online text based games (3)

This week I got a lot of work done without really noticing it. First of all, I needed an environment in which I could write and test code directly, without depending on a web host. The easiest way to get one is to install Apache2Triad. It’s easy to install and it works right away. The control panel offers enough functionality to make it do what you want (that is, parse PHP files and interact with a database).

After that I tested some files which I already had and which could be reused. While editing them, I had to correct some mistakes. Once that was done (I hate code that doesn’t compile), I wrote the first two use cases:

  1. Register user
  2. Log in user

“So that’s all?” you’ll probably think. “It isn’t that hard to build a user registration system and a login prompt!”



Online text based games (2)

Earlier I wrote a very small sketch about online text-based games. Since then I’ve thought about how to set things up in such a way that updating and extending would be easy. To that end I’ve taken a look at content management systems like Joomla/Mambo. This CMS has a components folder in both the public_html folder and the administrator folder. New components need to be placed in these folders to be used in the front end and back end respectively.



Avast Antivirus for Mac: MAD problem

Yesterday I bought a license for Avast! Antivirus software for my MacBook. As I already use this software on all my Windows PCs and I’m confident about it, I knew what I was buying. My happiness quickly vanished after a couple of scans. I wanted to do a complete scan of my hard disk, but every time I got an error telling me that there was something wrong with MAD.

After a little searching on the internet I found out that there is a problem with the most recent version of the Avast engine (version 0.0.68). You can find out which engine version you’re running by viewing the tooltip text in the upper left corner (hover your mouse over the VPS version).

The problem is in the com.avast.MacAvast.MAD file. One of the people at Avast has already provided a solution. It will be part of the next update, but in the meantime the problem can be fixed manually. Information and instructions about fixing the bug by replacing the MAD file can be found here.


Online text based games

I was just wondering: what would I actually need to create an online text based game?

Apparently I’m not the only one. I’ve played some online text based games and I’m actually pretty amazed that most of them don’t listen to calls from their communities and a lot of them are outdated.

So I’ve decided that whenever I have spare time (which I don’t have much of), I’ll work on this project I just came up with. Perhaps it will succeed, perhaps it won’t, or perhaps it will end up somewhere in the middle, but it feels exciting to create something and to overcome the problems which are sure to arise.



Apple Time Capsule Read-Only Problem

I’ve been using my Time Capsule without any problems for a couple of months now. Last week, all of a sudden at the end of the day, I got the message that my Time Capsule wasn’t accessible any more and that it had turned into a read-only state. Since then I haven’t been able to make any backups of my machine. The suggestion was to format the drive, but who wants to lose all data when you don’t even know for sure whether that solves the problem? After some searching I found this website, which describes the problem and also offers a solution (although the repair takes a long time) without losing your data. Hopefully it can be helpful to more people.


Cisco VPN Error 51

I was getting annoyed today by my Cisco VPN client refusing to do its job. After I had been playing around with my Time Capsule and my wireless access point to get my wireless network straight so that my Nintendo Wii could connect to it, I encountered an ‘Error 51’. Luckily I wasn’t the only one who had that problem. The first solution I found already worked: “The simple fix is to quit VPNClient, open a Terminal window, (Applications -> Utilities -> Terminal) and type the following:

sudo /System/Library/StartupItems/CiscoVPN/CiscoVPN restart

(for OS X 10.5 and lower) or

sudo kextload /System/Library/Extensions/CiscoVPN.kext

(for OS X 10.6) and give your password when it asks. This will stop and start the “VPN Subsystem”, or in other words restart the CiscoVPN.kext extension. Cisco seems to have problems when network adapters disappear and reappear, something that happens commonly in Wireless or Dial-up scenarios. Sometimes putting a system to sleep, disconnecting an Ethernet cable or simply reconnecting your wireless will cause CiscoVPN to lose track of the network adapters on the system.” The source can be found here.


Frequent Itemset Mining implementation in JAVA

Huge datasets, often containing important operational knowledge, defy standard data analysis methods. Traditional data analysis methods do not easily scale from analyzing megabytes of data to analyzing terabytes or petabytes of data, nor from analyzing low-dimensional data to analyzing very high-dimensional data. Furthermore, results may become difficult or almost impossible for the end user to interpret because of their size and complexity. These are some of the problems that novel data mining methods try to solve. Frequent Itemset Mining focuses on deriving association rules, which can then be used to classify new incoming data. The classic example is the shopping cart ‘myth’: people who buy diapers also buy beer. If the rule D(iapers) => B(eer) has a high confidence, meaning that a large share of the carts containing diapers also contain beer, it will show up in the output of the algorithm. For a retailer this is nice to know, so that they can adjust their shop to it (like putting the diapers close to the beer) and their promotions (if you buy an extra pack of diapers, you get a 50% discount on beer).
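
To make the notion of confidence concrete, here is a minimal Java sketch. It is not taken from the implementation described in this post: the class name RuleMetrics, its methods and the shopping carts are all made up for illustration. It uses the usual definitions, where the support of an itemset is the fraction of transactions containing it and confidence(A => B) = support(A and B together) / support(A).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: support and confidence of an association rule. */
final class RuleMetrics {

    /** Fraction of transactions that contain every item of the given itemset. */
    static double support(List<Set<String>> transactions, Set<String> itemset) {
        if (transactions.isEmpty()) {
            return 0.0;
        }
        long hits = transactions.stream().filter(t -> t.containsAll(itemset)).count();
        return (double) hits / transactions.size();
    }

    /** confidence(A => B) = support(A and B together) / support(A). */
    static double confidence(List<Set<String>> transactions,
                             Set<String> antecedent, Set<String> consequent) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        double supportA = support(transactions, antecedent);
        return supportA == 0.0 ? 0.0 : support(transactions, both) / supportA;
    }

    public static void main(String[] args) {
        // Made-up shopping carts, purely for illustration.
        List<Set<String>> carts = List.of(
                Set.of("diapers", "beer", "milk"),
                Set.of("diapers", "beer"),
                Set.of("diapers", "bread"),
                Set.of("beer", "chips"));
        System.out.println(confidence(carts, Set.of("diapers"), Set.of("beer")));
    }
}
```

In these four made-up carts, two of the three carts with diapers also contain beer, so the confidence of diapers => beer is 2/3.
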
There is a large variety of Frequent Itemset Mining algorithms available on the internet. Because none of them was of direct use, I’ve made an implementation myself. It has its known downsides (see below), but it hopefully provides a start for people who want to do more with FIM.
This Frequent Itemset Mining implementation includes 2 algorithms which are capable of doing this: the Apriori algorithm and the FP-Tree (FP-Growth) algorithm. Note that the packages for these two algorithms were not created by me (see the library references below).

Input of the program / code:
A “.data” comma separated file.

Known downsides of the program / code:
> The user is asked for two classes when performing the scanning algorithm. In many cases a data file contains more than two classes. To handle such files, the code only has to be adjusted slightly: the algorithm must determine the number of different classes itself and perform the partition scans on all of these classes.
> File rewriting is currently necessary for the algorithm to work. The speed could be improved considerably if this were no longer necessary. For that, the code for these algorithms needs to be rewritten to handle lists directly.
> The hash function in the program is pretty basic and probably not sufficient for large datasets. This could be solved by implementing a stronger hash function (or by using Java’s built-in hash function).
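
The first downside could be tackled along these lines. This is only a sketch, not the code from the download: it assumes that the class label is the last field of every row, which the post does not state, and it uses a naive split on commas instead of OpenCSV. The name ClassPartitioner and the sample rows are made up. The map used here relies on Java’s built-in String.hashCode(), and the rows stay in memory as lists, so no file needs to be rewritten.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: discovers the class labels and partitions the rows by them. */
final class ClassPartitioner {

    /** ASSUMPTION: the class label is the last comma-separated field of each row. */
    static Map<String, List<String[]>> partitionByClass(List<String> lines) {
        Map<String, List<String[]>> partitions = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip empty lines
            }
            // Naive split: does not handle quoted fields that contain commas.
            String[] fields = line.trim().split("\\s*,\\s*");
            String label = fields[fields.length - 1];
            partitions.computeIfAbsent(label, key -> new ArrayList<>()).add(fields);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Made-up rows in the style of a ".data" file.
        List<String> lines = List.of(
                "5.1,3.5,setosa",
                "7.0,3.2,versicolor",
                "6.3,3.3,virginica",
                "4.9,3.0,setosa");
        Map<String, List<String[]>> partitions = partitionByClass(lines);
        System.out.println(partitions.keySet()); // prints [setosa, versicolor, virginica]
    }
}
```

The partition scan can then simply loop over partitions.values(), however many classes the file contains.
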
Overall, I’ve tried to keep all functions as generic as possible so that further extension is actually possible. From that point of view, the code that I’ve written is pretty scalable, and by slightly adjusting some functions (the hash function and the scanning parameters) it can be made even more scalable. All CSV files can be read and the program reacts appropriately to the incoming data. If the data is correct and can be handled, the algorithm starts working with this input and produces a result file.
I was also surprised to see how many warnings my Eclipse environment generated for the third-party source code I used. Most of them can be solved directly, but some need more time.

Download
The JAR file, source and test input file can be found in the download section.

Find more on Frequent Itemset Mining: http://www.google.nl/search?source=ig&hl=nl&rlz=&q=Frequent+Itemset+Mining&btnG=Google+zoeken
More on Association Rule learning:
http://www.google.nl/search?hl=nl&q=Association+Rule+Learning&btnG=Zoeken

Library references:
CSVReader: OpenCSV (http://opencsv.sourceforge.net)
Hash function: http://www.cs.usfca.edu/galles/cs245/hash.java.html
Apriori algorithm: http://www2.cs.uregina.ca/~dbd/cs831/notes/itemsets/itemset_prog1.html
FPGrowth algorithm: http://www.csc.liv.ac.uk/~frans/KDD/Software/FPGrowth/fpGrowth.html


K-means clustering implementation in JAVA

Details about K-Means Clustering on images:

Before the algorithm starts, the user needs to set the number of grey values (bins). The resulting image will contain that number of grey values.
With that number of bins (called ‘k’), the algorithm clusters the grey values of the image into k clusters, and once the algorithm has terminated, every cluster has its own grey value.
When starting the algorithm, you should set:

  • How to define the ‘startingmeans’ of the clusters before the first iteration.
  • What the stopping criteria are.

The Algorithm:

In short this is what the algorithm is supposed to do:
Initialize (so set k, set ‘startingmeans’, set stopping criteria)
Loop while termination condition isn’t met (

  • For each pixel: assign the pixel to the class for which the distance from the pixel to the center (the mean) of the class is minimal.
  • For each class: recalculate the means of the class based on the pixels belonging to that class.

)
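
A minimal Java sketch of this loop on plain grey values follows. It is not the implementation described in this post: it keeps an explicit class index per pixel and simply stops when no pixel changes class or when a maximum number of iterations is reached. The names GreyKMeans and cluster, and the sample pixel values, are made up.

```java
import java.util.Arrays;

/** Hypothetical sketch of k-means on one-dimensional grey values (0..255). */
final class GreyKMeans {

    /** Returns the final means after clustering the pixels around the starting means. */
    static double[] cluster(int[] pixels, double[] startingMeans, int maxIterations) {
        double[] means = startingMeans.clone();
        int[] assignment = new int[pixels.length];
        Arrays.fill(assignment, -1); // no pixel has a class yet

        for (int iteration = 0; iteration < maxIterations; iteration++) {
            boolean changed = false;
            // For each pixel: pick the class whose mean is nearest.
            for (int i = 0; i < pixels.length; i++) {
                int best = 0;
                for (int c = 1; c < means.length; c++) {
                    if (Math.abs(pixels[i] - means[c]) < Math.abs(pixels[i] - means[best])) {
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // stable: nothing moved to another class
            }
            // For each class: recalculate the mean from its pixels.
            double[] sum = new double[means.length];
            int[] count = new int[means.length];
            for (int i = 0; i < pixels.length; i++) {
                sum[assignment[i]] += pixels[i];
                count[assignment[i]]++;
            }
            for (int c = 0; c < means.length; c++) {
                if (count[c] > 0) {
                    means[c] = sum[c] / count[c]; // empty classes keep their old mean
                }
            }
        }
        return means;
    }

    public static void main(String[] args) {
        int[] pixels = {10, 12, 14, 200, 202, 204};
        double[] means = cluster(pixels, new double[] {0, 255}, 50);
        System.out.println(Arrays.toString(means)); // prints [12.0, 202.0]
    }
}
```
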

My implementation:

The user can set k (which is fairly easy).
I’ve implemented 3 ways to choose the ‘startingmeans’ so far:

  1. i mod k class: The pixel at index i is assigned to the class i modulo k.
  2. Distribute mean table over color space: Given the chosen k, the means are picked so that they are spread evenly over the complete color space of the image.
  3. Random: Just as it says. Given a k, k random mean values are chosen.
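
The three options could look roughly like this in Java. This is a sketch under assumptions, not the original code: for option 1 the starting mean of a class is taken to be the average of the pixels assigned to it, and for option 2 the means are placed at the centres of k equally wide bins between the darkest and brightest grey value. The class and method names are made up.

```java
import java.util.Random;

/** Hypothetical sketches of the three ways to choose the starting means. */
final class StartingMeans {

    /** Option 1: pixel i goes to class i mod k; each mean is the average of its class. */
    static double[] moduloClasses(int[] pixels, int k) {
        double[] sum = new double[k];
        int[] count = new int[k];
        for (int i = 0; i < pixels.length; i++) {
            sum[i % k] += pixels[i];
            count[i % k]++;
        }
        double[] means = new double[k];
        for (int c = 0; c < k; c++) {
            means[c] = count[c] == 0 ? 0.0 : sum[c] / count[c];
        }
        return means;
    }

    /** Option 2 (ASSUMED layout): centres of k equally wide bins between min and max. */
    static double[] evenlySpread(int min, int max, int k) {
        double width = (max - min) / (double) k;
        double[] means = new double[k];
        for (int c = 0; c < k; c++) {
            means[c] = min + (c + 0.5) * width;
        }
        return means;
    }

    /** Option 3: k random grey values between 0 and 255. */
    static double[] randomMeans(int k, Random rng) {
        double[] means = new double[k];
        for (int c = 0; c < k; c++) {
            means[c] = rng.nextInt(256);
        }
        return means;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(evenlySpread(0, 255, 3)));
        System.out.println(java.util.Arrays.toString(moduloClasses(new int[] {10, 20, 30, 40}, 2)));
    }
}
```
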

The termination constraints are currently not visible to users and are set to:
Terminate when fewer than n pixels change classes after a recalculation of the means.
I’ve set n to 300, which is pretty small if you are using images bigger than 512 by 512 pixels. In addition, the algorithm is terminated if more than j iterations are needed to get a stable result (in the sense that fewer than n pixels change classes after a recalculation of the means). My j is currently set to 50. Most of the time the algorithm terminates because fewer than 300 pixels have changed classes.
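
Expressed in Java, these two stopping rules might look like the sketch below. The names are made up, and whether the original code compares with “more than j” or “at least j” iterations is not stated above, so the comparison used here is an assumption.

```java
/** Hypothetical sketch of the two stopping rules (n changed pixels, j iterations). */
final class TerminationCheck {

    /** Counts the pixels whose class differs from the previous iteration. */
    static int countChanges(int[] previousClasses, int[] currentClasses) {
        int changes = 0;
        for (int i = 0; i < currentClasses.length; i++) {
            if (previousClasses[i] != currentClasses[i]) {
                changes++;
            }
        }
        return changes;
    }

    /** Stop when fewer than n pixels changed class, or when j iterations have been done. */
    static boolean shouldStop(int changedPixels, int iterationsDone, int n, int j) {
        return changedPixels < n || iterationsDone >= j;
    }

    public static void main(String[] args) {
        int n = 300; // value mentioned in the post
        int j = 50;  // value mentioned in the post
        System.out.println(shouldStop(120, 7, n, j));   // true: the result is stable enough
        System.out.println(shouldStop(4500, 7, n, j));  // false: keep iterating
        System.out.println(shouldStop(4500, 50, n, j)); // true: iteration limit reached
    }
}
```
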

Now that we’ve seen how the parameters of the algorithm are set, let’s have a look at how I’ve implemented the algorithm in terms of code and the decisions I’ve made.

I’ve divided the code over 3 classes:

  1. One that builds the JDialog which asks the user how the algorithm should be initialized.
  2. One with the actual algorithm.
  3. A cluster class (ClusterClass).

Because the class with the JDialog is not that interesting, we’ll focus on the other two classes.

The ClusterClass is pretty simple: it only holds a mean, an upper bound and a lower bound.
I’ve chosen to let this class hold the bounds because, at the initialization of the algorithm, k classes are created (and put into an ArrayList). You could let each class hold the pixels that belong to it, but if your k grows and the image is big, the complete image will be in memory twice: once as the original image and once more because every pixel is also part of one of the cluster classes. Instead, each class holds its bounds, so that while I’m looping over the pixel values I can check in the same for-loop which class each value belongs to.

As mentioned earlier: a pixel belongs to the class whose mean is closest to its pixel value. Because my ClusterClasses hold their upper and lower bounds, a pixel value has to lie between the bounds to be part of that class. The bounds are found by looking up the nearest neighbouring means: the nearest lower mean for the lower bound and the nearest higher mean for the upper bound. Each bound is then simply the midpoint between the class’s own mean and that neighbouring mean.
After every pixel is assigned to a class (in my case: it can be checked to which class it belongs), the means of the classes can be recalculated by taking the sum of all the pixel values belonging to that cluster class and dividing this sum by the number of pixels in the cluster class.
After recalculating the means, the upper and lower bounds need to be recalculated as well.
After this iteration, the termination condition has to be checked. If the condition isn’t met, another iteration follows. If the condition is met, the clusters are set and the colors of the image can be recalculated.
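
The bound and mean calculations can be sketched in Java as follows. This is a simplified stand-in rather than the original ClusterClass code: it uses plain arrays instead of objects, assumes the means are sorted in ascending order, and treats each class as the half-open range from its lower bound up to (but not including) its upper bound, with 0 and 256 as the outer bounds. All names are made up.

```java
/** Hypothetical sketch: class bounds as midpoints between neighbouring means. */
final class ClusterBounds {

    /** Class c covers bounds[c] (inclusive) to bounds[c + 1] (exclusive); means must ascend. */
    static double[] boundsFromMeans(double[] sortedMeans) {
        int k = sortedMeans.length;
        double[] bounds = new double[k + 1];
        bounds[0] = 0.0;   // lower bound of the first class
        bounds[k] = 256.0; // ASSUMED exclusive upper bound for 8-bit grey values
        for (int c = 1; c < k; c++) {
            bounds[c] = (sortedMeans[c - 1] + sortedMeans[c]) / 2.0;
        }
        return bounds;
    }

    /** Finds the class whose bounds contain the grey value. */
    static int classOf(int grey, double[] bounds) {
        int lastClass = bounds.length - 2;
        for (int c = 0; c < lastClass; c++) {
            if (grey < bounds[c + 1]) {
                return c;
            }
        }
        return lastClass;
    }

    /** New mean per class: sum of its pixel values divided by its pixel count. */
    static double[] recalculateMeans(int[] pixels, double[] bounds, double[] oldMeans) {
        int k = bounds.length - 1;
        double[] sum = new double[k];
        int[] count = new int[k];
        for (int pixel : pixels) {
            int c = classOf(pixel, bounds);
            sum[c] += pixel;
            count[c]++;
        }
        double[] means = new double[k];
        for (int c = 0; c < k; c++) {
            means[c] = count[c] == 0 ? oldMeans[c] : sum[c] / count[c];
        }
        return means;
    }

    public static void main(String[] args) {
        double[] means = {50, 150};
        double[] bounds = boundsFromMeans(means); // {0.0, 100.0, 256.0}
        double[] updated = recalculateMeans(new int[] {10, 20, 200}, bounds, means);
        System.out.println(java.util.Arrays.toString(updated)); // prints [15.0, 200.0]
    }
}
```

Because only the bounds are stored, no class needs its own copy of the pixels, which matches the memory argument above.
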

And now briefly in Java-like pseudocode:

public KMeansAction:

initialize
calculateBounds
while (not_terminated) do:
    recalculateMeans
    recalculateBounds
    checkTermination
processImage

private void processImage:

// This works for 8-bit greyscale images
// It calculates the greyvalues that will occur in the resulting image
delta = 255 / (k - 1)
for every pixel do:
    for every class do:
        if a pixel belongs to that class then
            // set the greyvalue of that pixel to the index of the class in the list times delta
            greyvalue = classindex * delta
            // then set the rgbvalue of that pixel to the greyvalue
            newImage.setRGB(pixel location, greyvalue)
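
A runnable Java version of this step might look like the sketch below. It is an interpretation of the pseudocode, not the original source: it assumes the class bounds are passed in as an ascending array (class c covers bounds[c] up to, but not including, bounds[c + 1]), it reads the grey value from the blue channel of each pixel, and it guards against k = 1, where the division by k - 1 would fail. The class name is made up. Note that BufferedImage.setRGB expects a packed RGB value, so the grey value has to be copied into all three channels.

```java
import java.awt.image.BufferedImage;

/** Hypothetical sketch of processImage for images whose pixels are shades of grey. */
final class ProcessImageSketch {

    /** ASSUMPTION: class c covers bounds[c] (inclusive) to bounds[c + 1] (exclusive). */
    static BufferedImage processImage(BufferedImage source, double[] bounds) {
        int k = bounds.length - 1;
        // Integer division, as in the pseudocode; with a single class everything becomes 0.
        int delta = k > 1 ? 255 / (k - 1) : 0;
        BufferedImage result = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < source.getHeight(); y++) {
            for (int x = 0; x < source.getWidth(); x++) {
                // For a grey pixel all channels are equal, so the blue channel is enough.
                int grey = source.getRGB(x, y) & 0xFF;
                int classIndex = 0;
                while (classIndex < k - 1 && grey >= bounds[classIndex + 1]) {
                    classIndex++;
                }
                int out = classIndex * delta;
                // Pack the grey value into red, green and blue.
                result.setRGB(x, y, (out << 16) | (out << 8) | out);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BufferedImage source = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        source.setRGB(0, 0, 0x0A0A0A); // dark grey (10)
        source.setRGB(1, 0, 0xC8C8C8); // light grey (200)
        BufferedImage result = processImage(source, new double[] {0.0, 100.0, 256.0});
        System.out.println(result.getRGB(0, 0) & 0xFF); // prints 0
        System.out.println(result.getRGB(1, 0) & 0xFF); // prints 255
    }
}
```
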

NOTE (August 7, 2009): I’ve found the source code and put it in this blogpost.
