<span style="font-weight: bold;">SubNucleon :: web :: dev :: design ::</span><br />Author: Bojan (http://www.blogger.com/profile/15515367189634457454, noreply@blogger.com)<br />Feed updated: 2024-03-13T16:55:20.686+01:00<br /><br /><span style="font-weight: bold;">Simple Duplicate File Checker</span><br />Published 2009-07-31T09:23:00.004+02:00, updated 2009-07-31T13:35:32.315+02:00<br /><br />Since it's harder these days to find an application that does what you need than it is to write one yourself, here's a small Java program that checks the files in two directories and prints out the absolute path names of any two files (one from each directory) that are of the same size. It does this for all same-size pairs. Setting up an alias in your favourite shell can help you use the app much faster and with nicer syntax. I use:<br /><blockquote style="font-family: courier">alias checkDuplicateSizes="java -cp ~/scripts DuplicateSizeChecker"</blockquote><br /><br />The path names of the possibly duplicated files are surrounded by single quotes (') and separated by a space. This makes them ideal for use with <i>cmp</i>. You can either copy and paste file pairs of interest by hand, or set up a script to <i>cmp</i> all of the files the DuplicateSizeChecker finds. 
You can easily grep out the "non-interesting" lines (which make the output more human readable when not used in a script) using something like the following:<br /><blockquote style="font-family: courier">checkDuplicateSizes /tmp/a /tmp/b | grep "'/"</blockquote><br /><br />Finding identical files is then pretty easy with a little script (for which I also have an alias called <i>checkDuplicates</i>):<br /><blockquote style="font-family: courier"><pre>#!/bin/bash<br /><br />java -cp ~/scripts DuplicateSizeChecker "$1" "$2" | grep "'/" |<br />while read filePair; do<br /> eval cmp -s $filePair<br /> same=`echo $?`<br /><br /> #echo $filePair<br /> #echo $same<br /><br /> if [ $same == "0" ]; then<br /> echo<br /> echo The following files are identical<br /> echo $filePair | sed "s/\\' \\'/\\'\\`echo -e '\n\r'`\\'/g"<br /> fi<br />done</pre><br /></blockquote><br /><br />That's it. This has helped me solve my problems (for now) and I hope it helps you too. I haven't included any options like suppressing certain output, having more verbose output or different formatting, recursing through subdirectories, etc because I wanted to get this done quickly and because I am trying to be a little more YAGNI. 
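The script above hands each output line to eval, so a path containing a single quote or shell metacharacters could be misinterpreted. A minimal sketch of one way around that is to split the pair with parameter expansion instead. The function name compare_pair is made up for illustration and is not part of the original script, and the sketch assumes no path contains the three-character sequence quote, space, quote.

```shell
#!/bin/bash
# Sketch only: compare_pair is a hypothetical helper, not part of the original script.
# It expects one line in the form printed by DuplicateSizeChecker:
#   '/path/one' '/path/two'
# Assumption: neither path contains the sequence quote-space-quote.
compare_pair() {
  local pair first second
  pair=$1
  pair=${pair#\'}              # drop the leading quote
  pair=${pair%\'}              # drop the trailing quote
  first=${pair%%"' '"*}        # everything before the first quote-space-quote
  second=${pair#*"' '"}        # everything after it
  if cmp -s "$first" "$second"; then
    echo
    echo "The following files are identical"
    printf "'%s'\n'%s'\n" "$first" "$second"
  fi
}

# Possible use in place of the eval loop:
# java -cp ~/scripts DuplicateSizeChecker "$1" "$2" | grep "'/" |
# while IFS= read -r filePair; do compare_pair "$filePair"; done
```

Reading with IFS= and -r keeps leading spaces and backslashes in the line intact before it reaches the helper.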
Feel free to take, use, adapt or do whatever you like with this code (be respectful and reasonable, and leave a comment if it helped you out somehow - especially if you modify the code to do something smarter).<br /><br />Here's some sample output:<br /><blockquote style="font-family: courier"><pre>$ checkDuplicateSizes /tmp/one /tmp/two<br /><br />The following files have the same length (0 B)<br />'/tmp/one/one empty file' '/tmp/two/twoemptyfile'<br /><br />The following files have the same length (22 B)<br />'/tmp/one/onesame' '/tmp/two/twosame'<br /><br />The following files have the same length (15 B)<br />'/tmp/one/onesamesize' '/tmp/two/twosamesize'<br />$ checkDuplicates /tmp/one /tmp/two<br /><br />The following files are identical<br />'/tmp/one/one empty file'<br />'/tmp/two/twoemptyfile'<br /><br />The following files are identical<br />'/tmp/one/onesame'<br />'/tmp/two/twosame'</pre><br /></blockquote><br /><br /><br />Here's the meat:<br /><blockquote style="font-family: courier"><pre>import java.io.File;<br /><br />public class DuplicateSizeChecker {<br /> public static void main(String[] args){<br /> if(args.length < 2){<br /> System.err.println("Please specify two different directories as the first two arguments");<br /> return;<br /> }<br /><br /><br /> File folder1 = new File(args[0]);<br /> File folder2 = new File(args[1]);<br /><br /> if(!folder1.isDirectory() || !folder2.isDirectory() || folder1.equals(folder2)){<br /> System.err.println("Please specify two different directories as the first two arguments");<br /> }<br /> else{<br /> if(args.length > 2){<br /> System.out.println("More than two arguments supplied; only the first two are necessary; subsequent ones will be ignored");<br /> }<br /><br /> int size1 = folder1.list().length;<br /> int size2 = folder2.list().length;<br /><br /> if(size1 > size2) {<br /> for(File f1 : folder1.listFiles()){<br /> for(File f2 : folder2.listFiles()){<br /> if(f1.isFile() && f2.isFile() && f1.length() == 
f2.length()){<br /> printFileInfo(f1, f2);<br /> }<br /> }<br /> }<br /> }<br /> else{<br /> for(File f2 : folder2.listFiles()){<br /> for(File f1 : folder1.listFiles()){<br /> if(f1.isFile() && f2.isFile() && f1.length() == f2.length()){<br /> printFileInfo(f1, f2);<br /> }<br /> }<br /> }<br /> }<br /> }<br /> }<br /><br /> private static void printFileInfo(File f1, File f2){<br /> System.out.println();<br /> System.out.println("The following files have the same length (" + f1.length() + " B)");<br /> System.out.println("'" + f1.getAbsolutePath() + "' '" + f2.getAbsolutePath() + "'");<br /> //System.out.println("\"" + f1.getAbsolutePath() + "\" \"" + f2.getAbsolutePath() + "\"");<br /> }<br /><br />}</pre><br /></blockquote><br /><br /><span style="font-weight: bold;">Download Link Script</span><br />Published 2009-06-02T12:10:00.003+02:00, updated 2009-06-02T12:41:05.985+02:00<br /><br />Sometimes you might see a URL (in plain, non-linked text) for something you would like to download. You can't right click and then "Save Link As..." (or similar, depending on your browser) because, well, it isn't a link. So oftentimes I find myself creating these quick, little, one-time HTML files to create a download link for myself. How about making a script to do this work instead? Yup, seems like a good idea. 
Here it is:<br /><blockquote style="font-family: courier;"><br /><br /># filename is based on URL, but first replace special characters in the URL with dots<br />filename="`echo "$1" | tr -s " ~!@#$%^&*()+=[]{}\\|;':<>?,./" "."`"<br /><br /># put the file in the /tmp directory (or any directory of your choosing), as defined below<br />tempdir="/tmp"<br />filepath="$tempdir/$filename.htm"<br /><br /># create the quick HTML required for the download link<br />link="<html><head><title>Download link for $1</title></head><body><a href=\"$1\">$1</a></body></html>"<br /><br /># write the download link HTML to the temporary file (quoted so spaces survive)<br />echo "$link" > "$filepath"<br /><br /># launch the file with the link (open is the OS X command)<br />open "$filepath"<br /><br /></blockquote><br /><br />And you're done!<br /><br /><span style="font-weight: bold;">Open Source Projects</span><br />Published 2009-05-18T02:24:00.004+02:00, updated 2009-06-02T12:09:49.313+02:00<br /><br />I started some projects on Google Code last year and have recently picked up work on them again. You may want to check them out - there is already a perfectly usable (and extremely useful, in my opinion) Java library for checking object state at runtime, checking passed method parameters and exception chaining. This is particularly useful for things like dependency injection and can help you painlessly throw exceptions early (a good practice) - sometimes as early as a constructor call. Fewer meaningless <span style="font-family:courier;">NullPointerException</span>s and fewer exceptions with no message will ultimately result in faster development <span style="font-style:italic;">and</span> more stable production code.<br /><br />Another project currently being worked on quite actively is meant to create an abstraction of a source code repository (version control system) with the goal of allowing developers to take better advantage of the repository. 
Here is the "Overview" blurb as of today:<br /><blockquote><br />The goals of this project are to allow teams to use an SVN repository to develop more efficiently by allowing for:<br /><ul><br /> <li>easy separation of concerns for various files (modular area use)</li><br /> <li>trunk stability (code sandboxing)</li><br /> <li>integration with an arbitrary tasklist, bug tracker or other applicable project management tool</li><br /></ul><br /><br />All three features should be usable individually in case a team wants to use one, but not the other(s). They will be implemented in a library, which should then have a number of interfaces built on top of it, including (but not necessarily limited to) a CLI, Eclipse plug-in and a NetBeans plug-in. <br /></blockquote><br /><br />Another library planned for the future deals with Tagging. Currently, tagging is <span style="font-style:italic;">flat</span> and inflexible. My idea is to introduce hierarchy and some other interesting concepts/features into Tagging.<br /><br />Take a look!<br /><a href="http://code.google.com/p/generic-libraries/">http://code.google.com/p/generic-libraries/</a><br /><br /><span style="font-weight: bold;">Remember The Milk on your iPod</span><br />Published 2008-12-23T16:50:00.010+01:00, updated 2009-01-06T06:49:26.591+01:00<br /><br />A little while ago, I decided I really wanted to have an easy way to get my Remember The Milk tasks from RTM to my iPod so I could have them when I am away from my laptop since I always have my iPod with me.<br /><br />For those of you with an iPhone, this doesn't really apply since you always have your tasks available via the iPhone-optimized RTM webapp (unless you're a non-pro user). For those of you with an iPod Touch, this certainly still applied when I originally created it in October due to the non-permanent nature of the device's internet connection. 
However, with the release of the offline-capable, native <a href="http://blog.rememberthemilk.com/2008/11/new-for-pro-remember-milk-now-available.html">iPhone/iPod Touch RTM app</a> (available from the app store), my approach applies to a slightly smaller audience. Nevertheless, non-pro users and owners of "regular" iPods will still definitely find this useful.<br /><br />In any case, my approach was the following:<br /><ol><li>Pull RTM tasks in ATOM format from a particular list or smart list to a temporary file</li><li>Transform the ATOM XML to a plain text format</li><li>Copy the plain text tasks file to the <span style="font-style: italic;">notes</span> directory on my iPod so I can view them on my device</li></ol>In addition, I wanted to make this convenient, easy and unintrusive so I would actually do it, which yielded the following three requirements:<br /><ul><li>Run the above three tasks in sequence with no intervention (i.e. a script)</li><li>Run the script nicely, i.e.<br />- The terminal window appears in a visually pleasing way, or does not appear<br />- The terminal window closes after the script is run<br /></li><li>Run the script automatically when the iPod is connected<br /></li></ul><br /><span style="font-weight: bold;">Retrieving the Tasks</span><br />Depending on your platform, retrieving the tasks requires a slightly different utility. On Mac OS X, I retrieve the tasks using <span style="font-style: italic;">curl</span>. 
On many Linux distros, it can be done using <span style="font-style: italic;">wget</span> (which is how I did it on Ubuntu when I first started working on this).<br /><br />Using <span style="font-style: italic;">curl</span>:<br /><span style="font-family:courier new;">curl --silent --url $TaskListURL --output $TempRawTaskFilePath</span><br /><br />Using <span style="font-style: italic;">wget</span>:<br /><span style="font-family:courier new;">wget --quiet --no-check-certificate $TaskListURL -O $TempRawTaskFilePath</span><br /><br />Since the ultimate goal was to run these apps from a script, I specified the options to suppress output (<span style="font-style: italic;">silent</span>/<span style="font-style: italic;">quiet</span>). <span style="font-style: italic;">wget</span> requires the <span style="font-style: italic;">--no-check-certificate</span> option if retrieving tasks via an HTTPS URL (which one should be). Both apps require (of course) the URL to retrieve, which is specified above in the variable <span style="font-style: italic;">TaskListURL</span>. 
I also specified that the retrieved tasks should be stored to a file whose path is in the variable <span style="font-style: italic;">TempRawTaskFilePath</span>.<br /><br />The URL of the ATOM feed containing the tasks in a desired list can be obtained as follows:<br /><ol><li>Go to <a href="http://www.rememberthemilk.com/">Remember the Milk</a></li><li>Log in, if not already logged in</li><li>Go to your <span style="font-style: italic;">Tasks</span></li><li>Select the desired list (or smart list)</li><li>Make sure no tasks are selected</li><li>Click the <span style="font-style: italic;">Atom</span> link in the <span style="font-style: italic;">List</span> tab of the floating right panel, and copy the URL from your browser's address bar, OR, if you use Firefox, simply right click on the <span style="font-style: italic;">Atom</span> link and select "Copy Link Location" from the context menu, since FF will try to add the feed to your bookmarks using its internal feed reader if you just click the link<br /></li></ol><span style="font-weight: bold;"><br />Transforming the ATOM feed<br /></span>Apache has a standards-compliant XSLT processor called <a href="http://xml.apache.org/xalan-j/">Xalan</a>. On Ubuntu, I used the <a href="http://xml.apache.org/xalan-c/">C++ Version of Xalan</a>. On OS X, I use the Java version. 
Either way, it is easy enough to use it from the command line once you have the ATOM data and an XSL file specifying the way it should be transformed.<br /><br />Using Java version:<br /><span style="font-family:courier new;">java -jar xalan.jar -text -in $TempRawTaskFilePath -xsl atom2plain.xsl -out $TempTransformedTaskFilePath</span><br /><br />Using C++ version:<br /><span style="font-family:courier new;">xalan -in $TempRawTaskFilePath -xsl atom2plain.xsl -out $TempTransformedTaskFilePath</span><br /><br />It took some playing around with the XSL to get the desired plain text output. It is included in the download at the bottom.<br /><br /><br /><span style="font-weight: bold;">Running the tasks from a script</span><br />The above steps were easy enough to capture in a bash script. I then pulled out all of the generic parts of the script into a separate script so I can easily repeat the task for a number of task lists (or smart lists). Finally, I created a second "runner script" to run the generic script with parameters such that I get the tasks that I want copied to my iPod.<br /><br />The two bash scripts are included in the download at the bottom.<br /><br /><br /><span style="font-weight: bold;">Running the script nicely</span><br />In order to have the runner script run in an unintrusive manner, I created a Terminal configuration (on OS X) that executes the script in a Terminal window that:<br /><ul><li>starts minimized (i.e. only shows up in the dock)</li><li>shows up mostly transparent and quite small if de-iconified</li><li>exits when the script finishes its tasks</li></ul>This Terminal configuration is also included in the download at the bottom.<br /><br /><br /><span style="font-weight: bold;">Running the script automatically</span><br />Finally, in order to fully automate the process, I looked for a utility that would run the script whenever the iPod was connected to my computer. 
The app I came across is <a href="http://www.azarhi.com/Projects/DSW/">Do Something When</a> (DSW). It does exactly what I wanted - whenever the iPod is connected, DSW detects it and runs the Terminal configuration (which in turn runs the script).<br /><br /><br /><span style="font-weight: bold;">The goods<br /></span>Here is a <a href="http://members.shaw.ca/subnucleon/SubNucleon-rtm2ipod.zip">ZIP archive</a> containing:<br /><ul><li>The XSL file (<span style="font-family:courier new;">rtm2ipod.xsl</span>)</li><li>The generic bash script (<span style="font-family:courier new;">rtm2ipod.sh</span>)</li><li>The runner bash script (<span style="font-family:courier new;">run_rtm2ipod.sh</span>)</li><li>The Terminal configuration (<span style="font-family:courier new;">term_run_rtm2ipod.term</span>)</li></ul><br /><span style="font-weight: bold;">Further work</span><br />It would be nice if ways were found to do the following things (and posted in the comments) so that the entire flow described in this post can be realized on Linux and Windows machines as well as Mac OS X (since I have only described and created the <span style="font-style: italic;">full</span> flow for OS X):<br /><ul><li>Create a batch equivalent of the bash scripts (for Windows systems)</li><li>Create the equivalent of a "Terminal configuration" - something that achieves the goals of running the script in a visually pleasing, auto-closing window that starts off minimized/iconified (for Linux and Windows systems)</li><li>Find the equivalent of Do Something When so the script can be run automatically when an iPod is connected (for Linux and Windows systems)</li></ul>If you have good, concrete hints on how to achieve the above, or time to actually create the artifacts, please take the time to write a comment. As always, feel free to let me know if you found this post useful. 
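The three steps described in this post (retrieve, transform, copy) can be sketched as a single script. This is an illustration under stated assumptions and not the contents of rtm2ipod.sh from the download: the function names, the temporary-file handling and the argument order are made up here, and only the curl and Xalan invocations are taken from the commands shown earlier in the post.

```shell
#!/bin/bash
# Sketch only: function names and argument order are hypothetical,
# not taken from the rtm2ipod.sh in the download.

# Step 1: pull the Atom feed for one list into a file.
fetch_tasks() {        # $1 = feed URL, $2 = output file
  curl --silent --url "$1" --output "$2"
}

# Step 2: turn the Atom XML into plain text (Java version of Xalan,
# assuming xalan.jar is in the current directory).
transform_tasks() {    # $1 = Atom file, $2 = XSL file, $3 = output file
  java -jar xalan.jar -text -in "$1" -xsl "$2" -out "$3"
}

# Steps 1-3 in sequence; step 3 copies the result to the notes directory.
sync_tasks() {         # $1 = feed URL, $2 = XSL file, $3 = notes dir, $4 = note name
  local raw plain status
  raw=$(mktemp) || return 1
  plain=$(mktemp) || { rm -f "$raw"; return 1; }
  status=1
  if fetch_tasks "$1" "$raw" && transform_tasks "$raw" "$2" "$plain"; then
    cp "$plain" "$3/$4.txt" && status=0
  fi
  rm -f "$raw" "$plain"    # clean up whether or not the steps succeeded
  return "$status"
}

# Example call (the iPod mount point is a placeholder):
# sync_tasks "$TaskListURL" atom2plain.xsl /Volumes/IPOD/Notes tasks
```

Keeping the fetch and transform steps in their own functions means a runner script only has to call sync_tasks once per list, and a Linux variant could swap in the wget and C++ Xalan commands without touching the rest.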
Thanks!<br /><br /><span style="font-weight: bold;">First Post</span><br />Published 2008-11-17T23:06:00.000+01:00, updated 2008-11-17T23:22:43.616+01:00<br /><br />Over the years I have found I often email myself information about:<br /><ul><li>cool things I find</li><li>important or useful information</li><li>mini-projects I put together in my spare time</li><li>ideas for mini-projects I should put together in my spare time<br /></li><li>etc</li></ul>I figure it's high time I find a better way of organizing this information. Additionally, most of the stuff I save tends to be fruit(s) of many hours of research and experimentation labour; there's no good reason this information shouldn't be readily available for other people to find and use.<br /><br />Often I find exactly the information I'm looking for easily and quickly. Many times it's on someone's blog. This is exactly the information I will <span style="font-style: italic;">not</span> be blogging about. My aim is for the information I put up to be interesting, useful and <span style="font-style: italic;">original</span>. If you know of another place on the web with similar content to one of my posts (and there's no link already), chances are I couldn't easily find it, so feel free to bring it to my attention and (do the world a favour and) paste the URL.<br /><br />Needless to say, comments are welcome (this is Web 2.0 stuff after all). Rudeness, unnecessary ranting and so forth are not. If your comment contains excessive and unnecessary profanity, inappropriate content or other attributes that cause me to look at it and think something along the lines of "wow, this person is..." followed by "immature," "a bigot," "a real jerk" or anything similar, chances are it will not make it past my filter.<br /><br />I will try to make sure my posts are useful, interesting and time-saving. 
Please make sure your comments are written in the same spirit.