Jul 6 2011

Deepcore launches unshape.co.uk

UNSHAPE

Unshape: the creative vehicle of visual artists Katja Alexiadou and Kostas Katsikas. Take a look at the brand new website here. Our team worked closely with both artists to match the aesthetic they wanted for their portfolio. The site was implemented on the Django platform, with the JavaScript framework jQuery handling the effects and dynamic behaviour.


Mar 5 2011

Deepcore launches xsoz.gr

Deepcore finally launches Xsoz's official web page (link). Xsoz is a music artist who delivers cold mechanical beats and manipulates dark, distorted soundscapes. As a team we recognize how difficult it is to be an underground artist in Greece, and we believe that more attention should be paid to such efforts to create unique, quality music. Therefore, we did our best to create a promotional website for Xsoz. The interface was tuned to match the atmosphere of the artist's music. We had a very successful cooperation with the artist from the moment we started the implementation.

The site was built on the Django platform, a high-level Python web framework. The JavaScript framework used was jQuery. The design is based on CSS written by our team.

Our best wishes to Xsoz!


Mar 5 2011

Repair corrupted binary files transferred via FTP in ASCII mode

A common mistake, usually made by less experienced users, happens when transferring binary files via FTP. If the binary option is not selected, the file is transferred in text/ASCII mode, which corrupts the copy at the destination. Imagine how destructive this simple mistake is when it hits backups that cannot be transferred again in binary mode. You end up with corrupted backup data, wishing you had never made it!

I recently faced this problem with tar.gz backup files stored on an external NTFS hard drive. The data had been transferred there via FTP (unfortunately in ASCII mode). The server holding the original data (CentOS 5) had problems with its hard disks, which were combined in a RAID array. One of the two disks was irreversibly damaged.

So, I tried to recover the data from the backup drive. I issued the usual command for uncompressing and extracting the data.

$ tar zxvf mysql_backup.tar.gz

The output showed a serious gzip error.

gzip: stdin: invalid compressed data--format violated
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

That unrecoverable error gave me the creeps. After googling around, I found that recovering such files is a very difficult task, because an FTP ASCII mode transfer destroys data by transforming dissimilar original bytes into identical output values. Many of the suggested solutions were to transfer the file again in binary mode, which was impossible since the original data was lost. The only viable suggestion was to write a recovery program that recognizes and fixes the corruption. One suggested algorithm is to open the corrupted file as a byte stream and remove every carriage return (CR) that is followed by a linefeed (LF). A binary file is unlikely to contain many genuine CR (0d) LF (0a) byte pairs of its own, so most of the pairs found were introduced by the transfer. Therefore, you have a high probability of saving your data.
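
The suggested algorithm can be sketched in a few lines of Python. This is only an illustration of the idea, not the fixgz source; the function names and the sample bytes are made up for the example, and the whole file is read into memory, which may not suit very large backups.

```python
def strip_cr_before_lf(data: bytes) -> bytes:
    """Drop every CR (0x0d) that is immediately followed by LF (0x0a)."""
    return data.replace(b"\r\n", b"\n")


def repair_file(src_path: str, dst_path: str) -> None:
    """Read the damaged file, strip the inserted CR bytes, write the result."""
    with open(src_path, "rb") as src:
        damaged = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_cr_before_lf(damaged))


# Hypothetical sample: a CR LF pair, then a lone CR and a lone LF.
sample = b"\x1f\x8b\x08\r\n\x00\r\x01\n"
repaired = strip_cr_before_lf(sample)
print(repaired)
```

Only the CR in front of an LF is removed; a lone CR is kept. If the original file contained a genuine CR LF pair, that CR is removed too, which is why the repair is not guaranteed.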

In the end I didn't implement the algorithm myself (it is quite easy), since I found an implementation in C on the official gzip site. They offer a program called fixgz that removes the extra CR (carriage return) bytes inserted by the transfer. However, there is absolutely no guarantee that it will actually fix your file. Despite that warning, I got my entire data backup back (images, files and SQL backups in tar.gz files) by doing the following:

$ wget http://www.gzip.org/fixgz.zip
$ unzip fixgz.zip
$ gcc -o fixgz fixgz.c
$ ./fixgz corrupted_backup.tar.gz fixed_backup.tar.gz
$ tar zxvf fixed_backup.tar.gz

And it worked like a charm! The data was extracted 100% successfully. I wanted to share my experience because it will probably be useful to someone out there who is desperate over this FTP ASCII thing. Conclusion: always enable binary mode when you deal with binary files in FTP transfers ;)
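
If you script your transfers, you can make binary mode the only option. Here is a minimal sketch using Python's standard ftplib; the helper name upload_binary, the host and the credentials are placeholders I made up, not part of the original setup.

```python
from ftplib import FTP


def upload_binary(ftp, local_path, remote_name):
    """Upload a file with STOR in binary (image) mode."""
    with open(local_path, "rb") as handle:
        # FTP.storbinary sends TYPE I before the transfer, so no
        # line-ending translation is applied to the bytes.
        return ftp.storbinary("STOR " + remote_name, handle)


# Usage (host and credentials are placeholders):
# with FTP("ftp.example.com") as ftp:
#     ftp.login("user", "password")
#     upload_binary(ftp, "mysql_backup.tar.gz", "mysql_backup.tar.gz")
```

In the interactive command line ftp client, the equivalent is typing binary before get or put.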


Aug 21 2009

Crawl, automate or test with HtmlUnit and JRuby

I bet you get dizzy with all the web testers that exist out there. I've been searching for the one most suitable for my needs, in order to write my own autonomous automators and crawlers. The main problem I faced was that most of the libraries or GUI-less browsers I tried didn't support JavaScript, and that was a pain in the ass because I kept getting stuck on a lot of pages. My impression is that JavaScript is used extensively by web developers nowadays, so if you want to write your own code doing interesting stuff on the web, you will need JavaScript support!

I ended up with the HtmlUnit library, a GUI-less browser for Java that is easy to use and has good JavaScript support. It's even easier to use this library when you write your programs in JRuby.

Below I will explain how to set the two up together and write some very useful bots for your everyday needs.

  • Install jruby
  • Download htmlunit
  • Enable JRuby and include jar files
  • Write some code

Step 1

First of all we have to install JRuby. If you compile JRuby yourself, remember to include it in your classpath.

Mac OS X

You will have to download and install MacPorts (http://www.macports.org/install.php) and then issue the following command:

$ sudo port install jruby

Linux

Use the package manager installed on your system. For distributions that use APT, simply run:

$ sudo apt-get install jruby

Windows

http://www.devdaily.com/blog/post/ruby/installing-jruby-on-windows-xp-system/

Step 2

Download htmlunit from http://sourceforge.net/projects/htmlunit/files/
Place the downloaded jars into a folder named lib.

 
tar -zxvf htmlunit-x_x.tar.gz
cd htmlunit-x-x/
mv lib/ path_of_your_choice/
 

Step 3

At the top of the Ruby file you are working on, write the following:

# Require Java so we can use the Java libraries
require 'java';
 
# Get HTML Unit and all of its required libraries
require 'htmlunit-2.1.jar';

Example: Vodafone bill

A simple example retrieving my mobile phone bill from Vodafone:

voda.rb

# Require Java so we can use the Java libraries
require 'java';
 
# Get HTML Unit and all of its required libraries
require 'htmlunit-2.1.jar';
require 'commons-httpclient-3.1.jar';
require 'commons-io-1.4.jar';
require 'commons-logging-1.1.1.jar';
require 'commons-lang-2.4.jar'
require 'commons-codec-1.3.jar'
require 'xercesImpl-2.8.1.jar'
require 'xml-apis-1.0.b2.jar'
require 'jaxen-1.1.1.jar'
require 'commons-collections-3.2.jar'
require 'js-1.7R1.jar'
require 'nekohtml-1.9.7.jar'
require 'sac-1.3.jar'
require 'cssparser-0.9.5.jar'
require 'xalan-2.7.0.jar'
 
# Include the Web Client class
include_class 'com.gargoylesoftware.htmlunit.WebClient';
include_class 'com.gargoylesoftware.htmlunit.BrowserVersion';
 
# Function to connect to vodafone website
def connect_to_vodafone
  version = BrowserVersion.new( "Netscape", "5.0 (Macintosh; en-US)", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14", "1.2" , 5.0 )
  puts "version:ok"
  wc = WebClient.new(version)
  puts "wc:ok"
  page = wc.getPage("http://www.vodafone.gr/portal/client/cms/viewCmsPage.action?pageId=1032");
  puts "load_page:ok"
  puts "\nLogging in to vodafone.gr ...\n"
  # Find the login form by its action attribute
  forms = page.getForms()
  login_form = nil
  forms.each do |form|
    if form.getActionAttribute().include? "/portal/client/idm/login!login.action"
      login_form = form
    end
  end
  raise "Login form not found" if login_form.nil?

  username = login_form.getInputByName("username")
  password = login_form.getInputByName("password")
  button = login_form.getInputByName("Submit")
  # Fill in the login box with your own credentials
  username.setValueAttribute("your_username")
  password.setValueAttribute("your_pass")

  # Submit the form, then load the account page
  mypage = button.click()

  mypage = wc.getPage("https://www.vodafone.gr/portal/client/idm/loadUserProfile.action");
  account_form = mypage.getFormByName("myAccountSelectBill")
  select_drop_down = mypage.getByXPath('//select[@id="billingAccountField"]')[0]
  # Results for the 1st account
  get_results(select_drop_down.asText(), mypage)
end

def get_results(am, page)
  # Collect the data you are interested in
  total_amount = page.getByXPath('//input[@id="payBill_totalOwnedAmount"]')[0].getValueAttribute()
  recent_amount = ""
  duration = ""
  page.getByXPath('//td[@class="main_text pad5"]').each do |td|
    # The cell containing the euro sign holds the latest bill amount
    if td.asXml().include?("€")
      recent_amount = td.asText
    end
    # The cell containing a dash holds the billing period
    if td.asXml().include?("-")
      duration = td.asText
    end
  end

  # Print collection
  puts "\nVodafone bill"
  puts "-------------------------------------------"
  puts "A/M: " + am + "\n"
  puts "Total amount: " + total_amount + " €\n"
  puts "Recent bill amount: " + recent_amount.split(' ')[1].split(',').join('.') + " €\n"
  puts "Duration: " + duration + "\n\n"
end

connect_to_vodafone

Execution

jruby -Ipath_to_lib_folder voda.rb 2>/dev/null

More examples to come :)


Jul 10 2009

Creating a CMS using CouchDB, Django and 30 lines of python

I bet you have heard the news on the street about this old Ericsson programming language coming back to life and bringing functional programming to the web. Yes, I am talking about Erlang! The language is used by companies like Facebook, Amazon and of course Ericsson for their network products. It offers features like hot code swapping, lightweight inter-process communication and more. In general it is a really great language that fits really well into the "cloud computing" industry.

As I was exploring the language a while ago, I came across a project called Apache CouchDB, a document-oriented database, or to be more precise a "document store". At first I thought, oh, just another object store, but then, after a bit of research, I fell in love with it. It is wonderful because it gives you the ability to store, retrieve and query structured data without having to define a schema; you can also attach files to each document! And best of all, you can do all of this through a neat web interface! It is programmed in Erlang, which is the reason why it offers great speed, stability and distributed features like replication. As I mentioned earlier you can even query data, using uhm... yes... JavaScript! Smart.

After I had used it for a couple of hours, I thought that it would make a really great Django template store! One could serve all templates from CouchDB and also define template instances using documents that include all the variables a template renders. Isn't this some kind of a CMS? I started a simple implementation and after no more than 30 lines of Python, there it was! Really simple but also really functional!

Here are the steps!

  1. Install CouchDB
  2. Create a new Django project
  3. Create a new app inside the Django project; I called mine totemplate.
  4. Create the appropriate views and set up the urls.py
  5. Design the database on CouchDB
  6. Enjoy!

Step 1

Installing CouchDB should be really easy on all platforms.

Mac OS X

You will have to download and install MacPorts (http://www.macports.org/install.php) and then issue the following command:

$ sudo port install couchdb

Linux

Here is a link to a tutorial for Linux: http://barkingiguana.com/2008/06/28/installing-couchdb-080-on-ubuntu-804.

Windows

If you are on Windows you can take a look here: http://wiki.apache.org/couchdb/Installing_on_Windows. I haven't tested it myself but it should work fine.

After you have installed CouchDB, you will also have to install the Python library for it. It is called couchdb-python and is available here:
http://code.google.com/p/couchdb-python/.
For those of you who have easy_install on your machines, issuing:

$ sudo easy_install couchdb-python

should do the job quickly and easily.

Here is the views.py:

# Create your views here.
from django.http import HttpResponse
from django.template import Template, Context
from couchdb import *
from totemplate.settings import COUCH_SERVER
 
def show(request, resource_id, page_id):
    couch_store = Server(COUCH_SERVER)
    category_name = couch_store['indexers'][resource_id]['category']
    category = couch_store[category_name][page_id]
    template = couch_store['templates'][category['template_id']]
    html = Template(template['body']).render(Context(category))
    return HttpResponse(html)
 
def index(request, resource_id):
    couch_store = Server(COUCH_SERVER)
    indexer = couch_store['indexers'][resource_id]
    t = Template(indexer['template'])
    html = t.render(Context(couch_store))
    return HttpResponse(html)
 
def resources(request):
    couch_store = Server(COUCH_SERVER)
    resources = {"indexers":[]}
    for indexer in couch_store['indexers']:
        # Only list indexers that define their own template
        if 'template' in couch_store['indexers'][indexer]:
            resources['indexers'].append(indexer)
    t = Template( couch_store["settings"]["index"]["body"])
    html = t.render(Context(resources))
    return HttpResponse(html)
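
To make the document layout that show() expects easier to picture, here is a small sketch that uses plain Python dicts in place of the CouchDB server. The database names 'indexers' and 'templates' and the keys 'category', 'template_id' and 'body' come from the view above; everything else (the 'posts' database, the ids, the field values and the resolve helper) is a made-up assumption for illustration.

```python
# Plain dicts stand in for the CouchDB server; ids and values are hypothetical.
store = {
    "indexers": {"blog": {"category": "posts"}},
    "posts": {
        "hello": {"template_id": "post", "title": "Hello", "text": "First post"},
    },
    "templates": {
        "post": {"body": "<h1>{{ title }}</h1><p>{{ text }}</p>"},
    },
}


def resolve(store, resource_id, page_id):
    """Follow the lookup chain of show(): indexer, category database, page, template."""
    category_name = store["indexers"][resource_id]["category"]
    page = store[category_name][page_id]
    template = store["templates"][page["template_id"]]
    # In the real view the body is rendered with the page document as context.
    return template["body"], page


body, context = resolve(store, "blog", "hello")
print(body)
print(context["title"])
```

The page document doubles as the template context, which is why every variable the template uses has to be a field of that document.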

Jun 12 2009

Caching the result of a python function using memcached and decorators

sarcachem.py

import memcache, time
 
HOST = "127.0.0.1"
PORT = "11211"
MC_CLIENT = memcache.Client(['%s:%s'%(HOST,PORT)], debug=0)
 
class sarcachem:
 
    class helper:
 
        def __init__(self, outer, fun):
            self.outer = outer
            self.fun = fun
 
        def __call__(self, *args, **kwargs):
            # If cached value does not exist
            # 1. Check to see if it is locked
            #    If it is, wait until it unlocks
            #    If it is not, lock and calculate value,
            #    unlock when finished
            # Return cached value
            key = "%s.%s->(%s,%s)"%(self.outer.salt,
                                   self.fun.__name__,
                                   repr(args),
                                   repr(kwargs))
            key_lock = "00_locked_%s"%(key)
 
            if MC_CLIENT.get(key) is None:
                if not MC_CLIENT.get(key_lock):
                    MC_CLIENT.set(key_lock,True)
                    result = self.fun(*args, **kwargs)
                    MC_CLIENT.set(key,result,time=self.outer.time)
                    print("Storing: %s: %s" % (key, MC_CLIENT.get(key)))
                    MC_CLIENT.delete(key_lock)
                else:
                    # Another caller holds the lock: poll until it is released
                    while MC_CLIENT.get(key_lock):
                        time.sleep(1)
 
            return MC_CLIENT.get(key)
 
    def __init__(self,time=3,salt="base"):
        """ In this function we set all our decorator's parameters """
        self.time = time
        self.salt = salt
 
    def __call__(self, fun):
        return sarcachem.helper(self, fun)

And here is the way you can use it:

from sarcachem import sarcachem
 
@sarcachem(10,__file__)
def fib(number=0):
 
    # Suck my life into the CPUHOLE
    for i in range(0,100000):
        i+100;
    # END OF LIFE SUCKER
 
    if number==0:
        return 0
    elif number==1:
        return 1
    else:
        return int(fib(number-1)) + int(fib(number-2))
 
if __name__=="__main__":
    print("%s %s" % (fib(100), fib(29)))
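
If you want to try the pattern without a running memcached server, here is a minimal sketch that swaps memcached for an in-process dict. It is not the sarcachem class itself: the name memo_with_ttl is made up, there is no locking, and the cache lives only inside one process. The key layout is the same as above.

```python
import functools
import time


def memo_with_ttl(ttl=3, salt="base"):
    """Cache results in a plain dict; entries expire after ttl seconds."""
    store = {}  # key -> (expiry timestamp, value); stands in for memcached

    def decorator(fun):
        @functools.wraps(fun)
        def wrapper(*args, **kwargs):
            # Same key layout as sarcachem: salt, function name, repr of the arguments
            key = "%s.%s->(%s,%s)" % (salt, fun.__name__, repr(args), repr(kwargs))
            entry = store.get(key)
            if entry is not None and entry[0] > time.time():
                return entry[1]
            value = fun(*args, **kwargs)
            store[key] = (time.time() + ttl, value)
            return value
        return wrapper
    return decorator


calls = []


@memo_with_ttl(ttl=10, salt="demo")
def fib(number=0):
    calls.append(number)  # record every real computation
    if number < 2:
        return number
    return fib(number - 1) + fib(number - 2)


print(fib(30), len(calls))
```

Because the recursive calls go through the decorated name, each value of number is computed only once while its cache entry is alive.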

Jun 12 2009

Up and running

Welcome to our brand new blog. We decided to start deepcore.gr with our blog for the moment. New stuff coming in the near future.

Enjoy your stay,

The team