I bet you get dizzy from all the web testing tools out there. I've been searching for the one that best suits my needs, so that I can write my own autonomous automators and crawlers. The main problem I faced was that most of the libraries and GUI-less browsers I tried didn't support JavaScript, which was a real pain because I kept getting stuck on a lot of pages. My impression is that web developers use JavaScript extensively nowadays, so if you want to write code that does interesting things on the web, you will need JavaScript support!

I ended up with the HtmlUnit library, a GUI-less browser for Java that is easy to use and has good JavaScript support. It gets even easier to use when you write your programs in JRuby.

Below I explain how to set the two up together and how to write some very useful bots for your everyday needs.

  • Install jruby
  • Download htmlunit
  • Enable JRuby and include jar files
  • Write some code

Step 1

First of all we have to install jruby. If you compile jruby yourself remember to include it in your classpath.

Mac OS X

You will have to download and install MacPorts (http://www.macports.org/install.php) and then issue the following command:

$ sudo port install jruby

Linux

Use the package manager that comes with your distribution. On distributions that use apt (Debian, Ubuntu and friends) you simply write:

$ sudo apt-get install jruby

Windows

Follow the guide at http://www.devdaily.com/blog/post/ruby/installing-jruby-on-windows-xp-system/
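Whichever way you installed it, it is worth checking that your scripts really run on JRuby and not on another Ruby that happens to be first in your PATH. Here is a minimal sketch of such a check; the helper name `jruby?` is my own, while `RUBY_ENGINE` and `JRUBY_VERSION` are constants provided by the interpreter.

```ruby
# Returns true when the interpreter running this file is JRuby.
def jruby?
  RUBY_ENGINE == 'jruby'
end

if jruby?
  # JRUBY_VERSION is only defined when running under JRuby.
  puts "Running on JRuby #{JRUBY_VERSION}"
else
  puts "Not JRuby (found #{RUBY_ENGINE} #{RUBY_VERSION}); start the script with the jruby command"
end
```

Save it under any name you like and run it with the jruby command; if it prints the second message, the Java libraries used later will not be available.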

Step 2

Download htmlunit from http://sourceforge.net/projects/htmlunit/files/
The archive contains a folder named lib with all the jars you need. Extract it and move that folder to a location of your choice:

$ tar -zxvf htmlunit-x_x.tar.gz
$ cd htmlunit-x-x/
$ mv lib/ path_of_your_choice/

Step 3

At the top of the Ruby file you are working on, write the following:

# Require Java so we can use the Java libraries
require 'java'

# Get HtmlUnit and all of its required libraries
require 'htmlunit-2.1.jar'
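HtmlUnit depends on quite a few other jars, and listing each one by hand gets tedious. As a rough sketch, assuming all the jars sit directly inside one folder, you could collect and require them in a loop. The helper names `jar_files` and `require_jars` are hypothetical, and the actual require only happens under JRuby, since other Rubies cannot load jar files.

```ruby
# Returns the sorted list of .jar files found directly inside dir.
# Sorting only makes the order predictable.
def jar_files(dir)
  Dir.glob(File.join(File.expand_path(dir), '*.jar')).sort
end

# Requires every jar in dir and returns the list of paths.
# Raises if the folder contains no jars, which usually means a wrong path.
def require_jars(dir)
  jars = jar_files(dir)
  raise ArgumentError, "no jar files found in #{dir}" if jars.empty?
  # Loading a jar only makes sense on JRuby.
  jars.each { |jar| require jar } if RUBY_ENGINE == 'jruby'
  jars
end

# Usage (the path is whatever you picked in Step 2):
#   require 'java'
#   require_jars('path_of_your_choice/lib')
```

Because the helper uses absolute paths, a script written this way would not need the lib folder on the load path.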

Example: Vodafone bill

A simple example that retrieves my mobile phone bill from Vodafone:

voda.rb

# encoding: utf-8
# Require Java so we can use the Java libraries
require 'java'

# Get HtmlUnit and all of its required libraries
require 'htmlunit-2.1.jar'
require 'commons-httpclient-3.1.jar'
require 'commons-io-1.4.jar'
require 'commons-logging-1.1.1.jar'
require 'commons-lang-2.4.jar'
require 'commons-codec-1.3.jar'
require 'xercesImpl-2.8.1.jar'
require 'xml-apis-1.0.b2.jar'
require 'jaxen-1.1.1.jar'
require 'commons-collections-3.2.jar'
require 'js-1.7R1.jar'
require 'nekohtml-1.9.7.jar'
require 'sac-1.3.jar'
require 'cssparser-0.9.5.jar'
require 'xalan-2.7.0.jar'

# Import the HtmlUnit classes we need
# (java_import replaces include_class, which newer JRuby releases no longer provide)
java_import 'com.gargoylesoftware.htmlunit.WebClient'
java_import 'com.gargoylesoftware.htmlunit.BrowserVersion'

# Connect to the Vodafone website, log in and print the bill
def connect_to_vodafone
  version = BrowserVersion.new("Netscape", "5.0 (Macintosh; en-US)", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14", "1.2", 5.0)
  puts "version:ok"
  wc = WebClient.new(version)
  puts "wc:ok"
  page = wc.getPage("http://www.vodafone.gr/portal/client/cms/viewCmsPage.action?pageId=1032")
  puts "load_page:ok"
  puts "\nLogging in to vodafone.gr ...\n"

  # Find the login form by its action attribute
  forms = page.getForms()
  login_form = nil
  forms.each do |form|
    if form.getActionAttribute().include?("/portal/client/idm/login!login.action")
      login_form = form
    end
  end
  # Stop with a clear message instead of a NoMethodError on nil
  raise "Login form not found" if login_form.nil?

  username = login_form.getInputByName("username")
  password = login_form.getInputByName("password")
  button = login_form.getInputByName("Submit")

  # Fill in the login form
  username.setValueAttribute("your_username")
  password.setValueAttribute("your_pass")

  # Submit the form; the session stays inside the WebClient
  button.click()

  mypage = wc.getPage("https://www.vodafone.gr/portal/client/idm/loadUserProfile.action")
  # getFormByName raises if the bill form is missing (for example after a failed login)
  account_form = mypage.getFormByName("myAccountSelectBill")
  select_drop_down = mypage.getByXPath('//select[@id="billingAccountField"]')[0]

  # Results for the 1st account
  get_results(select_drop_down.asText(), mypage)
end

def get_results(am, page)
  # Collect the data you are interested in
  total_amount = page.getByXPath('//input[@id="payBill_totalOwnedAmount"]')[0].getValueAttribute()
  recent_amount = ""
  duration = ""
  page.getByXPath('//td[@class="main_text pad5"]').each do |td|
    # The cell with the euro sign holds the amount of the latest bill
    if td.asXml().include?("€")
      recent_amount = td.asText
    end
    # The cell with a dash holds the billing period
    if td.asXml().include?("-")
      duration = td.asText
    end
  end

  # Print the collected data
  puts "\nVodafone bill"
  puts "-------------------------------------------"
  puts "A/M: " + am + "\n"
  puts "Total amount: " + total_amount + " €\n"
  # Take the second word of the cell and turn the decimal comma into a dot
  puts "Recent bill amount: " + recent_amount.split(' ')[1].split(',').join('.') + " €\n"
  puts "Duration: " + duration + "\n\n"
end

connect_to_vodafone
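The amount handling in get_results takes the second word of the cell and swaps the comma for a dot, so it breaks as soon as the cell text has a different shape. A slightly more forgiving version could look like the sketch below. It assumes amounts are written with a decimal comma and an optional dot as thousands separator (for example 1.234,56); the name `normalize_amount` and the pattern are my own and are not part of HtmlUnit.

```ruby
# encoding: utf-8

# Matches the first number in a text, written with a decimal comma and
# an optional dot as thousands separator (assumed format).
AMOUNT_PATTERN = /\d+(?:\.\d{3})*(?:,\d+)?/

# Returns the first amount found in text as a string with a decimal dot,
# or nil when the text contains no number at all.
def normalize_amount(text)
  raw = text.to_s[AMOUNT_PATTERN]
  return nil if raw.nil?
  # Drop thousands separators, then turn the decimal comma into a dot.
  raw.delete('.').tr(',', '.')
end

puts normalize_amount("€ 23,45")
puts normalize_amount("1.234,56 €")
```

Inside get_results you could then print `normalize_amount(recent_amount) || "n/a"` instead of chaining split calls, so a cell without an amount no longer raises an error.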

Execution

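$ jruby -Ipath_to_lib_folder voda.rb 2>/dev/null

The -I option adds the lib folder to the load path so that require can find the jar files, and 2>/dev/null discards everything that is printed to stderr.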

More examples to come :)
