Internet Explorer Automation

In the previous chapter we learned how to find information about OLE classes and their members. We also built a Rails page that, although it lacks a fancy look, displays very useful information about selected OLE objects. Now we should go a step further and start putting this knowledge to use.

We will begin with a simple wrapper class that starts Internet Explorer and exposes a few methods for easier manipulation. Throughout the chapter we will improve it and add some advanced features. The initial source code is given below (file ie_runner.rb).

require 'win32ole'
class IERunner
  attr_reader :app
  def initialize
    @type = WIN32OLE_TYPE.new("Microsoft Internet Controls", "InternetExplorer")
    @app = WIN32OLE.new("InternetExplorer.Application")
  end

  def guid
    @type.guid
  end

  def progid
    @type.progid
  end

  def visible=(is_visible)
    @app.visible = is_visible
  end

  def goto(url)
    @app.Navigate(url)
  end

  def close
    @app.Quit
  end

  def html
    @app.document.nil? ? '' : @app.document.body.outerHTML
  end
end

In the initialize method we create IE's WIN32OLE_TYPE object and start the application. We also define a reader attribute that returns our Internet Explorer application object, methods that return the GUID and ProgID, methods for navigating to Web pages and exiting the application and, finally, a method that returns the HTML source of the page currently displayed in the browser. If you now create an IERunner object, the web browser will be started but nothing visible will happen. The reason is that most applications started through OLE are not flagged as visible, and to display their window you have to explicitly set the visible flag to true. Let's try this wrapper. Save the following code in a test_ie_runner.rb file in the same directory as ie_runner.rb.

require './ie_runner'
ie = IERunner.new
ie.visible = true
ie.goto "http://rubyinstaller.org"
sleep(5)
ie.close

Running the script above opens a new Internet Explorer window, loads the RubyInstaller homepage and closes the application after 5 seconds. We are making progress. We now know how to start a new application through OLE automation and how to manipulate it by calling its OLE methods. Even though we can get back the results of method calls and check the state of the OLE object through its properties, this is still one-way communication, because each action must be initiated from the Ruby script. In some cases this is enough: our Ruby script calls a number of OLE methods, possibly branching on the values of OLE properties, and finally releases the OLE object and exits.

More complex applications will, on the other hand, need to receive notifications from the OLE control and decide on the next action based on them. Whether this is possible depends on the OLE control. Some OLE controls use the publish/subscribe or, more precisely, the observer pattern to enable callbacks from the OLE control to the client application.

In general, the observer pattern is a software pattern in which a source object, called the subject, keeps a list of its observers (subscribers) and notifies them whenever its state changes. Usually this notification is done by calling an observer's method, possibly passing it arguments. An observer can subscribe to more than one type of change, and the subject can have one or more observers per type of change. The subject maintains the list of all observers in such a way that it knows which observers should be called when a particular change happens.
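The pattern described above can be sketched in a few lines of plain Ruby. This is only an illustration and has nothing to do with OLE yet: the Subject class and the event names :loaded and :closed are made up for this example.

```ruby
# Minimal subject: keeps a list of observers per event and notifies them.
class Subject
  def initialize
    # Each event name maps to the list of blocks subscribed to it.
    @observers = Hash.new { |hash, event| hash[event] = [] }
  end

  # An observer subscribes to one type of change by passing a block.
  def subscribe(event, &observer)
    @observers[event] << observer
  end

  # Calls every observer of the given event, passing along the arguments.
  def notify(event, *args)
    @observers[event].each { |observer| observer.call(*args) }
  end
end

log = []
subject = Subject.new
subject.subscribe(:loaded) { |url| log << "first saw #{url}" }
subject.subscribe(:loaded) { |url| log << "second saw #{url}" }
subject.subscribe(:closed) { log << "closed" }

# Only the two :loaded observers are called; :closed stays silent.
subject.notify(:loaded, "http://example.com")
puts log
```

Both observers of :loaded receive the URL, while the observer of :closed is not called because that event was never fired.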

In other words, the subject exposes a list of events to which observers can subscribe. Whenever an event occurs, the observer's method, the event handler, is called. In our case the subject is the OLE control and the observer is our Ruby script. For this purpose the win32ole extension library defines a special class, WIN32OLE_EVENT. Whenever you want to receive notifications from an OLE control you must create an object of this type, passing the OLE control to the WIN32OLE_EVENT constructor as an argument.

ie = WIN32OLE.new("InternetExplorer.Application")
ev = WIN32OLE_EVENT.new(ie)

After the object is created you can use it to subscribe to the control's events. Subscribing can be done in two ways. The first is to call WIN32OLE_EVENT's on_event method. This method accepts one optional argument. If the argument is given, the handler subscribes to the event of that name. If on_event is called without an argument, the handler subscribes to all events. The method also accepts a block, which is the code executed when the callback is invoked.

ev.on_event("DocumentComplete") {|*args| puts "Loaded document at #{args[1]}"}
ev.on_event() {|event, *args| puts "Event #{event} triggered"}

The first line subscribes to Internet Explorer's DocumentComplete event, which is fired when the document being navigated to reaches ReadyState_Complete. This event passes two arguments to the callback: the WebBrowser object corresponding to the event and the URL of the loaded document. The callback prints a message with the URL of the loaded document. The second line subscribes to all events and, whenever it is called, prints the name of the triggered event. If you no longer want to receive event notifications you can use the off_event method.

ev.off_event("DocumentComplete")

The second way of subscribing is to assign an instance of a class that handles event notifications to the WIN32OLE_EVENT object. We will use this approach in our example, but before we go over its details there is one important issue we must solve.

Take a look at the following Ruby script:

require 'win32ole'
ie = WIN32OLE.new("InternetExplorer.Application")
ev = WIN32OLE_EVENT.new(ie)
ev.on_event("DocumentComplete") {|*args| puts "Loaded document at #{args[1]}"}
ie.Visible = true
ie.Navigate "rubyonwindowsguides.github.io"

At first glance this is a perfectly good script. We create an Internet Explorer automation object and a corresponding event object, subscribe to the DocumentComplete event and navigate to a URL. Do you see the problem we are facing here? If yes, excellent; otherwise keep reading.

Let's analyze the script again. We start Internet Explorer, subscribe to the event and navigate to a URL. But by the time the document is loaded and reaches ReadyState_Complete, our Ruby script will have exited, so we will never receive any notification. Clearly we must prevent the script from exiting if we want our callback, in our case a Ruby block, to be executed. So how can we do that? The first idea might be to wait for user input

gets

or to use an infinite loop

while(true)
end

If you try either of these you will end up with an Internet Explorer window that doesn't load the page and has an empty client area. The reason for this behavior lies in the way Windows applications work. All Windows applications are event driven. The system passes input to the application's window. Each window has a special function, the window procedure, that the system calls whenever it has input for the window. When the application has processed the current input, control returns to the system and the application waits until new input is available.

Input is passed to the window as messages. Each input is a new message for the window. Messages can be either sent or posted to the window. Sent messages are processed immediately, bypassing the message queue. Posted messages, on the other hand, are put in the message queue and processed one at a time. Each message is removed from the queue, examined and finally processed in the window procedure.

An application must remove and process its messages. In single-threaded applications this is usually done in the message loop. A simple message loop makes one call to each of the following functions: GetMessage, TranslateMessage and DispatchMessage. Without going into more detail, these calls retrieve a message from the message queue, make the necessary transformations and dispatch the message to the target window.

Most of the messages the system sends to an application's windows are posted. If the main application thread is blocked, the windows will never process any of them. The result is a window that is not refreshed and doesn't respond to mouse or keyboard input. This is exactly the scenario we face if we use an infinite loop or wait for user input in gets. If you still have the script running you can check this by dragging any window over Internet Explorer's window: the client area will not be redrawn. But if you move the Internet Explorer window itself, you will see that the client area is cleared again. The system sends a repaint message when the window is moved, and such messages are processed immediately by the window procedure. Now you can end the script by pressing Ctrl-C.
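The difference between sent and posted messages can be modelled with a toy class in plain Ruby. This is a deliberately simplified sketch: ToyWindow and its methods are invented for illustration and are not Win32 or win32ole APIs.

```ruby
# Toy model of a window with a message queue; not a real Win32 API.
class ToyWindow
  attr_reader :handled

  def initialize
    @queue = []    # posted messages wait here
    @handled = []  # what the window procedure has processed so far
  end

  # Posting only enqueues; nothing is processed until the loop runs.
  def post(message)
    @queue << message
  end

  # Sending bypasses the queue and calls the window procedure at once.
  def send_message(message)
    window_procedure(message)
  end

  # One pass of a message loop: retrieve and dispatch each queued message.
  def pump
    until @queue.empty?
      message = @queue.shift     # plays the role of GetMessage
      window_procedure(message)  # plays the role of DispatchMessage
    end
  end

  private

  def window_procedure(message)
    @handled << message
  end
end

window = ToyWindow.new
window.post(:paint)
window.post(:key_down)
window.send_message(:moved)    # processed immediately
puts window.handled.inspect    # posted messages are still waiting
window.pump                    # the "message loop" finally runs
puts window.handled.inspect
```

As long as pump is never called, which corresponds to a blocked main thread, the posted messages stay in the queue and only the sent message is handled.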

It is clear now that, to prevent the script from exiting without blocking Internet Explorer's message queue, we must run a message loop that calls the GetMessage, TranslateMessage and DispatchMessage functions. Fortunately we do not have to write our own message loop. Such a method already exists in the win32ole library. It is a class method of the WIN32OLE_EVENT class.

while(true)
  WIN32OLE_EVENT.message_loop
end

With these three lines at the end of our script we prevent the script from exiting prematurely, and we are able to receive a notification whenever the browser finishes loading a page.

When we discussed how to handle event notifications fired by an OLE control we mentioned two ways. The first, using the on_event method, has been explained, but the second, using an instance of a class, was only mentioned. It is now time to learn how to use objects to process events. But before we go over the code, let's see what we want to achieve. First, we want an easy way to register event handlers without calling on_event directly. Second, we want to centralize event handling in one class. Finally, we want to be able to subscribe not only to events exposed by Internet Explorer but also to those of other OLE objects within it.

The class that does this is given in the code below (event_handler.rb):

class EventHandler
  def initialize
    @handlers = {}
  end

  def add_handler(event, &block)
    if block_given?
      @handlers[event] = block
    end
  end

  def method_missing(event, *args)
    if @handlers[event.to_s]
      @handlers[event.to_s].call(*args)
    end
  end
end

In the initialize method we define a hash that will keep all registered event handlers. The next method, add_handler, receives the event name as its first argument and, as its second, the block that should be executed when the event is fired. The & in front of the argument tells Ruby to convert the block to a Proc object that can later be invoked with the call method, as we will see soon. add_handler checks whether a block was passed to it and, if so, stores the Proc object in the @handlers instance variable using the event name as the key.

All the magic of the EventHandler class happens in the last method, method_missing. Ruby calls this method whenever we call a method that is not defined. Its first argument is the name of the missing method. The second argument uses the Ruby construct *args to collect into an array all arguments that were passed to the missing method. Our method_missing checks whether we have a handler for the requested method and, if we do, executes it, passing along all the original arguments. Let's try our new class in interactive Ruby.

C:\projects\ruby\ruby_win\internet_explorer>irb -I. -revent_handler
irb(main):001:0> eh = EventHandler.new
=> #<EventHandler:0x1c8bad8 @handlers={}>
irb(main):002:0> eh.HandleMyEvent
=> nil
irb(main):003:0> eh.add_handler('HandleMyEvent') do |*args|
irb(main):004:1* puts "Called HandleMyEvent(#{args.join(', ')})"
irb(main):005:1> end
=> #<Proc:0x15c7050@(irb):3>
irb(main):006:0> eh.HandleMyEvent
Called HandleMyEvent()
=> nil
irb(main):007:0> eh.HandleMyEvent("one", "two", "three")
Called HandleMyEvent(one, two, three)

We start interactive Ruby with the -I. argument, which tells it to add the current directory to Ruby's $LOAD_PATH variable. This variable keeps the list of paths that are searched when we require a file. Ruby 2.x, unlike 1.8, doesn't add the current directory to $LOAD_PATH, so we have to add it explicitly. Next we tell interactive Ruby to load our new file with -revent_handler. After we create a new EventHandler object we call the HandleMyEvent method. Since no handler is registered for it yet, method_missing finds nothing to execute and the call returns nil. In the next three statements we add a new event handler which just prints the method name and the arguments passed in the call. After that, when we call the method named in the first argument of add_handler without arguments, the message correctly shows that the method was called with no arguments. In the last statement we call the same method with three arguments.

Handling arguments in this way is possible because we used the *args construct in the block definition. An asterisk '*' before an argument name tells Ruby to collect all remaining arguments into an array. If we pass no arguments when we call the method, the array will be empty. Conversely, if arguments are passed in the method call, they are collected into one array and handed to the block, which can then use them.
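The behavior of *args is easy to check outside of any OLE code. The short sketch below uses two throwaway procs; the names collect and split exist only for this example.

```ruby
# A block with *args accepts any number of arguments.
collect = proc { |*args| args }

no_args   = collect.call                # nothing passed: empty array
some_args = collect.call("one", "two")  # both values land in one array

# A leading named parameter takes the first value; *args takes the rest.
split = proc { |event, *args| [event, args] }
event, rest = split.call("DocumentComplete", :browser, "http://example.com")

puts no_args.inspect    # prints []
puts some_args.inspect  # prints ["one", "two"]
puts event              # prints DocumentComplete
puts rest.inspect       # prints [:browser, "http://example.com"]
```

The second form, with a named parameter in front of *args, is the shape used by a catch-all on_event block, where the first value is the event name.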

With the EventHandler class in place we can start to improve the IERunner class. This is our new initialize method.

def initialize
  @type = WIN32OLE_TYPE.new("Microsoft Internet Controls", "InternetExplorer")
  @app = WIN32OLE.new("InternetExplorer.Application")
  @ev_handler = EventHandler.new
  @run_loop = true

  @app_event = WIN32OLE_EVENT.new(@app)
  @app_event.handler = @ev_handler

  @ev_handler.add_handler("OnQuit") {|*args| exit_message_loop}
  @ev_handler.add_handler("DocumentComplete") do |*args|
    unless @app.document.body.innerHTML.empty?
      @doc_event = WIN32OLE_EVENT.new(@app.document)
      @doc_event.handler = @ev_handler
    end
  end
end

First we create the EventHandler object and define the flag that controls the message loop. After that, we create a new WIN32OLE_EVENT object for Internet Explorer and set our EventHandler object as the handler for the browser's events. At the end we subscribe to two events, OnQuit and DocumentComplete. The OnQuit handler is called when Internet Explorer is closed. It just calls exit_message_loop.

We also want to handle events related to loaded documents. Every time we navigate the browser to a new URL, a new document is created and the DocumentComplete event is fired when the document reaches ReadyState_Complete. Unless we loaded an empty page, we create a new event object for the loaded document and set our EventHandler object as the handler for document-related events. Note that even though we set the handler on the event object, no document event will be processed until we register a handler for it. But before we do that we need to implement a few more things.

def register_handler(event, &block)
  @ev_handler.add_handler(event, &block) unless event == 'OnQuit'
end

def exit_message_loop
  puts "IE exiting..."
  @run_loop = false
end

def run
  while(@run_loop)
    WIN32OLE_EVENT.message_loop
  end
end

The first method added to our IERunner class is part of our simple API. IERunner clients should use this method instead of accessing the EventHandler object directly. The reason is that we have already registered our own handler for the OnQuit event, which we need in order to finish the message loop gracefully, and we do not want clients to override it. The next method signals the message loop to finish. Finally, the run method runs the message loop.
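How the flag ends the loop can be tried without a browser. In the sketch below FakePump is a hypothetical stand-in for the browser side that hands out one queued event name per call, and LoopDemo mirrors the run and exit_message_loop pair. Neither class is part of win32ole or IERunner.

```ruby
# Stand-in for the browser: yields one queued event name per call.
class FakePump
  def initialize(events)
    @events = events.dup
  end

  def message_loop
    yield @events.shift unless @events.empty?
  end
end

# Mirrors the run / exit_message_loop pair with a flag-controlled loop.
class LoopDemo
  attr_reader :seen

  def initialize(pump)
    @pump = pump
    @run_loop = true
    @seen = []
  end

  # Keeps pumping until the OnQuit event clears the flag.
  # Without an OnQuit event in the queue this loop would never end.
  def run
    while @run_loop
      @pump.message_loop do |event|
        @seen << event
        @run_loop = false if event == "OnQuit"
      end
    end
  end
end

events = %w[BeforeNavigate2 DocumentComplete OnQuit TitleChange]
demo = LoopDemo.new(FakePump.new(events))
demo.run
puts demo.seen.inspect  # TitleChange is never delivered
```

Once OnQuit has been handled the loop stops, so events queued after it are not processed. This is also why a client must not be allowed to replace the OnQuit handler: without it the flag would never be cleared.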

Look at the following code from runner_sample.rb.

$LOAD_PATH.unshift File.expand_path('../', __FILE__)
require 'ie_runner'

ier = IERunner.new
ier.register_handler("BeforeNavigate2") do |*args|
  puts "About to go to: #{args[1]}"
end
ier.visible = true
ier.goto "http://rubyinstaller.org"
ier.run

After requiring the ie_runner file, we create the runner object. Next we register a new event handler for the BeforeNavigate2 event. Finally we display the Internet Explorer window, navigate to the RubyInstaller home page and start the message loop so that our EventHandler object gets the chance to process events fired by the browser. When this script is started a new Internet Explorer window opens and, before the page is loaded, the script prints the URL that was passed to the browser. Since our script runs as long as Internet Explorer is running, it can be a good starting point for creating macros. While the script is running we can use the browser as we usually do, and the script can collect the navigated URLs, save them to a file and replay them later.
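The macro idea could start from something like the following sketch. UrlRecorder is a hypothetical helper, not part of IERunner, and the one-URL-per-line file format is an assumption made for this example. In a real script, record would be called from a BeforeNavigate2 handler with args[1], and replay would pass each URL to goto.

```ruby
require 'tmpdir'

# Hypothetical helper: collects URLs and stores them one per line.
class UrlRecorder
  attr_reader :urls

  def initialize(urls = [])
    @urls = urls
  end

  # Meant to be called with the URL argument of a navigation event.
  def record(url)
    @urls << url.to_s
  end

  def save(path)
    File.write(path, @urls.join("\n"))
  end

  # Builds a recorder from a previously saved file.
  def self.from_file(path)
    new(File.readlines(path, chomp: true))
  end

  # Yields each URL in the order it was recorded.
  def replay
    @urls.each { |url| yield url }
  end
end

recorder = UrlRecorder.new
recorder.record("http://rubyinstaller.org")
recorder.record("http://rubyinstaller.org/downloads")

Dir.mktmpdir do |dir|
  path = File.join(dir, "urls.txt")
  recorder.save(path)
  # Here we only print; a real replay might call ier.goto(url) instead.
  UrlRecorder.from_file(path).replay { |url| puts "would visit #{url}" }
end
```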

Now that we know how to handle events without blocking the message queue, we can further improve our IERunner class so that it can be used not only for navigating to various URLs but also for testing web applications in Internet Explorer. The first thing we need is a way to find elements in loaded pages. Fortunately the document object within Internet Explorer already has methods that do that, and we only have to expose them through our interface.

def element_by_id(id)
  return @app.document.getElementById(id.to_s) unless @app.document.nil?
end

def elements_by_name(name)
  return @app.document.getElementsByName(name.to_s) unless @app.document.nil?
end

def elements_by_tag(tag)
  return @app.document.getElementsByTagName(tag.to_s) unless @app.document.nil?
end

The first method returns a single element, and the next two return collections of elements because there can be more than one element with a given name or tag on the page. Naturally we want to be able to simulate clicking on elements, so let's add this method too.

def click(id)
  elem = element_by_id(id)
  elem.click unless elem.nil?
end

Further, we need methods to fill in text, set a check-box value, select a value by index in a select box and find a child element by attribute value.

def fill_text(id, value)
  elem = element_by_id(id)
  elem.value = value unless elem.nil?
end

def set_check_box(id, value = true)
  elem = element_by_id(id)
  elem.checked = value unless elem.nil?
end

def select_index(id, idx)
  elem = element_by_id(id)
  elem.selectedIndex = idx unless elem.nil?
end

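def child_with_attrib_value(parent, attrib, value)
  if parent && parent.hasChildNodes
    parent.childNodes.each do |cn|
      begin
        return cn if cn.send(attrib) == value
      rescue
        # Some child nodes do not support the requested attribute; skip them.
      end
    end
  end
  nil # no matching child found
end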
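The child search can be exercised without a browser by feeding it stand-in nodes. In the sketch below FakeNode and TextNode are hypothetical Struct-based substitutes for DOM nodes, and the method is written as a free-standing variant that returns nil explicitly when nothing matches and rescues only NoMethodError.

```ruby
# Hypothetical stand-ins for DOM nodes; real nodes come from the browser.
FakeNode = Struct.new(:id, :className, :childNodes) do
  def hasChildNodes
    !childNodes.nil? && !childNodes.empty?
  end
end
TextNode = Struct.new(:data) # deliberately has no className

# Free-standing variant of the search for use with the stand-in nodes.
def child_with_attrib_value(parent, attrib, value)
  if parent && parent.hasChildNodes
    parent.childNodes.each do |cn|
      begin
        return cn if cn.send(attrib) == value
      rescue NoMethodError
        # This node has no such attribute; try the next one.
      end
    end
  end
  nil
end

children = [
  TextNode.new("hello"),
  FakeNode.new("a", "menu", []),
  FakeNode.new("b", "footer", [])
]
parent = FakeNode.new("root", "", children)

match = child_with_attrib_value(parent, :className, "footer")
miss  = child_with_attrib_value(parent, :className, "missing")
puts match.id      # prints b
puts miss.inspect  # prints nil
```

The text node is skipped because it does not respond to className, which is the situation the rescue clause is there for.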

Although IERunner could be improved with additional methods that search for elements by XPath, CSS selectors or multiple attribute values, it is good enough to illustrate Internet Explorer automation through Ruby scripts. But before we start using IERunner there is one more problem to solve. We saw that, in order to process events fired by the browser, we must use the message_loop method. This method keeps messages pumping and lets the browser and our Ruby script run side by side. Unfortunately this approach has a drawback. If we call the IERunner#run method it will not return as long as Internet Explorer is running, so any statement after the call will not be executed until then. At the same time, whenever we navigate to a new URL the page is not loaded immediately, and we have to wait until it loads before we continue processing.

This means we have to change our approach and implement a method that blocks neither the message queue nor our script and, at the same time, lets us wait until the page is loaded or a timeout expires. Let's do it now. First we have to define a flag that tells us whether the page is loaded. Add the following statement to the IERunner#initialize method.

@document_complete = false

Whenever a document is loaded we will set this variable to true, and we will reset it to false when we are about to go to a new URL. We do this by handling two events. The first event, DocumentComplete, is fired when the document reaches ReadyState_Complete or, in other words, when the page is completely loaded. The second one, BeforeNavigate2, is fired just before the browser navigates to a new URL. Keep in mind that add_handler stores only one block per event name, so only the most recently registered DocumentComplete block will be called. Add the following lines to the IERunner#initialize method.

@ev_handler.add_handler("DocumentComplete") do |*args|
  @document_complete = true
end
@ev_handler.add_handler("BeforeNavigate2") {|*args| @document_complete = false}

A call to the DocumentComplete event handler sets our flag to true, and firing the BeforeNavigate2 event sets the @document_complete flag back to false. That is exactly what we wanted.

If we want, for example, to process clicks on the document or handle key presses, we have to subscribe to document events. The DocumentComplete event is the one we can use to accomplish this. Whenever a new page is loaded in the browser, a new document object is created and we have to attach to its events. This can be done by adding the following code to the DocumentComplete event handler.

unless @app.document.body.innerHTML.empty?
  @doc_event = WIN32OLE_EVENT.new(@app.document)
  @doc_event.handler = @ev_handler
end
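The code has to go into the existing DocumentComplete handler because EventHandler keeps a single block per event name, so a second add_handler call for the same event replaces the first. One possible alternative, sketched here as an assumption and not as part of IERunner, is a handler class that keeps a list of blocks per event. MultiEventHandler is a hypothetical name.

```ruby
# Hypothetical variant of EventHandler that keeps several blocks per event.
class MultiEventHandler
  def initialize
    @handlers = Hash.new { |hash, event| hash[event] = [] }
  end

  # Appends the block instead of replacing an earlier one.
  def add_handler(event, &block)
    @handlers[event.to_s] << block if block
  end

  # Runs every block registered for the event, in registration order.
  # Unknown events are deliberately ignored, as in EventHandler.
  def method_missing(event, *args)
    name = event.to_s
    return nil unless @handlers.key?(name)
    @handlers[name].each { |handler| handler.call(*args) }
    nil
  end
end

calls = []
eh = MultiEventHandler.new
eh.add_handler("DocumentComplete") { |*args| calls << [:set_flag, args[1]] }
eh.add_handler("DocumentComplete") { |*args| calls << [:attach_events, args[1]] }

eh.DocumentComplete(nil, "http://example.com")
eh.TitleChange("ignored")  # no handler registered: nothing happens
puts calls.inspect
```

With this variant, a block registered by a client for an event such as BeforeNavigate2 would run in addition to an internal one rather than replace it.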

There is just one more thing we must implement in our IERunner class: the wait method, which is given below.

def wait_complete(secs, interval = 0.5)
  elapsed = 0
  while(!@document_complete && elapsed <= secs)
    elapsed += interval
    sleep(interval)
    WIN32OLE_EVENT.message_loop
  end
end

Our wait_complete method accepts two arguments. The first is the timeout in seconds and the second is the length of the sleeping interval. The second argument has a default value, meaning that if we do not pass it in the call, a sleeping interval of 0.5 seconds is used.

Inside the method we run a while loop until the document is fully loaded or the timeout expires. Within the loop we call the WIN32OLE_EVENT.message_loop method, giving Internet Explorer a chance to process the messages posted to it.
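The same polling idea can be written as a general helper and tried without a browser. The sketch below is a variation, not the chapter's method: wait_until and its pump: parameter are hypothetical names, and it measures time with a monotonic clock instead of adding up sleep intervals, so time spent pumping messages also counts toward the timeout. In IERunner the pump could be a lambda that calls WIN32OLE_EVENT.message_loop.

```ruby
# Polls until the block returns true or the timeout passes.
# Returns true if the condition was met, false on timeout.
def wait_until(timeout, interval = 0.5, pump: -> {})
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    pump.call                 # give pending events a chance to arrive
    return true if yield      # condition met
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep(interval)
  end
end

# Simulated page load: the "document" is complete after three pump calls.
pumped = 0
loaded = wait_until(1, 0.01, pump: -> { pumped += 1 }) { pumped >= 3 }
puts loaded     # prints true

# A condition that never becomes true ends with a timeout.
timed_out = !wait_until(0.05, 0.01) { false }
puts timed_out  # prints true
```

Returning a boolean lets the caller tell a loaded page from a timeout, which the flag alone would otherwise have to be checked for.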

The IERunner class is now ready. We can navigate Internet Explorer, search for elements, click links and wait for a document to be completely loaded. Here is the full source of the new runner_sample.rb script.

$LOAD_PATH.unshift File.expand_path('../', __FILE__)
require 'ie_runner'

ier = IERunner.new
ier.visible = true
ier.goto "http://www.google.com/language_tools"
ier.wait_complete(30)

ier.fill_text("source", "Interesantan primer")
ier.element_by_id("gt-submit").click

Our script starts a new instance of Internet Explorer, navigates it to Google's translate page and waits for the document to be loaded. The script returns from wait_complete as soon as the document reaches ReadyState_Complete or thirty seconds have passed. After that, the script fills in text in the input field whose ID is “source”. Next we look up the element with ID “gt-submit” and simulate a click on the submit button. The script doesn't close the browser on exit. It leaves it open on purpose so we can check whether we really end up with translated text.

With all the needed methods in place we are ready to use IERunner for Web application testing. Even though running tests in a real browser is really functional testing, we will, for the sake of simplicity, use a unit testing framework, minitest, which ships with Ruby. Here is our test case class.

$LOAD_PATH.unshift File.expand_path('../', __FILE__)
require 'ie_runner'
require 'minitest/autorun'

class IETest < MiniTest::Unit::TestCase
  def setup
    @ier = IERunner.new
    @ier.visible = true
  end

  def teardown
    @ier.close
  end

  def test_translate
    @ier.goto "http://www.google.com/language_tools"
    @ier.wait_complete(30)
    assert_equal(false, @ier.html.empty?, "Document not loaded")
    source = @ier.element_by_id("source")
    refute_nil(source, "Element 'source' not found")

    submit_btn = @ier.element_by_id("gt-submit")
    refute_nil(submit_btn, "Element 'gt-submit' not found")
  end
end

The minitest framework calls the setup method before each test, so we can use it to perform all the initialization the tests need. In the setup method of our test case we create a new IERunner object, which opens Internet Explorer. In the only test we open the page we want to test and check that it has loaded and contains two elements. The first line of the output below is a deprecation notice from minitest saying that the base class is now named Minitest::Test. We can run this test from the command line using the following command:

C:\projects\ruby\rwin_book_code>ruby ie_test.rb
MiniTest::Unit::TestCase is now Minitest::Test. From ie_test.rb:5:in `<main>'
Run options: --seed 16012

# Running:

.

Finished in 21.554106s, 0.0464 runs/s, 0.1392 assertions/s.

1 runs, 3 assertions, 0 failures, 0 errors, 0 skips

Since we close the browser in the teardown method of our test case, and this method is called after each test, this time, unlike in the previous example, the browser is properly closed. At this stage we have a way to check whether our application behaves as we expect, by checking for specific content on the loaded page, and we do not have to leave the browser open in order to verify that.

Our test class has a big overhead, however. Since setup is executed before and teardown after each test, Internet Explorer keeps opening and closing. This is very expensive and makes the test run take too long. A better solution would be to open the browser before all tests and close it at the end.