Architectural Themes
Before we start looking at the individual pieces to understand how they're wired together, it's useful to understand the overarching themes of the architecture and development of the project. Succinctly put, these are:
· Keep the costs down.
· Emulate the user.
· Prove the drivers work…
· …but you shouldn't need to understand how everything works.
· Lower the bus factor.
· Have sympathy for a JavaScript implementation.
· Every method call is an RPC call.
· We are an Open Source project.
Before you start automating with any automation tool, it is important to know how that tool works and how it is architected. This helps you take full advantage of the tool and also helps you build the right automation framework. In later posts I will explain how to use Selenium and how to create a Selenium framework in detail, but before that let's get an overview of the Selenium WebDriver architecture. Selenium can be a little confusing. As a beginner you will find it easy to record and play back Selenium scripts, but it is not obvious how Selenium does that. At first glance it might appear that Selenium drives the browser directly from your code, but there is a little more going on, and looking at this basic architecture will also help us understand how tests can be executed remotely. So here's a picture of the architecture for Selenium WebDriver, which is the current version of Selenium.

The Selenium WebDriver architecture is mainly divided into three parts:
1. Language-level bindings
2. Selenium WebDriver API
3. Drivers
1) Language Level Bindings:
On the left-hand side you can see the language-level bindings, with which you write your Selenium WebDriver code. In simple words, these are the languages in which you build your framework; code written in them interacts with Selenium WebDriver and works across various browsers and other devices. Selenium has a common API with a common set of commands, and there are bindings to it for the different languages. You can see Java, Python and Ruby; there are some other bindings too, and new bindings can be added very easily.
2) Selenium WebDriver API:
These bindings communicate with the Selenium WebDriver API. The API takes the commands issued by the language-level bindings, interprets them and sends them to the respective driver. Don't worry about how that works right now; I will explain it in upcoming posts. In basic terms, it contains a set of common libraries that allow commands to be sent to the respective drivers.
3) Drivers:
On the right-hand side you can see the various browser-specific drivers, such as the IE, Firefox and Chrome drivers, and other drivers such as HtmlUnit, which is an interesting one: it works in headless mode, which makes test execution faster. There are mobile-specific drivers as well. The basic idea is that each of these drivers knows how to drive the browser it corresponds to. So the Chrome driver knows how to handle the low-level details of the Chrome browser and drive it to do things like clicking buttons, navigating to pages and getting data from the browser itself; the same goes for Firefox, IE and so on.
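To make the "common API, browser-specific driver" idea concrete, here is a minimal sketch in plain Python. It does not use Selenium at all: the `Driver` base class, the two fake drivers and the `open_and_submit` helper are all hypothetical names invented for this illustration, and the "browser actions" are just strings.

```python
from abc import ABC, abstractmethod
from typing import List


class Driver(ABC):
    """Hypothetical common command set that every driver has to implement."""

    @abstractmethod
    def navigate(self, url: str) -> str:
        """Load a page and describe what was done."""

    @abstractmethod
    def click(self, element_id: str) -> str:
        """Click an element and describe what was done."""


class FakeChromeDriver(Driver):
    # A real driver would talk to Chrome here; this one only reports the action.
    def navigate(self, url: str) -> str:
        return f"chrome: load {url}"

    def click(self, element_id: str) -> str:
        return f"chrome: click #{element_id}"


class FakeFirefoxDriver(Driver):
    # Same commands, different browser-specific handling.
    def navigate(self, url: str) -> str:
        return f"firefox: load {url}"

    def click(self, element_id: str) -> str:
        return f"firefox: click #{element_id}"


def open_and_submit(driver: Driver) -> List[str]:
    """Test code written once against the common API, whatever the browser."""
    return [
        driver.navigate("https://example.test/login"),
        driver.click("submit"),
    ]


for driver in (FakeChromeDriver(), FakeFirefoxDriver()):
    print(open_and_submit(driver))
```

The point of the sketch is only that `open_and_submit` never changes when the driver does; how a real driver handles its browser's low-level details is hidden behind the common commands.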
How do all the blocks work together?
You write your test in, let's say, Java, using the common Selenium API, and the Java binding sends commands across this common WebDriver API. On the other end a driver is listening. It interprets those commands, executes them on the actual browser, and then returns the result back through the WebDriver API to your code, where you can look at it.
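One way to picture "the binding sends commands" is as a translation from a method call into a small request. The sketch below is my own simplification, not the code of any real binding: `COMMANDS` and `build_request` are hypothetical names, and only three commands are covered. The HTTP methods and paths are the ones the W3C WebDriver specification uses for navigating to a URL, reading the page title and deleting a session.

```python
import json

# command name -> (HTTP method, path template); a tiny, illustrative subset
COMMANDS = {
    "get": ("POST", "/session/{session_id}/url"),
    "title": ("GET", "/session/{session_id}/title"),
    "quit": ("DELETE", "/session/{session_id}"),
}


def build_request(command, session_id, **params):
    """Turn a binding-level call into (method, path, JSON body or None)."""
    method, template = COMMANDS[command]
    path = template.format(session_id=session_id)
    # Only POST commands carry their parameters as a JSON body in this sketch.
    body = json.dumps(params) if method == "POST" else None
    return method, path, body


print(build_request("get", "abc123", url="https://example.test/"))
print(build_request("title", "abc123"))
```

This is also what the theme "every method call is an RPC call" amounts to: each call in your test becomes one request that the driver has to answer.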
Let's take a closer look at how exactly that works
Let's say you have written a test in Java (the binding code) against the Selenium API. That binding code issues commands across the WebDriver wire protocol to a REST-based web service that is able to interpret them. This service is the driver server, a small executable; each of the drivers has one. When you run your tests it listens on a port on your local machine, waiting for commands to come in. When a command arrives, the driver server interprets it, automates the browser and returns the result.
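The round trip can be imitated with nothing but the Python standard library. The following is a toy, not a real driver server: `ToyDriverHandler`, `send` and `run_demo` are hypothetical names, the "browser" is a single variable that remembers the last URL, and the new-session request is simplified to an empty body. What it does show is a small server listening on a local port, receiving JSON commands over HTTP and returning results wrapped in a `value` field.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ToyDriverHandler(BaseHTTPRequestHandler):
    """Stands in for a driver server; it records the URL instead of driving a browser."""

    current_url = None  # the toy "browser" state

    def _reply(self, value):
        payload = json.dumps({"value": value}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/session":
            self._reply({"sessionId": "toy-session"})
        elif self.path.endswith("/url"):
            ToyDriverHandler.current_url = body["url"]  # "navigate"
            self._reply(None)
        else:
            self.send_error(404)

    def do_GET(self):
        if self.path.endswith("/url"):
            self._reply(ToyDriverHandler.current_url)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the example quiet


def send(port, method, path, params=None):
    """Play the role of the binding: send one command, return its result."""
    data = json.dumps(params).encode("utf-8") if params is not None else None
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}{path}", data=data, method=method
    )
    if data is not None:
        request.add_header("Content-Type", "application/json")
    # Ignore any system proxy settings so the request stays on this machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())["value"]


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), ToyDriverHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        session_id = send(port, "POST", "/session", {})["sessionId"]
        send(port, "POST", f"/session/{session_id}/url", {"url": "https://example.test/"})
        return send(port, "GET", f"/session/{session_id}/url")
    finally:
        server.shutdown()
        server.server_close()


print(run_demo())
```

A real driver server does the same kind of listening and replying, except that between receiving a command and sending the result it actually automates the browser.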
I hope this gives you a clear idea of how Selenium WebDriver is architected.
