What do you understand about broken links? How can you detect broken links in Selenium? Explain properly with code.

1 Answer

Links or URLs that are not reachable are known as broken links. They may be unavailable or inoperable because of a server issue, or because the resource has been moved or deleted. A working URL returns an HTTP status code in the 2xx range, which indicates success. HTTP status codes are grouped into classes, each with its own meaning. Codes in the 4xx and 5xx classes indicate that the request failed: the 4xx class covers client-side errors (for example, 404 Not Found), while the 5xx class covers server-side errors (for example, 500 Internal Server Error).
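As a rough illustration of how status codes map to these classes, here is a minimal sketch. The class name StatusClassifier and the methods classify and isBroken are hypothetical names chosen for this example; they are not part of Selenium or the JDK.

```java
public class StatusClassifier {

    // Maps an HTTP status code to a short name for its class.
    public static String classify(int code) {
        if (code >= 200 && code < 300) return "success";
        if (code >= 300 && code < 400) return "redirection";
        if (code >= 400 && code < 500) return "client error";
        if (code >= 500 && code < 600) return "server error";
        return "other";
    }

    // Assumption for this sketch: a link counts as broken when the
    // server answers with a 4xx or 5xx status.
    public static boolean isBroken(int code) {
        return code >= 400 && code < 600;
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 404, 503};
        for (int code : samples) {
            System.out.println(code + " -> " + classify(code) + ", broken: " + isBroken(code));
        }
    }
}
```

Treating 3xx as not broken is a simplification: a redirect may itself point to a missing page, so a stricter checker would follow the redirect and classify the final response.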

You should check your site for broken links regularly so that users do not end up on an error page. A link typically breaks when redirect rules are not updated correctly or when the requested resource is no longer available on the server. Checking links manually is time-consuming, because a single web page may contain a large number of links and the check has to be repeated for every page.

To find broken links in Selenium, follow the steps below.

1. Using the <a> (anchor) tag, collect all of the links on a web page.
2. For each link, send an HTTP request.
3. Read the HTTP response code that comes back.
4. Based on the HTTP response code, determine whether the link is valid or broken.
5. Repeat the procedure for all of the links that were captured in the first step.

package SeleniumPackage;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

public class BrokenLinks {

    public static void main(String[] args) {

        String pageURL = "http://www.interviewbit.com";

        // Path to the local ChromeDriver executable; adjust it for your machine
        System.setProperty("webdriver.chrome.driver", "C:\\Users\\user\\Downloads\\selenium\\chromedriver_win32\\chromedriver.exe");

        ChromeOptions options = new ChromeOptions();
        options.addArguments("--headless", "--disable-gpu", "--window-size=1920,1200", "--ignore-certificate-errors", "--silent");

        WebDriver driver = new ChromeDriver(options); // starting a headless Chrome session

        try {
            driver.manage().window().maximize();
            driver.get(pageURL);

            // Host of the page that was actually loaded (after any redirect, e.g. http to https)
            String pageHost = URI.create(driver.getCurrentUrl()).getHost();

            // getting hold of all the elements having the anchor tag
            List<WebElement> links = driver.findElements(By.tagName("a"));

            // Iterating over the obtained list of elements and checking them one by one
            for (WebElement link : links) {
                String url = link.getAttribute("href");
                System.out.println(url);

                if (url == null || url.isEmpty()) {
                    System.out.println("The linked element has invalid href url.");
                    continue;
                }

                HttpURLConnection huc = null;
                try {
                    URL target = new URL(url);

                    // Only http(s) links on the same host are checked; comparing hosts
                    // instead of URL prefixes keeps the check working across http/https
                    if (!target.getProtocol().startsWith("http") || !target.getHost().equalsIgnoreCase(pageHost)) {
                        System.out.println("URL belongs to another domain, skipping it.");
                        continue;
                    }

                    huc = (HttpURLConnection) target.openConnection();
                    huc.setRequestMethod("HEAD"); // HEAD fetches the status without the body
                    huc.setConnectTimeout(5000);  // avoid hanging on unresponsive servers
                    huc.setReadTimeout(5000);
                    huc.connect(); // connecting to the url

                    int responseCode = huc.getResponseCode(); // reading the response code

                    if (responseCode >= 400) {
                        System.out.println(url + " is a broken link");
                    } else {
                        System.out.println(url + " is a valid link");
                    }
                } catch (MalformedURLException e) {
                    // e.g. javascript: links, which are not real URLs
                    System.out.println(url + " is not a valid URL: " + e.getMessage());
                } catch (IOException e) {
                    e.printStackTrace();
                } finally {
                    if (huc != null) {
                        huc.disconnect(); // releasing the connection
                    }
                }
            }
        } finally {
            driver.quit(); // closing the browser even if something above fails
        }
    }
}

Explanation - In the above code, we first set the system property that points to the ChromeDriver executable and then initialize a WebDriver object. We find all the anchor elements on the web page with the help of the findElements() method. Then, we iterate over the list one element at a time, send a HEAD request to each URL, and read the response code to decide whether the link is broken (a status of 400 or higher) or valid.
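One practical caveat: some servers do not support HEAD and answer with 405 (Method Not Allowed) or 501 (Not Implemented), which would make a working link look broken. The sketch below shows one possible way to handle that by retrying with GET, using java.net.http.HttpClient (available from Java 11) instead of HttpURLConnection. The class name LinkStatusProbe and its method names are hypothetical, and the timeout values are arbitrary assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class LinkStatusProbe {

    // 405 and 501 suggest the server does not support HEAD for this resource.
    public static boolean shouldRetryWithGet(int status) {
        return status == 405 || status == 501;
    }

    // Sends a body-less request with the given method and returns the status code.
    static int send(HttpClient client, String url, String method)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10)) // assumed value
                .method(method, HttpRequest.BodyPublishers.noBody())
                .build();
        // The response body is discarded; only the status matters here.
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    // Tries HEAD first and falls back to GET when HEAD is not supported.
    public static int fetchStatus(HttpClient client, String url)
            throws IOException, InterruptedException {
        int status = send(client, url, "HEAD");
        return shouldRetryWithGet(status) ? send(client, url, "GET") : status;
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .connectTimeout(Duration.ofSeconds(5)) // assumed value
                .build();
        String url = args.length > 0 ? args[0] : "https://www.interviewbit.com";
        int status = fetchStatus(client, url);
        System.out.println(url + " -> " + status + (status >= 400 ? " (broken)" : " (valid)"));
    }
}
```

HttpClient does not follow redirects by default, so the sketch enables Redirect.NORMAL to report the status of the final page rather than the 3xx response. In a Selenium test, fetchStatus could be called with each href collected by findElements().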
