Web scraping using Node.js

Here I present a simple way to scrape the web for the data you need. Be careful when using scraping techniques, though: you may not be allowed to scrape a given site. In this article I scrape the home page of this website and extract the titles of the latest posts using Node.js code.


Simple steps for Web Scraping Using Node.js

  • Initialize a new Node.js project from the editor terminal.

    npm init -y

  • Install the packages required for scraping. Here we install two packages, cheerio and request, from the editor terminal.

    npm install cheerio

and

    npm install request

  • Then create an index.js file and fill it with the code shown below.
    • Try a different link that interests you.


What do you need for scraping?

  1. Get the URL of the website you want to scrape.
  2. Press Ctrl + Shift + I or F12 to open the Inspect window, then select the 'Elements' tab, which shows the HTML of the site.
  3. Click the arrow icon in the toolbar of the Inspect window. It lets you select the text you want to extract on the page shown in the left window.
  4. Click the text you want to extract so that it is highlighted in the Elements tab of the Inspect window.
  5. Copy the class name of the element that contains your text of interest. You need to insert it in the code.
  6. Check whether that element has any child HTML tags. This matters when the text of interest sits inside a child tag.

The Inspect window, displaying the information we need for scraping the content of the site.



index.js

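    const cheerio = require('cheerio');
    const request = require('request');

    function scrapeBlogTitle() {
        request('https://www.bloggernepal.com/', (error, response, html) => {
            // Stop early if the request failed or the server did not answer 200 OK.
            if (error || response.statusCode !== 200) {
                console.error('Could not fetch the page:', error || response.statusCode);
                return;
            }

            // Parse the HTML so it can be queried with CSS selectors.
            const $ = cheerio.load(html);

            // Collect each link text inside .post-title as a separate entry;
            // calling .text() on the whole selection would join all titles into one string.
            const latestPosts = $('.post-title')
                .children('a')
                .map((index, element) => $(element).text().trim())
                .get();

            console.log('Latest posts of Blogger Nepal are', latestPosts);
        });
    }

    scrapeBlogTitle();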
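
The request package has been marked as deprecated by its maintainers since 2020, so it may be worth knowing an alternative. Below is a minimal sketch, assuming Node.js 18 or newer where fetch is available globally, of a helper that downloads a page. The name fetchHtml and the fetchImpl parameter are hypothetical, introduced here only for illustration.

```javascript
// Download a page and resolve with its HTML as a string.
// fetchImpl defaults to the global fetch that ships with Node.js 18+;
// it is a parameter only so the helper can be tried without a network.
async function fetchHtml(url, fetchImpl = fetch) {
    const response = await fetchImpl(url);

    // response.ok is true for any 2xx status code.
    if (!response.ok) {
        throw new Error(`Request failed with status ${response.status}`);
    }

    return response.text();
}
```

The string it resolves to can be handed to cheerio.load in the same way as html in index.js.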

  • Run the script. 

    node index.js

See the result in the terminal.

If the post titles are printed, you have succeeded.
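
To try a different link without editing the file each time, one option is to read the URL and the class name from the command line. This is only a sketch: the helper name readOptions and the argument order are assumptions made for illustration, and the defaults are the values used in index.js.

```javascript
// Read the target URL and the class selector from the command line,
// falling back to the values used in index.js when they are not given.
function readOptions(argv) {
    // argv[0] is the node binary and argv[1] is the script path, so skip both.
    const [url = 'https://www.bloggernepal.com/', selector = '.post-title'] = argv.slice(2);
    return { url, selector };
}

const options = readOptions(process.argv);
console.log('Scraping', options.url, 'with selector', options.selector);
```

It could then be run as node index.js https://example.com/ .entry-title, where both arguments are placeholders for a site and class name of your own.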

I hope this short article helps you get an idea of web scraping using Node.js. If you have comments, feedback, or suggestions, please share them.
