Android Basic JSOUP Tutorial

In this tutorial, you will learn how to use JSOUP, an open source Java library, in your Android application. JSOUP provides a convenient API for extracting and manipulating data using DOM traversal, CSS selectors, and jQuery-like methods. It can fetch and parse HTML from a URL, a file, or a string. We will create three buttons on the main view, and each button will perform a different task: showing the website title, its description, or its logo. So let's begin.

Before you proceed with this tutorial, download the latest JSOUP library JAR from the official JSOUP download page.

Paste the downloaded JSOUP JAR file into your project's libs folder, as shown in the image below.

[Image: jsoup_libs]

Create a new project in Eclipse: File > New > Android Application Project. Fill in the details and name your project JsoupTutorial.

Application Name : JsoupTutorial

Project Name : JsoupTutorial

Package Name : com.androidbegin.jsouptutorial

Open your MainActivity.java and paste the following code.

MainActivity.java

In this activity, we have created three buttons that respond to three different AsyncTasks. Before explaining further, the steps below show how to view the HTML source code of a website.

Step 1: Visit http://www.androidbegin.com in any web browser on your PC.

[Image: homepage]

Step 2: Right-click on an empty area of the page and select “View page source”.

[Image: pagesource]

Step 3: The browser displays the website's source code.

[Image: source]

A website's source code determines how its pages appear. However, the page source shown in the browser contains only the HTML and other client-side code sent to the browser; code that is processed on the server, such as PHP, is not visible.

The first button retrieves the website title, that is, the text of the title element in the page source.

Java Code

Website Source Code
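As a minimal sketch of the title lookup, assuming the JSOUP JAR is on the classpath: the snippet below parses a made-up HTML string (a stand-in for the page source, not the actual markup of the site) and reads the title with Document.title(). The class name TitleExample, the method extractTitle, and the sample HTML are hypothetical names chosen for illustration. In the app, the document would come from Jsoup.connect(url).get() running inside the AsyncTask rather than from a string.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class TitleExample {
    // Hypothetical stand-in for the page source; not the real site markup.
    static final String SAMPLE_HTML =
            "<html><head><title>Example Site Title</title></head>"
            + "<body><p>Hello</p></body></html>";

    // Returns the text of the <title> element, or "" if the page has none.
    static String extractTitle(String html) {
        Document document = Jsoup.parse(html);
        return document.title();
    }

    public static void main(String[] args) {
        System.out.println(extractTitle(SAMPLE_HTML)); // prints "Example Site Title"
    }
}
```

Parsing from a string keeps the example independent of the network, so the same method can be checked against a saved copy of the page source.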

 

The second button retrieves the website description. By selecting Elements with a CSS query, we can specify exactly where in the page source the data is located.

Java Code

Website Source Code 
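A hedged sketch of the description lookup, assuming the description is stored in a meta tag with name="description" (a common convention, but an assumption about the page): the selector meta[name=description] returns the matching Elements, and attr("content") reads the value from the first match. The class name DescriptionExample, the method extractDescription, and the sample HTML are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

class DescriptionExample {
    // Hypothetical stand-in for the page source; not the real site markup.
    static final String SAMPLE_HTML =
            "<html><head>"
            + "<meta name=\"description\" content=\"Tutorials for Android beginners\">"
            + "</head><body></body></html>";

    // Returns the content of the description meta tag, or "" if it is absent.
    static String extractDescription(String html) {
        Document document = Jsoup.parse(html);
        // CSS query: <meta> elements whose name attribute equals "description".
        Elements description = document.select("meta[name=description]");
        return description.attr("content");
    }

    public static void main(String[] args) {
        System.out.println(extractDescription(SAMPLE_HTML)); // prints "Tutorials for Android beginners"
    }
}
```

If the target page stores its description elsewhere, only the selector string needs to change.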

 

The third button retrieves the website logo. Again, by selecting Elements with a CSS query, we can specify exactly where in the page source the image is located.

Java Code

Website Source Code
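A hedged sketch of locating the logo, assuming it is the first img element with a src attribute (the right selector depends entirely on the page's markup and will need adjusting if the page changes): the snippet resolves the image address to an absolute URL with absUrl("src"). The class name LogoExample, the method extractLogoUrl, the base URI, and the sample HTML are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class LogoExample {
    // Hypothetical stand-in for the page source; not the real site markup.
    static final String SAMPLE_HTML =
            "<html><body><h1><a href=\"/\">"
            + "<img src=\"/images/logo.png\" alt=\"Logo\">"
            + "</a></h1></body></html>";

    // Returns the absolute URL of the first <img> with a src attribute,
    // or "" if the page contains no such image.
    static String extractLogoUrl(String html, String baseUri) {
        // The base URI lets jsoup resolve relative paths such as "/images/logo.png".
        Document document = Jsoup.parse(html, baseUri);
        Element logo = document.select("img[src]").first();
        return logo == null ? "" : logo.absUrl("src");
    }

    public static void main(String[] args) {
        // prints "http://www.example.com/images/logo.png"
        System.out.println(extractLogoUrl(SAMPLE_HTML, "http://www.example.com/"));
    }
}
```

On Android, the resulting URL would still have to be downloaded and decoded into a Bitmap on a background thread, for example with BitmapFactory.decodeStream on the stream opened from the URL.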

 

Next, create an XML layout for MainActivity. Go to res > layout, right-click on layout, and select New > Android XML File.

Name your new XML file activity_main.xml and paste the following code.

activity_main.xml

Next, change the application name and the button text. Open strings.xml in the res > values folder and paste the following code.

strings.xml

In AndroidManifest.xml, we need to declare the android.permission.INTERNET permission to allow the application to connect to the Internet. Open AndroidManifest.xml and paste the following code.

AndroidManifest.xml

Output:

[Image: BasicJsoupTutorial ScreenShots]

Source Code

JsoupTutorial (1.6 MiB, 931 downloads)
  • Aravind Asthme

    nice article

  • Pierre Lel Mustang

    Helped a lot . thanks

  • Shuai Wang

    great!

  • Zhubarb

    Thank you, this is very helpful. However, I noticed that it is quite slow. I guess due to having to fetch the entire document separately for all Async Tasks? Wouldn’t it be faster to run “Document document = Jsoup.connect(url).get();” once and then let the buttons access the different parts of the ‘document’?

  • André Felipe

    It’s a good article, but I have a problem. I did everything described in the article, but the app doesn’t work. Using the debugger, I found that the problem apparently occurs when Jsoup tries to open a connection to a URL (Jsoup.connect(url).get()), but I don’t know how to fix it.

    • Patrik

      Make sure you export jsoup (properties –> Java build path –> Order and export)

      • ethanchan

        i have the same problem too…there is an error in “doInBackground”

  • http://www.AndroidBegin.com/ AndroidBegin

    Hi developers, this tutorial may not be working at the moment because of the changes I made to this website. I will update the current tutorial as soon as possible.

  • zurche

    This is an AWESOME tutorial. Thank you very much for sharing this.

  • AFRODESCENDIENTE

    Hello, what if i want to create a xml layout with the same structure of a div from a website, maybe ‘ Hello ‘ …. how can i add each data into the corresponding field on my xml layout ‘ image.jpgtext ???

  • AFRODESCENDIENTE

    Hello,
    My emulator is not showing the app, throws an error! i tried with an other website but same happens… what should i do?

  • Guest

    Great! thx

  • KK

    Awesome example

  • http://www.midhunadarvin.site90.com Midhun

    I’m getting the text but the image is not being downloaded . Can you suggest anything that might be causing this?

    • http://www.midhunadarvin.site90.com Midhun

      oh i got it……the img tag in the website doesn’t fall between h1 tags anymore…..so you have to change the jsoup selection.

      • http://www.AndroidBegin.com/ AndroidBegin

        Hi Midhun, glad you solved your problem. I will update this tutorial soon to correct the code.

  • golon

    Thanks!!!!

  • jura7

    I always get the same error in logcat. Can anybody help, please? When I click on title button progress dialog shows for few seconds and then my app crashes:/ Here below is my logcat if it helps.

  • http://www.AndroidBegin.com/ AndroidBegin

    Hi Developers, I’ve made the changes accordingly. Should be working fine now. Thanks