How to get the page title from HTML in Python?
A common way to extract the title from an HTML page stored in a string is to use BeautifulSoup:
from bs4 import BeautifulSoup
htmlStr = '<head><title>Header title</title></head><body><h1>Body title</h1></body>'
# html5lib must be installed separately; the built-in "html.parser" also works here
soup = BeautifulSoup(htmlStr, features="html5lib")
# Get page title (the <title> element in the header)
titleHeader = soup.find('title')
print(titleHeader.string)
# Get page title (the first <h1> heading in the body)
titleH1 = soup.find('h1')
print(titleH1.string)
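If installing a third-party package is not an option, the same header title can probably be recovered with the standard library alone. The following is a minimal sketch, not the BeautifulSoup approach shown above: it swaps in `html.parser.HTMLParser`, and the names `TitleParser` and `get_title` are hypothetical helpers introduced only for illustration. It returns `None` when the page has no `<title>` element, a case where the `soup.find('title').string` call above would raise an `AttributeError`.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False   # True while we are between <title> and </title>
        self._done = False       # True once the first <title> has been closed
        self._parts = []         # text chunks seen inside the title

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is of interest
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # None means no complete <title> element was found
        return "".join(self._parts).strip() if self._done else None


def get_title(html_str):
    """Return the text of the first <title> in html_str, or None."""
    parser = TitleParser()
    parser.feed(html_str)
    parser.close()
    return parser.title


html_str = '<head><title>Header title</title></head><body><h1>Body title</h1></body>'
print(get_title(html_str))  # Header title
```

This sketch only handles the `<title>` element. For the `<h1>` lookup, or for messy real-world markup, BeautifulSoup is likely the more convenient choice.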