How to get the page title from HTML in Python?
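A convenient way to extract the title from an HTML page stored in a string is to use BeautifulSoup. The html5lib parser used here must be installed separately; Python's built-in "html.parser" also works.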
from bs4 import BeautifulSoup
htmlStr = '<head><title>Header title</title></head><body><h1>Body title</h1></body>'
# Parse the HTML string into a navigable tree
soup = BeautifulSoup(htmlStr, features="html5lib")
# Get page title (the <title> element in the header)
titleHeader = soup.find('title')
print(titleHeader.string)
# Get page title (the first <h1> element in the body)
titleH1 = soup.find('h1')
print(titleH1.string)
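If installing BeautifulSoup is not an option, roughly the same result can be obtained with the standard library's `html.parser` module. The sketch below is a swapped-in alternative, not the BeautifulSoup approach above; the `TitleParser` class and `get_titles` helper are hypothetical names introduced only for illustration. It returns `None` for an element that is missing instead of raising an error.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of the first <title> and the first <h1> element."""

    def __init__(self):
        super().__init__()
        self.found = {}       # tag name -> collected text
        self._current = None  # tag currently being captured, if any
        self._chunks = []     # text pieces seen inside the captured tag

    def handle_starttag(self, tag, attrs):
        # Start capturing only for the first occurrence of each wanted tag
        if tag in ("title", "h1") and tag not in self.found and self._current is None:
            self._current = tag
            self._chunks = []

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        # Stop capturing when the captured tag closes
        if tag == self._current:
            self.found[tag] = "".join(self._chunks).strip()
            self._current = None


def get_titles(html_str):
    """Return (title_text, h1_text); each is None if the element is absent."""
    parser = TitleParser()
    parser.feed(html_str)
    parser.close()
    return parser.found.get("title"), parser.found.get("h1")


htmlStr = '<head><title>Header title</title></head><body><h1>Body title</h1></body>'
title, h1 = get_titles(htmlStr)
print(title)
print(h1)
```

Because the text is joined from every piece found inside the element, a heading such as `<h1>Hello <b>World</b></h1>` yields the full text, whereas BeautifulSoup's `.string` returns `None` when a tag has more than one child.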