How to get all the links in an HTML page stored in a string in Python?
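A reliable way to extract links from an HTML page stored in a string is to parse it with BeautifulSoup (the bs4 package). The html5lib parser used here is a separate package and must be installed alongside it: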
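from bs4 import BeautifulSoup
# htmlStr holds the HTML source as a string
soup = BeautifulSoup(htmlStr, features="html5lib")
# Get all <a> tags (find_all is the current name; findAll is a legacy alias)
soup.find_all('a')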
Here is a full example displaying the links:
from bs4 import BeautifulSoup
htmlStr = '<body><a href="https://ans.wiki">AnsWiki</a><br><a href="https://fr.ans.wiki">AnsWiki French</a></body>'
soup = BeautifulSoup(htmlStr, features="html5lib")
# Print the href attribute of every link
for link in soup.find_all('a'):
    print(link.get('href'))
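If installing BeautifulSoup and html5lib is not an option, the same idea can be sketched with the standard library alone. This is only an illustrative sketch, not part of the answer above: it swaps BeautifulSoup for html.parser.HTMLParser, which may handle badly malformed markup differently from html5lib. The names LinkCollector, extract_links and base_url are hypothetical, and the /about link and the anchor without an href are made-up sample data. It also skips anchors that have no href and resolves relative links against an assumed base URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url  # assumed page URL, used for relative links
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:  # skip anchors without an href value
            self.links.append(urljoin(self.base_url, href))


def extract_links(html_str, base_url=""):
    """Return the links found in html_str, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html_str)
    parser.close()
    return parser.links


# Sample data: one absolute link, one relative link, one anchor without href
html_str = (
    '<body><a href="https://ans.wiki">AnsWiki</a><br>'
    '<a href="/about">About</a><a name="top">No href</a></body>'
)
for link in extract_links(html_str, base_url="https://ans.wiki"):
    print(link)
```

With an empty base_url, relative links are returned unchanged, which is closer to what link.get('href') gives in the BeautifulSoup example.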