Hello, I've been learning Python for only three months and I have a homework project that I haven't been able to finish. I need some help: I wrote two different scripts and I plan to combine them, but I keep running into errors at this point, so I'd appreciate any suggestions. (At the end of the post I've also sketched roughly how I'm thinking of combining them.)
MY ASSIGNMENT TEXT
Python Projects
Newspaper webpage crawling: It should crawl the websites specified. It should offer the option of storing either the raw HTML or only the news text; the second option should include the news topic and the text. Output files should be .html for raw HTML and .txt for news text. The names of the files can be arbitrary numbers. For each website a folder should be created.
Inputs: websites, crawling depth, storing option (raw HTML or news text), root folder
Output: folders with website names, files containing website data
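Before my code, this is the overall skeleton I think the finished program needs, based on the inputs and outputs above. The crawl_site function, the folder and file names, and the example inputs are just my own placeholders (not part of the assignment), the crawling depth is not handled yet, and the news-text extraction is left as a TODO:

import os
import requests

def crawl_site(url, depth, store_raw, root_folder):
    # NOTE: the crawling depth is not used in this skeleton yet
    # One folder per website, named after the site, inside the root folder
    site_name = url.split("//")[-1].strip("/").replace("/", "_")
    site_folder = os.path.join(root_folder, site_name)
    os.makedirs(site_folder, exist_ok=True)

    # File names are arbitrary numbers: .html for raw HTML, .txt for news text
    extension = ".html" if store_raw else ".txt"
    file_path = os.path.join(site_folder, "1" + extension)

    r = requests.get(url)
    with open(file_path, "w", encoding="utf-8") as f:
        if store_raw:
            f.write(r.text)   # store the raw HTML
        else:
            f.write("")       # TODO: store only the news topic and the text

# Inputs: websites, crawling depth, storing option (raw HTML or news text), root folder
websites = ["https://www.dailystar.co.uk/"]
for site in websites:
    crawl_site(site, depth=1, store_raw=True, root_folder="output")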
My first code:
import requests
from bs4 import BeautifulSoup

url = 'https://www.dailystar.co.uk/'
r = requests.get(url)
source = BeautifulSoup(r.content, "lxml")
title_link = source.find_all("h3", attrs={"class": "title"})

# Write the news titles to a text file
file_text = open("news_title.txt", "w")
file_text.write("-")
for link in title_link:
    print(link.text)
    file_text.write(str(link.text))
    file_text.write("\n-")
file_text.close()

print("*******************************")

# Write the raw HTML of the page to an .html file
file_html = open("html_address.html", "w")
file_html.write(str(source))
file_html.close()

print("*******************************")
My second code (Read News Application):

import requests
from bs4 import BeautifulSoup

headers = requests.utils.default_headers()
headers.update({
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0',
})

print("Please wait... fetching the news...\n")
url = 'https://www.dailystar.co.uk/'
istek = requests.get(url, headers=headers)
icerik = istek.content
soup = BeautifulSoup(icerik, "html.parser")

print("THE LINKS AND HEADLINES ARE AS FOLLOWS:\n ------------------------")

haberler = soup.find_all("h3", {"class": "title"})  # news headlines
linkler = soup.find_all("a", {"class": "story"})    # story links

# Print the numbered headlines
sayi = 1
for i in haberler:
    print(sayi, "-)", i.text)
    sayi += 1

# Print each link, fetch the article, and print its text
sayi = 1
for i in linkler:
    print(sayi, "-)", i.get("href"))
    sayi += 1
    istek2 = requests.get(i.get("href"), headers=headers)
    istek_soup = BeautifulSoup(istek2.content, "lxml")
    print(istek2.status_code, "request status")
    metin = istek_soup.find_all("div", {"class": "news-content"})
    for j in metin:
        print(j.text)
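Finally, this is roughly how I'm thinking of combining the two scripts into one depth-limited crawl. The class names ("title", "story", "news-content") are copied from my code above; everything else (the crawl function, the folder names, how I number the files) is just a placeholder I made up, and I haven't tested it properly, so please correct me if this is the wrong approach:

import os
import requests
from bs4 import BeautifulSoup

headers = requests.utils.default_headers()
headers.update({
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0',
})

def crawl(url, depth, store_raw, folder):
    # Stop when the requested crawling depth is used up
    if depth < 0:
        return
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.content, "html.parser")

    # File name is just a running number: .html for raw HTML, .txt for news text
    number = len(os.listdir(folder)) + 1
    if store_raw:
        with open(os.path.join(folder, str(number) + ".html"), "w", encoding="utf-8") as f:
            f.write(str(soup))
    else:
        with open(os.path.join(folder, str(number) + ".txt"), "w", encoding="utf-8") as f:
            for topic in soup.find_all("h3", {"class": "title"}):
                f.write(topic.text.strip() + "\n")
            for block in soup.find_all("div", {"class": "news-content"}):
                f.write(block.text.strip() + "\n")

    # Follow the story links one level deeper
    for a in soup.find_all("a", {"class": "story"}):
        link = a.get("href")
        if link and link.startswith("http"):
            crawl(link, depth - 1, store_raw, folder)

# Inputs from the assignment: website, crawling depth, storing option, root folder
root_folder = "output"
site_folder = os.path.join(root_folder, "dailystar")
os.makedirs(site_folder, exist_ok=True)
crawl("https://www.dailystar.co.uk/", depth=1, store_raw=False, folder=site_folder)

I know it doesn't avoid downloading the same link twice yet, and it only handles one website; I would loop over the website list and create one folder per site, like in the skeleton near the top of this post.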