Thursday, 27 October 2016

Dictionary task: choose the needed words and order them

stackoverfollowers!

There is a task that I am trying to solve. Here it is:

'' Write a function samewords(u1, u2, enc, k) that:

  1. Takes 2 URLs and enc = 'utf8':
    u1 = 'http://ift.tt/2fiYYz5'
    u2 = 'http://ift.tt/2ePQugb'
  2. Finds the words of length k that occur on both web pages u1 and u2
  3. Counts how many times those words occur on each page
  4. Returns a list of groups of 3 values: word (found in point 2), occur1 (how many times the word occurs on page u1), occur2 (how many times the word occurs on page u2)
  5. The returned list should be sorted in decreasing order of the total number of occurrences on both pages ''

So the returned list should look like this if k=10 (the length of the words being searched for):

[(u'fondamenti', 4, 4), (u'istruzioni', 4, 3), (u'operazioni', 2, 3), (u'stylesheet', 2, 2), (u'permettono', 2, 1), (u'googlecode', 1, 1), (u'inlinemath', 1, 1), (u'javascript', 1, 1), (u'parentnode', 1, 1), (u'tantissime', 1, 1)]

I am using this code to replace the listed non-alphabetic characters with spaces:

def mywords(s):
    # Replace each listed punctuation character with a space
    for c in '''!?/-,():;'.\\_[]"{}''':
        s = s.replace(c, ' ')
    return s.split()            # list of all whitespace-separated words on the page

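import urllib.request as ul

def myurl(u, enc):
    # Open the URL, read the raw bytes and decode them to str using enc
    p = ul.urlopen(u)
    t = p.read().decode(enc)
    p.close()

    # Lowercase the text so that counting is case-insensitive
    return mywords(t.lower())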

After that I ran into difficulties with points 3-5 and got stuck. This is mainly because, when something doesn't work, I usually step through the code online with pythontutor.com, but in this case I can't do that because it doesn't support the urllib library.
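For points 3-5, one possible approach is sketched below; it is only an illustration, not a definitive solution. The helper name common_words is hypothetical. It works on plain word lists (such as the ones mywords returns), so it can be tried without urllib or any network access. Breaking ties alphabetically is an assumption inferred from the sample output above; the task itself only asks for decreasing total occurrences.

```python
from collections import Counter

def common_words(words1, words2, k):
    """Hypothetical helper: points 2-5 applied to two word lists."""
    # Point 3: count only the words of length k on each page
    c1 = Counter(w for w in words1 if len(w) == k)
    c2 = Counter(w for w in words2 if len(w) == k)
    # Point 2: words that occur on both pages
    shared = c1.keys() & c2.keys()
    # Point 4: groups of (word, occur1, occur2)
    result = [(w, c1[w], c2[w]) for w in shared]
    # Point 5: decreasing total count; alphabetical tie-break is an assumption
    result.sort(key=lambda t: (-(t[1] + t[2]), t[0]))
    return result

# Small offline check with made-up word lists instead of real pages
page1 = "fondamenti di istruzioni fondamenti javascript".split()
page2 = "istruzioni javascript fondamenti istruzioni".split()
print(common_words(page1, page2, 10))
# [('fondamenti', 2, 1), ('istruzioni', 1, 2), ('javascript', 1, 1)]
```

With this in place, samewords(u1, u2, enc, k) could simply return common_words(myurl(u1, enc), myurl(u2, enc), k).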

Thank you!!!



