Friday, October 20, 2017

How to find out whether English words exist in a string

I am trying to parse some web domains (tens of thousands) to see if they contain any English words.

Extracting the main part of each domain is easy with tldextract. I then tried using enchant to check whether that part exists in the English dictionary.

The problem is that I do not know how to split a domain into multiple words to check. For example, latimes returns False, but times would return True.

Does anyone know a clever way to check whether a string contains any English word at all?
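To make the question concrete, here is a minimal sketch of the brute-force approach: test every substring of the domain label against a word check. The names find_words, contains_word, SAMPLE_WORDS and the min_len parameter are hypothetical and only for illustration. The tiny word set stands in for a real dictionary; a lookup such as enchant's Dict("en_US").check could presumably be passed as is_word instead.

```python
from typing import Callable, List

# Hypothetical stand-in for a real English dictionary (illustration only).
SAMPLE_WORDS = {"times", "time", "news", "book", "face"}


def find_words(label: str, is_word: Callable[[str], bool], min_len: int = 3) -> List[str]:
    """Return each distinct substring of `label` (at least `min_len` long) accepted by `is_word`."""
    label = label.lower()
    found: List[str] = []
    for start in range(len(label)):
        # Only consider substrings of at least min_len characters.
        for end in range(start + min_len, len(label) + 1):
            candidate = label[start:end]
            if candidate not in found and is_word(candidate):
                found.append(candidate)
    return found


def contains_word(label: str, is_word: Callable[[str], bool], min_len: int = 3) -> bool:
    """True if any substring of `label` passes the word check."""
    return bool(find_words(label, is_word, min_len))


if __name__ == "__main__":
    print(find_words("latimes", SAMPLE_WORDS.__contains__))
    print(contains_word("xqzv", SAMPLE_WORDS.__contains__))
```

The min_len cutoff is a judgment call: with a real dictionary, very short substrings such as "a" or "at" would match almost every domain. The number of substrings grows quadratically with the label length, which should be acceptable for short domain labels even across tens of thousands of domains.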

Thanks!



