Sunday, 26 February 2017

Grepping a word buried on a website

I am having trouble grepping a word on a website. This is the command I'm using:

wget -q http://ift.tt/2louVpo | grep 'medical'

It returns nothing, when it should be returning:

[name of the website]:Many recent developments in biological and medical ...

The overall goal of what I'm trying to do is to find a certain word within all the links of the website.

My script is written like this:

#!/bin/bash

#$1 is the parent website
#This pipeline obtains all the links located on a website
wget -qO- $1 | grep -Eoi '<a [^>]+>' |  grep -Eo 'href="[^\"]+"' | cut -c 7- | rev | cut -c 2- | rev > .linksLocated

#$2 is the word being looked for
#This loop goes through every link and tries to locate the word
while IFS='' read -r line || [[ -n "$line" ]]; do
        wget -q $line | grep "$2"
done < .linksLocated

#rm .linksLocated
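
A likely cause of the empty output: without -O, wget saves the page to a file, and -q silences its messages, so nothing reaches the pipe and grep has nothing to read. Adding -O- (as the first pipeline in the script already does) sends the page body to stdout. Below is a minimal sketch of the script reworked along those lines. The function names fetch_page, extract_links and search_links are hypothetical helpers introduced here, and the "URL:" prefix on each match is an assumption based on the expected output shown above. It also assumes the links are absolute URLs; relative hrefs would need to be resolved against the parent site first.

```shell
#!/bin/bash
# Sketch only: fetch_page, extract_links and search_links are hypothetical names.

# Write the body of the page at URL $1 to stdout (-O- is what makes piping work).
fetch_page() {
    wget -qO- "$1"
}

# Read HTML on stdin and print the value of every href="..." inside an <a> tag.
extract_links() {
    grep -Eoi '<a [^>]+>' | grep -Eo 'href="[^"]+"' | sed -E 's/^href="//; s/"$//'
}

# Read URLs on stdin, fetch each one, and print the lines containing word $1,
# each prefixed with the URL it came from.
search_links() {
    local word=$1 url hit
    while IFS= read -r url || [[ -n "$url" ]]; do
        fetch_page "$url" | grep -- "$word" | while IFS= read -r hit; do
            printf '%s:%s\n' "$url" "$hit"
        done
    done
}

# $1 is the parent website, $2 is the word being looked for.
# Nothing is fetched unless both arguments are given.
if [[ $# -eq 2 ]]; then
    fetch_page "$1" | extract_links | search_links "$2"
fi
```

Piping the link list straight into the loop avoids the temporary .linksLocated file, and quoting "$url" keeps links with special characters intact. The single-page command from the top of the question would become wget -qO- http://ift.tt/2louVpo | grep 'medical'.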



