Thursday, April 29, 2021

Bash: curl | grep for specific website content

I'm trying to extract a specific piece of information from a website, but the value also appears as an attribute inside the opening tag:

<div class= "some_div_class">
  <strong content="999" itemprop="price" class="strong_class">
      999
  </strong>
</div>

I'm targeting the "999", which I can extract with:

curl -s url |grep -zPo '<strong content="999" itemprop="price" class="strong_class">\s*\K.*?(?=\s*</strong>)'

Because the "999" is also hard-coded in the content attribute of the pattern, the grep stops matching as soon as the value changes. My attempts at using wildcards in its place didn't return anything.
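
One possible way around this, sketched here under the assumption that GNU grep with -P (PCRE) support is available, as the command above already requires: replace the literal 999 with a character class that matches any attribute value. The function names extract_price and extract_price_text, and the sample variable, are hypothetical and only serve this illustration; the markup is the snippet quoted above.

```shell
#!/usr/bin/env bash

# Read HTML on stdin and print the value of the content attribute of the
# <strong ... itemprop="price"> tag, whatever that value happens to be.
extract_price() {
  # [^"]+ stands in for the hard-coded 999. \K drops the prefix from the
  # reported match; the lookahead checks that itemprop="price" follows
  # inside the same tag.
  grep -oP '<strong content="\K[^"]+(?="[^>]*itemprop="price")'
}

# Variant of the original command: print the element text instead.
# -z makes grep treat the whole input as one record so \s can span
# newlines; tr removes the NUL byte that grep -z appends to each match.
extract_price_text() {
  grep -zoP '<strong content="[^"]*" itemprop="price" class="strong_class">\s*\K.*?(?=\s*</strong>)' \
    | tr -d '\0'
}

# Sample markup copied from the question.
sample='<div class= "some_div_class">
  <strong content="999" itemprop="price" class="strong_class">
      999
  </strong>
</div>'

price=$(printf '%s\n' "$sample" | extract_price)
price_text=$(printf '%s\n' "$sample" | extract_price_text)
echo "$price"
echo "$price_text"
```

Against the live page this would be used as curl -s url | extract_price, with url being the same placeholder as in the command above. Regex matching on HTML is fragile if the attribute order or spacing changes, so this is only a sketch for markup shaped like the sample.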



