Tuesday, 18 June 2019

C#: How to map absolute HTML hyperlinks to local paths?

A former colleague downloaded a large part of our old corporate FAQ and saved the pages as HTML files.

I need to find a way to go through the files and replace every absolute hyperlink with the location where the target file has been saved, relative to the root folder.

For example, if the files are saved under c:\faq, I need a way to change all links that start with https://corporatewebsitefaq.com so that they point under c:\faq instead.

As another example, a link may point to the home page (e.g. https://corporatewebsitefaq.com/index.html) but appear in a page stored in a subfolder, say c:\faq\subfolder\page.html. I would need this link to be updated to c:\faq\index.html.
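To make the mapping in the two examples above concrete, here is a minimal sketch using only System.Uri and System.IO.Path. The class name FaqLinkMapper, the method ToLocalPath, and its parameters are hypothetical names chosen for illustration. It also assumes that a URL ending in "/" should resolve to index.html, which may not match how the site was actually saved.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of any library.
public static class FaqLinkMapper
{
    // Maps an absolute URL on siteHost to a file path under localRoot.
    // Returns null for relative links and for links to other hosts,
    // so the caller can leave those untouched.
    public static string ToLocalPath(string href, string siteHost, string localRoot)
    {
        if (!Uri.TryCreate(href, UriKind.Absolute, out Uri uri))
            return null;
        if (!string.Equals(uri.Host, siteHost, StringComparison.OrdinalIgnoreCase))
            return null;

        // AbsolutePath drops the query string and fragment; unescape %20 etc.
        string relative = Uri.UnescapeDataString(uri.AbsolutePath).TrimStart('/');

        // Assumption: a bare folder URL was saved as index.html.
        if (relative.Length == 0 || relative.EndsWith("/"))
            relative += "index.html";

        relative = relative.Replace('/', Path.DirectorySeparatorChar);
        return Path.Combine(localRoot, relative);
    }
}
```

On Windows, ToLocalPath("https://corporatewebsitefaq.com/index.html", "corporatewebsitefaq.com", @"c:\faq") should give c:\faq\index.html, regardless of which subfolder the page containing the link lives in.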

Also, the files have been moved between a few drives, so the original folder structure is no longer valid.

Using the Html Agility Pack I can retrieve all the links in all the pages; it is the actual mapping between the files across all the subfolders that is causing me trouble.

I played around with the Uri class but could not get it to work.
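For reference, one thing the Uri class can do here is compute a link relative to the page that contains it, via Uri.MakeRelativeUri. A relative link keeps working if the whole folder is moved to another drive. The sketch below is only an illustration: RelativeLinks and RelativeHref are hypothetical names, and it assumes both files sit under the same root folder.

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class RelativeLinks
{
    // fromPage: path or file URI of the page that contains the link.
    // toFile:   path or file URI of the file the link should point to.
    // Returns an href relative to fromPage, using forward slashes.
    public static string RelativeHref(string fromPage, string toFile)
    {
        var from = new Uri(fromPage);
        var to = new Uri(toFile);
        return from.MakeRelativeUri(to).ToString();
    }
}
```

For the example above, RelativeHref("file:///c:/faq/subfolder/page.html", "file:///c:/faq/index.html") returns "../index.html". On Windows the plain paths c:\faq\subfolder\page.html and c:\faq\index.html should be accepted as well.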

Thanks for any help.
Mark



