I built a C# script for Unity that downloads the source of a specific web page and saves it to a text file. What I really want is to extract only the HTML table data from that page. For example, I want to remove every line from <!DOCTYPE html PUBLIC up to <table class="formular">, so that only the table and its data remain:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" >
<head>
<link rel="shortcut icon" href="https://mywebsite.com/favicon.ico" />
<title>My Website Com</title>
<meta name="description" content="Ministerul pentru intreprinderi mici si mijlocii, comert, turism si profesii liberale"/>
<meta name="keywords" content="My Web Site"/>
<meta name="Language" content="en"/>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
<meta name="rating" content="General" />
<meta name="revisit-after" content="7 Days" />
<meta name="robots" content="index,follow" />
<link rel="shortcut icon" href="/favicon.ico" />
<meta name="publisher" content="Unity Design" />
<meta name="copyright" content="Copyright (c) Unity Design" />
<meta name="author" content="Developed by Unity Design - www.UnityDesign.com" />
<link href="/css/style.css?t=2017061401" rel="stylesheet" type="text/css" />
<link href="/css/uploader.css?t=2017061401" rel="stylesheet" type="text/css" />
<script type="text/javascript" src="/js/jquery-1.8.0.min.js?t=2017061401"></script>
<script>
var jQr = jQuery.noConflict();
</script>
<script type="text/javascript" src="/js/mootools-1.2.5-core-yc.js?t=2017061401"></script>
<script type="text/javascript" src="/js/mootools-1.2.5.1-more.js?t=2017061401"></script>
<script type="text/javascript" src="/js/uploader/Swiff.Uploader.js?t=2017061401"></script>
<script type="text/javascript" src="/js/uploader/Fx.ProgressBar.js?t=2017061401"></script>
<script type="text/javascript" src="/js/uploader/Lang.js?t=2017061401"></script>
<script type="text/javascript" src="/js/uploader/FancyUpload2.js?t=2017061401"></script>
<script type="text/javascript" src="/js/js.js?t=2017061401"></script>
<script src='https://www.google.com/recaptcha/api.js?hl=en'></script>
</head>
<body onload="$('ajaxloader').setStyle('display','none')"><div id="container">
<div class="logo_container">
<a href="/" id="logo" title="MWC - Home Page"><img src="/i/logo.png?40084" /></a>
<div style="position:absolute; right:0; top:107px;" id="ajaxloader"><img src="/i/ajax-loader.gif" /></div>
</div>
<div class="menu_top">
<a href="https://mywebsite.com/" title="Home Page"><h2>Home Page</h2></a>
<a href="https://mywebsite.com/contact/" title="Contact"><h2>Contact</h2></a>
<div class="clear"></div>
</div>
<div style="clear:both;"></div>
<div style="padding:5px 0;"></div>
<div id="content" ><h1>List of items: Example</h1><br><br>
<div class="tableExample" style="padding-left:0;">
<table class="formular">
<tr>
<th>Position</th>
<th>Name of item</th>
<th>Date added</th>
</tr>
<tr>
<td>1</td>
<td>John</td>
<td>2017-07-14 19:19</td>
</tr>
<tr>
<td>2</td>
<td>Jane</td>
<td>2017-07-14 19:30</td>
</tr>
<tr>
<td>3</td>
<td>Kelly</td>
<td>2017-07-14 18:44</td>
</tr>
<tr>
<td>4</td>
<td>Michael</td>
<td>2017-07-12 12:49</td>
</tr>
<tr>
<td>5</td>
<td>William</td>
<td>2017-07-13 00:26</td>
</tr>
</table>
</div>
Any ideas how I can achieve this? Thanks in advance!
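To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive implementation. It assumes the downloaded page is already in a string, that the table is opened with exactly <table class="formular" as in the sample above, and that it contains no nested tables. The names TableExtractor, ExtractTable and ParseRows are hypothetical helpers made up for this illustration. It uses only IndexOf/Substring and System.Text.RegularExpressions, so it should work in a Unity script without extra packages.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative only.
public static class TableExtractor
{
    // Returns the markup from <table class="..."> through the first </table>
    // that follows it, or null when no such table is found.
    // Assumption: the table is not nested inside or around another table.
    public static string ExtractTable(string html, string cssClass = "formular")
    {
        string openTag = "<table class=\"" + cssClass + "\"";
        int start = html.IndexOf(openTag, StringComparison.OrdinalIgnoreCase);
        if (start < 0) return null;

        const string closeTag = "</table>";
        int end = html.IndexOf(closeTag, start, StringComparison.OrdinalIgnoreCase);
        if (end < 0) return null;

        return html.Substring(start, end - start + closeTag.Length);
    }

    // Splits the table markup into rows, each row being the text of its
    // <th> or <td> cells with any inner tags stripped.
    public static List<string[]> ParseRows(string tableHtml)
    {
        const RegexOptions options = RegexOptions.Singleline | RegexOptions.IgnoreCase;
        var rows = new List<string[]>();

        foreach (Match row in Regex.Matches(tableHtml, @"<tr\b[^>]*>(.*?)</tr>", options))
        {
            var cells = new List<string>();
            foreach (Match cell in Regex.Matches(row.Groups[1].Value, @"<t[hd]\b[^>]*>(.*?)</t[hd]>", options))
            {
                // Remove tags nested inside the cell (links, spans, ...) and trim whitespace.
                string text = Regex.Replace(cell.Groups[1].Value, "<[^>]+>", "");
                cells.Add(text.Trim());
            }
            rows.Add(cells.ToArray());
        }
        return rows;
    }
}
```

With the page above in a string called pageSource (a placeholder name), TableExtractor.ExtractTable(pageSource) would return just the <table class="formular"> ... </table> block, which could then be written to the text file in place of the full source. Passing that block to ParseRows gives one string array per row, so string.Join(" | ", rows[0]) would be "Position | Name of item | Date added". Regex-based parsing like this is brittle on irregular markup; a dedicated HTML parser such as the third-party Html Agility Pack would likely be more robust if the pages vary.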