What follows is part of a webpage: the query results from a bookstore or a library.
It contains some book information, i.e. title, author, and publisher.
I would like to extract the parts marked in red.
<h1>數位信號處理實務:通訊應用型DSP 不收退</h1><img src="http://static.findbook.tw/image/book/97 ... 9409/large" class="cover" alt="數位信號處理實務:通訊應用型DSP 不收退" /><p>作者:<a href="/search?keyword_type=author&q=%E5%90%B3%E8%B3%A2%E8%B2%A1">吳賢財</a>, 出版社:滄海書局, 出版日期:2001-12-01</p><p>商品條碼:9789572079409, ISBN:9572079409<br/>分類標籤:<a href="/explore/tag/10032">中文書</a>
Is there an elegant way to parse it (like we do when parsing JSON, etc.)?
How do I parse this HTML using JavaScript?
Re: How do I parse this HTML using JavaScript?
If you would like to read the HTML as if it were in the DOM, you can create an element and put the HTML inside:
Code:
var div = document.createElement('div');
div.innerHTML = yourHtml;
Then you can access the child nodes using functions like `div.getElementsByTagName()` or `childNodes`.
The last part you have in red would need a regular expression to grab text between the colon and comma.
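Putting the two suggestions together, here is a minimal sketch of how the extraction might look. The function names `publisherFromText` and `extractBookInfo` are made up for illustration, and the code assumes the markup always has the shape shown in the question: the title in the first `<h1>`, the author in the first link of the first `<p>`, and the publisher after the label `出版社:` in that same paragraph.

```javascript
// Hypothetical helper: grab the text between "出版社:" (publisher) and the next comma.
// Accepting both ASCII and full-width colons/commas is an assumption about the data.
function publisherFromText(text) {
  var match = /出版社\s*[::]\s*([^,,]+)/.exec(text);
  return match ? match[1].trim() : null;
}

// Hypothetical helper, browser-only: it needs a global `document`.
function extractBookInfo(html) {
  // Let the browser parse the fragment inside a detached element.
  var div = document.createElement('div');
  div.innerHTML = html;

  var h1 = div.getElementsByTagName('h1')[0];      // title
  var p = div.getElementsByTagName('p')[0];        // "作者:..., 出版社:..., ..." line
  var a = p ? p.getElementsByTagName('a')[0] : null; // author link

  return {
    title: h1 ? h1.textContent : null,
    author: a ? a.textContent : null,
    publisher: p ? publisherFromText(p.textContent) : null
  };
}

// The regular-expression part works on plain text, with or without a DOM:
var line = '作者:吳賢財, 出版社:滄海書局, 出版日期:2001-12-01';
console.log(publisherFromText(line)); // prints "滄海書局"
```

In a browser you would call `extractBookInfo(yourHtml)` and read `title`, `author`, and `publisher` from the returned object. Keep in mind that assigning to `innerHTML` makes the browser start loading any images in the fragment, so only do this with HTML you trust.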