Hey guys,
I am trying to parse a Russian HTML file from a Russian webpage, which I fetch with curl. I need to pull the values out of a few fixed fields,
like:
Product Id: 400212
Product Type: engine
Here both labels, Product Id and Product Type, are in Russian on the actual page. I need to extract their values.
If the content were in English I could have done it without a problem, but I am not sure how to handle foreign languages. Has anyone done this before?
Thanks!
Parsing HTML page with Russian content using regex
Moderator: General Moderators
- littlebiker
- Forum Newbie
- Posts: 1
- Joined: Tue Feb 07, 2006 7:17 am
- feyd
- Neighborhood Spidermoddy
- Posts: 31559
- Joined: Mon Mar 29, 2004 3:24 pm
- Location: Bothell, Washington, USA
Have you tried running it through Google translation or Babelfish? It should be possible to script through them, or at least to use them to get your bearings as to where in the text the information is actually stored.
If we could see several examples of the text (not just the specific fields, but many lines around them), we may be able to write a regex for you, or give you more direction.
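On the foreign-language worry itself: a regex does not care which language a label is written in, as long as the downloaded bytes are decoded with the right charset before matching. Here is a minimal sketch in Python (the thread does not say which language is being used with curl, so that choice is an assumption). The Russian labels `Код товара` and `Тип товара`, the table markup, the `windows-1251` charset, and the helper names `decode_page` and `extract_field` are all hypothetical, since the real page was not posted.

```python
import re

# Hypothetical sample: the real page's markup and labels were not posted.
SAMPLE_HTML = """
<table>
  <tr><td>Код товара:</td><td>400212</td></tr>
  <tr><td>Тип товара:</td><td>двигатель</td></tr>
</table>
"""


def decode_page(raw: bytes, charset: str = "windows-1251") -> str:
    """Decode the downloaded bytes into text before any matching.

    The default charset is an assumption; use whatever the page declares.
    """
    return raw.decode(charset)


def extract_field(html: str, label: str):
    """Return the text of the table cell that follows `label`, or None."""
    pattern = re.compile(
        re.escape(label) + r"\s*:?\s*</td>\s*<td[^>]*>\s*(.*?)\s*</td>",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(html)
    return match.group(1) if match else None


# Simulate the bytes that curl would hand back for a windows-1251 page.
raw = SAMPLE_HTML.encode("windows-1251")
page = decode_page(raw)

product_id = extract_field(page, "Код товара")
product_type = extract_field(page, "Тип товара")
print(product_id, product_type)
```

If the page turns out to be UTF-8 or KOI8-R instead, only the `charset` argument should need to change; the Content-Type header or the page's meta tag usually says which one applies. The pattern itself would have to be adjusted to the real markup once a few lines around the fields are posted.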