If you have difficulty using HTML Parser code, we offer setup services for the Parser Code in your pipes. Depending on your needs, pick one of the packages below and submit an obGrabber support ticket with the details of your requirements (for example, what you want to use the Parser Code for), and we will work out the code for you:

Attention: This service only covers finding the correct way to use Parser Code for your purpose; it does not cover setting up whole pipes. If you want us to set up your pipes, please create a ticket in the Custom Work department!

  • Setup HTML Parser x 02 (2 pipes): $25
  • Setup HTML Parser x 06 (6 pipes): $55
  • Setup HTML Parser x 15 (15 pipes): $135

This processor is an alternative to the “Get Fulltext” processor for cases where “Get Fulltext” doesn’t work for your source.

To be honest, it is a little complicated because it uses our own programming language, but we will try to make it clear and easy.

For an overview of how to work with Get Fulltext Parser Code, you can watch the video tutorial “Parsing RSS to increase accuracy for your auto content”.

We provide an easy way to test the HTML Parser processor, with the same interface, at http://demo.foobla.com/html_parser. You can test your code there first, then use it in your obGrabber pipe.
You can also refer to the HTML Parser samples for more ideas.

There are several commands (ginner, remove, split, wrap and replace) for transforming the HTML source. Each command needs to be placed on a new line.

Function: ginner

Get the inner content of an HTML tag from the input HTML source.

Syntax

ginner|{LINE}|{TAG}|{DELIMITER}|{RETURN}|{DEBUG}|
  • {LINE}: the line whose output is used as input. You can write many lines; each line has its own output, and the output of one line can be used as the input of another. “0” means the original input of the processor, “1” means the output of line#1.
  • {TAG}: the target HTML tag.
  • {DELIMITER}: a string that appears inside the target tag.
  • {RETURN}: the number of the part to return, starting from 0; L stands for the last part.
  • {DEBUG}: what to do when the {DELIMITER} cannot be found in the INPUT HTML source.
    • 0: return "" (empty string) if an error occurs.
    • 1: stop immediately if an error occurs.
    • 2: return the INPUT HTML source.

Example:

ginner|0|div|post|L|1|

Get the inner content from the input HTML source for the “div” tag that has the string “post” inside it, no matter whether that string is an id, a class or any other attribute. For example:

<html>...<body>...<div class="post" id="whatever" what_ever="attribute">I want to get this text</div>...</body></html>

Will return “I want to get this text” with the sample above.
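
To make the behaviour described above more concrete, here is a minimal Python sketch that imitates it. It is not the obGrabber implementation: the function name, the regex-based tag matching and the reading of {RETURN} as "index among the matching tags, in document order" are all assumptions made for illustration, and attribute values containing ">" are not handled.

```python
import re


def ginner(source, tag, delimiter, ret="0", debug=0):
    """Hypothetical emulation of ginner|{LINE}|{TAG}|{DELIMITER}|{RETURN}|{DEBUG}|."""
    # Match opening and closing tags of the requested name, e.g. <div ...> and </div>.
    token = re.compile(r"<(/?)%s(?=[\s/>])([^>]*)>" % re.escape(tag), re.IGNORECASE)
    stack = []  # one entry per open tag: (offset where inner content starts, matched?)
    found = []  # (start offset, inner content) of each tag containing the delimiter
    for m in token.finditer(source):
        closing, attrs = m.group(1), m.group(2)
        if closing:
            if stack:
                start, matched = stack.pop()
                if matched:
                    found.append((start, source[start:m.start()]))
        elif not attrs.rstrip().endswith("/"):  # ignore self-closing tags
            stack.append((m.end(), delimiter in attrs))
    if not found:
        if debug == 1:
            # Stands in for "stop immediately" in this sketch.
            raise LookupError("no <%s> tag containing %r" % (tag, delimiter))
        return source if debug == 2 else ""
    found.sort()  # document order (assumed meaning of part numbers)
    parts = [inner for _, inner in found]
    return parts[-1] if ret == "L" else parts[int(ret)]


html = ('<html><body><div class="post" id="whatever">'
        'I want to get this text</div></body></html>')
print(ginner(html, "div", "post", "L", 1))
```

The stack keeps track of nested tags of the same name, so an inner <div> does not end the outer match early.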

Function: remove

Remove an HTML tag from the input HTML source.

Syntax

remove|{LINE}|{TAG}|{DELIMITER}|

Example

remove|0|div|post|

Remove the div tag that has the string “post” inside. For example:

<html>...<body>...ABC<div class="post" id="whatever" what_ever="attribute">I want to get this text</div>XYZ...</body></html>

Will return

<html>...<body>...ABCXYZ...</body></html>

This is the input HTML source without the div tag that has the string “post” inside.
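
The example above can be imitated with a short Python sketch. This is only an illustration under stated assumptions, not the actual processor: the function name is hypothetical, the tag matching is regex-based, and it assumes that every matching tag is removed (the documentation does not say whether only the first one is).

```python
import re


def remove(source, tag, delimiter):
    """Hypothetical emulation of remove|{LINE}|{TAG}|{DELIMITER}|."""
    token = re.compile(r"<(/?)%s(?=[\s/>])([^>]*)>" % re.escape(tag), re.IGNORECASE)
    stack = []  # one entry per open tag: (offset of the opening tag, matched?)
    spans = []  # (start, end) offsets of every element to delete
    for m in token.finditer(source):
        if m.group(1):  # closing tag
            if stack:
                start, matched = stack.pop()
                if matched:
                    spans.append((start, m.end()))
        elif not m.group(2).rstrip().endswith("/"):  # ignore self-closing tags
            stack.append((m.start(), delimiter in m.group(2)))
    kept, cursor = [], 0
    for start, end in sorted(spans):
        if start >= cursor:  # skip spans nested inside an already removed one
            kept.append(source[cursor:start])
            cursor = end
    kept.append(source[cursor:])
    return "".join(kept)


html = ('<html><body>ABC<div class="post" id="whatever">'
        'I want to get this text</div>XYZ</body></html>')
print(remove(html, "div", "post"))
```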

Function: split

Split/separate the HTML source into many parts based on a delimiter. This function is similar to the explode function in PHP (if you know the PHP programming language).

Syntax

split|{LINE}|{DELIMITER}|{RETURN}|{DEBUG}|
  • {LINE}: the line whose output is used as input. You can write many lines; each line has its own output, and the output of one line can be used as the input of another. “0” means the original input of the processor, “1” means the output of line#1.
  • {DELIMITER}: the HTML or text used to delimit the INPUT HTML source.
  • {RETURN}: the number of the part to return, starting from 0; L stands for the last part.
  • {DEBUG}: what to do when the {DELIMITER} cannot be found in the INPUT HTML source.
    • 0: return "" (empty string) if an error occurs.
    • 1: stop immediately if an error occurs.
    • 2: return the INPUT HTML source.

Example

Example 1
split|0|<div class="post">|L|1|

Split the INPUT HTML source into parts by the delimiter <div class="post"> and get the last part. If the delimiter is not found, the processor stops immediately and starts over with the next item.

Example 2
split|2|<p class="paragraph">|1|2|

Split the output from line#2 by the delimiter <p class="paragraph"> and get part number 1 (the second part, because numbering starts at 0). If the delimiter is not found, it returns this line's own input unchanged.
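
The two examples above can be imitated with a few lines of Python. This is a sketch under assumptions rather than the processor's real code: the function name is hypothetical, `LookupError` stands in for "stop immediately", and what happens when {RETURN} is larger than the number of parts is not documented, so the sketch simply lets Python raise an `IndexError` there.

```python
def split_part(source, delimiter, ret, debug=0):
    """Hypothetical emulation of split|{LINE}|{DELIMITER}|{RETURN}|{DEBUG}|."""
    if delimiter not in source:
        if debug == 1:
            # Stands in for "stop immediately" in this sketch.
            raise LookupError("delimiter not found: %r" % delimiter)
        return source if debug == 2 else ""
    parts = source.split(delimiter)  # same idea as PHP's explode()
    # Parts are numbered from 0; "L" selects the last one.
    return parts[-1] if ret == "L" else parts[int(ret)]


html = 'header<div class="post">first<div class="post">second'
print(split_part(html, '<div class="post">', "L", 1))  # last part
print(split_part(html, '<div class="post">', "1", 2))  # part number 1
```

Note that part 0 is whatever comes before the first delimiter, so part 1 is the text between the first and second occurrence.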

Function: wrap

Wrap/combine one or many parts (returned by other lines) in a new HTML format.

Syntax

wrap|{INPUT_LINE1,INPUT_LINE2,...}|{WRAP_HTML}|
  • {INPUT_LINE1,INPUT_LINE2,...}: the lines whose outputs will be wrapped.
  • {WRAP_HTML}: the new HTML, which can contain these variables:
    • {ogb-0} stands for the first line parameter (INPUT_LINE1); it is replaced by the output value of INPUT_LINE1.
    • {ogb-1} stands for the second line parameter (INPUT_LINE2); it is replaced by the output value of INPUT_LINE2.

Example

wrap|3,5|<div class="content">{ogb-0}<hr />{ogb-1}|

Combine line#3 and line#5 into the newly formatted HTML source: the output of the first line parameter (line#3) replaces {ogb-0}, and the output of the second line parameter (line#5) replaces {ogb-1}.
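
The substitution described above can be sketched in Python as follows. The function name and the `outputs` dictionary (line number mapped to that line's output) are hypothetical names used only for this illustration, and leaving unknown placeholders untouched is an assumption.

```python
import re


def wrap(outputs, input_lines, wrap_html):
    """Hypothetical emulation of wrap|{INPUT_LINE1,INPUT_LINE2,...}|{WRAP_HTML}|."""
    def substitute(match):
        position = int(match.group(1))  # the N in {ogb-N}
        if position < len(input_lines):
            return outputs[input_lines[position]]
        return match.group(0)  # assumption: unknown placeholders stay as they are

    # One pass over the template, so inserted content is never re-scanned.
    return re.sub(r"\{ogb-(\d+)\}", substitute, wrap_html)


# Pretend line#3 and line#5 produced these outputs.
outputs = {3: "<p>Body</p>", 5: "<p>Source</p>"}
print(wrap(outputs, [3, 5], '<div class="content">{ogb-0}<hr />{ogb-1}'))
```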

Function: replace

Replace a string in the input source with a new one.

Syntax

replace|{INPUT_LINE}|{SEARCH}|{REPLACE}|
  • {INPUT_LINE}: the line whose output is used as input.
  • {SEARCH}: the string to search for.
  • {REPLACE}: the string to replace it with.

Example

replace|5|<div class="abc"|<div class="xyz" |

Find <div class="abc" in the output of line#5 and replace it with <div class="xyz" (followed by a space).
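
To show how a replace line picks up the output of an earlier line, here is a small Python sketch that runs a split line followed by a replace line. It is only an illustration: the `run_lines` helper is hypothetical, it assumes that fields never contain a literal "|", that replace works on plain text and changes every occurrence, and it ignores {DEBUG}.

```python
def run_lines(code, source):
    """Run split/replace lines in order and return every line's output."""
    outputs = {0: source}  # line 0 is the original input of the processor
    for number, line in enumerate(code.strip().splitlines(), start=1):
        # Assumption: fields never contain a literal "|" character.
        fields = line.strip().split("|")
        command, text = fields[0], outputs[int(fields[1])]
        if command == "split":
            parts = text.split(fields[2])
            outputs[number] = parts[-1] if fields[3] == "L" else parts[int(fields[3])]
        elif command == "replace":
            # Plain-text substitution of every occurrence (assumed behaviour).
            outputs[number] = text.replace(fields[2], fields[3])
        else:
            raise ValueError("command not covered by this sketch: " + command)
    return outputs


code = '''
split|0|<div id="main">|L|1|
replace|1|<div class="abc"|<div class="xyz" |
'''
page = '<div id="nav">menu</div><div id="main"><div class="abc">Hello</div></div>'
print(run_lines(code, page)[2])
```

Here line#2 reads the output of line#1 because its {INPUT_LINE} field is 1, in the same way the example above reads the output of line#5.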
