cleanSelect ( string cssSelector , string _code , string _selection , string _postProcessing ) : string

Selects something in code using a CSS selector and returns it cleaned. By default the search covers the whole source code of the page, but you can pass a portion of code instead. The selection corresponds to what you want the function to return.

In this function the actions are executed in this order:
1) _code is parsed
2) cssSelector is searched in the parsed DOM
3) Once the DOM element is located, the property defined in _selection is read from it
4) The _postProcessing is applied to clean this data

Warning: it is tempting to use the functions starting with clean everywhere without thinking about it. This is a bad idea, because cleaning often removes the HTML tags. If you call cleanSelect and then run another one on its result, the second call fails because it has no HTML left to parse. Clean the data only in the last step.


htmlColor='<div class="color">Blue</div>' 
console(cleanSelect(".color", htmlColor)) // -> Blue
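
The warning above can be illustrated with a short sketch. The HTML fragment and the variable names htmlProduct and block are made up for the example. It uses only the parameters documented on this page, and it assumes that the "none" post-processing leaves the HTML tags untouched, as its description suggests.

```
htmlProduct='<div class="product"><a class="name" href="/blue-shirt">Blue <b>shirt</b></a></div>'

// Risky: the default post-processing strips the tags,
// so the second call has no HTML left to parse
block=cleanSelect(".product", htmlProduct)
console(cleanSelect(".name", block))

// Safer: keep the intermediate result raw with "none"
// and clean only in the last call
block=cleanSelect(".product", htmlProduct, "html", "none")
console(cleanSelect(".name", block))
```

In the second variant only the final call applies the default cleaning, which is the "clean in the last step" rule from the warning.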

cssSelector

Use Inspect element in your browser to find the CSS selector. See a CSS selector reference for the syntax.
It also works with XML.

_code (optional)

The HTML or XML code in which cssSelector is searched. If null, the page code is used in a FORPAGE script.

_selection (optional)

What do you want to extract from the selected DOM element?
• innerHTML or html (by default) = the HTML inside the element
• outerHTML = the HTML of the element, including its own tag
• text = the text in the element
• object = returns the Jsoup Elements object
• other value = the name of an attribute of the tag
Note: the attribute can be "abs:href" to get absolute links from an href attribute
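
As a rough illustration of the _selection values listed above, the sketch below runs several selections on the same fragment. The fragment and the variable name htmlLink are invented for the example, and the exact output depends on the post-processing applied, so no output is shown.

```
htmlLink='<p class="intro"><a href="/shirts">All <b>shirts</b></a></p>'

// "text": the text of the element, without its tags
console(cleanSelect(".intro", htmlLink, "text"))

// any other value is read as an attribute name, here href
console(cleanSelect(".intro a", htmlLink, "href"))

// "outerHTML" with the "none" post-processing: the HTML including the <p> tag
console(cleanSelect(".intro", htmlLink, "outerHTML", "none"))
```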

_postProcessing (optional)

How do you want to clean the output data? This parameter amounts to applying a combination of cleaning functions.
• null or omitted = for unformatted generic texts. Equivalent to stripTags + standardizeText
• "description" or "d" = for texts formatted in HTML. Equivalent to standardizeText
• "price" or "p" = for prices. Equivalent to htmlToPrice
• "number", "decimal", "float" or "n" = decimal number (see number)
• "integer" or "i" = integer number (see number)
• "none" or "-" or "." or "0" = no post-processing
• Other available post-processing values: "xml", "url", "prestashop_category", "prestashop_manufacturer", "prestashop_supplier", "prestashop_feature", "prestashop_feature_value", "prestashop_attribute", "prestashop_combination"
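
A minimal sketch of how the _postProcessing values above might be used on one fragment. The fragment and the variable name htmlOffer are hypothetical, and the exact formatting of each result depends on the underlying cleaning functions, so no output is shown.

```
htmlOffer='<div class="offer"><span class="price">19.90 EUR</span><span class="stock">12</span><div class="desc"><p>Soft <b>cotton</b></p></div></div>'

// "price": documented as equivalent to htmlToPrice
console(cleanSelect(".price", htmlOffer, "text", "price"))

// "integer": numeric integer
console(cleanSelect(".stock", htmlOffer, "text", "integer"))

// "description": for HTML-formatted text
console(cleanSelect(".desc", htmlOffer, "html", "description"))

// "none": no post-processing, useful when the result is parsed again later
console(cleanSelect(".desc", htmlOffer, "html", "none"))
```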