universalCaptchaSolver

universalCaptchaSolver ( string/array screenPosition , string/method checkerIsSolved , string _instructions , boolean[default:false] _testYourself , closure _actionBetweenSolvingTry , int _timeout , int[default:25] _maxSolvingTry ) : boolean

This powerful function lets you solve any captcha without having to identify its type. To do this, Grimport creates a split-screen session between a specific section of your PC screen and the device of a human agent who solves the captcha. The advantage is that mouse movements and keyboard input are reproduced faithfully, as if the captcha solver were actually sitting at your screen, and the fingerprint is not altered: the solver does not go through a proxy, there is no DNS leak, and the page is not rendered by a browser with different dimensions, for example.
This function requires the captcha to be loaded in Firefox Developer Edition and the Grimport Bridge extension to be up to date. To load the page, use the firefox function.
The function is synchronised between crawlers so that it cannot be executed twice simultaneously.

Installation


To use this technology, you need to install AutoHotkey v2 on Windows, or xdotool and wmctrl on Linux (sudo apt-get install xdotool wmctrl), so that Grimport can take control of the keyboard and mouse.
You also need to have a Cloud code defined, as the function costs idIA Tech Cloud Credits. The cost depends on the price of the captcha but is generally between 0.005€ and 0.01€ per captcha solved. Your Cloud code can be found on your idIA Tech invoices and your account must be funded. Enter your Cloud code in the Grimport Crawler settings or with the setCloudCode function.

Advice


It is important to make the solver's task as easy as possible. If the interaction zone is too large, the instructions are unclear or there are too many steps, more solvers are likely to cancel the task and move on to another one, which increases the waiting time for your script. Restrict the interaction zone as far as possible to the captcha alone and, where you can, preselect the input field or open the puzzle in advance, unless the captcha cancels and refreshes itself after a long wait.
While the solver is in control, you must not use your keyboard or mouse, as you would interfere with their actions.
You don't know the nationality of the captcha solver in advance. Make sure that captchas with instructions are displayed in English. Configure your browser to display pages in English only.
If you need a pixel ruler, we recommend ScreenRuler.

Example


setCloudCode(111111)
firefox("browse", "https://captcha.com/demos/features/captcha-demo.aspx")
firefox("waitLoaded")

universalCaptchaSolver("#demoForm fieldset", { ->
	if(firefox("javascript", """ document.querySelector(".correct") ? document.querySelector(".correct").innerHTML : '' """)) return true
	else return false
})

setCloudCode(111111)
firefox("setHeaders", [
	/.*/ : [ "Accept-Language": "en-US,en;q=0.5" ] //we ensure that all pages are in English
])

firefox("browse", "https://nopecha.com/demo/hcaptcha#easy")
firefox("waitLoaded")

positionIframe = firefox("positionInScreen", "iframe") //capture the initial position
screenZone = [ //extension of the area
	"x": get(positionIframe, "x"),
	"y": get(positionIframe, "y") - 50,
	"width": get(positionIframe, "width") + 500,
	"height": get(positionIframe, "height") + 650,
	"from": "screenInFirefox",
]

universalCaptchaSolver(
	screenZone,
	{ ->
		if(firefox("javascript", """ document.querySelector(".response") ? document.querySelector(".response").innerHTML : '' """)) return true
		else return false
	},
	"Task: click on 'I am a human' and solve the captcha to make it disappear and validate!", //solving the captcha in two stages using a puzzle
	true, //_testYourself: true for testing, false in production
	{-> //refresh the page to reset everything
		firefox("browse", "https://nopecha.com/demo/hcaptcha#easy")
		firefox("waitLoaded")
	},
	3600, //_timeout: one hour
	40 //_maxSolvingTry
)

Often the captcha is in an iframe (e.g. Datadome). You will then need to capture the coordinates inside the iframe in JS. These coordinates are relative to the page and not to the screen.

screenZone =  firefox("javascript", """
captchaFrame = document.querySelector('#captcha__frame'); //the Datadome captcha box

// Get the details of the captcha__frame
captchaFrameRect = captchaFrame.getBoundingClientRect();

// Build an object holding the coordinates
coordinates = {
x: captchaFrameRect.left + window.scrollX,
y: captchaFrameRect.top + window.scrollY,
width: captchaFrameRect.width,
height: captchaFrameRect.height,
from: 'pageInFirefox'
};

//return the object (it will be returned as an object directly usable as screenZone)
coordinates""", 4) // index 4 was found by iterative testing. It is the index of the frame in which the JavaScript is executed. Start with index 1 and increase it gradually until you find the frame that works. There may be frames within frames, and there may be many of them.

//we solve the captcha
universalCaptchaSolver(screenZone,{->
	//this check does not target frame 4: regex runs on the source of the main frame and simply detects the iframe that contains the other iframes, which themselves contain frame 4.
	//this is sufficient because, once the captcha is solved, the target site no longer has any iframe whose source contains the word "captcha".
	iframeDatadome = regex(/(?si)<iframe [^<>]*src="([^<>"]+captcha[^<>"]+)"/, firefox("sourceCode"), "outerHTML")
	if(iframeDatadome) return false
	return true
})
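
The page-relative conversion used in the JavaScript above can be isolated as a small pure function, which makes it easier to check outside the browser. This is only a minimal sketch: `toPageZone` and its `margin` parameter are hypothetical names, not part of Grimport. Only the `x`, `y`, `width`, `height` and `from` keys come from this page.

```javascript
// Hypothetical helper (not a Grimport function): turns a viewport-relative
// rectangle, such as the result of getBoundingClientRect(), into the
// page-relative zone object used with from: 'pageInFirefox'.
function toPageZone(rect, scrollX, scrollY, margin = 0) {
  return {
    x: rect.left + scrollX - margin, // page X = viewport X + horizontal scroll
    y: rect.top + scrollY - margin,  // page Y = viewport Y + vertical scroll
    width: rect.width + 2 * margin,  // the margin is added on both sides
    height: rect.height + 2 * margin,
    from: "pageInFirefox",
  };
}

// In the page this would be called as:
//   toPageZone(captchaFrame.getBoundingClientRect(), window.scrollX, window.scrollY, 10)
// Here a plain object stands in for the DOMRect so the sketch runs anywhere.
const zone = toPageZone({ left: 100, top: 40, width: 300, height: 80 }, 0, 600, 10);
console.log(zone);
```

A small margin can help when the captcha opens a popup slightly larger than its frame, but a zone that is too large makes the solver's task harder, so keep it modest.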

Here is a practical example of captcha resolution with Datadome for a site that also uses IP filtering:

identification = {->
	console("identification")

	setCloudCode(11111111111)
	//cookie page
	firefox("browse", "https://www.website.com/")
	firefox("waitLoaded")
	await(10*1000) //ajax functions load the cookie in a complex way, so we wait to make sure everything is in place

	//universalCaptchaSolver
	positionCaptcha = firefox("positionInScreen", "#captcha__frame", "*") //note the last argument "*", which is used to search for the element in iframes
	if(positionCaptcha)
	{	
		positionCaptcha.put("from", "screenInFirefox") //the capture mode is added to the array because firefox/positionInScreen does not return it
		universalCaptchaSolver(positionCaptcha,{->
			if(contains(firefox("javascript","el = document.querySelector('.captcha__human__container'); if(!el) ''; else el.outerHTML", "*"), "blocked")) return true  //in this case, the site blocks completely without offering a captcha, so we exit universalCaptchaSolver and deal with this case later with an IP change
			else if(firefox("javascript","""el = document.querySelector("iframe[src*='geo.captcha-delivery.com']"); if(!el) ''; else el.outerHTML""", "*")) return false //the iframe is present in the main frame, which means that the captcha is still there
			else return true //the captcha is solved
		}, "Solve the captcha (usually a puzzle piece to be moved to the right place). Refresh as necessary (wait 5 seconds after doing so).", false) //the instructions are important: guide the captcha solver carefully and in detail. At the slightest oddity, they will move on to another captcha.
		
	}

	//use of the cookie in grimport
	setCookieValue(firefox("getAllCookies", ".website.com"), "https://www.website.com/")
	setCookieValue(firefox("getAllCookies", ".www.website.com"), "https://www.website.com/")
	setCookieValue(firefox("getAllCookies", "www.website.com"), "https://www.website.com/")

	//check for blocking
	if(contains(firefox("javascript","el = document.querySelector('.captcha__human__container'); if(!el) ''; else el.outerHTML", "*"), "blocked"))
	{
		console("Total blocking - you have to change your ip and cookie")
		return false //failed identification
	}
	return true //successful identification
}

nbHTTPerrors = 0 //HTTP error counter
actionHttpError({httpErrorIdentifier,errorLink,codeHTTP-> //actionHttpError is triggered every time an HTTP error occurs somewhere in Grimport
	if(equals(codeHTTP, 403)) //reverse engineering of requests has established that in the event of a 403 error you need to identify yourself again
	{
		nbHTTPerrors++ //each time an error occurs, the counter is incremented

		successIdentification = false
		if(nbHTTPerrors > 0 && nbHTTPerrors<=4)
		{
			successIdentification = identification()
		}
		if(nbHTTPerrors > 4 || !successIdentification) //if there are many requests for identification or if there is a problem with identification, the IP is changed
		{
			nbHTTPerrors=0
			proxyChange(["method":"nordvpn_by_command","zone":"europe",
			"action_between_change":{->
				firefox("clearAllCookiesAllDomains") //delete the cookies in Firefox
				clearAllCookies() //delete the cookies in Grimport Crawler
				identification() //identify again once the proxy has changed
			}])
		}
	}
})


See also

antiCaptcha
firefox

Parameters

screenPosition

This is the selected area of the Firefox page whose image and commands will be shared with the captcha solver.
The captcha solver cannot interact outside this area and cannot perform keystrokes. Their commands remain basic.
There are two ways of defining this zone:
  • string (ex: ".captcha-puzzle") : If you use a text string, it is interpreted as a CSS selector. The x, y, width and height coordinates of the designated HTML element are used. Select a parent element of the captcha so that it can be viewed in full. Note that some captchas display an "I'm human" box, for example, which opens a larger popup when clicked. If you use the selector of the small box, the puzzle popup will be cut off and the solver will not be able to solve it.
  • associative array (ex: ["x": 20, "y": 100, "width": 500, "height": 300, "from": "pageInFirefox"]) : If you use an array, you specify the x, y coordinates and the width and height of the area to be captured. An additional key must be set: "from", which can take the value "pageInFirefox" (the coordinates are measured inside the Firefox web page, i.e. (0,0) designates the top left corner of the web page) or "screenInFirefox" (the coordinates are measured on your screen, i.e. (0,0) designates the top left corner of your screen). The positionInScreen action of firefox can be useful for obtaining these coordinates. With these two options, the solver cannot see anything other than the Firefox window, even if Firefox is minimised. This is a security feature for the Grimport user. A third value exists for "from": "screen", which lets you capture an area of the screen outside Firefox ((0,0) designates the top left corner of your screen).

checkerIsSolved

The @checker is used to check whether the captcha has been solved. There are two types of @checker:
  • string (ex: ".captcha-puzzle") : If you use a character string, it designates the CSS selector of an HTML element present in the page. As long as this element is detected, the captcha has not been solved. When it disappears, the captcha is solved and universalCaptchaSolver ends.
  • closure (ex: {-> return !equals(firefox("javascript", "window.location.href"), "https://site.com/captcha-check.php") } ) : You can define a function. As long as it returns false, the captcha is still being solved. When it returns true, universalCaptchaSolver ends because the captcha has been solved.
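
The closure examples on this page all rely on the same page-side JavaScript pattern: return the content of a success marker, or an empty string while that marker is absent, and test the result directly in an `if`. Below is a minimal sketch of that pattern as a reusable function. The name `innerHtmlOrEmpty` and the stand-in `fakeDocument` are hypothetical and only serve the illustration. In a real page, `document` would be passed instead.

```javascript
// Hypothetical helper: returns the inner HTML of the first element matching
// `selector`, or '' when no such element exists. An empty string is falsy,
// so the result can drive the true/false decision of a checker closure.
function innerHtmlOrEmpty(doc, selector) {
  const el = doc.querySelector(selector);
  return el ? el.innerHTML : "";
}

// Minimal stand-in for `document`, so the sketch runs outside a browser.
const fakeDocument = {
  querySelector: (selector) =>
    selector === ".correct" ? { innerHTML: "Correct!" } : null,
};

console.log(innerHtmlOrEmpty(fakeDocument, ".correct"));        // prints "Correct!"
console.log(innerHtmlOrEmpty(fakeDocument, ".missing") === ""); // prints true
```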

_instructions (optional)

Enter the necessary instructions for the solver here. Be brief and clear.
By default, this text is displayed: "Task: solve the captcha to make it disappear and validate!"

_testYourself (optional)

Use this argument if you want to test the function, to check that the capture zone is correctly defined or that the general presentation to the solver is correct. You will be given a link to open on another computer or smartphone to solve the captcha yourself. The @testYourself mode costs no money.

_actionBetweenSolvingTry (optional)

Here you can define the action to be taken between two solvers' attempts at the captcha. Solvers often make mistakes or behave erratically. They then cancel the current task and move on to another, leaving their mistakes on the screen. The next solver then has to fix everything before solving the captcha, which can be a deterrent for some. With this action, you can reset the test to its initial state, for example by clearing the field containing the captcha text to be entered or by generating a new puzzle.

_timeout (optional)

Timeout in seconds for the function. After this time, even if no solution has been found, the function ends. By default, there is no maximum timeout.

_maxSolvingTry (optional)

Maximum number of tries to solve the captcha (25 by default). Different solvers will come and go, and each will have a chance. Only the one who succeeds is paid, and their success ends the function. If this number is exceeded, the function ends without having succeeded in solving the captcha.