Do you need to take snapshots of some URLs? Do you just want a headless browser? Are you using .NET? Would you rather not rely on node/phantom or on a .NET library (awesomium, cefsharp, webkitdotnet)? Do you love IE? Yes to all? This post is for you!
Because we can’t do a screenshot in modern browsers
If you control the web application or website that needs to take the screenshots, you know there is no such thing in current browsers, no standard whatsoever (but WebRTC is coming!). Maybe you have come across the client-side library html2canvas (and canvg if you are dealing with SVG), but it is not enough. It’s good for SVG and canvas, but as soon as HTML5 controls or complex styles are involved, it won’t do a nice job. You need a true browser that renders the page properly and waits for ajax calls, images and so on; then you call a method on it to get a snapshot.
Let’s emulate IE (yes, we want!)
Fortunately for you, the .NET WebBrowser control is there. It is not suitable for every situation (grab a coffee and read this stackoverflow post), but if you just want a snapshot of a given URL, it won’t be a problem.
STAThread
First of all, you need a thread in STA (Single-Threaded Apartment) mode. If you want to take a snapshot from WCF, a console app, or any other program whose threads are not STA, you need to create such a thread like this:
    Dim ARE As New AutoResetEvent(False)
    Dim tWebBrowserSnapshot As Bitmap = Nothing
    Dim t = New Thread(Sub()
                           Try
                               Using tWebBrowser = New WebBrowser()
                                   ...
                               End Using
                           Finally
                               ' Always release the waiting caller, even on error
                               ARE.Set()
                           End Try
                       End Sub)
    t.SetApartmentState(ApartmentState.STA)
    t.Start()
    ARE.WaitOne()
    Return tWebBrowserSnapshot
Snapshot me
The browser is created, yeepee! You can now add some code to give it some style/size.
    tWebBrowser.ScrollBarsEnabled = False
    tWebBrowser.ScriptErrorsSuppressed = True
    tWebBrowser.Width = myWidth
    tWebBrowser.Height = myHeight
    tWebBrowser.Navigate(myUrl)
Then you can take a snapshot because it has a nice method: DrawToBitmap.
    Dim tBitmap = New Bitmap(tWebBrowser.Width, tWebBrowser.Height)
    tWebBrowser.DrawToBitmap(tBitmap, New Rectangle(0, 0, tWebBrowser.Width, tWebBrowser.Height))
Snapshot logic
The main logic is there, but if you try it as is, you can run into several issues (such as a blank image).
First, don’t forget that it’s a ‘true’ browser that renders the page. So it takes time to load the page, and if the page uses ajax or images, the browser has to request them, process the responses, etc. There is no magic flag that is set when everything on the page is loaded (so that you know when to call DrawToBitmap). WebBrowser has some events (such as DocumentCompleted), but it’s like jQuery’s $(document).on('ready', fn): nothing is done yet. If you don’t know the targeted URL, you don’t have a lot of choices:
– wait a fixed amount of time to be almost sure everything is loaded
– maybe try to catch every ajax call the website makes (by injecting some script into the document), then count how many came back (but still add a max timeout, otherwise you could wait forever).
– if you control the website behind the URL, create a variable that will be set to some value (true) when the page is ready, and just wait for that.
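As a rough sketch of the first option (wait a fixed time, with a hard cap on the total wait), the pieces shown so far could be combined like this. TakeSnapshot, settleMs and maxWaitMs are names invented here for illustration. The sketch also assumes that the STA thread has no message loop of its own, so it pumps messages with Application.DoEvents while it waits. Treat it as a starting point, not a definitive implementation.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Drawing
Imports System.Threading
Imports System.Windows.Forms

Module SnapshotSketch

    ' Hypothetical helper: renders url in a WebBrowser on a dedicated STA thread.
    ' settleMs  = extra time granted to ajax calls and images after the document is complete
    ' maxWaitMs = upper bound on the time spent waiting for the document itself
    Function TakeSnapshot(url As String, width As Integer, height As Integer,
                          settleMs As Integer, maxWaitMs As Integer) As Bitmap
        Dim done As New AutoResetEvent(False)
        Dim snapshot As Bitmap = Nothing

        Dim t = New Thread(
            Sub()
                Try
                    Using browser = New WebBrowser()
                        browser.ScrollBarsEnabled = False
                        browser.ScriptErrorsSuppressed = True
                        browser.Width = width
                        browser.Height = height
                        browser.Navigate(url)

                        ' Phase 1: pump messages until the document is complete
                        ' or the time budget is exhausted.
                        Dim clock = Stopwatch.StartNew()
                        While browser.ReadyState <> WebBrowserReadyState.Complete AndAlso
                              clock.ElapsedMilliseconds < maxWaitMs
                            Application.DoEvents()
                            Thread.Sleep(10)
                        End While

                        ' Phase 2: fixed settle delay, still pumping messages so that
                        ' scripts, ajax callbacks and image loads can run.
                        Dim settleUntil = clock.ElapsedMilliseconds + settleMs
                        While clock.ElapsedMilliseconds < settleUntil
                            Application.DoEvents()
                            Thread.Sleep(10)
                        End While

                        snapshot = New Bitmap(width, height)
                        browser.DrawToBitmap(snapshot, New Rectangle(0, 0, width, height))
                    End Using
                Finally
                    ' Release the caller whether the snapshot succeeded or not.
                    done.Set()
                End Try
            End Sub)

        t.SetApartmentState(ApartmentState.STA)
        t.Start()
        done.WaitOne()
        Return snapshot
    End Function

End Module
```

A call such as TakeSnapshot(myUrl, 1024, 768, 2000, 15000) would then wait at most 15 seconds for the document and 2 more seconds for late content. The function returns Nothing if an exception prevented the drawing, so check the result before saving it.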
Windows Registry is there for you
If you did all that, you can still get blank pages. Why? Because WebBrowser can use an old IE engine to render the page, one that is not compatible with it. Because your server is up to date, you want to use IE11. To do that, you need to add a value in the registry, under this key:
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION
Add a REG_DWORD value named after the process (if you are using IIS, the process is w3wp.exe; if you are using a console app, put the name of your exe) with 0x2AF9 as data (check MSDN for the possible values; IE10 is 0x2711).
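If you would rather set this value from code than by hand in regedit, something along these lines should do it. UseIE11Rendering is a name made up for this sketch. Writing under HKEY_LOCAL_MACHINE normally requires administrator rights, and on 64-bit Windows a 32-bit process is expected to be redirected to the Wow6432Node view of the key, so run the code with the same bitness as the process that hosts the WebBrowser.

```vbnet
Imports System
Imports System.Diagnostics
Imports Microsoft.Win32

Module BrowserEmulationSketch

    Const FeatureKey As String =
        "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION"

    ' Hypothetical helper: asks WebBrowser controls hosted in processName
    ' (for example "w3wp.exe") to render with the IE11 engine.
    Sub UseIE11Rendering(processName As String)
        ' &H2AF9 is the IE11 value given in the post (11001 in decimal).
        Registry.SetValue(FeatureKey, processName, &H2AF9, RegistryValueKind.DWord)
    End Sub

    Sub Main()
        ' For a console app, register the current executable.
        ' ProcessName has no extension, so ".exe" is appended.
        Dim exeName = Process.GetCurrentProcess().ProcessName & ".exe"
        UseIE11Rendering(exeName)
    End Sub

End Module
```

The value has to be in place before the first WebBrowser is created in the process, so a setup step or installer is probably a better home for this code than the snapshot method itself.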
IIS will help too
One last thing: if it’s a WCF service that takes the screenshot and it is hosted in IIS, you need to change the identity that runs the application pool to LocalService.
Have a nice screenshot!