I. Introduction
Sometimes you need a customized browser. One option is to build your own and freely add novel, nonstandard features to it; the result is a new but nonstandard browser. The WebBrowser control, however, is only the browser's parsing and rendering engine, so a good deal of user-interface work is still left to you: adding an address bar, a toolbar, history, a status bar, a channel bar, and favorites. There are therefore two ways to produce a custom browser: turn the WebBrowser control into a full-featured browser, as Microsoft did with Internet Explorer, or add new functions on top of an existing browser. Wouldn't it be nice if there were a direct way to customize the existing Internet Explorer? Browser Helper Objects (BHOs) serve exactly this purpose.
II. Software Customization
In the past, customizing the behavior of a program was mainly done through subclassing. With it you can change the appearance and behavior of a window. Although subclassing is a somewhat brute-force method (the victim does not know what is happening), for a long time it was the only choice.
With the advent of the Microsoft Win32 API, subclassing across processes became increasingly difficult. Of course, if you are brave, if pointers never scare you, and above all if you are at ease with system-wide hooks, you may think the problem is simple. But that is not always the case. The problem is that each process runs in its own address space, and breaking process boundaries is not entirely proper. Besides, you may want customization to be better managed. More often, the program itself is designed to allow customization.
In the latter case, the program only needs to look up additional component modules at a specified location on disk, load and initialize them, and then let them do the work they were designed for. This is exactly what Internet Explorer does with its BHOs.
III. What Is a BHO?
From a certain point of view, Internet Explorer is no different from any other Win32 program. A BHO is an in-process COM object that Internet Explorer loads each time it starts. Such an object runs in the same context as the browser and can perform any action on the available windows and modules. For example, a BHO can detect typical browser events such as GoBack, GoForward, and DocumentComplete. It can also access and modify the browser's menu and toolbar, create new windows to display additional information about the current web page, and install hooks to monitor messages and actions. In short, a BHO works like a spy that we send into the browser's territory (a perfectly legal job that Microsoft allows).
Before going further, a few points need elaborating. First, a BHO is tied to the browser's main window. In practice, this means that a new instance of the BHO is created as soon as a new browser window is created, and the lifetime of any BHO instance matches that of the browser window it belongs to. Second, BHOs exist only in Internet Explorer 4.0 and later.
If you are running Microsoft Windows 98, Windows 2000, or Windows 95 or Windows NT 4.0 with the Active Desktop shell update (shell version 4.71), BHOs are also supported by Windows Explorer. A BHO is a COM in-process server registered under a specific key in the registry. At startup, Internet Explorer looks up that key and loads every object whose CLSID is listed under it.
The browser initializes each object and asks it for a specific interface. If the interface is found, Internet Explorer uses its methods to pass its own IUnknown pointer to the BHO. See Figure 1:
Figure 1. How Internet Explorer loads and initializes a BHO. The BHO's site is the COM interface used for communication.
The browser may find a list of CLSIDs in the registry, and it creates an in-process instance of each one. As a result, these objects are loaded into the browser's context and run as if they were native components. However, because Internet Explorer is COM-based, merely being loaded into its process space does not help much with more ambitious goals. In other words, a BHO can indeed do many potentially useful things, such as subclassing constituent windows or installing thread-local hooks, but it remains far removed from the browser's core activity. To hook browser events or automate the browser, the BHO needs to establish a private, COM-based communication channel. For this reason, a BHO must implement an interface called IObjectWithSite. Through IObjectWithSite, Internet Explorer passes in its IUnknown pointer. The BHO, in turn, can store this pointer and query it for more specialized interfaces, such as IWebBrowser2, IDispatch, and IConnectionPointContainer.
Another way to look at BHOs is to compare them with Windows shell extensions. As we know, a Windows shell extension is an in-process COM server that Windows Explorer loads into memory when it performs certain actions, such as displaying a context menu. By writing a COM module that implements a few COM interfaces, you can add items to the context menu and handle them properly. A shell extension must be registered in a way that Windows Explorer can discover. A BHO follows the same pattern; the only change lies in the interfaces to implement. Despite the differences in implementation, shell extensions and BHOs have many features in common, as Table 1 shows:
Table 1. Comparison of shell extension and BHO features
| Feature | Shell extension | BHO |
| --- | --- | --- |
| Loaded by | Windows Explorer | Internet Explorer (and Windows Explorer with shell version 4.71 and later) |
| Triggered by | A user action on a document (for example, a right-click) | Opening a browser window |
| Unloaded when | A few seconds after its reference count reaches 0 | The browser window that loaded it is closed |
| Implemented as | In-process COM DLL | In-process COM DLL |
| Registration requirements | The usual entries for a COM server, plus other entries that depend on the type of shell extension and the document type it applies to | The usual entries for a COM server, plus one entry that identifies it as a BHO |
| Interfaces required | Depends on the type of shell extension | IObjectWithSite |
If you are interested in shell extension programming, refer to the relevant MSDN documentation.
IV. BHO Lifecycle
As mentioned earlier, BHOs are not supported by Internet Explorer alone. If you are running shell version 4.71 or later, your BHO will also be loaded by Windows Explorer. Table 2 lists the shell versions found in different products. The Windows shell version is the version of the shell32.dll library file.
Table 2. BHO support in different versions of Windows
| Shell version | Installed products | BHO support |
| --- | --- | --- |
| 4.00 | Windows 95 and Windows NT 4.0, with or without Internet Explorer 4.0 or earlier; the shell update is not installed | Internet Explorer 4.0 |
| 4.71 | Windows 95 or Windows NT 4.0 with Internet Explorer 4.0 and the Active Desktop shell update | Internet Explorer and Windows Explorer |
| 4.72 | Windows 98 | Internet Explorer and Windows Explorer |
| 5.00 | Windows 2000 | Internet Explorer and Windows Explorer |
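The rule in Table 2 boils down to a version comparison. The following is a minimal sketch in portable C++, not code from the sample: the function name explorerLoadsBho and its two integer parameters are assumptions made for illustration, and a real program would first have to read the version numbers from shell32.dll.

```cpp
#include <cassert>

// Hypothetical helper (not part of the sample): returns true when the given
// shell version is 4.71 or later, the point from which the Windows shell
// loads BHOs in addition to Internet Explorer.
bool explorerLoadsBho(int shellMajor, int shellMinor) {
    assert(shellMajor >= 0 && shellMinor >= 0);
    if (shellMajor != 4) {
        return shellMajor > 4;   // 5.00 and later qualify, 3.x and earlier do not
    }
    return shellMinor >= 71;     // within 4.x the threshold is 4.71
}
```

For instance, explorerLoadsBho(4, 0) is false while explorerLoadsBho(4, 72) and explorerLoadsBho(5, 0) are true, matching the rows of the table.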
A BHO is loaded when the browser's main window is displayed and unloaded when that window is destroyed. If you open multiple browser windows, multiple BHO instances are created.
BHOs are loaded no matter what command line the browser is started with. For example, they are loaded even if you just want to view a specific HTML page or a given folder. In general, BHOs come into play whenever explorer.exe or iexplore.exe runs. If you set the "Open each folder in its own window" folder option, the BHOs are loaded each time you open a folder. See Figure 2.
Figure 2. Each time you open a folder, a separate instance of explorer.exe runs and loads the registered BHOs.
Note, however, that this applies only when you open folders starting from the My Computer icon on the desktop. In that case, the shell is invoked each time you move to another folder. This does not happen when you browse with the two-pane view. There, when you change folders, the shell does not start a new browser instance but simply creates another instance of the embedded view object. If you change folders by typing a new name in the address bar, the folder is shown in the same window, whether Windows Explorer is in single-pane or two-pane view.
With Internet Explorer, things are simpler. You get multiple copies of Internet Explorer only when you explicitly run iexplore.exe multiple times. When you open a new window from within Internet Explorer, the window is created on a new thread rather than in a new process, so the BHO does not need to be reloaded.
The most interesting thing about BHOs is that they are extremely dynamic. Each time Windows Explorer or Internet Explorer opens, the loader reads the CLSIDs of the installed BHOs from the registry and processes them. If you edit the registry between opening two browser instances, the different copies of the browser can load different sets of BHOs. This makes BHOs a good alternative to writing a new browser from scratch by embedding the WebBrowser control in a Visual Basic or MFC frame window. You get considerable freedom to customize the browser while relying on the full power of Internet Explorer, and you can add as many add-ons as you need.
V. IObjectWithSite Interface
From a high-level point of view, a BHO is a DLL that attaches itself to each new instance of Internet Explorer. Under certain circumstances, a BHO also applies to Windows Explorer.
In general, a site is an intermediate object placed between a container and the object it contains. Through it, the container manages the contained object and makes its own services available to that object. To this end, the container must implement the IOleClientSite interface and the contained object must implement IOleObject. By calling the methods of IOleObject, the container makes the contained object aware of its host environment.
When the container is Internet Explorer (or a Web-enabled Windows Explorer), the contained object only needs to implement the lightweight IObjectWithSite interface. This interface provides the following methods:
Table 3. IObjectWithSite definition
| Method | Description |
| --- | --- |
| HRESULT SetSite(IUnknown* pUnkSite) | Receives the IUnknown pointer from Internet Explorer. A typical implementation saves the pointer for later use. |
| HRESULT GetSite(REFIID riid, void** ppvSite) | Retrieves the requested interface from the site last set through SetSite(). A typical implementation queries the previously saved pointer for the requested interface. |
The only strict requirement on a BHO is that it implement this interface. Note that you should avoid returning E_NOTIMPL from either of the methods above. Either do not implement the interface at all, or make sure both methods are coded properly.
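The contract behind the two methods can be illustrated without any Windows headers. The following is only a toy model in portable C++: Site, Helper, and Result are hypothetical stand-ins for the browser's IUnknown, the BHO, and HRESULT, and the reference counting is deliberately simplified. It assumes the usual COM convention that the stored pointer is AddRef'd and that passing a null site releases it.

```cpp
#include <cassert>

// Toy stand-ins (assumptions, not the Windows SDK types).
enum class Result { Ok, Fail };

struct Site {                        // plays the role of the browser's IUnknown
    int refs = 1;
    void AddRef() { ++refs; }
    void Release() { --refs; }
};

class Helper {                       // plays the role of the BHO
public:
    // SetSite: take a reference on the new site, drop the old one.
    // A null pointer simply releases whatever was stored.
    Result SetSite(Site* site) {
        if (site) site->AddRef();
        if (site_) site_->Release();
        site_ = site;
        return Result::Ok;
    }

    // GetSite: return the stored site with an extra reference,
    // or fail (instead of "not implemented") when no site is set.
    Result GetSite(Site** out) {
        assert(out != nullptr);
        *out = site_;
        if (!site_) return Result::Fail;
        site_->AddRef();
        return Result::Ok;
    }

private:
    Site* site_ = nullptr;
};
```

The point of the sketch is the second method: even when there is nothing to return, it reports a failure rather than claiming to be unimplemented.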
VI. Constructing Your Own BHO Object
A BHO is an in-process COM server DLL, and there is hardly a more appropriate tool than ATL for creating one. Another reason for choosing ATL is that it already provides a default, and sufficient, implementation of the IObjectWithSite interface. In addition, one of the predefined object types natively supported by the ATL COM wizard is the Internet Explorer Object, which is exactly the type a BHO should be. An ATL Internet Explorer Object is really a simple object: a COM server that supports IUnknown, self-registration, and the IObjectWithSite interface. If you add such an object to an ATL project and name its class CViewSource, the wizard generates the following code:
class ATL_NO_VTABLE CViewSource :
    public CComObjectRootEx<CComSingleThreadModel>,
    public CComCoClass<CViewSource, &CLSID_ViewSource>,
    public IObjectWithSiteImpl<CViewSource>,
    public IDispatchImpl<IViewSource, &IID_IViewSource, &LIBID_HTMLEDITLib>
As you can see, the wizard has derived the class from IObjectWithSiteImpl, an ATL template class that provides a basic implementation of the IObjectWithSite interface. In general, there is no need to override the GetSite() member function. SetSite(), on the other hand, often needs to be customized, because the ATL implementation merely stores the IUnknown interface pointer in the member variable m_spUnkSite.
The rest of this article discusses a fairly complex and rich BHO example. The BHO attaches itself to Internet Explorer and displays a text box containing the source code of the page currently being viewed. The code window is updated automatically as you move from page to page, and it is grayed out if the browser is not displaying an HTML page. Any change you make to the HTML code is immediately reflected in the browser. Dynamic HTML (DHTML) is what makes this seemingly magical feature possible. The code window can be hidden and shown again with a hotkey. It shares the desktop space with Internet Explorer, as shown in Figure 3.
Figure 3. The BHO in action. It attaches to Internet Explorer and displays a window with the source code of the page currently being viewed. You can also modify the source code.
The key to this example is gaining access to Internet Explorer's browsing machinery, which is really just an instance of the WebBrowser control. The example can be implemented in the following five steps:
- Detect who is loading the object: Internet Explorer or Windows Explorer;
- Obtain the IWebBrowser2 interface that represents the WebBrowser object;
- Capture specific WebBrowser events;
- Access the current document object and make sure it is an HTML document;
- Manage the dialog box that displays the HTML source code.
The first step is carried out in DllMain(). SetSite() is the proper place to obtain the pointer to the WebBrowser object. The following sections describe each step in detail.
VII. Detecting Who Is Calling the Object
As mentioned above, a BHO is loaded by Internet Explorer and also by Windows Explorer (provided the shell version is 4.71 or later). The BHO designed here is meant to work with HTML pages, so it has nothing to do with Windows Explorer. If a DLL does not want to be loaded by a particular caller, it only needs to find out in DllMain() who is loading it and return FALSE. See the following code:
if (dwReason == DLL_PROCESS_ATTACH)
{
    TCHAR pszLoader[MAX_PATH];
    // Get the path of the executable that is loading us. The first
    // parameter must be NULL; see MSDN for details.
    GetModuleFileName(NULL, pszLoader, MAX_PATH);
    _tcslwr(pszLoader);
    // Refuse to load into Windows Explorer.
    if (_tcsstr(pszLoader, _T("explorer.exe")))
        return FALSE;
}
Once you know that the current process is Windows Explorer, you can bail out immediately.
Note that being more selective is dangerous. Other processes that try to load the DLL would be turned away too. If you do the opposite test, checking for the name of the Internet Explorer executable, iexplore.exe, the first victim is regsvr32.exe (the program used to register the object automatically).
if (!_tcsstr(pszLoader, _T("iexplore.exe")))
With this test in place, you can no longer register the DLL. When regsvr32.exe tries to load the DLL in order to call DllRegisterServer(), the load is refused.
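The decision logic can be tried out away from DllMain(). The sketch below is portable C++ and only an illustration: shouldLoadInto is a hypothetical helper, and unlike the substring test above it compares the bare file name, which is a small variation rather than the article's method. It rejects Windows Explorer and accepts every other host, so regsvr32.exe can still register the DLL.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: returns true if the DLL should stay loaded in the
// process whose executable path is given. Only explorer.exe is refused.
bool shouldLoadInto(std::string hostPath) {
    assert(!hostPath.empty());
    // Lowercase the path so the comparison is case-insensitive.
    std::transform(hostPath.begin(), hostPath.end(), hostPath.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    // Keep only the file name, so a folder called "explorer" cannot match.
    const std::string::size_type slash = hostPath.find_last_of("\\/");
    const std::string file =
        (slash == std::string::npos) ? hostPath : hostPath.substr(slash + 1);
    return file != "explorer.exe";
}
```

In a real DllMain() the path would come from GetModuleFileName(), and a false result would translate into returning FALSE on DLL_PROCESS_ATTACH.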
VIII. Contacting the Web Browser
The SetSite() method is where the BHO is initialized, and where you can perform all the tasks that need to happen only once. When you open a URL in Internet Explorer, you should wait for a series of events to be sure the requested document has been completely downloaded and initialized. Only then can you access the document's content through the interfaces exposed by its object model (if any). This means you need to obtain a couple of pointers. The first is a pointer to IWebBrowser2 (the interface that represents the WebBrowser object). The second pointer has to do with events: the module must act as a browser event listener to receive download and document-related events. Both are wrapped in ATL smart pointers:
CComQIPtr<IWebBrowser2, &IID_IWebBrowser2> m_spWebBrowser2;
CComQIPtr<IConnectionPointContainer, &IID_IConnectionPointContainer> m_spCPC;
The source code is as follows:
HRESULT CViewSource::SetSite(IUnknown *pUnkSite)
{
    // Retrieve and store the IWebBrowser2 pointer
    m_spWebBrowser2 = pUnkSite;
    if (m_spWebBrowser2 == NULL)
        return E_INVALIDARG;
    // Retrieve and store the IConnectionPointContainer pointer
    m_spCPC = m_spWebBrowser2;
    if (m_spCPC == NULL)
        return E_POINTER;
    // Retrieve and store the browser's HWND and install
    // a keyboard hook for later use
    RetrieveBrowserWindow();
    // Connect to the container to receive event notifications
    return Connect();
}
Obtaining the IWebBrowser2 interface pointer is a matter of a query, and the same goes for IConnectionPointContainer, which is needed for events. SetSite() also retrieves the browser's HWND and installs a keyboard hook on the current thread. The HWND is used later when the Internet Explorer window is moved or resized. The hook implements the hotkey that shows or hides the code window.
IX. Getting Events from Internet Explorer
When you navigate to a new URL, the browser has two things to do: download the document and prepare a host environment for it. That is, it must initialize an object and make it available externally. Depending on the document type, it either loads a registered Microsoft ActiveX server to handle the document (as with Word .doc files) or initializes internal components that parse the document's content and render it. The latter is the case for HTML pages, whose content becomes available through the DHTML object model. When the document has been completely downloaded, the DownloadComplete event fires. This does not mean that the document's content can already be safely manipulated through the object model. Only the DocumentComplete event indicates that everything is finished and the document is ready. (Note that DocumentComplete arrives only the first time you access the URL. If you refresh the page, you receive only a DownloadComplete event.)
To intercept browser events, the BHO must connect to the browser through an IConnectionPoint interface and pass it an IDispatch pointer on which the events are delivered. Use the IConnectionPointContainer pointer obtained earlier to call FindConnectionPoint(), which returns a pointer to the connection point object for the required outgoing interface, DIID_DWebBrowserEvents2. The following code shows how to connect:
HRESULT CViewSource::Connect(void)
{
    HRESULT hr;
    CComPtr<IConnectionPoint> spCP;
    // Get the connection point for WebBrowser events
    hr = m_spCPC->FindConnectionPoint(DIID_DWebBrowserEvents2, &spCP);
    if (FAILED(hr))
        return hr;
    // Pass the event handler to the container. Each time an event occurs,
    // the container invokes the corresponding function on our IDispatch interface.
    hr = spCP->Advise(reinterpret_cast<IDispatch*>(this), &m_dwCookie);
    return hr;
}
By calling the Advise() method of IConnectionPoint, the BHO tells the browser that it is interested in the events it raises. Given how COM events work, this means that the BHO hands its IDispatch interface pointer to the browser. The browser then calls back the Invoke() method of that IDispatch interface, passing the event ID as the first parameter:
HRESULT CViewSource::Invoke(DISPID dispidMember, REFIID riid,
    LCID lcid, WORD wFlags, DISPPARAMS* pDispParams,
    VARIANT* pvarResult, EXCEPINFO* pExcepInfo, UINT* puArgErr)
{
    if (dispidMember == DISPID_DOCUMENTCOMPLETE)
    {
        OnDocumentComplete();
        m_bDocumentCompleted = true;
    }
    // ...
}
Remember to disconnect from the browser when events are no longer needed. If you forget, the BHO remains locked in memory even after you close the browser window. The best time to disconnect is when the OnQuit event is received.
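The subscribe, notify, and unsubscribe cycle can be modeled in a few lines. What follows is a toy in portable C++, not the COM API: EventSource, Event, and the callback type are assumptions made for illustration. It only mirrors the idea that Advise() hands back a cookie and that the listener should call Unadvise() with that cookie when the quit notification arrives.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical event IDs standing in for the browser's dispatch IDs.
enum class Event { DocumentComplete, OnQuit };

// Toy event source that mimics the Advise/Unadvise/cookie pattern.
class EventSource {
public:
    using Sink = std::function<void(Event)>;

    // Registers a listener and returns the cookie that identifies it.
    unsigned long Advise(Sink sink) {
        assert(sink != nullptr);
        const unsigned long cookie = nextCookie_++;
        sinks_[cookie] = std::move(sink);
        return cookie;
    }

    // Removes a listener; returns false if the cookie is unknown.
    bool Unadvise(unsigned long cookie) {
        return sinks_.erase(cookie) == 1;
    }

    // Delivers an event to every registered listener.
    void Fire(Event e) const {
        // Iterate over a copy so a listener may unadvise itself safely.
        const std::map<unsigned long, Sink> snapshot = sinks_;
        for (const auto& entry : snapshot) {
            entry.second(e);
        }
    }

private:
    unsigned long nextCookie_ = 1;
    std::map<unsigned long, Sink> sinks_;
};
```

A listener that unadvises itself on Event::OnQuit stops receiving notifications from then on, which is the behavior the BHO needs in order to be released when the browser closes.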
X. Accessing the Document Object
At this point, the BHO holds a pointer to Internet Explorer's WebBrowser control and is connected to it to receive all its events. Once the page has been downloaded and properly initialized, we can access it through the DHTML document model. The WebBrowser's Document property returns a pointer to the IDispatch interface of the document object:
CComPtr<IDispatch> pDisp;
HRESULT hr = m_spWebBrowser2->get_Document(&pDisp);
The get_Document() method only returns an interface pointer. We still need to determine that an HTML document object really lies behind the IDispatch pointer. In Visual Basic, you could use the following code:
Dim doc As Object
Set doc = WebBrowser1.Document
If TypeName(doc) = "HTMLDocument" Then
    ' Get the document content and display it
Else
    ' Disable the display dialog
End If
Now you need to find out what lies behind the IDispatch pointer returned by get_Document(). Internet Explorer is not only an HTML browser but also an ActiveX document container, so there is no guarantee that the object currently displayed is an HTML document. There is a way out, though: if the IDispatch pointer really points to an HTML document, a query for the IHTMLDocument2 interface succeeds.
The IHTMLDocument2 interface wraps all the functionality that the DHTML object model exposes for an HTML page. The following code performs the check:
CComPtr<IDispatch> pDisp;
HRESULT hr = m_spWebBrowser2->get_Document(&pDisp);
CComQIPtr<IHTMLDocument2, &IID_IHTMLDocument2> spHTML;
spHTML = pDisp;
if (spHTML)
{
    // Get the document content and display it
}
else
{
    // Disable the code window controls
}
If the query for IHTMLDocument2 fails, the spHTML pointer is NULL.
Now let's see how to obtain the source code of the page currently displayed. Just as an HTML page wraps all its content in the <BODY> tag, the DHTML object model requires you to obtain a pointer to the body object:
CComPtr<IHTMLElement> m_pBody;
hr = spHTML->get_body(&m_pBody);
The strange thing is that the DHTML object model does not let you get at the raw content that precedes the <BODY> tag, such as <HEAD>. That content is exposed through a number of properties, but you still cannot extract the raw text of the original HTML file. For this example, the content of the BODY part is enough. To obtain the HTML between <BODY> and </BODY>, read the outerHTML property into a BSTR variable:
BSTR bstrHTMLText;
hr = m_pBody->get_outerHTML(&bstrHTMLText);
From here, displaying the source code in the code window is a simple matter: create the window, convert the text from Unicode to ANSI, and fill the edit control. The following code implements these steps:
HRESULT CViewSource::GetDocumentContent()
{
    USES_CONVERSION;
    // Get the WebBrowser's document object
    CComPtr<IDispatch> pDisp;
    HRESULT hr = m_spWebBrowser2->get_Document(&pDisp);
    if (FAILED(hr))
        return hr;
    // Make sure we have an HTML document by querying for
    // IHTMLDocument2 (through a smart pointer)
    CComQIPtr<IHTMLDocument2, &IID_IHTMLDocument2> spHTML;
    spHTML = pDisp;
    HWND hwnd;
    // Extract the source code of the document
    if (spHTML)
    {
        // Get the BODY object
        hr = spHTML->get_body(&m_pBody);
        if (FAILED(hr))
            return hr;
        // Get the HTML text
        BSTR bstrHTMLText;
        hr = m_pBody->get_outerHTML(&bstrHTMLText);
        if (FAILED(hr))
            return hr;
        // Convert the text from Unicode to ANSI
        // (one extra character for the terminating null)
        LPTSTR psz = new TCHAR[SysStringLen(bstrHTMLText) + 1];
        lstrcpy(psz, OLE2T(bstrHTMLText));
        SysFreeString(bstrHTMLText);
        // Enable the controls
        hwnd = m_dlgCode.GetDlgItem(IDC_TEXT);
        EnableWindow(hwnd, TRUE);
        hwnd = m_dlgCode.GetDlgItem(IDC_APPLY);
        EnableWindow(hwnd, TRUE);
        // Set the text
        m_dlgCode.SetDlgItemText(IDC_TEXT, psz);
        delete [] psz;
    }
    else  // The document is not an HTML page
    {
        m_dlgCode.SetDlgItemText(IDC_TEXT, _T(""));
        hwnd = m_dlgCode.GetDlgItem(IDC_TEXT);
        EnableWindow(hwnd, FALSE);
        hwnd = m_dlgCode.GetDlgItem(IDC_APPLY);
        EnableWindow(hwnd, FALSE);
    }
    return S_OK;
}
Because this code runs in response to the DocumentComplete event notification, every new page is handled automatically and smoothly. The DHTML object model lets you modify the structure of a page at will, but such changes are lost as soon as you press F5 to refresh. You therefore also need to handle the DownloadComplete event to refresh the code window (note that DownloadComplete occurs before DocumentComplete). You should ignore the first DownloadComplete event for a page and react to it only when a refresh is performed. The Boolean member variable m_bDocumentCompleted distinguishes the two cases.
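The role of the flag is easier to see as a small state machine. The sketch below is portable C++ and purely illustrative: RefreshTracker and BrowserEvent are hypothetical names, and resetting the flag when a new navigation begins is an assumption, since the article does not show where m_bDocumentCompleted is cleared.

```cpp
#include <cassert>

// Hypothetical event names standing in for the browser's dispatch IDs.
enum class BrowserEvent { BeforeNavigate, DownloadComplete, DocumentComplete };

// Decides, event by event, whether the code window should be reloaded.
class RefreshTracker {
public:
    bool onEvent(BrowserEvent e) {
        switch (e) {
        case BrowserEvent::BeforeNavigate:
            // Assumption: a new navigation clears the flag.
            documentCompleted_ = false;
            return false;
        case BrowserEvent::DocumentComplete:
            // The page is ready: remember it and reload the code window.
            documentCompleted_ = true;
            return true;
        case BrowserEvent::DownloadComplete:
            // Before DocumentComplete this is the initial download (ignored);
            // afterwards it can only come from a refresh (reload).
            return documentCompleted_;
        }
        return false;
    }

private:
    bool documentCompleted_ = false;
};
```

Feeding it the sequence BeforeNavigate, DownloadComplete, DocumentComplete, DownloadComplete yields a reload only for the last two events: the page becoming ready, and the later refresh.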
XI. Managing the Code Window
The code window that displays the source of the current HTML page involves another basic ATL resource: the dialog box, which is found under the Miscellaneous tab of the ATL Object Wizard.
In response to the WM_INITDIALOG message, I resize the code window so that it occupies the lower part of the desktop, just above the taskbar. You can choose whether the window is shown when the browser starts. It is shown by default, but you can change this by clearing the "Show window at startup" check box. You can also close the window at any time and press F12 to bring it back. F12 is handled by the keyboard hook installed in SetSite(). The startup setting is stored in the Windows registry. To read and write the registry I chose the SHGetValue function, and its counterpart SHSetValue, from the shell library shlwapi.dll. They are much simpler to use than the Win32 functions whose names start with Reg. For example:
DWORD dwType, dwVal;
DWORD dwSize = sizeof(DWORD);
SHGetValue(HKEY_CURRENT_USER, _T("Software\\MSDN\\BHO"),
    _T("ShowWindowAtStartup"), &dwType, &dwVal, &dwSize);
shlwapi.dll was introduced with Internet Explorer 4.0 and the Active Desktop. It is a standard component of Windows 98 and later, so you can rely on it with confidence.
XII. Registering the BHO Object
A BHO is a COM server, so it must be registered both as a COM server and as a BHO. The ATL wizard generates the .rgs file automatically, so the first registration comes for free. The following .rgs fragment registers the object as a BHO (the CLSID is the one generated for this example).
HKLM {
 SOFTWARE {
  Microsoft {
   Windows {
    CurrentVersion {
     Explorer {
      'Browser Helper Objects' {
       ForceRemove {1E1B2879-88FF-11D2-8D96-D7ACAC95951F}
}}}}}}}
Note that the ForceRemove keyword causes the key on that line to be deleted when the object is unregistered. All BHOs are listed under the Browser Helper Objects key. The list of helpers is not cached, so installing and testing a BHO is quick.
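For reference, the nested script corresponds to a single registry path under HKEY_LOCAL_MACHINE, ending in the Browser Helper Objects key. The following portable C++ sketch just assembles that path for a given CLSID string; bhoRegistryKey is a hypothetical helper used for illustration, and it does not touch the registry.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the key path, relative to HKEY_LOCAL_MACHINE,
// under which a BHO with the given CLSID (in braces) is listed.
std::string bhoRegistryKey(const std::string& clsid) {
    assert(!clsid.empty() && clsid.front() == '{' && clsid.back() == '}');
    return "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Explorer\\"
           "Browser Helper Objects\\" + clsid;
}
```

Looking for this key with a registry editor is a quick way to check whether the registration step succeeded.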
Summary
This article has described BHOs, through which you can inject your own code into the browser's address space. All you have to do is write a COM server that supports the IObjectWithSite interface. From there, your BHO can pursue all sorts of legitimate goals inside the browser. The example in this article touches on COM events, the DHTML object model, and the WebBrowser programming interface. Although it covers a fair amount of ground, it shows how BHOs are used in the real world. For example, if you want to know what the browser is displaying, you need to understand how to receive events and be familiar with the WebBrowser control.
In addition, Windows Explorer also loads BHOs, which requires special attention when programming. The source code accompanying this article comes from MSDN. It was built and debugged under Windows 2000 with Visual C++ 6 (after compiling, restart Internet Explorer to see the result).