Quantitative Analysis
Parallel Processing
Numerical Analysis
C++ Multithreading
Python for Excel
Python Utilities
Services
Author

I. Motivation and design (PFE).
II. Installation of the Python for Excel.
III. Tutorial introduction into the Python for Excel.
IV. PFE Programmer's reference.
Index. Contents.

Motivation and design (PFE).


xcel is a habitual product and a common tool of business analysis. Python is a chosen language of open source community. The scope of libraries written for Python makes it a rival of Matlab. The need for combined use of both products is evident and received considerable attention already. There are several Python libraries for manipulation of Excel files, COM-based libraries for out-of-process control of Excel and XLL-based libraries for in-cell execution of Python functions. The present product was created to facilitate programming of interactive applications with Excel front end.

To see that such task is not directly feasible with traditional Python means consider the following Python script.

01\ import win32com.client as c

02\ app=c.Dispatch('Excel.Application')

03\ rng=app.Workbooks(1).Worksheets(1).Range('a1')

04\ rng.Value=1

Such script puts a value into a cell in some Excel workbook. It works nicely most of the time. However, consider what may happen if such script is actively used in an intense business environment. The user may have several third party Excel addins installed on his desktop and such addins may be responding to the changing market environment and actions of multiple users across company's network. The handler "rng" may become invalid as it passes into the operation "rng.Value=1" because the cell may be deleted or because the worksheet or the workbook may be closed.

Another example is an "if" statement.

01\ if rng.Value==1 :

02\ doSomething()

03\ else :

04\ doSomethingElse()

By the time "doSomething()" starts execution the rng.Value might change.

Similar considerations are common for textbooks on multithreaded programming. At this point the author would introduce the mutex concept. However, the Excel API provides no such concept. Instead, we introduce the Python for Excel.


Activation of PFE addin figure
Activation of PFE addin

When an Excel application starts, it scans Windows registry to obtain a list of *.xla and *.xlam add-in files. These files are loaded into the Excel process and may contain almost arbitrary VBA script. In particular, the VBA script may trap Workbook.Open event and, thus, instruct Excel to do something when the add-in is loaded. The Python for Excel inserts the otsPythonForExcel.xla(m) file into such process. The Open event handler inside the otsPythonForExcel add-in obtains the pointer to the Excel.Application object and passes it into the pfeAddin.dll (1), (see the picture ( Activation of PFE addin )). Such dll is an in-process C++ COM binary with an embedded Python engine (2). When the pfeAddin.dll receives the Excel.Application handler then it uses the Application's API to install listeners to all Excel events and to initialize Python access to the Excel object model (3). It consequently executes the Python script file pfeAddin_OnStart.py located in C:\Program Files\PythonForExcel\scripts directory (4). The pfeAddin_OnStart.py script builds the Python for Excel functionality.


Activation of pfeControlShell.exe
Activation of pfeControlShell.exe.

The construction above accomplishes in-process control of Excel on the main thread. However, it would not be of much use without debugging and experimentation facilities. In order to provide such facilities, the pfeAddin_OnStart.py script overrides the sys.stderr stream. When a Python exception is thrown then the sys.stderr receives data stream. When that happens the sys.stderr objects generates an out-of-process COM call into the pfeControlShell.exe binary (5), (see the figure ( Activation of pfeControlShell.exe )). The pfeControlShell.exe runs as a singular process and embeds a Python engine (6) that executes the pfeControlShell_init.py script (7). Such script builds the ControlShell GUI (8). The ControlShell runs several Python shells in a fully functional editor window and provides various convenient features such as calltips and call autocompletion. Each Excel instance that established a connection to the ControlShell is represented by a separate shell.

The presence of out-of-process ControlShell creates an unavoidable instability due to out-of-process communication with Excel. However, such instability occurs only at the debugging and experimentation stage. It has no bearing on the core programming.


Calling into Excel figure
Calling into Excel.

The ControlShell communicates with Excel instances via out-of-process COM protocol. Nevertheless, the commands sent from the ControlShell into Excel are executed in-process and on Excel's main thread. This is accomplished as follows. When the sys.stderr object (9), (see the figure ( Calling into Excel )) generates its first call into the pfeControlShell.exe then it passes the Application handler (10). The pfeControlShell.exe redirects (11) the received Application handler into a newly created process, pfeExcelPoker.exe (12). Such process stores the Application handler (13) and is responsible for out-of-process calling into Excel. When the user types a command into the ControlShell (14) and presses <Enter> then the command is stored (15) into a buffer (16) inside pfeControlShell. Consequently, the ControlShell signals (17) to the pfeExcelPoker process that it is time to "poke" the Excel instance. The pfeExcelPoker generates an out-of-process call (18) designed to create a specific event detectable by the Excel object model. The pfeAddin.dll is listening to such event. When the event is generated then the Excel's main thread enters the pfeAddin.dll. The pfeAddin.dll extracts commands from ControlShell buffer (16) and passes it (19) into the embedded Python engine (20). The sys.stdout object (21) is overridden by the pfeAddin_OnStart.py script. For this reason any output generated by the command is communicated (22) to the pfeControlShell.exe and displayed on the GUI.

The pfeExcelPoker.exe exists because the COM protocol forbids calling an IDispatch pointer from a thread that is different from a thread to which the IDispatch pointer was marshalled. Another restriction is that IDispatch always marshals into the main thread of the process. The main thread is the thread that runs the Windows message loop. Hence, the Application handler (which is the pointer to IDispatch) cannot reside in the pfeControlShell.exe. It would block the process every time the Excel takes time to respond to a call via the Application handler. The Application handler has to reside in a separate process.





Index. Contents.


















Copyright 2007