xcel is a habitual product and a common tool of business analysis. Python is
a chosen language of open source community. The scope of libraries written for
Python makes it a rival of Matlab. The need for combined use of both products
is evident and received considerable attention already. There are several
Python libraries for manipulation of Excel files, COM-based libraries for
out-of-process control of Excel and XLL-based libraries for in-cell execution
of Python functions. The present product was created to facilitate programming
of interactive applications with Excel front end.
To see that such task is not directly feasible with traditional Python means
consider the following Python script.
01\
import
win32com.client as c
02\
app=c.Dispatch('Excel.Application')
03\
rng=app.Workbooks(1).Worksheets(1).Range('a1')
04\
rng.Value=1
Such script puts a value into a cell in some Excel workbook. It works nicely
most of the time. However, consider what may happen if such script is actively
used in an intense business environment. The user may have several third party
Excel addins installed on his desktop and such addins may be responding to the
changing market environment and actions of multiple users across company's
network. The handler "rng" may become invalid as it passes into the operation
"rng.Value=1" because the cell may be deleted or because the worksheet or the
workbook may be closed.
Another example is an "if" statement.
01\
if
rng.Value==1 :
02\
doSomething()
03\
else
:
04\
doSomethingElse()
By the time "doSomething()" starts execution the rng.Value might change.
Similar considerations are common for textbooks on multithreaded programming.
At this point the author would introduce the mutex concept. However, the Excel
API provides no such concept. Instead, we introduce the Python for Excel.
Activation of PFE
addin
|
When an Excel application starts, it scans Windows registry to obtain a list
of *.xla and *.xlam add-in files. These files are loaded into the Excel
process and may contain almost arbitrary VBA script. In particular, the VBA
script may trap Workbook.Open event and, thus, instruct Excel to do something
when the add-in is loaded. The Python for Excel inserts the
otsPythonForExcel.xla(m) file into such process. The Open event handler inside
the otsPythonForExcel add-in obtains the pointer to the Excel.Application
object and passes it into the pfeAddin.dll (1), (see the picture
(
Activation of PFE addin
)).
Such dll is an in-process C++ COM binary with an embedded Python engine (2).
When the pfeAddin.dll receives the Excel.Application handler then it uses the
Application's API to install listeners to all Excel events and to initialize
Python access to the Excel object model (3). It consequently executes the
Python script file pfeAddin_OnStart.py located in C:\Program
Files\PythonForExcel\scripts directory (4). The pfeAddin_OnStart.py script
builds the Python for Excel functionality.
Activation of
pfeControlShell.exe.
|
The construction above accomplishes in-process control of Excel on the main
thread. However, it would not be of much use without debugging and
experimentation facilities. In order to provide such facilities, the
pfeAddin_OnStart.py script overrides the sys.stderr stream. When a Python
exception is thrown then the sys.stderr receives data stream. When that
happens the sys.stderr objects generates an out-of-process COM call into the
pfeControlShell.exe binary (5), (see the figure
(
Activation of
pfeControlShell.exe
)). The pfeControlShell.exe runs as a singular process
and embeds a Python engine (6) that executes the pfeControlShell_init.py
script (7). Such script builds the ControlShell GUI (8). The ControlShell runs
several Python shells in a fully functional editor window and provides various
convenient features such as calltips and call autocompletion. Each Excel
instance that established a connection to the ControlShell is represented by a
separate shell.
The presence of out-of-process ControlShell creates an unavoidable instability
due to out-of-process communication with Excel. However, such instability
occurs only at the debugging and experimentation stage. It has no bearing on
the core programming.
Calling into Excel.
|
The ControlShell communicates with Excel instances via out-of-process COM
protocol. Nevertheless, the commands sent from the ControlShell into Excel are
executed in-process and on Excel's main thread. This is accomplished as
follows. When the sys.stderr object (9), (see the figure
(
Calling into Excel
)) generates its
first call into the pfeControlShell.exe then it passes the Application handler
(10). The pfeControlShell.exe redirects (11) the received Application handler
into a newly created process, pfeExcelPoker.exe (12). Such process stores the
Application handler (13) and is responsible for out-of-process calling into
Excel. When the user types a command into the ControlShell (14) and presses
<Enter> then the command is stored (15) into a buffer (16) inside
pfeControlShell. Consequently, the ControlShell signals (17) to the
pfeExcelPoker process that it is time to "poke" the Excel instance. The
pfeExcelPoker generates an out-of-process call (18) designed to create a
specific event detectable by the Excel object model. The pfeAddin.dll is
listening to such event. When the event is generated then the Excel's main
thread enters the pfeAddin.dll. The pfeAddin.dll extracts commands from
ControlShell buffer (16) and passes it (19) into the embedded Python engine
(20). The sys.stdout object (21) is overridden by the pfeAddin_OnStart.py
script. For this reason any output generated by the command is communicated
(22) to the pfeControlShell.exe and displayed on the GUI.
The pfeExcelPoker.exe exists because the COM protocol forbids calling an
IDispatch pointer from a thread that is different from a thread to which the
IDispatch pointer was marshalled. Another restriction is that IDispatch always
marshals into the main thread of the process. The main thread is the thread
that runs the Windows message loop. Hence, the Application handler (which is
the pointer to IDispatch) cannot reside in the pfeControlShell.exe. It would
block the process every time the Excel takes time to respond to a call via the
Application handler. The Application handler has to reside in a separate
process.
|