Coding With SpiderMonkey

Originally published June 5th, 2006 by

Keith M.

Part I - Getting Started

So, I'm working on learning how to use the SpiderMonkey engine in my C programs to provide scripting support. I actually do have an overall goal that I would like to achieve from learning this (and a few other things) but that's not something I want to share...just yet anyway.

As I work, I want to use this blog as a sort of record-keeping device. I'll blog about what I've learned, what I think I know but am not sure of, what I'm having problems with, and so on. Perhaps someone else can make use of this information in their SpiderMonkey (or maybe just regular C) adventures.

Well then, let's get started with some information. I messed with the code a while ago but only recently started up again, so I had to re-learn some of the stuff I was working on. Last night I managed to get the engine set up and accepting a file to parse as the command-line argument. I provide one custom object right now, which is the file object. It allows a script to write to files, or stdout. No reading support yet, or custom modes or anything, just plain, simple writing.

I haven't bothered to design any sort of API for the file object (or any other object) up front; instead I'm just adding methods as I think of them and coding the behavior in some way that works and (to me) makes sense. To express the APIs that exist, I'll use C++-like (not real) class definitions. The API for the file object currently looks like this:

class File {
    public:
        File(string name);
        bool open();
        int write(string text);
        bool close();

        static bool open();
        static int write(string text);
        static bool close();
};

So, essentially, there is one form of constructor used to instantiate a new object, and there are three methods: open, write, and close. Each can be called as a method on an instantiated object, or statically (in which case it affects stdout). Here's a short example script showing how they can be used:

var out = new File("outfile");
if (out.open()){
    out.write("Todays date is: "+(new Date()).toGMTString());
    out.close();
    File.write("./outfile created, contains todays date.");
}
else {
    File.write("./outfile could not be opened");
}

Here's the output from running that code through my engine:

$./engine test.js
./outfile created, contains todays date.
$cat outfile
Todays date is: Tue, 06 Jun 2006 23:18:56 GMT

--or, if ./outfile can't be opened--

$./engine test.js
./outfile could not be opened

So, how did I get that working you may ask? Well, "simple!" I will say, though it did take a while to actually figure it out. Let me start this blog series off from the very beginning. In this post, I'll cover compiling SpiderMonkey on a Linux system, and then I'll cover making an extremely basic program that uses the JS engine, but does not actually do anything other than start it up and then shut it down.

Compiling SpiderMonkey

Compiling SpiderMonkey on Linux is really pretty straightforward. It's just a matter of running their makefile using GNU Make, or probably any decent substitute, though I haven't tried any others myself. Here are step-by-step instructions for what I did to build SpiderMonkey and then make it usable.

1. Obtain the SpiderMonkey source code, either from the official site or by downloading the version I'm using as I write this from the link below.
2. Extract js-1.5.tar.gz to a directory somewhere. It will create a js/ subdirectory.
3. Run cd js/src/ to enter the source code directory.
4. Run make -f Makefile.ref to build the source. Notice you have to use the Makefile.ref file, not the regular Makefile. The regular Makefile is configured to build SpiderMonkey as the embedded engine for the browser. The Makefile.ref file will build the standalone library that you can use.

Update 11/4/2006 - I rebuilt my computer and installed Ubuntu Edgy Eft (6.10). I tried installing SpiderMonkey by following my own compile instructions here, but I ended up with these compile errors:

make[1]: Circular jscpucfg.h <- Linux_All_DBG.OBJ/jsautocfg.h dependency dropped.
gcc -o Linux_All_DBG.OBJ/jscpucfg.o -c -Wall -Wno-format -DGCC_OPT_BUG -g -DXP_UNIX -DSVR4 -DSYSV -D_BSD_SOURCE -DPOSIX_SOURCE -DHAVE_LOCALTIME_R -DX86_LINUX  -DDEBUG -DDEBUG_kicken -DEDITLINE -ILinux_All_DBG.OBJ  jscpucfg.c
jscpucfg.c:376: internal compiler error: in dwarf2out_finish, at dwarf2out.c:14129
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
For Debian GNU/Linux specific bug reporting instructions,
see <URL:file:///usr/share/doc/gcc-4.1/README.Bugs>.
Preprocessed source stored into /tmp/cc8xj9FF.out file, please attach this to your bugreport.
make[1]: *** [Linux_All_DBG.OBJ/jscpucfg.o] Error 1

It turns out these errors were caused by the -g debugging flag when compiling jscpucfg. I created a patch file which removes this flag and fixes the build process. Please apply the patch if you are experiencing the same problem.

When the build is complete, look at the last few lines of the output from make. You will probably see something similar to this:
```
make[1]: `Linux_All_DBG.OBJ/jsautocfg.h' is up to date.
make[1]: `Linux_All_DBG.OBJ/jscpucfg' is up to date.
make[1]: `Linux_All_DBG.OBJ/jscpucfg.o' is up to date.
make[1]: Nothing to be done for `Linux_All_DBG.OBJ/jsmathtemp.o'.
```
That means the build was successful, and that the libjs.so file is stored in the Linux_All_DBG.OBJ sub-folder. This folder name varies by platform/configuration (see the config/ directory for other names).
Now, if you want, you can just reference the source directory and that object directory in your compiler flags when building your program. I, however, chose to put the header files into /usr/local/include/spidermonkey and the libjs.so file into /usr/local/lib. To do that, issue the following commands (as a user with appropriate privileges, such as root):
```
cp Linux_All_DBG.OBJ/libjs.so /usr/local/lib
mkdir -p /usr/local/include/spidermonkey
cp *.h Linux_All_DBG.OBJ/jsautocfg.h *.msg /usr/local/include/spidermonkey
```
Update 11/04/2006 - I've been informed that jsautocfg.h needs to be copied over too. I must have already done this when I wrote the article and missed it when recalling the necessary steps. The command above has been updated to include this file in the copy.

Once complete, you will have SpiderMonkey successfully "installed" and ready to work.

Touched by the spider(monkey)

So, now that SpiderMonkey is installed, you get to create your first program using it. To start, let's do something really, really, really simple. That means a program that does not execute any script files. All it does is start up the engine, print a message, and then shut down. This will ensure that you have all the proper compile flags by introducing enough of the engine that you actually need to link with it, while leaving enough out that you should not be overrun with possible errors in the code.

The first step to setting up the JS engine is to #include the jsapi.h header file. This file declares all the public API functions for the SpiderMonkey engine. The basic rule of thumb, from what I understand, is that if it's declared in this file, it's safe to use. If for some reason you ever find that you need to dig into a different header file, you probably should not be using whatever it is you found there. Or you need to file a bug report so that the developers can work on providing the necessary functionality via a public API. Don't be eager to file bugs though; make sure you actually need to do what you think you need to do. There may be another way.

The second step is to set up a new runtime environment for the JS engine through a call to JS_NewRuntime. Its maxbytes parameter specifies how much memory the runtime will let fill up before it forces a call to the garbage collector to free old unused objects. I don't think this is an absolute cap on the memory, and I don't think it's required that this amount fills up before the GC runs, but I don't know the specific details. This is just my interpretation of the docs.

Once you have a runtime initialized with JS_NewRuntime(), you next have to make a context using JS_NewContext. You can think of the context as the actual engine if you want. A script is executed within a specific context, and the context is where the script's variables, objects, and such are stored. That's how I understand it to work anyway. The context is created within the specified runtime, rt, and will have a stack size of stacksize bytes. The stack size is not tied to the runtime's maxbytes parameter, and in practice it is much smaller: Mozilla's embedding examples pass 8192 for the stack size against a maxbytes of several megabytes.

Next, you need to create a global object. All objects created will be a child of this global object. The global object isn't directly available to the scripts you execute, but it could be made available if desired (which I did, to emulate `window' in browser-based code, but won't cover here). Before you can create this global object however, you must define it.

To define objects in SpiderMonkey, you use the JSClass structure. It's a pretty good-sized structure with quite a few fields you need to set up. Have a look at the documentation page for the full details. From what I understand of the process, and from what I've seen looking at the SpiderMonkey code, the easiest way to manage these classes seems to be just giving them unique names and declaring them as global variables in their respective files, then using an extern statement in the header file to let any other project files know about them. This is the method I will be using for now. If I come up with anything better later, that will be another post.

Now, the good thing about this structure is that for most of the required fields, SpiderMonkey provides stub functions that you can use as defaults. The optional fields can all be specified as NULL or 0. For the global object, it works nicely to just use all the provided defaults and set all the "optionals" to null, so that is what I will do. Here is the code for my global object:

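JSClass js_global_object_class = {
    "System",                     /* name */
    0,                            /* flags */
    JS_PropertyStub,              /* addProperty */
    JS_PropertyStub,              /* delProperty */
    JS_PropertyStub,              /* getProperty */
    JS_PropertyStub,              /* setProperty */
    JS_EnumerateStub,             /* enumerate */
    JS_ResolveStub,               /* resolve */
    JS_ConvertStub,               /* convert */
    JS_FinalizeStub,              /* finalize */
    JSCLASS_NO_OPTIONAL_MEMBERS   /* all optional members left empty */
};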
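Following the extern approach described above, a header shared by the other project files might look something like this. The file name globals.h and the include guard name are hypothetical; only js_global_object_class comes from the code above.

```c
/* globals.h -- hypothetical header name, for illustration only */
#ifndef ENGINE_GLOBALS_H
#define ENGINE_GLOBALS_H

#include <jsapi.h>   /* needs XP_UNIX defined, e.g. via -DXP_UNIX */

/* Defined once in a .c file; declared here so other files can refer to it. */
extern JSClass js_global_object_class;

#endif /* ENGINE_GLOBALS_H */
```

Any source file that needs the class then includes this header instead of repeating the definition.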

So now that we have our global object's class defined, we can create the object based on this definition. To create the global object, you use the JS_NewObject function. There are other functions to create objects as well, but they are for use later, when you are defining objects that you want your scripts to be able to use. Its parameter list is short, so creating an object is fairly simple. You tell it what context to create the object in by passing the JSContext variable we created above as the cx parameter. Next you pass a pointer to the class definition that we created above as the clasp parameter. The last two parameters, proto and parent, can be specified as NULL for this case. I'm not well versed in what these parameters are really for, but what I understand from the documentation and experience is that the proto (prototype) parameter is used as a base object for the new object. The object will inherit the members that are defined on the prototype object, and the new object's .__proto__ property will point to the prototype object. The parent argument specifies which object the new object will be a child of; it sets the .__parent__ property of the new object to the passed-in parent. I'm guessing that this also means if the parent object gets collected by the garbage collector, that the child object will also be collected.

And finally, the last step in initializing the JS engine is to initialize all the standard, built-in JavaScript classes such as Date, Math, and String. You do this using the JS_InitStandardClasses function. You pass it the context to use, via the cx parameter, and the global object, via the obj parameter. This function is what actually sets obj as the global object. Prior to this, the object is the same as any ordinary object you created. If, for some reason, you wanted to use the JS engine without defining the standard classes, you could set your global object using the JS_SetGlobalObject function, but I cannot see any point in doing this right now.

Yay! Now the JS engine is up and running. So now what? Ordinarily now you would go through defining your own application-specific objects, like my file object outlined above, and proceed to execute scripts. However, to keep this simple we will do nothing. All I'm going to do is check to see if a file was passed on the command line. If one was, I'll show a message stating that we would run the file. If not, I'll show the usage text for the program.

if (fileToRun){
    printf("At this point, we would run the file %s if we were developed more.\n", fileToRun);
}
else {
    printf("Usage: %s file\n", argv[0]);
}
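The snippet above assumes fileToRun has already been set. One simple way to derive it from the command line, sketched here with a hypothetical helper named script_from_args, is to take the first argument if there is one:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the first command-line argument, or NULL
 * when the program was started without one.
 */
const char *script_from_args(int argc, char **argv)
{
    assert(argv != NULL);
    return (argc > 1) ? argv[1] : NULL;
}
```

In main(), the result would be stored before the check above, for example const char *fileToRun = script_from_args(argc, argv);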

Now, as with most programs, the JavaScript engine has to be properly shut down when you are done with it. This will typically be at program exit, but not necessarily, depending upon your program's design. Engine shutdown is a pretty simple process. There are two functions involved: one to destroy the contexts that you have created, and one to destroy the runtime. To destroy a context, you use the JS_DestroyContext function, calling it once for each context you created. To destroy the runtime, you use the JS_DestroyRuntime function. Pretty simple, huh?

Compiling your beast

Yay! Now we should have a program that is ready to be compiled and run. Embedding the SpiderMonkey engine does make compiling a little more difficult than the simple gcc -ofile file.c command line that some people may be used to. Don't panic though; it's not that much more difficult. Basically all you need to do is use the -I and -L flags to tell the preprocessor where to find the header files and the linker where to find the library files. Then you use the -ljs flag to tell the linker it needs to include libjs.so in the linking process.

You also have to define a constant for your platform. You can do this using either the -D command line switch, or using a #define directive directly in the source code, prior to the #include <jsapi.h> directive. I have opted to do the former. There are a number of platform constants from which to choose. For a Unix/Linux platform the constant to use is XP_UNIX. For other platforms you need to choose one of the following: XP_BEOS, XP_MAC, XP_OS2, or XP_WIN. Should you forget to define this constant, you will get many error messages from the files regarding undefined objects, parser errors, warnings, etc. At the top of the output though, you should see these lines:

In file included from /usr/local/include/spidermonkey/jspubtd.h:45,
                 from /usr/local/include/spidermonkey/jsapi.h:47,
                 from engine.c:2:
/usr/local/include/spidermonkey/jstypes.h:224:6: #error "Must define one of XP_BEOS, XP_MAC, XP_OS2, XP_WIN or XP_UNIX"
/usr/local/include/spidermonkey/jstypes.h:240:2: #error No suitable type for JSInt8/JSUint8
/usr/local/include/spidermonkey/jstypes.h:253:2: #error No suitable type for JSInt16/JSUint16
/usr/local/include/spidermonkey/jstypes.h:273:2: #error No suitable type for JSInt32/JSUint32
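If you would rather take the #define route mentioned above, a minimal sketch looks like this. The #ifndef guard is an addition of this sketch, so that also passing -DXP_UNIX on the command line does not trigger a redefinition warning.

```c
/* Define the platform constant before the first SpiderMonkey include. */
#ifndef XP_UNIX
#define XP_UNIX
#endif

#include <jsapi.h>
```

The define has to appear in every source file that includes jsapi.h, which is one reason the command-line switch is less work.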

For reference, below is the command line I used to compile the engine, and a summary of the options I used and why I used them.

$gcc -I/usr/local/include/spidermonkey -L/usr/local/lib -ljs -DXP_UNIX -O0 -g \
-Wall -pedantic -Wno-long-long -o engine engine.c

-I/usr/local/include/spidermonkey -- Adds the path /usr/local/include/spidermonkey to the C preprocessor's search path. This way it knows where to find the jsapi.h file.
-L/usr/local/lib -- Adds the path /usr/local/lib to the linker's search path. This is needed so that the linker can find the libjs.so file.
-ljs -- Tells the linker that, in addition to the standard libraries, it needs to link against libjs.so.
-DXP_UNIX -- Tells the compiler to define the constant XP_UNIX, which tells SpiderMonkey we are on a UNIX or UNIX-like (Linux/BSD) operating system.
-O0 -- Disables optimizations made by the compiler. I do this because I find that when optimizations are enabled, debugging with GDB often becomes nearly impossible. GDB can no longer reliably associate the code being executed with a line of the source file and will often do mini loops through the lines, cycling between two lines for a while, then advancing a line and repeating.
-g -- Enables debugging symbols to make debugging with GDB easier.
-Wall -- Enables a large set of common warning messages in order to help highlight possible problems.
-pedantic -- Enables even stricter checking. Highlights things such as using a double-slash (//) comment in C. That style of commenting came from C++ and was not part of C until C99, though GCC still accepts it in C files.
-Wno-long-long -- Disables the warning about 'long long' not being supported in ISO C90. I have yet to determine if this causes any actual problems, but so far everything is working, so I am just ignoring it.
-o engine -- Sets the output file to be engine.
    Update 11/6/2006 - I used to show this flag as -oengine, all one word without a space. It would appear that some versions of gcc do not like this format, though. Since the version I use accepts the flag with a space as well, I will include the space to be more compatible.
engine.c -- The input file.

Well, that's it. You should now have a program which successfully initializes the SpiderMonkey engine, prints a message, then shuts down the engine. This is only the beginning of far greater things. Enjoy!

Attachments

js-1.5.tar.gz -- Here's the 1.5 version of the SpiderMonkey code, straight from the Mozilla servers. This is the version I'm using as I write this post.
engine.c -- This is the program I created for this post. All this program does is start up the engine, print out a message about how it can't actually do anything, then shut down the engine.
ubuntu-config.mk.patch -- A patch for Ubuntu Edgy Eft systems which disables the -g flag when compiling, allowing the compilation to complete.