PHP

10 Jun 2005

C API documentation

I love good documentation, but do not love writing or maintaining it. I'm not talking about the nice docbook manuals in PEAR or PHP, but good old reference manuals for C APIs.

As anyone who has worked with gtk or gnome will tell you, the semi-automated API docs that are generated for things like gtk and gnome-db are a godsend. They frequently make the difference between taking 3 hours to code something up and taking a few days. This was one of the reasons why writing DBDO was not too complex.

However, the downside is that DBDO interacts with two APIs, gnome-db and PHP. On one side, there are detailed API documents, along with a highly structured design (gobject). On the other is an organically grown API, which has evolved relatively undocumented (except for a few articles and the extension guide in the PHP manual).

As DBDO has reached a point where it implements the basic functionality, it has become quite clear that building it (or more specifically the libraries that it depends on) is extremely complex. And while it is incredibly featureful and easy to code against, its adoption is always going to be affected by this barrier to entry (including my enthusiasm to set it up on my clients' boxes). So as a parallel effort I've been exploring using PDO as the backend for DBDO.

The Itch
My work on these projects is usually restricted to a couple of evenings a week, as it's not exactly fee paying, and I'm often busy fighting bugs or other workload during the day. It had become quite clear in doing this that the complexity of both the PHP and PDO APIs, along with the lack of documentation, was leading to a situation where I was spending at least half of my time looking through lxr.php.net, working out which bit of the API I needed to use for each task.

So in a fit of frustration, I started looking at both documenting and simplifying the APIs (initially of PDO, and then wandering off to consider PHP).

Documentation generation
Looking at the first of these two problems, documentation, it's clear that gnome/gtk's way of documenting APIs is very effective. It's quite easy to look at any gtk project, understand the underlying ideas, and locate the methods that are likely to be the best match, just by browsing through the documents (although images of the widgets would frequently be nice..)

Taking gnome-db as an example, it uses 'gtk-doc' to parse the .h files (using perl), reading a few tags. It then merges this with docbook templates that have placeholders for the API details (like synopsis etc.), and uses a docbook tool to actually render the result to HTML (or other formats as required).

While this works really well, it brings back the thing I started off by complaining about: the need to actually 'love and care' for the generation of API docs. While it's a great idea, the reality is that most of the people capable of documenting the internals of PDO or PHP would much rather be doing far more interesting things, especially if they are not getting paid for it.

It's also pretty clear that a majority of users really only need the HTML output these days. While the other formats are nice, C API documents are not exactly masterpieces which are flying off the shelf of your local bookshop. So the value of using docbook is questionable in relation to the time and effort required to deliver a solution like this.

Simplicity of API doc parsers
The gtk-doc toolkit started off as a very simple set of perl scripts to parse a .h file. However, like javadoc and phpdocumentor, it soon runs into the problem that parsing structured information from @tags can be both complex and cumbersome. For something as industrial as documenting C APIs, I began to wonder if ruling out the majority of this complexity was perhaps a good idea. A 'let's get down to basics' approach seems like it would be more suited to the situation.

To this end, I tried out having @blocks that only had a key and a value. The value was just text, and was never intended to be processed in any depth. I came up with a simple comment block to be prefixed to the definition of a function/struct/enum.
/**
* @function the_name_of_the_function
* this is a function.....
*/

/**
* @enum the_name_of_the_enum
* this is an enum.....
*/

/**
* @struct the_name_of_the_Struct
* some comments..
*/

These would only be applied to a piece of code that was in need of documentation, hence instilling the idea of minimal impact, high return.

It did not take too much to use a line by line parser to find these blocks, and store the data, along with the following definition, ready to be rendered later.
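As a rough illustration of the line by line approach, here is a minimal sketch in PHP (the post does not say what the real parser was written in). The function name parse_doc_blocks, the returned array layout, and the rule that a definition runs until the next blank line are all assumptions made for this example.

```php
<?php
// Hypothetical sketch: collect "/** @tag name ... */" blocks from C header
// text, plus the definition lines that follow each block. A definition is
// assumed to end at the next blank line (or at the next comment block).
function parse_doc_blocks(string $source): array
{
    $blocks  = [];
    $current = null;    // the block being built, or null
    $state   = 'code';  // 'code', 'comment' or 'definition'

    foreach (preg_split('/\r?\n/', $source) as $line) {
        $trimmed = trim($line);

        if ($state === 'comment') {
            if ($trimmed === '*/') {
                $state = 'definition';
                continue;
            }
            // strip the leading "*" decoration and surrounding spaces
            $body = ltrim(ltrim($trimmed, '*'));
            if ($current['tag'] === null && preg_match('/^@(\w+)\s+(.*)$/', $body, $m)) {
                // first tag line: a key and a value, nothing deeper
                $current['tag']  = $m[1];
                $current['name'] = trim($m[2]);
            } else {
                $current['text'][] = $body;
            }
            continue;
        }

        if ($trimmed === '/**') {
            // a new comment block closes any definition still in progress
            if ($current !== null) {
                $blocks[] = $current;
            }
            $current = ['tag' => null, 'name' => null, 'text' => [], 'definition' => []];
            $state   = 'comment';
        } elseif ($state === 'definition') {
            if ($trimmed === '') {
                $blocks[] = $current;
                $current  = null;
                $state    = 'code';
            } else {
                $current['definition'][] = $line;
            }
        }
    }

    if ($current !== null) {
        $blocks[] = $current;
    }
    return $blocks;
}
```

Each stored block then carries its tag, its name, the free text, and the raw definition lines, which is all a renderer needs to build a synopsis later.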

With the addition of a few more very simple tags that help to structure the flow of the resulting output, I was able to generate some simple documentation.
  • @page The title of the page or resulting page for the document... (followed by free text comments)
  • @class ClassName (an abstract name to group sets of functions together)
  • @include filename.h (to include another .h file to build up more complex documents)
Along with this, allowing some HTML tags within the body of the comment enables a bit of formatting, to make things a little clearer.

Documenting API's can show their less elegant side
It did not take me long to realize, when documenting PDO's API, that although its design is pretty sensible, it was not really designed with exposing a public API in mind. The drivers usually provide a struct with function pointers, which, while being the classic way of doing this, is rather complex to document and illustrate in API docs.

To solve this, I ended up creating a functional wrapper around a lot of these pointer method calls, following the general pattern of gobject type classes.

The resulting document can be seen here (while the link still works..)

At present, the current concept still has a few problems which either are solved, or will be solved.
  • requires the reformatting of the definition so all the spaces line up on the rendered output. (I think this should be solvable with some simple parsing, and padding of the definition)
  • requires the correct ordering of elements (eg. structs before methods that use them). Again, @include and @class should solve a lot of this, along with doing some auto re-ordering (eg. structs/enums before functions)
  • requires @class to be in a separate comment block and used before an @function comment - again, this should be a simple fix to allow you to specify which @class a @function belongs to.
  • requires you to actually define real functions for things like #define'd functions - usually with #if 0 wrapped around them. An @def tag should solve this, allowing you to comment the synopsis for a macro.
As PDO is at quite an early stage, and Wez is quite interested in the research, it looks like PDO may get a nice internal API, so you can write your own php extensions using databases easily. I wonder how complex it would be to introduce this as a comment standard for the rest of PHP, as I already started playing with here..

Posted by in PHP | Add / View Comments()

31 May 2005

FlexySvn packaged and released

I finally got round to packaging up the XUL based Subversion browser, so you can try it out on your own servers..

The first release effort is here : FlexySvn-0.1

Dependencies / Installation Instructions.
  • PHP5 (5.0.4 or CVS snapshot) (I might try doing a PHP4 version later.. - but you can always reverse proxy and run PHP5+apache2 on another port..)

  • The svn extension
    #pecl install -f svn

  • The colorer library and extension
    Download and build the colorer library Colorer-take5-linux.beta4.tar.bz2
    (good ole ./configure;make;make install;)
    Then install the colorer extension:
    #pecl install -f colorer

  • Download the tarball, and 'tar xvzf' into your web folder.

  • copy svn.php to index.php (so your config/bootstrap file will not be overwritten if I ever release another version)

  • edit the index.php - it's commented and pretty simple..

  • Visit the url (and hope it works)
The test machine I did all this on unfortunately had a lot of problems with apache2/php5 head and 5.0.4. Basically it segfaulted with all php calls.. - this was fixed with a small horrible hack.

Feel free to add comments to this post if you come across problems or email me fixes if you can think of any..

Note:
Apache must be configured with
AcceptPathInfo On
And PHP must be configured with
magic_quotes_gpc=Off
This can be done in a .htaccess file if necessary.



28 May 2005

Of Silly Acronyms and XUL World domination...

The highlight of this week was Sean's post describing how to use a Firefox extension to hide posts on planet-php. While I don't really agree with him that the ability to hide stuff is useful (There's a scroll button on the right of the browser baby...!), it would be far better if it would replace the current moronic acronyms being used by the PHP community...

AJAX = "using XMLhttpRequest"
Rest = "using Plain old POST and GET!"

It's become like reading badly written code. The author makes up short names for everything, then 3 years later, it's gone out of fashion, and no-one has a clue what they were talking about (I clean floors with Ajax!)

I've been using XMLhttpRequest for quite a while with Plain old POST and GET to send and receive data for XUL applications. Occasionally this spreads back to plain old HTML, doing things like auto address filling and postcode retrieval.

In comparison to the complexity of SOAP or XMLRPC, 95% of web calls do nothing more than send some data, and receive some data.. - which amazingly enough is what POST and GET do...
Sending data via a standard HTTP request, just like an HTML form does, involves libraries which are much smaller than SOAP/XMLRPC.. and receiving data back can be flexible: a simple function may return a number or ERROR:......, a more complex one a simple XML document... - the great thing about these solutions is they are a breeze to debug.. - no more hunting down why SOAP types don't match on your client and app server...
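To make the 'a number or ERROR:......' convention concrete, here is a minimal sketch in PHP. The function names, the 'item' parameter and the stock array are all invented for illustration; the only thing taken from the text above is the idea that a plain reply is either a value or a line starting with ERROR:.

```php
<?php
// Hypothetical server side: turn a plain POST/GET parameter array into a
// one-line text reply, either a value or "ERROR: reason".
function stock_level_reply(array $params, array $stock): string
{
    if (!isset($params['item']) || !is_string($params['item']) || $params['item'] === '') {
        return 'ERROR: missing item';
    }
    $item = $params['item'];
    if (!array_key_exists($item, $stock)) {
        return 'ERROR: unknown item ' . $item;
    }
    return (string) $stock[$item];
}

// Hypothetical client side helper: split such a reply into [ok, payload].
function parse_plain_reply(string $reply): array
{
    if (strncmp($reply, 'ERROR:', 6) === 0) {
        return [false, trim(substr($reply, 6))];
    }
    return [true, $reply];
}
```

In a real page the server side would boil down to something like echo stock_level_reply($_POST, $stock); and debugging it is a matter of looking at the reply text in the browser.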

Unfortunately, as I have so painfully found, Javascript on IE is practically unusable, a place where undocumented, unexpected behaviour rules! (hint: try grabbing and setting the class name of an html element). This unfortunately relegates using XMLhttpRequest to nothing more than 'pretty add-ons', unless you are prepared to invest a considerable amount of time working around IE's bugs. Often it's easier to go the simpler route, and ban IE.

This week also brought up an interesting discussion at the office, on XUL and IE. I suggested that during the porting of some of the ASP(.net) applications to PHP, we also migrate them from (IE)HTML to XUL. This did bring up some questions like
  • what if the user is in an internet cafe and needs to use the applications?
  • do they have to install Firefox?
In reality they are ridiculous questions. Say bye-bye to security if you expect people to use them from an internet cafe... (even though Firefox has had a few minor security issues, none have been actively exploited yet, unlike its competition.) So introducing Firefox to an office saves time rather than adding it.. I also saw that someone had written a XUL ActiveX component for IE.. (so there is potential for the mentally challenged corporate types..)

But it did bring me on to thinking that there are still some sites out there that only work in IE! (HKMC is my latest example of idiots in Hong Kong - click the 'Check your eligibility' button).
So why not start creating Mozilla only sites (preferably using XUL).. If the big names - planet-php / artima / etc. - stopped supporting plain old HTML, IE would disappear to a final and well deserved death... (and given a 15 year track record, I'd be amazed if IE7 was any better).

So beware, this blog may turn XUL only one day....... on our murder IE campaign....



04 May 2005

FlexySvn: A subversion browser using the php svn bindings

As I mentioned in the update to my last post about the svn bindings for php, Wez has gone off like a rocket adding almost every feature under the sun to the extension. Authentication, repository creation, diffs etc. all appear to be working. So in another fit of itch scratching, I started hacking on a Subversion browser, using the bindings to blow away the rather staid and dull efforts that have gone before...

With a little XUL magic, and some clever tricks with xmlhttprequest, I threw most of the interface and application together in 2 evenings, proving what I always suspected: XUL/JS/PHP is going to be a pretty hard combo to beat in terms of rapid development and delivery.

It has a large feature set already, and probably does as much, if not more, than most of the alternatives out there.

FlexySvn is still being hacked at, although you should be able to get an idea of how it works from looking at the XUL Template, and the Page's action class. A downloadable version should be available after I go back and make a few tweaks to the svn bindings so the return values are more sensible. eg. using common names for log and ls details like
  • rev = the revision
  • time = a unix utime() number
  • datetime = an ISO short formatted version of the time (eg. 2001-01-01 10:30:12)
  • author (rather than modified_by or other names.)
  • ... others.. that need some thought....
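A small sketch of what such a normalisation could look like in PHP. This is not code from the svn extension: the function name and the raw key names 'revision' and 'created' are invented for illustration ('modified_by' is the one alternative name mentioned above), and the datetime is rendered in UTC as an assumption.

```php
<?php
// Hypothetical sketch: map differently named raw fields onto the common
// names proposed above (rev, time, datetime, author).
function normalise_entry(array $raw): array
{
    $time = (int) ($raw['time'] ?? $raw['created'] ?? 0);

    return [
        'rev'      => (int) ($raw['rev'] ?? $raw['revision'] ?? 0),
        'time'     => $time,                          // unix timestamp
        'datetime' => gmdate('Y-m-d H:i:s', $time),   // short ISO style, UTC assumed
        'author'   => (string) ($raw['author'] ?? $raw['modified_by'] ?? ''),
    ];
}
```

With this in place, the log and ls views can share one rendering path regardless of which call produced the entry.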
The one major missing feature at present is the Edit page, which initially started off with the idea of using the HTML editor code from my web site, and webdav javascript posting to save. However, I have started pondering if writing a mozilla plugin to use scintilla might be quite cute. This is apparently something like what Komodo does.

The only drawback is that the documentation on how to build mozilla plugins on unix is a little thin on the ground. My limited research indicated that using the build scripts from plugger and some of the code from the mozilla plugin SDK might be the way to go, but then you have to start wondering: if you make the browser an editor, what happens when you accidentally write a never-ending loop in javascript and render it in another window? It will crash your editor?!... oops.


19 Apr 2005

svn bindings - More vapourware on its way

*UPDATE* the source is now in cvs.php.net/pecl/svn

That itch just got too much today, and I started on libsvn bindings for PHP. With a few hints from the subversion mailing list, I now have 3 commands working

<?php
dl('svn.so');
svn_checkout("http://www.akbkhome.com/svn/ext_svn","/tmp/ext_svn");
print_r(svn_cat("http://www.akbkhome.com/svn/ext_svn/svn.c"));
print_r(svn_ls("http://www.akbkhome.com/svn/ext_svn/"));
?>

It's a long way from being completed, but, it's quite nice to see how easy it was to get this far..
This one's php4 friendly, so you can try it out by downloading from the subversion server @ http://www.akbkhome.com/svn/ext_svn/


19 Apr 2005

DBDO first release

DBDO has finally made it to a state where it can be released. After testing on this site's PHP5 version, I finally nailed the last few segfaults.

The DBDO page on pecl.php.net lists the differences between it and DB_DataObjects, and you can get an idea of the API by looking at the source for this website's PHP5 version or the documentation on DataObjects. The main things that still need looking at are

  • Join support
  • experimenting with PDO integration
It would be nice to hear from anyone that gets past the fun of compiling libgda (from CVS), and gets it to work.


13 Apr 2005

Too many config options...

I sometimes wonder if engineers have a secret obsession with configuration options. I've been working on a few projects in the last few weeks that have made me wonder if the developers consider configuration options an excuse to be lazy, and let someone else sort out their mess.

Other than developing software, I also install my software, and other people's, onto clients' computers. At one point a few years ago, I was asked to provide instructions to install the software so they had a backup plan if I was unavailable. Simple I thought, I write great software that's easy to install and set up. So off I went and wrote a relatively simple installation document.

Amazingly enough, the unfortunate guy at the other end had great difficulty getting the application up and going. Some of the more classic problems were
  • The application needed write access to various folders on the machine, and those folder locations were configurable.
  • The application needed to use a number of system commands, and the path to the command needed setting up.
  • The application was going to be installed in a different path to the development box, with a different hostname etc.
I've also seen applications have configuration options for
  • name of the help url.
  • width of output (in a wrapped pre-formatted text area)
  • specific folder locations for specific purposes.
One of the primary failings of all of these options was the fact that almost all of them are either not required, or should have been defaulted by the application.
  • Write access can normally be dealt with by writing to the ini_get('session.save_path') by default. If the user wants to change the path, they can actually set the option, or change the php.ini or .htaccess and modify the save_path
    You can also create your own application specific subdirectory under that temporary directory. PEAR's System::mkdir(array('-p',$mytmpdir)); works very well there.
  • System::which() provides a cross platform version of the unix which command. - this can be used to default paths for applications (hence no requirement for these type of options)
  • for paths and urls, PHP provides plenty of information about where your application is, and how to access it. Forcing a user to configure them is almost always unnecessary.
  • Templates are really the place to modify urls or fixed paths or widths. Flexy's ability to be told about template folders to use enables you to override (eg. copy and modify) specific templates that you need to change for your site.
  • Some options are just plain pointless. For the 2 in 2 million users who want them, there is usually a better way than adding an option (class inheritance comes to mind - extend my class and add it to your wrapper...)
  • And sometimes you just have to put your foot down: the help files are always going to be in [root]/help.. there's no choice in it, and don't waste our time making it configurable..
Solving this issue has involved the refactoring of my FlexyFramework many times. The history of how and where to do configuration went something like this:
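The first two defaults in the list above can be sketched roughly like this. To keep the example self-contained it uses PHP built-ins rather than PEAR's System class; default_work_dir and find_command are hypothetical names, and find_command is only a simplified Unix-style stand-in for what System::which() does.

```php
<?php
// Sketch: derive defaults instead of asking the installer to configure them.

// Default writable directory: an application specific folder under
// session.save_path, falling back to the system temp dir.
function default_work_dir(string $appName): string
{
    $base = (string) ini_get('session.save_path');
    // save_path may look like "N;/path" or "N;MODE;/path": keep the last part
    if (strpos($base, ';') !== false) {
        $base = substr($base, strrpos($base, ';') + 1);
    }
    if ($base === '' || !is_dir($base) || !is_writable($base)) {
        $base = sys_get_temp_dir();
    }
    $dir = rtrim($base, '/\\') . DIRECTORY_SEPARATOR . $appName;
    if (!is_dir($dir)) {
        mkdir($dir, 0700, true);
    }
    return $dir;
}

// Default command path: search PATH, and only use the fallback if the
// command cannot be found there.
function find_command(string $name, ?string $fallback = null): ?string
{
    $path = getenv('PATH');
    foreach (explode(PATH_SEPARATOR, $path === false ? '' : $path) as $dir) {
        $candidate = $dir . DIRECTORY_SEPARATOR . $name;
        if ($dir !== '' && is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }
    return $fallback;
}
```

An option for either value then only needs to exist as an override, not as something every installer has to fill in.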
  • a ConfigData folder in the application root, stored one config file for each domain the app was installed on.
  • These contained a huge number of options, for all the packages that the application used.
  • As time has gone on, I've removed the requirement for a lot of the core options, and used defaults that work 99% of the time.
  • About a year ago, it struck me that the autoloading of these files was actually the cause of some setup bugs and time wasting, so I moved almost all the remaining options to the index.php line that starts the Framework, and pulled in a single option from a machine wide ini file. (basically moving any optional stuff into the bootstrap file.)
  • So at the end of the day, we have one global .ini file for each server, which basically lists all the database dsn's for the various applications. (for simpler servers, the index.php file explicitly loads an ini file from somewhere, which generally only contains the database dsn.)
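A minimal sketch of the 'one ini file that only lists database dsn's' idea. The section name, the key name 'dsn' and the function load_dsn are assumptions for illustration, not FlexyFramework's actual layout.

```php
<?php
// Hypothetical sketch: pull one application's database dsn out of ini text
// that has one [section] per application.
function load_dsn(string $iniText, string $appName): string
{
    $ini = parse_ini_string($iniText, true);
    if ($ini === false || !isset($ini[$appName]['dsn'])) {
        throw new RuntimeException("no database dsn configured for $appName");
    }
    return $ini[$appName]['dsn'];
}
```

In a bootstrap file the text would come from the machine wide file, for example via file_get_contents(), or the function could call parse_ini_file() directly. Quoting the dsn value in the ini file avoids surprises with characters the ini parser treats specially.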
So finally I can say in the documentation
  • download, unzip
  • create the database using the .sql file (or similar)
  • Then depending on the complexity of the project
    • alter the index.php, add the database dsn.
    • alter the xxx ini file, and change the database dsn.
And hey presto it works...!, well sometimes..


09 Apr 2005

DBDO News

DBDO's slow migration from vapourware to bugware has been progressing this week. A month ago, I started usability testing on the PHP5 version of this site, and it helped find a lot more issues than the unit tests that I had set up.

A few weeks ago though, I was working on another application that uses a threaded version of PHP embed, and DBDO, which was throwing up quite a few bugs around setting / fetching data using the overloaded internal setters and getters.

The original design of DBDO, was that when you fetch a value (eg.)
echo $do->name;
that the object would internally get the value from libgda at this point, rather than the way DB_DataObject currently does, by assigning all the PHP variables when you call $do->fetch().

The trouble was that this way of working began to get very confusing when mixed with all the potential ways that you may access the data.
  • Setting the column value (at this point you have to store a separate hash for assigned values)
  • print_r and its like need you to actually set values for all the properties.
I ended up with something like 3 hashes doing various tasks, each needing memory managing. And as usual, complexity leads to numerous bugs.. So I made the decision last week to follow DataObject's logic of simply setting the properties on fetch().

This removed a large chunk of code, and in general simplified the whole query building process. Things like the update code could easily compare the fetched data against the current object properties and update only the changed data. The only thing that caught me out was that unless you add a zend_objects_store_add_ref() after changing the properties internally, the values get freed too early and segfaults occur.
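The 'set properties on fetch, then update only what changed' logic can be sketched in plain PHP like this. It is an illustration of the idea only, not the DBDO C code: the class name, method names and generated SQL shape are all made up, and a column called 'fetched' would clash with the snapshot property.

```php
<?php
// Hypothetical sketch: real properties are assigned on fetch, a snapshot is
// kept, and the update statement only mentions columns that have changed.
#[\AllowDynamicProperties]
class TinyDataObject
{
    private array $fetched = [];   // snapshot of the row as fetched

    public function fetchRow(array $row): void
    {
        foreach ($row as $column => $value) {
            $this->$column = $value;   // plain properties, so print_r just works
        }
        $this->fetched = $row;
    }

    public function changedColumns(): array
    {
        $changed = [];
        foreach ($this->fetched as $column => $old) {
            if ($this->$column !== $old) {
                $changed[$column] = $this->$column;
            }
        }
        return $changed;
    }

    public function buildUpdate(string $table, string $keyColumn): ?string
    {
        $changed = $this->changedColumns();
        unset($changed[$keyColumn]);
        if (!$changed) {
            return null;   // nothing to update
        }
        $sets = [];
        foreach (array_keys($changed) as $column) {
            $sets[] = "$column = ?";
        }
        return "UPDATE $table SET " . implode(', ', $sets) . " WHERE $keyColumn = ?";
    }
}
```

One snapshot array replaces the several hashes described above, at the cost of holding each fetched row twice.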

Anyway, the current plan is to get back to testing the code on the PHP5 version of this site next week, then actually make an alpha release...

29 Mar 2005

require_once is part of your documentation.

I had the pleasure (pun intended) of installing a small framework of code today, which broke at least half of the rules I've been building up for the projects I've been working on. The code illustrated very clearly why explicitly typing require_once is not only a good idea, it can make the difference between clear readable code, and poor magic.

The error "Fatal error: Call to undefined function: somefunction_xyz() in..." appears after installing the code. Looking at the file, it only contains one line.. somefunction_xyz()!

The framework is supposed to have loaded the file that defines the function, but as it's not set up correctly, that didn't happen. To me this assumption that the file is loaded is flawed to begin with. Ignoring the issue that the framework relies on function libraries, the other fatal flaw is that frameworks should rarely load more than one 'action' file, which in turn should be reasonably self explanatory about where it is getting things from.

The missing require_once makes the code very difficult to follow without inside knowledge (or heavy use of grep) of how the framework may be working, and very little is given away as clues to what should have happened prior to this error occurring.

I guess this harks back to the idea that __autoload() will encourage people to write code that is less self documenting. Almost all languages (C#, Java, Python...) usually have a list at the top of the page, indicating what they 'import' or 'use' to achieve the aim of the program. PHP uses require_once to document the source of your libraries. It helps others read your code, and in PHP it can also be placed close to the place you actually use the library method. A lot of these languages have ways around ending up with this large import list, and often some imports implicitly load others, but in making code readable it's often worth duplicating these, just to ensure that it's readable.

So from the trenches here, please try and make your code readable. Other people have to install it, set it up, and as quickly as possible understand what you intended to do....

18 Mar 2005

is __autoload evil?

Someone asked on a few of my other posts why I refer to __autoload as evil (well, apart from making sensationalist statements to keep the blog interesting).

Let's start with what it's supposed to get rid of.

require_once 'SomeClass.php';
$x = new SomeClass;

The code above is reasonably predictable, require_once will look in the include path, and find the first match of SomeClass.php, the second line will create an instance of the class that looks like it's probably in SomeClass.php
The only magic here is
  • Which of the include paths SomeClass.php might be in..
Now enter __autoload.
I first saw __autoload on the Zend developers list. It's one of those methods that you look at and think, 'is this really a good idea?'. But I ignored it, since like all features of any language - 'You don't have to use it'.. or so I thought..

Autoload basically hooks into a few places so when you would normally get a 'this class does not exist' message, autoload is called to let you try and load it, and hence avoid this message.

It also hooks into class_exists(), and gets called to let you try and load the class then, hence the purpose of

class_exists('PEAR') or require_once 'PEAR.php';

On the face of it, the above looks like it is just saving you a file call that require_once isn't really supposed to be doing.. but no.. class_exists is really secret code for 'you can have a go loading the PEAR class from wherever you like'
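A small sketch of that hook. The post is about PHP 5's __autoload(), which has since been removed in PHP 8, so this uses spl_autoload_register() to show the same behaviour; the class name NotLoadedYet is made up and deliberately never defined.

```php
<?php
// Sketch: class_exists() calls the registered autoloader by default, and
// only skips it when its second argument is false.
$asked = [];

spl_autoload_register(function (string $class) use (&$asked): void {
    // a real loader would require_once some file here; this one only records
    $asked[] = $class;
});

$quiet      = class_exists('NotLoadedYet', false); // no autoload attempt
$afterQuiet = $asked;                              // nothing recorded so far
$loud       = class_exists('NotLoadedYet');        // autoloader gets a go
```

After this runs, $asked holds the one name that the second class_exists() call handed to the loader, even though the code never tried to create an object.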

The justifications I've seen for this are two fold
  • It's better from a performance point of view.
  • It's more flexible.
The first argument is extremely questionable: the microseconds that you may be saving, compared to parsing all the code that you have in all the classes, are probably so tiny as to not be a particular issue. And almost all of it could be removed by using APC or similar if you really were that desperate for performance tweaks.

The flexibility issue is also questionable. What can you do with autoload that can't be done with include path? Or more to the point, what are you doing messing around with include path and autoload locations in the first place.. trying to dig a bigger bug hole for you or someone else to discover later..?

And finally into the fray spl_autoload
Meanwhile as all this was going on, Marcus added an autoloading toolkit to spl, the new 'Standard PHP Library', or perhaps the 'I need more classes library'. What has been added is the missing ability of autoload to be implemented multiple times.

You can only define one __autoload method per instance of PHP, however spl_autoload allows you to register as many handlers as you like, hence multiplying an already magic tool ad infinitum. Now you may as well pray that what you typed is actually going to be run as you requested..

Is include_path so complex, troublesome or inflexible that it needs to be replaced with something so much more complex and flexible?

or does somebody want to do something so horrific with __autoload that they are dying for this tool?
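To show what stacking handlers means in practice, here is a minimal sketch. Both handlers and the FallbackThing class are invented for the example; the second handler uses class_alias() as a stand-in for 'loading' a class from wherever it likes.

```php
<?php
// Sketch: handlers registered with spl_autoload_register() are tried in
// registration order until one of them makes the class exist.
$log = [];

class FallbackThing {}

spl_autoload_register(function (string $class) use (&$log): void {
    $log[] = "first:$class";    // declines to load anything
});

spl_autoload_register(function (string $class) use (&$log): void {
    $log[] = "second:$class";
    class_alias('FallbackThing', $class);   // whatever was asked for now exists
});

$x = new SomeClass();   // never declared anywhere, yet this succeeds
```

Reading the last line on its own gives no hint that $x is really a FallbackThing, which is the kind of surprise the paragraph above is complaining about.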