
Curl HTTP Client 2.0

It’s been a while since I last updated my Curl HTTP Client class. That’s the class we’ve been using for years now for all kinds of site scraping, bulk domain registration without an API, … and even today we use it as part of the core of our brand new payment system.

Since I had some spare time this weekend, I finally managed to merge some of the updates we’ve created over the years and put the class on GitHub so I can update it regularly. Don’t worry, the class has retained most of its previous functionality explained here, so updating to the latest version shouldn’t cause any problems.

Here’s the GitHub page: https://github.com/dinke/curl_http_client. Feel free to use it in your own projects and send in your comments.

Validating an integer with PHP

When I started writing this post I wasn’t sure whether to put it in the programming category or the fun one … I mean, validating whether a passed variable is an integer, aka a whole number, how hard can it be? It turns out it’s not that easy in PHP 🙂

First, some background. I needed to validate an array of db key values (like array(1,2,6,7,…n)), so I was thinking of using a simple function like array_filter with a callback, something like this:
array_filter($foo_array, 'is_int');
where is_int is the callback: array_filter calls the built-in is_int function for each array value in order to filter out non-int values.

The problem is (yes, after 10 years of dealing with PHP I am aware of that) that PHP doesn’t treat int and string numbers the same, so the string '42' wouldn’t be recognised as an integer.

is_int('42'); //false
is_int(42); //true

To make things more “interesting” for programmers, if you have integer form data like age, year or whatever, it will be sent through the $_POST/$_GET/$_REQUEST arrays as “string numbers” (so the string '42', not the int 42).

There is a nice function to deal with such things and its name is is_numeric … but it only checks whether a string is actually a number, so float values will evaluate to true as well. ctype_digit, on the other hand, is the opposite of is_int: it will only return true if the tested variable is a string number, so '42' would evaluate to true, but 42 to false.

ctype_digit('42'); //true
ctype_digit(42);//false
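
To see the three checks side by side, here is a minimal sketch. The helper name integerChecks and the sample strings are made up for illustration; only is_int, is_numeric and ctype_digit are real PHP functions. The is_string guard is an addition of mine so that ctype_digit is only ever handed strings.

```php
<?php
// Hypothetical helper: runs the three checks discussed above on one value.
function integerChecks($value)
{
    return array(
        'is_int'      => is_int($value),     // true only for real PHP integers
        'is_numeric'  => is_numeric($value), // true for any numeric string, floats included
        // ctype_digit is meant for strings, so guard it explicitly
        'ctype_digit' => is_string($value) && ctype_digit($value),
    );
}

// All samples are strings, just as they would arrive from a form.
foreach (array('42', '4.2', '1e3', '-42', 'abc') as $sample) {
    echo var_export($sample, true), ' => ', json_encode(integerChecks($sample)), "\n";
}
```

Only '42' passes ctype_digit, while is_numeric also accepts '4.2', '1e3' and '-42'.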

And to make things even worse, PHP silently converts really big integers (those bigger than the PHP_INT_MAX constant) to float, so guess what you would get for something like is_int(23234234234235334234234)? Yep, false 🙂

$var = PHP_INT_MAX;
var_dump(is_int($var)); //true
$var++;
var_dump(is_int($var)); //false it's float now!

Yeah, I know, you could cast the variable to int and do is_int … or cast it to string and do ctype_digit … and other dirty hacks … but what if someone smart from the PHP team decided to, let’s say, add a 2nd argument to is_int so you can check the type in some kind of “non-strict” mode, where the string '42' is actually evaluated as an integer? Something like this in C-like pseudocode, I guess:

function is_int(int var, int strict = 1)
{
   //if strict is false evaluate string integers like '42' to true!
}

All in all (at least to me), the easiest way to validate whether a variable is a whole number (aka an integer or an integer string) is with a regular expression. Something like this:

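/**
 * Test if number is integer, including string integers
 * @param mixed $var
 * @return boolean
 */
function isWholeNumber($var)
{
    // only ints and strings can qualify (arrays, floats, bools etc. are rejected)
    if (!is_int($var) && !is_string($var)) {
        return false;
    }

    // \z instead of $ so that a trailing newline ("42\n") is not accepted
    return preg_match('/^\d+\z/', (string) $var) === 1;
}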

Someone from PHP dev team should really consider fixing this.
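
Until then, one alternative worth knowing is the filter extension. The sketch below assumes it is enabled (it normally is by default); the helper name filterWholeNumbers and the sample array are hypothetical. It uses filter_var with FILTER_VALIDATE_INT and a min_range of 0 as the array_filter callback, so both 42 and '42' pass.

```php
<?php
// Hypothetical helper: keeps only values that are non-negative whole numbers,
// whether they arrive as ints or as integer strings.
function filterWholeNumbers(array $values)
{
    $options = array('options' => array('min_range' => 0));

    $kept = array_filter($values, function ($value) use ($options) {
        // filter_var() returns false on failure, so compare strictly:
        // a valid 0 must not be mistaken for a failure.
        return filter_var($value, FILTER_VALIDATE_INT, $options) !== false;
    });

    // array_filter preserves keys; reindex so the result is a plain list
    return array_values($kept);
}

var_dump(filterWholeNumbers(array(1, '42', '4.2', 'abc', 7, '-3')));
```

Note that FILTER_VALIDATE_INT has rules of its own (it tolerates surrounding whitespace, for instance), so it is not a drop-in replacement for the regular expression above.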

Debugging with Xdebug and Eclipse on Mac

In my previous posts I explained how to install MacPorts and set up a typical Apache/PHP environment with it. Now we’re going to start with something more serious, something that sooner or later every PHP developer will have to deal with. Debugging! And yes, when I say debugging I don’t mean echoing or using var_dump 😉

First off, some background info. As far as I know, the very first PHP IDE that actually had a built-in debugger was Zend Studio. I used ZDE 5.2.x for a very long time, but since Zend stopped supporting it, it became very difficult to keep using it. On Snow Leopard it required some nasty Java hacks to get it working, and any Java update would break it. It was obvious that the time of ZDE 5.x was over, so I had to pick something new. My choice was Eclipse (I didn’t see any point in paying Zend an extra fee for an IDE based on Eclipse), so I downloaded the Eclipse Helios release from the Zend PDT pages. Unfortunately it was really slow and buggy, so after a while I decided to download “classic” Eclipse and install the PHP Development Tools SDK manually. Somehow it worked much better for me, and since there is a new Eclipse Indigo 3.7.1 release right now, I’ve decided to write a detailed explanation of how to install it. In case you already have Eclipse installed and you’re happy with it, just skip to the Xdebug setup part.

Setting up Eclipse
------------------
So, let’s download Eclipse 3.7.1 Classic and unpack it to some dir. I prefer to use my home dir (so /Users/dinke/eclipse) but you can pick any; just copy the files there and run the Eclipse application from that folder. On first run it will ask you about the workspace (the place where Eclipse keeps some internal files). I used /Users/dinke/workspace but you can use any location that suits you. Eclipse will start with a Welcome window that we are going to close for now (you can always reopen it by choosing Help->Welcome).

Eclipse Startup

Now it’s time to download and install the PHP Development Tools SDK. So open Help -> Install New Software. Pick Indigo from the “Work with” drop-down and, when the list below gets updated, go to Programming Languages and check the box next to PHP Development Tools SDK as in the image below.

PDT Startup

Now click Next, accept the terms and click Finish, and the PHP tools will be installed. After the installation is done (it may take some time), you’ll be offered to restart Eclipse, so do it immediately. After Eclipse has restarted we are going to switch to the PHP perspective. So click the “Open Perspective” icon in the upper right corner of Eclipse and pick “Other”. There you’ll be offered options similar to those in the image below:

PHP Perspective

Pick PHP, click OK and we’re done with the Eclipse setup. Congratulations! Now you can tweak Eclipse to your own preferences (Eclipse->Preferences) or import/start a new project and start playing with code.

Xdebug
------
Now we’re going to deal with the Xdebug part. First we have to install it, and we’re going to use MacPorts for that. If you don’t have MacPorts installed, just look at my previous post about it.

So, start terminal and type this:

sudo port install php5-xdebug

Restart Apache (sudo /opt/local/apache2/bin/apachectl restart) and make sure that you have a proper Xdebug section on the phpinfo page (see below):

phpinfo

Now we need to add some configuration options to php.ini file, so let’s add this to the bottom of php.ini file (in my case located at /opt/local/etc/php5/php.ini):

[xdebug]
zend_extension="/opt/local/lib/php/extensions/no-debug-non-zts-20090626/xdebug.so"
xdebug.profiler_output_dir = "/tmp/xdebug/"
xdebug.profiler_enable = On
xdebug.remote_enable=On
xdebug.remote_host="localhost"
xdebug.remote_port=9000
xdebug.remote_handler="dbgp"
xdebug.idekey=ECLIPSE_DBGP 

and restart Apache again.
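
Besides eyeballing the phpinfo page, you can check from a script that the extension is loaded and that the settings above were picked up. This is a rough sketch: the helper name xdebugStatus is hypothetical, and the setting names are the ones used in this post (newer Xdebug releases renamed the remote_* settings, in which case ini_get simply returns false for them).

```php
<?php
// Hypothetical helper (not part of Xdebug): reports whether the extension is
// loaded and reads back the settings from the [xdebug] section above.
function xdebugStatus()
{
    $loaded = extension_loaded('xdebug');

    return array(
        'loaded'        => $loaded,
        // only ask for the version when the extension is actually there
        'version'       => $loaded ? phpversion('xdebug') : false,
        // ini_get() returns false when a setting is unknown to this PHP build
        'remote_enable' => ini_get('xdebug.remote_enable'),
        'remote_host'   => ini_get('xdebug.remote_host'),
        'remote_port'   => ini_get('xdebug.remote_port'),
        'idekey'        => ini_get('xdebug.idekey'),
    );
}

print_r(xdebugStatus());
```

Run it through Apache rather than from the command line if the two use different php.ini files.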

Now let’s configure the Eclipse debugging options. Open Eclipse Preferences, go to PHP->Debug and pick XDebug instead of Zend from the PHP Debugger drop-down.

eclipse_debug

Click the Configure link next to the drop-down, select Xdebug, click the Configure button again and make sure that Accept remote session (JIT) is set to any (see image below).

eclipse_xdebug_settings

And this is it! Hooray!

Now let’s see how it actually works. We’re going to use so-called “remote” debugging, which allows us to debug directly from the browser (there’s even a Firefox extension for that); however, I prefer to enter my URLs directly in the configuration options. So select Run->Debug As from the menu and pick PHP Web Page. You will be asked to enter a URL, and then the Debug perspective will open along with the editor. There you can step into code, set breakpoints and watch variables on the right.

debugger

Happy Debugging 🙂

Useful Links:
Eclipse/Xdebug Remote debugging on Windows
Xdebug

MySQL: Deleting with Left Join

Today I had to deal with one huge table and clean up all rows where the foreign key doesn’t have a matching primary key in the original table, which reminded me how terribly slow sub-queries in MySQL are compared to joins.

I have a script which generates domains from typos, so I have one table with the original domains (master_domains) and another one (result_domains) with the generated typo domains. Basically something like this:

mysql> describe master_domains;
+--------+------------------+------+-----+---------+----------------+
| Field  | Type             | Null | Key | Default | Extra          |
+--------+------------------+------+-----+---------+----------------+
| id     | int(10) unsigned | NO   | PRI | NULL    | auto_increment | 
| domain | varchar(255)     | NO   | UNI | NULL    |                | 
+--------+------------------+------+-----+---------+----------------+
2 rows in set (0.07 sec)

mysql> describe result_domains;
+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| id        | int(10) unsigned | NO   | PRI | NULL    | auto_increment | 
| domain    | varchar(255)     | NO   | UNI | NULL    |                | 
| master_id | int(10) unsigned | YES  | MUL | NULL    |                | 
+-----------+------------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)

Table result_domains has master_id, which is a foreign key reference to the primary key (id) in the master_domains table. Since I also have other scripts generating domains without typos (which store the result_domains.master_id field as NULL), today I simply wanted to get rid of those masters without a matching master_id reference in the result table, or in other words those master domains whose id does not appear in any result_domains.master_id.

With sub-queries you could easily write the query like this:

delete from master_domains where id not in 
(select master_id from result_domains_frontend)

It is a good habit to always run a select query before deleting a big number of rows (just to make sure your query is written correctly), so I tried the select query first:

select * from master_domains where id not in 
(select master_id from result_domains_frontend) limit 10

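However, it ran for several minutes without any output, so eventually I decided to stop it. I know that sub-queries are much slower than joins, so I decided to try the removal with a left join instead. (There is a second catch with NOT IN: when the sub-query returns any NULL, as a nullable master_id column can, the condition never evaluates to true and no rows match, so the sub-query would also need a “where master_id is not null” filter.)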

Left joins are actually the perfect weapon for finding rows that exist in one (left) table and don’t exist in the other (right) table. They also have one big advantage over sub-queries: they usually perform much faster in MySQL, plus they work on prehistoric MySQL versions (before 4.1) that have no sub-query support at all. However, the delete syntax is a little bit tricky, so after some trial and error I eventually came up with this query:

delete master_domains.* from master_domains 
left join result_domains_frontend 
on master_domains.id=result_domains_frontend.master_id 
where result_domains_frontend.master_id is null ;

And voila, after a while it came back with the result:

mysql> delete master_domains.* from master_domains 
left join result_domains_frontend 
on master_domains.id=result_domains_frontend.master_id 
where result_domains_frontend.master_id is null ;
Query OK, 270558 rows affected (46.58 sec)
mysql> 
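
If you want to try the “rows on the left with no match on the right” idea on a tiny data set before touching a huge table, here is a rough sketch. It is illustrative only: it assumes the pdo_sqlite extension and uses an in-memory SQLite database as a stand-in for MySQL, the rows are made up, and it only runs the select form of the left join (the multi-table delete syntax above is MySQL-specific). It also shows how NOT IN behaves when the sub-query returns a NULL.

```php
<?php
// Illustration only: in-memory SQLite stands in for MySQL, sample rows are made up.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE master_domains (id INTEGER PRIMARY KEY, domain TEXT)');
$db->exec('CREATE TABLE result_domains (id INTEGER PRIMARY KEY, domain TEXT, master_id INTEGER)');
$db->exec("INSERT INTO master_domains VALUES (1, 'example.com')");
$db->exec("INSERT INTO master_domains VALUES (2, 'orphan.com')");
$db->exec("INSERT INTO result_domains VALUES (1, 'exmaple.com', 1)");
$db->exec("INSERT INTO result_domains VALUES (2, 'other.com', NULL)");

// Left join: masters with no matching result row come back with r.master_id NULL.
$orphans = $db->query(
    'SELECT m.id FROM master_domains m
     LEFT JOIN result_domains r ON m.id = r.master_id
     WHERE r.master_id IS NULL'
)->fetchAll(PDO::FETCH_COLUMN);

// NOT IN against a sub-query that returns a NULL: the condition is never true.
$notIn = $db->query(
    'SELECT id FROM master_domains
     WHERE id NOT IN (SELECT master_id FROM result_domains)'
)->fetchAll(PDO::FETCH_COLUMN);

echo 'left join finds: ', implode(',', $orphans), "\n";
echo 'not in finds: ', count($notIn), " rows\n";
```

The left join finds the one orphaned master, while the unfiltered NOT IN finds nothing at all because of the NULL master_id row.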

MySQL: Moving table from one db to another

To move a table from one db to another, you can create a new table (create table foo_new like foo) in the db where you want to move it and then copy the data with an insert into/select query. However, there is a much easier way, which is especially handy when you deal with big tables.

As you probably already know, there is an easy way to rename a MySQL table, just by issuing a rename clause in an alter statement:

ALTER TABLE foo RENAME TO new_foo;

You can also use RENAME TABLE syntax like this:

RENAME TABLE foo TO new_foo;

Now, when you need to move a table from one db to another, all you have to do is specify its current db and the new db name as table prefixes. For example, if you want to move table foo from the current db to the new db you can issue queries like these:

ALTER TABLE currentdb.foo RENAME TO newdb.foo;

or

RENAME TABLE currentdb.foo TO newdb.foo;

Btw, there is an important difference between the ALTER and RENAME statements: with RENAME you can rename more than one table at once. This comes in handy if, for example, you want to swap the names of two tables:

RENAME TABLE table1 TO temp, table2 TO table1, temp TO table2;

Browser Detection Update

A long time ago I developed a Browser Detection Class which is able to recognize most of the popular browsers and OSes used today.

For example, Firefox on my MacBook Pro would be recognized like this:
Mozilla Firefox 3.0.4 / Mac OS X

Very useful stuff in case you have to redirect your users to different pages depending on browser version, or to maintain your own site’s browser/OS usage statistics.

Today I had a chance to update it with support for Windows Vista, Google Chrome and iPhone.

You can download complete code with usage examples by following this link.

PHP: Callback functions and OOP

Recently, I had to change the default behavior of storing session data in files and use a MySQL DB instead. In practice, that means writing a whole bunch of callback functions and registering them with the session_set_save_handler function. Since I use OOP, what really bothered me was that (according to the PHP CHM manual sitting on my desktop) all functions for session_set_save_handler seemingly have to exist in the global scope, since all callback arguments are strings:

bool session_set_save_handler ( string open, string close, string read, string write, string destroy, string gc )

Doing that in a non-OOP way with 6 functions in the global scope is not something I really liked, so I googled for a solution and found that you can simply pass an array like array('class_name', 'method') for any callback in PHP. Cool stuff, which allows you to create a session handler class with a bunch of static methods for those callbacks, but why the hell is that not documented in the PHP Manual???

I went to the online manual, at least to see if someone had submitted a comment about this, and found out that the session_set_save_handler definition there is completely different:

bool session_set_save_handler ( callback $open, callback $close, callback $read, callback $write, callback $destroy, callback $gc )

Obviously, a lot of things have changed since the last time I browsed the online manual, one of them being the introduction of the “callback” type among the “pseudo-types” used only for documentation purposes. And there, the manual entry for callback says the following:

callback

Some functions like call_user_func() or usort() accept user defined callback functions as a parameter. Callback functions can not only be simple functions but also object methods including static class methods.

A method of an instantiated object is passed as an array containing an object as the element with index 0 and a method name as the element with index 1.

Static class methods can also be passed without instantiating an object of that class by passing the class name instead of an object as the element with index 0.

which basically allows you to pass an array with a class name and a method name as the callback, and that method will be called.

Let me give you an example with sessions:

<?php

/**
 * Session_Handlers class
 * contains dummy methods needed for session stuff
 * Replace content with some real stuff like db conn etc.
 *
 * Methods are declared static because they are registered
 * below as array('Session_Handlers', 'method') callbacks.
 */
class Session_Handlers
{
    public static function open($save_path, $session_name)
    {
        echo "Open Method Called<br>";
        return true;
    }

    public static function close()
    {
        echo "Close Method Called<br>";
        return true;
    }

    public static function read($id)
    {
        echo "Read Method Called<br>";
        //read must return session data as a string ('' when there is none)
        return '';
    }

    public static function write($id, $sess_data)
    {
        echo "Write Method Called<br>";
        return true;
    }

    public static function destroy($id)
    {
        echo "Destroy Method Called<br>";
        return true;
    }

    public static function gc($maxlifetime)
    {
        echo "GC Method Called<br>";
        return true;
    }
}

//register all methods from Session_Handlers as static callbacks
session_set_save_handler(array('Session_Handlers', 'open'), array('Session_Handlers', 'close'),
    array('Session_Handlers', 'read'), array('Session_Handlers', 'write'), array('Session_Handlers', 'destroy'),
    array('Session_Handlers', 'gc'));

session_start();

// proceed to use sessions normally
?>

As you can see, we’ve created simple methods which only echo when they are called (in real life, you would save session data to a file or db). We simply passed arrays to session_set_save_handler, which connect the class methods to the session callbacks.
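
The same array form works anywhere PHP accepts a callback, not just for sessions. Here is a small sketch of both variants described in the manual excerpt above, using call_user_func. The Greeter class and its methods are hypothetical, made up purely for this illustration.

```php
<?php
// Hypothetical class used only to illustrate the two array callback forms.
class Greeter
{
    private $greeting;

    public function __construct($greeting)
    {
        $this->greeting = $greeting;
    }

    // Instance method: the callback array needs an object at index 0.
    public function greet($name)
    {
        return $this->greeting . ', ' . $name;
    }

    // Static method: the class name at index 0 is enough.
    public static function shout($name)
    {
        return strtoupper($name) . '!';
    }
}

$instanceCallback = array(new Greeter('Hello'), 'greet');
$staticCallback   = array('Greeter', 'shout');

echo call_user_func($instanceCallback, 'John'), "\n"; // Hello, John
echo call_user_func($staticCallback, 'John'), "\n";   // JOHN!

// is_callable() lets you verify a callback array before handing it over
var_dump(is_callable(array('Greeter', 'missing'))); // bool(false)
```

Passing an object instead of a class name is handy when the handler needs its own state, such as a db connection.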

Method Overloading in PHP5

Although with the release of PHP5 we finally got some long-awaited OOP features, sometimes I really miss the overloading capability which exists in languages like Java. I am talking about something like this:

class Overloading_Test
{
    public void hello()
    {
        System.out.println("Hello Anonymous");
    }

    public void hello(String name)
    {
        System.out.println("Hello " + name);
    }


    public void hello(String firstName, String lastName)
    {
        System.out.println("Hello " + firstName + " " + lastName);
    }
}

This way you can call hello with no arguments at all, or with one or two arguments, and the proper method will always be called. Unfortunately, if you try something like this in PHP, it will give you a fatal error, because methods cannot be redeclared: support for this kind of overloading is not part of the core language as it is in Java.

However, there is still a way to achieve this Java-like overloading by using the “magic” methods described in the PHP Manual. Although it is not clear from the manual how you could achieve exactly the same functionality as in Java, I played a little bit with the __call method and got an interesting workaround.

<?php

class Overloading_Test
{
    function __call($method_name, $arguments)
    {
        //list of supported methods
        //only 'hello' for this test
        $accepted_methods = array("hello");

        //in case of a nonexistent method we trigger a fatal error
        if (!in_array($method_name, $accepted_methods)) {
            trigger_error("Method <strong>$method_name</strong> doesn't exist", E_USER_ERROR);
        }

        //we inspect the number of arguments and call the matching method
        if (count($arguments) == 0) {
            $this->hello1();
        } elseif (count($arguments) == 1) {
            $this->hello2($arguments[0]);
        } elseif (count($arguments) == 2) {
            $this->hello3($arguments[0], $arguments[1]);
        }

        return false;
    }

    function hello1()
    {
        echo "Hello Anonymous<br>";
    }

    function hello2($name)
    {
        echo "Hello $name<br>";
    }

    function hello3($first_name, $last_name)
    {
        echo "Hello $first_name, $last_name<br>";
    }
}


$ot = new Overloading_Test();
$ot->hello();
$ot->hello("John");
$ot->hello("John", "Smith");
//this one will produce fatal error
//$ot->test();

If you run this code, you will get something like:

Hello Anonymous
Hello John
Hello John, Smith

So, what is going on here? Whenever we call an undeclared method (which is the case with the ‘hello’ method here), the magic method __call is called, and two arguments (the method name and the array of arguments) are passed to it. For this simple test we only support overloading of the ‘hello’ method, so if you try any other, we trigger a fatal error.

Next, we simply check the number of arguments passed (by counting the $arguments array) and call the proper method. For the sake of clarity I only used simple overloading based on the number of arguments, but you could also check the argument type (e.g. string, integer etc.) and call the proper method.

So, as you see, method overloading in PHP5 is not as elegant as in Java, but you can still do it. For more information about ‘magic’ functions (there are quite a few for member overloading as well), please visit the PHP Manual.
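
For the type-based variant mentioned above, a minimal sketch could look like this. The class name Typed_Overloading_Test and the describe method are hypothetical, and throwing exceptions instead of calling trigger_error is simply my choice for this example.

```php
<?php
// Hypothetical class for illustration: __call routes describe() to a
// type-specific private method, mimicking overloading by argument type.
class Typed_Overloading_Test
{
    public function __call($method_name, $arguments)
    {
        // only describe() with exactly one argument is supported here
        if ($method_name !== 'describe' || count($arguments) !== 1) {
            throw new BadMethodCallException("Unsupported call: $method_name");
        }

        $value = $arguments[0];
        if (is_int($value)) {
            return $this->describeInt($value);
        }
        if (is_string($value)) {
            return $this->describeString($value);
        }

        throw new InvalidArgumentException('Unsupported argument type: ' . gettype($value));
    }

    private function describeInt($value)
    {
        return "integer $value";
    }

    private function describeString($value)
    {
        return "string '$value'";
    }
}

$typed = new Typed_Overloading_Test();
echo $typed->describe(42), "\n";   // integer 42
echo $typed->describe('42'), "\n"; // string '42'
```

Keeping the type-specific methods private means outside callers can only reach them through __call.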

PHP 4 End of Life Announcement

From php.net

[13-Jul-2007]

Today it is exactly three years ago since PHP 5 has been released. In those three years it has seen many improvements over PHP 4. PHP 5 is fast, stable & production-ready and as PHP 6 is on the way, PHP 4 will be discontinued.

The PHP development team hereby announces that support for PHP 4 will continue until the end of this year only. After 2007-12-31 there will be no more releases of PHP 4.4. We will continue to make critical security fixes available on a case-by-case basis until 2008-08-08. Please use the rest of this year to make your application suitable to run on PHP 5.

For documentation on migration for PHP 4 to PHP 5, we would like to point you to our migration guide. There is additional information available in the PHP 5.0 to PHP 5.1 and PHP 5.1 to PHP 5.2 migration guides as well.

Amen to that. Finally this will move the rest of the people to PHP5, so we can start using all those great PHP5 features without worrying that such code cannot run on most client servers. Looking forward to PHP6 now 🙂

MySQL 5.x – Finally improved client

Looking at my favorite RSS feeds today, I found this post on the great MySQL Performance Blog:

…if you press CTRL-C MySQL Command Line Client will not exit but will terminate query being executed.

In other words, in previous versions of the MySQL client program, if you issued a query and tried to interrupt it by hitting CTRL-C, CTRL-C would actually kill the MySQL client itself, but the query would continue running in the background! In that case the only way to really kill the query was to find its ID in the process list (by issuing a “show full processlist” query) and then kill it with a query like “kill 12345”, where 12345 is the ID of the query that you want killed. Something like this:

mysql> select * from odm_result_keywords where keyword like '%foo%joe%';
^CAborted
bash-2.05b$ mysql -A --enable-local-infile -udinke -ppass mydb
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1512 to server version: 4.1.18-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> show full processlist;
+------+-------+----------------------------+-------------------+---------+------+--------------+--------------------------------------------------------------------+
| Id   | User  | Host                       | db                | Command | Time | State        | Info                                                               |
+------+-------+----------------------------+-------------------+---------+------+--------------+--------------------------------------------------------------------+
| 1486 | dinke | localhost                  | mydb | Query   |    3 | Sending data | select * from odm_result_keywords where keyword like '%foo%joe.cl' |
+------+-------+----------------------------+-------------------+---------+------+--------------+--------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> kill 1486;
mysql>

Thanks to the changes in the MySQL client program, all you have to do now is hit CTRL-C, and the query will be stopped immediately:

mysql> select domain from odm_result_keywords_de where whois_status is null and domain like '%.%.%';
Query aborted by Ctrl+C
ERROR 1317 (70100): Query execution was interrupted
mysql>

For more information about this feature (as well as other changes in MySQL 5.0.25) please follow this link.