The mysteries of Sybase ASE + PDO

I’ve been fighting to get Sybase ASE 16.0 working with PHP’s PDO library. Of CRUD, the last three seem to be fine, but the C has given me some trouble until just now. One of the difficulties is the dearth of information on Sybase, since hardly anyone seems to use it, especially in combination with PHP. So I’m posting some things I’ve learned so far in the hopes that they’ll be useful to someone else. (I’m not, of course, using it by choice, nor am I an advocate for it….)

For reference, I’m using PHP 5.6 and Sybase ASE 16.0. Since moving away from Sybase isn’t an option, getting it working with PDO is imperative for an eventual upgrade to PHP7, because the legacy sybase_*() functions have now been removed. Additional benefits are those general to PDO: prepared statements for security and performance; easier integration (and less hidden dependency) with OO code; and classes which are easier to test, since you can inject e.g. a PDO with a SQLite connection and use all of the same method calls.

Connecting

We are using FreeTDS with the DBLIB PDO driver. A connection should also be possible with PDO_ODBC, though I’ve not tried it. The DSN is straightforward, but don’t forget to specify the character set. Something like:

$pdo = new \PDO("dblib:host=SERVER:9999;dbname=DATABASE;charset=utf8", $username, $password);
$pdo->setAttribute(\PDO::ATTR_ERRMODE, \PDO::ERRMODE_EXCEPTION);

…where SERVER, 9999, and DATABASE are your server address, port, and DB respectively, of course.

A very important detail is dealing with character sets. For a reason I will detail below, I also issue the following command after connecting:

$pdo->query('set char_convert off');

The Sybase char_convert setting controls whether the server tries to convert the character set of the text you send it, and we are telling the server not to do that. This was a counterintuitive solution for me, as I was dealing with UTF-8 text, requested a UTF-8 connection in my DSN, and was inserting to a UTF-8 server and database–so why should the conversion matter? Maybe the server ignores the charset parameter in the DSN? The answer is a mystery, but it works.

Now we are connected, and have char_convert turned off. You should be good to go for inserting UTF-8 text. For more mysteries about datatypes, read on.

Inserting text

Before I figured out that I needed to turn char_convert off, for a while I couldn’t insert a string containing special characters into a text column; I was getting an Error 2402, ‘Error converting client characters into server’s character set. Some character(s) could not be converted.’ This didn’t seem to make sense, since I knew that the character sets of my text, my client connection, and the server should all match. Then someone suggested I should treat the string as a LOB (since this is what it is) with a separate writetext command after the insert. This worked in my client, Toad:

declare @ptr varbinary(16)
select @ptr = textptr(paperAbstract) from papers where ID = 1234
writetext papers.paperAbstract @ptr with log 'foobar'

…but I couldn’t get it to work with PDO, whether with PDO::query() or PDO::prepare(), with lines separated by semicolons or newlines, or whether in a PDO::beginTransaction() or a manual $pdo->query('begin transaction'). Some of these resulted in syntax errors, and others in an error stating that the variable must be declared (suggesting, to me, that there was some auto-commit at play, but I don’t understand why this should be the case during a transaction).

Anyway, after banging my head against the wall for a while, I stumbled onto the char_convert setting above. After this, my inserts worked! I was a bit confused, as one source I read stated that text columns “must” be written with a writetext command and not insert, but since beginning this post I now see that inserting non-null in a text column allocates an entire 2K data page to hold the value. So there’s one mystery solved. 2K may be a bit much for some use cases (in which case a form of varchar would be better), but since I am dealing with paper abstracts, this is fine.

Another mystery is that our database is using text (not unitext) and varchar (not nvarchar or univarchar) columns, yet I can insert special characters in UTF-8, including emoji and even characters that render as tofu in my browsers. According to the Sybase documentation, varchar stores strings “in single-byte character sets such as English,” nvarchar is for multibyte character sets, and univarchar is for Unicode character sets. So why do my varchar columns accept special characters? ¯\_(ツ)_/¯ (And BTW, checking the system tables confirms that my client is not stupid, these really aren’t the n*/uni* variants.) However, I did notice that pushing special characters into a varchar will end up letting you put fewer chars than you might expect, since they are multibyte chars.

Testing

Although I love PHPUnit, since I am not really testing my PHP code but instead the compatibility of PDO and Sybase ASE in my environment, I wrote some simple vanilla PHP tests that insert different kinds of text, then fetch the inserted rows and compare them to the string I inserted. (Don’t forget to tear-down, i.e. delete the inserted rows.) At a bare minimum, we need to be able to deal with the Greek alphabet, astronomical and mathematical symbols (☉, ∑), and Hanzi, Kanji, and Katakana for names–and of course UTF-8 gives us all this. I found our longest abstract and doubled it, to make sure that length would not be an issue. Finally, I let the Big List of Naughty Strings have a turn, and was glad to see there were no problems there.

 

So if you’re using Sybase with PHP, I know your pain. But hopefully this helps someone!

Leave a Reply

Your email address will not be published. Required fields are marked *