Thursday, March 29, 2012

Endian problems with converting to varbinary

We are in the process of migrating from a DOS based binary file storage
format to a relational database structure. However, for migration reasons we
need to be able to export the relational data to the binary file structure
for processing. I have accomplished this with a not-so-lightweight procedure
that basically steps through data, pulls out a variable, converts it to
binary, and appends it to an Image column. At the end of the process I use
textcopy within the TSQL code to write the binary file out. However, when
processing the data in a VB6 utility, reading an integer from the file
becomes impossible. 1's become 256, 2's become 512, 4's become 1024, and
larger values are crazy. Stepping through the original files, we notice that
a one would be stored as 0x10, a 2 as 0x20. Stepping through the file that
SQL Server wrote, we noticed that a one was 0x01, a two was 0x02, etc.
I understand why the old files work, as that is in little endian format,
native to windows. However, I don't understand why SQL server is converting
to a binary structure that looks to be big endian. Below is an example of my
conversion, please let me know what I'm doing wrong and/or how I can fix it
to make it work correctly (tempBlob.MainBlob is the Image that I'm updating)
.
Thanks!
Code:
DECLARE @.bin2 binary(2)
DECLARE @.PixelsAcrossDetector SMALLINT
SELECT @.PixelsAcrossDetector = (SELECT PixelsAcross FROM Strips WHERE GUID =
@.StripGUID)
SELECT @.bin2 = (SELECT CONVERT(BINARY(2), @.PixelsAcrossDetector))
UPDATETEXT tempBlob.MainBlob @.Pointer NULL 0 @.bin2> Stepping through the original files, we notice that
> a one would be stored as 0x10, a 2 as 0x20. Stepping through the file that
> SQL Server wrote, we noticed that a one was 0x01, a two was 0x02, etc.
> I understand why the old files work, as that is in little endian format,
> native to windows. However, I don't understand why SQL server is
> converting
> to a binary structure that looks to be big endian.
Consider the following:
DECLARE
@.PixelsAcrossDetector smallint,
@.bin2 binary(2)
SET @.PixelsAcrossDetector = 1
SET @.bin2 = CONVERT(BINARY(2), @.PixelsAcrossDetector)
SELECT @.bin2
The result is a fixed-length *binary string* with a value of 0x0001 rather
than the little endian smallint value 0x0100 you want to serialize. One
method to address the problem is to swap the bytes like the example below.
It's ugly but Transact-SQL isn't really designed to create binary records.
DECLARE
@.PixelsAcrossDetector smallint,
@.bin2 binary(2),
@.Byte0 binary(1),
@.Byte1 binary(1)
SET @.PixelsAcrossDetector = 1
SET @.bin2 = CONVERT(BINARY(2), @.PixelsAcrossDetector)
SELECT @.Byte0 = SUBSTRING(@.bin2, 2, 1)
SELECT @.Byte1 = SUBSTRING(@.bin2, 1, 1)
SELECT @.Bin2 = @.Byte0 + @.Byte1
SELECT @.Bin2
Hope this helps.
Dan Guzman
SQL Server MVP
"David Harris" <DavidHarris@.discussions.microsoft.com> wrote in message
news:35FE8697-EE20-4641-9042-3F569D5BCDC9@.microsoft.com...
> We are in the process of migrating from a DOS based binary file storage
> format to a relational database structure. However, for migration reasons
> we
> need to be able to export the relational data to the binary file structure
> for processing. I have accomplished this with a not-so-lightweight
> procedure
> that basically steps through data, pulls out a variable, converts it to
> binary, and appends it to an Image column. At the end of the process I use
> textcopy within the TSQL code to write the binary file out. However, when
> processing the data in a VB6 utility, reading an integer from the file
> becomes impossible. 1's become 256, 2's become 512, 4's become 1024, and
> larger values are crazy. Stepping through the original files, we notice
> that
> a one would be stored as 0x10, a 2 as 0x20. Stepping through the file that
> SQL Server wrote, we noticed that a one was 0x01, a two was 0x02, etc.
> I understand why the old files work, as that is in little endian format,
> native to windows. However, I don't understand why SQL server is
> converting
> to a binary structure that looks to be big endian. Below is an example of
> my
> conversion, please let me know what I'm doing wrong and/or how I can fix
> it
> to make it work correctly (tempBlob.MainBlob is the Image that I'm
> updating).
> Thanks!
> Code:
> DECLARE @.bin2 binary(2)
> DECLARE @.PixelsAcrossDetector SMALLINT
> SELECT @.PixelsAcrossDetector = (SELECT PixelsAcross FROM Strips WHERE GUID
> =
> @.StripGUID)
> SELECT @.bin2 = (SELECT CONVERT(BINARY(2), @.PixelsAcrossDetector))
> UPDATETEXT tempBlob.MainBlob @.Pointer NULL 0 @.bin2|||>> We are in the process of migrating from a DOS based binary file storage f
ormat to a relational database structure. <<
Now you know why BIT is not part of the SQL Standards. SQL is meant to
be high-level, abstract language; use files and machine-specific
assembly language for that level of programming.
Right tool for the job; no kludges.|||"--CELKO--" Wrote:
> Now you know why BIT is not part of the SQL Standards.
> SQL is meant to be high-level, abstract language; use
> files and machine-specific assembly language for that
> level of programming.
One of my joys is that this is only interim, until we can rewrite our
processing software to make database calls insted of reading the binary
files. But unfortunately that will be a ways off for now, and we have to
kludge it up right now. :( But good design is right around the corner... :)
"Dan Guzman" wrote:
> The result is a fixed-length *binary string* with a value of 0x0001 rather
> than the little endian smallint value 0x0100 you want to serialize. One
> method to address the problem is to swap the bytes like the example below.
> It's ugly but Transact-SQL isn't really designed to create binary records.
Yeah, I was hoping it was a setting or a separate convert function. :) But
this seems like the best solution, will just have to abstract it to simplify
things. Appreciate the help. :)
Will the swap method also work with doubles/floats, just by scaling the
variables to 8 instead of 2, and completely reversing it? Or must the double
s
have a separate format?|||> Will the swap method also work with doubles/floats, just by scaling the
> variables to 8 instead of 2, and completely reversing it? Or must the
> doubles
> have a separate format?
It gets especially nasty when you deal with single and double data types.
Not only do you need to address the big/little endian issue, you'll need to
create the exponent correctly. It's likely to be easier and faster to write
a custom application program than to do this in Transact-SQL. I don't
always agree with Joe but, IMHO, this is beyond an acceptable temporary
kludge.
Hope this helps.
Dan Guzman
SQL Server MVP
"David Harris" <DavidHarris@.discussions.microsoft.com> wrote in message
news:219FAFF8-546B-47B1-B744-02BFD0391989@.microsoft.com...
> "--CELKO--" Wrote:
> One of my joys is that this is only interim, until we can rewrite our
> processing software to make database calls insted of reading the binary
> files. But unfortunately that will be a ways off for now, and we have to
> kludge it up right now. :( But good design is right around the corner...
> :)
> "Dan Guzman" wrote:
> Yeah, I was hoping it was a setting or a separate convert function. :) But
> this seems like the best solution, will just have to abstract it to
> simplify
> things. Appreciate the help. :)
> Will the swap method also work with doubles/floats, just by scaling the
> variables to 8 instead of 2, and completely reversing it? Or must the
> doubles
> have a separate format?|||>> Now you know why BIT is not part of the SQL Standards.
It was a bit suprising that SQL:99 bought it up & is already deprecated in
SQL:2003
Anith|||"Dan Guzman" wrote:
> It gets especially nasty when you deal with single and double data types.
> Not only do you need to address the big/little endian issue, you'll need t
o
> create the exponent correctly. It's likely to be easier and faster to wri
te
> a custom application program than to do this in Transact-SQL. I don't
> always agree with Joe but, IMHO, this is beyond an acceptable temporary
> kludge.
Agreed. Too bad everything isn't an integer. :) I did manage to create a few
procedures to abstract a good part of the process into TSQL, which cuts the
time at least in half from doing everything from C#. Appreciate the help,
even though it's not exactly what I wanted to hear... But this is interim,
and we'll take just about anything right now. :)

No comments:

Post a Comment