Thursday, April 4, 2013

VB6 VBA Read Large Binary Flat Files Past 2GB Limitation Using Windows API ReadFile WriteFile

If you deal with really large flat files in VB6 or VBA, how do you read past the 2 GB limitation? Visual Basic 6 / VBA has a very easy read/write mechanism for binary files, and it works great for user-defined types (UDTs) and arrays of UDTs. Unfortunately, this interface was written before terabyte hard drives became standard: the limit is the maximum value of a Long (2,147,483,647). When reading with the binary Get statement, I had the unpleasant experience of overflowing the record number, which resulted in a negative value and a read error. I was fortunate to find a well-written demo that solves this with the Windows API functions ReadFile and WriteFile. You can find the demo and more information at the link below:

VB6 HugeFixedFile
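
For context, the native mechanism looks something like the sketch below; the byte position passed to Get is a Long, which is exactly where the 2 GB wall comes from. This is my own illustration (with a made-up file path and record size), not code from the demo:

Code:
' Native VB6/VBA binary access: fine until the byte position exceeds a Long
Dim rec(15) As Byte      ' one 16-byte record
Dim recNum As Long
Open "C:\data\huge.dat" For Binary Access Read As #1
recNum = 134217729       ' 134,217,729 records of 16 bytes lands past 2,147,483,647
Get #1, (recNum - 1) * 16 + 1, rec    ' the Long arithmetic overflows here
Close #1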

The demo contains a class that encapsulates the basic methods you need to deal with large files in VB6 and VBA; see the code and forum posts for more details. I have since ditched the UDT interface, as I found that LSet didn't do a great job converting my byte arrays to UDTs, so I now convert the byte arrays into the proper types manually. That meant finding or building functions to convert a byte array to an Integer or a Long. The code is posted below in case some of you find it useful in VB6 or VBA:

Code:
Private Function ConvertByteToInteger(ByVal start As Long, ByVal lend As Long, ByRef byt() As Byte) As Integer
    Dim lVal As Long
    lVal = byt(start) + CLng(byt(lend)) * 256&    ' little-endian: low byte first
    If lVal > 32767 Then lVal = lVal - 65536      ' fold back into the signed Integer range
    ConvertByteToInteger = lVal
End Function

Private Function ConvertByteToLong(ByVal start As Long, ByVal lend As Long, ByRef byt() As Byte) As Long
    Dim hi As Long
    hi = byt(lend)                                ' high byte carries the sign bit
    If hi >= 128 Then hi = hi - 256               ' fold back into the signed Long range
    ConvertByteToLong = byt(start) + CLng(byt(start + 1)) * 256& + CLng(byt(start + 2)) * 65536 + hi * 16777216
End Function
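
If you would rather skip the arithmetic, the same conversions can be done by copying the raw bytes straight into the target variable with the RtlMoveMemory API (usually aliased as CopyMemory). The helper names below are my own, not from the demo; this is a sketch that assumes the file data is little-endian, as x86 data normally is:

Code:
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByRef Destination As Any, ByRef Source As Any, ByVal Length As Long)

Private Function BytesToInteger(ByVal start As Long, ByRef byt() As Byte) As Integer
    CopyMemory BytesToInteger, byt(start), 2    ' copy two raw bytes into the Integer
End Function

Private Function BytesToLong(ByVal start As Long, ByRef byt() As Byte) As Long
    CopyMemory BytesToLong, byt(start), 4       ' copy four raw bytes into the Long
End Function

This also handles the sign bit for free, since the bytes land in the variable exactly as they sit in the file.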

The code at the link above also shows how to convert a byte array to a UDT using LSet, but as noted, I skip the UDTs altogether. I found that using the HugeFixedFile class to load 1,000 records (16 bytes each) into a single byte array cut my processing time by almost two-thirds when reading a large file. As a result, I edited the demo's Read function so it can fill a byte array spanning many records per call, as follows (assumes a zero-based byte array):

Code:
Public Function ReadRec(ByRef Record() As Byte) As Long
    ' Fill the whole buffer in one API call; the return value is the
    ' number of bytes actually read (hFile is the class's open file handle)
    If ReadFile(hFile, VarPtr(Record(0)), UBound(Record) + 1, ReadRec, 0) Then
        If ReadRec = 0 Then
            fEOF = True    ' success with zero bytes read means end of file
        End If
    Else
        RaiseError HFF_READ_FAILURE
    End If
End Function
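
To tie it together, here is roughly how I call it: read a 1,000-record buffer, then walk it record by record with the conversion functions. The instance name (hff) and the field offsets are made up for illustration, and the conversion functions are assumed to be in scope:

Code:
Const RECORD_LEN As Long = 16
Const RECORDS_PER_READ As Long = 1000

Dim buf() As Byte
Dim bytesRead As Long, i As Long, base As Long
Dim id As Long, qty As Integer

ReDim buf(RECORDS_PER_READ * RECORD_LEN - 1)    ' zero-based buffer
Do
    bytesRead = hff.ReadRec(buf)                ' hff: an open HugeFixedFile instance
    If bytesRead = 0 Then Exit Do
    For i = 0 To (bytesRead \ RECORD_LEN) - 1   ' only the complete records actually read
        base = i * RECORD_LEN
        id = ConvertByteToLong(base, base + 3, buf)         ' hypothetical field at offset 0
        qty = ConvertByteToInteger(base + 4, base + 5, buf) ' hypothetical field at offset 4
        ' ... process the record ...
    Next i
Loop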