Export directly to UTF-8 from Access using VBA
data:image/s3,"s3://crabby-images/0d1df/0d1dfbe45f2ec141d2e712ddb0e3a80e2555e604" alt="Jimbo's picture Jimbo's picture"
Since the release of ECC 6.10 the SAP R/3 system supports Unicode, a system that uses multiple bytes to represent characters and allows symbols from non-ASCII alphabets to appear in the data. While ASCII can still be used for data from America and for simple financial data, the transport of data from one system to the next is almost always handled now using Unicode.
The flavors of Unicode
Massive tomes can be written on the origin, history and evolution of Unicode. Suffice it to say that of the many standards, pages, blocks, planes and formats we need only concentrate on the most popular version of the Unicode text file format: UTF-8. This standard is compatible with nearly every Unicode-enabled system and is the standard for SAP ECC 6.
MS Access as a tool for data conversion
MS Access handles Unicode internally automatically and without issue. It is by far the most powerful and useful non-SAP tool for Legacy Data Migration. Visual Basic for Applications (VBA) is a reduced-feature version of Visual Basic and adds the functionality to Access that makes it the programmer's choice. The traditional way to export to text using VBA is with the "Open ... As" statement. This defaults the data the traditional 8-bit standard ASCII encoding. Greek, when exported as ASCII text takes the form of question marks. In fact, the same can be said of most non-Western languages exported to ASCII. It's all Greek to me.
One option for exporting data to UTF-8 from Access is to create a template that Access can use to automatically format the data when it is exported. In Access 2008 this option is almost gone and has been replaced with "Save Export". In fact, the option to choose the delimiter to use is gone and now Access adds minuses and pipes to the data. The data is mostly worthless after that.
Stepping out of MS Access for a moment
Using the CreateObject function, Access can step outside the bounds of traditional VBA. These additional features are what make exporting directly to UTF-8 possible.
Added functionality and convenience
The code below determines what is being passed as the source within Access and then, based on whether the source is a table or selection query, forms the SQL statement on the data. Supporting the direct export of queries makes it easy to sort and filter the data as it is being exported.
Thanks where it is due
The second line in the code is the URL where I found the information I needed to convert a regular export-to-text function from ANSI-only to UTF-8. It has a great explanation of how to export to "Unicode" and I changed the type to "UTF-8" and it worked! If this works for you, please give SAPLSMW a little credit. You can download this and many other tools in Jimbo's VBA Automation Toolbox from the downloads section.
Function ExportToTextUnicode(strTableName As String, strFileName As String, Optional ByVal strDelim As String = vbTab) As Boolean 'Written by Jimbo at SAPLSMW.com 'Special thanks: accessblog.net/2007/06/how-to-write-out-unicode-text-files-in.html Dim rs As DAO.Recordset, strSQL As String Dim nCurrent As Long, nFieldCount As Long, nRecordCount As Long Dim RetVal As Variant, nCurRec As Long, dnow As Date, nCurSec As Long Dim nTotalSeconds As Long, nSecondsLeft As Long Dim strTest As String strSQL = "SELECT * FROM " & strTableName & ";" 'Check to see if strTableName is actually a query. If so, use its SQL query. nCurrent = 0 Do While nCurrent < CurrentDb.QueryDefs.Count If UCase(CurrentDb.QueryDefs(nCurrent).Name) = UCase(strTableName) Then strSQL = CurrentDb.QueryDefs(nCurrent).SQL End If nCurrent = nCurrent + 1 Loop Set rs = CurrentDb.OpenRecordset(strSQL) nFieldCount = rs.Fields.Count If Not rs.EOF Then 'Now find the *actual* record count--returns a value of 1 record if we don't do these moves. rs.MoveLast rs.MoveFirst End If nRecordCount = rs.RecordCount RetVal = SysCmd(acSysCmdInitMeter, "Exporting " & strTableName & " to " & strFileName & ". . .", nRecordCount) 'Create a binary stream Dim UnicodeStream Set UnicodeStream = CreateObject("ADODB.Stream") UnicodeStream.Charset = "UTF-8" UnicodeStream.Open For nCurrent = 0 To nFieldCount - 1 If Right(rs.Fields(nCurrent).Name, 1) = "_" Then UnicodeStream.writetext Left(rs.Fields(nCurrent).Name, Len(rs.Fields(nCurrent).Name) - 1) & strDelim Else UnicodeStream.writetext rs.Fields(nCurrent).Name & strDelim End If Next UnicodeStream.writetext vbCrLf nCurSec = Second(Now()) Do While Not rs.EOF nCurRec = nCurRec + 1 If Second(Now()) <> nCurSec And nCurRec <> rs.RecordCount Then nCurSec = Second(Now()) RetVal = SysCmd(acSysCmdUpdateMeter, nCurRec) RetVal = DoEvents() End If strTest = "" For nCurrent = 0 To nFieldCount - 1 'Check for blank lines--no need to export those! strTest = strTest & IIf(IsNull(rs.Fields), "", rs.Fields(nCurrent)) Next If Len(Trim(strTest)) > 0 Then 'Check for blank lines--no need to export those! For nCurrent = 0 To nFieldCount - 1 If Not IsNull(rs.Fields(nCurrent).Value) Then UnicodeStream.writetext Trim(rs.Fields(nCurrent).Value) End If If nCurrent = (nFieldCount - 1) Then UnicodeStream.writetext vbCrLf 'new line. Else UnicodeStream.writetext strDelim End If Next End If rs.MoveNext Loop 'Check to ensure that the file does't already exist. If Len(Dir(strFileName)) > 0 Then Kill strFileName ' The file exists, so we must delete it before it be created again. End If UnicodeStream.SaveToFile strFileName UnicodeStream.Close rs.Close Set rs = Nothing ExportToTextUnicode = True RetVal = SysCmd(acSysCmdRemoveMeter) End Function