PyRFC - Remote Controlling SAP ABAP systems with python

PyRFC - Remote Controlling SAP ABAP systems with python

Set up and run a remote function call on a BW system with pyRFC

Hey there! It can be hard as a Data Scientist in a BI organization that uses SAP BW as the main data warehouse technology. All these hard earned python skills are useless when your data is locked up under a layer of application written in ABAP, SAP’s proprietary language.

Imagine you could run ABAP modules to read and write into tables or execute functionality in a SAP Netweaver based system, all from Python. This is possible with pyRFC, a python library that works as an API to SAP’s Netweaver Remote Function Call Software Development Kit (NWRFC SDK). The SDK allows developers to build programs that can call or can be called from ABAP software.

Prerequisites

  • SAP RFC SDK 7.5: To download the SDK you need a SAP customer account that is usually tied to a license. The download link can be found in SAP Note 2573790, follow the link here. Make sure to choose the right operating system under downloads, the dropdown is a little tricky to find.
  • SAP GUI for Windows: Needed for systems with Single Sign on enabled
  • Python
  • Cython: Pythons C extension, required since the SDK is a C/C++ interface

Install SAP NetWeaver RFC SDK 7.50

Windows

Installation is straight forward following the instructions here. You only have to create an env variable and extend the path. Env variables under windows are set permanently (written to registry) with setx <variable>. You need to restart the shell after executing. Or you can use the GUI under advanced system settings –> environment variables. Choose if you want to change the variables system wide or only for your user.

Ubuntu

Note: I run Ubuntu 20.04 LTS and had to install from source.

Both python-dev and the python library Cython are a prerequisite to install from source. I install python dev with ubuntu’s package manager apt and Cython with pip.

In bash I create a new directory mkdir /usr/local/sap/nwrfcsdk. Then I set the SAPNWRFC_HOME env variable to that directory with export SAPNWRFC_HOME=/usr/local/sap/nwrfcsdk. To make sure that worked I run echo $SAPNWRFC_HOME. Next I copy the downloaded SDK into that directory and unzip it with unzip /usr/local/sap/nwrfcsdk/nwrfc750P_9-70002752.zip. I move the contents of the newly created folder to its parent folder and delete the now empty nwrfcsdk folder. So that python can find the libraries its important to create necessary system links. For this I add the lib folder to the path variable with export PATH=$PATH:/usr/local/sap/nwrfcsdk/lib.

To update the shared libraries links and cache I follow the documentation and create a config file in /etc/ld.so.conf.d. As root: vi /etc/ld.so.conf.d/nwrfcsdk.conf. And in the file I add:

# include nwrfcsdk
/usr/local/sap/nwrfcsdk/lib

Now as root: ldconfig and to check if the libraries were linked ldconfig -p | grep sap should show the libraries from /usr/local/sap/nwrfcsdk/lib

Install pyRFC

Get the latest release from GitHub by checking the release page. At the time of writing the latest release is v2.5.0.. Installing pyRFC on Linux required me to build it from source. So I get the Source code archive with git clone https://github.com/SAP/PyRFC cd into the cloned directory and run python3 setup.py bdist_wheel. This should also work on Windows although here I was able to pip install the wheel I got from here

To check that the installation is working, I run from python:

from pyRFC import *

It returns nothing so I am good to go.

Python class to read tables from a BW

The class implementation is inspired by a blog post from lwrc.

It calls RowsAtATime per batch and iterates as long as the result is larger than RowsAtATime.

In my experience it is safe to set RowsAtATime to ~1000.

Other important parameter are:

  • Fields: Allows you to specify fields as a list, the RFC module fails silently and returns no data when field is not present in table
  • TableName: Specify name of table you want to call, check e.g. via TCodeSearch Where: Specify a where condition, sensitive to spaces around =
  • RowsAtATime: The number of rows you want to transfer per iteration/ batch, smaller batches might be safer but take longer
  • RowSkips: Number of rows to skip from the beginning of the table

Note that you can check the parameters of a RFC modules with SAP transaction SE37 from SAP Logon.

from pyrfc import Connection, ABAPApplicationError, ABAPRuntimeError, LogonError, CommunicationError
from pprint import PrettyPrinter
from configparser import ConfigParser
import pandas as pd
#Logging (not part of code below)
# import logging
# import logging.config
# for timing the download
from time import time

class main():
    def __init__(self):
        try:
            config = ConfigParser()
            config.read("sapnwrfc.cfg")
            params_connection = config._sections["connection"]
            self.conn = Connection( **params_connection)
            
        except CommunicationError:
            print("Could not connect to server.")
            raise
        except LogonError:
            print("Could not log in. Wrong credentials?")
            raise
        
    def read_table(self, Fields, TableName, Where = '', RowsAtATime=10, RowSkips=0):
      """A function to call RFC_READ_TABLE"""
      try:


          # Select option
          if Fields[0] == '*':
              Fields = []
          else:
              Fields = [{'FIELDNAME':x} for x in Fields] 

          # options to filter by table field
          options = [{'TEXT': x} for x in Where] 

          #list that gathers the queried data
          l_result = []

          # RFC call
          # if you want to time the download
          #start_time =  datetime.datetime.now()
          count = 0
          while True:
              result = self.conn.call("RFC_READ_TABLE",
                                  QUERY_TABLE=TableName, #
                                  DELIMITER='|',
                                  FIELDS = Fields,  
                                  OPTIONS=options,
                                  ROWCOUNT = RowsAtATime,
                                  ROWSKIPS=RowSkips,
                                  #ET_DATA_RETURN is somehow needed to export sstring data types, 
                                  USE_ET_DATA_4_RETURN="X")
              RowSkips += RowsAtATime

              for line in range(0, len(result['ET_DATA'])):
                  #only append list to list not the returned dictionary
                  l_result.append(result['ET_DATA'][line]["LINE"].strip())

              headers =  headers = [x['FIELDNAME'] for x in result['FIELDS']]  
              if len(result['ET_DATA']) < RowsAtATime:
                  break
          l_result = [x.strip().split('|') for x in l_result ]
          df = pd.DataFrame(l_result, columns=headers)
      except (ABAPApplicationError, ABAPRuntimeError):
          print("An error occurred.")
          raise

        # logging example
        # self.logger.info( "Download time - {0} download took --- {1} seconds ---".format(TableName,datetime.datetime.now()- start_time ))

      return df

Exemplary config

Exemplary config for connecting to a SAP Netweaver system:


[connection]
#For a definition of parameters check: https://docs.oracle.com/cd/E19509-01/820-5064/ggrpj/index.html

#For productive systems use message server:
#mshost=<FQDN>
#else directly connect to the application sever:
ashost=<FQDN>

sysnr= <SysNr, e.g. 00>
sysid=<SysID>
client= <ClientID e.g. 100>

#for user, password based authentication, not working when SNC is enabled:
#user= 
#passwd=

# Find parameters in logon or eclipse:
snc_partnername=<SAP Server SNC name, e.g.  p:CN=EC7, O=SAP, C=US, OU=SAP>
snc_myname= p:CN=<user>
snc_qop= <the quality of the protection level, e.g. 9>
#language for connection (DE/EN):
lang= EN