LLM-Driven Open Banking and SAP Integration Framework: A Scalable Cloud Architecture using Databricks AI, Gradient Boosting, and Automated Software Testing

Ivan Mikhailovich Sokolov

doi:10.15662/IJEETR.2025.0706011

Authors

Ivan Mikhailovich Sokolov, Lead Engineer, Russia

DOI:

https://doi.org/10.15662/IJEETR.2025.0706011

Keywords:

open banking, SAP integration, large language models (LLMs), Databricks Lakehouse, gradient boosting machine, automated software testing, cloud architecture, fintech ecosystem, API orchestration, predictive analytics

Abstract

In the rapidly evolving financial ecosystem, open banking initiatives demand agile, intelligent and secure architectures that can integrate legacy enterprise systems with modern AI-driven services. This paper proposes a scalable cloud framework that leverages large language models (LLMs) for understanding and orchestrating open banking APIs, integrates with the enterprise resource planning backbone of SAP S/4HANA (and related modules), utilises the Databricks Lakehouse platform for unified data and AI workloads, and employs gradient boosting machine (GBM) models for structured-data predictive tasks. In addition, the framework embeds automated software testing pipelines to support reliability, compliance and continuous delivery. The architecture supports real-time or near-real-time scenarios such as account linking, consent management, payment initiation, fraud monitoring and regulatory reporting. An LLM interface parses natural-language requests (e.g., from fintechs or corporate clients) into open banking transactions and maps them onto SAP-centric business processes. The Databricks Lakehouse hosts the data ingestion, transformation, feature engineering and model serving layers; the GBM component handles high-volume structured-data tasks such as credit risk scoring and anomaly detection; and the automated testing pipelines verify the integrity of API integrations, model updates and end-to-end workflows. We describe the conceptual architecture, component interactions, design criteria (governance, latency, scalability, security, explainability) and implementation methodology. The proposed framework targets both business agility and operational resilience, offering advantages in responsiveness, reuse and governance; we also discuss its limitations around complexity, model drift and regulatory risk.
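The step in which an LLM turns a natural-language request into an open banking transaction and a matching SAP-side process can be illustrated with a minimal sketch. It assumes the LLM returns a JSON intent, which is validated before any payload is built. The intent schema, the function names, and every payload field below are hypothetical; the abstract does not specify them, and they do not reproduce any particular open banking standard or SAP API.

```python
import json
from dataclasses import dataclass

# Assumption: this sketch covers payment initiation only; the other scenarios
# named in the abstract (account linking, consent management) would need their own schemas.
SUPPORTED_ACTIONS = {"payment_initiation"}


@dataclass(frozen=True)
class Intent:
    """Hypothetical structured intent extracted by the LLM."""
    action: str
    amount: float
    currency: str
    debtor_iban: str
    creditor_iban: str


def parse_llm_output(raw: str) -> Intent:
    """Validate the JSON the LLM is assumed to return before anything is executed."""
    data = json.loads(raw)
    if data.get("action") not in SUPPORTED_ACTIONS:
        raise ValueError(f"unsupported action: {data.get('action')!r}")
    amount = float(data["amount"])
    if amount <= 0:
        raise ValueError("amount must be positive")
    return Intent(
        action=data["action"],
        amount=amount,
        currency=str(data["currency"]).upper(),
        debtor_iban=data["debtor_iban"],
        creditor_iban=data["creditor_iban"],
    )


def to_open_banking_request(intent: Intent) -> dict:
    """Build an illustrative payment-initiation body (field names are assumptions)."""
    return {
        "instructedAmount": {"amount": f"{intent.amount:.2f}", "currency": intent.currency},
        "debtorAccount": {"iban": intent.debtor_iban},
        "creditorAccount": {"iban": intent.creditor_iban},
    }


def to_sap_posting(intent: Intent) -> dict:
    """Build an illustrative ERP-side posting payload (hypothetical fields, not an SAP API)."""
    return {
        "document_type": "outgoing_payment",
        "currency": intent.currency,
        "amount": round(intent.amount, 2),
        "reference": intent.creditor_iban,
    }


# Stand-in for an LLM response to a request such as
# "Pay 250 euros from our operating account to the supplier".
llm_response = json.dumps({
    "action": "payment_initiation",
    "amount": "250",
    "currency": "eur",
    "debtor_iban": "DE00EXAMPLE0001",
    "creditor_iban": "DE00EXAMPLE0002",
})

intent = parse_llm_output(llm_response)
print(to_open_banking_request(intent))
print(to_sap_posting(intent))
```

Validating the LLM output against a fixed set of actions before building any payload is one way to keep a free-text interface from triggering unintended transactions; it also gives the automated testing pipeline a deterministic function to assert against.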

