Integrate chat completion into your business applications with SAP Cloud SDK for AI. Leverage the Generative AI Hub of SAP AI Core to make use of templating, grounding, data masking, content filtering and more. Access all features of SAP AI Core with the SAP Cloud SDK for AI.
Set up your SAP AI Core instance with SAP Cloud SDK for AI Core.
We maintain a list of currently available and tested AI Core APIs.
Add the following dependencies to your pom.xml file:
<dependencies>
<dependency>
<groupId>com.sap.ai.sdk</groupId>
<artifactId>core</artifactId>
<version>${ai-sdk.version}</version>
</dependency>
</dependencies>
See an example pom in our Spring Boot application
public AiDeploymentCreationResponse createDeployment() {
final AiDeploymentCreationResponse deployment =
new DeploymentApi(getClient())
.create(
"default",
AiDeploymentCreationRequest.create()
.configurationId("12345-123-123-123-123456abcdefg"));
String id = deployment.getId();
AiExecutionStatus status = deployment.getStatus();
return deployment;
}
See an example in our Spring Boot application
public AiDeploymentDeletionResponse deleteDeployment(AiDeploymentCreationResponse deployment) {
DeploymentApi client = new DeploymentApi(getClient());
if (deployment.getStatus() == AiExecutionStatus.RUNNING) {
// Only RUNNING deployments can be STOPPED
client.modify(
"default",
deployment.getId(),
AiDeploymentModificationRequest.create().targetStatus(AiDeploymentTargetStatus.STOPPED));
}
// Wait a few seconds for the deployment to stop
// Only UNKNOWN and STOPPED deployments can be DELETED
return client.delete("default", deployment.getId());
}
See an example in our Spring Boot application
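The "wait a few seconds" step above can be made robust with a small polling loop instead of a fixed sleep. The following is a generic, framework-free sketch (the Poll helper and its usage are illustrative, not part of the SDK); in real code the condition would check deployment.getStatus():

```java
import java.time.Duration;
import java.util.function.Supplier;

// Generic polling helper; the class and its use below are illustrative, not SDK API.
public class Poll {
  /** Re-evaluates the condition until it holds or maxAttempts is exhausted. */
  public static boolean until(Supplier<Boolean> condition, Duration interval, int maxAttempts)
      throws InterruptedException {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (condition.get()) {
        return true; // e.g. the deployment reached STOPPED
      }
      Thread.sleep(interval.toMillis());
    }
    return false; // gave up; the caller can fail or retry later
  }

  public static void main(String[] args) throws InterruptedException {
    // Stand-in for "deployment.getStatus() == AiExecutionStatus.STOPPED":
    int[] calls = {0};
    boolean stopped = Poll.until(() -> ++calls[0] >= 3, Duration.ofMillis(10), 5);
    System.out.println(stopped); // prints "true"
  }
}
```

Bounding the attempts keeps the cleanup from hanging forever if a deployment never leaves RUNNING.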
Add the following dependencies to your pom.xml file:
<dependencies>
<dependency>
<groupId>com.sap.ai.sdk.foundationmodels</groupId>
<artifactId>openai</artifactId>
<version>${ai-sdk.version}</version>
</dependency>
</dependencies>
See an example pom in our Spring Boot application
final OpenAiChatCompletionOutput result =
OpenAiClient.forModel(GPT_35_TURBO)
.withSystemPrompt("You are a helpful AI")
.chatCompletion("Hello World! Why is this phrase so famous?");
final String resultMessage = result.getContent();
final var systemMessage =
new OpenAiChatSystemMessage().setContent("You are a helpful assistant");
final var userMessage =
new OpenAiChatUserMessage().addText("Hello World! Why is this phrase so famous?");
final var request =
new OpenAiChatCompletionParameters().addMessages(systemMessage, userMessage);
final OpenAiChatCompletionOutput result =
OpenAiClient.forModel(GPT_35_TURBO).chatCompletion(request);
final String resultMessage = result.getContent();
See an example in our Spring Boot application
Any other model can be used by passing its name to OpenAiModel:
final OpenAiChatCompletionOutput result =
OpenAiClient.forModel(new OpenAiModel("model")).chatCompletion(request);
It's possible to pass a stream of chat completion delta elements, e.g. from the application backend to the frontend in real-time.
This is a blocking example for streaming and printing directly to the console:
String msg = "Can you give me the first 100 numbers of the Fibonacci sequence?";
OpenAiClient client = OpenAiClient.forModel(GPT_35_TURBO);
// try-with-resources on stream ensures the connection will be closed
try (Stream<String> stream = client.streamChatCompletion(msg)) {
stream.forEach(deltaString -> {
System.out.print(deltaString);
System.out.flush();
});
}
The following example is non-blocking. Any asynchronous library can be used, e.g. classic Thread API.
String msg = "Can you give me the first 100 numbers of the Fibonacci sequence?";
OpenAiChatCompletionParameters request =
new OpenAiChatCompletionParameters()
.addMessages(new OpenAiChatUserMessage().addText(msg));
OpenAiChatCompletionOutput totalOutput = new OpenAiChatCompletionOutput();
OpenAiClient client = OpenAiClient.forModel(GPT_35_TURBO);
// Do the request before the thread starts to handle exceptions during request initialization
Stream<OpenAiChatCompletionDelta> stream = client.streamChatCompletionDeltas(request);
Thread thread = new Thread(() -> {
// try-with-resources ensures the stream is closed
try (stream) {
stream.peek(totalOutput::addDelta).forEach(delta -> System.out.println(delta));
}
});
thread.start(); // non-blocking
thread.join(); // blocking
// access aggregated information from total output, e.g.
Integer tokens = totalOutput.getUsage().getCompletionTokens();
System.out.println("Tokens: " + tokens);
Please find an example in our Spring Boot application.
It shows the usage of Spring Boot's ResponseBodyEmitter
to stream the chat completion delta messages to the frontend in real-time.
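Framework aside, the relay pattern behind such a controller can be sketched in plain Java: a producer thread pushes delta strings into a queue, and the caller drains them as they arrive. All names here are illustrative stand-ins, not SDK or Spring API:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative relay: producer thread -> queue -> consumer, mirroring backend -> emitter -> frontend.
public class DeltaRelay {
  private static final String EOF = "\u0000"; // sentinel marking the end of the stream

  /** Pushes deltas through a queue on a producer thread and drains them on the caller's thread. */
  public static String relay(String[] deltas) throws InterruptedException {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    Thread producer = new Thread(() -> {
      for (String delta : deltas) {
        queue.add(delta); // stands in for consuming streamChatCompletionDeltas(...)
      }
      queue.add(EOF);
    });
    producer.start();
    StringBuilder received = new StringBuilder();
    for (String chunk = queue.take(); !chunk.equals(EOF); chunk = queue.take()) {
      received.append(chunk); // stands in for emitter.send(chunk) in a Spring controller
    }
    producer.join();
    return received.toString();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(relay(new String[] {"Hello", " ", "World"})); // prints "Hello World"
  }
}
```

The blocking queue decouples the two sides, so the frontend receives each chunk as soon as the model emits it.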
Add the following dependencies to your pom.xml file:
<dependencies>
<dependency>
<groupId>com.sap.ai.sdk</groupId>
<artifactId>orchestration</artifactId>
<version>${ai-sdk.version}</version>
</dependency>
</dependencies>
See an example pom in our Spring Boot application
final var llmConfig = LLMModuleConfig.create().modelName("gpt-35-turbo").modelParams(Map.of());
final var inputParams =
Map.of("input", "Reply with 'Orchestration Service is working!' in German");
final var template = ChatMessage.create().role("user").content("{{?input}}");
final var templatingConfig = TemplatingModuleConfig.create().template(template);
final var config =
CompletionPostRequest.create()
.orchestrationConfig(
OrchestrationConfig.create()
.moduleConfigurations(
ModuleConfigs.create()
.llmModuleConfig(llmConfig)
.templatingModuleConfig(templatingConfig)))
.inputParams(inputParams);
final CompletionPostResponse result =
new OrchestrationCompletionApi(getOrchestrationClient("default"))
.orchestrationV1EndpointsCreate(config);
final String messageResult =
result.getOrchestrationResult().getChoices().get(0).getMessage().getContent();
See an example in our Spring Boot application
final var llmConfig = LLMModuleConfig.create().modelName("gpt-35-turbo").modelParams(Map.of());
List<ChatMessage> messagesHistory =
List.of(
ChatMessage.create().role("user").content("What is the capital of France?"),
ChatMessage.create().role("assistant").content("The capital of France is Paris."));
final var message =
ChatMessage.create().role("user").content("What is the typical food there?");
final var templatingConfig = TemplatingModuleConfig.create().template(message);
final var config =
CompletionPostRequest.create()
.orchestrationConfig(
OrchestrationConfig.create()
.moduleConfigurations(
ModuleConfigs.create()
.llmModuleConfig(llmConfig)
.templatingModuleConfig(templatingConfig)))
.inputParams(Map.of())
.messagesHistory(messagesHistory);
final CompletionPostResponse result =
new OrchestrationCompletionApi(getOrchestrationClient("default"))
.orchestrationV1EndpointsCreate(config);
final String messageResult =
result.getOrchestrationResult().getChoices().get(0).getMessage().getContent();
See an example in our Spring Boot application
final var llmConfig = LLMModuleConfig.create().modelName("gpt-35-turbo").modelParams(Map.of());
final var inputParams =
Map.of(
"disclaimer",
"```DISCLAIMER: The area surrounding the apartment is known for prostitutes and gang violence including armed conflicts, gun violence is frequent.");
final var template =
ChatMessage.create()
.role("user")
.content(
"Create a rental posting for subletting my apartment in the downtown area. Keep it short. Make sure to add the following disclaimer to the end. Do not change it! {{?disclaimer}}");
final var templatingConfig = TemplatingModuleConfig.create().template(template);
final var filterStrict =
FilterConfig.create()
.type(FilterConfig.TypeEnum.AZURE_CONTENT_SAFETY)
.config(
AzureContentSafety.create()
.hate(NUMBER_0)
.selfHarm(NUMBER_0)
.sexual(NUMBER_0)
.violence(NUMBER_0));
final var filterLoose =
FilterConfig.create()
.type(FilterConfig.TypeEnum.AZURE_CONTENT_SAFETY)
.config(
AzureContentSafety.create()
.hate(NUMBER_4)
.selfHarm(NUMBER_4)
.sexual(NUMBER_4)
.violence(NUMBER_4));
final var filteringConfig =
FilteringModuleConfig.create()
// changing the input to filterLoose will allow the message to pass
.input(FilteringConfig.create().filters(filterStrict))
.output(FilteringConfig.create().filters(filterStrict));
final var config =
CompletionPostRequest.create()
.orchestrationConfig(
OrchestrationConfig.create()
.moduleConfigurations(
ModuleConfigs.create()
.llmModuleConfig(llmConfig)
.templatingModuleConfig(templatingConfig)
.filteringModuleConfig(filteringConfig)))
.inputParams(inputParams);
final CompletionPostResponse result =
new OrchestrationCompletionApi(getOrchestrationClient("default"))
// this fails with Bad Request because the strict filter prohibits the input message
.orchestrationV1EndpointsCreate(config);
final String messageResult =
result.getOrchestrationResult().getChoices().get(0).getMessage().getContent();
See an example in our Spring Boot application
In the following example we use the data masking module to anonymize persons and phone numbers in the input.
In this case, the input will be masked before the call to the LLM. However, the data cannot be unmasked in the LLM output.
final var inputParams = Map.of("privateInfo", "Patrick Morgan +49 (970) 333-3833");
final var template =
ChatMessage.create().role("user").content("What is the nationality of {{?privateInfo}}");
final var templatingConfig = TemplatingModuleConfig.create().template(template);
final var maskingProvider =
MaskingProviderConfig.create()
.type(MaskingProviderConfig.TypeEnum.SAP_DATA_PRIVACY_INTEGRATION)
.method(MaskingProviderConfig.MethodEnum.ANONYMIZATION)
.entities(
DPIEntityConfig.create().type(DPIEntities.PHONE),
DPIEntityConfig.create().type(DPIEntities.PERSON));
final var maskingConfig = MaskingModuleConfig.create().maskingProviders(maskingProvider);
final CompletionPostRequest config =
CompletionPostRequest.create()
.orchestrationConfig(
OrchestrationConfig.create()
.moduleConfigurations(
ModuleConfigs.create()
.llmModuleConfig(LLM_CONFIG)
.templatingModuleConfig(templatingConfig)
.maskingModuleConfig(maskingConfig)))
.inputParams(inputParams);
final CompletionPostResponse result =
new OrchestrationCompletionApi(getOrchestrationClient("default"))
.orchestrationV1EndpointsCreate(config);
final String messageResult =
result.getOrchestrationResult().getChoices().get(0).getMessage().getContent();
See an example in our Spring Boot application
Change your LLM module configuration to add model parameters:
var llmConfig =
LLMModuleConfig.create()
.modelName("gpt-35-turbo")
.modelParams(
Map.of(
"max_tokens", 50,
"temperature", 0.1,
"frequency_penalty", 0,
"presence_penalty", 0));
See an example in our unit test
To add a header to AI Core requests, use the following code:
ApiClient client = Core.getClient().addDefaultHeader("header-key", "header-value");
DeploymentApi api = new DeploymentApi(client);
For more customization, creating a HeaderProvider is also possible.
For any AI Core service interaction, the SAP AI SDK requires credentials to be available at application runtime. By default, the credentials are extracted automatically from a service instance of type "aicore" bound to the application. Running the application locally without this service binding will throw an exception:
Could not find any matching service bindings for service identifier 'aicore'
There are multiple options to register the service binding:
- Bind an AI Core service instance to the application (VCAP_SERVICES env var entry).
- Set the AICORE_SERVICE_KEY environment variable.
- Run cds bind --to aicore --exec mvn spring-boot:run for hybrid testing.
- Create a "user-provided" service binding (not recommended for production).
- Declare a ServiceBinding or ServiceBindingAccessor in the application (not recommended for production).
Binding an instance of aicore to your application provides credentials in this shape:
{
"aicore": [
{
"clientid": "...",
"clientsecret": "...",
"url": "...",
"identityzone": "...",
"identityzoneid": "...",
"appname": "...",
"serviceurls": {
"AI_API_URL": "..."
}
}
]
}
Go to the SAP BTP Cockpit: Instances and Subscriptions -> Instances -> AI Core -> View Credentials -> Copy JSON.
Set it as the environment variable AICORE_SERVICE_KEY in your IDE, or in your terminal:
export AICORE_SERVICE_KEY='{ "serviceurls": { "AI_API_URL": ...'
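To verify the variable is picked up locally, you could read it back and pull out the AI_API_URL. This is a quick diagnostic sketch using only the JDK (a naive regex rather than a JSON parser; the class name is made up):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical local sanity check for the AICORE_SERVICE_KEY environment variable.
public class ServiceKeyCheck {
  /** Extracts the AI_API_URL value from a service-key JSON string (naive regex, demo only). */
  public static String extractAiApiUrl(String serviceKeyJson) {
    Matcher m = Pattern.compile("\"AI_API_URL\"\\s*:\\s*\"([^\"]+)\"").matcher(serviceKeyJson);
    return m.find() ? m.group(1) : null;
  }

  public static void main(String[] args) {
    String key = System.getenv("AICORE_SERVICE_KEY");
    if (key == null) {
      System.out.println("AICORE_SERVICE_KEY is not set");
    } else {
      System.out.println("AI_API_URL = " + extractAiApiUrl(key));
    }
  }
}
```

In real code the SDK parses the binding for you; this only confirms the variable is visible to the process.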
Look up the service-key credentials as explained in the previous step for AICORE_SERVICE_KEY, and maintain them in a destination (note: the destination URL requires the /v2 suffix). At application runtime, the following can be executed:
Destination destination = DestinationAccessor.getDestination("my-aicore");
ApiClient client = Core.getClient(destination);
Run the sample application locally:
cd sample-code/spring-app
mvn spring-boot:run
Deploy it to Cloud Foundry:
mvn clean package
cf push
This project is open to feature requests/suggestions, bug reports etc. via GitHub issues. Contribution and feedback are encouraged and always welcome. For more information about how to contribute, the project structure, as well as additional contribution information, see our Contribution Guidelines.
If you find any bug that may be a security problem, please follow the instructions in our security policy on how to report it. Please do not create GitHub issues for security-related doubts or problems.
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone. By participating in this project, you agree to abide by its Code of Conduct at all times.
Copyright 2024 SAP SE or an SAP affiliate company and ai-sdk-java contributors. Please see our LICENSE for copyright and license information. Detailed information including third-party components and their licensing/copyright information is available via the REUSE tool.