Documentation
This project's goal is to allow developers to easily implement chat that doesn't suck into their applications. One of the main holdbacks with chat today (January 2025) is that it doesn't allow the user to actually do anything. Getting information and asking questions is nice, but what happens when the user wants to take action? For example, if connected to a meal planning application, a user could say, 'What are my meals for the upcoming week?' 'Swap Thursday's evening meal to a meatless meal because we have a vegan friend coming over. And it'll need to feed 6 adults.' This scenario is impossible to handle unless an LLM can manipulate the user's data on their behalf in a safe, reliable manner. This is what the creators of the NLAPI set out to solve. Our engine creates the necessary database mutations or the series of API calls, depending on your integration, to get that done for the user. Our engine has access to the user's specific schema at runtime, so it knows whether it needs to ask follow-up questions or whether it can execute the user's command.
To accomplish this, we are using API Schemas, LLMs, and RAG. We give the LLM the same permissions as the authentication header you send us, meaning the LLM won't be able to do anything your user wouldn't be able to do or see.
(For complete guides, check out our guides.)
1. Create an API key
2. Save your schema (Devii users can skip this step)
3. Use your `nlapi-key` to start sending requests
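As a sketch of that last step, a first request could be built like this. The endpoint URL and header names below are assumptions, not confirmed by these docs; substitute the values from your portal:

```python
import json

NLAPI_URL = "https://api.nlapi.io/nlapi"  # hypothetical endpoint URL -- check your portal

headers = {
    "Content-Type": "application/json",
    "nlapi-key": "your-api-key",                        # assumed header name for the API key
    "Authorization": "Bearer your-users-access-token",  # your user's token
}

# user_input is the only required key in the body.
payload = {"user_input": "What are my meals for the upcoming week?"}
body = json.dumps(payload)
# response = requests.post(NLAPI_URL, headers=headers, data=body)
```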
Return Messages: The API will return the latest message in the thread object's `messages` array. The latest message will always be at `messages[0]`, and the array will be ordered by `created_at` in descending order (latest -> oldest).
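For example, assuming a thread object shaped as described above (the field values here are illustrative), the newest message is read off the front of the array:

```python
thread = {
    "thread_id": "abc123",   # illustrative value
    "messages": [            # ordered by created_at, newest first
        {"content": "Done! Thursday is now meatless.", "speaker": "bot",
         "created_at": "2025-01-02T12:00:05Z"},
        {"content": "Swap Thursday's evening meal.", "speaker": "human",
         "created_at": "2025-01-02T12:00:00Z"},
    ],
}

latest = thread["messages"][0]  # latest message is always at index 0
print(latest["speaker"])        # -> bot
```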
Message Object: Each message object returned will have the following keys:
- `content`: The natural language input from the user (human) or natural language output from our models (bot)
- `speaker`: Identifies the message author. Currently, it will always be 'bot' or 'human'
- `created_at`: Timestamp of when the message was created
Threads are simply a conversation of messages.
Using Threads: Sometimes a user will try to interact with the model without providing the information required for a valid DB mutation. When this happens, the NLAPI will respond in natural language with a message indicating what information it needs to complete the user's request, and the developer will need to pass the `thread_id` key in the next request to follow up on the conversation. The NLAPI uses the full context of the thread to complete the request, so the user does not have to repeat information already mentioned in the thread.
New Threads: If no `thread_id` is provided in the request, a new thread is created, and the NLAPI has no access to previous messages.
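For instance, a follow-up request simply echoes the `thread_id` from the previous response (values illustrative):

```python
previous_response = {"thread_id": "abc123"}  # returned by the prior /nlapi call

followup = {
    "user_input": "And it'll need to feed 6 adults.",
    "thread_id": previous_response["thread_id"],  # keeps the conversation context
}
```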
Thread Expiration: [in development] After a thread expires, a user cannot add additional messages to it. Expiring threads is a security feature: if we didn't expire threads and a user had access to something early in the thread but not later, the NLAPI might assume the access still exists and hallucinate bad database interactions. With proper Role-Based Access Control policies, your user will still not be able to perform any action they are not allowed to, but longer threads could lead to a poor user experience and more hallucinations. We may change this in the future.
Thread Object: A thread object is returned from every `/nlapi` request. It has the following keys:
- `thread_id`: Used to keep track of threads so a user can follow up on a conversation.
- `created_at`: [in development] The timestamp the thread was created.
- `expires_at`: [in development] The timestamp when the thread expires. After this, no more messages will be accepted, and the NLAPI will return the thread object with the last message's content: 'Error: Thread has Expired, please start a new thread.'
Context is information the developer knows about the user or the current location of the request that the NLAPI would not. For example, in project management software, suppose a user is on a project dashboard page and sends the payload `{"user_input": "add a task to this project called make documentation for feature x"}`. The NLAPI would not know which project the user is referencing. Because the user is on the project page, the developer could add the context key `context: ["user is viewing project with id 71"]` to the payload, and the NLAPI would understand that the user likely wants to add a task with a `project_id` of 71.
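The payload for that example would look like this (the context string is free-form text describing what the developer knows):

```python
payload = {
    "user_input": "add a task to this project called make documentation for feature x",
    # Facts the developer knows but the user would not state explicitly:
    "context": ["user is viewing project with id 71"],
}
```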
How context is implemented is currently being shaped and is subject to change. Please send us your feedback on how you'd like to implement this: email us at jase@jasekraft.com.
To request a streamed response, you must include the "options" key, which contains an object with the key "stream" set to true. A sample payload might look like:
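For example (the surrounding keys follow the /nlapi request format described elsewhere in these docs; `stream` is shown here as a JSON boolean):

```json
{
  "user_input": "What are my meals for the upcoming week?",
  "options": { "stream": true }
}
```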
The `event` field describes the type of event that is being sent.
The `data` field contains a JSON string with information corresponding to the type of the event.
`status_message`
Status messages relay information about the steps the NLAPI is taking behind the scenes, whether that be the initial processing of the request, making queries, etc.
`message_chunk`
Message chunks contain JSON strings with "content" denoting the current message token and "thread_id" containing the id of the current conversation.
`close`
The close event is the last event in the response. In addition to "content" and "thread_id", close events also contain a "run_id".
`error`
If the NLAPI encountered an error while processing the request, it will send the error message in an error event.
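A minimal dispatcher over these four event types might look like the following sketch. The `data` fields are as described above; the event values and the assembled reply are illustrative:

```python
import json

def handle_event(event: str, data: str):
    """Illustrative dispatcher for the four NLAPI event types."""
    parsed = json.loads(data)
    if event == "status_message":
        return ("status", parsed)            # behind-the-scenes progress info
    if event == "message_chunk":
        return ("chunk", parsed["content"])  # one token of the reply
    if event == "close":
        return ("done", parsed["thread_id"])  # final event; also carries run_id
    if event == "error":
        raise RuntimeError(parsed)           # surface server-side errors
    return ("unknown", parsed)

# Assemble a reply from a hypothetical stream of events:
events = [
    ("message_chunk", '{"content": "Hello", "thread_id": "abc123"}'),
    ("message_chunk", '{"content": " world", "thread_id": "abc123"}'),
    ("close", '{"content": "Hello world", "thread_id": "abc123", "run_id": "r1"}'),
]
reply = "".join(part for kind, part in (handle_event(e, d) for e, d in events)
                if kind == "chunk")
print(reply)  # -> Hello world
```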
To continue improving our models, we allow users to optionally give feedback on responses. We currently use this data internally to continually improve our models. However, if you'd prefer not to participate in this continued refinement, you can simply reach out, and we can discuss options. Coming In Some Amount of Time: with enterprise installations of this software, the enterprise's exclusive model will continue to learn from the data in this feedback loop.
Developers can send POST requests to `/feedback` with the feedback object to provide feedback and improve our models. Users can only provide feedback on responses from threads they are logged into. As with requests made to `/nlapi`, you must include your access_token in the Authorization header of your request.
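The docs above do not spell out the shape of the feedback object, so the field names below are hypothetical placeholders; only the Authorization header requirement comes from the text:

```python
# Hypothetical feedback body -- the real field names may differ.
feedback = {
    "run_id": "r1",        # hypothetical: ties feedback to a specific response
    "rating": "positive",  # hypothetical
    "comment": "The task was added to the right project.",
}
headers = {"Authorization": "Bearer your-users-access-token"}  # required, per the docs
# response = requests.post("https://api.nlapi.io/feedback", json=feedback, headers=headers)
```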
To help users understand what they can do with the NLAPI, we provide a brief written summary of their capabilities at GET `/capabilities`. This endpoint requires an API key associated with an application. If the application is a Devii application, you must also include your user's access_token.
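A sketch of the request headers (the `nlapi-key` header name and URL are assumptions; the access_token requirement for Devii applications comes from the text above):

```python
headers = {
    "nlapi-key": "your-api-key",                        # assumed header name
    "Authorization": "Bearer your-users-access-token",  # Devii applications only
}
# response = requests.get("https://api.nlapi.io/capabilities", headers=headers)
```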
To enable the cloud-deployed NLAPI to access your local API, you can use Ngrok. Ngrok creates a secure tunnel from a public endpoint to a locally running service, making it an essential tool for development and testing. This guide provides a step-by-step process to set up Ngrok effectively.
This allows you to test how the NLAPI will respond to your API design prior to deployment. NOTE: Always use a separate application in your NLAPI portal for testing vs. production
Steps
1. Visit Ngrok's Website and Register: Fill in your details to create a new account, or sign in if you already have one.
2. Access Your Auth Token: After signing up, navigate to the Ngrok Dashboard to find your unique authentication token.
3. Install Ngrok.
4. Run Ngrok: Open your terminal and run the following command, replacing `YOUR_PORT` with the port number your local API is running on:
5. Update NLAPI Application.
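The command for the Run Ngrok step did not survive in this copy; the standard invocation (after adding your auth token with `ngrok config add-authtoken <token>`) is:

```shell
# Tunnel the local API on YOUR_PORT to a public ngrok URL
ngrok http YOUR_PORT
```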
If you have massive response type definitions (think many nested objects), we may run into context size limits and return 400 errors. Solving this is on our roadmap; if you are running into it, please email me at jase@jasekraft.com so we know to prioritize it.
If your API returns large payloads, it is likely you will not have great results with the NLAPI, mostly because you will hit the per-conversation context limit quickly and use up a lot of tokens. We have a few ideas to mitigate this in the future, but for now, we suggest adding pagination to any endpoint that may return a long list.
For Complete Swagger Docs go here:
Messages are the foundation of the information passed to and from the NLAPI server. To send a message, a developer can simply send a POST request to the `/nlapi` endpoint with `user_input` as part of the JSON object in the body. `user_input` is the only required input for this endpoint. You can continue a conversation by including the `thread_id` key in the request (see the Threads section for more info). You can also include the key `context` in the request (see the Context section for more info).
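Putting those keys together, a full request body might look like this sketch (values illustrative):

```python
import json

body = {
    "user_input": "Swap Thursday's evening meal to a meatless meal.",  # required
    "thread_id": "abc123",  # optional: continue an existing thread
    "context": ["user is viewing the meal plan for next week"],  # optional
}
wire = json.dumps(body)
```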
Responses can be streamed back from the NLAPI in the form of server-sent events. For more information, you can refer to the documentation on the server-sent events format. In order to consume server-sent events, you will need to use an event parser; for JavaScript, you might consider using an SSE parsing library.
The events that the NLAPI sends in response are transmitted as encoded bytes. Once decoded into plain text, each event will have two fields, "event" and "data".
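Decoding and splitting the stream into (event, data) pairs can be sketched like this, assuming events are separated by blank lines as in the standard SSE format:

```python
def parse_sse(raw: str):
    """Split a decoded SSE stream into (event, data) pairs."""
    events = []
    for block in raw.strip().split("\n\n"):  # a blank line separates events
        event, data_lines = None, []
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        events.append((event, "\n".join(data_lines)))
    return events

sample = ('event: message_chunk\n'
          'data: {"content": "Hi", "thread_id": "abc123"}\n\n'
          'event: close\n'
          'data: {"content": "Hi", "thread_id": "abc123", "run_id": "r1"}\n\n')
pairs = parse_sse(sample)
```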
The close event also returns the thread object (see the /nlapi route).
Go to the Ngrok website and click on the Sign Up button.
Download and install Ngrok by following the instructions on its download page.
Copy the generated public URL from the terminal and update your NLAPI application with this new URL.