Documentation
Purpose and Vision:
This project's goal is to let developers easily implement chat that doesn't suck into their applications. One of the main shortcomings of chats today (January 2025) is that they don't let the user actually do anything. Getting information and asking questions is nice, but what happens when the user wants to take action? For example, if connected to a meal planning application, a user could say, 'What are my meals for the upcoming week?' 'Swap Thursday's evening meal to a meatless meal because we have a vegan friend coming over. And it'll need to feed 6 adults.' This scenario is impossible to handle unless an LLM can manipulate the user's data on their behalf in a safe, reliable manner. That is what the creators of the NLAPI set out to solve. Our engine creates the database mutations (if using Devii) or the series of API calls (if using a REST API) needed to get that done for the user. Our engine has access to the user's specific schema at runtime, so it knows whether it needs to ask follow-up questions or can execute the user's command directly.
To accomplish this, we are using API Schemas, LLMs, and RAG. We give the LLM the same permissions as the authentication header you send us, meaning the LLM won't be able to do anything your user wouldn't be able to do or see.
Quick Start
(For complete guides, check out our Guides)
Create an API key Portal Docs
Save your schema Schema Docs (Devii users can skip this step)
Use your nlapi-key to start sending requests:
fetch('https://api.nlapi.io/nlapi', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'nlapi-key': 'YOUR_API_KEY',
  },
  body: JSON.stringify({
    user_input: 'create new ingredient called pinto beans',
    // context: ['user id is 1'], // optional
    // thread_id: 34232, // use this to follow up on a conversation
    // options: {
    //   stream: true, // returns chunks
    // },
  }),
})
  .then((response) => response.json())
  .then((data) => console.log(data))
  .catch((error) => console.error('Error:', error));
For complete Swagger docs, go here: NLAPI Swagger Docs
Key Concepts
Messages
Messages are the foundation of the information passed to and from the NLAPI server. To send a message to the server, a developer simply sends a POST request to the /nlapi endpoint with user_input as part of the JSON object in the body. user_input is the only required input for this endpoint. You can continue a conversation by including the thread_id key in the request (see the Threads section for more info). You can also include the context key in the request (see Context for more info).
Return Messages: The API returns the latest message in the thread object's messages array. The latest message will always be at messages[0], and the array is ordered by created_at in descending order (latest -> oldest).
Message Object: Each message object returned will have the following keys:
content: The natural language input from the user (human) or natural language output from our models (bot)
speaker: Identifies the message author. Currently, it will always be 'bot' or 'human'
created_at: Timestamp of when the message was created
Threads
Threads are simply a conversation of messages.
Using Threads: Sometimes a user will try to interact with the model without providing the information required for a valid DB mutation. When this happens, the NLAPI responds in natural language with a message indicating what information it needs to complete the user's request, and the developer needs to pass the thread_id key in the next request to follow up on the conversation. The NLAPI uses the full context of the thread to complete the request, so the user does not have to repeat information already mentioned in the thread.
New Threads: If no thread_id is provided in the request, a new thread is created, and the NLAPI has no access to previous messages.
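Following up on a conversation is the same POST as the quick-start example, plus the thread_id from the previous response. A sketch (request shape as documented; the helper name and sample values are our own):

```javascript
// Build the fetch options for a follow-up message on an existing thread.
// Omitting thread_id would start a fresh conversation instead.
function buildFollowUp(userInput, threadId) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'nlapi-key': 'YOUR_API_KEY',
    },
    body: JSON.stringify({
      user_input: userInput,
      thread_id: threadId,
    }),
  };
}

// Usage: fetch('https://api.nlapi.io/nlapi', buildFollowUp('make it meatless', 34232))
const req = buildFollowUp('make it meatless', 34232);
console.log(JSON.parse(req.body).thread_id); // 34232
```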
Thread Expiration: [in development] After a thread expires, a user cannot add additional messages to it. Expiring threads is a security feature: if threads never expired, and a user had access to something early in the thread but lost it later, the NLAPI might still assume access and hallucinate bad database interactions. With proper Role-Based Access Control policies, your user still cannot perform any action they are not allowed to, but longer threads could lead to a poor user experience and more hallucinations. We may change this in the future.
Thread Object: The thread object is returned from every /nlapi request. It has the following keys:
thread_id: Used to keep track of threads so a user can follow up on a conversation.
created_at: [in development] The timestamp the thread was created.
expires_at: [in development] The timestamp when the thread expires. After this, no more messages will be accepted, and the NLAPI will return the thread object with the last message's content: 'Error: Thread has Expired, please start a new thread.'
Context
Context is information the developer knows about the user or the current location of the request that the user and the NLAPI would not. For example, in project management software, if a user on a project dashboard page sends the payload {'user_input': 'add a task to this project called make documentation for feature x'}, the NLAPI would not know which project the user is referencing. Because the user is on the project page, the developer could add the key context: ["user is viewing project with id 71"] to the payload, and the NLAPI would understand that the user likely wants to add a task with a project_id of 71.
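Putting that together, the project-dashboard payload might be assembled like this (a sketch; context entries are plain natural-language strings, and the sample values come from the example above):

```javascript
// Attach page-level context the model couldn't otherwise know.
// Each context entry is a plain natural-language string.
function buildPayload(userInput, contextStrings) {
  return {
    user_input: userInput,
    context: contextStrings,
  };
}

const payload = buildPayload(
  'add a task to this project called make documentation for feature x',
  ['user is viewing project with id 71'] // e.g. derived from the current route
);
console.log(JSON.stringify(payload));
```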
How context is implemented is currently being shaped and is subject to change. Please send us your feedback on how you'd like to implement this. Email jase@jasekraft.com
Streaming
Responses can be streamed back from NLAPI in the form of server-sent events. For more information on server-sent events, you can refer to the MDN documentation. In order to consume server-sent events, you will need to use an event parser. For JavaScript, you might consider using eventsource-parser.
To request a streamed response, you must include the "options" key, which contains an object with the key "stream" set to true. A sample payload might look like:
{
  "user_input": "Create a new recipe called 'meatloaf'",
  "thread_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab",
  "context": ["logged in user's id is 17283"],
  "options": {
    "stream": true
  }
}
The events that the NLAPI sends in response will be sent as Uint8Arrays. Once decoded into plain text, they will have two fields, "event" and "data".
The event field describes the type of event that is being sent.
The data field contains a JSON string with information corresponding to the type of the event.
Streaming Events
status_message
Status messages relay information about the steps the NLAPI is taking behind the scenes, whether that be the initial processing of the request, making queries, etc.
event: status_message
data: {"content": "querying", "thread_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab"}
message_chunk
Message chunks contain JSON strings with "content" denoting the current message token and "thread_id" containing the id of the current conversation.
event: message_chunk
data: {"content": "ketchup", "thread_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab"}
close
The close event is the last event in the response. In addition to "content" and "thread_id", close events also contain a "run_id".
This event returns the NLAPI Response Object. (See the /nlapi route)
error
If the NLAPI encountered an error while processing the request, it will send the error message in an error event.
event: error
data: {"content": "Error: API Connection error", "thread_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab", "run_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab", "processing_time": 3.1492, "query_time": 0.4498, "total_time": 3.599, "time_to_first_token": 2.2372}
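If you'd rather not add a dependency like eventsource-parser, the events above can be decoded by hand: decode each Uint8Array chunk, split on the blank line that separates SSE events, and pick out the event and data fields. A rough sketch (partial-chunk buffering is simplified, and the Node-specific process.stdout call is just for illustration; a real parser library handles edge cases for you):

```javascript
// Parse one decoded SSE block of the form:
//   event: message_chunk
//   data: {"content": "...", "thread_id": "..."}
function parseSseBlock(block) {
  const out = {};
  for (const line of block.split('\n')) {
    if (line.startsWith('event:')) out.event = line.slice('event:'.length).trim();
    else if (line.startsWith('data:')) out.data = JSON.parse(line.slice('data:'.length).trim());
  }
  return out;
}

// Consume a streamed fetch response (sketch; assumes options.stream was true).
async function consumeStream(response) {
  const decoder = new TextDecoder();
  const reader = response.body.getReader();
  let buffer = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // SSE events are separated by a blank line.
    const blocks = buffer.split('\n\n');
    buffer = blocks.pop(); // keep any partial event for the next chunk
    for (const b of blocks) {
      const { event, data } = parseSseBlock(b);
      if (event === 'message_chunk') process.stdout.write(data.content);
      if (event === 'close' || event === 'error') return data; // final response object
    }
  }
}
```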
Feedback (In Development)
To continue improving our models, we optionally allow users to give feedback on responses. We currently use this data internally to continually improve our models. However, if you'd prefer not to participate in the continued refinement of our data, simply reach out and we can discuss options. Coming in some amount of time: with enterprise installations of this software, the enterprise's exclusive model will continue to learn from the data in this feedback loop.
Developers can send POST requests to /feedback with the feedback object to provide feedback and improve our models. Users can only provide feedback on responses from threads they are logged into. As with requests made to /nlapi, you must include your access_token in the Authorization header of your request.
Feedback Object
{
  "run_id": string, // The id of the run returned from the chat response.
  "score": 0 | 1 // Was the response good or bad? 0 for bad, 1 for good.
}
Capabilities Summary
To help users understand what they can do with the NLAPI, we provide a brief written summary of their capabilities at GET /capabilities. This endpoint requires an API key associated with an application. If the application is a Devii application, you must also include your user's access_token.
// Response schema
{
  "capabilities_summary": string
}
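A small sketch of calling this endpoint. The conditional Authorization header reflects the Devii note above; as with the feedback example, the Bearer scheme and base URL are assumptions:

```javascript
// Build headers for GET /capabilities. Devii applications must also
// forward the end user's access_token.
function capabilitiesHeaders(apiKey, userAccessToken) {
  const headers = { 'nlapi-key': apiKey };
  if (userAccessToken) {
    headers['Authorization'] = `Bearer ${userAccessToken}`; // assumption: bearer scheme
  }
  return headers;
}

// Usage:
// fetch('https://api.nlapi.io/capabilities', { headers: capabilitiesHeaders('YOUR_API_KEY') })
//   .then((r) => r.json())
//   .then((d) => console.log(d.capabilities_summary));
```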
Local Development
To enable the NLAPI to access your local API while it's deployed on the cloud, you can use Ngrok. Ngrok creates a secure tunnel from a public endpoint to a locally running service, making it an essential tool for development and testing. This guide provides a step-by-step process to set up Ngrok effectively.
This lets you test how the NLAPI will respond to your API design prior to deployment. NOTE: Always use a separate application in your NLAPI portal for testing vs. production.
Steps
Visit Ngrok's Website:
Go to ngrok.com and click on the Sign Up button.
Register:
Fill in your details to create a new account or sign in if you already have one.
Access Your Auth Token:
After signing up, navigate to the Ngrok Dashboard to find your unique authentication token.
Install Ngrok:
Download and install Ngrok by following the instructions on the Ngrok download page.
Run Ngrok:
Open your terminal and run the following command, replacing YOUR_PORT with the port number your local API is running on:
ngrok http YOUR_PORT
Update NLAPI Application:
Copy the generated public URL from the terminal and update your NLAPI application with this new URL. Example update request
Known Limitations:
Large Response Types
If you have massive response type definitions (think many nested objects), we may run into context size limits and return 400 errors. Solving this is on our roadmap, so if you're running into it, please email me at jase@jasekraft.com so we know to make it a priority.
Large Response Payloads
If your API returns large payloads, you likely won't get great results with the NLAPI, mostly because you will quickly hit the per-conversation context limit and use up a lot of tokens. We have a few ideas to mitigate this in the future, but for now we suggest adding pagination to any endpoint that may return a long list.