Documentation
Purpose and Vision:
This project's goal is to let developers easily implement chat that doesn't suck into their applications. One of the main shortcomings of chats today (January 2025) is that they don't let the user actually do anything. Getting information and asking questions is nice, but what happens when the user wants to take action? For example, if connected to a meal planning application, a user could say, 'What are my meals for the upcoming week?' 'Swap Thursday's evening meal to a meatless meal because we have a vegan friend coming over. And it'll need to feed 6 adults.' This scenario is impossible to handle unless an LLM can manipulate the user's data on their behalf in a safe, reliable manner. That is what the creators of the NLAPI set out to solve. Our engine creates the database mutations (if using Devii) or the series of API calls (if using a REST API) needed to get that done for the user. Our engine has access to the user's specific schema at runtime, so it knows whether it needs to ask follow-up questions or can execute the user's command directly.
To accomplish this, we are using API Schemas, LLMs, and RAG. We give the LLM the same permissions as the authentication header you send us, meaning the LLM won't be able to do anything your user wouldn't be able to do or see.
Quick Start
(For complete guides, check out our Guides)
Create an API key Portal Docs
Save your schema Schema Docs (Devii users can skip this step)
Use your nlapi-key to start sending requests:
fetch('https://api.nlapi.io/nlapi', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'nlapi-key': 'YOUR_API_KEY',
  },
  body: JSON.stringify({
    user_input: 'create new ingredient called pinto beans',
    // context: ['user id is 1'], // optional
    // thread_id: 34232, // use this to follow up on a conversation
    // options: {
    //   stream: true, // returns chunks
    // },
  }),
})
  .then((response) => response.json())
  .then((data) => console.log(data))
  .catch((error) => console.error('Error:', error));
For complete Swagger docs, go here: NLAPI Swagger Docs
Key Concepts
Messages
Messages are the foundation of the information passed to and from the NLAPI server. To send a message to the server, a developer simply sends a POST request to the /nlapi endpoint with user_input as part of the JSON object in the body. user_input is the only required input for this endpoint. You can continue a conversation by including the thread_id key in the request (see the Threads section for more info). You can also include the context key in the request (see Context for more info).
Return Messages: The API returns the latest message in the thread object's messages array. The latest message will always be at messages[0], and the array is ordered by created_at in descending order (latest -> oldest).
Message Object: Each message object returned will have the following keys:
content: The natural language input from the user (human) or natural language output from our models (bot)
speaker: Identifies the message author. Currently, it will always be 'bot' or 'human'
created_at: Timestamp of when the message was created
Threads
Threads are simply a conversation of messages.
Using Threads: Sometimes a user will try to interact with the model without providing the information required for a valid DB mutation. When this happens, the NLAPI responds in natural language with a message indicating what information it needs to complete the user's request, and the developer needs to pass the thread_id key in the next request to follow up on the conversation. The NLAPI uses the full context of the thread to complete the request, so the user does not have to repeat information already mentioned in the thread.
New Threads: If no thread_id is provided in the request, a new thread is created, and the NLAPI has no access to previous messages.
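Following up on a conversation is the same POST as the quick-start example, plus the thread_id from the previous response. A sketch (request shape as documented; the helper name and sample values are our own):

```javascript
// Build the fetch options for a follow-up message on an existing thread.
// Omitting thread_id would start a fresh conversation instead.
function buildFollowUp(userInput, threadId) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'nlapi-key': 'YOUR_API_KEY',
    },
    body: JSON.stringify({
      user_input: userInput,
      thread_id: threadId,
    }),
  };
}

// Usage: fetch('https://api.nlapi.io/nlapi', buildFollowUp('make it meatless', 34232))
const req = buildFollowUp('make it meatless', 34232);
console.log(JSON.parse(req.body).thread_id); // 34232
```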
Thread Expiration: [in development] After a thread expires, a user cannot add additional messages to it. Expiring threads is a security feature: if threads never expired, and a user had access to something early in the thread but lost it later, the NLAPI might still assume access and hallucinate bad database interactions. With proper Role-Based Access Control policies, your user still cannot perform any action they are not allowed to, but longer threads could lead to a poor user experience and more hallucinations. We may change this in the future.
Thread Object: The thread object is returned from every /nlapi request. It has the following keys:
thread_id: Used to keep track of threads so a user can follow up on a conversation.
created_at: [in development] The timestamp the thread was created.
expires_at: [in development] The timestamp when the thread expires. After this, no more messages will be accepted, and the NLAPI will return the thread object with the last message's content: 'Error: Thread has Expired, please start a new thread.'
Context
Context is information the developer knows about the user or the current location of the request that the user and the NLAPI would not. For example, in project management software, if a user on a project dashboard page sends the payload {'user_input': 'add a task to this project called make documentation for feature x'}, the NLAPI would not know which project the user is referencing. Because the user is on the project page, the developer could add the key context: ["user is viewing project with id 71"] to the payload, and the NLAPI would understand that the user likely wants to add a task with a project_id of 71.
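Putting that together, the project-dashboard payload might be assembled like this (a sketch; context entries are plain natural-language strings, and the sample values come from the example above):

```javascript
// Attach page-level context the model couldn't otherwise know.
// Each context entry is a plain natural-language string.
function buildPayload(userInput, contextStrings) {
  return {
    user_input: userInput,
    context: contextStrings,
  };
}

const payload = buildPayload(
  'add a task to this project called make documentation for feature x',
  ['user is viewing project with id 71'] // e.g. derived from the current route
);
console.log(JSON.stringify(payload));
```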
How context is implemented is currently being shaped and is subject to change. Please send us your feedback on how you'd like to implement this. Email jase@jasekraft.com
Streaming
Responses can be streamed back from NLAPI in the form of server-sent events. For more information on server-sent events, you can refer to the MDN documentation. In order to consume server-sent events, you will need to use an event parser. For JavaScript, you might consider using eventsource-parser.
To request a streamed response, you must include the "options" key, which contains an object with the key "stream" set to true. A sample payload might look like:
{
  "user_input": "Create a new recipe called 'meatloaf'",
  "thread_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab",
  "context": ["logged in user's id is 17283"],
  "options": {
    "stream": true
  }
}
The events that the NLAPI sends in response will be sent as Uint8Arrays. Once decoded into plain text, they will have two fields, "event" and "data".
The event field describes the type of event that is being sent.
The data field contains a JSON string with information corresponding to the type of the event.
Streaming Events
status_message
Status messages relay information about the steps the NLAPI is taking behind the scenes, whether that be the initial processing of the request, making queries, etc.
event: status_message
data: {"content": "querying", "thread_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab"}
message_chunk
Message chunks contain JSON strings with "content" denoting the current message token and "thread_id" containing the id of the current conversation.
event: message_chunk
data: {"content": "ketchup", "thread_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab"}
close
The close event is the last event in the response. In addition to "content" and "thread_id", close events also contain a "run_id".
This event returns the NLAPI Response Object. (See the /nlapi route)
error
If the NLAPI encountered an error while processing the request, it will send the error message in an error event.
event: error
data: {"content": "Error: API Connection error", "thread_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab", "run_id": "a1b2c3d4-5678-90ab-cdef-1234567890ab", "processing_time": 3.1492, "query_time": 0.4498, "total_time": 3.599, "time_to_first_token": 2.2372}
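If you'd rather not add a dependency like eventsource-parser, the events above can be decoded by hand: decode each Uint8Array chunk, split on the blank line that separates SSE events, and pick out the event and data fields. A rough sketch (partial-chunk buffering is simplified, and the Node-specific process.stdout call is just for illustration; a real parser library handles edge cases for you):

```javascript
// Parse one decoded SSE block of the form:
//   event: message_chunk
//   data: {"content": "...", "thread_id": "..."}
function parseSseBlock(block) {
  const out = {};
  for (const line of block.split('\n')) {
    if (line.startsWith('event:')) out.event = line.slice('event:'.length).trim();
    else if (line.startsWith('data:')) out.data = JSON.parse(line.slice('data:'.length).trim());
  }
  return out;
}

// Consume a streamed fetch response (sketch; assumes options.stream was true).
async function consumeStream(response) {
  const decoder = new TextDecoder();
  const reader = response.body.getReader();
  let buffer = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // SSE events are separated by a blank line.
    const blocks = buffer.split('\n\n');
    buffer = blocks.pop(); // keep any partial event for the next chunk
    for (const b of blocks) {
      const { event, data } = parseSseBlock(b);
      if (event === 'message_chunk') process.stdout.write(data.content);
      if (event === 'close' || event === 'error') return data; // final response object
    }
  }
}
```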
Feedback (In Development)
To continue improving our models, we optionally allow users to give feedback on responses. We currently use this data internally to continually improve our models. However, if you'd prefer not to participate in the continued refinement of our data, simply reach out and we can discuss options. Coming in some amount of time: with enterprise installations of this software, the enterprise's exclusive model will continue to learn from the data in this feedback loop.
Developers can send POST requests to /feedback with the feedback object to provide feedback and improve our models. Users can only provide feedback on responses from threads they are logged into. As with requests made to /nlapi, you must include your access_token in the Authorization header of your request.
Feedback Object
{
  "run_id": string, // The id of the run returned from the chat response.
  "score": 0 | 1 // Was the response good or bad? 0 for bad, 1 for good.
}
Capabilities Summary
To help users understand what they can do with the NLAPI, we provide a brief written summary of their capabilities at GET /capabilities. This endpoint requires an API key associated with an application. If the application is a Devii application, you must also include your user's access_token.
// Response schema
{
  "capabilities_summary": string
}
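A small sketch of calling this endpoint. The conditional Authorization header reflects the Devii note above; as with the feedback example, the Bearer scheme and base URL are assumptions:

```javascript
// Build headers for GET /capabilities. Devii applications must also
// forward the end user's access_token.
function capabilitiesHeaders(apiKey, userAccessToken) {
  const headers = { 'nlapi-key': apiKey };
  if (userAccessToken) {
    headers['Authorization'] = `Bearer ${userAccessToken}`; // assumption: bearer scheme
  }
  return headers;
}

// Usage:
// fetch('https://api.nlapi.io/capabilities', { headers: capabilitiesHeaders('YOUR_API_KEY') })
//   .then((r) => r.json())
//   .then((d) => console.log(d.capabilities_summary));
```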
Local Development
To enable the NLAPI to access your local API while it's deployed on the cloud, you can use Ngrok. Ngrok creates a secure tunnel from a public endpoint to a locally running service, making it an essential tool for development and testing. This guide provides a step-by-step process to set up Ngrok effectively.
This lets you test how the NLAPI will respond to your API design prior to deployment. NOTE: Always use a separate application in your NLAPI portal for testing vs. production.
Steps
Visit Ngrok's Website:
Go to ngrok.com and click on the Sign Up button.
Register:
Fill in your details to create a new account or sign in if you already have one.
Access Your Auth Token:
After signing up, navigate to the Ngrok Dashboard to find your unique authentication token.
Install Ngrok:
Download and install Ngrok by following the instructions on the Ngrok download page.
Run Ngrok:
Open your terminal and run the following command, replacing YOUR_PORT with the port number your local API is running on:
ngrok http YOUR_PORT
Update NLAPI Application:
Copy the generated public URL from the terminal and update your NLAPI application with this new URL. Example update request
Known Limitations:
Large Response Types
If you have massive response type definitions (think many nested objects), we may run into context size limits and return 400 errors. Solving this is on our roadmap, so if you're running into it, please email me at jase@jasekraft.com so we know to make it a priority.
Large Response Payloads
If your API returns large payloads, you likely won't get great results with the NLAPI, mostly because you will quickly hit the per-conversation context limit and use up a lot of tokens. We have a few ideas to mitigate this in the future, but for now we suggest adding pagination to any endpoint that may return a long list.