GPTScript Helix Apps: For Fun and Profit

Making GPTScript Shine with Open Source LLMs: How Llama 3's 70B Model Finally Makes Local Natural Language Programming Reliable

Dec 19, 2024

GPTscript is a fun new way to build AI applications using simple, composable scripts written in human language.

We’ve supported creating GPTscript apps on Helix since our 1.0 release. We’ve also participated in community calls providing feedback and improving our integration as a result of these discussions.

Up until recently running the GPTScript apps on Helix was a bit clunky and not particularly reliable. Part of the reason was the Open Source LLMs available to us at the time weren’t particularly well fine-tuned for function calling, which gptscript depends on.

That’s changed with the Llama3 family of models, especially with the large parameter models. Though you can make things work reasonably well with the smaller models you have to spend more time fiddling with prompts. Llama team recently released their latest 70B model and we saw it as an opportunity to improve our GPTScript integration.

In this blog post, we will walk you through creating GPTScript apps on the Helix platform. Do make sure to check the official documentation, too.

GPTscript terminal app

In this guide, we will show you how you can create a simple terminal app in Helix. You will need a Helix account. Feel free to create one here.

Once you’ve logged on, navigate to

This will bring up a screen that lists all your Helix apps. Click on the New App button in the upper right corner and create a new Helix app:

We will now configure our shiny new GPTscript app. We will call it GPTScriptTerminal, add a simple description and select the latest Lllama3.3 70B model fine-tuned for instructions following.

We’re ready to write a GPTscript now. Click on the GPTSCRIPTS tab. It will bring up a page that lets you add GPTScripts to your Helix apps. Click on ADD GPTSCRIPT button which will bring up a GPTScript app editor. Let’s add a simple GPTscript tool.

Our tool will listen to user queries and use GPTScript to execute commands in the terminal. Think of it as a web-based AI interface for your terminal. Here’s what our tool contents look like.

(for easy copy-pasting)

Name: terminal
Description: terminal to run commands

Script Content:

model: llama3.3:70b-instruct-q4_K_M from http://api:80/v1
tools: sys.exec, sys.read, sys.write, sys.ls
description: Find out current time

Listen to user queries and execute commands in the terminal as requested by users. You must only use the defined tools. Do not guess answers!

It’s quite simple, but for the sake of brevity, let’s walk through it step by step.

First, we give it a name and a simple description. Then we specify what model we’d like the GPTScript to use. We are leveraging GPTScript capability to use models with OpenAI-compatible APIs: we’re explicitly spelling it out to use the latest Llama model we’ve configured in the Helix app.

In the future, specifying the model will be easier than typing it into the editor like we did.

Next, we will import some useful GPTScript libraries and tools. For example, `sys.exec` can be used to execute commands in a terminal, such as `date`, etc. There are also tools for listing and reading files on the filesystem. In fact, GPTScript provides a whole suite of `sys` tools. Go check them out!

Finally, we will write a prompt. It will help us demonstrate some basic functionality. If you find the LLM making things up, you might need to fiddle with it, but for our use case, it will do just fine.

Let’s save the editor and take our shiny new GPTScript app for a spin now! Click on the LAUNCH button in the upper right corner.

This will open a new App session where you can interact with the app via chat.

Let’s ask it what the current time is. Note how we explicitly mentioned the name of our tool (terminal) to give the LLM a helpful nudge so it understands our intent better. The more explicit we are the easier it is for the LLMs to generate helpful answers.

And here we are. The app has successfully returned the current time!

We can of course issue some other commands via the chat and as long as our terminal app has the tools to run the commands it’ll continue running them and returning the results back to us. How neat!

Running terminal commands via a chat is kinda cool, but we could do much more if could access the internet. The sample app we’ve just built does not import any tools which can make HTTP requests. GPTScript provides `sys.http.get` and `sys.download` to let you do just that.

Getting access to the internet via the `sys.http.get` and `sys.download` tools opens the door to a whole lot of use cases. One such use case is doing online research. Whether that means searching for a new pair of shoes or analysing investment opportunities, LLMs can be a useful partner to us.

For our next example, let’s consider that we are thinking of changing our company branding. Given we’re developers we are not really sure what the branding should look like. We just know it needs a bit of fresh paint. Great branding is one of the most important things that has a great influence on a company’s success: your branding must resonate with your target audience.

GPTscript branding app

We need to do some research, but we are not sure how to research branding and besides, it’s often rather laborious endeavour. Enter Helix! This a perfect opportunity to use Helix to summon up LLMs and GPTScript!

We will write a simple GPTScript which will help us download some website pages, analyse their branding and help us generate images for our new company brand.

We will create a new Helix GPTScript App like the terminal app we created earlier. We will need to write a good script to pass to the GPTScript. It could look something like this. You might need to play around with different variations, but the script below worked quite well for us:

model: meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo from http://api:80/v1
Tools: sys.http.get, sys.download, sys.write
Parameter: website: The website to browse to. Required
Description: Download websites and analyse the branding

1. If "${website}" argument is empty, abort with an error. Do not invent an example website
2. Download the html from the company's homepage using sys.http.get. Find images which are likely reflecting the brand style (such as photographs) in the HTML and record them as a list of URLs. Try to find at least 5 images. If the URL is a relative URL, make sure to prepend the domain onto it. Ignore svg files, focus on jpg and png files. Pick files that are likely to be large images representative of the brand, not small icons etc. Ignore URLs ending with "aspx". Do NOT include *.aspx files in the list
3. For each URL in the list in the previous step, use sys.download tool to download and write that URL to "output/${website}/brand-{i}.png" where {i} is a number starting from step 1. Change the file extension according to the URL, so for example if it is a jpg file,  it must be named with .jpg file extension.
4. Given the context, write five image prompts for appropriate on-brand images for a marketing campaign, write these prompts, separated by a newline, to "output/${website}/image-prompts.txt"

Hopefully, the script is clear enough, but for brevity, let’s summarise what we are asking GPTScript to do for us:

Download several different images from a given website
Analyze the images and generate prompts that will help us create branding in a similar style

Let’s go ahead and launch the app and run it. We are an AI company, so naturally, we want our branding to evoke artificial intelligence vibes. Out first step is to analyse the branding of the well-known AI company:

After a little bit of time, GPTScript returns the results to us and they’re pretty much what we would expect based on the site we targeted. We could take those prompts and use them with one of the image models available in Helix to generate some swanky marketing templates for our company. Pretty neat for a couple of minutes worth of work!

All of this is great and fun, but Helix offers much more than building and running Generative AI apps! At Helix we are strong believers in Open Source technology and are working hard to make open weights models a success.

To make that happen we need open standards which is why worked hard on creating AIspec which is an attempt to represent LLM apps and tests as data (CRDs) and deploy and integrate them into CI/CD. AIspec defines some core directives which make the development lifecycle of Gen AI applications silky smooth!

One of the most important parts of the application development lifecycle is testing. Testing, and general CI/CD, are some of the things AI spec and Helix shine. AIspec lets you define your tests declaratively and execute them using the `helix` cli tool.

Let’s have a look at how we would go about testing the `terminal` app. The best way to start is to query the Helix application using the abovementioned `helix` command line tool. You will need a Helix API key and expose it via the `HELIX_API_KEY` environment variable.

You can grab it by navigating to Account & API page:

And grabbing your API key from the upper right pane. Once you’ve exported the `HELIX_API_KEY` env var you can start running the cli app. Let’s list the apps:

 helix app list
  ID                              NAME                          CREATED              SOURCE
  app_01jep0ry286p1c6ak26ykcpcjd  GPTscriptTerminal             2024-12-19 22:49:50
  app_01jd6z5x15pwff4s647jrc36ee  GptScriptBrand          	2024-12-19 23:49:50

We can inspect the GPTscriptTerminal application AIspec into a yaml file like so:

helix app inspect app_01jep0ry286p1c6ak26ykcpcjd
apiVersion: app.aispec.org/v1alpha1
kind: AIApp
metadata:
  name: GPTscriptTerminal
spec:
  description: GPTscript  terminal app
  assistants:
  - model: llama3.3:70b-instruct-q4_K_M
    type: text
    gptscripts:
    - name: terminal
      description: terminal to run commands
      file: ""
      content: |-
        model: llama3.3:70b-instruct-q4_K_M from http://api:80/v1
        tools: sys.exec, sys.read, sys.write, sys.ls
        description: Run commands in terminal

        Listen to user queries and execute commands in terminal as requested by users. You must only use the defined tools. Do not guess answers!

Let’s save the spec into a file and define a test on it. We will add a simple test that will check the result of addition: `addition_test`. (Note the newly added `tests` section)

apiVersion: app.aispec.org/v1alpha1
kind: AIApp
metadata:
  name: GPTscriptTerminal
spec:
  description: GPTscript  terminal app
  assistants:
  - model: llama3.3:70b-instruct-q4_K_M
    type: text
    gptscripts:
    - name: terminal
      description: terminal to run commands
      file: ""
      content: |-
        model: llama3.3:70b-instruct-q4_K_M from http://api:80/v1
        tools: sys.exec, sys.read, sys.write, sys.ls
        description: Run commands in terminal

        Listen to user queries and execute commands in terminal as requested by users. You must only use the defined tools. Do not guess answers!
    tests:
      - name: addition_test
        steps:
          - prompt: "Using terminal calculate how much is 2+2. Only return the result number!"
            expected_output: "4"

We can add as many tests as possible. Feel free to check the official documentation.

Finally, we are ready to run the test:

helix test -f terminal.yaml
Using evaluation model: llama3.1:8b-instruct-q8_0
Deployed app with ID: app_01jfdj60rj8fedkmfkqgfwmegw
Running tests...
.
| Test Name | Result | Reason | Model | Inference Time | Evaluation Time | Session Link | Debug Link |
|-----------|--------|--------|-------|----------------|-----------------|--------------|------------|
|  - addition_test     | PASS   | The response contains the correct numerical value,... | llama3.3:70b-instruct-q4_K_M | 30.785s         | 8.989s          | [Session](https://app.tryhelix.ai/session/ses_01jfdj60s9k6mydzzyvrzjapk5) | [Debug](https://app.tryhelix.ai/dashboard?tab=llm_calls&filter_sessions=ses_01jfdj60s9k6mydzzyvrzjapk5) |

Total execution time: 39.775s
Overall result: PASS

* [View full test report 🚀](https://app.tryhelix.ai/files?path=/test-runs/testrun_01jfdj60n6xhh4azzpasx6pg3b)

Excellent! The tests are passing! By adding the tests we gain more confidence in our application behaviour when used by our users. AIspec makes GenAI application delivery much easier. It streamlines the integration with various CI/CD systems.

Summary

Helix now provides a native GPTScript integration making it much easier for you to leverage the power of LLMs. We will continue to work hard on making it even easier for you to build, deploy and run GPTScript appls on Helix.

Go sign up on tryhelix.ai and let’s build the future together! Join our Discord to chat about it. Or email founders@helix.ml to get in touch :)

HelixML

Discussion about this post