Our rules for engagement with GPT and LLMs

A big thanks to Mike Mantzke of Global Data Sciences who helped shape this policy. Another big thanks to our friend and colleague Tom Lee who made revisions and consulted on GPT tech

TL;DR

This is how our custom software business is approaching Large Language Models.

  1. We don’t use AI code generation tools unless asked by the client.

  2. We do build AI-driven applications.

    1. We’d prefer an open-sourced LLM running locally over a commercial API like ChatGPT [0].
  3. We write our own content.

    1. We might incorporate labeled, AI generated images solely as reading aids
  4. We do use GPT as a research tool, mostly for learning where to begin on new concepts, never compromising sensitive information.

[0] See Weights & Biases: Current Best Practices for Training LLMs from Scratch

LLMs & Software Development

For a custom software business in the year 2023, large language models (LLMs) like GPT are hard to avoid. GPT takes have flooded tech news while GPT projects constitute an overwhelming proportion of trending open-sourced software.

It’s not just hype: LLMs have proven remarkably adept at writing code, generating content, and assisting with research— all pertinent to our operation. Prognosticators wonder if the shift to AI-driven development permanently alters the custom software industry. The emerging art of Prompt Engineering speaks to a shift.

And so, after careful consideration, we have crafted this framework for dealing with LLMs internally.

Our colleague Mike Mantzke recently told us that he had come across a law firm advertising itself as “GPT-free”. All of their documentation was manually typed by humans, they promised.

We’ve also witnessed multiple firms and consultancies take the opposite tact, marketing themselves as lean and efficient through their utilization of LLM tooling.

These polar strategies could each be valid depending on the business’s core competencies. K-Optional happens to serve dual missions that demand a bit more nuance:

  • Trust- We are outwardly invested in the success of our clients and pride ourselves on being accountable and honest.
  • Efficiency- We strive to minimize waste and beat competition on value.

In other words, we don’t want to deviate from our brand by delegating any accountability to an inscrutable AI, however talented. Nor do we want to leave value on the table by shunning great tech.

And so, we evaluate applications of GPT on a case-by-case basis.

AI generated code

Github Copilot

AI code generation tools seem to improve productivity. Merely writing a short comment can generate dozens of lines of clean code in a moment. Tests for human-written code appear out of thin air.

At K-Optional, engineering time consists of programming, DevOps, brainstorming, designing blueprints, reviewing code, and some administrative work. Based on our internal metrics, pure programming often makes up a plurality of dev-hours. That said, even generous time savings in programming don’t justify the associated costs to our work:

  • Diminished context awareness: Early adopters have reported difficulty when revisiting AI generated code. This conversation is hardly 6 months old, so an assertion like this might need more data.
  • IP liabilities: Some of our clients have pursued patents on the software we’ve produced. The debate over “AI generated IP” might not be settled, but the US Copyright Office at least stated that it will not register copyrights for a work where traditional elements of authorship were produced by machine.
  • Client privacy: We have strict guidelines for keeping client IP safe, so we’re averse to the idea of our IDEs granting access to 3rd parties.

I’d like to reiterate that I think that writing code does take time and AI tooling promises to reduce that time. However, it’s usually not a limiting factor for us, and some of the side effects have serious implications.

Verdict: K-Optional Software codes without Copilot or alternatives except by client request.


Developing AI-integrated applications

While by default we won’t use AI to generate code (i.e GPT-Free), we will happily help you build AI-driven applications. Example: we wouldn’t use CoPilot to build your accounting application, but we could integrate said app with an LLM, enabling it to interpret irregular PDFs and pull data. After all, this technology opens the doors for all sorts of amazing applications.

In doing so, we’ll exercise the utmost caution. We’re wary of prompt injection vulnerabilities, which are intrinsic to LLM interfaces. Consider the same accounting app that makes sense of raw PDFs with GPT: what happens when someone uploads a PDF with malicious instructions aimed at the LLM? Even Microsoft isn’t immune to such attacks.

A real conversation with ChatGPT-4. An app that talks to LLMs in the background risks being tricked by hidden user messages. A real conversation with ChatGPT-4. An app that talks to LLMs in the background risks being tricked by hidden user messages.

Therefore, we carefully isolate any interfaces with an LLM and document the theoretical implications of any connection. It’s not worth glossing over these details.

In general, we use NIST’s AI Risk Management Framework (RMF) to help make decisions and design “explainable” systems.

Verdict: K-Optional Software helps clients build AI-driven systems and we rigorously scrutinize the IT concerns of any AI integration


AI generated content

LLMs have taken the content generation world by storm. A friend remarked that his creative firm hadn’t hired a copywriter since the advent of ChatGPT.

That’s troubling at first thought— many customers find us from our blog posts and insights. Could AI generated content eliminate the competitive advantage for useful content?

More recently we’ve been inclined to view LLM-generated copy like a sort of index fund for ideas. It’s remarkable at churning out boilerplate and marketing copy, but we think (and hope) that its lack of novel ideas will benefit human-created content.

In fact, AI makes expressing an existing idea a commodity In fact, AI makes expressing an existing idea a commodity

Consequently we can see two paths forward:

  1. Continuing to entirely write our own content.
  2. Craft our own ideas and use GPT (and related) to generate filler and structure.

However compelling that second option sounds, we feel that at least part of our audience expects that what we write reflects pure human thought. I can imagine someone being offended at finding out that a case study published by K-Optional Software consisted of a few of our own ideas and a bunch of GPT prose. So we’re opting not to delegate this level of communication to AI.

Some might point to the myriad ways we already employ computers to assist with content and communication and that’s fair— we may even use AI generated images as reading aids. But we see a boundary at this juncture and don’t otherwise think that the writing process is a bottleneck.

Verdict: We conceive our own ideas and write our own content


Research

Untitled

ChatGPT has added value to our research and brainstorming processes.

When it comes to:

  1. Figuring out where to begin on some new topic
  2. Digesting complicated subject matter

GPT tooling is superior to conventional search. That’s relevant to us as a cross-industry development firm; we regularly study the nuances of niche businesses.

All that said— and I hope this is obvious by now— private information is sacrosanct to us. Therefore we strictly use 3rd-party search tools on client-agnostic prompts and queries, primarily in the brainstorming phase of a project. We actually use our own internal command-line interface to talk to ChatGPT via API tokens and a partner vets searches post-hoc.

As a rule of thumb, if we wouldn’t email a random consultant a question, we won’t ask ChatGPT.

Verdict: We use LLMs for research with guard rails