Using GPT to tackle "good first issue" tasks in a Git repo

I built a small shell script that uses GPT to implement changes across an entire Git repo. The purpose of this experiment was to assess how effective GPT is at refactoring existing code that spans multiple files.

I tested this script on small repositories, under 1k LOC. With a single task, GPT carried it out reliably. With two tasks, it occasionally overlooked some details. With three or more tasks in a single prompt, it consistently missed some instructions.
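Since reliability dropped as tasks were combined, one way to apply several changes is to send them one at a time. This is a minimal sketch, assuming gpt.sh from this post sits in the repository root. The `run_tasks` helper and the `GPT_CMD` override are hypothetical names introduced here and are not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: one GPT call per task instead of one batched prompt.
# GPT_CMD is an assumed override so the loop can be tried without API calls.
GPT_CMD=${GPT_CMD:-./gpt.sh}

run_tasks() {
  # Each argument is a separate task and gets its own prompt.
  for task in "$@"; do
    printf 'task: %s\n' "$task"
    "$GPT_CMD" "$task" || return 1
  done
}
```

Calling `run_tasks "<first task>" "<second task>"` runs the tasks in order, so each call sees the files as the previous call left them.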

I thought it could be neat to integrate this script into a GitHub workflow to tackle “good-first-issue” tasks. A maintainer puts a “gpt” label on a ticket, and the model carries out the task. A GitHub Action then runs a self-test and, if the results are satisfactory, submits the changes as a pull request. A maintainer can improve the pull request either by taking it over to finalize it, or by continuing the conversation with GPT, with all comments accumulated into the context window. But I decided not to do this: it didn’t feel interactive enough for my taste, and it ruins the instantaneous feedback loop.
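I never built that workflow, but the core step could look roughly like the sketch below. It assumes the GitHub CLI (`gh`) is installed and authenticated and that gpt.sh is in the repository root. The `handle_issue` function, the `gpt/issue-<n>` branch naming, and the `GPT_CMD` and `TEST_CMD` variables are all hypothetical, and the label trigger and comment follow-ups are left out.

```shell
#!/bin/sh
# Sketch only: turn one issue into a pull request if the self-test passes.
# GPT_CMD and TEST_CMD are hypothetical knobs; "make test" is a placeholder
# for whatever self-test the project actually has.
GPT_CMD=${GPT_CMD:-./gpt.sh}
TEST_CMD=${TEST_CMD:-make test}

handle_issue() {
  issue=$1
  branch="gpt/issue-$issue"
  # Use the issue title and body as the prompt.
  prompt=$(gh issue view "$issue" --json title,body \
    --jq '.title + "\n\n" + .body') || return 1
  git checkout -b "$branch" || return 1
  "$GPT_CMD" "$prompt" || return 1
  # Only open a pull request when the self-test passes.
  if $TEST_CMD; then
    git add -A &&
      git commit -m "GPT: address issue #$issue" &&
      git push -u origin "$branch" &&
      gh pr create --title "GPT: issue #$issue" --body "Closes #$issue"
  else
    echo "self-test failed for issue #$issue" >&2
    return 1
  fi
}
```

Every round trip here goes through a push and a pull request, which is the loss of instant feedback that put me off the idea.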

Quick demo:

[Animated demo: 20-gptme-test.gif]

This is a simple script with known limitations: it doesn’t tokenize the input to make sure that it fits within the GPT-3.5 token limit, the parsing logic is naive and error-prone, and it only stores the modified files from the response and ignores everything else.
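A cheap stand-in for real tokenization is to estimate the prompt size before sending it. The sketch below assumes roughly 4 characters per token, which is only a rule of thumb for English text and can be far off for code. `estimate_tokens`, `check_prompt_size`, and the `MAX_TOKENS` budget are hypothetical names and values, not part of gpt.sh.

```shell
#!/bin/sh
# Rough guard against oversized prompts. Assumption: about 4 characters per
# token. This is a heuristic, not a tokenizer.
# MAX_TOKENS is a hypothetical budget; set it to the model's actual limit.
MAX_TOKENS=${MAX_TOKENS:-4000}

estimate_tokens() {
  # Count the bytes on stdin and divide by 4, rounding up.
  bytes=$(wc -c | tr -d ' ')
  echo $(( (bytes + 3) / 4 ))
}

check_prompt_size() {
  tokens=$(printf '%s' "$1" | estimate_tokens)
  if [ "$tokens" -gt "$MAX_TOKENS" ]; then
    echo "prompt is roughly $tokens tokens, over the $MAX_TOKENS budget" >&2
    return 1
  fi
}
```

In gpt.sh this could run as `check_prompt_size "$GPT_PROMPT" || exit 1` just before the curl call.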

$ cat gpt.sh
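#!/bin/sh
# Export everything defined in .env (OPENAI_API_KEY) to child processes.
set -a
. ./.env
set +a

# Instructions that tell the model how to format its reply.
GPT_CONTEXT='Use step-by-step thinking. Show the content of modified files in this syntax:
==> <filename> <==
<content>'
# The task: all command-line arguments joined into one string.
GPT_QUESTION="$*"
# Dump every tracked and untracked file that is not ignored by .gitignore or .gptignore.
GPT_GIT_REPO=$(git ls-files -co --exclude-standard -X .gptignore | grep -Fxvf .gptignore | while IFS= read -r file; do printf '==> %s <==\n' "$file"; cat "$file"; done)
GPT_PROMPT=$(printf '%s\n\n%s\n\n%s' "$GPT_CONTEXT" "$GPT_QUESTION" "$GPT_GIT_REPO")

# Send the prompt, then write each "==> <filename> <==" section of the reply to that file.
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d @- <<EOF | jq --compact-output --raw-output --monochrome-output ".choices[0].message.content" | awk '/^==>/ {if (filename != "") close(filename); filename = $2; next} filename != "" {print > filename}'
{
  "model": "gpt-3.5-turbo",
  "messages": [{"role": "user", "content": $(printf '%s' "$GPT_PROMPT" | jq -Rsa .)}]
}
EOF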

$ cat .gitignore
.env

$ cat .env
OPENAI_API_KEY=<token>

$ cat .gptignore
.gptignore
.gitignore
gpt.sh

How to use this:

$ # copy .gptignore and gpt.sh into the repo, and add the OPENAI_API_KEY variable to .env
$ sh gpt.sh "<your prompt>"
$ git diff
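
The reply format that gpt.sh depends on can be exercised without calling the API. The sketch below is a stricter variant of the awk step, not the original one: it skips any section whose path is absolute or contains `..`, so a malformed reply cannot write outside the repository. The `split_reply` name is hypothetical, and like the original it does not handle filenames that contain spaces.

```shell
#!/bin/sh
# Hypothetical stricter variant of the awk step in gpt.sh: reads a reply on
# stdin and writes each "==> <filename> <==" section to that file.
split_reply() {
  awk '
    /^==> .* <==$/ {
      if (filename != "") close(filename)
      filename = $2
      # Refuse absolute paths and parent-directory references.
      if (filename ~ /^\// || filename ~ /(^|\/)\.\.(\/|$)/) {
        print "skipping unsafe path: " filename > "/dev/stderr"
        filename = ""
      }
      next
    }
    # Lines before the first header, or under a skipped header, are dropped.
    filename != "" { print > filename }
  '
}
```

A saved reply can then be replayed with `split_reply < reply.txt`, followed by `git diff` to review the result. Here `reply.txt` is just an example file name.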