Claude writing a Telegram Bot, 3rd try

This is the third part of the mini series on writing Telegram bots to read my gas meter using Claude.

ℹ️

This post belongs to a mini series (I’m aiming for three posts in a row), where I try prompting Claude with different levels of verbosity and compare the results:

In “Claude writing a Telegram Bot, 1st try” I’m using an elaborate, step-by-step prompting strategy with the Haiku model
In “Claude writing a Telegram Bot, 2nd try” I aim for a detailed one-shot prompt
In “Claude writing a Telegram Bot, 3rd try” I try a rather short one-shot prompt
In “Other LLMs writing a Telegram Bot” I quickly try DeepSeek and some more LLMs

As in the first round, I will once more show the full dialog I used to let it generate the implementation, then follow up with a discussion of the code and some final thoughts.

This is once more run against Claude 3.5 Haiku, so it’s comparable to the previous two posts. This time I left out the intermediate steps, such as scaling the image to reduce the number of tokens used, or specifying which prompt to use.

Create a telegram chatbot in JavaScript. I want to take a photo of a gas meter. Upload to OpenAI API to read the gas meter. The number has five digits before comma, and three decimal places. Then store the value to a postgres database table and return it in the chat.

Here’s a breakdown of the implementation:
Telegram Bot Setup:
Uses telegraf for handling Telegram bot interactions
Responds to /start command
Handles photo uploads
Image Processing:
Downloads the uploaded photo
Uses OpenAI’s GPT-4 Vision API to read the meter number
Validates the reading format (5 digits, 3 decimal places)
Database Storage:
Uses PostgreSQL to store meter readings
Includes user ID, reading, and timestamp
Includes a method to set up the database table
Error Handling:
Robust error handling for file downloads, API calls, and database operations
Prerequisites and Setup:
Install Node.js (version 16+)
Create a Telegram bot via BotFather
Get an OpenAI API key
Set up a PostgreSQL database
Install dependencies: npm install
Create a .env file with your credentials
Run the bot: npm start
Notes and Recommendations:
Ensure you have the latest versions of the dependencies
Implement proper error logging in production
Consider rate limiting and additional security measures
The bot currently assumes a specific meter reading format
Would you like me to elaborate on any part of the implementation or explain how to set up and deploy this bot?

This is the full code that it spat out on the first shot.

Code Review

const TelegramBot = require('node-telegram-bot-api');
const { Telegraf } = require('telegraf');
const { createClient } = require('@supabase/supabase-js');
const { OpenAI } = require('openai');
const multer = require('multer');
const axios = require('axios');
const fs = require('fs');
const pg = require('pg');

Already the initial imports are weird. In its summary it even mentioned that it’s using telegraf as a dependency to implement the chat bot. However, it also imported node-telegram-bot-api … which it doesn’t use at all.

Likewise for the @supabase/supabase-js import. I mean, it’s an interesting thought to use Supabase (which is an open source Firebase alternative that actually offers Postgres databases) … but it’s neither what I asked for, nor is it actually used in its own code later on.

Furthermore, it’s importing multer and even configures it in the constructor:

    // Multer for handling file uploads
    this.upload = multer({ 
      dest: 'uploads/',
      limits: { fileSize: 5 * 1024 * 1024 } // 5MB file size limit
    });

… but afterwards it never uses it!?

A nice touch that wasn’t present in the first two iterations is this:

    // Start command handler
    this.bot.command('start', (ctx) => {
      ctx.reply('Welcome! Send me a photo of your gas meter, and I\'ll read and store the reading.');
    });

… I never asked for it, nor thought of it … but it’s nice to offer some “call to action”. Obviously it’s not really needed, given that the bot is only a personal one and I of course know how to use it. But from a product/project point of view it makes sense.

Like the previous versions, it’s also a bit lax in its assumptions about where it can put temporary files:

  async downloadFile(filePath) {
    const fileUrl = `https://api.telegram.org/file/bot${this.botToken}/${filePath}`;
    const localFilePath = `uploads/${Date.now()}_meter.jpg`;

It simply assumes that there’s an uploads/ folder and, furthermore, that it’s writable.
Once more it’s neither using random names for the temp file nor checking whether it’s accidentally overwriting something. It just writes.
… and as mentioned in the previous post, there’s no reason at all to create a temporary file in the first place. The data can simply be kept in a Buffer directly.
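A minimal sketch of that in-memory variant, assuming Node.js 18 or later for the global fetch. The helper names downloadToBuffer and toDataUrl are made up for illustration and are not part of the generated code:

```javascript
// Sketch: keep the downloaded photo in memory instead of writing a temp file.
// `downloadToBuffer` and `toDataUrl` are hypothetical helper names.

// Fetch a URL and return its body as a Node.js Buffer.
// `fetchImpl` defaults to the global fetch (assumption: Node.js 18+).
async function downloadToBuffer(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Download failed with HTTP status ${response.status}`);
  }
  // arrayBuffer() yields the raw bytes; Buffer.from wraps them for Node APIs
  return Buffer.from(await response.arrayBuffer());
}

// Encode the image bytes as a base64 data URL, which is one way to hand
// an inline image to a vision-capable chat API.
function toDataUrl(buffer, mimeType = 'image/jpeg') {
  return `data:${mimeType};base64,${buffer.toString('base64')}`;
}
```

With this there’s no uploads/ folder to create, nothing to overwrite and nothing to clean up afterwards.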

Furthermore, if an exception occurs while accessing the LLM and/or the database, it’ll fail to delete the temp file:

        // Reply to user
        ctx.reply(`Meter reading recorded: ${meterReading}`);

        // Clean up uploaded file
        fs.unlinkSync(filePath);
      } catch (error) {
        console.error('Error processing meter reading:', error);
        ctx.reply('Sorry, I couldn\'t process the meter reading. Please try again.');
      }

… it even thought of providing a catch block, but then simply forgot to clean up in it. Well, on the other hand, one might argue that this is intentional, given that keeping the file might aid debugging 🤷

I like that it was able to come up with a meaningful prompt for GPT:
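If a temporary file were really needed, the cleanup would belong in a finally block, so that it runs on both the success and the error path. A rough sketch, with withTempFile being a hypothetical helper of mine (not something Claude generated):

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: write `data` to a temp file, hand the path to `work`,
// and remove the file again no matter whether `work` succeeds or throws.
async function withTempFile(data, work) {
  // mkdtemp creates a uniquely named directory, so nothing gets overwritten
  const dir = await fs.promises.mkdtemp(path.join(os.tmpdir(), 'meter-'));
  const filePath = path.join(dir, 'meter.jpg');
  try {
    await fs.promises.writeFile(filePath, data);
    return await work(filePath);
  } finally {
    // Executed on success and on failure alike
    await fs.promises.rm(dir, { recursive: true, force: true });
  }
}
```

The OpenAI call and the database insert would then go into the work callback, and any exception they throw still reaches the outer catch block.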

{
  role: "user",
  content: [
    { 
      type: "text", 
      text: "Read the number on this gas meter. Ensure it's a 5-digit number before the comma and 3 decimal places." 
    }
    // ...
  ]
}

… so it can mimic the standpoint of an entity of its own, asking GPT as a third party.

However, it doesn’t consider how much freedom this leaves GPT in phrasing its answer. Given this prompt, GPT will never just return the pure number. It’ll always at least prefix something like “The gas meter reading is …”.

Therefore it’s a nice touch that it even comes up with a regex asserting the format:

  validateMeterReading(reading) {
    // Regex to match 5 digits before comma, 3 decimal places
    const meterReadingRegex = /^\d{5}\.\d{3}$/;
    return meterReadingRegex.test(reading);
  }

… but that’s going to fail in all cases: the regex is anchored to the whole string, so any surrounding text breaks the match, and it never told the LLM that it would make this assertion.

What’s actually confusing is that it just provided some example usage at the end of the file:
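One way to make the validation tolerant of a chatty answer is to search for the number inside the response instead of matching the whole string. A small sketch under that assumption. extractMeterReading is a hypothetical helper, and accepting both comma and dot as separator is my own choice, since the prompt says “comma” while the regex expects a dot:

```javascript
// Hypothetical helper: pull the first reading-like number out of free-form text.
// Returns the reading normalized to a dot separator, or null if none is found.
function extractMeterReading(answer) {
  // 5 digits, a comma or dot, 3 digits; the lookarounds make sure the match
  // is not just a slice of a longer run of digits
  const match = /(?<!\d)(\d{5})[.,](\d{3})(?!\d)/.exec(answer);
  return match ? `${match[1]}.${match[2]}` : null;
}

console.log(extractMeterReading('The gas meter reading is 12345,678 m³.'));
// prints "12345.678"
```

Telling the LLM the exact output format is still the better fix. This only keeps the bot from rejecting an otherwise correct answer.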

// Export for use
module.exports = GasMeterBot;

// Example instantiation
// const gasMeterBot = new GasMeterBot();
// gasMeterBot.setupDatabase();

… so it’s actually just creating a module … that it doesn’t instantiate anywhere. This contradicts the “Setup steps” it brought forward, where it told me to run npm start. But given that the instantiation is commented out, the file will simply do nothing when started …

Comparison with Sonnet

I’ve also quickly tried exactly the same prompt with Claude 3.5 Sonnet, which came up with the following prompt for GPT:

{ 
  type: "text", 
  text: "Read the gas meter value in this image. Return only the number with 5 digits before decimal and 3 after. Format: XXXXX.XXX" 
}

… where it clearly tells GPT which output it expects.

It also refrained from depending on modules that it wasn’t using.

Nor were the temporary file handling issues present in that code: it just directly converted the ArrayBuffer into a Buffer, using Buffer.from.

Conclusion

I think it’s fair to say that Claude 3.5 Haiku wasn’t able to properly cope with the given task. While it does provide a foundation to build upon, it definitely takes a human to address the issues at hand (either by fixing them directly, or by providing further prompts to drive it forward). On the other hand it shows that, much like you would reason about intermediate steps with (less experienced) human developers, you need to have this “thinking together” phase with the LLM as well.

The quick comparison with Claude 3.5 Sonnet shows that Sonnet is far more capable and could handle even this task easily.

ℹ️

There’s a follow-up to this article, where I tried the prompt from above against some other popular LLMs from DeepSeek, Meta & OpenAI: Other LLMs writing a Telegram Bot