Error Messages in Internal Tools: Patterns and Lessons Learned

Sometimes useful things come out of manholes. Sometimes not.

Bad error messages will increase the load on support teams

With good error messages, users can solve their own problems without resorting to support.

Bad error messages mean that every problem becomes a ticket for internal support, which costs money and time.

In internal tools there is no need to hide the stacktrace

In internet-facing tools, you avoid showing the full error stacktrace so as not to leak implementation details that could help would-be attackers.

In internal tools, we can usually give more detail about the error, including a stacktrace if one exists.
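
As a minimal sketch of this distinction, the function below renders an exception with or without its stacktrace. The function name `render_error` and the `internal` flag are assumptions made for illustration; a real tool would decide this from its own deployment configuration.

```python
import traceback


def render_error(exc: BaseException, internal: bool) -> str:
    """Render an exception for display.

    `internal` is a hypothetical flag: True for internal tools, where the
    full stacktrace is shown, False for internet-facing tools.
    """
    summary = f"{type(exc).__name__}: {exc}"
    if not internal:
        # Internet-facing: keep implementation details out of the message.
        return summary
    # Internal: append the full stacktrace so users can debug themselves.
    trace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    return f"{summary}\n\n{trace}"


if __name__ == "__main__":
    try:
        int("not a number")
    except ValueError as exc:
        print(render_error(exc, internal=True))
```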

Give details on what went wrong so users can unblock themselves

Include a distinctive string in the message that the user can search for on Slack or ask an AI about.

  • State which component failed (modern tools are usually composed of several components, and each may fail)

  • When applicable, provide the full stacktrace

  • When applicable, provide a link to a place where the user can view the detailed logs.
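
The three points above could be combined roughly as follows. This is only a sketch: the `ToolError` class, the `format_tool_error` helper, and the `https://logs.internal/...` URL are hypothetical names chosen for illustration, not part of any existing tool.

```python
import traceback
from typing import Optional


class ToolError(Exception):
    """Hypothetical error type that records which component failed."""

    def __init__(self, component: str, message: str,
                 log_url: Optional[str] = None):
        super().__init__(message)
        self.component = component
        self.log_url = log_url


def format_tool_error(err: ToolError) -> str:
    """Build a message with the component, a log link, and the stacktrace."""
    lines = [f"[{err.component}] {type(err).__name__}: {err}"]
    if err.log_url:
        lines.append(f"Detailed logs: {err.log_url}")
    if err.__traceback__ is not None:
        # Only present once the error has actually been raised.
        lines.append("Stacktrace:")
        lines.append("".join(traceback.format_tb(err.__traceback__)).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    try:
        raise ToolError(
            "DataIngester",
            "failed to reach MetadataService",
            log_url="https://logs.internal/example",  # hypothetical URL
        )
    except ToolError as err:
        print(format_tool_error(err))
```

The first line of the output doubles as the searchable string: it names the component and the exception type in a stable format.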

Do not give instructions in the error message itself

Do not hardcode instructions on how to fix the issue in the error message. The fix is bound to change over time, and the original instructions will end up misleading future users.

Add a link to an internal knowledge base (e.g. Confluence) instead. A linked page is easier to maintain than a message baked into the code.
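
One possible way to keep instructions out of the code is a small lookup from error codes to runbook pages. The sketch below assumes hypothetical names (`RUNBOOKS`, `RUNBOOK_INDEX`, `with_runbook`) and a hypothetical fallback index page; the auth-token URL is the one used in the examples in this article.

```python
# Hypothetical mapping from error codes to knowledge-base pages.
# The fix itself lives on the linked page, not in the code.
RUNBOOKS = {
    "AUTH_TOKEN_EXPIRED": "https://wiki.internal/runbooks/auth-token-renewal",
}

# Assumed fallback page for errors that have no dedicated runbook yet.
RUNBOOK_INDEX = "https://wiki.internal/runbooks"


def with_runbook(code: str, message: str) -> str:
    """Append a runbook link to an error message instead of fix instructions."""
    url = RUNBOOKS.get(code, RUNBOOK_INDEX)
    return f"Error: {message}. See how to fix this: {url}"


if __name__ == "__main__":
    print(with_runbook("AUTH_TOKEN_EXPIRED", "auth token expired"))
```

When the renewal procedure changes, only the linked page needs updating; the tool does not have to be redeployed.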

Examples

Example 1

BAD: Error: auth token expired. To fix, run `ssh-keygen -t rsa`, then copy key to ~/.ssh/authorized_keys, then restart the agent service with `sudo systemctl restart agent`

GOOD: Error: auth token expired. See how to fix this: https://wiki.internal/runbooks/auth-token-renewal

COMMENTS: Specific instructions age badly and become obsolete and misleading. Use pointers to external resources instead.

Example 2

BAD: An error occurred while processing your request. Please try again later.

GOOD: S3UploadError: Access Denied (PutObject) on bucket "data-lake-prod" for key "imports/2026-03-12.csv". Traceback: upload_handler.py:42 → boto3/s3/transfer.py:288

COMMENTS: Generic messages are useless for debugging. The more information in the exception, the higher the likelihood the user can find information about it elsewhere.

Example 3

BAD: ConnectionError: DataIngester failed to reach MetadataService.

GOOD: ConnectionError: DataIngester failed to reach MetadataService. Version 2.4.1. [req-id: 7f3a2b1c-4d5e-4890, ts: 2026-03-12T14:32:07Z]

COMMENTS: Including metadata such as version, request ID and a timestamp makes it easier for support teams to reproduce and debug, especially when multiple tool versions are running in parallel.
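
The metadata suffix in the last example could be produced by a helper along these lines. This is a sketch under assumptions: `TOOL_VERSION` and `with_metadata` are hypothetical names, and a real tool would likely read its version from package metadata and take the request ID from its own request context.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional

# Hardcoded to match the example; an assumption, not a recommendation.
TOOL_VERSION = "2.4.1"


def with_metadata(message: str,
                  request_id: Optional[str] = None,
                  now: Optional[datetime] = None) -> str:
    """Append version, request ID and a UTC timestamp to an error message."""
    # Generate a fresh ID if the caller has no request context to pass in.
    request_id = request_id or str(uuid.uuid4())
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    ts = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{message} Version {TOOL_VERSION}. [req-id: {request_id}, ts: {ts}]"


if __name__ == "__main__":
    print(with_metadata(
        "ConnectionError: DataIngester failed to reach MetadataService."
    ))
```

Accepting `request_id` and `now` as optional parameters keeps the helper easy to test and lets callers pass the same request ID that appears in the logs.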
